A Machine Learning Approach for Speech Detection in Modern Wireless Communication Environment


Shibanee Dash
https://orcid.org/0000-0002-3828-1325
Mihir Narayan Mohanty
https://orcid.org/0000-0003-1252-949X

Abstract

Modern wireless communication has advanced considerably compared with the past. Speech communication is likewise a major research focus in its applications, and many developments have been made in this field. In this work, we choose an OFDM-based communication system because of its importance in both licensed and unlicensed wireless communication platforms. The voice signal is passed through the proposed model and recovered at the receiver end. Under various circumstances, the signal may be partially corrupted at the user end. We aim to obtain a better received signal using a radial basis function network (RBFN), a neural network model. The parameters chosen for the RBFN model are the energy, zero-crossing rate (ZCR), autocorrelation function (ACF), and fundamental frequency of the speech signal. On the one hand, these parameters can by themselves partially eliminate noise; on the other, the RBFN model using these parameters proves effective for noisy speech signals transmitted over a noisy channel, taken here to be a Gaussian channel. The efficiency of the OFDM model is verified in terms of symbol error rate, and the transmitted speech signal is evaluated in terms of SNR, which shows the reduction in noise. For visual inspection, samples of the original, noisy, and received signals are also shown. The experiment is performed at noise levels of 5 dB, 10 dB, and 15 dB. The results demonstrate the performance of the RBFN model as a filter. Performance is measured in terms of the voice received by the listener in each condition. The results show that, for voice in a noisy environment, the proposed technique improves the intelligibility and quality of speech.
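
The four speech parameters named in the abstract (energy, ZCR, ACF, and fundamental frequency) can be illustrated with a minimal NumPy sketch for a single frame. This is not the authors' implementation: the function name `frame_features`, the pitch search range `fmin`/`fmax`, and the choice of the normalized ACF peak as the pitch estimator are all assumptions made here for illustration.

```python
import numpy as np

def frame_features(frame, fs, fmin=60.0, fmax=400.0):
    """Return (energy, zcr, acf_peak, f0_hz) for one speech frame.

    Hypothetical helper: fmin/fmax bound the pitch search and are
    assumed values, not parameters taken from the article.
    """
    frame = np.asarray(frame, dtype=float)
    frame = frame - frame.mean()              # remove DC offset

    # Short-time energy: sum of squared samples.
    energy = float(np.sum(frame ** 2))

    # Zero-crossing rate: fraction of adjacent sample pairs whose signs differ.
    signs = np.signbit(frame)
    zcr = float(np.mean(signs[1:] != signs[:-1]))

    # Autocorrelation for non-negative lags, normalized so that acf[0] == 1.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    if acf[0] <= 0.0:                         # silent frame: no pitch estimate
        return energy, zcr, 0.0, 0.0
    acf = acf / acf[0]

    # Fundamental frequency: lag of the highest ACF value in the pitch range.
    lag_min = int(fs / fmax)
    lag_max = min(int(fs / fmin), len(frame) - 1)
    lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return energy, zcr, float(acf[lag]), fs / lag
```

In a pipeline of the kind the abstract describes, a feature vector like this would be computed per frame and supplied as input to the RBFN; the frame length and overlap are not stated in the abstract and would need to be chosen separately.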

Article Details

How to Cite
Dash, S., & Mohanty, M. (2018). A Machine Learning Approach for Speech Detection in Modern Wireless Communication Environment. International Journal of Machine Learning and Networked Collaborative Engineering, 2(04), 170-179. Retrieved from https://mlnce.net/index.php/Home/article/view/50
Author Biographies

Shibanee Dash, RVR & JC College of Engineering Guntur, Andhra Pradesh, India

Shibanee Dash is presently working as an Assistant Professor in the Department of Electronics and Communication Engineering at R.V.R & J.C College of Engineering (Autonomous), Guntur, Andhra Pradesh, India. She holds a Master of Technology in Electronics and Telecommunication from Kalinga Institute of Industrial Technology (Deemed to be University), India. She has 3 years of experience in teaching and research.

Mihir Narayan Mohanty, ITER, Siksha 'O' Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India

Mihir Narayan Mohanty is presently working as a Professor in the Department of Electronics and Communication Engineering, Institute of Technical Education and Research, Siksha ‘O’ Anusandhan (Deemed to be University), Bhubaneswar, Odisha, India. He has published over 300 papers in international and national journals and conferences and has approximately 25 years of teaching experience. He is an active member of many professional societies, including IEEE, IET, EMC & EMI Engineers India, ISCA, ACEEE, IAEng, and CSI, and is a Fellow of IETE and IE (I). He received his M.Tech. degree in Communication System Engineering from Sambalpur University, Sambalpur, Odisha, and completed his Ph.D. work in Applied Signal Processing. His research interests include Applied Signal and Image Processing, Digital Signal/Image Processing, Biomedical Signal Processing, Microwave Communication Engineering, and Speech Processing.
