GUI based Performance Analysis of Speech Enhancement Techniques
Mr. Shishir Banchhor, Mr. Jimish Dodia, Ms. Darshana Gowda, Ms. Pooja Jagtap
Students, B.E. (EXTC), K.J. Somaiya I.E.I.T, Sion, Mumbai-22
firstname.lastname@example.org email@example.com firstname.lastname@example.org email@example.com
Speech, being a fundamental means of communication, is embedded in a wide range of applications. The central methods for enhancing speech are removal of background noise, echo suppression, and artificially introducing certain frequencies into the speech signal. In this project, we study the speech enhancement techniques Spectral Subtraction, Minimum Mean Square Error (MMSE), the Kalman filter, and the Wiener filter. Based on our observations and an analysis of various performance parameters, we conclude which of these methods is most suitable for speech enhancement. The code is implemented with a Graphical User Interface (GUI) in MATLAB.
Keywords— Speech enhancement, FFT, Spectral subtraction, Kalman filter, Wiener filter, Performance parameters
I. INTRODUCTION
Speech is the most fundamental and common medium of human communication. Voice-based communication, human-machine and machine-machine interfaces, and automatic speech recognition systems all need to remain reliable in noisy environments. In many cases, these systems work well in nearly noise-free conditions, but their performance deteriorates rapidly in noisy conditions. Improving existing pre-processing algorithms, or introducing an entirely new class of algorithms for speech enhancement, is therefore a standing objective of the research community.
In speech enhancement, the goal is to improve the quality of degraded speech. Speech enhancement algorithms are noise suppression techniques that use knowledge from hearing science to mitigate the effect of corrupting background noise, and hence improve perceived speech quality and intelligibility. Enhancement of speech degraded by noise is used in many applications, such as mobile phones, VoIP, teleconferencing systems, speech recognition, and hearing aids. Improving the performance of speech communication systems in noisy environments has been a challenging research area for more than three decades. Efforts to achieve higher quality and/or intelligibility of noisy speech may also improve the performance of other speech applications, such as speech coding/compression and speech recognition. Speech enhancement has three major goals:
1. To improve the quality and intelligibility of speech corrupted by background noise, and to reduce perceptual fatigue.
2. To make speech coders robust to input noise.
3. To make speech recognition systems more robust to noise.
This project presents an overview of different speech enhancement methods and provides a review of some of the major aspects and approaches in this category.
II. BASIC BLOCK DIAGRAM
The basic block diagram for speech enhancement is shown in Fig. 1.
Fig. 1 Basic Block Diagram
The noisy input signal is passed through an analysis window. Because the signal is long and continuous, it cannot be processed in one go, so only a few samples are selected at a time. The Fast Fourier Transform (FFT) is applied to convert each windowed segment from the time domain to the frequency domain. The magnitudes of the noise and the noisy speech are compared, and the noise is subtracted from the affected speech. The enhanced speech obtained is in the frequency domain and hence needs to be converted back to the time domain....
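The processing chain described above (analysis window, FFT, magnitude subtraction, inverse FFT) can be sketched as follows. This is a minimal illustration in Python with NumPy, not the MATLAB GUI code of this project. The function names, the frame length of 256 samples, the 50% overlap, the Hann window, the noise estimate taken from a noise-only segment, and the reuse of the noisy phase are all assumptions made for the example.

```python
import numpy as np


def periodic_hann(frame_len):
    # Periodic Hann window (assumed choice): copies shifted by
    # frame_len / 2 add up to exactly 1, so overlap-add needs no rescaling.
    return np.hanning(frame_len + 1)[:-1]


def estimate_noise_magnitude(noise_only, frame_len=256, hop=128):
    # Assumption: a noise-only segment at least frame_len samples long
    # is available. The estimate is the average magnitude spectrum
    # over all of its windowed frames.
    noise_only = np.asarray(noise_only, dtype=float)
    window = periodic_hann(frame_len)
    mags = [
        np.abs(np.fft.rfft(noise_only[s:s + frame_len] * window))
        for s in range(0, len(noise_only) - frame_len + 1, hop)
    ]
    return np.mean(mags, axis=0)


def spectral_subtraction(noisy, noise_mag, frame_len=256, hop=128):
    # noise_mag must have frame_len // 2 + 1 entries (one per rfft bin).
    noisy = np.asarray(noisy, dtype=float)
    window = periodic_hann(frame_len)
    enhanced = np.zeros(len(noisy))
    for start in range(0, len(noisy) - frame_len + 1, hop):
        # Analysis window: take a short segment of the long signal.
        frame = noisy[start:start + frame_len] * window
        # Time domain -> frequency domain.
        spectrum = np.fft.rfft(frame)
        magnitude = np.abs(spectrum)
        phase = np.angle(spectrum)
        # Subtract the noise magnitude; clip negative results to zero.
        clean_mag = np.maximum(magnitude - noise_mag, 0.0)
        # Frequency domain -> time domain, reusing the noisy phase.
        clean_frame = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len)
        # Overlap-add the processed frames into the output signal.
        enhanced[start:start + frame_len] += clean_frame
    return enhanced
```

With an all-zero noise estimate the function returns the input unchanged, except in the first and last half-frame where only one tapered window contributes. This gives a simple sanity check before a real noise estimate is used. Samples beyond the last complete frame are left at zero in this sketch.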