A Robust Feature Extraction with Dual Fusion aided Extreme Learning for Audio–Visual Hindi Speech Recognition
In implementing systems based on Automatic Speech Recognition (ASR), robustness to noisy background conditions is a particular challenge. This paper estimates both audio and visual features from different information-representation perspectives, which leads to robust feature extraction from audio-visual speech images. The authors obtain bottleneck features from the bottleneck layer of a bottleneck deep neural network (BN-DNN). In addition, two well-known and powerful texture descriptors, Local Binary Pattern (LBP) and Local Phase Quantization (LPQ), are applied to obtain visual features from the face region. Classification is performed with an Extreme Learning Machine (ELM), and the Jaya optimization algorithm is used to reach the global optimum, for audio-visual Hindi speech recognition. The proposed scheme is evaluated on the MATLAB platform, and its implementation is compared with existing audio-visual speech recognition (AVSR) approaches.
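To illustrate the LBP descriptor named in the abstract, here is a minimal sketch of the basic 3x3 variant. It is written in Python for illustration only, since the abstract reports a MATLAB evaluation and does not state the LBP variant, radius, or neighbour count used. The function names `lbp_code` and `lbp_histogram`, the clockwise bit order, and the `>=` comparison are assumptions, and this is not the authors' implementation.

```python
def lbp_code(img, r, c):
    """Return the basic 3x3 LBP code (0..255) for the pixel at row r, column c.

    img is a grayscale image given as a list of equal-length rows.
    Assumption: a neighbour sets its bit when it is >= the centre pixel.
    """
    center = img[r][c]
    # Assumed bit order: neighbours visited clockwise from the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """Return a 256-bin histogram of LBP codes over the interior pixels.

    Border pixels are skipped because they lack a full 3x3 neighbourhood.
    """
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist
```

In a pipeline like the one described, a histogram of this kind computed over the face region could serve as a visual feature vector for the classifier. How the paper combines it with the LPQ and bottleneck features is not stated in the abstract.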