Speech Recognition System Using MFCC and DTW
This paper presents a speech recognition system whose main goal is to classify the speaker's speech. The system has three main parts: (1) pre-processing, (2) feature extraction, and (3) classification. The input speech is pre-processed by Voice Activity Detection (VAD), features are extracted as Mel-Frequency Cepstral Coefficients (MFCC), and the input speech is classified by Dynamic Time Warping (DTW). There are five types of speech objects, with 32 speech signals for each object. The system is evaluated in both user-dependent and user-independent modes. The overall accuracy when testing on the training data is 100% for both modes. However, the accuracy when testing on real-time data is 44% and 52% for the user-dependent and user-independent modes, respectively. Therefore, the user-independent system is more accurate, and has a lower error rate, than the user-dependent one.
Index terms - DTW, MFCC, Speech recognition, User-Independent, VAD.
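To make the classification stage concrete, the following is a minimal sketch of DTW-based template matching, not the implementation used in this work. It assumes that each utterance has already been reduced to a sequence of per-frame feature vectors (for example, MFCCs after VAD), that the local cost is the Euclidean distance between frames, and that an input is assigned the label of its nearest stored template. The names `dtw_distance` and `classify` are hypothetical and chosen only for illustration.

```python
import math


def dtw_distance(a, b):
    """Return the DTW alignment cost between two feature sequences.

    a and b are non-empty lists of frames; each frame is a list of floats
    (e.g. one MFCC vector). Local cost: Euclidean distance (an assumption).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # acc[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(a[i - 1], b[j - 1])
            # Allowed steps: insertion, deletion, match
            acc[i][j] = local + min(acc[i - 1][j],
                                    acc[i][j - 1],
                                    acc[i - 1][j - 1])
    return acc[n][m]


def classify(query, templates):
    """Return the label of the template closest to `query` under DTW.

    templates maps a label to a list of feature sequences (assumed layout).
    """
    best_label, best_cost = None, float("inf")
    for label, sequences in templates.items():
        for seq in sequences:
            cost = dtw_distance(query, seq)
            if cost < best_cost:
                best_label, best_cost = label, cost
    return best_label
```

In this sketch, a user-dependent setup would fill `templates` only with recordings from the speaker being tested, whereas a user-independent setup would pool recordings from several speakers; the matching procedure itself is the same in both cases.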