Lambdacism Detection in Speech Using Spectrogram Features and Machine Learning Techniques
Abstract
This study extracts and preprocesses features from spectrogram-based representations of speech, namely cochleagrams, gammatone spectrograms, Mel-Frequency Cepstral Coefficients (MFCCs), and the Short-Time Fourier Transform (STFT), and then reduces the extracted features to two dimensions using Principal Component Analysis (PCA). The data comprise both correct and incorrect pronunciations of the English word "Alive"; the incorrect pronunciations are affected by lambdacism, a lallation error in which the phoneme /l/ is replaced by /r/, so that the word is mispronounced as "Arive". A One-Class Support Vector Machine (SVM) model is introduced to enable effective visualization and classification. Experimental results indicate that integrating multiple spectrogram-based features enhances detection performance, achieving high precision and recall despite data imbalance. The results emphasize the potential of anomaly detection frameworks for speech disorder diagnosis, particularly in contexts with limited mispronounced samples. The work contributes to Automatic Speech Recognition (ASR) and Natural Language Processing (NLP) by offering a scalable and interpretable method for speech anomaly detection. Future directions include integrating real-time feedback and extending the model’s applicability to a broader range of phoneme error patterns.
Keywords - Cochleagram, Gammatone, Lambdacism, MFCC, One-Class SVM, PCA, Spectrogram, STFT
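The pipeline summarized in the abstract (spectral feature extraction, PCA reduction to two dimensions, One-Class SVM classification) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: it uses only a time-averaged log-magnitude STFT rather than the full set of features, replaces real recordings with synthetic noisy tones, and assumes the One-Class SVM is fitted on correct pronunciations only. The helper names (stft_features, synth_utterance), the sampling rate, and all hyperparameters are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import OneClassSVM

SR = 16000  # assumed sampling rate in Hz (not stated in the source)


def stft_features(signal, n_fft=512, hop=256):
    """Time-averaged log-magnitude STFT: one fixed-length vector per utterance."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    magnitude = np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, n_fft//2 + 1)
    return np.log1p(magnitude).mean(axis=0)


def synth_utterance(rng, centre_hz):
    """Hypothetical stand-in for a recording: a half-second noisy tone."""
    t = np.arange(SR // 2) / SR
    freq = centre_hz + rng.normal(0.0, 20.0)
    return np.sin(2 * np.pi * freq * t) + 0.05 * rng.standard_normal(t.size)


rng = np.random.default_rng(0)

# Imbalanced toy data: many "correct" samples, few "mispronounced" ones.
correct = np.array([stft_features(synth_utterance(rng, 1500)) for _ in range(40)])
mispronounced = np.array([stft_features(synth_utterance(rng, 2500)) for _ in range(5)])

# Standardize, project to 2-D with PCA, then fit a One-Class SVM.
# Assumption: the model sees only the correct class during fitting.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=2),
    OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
)
model.fit(correct)

# predict() returns +1 for inliers (treated as correct) and -1 for anomalies.
pred_correct = model.predict(correct)
pred_mispronounced = model.predict(mispronounced)

# The 2-D PCA projection can be passed to any plotting tool for visualization.
projected = model[:-1].transform(np.vstack([correct, mispronounced]))
```

Fitting on the majority class alone is one common way to handle scarce mispronounced samples, since the classifier then needs no anomalous examples at training time; whether the study follows exactly this protocol is not stated in the abstract.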