Automatic Calculation of Voice Onset Time (VOT) for Voiced Stop Sounds in Modern Standard Arabic (MSA)
Signal processing is an active area of research, and speech processing is one of its major branches. A speech signal carries many important features, one of which is Voice Onset Time (VOT). VOT appears only in stop sounds. The human auditory system uses VOT to distinguish voiced from unvoiced stops, such as /b/ and /p/ in English. VOT can also be used to classify and identify languages and dialects. We chose this subject because research on Arabic in this field is limited, and automatic detection of VOT values in Modern Standard Arabic (MSA) has not been addressed before. In this paper, we design an algorithm that detects the VOT value in MSA automatically from the power of the speech signal. We apply the algorithm only to the voiced stops /b/, /d/ and /dˤ/, and compare the VOT values it generates with manual values obtained by reading the spectrogram. We built our own corpus in which every word follows a CV-CV-CV pattern, with the target stop consonant in the middle of the word. The algorithm achieved high accuracy, with error rates of 0.80%, 26.62% and 11.71% for the voiced stops /d/, /dˤ/ and /b/ respectively. The standard deviation was low for /d/ because it is easy to pronounce, and high for /dˤ/ because it is a unique sound that is difficult to pronounce.
Index terms: Arabic, VOT, MSA, ASR, HTK, HMM.
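To illustrate what detection from the power signal can look like, the following minimal Python sketch computes a short-time power contour and takes the interval between two threshold crossings as a VOT estimate. It is not the algorithm evaluated in this paper: the function names, the frame and hop sizes, the two threshold ratios, and the relative-error formula are all assumptions made for illustration only.

```python
def short_time_power(samples, frame_len, hop):
    """Mean squared amplitude of each frame (a simple power contour)."""
    powers = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        powers.append(sum(x * x for x in frame) / frame_len)
    return powers


def first_frame_above(powers, level, start=0):
    """Index of the first frame at or after `start` whose power exceeds `level`."""
    for i in range(start, len(powers)):
        if powers[i] > level:
            return i
    return None


def estimate_vot_ms(samples, rate, frame_ms=5.0, hop_ms=1.0,
                    burst_ratio=0.02, voicing_ratio=0.5):
    """Hypothetical VOT estimate in milliseconds.

    Assumption: the first event is the first frame whose power exceeds a small
    fraction of the peak, and the second event is the first later frame whose
    power exceeds a large fraction of the peak. Both ratios are illustrative.
    """
    frame_len = int(rate * frame_ms / 1000)
    hop = int(rate * hop_ms / 1000)
    powers = short_time_power(samples, frame_len, hop)
    peak = max(powers)
    first = first_frame_above(powers, burst_ratio * peak)
    second = first_frame_above(powers, voicing_ratio * peak, first)
    if first is None or second is None:
        return None
    return (second - first) * hop_ms


def relative_error_percent(automatic, manual):
    """Assumed error measure: absolute difference relative to the manual value."""
    return abs(automatic - manual) / abs(manual) * 100.0
```

In practice the thresholds would need tuning per speaker and recording condition, and the manual reference values would come from spectrogram readings as described above.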