Correct pronunciation detection of the Arabic alphabet using deep learning

Nishmia Ziafat*, Hafiz Farooq Ahmad, Iram Fatima, Muhammad Zia, Abdulaziz Alhumam, Kashif Rajpoot

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Automatic speech recognition for Arabic poses unique challenges, and progress in this domain has been relatively slow. Classical Arabic in particular has received even less research attention. The correct pronunciation of the Arabic alphabet has significant implications for the meaning of words. In this work, we design learning models that classify the Arabic alphabet based on the correct pronunciation of each letter. Classifying the correct pronunciation of the Arabic alphabet is a challenging task for the research community. We divide the problem into two steps: first, we train a model to recognize a letter of the alphabet (Arabic alphabet classification); second, we train a model to determine the quality of its pronunciation (Arabic alphabet pronunciation classification). Because little audio data of this kind is available, we collected audio data from experts and novices to train our models. To train these models, we extract pronunciation features from audio recordings of the Arabic alphabet using mel-spectrograms. We employ a deep convolution neural network (DCNN), AlexNet with transfer learning, and bidirectional long short-term memory (BLSTM), a type of recurrent neural network (RNN), to classify the audio data. For alphabet classification, DCNN, AlexNet, and BLSTM achieve accuracies of 95.95%, 98.41%, and 88.32%, respectively. For Arabic alphabet pronunciation classification, DCNN, AlexNet, and BLSTM achieve accuracies of 97.88%, 99.14%, and 77.71%, respectively.
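
The abstract names mel-spectrograms as the feature representation but gives no parameters. As a rough illustration only, the sketch below builds a triangular mel filterbank and applies it to one power-spectrum frame, which is the core step in turning a spectrogram into a mel-spectrogram. The function names, the HTK-style mel formula, and the values for the number of filters, FFT size, and sample rate are all assumptions made for this example, not details taken from the paper.

```python
import math


def hz_to_mel(f_hz):
    """Convert Hz to mel using the HTK-style formula (an assumed convention)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_mels, n_fft, sample_rate):
    """Return n_mels triangular filters over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    max_mel = hz_to_mel(sample_rate / 2.0)
    # n_mels + 2 points evenly spaced on the mel axis: filter edges and centres.
    mel_points = [max_mel * i / (n_mels + 1) for i in range(n_mels + 2)]
    hz_points = [mel_to_hz(m) for m in mel_points]
    bin_freqs = [k * sample_rate / n_fft for k in range(n_bins)]

    filters = []
    for j in range(1, n_mels + 1):
        lo, centre, hi = hz_points[j - 1], hz_points[j], hz_points[j + 1]
        row = []
        for f in bin_freqs:
            rising = (f - lo) / (centre - lo)    # slope up to the centre
            falling = (hi - f) / (hi - centre)   # slope down from the centre
            row.append(max(0.0, min(rising, falling)))
        filters.append(row)
    return filters


def mel_frame(power_spectrum, filters, eps=1e-10):
    """Project one power-spectrum frame onto the filters and convert to dB."""
    return [
        10.0 * math.log10(sum(w * p for w, p in zip(row, power_spectrum)) + eps)
        for row in filters
    ]
```

Stacking the output of `mel_frame` for successive short-time frames of a recording would give a two-dimensional mel-spectrogram that an image-style network could take as input. In practice an audio library would normally handle this step.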

Original language: English
Article number: 2508
Number of pages: 19
Journal: Applied Sciences (Switzerland)
Volume: 11
Issue number: 6
Publication status: Published - 11 Mar 2021

Bibliographical note

Funding Information:
Acknowledgments: This work is supported by the Deanship of Scientific Research, King Faisal University, Al-Ahsa, Saudi Arabia.

Publisher Copyright:
© 2021 by the authors. Licensee MDPI, Basel, Switzerland.

Keywords

  • Artificial neural network (ANN)
  • Bidirectional long short-term memory (BLSTM)
  • Deep convolution neural network (DCNN)
  • Deep learning (DL)
  • Recurrent neural network (RNN)

ASJC Scopus subject areas

  • Materials Science (all)
  • Instrumentation
  • Engineering (all)
  • Process Chemistry and Technology
  • Computer Science Applications
  • Fluid Flow and Transfer Processes
