International Journal of Advanced Computer Research (IJACR)

ISSN (Print):2249-7277    ISSN (Online):2277-7970
Volume-9 Issue-41 March-2019
Paper Title : Phoneme concatenation method considering half vowel sound for the Myanmar speech synthesis system
Author Name : Chaw Su Hlaing and Aye Thida
Abstract :

Myanmar is a tonal language whose written and spoken forms differ. Correct grapheme-to-phoneme conversion is therefore one of the important steps in developing a Myanmar text-to-speech (TTS) system. Every Myanmar consonant carries an inherent vowel or a half vowel (schwa), depending on the word, so correct vowel insertion is also a critical task. Because handling these vowels correctly raises TTS quality, this paper presents schwa vowel handling rules. It also discusses the approach taken to the vowels used in developing a TTS synthesis system for the Myanmar language. The system uses the concatenative method, with phonemes as the basic units for concatenation. Since phonemes play an important role, the Myanmar phoneme inventory is presented in detail. After analysing the number of phonemes and half-sound consonants to be recorded, a Myanmar phoneme speech database containing a total of 157 phoneme speech sounds was created; it can speak out all Myanmar texts. The phonemes are fetched according to the output of the phonetic analysis modules and concatenated using the proposed new phoneme concatenation algorithm. According to the experimental results, the system achieved the highest level of intelligibility and an acceptable level of naturalness.
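
The fetch-and-concatenate step described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' proposed algorithm, whose details the abstract does not give; it only shows the general idea of looking up recorded units by phoneme label and joining them in order. The function name `concatenate_phonemes`, the `pause` parameter, the phoneme labels, and the toy sample values are all hypothetical.

```python
from typing import Dict, List, Sequence

Samples = List[float]  # one waveform, as a list of amplitude values


def concatenate_phonemes(
    phonemes: Sequence[str],
    database: Dict[str, Samples],
    pause: int = 0,
) -> Samples:
    """Join recorded phoneme waveforms in the order given.

    `phonemes` stands in for the output of a phonetic analysis step,
    `database` maps a phoneme label to its recorded samples, and
    `pause` is an assumed number of silent samples between units.
    """
    output: Samples = []
    for index, name in enumerate(phonemes):
        if name not in database:
            raise KeyError(f"no recording for phoneme {name!r}")
        if index > 0 and pause:
            output.extend([0.0] * pause)  # silence between two units
        output.extend(database[name])
    return output


# Toy database with made-up labels and samples; "@" stands for a schwa.
toy_db = {"k": [0.1, 0.2], "@": [0.05], "a": [0.3, 0.3, 0.2]}
speech = concatenate_phonemes(["k", "@", "a"], toy_db)
```

A real system would load the recorded sounds from audio files and would likely smooth the joins between units; both are left out here to keep the sketch short.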

Keywords : Text to speech, Myanmar language, Phoneme, Concatenative speech synthesis, Half-vowel sound.
Cite this article : Hlaing CS, Thida A. Phoneme concatenation method considering half vowel sound for the Myanmar speech synthesis system. International Journal of Advanced Computer Research. 2019; 9(41):81-93. DOI:10.19101/IJACR.2018.839001.