Abstract

Transliteration plays a significant role in machine translation, which has applications such as cross-lingual information retrieval, communication, and question answering. This paper presents a method for transliterating named entities from English to Hindi. The proposed method consists of two modules, both of which take a phoneme-based approach. Module-I uses the CMU Pronouncing Dictionary, a collection of 133,270 words with their pronunciations. If the word to be transliterated is not found in the CMU Pronouncing Dictionary, Module-II is used instead. Module-II is based on a 5-gram model, in which a window of at most five letters (the target letter, two letters to its left, and two to its right) is used to generate the transliteration of the target letter. The system was tested on a database of 2,408 North Indian names, with Google Input Tool for Windows serving as the baseline for comparison. The proposed system achieved a word accuracy of 70.22%, against 58.73% for Google Input Tool.
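The two-module flow described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the pronunciation entries, the phoneme-to-Devanagari map, the letter rules, and the back-off from a 5-letter to a 3-letter to a 1-letter context are all hypothetical assumptions made for illustration. The toy pronunciations are not copied from the CMU Pronouncing Dictionary and omit its stress markers.

```python
from typing import Dict, List

PAD = "_"  # assumed padding symbol for positions beyond the word boundary

# Hypothetical stand-in for the pronouncing dictionary (word -> phonemes).
TOY_PRONUNCIATIONS: Dict[str, List[str]] = {"RAM": ["R", "AA", "M"]}

# Hypothetical phoneme -> Devanagari map used by the Module-I sketch.
TOY_PHONEME_MAP: Dict[str, str] = {"R": "र", "AA": "ा", "M": "म"}

# Hypothetical context rules used by the Module-II sketch.
# Keys are 5-letter, 3-letter, or 1-letter contexts centred on the target.
TOY_RULES: Dict[str, str] = {"__ami": "अ", "m": "म", "i": "ि", "t": "त"}


def five_gram_windows(word: str) -> List[str]:
    """Return one 5-letter window per letter: two left, target, two right."""
    padded = PAD * 2 + word.lower() + PAD * 2
    return [padded[i:i + 5] for i in range(len(word))]


def module_two(word: str, rules: Dict[str, str]) -> str:
    """Transliterate letter by letter from the context window of each letter."""
    out = []
    for window in five_gram_windows(word):
        # Assumed back-off: full 5-gram, then 3-gram, then the target alone.
        for key in (window, window[1:4], window[2]):
            if key in rules:
                out.append(rules[key])
                break
        else:
            out.append(window[2])  # no rule matched: keep the source letter
    return "".join(out)


def transliterate(word: str,
                  pronunciations: Dict[str, List[str]],
                  phoneme_map: Dict[str, str],
                  rules: Dict[str, str]) -> str:
    """Use the dictionary (Module-I) when possible, else fall back (Module-II)."""
    phonemes = pronunciations.get(word.upper())
    if phonemes is not None:
        return "".join(phoneme_map.get(p, "") for p in phonemes)
    return module_two(word, rules)


def word_accuracy(predictions: List[str], references: List[str]) -> float:
    """Percentage of words whose transliteration matches the reference exactly."""
    correct = sum(p == r for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)
```

In this sketch, a name present in the toy dictionary is handled by the phoneme path, while any other name is transliterated from its letter windows; `word_accuracy` counts only exact whole-word matches, which is the usual reading of the word accuracy figures reported above.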

Details

Title
ENGLISH TO HINDI TRANSLITERATION SYSTEM USING COMBINATION-BASED APPROACH
Author
Dhindsa, Baljeet Kaur; Sharma, Dharam Veer
Pages
609-613
Publication year
2017
Publication date
Sep 2017
Publisher
International Journal of Advanced Research in Computer Science
e-ISSN
0976-5697
Source type
Scholarly Journal
Language of publication
English
ProQuest document ID
2406987664
Copyright
© Sep 2017. This work is published under https://creativecommons.org/licenses/by-nc-sa/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.