Paper Title
Testing and Applying Tools to Develop Dogri-Hindi SMT System
Abstract
The foundation of statistical analysis of any language is access to a substantial corpus. We are working on a
Statistical Machine Translation (SMT) system, which requires an extensive sentence-aligned parallel corpus. Various parallel
corpora exist, but because of privacy rights or other legal issues their developers do not share them. We are therefore
building our own Dogri-Hindi sentence-aligned parallel corpus. In this paper we examine the methodologies used by various
researchers to create monolingual and bilingual parallel corpora, their advantages and limitations, and the tools and
techniques used in corpus development. We have automated part of the corpus development; the rest of the work is being
done manually. We take written content from different sources, translate it, and align it. Hindi text is translated into
Dogri using an existing machine translation system. We also discuss the approach we applied in developing the Dogri-Hindi
sentence-aligned parallel corpus.
Keywords: Parallel Corpus, Spell Checker, Translator