English to Malayalam Translation: A Statistical Approach
| Main Author: | |
|---|---|
| Format: | Printed Book |
| Subjects: | |
| Online Access: | http://10.26.1.76/ks/005204.pdf |
| Summary: | This paper outlines a methodology for translating text from English into the Dravidian language Malayalam using statistical models. By using a monolingual Malayalam corpus and a bilingual English/Malayalam corpus in the training phase, the machine automatically generates Malayalam translations of English sentences. The paper also discusses a technique to improve the alignment model by incorporating part-of-speech information into the bilingual corpus. Removing insignificant alignments from the sentence pairs with this approach yielded better training results. Pre-processing techniques such as suffix separation in the Malayalam corpus and stop word elimination from the bilingual corpus also proved effective in training. Handcrafted rules designed for the suffix separation process, which can serve as a guideline for implementing suffix separation in Malayalam, are also presented. The structural difference between the English-Malayalam pair is resolved in the decoder by applying order conversion rules. Experiments conducted on a sample corpus generated reasonably good Malayalam translations, and the results are verified with the F-measure, BLEU and WER evaluation metrics. |
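Of the evaluation metrics named in the summary, WER (word error rate) is the simplest to illustrate. The record does not say how the paper computes it, so the following is only a minimal sketch of the standard definition: the word-level edit distance between a reference translation and the system output, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / reference length.

    Tokenization by whitespace is an assumption; the paper may tokenize
    differently (for example, after suffix separation).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Lower values are better, and 0.0 means the output matches the reference word for word. Because insertions are counted, WER can exceed 1.0 when the hypothesis is much longer than the reference.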