A Post-Processing Scheme for Malayalam using Statistical Sub-character Language Models
Most of the Indian scripts do not have any robust commercial OCRs. Many of the laboratory prototypes report reasonable results at recognition/classification stage. However, word level accuracies are still poor. It is well known that word accuracy decreases as the number of characters in a word i...
Main Author: | |
---|---|
Format: | Printed Book |
Published: | ACM 2010 |
Subjects: | |
Online Access: | http://10.26.1.76/ks/005435.pdf |
LEADER | 01741nam a22001457a 4500 | ||
---|---|---|---|
100 | |a Karthika Mohan and C. V. Jawahar |9 26700 | ||
245 | |a A Post-Processing Scheme for Malayalam using Statistical Sub-character Language Models | ||
260 | |b ACM |c 2010 | ||
500 | |a DAS '10 Proceedings of the 9th IAPR International Workshop on Document Analysis Systems 493-500 | ||
520 | |a Most of the Indian scripts do not have any robust commercial OCRs. Many of the laboratory prototypes report reasonable results at recognition/classification stage. However, word level accuracies are still poor. It is well known that word accuracy decreases as the number of characters in a word increases. For Malayalam, the average number of characters in a word is almost twice that of English. Moreover, the number of words required to cover 80% of the Malayalam language is more than forty times that of other Indian languages such as Hindi. Hence a direct dictionary-based post-processing scheme is not suitable for Malayalam. In this paper, we propose a post-processing scheme which uses statistical language models at the sub-character level to boost word level recognition results. We use a multi-stage graph representation and formulate the recognition task as an optimization problem. Edges of the graph encode the language information and nodes represent the visual similarities. An optimal path from source node to destination node represents the recognized text. We validate our method on more than 10,000 words from a Malayalam corpus. | ||
650 | |a UNICODE |a CONFERENCE PROCEEDINGS |9 26701 | ||
856 | |u http://10.26.1.76/ks/005435.pdf | ||
942 | |c KS | ||
999 | |c 76175 |d 76175 | ||
952 | |0 0 |1 0 |4 0 |7 0 |9 68174 |a MGUL |b MGUL |d 2016-02-08 |l 0 |r 2016-02-08 |w 2016-02-08 |y KS |
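
The summary describes a multi-stage graph in which nodes carry visual similarity, edges carry language information, and the best source-to-destination path gives the recognized text. The record does not say how the authors score nodes or edges, so the following is only a minimal sketch of that general idea using standard Viterbi dynamic programming with a bigram model. The function name `best_path`, the `<s>`/`</s>` markers, the `floor` penalty, the toy Latin labels, and all probabilities are assumptions made for illustration and are not taken from the paper.

```python
import math


def best_path(stages, bigram_logp, start="<s>", end="</s>", floor=-20.0):
    """Viterbi search over a multi-stage graph (illustrative sketch).

    stages:      list of dicts, one per symbol position; each maps a candidate
                 label to its visual log-score (the node score).
    bigram_logp: dict mapping (previous_label, label) to a log-probability
                 (the edge score). Unseen pairs receive `floor` (an assumption).
    Every stage is assumed to contain at least one candidate.
    Returns (labels_on_best_path, total_log_score).
    """
    def edge(a, b):
        return bigram_logp.get((a, b), floor)

    # best[label] = (score of the best path ending at label, that path)
    best = {start: (0.0, [])}
    for candidates in stages:
        nxt = {}
        for label, node_score in candidates.items():
            # Pick the predecessor that maximizes path score plus edge score.
            score, path = max(
                ((s + edge(prev, label), p) for prev, (s, p) in best.items()),
                key=lambda t: t[0],
            )
            nxt[label] = (score + node_score, path + [label])
        best = nxt

    # Close the path at the destination node.
    score, path = max(
        ((s + edge(prev, end), p) for prev, (s, p) in best.items()),
        key=lambda t: t[0],
    )
    return path, score


# Toy data (hypothetical): the classifier slightly prefers "a" at position 1,
# but the bigram model strongly prefers the sequence "o" followed by "n".
toy_stages = [
    {"a": math.log(0.6), "o": math.log(0.4)},
    {"n": math.log(0.5), "m": math.log(0.5)},
]
toy_bigrams = {
    ("<s>", "a"): math.log(0.5), ("<s>", "o"): math.log(0.5),
    ("a", "n"): math.log(0.1), ("a", "m"): math.log(0.1),
    ("o", "n"): math.log(0.8), ("o", "m"): math.log(0.1),
    ("n", "</s>"): math.log(0.5), ("m", "</s>"): math.log(0.5),
}
toy_path, toy_score = best_path(toy_stages, toy_bigrams)
print(toy_path)
```

In this toy case the language edges override the classifier's first choice, which is the kind of correction the summary attributes to sub-character language models. A real system would need smoothed n-gram estimates and calibrated classifier scores, neither of which is specified in this record.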