A System for Handwritten Script Identification From Indian Document
Obaidullah Md Sk, Supratik Kundu Das, Kaushik Roy
Abstract
In a country like India, a number of scripts (13 in total) are used to write the official languages (23 in total). To develop an Optical Character Recognizer (OCR) for a particular language, the script in which the document is written must be identified first. The task is harder for handwritten documents, so identifying the script of a document that may be written in any of these 13 scripts is very challenging. In this paper we identify the scripts used to write six official languages of India. We use simple and efficient features computed at the document level: abstract/mathematical features, structure-based features, and script-dependent features. A series of classifiers is applied to these features. The overall accuracy of the proposed system is currently 92.8% on the test set, without rejection.