A Novel Projection Profile and Hough Transform based Printed Character Segmentation for Bengali OCR

Optical Character Recognition (OCR), which produces computer-recognizable characters from a scanned image of a text document, has a wide variety of applications in fields such as computer vision, Natural Language Processing (NLP), check processing, digitization of legal documents, form processing, and car number-plate reading. However, character recognition for languages such as Bengali, whose script features a head line (Matra line) and curved characters, remains challenging. Although many studies have sought to improve Bengali OCR, most of them neglected the accuracy of the character segmentation step. This paper presents a character segmentation solution based on the Hough transform and projection profile analysis. It uses a novel approach based on vertical straight-line strokes to detect the base line, which significantly improves the segmentation of printed Bengali characters. The experimental results of the software implementation, presented at the end of the paper, support the validity and suitability of the approach.

Keywords - Pattern Recognition, Image Processing, Optical Character Recognition, OCR, Bengali Printed Optical Character Segmentation, Hough Transformation, Projection Profile.
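For readers unfamiliar with projection profile analysis, the following minimal Python sketch illustrates the general idea on a toy binary image. It is not the method proposed in this paper: it omits the Hough transform and the vertical-stroke base line detection entirely. The function names, the toy image, and the "densest row" heuristic for the head line are all illustrative assumptions.

```python
def horizontal_projection(image):
    """Return the number of ink pixels (value 1) in each row of a binary image."""
    return [sum(row) for row in image]


def vertical_projection(image):
    """Return the number of ink pixels (value 1) in each column of a binary image."""
    return [sum(col) for col in zip(*image)]


def estimate_headline_row(image):
    """Assumed heuristic: take the row with the most ink as the head line (Matra)."""
    profile = horizontal_projection(image)
    return max(range(len(profile)), key=profile.__getitem__)


# Hypothetical 6x8 binary text line: 1 = ink, 0 = background.
line_image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 0],  # dense row standing in for the head line
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 1, 0, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]

row_profile = horizontal_projection(line_image)
col_profile = vertical_projection(line_image)
headline_row = estimate_headline_row(line_image)
```

In a profile-based segmenter, low values in the column profile (computed below the head line) are typical candidates for character boundaries. The head line connects adjacent characters, which is one reason plain profile analysis is usually not sufficient on its own for Bengali text.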