OCR::Naive

Convert images into text in an extremely naive fashion

OCR::Naive Ranking & Summary

  • License: Perl Artistic License
  • Publisher Name: Dmitry Karasik
  • Publisher web site: http://search.cpan.org/~karasik/


OCR::Naive Description

OCR::Naive is a Perl module that implements a very simple and unsophisticated OCR by finding all known images in a larger image. The known images are mapped to text using a preexisting dictionary, and the text lines are returned.

The interesting part is the image finding itself: it is done by a regexp! For all practical purposes, images can be treated as byte strings, so regexps apply to them as well. For example, suppose one needs to locate a 2x2 image in a larger 7x7 image. The regexp is constructed from the first scanline of the smaller image (2 bytes, verbatim), then 7 - 2 = 5 arbitrary characters, and finally the second scanline (2 bytes again). Of course there are some quirks, but these are explained in the API section.

Dictionaries for different fonts can be created interactively with bin/makedict; non-interactive recognition is performed by bin/ocr, which is a thin wrapper around this module.

SYNOPSIS

    use Prima::noX11; # Prima imaging required
    use OCR::Naive;

    # load a dictionary created by bin/makedict
    my $db = load_dictionary( 'my.dict');

    # load image to recognize
    my $i = Prima::Image-> load( 'screenshot.png' );
    $i = enhance_image( $i );

    # ocr!
    print "$_\n" for recognize( $i, $db);

Requirements:

  • Perl
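The regexp construction from the description (first scanline verbatim, a gap of "large width minus small width" arbitrary bytes, then the next scanline) can be sketched in plain Perl. This is a minimal illustration, not the module's actual code: the helpers `subimage_regexp` and `find_subimage` are hypothetical names, images are assumed to be one-byte-per-pixel strings with no row padding, and the row-wraparound check is just one plausible quirk to handle, not necessarily how OCR::Naive deals with it.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of OCR::Naive): build a regexp that matches
# a small image inside a larger one. Both images are assumed to be plain
# byte strings, one byte per pixel, scanlines stored back to back.
sub subimage_regexp {
    my ($small, $small_w, $large_w) = @_;
    my $gap  = $large_w - $small_w;               # bytes between two scanlines
    my @rows = unpack "(a$small_w)*", $small;     # split into scanlines
    my $body = join ".{$gap}", map { quotemeta($_) } @rows;
    return qr/$body/s;                            # /s: "." matches any byte
}

# Hypothetical helper: return [x, y] for every position where $small occurs.
sub find_subimage {
    my ($large, $large_w, $small, $small_w) = @_;
    my $re = subimage_regexp($small, $small_w, $large_w);
    my @hits;
    # The lookahead keeps matches zero-width, so overlapping hits are found.
    while ($large =~ /(?=$re)/g) {
        my $offset = $-[0];
        my $x = $offset % $large_w;
        my $y = int($offset / $large_w);
        # A byte-string match can straddle the end of a scanline; skip those.
        next if $x + $small_w > $large_w;
        push @hits, [$x, $y];
    }
    return @hits;
}

# 7x7 "image" filled with '.', with a 2x2 image (AB / CD) placed at x=3, y=2.
my @scanlines = ('.' x 7) x 7;
substr($scanlines[2], 3, 2) = 'AB';
substr($scanlines[3], 3, 2) = 'CD';
my $large = join '', @scanlines;

for my $hit (find_subimage($large, 7, 'ABCD', 2)) {
    print "found at x=$hit->[0], y=$hit->[1]\n";
}
```

For the 2x2-in-7x7 case the generated pattern is equivalent to `/AB.{5}CD/s`, which is the construction the description outlines. Real image data would also need to account for scanline padding and pixel depth, which this sketch ignores.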

