Open Source OCR
I remember trying out gocr, an open source character recognition engine, several years back. I wasn’t thoroughly impressed, but it sort of worked. Yesterday I saw the news that Google has released Tesseract as an open source Optical Character Recognition (OCR) engine. It was originally developed by HP and had been shelved for some time; it’s supposed to be among the top 3 in accuracy according to testing by UNLV. The source code is available at their sourceforge.net page. It will be good to see this taken up and integrated as a backend by open source scanning applications (and maybe even by office suites, as a “recognize text in image file” type option).