OCRFeeder is a document layout analysis and optical character recognition system.
Given a set of images, it automatically outlines their contents, distinguishes graphics from text, and performs OCR on the text. It can export to multiple formats, the main one being ODT.
It features a complete GTK graphical user interface that lets users correct unrecognized characters, define or correct bounding boxes, set paragraph styles, clean the input images, import PDFs, save and load projects, and export everything to multiple formats, among other things. OCRFeeder was developed as the project of Joaquim Rocha's Master's thesis in Computer Science.
Recent Releases
0.8.5, 21 Feb 2024 14:13
minor feature:
Bug Fixes:
* Fix arguments for ghostscript invocation (thanks to @RenWal)
Other:
* The previously pushed tag was a mistake as it was pointing to a temporary branch. No real issues for the user, but this release/tag now fixes that.
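The Ghostscript fix above concerns how a command line is assembled. As a rough illustration only (this is not OCRFeeder's code; the helper name `build_gs_args`, the file names, and the choice of output device are assumptions), rasterizing a PDF into per-page images with standard Ghostscript switches could be sketched like this:

```python
from typing import List


def build_gs_args(pdf_path: str, output_pattern: str, dpi: int = 300) -> List[str]:
    """Build an argument list for rasterizing a PDF with Ghostscript.

    Hypothetical helper for illustration; not taken from OCRFeeder.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return [
        "gs",
        "-dNOPAUSE",         # do not pause between pages
        "-dBATCH",           # exit once all input has been processed
        "-dSAFER",           # restrict file system access
        "-sDEVICE=png16m",   # 24-bit color PNG output (assumed choice)
        f"-r{dpi}",          # output resolution in DPI
        f"-sOutputFile={output_pattern}",  # %03d is replaced by the page number
        pdf_path,
    ]


# Example: one PNG per page of a hypothetical input file.
args = build_gs_args("scan.pdf", "page-%03d.png")
print(args)
```

Keeping the arguments as a list, rather than one shell string, means the list can be handed to `subprocess.run(args, check=True)` without shell quoting issues when paths contain spaces. Running it requires the `gs` binary to be installed.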
0.8, 05 Aug 2014 17:21
minor feature:
Version 0.8
============
New Features
-------------
* Add support for multiple image TIFFs
Bug Fixes
----------
* Fix PIL importation
* Fix error when exporting a PDF with empty text areas
* Fix PDF output options in ocrfeeder-cli
* Fix getting engine name in ocrfeeder-cli
* Fix the use of newer versions of Unpaper
* Fix text in the pages icon view
* Fix reordering pages in the icon view
* Fix issues when no locale is set
* Fix loading project with more than one page
* Fix updating the OCR engines in the BoxEditor
Improvements
-------------
* Port the application to GObject Introspection
* Scan with 300 DPI and in color mode
* Use the last visited directory when adding a new image
* Warn when no OCR engines are found on startup or when performing the recognition
* Update the box editor's OCR controls sensitiveness according to the existence of OCR engines
New and Updated Translations
-----------------------------
* Marek Černocký cz
* Daniel Mustieles es
* Fran Diéguez gl
* Dimitris Spingos el
* Aharon Don he
* Attila Hammer, Gabor Kelemen hu
* Rafael Ferreira pt_BR
* Martin Srebotnjak sl
* Мирослав Николић sr
* Piotr Drąg uk
* Wylmer Wang zh_CN