A downloadable tool for Windows, macOS, and Linux

OCReate - OCR-based PDF/Image-to-Text Converter

This project was developed as part of the CIS322 Software Engineering course.

About

  • Convert scanned paper documents and images into a searchable and editable format
  • Automate the process of capturing alphanumeric information
  • Export extracted text in different formats (txt, doc, etc.)
  • Supports 50+ languages
  • Supports punctuation marks, digits, and diacritics/accented characters
  • Supports Armenian

Prerequisites

  • JDK 8 or higher
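
The quickest way to check this prerequisite is to run `java -version` on the command line. For a programmatic check, the following is a minimal sketch, not part of OCReate itself: the class name `JavaVersionCheck` and its helper methods are hypothetical, and it assumes the standard `java.specification.version` system property, which reports `1.8` on Java 8 and a plain major number such as `11` or `17` on later releases.

```java
public class JavaVersionCheck {

    // Hypothetical helper: turns a specification version string into a major version number.
    // Java 8 and earlier report "1.x" (e.g. "1.8"); Java 9 and later report "9", "11", "17", ...
    static int majorVersion(String spec) {
        String[] parts = spec.trim().split("\\.");
        int first = Integer.parseInt(parts[0]);
        if (first == 1 && parts.length > 1) {
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    // Hypothetical helper: true when the version satisfies the "JDK 8 or higher" prerequisite.
    static boolean meetsPrerequisite(String spec) {
        return majorVersion(spec) >= 8;
    }

    public static void main(String[] args) {
        String spec = System.getProperty("java.specification.version");
        System.out.println("Java specification version: " + spec);
        System.out.println(meetsPrerequisite(spec)
                ? "OK: Java 8 or higher is installed"
                : "Too old: install JDK 8 or higher");
    }
}
```

To try it, save the file as `JavaVersionCheck.java`, compile it with `javac JavaVersionCheck.java`, and run it with `java JavaVersionCheck`.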

Setup

  • Download the JAR
  • Open the command line and navigate to the directory where the JAR is
  • Run the following command, substituting the name of the downloaded JAR if it differs:
java -jar ocreate.jar

Built With

  • JavaFX - Open source, next generation client application platform
  • Tesseract - Open-source OCR engine
  • Tess4J - A Java JNA wrapper for Tesseract OCR API


Download

OCR.jar 86 MB