biyanto/OCR-tesseract-SE

 
 

OCR-tesseract-SE

An OCR program based on pytesseract, a Python wrapper for Tesseract. It includes language models to improve OCR performance.

Getting started

  • Install Tesseract

    • For Mac users: brew install tesseract
    • For Windows users: The latest installer can be downloaded from here.
    • For Linux users: sudo apt install tesseract-ocr -y
  • Add the Tesseract installation directory to the PATH system environment variable

  • Download language models here.

  • Google colab notebook
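
To confirm that the installation and PATH steps above worked, a small check along these lines can help. This is a minimal sketch using only the Python standard library, not part of this repository: the helper names `locate_tesseract` and `list_languages` are hypothetical, and it assumes the executable is named `tesseract` and that `tesseract --list-langs` prints a header line followed by one language code per line.

```python
import shutil
import subprocess


def locate_tesseract(name="tesseract"):
    """Return the full path of the Tesseract executable, or None if it is not on PATH."""
    return shutil.which(name)


def list_languages(executable):
    """Return the language codes reported by `tesseract --list-langs`."""
    result = subprocess.run(
        [executable, "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Assumption: the first line is a header and the rest are language codes.
    # Some older Tesseract versions write this listing to stderr instead of stdout.
    lines = (result.stdout or result.stderr).splitlines()
    return [line.strip() for line in lines[1:] if line.strip()]


if __name__ == "__main__":
    exe = locate_tesseract()
    if exe is None:
        print("tesseract was not found on PATH")
    else:
        print("found:", exe)
        print("languages:", list_languages(exe))
```

If the executable is not found, either fix the PATH entry or point pytesseract at the binary directly by setting `pytesseract.pytesseract.tesseract_cmd` to its full path.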

Usage
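
Once Tesseract and the language models are in place, a basic call through pytesseract might look like the sketch below. It is an illustration under stated assumptions rather than the notebook's actual code: the helpers `build_config` and `ocr_image`, the file name `page.png`, and the language code `swe` are all hypothetical, and it assumes `pytesseract` and `Pillow` are installed in the active Python environment.

```python
def build_config(tessdata_dir=None, psm=3):
    """Build the extra command-line arguments passed to Tesseract.

    psm is Tesseract's page segmentation mode (3 is its default);
    tessdata_dir optionally points at a folder of downloaded language models.
    """
    parts = [f"--psm {psm}"]
    if tessdata_dir is not None:
        parts.append(f'--tessdata-dir "{tessdata_dir}"')
    return " ".join(parts)


def ocr_image(image_path, lang="swe", tessdata_dir=None):
    """Run OCR on one image file and return the recognized text."""
    # Imported lazily so build_config can be used without the OCR stack installed.
    import pytesseract
    from PIL import Image

    with Image.open(image_path) as img:
        return pytesseract.image_to_string(
            img,
            lang=lang,  # assumed language code; use the model you downloaded
            config=build_config(tessdata_dir),
        )


if __name__ == "__main__":
    import os

    # "page.png" is a placeholder; replace it with your own scanned image.
    if os.path.exists("page.png"):
        print(ocr_image("page.png"))
```

Passing `tessdata_dir` is only needed when the language models live outside Tesseract's default data folder.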

For non-technical users

If you come from a non-technical background and would like to set up pytesseract on your computer from scratch, please refer to the instructions here: Mac, Windows. The guides also cover setting up Python and a virtual environment.

Acknowledgements

Contact

Ekta Vats ([email protected])
Centre for Digital Humanities
Uppsala University
Sweden
