Commit

Updated readme
Balearica committed Jan 7, 2025
1 parent 56d8238 commit 4fc74ce
Showing 1 changed file with 11 additions and 13 deletions.
24 changes: 11 additions & 13 deletions README.md
@@ -99,18 +99,27 @@ The following are examples and projects built by the community using Tesseract.j

If you have a project or example repo that uses Tesseract.js, feel free to add it to this list using a pull request. Examples submitted should be well documented such that new users can run them; projects should be functional and actively maintained.

## Major changes in v6
Version 6 changes are documented in [this issue](https://github.com/naptha/tesseract.js/issues/993). Highlights are below.
- Fixed memory leak in previous versions
- Overall reductions in runtime and memory usage
- Breaking changes:
    - All output formats other than `text` are disabled by default.
        - To re-enable the `hocr` output (for example), set the following: `worker.recognize(image, {}, { hocr: true })`
    - Minor changes to the structure of the JavaScript object (`blocks`) output
    - See [this issue](https://github.com/naptha/tesseract.js/issues/993) for full list
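
As a rough illustration of the output-format change listed above, the sketch below requests hOCR explicitly through the third argument to `worker.recognize`. The helper `buildOutputOptions` and the wrapper `recognizeWithHocr` are hypothetical names introduced for this example and are not part of Tesseract.js; the sketch also assumes `text` stays enabled when other formats are requested.

```javascript
// Hypothetical helper (not part of Tesseract.js): turns a list of format
// names into the object passed as the third argument to worker.recognize.
function buildOutputOptions(formats) {
  const output = {};
  for (const name of formats) {
    output[name] = true;
  }
  return output;
}

// Sketch of a v6 call that asks for hOCR in addition to the default text.
async function recognizeWithHocr(image) {
  // Loaded lazily so the helper above can be used without the library installed.
  const { createWorker } = await import('tesseract.js');
  const worker = await createWorker('eng');
  try {
    // Equivalent to worker.recognize(image, {}, { hocr: true }).
    const { data } = await worker.recognize(image, {}, buildOutputOptions(['hocr']));
    return { text: data.text, hocr: data.hocr };
  } finally {
    // Always release the worker, even if recognition throws.
    await worker.terminate();
  }
}
```

Code that reads `data.hocr` (or any other non-text output) without passing the matching flag is the most likely place for this change to surface after an upgrade.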

## Major changes in v5
Version 5 changes are documented in [this issue](https://github.com/naptha/tesseract.js/issues/820). Highlights are below.

- Significantly smaller files by default (54% smaller for English, 73% smaller for Chinese)
    - This results in a ~50% reduction in runtime for first-time users (who do not have the files cached yet)
- Significantly lower memory usage
- Compatible with iOS 17 (using default settings)
- Breaking changes:
    - `createWorker` arguments changed
        - Setting non-default language and OEM now happens in `createWorker`
            - E.g. `createWorker("chi_sim", 1)`
    - `worker.initialize` and `worker.loadLanguage` functions now do nothing and can be deleted from code
    - `worker.initialize` and `worker.loadLanguage` functions should be deleted from code
    - See [this issue](https://github.com/naptha/tesseract.js/issues/820) for full list
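
The `createWorker` change listed above could look roughly like the following in application code. This is a minimal sketch: `makeWorker` and `main` are hypothetical names introduced here, and `createWorker` is passed in as a parameter only so the wrapper can be exercised without the library installed.

```javascript
// Before v5, the language was set with separate calls after creating the worker:
//   const worker = await createWorker();
//   await worker.loadLanguage('chi_sim');
//   await worker.initialize('chi_sim');

// Hypothetical wrapper (not part of Tesseract.js) showing the v5 call shape.
async function makeWorker(createWorker, lang, oem) {
  // v5: language and OEM are passed directly to createWorker.
  const worker = await createWorker(lang, oem);
  // No worker.loadLanguage(lang) or worker.initialize(lang) calls follow.
  return worker;
}

// Usage sketch with the real library; not invoked automatically.
async function main() {
  const { createWorker } = await import('tesseract.js');
  const worker = await makeWorker(createWorker, 'chi_sim', 1);
  await worker.terminate();
}
```

When migrating, searching the codebase for `loadLanguage` and `initialize` is a quick way to find the calls that should be removed.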

Upgrading from v2 to v5? See [this guide](https://github.com/naptha/tesseract.js/issues/771).
@@ -125,17 +134,6 @@ Version 4 includes many new features and bug fixes--see [this issue](https://git
- `createWorker` is now async
- `getPDF` function replaced by `pdf` recognize option

## Major changes in v3
- Significantly faster performance
    - Runtime reduction of 84% for Browser and 96% for Node.js when recognizing the [example images](./examples/data)
- Upgrade to Tesseract v5.1.0 (using emscripten 3.1.18)
- Added SIMD-enabled build for supported devices
- Added support:
    - Node.js version 18
- Removed support:
    - ASM.js version, any other old versions of Tesseract.js-core (<3.0.0)
    - Node.js versions 10 and 12

## Contributing

### Development