A Practitioner's View on Binarization

Stefan Weil edited this page Sep 18, 2020 · 9 revisions

This page outlines how to approach binarization and image enhancement from the perspective of an OCR-D user. It collects information on available processors and parameters, general recommendations, and a few selected pathological or hard examples, with hints on interdependencies with other processing steps.

Tools

As discussed in the workflow guide, it can be beneficial to binarize repeatedly: first on the raw (uncropped) page, then again on the cropped page, the text/table regions, or the text lines. However, not all processors can run on all hierarchy levels, or when some binarization is already present.

Also, not all algorithms cope well with unnormalized/unenhanced images (with too low/high contrast, too low/high brightness, far-off white point, noise, optical distortion). Thus, there's a strong connection with image preprocessing (like raw denoising).
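
As a rough illustration of the kind of normalization meant here, the following sketch stretches the contrast of a grayscale page between two percentiles using NumPy. The function name `stretch_contrast` and the percentile values are assumptions chosen for illustration; they are not taken from any OCR-D processor.

```python
import numpy as np

def stretch_contrast(gray, low_pct=1.0, high_pct=99.0):
    """Map the low/high percentiles of a grayscale image to 0 and 255.

    Hypothetical helper: the percentile defaults are illustrative only.
    """
    lo, hi = np.percentile(gray, [low_pct, high_pct])
    if hi <= lo:
        # Flat image: nothing to stretch, return it unchanged.
        return np.asarray(gray).astype(np.uint8)
    scaled = (np.asarray(gray, dtype=np.float64) - lo) / (hi - lo)
    # Values outside the percentile range are clipped to black or white.
    return (np.clip(scaled, 0.0, 1.0) * 255).round().astype(np.uint8)

# A synthetic low-contrast gradient spanning only gray levels 100..150.
page = np.tile(np.arange(100, 151, dtype=np.uint8), (10, 1))
enhanced = stretch_contrast(page)
print(page.min(), page.max(), "->", enhanced.min(), enhanced.max())
```

Clipping at percentiles rather than at the absolute minimum and maximum keeps a few outlier pixels (dust, specks) from dictating the black and white points.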

Moreover, some algorithms produce quite noisy results. This can be favourable for recognition (if the model is trained appropriately), but almost always degrades segmentation quality. Also, after binarization there is usually no need for very high pixel density images. Thus, there's a strong connection with image postprocessing (like binary denoising and downscaling).
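
To make "binary denoising" concrete, here is a minimal despeckling sketch that removes small connected foreground components from a binarized image. The name `despeckle` and the `min_pixels` default are assumptions for illustration; the pure-Python flood fill is far too slow for full pages and only meant to show the idea.

```python
from collections import deque
import numpy as np

def despeckle(foreground, min_pixels=4):
    """Drop 8-connected foreground components smaller than min_pixels.

    `foreground` is a boolean array where True marks ink.
    Hypothetical helper; the size threshold is illustrative only.
    """
    fg = np.asarray(foreground, dtype=bool)
    h, w = fg.shape
    seen = np.zeros_like(fg)
    cleaned = fg.copy()
    for y0, x0 in zip(*np.nonzero(fg)):
        if seen[y0, x0]:
            continue
        # Breadth-first flood fill collecting one connected component.
        seen[y0, x0] = True
        component, queue = [(y0, x0)], deque([(y0, x0)])
        while queue:
            y, x = queue.popleft()
            for ny in range(max(y - 1, 0), min(y + 2, h)):
                for nx in range(max(x - 1, 0), min(x + 2, w)):
                    if fg[ny, nx] and not seen[ny, nx]:
                        seen[ny, nx] = True
                        component.append((ny, nx))
                        queue.append((ny, nx))
        if len(component) < min_pixels:
            for y, x in component:
                cleaned[y, x] = False
    return cleaned

# A 3x3 "glyph" plus one isolated speckle pixel.
ink = np.zeros((10, 10), dtype=bool)
ink[2:5, 2:5] = True
ink[8, 8] = True
result = despeckle(ink)
print(int(ink.sum()), "->", int(result.sum()))
```

A suitable `min_pixels` again depends on pixel density: a speckle of a few pixels at 300 DPI covers many more pixels at 600 DPI.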

Furthermore, at least for the class of adaptive algorithms, there is an additional interdependency. Methods like Sauvola or Niblack require setting an adequate window size (such that windows are always likely to include both text and background, of sufficient variance, but are still local "enough"). Thus, there's a strong connection with DPI (pixel density of the input images), especially DPI metadata.
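
The link between window size and DPI can be sketched as follows. This is a minimal NumPy version of the Sauvola threshold T = m * (1 + k * (s / R - 1)), with m and s the local mean and standard deviation, under stated assumptions: the helper names, the quarter-inch window heuristic, and the defaults k=0.5 and R=128 (values commonly cited for Sauvola's method) are illustrative and are not the defaults of any OCR-D processor.

```python
import numpy as np

def window_size_from_dpi(dpi, inches=0.25):
    """Hypothetical heuristic: window edge as a fraction of an inch, forced odd."""
    size = max(3, int(round(dpi * inches)))
    return size if size % 2 == 1 else size + 1

def local_mean_std(gray, window):
    """Local mean and standard deviation via integral images (edge-padded)."""
    pad = window // 2
    padded = np.pad(np.asarray(gray, dtype=np.float64), pad, mode="edge")
    # Integral images with a leading zero row and column.
    ii = np.pad(padded, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    ii2 = np.pad(padded ** 2, ((1, 0), (1, 0))).cumsum(0).cumsum(1)
    h, w = np.asarray(gray).shape

    def box_sum(table):
        return (table[window:window + h, window:window + w]
                - table[:h, window:window + w]
                - table[window:window + h, :w]
                + table[:h, :w])

    n = window * window
    mean = box_sum(ii) / n
    var = np.maximum(box_sum(ii2) / n - mean ** 2, 0.0)
    return mean, np.sqrt(var)

def sauvola(gray, window, k=0.5, r=128.0):
    """Return a boolean image: True = background, False = ink."""
    mean, std = local_mean_std(gray, window)
    threshold = mean * (1.0 + k * (std / r - 1.0))
    return np.asarray(gray) > threshold

# Synthetic page: light background with a dark vertical stroke.
page = np.full((100, 100), 200, dtype=np.uint8)
page[:, 48:52] = 50
window = window_size_from_dpi(300)  # DPI taken from (trusted) metadata
binary = sauvola(page, window)
print(window, int((~binary).sum()))
```

If the DPI metadata is wrong or missing, a window derived this way will be too small or too large for the actual glyph size, which is why correct DPI metadata matters for adaptive methods.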

... list processors as in the workflow table, with short remarks; also available now: ocrd-preprocess-image wrapping arbitrary external binarization or image enhancement tools.

Recommendations

... what to do under which circumstances, source media, resolution etc.

Examples

  1. This image has been provided by Jochen Barth (UB Heidelberg). [images: low-contrast+rupture; ocrd-olena-binarize+sauvola-ms-split+default-k; ocrd-olena-binarize+sauvola-ms-split+k-0.05]

  2. This image has been provided by Kay-Michael Würzner (SLUB). [image: low-contrast+ink-setoff]

... processor/parameter options, preprocessing options, discussion, more examples

Discussion

@jbarth-ubhd:

sauvola + kim is recommended in https://ocr-d.de/en/workflows for binarization, but the sauvola parameter k depends on the black and white levels. The wolf algorithm works independently of the black and white levels (it does not need an individually tuned parameter). singh is second best; sauvola-ms-split produces the thinnest of the acceptable results. See https://digi.ub.uni-heidelberg.de/diglitData/v/olena-20200807.tif (multi-layer TIFF comparing different methods, with the method label at the bottom, across black, white and noise levels).
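
The dependence of k on the black and white levels can be checked with a little arithmetic on the Sauvola threshold T = m * (1 + k * (s / R - 1)). The sketch below is an illustration under assumptions: the gray levels, the 5% ink share per window, R=128, and the helper names are all hypothetical, chosen only to contrast a large k (0.5) with a small one (0.05).

```python
import numpy as np

def sauvola_threshold(mean, std, k, r=128.0):
    """Sauvola threshold for one window, given its mean and standard deviation."""
    return mean * (1.0 + k * (std / r - 1.0))

def window_stats(background, ink, ink_fraction=0.05):
    """Mean/std of a hypothetical window with the given share of ink pixels."""
    n = 1000
    n_ink = int(n * ink_fraction)
    pixels = np.array([ink] * n_ink + [background] * (n - n_ink), dtype=float)
    return pixels.mean(), pixels.std()

cases = [("high contrast", 200, 50), ("low contrast", 200, 150)]
for label, background, ink in cases:
    mean, std = window_stats(background, ink)
    for k in (0.5, 0.05):
        t = sauvola_threshold(mean, std, k)
        print(f"{label}, k={k}: threshold={t:.1f}, ink detected={ink < t}")
```

In the low-contrast case the local standard deviation is small relative to R, so with k=0.5 the threshold drops to roughly half the local mean and falls below the ink level; the ink is then classified as background. A much smaller k keeps the threshold between ink and paper here, at the price of a threshold that sits closer to the background level.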
