A Practitioner's View on Binarization

This page aims to give an idea of how to approach binarization and image enhancement from the perspective of an OCR-D user. It will aggregate information on available processors and parameters, general recommendations, and a few select pathological or hard examples, with hints on interdependencies with other processing steps.

Tools

As discussed in the workflow guide, it can be beneficial to binarize repeatedly: both on the raw (uncropped) page and on the cropped page, the text/table regions, or the text lines. However, not all processors can run on every hierarchy level, or when some binarization is already present.

Also, not all algorithms cope well with unnormalized/unenhanced images (with too low/high contrast, too low/high brightness, far-off white point, noise, optical distortion). Thus, there's a strong connection with image preprocessing (like raw denoising).
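To illustrate the kind of normalization meant here, the following is a minimal sketch of a percentile-based contrast stretch on a flat list of 8-bit gray values. The function name `stretch_contrast` and the 1%/99% cut-offs are assumptions chosen for illustration; they are not taken from any OCR-D processor.

```python
def stretch_contrast(pixels, low_pct=1.0, high_pct=99.0):
    """Linearly map the low/high percentiles of 8-bit gray values to 0/255."""
    ordered = sorted(pixels)
    n = len(ordered)
    # Percentile lookup by index; the cut-offs are illustrative assumptions.
    lo = ordered[int((n - 1) * low_pct / 100.0)]
    hi = ordered[int((n - 1) * high_pct / 100.0)]
    if hi <= lo:
        return list(pixels)  # flat image: nothing to stretch
    scale = 255.0 / (hi - lo)
    # Clip to the valid 8-bit range after scaling.
    return [min(255, max(0, int(round((p - lo) * scale)))) for p in pixels]


# A low-contrast "page": ink at gray level 100, paper at gray level 150.
faded = [100] * 50 + [150] * 50
stretched = stretch_contrast(faded)
print(min(stretched), max(stretched))
```

After stretching, ink and paper sit at the ends of the gray scale, which is the situation most global and adaptive thresholding methods were designed for.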

Moreover, some algorithms produce quite noisy results. This can be favourable for recognition (if the model is trained appropriately), but almost always degrades segmentation quality. Also, after binarization there is usually no need for very high pixel density images. Thus, there's a strong connection with image postprocessing (like binary denoising and downscaling).

Furthermore, at least for the class of adaptive algorithms, there is an additional interdependency. Methods like Sauvola or Niblack require setting an adequate window size (such that windows are always likely to include both text and background, of sufficient variance, but are still local "enough"). Thus, there's a strong connection with DPI (pixel density of the input images), especially DPI metadata.
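The dependency on DPI can be sketched as follows. This is only an illustration: the rule of thumb of a window spanning about one inch, the fallback of 300 DPI for missing or implausible metadata, and the function name `window_size_for_dpi` are all assumptions, not the documented behaviour of any particular processor.

```python
def window_size_for_dpi(dpi, inches=1.0, fallback_dpi=300):
    """Pick an odd window size (in pixels) covering roughly `inches` of paper."""
    # Missing or implausible metadata (assumed here: below 72 DPI)
    # falls back to a default pixel density.
    if dpi is None or dpi < 72:
        dpi = fallback_dpi
    size = max(3, int(round(dpi * inches)))
    # Local windows are centred on a pixel, so force an odd size.
    return size if size % 2 == 1 else size + 1


for dpi in (150, 300, 600, None):
    print(dpi, window_size_for_dpi(dpi))
```

The point of the sketch is that a fixed pixel window covers very different amounts of text at 150 and at 600 DPI, so wrong or missing DPI metadata silently changes what "local" means for the algorithm.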

... list processors as in the workflow table, with short remarks; also available now: ocrd-preprocess-image wrapping arbitrary external binarization or image enhancement tools.

Recommendations

... what to do under which circumstances, source media, resolution etc.

Examples

This image has been provided by Jochen Barth (UB Heidelberg):
This image has been provided by Kay-Michael Würzner (SLUB):

... processor/parameter options, preprocessing options, discussion, more examples

Discussion

@jbarth-ubhd:

sauvola + kim is recommended in https://ocr-d.de/en/workflows for binarization, but the sauvola parameter k depends on the black and white levels. The wolf algorithm works independently of the black and white levels (it does not need an individually tuned parameter). singh is second best; sauvola-ms-split produces the thinnest of the acceptable results. See https://digi.ub.uni-heidelberg.de/diglitData/v/olena-20200807.tif (multi-layer TIFF with different methods, each labelled at the bottom, and varying black, white, and noise levels)
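The dependence of k on the black and white levels can be illustrated with the textbook Sauvola formula T = m * (1 + k * (s / R - 1)), where m and s are the local mean and standard deviation. The sketch below is not the Olena implementation; the values k = 0.34 and R = 128 and the idealized window (half ink, half paper) are assumptions for illustration.

```python
def sauvola_threshold(mean, std, k=0.34, r=128.0):
    """Textbook Sauvola threshold T = m * (1 + k * (s / R - 1))."""
    return mean * (1.0 + k * (std / r - 1.0))


def two_level_window(black, white):
    """Mean and standard deviation of a window that is half ink, half paper."""
    mean = (black + white) / 2.0
    std = (white - black) / 2.0
    return mean, std


for black, white in ((0, 255), (100, 150)):
    t = sauvola_threshold(*two_level_window(black, white))
    # Ink is only detected if the threshold lies between the two levels.
    verdict = "separates ink from paper" if black < t < white else "loses the ink"
    print(f"black={black} white={white} threshold={t:.1f}: {verdict}")
```

With full contrast the threshold lands between ink and paper, but with the same k on a faded image (ink at 100, paper at 150) it drops below the ink level, so k would have to be lowered for that material.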
