model and user guide: improve markdown
bertsky committed Jun 23, 2023
1 parent a07f613 commit 3842c6f
Showing 2 changed files with 43 additions and 33 deletions.
16 changes: 8 additions & 8 deletions site/en/models.md
@@ -95,14 +95,14 @@ This will look up the resource in the [bundled resource and user databases](#use
unarchive (where applicable) and store it in the [proper location](#where-is-the-data).


> **NOTE:** The special name `*` can be used instead of a resource name/url to
> **Note**: The special name `*` can be used instead of a resource name/url to
> download *all* known resources for this processor. To download all tesseract models:
```sh
ocrd resmgr download ocrd-tesserocr-recognize '*'
```

> **NOTE:** Equally, the special processor `*` can be used instead of a processor and a resource
> **Note**: Equally, the special processor `*` can be used instead of a processor and a resource
> to download *all* known resources for *all* installed processors:
```sh
@@ -162,10 +162,10 @@ To download models to `ocrd-models` in the host FS and `/models` in the containe
```sh
docker run --user $(id -u) \
--volume ocrd-models:/models \
ocrd/all \
ocrd resmgr download ocrd-tesserocr-recognize eng.traineddata\; \
ocrd resmgr download ocrd-calamari-recognize default\; \
...
ocrd/all \
ocrd resmgr download ocrd-tesserocr-recognize eng.traineddata\; \
ocrd resmgr download ocrd-calamari-recognize default\; \
...
```

To run processors, then as usual do:
@@ -197,7 +197,7 @@ This allows you to use the OCR-D/core resource manager mechanics, including
lookup of known resources by name or URL, without relying (only) on the
database maintained by the OCR-D/core developers.

> **NOTE:** If you produced or found resources that are interesting for the wider
> **Note**: If you produced or found resources that are interesting for the wider
> OCR(-D) community, please tell us in the [OCR-D gitter chat](https://gitter.im/OCR-D/Lobby)
> or open an issue in the respective Github repository, so we can add it to the database.
@@ -255,7 +255,7 @@ To use a specific model with OCR-D's ocropus wrapper in
ocrd-cis-ocropy-recognize -I OCR-D-SEG-LINE -O OCR-D-OCR-OCRO -P model fraktur-jze.pyrnn.gz
```

**NOTE:** Model must be downloade before with
> **Note**: The model must have been downloaded before with
```sh
ocrd resmgr download ocrd-cis-ocropy-recognize fraktur-jze.pyrnn.gz
60 changes: 35 additions & 25 deletions site/en/user_guide.md
@@ -10,14 +10,17 @@ title: User Guide for Non-IT Users

# User Guide for Non-IT Users

The following guide provides a detailed description on how to use the OCR-D-Software after it has been installed successfully. As explained in the
setup guide, you can either use the [OCR-D-Docker-solution](https://ocr-d.github.io/en/setup#ocrd_all-via-docker), or you can
[install the Software locally](https://ocr-d.github.io/en/setup#ocrd_all-natively). Note that these two options require different prerequisites to get
started with OCR-D after the installation as detailed in the very next two paragraphs. The [third preparatory step](#preparing-a-workspace) is
obligatory for both Docker and Non-Docker users!
The following guide provides a detailed description on how to use the OCR-D software after it has been
[installed](setup) successfully. As explained in the [Setup Guide](setup), you can either use the
[OCR-D Docker solution](https://ocr-d.github.io/en/setup#ocrd_all-via-docker), or you can
[install the Software natively](https://ocr-d.github.io/en/setup#ocrd_all-natively) on your OS.

Furthermore, Docker commands have a [different syntax than native calls](#translating-native-commands-to-docker-calls).
This guide always states native calls first but follows up with the respective command for Docker users.
Depending on which option you prefer, you will require different steps to run OCR-D, as detailed
in the following two paragraphs. (The [third paragraph](#preparing-a-workspace) is obligatory
for both Docker and native users.)

Docker commands need a [extra syntax over native commands](#translating-native-commands-to-docker-calls).
This guide always states native calls first, but follows up with the respective command for Docker.

## Preparations

@@ -239,9 +242,9 @@ a fileGrp `OCR-D-IMG` referencing your local image files.
> when copying and pasting from the sample calls provide on this website.

## Using the OCR-D-processors
## Using the OCR-D processors

### OCR-D Syntax
### OCR-D command-line interface syntax

There are several ways for invoking the OCR-D processors. Still, all of them
make use of the following syntax:
@@ -257,12 +260,15 @@ make use of the following syntax:
> **Note**: For some processors, all parameters are optional, while other processors such as
> `ocrd-tesserocr-recognize` will not work without some parameter specifications.
For information on the available processors, and their respective parameters,
see [getting more information about processors](#get-more-information-about-processors).

### Calling a single processor
If you just want to call a single processor, e.g. for testing purposes, you can go into your workspace and use the following command:
If you just want to run a single processor, you can go into your workspace and use the following command:
```sh
ocrd-{processor needed} -I {Input-Group} -O {Output-Group} [-p {parameter-file}] [-P {parameter} {value}]
ocrd-{processor name} -I {Input-Group} -O {Output-Group} [-p {parameter-file}] [-P {parameter} {value}]
## alternatively, using Docker:
docker run --rm -u $(id -u) -v $PWD:/data -- ocrd/all:maximum ocrd-{processor needed} -I {Input-Group} -O {Output-Group} [-p {parameter-file}] [-P {parameter} {value}]
docker run --rm -u $(id -u) -v $PWD:/data -- ocrd/all:maximum ocrd-{processor name} -I {Input-Group} -O {Output-Group} [-p {parameter-file}] [-P {parameter} {value}]
```
For example, your processor call command could look like this:
```sh
@@ -278,7 +284,7 @@ It will also add information about this processing step in the METS metadata.

> **Note**: For processors using multiple input- or output fileGrps you have to use a comma-separated list.
E.g.:
For example:

```sh
ocrd-cor-asv-ann-align -I OCR-D-OCR1,OCR-D-OCR2,OCR-D-OCR3 -O OCR-D-OCR4
@@ -299,10 +305,17 @@ docker run --rm -u $(id -u) -v $PWD:/data -- ocrd/all:maximum ocrd-cor-asv-ann-a

### Calling several processors

Running several processors one after another on the same data is called a **workflow**.
For workflow processing, you need a workflow format and a workflow engine.

In the most simple case, you just write a shell script which combines single processor
calls in a command sequence joined by `&&`. The following paragraphs will describe more
advanced options.
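
The simple shell-script case mentioned above can be sketched as follows. This is only an illustration under assumptions: the fileGrp names, the choice of processors and the `-P model eng` parameter are examples rather than a recommended workflow, and `DRY_RUN`, `step` and `run_workflow` are hypothetical names used by this script only. By default the script just prints the calls so they can be checked first.

```sh
#!/usr/bin/env bash
# Minimal workflow sketch: each step runs only if the previous one succeeded.
# Assumptions (not from the guide): fileGrp names, processors and the model
# parameter are illustrative; DRY_RUN is a hypothetical switch of this script.

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the calls, 0 = actually run them

step() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$@"          # show the command instead of executing it
    else
        "$@"               # execute the processor; its exit status is returned
    fi
}

run_workflow() {
    # Joined by &&, so a failing step prevents all later steps from running.
    step ocrd-cis-ocropy-binarize -I OCR-D-IMG -O OCR-D-BIN &&
    step ocrd-tesserocr-segment -I OCR-D-BIN -O OCR-D-SEG &&
    step ocrd-tesserocr-recognize -I OCR-D-SEG -O OCR-D-OCR -P model eng
}

run_workflow
```

To actually run the steps, the script would be called from inside the workspace with `DRY_RUN=0`, assuming the processors and the required model are installed.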

#### ocrd process

If you quickly want to specify a particular workflow on the CLI, you can use
`ocrd process`, which has a similar syntax as calling single processor CLIs.
`ocrd process`, which has a similar syntax as calling single processor CLIs:

```sh
ocrd process \
@@ -336,17 +349,14 @@ in your workspace (i.e. both as files on the filesystem and referenced in the `m
It will also add information about this processing step in the METS metadata.

The processors work on the files sequentially. So at first, all pages will be processed
with the first processor (e.g. binarized), then all pages will be processed
with the first processor (e.g. binarized), then (if successful) all pages will be processed
by the second processor (e.g. segmented) etc.

So In the end your workspace should contain a directory (and fileGrp) with (intermediate)
So in the end, your workspace should contain a directory (and fileGrp) with (intermediate)
processing results for each output fileGrp specified in the workflow.

> **Note**: In contrast to calling a single processor, for `ocrd process` you leave
out the prefix `ocrd-` before the name of a particular processor.

For information on the available processors see [section at the end](#get-more-information-about-processors).

> out the prefix `ocrd-` before the name of a particular processor.
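
The prefix rule in the note can be illustrated with a small sketch. It assumes that each task passed to `ocrd process` is a single quoted string; `to_task` is a hypothetical helper, and the fileGrp names are examples. The resulting call is printed rather than executed.

```sh
# Sketch: turn single-processor calls into task strings for `ocrd process`
# by dropping the leading "ocrd-". to_task is a hypothetical helper.
to_task() {
    printf '%s\n' "${1#ocrd-}"
}

first=$(to_task "ocrd-cis-ocropy-binarize -I OCR-D-IMG -O OCR-D-BIN")
second=$(to_task "ocrd-tesserocr-segment -I OCR-D-BIN -O OCR-D-SEG")

# Print the resulting call instead of running it, so it can be inspected first.
printf 'ocrd process "%s" "%s"\n' "$first" "$second"
```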

#### ocrd-make

@@ -423,7 +433,7 @@ look like this …
ocrd-tesserocr-segment -I OCR-D-IMG -O OCR-D-SEG
```

… to run it with the [`ocrd/all:maximum`] Docker container …
… to run it with the [`ocrd/all:maximum`](https://hub.docker.com/r/ocrd/all/tags) Docker container …

```sh
docker run -u $(id -u) -v $PWD:/data -v ocrd-models:/models -- ocrd/all:maximum ocrd-tesserocr-segment -I OCR-D-IMG -O OCR-D-SEG
@@ -459,7 +469,7 @@ and/or processors. For an overview on the existing processors, their tasks and f



### Get more Information about Processors
### Get more information about processors

To get all available processors you might use the autocomplete in your preferred console.

@@ -473,16 +483,16 @@ Type `ocrd-` followed by a tab character (for autocompletion proposals) to get a
To get further information about a particular processor, call it with `--help` or `-h`:

```sh
{processor name} --help
ocrd-{processor name} --help
## alternatively, using Docker:
docker run --rm -u $(id -u) -v $PWD:/data -- ocrd/all:maximum {processor name} --help
docker run --rm -u $(id -u) -v $PWD:/data -- ocrd/all:maximum ocrd-{processor name} --help
```


### Using models

Several processors rely on models, which usually have to be downloaded beforehand.
An overview on the existing model repositories and short descriptions on the most important models
can be found [in our Models Guide](https://ocr-d.de/en/models).
can be found in our [Models Guide](https://ocr-d.de/en/models).
We strongly recommend to use the [OCR-D resource manager](https://ocr-d.de/en/models) to download the models,
as this makes it easy to both download and use them.
