Printing docs update #540

Closed
wants to merge 4 commits into from
54 changes: 39 additions & 15 deletions ocfweb/docs/docs/staff/backend/printhost.md
@@ -1,20 +1,29 @@
[[!meta title="Printhost"]]
[[!meta title="Printing Infrastructure"]]

## Introduction

The OCF's print server is based around two components: [CUPS][cups], the
standard UNIX print server, and a custom print accounting system contained in
Print jobs at the OCF are processed both locally and by our central printing
server, known as printhost or whiteout. All local machines and printhost use
[CUPS][cups], an open source UNIX printing system, to manage printing jobs.

A CUPS instance runs on each of our desktops, as well as on our public and staff
login servers (tsunami and supernova, respectively). All print jobs, after going
through a filter that rasterizes any PDFs, are then forwarded onto printhost
for further processing before finally being printed.

The OCF's central print server is based around two components: CUPS,
Review comment (Member): You might want to reflow this paragraph. /shrug

and a custom print accounting system contained in
the ocflib API. CUPS is responsible for receiving print jobs over the network,
converting documents to a printer-friendly format, and delivering processed
jobs to one of the available printers. The OCF's print accounting system,
nicknamed enforcer after one of the scripts, plugs into CUPS as a hook that
looks at jobs before and after going to the printer. It records jobs in a
database that keeps track of how many pages each user has printed, rejecting
jobs that go over quota. The high level flow of data through the print system
jobs that go over quota. The high-level flow of data through printhost
looks like this:

```
[Application]
[Local machine]
Review comment (Member): Should we add the local machine CUPS to this chart as well?

+
| PDF or PS document
v
@@ -52,12 +61,18 @@ looks like this:

The first stage of printing is handled by the application that sends the print
job, such as Evince. The application opens up a system print dialog, which gets
a list of available printers and options from the local CUPS client, which in
turn gets it from the printhost. The application renders the desired pages to a
PostScript, PDF, or other CUPS-compatible format, then sends it to the
printhost.

The CUPS server on the printhost receives the job and print options and queues
a list of available printers and options from the local CUPS client. The
application renders the desired pages to a PostScript, PDF, or other
CUPS-compatible format.

The local CUPS server queues the job, first sending it to a backend filter,
[raster-filter][raster-filter], which catches any PDF print jobs and rasterizes
the job beforehand, using the ImageMagick program [convert][convert]. Having
this initial rasterization performed locally reduces the processing load on
printhost, as running convert may take a nonnegligible amount of time and
resources per job. The filtered jobs are then passed on to whiteout via IPP.

The CUPS server on printhost receives the job and print options and queues
the job for printing. The actual document, plus metadata including user-set
options, is stored in the print spool at `/var/spool/cups` until a printer
becomes available to print it. The document is converted into a more
@@ -71,6 +86,8 @@ visiting the printer's IP over the web (e.g. `https://papercut/`). In the OCF's
case, security is provided by an access control list (ACL) which accepts print
jobs from the printhost and rejects jobs from other hosts.

[raster-filter]: https://github.com/ocf/puppet/blob/master/modules/ocf/files/packages/cups/raster-filter
[convert]: https://legacy.imagemagick.org/script/convert.php

### Filters

@@ -89,13 +106,22 @@ duplexing, then, finally, a device-specific filter such as `hpcups`. Each
filter is associated with an internal "cost", and CUPS picks the path with the
least total cost to print the document.
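
As a rough illustration of the least-cost selection described above, filter chaining can be modeled as a shortest-path search over document formats. The sketch below is not CUPS's actual implementation; the filter names and cost values are invented for the example.

```python
import heapq


def cheapest_filter_chain(filters, source, target):
    """Return (total_cost, [filter names]) for the cheapest chain, or None.

    `filters` is a list of (src_format, dst_format, cost, name) tuples.
    This is Dijkstra's algorithm with document formats as graph nodes.
    """
    graph = {}
    for src, dst, cost, name in filters:
        graph.setdefault(src, []).append((dst, cost, name))

    queue = [(0, source, [])]
    settled = {}
    while queue:
        cost, fmt, chain = heapq.heappop(queue)
        if fmt == target:
            return cost, chain
        if fmt in settled and settled[fmt] <= cost:
            continue  # already reached this format at least as cheaply
        settled[fmt] = cost
        for dst, step_cost, name in graph.get(fmt, []):
            heapq.heappush(queue, (cost + step_cost, dst, chain + [name]))
    return None


# Hypothetical filters and costs, for illustration only.
EXAMPLE_FILTERS = [
    ("pdf", "postscript", 50, "pdf-to-ps"),
    ("postscript", "raster", 50, "ps-to-raster"),
    ("pdf", "raster", 66, "pdf-to-raster"),
    ("raster", "printer", 10, "raster-to-printer"),
]

result = cheapest_filter_chain(EXAMPLE_FILTERS, "pdf", "printer")
```

With these made-up costs, the direct PDF-to-raster route wins over the two-step route through PostScript, since 66 + 10 is less than 50 + 50 + 10.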

At the OCF, print jobs are all processed by a single filter, [ocfps][ocfps],
On each local machine, we use [Tea4CUPS][Tea4CUPS], a Python CUPS wrapper, to
run `raster-filter`. As mentioned above, `raster-filter`
uses convert to rasterize PDF jobs, producing a PDF of lower complexity. If
convert returns any errors, the error messages are emailed to the root mailing
list ([email protected]) via the [convert-failure][convert-failure] script,
and the original, non-rasterized PDF is sent toward printhost.
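
The fallback behavior described above can be sketched as follows. This is an illustrative sketch, not the actual `raster-filter` script: the function names, the `-density 300` setting, and the failure-reporting callback are all assumptions made for the example.

```python
import os
import subprocess
import tempfile


def rasterize_with_convert(pdf_bytes, density=300):
    """Rasterize a PDF by shelling out to ImageMagick's convert.

    Assumption: the density value and exact invocation are illustrative,
    not copied from the real raster-filter.
    """
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "in.pdf")
        dst = os.path.join(tmp, "out.pdf")
        with open(src, "wb") as f:
            f.write(pdf_bytes)
        # check=True raises CalledProcessError if convert exits non-zero.
        subprocess.run(
            ["convert", "-density", str(density), src, dst],
            check=True,
            capture_output=True,
        )
        with open(dst, "rb") as f:
            return f.read()


def prepare_job(pdf_bytes, rasterize=rasterize_with_convert, report_failure=print):
    """Return the rasterized PDF, or the original PDF if rasterization fails.

    `report_failure` stands in for whatever notifies staff (hypothetical here).
    """
    try:
        return rasterize(pdf_bytes)
    except (subprocess.CalledProcessError, OSError) as exc:
        report_failure(str(exc))
        return pdf_bytes
```

Passing the rasterizer and the reporter in as arguments keeps the fallback logic testable without ImageMagick installed or a mail server available.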

On printhost, print jobs are all processed by a single filter, [ocfps][ocfps],
which converts raw PDFs to rasterized, printable PostScript. It calls on a
command-line converter to render the PDF as pixels (rasterization), then passes
the result and the rest of the arguments to standard CUPS filters. So far, this
has given us the fewest headaches in terms of malformatted output and printer
errors.

[Tea4CUPS]: https://wiki.debian.org/Tea4CUPS
[convert-failure]: https://github.com/ocf/puppet/blob/master/modules/ocf/files/packages/cups/convert_failure
[ocfps]: https://github.com/ocf/puppet/blob/master/modules/ocf_printhost/files/ocfps


@@ -115,8 +141,7 @@ afford the toner.

## Print accounting

The OCF uses a virtual CUPS printer backend called [Tea4CUPS][Tea4CUPS] to
install a page accounting hook that runs before and after each job is actually
The OCF uses a Tea4CUPS backend to install a page accounting hook that runs before and after each job is actually
Review comment (Member): Wrap at 80 if possible.

Review comment (Member): Also, you should specify that this happens on printhost, not locally, here.

sent to the printer. The script is called [enforcer][enforcer], but all the
logic is contained in the [ocflib printing package][ocflib.printing]. All jobs
are logged in the `ocfprinting` SQL database, including the username, print
@@ -130,7 +155,6 @@ over daily or semesterly quota, it emails the user and returns an error code
that cancels the job. Otherwise, it logs successful print jobs in the database
and emails users in the case a job fails.
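
A minimal sketch of the quota check described above is shown below. It does not reproduce the real `enforcer` script or the ocflib printing package: `RemainingQuota` and `check_job` are hypothetical names, and the database lookup and email steps are left out.

```python
from collections import namedtuple

# Hypothetical container for the pages a user may still print.
RemainingQuota = namedtuple("RemainingQuota", ["daily", "semesterly"])


def check_job(pages, remaining):
    """Decide whether a job of `pages` pages fits within the user's quota.

    Returns (allowed, message). A job must fit within both the daily and
    the semesterly allowance, so the smaller of the two is the limit.
    """
    limit = max(0, min(remaining.daily, remaining.semesterly))
    if pages > limit:
        return False, "job is {} pages but only {} remain".format(pages, limit)
    return True, "ok"
```

In a pre-job hook, a `False` result would map to the error code that cancels the job, and the message could be reused in the email sent to the user.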

[Tea4CUPS]: https://wiki.debian.org/Tea4CUPS
[enforcer]: https://github.com/ocf/puppet/blob/master/modules/ocf_printhost/files/enforcer
[ocflib.printing]: https://github.com/ocf/ocflib/tree/master/ocflib/printing

4 changes: 4 additions & 0 deletions ocfweb/docs/docs/staff/scripts/pdf-open.md
@@ -9,6 +9,10 @@ Examples of PDFs that commonly fail include many Econ 136 and Bio 1B papers,
PDFs with strange images or scanned components in them, and things people try
to print straight from Gmail attachment viewer.

Normally, our printing system already performs PDF rasterization before sending
the job off to our printing server. However, in the case of troublesome PDFs,
it may help to rasterize the PDF beforehand.

If a user comes asking why their paper isn't printing right, first download the
PDF, then run `pdf-open $pdf_file`. Don't forget the hyphen, and make sure the
filename doesn't have any spaces or weird characters in it. After