diff --git a/ocfweb/docs/docs/staff/backend/printhost.md b/ocfweb/docs/docs/staff/backend/printhost.md index 07b2a8640..542fb99ea 100644 --- a/ocfweb/docs/docs/staff/backend/printhost.md +++ b/ocfweb/docs/docs/staff/backend/printhost.md @@ -1,20 +1,29 @@ -[[!meta title="Printhost"]] +[[!meta title="Printing Infrastructure"]] ## Introduction -The OCF's print server is based around two components: [CUPS][cups], the -standard UNIX print server, and a custom print accounting system contained in +Print jobs at the OCF are processed both locally and by our central printing +server, known as printhost or whiteout. All local machines and printhost use +[CUPS][cups], an open source UNIX printing system, to manage printing jobs. + +A CUPS instance runs on each of our desktops, as well as on our public and staff +login servers (tsunami and supernova, respectively). All print jobs, after going +through a filter that rasterizes any PDFs, are then forwarded onto printhost +for further processing before finally being printed. + +The OCF's central print server is based around two components: CUPS, +and a custom print accounting system contained in the ocflib API. CUPS is responsible for receiving print jobs over the network, converting documents to a printer-friendly format, and delivering processed jobs to one of the available printers. The OCF's print accounting system, nicknamed enforcer after one of the scripts, plugs into CUPS as a hook that looks at jobs before and after going to the printer. It records jobs in a database that keeps track of how many pages each user has printed, rejecting -jobs that go over quota. The high level flow of data through the print system +jobs that go over quota. The high level flow of data through printhost looks like this: ``` - [Application] + [Local machine] + | PDF or PS document v @@ -52,12 +61,18 @@ looks like this: The first stage of printing is handled by the application that sends the print job, such as Evince. The application opens up a system print dialog, which gets -a list of available printers and options from the local CUPS client, which in -turn gets it from the printhost. The application renders the desired pages to a -PostScript, PDF, or other CUPS-compatible format, then sends it to the -printhost. - -The CUPS server on the printhost receives the job and print options and queues +a list of available printers and options from the local CUPS client. The +application renders the desired pages to a PostScript, PDF, or other +CUPS-compatible format. + +The local CUPS server queues the job, first sending it to a backend filter, +[raster-filter][raster-filter], which catches any PDF print jobs and rasterizes +the job beforehand, using the ImageMagick program [convert][convert]. Having +this initial rasterization performed locally reduces the processing load on +printhost, as running convert may take a nonnegligible amount of time and +resources per job. The filtered jobs are then passed on to whiteout via IPP. + +The CUPS server on printhost receives the job and print options and queues the job for printing. The actual document, plus metadata including user-set options, is stored in the print spool at `/var/spool/cups` until a printer becomes available to print it. The document is converted into a more @@ -71,6 +86,8 @@ visiting the printer's IP over the web (e.g. `https://papercut/`). In the OCF's case, security is provided by an access control list (ACL) which accepts print jobs from the printhost and rejects jobs from other hosts. +[raster-filter]: https://github.com/ocf/puppet/blob/master/modules/ocf/files/packages/cups/raster-filter +[convert]: https://legacy.imagemagick.org/script/convert.php ### Filters @@ -89,13 +106,22 @@ duplexing, then, finally, a device-specific filter such as `hpcups`. Each filter is associated with an internal "cost", and CUPS picks the path with the least total cost to print the document. -At the OCF, print jobs are all processed by a single filter, [ocfps][ocfps], +On each local machine, we use [Tea4CUPS][Tea4CUPS], a Python CUPS wrapper, to +run `raster-filter` on each local machine. As mentioned above, `raster-filter` +uses convert to rasterize PDF jobs, producing a PDF of lower complexity. If +convert returns any errors, the error messages are emailed to the root mailing +list (root@ocf.berkeley.edu) via the [convert-failure][convert-failure] script, +and the original, non-rasterized PDF is sent toward printhost. + +On printhost, print jobs are all processed by a single filter, [ocfps][ocfps], which converts raw PDFs to rasterized, printable PostScript. It calls on a command-line converter to render the PDF as pixels (rasterization), then passes the result and the rest of the arguments to standard CUPS filters. So far, this has given us the fewest headaches in terms of malformatted output and printer errors. +[Tea4CUPS]: https://wiki.debian.org/Tea4CUPS +[convert-failure]: https://github.com/ocf/puppet/blob/master/modules/ocf/files/packages/cups/convert_failure [ocfps]: https://github.com/ocf/puppet/blob/master/modules/ocf_printhost/files/ocfps @@ -115,8 +141,7 @@ afford the toner. ## Print accounting -The OCF uses a virtual CUPS printer backend called [Tea4CUPS][Tea4CUPS] to -install a page accounting hook that runs before and after each job is actually +The OCF uses a Tea4CUPS backend to install a page accounting hook that runs before and after each job is actually sent to the printer. The script is called [enforcer][enforcer], but all the logic is contained in the [ocflib printing package][ocflib.printing]. All jobs are logged in the `ocfprinting` SQL database, including the username, print @@ -130,7 +155,6 @@ over daily or semesterly quota, it emails the user and returns an error code that cancels the job. Otherwise, it logs successful print jobs in the database and emails users in the case a job fails. -[Tea4CUPS]: https://wiki.debian.org/Tea4CUPS [enforcer]: https://github.com/ocf/puppet/blob/master/modules/ocf_printhost/files/enforcer [ocflib.printing]: https://github.com/ocf/ocflib/tree/master/ocflib/printing diff --git a/ocfweb/docs/docs/staff/scripts/pdf-open.md b/ocfweb/docs/docs/staff/scripts/pdf-open.md index 909782d44..f74adb450 100644 --- a/ocfweb/docs/docs/staff/scripts/pdf-open.md +++ b/ocfweb/docs/docs/staff/scripts/pdf-open.md @@ -9,6 +9,10 @@ Examples of PDFs that commonly fail include many Econ 136 and Bio 1B papers, PDFs with strange images or scanned components in them, and things people try to print straight from Gmail attachment viewer. +Normally, our printing system already performs PDF rasterization before sending +the job off to our printing server. However, in the case of troublesome PDFs, +it may help to rasterize the PDF beforehand. + If a user comes asking why their paper isn't printing right, first download the PDF, then run `pdf-open $pdf_file`. Don't forget the hyphen, and make sure the filename doesn't have any spaces or weird characters in it. After