ocf · jxlin123 · Aug 30, 2019 · Aug 30, 2019 · Aug 31, 2019 · Aug 31, 2019
diff --git a/ocfweb/docs/docs/staff/backend/printhost.md b/ocfweb/docs/docs/staff/backend/printhost.md
@@ -1,20 +1,29 @@
-[[!meta title="Printhost"]]
+[[!meta title="Printing Infrastructure"]]
 
 ## Introduction
 
-The OCF's print server is based around two components: [CUPS][cups], the
-standard UNIX print server, and a custom print accounting system contained in
+Print jobs at the OCF are processed both locally and by our central printing
+server, known as printhost or whiteout. All local machines and printhost use
+[CUPS][cups], an open source UNIX printing system, to manage printing jobs.
+
+A CUPS instance runs on each of our desktops, as well as on our public and staff
+login servers (tsunami and supernova, respectively). All print jobs, after going
+through a filter that rasterizes any PDFs, are then forwarded onto printhost
+for further processing before finally being printed.
+
+The OCF's central print server is based around two components: CUPS,
+and a custom print accounting system contained in
 the ocflib API. CUPS is responsible for receiving print jobs over the network,
 converting documents to a printer-friendly format, and delivering processed
 jobs to one of the available printers. The OCF's print accounting system,
 nicknamed enforcer after one of the scripts, plugs into CUPS as a hook that
 looks at jobs before and after going to the printer. It records jobs in a
 database that keeps track of how many pages each user has printed, rejecting
-jobs that go over quota. The high level flow of data through the print system
+jobs that go over quota. The high level flow of data through printhost
 looks like this:
 
 ```
-   [Application]
+  [Local machine]
          +
          | PDF or PS document
          v
@@ -52,12 +61,18 @@ looks like this:
 
 The first stage of printing is handled by the application that sends the print
 job, such as Evince. The application opens up a system print dialog, which gets
-a list of available printers and options from the local CUPS client, which in
-turn gets it from the printhost. The application renders the desired pages to a
-PostScript, PDF, or other CUPS-compatible format, then sends it to the
-printhost.
-
-The CUPS server on the printhost receives the job and print options and queues
+a list of available printers and options from the local CUPS client. The
+application renders the desired pages to a PostScript, PDF, or other
+CUPS-compatible format.
+
+The local CUPS server queues the job, first sending it to a backend filter,
+[raster-filter][raster-filter], which catches any PDF print jobs and rasterizes
+the job beforehand, using the ImageMagick program [convert][convert]. Having
+this initial rasterization performed locally reduces the processing load on
+printhost, as running convert may take a nonnegligible amount of time and
+resources per job. The filtered jobs are then passed on to whiteout via IPP.
+
+The CUPS server on printhost receives the job and print options and queues
 the job for printing. The actual document, plus metadata including user-set
 options, is stored in the print spool at `/var/spool/cups` until a printer
 becomes available to print it. The document is converted into a more
@@ -71,6 +86,8 @@ visiting the printer's IP over the web (e.g. `https://papercut/`). In the OCF's
 case, security is provided by an access control list (ACL) which accepts print
 jobs from the printhost and rejects jobs from other hosts.
 
+[raster-filter]: https://github.com/ocf/puppet/blob/master/modules/ocf/files/packages/cups/raster-filter
+[convert]: https://legacy.imagemagick.org/script/convert.php
 
 ### Filters
 
@@ -89,13 +106,22 @@ duplexing, then, finally, a device-specific filter such as `hpcups`. Each
 filter is associated with an internal "cost", and CUPS picks the path with the
 least total cost to print the document.
 
-At the OCF, print jobs are all processed by a single filter, [ocfps][ocfps],
+On each local machine, we use [Tea4CUPS][Tea4CUPS], a Python CUPS wrapper, to
+run `raster-filter` on each local machine. As mentioned above, `raster-filter`
+uses convert to rasterize PDF jobs, producing a PDF of lower complexity. If
+convert returns any errors, the error messages are emailed to the root mailing
+list ([email protected]) via the [convert-failure][convert-failure] script,
+and the original, non-rasterized PDF is sent toward printhost.
+
+On printhost, print jobs are all processed by a single filter, [ocfps][ocfps],
 which converts raw PDFs to rasterized, printable PostScript. It calls on a
 command-line converter to render the PDF as pixels (rasterization), then passes
 the result and the rest of the arguments to standard CUPS filters. So far, this
 has given us the fewest headaches in terms of malformatted output and printer
 errors.
 
+[Tea4CUPS]: https://wiki.debian.org/Tea4CUPS
+[convert-failure]: https://github.com/ocf/puppet/blob/master/modules/ocf/files/packages/cups/convert_failure
 [ocfps]: https://github.com/ocf/puppet/blob/master/modules/ocf_printhost/files/ocfps
 
 
@@ -115,8 +141,7 @@ afford the toner.
 
 ## Print accounting
 
-The OCF uses a virtual CUPS printer backend called [Tea4CUPS][Tea4CUPS] to
-install a page accounting hook that runs before and after each job is actually
+The OCF uses a Tea4CUPS backend to install a page accounting hook that runs before and after each job is actually
 sent to the printer. The script is called [enforcer][enforcer], but all the
 logic is contained in the [ocflib printing package][ocflib.printing]. All jobs
 are logged in the `ocfprinting` SQL database, including the username, print
@@ -130,7 +155,6 @@ over daily or semesterly quota, it emails the user and returns an error code
 that cancels the job. Otherwise, it logs successful print jobs in the database
 and emails users in the case a job fails.
 
-[Tea4CUPS]: https://wiki.debian.org/Tea4CUPS
 [enforcer]: https://github.com/ocf/puppet/blob/master/modules/ocf_printhost/files/enforcer
 [ocflib.printing]: https://github.com/ocf/ocflib/tree/master/ocflib/printing
 

diff --git a/ocfweb/docs/docs/staff/scripts/pdf-open.md b/ocfweb/docs/docs/staff/scripts/pdf-open.md
@@ -9,6 +9,10 @@ Examples of PDFs that commonly fail include many Econ 136 and Bio 1B papers,
 PDFs with strange images or scanned components in them, and things people try
 to print straight from Gmail attachment viewer.
 
+Normally, our printing system already performs PDF rasterization before sending
+the job off to our printing server. However, in the case of troublesome PDFs,
+it may help to rasterize the PDF beforehand.
+
 If a user comes asking why their paper isn't printing right, first download the
 PDF, then run `pdf-open $pdf_file`. Don't forget the hyphen, and make sure the
 filename doesn't have any spaces or weird characters in it. After