OpenPrinting/PDF as Standard Print Job Format

From The Linux Foundation

Introduction

One of the decisions made at the OSDL Printing Summit in Atlanta in 2006, and widely accepted by all participants, was to switch the standard print job transfer format from PostScript to PDF. PDF has many important advantages, in particular:

  • PDF is the common platform-independent web format for printable documents
  • Portable
  • Easy post-processing (N-up, booklets, scaling, ...)
  • Easy color management support
  • Easy high color depth support (> 8 bits/channel)
  • Easy transparency support
  • Smaller files
  • Linux workflow gets closer to Mac OS X

Most important here is the post-processing. In contrast to PostScript, every PDF file makes it easy to tell which part of the data belongs to which page. So one can easily take the pages apart and do things like printing selected pages, printing 2, 4, ... pages per sheet, printing even/odd pages for manual duplex, scaling, ... PostScript files must be strictly DSC-conforming to allow this kind of page management. By using PDF we ensure that page management always works.

Development

The switchover to PDF as standard print job format is a work in progress. For CUPS, appropriate filters are already available in the Subversion repositories of the OpenPrinting Japan SourceForge site (http://sourceforge.jp/projects/opfc).

The foomatic-rip filter was made PDF-ready by Lars Uebernickel during an internship at OpenPrinting, mentored by Till Kamppeter and funded by the Linux Foundation.

The texttopdf filter, which prints plain text files, and the pdftoijs filter, which couples IJS plug-in drivers (like HPLIP or Gutenprint) with a PDF renderer, were implemented by Tobias Hoffmann as a Google Summer of Code project, mentored by Hin-Tak Leung, author of various drivers for printers with proprietary protocols (so-called winprinters or GDI printers).

All these filters were packaged and have been included in Ubuntu since Intrepid. This way the PDF workflow was implemented on the printing system/server side.

To complete the PDF printing workflow, applications should be modified to emit PDF instead of PostScript when a document is sent to the printer. But because CUPS can convert PostScript input to PDF, changing the applications is not urgent. In particular, page management already works.

If you want to test the PDF printing workflow without changing your system, try Ubuntu Intrepid or newer. Live CDs are available.

How to switch a system to use PDF as standard print job format

Add the new PDF filters to CUPS

CUPS needs the following additional filters to process print jobs in a PDF printing workflow, with page management done by pdftopdf instead of pstops:

  • imagetopdf
  • texttopdf
  • pstopdf
  • pdftopdf
  • pdftoraster
  • pdftoijs
  • pdftoopvp
  • cpdftocps
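
A quick way to see which of these filters are already present is to test for each executable in the CUPS filter directory. The following is a minimal sketch: the function name missing_filters is hypothetical, and the default directory is the /usr/lib/cups/filter location used further down this page (your distribution may use a different one).

```shell
# List the PDF workflow filters that are not yet installed.
# Usage: missing_filters [filter-directory]
# (missing_filters is a hypothetical helper, not part of CUPS.)
missing_filters() {
    dir=${1:-/usr/lib/cups/filter}
    for f in imagetopdf texttopdf pstopdf pdftopdf pdftoraster \
             pdftoijs pdftoopvp cpdftocps; do
        # A filter counts as installed only if it exists and is executable.
        [ -x "$dir/$f" ] || echo "$f"
    done
}
```

Calling missing_filters without arguments checks the default directory; an empty output means all eight filters were found.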

Most of these filters (all except pstopdf, pdftoraster, and cpdftocps) were proposed for inclusion in the upstream CUPS package and are therefore available in a convenient tarball. The tarball adds everything necessary to include the filters in the CUPS source tree and to have them built and installed together with CUPS.

To add the filters to CUPS, download the tarball, uncompress it, and add the filters to the CUPS source tree as described in the included README file.

The filters in the above-mentioned filter kit are Poppler-based, due to Poppler's more flexible API. Note that Poppler 0.11.0 or newer is needed. If you have only Poppler 0.10.x, either update to the new Poppler or apply the following patches: this small patch is required because it adds needed functionality, and this bigger patch is highly recommended because it adds color management to Poppler.

For pdftoraster we use Ghostscript, as it is much better optimized for printing. As the pstoraster filter was already part of Ghostscript, the pdftoraster filter was also added to Ghostscript from version 8.64 on. It is therefore installed together with the pstoraster filter when running "make cups" and "make install-cups" in the Ghostscript source tree.

If you have to stay with Ghostscript 8.63 for some reason, see these instructions to get its PDF-related bugs fixed and the pdftoraster filter added.

The remaining filters, pstopdf and cpdftocps, are simple scripts which can be downloaded here. Make them world-executable and put them into the /usr/lib/cups/filter directory. Also download the conversion rule files pstopdf.convs and cpdftocps.convs from the same place and copy them into the /etc/cups directory. The files must be world-readable.
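
The installation steps for the two scripts can be sketched as follows. This is only an illustration: the function name install_script_filters and its optional root prefix argument (useful for staging or for trying things out) are assumptions, and the source directory is wherever you saved the four downloaded files.

```shell
# Install the two script filters and their conversion rule files.
# Usage: install_script_filters <download-dir> [root-prefix]
# The optional root prefix (hypothetical, for staging or testing) is
# prepended to the target directories; leave it out to install for real.
install_script_filters() {
    src=$1
    root=${2:-}
    mkdir -p "$root/usr/lib/cups/filter" "$root/etc/cups"
    # Filters must be world-executable (mode 755).
    install -m 755 "$src/pstopdf" "$src/cpdftocps" "$root/usr/lib/cups/filter/"
    # Conversion rule files must be world-readable (mode 644).
    install -m 644 "$src/pstopdf.convs" "$src/cpdftocps.convs" "$root/etc/cups/"
}
```

A real installation needs root privileges, for example by running the function in a root shell with only the download directory as argument.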

The imagetopdf, pdftopdf, and pdftoopvp filters were written by Koji Otani (BBR Inc., Japan, sho at bbr dot jp) and are hosted at the OPFC project at SourceForge Japan. He has also written a Poppler-based pdftoraster filter.

The texttopdf and pdftoijs filters were written by Tobias Hoffmann (th55 at gmx dot de) as a Google Summer of Code project, mentored by Hin-Tak Leung (hintak_leung at yahoo dot co dot uk). They are also hosted at the OPFC project at SourceForge Japan.

The pstopdf filter was originally written by Robert Sander (robert dot sander at epigenomics dot com) and improved for the PDF printing workflow by Till Kamppeter and Johan Kiviniemi (debian at johan dot kiviniemi dot name).

The cpdftocps filter converts CUPS PDF (PDF which was passed through the pdftopdf filter) to CUPS PostScript (PostScript which was passed through pstops, expected as input by native PostScript PPDs without *cupsFilter lines or by legacy PPDs). It is a script which calls the Poppler filter pdftops to convert to PostScript and then pstops to insert the PostScript code of the PPD options. As the page management options of CUPS were already applied by pdftopdf, the script removes these options from the command line when it calls pstops. This filter can be considered a "PostScript printer driver", as it generates the PostScript needed by PostScript printers. The cpdftocps filter was written by Johan Kiviniemi (debian at johan dot kiviniemi dot name) and improved by Till Kamppeter.
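
The idea of filtering page management options out of the option string before calling pstops can be illustrated like this. It is a sketch under assumptions, not the code of the real cpdftocps script: the function name strip_page_options is hypothetical, and the option names shown are only an illustrative subset of CUPS page management options.

```shell
# Remove page management options from a CUPS option string (the fifth
# argument of a filter call) so that pstops does not apply them again.
# The option names below are an illustrative subset, not the list used
# by the real cpdftocps script. Option values containing spaces are not
# handled by this simple word-splitting approach.
strip_page_options() {
    out=
    for opt in $1; do
        case $opt in
            number-up=*|page-ranges=*|page-set=*|outputorder=*) ;;  # drop
            *) out="${out:+$out }$opt" ;;                           # keep
        esac
    done
    printf '%s\n' "$out"
}
```

For example, strip_page_options 'number-up=2 PageSize=A4 page-ranges=1-3' keeps only PageSize=A4, which is the kind of PPD option pstops still has to turn into PostScript code.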

The Ghostscript-based pdftoraster filter was written by Till Kamppeter, who also made the patch to fix Ghostscript's CUPS Raster output device. Note that, in contrast to pstoraster, this filter is written in C. With PostScript input, the PostScript code defined in the PPD file for the user's option settings is inserted into the input data stream so that Ghostscript generates the correct CUPS Raster data. This is not possible with PDF input data. pdftoraster therefore uses the CUPS libraries to read this PostScript code out of the PPD file, converts it into equivalent command line options for the Ghostscript call, and then calls Ghostscript. pstoraster, on the other hand, only needs to call Ghostscript, because pstops has already embedded the PostScript code in the PostScript input stream, so it can be a simple shell script. As pdftoraster does not modify the input data stream, it also works with a PostScript input data stream.

Modify the cost factors of already existing file conversion rules in CUPS

Every file format conversion rule in CUPS (in the /etc/cups/*.convs files, and also in the *cupsFilter lines of PPD files) has not only an input format, an output format, and a filter name, but also a numerical cost factor. CUPS adds up the cost factors of the filters in each possible filter chain and, if more than one chain is possible for a job, takes the "cheapest" one.
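
To make the cost arithmetic concrete, here is a small sketch that sums the cost factors of the filters in a chain. The function name chain_cost is hypothetical, and the sample rules use made-up cost factors and an assumed set of MIME types purely for illustration; only the line layout (source type, destination type, cost, filter) follows the .convs files.

```shell
# Sum the cost factors of the named filters, looked up in a rules file
# whose lines have the form: <source type> <destination type> <cost> <filter>
# Usage: chain_cost <rules-file> <filter>...
chain_cost() {
    convs=$1; shift
    total=0
    for filter in "$@"; do
        # Take the cost (third field) of the first rule naming this filter.
        cost=$(awk -v f="$filter" '$1 !~ /^#/ && $4 == f { print $3; exit }' "$convs")
        total=$((total + ${cost:-0}))
    done
    echo "$total"
}

# Sample rules with made-up cost factors, for illustration only.
sample=$(mktemp)
cat > "$sample" <<'EOF'
application/postscript  application/vnd.cups-postscript  100  pstops
application/postscript  application/pdf                  50   pstopdf
application/pdf         application/vnd.cups-pdf         30   pdftopdf
EOF

echo "PostScript chain: $(chain_cost "$sample" pstops)"
echo "PDF chain:        $(chain_cost "$sample" pstopdf pdftopdf)"
```

With these invented numbers the PDF chain totals 80 against 100 for the PostScript chain, so the PDF chain would count as the cheaper of the two.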

To make sure that the PDF-based way is always preferred, we raise the cost factor of the pstops filter from 66 to 100 in /etc/cups/mime.convs:

sed -i -r -e '/\spstops$/ { s/66/100/ }' /etc/cups/mime.convs
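
Before editing the real file, the substitution can be previewed without writing anything. The sketch below runs the same sed expression without -i and shows the difference; the function name preview_pstops_cost is hypothetical, and GNU sed is assumed (as in the command above).

```shell
# Show what the pstops cost change would do, without modifying the file.
# Usage: preview_pstops_cost [path-to-mime.convs]
# diff exits with status 1 when it finds a difference, which is the
# expected outcome here.
preview_pstops_cost() {
    file=${1:-/etc/cups/mime.convs}
    sed -r -e '/\spstops$/ { s/66/100/ }' "$file" | diff "$file" -
}
```

If the output shows only the pstops line changing from 66 to 100, the in-place command can be run safely.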

All other relevant conversion rules are in the conversion rule files that come with the new CUPS filters, and their cost factors are already low enough.

Update Ghostscript to at least version 8.64

Even with PDF-centric printing, Ghostscript remains a central part of the printing infrastructure, as it takes PDF as well as PostScript as input format. In particular, all built-in drivers of Ghostscript can still be used.

Many PDF rendering bugs were fixed in Ghostscript recently. Therefore use the newest Ghostscript version to get the best results.

In case you cannot switch to Ghostscript 8.64 for some reason, here are patches to fix the most important bugs of Ghostscript 8.63.
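
Whether the installed Ghostscript is new enough can be checked with a short script. This is a sketch: the helper name version_ge is hypothetical, and it relies on the version sort (sort -V) of GNU coreutils.

```shell
# Succeed if version $1 is greater than or equal to version $2.
# Relies on "sort -V" (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether the installed Ghostscript meets the 8.64 minimum.
if command -v gs >/dev/null 2>&1; then
    if version_ge "$(gs --version)" 8.64; then
        echo "Ghostscript is 8.64 or newer"
    else
        echo "Ghostscript is older than 8.64"
    fi
fi
```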

Update Foomatic to version 4.0.x

foomatic-rip 3.x does not understand PDF as an input format. PDF input was added as a principal feature of foomatic-rip 4.0. foomatic-rip 4.0 feeds PDF directly into the Ghostscript process, which renders the input into the printer's format together with the driver, provided that Ghostscript is called directly at the beginning of the rendering command line and no option requires PostScript commands to be inserted into the data stream for execution. Otherwise, foomatic-rip 4.0 first converts the input into PostScript.

The new foomatic-db-engine has extensions for the PDF workflow in its PPD generator. In particular, PPDs are generated with the lines

*cupsFilter:    "application/vnd.cups-postscript 100 foomatic-rip"
*cupsFilter:    "application/vnd.cups-pdf 0 foomatic-rip"

instead of

*cupsFilter:    "application/vnd.cups-postscript 0 foomatic-rip"

now, so that foomatic-rip accepts PDF as input format with this PPD. It also accepts the new "<prototype_pdf>...</prototype_pdf>" tag in the "<execution>" section of Foomatic driver XML files to allow specifying a separate command line prototype for PDF input. From this tag the PPD generator creates the new "*FoomaticRIPCommandLinePDF" keyword in the PPD files.
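
To check whether a given PPD already declares foomatic-rip for PDF input, one can search it for the corresponding *cupsFilter line. A minimal sketch, with the hypothetical function name ppd_accepts_pdf:

```shell
# Succeed if the PPD declares foomatic-rip as a filter for CUPS PDF input.
# Usage: ppd_accepts_pdf <file.ppd>
ppd_accepts_pdf() {
    grep -Eq '^\*cupsFilter:[[:space:]]*"application/vnd\.cups-pdf[[:space:]]+[0-9]+[[:space:]]+foomatic-rip"' "$1"
}
```

The function prints nothing and only sets its exit status, so it can be used directly in an if statement or after && and ||.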

The new foomatic-db contains several optimizations for the PDF workflow. In particular, many options which worked by inserting PostScript code into the data stream were converted to Ghostscript command line options, so Ghostscript can take PDF as input with the same drivers.

Download the foomatic-filters, foomatic-db-engine, foomatic-db and (optionally) the foomatic-db-nonfree package from our download area, as released tarballs or daily snapshots or from our BZR repositories. See our Foomatic page for more information.

Patch ready-made PPDs using foomatic-rip to accept PDF as input format

There are many PPD files on a typical system which are not generated by Foomatic but also use foomatic-rip, so they (and their corresponding drivers) can be used with PDF input data. These PPDs come with driver packages or as manufacturer-supplied PPDs.

To convert them all to accept PDF, run the following commands:

cd /usr/share/ppd  # (or cd /usr/share/cups/model)
# Uncompressed PPDs: edit in place
find . -name '*.ppd' -print0 | xargs -0 perl -p -i -e \
's,^\*cupsFilter:\s*\"application/vnd\.cups-postscript\s+0\s+foomatic-rip\",*cupsFilter: "application/vnd.cups-postscript 100 foomatic-rip"\n*cupsFilter: "application/vnd.cups-pdf 0 foomatic-rip",'
# Gzipped PPDs: uncompress, edit, recompress
find . -name '*.ppd.gz' | while IFS= read -r f; do gzip -cd "$f" | perl -p -e \
's,^\*cupsFilter:\s*\"application/vnd\.cups-postscript\s+0\s+foomatic-rip\",*cupsFilter: "application/vnd.cups-postscript 100 foomatic-rip"\n*cupsFilter: "application/vnd.cups-pdf 0 foomatic-rip",' | \
gzip -9 > "$f.tmp" && mv "$f.tmp" "$f"; done
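
To see in advance which files the commands above would touch, or to confirm afterwards that none were missed, the PPDs still carrying the old PostScript-only line can be listed. The function name unpatched_ppds is hypothetical and the default directory is only an assumption taken from the cd command above.

```shell
# List PPD files (plain or gzipped) below a directory that still carry
# the old PostScript-only foomatic-rip filter line.
# Usage: unpatched_ppds [directory]
unpatched_ppds() {
    old='^\*cupsFilter:[[:space:]]*"application/vnd\.cups-postscript[[:space:]]+0[[:space:]]+foomatic-rip"'
    find "${1:-/usr/share/ppd}" -type f \( -name '*.ppd' -o -name '*.ppd.gz' \) |
    while IFS= read -r ppd; do
        case $ppd in
            # Gzipped PPDs are searched in their uncompressed form.
            *.gz) if gzip -cd "$ppd" | grep -Eq "$old"; then echo "$ppd"; fi ;;
            *)    if grep -Eq "$old" "$ppd"; then echo "$ppd"; fi ;;
        esac
    done
}
```

An empty listing after patching means that every foomatic-rip PPD below the directory has been converted.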

Make desktop applications emit PDF when printing

For KDE and Qt applications you only need a reasonably recent Qt (4.x) and CUPS 1.2.x or newer. Then Qt emits print jobs in PDF format. So for these applications the PDF printing workflow is already upstream reality.

GTK and GNOME have not switched over to PDF yet. You can apply a patch to GTK to get at least some GTK/GNOME applications to emit print jobs in PDF.

OpenOffice.org, Firefox and Thunderbird still need to get switched over. Patches welcome.

If not all applications are sending PDF to CUPS, you will still get printouts and you will already get the advantages of doing page management on a PDF data stream. CUPS simply converts incoming PostScript using the pstopdf filter before doing any further step.
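
To find out which format an application actually emits, one can look at the first bytes of a captured print job file: PDF files start with "%PDF-" and PostScript files start with "%!". The sketch below assumes you already have such a file (how to capture it depends on your setup); the function name job_format is hypothetical.

```shell
# Guess the format of a print job file from its first bytes:
# PDF files start with "%PDF-", PostScript files with "%!".
# Usage: job_format <file>
job_format() {
    case $(head -c 5 "$1") in
        %PDF-) echo pdf ;;
        %!*)   echo postscript ;;
        *)     echo unknown ;;
    esac
}
```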

Related bugs and feature requests

The reports listed here are, on the one hand, feature requests to several free software projects for implementing the PDF printing workflow and, on the other hand, bug reports about problems with the PDF printing workflow.