PostScript to SVG Conversion
About This Document
This document describes some of the work undertaken to develop a Ghostscript driver that converts PostScript and PDF input to SVG. It is not intended as a complete report on the work, but rather as a statement of progress, with pointers to probable future research and development directions.
For SVG (Scalable Vector Graphics) to become a widely accepted standard, it must be widely available as an output format from popular applications. One of the best ways of obtaining SVG output is to generate it from PostScript, as almost all modern software can produce PostScript via a printer driver.
Acrobat Distiller would seem like an obvious choice to extend to output SVG as well as PDF, but it does not have the required extensibility to enable this. Ghostscript, however, is a freely-available Level 3 PostScript interpreter which is modular in design, widely used, and available on almost all platforms. For these reasons, it was decided to develop PostScript to SVG conversion software using Ghostscript.
An additional benefit of using Ghostscript is that it is able to take both PostScript and PDF files as input.
Ghostscript supports a large number of printers and other devices through device drivers. These drivers have been contributed free of charge by various developers and cover output to printers (e.g. LaserJet), to file formats such as TIFF and PDF, and to screen displays (e.g. the X Window System).
Documentation about driver development for Ghostscript is available from the Ghostscript web site. Developing a new driver is best done by copying an existing one designed for a similar purpose, or adapting it for a special case (e.g. to allow use of new facilities on a new model of printer when a driver exists for an older model).
Ghostscript drivers work much like any other printer driver. Each driver defines a standard device table that specifies details such as resolution and orientation (e.g. origin at top left or bottom left) and provides callback functions that implement the device's features. The structure is extensible, so a developer can store any additional information the driver needs. The callback functions cover operations such as opening and closing the device, beginning and ending a page, setting graphics state parameters, and performing path commands such as newpath, moveto and lineto.
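The shape of such a device table can be illustrated with a small sketch. It is written in Python and is not Ghostscript's actual C interface: the names `DeviceTable`, `make_svg_device`, `open_device`, `close_device`, `moveto` and `lineto`, and the 72 dpi resolution, are all hypothetical and chosen only to mirror the operations listed above.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class DeviceTable:
    """Hypothetical device table; not Ghostscript's real C structure."""
    name: str
    resolution_dpi: float
    origin_top_left: bool
    callbacks: Dict[str, Callable[..., None]]
    # Extra driver-specific state, standing in for the extensible part.
    output: List[str] = field(default_factory=list)

    def call(self, op: str, *args: float) -> None:
        """Invoke the callback for `op` if the driver provides one."""
        handler = self.callbacks.get(op)
        if handler is not None:
            handler(self, *args)


def make_svg_device() -> DeviceTable:
    """Build a toy device that records path commands as strings."""
    return DeviceTable(
        name="svg",
        resolution_dpi=72.0,    # assumed value, for illustration only
        origin_top_left=True,   # SVG places its origin at the top left
        callbacks={
            "open_device": lambda dev: dev.output.append("<svg>"),
            "close_device": lambda dev: dev.output.append("</svg>"),
            "moveto": lambda dev, x, y: dev.output.append(f"M{x:g} {y:g}"),
            "lineto": lambda dev, x, y: dev.output.append(f"L{x:g} {y:g}"),
        },
    )
```

An interpreter loop would then call, for example, `dev.call("moveto", 10, 20)`. Operations the driver does not implement are silently ignored in this sketch, which is a simplification.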
At the simplest level, a driver can let Ghostscript image the page as a bitmap and then write that bitmap out in the format appropriate for the device. For a high-level driver such as the one for SVG, the job is a lot more complicated, as we want to retain as much information as possible. One of the difficulties in developing the SVG driver was obtaining the text as text rather than as vectors(!).
All graphics state parameters such as colour, line width, and line join are picked up from Ghostscript and utilised (except for dash patterns, which are not implemented yet).
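As a rough illustration of how graphics state values might be carried over, the sketch below turns a colour, line width and line join into SVG stroke properties. The function name `stroke_attributes` is hypothetical, and the property names follow the current SVG specification; the exact form the driver writes is not described here. The line join codes 0, 1 and 2 are the values taken by PostScript's setlinejoin operator.

```python
from xml.sax.saxutils import quoteattr

# PostScript setlinejoin codes and the matching SVG stroke-linejoin keywords.
LINE_JOIN_NAMES = {0: "miter", 1: "round", 2: "bevel"}


def stroke_attributes(rgb, line_width, line_join):
    """Format graphics state values as SVG stroke attributes.

    rgb is a tuple of components in the range 0..1, as used by
    PostScript's setrgbcolor operator.
    """
    r, g, b = (round(c * 255) for c in rgb)
    attrs = {
        "stroke": f"#{r:02x}{g:02x}{b:02x}",
        "stroke-width": f"{line_width:g}",
        "stroke-linejoin": LINE_JOIN_NAMES[line_join],
    }
    return " ".join(f"{name}={quoteattr(value)}" for name, value in attrs.items())
```

Dash patterns are left out of the sketch, matching the current state of the driver.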
Vector graphics using combinations of lines and curves with fill or stroke work as expected, and rect has been implemented to draw rectangles. Ghostscript recognises when a rectangular path has been defined and, for drivers that implement them, calls the driver's procedures for filling and stroking rectangles.
Bitmap graphics are currently unsupported in the SVG driver. Bitmaps will eventually be output as JPEG files and referenced from the SVG.
Fonts are currently specified using the font-family and font-size attributes in the text tag, but further processing is needed here to get compliance with CSS2 (Cascading Style Sheets, Level 2).
In text output by the SVG driver, the characters `&', `<', and `>' are mapped from the font encoding to the appropriate entities. Characters outside the usual ASCII character set are written as a Unicode character code where one is known, and are otherwise output as is. For example, the text:
un cours de français
would be output as something like:
<text>un cours de fran&#231;ais</text>
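The mapping described above can be sketched in a few lines of Python. This is an illustration only: the function name `escape_svg_text` is hypothetical, the input is assumed to be already decoded to Unicode (so the font-encoding lookup is skipped), and decimal character references are an assumed choice of notation.

```python
def escape_svg_text(text: str) -> str:
    """Escape markup characters and write non-ASCII characters as
    numeric character references."""
    entities = {"&": "&amp;", "<": "&lt;", ">": "&gt;"}
    out = []
    for ch in text:
        if ch in entities:
            out.append(entities[ch])
        elif ord(ch) > 127:
            # Non-ASCII: emit the Unicode code point in decimal.
            out.append(f"&#{ord(ch)};")
        else:
            out.append(ch)
    return "".join(out)
```

Unlike the driver, this sketch never falls back to writing a character as is, because every Python string character already has a known Unicode code point.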
The SVG driver makes use of SVG's optimisations to reduce output file size. These optimisations include:
- the elimination of unnecessary separator tokens in path data;
- the use of absolute or relative coordinates as appropriate;
- not repeating vector graphics commands (e.g. two consecutive lines could be specified as "L30 762L40 563" but would be output as "L30 762 40 563", removing the unnecessary repetition of the L command);
- use of horizontal or vertical line commands as appropriate;
- use of grouping with the <g> element to group repeated attributes such as font information for text.
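Some of the path optimisations listed above can be sketched as follows. This is a minimal Python illustration rather than the driver's code: the names `num` and `compact_path` are hypothetical, only straight line segments with absolute coordinates are handled, and the choice between absolute and relative coordinates is left out.

```python
def num(value: float) -> str:
    """Format a coordinate with up to four decimals and no trailing zeros."""
    return f"{value:.4f}".rstrip("0").rstrip(".")


def compact_path(points):
    """Build SVG path data from absolute points.

    The first point is a moveto and each later point is a lineto.
    Horizontal and vertical segments use H and V, and a command
    letter is omitted when it repeats the previous one.
    """
    x, y = points[0]
    parts = [f"M{num(x)} {num(y)}"]
    last_cmd = "M"
    for nx, ny in points[1:]:
        if ny == y and nx != x:
            cmd, args = "H", num(nx)
        elif nx == x and ny != y:
            cmd, args = "V", num(ny)
        else:
            cmd, args = "L", f"{num(nx)} {num(ny)}"
        if cmd == last_cmd:
            parts.append(" " + args)   # same command: letter not repeated
        else:
            parts.append(cmd + args)   # no separator needed before a letter
        last_cmd = cmd
        x, y = nx, ny
    return "".join(parts)
```

For instance, `compact_path([(0, 0), (30, 762), (40, 563)])` gives `M0 0L30 762 40 563`, where the second L is dropped as in the example in the list.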
See below for some samples shown in PostScript, PDF and SVG formats.
Here are some SVG samples shown with their original PostScript source, and in PDF form too. To view the SVG you will need an SVG browser plug-in or an external SVG viewing application. See the W3C SVG page for details of the latest implementations available.
|PostScript file||SVG file||PDF file|
The SVG driver is almost complete, with only a few features remaining to be implemented, but it will probably have to change further in response to future revisions of the SVG specification.
Use of more advanced features of SVG may require an improved SVG browser plug-in or viewer. The current driver is built against the 12th August 1999 SVG specification.
Further work is required to add:
- Support for bitmap images as JPEG files.
- Support for dashed lines.
- The ability to pass a link through the PostScript as a pdfmark command so that it appears appropriately in the SVG.
Last updated Mon Dec 20 11:48:20 1999 (David Evans)