How to automatically print or convert to PDF, MS Office or other formats OpenDocument files

The script and tricks in the ODF scripting section of this website show how to create office-ready texts, presentations and spreadsheets automatically, in the OpenDocument format, which is a worldwide standards. This is all many people need to work today. Sometimes, however, it’s still necessary to either print those documents, or exchange them to somebody in other formats, like PDF or those of the older releases of Microsoft Office (newer releases of this program are already partially compatible with OpenDocument through free plugins, so if your partners have those versions they should really use those plugins, instead of bothering you with requests for drug-like, legacy file formats, but that’s another story).

Of course, if you only need to print or convert to other formats only once in a while there’s no reason to not do it from OpenOffice. The simple tricks explained below, however, are a life-saver when you need to do this many times, and of course you’d like your computer to do it for you while you have a coffee or something.

On Linux systems it is easy to do all this, and even send the converted files via email, automatically. Let’s assume that you have an OpenDocument text, spreadsheet or presentation already sitting in some folder, waiting to be processed.

Both printing and conversion to PDF, HTML or MS Office formats from the command line need OpenOffice to work. In the second case, the reason is that what makes the actual work is one of the OpenOffice macros linked below: when you launch OpenOffice, it executes that macro on the file indicated by the user and then exits. Macros are not needed for printing because OpenOffice has dedicated options for that. Usage of OpenOffice from the command line is explained on the OOo wiki. In a nutshell, this is the correct syntax:

  soffice -invisible macro://path-to-macro($FILE)

On some systems, you may need to provide the complete path to the soffice program. The -invisible option is what makes OpenOffice start without a graphical interface. The file to process must be passed as argument ($FILE) to the macro.

The command above is all you need if you are working on a complete Gnu/Linux desktop, that is a system that also has a graphical interface server (called X server). For the record, you can do the same thing in Windows with a batch script like this (taken from an OOoforum thread):


  "c:program filesOpenOffice.org1.1.4programsoffice" -invisible "macro:///Standard.Module1.ConvertToPDF(%1)"

When you want to work inside a Linux Web or print server, instead, that is on a computer where X was never installed, you need to set up some extra variables before launching OpenOffice, otherwise it won’t start. This is how to do it (the explanation for the extra commands are in the thread in which I found them, which also includes instructions on how to install OpenOffice on a (remote) server:

  export PATH=$PATH:/usr/bin/X11
  export LANG=en_US
  export HOME=/var/www
  xvfb-run -a /usr/bin/soffice -invisible macro://path-to-macro($FILE)

Please note the extra piece in the actual command, that is in the last line above: `xvfb-run -a`. Xvfb is a smaller X server used in special situations like this, when a full X wouldn’t be installable. Also, don’t forget that, depending on the server configuration and your actual needs, you’ll probably have to change the LANG and HOME variables.

Show me the macros!

The previous paragraph explains how to run OpenOffice from the command line on Linux or Windows in order to execute any macro. Let’s now look at the actual macros we need to print or save in Microsoft or other formats. There are several ones available online.

Those with the best explanation, which includes details on how to install any macro in OpenOffice, are SaveAsPDF and SaveAsDoc. The beauty of these macros is that it is very easy to modify them to save in HTML or any other format that OpenOffice can handle! You just have to substitute the right values for the file extension (MYEXTENSION) and the filter name (MY_FILTER_NAME) in this part of the macro:

   cFile = Left( cFile, Len( cFile ) - 4 ) + ".MYEXTENSION"
     cURL = ConvertToURL( cFile )

     oDoc.storeToURL( cURL, Array(_
              MakePropertyValue( "FilterName", "MY_FILTER_NAME" ),)

Another macro that saves an OpenDocument file in PDF format was posted to the Fedora mailing list. Whichever macro you choose, put it in a suitable folder, accessible from the script and user account that will use it, and replace the path-to-macro string above with the actual full path to the macro in the file system.

How to print or email OpenDocument files from the command line

In order to do this we just need two other command line options of OpenOffice (see here for the complete list or type `soffice -?` at a command prompt to get a complete listing):

  soffice -invisible -p <documents...>
  soffice -invisible -pt <printer> <documents...>

They both print all the specified documents. The only difference between them is that the first one uses the default printer, the second looks for the printer given as first parameter.

Finally, if you also want your script to email on your behalf the files that it generated in this way, you can use the text-based Mutt email client in this way ($EMAIL_TEXT is a separate text file containing the text of the message):


if you find any error in this page or have any suggestion, please tell me (but remove the numbers from the email address first!)

Generate OpenDocument spreadsheets from DB2 (or any other) database

DB2 pureXML is IBM software for management of XML data that eliminates much of the work typically involved in the management of XML data.The OpenDocument Format (ODF) is an open international standard for office texts, presentations and spreadsheets that is very simple to process or generate automatically. This page is a short synthesis of an article published in September 2010 by N. Subrahmanyam, Using DB2 pureXML and ODF Spreadsheets, to give an idea (see my comments at the end) of how flexible ODF scripting is. Please read the original full article to know how to actually generate ODF documents from DB2 pureXML files. Continue reading Generate OpenDocument spreadsheets from DB2 (or any other) database

Create OpenDocument invoices and other documents with Rexx

After my talk about ODF scripting at OOoCon 2010 I got by another OOoCon speaker, Rony G. Flatscher another script for automatic generation of OpenDocument invoices, or any other ODF text with a fixed structure. Roni’s script opens an OpenDocument text template as the one shown in the left picture below and replaces all the MF_ placeholders strings with values loaded by a plain text file, creating the filled form shown in the right picture.

Continue reading Create OpenDocument invoices and other documents with Rexx Conference 2010, preparing the next ten years

The conference celebrating the tenth birthday of started in Budapest yesterday morning. Here are some first notes from the field.

The opening session was a cool moment, both for the location (the Hungarian Parliament) and for the content. We started in the very hall of the Parliament. Incidentally, the first thing I noted there has nothing to do with OO.o but is a general problem of the FOSS and programming worlds: of about 150 people in the hall, no more than 10% were women, even if OO.o and FOSS users aren’t certainly 90% males, are we? But I digress.

Continue reading Conference 2010, preparing the next ten years

How to make OpenDocument slideshows out of plain text files

Slideshows are extremely popular as presentation and educational tools, but have a couple of serious problems. The first is readability: let’s admit it, many slideshows are almost unusable. One of the secrets to useful slideshows is terseness. Each slide should contain only a few short points or pictures which summarize the key concepts you want to transmit to the audience with that part of your talk.

The other big issue with slideshows is that GUI presentation software, be it PowerPoint, OpenOffice Impress, KPresenter or anything else, can be quite time-consuming and distracting, no matter how you use it. Writing bullets and sub bullets as simple text outlines is much faster, even when you’re just pasting together notes you scrabbled on your PDA, email fragments, quotes from Web pages or thoughts of the moment.

Continue reading How to make OpenDocument slideshows out of plain text files

How to generate and update ODF spreadsheets without OpenOffice

Sooner or later, many of us need to process some numeric data in plain text format, be they system logs or sales totals, and to generate reports and charts out of those data. Scripts and utilities like gnuplot could be very useful in such cases, except when the results needs to be a normal spreadsheets with charts and formulas, which is both editable and compatible with people who only know how to deal with spreadsheets in office suites.

Continue reading How to generate and update ODF spreadsheets without OpenOffice

Comments to, How to automatically create ODF invoices without OpenOffice

(Note: these are the comments appended to my original article, which I had to put in a separate page when I switched from Drupal to WordPress)

Just came to your site…

Just came to your site following a link from linuxtoday. Wow! This opens windows of opportunities! Somehow I’ve totally missed out on the fact that odt documents are just zip files. I’ve been reading a bit through some content.xml files. And it seems that it should be possible to use openoffice from a text editor just as fine.

Continue reading Comments to, How to automatically create ODF invoices without OpenOffice

Why and how the OpenDocument format can save you a lot of time!

The OpenDocument Format (ODF) is an internationally recognized open standard for digital office documents whose importance has also been acknowledged by Microsoft. ODF is good for a lot of reasons I have already explained in Everybody’s Guide to OpenDocument. However, there is also one more reason why ODF is great for everybody who must produce a lot of office documents, one that will be the subjects of many posts on this website: ODF is really simple to generate or edit automatically. Even if you aren’t a professional programmer, it takes very little effort to put together a script that generates or processes in any way texts, presentations or spreadshets in ODF format.

How the openness of ODF makes automatic generation of documents much simpler

Very often, we use computers to produce many different versions, every time with new data, of some reference text, presentation or spreadsheet. Changing those kind of files manually makes sense only if it happens once in a while. When it’s a regular activity, instead, it can become a huge waste of time. ODF, however, makes it very quick and easy to insert raw data into texts, spreadsheets or presentations with the slightest possible amount of manual work and without even running OpenOffice. This is possible because an ODF file is just a ZIP archive, with pictures and macros in their own folders and the actual text written, in XML format, inside a file called content.xml. Therefore, in order to create a new, 100% compliant ODF file with different data, tables or images, you only have to open the archive, process the text inside content.xml or put new pictures in their folder if necessary and zip everything again. You must only use OpenOffice once, to create a template by hand if you if you don’t find a suitable one online.

The power of script-based ODF processing

You could perform repetitive generation and editing of ODF office files even manually, with a text editor like Notepad, Emacs or VI. The real power of ODF, of course, is in the fact that you can (and should) do all that processing automatically, with very simple shell or Perl scripts, that is with tools that are included in any Gnu/Linux distribution but can also work on Windows and Mac. The main advantages of this approach to office document processing are:

  • it works even without Openoffice, so it could even run on a server
  • there is no need of any relational database but you can use one if necessary
  • learning to do these things with shell scripts instead of OpenOffice macros:
    • gives you skills that you can reuse in a lot of other contexts.
    • allows very easy integration with other command line tools, from cron jobs to mass mailing, chart generation with Gnuplot or scaling, watermarking, framing of all images in a document with ImageMagick
  • above all, it’s much simpler (and faster) than you’d think!

The last point is the most important. Using the method explained here everybody with just a basic grasp of shell scripting can generate, modify or analyze hundreds of ODF text documents with just a few minutes of easy coding.

Of course, this approach is not really flexible, scalable or really robust, unless you add lots of code for error management, but the idea here is not to develop industrial strength solutions. If that’s what you need, you’ll have to either use real XML based tools like Odfpy or go straight to the source, the book OpenDocument Essentials by J. David Eisenberg, that you can also purchase at

This said, there are tons of cases where heavyweight tools like those aren’t worth studying, installing and deploying, but people still end up wasting many hours on repetitive edits. Learning how to write quick and dirty shell scripts that can open and update an ODF file is an easy but huge time saver in such situations.

What’s next?

Here are some of the ODF scripting recipes that you’ll find on this website in the next days (but if there are other recipes that you would like to see published, just ask and if possible I’ll write them!):

(note: some of these posts are updated excerpts of articles originally written for Linux Format, and are republished here with their permission)