How to post content to a WordPress blog from the command line

WordPress is a great publishing system, but managing it manually can be very time-consuming. This is especially true when you want to upload lots of posts, or when you would like to write content in your preferred, full-blown text editor and then have it “magically” appear online.

WordPress takes care of these needs by allowing remote posting via email or via the WordPress XML-RPC interface (enable the “WordPress, Movable Type, MetaWeblog and Blogger XML-RPC” checkbox by going to Settings > Writing > Remote Publishing). The first method is explained here and in other places, but requires setting up a dedicated email account. For several reasons I preferred not to do it that way, so I looked at the other system.

PHP scripts for this purpose are here and here. The first uses the cURL capabilities of PHP to send the data over SSL. The second uses IXR, the Incutio XML-RPC Library for PHP, to “incorporate both client and server classes, as it is designed to hide as much of the workings of XML-RPC from the user as possible”.

Both those scripts work, and the second can also be used to edit existing posts or get lists of the latest published articles. However, I was looking for something that didn’t require PHP (a personal preference, really). Eventually, I found the WordPress-CLI utility by Leo Charre, and I have already used it successfully to upload hundreds of posts to several of my WordPress websites (see the bottom of this page for examples). Here’s how I did it.

WordPress-CLI installs like any other Perl module; see the instructions in the README file. For it to work, however, I also had to download and install from CPAN the Perl modules Getopt-Std-Strict-1.01 and LEOCHARRE-Debug-1.03.
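For instance, assuming the distributions are available on CPAN under these names, something along these lines should pull everything in (otherwise, unpack each tarball and run the classic Makefile.PL routine from its README):

  # hypothetical install session; adjust module names and versions as needed
  cpan Getopt::Std::Strict LEOCHARRE::Debug WordPress::CLI

  # or, from each unpacked tarball:
  perl Makefile.PL && make && make test && make install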

Once everything is installed, you’ll have in your path a script called wordpress-upload-post. Run it at the command prompt or from a script, passing as options the name of the HTML file containing the post, plus its required publication date, title and category, as well as your user name and password, and your post will be online.
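For example, a hypothetical invocation could look like this (the option letters and the date format are the ones used by the wrapper script below; names and credentials are placeholders):

  wordpress-upload-post -D '09:00 2007/07/01' -t 'My post title' -c 'Digiworld' \
    -u marco -p 'secret' -x http://myblog.example.com/xmlrpc.php post.html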

To make things faster, I use wordpress-upload-post inside this script:


  #! /bin/bash
  # usage: post2wp.sh postfile blogname
  # postfile: text file containing post content in txt2tags format
  # blogname: name of blog

  POST=$1
  BLOGNAME=$2

  HTML=/tmp/tmp_wordpress_post.html
  ACCOUNTS_DIR="$HOME/.blog_accounts"

  #########################################################################
  #extract title, category and publication date of the post
  TITLE=`grep '%TITLE: ' $POST | cut -c9-`
  CATEGORY=`grep '%CATEGORY: ' $POST | cut -c12-`
  DATE=`grep '%DATE: ' $POST | cut -c8-`

  YEAR=`echo $DATE | cut -c1-4`
  MONTH=`echo $DATE | cut -c5-6`
  DAY=`echo $DATE | cut -c7-8`
  HOUR=`echo $DATE | cut -c9-10`
  MIN=`echo $DATE | cut -c11-12`
  DATE="$HOUR:$MIN $YEAR/$MONTH/$DAY"

  rm -f $HTML.tmp*
  txt2tags -t xhtml --no-headers -i $POST -o $HTML

  ###########################################################################
  # source blog parameters: user name, password, XML_RPC url

  if [ -e "$ACCOUNTS_DIR/$BLOGNAME" ]
  then
    source "$ACCOUNTS_DIR/$BLOGNAME"
  else
    echo "Error! $BLOGNAME account file doesn't exist!"
    exit 1
  fi

  ###########################################################################
  # upload post to blog
  WP_OPTS="-D '$DATE' -t '$TITLE'  -c '$CATEGORY'  -u $USER -p '$PW' -x $XMLRPC_URL"
  WP_CMD="wordpress-upload-post $WP_OPTS $HTML"
  eval $WP_CMD
  exit


I use txt2tags as the source format for most things I publish online. It is a very simple ASCII markup format, which I customize by adding comments like these to each article to be published on WordPress:


  %TITLE: Is it OK for a School or Charity to accept software donations?
  %CATEGORY: Digiworld
  %DATE: 200707010900


The script extracts these variables from the source text and reformats them in the way wordpress-upload-post requires; the %DATE value shown above, for instance, becomes “09:00 2007/07/01”.

Next, it converts the article from the ASCII markup to HTML by calling the txt2tags Python script. The user name, password and XML-RPC URL of the WordPress blog that will receive the HTML post are kept in a separate file ($ACCOUNTS_DIR/$BLOGNAME) that has this format:


  USER='your user name'
  PW='your password'
  XMLRPC_URL='http://your.blog.home.page/xmlrpc.php'


and is read right after generating the HTML version of the post. The last part of the script, “upload post to blog”, concatenates all the parameters into one option string (WP_OPTS), builds the publishing command (WP_CMD) and evaluates it, thus publishing your post online. Enjoy! Of course, if you only post once in a while you won’t save much time, but if you ever need, as I did, to post lots of stuff, try this!
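Setting up a new blog account and posting to it then boils down to something like this (the blog name, credentials and file names are, of course, hypothetical):

  # create the account file read by post2wp.sh
  mkdir -p ~/.blog_accounts
  {
    echo "USER='marco'"
    echo "PW='secret'"
    echo "XMLRPC_URL='http://myblog.example.com/xmlrpc.php'"
  } > ~/.blog_accounts/myblog

  # write article.t2t with its %TITLE:, %CATEGORY: and %DATE: comments, then:
  post2wp.sh article.t2t myblog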

What’s missing

wordpress-upload-post and this script work great (see the bottom of the page), but they aren’t perfect. The biggest limitation of WordPress-CLI right now is that you can’t specify WordPress tags or, if you have a multilingual blog, the language of the current post. I’d also like to use it to add comments to existing posts, but that’s not really essential. I discussed these things with Leo. His answer is that “some of these things simply won’t work. For example- adding tags- because xmlrpc.php does not implement a call to add a tag. I’ve made some hacks to be able to do so- but this works on a local/server level. [also] I can’t recall right now if they have a comment call”. Leo also asked me to forward to all readers of this page this invitation:

  I'd be open to actually share in maintenance of things like WordPress::XMLRPC and a WordPress::CLI revised version. If you want to do this level of changes/additions, we could set up a branch off some cvs server... and... implement bugzilla on some server- to keep track of changes and todos.

Another thing I’d like to figure out is a way to upload images to WordPress so that they are associated with the post and get thumbnails. If that were possible, it would be easy to figure out in advance the right URLs for both the images and their thumbnails, and add them to the txt2tags source. As things stand today, in the rare cases where I need to upload a post that does have images with this script, I add them by hand afterwards. Suggestions?
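One possible starting point, which I have not fully explored, is the standard metaWeblog.newMediaObject XML-RPC call that WordPress does implement: it uploads a file and returns its final URL, which you could then paste into the txt2tags source before publishing. It does not, however, attach the image to a specific post or generate thumbnails, so it only solves half of the problem. A minimal sketch with curl (the blog ID, MIME type and script name are placeholders; USER, PW and XMLRPC_URL are meant to come from the same account file used above):

  #! /bin/bash
  # sketch only: upload one image via metaWeblog.newMediaObject
  # usage: upload_image.sh image.jpg
  IMG="$1"
  NAME=`basename "$IMG"`
  BITS=`base64 -w0 "$IMG"`   # the call wants the binary data base64-encoded

  XML="<?xml version=\"1.0\"?><methodCall>
  <methodName>metaWeblog.newMediaObject</methodName><params>
  <param><value><string>1</string></value></param>
  <param><value><string>$USER</string></value></param>
  <param><value><string>$PW</string></value></param>
  <param><value><struct>
  <member><name>name</name><value><string>$NAME</string></value></member>
  <member><name>type</name><value><string>image/jpeg</string></value></member>
  <member><name>bits</name><value><base64>$BITS</base64></value></member>
  </struct></value></param></params></methodCall>"

  echo "$XML" | curl -s -H 'Content-Type: text/xml' --data-binary @- "$XMLRPC_URL"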

How I’ve used this script

If you are curious about what real posts look like when they are written in txt2tags format, then converted to HTML and automatically uploaded to a WordPress blog, and don’t mind a bit of self-promotion, have a look at the following links. I have already used the script above to:

How to transform (almost) plain ASCII text to Lulu-ready PDF files, part 3

This is the core script I used to transform a set of plain ASCII files with the txt2tags markup into one print-ready PDF file. Part 1 of this tutorial explains why I chose txt2tags as the source format and Part 2 describes the complete flow.

Book creation workflow

  Listing 1: make_book.sh

    1   #! /bin/bash
    2
    3   CONFIG_DIR='/home/marco/.ebook_config'
    4   PREPROC="%!Includeconf: $CONFIG_DIR/txt2tags_preproc"
    5
    6   CURDIR=`date +%Y%m%d_%H%M_book`
    7   echo "Generating book in $CURDIR"
    8   rm -rf $CURDIR
    9   mkdir $CURDIR
   10   cp $1 $CURDIR/chapter_list
   11   cd $CURDIR
   12
   13   FILELIST=`cat chapter_list | tr "\12" " " | perl -n -e "s/\.\//\.\.\//g; print"`
   14
   15   echo ''                                 >  source_tmp
   16   echo  $PREPROC                          >> source_tmp
   17   sed 's/\.\//%!Include: \.\//g' $FILELIST >> source_tmp
   18
   19   replace_urls_with_refs.pl source_tmp > source_with_refs
   20
   21   txt2tags -t tex -i source_with_refs -o tex_source.tex
   22   perl -pi.bak -e 's/OOPENSQUARE/[/g'   tex_source.tex
   23   perl -pi.bak -e 's/CLOOSESQUARE/]/g'  tex_source.tex
   24
   25   #remove txt2tags header and footer
   26   LINEE=`tail -n +8 tex_source.tex | wc -l`
   27   LINEE_TESTO=`expr $LINEE - 4`
   28   tail -n +8 tex_source.tex | head -n $LINEE_TESTO > stripped_source.tex
   29
   30   source custom_commands.sh
   31
   32   cat $CONFIG_DIR/header.tex stripped_source.tex $CONFIG_DIR/trailer.tex > complete_source.tex
   33   pdflatex complete_source.tex
   34   pdflatex complete_source.tex
   35
   36   # Generate URL list in HTML format
   37   generate_url_list.pl chapter_list html | txt2tags -t xhtml --no-headers -i - -o url_list.html

All the txt2tags settings and some LaTeX templates are stored in the dedicated folder $CONFIG_DIR, so you can have a different configuration for each project. The script itself takes only one parameter: a file listing all the source files that must be included in the book. Lines 6 to 12 create a work directory and copy the file list into it. The files must be listed with their absolute paths, in the order in which they must appear in the book.
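A chapter_list file might therefore look like this (the paths are hypothetical):

  /home/marco/books/mybook/ch01_introduction.t2t
  /home/marco/books/mybook/ch02_requirements.t2t
  /home/marco/books/mybook/ch03_workflow.t2t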

Lines 13 to 17 of the script create a single source file (source_tmp) that contains the directive loading all the txt2tags preprocessing definitions (line 16) and then the content of all the individual files, in the right order but without the Include directives that are needed when they are processed individually (line 17).

Line 19 runs a separate script, replace_urls_with_refs.pl, which adds the cross-reference numbers to the book text and dumps the result into another temporary file, source_with_refs. This script, not included here for brevity and because you can do without it if, unlike me, you don’t need cross-references, only does two things. First, it reads a file in the $CONFIG_DIR folder that contains, one per line, all the URLs mentioned in the source files and the corresponding captions, in this format:

http://www.greenparty.org.uk/news/2851 | Windows Vista? A “landfill nightmare”

Next, replace_urls_with_refs.pl reads source_tmp and, whenever it finds a line like:

the UK Green Party officially declared Vista... a ["landfill nightmare" http://www.greenparty.org.uk/news/2851]

generates the right cross-reference number and puts it right after the text associated with the link itself, writing everything to source_with_refs. You can see the effect in the last figure of part 2 of this tutorial. After all this pre-processing, we can finally run txt2tags to produce a LaTeX file (line 21), but right after that we need to put square brackets back in place of some temporary markup generated by replace_urls_with_refs.pl (lines 22/23).

The next part of the script, up to line 32, removes the default LaTeX header and footer created by txt2tags (tail -n +8 drops the seven header lines; the head invocation then drops the last four lines, that is the footer), replaces them with those stored in the $CONFIG_DIR folder and dumps everything into complete_source.tex. This move allows you to declare whatever LaTeX class you wish to use (I used Memoir), or to give any other LaTeX instruction in the header, without any interference or involvement from txt2tags. Sometimes I use line 30, also optional, to execute other post-processing commands on the LaTeX source that, for whatever reason, are not convenient to run earlier. The two invocations of pdflatex in lines 33/34 finish the job: the first creates a draft of the book to calculate page numbers and other data, the second produces the final PDF with a clickable table of contents (see below) and all the other goodies the Memoir LaTeX class can handle. The script in line 37 is the one that scans all the source files again to produce the HTML list of references.

Summing up

I haven’t described all the gory details and auxiliary scripts at length because my main goal with this article is to introduce a txt2tags-based way of working. As I already said, this general method is quite simple but, in spite of this or maybe just because of it, I find it very, very flexible and powerful. Two great Free Software applications, txt2tags and pdflatex, plus about one hundred lines of code in three separate scripts, can produce print-ready digital books and/or all the HTML code you need to make an online version or simply an associated website. Besides, you can easily add to the mix programs like curl or ftp to upload everything to a server (see the sketch below). Personally, my next step will be to extend make_book.sh to generate OpenDocument files thanks to OpenDocument scripting.
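For example, a hypothetical curl one-liner to push the finished PDF to an FTP server (server name and credentials are placeholders) would be:

  curl -T complete_source.pdf ftp://ftp.example.com/books/ --user marco:secret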

How to transform (almost) plain ASCII text to Lulu-ready PDF files, part 2

This page gives a general overview of a flow for transforming ASCII files into print-ready PDF books. The reasons for setting up such a flow in this way are explained in the first part of this tutorial.

Basic workflow

The basic usage of txt2tags is really simple. Once you’ve written something that you need to convert to PDF, text or HTML, you can launch the graphical interface with the --gui option or run a command like this at the prompt:

  txt2tags -t xhtml -i mypost.txt -o mypost.html

This tells txt2tags to save an HTML version of the content of mypost.txt in the mypost.html file. What tells the script the desired output format is the -t (target) option. In this case it is xhtml. Had it been txt or tex, it would have produced a plain text or LaTeX file.
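For instance, a plain text version of the same source is just a different target away:

  txt2tags -t txt -i mypost.txt -o mypost.text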

This figure shows the txt2tags source of this article alongside its plain text, HTML and PDF versions.

As you can see, syntax coloring for txt2tags is already available for Kate (the editor shown above) as well as emacs, Vim and other popular text editors. In order to obtain PDF files, you need to run pdflatex or similar tools on the .tex file created by txt2tags:

  txt2tags -t tex -i mypost.txt -o mypost.tex
  pdflatex mypost.tex

From single files to books

The real power of txt2tags, at least for me, is that it makes it easy to work on multiple, completely independent source files as if they were one (more on this later). This makes it a breeze to create whole books and sets of HTML pages or other content related to the books, always keeping everything in sync and interlinked with the content of the other versions. Here is a real world case, that is how I created the PDF source and its HTML counterparts for the Guide.

I had some specific requirements, which by the way are common to many other projects of mine. First, I wanted each chapter of the book to be in a separate file. This is both to make incremental backups easier and to collate files from different previous projects without duplicating them. Then I wanted to create an independent, online HTML list of all the web pages mentioned in the book, with the same reference numbers used in the printed copy. I also wanted each of the links in that HTML list to carry a descriptive caption that I had written by hand. This is, by the way, the reason why I worked out a custom, but relatively simple, cross-referencing system instead of doing everything in LaTeX. For example, at a certain point in the book I wrote that Windows Vista has been called a “landfill nightmare”. This is the corresponding sentence in the source file, complete with txt2tags markup, which includes the reference URL:

  the UK Green Party officially declared Vista... a ["landfill nightmare" http://www.greenparty.org.uk/news/2851]

I wanted the PDF version of the chapter to include a reference number like [19 - 2], meaning it is the second cross-reference of chapter 19. I also wanted the HTML list to associate with that link the same number and the caption ‘Windows Vista? A “landfill nightmare”’. Using the scripts explained below produced the HTML source for the online version of the chapter, the PDF shown in the previous figure and the HTML list with the same reference numbers that you can see online.
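For illustration (the exact spacing is indicative, not taken from the real output), after processing, the sentence above would come out roughly as:

  the UK Green Party officially declared Vista... a “landfill nightmare” [19 - 2]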

The main script I used to create the PDF version ready for upload shown above is published and explained in the last part of this tutorial. The PDF resulting after processing the cross-references is shown here.

How to transform (almost) plain ASCII text to Lulu-ready PDF files, part 1

Now that they are constantly online, many people write far more than they did in the pre-Internet age. Most of this activity is limited to Web or office-style publishing: people either write something that will only appear inside some Web browser, or a traditional “document”, that is a single file, more or less nicely formatted for printing. Very often, however, they don’t do it in the most efficient way.

The most common solution for the first scenario is still to write HTML or Wiki-formatted content in a text editor or, through a browser, directly in the authoring interface of a CMS like Drupal or WordPress. The other approach is even closer to the typewriting era, since it is limited to using a word processor like OpenOffice. Both methods involve too much manual work for my taste, especially if you often want to reuse or move content from one format to the other.

Since I write a lot for both of the scenarios above, and then some, I realized some time ago that I needed a more efficient and flexible workflow: something as close as possible to “write ONCE, publish anywhere, re-mixing and processing already written stuff in any possible way without going mad along the way”. I wanted to write QUICKLY, without thinking at all about where or in which format the text would end up, while being prepared for all cases, from blog to book. I also wanted to use only Free Software that would run quickly even on old computers, with little or no configuration, if necessary on any operating system. Finally, I wanted to be able to manage, search and process all my writings automatically, with command line utilities or shell scripts.

While I must admit I’m not there yet (especially when working on commission with very particular requirements) I already am pretty close to it in most cases. The rest of this article explains which software I chose and some scripts I wrote to work in this way, that is to write stuff only once and then convert it to a publishing-quality PDF or to HTML with just a few commands.

The first (easy) choice I had to make was “which file format should I use?”. I am a huge fan of the OpenDocument format (ODF), partly because ODF is very easy to hack. However, the requirements above immediately exclude it as a source format for most of the stuff I write. The natural Free-as-in-Freedom format for producing good PDF is still TeX or LaTeX, but I wanted HTML and OpenDocument as final options, which aren’t easy to obtain starting from TeX. Besides, I wanted to write quickly, producing text that is already highly readable in its native format, without too much markup in the way. The obvious conclusion was that I should write plain text marked up with a simple, wiki-like syntax such as ReST, Markdown or txt2tags.

I chose the latter for two reasons. First, it has very good export to all the formats I need (LaTeX, plain text, MediaWiki and HTML) with the exception of ODF, which is, however, relatively easy to add, at least conceptually. Above all, though, txt2tags is simple. Its markup is very readable and easy to learn, but that’s not its biggest quality. What I like is that the actual software consists of one small Python script that runs in a graphical interface or at the prompt with a few options, without depending on any third-party library or additional module. Unless your operating system doesn’t support Python, you only need that script and any text editor to work.

Sure, you need other software to generate PDF files from LaTeX, and auxiliary shell scripts for pre- and post-processing like those I describe below, but (unlike what I found in other markup systems) that’s all Free Software that’s guaranteed to be already packaged in almost all Gnu/Linux distributions (including server-oriented ones, for automatic remote processing!) and also available for Windows. Besides, being a command line tool that can accept text from STDIN or send it to STDOUT, txt2tags integrates perfectly with any other script-based text processing procedure you may need.
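As a trivial example of such integration (assuming, as line 37 of Listing 1 above suggests, that “-i -” reads from STDIN, and that “-o -” likewise writes to STDOUT), you can count the words of the plain text rendering of a draft without creating any intermediate file:

  txt2tags -t txt --no-headers -i - -o - < draft.t2t | wc -w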

Ultra-quick intro to txt2tags syntax, pros and cons

The markup syntax of txt2tags (see its online demo) leaves the source text very readable. Headers have one or more equal signs at the beginning and end of the line. Non-numbered and numbered list items start with a dash or plus character, respectively. Hyperlinks are included in square brackets, asterisks delimit bold text and slashes mark italics. To build tables you enclose the content of each cell in pipe signs (|). Comments start with a percent sign, and preprocessing directives with a percent sign followed by an exclamation mark (%!). The only two things I care about that txt2tags doesn’t support natively are footnotes and cross-references to tables and figures. For footnotes there’s one workaround in this tutorial and one in the txt2tags configuration file of S. D'Archino. Cross-references are (relatively speaking) much more complicated to add, but are still possible by generalizing the approach described in the final part of this article, if you really need them.
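Here is a small, self-contained taste of that syntax:

  %!target: xhtml
  % this comment will not appear in the output

  = A top level header =

  Some **bold text**, some //italics//, and a [hyperlink http://www.example.com].

  - a non-numbered list item
  + a numbered list item

  | cell 1 | cell 2 |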
