pyblosxom2wxr.sh is a shell script that migrates content from PyBlosxom to WordPress. It converts PyBlosxom posts and comments into a WXR (WordPress eXtensible RSS) file that can be imported into WordPress.
The post file extension is hard-coded to .txt, since that’s what PyBlosxom expects.
Pages are supported as well as posts. pyblosxom2wxr assumes that post filenames start with the date, in YYYY-MM-DD format, e.g. 2010-07-28_my_post.txt. Files without a prefix in that format are assumed to be pages. (This is hard-coded but would be easy to change. Search for the
The filename is used as the WordPress post/page GUID, and the first line of the file is extracted and used as the title. The second line is assumed to be blank. If your files don’t follow that format, you’ll want to preprocess them or tweak the script.
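The filename and layout conventions above can be sketched as a few small shell functions. This is a minimal illustration, not code taken from pyblosxom2wxr.sh: the function names entry_type, entry_title, and entry_body are hypothetical, and treating everything after the second line as the body is an assumption.

```sh
#!/bin/sh
# Hypothetical helpers; names are illustrative, not from pyblosxom2wxr.sh.

# Print "post" if the filename starts with YYYY-MM-DD, otherwise "page".
entry_type() {
    case "$(basename "$1")" in
        [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*) echo post ;;
        *) echo page ;;
    esac
}

# Print the title, which is the first line of the file.
entry_title() {
    head -n 1 "$1"
}

# Print the body, assumed here to be everything after the title
# and the blank second line.
entry_body() {
    sed '1,2d' "$1"
}
```

If your files keep the title somewhere else, a preprocessing step that rewrites them into this layout is probably simpler than changing the extraction logic.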
Categories are not (yet) supported. All posts and pages are assigned to the “uncategorized” category in WordPress.
WordPress limits import files to 2MB, but pyblosxom2wxr can generate output files larger than that. If that happens, you can split it manually or with a tool like ChoppedPress.
By default, the last modified time of post and page files is used as their timestamp. However, if you have a timestamps file from the hardcodedates PyBlosxom plugin, it will be used instead. The default path is ../timestamp; you can customize this by editing the timestamp_file variable in the script.
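As a rough sketch of the default behavior only, a file's last modified time can be read and formatted as YYYY-MM-DD HH:MM:SS like this. The function name entry_timestamp is hypothetical, and the example assumes GNU coreutils, where date -r FILE reports the file's modification time. It does not cover reading the hardcodedates timestamps file.

```sh
#!/bin/sh
# Hypothetical helper; assumes GNU date, where -r FILE uses the file's mtime.
# Prints the modification time in local time as YYYY-MM-DD HH:MM:SS.
entry_timestamp() {
    date -r "$1" '+%Y-%m-%d %H:%M:%S'
}
```

Because copying or checking out files often resets modification times, it may be worth spot-checking a few entries with a helper like this before running the migration.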
pyblosxom2wxr doesn’t assign post ids. It omits <wp:post_id> elements in the output file. This makes WordPress allocate post ids itself.
However, WordPress won’t allocate comment ids itself, so pyblosxom2wxr has to do that and populate them in <wp:comment_id> elements. This means that importing a WXR file generated by pyblosxom2wxr may overwrite any existing comments!
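One simple way to allocate comment ids in a shell script is a running counter, sketched below. The names next_comment_id and emit_comment_id are hypothetical and the starting value of 1 is an assumption; the actual script may allocate ids differently. Whether a higher starting value avoids overwriting existing comments depends on the WordPress importer and is not confirmed here.

```sh
#!/bin/sh
# Hypothetical counter; the starting value is an assumption.
next_comment_id=1

# Print one <wp:comment_id> element and advance the counter.
# Call this directly (not inside $(...)), because a command
# substitution runs in a subshell and the increment would be lost.
emit_comment_id() {
    printf '<wp:comment_id>%d</wp:comment_id>\n' "$next_comment_id"
    next_comment_id=$((next_comment_id + 1))
}
```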
- Posts with more than 256 comments are not supported well. Only the last 256 comments will be imported, and they will likely be ordered incorrectly. See the TODO near the end of the script.