Reviving the Blog: Part 3 - Importing from WordPress

A blog isn’t terribly interesting unless it has content. Now that I have selected a platform and set it up, it’s time to import the content from my old blog. So how does one import the nearly 60 articles I wrote over the course of a year or so from WordPress into the new platform?

It turns out that Hexo has a number of plugins to import data from alternate blogging engines, including one for WordPress. This plugin is able to read an export file generated by WordPress and create the markdown posts for the Hexo blog. Of course, to generate the export file, I need a running WordPress site…

Docker to the Rescue

To run WordPress, you need a server that can run the PHP software and a MySQL database for the posts and other settings. Fortunately, someone created a Docker Compose file that runs both of these things! Docker is great for cases like this where you would normally need to install a lot of software you don’t normally use.

After installing Docker Desktop (you could also use Rancher Desktop), I created a directory containing the WordPress docker-compose.yml file. To start the services, simply run

docker-compose up --detach

This starts two containers: one running the database and one running WordPress. You can see them by running the following:

$ docker container ls
CONTAINER ID   IMAGE                  COMMAND                  CREATED       STATUS              PORTS                               NAMES
e915b8ac2154   wordpress:latest       "docker-entrypoint.s…"   7 weeks ago   Up About a minute   0.0.0.0:80->80/tcp                  wordpress-wordpress-1
9e130687e174   mariadb:10.6.4-focal   "docker-entrypoint.s…"   7 weeks ago   Up About a minute   0.0.0.0:3306->3306/tcp, 33060/tcp   wordpress-db-1

This shows that two containers are running. WordPress is running in the container named wordpress-wordpress-1. The other container, named wordpress-db-1, is the database, which here is MariaDB (a MySQL-compatible database).

From the PORTS column, you can see that WordPress is listening on port 80. This means that you should be able to visit http://localhost:80 and see the default WordPress installation.
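The compose file itself is not reproduced in this post. As a rough sketch only, a file along these lines would be consistent with the container listing above. The image tags and published ports come from that listing; the volume, the root password and the exact environment variables are assumptions, and docker-compose.example.yml is a hypothetical file name chosen so the snippet cannot overwrite a real compose file.

```shell
# Write an example compose file (a reconstruction, not the original file).
cat > docker-compose.example.yml <<'EOF'
services:
  db:
    image: mariadb:10.6.4-focal
    restart: always
    environment:
      # Assumed values; the root password in particular is a placeholder
      MYSQL_ROOT_PASSWORD: change-me
      MYSQL_DATABASE: wordpress
      MYSQL_USER: wordpress
      MYSQL_PASSWORD: wordpress
    ports:
      - "3306:3306"   # published so a client on the host can reach the database
    volumes:
      - db_data:/var/lib/mysql
  wordpress:
    image: wordpress:latest
    restart: always
    depends_on:
      - db
    ports:
      - "80:80"
    environment:
      WORDPRESS_DB_HOST: db
      WORDPRESS_DB_NAME: wordpress
      WORDPRESS_DB_USER: wordpress
      WORDPRESS_DB_PASSWORD: wordpress
volumes:
  db_data:
EOF
```

The service names (db and wordpress) matter: Compose combines them with the project directory name to form container names such as wordpress-db-1.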

Importing the Backup

My previous hosting provider, BlueHost, had a nifty utility for making backups of a WordPress site. This generated a compressed archive containing two things:

  1. A SQL import script to recreate the database
  2. The files on disk that make up the WordPress installation (WordPress itself, configuration files, images, etc.)

First, I installed the mysql-client package (e.g. brew install mysql-client) and then imported the SQL backup file:

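# backup.sql stands in for the SQL script extracted from the archive.
# --protocol=tcp keeps the client from using a local socket for localhost.
mysql --host=localhost --port=3306 --protocol=tcp \
    --database=wordpress \
    --user=wordpress \
    --password=wordpress < backup.sql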

The database name, username and password are all configured in the docker-compose.yml file. The values shown above are the defaults.
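If installing a client on the host is not an option, the import can probably also be run through the client inside the database container. This is a sketch under assumptions: import_sql_via_container is a hypothetical helper name, the default container name is taken from the listing earlier, and it assumes the MariaDB image provides a mysql client binary.

```shell
# Hypothetical helper: pipe a SQL dump into the mysql client inside the
# database container, so nothing needs to be installed on the host.
import_sql_via_container() {
  # $1 = path to the SQL dump on the host
  # $2 = database container name (defaults to the name from the listing)
  dump_file="$1"
  container="${2:-wordpress-db-1}"
  # -i keeps stdin open so the redirected dump reaches the client
  docker exec -i "$container" \
    mysql --user=wordpress --password=wordpress wordpress < "$dump_file"
}

# Usage (backup.sql is a placeholder file name):
#   import_sql_via_container backup.sql
```

Because the client runs inside the container, this variant does not depend on port 3306 being published to the host.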

The backup file creates or recreates all the tables it needs, so the database is now restored. Next, I need to replace the default WordPress files with the ones from my backup:

# Copy the archive into the wordpress container
docker cp wp_backup1533348557.tar.gz wordpress-wordpress-1:/var/www/html

# Shell into the container
docker exec -it wordpress-wordpress-1 /bin/bash

# Back up the default installation in case I need it
tar zcf wp-backup-orig.tgz * .htaccess

# Remove the default installation
rm -rf *.php wp-content wp-includes wp-admin

# Expand the archive
tar zxf wp_backup1533348557.tar.gz

# The backup was in a subdirectory, move it into place
cp -R davidjhay/* .

There are just a few more things left to configure before I can use my backup.

The first thing I discovered is that the Docker container doesn’t have a text editor, which makes it hard to edit the configuration files. So let’s fix that:

# Refresh the package lists first; the image ships without them
apt update && apt install -y vim

The database connection information is almost certainly not the same as it was at my original hosting provider. To point WordPress at my Docker container, I edit the wp-config.php file:

// ** MySQL settings - You can get this info from your web host ** //
/** The name of the database for WordPress */
define('DB_NAME', 'wordpress');

/** MySQL database username */
define('DB_USER', 'wordpress');

/** MySQL database password */
define('DB_PASSWORD', 'wordpress');

/** MySQL hostname */
define('DB_HOST', 'db');

You may have noticed that the hostname is not localhost, which was used during the MySQL import. When working within a Docker container, you reference the other containers by the service name given in docker-compose.yml. In this case, the database service is named db.
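These edits can also be scripted, which avoids needing an editor in the container at all. The following is a minimal sketch, not the method used in this post: set_wp_define is a hypothetical helper, and it assumes each setting sits on its own define(...) line with single-quoted values, as in the excerpt above.

```shell
# Hypothetical helper: set one define('NAME', 'value') line in a PHP config.
# Limitation: values containing | or & would need escaping first.
set_wp_define() {
  # $1 = config file, $2 = constant name, $3 = new value
  # A .bak copy of the original file is left next to it.
  sed -i.bak -E "s|^(define\\( *'$2', *)'[^']*'|\\1'$3'|" "$1"
}

# Usage, run from /var/www/html inside the WordPress container:
#   set_wp_define wp-config.php DB_NAME wordpress
#   set_wp_define wp-config.php DB_USER wordpress
#   set_wp_define wp-config.php DB_PASSWORD wordpress
#   set_wp_define wp-config.php DB_HOST db
```

Only the quoted value is replaced, so the spacing and trailing characters of each line are left as they were.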

The other change I had to make was to switch the site away from using HTTPS for its connections. I don’t have an SSL certificate, nor do I need one for the purposes of getting an export file. The site URLs are stored in the options table, so I updated them from the MySQL client:

select * from wp_jwta_options where option_value like 'https://%';

+-----------+-------------+---------------------------+----------+
| option_id | option_name | option_value              | autoload |
+-----------+-------------+---------------------------+----------+
|         1 | siteurl     | https://www.davidjhay.com | yes      |
|         2 | home        | https://www.davidjhay.com | yes      |
+-----------+-------------+---------------------------+----------+

update wp_jwta_options
set option_value = 'http://www.davidjhay.com'
where option_name in ('siteurl', 'home');

We’re almost there!

The final step is to update the local hosts file so that the system resolves my custom domain locally. For me, this meant running a command prompt as Administrator and editing C:\Windows\System32\drivers\etc\hosts (/etc/hosts for you Mac and Linux users out there).

127.0.0.1 localhost www.davidjhay.com

This last step is not strictly necessary. However, the previous incarnation of the blog had a lot of hard-coded absolute URLs, and tracking them down and fixing them just to export the content wasn’t worth the effort.
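A quick way to confirm that the hosts entry is in place is to scan the file for the mapping. The snippet below is only a sketch: hosts_maps_to is a hypothetical helper, and it checks the file contents rather than what the operating system actually resolves.

```shell
# Hypothetical helper: succeed if a hosts-format file maps a name to an IP.
hosts_maps_to() {
  # $1 = hosts file, $2 = hostname, $3 = expected IP address
  awk -v name="$2" -v ip="$3" '
    { sub(/#.*/, "") }    # drop comments before looking at the fields
    $1 == ip { for (i = 2; i <= NF; i++) if ($i == name) found = 1 }
    END { exit (found ? 0 : 1) }
  ' "$1"
}

# Usage on Mac or Linux:
#   hosts_maps_to /etc/hosts www.davidjhay.com 127.0.0.1 && echo "entry found"
```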

Whew! I can now log in to the admin interface of my old WordPress blog and export the content.

  1. Go to Tools -> Export in the admin sidebar
  2. Click “Export All”
  3. Click “Download” to get the zip file containing the XML export file.
  4. Unzip the export file

We are now ready to import the posts into the new Hexo site.

Hexo WordPress Import

After all of that, the import is somewhat anticlimactic. We simply install the migration plugin and run the import:

npm install hexo-migrator-wordpress
hexo migrate wordpress <wordpress-export.xml>

After a moment or two, I had a whole bunch of markdown files in the source/_posts directory of my website!

At this point, the remaining work involved some search and replace operations to remove the absolute URLs, cleaning up any formatting that didn’t translate (code snippets didn’t import cleanly, for example) and just tidying up the posts.
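The search-and-replace step might look something like the following. It is a sketch rather than the exact commands I ran: strip_site_prefix is a hypothetical helper, and it assumes the posts should end up with root-relative links (for example /2018/01/post/) in place of absolute ones.

```shell
# Hypothetical helper: turn absolute links to the old site into
# root-relative links in every markdown file under a directory.
strip_site_prefix() {
  # $1 = directory of markdown posts, $2 = old site host name
  posts_dir="$1"
  # Escape the dots so they match literally in the regular expression
  host_re=$(printf '%s' "$2" | sed 's/\./\\./g')
  find "$posts_dir" -type f -name '*.md' -exec \
    sed -i.bak -E "s#https?://${host_re}/#/#g" {} +
  # Remove the backup copies that sed -i.bak leaves behind
  find "$posts_dir" -type f -name '*.md.bak' -delete
}

# Usage, from the root of the Hexo site:
#   strip_site_prefix source/_posts www.davidjhay.com
```

Links to other sites are left alone, since only URLs that start with the given host name are rewritten.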

Wrapping Up

This may seem like a lot of work. In reality, it took longer to write up the process in this post than it did to actually do the work. Like any project, if you take it step by step and solve each problem one at a time, you’ll be done in no time.

In hindsight, I should have read the documentation for the Hexo migration plugin more thoroughly. I spent a fair amount of time fixing the paragraph breaks in all of the posts, only to discover that the migration plugin would have done that for me (just add the --paragraph-fix parameter).

The migration plugin also would have dealt with all the image attachments. As it turns out, however, it didn’t really matter since I had a different plan for the images.

That story will have to wait until another day.