Archiving My Blogs

October 18, 2024

I recently came across an interesting idea from one of the oldest blogs on the internet, Scripting News: backing up a blog to a GitHub repository for archival purposes. I liked the idea so much that I decided to write my own script to do exactly that. As of today, all of my blogs are backed up nightly to public repositories on GitHub (see the links below).

So why archive them? There are a number of reasons, but primarily it’s so that there is a browsable copy in case anything should happen to any of the blogs themselves. GitHub is a stable platform and is easily accessible to anyone. Since I primarily write about technology and development topics, most of my readers will already be familiar with git and GitHub.

So how does it work? The Node.js script is built on a modified version of another script I wrote for exporting WordPress posts to Markdown files, which reads the data through the WordPress REST API. It also downloads post images, author details, and taxonomies (categories and tags). All of this data is saved in the archive repositories, which are already checked out on my server for this purpose. The script then runs git pull, git add ., git commit, and git push using the exec function from the child_process module in Node’s standard library. Because git itself does the work, only actual changes are committed.
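The git step described above could look roughly like the following. This is a sketch under a few assumptions, not the actual script: the function names (`git`, `hasChanges`, `syncArchive`) are made up for illustration, it uses `execFile` instead of `exec` so that no shell quoting is needed, and it checks `git status --porcelain` before committing because `git commit` exits with an error when there is nothing to commit.

```javascript
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");

const run = promisify(execFile);

// Run a single git command inside the checked-out archive repository.
async function git(repoDir, args) {
  const { stdout } = await run("git", args, { cwd: repoDir });
  return stdout;
}

// `git status --porcelain` prints one line per changed file
// and nothing at all when the working tree is clean.
function hasChanges(porcelainOutput) {
  return porcelainOutput.trim().length > 0;
}

// Pull, stage, and (only if something changed) commit and push.
// Assumes the export has already written its files into repoDir.
// Resolves to true when a commit was pushed, false otherwise.
async function syncArchive(repoDir, message) {
  await git(repoDir, ["pull"]);
  await git(repoDir, ["add", "."]);

  const status = await git(repoDir, ["status", "--porcelain"]);
  if (!hasChanges(status)) {
    return false; // nothing new since the last run
  }

  await git(repoDir, ["commit", "-m", message]);
  await git(repoDir, ["push"]);
  return true;
}
```

With a hypothetical repository path, a nightly run would then boil down to something like `syncArchive("/path/to/archive-repo", "Nightly archive")`.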

My first draft used the octokit package from GitHub to push everything through GitHub’s API, but in my opinion that was too slow and clumsy, and it committed everything each time it ran rather than just the changes. So I rewrote the script to use git the way git should be used: committing only actual changes. Another advantage is that I could theoretically add multiple remotes so that I could have multiple archives. I may add a second one for GitLab so that my blogs are backed up on both GitHub and GitLab. We’ll see if I feel motivated to do that at some point.
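For the multiple-archive idea, one possible approach is to register each archive as its own git remote (for example with `git remote add gitlab <url>`) and push to every remote in turn. The sketch below assumes that setup; `parseRemotes` and `pushToAllRemotes` are hypothetical helper names, and nothing like this exists in the script yet.

```javascript
const { execFile } = require("node:child_process");
const { promisify } = require("node:util");

const run = promisify(execFile);

// `git remote` prints one remote name per line, e.g. "origin\ngitlab\n".
function parseRemotes(output) {
  return output
    .split("\n")
    .map((line) => line.trim())
    .filter(Boolean);
}

// Push the current branch to every remote configured in the repository.
async function pushToAllRemotes(repoDir) {
  const { stdout } = await run("git", ["remote"], { cwd: repoDir });
  for (const remote of parseRemotes(stdout)) {
    // "HEAD" pushes the current branch to a branch of the same name.
    await run("git", ["push", remote, "HEAD"], { cwd: repoDir });
  }
}
```

Pushing sequentially keeps the output easy to follow, and a failure on one remote stops the run instead of being silently skipped.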

The script is executed by a cron job every night at 4 am directly on my server.

And that’s it. That’s all there is to it. It works rather well, but there are a few features I still want to add such as archiving post comments. That should be pretty trivial with WordPress’s REST API though.
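As a rough idea of what archiving comments might involve, the sketch below reads a post’s comments from the `/wp/v2/comments` endpoint and pages through the results using the `X-WP-TotalPages` response header. It rests on several assumptions: WordPress is installed at the root of the domain with the REST API reachable under `/wp-json/`, Node 18 or later is used so that `fetch` is available globally, and the helper names and the Markdown layout are invented for illustration. Note that `content.rendered` is HTML and is passed through unconverted here.

```javascript
// Build the URL for one page of comments on a given post.
// Assumes the REST API lives at <siteUrl>/wp-json/.
function commentsUrl(siteUrl, postId, page) {
  const url = new URL("/wp-json/wp/v2/comments", siteUrl);
  url.searchParams.set("post", String(postId));
  url.searchParams.set("per_page", "100"); // 100 is the API maximum
  url.searchParams.set("page", String(page));
  return url.toString();
}

// Fetch every publicly visible comment for a post, following pagination.
async function fetchComments(siteUrl, postId) {
  const comments = [];
  let page = 1;
  let totalPages = 1;

  do {
    const res = await fetch(commentsUrl(siteUrl, postId, page));
    if (!res.ok) {
      throw new Error(`Request failed with status ${res.status}`);
    }
    // WordPress reports the number of pages in this response header.
    totalPages = Number(res.headers.get("X-WP-TotalPages")) || 1;
    comments.push(...(await res.json()));
    page += 1;
  } while (page <= totalPages);

  return comments;
}

// Render one comment as a small Markdown snippet (hypothetical layout).
function commentToMarkdown(comment) {
  return `**${comment.author_name}** (${comment.date})\n\n${comment.content.rendered}\n`;
}
```

The resulting snippets could be written next to each post’s Markdown file so that the existing git step picks them up without any changes.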

Links

About the Author

Alex Seifert
Alex is a developer, a drummer and an amateur historian. He enjoys being on the stage in front of a large crowd, but also sitting in a room alone, programming something or reading a scary story.
