Moving from Acrylamid to Ghost

    When I decided to move to Ghost at the beginning of the month, I realized that I needed to act quickly. I had been postponing this for years, so either I did it during the winter holidays, or it would get put off for who knows how long. So I set up the Ghost instance on DigitalOcean, which was a simple process. I also manually moved the last 20 posts and the 10 most viewed posts, so that there would be some content here, and then I switched the DNS for rolisz.ro to point to the Ghost instance. Moving the posts manually took me about two days.

    But I started blogging 10 years ago, and in the meantime I have written over 400 posts. Some of them have not aged well and I deleted them: some pointed to YouTube videos that no longer exist, some were references to my university exams, and some I am simply too ashamed to have written. That still leaves me with about 350 posts which I wanted to keep.

    I didn't want to move another 330 posts by hand, so I wrote a tool to export my data from Acrylamid into JSON and then import it into Ghost.

    Ghost uses MobileDoc to store post content. The recommended way of importing posts from external sources is to use the Ghost Admin API to import HTML, and Ghost will then do a best-effort conversion to MobileDoc. Unfortunately, they say it's a lossy conversion, so some things might not look the same when Ghost renders HTML from the MobileDoc.

    My posts were in Markdown format. The easiest way to hack an exporter together was to piggyback on top of Acrylamid, by modifying the view that generated the search JSON. That view already exported JSON, but it was stripped of HTML and it lacked some metadata, such as the URL. I removed the HTML stripping, enabled all filters, and added the needed metadata. Because I had a custom picture gallery filter, I had to modify it to add <!--kg-card-begin: html--> before the gallery code and <!--kg-card-end: html--> after it. These two comments tell the Ghost importer to put whatever is between them in an HTML card.
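
    The wrapping step can be sketched roughly as below. This is only an illustration: wrap_in_html_card and the sample gallery markup are made-up names and data, not the code of my actual Acrylamid filter. Only the two marker comments come from Ghost.

```python
def wrap_in_html_card(fragment: str) -> str:
    """Surround an HTML fragment with the marker comments for a Ghost HTML card."""
    return "<!--kg-card-begin: html-->" + fragment + "<!--kg-card-end: html-->"


# Hypothetical gallery markup, standing in for the real filter's output.
gallery = '<div class="gallery"><img src="/static/images/a.jpg"></div>'
wrapped = wrap_in_html_card(gallery)
print(wrapped)
```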

    The importer uses the recommended Admin API for creating the posts. To use the Admin API, you have to create a new custom integration and get the Admin API key from there. To upload HTML-formatted posts, you have to append ?source=html to the post creation endpoint.

    import datetime
    
    import jwt  # PyJWT
    import requests
    
    # ADMIN_KEY is the "id:secret" string from the custom integration
    # Split the key into ID and SECRET
    key_id, secret = ADMIN_KEY.split(':')
    
    def write_post(title, post_date, tags, content=None):
        # Prepare header and payload; the token is valid for 5 minutes
        iat = int(datetime.datetime.now().timestamp())
    
        header = {'alg': 'HS256', 'typ': 'JWT', 'kid': key_id}
        payload = {
            'iat': iat,
            'exp': iat + 5 * 60,
            'aud': '/v3/admin/'
        }
    
        # Create the token (including decoding secret)
        token = jwt.encode(payload, bytes.fromhex(secret), algorithm='HS256', headers=header)
        # PyJWT 1.x returns bytes, PyJWT 2.x returns str
        if isinstance(token, bytes):
            token = token.decode()
    
        # Make an authenticated request to create a post
        url = 'https://rolisz.ro/ghost/api/v3/admin/posts/?source=html'
        headers = {'Authorization': 'Ghost {}'.format(token)}
        body = {'posts': [{'title': title, 'tags': tags, 'published_at': post_date, 'html': content}]}
        r = requests.post(url, json=body, headers=headers)
    
        return r
    Python function to upload a new post to Ghost

    Because I had already manually moved some posts (and because I first ran the importer script on a subset of the posts), I needed to check whether a post already existed before inserting it, otherwise Ghost would create a duplicate entry. To do this, I relied on Ghost creating the same slug from a title as Acrylamid did. This actually failed for about 5 posts (for example, ones that had apostrophes or accented letters in the title), but I cleaned those up manually.
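
    To illustrate how two slug generators can disagree on exactly those titles, here is a small sketch. Both rules are hypothetical and written only for this example; neither is the real slug algorithm of Ghost or of Acrylamid.

```python
import re
import unicodedata


def slug_strip(title: str) -> str:
    """Hypothetical rule A: transliterate accents and drop apostrophes."""
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode()
    ascii_title = ascii_title.replace("'", "")
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def slug_dash(title: str) -> str:
    """Hypothetical rule B: replace every run of other characters with a dash."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


for title in ["Moving to Ghost", "Don't panic", "Café notes"]:
    print(title, "|", slug_strip(title), "|", slug_dash(title))
```

    With these two rules, the plain title produces the same slug, while the titles with an apostrophe or an accented letter do not, so a URL-based existence check would miss them.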

    import datetime
    import json
    from time import sleep
    
    import requests
    
    with open("posts.json") as fp:
        posts = json.load(fp)
    
    for f in posts:
        # A post already exists if its URL (derived from the slug) returns 200
        resp = requests.get("https://rolisz.ro" + f['url'])
        sleep(0.5)
        d = datetime.datetime.strptime(f["date"], "%Y-%m-%dT%H:%M:%S%z")
        if resp.status_code != 200:
            # Images moved from /static/images/ to Ghost's /content/images/
            f['content'] = f['content'].replace("/static/images/", "/content/images/")
            write_post(f['title'], d.isoformat(timespec='milliseconds'),
                       f['tags'], f['content'])
            sleep(1)
    Code to prepare posts for upload

    Ghost also expected the post publish date to have timezone information, which my exporter didn't add, so I had to do a small conversion here. I also corrected the paths of the images. Previously they were in a folder called static, while Ghost stores them in content.
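
    The two small fixes can be isolated into helpers like the ones below. This is a sketch under assumptions: the function names are made up, the input is taken to be a timestamp without any offset, and UTC is used as a placeholder timezone, which is not necessarily the one the posts were written in.

```python
from datetime import datetime, timezone


def to_ghost_date(raw: str, tz=timezone.utc) -> str:
    """Parse a naive timestamp, attach a timezone, and format it with milliseconds."""
    naive = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S")
    # UTC is an assumption for this example; pass another tzinfo if needed.
    return naive.replace(tzinfo=tz).isoformat(timespec="milliseconds")


def fix_image_paths(html: str) -> str:
    """Point image URLs at Ghost's content folder instead of the old static one."""
    return html.replace("/static/images/", "/content/images/")


print(to_ghost_date("2019-12-28T10:30:00"))
print(fix_image_paths('<img src="/static/images/a.jpg">'))
```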

    Because my Ghost blog is hosted on a $5 DigitalOcean instance (referral link), it couldn't handle my Python script hammering it with several posts a second, so I had to add some sleeps: one after checking whether a post exists and one after uploading it.
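
    A slightly more structured version of the same idea is sketched below. It is not what my script does: upload_throttled is a made-up helper that pauses after every upload and also collects the posts that failed, so that they could be retried later. The sleep function is a parameter only to make the helper easy to try out without actually waiting.

```python
import time


def upload_throttled(posts, upload, pause=1.0, sleep=time.sleep):
    """Call upload(post) for each post, pausing after every call.

    Returns the posts whose upload raised an exception.
    """
    failed = []
    for post in posts:
        try:
            upload(post)
        except Exception:
            # Keep going; the caller decides whether to retry these.
            failed.append(post)
        sleep(pause)
    return failed
```

    In the importer above, upload could be a small wrapper around write_post, and pause would be tuned to whatever the server can handle.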

    After uploading all the posts like this, I still had to make some manual changes. For example, Ghost has the concept of a featured image and I wanted to use it. In general, I want my posts going forward to have at least one image, even if it's a random one from Unsplash. In some cases I could use an existing image from a post as the featured image, in other cases I had to find a new one. Also, code blocks didn't make it smoothly through the MobileDoc converter, so most of them needed some adjustment.

    Going through all my old posts took me a couple of days (much less than it would have taken without the importer), and it was a fun nostalgia trip through the kind of things that were on my mind 10 years ago. For example, back then I was very much into customizing my Windows, with all kinds of Visual Styles, desktop gadgets and tools to make you more "productive". I now use only one thing from that list: F.lux. Also, the reviews that I did of books, movies and TV shows were much more bland (at least I hope that I now write in a more entertaining style).