Generating AI-Powered Blog Titles for Your MDX Blog

TL;DR

For some reason, I have a particular disdain for writing blog titles and meta descriptions. So I thought "maybe AI could do it for me..."

Turns out, it can. And it's pretty easy.

Why Use AI for Blog Titles?

Because writing them feels like busy work. And they are usually words that I would struggle to come up with myself. And I hate it. ⛈

The OG Blog Post Script

On this blog, all of my MDX files are stored in /src/content/articles. I have a simple script that lets me run

$ yarn article:new some-article-slug-i-like

When I run that, it generates a numbered MDX file in that directory with the appropriate slug. Here's what that script looked like originally:

/scripts/add-article.js

const fs = require('fs')
const path = require('path')

// helper that gets the number of the file from a string like 0001-my-first-post.mdx
function getFileNameIndex(fileName) {
  if (!fileName) return 0
  const match = fileName.match(/^[0-9]+\-/)
  if (!match) return 0
  return parseInt(match[0].slice(0, -1), 10)
}

async function addArticle() {
  // grab the title from the command line arguments
  const title = process.argv[2]
  const directory = process.argv[3] || 'articles'
  const directoryPath = `src/content/${directory}`

  // read the src/content/articles directory and get the last numbered file (e.g. 0035-hello-world.mdx)
  const files = fs.readdirSync(
    path.join(process.cwd(), directoryPath),
    {
      withFileTypes: true,
    }
  )

  // sort by index and grab the last file
  const lastFile = files
    .filter((file) => file.name.endsWith('.mdx'))
    .sort((a, b) => {
      const aIndex = getFileNameIndex(a?.name) || 0
      const bIndex = getFileNameIndex(b?.name) || 0
      return bIndex - aIndex
    })[0]

  // get the index of the last file
  const index = getFileNameIndex(lastFile?.name) || 0

  // ensure the new index is a string with 4 digits
  const indexString = `${index + 1}`.padStart(4, '0')

  // get the current date in the format YYYY-MM-DD
  const date = new Date().toISOString().split('T')[0]

  // create a new file name with the next index
  const newFileName = `${indexString}-${title.toLowerCase().replace(/ /g, '-')}.mdx`

  // create the new file
  fs.writeFileSync(
    path.join(process.cwd(), directoryPath, newFileName),
    `---
author: Ty Barho
date: '${date}'
title: ${title}
description:
tags:
---

`
  )
}

async function run() {
  await addArticle()
}

run()

And the corresponding package.json entry:

/package.json

{
  "scripts": {
    "dev": "next dev",
    "build": "next build",
    "start": "next start",
    "lint": "next lint",
    "postbuild": "next-sitemap",
    "article:new": "node ./scripts/add-article.js"
  }
}

This was a nice setup. If I ran the command

$ yarn article:new how-to-build-an-mdx-blog-in-next

the script would automatically generate a file named /src/content/articles/0123-how-to-build-an-mdx-blog-in-next.mdx with frontmatter content like this:

---
author: Ty Barho
date: '2023-08-24'
title: how-to-build-an-mdx-blog-in-next
description:
tags:
---

The bad part? I still have to go write the dang SEO title and description attributes myself. What a pain. 😡

Setting Up the AI Model

I've been messing with LLMs for the past couple of weeks, and I thought: "Man, I really hate writing blog titles and meta descriptions. I wonder if ChatGPT could do that for me."

Turns out, it can.

Here's how I edited the script...

/scripts/add-article.js

const fs = require('fs')
const path = require('path')
const OpenAI = require('openai')

require('dotenv').config({
  path: path.join(process.cwd(), '.env.development.local'),
})

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
})

const defaultModelOptions = {
  model: 'gpt-3.5-turbo',
  stream: false,
}

function getFileNameIndex(fileName) {
  if (!fileName) return 0
  const match = fileName.match(/^[0-9]+\-/)
  if (!match) return 0
  return parseInt(match[0].slice(0, -1), 10)
}

async function addArticle() {
  // grab the title from the command line arguments
  const title = process.argv[2]
  const directory = process.argv[3] || 'articles'
  const directoryPath = `src/content/${directory}`

  // read the src/content/articles directory and get the last numbered file (e.g. 0035-hello-world.mdx)
  const files = fs.readdirSync(path.join(process.cwd(), directoryPath), {
    withFileTypes: true,
  })

  const lastFile = files
    .filter((file) => file.name.endsWith('.mdx'))
    .sort((a, b) => {
      const aIndex = getFileNameIndex(a?.name) || 0
      const bIndex = getFileNameIndex(b?.name) || 0
      return bIndex - aIndex
    })[0]

  // get the index of the last file
  const index = getFileNameIndex(lastFile?.name) || 0

  // ensure the index is a string with 4 digits
  const indexString = `${index + 1}`.padStart(4, '0')

  // get the current date in the format YYYY-MM-DD
  const date = new Date().toISOString().split('T')[0]

  // create a new file name with the next index
  const newFileName = `${indexString}-${title
    .toLowerCase()
    .replace(/ /g, '-')}.mdx`

  try {
    // few-shot examples: a system prompt plus sample input/output pairs
    const training = [
      { role: 'system', content: '[your base training here]' },
      { role: 'user', content: '0001-sample-post-1' },
      { role: 'assistant', content: '[example response 1]' },
      { role: 'user', content: '0002-sample-post-2' },
      { role: 'assistant', content: '[example response 2]' },
    ]

    const response = await openai.chat.completions.create({
      ...defaultModelOptions,
      messages: [
        ...training,
        // the file name we actually want a title, description, and tags for
        { role: 'user', content: newFileName },
      ],
    })

    const res = response.choices[0].message.content
    const ai = JSON.parse(res)

    const content = `---
author: Ty Barho
date: '${date}'
title: "${ai.title}"
description: "${ai.description}"
tags: ${ai.tags}
---

  `

    // create the new file
    fs.writeFileSync(
      path.join(process.cwd(), directoryPath, newFileName),
      content
    )
  } catch (e) {
    console.error(e)
    process.exit(1)
  }
}

async function run() {
  await addArticle()
}

run()
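One thing to watch with the template string above: if the model hands back a title containing a double quote, the hand-built frontmatter may stop being valid YAML. Here's a rough sketch of a safer way to render it. The renderFrontmatter helper is hypothetical (it is not in my actual script), and it leans on the fact that a JSON string is also a valid double-quoted YAML scalar in YAML 1.2:

```javascript
// Hypothetical helper (not part of the original script): builds the
// frontmatter block from values that have already been parsed.
function renderFrontmatter({ author, date, title, description, tags }) {
  return [
    '---',
    `author: ${author}`,
    `date: '${date}'`,
    // JSON.stringify adds the surrounding quotes and escapes any inner ones
    `title: ${JSON.stringify(title)}`,
    `description: ${JSON.stringify(description)}`,
    `tags: ${tags}`,
    '---',
    '',
    '',
  ].join('\n')
}

// Example usage with a title that would break naive "${title}" quoting
const example = renderFrontmatter({
  author: 'Ty Barho',
  date: '2023-10-18',
  title: 'The "Easy" Way to Write Titles',
  description: 'A short demo description.',
  tags: 'ai,mdx',
})
console.log(example)
```

The output has the same shape as before, so nothing else in the script would need to change.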

After a little prompt engineering, it worked like a charm! For example, if I run it with this command:

$ yarn article:new writing-an-automated-keyword-finder-with-puppeteer-and-ai

Here's the frontmatter I get:

---
author: Ty Barho
date: '2023-10-18'
title: "Automated Keyword Finder using Puppeteer and AI"
description: "Learn how to build an automated keyword finder using Puppeteer to scrape search engine results and AI algorithms for keyword analysis."
tags: automation,puppeteer,ai,keyword-analysis,web-scraping
---

Now, instead of writing all that from my already addled brain, I can do some quick edits, and I'm done.

Pretty neat, huh? 😎🤖

Prompt Engineering Example

For simplicity, I didn't include my training data above, just so you could read the code more clearly. But, for those who don't yet have a lot of experience with engineering these prompts, here they are for you to tweak to your use case.

The System Prompt

We start with the system prompt, which gives the LLM some basic parameters to operate by.

Woe

I'm still working through getting system prompts to "obey" me well. Often, some of my instructions are ignored. I've found that using language like "You may NEVER" or the Question/Answer format works most of the time, though I still get errors.

{
  role: 'system',
  content: `
You are a bot that cleans blog post titles and generates tags and descriptions.
Titles must be grammatically correct.
Descriptions must be between 50 and 160 characters.
Here is an example:
<input>0014-adding-a-copy-button-mdx-code-snippets.mdx</input>
<output>
{
  "title": "Add a Copy Button to Your Rehype (NextJS / MDX) Code Snippets",
  "description": "In this article I'll show you how to add a convenient copy button to Rehype code snippets in your NextJS MDX blog.",
  "tags": "nextjs,mdx"
}
</output>

Please note that tags should be short abbreviations.
Example: artificial-intelligence = ai
Example: natural-language-processing = nlp
`,
}
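Since the model doesn't always follow those rules, it may be worth checking them in code before writing the file. This is only a sketch: checkMetadata is a hypothetical helper I'm making up for illustration, and the 50 to 160 character limit is simply copied from the system prompt above.

```javascript
// Hypothetical helper: returns a list of rule violations
// (an empty array means the metadata passed every check).
function checkMetadata(meta) {
  const problems = []

  if (typeof meta.title !== 'string' || meta.title.trim() === '') {
    problems.push('title is missing')
  }

  // the system prompt asks for descriptions between 50 and 160 characters
  const length =
    typeof meta.description === 'string' ? meta.description.length : 0
  if (length < 50 || length > 160) {
    problems.push(`description is ${length} characters, expected 50-160`)
  }

  if (typeof meta.tags !== 'string' || meta.tags.trim() === '') {
    problems.push('tags are missing')
  }

  return problems
}

// Example usage: this description is far too short, so one problem is reported
console.log(
  checkMetadata({ title: 'Hello', description: 'Too short.', tags: 'ai' })
)
```

If the list comes back non-empty, you could log it and fix the frontmatter by hand, or send the request again.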

The User / Assistant Prompts

These are where you give examples of expected output. This is called "few-shot" (or "multi-shot") prompting, and IMO, this is where the meat of completion goodness happens. Make sure these are solid.

;[
  { role: 'user', content: `0016-nextjs-convertkit-blog-subscriber-form.mdx` },
  {
    role: 'assistant',
    content: `{
  "title": "Newsletter Subscriptions with NextJS & ConvertKit",
  "description": "In this article, I'll teach you how to create a newsletter subscription form in NextJS, and connect to ConvertKit using NextJS api routes.",
  "tags": "nextjs,convertkit,react-hook-form,yup,api"
}`,
  },
]

I'm only using one-shot prompting here. You probably want to use 2 or 3 of these "shots".
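If you do add more shots, it can help to build the messages array from a list of examples instead of typing every user/assistant pair by hand. Here's a minimal sketch, assuming a hypothetical buildMessages helper and an { input, output } shape for each example (both are my own invention for this post). The request needs to end with the file name you actually want metadata for, so that goes in as the last user message:

```javascript
// Hypothetical helper: turns example pairs into chat messages and appends
// the file name we want metadata for as the final user turn.
function buildMessages(systemPrompt, examples, fileName) {
  const shots = examples.flatMap(({ input, output }) => [
    { role: 'user', content: input },
    // the assistant turns are JSON strings, matching the format we want back
    { role: 'assistant', content: JSON.stringify(output, null, 2) },
  ])

  return [
    { role: 'system', content: systemPrompt },
    ...shots,
    { role: 'user', content: fileName },
  ]
}

// Example usage with a single shot
const messages = buildMessages(
  'You are a bot that cleans blog post titles.',
  [
    {
      input: '0016-nextjs-convertkit-blog-subscriber-form.mdx',
      output: {
        title: 'Newsletter Subscriptions with NextJS & ConvertKit',
        tags: 'nextjs,convertkit',
      },
    },
  ],
  '0017-my-next-post.mdx'
)
console.log(messages.map((m) => m.role).join(' > '))
// prints "system > user > assistant > user"
```

The resulting array can be passed straight in as the messages option of the chat completion call.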

JSON Output

Notice that in both my system and assistant prompts, the example responses are written as JSON. This has proven effective at getting the model to reply with nothing but JSON that I can parse.
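That said, the model may still occasionally wrap its answer in extra text, and a bare JSON.parse would throw on that. Here's a small defensive sketch. The parseModelJson helper is hypothetical (my script just calls JSON.parse directly), and it assumes the reply contains at most one JSON object:

```javascript
// Hypothetical helper: pulls the outermost {...} span out of a reply and
// parses it. Returns null instead of throwing when no valid JSON is found.
function parseModelJson(reply) {
  if (typeof reply !== 'string') return null

  const start = reply.indexOf('{')
  const end = reply.lastIndexOf('}')
  if (start === -1 || end <= start) return null

  try {
    return JSON.parse(reply.slice(start, end + 1))
  } catch (e) {
    return null
  }
}

// Example usage: chatter around the JSON is ignored
console.log(parseModelJson('Sure! {"title": "Hello"}'))
// prints { title: 'Hello' }
```

A null result is a good signal to retry the request or fall back to filling in the frontmatter by hand.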

Next Steps

So far, this has been a super useful time-saver. If I were looking at where to take this next, I might extend this by chaining another trained model to revise the description for specific SEO keywords & density, or possibly ensure that titles are more directly related to posts that I've already created, from a content perspective.

In the meantime, I'm just really happy that I don't have to write blog titles or meta descriptions by hand anymore 😅.

Hope this helps someone, and happy coding! 🦄