Updating a Large WordPress Site to Gatsby Frontend, Part I: Planning

August 06, 2021

The undertaking: update a large custom-theme WordPress marketing site to a decoupled headless CMS paired with the static frontend framework Gatsby. Gatsby is a static site generator (SSG) powered by Node, React, and a vast open-source plugin ecosystem.

This series of blog posts outlines the general process I used to accomplish this task. Along the way, I’ll mention tools such as rsync and ssh. Rather than include extensive step-by-step instructions for each, I will link to the manuals or tutorials that helped me along my journey.

This will be a full-stack dive into all facets of this project from concept to completion.

Let’s get to it!

First: Assess The Situation

The first step for any programming task should be to assess the situation. Below are some specs on the original website I was working with:

  • The database is a decade old.
    • 50+ standard posts.
    • 700+ pages.
    • 10 custom post types with 2000+ posts!
  • The custom theme was in its 5th recorded version, with loads of interwoven functionality.
  • The theme relied on a custom plugin for some frontend functionality and forms.
  • Plugins actively used included:
    • Advanced Custom Fields (ACF)
    • All in One SEO
    • Contact Form 7
    • Easy WP SMTP
    • Google XML Sitemaps
    • Wordfence Security
  • A blog located within a subfolder /blog/, which is a separate WordPress installation.
  • Two additional unused/outdated WordPress installations in other subfolders.
  • Tons of legacy files outside the standard wp-* folders, including thousands of images, old HTML, and miscellaneous clutter.

Phew! This was going to be a lot to organize while simultaneously modernizing the frontend and optimizing the backend.

The original website was actively updated and regularly maintained by multiple people, which meant I didn’t have the luxury of cloning it as-is and working on it uninterrupted for 2-3 months. Rather, I had to ensure the new site also picked up any changes and new pages the team published in the meantime.

Next: Clean

I. Clean Unused Themes and Plugins

The next step is to determine what dead weight can be removed immediately. The first to go are unused themes and plugins. The less fluff, the better; this is a best practice regardless of any migration.

II. Remove Unused Images

There were thousands of images in subfolders which were linked directly from within pages and posts. This presented a challenge: it was far too time-consuming to manually check whether each image was still referenced in the database and needed to be kept. We’ll come back to this challenge later, as Gatsby provided an interesting way to help here right out of the box.
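In the meantime, a rough first pass at the manual check can be scripted. The sketch below is not the approach used on this project; it is a minimal illustration that assumes the post bodies have already been exported as strings and the legacy files listed as relative paths. The names IMAGE_REF, referenced_images, and unreferenced_files, along with the sample data, are hypothetical. It matches on file name only, so two files with the same name in different folders are both treated as used, which errs on the side of keeping files.

```python
import re
from pathlib import PurePosixPath

# Hypothetical pattern: image paths inside src="..." or href="..." attributes,
# with an optional query string after the file extension.
IMAGE_REF = re.compile(
    r'(?:src|href)=["\']([^"\']+\.(?:jpe?g|png|gif|webp|svg))(?:\?[^"\']*)?["\']',
    re.IGNORECASE,
)

def referenced_images(post_contents):
    """Return the lowercased file names of images referenced in any post body."""
    names = set()
    for html in post_contents:
        for url in IMAGE_REF.findall(html):
            names.add(PurePosixPath(url).name.lower())
    return names

def unreferenced_files(files_on_disk, post_contents):
    """Return the files whose names never appear in the given post bodies."""
    used = referenced_images(post_contents)
    return sorted(
        f for f in files_on_disk if PurePosixPath(f).name.lower() not in used
    )

# Illustrative data only: two post bodies and three legacy files.
posts = [
    '<p><img src="/legacy/img/team.jpg" alt=""></p>',
    '<a href="https://example.com/promo/banner.PNG?v=2">Banner</a>',
]
files = ["legacy/img/team.jpg", "promo/banner.png", "old/unused-logo.gif"]

print(unreferenced_files(files, posts))  # ['old/unused-logo.gif']
```

Anything the script reports is only a candidate for removal. Images can also be referenced from theme templates, CSS, or custom fields, which this sketch does not inspect.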

III. Remove Unnecessary Custom Fields

Over time, many marketing efforts were launched but never gained traction. Some of them required specific landing pages, built with custom fields from the Advanced Custom Fields plugin or with custom meta. Some of these pages were eventually removed or redirected, but their custom field data remained. This doesn’t cause much database bloat, but it makes it harder to tell later which fields are still needed. It’s best to clean up unused data sooner rather than later. In this case, some sleuthing was required to determine what could be purged.

One method I like to use involves an SQL query to determine which published posts or pages might still use custom field data.


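-- List each published post or page that still has a matching custom field.
-- ACF also stores a reference row prefixed with "_" (e.g. _some_custom_field); this LIKE pattern matches it too.
-- Explicit columns (instead of SELECT *) keep the GROUP BY valid when ONLY_FULL_GROUP_BY is enabled.
SELECT wp_posts.ID AS post_id, wp_posts.post_type, wp_posts.post_title,
       GROUP_CONCAT(DISTINCT wp_postmeta.meta_key) AS meta_keys
FROM wp_postmeta
  INNER JOIN wp_posts ON wp_postmeta.post_id = wp_posts.ID
WHERE wp_posts.post_status = 'publish'
  AND wp_postmeta.meta_key LIKE '%some_custom_field%'
GROUP BY wp_posts.ID, wp_posts.post_type, wp_posts.post_title;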

If this query returns zero rows, it’s safe to remove the custom field definition from Advanced Custom Fields and drop support for it in the new theme.

SQL query returns no results.

Now, if I do get some results returned, I need to dig a bit deeper and see if the data is important enough to warrant including in the new site. To do this, I’d visit a handful of the posts by post_id in wp-admin, and see where the data is presented:

SQL query returns posts that have a meta_key populated (meta_key corresponds to the field in ACF)

If it’s not something that needs to live in the database anymore, it gets removed.

Leaving this unused data in the database has no significant performance impact. However, it can become a source of confusion or uncertainty when troubleshooting. If it’s not used, lose it.
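When there are many candidate fields, the same published-posts check can be looped over a list of field names. The sketch below is an illustration under stated assumptions, not the process used on this project. It uses Python’s built-in sqlite3 module with a tiny in-memory copy of the two tables as a stand-in for the real MySQL database, and the field names, sample rows, and the published_usage helper are all hypothetical. Against a live site you would swap in a MySQL connection, whose drivers typically use %s rather than ? for placeholders.

```python
import sqlite3

# Same idea as the query above: count published posts that carry a matching meta_key.
QUERY = """
SELECT COUNT(DISTINCT p.ID)
FROM wp_postmeta AS m
  INNER JOIN wp_posts AS p ON m.post_id = p.ID
WHERE p.post_status = 'publish'
  AND m.meta_key LIKE ?
"""

def published_usage(conn, field_names):
    """Map each field name to the number of published posts that still carry it."""
    usage = {}
    for name in field_names:
        (count,) = conn.execute(QUERY, (f"%{name}%",)).fetchone()
        usage[name] = count
    return usage

# In-memory stand-in for wp_posts and wp_postmeta (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE wp_posts (ID INTEGER PRIMARY KEY, post_status TEXT);
CREATE TABLE wp_postmeta (
  meta_id INTEGER PRIMARY KEY, post_id INTEGER, meta_key TEXT, meta_value TEXT
);
INSERT INTO wp_posts VALUES (1, 'publish'), (2, 'draft'), (3, 'publish');
INSERT INTO wp_postmeta (post_id, meta_key, meta_value) VALUES
  (1, 'promo_banner', 'summer.jpg'),
  (1, '_promo_banner', 'field_abc123'),
  (2, 'old_campaign_cta', 'Sign up'),
  (3, 'promo_banner', 'fall.jpg');
""")

usage = published_usage(conn, ["promo_banner", "old_campaign_cta"])
safe_to_remove = [name for name, count in usage.items() if count == 0]

print(usage)           # {'promo_banner': 2, 'old_campaign_cta': 0}
print(safe_to_remove)  # ['old_campaign_cta']
```

Fields with a count of zero are candidates for removal. Fields with a non-zero count still need the manual look in wp-admin described above.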

Then: Consider the Destination

I will be using some new plugins, and removing others in the new installation. For example, I can safely ditch Contact Form 7 and Google XML Sitemaps plugins, because that functionality will be handled differently with Gatsby.

We will also need to add new plugins to the destination installation to support Gatsby and GraphQL.

For more information on how Gatsby, GraphQL, and WordPress integrate, check out the Sourcing with WordPress article by Gatsby.

We’ll also be migrating from All in One SEO to Yoast, because Yoast has a nice GraphQL support plugin available: WPGraphQL Yoast SEO Addon. (At the time of this project, All in One SEO did not have GraphQL support available.)

Our new backend will also need to be loosely based on our existing theme setup, mainly because of the numerous custom post types already in use.

In the next article, I review the process used to set up the staging environment, and eventually get into the sync/pull bash scripts needed to fetch the latest content or changes from production.

Written by Michael. His career path has allowed him to combine his creative eye with a love of programming, analytical thinking, and learning. Michael has been married to his lovely wife Yohana since 2012. They have four wonderful children, two St. Bernard dogs, and a chinchilla. Follow @missionmikedev on Twitter.