Migrating from WordPress to Hugo, Netlify, Forestry and Themefisher Parsa Part One

Well, this has been a higher and higher priority project. I’ve got a bunch of experience with WordPress, but keeping up a server and all those plugins is a lot of work. Alex turned me onto Hugo, a static site generator. It is super fast and, most importantly, it just works out of a Git repo, so everything is checked in. It’s super for simple sites and those that need to be fast.

This is a work in progress, but I’m finally at the point where I can begin editing. The thing that is still open (that I’ll cover in part deux) is how to edit a template that is using modules. Not super obvious, but here are the basics.

TL;DR: Hugo, Netlify and Forestry or Stackbit

I spent quite a bit of time wandering in the woods, loading Hugo and going through Netlify tutorials. While that was all great for understanding how it all works, I wish I had just started with Forestry, because they have a starter template world where you just click. They have even ported the WordPress default blog layout, plus they have nice layouts such as Belkirk for universities, Parsa for pictorial blogs and Ananke for startups. FYI, I spent hours trying to get Ananke up and it is just a starter here!

It then automatically authenticates with GitHub, and you name the org and repo where you want to store your site. Once you have the repo, you reimport it into Forestry to edit.

Then it is simple to connect Netlify with it for production by going there and importing the new repo.

Why a JAMStack?

What is the big deal about this move? Well, the basic reason is that it decomposes the problem that WordPress is trying to solve, and each component is replaceable and much better. A typical website has several things integrated together, and with a JAMstack (JavaScript, APIs and Markup, with the JavaScript running in the client), you can mix and match pieces that bring together the simplicity of easy-to-use editors and the power of software development.

The traditional four tiers

The old way had multiple pieces, as illustrated by WordPress:

Client. The browser
Web Server. This is a process running on the server that sends web pages out to the client. Typically in a WordPress installation, this is an Apache server with its own configuration.
App Server. This is WordPress itself running as a process. When it gets a web request, it looks at the data layer to get and assemble content.
Database. For WordPress, the default is a MySQL database with its own storage system.
Content Management System (CMS). These are the editing tools that let a non-technical person edit images, post blogs and edit pages.
Backend Services/Plugins. Inside the application server, if there is other work to be done, then there are WordPress plugins that call out to services like Shopify or, say, a Disqus comment system.

The main issues with this architecture are fivefold:

Version Control. In the configuration above, you have parameters (for Apache, WordPress and MySQL) sprayed all over the physical servers. In the worst case, you have multiple Apache, multiple WordPress and multiple MySQL installations and, believe you me, it is hard to keep them all in sync. The solution to this is to put the entire configuration into GitHub (or its cousins), so that you can easily manage it all and know how to rebuild without twiddling a zillion configuration files.
Performance. As you can see, this is a “dynamic” assembly system. Yes, you can cache things, but fundamentally, at run-time, a WordPress system goes from Apache to WordPress, then looks up HTML templates, assembles them with content from the database and then creates web pages. It is very flexible, but it isn’t fast. The solution is to treat the “client” as more than a dumb viewer, so put as much computation as you can into the JavaScript that runs in the client. This works because today’s clients are incredibly fast. This flattens the stack dramatically.
Security. The attack surface for the traditional system is vast: there can be bugs in the server process, the WordPress server software, WordPress plugins calling external services, and the database server software. A JAMstack system eliminates all of these, because now the client directly calls the APIs. Yes, there are still bugs, but reducing surface area is really key and reduces the load on your website developers.
Scaling. This is a big deal. Since there are no intermediate servers, you just have billions of clients and then static web pages. That means that you can distribute them broadly in a content distribution network rather than having to try to optimize all those services.
Developer Experience. With a locked box like WordPress, you have to live in the PHP world, but with this “flat” architecture, you have many more choices. Specifically, you can use a Static Site Generator to take simpler-to-write Markdown files that have the content and mix them with HTML templates or layouts to get the static HTML. You can also use a headless CMS, that is, a tool that is just writer-facing and changes the Markdown.

The Full JAMstack

So given the above, here are the key pieces to pick:

Repository: GitHub. The other good choice is GitLab, but the ability to fork code makes GitHub a great choice. All the tools use OAuth or GitHub Apps to have simple pulls and pushes into the repository. It is so great that everything is stored in commits!
Static Site Generator (SSG): Hugo. This is the core tool that specifies how to convert the components and get you to static HTML, CSS, images and videos (all the assets). We ended up using Hugo for this mainly because I wanted to learn Go and it is supposed to be blazingly fast.
Hosting: Netlify. You can use something as simple as S3 or anything that serves a static website. We used Netlify because it makes it easy to work with Hugo and has a Docker environment that will actually take the GitHub repository, run the SSG and then display the website.
Headless CMS: Forestry. OK, this is the choice for editing the content that feeds the SSG. Headless is a bit of a strange term, but it means the CMS only handles the writer-facing editing and leaves the rendering to something else, so it works against any (or most) SSGs. This is the choice I’m least sure about.

JAMStack Hosting: Netlify

Netlify provides free hosting for these personal sites and that means no more fixing MySQL crashes. Like many modern systems, you just point it to your git repo, and then it starts a docker container to actually generate the static files which are wonderful.

The other great thing is that it also includes Domain registry and DNS services, so that eliminates yet another point of complexity. Personally, I will still use Namecheap because it is so great at searching for new domains, but not having a separate DNS is great.

Understanding Hugo

The core of the new stack is Hugo. It takes a set of files, and the terms are pretty unfamiliar if you are coming to it as a WordPress user (vs a template author). At first it seems super simple, but when you layer a CMS on top, it gets pretty confusing. I never actually learned templating for WordPress, so when I hit the Hugo tutorial, I was lost. If you only know HTML, CSS and JPEGs, it is going to be very confusing.

Hugo Theory of operation

The main idea in Hugo is that you take a collection of Markdown (.md) files and then, depending on the type and kind of the Markdown, Hugo searches through a huge set of layouts. A layout is HTML with Go template extensions that let you insert the Markdown file content into the HTML layout to create the final HTML.

Final.html = Content.md + sum(Layout-as-Go-Templates.html)
RelativeURL(final.html) = RelativeFilePath(content.md)
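
As a rough illustration of the second rule, here is a minimal Python sketch of how a path under content maps to a URL. It assumes Hugo’s default “pretty URL” behaviour (no ugly URLs, no permalink or slug overrides), and the function name content_path_to_url is made up for this example; it is not part of Hugo.

```python
from pathlib import PurePosixPath


def content_path_to_url(content_path: str) -> str:
    """Map a path under content/ to the relative URL it would be served at.

    Simplified model: assumes default pretty URLs and ignores permalink
    settings, slugs, and url overrides in front matter.
    """
    parts = list(PurePosixPath(content_path).parts)
    if parts and parts[0] == "content":
        parts = parts[1:]  # URLs are relative to the content directory
    stem = PurePosixPath(parts[-1]).stem
    dirs = parts[:-1]
    # _index.md (list page) and index.md (page bundle) take the URL of
    # the directory they live in; any other file adds its own name.
    if stem not in ("_index", "index"):
        dirs.append(stem)
    return "/" + "/".join(dirs) + ("/" if dirs else "")


print(content_path_to_url("content/about.md"))        # /about/
print(content_path_to_url("content/blog/_index.md"))  # /blog/
```

So the directory tree under content really is the URL tree; the only special cases in this sketch are the index files.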

But Jake Wiesler and Sara Soueidan do a good job of explaining what is going on. The documentation itself is just terrible and is tuned for developers who are deep into making custom sites (which is not me!).

Hugo taxonomy of paths and terms

The most important thing with these systems (at least for me) is some conceptual model for how it is all put together. With Hugo, this really starts with the directory layout. This is a system that makes many implicit assumptions depending on what is stored where. This is quite different from the database-centric world of WordPress, where what matters is what entries are being made in MySQL.

So here is the decoder ring on the directory layout and how it maps to various concepts in Hugo, but the core one is the content directory.

Content. The main directory for Hugo, where you pile in all those Markdown MD files that have the raw text that you use. The thing that is a little confusing here is that the directory layout in content is the exact directory structure of your website. In WordPress, the URL hierarchy is something you add later, but here it is baked in.
Content subdirectories set the URL location and the type of the Markdown. So, for example, if you have a folder content/blog, then this is what is called a section, and it typically contains a list (the term seems to vary), meaning a collection of related posts. One of the implicit things is that the subdirectory a Markdown file lives in sets its type. So every file that lives in the blog directory has its type set to blog. Similarly, say you had a subdirectory content/projects/, then every MD file there would implicitly be of type projects. This idea of type is used when combining the Markdown with the layouts.
Content/about.md. You can also have individual Markdown pages, which are called single pages, so content/about.md implies that the URL to reach that content will be https://yoursite.com/about/ by default (or https://yoursite.com/about.html with Hugo’s ugly URLs setting turned on), which makes some sense. Similarly if you want to have
Content/_index.md. This eventually becomes your home page, https://yoursite.com/index.html. Basically any _index.md file is processed first to control how the rest of the Markdown is displayed.
Every file.md has two parts. The first is the front matter, which is basically the set of variables that are available to the Go HTML templates. Some of these are implicit, so type is by default set to the name of the parent directory of the Markdown file, but others are explicit, like, say, description or date. The second part is the Markdown (or even full HTML) below the front matter. This body is also available to the templates. The templates really control the system, while the Markdown is basically data.
Collections of pages. This is the bloggy or catalog part of the system. Basically, a directory holds an _index.md and then a set of other files that the index file can list as a collection. The _index.md is considered a list page, while every other entry there is considered a single page. That is the core idea: either a page is complete in itself or it is a list of other pages.
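
To make the “two parts” idea concrete, here is a small Python sketch that splits a content file into its front matter and body. It is only a toy: it assumes YAML-style front matter between --- lines with flat key: value pairs, whereas real front matter can be nested YAML, TOML or JSON. The function name split_front_matter and the sample page are made up for this example.

```python
from typing import Dict, Tuple


def split_front_matter(text: str) -> Tuple[Dict[str, str], str]:
    """Split a Markdown file into (front matter variables, body).

    Simplified: only flat 'key: value' lines between '---' delimiters.
    """
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text  # no front matter at all
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        return {}, text  # opening delimiter was never closed
    front_matter = {}
    for line in lines[1:end]:
        if ":" in line:
            key, value = line.split(":", 1)
            front_matter[key.strip()] = value.strip().strip('"')
    body = "\n".join(lines[end + 1:]).strip()
    return front_matter, body


# A hypothetical content/projects/project1.md
page = """---
title: "My first project"
date: 2020-06-01
description: A hypothetical example page
---
This is the **Markdown** body.
"""
front_matter, body = split_front_matter(page)
print(front_matter["title"])  # My first project
print(body)                   # This is the **Markdown** body.
```

The dictionary is roughly what a layout sees as page variables, and the body is what gets rendered into the page content.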

Hugo Layouts: Go HTML Templates

The layouts are the crux of how to turn all that content into a real site.

layouts/_default/single.html and list.html. Templates come in zillions of different forms, but if you want the simplest site, create a _default directory and put in those two templates. Then everything that looks like a single page and everything that is a list goes to these two templates. So it’s nice to have them around.
The default type is the name of the Markdown’s parent directory, and templates are looked up based on type. Remember that idea of type, which by default is your parent directory? Well, in Hugo, by default, you just look for the same relative path in layouts as the item has in content. So, for instance, the Markdown file content/projects/project1.md would be turned into final HTML with layouts/projects/single.html, so basically the location in the directory hierarchy tells you where to find the template. And then content/projects/_index.md uses the template layouts/projects/list.html. And yes, it’s confusing that the Go template and the final page use the same .html suffix.
For files at the root, you need to define the type manually. So for, say, content/about.md, you actually have to put type: about in the front matter, otherwise it defaults to using _default/single.html. At the root, where you might have, say, about, contact and so forth, the pages normally have different layouts, so you fill in layouts/about/single.html with what you want content/about.md to look like.
The magic _default/baseof.html is the last trick. Most templates don’t declare an entire page. They usually use {{ define "main" }}, for instance, to fill in the main body. The thing that organizes everything is typically this baseof.html. It is the real center that has the raw HTML pieces like head, body and so forth, and the other templates just define blocks that it calls. This lets the boilerplate live there, with the other layout templates fitting into it. A good baseof.html has lots and lots of these blocks, so just about anything can be overridden.
Finally, layouts/index.html. This is another magic template, and it formats the home page. As you can see from the above, if you don’t have one, then Hugo will use layouts/_default/list.html.
So you can see the root directory is just another case of a section. Every section has an _index.md and then a set of pages. So you can stick a blog at the top without any work. Or if, like most sites, it is a collection of independent pages, then set up layouts/index.html so that the home page is custom, and then each page will need its own type and a corresponding layouts/_top page name_/single.html.
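
The lookup rules above can be sketched in a few lines of Python. This is a deliberately simplified model, not Hugo’s real lookup order (which has many more candidates for layout names, output formats, languages and so on). The function name pick_layout and the sample layout set are hypothetical.

```python
from pathlib import PurePosixPath
from typing import Optional, Set


def pick_layout(content_path: str, layouts: Set[str],
                page_type: Optional[str] = None) -> Optional[str]:
    """Return the layout a page would use, or None if nothing matches.

    Simplified model: type directory first, then _default, with
    layouts/index.html tried first for the home page.
    """
    parts = list(PurePosixPath(content_path).parts)
    if parts and parts[0] == "content":
        parts = parts[1:]
    section = parts[0] if len(parts) > 1 else None  # None for root files
    is_list = parts[-1].startswith("_index.")
    kind = "list" if is_list else "single"

    candidates = []
    if is_list and section is None:
        candidates.append("layouts/index.html")  # the home page
    # An explicit type in the front matter wins over the directory name.
    effective_type = page_type or section
    if effective_type:
        candidates.append(f"layouts/{effective_type}/{kind}.html")
    candidates.append(f"layouts/_default/{kind}.html")

    return next((c for c in candidates if c in layouts), None)


layouts = {
    "layouts/_default/single.html",
    "layouts/_default/list.html",
    "layouts/projects/single.html",
    "layouts/about/single.html",
}
print(pick_layout("content/projects/project1.md", layouts))
print(pick_layout("content/about.md", layouts, page_type="about"))
```

With this layout set, the project page picks layouts/projects/single.html, while content/projects/_index.md and the home page both fall back to layouts/_default/list.html because no more specific list template exists.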

Ok but what’s an archetype?

OK, when you get behind the strange word, this is the equivalent of a class in a traditional programming language. Within ./archetypes you will see a Markdown file for every type in the Hugo site. In other words, say that the immediate children of the root were ./content/blog/ and ./content/project/, then you have implicitly defined two types and the definitions for them would live in ./archetypes/blog.md and ./archetypes/project.md. And yes, it is a little strange to spray this stuff everywhere; I would have thought the type definition would go into the content area for simplicity.

Now, what is in these archetype files (and yes, it would be nice if they had a different suffix)? Well, what they have is the definition of what is in the front matter, so they tell you what variables live in each type, such as date or, for a project, maybe projectid. The layout template can then call these variables and insert them into HTML.

Finally what’s all this about ./static

OK, the last concept is that there are the images and other junk you need for a site. These go into ./static and they end up living in URL-land at the root. So, for instance, if you create ./static/images, then you can get all the images under https://yoursite.com/images/.

The main use for this is to put all the style sheets, CSS, Javascript libraries and fonts that you need there and you can use them from the templates.

The other use is if you have stuff that shouldn’t be under Hugo control. A good example is Netlify CMS. It has its own system and JavaScript, so, for instance, creating ./static/admin/index.html means that when you browse to https://yoursite.com/admin/ it will just run that index.html. Pretty handy.

Compiling a Hugo site: what actually happens

OK, if you are debugging, run brew install hugo and then run hugo server from the root of the Hugo repo. That performs all the magic above and serves the static pages of the site at http://localhost:1313. If you use your own hoster, you can instead run the plain hugo command, which writes all the files to ./public. You can then just copy those files to S3 or wherever.

The cool kids, like Netlify, can just take your repo and run the hugo build for you in a Docker container. That’s pretty neat.

One note is that for simple sites, you just dump your images and CSS into ./static, but big sites will have what is called an “asset pipeline”, which compiles the CSS from CSS template languages (I know, everything is a template). There are fancy makefile-like things, gulp for instance, that run these jobs. When you are doing this locally, you run gulp and work on parts of the repo outside of Hugo’s control, that is, directories *not* named content, layouts, static or public.

When you are using something like Netlify, you can have it run those for you as well. So in complex systems you will see Hugo mixed in with CSS generators and so forth.

Hot reloading.

One of the hot things to do now for speed is that every time you make a local edit to a file in a Hugo repo, the Hugo server and the Gulp server processes can watch for file changes and reload the whole site. Makes debugging way, way faster. By the same token, every time you do a push to your Git repo, Netlify and others can run their jobs and you get something posted up on the Internet nearly immediately. Pretty neat!

Headless CMS: First Netlify CMS, then Forestry or maybe Stackbit

I then spent a horrible weekend trying to get the Netlify CMS running on that site. This should have been super simple, but turned out to be a disaster of trying to get it to work with existing templates. I ended up switching to Forestry.io, which was my original instinct. But after a weekend of really understanding Hugo internals, I’m glad I did that before going to Forestry. This thing is web-based and pushes all changes back to Git.

So GitHub becomes the central repository, and you can use Forestry to make commits, and Netlify then picks those up and updates your site. Pretty sweet. The main problem with Forestry is that it is unclear how to edit all the pages, and the so-called templates and partials are hidden.

Another alternative is Stackbit, which is like a plug and play WordPress. You literally select which SSG, which Git repo and which CMS, and then it plugs it all together. Pretty convenient. I’m trying Forestry on Stackbit now, and it authenticates against everything and provides a unified UI.

Unsuccessful Netlify CMS work

OK, I spent an entire weekend trying to get Netlify CMS working. After all, if Netlify is so great and I’m using it, why shouldn’t I use their CMS? Surely this will be well integrated.

Well, it turns out there are a few issues:

If you use one of the Netlify CMS templates, it loads a huge amount of stuff, including an asset pipeline and so forth. Since I had not yet stumbled on the explanation of directories above, I literally couldn’t understand what it was doing. What is worse is that “Hugo Getting Started” doesn’t explain anything about asset pipelines; it only gets you to showing a single type with a blog and doesn’t have any help for going farther.
So next I tried the “add Netlify CMS to your site” route. Again, the problem was that I didn’t really understand what was going on. The CMS is not just for Hugo but for other generators like Jekyll, so it uses different terminology for the same things, and there was no decoder ring for that. As an example, what is a type in Hugo is called a collection in Netlify CMS.
The final problem is that not all Hugo templates are compatible with Netlify CMS. For instance, unluckily, the one I picked didn’t display the Netlify CMS correctly on the home page. It did work when I added ./static/admin/index.html, but my poor attempts to hack in JavaScript for the home page were crippled because I didn’t understand the organization. In adding the site manually, I could never get Netlify Identity to actually work. Even when I entered the passwords, the callbacks didn’t work properly.
The main thing, I think, is that I should have just used their templates. The problem was that when I did that, they included an asset pipeline using Gulp as well, so I got confused (since I didn’t understand the basics above about Hugo needing a Gulp pipeline for CSS: Hugo basically manages the HTML, but not the style data kept in CSS, so you need both).
I may get back to Netlify CMS at some point, but it would be nice to avoid too much CSS hacking for now 🙂

Using Forestry from a template

Then we have the adventures with Forestry. This went way better because they have a system where you get templates already. And their templates were easier to understand. Here were the confusing parts:

When you get the default Forestry, it is really tuned for changing the Markdown content and not for editing the layouts. WordPress is the same way; it is really for marketing folks and authors. So the default is that you can see the Markdown files, but editing the layouts is done manually, that is, you edit the repo directly.
However, you can edit the Sidebar section and add these directly. In fact, you can even edit the configuration files by setting up the Datafiles section, for example, to edit all TOML or YAML files. As an aside, a lot of folks seem to like TOML files; personally, I try to use YAML for everything, since they are basically equivalent and there is one less format to think about. This is also available graphically in Settings ⇒ Sidebar ⇒ Add Section ⇒ Document, where you enter the path, like config.yaml for instance, and you can do the same for an entire directory. You then get a pretty version of the YAML file that doesn’t require Git access.
But Forestry is not for editing the Go Templates. For that, you need to go to direct editing.

The rapid development of Theme ecosystems and using Modules

I never had to get into themes much with WordPress; they just kind of worked and they had their own update mechanism. This of course creates the nightmare of customization. There is no concept of upstream or taking fixes from the original author. So I mainly just used them out of the box.

With the use of GitHub, though, the ideal thing is to use the Git concepts. So the way that Hugo teaches it is that there is a directory ./themes which basically replicates ./layouts as ./themes/<name of your theme>/layouts, for example, and by changing ./config.toml you can change themes, which is neat.

The problem is that themes are constructed really differently. For instance, Themefisher templates have all kinds of links back to the source repo. That means they use config.yaml to do module imports, so you don’t control the layouts, assets, etc.; they are linked from their GitHub repo. This is a super new feature and lets you have a package approach to building your website. You can then use hugo mod vendor to take all these packages and put them into a _vendor folder so they can be under Git control.

The way that you make changes then is to run the hugo mod vendor command, and then you can create your own theme that overrides the pieces that you need. That works because you can have multiple themes and overlay them. This gives you a kind of simple inheritance, so that from left to right, you overlay:

# in config.yaml
theme:
  - my-overrides
  - base-theme

Which is pretty convenient. In the old way, with submodules, you could get one layer of overlap, but if you are assembling lots of modules, how do you make that work? This is a lot like npm or Yarn in the Node world.

Using ThemeFisher, Modules and Union File System

The theme I picked is Parsa from Themefisher. They have quite an advanced setup, and there are three ways to use it:

Fork their repo. This is the simplest way, and then you add it to Forestry. This is not treated like a real theme, so in some ways it’s easier to understand. There are really three ways that you can use a theme: a direct fork of their repo, a fork of the repo as a submodule in the themes directory, and then the most advanced way, which is Go modules.
Fork their repo into themes. This is the second way. This is nice in that it lets you override selected things, but you do have to add the theme to your base config.yaml.
Go modules (now the preferred method). OK, this is the most advanced way, and when you use the Forestry starter site this is what you get. What happens here is that you get a site with a config.yaml which points to their repo at a specific hash. You have to use hugo mod vendor to cache everything they use. The advantage of this approach is that you can do deep assembly of all the JavaScript and so forth.

Net, net, it sure does seem that themes are evolving very quickly in the Hugo world. The first simple fork is nice, but you can’t layer multiple elements. The second, with the ./themes directory, lets you layer things, as you can call multiple themes. The final one, with Go modules, is the most advanced and feels a bit like a package manager.

The confusing thing about this model is how to do overrides. For instance, in the Forestry starter version, everything is in config.yaml in the module section, and you can actually mount parts of the theme into the file system space of the Hugo repo with syntax like:

module:
  imports:
    - path: github.com/themefisher/parsa-hugo
      mounts:
        - source: layouts
          target: layouts
        - source: exampleSite/static/images
          target: static/images

And there is much automatic mounting, but then how do you add new images? That was a bit of a mystery, but when I chose image upload, what happened was I got my own real ./images folder and the images in exampleSite were somehow automagically copied up there.

The next mystery is how you can overlay layouts and override the theme. According to Craftsman Digital, the trick is that the default for mounts is that the module just goes into the ./themes folder as a virtual mount.

Hugo Modules are pretty complicated to understand, but they come with defaults, as New Dynamic explains. The main idea is that with those mounts, you don’t have to have the files in the repo. So if you want to get all the Twitter Bootstrap icons, you can create a virtual mount in the “Hugo” namespace:

module:
  imports:
    - path: github.com/twbs/bootstrap
      mounts:
        - source: icons
          target: assets/icons

Now we can access it in the Go templates in ./layouts, so you can now insert the Twitter cart icon into any template:

{{ with resources.Get "icons/cart.svg" }}
  <div class="fill-current w-4">
    {{ .Content | safeHTML }}
  </div>
{{ end }}

Now, Hugo has a union file system, so you can use mounts to override one icon set with another, just by putting this above the previous mounts:

- path: github.com/refactoringui/heroicons
  mounts:
  - source: src/solid/shopping-cart.svg
    target: assets/icons/cart.svg
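
As a mental model for that shadowing, here is a toy Python sketch of a union file system lookup. It assumes that mounts listed earlier win over later ones, matching the “put this above the previous mounts” rule; the resolve function and the mount contents are made up for this example and are not Hugo’s implementation.

```python
from typing import Dict, List, Tuple

# Each mount is (module name, {virtual path: file contents}).
Mount = Tuple[str, Dict[str, str]]


def resolve(virtual_path: str, mounts: List[Mount]) -> Tuple[str, str]:
    """Return (module, contents) for the first mount providing the path.

    Assumption: earlier mounts shadow later ones.
    """
    for module, files in mounts:
        if virtual_path in files:
            return module, files[virtual_path]
    raise FileNotFoundError(virtual_path)


mounts = [
    ("heroicons", {"assets/icons/cart.svg": "<svg>heroicons cart</svg>"}),
    ("bootstrap", {
        "assets/icons/cart.svg": "<svg>bootstrap cart</svg>",
        "assets/icons/bag.svg": "<svg>bootstrap bag</svg>",
    }),
]
print(resolve("assets/icons/cart.svg", mounts)[0])  # heroicons
print(resolve("assets/icons/bag.svg", mounts)[0])   # bootstrap
```

The cart icon comes from the first mount, while everything the first mount doesn’t provide still falls through to the second.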

This, however, doesn’t work super well when you want to override a template without creating more folders. For instance, by default all images live in static/images, and Forestry only knows about this and the binding to a single URL. So if you have images from both the ThemeFisher repo and your new repo, it doesn’t work. The module “shadows” the actual files there.

The only solution appears to be that you have to copy the images to ./static/images.

Also, if you want to mount different directories onto the same target, you get the merger of two real sets of directories. This replaces the staticDir directive in older versions of Hugo. For example, this takes the directories ./from-wordpress and ./from-cloudinary and puts the union of all their contents into static/image:

module:
  mounts:
    - source: from-wordpress
      target: static/image
    - source: from-cloudinary
      target: static/image

Tong Family