Building an SSG with Bash and friends

Posted on 2026-02-02 | Tags: bash

When building a static site generator, developers have a ton of programming languages, libraries, template engines, and even content authoring languages to choose from. Many developers would pick something like Ruby or Python and call it a day. Others would crave the speed of C, Go, Rust, or some other language known for producing fast (preferably statically-linked) binaries.

But few people these days reach for Bash, Awk, Sed, and the rest of the GNU Core Utilities. Sure, the syntax is a bit archaic and thorny compared to some of the newer scripting languages out there, but what a lot of people don’t realize is that Bash and Awk are actually Turing-complete. That basically means that both languages (yes, Bash is a programming language) can theoretically be used to perform any calculation, whether mathematical or logical. Besides that, Bash and friends are just plain cool.

But before I wrote the code, I had to come up with a plan. Basically, every SSG has three key components:

  1. A template engine
  2. A content authoring language
  3. A data serialization language (JSON, YAML, etc) for working with page metadata

My authoring language of choice is Markdown with embedded HTML, so that wasn’t a problem. There are plenty of tools available for converting Markdown to several formats. Templates and metadata were the hard parts. A bit of searching turned up mo, but because Bash can’t do nested data structures, it was of limited use.

For parsing the metadata I wanted to have at the top of every page, I could have used Python, Ruby, or Perl, but that would have been cheating. I found yq and jq, standalone tools that could handle the task, but they were slow in my testing, and I had a lot of trouble getting them to work with mo.

So I decided to use shell code for templates and metadata. That meant every page would have a small Bash script at the top for defining page-specific variables, and all of my templates would be shell scripts. Yes, it’s kind of dangerous to do that, but it was the only way I could get it to work the way I wanted.
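For illustration, here’s what a hypothetical page source might look like. The exact keys are inferred from the build code later in this post, and note that they carry no page_ prefix in the file itself; the extraction step adds that prefix.

```shell
# A made-up page file, e.g. src/posts/hello.md. Everything above the
# '+++' separator is plain shell variable assignments; everything
# below it is Markdown with embedded HTML.
title="Hello, world"
date="2026-01-23"
description="A made-up example page."
tags=("bash" "awk")
pageType="post"
menu="false"
+++
Some **Markdown** with embedded <em>HTML</em>.
```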

The configuration file

The first thing the program does is source config.sh to set a bunch of important global variables. Everything declared in config.sh becomes available to the page templates.

# file: config.sh
# Markdown processor
md_processor="cmark"
md_render_cmd="cmark --unsafe --smart"
# Site variables
site_name="Hyphen's place"
site_baseUrl=""
site_testUrl="http://localhost:8000"
site_author="HyphenHawk"
site_dateFormat="%a, %b %d %Y"
site_description="Boring blog about Linux, programming, art, and other fun stuff."

Rounding up the pages

After loading the configuration, it finds all of the Markdown files in src/ with the find command. The search is recursive, which means it searches every subdirectory.

# file: blog.sh
# Use 'find' to get a list of page files
find_pages() {
  # Sort them newest to oldest.
  local result=$(find src/ -name "*.md" | sort -r)

  echo -e "$result"
}

The list (multi-line string) it generates is then run through sort to, well, sort it. Normally, the list would be sorted alphabetically, but something interesting happens when you start filenames with a date in YYYY-MM-DD format.

$ find drafts/ -name "*.md" | sort
drafts/2026-01-22-my-new-keyboard-config.md
drafts/2026-01-23-building-a-ssg-with-bash-and-friends.md

Add the -r option and you get pages sorted newest to oldest.

$ find drafts/ -name "*.md" | sort -r
drafts/2026-01-23-building-a-ssg-with-bash-and-friends.md
drafts/2026-01-22-my-new-keyboard-config.md

That made sorting blog posts easier, but made other things more difficult.

  1. Whenever the date of a page is changed in the metadata, it is not changed in the filename.
  2. Problem #1, but reversed.
  3. Whenever the filename of a page is changed, all links pointing to it must be updated.

Altogether, I decided the downsides of embedding dates in filenames outweighed the convenience of easy sorting. So I chose to do it the hard way. I looped over the list of pages to extract all of the dates. Then I sorted the dates, removed the duplicates, and converted the date list into an array for easier looping. Finally, I used a double loop to generate the final sorted list.

# file: blog.sh
# Use 'find' to get a list of page files
find_pages() {
  # Find all of the pages and sort them newest to oldest.
  local page_list=$(find src/ -name "*.md" | sort -r)

  # The initial date list is a multi-line string.
  local date_list=""

  # Convert the page list into an array for easy looping.
  IFS=$'\n'
  local page_array=( $page_list )
  unset IFS

  # Loop over the page array and extract the dates.
  for p in "${page_array[@]}"; do
    #echo "$p"

    # Extract the metadata.
    local page_meta=$(grab_page_meta "$p")

    # Generate the variable-unsetting command.
    local unset_keys=$(echo -e "$page_meta" | awk '
      BEGIN {
        FS = "="
      }
      {
        print "unset " $1
      }'
    )
    eval "$page_meta"

    # Add date to date list.
    date_list="$date_list$page_date\n"

    # Unset the variables.
    eval "$unset_keys"
  done

 
  # Sort list of page dates and remove duplicates. Also removes the blank line.
  date_list=$(echo -e "$date_list" | sort | uniq | awk '$0 != ""' | sort -r)
  #echo "$date_list"

  # Convert list into an array.
  IFS=$'\n'
  local date_array=( $date_list )
  unset IFS


  # Build a list of pages (multi-line string) by looping over the dates and
  # the pages. Whenever a matching date is found, the page is added to the
  # list.
  local sorted_list=""
  for d in "${date_array[@]}"; do
    for p in "${page_array[@]}"; do
      local page_meta=$(grab_page_meta "$p")
      local unset_keys=$(echo -e "$page_meta" | awk '
        BEGIN {
          FS = "="
        }
        {
          print "unset " $1
        }'
      )
      eval "$page_meta"

      if [[ "$d" == "$page_date" ]]; then
        sorted_list="$sorted_list\n$p"
      fi
      eval "$unset_keys"
    done
  done
  
  # Return the list. Once again, Awk is used to filter out the blank line.
  echo -e "$sorted_list" | awk '$0 != ""'
}

The code above shows off some of the shortcomings and weirdness of Bash and friends. Most of the standard commands for sorting and filtering expect their input as lines of text, often organized into discrete columns. While it’s possible to iterate over lines of text in Bash with a while loop, I like the for x in "${array[@]}" syntax better.
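Here are both approaches side by side, using a throwaway list made up for the demo:

```shell
#!/usr/bin/env bash
# Iterating a multi-line string: a while-read loop versus splitting
# the string into an array first.
things_list=$'alpha\nbeta\ngamma'

# Option 1: read the string line by line.
while IFS= read -r line; do
  echo "while saw: $line"
done <<< "$things_list"

# Option 2: split on newlines into an array, then loop.
IFS=$'\n'
things_array=( $things_list )
unset IFS

for x in "${things_array[@]}"; do
  echo "for saw: $x"
done
```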

The most important part of this code is the opening and closing of page variables. This is the most repeated chunk of code in the whole program, and it can’t really be stuffed into a function. Well, not without making large chunks of the program harder to read. It basically has 5 steps.

  1. Get the line of shell code that generates the variables.
  2. Create a line of shell code to unset each of the variables.
  3. Execute the variable setting code.
  4. Do whatever we need to do.
  5. Execute the variable unsetting code.

# Grab the variables
local page_meta=$(grab_page_meta "$p")

# Create a chunk of code to unset all the variables.
local unset_keys=$(echo -e "$page_meta" | awk '
  BEGIN {
    FS = "="
  }
  {
    print "unset " $1
  }'
)

# "opening" the variables.
eval "$page_meta"

# Do something here

# "closing" the variables.
eval "$unset_keys"

It has to be done like this because the template functions depend on stuff being in the global scope, and different kinds of pages won’t have certain variables. For example, blog posts may or may not have tags. Unsetting all of the variables prevents data in one page from leaking into other pages.
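A tiny demonstration of the leak the unset step prevents. The page names and tags here are made up.

```shell
#!/usr/bin/env bash
# Page one defines tags; page two does not. Without the unset step,
# page two would still see page one's tags.
page_tags=("bash" "awk")   # set while "page one" is open

unset page_tags            # the "closing" step

# "Page two" defines no tags, so this check now behaves correctly.
if [[ -n "${page_tags[*]:-}" ]]; then
  echo "has tags"
else
  echo "no tags"
fi
# → no tags
```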

Also take note of the syntax for converting a multi-line string into an array:

IFS=$'\n'
local things_array=( $things_list )
unset IFS

I’ve seen a lot of scripts where people write something like old_ifs="$IFS"; <some code here>; IFS="$old_ifs", but there’s no need. Whenever IFS is unset, the shell falls back to its default word-splitting behavior (space, tab, and newline).

Extracting page bodies and metadata

Now let’s look at my solution for extracting metadata.

# file: blog.sh
grab_page_meta() {
  local file="$1"

  local result=$(cat "$file" | awk '
  {
    if ($0 != "+++") {
      # We also want to apply a prefix.
      print "page_" $0
    } else {
      exit
    }
  }')

  echo -e "$result"
}

The function above uses Awk to print every line up to but not including a special separator. It also prefixes each line with page_. The result is a multi-line string of shell code that can be tossed to eval. I used a similar function to extract the body of each page.
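Here’s a quick round-trip sketch: grab_page_meta as defined above, fed a made-up page written to a temp file.

```shell
#!/usr/bin/env bash
# Extract the metadata header of a page, prefix each line with
# 'page_', and eval the result into variables.
grab_page_meta() {
  local file="$1"

  # Print every line before the '+++' separator, prefixed with 'page_'.
  local result=$(cat "$file" | awk '
  {
    if ($0 != "+++") {
      print "page_" $0
    } else {
      exit
    }
  }')

  echo -e "$result"
}

# A made-up page for the demo.
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
title="Hello, world"
date="2026-01-01"
+++
The Markdown body goes here.
EOF

meta=$(grab_page_meta "$tmpfile")
eval "$meta"
echo "$page_title"   # prints: Hello, world
echo "$page_date"    # prints: 2026-01-01
rm "$tmpfile"
```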

# file: blog.sh
grab_page_body() {
  local file="$1"

  local result=$(cat "$file" | awk '
  BEGIN {
    print_it = 0
  }
  {
    if ($0 == "+++") {
      print_it = 1
      next
    }

    if (print_it == 1) {
      print $0
    }
  }')

  echo -e "$result"
}

It prints everything after the separator, providing a nice multi-line string that can be tossed to a Markdown rendering program.

Creating URLs and finding page assets

One of the standout features of my SSG is that it automatically finds and copies page asset directories. Asset directories are named like some-page-assets, and the file some-page.md has to exist for the program to copy the directory. No point in copying asset directories for pages that don’t exist!

# file: blog.sh
get_asset_directory() {
  local src_full_path="$1"
  local file_directory=$(dirname "$src_full_path")
  local filename_no_ext=$(basename "$src_full_path" '.md')
  # This is a bit cheeky because '-' is not allowed in Bash
  # identifiers.
  echo "$file_directory/$filename_no_ext-assets"
}

Nothing special here. Just a bit of basename and dirname magic. Getting page URLs is also pretty easy.
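The two commands in isolation, with a made-up path:

```shell
# dirname strips the last path component; basename keeps only that
# component, and an optional second argument strips a suffix.
path="src/posts/2026-01-23-hello.md"

dirname "$path"          # → src/posts
basename "$path"         # → 2026-01-23-hello.md
basename "$path" '.md'   # → 2026-01-23-hello
```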

# file: blog.sh
local page_output_name=$(get_output_file_name "$line")
local page_url=$(echo "$page_output_name" | sed 's/dest//')

And here’s the function for getting the output path.

# file: blog.sh
get_output_file_name() {
  # Get the full path to source file.
  local src_full_path="$1"
  local output_directory=$(get_output_directory "$src_full_path")

  # Get the old name.
  local old_name=$(basename "$src_full_path")

  # Generate the new name
  local new_name="$(basename "$src_full_path" '.md').html"
  
  # Return the completed path
  echo "$output_directory/$new_name"
}

And finally, here’s get_output_directory().

# file: blog.sh
get_output_directory() {
  local src=$(dirname "$1")

  # Use a bit of Awk magic to replace the FIRST occurrence of
  # 'src/' with 'dest/'.
  local result=$(echo "$src" | awk '
    !x{
      x=sub("src","dest")
    }
    {print $0}'
  )

  echo -e "$result"
}

It would have probably made more sense to use Sed instead of Awk here, but I wanted to see if Awk could do it.
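For the record, the Sed version would be a one-liner; without the g flag, the s command replaces only the first match on each line. The sample path is made up.

```shell
# First-match-only replacement: no 'g' flag on the 's' command.
echo "src/posts/src-notes" | sed 's|src|dest|'
# → dest/posts/src-notes
```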

Building pages

The build_page() function basically renders content with the Markdown rendering command specified in config.sh. Then it makes creative use of the command substitution syntax to sandwich the newly rendered chunk of HTML between the head and tail templates.

# file: blog.sh
build_page() {
  local raw="$1"
  local content=$(echo "$raw" | $md_render_cmd)
  local output=$(
    template_head;
    echo "$content";
    template_tail;
  )
  echo "$output"
}
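The sandwich in isolation: $( ... ) captures the combined stdout of every command inside it, so the templates only need to print HTML. The demo functions below are made up, not the real templates.

```shell
#!/usr/bin/env bash
# Command substitution concatenates the output of several commands
# into one string.
demo_head() { echo "<main>"; }
demo_tail() { echo "</main>"; }

output=$(
  demo_head;
  echo "<p>content</p>";
  demo_tail;
)
echo "$output"
# → <main>
#   <p>content</p>
#   </main>
```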

The main build function

Now let’s jump to the main build function and go over it section by section.

This first part wipes the dest directory if it exists, then sets up some important variables: the page list, plus arrays for blog posts and tags. The page list is then turned into an array. The site_menu variable is a special multi-line string that holds the site navigation links.

  # Wipe the destination directory.
  if [[ -d dest/ ]]; then
    rm -rf dest/
  fi

  # Get a list of all the pages.
  # We need to convert it to an array.
  local page_list=$(find_pages)
  echo -e "$page_list"

  IFS=$'\n'
  local page_array=( $(find_pages) )
  unset IFS

  # We have to loop over the pages multiple times. The menu must be built
  # BEFORE the rest of the pages. We also need to grab the tags to build the
  # main tags page.
  . templates/site_menu_entry.sh
  local site_menu=""
  local site_raw_tags=()
  local site_blog_posts=()

After that, the function makes its first pass through the page array to build up the tag and blog post arrays, and the list of navigation links.

# file: blog.sh

  for line in "${page_array[@]}"; do
    local page_output_name=$(get_output_file_name "$line")
    local page_url=$(echo "$page_output_name" | sed 's/dest//')
    local page_meta=$(grab_page_meta "$line")
    
    local unset_keys=$(echo -e "$page_meta" | awk '
      BEGIN {
        FS = "="
      }
      {
        print "unset " $1
      }'
    )
    eval "$page_meta"

    # We have to filter out only the pages we want in the menu.
    if [[ "$page_menu" == "true" ]]; then
      local entry=$(template_menu_entry "$page_url" "$page_title")

      site_menu="$site_menu$entry"
    fi

    # Grab the tags while we're here.
    if [[ -n "$page_tags" ]]; then
      site_raw_tags+=(${page_tags[@]})
    fi

    # Also grab blog posts
    if [[ "$page_pageType" == "post" ]]; then
      site_blog_posts+=("$line")
    fi

    eval "$unset_keys"
  done

Then the function adds a final navigation link for the Tags page and removes duplicates from the tag array.

# file: blog.sh
  
  # Add an entry for the tags page.
  local tags_entry=$(template_menu_entry "/tags.html" "Tags")
  site_menu="$site_menu$tags_entry"

  # Remove duplicates from site tags.
  local site_tags=()
  while IFS= read -r -d '' x; do
    site_tags+=("$x")
  done < <(printf "%s\0" "${site_raw_tags[@]}" | sort -uz)
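That dedupe idiom is dense, so here it is in isolation with a throwaway tag array: NUL-separate the items, let sort -uz drop the duplicates, then rebuild the array with read -d ''.

```shell
#!/usr/bin/env bash
# Deduplicate an array safely, even if items could contain spaces
# or newlines.
raw_tags=("bash" "awk" "bash" "sed")

tags=()
while IFS= read -r -d '' t; do
  tags+=("$t")
done < <(printf "%s\0" "${raw_tags[@]}" | sort -uz)

printf "%s\n" "${tags[@]}"
# → awk
#   bash
#   sed
```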

This leads to another pass through the page array to actually build all the pages. Page asset directories are also detected and copied. The function ends by copying site-wide assets to dest/.

# file: blog.sh
  # Loop over them a second time to build the pages.
  for line in "${page_array[@]}"; do
#  while IFS= read -r line; do
    # Grab the metadata for each page.
    local page_meta=$(grab_page_meta "$line")

    # Need to grab the variables to unset.
    local unset_keys=$(echo -e "$page_meta" | awk '
      BEGIN {
        FS = "="
      }
      {
        print "unset " $1
      }'
    )

    # Get the page output name.
    local page_output_name=$(get_output_file_name "$line")

    # Get page output directory.
    local page_output_directory=$(get_output_directory "$line")

    # We also need to get the page URL.
    local page_url=$(echo "$page_output_name" | sed 's/dest//')

    # Get the page's possible assets directory.
    local page_asset_directory=$(get_asset_directory "$line")

    #echo "$page_output_name"

    # Create the output directory.
    mkdir -p "$page_output_directory"

    # Grab the content of each page.
    local page_body=$(grab_page_body "$line")


    #echo -e "$page_meta"
    #echo -e "$unset_keys"

    # Create the metadata keys.
    eval "$page_meta"
    #printf "%s\n" "${page_tags[*]}"

    # Build the page and write it to its destination.
    build_page "$page_body" > "$page_output_name"

    # Copy the asset directory if it exists.
    if [[ -d "$page_asset_directory" ]]; then
      cp -r "$page_asset_directory" "$page_output_directory"
    fi
    
    # Clear the metadata before moving on.
    eval "$unset_keys"
  done

  # Copy site-wide assets
  cp -r static/* dest/
}

Templates

There are really only two templates, plus a tiny chunk of repeated code. First, let’s look at site_menu_entry.sh. It’s the chunk of code that generates menu entries.

# file: site_menu_entry.sh
template_menu_entry() {
  local entry_url="$1"
  local entry_title="$2"

  local result="
          <li class=\"site-menu-item\"><a href=\"$entry_url\">$entry_title</a></li>
"
  echo -e "$result"
}

Nothing special here, just basic string interpolation.

Now let’s look at the first function in head.sh.

# file: templates/head.sh
template_meta_chunk() {
  local chunk="
  <p class=\"post-meta\">
    Posted on $page_date"


  if [[ -n "$page_tags" ]]; then
    local line_of_tags=" | Tags: "

    #  
    for t in "${page_tags[@]}"; do
      line_of_tags="$line_of_tags<span class=\"page-tag\">$t</span>, "
    done

    # After looping over the tags, remove the last comma and space by deleting the last 2 characters.
    # NOTE: Should really think about converting the tags to an HTML unordered list.
    line_of_tags=$(echo -e "$line_of_tags" | sed -e 's/..$//')

    # Add to result
    chunk="$chunk$line_of_tags"
  fi
  chunk="$chunk
  </p>"

  echo -e "$chunk"
}

The point of this function is to generate the metadata chunk for a blog post. It also detects whether the page has tags and adds them.

The next function is template_head(), which is responsible for generating the top half of every page.

# file: templates/head.sh
template_head() {
  if [[ -z "$page_description" ]]; then
    local page_description="$site_description"
  fi
  
  if [[ -z "$page_title" ]]; then
    local page_title="None"
  fi

  local result="<!DOCTYPE html>
  <html lang=\"en\">

  <head>
    <meta charset=\"utf-8\">
    <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\">
    <meta name=\"description\" content=\"$page_description\">
    <title>$page_title -- $site_name</title>

    <link rel=\"stylesheet\" href=\"/style.css\">


  </head>

  <body>
    <header id=\"site-header\">
      <a id=\"site-name\" href=\"/index.html\">$site_name</a>
      <nav id=\"site-menu\">
        <ul>"

  # Build up the site menu.
  result="$result$site_menu"


  # Add the rest of the header.
  result="$result
        </ul>
      </nav>
    </header>
    <main>
"

  # Add some conditional stuff.
  # Skip adding a title to the home page.
  if ! [[ "$page_title" == "Home" ]]; then
    result="$result
    <h1>$page_title</h1>"
  fi

  # Print post metadata if it's a post.
  if [[ "$page_pageType" == "post" ]]; then
    local meta_chunk=$(template_meta_chunk)
    result="$result$meta_chunk"
  fi

  echo -e "$result"
}

Notice that it skips printing the title for the home page and adds the metadata section if the page is a blog post.

The next template is a bit of a mess, but it was kind of necessary to get the desired output. templates/tail.sh, as its name implies, is responsible for generating the bottom half of every page. It’s also where I put the code for generating the lists of blog posts for the Home and Tags pages.

Let’s look at the first function in the file.

# file: templates/tail.sh
template_post_list_item() {
  local line="$1"
  local tag="$2"
  local result=""
  # Need to extract post information.
  local page_output_name=$(get_output_file_name "$line")
  local page_url=$(echo "$page_output_name" | sed 's/dest//')
  local page_meta=$(grab_page_meta "$line")
  
  local unset_keys=$(echo -e "$page_meta" | awk '
    BEGIN {
      FS = "="
    }
    {
      print "unset " $1
    }'
  )
  eval "$page_meta"

  # If we only want posts with a specific tag, check if one was specified.
  if [[ -n "$tag" ]] && [[ -n "$page_tags" ]]; then
    # If so, SKIP THE REST OF THIS FUNCTION if the page tags DO NOT contain the
    # required tag.
    local tag_found="false"

    for t in "${page_tags[@]}"; do
      if [[ "$t" == "$tag" ]]; then
        tag_found="true"
      fi
    done

    if [[ "$tag_found" == "false" ]]; then
      eval "$unset_keys"
      echo ""
      return
    fi
  fi

  # Begin building post list entry, starting with the meta chunk.
  local meta_chunk=$(template_meta_chunk)
  result="$result
  <li class=\"post-list-item\">
    <a href=\"$page_url\"><h2>$page_title</h2></a>
    $meta_chunk"
 
  # Add the page description if available.
  if [[ -n "$page_description" ]]; then
    result="$result
      <p class=\"post-desc\">$page_description</p>"
  fi

  # Complete the list item.
  result="$result
  </li>"
  
  eval "$unset_keys"
  echo -e "$result"
}

The function above is responsible for generating the individual list items on the Home and Tags pages. In Bash, functions can take optional arguments. I used this to check whether the list item was for the Home page or for a specific tag on the Tags page. If it’s looking for a specific tag and doesn’t find it, the function returns without doing anything.
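Optional arguments in a nutshell: "$2" simply expands to nothing when the caller omits it. The describe_item function below is a made-up example, not part of the SSG.

```shell
#!/usr/bin/env bash
# A function whose second argument is optional.
describe_item() {
  local title="$1"
  local tag="${2:-}"   # optional; empty when not passed

  if [[ -n "$tag" ]]; then
    echo "$title (tag: $tag)"
  else
    echo "$title"
  fi
}

describe_item "My post"          # → My post
describe_item "My post" "bash"   # → My post (tag: bash)
```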

Now let’s look at the second function in tail.sh.

# file: templates/tail.sh
template_tail() {
  local result=""
  # Generate post list for home page.
  if [[ "$page_title" == "Home" ]]; then
    # Start the post list.
    result="
  <ul class=\"post-list\">
"
    for line in "${site_blog_posts[@]}"; do
      local item=$(template_post_list_item "$line")
      result="$result
      $item"
    done
      
      # End the post list.
      result="$result
    </ul>"
  fi

  if [[ "$page_title" == "Tags" ]]; then
    for tag in "${site_tags[@]}"; do
      result="$result
      <h2>$tag</h2>
      <ul class=\"post-list\">
      "

      for line in "${site_blog_posts[@]}"; do
        local list_item=$(template_post_list_item "$line" "$tag")
        if [[ -n "$list_item" ]]; then
          result="$result
          $list_item"
        fi
      done

      result="$result
      </ul>"
    done
  fi

  # Add the final footer section
  result="$result
  </main>
  <footer id=\"site-footer\">
    <p>&copy; 2025 HyphenHawk</p>
    <p>Content is published under the terms and conditions of the <a href=\"https://creativecommons.org/licenses/by-nd/4.0/\">Creative Commons Attribution-NoDerivatives 4.0 International license (CC BY-ND 4.0)</a></p>
  </footer>
</body>

</html>
"
  echo -e "$result"
}

So if it’s the Home page, it loops over the list of blog posts one time. If it’s the Tags page, it runs a double loop, iterating over blog posts for each tag. For all non-matching tags, the result of running template_post_list_item() is an empty string.

Conclusion

And that’s all I have to say about the SSG. It’s not pretty or fast, but it gets the job done. You can find the full source on my GitLab here.