How I use Hakyll

Every now and then, someone asks me if I have any recommendations for static site generators to try out. I usually reply that my own site is built with Hakyll, and that I like it, but also that for people unfamiliar with Haskell, it can be a bit daunting to use.

Sometimes this doesn’t scare people away, and then they ask me how exactly I use it, if I have any suggestions, examples, etc.

I used to reply that, unfortunately, I keep unfinished drafts and such in my blog’s git repository, so I’ve never made it public (and this is still true).

However, it occurred to me that I can just make my blog generate its own documentation: my Hakyll configuration is a literate Haskell file, and right now you’re reading the html woven from that file, created by itself as it was executed to generate the current version of my blog.

I’ll explain some things here, but not all. If you’re new to Hakyll then I recommend reading the official tutorials, but my hope is this can maybe serve as a helpful example of a real-life config.

Imports

Haskell Imports
{-# LANGUAGE OverloadedStrings #-}
import Hakyll

I use pandoc for generating html:

import Text.Pandoc.Options      (ReaderOptions (..), WriterOptions (..))
import Text.Pandoc.Extensions   (Extension(..), enableExtension)
import Text.Pandoc.Templates    (compileTemplate, Template)
import Text.Pandoc.Builder      (HasMeta(setMeta), MetaValue (..))
import Text.Pandoc.Highlighting (breezeDark)

Pandoc uses skylighting to get syntax highlighting in code blocks:

import Skylighting.Types (Syntax)
import Skylighting.Core  (parseSyntaxDefinitionFromText, addSyntaxDefinition)

Finally, the inevitable “we’re using the default haskell prelude” imports:

import System.FilePath.Posix (takeFileName, dropExtension)
import Control.Monad         (filterM)
import Data.Functor.Identity (runIdentity)
import Data.Functor          ((<&>))
import Data.Text             (Text)
import Data.Text.Encoding    (encodeUtf8, decodeUtf8)
import qualified Data.Text       as T
import qualified Data.Text.Lazy  as LT
import qualified Data.ByteString as LB

Items

Hakyll is exceedingly general in its design. Mostly, we have to tell it how to translate things (for which we have the Compiler monad), and it will keep track of what it’s already translated (for which there is the Item type).

The items form a directed graph, and Hakyll will cache already-translated things on the file system so it doesn’t have to re-translate everything all the time (in the _site/ directory by default), e.g. when using its “watch” command. Items are created using rules (which we’ll get to later), which require a Compiler to know what to do with their input.

skylighting syntax

My blog frequently talks about reasonably obscure languages; additionally, I sometimes want “syntax highlighting” for things which are not strictly programming languages (e.g. in the post about asterisk, asterisk’s cli has highlighting). Thus I regularly define my own highlighting syntax.

Incidentally, this is a nice demonstration of how to use the Compiler monad for something that’s not translated to html:

syntaxCompiler :: Compiler (Item Syntax)
syntaxCompiler = do
  text <- getResourceLBS
    <&> fmap (LT.fromStrict . decodeUtf8 . LB.toStrict)
  fp <- getResourceFilePath
  case parseSyntaxDefinitionFromText fp (itemBody text) of
    Right syntax -> makeItem syntax

This syntaxCompiler will translate an input resource (which it gets as a lazy ByteString from the Compiler monad) into a skylighting Syntax, and then store it in a Hakyll Item.

As mentioned, Items can be cached on the file system, which requires that they have some way to be written out. But I’ve not cared enough to implement this yet, so my Writable instance of Syntax does nothing:

instance Writable Syntax where
  write fp item = pure ()

compiling posts

Hakyll comes with several standard compilers suitable for blog posts (and many other things); if you’re just starting out, simply using pandocCompiler as in the tutorial will probably work nicely.

On the other hand, there’s maybe some extra things you want to pass to whatever translates your posts into html. I certainly have some:

postCompiler :: Compiler (Item String)
postCompiler = do

  -- a skylighting SyntaxMap, retrieving all Items under syntax/*
  syntaxMap <- loadAll "syntax/*"
   <&> fmap itemBody
   <&> foldr addSyntaxDefinition (writerSyntaxMap defaultHakyllWriterOptions)

  -- loads the Item under bib/style.csl, which defines how my reference sections look;
  csl <- load "bib/style.csl"
   -- required to make the types happy
  let bib = Item "dummy.bib" (Biblio "")

Note that I only load a style.csl file here, not the bibliography itself, which I include as YAML at the top of each post’s markdown file. I don’t like the YAML format much, but it’s been the easiest way to have separate bibliographies for each post — but see [ZormanBib] for a setup using a bibliography.bib file common to all posts.

Pandoc is amazingly general; especially its markdown reader is very flexible. I like setting some options to make it work better for me:

  let rOptions = defaultHakyllReaderOptions
        { readerExtensions = foldr enableExtension
            (readerExtensions defaultHakyllReaderOptions)
            [ Ext_lists_without_preceding_blankline
            , Ext_compact_definition_lists
            , Ext_citations
            ]
        }

  let wOptions = defaultHakyllWriterOptions
        { writerHighlightStyle = Just breezeDark
        , writerSyntaxMap = syntaxMap
        , writerTableOfContents = True
        -- I need some extra html around the table of contents, so here
        -- we pass a custom template defined below
        , writerTemplate = Just tocTemplate
        }

As with skylighting, we again get the resource body from the Compiler monad and pass it through pandoc. If you don’t set as many options, then probably a renderPandoc or renderPandocWith is enough here (of course, you can also use something else entirely to translate your posts — no need to rely on pandoc!).

  getResourceBody
      >>= readPandocWith rOptions
      <&> fmap (setMeta "link-citations" (MetaBool True))
      >>= processPandocBiblio csl bib
      <&> writePandocWith wOptions

Finally, define a little template for how to combine table of contents & the post’s body:

tocTemplate :: Text.Pandoc.Templates.Template Text
tocTemplate = either error id . runIdentity . compileTemplate "" $ T.unlines
  [ "<div class=\"toc-box\"><div class=\"toc\">"
  , "$title$"
  , "$toc$"
  , "</div></div>"
  , "$body$"
  ]

Routes & Rules

Finally, the main function tells Hakyll where to use which compiler, as a list of rules (which are written in yet another monad, Rules).

Broadly, we have to types of rules: match to load files present on the file system, and create to create new files which will be present in hakyll’s output.

Rules which do not have a route given will not be present in hakyll’s output.

main :: IO ()
main = hakyll $ do

    -- styling for my Reference sections; compile, and don't include in output
    match "bib/style.csl"
      $ compile cslCompiler

    -- here we create the syntax/* items we used in the postCompiler earlier
    match "syntax/*" $ do
      compile syntaxCompiler

    -- files in images/ & assets/* are just copied to hakyll's output
    match "images/*" $ do
        route   idRoute
        compile copyFileCompiler

    match "assets/*" $ do
      route idRoute
      compile copyFileCompiler

    -- css can be compressed
    match "css/*" $ do
        route   idRoute
        compile compressCssCompiler

    -- templates use the same mechanism as our syntax items
    match "templates/*" $ compile templateBodyCompiler

    -- about & contact sites are translated via pandoc, and change their path extension
    match (fromList ["about.rst", "contact.markdown"]) $ do
        route   $ setExtension "html"
        compile $ pandocCompiler
            >>= loadAndApplyTemplate "templates/default.html" defaultContext
            >>= relativizeUrls

    -- posts are translated using our postCompiler
    match "posts/*" $ do
        route $ customRoute removeDate -- setExtension "html"
        compile $ postCompiler
            -- save a snapshot! this will be used in the rss feed
            >>= saveSnapshot "content"
            >>= loadAndApplyTemplate "templates/post.html"    postCtx
            >>= loadAndApplyTemplate "templates/default.html" postCtx
            >>= relativizeUrls

    -- this is the rule for the site you're reading right now!
    match "hakyll/site.lhs" $ do
      route $ customRoute (const "how-i-use-hakyll.html")
      compile $ postCompiler
        >>= loadAndApplyTemplate "templates/post.html" infoCtx
        >>= loadAndApplyTemplate "templates/default.html" infoCtx
        >>= relativizeUrls

    -- the main site, hand-written in html for flexibility
    match "index.html" $ do
        route idRoute
        compile $ do
            posts <- recentFirst =<< loadAll "posts/*"
            let indexCtx =
                    listField "posts" postCtx (return posts) `mappend`
                    defaultContext
            getResourceBody
                >>= applyAsTemplate indexCtx
                >>= loadAndApplyTemplate "templates/default.html" indexCtx
                >>= relativizeUrls

Then also two create rules, for an rss feed and archive site:

    create ["rss.xml"] $ do
      route idRoute
      compile $ do
        let feedCtx = postCtx `mappend` bodyField "description"
        posts <- loadAllSnapshots "posts/*" "content"
          >>= recentFirst
          >>= filterM (fmap (/= Just "true")
                       . flip getMetadataField "hidden"
                       . itemIdentifier)
          <&> take 10
        renderRss feedConfiguration feedCtx posts

    create ["archive.html"] $ do
        route idRoute
        compile $ do
            posts <- recentFirst =<< loadAll "posts/*"
            let archiveCtx =
                    listField "posts" postCtx (return posts) `mappend`
                    constField "title" "Archives"            `mappend`
                    defaultContext

            makeItem ""
                >>= loadAndApplyTemplate "templates/archive.html" archiveCtx
                >>= loadAndApplyTemplate "templates/default.html" archiveCtx
                >>= relativizeUrls

And that’s it!


(psst, a couple auxiliary definitions used above are still missing:)

genCtx :: Bool -> Context String
genCtx b =
    boolField "showdate" (const b) `mappend`
    dateField "date" "%Y-%m-%d" `mappend`
    defaultContext

postCtx = genCtx True
infoCtx = genCtx False
feedConfiguration :: FeedConfiguration
feedConfiguration = FeedConfiguration
  { feedTitle       = "stuebinm.eu"
  , feedDescription = "apparently a blog"
  , feedAuthorName  = "terru"
  , feedAuthorEmail = "stuebinm@disroot.org"
  , feedRoot        = "https://stuebinm.eu"
  }
removeDate :: Identifier -> FilePath
removeDate ident = "posts/" <> dropExtension words <> ".html"
  where words = drop 11 . takeFileName . toFilePath $ ident

References