Every now and then, someone asks me if I have any recommendations for static site generators to try out. I usually reply that my own site is built with Hakyll, and that I like it, but also that for people unfamiliar with Haskell, it can be a bit daunting to use.
Sometimes this doesn’t scare people away, and then they ask me how exactly I use it, if I have any suggestions, examples, etc.
I used to reply that, unfortunately, I keep unfinished drafts and such in my blog’s git repository, so I’ve never made it public (and this is still true).
However, it occurred to me that I can just make my blog generate its own documentation: my Hakyll configuration is a literate Haskell file, and right now you’re reading the html woven from that file, created by itself as it was executed to generate the current version of my blog.
I’ll explain some things here, but not all. If you’re new to Hakyll then I recommend reading the official tutorials, but my hope is this can maybe serve as a helpful example of a real-life config.
Imports
Haskell Imports
{-# LANGUAGE OverloadedStrings #-}
import Hakyll
import Hakyll.Core.Provider
import Hakyll.Core.Compiler.InternalI use pandoc for generating html:
import Text.Pandoc.Options (ReaderOptions (..), WriterOptions (..))
import Text.Pandoc.Extensions (Extension(..), enableExtension)
import Text.Pandoc.Templates (compileTemplate, Template)
import Text.Pandoc.Builder (HasMeta(setMeta), MetaValue (..))
import Text.Pandoc.Highlighting (breezeDark)Pandoc uses skylighting to get syntax highlighting in code blocks:
import Skylighting.Types (Syntax)
import Skylighting.Core (parseSyntaxDefinitionFromText, addSyntaxDefinition)Finally, the inevitable “we’re using the default haskell prelude” imports:
import System.FilePath.Posix (takeFileName, dropExtension)
import System.Process (rawSystem)
import System.Directory (copyFile)
import Control.Monad (filterM, void)
import Data.Binary (Binary)
import Data.Functor.Identity (runIdentity)
import Data.Functor ((<&>))
import Data.Text (Text)
import Data.Text.Encoding (encodeUtf8, decodeUtf8)
import qualified Data.Text as T
import qualified Data.Text.Lazy as LT
import qualified Data.ByteString as LBItems
Hakyll is exceedingly general in its design. Mostly, we have to tell it how to
translate things (for which we have the Compiler monad), and it will
keep track of what it’s already translated (for which there is the Item
type).
The items form a directed graph, and Hakyll will cache already-translated things
on the file system so it doesn’t have to re-translate everything all the time
(in the _site/ directory by default), e.g. when using its “watch” command.
Items are created using rules (which we’ll get to later), which
require a Compiler to know what to do with their input.
skylighting syntax
My blog frequently talks about reasonably obscure languages; additionally, I sometimes want “syntax highlighting” for things which are not strictly programming languages (e.g. in the post about asterisk, asterisk’s cli has highlighting). Thus I regularly define my own highlighting syntax.
Incidentally, this is a nice demonstration of how to use the Compiler
monad for something that’s not translated to html:
syntaxCompiler :: Compiler (Item Syntax)
syntaxCompiler = do
text <- getResourceLBS
<&> fmap (LT.fromStrict . decodeUtf8 . LB.toStrict)
fp <- getResourceFilePath
case parseSyntaxDefinitionFromText fp (itemBody text) of
Right syntax -> makeItem syntaxThis syntaxCompiler will translate an input resource (which it gets
as a lazy ByteString from the Compiler monad) into a skylighting Syntax,
and then store it in a Hakyll Item.
As mentioned, Items can be cached on the file system, which requires
that they have some way to be written out. But I’ve not cared enough to implement
this yet, so my Writable instance of Syntax does nothing:
instance Writable Syntax where
write fp item = pure ()compiling posts
Hakyll comes with several standard compilers suitable for blog posts (and many other
things); if you’re just starting out, simply using pandocCompiler as in the
tutorial
will probably work nicely.
On the other hand, there’s maybe some extra things you want to pass to whatever translates your posts into html. I certainly have some:
postCompiler :: Compiler (Item String)
postCompiler = do
-- a skylighting SyntaxMap, retrieving all Items under syntax/*
syntaxMap <- loadAll "syntax/*"
<&> fmap itemBody
<&> foldr addSyntaxDefinition (writerSyntaxMap defaultHakyllWriterOptions)
-- loads the Item under bib/style.csl, which defines how my reference sections look;
csl <- load "bib/style.csl"
-- required to make the types happy
let bib = Item "dummy.bib" (Biblio "")Note that I only load a style.csl file here, not the bibliography itself, which
I include as YAML
at the top of each post’s markdown file. I don’t like the YAML format much, but
it’s been the easiest way to have separate bibliographies for each post — but see
[ZormanBib] for a setup using a bibliography.bib file common to all posts.
Pandoc is amazingly general; especially its markdown reader is very flexible. I like setting some options to make it work better for me:
let rOptions = defaultHakyllReaderOptions
{ readerExtensions = foldr enableExtension
(readerExtensions defaultHakyllReaderOptions)
[ Ext_lists_without_preceding_blankline
, Ext_compact_definition_lists
, Ext_citations
]
}
let wOptions = defaultHakyllWriterOptions
{ writerHighlightStyle = Just breezeDark
, writerSyntaxMap = syntaxMap
, writerTableOfContents = True
-- I need some extra html around the table of contents, so here
-- we pass a custom template defined below
, writerTemplate = Just tocTemplate
}As with skylighting, we again get the resource body from the Compiler monad and
pass it through pandoc. If you don’t set as many options, then probably a
renderPandoc or renderPandocWith is enough here (of course,
you can also use something else entirely to translate your posts — no need to
rely on pandoc!).
getResourceBody
>>= readPandocWith rOptions
<&> fmap (setMeta "link-citations" (MetaBool True))
>>= processPandocBiblio csl bib
<&> writePandocWith wOptionsFinally, define a little template for how to combine table of contents & the post’s body:
tocTemplate :: Text.Pandoc.Templates.Template Text
tocTemplate = either error id . runIdentity . compileTemplate "" $ T.unlines
[ "<div class=\"toc-box\"><div class=\"toc\">"
, "$title$"
, "$toc$"
, "</div></div>"
, "$body$"
]Compiling images
Hakyll provides a copyFileCompiler by default, to work on files that
ought to be copied as-is to the output. This is fine for most kinds of assets;
however, for images I want to strip any metadata.
First we need a newtype with a suitable instance of Writable:
newtype StripExifFile = StripExifFile FilePath
deriving (Binary, Eq, Ord, Show)
instance Writable StripExifFile where
write dest (Item _ (StripExifFile src)) = do
copyFile src dest
void $ rawSystem "exiftool" ["-all=", "--make", "--model", "--icc_profile", dest, "-overwrite_original"]And then a compiler to conveniently use it:
stripExifCompiler :: Compiler (Item StripExifFile)
stripExifCompiler = do
identifier <- getUnderlying
provider <- compilerProvider <$> compilerAsk
makeItem $ StripExifFile $ resourceFilePath provider identifierThis compiler uses several functions from the Hakyll.Core.Compiler.Internal
module (as it’s almost identical to the implementation of copyFileCompiler).
This is probably not good and should be changed lest it break sometime ..
Routes & Rules
Finally, the main function tells Hakyll where to use which compiler, as a list
of rules (which are written in yet another monad, Rules).
Broadly, we have to types of rules: match to load files present on
the file system, and create to create new files which will be
present in hakyll’s output.
Rules which do not have a route given will not be present in hakyll’s output.
main :: IO ()
main = hakyll $ do
-- styling for my Reference sections; compile, and don't include in output
match "bib/style.csl"
$ compile cslCompiler
-- here we create the syntax/* items we used in the postCompiler earlier
match "syntax/*" $ do
compile syntaxCompiler
-- files in images/ & assets/* are just copied to hakyll's output
match "images/*" $ do
route idRoute
compile stripExifCompiler
match "assets/*" $ do
route idRoute
compile copyFileCompiler
-- css can be compressed
match "css/*" $ do
route idRoute
compile compressCssCompiler
-- templates use the same mechanism as our syntax items
match "templates/*" $ compile templateBodyCompiler
-- about & contact sites are translated via pandoc, and change their path extension
match (fromList ["about.rst", "contact.markdown"]) $ do
route $ setExtension "html"
compile $ pandocCompiler
>>= loadAndApplyTemplate "templates/default.html" defaultContext
>>= relativizeUrls
-- posts are translated using our postCompiler
match "posts/*" $ do
route $ customRoute removeDate -- setExtension "html"
compile $ postCompiler
-- save a snapshot! this will be used in the rss feed
>>= saveSnapshot "content"
>>= loadAndApplyTemplate "templates/post.html" postCtx
>>= loadAndApplyTemplate "templates/default.html" postCtx
>>= relativizeUrls
-- this is the rule for the site you're reading right now!
match "hakyll/site.lhs" $ do
route $ customRoute (const "how-i-use-hakyll.html")
compile $ postCompiler
>>= loadAndApplyTemplate "templates/post.html" infoCtx
>>= loadAndApplyTemplate "templates/default.html" infoCtx
>>= relativizeUrls
-- the main site, hand-written in html for flexibility
match "index.html" $ do
route idRoute
compile $ do
posts <- recentFirst =<< loadAll "posts/*"
let indexCtx =
listField "posts" postCtx (return posts) `mappend`
defaultContext
getResourceBody
>>= applyAsTemplate indexCtx
>>= loadAndApplyTemplate "templates/default.html" indexCtx
>>= relativizeUrlsThen also two create rules, for an rss feed and archive site:
create ["rss.xml"] $ do
route idRoute
compile $ do
let feedCtx = postCtx `mappend` bodyField "description"
posts <- loadAllSnapshots "posts/*" "content"
>>= recentFirst
>>= filterM (fmap (/= Just "true")
. flip getMetadataField "hidden"
. itemIdentifier)
<&> take 10
renderRss feedConfiguration feedCtx posts
create ["archive.html"] $ do
route idRoute
compile $ do
posts <- recentFirst =<< loadAll "posts/*"
let archiveCtx =
listField "posts" postCtx (return posts) `mappend`
constField "title" "Archives" `mappend`
defaultContext
makeItem ""
>>= loadAndApplyTemplate "templates/archive.html" archiveCtx
>>= loadAndApplyTemplate "templates/default.html" archiveCtx
>>= relativizeUrlsAnd that’s it!
(psst, a couple auxiliary definitions used above are still missing:)
genCtx :: Bool -> Context String
genCtx b =
boolField "showdate" (const b) `mappend`
dateField "date" "%Y-%m-%d" `mappend`
defaultContext
postCtx = genCtx True
infoCtx = genCtx FalsefeedConfiguration :: FeedConfiguration
feedConfiguration = FeedConfiguration
{ feedTitle = "stuebinm.eu"
, feedDescription = "apparently a blog"
, feedAuthorName = "terru"
, feedAuthorEmail = "stuebinm@disroot.org"
, feedRoot = "https://stuebinm.eu"
}removeDate :: Identifier -> FilePath
removeDate ident = "posts/" <> dropExtension words <> ".html"
where words = drop 11 . takeFileName . toFilePath $ ident