2025-11-09

Machines should know which commit is deployed on them

TL;DR: I find it useful to have all the (NixOS) machines I administrate know meta information about their currently deployed configuration, which allows monitoring to shout at me if I forget to update one or (worse) have deployed changes I forgot to commit.

This comes up in conversation every now and then, and people are surprisingly often unaware of how easily this can be done. Hence I thought I’d write a summary of what I do as a little reference I can point people to in the future.

Getting Git’s Information

This is fairly easy, whether using the (still experimental!) flakes or not. Flakes, by design, integrate more tightly with the underlying git repository (assuming the directory of the flake is a git repository, which it does not have to be), so we get information about it as part of the flake’s self input:

{
  inputs = { };

  # the `...` keeps this working once inputs are added
  outputs = { self, ... }: { inherit self; };
}

Merely evaluating the self output here gives something like

{ 
  _type = "flake";
  inputs = { };
  lastModified = 1761994565;
  lastModifiedDate = "20251101105605";
  narHash = "sha256-5USBhx5RlZ6YVQiukrj5rsBXDZwO3d1QbNkLoixEFgc=";
  outPath = "/nix/store/z66skpk7izzap6336lsd091wpmcpk1v2-source";
  outputs = { self = «repeated»; };
  rev = "972aa7dd2f313733f551475115c6bb2cc1abdb57";
  revCount = 2;
  self = «repeated»;
  shortRev = "972aa7d";
  sourceInfo = { 
    lastModified = 1761994565;
    lastModifiedDate = "20251101105605";
    narHash = "sha256-5USBhx5RlZ6YVQiukrj5rsBXDZwO3d1QbNkLoixEFgc=";
    outPath = "/nix/store/z66skpk7izzap6336lsd091wpmcpk1v2-source";
    rev = "972aa7dd2f313733f551475115c6bb2cc1abdb57";
    revCount = 2;
    shortRev = "972aa7d";
    submodules = false;
  };
  submodules = false;
}

Entertainingly (as part of what appears to be a fundamental law of all things nix) we even get the same information twice.
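The snippets further down assume that self is available as an argument inside NixOS modules. One common way to arrange that is to hand it over via specialArgs. This is only a sketch: the host name example, the nixpkgs branch, the system and ./configuration.nix are all placeholders I made up for illustration.

```nix
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";

  outputs = { self, nixpkgs, ... }: {
    # "example" is a hypothetical host name
    nixosConfigurations.example = nixpkgs.lib.nixosSystem {
      system = "x86_64-linux";
      # every module can now take `self` as a function argument
      specialArgs = { inherit self; };
      modules = [ ./configuration.nix ];
    };
  };
}
```

A module written as { config, lib, self, ... }: { ... } can then read self.rev, self.lastModified and friends directly.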

Without flakes

What do we do if we don’t use flakes? In that case, the decision to keep our source files in a git repository is decoupled from the fact that they’re a human-appreciable collection of source files: nix does not know or care that they’re in a git repository, and so won’t give us the information unless we ask for it explicitly.

How do we ask? By simply misusing builtins.fetchGit:

builtins.fetchGit ./.

This will, if it’s evaluated in a file in the root directory of the git repository, clone it & give us essentially the same information that we saw before:

{
  lastModified = 1761994565;
  lastModifiedDate = "20251101105605";
  narHash = "sha256-5USBhx5RlZ6YVQiukrj5rsBXDZwO3d1QbNkLoixEFgc=";
  outPath = "/nix/store/z66skpk7izzap6336lsd091wpmcpk1v2-source";
  rev = "972aa7dd2f313733f551475115c6bb2cc1abdb57";
  revCount = 2;
  shortRev = "972aa7d";
  submodules = false;
}

(Though disappointingly, this time, we only get it once)
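To get this into a NixOS configuration without flakes, one option is to expose the result as a module argument. This is a minimal sketch, assuming the file below lives in the root of the git repository; the argument name self is an arbitrary choice that I picked so the same modules work in both setups.

```nix
# configuration.nix, assumed to sit in the root of the git repository
{ ... }:
{
  # Make the fetchGit result available to all modules under the name `self`
  # (the name is arbitrary, chosen here to match the flake case)
  _module.args.self = builtins.fetchGit ./.;
}
```

One caveat: arguments set through _module.args cannot be used inside imports, only in the rest of a module.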

What does fetchGit do?

For those whose entry to using nix was with flakes & who are used to nix always copying their source files into the store before it does anything else, it may be worth interrogating for a moment what’s going on here: did we just copy the source tree twice? No!

In fact the ./. path here will happily resolve to the actual directory containing the file in which it is written, even if that happens to be outside the nix store. Unless the pure evaluation mode is enabled (e.g. via --pure-eval or by using flakes, where it is the default), this is allowed. Then fetchGit will clone it, doing no more work than the copying of a flake into the store in the other case.

Note that this means this approach does not work from inside a flake!

Uncommitted changes

When evaluating a flake in a “dirty” worktree which contains uncommitted changes, nix will produce a warning, and produce a slightly different attribute set:

{ 
  _type = "flake";
  dirtyRev = "9952748d841d18c5434919f457b0cbac4ba53ba7-dirty"; 
  dirtyShortRev = "9952748-dirty";
  inputs = { };
  lastModified = 1761951614;
  lastModifiedDate = "20251031230014";
  narHash = "sha256-5USBhx5RlZ6YVQiukrj5rsBXDZwO3d1QbNkLoixEFgc=";
  outPath = "/nix/store/z66skpk7izzap6336lsd091wpmcpk1v2-source";
  outputs = { self = «repeated»; };
  self = «repeated»;
  sourceInfo = { 
    dirtyRev = "9952748d841d18c5434919f457b0cbac4ba53ba7-dirty";
    dirtyShortRev = "9952748-dirty";
    lastModified = 1761951614;
    lastModifiedDate = "20251031230014";
    narHash = "sha256-5USBhx5RlZ6YVQiukrj5rsBXDZwO3d1QbNkLoixEFgc=";
    outPath = "/nix/store/z66skpk7izzap6336lsd091wpmcpk1v2-source";
    submodules = false;
  };
  submodules = false;
}

rev and shortRev have been replaced by dirtyRev and dirtyShortRev.

The same is (almost) true for fetchGit:

{
  dirtyRev = "9952748d841d18c5434919f457b0cbac4ba53ba7-dirty";
  dirtyShortRev = "9952748-dirty";
  lastModified = 1761951614;
  lastModifiedDate = "20251031230014";
  narHash = "sha256-5USBhx5RlZ6YVQiukrj5rsBXDZwO3d1QbNkLoixEFgc=";
  outPath = "/nix/store/z66skpk7izzap6336lsd091wpmcpk1v2-source";
  rev = "0000000000000000000000000000000000000000";
  revCount = 0;
  shortRev = "0000000";
  submodules = false;
}

Very annoyingly, the returned attributes differ: unlike with flakes, the attributes rev and shortRev still exist, but now always give a meaningless dummy value! On its own this is not a large problem—all the information is still there—but it’s good to be mindful of this when writing code which consumes this information and is used both with and without flakes, as otherwise you’ll get surprising bugs.
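A small pair of helpers sidesteps the difference. This is just a sketch with names I made up (revOf, isDirty), fed with trimmed-down copies of the attribute sets shown above: always look at dirtyRev first, and never trust rev on its own.

```nix
let
  # Prefer dirtyRev: with fetchGit, rev is an all-zero dummy in a dirty worktree
  revOf = src: src.dirtyRev or src.rev;
  # dirtyRev only exists for dirty worktrees, with and without flakes
  isDirty = src: src ? dirtyRev;

  # trimmed-down examples of the two shapes shown above
  cleanFlake = { rev = "972aa7dd2f313733f551475115c6bb2cc1abdb57"; };
  dirtyFetchGit = {
    dirtyRev = "9952748d841d18c5434919f457b0cbac4ba53ba7-dirty";
    rev = "0000000000000000000000000000000000000000";
  };
in
{
  clean = { rev = revOf cleanFlake; dirty = isDirty cleanFlake; };
  dirty = { rev = revOf dirtyFetchGit; dirty = isDirty dirtyFetchGit; };
}
```

Evaluating this gives the real commit hash with dirty = false for the first set, and the -dirty revision with dirty = true for the second; the all-zero rev never leaks out.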

Aside: What is a “Short Rev”, anyways?

Depending on how much experience you have with git, and especially with very large git repositories (say, Nixpkgs), at this point there might be a question burning inside you: What is a “short hash” here, anyways, and exactly how long is it?

Well, if you’re using CppNix, the answer is “it’s exactly the first seven characters of the full-length commit hash”.

Except this is not at all how git behaves when it shortens commit hashes! If you ask it for one (e.g. via git rev-parse --short) it will create a “short” prefix of the full commit hash, but ensure that it is still long enough to uniquely identify its referent object in the current repository, with seven characters being a minimum that might well grow if the repository is very large.
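To make the CppNix side of this concrete: going by the description above, its shortRev is nothing more than a fixed-length prefix, which can be reproduced in plain Nix. The rev here is the one from the example output earlier in this post.

```nix
let
  rev = "972aa7dd2f313733f551475115c6bb2cc1abdb57";
in
# a fixed seven-character prefix, regardless of how large the repository is
builtins.substring 0 7 rev
```

This evaluates to "972aa7d", matching the shortRev shown earlier, and nothing in it knows or cares whether another object in the repository shares that prefix.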

Is this a real issue?

Yes! In fact, Nixpkgs itself is a good example, leading to unfixable issues in Hydra: to find one, simply set git’s core.abbrev value to something low (say, 7) so it won’t add more characters than strictly necessary, and then do git log --oneline.

You won’t need much scrolling to find a short hash that is (at least) 8 characters long, giving you a case where CppNix’s seven character prefix is ambiguous.

Why did I specify “CppNix” above? Surely the Nix implementation you choose should not matter? Isn’t this ecosystem all about reproducibility? Oh, but what a nice world that would be.

Instead, if you use Lix, you will get a dirtyShortRev that is produced by simply calling out to git — it will always be a unique identifier, exactly as git would like it to be, but its precise value and length might well depend on your git config.

Can you already hear the ugly beast rearing its head? Lix will not give you a guaranteed-unique identifier for the shortRev attribute in the non-dirty case, because that would break reproducibility of flakes: the length of a guaranteed-unique short hash depends not just on the locked revision of a git repository input, but also on your git’s config, and worse, on exactly how much other stuff there is in that git repository; if it grows, eventually the short revs will grow, too.

As this is not only unlikely to be fixed, but indeed appears conceptually impossible to fix at all, it’s probably best to never use the values of shortRev and dirtyShortRev, and prefer the full versions in all cases.

Telling the machine where it’s at

First, the standard way to include information like this into a system is to put it into /etc/os-release, where it can be picked up by various tools - for example, the boot menu will show it when choosing a generation. Such attributes are easily set by using the system.nixos.label NixOS option.
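As a rough sketch of what that could look like (assuming self is passed to modules as an argument, which is my assumption and not something NixOS does on its own), the label can simply carry the revision after the usual version string:

```nix
{ config, self, ... }:
{
  # Full revision on purpose, see the short rev aside above.
  # A "-dirty" suffix only uses characters the label option accepts.
  system.nixos.label = "${config.system.nixos.version}-${self.dirtyRev or self.rev}";
}
```

I have not checked how every bootloader copes with a label this long, so treat the exact format as a starting point rather than a recommendation.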

Second, I want my machines to immediately tell me what state is on them when I log in. NixOS and scriptable motd messages are tricky, so instead I set it from inside Nix:

let
  formatDate = date: with lib.strings;
    let
      year = substring 0 4 date;
      month = substring 4 2 date;
      day = substring 6 2 date;
      hour = substring 8 2 date;
      minute = substring 10 2 date;
      second = substring 12 2 date;
    in
      "${year}-${month}-${day} ${hour}:${minute}:${second} UTC";
in
{
  users.motd = ''
    Welcome to ${config.networking.hostName}, running NixOS ${config.system.nixos.release}!
    Built from ${self.dirtyRev or self.rev}.
    Last commit was at ${formatDate self.lastModifiedDate}.
    ${if self ? dirtyRev then "\nPlease remember to commit your changes!\n" else ""}
  '';
}

(Why did CppNix’s designers think it appropriate to provide a “readable” date format as a string, but make it contain only digits and nothing else, forcing us to re-parse it inside Nix if we want anything actually readable? I have no clue.)

Monitoring

Finally I like to do one more thing as well: Set up some kind of monitoring which will tell me if I forgot to commit changes I have already deployed. This is especially important when a machine is administrated by more than one person: If one of us forgets to commit anything, others are either blocked from working on the machine, or may unwittingly roll back previous changes.

First, create two files inside /etc:

{
  environment.etc.commit.text = self.dirtyRev or self.rev;
  environment.etc.timestamp.text = builtins.toString self.lastModified;
}

Of course we could also simply parse this information out of /etc/os-release — but doing it this way makes it a little easier to write custom scripts, so, eh, why not.

monit

For reasons incomprehensible to most of my acquaintances, I still use monit for a lot of my monitoring. But it makes it easy to set up little checks that should cause notifications if they fail, and it’s trivial to set up, especially compared to more modern cloud-infra monitoring tools I could use on my usually comparatively tiny setups.

(on the other hand, it comes from the pre-systemd ages when it was seen as reasonable to write your monitoring daemons in C and run them as root so they could restart arbitrary startup scripts — your mileage may vary on that in today’s world)

Do we know what the currently deployed state is?

There is one main check:

Is the currently deployed commit the one that’s actually the head of the main branch, as pushed to some forge?

Most git forges offer convenient APIs for retrieving commit hashes of branch heads, so this is easily implemented as a little script:

let
  checkHash = pkgs.writeScriptBin "check-commit-hash" ''
    #!${lib.getExe pkgs.fish}
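    # -f makes curl produce no output on HTTP errors, so $wanted stays empty
    set wanted (${lib.getExe pkgs.curl} -sf \
                https://git.example.org/api/v1/repos/config/branches/main \
                -H 'accept: application/json' | ${lib.getExe pkgs.jq} -r .commit.id)

    # $status only reflects jq here, so also treat an empty result as a failure
    if test $status != 0; or test -z "$wanted"
      echo "could not reach git.example.org"
      exit 2
    end

    set actual (cat /etc/commit)
    if test "$actual" != "$wanted"
      echo "${config.networking.hostName} was built on $actual, but commit on main is $wanted"
      exit 1
    end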
  '';
in { 
  services.monit.config = ''
      check program deployed-commit-on-main path ${lib.getExe checkHash}
            if status == 1 for 64 cycles then alert
            if status == 2 for 3 cycles then alert
  '';
}

Conveniently, this has the by-product of also checking for dirtiness of the currently deployed commit, as the -dirty suffix at the end of dirtyRev will cause the check to fail even if the hashes match.

Did we forget to update the host?

let
  checkDeployAge = pkgs.writeScriptBin "check-deploy-age" ''
    #!${lib.getExe pkgs.fish}

    set date (date +%s)
    # we do this indirection here so monit's config won't change on each deploy
    set deploytimestamp (cat /etc/timestamp)
    set age (expr $date - $deploytimestamp)

    if test $age -ge (expr 3600 \* 24 \* 10)
      echo "${config.networking.hostName} has not been deployed for 10 days, perhaps someone should do updates?"
      exit 1
    end
  '';
in {
  services.monit.config = ''
      check program check-deploy-age path ${lib.getExe checkDeployAge}
            if status == 1 then alert
  '';
}

Alternative: include the full config

There’s an alternative to this approach which achieves all the same goals (and then some), but which I found a little more annoying in practice. Simply do:

{
  environment.etc.nixfiles.source = self.outPath;
}

and the entire configuration that has currently been deployed will always be immediately available, no matter what has been messed up or forgotten. We actually ran this on the infra4future.de server for almost a year, and it worked well enough.

So why all this extra effort, if it’s this easy? Mainly, it makes deploys horribly sluggish — there is no good way to cache half a derivation, so each and every deploy (unless it is a complete no-op) will now require re-uploading the entire configuration to the server, and deploy speed is effectively capped at however long that takes on whatever uplink one’s currently using. And even if it’s only a couple kilobytes this can take a while, especially during debugging sessions from a moving train.

Conclusion

That’s it! Overall, I’m pretty satisfied with this approach & it’s been useful more than once over the last few years.

There is one thing I wish I could add, tho: adding the whole configuration to the system closure is too costly, but adding a diff against the last commit (especially in dirty states) would be extremely useful. I’ve never looked into how feasible that would be … perhaps one day.