2021-11-13

Are Nix Expressions Pacman-Complete?

aka: likely the worst ‘what is a monad?’-post it is possible to write.

tl;dr: at the very least, they’re tic tac toe-complete!

Introduction

A: I wonder if Nix is Turing-Complete?

B: The Nix Expression language? Sure it is: it has λ, and β-reduction, what more would you need?

A: Well – I guess so. But can we use it for other things than just describing how to build things?

B: Not sure. It’s meant to describe how to build things, not to support any extra language features that could get in the way.

A: But Guix does perfectly fine just using Scheme to define packages, and that’s a general-purpose language!

B: I guess so. Scheme does have things like I/O, and side-effects, and … – people just don’t use those when defining derivations.

A: Does Nix really not have I/O? It can read files, which sounds like input, and it can write files, which sounds a lot like output. It can even print to the console!

B: But all input has to happen when evaluation starts, all at once, I think.

A: Is that a problem?

B: Well, it different from other languages, where you could first print something (say, a question) then wait for more input (let’s hope it’s an answer) and then after that print some more stuff, and so on.

A: And you don’t believe we can do that in Nix?

B: No, I don’t think so. Can we?

Some Facts about Nix

A: Here’s some facts about Nix (the language):

  1. It can read any file, from any path which it can construct as a string. Moreover, it can also import that file and evaluate it as an expression.

  2. It can write files with any content it can construct, but not to arbitrary locations – only to set paths in the Nix Store.

And here’s some facts about Nix (the interpreter):

  1. It cannot freeze the entire file system while it is busy evaluating some expression, nor can it take a snapshot of every file when it starts.

  2. Which means that we can force it to read files that were created only after it started evaluating!

I think that’s already anough, actually. Do you see it?

B: Hm … not sure yet. But I did notice that (1) and (2) imply that Nix can interpret itself – there’s an eval function:

let
  pkgs = import <nixpkgs> {};
in
  eval = expression:
    import (pkgs.writeText "eval" expression)

A: Huh, I didn’t think of that. So we can just plug strings into eval and it’ll interpret them as Nix expressions?

nix-repl> eval "1"
[1/0/1 built] building eval1

[1 built]

Uhm …

B: Looks like that’s not what nix repl was designed for. It did work, it’s just glued the resulting 1 right into the middle of the build messages.

A: It’s easier to see for derivations:

nix-repl> eval "(import <nixpkgs> {}).firefox"
[2 built]
«derivation /nix/store/xp2ycak6xn4zhryvdwjdakgz5xmapqdk-firefox-89.0.drv»

Loopings and Undecidability

A: You gave me an idea, though: what happens if we evaluate this?

let loop = a: let file = eval (toString a);
              in if file == 1 then loop a else "done";
in loop 1
[2/3 built] error: stack overflow (possible infinite recursion)

Huh. Looks like Nix doesn’t have tail-call optimisation. And doesn’t it usually detect infinite recursion?

B: Why would need to have tail-call optimisation? And detecting infinite recursion in general is a hard problem, you can’t expect it to always work!

A: Hm.

B: What?

A: if file == 1 ... then loop a. That’s always true. We could put a more complicated condition there.

B: Like what?

A: Like, say, a Turing machine.

B: Why would you even – no, don’t –

A: – too late, already done it: turingmachine.nix

> nix-build turingmachine.nix
building '/nix/store/nm5kf2ybl29dsbj4l1d9bg6assivm9a1-now-state-0-went-left.drv'...
building '/nix/store/fhim7mirjgp3rliajiij9bd60sfxps3w-now-state-1-went-right.drv'...
building '/nix/store/wm930rdavr3vsn1r09zzh2d71yjvbfi5-now-state-2-went-left.drv'...
building '/nix/store/5z5p51a8lgxcgvr176l6gdsb2v0pyxj4-now-state-3-went-left.drv'...
building '/nix/store/xk8lrv8zgbdz20yggz52dx31g0pr3698-now-state-3-went-right.drv'...
building '/nix/store/cr8yl0bxs0v8ypp6w2md7andmky0qi3f-now-state-0-went-right.drv'...
building '/nix/store/x8aj86xjwzi63wv1qgfg473s8lxrfyla-now-state-1-went-right.drv'...
building '/nix/store/bdwi16w3kxil65phad8w6zc1s08si8xd-now-state-0-went-left.drv'...
building '/nix/store/8472d4sl3l9h6h0m1nyis3v2d6jpavrh-now-state-1-went-left.drv'...
building '/nix/store/fh5al936sl8phzpf1a9ycn64kmggzxmr-now-state-0-went-left.drv'...
building '/nix/store/df3mzfxygv50l8p218yxf3f31zqclc4r-now-state-1-went-left.drv'...
building '/nix/store/5jzb29yjy3lldhdh535n24m7mdgi56c7-now-state-0-went-left.drv'...
building '/nix/store/pdc7c4ssslyf14h4cdn99pwayjp55dvk-now-state-1-went-right.drv'...
building '/nix/store/ylvfk029nv6njd9g2kjjrqqrpy990dgh-now-state-2-went-left.drv'...
building '/nix/store/l3qr8rhih8syw1afv3b6sx7z19a2817r-now-state-3-went-left.drv'...
building '/nix/store/pf7znba9ipck4qnacjh3mwxdyl65kpkx-now-state-3-went-right.drv'...
building '/nix/store/s1y3jw3q8jam18g64xpk5f4prqamx76j-now-state-0-went-right.drv'...
building '/nix/store/xigm9h6w2vgbyjx16bs1srv2fkmdx976-now-state-1-went-right.drv'...
building '/nix/store/4bm2f2aqf76w8dckplxc9vx2lm1x3ysc-now-state-2-went-left.drv'...
...

B: Oh god. Well, on the other hand — if you still expect Nix to detect whether or not that thing leads to infinite recursion, you’re literally asking it to solve the halting problem.

basic I/O

B: Anyways, can we please stop this and go back to the actual topic of this post?

A: you mean, I/O in Nix?

B: sigh yes, I guess I do.

A: So, fact (4) says we can read files that were created after Nix already started evaluating. Here’s a particularly boring example:

builtins.readFile (pkgs.writeFile "nix" "Lorem Ipsum")

B: right – this is what’s called “import from derivation”, and it’s also how things like niv work: write files into the nix store (using a fetcher like fetchGit), then import them into Nix.

A: Sure, but actually we can just import any path – we’re not limited to just those that are within the nix store:

builtins.readFile "/tmp/hello-nix.txt"

If we evaluate that, and then very quickly create that file (or force Nix to do some lengthy operation first), then Nix will read something a user put somewhere, after the start of evaluation!

> nix-build -E 'builtins.readFile "/tmp/hello-nix.txt"' \
    & echo hello > /tmp/hello-nix.txt
error: expression does not evaluate to a derivation (or a set or list of those)

See? It complains that the result isn’t a derivation, but it doesn’t complain about a missing file.

B: You sure that it would complain otherwise?

A: Sure. Nix is untyped, how would it know that builtins.readFile doesn’t produce a derivation until it’s done evaluating it?

> nix-build -E 'builtins.readFile "/tmp/does-not-exist"'
error: opening file '/tmp/does-not-exist': No such file or directory

See?

B: Okay, so technically we can create files just when we start Nix. I’m not sure how that is useful for anything, though.

Waiting for Input

A: Well, let’s make Nix wait a little:

let
  pause = idx: pkgs.stdenv.mkDerivation {
    name = "sleep-${seed}";
    src = pkgs.hello; # just some dummy input
    phases = [ "buildPhase" ];
    buildPhase = ''
      # change idx if used more than once in the same file to wait every time
      # ${toString idx}
      sleep 2
      echo waiting 3 ...
      sleep 2
      echo waiting 2 ...
      sleep 2
      echo waiting 1 ...
      # without this, Nix would consider evaluation to have failed
      mkdir -p $out
    '';
  };
  pause_and = idx: code:
    pkgs.stdenv.mkDerivation {
      name = "pause_and${toString idx}";
      buildInputs = [ (pause idx) ];
      phases = [ "buildPhase" ];
      src = pkgs.writeText "code" code;
      buildPhase = ''
        # ${seed}
        cp $src $out
      '';
    };
in import (pause_and 0 "10")

Using that, we can even write a handy little read_input function:

read_input = idx: name: import (pause_and idx ''
  with import <nixpkgs> {};
  # ${seed}
  lib.readFile
    "/tmp/input-${name}"
'');

B: Why’s the seed variable in there? It doesn’t seem to be doing anything …

A: … except we can bind it to a different value each time we run the build, making sure Nix will re-run everything even if nothing else has changed and idx is the same.

Now we can use it like this:

  seed = toString 0;
in
  read_input 0 "test"
> nix-build &
> sleep 4 && echo "hello from the shell" > /tmp/input-test
[1] 21354
building '/nix/store/07l20zp51ghl88d4xrpr8sw744fpi43r-sleep-0.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/8ibcbgbc4g60xan05xa49qcvncnaz35p-pause_and0.drv'...
building
error: expression does not evaluate to a derivation (or a set or list of those)

[1]+  Exit 1                  nix-build scratch.nix

A: See?

Maximal Sharing

B: Quite. But what happens if you try reading in the same file twice?

A: uhm, well, let’s see …

let seed = toString 1;
in
  (read_input 0 "test") + (read_input 0 "test")
> nix-build
building '/nix/store/k56caykkwwc4dygxvnyq7dsrahz3j7bg-code.drv'...
building '/nix/store/ryh9f942r57ji220ccizscwypgdg8hbj-sleep-2.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/ldz27rinz91sws3zjlbq7grb42zmz6n7-pause_and1.drv'...
building
error: expression does not evaluate to a derivation (or a set or list of those)

Huh. Looks like both read_inputs waited at the same time.

B: Not quite. Nix implements what it calls maximal lazyness: if you do the same thing twice, it’ll only evaluate it once.

A: Huh? But that was what the idx was for! … oh, I forgot to set it to different values each time. My bad.

B: But even if you did set them to different values, which one would be evaluated first?

A: Well, we can make one of them wait extra long:

let seed = toString 2;
in
  print
    ((read_input 0 "test")
     + (pkgs.lib.readFile (pause_and 2 (read_input 1 "test"))))
    4
> nix-build
building '/nix/store/wva39fi81zr0f3hlav0dz129zpvwcyvm-sleep-4.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/3l768hhidqxj4qz3wn4xak170gp69j0v-pause_and0.drv'...
building
building '/nix/store/90d3hggp881kapmlpsbqd506rsb9q4h4-sleep-4.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/gvly9wd9yjlk5bwvy01dhahs2cz8j40d-pause_and1.drv'...
building
building '/nix/store/4fzcwfrhdm3014515frv7mx7ki5wqjvd-sleep-4.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/s661xcgx66igj3qy4haf22k4xj0fqvi7-pause_and2.drv'...
building
these derivations will be built:
  /nix/store/ki59fdjbihnza1zw7ygvp71ygwqlxp4d-print.drv
building '/nix/store/ki59fdjbihnza1zw7ygvp71ygwqlxp4d-print.drv'...
building
hello from the shell
hello from the shell

/nix/store/akxfzsq8wwi85pzidl8b4s7qi4q9p41w-print

B: … aaand it cached the file after the first read.

A: So no reading in the same file twice, I guess.

B: Maximal sharing can be annoying sometimes.

Chaining

A: I guess we could let read_input count how many times it’s been called by passing some sort of state around, and then read from a different file each time, like this:

let
  read_with_state = prompt: oldstate: {
    result = read_input oldstate.state (toString oldstate.state);
    state = oldstate.state + 1;
  };
in
  ...

And then using it we just have to be careful to always pass that state from one call to the next. It’s kinda like chaining them together, dragging a sequential order into the evaluator’s laziness:

let
  result1 = read_with_state "first" {state = 0;};
  first_input = result1.result;

  result2 = read_with_state "second" result1;
  second_input = result2.result;
in
  first_input + second_input
building '/nix/store/kqrc3nsxcm06f9924nxscvs8z2vsk52f-code.drv'...
building '/nix/store/qwhmkg3r6xmjzmf8yk5cp6z7gs1x0afw-sleep-5.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/48njzi1al7qgvrcyv4jy3ln5gg39jj1a-pause_and0.drv'...
building
building '/nix/store/qmvq75cls017pkq0bz1z5whxpyxqrv7q-code.drv'...
building '/nix/store/0gh8r26wjzciljxcnyq89bq5mj8ysll1-sleep-5.drv'...
building
waiting 3 ...
waiting 2 ...
waiting 1 ...
building '/nix/store/8jki0snrr21bh08pvy22afq3xgq3axp7-pause_and1.drv'...
building
"Hello from input 0\nHello from Input 1\n"

B: You know, at this point you can probably stop pretending and just admit that what you’re building is a monad.

An IO Monad

A: A what?

B: That pattern of your read_with_state function – the way it always needs to be called with some extra “state” value from the last time it was used, wrapping its value into an extra structure that we don’t really care about by itself? That pattern is what’s called a monad.

A: Huh.

B: See, we can define an operation that takes two of these functions and chains them together (which is usually called bind). For yours, it would look like this:

bind = monadic: operation:
  monadic // (operation monadic.result) monadic

A: Hang on — what happened to the state attribute? It’s not mentioned in your definition of bind.

B: Right, it isn’t — the assumption here is that monadic is one of these attribute sets which has result and state attributes as above, and operation is a function which takes a result, and then also takes the entire state, and then returns another monadic value with the same attributes. If we had types, we could say this function takes a value of a certain type, a function, and then returns another value of the same type.

A: But do we have types?

B: We don’t.

A: We don’t?

B: We don’t.

A: How annoying.

B: Yes. We’ll have to be very careful – if we do anything wrong, things might just blow up and produce errors after half the evaluation!

A: Aaaaa! Why don’t we have types?

B: Because we don’t.

A: Exactly why?

B: Just because.

A: Okay, I guess we don’t. … anyways, where were we?

B: Explaining the bind function.

A: Ah, right. Why does operation take the result value as a first argument? Seems like we could save an argument by simply passing it the entire attribute set only once, since it already contains the result.

B: In theory, we could — but we’ll see in a moment why it’s more convenient to do it this way.

A: Well, okay.

B: So now we can “bind” two reads together:

let
  double_read = bind
    (v: read_with_state "first")
    (v: read_with_state "second")
in
  double_read {idx = 0;}

Notation

A: What if we want to read in three things?

B: Yeah, uhm …

bind
  (v: bind
    (v: read_with_state "first")
    (v: read_with_state "second"))
  (v: read_with_state "third")
  {idx = 0;}

A: Well that’s awkward.

B: We can do better, though:

let do = operations: monadic:
      pkgs.lib.foldl bind monadic operations;
in
do [
  (v: read_with_state "first")
  (v: read_with_state "second")
  (v: read_with_state "third")
] {idx = 0;}

A: Huh. Looks almost a little like an imperative language now.

B: I guess so.

Monad Laws

A: Hang on a minute. If I look up “monad” with a search engine, there’s all this stuff about how it also needs some function called pure, and obey “monad laws”, and …

B: Ah sorry, I skipped over these. Whoops.

A: So what do these do?

B: pure is very simple: it just takes a value, and returns it wrapped into the monad, like this:

let pure = val: state: state // {value = val;}

In fact, it’s often also called just that: return! (and sometimes also unit)

A: Why would we ever need to do that?

B: We can use it to change the value “inside” our monad – that is, change the value that the next function gets:

do [
  (v: read_with_state "hello")
  (v: pure (v + ", world!"))
]
> nix-build &
> echo "hello" > /tmp/input-0
"hello, world!"

A: And the monad laws?

B: There’s three of them. Each is an equation of two expressions that should evaluate to the same thing:

  1. Left Identity: bind (pure a) h is the same as just h a
  2. Right Identity: bind m pure is the same as just m
  3. Associativity: bind (bind m g) h is the same as bind m (v: bind (g v) h)

A: What happens if one of these three doesn’t hold?

B: Weird things.

A: Such as?

B: Well, mostly things won’t behave as you may expect them to — they’re not all that important here, but imagine we wanted an optimising compiler for Nix: perhaps we could use these laws for optimisations?

A: Perhaps. How do we know the laws are in fact true, though, for our definition of bind?

B: Well, if we had a formal semantics for Nix, we could write all the rules down and then prove from those no matter what the variables in the laws are, the laws will always hold.

A: Actually, we do have a formal semantics for Nix: it’s in Eelco Dolstra’s Thesis.

B: Looks out of date, though. Nix as described there only has the weird old-style let — no wonder, it’s from 2006!

A: Yeah, and I can’t really find a newer version, either.

B: It sounds like a lot of work, anyways.

A: What could we do instead?

B: Squint at the functions for a bit, then shrug, move on, and hope that it’s all right?

A: Sure, let’s do that instead!

Let’s play!

A: What we have so far is still a bit awkward to use. Sure, we can read in many things one after another, but it doesn’t seem like we can handle more than one thing at a time. What if we want to combine two inputs?

B: Yeah, that’s true – we can’t bind new variables on the fly, at least not without interrupting the nice do-way of writing things.

A: Hm, let me think … I’m pretty sure we can tweak the state of our monad a bit to simulate something like variables …

B: … I guess?

A: … it’d also be useful if we had more than just sequences of functions. What about loops, and conditions, and all that stuff? I’m sure we can add that to …

B: … I’m not sure that’s a good idea. Do you that’s even doable?

A: Yep! In fact, I’ve done it already: io.nix. I also wrote a version of tic tac toe in it:

...
{seed}:
with import ./io.nix {s = seed;};
do initialWorld [
  (v: assign "grid" emptyGrid)
  (v: assign "turn" "x")

  (while (v: !(hasWon "x" v.grid) && !(hasWon "o" v.grid))
    (ifThenElse (v: v.turn == "o")
      (v: doMonadic [
        (v: read_input "test")
        (ifThenElse (v: v.grid.${stripWhitespace v.input} == " ")
          (v: doMonadic [
            (v: assign "grid"
              (v.grid // { ${stripWhitespace v.input} = "o"; }))
            (v: print (showGrid v.grid))
            (v: assign "turn" "x")
          ])
          (v: print "this is not a legal move, try again!")
        )
      ])
      (v: doMonadic [
        (v: assign "grid" (v.grid // { ${maximalMove v.grid} = "x"; }))
        (v: print (showGrid v.grid))
        (v: assign "turn" "o")
      ])
    )
  )
  (match (v: gameState v.grid) {
    "draw" = v: print "this game ended in a draw";
    "o" = v: print "o won this game";
    "x" = v: print "x won this game";
  })
]

A: Well, that’s only the Monad-part of it; there’s lots of helper functions. The entire thing is in game.nix.

It actually works, too!

B: sigh

A: Just run nix-build -argstr seed 4 on that, then write all of your moves into the tmp files it tells you to. But make sure to never take more than three seconds to decide your next move, or it’ll fail to read in the next file and everything will crash!

On the plus side, if it does crash, you can restart where you left of by just running the same command again – maximal lazyness shold take care of the rest.

I guess this should count as proof that, in addition to being Turing-complete, Nix is also tic-tac-toe complete?

B: … I don’t know why I put up with you.

Addendum

A: So, does all of this actually mean anything?

B: Apart from that Nix kinda has impure functions, if you squint hard enough?

A: Well, we already knew that before – even without impure things like builtins.fetchGit that don’t require hashes, reading in files from outside the Nix store without requiring hashes will always be impure. It’s just so happens to also be very convenient.

B: I guess so. Nix doesn’t usually come with an I/O monad, though.

A: Yeah, that’s true. Blame this post for the monad, if you like.