Hacking on Isabelle/ML

Resources
Basic Isabelle/ML
How to find things
- Interactive exploring of ML files
- Layout of ML files

This is less a post than a couple of notes to myself; but perhaps they might also be helpful to others when starting out. I may extend or update it later.

Resources

General: along with the official Isabelle Homepage there is isabelle.systems, collecting helpful links to other sites.

Specific to Isabelle/ML:

the Isabelle/ML Cookbook, which while incomplete (with only sporadic updates) is still the best beginner-friendly source to get started with many things
Burkhart Wolff’s My Personal, Ecclectic Isabelle Programming Manual, which has much deeper information for a few topics

Basic Isabelle/ML

This is just a collection of helpful functions so I’ll have a place to look them up once I inevitably forget about them again.

How to execute code?

Useful antiquotations:

ML‹code› executes the ML code; the surrounding theory can be accessed via antiquotations, especially @{context}. Some functions may complain about a “missing local theory context”; use these either from inside an Isabelle command or with local_setup
local_setup‹code› expects the code to have type Proof.context => Proof.context, i.e. to modify the given theory context. Useful for hacking on functions that should eventually be called from an Isabelle command

Printing things

Printing any value that can reasonably be printed:

writeln (@{make_string} ...)

Pretty-printing terms (color-coding of variables depends on context):

Pretty.writeln (Syntax.pretty_term @{context} intr)

What are types?

Types are a simple ADT:

datatype typ =
  Type  of string * typ list |
  TFree of string * sort |
  TVar  of indexname * sort

Type (name, args) is an instance of type name (the name is qualified, e.g. Nat.Nat instead of just Nat). If this type is parameterised, then its arguments go into args
TFree (name, sort) is a type variable (e.g. 'a)
TVar ((name,index), sort) are schematic variables which may occur e.g. in theorems and can be instantiated. Values with the same name but different index are not considered equal.

Additionally, there’s also Term.dummyT (which is really Type ("dummy", [])). This is used to leave the type unspecified. Terms using this type may lead to errors if passed to functions which do not expect them (but usually they are resolved during type inference).

Note that there is no extra builtin function type; a ⇒ b is represented simply as Type ("fun", [a,b]).

What are terms?

Terms are also simply an ADT implementing Isabelle’s take on the lambda calculus:

datatype term =
  Const of string * typ |
  Free of string * typ |
  Var of indexname * typ |
  Bound of int |
  Abs of string * typ * term |
  $ of term * term

Const (name, type) are constants defined outside of the term (this may include function symbols, quantifiers, concrete values like true~/~false, datatype constructors, etc.)
Free (name, type) are free variables (when pretty-printed in jedit, these are printed in blue)
Var ((name,index),type) are schematic variables that can be instantiated, e.g. when they occur in a theorem (usually printed with a leading question mark); values with the same name but different index are not considered equal
Bound index is a variable bound by a lambda in the same term, referred to using a deBruijn-index (dangling indices lead to errors). Note that in this case the variables type and name are still recorded, but as part of the lambda abstraction
Abs (name,type,body) is the expression λname : type. body
a $ b is function application

Truly a lot of ways to represent a variable!

Note that all names must always be qualified by theory name (antiquatations do this automatically — don’t forget to insert theory names when replacing one with a concrete term!).

What are sorts?

There is one important other detail: sorts. These give an extremely simple meta-logic over types: each sort is just a list of strings, and “subsumes” all lists which contain (at least) the same elements.

There is a “top” sort (the empty list), and apart from that sorts are used to implement things like locales (e.g. in a term a < b, the types of a, b should have a sort that contains "Orderings.org").

Most sorts are just ["HOL.Type"]. There is a \<^sort> antiquotation to write them more easily.

How to find things

To find a specific function (especially if it’s a basic, “obvious” function, like some variation of a map/fold), the best way is often to use ripgrep on Isabelle’s src directory with a likely name/type signature.

The best way to find theories/ML files is likewise with fd.

Use bat (or isabat) to browse these quickly; pass -l sml for files ending in .ML to get correct syntax highlighting.

Basic SML functions may be found in contrib/polyml-x.x/src/basis/*; many additional useful (general) functions are in src/Pure/library.ML.

Interactive exploring of ML files

Isabelle/jedit can provide “jump to definition” and similar features for ML (as it does by default for Isabelle theory files), but only does so for files explicitly loaded by some theory.

So to get these features e.g. in inductive.ML, do this:

ML_file ‹~~/src/HOL/Tools/inductive.ML›

(the double tilde is Isabelle’s home directory, value of $ISABELLE_HOME)

Sometimes this leads to errors (e.g. command defined twice); these are safe to ignore.

Layout of ML files

Most ML files in Pure/HOL define a main signature, and everything contained therein is available qualified with that file’s name, but capitalised; e.g. the result type defined in HOL/Tools/inductive.ML is available as Inductive.result.

Sometimes if an ML file is not part of any theory that is in Main; in that case, import the corresponding theory into your own, and that signature will be available.