Visitors for Generalized Algebraic Data Types in OCaml (completed)

Starting Date: June 2022
Prerequisites: Familiarity with OCaml, or another functional programming language (e.g. F#, Haskell, etc.), or a willingness and abilty to learn.
Will results be assigned to University: No

Algebraic datatypes (ADTs) [1] are the basic method for defining how to build complex data values in functional programming languages: they specify both the basic, or atomic, values, as well as the constructors for building larger values out of smaller ones. For example, the following ADT defines generic lists that contain values of type 'a in the functional programming language OCaml.

type 'a list =
| Nil
| Cons of 'a * 'a list

When we want to define functions that process values of these types, we write functions that match on the structure of the value, as it is defined by the data type. Visitors are a way of automatically examining data type definitions, and generating code that does this matching. Thus, they make programming easier by allowing us to specify just the “interesting” code that processes the relevant parts of data values, and not having to write tedious “boilerplate” code.

OCaml has a library for generating such visitors [2], which is being widely used in many different projects. But it has its limitations: it cannot generate visitors for definitions of generalized algebraic data types.

Generalized algebraic data types (GADTs) [3] are an extension of algebraic data types to allow type constraints to be placed on the arguments that the data type’s constructor can accept. In this way, they allow us to build more specialised data values, and also allow code using these types to be simplified.

For example, using the following ADTs

type zero = Z
type 'a succ = S of 'a

the following OCaml GADT defines lists whose type 'b indicates exactly how many 'a elements it contains.

type ('a, 'b) sized_list =
| Nil : ('a, Z) sized_list
| Cons : 'a * ('a, 'b) sized_list -> ('a, 'b succ) sized_list

This project will extend the existing OCaml visitors library to be able to generate visitor code for GADTs, which are not currently handled. This is an active research question, and could potentially lead to a paper that we would submit to the top-tier conference ICFP (International Conference on Functional Programming).

[1] https://en.wikipedia.org/wiki/Algebraic_data_type
[2] https://gitlab.inria.fr/fpottier/visitors
[3] https://en.wikipedia.org/wiki/Generalized_algebraic_data_type