Correct and Certified Regular Expressions (available)

Starting Date: June 2022 (flexible)
Prerequisites: Good mathematical background; Functional programming skills; Basic Coq programming/proving
Will results be assigned to University: No

In programming language research ideas are explored in an ofte repeated loop. You have an idea then propose the language feature to study, its semantics, and the prove it has the properties you expect (type safety, performance, expressivity, etc.).

In this project, we want to explore such a cycle working with regular expressions as the language (to have a setting that is familiar and very useful), and mechanical proofs in the Coq proof assistant [1,2] to explore a certified implementation.

Regular expressions are a prime example of declarative programming, you define what you want to match and not how it is to be done. In this project, we start with an existing formal specification in Coq of the syntax and semantics of regular expressions [3], and work our way to an efficient and provably correct algorithm.

There are many approaches and ideas that can be explored to implement and certify the matcher, many interesting properties that can be established about our definitions, and lots of interesting extensions to explore [4]. That is to say, there is a lot flexibility to the set up of this plan.

The projects will be jointly supervised by Francisco Ferreira (lead), Julien Lange, and Reuben Rowe.

[1] https://softwarefoundations.cis.upenn.edu/lf-current/toc.html
[2] https://coq.inria.fr/
[3] https://github.com/PaPC-RHUL/regexp
[4] Introduction to Automata Theory, Languages, and Computation, 3rd Edition, Hopcroft, Motwani, and Ullman, Pearson education, 2014