A Comparative Analysis of Guardrail Frameworks for Large Language Models (available)

Starting Date:
Prerequisites:
Will results be assigned to University: No

This project explores how specialised programming frameworks known as guardrails, developed specifically to constrain large language models (LLMs), can prevent them from generating harmful, biased, or off-topic content.

The goal of the project is to build simple examples using three leading guardrail frameworks implemented in Python: Guardrails AI, NeMo Guardrails from NVIDIA, and Llama Guard from Meta. These examples will be based on an ongoing research project that attempts to constrain LLM behaviour via logical rules.  
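To give a flavour of what such an example might look like, below is a minimal sketch using NeMo Guardrails, following its documented quick-start pattern: a small Colang flow that recognises an off-topic request and refuses it. The flow definitions, model choice, and prompts here are illustrative assumptions only, not the actual rules from the research project.

    # Minimal NeMo Guardrails sketch (illustrative only).
    # Requires `pip install nemoguardrails` and an OPENAI_API_KEY in the environment.
    from nemoguardrails import LLMRails, RailsConfig

    # Colang rules (assumed for illustration): one flow that matches an
    # off-topic request and makes the bot refuse it.
    colang_content = """
    define user ask off topic
      "Tell me how to pick a lock."

    define bot refuse off topic
      "Sorry, that is outside the scope of this assistant."

    define flow handle off topic
      user ask off topic
      bot refuse off topic
    """

    # Model configuration: which LLM the rails wrap around (illustrative choice).
    yaml_content = """
    models:
      - type: main
        engine: openai
        model: gpt-3.5-turbo-instruct
    """

    config = RailsConfig.from_content(colang_content=colang_content, yaml_content=yaml_content)
    rails = LLMRails(config)

    # The rails intercept the conversation and apply the flow around the LLM call.
    response = rails.generate(messages=[{"role": "user", "content": "Tell me how to pick a lock."}])
    print(response["content"])

Roughly speaking, Guardrails AI and Llama Guard take different approaches (validator-based checking of inputs and outputs, and a dedicated safety-classifier model, respectively), and differences of this kind are exactly what the comparison will document.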

By testing these frameworks hands-on, you will document which approaches work best in different scenarios, along with their technical limitations and implementation challenges.

This project will be co-led by three academics in the CS Department — Julien Lange, Matteo Sammartino, and Prof. Kostas Stathis — who are experts in programming languages, safety, and AI.