Teaching Small LLMs to Reason (ongoing)

Starting Date: June 2024
Prerequisites: Very good Python skills
Will results be assigned to University: Yes

Large Language Models (LLMs) such as GPT-4 are a game-changer for AI. Equipped with hundreds of billions of parameters, and trained on vast amounts of textual data totalling hundreds of terabytes, these models have revolutionised operations across numerous domains. Despite their considerable capabilities, however, their sheer size means that they require substantial computational resources and energy to run, which can put them out of reach of smaller organisations that want to retain control of their operations.

Small Language Models (SLMs) are therefore becoming more attractive for enterprises to develop and deploy, because they offer greater control, for example over fine-tuning for particular domains and over data security. Many SLMs are open source, and their smaller size makes them cheaper to run. Moreover, because they can be tailored to narrower, more specific applications, SLMs are a practical choice for companies that need a language model trained on limited, domain-specific datasets.

This UROP project aims to study how to teach SLMs to reason. But the question of whether LLM-based systems can really reason is not easy to settle. This is because reasoning, insofar as it is founded in formal logic, is content-neutral: the modus ponens rule of inference, for example, is valid whatever the premises are about. However, if we prompt an LLM with “All humans are mortal and Socrates is human therefore”, we are not instructing it to carry out deductive inference. Rather, we are asking it the following question: given the statistical distribution of words in the public corpus, what words are likely to follow the sequence “All humans are mortal and Socrates is human therefore”? A good answer to this would be “Socrates is mortal”.
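
To make this concrete, the sketch below poses the syllogism as a pure next-token prediction problem using the Hugging Face transformers library. This is a minimal illustration, not the lab's setup: GPT-2 is chosen only because it is small and freely available, any open-source SLM could be substituted, and whether the completion actually reads “Socrates is mortal” depends on the model's training distribution.

# A minimal sketch of posing the syllogism as next-token prediction,
# using the Hugging Face transformers library. The model choice (GPT-2)
# is illustrative only; any open-source SLM could be substituted.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "All humans are mortal and Socrates is human therefore"

# do_sample=False requests the greedy (most likely) continuation, which
# is exactly the distributional question described above.
result = generator(prompt, max_new_tokens=10, do_sample=False)
print(result[0]["generated_text"])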

By opting for this project, you will use open-source SLMs on designated tasks studied in the DICE Lab at RHUL, assessing their efficacy in emulating human-like reasoning behaviour. The successful candidate will participate in group meetings within the lab, organised by Prof. Kostas Stathis, Dr Agnieszka Mensfelt and Mr Vince Trenscyeni, who will co-supervise the work.

We anticipate that this project will help the successful candidate gain knowledge and understanding of prompt-engineering techniques and of fine-tuning SLMs for a specific task, as well as improve their programming skills in the project's thematic areas.
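
To give a flavour of the prompt-engineering side, the sketch below contrasts a direct prompt with a few-shot chain-of-thought prompt, in which a worked, step-by-step example is prepended to the question; this is the family of techniques studied for SLMs by Magister et al. under Further reading. The model name and prompts here are illustrative assumptions, not the lab's actual setup.

# A sketch of few-shot chain-of-thought prompting: a worked, step-by-step
# example is prepended so that the model imitates the reasoning pattern.
# The model name is illustrative only.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

direct_prompt = (
    "Q: All metals conduct electricity. Copper is a metal. "
    "Does copper conduct electricity?\nA:"
)

cot_prompt = (
    "Q: All humans are mortal. Socrates is human. Is Socrates mortal?\n"
    "A: All humans are mortal, and Socrates is human, so by modus ponens "
    "Socrates is mortal. The answer is yes.\n"
    "Q: All metals conduct electricity. Copper is a metal. "
    "Does copper conduct electricity?\nA:"
)

# Compare the greedy continuations of the two prompts.
for prompt in (direct_prompt, cot_prompt):
    out = generator(prompt, max_new_tokens=40, do_sample=False)
    print(out[0]["generated_text"])
    print("---")

Small base models frequently fail such prompts even with the worked example, which is precisely the gap that teaching SLMs to reason, for instance by fine-tuning on reasoning traces, aims to close.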

Further reading

Jacob Devlin, Ming-Wei Chang, Kenton Lee, Kristina Toutanova. 2019. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL-HLT (1), 4171–4186.

Murray Shanahan. 2024. Talking about Large Language Models. Commun. ACM 67, 2 (February 2024), 68–79. https://doi.org/10.1145/3624724.

Lucie Charlotte Magister, Jonathan Mallinson, Jakub Adámek, Eric Malmi, Aliaksei Severyn. 2023. Teaching Small Language Models to Reason. ACL (2), 1773–1781.