System Provenance Collection from a Database Server (completed)

Starting Date: June 2018
Prerequisites: Second Year
Will results be assigned to University:

A database server consists of an Operating System (OS) at its core that hosts a database, which is accessible from various services and devices in an enterprise network. The activities observed on the database server are of great importance in demonstrating compliance with data governance policies. A crucial element of such compliance is the set of activities carried out on the OS, which is the core focus of this project.

Data provenance refers to records of the inputs, entities, systems and processes that influence data of interest, providing a historical record of the data and its origins. To provide a holistic view of the provenance of data stored in a database, the provenance of the underlying OS is paramount.

This project will deploy a database server running on a Linux OS. The database will be populated with synthetic data over a period of time. The aim is to understand OS provenance records and how they relate to the activities carried out on the data in the database. The final goal is to show the pattern of provenance records produced by each database activity.
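As a rough illustration of what a "pattern of provenance records" could look like, the sketch below groups records in the style of Linux Audit Framework logs by their event serial number and lists the syscalls issued by one process. The sample records, the `db-watch` key, the file path and the small syscall-number table (x86_64 numbering) are all invented for illustration; they are not results from this project.

```python
import re
from collections import defaultdict

# Hypothetical sample records written in the style of Linux audit logs.
# They are invented for illustration and do not come from a real server.
SAMPLE_LOG = """\
type=SYSCALL msg=audit(1528000000.101:501): arch=c000003e syscall=257 success=yes exit=12 pid=1400 comm="mysqld" exe="/usr/sbin/mysqld" key="db-watch"
type=PATH msg=audit(1528000000.101:501): item=0 name="/var/lib/mysql/shop/orders.ibd" nametype=NORMAL
type=SYSCALL msg=audit(1528000000.250:502): arch=c000003e syscall=18 success=yes exit=4096 pid=1400 comm="mysqld" exe="/usr/sbin/mysqld" key="db-watch"
type=SYSCALL msg=audit(1528000000.300:503): arch=c000003e syscall=75 success=yes exit=0 pid=1400 comm="mysqld" exe="/usr/sbin/mysqld" key="db-watch"
"""

# Record header: type=<TYPE> msg=audit(<timestamp>:<serial>): <fields>
HEADER = re.compile(r'type=(\w+) msg=audit\((\d+\.\d+):(\d+)\): ?(.*)')
# Fields are key=value pairs; values may be double-quoted.
FIELD = re.compile(r'(\w+)=("[^"]*"|\S+)')

# Tiny, assumed subset of the x86_64 syscall table, enough for the sample.
SYSCALL_NAMES = {18: "pwrite64", 75: "fdatasync", 257: "openat"}


def parse_record(line):
    """Parse one audit-style line into a dict, or return None if it does not match."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    rec_type, timestamp, serial, rest = match.groups()
    fields = {key: value.strip('"') for key, value in FIELD.findall(rest)}
    return {"type": rec_type, "time": float(timestamp),
            "serial": int(serial), "fields": fields}


def group_by_event(log_text):
    """Group records that share an event serial number."""
    events = defaultdict(list)
    for line in log_text.splitlines():
        record = parse_record(line)
        if record is not None:
            events[record["serial"]].append(record)
    return dict(events)


def syscall_pattern(events, comm):
    """Return the ordered syscall names issued by the process named `comm`."""
    pattern = []
    for serial in sorted(events):
        for record in events[serial]:
            if record["type"] == "SYSCALL" and record["fields"].get("comm") == comm:
                number = int(record["fields"]["syscall"])
                pattern.append(SYSCALL_NAMES.get(number, str(number)))
    return pattern


events = group_by_event(SAMPLE_LOG)
print(syscall_pattern(events, "mysqld"))
```

In a setup like the one described, such a sequence would be collected while a single known database activity (for example one INSERT) is running, so that the resulting syscall pattern can be attributed to that activity.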

The student should have an interest in, and willingness to learn, basic data provenance, and should have prior knowledge of basic MySQL. Ideally, the student would also be familiar with the C programming language and the Linux OS, especially syscalls and the Linux Audit Framework. Good time management and strong writing skills are also important. We will use Git and LaTeX to write up the results; prior experience of these tools would be helpful but is not required. Even if you do not have all of the skills listed above, if you consider yourself dedicated, passionate, hardworking and willing to learn new skills, we would like to hear from you.

It is intended that, once the implementation is working, it can be used for practical trials. We anticipate that a conference paper based on the implementation and subsequent trials may be submitted for publication; the student would be a co-author of this paper.

As part of the project, you will work with an experienced and dedicated team of researchers who encourage innovative thinking and students taking ownership. You will be given the necessary support throughout the project period, with regular meetings, blackboard sessions and guidance on how to carry out research effectively. This project is part of a much larger EPSRC-funded project, so you will have an opportunity to work on and contribute to a research project with real-world significance and impact. In a previous year's project, the student was a co-inventor on the patent generated from that UROP project and also a co-author on the related research paper.