Blockchain to provide Data Provenance Integrity and Privacy

Starting Date: June 2017
Duration: 3 Months
Time commitment: 20h/week
Prerequisites: Second year

Blockchain, the technology underlying cryptocurrencies, can be viewed as a shared, semi-shared or private immutable ledger for recording a sequence of events or a history of transactions. Blockchain technology can be deployed to provide a high degree of trust, accountability and transparency for a set of transactions or events, especially log files and data provenance records.

Data provenance is the field of recording the history of data, from its inception through the various stages of the data lifecycle. Data provenance provides a detailed picture of how a data item was collected, where it was stored and how it was used. Such information can be useful for data auditing and for understanding whether an organisation is following its own stated data privacy policy.

The integrity of log files and data provenance records is of paramount importance for establishing the trustworthiness of such mechanisms; its foundation is an indelible proof that the collected records have not been altered or corrupted after collection. This project will look into blockchain technologies and implement an efficient integrity mechanism, along with an integrity validation/attestation mechanism that a third party can verify independently. This would enable third parties to ascertain how far the data provenance can be trusted, which can lead to accountability and transparency of data management activities.
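
As an illustration only, the kind of integrity mechanism described above can be sketched as a simple hash chain over provenance records, written here in Python. This is not the project's design: the record fields, the function names (entry_hash, append, verify) and the all-zero genesis value are assumptions made for the example. Each entry stores the SHA-256 hash of its own record combined with the hash of the previous entry, so changing any earlier record invalidates every later hash.

```python
import hashlib
import json

# Assumed placeholder "previous hash" for the first entry in the chain.
GENESIS = "0" * 64


def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    # sort_keys gives a deterministic serialisation of the record.
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(chain, record):
    """Append a provenance record, linking it to the current chain head."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "record": record,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, record),
    })


def verify(chain):
    """Recompute every link; return False if any entry was altered."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True
```

A third party holding a copy of the chain can call verify(chain) without trusting whoever collected the records. Note that a hash chain on its own only detects partial edits: someone who controls the whole log could recompute every hash. The independent assurance comes from anchoring the latest hash somewhere that party cannot rewrite, which is the role a blockchain would play in this project.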

The student should have an interest in, and a willingness to use, basic cryptography, and ideally would have prior knowledge of basic integrity mechanisms (e.g., hash functions, blockchains). Ideally, the student would also have a firm grasp of C, C++, Java or C#, and be willing to experiment with operating systems if required. Good time-management, communication and self-organisation skills, the ability to work as a self-starter, and strong writing skills are expected. We would use git and LaTeX to write up the results; prior experience of these would be helpful but not required.

Once the implementation is working and benchmarked, we anticipate submitting a conference paper based on the implementation and its benchmarking; the author of the code would be a co-author of this paper.