Provenance Networks
Bringing trust networks to the blockchain. (07.04.2019)
Funding year 2017 / Scholarship Call #12 / Project ID: 2418 / Project: Decentralised Data Provenance based on the Blockchain

The Blockchain is not a silver bullet for every problem and, like every new technology, it brings its own limitations to the table. One such limitation, specifically when working with data provenance, is the threat of duplication. This kind of attack can be explained with a simple example. Imagine a scientist working with provenance-enabled software. The scientist performs some experiments, which produce results. Besides these results, the software also produces provenance data, which is saved to the Blockchain. So far everything is fine, as can be seen in the figure below.

Threat of Duplication 1

However, the scientist is not entirely happy with the results. By manipulating the experiment and rerunning it, the scientist produces a new set of results backed by a new set of provenance data, which is also stored in the Blockchain. This can be seen in the following figure.

Threat of Duplication 2

The scientist has now essentially duplicated the provenance data and hidden one set of provenance data behind another, as can be seen in the next figure.

Threat of Duplication Result

To find the duplication attempt, one would need to crawl the Blockchain for such provenance duplications. Although this is theoretically possible when the provenance data itself is saved in the chain, it becomes very difficult for solutions that use off-chaining and store only a hash in the chain: we cannot tell what data the hash represents, or whether it is provenance data in the first place.
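
To make the crawling idea more concrete, here is a minimal Python sketch. It is not the project's implementation: the record layout (a resource identifier plus a provenance payload) and the names `records` and `find_duplicates` are assumptions made for illustration. It groups on-chain provenance records by resource and flags resources with more than one record, and then shows why the same check fails once only hashes are stored.

```python
import hashlib
from collections import defaultdict

# Hypothetical on-chain records: (resource id, provenance payload).
records = [
    ("experiment-42", "run 1: parameters A"),
    ("experiment-42", "run 2: parameters B"),  # the rerun hiding run 1
    ("experiment-7", "run 1: parameters C"),
]


def find_duplicates(records):
    """Return every resource that has more than one provenance record."""
    by_resource = defaultdict(list)
    for resource_id, payload in records:
        by_resource[resource_id].append(payload)
    return {r: p for r, p in by_resource.items() if len(p) > 1}


duplicates = find_duplicates(records)

# With off-chaining only a digest is stored. The digests carry no resource id
# and no hint that they describe provenance at all, so there is nothing to group by.
off_chain_entries = [
    hashlib.sha256(payload.encode()).hexdigest() for _, payload in records
]
```

Here `duplicates` contains only `experiment-42`, while `off_chain_entries` is just a list of opaque hex strings.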

To solve this issue, and some others, we introduced provenance networks. Provenance networks are inspired by trust networks from the domain of trust propagation. Like trust networks, they are directed weighted graphs, where each node is a smart contract. These smart contracts store provenance information and propagate trust towards other smart contracts in the provenance network; we also call them provenance contracts. We call the edges between provenance contracts links. Each link has a weight, which expresses a trust level, and a direction, which expresses from which contract the trust is propagated towards which other contract.
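
The structure described above can be sketched as a small data structure. This is a plain Python stand-in, not actual smart contract code. The class name `ProvenanceContract`, its fields, and the assumption that trust weights lie between 0 and 1 are all illustrative choices, not taken from the prototype.

```python
from dataclasses import dataclass, field


@dataclass
class ProvenanceContract:
    """Stand-in for one smart contract (node) in a provenance network."""
    name: str
    provenance: list = field(default_factory=list)  # stored provenance records
    links: dict = field(default_factory=dict)       # target contract name -> trust weight

    def add_link(self, target, weight):
        """Add a directed, weighted link from this contract to `target`."""
        # Assumption: trust weights are numbers in [0, 1].
        if not 0.0 <= weight <= 1.0:
            raise ValueError("trust weight must be between 0 and 1")
        self.links[target.name] = weight


university = ProvenanceContract("university")
institute = ProvenanceContract("institute")
institute.provenance.append("dataset-1 created by experiment-42")

# The link is directed: the university propagates trust towards the
# institute, but the institute holds no link back.
university.add_link(institute, 0.9)
```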

Example Provenance Network

The figure above shows an example of a provenance network. The blogger contract in this example "trusts" the contract of the informatics institute, since the institute has some reputation and is also trusted by the TU Wien main contract, which essentially makes it a part of the TU Wien. The informatics institute, however, only "knows" the blogger's provenance contract, since no reliable source propagates any trust towards this particular contract. Propagating trust is important because everybody is able to create new provenance contracts, so malicious users could simply try to replicate contracts or whole parts of the network. Well-known anchor points with a high reputation, like the TU Wien contract in our example, can be used to propagate trust towards correctly behaving contracts. Furthermore, even without any anchor points, users can search the whole provenance network and will always be able to find all the provenance regarding a certain resource. Domain experts will then be able to distinguish the correct provenance data from the duplication attempt.
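
One way to picture how trust could spread from an anchor point is sketched below in Python. The post does not define a propagation rule, so everything here is an assumption: weights are taken to lie in [0, 1], trust along a path is the product of its link weights, a contract keeps the best value over all paths, and a "knows" relation is modelled as a link of weight 0. The contract names follow the example, and "Copycat" is a made-up replica added for illustration.

```python
def propagate_trust(links, anchor):
    """Best trust value per contract reachable from `anchor`.

    Assumed rule: path trust = product of link weights, keep the maximum.
    """
    trust = {anchor: 1.0}
    frontier = [anchor]
    while frontier:
        node = frontier.pop()
        for target, weight in links.get(node, {}).items():
            candidate = trust[node] * weight
            # Only strictly better values are recorded, so cycles terminate
            # and zero-weight ("knows") links propagate nothing.
            if candidate > trust.get(target, 0.0):
                trust[target] = candidate
                frontier.append(target)
    return trust


# Directed links: source contract -> {target contract: weight}.
links = {
    "TU Wien": {"Informatics Institute": 0.9},
    "Blogger": {"Informatics Institute": 0.8},
    "Informatics Institute": {"Blogger": 0.0},  # "knows", but does not trust
    "Copycat": {"Blogger": 1.0},                # replica nobody vouches for
}

trust = propagate_trust(links, "TU Wien")
```

Under these assumptions the institute ends up with a trust value of 0.9 as seen from the TU Wien anchor, while the blogger and the copycat contract receive none, no matter how much trust they hand out themselves.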

Provenance networks also fulfill a second very important task: they allow us to search for provenance information across use cases and domains. Different provenance use cases often have different provenance models, which makes co-operation between them hard. Provenance networks, however, do not depend on one specific provenance model, which allows us to create a common search space across different provenance domains and use cases. For more details on this topic, refer to my master thesis or to the project prototype on GitHub.
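
A search across contracts with different provenance models might look roughly like the following Python sketch. It assumes that every record names the resource it describes, whatever model its payload follows. That shared field, the dictionary layout, and the names `contracts` and `search` are hypothetical, not the prototype's interface.

```python
from collections import deque

# Hypothetical network: each contract has outgoing links and records whose
# payload ("data") follows that contract's own provenance model.
contracts = {
    "lab": {
        "links": ["archive"],
        "records": [
            {"resource": "dataset-1", "model": "model_a",
             "data": {"generated_by": "experiment-42"}},
        ],
    },
    "archive": {
        "links": ["lab"],
        "records": [
            {"resource": "dataset-1", "model": "model_b", "data": "ingested"},
            {"resource": "dataset-2", "model": "model_b", "data": "ingested"},
        ],
    },
}


def search(contracts, start, resource_id):
    """Breadth-first walk over links, collecting records for one resource."""
    seen = {start}
    queue = deque([start])
    hits = []
    while queue:
        name = queue.popleft()
        for record in contracts[name]["records"]:
            # The payload stays opaque; only the resource id is compared.
            if record["resource"] == resource_id:
                hits.append((name, record["model"], record["data"]))
        for target in contracts[name]["links"]:
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return hits


hits = search(contracts, "lab", "dataset-1")
```

The walk never interprets a payload, which is what lets records from `model_a` and `model_b` show up in the same result list.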

Tags:

data provenance provenance networks blockchain

Svetoslav Videnov

I am a software engineering master student at the TU Wien. My research interests are in distributed systems and microservice architectures. I am currently working at the TU Wien as a research assistant in the distributed systems group.

My master thesis aims to combine the advantages of the blockchain with data provenance. The blockchain is a distributed ledger which allows persisting data in an unchangeable way. Data provenance is an approach to tracking what happened to data, which makes it possible to build trust in this data.