Photo credits: Death To Stocks Pics

In the previous post, we discussed the role of online identities for the Internet, and for the decentralized web in particular, and how these online identities have evolved over time: from centralized, to federated, to human-centric and now self-sovereign. In this post, we will take a deep dive into the concept of self-sovereign identity.

Please note that self-sovereign identity is a fairly new concept, and there is no consensus yet about what it means. Many startups are trying to tackle the problem with different approaches and different philosophies of what makes an identity truly self-sovereign.

In this post I will outline existing definitions of self-sovereign identity, and make a case for why we might be better off ditching the words identity and user and referring to data and individuals instead. This would make it easier for the cause (the principles of self-sovereign identity) to build on existing legal frameworks, which already provide the theoretical foundations of sovereignty and control over our data.

Principles of Self-Sovereign Identity

In 2016 Christopher Allen argued that any self-sovereign identity must meet a series of guiding principles. The principles he outlined are an attempt to ensure that users control their own data which – he argues – is at the heart of self-sovereign identity. The most important principles include the right to:

Access & Control
Access to your own data at all times, meaning that individuals should always be able to easily retrieve all the claims and other data related to their online identity. Hidden data should be avoided, and gatekeepers of data related to our personal identities should be eliminated. Control of one’s personal identity data, with users as the ultimate authority. Furthermore, users must be able to choose celebrity or privacy as they prefer. This doesn’t mean that a user controls all of the claims on their identity: other users may make claims about a user, but these should not be central to the identity itself.

Transparency, Interoperability & Portability
Algorithms governing our identity-related data should be transparent, open source, and independent of any particular architecture. Identity-related data must be long-lived and should preferably last forever, or at least for as long as the user wishes. Information and services about our identity-related data must be transportable. Identities must not be established and controlled by a single third-party entity, since these entities might disappear, regimes may change, or users may move to different jurisdictions. Transportable identities ensure that users remain in control of their identity no matter what, and can also improve persistence over time, since online identities are of little value if they only work for a specific online service, a specific company or a specific country.

Minimization & Consent
Users must agree to the use of their identity-related data and of the claims made about them. Furthermore, when personal data is disclosed, that disclosure should involve the minimum amount of data necessary to accomplish the task at hand.
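To make minimization and consent a little more tangible, here is a minimal Python sketch. It is only an illustration under my own assumptions: the `Person` class, the `grant` and `disclose` methods and the example claims are hypothetical names, and they do not come from Christopher Allen's text or from any existing library.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """Hypothetical holder of an individual's claims and given consent."""
    claims: dict
    # Maps a requesting party to the claim names the person agreed to share.
    consent: dict = field(default_factory=dict)

    def grant(self, requester, claim_names):
        """Record consent for a requester to see the named claims."""
        self.consent.setdefault(requester, set()).update(claim_names)

    def disclose(self, requester, needed):
        """Return only the claims that are needed AND consented to."""
        allowed = self.consent.get(requester, set())
        refused = set(needed) - allowed
        if refused:
            raise PermissionError(f"no consent for: {sorted(refused)}")
        return {name: self.claims[name] for name in needed}


alice = Person(claims={"name": "Alice", "birth_year": 1990, "over_18": True})
alice.grant("wine-shop", {"over_18"})

# The shop learns that Alice is an adult, but not her name or birth year.
print(alice.disclose("wine-shop", ["over_18"]))  # {'over_18': True}
```

Asking for anything beyond the consented claims, for example `alice.disclose("wine-shop", ["birth_year"])`, raises a `PermissionError` in this sketch.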

Individual Privacy vs Protection of the Group
Allen outlines the delicate balance between the right to privacy and the need to disclose certain information for the security of the whole network of people. When there is a conflict between the needs of the identity network and the rights of individual users, the network should err on the side of preserving the freedoms and rights of the individuals over the needs of the network.

Allen warns that these principles can be a double-edged sword, usable for both beneficial and maleficent purposes, and concludes that an identity system must balance transparency, fairness, and support of the common interests of a group, while at the same time guaranteeing protection for the individual.

Initiatives, Standards & Technologies

In order to achieve the goals outlined by Christopher Allen, and by other individuals before and after him, these guiding principles have been incorporated into a series of initiatives and working groups. In these, identity authentication occurs through independent algorithms that are censorship-resistant, force-resilient, and run in a decentralized manner. By decoupling the data layer from the verification layer, we can take power away from the traditional third parties that we needed in the past to verify the credentials of individuals.
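One way to picture this decoupling is hash anchoring: the claim itself stays with the individual, while only a digest of it is published where anyone can check it. The sketch below is a simplified illustration under my own assumptions, not the design of any of the initiatives mentioned in this post. `PublicRegistry` is a hypothetical stand-in for a ledger, and real systems would add issuer signatures and salting, since a bare hash of a low-entropy claim can be guessed.

```python
import hashlib
import json


def fingerprint(claim: dict) -> str:
    """Deterministic SHA-256 digest of a claim (keys sorted for stability)."""
    encoded = json.dumps(claim, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


class PublicRegistry:
    """Hypothetical verification layer: stores digests only, never the data."""

    def __init__(self):
        self._anchors = set()

    def anchor(self, digest: str) -> None:
        self._anchors.add(digest)

    def is_anchored(self, digest: str) -> bool:
        return digest in self._anchors


# Data layer: the claim stays with the individual.
claim = {"subject": "alice", "degree": "MSc", "issuer": "example-university"}

registry = PublicRegistry()
registry.anchor(fingerprint(claim))  # done once, by the issuer

# Later, a verifier checks a presented claim without asking the issuer.
print(registry.is_anchored(fingerprint(claim)))     # True
tampered = dict(claim, degree="PhD")
print(registry.is_anchored(fingerprint(tampered)))  # False
```

The registry never sees the degree or the name, yet a verifier can still detect that the tampered claim was never anchored.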

The aim of all these initiatives is to come up with new data architectures and international open standards that either decouple the data layer (the set of personal data, and the claims made by and about individuals) from the verification layer (verification by third parties of whether the claims made are true), or grant control over the privacy vs celebrity of one’s personal data.

  • Rebooting the Web of Trust
    Web of Trust refers to a model of decentralized identity that dates back almost twenty-five years to the PGP initiative. The aim is to reboot this old idea for a new (blockchain) ecosystem. Today, the term web of trust is used more widely to include self-sovereign identity authentication & verification, certificate validation, and reputation assessment.
  • W3C working group on verifiable claims
    Is working on the standardisation of claims and verifications for web identity. The mission of the group is to make expressing, exchanging, and verifying claims that individuals have made about themselves easier and more secure, while keeping them trustworthy.
  • W3C initiative of Decentralized Identifiers (DIDs)
    Is working on decentralized identifiers (DIDs) as a new type of identifier intended for verifiable “self-sovereign” digital identity, that is, an identity fully under the control of an entity and not dependent on a centralized registry, identity provider, or certificate authority.
  • Older initiatives dating back to 2005, like Tim Berners-Lee’s Solid (Social Linked Data) and WebIDs.
  • Furthermore, there is a set of initiatives working on privacy-centric technologies that aim to give users the option to choose between full disclosure or privacy, like zero-knowledge proofs as implemented by Zcash, or ring signatures as implemented by Monero, or secure multi-party computation as explained in this talk.
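To give a feel for what the decentralized identifiers listed above look like in practice, the following Python sketch parses an identifier of the form did:<method>:<method-specific-id>. Treat it as an assumption-laden illustration: the pattern is my own simplification of the syntax the W3C work describes, and `parse_did` and the `DID` tuple are hypothetical names rather than part of any official library.

```python
import re
from typing import NamedTuple

# Simplified pattern (an assumption, not the full W3C grammar):
# a lowercase method name, then an identifier that may contain ':' separators
# but must not end with one.
DID_PATTERN = re.compile(r"did:([a-z0-9]+):([A-Za-z0-9._:%-]*[A-Za-z0-9._%-])")


class DID(NamedTuple):
    method: str
    identifier: str


def parse_did(text: str) -> DID:
    """Split a DID string into its method and method-specific identifier."""
    match = DID_PATTERN.fullmatch(text)
    if match is None:
        raise ValueError(f"not a valid DID: {text!r}")
    return DID(method=match.group(1), identifier=match.group(2))


did = parse_did("did:example:123456789abcdefghi")
print(did.method)      # example
print(did.identifier)  # 123456789abcdefghi
```

The method name tells a resolver which network or ledger to consult, which is what removes the need for a single central registry.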

While I think that the principles outlined by Christopher Allen, and all the emerging initiatives, standards and technologies, are valid, I would like to advocate for a new narrative, which I will briefly outline:


Identity Data vs. Personal Data

Allen reduced the discussion to identity-related personal data: the data a company collects to identify you, and that you need in order to log in to its services. While identity data is a subset of all personal data, limiting the discussion to the term identity is no longer relevant in a world of big data, where everything we do – aka our whole digital footprint – is connected to our digital identity.

Identity as a term is outdated and too narrow. It reflects the traditional and suppressive notion that digital identity exists only hand in hand with particular online services or within the boundaries of certain nation states, and that individuals exist only as users of the web or as citizens of said nation states.
Therefore I suggest that we move beyond the term identity and start talking about self-sovereign data instead of self-sovereign identity.


Users vs People

In his definitions, Christopher Allen always refers to users which – in my opinion – is an abstract word for people or citizens. Abstracting the discussion to users is technocratic and deflects from the fact that we, as humans or citizens of certain countries, have fought for the constitutional right to sovereignty and privacy. This right already exists and is also applicable in the digital world. We don’t have to reinvent the wheel of existing legal frameworks and write an independent Magna Carta for personal data.

Furthermore, in my discussions with the people who have contributed to this post, controversy came up around the terms identity, data and self-sovereign identity. It has become clear that we are talking about a very sensitive set of data, where the lines are often very fine, and that we need much more discussion and awareness around the topic. Joachim Lohkamp, for example, argues: “I was very much inspired by Moxie Marlinspike and Christopher Allen’s visions of self-sovereign identity, yet I think we need to take it to a point where people fully own their identity as a representation of a fundamental human right. Self-sovereign identity as a concept is connected to all sorts of data that might or might not shape the identity in the form of attributes. While data can be copied and shared, the actual identity should be unique and irrevocable.”


Let’s specify existing laws, instead of reinventing the wheel!

While a Magna Carta for digital rights is important and valid, we can use existing legal frameworks: the Universal Declaration of Human Rights and the Fundamental Rights (Germany), for example, already grant the right to sovereignty and the right to privacy, just like the constitutions of many other modern democratic nation states.
However, it is true that many of these existing fundamental rights only grant theoretical and abstract rights which – in many cases – have not been fully reformulated for the digital post-Internet era. In a society where many people like to stick to the letter of the law rather than the spirit of the law, we probably need more explicit regulation to specify fundamental human rights in a more and more data-driven world.
This is already happening, albeit slowly. Internationally, some countries have started pushing towards progressive data protection regulation, granting the individual more explicit sovereignty over their personal data and digital trail. Slowly, more explicit legal frameworks are emerging that address the realities of a post-Internet era and close the gap left by outdated constitutional rights that were defined in the 18th century. The most promising regulation addressing these issues is the European General Data Protection Regulation (GDPR), which will come into effect mid-2018.

However, the GDPR takes for granted centralised models of digital data storage and transmission that are now in the process of being replaced by newer ones based on distributed ledger technologies, most prominently blockchains, which could grant more privacy by design if the data architectures are designed accordingly. State of the art blockchains like Bitcoin, Ethereum and the like are currently not anonymous, but rather pseudonymous. This allows anyone to run big data analytics on the publicly available transaction data, for example through a blockchain explorer, and such blockchains can therefore become a technology of control.
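The difference between pseudonymous and anonymous is easy to underestimate, so here is a small Python sketch of what any observer can do with a public ledger. The ledger entries and the `profile_by_address` function are invented for illustration and do not use the data format of Bitcoin, Ethereum or any real blockchain explorer.

```python
from collections import defaultdict

# Hypothetical public ledger entries: (sender address, receiver address, amount)
ledger = [
    ("addr_A", "addr_shop", 5),
    ("addr_A", "addr_pharmacy", 2),
    ("addr_B", "addr_shop", 1),
    ("addr_A", "addr_casino", 9),
]


def profile_by_address(entries):
    """Group every payment by its sender address, as any observer could."""
    profiles = defaultdict(list)
    for sender, receiver, amount in entries:
        profiles[sender].append((receiver, amount))
    return dict(profiles)


profiles = profile_by_address(ledger)
print(profiles["addr_A"])
# [('addr_shop', 5), ('addr_pharmacy', 2), ('addr_casino', 9)]
```

No name appears anywhere in the data, but as soon as addr_A is linked to a real person once, for instance through a delivery address, the whole spending history attached to that pseudonym is exposed.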


We should start talking about “personal data” and “people”, instead of “identity” and “users”. Furthermore, it would make sense to base the discussion around self-sovereign data on a human rights and privacy point of view, referring to existing democratic legal frameworks that already grant sovereignty and the right to privacy.
It would be more productive for advocates of self-sovereign data to base their claims on existing national and international legislation around fundamental human and citizen rights, in order to make a more impactful case for new technical solutions and a better regulatory framework.

We also need a more interdisciplinary discussion of what sovereignty and privacy mean in a data-driven world where the marginal cost of data collection and data mining is approaching zero, and where state of the art public blockchains have effectively crowdsourced data mining.

Thanks to Joachim Lohkamp, Charleen Fei, Valentin Kalinov, Markus Sabadello, Elad Verbin for their valuable feedback.