One thing we’re missing at the DSNP level (and maybe the Project Liberty level) is an articulation of what we mean when we say “private”. We have a shared sense of privacy values that we as a team have encoded into the work thus far. This thread is a place to make that more explicit and to reach at least a starting, working definition that helps us frame and communicate our work.
Right now, I suspect that @wil and @harry have the high-level view to get this started. We hope to put down some early concepts and then roll that into a Privacy Working Group for refinement.
So, Harry and Wil, when you say “privacy” in connection with DSNP, what do you mean? What’s in scope, what’s not? What are you protecting and from whom? Feel free to answer at any level of granularity.
Note: this thread is an outgrowth of a call we did earlier on content moderation with Braxton, Denise, Scott, TJ, Chris, James, and Karl. The conversation wandered into privacy and we decided the right next step in considering this issue is to do some definitional work in the forum.
@teejayt73 reminded us in the call that this ties into all the work on metrics, monitoring, tracking, etc. It’s not just privacy related to your content we care about. It’s the rest of the data stream too.
Just to consolidate terms where possible, it would be useful to know how/where the concept of “user sovereignty” is related to privacy. I suspect there is a lot of overlap, but having a better understanding of the former (see my earlier message) will help us know the extent to which that is true.
The starting framework we currently have is mostly what is in the whitepaper. That places the focus on two ideas:
That it should be possible to have private user data (such as private changes to one’s social graph) that is stored and propagates through DSNP like public graph changes, but is encrypted and thus a client must request access. (Whitepaper 4.2.2-4.2.6)
Private communications along with a measure of private metadata should be possible, again propagating through DSNP, and again a client must request that the user grant access to read the contents or even to know that a message was sent to them. (Whitepaper 5.4.4 Dead drops covers only 1:1 “direct” messaging)
So #1 above is about private information between a user and their clients, while still being portable data. #2 is about private communications, an extension of the public communications allowed by broadcast messages.
A third area of discussion that didn’t make it into the whitepaper is verified attributes. This is the idea that a DSNP user could have attributes (potentially private or public) that assert that “someone” has verified a piece of information about the user without necessarily revealing the exact data that was proven. The “someone” who verifies would need to be trustworthy, but several ideas about how to build that trust are still developing.
DSNP is a zero-trust communication system (i.e., anyone, including bad actors, can read the messages). The core idea around privacy is that we are taking data that is placed into that zero-trust environment and allowing the user to dictate who can read the contents of the message (and, in the case of dead drops, who the recipient is as well).
And that is (so far) the who-we-are-protecting-from-whom: allowing private data to be placed into a public store without a measurable loss of privacy. (Metadata mandates some small loss of privacy. The fact that data previously unknown to exist is now at least known to possibly exist is an increase in public information. This is one of the important pieces of dead drops. A dead drop doesn’t hide that you sent a message, but it does hide who it was sent to. In fact, that “message” might have been sent to no one at all.)
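To make the dead-drop idea concrete, here is a minimal sketch in Python. Everything here is hypothetical illustration, not DSNP’s actual format: the XOR “cipher” is a toy stand-in for real authenticated encryption, and the names (`drop_address`, `publish`) are invented. The point is only that the public log reveals that the sender posted something, but not to whom (or whether there is a recipient at all).

```python
import hashlib
import hmac
import os

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stand-in for encryption: XOR with a SHA-256-derived keystream.
    Illustration only; a real system would use authenticated encryption."""
    out = bytearray()
    counter = 0
    while len(out) < len(data):
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(x ^ y for x, y in zip(data, out))

def drop_address(shared_secret: bytes, nonce: bytes) -> str:
    """Derive an opaque dead-drop address from a secret the sender and
    recipient share. Observers cannot link the address to the recipient."""
    return hmac.new(shared_secret, b"drop:" + nonce, hashlib.sha256).hexdigest()

public_log = []  # the zero-trust store: everyone can read every entry

def publish(sender: str, nonce: bytes, address: str, blob: bytes) -> None:
    # Visible to all: who posted, an opaque address, an opaque blob.
    # Not visible: the recipient, or whether there even is one.
    public_log.append({"sender": sender, "nonce": nonce,
                       "address": address, "blob": blob})

# Alice and Bob already share a secret (e.g. from a key exchange).
secret = os.urandom(32)
nonce = os.urandom(16)
publish("alice", nonce, drop_address(secret, nonce),
        keystream_xor(secret, b"meet at noon"))

# Bob scans the log: any entry whose address he can re-derive is for him.
mine = [e for e in public_log
        if drop_address(secret, e["nonce"]) == e["address"]]
plaintext = keystream_xor(secret, mine[0]["blob"])
```

Note that the sketch also shows the metadata tradeoff described above: the log entry does record that “alice” posted something, even though the recipient stays hidden.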
(Side note: when discussing privacy it is important to understand the limitations of information. While it is possible to give someone information, it is not possible to remove information from someone. I can ask you to delete some data, but you might have a backup or remember it in your brain. At that point it is a human or legal issue, but not a technological issue.)
Thank you, that all makes sense to me, and going back to the whitepaper is the right move. Thanks for grounding this conversation there. I am trying to figure out how to articulate all that into something coherent we can present to end users. Maybe a way to communicate is to start with the questions people lob at me. Taking your answers above, I’ve tried to apply them below and hope I haven’t made wrong assumptions in doing so.
Who can see my graph and my actions? Your graph is reconstructed from a history of Follow actions. Some of those actions are public, and anyone can see those. You can also do encrypted actions. Only people with the key can see those actions, or even know that the encrypted items are actions at all.
Who can see my messages or attributes? Anyone you (or your client?) has given the key to that message or attribute.
To what degree is there forward secrecy? How often do keys rotate? Are they session keys or long-lived keys?
Do I have a different key for each person in my network? Or one key per conversation no matter how many people are in it? Different keys for different groups of friends? Can I, for example, allow somebody to see only my professional relationships (or a subset of them) and other people to see only my personal relationships?
If I sign in to a new client, how does it get the keys to my friend’s private feed? If it gets a per-client key, how does that affect use of my old client?
Those are all great questions. I’ll see what I can do to answer them:
Correct. Your graph may be different depending on the permissions you have given the client you are viewing it with. For example, while unlikely, it is possible to follow someone publicly, but then unfollow them “privately”. So to non-permissioned clients, or even your friends, it would look like you follow them, but to permissioned clients it would not.
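As a purely hypothetical illustration of that “publicly follow, privately unfollow” scenario (invented structure, not DSNP’s actual message format): a client replays the action log in order, skipping any encrypted action it lacks a key for, so two clients can reconstruct different graphs from the same public data.

```python
# Each logged action is either public or encrypted under some key id.
actions = [
    {"public": True, "action": ("follow", "alice", "bob")},
    # Encrypted action: only clients holding key "k1" can read it.
    {"public": False, "key_id": "k1", "action": ("unfollow", "alice", "bob")},
]

def reconstruct_graph(log, keys):
    """Replay readable actions in order; later actions override earlier ones."""
    follows = set()
    for entry in log:
        if not entry["public"] and entry.get("key_id") not in keys:
            continue  # an opaque blob to this client
        verb, who, whom = entry["action"]
        if verb == "follow":
            follows.add((who, whom))
        elif verb == "unfollow":
            follows.discard((who, whom))
    return follows

# A non-permissioned client sees only the public follow...
print(reconstruct_graph(actions, keys=set()))   # {('alice', 'bob')}
# ...while a permissioned client also applies the private unfollow.
print(reconstruct_graph(actions, keys={"k1"}))  # set()
```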
It depends on what you mean by messages.
Dead drop messages sent TO you are only discoverable by those clients to which you have granted permission.
Dead drop messages sent FROM you are discoverable in that their existence is visible, but not who they are addressed to nor their contents (without permissions).
In both of these cases “permissions” can mean a few things, but usually it means being granted access to an asymmetric private key.
Usually the keys are long-lived (or as long-lived as you want them to be). The keys for private messaging are different from those for the private graph.
Forward secrecy meaning: (Wikipedia)
a feature of specific key agreement protocols that gives assurances that session keys will not be compromised even if long-term secrets used in the session key exchange are compromised
It depends. We haven’t gone into detail, but if you want to be able to read your own messages, this becomes much harder. A compromise of a session key will of course not compromise your long-term keys, but achieving this alongside multi-client readability is possible, though harder. Signal and others get around this by simply not being able to read historical messages. That doesn’t work for us on the whole, although it could be enabled as an option.
In the end we have a general outline of a way to do things, but the details are not settled. Historical readability across non-interactive clients makes forward secrecy difficult without resorting to lots of data duplication or less standard cryptography.
For the private graph, the whitepaper approach (using a symmetric key) does not have any forward secrecy, but we have made some changes to the storage location of the private graph (in a batch instead of on chain), so that might also change how we want to do things.
For messages, we would be using asymmetric keys, so yes. That said, we have not yet gotten past early conversations about private messages with multiple people. Normally Signal, Apple Messages, etc. encrypt the message for each intended recipient (and each recipient’s device). That’s why you often see limits on group size. While that is possible for us to do, we have discussed other options as well.
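To illustrate the per-recipient pattern those systems use (hypothetical names, and a toy XOR “cipher” in place of real hybrid encryption): the message body is encrypted once under a fresh content key, and that content key is then wrapped separately for each recipient. The per-recipient wrapping is exactly the cost that grows with group size.

```python
import os

def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy XOR 'cipher' for illustration only; real systems use proper
    authenticated encryption (e.g. AES-GCM) and public-key key wrapping."""
    return bytes(d ^ key[i % len(key)] for i, d in enumerate(data))

toy_decrypt = toy_encrypt  # XOR is its own inverse

def send_to_group(message: bytes, recipient_keys: dict) -> dict:
    """Encrypt once with a fresh content key, then wrap that key for
    every recipient. The wrapping loop is the group-size cost."""
    content_key = os.urandom(32)
    return {
        "body": toy_encrypt(content_key, message),
        "wrapped_keys": {
            who: toy_encrypt(k, content_key)
            for who, k in recipient_keys.items()
        },
    }

group = {"bob": os.urandom(32), "carol": os.urandom(32)}
envelope = send_to_group(b"hello group", group)

# Carol unwraps her copy of the content key, then reads the body.
content_key = toy_decrypt(group["carol"], envelope["wrapped_keys"]["carol"])
print(toy_decrypt(content_key, envelope["body"]))  # b'hello group'
```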
Sharing of a private graph with non-clients is also still in the discussion phase.
There’s the hard part. You can see several of the issues above deal with this question. Per client keys are possible, but could quickly get out of hand. Also they make it difficult to keep metadata privacy (due to timing attacks). Currently the idea is that they are shared and you rotate when you want to remove a client (or any other time). Still in discussion.
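To sketch the rotate-to-revoke idea mentioned above (purely hypothetical structure, not a DSNP design): think of one shared key per epoch. Removing a client starts a new epoch whose key the revoked client never receives; its cached old keys still read old data, which echoes the earlier point that information, once given, cannot be taken back.

```python
import os

# Hypothetical sketch: one shared key per epoch of the private data.
epoch_keys = [os.urandom(32)]                 # key history, one per epoch
client_keys = {"phone": {0}, "laptop": {0}}   # epochs each client holds keys for

def revoke(client: str) -> None:
    """Start a new epoch the revoked client never sees. Its cached old
    keys still work for old data: information cannot be taken back."""
    epoch_keys.append(os.urandom(32))
    new_epoch = len(epoch_keys) - 1
    for holder in client_keys:
        if holder != client:
            client_keys[holder].add(new_epoch)

def can_read(client: str, epoch: int) -> bool:
    return epoch in client_keys.get(client, set())

revoke("laptop")
print(can_read("phone", 1))    # True: trusted clients get the new key
print(can_read("laptop", 0))   # True: it already held the old key
print(can_read("laptop", 1))   # False: it never receives the new one
```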
Really that’s still the line for most of the privacy pieces: Still in discussion.
Hi @james and all - I’m new to this site and project, kind of stumbled over it and … love it, including your great work (!!!). On privacy: this question might already have been covered elsewhere; if so, just disregard this or point me to your answer … GDPR (grrr). Is this already somehow addressed in your initiative? In general, blockchain projects are not supported by that EU regulation, at least so far. Yet somehow one should tackle that problem early on, given that GDPR might be copied going forward into other legislations (?). An example: GDPR requires operators to be able to extract and delete personal data on request of the user. This applies also to historical data. I understand that with this project data ownership clearly sits with the user (THAT’s the FUTURE :-)), yet on a blockchain per se it is not possible to delete data - it is stored once and forever.
No rush for answering, was just a thought …
Interesting question! I wanted to share some thinking on the value of privacy that we at Smith + Crown had previously put together. Some of it is more relevant for other projects, but I think some of it could be useful for the purposes of this discussion: there are materials here that might offer some useful language and questions that might provoke self-reflection among users, as well as help them refine their moral intuitions on privacy.
As a general note, the Stanford Encyclopedia of Philosophy’s entry on Privacy is a great background resource, and documents different historical conceptions of privacy, analysis of privacy’s value, and arguments concerning to what extent privacy should be weighed against other values across different contexts. Many of the ideas touched on here are covered in more detail there.
Conceptions of Privacy. While the value of privacy is typically acknowledged and affirmed across popular imagination, there often isn’t a clear definition of privacy per se, but rather a sense of being deprived of something valuable by acts of surveillance. We would anticipate conversations on the definition of privacy to continue maturing and evolving, so it can be good to be aware of the philosophical traditions that thinking might end up drawing on.
Privacy as dignity (Tries to articulate what about an invasion of someone’s privacy is an affront to that person’s human dignity. More bound up with positive views on human autonomy, selfhood, and integrity that basic acts of surveillance can violate.)
Privacy as understood in terms of private vs public spheres. (e.g. the private as spaces where particular groups—government, employers, etc—are not permitted to intrude on, or influence; freedom from interference.) (See Aristotle, John Locke, Adam Smith, and others.)
It could be interesting to hear which of these are most in-sync with Liberty’s understanding of privacy, or where it departs in notable aspects from these different ways privacy has been historically understood. [Also, I hope this isn’t too much of a history drop; while I find it fascinating, I also fit the caricature here: xkcd: Privacy Opinions]
Value of Privacy. As might be expected, differences in conception of privacy often correspond with differences in outlook on what is valuable about privacy, how it should be protected or extended, what other values might override one’s right to privacy in contexts, and so on.
Does Liberty see privacy as instrumental to other values: community, intimacy, relationships, self-development, etc.?
In the context of Liberty, what other values are expected to come into conflict with individuals’ interests in private communications, control of personal data, etc?
Are there any scenarios you envision where others’ interest in those communications or data outweighs the individual’s interest in privacy?
Skepticism on Privacy’s Value. While we’re now in a moment where the value of privacy is affirmed, if amorphously, there likely will be a more pointed interrogation of privacy and its limits, especially as projects move from vaguer statements about privacy to concrete mechanics intended to realize the value. It’s worth getting ahead of that moment, and being able to shape the narrative if we’re able to.
Some common lines of doubt or criticism regarding the value of privacy, which one might be interested to hear Liberty respond to, include:
Some of those skeptical about the value of privacy express concern with the appeal to one’s privacy as used to conceal wrongdoing and abuse or to shield oneself from criticism or public scrutiny. What, if anything, does Liberty see as a misuse of privacy?
Liberty proposes to expand people’s powers to make information private. But prevailing inequity in access to the goods of privacy (or the ability to misuse it) might also lead one to lack enthusiasm for such expansion in the absence of information on how it might affect them. Insofar as Liberty will expand such power, are there ways it is doing so that especially benefit those who have previously been deprived of its good?
Lastly, some critics suggest that violations of privacy are better understood as violations of property rights or bodily harm. A reductionist’s recommendation could be to rephrase claims about rights to privacy, harms of surveillance, goods of privacy, etc in terms of property or bodily harm: talking in terms of privacy doesn’t actually help pick out what’s really important in the circumstances it is invoked. How might Liberty reply, i.e. what makes privacy the more informative or otherwise useful concept to focus on for understanding the kinds of values Liberty enables or harms that it mitigates?
And markus, re GDPR and the requirement that users be able to delete data that is under their control:
While it’s true that nothing can be removed from the blockchain itself, most of the data in this protocol lives in off-chain storage, with the blockchain only holding pointers. This serves scalability (obviously) but also people’s ability to delete things. The pointer will still exist, but it will be stale; furthermore, the pointer may itself not be in plaintext form on the public blockchain, but may instead be in a form that can only be decrypted by people (as represented by contracts) authorized by the data owner (also represented by a contract).
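A minimal sketch of that split (hypothetical structure, not DSNP’s actual schema): the “chain” is append-only and holds only content hashes, while the bytes live in a mutable off-chain store that can honor a deletion request, leaving only a stale pointer behind.

```python
import hashlib

chain = []      # append-only: entries here can never be removed
off_chain = {}  # mutable store: hash -> bytes, CAN honor deletion

def publish(data: bytes) -> str:
    """Store the bytes off-chain; immortalize only a hash pointer."""
    pointer = hashlib.sha256(data).hexdigest()
    off_chain[pointer] = data
    chain.append(pointer)
    return pointer

def erase(pointer: str) -> None:
    """GDPR-style deletion: remove the content; the pointer goes stale."""
    off_chain.pop(pointer, None)

p = publish(b"my personal data")
erase(p)
print(p in off_chain)  # False: the data itself is gone
print(p in chain)      # True: only an undecipherable hash remains
```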
hi, if on-chain data bears just stale info / pointers and material info sits off-chain where it can be controlled by the owner/user, I’d guess that would comply even with the rigid GDPR regulation. Thank you very much for answering and helping me learn (!)
Thanks @MVossen for that thoughtful contribution and a very high-level approach to the question.
To some degree, privacy is all of the types of control, dignity, and preconditions you describe. When I started this thread I was thinking about giving users actionable information about what data gets shared with whom and when. That is, I imagined being able to say something like “when you send a private message, only you and your recipient can read it.” The sum of all our statements like that might define spheres of public vs private, but I don’t think we need to pick a philosophical definition. I suspect most of the people working on DSNP right now are, to some degree, crypto nuts and conspiracists. We’re ok with people using DSNP to build systems that meet their preferred privacy requirements and definitions. That is, we’re not trying to fit people into a privacy structure/definition so much as trying to make a flexible structure capable of supporting a lot of different public/private approaches and tradeoffs. Maybe we’re explicitly ducking philosophy, if only because trying to agree on one would be too difficult.
In order to give people that flexible structure, we have to figure out what our base technical guarantees are. @wil’s posts above do that to the degree that the work is done. We have to turn that into legible information that enables users and developers to make informed decisions about how and when to trust the system with data.
The questions you pose about the value of privacy are interesting too. I think we all agree that privacy is attached to other values. And of course, privacy can conflict with other values or even be outweighed by other values. AFAICT, we are trying to avoid making policy about those values. We are deferring those concerns to higher levels of the stack, mostly because those are nuanced, contextual questions. At the protocol level, we don’t have access to the nuance or the context. Clients and sites and users have that stuff and they will have to wrestle with these issues. We won’t have any privileged access to information in the system. In specific cases, clients and sites might have access to decryption keys we don’t, and they will have some hard decisions to make. The question in my mind, is what can we do to help them tackle those issues well?
As far as skepticism on the value of privacy, I wonder who will focus on privacy advocacy. My guess is that DSNP.org is not going to try to convince privacy skeptics. There are a lot of orgs that work on that (including some I support/advise/etc). Project Liberty might spend time on that, I suppose, and I look forward to participating there.
I quite like the question you propose about how we deliver privacy to people who have been deprived of privacy, and how to do this as a specific intention. There are a lot of groups we could identify who are in this category, and they are mostly vulnerable populations. It is worth asking, for example, how our work impacts children or activists or people subject to surveillance from intimate partners. What problems are we solving and causing for these groups when we build this system?
Hi @James, @Wil, @markus, @harry, @kfogel, and fellow contributors,
Good comments from all on this thread (and a great, worthwhile project). @Denise.duncan kindly provided the link to join discussions. Here are some thoughts from an EU perspective.
There are genuine concerns about storing Personally Identifiable Information (PII) – verified attributes – on a blockchain:
PII on a ledger puts the privacy of the users in danger. We are all too familiar with identity data abuse – theft, hijacking, plagiarism – not to mention data breaches and hacking.
It violates current privacy regulation (e.g., GDPR; right to be forgotten). However, this ‘right to erasure’ is not an absolute right. Some reasons for refusing a right to erasure include compliance with a legal obligation.
Also, identity is dynamic (attributes can change over time) and the identity management lifecycle has a distinct segregation of duty / responsibility from any blockchain application.
The DSNP approach that favours pseudonymity – DSNP whitepaper – for Social Identities minimises data protection impact on DSNP. Most of the regulatory compliance obligations remain with identity providers who establish and lifecycle manage verified identity attributes. DSNP can be a consumer of pseudonymous, verified identity that can be relied upon.
The optimal approach, and the least onerous burden on DSNP, is to avoid being considered a ‘data controller’ under GDPR. The legal definition of a ‘data controller’ is ‘a legal person, including a natural person, who determines the purposes and means for processing Personally Identifiable Information (PII)’, e.g., Facebook.
Ideally the person who owns and controls their social profile becomes their own ‘data controller’. If DSNP is not concerned with determining the purposes and means for processing Personally Identifiable Information (PII), then GDPR is unlikely to determine that DSNP is a ‘data controller’.
Federal privacy legislation similar to GDPR may well emerge, so it’s good that DSNP takes a ‘privacy by design’ approach upfront. Re-engineering for privacy and data protection after the fact is not easy. Hope this is helpful.
Thank you for that. I agree we want to avoid being a Data Controller, and it is useful to have a framework to fit this all in. Can you say a little more about where the lines are around “determines the purposes and means for processing” PII? I’m not clear on what that leaves us and what that doesn’t.
Also, as an ecosystem, we can’t avoid GDPR entirely. At some point, services and apps sit on top of DSNP. They’ll have GDPR compliance requirements. Are there things we should think about now that will help those apps and services with compliance?
You are very welcome. Let me expand on your question about where the lines are around “determines the purposes and means for processing” PII. (The 99 articles of GDPR present challenges for us all.)
We think it’s helpful to think about GDPR – EU and UK – in terms of the three main actors it applies to: the Data Subject (you / me and our personal data); the Data Controller, who determines they have a lawful basis to collect / process our personal data; and, optionally, a 3rd-party Data Processor whom a Data Controller may appoint to perform this processing as instructed. The Data Processor does not determine a legitimate business purpose but simply follows instructions. The essence of GDPR is that it places a heavy emphasis on the new legal obligations of Data Controllers to observe the new rights of the individual (the Data Subject).
This abstraction helps to frame things with an initial thought: which of the 3 actors, if any, reflects the role of DSNP? Maybe the role of DSNP is a Data Processor, and the Data Subject and Data Controller roles merge in this new DSNP context, which would be very convenient, e.g., consent to share data is under the control of the Data Subject acting as a Data Controller. We mention this because GDPR aims squarely at our current world, where Data Controllers and Data Subjects are in reality separate and distinct, and where the former is centralised and has many new obligations to serve the latter’s many new rights. Since DSNP is decentralised, it looks like the individual person enjoys both roles. If true – we think so – this will be tremendously advantageous to Project Liberty, as the individual inherits most of the GDPR liabilities.
As you say, the ecosystem cannot avoid GDPR entirely. Even pseudonymised PII can fall under GDPR; it depends on whether a pseudonym can be attributed to a natural person, i.e. can indirectly identify the person. Some parts – app / wallet – of the ecosystem will be burdened by compliance and others less so.
Turning to purposes for processing PII, maybe an insurance industry example illustrates this better. I would like a motor insurance quote and may subsequently enter into a contract. It is necessary for the insurance company (data controller) to collect some personal information from me (name, DOB, location, driving licence and so forth) to fulfil this quote and potential contract. This contract is one of 6 lawful bases for processing PII – a legitimate business purpose: Lawful basis for processing | ICO. Processing of PII is lawful when at least 1 of the 6 conditions applies.
Of course, if I do not proceed with the insurance contract, it is no longer necessary – no business purpose – for the insurance company to retain my personal data, so it must be deleted. This is referred to as storage limitation.
In contrast, @kfogel mentions “off-chain storage, with the blockchain only holding pointers”, which may satisfy GDPR by leaving the off-chain storage to handle the right to erasure, where this is appropriate. This is a good implementation strategy.
@wil also mentions a critical point: “a third area of discussion that didn’t make it into the whitepaper, is verified attributes. This is the idea that a DSNP user could have attributes (potentially private or public) that assert that ‘someone’ has verified a piece of information about the user without necessarily revealing the exact data that was proven. The ‘someone’ who verifies would need to be able to be trusted, but several ideas around how to build that trust are still developing.” We agree with this, and the individual DSNP user having the ability to make their verified attributes private or public is also a good implementation strategy.
We have a Privacy Notice that documents the legitimate business purpose(s) – e-ntitle.® uses consent – for processing PII. Scroll to the page footer at Home (objectsoft.uk), which outlines the general idea of a Privacy Notice. This too can help frame what to consider in terms of compliance and which components of the ecosystem will be accountable for compliance. As I think about this, e-ntitle.® could capture a further consent from the Identity Owner to post / populate verified attributes and an associated digital signature to initialise (launch) a personal profile somewhere in the DSNP ecosystem. It’s worth exploring this. Here is a template Privacy Notice that can help in thinking about the purposes for processing PII: Make your own privacy notice | ICO
Hope this helps to refine where responsibilities for compliance will naturally lie.