Facebook, Google+, the Internet, identities...

Who are you?

The question's a pretty old one, of course, and the basis of rather a lot of philosophy. It's a particularly interesting question in the context of the internet, too, and a good way into a rather messy ball of Stuff I've been trying to fit together lately.

On the Internet, I'm usually AdamW. Except when someone else is AdamW, in which case I'm normally AdamWill. Sometimes I'm even awilliam. I'm also happyassassin.net, and awilliam at redhat dot com, and adamwill at fedoraproject dot org, and a few other email addresses. Several of these identities are tied in various ways to other aspects of my identity - my public identity, anyway. Rather more securely, I am 5F6CD707 and C1365CF0, though I don't often use those identities and a lot of things that ultimately track back to me aren't associated with them, so there's rather a disconnect between them and my other identities.

Some of these identities are linked to certain Internet services, many of which would be pleased to effectively take over control of those identities. Or, as they'd probably put it, look after all the boring technical stuff so I can worry about creating Exciting Content. Two of these that are particularly en vogue lately are Facebook and Google+. Both of these scare the hell out of me.

This Internet thing has been around for a long time, now. For most of that time, it's been run by engineers. This has resulted in quite a lot of its flaws: its organization and administration manage the neat trick of being Byzantine and ramshackle at the same time. Things rarely get Done very quickly; engineers have always preferred to nail everything down and make it proof against earthquakes, floods, alien invasions and plagues before letting anyone at it. But it's also responsible for all of the fantastic things about the internet. The internet is remarkably earthquake-, flood-, alien-invasion- and plague-proof; the fact that very smart engineers worry publicly about DNS server security and IPv4 exhaustion and so on is really a part of this. There is still, thankfully, a network of smart dedicated people making sure the technical foundations of the internet continue to work smoothly and properly. That's how it's got as far as it has.

Back in the Good Old Days, there was a pretty tight coupling between the internet's underlying technical nature and its dominant practical uses, the ways people interacted with it; the latter kind of grew naturally from the former. Your primary identity on the internet was probably your email address, of which you likely had one or two. It was likely provided directly by the entity which provided your internet access: your ISP, your educational institution, your employer. The fundamental nature of the internet was (and still is) a vast network of networks connected by common standards and protocols, and the way you interacted with it reflected this; you were a node of that network, and your interactions generally went out through the great tree view. A typical internet interaction would be you as a member of Institution A sending an email to a member of Institution B. Your email would seem to originate from your email address - soandso@institutiona.net - and would go to soandso@institutionb.net . You'd probably actually be connected through Institution A's network when you sent it. It would go out through Institution A's mail servers, travel across various other networks according to the common protocols, and wind up on Institution B's mail servers. Your correspondent would log in through Institution B's network and retrieve the mail.

This all seems pretty tedious and obvious, but I'm working towards something here, believe me! If you think about it, things have changed quite a lot in today's internet. If I had to summarize this change, I'd say that the dominant uses of the internet are becoming more and more divorced from its fundamental design and structure. Imagine a contemporary version of the interaction above. We're in a transitional phase, so it still might happen precisely as described; but it's equally, or more, likely that the networks of Institution A and Institution B would never be involved. The interaction might take place within a single network: most likely Facebook's, or Google's. Both parties might be connected to these networks in any number of ways. They might be on the network of Institution A or Institution B, but they're as likely or more likely to be on a home connection, or a cellular one.

In the Good Old Days, you may well not have had access to your internet identity - your email address, as we're simplifying it here - from anywhere outside of the institution which provided that identity. Your educational institution, ISP or whatever might not have been set up to allow you to log in from outside its own network. Over time, institutions have tended to make this possible. This was the first step in abstracting our internet identities from the organizations that provided them. But we were still tied to these institutions in several ways; we relied on their networks for our internet actions even if the last step of our connection was not through them at that time.

A much bigger step - probably the biggest - in abstracting our internet identities away from the institutions which form the basis of the internet's design was the development of IM networks and widely-used third-party mail systems: Hotmail, Yahoo and Gmail.

These reduce the provider of our connection to the internet to the status of an incidental detail. It doesn't matter if we're connected from home, from work, from school, from the road, as our internet identity is no longer tied to this connection. We could be soandso@hotmail.com however we connected to Hotmail.

This is a reflection of one of the fundamental strengths of the Internet: the common protocols and standards that bind it together make this possible. It has clear and substantial practical advantages, which is why such services are so popular and have to a large extent displaced traditional institutional identities; fewer and fewer people use their ISP-provided, school-provided or even work-provided email accounts (though the effect is slower and smaller for employers, as they have a clear interest in requiring staff to use their employer-provided identities for employment-related internet actions). Instant messaging contributed to the same effect: it is a system for communicating, the fundamental practical task of the internet, which uses identities that are divorced from the institutions that provide our access to the internet.

More recently, social networking services (not sites) have become popular: Friendster, then MySpace, then Facebook, and now Google+. From where we're currently standing, these look very similar to third-party email providers and IM networks. One of their fundamental properties is that they provide a mechanism for interacting with others which is abstracted from the providers of our internet connections. Indeed, I'd argue that this is really all they do, in essence. The key property of Hotmail, Gmail, Friendster, MySpace, Facebook, Twitter, and Google+ is the same one: all of these services facilitate communication between parties, and establish a system of identities to facilitate this. Identity is key to most communication.

Those who champion the tradition of anonymity on the internet might argue this point, and it is not a perfect one. Some communication works without identity. A great work of philosophy is a great work of philosophy no matter who wrote it; it can be published anonymously without losing its significance or value. But I'd argue the principle is true more often than it isn't. Most obviously it is true of interactive communication; a communication which is not a single transmission, written to be generally applicable, but communication which is a serial interchange between multiple parties. Such a communication is meaningless without identification of the parties; how can you have a conversation with someone without the assurance that it is the same person with whom you are conversing all along? If IM systems sent each message you wrote to a random recipient, they would be rather less popular. In practice, even the 'anonymity' on the Internet is rarely true anonymity; more often, when we speak of 'anonymity' on the internet, we are really referring to arbitrary constructed identities. Very few internet communications have ever been truly anonymous, but they have often been between constructed identities. Up till now, on the Internet, you have always been able to be fratboy22 if you wanted to be one day, and philosophygal54 the next day if you wanted. But you would not begin a communication as fratboy22 and end it as philosophygal54. If you started a communication as fratboy22 you would continue to use that identity so that others would know they were talking to the same entity all along. They may know nothing else about that identity, but they - and the communication - benefit from the consistency of it.

So, we have established the importance and significance of identity to internet communications. Let's go and take a look at the abstraction of identities again. We mentioned that this abstraction had practical benefits. What were these? They broadened our access to our identities. The fact that we had identities that were no longer tied to the institutions that provided our internet access allowed us more flexibility in accessing those identities.

What's another way to look at that? Well, let's put it negatively: the drawback of institutionally-associated identities is that they restrict our ability to use these identities. Why is this? Because the institution controls the identity. Let's look at our earlier example: that of soandso@institutiona.net. Institution A has a significant stake in control of this identity. Whether Institution A or Soandso owns the identity is a fascinating question in legal, moral and philosophical terms, and will become interesting (vital) again later in this post, but let's hand-wave it for now. In practical terms, Institution A exerts significant control over that identity. If Soandso's relationship with Institution A ends, that identity may well end too; or to be more strict about it, Soandso's access to that identity may be restricted or curtailed. If you switch ISPs, you may well lose your email account from the first ISP, unless you continue to pay them to maintain it. If you finish school, you might lose your school email address.

So we've identified the fundamental weakness of institutionally-provided internet identities: though the identity is associated with the person who uses it to communicate, it is under the ultimate control of the institution which provides it. This is a disconnect which causes unpleasant practical consequences. We might not see this whole theoretical picture, but we see its practical effects: we want an identity without these practical limitations.

Third-party provided identities seem to provide this. Indeed, they inarguably do mitigate many of the practical negative effects of institutionally-provided internet identities discussed above, which explains their popularity. Our internet identity is no longer tied to the institution which provides our internet access. We can switch ISPs, switch schools, switch employers, and switch cellphone providers, and yet still maintain a consistent internet identity.

In theoretical terms, too, we can argue that third-party identity provision is an improvement. We could say that we want our internet identities to map to - be associated with - ourselves as people, as individual conscious entities. This is how we tend to use them. As discussed above, we might construct multiple identities, but we consider them to be associated with ourselves - the entity who creates the identity fratboy22 might also have the identity John Smith, and might want to present the two identities differently to others, but that entity considers itself to be the owner of those two identities. This is the mapping that entity wants. In the Good Old Days model, that is not the mapping. The identity is mapped to the institution that provides it, not the entity that uses it to communicate.

Looked at in this way, the new model is undeniably an improvement. It does get us closer to a situation in which our internet identity maps the way we want: in which it is associated with us as conscious entities.

Two thousand words, and here comes the but: but the new model is deficient in other ways - ways which may come to be more significant than the deficiencies of the old model.

We could start, though, by observing that the major deficiency of the new model is the same as the major deficiency of the old model. The identities provided by MSN, MySpace, Facebook, Twitter, Google+ and so on are not controlled entirely by the entities which use them to communicate. They are also controlled by the providers. The identity 'Adam Williamson' on Facebook is controlled by Facebook, as well as by the conscious entity which refers to itself by the identity 'Adam Williamson' in other contexts. I would argue that, in effect, the identity 'Adam Williamson' on Facebook is owned by Facebook, which struck an agreement with the conscious entity that refers to itself as 'Adam Williamson' to gain access to the identity 'Adam Williamson' on Facebook. I agreed to a legally-binding contract with Facebook in which they grant me the ability to use the identity 'Adam Williamson on Facebook' for communication, and a degree of assurance that most other entities will not be able to use that identity in most cases. In exchange, I grant them several things, many of them significant - such as rights to use the communications I carry out using that identity in certain ways.

We can, of course, obfuscate all of this with dull practical considerations like 'well, it costs Facebook money to let you use all those pretty services! They gotta make it back somehow!' Sure. But that's kind of a sidetrack, or if you prefer, a higher-level consideration. In this post, I'm trying to engage with something more fundamental, and increasingly, I'm of the opinion that it all comes down to control of identity in the end.

To clarify this, let's do a quick thought experiment on the side. Imagine if Facebook didn't control the identities used on Facebook. Imagine if we all signed into it via OpenID (and Facebook's service agreement didn't attempt to assert any control over the identity we used to communicate on Facebook). Then the relationship would be more on the level discussed above. Facebook would be providing services - transfer of data and so on - and we would be making concessions to Facebook in exchange for those services. Our identities would be a separate question. To me, this is a much more palatable scenario. The identity is the key.

The truly scary thing about Facebook and even more (for me) about Google+ is this conflation of identity and service provision; particularly in the context of the above discussion, where we established how important identity is. To me, the question of what entities we choose to provide services is interesting, important, and to be considered carefully, but it's rarely vital, or non-reversible. The question of how we choose to establish identity is on a different level; it's possibly the single most significant question we have to answer.

This isn't a new thought, of course. Many have seen this picture before. It's probably what led to the establishment of OpenID in the first place. But I wanted to think it through and nail down why I'm so extremely reluctant to turn anything significant that I do over to Google, Facebook or any of the other organizations that are pleased to work as hard as they can to conflate my identity and my communications, and hence take a significant stake in controlling both.

It's scary on a personal level, for obvious reasons. It's scary on a wider scale through the trivial observation that what's scary for any one internet user is scary for all of them. But it's also a significant threat to the fundamental nature of the internet itself. If identity is so important - and increasingly important - to what we do on the internet, then the more a small group of companies succeeds in establishing their status as the providers and arbiters of internet identities, the more the internet itself comes to look different. On the physical, nuts-and-bolts level it might still look like a big network of networks that interact through common, neutral protocols; but if all our practical use of the internet is divorced from that structure and tied up in the identities controlled by a small group of companies, to what extent can we really describe the internet as a system as being described by that physical design any more? Wouldn't a more accurate picture of the internet as a system be centered around the identities provided and controlled by that small set of companies, and hence the networks controlled by those companies? And isn't that a huge, and worrying, change from the picture of the internet we started with?

I think so. And I'm deeply worried about the long-term consequences of tying up our communications in the identities and networks owned by this small group of companies.

Let's move forward and look at what we're often pleased to refer to as 'the real world' - communications and interactions that take place outside of the internet. How do we handle identity here?

It would be tempting to observe that identity is less important in the real world. We could say that we can do many things in the real world which don't depend on identity. No-one asks us for our ID when we go out to buy a coffee, or when we stop to talk to a neighbour in the street. But this is not really the case. It's more accurate to say that identity is less problematic in these 'real world' interactions. Another thought experiment: imagine if you woke up every morning looking and sounding completely different from how you had the day before. This would be an excellent way to make identity more of a problem in the real world; we would then need to think of some way of knowing who that person on the street was. Even in a coffee shop, if we all changed appearance every second, things would become unmanageable. Many of our real world interactions rely on identity as much as internet interactions do; it just happens that it's much easier for us to provide, and for others to validate, an identity that's sufficient to the task at hand in most cases. We don't change appearance every morning, so we recognize our neighbour on the street. They provide an identity just by looking the way they did the day before, and we validate that identity through the magic of sight and memory. The same holds true in the coffee shop; the person who orders the coffee, the person who pays for it, and the person who picks it up all look the same. The coffee shop staff don't have much trouble validating that the entity that performs all three operations is the same one. We don't think about the problem of identity in these situations, but we're solving it all the same.

There are still cases where we need to establish an identity in a less intuitive way, though, and we've come up with lots of mechanisms for that. What are they? Well, at first you might say that your name is your identity, but really, it's often more complex than that.

Let's say we set up an account with a mail-order store. We provide them with a name and address. Is that the identity? It isn't, exactly; the name is a signifier, and it may act as an access token. But the identity is likely to be the account number the store assigns to the file. We can change the name and the address on the account, if we are able to convince the store that it should do this. We could change the address by satisfying the store that we are the same entity that opened the account, or some other entity which should have the ability to change the information associated with the account. The store might have other accounts under the same name, and even other accounts under the same address, if we share our living space with some other entity. The store probably doesn't even try very hard to verify that the name the entity provides to the store is the same name that entity uses in other contexts; this isn't likely to be important to the store anyway. All the store really needs is an identity which allows it to carry out a series of transactions with a defined entity or set of entities: a way in which it can be reasonably sure that the entity or entities that place an order, provide payment, and accept delivery of whatever the store is selling are the same one(s). There are many practical ways to achieve this, and we're not really interested in what they are.

What's really important is the observation this leads to, which is that we all have many, many identities. We may use the same signifiers and access keys for many of these identities - our name, our address, our telephone number - but they are all separate. I can have a different name as a signifier and access key on my account with Institution A as compared to my account with Institution B. Or I could have the same name associated with both, then change it on one, and it would not change on the other. Most of us have thousands of identities, and the metadata surrounding each may well be similar, but each identity is functionally separate from the others.
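To make the distinction between an identity and its signifiers a bit more concrete, here is a minimal sketch in Python. Everything in it is hypothetical (the StoreAccount class, the account numbering, the sample names); it only illustrates the idea that the store's account number is the identity, while the name and address are metadata that can be changed or shared without touching it.

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical account numbering: the store hands out the next free number.
_account_numbers = count(1001)


@dataclass
class StoreAccount:
    """One identity relationship between a customer and a store."""
    name: str      # signifier / access token, can change
    address: str   # metadata, can change or be shared with other accounts
    number: int = field(default_factory=lambda: next(_account_numbers))


# Two accounts opened with identical metadata are still separate identities.
first = StoreAccount("Soandso", "1 Example Street")
second = StoreAccount("Soandso", "1 Example Street")

# Changing the name on one account leaves its number, and the other
# account, untouched.
original_number = first.number
first.name = "S. Andso"
```

The store only ever needs `number` to tie an order, a payment and a delivery together; it has no particular reason to care whether `name` matches the name the same entity uses anywhere else.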

Then things get really complicated, because we want to associate these identities together, sometimes. Let's take a couple of examples.

One: credit! I suspect that the 'real world' organizations with the best shot at associating all the thousands of identities we all maintain all the time with us as individual conscious entities are credit agencies. How do they work? Well, they collect the metadata about many of our identities from the parties we established those identities with and associate the metadata. They then establish federated identities based on these. They perceive that the thousand identities with the metadata of my real-world name, address and phone number were probably established by the same entity, and they establish a new identity, under their control, associated with those same metadata.
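As a rough illustration of that kind of federation, here is a small Python sketch. It is not how any real credit agency works; the record layout, the federate function and the matching rule (exact match on normalized name, address and phone) are all assumptions made up for the example.

```python
from collections import defaultdict


def federate(records):
    """Group separate identity records that share the same metadata.

    Each record is a dict with hypothetical keys: 'institution',
    'account', 'name', 'address' and 'phone'. Records whose normalized
    (name, address, phone) match are assumed to belong to one entity.
    """
    groups = defaultdict(list)
    for rec in records:
        key = (
            rec["name"].strip().lower(),
            rec["address"].strip().lower(),
            rec["phone"].strip(),
        )
        groups[key].append((rec["institution"], rec["account"]))
    return dict(groups)


records = [
    {"institution": "Institution A", "account": "A-17",
     "name": "Soandso", "address": "1 Example Street", "phone": "555-0100"},
    {"institution": "Institution B", "account": "B-4242",
     "name": "soandso ", "address": "1 Example Street", "phone": "555-0100"},
    {"institution": "Institution B", "account": "B-9000",
     "name": "Someone Else", "address": "2 Example Street", "phone": "555-0199"},
]

federated = federate(records)
```

The first two records end up under one key even though Institution A and Institution B each consider them separate accounts. The federated identity lives in the agency's table, under the agency's control, and not with either institution or with the person.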

Governments might do much the same thing; most of us have probably established dozens of identities with government agencies, and governments increasingly attempt to do what credit agencies do, and federate those identities. They will associate the same metadata together so they can perceive, for instance, that the conscious entity claiming unemployment benefit under one identity relationship with one government agency is paying employment income tax under another identity relationship with another government agency, and take appropriate action against this entity based on that perception. And recently governments are taking this process one step further by consolidating the identity relationships we have with them: the ultimate goal in many places being for each individual to have one identity relationship with all government agencies. This is what ID cards are for.

One more example: legal processes. Identity can become problematic in court cases, and the courts use the same process as credit agencies and governments: they establish federated identities by comparing metadata. If you commit some kind of fraud in your relationship with Institution A, the police will probably investigate the identity you have established with Institution A and use the metadata associated with that identity - your name, address, and phone number, perhaps - to locate you. They will come to your house - identified by the address associated with your identity with Institution A - and arrest you. You'll probably have a wallet containing some identity tokens with some of the same metadata on them. They'll talk to your neighbours, who will assert that the entity they just arrested habitually refers to itself by a name which is also a piece of metadata associated with your identity with Institution A. Your identity with Institution A might have a photo attached to it, which will be established to look very similar to the entity the police arrested. And so forth, and so on: the accumulation of associations between the metadata associated with the identity that is known to have performed the fraudulent transaction, and the metadata associated with the identity of the entity the police arrested, is quickly established to be sufficiently great for everyone to be pretty sure that the entity that was arrested is the same one that initiated the fraudulent transaction. Again, we do this kind of thing all the time without really thinking about it.

What's the point? The point is that in the real world we all deal with a very complex web of identity relationships all the time. We all have thousands of identity relationships that we use in different contexts; various bodies and processes can associate these different identity relationships by analysing the metadata associated with each, but this is a complex process which we don't do all the time and which is in a way a part of the complexity of the system. It's inherent in the nature of a system that complex that we don't often cede too much control over any of those identity relationships, and particularly over the fuzzy meta-identity that comes from sticking them all together, to any one other agency. There is no Facebook or Google in the real world, which can be said to be the arbiter of our identity in the singular. There are thousands of agencies which are the arbiters of our thousands of identity relationships, and it's reasonable to assert that ultimately we are the arbiters of our meta-identities, as much as any other agency. So far in human history the agencies which have had the best shot at displacing us, as individual entities, as the arbiters of our identities are governments, and it's significant that the governments which have come closest to this are often popularly considered the worst. Not for nothing is 'papers please!' the stereotype associated with Nazi officials in English pop culture: we understand on some level, even if we don't often explicitly formulate, that one of the most pernicious accomplishments of the Nazi government was to succeed to a considerable extent in establishing itself as the arbiter of the identities of its subjects. Ditto for the Soviet Union, North Korea, and any number of other totalitarian regimes: we are correct when we instinctively latch on to the control of identity as one of the worst features of these societies. The control of identity is vital to any agency which wishes to control the activities of others. 
Control of identity allows a repressive government to control the interactions of the entities which use those identities with other entities: the government that arbitrates your identity relationship with a school, an employer or a newspaper can prevent you attending that school, working for that employer or writing for that newspaper. And this is as true of any other agency as it is of a government.

By the famous maxim 'power corrupts; absolute power corrupts absolutely' we could perhaps extend our argument to suggest that any agency with too much control - or power - over our identities will be inclined to start to exert this power in repressive ways, as the governments discussed above did. This is, if anything, even more likely when those agencies are entities as sociopathic as corporations established under American law: even if we trust the people that currently run Google not to do anything evil with our identities (which is certainly an open question), it is not really those people to whom we are ceding this vital power, but the corporation known as Google, which (as any good bleeding-heart liberal knows) is a very different thing from the people who at any given time staff that corporation.

I suspect Wal-Mart would be hard pressed to suggest to Americans that all their interactions with each other, with other companies, and with the government should be facilitated by the identity relationships each American has established with Wal-Mart. It's notoriously the case that Americans are allergic to the suggestion that their interactions with each other and with other entities should be facilitated by the identity relationships each American establishes with the government. This is probably true of the inhabitants of most countries - not just Americans. So I suspect that people may well come to regret very sincerely, at some point down the road when the consequences become more clear, the decision many are making to allow a single entity, or a small group of entities - Facebook, Google, and so on - to arbitrate their identity for so many of their transactions on the internet. Those 'sign in with Facebook' and 'sign in with Google' buttons aren't just a handy convenience, if you look at it carefully enough; they're a fundamental realignment of how we establish and control our identities, and hence almost all of our activities.

Whew. Now I'm done screaming that the sky is falling, how do we stop it? Well, we have a technical solution already, and I've mentioned it a couple of times: OpenID. The key feature of OpenID, and what's so deeply wonderful about it, is that it doesn't work like Google or Facebook. It's not a single entity which provides you with an identity over which it exerts considerable control. OpenID is a standard by which you can provide an identity for a relationship. It defines the mechanism by which the identity is provided, but it does not actually provide or control the identity itself: some other body does. You can use accounts with various other providers with OpenID if you like, and sometimes you might want to do this: it makes a deal of sense for me to use my Fedora ID via OpenID to comment on a blog post about Fedora, for instance. It allows me to choose an appropriate identity relationship for the transaction in question, from any which is compatible with the OpenID standard. But you can also use the OpenID standard to provide an identity over which you have complete control; you can provide an identity via OpenID which is associated with software running on your own server, with no third party relationship involved at all. OpenID provides a framework to both replicate the complex web of identity relationships we use in the 'real world' - and hence avoid the concentration of too much power over identities in the hands of any one entity - and to give each of us as individuals far more direct power over any given identity relationship which uses the OpenID framework. OpenID, and any system with the same attributes - a decentralized framework for entities to establish identity relationships, rather than a centralized, all-powerful provider and arbiter of identities - is the Right Way To Do It.
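For the curious, here is a small Python sketch of one piece of how this separation works in practice. In OpenID 2.0, the identity you claim is a URL, and a site can learn which provider vouches for it by looking for a link element with rel="openid2.provider" in the page at that URL. The sketch only does that one parsing step on an HTML string; real discovery also involves XRDS/Yadis and a full authentication exchange, none of which is shown. The class and function names and the example.org URL are made up for illustration.

```python
from html.parser import HTMLParser


class ProviderLinkParser(HTMLParser):
    """Find the first <link rel="openid2.provider" href="..."> in a page."""

    def __init__(self):
        super().__init__()
        self.provider = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.provider is not None:
            return
        attr = dict(attrs)
        # rel may hold several space-separated values.
        rels = (attr.get("rel") or "").split()
        if "openid2.provider" in rels and attr.get("href"):
            self.provider = attr["href"]


def find_provider(html_text):
    """Return the provider endpoint named by the page, or None if absent."""
    parser = ProviderLinkParser()
    parser.feed(html_text)
    return parser.provider


# A page you might serve from your own domain (hypothetical URL).
page = (
    "<html><head>"
    '<link rel="openid2.provider" href="https://id.example.org/server">'
    "</head><body>My home page</body></html>"
)
endpoint = find_provider(page)
```

The point of the design is visible even in this toy: the page lives wherever you choose to host it, and the provider it names can be changed by editing one line, or can be software running on the same server.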

The rest of the problem is only implementation: making sure that OpenID and OpenID-type systems are the way we establish identity for online relationships in the future. This could be achieved in lots of ways - by advocacy, by law, by consumer pressure, whatever your ideologically preferred mechanism may be. But it's what needs to happen. I resist, as far as possible, the notion of allowing Facebook, Google, Twitter, the Canadian government or any other agency to set itself up as a powerful arbiter of the identity relationships and hence communications and interactions I maintain with others. I should be the arbiter of my identity relationships and my communications.

This is why I'm on Facebook, but don't use my Facebook identity for any other purpose. This is why I use my Google account as sparingly as possible, and don't intend to put anything of significance in my Google+ account - even though in some ways, Google's approach is much more 'open' than Facebook's, it still contains the fundamental flaw that it sets up Google as the arbiter of my online identity. If you've made it all the way here (5,300 words and counting!), I hope it'll make you think twice about how valuable your identity, and your control over it, is, before you sign it over to someone else.


jdulaney wrote on 2011-07-13 02:51:
And, this can be extended to one of the fundamental reasons as to why I use open source software. Microsoft has too much control over the desktop market.
adamw wrote on 2011-07-13 03:12:
Well, that's really a different topic. The desktop operating system market doesn't have a lot to do with identity. But on this topic, Microsoft does happen to be another on the long list of companies that would be pleased to own your online identity; that's what Passport and 'Windows Live' are about. Luckily, they're doing a pretty terrible job of it so far. Unlike Google and Facebook they haven't yet come up with any compelling services tied to the identity they provide; MSN was there for a while, but people are leaving that in droves.
myroslav wrote on 2011-07-21 21:30:
Yesterday I struggled to register here with OpenID (LiveJournal and StartSSL) and failed. Today I managed to get through with LiveJournal OpenID, but in the end it resulted in a plain user/password registration. Nevertheless. The point with OpenID for me is frustration. I never know which of my OpenIDs will work with which service, and if one does start to work, whether it will keep working or break in the middle of the redirect sequence. I haven't seen a RockSolid(tm) OpenID implementation so far. Have you tried looking into BrowserID that the Mozilla folks are trying to push through? Another similar initiative is WebID (I consider it more fragile as it depends upon client-side TLS certificates).
adamw wrote on 2011-07-21 23:15:
it would help if you'd tell me what *happened*; nothing in your comment gives me any clue as to exactly what went wrong or how it could be fixed. in general I've had no trouble using my fedora openid for most sites that allow openid. yes, i've seen browserid. I commented on it on a few sites and filed a couple of bugs.
myroslav wrote on 2011-07-21 23:47:
There are two distinct cases: 1) StartSSL OpenID - I authenticate at StartSSL successfully, auth window disappears, and in Janrain lightbox I get a report: We were unable sign you in. Please check your OpenID and try again. Need help? - Sorry we couldn't sign you in. 132 OpenID Response: Protocol Error Either the OpenID response parameters were invalid or the an error occurred during the OpenID check_authentication call. 2) LiveJournal - today it worked and I'm posting under that account. Yesterday rpx reported that they had an error and that they've recorded the issue. I do not have specific wording now. Yesterday it just refused to work. Clearing Cookies, Cache, and everything else hadn't worked. The error was in additional window spawned for rpx auth wrapper. Later Janrain lightbox got corrupted and I couldn't even try my StartSSL OpenID login, as it displayed blank lightbox with inactive "1" and "2" page selecters in right bottom corner.