Thoughts on a decentralized social network

The following is something I’ve had floating around in my Drafts folder for a long time. With Facebook’s outage yesterday, and Frances Haugen’s statements about Facebook, I’ve decided to post it in its unfinished form.

There’s been a growing chorus of complaints against Facebook, now that we are starting to get an idea of just how much information about us (and our friends) they are willing to sell. And how that information may have been used to rig the U.S. election. And how Facebook in general seems to be a cesspool of misinformation and poorly informed squabbling. I was somewhat reluctant to join Facebook and was relatively late doing so, but I’ll admit that I do get something out of the contact with friends. Twitter has come under even more fire for its willingness to keep extremists on its platform.

Some of my friends have started using Mastodon, some have started using Mewe. I’ve tried both myself, but both are unsatisfying to me.

Mastodon has the benefit of being a federated system running open-source software. It’s not dependent on any one entity to keep it going. Different nodes can have different posting and moderation policies, so if there’s a node whose users you find especially vexatious due to loose moderation policies, you could in theory just block the whole node. There are nodes for different language communities and different interest groups, which is interesting. But it’s designed to be more of a Twitter alternative, and doesn’t have the complete package that I think a social network should have (I’ll get to that later).

Mewe seems to be designed to be more of a Facebook alternative. It’s still got a stunted feature set, and it’s entirely private–there are no public posts. You need to be logged in to see anything.

I haven’t really played around with Micro.blog yet, but that seems like another Twitter clone.

And these follow on the heels of many other alternative social-network experiments. Diaspora. Tsu. Ello. Hello. App.net. Vero. Google+. Orkut. I can’t keep track of them all.

Before social networks were widely used, if you wanted to post your thoughts online, you probably used a blog. Blogging is still around (you’re soaking in it), despite having lost a lot of popularity. But the hosting model that blogging took is the one that a better social network should use: there are big services that will host your blog for you, but you can also get shared space on a web server (or your own virtual server, or your own actual server) and install your own blogging software. Blogs can talk to each other through what we used to call pingbacks, which have since evolved into webmentions. Something like this would be a building block for a distributed social network.

Design principles

The social network should be distributed. It should not be owned by any one company. Instances of compatible software could host the feed for one person, or a few people, or many people.
Similarly, the social network should not depend on a single piece of software. Different implementations of the same basic protocol should interoperate. Data should be stored in a common format (or at least be exportable) so that a user can move from one piece of software to another without loss.
The protocol should be extensible, but allow for graceful degradation to avoid lockout.
The social network should serve the interests of its users, which will not all be the same.
The social network should allow private posts with flexible, fine-grained privacy, and public posts that can be viewed on the Web with a plain web browser.

Features

What should a social network have? These are content types that people would expect. That doesn’t mean they all need to be implemented, but there needs to be a good reason not to.

Short posts: Something like tweets, that you can dash off quickly.
Long posts: Blog entries with titles and other metadata.
Audio/visual posts (photos, videos, audio files).
“Collection” posts (photo albums, playlists).
Events.
Comments on posts.
Rebroadcasts (reblogs/shares/retweets).
Reactions (likes, upvotes/downvotes, etc).
Discussion groups.
Direct messaging.

Before you can actually use any of this, you need some scaffolding to support it all and hold it together.

Identity management: How you tell the software “this is me.”
Contact-list management: How you tell the software “these are the people I want to stay in touch with.”
Selective privacy: How you tell the software “this is public, that can only be viewed by certain people.”
A taxonomy system
A screening mechanism to decide what you see and what you don’t see.
An encryption/decryption function that lets you protect the privacy of your own posts, and access other private posts from others when you have permission.

You would use two pieces of software: back-end software (“the server”) that stores your data and responds to network pings, much as blogs do today, and software that presents information to you, filters and sorts it, and lets you write posts, upload photos, etc. (“the client”), much as RSS readers do today.

Communications flow

Here’s a naive implementation. Alice and Bob are each running their own copy of a social-networking server. It’s important to point out that they don’t necessarily need to be running identical software, as long as both copies speak the same language.

Alice discovers that Bob is “out there” and wants to follow him, so her server sends his a contact card, requesting “please let me know when you post new stuff.” Bob’s server holds on to that until Bob logs in; Bob decides to let Alice follow him. Alice’s contact card includes a node to ping when he publishes new stuff, and Bob’s server adds that node to its notification list.

The next time Bob posts something, his server pings Alice’s to let it know “hey, Bob just posted a thing, here’s the address for it.” Alice’s server retrieves the post and adds it to Alice’s newsfeed.
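To make the flow concrete, here is a minimal in-memory sketch of the follow-and-notify exchange. Everything in it is an assumption for illustration: the Server class, its method names, and the address format are hypothetical, and a real version would talk over the network rather than call methods directly.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """One person's social-network server (hypothetical, in-memory only)."""
    owner: str
    posts: dict = field(default_factory=dict)        # address -> post text
    pending: dict = field(default_factory=dict)      # requester name -> requester's server
    notify_list: list = field(default_factory=list)  # servers to ping on new posts
    newsfeed: list = field(default_factory=list)     # (author, text) pairs

    def request_follow(self, other: "Server") -> None:
        # Send our "contact card"; the other server holds it until its owner decides.
        other.pending[self.owner] = self

    def approve(self, requester: str) -> None:
        # The owner accepts the request, so the requester's node joins the notification list.
        self.notify_list.append(self.pending.pop(requester))

    def publish(self, text: str) -> str:
        address = f"{self.owner}/posts/{len(self.posts) + 1}"
        self.posts[address] = text
        for follower in self.notify_list:
            follower.receive_ping(self, address)
        return address

    def receive_ping(self, author: "Server", address: str) -> None:
        # Retrieve the post from the author's server and add it to the feed.
        self.newsfeed.append((author.owner, author.posts[address]))


alice, bob = Server("alice"), Server("bob")
alice.request_follow(bob)
bob.approve("alice")
bob.publish("Hello from Bob")
```

After the last line, alice.newsfeed holds Bob's post. Note that nothing here is private or authenticated, which is exactly the gap the naive version leaves open.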

Someone who’s a better programmer than me could probably cobble together a proof-of-concept for something like this in a few days. Maybe only one day. I say this is a naive implementation because it overlooks a lot of important features. For one, it assumes that everything is public. The minute you get into restricting access, things get a lot more complicated.

Private posts should be a key feature of any social network, with robust encryption to guarantee that what should be private remains private. It introduces a lot of complexity.

Let’s return to Alice and Bob. Let’s say Bob has added Alice to his “close friends” circle, and he posts something restricted to that circle: only people in that circle should be able to read it.

Now, we get into public-key cryptography. When Alice and Bob shook virtual hands over the network, each of them gave the other their public keys. With public-key crypto, something encrypted under one’s public key can be decrypted under one’s private key, which means that Bob can send Alice something that only she can read; the reverse is also true: Bob can “sign” something by encrypting it under his private key (or more likely, by encrypting a “fingerprint” of the original document), and because it can only be decrypted by his public key, everyone else can verify that Bob really signed it.
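The signing half of that can be sketched with nothing but big-integer arithmetic. This is a toy, not something to deploy: it uses textbook RSA with no padding, and the key is built from two publicly known Mersenne primes, so it offers no security at all. The function names and key parameters are my own assumptions, chosen only to show the "encrypt a fingerprint with the private key, check it with the public key" idea.

```python
import hashlib

# Toy key pair. P and Q are well-known Mersenne primes, so this key is NOT secret.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q                          # public modulus
E = 65537                          # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent (needs Python 3.8+)


def fingerprint(message: bytes) -> int:
    """Hash the message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Bob 'encrypts' the fingerprint with his private exponent."""
    return pow(fingerprint(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public key (N, E) can check the signature."""
    return pow(signature, E, N) == fingerprint(message)


signature = sign(b"Bob wrote this")
```

Here verify(b"Bob wrote this", signature) succeeds, while the same signature checked against any altered message fails.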

Bob’s server encrypts his post using a one-time key (a random symmetric key generated for that post alone) and then pings Alice to let her know “hey, Bob wrote a thing.” This time, the ping message also includes the one-time key, encrypted under Alice’s public key, so only Alice can decrypt the underlying message.

When Alice’s server is ready to build a feed to show Alice that includes Bob’s post, it sends Bob’s server a request for the post. Bob’s server will send the contents of the post, still encrypted under the one-time key. Alice’s server decrypts the post at her end and incorporates it into her feed. This process is repeated for each person in Bob’s “close friends” circle. (This is called “enveloped data.”)

If at some point in the future, Bob removes Alice from his “close friends” circle, his server will re-encrypt all the posts marked “close friends” under new one-time keys, invalidating the one-time keys that Alice would have.
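Here is a rough sketch of that envelope structure: one random per-post key encrypts the post once, and that key is wrapped separately for each member of the circle. The primitives are deliberately toy stand-ins that I am assuming for illustration (a SHA-256 keystream XOR in place of a real cipher, and textbook RSA over known Mersenne primes in place of real key pairs), and all of the function names are hypothetical. Only the shape of the scheme is the point.

```python
import hashlib
import secrets

E = 65537  # public exponent shared by the toy key pairs


def make_keypair(p: int, q: int):
    """Toy textbook-RSA key pair from two primes (insecure, illustration only)."""
    n = p * q
    return (n, E), (n, pow(E, -1, (p - 1) * (q - 1)))


def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream. Encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def wrap_key(post_key: bytes, public_key) -> int:
    n, e = public_key
    return pow(int.from_bytes(post_key, "big"), e, n)


def unwrap_key(wrapped: int, private_key) -> bytes:
    n, d = private_key
    return pow(wrapped, d, n).to_bytes(16, "big")


def seal(plaintext: bytes, circle: dict):
    """circle maps a name to a public key. Returns (ciphertext, name -> wrapped key)."""
    post_key = secrets.token_bytes(16)  # fresh key for this post only
    envelopes = {name: wrap_key(post_key, pub) for name, pub in circle.items()}
    return stream_xor(post_key, plaintext), envelopes


def open_post(ciphertext: bytes, wrapped: int, private_key) -> bytes:
    return stream_xor(unwrap_key(wrapped, private_key), ciphertext)


alice_pub, alice_priv = make_keypair(2**127 - 1, 2**89 - 1)
carol_pub, carol_priv = make_keypair(2**107 - 1, 2**61 - 1)

post = b"Only for close friends"
ciphertext, envelopes = seal(post, {"alice": alice_pub, "carol": carol_pub})

# Removing Alice from the circle: seal again under a new key, for Carol only.
ciphertext2, envelopes2 = seal(post, {"carol": carol_pub})
```

After the re-seal, Alice's old wrapped key no longer matches anything Bob's server hands out, though she could of course have kept a copy of what she already read.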

Aside: There is a variation on this that is conceptually more complex, but might be computationally easier: each post has an unvarying “local key” (or all posts share the same local key), which is encrypted/decrypted by an intermediate key that can change whenever the circles of friends change; the intermediate key is what gets shared. The rationale for this is that longer texts take more time to encrypt and decrypt (this scales pretty linearly except with very short texts, which have some minimum level of overhead). This is for a system where circles of friends could change frequently, so reducing the churn of decrypting and re-encrypting everything would be desirable, and in this scenario, the source material doesn’t need to go through that process, only the intermediate keys. With very short posts (like tweets) and very long keys, the keys could actually be longer than the posts. But even small photos represent hundreds or thousands of times as much data as a key, so there’s a lot of benefit in not needing to re-encrypt heavier media.
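A sketch of that two-layer idea, under heavy assumptions: the SealedPost class, the rewrap and read names, and the toy SHA-256 keystream cipher are all mine, not part of any existing system. The thing to notice is that rotating the intermediate key touches only 16 bytes per post, while the (possibly large) ciphertext is left alone.

```python
import hashlib
import secrets


def stream_xor(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (SHA-256 keystream XOR); the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class SealedPost:
    """A post encrypted once under an unvarying local key."""

    def __init__(self, plaintext: bytes):
        self._local_key = secrets.token_bytes(16)
        self.ciphertext = stream_xor(self._local_key, plaintext)  # done once, never redone
        self.wrapped_local_key = b""

    def rewrap(self, intermediate_key: bytes) -> None:
        # Only the 16-byte local key is re-encrypted when the circle changes.
        self.wrapped_local_key = stream_xor(intermediate_key, self._local_key)


def read(post: SealedPost, intermediate_key: bytes) -> bytes:
    local_key = stream_xor(intermediate_key, post.wrapped_local_key)
    return stream_xor(local_key, post.ciphertext)


photo = secrets.token_bytes(50_000)  # stand-in for a heavy media file
post = SealedPost(photo)

old_key = secrets.token_bytes(16)    # shared with the circle, Alice included
post.rewrap(old_key)

new_key = secrets.token_bytes(16)    # issued after Alice is removed
post.rewrap(new_key)
```

One caveat I can see in this sketch: because the local key never changes, a removed reader who saved it (rather than just the intermediate key) could still decrypt the unchanged ciphertext, so this trades some protection for speed.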

There might be another variation of this in which, instead of each post having an intermediate key, each circle has an intermediate key. I’m not sure if this would work at all. I haven’t found anything in the literature describing this approach.

Cryptography is a dauntingly complex subject where even my typical level of catastrophizing is laughably pollyanna-ish. There are obvious weak spots in this–once someone has seen something, even if you revoke their key, they might still retain a plaintext copy of the original.

Freestanding posts

By “freestanding posts” I mean a post that stands on its own, without being a response to something else. This would cover short posts, long posts, audio/visual posts, collection posts, and events.

Comments

Comments, shares, and reactions are a related set of concepts, in that they are all in response to another post, so they bring up some related questions.

Let’s look at comments.

If Alice wants to comment on a post that Bob made, where should that comment reside? There are a few possibilities:

Comments in the same place as the original post (“source hosted”). This is how most personal blogs work today.
Comments are handled by a third party. The post author embeds a bit of enabling code in their blog so that the remote comment mechanism appears there (“remote hosted”). Disqus is one such remote comment-hosting service that’s widely used in commercial blogs. Twitter and Facebook have both been pressed into service for this purpose in some places.
Comments are hosted in the same place the commenter uses for their own freestanding posts (“self hosted”). This is not widely used, but something similar in spirit has been around for a long time in the form of trackbacks, pingbacks, and webmentions.

Each has its pros and cons, and each one represents a different philosophical position. Source-hosted comments are conceptually tidy. The comments live alongside the text they’re commenting on. Post authors can control the discourse on their own blogs. Remote-hosted comments by a big service allow an author to outsource some of the messy work of moderation, and allow network-effect benefits, like discovering spambots and other bad actors. Self-hosting means the commenter has control of their own words. All three of these systems could co-exist, of course.

The commenting system you use is going to influence the kind of conversation you see. To put it another way, you should choose the commenting system based on the kind of conversation you want to see. In all cases here, comments should be tied to a persistent identity, which should cut down on drive-by comments and anonymous trolling (it won’t eliminate all of it, of course).

All three could be supported in a system such as I’m writing about, but the third option is the best fit philosophically. Comments could use any of the “freestanding post” types, and there could be comments on comments (i.e., threaded comments), although those can become problematic.

A possible benefit to self-hosted comments is that the commenter has access to all the post types, and can be more discursive than a little box on someone else’s blog might invite.

An apparent problem is that Alice could comment on one of Bob’s posts, and if Bob had limited who could see his original post, not everyone would be able to read the context behind Alice’s comment. Something like this is already possible with existing social networks when one person blocks another, but it would probably be more common in the scenario I’m proposing. I’m not sure this really is a problem, though. If Charlie could see Alice’s response but not Bob’s original post, he could set up his client to simply hide Alice’s comment. Having your server limit distractions by leaving you out of discussions that you’re not a part of could be considered more a benefit than a problem.
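The client-side rule Charlie would set up is simple enough to sketch. The Post fields and the visible_feed function below are hypothetical names of my own; the only assumption is that each comment records the address of the post it responds to, and that the client knows which addresses its user is allowed to read.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Post:
    address: str
    author: str
    text: str
    in_reply_to: Optional[str] = None  # address of the post being commented on, if any


def visible_feed(feed: list, readable: set) -> list:
    """Keep freestanding posts, and keep comments only when their context is readable."""
    return [
        post for post in feed
        if post.in_reply_to is None or post.in_reply_to in readable
    ]


feed = [
    Post("alice/1", "alice", "Great point, Bob!", in_reply_to="bob/7"),
    Post("alice/2", "alice", "Unrelated freestanding post"),
]

# Charlie cannot read bob/7, so Alice's comment on it is hidden from him.
charlie_view = visible_feed(feed, readable={"alice/1", "alice/2"})
```

With this rule, charlie_view contains only Alice's freestanding post.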

In any case, answering the question “who sees what?” with comments is one of the tougher nuts to crack, and I don’t think I’ve gotten there.

Rebroadcasts

Years before there were retweets, there were reblogs. The idea has been around for a while, and it serves a useful function for passing along a message to people who wouldn’t otherwise see it. At a social level, a rebroadcast serves to put a post in front of people who hadn’t seen the original; at a technical level, a rebroadcast could be considered equivalent to a comment, but with a flag set to identify it as a rebroadcast so that other servers could filter it appropriately. The same questions of permissions and “who sees what” that apply to comments would apply here.
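As a data structure, that might look like the following. The Response record and the filtering rule are assumptions of mine; "filter it appropriately" could mean many things, and here I have picked just one possibility, namely dropping a rebroadcast when the reader already has the original.

```python
from dataclasses import dataclass


@dataclass
class Response:
    """Anything posted in response to another post (hypothetical record layout)."""
    address: str
    author: str
    in_reply_to: str              # address of the original post
    text: str = ""
    is_rebroadcast: bool = False  # the flag that separates a share from a comment


def drop_redundant_rebroadcasts(responses: list, already_seen: set) -> list:
    """Hide rebroadcasts of posts the reader has already seen; keep everything else."""
    return [
        r for r in responses
        if not (r.is_rebroadcast and r.in_reply_to in already_seen)
    ]


incoming = [
    Response("alice/3", "alice", "bob/7", is_rebroadcast=True),
    Response("alice/4", "alice", "dave/2", is_rebroadcast=True),
    Response("alice/5", "alice", "bob/7", text="I disagree"),
]
kept = drop_redundant_rebroadcasts(incoming, already_seen={"bob/7"})
```

Here the share of bob/7 is dropped because the reader already has it, while the share of dave/2 and the ordinary comment both stay.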

Reactions

For a long time, Facebook let you “like” something, representing a single bit of data. Eventually they added a few more reaction types, so that a reaction represented about three bits of data. Twitter and Instagram are still stuck in one-bit reactions. Mewe has taken a different approach, where the first four reactions can use any emoji, and everyone after that must use one of the first four. Medium is using “claps,” where you can clap repeatedly for something to show you really like it.

So where should a reaction live, and what function does it serve? I’ll try to answer the second question first, in the hope that it will help answer the first.

People have different motivations when they react to a post on a centralized social-networking service. A lot of the time, the intention is simply to telegraph to the post’s author “I have seen this and I am your friend.”

The more interesting question is why the reaction mechanism is there in the first place. Commercial social networks use them as a way of gauging a post’s popularity, so they can build feeds based on what’s popular and do whatever other data-mining they’re doing. But the more pernicious reason is to keep the author coming back for that little dopamine hit when they see how many reactions they’ve gotten.

So perhaps the first question should be whether reactions have any place at all in a self-hosted social-network system. The commercial motivations aren’t there. The psychological gaming shouldn’t be there. There might still be some value in determining how influential a post is: if I want my client to show me popular posts, these reactions are not a bad metric.

With that in mind, if the system is going to have reactions at all, they might as well live with the post the audience is reacting to, rather than be distributed among the respective social-network servers all the audience members use. The reactions could be passed as part of the post’s metadata. It would be possible to ping those individual servers (“Alice liked Bob’s post”), or for the process to go the other way. The distinction may be ultimately moot, unless Alice doesn’t care about recording in her own online presence the fact that she liked Bob’s post.

My own bias is that people should use their words, and if Alice wants to let Bob know that she read his post, she can leave a comment to that effect (I recognize that people can disagree about this). And comments can serve as a measure of a post’s importance just as much as reactions.

Discussion groups

There are a lot of approaches to message boards.

Facebook groups couldn’t be much simpler. One group has one board. Posts consist only of body text, with no title or metadata. Posts within the group aren’t organized by subject, and the posts with the most recent activity are always at the top. A post can have replies, and there can be one layer of sub-replies, but no more (this is a relatively recent feature on Facebook, too). A group can have a team of moderators who have the ability to approve or eject members, and to remove posts.

More complex discussion boards may have a lot of other features: they organize discussions by topic; they offer titles, tags, and other metadata; they have a “karma” system where users spend points to rate each other, earn points for high-quality posts, and earn privileges after building up sufficient karma, including the ability to moderate other users. This is how Stack Exchange, Slashdot, and some other sites work. Stack Exchange is interesting because it hosts a huge number of unrelated topics, and if you have built up a good reputation on one board, you can start out with a few karma points on another board.

Discussion groups are the knottiest content type. One of the design principles I’m advocating here is that you host the content that you create, and that conversations are assembled on the fly from distributed sources. I don’t think that makes sense for discussion groups, though, because they have a very different permission model.

One of the defining features of self-hosted content as I’m discussing it here is that you get to decide who gets to see what you write. But a discussion group can have members who don’t know each other or actively avoid each other, and it should function as a neutral zone where everyone agrees to play by a common set of rules. In order to have a moderated discussion board, all the members need to surrender some of the control they have over their own writing. A self-hosted content model where each post pings a central location could still allow a moderator to hide unwanted posts in a discussion, but message boards give moderators more control than that–frequently there’s an edit window after which you can’t edit what you wrote, and the moderators can reach in and edit what you wrote. Both of these would be inconsistent with self-hosted content.

Instead, a person should be able to host and moderate their own message boards within this system, so that the identities of participants carry over. And the board should ping the members’ respective servers when there’s an update, and should (as a courtesy) send a member’s own posts to their personal server so they can keep a copy of them.

As long as the message boards speak the same language in terms of authentication, identity, and interacting with the members’ personal servers, and as long as they post the rules by the front door so you can see them as you walk in, different types of boards, complex and simple, could all be part of the same network.

Screening

Deciding what you see, what you don’t see, and how you see it would be an important function for a client, and these decisions could be made according to different criteria.

Blacklisting/whitelisting: You manually flag certain people as ones you never want to see anything from, or always see everything from.
Consensus blacklisting/whitelisting: You share blacklists/whitelists with some friends. Your server treats an appearance on a list as a vote, and you set a threshold for blacklisting/whitelisting based on number or percentage of votes. There could be some privacy issues surrounding shared blacklists/whitelists.
Curation: you subscribe to a blacklist/whitelist curated by professionals that indicate “these are people to avoid/follow.”
Popularity: Number of reactions and comments could be used as a score for a post’s attention-worthiness.

There’s no reason that someone couldn’t use several of these at once, using different screens much the way we already filter e-mail into different mailboxes.
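The consensus approach is the one with actual arithmetic in it, so here is a small sketch. The consensus_list name and the input shape (a mapping from each friend to the set of names on their shared list) are assumptions, and the threshold is expressed as the fraction of friends who must have listed someone.

```python
from collections import Counter


def consensus_list(shared_lists: dict, threshold: float) -> set:
    """Return the names that at least `threshold` (0..1) of your friends have listed.

    shared_lists maps a friend's name to the set of people on that friend's list.
    Each appearance on a friend's list counts as one vote.
    """
    votes = Counter(name for names in shared_lists.values() for name in set(names))
    needed = threshold * len(shared_lists)
    return {name for name, count in votes.items() if count >= needed}


blacklists = {
    "bob": {"troll1", "troll2"},
    "carol": {"troll1"},
    "dave": {"troll1", "spammer"},
}
blocked = consensus_list(blacklists, threshold=0.5)
```

With three friends and a 50% threshold, a name needs at least 1.5 votes, so only troll1 (listed by all three) ends up blocked. The same function works for whitelists.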

Ad-hoc forums

A hashtag on Twitter can serve as the nexus for an ad-hoc forum. How would that work under a distributed system? How would you discover people talking about the same thing as you? Or discover people with similar interests to your own?

Because of the distributed nature of the system, it could not work the same as it does on Twitter, which can mine all the tweets it hosts and show you what it wants to show you.

In a distributed system, there are a couple of ways something like this might work.

Your own software could crawl your friends’ posts, and their friends’ posts, and so on, to find people within your extended network posting on the same subjects as you.
Specialized search engines could do this on a broader level. Technorati used to serve this purpose: you would set up your blog to ping Technorati when you posted, and it would collect the tags you used in its own index; you could look up a tag and see what other people were writing on the same subject.

Either way, you would need to set up your social-network client to look for “outside” posts on the tags that interest you and to show them to you.
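The first option, crawling outward through friends of friends, is essentially a breadth-first search with a hop limit. In this sketch the friend graph and the posts are plain dictionaries that I am assuming for illustration; in practice each lookup would be a request to someone's server, and only public posts would be visible.

```python
from collections import deque


def find_tagged_posts(start, friends_of, posts_by, tags, max_hops=2):
    """Breadth-first crawl of the friend graph, collecting posts that share a tag.

    friends_of: person -> list of that person's friends
    posts_by:   person -> list of (text, tags) pairs
    Returns (author, text) pairs from everyone within max_hops of `start`.
    """
    wanted = set(tags)
    seen = {start}
    queue = deque([(start, 0)])
    matches = []
    while queue:
        person, hops = queue.popleft()
        if person != start:  # skip your own posts
            for text, post_tags in posts_by.get(person, []):
                if wanted & set(post_tags):
                    matches.append((person, text))
        if hops < max_hops:
            for friend in friends_of.get(person, []):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    return matches


friends_of = {"me": ["alice"], "alice": ["bob"], "bob": ["carol"]}
posts_by = {
    "alice": [("My cat", ["cats"])],
    "bob": [("Federation notes", ["fediverse"])],
    "carol": [("More on federation", ["fediverse"])],
}
nearby = find_tagged_posts("me", friends_of, posts_by, ["fediverse"], max_hops=2)
```

With a two-hop limit this finds Bob's post but not Carol's, who is three hops away. Raising max_hops widens the net at the cost of more requests.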

Identities and personas

I have friends who have multiple Facebook accounts because they can’t afford—or just don’t want—to let their personal lives and professional lives overlap.

Facebook does let you create groups of friends, and restrict a post to one group or another (although it doesn’t let you exclude a group, which would probably be more useful to a lot of people). And some of my friends do use this feature, although the fact that many instead go the multiple-accounts route suggests to me that either they don’t trust themselves to use friend groups correctly (and these are smart, technically adept people), or they don’t trust FB not to betray them. Perhaps some of both.

While it might be helpful to have groups of friends for some purposes, having switchable personae that let one person with one account present different faces to the world seems like it would be a more realistic solution to the problem. While it would be possible to do both friend groups and personae, it would quickly get confusing.

Accountability

This is a big topic that I haven’t fleshed out yet. By “accountability,” I’m thinking of a combination of screening (above) and a consistent identity for online interactions. This doesn’t necessarily mean you lose all anonymity, simply that what you say today can somehow be tied to what you said yesterday. Diaspora had the concept of “aspects,” different personae that were all tied to the same account. In theory you could spool up a new persona every time you wanted to troll someone, but there could still be a way to connect screening metadata about a persona back to the underlying account, and to reflect that out to other personae created by the same account.

Business model

Ideally, the server software would be open source and have low enough computing requirements that it would be cheap to run, and non-technical people could just have their techie friend set up a partition of their web-hosting space as a favor. Big companies that already have online services with free tiers of usage (Microsoft, Apple, Google, Dropbox, WordPress.com, etc.) could offer their implementations of the server software as well. These would all presumably offer limited storage (up to N posts or N megabytes of content) at the free tier, with paid tiers for more.

The client software would do more of the heavy lifting, and would be where more of the personalization came into play. There would be multiple clients with different levels of complexity, some sold as shrinkwrapped software that runs on client devices, some as services tied to the use of a company’s server software, etc.