If you read my Google Wave review, you may have noticed my comment about a non-cloud way of doing Google Wave. Quite a number of people are concerned about more and more personal information being uploaded to the Internet and the privacy nightmare it is turning into. Google Wave itself seems to be an interesting concept for next-generation collaboration on the Internet. It would be unwise not to look at it from a privacy-centric viewpoint.
Communication over the Internet is, at a very low level, done via message passing. The network is essentially a collection of computers (clients and servers), and messages are passed between them for any kind of communication. However, with the rise of forums, wikis and web 2.0 sites (and cloud computing in general), the communication pattern at a higher level is no longer simple message passing but collaborative editing of documents which live on some server. Suddenly, the actual nodes in the network and the messages between them are no longer the conduit of information. The living, evolving document is the new, higher-level conduit of information. For all practical purposes, the collaboratively edited document is the “network channel” for communication between the participating clients.
Google Wave takes this essential idea much further. A wave document is the central focal point for communication between users, robots, external websites and even applications. It would not be a stretch of the imagination to give an application a pointer to a wave document and have it automatically collaborate with other participants and applications without knowing their IP addresses or connectivity status.
The unfortunate part of the story is that wave documents still reside in the cloud and hence will suffer from the same privacy concerns we have been dealing with when using wikis and forums. Peer-to-peer communication would be better for privacy but p2p architectures are still very much message passing oriented. Can we marry Google Wave and peer-to-peer?
It is not very hard to imagine an application which can keep peer-to-peer connections with friends (a friend-to-friend network) and collaborate over wave-like documents. The documents would be replicated, similar to the way they are replicated between federated domains in Google Wave. In fact, the easiest way to do this may be to use some VPN software to create a private network between friends and run the open-source Google Wave servers on each node. Some such VPN software is not very hard to configure (e.g. Remobo, Leaf, Wippien). However, each user would then basically maintain his or her own wave domain, and the Google Wave servers are probably not designed with such a use case in mind.
Even if we leave aside the reusability of Google’s reference implementation, some practical architectural problems remain. The Google Wave servers are probably not designed with frequent disconnection in mind. The wave document should be accessible from some authoritative server at all points in time. Also, given the federated authentication, parts of the wave document may live entirely within the domains of the users who were first given access to them. Adding more users to such parts of the document would require those domains to be online. This always-available assumption might hold for cloud storage but doesn’t hold for peer-to-peer architectures… at least the kind discussed above.
P2P waves with redundant but encrypted content
I was thinking about this problem when I hit upon an idea. The basic problem is that if I want to attach a private message for a friend of mine to a peer-to-peer wave document, I cannot deliver the message while my friend is offline. However, if I encrypt my message using a shared key, then even though the wave document is shared across the p2p network, only my trusted friend(s) can read the private message. Any peer that happens to be online can hold the encrypted copy until my friend comes back. Thus comes the key insight:
Redundantly propagating encrypted content among peers can help with the “always available” problem with peer-to-peer networks.
However, if we are willing to do this for private messages, we can also extend the same to the whole wave document! Thus, instead of maintaining connectivity with at least a few of the participants in the wave document and sending the wave document to them in plain text, we can now maintain connectivity with unknown peers in a peer-to-peer network and encrypt the wave document with keys shared with just the users the document is intended for. Such an architecture would ensure good connectivity, fast downloads and, very importantly, uninterrupted availability, which is one of the main strengths of storing wave documents in the cloud.
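To make the idea a little more concrete, here is a minimal Python sketch of publishing an encrypted blob to several untrusted peers and fetching it back while some of them are offline. Everything in it is an assumption made for illustration: the names `seal`, `unseal`, `Peer`, `publish` and `fetch` are hypothetical, the "network" is just a list of in-memory objects, and the cipher is a toy construction built from HMAC-SHA256 so that the example needs only the standard library. A real system would use a vetted authenticated encryption scheme and a real peer-to-peer lookup layer.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 over nonce || counter). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        block = nonce + counter.to_bytes(8, "big")
        out += hmac.new(key, block, hashlib.sha256).digest()
        counter += 1
    return bytes(out[:length])


def _subkeys(key: bytes):
    """Derive separate encryption and authentication keys from the shared key."""
    return hashlib.sha256(b"enc" + key).digest(), hashlib.sha256(b"mac" + key).digest()


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate. Returns nonce || ciphertext || tag."""
    enc_key, mac_key = _subkeys(key)
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(key: bytes, blob: bytes) -> bytes:
    """Verify the tag and decrypt. Raises ValueError on a wrong key or tampering."""
    enc_key, mac_key = _subkeys(key)
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was tampered with or the key is wrong")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


class Peer:
    """A hypothetical untrusted peer: stores opaque blobs and may go offline."""

    def __init__(self, name: str):
        self.name = name
        self.online = True
        self.blobs = {}

    def put(self, blob_id: str, blob: bytes) -> bool:
        if not self.online:
            return False
        self.blobs[blob_id] = blob
        return True

    def get(self, blob_id: str):
        if not self.online:
            return None
        return self.blobs.get(blob_id)


def publish(peers, blob_id: str, blob: bytes) -> int:
    """Push the blob to every reachable peer; return how many copies were stored."""
    return sum(peer.put(blob_id, blob) for peer in peers)


def fetch(peers, blob_id: str):
    """Return the blob from the first reachable peer that has it, else None."""
    for peer in peers:
        blob = peer.get(blob_id)
        if blob is not None:
            return blob
    return None
```

The point of the sketch is that the peers never see the key: they only store and serve bytes, so it does not matter whether they are friends or strangers, and the document stays available as long as any one replica is reachable.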
Curiously, a peer-to-peer network of the above nature already exists. It's called Wuala, and it is a peer-to-peer file backup and sharing service with attention to privacy, selective sharing of files and file availability. A very interesting tech talk about the internals of the technology behind Wuala is here. Though the Wuala guys have so far focused on just getting the peer-to-peer storage right (which is a tough computer science problem in itself), it is not very hard to imagine hosting wave documents on top of it. The essential idea is to take read-only file sharing and make it read-write with some level of revision history and conflict management.
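As a rough illustration of what "read-write with revision history and conflict management" could mean, here is a small Python sketch. It is not how Wuala or Google Wave works; `Revision` and `WaveFile` are hypothetical names, and the design (immutable revisions identified by a hash, each pointing at the revisions it was based on) is just one assumed way to do it. A conflict is simply the situation where more than one revision has no successor, and a merge is a revision with two parents.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Revision:
    parents: Tuple[str, ...]  # ids of the revisions this edit was based on
    author: str
    content: bytes            # in practice this would be an encrypted blob

    @property
    def rev_id(self) -> str:
        """Content-derived id, so every peer computes the same id for the same edit."""
        h = hashlib.sha256()
        h.update(",".join(self.parents).encode())
        h.update(b"\0" + self.author.encode() + b"\0")
        h.update(self.content)
        return h.hexdigest()


class WaveFile:
    """Hypothetical read-write file: a graph of immutable revisions."""

    def __init__(self) -> None:
        self.revisions: Dict[str, Revision] = {}
        self.heads: Set[str] = set()  # revisions that nothing has built on yet

    def commit(self, rev: Revision) -> str:
        for parent in rev.parents:
            if parent not in self.revisions:
                raise KeyError("unknown parent revision: " + parent)
        rid = rev.rev_id
        if rid in self.revisions:
            return rid  # same edit delivered twice by different peers
        self.revisions[rid] = rev
        self.heads -= set(rev.parents)
        self.heads.add(rid)
        return rid

    def has_conflict(self) -> bool:
        """More than one head means concurrent edits that nobody has merged."""
        return len(self.heads) > 1

    def history(self, rev_id: str) -> List[Revision]:
        """Oldest-first chain of revisions, following the first parent only."""
        chain = []
        current = rev_id
        while current is not None:
            rev = self.revisions[current]
            chain.append(rev)
            current = rev.parents[0] if rev.parents else None
        return list(reversed(chain))
```

For example, if Alice and Bob both edit the same starting revision while disconnected from each other, committing both edits leaves two heads and `has_conflict()` returns `True`; committing a third revision whose `parents` are those two heads records the merge and clears the conflict.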
However, we definitely lose out on real-time communication with a Wuala-like approach. Peer-to-peer storage and lookup will definitely have higher latency than a direct connection to a wave server. On the other hand, that may be a small price to pay for privacy.