So, if you haven’t noticed yet, Google Wave happened recently. If you have an hour or so to kill on a really impressive tech demo, you might want to take a look at the video at the link.
It’s a fairly ambitious project, one that merges email and chat into one seamless application. I call it ambitious because it tries to do the following simultaneously:
- Merge real-time (chat) and asynchronous (mail) communication. Gmail’s built-in chat already had that going on in a somewhat crude form (compared to Google Wave).
- Instead of independent messages being sent back and forth, it’s about editing a collaborative document. The concept is fairly close to a wiki, except that it’s also real time.
- It has a decentralized architecture where pieces (known as wavelets) of the collaborative document can originate from multiple domains and can be edited independently. These wavelets can have:
- Domain scope: some parts of the collaborative document may remain private to a set of domains.
- Revision history: every edit is stored, giving a revision history and accountability for every change made to the collaborative document.
- The document model itself is pretty general and allows many views (and consequently many Google Wave “applications”) to be built on top of the existing framework of decentralized real-time collaboration.
So it’s basically a decentralized, real-time wiki with a flexible document model ready for building applications on, and guess what: email and chat are the most basic things that can be built starting from that.
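To make the real-time part concrete, here is a toy sketch of operational transformation, one common way to reconcile simultaneous edits to the same text; the function and variable names here are mine for illustration, not anything from Wave’s actual protocol:

```python
# Toy operational transformation for concurrent text inserts.
# Illustrative only -- real systems handle deletes, ties, and more.

def transform(pos_a, pos_b, len_b):
    """Shift edit A's position if concurrent insert B landed at or before it."""
    return pos_a + len_b if pos_b <= pos_a else pos_a

doc = "wave"
# Alice inserts "google " at position 0; Bob concurrently inserts "!" at 4.
a_pos, a_text = 0, "google "
b_pos, b_text = 4, "!"

# Apply Alice's edit first, then Bob's edit transformed against Alice's.
doc = doc[:a_pos] + a_text + doc[a_pos:]
b_pos = transform(b_pos, a_pos, len(a_text))
doc = doc[:b_pos] + b_text + doc[b_pos:]
print(doc)  # google wave!
```

The point is that neither user has to wait for a lock or resolve a conflict by hand: each client applies remote edits after shifting them past its own.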
Comparison with file syncing services
You may also compare this to Live Mesh, where the basic idea really was collaboratively edited cloud storage with edit notifications available as an activity feed for applications to tap into. In fact, when I played with various file syncing services such as Live Mesh and Dropbox, the limitation that stood out most was the inability of applications to manage conflicting changes. Changing the underlying document model from essentially a large chunk of bytes to a hierarchically organized, change-history-retaining one is just what applications need. Of course, applications would need to be redesigned (separating the actual document from the changes made to it), but that may be a necessary step to achieve what Live Mesh set out to do in the first place. That said, I am not sure whether Live Mesh was ever supposed to be either decentralized or real-time.
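Here is a minimal sketch of why the structured model helps with conflicts; the data shapes and function names are hypothetical, just to contrast blob-level sync with per-node merging:

```python
# Blob-level sync (Dropbox-style): any divergence is a whole-file conflict.
def merge_blobs(base, a, b):
    if a == b:
        return a
    if a == base:
        return b
    if b == base:
        return a
    raise ValueError("conflict: manual resolution needed")

# Structured model: merge per node, so edits to different parts combine.
def merge_tree(base, a, b):
    merged = {}
    for key in base.keys() | a.keys() | b.keys():
        merged[key] = merge_blobs(base.get(key), a.get(key), b.get(key))
    return merged

base  = {"title": "Draft", "body": "hello"}
alice = {"title": "Final", "body": "hello"}        # Alice edited the title
bob   = {"title": "Draft", "body": "hello world"}  # Bob edited the body

print(merge_tree(base, alice, bob))  # both edits survive, no conflict
```

With the whole file treated as one blob, Alice’s and Bob’s edits above would be a conflict; with per-node history, only edits to the *same* node ever conflict.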
Note that even though the implementation of Google Wave demoed at Google I/O was an HTML 5 application, there is nothing stopping a normal desktop application like Microsoft Office from adopting a similar document model.
Comparison with Social Networks
Does the existence of Google Wave necessarily obsolete social networking sites like Facebook (i.e., sites that provide the social network infrastructure, as opposed to specialized social networking applications such as Dopplr, Last.fm, and Digg)? I think they still have a place, because communication over Facebook is very public, and that public-by-default character is value Facebook provides in addition to collaborative infrastructure. However, I am pretty sure somebody out there will take Google Wave’s reference implementation and add just enough of the “public” mantra to it to make a competitive social networking platform.
This is of course great news for social networking applications such as Last.fm, Dopplr, and Digg, which can now use Google Wave as their social networking infrastructure.
Will it be robust and glitch-free?
Managing concurrent real-time editing of a document originating in pieces from several domains is of course not going to be glitch-free :). However, there is a lot to be learned from decentralized source code management (tools like Git and Mercurial), which has been dealing with decentralized conflict management and revision history for quite some time now. I suspect the following additions to a wave implementation would make it more robust than just keeping content (wavelets) in authoritative domains:
- A wave domain should cache the entire wave aggressively (at least the parts it has access to).
- The document model should be built so that it is incrementally consistent, i.e., if a set of changes from a domain or a user is removed, the document isn’t corrupted.
- Each incremental edit should be checksummed for integrity against the entire history of prior edits. This ensures that edits originating in different domains agree on the consistency of the history. It also requires revoking edits made against an older revision (the whole “rebasing” concept in Git applies here… and should be possible to do automatically if changes are synced sufficiently in real time).
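The checksum-against-history idea can be sketched as a Git-style hash chain, where every edit records the hash of the edit it builds on; the record layout below is made up for illustration, not any real wave format:

```python
import hashlib

def edit_hash(parent_hash, author, op):
    """Hash an edit together with its parent, chaining the whole history."""
    data = f"{parent_hash}:{author}:{op}".encode()
    return hashlib.sha256(data).hexdigest()

def append_edit(history, author, op):
    parent = history[-1]["hash"] if history else "root"
    history.append({"author": author, "op": op,
                    "hash": edit_hash(parent, author, op)})

def verify(history):
    """Re-derive every hash; any tampering upstream breaks the chain."""
    parent = "root"
    for e in history:
        if e["hash"] != edit_hash(parent, e["author"], e["op"]):
            return False
        parent = e["hash"]
    return True

log = []
append_edit(log, "alice@a.example", "insert 'hi'")
append_edit(log, "bob@b.example", "append '!'")
print(verify(log))            # True
log[0]["op"] = "insert 'bye'" # tamper with an old edit
print(verify(log))            # False -- every later hash now disagrees
```

Because each hash covers its parent, two domains that agree on the latest hash implicitly agree on the entire edit history, which is exactly the property Git commits rely on.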
Wave without the cloud?
I am sure we will all be happy to see Google or somebody else provide free and nearly unlimited storage for wave documents. Like Gmail, it should be easy to cover the cloud expenses with ads. However, I was wondering whether it would be possible to build this using p2p techniques alone. Given Git-like consistency and aggressive caching, it may not be a bad idea to let the client be authoritative over its own wave documents while also working offline. This achieves storage redundancy automatically, so one could start a fresh client and sync up its waves from its peers, given the appropriate signing key. Git itself may not be suitable for building this, since Git doesn’t provide a standard incremental consistency model (I am thinking of a basic DTD-style document type check). However, it should be fairly easy to adapt the core Git storage with such a type check, handle wavelets from different authentication domains, and wrap it in a real-time network layer to make such a client.
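A DTD-style type check in this setting could be as simple as validating a synced wavelet against a schema before accepting it from a peer; the schema shape and field names below are entirely hypothetical:

```python
# A peer only accepts a wavelet if it matches a minimal schema,
# so corrupt or malformed data from the network never enters the store.
SCHEMA = {"id": str, "domain": str, "ops": list}

def well_typed(wavelet):
    return (isinstance(wavelet, dict)
            and set(wavelet) == set(SCHEMA)
            and all(isinstance(wavelet[k], t) for k, t in SCHEMA.items()))

def accept_from_peer(store, wavelet):
    if not well_typed(wavelet):
        return False  # reject: fails the type check
    store[wavelet["id"]] = wavelet
    return True

store = {}
print(accept_from_peer(store, {"id": "w1", "domain": "a.example", "ops": []}))  # True
print(accept_from_peer(store, {"id": "w2"}))                                     # False
```

This is the incremental part of the consistency story: each synced piece is checked on arrival, so an offline-and-resyncing client can never be corrupted by one bad peer.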
The bigger problem, really, is maintaining connections to a large number of peers in the network. However, NAT traversal schemes should help.