<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Defective Compass</title>
	<atom:link href="http://defectivecompass.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://defectivecompass.wordpress.com</link>
	<description>defective compass: def. a device which leads one in the wrong direction; phil. reality is mostly deliberate misdirection.</description>
	<lastBuildDate>Tue, 10 Jan 2012 18:27:52 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='defectivecompass.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Defective Compass</title>
		<link>http://defectivecompass.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://defectivecompass.wordpress.com/osd.xml" title="Defective Compass" />
	<atom:link rel='hub' href='http://defectivecompass.wordpress.com/?pushpress=hub'/>
		<item>
		<title>Multi-log structured storage layer</title>
		<link>http://defectivecompass.wordpress.com/2012/01/09/multi-log-structured-storage-layer/</link>
		<comments>http://defectivecompass.wordpress.com/2012/01/09/multi-log-structured-storage-layer/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 00:43:13 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=172</guid>
		<description><![CDATA[I recently had to build an SSD garbage collector which proved to be an interesting and fun exercise. SSD or Solid State Drives are storage devices that promise low latency operations compared to mechanical hard drives which are prone to seek / &#8230; <a href="http://defectivecompass.wordpress.com/2012/01/09/multi-log-structured-storage-layer/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=172&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I recently had to build an SSD garbage collector, which proved to be an interesting and fun exercise. SSDs, or <a href="http://en.wikipedia.org/wiki/Solid-state_drive">Solid State Drives</a>, are storage devices that promise low-latency operations compared to mechanical hard drives, which are prone to seek and rotational latency. While reads from SSDs are low latency, writes are a different story. Flash media (the basis of most SSDs) requires &#8220;erasing&#8221; a block (typically 128K) of flash before anything can be written to it, and block erasure takes a long time compared to reading or writing. Most SSDs have firmware that pre-erases a pool of blocks so that write latency can be minimized. Also, because a block cannot be partially erased, the valid content on a partially stale block needs to be re-written elsewhere before the block&#8217;s free space can be made available for newer content. SSD firmware is, hence, also responsible for this &#8220;garbage collection&#8221; process. A good starting point for further reading on SSD garbage collection is <a href="http://en.wikipedia.org/wiki/Write_amplification">Wikipedia: Write Amplification</a>.</p>
<p>Note that garbage collection can be a processor heavy operation given the data movement involved. Recently, I designed and implemented a multi-log structured storage layer which I ran on top of an SSD to relieve it of the garbage collection task (which it was performing poorly). However, such a multi-log storage layer is very versatile and can have a number of other applications.</p>
<p>A multi-log storage layer is essentially a storage layer in which content is laid out in multiple contiguous logs. While I kept the metadata for the content on the SSD in RAM, it is possible to tweak the implementation to keep metadata in the log as well. The primary reason to divide the content into multiple logs is to keep garbage collection efficient (not re-writing too much data) when the storage space is close to full. Similar principles could also be used for garbage collection in RAM (in garbage-collected language runtimes such as Java). A multi-log storage layout can also be used over hard drives to get good sequential write performance, though reads would need to be supported by a caching layer to avoid too many seeks.</p>
<h2>Multi-log Storage Layer Design</h2>
<p>First, the storage layer (whether backed by a file, a device, or even RAM) needs to be divided into multiple fixed-size contiguous logs. These logs should be large enough that writes of that size run at near-peak write bandwidth. At the same time, the log size should be small enough that we see a <em>large</em> variance in the amount of garbage across logs. This variance lets us select the logs with the most garbage for garbage collection. It is advisable to use the smallest log size at which we can achieve near-peak read/write bandwidth. The storage layer will mostly write full logs. If the durability requirements of the data are relaxed (such as in a cache), then one can do away with forced flushes and flush logs <em>only</em> when a full log&#8217;s worth of content is available. Reads may require random access across logs, but that should be okay because random access carries no seek penalty on an SSD, and the log sizing should keep us from becoming IOPS-bound.</p>
<h3>The Write Queues</h3>
<p>An incoming write is kept in a RAM write queue. If the RAM write queue is designed using blocks of RAM of the same log size, it will allow us to read the write queue RAM logs in a manner similar to the regular logs while reading content. Otherwise, reading from content in the write queue would need to be coded separately.</p>
<p>It should be possible to enable concurrent writes into the write queue by keeping the allocation of content blocks separate from the actual write (memory copy). The allocation of content blocks requires synchronized access to (only) a current_RAM_log and a current_RAM_log_offset. Once the allocation has been performed, multiple writes across multiple RAM logs at different offsets can happen concurrently. As each write completes, it updates a bytes_committed variable kept in every RAM log (under a lock or using CAS). Once bytes_committed equals the log size, the RAM log is ready to be committed to media.</p>
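<p>The allocate-then-copy split described above can be sketched roughly as follows. This is a minimal illustration, not the original implementation: the class names <code>RamLog</code> and <code>WriteQueue</code>, the tiny <code>LOG_SIZE</code>, and the use of a plain mutex instead of CAS are all assumptions made for the example.</p>

```python
import threading

LOG_SIZE = 1024  # bytes per log; illustrative only, real logs would be far larger


class RamLog:
    """One fixed-size in-memory log in the write queue (hypothetical name)."""

    def __init__(self, size=LOG_SIZE):
        self.buf = bytearray(size)
        self.bytes_committed = 0
        self.lock = threading.Lock()  # guards bytes_committed only

    def commit(self, offset, data):
        # The copy needs no lock: every writer owns a disjoint byte range.
        self.buf[offset:offset + len(data)] = data
        with self.lock:
            self.bytes_committed += len(data)
            return self.bytes_committed == len(self.buf)  # True once full


class WriteQueue:
    def __init__(self):
        self.alloc_lock = threading.Lock()
        self.current_ram_log = RamLog()
        self.current_ram_log_offset = 0
        self.to_be_committed = []  # full RAM logs waiting to go to media

    def allocate(self, size):
        """Reserve space under the lock; returns one or two (log, offset, size)."""
        assert 0 < size <= LOG_SIZE
        pieces = []
        with self.alloc_lock:
            while size > 0:
                room = LOG_SIZE - self.current_ram_log_offset
                if room == 0:  # current log fully allocated, start a new one
                    self.current_ram_log = RamLog()
                    self.current_ram_log_offset = 0
                    room = LOG_SIZE
                take = min(size, room)
                pieces.append(
                    (self.current_ram_log, self.current_ram_log_offset, take))
                self.current_ram_log_offset += take
                size -= take
        return pieces

    def write(self, data):
        pieces = self.allocate(len(data))  # short critical section
        pos = 0
        for log, offset, size in pieces:   # copies run outside alloc_lock
            if log.commit(offset, data[pos:pos + size]):
                with self.alloc_lock:
                    self.to_be_committed.append(log)
            pos += size
        return pieces
```

<p>Only the offset bookkeeping is serialized; the memory copies from different writers can overlap in time because they never touch the same bytes.</p>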
<p>A (synchronized) list of to-be-committed RAM logs is maintained, and a task is spawned as soon as the list is non-empty to commit those logs to media. Care should be taken to not have more than one log write in flight on any device at any point in time. This avoids presenting the device firmware with multiple simultaneous writes, which it would see as a random write pattern.</p>
<h3>Content Block Handles</h3>
<p>A content block allocation also involves creation of a block handle to be given back to the higher software layers. This block handle will be used by the higher layers to read the block (possibly multiple times) and then delete it. Writes to an existing block shouldn&#8217;t be supported, as in-place writes would present the media firmware with random writes and break the rule of one log-sized write at any point in time. Instead, writes should always go through new block creation.</p>
<p>A block handle encapsulates an ordered pair of (log, offset [into the log], size [after the offset]) tuples. In most cases, the first tuple alone is sufficient for the allocation request. However, in cases where current_RAM_log in the write queue doesn&#8217;t have enough space for the allocation request, the second tuple denotes the rest of the allocation. This also means that an allocation request cannot be larger than the log size. However, an aggregate data structure on top of this tuple pair could be built to address that. The tuple pair also comes in handy during garbage collection when the content block is re-written to the media. Using a tuple pair puts a cap on the maximum size of data movement at any given point in time during garbage collection. Thus, large blocks (using an aggregate data structure at a higher layer) would be moved piecewise during garbage collection, improving garbage collection efficiency. The other advantage is the simplicity of using just a tuple pair, which avoids code complexity and the allocation/manipulation of list data structures.</p>
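<p>As a rough sketch of the handle described above, the pair of tuples could be modelled like this. The names <code>Extent</code>, <code>BlockHandle</code> and <code>read_block</code>, and the use of integer log ids with a dictionary of byte strings standing in for logs, are assumptions for illustration only.</p>

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Extent:
    """One (log, offset, size) tuple; log_id is a hypothetical log identifier."""
    log_id: int
    offset: int
    size: int


@dataclass(frozen=True)
class BlockHandle:
    """Ordered pair of extents; the second is used only when the block
    straddles the end of the log it was allocated in."""
    first: Extent
    second: Optional[Extent] = None

    def extents(self):
        if self.second is None:
            return (self.first,)
        return (self.first, self.second)

    @property
    def size(self):
        return sum(e.size for e in self.extents())


def read_block(handle, logs):
    """Reassemble a block; logs maps log_id to that log's bytes."""
    parts = []
    for e in handle.extents():
        parts.append(bytes(logs[e.log_id][e.offset:e.offset + e.size]))
    return b"".join(parts)
```

<p>Because a handle never holds more than two extents, no list needs to be allocated per block, which matches the simplicity argument made above.</p>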
<h3>Content Block Deletion &#8211; Garbage Accounting/Collection</h3>
<p>When content blocks are deleted, a (protected) garbage_size variable on the log is updated to reflect the new, larger garbage size. Note that a content block deletion can cause up to two garbage size updates, one on each of the up to two logs it points to. A max-heap of the logs (ordered by garbage size) is maintained and updated upon block deletion. The top of the heap (maximum garbage size) is the most eligible log for garbage collection. Note that this won&#8217;t give us SSD wear-leveling, but 1) we can depend on the device firmware to do that, and 2) it can be done at the max-heap by designing a suitable metric that combines the number of writes with garbage size.</p>
<p>Garbage collection can be triggered whenever a (small) reserved pool of empty SSD logs is below its threshold. The RAM logs use this pool of SSD logs to flush their content into &#8211; which triggers garbage collection. The garbage collection task picks a log from the max-heap outlined above and starts re-writing the valid blocks present in it back to the storage layer. It can use the same read and write APIs which are used by the client for accessing the storage layer. Once all the valid blocks in a log are re-written, the log can be given back to the reserved pool of empty SSD logs.</p>
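<p>A minimal sketch of victim selection and collection along these lines is shown below. Python&#8217;s <code>heapq</code> is a min-heap, so garbage sizes are negated, and stale heap entries are skipped lazily instead of being updated in place; both are choices made for this example. The names <code>GarbageHeap</code> and <code>collect</code>, and the <code>rewrite</code> callback, are hypothetical.</p>

```python
import heapq


class GarbageHeap:
    """Max-heap of logs keyed by garbage size, with lazy invalidation."""

    def __init__(self):
        self._heap = []     # entries are (-garbage_size, log_id)
        self._garbage = {}  # log_id -> current garbage size

    def add_garbage(self, log_id, nbytes):
        total = self._garbage.get(log_id, 0) + nbytes
        self._garbage[log_id] = total
        # Older entries for this log stay in the heap and become stale.
        heapq.heappush(self._heap, (-total, log_id))

    def pop_victim(self):
        """Return (log_id, garbage_size) of the log with most garbage."""
        while self._heap:
            neg_size, log_id = heapq.heappop(self._heap)
            if self._garbage.get(log_id) == -neg_size:  # skip stale entries
                del self._garbage[log_id]
                return log_id, -neg_size
        return None


def collect(heap, valid_blocks, rewrite, free_pool):
    """Re-write the victim's valid blocks, then return it to the free pool."""
    victim = heap.pop_victim()
    if victim is None:
        return None
    log_id, _ = victim
    for block in valid_blocks.pop(log_id, []):
        rewrite(block)  # same write path the client uses
    free_pool.append(log_id)
    return log_id
```

<p>A wear-leveling metric could replace the plain garbage size as the heap key without changing the rest of the structure.</p>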
<p>Note that client writes depend on space in the RAM-log-based write queue, which in turn depends on space in the reserved pool of empty SSD logs, which in turn depends on garbage collection. If garbage collection uses the same write APIs as the client, it completes a circle of dependency back to the RAM-log-based write queue. To prevent deadlock, the write API reserves a few RAM logs for GC-induced writes and doesn&#8217;t use those logs for client writes. This breaks the cycle.</p>
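<p>The reservation idea can be illustrated with a small counter-based pool. This is only a sketch under assumed names (<code>RamLogPool</code>, <code>gc_reserve</code>); a real implementation would block or queue callers instead of returning a boolean, and would need locking.</p>

```python
class RamLogPool:
    """Free RAM logs, with a few held back for garbage-collection writes."""

    def __init__(self, total, gc_reserve):
        self.free = total
        self.gc_reserve = gc_reserve

    def try_acquire(self, for_gc=False):
        # Client writes may not dip into the reserve; GC writes may.
        floor = 0 if for_gc else self.gc_reserve
        if self.free > floor:
            self.free -= 1
            return True
        return False

    def release(self):
        self.free += 1
```

<p>With this split, a client that finds the pool exhausted waits for garbage collection, while garbage collection itself can always make progress using the reserved logs.</p>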
<h3>Data structures and Locking</h3>
<p>Here&#8217;s an outline of the data structures described above:</p>
<p>Storage Layer:</p>
<ul>
<ul>
<li>Storage Layer mutex (protects all the data structures below in the Storage Layer)</li>
<li>Write queue of RAM logs, current_RAM_log, current_RAM_log_offset</li>
<li>To-be-committed list of RAM logs</li>
<li>Reserved pool of SSD logs</li>
<li>Max heap of logs w.r.t garbage size</li>
</ul>
</ul>
<p>Log:</p>
<ul>
<ul>
<li>Log RW lock</li>
<li>Unordered Set of valid content blocks</li>
<li>Garbage Size</li>
<li>(For RAM logs only)
<ul>
<li>Pointer to memory</li>
<li>bytes_committed (used in write API for supporting concurrent writes)</li>
</ul>
</li>
<li>(For SSD logs only)
<ul>
<li>SSD device / file</li>
<li>SSD offset into device / file</li>
</ul>
</li>
</ul>
</ul>
<p>Content Block:</p>
<ul>
<ul>
<li>pair of (log, offset, size) tuples</li>
</ul>
</ul>
<p>The Storage Layer mutex is taken to protect the various free lists, write queues, and the allocation log and offset. The log&#8217;s RW lock protects the log&#8217;s metadata (garbage size, set of content blocks). Note that the actual content doesn&#8217;t need any locking because once committed it is immutable. During a read, we take read locks on the pair of logs for the content block. These read locks protect against garbage collection of the logs (which would take write locks on them to change their garbage size and/or set of valid content blocks). Note that the locks taken while reading should honor a lock ordering to avoid deadlocks. A simple scheme is to use the address of the log data structure as the lock order. Technically, we could allow more concurrency by permitting appends to the set of valid content blocks in RAM logs, or by permitting removal of content blocks and increases in garbage size while the log is not selected for garbage collection. However, the above locking scheme works well in practice and I saw no surprising bottlenecks caused by lock contention.</p>
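<p>The address-ordered locking rule can be sketched as below. Python&#8217;s standard library has no reader-writer lock, so a plain <code>threading.Lock</code> stands in for the log&#8217;s RW lock here, and <code>id()</code> stands in for the data structure&#8217;s address; the names <code>Log</code> and <code>lock_logs</code> are assumptions for the example.</p>

```python
import threading
from contextlib import contextmanager


class Log:
    def __init__(self, name):
        self.name = name
        self.lock = threading.Lock()  # stand-in for the log's RW lock


@contextmanager
def lock_logs(*logs):
    """Lock the given logs in a global order (by id) to avoid deadlock."""
    # Deduplicate first so the same log is never acquired twice.
    ordered = sorted({id(log): log for log in logs}.values(), key=id)
    for log in ordered:
        log.lock.acquire()
    try:
        yield ordered
    finally:
        for log in reversed(ordered):
            log.lock.release()
```

<p>Two threads that need the same pair of logs will always acquire them in the same order, whichever order their block handles list them in, so neither can hold one lock while waiting on the other.</p>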
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2012/01/09/multi-log-structured-storage-layer/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Threadpool/Async vs Multithreading</title>
		<link>http://defectivecompass.wordpress.com/2011/12/01/threadpool-async-vs-multihreading/</link>
		<comments>http://defectivecompass.wordpress.com/2011/12/01/threadpool-async-vs-multihreading/#comments</comments>
		<pubDate>Fri, 02 Dec 2011 00:28:58 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=161</guid>
		<description><![CDATA[Threadpools and async IO are all rage these days for writing highly concurrent servers. As far as I can remember this is a relatively new development and a few years ago, highly concurrent servers were written using async IO (without &#8230; <a href="http://defectivecompass.wordpress.com/2011/12/01/threadpool-async-vs-multihreading/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=161&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Threadpools and async IO are all the rage these days for writing highly concurrent servers. As far as I can remember, this is a relatively new development; a few years ago, highly concurrent servers were written using async IO (without threadpools&#8230; there weren&#8217;t so many cores then <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  ) or using multi-threading. What changed? Are servers designed using async / threadpools better than multi-threaded ones? Why?</p>
<p>Multi-threading involves using OS threads to schedule a bunch of concurrent tasks. The OS saves and restores the entire processor state when a processor core switches the currently executing thread. Processor state can be large for 64-bit processors (sixteen 64-bit general-purpose registers and sixteen 128-bit SSE registers = 16 * (8 + 16) bytes = 384 bytes, without counting the segment, debug and control registers). There is also the issue of switching to a new stack, which may be at a different location in memory and may be cache cold. So if there are threads which context switch very frequently (say, every few instructions), then the context switch overhead can be disproportionately high.</p>
<p>Async / threadpool helps in this scenario by keeping the same OS thread on a core (so keeping the top of the stack hot in the cache) and just calling into different tasks (essentially userspace functions or methods) enqueued in a queue. Switching between tasks doesn&#8217;t require saving the full processor state because each function saves only as much register state as it uses as it moves along its code path. This favors small tasks because little processor state is saved/restored and the top of the stack always remains hot.</p>
<p>However, async / threadpools have a significant drawback. If the programmer fails to split their tasks evenly in terms of resource consumption (some tasks taking a disproportionately large number of CPU cycles), then the async / threadpool system adds significant latency to small tasks queued up behind large ones. This problem doesn&#8217;t arise with threading because long-running OS threads are preempted by the kernel to give CPU time fairly to other threads in the system.</p>
<p>The situation can be ameliorated by making the number of threads in the threadpool somewhat larger than the number of cores. This allows some small tasks to be queued (and subsequently executed with a minor hit in latency) on threads which don&#8217;t have large tasks. However, this fails to solve the problem completely. You can never be sure how many threads you should have in the threadpool, simply because in most cases you don&#8217;t know how many long-running tasks you will have in the system. Furthermore, if you have too many threads in the threadpool, then you will end up context switching between them needlessly.</p>
<p>A better solution to the problem is to let the programmer break long running tasks into smaller tasks and schedule the subsequent tasks at the completion of the preceding tasks. This will allow for some amount of fairness in the task queue. However, this seems like asking the programmer to solve the same problem for every task he creates instead of having a single proven-to-work solution such as threading take care of the problem.</p>
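<p>To make the idea of splitting long tasks concrete, here is a small single-threaded sketch in which each task is a generator that yields between slices of work and is re-enqueued at the back of the queue after every slice. The function names <code>run</code> and <code>work</code> are made up for the example, and a real threadpool would run one such loop per worker thread.</p>

```python
from collections import deque


def work(steps):
    """A task made of `steps` slices; it yields control after each slice."""
    for _ in range(steps):
        yield


def run(tasks):
    """Round-robin over (name, generator) tasks; returns the slice order."""
    queue = deque(tasks)
    order = []
    while queue:
        name, task = queue.popleft()
        try:
            next(task)                  # run one slice of the task
        except StopIteration:
            continue                    # task finished, drop it
        order.append(name)
        queue.append((name, task))      # schedule the next slice at the back
    return order


# The short task runs after one slice of the long task, not after all three.
print(run([("long", work(3)), ("short", work(1))]))
# prints ['long', 'short', 'long', 'long']
```

<p>The fairness here depends entirely on every task author choosing reasonable yield points, which is the burden on the programmer that the paragraph above describes.</p>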
<p>Is there a good solution to this problem? How about the following:<br />
We bring preemptive scheduling up from the kernel into userland. Essentially, the threadpool has timers on each of its threads / cores and if the timer fires before the current task is finished then the task is suspended (will require saving of processor state) and is enqueued at the end of the thread&#8217;s task queue. Windows supports Get/SetThreadContext() APIs which can help with this without creating new threads. Otherwise, one can always create new threads to resume the queued up tasks while keeping the current thread (one with the long task) suspended. The threads given up after preempted tasks are completed can be reused to satisfy more thread creations.</p>
<p>The above will achieve the best of both worlds. It will allow small task switching to be efficiently done in the userspace and at the same time allow long running tasks to be handled preemptively, thus reducing latency issues with async / threadpool systems.</p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2011/12/01/threadpool-async-vs-multihreading/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Cloud computing APIs and Timeouts</title>
		<link>http://defectivecompass.wordpress.com/2011/10/24/cloud-computing-apis-and-timeouts/</link>
		<comments>http://defectivecompass.wordpress.com/2011/10/24/cloud-computing-apis-and-timeouts/#comments</comments>
		<pubDate>Tue, 25 Oct 2011 00:24:04 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=150</guid>
		<description><![CDATA[Cloud computing APIs like other system or application APIs have their success and failure modes. However, cloud computing APIs have another failure mode which most system or application APIs don&#8217;t have &#8211; timeouts. One problem with timeouts is that they &#8230; <a href="http://defectivecompass.wordpress.com/2011/10/24/cloud-computing-apis-and-timeouts/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=150&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Cloud computing APIs like other system or application APIs have their success and failure modes. However, cloud computing APIs have another failure mode which most system or application APIs don&#8217;t have &#8211; timeouts. One problem with timeouts is that they take time <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> , which implies that whichever higher level operation is being performed, it generally has much higher latency before a response can be gathered for the user (which most probably is going to be an error).</p>
<p>There is another problem with timeouts. Timeouts are required in distributed systems because of the <a href="http://en.wikipedia.org/wiki/Consensus_(computer_science)">consensus problem</a>: the FLP impossibility result shows that no deterministic protocol can guarantee consensus in an asynchronous system if even one process may fail. Cloud computing APIs require data consistency (consensus on data) all the time, but they cannot fundamentally guarantee it. So they shift the problem to consensus on mutual time. A conservative timeout is an engineering approximation of a consensus that the operation has completed with an error.</p>
<p>Now, computer clocks rarely run at exactly the same rate. The difference in rate is usually called clock drift, and the difference it produces between two clocks&#8217; readings is called <em>clock skew</em>. The maximum possible clock skew must be added to the timeout of every cloud computing API one is using so that we are <em>really</em> sure that the cloud computing operation beneath the API has completed with an error.</p>
<p>Normally, this is not a big problem. However, with a proliferation of the cloud APIs, and increased composition and layering of these APIs, the clock skew error needs to be added to the timeouts at <em>every</em> layer. This results in highly inflated timeout values at the end user.</p>
<p>So what can we do? IMHO, the first thing to do is to reduce timeouts at their origins. Most timeouts originate with heartbeats between components, and it is important to have shorter heartbeat intervals. Heartbeats cannot be made very frequent because they use up network IOPS. However, it&#8217;s definitely something that should be tuned. Also, the timeout of an operation is often the maximum of all the timeouts that we will experience in its sub-operations. Thus, sub-operations should be chosen such that their maximum timeout is lower than that of the alternatives.</p>
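<p>The way timeouts inflate through layers can be shown with a little arithmetic. The sketch below assumes a simple additive model in which each layer waits for the slowest of its sub-operations plus a fixed clock skew margin; the function name <code>operation_timeout</code> and all the numbers are invented for illustration.</p>

```python
def operation_timeout(sub_timeouts, skew_margin):
    """Timeout for an operation whose sub-operations run with the given
    timeouts: the largest of them, plus a margin for clock skew."""
    return max(sub_timeouts) + skew_margin


SKEW = 1.0  # seconds; hypothetical bound on clock skew per layer

# Three layers, each built on the one below it.
storage = operation_timeout([5.0, 8.0], SKEW)     # 8.0 + 1.0 = 9.0
service = operation_timeout([storage, 3.0], SKEW)  # 9.0 + 1.0 = 10.0
frontend = operation_timeout([service], SKEW)      # 10.0 + 1.0 = 11.0

print(storage, service, frontend)  # prints 9.0 10.0 11.0
```

<p>In this model, lowering the slowest sub-operation timeout from 8 seconds to 5 seconds lowers every layer above it by the same amount, while each extra layer of composition adds one more skew margin.</p>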
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2011/10/24/cloud-computing-apis-and-timeouts/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Elevators</title>
		<link>http://defectivecompass.wordpress.com/2011/03/08/elevators/</link>
		<comments>http://defectivecompass.wordpress.com/2011/03/08/elevators/#comments</comments>
		<pubDate>Wed, 09 Mar 2011 04:36:13 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Life]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=155</guid>
		<description><![CDATA[I am sure that a lot of you live or work in a building which has a battery of elevators to handle intra-building traffic. While I was (ironically) waiting for the elevators, I wondered why do the elevators require people &#8230; <a href="http://defectivecompass.wordpress.com/2011/03/08/elevators/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=155&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I am sure that a lot of you live or work in a building which has a battery of elevators to handle intra-building traffic. While I was (ironically) waiting for the elevators, I wondered why the elevators require people to first request up/down and <em>then,</em> inside the elevator, request the proper floor. Wouldn&#8217;t it be more efficient if you could request the destination floor right at your current floor?</p>
<p>Imagine that instead of up/down buttons and up/down lights for each elevator on each floor,  you had a panel of buttons arranged vertically with lights for each elevator adjacent to it. Something like<br />
[1] r g b y<br />
[2] r g b y<br />
&#8230;<br />
where [1], [2], etc. are buttons and r g b y&#8230; are colored lights for elevators. When you want to go to a given floor, you directly press the floor&#8217;s button on the panel and the system immediately schedules an elevator to pick you up (you see that elevator&#8217;s colored light come on in the panel). Then you walk to your elevator&#8230; when it comes, you enter it, and then leave at your destination. The panel of buttons inside the elevator is optional. However, the elevator should show you which floors it is going to stop at.</p>
<p>I think the biggest gain from using this approach is that you can schedule elevators efficiently. This is because the system has very early knowledge of destination floors. Thus, if two people want to go to the same floor, the system brings up only one elevator. At the same time, if two people want to go in the same direction but to different floors, the system may bring up two elevators for them. Thus, this allows &#8220;floor load&#8221; to be distributed equally among all the elevators instead of depending on people to avoid flash crowds at a given elevator.</p>
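<p>A toy version of this scheduling rule might look like the following. It is only a sketch of the idea in this post: requests for the same trip share an elevator, and a new trip goes to whichever elevator currently has the fewest trips assigned. The <code>Dispatcher</code> class and its tie-breaking by panel order are assumptions, and travel time and direction are ignored.</p>

```python
class Dispatcher:
    """Assigns an elevator as soon as a destination button is pressed."""

    def __init__(self, elevator_names):
        # Trips assigned to each elevator, as (origin, destination) pairs.
        self.trips = {name: set() for name in elevator_names}

    def request(self, origin, destination):
        trip = (origin, destination)
        # Same trip already scheduled: share that elevator.
        for name, trips in self.trips.items():
            if trip in trips:
                return name
        # Otherwise pick the least loaded elevator (first one wins ties).
        name = min(self.trips, key=lambda n: len(self.trips[n]))
        self.trips[name].add(trip)
        return name


panel = Dispatcher(["r", "g", "b", "y"])
print(panel.request(1, 7))  # prints r
print(panel.request(1, 7))  # prints r (same destination, same elevator)
print(panel.request(1, 9))  # prints g (different floor, another elevator)
```

<p>The returned name corresponds to the colored light that would come on next to the pressed button.</p>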
<p>Another gain is that people don&#8217;t have to think about which floor they are currently on and decide whether they need to go up or down. All they need to think about is the destination floor (which they already do). This also reduces crowd movement inside the elevator near the panel where space is much smaller than a floor&#8217;s lobby.</p>
<p>The disadvantage of this approach is that you need more buttons, lights and wires installed on every floor. However, the cost of that shouldn&#8217;t be very high compared to the elevator system itself.</p>
<p>P.S. It would probably be possible to produce an even better elevator schedule if the feedback from the system about which elevator is going to which floor were allowed to come late (as late as when an elevator actually arrives). However, first, that doesn&#8217;t seem good from a usability standpoint (the user has to wait for a flash or a bell to indicate that their elevator has come). Secondly, that would encourage crowding around the panel in the floor lobby. So, I would let go of this schedule optimization for usability&#8217;s sake.</p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2011/03/08/elevators/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Pushing the limits in Internet technology</title>
		<link>http://defectivecompass.wordpress.com/2011/02/24/pushing-the-limits-in-internet/</link>
		<comments>http://defectivecompass.wordpress.com/2011/02/24/pushing-the-limits-in-internet/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 07:05:26 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Economics]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=143</guid>
		<description><![CDATA[I went through an interesting talk by Geoff Huston at the linux.conf.au conference (video link). The tone of the talk reminds me of a paper I read some time back by Bob Briscoe, &#8220;Flow Rate Fairness: Dismantling a Religion&#8221;. While Bob&#8217;s &#8230; <a href="http://defectivecompass.wordpress.com/2011/02/24/pushing-the-limits-in-internet/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=143&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>I went through an interesting talk by <a href="http://www.apnic.net/events/apnic-speakers/geoff-huston">Geoff Huston</a> at the linux.conf.au conference (<a href="http://blip.tv/file/4692762/">video link</a>). The tone of the talk reminds me of a paper I read some time back by Bob Briscoe, &#8220;<a href="http://www.sigcomm.org/ccr/drupal/?q=node/172">Flow Rate Fairness: Dismantling a Religion</a>&#8221;. While Bob&#8217;s argument was about how the fundamental protocols on the Internet (TCP) lack a market-oriented view of fairness, Geoff&#8217;s argument is about how the fundamental protocols on the Internet (IPv4) lack a market-oriented view of allocating IP addresses. More importantly, this is again a lesson for me in the importance of business logistics in the adoption of any kind of <em>infrastructure</em> technology. While this interplay may not be so important for end user technology (such as consumer devices like smartphones or laptops), it is extremely important for technology which forms the basis of the businesses of a large number of independent firms. Geoff argued that the primary reason for the lack of IPv6 adoption was a lack of understanding of market incentives. The future of both address allocation and flow rate fairness remains unknown. I look forward to an interesting interplay between technology, openness and market forces in the Internet. We live in interesting times!</p>
<p>related posts: <a href="http://defectivecompass.wordpress.com/2008/07/06/network-neutrality-and-flow-rate-fairness">Network Neutrality and flow rate fairness</a></p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2011/02/24/pushing-the-limits-in-internet/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>(x) amount of RAM should be enough for anybody</title>
		<link>http://defectivecompass.wordpress.com/2011/01/23/x-amount-of-ram-should-be-enough/</link>
		<comments>http://defectivecompass.wordpress.com/2011/01/23/x-amount-of-ram-should-be-enough/#comments</comments>
		<pubDate>Mon, 24 Jan 2011 07:27:15 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=137</guid>
		<description><![CDATA[Just came across this on coding horror: http://www.codinghorror.com/blog/2011/01/24-gigabytes-of-memory-ought-to-be-enough-for-anybody.html . RAM is as good a resource in a computer as CPU time itself. While CPU scheduling can be done in the order of milliseconds (thread context save/restore + CPU cache re-population) which &#8230; <a href="http://defectivecompass.wordpress.com/2011/01/23/x-amount-of-ram-should-be-enough/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=137&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Just came across this on coding horror: <a href="http://www.codinghorror.com/blog/2011/01/24-gigabytes-of-memory-ought-to-be-enough-for-anybody.html">http://www.codinghorror.com/blog/2011/01/24-gigabytes-of-memory-ought-to-be-enough-for-anybody.html</a></p>
<p> <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> .</p>
<p>RAM is as much a resource in a computer as CPU time itself. While CPU scheduling can be done in the order of milliseconds (thread context save/restore + CPU cache re-population), which makes applications appear to run simultaneously, the same cannot be said for RAM scheduling. RAM scheduling requires paging out memory to disk and giving the newly freed RAM to the demanding application. Disk I/O is slow, so a RAM context switch cannot be done in milliseconds. Thus, while it is okay to have a CPU just good enough to keep the UI of a <em>single application</em> responsive, the RAM working sets of <em>all applications</em> need to be resident in RAM for a well functioning (responsive) computer.</p>
<p>From a developer&#8217;s point of view, given that he can measure and control exactly how much time it takes for his <em>single application</em> to respond on a CPU, dealing with CPU scheduling is generally fine for him. He assumes that his app is the only app running on the user&#8217;s system and, in most cases, his assumptions about code path lengths for app responsiveness are reasonably accurate. Memory is a whole different beast. The developer has <em>no idea</em> how large a working set he can assume the computer will give his app. The worst case figure is essentially zero, so developers go by the best case figure of having the entire system RAM available to their application. If for some reason the sum of the working sets of all the running applications is greater than RAM, the computer <em>fundamentally cannot</em> schedule those applications effectively, leading to a phenomenon commonly referred to as &#8220;thrashing&#8221;, where the computer spends most of its time context switching between RAM working sets without getting any actual work done.</p>
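<p>The thrashing condition above is simple arithmetic, and a small Python sketch may help to pin it down. The function names and the per-application figures are made up for illustration; real working sets change over time and are much harder to measure than this suggests.</p>

```python
def will_thrash(working_sets_mb, ram_mb):
    """Return True when the combined working sets do not fit in RAM."""
    return sum(working_sets_mb.values()) > ram_mb


def shortfall_mb(working_sets_mb, ram_mb):
    """How much memory would have to be paged out at any moment (0 if none)."""
    return max(0, sum(working_sets_mb.values()) - ram_mb)


if __name__ == "__main__":
    # Hypothetical working sets, in megabytes.
    apps = {"browser": 1500, "ide": 2000, "game": 3000}
    for ram in (8192, 4096):
        print(ram, will_thrash(apps, ram), shortfall_mb(apps, ram))
```

<p>No single application in the example is unreasonable on its own; it is only the sum that decides whether the machine stays responsive, and that sum is exactly what an individual developer cannot see.</p>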
<p>The solution we have all been comfortable with so far has been to keep inflating the amount of RAM in our systems so that the developer&#8217;s expectation of the amount of RAM needed trails behind the actual RAM present in our systems. We could instead aim to bring memory closer to the CPU way of doing things, where it is possible for developers to <em>track</em> the working set requirements of their applications and to gracefully degrade the application experience with smaller amounts of RAM. A simple malloc() + demand paging in the background doesn&#8217;t cut it then. We would need a more elaborate memory programming model where the system lets the application know when it is taking RAM away from it.</p>
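<p>As a rough illustration of what such a programming model could look like, here is a sketch in Python. MemoryBroker stands in for the operating system and ImageCache for an application; both classes, their method names and the single shared budget are hypothetical and do not correspond to any existing OS interface.</p>

```python
class ImageCache:
    """A hypothetical application cache that degrades gracefully."""

    def __init__(self):
        self.entries = {}  # key -> size in MB, kept in insertion order

    def put(self, key, size_mb):
        self.entries[key] = size_mb

    def used_mb(self):
        return sum(self.entries.values())

    def on_budget_change(self, budget_mb):
        # Called when the system takes RAM away: evict the oldest
        # entries until the cache fits in the new budget.
        while self.entries and self.used_mb() > budget_mb:
            oldest = next(iter(self.entries))
            del self.entries[oldest]


class MemoryBroker:
    """Stand-in for the OS: announces budget changes to registered apps."""

    def __init__(self):
        self.listeners = []

    def register(self, callback):
        self.listeners.append(callback)

    def set_budget(self, budget_mb):
        # Tell every registered application its new RAM budget.
        for callback in self.listeners:
            callback(budget_mb)


if __name__ == "__main__":
    cache = ImageCache()
    for key, size in (("a", 100), ("b", 200), ("c", 50)):
        cache.put(key, size)
    broker = MemoryBroker()
    broker.register(cache.on_budget_change)
    broker.set_budget(260)  # the system reclaims RAM from the app
    print(list(cache.entries), cache.used_mb())
```

<p>The point of the sketch is the direction of the call: the application is told about the loss of RAM and chooses what to give up, instead of the pager silently choosing for it.</p>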
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2011/01/23/x-amount-of-ram-should-be-enough/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Peer-to-peer Google Wave &#8211; cloudless style</title>
		<link>http://defectivecompass.wordpress.com/2009/08/04/peer-to-peer-google-wave-cloudless-style/</link>
		<comments>http://defectivecompass.wordpress.com/2009/08/04/peer-to-peer-google-wave-cloudless-style/#comments</comments>
		<pubDate>Tue, 04 Aug 2009 09:32:38 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[Social]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=126</guid>
		<description><![CDATA[If you read my Google Wave review, you may also have noticed my comment about a non-cloud way of doing Google Wave. There are quite a number of people who are getting concerned about more and more personal information being &#8230; <a href="http://defectivecompass.wordpress.com/2009/08/04/peer-to-peer-google-wave-cloudless-style/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=126&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>If you read my <a href="http://defectivecompass.wordpress.com/2009/06/01/google-wave-is-impressive/">Google Wave review</a>, you may also have noticed my comment about a non-cloud way of doing Google Wave. Quite a number of people are getting concerned about more and more personal information being uploaded to the Internet and the privacy nightmare it is turning into. Google Wave itself seems to be an interesting concept for next generation collaboration on the Internet. It would be unwise not to look at it from a privacy-centric viewpoint.</p>
<p>Communication over the Internet is, at a very low level, done via message passing. The network essentially is a collection of computers (clients and servers) and messages are passed between them for any kind of communication. However, with the rise of forums, wikis and web 2.0 sites (and cloud computing in general), the communication pattern at a higher level is no longer simple message passing but collaborative editing of documents which live on some server. Suddenly, the actual nodes in the network and the messages between them are no longer the conduit of information. The living, evolving document is the new, higher level conduit of information. For all practical purposes, the collaboratively edited document is the &#8220;network channel&#8221; for communication between the participating clients.</p>
<p>Google Wave takes this essential idea much further. A wave document is the central focal point for communication between users, robots, external websites and even applications. It would not be a stretch of imagination to give an application a pointer to a wave document and make it automatically collaborate with other participants and applications without knowing about their IP addresses or connectivity status.</p>
<p>The unfortunate part of the story is that wave documents still reside in the cloud and hence will suffer from the same privacy concerns we have been dealing with when using wikis and forums. Peer-to-peer communication would be better for privacy but p2p architectures are still very much message passing oriented. Can we marry Google Wave and peer-to-peer?</p>
<h2>P2P waves</h2>
<p>It is not very hard to imagine an application which can keep peer-to-peer connections with friends (a <a href="http://en.wikipedia.org/wiki/Friend-to-friend">friend-to-friend</a> network) and collaborate over wave-like documents. The documents would be replicated, similar to the way they are replicated between federated domains in Google Wave. In fact, the easiest way to do this may be to use some VPN software to create a private network between friends and run the open-source Google Wave servers on each node. Some such VPN software is not very hard to configure [e.g. <a href="http://www.remobo.com/">Remobo</a>, <a href="http://www.leafnetworks.net/index.jsp">Leaf</a>, <a href="http://wippien.com/">Wippien</a>]. However, each user would then basically maintain his or her own wave domain, and the Google Wave servers are probably not designed with such a use case in mind.</p>
<p>Even if we leave aside the reusability of Google&#8217;s reference implementation, some practical architectural problems will remain. The Google Wave servers are probably not designed with frequent disconnection in mind. The wave document should be accessible from some authoritative server at <em>all</em> points in time. Also, given the federated authentication, parts of the wave document may live entirely within the domains of the users first given access to them. Adding more users to such parts of the document would require those domains to be online. This <em>always available</em> assumption might hold for cloud storage but doesn&#8217;t hold for peer-to-peer architectures&#8230; at least the kind discussed above.</p>
<h2>P2P waves with redundant but encrypted content</h2>
<p>I was thinking about this problem when I hit upon an idea. The basic problem is that if I want to attach a private message for a friend of mine to a peer-to-peer wave document, then I cannot send the message if my friend is offline. However, if I <em>encrypt</em> my message using a shared key, then even though the wave document is shared across the p2p network, only my trusted friend(s) can read the private message. Thus comes the key insight:</p>
<p style="text-align:center;"><em>Redundantly propagating encrypted content among peers can help with the &#8220;</em><em>always available&#8221; problem with peer-to-peer networks.</em></p>
<p style="text-align:left;">However, if we are willing to do this for private messages, we can also extend the same for the whole wave document! Thus, instead of maintaining connectivity with at least a few of the participants in the wave document and sending the wave document in plain text to them, we can now maintain connectivity with <em>unknown</em> peers in a peer-to-peer network and encrypt the wave document with keys shared with just the users the document is intended for. Such an architecture would ensure good connectivity, fast downloads and very importantly <em>uninterrupted availability</em> which is one of the main strengths of storing wave documents in the cloud.</p>
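<p style="text-align:left;">Here is a small Python sketch of that idea. Everything in it is an illustrative assumption: the Peer class, the publish and fetch helpers, and especially the cipher, which is a toy keystream built from SHA-256 only so that the example runs on the standard library. It provides no integrity protection and must not be used for real secrets; a real system would use a vetted authenticated cipher from a cryptography library.</p>

```python
import hashlib
import secrets


def keystream(key, nonce, length):
    """Toy keystream: SHA-256 over key + nonce + counter. Not real crypto."""
    out = bytearray()
    counter = 0
    while length > len(out):
        block = hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:length])


def seal(key, plaintext):
    """Encrypt with a fresh random nonce; the nonce is stored in front."""
    nonce = secrets.token_bytes(16)
    stream = keystream(key, nonce, len(plaintext))
    return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))


def open_sealed(key, blob):
    """Reverse seal(): split off the nonce and XOR the keystream back out."""
    nonce, body = blob[:16], blob[16:]
    stream = keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))


class Peer:
    """An untrusted peer that stores opaque blobs it cannot read."""

    def __init__(self, name):
        self.name = name
        self.online = True
        self.store = {}  # document id -> encrypted blob


def publish(peers, doc_id, blob):
    """Replicate the encrypted blob to every peer that is online."""
    for peer in peers:
        if peer.online:
            peer.store[doc_id] = blob


def fetch(peers, doc_id):
    """Return the blob from the first online peer that has it, else None."""
    for peer in peers:
        if peer.online and doc_id in peer.store:
            return peer.store[doc_id]
    return None


if __name__ == "__main__":
    shared_key = secrets.token_bytes(32)   # known only to the intended readers
    peers = [Peer("p1"), Peer("p2"), Peer("p3")]
    publish(peers, "wave-1", seal(shared_key, b"private note for my friend"))
    peers[0].online = False                # one replica disappears
    print(open_sealed(shared_key, fetch(peers, "wave-1")))
```

<p style="text-align:left;">The document stays readable for the key holders as long as any one replica is reachable, while the peers that hold the replicas only ever see ciphertext.</p>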
<p style="text-align:left;">Curiously, a peer-to-peer network of the above nature already exists. It&#8217;s called <a href="http://www.wuala.com/en/">Wuala</a> and it is a peer-to-peer file backup and sharing service with attention to privacy, selective sharing of files and file availability. A very interesting tech talk about the internals of the technology behind Wuala is <a href="http://www.youtube.com/watch?v=3xKZ4KGkQY8">here</a>. Though the Wuala guys have currently focused on just getting the peer-to-peer storage right (which is a tough computer science problem in itself), it is not very hard to imagine hosting wave documents on top of it. The essential idea is to take read-only file sharing and make it read-write with some level of revision history and conflict management.</p>
<p style="text-align:left;">However, we definitely lose on realtime communication with a Wuala-like approach. Peer-to-peer storage and lookup will definitely have higher latency than a direct connection to a wave server. On the other hand, that may not be a small price to pay for privacy.</p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2009/08/04/peer-to-peer-google-wave-cloudless-style/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Perspective on Google Chrome OS, iPhone / Palm Pre and Microsoft Windows</title>
		<link>http://defectivecompass.wordpress.com/2009/07/17/perspective-on-google-chrome-os-iphone-palm-pre-and-microsoft-windows/</link>
		<comments>http://defectivecompass.wordpress.com/2009/07/17/perspective-on-google-chrome-os-iphone-palm-pre-and-microsoft-windows/#comments</comments>
		<pubDate>Sat, 18 Jul 2009 04:12:20 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=118</guid>
		<description><![CDATA[Microsoft being in the software business pretty much relies on the fact that they deliver compelling platform value to the hardware vendors to whom they sell their client OS. What strikes me most about the iPhone / Palm Pre development &#8230; <a href="http://defectivecompass.wordpress.com/2009/07/17/perspective-on-google-chrome-os-iphone-palm-pre-and-microsoft-windows/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=118&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>Microsoft being in the software business pretty much relies on the fact that they deliver compelling platform value to the hardware vendors to whom they sell their client OS. What strikes me most about the iPhone / Palm Pre development is the fact that they subsidize their hardware to such an extent that it doesn&#8217;t make any sense for hardware vendors to sell their bare hardware with Microsoft Software (priced separately) on top of it. You cannot get hardware of the class of iPhone / Palm Pre at the aggressive price point of $200 <em>without already being locked into a platform</em>.</p>
<p>Microsoft cannot just use their old &#8220;sell a software platform over commodity hardware&#8221; approach to get to the feature list of iPhone / Palm Pre. They will <em>have to</em> partner with somebody or make their own hardware. This already happened with game consoles.</p>
<p><a href="http://googleblog.blogspot.com/2009/07/introducing-google-chrome-os.html">Google Chrome OS</a> is possibly a similar shot at consumer notebook / netbook platform business. Here is a hypothetical scenario: Somebody makes a really slick 3G/4G/WiFi capable netbook. However, instead of them scrambling for a free Linux distribution, Google <strong>pays</strong> them to install their OS on their netbook and subsidize its price for consumers. It is not such a crazy idea given that Google already pays Mozilla to make Google search the default and subsidize Firefox (making it free, actually) and many <a href="http://www.urbandictionary.com/define.php?term=craplet">craplet</a> providers pay notebook sellers to put craplets on your notebook. The netbook OS itself is locked in to Google technologies and Google webapps. Google makes money off their web platform. Suddenly, there is a price difference (a huge one actually&#8230; given Microsoft&#8217;s Windows OEM pricing is high compared to netbook prices) and Google has an advantage in consumer netbook space.</p>
<p>However, it&#8217;s not that rosy for Google yet. Though I talked about &#8220;Google technologies&#8221; and &#8220;Google platform&#8221;, there is none right now which makes as much money as search. It would be better for Google if they came up with a bunch of popular webapps that make ad money for them&#8230; something which people would use most of the time. Google would then be able to play its huge and successful ad network to its advantage.</p>
<p>What can Microsoft do? Given that it&#8217;s significantly behind Google in advertising revenue, it should play on its other strengths. It is investing heavily in Azure and providing ways for enterprises to transition to and operate off the cloud. However, this article is about consumer devices. <em>If Microsoft focuses on integrating Windows with Azure specific services then they will have a way to subsidize notebooks / netbooks similar to the above scenario and make money from Azure services</em>. However, its situation is also similar to Google&#8217;s&#8230; they don&#8217;t have a highly successful (money generating) &#8220;web platform&#8221; yet.</p>
<p>What kind of web platform am I talking about? Some basic services are essential:</p>
<ol>
<li>Single sign-on across web applications</li>
<li>Built-in Internet-wide notification mechanism (nice interface over email / IM).</li>
<li>Collaboration platform (not very different from Live Mesh, though something like Google Wave is probably better)</li>
<li>Some basic apps like blog publishing, photo sharing etc.</li>
</ol>
<p>Microsoft already has such services under the <a href="http://home.live.com">Live</a> branding. However, the critical step really is to release Azure frameworks to enable other developers to tie in to Live web applications. It may not be a bad idea to showcase such apps in an App Store much like the iPhone App Store and make money off that.</p>
<p>Google has an edge in that it starts to make money from its search the moment Chrome OS gets into consumer hands. Also, it has Google App Engine, which competes with Azure. In a sense, Google has two sources for making money, ad revenue and Google App Engine, and it is strong in ads. Microsoft is not so strong in ads and, like Google, is just starting with Azure.</p>
<p>It is also interesting to note that Facebook already has a way for applications to be rolled into a highly popular social network. However, its inability to make money through advertising and keeping the social graph closed inside Facebook makes me feel that it&#8217;s necessary to look at providing cloud infrastructure as a means of making money. Instead of keeping a closed social network, maybe it&#8217;s a better idea to open up the network and compete for webapp developers who would target your cloud platform. Of course, all of this is great for the consumer, who gets more and more free applications and subsidized hardware. However, this also means that traditional desktops like the Windows desktop, Mac OS X, KDE and Gnome don&#8217;t cut it for the netbooks&#8230; and the difference between netbooks and notebooks widens as the developments in the two different ecosystems diverge from each other.</p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2009/07/17/perspective-on-google-chrome-os-iphone-palm-pre-and-microsoft-windows/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Google Wave is impressive</title>
		<link>http://defectivecompass.wordpress.com/2009/06/01/google-wave-is-impressive/</link>
		<comments>http://defectivecompass.wordpress.com/2009/06/01/google-wave-is-impressive/#comments</comments>
		<pubDate>Tue, 02 Jun 2009 02:18:09 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>
		<category><![CDATA[Internet]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=111</guid>
		<description><![CDATA[So, if you haven&#8217;t noticed yet, Google Wave happened recently. If you have like an hour to kill for some really impressive tech demo, you might want to take a look at the video on the link. Its a fairly &#8230; <a href="http://defectivecompass.wordpress.com/2009/06/01/google-wave-is-impressive/">Continue reading <span class="meta-nav">&#8594;</span></a><img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=defectivecompass.wordpress.com&amp;blog=387750&amp;post=111&amp;subd=defectivecompass&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p>So, if you haven&#8217;t noticed yet, <a href="http://wave.google.com/">Google Wave</a> happened recently. If you have an hour to spare for a really impressive tech demo, take a look at the video at the link.</p>
<p>It&#8217;s a fairly ambitious project, one that merges email and chat into one seamless application. I call it ambitious because it tries to do the following <em>simultaneously</em>:</p>
<ul>
<li>Merge real-time (chat-like) and asynchronous (mail-like) communication. Google already had this going in a somewhat crude form (compared to Google Wave) with Gmail&#8217;s chat.</li>
<li>Instead of independent messages being sent either way, it is more about editing a collaborative document. The concept is fairly close to a wiki except that it&#8217;s also real time.</li>
<li>It has a decentralized architecture where pieces (also known as wavelets) of the collaborative document can originate from multiple domains and can be edited. Each wavelet can have:
<ul>
<li>Domain scope: some parts of the collaborative document may remain private to a set of domains.</li>
<li>Revision history: every edit is stored, so there is a revision history and accountability for every change made to the collaborative document.</li>
</ul>
</li>
<li>The document model itself is pretty general and allows many views (and consequently many Google Wave &#8220;applications&#8221;) to be built on the existing framework of decentralized real-time collaboration.</li>
</ul>
<p>So it&#8217;s basically a decentralized, real-time wiki with a flexible document model ready for building applications, and guess what: email and chat are the most basic things that can be built on top of it.</p>
<h2>Comparison with file syncing services</h2>
<p>You <em>may</em> also compare this to <a href="https://www.mesh.com/welcome/default.aspx">Live Mesh</a>, where the basic idea really was collaboratively edited cloud storage with edit notifications available as an activity feed for applications to tap into. In fact, when I played with various file syncing services, such as Live Mesh and <a href="https://www.getdropbox.com/">Dropbox</a>, the limitation which stood out most was the inability of applications to manage conflicting changes. Changing the underlying document model from essentially a large chunk of bytes to a hierarchically organized document model that retains change history is just what applications need. Of course the applications need to be redesigned (separating the actual document from the changes made to it), but that may be a necessary step to achieve what Live Mesh set out to do in the first place. Then again, I am not sure if Live Mesh was supposed to be either decentralized or real-time.</p>
<p>Note that even though the implementation of Google Wave demoed at Google I/O was an HTML 5 application, there is nothing stopping a normal desktop application like Microsoft Office from adopting a similar document model.</p>
<h2>Comparison with Social Networks</h2>
<p>Does the existence of Google Wave necessarily make social networking sites like <a href="http://facebook.com">facebook</a> obsolete (i.e. sites that provide the social network <em>infrastructure</em> and not the <em>specialized social networking applications</em> such as dopplr, last.fm, digg)? I think they still have a place because social network communication over facebook is very public, and that public character is a value that facebook provides in addition to a collaborative infrastructure. However, I am pretty sure there will be somebody out there who will take Google Wave&#8217;s reference implementation and add just enough &#8220;public&#8221; mantra to it to make a competitive social networking platform.</p>
<p>This is of course great news for social networking applications such as last.fm, dopplr and digg, which can now use Google Wave for their social networking infrastructure.</p>
<h2>Will it be robust and glitch-free?</h2>
<p>Managing concurrent real-time editing of a document originating in pieces from several domains is of course not going to be glitch-free <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> . However, there is quite a lot to be learnt from decentralized source code management (tools like <a href="http://en.wikipedia.org/wiki/Git_(software)">git</a> and <a href="http://www.selenic.com/mercurial/wiki/">mercurial</a>). These tools have been dealing with decentralized conflict management and revision history for quite some time now. I guess the following additions to the wave implementation will make it more robust than just keeping content (wavelets) in authoritative domains.</p>
<ul>
<li>A wave domain should cache the entire wave aggressively (at least the parts it has access to).</li>
<li>The document model should be built such that it is <em>incrementally consistent</em> i.e. if a set of changes from a domain or a user is removed then the document isn&#8217;t corrupted.</li>
<li>Each incremental edit should be checksummed for integrity against the entire history of edits preceding it. This ensures that edits originating in different domains agree on the history they build upon. This also requires that we revoke edits made on top of an older revision (the whole &#8220;<a href="http://gitready.com/intermediate/2009/01/31/intro-to-rebase.html">rebasing</a>&#8221; concept in git applies here&#8230; and it should be possible to do automatically if changes are synced sufficiently close to real time).</li>
</ul>
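<p>To make the checksum idea in the last bullet a bit more concrete, here is a minimal sketch of a hash-chained edit log. It is only an illustration under my own assumptions, not how Google Wave or git actually store things: the names <code>EditLog</code>, <code>chain_digest</code> and the all-zero genesis digest are hypothetical, and edits are plain strings.</p>

```python
import hashlib


def chain_digest(prev_digest: str, edit: str) -> str:
    """Digest of an edit that also commits to the whole history before it."""
    data = (prev_digest + "\n" + edit).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


class EditLog:
    """Hypothetical append-only log of edits for one wavelet."""

    GENESIS = "0" * 64  # assumed digest of the empty history

    def __init__(self):
        self.entries = []  # list of (edit, digest) pairs

    def head(self) -> str:
        return self.entries[-1][1] if self.entries else self.GENESIS

    def append(self, edit: str, base_digest: str) -> bool:
        """Accept an edit only if it was made on top of the current head."""
        if base_digest != self.head():
            return False  # stale base: the sender has to rebase and retry
        self.entries.append((edit, chain_digest(base_digest, edit)))
        return True

    def verify(self) -> bool:
        """Recompute the chain and compare it with the recorded digests."""
        digest = self.GENESIS
        for edit, recorded in self.entries:
            digest = chain_digest(digest, edit)
            if digest != recorded:
                return False
        return True


log = EditLog()
base = log.head()
log.append("insert 'hello' at 0", base)       # accepted
log.append("insert 'world' at 0", base)       # rejected: base is now stale
log.append("insert 'world' at 0", log.head()) # accepted after "rebasing"
```

<p>Two domains holding the same head digest necessarily agree on every edit before it, which is the property the bullet is after. An edit made against an older head is simply refused until it is rebased.</p>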
<h2>Wave without the cloud?</h2>
<p>I am sure we will all be happy to see Google or somebody else provide free and nearly unlimited storage for wave documents. As with Gmail, it should be easy to cover the cloud expenses using ads. However, I was wondering whether it would be possible to build it using p2p techniques alone. Given git-like consistency and aggressive caching, it may not be a bad idea to let the <em>client</em> be authoritative on its wave documents while still being allowed to go offline. This also achieves storage redundancy automatically, so one may be able to just start a fresh client and sync up its waves from its peers if it has the appropriate signing key. Git itself may not be suitable for building this, as git doesn&#8217;t provide a standard incremental consistency model (I am thinking of a basic DTD document type check). However, it should be fairly easy to adapt the core git storage with such a type check, handle wavelets from different authentication domains, and wrap it in a real-time network layer to make such a client.</p>
<p>The bigger problem really is maintaining connections to a large number of peers in the network. However, <a href="http://en.wikipedia.org/wiki/NAT_traversal">NAT traversal</a> schemes should help.</p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2009/06/01/google-wave-is-impressive/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
		<item>
		<title>Many-core task parallelism: low overhead task switching</title>
		<link>http://defectivecompass.wordpress.com/2009/05/03/many-core-task-parallelism-low-overhead-task-switching/</link>
		<comments>http://defectivecompass.wordpress.com/2009/05/03/many-core-task-parallelism-low-overhead-task-switching/#comments</comments>
		<pubDate>Mon, 04 May 2009 06:46:47 +0000</pubDate>
		<dc:creator>defectivecompass</dc:creator>
				<category><![CDATA[Computing]]></category>

		<guid isPermaLink="false">http://defectivecompass.wordpress.com/?p=102</guid>
		<description><![CDATA[With the upcoming multi/many-core processors, there is an industry-wide push towards programming for task parallelism. Without it, programs aren&#8217;t getting any faster. There are high-level programming constructs such as OpenMP and ConcRT which can be used at the programming language level to exploit task &#8230; <a href="http://defectivecompass.wordpress.com/2009/05/03/many-core-task-parallelism-low-overhead-task-switching/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>With the upcoming multi/many-core processors, there is an industry-wide push towards programming for task parallelism. Without it, programs aren&#8217;t getting any faster. There are high-level programming constructs such as <a href="http://en.wikipedia.org/wiki/OpenMP">OpenMP</a> and <a href="http://channel9.msdn.com/posts/Charles/The-Concurrency-Runtime-Fine-Grained-Parallelism-for-C/">ConcRT</a> which can be used at the programming language level to exploit task parallelism. However, one often forgets that every task runs within a context (usually a thread context) and that switching between contexts (for synchronization or fair scheduling) is costly. The general recommendation is to expose as much parallelism in a program as possible and let a runtime schedule it on the cores available on the system. Of course, with that comes context switching cost: past a certain point, dividing work into ever finer parallel chunks hurts performance because the context switch overhead dominates.</p>
<p>At the hardware level, task parallelism allows the processor to schedule instructions from multiple streams together on its functional units. It can do so because the individual tasks are supposed to be <em>independent</em> instruction streams. However, therein also lies the caveat: these streams often carry a large amount of state, which the hardware manages independently and which the system software has to save/restore on a task switch. While very fine-grained task parallelism seems like an elegant goal for software in general, beyond a certain granularity the task switch overheads can spoil the entire show. [A task switch in (system) software involves saving and restoring the entire register state, including vector registers, which can be a large amount of task state.]</p>
<p>Fortunately, there has been some work (in computer architecture) to deal with this problem. One of the nicest examples I found was the <a href="http://en.wikipedia.org/wiki/Tera_Computer_Company">Tera Computer</a> system. Its architecture is detailed in this <a href="http://www.ai.mit.edu/projects/aries/course/notes/tera.pdf">paper</a> [PDF]. There are two notable features of the Tera Computer (later the Cray MTA) which enabled low overhead task switching:</p>
<ol>
<li>A large number of hardware thread contexts. The Tera Computer processors had 21-stage pipelines, but a single processor could store the context of 128 threads in total. Thus, it was very effective at hiding instruction and memory latencies.</li>
<li>The Tera Computer supported hardware-based thread synchronization primitives. You could make a (hardware) thread wait on the state of a memory word which you could update from another thread, thus &#8220;waking&#8221; up this thread. There was no need for system software to do any context save/restore as the thread context was always present in hardware.</li>
</ol>
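<p>As a rough software analogy for the second point (waiting on the state of a memory word), here is a small sketch using ordinary OS threads. It only mimics the programming model; the whole point of the Tera design is that this happens in hardware without a software context switch. The class name <code>SyncWord</code> and its methods are made up for illustration.</p>

```python
import threading


class SyncWord:
    """Hypothetical memory word with a full/empty state (software stand-in)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._full = False
        self._value = None

    def write_full(self, value):
        """Store a value, mark the word full, and wake any waiting readers."""
        with self._cond:
            self._value = value
            self._full = True
            self._cond.notify_all()

    def read_when_full(self, timeout=None):
        """Block the calling thread until the word has been marked full."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._full, timeout=timeout):
                raise TimeoutError("word stayed empty")
            return self._value


word = SyncWord()
results = []
reader = threading.Thread(
    target=lambda: results.append(word.read_when_full(timeout=5))
)
reader.start()       # the reader parks itself on the empty word
word.write_full(42)  # updating the word from another thread wakes it up
reader.join()
```

<p>In this emulation the waiting thread is parked and resumed by the operating system, which is exactly the save/restore cost the hardware version avoids.</p>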
<p>However, the above technique seems to have a limitation. If my program has more than 128 threads then I need to resort to context save/restore. If it has fewer than 128 threads (but enough to keep the 21-stage pipeline busy), the unused hardware contexts are wasted hardware resources (register storage).</p>
<p>I thought about this for a while and it occurred to me that <em>register based architectures are perhaps not suitable for MIMD (task-parallel) processors</em>. I know this is a bold claim, but I feel that resorting to a memory-operand-based architecture (where an instruction&#8217;s operands are memory addresses) may eliminate context switch overhead altogether. Note that the working set of a hardware thread is probably cached in the processor&#8217;s L1 cache and is available as fast as register storage itself*. However, having three 64-bit memory addresses in every instruction doesn&#8217;t seem like a good idea. It increases program size and takes up valuable instruction memory bandwidth. What I propose instead is to do something similar to the <a href="http://software.intel.com/en-us/articles/itaniumr-processor-family-performance-advantages-register-stack-architecture/">register stack engine</a> in Itanium or <a href="http://www.sics.se/~psm/sparcstack.html">register windows</a> in SPARC. However, instead of using bona fide registers, the instruction operands are <em>offsets from a per-thread stack pointer</em>. Using offsets allows a register-like compact instruction encoding without deviating from the memory operand instruction model. The only context that the software needs to save on a task switch is the stack pointer, which is okay as it&#8217;s just a single register. Thus task switching simply involves saving/restoring the stack pointer and resuming execution by popping the return address off the stack. With this CPU architecture, it&#8217;s possible to do lightweight context switching without the kind of hardware assistance the Tera Computer relied on.</p>
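<p>Here is a toy simulation of the stack-pointer-offset idea, just to show how little state a switch has to move. Everything in it is an assumption made for illustration: the instruction tuples, the two opcodes, the frame layout (slot 0 of each frame holds the resume address, operands live in the following slots) and the function name <code>run_interleaved</code>.</p>

```python
def run_interleaved(program, frames, mem):
    """Round-robin several threads, switching after every instruction.

    program: list of (op, dst, a, b); dst/a/b are offsets from the thread's sp.
    frames:  list of stack pointers, one per thread.
    mem:     flat memory; mem[sp] holds that thread's resume address (assumed).
    """
    ready = list(frames)
    while ready:
        sp = ready.pop(0)        # context restore: one word, the stack pointer
        pc = mem[sp]             # "pop" the resume address off the stack
        if pc == len(program):   # this thread has run to completion
            continue
        op, dst, a, b = program[pc]
        if op == "add":
            mem[sp + dst] = mem[sp + a] + mem[sp + b]
        elif op == "mul":
            mem[sp + dst] = mem[sp + a] * mem[sp + b]
        else:
            raise ValueError("unknown op: " + op)
        mem[sp] = pc + 1         # context save: "push" the resume address
        ready.append(sp)


mem = [0] * 16
mem[0:4] = [0, 2, 3, 0]      # thread A, sp=0: resume pc, x=2, y=3, result
mem[8:12] = [0, 10, 20, 0]   # thread B, sp=8: resume pc, x=10, y=20, result
program = [("add", 3, 1, 2), ("mul", 3, 3, 3)]  # r = x + y; r = r * r
run_interleaved(program, [0, 8], mem)
print(mem[3], mem[11])       # results for thread A and thread B
```

<p>Both threads run the same instructions with the same small offsets, yet never touch each other&#8217;s data, and the only thing that changes on a switch is <code>sp</code>. A real design would of course have to deal with call frames, traps and cache behaviour, none of which this sketch models.</p>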
<p>But who am I kidding&#8230; I am going to see only x64 and ARM for the rest of my life :/.</p>
<p><strong>Update:</strong> When I was discussing this with a colleague, an interesting question came up: how do we integrate thread wake-up from software and hardware events with the above scheme? At first I thought just having a trap for the hardware events would be okay&#8230; however, hardware events may require a much lower service latency and more specialized processing than software events to get good performance (e.g. wake-up after an L1 cache miss). So I finally thought of the following: the architecture keeps separate lists of ready-to-run contexts (stack pointers) for hardware and software events and services the hardware ones preferentially. While wake-up notifications for software events could be handled in software, the hardware would need to handle hardware wake-up events itself and wake up the corresponding threads (resuming the halted instruction). Thus, it is also important to keep this list in pinned L1 cache. There also needs to be a way for the hardware to let the software know when a previously hardware-interrupted thread is ready to run&#8230; a trap could work here as long as we guarantee that it won&#8217;t cause any further trap invocations. Another simpler and more performant mechanism could be to just pre-empt the currently running thread with the hardware-woken one and maintain a stack of contexts, switching back to the pre-empted ones once the current one arrives at a hardware / software synchronization point.</p>
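<p>The two-list scheme from the update can be sketched in a few lines. This is a software model only, with hypothetical names (<code>ReadyLists</code>, <code>wake</code>, <code>next_context</code>); in the proposal the hardware list would be maintained by the processor itself.</p>

```python
from collections import deque


class ReadyLists:
    """Hypothetical pair of ready-to-run lists holding stack pointers."""

    def __init__(self):
        self.hw_ready = deque()  # contexts woken by hardware events
        self.sw_ready = deque()  # contexts woken by software events

    def wake(self, sp, source):
        """Mark the context identified by stack pointer sp as ready to run."""
        if source == "hw":
            self.hw_ready.append(sp)
        elif source == "sw":
            self.sw_ready.append(sp)
        else:
            raise ValueError("unknown event source: " + source)

    def next_context(self):
        """Pick the next stack pointer to run, hardware wake-ups first."""
        if self.hw_ready:
            return self.hw_ready.popleft()
        if self.sw_ready:
            return self.sw_ready.popleft()
        return None  # nothing is ready to run


ready = ReadyLists()
ready.wake(0x100, "sw")
ready.wake(0x200, "hw")   # e.g. a thread whose cache miss has been serviced
ready.wake(0x300, "sw")
order = [ready.next_context() for _ in range(4)]
# order is [0x200, 0x100, 0x300, None]: the hardware-woken context runs first
```

<p>Strict priority like this can starve software-woken threads if hardware events keep arriving, so a real design would probably need some fairness bound. The post does not go into that, so it is left out here.</p>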
<p>One may also think of using low overhead task parallelism to cleanly emulate cache hierarchies, cache coherency and many other kinds of system specific processor facilities (reminds me of the <a href="http://en.wikipedia.org/wiki/PALcode">PALcode</a> in DEC Alpha).</p>
<p>* The fact that the Itanium architects went with a backing store for their register stack (and not direct memory operands) makes me a little skeptical that my idea can be implemented in a performant way on real hardware. However, given a large number of switchable contexts, an occasional event of high latency to L1 shouldn&#8217;t be a real problem.</p>
]]></content:encoded>
			<wfw:commentRss>http://defectivecompass.wordpress.com/2009/05/03/many-core-task-parallelism-low-overhead-task-switching/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="" medium="image">
			<media:title type="html">defectivecompass</media:title>
		</media:content>
	</item>
	</channel>
</rss>
