<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
<title>Dmitriy Samovskiy's Blog</title>
<atom:link href="http://www.somic.org/feed/" rel="self" type="application/rss+xml" />
<link>http://www.somic.org</link>
<description>Fubaredness is contagious - preventing its spread in IT one post at a time</description>
<lastBuildDate>Tue, 10 Jan 2012 05:31:39 +0000</lastBuildDate>
<generator>http://jekyllrb.com</generator>
<language>en</language>
<sy:updatePeriod>hourly</sy:updatePeriod>
<sy:updateFrequency>1</sy:updateFrequency>

	<item>
		<title>On Amazon EC2 Spot Price Spikes</title>
		<link>http://www.somic.org/2012/01/10/on-amazon-ec2-spot-price-spikes/</link>
		<comments>http://www.somic.org/2012/01/10/on-amazon-ec2-spot-price-spikes/#comments</comments>
		<pubDate>Tue, 10 Jan 2012 06:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[cloud-computing]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2012/01/10/on-amazon-ec2-spot-price-spikes/</guid>
		<description><![CDATA[Last week I came across an interesting post on Amazon EC2 spot price spikes published on GigaOm. In the comments, in response to a question from a reader, the author stated that &#8220;I don&#8217;t think anyone ever expected that the market would behave like this.&#8221; I have been interested in...
]]></description>
			<content:encoded><![CDATA[<p>Last week I came across an interesting <a href='http://gigaom.com/2011/12/27/how-to-deal-with-amazons-spot-server-price-spikes/'>post</a> on Amazon EC2 spot price spikes published on GigaOm. In the comments, in response to a question from a reader, the author stated that &#8220;I don&#8217;t think anyone ever expected that the market would behave like this.&#8221; I have been interested in this expectation for some time and decided to kick off 2012 blogging season on somic.org with a post dedicated to this topic.</p>

<p>As I described <a href='/2010/12/14/basics-iaas-spot-pricing/'>before</a>, a spot instance technically is not that much different from a regular on-demand instance - it has the same CPU and RAM capacity, same network traffic and bandwidth allowances. The only fundamental difference is that AWS can terminate it under certain circumstances (when spot price exceeds this instance&#8217;s bid) - because of this, most people rationally expect spot instances to trade at a discount to the price of regular on-demand.</p>

<p>Furthermore, somewhat similar to the concept known as <a href='http://en.wikipedia.org/wiki/Law_of_one_price'>law of one price</a>, one&#8217;s intuition says that if the spot price exceeds on-demand price, people will stop bidding on spot and will start getting regular instances instead, until spot price comes down as a result of reduced demand.</p>

<p>But then we face a question. How is it possible that we see the following prices for m1.small/Linux in us-east-1 when its on-demand price is 0.085:</p>
<p><pre>
SPOTINSTANCEPRICE	0.500000	2011-11-16T14:39:39-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	2.000000	2011-11-16T14:53:40-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	3.000000	2011-11-16T15:32:37-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	2.000000	2011-11-16T17:51:35-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	3.000000	2011-11-16T20:33:19-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	2.100000	2011-11-17T01:43:24-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	3.000000	2011-11-17T02:38:30-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	2.100000	2011-11-17T05:34:07-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	2.000000	2011-11-17T10:05:29-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	3.000000	2011-11-17T12:22:39-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	2.000000	2011-11-17T13:39:53-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	5.000000	2011-11-17T13:53:19-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	1.000000	2011-11-17T14:50:19-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	0.500000	2011-11-17T19:49:52-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	1.000000	2011-11-18T07:05:59-0600	m1.small	Linux/UNIX
SPOTINSTANCEPRICE	0.500000	2011-11-18T09:23:32-0600	m1.small	Linux/UNIX
</pre></p>
<p>Note that the spot price in this timeframe fluctuated between 588% and 5,882% of the regular on-demand price. (I am wondering if users who had spot instances running at these prices know how to spell &#8220;overpaid.&#8221;)</p>
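<p>For anyone who wants to check the arithmetic, here is a minimal Python sketch. The 0.085 on-demand price and the 0.50 and 5.00 spot prices come from the snippet above; the function name is only for illustration.</p>
<p><pre>
```python
# On-demand price for m1.small/Linux in us-east-1, as quoted in this post.
ON_DEMAND = 0.085


def percent_of_on_demand(spot_price, on_demand=ON_DEMAND):
    """Return the spot price as a percentage of the on-demand price."""
    return spot_price / on_demand * 100


# Lowest (0.50) and highest (5.00) spot prices in the history snippet.
low = percent_of_on_demand(0.50)
high = percent_of_on_demand(5.00)
print(f"{low:.0f}% to {high:,.0f}%")  # prints "588% to 5,882%"
```
</pre></p>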

<p>I think there are several possible explanations.</p>

<p>Firstly, it&#8217;s possible that whoever does the actual bidding on spot instances is not the same person who pays the bill - for example, developers bid, while the accounting department gets charged on its credit card. This may lead to careless bidding and drive the price to an unnecessarily high level.</p>

<p>Secondly, it&#8217;s possible that some customers don&#8217;t have enough sophistication built into their automated bidding systems. In the price history snippet above, the spot price remained extremely high for the entire day of November 17 (Thursday) - it shouldn&#8217;t have gone unnoticed even by a semi-automated system (i.e., one with occasional human supervision and monitoring). The right course of action was to cancel all of one&#8217;s outstanding bids and switch to on-demand. Any automated bidding system must monitor the spot price at all times and be prepared to terminate current spot instances and switch new instance launches from spot to on-demand if the spot price remains elevated for long periods of time.</p>

<p>Thirdly, it&#8217;s possible that EC2 decided to shut down the spot market for this instance type by setting the spot price above all bids (I am talking about the $5 price).</p>

<p>And finally, it&#8217;s possible that some customers want to gamble. They could be bidding above their true price to be able to weather the spikes. Their thinking could go like this: &#8220;Over life of a spot instance, in normal times when spot price is below on-demand, we realize good savings. We could give back some of those savings to offset times when spot price spikes so that in the end we still come out ahead.&#8221;</p>

<p>Hence, in addition to recommendations of the GigaOm post&#8217;s author, here is my advice.</p>
<p><ol>
<li>Avoid spot instances for workloads that must run non-stop for the
foreseeable future (especially in us-east-1 where spot prices
seem to fluctuate and spike a lot more)</li>
<li>Do not set spot price above on-demand price unless you really know
what you are doing and have sufficient automated instrumentation
in place to protect you in case spot price does go through the roof</li>
<li>Do not submit spot instance bids for immediate execution if spot price
is already significantly higher than on-demand</li>
<li>Hoping that spot price will come down to below on-demand very soon
is not a bidding strategy, it's gambling</li>
</ol></p>
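<p>Items 2 and 3 above, together with the monitoring I described earlier, could be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the <code>Decision</code> type, the <code>decide</code> function and the <code>elevated_limit</code> threshold are hypothetical names, not part of any AWS API, and fetching prices or launching instances is left out entirely.</p>
<p><pre>
```python
from dataclasses import dataclass


@dataclass
class Decision:
    launch_on: str        # "spot" or "on-demand"
    max_bid: float        # bid to submit; 0.0 when we do not bid on spot
    terminate_spot: bool  # whether running spot instances should be replaced


def decide(recent_spot_prices, on_demand_price, elevated_limit=3):
    """Hypothetical policy; expects a non-empty list, oldest price first.

    Never bid above on-demand, do not bid while spot is at or above
    on-demand, and leave spot when the price stays elevated.
    """
    # Count the trailing run of observations at or above on-demand.
    elevated = 0
    for price in reversed(recent_spot_prices):
        if price >= on_demand_price:
            elevated += 1
        else:
            break

    if elevated >= elevated_limit:
        # Sustained spike: move new launches and running instances.
        return Decision("on-demand", 0.0, True)
    if elevated > 0:
        # Spot is not cheaper right now: no bid for immediate execution.
        return Decision("on-demand", 0.0, False)
    # Normal times: bid on spot, capped at the on-demand price.
    return Decision("spot", on_demand_price, False)
```
</pre></p>
<p>With the m1.small numbers from this post, <code>decide([0.03, 0.5, 2.0, 3.0], 0.085)</code> returns a decision to use on-demand and terminate spot instances, because the last three observations are all above 0.085.</p>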
<p>You can read my other posts about EC2 spot instances <a href='/tag/amazon-ec2-spot'>here</a>.</p>
]]></content:encoded>
	</item>

	<item>
		<title>VXLAN and NVGRE - Not a Long Term Answer</title>
		<link>http://www.somic.org/2011/10/11/vxlan-nvgre-not-long-term-answer/</link>
		<comments>http://www.somic.org/2011/10/11/vxlan-nvgre-not-long-term-answer/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[cloud-computing]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/10/11/vxlan-nvgre-not-long-term-answer/</guid>
		<description><![CDATA[Last week I came across a blog post titled NVGRE Musings. It&#8217;s got some great links to posts about two recently introduced proposals - VXLAN and NVGRE. But what drew my attention was the following thought from the first paragraph: Supporting an L2 service is important for virtualized servers, which...
]]></description>
			<content:encoded><![CDATA[<p>Last week I came across a blog post titled <a href='http://codingrelic.geekhold.com/2011/10/nvgre-musings.html'>NVGRE
Musings</a>. It&#8217;s got some great links to posts about two recently introduced proposals - VXLAN and NVGRE. But what drew my attention was the following thought from the first paragraph:</p>
<p><blockquote>Supporting an L2 service is important for virtualized
servers, which need to be able to move from one physical server to
another without changing their IP address or interrupting the
services they provide.</blockquote></p>
<p>I don&#8217;t know if this is what network vendors are hearing from their big customers, but I strongly believe <a href='/2011/07/13/network-hardware-past-to-software-future/'>it's not
the right answer</a> to failover needs in the long term. No matter how hard I am looking, <strong>I don't see failover within the application as the network layer's
problem to solve</strong>. The fact that they can doesn&#8217;t mean they should. My experience, as well as my understanding of the current state of affairs in the biggest web-based tech ops organizations, leads me to the conclusion that the best place to handle application-level failover is the application software itself.</p>

<p>Since the application lives on top of the network layer, the network is not in a good position to provide custom tailored failover solutions - all it can offer is generic functionality that must remain transparent to the app. The biggest selling point of this approach is that customers don&#8217;t need to re-architect their applications. There is a mounting body of evidence, however, that an application does benefit from at least some modernization when it&#8217;s being moved to a newer operating environment (this was the approach taken by Netflix, who picked parts of the old app that they liked and discarded parts that they found lacking after years of running them in their datacenter environment or that were found unfit for their new IaaS environment).</p>

<p>George Reese <a href='https://twitter.com/#!/GeorgeReese/status/100579970191073280'>said</a> it best:</p>
<p><blockquote>[...] New apps -> app is responsible for availability;
Old apps -> infrastructure is responsible for availability</blockquote></p>
<p>With this said, I do see a part of the network stack where the need for a new standard is the greatest. I am talking about DNS service name resolution.</p>

<p>Some parts are <a href='http://googlecode.blogspot.com/2010/01/proposal-to-extend-dns-protocol.html'>being addressed</a> by Google in their efforts around including a part of end-user&#8217;s IP address as a part of a resolution request so that a more meaningful geo-distribution can be done by authoritative DNS servers. But I think it should not stop here.</p>

<p>Right now when a client requests a list of IP addresses corresponding to a given hostname, it simply gets a list of IPs (which in general case technically is not even ordered). Instead, the response could include a lot more meta information: &#8220;here is a list of IP addresses corresponding to the service you requested - try them in this order; if all of them fail to respond with such and such timeout, here is a backup list; and finally here is a token for your request - if you fail to get any response from any of the IPs listed within such and such timeout, retry this DNS query.&#8221;</p>
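<p>To make the idea more concrete, here is a rough Python sketch of what such a response and the matching client logic might look like. Everything in it is hypothetical: no such record type exists in DNS today, and the <code>ServiceAnswer</code> fields and the <code>try_connect</code> callback are names I made up for illustration.</p>
<p><pre>
```python
from dataclasses import dataclass, field


@dataclass
class ServiceAnswer:
    """Hypothetical richer answer; these fields do not exist in DNS."""
    primary: list                               # addresses to try, in order
    backup: list = field(default_factory=list)  # tried only after primary
    timeout_seconds: float = 2.0                # per-address timeout
    retry_token: str = ""                       # sent along with a repeat query


def first_responding_address(answer, try_connect):
    """Walk the primary list, then the backup list, in order.

    try_connect(ip, timeout) is supplied by the caller and returns True
    when the address responds in time. Returns the first address that
    responds, or None, in which case the client should repeat the
    query and include answer.retry_token.
    """
    for ip in answer.primary + answer.backup:
        if try_connect(ip, answer.timeout_seconds):
            return ip
    return None
```
</pre></p>
<p>The point of the sketch is that ordering, the fallback list and the retry decision all come from the resolution response rather than being hard-coded in each application.</p>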

<p>With expanded retry and failover capabilities in host name resolution protocol on the client, it will become much easier to build <a href='/2009/08/18/the-concept-of-hyper-distributed-application/'>hyper-distributed</a> highly available services and applications - and that&#8217;s what I think the network industry should be focusing on.</p>
]]></content:encoded>
	</item>

	<item>
		<title>Complex Systems: Generalists and Specialists</title>
		<link>http://www.somic.org/2011/09/30/complex-systems-generalists-and-specialists/</link>
		<comments>http://www.somic.org/2011/09/30/complex-systems-generalists-and-specialists/#comments</comments>
		<pubDate>Fri, 30 Sep 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[devops]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/09/30/complex-systems-generalists-and-specialists/</guid>
		<description><![CDATA[The following tweet from @saschabates that appeared in my stream this morning caught my attention: #surgecon emergent theme: complex systems cannot be effectively diagnosed without smart generalists who understand them end to end This statement is correct (otherwise it wouldn&#8217;t have been a theme emerging out of one of the...
]]></description>
			<content:encoded><![CDATA[<p>The following <a href='https://twitter.com/#!/saschabates/statuses/119793462261452800'>tweet</a> from @saschabates that appeared in my stream this morning caught my attention:</p>
<p><blockquote>#surgecon emergent theme: complex systems cannot
be effectively diagnosed without smart generalists who understand
them end to end</blockquote></p>
<p>This statement is correct (otherwise it wouldn&#8217;t have been a theme emerging out of one of the best tech conferences). However it caught my attention not because it&#8217;s right - but because it can be hugely misinterpreted.</p>
<p><strong>It's not saying that smart generalists are <em>the only</em>
key to diagnosing problems
in complex systems - it's saying that without smart generalists operating
a complex system is nearly impossible. The key to running a system
successfully is having a balanced mix of generalists and
specialists. Generalists are usually necessary but not sufficient.</strong></p>
<p>There are generally two types of roles in tech ops - generalists and specialists. A DBA, for example, is initially a specialist. And so is a network engineer. Specialists focus on some part of a bigger system; they have detailed knowledge about all interactions between components within their area of expertise and usually will have significant understanding of how their part integrates into the whole system. Their view is from inside out.</p>

<p>Generalists, on the other hand, approach the system from reverse angle - from outside in. Their focus is on interaction of components within the system; while they study behavior of the system as a whole, they will inevitably develop deeper understanding of individual components but the depth will vary by component.</p>

<p>Why can&#8217;t we all be generalists - know everything, interchangeable, able to effectively resolve all issues by ourselves without any help? The answer is simple. You can know tech. You can master <a href='/2011/09/05/troubleshooting/'>troubleshooting techniques</a> based on logic. But in a sufficiently complex system (which usually is changing at quite a fast pace), you won&#8217;t have enough time to accumulate enough experience with each component to become effective at running it single-handedly.</p>

<p>While the idea &#8220;let&#8217;s all be generalists&#8221; could sound appealing, it&#8217;s not achievable in general case - the more complex a system gets, the less likely one will be able to become both a generalist and a specialist in it.</p>

<p>But that&#8217;s not all. At times an attempt is made to run a system with specialists but without generalists. This approach also doesn&#8217;t work universally. While it could work in smaller teams, sooner or later it breaks down as the number of people increases. Generalists are called in to help direct the specialists.</p>

<p>The bottom line: both generalists and specialists are key to successful operation of a complex system. These are distinct skills even if they are assigned to the same individuals. The size of the overall tech ops team and the complexity of the system at hand play a decisive role in determining when it&#8217;s a good time to branch out into separate generalists and specialists.</p>
]]></content:encoded>
	</item>

	<item>
		<title>Troubleshooting</title>
		<link>http://www.somic.org/2011/09/05/troubleshooting/</link>
		<comments>http://www.somic.org/2011/09/05/troubleshooting/#comments</comments>
		<pubDate>Mon, 05 Sep 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[devops]]></category>
        
		  <category><![CDATA[distributed]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/09/05/troubleshooting/</guid>
		<description><![CDATA[One of the areas of tech ops that doesn&#8217;t get its fair share of discussion is troubleshooting. It&#8217;s not easy to teach troubleshooting - possibly because how successfully one can troubleshoot a given system largely depends on one&#8217;s experience with the system and on quality of the system&#8217;s feedback loops...
]]></description>
			<content:encoded><![CDATA[<p>One of the areas of tech ops that doesn&#8217;t get its fair share of discussion is troubleshooting. It&#8217;s not easy to teach troubleshooting - possibly because how successfully one can troubleshoot a given system largely depends on one&#8217;s experience with the system and on quality of the system&#8217;s feedback loops (accuracy and timeliness of monitoring data).</p>

<p>But despite the fact that troubleshooting is often more art than science, it has a set of general rules and guidelines, without which troubleshooting is nothing more than guessing. These are all common sense rules that formally come from <a href='http://en.wikipedia.org/wiki/Boolean_algebra_(logic)'>boolean
algebra</a> and <a href='http://en.wikipedia.org/wiki/First-order_logic'>first-order
logic</a>. They universally apply to the first half of troubleshooting - finding what&#8217;s wrong.</p>

<p>It&#8217;s important to emphasize that troubleshooting activities are always measured against two independent goals - finding and fixing the issue, and doing it as fast as possible. It&#8217;s the second goal that makes use of logic mandatory - you usually can&#8217;t afford to mentally build a list of anything that could have gone wrong and then start crossing items off this list one by one. To speed things up, you usually analyze symptoms and check only those hypotheses that plausibly match them. Ability to properly prioritize hypotheses comes purely from experience, but not wasting your time on things that can&#8217;t explain what you are observing has a lot to do with logic.</p>

<p>A key aspect of troubleshooting is <a href='http://en.wikipedia.org/wiki/Causality'>causality</a>: event A leads to event B, or A causes B, or A implies B (<b>A -> B</b>). A is sufficient for B here, and B is necessary for A.</p>
<p><b>A -> B</b> is the same as <b>NOT B -> NOT A</b>.
Imagine, for example, that
A = "filesystem is full" and B = "writes to filesystem are failing." In this
case A -> B. Therefore, if writes are working (NOT B), it means filesystem
is not full (NOT A). But if writes are failing (B), it does not automatically
mean that filesystem is full (for example, it could be mounted
read-only).</p>
<p>Another way to look at <b>A -> B</b> is <b>(NOT A) OR B</b>. This form can be easier to work with when you are applying negation - see below.</p>

<p>When A is sufficient and necessary for B, it means that A and B are always both true or both false. Another way of saying it is &#8220;A is true if and only if B is true.&#8221; This statement formally consists of two: <b>A -> B</b> and <b>B -> A</b>.</p>

<p>Then there are important rules about negation that are called <a href='http://en.wikipedia.org/wiki/De_Morgan&apos;s_laws'>De Morgan's laws</a>:</p>
<p><blockquote>
<b>NOT (A OR B) = (NOT A) AND (NOT B)<br />
NOT (A AND B) = (NOT A) OR (NOT B)</b>
</blockquote></p>
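<p>If you prefer to convince yourself by brute force, the identities in this post can be checked over every combination of truth values. The short Python sketch below does that; the <code>implies</code> helper is my own name for the <b>(NOT A) OR B</b> form.</p>
<p><pre>
```python
from itertools import product


def implies(a, b):
    """Material implication written as (NOT A) OR B."""
    return (not a) or b


for a, b in product([False, True], repeat=2):
    # An implication is equivalent to its contrapositive.
    assert implies(a, b) == implies(not b, not a)
    # De Morgan's laws.
    assert (not (a or b)) == ((not a) and (not b))
    assert (not (a and b)) == ((not a) or (not b))

# The converse is a different statement: collect the cases where
# "A implies B" and "B implies A" disagree.
converse_differs = [
    (a, b)
    for a, b in product([False, True], repeat=2)
    if implies(a, b) != implies(b, a)
]
print(converse_differs)  # prints [(False, True), (True, False)]
```
</pre></p>
<p>The two cases printed at the end are exactly the ones where only one of A and B is true, which is why knowing A implies B tells you nothing about whether B implies A.</p>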
<p>So how could you apply these rules in practice? First and foremost, never waste your time on checking A if you are observing NOT B and you know that A -&gt; B.</p>

<p>Secondly, never assume that NOT A causes NOT B if you only know that A -&gt; B.</p>

<p>Finally, never assume causality out of mere correlation of two events. If A and B tend to occur together, in bigger systems it&#8217;s often hard to determine if there is any causality and which way it goes - further analysis is required.</p>

<p>Simple rules I mentioned in this post are not a complete guide to troubleshooting but they can still help you save time and resources - remember that any amount of time you spend investigating a hypothesis that you should have rejected based on pure logic, is time wasted.</p>
]]></content:encoded>
	</item>

	<item>
		<title>Amazon EC2 Spot Instances - A Flop?</title>
		<link>http://www.somic.org/2011/08/03/amazon-ec2-spot-instances-flop/</link>
		<comments>http://www.somic.org/2011/08/03/amazon-ec2-spot-instances-flop/#comments</comments>
		<pubDate>Wed, 03 Aug 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[cloud-computing]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/08/03/amazon-ec2-spot-instances-flop/</guid>
		<description><![CDATA[When Amazon Web Services launched EC2 spot instances in December 2009, I was very excited about the beginnings of potential revolution in how computing resources could be priced, bought and sold. I have followed this unprecedented phenomenon with great interest, blogging my thoughts along the way. But today, over 1.5...
]]></description>
			<content:encoded><![CDATA[<p>When Amazon Web Services launched EC2 spot instances in December 2009, I was very excited about the beginnings of potential revolution in how computing resources could be priced, bought and sold. I have followed this unprecedented phenomenon with great interest, <a href='/tag/amazon-ec2-spot/'>blogging</a> my thoughts along the way.</p>

<p>But today, over 1.5 years since the launch, I am not so sure anymore. While I have no insider information on what goals AWS set out for this program and how it&#8217;s been performing against these goals, there is a significant publicly available indicator that convincingly shows that this thing that AWS calls &#8220;spot market&#8221; is not performing the function of a spot market (clearing at equilibrium price). Instead, EC2 spot instances as of today are simply a discounted product with a couple of features removed (<a href='/2010/09/13/pricing-in-the-cloud/'>similar</a> to an airline selling non-refundable tickets at a discount to a price of fully-refundable tickets).</p>

<p>There are basically 4 features that AWS strips out of their regular on-demand product to justify a discount:</p>

<ul>
<li>a call to request an instance does not return an object corresponding to the instance you requested; instead you get a spot response object</li>

<li>there is an undefined time interval between creation of spot response object and instance object (this time interval is usually small but technically it&#8217;s not defined)</li>

<li>a spot instance even during normal operation may never get started</li>

<li>a spot instance, once it&#8217;s started, can be terminated by EC2 under certain conditions even during normal operation</li>
</ul>

<p>Officially stated goal of this discounting is for EC2 to be able to reduce unused capacity while retaining a legal right to reclaim such capacity quickly if a need arises suddenly. As currently designed, it&#8217;s a win-win for both EC2 and customers. It&#8217;s a terrific idea. But it&#8217;s not a market driven by supply and demand.</p>

<p>If you want to see for yourself, please open a new browser tab and head over to <a href='http://cloudexchange.org'>http://cloudexchange.org</a>. Pick a product. Wait for a chart to load. Observe a nicely fluctuating price. So far so good.</p>

<p>But now, instead of looking at a weekly chart or monthly chart, look at all-time chart (click &#8220;All&#8221; in the lower right). Do you see it? <b>It's a flat
line! Well, more specifically, you will see a predominantly
constant-amplitude oscillator with constant upper and lower limits.</b></p>
<p><b>It's the fact that oscillator's upper and lower limits are constants that
shows that this is not a true spot market.</b> Why?
Because such limits are easily
identifiable - you only need to take a look at a long term chart. And
if bidders know in advance what the maximum price is going to be (occasional
spikes notwithstanding), they should
rationally bid above known maximum. And if this were a real market driven
by supply and demand, the oscillator should have swung higher
on some future iteration (once enough bids above current known maximum
accumulate). But it doesn't.</p>
<p>Note that it&#8217;s impossible to perform more extensive analysis due to lack of information - we don&#8217;t know how many bids are coming, for what times, we don&#8217;t know available supply (which can be fluctuating independently of bids since it&#8217;s shared with on-demand regular product). But overall constant upper and lower limits over long term are very unlikely in a system driven more or less by supply and demand.</p>
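<p>The one check that is possible with price history alone can be sketched in Python as follows. This is a rough illustration under my own assumptions: the function names, the window size and the tolerance are all hypothetical, and occasional spikes would have to be filtered out before calling it.</p>
<p><pre>
```python
def window_limits(prices, window):
    """Return (low, high) for each consecutive full window of prices."""
    limits = []
    for start in range(0, len(prices) - window + 1, window):
        chunk = prices[start:start + window]
        limits.append((min(chunk), max(chunk)))
    return limits


def limits_are_constant(prices, window, tolerance=0.0):
    """True when every window has the same low and the same high.

    Needs at least one full window of prices. A market driven by supply
    and demand would be expected to fail this check over a long history.
    """
    limits = window_limits(prices, window)
    lows = [low for low, _ in limits]
    highs = [high for _, high in limits]
    low_spread = max(lows) - min(lows)
    high_spread = max(highs) - min(highs)
    return tolerance >= low_spread and tolerance >= high_spread
```
</pre></p>
<p>For example, with made-up numbers, <code>limits_are_constant([0.029, 0.031, 0.030, 0.031, 0.029, 0.030], 3)</code> is true because both windows run from 0.029 to 0.031, while a series whose second window reaches 0.035 would return false.</p>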

<p>You might object to my calling this a flop. Maybe you are right. This pricing mechanism definitely serves a purpose. But the idea of spot instances was to form a spot market - otherwise AWS should have named them &#8220;discounted instances.&#8221;</p>

<p>I think such renaming is the right thing to do, and with the knowledge they accumulated in the last 18+ months, AWS should start a real spot market, one driven by supply and demand, with more market information than just historical prices published via API. That&#8217;s what pioneers do - they critically analyze the past and continue to build fascinating future for all of us.</p>
<p><i>More on cloud pricing is <a href='/tag/cloud-pricing/'>here</a>.</i></p>
]]></content:encoded>
	</item>

	<item>
		<title>Following on Twitter Using RSS</title>
		<link>http://www.somic.org/2011/07/27/following-on-twitter-using-rss/</link>
		<comments>http://www.somic.org/2011/07/27/following-on-twitter-using-rss/#comments</comments>
		<pubDate>Wed, 27 Jul 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[internet]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/07/27/following-on-twitter-using-rss/</guid>
		<description><![CDATA[When I am trying to decide whether to follow a given account on Twitter or not, I usually look at the following three criteria: whether I am interested in this account's tweets whether this account's tweets include at least some degree of real-time relevance whether this account can participate in...
]]></description>
			<content:encoded><![CDATA[<p>When I am trying to decide whether to follow a given account on Twitter or not, I usually look at the following three criteria:</p>
<p><ol>
<li>whether I am interested in this account's tweets</li>
<li>whether this account's tweets include at least some degree of real-time
    relevance</li>
<li>whether this account can participate in a discussion</li>
</ol></p>
<p>Turns out however there are plenty of accounts that are missing properties #2 and/or #3. Bots that tweet links on a specific topic, for example. Or a celebrity comedian such as @shitmydadsays - this account probably won&#8217;t respond to any mentions and the tweets are rarely real-time sensitive.</p>

<p>I found it&#8217;s much more efficient to consume such tweets not via Twitter, but via RSS. Twitter used to include a feed on every account&#8217;s home page but not anymore. Here is how you can follow a Twitter account via RSS.</p>

<p>Let&#8217;s say you want to follow <a href='http://twitter.com/StephenAtHome'>@StephenAtHome</a>.</p>

<p>Open this link in your browser:</p>
<p><strong>https://api.twitter.com/1/users/show.xml?screen_name=StephenAtHome</strong></p>
<p>(If you prefer JSON, use <strong>https://api.twitter.com/1/users/show.json?screen_name=StephenAtHome</strong>).</p>

<p>Note user id value - for @StephenAtHome it&#8217;s 16303106.</p>

<p>Then add the following feed to your reader:</p>
<p><strong>http://twitter.com/statuses/user_timeline/16303106.rss</strong></p>
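<p>If you follow many accounts this way, the two URLs can be assembled with a few lines of Python. This sketch only builds the strings from the URL patterns shown above and makes no network calls; the function names are mine, and whether Twitter keeps serving these endpoints is outside my control.</p>
<p><pre>
```python
from urllib.parse import urlencode


def user_lookup_url(screen_name, fmt="xml"):
    """URL for looking up a user id; fmt is "xml" or "json"."""
    query = urlencode({"screen_name": screen_name})
    return f"https://api.twitter.com/1/users/show.{fmt}?{query}"


def timeline_feed_url(user_id):
    """RSS feed URL for the numeric user id found via the lookup URL."""
    return f"http://twitter.com/statuses/user_timeline/{user_id}.rss"


print(user_lookup_url("StephenAtHome"))
print(timeline_feed_url(16303106))
```
</pre></p>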
<p>This method helps me better manage my Twitter reading experience by ensuring real-time sensitive content and conversations go to Tweetdeck and the rest ends up in Google Reader.</p>
]]></content:encoded>
	</item>

	<item>
		<title>Network: From Hardware Past To Software Future</title>
		<link>http://www.somic.org/2011/07/13/network-hardware-past-to-software-future/</link>
		<comments>http://www.somic.org/2011/07/13/network-hardware-past-to-software-future/#comments</comments>
		<pubDate>Wed, 13 Jul 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[cloud-computing]]></category>
        
		  <category><![CDATA[infrastructure-development]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/07/13/network-hardware-past-to-software-future/</guid>
		<description><![CDATA[At this year&#8217;s GigaOm Structure conference, there was a single event that attracted my interest the most - network virtualization panel (I didn&#8217;t attend the conference, I was only following along over the Internet). It wasn&#8217;t just because it involved OpenFlow. I think there is a bigger trend at play...
]]></description>
			<content:encoded><![CDATA[<p>At this year&#8217;s GigaOm Structure conference, there was a single event that attracted my interest the most - <a href='http://gigaom.com/cloud/structure-network-virtualization-openflow/'>network virtualization panel</a> (I didn&#8217;t attend the conference, I was only following along over the Internet). It wasn&#8217;t just because it involved <a href='http://www.openflow.org/'>OpenFlow</a>. I think there is a bigger trend at play here - a lot of functionality that we are used to seeing in network gear is moving to application level, from hardware to software. OpenFlow is just one of the manifestations of this bigger trend. Let me explain.</p>

<p>Networking was first about moving packets, in large quantities and with low latencies. This demand was met by specialized hardware which I assume was able to perform the job better than a general-purpose machine (&#8220;better&#8221; in this context means faster, more reliably and more cheaply). From their early days, network vendors have also extensively focused on what developers of modern distributed or <a href='/2009/08/18/the-concept-of-hyper-distributed-application/'>hyper-distributed</a> applications focus today - failure detection, fault tolerance. When application servers were still growing vertically (bigger machines with redundant power supplies, for example), network already was using distributed gossip-like protocols to exchange information.</p>

<p>Over time, however, more and more services found their home within the network layer - load balancing, virtual addresses, traffic encryption and so on. The idea was to let application remain unaware of all of this complexity on top of which it was sitting.</p>

<p>While this approach had been working for a while, it ran into a wall. Firstly, without direct control over network from applications, current setups were always extremely inflexible and high maintenance (dedicated network engineering staff, change management process in addition to application code rollouts, etc). Secondly, features baked into hardware take longer to tweak (unless vendor had sufficient foresight to plan for new requirements). Thirdly, hardware is harder to replace from financial perspective (pay up-front + maintenance).</p>

<p>The final hit was delivered relatively recently by infrastructure-as-a-service. Flexible IaaS models can&#8217;t effectively support customers&#8217; hardware. While there are places where hardware is still very visible to customers (VPN connectivity from customers&#8217; datacenters to their IaaS resources), this is a temporary phenomenon - there are numerous IaaS-compatible software solutions already (please see my disclosure in upper right).</p>

<p>Furthermore, a lot of non-packet-moving functionality can be efficiently delivered in software these days. Look at <a href='http://www.heroku.com'>Heroku</a> - their frontend routing mesh is a massively scalable load balancer that can be tweaked in real time. Good luck trying to accomplish the same in hardware.</p>
<p><strong>We currently think of the Ciscos and Junipers
of the world as hardware vendors.
What they actually are is software companies - they just don't let their
software run anywhere except on their own hardware.</strong>
I bet we are going to see
this transformation play out within the next 3-5 years. In the not-so-distant
future, network gear will go back to focusing on the one thing it does
exceptionally well - moving packets.
All other functionality will turn into software products and will be
used on application servers.</p>
]]></content:encoded>
	</item>

	<item>
		<title>Two Weeks on Twitter Without Reading My Timeline</title>
		<link>http://www.somic.org/2011/06/26/two-weeks-on-twitter-without-reading-my-timeline/</link>
		<comments>http://www.somic.org/2011/06/26/two-weeks-on-twitter-without-reading-my-timeline/#comments</comments>
		<pubDate>Sun, 26 Jun 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[internet]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/06/26/two-weeks-on-twitter-without-reading-my-timeline/</guid>
		<description><![CDATA[TL;DR Twitter reading experience is extremely inflexible and not scalable, and the company discourages third-party developers from innovating in general-client niche. Twitter must significantly improve reading experience, or allow third-party developers more freedom. In the first half of this month, I decided to perform an experiment. For at least two...
]]></description>
			<content:encoded><![CDATA[<p><strong>TL;DR</strong> Twitter's reading experience is extremely
inflexible and not scalable, and the company discourages
third-party developers from innovating in the general-client niche. Twitter
must significantly improve the reading experience, or allow
third-party developers more freedom.</p><p><hr /></p>
<p>In the first half of this month, I decided to perform an experiment. For at least two weeks I didn&#8217;t read my Twitter timeline. I only sent an occasional tweet or replied if necessary (the plan was to reply to mentions and to tweets surfaced by multiple searches that I read via RSS).</p>

<p>What could be the point of such a weird arrangement? Public tweets in general form the basis of three distinct activities - publishing, participating in a conversation and reading (Twitter as a whole also supports one-to-one private messaging via DMs).</p>

<p>Each activity delivers its own benefits at the cost of mental focus and time. Reading is unique among them, however, because in a system based on following other accounts (where each account is free to publish anything it wants), its signal-to-noise ratio is significantly lower than that of the other activities.</p>

<p>A lower signal-to-noise ratio leads to higher costs (mental focus and time spent). As such, the information I obtain by reading my Twitter timeline is relatively costly to me. The goal of the experiment was to see if I could replace the Twitter timeline with a less costly way of obtaining the same information.</p>

<p>Turns out I couldn&#8217;t do it easily. Reading blogs as I always do and checking Techmeme and Hacker News kept me informed about the most important news, but the color added by many folks I follow on Twitter was missing.</p>

<p>This outcome was somewhat expected. But there was another thing that I realized during the experiment. <strong>Twitter the company has stopped
paying attention to the reading experience (lists were their last innovation there).
Even more worryingly, it is my understanding that they actively discourage
third-party developers from building general-purpose Twitter clients. This
leaves their official stance - "river of updates" - as the only way of
consuming (reading) one's timeline.</strong></p>

<p>Maybe &#8220;river of updates&#8221; is the best approach for <em>many</em> people (even though I doubt it). Maybe even for <em>most</em>. But saying it&#8217;s the best experience <em>absolutely for all</em> is a stretch. I want bookmarks (plural is not a typo), I want the ability to sort my timeline by attributes other than time (for example - location, sender), I want an &#8220;always on top&#8221; attribute, I want filters that can be shared between users - in addition to obvious criteria such as sender, time and location, I want advanced things such as the current rate of my timeline (how many tweets per minute are appearing in my timeline now) and the send rate of the sender (how many tweets per minute the sender sent on average over the last minute, last 5 minutes and last 15 minutes).</p>

<p>Granted, I don&#8217;t mind if Twitter itself doesn&#8217;t feel that these are features worthy of their official client. But if it&#8217;s the case, Twitter must not discourage third-party clients either. And if Twitter sticks to its guns on this, I hope it won&#8217;t be too long before it&#8217;s overtaken by someone else who will provide a better reading experience.</p>
]]></content:encoded>
	</item>

	<item>
		<title>JSON vs XML in API</title>
		<link>http://www.somic.org/2011/06/08/json-vs-xml-in-api/</link>
		<comments>http://www.somic.org/2011/06/08/json-vs-xml-in-api/#comments</comments>
		<pubDate>Wed, 08 Jun 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[software-engineering]]></category>
        
		  <category><![CDATA[cohesiveft]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/06/08/json-vs-xml-in-api/</guid>
		<description><![CDATA[George Reese recently wrote a blog post about API design, William Vambenepe commented here. This is an interesting topic, I have a post on this subject too - it&#8217;s titled Developing API Server - Practical Rules of Thumb. In this post I would like to expand on the first point...
]]></description>
			<content:encoded><![CDATA[<p>George Reese recently wrote a <a href='http://broadcast.oreilly.com/2011/06/the-good-the-bad-the-ugly-of-rest-apis.html'>blog post about API design</a>, and William Vambenepe commented <a href='http://stage.vambenepe.com/archives/1777'>here</a>. This is an interesting topic, and I have a post on this subject too - it&#8217;s titled <a href='/2010/05/04/developing-api-server-practical-rules-of-thumb/'>Developing
API Server - Practical Rules of Thumb</a>. In this post I would like to expand on the first point George made in his post - JSON vs XML.</p>

<p>As you may know, I led the design and development of the <a href='http://www.cohesiveft.com/vpncubed'>VPN-Cubed</a> API at CohesiveFT, so I am approaching this subject primarily from the perspective of the API server side, not the client.</p>

<p>We designed the VPN-Cubed API to be able to support both JSON and XML. Since <em>GET</em> requests in HTTP conventionally carry no body, their arguments must be passed in the query string. For all other HTTP methods, we take the arguments as a hash, convert them to JSON (or XML), set the <em>Content-Type</em> header appropriately and send the resulting representation of the arguments as the body of our request. The client also selects the format in which it wants to get the response by using the <em>Accept</em> header. We assume a default value of <em>application/json</em> - which means JSON responses are sent by default.</p>
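<p>Here is a minimal sketch of this flow in Python. It is not the actual VPN-Cubed code - the function names and the example URL are hypothetical, and only the JSON path is shown:</p>

```python
import json
from urllib.parse import urlencode

JSON_TYPE = "application/json"

def build_request(method, url, args, accept=JSON_TYPE):
    """Client side: return (url, headers, body) for a hypothetical API call."""
    headers = {"Accept": accept}
    if method.upper() == "GET":
        # GET carries no body, so arguments travel in the query string.
        query = urlencode(args)
        full_url = url + "?" + query if query else url
        return full_url, headers, None
    # Every other method sends the argument hash as a JSON body.
    headers["Content-Type"] = JSON_TYPE
    return url, headers, json.dumps(args).encode("utf-8")

def response_format(request_headers):
    """Server side: honor the Accept header, defaulting to JSON."""
    return request_headers.get("Accept", JSON_TYPE)

# Hypothetical usage: arguments end up in the body, not the URL.
url, headers, body = build_request(
    "POST", "http://api.example.invalid/peers", {"name": "edge-1"})
```

<p>The same argument hash goes in either way; only the transport (query string or body) depends on the HTTP method.</p>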

<p>Despite being designed for both JSON and XML, the VPN-Cubed API ended up shipping with JSON only. And there is a reason why we chose JSON over XML.</p>

<p>Generally speaking, an API is an exchange of messages - the client submits a request, the server returns its response (example: &#8220;need new instance with these parameters&#8221; - &#8220;here is current representation of the instance object you requested&#8221;). In most domains, the overwhelming majority of messages map nicely to nested lists (arrays) and hashes (dictionaries). This is a key insight that plays a role in the JSON vs XML battle.</p>

<p>There is no easy and universal way to represent nested hashes and arrays in XML (if there is, I hope to hear from you about it - I need stable libraries for all major programming languages that can convert arrays and hashes to XML and back and interoperate with each other). Of course it&#8217;s possible and not terribly difficult to define such a mapping, but it&#8217;s something that one must do oneself.</p>
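<p>To illustrate the problem, here are two equally reasonable conventions sketched in Python with the standard <em>xml.etree.ElementTree</em> module. Both conventions and all names are made up for this example, and the sketch assumes hash keys are valid XML element names:</p>

```python
import xml.etree.ElementTree as ET

def to_xml_plain(tag, value):
    """Convention A: hash keys become tags, array members become 'item' tags."""
    node = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            node.append(to_xml_plain(str(key), child))
    elif isinstance(value, list):
        for child in value:
            node.append(to_xml_plain("item", child))
    else:
        node.text = str(value)
    return node

def to_xml_typed(tag, value):
    """Convention B: same layout, but every element records its Python type."""
    node = ET.Element(tag, {"type": type(value).__name__})
    if isinstance(value, dict):
        for key, child in value.items():
            node.append(to_xml_typed(str(key), child))
    elif isinstance(value, list):
        for child in value:
            node.append(to_xml_typed("item", child))
    else:
        node.text = str(value)
    return node

message = {"name": "web", "ports": [80, 443]}
plain = ET.tostring(to_xml_plain("instance", message), encoding="unicode")
typed = ET.tostring(to_xml_typed("instance", message), encoding="unicode")
```

<p>The two outputs differ for the same message, and a parser written for one convention will not read the other. Convention A also cannot tell the number 80 from the string "80" on the way back.</p>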

<p>Contrast this situation with JSON - you don&#8217;t need to worry about this, it&#8217;s already taken care of for you. The only limitation of JSON that we faced was that JSON object keys must be strings, so integer hash keys have to be converted to strings, or the hash has to be replaced with an array.</p>
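<p>A short Python illustration of this limitation and both workarounds, using made-up data. The behavior shown is that of Python&#8217;s standard <em>json</em> module; libraries in other languages may handle integer keys differently:</p>

```python
import json

by_id = {1: "alpha", 2: "beta"}

# Integer keys are silently converted to strings when encoding...
encoded = json.dumps(by_id)
decoded = json.loads(encoded)  # keys come back as "1" and "2"

# ...so one workaround is to convert the keys back after decoding.
restored = {int(key): value for key, value in decoded.items()}

# The other workaround is to send an array of [key, value] pairs
# instead of a hash, which keeps the integers intact.
pairs = json.loads(json.dumps(sorted(by_id.items())))
from_pairs = {key: value for key, value in pairs}
```

<p>Either way, client and server have to agree on the convention up front.</p>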

<p>There are certainly some features in XML that JSON doesn&#8217;t have, but the lack of a standard mapping for nested hashes and arrays is a show stopper. While your mileage may vary, I think this is the biggest reason why JSON has been slowly gaining ground against XML in the API world recently. See also <a href='http://blog.programmableweb.com/2011/05/25/1-in-5-apis-say-bye-xml/'>this post</a> on Programmable Web.</p>
]]></content:encoded>
	</item>

	<item>
		<title>IaaS vs PaaS</title>
		<link>http://www.somic.org/2011/05/25/iaas-vs-paas/</link>
		<comments>http://www.somic.org/2011/05/25/iaas-vs-paas/#comments</comments>
		<pubDate>Wed, 25 May 2011 05:00:00 +0000</pubDate>
		<dc:creator>Dmitriy Samovskiy</dc:creator>
        
		  <category><![CDATA[cloud-computing]]></category>
        
		<guid isPermaLink="false">http://www.somic.org/2011/05/25/iaas-vs-paas/</guid>
		<description><![CDATA[For a very long time, I had regarded platform-as-a-service (PaaS) as a catch-all bucket for everything cloudy that was not software delivered over Internet on demand (SaaS) or infrastructure (IaaS). Over the past several months however, with announcement of new players in PaaS space such as CloudFoundry and OpenShift, I...
]]></description>
			<content:encoded><![CDATA[<p>For a very long time, I had regarded platform-as-a-service (PaaS) as a catch-all bucket for everything cloudy that was not software delivered over the Internet on demand (SaaS) or infrastructure (IaaS). Over the past several months however, with the announcement of new players in the PaaS space such as CloudFoundry and OpenShift, I found myself thinking about PaaS in a new light.</p>

<p>PaaS currently seems to be converging on a concept that is essentially an expanded <a href='http://en.wikipedia.org/wiki/Application_server'>application server</a> (typical examples of application servers are Tomcat, WebLogic, WebSphere, GlassFish, JBoss etc). You package your web application in a certain way and upload it to the server. The server then sets up the environment (such as, for example, your database connection pools) and runs your app.</p>

<p>PaaS of course adds a few twists (examples of functionality that PaaS could offer include multitenancy, autoscaling, an API, off-premises hosting and multi-language support) but fundamentally it feels to me like a glorified application server.</p>

<p>Several observations.</p>

<p>Firstly, every big software vendor seems to have at least one product in its current lineup that in some shape or form fits into the application server space. I expect each of these vendors to repackage their offerings into a PaaS or PaaS-like product - the more the merrier.</p>

<p>Secondly, the more I think about it, the more I become convinced that a private PaaS will dominate private IaaS at enterprises for applications developed in-house. If a company adopts one of the application servers today as an internal standard, it simply makes no sense to allow internal development of any applications that would not run on them.</p>

<p>Thirdly, you gotta hand it to Google - when everyone was crazy about a cloud model popularized by Amazon EC2, they didn&#8217;t cave in and didn&#8217;t start offering low-level OS VMs. They have focused on language VMs (Python VM, JVM) and up since the very beginning - this looks exactly like what PaaS has become now. In their latest release, they added <a href='http://googleappengine.blogspot.com/2011/05/app-engine-150-release.html'>backends</a> for long-running background processes (in other words, all daemons that do not fit the HTTP request-response model). I expect other PaaS implementations to follow suit.</p>

<p>Fourthly (as a direct consequence of points #2 and #3 above), I now think that private IaaS clouds will become a place where enterprises run their vendor-supplied (possibly closed-source) non-web-based workloads. As a result, software vendors will need to adopt new ways to distribute their software. There will be no need to build installers and try to detect a machine&#8217;s hardware and OS. All software can be shipped as a VM image (with or without customer access, or maybe just partial customer access).</p>

<p>And finally, I am now convinced that today&#8217;s PaaS moniker should become application server as a service. Or - to make the acronym easier to pronounce - a webapp container as a service (WCaaS or ACaaS). There is simply too much &#8220;platform&#8221; beyond an application server use case - think data store as a service, messaging bus as a service, external connectivity as a service, load balancing as a service, naming as a service, and so on. Each of these could be a standalone service.</p>

<p>Good times for cloud computing!</p>
]]></content:encoded>
	</item>

</channel>
</rss>

