Ecosystems and Platforms

21 Oct 2010

Here are the links to four very interesting and important posts about Internet software ecosystems and platforms, from four very influential people, in chronological order.

Chris Dixon – http://cdixon.org/2009/12/30/whats-strategic-for-google/
Fred Wilson – http://www.avc.com/a_vc/2010/04/the-twitter-platform.html
Mark Cuban – http://blogmaverick.com/2010/09/30/product-vs-feature-the-lesson-of-xmarks/
Seth Godin – http://sethgodin.typepad.com/seths_blog/2010/10/the-business-of-software.html

Make sure to read between the lines as well.

Categories: internet |

Devops != Sysadmin

06 Oct 2010

WARNING: This post may be misleading. What I call “devops” below is not what the canonical description of the term seems to be. Please make sure to read comments after this post to get a better picture.

In the comments to my Rise of Devops post from earlier this year, Damon Edwards asked an interesting question:

Why do you consider DevOps a standalone discipline? Isn't what you described just the characteristics of state of the art systems administration?

In other words, is devops a standalone thing or is it just a fancy new name for a sysadmin? I couldn’t answer the question back then, but now, having thought about it for a bit, I am at least in a position to offer an opinion.

Most IT organizations consist of 2 groups of people: developers and non-developers. Whether these two groups work together or hate each other’s guts doesn’t really matter for the purposes of this post. The non-developers group is traditionally subdivided into facilities, systems, network, storage, applications and security. In smaller teams, an individual could be a part of several functions from this list (people call it “wearing many ops hats”); in bigger teams each function could be its own department; in extra large teams, each function may even have its own director. Developers write and maintain the code that powers the business; non-developers make sure that whatever this code needs in order to work properly is there and works as it should.

But there is another, hidden taxonomy for the non-developers group. If we look at it with the focus on individual activities instead of on people’s job descriptions, we will see this group’s work consisting of:

activities around hardware purchased from vendors
activities around software purchased from vendors (excluding software delivered as a service)
activities around software and infrastructure delivered from vendors as a service
activities around open-source and/or freely downloadable software
activities around software developed in-house

I believe that the first 2 types of activities are mostly SYSTEMS ADMINISTRATION - the common theme here is MANAGE or ADAPT. One needs to perform day-to-day tasks like user account management and troubleshooting; identify, prepare, test, perform and validate changes; analyze utilization and plan capacity expansion. Because items under management in this category are purchased from vendors, the expectation is that the vendor provides everything necessary to run their stuff successfully, and the sysadmin’s main goal here is to adapt it to run optimally in this company’s environment. The analogy here is molding something into shape. To excel in this category, one needs a lot of knowledge about a particular system, its gotchas and peculiarities. Such knowledge is usually obtained through experience, trial and error, vendor trainings and user group meetings. One can totally do this without a software development background.

The other 3 types of activities, on the other hand, are mostly DEVOPS (infrastructure as code) - in contrast to MANAGE and ADAPT, the main themes here are CREATE or ASSEMBLE. It’s not taking something that already exists in some form and molding it into the shape you need - it’s taking a lot of standalone pieces and getting them to work together (like building with Lego). There is no way this can be done successfully at scale without a software development background - devops is dominated by developers who turned into ops, or ops who have been doing a lot of serious scripting all along.

Interestingly, startups are more likely to lean towards hiring devops-minded engineers, because these days a lot of startups don’t need to manage their own hardware anymore and often can build their products on software stacks which do not include any proprietary software (that would need to be purchased from vendors). There might not be a need for a sysadmin at startups anymore, but it doesn’t mean that sysadmin skills are no longer needed anywhere else in IT - just look at all the big IT companies on NASDAQ and NYSE selling big iron or complex software systems.

It’s also worth noting that the separation may not be as clear-cut - for example, controlling hardware devices programmatically, using published or unpublished APIs or telnet & expect voodoo, definitely fits under devops.

Therefore, in my opinion, devops is indeed a standalone discipline, but it doesn’t mean that a given person’s role in an organization could be devops. We are all engineers, and the difference is in what kind of activities one performs during a given day - when you perform a firmware upgrade on a piece of hardware, it’s sysadmin; when you put together a cluster management system for your nodes in EC2, it’s devops.

P.S. Yes, I know that what I keep calling “devops” is not what others call “devops” - a culture of collaboration between developers and ops. I just don’t know what else to call this.

Categories: devops |

Dealing with Noisy Neighbors in the Cloud

28 Sep 2010

This is part 2 of my series dedicated to pricing in the cloud.

As I mentioned in the past, pricing is one of the most important aspects of cloud computing offerings. Up until now, however, I have been talking about pricing only from the perspective of selling the services. This post is going to be different - today I hope to show you how pricing could be used to solve a very technical issue in IaaS.

Public IaaS clouds are usually multi-tenant - your virtual machines (VMs) are running alongside other customers’ VMs, potentially on the same hardware and infrastructure such as network and storage. The cloud makes no guarantees regarding placement of your instances, and since underlying resources and access to them are shared among all VMs, it’s not unreasonable to assume that from time to time your VMs may end up with “noisy neighbors.” In this context, a “noisy neighbor” is a VM that requests or is using a disproportionately large part of some shared resource.

While it may not be completely impossible to solve this problem in the hypervisor or in the architecture (for example, some noisiness can be reduced by allocating a dedicated network card to each VM), there could be another way.

In economics, noisy neighbors are a typical example of a negative externality. There are many interesting things about externalities, but one of the most well known is the Coase theorem. It turns out that under certain circumstances, negative externalities could be bargained away - the recipient of a negative externality could pay the party causing it to make it stop, or the party causing it could pay the recipient to stop complaining.

There is a catch, however. Coasian bargaining could be hampered by transaction costs - when it’s difficult or not practical to get all parties around a table, the theorem won’t work. In our case, the transaction costs are effectively infinite - no single user of an IaaS cloud knows who her neighbors are. This would be an insurmountable challenge, unless…

Unless the cloud provider itself steps up and facilitates the bargaining. The cloud provider knows exactly who everyone’s neighbors are and knows exactly who is noisy (i.e., whose instances are consuming a lot of shared resources). If the cloud provider agrees to act as a proxy for bargaining negotiations, it could totally work. Here is one way of doing it.

When a customer launches new instances, she could specify the amount of money for which she would be willing to get her instances terminated or moved to another location. Let’s say customer A chose $1 - that’s how much she values inconvenience of being forced to move. Customer B’s instances are neighbors of customer A’s instances, and B would like her noisy neighbor to be not as noisy (cloud provider may have to offer some sort of aggregated view into current load on shared resources, so that B can confirm that indeed her problem is a noisy neighbor, not a bug in her own code).

If B is willing to pay only $0.5 to get A moved (“silenced”), nothing will change, because the offer is below the $1 that A asked for. But if B is willing to part with $2 in order to get A to be quiet, we get a deal. A’s instance moves to another location but she gets $1 for her trouble, B’s instance gets to enjoy more resources for $2, and the cloud provider could even take a cut from the transaction (in the simplest model, the cloud provider could pocket the difference). Everybody wins! There are several technicalities that would need to be taken care of here, but the main point remains - a purely technical problem could be resolved with a pricing mechanism (and a bit of technology).
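The deal described above can be sketched in a few lines of Python. This is only an illustration of the simplest model (the provider pockets the difference); the names `Deal` and `broker_move` are hypothetical and do not correspond to any real cloud API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Deal:
    paid_by_bidder: float  # what B pays to have the neighbor moved
    paid_to_mover: float   # what A receives for the inconvenience
    provider_cut: float    # difference kept by the cloud provider


def broker_move(move_price: float, bid: float) -> Optional[Deal]:
    """Return a Deal if the bid covers the neighbor's stated move price, else None."""
    if move_price < 0 or bid < 0:
        raise ValueError("prices must be non-negative")
    if bid < move_price:
        return None  # offer too low: nothing changes
    return Deal(paid_by_bidder=bid,
                paid_to_mover=move_price,
                provider_cut=bid - move_price)


# The two cases from the text: A asks for $1, B offers $0.5 or $2.
print(broker_move(1.0, 0.5))
print(broker_move(1.0, 2.0))
```

A real mechanism would also have to decide how the cut is split and how often a customer can be moved, which are among the technicalities mentioned above.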

So who is with me? Do you think we will see anything like this before the end of the year? By the end of next year? If not, why?

Categories: cloud-computing | economics |

Pricing in the Cloud

13 Sep 2010

I started writing this post, but realized that it was going to be way too long. So I decided to split it into a series. Here goes part 1 of my series on Pricing in the Cloud.

Happy Birthday, Amazon EC2! You are 4 years old now - I wonder how many that would be in human years. In the blog post commemorating this milestone, Jeff Barr linked to a “really interesting developer position” open at EC2 right now. I clicked the link - looks like they are searching for someone to work on pricing models. This is indeed quite an exciting job opportunity, at the very intersection of my interests, but since I am not applying, I figured I may offer some thoughts on the subject here.

Revenue Management

Since the very beginning of EC2, pricing has been a necessary condition for a successful IaaS cloud. Billing by the hour, instead of by the month as was the standard in VPS land (which EC2 successfully disrupted), enabled a whole new set of workloads - such as the now-famous TIFF conversion by the New York Times or the Animoto scale-up event. Neither would make sense with monthly billing - and while the importance of APIs in IaaS is well known, the role played by pricing is often not as obvious. It’s easier for technologists to focus on features, but not paying attention to pricing is a mistake - my former colleague Alexis Richardson was among the first to note (back in 2008), “Virtualization is the technology, but cloud is a business model.”

IaaS cloud is an excludable non-rivalrous good. Excludable because one doesn’t get access without paying. Non-rivalrous because its use by one customer does not usually prevent other customers from using it. From this perspective, the business of selling cloud is similar to selling tickets to fill a movie theater or selling seats to fill an airplane.

An individual instance slot (an airplane seat’s equivalent) is a perishable resource - if it’s not filled for a period of time, the potential revenue it could have delivered during this time is lost forever. The discipline that studies pricing of perishable resources is called yield management (or revenue management) - a super interesting field which includes elements from economics, statistics, mathematics and psychology. Quoting from Wikipedia: "The challenge is to sell the right resources to the right customer at the right time for the right price."

Lessons from the Airlines

I once watched a documentary on CNBC called “Inside American Airlines.” If I remember correctly, they would show an airplane full of passengers and say something like: no two people on this plane paid the same price for their travel. Think about it - it’s fascinating: a hundred people, nearly identical seats, all paid different prices. This is a key insight into revenue management in the cloud - how to get each customer to pay the maximum they are willing to pay. The solution, based on years of experience of the airline industry, is to have a gazillion pricing schemes and options selling from the same pool of nearly identical seats. How do they do it? You could buy a ticket from the airline’s web site, or from an airline agent, or from a travel agent, or from a non-airline travel web site (such as orbitz.com where I used to work). You could buy a multi-segment itinerary, or you could buy each segment as a separate ticket (a segment is one take-off and one landing). You could buy in advance, or you could buy just a couple of hours before take-off. You could buy with an option to cancel for a refund, or you could buy a non-refundable ticket. And so on and so forth - the list of pricing options will keep growing and growing.

A given combination of options defines a product. Even though you may think you are buying a right to occupy a seat, in reality you are buying a product. Each product is designed to meet a specific need and compete for a specific type of customer - for example, to provide transportation to a family going on a Hawaii vacation or to accommodate the busy schedule of a businessman willing to pay the first class fare at the last minute. The more products are offered, the more different kinds of needs an airline could satisfy, all from the same pool of seats!
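To see why many products beat one price, here is a toy calculation in Python. The willingness-to-pay figures are made up for illustration, and the function names are hypothetical; the point is only to compare the best single price against the upper bound where every customer pays exactly what they are willing to pay.

```python
def single_price_revenue(willingness, price):
    """Revenue at one price: only customers who value a seat at least that much buy."""
    return price * sum(1 for w in willingness if w >= price)


def best_single_price(willingness):
    """Best single price, chosen from the customers' own valuations."""
    return max(willingness, key=lambda p: single_price_revenue(willingness, p))


def targeted_revenue(willingness):
    """Upper bound: each customer pays exactly what they are willing to pay."""
    return sum(willingness)


# Hypothetical valuations of four travelers for nearly identical seats.
willingness = [100, 200, 300, 800]
price = best_single_price(willingness)
print(price, single_price_revenue(willingness, price))
print(targeted_revenue(willingness))
```

With these numbers the best single price sells only one seat, while perfectly targeted prices fill all four. Real products only approximate the upper bound, since they sort customers by options (refundable, last-minute and so on) rather than by reading their minds.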

Revenue Management in the Cloud

It’s not difficult to see that AWS have already started on this path (and sooner or later, other cloud providers will catch up and start doing the same). In the cloud, the more pricing models are offered, the more types of workloads will find their way into the cloud. Therefore, if I had to guess what kind of pricing they will unveil next, I simply have to look at what kind of workloads don’t yet fit into any pricing schemes today (more on that in future posts).

AWS initially launched EC2 with on-demand pricing for a single instance type (m1.small). Over time they increased the number of instance types - so that more CPU intensive workloads could run in the cloud. Then they offered reserved instances - so that long-running workloads could find their way into the cloud.

But spare capacity still remained (I assume) - and so spot instances were introduced. In exchange for a very attractive price, they secured a right to terminate an instance - thus allowing more short-lived workloads to run in the cloud. And it’s an instant win-win for customers and the cloud provider. If I can identify my workload as one that fits a spot instance (I do nearly all of my testing and some development at work on spot instances), I free up slots for on-demand instances, which EC2 happily sells to others - thus increasing the number of customers that they can support with fixed capacity and reducing the overall number of insufficient capacity errors.
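The capacity argument can be illustrated with a deliberately simplified Python model. It assumes a fixed pool of slots where on-demand requests are served first and spot requests fill whatever is left; it ignores bidding and everything else about how EC2 actually schedules spot instances, and `allocate_slots` is a hypothetical name.

```python
def allocate_slots(capacity, on_demand, spot):
    """Fill a fixed pool: on-demand requests first, spot requests in what is left."""
    if min(capacity, on_demand, spot) < 0:
        raise ValueError("arguments must be non-negative")
    on_demand_served = min(on_demand, capacity)
    spot_served = min(spot, capacity - on_demand_served)
    return {
        "on_demand_served": on_demand_served,
        "spot_served": spot_served,
        # on-demand requests that would see an insufficient capacity error
        "insufficient_capacity": on_demand - on_demand_served,
        # slots earning nothing during this period
        "idle": capacity - on_demand_served - spot_served,
    }


# Quiet period: spot workloads soak up the 4 slots on-demand does not need.
print(allocate_slots(capacity=10, on_demand=6, spot=8))
# Busy period: on-demand grows to 9, so only 1 spot slot survives.
print(allocate_slots(capacity=10, on_demand=9, spot=8))
```

In both periods no slot sits idle and no on-demand request is rejected, which is the win-win described above: terminable workloads absorb the slack instead of competing for on-demand slots.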

This is price targeting at its best. For more on price targeting techniques, please read Tim Harford’s “The Undercover Economist” or see this blog post.

To sum up:

attractive pricing is a necessary condition of IaaS
pricing is often overlooked by technologists who naturally focus on features
selling cloud resources is similar to airline selling seats on planes
the more pricing schemes are offered, the more different classes of workloads can optimally run in the cloud
sooner or later, I predict that other IaaS clouds will follow AWS lead and will start offering different pricing schemes

Don't forget that this is just the beginning of the Pricing in the Cloud series - subscribe to the feed via Feedburner, or open this blog in Google Reader.

Categories: cloud-computing | economics |

Extending EC2 API - ec2-describe-ipaddress-ranges

01 Sep 2010

Do you remember how we used to programmatically consume services on the web before the proliferation of APIs? That’s right - scraping! And do you know what prevents us from using this technique now, when some piece of data you need for your application is not available via an API? That’s right - absolutely nothing!

I recently came across something that was not available via the EC2 API - lists of IP address ranges for each EC2 region. The AWS team maintains such lists in their forums - currently at http://developer.amazonwebservices.com/connect/ann.jspa?annID=735 (I say “currently” because annID has already changed at least once). I asked for an API, but in the meantime, since I needed this now, I wrote a simple python script to scrape and parse the data - http://gist.github.com/559397. Enjoy! By the way, who would have thought that one would need to resort to scraping AWS, the very pioneers of infrastructure APIs, to get this simple bit of information?
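For readers who want the flavor of the approach, here is a minimal Python sketch of the parsing half of such a scraper. It is not the gist linked above: it assumes the announcement text lists ranges in CIDR notation, which is an assumption about the page format, and the function name `extract_cidr_ranges` and the sample text (which uses documentation-reserved addresses, not real EC2 ranges) are hypothetical.

```python
import ipaddress
import re

# Anything that looks like an IPv4 CIDR block, e.g. 192.0.2.0/24.
CIDR_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}/\d{1,2}\b")


def extract_cidr_ranges(text):
    """Return the valid IPv4 networks found in text, in order, without duplicates."""
    networks = []
    for candidate in CIDR_RE.findall(text):
        try:
            # strict=False tolerates host bits being set in the scraped value.
            network = ipaddress.ip_network(candidate, strict=False)
        except ValueError:
            continue  # looked like a CIDR block but is not a valid one
        if network not in networks:
            networks.append(network)
    return networks


# Made-up sample standing in for the scraped forum announcement.
sample = """
Region one: 192.0.2.0/24 (256 addresses)
Region two: 198.51.100.0/25, 192.0.2.0/24
Not an address: 999.1.1.1/24
"""
for network in extract_cidr_ranges(sample):
    print(network)
```

The page text itself could be fetched with the standard library's `urllib.request.urlopen` and decoded before being passed in; grouping ranges by region would depend on the actual layout of the announcement, so it is left out here.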

Categories: cloud-computing | python |