Category: infrastructure-development

Network: From Hardware Past To Software Future

At this year’s GigaOm Structure conference, one event attracted my interest the most: the network virtualization panel (I didn’t attend the conference; I only followed along over the Internet). It wasn’t just because it involved OpenFlow. I think there is a bigger trend at play...

On Importance of Planning for Failure

As you have probably heard, on April 21 of this year the US East region of the Amazon EC2 cloud experienced a severe outage. The event received considerable coverage around the blogosphere; you can find the most comprehensive collection of links on the topic at highscalability.com. The guidelines of design for failure...

Coffee and Design for Failure

In the wake of the Judgment Day Outage, I would like to offer you a story. One morning John Doe woke up and decided he wanted a grande mocha. The nearby Starbucks was 3 minutes away by car. John went to his garage, but it looked like his garage door opener wouldn’t...

The Biggest Challenge for Infrastructure as Code

What do you do when you come across a piece of open source software that you’d like to try? You could download its source code tarball, extract the files, build and install it following the rules and conventions for a given programming language (./configure && make && make install, ruby...

Are You a Responsible Owner of Your Availability?

Last month AWS released the Reduced Redundancy Storage feature of S3. Several aspects of this announcement appeal to different people, but I especially appreciated one: S3 now offers a choice of less availability for a lower price. Availability of your system, just like any other part...

The Rise of DevOps

If you are in IT, you have probably noticed that most of the industry’s technical buzz lately has centered on one of three huge areas: cloud computing, NoSQL, and DevOps. Unlike Web 2.0 or Social Web, which are about content generation and content consumption models on the Internet, these...

Normal Accidents in Complex IT Systems

Designing a fully automated or nearly fully automated computer system with many moving parts and dependencies is tricky, whether the system is distributed, hyper-distributed, or otherwise. Failures happen and must be dealt with. After a while, most folks grow from “failures are rare and can be ignored” to “failures are not...

Punching UDP Holes in Amazon EC2

Disclaimer 1: Despite its possibly ominous name, this is NOT a network vulnerability or an attack that could lead to unauthorized access. UDP hole punching requires cooperation between two hosts, and hence can’t easily be used as an attack by itself (in other words, in order to run it, you...
