Category: devops

The Rise of New Operations

It has been six years since I wrote a blog post titled “The Rise of Devops”. Many things have changed during this time, and I realized a re-evaluation could be interesting. Today, in 2016, here is where I think we are. 1. Operations’ main focus is now scalability. In the...

5 Pitfalls of Increased Use of Statistics in Tech Ops

Our field is currently undergoing a seismic change towards becoming more and more quantitative. While in the past a chart was viewed by many as state of the art, charts won’t surprise anyone today. In fact, we now have systems that can produce any number of charts at varying time...

A Path to DevOps Through Platform/Product Alignment

The topic of DevOps in the enterprise has been discussed extensively. Many people have made meaningful contributions here thanks to their expertise in both. There are others, however, who know only one side well and whose contributions are adding quite a bit of noise...

Risk in IT Systems

TL;DR - This post is not about ways and techniques to run IT systems better or more reliably, or to deal with failure more efficiently; it's about mathematical models (or lack thereof) of what risk in IT systems actually is. I bet at least 80% of you will stop reading...

Concise Introduction to Infrastructure as Code

After my last post, I received several questions about how one could get started with infrastructure as code. While I can’t provide a thorough step-by-step guide that will cover all possible situations and nuances, I thought I’d post a very brief generic outline. The end goal of infrastructure as code...

Infrastructure As Code - Tiki-Taka of TechOps

Techops has traditionally pursued a dual mandate. On one hand, part of its resources is dedicated to new projects and expansion initiatives. On the other hand, there’s always been a significant effort to make sure existing systems are up and running. In each focus area, the industry has developed...

Complex Systems: Generalists and Specialists

The following tweet (https://twitter.com/#!/saschabates/statuses/119793462261452800) from @saschabates that appeared in my stream this morning caught my attention: “#surgecon emergent theme: complex systems cannot be effectively diagnosed without smart generalists who understand them end to end” This statement is correct (otherwise it wouldn’t have been a theme emerging out of one...

Troubleshooting

One of the areas of tech ops that doesn’t get its fair share of discussion is troubleshooting. It’s not easy to teach troubleshooting - possibly because how successfully one can troubleshoot a given system largely depends on one’s experience with the system and on the quality of the system’s feedback loops...

On Importance of Planning for Failure

As you probably heard, on April 21 of this year the US East region of the Amazon EC2 cloud experienced a severe outage. The event received considerable coverage around the blogosphere - you can find the most comprehensive collection of links on the topic at highscalability.com. The guidelines of design for failure...

Devops != Sysadmin

WARNING: This post may be misleading. What I call “devops” below is not what the canonical description of the term seems to be. Please make sure to read the comments after this post to get a better picture. In the comments to my Rise of Devops post from earlier this year,...

The Biggest Challenge for Infrastructure as Code

What do you do when you come across a piece of open source software that you’d like to try? You could download its source code tarball, extract the files, build and install it following the rules and conventions for a given programming language (./configure && make && make install, ruby...

Are You a Responsible Owner of Your Availability?

Last month AWS released the Reduced Redundancy Storage feature of S3. Several aspects of this announcement appealed to different people, but I especially appreciated one part - S3 now offers a choice of less availability for a lower price. Availability of your system, just as any other part...

Devops - Solution to a Problem, Not a Cure for All Ills

With great interest I read a recent post by Chris Hoff on devops disconnect (make sure to read the comments too). Devops as a way to promote “collaborative and communicative culture” (see John Allspaw’s comment) - “devops the culture” henceforth - was born out of frustration on both sides of...

The Rise of DevOps

If you are in IT, you probably noticed that most of the industry’s technical buzz lately has been centered around one of three huge areas - cloud computing, nosql and devops. Unlike Web 2.0 or Social Web, which are about content generation and content consumption models on the Internet, these...

Operations Alerts and Tragedy of The Commons

Today I would like to continue my never-ending quest of finding parallels between IT and economics and social sciences. I will start with a preamble, but if you are already familiar with the concept of an “operations alert” in the context of IT, you can skip it. Preamble: I have spent...
