Developer's Attempt to Define Cloud Computing
I have been closely following cloud computing for many months now. As a developer, I often get frustrated by the lack of a clear and widely accepted definition of what cloud computing actually is. This is a problem because, without a definition, every imaginable operation performed over the Internet has all of a sudden become a “cloud.” That dilutes the value and obscures the innovation that the concept of cloud computing stood for in its early days.
The term “cloud computing” consists of two words: “cloud” and “computing.”
Cloud
Traditionally, an image of a cloud is used on network diagrams to denote an opaque network entity (for example, the Internet or an MPLS cloud). Opaque in this case means that to an end user it’s a black box - you hook up inputs and outputs as directed, and you get functionality. In addition to opaqueness, clouds on network diagrams usually possess other, less obvious properties:
- the cloud is multi-tenant (many end users share the same one)
- cloud resources (links, bandwidth) are not dedicated (each end user may use some, up to their quota; if user A no longer uses a resource, the cloud can assign it to user B)
- the cloud is outside of the end user's full control
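The "not dedicated" property can be made concrete with a toy model. The sketch below is only an illustration under assumptions of my own: the `SharedPool` class, its method names, and the numbers are all hypothetical, not any real provider's API. It shows two tenants whose quotas add up to more than the pool holds, so what one tenant releases becomes available to the other.

```python
class SharedPool:
    """Toy model of a multi-tenant pool with per-tenant quotas (hypothetical)."""

    def __init__(self, capacity, quotas):
        self.capacity = capacity            # total units in the pool
        self.quotas = dict(quotas)          # tenant -> max units it may hold
        self.in_use = {tenant: 0 for tenant in quotas}

    def free(self):
        """Units not currently held by any tenant."""
        return self.capacity - sum(self.in_use.values())

    def acquire(self, tenant, units):
        """Grant units only if both the tenant's quota and the pool allow it."""
        within_quota = self.in_use[tenant] + units <= self.quotas[tenant]
        if within_quota and units <= self.free():
            self.in_use[tenant] += units
            return True
        return False

    def release(self, tenant, units):
        """Return units to the pool so another tenant can use them."""
        self.in_use[tenant] -= min(units, self.in_use[tenant])


# Quotas (8 + 8) exceed capacity (10): nothing is dedicated to anyone.
pool = SharedPool(capacity=10, quotas={"A": 8, "B": 8})
pool.acquire("A", 8)               # A takes 8 of the 10 units
b_first = pool.acquire("B", 4)     # refused: only 2 units are free
pool.release("A", 6)               # A stops using 6 units
b_second = pool.acquire("B", 4)    # granted: the freed units go to B
```

User B's first request fails even though it is within B's quota, and succeeds only after user A gives resources back; that is the scarcity and reassignment the list describes.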
Computing
First, allow me to note that I strongly disagree with a purely linguistic approach here. To linguists, “computing” and “computer” derive from the same root, so “computing” is an action that involves a “computer.” I disagree because that reading is too general to be useful for our case.
I define “computing” as running user-provided software. The software doesn’t have to be developed by the user - one can download it from the web and run it - but it is still the user who provides it. In contrast, if you use a web site to perform a certain operation, you also use software, but that software is developed and operated by the web site; hence it’s a service, not computing.
My Definition of Cloud Computing
Cloud computing is the use of opaque, multi-tenant networks of computers outside of the end user's full control, with the primary goal of running software provided by the end user, in which computational resources are allocated dynamically (as opposed to being permanently assigned).
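Since the definition is a conjunction of clauses, it can be written down as a checklist. The following is a minimal sketch, assuming each clause can be reduced to a yes/no answer; the `Platform` type, its field names, and the two example entries are hypothetical and filled in by hand purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    opaque: bool                  # a black box to the end user
    multi_tenant: bool            # shared by several tenants
    outside_user_control: bool    # the end user lacks full control
    runs_user_software: bool      # primary goal is running end-user software
    dynamic_allocation: bool      # resources are not permanently assigned


def is_cloud_computing(p: Platform) -> bool:
    """True only when every clause of the definition holds."""
    return all([
        p.opaque,
        p.multi_tenant,
        p.outside_user_control,
        p.runs_user_software,
        p.dynamic_allocation,
    ])


# Hypothetical inputs: rented shared machines running the user's own code,
# and a hosted backup service that runs only the provider's software.
rented_machines = Platform(True, True, True, True, True)
hosted_backup = Platform(True, True, True, False, True)

rented_result = is_cloud_computing(rented_machines)
backup_result = is_cloud_computing(hosted_backup)
```

Failing any single clause is enough to fall outside the definition; the backup service fails only the "runs user-provided software" test.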
Examples and Caveats
- If we take the well-known SPI model (Software as a Service, Platform as a Service, Infrastructure as a Service), then contrary to current mainstream thinking, only IaaS can be cloud computing, when the end user provides the software to run.
- I added the clause about the "primary goal" to exclude things like Google Spreadsheet from cloud computing. Even though a spreadsheet program may run macros (which are software code) and such macros could be provided by the end user, it's still not cloud computing, because the primary goal of a spreadsheet program is number crunching, not running macros.
- Programming frameworks (Hadoop, for example) can go either way: Hadoop can be cloud computing when the end user provides their own map and reduce functions; but if the end user only runs the defaults or the functions that ship with the Hadoop distribution, no software is supplied by the end user, so it's not cloud computing.
- Things like storage as a service and backup as a service are all "cloudy," but they are not computing. There is already a term for this - the Internet. Therefore, I consider "cloudy" by itself to be a redundant term.
- Google AppEngine (GAE) is a cloud computing platform. Many don't put it into the IaaS category because it doesn't give customers access to low-level hypervisor-based VMs. But this alone doesn't make it non-IaaS from a developer's standpoint. After all, a VM in the hypervisor model is one thing and a VM in the language interpreter model (JVM, Erlang VM, Python VM, etc.) is another, but the latter is still a VM in the sense that it encapsulates running code and proxies all system-level requests through its abstraction layer. GAE provides access to its BigTable and memcache infrastructure, so to me it's very much an IaaS system and satisfies my definition of "cloud computing."
- In my opinion, multi-tenancy is a necessary condition for a cloud computing platform. The tenants need not be different companies - they can be different business units or different departments. The key is that there must be scarcity and dynamic allocation of resources. If all resources are dedicated to one organization and simply switched between applications, it's not cloud computing - it is simply infrastructure controlled via an API.
- The same goes for on-premises server farms with cloudy features - they are not cloud computing, because they are not opaque to the end user and they are under the end user's full control.
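The Hadoop caveat in the list above hinges on who supplies the map and reduce functions. The toy sketch below illustrates that split in plain Python. It is not the Hadoop API: `run_job` is a hypothetical single-process stand-in for the framework, and `word_mapper` and `count_reducer` play the part of the software provided by the end user.

```python
from collections import defaultdict


def run_job(records, mapper, reducer):
    """Toy single-process map/reduce driver (a stand-in for the framework)."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):    # map phase
            groups[key].append(value)        # shuffle: group values by key
    # reduce phase: one reducer call per key
    return {key: reducer(key, values) for key, values in groups.items()}


# Software supplied by the end user: a word-count mapper and reducer.
def word_mapper(line):
    for word in line.split():
        yield word.lower(), 1


def count_reducer(word, counts):
    return sum(counts)


result = run_job(["the cloud", "The network cloud"], word_mapper, count_reducer)
```

Under the definition in this post, it is the presence of the user's own `word_mapper` and `count_reducer` that makes the job "computing"; the driver alone is just the platform.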
Conclusion
All in all, I hope this blog post gets us closer to finally figuring out once and for all what “cloud computing” is and what it isn’t.