Monthly Archives: April 2010

Cloud Standards Considered Harmful

The nice thing about standards is that there are so many of them to choose from.
–Andrew S Tanenbaum

standard -noun

  1. something considered by an authority or by general consent as a basis of comparison; an approved model.
  2. an object that is regarded as the usual or most common size or form of its kind: We stock the deluxe models as well as the standards.
  3. a rule or principle that is used as a basis for judgment: They tried to establish standards for a new philosophical approach.
  4. an average or normal requirement, quality, quantity, level, grade, etc.: His work this week hasn’t been up to his usual standard.

SQL was first developed at IBM in the early 70s.

Many of the first database management systems were accessed through pointer operations and a user usually had to know the physical structure in order to construct sensible queries. These systems were inflexible and adding new applications or reorganizing the data was complex and difficult.

ANSI adopted SQL as a standard in 1986, after a decade of competing commercial products using SQL as the query language.

SQL became ‘the standard’ because it was open, straightforward, relatively simple and helped solve real problems.

TCP/IP emerged as the standard after a proliferation of competitive networking technology for largely the same reasons.

(another interesting story of emergent standards is POSIX, but apparently no one posts about it in any detail online, and you can only read about it if you are willing to part with $19… you know, the marginal cost of producing a PDF and all.)

People often compare cloud computing to a utility like electricity, one big happy grid of computational resources. Often those same people champion the call for ‘standards’, which makes me wonder if they have traveled much.

The call for standards is usually trumpeted with a need for ‘interoperability’ and avoiding lock in. We all know how well SQL standards prevent vendor lock for databases.

In discussing the evolution of standards with @benjaminblack, I pointed out that TCP/IP was more ‘standardized’ than SQL. His perspicacious response noted that with TCP/IP ‘if you don’t interop you are useless’ and ‘if databases had to talk to each other, they’d interop, too’.

Interoperability arising from a standard is a lie. The order is wrong. Interoperability comes because everyone adopts the same thing, which becomes the standard. Don’t confused a ‘specification’ with a ‘standard’. SQL became the defacto standard long before it was ‘officially’ a standard. SQL implementations will never be fully interoperable and truth be told there are often real advantages in proprietary extensions to that standard. TCP/IP became the defacto network standard and interoperable because that’s just the natural order of things. Interoperability will happen because it must, or else it won’t. Interop cannot come from a committee.

Interoperability is even more of a lie when it comes to cloud computing. If we are talking about IaaS (infrastructure as a service) then the compute abstractions for starting, stopping, and querying instances are almost trivial compared to the work of configuring and coordinating instances to do something useful. Sysadmin as a Service isn’t part of the standards. This is so trivial that you can find open source implementations that abstract the abstractions to a single interface. (Seriously, libcloud is just over 4K lines of python to abstract a dozen different clouds. At this point, supporting a new cloud with a halfway decent API is a day or two at most) The storage abstractions are in their infancy and networking abstractions are nearly non-existent in the context of what people consider cloud infrastructure. The APIs and formats are a distraction from the real cloud lock in, which is all the data. You want to move to a new cloud? How fast can you move terabytes between them? Petabytes?

Which brings me to PaaS (platform as a service), otherwise known as ‘locked in’ as a service. PaaS has all the same data issues, but without any common abstractions whatsoever. I mean sure, you could theoretically move a non-trivial JRuby rails app from Google App Engine to Heroku, but let’s be honest, sticking your face in a ceiling fan would probably be more fun and leave less scarring. That’s an example that is possible in theory, but in most cases, PaaS migration will mean a total rewrite.

Finally, SaaS (software as a service), which I love and use all the time, but I can’t convince myself that every web app is cloud computing. (Sorry, I just can’t.) Again, data is the lock in, please expose reasonable APIs, but standards don’t make any sense.

Committee-driven specifications get some adoption because most people like it when someone else will stand up and take responsibility for leading the way to salvation. CORBA and WS-* aren’t the worst ideas ever (I give that prize to Enterprise Java Beans) but they aren’t always simple or straight forward in comparison to other solutions. Adopting an official standard is good for three things, first, providing some semblance of interoperability, second, stifling innovation and finally, giving power to a standards body. For cloud computing, a standard in the name of interoperability is essentially solving a non-problem and calcifying interfaces pre-maturely.

Frankly, I’d rather double down on more innovation. Standards will emerge.

You want to make a cloud standard? Implement the cloud platform everyone uses because it is simple, open and solves real problems.

(Thanks to Ben Black for his feedback and for telling the same story a different way last year.)


%d bloggers like this: