Infrastructure Renaissance

Infrastructure is Code! FTW!

Wait… WTF does that even mean?

On the one hand this is a rallying cry for those of us in the configuration automation space, on the other hand it confuses innocent bystanders.

@zhenjl: @botchagalupe but ops ppl have always coded using perl, tcl (if u remembr that :)…now we just have ‘sexier’ langs like ruby and python (original)

Sure, there have always been scripts, which are technically code, but this is missing the point a bit.

For starters, getting ops teams to recognize and steal all applicable processes and tools from developers to produce and version their scripts would be a win, but that’s just the starting line…

The point isn’t just that you can write some scripts to help you do something with your infrastructure then exit.

The ‘Infrastructure’ IS an application, a long running process with inputs, outputs and state. Or at least infrastructure is very close to becoming one…

Everyone paying attention knows something is happening… what is it?  Hazy, foggy, smoky, cloudy, whatever…

EC2 is the herald of an infrastructure enlightenment, leading us out of the dark ages.

(We’ll ignore the history of virtualization technology, and ESX Server for now, but I’m sure someone has an opinion on all this.)

The bottom line is there are resources, and I can manipulate them through an API. With EC2, these are compute resources of various shapes and sizes, with some storage resources.

Virtualized networking is not exposed through a public API yet, but we’re still climbing abstraction mountain together and we all know it is just a little higher in this cloud. (What’s up Cisco, where you at? Oh, there you are… and these guys too… thanks Ben)

At some point in the not too distant future, web service calls and a credit card will be enough to create and manipulate arbitrary compute, storage and network resources.

So now what?

Until someone provides Sysadmin as a Service, there is still quite a bit of work left to be done. These resources need to be configured to provide services.

When adding capacity meant a purchase order followed by weeks of waiting to rack and cable, taking a day or so to configure the machines was not rate limiting; it was lost in the noise. When you can bring up new server instances in minutes, that same configuration step is conspicuous waste.

You might be thinking you’ll make images for each of your services and manage those. This approach to management exchanges the configuration drift of traditional IT management for image sprawl. Do those notes that got scribbled six months ago when the web server image was created have accurate details about how that machine is configured? Are you sure? The only way to know is to start the VM and inspect it. Frankly, the same thing applies to the running instances started yesterday.

That’s where tools like Puppet come in. The benefit is not just that a machine gets configured by running a script. Puppet is an application that audits your system configuration and corrects the details that are out of sync with policy. That policy is code with semantics that can be developed like software.
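To make the audit-and-correct idea concrete, here is a minimal sketch in Python. It is not Puppet’s language or implementation; the names `Resource` and `converge`, and the plain dict standing in for a machine, are all hypothetical. The only point is that policy is data, and applying it means comparing desired state with actual state and changing just what has drifted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """One piece of desired state (hypothetical model, not a Puppet type)."""
    name: str
    desired: str


def converge(policy, system):
    """Audit `system` against `policy`, fix drift in place, return what changed.

    `system` is a plain dict standing in for a real machine. That is an
    assumption for illustration; a real tool would inspect packages,
    files and services instead.
    """
    changes = []
    for resource in policy:
        actual = system.get(resource.name)
        if actual != resource.desired:
            # Record the drift, then bring the system back in line with policy.
            changes.append((resource.name, actual, resource.desired))
            system[resource.name] = resource.desired
    return changes


policy = [
    Resource("ntp_service", "running"),
    Resource("sshd_permit_root_login", "no"),
]

# A machine that has drifted from policy.
system = {"ntp_service": "stopped", "sshd_permit_root_login": "no"}

first_run = converge(policy, system)   # reports and repairs the drift
second_run = converge(policy, system)  # nothing left to fix
```

Running it twice is the interesting part: the first run reports and repairs the drift, and the second finds nothing to do. That is what separates this from a script that runs once and exits.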

That’s the shift. Instead of thinking about managing server counts, we can reason about abstractions of service configurations. The pieces exposed by an API are not just instances of server and storage resources, but abstractions of services. The configurations of proxies, monitoring, middleware, web servers and databases are components to be composed in the dynamic ‘infrastructure’ application.

This is just the beginning, but I don’t want to spoil the rest of the story… at least not yet.

It is not necessary to change. Survival is not mandatory.

W. Edwards Deming

Agile Infrastructure

True stability results when presumed order and presumed disorder are balanced. A truly stable system expects the unexpected, is prepared to be disrupted, waits to be transformed.

-Tom Robbins

One thing that became clear as software practices matured and self-optimized was that not being able to build a project from source in an automated fashion can bring development progress to a grinding halt, particularly as more bodies are added. Without that ability to build from source in a predictable manner, which is the prerequisite for any flavor of test driven development or continuous integration, the development efforts of a growing team are like so many butterfly wings, each capable of unleashing storms on the unsuspecting halfway around the world.

But how many organizations dependent on a web application can reliably build their production servers from bare metal? Automatically? Unattended? When your application is a ‘service’ on a server, how is that fundamentally different from building a traditional application from source?

How does capacity planning change in a world where ‘Digg’ and ‘Slashdot’ are explicit goals? When Facebook can drive adoption? When adding new servers changes from a purchase order and weeks of waiting to a web service call?

If you want to participate in this ‘as a Service’ brave new world (get up in that ‘aaS’ if you will), and your plan to bring up new servers involves a meatcloud sshing their little hearts out, you might as well give up now. Seriously…

How Agile is your infrastructure?

Further, what is the plan to manage the life cycle of the servers? Most people have figured out that ‘tail -f’ is not a monitoring solution. But how many of them know exactly what is running on their machines and why? How many have servers that they are afraid to turn off because they aren’t sure what is running, but it might be important? How many configure a server, back away slowly and hope they aren’t the next one who has to touch it?

In another recent episode doing some custom Puppet work with Luke, who has essentially crossed the Developer-Sysadmin divide (I’m not sure he is a chief of the new tribe, but he’s definitely a shaman), Luke became frustrated that he couldn’t write Puppet code like he could Ruby code. (He had not written complex Puppet code for a while, since he stays pretty busy working on Puppet’s internal code.)

Sure, I guess this would be awesome if I was a sysadmin, but I can’t test this code. The only way I can have any confidence it works is to run the whole thing. I guess I just take for granted all the tools that are available to me as a developer now.

Luke Kanies

How does it change things when your infrastructure is code? Can be versioned and diffed? Can be shared and reused? Can be tested? Continuously?

How awesome will PuppetUnit or PuppetSpec be?
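As a thought experiment, here is roughly what such a test could look like. This is a sketch in Python with the standard `unittest` module, not Puppet code and not an existing PuppetUnit or PuppetSpec; `web_server_policy`, its keys and `run_policy_tests` are hypothetical names. It assumes policy can be expressed as a function returning plain data, so it can be checked without running the whole thing against a real machine.

```python
import unittest


def web_server_policy(role, port=80):
    """Hypothetical policy: return the desired state of a web server as data."""
    if not 0 < port < 65536:
        raise ValueError(f"invalid port: {port}")
    return {
        "package:httpd": "installed",
        "service:httpd": "running",
        "listen_port": port,
        "role": role,
    }


class WebServerPolicyTest(unittest.TestCase):
    def test_service_runs_and_package_is_installed(self):
        policy = web_server_policy("frontend")
        self.assertEqual(policy["package:httpd"], "installed")
        self.assertEqual(policy["service:httpd"], "running")

    def test_rejects_invalid_port(self):
        with self.assertRaises(ValueError):
            web_server_policy("frontend", port=70000)


def run_policy_tests():
    """Run the suite and return True if every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(WebServerPolicyTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()
```

Calling `run_policy_tests()` on every commit would give the fast feedback loop Luke is missing: the policy is exercised in isolation, long before it touches a server.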

Test Driven Infrastructure?

It is only a matter of time…
