True stability results when presumed order and presumed disorder are balanced. A truly stable system expects the unexpected, is prepared to be disrupted, waits to be transformed.
One thing that became clear as software practices matured and self-optimized was that not being able to build a project from source in an automated fashion can bring development progress to a grinding halt, particularly as more bodies are added. Without that ability to build from source in a predictable manner, which is the predicate for any flavor of test driven development or continuous integration, the development efforts from a growing team is like so many butterfly wings each capable of unleashing storms on the unsuspecting halfway around the world.
But how many organizations dependent on a web application can reliably build their production servers from bare metal? automatically? unattended? When your application is a ‘service’ on a server, how is that fundamentally different from building a traditional application from source?
How does capacity planning change in a world where ‘Digg’ and ‘Slashdot’ are explicit goals? When Facebook can drive adoption? When adding new servers changes from a purchase order and weeks of waiting to a web service call?
If you want to participate in this ‘as a Service’ brave new world (get up in that ‘aaS’ if you will), and your plan to bring up new servers involves a meatcloud sshing their little hearts out, you might as well give up now. Seriously…
Further, what is the plan to manage the life cycle of the servers? Most people have figured out that ‘tail -f’ is not a monitoring solution. But how many of them know exactly what is running on their machines and why? How many have servers that they are afraid to turn off because they aren’t sure what is running, but it might be important? How many configure a server, back away slowly and hope they aren’t the next one who has to touch it?
In another recent episode doing some custom Puppet work with Luke, who has essentially crossed the Developer-Sysadmin divide (I’m not sure he is a chief of the new tribe, but he’s definitely a shaman), Luke became frustrated that he couldn’t write Puppet code like he could Ruby code. (He had not written complex Puppet code for a while, since he stays pretty busy working on Puppet’s internal code.)
Sure, I guess this would be awesome if I was a sysadmin, but I can’t test this code. The only way I can have any confidence it works is to run the whole thing. I guess I just take for granted all the tools that are available to me as a developer now.
How does it change things when your infrastructure is code? Can be versioned and diffed? Can be shared and reused? Can be tested? Continuously?
How awesome will PuppetUnit or PuppetSpec be?
Test Driven Infrastructure?
It is only a matter of time…