A program calling command-line tools is the moral equivalent of web scraping.

I gave this talk at LPC 2012. It promotes the idea that programs layered on top of human-centric interfaces is a bad idea.

Click to access 50cef-all-plumbing-needs-an-api.pdf

The timing of this post with the announcement of the most recent bash vulnerability is not entirely coincidental.

Fedora for short-lifespan server instances

I read Máirín Duffy’s coverage of the Fedora Board’s userbase discussion. Really interesting. I wanted to add my take.

tl;dr: Puppet/Chef make Fedora’s short support period much less of an issue.

The OS is a building block

I’ve been watching a lot of videos on DevOps lately. Several close friends of mine are sysadmins and I’ve been learning a lot from them about the transformation that their profession is undergoing. From this year’s ChefConf, Adam Jacob’s keynote and the talk by Sascha Bates really impressed on me the big change in how admins should view machines — They’re not permanent, or even semi-permanent. They are ephemeral snowflakes that may live a year, or just an hour, so don’t get too attached.

Part of why admins like VMs is because the isolation they provide between different services. I used to run mail, DNS, and httpd from a single machine. Everything was mostly separate but not quite everything. They had separate userids but everyone’s config was in /etc, even touching the same files, sometimes. A full disk affected everybody. /var/log/messages didn’t split up their logging cleanly (by default, anyway.) It all just was built assuming there would be an admin at a command shell who could use their brain to resolve the conflicts to make everything play nice on a single OS image.

One service per instance

Admins adopted VMs for isolation, increased density, and better per-service resource allocation, but then ran into other problems. The setup that they did by hand once per new-hardware now was once per instance. (Editing /etc/sudoers for the fiftieth time gets old.) The tools then evolved further, until today one may keep no persistent state in an instance. Now, an instance is kickstarted into existence and configured automatically for the one job it will ever do. The sysadmin’s job isn’t to herd boxen any more, it’s to build ’em, run ’em, and then reap ’em.

All the OS mechanisms for co-existing server processes, they’re now either obsolete or vestigial to some degree. What is important is the malleability of the OS to assume all the tasks it may be asked to – rather like a stem cell needs to be able to become a nerve or muscle, but never needs to be both.

Never upgrade, just redeploy

Let’s come back to the odd fact that Fedora is both a precursor to RHEL, and yet almost never used in production as a server OS. I think this is going to change. In a world where instances are deployed constantly, instances are born and die but the herd lives on. Once everyone has their infrastructure encoded into a configuration management system, Fedora’s short release cycle becomes much less of a burden. If I have service foo deployed on a Fedora X instance, I will never be upgrading that instance. Instead I’ll be provisioning a new Fedora X+1 instance to run the foo service, start it, and throw the old instance in the proverbial bitbucket once the new one works.

Cheap and easy virt and config management gives admins what they’ve always wanted — stability when they want it (run a LT support distro image, or for the VM host) or the latest stuff for their fast-moving business-oriented instances, by running a fast-update or rolling-release distro.

What Fedora should do

We’re already working on some of these to some degree — I think we should try to do even more to ensure Fedora is useful for the fast-update instance role.

First, Fedora needs to be able to be small. Nobody’s going to read the manpages on a throwaway instance, nobody’s even going to run vi. Image size matters when multiplied for each instance. Can we get by without /usr/share/doc/* and its thousands of copies of the GPL text? Fedora seems pretty good but there must be more we can do.

Second, we need to ensure Fedora supports the packages people are really using these days. Latest Ruby. Latest OpenStack. Vagrant. Django. Chef. Puppet. All the weird JS stuff that’s popular now on GitHub. 🙂 Continue to improve packaging tools so it’s easier for new contributors to do their first package, as well as for long-time packagers to maintain more packages. And not just package for contribution to Fedora, but for admins to package for solely internal distribution. Like Sascha Bates stresses in her talk, packaging is a huge benefit to automation, but it does require effort. It can be easier.

Finally, I think we need to continue to look at how easy it is to configure and manage an instance of the OS, and tailor it more for automated configuration. I believe the key to this is adding programmatic interfaces where they are lacking. See my “All Plumbing needs an API” talk. Since we’re probably being configured by another piece of code rather than a person at the shell, we need clear, unambiguous programmatic interfaces with good error handling. Chef should not be calling cmdline tools and checking error codes, there should be a Ruby configuration library that natively controls the whatever-it-is directly! We want configuring Fedora to be fast, straightforward, and reliable.

Conclusion: Stable+fast-update is better than stable+self-built

Practically the whole history of Linux distros has been the conflict between stability and new features. With virtualization, one still must make this choice, but at a much finer granularity than before. If you’re going to re-instance within 6 months anyways, why manually build your latest-Ruby and whatnot to support your app on top of a stable distro image? Maybe just use Fedora for those.

Plumbing needs an API: “libification”

Here are my slides and video from my talk “All Plumbing needs an API” from Linux Plumbers Conference 2012.

My thesis is we would improve the quality of our platform if we recognize the flaws in excessive use of other commandline tools by other tools, and worked towards more APIs and libraries for low-level parts of the Linux platform. There are many reasons given in the talk I don’t want to rehash; here are some more thoughts since I gave the talk:

First, this is not a push for every single instance of one program being called by another to be replaced by an API. But on the spectrum of cmdline-parsing vs libraries, I think we are too far towards the former, and should make an effort to shift dramatically towards the latter.

Second, we can take a “pave the cowpaths” approach toward this — we should first libify those tools most commonly parsed by other tools. I’ve been helping out on liblvm, a library for LVM (whose tools are unquestionably over-parsed.) There are also many other network, storage, and system configuration tools that are candidates for libification.

The move towards virtualization & cloud computing has led to many tools formerly configured directly by the sysadmin now being configured by other tools. Broadening the coverage of our system-level management APIs will improve Linux’s flexibility and reliability as a virtualized OS.

A hundred other languages want to call your code

The users of a hundred programming languages would like to call your low-level code, but they can’t.

Things have changed in the last 20 years. More people are using languages like Python, Ruby, and a hundred more, that are further from the bare metal. People are building service stacks that tie together many lower-level functions.

Libraries and APIs that make low-level features available to convenient high-level languages (HLLs) are a good thing. As a HL coder, it’s pretty handy to install python-foo, type “import foo” and then have access to that functionality.

What if python-foo isn’t there? HLL users are out of luck, unless they are so determined they make their own python-foo that calls system(), and then parses the output using their language’s fancy text parsing features.

But system() is the devil. We hate system(), folks. If your code calls system() it’s bad, for four reasons:

Overhead. It creates a new process and subshell.
Security. If your code has elevated privileges and is including text input by an untrusted user, watch out. Remember little Bobby Tables, a semicolon is a dangerous thing.
Ease. Parsing command-line programs’ output can be a pain, even if your language helps lessen it. Parsing of errors is even harder and prone to be overlooked.
Portability. A different platform may (or may not) have the program you’re relying on, or its output may be different, and you won’t know.

Early on when I was learning Python, I tried to write a gui for OProfile by parsing its output. OProfile did nice (for the user) things like adding headers on its output, and changing the format of output depending on what it found. Great for users, but it doomed my project. I couldn’t parse the output reliably.

You want to make it easy for the people who are language gurus for each of the hundred languages out there to wrap your functionality without having to become an expert in your code, or even change it. Then the hordes using all the hundred languages can use your library without being an expert in your code or being enough of a guru in their language to write a wrapper. They can just happily use it.

Here’s a positive development, kmod. kmod is a new implementation of the utilities in module-init-tools: modprobe, lsmod, lsmod, etc. Not only does kmod include a libkmod C library, but the commandline programs use it, so we know it works. Yeah! This made it super easy for someone (me) to come along and write a language wrapper (python-kmod) without having to know about module internals. python-kmod makes it easy for Python users to manipulate modules using the friendly language features they’re used to, like exceptions for errors, and lists. If I had been forced to use system(), it probably would have mostly worked, but it would have failed when output parsing failed for some edge case.

I encourage all low-level program writers, my fellow Linux Plumbers, to consider how to make native language bindings possible for your code. You don’t have to write them, just make them possible and you will find all sorts of people calling your code, safely, who couldn’t before.