Thoughts about configuration management

Current configuration management tools solve a huge problem: they give us a repeatable, eventually coordinated way to set up applications and servers across large-scale environments.

Although we have come a long way since handcrafting our servers, even with Chef or Puppet, I still feel like we’re chasing our tails.

Virtual Human

Before we had configurable software, we needed to edit a few source files or headers, change a few constants and recompile. Then configuration was introduced as a means for a user to change the runtime behavior of a piece of software without modifying the software itself. For the most part, software configuration stopped there. Even today, with what seems to be the proper way to handle configuration on multiple servers, we’re still editing text files and calling a shell command to restart or reload the service on the local machines.

Basically, we invest a huge amount of time writing code that ends up acting like a human being when it interacts with the underlying operating system. Since most software was written with human interaction in mind for its configuration, everything is centered around that. For example, configuration files are easy to write, but only some software provides proper syntax checking, and configuration-related error messages are optimized for reading rather than for handling in code.

Although today’s software is not usable without configuration, configuration is still a second-class citizen when it comes to operating system support. Modern operating systems have package management, shared libraries, advanced service management and logging facilities. The only thing that is still poorly provided, if at all, is configuration.

Software normally opens a locally available file, reads it, translates it into in-memory structures and only then verifies and uses it. Each and every step in this flow can fail. The software should only have to care about the correctness of the configuration itself; it should not have to deal with unavailable files, wrong permissions or even the syntax of the file.

For example, this is the code snippet in charge of reading configuration in bucky, a super-simple metrics collection daemon:

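# https://github.com/cloudant/bucky/blob/acbf742/bucky/main.py
# Python 2 code; cfg, log and sys come from the surrounding module.
def load_config(cfgfile, full_trace=False):
  # Start from the defaults already defined on the cfg module.
  cfg_mapping = vars(cfg)
  try:
    if cfgfile is not None:
      # Run the config file as Python code, writing into cfg_mapping.
      execfile(cfgfile, cfg_mapping)
  except Exception, e:
    # Missing file, bad permissions, syntax errors: everything lands here.
    log.error("Failed to read config file: %s" % cfgfile)
    if full_trace:
      log.exception("Reason: %s" % e)
    else:
      log.error("Reason: %s" % e)
    sys.exit(1)

  # Copy the public settings back onto the cfg module.
  for name in dir(cfg):
    if name.startswith("_"):
      continue
    if name in cfg_mapping:
      setattr(cfg, name, cfg_mapping[name])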

This function starts from the default configuration and then executes a Python file that overrides it. There’s more code to handle errors from the operating system than code that handles the configuration itself. Some software goes even further and allows configuration to be included from within the configuration, which makes it even messier, since configuration processing now also needs to list directories. This process is repeated and duplicated in almost every piece of software available.

Receiving configuration

Like logging, shared libraries or high-level network stacks, software should have a proper API for receiving configuration. It shouldn’t actively access a file or even process it. Behind this API there could be a file reader, a service discovery API or even a call to a configuration management system that fetches the currently desired instructions. I use the term receiving because I believe software should be handed configuration instructions instead of reading them.
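To make the idea of receiving a bit more concrete, here is a minimal sketch in Python 3. Everything in it is hypothetical: MetricsDaemon, receive_config, static_source and deliver are names I made up for illustration, not an existing API. The point is only that the daemon never touches a file; it checks the values it is handed and nothing else.

```python
from typing import Any, Callable, Mapping

# Hypothetical type: a "source" is anything that can hand back settings.
ConfigSource = Callable[[], Mapping[str, Any]]


class MetricsDaemon:
    """Toy service that is handed configuration instead of reading it."""

    DEFAULTS = {"port": 8125, "flush_interval": 10}

    def __init__(self) -> None:
        self.settings = dict(self.DEFAULTS)

    def receive_config(self, settings: Mapping[str, Any]) -> None:
        # The only job left to the software: check correctness of the values.
        unknown = set(settings) - set(self.DEFAULTS)
        if unknown:
            raise ValueError("unknown settings: %s" % sorted(unknown))
        port = settings.get("port", self.settings["port"])
        if not isinstance(port, int) or not 0 < port < 65536:
            raise ValueError("port must be an integer between 1 and 65535")
        self.settings.update(settings)


def static_source() -> Mapping[str, Any]:
    # Stand-in for a file reader, service discovery or a CM system.
    return {"port": 9000}


def deliver(source: ConfigSource, daemon: MetricsDaemon) -> None:
    # The delivery layer, not the daemon, deals with where settings come from.
    daemon.receive_config(source())
```

Swapping static_source for a function that talks to a configuration management system would not change a single line of the daemon, which is the property I am after.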

If we stop thinking about configuration on the software side as a piece of code that reads a file, and instead think of it as receiving configuration instructions, we can take it to the next level. This configuration API could provide an internal callback, and we could push new configuration updates without restarting the service. Today this is sometimes done with various low-level signals, which emphasises even more the need for a better solution.
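The callback idea could look roughly like the following Python 3 sketch. ConfigReceiver, on_update, push and Worker are hypothetical names of mine, and a real implementation would also have to think about validation and thread safety, which I am ignoring here.

```python
from typing import Any, Callable, Dict, List

Listener = Callable[[Dict[str, Any]], None]


class ConfigReceiver:
    """Hypothetical receiver: holds current settings and notifies listeners."""

    def __init__(self, initial: Dict[str, Any]) -> None:
        self._current = dict(initial)
        self._listeners: List[Listener] = []

    def on_update(self, listener: Listener) -> None:
        # Register a callback to run whenever new settings are pushed.
        self._listeners.append(listener)

    def push(self, changes: Dict[str, Any]) -> None:
        # Merge the pushed changes and hand each listener a copy of the result.
        self._current.update(changes)
        for listener in self._listeners:
            listener(dict(self._current))


class Worker:
    def __init__(self) -> None:
        self.log_level = "info"

    def apply(self, settings: Dict[str, Any]) -> None:
        # Reconfigure in place; no restart and no signal handler involved.
        self.log_level = settings.get("log_level", self.log_level)


receiver = ConfigReceiver({"log_level": "info"})
worker = Worker()
receiver.on_update(worker.apply)
receiver.push({"log_level": "debug"})
print(worker.log_level)  # debug
```

The service registers interest once and is then told about every change, instead of being poked with a signal and left to re-read a file on its own.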

If we had this kind of power in our hands, our current configuration management could become much simpler. For example, instead of having our CM generate multiple files from multiple templates, which increases the chance of a mistake, we could feed the software with configuration instructions in a ready-to-use data structure.
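As a rough illustration of a ready-to-use data structure, consider this Python 3 sketch. ProxyConfig and from_instructions are invented for the example, and the use of JSON as the transport is just an assumption; any machine-generated format would do.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ProxyConfig:
    # Hypothetical settings for an imaginary proxy service.
    listen_port: int
    backends: List[str]


def from_instructions(data: Dict[str, Any]) -> ProxyConfig:
    # Only semantic checks remain; no hand-written template to get wrong.
    backends = list(data["backends"])
    if not backends:
        raise ValueError("at least one backend is required")
    return ProxyConfig(listen_port=int(data["listen_port"]), backends=backends)


# What a CM tool could send: plain data, here serialized as JSON for transport.
payload = json.dumps({"listen_port": 8080, "backends": ["10.0.0.1", "10.0.0.2"]})
config = from_instructions(json.loads(payload))
print(config.listen_port, len(config.backends))  # 8080 2
```

The CM side only has to produce a dictionary, so there is no template that can drift out of sync with the file format the software expects.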

I believe that’s enough for one day. We’ll let these ideas sink in until we return with part 2, where we will propose a way to bring this to life.