Reworking the Atlas CLI

The current CLI for Atlas was always a stopgap measure – something to let me invoke it on the command line during development. The time has come for a real user interface though. This post is really just me thinking through how this should work - it is rambly, but capturing the reasoning is helpful :-)

Project Centric

It has become apparent that Atlas is project-centric, much like git. A common and useful pattern for project-centric tools is to put everything into a version-controllable directory, and provide a nice way to kickstart a new project. So, we have the beginning of our interactions:

$ atlas init

This will populate the current directory with a barebones atlas project, and initialize atlas’s state keeping. We’ll encourage the separation of environment and system model, so let’s drop two files, a system model and a development environment description. As Atlas exists right now, that would be something like system.rb and env-dev.rb.

System and Environment Model Definitions

These two things, system model and environment description, are the heart of an atlas project so we want them to be front and center when you look at a project. I’d like to drop them right in the root of the project, but this conflicts with another desire – that we be able to detect and load system models and environment descriptors for a project.

Once a system starts growing, but before it really has a number of teams working on it or using it, I expect it will be very common to drop all the parts of the system into one project via separate files, organized as makes sense for the system. Right now that is awkward: any files aside from the root must be explicitly included as external elements in the system model. This works, but for the environment descriptor it means that, when you have multiple environments, you check out the project multiple times and initialize each checkout for a different environment. Yuck.

To complicate it further, it is becoming apparent that we want systems to be composable from external models. If Basho were to provide a descriptor for a Riak system, or Puppet Labs for a Puppet one, it would be nice to just have to reference one URL for it. It is convenient to allow one file to describe both the environment and the system in this case. Additionally, it has become apparent that sometimes an environment needs to define system elements. In the EC2 case, for example, this might be creating a VPN instance so the machine running atlas can connect to instances inside a security group.

Given that we want to have arbitrary environment and system models for a project, that we’d like them to take center stage in the root of the project, and finally that we might want other convenience scripts in the root of the project, one option seems to be just changing their extension and globbing them together to load – ie, system.atlas and dev-env.atlas.

Given that atlas descriptors are ruby files, this is inconvenient for automatic file type detection for code highlighting, which is unfortunate.

Another option is to give up on the root and drop a model/ directory which contains the system and environment descriptors. Model is an overloaded term, due to the use of the word to describe database access code in Rails-inspired web frameworks. In this case, it is used to represent a model of the system and the environments in which the system will run.

Using a model/ directory gives a slight affordance to new users in that when they open a descriptor, it is nicely highlighted. This is small, but frankly really handy. I don’t have my .emacs on every machine I may need to muck around with models from, and syntax highlighting really helps with reading this stuff.

On descriptor location, looking way out, Atlas internally creates tree structures for its models and operates wholly on those trees. That the current implementation uses ruby to describe those models is basically a convenience. Personally, I think S-expressions model trees even more nicely, and could see using a lisp dialect as well, or a custom syntax. Taking ruby and calling it .atlas implies more authority than it deserves.

Given all this, I think I’ll go with the model/ directory approach. With that, our newly initialized project now looks like:

brianm@ufo:/tmp/ratless$ atlas init
brianm@ufo:/tmp/ratless$ find .
.
./model
./model/dev-env.rb
./model/system.rb
./README
brianm@ufo:/tmp/ratless$
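
As a rough sketch of what init could do to produce that layout, something like the following might work. The `init_project` name, the `SCAFFOLD` table, and the placeholder file contents are all assumptions for illustration, not what Atlas actually generates, and state keeping is left out entirely.

```ruby
require "fileutils"

# Hypothetical scaffold: paths match the listing above, contents are placeholders.
SCAFFOLD = {
  "README"           => "Describe this Atlas project here.\n",
  "model/system.rb"  => "# system model goes here\n",
  "model/dev-env.rb" => "# development environment description goes here\n"
}.freeze

# Writes any missing scaffold files under root and returns the
# relative paths that were actually created.
def init_project(root)
  created = []
  SCAFFOLD.each do |relative, contents|
    path = File.join(root, relative)
    next if File.exist?(path) # never clobber existing work
    FileUtils.mkdir_p(File.dirname(path))
    File.write(path, contents)
    created << relative
  end
  created
end
```

Skipping files that already exist means running init a second time in the same directory is harmless, which seems like the right default for a command people will run in a directory they care about.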

Running Atlas For Real

The primary action of atlas, and what folks will most often do with it, is to converge a system instance on the model. Right now this is

brianm@ufo:/tmp/ratless$ atlas update

I am not sure of the best command for this. I think of it as convergence, so am tempted to switch the command to converge. Probably I’ll just make them aliases for the same command, but I’ll need to sort out the canonical one for docs.
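
Aliasing could be as small as a lookup table in front of command dispatch. This sketch assumes converge ends up as the canonical name (that is not decided) and that init is the only other command; `ALIASES`, `COMMANDS`, and `canonical_command` are hypothetical names.

```ruby
# Hypothetical command table: the command names come from this post,
# the structure is an assumption.
ALIASES  = { "update" => "converge" }.freeze
COMMANDS = %w[init converge].freeze

# Maps whatever the user typed to the canonical command name.
def canonical_command(name)
  resolved = ALIASES.fetch(name, name)
  raise ArgumentError, "unknown command: #{name}" unless COMMANDS.include?(resolved)
  resolved
end
```

Keeping aliases in one table means the docs only ever need to mention the canonical name, and swapping which one is canonical is a one-line change.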

This brings up another change needed – a raw atlas converge works today because a project is initialized with the environment model to use. If a given project is going to support running against multiple environments we’ll need a way to pick which one to run against. To do this reasonably we’ll need to make the default configurable (the default will default to an environment named dev I think), which means a configuration system.

Configuration

Today atlas uses SQLite for all of its state, but this is a crap option for configuration. Configuration should be human mungeable text. So, we need to add a human mungeable text file.

I like simple sectioned key/value pairs for configuration, personally, a la the format used by Python’s ConfigParser. The other format I really like is Lua. An interesting alternative is Typesafe’s config library. The Typesafe library provides a very readable curly-based block/hierarchical syntax in addition to straight key/value pairs and json (though json isn’t really human-friendly to write, IMHO). Using simpler things, like straight key/value pairs or sectioned key/value pairs, encourages configuration to stay very simple, which is appealing.

Atlas is designed to be extended with various provisioners, installers, and listeners specialized for different environments or tools. Frequently, extensions will need configuration as well, and some extensions may need (or just have because some developers work that way) extensive and complex configuration. While big complex configuration is pretty distasteful, it can happen for valid reasons. If it can happen, we should at least steer it towards something consistent, which is an argument in favor of something like the Typesafe library.

Putting aside configuration syntax (and semantics I guess, in that there is at least a choice between hierarchical and flat here) there is the question of where to put it. The clear trend is to hide it from plain sight – stick it in a .$file of some sort. This works if there is no expectation of changing configuration very often (emacs), or if there is an expectation of using tooling to change it (git). Personally, I dislike using tooling to change configuration, so will steer towards just pointing folks to the file.

Atlas presently uses a .atlas/ directory to keep application managed state – the aforementioned SQLite database, generated ssh keys, etc. This directory is only semi-private – while there is no expectation of users poking into the SQLite database, use of generated SSH keys is expected and okay. One option is to drop an atlas.conf file here. Putting it here implies that we don’t want users to muck with it often, and that out of the box users should not need to muck with it at all.

Another alternative is to drop an atlas.conf in the root. This implies that users can and should make changes to configuration and it deserves to be highlighted.

The last option I want to consider is a conf/ directory analogous to the model/ directory. This makes it easier to support multiple configuration file formats, etc, which if we take the Typesafe config approach might be helpful.

Conceptually, configuration should be used to control Atlas’s behavior within the project. The project itself should be the main focus of users’ attention, and I don’t expect configuration to be changed that often. As long as we pick defaults well, most users shouldn’t need to touch the Atlas program configuration. Given this, I like the .atlas/atlas.conf option the most.
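
To make the sectioned key/value option concrete, here is a minimal sketch of reading .atlas/atlas.conf with defaults layered underneath, so a missing file just means "use the defaults". The syntax is not settled, so treat all of this as an assumption: the `[atlas]` section and the `environment` and `model_dir` keys are made up for illustration, as are `parse_conf` and `load_settings`.

```ruby
# Minimal parser for sectioned key/value text (ConfigParser-like).
# Handles [section] headers, key = value pairs, blank lines,
# and whole-line comments starting with # or ;.
def parse_conf(text)
  config = Hash.new { |hash, section| hash[section] = {} }
  section = "default" # pairs that appear before any header land here
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?("#", ";")
    if (header = line[/\A\[(.+)\]\z/, 1])
      section = header.strip
    else
      key, value = line.split("=", 2)
      raise ArgumentError, "expected key = value, got: #{line}" if value.nil?
      config[section][key.strip] = value.strip
    end
  end
  config
end

# Hypothetical keys and defaults, chosen only for this example.
DEFAULTS = { "environment" => "dev", "model_dir" => "model" }.freeze

# Returns the [atlas] section merged over the defaults.
def load_settings(path = ".atlas/atlas.conf")
  text = File.exist?(path) ? File.read(path) : ""
  DEFAULTS.merge(parse_conf(text)["atlas"])
end
```

With this shape, a project that only wants a different default environment needs a two-line file: an `[atlas]` header and `environment = load`. Extensions could keep their own settings in their own sections.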

Back to Running Atlas

Okay, so we have a sketch for configuration. We will default to running in an environment named dev (not from the file name, but from the environment declaration):

env "dev" do
  # ...
end

env "prod" do
  # ...
end

env "load" do
  # ...
end

and we’ll process all the descriptors in model/ (location changeable via .atlas/atlas.conf), which can define multiple environments, of which only one will be used. Using something other than the default can be either an argument or an option to converge:

$ atlas update 
$ atlas update -e load
$ atlas update load

The first would operate on the default (which defaults to dev); the other two would operate on the load environment. I am not sure which form I like, argument or option. Need to ponder.
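
Until that is decided, nothing stops the parser from accepting both. The sketch below collects the environments declared across a directory of descriptors and then resolves which one to use from the arguments that follow the command name. It is only an illustration: `DescriptorLoader`, `pick_environment`, and `resolve_environment` are hypothetical names, the long `--env` flag is an assumption, and the loader understands nothing but `env` declarations, so real descriptors with system elements would need more.

```ruby
require "optparse"

# Evaluates descriptor files and remembers every `env "name" do ... end`.
class DescriptorLoader
  attr_reader :environments

  def initialize
    @environments = {}
  end

  # Called by descriptors; the block is stored, not run.
  def env(name, &block)
    @environments[name] = block
  end

  # Loads every .rb file directly under dir, in sorted order.
  def load_dir(dir)
    Dir.glob(File.join(dir, "*.rb")).sort.each do |path|
      instance_eval(File.read(path), path)
    end
    self
  end
end

# Accepts both candidate forms: `-e load` and a bare `load`.
def pick_environment(argv, default: "dev")
  chosen = nil
  rest = OptionParser.new do |opts|
    opts.on("-e", "--env NAME", "environment to converge") { |name| chosen = name }
  end.parse(argv)
  chosen || rest.first || default
end

# Picks a name, then checks that some descriptor actually declared it.
def resolve_environment(loader, argv, default: "dev")
  name = pick_environment(argv, default: default)
  unless loader.environments.key?(name)
    declared = loader.environments.keys.join(", ")
    raise ArgumentError, "no environment named #{name} (declared: #{declared})"
  end
  name
end
```

For example, `resolve_environment(DescriptorLoader.new.load_dir("model"), ARGV.drop(1))` would cover all three invocations above. Checking the name against what the descriptors declared gives a useful error for a typo instead of silently converging nothing.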