The past 12 months or so have been interesting for me from a software technology perspective. I’ve been trying to balance learning and introducing new technologies into the software stack of the production systems I work on, while staying conservative enough to avoid unnecessary complexity and to minimize day-to-day operational issues.
As documented in Scaling Pinterest, via Marco.org, choices should often fall on well-known, well-liked, battle-tested, and performant tools. There is a good case to be made for adopting technologies that have had at least 3-5 years to mature and self-select in the marketplace, especially if they are to be used in production systems with requirements on stability and uptime.
But it’s not only technology choices that affect a system’s runtime that matter. Choices about technology and tools for infrastructure management, as well as development environments and tools, also play an important role. Below is a highlight of some of the technologies I’ve used in the past few months that I believe I will be using on a regular basis over the next few years for a range of products and solutions.
Ansible is a powerful software orchestration and management platform. Ansible can be used to issue commands on machines (or hosts) over SSH. The hosts can be local or remote, physical or virtual machines, and can be accessed via username and password, or more commonly using a PKI. Commands are issued via tasks, and can be used to configure hosts for specific purposes, or to directly run services and applications. Ansible provides a large library of modules for performing common tasks on hosts. The true power of Ansible shows when you need to perform repeatable tasks on a large set of hosts.
There are relatively few core concepts to understand in order to become proficient with Ansible.1 Once that is accomplished, a great deal of power is suddenly at your fingertips.
I use Ansible to set up all new cloud VMs I create with a consistent configuration. Using the powerful Ansible roles concept, I can easily configure machines for particular purposes: MySQL servers, Redis instances, Tomcat containers, and lots of other things. I’ve used Ansible for complex production deployments of service-oriented applications. I’ve come to rely on it on a daily basis, and it’s actually fun to use.
One of the simplest tasks is to issue single “ad-hoc” commands against a host or a set of hosts. This can be extremely useful, and often powerful. I often use it to restart a service (Tomcat) or to check some state (running processes). As an example, this command stops all Tomcat servers in the specified inventory of hosts:
$ ansible -i dev tomcat-servers -s -m "service" -a "name=tomcat7 state=stopped"
I once had a rogue Java process running on one of 20+ VMs that I had launched. At least, that’s what I suspected. That’s a lot of VMs to log into to investigate. So, instead I simply issued the following Ansible command:
$ ansible -i dev all -m "shell" -a "ps aux | grep java"
The output quickly told me where the process was running so I could stop it.
A best practice suggestion is to define your entire infrastructure in a master site.yml Ansible playbook. This is not only useful for deploying everything in a new environment; the same general-purpose playbook can also be used to perform very specific actions using the powerful concepts of tags and host limits.2 A command like this could configure and deploy all Tomcat applications:
$ ansible-playbook -i prod site.yml --limit tomcat-servers
Combining host limits with tags can narrow down the tasks executed even further, for example to only re-deploy the Tomcat applications (without necessarily configuring everything). That could look something like this:
$ ansible-playbook -i prod site.yml --limit tomcat-servers -t "tomcat-war"
These simple examples show how you can organize your Ansible playbooks so that you can configure an entirely new environment and deploy your entire infrastructure, but also perform cross-cuts that execute very specific tasks against a limited set of hosts.
Software will always need to be deployed on some hosts that need to be configured, and other infrastructure (for example, databases) needs to be put in place. Ansible is a great and elegant tool for all these tasks, so I expect to be using Ansible for a long time.3
Docker is a virtualization project and set of tools that leverage capabilities of the Linux kernel to run processes in independent containers. Containers can roughly be seen as lightweight virtual machines, providing process isolation (from the host machine) and resource controls (memory, CPU, I/O, network).
Process isolation and sandboxing are certainly core features of containers, but from a practical day-to-day perspective, two tangible benefits of containerizing your applications matter far more (to me):
- Dependency management
- Application interface
By dependency management I mean it in the same way you would use Python virtual environments to manage and separate dependencies for different applications. Often a developer will have all application dependencies correctly configured in their own environment, but will forget to document them or to update the setup scripts that install them in a fresh environment. This has the obvious downsides of breaking deployments and delaying integration. If applications are shared, tested, and delivered as containers, these issues can often be avoided.
Providing a consistent application interface to start and stop applications can greatly simplify deployments and application lifecycle management (e.g. restarts). I’ve been responsible for complex deployments of service-oriented applications where different components (services) are written in completely different languages and frameworks. As such, each component has a different process for managing its lifecycle: Grails applications running on Tomcat, Python web services using WSGI, Java command-line interfaces, Python scripts, shell scripts, etc. No single application interface is particularly difficult by itself, but taken all together it gets somewhat unwieldy. How do you start and stop 8 different Java CLI programs running on different VMs? How do you shut down only a subset of the processes on a subset of VMs? How do you easily restart two Python web services and a Java process that are coordinating their activities?
With containers and Docker, the launching of an application is described in the relevant Dockerfile. Once the Docker images are created (and published on Docker Hub), the lifecycle management of the applications is identical regardless of the underlying application or service implementation details (including choice of programming languages or frameworks).
Now everything is a container that can be started, restarted, and stopped. The simplicity of that common application interface is a huge benefit. And how do you manage the containers across your deployment infrastructure? With Ansible, of course.
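That common interface can be made concrete with a small sketch. The wrapper below only builds `docker` CLI invocations rather than running them, so it is easy to test; the container names (“api”, “worker”, “cache”) are hypothetical, and in practice you would hand the resulting commands to `subprocess.run` or, as described above, drive them from Ansible.

```python
# Sketch: once every service is a container, one tiny wrapper covers the
# whole lifecycle, regardless of the language or framework inside the image.
# Container names below ("api", "worker", "cache") are made up for illustration.

def docker_command(action, container):
    """Build a docker CLI invocation for a lifecycle action."""
    allowed = {"start", "stop", "restart"}
    if action not in allowed:
        raise ValueError("unsupported action: %s" % action)
    return ["docker", action, container]

# The same interface works for a Grails app, a Python WSGI service,
# or a Java CLI tool -- the container hides the difference.
commands = [docker_command("restart", name) for name in ("api", "worker", "cache")]
```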
Redis is a multi-purpose in-memory data structure store. Data can be persisted to disk and Redis can be used as a database, but I’ve never used it like that. I’ve used it as a key-value cache, and as a message broker. Redis is great as a simple cache, but one of the benefits of Redis is that it is general-purpose, allowing the same technology to be used in different scenarios. If Redis can solve multiple needs for a project, then I can reduce the number of technologies that I incorporate into the overall software stack.
As I mentioned, I’ve used Redis as part of a message system, more specifically for task processing using queues. A queue is simply an ordered list with first-in-first-out (FIFO) characteristics. Push tasks into one side of the queue, and take tasks out from the other side. JSON makes it easy to serialize and store complex structured task descriptions in the queue.
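The queue pattern can be sketched in a few lines of Python. To keep the snippet runnable without a server, a plain list stands in for the Redis list; with the redis-py client the equivalent operations would be `LPUSH` to enqueue and `RPOP` (or blocking `BRPOP`) to dequeue.

```python
import json

# A plain list stands in for a Redis list here; against a live server
# with redis-py this would be r.lpush("tasks", payload) / r.brpop("tasks").
queue = []

def enqueue(task):
    """Serialize a structured task to JSON and push it on one end (LPUSH)."""
    queue.insert(0, json.dumps(task))

def dequeue():
    """Pop the oldest task from the other end (RPOP) and deserialize it."""
    return json.loads(queue.pop()) if queue else None

enqueue({"type": "resize", "image": "a.png"})
enqueue({"type": "resize", "image": "b.png"})
task = dequeue()  # FIFO: the first task pushed comes out first
```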
There are also other uses for Redis that would be interesting to explore in the future, including sorted sets for statistics applications. For example, consider the following commands to track the occurrence of terms (in some data source):
127.0.0.1:6379> ZINCRBY myzset 1 "my-word"
"1"
127.0.0.1:6379> ZRANGE myzset 0 -1 WITHSCORES
1) "my-word"
2) "1"
127.0.0.1:6379> ZINCRBY myzset 2 "my-word"
"3"
127.0.0.1:6379> ZRANGE myzset 0 -1 WITHSCORES
1) "my-word"
2) "3"
These statistics can be stored ephemerally in Redis, and then queried and later stored in an OLAP system. I’m sure there are other interesting situations where Redis would be a good fit.
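The same term-counting pattern translates directly to code. In this sketch, `collections.Counter` stands in for the sorted set so the example runs without a server; with redis-py the corresponding calls would be along the lines of `r.zincrby(...)` and `r.zrange(..., withscores=True)`.

```python
from collections import Counter

# Counter stands in for the Redis sorted set "myzset" from the
# CLI transcript above; the functions mirror ZINCRBY and ZRANGE.
scores = Counter()

def zincrby(increment, member):
    """Increment a member's score and return the new score (like ZINCRBY)."""
    scores[member] += increment
    return scores[member]

def zrange_withscores():
    """Return (member, score) pairs ordered by ascending score (like ZRANGE)."""
    return sorted(scores.items(), key=lambda kv: kv[1])

zincrby(1, "my-word")  # score becomes 1
zincrby(2, "my-word")  # score becomes 3
```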
I’ve created VMs for various needs for a while, but this year I really started to make heavy use of cloud infrastructures, mainly centered around OpenStack. I don’t know the ins and outs of OpenStack, but I’ve become familiar with the OpenStack Horizon Dashboard interface, which is serviceable. I’m a user of cloud infrastructure, but it’s still useful to know some of the basics of how OpenStack works, and in particular how to use the compute capabilities provided by its Nova component. For instance, there is a command-line interface to Nova for managing compute instances (VMs) that I’ve found useful when setting up infrastructures.
As a cloud user, the important part is how to manage and organize a large set of resources, and for that Ansible has been the key for me. Next I want to explore the specifics of Amazon’s EC2, but again, the details are not so important because I will be using Ansible there too.
I’ve played with Python for many years, and I really like it. To me, it’s an elegant language, and it’s easy to get things done with it. There is no shortage of 3rd-party modules (libraries) and frameworks, and documentation is usually pretty good. You should probably move to Python 3 sooner rather than later, because there will be no Python 2.8.
What I’ve used Python for a lot lately is writing simple system management and monitoring scripts. One example is ensuring that all deployed web services are running and responding to simple HTTP requests. Another is ensuring that task queues in Redis are not growing faster than they can be processed for an extended period of time. Yet another is a script to organize component-specific documentation files before building a system-wide documentation website. Python makes it easy to write well-organized and maintainable code.4 Furthermore, the script dependencies are documented in requirements files so that they can easily be installed in a new virtual environment, or in a Docker image, which is of course how the Python scripts are deployed.
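A monitoring check like the Redis queue example often reduces to a small, testable decision function. The sketch below assumes queue lengths sampled at regular intervals (in practice they would come from `LLEN` on the Redis queue); the sampling scheme and threshold are made up for illustration.

```python
def queue_growing_too_fast(samples, max_growth_per_sample=0):
    """Given queue lengths sampled at regular intervals, flag sustained growth.

    Returns True only if the queue grew by more than max_growth_per_sample
    between every pair of consecutive samples -- i.e. tasks were added
    faster than they were processed for the entire window.
    """
    if len(samples) < 2:
        return False
    return all(b - a > max_growth_per_sample
               for a, b in zip(samples, samples[1:]))

# A healthy queue drains now and then; a sick one only grows.
healthy = [10, 4, 9, 2, 7]
sick = [10, 25, 60, 140, 300]
```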
This brings things around: I’ve really enjoyed writing containerized Python 3 scripts for monitoring processing queues stored in Redis (and other tasks), deployed on an OpenStack cloud using Ansible.
Git is of course great for source code management and versioning. I also wanted to mention Jekyll again and what a great choice it is for information-centric websites. Since writing Deploying Jekyll with Git, I’ve used Jekyll to organize the documentation for a large-scale application, which has worked out really well. Webpages in Jekyll are written in Markdown, which is also how I write this blog. Writing in Markdown keeps your content in a simple non-proprietary format, while making it easy to publish to the Web.

Mac OS X is an excellent platform for developers. It’s a modern operating system with great application support, while also providing access to all the greatest Linux-based tools from the command line, often installable via Homebrew. I write a lot of code in TextMate 2, use LaunchBar 6 for clipboard management, and TextExpander 5 for improved typing efficiency.
Of course the site.yml playbook is only a set of includes of more specific playbooks, which can also be run separately. ↩
There are of course other orchestration platforms besides Ansible, but I have not found them as approachable or elegant. ↩
Although of course it requires some diligence on the part of the programmer. ↩