There are many tools and services available to monitor your web services. Pingdom, for example, is a popular service, but you can also write your own custom shell scripts to ensure your services are up and healthy.

A compromise between a service you pay for and a completely custom solution is to rely on existing tools to do the job for you. For example, why not let a load balancer with support for health checks monitor your web services? HAProxy is a great load balancer with a built-in UI for displaying the status of the services you are (supposedly) load balancing between. Of course, in our case we are not load balancing anything, because we are not sending requests to the service instances. We just want to know whether the services are up or not.

At the very least, the stats page gives you something to look at, with clear color-coded indications when something is wrong with your services.

However, you may want to be notified if something goes down, so how do you accomplish that? Luckily, HAProxy can export this list as CSV over HTTP when you append ;csv to the stats URI (/;csv with the configuration below). So if you wanted to, you could just curl that endpoint, grep the result for “DOWN”, and send a push notification to your phone for each match using something like Pushover.
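If you would rather not chain curl and grep, the same idea can be sketched in a few lines of Python. Treat this as a rough sketch under some assumptions: the stats page is reachable on localhost port 1936 with the stats URI set to /, and the Pushover application token and user key live in the environment variables PUSHOVER_TOKEN and PUSHOVER_USER. Those variable names, like the function names, are made up for this example.

```python
import csv
import io
import os
import urllib.parse
import urllib.request

# Assumed location of the stats page; adjust host, port and URI to your setup.
STATS_CSV_URL = "http://localhost:1936/;csv"
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def find_down(csv_text):
    """Return (proxy name, server name) pairs whose status starts with DOWN."""
    # The CSV header line starts with "# "; strip it so the first
    # column is named "pxname" instead of "# pxname".
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("# ")))
    return [
        (row["pxname"], row["svname"])
        for row in reader
        if (row.get("status") or "").startswith("DOWN")
    ]


def notify(token, user, message):
    """Send one push notification through the Pushover message API."""
    data = urllib.parse.urlencode(
        {"token": token, "user": user, "message": message}
    ).encode()
    with urllib.request.urlopen(PUSHOVER_URL, data=data, timeout=10) as resp:
        return resp.status


def main():
    """Fetch the stats CSV and notify once per entry that is down."""
    with urllib.request.urlopen(STATS_CSV_URL, timeout=10) as resp:
        csv_text = resp.read().decode("utf-8")
    # Hypothetical environment variable names for the Pushover credentials.
    token = os.environ["PUSHOVER_TOKEN"]
    user = os.environ["PUSHOVER_USER"]
    for proxy, server in find_down(csv_text):
        notify(token, user, f"{proxy}/{server} is DOWN")
```

Nothing runs on import; call main() from a cron job or a similar scheduler. Keep in mind that when every server in a backend is down, HAProxy also reports the aggregate BACKEND row as DOWN, so you would get a notification for that row as well.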

Here is an example HAProxy configuration for this setup (file: /etc/haproxy/haproxy.cfg):

global
    log 127.0.0.1 local0 notice
    maxconn 100
    user haproxy
    group haproxy
    
defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch
    
frontend frontend
    bind *:80
    mode http
    
backend tomcat-servers
    mode http
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk GET /health_check/ 
    # nodes to monitor 
    server 10.14.6.84 10.14.6.84:80 check fall 3 rise 2 maxconn 10 
    # more services here...
    
backend docker-hosts
    mode http
    balance roundrobin
    option httpclose 
    option forwardfor
    option httpchk GET /health_check/ 
    # nodes to monitor 
    server 10.14.6.88 10.14.6.88:80 check fall 3 rise 2 maxconn 10 
    # more services here...
    
listen stats
    # recent HAProxy versions require a separate bind line here
    bind *:1936
    stats enable
    stats uri /
    stats hide-version

Of course, you want to install and configure HAProxy using Ansible. If you get creative, you can be selective about which hosts from your Ansible inventory to monitor, and group the services in HAProxy so that it is easier to see what is down when something goes wrong.
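To give an idea of what that grouping could look like, here is a small Python sketch that renders one backend section per inventory group. In a real playbook you would more likely do this in a template; the render_backends function and the shape of the groups dictionary are assumptions made for this example, not anything Ansible or HAProxy provides.

```python
def render_backends(groups, check_path="/health_check/", port=80):
    """Render one HAProxy backend section per inventory group.

    `groups` maps a group name to a list of host addresses, for example
    {"tomcat-servers": ["10.14.6.84"]}.
    """
    sections = []
    for group, hosts in sorted(groups.items()):
        lines = [
            f"backend {group}",
            "    mode http",
            f"    option httpchk GET {check_path}",
        ]
        # One monitored server line per host, named after its address.
        for host in hosts:
            lines.append(
                f"    server {host} {host}:{port} check fall 3 rise 2 maxconn 10"
            )
        sections.append("\n".join(lines))
    return "\n\n".join(sections) + "\n"
```

Calling render_backends({"tomcat-servers": ["10.14.6.84"], "docker-hosts": ["10.14.6.88"]}) produces two backend sections along the lines of the ones in the configuration above, minus the balancing options that are not needed for pure monitoring.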

The key to this particular configuration is the httpchk option, which uses a custom endpoint (/health_check/) that needs to be implemented by the services being monitored. As long as a service returns a successful response for this endpoint (HTTP 200 is the obvious choice, although HAProxy accepts any 2xx or 3xx status by default), everything will look good. The implementation of this endpoint can be as simple as immediately returning a response, or it can exercise the service in some way, for example by querying any configured databases.
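For illustration, here is a minimal sketch of such an endpoint using only the Python standard library. It assumes the service depends on a SQLite database and treats a successful SELECT 1 as "healthy"; the DATABASE_PATH value, the port 8080 default and the function names are placeholders for this example, so adapt them to whatever your service actually uses (the configuration above checks port 80).

```python
import sqlite3
from contextlib import closing
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical database location; replace with your service's real dependency.
DATABASE_PATH = ":memory:"


def database_ok(path=DATABASE_PATH):
    """Return True if a trivial query against the database succeeds."""
    try:
        with closing(sqlite3.connect(path, timeout=2)) as conn:
            conn.execute("SELECT 1")
        return True
    except sqlite3.Error:
        return False


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health_check/":
            self.send_error(404)
            return
        healthy = database_ok()
        body = b"OK\n" if healthy else b"database unavailable\n"
        # 200 keeps the server green in HAProxy; 503 makes the check fail.
        self.send_response(200 if healthy else 503)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Serve the health check endpoint until interrupted."""
    HTTPServer(("0.0.0.0", port), HealthHandler).serve_forever()
```

Calling serve() starts the endpoint. With fall 3 in the HAProxy configuration, the server is only marked as down after three consecutive failed checks, so a single slow database query will not immediately turn it red.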

What’s your poor man’s web service monitoring setup?