There are many tools and services available to monitor your web services. Pingdom, for example, is a popular service, but you can also write your own custom shell scripts to ensure your services are up and healthy.
A compromise between a paid service and a completely custom solution is to rely on existing tools to do the job for you. For example, why not let a load balancer with support for health checks monitor your web services? HAProxy is a great load balancer with a built-in UI for displaying the status of the services you are (supposedly) load balancing between. Of course, in our case we are not load balancing anything, because we are not sending any requests through HAProxy to the service instances. We just want to know whether the services are up or not. Doesn’t this look great?
At least this gives you something to look at, with a clear, color-coded indication when something is wrong with your services.
However, you may want to be notified if something goes down, so how do you accomplish that? Luckily, HAProxy can export this list as CSV using an HTTP endpoint (/;csv). So if you wanted to, you could just curl that endpoint, grep the result for “DOWN” and send a push notification to your phone for each match using something like Pushover.
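If you would rather not glue curl and grep together, the same idea can be sketched in a few lines of Python. This is only a rough sketch under some assumptions: the stats URL matches the stats listener on port 1936 from the configuration in this post, the function names are made up for illustration, and the Pushover token and user key are placeholders you would supply yourself. It reads the status column of the CSV and skips the aggregate FRONTEND and BACKEND rows so that each server is reported once.

```python
import csv
import io
import urllib.parse
import urllib.request

# Assumption: HAProxy's stats page is reachable here (port 1936, stats uri /).
STATS_CSV_URL = "http://localhost:1936/;csv"
PUSHOVER_URL = "https://api.pushover.net/1/messages.json"


def find_down(csv_text):
    """Return (proxy, server) pairs whose status starts with DOWN."""
    # The CSV header line starts with "# "; strip it to get clean column names.
    reader = csv.DictReader(io.StringIO(csv_text.lstrip("# ")))
    down = []
    for row in reader:
        # Skip the per-proxy summary rows; we only care about individual servers.
        if row.get("svname") in ("FRONTEND", "BACKEND"):
            continue
        if (row.get("status") or "").startswith("DOWN"):
            down.append((row["pxname"], row["svname"]))
    return down


def notify(token, user, message):
    """Send one push notification through Pushover's message endpoint."""
    data = urllib.parse.urlencode(
        {"token": token, "user": user, "message": message}
    ).encode()
    with urllib.request.urlopen(PUSHOVER_URL, data=data, timeout=10) as resp:
        return resp.status == 200


def check_and_notify(token, user, url=STATS_CSV_URL):
    """Fetch the stats CSV and send one notification per server that is down."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        csv_text = resp.read().decode("utf-8")
    for proxy, server in find_down(csv_text):
        notify(token, user, f"{server} in {proxy} is DOWN")
```

Calling check_and_notify from cron every minute or so would give you basic alerting. Note that it sends a notification on every run while a server stays down, so you may want to remember what you already reported.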
Here is an example HAProxy configuration for this setup (file:
global
    log 127.0.0.1 local0 notice
    maxconn 100
    user haproxy
    group haproxy

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 3
    option redispatch

frontend frontend
    bind *:80
    mode http

backend tomcat-servers
    mode http
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk GET /health_check/
    # nodes to monitor
    server 10.14.6.84 10.14.6.84:80 check fall 3 rise 2 maxconn 10
    # more services here...

backend docker-hosts
    mode http
    balance roundrobin
    option httpclose
    option forwardfor
    option httpchk GET /health_check/
    # nodes to monitor
    server 10.14.6.88 10.14.6.88:80 check fall 3 rise 2 maxconn 10
    # more services here...

listen stats *:1936
    stats enable
    stats uri /
    stats hide-version
Of course, you want to install and configure HAProxy using Ansible. And if you get creative you can be smart about which hosts from your Ansible inventory to include to be monitored, and how to categorize the services in HAProxy to better understand what is down when something goes wrong.
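To give an idea of what generating the backends from an inventory could look like, here is a small Python sketch. In a real Ansible setup you would more likely do this in a template; the function name, the shape of the inventory dictionary and the group names are all assumptions for illustration. The server options are copied from the configuration above.

```python
def render_backend(name, hosts, port=80, check_path="/health_check/"):
    """Render one HAProxy backend section that health-checks the given hosts."""
    lines = [
        f"backend {name}",
        "    mode http",
        f"    option httpchk GET {check_path}",
    ]
    for host in hosts:
        # Same check settings as in the hand-written configuration.
        lines.append(
            f"    server {host} {host}:{port} check fall 3 rise 2 maxconn 10"
        )
    return "\n".join(lines)


def render_backends(inventory):
    """Render one backend per inventory group (hypothetical dict of group -> hosts)."""
    return "\n\n".join(
        render_backend(group, hosts) for group, hosts in sorted(inventory.items())
    )
```

Grouping by inventory group means the stats page shows one table per category of service, which makes it easier to see what kind of thing is down.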
The key to this particular configuration is the httpchk option, which uses a custom endpoint (/health_check/) that needs to be implemented by the services being monitored. As long as a service returns a valid HTTP 200 response for this endpoint, everything will look good (HAProxy treats any 2xx or 3xx status as a passing check). The implementation of this endpoint can be as simple as immediately returning a response, or it can exercise the service in some way, for example by querying any configured databases.
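As a rough illustration of such an endpoint, here is a minimal sketch using only Python's standard library. The database_ok function is a hypothetical placeholder for a real dependency check (for example a SELECT 1 against your database), and port 8080 is an assumption; your services will have their own framework and port. The endpoint answers 200 when every check passes and 503 otherwise, which HAProxy counts as a failed check.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def database_ok():
    """Hypothetical dependency check; replace with a real query such as SELECT 1."""
    return True


# Name -> callable returning True when the dependency is healthy.
CHECKS = {"database": database_ok}


def run_checks(checks):
    """Run every check and return an (http_status, body) pair."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            # A check that raises counts as a failure.
            ok = False
        if not ok:
            failed.append(name)
    if failed:
        return 503, "failed: " + ", ".join(failed)
    return 200, "ok"


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/health_check/":
            self.send_error(404)
            return
        status, body = run_checks(CHECKS)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8080):
    """Serve the health check until interrupted (port is an assumption)."""
    HTTPServer(("", port), HealthHandler).serve_forever()
```

Keep the checks cheap: HAProxy calls this endpoint on every check interval, so a slow database query here turns into constant background load.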
What’s your poor man’s web service monitoring setup?