How (and Why) to Add Health Checks to Your Docker Containers – CloudSavvy IT



    You’ve built your Docker image, pushed it to your registry, and started a new container in production. Everything’s working as you head home for the day, but you’re met with outage reports when you come back the next morning. Your container’s still running but it’s not serving requests.

    This scenario might be uncomfortably familiar to operations teams that work with Docker. Here’s how to use Docker’s health check feature to get accurate data on the availability of your services.

    How Health Checks Work

    Health checks allow a container to expose its workload’s availability. This stands apart from whether the container is running. If your database goes down, your API server won’t be able to handle requests, even though its Docker container is still running.

    This makes troubleshooting harder. A plain docker ps reports the container as up even though the workload inside it is failing. Adding a health check extends the docker ps output to include the workload’s actual state.

    You configure container health checks with the HEALTHCHECK instruction in your Dockerfile. The instruction accepts a command which the Docker daemon executes every 30 seconds by default. Docker uses the command’s exit code to determine your container’s healthiness:

    • 0 – The container is healthy and working normally.
    • 1 – The container is unhealthy; the workload may not be functioning.
    • 2 – This status code is reserved by Docker and should not be used.

    When HEALTHCHECK is present in a Dockerfile, you’ll see the container’s healthiness in the STATUS column when you run docker ps.

    Healthiness isn’t checked straightaway when containers are created. The status will show as starting before the first check runs. This gives the container time to execute any startup tasks. A container with a passing health check will show as healthy; an unhealthy container displays unhealthy.
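Because a container passes through starting before it settles on healthy or unhealthy, a deployment script may want to wait for that transition. The following is a minimal sketch of one way to do it. The function name wait_until_healthy and its default attempt count and delay are assumptions made for this example, not something Docker provides.

```shell
#!/bin/sh
# Hypothetical helper: poll a container until Docker reports it as healthy.
# Usage: wait_until_healthy <container> [max_attempts] [delay_seconds]
wait_until_healthy() {
  container=$1
  attempts=${2:-30}
  delay=${3:-2}
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # .State.Health.Status is "starting", "healthy" or "unhealthy"
    status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=""
    case $status in
      healthy)   return 0 ;;
      unhealthy) echo "$container is unhealthy" >&2; return 1 ;;
    esac
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $container" >&2
  return 1
}
```

Calling wait_until_healthy my-container 10 3 would check up to ten times, three seconds apart, and return a non-zero status if the container never becomes healthy.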

    Writing Health Check Commands

    Container health checks are configured with the HEALTHCHECK instruction in your Dockerfile. You should use a health check command that’s appropriate for your container. For web servers, curl gives you an easy way to perform a basic readiness check. Ping a known endpoint that should be available whenever your service is up.

    FROM nginx:latest
    HEALTHCHECK CMD curl --fail http://localhost/api/healthcheck || exit 1

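    This example marks the container as unhealthy if your server’s /api/healthcheck endpoint returns an error status. The --fail flag makes curl exit with a non-zero code on an HTTP error response, and || exit 1 turns any such failure into the exit code 1 that Docker expects.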

    You can use docker inspect to see the output from health check commands. This is helpful when you’re debugging your HEALTHCHECK instruction.

    docker inspect --format="{{json .State.Health}}" my-container

    Replace my-container with the ID or name of the container you want to inspect. These details are displayed in docker ps output.
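The full JSON can be noisy when you only need one field. As a rough sketch, the helpers below narrow the output using Go templates. The function names are invented for this example. The fields they read (Status, and ExitCode and Output within Log) are the ones that appear in the .State.Health JSON.

```shell
#!/bin/sh
# Hypothetical helpers built on docker inspect and docker ps.

# Print only the current health status: starting, healthy or unhealthy.
health_status() {
  docker inspect --format '{{.State.Health.Status}}' "$1"
}

# Print the exit code and captured output of each recorded health check run.
health_log() {
  docker inspect \
    --format '{{range .State.Health.Log}}{{.ExitCode}}: {{.Output}}{{end}}' "$1"
}

# List the names of containers whose health check is currently failing.
unhealthy_containers() {
  docker ps --filter health=unhealthy --format '{{.Names}}'
}
```

For instance, health_log my-container is a quick way to see what curl printed during recent failed checks.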

    Customizing Health Checks

    Docker lets you customize the frequency of health checks. You can also alter the criteria that mark a container as unhealthy.

    There are four options available:

    • --interval – Set the time between health checks. This lets you override the default value of 30 seconds. Use a longer interval for low-priority services where frequent health checks might impact performance. Use a shorter interval for critical services.
    • --start-period – Set the duration after a container starts when health checks should be ignored. The command will still be run but an unhealthy status won’t be counted. This gives containers time to complete startup procedures.
    • --retries – This lets you require multiple successive failures before a container is marked as unhealthy. It defaults to 3. If a health check fails but the subsequent one passes, the container will not transition to unhealthy. It will become unhealthy after three consecutive failed checks.
    • --timeout – Set the timeout for health check commands. Docker will treat the check as failed if the command doesn’t exit within this time frame.

    Options are passed as flags to the HEALTHCHECK instruction. Supply them before the health check command:

    HEALTHCHECK --interval=60s --retries=5 CMD curl --fail http://localhost || exit 1

    This configuration instructs Docker to run curl every 60 seconds. The container will be marked as unhealthy if five consecutive tests have a non-zero exit code.
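The consecutive-failure rule is easy to misread, so here is a simplified model of it. This is a sketch rather than Docker's actual implementation: it assumes a start period of zero and takes the exit codes of successive checks as arguments. The function name simulate_health is made up for the illustration.

```shell
#!/bin/sh
# Simplified model of the retries rule (start period assumed to be 0).
# Usage: simulate_health <retries> <exit_code>...
# Prints the status the container would have after the last check.
simulate_health() {
  retries=$1
  shift
  status=starting
  streak=0
  for code in "$@"; do
    if [ "$code" -eq 0 ]; then
      status=healthy        # a passing check resets the failure streak
      streak=0
    else
      streak=$((streak + 1))
      if [ "$streak" -ge "$retries" ]; then
        status=unhealthy    # enough consecutive failures
      fi
    fi
  done
  echo "$status"
}

simulate_health 5 1 1 1 1 0 1   # prints "healthy": a pass broke the streak
simulate_health 5 1 1 1 1 1     # prints "unhealthy": five failures in a row
```

With --retries=5, four failures followed by a pass leave the container healthy. Only an unbroken run of five failures flips it to unhealthy.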

    Command Format

    The command that follows CMD can be written either as a plain shell command or as an exec-style JSON array, as with ENTRYPOINT. You can also pass NONE instead of CMD to disable health checks:

    HEALTHCHECK NONE

    This lets you disable health checks inherited from your base image. Each HEALTHCHECK statement overrides any previous instructions in your image’s layers.

    What About Docker Compose?

    Docker Compose supports health check definitions too. Add a healthcheck section to your service:

    version: "3.4"
    services:
      app:
        image: nginx:latest
        ports:
          - 80:80
        healthcheck:
          test: curl --fail http://localhost || exit 1
          interval: 10s
          retries: 5
          start_period: 5s
          timeout: 10s

    The test key defines the command to run. The other keys map to the parameters accepted by the Dockerfile HEALTHCHECK instruction.

    Summary

    Setting a HEALTHCHECK instruction makes it easier to diagnose a misbehaving container. You can track the readiness of your workload independently of the container’s “running” state.

    Health checks are compatible with any command that issues a 0 or 1 exit code. You can use common commands like curl and ping to inspect web services and databases. For more advanced control, write a dedicated script and include it in your images.
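A dedicated script might look something like the sketch below. It is illustrative only. HEALTH_URL and DATA_DIR are assumed settings for this example, not variables that Docker defines, and the two checks are placeholders for whatever your workload actually depends on.

```shell
#!/bin/sh
# healthcheck.sh: illustrative sketch. HEALTH_URL and DATA_DIR are
# assumptions for this example, not variables that Docker defines.
HEALTH_URL=${HEALTH_URL:-http://localhost/api/healthcheck}
DATA_DIR=${DATA_DIR:-/tmp}

# Run one named check and report a failure on stderr.
check() {
  name=$1
  shift
  if ! "$@" >/dev/null 2>&1; then
    echo "FAIL: $name" >&2
    return 1
  fi
}

# Succeed only if every check passes; stop at the first failure.
run_health_checks() {
  check "http endpoint" curl --fail --silent "$HEALTH_URL" || return 1
  check "data directory writable" test -w "$DATA_DIR" || return 1
  return 0
}
```

A real script would end with a call such as run_health_checks || exit 1 and be referenced from the Dockerfile with a line like HEALTHCHECK CMD /usr/local/bin/healthcheck.sh, where the path is simply the one chosen for this example. The failure message ends up in the health check output that docker inspect shows, which makes it clear which check failed.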


