#!/bin/bash
# Author: Erik Kristensen
# Email: [email protected]
# License: MIT
# Nagios Usage: check_nrpe!check_docker_container!_container_id_
# Usage: ./check_docker_container.sh _container_id_
#
# Depending on your docker configuration, root might be required. If your nrpe user has rights
# to talk to the docker daemon, then root is not required. This is why root privileges are not
# checked.
#
# The script checks if a container is running.
# OK - running
# WARNING - restarting
# CRITICAL - stopped
# UNKNOWN - does not exist
#
# CHANGELOG - March 20, 2017
# - Removes Ghost State Check, Checks for Restarting State, Properly finds the Networking IP addresses
# - Returns unknown (exit code 3) if docker binary is missing, unable to talk to the daemon, or if container id is missing

CONTAINER=$1

# A container ID or friendly name is mandatory.
if [ "x${CONTAINER}" == "x" ]; then
  echo "UNKNOWN - Container ID or Friendly Name Required"
  exit 3
fi

# The docker binary must be on the PATH of the user running the check.
if [ "x$(which docker)" == "x" ]; then
  echo "UNKNOWN - Missing docker binary"
  exit 3
fi

# The current user must be able to reach the docker daemon.
docker info > /dev/null 2>&1
if [ $? -ne 0 ]; then
  echo "UNKNOWN - Unable to talk to the docker daemon"
  exit 3
fi

# Any non-zero exit status here means the container could not be inspected.
RUNNING=$(docker inspect --format="{{.State.Running}}" "$CONTAINER" 2> /dev/null)
if [ $? -ne 0 ]; then
  echo "UNKNOWN - $CONTAINER does not exist."
  exit 3
fi

if [ "$RUNNING" == "false" ]; then
  echo "CRITICAL - $CONTAINER is not running."
  exit 2
fi

RESTARTING=$(docker inspect --format="{{.State.Restarting}}" "$CONTAINER")
if [ "$RESTARTING" == "true" ]; then
  echo "WARNING - $CONTAINER state is restarting."
  exit 1
fi

STARTED=$(docker inspect --format="{{.State.StartedAt}}" "$CONTAINER")
NETWORK=$(docker inspect --format="{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" "$CONTAINER")
echo "OK - $CONTAINER is running. IP: $NETWORK, StartedAt: $STARTED"
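The status mapping listed in the header comments (OK, WARNING, CRITICAL, UNKNOWN) can also be sketched as one small function fed by a single docker inspect call. This is only a sketch, not part of the original script: the function name nagios_state is hypothetical, and it assumes docker prints .State.Running and .State.Restarting as true or false.

```shell
# Hypothetical helper (not in the original script): map the values printed by
#   docker inspect --format '{{.State.Running}} {{.State.Restarting}}' NAME
# to a Nagios status line and return code, in the same order as the script.
nagios_state() {
  local name=$1 running=$2 restarting=$3
  if [ -z "$running" ]; then
    # No output from docker inspect: treat the container as missing.
    echo "UNKNOWN - $name does not exist."
    return 3
  fi
  if [ "$running" = "false" ]; then
    echo "CRITICAL - $name is not running."
    return 2
  fi
  if [ "$restarting" = "true" ]; then
    echo "WARNING - $name state is restarting."
    return 1
  fi
  echo "OK - $name is running."
  return 0
}

# Usage (needs a reachable docker daemon, so it is left commented out):
#   state=$(docker inspect --format '{{.State.Running}} {{.State.Restarting}}' "$CONTAINER" 2> /dev/null)
#   nagios_state "$CONTAINER" $state; exit $?
```

Keeping the mapping in one function means the exit codes can be checked without a running daemon, and the container is inspected once instead of several times.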
Are ghosted containers still an issue? This thread seems to indicate they aren't.
Thanks, just what I needed!
💯
Awesome script, thank you :) docker inspect is a powerful command
💯 💯
How can I run this script without sudo? When I run it as the nagios user, it says the container does not exist, but with sudo ./check_docker_container container_id it works fine. Any suggestions? Thanks.
When running the ghost command, I get:
Template parsing error: template: :1:9: executing "" at <.State.Ghost>: Ghost is not a field of struct type *types.ContainerState
Is this an error or a normal message? Thanks!
I get the same error as michabb (Docker version 1.9.1, build a34a1d5, on Ubuntu Trusty Tahr 14.04.4 LTS), and when I issue the docker inspect command for the container I am using, the State.Ghost field does not exist. So I expect this is either:
a. a State property that is just not in the container
b. a State property that is not included in the docker version
Neither gives me any worries. But it would be nice to know which it is :)
I thought of an option c: the property is only added in specific circumstances. That would be strange though, imho.
Oh, and the properties I do have are:
"State": {
"Status": "running",
"Running": true,
"Paused": false,
"Restarting": false,
"OOMKilled": false,
"Dead": false,
"Pid": 5667,
"ExitCode": 0,
"Error": "",
"StartedAt": "2016-02-23T11:44:17.338327578Z",
"FinishedAt": "0001-01-01T00:00:00Z"
},
I'm a bit of a Docker newbie, but maybe Dead is a replacement for Ghost. I'll have to read more changelogs and manual pages.
Awesome, it helped me to write my own script. Good job.
Since this comment, I've adopted this script with two modifications:
- It returns status UNKNOWN if the docker command is missing (instead of OK)
- I removed all the GHOST check code
hash docker 2>/dev/null || { echo "UNKNOWN - docker command not found"; exit 3; }
First off, thanks for sharing.
A small comment:
When you run the script above through NRPE, i.e. as the unprivileged nagios user, make sure that user has sudo rights for the docker inspect command.
I lost about an hour discovering why the script didn't work through NRPE.
I added something like this to the top:
# permissions
if [ "$(whoami)" != "root" ]; then
  echo "Root privileges are required to run this, try running with sudo..."
  exit 2
fi
Thanks man, after 3 years it's still not outdated! 👍
If you are using Docker 1.12+ and using the IP driver (direct, macvlan, etc.), the way to get the IP address is:
NETWORK=$(docker inspect --format="{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}" $CONTAINER)
That also works for containers on the default network, so it's probably a better way.
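One caveat with that template: when a container is attached to more than one network, the addresses are printed back to back with no separator. A possible variation, sketched here with a hypothetical helper name container_ips (not part of the original script), prints each network as name=ip and joins them with commas.

```shell
# Hypothetical helper (not in the original script): list every attached
# network as "name=ip", separated by ", ".
container_ips() {
  docker inspect \
    --format '{{range $name, $net := .NetworkSettings.Networks}}{{$name}}={{$net.IPAddress}} {{end}}' \
    "$1" 2> /dev/null |
    sed -e 's/ *$//' -e 's/ /, /g'   # drop the trailing space, then join with commas
}

# Usage (needs a reachable docker daemon, so it is left commented out):
#   NETWORK=$(container_ips "$CONTAINER")
```

The sed step assumes network names contain no spaces.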
A slight improvement:
docker ps -q --filter "name=nginx" | awk '{print $1}' | xargs docker inspect --format="{{ .State.Status }}"
Other ps filter options:
(https://docs.docker.com/v1.11/engine/reference/commandline/ps/)
Additional inspect features:
(https://docs.docker.com/v1.11/engine/reference/commandline/inspect/)
Thanks a lot!!
2> /dev/null is what I needed after hours of searching. 🙇♂️
Thank you!
I wasn't getting notifications on this! My apologies.
I've updated the script with most of the suggestions in the comments.
Please note, I'm not using this script anymore, but if needed I'll move this to a git repo so pull requests can be accepted.
Hello guys.
Added nagios user to docker group, so it has permissions to speak to docker daemon.
When I execute it as nagios:
nagios@nrpe-client-host:~$ sh /usr/lib/nagios/plugins/check_docker_container.sh redis
/usr/lib/nagios/plugins/check_docker_container.sh: 25: [: xredis: unexpected operator
/usr/lib/nagios/plugins/check_docker_container.sh: 30: [: x/usr/bin/docker: unexpected operator
/usr/lib/nagios/plugins/check_docker_container.sh: 48: [: true: unexpected operator
/usr/lib/nagios/plugins/check_docker_container.sh: 55: [: false: unexpected operator
OK - redis is running. IP: 172.17.0.2, StartedAt: 2018-03-01T08:07:42.857992735Z
So, it works.
But when I'm trying to execute it from nagios host:
user@nagios-host:~$ /usr/local/nagios/libexec/check_nrpe -H 123.45.67.89 -c check_docker_container redis
NRPE: Unable to read output
Nagios displays the same "UNKNOWN - NRPE: Unable to read output"
Here is nrpe.cfg
command[check_docker_container]=/usr/lib/nagios/plugins/check_docker_container.sh
and service definition
define service {
use generic-service
host_name host-example
service_description Redis Docker Container
check_command check_nrpe!check_docker_container!redis
}
Am I missing something? What's wrong?
Thank you!
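A note on the "[: xredis: unexpected operator" lines in the output above: they most likely come from starting the script with sh. On Debian and Ubuntu, sh is usually dash, whose [ builtin does not accept ==. The failed tests simply evaluate as false, which is why the script still reaches the OK line. Running the script directly or with bash avoids the messages. Alternatively, the comparisons can be written portably. A minimal sketch, with hypothetical helper names is_empty and is_true that are not in the original script:

```shell
# Portable checks that behave the same under bash and dash.
# Helper names are hypothetical and only illustrate the comparison style.

# True when the argument is an empty string (replaces [ "x$1" == "x" ]).
is_empty() {
  [ -z "$1" ]
}

# True when the argument is exactly "true" (replaces [ "$1" == "true" ]).
is_true() {
  [ "$1" = "true" ]
}
```

The only change is a single = (or -z) in place of ==, so the script's logic stays the same.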
Just solved my issue.
It was about permissions, sorry.
Great script, btw. Thanks.
@wirtoo can you share the permission issue fix you used?
I'm running NRPE in a container. Do I need to add the nagios user to /etc/sudoers in the container itself?
From my Nagios host:
./check_nrpe -H 10.99.125.131 -c check_docker_container1
NRPE: Unable to read output
What am I doing wrong?
Remote
/usr/local/nagios/libexec/check_nrpe -H hostip -c check_docker -a asterisk
UNKNOWN - Missing docker binary
Local
/usr/lib64/nagios/plugins/check_docker asterisk
OK - asterisk is running. IP: 172.19.0.2, StartedAt: 2018-09-14T06:44:09.174409454Z
I understand this thread (not the script) might be outdated, but take a look at "Nagios and Docker Monitoring". It may help those of you who are having permission problems.
Awesome script, thank you, just what I needed!
Hi,
I need a shell script that takes docker stats output and sends a mail alert. Please help with this.
Thank you for sharing this!
Thank you Erik Kristensen
Thanks a lot, this is also working for PRTG (with some small changes in the output).
cool, thank you