Kube-Prometheus-Stack on Docker Desktop for Mac (Sonoma 14.3 + Docker Desktop 4.27)

Initial Setup - reconfigure docker-desktop to expose metrics endpoints

  • The default docker-desktop setup does not allow Prometheus to scrape metrics from the following kube-system pods:
    • kube-controller-manager-docker-desktop
    • etcd-docker-desktop
    • kube-scheduler-docker-desktop
    • kube-proxy-{id}
  • Edit the configmap kube-proxy in the kube-system namespace
    • metricsBindAddress: change the value to 0.0.0.0:10249 so the metrics endpoint listens on all interfaces (10249 is kube-proxy's default metrics port)
    • Manually delete the kube-proxy-{id} pod and let it reprovision
  • Docker Desktop runs the cluster inside a virtual machine, so to edit the cluster manifest files we must first shell into the VM host using nsenter:
    • docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i sh
    • cd /etc/kubernetes/manifests
    • vi kube-controller-manager.yaml
  • kube-controller-manager-docker-desktop
    • Edit kube-controller-manager.yaml
    • bind-address=: change the loopback address to 0.0.0.0, i.e. bind-address=0.0.0.0
    • Save + exit, then wait for the pod to reload automatically
  • kube-scheduler-docker-desktop
    • Edit kube-scheduler.yaml
    • bind-address=: change the loopback address to 0.0.0.0, i.e. bind-address=0.0.0.0
    • Save + exit, the pod should reload automatically
  • etcd-docker-desktop
    • Edit etcd.yaml
    • listen-metrics-urls=<existing URL>: append ,http://<cluster IP>:2381 to the existing value, giving listen-metrics-urls=<existing URL>,http://<cluster IP>:2381. You can find the cluster IP in the other settings in that same file
    • Save + exit. This makes the cluster temporarily unresponsive; wait a minute or two and it should come back with the pod restarted
  • You should now see all targets UP in the Prometheus dashboard :)
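
The manifest edits above come down to two text substitutions, which can be sketched as stdin-to-stdout filters. This is only an illustration: the helper names patch_bind_address and append_etcd_metrics_url are made up for this example, and 192.168.65.3 in the usage comment is a placeholder node IP, not a value taken from a real cluster.

```shell
# Sketch only: both helpers are hypothetical and filter manifest text from stdin to stdout.

# Point any --bind-address flag at all interfaces (0.0.0.0).
patch_bind_address() {
  sed 's/--bind-address=[^ ]*/--bind-address=0.0.0.0/'
}

# Append http://<node-ip>:2381 to --listen-metrics-urls unless it is already listed.
# $1 is the node IP.
append_etcd_metrics_url() {
  awk -v url="http://$1:2381" '
    /--listen-metrics-urls=/ && index($0, url) == 0 {
      sub(/--listen-metrics-urls=[^ \t]+/, "&," url)
    }
    { print }
  '
}

# Preview a change without touching the file (no -t, so the output pipes cleanly):
#   docker run --rm --privileged --pid=host debian nsenter -t 1 -m -u -n -i \
#     cat /etc/kubernetes/manifests/kube-scheduler.yaml | patch_bind_address
#   docker run --rm --privileged --pid=host debian nsenter -t 1 -m -u -n -i \
#     cat /etc/kubernetes/manifests/etcd.yaml | append_etcd_metrics_url 192.168.65.3
```

Both filters are idempotent, so previewing the output before editing the real file costs nothing and running them on an already-patched manifest leaves it unchanged.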

Now you can install the kube-prometheus-stack Helm chart to deploy all the necessary components.

E.g. to install the components into the monitoring namespace...

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
helm upgrade --install --wait \
            --values kube-prom-stack-values.yaml \
            --create-namespace \
            --namespace monitoring \
            prometheus \
            prometheus-community/kube-prometheus-stack \
            --debug \
            --timeout 5m
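
Once the release is up, target health can also be checked from the command line instead of the dashboard. The following is a rough sketch, not part of the original setup: summarize_target_health is a made-up helper name, and the usage lines assume curl is installed and that the operator-created prometheus-operated service is reachable on port 9090 in the monitoring namespace.

```shell
# Hypothetical helper: reads the JSON returned by Prometheus' /api/v1/targets
# endpoint on stdin and prints one "<health> <count>" line per health state.
summarize_target_health() {
  grep -oE '"health"[[:space:]]*:[[:space:]]*"[a-z]+"' \
    | sed -E 's/.*"([a-z]+)"$/\1/' \
    | sort | uniq -c | awk '{print $2, $1}'
}

# Usage (assumed service name and port, see above):
#   kubectl port-forward -n monitoring svc/prometheus-operated 9090 &
#   curl -s http://localhost:9090/api/v1/targets | summarize_target_health
```

Any line other than "up" in the output points at a target that still needs one of the bind-address changes described earlier.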

Small fix for the Grafana dashboard contained in kube-prom-stack-values.yaml...

# fix grafana dashboard:
      release: prometheus

docker-desktop update script

  • You will need to repeat the steps outlined here every time the docker-desktop cluster is reset :(
  • A script would be nice, huh? You're very welcome! :)
  • Disclaimer: this works for me. No claims made on robustness / portability.
#!/usr/bin/env bash

echo "Updating docker-desktop pods to expose metrics endpoints"
echo "This will involve several kube-system pod restarts" 

echo "Fetching debian image to run nsenter on the docker-desktop host..."
docker pull debian

NODE_IP=$(kubectl get nodes -o wide --no-headers | awk '{print $6}')    # column 6 is INTERNAL-IP
echo "Host Node IP: $NODE_IP"

echo "Updating kube-proxy configmap..."
MOD_YAML=$(mktemp)      # temporary copy of the configmap
kubectl get configmap/kube-proxy -n kube-system -o yaml > "$MOD_YAML"
# 0.0.0.0:10249 = all interfaces on kube-proxy's default metrics port
if ! grep -q "metricsBindAddress: 0.0.0.0:10249" "$MOD_YAML"; then
    sed -i '' -E 's/^([[:space:]]*metricsBindAddress:).*/\1 0.0.0.0:10249/' "$MOD_YAML"     # BSD sed syntax for mac
    # replace in place; the exported YAML still carries its resourceVersion
    kubectl replace -f "$MOD_YAML"
    echo "Restarting the kube-proxy pod"
    kubectl delete pod -n kube-system -l k8s-app=kube-proxy
    if ! kubectl wait -n kube-system --timeout=3m --for=condition=Ready pod -l k8s-app=kube-proxy; then
        echo "kube-proxy pod did not restart in time, please check the pod logs."
        exit 1
    fi
    echo "kube-proxy pod restarted."
else
    echo "kube-proxy metricsBindAddress already updated, skipping."
fi
rm -f "$MOD_YAML"

echo "Updating bind-address on kube-controller-manager..."
# skip if the pod is already bound to all interfaces
if ! kubectl describe pod kube-controller-manager-docker-desktop -n kube-system | grep -q "bind-address=0.0.0.0"; then
    docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i \
        sh -c "sed -i 's/--bind-address=[^ ]*/--bind-address=0.0.0.0/' /etc/kubernetes/manifests/kube-controller-manager.yaml"
    echo "Waiting for kube-controller-manager to restart, this can take some time..."
    kubectl wait pod -l component=kube-controller-manager -n kube-system --timeout=3m --for=delete
    if ! kubectl wait pod -l component=kube-controller-manager -n kube-system --timeout=3m --for=condition=Ready; then
        echo "kube-controller-manager pod did not restart in time, please check the pod logs."
        exit 1
    fi
    echo "kube-controller-manager pod restarted."
else
    echo "kube-controller-manager bind-address already updated, skipping."
fi

echo "Updating bind-address on kube-scheduler"
# skip if the pod is already bound to all interfaces
if ! kubectl describe pod kube-scheduler-docker-desktop -n kube-system | grep -q "bind-address=0.0.0.0"; then
    docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i \
        sh -c "sed -i 's/--bind-address=[^ ]*/--bind-address=0.0.0.0/' /etc/kubernetes/manifests/kube-scheduler.yaml"
    echo "Waiting for kube-scheduler to restart, this can take some time..."
    kubectl wait pod -l component=kube-scheduler -n kube-system --timeout=3m --for=delete
    if ! kubectl wait pod -l component=kube-scheduler -n kube-system --timeout=3m --for=condition=Ready; then
        echo "kube-scheduler pod did not restart in time, please check the pod logs."
        exit 1
    fi
    echo "kube-scheduler pod restarted."
else
    echo "kube-scheduler bind-address already updated, skipping."
fi

echo "Adding node ip to listen-metrics-urls on etcd"
if kubectl describe pod etcd-docker-desktop -n kube-system | grep "listen-metrics-urls" | grep -q "http://${NODE_IP}:2381"; then
    echo "etcd listen-metrics-urls already updated, skipping."
else
    # append the node URL to whatever value listen-metrics-urls currently has
    docker run -it --privileged --pid=host debian nsenter -t 1 -m -u -n -i \
        sh -c "sed -i 's|\(--listen-metrics-urls=[^ ]*\)|\1,http://${NODE_IP}:2381|' /etc/kubernetes/manifests/etcd.yaml"
    echo "Waiting for etcd to restart, this can take some time..."
    # as soon as etcd goes down this will respond with an error from the api server
    kubectl wait pod -l component=etcd -n kube-system --timeout=3m --for=delete
    # so we wait a few seconds for the api server to reboot before running kubectl commands again
    sleep 10
    # if all has gone well this should respond immediately
    if ! kubectl wait pod -l component=etcd -n kube-system --timeout=3m --for=condition=Ready; then
        echo "etcd pod did not restart in time - this may just be the api server still rebooting, give it a few minutes before panicking."
    fi
fi

echo "Done! You can now deploy the monitoring components."
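
The fixed sleep after the etcd restart is a guess at how long the api server needs to come back. One possible alternative, sketched here under the assumption that simply re-running the command is safe, is a small retry wrapper; retry is a hypothetical helper, not something the script above defines.

```shell
# Hypothetical helper: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
  local max_attempts="$1"
  local delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# Usage: poll for up to roughly 3 minutes while the api server comes back
#   retry 18 10 kubectl wait pod -l component=etcd -n kube-system \
#     --timeout=10s --for=condition=Ready
```

Short per-attempt timeouts keep each failure cheap, so the loop picks up quickly once the api server answers again.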
