
@hilbix
Created March 28, 2013 15:27
Bash/AWK script to detect 100% CPU on kswapd processes, caused by a known Linux kernel issue on RHEL 5.5 (and CentOS) kernel 2.6.18-194.el5. Returns 0 if no problem, and usually 2 when the problem hits. (Note: It will return 1 on first invocation. Better safe than sorry.)
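#!/bin/bash
# State file: one line per kswapd PID with the user/sys tick counters seen
# on the previous run, plus an "up" line holding the previous uptime.
PIDFILE=/tmp/kswapd-check.pids

# Print the PIDs of kswapd0 and kswapd1, one per line.
# In /proc/PID/stat, field 1 is the PID and field 2 the command name in parentheses.
scanpids()
{
  cat -s /proc/[1-9]*/stat 2>/dev/null |
  awk 'BEGIN {
    want["(kswapd0)"]=1;
    want["(kswapd1)"]=1;
  }
  ($2 in want) { print $1 }
  '
}

# Start from a fresh PID list if there is no non-empty state file yet.
[ -s "$PIDFILE" ] || scanpids > "$PIDFILE"
mv -f "$PIDFILE" "$PIDFILE.old"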
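# Compare the saved counters with the current ones and write the new state.
# The exit status is the number of kswapd processes that look stuck.
exec awk '
BEGIN { cnt=0; up=0 }
# PID line: "PID old_utime old_stime ..." (the counters are missing on a fresh list)
$1+0>0 {
  p = $1+0;
  wasu[p] = $2;
  wass[p] = $3;
  # Replace $0 with the current stat line of that process.
  # A PID that no longer exists is not counted, which forces a rescan.
  f="/proc/" $1 "/stat";
  if ((getline < f) > 0) {
    user[p] = $14;  # utime in clock ticks
    sys[p] = $15;   # stime in clock ticks
    cnt++;
  }
  close(f);
}
# Uptime recorded by the previous run
$1=="up" {
  wasup=$2
}
END {
  err=0;
  # Fewer than two kswapd processes found: print nothing, so the state
  # file stays empty and the PID list is regenerated on the next run.
  if (cnt<2) exit(1);
  # The first field of /proc/uptime is seconds with two decimals;
  # dropping the dot turns it into hundredths of a second.
  f="/proc/uptime"; getline up < f; close(f);
  sub(/\./,"",up);
  delta=up-wasup;
  printf("up %d (%d)\n", up, delta);
  for (a in user)
  {
    du = user[a]-wasu[a];
    ds = sys[a]-wass[a];
    # More than half of the elapsed time spent on CPU counts as a hit.
    # This assumes 100 clock ticks per second, matching the uptime unit.
    if (delta < 2*(du+ds))
      err++;
    printf "%d %d %d (%d %d: %d)\n", a, user[a], sys[a], du, ds, err;
  }
  exit(err);
}
' "$PIDFILE.old" > "$PIDFILE"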
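
The script only reports through its exit status, so a caller (a cron job or a monitoring check, for example) has to interpret that number. Below is a minimal sketch of such a mapping. The helper name `describe_status`, the message texts, and the install path in the usage comment are assumptions for illustration, not part of the original gist. Note that status 1 is ambiguous: it is returned when the state file had to be regenerated, and also when exactly one kswapd process is over the threshold.

```shell
#!/bin/bash
# Hypothetical helper: translate the exit status of the kswapd check
# into a one-line message. Names and wording are illustrative only.
describe_status()
{
  case "$1" in
    0) echo "OK: no kswapd process over the threshold" ;;
    1) echo "WARNING: state regenerated or one kswapd process over the threshold, run again" ;;
    *) echo "CRITICAL: $1 kswapd processes used more than half of the elapsed time" ;;
  esac
}

# Usage sketch (the path is an assumption):
#   /usr/local/bin/kswapd-check >/dev/null
#   describe_status $?
```

Because the check compares against the previous run, it is only meaningful when invoked repeatedly, with the interval between two runs being the measurement window.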

hilbix commented May 17, 2013

Red Hat has confirmed that this is a kernel issue in 2.6.18-194.el5.

Solutions:

  • Minimum: kernel-2.6.18-194.32.1.el5 contains the immediate bugfix
  • Better: kernel-2.6.18-238.el5 contains additional kswapd-related bugfixes
  • Best: kernel-2.6.18-348.4.1.el5, the latest kernel that runs on RHEL 5.5 without changes
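
To see whether a machine already runs a kernel with the immediate fix, the running release can be compared against 2.6.18-194.32.1.el5. The following is a rough sketch: `kernel_at_least` is a hypothetical helper that compares only the numeric components from left to right and treats parts such as "el5" as 0. It is a simplification and not RPM's version comparison.

```shell
#!/bin/bash
# Hypothetical helper: succeed if kernel release $1 is at least release $2.
# Simplified comparison; it does not implement RPM version ordering.
kernel_at_least()
{
  local have need i h n
  # Split both releases on "." and "-" into arrays
  IFS=.- read -ra have <<< "$1"
  IFS=.- read -ra need <<< "$2"
  for ((i = 0; i < ${#have[@]} || i < ${#need[@]}; i++)); do
    h=${have[i]}; n=${need[i]}
    # Non-numeric or missing parts (for example "el5") count as 0
    [[ $h =~ ^[0-9]+$ ]] || h=0
    [[ $n =~ ^[0-9]+$ ]] || n=0
    # Force base 10 so a leading zero is not read as octal
    ((10#$h > 10#$n)) && return 0
    ((10#$h < 10#$n)) && return 1
  done
  return 0
}

# Usage sketch:
#   kernel_at_least "$(uname -r)" 2.6.18-194.32.1.el5 ||
#     echo "kernel $(uname -r) predates the kswapd fix"
```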
