@HawickMason
Forked from prasanthj/native-mem-tracking.md
Created April 27, 2021 06:58
Native memory tracking in JVM

Enable native memory tracking in JVM by specifying the following flag

-XX:NativeMemoryTracking=detail

Find the <PID> of the Java process

jps

To print the RSS as reported by ps

ps -p <PID> -o pcpu,rss,size,vsize

To print native memory tracking summary

jcmd <PID> VM.native_memory

Detailed tracking summary

jcmd <PID> VM.native_memory detail
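
Besides the default summary and detail, jcmd's VM.native_memory also accepts baseline, summary.diff and detail.diff, which help when watching growth over time. As a rough sketch, the same calls can be scripted from Java. The class name NmtCommand and its defaults are made up for illustration, and it assumes jcmd is on the PATH and that the target JVM was started with NativeMemoryTracking enabled.

```java
import java.util.List;

final class NmtCommand {
    /** Builds the jcmd argument list for a PID and an NMT mode such as "summary" or "detail". */
    static List<String> build(long pid, String mode) {
        return List.of("jcmd", Long.toString(pid), "VM.native_memory", mode);
    }

    public static void main(String[] args) throws Exception {
        // With no arguments this targets the current JVM (Java 9+ for ProcessHandle).
        long pid = args.length > 0 ? Long.parseLong(args[0]) : ProcessHandle.current().pid();
        String mode = args.length > 1 ? args[1] : "summary";
        // inheritIO forwards jcmd's report to this process's stdout/stderr.
        Process jcmd = new ProcessBuilder(build(pid, mode)).inheritIO().start();
        System.exit(jcmd.waitFor());
    }
}
```

A typical sequence would be one run with baseline, then later runs with summary.diff against the same PID.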

Get the Rss from smaps

cat /proc/<PID>/smaps | grep Rss | cut -d: -f2 | tr -d " " | cut -f1 -dk | sort -n | awk '{ sum += $1 } END { print sum }'
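
The pipeline above keeps only the Rss lines and adds up their kB values. For readers who prefer to do this from Java, here is a minimal sketch of the same summation. The class name SmapsRss is hypothetical, and it assumes a Linux /proc filesystem and Java 11+ (for Path.of).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

final class SmapsRss {
    /** Sums the values of all "Rss:" lines, in kB. */
    static long sumRssKb(List<String> smapsLines) {
        long totalKb = 0;
        for (String line : smapsLines) {
            if (line.startsWith("Rss:")) {
                // A line looks like "Rss:                 132 kB".
                String[] fields = line.trim().split("\\s+");
                totalKb += Long.parseLong(fields[1]);
            }
        }
        return totalKb;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to the current process when no PID is given.
        String pid = args.length > 0 ? args[0] : "self";
        // ISO-8859-1 avoids decode errors if a mapped file name is not valid UTF-8.
        List<String> lines = Files.readAllLines(Path.of("/proc", pid, "smaps"), StandardCharsets.ISO_8859_1);
        System.out.println(sumRssKb(lines));
    }
}
```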

Demo

To demonstrate the above commands, let's use the following Java program

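import java.nio.ByteBuffer;

public class BaselineDirect {
  // Allocate 1 GiB off-heap so it stands out in the NMT output.
  private static ByteBuffer humongousBuffer = ByteBuffer.allocateDirect(1024 * 1024 * 1024);

  public static void main(String[] args) throws Exception {
    System.out.println("Direct allocation: " + humongousBuffer.capacity());
    // sun.misc.SharedSecrets and sun.misc.VM are JDK 8 internals; they are not available under these names on JDK 9+.
    System.out.println("Native memory used: " + sun.misc.SharedSecrets.getJavaNioAccess().getDirectBufferPool().getMemoryUsed());
    System.out.println("Max direct memory: " + sun.misc.VM.maxDirectMemory());
    // Keep the process alive long enough to inspect it with jps, ps and jcmd.
    Thread.sleep(1000000);
  }
}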
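
The program above relies on sun.misc internals that exist under those names only on JDK 8. On newer JDKs, one possible substitute for the "Native memory used" line is the public BufferPoolMXBean API. The sketch below is not part of the original demo: the class name DirectPoolUsage is made up, it allocates a smaller 16 MiB buffer, and it does not report the max direct memory value.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

final class DirectPoolUsage {
    /** Returns the bytes used by the "direct" buffer pool, or -1 if no such pool is reported. */
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Smaller than the 1 GiB in the demo so it runs with default JVM settings.
        ByteBuffer buffer = ByteBuffer.allocateDirect(16 * 1024 * 1024);
        System.out.println("Direct allocation: " + buffer.capacity());
        System.out.println("Direct pool used: " + directMemoryUsed());
    }
}
```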

Compile and run

javac BaselineDirect.java
java -XX:NativeMemoryTracking=detail -Xms128M -Xmx128M -XX:MaxDirectMemorySize=1024M BaselineDirect

Direct allocation: 1073741824
Native memory used: 1073741824
Max direct memory: 1073741824
  • In another terminal
jps
31335 BaselineDirect

PID - 31335

ps -p 31335 -o pcpu,rss,size,vsize

%CPU   RSS  SIZE    VSZ
 0.8 1067708 5565316 5715364
jcmd 31335 VM.native_memory

31335:

Native Memory Tracking:

Total: reserved=2617472KB, committed=1319348KB
-                 Java Heap (reserved=131072KB, committed=131072KB)
                            (mmap: reserved=131072KB, committed=131072KB)

-                     Class (reserved=1081454KB, committed=29550KB)
                            (classes #386)
                            (malloc=24686KB #197)
                            (mmap: reserved=1056768KB, committed=4864KB)

-                    Thread (reserved=47485KB, committed=47485KB)
                            (thread #47)
                            (stack: reserved=47288KB, committed=47288KB)
                            (malloc=143KB #255)
                            (arena=54KB #92)

-                      Code (reserved=249630KB, committed=3410KB)
                            (malloc=30KB #295)
                            (mmap: reserved=249600KB, committed=3380KB)

-                        GC (reserved=32474KB, committed=32474KB)
                            (malloc=27678KB #174)
                            (mmap: reserved=4796KB, committed=4796KB)

-                  Compiler (reserved=135KB, committed=135KB)
                            (malloc=4KB #34)
                            (arena=131KB #3)

-                  Internal (reserved=1073464KB, committed=1073464KB)
                            (malloc=1073432KB #1723)
                            (mmap: reserved=32KB, committed=32KB)

-                    Symbol (reserved=1428KB, committed=1428KB)
                            (malloc=941KB #75)
                            (arena=488KB #1)

-    Native Memory Tracking (reserved=156KB, committed=156KB)
                            (malloc=89KB #1395)
                            (tracking overhead=67KB)

-               Arena Chunk (reserved=175KB, committed=175KB)
                            (malloc=175KB)

In the above output, direct byte buffer allocations are counted in the Internal section. However, Internal shows 1048.3MB, while the example allocated only 1024MB. The additional 24MB appears to be allocated always (possibly used by the native decompressor?). To verify this, I ran another Java program that just prints "Hello, World" and sleeps, and looked at its Internal section. There is about 24MB in the Internal section even when the program allocates no direct buffers.

Output from the HelloWorld Java program
...
...
Internal (reserved=24887KB, committed=24887KB)
                            (malloc=24855KB #1716)
                            (mmap: reserved=32KB, committed=32KB)
...
...

Rss from smaps for the BaselineDirect process (PID 31335)

cat /proc/31335/smaps | grep Rss | cut -d: -f2 | tr -d " " | cut -f1 -dk | sort -n | awk '{ sum += $1 } END { print sum }'
1069128
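
To make the Internal-section comparison concrete, the sketch below pulls the committed value out of the two Internal lines shown earlier and subtracts the HelloWorld baseline from the BaselineDirect figure. The class name NmtInternalDelta and the regex are illustrative assumptions about the line format, not an official parser. The difference is 1048577KB, which is the 1024MB buffer plus 1KB.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NmtInternalDelta {
    // Matches the committed figure in lines like "(reserved=24887KB, committed=24887KB)".
    private static final Pattern COMMITTED = Pattern.compile("committed=(\\d+)KB");

    /** Extracts the first committed=<n>KB value from one NMT output line. */
    static long committedKb(String nmtLine) {
        Matcher m = COMMITTED.matcher(nmtLine);
        if (!m.find()) {
            throw new IllegalArgumentException("no committed value in: " + nmtLine);
        }
        return Long.parseLong(m.group(1));
    }

    public static void main(String[] args) {
        // Lines copied from the BaselineDirect and HelloWorld outputs in this document.
        long withBufferKb = committedKb("-  Internal (reserved=1073464KB, committed=1073464KB)");
        long helloWorldKb = committedKb("Internal (reserved=24887KB, committed=24887KB)");
        long deltaKb = withBufferKb - helloWorldKb;
        System.out.println("Internal delta: " + deltaKb + " KB (" + (deltaKb / 1024.0) + " MB)");
    }
}
```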

Native memory allocation tracing with JEMALLOC

Installing jemalloc on CentOS

yum -y install git autoconf gcc automake libtool
git clone https://github.com/libunwind/libunwind
cd libunwind
./autogen.sh
AM_CFLAGS="-m64" ./configure
AM_CFLAGS="-m64" make
make install prefix=/usr/local

cd ..
git clone https://github.com/jemalloc/jemalloc
cd jemalloc
./autogen.sh
./configure CC="gcc -m64 " CXX="g++ -m64 " --enable-prof --enable-stats --enable-debug --enable-fill --enable-prof-libunwind --with-static-libunwind=/usr/local/lib/libunwind-x86_64.a
make
make install prefix=/usr/local
  • Copy /usr/local/lib/libjemalloc.so.2 to the machine where the jemalloc tracing will run, then set these variables in the environment that launches the JVM
export LD_PRELOAD=/path/to/libjemalloc.so.2
export MALLOC_CONF="prof:true,lg_prof_interval:30,lg_prof_sample:17,prof_prefix:/tmp/jeprof.out"
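
The lg_ options in MALLOC_CONF are base-2 logarithms of byte counts: lg_prof_interval:30 asks for a profile dump roughly every 2^30 bytes (1 GiB) of allocation activity, and lg_prof_sample:17 samples allocations on average every 2^17 bytes (128 KiB). A small sketch that decodes the string above follows. The class name MallocConf and its helpers are made up for illustration and only handle simple key:value pairs.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MallocConf {
    /** Splits "key:value,key:value" into an ordered map, cutting each pair at its first colon. */
    static Map<String, String> parse(String conf) {
        Map<String, String> options = new LinkedHashMap<>();
        for (String pair : conf.split(",")) {
            int colon = pair.indexOf(':');
            options.put(pair.substring(0, colon), pair.substring(colon + 1));
        }
        return options;
    }

    /** Converts a log2 option such as lg_prof_sample into a byte count. */
    static long lgToBytes(Map<String, String> options, String key) {
        return 1L << Integer.parseInt(options.get(key));
    }

    public static void main(String[] args) {
        Map<String, String> options =
            parse("prof:true,lg_prof_interval:30,lg_prof_sample:17,prof_prefix:/tmp/jeprof.out");
        System.out.println("dump interval bytes: " + lgToBytes(options, "lg_prof_interval"));
        System.out.println("sample interval bytes: " + lgToBytes(options, "lg_prof_sample"));
        System.out.println("profile file prefix: " + options.get("prof_prefix"));
    }
}
```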