Wednesday, February 22, 2012

Figure out why is JAVA eating CPU?

In a production environment profiling is not an option, we have seen several times that our CPU has reached almost 100%, we are running tomcat on Redhat 5.5, so what is happening?

Fortunately, java comes with some great debugging tools, cooperating those tools with Linux built-in tools will let you know what is happening.

Here is what i am going to explain:
  • An introduction about java threads and its relation to what-so-called Linux LWP
  • A step-by-step
1- An introduction

as you may know, any java program starts when the JVM calls the main method, this creates a thread called the main thread and any thread you create using java code will be derived from the main thread (out-of-focus: it is always a good practice to give your threads a name, it is very useful in debugging, however, it is not required here) the same exact behavior occurs on the Linux level, the main thread for java means a process for the OS, and every thread you create using java the OS will create a Light-weight-process or LWP.
to cut it short, Java main thread = Linux process and Java thread = Linux LWP.

the idea here:
  • Ask Linux which LWP is eating the CPU.
  • Ask Java for a Thread Dump.
  • Map this LWP to a Java thread.
  • Get the part of code causing the issue.

2- A Step-By-Step:
  • Get the PID: the very first step is to know what is the Java process ID, we will use Linux tools ps and grep
    ps -A|grep java
    if you are running multiple java processes we can execute
    ps -ef|grep java

  • the next step is to get CPU usage per each LWP related to the main process, again we will use  ps  and  grep
    ps -eLo pid,lwp,nlwp,ruser,pcpu,stime,etime,args|grep {pid} > lwpthread.txt
    the file lwpthread.txt will contain some thing similar:
    PID NLWP SZ RUSER %CPU STIME ELAPSED COMMAND
    8234 8234 1110 admin 0.3 08:11 30:15 /usr/java/jdk1.6.0_24/bin/java
    8234 8245 1110  admin 99.0 08:45 10:15 /usr/java/jdk1.6.0_24/bin/java 

    as you can see, we have an LWP(mapped to a java thread) eating the CPU, get the NLWP, that will be our lead in the next step.

  • the next step is to generate a java thread dump, there are two main ways to create a thread dump, the first one is to use the JDK tool jstack and pass the PID to it, the second way is to send a kill signal to the JVM, however in the first way you will have control over where you want to save the thread dump, while on the second way you will have the thread dump written on the java process standard output stream.
     as said, we are using tomcat, so the thread dump will be in catalina.out
    kill -3 {pid}
    the thread dump will be printed to the file with full stack trace. the file will contain the thread you are after, but first convert your LWP id from DEC to HEX so 8245 would be 2035, now open the thread dump with text editor and search for 2035, you will find something similar:
    "TP-Processor234786" daemon prio=10 tid=0x00002aaad8024800 nid=0x2035 runnable [0x00002aaadef29000]
    java.lang.Thread.State: RUNNABLE
    at java.util.HashMap.get(HashMap.java:303)
    at ......

and you are done!

19 comments:

Javin Paul said...

This is fantastic tip man, Indeed useful. Production environment is always hard for debugging with logs level upto WARN and ERROR and limited access. Please share more of your experience. bookmarked it. I have also written about Java remote debugging in Eclipse you may like.

Javin

Sumit Pal said...

Another way to do it is using jconsole if you had started the JVM with the following
-Dcom.sun.management.jmxremote.port=9003

you can install jconsole plugin called topthreads

and then start jconsole with the following from any machine and connect to the JVM machine on the above port

jconsole -pluginpath topthreads.jar

moutaz salem said...

Thanks!!!
i will be posting more findings as i go :)

moutaz salem said...

Hi Sumit,

as i mentioned, profiling is not an option here :)

the same applies on Jconsole and VisualVM

Mamdouh said...

That was definitely a life saver :)

moutaz salem said...

thanks Mamdouh :)

Boudy said...

Man we are having the same problem at our live environment .. And we're trying to figure out a way to do this for almost 2 weeks now .. Thanks a lot, keep em comin ; D

moutaz salem said...

Most welcomed Boudy ;)

Rajkumar Selvaraj said...

awesome article.. thanks

moutaz salem said...

My pleasure, Rajkumar

hazem hegazi said...

Excellent work moutaz :)

Looking forward to read more articles by you

You are an artist!!

moutaz salem said...

thanks, Hazem

i learned from a Master :)

hazem hegazi said...

How humble you are!!!

My dear friend you are a real distinguished master, we learned a lot from you, really we did, and we still as you see :)

keep doing well, and always share :)

Stas Ostapenko said...

Hi. Thnaks for the post :) Btw, instead of "ps -Aef | grep java" you can use jps tool.

moutaz salem said...

of course, Stas
thanks for the tip :)

yorichunt said...

Very useful indeed, thanks very much.

moutaz salem said...

Most Welcomed, yorichunt

Mayur Choubey said...

This is just incredible... thanks a ton...

moutaz salem said...

you are more than welcome, Mayur