Determining Root Cause for Slow Application

I've often seen ColdFusion users complain that their server/application is running very slowly, not handling requests as fast as normal, or just plain "hung". Many would suggest examining log files to determine the status and recent history of the ColdFusion server in order to find clues that might reveal what the server was actually doing. While this troubleshooting step can be useful, it can only get you so far.

What's often overlooked is the ability to generate a "Full Thread Dump" from ColdFusion MX as a means to precisely determine exactly what the server was doing at exactly that moment, something that you wouldn't be able to otherwise garner from the logs alone.

Macromedia has a ColdFusionMX TechNote describing how to produce a Full Thread Dump, but there has been very little written so far describing what to do with it then. To that end, here is a reply I recently sent to the CFTalk mailing list:

You should take several full thread dumps in succession with a small interval in between them. Then to examine the thread dump, look for JRun Proxy Service threads beginning with jrpp-xxx where xxx is a number and jrpp-xxx is the thread id. Specifically, to find the running threads that are handling .cfm requests, search for the string (excluding the quotes) ".cfm:". This will lead you to a template and its path with a line number at the end after the colon. You'll also see other mentions of .cfm: further down in the stack of a particular thread, but it the top most one that was executing at that instant. Truncated example:

"jrpp-96" prio=5 tid=0x477e5008 nid=0xc50 runnable [698f000..698fdbc]
at coldfusion.tagext.NativeCfx.processRequest(Native Method)
at coldfusion.tagext.lang.CFSearchTag.doStartTag(
at coldfusion.runtime.CfJspPage._emptyTag(
at cf_cachelists2ecfm612782408.runPage(/opt/apache/htdocs/_cachelists.cfm:30)
at coldfusion.runtime.CfJspPage.invoke(
at coldfusion.tagext.lang.IncludeTag.doStartTag(

Here, jrpp-96 is the JRun thread which is handling the template /opt/apache/htdocs/_cachelists.cfm. That template is currently executing a tag that is either on line 30 or ends on line 30. In this case the tag is CFSEARCH. If you find that this thread stack exists in subsequent full thread dumps that were taken at short intervals, where the same thread id, same template, and same operation is occurring for both thread stacks, then that should be sufficient proof that the cfm request has been executing for the duration of the interval between the two thread dump instances.

Typically the number of jrpp thread stacks having cfm templates in the stack in a single full thread dump will not exceed the number of Simultaneous Requests set in the CFAdmin, also known as the activeHandler limit for JRun people. The full thread dump is precise enough to do full accounting of all running threads, showing exactly what they are doing at that instant.

Its important to emphasize that observing the actions of threads in one thread dump loses the context of time. That's what ColdFusion was doing at that instant, but one thread dump by itself says nothing about how long a thread runs. Rather, its best to look for the same template running with the same thread id across two or more thread dumps taken in rapid succession. It requires some effort to extract and analyze the threads in various thread dumps, but looking at one thread dump alone is not useful for determining what's slowing down the application response times.

Once that is done and you have a list of templates involved and operations that were being executed, then you can focus your effort to debug just that area.

Additional thread stack examples:

1) Reading data back from an ODBC Socket connection

"jrpp-45" prio=5 tid=0x47373e78 nid=0x2f4 runnable [2bae000..2bafdbc]
at Method)
at Source)
at Source)
at Source)
at Source)
at Source)
at Source)
at Source)
at macromedia.sequelink.ssp.Chain.send(Unknown Source)
at macromedia.sequelink.ctxt.stmt.StatementContext.execDirect(Unknown Source)
at macromedia.jdbc.sequelink.SequeLinkImplStatement.execute(Unknown Source)
at macromedia.jdbc.slbase.BaseStatement.commonExecute(Unknown Source)
at macromedia.jdbc.slbase.BaseStatement.executeInternal(Unknown Source)
at macromedia.jdbc.slbase.BaseStatement.execute(Unknown Source)

- locked {0x158ce0c0}(a macromedia.jdbc.sequelink.SequeLinkConnection)
at coldfusion.server.j2ee.sql.JRunStatement.execute(
at coldfusion.sql.Executive.executeQuery(
at coldfusion.sql.Executive.executeQuery(
at coldfusion.sql.SqlImpl.execute(
at coldfusion.tagext.sql.QueryTag.doEndTag(
at cf_additionalresources2ecfm1082472973.runPage(/opt/apache/htdocs/additionalresources.cfm:41)


2) CFHTTP trying to connect to the remote resource

"jrpp-12" prio=5 tid=0x3c342580 nid=0x72c in Object.wait() [3e7ee000..3e7efdbc]
at java.lang.Object.wait(Native Method)
- waiting on {0x10933140} (a HTTPClient.HTTPConnection$EstablishConnection)
at java.lang.Thread.join(
- locked {0x10933140} (a HTTPClient.HTTPConnection$EstablishConnection)
at HTTPClient.HTTPConnection.getSocket(
at HTTPClient.HTTPConnection.sendRequest(
- locked {0x109312b8}(a HTTPClient.HTTPConnection)
at HTTPClient.HTTPConnection.handleRequest(
at HTTPClient.HTTPConnection.setupRequest(
at HTTPClient.HTTPConnection.Get(
at cfdsp_htm2ecfm1703620815.runPage(/opt/apache/htdocs/dsp_htm.cfm:45)


Debugging stack traces in ColdFusion MX