SeeStack - Parse ColdFusion Thread Dumps Like a Pro!

Sleestacks are scary stuff in The Land of the Lost, and so are ColdFusion MX thread dumps.

So many threads, so many stacks, what to do, what to do?

Never fear, SeeStack is here! ... from Webapper, the folks who make SeeFusion. A very helpful utility similar to SeeStack is used by Adobe ColdFusion Support Techs like me. SeeStack can quickly suck up gobs and gobs of those gnarly thread dumps(1) on one end and spit out(2) just the highlights of what you need to know on the other end.

To debug applications that seemed slow or hung, I used to manually read those unsightly stacks while my eyes glazed over. But then I emerged from the primordial world and started using a stack trace analyzer. SeeStack will make debugging bottleneck pages much, much easier and far less time consuming.

Thanks Webapper!

(1)(2) This phrasing has not yet been approved by the friendly marketing people at Webapper ;)

Datasource Timeouts and Support for CFQUERY Timeout Attribute

When considering timeout settings and the sources of bottlenecks in ColdFusion applications, a widely held misconception is that no request with an active database connection will obey a timeout. Discussions about slowness or unresponsiveness in ColdFusion applications often boil down to isolating unusual database activity as the culprit. The usual explanation is that the general Timeout in the ColdFusion Administrator, as well as the page-level CFSETTING timeout attribute, will be enforced for all page requests except those actively waiting on a response from the database server after a SQL statement has been sent. While this is true, it's sometimes overlooked that the CFQUERY tag itself has a timeout attribute that is worth considering in some cases. In fact, this particular timeout is something that I've often forgotten, too.

One reason for the confusion over the CFQUERY tag timeout is that not all drivers support it. Of the database drivers that ship with ColdFusion MX, only the Oracle and SQL Server drivers will enforce the CFQUERY timeout attribute if it is supplied.

From the ColdFusion documentation for CFQUERY, the timeout attribute is described as:

CFQUERY timeout attribute

Maximum number of seconds that each action of a query is permitted to execute before returning an error. The cumulative time may exceed this value.

For JDBC statements, ColdFusion sets this attribute. For other drivers, check driver documentation.


[More]

Request timed out waiting for an available thread to run

The ColdFusion MX error message "Request timed out waiting for an available thread to run" occurs specifically when a thread in the queued request pool times out while still in the queue, before it ever had a chance to make it to the running request pool. This blog entry revisits a topic previously blogged about here.

An example of the Java stack trace for the exception might look like this at the top:

java.lang.RuntimeException: Request timed out waiting for an available thread to run. You may want to consider increasing the number of active threads in the thread pool.
    at jrunx.scheduler.ThreadPool$Throttle.enter(ThreadPool.java:116)
    at jrunx.scheduler.ThreadPool$ThreadThrottle.invokeRunnable(ThreadPool.java:425)
    at jrunx.scheduler.WorkerThread.run(WorkerThread.java:66)


The helpful hint provided in the exception message suggests increasing the active thread pool size, which corresponds to the Simultaneous Requests setting in the ColdFusion Administrator. While that might be a short-term band-aid, in most cases it will simply delay the onset of the symptoms rather than truly solve the problem.

To back up for a moment, I'll take a look at some of the settings in the JRun server's jrun.xml configuration file, found in the SERVER-INF directory for a given server instance. The threadWaitTimeout setting in jrun.xml controls how long a thread will wait in the queued request pool. The activeHandlerThreads setting is the same as the Simultaneous Requests setting in the CFAdmin. It is the maximum size of the running request pool, that is, the maximum number of requests that can be actively processed at any given moment. The maxHandlerThreads setting is the maximum total thread limit: the sum of the running request pool size (activeHandlerThreads) plus the maximum allowed queued threads. Restated, the maximum number of queued threads is equal to maxHandlerThreads - activeHandlerThreads.
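The relationship between these settings is simple arithmetic, and a small Python sketch may make it easier to check a given configuration. The function name and the example values here are illustrative assumptions, not anything read from a real jrun.xml.

```python
def queue_capacity(active_handler_threads: int, max_handler_threads: int) -> int:
    """Return the maximum number of threads that can wait in the queued pool.

    Follows the relationship described above:
    queued maximum = maxHandlerThreads - activeHandlerThreads.
    """
    if max_handler_threads < active_handler_threads:
        raise ValueError("maxHandlerThreads must be >= activeHandlerThreads")
    return max_handler_threads - active_handler_threads


# Illustrative values only (assumed for this example).
print(queue_capacity(active_handler_threads=500, max_handler_threads=1000))
```

With these example values, half of the total thread limit is available for running requests and the other half for queued ones.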

If you're getting this error, then the problem is most likely bottleneck requests that are shifting the throughput rate downward, so that more threads are added to the queued request pool than can be reasonably processed by the running request pool. The running request pool doesn't have to be completely occupied with bottleneck threads; there only need to be enough of them to cause a noticeable increase in the current number of threads in the queued request pool. This doesn't necessarily mean that the queued request pool has reached its maximum size, just that the threads in the queued pool weren't getting a chance to run within the threadWaitTimeout period.

To better diagnose the problem, try to find out what's in the running request pool. I recommend taking a series of three thread dumps about 15-30 seconds apart at the time this error is being reported, and then looking for differences between the running threads across the span of the three thread dumps to identify any that might be 'stuck' on something. The running threads are usually identifiable by having a 'jrpp-' prefix on the thread id number and a reference to one of your application templates in the stack, such as those with a .cfm or .cfc extension.
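The comparison across three thread dumps can be roughed out in a few lines of Python. This is only a sketch under stated assumptions, not a replacement for a real stack trace analyzer: it assumes each thread starts with a header line beginning with the quoted thread name, that stack frames start with "at ", and that a blank line separates threads. The function names are hypothetical.

```python
import re

# Assumed header shape: a line starting with the thread name in double quotes.
THREAD_HEADER = re.compile(r'^"(?P<name>[^"]+)"')


def parse_dump(text):
    """Map each thread name to a tuple of its stack frame lines."""
    threads = {}
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        header = THREAD_HEADER.match(line)
        if header:
            current = header.group("name")
            threads[current] = []
        elif not stripped:
            current = None  # a blank line ends the current thread
        elif current is not None and stripped.startswith("at "):
            threads[current].append(stripped)
    return {name: tuple(frames) for name, frames in threads.items()}


def is_template_frame(frame):
    """True if the frame refers to a .cfm or .cfc template."""
    return ".cfm" in frame or ".cfc" in frame


def find_stuck_threads(dumps):
    """Return jrpp- threads whose template stack is identical in every dump."""
    parsed = [parse_dump(text) for text in dumps]
    if not parsed:
        return []
    stuck = []
    for name, frames in parsed[0].items():
        if not name.startswith("jrpp-"):
            continue  # only web request threads are of interest here
        if not any(is_template_frame(frame) for frame in frames):
            continue  # skip idle threads that are not running a template
        if all(other.get(name) == frames for other in parsed[1:]):
            stuck.append(name)
    return sorted(stuck)
```

Passing the text of the three dumps, in order, to find_stuck_threads returns the names of the jrpp threads whose stacks never changed. A thread that shows up in this list is only a candidate for closer reading; an identical stack can also mean a request that is legitimately slow.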


An easier way to take ColdFusion thread dumps on Windows

Thread dumps are often used as a diagnostic utility to qualify and quantify the page requests running in a ColdFusion MX or JRun server. This is most useful for servers that appear to be hanging or spiking the CPU.

Brandon Purcell demonstrates how to generate ColdFusion MX thread dumps, also known as stack traces, while running it as a Windows Service rather than from the command line.

Obtaining a Thread Dump with ColdFusion or JRun running as a Windows Service
While there is a limitation if using Terminal Services, this method will otherwise allow you to attach to a running jvm process to take a thread dump.



What to do with a ColdFusion thread dump once you've got it? This Macromedia technote on Debugging Stack Traces in ColdFusion MX helps make sense of it. In my experience, you have to read quite a lot of thread dumps before they start to look warm and fuzzy enough to be familiar, but that is when they start to offer up some of their secrets :)

Using AutoAdminLogin to automatically start ColdFusion on the commandline instead of Service

Here's a solution for configuring Windows to automatically log on as Administrator and then start ColdFusion MX from the commandline using a batch file, all without any user interaction. This solution is a response to a customer who wanted this configuration so that when their machine goes through a scheduled reboot overnight, it can start itself in a manner that allows ColdFusion to run in a mode that will permit taking a thread dump at a later time when a user is around. To rephrase, the only value of this configuration is to be able to start ColdFusion in console mode after a machine reboot when no one is around.

Note that configuring Windows to automatically log in as an Administrator account is a very serious security hazard, and you probably shouldn't do this unless the machine is in a locked server room and you hold the keys. There may be other security risks I haven't thought of. However, it can be done. I don't know how to automatically lock the screen while the Administrator is automatically logged in, but if you know, please add a comment. In order for the cfstart.bat file to run in an interactive mode such that someone can come along and hit CTRL+BRK to generate a thread dump, a user must be logged in so that the console is accessible. The screen can be locked, but the user can't log off.

To get started, I followed the instructions listed in this Microsoft article on how to use AutoAdminLogon. This article assumes the following name/value pairs are set in the Windows Registry:

  • AutoAdminLogon
  • DefaultDomainName
  • DefaultUserName
  • DefaultPassword


However, I found that my system did not have the entry for DefaultPassword, so I created it and entered the password in clear text, which the article says is how it should be stored. With AutoAdminLogon set to 1, and DefaultDomainName, DefaultUserName, & DefaultPassword all set, I rebooted the machine and confirmed that it automatically logged me on. Whenever tampering with the Windows Registry, be sure to export either the individual key or the whole Registry first to use as a backup.

Then I created a new bat file in C:\CFusionMX7\bin called cfstart_output.bat. The only entry in this bat file was
cfstart.bat >> cfstart_outputfile.txt
Then I dragged the cfstart_output.bat file to the Start menu in Start > Programs > Startup. Programs in Startup will run as soon as a user logs in.

The ColdFusion MX 7 Application Server Service should be set to Manual or Disabled if you're planning to start while using the bat file.

Rebooting the system, I found it automatically logged me on and started ColdFusion MX 7 on the commandline, but since output was being appended to a file, the console window remained blank. I tested CTRL+BRK a couple of times, then used CTRL+C to stop the ColdFusion console window. I then examined the cfstart_outputfile.txt file to confirm that it did contain 2 thread dumps.

Again, this is a very insecure configuration so think twice before doing this. But if you're ok with this, it will work.

See also:
An easier way to take ColdFusion Thread Dumps

CFUNITED Survivor Game Questions

Here are the ColdFusion challenges used in the CF Survivor game at CFUNITED 2005. Kudos to winner Daniel Elmore and runner up Joseph Danziger.

1) Output to the screen the ColdFusion server product version, including the build number.

2) Find the appropriate UDF from cflib.org to parse the search terms from a Google referrer, and modify it to work with Yahoo search referrer.



3) Create a ColdFusion custom function or use the built-in functions so that when called it takes a string as input and returns the string in opposite order, such as input "dog" returns "god".

[More]

Post mortem ColdFusion Issue Analysis

Here's a typical type of ColdFusion support ticket that we get in Macromedia Support. In this case, the server stopped responding or crashed over a weekend, and I was sent the log files and server settings to review for clues about what happened.

I'm providing this just as an example of how I go about drawing conclusions and reconstructing the events that transpired. Maybe it will help you when thinking about troubleshooting your own CF servers if needed. This case isn't complete, so when further progress is made, I'll try to update the critical info here. Names and private info have been removed.

Background: this is a ColdFusion MX 6.1 Updater 1 server configuration on Windows 2000 and IIS 5.

Problem Report

We had one of our CF sites go down this weekend and we don't know why the services were simply offline. I'd like some help doing a post-mortem analysis to figure out why the services stopped.

[More]

Connection Pooling Deadlocks with User-based Security Models

This is an explanation of why someone with a ColdFusion application that uses User-based Security Models can get themselves into a connection pool deadlock, resulting in "Timeout trying to establish connection" errors even when the database sees no requests for connections coming in.

This follows up on a previous blog entry here that describes User-based Security Models with regard to connection pooling in general. If you're reading this and are confused, try reading the other blog entry first.

[More]

java.lang.OutOfMemoryError: unable to create new native thread

In tests that were done a couple of years ago with ColdFusion MX, it was found that there is an upper limit on the number of operating system threads that can be opened by one process, not to be confused with the number of file handle descriptors or JVM threads. When that OS thread limit is reached for the ColdFusion jrun process, the java.lang.OutOfMemoryError: unable to create new native thread error is thrown.

The results from testing on the Linux platform with Red Hat 7.x demonstrated the error when approximately 400 OS threads were tied to one OS process, the JVM process. Current Linux versions are thought to have improved that somewhat, so the limit might be higher at this time.

Solaris also demonstrated a limit, but had a higher threshold of about 1000-2000 OS threads per process, and Windows showed a limit of about 3000 at that time.

Given that the JVM process under ColdFusion/JRun has at least one operating system thread open for every JVM web thread, plus other operating system threads that are not tied to a JVM web thread, it is very likely that this error will occur if the total number of web threads (queued + running) begins to approach that experimental threshold of about 400 on Linux, for example.
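As a rough way to reason about this, the headroom can be estimated with simple arithmetic. The Python sketch below assumes one OS thread per web thread, as described above; the function name is hypothetical, the count of other threads is a guess you would have to measure yourself, and the limit is only the approximate experimental figure quoted above, not a documented constant.

```python
def native_thread_headroom(max_web_threads: int, other_os_threads: int, os_limit: int) -> int:
    """Estimate how many OS threads remain before the per-process limit.

    Assumes one OS thread per web thread (queued + running) plus some
    number of other OS threads in the same JVM process. A negative result
    means the configuration could exceed the limit under full load.
    """
    return os_limit - (max_web_threads + other_os_threads)


# Illustrative values: 1000 total web threads, a guessed 50 other threads,
# and the approximate Linux threshold of 400 mentioned above.
print(native_thread_headroom(max_web_threads=1000, other_os_threads=50, os_limit=400))
```

A negative number here does not mean the error will occur, only that the thread settings allow the process to grow past the approximate limit if enough requests pile up.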

[More]

activeHandlerThreads or Simultaneous Requests: Less is More

I'm recycling this explanation which I just emailed to someone about why setting the JRun activeHandlerThreads value in jrun.xml to 500 is not a good idea. To ColdFusion users this setting is better known as Simultaneous Requests.

Current values that you've set for the ColdFusion server's JRunProxyService in jrun.xml are far from default:

activeHandlerThreads: 500
Maximum number of web threads that are actively running.

maxHandlerThreads: 1000
Maximum number of web threads queued + running

minHandlerThreads: 100
Minimum number of web threads to have around at all times


As a rule of thumb, I recommend that you set these to:

[More]
