I'm recycling an explanation I just emailed to someone about why setting the JRun activeHandlerThreads value in jrun.xml to 500 is not a good idea. ColdFusion users know this setting better as Simultaneous Requests.
The current values that you've set for the ColdFusion server's JRunProxyService in jrun.xml are far from the defaults:
activeHandlerThreads: Maximum number of web threads that are actively running.
maxHandlerThreads: Maximum number of web threads queued + running.
minHandlerThreads: Minimum number of web threads to have around at all times.
As a rule of thumb, I recommend that you set these to:
activeHandlerThreads: 20-100 for 4 CPUs, depending on whether you have a query-intensive application (higher) or a CPU-intensive application (lower).
maxHandlerThreads: 300-500. Macromedia tests with Red Hat Linux 7.2 demonstrated that no more than about 400 OS threads could be spawned from a single process. Beyond that, the OS returns an error to the JVM, and the JVM then reports that it cannot create new native (OS) threads: "java.lang.OutOfMemoryError: unable to create new native thread" (which actually has nothing to do with being out of memory). Red Hat 8 and more recent distributions are thought to have improved this somewhat. Solaris has been shown to permit 1000-2000 OS threads per process, and Windows up to about 3000.
minHandlerThreads: 1. This is the default setting. JRun doesn't collect idle jrpp threads all that fast, so under normal load there should always be many web threads that have recently finished and are waiting to accept another request. After 15 or 30 minutes of idle time you might see the count drop down to the minimum.
As with web application performance tuning in general, professional load testing is the best way to determine how to tune server settings. Once the settings are at your best rough estimates, run load tests while changing just one parameter at a time; in ColdFusion applications, Simultaneous Requests is one of the most important settings. As you test at different values, a peak throughput rate can be observed, even graphically. Once the peak of the throughput curve is found, setting Simultaneous Requests/activeHandlerThreads lower can cause queuing because you're not letting the server do as much as it optimally can. Setting it too high can cause excessive context switching, making the server do more than it really should. Graph the curve to find the peak, and stay there.
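Here's a minimal sketch of the "find the peak" step, assuming you've already collected one throughput number per tested activeHandlerThreads value. The class name, method name, and the numbers in main are all made up for illustration; they are placeholders, not measurements from any real server.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: picks the tested setting with the best throughput. */
class ThroughputPeak {

    /**
     * @param measured tested activeHandlerThreads value -> observed requests/second
     * @return the setting with the highest throughput; ties go to the lower setting
     */
    static int peakSetting(Map<Integer, Double> measured) {
        if (measured.isEmpty()) {
            throw new IllegalArgumentException("no load test results supplied");
        }
        int best = -1;
        double bestThroughput = Double.NEGATIVE_INFINITY;
        // A TreeMap iterates settings in ascending order, and the strict ">"
        // below means a tie is won by the lower setting.
        for (Map.Entry<Integer, Double> e : new TreeMap<>(measured).entrySet()) {
            if (e.getValue() > bestThroughput) {
                bestThroughput = e.getValue();
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Placeholder numbers for illustration only, NOT real measurements.
        Map<Integer, Double> results = new TreeMap<>();
        results.put(10, 180.0);
        results.put(20, 310.0);
        results.put(40, 355.0);
        results.put(80, 290.0);
        results.put(160, 150.0);
        System.out.println("Peak at activeHandlerThreads = " + peakSetting(results));
    }
}
```

Preferring the lower setting on a tie is a design choice on my part: if two values give the same throughput, the one with fewer running threads leaves less room for context switching.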
Most web threads should complete quickly, e.g. in < 300 ms, and the activeHandler pool will process that number of threads all at the same time and swap in queued threads as needed. The activeHandler pool keeps a steady number of threads running, and the queue buffers the variations in load. Having the activeHandler/running pool set way too high causes excessive context switching, where the CPU does more work managing the list of running threads than actually executing them. That context switching will significantly decrease throughput and cause more queuing. Setting the activeHandler/running pool size low ensures that the CPU spends more time executing the threads than managing them.
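To make the running-pool-plus-queue idea concrete, here's a small sketch using the standard java.util.concurrent.ThreadPoolExecutor. This is an analogy, not JRun's actual scheduler: the class name, the 5 ms sleep standing in for a request, and the pool sizes are all assumptions for illustration.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical demo: a fixed "running" pool fed by a bounded queue. */
class RunningPoolDemo {

    /**
     * Pushes `requests` short tasks through `active` worker threads and
     * returns the highest number of tasks seen running at the same time.
     * `requests` must not exceed active + queueCapacity, or the executor
     * rejects the overflow.
     */
    static int peakConcurrency(int active, int queueCapacity, int requests) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                active, active, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(queueCapacity));
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < requests; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(5); // stand-in for a short request
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    running.decrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(30, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return peak.get();
    }

    public static void main(String[] args) {
        // 100 requests arrive at once; only 4 ever run together,
        // and the other 96 wait their turn in the queue.
        int peak = peakConcurrency(4, 96, 100);
        System.out.println("Peak concurrent tasks: " + peak + " (never more than 4)");
    }
}
```

However big the burst, the number of tasks executing at once stays capped at the pool size, and the queue absorbs the rest.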
> > apache connector timeouts and various nasty error messages in the log files
Regarding this, I would speculate that just setting activeHandlerThreads so high will automatically slow things down, contrary to intuition. Right away that sets things up to run slowly and cause timeout- and hang-related problems.
You should run JRun metrics to evaluate the total/busy number of threads running during tests where activeHandlerThreads is low. If you see the busy count staying constant and the total growing and growing until it reaches the maxHandlerThreads size, then you probably have a bottleneck in the application. For this you would want to take 2 thread dumps from the time when the queue/total is very high. There will be many jrpp threads there, perhaps as many as maxHandlerThreads, but look for the ones showing your JSPs in the stack. Pull those stacks out into a separate file. Compare the stacks from two thread dumps taken about 15 seconds apart. If you see that jrpp-nnnn is running foo.jsp on line 10, where it's in a database socketRead method call at the top of the stack, and you see in thread dump 2 that jrpp-nnnn is still on foo.jsp line 10 in socketRead, then what you have is a query operation that is taking a long time for the database to return. This might be a bottleneck.
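The comparison of two thread dumps can be roughly automated. Here's a sketch, assuming the usual JVM dump layout (a quoted thread name on a header line, followed by indented "at ..." frames and a blank line). The class and method names are hypothetical, and the sample dumps in main are invented for illustration, not taken from a real server.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: finds jrpp threads with the same stack in two dumps. */
class StuckThreadFinder {

    /** Parses a thread dump into thread name -> stack frames (top frame first). */
    static Map<String, List<String>> parse(String dump) {
        Map<String, List<String>> stacks = new LinkedHashMap<>();
        List<String> current = null;
        for (String line : dump.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.startsWith("\"")) {
                int end = trimmed.indexOf('"', 1);
                current = null;
                if (end > 0) {
                    current = new ArrayList<>();
                    stacks.put(trimmed.substring(1, end), current);
                }
            } else if (trimmed.startsWith("at ") && current != null) {
                current.add(trimmed.substring(3));
            } else if (trimmed.isEmpty()) {
                current = null; // a blank line ends the current thread's stack
            }
        }
        return stacks;
    }

    /**
     * Returns jrpp threads whose stack mentions `marker` (e.g. ".jsp") and is
     * identical in both dumps, which suggests they have not moved in between.
     */
    static List<String> stuck(String firstDump, String secondDump, String marker) {
        Map<String, List<String>> first = parse(firstDump);
        Map<String, List<String>> second = parse(secondDump);
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : first.entrySet()) {
            List<String> stack = e.getValue();
            boolean inAppCode = stack.stream().anyMatch(f -> f.contains(marker));
            if (e.getKey().startsWith("jrpp-") && inAppCode
                    && stack.equals(second.get(e.getKey()))) {
                result.add(e.getKey() + " still in " + stack.get(0));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Invented sample dumps; real frames and class names will differ.
        String dump1 = "\"jrpp-12\" prio=5 runnable\n"
                + "\tat java.net.SocketInputStream.socketRead0(Native Method)\n"
                + "\tat example.FooPage.service(foo.jsp:10)\n\n"
                + "\"jrpp-13\" prio=5 runnable\n"
                + "\tat example.BarPage.service(bar.jsp:3)\n";
        String dump2 = dump1.replace("bar.jsp:3", "bar.jsp:8");
        System.out.println(stuck(dump1, dump2, ".jsp"));
    }
}
```

An identical stack in both dumps is only a hint: the thread could have finished one request and be at the same spot in another. Treat the output as a list of places to look, then confirm with a third dump or with the database's own monitoring.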
Use thread dumps and metrics to assess how the server is performing and what it's doing when it seems slow. Then track down the code and method calls and isolate them carefully. Find out what in them is slow and look for ways to improve them.
It's a common mistake to run an application with bottlenecks and, instead of finding and fixing the bottlenecks, just try to open the throttle wider, e.g. by setting activeHandlerThreads high. That usually won't work.
For a related discussion, see my other blog entry, Simultaneous Requests in CFMX 7
- JRun Threadpool settings, by Rupesh Kumar, ColdFusion Development