JVM Memory Management and ColdFusion Log Analysis

The following is a document I wrote for knowledge sharing with some peers, but I feel that it might have some value to other ColdFusion Devs, Testers, and Admins out there. The purpose was to illustrate how I went about analyzing CF's performance during a prior troubleshooting session. I'm re-purposing the content here after scrubbing some private information. Hopefully it still makes sense, although slightly out of context.

These are some technical notes on what to look for when analyzing ColdFusion server performance history. They cover concepts and techniques for assessing what performance-related problems might exist, emphasizing memory usage issues first. This is a somewhat simplified explanation of how the JVM manages memory and how that relates to CF applications, and of how I went about analyzing them. There are many similar resources on the web, but I found many of them quite technical, so this article is written with more of a layman's approach to make it more digestible to those not as familiar with troubleshooting ColdFusion or Java apps.

The ColdFusion Application Server runs inside (is "contained" in) a higher level JRun J2EE server. The J2EE server (and therefore the CF server) runs on top of a JVM (Java Virtual Machine). To analyze ColdFusion and JVM performance, I take a forensic approach. I start by collecting the ColdFusion and JRun server logs. I also collect the JVM Garbage Collection (GC) log, which has to be manually enabled and which records how the JVM is cleaning up the memory that it has used. The JVM is configured with an algorithm that tells it what approach to take when cleaning up and freeing memory. The application's Java objects (like queries, session variables, local variables, CFC instances, etc.) are held in the JVM's memory. Objects are said to hold "references" in memory, meaning that something in the application is potentially using that object. When the application no longer has a need for an object, its memory is dereferenced. That dereferenced memory can be released by the JVM and then reused by other objects that require it.
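
As a rough illustration of the kind of first pass one might make over a GC log, the sketch below counts Full GC events and adds up their pause times. It assumes the simple one-line format that a Sun HotSpot JVM of that era writes with -verbose:gc; the function name summarize_full_gc and the sample lines are made up for illustration, and a log written with more detailed options would need a different pattern.

```shell
#!/bin/sh
# Hypothetical helper: summarize Full GC activity from a GC log on stdin.
# Assumed line shape (simple -verbose:gc output):
#   [Full GC 267628K->83769K(776768K), 1.8479984 secs]
summarize_full_gc() {
    awk '
        /Full GC/ {
            count++
            # The pause time is the field just before the one starting with "secs".
            for (i = 2; i <= NF; i++)
                if ($i ~ /^secs/) total += $(i - 1)
        }
        END { printf "full_gcs=%d total_pause=%.2f\n", count, total }
    '
}

# Usage with made-up sample lines:
printf '%s\n' \
    '[GC 325407K->83000K(776768K), 0.2300771 secs]' \
    '[Full GC 267628K->83769K(776768K), 1.8479984 secs]' \
    '[Full GC 300000K->90000K(776768K), 0.6520016 secs]' |
    summarize_full_gc
```

Frequent Full GC events with long pauses are the sort of pattern that would merit a closer look at heap usage.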


ColdFusion 9.01 Server Monitoring Enhancements

I began this thought as a comment on Adobe CF QA engineer Sagar Ganatra's blog entry describing the new Server Monitor enhancements in the ColdFusion 9.01 updater. However, as it grew lengthy, I decided my own blog post would be a more appropriate venue.

I'll add that the main reason to run the ColdFusion Server Monitor on its own port via the Jetty implementation is that, until now, requests from the Server Monitor went through the JRPP request pool. That added traffic to the JRun active request pool, but more importantly, if the JRun active request pool was queuing, then the data refreshes in the Server Monitor would also queue and the Server Monitor might appear to hang as well. With a separate request pool and port for Server Monitor requests in ColdFusion 9.01, the Server Monitor will no longer be blocked as it could be previously.

Any general discussion of the Server Monitor should include the caveat that Profiling, Monitoring, and Memory Tracking are not intended for production use (blanket statement: see comments for more on that). Moreover, if Memory Tracking is enabled in production, perhaps to help diagnose a prod performance problem, it will only further decrease server performance. On 3 occasions in the last year alone I've helped ColdFusion shops that shot themselves in the foot by doing this. Performance was substantially worse with tracking enabled, and having them disable it immediately alleviated most of the problems, albeit not the problem that initially prompted them to enable the tracking.

Nice enhancements to Server Monitor in the future might include:
  • Persistent Metrics: The ability to persist the Server Monitor data to a database. As of now, a restart of CF clears the data. A use case would be load testing scenarios where a variety of metrics need to be quantitatively analyzed, which cannot presently be done easily. Ideally, you'd want to know performance metrics at different points in a load test, such as during the ramp up, after X minute intervals, and during cool down.

    A second use case would be the ability to produce reports to monitor server health over time, perhaps by providing the ability to generate weekly reports of key data, possibly with green and red arrow indicators to visually identify metrics that have improved or worsened.

  • CF Request Pool Ratios: Add the ability to analyze incoming requests to determine the real time ratio of CF request types such as CF Templates, CFCs, Flash Remoting, and Web Services. Expressed as percentages, the ratios could be used to determine the values to use in the Request Tuning part of the ColdFusion Administrator. In my experience, CF shops rarely set the Request Tuning values to an appropriate range, either leaving them at the defaults or increasing them way out of range (into the hundreds even).

  • Request Tuning Calculator: Provide a real suggested starting point for all Request Tuning parameters, including the JRun active and queued sizes, based on the number of CPUs/cores and processing speeds. Presently, even in CF 901, a server installs with a default set of values that will be the same on a small box as on a beefy production box. To do this correctly, the total number of instances used to process production load would also have to be incorporated for proper per-instance tuning. Having a Server Monitor Request Tuning Calculator would be a big plus towards helping server admins find the right starting range for their particular hardware.

  • Out of Process Memory Tracking: Since Memory Tracking is known to consume significant resources while tracking (a Heisenberg conundrum?), perhaps Memory Tracking could be done out of process over RMI or similar, akin to the JRockit Mission Control memory analyzer (which I've used for ColdFusion, but interpreting the data is not very intuitive).
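
To make the CF Request Pool Ratios idea concrete, here is a minimal sketch of the arithmetic, assuming you already have a plain list of requested URLs (one per line) pulled from a web server log. The function name request_mix and the URL patterns used to tell the request types apart are assumptions for illustration; in particular, SOAP calls that post to a .cfc without ?wsdl would be counted as plain CFC requests here.

```shell
#!/bin/sh
# Hypothetical helper: classify requested URLs (one per line on stdin) and
# print each request type as a percentage of the classified total.
request_mix() {
    awk '
        /\.cfc\?wsdl/                              { n["webservice"]++; next }
        /\/flashservices\/gateway|\/flex2gateway/  { n["remoting"]++;   next }
        /\.cfc/                                    { n["cfc"]++;        next }
        /\.cfm/                                    { n["template"]++;   next }
        END {
            total = n["template"] + n["cfc"] + n["remoting"] + n["webservice"]
            if (total == 0) { print "no CF requests found"; exit }
            printf "template=%.0f%% cfc=%.0f%% remoting=%.0f%% webservice=%.0f%%\n",
                100 * n["template"] / total, 100 * n["cfc"] / total,
                100 * n["remoting"] / total, 100 * n["webservice"] / total
        }
    '
}

# Usage with made-up URLs (quoted so the shell does not treat "?" as a glob):
printf '%s\n' '/index.cfm' '/list.cfm' '/api/user.cfc?method=get' '/flex2gateway/' |
    request_mix
```

The percentages are only a starting point for the Request Tuning values; they say nothing about how long each request type runs.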

Issue with stopping ColdFusion after starting from Builder

With the release of ColdFusion Builder there is an option available that provides the ability to start and stop one or more ColdFusion servers from ColdFusion Builder. In fact, you can configure CF Builder to automatically start a CF server when Builder opens, and stop the CF server when Builder closes. The autostart/autostop is convenient for a Development box where you want to minimize resource usage on the system. You can read more about this feature here.

Server Panel in ColdFusion Builder

Server Settings Panel in ColdFusion Builder showing Auto start/stop for CF Servers

However, if you don't enable the automatic start/stop option and you ever start the ColdFusion server from Builder and then close Builder without stopping ColdFusion there, you will later be unable to stop the ColdFusion server using the standard ColdFusion stop script. I've encountered this on Mac OS X, but since it's possible to configure CF Builder to start/stop remote CF servers, it's likely that the problem also occurs when using ColdFusion server on Linux or Solaris, even though Builder doesn't run on those platforms.

Normally, to stop / start the ColdFusion server from the command line, you would use the control script located (typically) at /opt/coldfusion9/bin/coldfusion, such as with ./coldfusion stop. That control script in turn invokes /opt/coldfusion9/runtime/bin/coldfusion9. When calling stop, the control script works by first searching for any running ColdFusion processes with fgrep, like this: ps -axc | fgrep coldfusion9. If it finds a process listing that matches the string "coldfusion9" then it stops that process.

Here's what you might see if you try to restart ColdFusion from the command line after it was started but not stopped from Builder:

$ /opt/ColdFusion9/bin/coldfusion restart
Restarting ColdFusion 9...
ColdFusion 9 does not seem to be currently running
Starting ColdFusion 9...
The ColdFusion 9 server is starting up and will be available shortly.
There has been an error starting ColdFusion 9, please check the logs.

The problem of not being able to use that control script to stop ColdFusion server after having started it from Builder arises because of how Builder starts the CF server. Rather than invoking /opt/coldfusion9/runtime/bin/coldfusion9, Builder instead invokes /opt/coldfusion9/runtime/bin/jrun. When the control script tries to grep for the process with a "coldfusion9" string, the control script doesn't find it because Builder invoked runtime/jrun instead of runtime/coldfusion9.
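
The mismatch is easy to reproduce without a running server. The sketch below feeds two made-up process listings to the same kind of fixed-string search the control script performs. The sample lines and the function names are hypothetical, and grep -F is used as the equivalent of fgrep.

```shell
#!/bin/sh
# Made-up process listings in the style of `ps -axc` (command name only).
started_by_script='  812 ??  S  0:42.10 coldfusion9'
started_by_builder='  934 ??  S  0:40.03 jrun'

# The stock check: succeed only if the listing contains "coldfusion9".
stock_check() {
    printf '%s\n' "$1" | grep -F coldfusion9 > /dev/null 2>&1
}

# A broader check that also accepts the jrun launcher name (illustrative only).
broader_check() {
    printf '%s\n' "$1" | grep -E 'coldfusion9|jrun' > /dev/null 2>&1
}

for listing in "$started_by_script" "$started_by_builder"; do
    if stock_check "$listing"; then echo "stock check: found"; else echo "stock check: not found"; fi
    if broader_check "$listing"; then echo "broader check: found"; else echo "broader check: not found"; fi
done
```

Note that matching on the bare name "jrun" would also catch an unrelated standalone JRun server on the same machine, so a real fix needs a more specific pattern than this illustration uses.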

Why the need for runtime/jrun AND runtime/coldfusion9? I have no idea, especially since the files are identical and not symlinked.

$ pwd
$ diff jrun coldfusion9

I logged ColdFusion server bug 82573 for this where I proposed a change to the bin/coldfusion control script. My suggested change was only shown for Mac OS X, but you can easily change it yourself for the Linux or Solaris blocks in a similar way.

If you want to use my suggested fix on your local Mac OS X dev box, then you can refer to the full example control script containing the fix here: http://pastebin.com/Y7r6sDGu.

For brevity, I won't show the whole script in this blog entry. Instead, here's the diff between the backed up original coldfusion control script which I renamed to 'orig.coldfusion' compared to the fixed version 'coldfusion'.

$ diff orig.coldfusion coldfusion
13a14
>
34c35
<     $PSCMD | fgrep coldfusion9 > /dev/null 2>&1
---
>     $PSCMD | grep -i $JRUN_BIN | grep -v 'grep' > /dev/null 2>&1
117c118,119
<       $PSCMD | fgrep coldfusion9 | awk '{print $1}' | xargs kill -9 > /dev/null 2>&1
---
>      $PSCMD | grep -i $JRUN_BIN | grep -v 'grep' | awk '{print $1}' | xargs kill -9 > /dev/null 2>&1
>
130,131c132,133
<        $PSCMD | fgrep coldfusion9 | awk '{print $1}'
<     fi
---
>         $PSCMD | grep -i $JRUN_BIN | grep -v 'grep' | awk '{print $1}'
>            fi
152c154
<        PSCMD="ps -axc"
---
>        PSCMD="ps -ef"

A ColdFusion Trick for Lost Datasource Password

Here's a quick trick if you don't have a datasource password when creating a new datasource but you do have another ColdFusion server with the same datasource.

Imagine you have two production servers running ColdFusion, each one with different datasources running different applications. What if you have a datasource on one server and you need to create that datasource on the second one but can't find (or don't have) the database password?

All recent ColdFusion versions use the same encryption algorithm for encrypting and decrypting passwords for datasources registered in the CF Administrator. This is why you can copy ColdFusionX/lib/neo-datasource.xml from one ColdFusion 8 server to another ColdFusion 8 server, and the second server will have all the same datasources as the first. This is a quick way to mirror datasources across different ColdFusion servers.

But now, back to the problem where you have different datasources on each CF server, and you cannot copy over the whole datasource config file. If you don't have the database password, you can create a new datasource on the second server without supplying a password. The datasource will then fail to verify. However, if you examine the datasource config file from the first server you can find the encrypted version of the password. A snippet from the ColdFusion8/lib/neo-datasource.xml file is shown below. Notice the encrypted version of the password in this XML section:

<var name="timeout">
<var name="password">
<var name="update">
<boolean value="true"/>
<var name="drop">
<boolean value="true"/>
<var name="pooling">
<boolean value="true"/>
<var name="url">

In this case the particular datasource has an encrypted version of the password shown as RgmrmRQhiQM=. You could find the datasource of interest in the config file, then find the encrypted version of the password, and copy it to the other neo-datasource.xml config file on the other server. Find the XML node for the failed datasource. It should have no value for the contents of the password field:

<var name="password">

Then paste the encrypted version of the password in between:

<var name="password">

For this to work, the ColdFusion server where you are pasting the password should be stopped, to avoid having ColdFusion overwrite your changes with the copy it already has in memory. Start ColdFusion after pasting and the datasource will verify.

This can also work between ColdFusion versions. For example, ColdFusion MX 7 used neo-query.xml, and ColdFusion 8 restructured the file into neo-drivers.xml and neo-datasource.xml, but the encryption remained the same. You can copy the encrypted form of the password from a CF7 server and paste it into a CF8 or CF9 datasource config file.

This is a bit of a hack, but it does work.

Starting ColdFusion9 Solr: Using cfsolr in same directory

The cfsolr script for Mac, Linux, and Unix is written such that you must be in the ColdFusion9/solr/ directory when running the script. The script refers to the start.jar file without providing the full path.

The problem is that if you are not in the solr/ directory under the ColdFusion root directory, the cfsolr script echoes that Solr has been started or stopped, even though it has not.

Since standard error is redirected to standard out with "2>&1", and standard out is itself sent to a log file, the error never reaches the terminal. The problem is swallowed and the person performing the operation is led to believe that the operation has been carried out as expected.

Here's a snippet from the ColdFusion9/solr/cfsolr script showing that start.jar is referenced without a full path:

SOLRSTART='nohup java $JVMARGS -jar start.jar > $SOLR/logs/start.log 2>&1 &'
SOLRSTOP='nohup java $JVMARGS -jar start.jar --stop > $SOLR/logs/start.log 2>&1'

Looking at the logs, I see that the problem was quietly recorded in a solr log file:

QAs-iMac:logs QA$ pwd
QAs-iMac:logs QA$ cat start.log
Unable to access jarfile start.jar

The script already has a variable defining the Solr directory path:


To fix the bug, prefix the reference to start.jar with ${SOLR}/ so that it reads ${SOLR}/start.jar, like this:

SOLRSTART='nohup java $JVMARGS -jar ${SOLR}/start.jar > $SOLR/logs/start.log 2>&1 &'
SOLRSTOP='nohup java $JVMARGS -jar ${SOLR}/start.jar --stop > $SOLR/logs/start.log 2>&1'

With that fix, the cfsolr script can be called from any directory outside the solr directory.
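
A complementary safeguard, sketched below, is to verify that start.jar is actually reachable before reporting success, so that a wrong location produces a visible error instead of a silent entry in start.log. This is not part of the shipped cfsolr script; the function name check_solr_jar and the example path are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical guard: fail loudly when start.jar is missing from the given
# Solr directory, instead of letting the error disappear into a log file.
check_solr_jar() {
    solr_dir=$1
    if [ ! -f "$solr_dir/start.jar" ]; then
        # Write to stderr so the operator sees the problem on the terminal.
        echo "start.jar not found in $solr_dir" >&2
        return 1
    fi
    return 0
}

# Usage: only announce the start when the jar is really there.
if check_solr_jar /opt/ColdFusion9/solr; then
    echo "start.jar found, safe to launch Solr"
fi
```

A check like this could run just before the script echoes its "starting up" message.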

Here is an example of how the script falsely echoes that the Solr server has stopped or started when it has not (determined by grepping for the process):

QAs-iMac:opt QA$ pwd
QAs-iMac:opt QA$ ./ColdFusion9/bin/coldfusion stop
Stopping ColdFusion 9, please wait
Stopping coldfusion server.stopped
ColdFusion 9 has been stopped
QAs-iMac:opt QA$ ps -ef | grep solr
 501 73310 1 0 0:00.25 ?? 0:02.64 /usr/bin/java -XX:+AggressiveOpts -XX:+ScavengeBeforeFullGC -XX:-UseParallelGC -Xmx256m -Dsolr.solr.home=multicore -DSTOP.PORT=8079 -DSTOP.KEY=cfsolrstop -jar start.jar

QAs-iMac:opt QA$ ./ColdFusion9/solr/cfsolr start
Starting ColdFusion Solr Server...
ColdFusion Solr Server is starting up and will be available shortly.
QAs-iMac:opt QA$ ps -ef | grep solr
 501 78371 62961 0 0:00.00 ttys000 0:00.00 grep solr
QAs-iMac:opt QA$ ps -ef | grep solr
 501 78373 62961 0 0:00.00 ttys000 0:00.00 grep solr
QAs-iMac:opt QA$ ps -ef | grep solr

Adobe LiveCycle DataServices for ColdFusion at CFObjective

Allaire's CEO, David Orfao

After a decade of working intensely with the ColdFusion server, I'm finally getting the courage to start presenting about it on the conference circuit. As a blogger, tweeter, and contributor to mailing lists I'm very confident helping others solve ColdFusion related problems because I can do that from the quiet comfort of my own desk. However, one of my greatest fears has always been public speaking. I'm the kind of person that feels like I need to know the subject matter cold, so that I can speak from the hip without relying on looking at the slides.

Blackstone Test CDs

Over the years, I had some opportunities to present to small groups, and I recall each time feeling the adrenalin surge and my heart pounding. That started with presenting ColdFusion for Unix and Linux as an internal training class at Macromedia. Later, while taking classes at the Harvard Extension School, I was honored to be asked to present to CSCI-253 Developing Web-Based Database Applications. Even more so, I presented twice there in one year: the first time on Building ColdFusion Web Applications with CFEclipse and Dreamweaver, and later on ColdFusion Server Administration.

MAX in Action

I've been attending ColdFusion conferences since the days of Allaire DevCon, but had never presented at any of them, including MAX. My long time friend in the local ColdFusion Community, Brian Rinaldi, continued to encourage me to present at the local Boston CFUG as a starting point, as well as at the new conference that he was organizing, RIA Unleashed, held at Bentley College this past November. The members of the CFUG were kind enough to let me present a draft of a presentation that I was to later give at RIA Unleashed. My presentation topic was Adobe LiveCycle DataServices Data Management for Mere Mortals.

ColdFusion 1.5 on Floppy Disks

Fortunately at RIA Unleashed I was among the very first sessions after the keynote, so there was no time to build up butterflies that morning. If beforehand you had told me that in the front row of the audience would be Ben Nadel, Simon Free, and Ray Camden, with Tom Jordahl tucked way in the back, then I surely would have freaked out. But they were kind enough to chat with me beforehand and even lend some technical assistance getting set up with the A/V, so that really put me at ease. With a firm limit of 50 minutes, I pushed all the way through what should have been a 90 minute talk, all the while trying to remember to speak clearly and loudly. The talk went off pretty much without a hitch as I found myself completely focused on the technical content and not at all worrying about the large room filled with people in front of me. I was delighted at the end when Tom complimented me on the talk, which to me was the ultimate satisfaction.

First Unix machine to run ColdFusion

I chose LCDS for ColdFusion as a topic because while I was a QA Engineer on the ColdFusion team at Adobe, I was paired with Tom, a Computer Scientist at Adobe who architected the integration between the products. Heck, Tom architected much of ColdFusion itself, and was in fact the original engineer to have ported ColdFusion to run on Unix and Linux back in the day. Tom is a font of information, and I cut my teeth on the feature under his guidance, which was then known as Flex Data Services and later renamed under the LiveCycle brand. I spent many days last summer and fall revisiting all the LCDS documentation again to ensure the quality of my presentation and to mentally prepare me for the upcoming conference.

ColdFusion Team, Bangalore

With my first conference under my belt, I decided to throw my hat into the ring for the ultimate ColdFusion experience, CFObjective, which is promoted as The Only Enterprise ColdFusion Conference. I'm excited to announce that I have been selected to be a speaker at the conference, which runs from April 22-24th in Minneapolis, Minnesota. The conference is divided into three tracks for technologies related to ColdFusion. I'll be speaking on the last day in the Flex track, once again on the topic of LiveCycle DataServices for ColdFusion Developers. Specifically I'll be talking about the prime feature of LCDS, the Data Management capabilities. With any luck I'll be updating my presentation to consider the benefits of working with the latest versions of Adobe software. Here's the brief description and the PDF:

Discussions of Adobe's LiveCycle Data Services are often entered with the same trepidation as those of Organic Chemistry or Quantum Mechanics, but with ColdFusion, building Web applications that manage complex data sets doesn't have to be that scary. Data Management is a pillar of LCDS that offers scalable, real-time data synchronization across very large numbers of connected clients with the benefits of conflict resolution and data pagination. Come learn how to quickly get up to speed with Data Management by letting ColdFusion do the hard work for you.

If you're seriously interested in ColdFusion, then CFObjective is the conference for you. I hope to see you there.

ColdFusion Screams

Is Your ColdFusion Support a Real Turkey?

Are you getting half baked help solving complex server problems? Is your global service provider a little off? Sure, maybe it's a lot off.

Webapper wants your feedbag feedback about what's important to you. Take this survey to let us know what you really want from ColdFusion support, and the first 100 respondents win a copy of SeeFusion Enterprise, a ColdFusion bottleneck analysis tool with all the trimmings.

Webapper's consulting practice is a one-stop shop for all of your ColdFusion support needs. Our service offerings have been developed and honed over many years, and through hundreds upon hundreds of successful engagements, and they deliver a full spectrum of expertise that covers the entire "Web application stack".

Webapper's consulting services practice is a world leader in ColdFusion support expertise: our engineers are all former employees from the consulting, support and engineering teams at Allaire/Macromedia/Adobe.

Ask a Photographer: Why buy a dSLR?

I'm often asked about my opinion on camera selection, i.e. "What camera should I buy?". That's a really hard question to answer because it all depends on what you want to do with the camera. Today a more narrow question was asked that's a little easier to discuss, and that is: "What are the advantages of a dSLR over a point-and-shoot camera?". Since my reply ran on a bit, I thought it would make a good blog topic, so here you go...

The first advantage of a dSLR that comes to mind is that the lenses are interchangeable. Lenses are often worth more than the camera body, for good reason, but sometimes you can find good inexpensive ones.

There are several factors that make a lens good. The most interesting is that the well built ones have a very wide maximum aperture, which is expressed as the F number. The lower the number, the wider the aperture and the larger the opening. This is important because wide apertures produce shallow depth of field (DOF). Shallow DOF causes the subject to be in sharp focus while other things in the foreground or in the background become blurry and abstract. The lower the number, the more blurry those other things become. This makes for a more artistic photograph. Bright items that are blurry typically produce a distinct pattern known as bokeh, and this is considered desirable in much photography.


Recent Tweets for Fri Oct 2, 2009 Part II

Follow me on Twitter!

Tue Sep 15 9:04 PM
@iotashan @rukumar Shan meet Rupesh. Rupesh meet Shan. You guys should talk CF9 ORM. ;-) Rupesh, Shan works with me & has an ORM issue
Tue Sep 15 8:51 PM
No CF Admin DSN setting for isolation level, but u can add SET TRANSACTION ISOLATION LEVEL <level> as u're validation query as workaround
Tue Sep 15 5:36 PM
Tue Sep 15 4:13 PM
@berniedolan Yup, and I was on a downhill at 35mph, slowed to 15 then skidded to within inches as he made a blinkerless right turn


Recent Tweets for Fri Oct 2, 2009

Follow me on Twitter!

Fri Oct 02 7:26 PM
Ghosts of ColdFusion Past http://yfrog.com/3omigoj
Fri Oct 02 4:40 PM
@john_mason_ Thanks. Indeed, the server was under high load.
Fri Oct 02 4:21 PM
Ditto that! RT @awest: Working at home really blows. Not. http://bit.ly/UflXy
Fri Oct 02 3:44 PM
@charliegriefer Enjoy! Twitter is gonna have a melt down. #adobeMAX
Fri Oct 02 3:12 PM
Have you ever launched the ColdFusion Server Monitor and seen the buttons for Monitoring, Profiling, and Memory just not show up at the top?

