CFWDDX: Invalid XML character (Unicode: 0x14)

Recently I completed the upgrade of BlogCFC 4.0 (Beta) to the latest 5.9.8. AFAIK, there is no upgrade tool to facilitate a move across such a wide gap (effectively, it's about a 5 or 6 year gap in the code base, and there are some differences in the database schema as well). The impetus for the upgrade was so that I could change hosting providers (I'm now with Hostek, and love it!). As part of that change, I decided to switch from MSSQL to MySQL too. (BTW, I love the iPad client for MySQL, and Hostek has an app to access its control panel too.)

During the upgrade process, I created some ColdFusion scripts to export from MSSQL as WDDX, then import that data into the new BlogCFC schema that resides in MySQL. The data export was a simple SELECT * from each table that was fed into CFWDDX's action=cfml2wddx, and that allowed me to archive the backed up data locally as XML files (1 file per table).

What surprised me, however, was that during the data import I couldn't simply reverse the process by deserializing the WDDX XML file, running wddx2cfml, and inserting into the MySQL database. The CFWDDX tag threw a parse error referring to Unicode characters. The deserialization error baffled me as the initial serialization worked just fine.

Here's an abbreviated version of the error:

WDDX packet parse error at line 957, column 31. An invalid XML character (Unicode: 0x14) was found in the element content of the document..

coldfusion.wddx.WddxDeserializationException: WDDX packet parse error at line 957, column 31. An invalid XML character (Unicode: 0x14) was found in the element content of the document..
    at coldfusion.wddx.DeserializerWorker.throwSAXException(DeserializerWorker.java:359)
    at coldfusion.wddx.DeserializerWorker.fatalError(DeserializerWorker.java:245)
    at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
    at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
    at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
    at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
    at org.apache.xerces.jaxp.SAXParserImpl.parse(Unknown Source)
    at coldfusion.wddx.DeserializerWorker.deserialize(DeserializerWorker.java:268)
    at coldfusion.wddx.WddxDeserializer.deserialize(WddxDeserializer.java:96)
    at coldfusion.tagext.lang.WddxTag.deserialize(WddxTag.java:266)
    at coldfusion.tagext.lang.WddxTag.doStartTag(WddxTag.java:145)

As you can see, it's essentially a SAX parser error. I tested a straightforward xmlparse() as well, and it produced the same exception, although with a much smaller stack trace.

coldfusion.xml.XmlProcessException: An error occured while Parsing an XML document.
    at coldfusion.xml.XmlProcessor.parse(XmlProcessor.java:208)
    at coldfusion.runtime.CFPage.XmlParse(CFPage.java:248)

After some thought, I realized that the error message was referring to Unicode characters by their hex code. I found some lookup tables where I could translate the hex value into the equivalent ASCII decimal value. In the case of "Unicode: 0x14", this was the same as the ASCII 20 character ("Device Control 4", whatever that is). I updated my script to replace ASCII 20 with an empty string, and the parsing got a bit further but then started to hit other characters. What I found was that it kept hitting characters that, when converted to ASCII decimal, were non-printable control characters between 0 and 31 (printable characters start at ASCII 32 with the "Space" character).

Finally, I realized that I could just iterate over the local WDDX file prior to deserialization and simply remove any and all ASCII characters between 0 and 31. That did the trick, and deserialization occurred correctly.

<!--- read in the xml/wddx file --->
<cffile action="read" file="/path/to/data/blog_entries.wddx" variable="f">

<!--- replace lower, non-printable ascii chars --->
<!--- note: this range also strips tab (9), line feed (10), and carriage return (13) --->
<cfloop from="1" to="31" index="i">
    <cfset f = replace(f,chr(i),"","all")>
</cfloop>

<!--- fix mixed occurrences of ampersands by converting all to entities --->
<!--- caveat: this also re-escapes the ampersand of any other entity in the file (e.g. &lt; becomes &amp;lt;) --->
<cfset f = replace(f,'&amp;',"&","all")>
<cfset f = replace(f,"&","&amp;","all")>

<!--- deserialize the xml/wddx back to a query object --->
<cfwddx action="wddx2cfml" input="#f#" output="q">
<!--- <cfset x = xmlparse(f)> --->
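
The XML 1.0 specification does allow three of the characters in the 0 to 31 range: tab (9), line feed (10), and carriage return (13). As a variation on the loop above, here is a minimal sketch that leaves those three alone and strips only the control characters a parser will reject. The function name stripInvalidXmlChars is my own invention for illustration, not something built into ColdFusion or BlogCFC.

```cfml
<!--- Hypothetical helper: remove control characters that XML 1.0 forbids --->
<cffunction name="stripInvalidXmlChars" returntype="string" output="false">
    <cfargument name="text" type="string" required="true">
    <cfset var cleaned = arguments.text>
    <cfset var code = 0>
    <cfloop from="1" to="31" index="code">
        <!--- keep tab (9), line feed (10), carriage return (13): valid in XML 1.0 --->
        <cfif not listFind("9,10,13", code)>
            <cfset cleaned = replace(cleaned, chr(code), "", "all")>
        </cfif>
    </cfloop>
    <cfreturn cleaned>
</cffunction>

<!--- usage: clean the file contents before deserializing --->
<cffile action="read" file="/path/to/data/blog_entries.wddx" variable="f">
<cfset f = stripInvalidXmlChars(f)>
<cfwddx action="wddx2cfml" input="#f#" output="q">
```

Keeping the line breaks also means the line and column numbers in any later parse error still match the file on disk.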

ColdFusion Memory Tracking: Real World Performance Example

It is widely known that the built-in ColdFusion Server Monitor can in many cases cause a CF server to become entirely unresponsive if Memory Tracking is enabled. I've experienced this myself when I previously consulted with customers, and I was able to save many clients that engaged me to resolve performance problems by identifying that they had inadvertently enabled Memory Tracking in production. I've written about this before, as have others.

A Test
However, as I am currently working on a Performance Testing project for an enormously large web application, I took this opportunity to observe and measure the impact of enabling Memory Tracking on performance. I was not at all surprised by what happened (the server became entirely unresponsive in very little time), but I was pleased that I was able to document the exact impact in a more empirical manner.

Environment
This experiment took place in a staging environment with 3 machines: One to host ColdFusion, another to host IIS, and a third to host JMeter. A performance test was created in JMeter to moderately exercise the application. It was run as a stress test by applying 100 Virtual Users indefinitely with 0 think time (no delay between an HTTP Response and the next HTTP Request). This means that at any given moment JMeter is making 100 simultaneous Requests to the CF server. The CF server is a virtualized instance with a max heap size set to about 12GB, sitting on Windows 7 with 25 GB RAM and 4 x 2.4GHz processors.

Let'r Rip!
The JMeter Test Plan was started and left to run for several hours, all the while pounding ColdFusion. During this period Monitoring was enabled in Server Monitor, but neither Profiling nor Memory Tracking was enabled. The CF Server's throughput was measured in Server Monitor to be steady at about 20 Requests per Second +/- 4 requests (steady range of 16-24). Memory Used by the JVM was a steady 3.9 GB at the peak followed by troughs of about 1.2 GB, with Garbage Collection happening once a minute. This created the typical sawtooth pattern when using the -XX:+UseParallelGC JVM GC option. The CPU was typically in the range of 8-12% usage (across the 4 CPUs). The total throughput and memory utilization held steady for the several hours of testing. The app was performing beautifully, with 0 errors logged.
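
The numbers above came from Server Monitor. As a rough cross-check that does not depend on it, heap usage can also be sampled from CFML through java.lang.Runtime. This is only a sketch of that idea, not how the figures in this post were collected, and the variable names are illustrative.

```cfml
<!--- Sketch: sample JVM heap usage from CFML (not the Server Monitor's own measurement) --->
<cfset runtime = createObject("java", "java.lang.Runtime").getRuntime()>
<cfset bytesPerMB = 1024 * 1024>

<!--- maxMemory: heap ceiling; totalMemory: heap currently allocated; freeMemory: unused part of the allocation --->
<cfset heapMaxMB = round(runtime.maxMemory() / bytesPerMB)>
<cfset heapAllocatedMB = round(runtime.totalMemory() / bytesPerMB)>
<cfset heapUsedMB = round((runtime.totalMemory() - runtime.freeMemory()) / bytesPerMB)>

<cfoutput>
Heap used: #heapUsedMB# MB of #heapAllocatedMB# MB allocated (max #heapMaxMB# MB)
</cfoutput>
```

Requesting a page like this every few seconds during a test run would show the same sawtooth, just at a coarser resolution.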

The Death Spiral
Before terminating the JMeter Test Plan execution, I enabled Memory Tracking in the Server Monitor. The JVM memory used began to quickly rise at the rate of about +700 MB per minute. The memory used jumped from a steady 4GB up to 8GB in just 6 minutes before the Server Monitor interface stopped updating completely. Attempts to disable Memory Tracking were futile as button clicks did not respond. I could only watch the Task Manager on the CF server to continue observing memory and CPU. During the several hours of testing, total System Memory in Task Manager showed about 7.4 GB used, but after Memory Tracking was enabled and Server Monitor became unresponsive, I observed the total System Memory to be 13.7 GB, an increase of about 6 GB. The JVM was at or very close to the max heap allowed of 12 GB and was not able to reclaim any memory via GC. At this point I decided to kill the process tree of jrunsvc.exe (which also killed its child process, jrun.exe, the main server instance). I stopped JMeter and then started ColdFusion again, making sure to disable Memory Tracking before running my next performance test run.

Caveat Emptor
This was a great example of how Memory Tracking can bring a server to its knees. The problem is often reported and well understood anecdotally, but I thought some actual screenshots of it in action would help illustrate how dangerous this setting can be. Memory Tracking can be used effectively in development and QA for debugging or troubleshooting performance issues, but only under small load. I should point out that Profiling is sometimes known to bring down a server in a similar way, but I was not able to observe any impact of the Profiling setting on this particular application, as performance seemed normal when it was enabled during a stress test.

Server Monitor during Test
ColdFusion Server Monitor: Memory Tracking Performance Impact

[More]

Presentation Files for Automated System Testing at CFObjective

Thank you to everyone who attended my presentation today at the CFObjective conference on Automated System Testing with ColdFusion, CFSelenium, MXUnit, and Jenkins. I received a lot of positive feedback about the content, quality of material, and demonstration.

You can download my presentation slides, all the project files, and info here:



A general overview of the topic is here.

If you attended my presentation (it ended up being a full room!), please remember to complete the session evals! Go to the #cfobjective schedule page, click on my session, and fill out the evaluation at the bottom.

Ping me with any questions! Thank you!

Automated System Testing for Web Apps at CF.Objective

I'm excited to have the honor of once again presenting at the CF.Objective() Enterprise ColdFusion Conference. This year I'll be talking about Automated System Testing for Web Applications with CFSelenium, MXUnit, and Jenkins.

I've been a Quality Assurance software developer since 2007 when I was on the ColdFusion server engineering team at Adobe. For the past couple years I've enjoyed working at FirstComp Insurance with one of the largest ColdFusion developer teams that I know of, including well known team members like Sam Farmer, Dan Vega, and Jason Delmore, as well as many others of ColdFusion's best.

Testing by Isolation
One of my goals last year was to create a test suite framework that could perform Automated System Testing of our collection of web applications that we use for our business. We run it all on ColdFusion with a truly massive code base, and we have many different web applications that drive different parts of the business, each with unique user interfaces (UI). Part of good development practices includes writing Unit Tests early in the project to test application modules (CFCs) in isolation. Unit Tests are great for catching issues early in the release cycle, but they don't test how all the parts work together across the whole application as a system.

Testing Across the Board
This is where System Testing (or UI Testing) comes in, and I'll be showing you how I built our automated UI test framework from the ground up.

[More]

Could not find ColdFusion component or interface Query

My best blogging years were when I worked in ColdFusion Technical Support, from Allaire and right on thru Macromedia then Adobe. Constantly fielding customer questions provided an endless source of fodder to investigate and blog about when a solution or workaround was found. It feels a little like old times again now that my QA team is expanding and I've been helping others come up to speed with our ColdFusion driven Automated Test Suite. Although my colleagues are experienced web professionals, I'm happy there is room for mentoring in ColdFusion, and that provides me with more fodder to share here.

After I helped a colleague install ColdFusion 9.0 and apply the 9.01 updater, they reported that the updater failed to complete. We cleaned things up a bit, confirmed installers, and tried again. Success. Shortly after, as we continued setting up the test suite environment, they reported a very unusual error that I'd never seen before: Could not find the ColdFusion component or interface Query. With a bit of Googling, I found that there were only 2 hits, and one was in a comment on Ben Nadel's blog where he provided the winning hint. The other hit was a tweet from someone else who had encountered this issue.

Per Ben's hint, I had my colleague check the CF Admin's Custom Tag mappings, and the source of the problem was immediately evident. The core mapping for "C:\ColdFusion9\CustomTags" was missing. Until then, I thought this mapping could not be changed by the end user of the CF Admin. Perhaps it was due to the initial failed 9.01 updater. I'm not really sure how that mapping got wiped out, but as soon as we restored it, everything worked.

The mapping is needed because some parts of the core CFML language are implemented as components stored in that core custom tag location. This includes query.cfc, which implements the script-based version of CFQuery. Without that mapping, there will be several language areas that won't work.
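
For context, this is roughly the kind of script-based query code that depends on query.cfc being reachable through that mapping. It is a minimal sketch: the datasource name myDatasource and the SQL are placeholders, not taken from our test suite.

```cfml
<cfscript>
    // "new Query()" resolves to query.cfc under the core CustomTags mapping;
    // if that mapping is missing, this is the sort of line that cannot find the component.
    q = new Query();
    q.setDatasource("myDatasource"); // placeholder datasource name
    q.setSQL("SELECT 1 AS ok");      // placeholder SQL
    result = q.execute().getResult();
    writeOutput(result.ok);
</cfscript>
```

The tag-based CFQUERY is built into the server and does not go through this component, which would explain why only script-based code is affected.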

[More]

ImageMetadata.cfc - Enhancement for ColdFusion's Functions

ImageMetadata.cfc is now available on RIAForge.com. It is a utility that extracts more metadata than ColdFusion's built-in functions can alone. Importantly, it can also set metadata, which ColdFusion cannot do.

ColdFusion 8.x and up has the built-in functions imageGetIPTCMetadata and imageGetExifMetadata. However, these functions are based on the metadata-extractor by Drew Noakes, and they only return a subset of the total metadata available. Also ColdFusion does not provide a method to set image metadata.

This ImageMetadata.cfc is intended as an enhancement to those functions. It is essentially a wrapper to the ExifTool command line utility by Phil Harvey. The CFC can extract a much wider range of metadata than ColdFusion can alone, and it offers the ability to set metadata. Setting metadata is important because it is common that you would want to set IPTC metadata such as Copyright, Creator, Description, Headline, or Location for example.

The CFC is provided with demonstration usage and the API documentation provided by the CFC Explorer.

The CFC is licensed for use under the Apache License, Version 2.0.

This was tested on Windows 7 and Mac OS X 10.5 against both ColdFusion 8.01 and 9.01.

The CFC requires that you have installed ExifTool on the ColdFusion server.

Additional notes, requirements, usage, and contact info are found in the CFC comments.

Example usage:

<cfset imageMetaDataUtil = createObject("component","com.stevenerat.util.ImageMetadata").init("/usr/bin/exiftool")>

<cfset imageFilePath = getDirectoryFromPath(ExpandPath("*.*")) & "demo_image.jpg">

<cfset headline = imageMetaDataUtil.getImageMetadataTag(imageFilePath,"headline")>

<cfset tags = {}>
<cfset tags['headline'] = "A NEW HEADLINE">
<cfset imageMetaDataUtil.setImageMetadata(imageFilePath,tags)>




Go to the ImageMetadata.cfc Project Page


You may also want to review related projects on RIAForge for other ColdFusion image handling enhancements.

Multiserver Monitor: Permission Denied and crossdomain.xml

While helping out with this issue on the Adobe Forums, I learned that the ColdFusion 9 Multiserver Monitor now requires /crossdomain.xml on target servers rather than /CFIDE/multiservermonitor-access-policy.xml. I was not aware of this change, so hopefully this post will ensure that others who administer ColdFusion will be.

---------------------------

Since I was actually on the ColdFusion 8.0 engineering team at Adobe and personally tested the multiservermonitor back in 2006/7, I find it very surprising to learn that /crossdomain.xml is now required in the webroot INSTEAD of /CFIDE/multiservermonitor-access-policy.xml.

I did some testing on a couple local ColdFusion 9.01 servers, and to force the requirement of the access file, I loaded the CF Admin Multiserver Monitor over localhost (127.0.0.1) and then tried to add a different CF instance to the monitor using the other interface for the same machine 192.168.1.104. As expected, I got Permission Denied. I then went to the target server that I was trying to add, and I enabled the multiservermonitor-access-policy.xml by uncommenting the appropriate line. I was really stunned to find that the target server still showed a Permission Denied status (Figure 1).

[More]

Validation Query for MySQL communications link failure

"That's probably the best answer to any forum question I've ever posted and/or seen" is what someone said in response to my reply on the Adobe ColdFusion forums.

I always enjoy helping others with ColdFusion questions whenever I have time (a 2 year old child doesn't leave much room for extra time!), but reactions like this really make it feel that the help is worth my effort. That's what being an Adobe Community Professional is all about. For what it's worth, here's what I had to say about using Validation Queries to eliminate surprises when using database connection pooling in ColdFusion:

The Maintain Connections setting means that after a db connection is created for a given database, that connection will be used for the current query and then kept open in a connection pool so that it can be reused for later queries. The reason is that opening a connection is an expensive, time consuming operation, and it's more efficient to only have to authenticate once. When this setting is disabled, for every page request accessing a given database, a new database connection will be created, the db authentication will occur, the query/queries on the request will happen, then the db connection will be closed.

When you maintain connections you have a pool of db connections that exist for an extended period, being frequently reused with additional requests. If the connections are idle for a period greater than the Inactive Timeout setting in the datasource definition, then those connections are closed and the pool size is reduced. Also, if a request checks out a connection from the pool and that connection produces a db error when the request tries to use it for a query, the connection will likewise be closed and removed from the pool.

When you are pooling datasource connections like this, it is possible for the TCP connection to the database to be interrupted for some reason. When that connection is later checked out for use on a page request, you will get some type of "communication" error. The actual error message will vary depending on the database.

If unchecking Maintain Connections resolves this MySQL Communication Link Failure issue for you, then you are better off re-checking it AND adding a validation query. ColdFusion 8.0 introduced a field in the dsn definition for Validation Query. It works this way: When a database connection is first created AND every subsequent time that connection is checked back out from the pool, the validation query will run BEFORE any queries for the page request. If the validation query fails, your page request will never see the error because ColdFusion will throw away that db connection and get another connection from the db connection pool. It will then run the validation query for that connection too. If that one errors, ColdFusion will continue closing the bad connections and checking out other connections until there are no connections left in the connection pool. If it actually got that far (meaning every connection in the pool turned out to be bad) then ColdFusion will then create a NEW db connection and use that one, and it will run the validation query on that too. All of this happens before your request runs to guarantee that your request gets a *good* db connection from the start.

A good validation query is one that is highly efficient, so that the db isn't really taxed by having to run it. For MySQL you could use: SELECT 1. That's it. Enter that into the validation query field for the datasource and keep Maintain Connections checked to improve efficiency with connection pooling.
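
Before pasting a validation query into the datasource definition, it can be worth confirming that it runs cleanly and quickly against the database. Here is a minimal sketch of such a check. The datasource name myDatasource is a placeholder, and this runs the statement as an ordinary query rather than exercising ColdFusion's own validation mechanism.

```cfml
<!--- Sketch: run the candidate validation query by hand and time it --->
<cftry>
    <cfset startTick = getTickCount()>
    <cfquery name="ping" datasource="myDatasource">
        SELECT 1 AS alive
    </cfquery>
    <cfset elapsedMs = getTickCount() - startTick>
    <cfoutput>Validation query returned #ping.alive# in #elapsedMs# ms</cfoutput>
    <cfcatch type="database">
        <cfoutput>Validation query failed: #cfcatch.message#</cfoutput>
    </cfcatch>
</cftry>
```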

Selective, Bulk CFEncode Wrapper


Just a quick post to share a utility I wrote to facilitate using cfencode on a batch of ColdFusion templates. The purpose was to enable a way to encode a subset of templates in a selective way rather than just encoding everything recursively.

For example, say you have C:\foo\bar, C:\foo\baz, and C:\foo\qux, and you want to encode only the baz directory and a single cfm in the qux directory, but without touching the bar directory. You would enter C:\foo as the base path, and then for the file list you would enter baz and qux\test.cfm similar to the screenshot here.

Code is without warranty, use at your own risk. Big caveat: Remember the cfencode tool doesn't truly encrypt CF templates, but merely obfuscates them. This can be easily decoded by those who know how. Also you may want to check out Ben Nadel's CFEncode extension for CFBuilder, and read Joshua Cyr's perspective on why you would or would not want to encode your files.

<style>
    h1 {
        font-size:14pt;
    }
    h2 {
        font-size:11pt;
    }
    td {
        padding:10px;
        align:right;
        font-size:10pt;
    }
    .body {
        font-family:Verdana,Helvetica;
        font-size:10pt;
        width:800;
    }
    .batch {
        font-family:Verdana,Helvetica;
        font-size:8pt;
        width:800;
        color:#252525;
    }
</style>

<cfset currentDir = getdirectoryfrompath(expandpath('*.*'))>

<cfif isdefined("form.basePath") and isdefined("form.fileList")>

    <cfsilent>

        <cfset issues = arrayNew(1)>
        <cfif server.os.name contains "Win">

            <!--- start with a fresh batch file --->
            <cffile action="write" file="#currentDir#/cfencode_batch.bat" output=" ">

            <cfloop list="#trim(form.fileList)#" index="i" delimiters="#chr(10)#">
                <!--- build the full path and collapse any doubled backslashes --->
                <cfset target = trim(form.basepath) & "\" & trim(i)>
                <cfset target = replace(target,"\\","\","all")>

                <cfset recurse = "">
                <cfif target contains ".cf">
                    <cfset type = "file">
                <cfelse>
                    <cfset type = "dir">

                    <!--- a trailing * means recurse into subdirectories --->
                    <cfif right(target,1) eq "*">
                        <cfset recurse = "/r">
                        <cfset target = left(target,len(target)-1)>
                    </cfif>
                    <!--- drop a trailing backslash --->
                    <cfif right(target,1) eq "\">
                        <cfset target = left(target,len(target)-1)>
                    </cfif>
                </cfif>

                <cfif (type is "file" and fileExists(target)) or (type is "dir" and directoryExists(target))>
                    <cftry>
                        <!--- add one cfencode command per target to the batch file --->
                        <cffile action="append" file="#currentDir#/cfencode_batch.bat"
                            output='#server.ColdFusion.rootDir#\bin\cfencode.exe #target# #recurse# /q /v "2"'>
                        <cfcatch>
                            <cfset arrayAppend(issues,"ERROR: [#target#] #chr(10)##chr(13)#[#cfcatch.message#] #chr(10)##chr(13)#[#cfcatch.detail#]")>
                        </cfcatch>
                    </cftry>
                <cfelse>
                    <cfset arrayAppend(issues,"INFO: [#type# #target# does not exist]")>
                </cfif>
            </cfloop>

            <!--- give the file system a moment before running the batch file --->
            <cfscript>sleep(2000);</cfscript>

            <cfif fileexists("#currentDir#/cfencode_batch.bat")>
                <cfexecute name="#currentDir#/cfencode_batch.bat"></cfexecute>
            <cfelse>
                <cfset arrayAppend(issues,"INFO: [batch file does not exist at #currentDir#/cfencode_batch.bat]")>
            </cfif>

        <cfelse>
            <cfset arrayAppend(issues,"INFO: [MUST RUN THIS ON WINDOWS]")>
        </cfif>

    </cfsilent>

</cfif>

<div class="body">
    <h1>Encrypt ColdFusion Files</h1>
    <p>
        Enter a base path and then a list of relative paths off the base to specific ColdFusion files. Directories are acceptable too. Indicate recursion by ending a directory path with "*".
    </p>
    <p>
        Files will be encrypted with cfencode in place, so make sure you have a backup of unencrypted source.
    </p>
    <cfoutput>
    <form name="encodeForm" method="post" action="">
        <table>
            <tr><td>Base Path:</td><td><input name="basePath" value="<cfif isdefined("form.basePath")>#form.basePath#</cfif>" size="57" title="A base path to be prefixed to the relative path for each item in File List"/></td></tr>
            <tr><td valign="top">File List:</td><td><textarea name="fileList" rows="15" cols="50" title="A list of files or directories to encode, one per line. Use relative paths from base path."><cfif isdefined("form.fileList")>#form.fileList#</cfif></textarea></td></tr>
            <tr><td colspan="2"><input name="submit" type="submit" value="Encode Files"></td></tr>
        </table>
    </form>
    </cfoutput>
</div>

<cfif isdefined("issues")>
<div class="batch">
    <hr>
    <h2>CFENCODE Completed</h2>
    <cfif arraylen(issues)>
        <cfdump var="#issues#" label="There were problems with your request">
    </cfif>
    <cfif fileexists("#currentDir#/cfencode_batch.bat")>
        <br/><br/>Here's the contents of the batch file that was created:
        <cffile action="read" file="#currentDir#/cfencode_batch.bat" variable="batchContents">
        <cfoutput><pre>#batchContents#</pre></cfoutput>
    </cfif>
</div>
</cfif>

JVM Memory Management and ColdFusion Log Analysis

The following is a document I wrote for knowledge sharing with some peers, but I feel that it might have some value to other ColdFusion Devs, Testers, and Admins out there. The purpose was to illustrate how I went about analyzing CF's performance during a prior troubleshooting session. I'm re-purposing the content here after scrubbing some private information. Hopefully it still makes sense, although slightly out of context.




These are some technical notes on what to look for when analyzing ColdFusion server performance history. They include concepts and techniques to assess what performance related problems might exist, emphasizing memory usage issues first. This is a somewhat simplified explanation of how the JVM manages memory and how that relates to CF applications, and of how I went about analyzing them. There are many similar resources on the web, but I found many of them are quite technical, so this article is written with more of a layman's approach to make it more digestible to those not as familiar with troubleshooting ColdFusion or Java apps.



The ColdFusion Application Server runs inside (is "contained" in) a higher level JRun J2EE server. The J2EE server (and therefore the CF server) runs on top of a JVM (Java Virtual Machine). To analyze the ColdFusion and JVM performance, I take a forensic approach. I start by collecting the ColdFusion and JRun server logs. I also collect the JVM Garbage Collection (GC) log that has been manually enabled to log information regarding how the JVM is cleaning up the memory that it has used. The JVM is configured with an algorithm that tells it what approach to take when cleaning up and freeing memory. The application's Java objects (like queries, session variables, local variables, cfc instances, etc) are held in the JVM's memory. Objects are said to hold "references" in memory, meaning that something in the application is potentially using that object. When the application no longer has a need for an object, its memory is dereferenced. That dereferenced memory can be released by the JVM and then reused by other objects that require it.
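
Since the GC log and the GC algorithm are both controlled by JVM startup arguments, one quick sanity check before collecting logs is to list the GC-related arguments the running JVM was actually started with. The following is only a sketch of that check using the standard java.lang.management API. The filter on the text "gc" is my own rough heuristic, and it makes no assumption about which specific flags your server uses.

```cfml
<!--- Sketch: list the JVM startup arguments that mention GC --->
<cfset runtimeBean = createObject("java", "java.lang.management.ManagementFactory").getRuntimeMXBean()>
<cfset argIterator = runtimeBean.getInputArguments().iterator()>
<cfset gcArgs = arrayNew(1)>

<cfloop condition="argIterator.hasNext()">
    <cfset jvmArg = argIterator.next()>
    <!--- rough heuristic: keep any argument containing "gc" in any letter case --->
    <cfif findNoCase("gc", jvmArg)>
        <cfset arrayAppend(gcArgs, jvmArg)>
    </cfif>
</cfloop>

<cfdump var="#gcArgs#" label="GC-related JVM arguments">
```

An empty result would suggest that GC logging has not been enabled and that the JVM is running with its default collector.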



[More]
