JVM

Thursday, 29th November 2012 – FREE Webinar – Preventing and diagnosing ColdFusion server crashes and slow downs

  • Are your ColdFusion applications running slow or even crashing the server?
  • Are you concerned about what increasing load will do to the reliability of your application?
  • Do you want to protect your organization’s reputation for quality on the web?

Then join us for this free webinar with Intergral’s David Stockton and learn how to keep your ColdFusion servers alive and performing to their full potential. You will also find out how to work out what is going on when your server is crashing or running slow, and how to solve the problems fast so that your apps run reliably.

If your server is slow or sick this is for you! We will look at how to diagnose problems and some common ways to heal a sick ColdFusion server. We will also discuss what tools you can use to prevent problems from occurring.

This webinar is with David Stockton, technical consultant from the FusionReactor professional JVM and ColdFusion server monitor team. David has been using ColdFusion for more than 10 years and has spoken on server tuning and load testing many times.

He will demonstrate how to:

  • continuously monitor and gather metrics on your production servers
  • diagnose server and application issues
  • keep servers alive with unattended monitoring

We will also look at FusionAnalytics, the ColdFusion application and server analysis tool, which helps you to:

  • make better server sizing business decisions
  • improve application performance
  • improve code quality
  • measure exactly how your applications are performing over time

We will raffle off one copy of FusionReactor – you must register to enter this raffle.

The webinar on “Preventing and diagnosing ColdFusion server crashes and slow downs” is on Thursday, November 29, 2012, 3:00 PM – 4:00 PM EST. It will cover fixing slow servers, locating performance bottlenecks, and diagnosis tips. It will be approximately 45 minutes including time for Q and A. The webinar is free; register at https://www1.gotomeeting.com/register/242091952 to attend. See you there!

David started his career developing desktop applications using Visual Basic. After a period of working on interface design and prototyping for digital television set-top boxes, he made the move to web applications and working with ColdFusion in a variety of fields, from e-commerce to social networking.
In 2006 David joined the team at Intergral Information Solutions, makers of FusionReactor, FusionDebug and FusionAnalytics. David holds a senior consulting position for the Intergral UK team. David graduated from Staffordshire University with a Bachelor of Engineering degree (with honours) in Software Engineering.

The webinar will be hosted by Michael Smith, from TeraTech Inc. Click http://www.teratech.com/blog/index.cfm/2012/11/14/Preventing-and-diagnosing-ColdFusion-server-crashes-and-slow-downs-Thursday-112912-3pm-EST for further details.

System Requirements
PC-based attendees
Required: Windows® 7, Vista, XP or 2003 Server

Mac®-based attendees
Required: Mac OS® X 10.5 or newer

Mobile attendees
Required: iPhone®, iPad®, Android™ phone or Android tablet

CVE-2010-4476 – ColdFusion / Java hangs when converting 2.2250738585072012e-308 (or 2.2250738585072011e-308)

This JVM bug seems to be getting some high-level attention in the IT press so I thought I’d lay out the issue where CF is concerned:

History

The bug is in the JVM (and has been since ~2001), so ColdFusion installations running on Sun JVMs are affected.
Someone out there has evidently made the link with the same issue in PHP ( http://bugs.php.net/bug.php?id=53632 ), which has brought it to light again. There’s a Java-related discussion happening here: http://www.exploringbinary.com/java-hangs-when-converting-2-2250738585072012e-308/

How to reproduce

To trigger the bug, something must call the parseDouble() method of the java.lang.Double class with the problem value. There are several ways this can happen. Many people are discussing this as a vulnerability that can be exploited at the HTTP header level like so:

Accept-Language: en-us;q=2.2250738585072012e-308

However, this requires a call to HttpServletRequest’s getLocale() method, something that isn’t done trivially on a JRun4, CF 9.0.1 instance (even when calling the ColdFusion function “getLocale()”). Thus, to show this problem, you must do something like…

#GetPageContext().getRequest().getLocale()#

… within your ColdFusion page.

From our experience, a more likely attack could be performed with code like this:

<cfparam name="URL.pageNum" default="1" />
<cfparam name="URL.itemsPerPage" default="10" />
<cfquery name="qProducts" datasource="mysql_dsn">
    SELECT * FROM products
    LIMIT #((URL.pageNum-1) * URL.itemsPerPage) + 1# , #URL.pageNum * URL.itemsPerPage#
</cfquery>

The problem here is “URL.pageNum-1”. This calculation causes a call to parseDouble() behind the scenes, which means that if the page were called with “page_name.cfm?pageNum=2.2250738585072012e-308”, the thread would hang in an infinite loop.
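As a rough illustration of closing this hole, paging values can be checked to be plain integers before any arithmetic touches them, so that hostile input never reaches parseDouble(). The sketch below is in Java rather than CFML, and the names PagingParam and parsePositiveInt are made up for this example; it is not something ColdFusion or FusionReactor provides.

```java
import java.util.regex.Pattern;

/** Hypothetical helper: accept only short runs of digits as paging values. */
final class PagingParam {
    // 1 to 6 digits: no sign, no decimal point, no exponent.
    private static final Pattern DIGITS = Pattern.compile("[0-9]{1,6}");

    private PagingParam() {}

    /**
     * Returns the parsed value, or the fallback when the input is not a
     * plain positive integer. Integer.parseInt is used, so
     * Double.parseDouble is never reached for input such as
     * "2.2250738585072012e-308".
     */
    static int parsePositiveInt(String raw, int fallback) {
        if (raw == null || !DIGITS.matcher(raw).matches()) {
            return fallback;
        }
        int value = Integer.parseInt(raw);
        return value > 0 ? value : fallback;
    }

    public static void main(String[] args) {
        // The hostile value is rejected and the default page is used instead.
        System.out.println(parsePositiveInt("2.2250738585072012e-308", 1));
        System.out.println(parsePositiveInt("3", 1));
    }
}
```

The six-digit cap is an arbitrary choice for the sketch; the point is only that the value is validated as text before it is ever treated as a number.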

What doesn’t show the issue?

Note that in the query example above, “URL.itemsPerPage” could also cause the issue because it is used in the multiplication calculation. If the variable were not used in any calculations but only output, it would not show the issue. The following example does NOT show the problem:

<cfset x = 2.2250738585072012e-308 />
<cfoutput>#x#</cfoutput>

What can you do?

Short term

If you have FusionReactor installed with Crash Protection enabled, the hung threads can be killed automatically by FusionReactor, saving your server from almost certain failure. To do this, enable Crash Protection, configure a “Request Timeout” value and set it to use the “Abort and Notify” strategy. Requests that run longer than this time are then aborted, even if they are stuck in the infinite loop that this bug causes.

For those of you who are wondering, this is NOT the same as the ColdFusion timeout mechanism and so the ColdFusion page timeout alone will not help you in this scenario.

It’s good practice to have FusionReactor installed and Crash Protection enabled because it can save you from a lot of these issues without you needing to do anything.

Long term

I’m sure Oracle/Sun will offer a new update in due course. However, you can also download the “Java SE Floating Point Updater Tool”:
Download: http://www.oracle.com/technetwork/java/javase/downloads/index.html#fpupdater
Read Me: http://www.oracle.com/technetwork/java/javase/fpupdater-tool-readme-305936.html
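
If you want to check whether a particular JVM is still affected, before or after running the updater, a small probe along these lines could be used. This is only a sketch: the class name ParseDoubleProbe and the timeout are my own choices, and on an affected JVM the probe thread will keep spinning at full CPU until the process exits, so run it on a test box rather than a live server.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical probe: does Double.parseDouble(input) return within a time limit? */
final class ParseDoubleProbe {
    private ParseDoubleProbe() {}

    static boolean parsesWithin(final String input, long timeoutMillis) {
        // Daemon thread: if the parse never returns, the JVM can still exit.
        ExecutorService executor = Executors.newSingleThreadExecutor(new ThreadFactory() {
            @Override
            public Thread newThread(Runnable r) {
                Thread t = new Thread(r, "parseDouble-probe");
                t.setDaemon(true);
                return t;
            }
        });
        try {
            Future<Double> result = executor.submit(new Callable<Double>() {
                @Override
                public Double call() {
                    return Double.parseDouble(input);
                }
            });
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;   // the parse finished in time
        } catch (TimeoutException e) {
            return false;  // still running: this JVM looks affected
        } catch (ExecutionException e) {
            return true;   // it threw (e.g. NumberFormatException) but did not hang
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;  // the caller was interrupted before a result arrived
        } finally {
            // Stops the executor accepting work; it cannot stop a spinning thread.
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        boolean ok = parsesWithin("2.2250738585072012e-308", 2000);
        System.out.println(ok
                ? "parseDouble returned: this JVM appears patched"
                : "parseDouble did not return: this JVM appears affected");
    }
}
```

The parse runs on a separate daemon thread because, on an affected JVM, the calling thread would otherwise hang and never report anything.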

Further Help

If you’re in need of help updating your JVM and/or patching it then we can offer assistance in this area from as little as $800. The FusionReactor product is available from as little as $249 and contains a wealth of other features, the majority of which are not covered by the ColdFusion Server Monitor. See http://www.fusion-reactor.com/fr/ for more information.

Notes

This article refers to JRun4, CF9 installations. The issue is apparent on a wide variety of Java platforms (we offer consulting for most Java environments) and is more prevalent on Tomcat installations (which include JBoss).

References

Official security alert (CVE-2010-4476): http://www.oracle.com/technetwork/topics/security/alert-cve-2010-4476-305811.html

Killing Rogue Requests – Going native, don’t stop me now!

FusionReactor is a great monitoring tool, and one of my favorite features is the ability to kill rogue requests. However, FusionReactor is sometimes limited by Java itself: Java has a known limitation that threads running “Native Code” can’t be killed (until the thread returns from the native code block).

What is Native Code?

Underlying all your ColdFusion goodness is Java, and underlying Java is the runtime environment, typically implemented in C/C++ code. When you hit a code block that must “go native”, execution is inside that C/C++ code, typically waiting for an event to occur. While a thread is executing a native method, it cannot be killed by the JVM.

What to look for?

Some of the most common examples where native code is used are:

  • CFHTTP calls
  • WebService calls
  • JDBC Queries

What you’re looking for is “Native Method” in the stack trace of the thread. Let’s look at some concrete examples…

CFHTTP Calls

Example CF Code:

<cfhttp url="http://localhost/blogs/dont_stop_me_now/slow.cfm" />

Example Java Stack Trace (available from FusionReactor):

java.net.SocketInputStream.socketRead0(SocketInputStream.java:???)[Native Method]
java.net.SocketInputStream.read(SocketInputStream.java:129)
HTTPClient.BufferedInputStream.fillBuff(BufferedInputStream.java:172)
HTTPClient.BufferedInputStream.read(BufferedInputStream.java:110)
HTTPClient.StreamDemultiplexor.read(StreamDemultiplexor.java:273)
HTTPClient.RespInputStream.read(RespInputStream.java:155)
HTTPClient.RespInputStream.read(RespInputStream.java:115)
HTTPClient.Response.readResponseHeaders(Response.java:1000)
HTTPClient.Response.getHeaders(Response.java:720)
HTTPClient.Response.getStatusCode(Response.java:259)
HTTPClient.RetryModule.responsePhase1Handler(RetryModule.java:83)
HTTPClient.HTTPResponse.handleResponse(HTTPResponse.java:761)
HTTPClient.HTTPResponse.getStatusCode(HTTPResponse.java:191)
coldfusion.tagext.net.HttpTag.connHelper(HttpTag.java:850)
coldfusion.tagext.net.HttpTag.doEndTag(HttpTag.java:1140)
cfslow_cfhttp2ecfm1758959420.runPage(C:\inetpub\wwwroot\blogs\dont_stop_me_now\slow_cfhttp.cfm:1)
coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:231)

WebService Calls

Example CF Code:

<cfset ws = createObject("webservice", "http://localhost/blogs/dont_stop_me_now/slow.cfc?wsdl") />
<cfset ws.goSlow() />

Example Java Stack Trace (available from FusionReactor):

java.net.SocketInputStream.socketRead0(SocketInputStream.java:???)[Native Method]
java.net.SocketInputStream.read(SocketInputStream.java:129)
java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
java.io.BufferedInputStream.read(BufferedInputStream.java:237)
org.apache.axis.transport.http.HTTPSender.readHeadersFromSocket(HTTPSender.java:581)
org.apache.axis.transport.http.HTTPSender.invoke(HTTPSender.java:142)
org.apache.axis.strategies.InvocationStrategy.visit(InvocationStrategy.java:32)
org.apache.axis.SimpleChain.doVisiting(SimpleChain.java:118)
org.apache.axis.SimpleChain.invoke(SimpleChain.java:83)
org.apache.axis.client.AxisClient.invoke(AxisClient.java:165)
org.apache.axis.client.Call.invokeEngine(Call.java:2765)
org.apache.axis.client.Call.invoke(Call.java:2748)
org.apache.axis.client.Call.invoke(Call.java:2424)
org.apache.axis.client.Call.invoke(Call.java:2347)
org.apache.axis.client.Call.invoke(Call.java:1804)
blogs.dont_stop_me_now.SlowCfcSoapBindingStub.goSlow(SlowCfcSoapBindingStub.java:157)
sun.reflect.NativeMethodAccessorImpl.invoke0(NativeMethodAccessorImpl.java:???)[Native Method]
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
java.lang.reflect.Method.invoke(Method.java:597)
coldfusion.xml.rpc.ServiceProxy.invokeImpl(ServiceProxy.java:224)
coldfusion.xml.rpc.ServiceProxy.invoke(ServiceProxy.java:154)
coldfusion.runtime.CfJspPage._invoke(CfJspPage.java:2360)
cfslow_ws2ecfm1005604111.runPage(C:\inetpub\wwwroot\blogs\dont_stop_me_now\slow_ws.cfm:2)
coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:231)

JDBC Queries

Example CF Code:

<cfquery name="wait" datasource="test">
   SELECT 1 waitfor delay '000:00:10:000'
</cfquery>

Example Java Stack Trace (available from FusionReactor):

java.net.SocketInputStream.socketRead0(SocketInputStream.java:???)[Native Method]
java.net.SocketInputStream.read(SocketInputStream.java:129)
macromedia.jdbc.sqlserver.SQLServerByteOrderedDataReader.makeMoreDataAvailable(null:???)
macromedia.jdbc.sqlserver.SQLServerByteOrderedDataReader.receive(null:???)
macromedia.jdbc.sqlserver.tds.TDSExecuteRequest.submitRequest(null:???)
macromedia.jdbc.sqlserver.tds.TDSRequest.execute(null:???)
macromedia.jdbc.sqlserver.SQLServerImplStatement.execute(null:???)
macromedia.jdbc.sqlserverbase.BaseStatement.commonExecute(null:???)
macromedia.jdbc.sqlserverbase.BaseStatement.executeInternal(null:???)
macromedia.jdbc.sqlserverbase.BaseStatement.execute(null:???)
coldfusion.server.j2ee.sql.JRunStatement.execute(JRunStatement.java:348)
coldfusion.sql.Executive.executeQuery(Executive.java:1229)
coldfusion.sql.Executive.executeQuery(Executive.java:1008)
coldfusion.sql.Executive.executeQuery(Executive.java:939)
coldfusion.sql.SqlImpl.execute(SqlImpl.java:341)
coldfusion.tagext.sql.QueryTag.executeQuery(QueryTag.java:843)
coldfusion.tagext.sql.QueryTag.doEndTag(QueryTag.java:533)
cfslow_db2ecfm445915345.runPage(C:\inetpub\wwwroot\blogs\dont_stop_me_now\slow_db.cfm:1)
coldfusion.runtime.CfJspPage.invoke(CfJspPage.java:231)

Why!?

All these examples are in native methods for socket reading. Socket functions (both reading and writing) are the most commonly found native methods in stack traces.
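
The “Native Method” check can also be made from code, using the standard StackTraceElement.isNativeMethod() call on the innermost frame of each thread. The following is a minimal sketch, with NativeFrameScan as a made-up class name; it is not how FusionReactor does it. Bear in mind that plenty of healthy idle threads also sit in native methods, so a hit here is a hint to look closer, not proof of a stuck request.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: list threads whose innermost frame is a native method. */
final class NativeFrameScan {
    private NativeFrameScan() {}

    /** True when the top (innermost) frame of the stack is native code. */
    static boolean topFrameIsNative(StackTraceElement[] stack) {
        return stack != null && stack.length > 0 && stack[0].isNativeMethod();
    }

    /** Names of live threads currently inside a native method, with that frame. */
    static List<String> threadsInNativeCode() {
        List<String> hits = new ArrayList<String>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            StackTraceElement[] stack = entry.getValue();
            if (topFrameIsNative(stack)) {
                hits.add(entry.getKey().getName() + " -> " + stack[0]);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        for (String line : threadsInNativeCode()) {
            System.out.println(line);
        }
    }
}
```

Only the top frame is tested because that is where the thread is executing right now; a native frame further down the stack (such as the reflection frame in the WebService trace) does not stop the thread from being killed.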

What can I do?

Unfortunately, the only current work-around is to restart your server. This is a Java limitation, so you would still have the problem even without FusionReactor; FusionReactor is just giving you visibility. The real solution is to investigate the root cause of the problem and solve that, and that’s where we come in! We’re experts in this field and work on issues like this on a daily basis, so give us a call!

JVM PermGen memory usage with many CFM templates

Have you noticed requests stop processing and your CPU usage is high?

There are many possible causes of this; a common one is using “Registry” as the CLIENT variable backing store.

Have you seen this combined with “java.lang.OutOfMemoryError: PermGen space” errors in your logs?

Again, there are several causes for filling the PermGen space but one common one is too many templates for the allotted space. The PermGen space stores information about classes. Behind the scenes of ColdFusion each CFM translates to a Java class. This means that if you have many templates used by your server, you’ll have lots of classes and use a lot of PermGen space. Remember this class information gets stored in the PermGen for the life of the server and is never unloaded!

Be careful not to confuse this with the CF Administrator setting “Maximum number of cached templates”; those templates are cached in the Heap space.
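
To see how full the class storage area actually is, the JVM’s standard memory pool beans can be queried. The sketch below is an illustration only: ClassMetadataUsage is a made-up name, and the pool names it matches (“Perm Gen” variants on older HotSpot JVMs, “Metaspace” on Java 8 and later, where PermGen no longer exists) are assumptions about HotSpot naming that may differ on other JVMs.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: report usage of the pool that holds class information. */
final class ClassMetadataUsage {
    private ClassMetadataUsage() {}

    /** Assumed HotSpot names: "Perm Gen", "PS Perm Gen", "CMS Perm Gen", "Metaspace". */
    static boolean isClassMetadataPool(String poolName) {
        return poolName != null
                && (poolName.contains("Perm Gen") || poolName.equals("Metaspace"));
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (!isClassMetadataPool(pool.getName())) {
                continue;
            }
            MemoryUsage usage = pool.getUsage();
            if (usage == null) {
                continue; // the pool is no longer valid
            }
            // getMax() returns -1 when no maximum is defined for the pool.
            System.out.println(pool.getName()
                    + ": used=" + usage.getUsed() + " bytes"
                    + ", max=" + usage.getMax() + " bytes");
        }
    }
}
```

Run inside the same JVM as ColdFusion (or read the same beans over JMX), this gives the “used” figure to compare before and after a batch of templates has been executed.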

So, how many is too many?

Well, I looked at an example with a very simple set of CFMs. I took 10,000 CFM templates containing the single line:

<cfset x = now() />

The mean PermGen increase per template (after execution, of course) was 2,677 bytes. This probably doesn’t sound like a lot, but put this into practice on a live server with a real application and it only takes ~1,000-2,000 templates before you’re out of PermGen space and left with an unstable server.
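
To put the measured figure in context, here is the arithmetic as a small sketch. The class name PermGenEstimate and the 64 MB budget are hypothetical values picked for illustration, not ColdFusion defaults. The 2,677 bytes figure comes from a one-line template, so it is best read as a floor; real templates presumably compile to much larger classes, which would explain running out far sooner than this calculation suggests.

```java
/** Hypothetical back-of-envelope estimate based on the measurement above. */
final class PermGenEstimate {
    private PermGenEstimate() {}

    /** Total class-storage bytes consumed by a number of templates. */
    static long bytesFor(long templates, long bytesPerTemplate) {
        return templates * bytesPerTemplate;
    }

    /** How many templates fit in a budget at a given per-template cost. */
    static long templatesThatFit(long budgetBytes, long bytesPerTemplate) {
        return budgetBytes / bytesPerTemplate;
    }

    public static void main(String[] args) {
        long perTemplate = 2677L;               // measured mean, trivial template
        long budget = 64L * 1024L * 1024L;      // assumed 64 MB available

        // 10,000 trivial templates: 26,770,000 bytes (about 25.5 MB)
        System.out.println(bytesFor(10000L, perTemplate));

        // Trivial templates that fit in the assumed budget: 25,068
        System.out.println(templatesThatFit(budget, perTemplate));
    }
}
```

Swapping in a per-template cost measured on your own application gives a more realistic ceiling for your server.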

Note: It’s not just CFMs that are Java classes behind the scenes; your CFC functions count too!