HttpUnitCaseStudy

Revision as of 18:23, 6 November 2009 by Admin (Talk | contribs)

HttpUnit Case Study

This demo script is part of the NetBeans World Tour 2008 session #8, Integrated Profiling Tools. Refer to the top-level demo scripts page for additional information.

Background

This case study is based on an actual usage of the NetBeans profiler as part of the development and testing of a production application. Andrés González of Spain used the NetBeans profiler to track down a memory leak in HttpUnit. HttpUnit is an open source project that provides a framework for unit testing of web pages. It essentially acts as a web browser so that you can write unit tests to verify that the correct pages are being sent back from your web application.

Andrés was using HttpUnit to run multi-day tests of his application, apparently as a convenient way to do long-term stress testing. He discovered something odd: the JVM running his HttpUnit tests would eventually report an OutOfMemoryError. The tests had to run for a long time before the error occurred, which suggests that each individual leak was relatively small.

Andrés used the NetBeans profiler to track down the problem and wrote a blog entry about it (the entry is in Spanish, but Google can translate :-) ). There is also a thread on the HttpUnit mailing lists where he reported what he found.

The sample application included here is not the program that Andrés wrote, but it does encounter the same problem in HttpUnit. The underlying issue has to do with the way that HttpUnit processes web pages that include JavaScript. The framework supports a subset of JavaScript, but not the entire language. If it encounters JavaScript that it does not understand, it throws an exception. If the web page you are testing includes JavaScript that HttpUnit does not support, and you do not want to wade through all those exceptions in the output, the HttpUnit documentation recommends that your test program call HttpUnitOptions.setExceptionsThrownOnScriptError(false).

The side effect, however, is that HttpUnit will store every exception thrown during its JavaScript processing in an ArrayList and it never removes them. So if your tests access enough web pages that either have JavaScript errors or that include JavaScript that HttpUnit does not support, you can eventually get an OutOfMemoryError.
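The accumulation pattern described above can be sketched in plain Java. This is a simplified model, not HttpUnit's actual code: the class, the field errorMessages, and the methods handleScriptError(), clearErrors() and simulate() are hypothetical names chosen for illustration. It shows a buffer that grows by one String per failed script, and how clearing it periodically keeps its size bounded.

```java
import java.util.ArrayList;
import java.util.List;

public class ScriptErrorBufferSketch {

    // Hypothetical stand-in for the private buffer: messages are appended
    // on every script error and nothing removes them automatically.
    private static final List<String> errorMessages = new ArrayList<String>();

    // Models one page load whose JavaScript fails while exceptions are
    // suppressed: nothing is thrown, but the message is built and retained.
    static void handleScriptError(int requestNumber) {
        StringBuffer sb = new StringBuffer();
        sb.append("Script error on request ").append(requestNumber);
        errorMessages.add(sb.toString());
    }

    static int bufferedErrorCount() {
        return errorMessages.size();
    }

    // Models a manual clear of the buffer by the test program.
    static void clearErrors() {
        errorMessages.clear();
    }

    // Runs the given number of simulated page loads, clearing the buffer
    // every clearEvery loads (0 means never), and returns the peak size.
    static int simulate(int requests, int clearEvery) {
        clearErrors();
        int peak = 0;
        for (int i = 1; i <= requests; i++) {
            handleScriptError(i);
            peak = Math.max(peak, bufferedErrorCount());
            if (clearEvery > 0 && i % clearEvery == 0) {
                clearErrors();
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println("peak without clearing: " + simulate(10000, 0));
        System.out.println("peak clearing every 100: " + simulate(10000, 100));
    }
}
```

Without clearing, the peak equals the number of requests (10000 here); with a clear every 100 requests it stays at 100. In a multi-day test run the first case is what eventually exhausts the heap.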

One additional note on the sample application: it does not require a web server in order to run. If what you want to test is the response from a servlet, HttpUnit can host the servlet for you in the same JVM as your test application. So the sample application consists of:

  1. a servlet that returns JavaScript that has an error
  2. a main() method that repeatedly requests a page from that servlet.

Main Points to Hit

  • The NetBeans Profiler has a powerful capability to help you track down memory leaks.
  • Using instrumentation you can watch allocations on the heap happen in real time.
  • The profiler provides statistical values you can use to watch for patterns in your application's memory allocations. This "behavioral" approach can help you quickly identify the most likely memory leak candidates, even in situations such as this one where each individual leak is very small.
  • With its tight integration into the developer workflow, it is easy to start and stop profiling sessions and, more importantly, to go from profiler results directly to the source code that has the problem.

Setup

  • NetBeans IDE 5.5 or higher (if you are using 5.5 you have to install the Profiler separately)
  • JDK 5 update 8 or higher
  • Open the HttpUnit sample project, which is available in this zip file
  • To avoid delay during the presentation, profile the application at least once so that the IDE can modify the build.xml file

Gotchas

This problem has also been known to occur on Windows. A possible workaround is to set -Xnoclassgc in the project properties of the HttpUnit project. Note, however, that using -Xnoclassgc skews the results a bit. The memory leak is still the same, but the String allocations done by java.lang.String.substring() will also be climbing in Generation count because classes are not being unloaded (the Rhino JavaScript engine apparently makes heavy use of dynamically generated classes to interpret JavaScript).

Demo Steps

  1. Open the HttpUnit project
  2. Right-click the HttpUnit project and choose Profile
  3. Select Memory from the list of tasks on the left of the dialog (if using 6.0) or click the Analyze Memory Usage button (if using 5.5)
  4. Select Record both object creation and garbage collection
  5. Set the Track every ... object allocations value to 10 (so that every 10th allocation is tracked)
  6. Check the Record stack trace for allocations option
  7. Make sure the Use defined Profiling Points option is not checked (if using 6.0)
  8. Click the Run button
  9. The application begins running. In the Output window of the IDE you will see lines such as these scrolling by:
    Image:output_HttpUnitCaseStudy.png
  10. Suggested Comment (SC): "This is a simple test application that emulates the behavior that Andrés saw. It uses HttpUnit to process the HTML that is returned by a servlet. Requests are being repeatedly sent to that servlet by the test."
  11. Click the Telemetry Overview icon (it looks like a graph and is under the Controls section) in the Profile window. This will open the Telemetry Overview window. Maximize it and you will see something similar to this:
    Image:overview_HttpUnitCaseStudy.png
  12. SC: "The purple graph on the left shows heap usage. The value is trending upward over time, but very slowly." NOTE: It takes over 30 seconds for this pattern to emerge, so skip this step if you are pressed for time or think the audience is getting restless. :-)
  13. Un-maximize the Telemetry Overview so that you can see the Profile window.
  14. Click the Live Results icon in the Profile window.
  15. SC: "This Live Results window shows activity on the heap. The column on the left contains class names. For each class you can see information about the number of objects created, the number that are still in use (live), and a particularly interesting statistic called Generations."
  16. Click the Generations column to sort the display by Generations. This will sort it in descending order and will look like this:
    Image:liveresults_HttpUnitCaseStudy.png
  17. SC: "Two of the classes have huge Generations values in comparison to all the other classes: String and char array. More importantly, the Generations value for both of them continues to increase as the application runs. Note how for the rest of the classes this is not the case - they have stabilized."
  18. SC: "The generation count for a class is easy to calculate. All you have to understand is that each object has an age. The age of an object is simply the number of the Java virtual machine's garbage collections it has survived. So if, for example, an object is created at the beginning of an application and the garbage collector has run 466 times, then the age of that object at that point in time is 466. To calculate the Generations value for a class, just count up the number of different ages across all of its objects that are currently on the heap. That count of different ages is the number of generations."
  19. SC: "Please note: there is no 'correct' value for generation count. The key thing to watch out for is classes that have generation counts that are always increasing. If the generation count is always increasing then that means objects of that class are being created repeatedly as the program runs and more importantly, not all existing object instances are being garbage collected. As a result, as time goes on and the garbage collector continues to run you get more and more objects created at different points in time and therefore with different ages."
  20. SC: "The increasing generation count for the String class (and its little friend char array :-) ) indicates Strings are being created repeatedly."
  21. Right-click the entry for String and select Take Snapshot and Show Allocation Stack Traces:
    Image:ct1_HttpUnitCaseStudy.png
  22. Select Profile > Stop. SC: "We have the information we need, so we can stop the profiling session."
  23. SC: "The allocation stack traces view shows all the places in the code where Strings were allocated. In a typical Java application there can be dozens or even hundreds or thousands of places where Strings are allocated. What you want to know is: which of those allocations are resulting in memory leaks? You can use the generation count as a key indicator. Notice that only one of the allocation locations in this group has created Strings that have a large value for Generation count: java.lang.StringBuffer.toString(). If we were to continue running the application and take more snapshots we would see larger values each time."
  24. SC: "So we know that StringBuffer's toString() is allocating strings that appear to be candidate memory leaks. So what? How do we tie that back to our application's usage of HttpUnit?"
  25. Click the icon next to the entry for StringBuffer.toString() to expand it:
    Image:ct2_HttpUnitCaseStudy.png
  26. SC: "Ah ha! :-) The only Strings allocated in StringBuffer.toString() with a large generation count are those created when StringBuffer.toString() is called from HttpUnit code: com.meterware.httpunit.javascript.JavaScript$JavaScriptEngine.handleScriptException()."
  27. Expand the entry for handleScriptException(). Since there is only one call path to it, the profiler expands the entire stack trace. You will see a straight line back to the main() method in the test application:
    Image:ct3_HttpUnitCaseStudy.png
  28. Right-click Main.main (for it to be visible you will have to increase the width of the Method Name column) and choose Go To Source. SC: "One of the advantages of an integrated profiler - easy access to the source code."
  29. SC: "Here we are in the sample application calling HttpUnit's getResponse() method on line 46, which ends up making the call that results in the memory leak:"
    Image:main_HttpUnitCaseStudy.png
  30. SC: "It turns out there is a workaround - HttpUnit provides a method that will clear the ArrayList in which these Strings are being collected. More details in the email thread that Andrés created on the HttpUnit web site." Here's a quick summary from Andrés in that thread: "I discovered that HttpUnit stores *all* JavaScript Exceptions in a private buffer, which *you* have to clear manually from time to time, if you don't want to loose your memory along the way :)"
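
The generation count explained in the suggested comments above (count the number of different ages among a class's live objects) can be illustrated with a short sketch. This is not how the profiler itself is implemented; the class GenerationCountSketch, the method generationCount() and the sample ages are hypothetical, chosen only to make the definition concrete.

```java
import java.util.HashSet;
import java.util.Set;

public class GenerationCountSketch {

    // Each array element is the age of one live instance of a class,
    // that is, the number of garbage collections it has survived.
    // The generation count is the number of distinct ages present.
    static int generationCount(int[] agesOfLiveObjects) {
        Set<Integer> distinctAges = new HashSet<Integer>();
        for (int age : agesOfLiveObjects) {
            distinctAges.add(age);
        }
        return distinctAges.size();
    }

    public static void main(String[] args) {
        // A stable class: some instances created at startup (age 466)
        // plus a few recent, short-lived ones.
        int[] stable = {466, 466, 466, 1, 0, 0};

        // A leaking class: survivors left behind at many different
        // points during the run, so many different ages coexist.
        int[] leaking = {466, 412, 350, 275, 190, 101, 44, 1, 0};

        System.out.println(generationCount(stable));   // prints 3
        System.out.println(generationCount(leaking));  // prints 9
    }
}
```

The absolute numbers do not matter; what signals a leak is that the second kind of class keeps gaining new ages as the garbage collector continues to run, while the first kind levels off.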


Cleanup

None.
