The Low Latency Web

March 22, 2012

150,000 Requests/Sec – Dynamic HTML & JSON Can Be Fast

Filed under: HTTP Servers — lowlatencyweb @ 10:45 pm

Most web applications contain copious amounts of static content in the form of JavaScript, CSS, images, etc. Serving that efficiently means a better user experience, and also more CPU cycles free for dynamic content. Useful benchmarks of dynamic content are more difficult due to the huge number of components available today. Do you choose Ruby, Python, node.js, Java? Which framework on top of which runtime? Which templating system?

Well, this is the Low Latency Web! Here’s a plot showing requests per second vs. number of concurrent connections for a simple JVM application which prefix-routes each request to a random responder and renders the result as either JSON or a Jade HTML template.

Performance plateaus at around 150,000 requests/sec for HTML output and 180,000 requests/sec for JSON output. Latency is much greater than the ideal case of static content via nginx, but only around 6ms with 1,000 concurrent connections and below 2ms for <= 300 connections. Not bad for the JVM and a commodity server, and JSON generation performance is better than static content on EC2.
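As a sanity check, the latency and throughput figures above can be tied together with Little's law. This is a rough sketch, assuming each of the N connections has exactly one request in flight at a time, so that mean latency is about N divided by throughput. The class and method names are made up for illustration.

```java
public class LittlesLaw {
    /**
     * Mean latency in milliseconds implied by Little's law,
     * assuming one outstanding request per connection.
     */
    static double meanLatencyMs(int connections, double requestsPerSec) {
        return connections / requestsPerSec * 1000.0;
    }

    public static void main(String[] args) {
        // Plateau figures quoted above, at 1,000 concurrent connections.
        System.out.println("HTML: " + meanLatencyMs(1000, 150_000) + " ms");
        System.out.println("JSON: " + meanLatencyMs(1000, 180_000) + " ms");
    }
}
```

With 1,000 connections this works out to roughly 6.7ms for HTML and 5.6ms for JSON, which is in line with the "around 6ms" figure.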

wrk was run the same as before, but the JSON test adds the Accept HTTP header:

wrk -t 10 -c N -r 10m http://localhost:8080/
wrk -t 10 -c N -r 10m -H "Accept: application/json" http://localhost:8080/


The complete test application is available for download. Simply extract it somewhere and run mvn compile exec:java. The only dependencies are a JVM and Maven. The results above are from the server described in the original article, running JDK 7u3:

java version "1.7.0_03"
Java(TM) SE Runtime Environment (build 1.7.0_03-b04)
Java HotSpot(TM) 64-Bit Server VM (build 22.1-b02, mixed mode)

The test app is only 124 lines of Java, but could just as well have been written in Clojure, JRuby, Scala or any other JVM language that compiles to bytecode. It uses Jetty 8 as the HTTP server, Scalate as the templating engine, and Jackson as the JSON generator.

When the app starts it chooses 1,000 words from /usr/share/dict/words and maps each to a responder that keeps a monotonically increasing count of calls. Each request is randomly routed to a responder via a prefix match, and the result is output in JSON or HTML depending on whether the request’s Accept header contains application/json or not.
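To make the responder and content negotiation concrete, here is a minimal sketch in plain Java. It is not the test app's code: the Responder class, the wantsJson and render helpers, and the "word" and "count" field names are all assumptions for illustration, and the output is built by hand where the real app uses Jackson and a Scalate Jade template.

```java
import java.util.concurrent.atomic.AtomicLong;

public class ResponderSketch {
    /** Hypothetical responder: one per word, counting how often it is called. */
    static final class Responder {
        final String word;
        private final AtomicLong calls = new AtomicLong();

        Responder(String word) { this.word = word; }

        /** Returns the new call count; safe under concurrent requests. */
        long call() { return calls.incrementAndGet(); }
    }

    /** True when the Accept header value contains application/json. */
    static boolean wantsJson(String acceptHeader) {
        return acceptHeader != null && acceptHeader.contains("application/json");
    }

    /** Hand-rolled rendering with no escaping; a stand-in for Jackson and Scalate. */
    static String render(Responder r, String acceptHeader) {
        long count = r.call();
        if (wantsJson(acceptHeader)) {
            return "{\"word\":\"" + r.word + "\",\"count\":" + count + "}";
        }
        return "<html><body><p>" + r.word + ": " + count + "</p></body></html>";
    }

    public static void main(String[] args) {
        Responder r = new Responder("latency");
        System.out.println(render(r, "application/json")); // JSON branch
        System.out.println(render(r, "text/html"));        // HTML branch
    }
}
```

An AtomicLong keeps the count monotonically increasing without a lock, which matters when many request threads hit the same responder.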

The benchmark tests the following, in approximate order of used CPU time:

  1. Jetty’s HTTP performance
  2. Scalate’s template rendering performance
  3. Jackson’s JSON generating performance

The cost of routing a request appears to be negligible, though it is theoretically O(log n) in the number of responders.
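One way to get an O(log n) prefix match is a sorted map. The sketch below assumes that approach; it is not taken from the test app, and the PrefixRouter class and its method names are hypothetical. It looks up the smallest key that is greater than or equal to the prefix and checks that the key actually starts with it.

```java
import java.util.Map;
import java.util.TreeMap;

public class PrefixRouter<T> {
    // Sorted map from word to target; lookups are O(log n) in the number of words.
    private final TreeMap<String, T> routes = new TreeMap<>();

    public void add(String word, T target) {
        routes.put(word, target);
    }

    /**
     * Returns the target of the first word in sorted order that starts
     * with the given prefix, or null if no word matches.
     */
    public T route(String prefix) {
        // Every word starting with the prefix sorts at or after the prefix itself,
        // so the ceiling entry is the only candidate that needs checking.
        Map.Entry<String, T> e = routes.ceilingEntry(prefix);
        return (e != null && e.getKey().startsWith(prefix)) ? e.getValue() : null;
    }

    public static void main(String[] args) {
        PrefixRouter<String> router = new PrefixRouter<>();
        router.add("apple", "responder-apple");
        router.add("banana", "responder-banana");
        router.add("band", "responder-band");
        System.out.println(router.route("ban"));
        System.out.println(router.route("c"));
    }
}
```

With 1,000 words a lookup touches only around ten tree nodes, so it is plausible that routing barely registers next to HTTP handling and rendering.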


1 Comment »

  1. […] This series of articles has drawn out the cargo cultists who insist that HTTP benchmarks must be run over the network, or that a real application should be tested, as if adding more variables makes a benchmark more useful. What would adding NIC overhead and network latency into the mix prove when the goal is to test a HTTP server, or a dynamic content stack? Zed Shaw addresses this in an excellent article on confounding, but the fundamental point is that a useful benchmark must isolate the variable being tested. […]

    Pingback by A Note On Benchmarking « The Low Latency Web — March 23, 2012 @ 2:42 pm
