Well, this is the Low Latency Web! Here’s a plot showing requests per second vs. number of concurrent connections for a simple JVM application which prefix-routes each request to a random responder and renders the result as either JSON or a Jade HTML template.
Performance plateaus at around 150,000 requests/sec for HTML output and 180,000 requests/sec for JSON output. Latency is much higher than the ideal case of nginx serving static content, but it is only around 6ms with 1,000 concurrent connections and below 2ms with 300 or fewer. Not bad for the JVM and a commodity server, and JSON generation performance is better than static content on EC2.
The complete test application is available in app.zip; simply extract it somewhere and run
mvn compile exec:java. The only dependencies are a JVM and Maven. The results above are from the server described in the original article, running JDK 7u3:
java version "1.7.0_03" Java(TM) SE Runtime Environment (build 1.7.0_03-b04) Java HotSpot(TM) 64-Bit Server VM (build 22.1-b02, mixed mode)
The test app is only 124 lines of Java, but it could just as well have been written in Clojure, JRuby, Scala, or any other JVM language that compiles to bytecode. It uses Jetty 8 as the HTTP server, Scalate as the templating engine, and Jackson as the JSON generator.
When the app starts it chooses 1,000 words from /usr/share/dict/words and maps each to a responder that keeps a monotonically increasing count of calls. Each request is randomly routed to a responder via a prefix match, and the result is output in JSON or HTML depending on whether the request's Accept header contains application/json or not.
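The app's full source is in the zip, but the responder scheme and the content negotiation described above might look something like this sketch (class and method names are hypothetical, not taken from the app):

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: each dictionary word maps to one Responder,
// and every request it handles bumps a monotonically increasing counter.
class Responder {
    private final String word;
    private final AtomicLong count = new AtomicLong();

    Responder(String word) { this.word = word; }

    // Returns the new call count; AtomicLong keeps this thread-safe
    // under Jetty's concurrent request handling.
    long hit() { return count.incrementAndGet(); }

    String word() { return word; }
}

class ContentNegotiation {
    // JSON if the Accept header mentions application/json, HTML otherwise,
    // matching the behavior described in the article.
    static boolean wantsJson(String acceptHeader) {
        return acceptHeader != null && acceptHeader.contains("application/json");
    }
}
```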
The benchmark tests the following, in approximate order of used CPU time:
- Jetty’s HTTP performance
- Scalate’s template rendering performance
- Jackson’s JSON generating performance
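For the JSON path, Jackson's ObjectMapper does the heavy lifting. A minimal sketch of generating the kind of payload involved, a word plus its call count (the field names here are illustrative, not the app's actual schema):

```java
import com.fasterxml.jackson.databind.ObjectMapper;
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative Jackson usage: serialize a word and its call count.
// ObjectMapper is thread-safe after configuration, so one shared
// instance can serve all requests.
class JsonResponse {
    private static final ObjectMapper MAPPER = new ObjectMapper();

    static String render(String word, long count) throws Exception {
        // LinkedHashMap preserves insertion order in the output.
        Map<String, Object> body = new LinkedHashMap<>();
        body.put("word", word);
        body.put("count", count);
        return MAPPER.writeValueAsString(body);
    }
}
```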
The cost of routing a request appears to be negligible, though it is theoretically O(log n) in the number of responders.
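The O(log n) bound follows from prefix matching against a sorted map. One way to do it, not necessarily how the app does, is a TreeMap lookup via floorEntry, which finds the greatest key not exceeding the path in a single O(log n) tree descent:

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical prefix router over a sorted map. floorEntry(path) returns
// the greatest registered prefix <= path; if it is a true prefix of the
// path, we have a match. Each lookup is O(log n) in the number of routes.
// (A full longest-prefix match would need a walk-back loop for some
// route sets; it is omitted here for brevity.)
class PrefixRouter<T> {
    private final TreeMap<String, T> routes = new TreeMap<>();

    void add(String prefix, T target) { routes.put(prefix, target); }

    T route(String path) {
        Map.Entry<String, T> e = routes.floorEntry(path);
        return (e != null && path.startsWith(e.getKey())) ? e.getValue() : null;
    }
}
```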