It is now about four and a half years since the Servlet 3.0 specification was released in December 2009, together with Java EE 6. One feature that came in Servlet 3.0 was the possibility to decouple an HTTP request from any container threads, which is often referred to as Async Servlets, but perhaps more correctly called Asynchronous Processing in Servlets. I believe this is still an underutilized feature.
Motivation
By making a servlet asynchronous, you can return the thread calling the servlet to the container and continue the remainder of the request on another thread. In many cases just a few threads can handle lots of requests simultaneously, for example if an asynchronous HTTP client is used to call an external REST endpoint. This causes the number of threads needed to handle requests in the container to go down significantly. By using a lot fewer threads we save memory (normally hundreds of kilobytes per thread, although this is tweakable), but not only that – we also improve performance, because of reduced thread context switching.
Asynchronous processing should be used for requests that take a “long” time to process, especially when the request needs to wait for something. Often this is IO but it could also be some kind of event that is triggered independently of the request. Even if the called service mainly computes something you might want to hand the computation over to a thread pool that takes care of it (perhaps in parallel) and thereby let go of the container thread.
To be a viable case for async, the client should also be interested in some result coming from the service; otherwise you can just respond synchronously and continue processing afterwards in another thread.
Another benefit that comes with async processing is that non-blocking IO can be used for the request and response bodies. This is a feature that came with Servlet 3.1 in Java EE 7 that was released in May 2013. By using this, which is especially appropriate if the request or response body is large, you can keep even more threads from being blocked.
When should it then not be used? Perhaps not by default, since it adds some complexity and requires a bit more code. Error handling and debugging also become a little harder. When the request and response are small, the external IO is low and the processing is short, there is the least benefit in using an asynchronous servlet.
How it works
How does it work then? Actually it is quite simple to use. The basic flow for a servlet that calls an external REST service using an async HTTP client (for example AsyncHttpClient) looks like this:
- The container calls the servlet.
- The servlet tells the container that the request should be asynchronous, by calling ServletRequest.startAsync, which returns an AsyncContext.
- Thereafter the asynchronous call to the REST service is made.
- The service method of the servlet returns and at this point no thread is associated specifically with this request.
- Some time passes and then the response from the external REST service comes back.
- That response is processed and a response from this servlet is created and sent.
- The method AsyncContext.complete is called to tell the container that this request has been processed by the servlet.
Let us take a look at how that might look in code. Working code examples related to this post are available on GitHub.
protected void doGet(final HttpServletRequest req, HttpServletResponse resp) {
    // Initialize async processing.
    final AsyncContext context = req.startAsync();
    // This call does not block.
    client.callExternalService(
            // This callback is invoked after the external service responds.
            new Callback() {
                public void callback(String result) {
                    ServletResponse response = context.getResponse();
                    response.setContentType("text/plain");
                    response.setCharacterEncoding("UTF-8");
                    byte[] entity = ("Result: " + result + ".\n").getBytes(Charset.forName("UTF-8"));
                    response.setContentLength(entity.length);
                    try {
                        response.getOutputStream().write(entity);
                    } catch (IOException e) {
                        // Ignored.
                    }
                    context.complete();
                }
            });
}
Timeouts
One thing you might want to take care of is timeouts. You can set the timeout of a specific request through AsyncContext.setTimeout, otherwise a container default value will be used. You can then “listen” for timeouts by attaching an AsyncListener through the method AsyncContext.addListener. The AsyncListener interface looks like this:
public interface AsyncListener extends EventListener {
    void onComplete(AsyncEvent event) throws IOException;
    void onError(AsyncEvent event) throws IOException;
    void onStartAsync(AsyncEvent event) throws IOException;
    void onTimeout(AsyncEvent event) throws IOException;
}
And the AsyncEvent like this:
public class AsyncEvent {
    public AsyncContext getAsyncContext() { ... }
    public ServletRequest getSuppliedRequest() { ... }
    public ServletResponse getSuppliedResponse() { ... }
    public Throwable getThrowable() { ... }
}
The code that sends the response (in the callback in the example above) will get an IllegalStateException if the request timed out, since the AsyncContext will have been completed. This needs to be handled; the easiest approach might be to just catch the exception.
One thing I found odd is that, at least when using Jetty, when you don’t have any custom timeout handling, the servlet is called again after the timeout occurs. The second time, the response will have status code 500, so you can take care of this by checking the status code at the top of the service method and simply returning if the status is 500.
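As a sketch of custom timeout handling, a listener that completes the request with 503 Service Unavailable on timeout might look like this (assuming context is the AsyncContext from startAsync; the five-second timeout is an arbitrary choice):

```java
// Sketch: respond with 503 when the async request times out.
context.setTimeout(5000); // Milliseconds; overrides the container default.
context.addListener(new AsyncListener() {
    public void onTimeout(AsyncEvent event) throws IOException {
        HttpServletResponse response =
                (HttpServletResponse) event.getSuppliedResponse();
        response.sendError(HttpServletResponse.SC_SERVICE_UNAVAILABLE);
        // Tell the container we are done with this request.
        event.getAsyncContext().complete();
    }
    public void onComplete(AsyncEvent event) {}
    public void onError(AsyncEvent event) {}
    public void onStartAsync(AsyncEvent event) {}
});
```

With a listener like this in place, a late callback from the external service must still cope with the already-completed context, for example by catching the IllegalStateException mentioned above.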
Non-blocking IO
If your service is expected to receive large request or response bodies, especially if the clients write or read slowly, you would benefit from using the non-blocking IO feature introduced in Servlet 3.1, as mentioned earlier. On the ServletInputStream there is the method setReadListener where you can set a ReadListener. The ReadListener interface looks like this:
public interface ReadListener extends EventListener {
    void onAllDataRead() throws IOException;
    void onDataAvailable() throws IOException;
    void onError(Throwable t);
}
onDataAvailable will be called whenever it is possible to read data without blocking. Inside that method you should read as long as ServletInputStream.isReady returns true. Here is an example of how this can be used.
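A minimal sketch of such a listener, assuming the request has already been put in async mode and context is the AsyncContext, could look like this (what to do with each chunk is left open):

```java
final ServletInputStream input = req.getInputStream();
input.setReadListener(new ReadListener() {
    public void onDataAvailable() throws IOException {
        byte[] buffer = new byte[4096];
        // Keep reading as long as it can be done without blocking.
        while (input.isReady() && !input.isFinished()) {
            int length = input.read(buffer);
            if (length > 0) {
                // Process the chunk here, e.g. append it to an accumulator.
            }
        }
    }
    public void onAllDataRead() throws IOException {
        // The whole request body has been consumed; produce the response.
    }
    public void onError(Throwable t) {
        context.complete();
    }
});
```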
Similarly, there is a method setWriteListener on ServletOutputStream to set a WriteListener. The WriteListener interface looks like this:
public interface WriteListener extends EventListener {
    void onError(Throwable t);
    void onWritePossible() throws IOException;
}
Writing works analogously to reading.
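A corresponding write sketch, again assuming context is the AsyncContext (source.nextChunk is a hypothetical data source returning null when exhausted):

```java
final ServletOutputStream output = resp.getOutputStream();
output.setWriteListener(new WriteListener() {
    public void onWritePossible() throws IOException {
        // Keep writing as long as the container accepts data without blocking.
        while (output.isReady()) {
            byte[] chunk = source.nextChunk(); // Hypothetical data source.
            if (chunk == null) {
                // Everything written; finish the request.
                context.complete();
                return;
            }
            output.write(chunk);
        }
        // isReady returned false: onWritePossible is called again later.
    }
    public void onError(Throwable t) {
        context.complete();
    }
});
```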
Jersey
JAX-RS 2.0 and Jersey 2.x also support asynchronous processing. The example above might look like this with Jersey:
@GET
@Produces(MediaType.TEXT_PLAIN)
public void get(@Suspended final AsyncResponse response) {
    // This call does not block.
    client.callExternalService(
            // This callback is invoked after the external service responds.
            new Callback() {
                public void callback(String result) {
                    response.resume("Result: " + result + ".\n");
                }
            });
}
Here an AsyncResponse is injected by using the Suspended annotation. It is later used in a callback to respond, by invoking the resume method. This method takes the argument we would normally return from the resource method, which is now instead declared with return type void.
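AsyncResponse also supports timeouts directly. A sketch of that might look as follows (the timeout value and message are arbitrary):

```java
response.setTimeout(5, TimeUnit.SECONDS);
response.setTimeoutHandler(new TimeoutHandler() {
    public void handleTimeout(AsyncResponse asyncResponse) {
        // Invoked if resume was not called within the timeout.
        asyncResponse.resume(Response.status(Response.Status.SERVICE_UNAVAILABLE)
                .entity("Operation timed out.").build());
    }
});
```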
Spring
Spring has server-side async support as well. The same example again using a Spring controller:
@RequestMapping(value = "", method = GET, produces = "text/plain")
@ResponseBody
public DeferredResult<String> get() {
    final DeferredResult<String> deferredResult = new DeferredResult<>();
    // This call does not block.
    client.callExternalService(
            // This callback is invoked after the external service responds.
            new Callback() {
                public void callback(String result) {
                    deferredResult.setResult("Result: " + result + ".\n");
                }
            });
    return deferredResult;
}
Returning a DeferredResult signals to Spring that the request should be treated asynchronously. After setResult is invoked, the response is sent back to the client.
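DeferredResult also has hooks for timeouts and completion. As a sketch (timeout value and fallback result are arbitrary):

```java
// A timeout and a default timeout result can be given at construction.
final DeferredResult<String> deferredResult =
        new DeferredResult<>(5000L, "Operation timed out.");
deferredResult.onTimeout(new Runnable() {
    public void run() {
        // Invoked when no result was set within the timeout.
    }
});
deferredResult.onCompletion(new Runnable() {
    public void run() {
        // Invoked when the async request completed, for any reason.
    }
});
```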
Async clients
To take full advantage of asynchronous processing in servlets we need asynchronous non-blocking APIs, which do not block a thread while waiting for the response. Such an API most likely uses a thread pool with just a few threads, which are able to handle a large number of simultaneous outstanding requests. This can be achieved either by using non-blocking IO or using a message-based protocol. Here follows an incomplete overview of such client APIs for Java.
HTTP
- AsyncHttpClient, which uses Netty by default, but can also use Grizzly or Apache.
- Jersey client uses Apache, Grizzly or Jetty. Unfortunately this bug currently causes the Jersey client to put each request in its own thread, which then blocks waiting for a response. Additionally, when the response comes back yet another thread is started for each request, so there are two threads per simultaneous request plus the thread pool of the underlying HTTP client. Fail…
- Spring AsyncRestTemplate, which makes use of Apache.
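To illustrate the style of such an API, a GET with AsyncHttpClient might look roughly like this (the URL is a placeholder; the handler runs on one of the client’s IO threads):

```java
AsyncHttpClient httpClient = new AsyncHttpClient();
httpClient.prepareGet("http://example.com/resource")
        .execute(new AsyncCompletionHandler<Response>() {
            @Override
            public Response onCompleted(Response response) throws Exception {
                // Invoked when the full response has arrived.
                String body = response.getResponseBody();
                // ... use the body, e.g. complete the AsyncContext ...
                return response;
            }

            @Override
            public void onThrowable(Throwable t) {
                // Invoked on failure.
            }
        });
```

Combined with an async servlet, the servlet thread returns immediately after execute, and the response to the client is produced from the handler.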
RDBMS
- ADBCJ, Asynchronous Database Connectivity in Java, looks like an attempt to make a non-blocking API, but it seems abandoned now.
File
- Java 7’s NIO.2 API provides AsynchronousFileChannel for asynchronous file operations.
NoSql
- For data stores that have an HTTP API you can use an async HTTP client.
- For MongoDB there is MongoDB Asynchronous Java Driver.
- Cassandra: Java Driver 1.0 for Apache Cassandra.
- Lettuce for Redis.
Summary
Using asynchronous processing can take you a long way in making your web application more scalable. Both latency and throughput can be improved. To take full advantage you should use non-blocking IO for the request and response and use asynchronous APIs that use non-blocking IO for the external services you call.
Comments are most welcome. What are your experiences with asynchronous processing?
For RDBMS, PostgreSQL has asynchronous drivers for Scala (https://github.com/mauricio/postgresql-async) and Java (https://github.com/alaisi/pg-async-driver). Both are built on Netty.
The latest release of Spring Data has also started to add support for asynchronous repository methods. Take a look at Asynchronous repository method invocations.
Very nice post!
These days I have been researching this theme a bit (asynchronous web services, “reactive programming”, etc.), and it is in fact a very relevant topic that does not get covered a lot.
Another JVM framework that keeps popping up in my research is the Atmosphere framework (https://github.com/Atmosphere/atmosphere). It seems to provide an extra layer on top of the JVM to deal with common async communications, especially “server events”… but I am still not sure whether it can do the exact same things mentioned in this post, or whether it is a complementary solution only used for “long communications” (WebSocket, Comet, etc.); the latter seems more likely.
Apart from Atmosphere, there are some other things popping up, such as RxJava, Reactor (but I still need to further research those).
Keep up with these posts, very good and informative!
Cheers!
Can’t get past the AsyncResponse in Jersey 2.5. -> “Asynchronous processing not supported on Servlet 2.x container”.
Instead of 2.x should you be referring to Jersey 2.6+ or something? I’m trying to use SSE on AppEngine. What am I missing?
I might be misunderstanding what you’re saying, but the Jersey version is not related to the Servlet spec version. You need Servlet 3.0+ and Jersey 2.3.?+. You need to supply more information about the problem.
Jersey 2.6 didn’t exist when I wrote the post, so no. ;-)
Henrik,
Thanks for the useful post. You might want to know that the latest Couchbase Java SDK now offers a very nice async API based on RxJava. I’ve used it on a recent project and found it to be very well done. Check it out here: https://github.com/couchbase/couchbase-java-client
I am wondering about the thread safety of ‘request’ and ‘response’ object in the callback.
From the Servlet 3.0 spec – Section 2.3.3.4 –
“If a thread created by the application uses the container-managed objects, such as
the request or response object, those objects must be accessed only within the
object’s life cycle as defined in sections 3.10 and 5.6. Be aware that other than the
startAsync, and complete methods, the request and response objects are not
thread safe. If those objects were accessed in the multiple threads, the access should
be synchronized or be done through a wrapper to add the thread safety, for instance,
synchronizing the call of the methods to access the request attribute, or using a local
output stream for the response object within a thread.”
More specifically, the ‘request’ and ‘response’ object may not even be ‘valid’ if the AsyncContext.complete() has already been called before the callback is even executed.
Am I missing something ?
Right, they are not thread safe and it is up to the developer to make sure they are accessed correctly. However, I assume it is a rare case that someone has several threads accessing them concurrently. In most cases, one thread takes care of the request and one (possibly) other thread takes care of the response. Do you have a situation where you need to take concurrent threads into consideration?
There can always be the ‘timeout’ thread and subsequent dispatch to the servlet for timeout handling (assuming no custom timeout handling via AsyncEventListener) racing with the Async client completion callback.
It appears to me that the right way to do this would be via a re-dispatch to the servlet from async completion, sharing a thread-safe state object between the servlet and the async client library to communicate the result of the async call.
Thanks, nice post
Pingback: Is Servlet 3.1 (Read|Write)Listener supported by DeferredResult in Spring 4? - BlogoSfera
This has been an enlightening post. I have gotten interested in asynchronous servlets since the same has been introduced in Grails: http://docs.grails.org/2.4.2/guide/async.html
The documentation there is not very great.
I had a particular question on error handling. How can one send an error from the runnable?
I tried doing response.sendError but got the error “Cannot call sendError after response has been committed”.
Should we invoke the onError function? If yes, can you provide an example of the same?
Terrific post
I’m struggling to understand the usefulness of this unless you have implemented non-blocking I/O at every layer of the application. You are simply moving thread creation from the servlet layer to the API layer. You will still be operating on a limited pool of threads which will internally block on I/O calls. In that case (and I assume this is the case for 99% of applications out there), wouldn’t it be easier to simply increase the request threadpool size of your application container and call it a day?
I should probably add that my confusion comes from the fact that the most common I/O usage pattern is database access, and the only async RDBMS library listed is abandoned. :) Some of the comments provided solutions in that area and I think the article should be updated with those as most people will have a need for it.
Thank you so much for your post. This is wonderful.
I’m making a servlet connection to MongoDB asynchronous. I had given up and wanted to turn the program back into a regular synchronous one, but thanks to your post I have succeeded.