- Ordinary synchronous method invocations.
- Scala Actors
- Akka Actors
I can tell you right away that the performance of Akka Actors is outstanding compared to Scala Actors.
Before looking at the benchmark I will briefly describe the sample application.
A trading system is essentially about matching buy and sell orders. A limit order is an order to buy a security at not more, or sell at not less, than a specific price. For example, if an investor wants to buy a stock, but doesn’t want to pay more than $20 for it, the investor can place a limit order to buy the stock at $20 “or better”. There are many other types of orders and special constraints. The sample handles only plain limit orders.
Orders that are away from the current best price in the market are collected in an order book for the security, for later execution.
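To make the matching rule concrete, here is a minimal sketch of a limit order book in Java. It is not the code from the sample application: the class and method names are hypothetical, prices are assumed to be whole cents, and volume is aggregated per price level, so time priority within a level is ignored.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration, not taken from the sample application.
class LimitOrderBook {
    // Resting volume per price level (price in cents).
    // Bids are sorted with the highest price first, asks with the lowest price first.
    private final TreeMap<Long, Long> bids = new TreeMap<>(Comparator.reverseOrder());
    private final TreeMap<Long, Long> asks = new TreeMap<>();

    /** Places a buy limit order and returns the volume that traded immediately. */
    long buy(long limitPrice, long volume) {
        return place(limitPrice, volume, asks, bids, true);
    }

    /** Places a sell limit order and returns the volume that traded immediately. */
    long sell(long limitPrice, long volume) {
        return place(limitPrice, volume, bids, asks, false);
    }

    private long place(long limit, long volume,
                       TreeMap<Long, Long> opposite, TreeMap<Long, Long> own, boolean isBuy) {
        long remaining = volume;
        while (remaining > 0 && !opposite.isEmpty()) {
            Map.Entry<Long, Long> best = opposite.firstEntry();
            long bestPrice = best.getKey();
            long available = best.getValue();
            // A buy trades at prices not above its limit, a sell at prices not below it.
            boolean crosses = isBuy ? bestPrice <= limit : bestPrice >= limit;
            if (!crosses) {
                break;
            }
            long traded = Math.min(remaining, available);
            remaining -= traded;
            if (traded == available) {
                opposite.remove(bestPrice);
            } else {
                opposite.put(bestPrice, available - traded);
            }
        }
        // Whatever could not trade rests in the book for later execution.
        if (remaining > 0) {
            own.merge(limit, remaining, Long::sum);
        }
        return volume - remaining;
    }

    long restingBidVolume() {
        return bids.values().stream().mapToLong(Long::longValue).sum();
    }

    long restingAskVolume() {
        return asks.values().stream().mapToLong(Long::longValue).sum();
    }
}
```

A buy at 2000 against a best ask of 2100 does not trade and rests in the book; a later sell at 2000 trades against it.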
A matching engine manages one or more order books, i.e. the marketplace is sharded by order book. The matching engine holds the current state of its order books. Clients connect to an order receiver service, which is responsible for routing each order to the correct matching engine. The order receiver is stateless, and clients can use any order receiver, independent of order book.
For redundancy, the matching engines work in pairs. Each order is processed by both matching engines. The order is also stored in a persistent transaction log, by both matching engines. In a real setup the primary and standby matching engines are typically deployed in separate data centers.
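The routing done by an order receiver can be sketched as a plain lookup from order book to matching engine. The names below are hypothetical and the routing table is assumed to be fixed configuration; the sample application may do this differently.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of sharding by order book.
class OrderRouter {
    // Fixed configuration: which matching engine (by index) owns which order book.
    private final Map<String, Integer> engineByOrderBook = new HashMap<>();

    /** Each inner list holds the order book symbols managed by one matching engine. */
    OrderRouter(List<List<String>> orderBooksPerEngine) {
        for (int engine = 0; engine < orderBooksPerEngine.size(); engine++) {
            for (String symbol : orderBooksPerEngine.get(engine)) {
                engineByOrderBook.put(symbol, engine);
            }
        }
    }

    /** Returns the index of the matching engine responsible for the given order book. */
    int engineFor(String orderBookSymbol) {
        Integer engine = engineByOrderBook.get(orderBookSymbol);
        if (engine == null) {
            throw new IllegalArgumentException("Unknown order book: " + orderBookSymbol);
        }
        return engine;
    }
}
```

Because the router holds only configuration and no order state, any number of identical routers can run side by side, which is what lets clients use any order receiver.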
Now, over to the benchmark. The test scenario places buy and sell orders in 15 order books, divided among 3 matching engines. The orders are at different price levels, so an order book depth is built up, but in the end all orders are traded and that is verified by the JUnit test running the benchmark.
The scenario was run at different load levels, by varying the number of simulated clients from 1 to 40.
The benchmark results illustrated here were obtained on a real 8 core box (dual CPU Xeon 5500 machine, 2.26 GHz per core).
Here are the results of processing 750000 orders at each load level.
The Basic solution uses ordinary synchronous method invocations. It is extremely fast, but not an option for a truly scalable solution. Asynchronous message passing is a better alternative for scaling out on multi-core or multiple nodes.
In the Scala and Akka Actors solutions, the clients send each order message to an order receiver and wait on the response Future (the !? operator in Scala and !! in Akka). The order receiver forwards the request to the matching engine responsible for the order book, i.e. the order receiver thread/dispatcher can immediately be used for the next request. The matching engine sends the order message to the standby, and both matching engines process the matching logic and transaction logging in parallel. An acknowledgment is sent to the client when both are done.
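The "acknowledge when both are done" step can be illustrated with plain Java futures. This is only an analogy: it uses `CompletableFuture` and two single-threaded executors standing in for the primary and standby matching engines, not the Scala or Akka actor APIs, and all names are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical illustration; the executors stand in for the two matching engines.
class RedundantPair {
    /** Placeholder for matching logic plus transaction logging on one engine. */
    static String process(String engineName, long orderId) {
        return engineName + " processed " + orderId;
    }

    /** Processes the order on both engines in parallel and acknowledges when both are done. */
    static String submit(long orderId) {
        ExecutorService primary = Executors.newSingleThreadExecutor();
        ExecutorService standby = Executors.newSingleThreadExecutor();
        try {
            CompletableFuture<String> onPrimary =
                CompletableFuture.supplyAsync(() -> process("primary", orderId), primary);
            CompletableFuture<String> onStandby =
                CompletableFuture.supplyAsync(() -> process("standby", orderId), standby);
            // thenCombine completes only after both engines have finished.
            return onPrimary
                .thenCombine(onStandby, (p, s) -> "ack " + orderId)
                .join();
        } finally {
            primary.shutdown();
            standby.shutdown();
        }
    }
}
```

The `join()` at the end plays the role of the client waiting on the response Future.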
The benchmark results show that Akka Actors are able to process three times as many orders as Scala Actors at the same load. The result for latency is similar: the latency of Akka Actors is one third of that of Scala Actors. This also holds at low load. Average latency is not always the best measure, so let us look at some percentiles.
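To see why percentiles can tell a different story than the average, here is a small sketch using the nearest-rank definition of a percentile. The helper names are hypothetical and the sample values in the usage note are made up for illustration; they are not benchmark measurements.

```java
import java.util.Arrays;

// Hypothetical helper, not part of the benchmark code.
class LatencyStats {
    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[rank - 1];
    }

    /** Arithmetic mean of the samples. */
    static double mean(long[] samples) {
        return Arrays.stream(samples).average().orElse(Double.NaN);
    }
}
```

With the made-up samples 100, 110, 120, 130, 140, 150, 160, 170, 180 and 5000 (microseconds), the mean is 626 while the 50th percentile is 140: a single slow request dominates the average, and only the high percentiles reveal it.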
So far, messages have been sent with operations that wait for a Future to complete. This has a scalability price tag, since the thread is blocked while waiting for the Future to complete. Better scalability can be achieved with one-way message passing, which is illustrated by the Scala/Akka Actor one-way solutions. They use the bang operator (!) for sending all messages.
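The difference between the two styles can be sketched with a plain Java executor acting as a stand-in for an actor's mailbox. This is an analogy under stated assumptions, not the Scala or Akka API: `requestReply` blocks on every reply, while `oneWay` fires all messages and only waits once at the end. All names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; a single-threaded executor stands in for a matching engine.
class MessagingStyles {
    /** Request-reply: the caller thread blocks on each Future before sending the next order. */
    static int requestReply(int orders) {
        ExecutorService engine = Executors.newSingleThreadExecutor();
        AtomicInteger processed = new AtomicInteger();
        try {
            Callable<Integer> handleOrder = processed::incrementAndGet;
            for (int i = 0; i < orders; i++) {
                Future<Integer> reply = engine.submit(handleOrder);
                reply.get(); // the caller is blocked here until the engine replies
            }
            return processed.get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            engine.shutdown();
        }
    }

    /** One-way: the caller enqueues every order without waiting for individual replies. */
    static int oneWay(int orders) {
        ExecutorService engine = Executors.newSingleThreadExecutor();
        AtomicInteger processed = new AtomicInteger();
        CountDownLatch allDone = new CountDownLatch(orders);
        try {
            for (int i = 0; i < orders; i++) {
                engine.execute(() -> {
                    processed.incrementAndGet();
                    allDone.countDown();
                });
            }
            allDone.await(); // wait once, only so the result can be reported
            return processed.get();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        } finally {
            engine.shutdown();
        }
    }
}
```

Both methods process the same number of orders; the one-way variant simply never parks the sending thread between messages.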
The matching engines write each order to a transaction log file. This is a blocking IO bottleneck. To push the test of message passing one step further, the benchmark has also been run without the transaction log. Here the Akka solution shines even more: its transaction rate is more than three times higher than that of Scala Actors at the same load when using the solution based on sending messages and waiting for a reply. For the one-way message passing solutions, the Akka Actors are two times faster than Scala Actors.
Akka offers great flexibility in specifying different dispatching mechanisms. The Akka Actor hawt solution is included in the benchmark as a comparison with the Akka Actor one-way solution. It uses the HawtDispatch threading library, which is a Java clone of libdispatch. The last test, without the transaction log, shows that HawtDispatcher has slightly better performance than the event-based dispatcher that has been used for Akka Actor one-way.
The complete source code for the sample application is located at: http://github.com/patriknw/akka-sample-trading
Update note Aug 15: Added Scala Actor one-way solution and new description of how to run the benchmark.
Update note Aug 22: New benchmark run on real 8 core box.