Qi4j and domain model persistence

The following entry was originally posted on Rickard’s blog. Jayway is a founding company of the Qi4j project.

The JavaZone 2009 conference is over, and although I couldn’t make it this year due to our project, StreamFlow, going into production soon, the Qi was definitely flowing there. I’ve been watching the videos from the conference (available here, and many kudos for making them available so soon), and there are a number of presentations which either explicitly or implicitly relate to Qi4j. It seems that many of the issues that Qi4j has been designed to deal with are becoming known and annoying to a majority of developers. So, I’ll try to outline below just how the topics covered at JavaZone relate to Qi4j, and how Qi4j can help you deal with those problems.

Persisting domain models

I’ll start with Randy Stafford’s presentation “Patterns for persisting large and rich domain models”, in which he describes the criteria for the topic, and then the top ten things to be aware of. I basically agree with the criteria, and will go directly into the top ten list and how it relates to what we are doing.

Ripple loading

The first problem that Randy delves into is the one of “ripple loading”, whereby a single request to an application might result in many requests to the underlying database. If “many” is on the order of hundreds or thousands, then that is going to negatively impact the time it takes to respond to the request.

There are many ways in which Qi4j helps here. First of all, our basic philosophy for EntityStores is that their primary purpose is to efficiently store and load the domain model for the purpose of handling application requests. For this reason we have focused on implementing EntityStores that are based on key-value stores, where lookups of entities are just one map-call away, and where “mapping” is pretty much non-existent since the value is the serialized state of the Entity. We have chosen to use JSON as the default serialization format, so if you want to “look at the data” without using Qi4j, that is possible.
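To make the “one map-call away” idea concrete, here is a minimal sketch of a key-value backed store. The class and method names (KeyValueStoreSketch, store, load) are made up for illustration and are not the Qi4j EntityStore SPI, and the JSON is written by hand only to keep the sketch free of dependencies.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch, not the Qi4j EntityStore SPI.
class KeyValueStoreSketch {
    // identity -> serialized entity state (a JSON string)
    private final Map<String, String> states = new ConcurrentHashMap<>();

    // Storing an entity is a single put of its serialized state.
    void store(String identity, String jsonState) {
        states.put(identity, jsonState);
    }

    // Loading an entity is a single map lookup; no object-relational mapping involved.
    Optional<String> load(String identity) {
        return Optional.ofNullable(states.get(identity));
    }

    public static void main(String[] args) {
        KeyValueStoreSketch store = new KeyValueStoreSketch();
        store.store("task-1", "{\"identity\":\"task-1\",\"description\":\"Write report\"}");
        System.out.println(store.load("task-1").orElse("not found"));
    }
}
```

Because the stored value is plain JSON, any tool that can read the underlying store can inspect the data without going through the framework.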

In any case, the first way in which this helps is by simply bringing down the cost of all those requests. If the key-value store is in-process rather than across-network, as an RDBMS might typically be, then the overhead for making a call is radically reduced. And since an entire entity can be loaded by one map-lookup, the number of requests is reduced. Both of these factors help deal with Randy’s axiom “IPC affects app response time”.

In my current project, StreamFlow, we are going one step further. We are implementing CommandQuerySeparation as described by Greg Young in the DDD-world. What this means is that all changes generate events, and those events can be listened to and can cause further events. This infrastructure can, for example, be used to drive a denormalization of the database, specifically to generate the views that I know clients will be asking for. If a particular field in the domain model is changed, that causes an event, and if that field is being shown in the UI in some specific view, I can consume that event and generate a new event that eagerly creates that view as a DTO, and store it. When the client makes a request for the view I am then guaranteed to only have to make one request to the underlying datastore to get that view. Reading becomes superefficient, and the read/write ratio in the database becomes more even. The query engine, which we have separated from domain persistence as there is generally no query support in key-value stores, is also relieved, as standard client views no longer use it. Instead the query engine can be used for ad hoc querying, and only that, which improves its response time. This strategy is a tradeoff however: more space on disk is required as we are eagerly generating views, and there might be a few of them, and the code to keep the views updated might be more tricky than the queries needed to create them on the fly. But at least both options are available to us.
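The eager view generation described above can be sketched roughly as follows. Everything here is an assumption made for illustration: the FieldChanged event, the “taskSummary” view key and the string used as a DTO are invented, and none of it is taken from StreamFlow or Qi4j.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical sketch of event-driven denormalization; not StreamFlow or Qi4j code.
class EagerViewSketch {
    // A domain event: one field of one entity changed.
    static final class FieldChanged {
        final String entityId;
        final String field;
        final String newValue;

        FieldChanged(String entityId, String field, String newValue) {
            this.entityId = entityId;
            this.field = field;
            this.newValue = newValue;
        }
    }

    private final List<Consumer<FieldChanged>> listeners = new ArrayList<>();
    // view key -> precomputed DTO (a plain string here)
    final Map<String, String> viewStore = new HashMap<>();

    void subscribe(Consumer<FieldChanged> listener) {
        listeners.add(listener);
    }

    // Write side: every change is published as an event.
    void changeField(String entityId, String field, String newValue) {
        FieldChanged event = new FieldChanged(entityId, field, newValue);
        for (Consumer<FieldChanged> listener : listeners) {
            listener.accept(event);
        }
    }

    // Read side: exactly one lookup, no query engine involved.
    String readView(String viewKey) {
        return viewStore.get(viewKey);
    }

    // Wires up a listener that keeps a "taskSummary" view up to date.
    static EagerViewSketch withTaskSummaryView() {
        EagerViewSketch app = new EagerViewSketch();
        app.subscribe(e -> {
            if (e.field.equals("description")) {
                app.viewStore.put("taskSummary:" + e.entityId,
                        "Task " + e.entityId + ": " + e.newValue);
            }
        });
        return app;
    }
}
```

The cost shows up on the write side: each listener adds work per change and each view takes extra space, which is the tradeoff mentioned above.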

Since EntityStores are always invoked in UnitOfWorks, which are specified using Usecases, you can also attach metainfo about what the Usecase will be loading, which can be used by EntityStore implementations to do eagerloading. We don’t have any examples of this right now, as we have focused on key-value stores where eagerloading is not really applicable, but the infrastructure is there. For any kind of network-based EntityStores this would be useful to have.

The second problem Randy discusses is that of scalability. The issues he describes mostly have to do with having too many domain objects on the heap. In Qi4j, domain objects, i.e. entities, deal with this in several ways. First of all, an entity can be composed of many mixins, each dealing with a separate use-case. The mixins are lazy-loaded, so if a particular usecase only deals with 1 out of 10 mixins, then only that mixin will be instantiated. Second, references between entities are always done through proxies, so there’s no way to have hard references to entities that will fill up your heap. Third, entity references are only valid within a UoW, so once the UoW is done those references can be GC’ed, although the persistence implementation can choose to cache the state for use with later UoW’s. This also properly deals with Randy’s third problem, transaction isolation.
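As a rough illustration of the lazy-loading point, the sketch below only instantiates a mixin the first time it is asked for. The LazyMixinSketch class and its methods are hypothetical, and this is not how Qi4j composes entities internally.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch of lazily instantiated mixins; not Qi4j internals.
class LazyMixinSketch {
    private final Map<Class<?>, Supplier<?>> factories = new HashMap<>();
    private final Map<Class<?>, Object> instances = new HashMap<>();

    // Registering a mixin only records how to create it.
    <T> void register(Class<T> type, Supplier<? extends T> factory) {
        factories.put(type, factory);
    }

    // The mixin is created on first use and reused afterwards.
    <T> T mixin(Class<T> type) {
        Object existing = instances.get(type);
        if (existing == null) {
            Supplier<?> factory = factories.get(type);
            if (factory == null) {
                throw new IllegalArgumentException("No mixin registered for " + type.getName());
            }
            existing = factory.get();
            instances.put(type, existing);
        }
        return type.cast(existing);
    }

    // Number of mixins that have actually been instantiated so far.
    int instantiatedCount() {
        return instances.size();
    }
}
```

A usecase that touches one mixin out of ten therefore leaves the other nine as uncalled factories rather than objects on the heap.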

Now, onto the ten “pat-let”s that Randy describes:

1. Application Transaction

Application Transactions are used to deal with the fact that some interactions with users are so long that they are not appropriate to map to database transactions. Thinktime can be very long, so it just wouldn’t work to use transactions. The proper way to handle this, Randy describes, is to use the UnitOfWork concept to build up a set of changes, and then apply them either along the way or at the end of the UoW.

In Qi4j the ONLY way to deal with entities is through a UnitOfWork, whether you use it for “short” or “long” transactions. A UoW will contain all the entities that have been referenced within that UoW, with the state that has been read, and it can be paused and resumed (i.e. removed from its association with the current thread, much like a regular transaction), which is used to handle “think time”. A Qi4j UoW can be applied(), which will send all changes down to the EntityStore while still keeping the UoW alive, and at the end of the UoW you can choose to complete() or discard() it, depending on what you want to do. This handles the Application Transaction “pat-let” as Randy describes it.
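The apply/complete/discard lifecycle can be sketched like this. The method names mirror the text, but UnitOfWorkSketch is a hypothetical class written for this example, not the Qi4j API, and pausing and resuming is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a unit of work; not the Qi4j API.
class UnitOfWorkSketch {
    private final Map<String, String> store;                     // shared backing store
    private final Map<String, String> changes = new HashMap<>(); // pending changes, private to this UoW
    private boolean open = true;

    UnitOfWorkSketch(Map<String, String> store) {
        this.store = store;
    }

    // Reads see this unit of work's own pending changes first.
    String get(String id) {
        requireOpen();
        return changes.containsKey(id) ? changes.get(id) : store.get(id);
    }

    // Writes are buffered until apply() or complete().
    void set(String id, String state) {
        requireOpen();
        changes.put(id, state);
    }

    // Send pending changes to the store but keep the unit of work alive.
    void apply() {
        requireOpen();
        store.putAll(changes);
        changes.clear();
    }

    // Send pending changes and end the unit of work.
    void complete() {
        apply();
        open = false;
    }

    // Drop pending changes and end the unit of work.
    void discard() {
        requireOpen();
        changes.clear();
        open = false;
    }

    private void requireOpen() {
        if (!open) {
            throw new IllegalStateException("Unit of work is closed");
        }
    }
}
```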

2. Editing copy

Different application transactions need to have their own copies of the entities, or else they will be doing “dirty reads” where they are seeing changes being made by other transactions.

In Qi4j, simply because all entities are accessed through a UoW, and all those references will be local to that UoW, this is handled properly. No two application transactions can or may share references to entities. The underlying EntityStore might provide a copy-on-write holder of the state, for efficiency reasons, but the semantics will still be such that the UoW’s are logically separated.

The color commentary on this is interesting, in that it differentiates between copy-on-read and copy-on-write. As it is, all EntityStores in Qi4j use copy-on-read, but the SPI is such that it allows copy-on-write. From Randy’s description of the issues, especially with regard to scalability, we should probably make a default implementation that does copy-on-write, so that all stores based on the standard MapEntityStoreMixin implementation get that for free.
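The difference between the two strategies can be sketched as follows. This is an illustration under assumed names (CopyStrategySketch, CopyOnWriteState) with entity state reduced to a map of strings; it is not the Qi4j SPI or MapEntityStoreMixin.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch contrasting the two strategies; not the Qi4j SPI.
class CopyStrategySketch {
    // Copy-on-read: every load hands out a private copy immediately.
    static Map<String, String> loadCopyOnRead(Map<String, String> stored) {
        return new HashMap<>(stored);
    }

    // Copy-on-write: readers share the stored state until the first write.
    static final class CopyOnWriteState {
        private Map<String, String> state;
        private boolean copied = false;

        CopyOnWriteState(Map<String, String> shared) {
            this.state = shared;
        }

        String get(String property) {
            return state.get(property);
        }

        void set(String property, String value) {
            if (!copied) {
                // First write: take a private copy so the shared state stays untouched.
                state = new HashMap<>(state);
                copied = true;
            }
            state.put(property, value);
        }

        boolean hasPrivateCopy() {
            return copied;
        }
    }
}
```

With copy-on-write, units of work that only read never pay for a copy, which is where the scalability benefit would come from.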

3. Factory-Assigned identifier

Entities in the domain model should have identifiers assigned by factories, probably using a UUID strategy.

Qi4j is designed from the ground up to use this approach, so we have an IdentityGenerator SPI that works with the UoW and EntityStore SPI to accomplish what Randy describes. Nothing much more to say on that, other than “I agree” with what Randy says.

There is one thing to consider though when combining DDD with the CQS patterns a la Greg Young and Udi Dahan. If you use that, then ALL Entities are created by other Entities, and all such creations generate events. The way to get around the question “so how do I create the first entity?” is to have your system either automatically create one on startup, with a known id, or have entities with known id’s created on the fly. Udi has some content about this on his blog. The main thing I have realized though, is that if your domain model is all about consuming commands and generating events, then the id assignment for new entities has to happen in the command consumption, and not the event consumption, and the creation of the entity happens in the event consumption. The reason is that if you do a replay of events, then the same state must be created, and that is dependent on the same id’s for new entities being used.

Simple example:
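// Command: validate, generate the id, then emit the event
void createFoo()
{
    // Validate command is ok
    // Generate id for new Foo
    String id = …;
    // Emit the event, passing the id along
    fooCreated(id);
}

// Event: create the entity using the id it was given
void fooCreated(String id)
{
    … create Foo entity in UoW using id …
}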

If you are using CQS and Event Sourcing, so that the commands are generating events, then in order to allow events (such as the call to fooCreated) to be replayed in the domain model, EVERYTHING that the event needs to determine the new state has to be included in the parameters, and this includes the id of the new entity. Otherwise, when you replay the event later it might(/will) cause another id to be generated, and then the whole thing falls apart. In StreamFlow we are doing something similar to the above, which makes command/event processing pretty straightforward and not so code-intensive. The events, such as “fooCreated(id)” are later used to generate reports, update the UI, create replicas on other cluster nodes, external integration, etc. Pretty darn cool actually.
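A runnable version of the same idea, including replay, might look like the sketch below. The names (FooEventSourcingSketch, eventLog, replay) are invented for this example and are not taken from StreamFlow, the UUID-based id is an assumption, and the entity state is reduced to a string.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

// Hypothetical sketch; names are illustrative and not taken from StreamFlow.
class FooEventSourcingSketch {
    final Map<String, String> foos = new LinkedHashMap<>(); // id -> entity state
    final List<String> eventLog = new ArrayList<>();        // ids carried by recorded fooCreated events

    // Command: the id is chosen here, recorded with the event, then the event is applied.
    String createFoo() {
        String id = UUID.randomUUID().toString();
        eventLog.add(id);
        fooCreated(id);
        return id;
    }

    // Event: deterministic, everything it needs is in its parameters.
    void fooCreated(String id) {
        foos.put(id, "Foo " + id);
    }

    // Rebuild a fresh model by replaying recorded events only; no commands run.
    static FooEventSourcingSketch replay(List<String> events) {
        FooEventSourcingSketch model = new FooEventSourcingSketch();
        for (String id : events) {
            model.fooCreated(id);
        }
        return model;
    }
}
```

If fooCreated generated the id itself, the replayed model would end up with different ids than the original, and anything referring to the original ids would break.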

4. Protected persistence variation

You may need to change the persistence implementation in your project later on, so use some kind of abstraction to allow this to happen.

This, I think, is one of the strengths of the Entity model in Qi4j. We have an EntityStore SPI that can be implemented by key-value stores, document stores, relational stores, graph stores, etc., without changing the domain code at all. One of the key reasons we can do this, I think, is that we separated out indexing and querying from the EntityStore SPI responsibilities. Because of this there are very few things that the EntityStore needs to be able to do (create/load/store), which allows for many interesting implementations. Currently we have in-memory, JDBM, Preferences API (for service configuration storage), Amazon S3, and various sandbox implementations for Coherence, JGroups, JavaSpaces, BerkeleyDB. I think that all of these new and hip key-value stores which don’t have any application API’s to make them easily usable should look at simply leveraging the Qi4j EntityStore SPI, which immediately gives them an application API that people can use. Our in-memory version also makes it easy for people to run tests of their domain models without having to have the entire infrastructure available, which is crucial. As one of the main trends today seems to be to explore these kinds of non-relational storage models, I would recommend that you look at using Qi4j as your application API.

5. Distributed caching

Applications often need to use distributed caches for performance, and these need to be consistent.

To a large degree this would be an implementation detail in EntityStores in Qi4j, as we don’t do any caching at all in the framework (specifically to allow this to be an implementation detail in the EntityStores). The only thing to say here is that if you are using a local key-value store and then events to keep replicas up-to-date, you don’t really need a caching solution, since the key-value store is basically doing that for you. So, whether this is necessary at all is probably dependent on whether you are using a centralized database or not as the persistence solution.

6. Disconnected domain model

If domain model entities refer to each other using pointers, then serialization of graphs becomes an issue. The solution is to use reference by identifiers instead.

In Qi4j, references between entities are done on two levels: when an entity is loaded in the UoW it will be a strongly typed Java object, and any references to other entities will also be regular Java objects, with interfaces. However, those references are to Entity Composites, which internally maintain the identifier of the entity. So, whenever a reference in one entity is set to another entity, at the end of the UoW the EntityStore SPI will store the reference using the identifier. So, on the EntityStore SPI level there are only properties and identifier references in entities. This deals properly with the problem that Randy describes.
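The two levels can be sketched as follows: a typed reference while the entity is loaded, and only the identifier when the state is written out. The Person class and toStoredState method are hypothetical and do not reflect the Qi4j API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: typed references in memory, identifiers in storage. Not the Qi4j API.
class ReferenceByIdSketch {
    static final class Person {
        final String identity;
        Person manager; // a regular Java reference while loaded

        Person(String identity) {
            this.identity = identity;
        }
    }

    // What a store would persist: properties plus the identifier of the referenced entity,
    // never the referenced object itself.
    static Map<String, String> toStoredState(Person person) {
        Map<String, String> state = new LinkedHashMap<>();
        state.put("identity", person.identity);
        state.put("manager", person.manager == null ? null : person.manager.identity);
        return state;
    }
}
```

Since the stored state never contains another entity’s state, serializing one entity cannot drag a whole object graph along with it.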

One of the main problems that we still have, and which Randy describes, is that with the key-value stores, where all data from an entity is stored as a serialized JSON string, if you only want to add/remove an entity to a collection, that collection is going to be loaded/stored as a whole. The actual references will not be deserialized during a “collection add” scenario, but the state will be loaded/stored. The only way to really get around that would be to implement the EntityStore SPI using the graph database Neo4j, where massive associations would be handled natively by the Neo4j API and so collection add/remove could be done as an O(1) operation. We have discussed a proper implementation using Neo4j, but have not finished this yet.

If you go outside the plain Qi4j API, and use events as I’m doing in the StreamFlow project, then solving this would be easy actually. On a collection change you would generate an event “fooAdded(ref)”, but the implementation wouldn’t actually do anything. The reference would however logically have been added to the collection. So, an event consumer could then asynchronously get this event and update whatever view needs the actual list of entities for clients to read. Problem solved.

7. Most frequently used cache contents

Cache stuff that you use a lot.

Again, Qi4j doesn’t cache anything per se, so the only zen comment I could make would be “you don’t need to cache if you already have a cache”, which would be true if you are using a local key-value EntityStore implementation, which has the access performance of a cache, and yet also serves as the actual store so it’s not technically a cache.

8. Cache warming

Use cache warming so that first users don’t get slow responses.

For Qi4j, see the previous comment.

9. Query design

The way your application queries for the data it needs, especially with a large and rich domain model, is a first-order determinant of its performance.

This is something that I spent a long time working on in the product that the Qi4j ideas are based on, the SiteVision CMS/portal product. In there, the domain model was running in the client, and so when the UI in the Swing applet client needed to load data it went to the server to get it, pretty much using the key/value ideas. However, in the first version this was pretty much using lazy-loading, and therefore became superslow as it caused a massive amount of roundtrips to display a simple tree (one HTTP request per mixin per entity). The way to get around this was to let each UI define a usecase specific loading policy, which was then used to optimize and eagerload the state. Instead of thousands of roundtrips there would be one or a couple of roundtrips, each getting a specific object but also the related objects that we knew would be used within the usecase.
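The difference in roundtrips can be sketched like this. It is a toy model and not SiteVision or Qi4j code: the “server” is a map of child lists, the loading policy is reduced to a choice between two methods, and the roundTrips counter stands in for HTTP requests.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of lazy versus usecase-driven eager loading; not SiteVision or Qi4j code.
class LoadingPolicySketch {
    private final Map<String, List<String>> childrenById = new HashMap<>(); // "server side" data
    int roundTrips = 0;

    void addNode(String id, List<String> children) {
        childrenById.put(id, children);
    }

    // Lazy: the client asks for one node at a time, so one roundtrip per node.
    List<String> loadTreeLazily(String rootId) {
        List<String> loaded = new ArrayList<>();
        collect(rootId, loaded, true);
        return loaded;
    }

    // Eager: the usecase declares that it needs the whole tree,
    // so the server walks it and answers in a single roundtrip.
    List<String> loadTreeEagerly(String rootId) {
        roundTrips++;
        List<String> loaded = new ArrayList<>();
        collect(rootId, loaded, false);
        return loaded;
    }

    private void collect(String id, List<String> loaded, boolean countEachNode) {
        if (countEachNode) {
            roundTrips++;
        }
        loaded.add(id);
        for (String child : childrenById.getOrDefault(id, new ArrayList<>())) {
            collect(child, loaded, countEachNode);
        }
    }
}
```

Both methods return the same nodes; only the number of roundtrips differs, and it grows with the size of the tree in the lazy case.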

In Qi4j, and in StreamFlow which is the first production example of using Qi4j, contrary to what Randy said I have not yet implemented this, although it is prepared for it. The reason I haven’t implemented it is that I have chosen to not let the domain model execute on the client, for various reasons, mostly related to security. Instead, the Swing UI (which I still think is a GREAT way to do rich clients) uses a REST API to access DTO’s which are optimized for the view that the client needs. In a sense we have specialized the server API to embody these usecase specific loading policies. The problem is then pushed to the REST resource on the server, which in turn could have the same issue when communicating with the EntityStore. But there I have cheated, and am using a low-latency key-value store, so even if we make lots of loads, the cost is manageable. We are also using events to preemptively generate the DTO’s on the server, further reducing this cost.

All that being said, Qi4j *is* prepared for implementation of usecase-optimal queries, as a UoW can be associated with a Usecase, and a Usecase can specify metadata such as query policies. I just haven’t needed it myself for the moment, for the above reasons. If anyone needs it there is a clear path for how and where to do it.

As for the color commentary, the Query API in Qi4j is such that we allow the domain model to be used to express queries, which is one of the patterns Randy recommends. For those queries that cannot be expressed using the domain model, yes, we do have a feature explicitly for named queries. In neither case will you have SQL in your domain model, which is the key idea. The Query API also has an implementation that can run against an Iterable, so if you want to run a query against an in-memory collection, such as a cache, then that is possible.
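Running a query against an in-memory collection can be sketched with nothing more than a predicate over an Iterable. This is only an illustration of the idea: IterableQuerySketch and its find method are hypothetical and are not the Qi4j Query API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical sketch of querying an in-memory Iterable; not the Qi4j Query API.
class IterableQuerySketch {
    // Returns every item of the source that satisfies the where-clause, in iteration order.
    static <T> List<T> find(Iterable<T> source, Predicate<? super T> where) {
        List<T> result = new ArrayList<>();
        for (T item : source) {
            if (where.test(item)) {
                result.add(item);
            }
        }
        return result;
    }
}
```

For example, find(names, n -> n.startsWith("A")) selects the matching entries from any collection, such as the contents of a cache, without involving a query engine.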

10. Balancing by cache affinity

Do loadbalancing on application servers so that the caches on each instance are likely to have the data needed.

Again, this is not really something that Qi4j as a framework can do anything about. This is more related to whatever framework you are using on top to do loadbalancing. For StreamFlow we are using Restlet for the REST API implementation, and it has support for these kinds of things, where we can plug in algorithms for selecting the node to use.
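One simple node-selection algorithm of this kind hashes an affinity key, such as a user or entity id, onto the list of nodes. The sketch below is a generic illustration with hypothetical names; it does not use or depict the Restlet API.

```java
import java.util.List;

// Hypothetical sketch of affinity-based routing; this is not the Restlet API.
class AffinityRouterSketch {
    private final List<String> nodes;

    AffinityRouterSketch(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("At least one node is required");
        }
        this.nodes = nodes;
    }

    // The same key always maps to the same node, so that node's cache stays warm for it.
    String nodeFor(String affinityKey) {
        // floorMod keeps the index non-negative even when hashCode() is negative.
        int index = Math.floorMod(affinityKey.hashCode(), nodes.size());
        return nodes.get(index);
    }
}
```

Plain modulo hashing remaps most keys whenever the number of nodes changes; a consistent hashing scheme would reduce that, at the price of more code.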


To summarize, I think Qi4j deals with all the problems and pat-lets that Randy describes, or in some cases at least allows other frameworks to deal with them without Qi4j getting in the way. I also agree with Randy’s initial statement that “persistence is where it’s at”. It is impossible to not talk about persistence when dealing with DDD, and yet at the same time we want these persistence issues to stay out of our domain models as much as possible. This is one of the key benefits of the Qi4j approach I think: instead of starting with a persistence technology and providing an abstraction over that, we instead started with our domain models and decided how we wanted to express them in code. That then drove the EntityStore SPI, which in turn drove the implementations of our persistence extensions. This way we are allowing the domain model to “pull” what it needs from persistence instead of having the persistence “push” its view of the world onto the domain model. This is a crucial point. “pull” – good. “push” – bad. For those of you who have studied the Toyota Production System or Lean, or even better the Systems Thinking variation, this will probably seem familiar.
