Thursday, January 17, 2008

Domain Driven Design in Java: Repositories and DAOs

I spent the last few days digging the net for information about implementing the Repository pattern, as described in Domain Driven Design, in a typical Java environment backed by frameworks like Hibernate and Spring.

It turned out that the situation is more complicated than I expected, and that pieces of information are scattered all around, mostly in newsgroups and blogs, so it took me a while to grasp the whole picture. Here is my summary.

The REPOSITORY specification

The original definition of the Repository pattern comes from Martin Fowler's Patterns of Enterprise Application Architecture, but the pattern itself is credited to Edward Hieatt and Rob Mee. The whole idea is to abstract access to the underlying database and provide the illusion of an in-memory collection of domain objects. Hmmm… sounds a lot like DAO to me. And also a bit like Hibernate's lazy-loading. In fact, the two concepts of Repository and DAO are quite similar: they're doing basically the same thing.
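To make the "illusion of an in-memory collection" concrete, here is a minimal sketch in Java. The names Customer, CustomerRepository and InMemoryCustomerRepository are hypothetical, made up for illustration only; a real implementation would typically sit on top of Hibernate or another persistence mechanism behind the same interface.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical domain object, used only for illustration.
final class Customer {
    private final String id;
    private final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String id() { return id; }
    String name() { return name; }
}

// The interface reads like a collection: no SQL, sessions or tables leak out.
interface CustomerRepository {
    void add(Customer customer);
    Optional<Customer> findById(String id);
    void remove(Customer customer);
    int size();
}

// In-memory stand-in; a database-backed class could implement the same interface.
final class InMemoryCustomerRepository implements CustomerRepository {
    private final Map<String, Customer> customers = new LinkedHashMap<>();

    @Override public void add(Customer customer) { customers.put(customer.id(), customer); }

    @Override public Optional<Customer> findById(String id) {
        return Optional.ofNullable(customers.get(id));
    }

    @Override public void remove(Customer customer) { customers.remove(customer.id()); }

    @Override public int size() { return customers.size(); }
}
```

Client code only sees add, findById, remove and size, so it cannot tell whether the objects live in a map or in a database.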

Repositories and DAOs

So the first question is "Are Repositories and DAOs the same thing with a different name?". I am tempted to write "Yes!", and this would be a legitimate answer in many cases (some evidence can be found on the Spring Forum and on Christian Bauer's blog), but some differences in the way the two patterns are used kept emerging here and there, so I kept on investigating. Going back to the sources didn't help, because both original definitions are rather vague, at least in 2008 terms, and open to a wide range of implementation styles. So, even if the original definitions largely overlap, common habits in using the two patterns have consolidated, making Repositories and DAOs two solution styles for approaching the same problem, with some small differences.

  1. DAOs are strictly tied to the underlying representation on a DBMS. The original specification is more generic, allowing for file-based persistence and the like, but the vast majority of DAOs point to a DBMS. Moreover, commonly used persistence frameworks such as Hibernate help a lot in managing database portability, so this is a smaller issue nowadays than it used to be. As a result, the DAO concept eventually shrank to a "database access point", while Repositories are still intended to be generic.
  2. Repositories provide a more abstract view over the underlying data model, exposing an interface strictly coherent with the domain. DAOs can be implemented in many ways, but frameworks and code generation tools tend to put the focus on the data structure rather than on the domain model. This is sometimes a tiny issue, sometimes just a matter of style, but in large systems it can degenerate into a severe maintenance problem.
  3. Repositories enforce access to the persistence layer on a one-repository-per-aggregate basis, while DAOs are normally developed one-per-entity or one-per-table. So, repositories are more tied to the DDD concept of Aggregate Root and have a different granularity than DAOs. This is definitely the most significant difference.
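The granularity difference in point 3 can be sketched as follows. This is only an illustration under assumed names: Order, OrderLine, OrderDao, OrderLineDao, OrderRepository and InMemoryOrderRepository are all hypothetical, and the in-memory map stands in for whatever persistence mechanism is really used.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical aggregate: Order is the root, OrderLine lives only inside it.
final class OrderLine {
    final String product;
    final int quantity;

    OrderLine(String product, int quantity) {
        this.product = product;
        this.quantity = quantity;
    }
}

final class Order {
    private final String id;
    private final List<OrderLine> lines = new ArrayList<>();

    Order(String id) { this.id = id; }

    String id() { return id; }

    void addLine(String product, int quantity) {
        lines.add(new OrderLine(product, quantity));
    }

    List<OrderLine> lines() { return Collections.unmodifiableList(lines); }
}

// DAO style: one access point per entity (or table), shaped after the data structure.
interface OrderDao {
    void insert(String orderId);
}

interface OrderLineDao {
    void insert(String orderId, String product, int quantity);
    List<OrderLine> selectByOrderId(String orderId);
}

// Repository style: one access point per aggregate; only the root goes in and out.
interface OrderRepository {
    void save(Order order);
    Order findById(String id); // returns null when no order matches
}

// In-memory stand-in for a real persistence-backed implementation.
final class InMemoryOrderRepository implements OrderRepository {
    private final Map<String, Order> orders = new HashMap<>();

    @Override public void save(Order order) { orders.put(order.id(), order); }

    @Override public Order findById(String id) { return orders.get(id); }
}
```

With the DAO pair, the caller has to know that an order is split across two tables. With the repository, the order lines travel with their root and there is no separate way to reach them.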

In a way, the differences above are just a matter of taste, except for point 3, so objections are valid and the discussion could go on forever. For now I'll just say that the two patterns are doing basically the same thing, although with different styles. There are differences, especially when approaching the matter from a DDD angle. What remains to be seen is whether those differences are enough to choose one pattern over the other, or eventually to use both.

For a more complete perspective, please also read part 2 and part 3.


Anonymous said...

Regarding the difference between Repository and DAO, one point of view seems to be that a Repository can be implemented by delegating to a DAO.

Here is a quote from DDD author Eric Evans in the DDD discussion group:
"If your infrastructure makes it much easier to return DAOs from whereever these things are being stored, then fetch the DAO *inside the repository* and use the data to create a Category object *inside the repository*. "
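A minimal sketch of that delegation in Java, assuming a DAO that returns a persistence-shaped row. Only the Category name comes from the quote; CategoryRow, CategoryDao, InMemoryCategoryDao and CategoryRepository are hypothetical names made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical persistence-shaped record, as a DAO might return it.
final class CategoryRow {
    final long id;
    final String label;

    CategoryRow(long id, String label) {
        this.id = id;
        this.label = label;
    }
}

// Domain object created inside the repository from the DAO's data.
final class Category {
    private final String name;

    Category(String name) { this.name = name; }

    String name() { return name; }
}

interface CategoryDao {
    CategoryRow findRowById(long id); // returns null when no row matches
}

// Stand-in for a Hibernate- or JDBC-backed DAO.
final class InMemoryCategoryDao implements CategoryDao {
    private final Map<Long, CategoryRow> rows = new HashMap<>();

    void store(CategoryRow row) { rows.put(row.id, row); }

    @Override public CategoryRow findRowById(long id) { return rows.get(id); }
}

// The repository hides the DAO: callers only ever see Category objects.
final class CategoryRepository {
    private final CategoryDao dao;

    CategoryRepository(CategoryDao dao) { this.dao = dao; }

    Optional<Category> findById(long id) {
        // Fetch through the DAO, then build the domain object here.
        return Optional.ofNullable(dao.findRowById(id))
                .map(row -> new Category(row.label));
    }
}
```

The DAO stays an infrastructure detail, so it can be swapped without touching the code that works with Category.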

Alberto Brandolini said...

Hi, thanks for the link.

The main point here is that the pure DDD approach makes perfect sense if one does not consider the capabilities provided by frameworks, which have been evolving in the meantime. Being too strict on the DDD principles might make you lose the competitive edge of a newer technology, while going straight for the technology could leave you stuck with code that's easy to write but definitely hard to evolve.
Unfortunately, you're rarely lucky enough to know the whole project ecosystem in advance...