Repository Pattern - How to understand it and how does it work with "complex" entities?

Question

I'm having a hard time understanding the Repository Pattern.

There are a lot of opinions on this topic, for example in "Repository pattern done right", but also in "Repository is the new Singleton", in "Don't use DAO, use Repository", or in "Spring JPA Data + Hibernate + MySQL + MAVEN", where a Repository somehow appears to be the same as a DAO object.

I'm getting tired of reading this stuff since, imho, this can't be as hard as a lot of articles make it look.

I see it like this. What I want appears to be something like the following:

         ------------------------------------------------------------------------
         |                            Server                                    |
         ------------------------------------------------------------------------
         |                    |                        |                        |
Client <-|-> Service Layer  <-|->  Repository Layer  <-|-> ORM / Database Layer |
         |                    |                        |                        |  
         ------------------------------------------------------------------------

The Service Layer takes *DTO objects and passes them to the Repository Layer, which is basically nothing more than "the guy" who knows how an entity can be stored.

For example, assume you have a composition of some tools (please note that this is just pseudo code):

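import java.util.List;
import javax.persistence.*;
import javax.validation.constraints.NotNull;

// A set composed of exactly two tools
@Entity
class ToolSet {
  @Id
  public Long id;
  @OneToOne
  public Tool tool1;
  @OneToOne
  public Tool tool2;
}

@Entity
class Tool {
  @Id
  public Long id;
  // @OneToMany needs a collection; presumably one description per language
  @OneToMany
  public List<ToolDescription> toolDescriptions;
}

@Entity
class ToolDescription {
  @Id
  public Long id;
  // Language is another entity that is not shown here
  @NotNull
  @OneToOne
  public Language language;

  public String name;
  public String details;
}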

The thing I'm not getting is the part where I am getting a ToolSetDTO object from the client.

As I understood it so far I could write a ToolSetRepository with a method ToolSetRepository.save(ToolSetDTO toolSetDto) that "knows how to store" a ToolSetDTO. But almost every tutorial does not pass the *DTO but the Entity instead.

What's bothering me here is that if you take my ToolSet example from above I'd have to do the following steps:

1. Take toolSetDto and check that it is not null.
2. For each tool*Dto owned by toolSetDto:
   a) If it has a valid id, convert it from DTO to Entity; otherwise create a new database entry.
   b) Take its toolDescriptionDto and convert/save it to the database, or create a new entry.
3. After checking the above, instantiate ToolSet (entity) and set it up for persisting in the database.
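
The steps above can be sketched as a small assembler that keeps the DTO-to-entity conversion out of the service method. This is only a sketch under assumptions: the `*Dto` classes, `ToolStore`, and `ToolSetAssembler` are hypothetical names, the entities are plain stand-ins without JPA annotations, each tool carries a single description for brevity, and a `null` id is taken to mean "created on the client".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical DTOs as they might arrive from the client (id == null means "new").
class ToolDescriptionDto { String name; String details; }
class ToolDto { Long id; ToolDescriptionDto description; }
class ToolSetDto { Long id; ToolDto tool1; ToolDto tool2; }

// Plain stand-ins for the entities; JPA annotations are omitted so the sketch runs on its own.
class ToolDescription { String name; String details; }
class Tool { Long id; ToolDescription description; }
class ToolSet { Long id; Tool tool1; Tool tool2; }

// Stand-in for the persistence lookup (an EntityManager or DAO in a real application).
class ToolStore {
    private final Map<Long, Tool> byId = new HashMap<>();
    void put(Tool tool) { byId.put(tool.id, tool); }
    Tool find(Long id) { return byId.get(id); }
}

// Performs steps 1 to 3 so that the service method only has to delegate.
class ToolSetAssembler {
    private final ToolStore store;

    ToolSetAssembler(ToolStore store) { this.store = store; }

    ToolSet toEntity(ToolSetDto dto) {
        // Step 1: reject a missing DTO.
        if (dto == null) {
            throw new IllegalArgumentException("toolSetDto must not be null");
        }
        // Step 3: build the entity from the converted parts.
        ToolSet set = new ToolSet();
        set.id = dto.id;
        set.tool1 = toEntity(dto.tool1);
        set.tool2 = toEntity(dto.tool2);
        return set;
    }

    // Step 2: reuse an existing tool when an id is given, otherwise create a new one.
    private Tool toEntity(ToolDto dto) {
        if (dto == null) {
            return null;
        }
        if (dto.id != null) {
            Tool existing = store.find(dto.id);
            if (existing == null) {
                throw new IllegalArgumentException("unknown tool id " + dto.id);
            }
            return existing;
        }
        Tool created = new Tool(); // id stays null until the persistence layer assigns one
        if (dto.description != null) {
            created.description = new ToolDescription();
            created.description.name = dto.description.name;
            created.description.details = dto.description.details;
        }
        return created;
    }
}
```

Whether such a class lives in the service layer or behind the repository is exactly the design question asked here; the sketch only shows that the checks can be isolated in one place.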

All this is too complex to simply let the service function (the interface for the client) handle it.

What I was thinking about was creating e.g. a ToolSetRepository but the question here is

1. Does it take a ToolSet entity object or does it use a DTO object?
2. In any case: Is the *Repository allowed to use other repository objects? Like when I want to save ToolSet but I have to store Tool and ToolDescription first - would I use ToolRepository and ToolDescriptionRepository inside ToolSetRepository?
3. If so: Why doesn't it break the Repository Pattern? If this pattern is basically a layer between the service and my ORM framework it just does not "feel right" to add dependencies to other *Repository classes due to dependency reasons.

I don't know why I can't get my head around this. It does not sound that complicated, yet there are still helpers out there like Spring Data. That is another thing that bothers me, since I really don't see how it makes anything easier. Especially since I'm using Hibernate already, I don't see the benefit (but maybe that's another question).

So, I know this is a long question, but I have already put a few days of research into it. The existing code I am working on right now is starting to become a mess because I just can't see through this pattern.

I hope somebody can give me a bigger picture than most of the articles and tutorials which do not get beyond implementing a very, very simple example of a Repository Pattern.

In my view the ToolSetRepository should only know the ToolSet entity, and on the ToolSet you can also have the JAXB annotations, to use the entity as a DTO. On the client side you then have only the JAXB classes generated with JAX-WS clientgen from the WSDL received from the web service URL plus "?wsdl". On the server side you then receive the "unmanaged" entity, and you have to use entitymanager.merge to put it into managed state. That's all. In my view a specific repository is only needed for complex criteria where you cannot use named queries, e.g. Criteria API queries. — StefanHeimberg, Jul 08 '15 at 22:57
@StefanHeimberg But how would `ToolSetRepository` for example handle the persistence of `Tool` and `ToolDescription`? Or should those already have been persisted? If those should have been persisted at this point already, then where would I do that? Doing this inside my service-method does not feel right because complex entities like `ToolSet` would bloat up the service-method code. Imho a service-method should only do a few initialization and basic checking tasks and then delegate the work to the next layer. — Stefan Falk, Jul 08 '15 at 23:00
If you receive the "unmanaged" entity in the service layer (transaction boundary) and then use merge() on the entity manager, the entity is then already known to the entity manager. After the service method has finished, the transaction commits and the changes in the entity manager are persisted to the database. — StefanHeimberg, Jul 08 '15 at 23:03
AFAIK Hibernate (and JPA) are a whole DAO layer since its work is to connect to the datasource (database, in this case) despite the underlying details (MySQL, Oracle, SQL Server, etc) and you can query the datasource in many ways. In case you want/need to use specific queries for your entities, it allows you to use criteria, which is specified to use in Repository, so in the end Hibernate is both Dao and Repository. What you will be doing on top of it is creating your own layer to abstract this dao (or repository) or whatever you use to implement this and continue programming. — Luiggi Mendoza, Jul 08 '15 at 23:03
Another point: if you say you do not want to send all the details of an entity to the client, then you can create a simple "Info" object (e.g. ToolSetInfo) with all the JAXB annotations and then, inside your repository, use JPA constructor expressions to populate the data from the db into this "unmanaged" object. — StefanHeimberg, Jul 08 '15 at 23:07
@StefanHeimberg My problem is just that I can't just call merge() or save(). Assume e.g. that the client is allowed to create a set of tools where some tools are already persisted but others are "just created" on the client side, giving them a `String name` and a `String description`. In that case I'd have to check all those possibilities (this is what I'm doing right now) in the service routine, which really does not look maintainable in the long run. — Stefan Falk, Jul 08 '15 at 23:08
@StefanHeimberg Right now I'm basically creating almost an exact copy of the content from the database and sending it to the client. The only thing - like I mentioned above - is that the client can "create" a tool on his side. If the client wants to save that tool I have to check if this tool exists - if not, I have to first create `Tool` then `ToolDescription` before being able to store it into the set. — Stefan Falk, Jul 08 '15 at 23:09
OK, but what if you give the whole object tree (loaded by the entity manager) to the client (with JAXB annotations), and the client can then make modifications to it (e.g. add a new object to the tree; important here is that none of the "@Id" fields are filled in on the client)? Then, when the server receives the modified tree, a merge() will do the rest of the job. — StefanHeimberg, Jul 08 '15 at 23:11
@LuiggiMendoza Well, yes somehow I get the feeling that Hibernate makes a Repository a little obsolete and yet it does not. But there has to be a layer in between the service and the ORM layer - however one would call that. Then the question would basically be *how* that layer has to work and "how smart" such a managing object may be like in my example "Does `ToolSetRepository` work with `ToolRepository` etc. in order to persist `ToolSet`" — Stefan Falk, Jul 08 '15 at 23:11
The difference between a "newly created" tool and an "existing" tool is (or should be) that the @Id annotated members on the new entity MUST be NULL. — StefanHeimberg, Jul 08 '15 at 23:12
@StefanHeimberg Yes, basically I am doing what you describe. I'm using GWT Servlets that transmit data between client and server. Right now I'm determining if an object is *new* by asking if `id` is `null` or not. But it is *that* part where I am not sure who is responsible for all that checking. Because still "no one" knows *how* to persist e.g. `ToolSet`, it's just not possible to simply call save() or merge(). So I'm back to Repository: an object that might know how to do that, but the question there is how it works if it should only take an entity and not a DTO object. — Stefan Falk, Jul 08 '15 at 23:19
merge() already checks whether the entity is new or not, and then creates an insert or update query. In my view this is the responsibility of the underlying ORM, e.g. JPA. — StefanHeimberg, Jul 08 '15 at 23:20
@StefanHeimberg Okay I got to say that I will have to check if that would work with merge() the way you describe it. This would indeed raise the question what `Repository` is good for a little higher but then at least I wouldn't have to care anymore! I'm going to try out your suggestion right after work tomorrow and will let you know how it turned out for me. But it might be just what I was looking for! Thank you very much for now! — Stefan Falk, Jul 08 '15 at 23:24
You can also have a look at the petclinic example: https://github.com/spring-projects/spring-petclinic. There, repositories are used to switch between JPA and JDBC. — StefanHeimberg, Jul 09 '15 at 06:28
Or here is a Java EE 7 petstore, without DAO or Repository: https://github.com/agoncal/agoncal-application-petstore-ee7 (I prefer this one). — StefanHeimberg, Jul 09 '15 at 06:32
@StefanHeimberg Thanks! I'm right now ending up implementing my interfaces like in your first link. I still have a few questions though, like how a repo implementation works internally if it has to store sub-entities. Would it e.g. pass the entity class down or again the business class? That question becomes trickier if one has to fulfill security aspects. I now have an `AbstractXsrfRepository` that expects a session token. Some entities that require authentication-validation extend that in order to get certain data over that session id - I'm still not sure if that's a good way though. — Stefan Falk, Jul 09 '15 at 22:11
I think a repository implementation extending an abstract base-class is ok. At least if you implement the repository interface. — Frank, Sep 21 '16 at 09:31
A JPA repository implementation typically uses merge or insert operations to store the main entities. How the sub-entities are stored is then defined by the specific object-relational-mapper and the mapping itself (the annotation of the entity classes). — Frank, Sep 21 '16 at 09:48

score 122 · Accepted Answer · edited Aug 03 '15 at 16:23

You can read my "repository for dummies" post to understand the simple principle of the repository. I think your problem is that you're working with DTOs and in that scenario, you don't really use the repository pattern, you're using a DAO.

The main difference between a repository and a DAO is that a repository returns only objects that are understood by the calling layer. Most of the time the repository is used by the business layer and thus, it returns business objects. A DAO returns data which might or might not be a whole business object, i.e. the data isn't necessarily a valid business concept.

If your business objects are just data structures, it might be a hint that you have a modeling problem, i.e. bad design. A repository makes more sense with 'rich' or at least properly encapsulated objects. If you're just loading/saving data structures, you probably don't need a repository; the ORM is enough.

If you're dealing with business objects that are composed from other objects (an aggregate) and that object needs all its parts in order to be consistent (an aggregate root) then the repository pattern is the best solution because it will abstract all persistence details. Your app will just ask for a 'Product' and the repository will return it as a whole, regardless of how many tables or queries are required to restore the object.
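
As a rough illustration of that idea, the sketch below uses the `ToolSet` from the question as the aggregate root instead of a 'Product'. The interface, the in-memory implementation, and all method names are assumptions made for the example; a real implementation would sit on top of an ORM or plain SQL.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Aggregate root: callers only reach the tools through the ToolSet.
class ToolSet {
    private final long id;
    private final List<String> toolNames = new ArrayList<>();

    ToolSet(long id) { this.id = id; }

    long id() { return id; }

    // A small business rule lives on the object, not in the persistence code.
    void addTool(String name) {
        if (name == null || name.trim().isEmpty()) {
            throw new IllegalArgumentException("tool name required");
        }
        toolNames.add(name);
    }

    List<String> toolNames() { return new ArrayList<>(toolNames); }
}

// The interface speaks only in business objects; no tables, rows, or sessions leak out.
interface ToolSetRepository {
    Optional<ToolSet> findById(long id);
    void save(ToolSet toolSet);
}

// One possible implementation, backed by a map instead of a database.
class InMemoryToolSetRepository implements ToolSetRepository {
    private final Map<Long, ToolSet> storage = new HashMap<>();

    public Optional<ToolSet> findById(long id) {
        return Optional.ofNullable(storage.get(id));
    }

    public void save(ToolSet toolSet) {
        storage.put(toolSet.id(), toolSet);
    }
}
```

The calling code asks for a `ToolSet` and gets it back as a whole; how many tables or queries a database-backed implementation would need stays hidden behind the interface.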

Based on your code sample, you don't have 'real' business objects. You have data structures used by Hibernate. A business object is designed based on business concepts and use cases. The repository makes it possible for the BL not to care about how that object is persisted. In a way, a repository acts as a 'converter/mapper' between the object and the model that will be persisted. Basically, the repo 'reduces' the object to the data required for persistence.
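
A minimal sketch of that 'converter/mapper' role, under stated assumptions: `ToolRow` and `ToolRowDao` are hypothetical stand-ins for the persistence model and its data access object, and a map replaces the database so the example is self-contained.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Business object: knows nothing about tables or rows.
class ToolSet {
    final long id;
    final List<String> toolNames;

    ToolSet(long id, List<String> toolNames) {
        this.id = id;
        this.toolNames = new ArrayList<>(toolNames);
    }
}

// Persistence model: one flat row per tool, as a table might hold it.
class ToolRow {
    final long toolSetId;
    final int position;
    final String name;

    ToolRow(long toolSetId, int position, String name) {
        this.toolSetId = toolSetId;
        this.position = position;
        this.name = name;
    }
}

// Hypothetical DAO that deals only in rows.
class ToolRowDao {
    private final Map<Long, List<ToolRow>> rowsBySet = new HashMap<>();

    void replaceRows(long toolSetId, List<ToolRow> rows) {
        rowsBySet.put(toolSetId, new ArrayList<>(rows));
    }

    List<ToolRow> findBySet(long toolSetId) {
        List<ToolRow> rows = rowsBySet.get(toolSetId);
        return rows == null ? new ArrayList<ToolRow>() : new ArrayList<>(rows);
    }
}

// The repository converts in both directions, so callers never see a ToolRow.
class ToolSetRepository {
    private final ToolRowDao dao;

    ToolSetRepository(ToolRowDao dao) { this.dao = dao; }

    // 'Reduces' the business object to the rows needed for persistence.
    void save(ToolSet set) {
        List<ToolRow> rows = new ArrayList<>();
        for (int i = 0; i < set.toolNames.size(); i++) {
            rows.add(new ToolRow(set.id, i, set.toolNames.get(i)));
        }
        dao.replaceRows(set.id, rows);
    }

    // Rebuilds the business object from its rows; an unknown id yields an empty set here.
    ToolSet load(long id) {
        List<String> names = new ArrayList<>();
        for (ToolRow row : dao.findBySet(id)) {
            names.add(row.name);
        }
        return new ToolSet(id, names);
    }
}
```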

A business object is not an ORM entity. It might be one from a technical point of view, but from a design point of view, one models business stuff while the other models persistence stuff. In many cases these are not directly compatible.

The biggest mistake is to design your business object according to storage needs and mindset. And contrary to what many devs believe, an ORM's purpose is not to persist business objects. Its purpose is to simulate an 'OOP' database on top of an RDBMS. The ORM mapping is between your db objects and tables, not between app objects (even less when dealing with business objects) and tables.

edited Aug 03 '15 at 16:23

Stefan Falk

answered Jul 09 '15 at 13:37

MikeSW

Hi! Now it's a lot clearer to me what a repository is but I'd like to ask for my specific code example if I may: The thing is that there's not a lot going on on my client side so there will never be a lot more to do than basically store e.g. `ToolSet` in the database. But that still does not mean that I can write `ToolSetRepository` that basically builds up my entity class and then persists it, is that correct? And one last big issue I got in my question is "*In any case: Is the `*Repository` allowed to use other repository objects (inside it)?*". – Stefan Falk Jul 09 '15 at 18:51
None of the tutorials I saw showed a more complex example where the repository actually has to do what people claim it's there for. I only see brutally simple examples that make me wonder what I'm going to need this for ^^ Unless it is allowed to use other Repository objects inside a Repository object in order to get the job done. :) – Stefan Falk Jul 09 '15 at 18:52
@StefanFalk The repository pattern is in fact the **design of the interface**. Once you have the interface, you can go wild with the actual implementation. About a repository using other repositories, I think not, that's the wrong design, it would mean you have 2 repos dealing with the same objects. But a repository can use a lot of DAOs. In my codebase, the repos don't know about each other. Thing is, you need to have a proper business model (with clear consistency boundaries) in order to use the repository correctly, else it's just a complication. – MikeSW Jul 09 '15 at 19:14
Btw, the purpose of the Repository is to store/retrieve stuff. The more complex the stuff, the better it is. But remember the important bit: that object must represent a business **concept**, either simple (one structure) or complex (lots of rules or many children). – MikeSW Jul 09 '15 at 19:18
Alright! Thank you very much. I think now I got the answers I need to do "not such a bad job" at this. I hope I don't mess it up. Thank you in any case and +1 for the additional help here! :) – Stefan Falk Jul 09 '15 at 19:34
May I ask one more thing... How would one handle security aspects e.g. I have a logged in user but I don't trust him - he could send me a fake ID and read a `ToolSet` from another user just by sending me some random IDs. At the moment I take the session token and first look into my database if the ID is affiliated with the sessionID/UserID. But moving everything into the Repository-Layer this raises now the question where to do such a validation act? – Stefan Falk Jul 09 '15 at 21:01
You don't move everything to persistence. The repository strictly does save/load. And the `ToolSet` should be associated with a user in the sense that the class has a UserId. The mechanisms to validate a signed-in user are an infrastructural concern which may involve persistence, but which has nothing to do with the repository pattern. – MikeSW Jul 09 '15 at 21:05
But in my case that would mean that I'd have a `private getToolSetByIdAndValidate(sessionId, toolSetId)` in my service layer that would have to open a Hibernate Session in order to validate if the given `ToolSet` id matches indeed the authenticated user. Would that be okay from the design point of view? – Stefan Falk Jul 09 '15 at 21:09
Probably, but it depends on your specific app needs. – MikeSW Jul 10 '15 at 00:58
I have done it a little differently. I created an `AbstractXsrfRepository` that takes a `HttpRequest` object with the according http session and the session id. Now, if I want to access the store I have to create `new StoreRepository(httpRequest);` and the `get(id)` method internally checks first if the `get(id)` call is allowed for the given session id and proceeds accordingly. – Stefan Falk Jul 10 '15 at 09:20
No! A repository is just a persistence concern. **Never** put stuff like http in it. You're violating the Separation of Concerns principle. Do you really want to have one object that does everything? The repo implementation is always part of persistence (as a concept); your app should use only the abstraction via injection, i.e. the DI container should create the repo, which should ONLY save/load from the db, not check security. You can 'check' things if they are part of the query, like 'select * from tools where id=@0 and userId=@1'. – MikeSW Jul 10 '15 at 13:22
On second thought, you're right! But I could just change that to `new StoreRepository(String sessionId)` and I should be good. That `sessionId` would be used internally to check e.g. `select * from tools where id=@0 and user_sessionId=@1` – Stefan Falk Jul 10 '15 at 15:18
I got one more thing you might be able to answer. You're saying "*a repository returns only objects that are understood by the calling layer*" so if I use repositories in my Service Layer [wouldn't it then be okay to return DTOs from my repositories](http://stackoverflow.com/questions/31743017/should-i-convert-an-entity-to-a-dto-inside-a-repository-object-and-return-it-to)? – Stefan Falk Aug 03 '15 at 16:27
@StefanFalk It would, if the service expects that. However, at least in DDD repository is used _only_ for the Domain layer, so the repo would return just domain objects. For other layers/components the 'repository' can be just a service or a query handler, but the principle of abstracting the persistence is the same. – MikeSW Aug 03 '15 at 19:45
+1000 for the point in the last paragraph. It's amazing how much a sound understanding of this abstraction and a use of a "firestop" between persistence frameworks and business logic can improve the underlying design. – DeaconDesperado Jan 09 '17 at 20:43
@MikeSW If you could share a good, standard example of an ORM, or any link, it would be a great help for all readers :) Thanks :) – Ario May 13 '17 at 15:31
**"a repository acts as a 'converter/mapper' between the object and the model that will be persisted"**. Then what's the difference between **Adapter Pattern** and **Repository Pattern**? – Arash Jan 25 '20 at 21:28
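
The suggestion in the comments to make the ownership check part of the query (`select * from tools where id=@0 and userId=@1`) could look roughly like the sketch below. It is only an illustration under assumptions: `findByIdAndUser`, `ToolSetService`, and the `userId` field are made-up names, and a list stands in for the table. The point is that the repository receives a plain user id, while resolving the session to a user stays outside of it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// A tool set that records which user owns it.
class ToolSet {
    final long id;
    final long userId;

    ToolSet(long id, long userId) {
        this.id = id;
        this.userId = userId;
    }
}

// Persistence only: no HTTP request, no session, just save and load.
class ToolSetRepository {
    private final List<ToolSet> rows = new ArrayList<>();

    void save(ToolSet toolSet) {
        rows.add(toolSet);
    }

    // Mirrors "where id=@0 and userId=@1": a foreign id simply yields no result.
    Optional<ToolSet> findByIdAndUser(long id, long userId) {
        for (ToolSet row : rows) {
            if (row.id == id && row.userId == userId) {
                return Optional.of(row);
            }
        }
        return Optional.empty();
    }
}

// The service is handed the already authenticated user id and decides how to react.
class ToolSetService {
    private final ToolSetRepository repository;

    ToolSetService(ToolSetRepository repository) {
        this.repository = repository;
    }

    ToolSet load(long authenticatedUserId, long toolSetId) {
        return repository.findByIdAndUser(toolSetId, authenticatedUserId)
                .orElseThrow(() -> new SecurityException("tool set not found for this user"));
    }
}
```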
