DDD Effective modelling of aggregates and root aggregation creation

Question

We are starting a new project and we are keen to apply DDD principles. The project is using dotnet core, with EF core providing the persistence to SQL Server.

Initial view of the domain

I will use an example of a task tracker to illustrate our issues and challenges, as our real domain follows a similar structure.

Initially we understood the following:

We have a Project
Users can be associated to Projects
A Project has Workstreams
A Workstream has Tasks
Users can post Comments against a Task
A User is able to change the status of a Task (in progress, complete, etc.)
A Project, with its associated Workstreams and Tasks, is initially created from a Template

The initial design was a large-cluster aggregate with Project as the aggregate root, holding a collection of ProjectUsers and Workstreams; each Workstream holds a collection of Tasks, and so on.

This approach was obviously going to lead to contention and performance issues, because the whole Project aggregate has to be loaded for any change within it.

Rightly or wrongly our next revision was to break the Comments out of the aggregate and to form a new aggregate using Comment as a root. The motivation for this was that the business envisaged there being a significant number of Comments raised against each Task.

As each Comment is related to a Task, a Comment needs to hold a foreign key back to the Task. However, this isn't possible following the principle that you can only reference another aggregate via its root. To overcome this we broke the Task out into another aggregate. This also seemed to satisfy the need for Tasks to be completed by different people, and again would reduce contention.

We then faced the same problem with the reference from the Task to the Workstream it belongs to, which led us to create a new Workstream aggregate, with the foreign key in the Task pointing back to the Workstream.

The result is:

A Project aggregate which only contains a list of Users assigned to the project
A Workstream aggregate which contains a foreign key to the Project
A Task aggregate which contains a foreign key to the Workstream
A Comments aggregate which contains a foreign key back to the Task

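The Project has a method to create a new instance of a Workstream, allowing us to set the foreign key. A slightly simplified version:

public class Project
{
    // Id was referenced but not declared in the original snippet;
    // it is assumed here to be a database-generated key.
    public int Id { get; private set; }
    public string Name { get; private set; }

    public Project(string name)
    {
        Name = name;
    }

    // Factory method: the new Workstream carries this Project's Id as its foreign key.
    public Workstream CreateWorkstream(string name)
    {
        return new Workstream(name, Id);
    }

    // ... plus methods for managing user assignment to the project
}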

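In a similar way, Workstream has a method to create a Task:

public class Workstream
{
    // Id was referenced but not declared in the original snippet;
    // it is assumed here to be a database-generated key.
    public int Id { get; private set; }
    public string Name { get; private set; }
    public int ProjectId { get; private set; }

    public Workstream(string name, int projectId)
    {
        Name = name;
        ProjectId = projectId;
    }

    // Factory method: the new Task carries this Workstream's Id as its foreign key.
    public Task CreateTask(string name)
    {
        return new Task(name, Id);
    }

    // Navigation collection, exposed read-only.
    private readonly List<Task> _activities = new List<Task>();
    public IEnumerable<Task> Activities => _activities.AsReadOnly();
}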

The Activities property has been added purely to support navigation when using the entities to build the read models.

The team are not comfortable with this approach; something doesn't feel right. The main concerns are:

It is felt that creating a project should logically be: create the project, add one or more workstreams to the project, add tasks to the workstreams, then let EF deal with persisting that object structure.
There is discomfort that the Project has to be created first, and that the developer needs to ensure it is persisted so that it gets an Id, ready for when the method to create the project content from the template is called, which depends on that Id for the foreign key. Is it okay to push the responsibility for this to a method in a domain service, CreateProjectFromTemplate(), to orchestrate the creation and persistence of the separate objects to each repository?
Is the method to create the new Workstream even in the correct place?
The entities are used to form the queries (supported by the navigation properties) which are used to create the read models. Maybe the concern is that the object structure is being influenced by how we need to present data in a read-only form.

We are now at the point where we are just going around in circles and could really use some advice to give us some direction.

It seems you're asking very diverse questions, some of which have already been covered multiple times on SO https://stackoverflow.com/questions/25742703/generating-identities-for-entities-in-ddd https://stackoverflow.com/questions/32082479/ddd-create-one-aggregate-root-within-another-ar — guillaume31, Jan 12 '18 at 09:42
Regarding domain modelling advice, it's bordering on "primarily opinion-based" at least as long as we don't know the use cases, invariants and collaborative/transactional load of the application. — guillaume31, Jan 12 '18 at 09:48

score 1 · Accepted Answer · answered Jan 11 '18 at 15:01

> The team are not comfortable with this approach; something doesn't feel right.

That's a very good sign.

> However this isn't possible following the principle that you can only reference another aggregate via its root.

You'll want to let go of this idea; it's getting in your way.

Short answer is that identifiers aren't references. Holding a copy of an identifier for another entity is fine.
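A minimal sketch of what "holding a copy of an identifier" can look like, assuming a Comment aggregate shaped roughly like the one in the question. The Author and Text properties are hypothetical, and the plain int key simply mirrors the question's code:

```csharp
using System;

// Comment is its own aggregate root. It stores the Task's identifier as a plain value
// and never holds an object reference to the Task (or to anything inside another aggregate).
public class Comment
{
    public int TaskId { get; }      // copied identifier, not a navigation to the Task
    public string Author { get; }   // hypothetical property, for illustration only
    public string Text { get; }     // hypothetical property, for illustration only

    public Comment(int taskId, string author, string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            throw new ArgumentException("A comment needs some text.", nameof(text));

        TaskId = taskId;
        Author = author;
        Text = text;
    }
}
```

Nothing in this class can reach into the Task's aggregate; it can only name the Task it belongs to.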

Longer answer: DDD is based on the work of Eric Evans, who was describing a style that had worked for him on Java projects at the beginning of the millennium.

The pain that he is struggling with is this: if the application is allowed object references to arbitrary data entities, then the behaviors of the domain end up getting scattered all over the code base. This increases the amount of work that you need to do to understand the domain, and it increases the cost of making (and testing!) changes.

The reaction was to introduce a discipline: isolate the data from the application by restricting the application's access to a few carefully constrained gatekeepers (the "aggregate root" objects). The application can hold object references to the root objects, and can send messages to those root objects, but the application cannot hold a reference to, or send a message directly to, the objects hidden behind the API of the aggregate.

Instead, the application sends a message to the root object, and the root object can then forward the message to other entities within its own aggregate.

Thus, if we want to send a message to a Task inside of some Project, we need some mechanism to know which project to load, so that we can send the message to the project to send a message to the Task.

Effectively, this means you need a function somewhere that can take a TaskId, and return the corresponding ProjectId.

The simplest way to do this is to store the two fields together:

{
    taskId: 67890,
    projectId: 12345
}
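The lookup described above could be sketched as follows. TaskLocator and its method names are hypothetical, and the in-memory dictionary is only a stand-in for whatever store actually keeps the two fields together:

```csharp
using System.Collections.Generic;

// Hypothetical lookup: given a TaskId, answer which Project has to be loaded.
public class TaskLocator
{
    private readonly Dictionary<int, int> _projectIdByTaskId = new Dictionary<int, int>();

    // Record the pair when a Task is created inside a Project.
    public void Register(int taskId, int projectId)
    {
        _projectIdByTaskId[taskId] = projectId;
    }

    // The "function that takes a TaskId and returns the corresponding ProjectId".
    public int ProjectIdFor(int taskId)
    {
        if (!_projectIdByTaskId.TryGetValue(taskId, out var projectId))
            throw new KeyNotFoundException($"No project is known for task {taskId}.");

        return projectId;
    }
}
```

With the pair above registered, ProjectIdFor(67890) yields 12345, which tells the application which Project root to load before forwarding the message to the Task.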

> It is felt that creating a project should logically be: create the project, add one or more workstreams to the project, add tasks to the workstreams, then let EF deal with persisting that object structure.

> Maybe the concern is that the object structure is being influenced by how we need to present data in a read-only form.

There's a sort of smell here, which is that you are describing the relations of a data structure. Aggregates aren't defined by relations so much as by changes: what has to change together, consistently, in a single transaction.

> Is it okay to push the responsibility for this to a method in a domain service CreateProjectFromTemplate

It's actually fairly normal to have a Draft aggregate (which understands editing) that is separate from a Published aggregate (which understands use). Part of the point of domain-driven design is to improve the business by noticing implicit boundaries between use cases and making them explicit.

You could use a domain service to create a project from a template, but in the common case, my guess is that you should do it "by hand": copy the state from the draft, and then use that state to create the project. That avoids confusion when a publish and an edit are happening concurrently.
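One possible reading of the "by hand" approach, as a sketch only. ProjectDraft, ProjectOutline, PublishedProject, Snapshot and From are all hypothetical names, and workstreams are reduced to their names to keep the example short:

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical snapshot of what a draft contained at one moment in time.
public class ProjectOutline
{
    public string Name { get; }
    public IReadOnlyList<string> WorkstreamNames { get; }

    public ProjectOutline(string name, IEnumerable<string> workstreamNames)
    {
        Name = name;
        WorkstreamNames = workstreamNames.ToList();   // copy, so later edits do not leak in
    }
}

// The draft aggregate understands editing.
public class ProjectDraft
{
    private readonly List<string> _workstreamNames = new List<string>();
    public string Name { get; private set; }

    public ProjectDraft(string name) { Name = name; }

    public void AddWorkstream(string name) => _workstreamNames.Add(name);

    // Copy the current state out of the draft.
    public ProjectOutline Snapshot() => new ProjectOutline(Name, _workstreamNames);
}

// The published aggregate understands use; it is built from the copied state only.
public class PublishedProject
{
    public string Name { get; }
    public IReadOnlyList<string> WorkstreamNames { get; }

    private PublishedProject(ProjectOutline outline)
    {
        Name = outline.Name;
        WorkstreamNames = outline.WorkstreamNames;
    }

    public static PublishedProject From(ProjectOutline outline) => new PublishedProject(outline);
}
```

Because the published project is created from a copy, an edit made to the draft after Snapshot() was taken does not show up in the project that was published from it.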

Robert Bräutigam · Answer 2 · 2018-01-11T13:14:16.470

Here is a different perspective that might nudge you out of your deadlock.

I feel you are doing data modeling instead of real domain modeling. You are concerned with a relational model that will be directly persisted using ORM (EF) and less concerned with the actual problem domain. That is why you are concerned that the project will load too many things, or which objects will hold foreign keys to what.

An alternative approach would be to forget persistence for a moment and concentrate on what things might need what responsibilities. With responsibilities I don't mean technical things like save/load/search, but things that the domain defines. Like creating a task, completing a task, adding a comment, etc. This should give you an outline of things, like:

interface Task {
    ...
    void CompleteBy(User user);
    ...
}

interface Project {
    ...
    Workstream CreateWorkstreamFrom(Template template);
    ...
}

Also, don't concentrate too much on what is an Entity, Value Object, Aggregate Root. First, represent your business correctly in a way you and your colleagues are happy with. That is the important part. Try to talk to non-technical people about your model, see if the language you are using fits, whether you can have a conversation with it. You can decide later what objects are Entities or Value Objects, that part is purely technical and less important.

One other point: don't bind your model directly to an ORM. ORMs are blunt instruments that will probably force you into bad decisions. You can use an ORM inside your domain objects, but don't make them be a part of the ORM. This way you can do your domain the right way, and don't have to be afraid to load too much for a specific function. You can do exactly the right things for all the business functions.
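A rough sketch of one way to read "use an ORM inside your domain objects": the class the ORM maps is a plain data holder, and the domain object wraps it and exposes only behaviour. TaskRow, TaskItem and the status strings are hypothetical, no EF Core API is used here, and the domain class is called TaskItem only to avoid confusion with System.Threading.Tasks.Task:

```csharp
using System;

// Hypothetical persistence shape: the only class an ORM such as EF Core would map.
public class TaskRow
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Status { get; set; }
    public string CompletedBy { get; set; }
}

// Domain object: exposes behaviour from the ubiquitous language and knows nothing about the ORM.
public class TaskItem
{
    private readonly TaskRow _row;   // state lives in the row but is never exposed directly

    public TaskItem(TaskRow row)
    {
        _row = row ?? throw new ArgumentNullException(nameof(row));
    }

    public bool IsComplete => _row.Status == "Complete";

    public void CompleteBy(string userName)
    {
        if (IsComplete)
            throw new InvalidOperationException("The task is already complete.");

        _row.Status = "Complete";
        _row.CompletedBy = userName;
    }
}
```

In this arrangement a repository would load a TaskRow however it likes, hand back a TaskItem, and save the row afterwards, so the shape of the domain API is not dictated by the mapping.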

That makes a lot of sense. In the scenario where we could have a large number of comments against a Task, does it make sense to break out Comment to a separate aggregate? If so, this leads on to the question of how we reference the Task the Comment belongs to without breaking Task out from the Project/Workstream/Task aggregate — stuartmanton, Jan 11 '18 at 14:15
Those are all technical things. What hidden properties the `Comment` has is irrelevant, an implementation detail. It can have foreign keys inside to different other objects. It can know about its Task, Project, or any other object. The important part is the API it has, and that it reflects the Domain. — Robert Bräutigam, Jan 11 '18 at 16:17
Do you have an example of your last point? Every example of DDD that I have seen is just EF Models used as aggregates. — maembe, Jun 19 '20 at 17:35
