Event Sourcing and Read Model generation

Question

Assume the Stack Overflow domain and the following event definitions:

UserRegistered(UserId, Name, Email)
UserNameChanged(UserId, Name)
QuestionAsked(UserId, QuestionId, Title, Question)

Assume the event store holds the following events (in order of appearance):

1) UserRegistered(1, "John", "john@gmail.com")
2) UserNameChanged(1, "SuperJohn")
3) UserNameChanged(1, "John007")
4) QuestionAsked(1, 1, "Help!", "Please!")

Assume the following denormalized read model for the question listing (the first page of SO):

QuestionItem(UserId, QuestionId, QuestionTitle, Question, UserName)

And the following event handler (which builds the denormalized read model):

public class QuestionEventsHandler
{
    public void Handle(QuestionAsked question)
    {
        var item = new QuestionItem(
            question.UserId, 
            question.QuestionId, 
            question.Title, 
            question.Question, 
            ??? /* how should I get the name of the user? */);
        ...
    }
}

My question is: how can I find the name of the user who asked a question? Or, more generally: how should I handle events if my denormalized read model requires additional data that does not exist in the particular event?

I've examined existing CQRS samples, including SimpleCQRS by Greg Young and the Fohjin sample by Mark Nijhof, but it seems to me that they operate only on data that is included in the events.

More discussion at: http://groups.google.com/group/dddcqrs/browse_thread/thread/5ead47db3261dcb5/b069165e48467f41?lnk=gst&q=include+information+in+event#b069165e48467f41 — Brian Low, Jun 16 '11 at 19:02

Chris Moutray · Answer 1 · 2012-02-06T14:46:05.407

26

Personally I think there's nothing wrong with looking up the user's name from within the event handler. But if you're in a position where you can't query the name from the User's read model then I'd introduce an additional event handler to QuestionEventsHandler, to handle the UserRegistered event.

That way the QuestionEventsHandler could maintain its own repository of user names (you wouldn't need to store the user's email). The QuestionAsked handler can then query the user's name directly from its own repository (as Rinat Abdullin said, storage is cheap!).

Besides, since your QuestionItem read model holds the user's name, you would need to handle the UserNameChanged event within the QuestionEventsHandler as well, to ensure the name field within the QuestionItem is up to date.

To me this seems less effort than 'enriching the events' and has the benefit of not building dependencies on other parts of the system and their read models.
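As a rough sketch of the approach described in this answer, the handler below keeps its own user-name lookup and also rewrites existing question rows on a rename. The in-memory dictionaries, the mutable QuestionItem class, the "(unknown)" placeholder, and the use of C# 9 records are assumptions made for illustration; they stand in for whatever storage the read side really uses.

```csharp
using System.Collections.Generic;

// Events as defined in the question (C# 9 records used for brevity).
public sealed record UserRegistered(int UserId, string Name, string Email);
public sealed record UserNameChanged(int UserId, string Name);
public sealed record QuestionAsked(int UserId, int QuestionId, string Title, string Question);

// Read-model row; UserName is settable so a rename can update it in place.
public sealed class QuestionItem
{
    public int UserId { get; init; }
    public int QuestionId { get; init; }
    public string QuestionTitle { get; init; } = "";
    public string Question { get; init; } = "";
    public string UserName { get; set; } = "";
}

public sealed class QuestionEventsHandler
{
    // Hypothetical in-memory stores standing in for the handler's own tables.
    private readonly Dictionary<int, string> _userNames = new();
    private readonly Dictionary<int, QuestionItem> _questions = new();

    public IReadOnlyCollection<QuestionItem> Questions => _questions.Values;

    // Only the name is kept; the email is not needed by this read model.
    public void Handle(UserRegistered e) => _userNames[e.UserId] = e.Name;

    public void Handle(UserNameChanged e)
    {
        _userNames[e.UserId] = e.Name;

        // Keep already-projected questions in sync with the new name.
        foreach (var item in _questions.Values)
        {
            if (item.UserId == e.UserId)
            {
                item.UserName = e.Name;
            }
        }
    }

    public void Handle(QuestionAsked e)
    {
        // Assumption: use a placeholder if no user event has been seen yet.
        var name = _userNames.TryGetValue(e.UserId, out var known) ? known : "(unknown)";

        _questions[e.QuestionId] = new QuestionItem
        {
            UserId = e.UserId,
            QuestionId = e.QuestionId,
            QuestionTitle = e.Title,
            Question = e.Question,
            UserName = name
        };
    }
}
```

Feeding the four events from the question through these handlers in order should leave one QuestionItem whose UserName is "John007", and a later UserNameChanged for the same user would overwrite that value.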

edited Feb 06 '12 at 14:46

answered Feb 06 '12 at 14:29

Chris Moutray

You may end up having lots of duplicate data everywhere in your read model if you are not careful – Narvalex Nov 13 '17 at 11:05
True but the read model is often denormalized and therefore will always have duplicate data. – Chris Moutray Nov 13 '17 at 13:56
As I understand it, question-related events and user-related events are dispatched in different topics / streams. So, for example, if a user first changes his username (a UsernameChanged event is published) and then creates a question (a QuestionCreatedEvent is published), there is no guarantee that the UsernameChanged event will be handled first. The handler for QuestionCreatedEvent may therefore read the old username from its local storage of usernames, and we would get a wrong read side. – Teimuraz Jul 24 '18 at 18:43
@Teimuraz I think the point you're making is that UserNameChanged could be processed after QuestionAsked; I don't think it makes a difference whether QuestionEventsHandler maintains its own repository of names. There are 2 scenarios for the order of events: 1) name-changed then question-asked, or 2) question-asked then name-changed. In both scenarios, if you care that the name is out of date on the question, then QuestionEventsHandler would need to handle the name-changed event. E.g. think about question-asked, then name-changed after, say, 1 week or 1 month or even a year later... – Chris Moutray Jul 25 '18 at 15:17
Yes, you are right. We can use both directions: when QuestionAskedEvent is handled, we get the username from local username storage (which is updated on UsernameChangedEvent). When we handle the UsernameChanged event in local username storage, we can also update usernames in the questions projection if we need to. – Teimuraz Jul 25 '18 at 15:25

score 4 · Accepted Answer · answered Nov 03 '10 at 08:29

Just enrich the event with all the necessary information.

Greg's approach, as I recall, is to enrich the event when creating it and store/publish it that way.
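A minimal sketch of what enriching the event could look like, assuming the write side can resolve the user's current name at the moment the question is asked. The extra UserName field, the AskQuestionHandler class, and the Func-based name lookup are hypothetical names introduced for this example, not part of the original event definitions.

```csharp
using System;

// QuestionAsked extended with the user's name as it was when the question was asked.
public sealed record QuestionAsked(int UserId, int QuestionId, string Title, string Question, string UserName);

public sealed record QuestionItem(int UserId, int QuestionId, string QuestionTitle, string Question, string UserName);

// Write side: the event is enriched once, when it is created.
public sealed class AskQuestionHandler
{
    // Hypothetical lookup of the current user name on the write side.
    private readonly Func<int, string> _currentUserName;

    public AskQuestionHandler(Func<int, string> currentUserName) => _currentUserName = currentUserName;

    public QuestionAsked Handle(int userId, int questionId, string title, string question) =>
        new QuestionAsked(userId, questionId, title, question, _currentUserName(userId));
}

// Read side: the projection needs nothing beyond the event itself.
public static class QuestionProjection
{
    public static QuestionItem Project(QuestionAsked e) =>
        new QuestionItem(e.UserId, e.QuestionId, e.Title, e.Question, e.UserName);
}
```

With this shape the read-side handler stays a pure mapping from event to row, but the stored name is a snapshot: a later UserNameChanged still has to be handled separately if the listing should show the new name.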

answered Nov 03 '10 at 08:29

Rinat Abdullin

Thanks Rinat! I will go with your suggestion, but do you really agree that this is a good solution to enrich domain events with data that is needed only for read model? – Dmitry Schetnikovich Nov 04 '10 at 19:18
Yes. There are no big drawbacks that I can see. Besides, storage is cheap these days and enriched domain events help to analyze your system later as well. For example, I often put a lot of performance stats in operations that stress IO or CPU; this info is not even used in read models. But if I need to optimize performance, I can query the domain log with LINQ for the history of operations and exact performance details. – Rinat Abdullin Nov 05 '10 at 10:30
One drawback would be that you aren't necessarily in a position to change the event producer. Another is that the event will become bloated with incoherent data as time goes by and more and more event handlers, each wanting its specific piece of information, get added to the system. What are the drawbacks of letting the event handler query the read model for the extra information? – Sebastian Ganslandt Jun 19 '11 at 13:58
This answer suggests that all data needed should be in the event. I disagree with this as a rule of thumb. Event handlers will be creating a de-normalised record, often containing aggregated & calculated fields that aren't necessarily sourced from a single aggregate. Perhaps a 'NumberOfQuestionsByMonth' view model as an example. This is the point of CQRS; it does these calculations when something is persisted, rather than when queried. To do these calculations you often need to query and process the data. This data is outside of the scope of the aggregate, and cannot be passed in the event. – David Masters Feb 06 '12 at 15:06
This seems like a messy solution. If you want to join two aggregates, then listen to events from both, or project into a read model which can do the joins for you. – Sebastian Good May 15 '12 at 02:49
I disagree as well, as one drawback may be that you need to change all events that happened in the past to contain the additional data as well (given the scenario that you do not know right from the start which data may be necessary at a later point in time). You for sure do not want to include everything that may potentially be of interest in the future. – Golo Roden Sep 09 '12 at 15:34
@SebastianGanslandt The drawback is that the event handler may be querying information that is not there yet (in the read model) due to eventual consistency. – Narvalex Nov 15 '17 at 11:49
@RinatAbdullin What happens when the user changes their name? Is there going to be a massive update in the read model? I think that is a big problem in fully denormalized systems: massive updates on duplicated data every time, instead of in just a few places – Narvalex Nov 15 '17 at 11:53
@Narvalex You don't always want to update denormalised data. Orders for example must not change after a certain point in the ordering process (in some domains), so a confirmed or completed order, or the corresponding delivery note, must not change. If you really have to run many updates, then select a store that is optimised for the types of queries you need to run - and the updates (imho). While it is good to have additional information on the event, more complex read models could benefit from a normalised lookup. Lookup can be reused if some projections are driven by a single handler. – urbanhusky Nov 16 '17 at 08:04

score 1 · Answer 3 · answered Jan 11 '12 at 20:12

Pull events from the EventStore.

Remember - Your read models need to have read-only access to the EventStore already. Read Models are disposable. They are simply cached views. You should be able to delete / expire your Read Models at any time - and automatically rebuild your ReadModels from the EventStore. So - your ReadModelBuilders must already be able to query for past events.

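public class QuestionEventsHandler
{
    public void Handle(QuestionAsked question)
    {
        // Get the user's current name: the latest rename if there is one,
        // otherwise the name given at registration (the user may never have renamed).
        var nameChangedEvent = eventRepository.GetLastEventByAggregateId<UserNameChanged>(question.UserId);
        var userName = nameChangedEvent != null
            ? nameChangedEvent.Name
            : eventRepository.GetLastEventByAggregateId<UserRegistered>(question.UserId).Name;

        var item = new QuestionItem(
            question.UserId,
            question.QuestionId,
            question.Title,
            question.Question,
            userName);
    }
}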
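The handler above calls eventRepository.GetLastEventByAggregateId<T>, which the answer does not define. One possible in-memory sketch is shown here, assuming every event exposes the id of the aggregate it belongs to; the IEvent interface, its AggregateId property, and the InMemoryEventRepository class are hypothetical and only illustrate the lookup, not a real event store API.

```csharp
using System.Collections.Generic;
using System.Linq;

// Assumption: every stored event can report which aggregate it belongs to.
public interface IEvent
{
    int AggregateId { get; }
}

public sealed record UserRegistered(int UserId, string Name, string Email) : IEvent
{
    public int AggregateId => UserId;
}

public sealed record UserNameChanged(int UserId, string Name) : IEvent
{
    public int AggregateId => UserId;
}

// Hypothetical read-only view over an append-ordered list of events.
public sealed class InMemoryEventRepository
{
    private readonly List<IEvent> _events = new();

    public void Append(IEvent e) => _events.Add(e);

    // Returns the most recent event of type T for the aggregate,
    // or null when no such event has been stored.
    public T GetLastEventByAggregateId<T>(int aggregateId) where T : class, IEvent =>
        _events.OfType<T>().LastOrDefault(e => e.AggregateId == aggregateId);
}
```

Because the method returns null when the aggregate has no event of the requested type, callers need a fallback for users who registered but never changed their name. A linear scan like this is only reasonable for small histories; a real store would index events by aggregate id.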

Also realize - the EventStore repository need not be the true EventStore, although it certainly could be. The advantage of distributed systems is that you can easily replicate the EventStore, if you needed, closer to your ReadModels.

I encountered this exact same scenario... where I needed more data than was available in a single event. This is especially true for Create-type events which require populating a new ReadModel with initial state.

From Read Models: You could pull from other Read Models. But I really don't recommend this, as you would introduce a big ball of dependency mud where views depend on views depend on views.

Additional Data in Events: You really don't want to bloat your events with all the extra data you'll need for views. It will really hurt you when your domain changes & you need to migrate events. Domain Events have specific purposes - they represent state changes. Not view data.

Hope this helps -

Ryan

This assumes you have access to the event repository; in my mind the handlers should only receive events rather than pull events. Would this work in a highly distributed system? – Chris Moutray Feb 06 '12 at 14:09
"Your read models need to have read-only access to the EventStore". That's news to me. I see no reason why the read side needs to have access to the event store. It simply needs to be alerted when an event occurs. I personally don't have a problem with querying out the username from the read side, given the handler exists on the read side anyway. – David Masters Feb 06 '12 at 14:45
In our current project we will most likely have read access to the event store, because a stream of events related to a particular aggregate is to be displayed in the UI, Facebook timeline style. Yes, we could use a projection and generate that (and perhaps we will if this needs some denormalization), but for now we're finding querying the most straightforward way. – Dav Aug 02 '12 at 06:16
