0

I am using Entity Framework with a MySQL Database and DbContext. I have an entity "Message" that has a related entity "Sender". (The "Message" also has a related entity "Receiver"). I am trying to write a query that will return only the "latest" message for each receiver. But when I do this, I also want to load the associated "Sender" so that I can get access to one of the Sender's properties (e-mail field) which I need to include in my Data Transfer Object I am returning. "MessageDTO" is the Data Transfer Object I am returning that includes the id of the message, the content of the message and the sender's email.

If I exclude the sender's email from the DTO, then the following query returns exactly what I need (namely, the latest message for each receiver):

var refGroupQuery = (from m in dbContext.Messages.SqlQuery("select * from messages order by created_at desc")
     group m by m.receiver_id into refGroup
     select new MessageDTO { id = refGroup.FirstOrDefault().id, content = refGroup.FirstOrDefault().content});

However, the above statement does not load the Sender that is associated with the Message so when I re-include the sender's email in the DTO I get a NullReferenceException as shown below:

var refGroupQuery = (from m in dbContext.Messages.SqlQuery("select * from messages order by created_at desc")
     group m by m.receiver_id into refGroup
     select new MessageDTO { id = refGroup.FirstOrDefault().id, content = refGroup.FirstOrDefault().content, sender_email = refGroup.FirstOrDefault().sender.email});

refGroup.FirstOrDefault().sender.email throws a NullReferenceException because sender is null.

How can I load the Sender in the query so that I can include the Sender's email in my DTO?

EDIT:

As requested, I include the SQL that gets generated by Gert Arnold's suggested method:

{SELECT
1 AS `C1`, 
`Apply1`.`id`, 
`Apply1`.`sender_id`, 
`Apply1`.`RECEIVER_ID1` AS `receiver_id`, 
`Apply1`.`created_at`, 
`Apply1`.`read_status`, 
`Extent3`.`email`
FROM (SELECT
`Distinct1`.`receiver_id`, 
(SELECT
`Project2`.`id`
FROM (SELECT
`Extent2`.`id`, 
`Extent2`.`sender_id`, 
`Extent2`.`receiver_id`, 
`Extent2`.`created_at`, 
`Extent2`.`read_status`
FROM `messages` AS `Extent2`
 WHERE (`Extent1`.`receiver_id` = `Extent2`.`receiver_id`) OR ((`Extent1`.`receiver_id` IS  NULL) AND (`Extent2`.`receiver_id` IS  NULL))) AS `Project2` LIMIT 1) AS `id`, 
(SELECT
`Project2`.`sender_id`
FROM (SELECT
`Extent2`.`id`, 
`Extent2`.`sender_id`, 
`Extent2`.`receiver_id`, 
`Extent2`.`content`, 
`Extent2`.`created_at`, 
`Extent2`.`read_status`
FROM `messages` AS `Extent2`
 WHERE (`Extent1`.`receiver_id` = `Extent2`.`receiver_id`) OR ((`Extent1`.`receiver_id` IS  NULL) AND (`Extent2`.`receiver_id` IS  NULL))) AS `Project2` LIMIT 1) AS `sender_id`, 
(SELECT
`Project2`.`receiver_id`
FROM (SELECT
`Extent2`.`id`, 
`Extent2`.`sender_id`, 
`Extent2`.`receiver_id`, 
`Extent2`.`content`, 
`Extent2`.`created_at`,
`Extent2`.`read_status`
FROM `messages` AS `Extent2`
 WHERE (`Extent1`.`receiver_id` = `Extent2`.`receiver_id`) OR ((`Extent1`.`receiver_id` IS  NULL) AND (`Extent2`.`receiver_id` IS  NULL))) AS `Project2` LIMIT 1) AS `RECEIVER_ID1`, 
(SELECT
`Project2`.`receivable_type`
FROM (SELECT
`Extent2`.`id`, 
`Extent2`.`sender_id`, 
`Extent2`.`receiver_id`, 
`Extent2`.`content`, 
`Extent2`.`created_at`,  
`Extent2`.`read_status`
 WHERE (`Extent1`.`receiver_id` = `Extent2`.`receiver_id`) OR ((`Extent1`.`receiver_id` IS  NULL) AND (`Extent2`.`receiver_id` IS  NULL))) AS `Project2` LIMIT 1) AS `content`, 
(SELECT
`Project2`.`created_at`
FROM (SELECT
`Extent2`.`id`, 
`Extent2`.`sender_id`, 
`Extent2`.`receiver_id`,  
`Extent2`.`content`, 
`Extent2`.`created_at`, 
`Extent2`.`read_status`
FROM `messages` AS `Extent2`
 WHERE (`Extent1`.`receiver_id` = `Extent2`.`receiver_id`) OR ((`Extent1`.`receiver_id` IS  NULL) AND (`Extent2`.`receiver_id` IS  NULL))) AS `Project2` LIMIT 1) AS `created_at`, 
(SELECT
`Project2`.`updated_at`
FROM (SELECT
`Extent2`.`id`, 
`Extent2`.`sender_id`, 
`Extent2`.`receiver_id`, 
`Extent2`.`content`, 
`Extent2`.`created_at`, 
`Extent2`.`read_status`
FROM `messages` AS `Extent2`
 WHERE (`Extent1`.`receiver_id` = `Extent2`.`receiver_id`) OR ((`Extent1`.`receiver_id` IS  NULL) AND (`Extent2`.`receiver_id` IS  NULL))) AS `Project2` LIMIT 1) AS `read_status`
FROM (SELECT DISTINCT 
`Extent1`.`receiver_id`
FROM `messages` AS `Extent1`) AS `Distinct1`) AS `Apply1` LEFT OUTER JOIN `users` AS `Extent3` ON `Apply1`.`sender_id` = `Extent3`.`id`}
Stanley
  • 5,261
  • 9
  • 38
  • 55
  • Why do you use `SqlQuery`? Without it the whole query would get translated into SQL and the sender would be joined in. – Gert Arnold Oct 03 '13 at 17:26
  • Using SqlQuery was the only way I could get the "order by" to execute before the "group by". If I replace the SqlQuery with an "order by" directive the query no longer returns the latest record for each group, but just some record for each group. I read in http://stackoverflow.com/questions/5140785/mysql-order-before-group-by that "the selection of values from each group cannot be influenced by adding an ORDER BY clause. Sorting of the result set occurs after values have been chosen, and ORDER BY does not affect which values the server chooses" – Stanley Oct 04 '13 at 05:13

1 Answers1

0

You don't need the SqlQuery construct to do the ordering before the grouping:

var refGroupQuery = from m in dbContext.Messages
     group m by m.receiver_id into refGroup
     let firstItem = refGroup.OrderByDescending(x => x.created_at)
                             .FirstOrDefault()
     select new MessageDTO { 
                              id = firstItem.id, 
                              content = firstItem.content,
                              sender_email = firstItem.sender.email
                           };

This does the same, but it translates the whole statement into SQL, which has two advantages

  • sender is not lazy loaded for each message
  • sender.email does not crash when sender is null, because in SQL there is no null object reference. The whole expression (sender.email) just returns null.
Gert Arnold
  • 105,341
  • 31
  • 202
  • 291
  • This gives an error "Unknown column 'Extent1.receiver_id' in 'where clause'" – Stanley Oct 04 '13 at 07:28
  • I actually used exactly the method you provided. All I do additionally in my method is to return refGroupQuery to the calling function. I wonder if this is a bug in MySQL or the MySQL Adapter. I found http://bugs.mysql.com/bug.php?id=68513 and I am indeed using version 6.6.5 of the adapter. So looks like I'll need to upgrade to a new version first... – Stanley Oct 04 '13 at 08:06
  • I mean the whole SQL query :) I'm afraid a bug is very likely, but it could be a mapping problem. What does the mapping between `Message` and `Receiver` look like? Did you explicitly name the `receiver_id` column or do you rely on EF's default naming conventions? (I'm assuming you work code first here). – Gert Arnold Oct 04 '13 at 08:12
  • I have added the generated SQL query to the question. I actually followed a database-first approach as we had an existing database that was Ruby-on-Rails serviced and where the Web API functionality is being moved over to .NET. I am not relying on EF's default naming conventions as they will not work in our case. The "messages" table has a "receiver_id" field which is the foreign key for the "id" in the "users" table. So the message will have one receiver. – Stanley Oct 04 '13 at 12:27
  • Sorry - I pasted the wrong segment of code. It is correct now. (For the sake of a simplified question I simplified my actual model and column names slightly for the sake of the question.) – Stanley Oct 04 '13 at 12:36
  • 1
    Definitely a bug. `Extent1` is scoped to a subquery at the bottom, but referred to everywhere. – Gert Arnold Oct 04 '13 at 12:42
  • Ok thanks. I will give this same query a try when I get to upgrade the MySQL adapter to a newer version. Unfortunately this means an upgrade of some other components first. But will let you know. It seems the MySQL adapter is buggy enough to warrant an upgrade of everything. – Stanley Oct 04 '13 at 12:47