23

This is easy for me to perform in TSQL, but I'm just sitting here banging my head against the desk trying to get it to work in EF4!

I have a table, lets call it TestData. It has fields, say: DataTypeID, Name, DataValue.

DataTypeID, Name, DataValue
1,"Data 1","Value1"
1,"Data 1","Value2"
2,"Data 1","Value3"
3,"Data 1","Value4"

I want to group on DataID/Name, and concatenate DataValue into a CSV string. The desired result should contain -

DataTypeID, Name, DataValues
1,"Data 1","Value1,Value2"
2,"Data 1","Value3"
3,"Data 1","Value4"

Now, here's how I'm trying to do it -

var query = (from t in context.TestData
  group h by new { DataTypeID = h.DataTypeID, Name = h.Name } into g
  select new
 {
   DataTypeID = g.Key.DataTypeID,
   Name = g.Key.Name,
   DataValues = (string)g.Aggregate("", (a, b) => (a != "" ? "," : "") + b.DataValue),
 }).ToList()

The problem is that LINQ to Entities does not know how to convert this into SQL. This is part of a union of 3 LINQ queries, and I'd really like it to keep it that way. I imagine that I could retrieve the data and then perform the aggregate later. For performance reasons, that wouldn't work for my app. I also considered using a SQL server function. But that just doesn't seem "right" in the EF4 world.

Anyone care to take a crack at this?

Phil Figgins
  • 796
  • 1
  • 8
  • 22
  • Thanks, looks like if you had posted this as an answer, it'd be the accepted one. I'm pretty disappointed, although not completely surprised, this isn't possible. – Phil Figgins Nov 18 '10 at 15:47
  • In general, if you can't do it in (reasonably standard) SQL, you probably can't do it in L2E. Standard SQL doesn't have a way of taking an arbitrarily long list and concatenating it into a string. You could write a SQL function/proc for this, and you can call in with L2E, though. – Craig Stuntz Nov 18 '10 at 16:23

5 Answers5

11

If the ToList() is part of your original query and not just added for this example, then use LINQ to Objects on the resulting list to do the aggregation:

var query = (from t in context.TestData
            group t by new { DataTypeID = t.DataTypeID, Name = t.Name } into g 
            select new { DataTypeID = g.Key.DataTypeID, Name = g.Key.Name, Data = g.AsEnumerable()})
            .ToList()
            .Select (q => new { DataTypeID = q.DataTypeID, Name = q.Name, DataValues = q.Data.Aggregate ("", (acc, t) => (acc == "" ? "" : acc + ",") + t.DataValue) });

Tested in LINQPad and it produces this result:

alt text

Jeff Ogata
  • 56,645
  • 19
  • 114
  • 127
  • My apologies, the ToList() is at the end of everything, I should have left it out of the example. I have 3 LINQ queries unioned and I was hoping to get this CSV field in the LINQ query before the ToList. It seems that this is not possible per the link from moi_meme above. Thanks for trying! – Phil Figgins Nov 18 '10 at 15:29
  • Reason for downvoting: "I imagine that I could retrieve the data and then perform the aggregate later. For performance reasons, that wouldn't work for my app." You did not answer the question. – Pluc Jan 21 '15 at 21:53
  • @Jeff Ogata this is elegant. Would you please explain what's going on? Especially the last line? – Tigran Jul 21 '15 at 23:56
  • That's a really complicated way to do `String.Join`. – NetMage Mar 03 '21 at 22:08
8

Some of the Answers suggest calling ToList() and then perform the calculation as LINQ to OBJECT. That's fine for a little amount of data, but what if I have a huge amount of data that I do not want to load into memory too early, then, ToList() may not be an option.

So, the better idea would be to process/format the data in the presentation layer and let the Data Access layer do only loading or saving raw data that SQL likes. Moreover, in your presentation layer, most probably you are filtering the data by paging, or maybe you are showing one row in the details page, so, the data you will load into the memory is likely smaller than the data you load from the database. (Your situation/architecture may be different,.. but I am saying, most likely).

I had a similar requirement. My problem was to get the list of items from the Entity Framework object and create a formatted string (comma separated value)

  1. I created a property in my View Model which will hold the raw data from the repository and when populating that property, the LINQ query won't be a problem because you are simply querying what SQL understands.

  2. Then, I created a get-only property in my ViewModel which reads that Raw entity property and formats the data before displaying.

     public class MyViewModel
     {
         public IEnumerable<Entity> RawChildItems { get; set; }
    
         public string FormattedData
         {
             get
             {
                 if (this.RawChildItems == null)
                     return string.Empty;
    
                 string[] theItems = this.RawChildItems.ToArray();
    
                 return theItems.Length > 0
                     ? string.Format("{0} ( {1} )", this.AnotherRegularProperty, String.Join(", ", theItems.Select(z => z.Substring(0, 1))))
                     : string.Empty;
             }
         }
     }
    

Ok, in that way, I loaded the Data from LINQ to Entity to this View Model easily without calling.ToList().

Example:

IQueryable<MyEntity> myEntities = _myRepository.GetData();

IQueryable<MyViewModel> viewModels = myEntities.Select(x => new MyViewModel() { RawChildItems = x.MyChildren })

Now, I can call the FormattedData property of MyViewModel anytime when I need and the Getter will be executed only when the property is called, which is another benefit of this pattern (lazy processing).

An architecture recommendation: I strongly recommend to keep the data access layer away from all formatting or view logic or anything that SQL does not understand.

Your Entity Framework classes should be simple POCO that can directly map to a database column without any special mapper. And your Data Access layer (say a Repository that fetches data from your DbContext using LINQ to SQL) should get only the data that is directly stored in your database. No extra logic.

Then, you should have a dedicated set of classes for your Presentation Layer (say ViewModels) which will contain all logic for formatting data that your user likes to see. In that way, you won't have to struggle with the limitation of Entity Framework LINQ. I will never pass my Entity Framework model directly to the View. Nor, I will let my Data Access layer creates the ViewModel for me. Creating ViewModel can be delegated to your domain service layer or application layer, which is an upper layer than your Data Access Layer.

Emran Hussain
  • 11,551
  • 5
  • 41
  • 48
7

Thanks to moi_meme for the answer. What I was hoping to do is NOT POSSIBLE with LINQ to Entities. As others have suggested, you have to use LINQ to Objects to get access to string manipulation methods.

See the link posted by moi_meme for more info.

Update 8/27/2018 - Updated Link (again) - https://web.archive.org/web/20141106094131/http://www.mythos-rini.com/blog/archives/4510

And since I'm taking flack for a link-only answer from 8 years ago, I'll clarify just in case the archived copy disappears some day. The basic gist of it is that you cannot access string.join in EF queries. You must create the LINQ query, then call ToList() in order to execute the query against the db. Then you have the data in memory (aka LINQ to Objects), so you can access string.join.

The suggested code from the referenced link above is as follows -

var result1 = (from a in users
                b in roles
           where (a.RoleCollection.Any(x => x.RoleId = b.RoleId))
           select new 
           {
              UserName = a.UserName,
              RoleNames = b.RoleName)                 
           });

var result2 = (from a in result1.ToList()
           group a by a.UserName into userGroup
           select new 
           {
             UserName = userGroup.FirstOrDefault().UserName,
             RoleNames = String.Join(", ", (userGroup.Select(x => x.RoleNames)).ToArray())
           });

The author further suggests replacing string.join with aggregate for better performance, like so -

RoleNames = (userGroup.Select(x => x.RoleNames)).Aggregate((a,b) => (a + ", " + b))
Phil Figgins
  • 796
  • 1
  • 8
  • 22
  • 1
    Link dead- see https://web.archive.org/web/20141106094131/http://www.mythos-rini.com/blog/archives/4510 – jnm2 Sep 02 '15 at 14:21
  • I think for entity framework 3.5 in 2010, it was pretty good. It certainly answered my question at the time. Are there better approaches we could use today? Sure. But we have EF core 2.1 now. Good Lord, this is 8 years old! What's with all the hate? – Phil Figgins Oct 06 '18 at 03:49
  • 7
    Replacing string.join with aggregate gives worse (very worse) perfomance. Check the source code for string,join, it use a fast StringBuilder instead of string concatenation – edc65 Oct 08 '19 at 14:21
0

You are so very close already. Try this:

var query = (from t in context.TestData
  group h by new { DataTypeID = h.DataTypeID, Name = h.Name } into g
  select new
 {
   DataTypeID = g.Key.DataTypeID,
   Name = g.Key.Name,
   DataValues = String.Join(",", g),
 }).ToList()

Alternatively, you could do this, if EF doesn't allow the String.Join (which Linq-to-SQL does):

var qs = (from t in context.TestData
  group h by new { DataTypeID = h.DataTypeID, Name = h.Name } into g
  select new
 {
   DataTypeID = g.Key.DataTypeID,
   Name = g.Key.Name,
   DataValues = g
 }).ToArray();

var query = (from q in qs
            select new
            {
                q.DataTypeID,
                q.Name,
                DataValues = String.Join(",", q.DataValues),
            }).ToList();
Enigmativity
  • 113,464
  • 11
  • 89
  • 172
  • 3
    EF doesn't support string.join and I was hoping to get this into the main query. It's easy enough if I wrap with another query. I just was hoping to get this to all happen on the SQL side, not the client side. – Phil Figgins Nov 18 '10 at 15:31
  • `ToArray` is worse than `AsEnumerable` and never a good choice unless you have an overwhelming requirement for an array. – NetMage Oct 05 '18 at 21:09
  • @NetMage - Why do you say "never a good choice unless you have an overwhelming requirement for an array"? Can you break down why it's not a good choice and what you mean by an "overwhelming requirement"? – Enigmativity Oct 06 '18 at 01:31
  • 2
    `ToArray` is worse than `ToList` because it returns an exact storage match for the items contained, so it has to do a [final resize](https://stackoverflow.com/a/16323412/2557128) (e.g. create the final `Array` object) after it has loaded all elements. `AsEnumerable` simply converts the type to `IEnumerable` (query execution is still deferred), any subsequent LINQ method will cause the `IQueryable` to be enumerated. I am not sure how enumerating the query will allocate memory, but theoretically it could be more efficient. OTOH, if you planned to use `qs` later, `ToList()` would be better. – NetMage Oct 08 '18 at 19:58
  • @NetMage - You're right about the final resize. But my experience is that leaving queries as lazy often means that the query runs multiple times unless you're careful. – Enigmativity Oct 08 '18 at 23:52
-1

Maybe it's a good idea to create a view for this on the database (which concatenates the fields for you) and then make EF use this view instead of the original table?

I'm quite sure it's not possible in a LINQ statement or in the Mapping Details.

Gerrie Schenck
  • 22,148
  • 20
  • 68
  • 95