11

I would like to know If It's possible to create a "one-line" Linq to retrieve longest string value of specific Datatable column, meaning that all column data (numbers, dates,strings...) should be converted to string and then return longest string.

What I've found is only how to obtain longest string from a List, or max length value.

This is what I tried so far (error in Length):

string maxString = dt
  .AsEnumerable()
  .Select(row => row[mycolumn].ToString())
  .Where(s => s.OrderByDescending(st => st.Length).First());
Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215
Lucy82
  • 654
  • 2
  • 12
  • 32
  • "This is what I tried so far (obviously not working)" > Not so obvious, please explain. – Patrick Hofman Apr 15 '19 at 07:18
  • @PatrickHofman, sorry, this code returns me "char does not contain a deifintion for Length" – Lucy82 Apr 15 '19 at 07:20
  • 1
    No need to use `Where()`. This should do the trick: `.Select(row => row[mycolumn].ToString()).OrderByDescending(st => st.Length).First();` – Stephan Bauer Apr 15 '19 at 07:20
  • @StephanBauer, that was fast. Thanks, I knew It had to be something simple as that :) – Lucy82 Apr 15 '19 at 07:24
  • If you are loading that datatable from a database you should probably consider writing SQL to do this instead. Loading an entire table into memory and processing it with Linq is probably a lot slower. – Marie Apr 15 '19 at 13:26

3 Answers3

16

You are looking for ArgMax - a value such that it has max value of some kind of property. Standard Linq doesn't provide ArgMax but you can implement it via Aggregate (i.e. get a single value from a sequence):

 string maxString = dt
   .AsEnumerable()
   .Select(row => row[mycolumn].ToString())
   .Aggregate((s, a) => a.Length > s.Length ? a : s);
Dmitry Bychenko
  • 180,369
  • 20
  • 160
  • 215
  • 2
    This solution is likely much faster than the accepted answer since it doesn't have to actually sort the list, just iterate it once. – Marie Apr 15 '19 at 13:29
  • 2
    @Marie yes, I tested, this is the fastest one. Though It differs for 1 second comparing to Gilad Green answer in my test for exporting data into Excel with OpenXML for 320k+ rows and 7 columns. – Lucy82 Apr 16 '19 at 07:27
10

You are almost there:

string maxString = dt.AsEnumerable()
                     .Select(row => row[mycolumn].ToString())
                     .OrderByDescending(st => st.Length).FirstOrDefault();

A Where expects a predicate (function that will return true or false). Instead just order the projection (the .Select) as you did and retrieve the first item.


Notice that is is an O(nlogn) solution which can be improved to an O(n) solution by not sorting but by finding the item with the max length. One possible way of doing so is an in Dimitry's answer. For less than huge collections I'm not sure one would really feel the difference but it is indeed worth noticing this.


See that you can also use MoreLinq's .MaxBy that can be added through Nuget (For the GitHub repo) which will both give you the O(n) performance and the desired "one-liner":

var row = dt.AsEnumerable().MaxBy(r => r[mycolumn].ToString().Length);
Gilad Green
  • 36,708
  • 7
  • 61
  • 95
  • 3
    Isn't sorting to get the max a innefficient? Or does LINQ optimise that? – RoadRunner Apr 15 '19 at 08:06
  • 1
    @RoadRunner - you are right that sorting will be less efficient. This is an O(nlogn) solution while one can do O(n) as in the answer below. I tried to keep as close to the original as possible – Gilad Green Apr 15 '19 at 08:55
  • IMO you should not recommend FirstOrDefault, you should recommend a sanity check first. Then you dont have to worry about null popping up. – Marie Apr 15 '19 at 13:28
3

First of all do not use AsEnumerable right after dt.. Write somehow like this:

dt.OrderByDescending(row => row[mycolumn].Length).First();
NoImagination
  • 178
  • 1
  • 8
  • AsEnumerable shouldn't be used because it transfers calculations to app memory, when my example forces it to perform calculations at sql server. When all calculations done First() transfers the result to app memory. Try my example and tell me if it works, ok? – NoImagination Apr 15 '19 at 08:38
  • @Lucy82, well i checked my example and it works perfectly :). But as people above noticed it would be much better to find Max value by aggregation function. It requires just 1 cycle when OrderBy sorts. – NoImagination Apr 15 '19 at 12:39
  • @Lucy82 Neither the accepted, nor Dmitry's, answer would be good for large datasets that are persisted in a database. Ie. if dt is a table in a database and contains thousands of rows the other answers would first fetch all those rows and then do the sorting to get the max value. – Taemyr Apr 15 '19 at 12:44
  • @NoImagination I might be wrong but the question isnt tagged entity or anything. Isn't "when my example forces it to perform calculations at sql server" only true if they are using an ORM that supports that functionality? – Marie Apr 15 '19 at 13:31
  • @NoImagination It stands to reason that "dt" stands for DataTable (the `row[column]` part also makes no sense at all for a EF query) which means that leaving the AsEnumerable off simply won't work. It also means that the data is already in memory so really nothing lost there. – Voo Apr 15 '19 at 17:57
  • @NoImagination, Voo has a point, It doesn't work, I received "Datatable does not contain a definition for 'OrderByDescending'..." error. Though I'm testing all answers now and will post which one is fastest, including yours with AsEnumerable. – Lucy82 Apr 16 '19 at 06:48
  • I've tested speed of all answers with my code for exporting data into Excel with OpenXML for 320k+ rows and 7 columns. Code that NoImagination posted doesn't work, and difference between Dmitry Bychenko solution and Gilad Green is around 1 second. Winner is Dmitry's solution :) – Lucy82 Apr 16 '19 at 07:25
  • @Lucy82, then leave AsEnumerable and test my example – NoImagination Apr 16 '19 at 08:50
  • @NoImagination, sorry for wrong expression in my last comment. Your example doesn't work both ways, with or without AsEnumerable, there are errors. With AsEnumerable you get error for "Length". – Lucy82 Apr 16 '19 at 08:57
  • @Lucy82, strange thing. I tested this and all worked. Anyway, nevermind) – NoImagination Apr 16 '19 at 10:39