List determine max length by Linq

Question

I have the following structure

List<string[]> sList = new List<string[]>() {
    new[] { "x", "xxx", "xxxx" },  //1,3,4
    new[] { "x", "xx", "xx" },     //1,2,2
    new[] { "xxxxxx", "xx", "xx" } //6,2,2
};

and I need to determine the maximum string.length of the items by column

In this example the expected result should be:

List<int> Result = new List<int>() { 6, 3, 4 };

Is there a easy Linq-approach?

my effort (working but not using Linq):

List<int> Result = new List<int>();
foreach (string[] line in Table)
{
    for (int i = 0; i < line.Length; i++)
    {
        if (Result.Count > i)
        {
            if (Result[i] < line[i].Length)
            {
                Result[i] = line[i].Length;
            }
        }
        else
        {
            Result.Insert(i, line[i].Length);
        }
    }
}

the number of rows/columns is dynamic but each row has the same number of columns.

If you have a working solution, why are you trying to change it? — juharr, Sep 22 '15 at 12:29
@juharr - what a ridiculous question. Coding isn't just about getting something working. — , Sep 22 '15 at 12:30
@TimSchmelter you code is a good working, reliable part in my application. — Byyo, Apr 11 '16 at 07:28

score 21 · Answer 1 · answered Sep 22 '15 at 12:30

21

This is the best that I could come up with:

List<int> Result = sList.First().Select((x, i) => sList.Max(y => y[i].Length)).ToList();

Bonus points for one line?

Explanation: Since you said that they all have the same number of elements, take the first row and loop through that. Use the index to get that element from each of the other rows getting the length and then the maximum of that.

answered Sep 22 '15 at 12:30

TomDoesCode

3,580
2
18
34

2

thanks, great solution i wish i could accept both - but tim's is more fault-tolerant – Byyo Sep 23 '15 at 05:47

Tim Schmelter · Accepted Answer · 2015-09-22T12:39:45.550

20

One approach:

int maxColumns = sList.Max(arr => arr.Length);
List<int> Result = Enumerable.Range(0, maxColumns)
    .Select(i => sList.Max(arr => (arr.ElementAtOrDefault(i) ?? "").Length))
    .ToList();

You want the max-length per column. The columns are the array indices. So you need a way to look at the arrays per index. Therefore i've used Enumerable.Range(0, maxColumns). Then i'm using ElementAtOrDefault to handle the case that the array doesn't contain so many "columns"(not needed as i'm explaining below). That returns null for reference types like string. I replace them with the null-coalescing operator with "" which yields 0 as length.

Since you've mentioned that "each row has the same number of columns" you can make it a little bit more readable:

List<int> Result = Enumerable.Range(0, sList.First().Length)
    .Select(i => sList.Max(arr => arr[i].Length))
    .ToList();

edited Sep 22 '15 at 12:39

answered Sep 22 '15 at 12:27

Tim Schmelter

450,073
74
686
939

Yes, you did manage to to max by row instead of by column. Maybe a little explanation would help. – juharr Sep 22 '15 at 12:32
1

@Byyo i would take the 1st approach - it even works if a row has a invalid number of columns – fubo Sep 22 '15 at 12:41
@fubo: it has another subtle advantage, it even works if one of the strings in the arrays is `null`. – Tim Schmelter Sep 22 '15 at 13:19

score 6 · Answer 3 · edited May 23 '17 at 11:45

Here's an approach using Aggregate:

var maxLengths = sList.Select(strings => strings.Select(s => s.Length))
                      .Aggregate((prev, current) => prev.Zip(current, Math.Max));

This works with the assumption that each collection has the same number of items.

How It works:

First we take only the lengths of the string, so conceptually we are working with IEnumerable<IEnumerable<int>>, so our collection looks like this:
```
{{1, 3, 4},
 {1, 2, 2},
 {6, 2, 2}}
```
(this is not accurate - due to deferred execution we never actually have this collection - the lengths are calculated for each row as needed.)
Next, we use Aggregate to combine the vectors:
- We have {1, 3, 4} and {1, 2, 2}.
- Zip the two collections: {{1, 2}, {3, 2}, {4, 2}}.
- Apply Math.Max on each pair, and get {2, 3, 4}.
Repeat for {2, 3, 4} and {6, 2, 2} and receive {6, 3, 4}.

Pros:

Works for a single row.
Only iterate over the collection once.
Does not make any assumption on the concrete implementation (List of Arrays) - can work with any IEnumerable<IEnumerable<string>>, without relying on [i] or ElementAtOrDefault.
You've asked for a Linq solution.

Cons:

Somewhat complicated.
If the rows were in different lengths Zip would not behave as expected (return the shortest length rather than the longest).

I was just thinking of something like this and you kicked ass with this answer. I think it's the best by far. — Siderite Zackwehdex, Apr 08 '16 at 13:57

score 4 · Answer 4 · answered Sep 22 '15 at 12:31

4

Assuming that all your "rows" have the same number of items ("columns"):

var rowLength = sList[0].Length;
var result = Enumerable.Range(0, rowLength)
    .Select(index => sList.Select(row => row[index].Length).Max())
    .ToList();

answered Sep 22 '15 at 12:31

Konamiman

49,681
17
108
138

George Vovos · Answer 5 · 2016-04-08T22:02:21.457

Surprisingly, there aren't any good(efficient) answers

1)Don't use LINQ. Try a simple,readable procedural solution
2)If for some reason you want to use LINQ,at least make it efficient

Below you see the results of my sample code

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

namespace ConsoleApplication9
{
    internal class Program
    {
        private static void Main(string[] args)
        {
            Random rand = new Random();
            int maxColumnLength = 10;
            int rowNumber = 100000;
            int colNumber = 300;
            List<string[]> input = new List<string[]>();
            for (int i = 0; i < rowNumber; i++)
            {
                string[] row = new string[colNumber];
                for (int j = 0; j < colNumber; j++)

                    row[j] = new string('x', rand.Next(maxColumnLength));

                input.Add(row);
            }

            var result = Time(SimpleIteration, input, "Simple Iteration");

            var result2 = Time(SimpleIterationWithLinq, input, "Simple Iteration LINQ");

            var result3 = Time(AcceptedAnswer, input, "Accepted Answer");

            var result4 = Time(TomDoesCode, input, "TomDoesCode");

            //var result5 = Time(Kobi, input, "Kobi"); //StackOverflow

            var result6 = Time(Konamiman, input, "Konamiman");

            var result7 = Time(RahulSingh, input, "RahulSingh");

            System.Console.ReadLine();
        }

        private static List<int> SimpleIteration(List<string[]> input)
        {
            int[] maxPerColumn = new int[input.First().Length];
            for (int i = 0; i < maxPerColumn.Length; i++)
                maxPerColumn[i] = -1;

            foreach (var row in input)
            {
                for (int i = 0; i < row.Length; i++)
                    if (maxPerColumn[i] < row[i].Length)
                        maxPerColumn[i] = row[i].Length;
            }

            return maxPerColumn.ToList();
        }

        private static List<int> SimpleIterationWithLinq(List<string[]> input)
        {
            return input.Aggregate(new int[input.First().Length], (maxPerColumn, row) =>
             {
                 for (int i = 0; i < row.Length; i++)
                     if (maxPerColumn[i] < row[i].Length)
                         maxPerColumn[i] = row[i].Length;

                 return maxPerColumn;
             }).ToList();
        }

        private static List<int> AcceptedAnswer(List<string[]> input)
        {
            return Enumerable.Range(0, input.First().Length)
                             .Select(i => input.Max(arr => arr[i].Length))
                             .ToList();
        }

        private static List<int> TomDoesCode(List<string[]> input)
        {
            return input.First().Select((x, i) => input.Max(y => y[i].Length)).ToList();
        }

        private static List<int> Kobi(List<string[]> input)
        {
            return input.Select(strings => strings.Select(s => s.Length))
                      .Aggregate((prev, current) => prev.Zip(current, Math.Max)).ToList();
        }

        private static List<int> Konamiman(List<string[]> input)
        {
            var rowLength = input[0].Length;
            return Enumerable.Range(0, rowLength)
                .Select(index => input.Select(row => row[index].Length).Max())
                .ToList();
        }

        private static List<int> RahulSingh(List<string[]> input)
        {
            int arrayLength = input.First().Length;
            return input.SelectMany(x => x)
                                     .Select((v, i) => new { Value = v, Index = i % arrayLength })
                                     .GroupBy(x => x.Index)
                                     .Select(x => x.Max(z => z.Value.Length))
                                     .ToList();
        }

        private static List<int> Time(Func<List<string[]>, List<int>> act, List<string[]> input, string methodName)
        {
            Stopwatch s = Stopwatch.StartNew();
            var result = act(input);
            s.Stop();
            Console.WriteLine(methodName.PadRight(25) + ":" + s.ElapsedMilliseconds + " ms");
            return result;
        }
    }
}

score 2 · Answer 6 · answered Sep 22 '15 at 13:09

This is what I have:-

  int arrayLength = sList.First().Length;
  List<int> result = sList.SelectMany(x => x)
                           .Select((v, i) => new { Value = v, Index = i % arrayLength  })
                           .GroupBy(x => x.Index)
                           .Select(x => x.Max(z => z.Value.Length))
                           .ToList();

Explanation:- Flatten your list, project it's value and index (Index is calculated based on columns length), group by index and select the value with maximum length.

You could have done sList.SelectMany(list => list .Select((s, index) => new{ Column = index, s.Length })) .GroupBy(g => g.Column) .Select(g => new { Column = g.Key, Length = g.Max(x => x.Length) }) — Phil, Apr 06 '16 at 14:10

Martijn · Answer 7 · 2015-09-24T12:24:49.813

The general method to aggregate results is the Aggregate extension method.

tl;rd for the simplest case:

    slist.Select(row => row.Select(str => str.Length))
         .Aggregate(l, r) => l.Zip(r, Math.Max);

First lets get the data in the form we want: A list of lengths

var llist = slist.Select(row => row.Select(str => str.Length));

Now we have a list of lists of length. Much easier to work with.

Aggregate works by keeping some running total, and running a function of which the parameters are the aggregate and the next row, yielding a new aggregate. You can optionally provide a seed value. It looks like this:

List<Int> seed = ???
Func<IEnumerable<Int>, IEnumerable<Int>, IEnumerable<Int>> accumulator  = ???
llist.Aggregate(seed, accumulator);

From here on, I assume that the number of "columns" is variable.

The seed value is new List<int>(). If there are 0 rows, that's your result.

List<Int> seed = new List<Int>();
Func<List<Int>, List<Int>, List<Int>> accumulator  = ???
llist.Aggregate(seed, accumulator);

For each row, the new aggregate is the max of each element of the aggregate and the next row.

This happens to be exactly what the .Zip method is for if the items are guaranteed to be of the same length: https://msdn.microsoft.com/en-us/library/vstudio/dd267698(v=vs.100).aspx

giving

private IEnumerable<Int> accumulate(aggregate, row){
  aggregate.Zip(row, (i1, i2) => Math.max(i1, i2));
}

unfortunately we don't have that guarantee, so we'll have to provide our own ZipAll. The rules are: If it an item exits in both, take the max. If it only exists in one, take that one. This is a bit of a potatoe; ZipAll should really be a library function.

  IEnumerable<T> ZipAll(IEnumerable<T> left, IENumerable<T> right, Func<T, T, T> selector) {
   var le = left.GetEnumerator;
   var re = right.GetEnumerator;

   while(le.MoveNext()){
     if (re.MoveNext()) {
       yield return selector(le.Current, re.Current);
     } else {
       yield return le.Current;
     }
   }

   while(re.MoveNext()) yield return re.Current;
}

(1)

That gives us

List<Int> seed = new List<Int>();
Func<List<Int>, List<Int>, List<Int>> accumulator = (l, r) => ZipAll(l, r, Math.Max);
llist.Aggregate(seed, accumulator);

If you inline everything(accept our custom ZipAll function) that becomes

llist.Aggregate(new List<Int>(), (l, r) => ZipAll(l, r, Math.Max));

If you have at least one row, you can leave out the seed (the first row becomes the seed) to get

llist.Aggregate((l, r) => ZipAll(l, r, Math.Max));

In case the number of columns is constant, we can use the build in Zip

llist.Aggregate((l, r) => l.Zip(r, Math.Max));

(1)Outside the scope of this question, a more general overload is

IEnumerable<TResult, TLeft, TRight> ZipAll(IEnumerable<TLeft> left, IENumerable<TRight> right, Func<TResult, TLeft, TRight> selector, Func<TResult, TLeft> leftSelector, Func<Tresult, TRight> rightselector) {
   var le = left.GetEnumerator;
   var re = right.GetEnumerator;

   while(le.MoveNext()){
     if (re.MoveNext()) {
       yield return selector(le.Current, re.Current);
     } else {
       yield return leftSelector(le.Current);
     }
   }

   while(re.MoveNext()) yield return rightSelector(re.Current);
}

With this we could write our earlier ZipAll in terms of this one with the left and right projections as identity function as

IEnumerable<T> ZipAll(this IEnumerable<T> left, IENumerable<T> right, Func<T, T, T> selector) {
    return ZipAll(left, right, selector, id => id, id => id);
 }

The OP did say "each row has the same number of columns", so I don't think you need `ZipAll` - `Zip` should work. You also have multiple syntax errors like missing semicolons and parentheses. Aside from that, the solution is identical to my answer, you only explained it in another way. — Kobi, Sep 24 '15 at 07:34
@Kobi thanks for the note, I did a quick check of the syntax errors, I think I got most. I'll double check later. Having different people explain the same thing in different ways is IMO the strength of SO. — Martijn, Sep 24 '15 at 13:10

List determine max length by Linq

7 Answers7

How It works:

Pros:

Cons: