6

I have following class which is populated with the data

public class cntrydata
{
    public string countryid { get; set; }
    public string countryname { get; set; }

    public IEnumerable<Data> data { get; set; }

}
public class Data
{
        public int year { get; set; }
        public float value { get; set; }
}

I have an IEnumerable result which has data like this:

IEnumerable<cntryData> result

USA
United States
   2000 12
   2001 22
   2004 32

CAN
Canada
   2001 29
   2003 22
   2004 24

I want to evaluate "result" object using LINQ to get following result:

2000 USA 12   CAN null
2001 USA 22   CAN 29
2003 USA null CAN 22
2004 USA 32   CAN 24

Also if result has more countries (say China with 1995 value 12) then result should look like this:

1995 USA null   CAN null CHN 12
2000 USA 12     CAN null CHN null
2001 USA 22     CAN 29   CHN null
2003 USA null   CAN 22   CHN null
2004 USA 32     CAN 24   CHN null

Can this be done using LINQ? Thank you.

hss
  • 317
  • 1
  • 2
  • 13
  • you want to look at GroupBy() – Jonesopolis Sep 06 '13 at 13:42
  • Looks like you want a ["left-outer-join"](http://msdn.microsoft.com/en-us/library/vstudio/bb397895.aspx) in LINQ. So link all years with all countries. You get the range in this way: `int minYear = result.Min(c => c.data.Min(d => d.year)); int maxYear = result.Max(c => c.data.Max(d => d.year)); var range=Enumerable.Range(minYear, maxYear-minYear+1);` – Tim Schmelter Sep 06 '13 at 13:52
  • @TimSchmelter If you skip a lot of years that's not a very good approach. That's only viable if they all fit within a fairly small range. – Servy Sep 06 '13 at 13:55
  • Do you know number of countries? – MarcinJuraszek Sep 06 '13 at 13:55
  • @MarcinJuraszek The input is an `IEnumerable` of country data, so it would seem not. – Servy Sep 06 '13 at 13:56
  • @Servy: You're right. Simply use `SelectMany` on all data's years. `result.SelectMany(c => c.data.Select(d => d.year)).Distinct().OrderBy(y => y)` – Tim Schmelter Sep 06 '13 at 13:57
  • @TimSchmelter Then `Distinct` – Servy Sep 06 '13 at 13:58
  • OK, another question. Did you tried something by yourself and got stuck or just want us to do all the work for you? ;) – MarcinJuraszek Sep 06 '13 at 13:59
  • You need to do a left join. You can take a look here: http://stackoverflow.com/questions/3404975/left-outer-join-in-linq – dan89 Sep 06 '13 at 14:02
  • 2
    You're just describing a pivot. Here's a couple of links: http://stackoverflow.com/questions/167304/is-it-possible-to-pivot-data-using-linq and http://stackoverflow.com/questions/320895/using-linq-to-create-crosstab-results – Amy B Sep 06 '13 at 14:02
  • @user2609787 But he needs to do a join on an unknown number of sequences. That complicates matters. – Servy Sep 06 '13 at 14:03
  • I agree that the OP wants something called `pivot` which is not very easy, unless the number of columns **is fixed** but as he said, it may be 2, 3, 4, ... – King King Sep 06 '13 at 14:03
  • You need to do a left join. You can take a look here: http://stackoverflow.com/questions/3404975/left-outer-join-in-linq – dan89 Sep 06 '13 at 14:04
  • @MarcinJuraszek: no the number of countries could be anywhere from 1 to many. – hss Sep 06 '13 at 14:05

3 Answers3

4

Update

Here is now you use the code below to make a data table:

  var newresult = result.SelectMany(cntry => cntry.data.Select(d => new { id = cntry.countryid, name = cntry.countryname, year = d.year, value = d.value }))
                        .GroupBy(f => f.year)
                        .Select(g => new { year = g.Key, placeList = g.Select(p => new { p.id, p.value })});


  DataTable table = new DataTable();

  table.Columns.Add("Year");
  foreach(string name in  result.Select(x => x.countryid).Distinct())
    table.Columns.Add(name);

  foreach(var item in newresult)
  {
    DataRow nr = table.NewRow();

    nr["Year"] = item.year;

    foreach(var l in item.placeList)
      nr[l.id] = l.value;

    table.Rows.Add(nr);
  }

  table.Dump();

And how that looks:

table


This is what linq can do, you could transform this to a data table easy enough, a list by year of locations and their values.

Flatten the input and then group by. Select what you want. Like this

var newresult = result.SelectMany(cntry => cntry.data.Select(d => new { id = cntry.countryid, name = cntry.countryname, year = d.year, value = d.value }))
                      .GroupBy(f => f.year)
                      .Select(g => new { year = g.Key, placeList = g.Select(p => new { p.id, p.value })});

Here is what the dump looks like in LinqPad.

dump

Here is the full test code

void Main()
{
  List<cntrydata> result = new List<cntrydata>()
  {
    new cntrydata() { countryid = "USA", countryname = "United States", 
      data = new List<Data>() { 
        new Data() { year = 2000, value = 12 },
        new Data() { year = 2001, value = 22 }, 
        new Data() { year = 2004, value = 32 } 
      }
    },
    new cntrydata() { countryid = "CAN", countryname = "Canada", 
      data = new List<Data>() { 
           new Data() { year = 2001, value = 29 }, 
           new Data() { year = 2003, value = 22 }, 
           new Data() { year = 2004, value = 24 } 
        }
    }
  };

  var newresult = result.SelectMany(cntry => cntry.data.Select(d => new { id = cntry.countryid, name = cntry.countryname, year = d.year, value = d.value }))
                        .GroupBy(f => f.year)
                        .Select(g => new { year = g.Key, placeList = g.Select(p => new { p.id, p.value })});

  newresult.Dump();

}

public class cntrydata
{
    public string countryid { get; set; }
    public string countryname { get; set; }

    public IEnumerable<Data> data { get; set; }

}

public class Data
{
        public int year { get; set; }
        public float value { get; set; }
}
Hogan
  • 69,564
  • 10
  • 76
  • 117
4

I found it surprisingly hard to come up with a clean answer on this one, and I am still not really satisfied, so feedback is welcome:

var countries = result.Select(x => x.countryid).Distinct();
var years = result.SelectMany(x => x.data).Select(x => x.year).Distinct();
var data = result.SelectMany(x => x.data
                                   .Select(y => new { Country = x.countryid,
                                                      Data = y }))
                 .ToDictionary(x => Tuple.Create(x.Country, x.Data.year),
                               x => x.Data.value);   

var pivot = (from c in countries
             from y in years
             select new { Country = c, Year = y, Value = GetValue(c, y, data) })
            .GroupBy(x => x.Year)
            .OrderBy(x => x.Key);



public float? GetValue(string country, int year,
                       IDictionary<Tuple<string, int>, float> data)
{
    float result;
    if(!data.TryGetValue(Tuple.Create(country, year), out result))
        return null;
    return result;
}

pivot will contain one item per year. Each of these items will contain one item per country.

If you want to format each line as a string, you can do something like this:

pivot.Select(g => string.Format("{0} {1}", g.Key, string.Join("\t", g.Select(x => string.Format("{0} {1}", x.Country, x.Value)))));
Daniel Hilgarth
  • 171,043
  • 40
  • 335
  • 443
  • the last result has 3 properties: `Country`, `Year`, `Value` which is not really what the OP wants. What he wants is something having `n+1` properties, whereas `n` is the number of the countries. It's very dynamical. – King King Sep 06 '13 at 14:12
  • @KingKing: Incorrect. Please read the description of what `pivot` contains. – Daniel Hilgarth Sep 06 '13 at 14:13
  • @DanielHilgarth - do you think my "trick" using `DataTable` is "cleaner"? It does not require a helping function. – Hogan Sep 06 '13 at 14:23
  • @Hogan And what's so bad about writing helper functions? This code is a lot shorter, more concise, and readable than yours as a result. – Servy Sep 06 '13 at 14:26
  • @DanielHilgarth I don't need to understand what your concept about `pivot` is, but the comment I made previously **is what the OP wants**. – King King Sep 06 '13 at 14:26
  • @Hogan: The code that fills the DataTable essentially is a helper method. I don't like it more or less than mine. – Daniel Hilgarth Sep 06 '13 at 14:27
  • @KingKing: You don't understand my answer. That's the problem. It provides exactly what the OP wants. – Daniel Hilgarth Sep 06 '13 at 14:28
  • 1
    @KingKing That's your interpretation. The OP never stated as such. It's a perfectly appropriate interpretation of the question as asked that the results he's shown is a grouping, rather than a single item with a number of properties not known at compile time. Unless he specifically stated it must be one of them, you can't say that this is wrong. – Servy Sep 06 '13 at 14:28
  • @DanielHilgarth I think the OP wants something containing 1 column of `year`, all the other columns may look like `USA`, `CAN`, `CHN`, ... However looks like that the OP didn't expressed it clearly enough. – King King Sep 06 '13 at 14:34
  • +1 for terseness. Does it bother you to enumerate result so many times? – Amy B Sep 06 '13 at 14:35
  • @KingKing: The last line that formats everything as a string provides exactly that format. – Daniel Hilgarth Sep 06 '13 at 14:35
  • @DavidB: Yes, that's one of the pain points I have with my answer. However, if the number of entries is reasonably small and both `result` and `data` are non-deferred enumerables, everything should be fine. – Daniel Hilgarth Sep 06 '13 at 14:37
3
  //group things up as required
var mainLookup = result
  .SelectMany(
    country => country.data,
    (country, data) => new {
      Name = country.Name,
      Year = data.Year,
      Val = data.Val
    }
  )
  .ToLookup(row => new {Name= row.Name, Year = row.Year}

List<string> names = mainLookup
  .Select(g => g.Key.Name)
  .Distinct()
  .ToList();
List<string> years = mainLookup
  .Select(g => g.Key.Year)
  .Distinct()
  .ToList();

// generate all possible pairs of names and years
var yearGroups = names
  .SelectMany(years, (name, year) => new {
    Name = name,
    Year = year
  })
  .GroupBy(x => x.Year)
  .OrderBy(g => g.Key);

IEnumerable<string> results =
  (
  from yearGroup in yearGroups
  let year = yearGroup.Key
     //establish consistent order of processing
  let pairKeys = yearGroup.OrderBy(x => x.Name)
  let data = string.Join("\t",
    from pairKey in pairKeys
     //probe original groups with each possible pair
    let val = mainLookup[pairKey].FirstOrDefault()
    let valString = val == null ? "null" : val.ToString()
    select pairKey.Name + " " + valString
    )
  select year.ToString() + "\t" + data; //resultItem
Amy B
  • 108,202
  • 21
  • 135
  • 185