1

I have a array of analytic events in my database and i would like to send this data grouped by date to my client app.

The data from the db looks something like this (but with hundreds of records):

[
  {
    "DateAdded": "2006-12-30 00:38:54",
    "Event": "click",
    "Category": "externalWebsite"
  },
  {
    "DateAdded": "2006-07-20 00:36:44",
    "Event": "click",
    "Category": "social"
  },
  {
    "DateAdded": "2006-09-20 00:36:44",
    "Event": "click",
    "Category": "social"
  },
  {
    "DateAdded": "2006-09-22 00:12:34",
    "Event": "load",
    "Category": "profile"
  }
]

What I would like to do is return the count of all the say 'social' 'click' but by month so it would look like this:

[
  {
    "name": "socialclicks",
    "series": [
      {
        "count": 259,
        "name": "Jan"
      },
      {
        "count": 0,
        "name": "Feb"
      },
      {
        "count": 52,
        "name": "Mar"
      }
      ... etc, etc up to Dec <====
    ]
  }
]

So, what I have been trying is to get all the records that are associated with a particular user using their id. This is simple.

Now I need to split them records into monthly counts showing the last 12 months from the current month (if the month doesn't exist return 0) - this is proving to be complicated and difficult.

My approach was this:

var records = context.records.where(r => r.Id = userId).ToList();
var jan
var feb
var mar
var apr
... etc, etc

for (int i = 0; i < records.Count ; i++)
{
    if (record.DateAdded > "2005-12-31 00:00.00" && record.DateAdded < "2006-01-31 00:00.00") {
       jan++;
    }

    if (record.DateAdded > "2006-01-31 00:00.00" && record.DateAdded < "2006-02-28 00:00.00") {
       feb++;
    }

    ...etc, etc
}

Then i use these variables to count and hard code the name for the returned data.

As you can see, there is lots of etc, etc because the code has become ridiculous!

There must be a more simple way to do this but i cant seem to find one!

Any assistance would be appreciated.

Thanks

DBoi
  • 627
  • 2
  • 17
  • 35
  • Possible duplicate of [Entity Framework: Efficiently grouping by month](https://stackoverflow.com/questions/9886190/entity-framework-efficiently-grouping-by-month) – Renat Aug 29 '19 at 09:50
  • Are the entries on which you are working created on the same year? – Jota.Toledo Aug 29 '19 at 10:03
  • Unfortunately not, i'm looking to only return the last 12 months but thought id tackle that one later down the line!! ie. if it was Aug 2018, i would return from Aug 2017 - Aug 2018 and anything that is null return as 0. – DBoi Aug 29 '19 at 10:05
  • You either update your question with those extra requirements, otherwise I wont consider them as part of the question´s scope. – Jota.Toledo Aug 29 '19 at 10:24
  • ok thanks, have updated my question as suggested. – DBoi Aug 29 '19 at 11:00
  • Is "social" and "click" fixed inputs to this, or do you want to group every combination? – Jamiec Aug 29 '19 at 11:06
  • I'm looking to group every combination. There could be 'social', 'load' or 'profile', 'click' – DBoi Aug 29 '19 at 11:13
  • @DBoi please **rephrase** your question to better represent what your goals are. As it is, your curren expected output gives the impression that you arent looking for all {Categories}x{Events} combinations, but for a certain pair. – Jota.Toledo Aug 29 '19 at 11:21
  • I've updated the return json to make that more clear - if i understand correctly – DBoi Aug 29 '19 at 11:21

1 Answers1

1

The first thing to do is group all your data by the 2 properties you're interested in

  • Event
  • Category

Example:

 var partialResult = entries.GroupBy(x => new {
       x.Event,
       x.Category
    });

From there, when you project your result and you can group again by Month & Year. - anonymous object used for demo, but you could easily define this as a struct/class as appropriate:

var result = entries.GroupBy(x => new {
       x.Event,
       x.Category
    }).Select(g => new {
       g.Key.Event,
       g.Key.Category,
       Series = g.GroupBy(x => new {x.DateAdded.Month, x.DateAdded.Year})
                 .Select(i => new{
                         i.Key.Month,
                         i.Key.Year,
                         Count = i.Count()
                }).ToArray()
    });

foreach(var item in result)
{
    Console.WriteLine($"Event:{item.Event} Category:{item.Category}");
    foreach(var serie in item.Series)
        Console.WriteLine($"\t{CultureInfo.CurrentCulture.DateTimeFormat.GetMonthName(serie.Month)}{serie.Year} Count={serie.Count}");
}

Edit: To satisfy your requirement that:

if the month doesn't exist return 0

You need to add a few complexities. First a method which can work out all the Month/Year combinations between 2 dates.

private static IEnumerable<(int Month, int Year)> MonthsBetween(
        DateTime startDate,
        DateTime endDate)
{
    DateTime iterator;
    DateTime limit;

    if (endDate > startDate)
    {
        iterator = new DateTime(startDate.Year, startDate.Month, 1);
        limit = endDate;
    }
    else
    {
        iterator = new DateTime(endDate.Year, endDate.Month, 1);
        limit = startDate;
    }

    var dateTimeFormat = CultureInfo.CurrentCulture.DateTimeFormat;
    while (iterator < limit)
    {
        yield return (iterator.Month,iterator.Year);                
        iterator = iterator.AddMonths(1);
    }
}

Also you'll need some kind of range to both calculate all the months between, as well as filter your original query:

var dateRangeStart = DateTime.Parse("2006-01-01");
var dateRangeEnd = DateTime.Parse("2007-01-01");
var monthRange = MonthsBetween(dateRangeStart,dateRangeEnd);

var results = entries.Where(e => e.DateAdded>=dateRangeStart && e.DateAdded<dateRangeEnd)
     ..... etc

And then, when outputting results you need to effectively do a left join onto your list of years/months. For some reason this is easier using query syntax than lambda.

foreach(var item in results)
{
    Console.WriteLine($"Event:{item.Event} Category:{item.Category}");

    var joinedSeries = from month in monthRange
                        join serie in item.Series
                        on new{month.Year, month.Month} equals new {serie.Year, serie.Month} into joined
                        from data in joined.DefaultIfEmpty()
                        select new {
                            Month = data == null ? month.Month : data.Month,
                            Year = data == null ? month.Year : data.Year,
                            Count = data == null ? 0 : data.Count
                        };


foreach(var serie in joinedSeries)
    Console.WriteLine($"\t{CultureInfo.CurrentCulture.DateTimeFormat.GetMonthName(serie.Month)}{serie.Year} Count={serie.Count}");
}

Live example: https://dotnetfiddle.net/K7ZoJN

Jamiec
  • 133,658
  • 13
  • 134
  • 193
  • Magic :) Thanks very much, this is so much better! – DBoi Aug 29 '19 at 12:44
  • @DBoi no probs. Just realised that this wont give you months which dont exist in your data. Problem? – Jamiec Aug 29 '19 at 13:00
  • Ideally I'm looking to have all months within the last 12 months and return 0 for any that don't exist, just so the user knows that its 0 if that makes sense? – DBoi Aug 29 '19 at 13:10
  • It makes sense, and you have 2 options. 1) Treat this as a UI issue and deal with it there or 2) Join your result to a precanned set of months. 2 is fiddly, but do-able. I could update the answer if you needed – Jamiec Aug 29 '19 at 13:13
  • Yes please, i think it would be better to deal with it server side - thanks for this, really appreciate the help – DBoi Aug 29 '19 at 13:18
  • This is awesome! Will now dissect it and learn more about what you've done here - thanks so much for putting the time in and doing the fiddle! – DBoi Aug 29 '19 at 14:23