2

I have a CSV I am loading into memory via csv to array, this returns an array of objects with key value pairs for each row of data with the key being the column and the value being the data associated with that column. As such:

[{col1: 'id1', col2: '123', col3: '12/01/12'},
{col1: 'id1', col2: '100', col3: '12/01/12'},
{col1: 'id2', col2: '-100', col3: '12/01/12'},
{col1: 'id2', col2: '123', col3: '13/01/12'}]

What I want to be able to do is group data based on a specific column so for example for id1 on the data 12/01/12 what was the sum total of column col2. Or for id2 what was the difference between the value in col2 between 12/01/12 and 13/01/12.

Fairly new (1 week in) to JS so any advice on this would be much appreciated.

GoodCat
  • 117
  • 2
  • 9
  • I suggest you look into a library called lodash (underscore). It aids with array related functions for js – user5328504 Apr 26 '17 at 17:17
  • Possible duplicate of [What is the most efficient method to groupby on a javascript array of objects?](http://stackoverflow.com/questions/14446511/what-is-the-most-efficient-method-to-groupby-on-a-javascript-array-of-objects) – Heretic Monkey Apr 26 '17 at 17:40

2 Answers2

2

For grouping, I'd use the Array filter method. It takes a function that should return true if you wish to include an array element in the filtered array. So to get the rows for which col1 is 'id1' and col3 is '12/01/12', you would do:

var data = [{col1: 'id1', col2: '123', col3: '12/01/12'},
            {col1: 'id1', col2: '100', col3: '12/01/12'},
            {col1: 'id2', col2: '-100', col3: '12/01/12'},
            {col1: 'id2', col2: '123', col3: '13/01/12'}];

var grouped = data.filter(  function(row) {
                               return row.col3 == '12/01/12' && row.col1 == 'id1';
                            });

For summing a column within a group, you can use the Array reduce method. It takes as its first argument a "reducing" function that reduces the elements of an array to a single value. It takes as its optional second argument a starting value. The reducing function must take two arguments. On the first iteration, the first argument is the specified starting value, and on subsequent iterations, it's the return value of the function on the previous iteration. The second argument to the reducing function is an element of your array. So to sum the second column of your grouped data, you'd do:

var sumOfGrouped = grouped.reduce(function(sumSoFar, row) { return sumSoFar + parseInt(row.col2) }, 0);

Look at the other Array methods, and you will find others that could be useful for you depending on what you want to do.

cjg
  • 2,594
  • 12
  • 13
  • Thank you - this worked exactly as I had hoped, will need to make it more generic but pretty sure this wont be an issue. I have one questions on the reduce function, what is the purpose of the 0, as including it gives me the correct summation where as if I don't include it I get a print out of all values like this: 123100-100123. I read the documentation and it says its for setting the initial value but this doesn't seem to explain my result. Thank you again – GoodCat Apr 27 '17 at 13:21
  • The 0 is the initial value of the reduction result. So in the example above, in the first iteration of reduce, sumSoFar is 0 (specified as the second argument to reduce), and row is grouped[0]. On that iteration, the function returns 0 + 123 = 123, which becomes the value for sumSoFar in the second iteration, with row = grouped[1]. On that iteration, the function returns 123 + 100 = 223, which is the result. If you omit the zero, then on the first iteration of reduce, sumSoFar is grouped[0] and row is grouped[1]... – cjg Apr 27 '17 at 14:28
  • ... When I delete the zero and run the function as-is, I get '[object Object]100'. This is because the function returns sumSoFar + parseInt(row.col2) = grouped[0] + parseInt(grouped[1].col2]. grouped[0] is an object, but it gets converted to a string ('[object Object]') and + gets treated as the string concatenation operator. grouped[1].col2 thus gets converted back into a string and concatenated onto '[object Object]'. If you change the function to return sumSoFar.col2 + parseInt(row.col2), then I'd expect the result you got, for a similar reason... – cjg Apr 27 '17 at 14:32
  • ... sumSoFar.col2 is a string, so javascript assumes + is the string concatenation operator, and '123' + 100 = '123100'. If you change the function to return parseInt(sumSoFar.col2) + parseInt(row.col2), then you'll get the correct result, but only if you're working with an array of 2 values. In that case reduce only runs one iteration. If you run it a second iteration, then sumSoFar becomes an integer, which has no col2 property. – cjg Apr 27 '17 at 14:34
1

I've created a function who's parameter is a String, representing the column name. In the example, I'm passing in "col1" -- the ID column.

The function creates an empty object. It loops through the data array and checks whether an object with the same ID has already been stored. If not, I'm creating a new property, which has the same name as the ID, and assigning an empty array to it.

This empty array will contain the objects who have the same ID.

Example

var data = [
    {col1: 'id1', col2: '123', col3: '12/01/12'},
    {col1: 'id1', col2: '100', col3: '12/01/12'},
    {col1: 'id2', col2: '-100', col3: '12/01/12'},
    {col1: 'id2', col2: '123', col3: '13/01/12'}
];

function sortData(columnName) {
    var sortedData = {};

    for (var i = 0; i < data.length; i++) {
        var object = data[i];

        if (Object.keys(sortedData).indexOf(object[columnName]) === -1) {
            sortedData[object[columnName]] = [];
        }

        sortedData[object[columnName]].push(object);
    }
    
    return sortedData;
}

console.log(sortData("col1"));
Toby Mellor
  • 8,093
  • 8
  • 34
  • 58
  • 1
    Thank you - this provided a useful starting point and I will probably use it alongside the solution provided by cjg but his provided the exact result I needed – GoodCat Apr 27 '17 at 13:37