My JSON file looks like the following and contains somewhere around 1000-2000 objects.

[{
    "date": "2015-01-25T22:13:18Z",
    "some_object": {
        "first_group": 20,
        "second_group": 90,
        "third_group": 39,
        "fourth_group": 40
    }
}, {
    "date": "2015-01-25T12:20:32Z",
    "some_object": {
        "first_group": 10,
        "second_group": 80,
        "third_group": 21,
        "fourth_group": 60
    }
}, {
    "date": "2015-02-26T10:53:03Z",
    "some_object": {
        "first_group": 12,
        "second_group": 23,
        "third_group": 13,
        "fourth_group": 30
    }
}]

After copying it into an array, I need to perform the following manipulations on it:

First, remove duplicate objects. Two objects are considered the same if they have the same date (ignoring the time), so in my JSON the first two objects are duplicates. The tricky part is that when a duplicate is found, we shouldn't just remove one of them arbitrarily, but merge (not sure if merge is the right word) the fields of some_object by summing them, so that the duplicates become one object in the array. With the JSON above, the first two objects would become one:

{
 "date": "2015-01-25T00:00:00Z",
 "some_object": {
    "first_group": 30, //20+10
    "second_group": 170, //90+80
    "third_group": 60, //39+21
    "fourth_group": 100 //40+60
 }
}

Even trickier, there could be 3-10 objects in the array with the same date but different times. Those should all be merged into one object according to the rule above.

Second, sort this array of objects in ascending order by the date field (oldest to newest).

So what's so hard? Where did you get stuck?

I found out how to sort the array ascending (based on date) by using this and some of this.
But I have no idea how to do the first point of removing the duplicates and merging, in a time-efficient manner. Maybe something inside:

var array = [];//reading it from the JSON file
var object_date_sort_asc = function (obj1, obj2) {
    if (obj1.date > obj2.date) return 1;
    if (obj1.date < obj2.date) return -1;

    //some magic here
    return 0;
};
array.sort(object_date_sort_asc);

Any ideas?

Alex
  • maybe you should consider converting the date property of your objects to a date object before doing comparisons. http://www.w3schools.com/jsref/jsref_obj_date.asp – toskv Sep 09 '15 at 21:21
  • @toskv Not sure if it's mandatory. I tested the `object_date_sort_asc` function written in my question with a portion of the JSON file and seems to work fine. I can sort it ascending successfully. Now, after that I need to check if 2 dates are the same (without time), so I convert the string dates to Date objects, use `setHours(0,0,0,0)` on them and then convert them back to strings. I still need to perform the actual merging, probably in the `object_date_sort_asc` function. Any further ideas? – Alex Sep 09 '15 at 21:31
  • sure, what you need to do is copy the properties of one object to the other one. this is a good example: http://stackoverflow.com/questions/171251/how-can-i-merge-properties-of-two-javascript-objects-dynamically – toskv Sep 09 '15 at 21:41

2 Answers

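Use an object whose keys are the dates and whose values are the merged objects, to keep track of the dates that have already been seen. When you encounter a date that has been seen, merge the current element into the stored one. This assumes the time has already been stripped from each date, as described in the question's comments.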

var seen = {};
for (var i = 0; i < objects.length; i++) {
    var cur = objects[i];
    if (cur.date in seen) {
        var seen_cur = seen[cur.date];
        seen_cur.some_object.first_group += cur.some_object.first_group;
        seen_cur.some_object.second_group += cur.some_object.second_group;
        // ...and so on for the remaining groups
    } else {
        seen[cur.date] = cur;
    }
}

Once this is done, you can convert the seen object to an array and sort it.

var arr = [];
for (var k in seen) {
    arr.push(seen[k]);
}
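
The two steps above can be combined into a single function. The following is only a minimal sketch under some assumptions: the helper name `mergeByDay` is made up for illustration, the day is taken as the first 10 characters of the ISO 8601 string (so it is the UTC day as written), and every value in `some_object` is assumed to be a number. Unlike the loop above, it strips the time itself and builds new objects instead of modifying the input.

```javascript
// Hypothetical helper: merge objects that share a calendar day, then sort ascending.
// Assumes dates are ISO 8601 strings such as "2015-01-25T22:13:18Z".
function mergeByDay(objects) {
    var seen = {};
    for (var i = 0; i < objects.length; i++) {
        var cur = objects[i];
        var day = cur.date.slice(0, 10); // "YYYY-MM-DD", time ignored
        if (!seen.hasOwnProperty(day)) {
            // First object for this day: start an empty accumulator
            seen[day] = { date: day + "T00:00:00Z", some_object: {} };
        }
        var target = seen[day].some_object;
        for (var key in cur.some_object) {
            if (cur.some_object.hasOwnProperty(key)) {
                // Sum each field, treating a missing field as 0
                target[key] = (target[key] || 0) + cur.some_object[key];
            }
        }
    }
    // "YYYY-MM-DD" keys sort chronologically when sorted as plain strings
    return Object.keys(seen).sort().map(function (day) {
        return seen[day];
    });
}

var merged = mergeByDay([
    { date: "2015-02-26T10:53:03Z", some_object: { first_group: 12, second_group: 23, third_group: 13, fourth_group: 30 } },
    { date: "2015-01-25T22:13:18Z", some_object: { first_group: 20, second_group: 90, third_group: 39, fourth_group: 40 } },
    { date: "2015-01-25T12:20:32Z", some_object: { first_group: 10, second_group: 80, third_group: 21, fourth_group: 60 } }
]);
console.log(JSON.stringify(merged, null, 1));
```

This makes one pass over the input and then sorts only the unique days, so it should stay fast for a few thousand objects.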
Barmar
  • Your solution works, but it needs minor editing as in `seen_cur.some_object.first_group += cur.some_object.first_group;` instead of `seen_cur.first_group += cur.first_group;` for the specific example in the question. Anyway, I understand the idea. Thank you, @Barmar! – Alex Sep 09 '15 at 23:43
  • Thanks, I missed that nested object. – Barmar Sep 09 '15 at 23:47

To remove duplicate objects, you can loop through your array using jQuery's $.map(). In each iteration, you strip the time from the date with a simple regex and push the result into an array of unique dates, if and only if it is not already there:

  • If it is not in the array, push it into the array (of unique dates) and return the object
  • If it is in the array, return nothing, so that $.map() leaves the object out of the result

The logic above can be written as follows, assuming your array is assigned to the data variable:

// Remove duplicates
var dates = [];
var data_noDupes = $.map(data, function(item){ 
  var item_date = item.date.replace(/([\d\-]+)T.*/gi, '$1');
  if (dates.indexOf(item_date) === -1) {
    dates.push(item_date);
    return item;
  } 
});

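This keeps the first object found for each date and drops any later objects with the same date. The values in some_object of the dropped objects are discarded, not merged.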
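
For comparison, the same "keep the first object per day" filter can be written without jQuery. This is a rough sketch, not part of the original answer: the names `seenDays` and `firstPerDay` are made up for illustration, and the sample data is trimmed to the date field. It uses a plain object as a lookup table instead of `indexOf`, which avoids rescanning the list of dates for every item.

```javascript
// Sample data, trimmed to the date field for brevity
var data = [
  { date: "2015-02-26T10:53:03Z" },
  { date: "2015-01-25T12:20:32Z" },
  { date: "2015-01-25T22:13:18Z" }
];

var seenDays = {}; // hypothetical lookup table: day string -> true
var firstPerDay = data.filter(function (item) {
  // Same regex as above: keep only the part before the "T"
  var day = item.date.replace(/([\d\-]+)T.*/gi, '$1');
  if (seenDays[day]) {
    return false; // a previous object already claimed this day
  }
  seenDays[day] = true;
  return true;
});

console.log(firstPerDay.length); // 2
```

As with the $.map() version, later objects for a day are simply dropped.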

With regard to the second part: to sort, you simply sort the returned array by the date, again parsed using a simple regex that removes the time:

// Sort data_noDupes
function sortByDate(a, b){
  var a_item_date = a.date.replace(/([\d\-]+)T.*/gi, '$1'),
      b_item_date = b.date.replace(/([\d\-]+)T.*/gi, '$1');
  return ((a_item_date < b_item_date) ? -1 : ((a_item_date > b_item_date) ? 1 : 0));
}

If you want to be extra safe, you should use Moment.js to parse your date strings instead. In the functional example below I have only changed how the dates are parsed; the logic is exactly the same as described above:

$(function() {
  var data = [{
    "date": "2015-02-26T10:53:03Z",
    "some_object": {
      "first_group": 12,
      "second_group": 23,
      "third_group": 13,
      "fourth_group": 30
    }
  }, {
    "date": "2015-01-25T12:20:32Z",
    "some_object": {
      "first_group": 10,
      "second_group": 80,
      "third_group": 21,
      "fourth_group": 60
    }
  }, {
    "date": "2015-01-25T22:13:18Z",
    "some_object": {
      "first_group": 20,
      "second_group": 90,
      "third_group": 39,
      "fourth_group": 40
    }
  }];

  // Remove duplicates
  var dates = [];
  var data_noDupes = $.map(data, function(item) {
    // Get the date and format it in UTC, so the day matches the "Z" timestamp
    // regardless of the browser's local time zone
    var item_date = moment.utc(item.date).format('YYYY-MM-DD');
    
    // If it is not present in array of unique dates:
    // 1. Push into array
    // 2. Return object to new array
    if (dates.indexOf(item_date) === -1) {
      dates.push(item_date);
      return item;
    }
  });

  // Sort data_noDupes
  function sortByDate(a, b) {
    var a_item_date = moment(new Date(a.date));
    return ((a_item_date.isBefore(b.date)) ? -1 : ((a_item_date.isAfter(b.date)) ? 1 : 0));
  }

  data_noDupes.sort(sortByDate);
  console.log(data_noDupes);

  $('#input').val(JSON.stringify(data));
  $('#output').val(JSON.stringify(data_noDupes));
});
body {
  padding: 0;
  margin: 0;
}
textarea {
  padding: 0;
  margin: 0;
  height: 100vh;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/moment.js/2.10.6/moment.min.js"></script>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
<textarea id="input"></textarea>
<textarea id="output"></textarea>
Terry
  • Hey Terry! I'll full-check your solution a bit later, but from the snippet result I can see that the removing duplicates part works, but not the merging of what's inside `some_object`. Please take a look at my example of how the merging should happen, written in my question after *First* point. Thank you for the super-quick and super-detailed answer! – Alex Sep 10 '15 at 00:22