5

I have an array of JavaScript objects, each with a timestamp and a message. Here is what my array looks like:

[
{
"timestamp": 1474328370007,
"message": "hello"
},
{
"timestamp": 1474328302520,
"message": "how are you"
},
{
"timestamp": 1474328370007,
"message": "hello"
},
{
"timestamp": 1474328370007,
"message": "hello"
}
]

I want to remove the objects with duplicate timestamps and keep only a single occurrence of each. The matching should be based on the timestamp, not the message.

The expected output is:

[
{
 "timestamp": 1474328302520,
"message": "how are you"
},
{
"timestamp": 1474328370007,
"message": "hello"
}
]

I am trying something like this:

var fs = require('fs');

fs.readFile("file.json", 'utf8', function (err,data) {
if (err) console.log(err);;
console.log(data);
// var result = [];
for (i=0; i<data.length;i++) {
  if(data[i].timestamp != data[i+1].timestamp)
    console.log('yes');
  }
});

I cannot figure out how to handle the data[i+1] part once the loop reaches the end of the array. Is there an easy way to do the above deduplication?

Thank you in advance.

csvb

4 Answers

7

You could use an object as a hash table and check each timestamp against it.

var array = [{ "timestamp": 1474328370007, "message": "hello" }, { "timestamp": 1474328302520, "message": "how are you" }, { "timestamp": 1474328370007, "message": "hello" }, { "timestamp": 1474328370007, "message": "hello" }],
    result = array.filter(function (a) {
        return !this[a.timestamp] && (this[a.timestamp] = true);
    }, Object.create(null));

console.log(result);

You could also use one variable for the hash and one for the filtered result, like this:

var hash = Object.create(null),
    result = [];

for (var i = 0; i < data.length; i++) {
    if (!hash[data[i].timestamp]) {
        hash[data[i].timestamp] = true;
        result.push(data[i]);
    }
}
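
For comparison, the same hash-table idea can be sketched with a `Map` instead of a plain object. This is only an illustration, not part of the approach above: `dedupeByTimestamp` and `sample` are hypothetical names, and the sketch assumes the first object seen for each timestamp is the one to keep.

```javascript
// Hypothetical helper: keeps the first object seen for each timestamp.
function dedupeByTimestamp(items) {
  var seen = new Map();
  items.forEach(function (item) {
    if (!seen.has(item.timestamp)) {
      seen.set(item.timestamp, item);
    }
  });
  // A Map iterates in insertion order, so first occurrences keep their order.
  return Array.from(seen.values());
}

var sample = [
  { timestamp: 1474328370007, message: "hello" },
  { timestamp: 1474328302520, message: "how are you" },
  { timestamp: 1474328370007, message: "hello" },
  { timestamp: 1474328370007, message: "hello" }
];

var unique = dedupeByTimestamp(sample);
console.log(unique);
```

The result keeps the order of first occurrence, so it would still need a sort by timestamp to match the ordering shown in the question's expected output.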
Nina Scholz
4

Why read the JSON file with fs.readFile? Just require it.
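
To illustrate why this matters, here is a small sketch of what happens when the file content is used without parsing. With the 'utf8' encoding, fs.readFile passes the content as a string, so the string literal below stands in for that `data` argument (the literal and variable names are assumptions for the example, and no file is actually read).

```javascript
// Stand-in for the string that fs.readFile(..., 'utf8', callback) passes as `data`.
var data = '[{"timestamp":1474328370007,"message":"hello"},' +
           '{"timestamp":1474328302520,"message":"how are you"}]';

// Indexing the raw string yields single characters, so .timestamp is undefined.
var firstChar = data[0];
var rawTimestamp = data[0].timestamp;

// Parsing first gives a real array of objects.
var parsed = JSON.parse(data);
var parsedTimestamp = parsed[0].timestamp;

console.log(firstChar, rawTimestamp, parsedTimestamp);
```

Using require on a .json file returns the parsed value directly, so this step is not needed in that case.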

The filtering job itself:

const arr = require('./file.json')

const tester = []
const result = []

arr.forEach(function(el) {
  if (tester.indexOf(el.timestamp) === -1) {
    tester.push(el.timestamp)
    result.push(el)
  }
})

UPDATE: Elegant solution using Array.prototype.reduce:

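const result = arr.reduce(function(result, current) {
  // Compare by timestamp; indexOf(current) would only compare object references.
  if (!result.some(el => el.timestamp === current.timestamp)) result.push(current);
  // The accumulator must be returned for the next iteration.
  return result;
}, []);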

UPDATE: Most efficient for most cases:

const hashmap = {};
arr.forEach(el => {
  // Keep only the first element seen for each timestamp.
  if (!hashmap[el.timestamp]) hashmap[el.timestamp] = el;
})
const result = Object.values(hashmap);

UPDATE: Most efficient and stable for all cases. If the hashing function causes a collision on every insert, the solution above becomes very inefficient. This one is the most stable:

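const result = [];
// Note: sort() reorders arr in place.
arr.sort((a, b) => a.timestamp - b.timestamp);
arr.forEach(el => {
  const last = result[result.length - 1];
  // Skip el if it has the same timestamp as the last kept item.
  // `continue` is not allowed inside a forEach callback, so use `return`.
  if (last && el.timestamp === last.timestamp) return;
  result.push(el);
});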
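
The sort-based approach relies on duplicates ending up next to each other after sorting, and Array.prototype.sort reorders the array in place. A minimal sketch of a variant that leaves the input untouched could look like this; `sortedUniqueByTimestamp`, `input`, and `output` are hypothetical names chosen for the example.

```javascript
// Hypothetical helper: returns a new array sorted by timestamp with duplicates removed.
function sortedUniqueByTimestamp(items) {
  // slice() copies the array so the caller's data keeps its original order.
  var sorted = items.slice().sort(function (a, b) {
    return a.timestamp - b.timestamp;
  });
  // After sorting, duplicates are adjacent, so compare each item with its predecessor.
  return sorted.filter(function (item, index) {
    return index === 0 || item.timestamp !== sorted[index - 1].timestamp;
  });
}

var input = [
  { timestamp: 1474328370007, message: "hello" },
  { timestamp: 1474328302520, message: "how are you" },
  { timestamp: 1474328370007, message: "hello" },
  { timestamp: 1474328370007, message: "hello" }
];

var output = sortedUniqueByTimestamp(input);
console.log(output);
```

Because the result is sorted by timestamp, it comes out in the same order as the expected output in the question.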
Lazyexpert
3

You can use reduce to collect the unique items.

Check this snippet:

var arr = [{
  "timestamp": 1474328370007,
  "message": "hello"
}, {
  "timestamp": 1474328302520,
  "message": "how are you"
}, {
  "timestamp": 1474328370007,
  "message": "hello"
}, {
  "timestamp": 1474328370007,
  "message": "hello"
}];

var elements = arr.reduce(function(previous, current) {

  var object = previous.filter(object => object.timestamp === current.timestamp);
  if (object.length == 0) {
    previous.push(current);
  }
  return previous;
}, []);

console.log(elements);

Hope it helps

Geeky
  • Thank you Geeky. This helps. Can you explain as to how this code works? I am not able to understand what this line is doing: how reduce is used to get unique values? – csvb Nov 26 '16 at 04:37
  • What does this line do ? `var object = previous.filter(object => object.timestamp === current.timestamp);` and `return previous;}, [])` – csvb Nov 26 '16 at 04:44
  • It filters the items already collected in `previous`, keeping only those whose timestamp equals the current item's timestamp. If none match, the current item is new and gets pushed. – Geeky Nov 26 '16 at 05:25
  • and what does this operator do? `=>` – csvb Nov 26 '16 at 06:38
  • This is an arrow function. For reference, see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Arrow_functions – Geeky Nov 26 '16 at 06:39
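
To make the question in these comments concrete, here is a sketch that traces the same reduce call step by step. It is written without arrow functions, and the `steps` array is a hypothetical addition used only to record how many unique items have been collected after each element.

```javascript
var arr = [
  { timestamp: 1474328370007, message: "hello" },
  { timestamp: 1474328302520, message: "how are you" },
  { timestamp: 1474328370007, message: "hello" },
  { timestamp: 1474328370007, message: "hello" }
];

var steps = []; // hypothetical log, not part of the original snippet

var elements = arr.reduce(function (previous, current) {
  // `previous` is the array of unique items collected so far; it starts as [].
  var matches = previous.filter(function (item) {
    return item.timestamp === current.timestamp;
  });
  // No match means this timestamp has not been seen yet, so keep the item.
  if (matches.length === 0) {
    previous.push(current);
  }
  steps.push(previous.length);
  // Whatever is returned becomes `previous` for the next element.
  return previous;
}, []);

console.log(steps); // [ 1, 2, 2, 2 ]
console.log(elements);
```

The first two elements are new and get pushed; the last two match an existing timestamp, so the accumulator stays at two items.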
1

A simple way to do this is to use an array of flags. There are definitely better ways, but this is a fairly simple way to do it that should work for you.

    var data = [
        {
            "timestamp": 1474328370007,
            "message": "hello"
        },
        {
            "timestamp": 1474328302520,
            "message": "how are you"
        },
        {
           "timestamp": 1474328370007,
           "message": "hello"
        },
        {
           "timestamp": 1474328370007,
           "message": "hello"
        }
    ];

    // array to store result
    var result = [];
    // store flags keyed by timestamp (a plain object, since timestamps are not array indices)
    var flags = {};

    for (var i = 0; i < data.length; i++) {
        // don't run the rest of the loop if we already have this timestamp
        if (flags[data[i].timestamp]) continue;

        // if we didn't have the flag stored, then we need to record it in the result
        result.push(data[i]);

        // if we don't yet have the flag, then store it so we skip it next time
         flags[data[i].timestamp] = true;
    }

    // stringify the result so that we can display it in an alert message
    alert(JSON.stringify(result))
Dylan Hamilton