I am working on a Dimple/D3 chart that plots missing days' data as 0.

date                fruit   count
2013-12-08 12:12    apples  2
2013-12-08 12:12    oranges 5
2013-12-09 16:37    apples  1
                             <- oranges inserted on 12/09 as 0
2013-12-10 11:05    apples  6
2013-12-10 11:05    oranges 2
2013-12-10 20:21    oranges 1

I was able to get nrabinowitz's excellent answer to work, nearly.

My data's timestamp format is YYYY-MM-DD HH:MM, and combining the hash with the d3.extent range and a time interval in days results in 0-points every day at midnight, even when there is data from later that same day.

An almost-solution I found was to use .setHours(0,0,0,0) to discard the hours/minutes, so that all data would appear to be from midnight:

...
var dateHash = data.reduce(function(agg, d) { 
 agg[d.date.setHours(0,0,0,0)] = true; 
 return agg; 
}, {});
...

This works as expected when there is just 1 entry per day, every day, BUT on days with multiple entries the values are added together. So in the data above, 12/10 shows apples: 6, oranges: 3.

Ideally (in my mind) I would separate the plotting data from the datehash, and on the hash discard hours/minutes. This would compare the midnight-datehash with the D3 days interval, fill in 0s at midnight on days with missing data, and then plot the real points with hours/minutes intact.
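The idea in the paragraph above can be sketched in plain JavaScript, without D3 or Dimple. This is only a sketch under some assumptions: `padMissingDays` is a hypothetical helper, `date` is already a `Date` object, and the hash is keyed per day and per fruit (a per-day hash alone would not add the missing oranges row on 12/09, because apples have data that day). Each date is copied before it is truncated to midnight, so the plotted rows keep their hours and minutes.

```javascript
// Hypothetical helper (not part of D3 or Dimple): pad missing (day, fruit)
// pairs with zero rows at local midnight, leaving the original dates intact.
function padMissingDays(rows, fruits) {
    // Local midnight of the given date as a timestamp. The truncation runs
    // on a copy, so the caller's Date keeps its hours and minutes.
    function dayKey(date) {
        return new Date(date.getTime()).setHours(0, 0, 0, 0);
    }

    // Hash the "day|fruit" pairs that already have data, tracking min/max day
    var seen = {},
        first = Infinity,
        last = -Infinity;
    rows.forEach(function (d) {
        var key = dayKey(d.date);
        seen[key + "|" + d.fruit] = true;
        first = Math.min(first, key);
        last = Math.max(last, key);
    });

    // Walk one calendar day at a time and add a zero row for each missing pair
    var padded = rows.slice();
    for (var day = new Date(first); day.getTime() <= last; day.setDate(day.getDate() + 1)) {
        fruits.forEach(function (fruit) {
            if (!seen[day.getTime() + "|" + fruit]) {
                padded.push({ date: new Date(day.getTime()), fruit: fruit, count: 0 });
            }
        });
    }

    // Re-sort by time
    return padded.sort(function (a, b) { return a.date - b.date; });
}

// Sample data from the table above (JavaScript months are 0-based)
var rows = [
    { date: new Date(2013, 11, 8, 12, 12), fruit: "apples", count: 2 },
    { date: new Date(2013, 11, 8, 12, 12), fruit: "oranges", count: 5 },
    { date: new Date(2013, 11, 9, 16, 37), fruit: "apples", count: 1 },
    { date: new Date(2013, 11, 10, 11, 5), fruit: "apples", count: 6 },
    { date: new Date(2013, 11, 10, 11, 5), fruit: "oranges", count: 2 },
    { date: new Date(2013, 11, 10, 20, 21), fruit: "oranges", count: 1 }
];
var padded = padMissingDays(rows, ["apples", "oranges"]);
```

With the sample data this adds a single row, oranges on 12/09 at midnight with a count of 0, and the two oranges entries on 12/10 stay separate points.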

I have tried data2 = data.slice() followed by setHours, but the graph still gets the midnight points:

...
// doesn't work, original data gets converted
var data2 = data.slice();
var dateHash = data2.reduce(function(agg, d) { 
 agg[d.date.setHours(0,0,0,0)] = true; 
 return agg; 
}, {});
...
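A likely reason the `slice()` attempt still changes the chart: `Array.prototype.slice` makes a shallow copy, so the new array holds the same row objects and the same `Date` instances, and `setHours` modifies a `Date` in place. The minimal sketch below illustrates this with one hypothetical row; the variable names are made up for the example.

```javascript
var original = [
    { date: new Date(2013, 11, 10, 20, 21), fruit: "oranges", count: 1 }
];

// slice() copies the array, but both arrays still point at the same row
// objects, and therefore at the same Date instances.
var shallow = original.slice();
var sameDate = shallow[0].date === original[0].date;

// Truncating a fresh Date built from the timestamp leaves the original alone.
var midnight = new Date(original[0].date.getTime());
midnight.setHours(0, 0, 0, 0);
var hoursAfterClone = original[0].date.getHours();

// setHours on the shared Date changes the row that the chart will plot.
shallow[0].date.setHours(0, 0, 0, 0);
var hoursAfterShallow = original[0].date.getHours();
```

Note also that `setHours` returns the new timestamp as a number, so `agg[d.date.setHours(0,0,0,0)]` stores numeric keys, while a later `dateHash[date]` lookup with a `Date` object uses the date's string form. Keying both sides the same way (for example `dateHash[+date]`) would be needed for the filter to match.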

Props to nrabinowitz, here is the adapted code:

// get the min/max dates
var extent = d3.extent(data, function(d) { return d.date; }),
  // hash the existing days for easy lookup
  dateHash = data.reduce(function(agg, d) {
      agg[d.date] = true;

// arrr this almost works except if multiple entries per day
//    agg[d.date.setHours(0,0,0,0)] = true; 

      return agg;
  }, {}),
  headers = ["date", "fruit", "count"];

// make even intervals
d3.time.days(extent[0], extent[1])
    // drop the existing ones
    .filter(function(date) {
        return !dateHash[date];
    })
    // add one zero row per fruit (fruit list grabbed from user input)
    .forEach(function(date) {
        fruitlist.forEach(function(fruit) {
            var emptyRow = { date: date };
            headers.forEach(function(header) {
                if (header === headers[1]) {
                    emptyRow[header] = fruit;
                } else if (header === headers[2]) {
                    emptyRow[header] = 0;
                }
            });
            // and push them into the array
            data.push(emptyRow);
        });
    });
// re-sort the data
data.sort(function(a, b) { return d3.ascending(a.date, b.date); });

(I'm not concerned with 0-points in the hour-scale, just the dailies. If the time.interval is changed from days to hours I suspect the hash and D3 will handle it fine.)

How can I separate the datehash from the data? Is that what I should be trying to do?

williamtx
  • I'm not entirely sure what you're looking for, but you may want to use an [ordinal scale](https://github.com/mbostock/d3/wiki/Ordinal-Scales#wiki-ordinal) instead of a time scale. – Lars Kotthoff Dec 31 '13 at 09:31

1 Answer

I can't think of a smooth way to do this but I've written some custom code which works with your example and can hopefully work with your real case.

var svg = dimple.newSvg("#chartContainer", 600, 400),
    data = [
        { date : '2013-12-08 12:12', fruit : 'apples', count : 2 },
        { date : '2013-12-08 12:12', fruit : 'oranges', count : 5 },
        { date : '2013-12-09 16:37', fruit : 'apples', count : 1 },
        { date : '2013-12-10 11:05', fruit : 'apples', count : 6 },
        { date : '2013-12-10 11:05', fruit : 'oranges', count : 2 },
        { date : '2013-12-10 20:21', fruit : 'oranges', count : 1 }
    ],
    lastDate = {},
    filledData = [],
    dayLength = 86400000,
    formatter = d3.time.format("%Y-%m-%d %H:%M");

// The logic below requires the data to be ordered by date
data.sort(function(a, b) { 
    return formatter.parse(a.date) - formatter.parse(b.date); 
});

// Iterate the data to find and fill gaps
data.forEach(function (d) {

    // Work from midday (this could easily be changed to midnight)
    var noon = formatter.parse(d.date).setHours(12, 0, 0, 0);

    // If the series value is not in the dictionary add it
    if (lastDate[d.fruit] === undefined) {
        lastDate[d.fruit] = formatter.parse(data[0].date).setHours(12, 0, 0, 0);
    }

    // Calculate the days since the last occurrence of the series value and fill
    // with a line for each missing day
    for (var i = 1; i <= (noon - lastDate[d.fruit]) / dayLength - 1; i++) {
        filledData.push({ 
            date : formatter(new Date(lastDate[d.fruit] + (i * dayLength))), 
            fruit : d.fruit, 
            count : 0 });
    }

    // update the dictionary of last dates
    lastDate[d.fruit] = noon;

    // push to a new data array
    filledData.push(d);

}, this);

// Configure a dimple line chart to display the data
var chart = new dimple.chart(svg, filledData),
    x = chart.addTimeAxis("x", "date", "%Y-%m-%d %H:%M", "%Y-%m-%d"),
    y = chart.addMeasureAxis("y", "count"),
    s = chart.addSeries("fruit", dimple.plot.line);
s.lineMarkers = true;
chart.draw();

You can see this working in a fiddle here:

http://jsfiddle.net/LsvLJ/

John Kiernander
  • Thanks so much! This indeed worked as an alternate and drop-in solution to what I was using before. I'm still not sure why I couldn't get the datahash to separate from the data to get previous code working. Hopefully one or both of these will be useful to others looking to pad their data. Also, thank you for your work on Dimple in general :) – williamtx Jan 06 '14 at 17:18