
I have a CSV dataset that I am working with in dc.js (crossfilter).

Date, Country 1, Country 2, Country 3, Country 4, Country 5, Country 6, Target country (...)
2014/12/11, USA, France, UAE, (...), Iraq

The thing I am trying to do is to plot a row chart with one row per country. Here's my solution as of today:

  var countries = ndx.dimension(function(d) {
    var list = [];
    list.push(d["Country 1"]);
    if (d["Country 2"]) { list.push(d["Country 2"]); }
    if (d["Country 3"]) { list.push(d["Country 3"]); }
    if (d["Country 4"]) { list.push(d["Country 4"]); }
    if (d["Country 5"]) { list.push(d["Country 5"]); }
    if (d["Country 6"]) { list.push(d["Country 6"]); }
    return list;
  });
  var countriesGroup = countries.group().reduceSum(function(d) {
    return d.totalNumberOfStrikes;
  });
  countryChart
    .width(400).height(500)
    .group(countriesGroup)
    .dimension(countries)
    .ordering(function(d) { return -d.value; });

But, as you can see, this doesn't give one entry per unique country. The dimension key is the whole list array, so each distinct combination of countries in the CSV rows becomes its own row in the chart, which gives meaningless results.

What I want is to have a list containing each unique country, and then plot the thing in the row chart.

Can you help? Thank you very much!

Gordon
basbabybel

2 Answers


Based on later conversation in another question and the dc.js users group, here's a better reduction that keeps the data as it is:

var strikingCountriesGroup = xScaleDimension.group().reduce(
    function(p, v) { // add
        countryFields.forEach(function(c) {
            if(v[c]) p[v[c]] = (p[v[c]] || 0) + v.totalNumberOfStrikes;
        });
        return p;
    },
    function(p, v) { // remove
        countryFields.forEach(function(c) {
            if(v[c]) p[v[c]] = p[v[c]] - v.totalNumberOfStrikes;
        });
        return p;
    },
    function() { // initial
        return {};
    }
);

Although that may look like a big tangle of brackets, the idea is that the values v[c], where c ranges over "Country 1", "Country 2"... in the original data set, name the fields that get created in the reduced object.

We are reducing into the map p from the row v. We loop over the country fields, and for each c, if v has an entry for c, we add v.totalNumberOfStrikes to p[v[c]] (or subtract it when the row is removed). When adding, we have to be careful if the entry doesn't exist yet: the expression || 0 defaults it to zero if it is undefined.
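
To see the reducers in isolation, here is a minimal sketch that runs them by hand over a couple of made-up rows, without crossfilter. The sample rows, the shortened countryFields list and the function names reduceAdd, reduceRemove and reduceInitial are assumptions for illustration; in the real chart crossfilter calls the three functions itself as rows enter and leave the current filter.

```javascript
// Assumed list of country columns (shortened to three for the example).
var countryFields = ["Country 1", "Country 2", "Country 3"];

function reduceAdd(p, v) {
    countryFields.forEach(function(c) {
        // create the entry on first sight, then accumulate
        if (v[c]) p[v[c]] = (p[v[c]] || 0) + v.totalNumberOfStrikes;
    });
    return p;
}

function reduceRemove(p, v) {
    countryFields.forEach(function(c) {
        if (v[c]) p[v[c]] = p[v[c]] - v.totalNumberOfStrikes;
    });
    return p;
}

function reduceInitial() {
    return {};
}

// Made-up rows, only for illustration.
var rows = [
    {"Country 1": "USA", "Country 2": "France", "Country 3": "", totalNumberOfStrikes: 3},
    {"Country 1": "USA", "Country 2": "", "Country 3": "", totalNumberOfStrikes: 2}
];

// Add both rows: one entry per unique country, not per combination.
var totals = rows.reduce(reduceAdd, reduceInitial());
var afterAdd = JSON.stringify(totals); // {"USA":5,"France":3}

// Simulate the second row being filtered out.
reduceRemove(totals, rows[1]); // totals is now {USA: 3, France: 3}
```

Note that a country whose rows are all removed stays in the map with a value of 0 rather than disappearing.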

Then, we can create the stacks dynamically like this (sorting by value):

  var reducedCountries = strikingCountriesGroup.all()[0].value;
  var countries = d3.keys(reducedCountries).sort(function(a, b) {
      return reducedCountries[b] - reducedCountries[a];   
  });

  // we have to special-case the first group, see https://github.com/dc-js/dc.js/issues/797
  var first = countries.shift();
  strikingCountries
      .group(strikingCountriesGroup, first, 
         function(d) { 
             return d.value[first];
         });
  // rest
  countries.forEach(function(c) {    
      strikingCountries
          .stack(strikingCountriesGroup, c, 
             function(d) { 
                 return d.value[c];
             });
  });

Fiddle here: http://jsfiddle.net/gordonwoodhull/gfe04je9/11/
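
One thing to watch for: reading the country list from all()[0].value only sees the countries that occur in the first bin, so a country that first appears in a later bin would get no stack. A hedged sketch of collecting the names across every bin follows. The bins array is made-up data in the {key, value} shape that group.all() returns, Object.keys stands in for d3.keys, and valueFor is a hypothetical helper name.

```javascript
// Made-up bins in the {key, value} shape returned by group.all().
var bins = [
    {key: "2014/12/10", value: {USA: 5, France: 3}},
    {key: "2014/12/11", value: {USA: 2, UAE: 4}}
];

// Sum each country's total over all bins.
var totalsByCountry = {};
bins.forEach(function(bin) {
    Object.keys(bin.value).forEach(function(country) {
        totalsByCountry[country] = (totalsByCountry[country] || 0) + bin.value[country];
    });
});

// Sort country names by descending overall total.
var countries = Object.keys(totalsByCountry).sort(function(a, b) {
    return totalsByCountry[b] - totalsByCountry[a];
});

// Build a value accessor for one country; a bin with no entry
// for that country yields 0 instead of undefined.
function valueFor(country) {
    return function(d) {
        return d.value[country] || 0;
    };
}
```

Each valueFor(c) could then be passed as the accessor to .group(...) or .stack(...) in the same way as the inline functions above.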

Gordon

Probably the easiest way to do this is to flatten your array, so you just have Date, Country, Target in your source. Something like (untested):

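var dest = [];
var countries = ["Country 1", "Country 2", ...];
source.forEach(function(d) {
    countries.forEach(function(c) {
        // copy the country value d[c], not the column name, and skip empty columns
        if (d[c]) dest.push({Date: d.Date, Country: d[c], Target: d["Target country"]});
    });
});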

And then pass dest to crossfilter instead of your original data.
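
A runnable sketch of the same idea on a couple of made-up rows, assuming the column names from the CSV header in the question. Carrying totalNumberOfStrikes along and adding an id field that points back at the original row are both assumptions: the first because the question sums that field, the second because it makes it possible to recognise flattened rows that came from the same source row.

```javascript
// Column names taken from the CSV header in the question.
var countryFields = ["Country 1", "Country 2", "Country 3",
                     "Country 4", "Country 5", "Country 6"];

// Made-up source rows, only for illustration.
var source = [
    {Date: "2014/12/11", "Country 1": "USA", "Country 2": "France",
     "Target country": "Iraq", totalNumberOfStrikes: 3},
    {Date: "2014/12/12", "Country 1": "USA",
     "Target country": "Syria", totalNumberOfStrikes: 2}
];

var dest = [];
source.forEach(function(d, i) {
    countryFields.forEach(function(c) {
        if (!d[c]) return; // skip empty or missing country columns
        dest.push({
            id: i, // hypothetical: index of the original row
            Date: d.Date,
            Country: d[c],
            Target: d["Target country"],
            totalNumberOfStrikes: d.totalNumberOfStrikes
        });
    });
});
// dest now holds one record per (row, country) pair: three records here
```

With records like these, a dimension on d.Country has exactly one key per unique country.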

The advantage of doing it this way is that now when you click on rows in the chart, you can filter the rest of the charts by an individual country. Since crossfilter only filters by row, there is no other way (without serious trickery) to filter by individual country without inadvertently filtering other countries that share those rows.

Gordon
  • Only issue here is that your counts and sums are going to be inflated on any dimension besides your country dimension. If you need to deal with this situation, there are ways of defining custom groupings that handle that problem. – Ethan Jewett Dec 08 '14 at 18:18
  • Ah, that's a good point. Are you thinking of reducing to an object with fields for each country? – Gordon Dec 08 '14 at 20:36
  • I am so lost, to be honest. Been tinkering around since yesterday, without any success :( – basbabybel Dec 10 '14 at 16:00
  • Sorry if this wasn't helpful. @Ethan, were you thinking of a "set of tags" approach like in http://stackoverflow.com/questions/17524627/is-there-a-way-to-tell-crossfilter-to-treat-elements-of-array-as-separate-record ? – Gordon Dec 11 '14 at 07:05
  • @Gordon Yes, that was the idea, I think. Do what you suggest (blow out the combinations into individual lines) and then use fancy grouping functions to be sure not to double-count. Reductio supports this for sums and counts - see the exception aggregation example at the bottom of the readme - https://github.com/esjewett/reductio You can see the general approach of keeping a list of values at https://github.com/esjewett/reductio/blob/master/src/value-list.js and the counting based on them in https://github.com/esjewett/reductio/blob/master/src/exception-count.js – Ethan Jewett Dec 15 '14 at 17:15
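
The double-counting concern from the comments can be illustrated with a small sketch. This is not reductio's API, only a hand-rolled version of the "keep a list of values" idea under stated assumptions: each flattened record carries a hypothetical id of the original row, and the reducers count an original row once no matter how many flattened records it produced. The names addRow, removeRow and initial are made up; in crossfilter, functions of this shape would be passed to group.reduce(add, remove, initial).

```javascript
// Reducers that count distinct original rows via a per-id reference count.
function addRow(p, v) {
    p.seen[v.id] = (p.seen[v.id] || 0) + 1;
    if (p.seen[v.id] === 1) { // first flattened copy of this original row
        p.count += 1;
        p.sum += v.totalNumberOfStrikes;
    }
    return p;
}

function removeRow(p, v) {
    p.seen[v.id] -= 1;
    if (p.seen[v.id] === 0) { // last copy of this original row is gone
        delete p.seen[v.id];
        p.count -= 1;
        p.sum -= v.totalNumberOfStrikes;
    }
    return p;
}

function initial() {
    return {seen: {}, count: 0, sum: 0};
}

// Made-up flattened records: row 0 was expanded into two records, row 1 into one.
var flat = [
    {id: 0, Country: "USA", totalNumberOfStrikes: 3},
    {id: 0, Country: "France", totalNumberOfStrikes: 3},
    {id: 1, Country: "USA", totalNumberOfStrikes: 2}
];

var result = flat.reduce(addRow, initial());
// result.count is 2 and result.sum is 5; a naive count and sum would give 3 and 8
```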