All of the d3 tutorials I've found use data arranged in arrays of objects from which they graph one point for each object in the array. Given data in the following structure:

data = [
     {id: 1, x: 4, y: 10, type: 1},
     {id: 2, x: 5, y: 20, type: 2}
     ...
]

The x and y values are used to make a scatterplot. The type parameter is used to change the color of each point. See this jsfiddle for an example: http://jsfiddle.net/uxbHv/

Unfortunately, I have a different data structure, and I can't figure out how to create the same graph by drawing two data points for each object. Here is some example data:

dataSet = [
     {xVar: 5, yVar1: 90, yVar2: 22},
     {xVar: 25, yVar1: 30, yVar2: 25},
     {xVar: 45, yVar1: 50, yVar2: 80},
     {xVar: 65, yVar1: 55, yVar2: 9},
     {xVar: 85, yVar1: 25, yVar2: 95}
]

I can graph xVar individually against yVar1 or yVar2, but I cannot figure out how to get both on the same graph: http://jsfiddle.net/634QG/

Andrew Staroscik

2 Answers

The general rule when using a data-join is that you want a one-to-one mapping from data to elements. So, if you have two series in your scatterplot, you’ll want two container elements (such as G elements) to represent the series. Since you currently have only one data array, you’ll also want to use array.map to convert the data representation into two parallel arrays with the same representation. This way, you don’t have to duplicate code for each series.

Say your data was represented in a CSV file with one column for the x-values, and multiple other columns for the y-values of each series:

x,y1,y2
5,90,22
25,30,25
45,50,80
65,55,9
85,25,95

If you want the code to be completely generic, you first need to compute the series’ names, such as ["y1", "y2"]. (If you added a third column to the CSV file, it might be ["y1", "y2", "y3"].) You can compute the names using d3.keys, which extracts the named properties from an object. For example, d3.keys({foo: 1, bar: 2}) returns ["foo", "bar"].

// Compute the series names ("y1", "y2", etc.) from the loaded CSV.
var seriesNames = d3.keys(data[0])
    .filter(function(d) { return d !== "x"; })
    .sort();

Now that you have the series names, you can create an array of arrays of points. The outer array represents the series (of which there are two) and the inner arrays store the data points. You can simultaneously convert the points to a consistent representation (objects with x and y properties), allowing you to reuse code across series.

// Map the data to an array of arrays of {x, y} tuples.
var series = seriesNames.map(function(series) {
  return data.map(function(d) {
    return {x: +d.x, y: +d[series]};
  });
});
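
As a rough illustration, here are the same two steps applied to the `dataSet` from the question, which uses `xVar`, `yVar1`, and `yVar2` rather than `x`, `y1`, and `y2`. This sketch uses plain `Object.keys` as a stand-in for `d3.keys` so that it runs without D3 loaded. The name `xKey` is introduced here only for illustration, and the sketch assumes every property other than `xKey` holds one series of y-values.

```javascript
// Sample data copied from the question.
var dataSet = [
  {xVar: 5,  yVar1: 90, yVar2: 22},
  {xVar: 25, yVar1: 30, yVar2: 25},
  {xVar: 45, yVar1: 50, yVar2: 80},
  {xVar: 65, yVar1: 55, yVar2: 9},
  {xVar: 85, yVar1: 25, yVar2: 95}
];

// Assumption: this property holds the shared x-value; all others are series.
var xKey = "xVar";

// Series names: every key of the first row except the x key.
var seriesNames = Object.keys(dataSet[0])
    .filter(function(key) { return key !== xKey; })
    .sort();

// One inner array of {x, y} points per series name.
var series = seriesNames.map(function(name) {
  return dataSet.map(function(d) {
    return {x: +d[xKey], y: +d[name]};
  });
});

console.log(seriesNames);   // ["yVar1", "yVar2"]
console.log(series[0][0]);  // {x: 5, y: 90}
console.log(series[1][4]);  // {x: 85, y: 95}
```

Each inner array can then be bound to its own G element, as described next.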

Note that this code uses the + operator to coerce the CSV values to numbers. (CSV files are untyped, so every value is initially a string.)
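
To see why the coercion matters, here is a small sketch using one row of the CSV above, written out by hand as the strings a CSV parser would produce. The variable names are made up for this illustration.

```javascript
// The row "65,55,9" as it would arrive from a CSV parser: all strings.
var row = {x: "65", y1: "55", y2: "9"};

// Without coercion, comparison is lexicographic and + concatenates.
var stringCompare = row.y2 > row.y1;   // "9" > "55" is true
var concatenated = row.x + row.y2;     // "659"

// With unary +, the values behave as numbers.
var numberCompare = +row.y2 > +row.y1; // 9 > 55 is false
var summed = +row.x + +row.y2;         // 74

console.log(stringCompare, numberCompare, concatenated, summed);
```

Any minimum, maximum, or sort computed on the uncoerced strings would be subject to the same lexicographic ordering.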

Now that you’ve mapped your data to a regular format, you can create G elements for each series, and then circle elements within for each point. The resulting SVG structure will look like this:

<g class="series">
  <circle class="point" r="4.5" cx="1" cy="2"/>
  <circle class="point" r="4.5" cx="3" cy="2"/>
  …
</g>
<g class="series">
  <circle class="point" r="4.5" cx="5" cy="4"/>
  <circle class="point" r="4.5" cx="7" cy="6"/>
  …
</g>

And the corresponding D3 code:

// Add the points!
svg.selectAll(".series")
    .data(series)
  .enter().append("g")
    .attr("class", "series")
    .style("fill", function(d, i) { return z(i); })
  .selectAll(".point")
    .data(function(d) { return d; })
  .enter().append("circle")
    .attr("class", "point")
    .attr("r", 4.5)
    .attr("cx", function(d) { return x(d.x); })
    .attr("cy", function(d) { return y(d.y); });
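
The code above relies on scales `x`, `y`, and `z` that are defined elsewhere. As a rough sketch of one way to find the x and y domains across all series, the plain-JavaScript helper below collects every point's value and takes the minimum and maximum. The helper name `extentAcross` is hypothetical, and the `series` array is written out by hand in the shape produced by the mapping step.

```javascript
// Same shape as the `series` array built earlier: one inner array per series.
var series = [
  [{x: 5, y: 90}, {x: 25, y: 30}, {x: 45, y: 50}, {x: 65, y: 55}, {x: 85, y: 25}],
  [{x: 5, y: 22}, {x: 25, y: 25}, {x: 45, y: 80}, {x: 65, y: 9},  {x: 85, y: 95}]
];

// Hypothetical helper: [min, max] of accessor(point) over all points in all series.
function extentAcross(allSeries, accessor) {
  var values = [];
  allSeries.forEach(function(points) {
    points.forEach(function(p) { values.push(accessor(p)); });
  });
  return [Math.min.apply(null, values), Math.max.apply(null, values)];
}

var xDomain = extentAcross(series, function(p) { return p.x; });
var yDomain = extentAcross(series, function(p) { return p.y; });

console.log(xDomain); // [5, 85]
console.log(yDomain); // [9, 95]
```

With D3 loaded, these pairs could be passed to the scales with `x.domain(xDomain)` and `y.domain(yDomain)`. Combining `d3.merge` with `d3.extent` and an accessor should yield the same values.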

I’ve also added a bit of code to assign each series a unique color by adding a fill style to the containing G element. There are lots of different ways to do this, of course. (You might want to be more specific about the color for each series, for example.) I’ve also left out the code that computes the domains of your x and y scales (as well as rendering the axes), but if you want to see the entire working example:

mbostock
  • almost a tutorial...this and the other similar questions that were answered through such long answers by Mike Bostock should be put into the FAQ – paxRoman Jul 26 '12 at 21:55

Place the two circles for each data point into a single svg:g element. This produces a one-to-one mapping for the data to elements but still allows you to show two different points.

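// Bind one svg:g per datum; the "node" class keeps the selector and the inserted elements in sync.
var nodeEnter = vis1.selectAll("g.node")
      .data(dataSet)
      .enter()
      .insert("svg:g")
      .attr("class", "node");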

nodeEnter.insert("svg:circle")
           .attr("cx", function (d) { return 100 - d.xVar})
           .attr("cy", function (d) { return 100 - d.yVar1})
           .attr("r", 2)
           .style("fill", "green");

nodeEnter.insert("svg:circle")
           .attr("cx", function (d) { return 100 - d.xVar})
           .attr("cy", function (d) { return 100 - d.yVar2})
           .attr("r", 2)
           .style("fill", "blue");

Working JSFiddle.

Brant Olsen
  • Thanks for the solution! I see how this works with static data, but now I am wondering if it will be difficult to add transitions and exits.... back to jsfiddle... – Andrew Staroscik Jul 26 '12 at 16:38
  • This might work (now), but I would strongly discourage this pattern—the API is not designed to work this way. When using a data-join, you should always have a one-to-one mapping between data and elements. Inserting or appending to the enter selection twice is a misuse of the API. The correct solution in this case is to create two `vis` elements, and map the data to two parallel arrays with a consistent representation. – mbostock Jul 26 '12 at 16:56
  • @AndrewStaroscik No, that’s still entering twice against the same selection. (You’ve created two different selection objects, but conceptually they represent the same elements so they are colliding.) See the answer I just added. – mbostock Jul 26 '12 at 17:56
  • @mbostock Updated. Does my update make sense or am I still just hacking the `D3` api? – Brant Olsen Jul 26 '12 at 19:21
  • I'm curious about Mike's take on the update also. My understanding is that adding additional sub-elements in this way (grouped by a parent element that represents the 1-to-1 mapping with the enter selection) is OK. Mike's solution is probably cleaner because the data does represent 2 series. But there may be scenarios where you wish to create multiple sub-elements from *exactly the same* bound data, and I believe the grouping technique you show in your updated code is the correct way to do this. But maybe Mike can confirm or deny this. :) – Scott Cameron Jul 27 '12 at 16:14
  • Mike's solution is better because it works even when you increase the number of series. Your approach resembles this solution http://stackoverflow.com/a/13627065/911207 and would be more appropriate if you were doing something like adding a permanent text label to every point of ONE series. – David Braun Jun 19 '14 at 14:22