
I am learning d3.js and I have recently been encountering issues with not being able to access variables inside the d3.csv() function. I have just been initializing my variables at the beginning of my program to just make all my variables global.

This made me wonder if there is an issue if I were to just put all my code inside the d3.csv function, removing the need to even initialize my variables at the beginning of my code so it would look like:

d3.csv("data.csv", (data) => {
    // all of my code
});

Is there a downside to this (assuming I'm only using one CSV file) or is there some benefit to keeping code that doesn't need the data outside of the d3.csv method?

Gerardo Furtado
J. Hurley
  • Each function within your code should do one thing, and do it well. If you put all your code within a callback function, you'll soon run into issues where you have too many nested callbacks – EDToaster Jul 10 '19 at 20:33

1 Answer


Note: Since you're asking about the callback of d3.csv I'm assuming you're using D3 v4 or below, because D3 v5 uses the then method of a promise. However, the rationale is the same.


The most important point is that d3.csv, like all other D3 XHR methods, is an asynchronous function. That means that everything inside the callback runs only after the CSV has been downloaded and parsed.

//Outside the callback
//Code here runs immediately

d3.csv("example.csv", (data) => {
    //Inside the callback
    //Code here runs only after the CSV was downloaded and parsed
});

//Outside the callback
//Even if these lines come after d3.csv, code here runs before the code inside the callback

By the way, that explains your initial complaint ("... I have recently been encountering issues with not being able to access variables inside the d3.csv() function"). This answer is a good read on that subject.
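To make the timing concrete, here is a minimal sketch in plain JavaScript. Note that `fakeCsv` is a hypothetical stand-in for `d3.csv` (it uses `setTimeout` in place of a real download), and the file name and rows are made up for illustration.

```javascript
// Hypothetical stand-in for d3.csv: calls back asynchronously with made-up rows.
// No real file is read; "example.csv" is only a label here.
function fakeCsv(url, callback) {
  setTimeout(() => callback([{ name: "a", value: "1" }, { name: "b", value: "2" }]), 10);
}

const order = []; // records the order in which things run
let rows;         // declared outside the callback so every scope can see it

order.push("before request");

fakeCsv("example.csv", (data) => {
  rows = data; // assigned only once the simulated download finishes
  order.push("inside callback");
  console.log("inside callback:", rows.length, "rows");
});

order.push("after request");
// The callback has not run yet, so rows is still undefined at this point.
console.log("after request:", rows);
```

The "after request" line is logged first, with `rows` still undefined. Declaring the variable globally does not help by itself; the code that reads it also has to run after the callback has fired.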

With that in mind, we should organise the code so that things that don't depend on the data are created/set immediately, because if we put them inside the callback they'll wait for the download without any good reason.

In a nutshell, you can put outside the callback things such as (but not limited to):

  • Selecting/creating the SVG, canvas or HTML containers
  • Scales (with ranges)
  • Axes generators
  • Line generators
  • Area generators
  • Stack generators
  • Pie generators
  • Histogram generators
  • Map projections
  • Hierarchy layouts
  • Formats (like time formats)
  • Force simulators
  • Drag behaviours
  • Zoom behaviours

None of those things depend on any data. For some of them (like the line generator, the area generator, the stack generator etc.) you'll pass the data after you have it.

Then, inside the callback, you put everything that depends on the data, such as (but not limited to):

  • Update, enter and exit selections
  • Scale domains
  • Calling axes generators
  • Setting the simulation's nodes and links
  • Nests
  • Passing the data to line generators, area generators, stack generators etc...
  • Transitions (that depend on the data)
  • Event listeners (that depend on the data)

As you can see, if you put everything inside the callback you'll have a bunch of methods that could run immediately but, instead, are just sitting there unnecessarily waiting for the data to be downloaded.
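The split described above can be sketched without any library. In this sketch, `makeScale` and `fakeCsv` are hypothetical plain-JavaScript stand-ins for `d3.scaleLinear` and `d3.csv`, and the rows are invented; the point is only where each step sits, not the D3 API itself.

```javascript
// Hypothetical stand-in for a linear scale: range and domain can be set separately.
function makeScale() {
  let domain = [0, 1];
  let range = [0, 1];
  const scale = (v) =>
    range[0] + ((v - domain[0]) / (domain[1] - domain[0])) * (range[1] - range[0]);
  scale.domain = (d) => { domain = d; return scale; };
  scale.range = (r) => { range = r; return scale; };
  return scale;
}

// Hypothetical stand-in for d3.csv: calls back asynchronously with made-up rows.
function fakeCsv(url, callback) {
  setTimeout(() => callback([{ value: "10" }, { value: "40" }, { value: "25" }]), 10);
}

// Outside the callback: needs only the chart size, so it runs immediately.
const width = 500;
const x = makeScale().range([0, width]);

let positions = null;

fakeCsv("example.csv", (data) => {
  // Inside the callback: everything that needs the actual values.
  const values = data.map((d) => +d.value); // CSV fields arrive as strings
  x.domain([Math.min(...values), Math.max(...values)]);
  positions = values.map((v) => x(v));
  console.log(positions);
});
```

With D3 itself, `d3.scaleLinear().range([0, width])` would sit where `makeScale()` is, and the domain would be set from the parsed rows inside the callback.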

Gerardo Furtado
  • Does this mean that I can create scales with ranges along with axes generators that call those scales outside the `d3.csv()` function and then once the data is parsed go into the `d3.csv()` function and add the domains and call the axes generators? I definitely see how that would allow for more tasks to run in parallel. – J. Hurley Jul 16 '19 at 16:21
  • Yes, you can. This is what we do in almost all D3 code. – Gerardo Furtado Jul 16 '19 at 23:37