I want to be able to update the data and visualization on a chart made with D3, from a large JSON file. The data comes from the USDA's nutrient database. The JSON file was sourced from here: http://ashleyw.co.uk/project/food-nutrient-database, and it is incredibly large (about 30 MB). Just loading it is a hassle: Notepad++ loads it all on one line (after about a minute), while Notepad loads it across multiple lines (in about 20 seconds) with poor formatting. Is it possible to efficiently use a JSON file that large? Will it crash a browser or cause some kind of loading lag?
- Yes, it will cause some loading lag, but to my knowledge it shouldn't crash any browsers. Do you need all of the data? i.e. can you pre-process it to cut away any information you don't need? – Richard Marr Oct 15 '13 at 08:50
- I do not need all the data. However, since the file is so large I don't know how to meaningfully strip away the unwanted data. Just viewing it in a reader-friendly format seems to be a challenge. It's an array of 6600 objects, each nested with about 75 more objects. – Yansee Oct 15 '13 at 09:06
- If you're using D3 I assume your Javascript Fu is strong. My advice would be to install Node.js (if you haven't already) and process the file that way. It's easy enough to do. Let me see if I can come up with an example. – Richard Marr Oct 15 '13 at 09:09
- Unfortunately, D3 was my intro to JavaScript, HTML, SQL, PHP, etc. This journey started about a month ago when I found D3 and realized I'd have to learn the others before I dabbled in D3. I'm working with 3 weeks of w3schools.com. I managed to get a D3 bar chart and webpage working. Here is my fiddle: http://jsfiddle.net/YQthy/ – Yansee Oct 15 '13 at 09:39
- Well you should be able to load that file through D3 as it is. I'd suggest trying it and seeing what happens. If it's too large you'll need to think about processing it, and Node is a good option for that (and easy to get into if you know any Javascript). Good luck with it all, and shout if you have any more questions, you're in the right place :) – Richard Marr Oct 15 '13 at 09:51
- Depends on what you want to do with the data. If you need access to all the data at once (to plot it) then @Richard has the right idea of pre-processing the data. Alternatively, if you need only a subset of the data at a time, then you could partition the data and build an index. That would work quite well if you just wanted to compare a handful of data points. If you have a Twitter account then [download your data](https://blog.twitter.com/2012/your-twitter-archive) and see how Twitter have broken down their JSON by month, loading each file as and when needed. – Christopher Hackett Oct 15 '13 at 10:02
- Thanks for the responses. I spent the last 2 hours figuring out how to install (and use) Node.js. Very complicated, and very few current tutorials on the topic. Still, I managed to test your example and was surprised at how quickly it output the result. I will experiment with it to refine the array. @ChristopherHackett, I will download my Twitter archive and check out how it is formatted to see if it gives me any insight. Thanks. – Yansee Oct 15 '13 at 14:04
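Christopher Hackett's partition-and-index idea from the comments above could be sketched like this. This is a hypothetical helper of my own (the name `partitionByGroup` is not from the thread), assuming each food object in the USDA file carries `group` and `description` fields:

```javascript
// Sketch of the partition-and-index approach: group the foods by food
// group and keep a small index, so the browser can later fetch only
// the partition it needs instead of the whole 30 MB file.
function partitionByGroup(foods) {
  var partitions = {}; // group name -> array of food objects
  var index = {};      // food description -> group name it lives in
  foods.forEach(function (food) {
    var group = food.group || 'Unknown';
    if (!partitions[group]) partitions[group] = [];
    partitions[group].push(food);
    index[food.description] = group;
  });
  return { partitions: partitions, index: index };
}
```

Each partition could then be written out to its own JSON file (plus one small index file), mirroring how the Twitter archive splits its data by month.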
1 Answer
As I mentioned above, my recommendation would be to pre-process the JSON to cut out anything you don't need. Below is an example script in Node.js that will read in the file you're using and spit out a new file that has a lot of the content filtered out.
In this example I'm ignoring all fields apart from description, and only including information about vitamins. There should still be 6600 elements in the root array.
The resulting file from this script is about 5 MB rather than 30 MB.
```javascript
var fs = require('fs');

// open the file
fs.readFile('./foods-2011-10-03.json', 'utf8', function (err, data) {
  if (err) throw err;

  var output = [];

  // parse the file from a string into an object
  data = JSON.parse(data);

  // loop through each element
  data.forEach(function (d, i) {
    // spit an example element into the console for inspection:
    // if ( i == 0 ) console.log(d);

    // decide which parts of the object you'd like to keep
    var element = {
      description: d.description,
      nutrients: []
    };

    // for example, here I'm just keeping vitamins
    d.nutrients.forEach(function (d, i) {
      if (d.description.indexOf("Vitamin") == 0) element.nutrients.push(d);
    });
    output.push(element);
  });

  fs.writeFile('./foods-output.json', JSON.stringify(output), function (err) {
    if (err) throw err;
    console.log('ok');
  });
});
```
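The per-element filtering step in the script above can also be isolated as a pure function, which makes it easy to check against a small sample before running it over the full 30 MB file. A sketch (the function name `filterVitamins` is my own, not from the script):

```javascript
// Keep only the description plus the nutrients whose description
// starts with "Vitamin" - the same logic as the forEach in the
// script above, expressed as a testable pure function.
function filterVitamins(food) {
  return {
    description: food.description,
    nutrients: food.nutrients.filter(function (n) {
      return n.description.indexOf('Vitamin') === 0;
    })
  };
}
```

With this in place, the main loop reduces to `output = data.map(filterVitamins)`.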

Richard Marr
- Note that if the file was larger, or you have memory constraints, you could also consider using streams to filter the file element by element. See this answer here: http://stackoverflow.com/questions/11874096/parse-large-json-file-in-nodejs – Richard Marr Oct 15 '13 at 09:27
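The core bookkeeping behind that streaming approach can be sketched with a small incremental splitter: feed it chunks of text and it hands back each complete element of the root array, so the whole 30 MB never has to be parsed as one structure. This is my own minimal illustration (the `ArrayElementSplitter` name is hypothetical); in practice the JSONStream package from the linked answer does this job more robustly, and a real version would wrap this logic in a Node Transform stream:

```javascript
// Incremental splitter for a JSON file whose root is an array of
// objects. Chunks may split an element anywhere; state is carried
// across write() calls. Braces inside strings are ignored by
// tracking string/escape state.
function ArrayElementSplitter(onElement) {
  this.onElement = onElement;
  this.depth = 0;       // current brace/bracket nesting depth
  this.buf = '';        // text of the element being collected
  this.inString = false;
  this.escaped = false;
}

ArrayElementSplitter.prototype.write = function (text) {
  for (var i = 0; i < text.length; i++) {
    var c = text[i];
    if (this.depth >= 2) this.buf += c;    // inside an element
    if (this.inString) {
      if (this.escaped) this.escaped = false;
      else if (c === '\\') this.escaped = true;
      else if (c === '"') this.inString = false;
    } else if (c === '"') {
      this.inString = true;
    } else if (c === '{' || c === '[') {
      this.depth++;
      if (this.depth === 2) this.buf = c;  // a new element starts
    } else if (c === '}' || c === ']') {
      this.depth--;
      if (this.depth === 1) {              // element complete
        this.onElement(JSON.parse(this.buf));
        this.buf = '';
      }
    }
  }
};
```

Hooked up to `fs.createReadStream`, each parsed element can be filtered and appended to the output file as it arrives, keeping memory use roughly constant.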