So I'm making a little scraper for learning purposes, in the end I should get a tree-like structure of the pages on the website.
I've been banging my head trying to get the requests right. This is more or less what I have:
var request = require('request');
function scanPage(url) {
// request the page at given url:
request.get(url, function(err, res, body) {
var pageObject = {};
/* [... Jquery mumbo-jumbo to
1. Fill the page object with information and
2. Get the links on that page and store them into arrayOfLinks
*/
var arrayOfLinks = ['url1', 'url2', 'url3'];
for (var i = 0; i < arrayOfLinks.length; i++) {
pageObj[arrayOfLinks[i]] = scanPage[arrayOfLinks[i]];
}
});
return pageObj;
}
I know this code is wrong on many levels, but it should give you an idea of what I'm trying to do.
How should I modify it to make it work? (without the use of promises if possible)
(You can assume that the website has a tree-like structure, so every page only has links to pages further down the three, hence the recursive approach)