
I need to get all the data from an API. The data is served in batches (pages), and every batch has its own page number.

My solution:

  1. GET page 1 and save it to a file
  2. Loop by adding +1 to the page number and appending the result of each GET request to the file
  3. Continue while there is no error

Currently the file is created and then I get: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory

So I used --max-old-space-size=8192 and there have been no errors since then, but it just keeps working with no result: the file stays empty.

Please help!

const fs = require('fs');
const axios = require('axios');
const { response } = require('express');

var myWriteStream = fs.createWriteStream(
  '../dev-data/file.json',
  { flags: 'a' },
  { encoding: 'utf8' },
  err => {}
);

let pageNumber = 1;

// Getting initial batch on Page 1

axios
  .get(`https://api.example.com/?page=${pageNumber}`)
  .then(function (response) {
    var json = JSON.stringify(response.data);

// Saving result to the file

    fs.writeFile('../dev-data/declarations_list.json', json, 'utf-8', err => {
    });

// Looping GET + save to the file by adding + 1 to currentPage 

    do {
      pageNumber = response.data.page.currentPage + 1;
      axios
        .get(
            `https://api.example.com/?page=${pageNumber}`
        )
        .then(function (response) {
          console.log(`Current page: ${response.data.page.currentPage}`);
          pageNumber = response.data.page.currentPage;

          var json = JSON.stringify(response.data);

          myWriteStream.write(json);
        })
        .catch(function (error) {
          console.log(error);
        });

// Do while currentPage (no 'error')

    } while (response.data.page.currentPage);
  });

UPDATE

const fs = require('fs');
const axios = require('axios');
const { response } = require('express');

let pageNumber = 0;

do {
  pageNumber = pageNumber + 1;
  console.log(pageNumber);
  axios
    .get(`https://public-api.nazk.gov.ua/v1/declaration/?page=${pageNumber}`)
    .then(function (response) {
      console.log(response);
      console.log(`Current page: ${response.data.page.currentPage}`);
      pageNumber = response.data.page.currentPage;
      var json = JSON.stringify(response.data);
      fs.appendFileSync('../dev-data/declarations_list.json', json);
    })
    .catch(function (error) {
      console.log(error);
    });
  
} while (pageNumber < 15000);
Olex

1 Answer

This is not tested (because I lack API access), but I would try to write to the file every time a new page is loaded, basically like so:

const fs = require('fs');
const axios = require('axios');
const { response } = require('express');

let pageNumber = 0;
var stream = fs.createWriteStream('../dev-data/declarations_list.json', {flags:'a'});

do {
  pageNumber++;
  axios
    .get(
        `https://api.example.com/?page=${pageNumber}`
    )
    .then(function (response) {
      console.log(`Current page: ${response.data.page.currentPage}`);
      pageNumber = response.data.page.currentPage;
      var json = JSON.stringify(response.data);
      stream.write(json);
    })
    .catch(function (error) {
      console.log(error);
    });
    // Do while currentPage (no 'error')
} while (pageNumber < <total_number_of_pages>);

stream.end();

Also, do not nest these axios calls. There is no need to, and several problems might arise from that. However, I think the biggest problem was the way you wrote to that stream.
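
For illustration, here is a minimal, untested sketch of a flat (un-nested) version that uses async/await, so each page is fully fetched and written before the next request starts. The URL is the placeholder from the code above, and the totalPages argument is an assumption that would have to come from the API or be set manually:

const fs = require('fs');
const axios = require('axios');

const stream = fs.createWriteStream('../dev-data/declarations_list.json', { flags: 'a' });

async function fetchAllPages(totalPages) {
  // One request at a time: wait for each response before asking for the next page.
  for (let pageNumber = 1; pageNumber <= totalPages; pageNumber++) {
    try {
      const response = await axios.get(`https://api.example.com/?page=${pageNumber}`);
      console.log(`Current page: ${response.data.page.currentPage}`);
      stream.write(JSON.stringify(response.data));
    } catch (error) {
      console.log(error);
      break; // stop at the first failed request
    }
  }
  stream.end(); // close the stream only after the last write
}

fetchAllPages(15000); // 15000 is taken from the question's update; adjust as needed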

Apart from that, your loop never ends if there is no error. You will need to provide the total number of pages you would like to retrieve.
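
If the API reports a total count in its paging metadata, the page count does not have to be hard-coded. A small sketch, assuming the first response exposes fields like page.totalItems and page.batchSize (these names are guesses and must be checked against the real API):

const axios = require('axios');

async function getTotalPages(baseUrl) {
  // Request only the first page to read the paging metadata.
  const first = await axios.get(`${baseUrl}?page=1`);
  const { totalItems, batchSize } = first.data.page; // assumed field names
  return Math.ceil(totalItems / batchSize);
}

// Example usage with the placeholder URL from above:
// getTotalPages('https://api.example.com/').then(n => console.log(`Total pages: ${n}`));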

It seems to me that you are not very experienced with this yet, so you might want to look up something like "Nodejs and Express save JSON response to file" first, before going any further...

frankenapps
  • Thanks a lot! It is a huge leap forward. I have posted the real updated code. Now the error is: Error: connect ENFILE - Local (undefined:undefined). As I can see from the log, it prints the numbers first, up to 15000, and then the responses with errors. I guess I need to figure out how to run this loop step by step. – Olex Aug 25 '20 at 13:32
  • Ah, ok. Then you will need a stream nevertheless. I had not expected that you would get that many responses, but after a while the limit of possible file handles is exceeded on your system. See https://stackoverflow.com/a/43370201/4593433 for a solution using a stream. – frankenapps Aug 27 '20 at 11:50
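
To illustrate the comment above: the do/while in the update starts all 15000 requests almost at once, because axios.get returns immediately. One untested way to keep the number of open sockets and file handles bounded is to fetch the pages in small groups and reuse a single write stream (the group size is an arbitrary assumption; the URL is the one from the update):

const fs = require('fs');
const axios = require('axios');

const stream = fs.createWriteStream('../dev-data/declarations_list.json', { flags: 'a' });
const BATCH_SIZE = 10; // number of requests in flight at the same time (assumption)

async function fetchInBatches(totalPages) {
  for (let start = 1; start <= totalPages; start += BATCH_SIZE) {
    // Build one small group of requests...
    const requests = [];
    for (let p = start; p < start + BATCH_SIZE && p <= totalPages; p++) {
      requests.push(axios.get(`https://public-api.nazk.gov.ua/v1/declaration/?page=${p}`));
    }
    // ...and wait for the whole group before starting the next one,
    // so only BATCH_SIZE connections are ever open at once.
    const responses = await Promise.all(requests);
    for (const response of responses) {
      stream.write(JSON.stringify(response.data));
    }
    console.log(`Fetched pages ${start} to ${start + requests.length - 1}`);
  }
  stream.end();
}

fetchInBatches(15000).catch(err => console.log(err));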