
I'm trying to process an image collection (~2000 images) with NodeJS. I'm able to extract the information needed, but I'm having a hard time getting the timing right so that I can save the outcome to a JSON file.

Towards the end you'll see

console.log(palette);
// Push single image data to output array.
output.push(palette);

The console.log works fine, but the push appears to be executed after the empty output array has already been written to data.json. I tried using a nested promise so that the file wouldn't be written until all images had been processed.

The callback function in getPixels gets executed asynchronously.

The order of the output array is arbitrary.

Any and all help greatly appreciated! Thank you!

// Extract color information from all images in imageDirectory

var convert   = require('color-convert'),
    fs        = require('fs'),
    getPixels = require("get-pixels"),
    startTime = Date.now();

var processedImages = new Promise((resolve, reject) => {
  var imageDirectory = 'input',
      images         = fs.readdirSync(imageDirectory),
      output         = [];

  console.log('Found ' + images.length + ' images.');

  for (var image in images) {
    var imageLoaded = new Promise((resolve, reject) => {
      getPixels(imageDirectory + '/' + images[image], function(error, pixels) {
        if(error) {
          return 'Bad image path';
        }
        resolve(pixels);
      });
    });

    imageLoaded.then((pixels) => {
      var palette = {
        coloredPixels  : 0,
        hues           : [],
        image          : images[image],
        classification : false,
        pixelCount     : null
      };

      palette.pixelCount = pixels.shape[0] *
                           pixels.shape[1] *
                           pixels.shape[2];

      for (var i = 0; i < 256; i++) {
        palette.hues[i] = 0;
      }

      for (var i = 0; i < palette.pixelCount; i += 4) {
        var rgb        = [pixels.data[i    ],
                          pixels.data[i + 1],
                          pixels.data[i + 2]],
            hsl        = convert.rgb.hsl(rgb),
            hue        = hsl[0],
            saturation = hsl[1];

        if (saturation) {
          palette.hues[hue]++;
          palette.coloredPixels++;
        }
      }
      console.log(palette);
      // Push single image data to output array.
      output.push(palette);
    });
  }
  resolve(output);
});

processedImages.then((output) => {
  // write output array to data.json
  var json = JSON.stringify(output, null, 2); 
  fs.writeFileSync('data.json', json);

  // Calculate time spent
  var endTime = Date.now();
  console.log('Finished in ' + (endTime - startTime) / 1000 + ' seconds.');
});
Knut
  • `for (var image in images) {` <- this will execute the loop immediately, and the async functions will get stacked. You then call `resolve(output);` at the end of the loop, so basically you're resolving instantly. – Keith Sep 27 '17 at 23:43
  • You need to use `Promise.all` instead of `new Promise` around that loop. – Bergi Sep 28 '17 at 00:01
  • Your `image` variable [does have a scope problem](https://stackoverflow.com/q/750486/1048572) – Bergi Sep 28 '17 at 00:02
  • `You need to use Promise.all` hope the images are not too big then.. (~2000 images) – Keith Sep 28 '17 at 00:02
  • Thank you all for your input. I'm trying to implement Andy's solution. The images are 256x256px and are on average around 50KB each. – Knut Sep 28 '17 at 00:09
  • 50KB isn't too bad then, about 100 MB in total. Just be aware that all of these will get processed at the same time, which includes taking up memory and IO thrashing. Bluebird's promise lib has a nice `Promise.map` with a concurrency option, so here you could say process at most 10 images at once. – Keith Sep 28 '17 at 00:17
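
The concurrency cap mentioned in the last comment can be approximated without Bluebird. This is only a minimal sketch, not Bluebird's implementation: `mapWithConcurrency` is a hypothetical helper name, and the worker passed to it is assumed to be whatever function loads and processes a single image.

```javascript
// Hypothetical helper (not part of any library): run `worker` over `items`
// with at most `limit` promises in flight, returning results in input order.
function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  // Each runner keeps pulling the next unprocessed index until none remain.
  async function runner() {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  // Start up to `limit` runners; together they drain the whole list.
  const runners = [];
  for (let i = 0; i < Math.min(limit, items.length); i++) {
    runners.push(runner());
  }
  return Promise.all(runners).then(() => results);
}
```

With something like `mapWithConcurrency(images, 10, processOneImage)`, where `processOneImage` is an assumed function returning a promise for one palette, no more than ten images would be loaded at once, and the file would still be written only after the returned promise resolves.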

1 Answer

What you want to do is transform an array of "images" to an array of promises and wait for all promises to resolve, and then perform more transformations. Think of it as a series of transformations, because that's what you're doing here. In a nutshell:

// One promise per image; each resolves with that image's pixel data.
const imagePromises = images.map(image => new Promise((resolve, reject) => {
  getPixels(imageDirectory + '/' + image, (error, pixels) => {
    if (error) {
      reject('Bad image path');
      return;
    }
    resolve(pixels);
  });
}));

// Resolves once every image has loaded; results keep the order of `images`.
const output = Promise.all(imagePromises).then(results =>
  results.map(pixels => {
    return {
      // do your crazy palette stuff (build a palette object)
    };
  })
);
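
To see the whole flow run end to end without any image files, here is a rough sketch under some assumptions: `fakeGetPixels` is a made-up stand-in for `getPixels` that calls back with a tiny fake pixel buffer, and `loadPixels` and `processImages` are hypothetical names. The palette is cut down to two fields to keep it short. Because `map` gives every callback its own `image`, the scope problem with `var image` in the original loop doesn't come up.

```javascript
// Made-up stand-in for get-pixels so this runs without image files:
// it calls back asynchronously, with delays that vary per path.
function fakeGetPixels(path, callback) {
  const delay = path.length % 3;
  setTimeout(() => {
    if (path.endsWith('missing.png')) {
      callback(new Error('Bad image path: ' + path));
      return;
    }
    callback(null, { shape: [1, 1, 4], data: [255, 0, 0, 255] });
  }, delay);
}

// Wrap the callback API in a promise, rejecting on error.
function loadPixels(path) {
  return new Promise((resolve, reject) => {
    fakeGetPixels(path, (error, pixels) => {
      if (error) {
        reject(error);
        return;
      }
      resolve(pixels);
    });
  });
}

// Resolves only after every image has loaded; results keep the input order.
function processImages(imageDirectory, images) {
  return Promise.all(
    images.map((image) =>
      loadPixels(imageDirectory + '/' + image).then((pixels) => ({
        image: image,
        pixelCount: pixels.shape[0] * pixels.shape[1] * pixels.shape[2],
      }))
    )
  );
}

// The real script would call fs.writeFileSync('data.json', ...) here instead.
processImages('input', ['a.png', 'bb.png']).then((output) => {
  console.log(JSON.stringify(output, null, 2));
});
```

The point of the sketch is that the write happens inside the `then` of the `Promise.all` result, so it cannot run before the last image has been processed.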
Andy Gaskell
  • I don't see why not, but you could always break the large array of promises into chunks. – Andy Gaskell Sep 28 '17 at 00:16
  • `I don't see why not,` someone with your rep, I think you do. Yes, breaking it into chunks, or even just doing it linearly. Anyway, OP has said they're only 50K each, so it's not really an issue; making them 7 meg each could certainly cause issues. – Keith Sep 28 '17 at 00:22
  • This question is not about how to scale a solution - that is a different problem. The issue at hand is understanding how to work with multiple promises. – Andy Gaskell Sep 28 '17 at 00:26
  • `that is a different problem` that you could have just introduced. Anyway, I can see where this is going. – Keith Sep 28 '17 at 00:28
  • Looking forward to your answer, my dude. – Andy Gaskell Sep 28 '17 at 00:29
  • `Looking forward to your answer`, is that meant to imply something? Anyway, I have kind of already mentioned an alternative, something like Bluebird's map with {concurrency}. And the OP has said he's going with your solution, so be happy! – Keith Sep 28 '17 at 00:39