
As I understand it, async only works on arrays.

My application reads a 1.2GB file, and I want to read it in parts of 1024KB. Because of RAM constraints, I only want to read 10 parts at a time. From the documentation, `eachLimit(arr, 10, iterator, callback)` is the right function for me.

The problem is that I can't put all the parts into the array in the first place: if I did, the RAM problem would come right back, and limiting the concurrency would be pointless.

In other words, I want to switch the following loop:

    // stats.size = 1200000000; partSize = 1024000
    for (var rangeStart = 0; rangeStart < stats.size; rangeStart += partSize) {
      // put the part of the file in this range into a buffer
    }

to a version that processes 10 parts at a time and only moves on once those have completed.
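
One way around this is to fill the array with byte offsets instead of buffers, and let the iterator read each part on demand. A minimal sketch, assuming async's `eachLimit` and a hypothetical `processPart` function standing in for whatever work each part needs:

    var fs = require('fs');
    var async = require('async');

    var filePath = 'large-file.bin'; // hypothetical path
    var partSize = 1024000;          // 1024KB parts, as above

    // Hypothetical per-part work; replace with the real operation.
    function processPart(chunk, done) {
      console.log('got', chunk.length, 'bytes');
      done();
    }

    fs.stat(filePath, function (err, stats) {
      if (err) throw err;

      // The array holds byte offsets, not file contents, so it stays tiny.
      var offsets = [];
      for (var rangeStart = 0; rangeStart < stats.size; rangeStart += partSize) {
        offsets.push(rangeStart);
      }

      fs.open(filePath, 'r', function (err, fd) {
        if (err) throw err;

        // At most 10 parts are buffered at once; each buffer can be
        // garbage-collected after its iterator calls done().
        async.eachLimit(offsets, 10, function (offset, done) {
          var length = Math.min(partSize, stats.size - offset);
          var buffer = new Buffer(length);
          fs.read(fd, buffer, 0, length, offset, function (err, bytesRead) {
            if (err) return done(err);
            processPart(buffer.slice(0, bytesRead), done);
          });
        }, function (err) {
          fs.close(fd, function () {});
          if (err) return console.error('failed:', err);
          console.log('all parts processed');
        });
      });
    });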

Or Smith
  • Do you really need simultaneous reads? If not, just use `eachLimit(arr, 1, ...)`; as a result you will get sequential parts. Yes, you can work with 10 simultaneous reads, but it will not speed up the HDD anyway. – alandarev Aug 06 '14 at 09:45
  • @alandarev: OK, but still, even if I do it your way, how do I put all the file's parts in the array? It will raise the RAM error. – Or Smith Aug 06 '14 at 10:25
  • You don't. What is your objective? You need to find a way of achieving it without having the whole file loaded into RAM. That was an everyday problem back in the 90's :) – alandarev Aug 06 '14 at 10:37
  • @alandarev: I need the parts of the file (1024KB each) and to run some function on each part (with the iterator). So I found a way: read each part, do the operation, free the memory, and then move on to the next part. I can only do this with the async each method. But that method requires an array which contains all the parts of the file, and that is exactly what I'm trying to avoid. – Or Smith Aug 06 '14 at 11:03
  • > But that method requires an array which contains all the parts of the file. ---- Does it? – alandarev Aug 06 '14 at 11:23
  • @alandarev: So you're saying I need to pass an array with one item each time? – Or Smith Aug 06 '14 at 11:26
  • I can't see the code, but I highly doubt async would for some reason require you to supply it with the file's contents. – alandarev Aug 06 '14 at 11:31
  • @OrSmith are you avoiding streaming the file for any reason? – nelsonic Aug 06 '14 at 16:05
  • It sounds like you should be using a [Transform stream](http://nodejs.org/api/stream.html#stream_class_stream_transform_1) (a minimal sketch of this follows these comments) – Mike S Aug 06 '14 at 21:43
  • @nelsonic: I can't do streaming because I have the file in a directory and want to upload it to S3, so I have to read the file. – Or Smith Aug 10 '14 at 08:37
  • @OrSmith you can `fs.createReadStream('directory/filename.txt')`, transform the data in the stream however you need to, and pipe the output to S3 using **Knox** `putStream()`. This is the true power of node.js; see: https://github.com/substack/stream-handbook – nelsonic Aug 10 '14 at 09:03
  • @nelsonic: WOW, thanks! It's working with knox. You can add it as an answer and I will accept it. Do you know how it works with pricing? Meaning, Amazon pricing is different for each PUT/GET. How does it work with streaming? – Or Smith Aug 10 '14 at 09:24
  • @OrSmith streaming a file counts as a single POST/PUT request, even if it's a petabyte. ;-) – nelsonic Aug 10 '14 at 21:15
  • @OrSmith detailed answer posted below. Would you mind updating your question to reflect this discussion (and thus make it easier for other people looking for how to upload large files to S3)? Thanks! – nelsonic Aug 11 '14 at 03:31
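
For reference, the Transform stream suggested in the comments can be as small as the sketch below; the pass-through body is a hypothetical placeholder for whatever per-chunk processing is needed:

    var fs = require('fs');
    var Transform = require('stream').Transform;

    var transform = new Transform();
    transform._transform = function (chunk, encoding, done) {
      // Hypothetical placeholder: do whatever per-chunk work is needed,
      // then push the (possibly modified) chunk downstream.
      this.push(chunk);
      done();
    };

    fs.createReadStream(__dirname + '/your-large-file.txt')
      .pipe(transform)
      .pipe(process.stdout); // the answer below pipes this kind of stream to S3 via Knox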

1 Answer


You don't need async to transform & upload large files to S3.

Simply stream the (large) file, do whatever transformation you need, and pipe the result directly to Amazon S3 using Knox.
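
In outline, the pattern looks roughly like this (the bucket name and credentials are placeholders, and since Knox's `putStream` needs a `Content-Length` header, this sketch streams the file unmodified rather than through a size-changing transform):

    var fs = require('fs');
    var knox = require('knox');

    var client = knox.createClient({
      key: process.env.AWS_ACCESS_KEY_ID,        // placeholder credentials
      secret: process.env.AWS_SECRET_ACCESS_KEY,
      bucket: 'your-bucket-name'                 // placeholder bucket
    });

    var file = __dirname + '/your-large-file.txt';

    fs.stat(file, function (err, stats) {
      if (err) throw err;

      var headers = {
        'Content-Length': stats.size, // putStream needs the length up front
        'Content-Type': 'text/plain'
      };

      client.putStream(fs.createReadStream(file), '/your-large-file.txt', headers,
        function (err, res) {
          if (err) throw err;
          console.log('Uploaded, status', res.statusCode); // 200 on success
        });
    });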

If you need a detailed example of how to do this, see: https://www.npmjs.org/package/stream-to-s3 (I wrote a quick node module to illustrate it for you =)

Installation:

npm install stream-to-s3

Usage:

    var S = require('stream-to-s3');
    var file = __dirname + '/your-large-file.txt';

    // Streams the file to S3; the callback fires once the upload has finished.
    S.streamFileToS3(file, function(){
      console.log('Awesomeness', file, 'was uploaded!');
      console.log('Visit:', S.S3FileUrl(file));
    });

Done.

More detail on GitHub: https://github.com/nelsonic/stream-to-s3

nelsonic
  • Strange, it worked perfectly yesterday, but now I get the following error, probably from knox: `{ [Error: write ECONNRESET] code: 'ECONNRESET', errno: 'ECONNRESET', syscall: 'write' }`. What can be the reason? – Or Smith Aug 11 '14 at 13:16
  • See http://stackoverflow.com/questions/17245881/node-js-econnreset for ECONNRESET. What has changed? A bigger file? Are your Amazon S3 credentials correct? – nelsonic Aug 11 '14 at 14:42
  • No, it's 996MB; my Amazon S3 credentials are correct and haven't changed since last time. – Or Smith Aug 12 '14 at 06:53
  • @nelsonic: It seems that knox doesn't succeed in connecting to S3. Do you know of a known issue about this? – Or Smith Aug 12 '14 at 09:01
  • First try using http://cyberduck.io with your S3 key/secret to upload a small file (to check that you are able to access S3). Then try a simple upload using knox. If you can share the code you've written on GitHub/Gist it might be easier to diagnose the issue. Thanks. – nelsonic Aug 12 '14 at 15:28
  • This happens only with big files. I tried Knox with a small file (150KB) and everything works great. Here's the question I posted: http://stackoverflow.com/questions/25265259/node-js-s3-write-to-s3-with-knox – Or Smith Aug 13 '14 at 06:35