
Context: I'm working on code that uses a read stream to download a file from an SFTP server and upload it to GCS via a write stream, using Node.js v10.15.3.

Due to a bug in the SFTP library I'm working with, stream.pipe (that is, piping from the read stream the library produces) is broken in Node 10. Because of this, I'm attempting to upload the file with the following code instead (where stream is the read stream and unnecessary details have been left out):

let acl = fileMode;
if (fileMode === 'public') {
    // options for acl are publicRead and private
    // need to add the Read if public
    acl += 'Read';
}
var options = {
    predefinedAcl: acl,
    destination: destPath,
    metadata: {
        contentType: contentType,
        cacheControl: 'no-cache'
    }
};
// Add in a check here for if the bucket exists
let file = new File(bucket, destPath);
let writeStream = file.createWriteStream(options);
writeStream.on('finish', () => {
    file.getMetadata()
        .then((metadata) => {
            console.log('metadata', metadata);
            return resolve(metadata);
        })
        .catch(error => {
            console.error('Error getting file metadata', error);
            return reject(error);
        });
});
stream.on('end', () => {
    try {
        writeStream.end();
    } catch (err) {
        console.error('Error closing writeStream', err);
        return reject(err);
    }
});
writeStream.on('error', error => {
    console.error('Error in writeStream', error);
    return reject(error);
});
stream.on('error', error => {
    console.error('Error in stream', error);
    return reject(error);
});
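// Manually pump data from the read stream into the write stream,
// since stream.pipe() is broken with this SFTP library on Node 10.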
let data = stream.read();
while (data) {
    writeStream.write(data);
    data = stream.read();
}

When I use this while (data) approach to stream from our SFTP server to a local file on the filesystem, it works without incident. However, when I run this code to upload to our GCS bucket, I get the following error:

MaxListenersExceededWarning: Possible EventEmitter memory leak detected. 11 close listeners added. Use emitter.setMaxListeners() to increase limit
Error in writeStream Error: Retry limit exceeded
// stacktrace omitted
Error Uploading to GCS from a stream: Retry limit exceeded
    Error: Retry limit exceeded

It seems like I must be doing something wrong here, but I have no idea why this isn't a valid approach, nor am I sure whether I'm missing some subtlety of streams (which I freely confess are pretty much a black box to me) or running into an issue with GCS.

EDIT: Okay, this actually appears to be completely unrelated to the SFTP issue. I've tried uploading a file from the local filesystem using the recommended method, and I'm seeing the same error. The more 'streamlined' code I'm trying is:

// Add in a check here for if the bucket exists
let file = new File(bucket, destPath);

fs.createReadStream('sample_file.csv')
    .pipe(file.createWriteStream(options))
    .on('error', function(err) {
        console.error('err', err);
        return reject(err);
    })
    .on('finish', function() {
        console.log('resolving');
        return resolve({gcsUrl: url});
    });
Amber B.
  • Where do you run your code? Is it App Engine, Cloud Function, Cloud Run, Compute VM Instance or the server outside of the Google Cloud? – Pawel Czuczwara Apr 23 '19 at 13:23
  • This error (Uploading to GCS from a stream: Retry limit exceeded) is raised by the client library when a 404 error occurs more than 5 times, or when exponential backoff fails more than 5 times for status codes from 499 to 600. – Pawel Czuczwara Apr 23 '19 at 13:29
  • Right now it's running via a local docker setup outside of the Google Cloud, which is where we develop and test our services locally before deploying them. – Amber B. Apr 24 '19 at 14:16
  • Did you properly configure CORS for the bucket? Did you configure your user/service account to have write access to the bucket? And can you verify that uploading with gsutil (Cloud SDK) works from the configuration you're using? – Pawel Czuczwara Apr 24 '19 at 14:21
  • The answer to all three should be yes. Of note is that `bucket.exists` works, etc. – Amber B. Apr 24 '19 at 14:42
  • And what is the error on the Cloud Storage Bucket side? You can see it in Audit Logs: https://cloud.google.com/storage/docs/audit-logs – Pawel Czuczwara Apr 24 '19 at 15:51
  • ... huh. We can't find any logs which indicate an attempt to access the bucket we're using at all. We can see logs from other buckets being used successfully in other methods. I don't understand why or how this is the case. – Amber B. Apr 24 '19 at 16:40
  • Could you try writing out the error from the stream as in https://stackoverflow.com/questions/46368065/google-cloud-storage-file-write-stream-fails to get a more verbose response? – Pawel Czuczwara Apr 25 '19 at 07:16
  • Thanks for your help, Pawel! We actually were advised by one of your colleagues to open a ticket about this, and I know SO discourages extended discussion in the comments, so perhaps we should move this conversation here? https://github.com/googleapis/nodejs-storage/issues/675 – Amber B. Apr 25 '19 at 16:09
  • Google Cloud Support here! To investigate the issue, more information is needed that is private and cannot be posted here. If you have Google Cloud support, please file a support ticket. If you do not have support, please open a [private Google issue](https://issuetracker.google.com/issues/new?component=187164) using your project ID. Then post the link to the issue you created as a comment here. With this link I can take a look into your project. – Pawel Czuczwara Apr 26 '19 at 06:50
  • On GitHub I can see that the errors are generated by the resumable upload. And the [documentation](https://cloud.google.com/nodejs/docs/reference/storage/2.5.x/File#createWriteStream) points out that you need write access to the $HOME directory for temp metadata files: ```Resumable uploads require write access to the $HOME directory. Through config-store, some metadata is stored.``` – Pawel Czuczwara Apr 26 '19 at 07:23

2 Answers


As correctly pointed out by Alex Riquelme, this warning happens when you surpass the default maximum number of listeners for an event in Node.js, which is 10. You can raise this limit, but it's not recommended in this situation: it would be a waste of resources, and the leak would still be there.

The reason multiple listeners are created when uploading files to GCS is that resumable uploads are enabled by default in createWriteStream. In your case, since you are uploading a lot of small files, the recommended approach is to set options.resumable to false. That way you avoid the overhead caused by resumable uploads without having to allow more listeners to be created.
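In your code, that would mean adding resumable: false to the options object passed to createWriteStream. A minimal sketch reusing the variables from the question (file, destPath, contentType, url); only the resumable flag is new:

var options = {
    predefinedAcl: acl,
    destination: destPath,
    resumable: false, // skip the resumable-upload protocol for small files
    metadata: {
        contentType: contentType,
        cacheControl: 'no-cache'
    }
};

fs.createReadStream('sample_file.csv')
    .pipe(file.createWriteStream(options))
    .on('error', function(err) {
        return reject(err);
    })
    .on('finish', function() {
        return resolve({gcsUrl: url});
    });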

dhauptman

This warning is actually expected. When you upload a file to GCS, the client tries to optimize the upload by splitting your file into chunks (normally 1 MB each), so it creates multiple listeners for the upload. By default, the maximum number of listeners in Node.js is 10 (take a look at this documentation). If you want to allow an unlimited number of listeners, call setMaxListeners(0) on the emitter.
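For example (a minimal sketch, assuming writeStream, the stream returned by createWriteStream, is the emitter producing the warning):

// 0 removes the default cap of 10 listeners on this emitter
writeStream.setMaxListeners(0);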

Alex Riquelme
  • Hrm. Two questions... 1. What exactly am I supposed to call setMaxListeners on? I've tried both `stream.setMaxListeners(0);` and `writeStream.setMaxListeners(0);` but am still getting the "Possible EventEmitter memory leak detected" error. 2. This seems like an excessive number of listeners for a 7kB file... is that really correct? – Amber B. Apr 30 '19 at 17:59
  • 1. In [this](https://stackoverflow.com/questions/9768444/possible-eventemitter-memory-leak-detected) SO question you'll find many ways of using `emitter.setMaxListeners()`. However, it seems it's not a recommended approach, as you may end up hiding a possible memory leak. – dhauptman May 03 '19 at 14:00
  • 2. Yes, it seems like a lot of listeners for a 7 kB file; you may consider setting `options.resumable` to `false`, because [createWriteStream](https://cloud.google.com/nodejs/docs/reference/storage/2.5.x/File#createWriteStream) automatically enables resumable uploads, which can cause some performance degradation when uploading small files. – dhauptman May 03 '19 at 14:00
  • @dhauptman Thank you, setting options.resumable to false fixed it! If this answer is edited with that suggestion or if a new answer is posted with it, I'll go ahead and make that the accepted answer. – Amber B. May 06 '19 at 13:25
  • You're welcome :)) I've posted a new answer with the fix. – dhauptman May 07 '19 at 13:46