I am working on a project to extract the transcript from an audio file. The audio files are in FLAC format. I am using AWS Lambda and have written the code in Node. I am also using the IBM Speech to Text service with the basic example code they provide, which can be found here. The problem is that my Lambda function finishes before these functions run.

I am downloading a file from S3 and storing it locally (which works fine). After that, I try to pass the same file to the IBM Speech to Text SDK, which should write the transcript of the audio file to local storage.

Here is the code:

const downloadFile = function (url1, dest, cb) {
    const file = fs.createWriteStream(dest);
    https.get(url1, function (res) {
        //res.setEncoding('binary');
        res.pipe(file);
        file.on('finish', function () {
            const stats = fs.statSync(dest);
            const fileSizeInBytes = stats.size;
            // Convert the file size to megabytes (optional)
            const fileSizeInMegabytes = fileSizeInBytes / 1000000.0;
            console.log(fileSizeInMegabytes);
            file.close();
            RunIBMWatson(dest);
            callback(null,"Nice");
        });
    });
};
function RunIBMWatson(dest){
    console.log(dest);
    console.log("I am here");

    const recognizeStream = speech_to_text.createRecognizeStream(params);
    fs.createReadStream(dest).pipe(recognizeStream);
    recognizeStream.pipe(fs.createWriteStream('/tmp/transcription.txt'));
    recognizeStream.setEncoding('utf8');
    recognizeStream.on('results', function(event) { onEvent('Results:', event); });
    recognizeStream.on('data', function(event) { onEvent('Data:', event); });
    recognizeStream.on('error', function(event) { onEvent('Error:', event); });
    recognizeStream.on('close', function(event) { onEvent('Close:', event); });
    recognizeStream.on('speaker_labels', function(event) { onEvent('Speaker_Labels:', event); });

    function onEvent(name, event) {
      console.log("I am in onEvent");
      if (name === 'data'){
        console.log(event);
      }
    }
}

Here are the function logs that I get from AWS Lambda:

2018-03-05 03:31:53.585 54.093469
2018-03-05 03:31:53.588 /tmp/sample.flac
2018-03-05 03:31:53.588 I am here

I am new to both AWS Lambda and Node, so I would appreciate it if anyone could point out the mistake I am making.

Utsav Kapoor

2 Answers

Yes, RunIBMWatson is asynchronous because of the file I/O. Because you are not waiting for it to finish, the callback runs immediately after RunIBMWatson is called, which ends the execution of your Lambda.
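
To make the ordering concrete, here is a minimal sketch that does not use Lambda or Watson at all. The names startAsyncWork and handlerLike are hypothetical stand-ins for RunIBMWatson and the code around it, and the timer only mimics pending I/O.

```javascript
// Records the order in which things happen.
const events = [];

// Hypothetical stand-in for RunIBMWatson: it starts work that completes later.
function startAsyncWork(onDone) {
  setTimeout(() => {
    events.push('work finished');
    onDone();
  }, 20);
}

// Hypothetical stand-in for the code inside the 'finish' handler.
function handlerLike(callback, onDone) {
  startAsyncWork(onDone);          // returns immediately; the work is still pending
  events.push('callback called');
  callback(null, 'Nice');          // runs before the work has finished
}
```

Calling handlerLike records 'callback called' before 'work finished', which is the same ordering the question's code produces.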

Wrap the logic of RunIBMWatson in a Promise, and resolve it once all the data has been obtained and written to the transcript file. MDN: Promises

const downloadFile = function (url1, dest, cb) {

  ...
  console.log(fileSizeInMegabytes);
  file.close();
  return RunIBMWatson(dest)
    .then(() => { // any value passed to resolve() is available here as an argument
      callback(null, "Nice");
    })
    .catch((err) => callback(err)); // report failures instead of leaving the invocation hanging
}

function RunIBMWatson(dest){

  return new Promise((resolve, reject) => {

    const rStream = speech_to_text.createRecognizeStream(params);
    fs.createReadStream(dest).pipe(rStream);

    const out = fs.createWriteStream('/tmp/transcription.txt');
    rStream.pipe(out);
    rStream.setEncoding('utf8');
    rStream.on('results', function(event) { onEvent('Results:', event); });
    rStream.on('data', function(event) { onEvent('Data:', event); });
    rStream.on('close', function(event) { onEvent('Close:', event); });
    rStream.on('speaker_labels', function(event) { onEvent('Speaker_Labels:', event); });
    // Reject on a recognition or write error so the caller sees the failure.
    rStream.on('error', reject);
    out.on('error', reject);
    // Resolve only once the transcript file has been fully written.
    out.on('finish', resolve); // you can pass data to resolve() here too

    function onEvent(name, event) {
      console.log(name, event);
    }
  })
}
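
In case it is useful, here is a self-contained sketch of the same pattern that runs without Watson credentials. It assumes a hypothetical createFakeRecognizeStream, which is just a PassThrough stream standing in for speech_to_text.createRecognizeStream(params), and a hypothetical helper name transcribeToFile. The real recognize stream may behave differently, so treat this as an illustration of waiting for the write stream, not as the SDK's documented usage.

```javascript
const fs = require('fs');
const { PassThrough } = require('stream');

// Hypothetical stand-in for speech_to_text.createRecognizeStream(params):
// a PassThrough simply forwards its input unchanged.
function createFakeRecognizeStream() {
  return new PassThrough();
}

// Pipe source -> recognizer -> output file and settle once the file is flushed.
function transcribeToFile(srcPath, destPath) {
  return new Promise((resolve, reject) => {
    const source = fs.createReadStream(srcPath);
    const recognizer = createFakeRecognizeStream();
    const output = fs.createWriteStream(destPath);

    // pipe() does not forward errors, so each stream needs its own handler.
    source.on('error', reject);
    recognizer.on('error', reject);
    output.on('error', reject);

    // 'finish' fires after all data has been flushed to the file.
    output.on('finish', () => resolve(destPath));

    source.pipe(recognizer).pipe(output);
  });
}
```

With something like this, the Lambda callback would only be invoked from the .then() of transcribeToFile('/tmp/sample.flac', '/tmp/transcription.txt'), so it cannot run before the transcript file exists.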

Hope this helps

bneigher

The problem is that the JavaScript event loop is empty, so Lambda thinks the function is done. You can install the asyncawait npm module, which will help with that. More info here.

asyncawait is not included by default but that does not mean you cannot add it yourself. Simply add the package locally (npm install asyncawait), and include the node_modules folder in your ZIP before uploading your Lambda function.

If you want to handle dev dependencies separately (e.g. test tools, or aws-sdk to execute your function locally), you can add them under devDependencies in your package.json.
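
As an alternative to the asyncawait package, Lambda's Node.js 8.10 and later runtimes understand native async handlers, so the following sketch uses the built-in async/await syntax instead. The function doSlowWork is a hypothetical placeholder for the download and transcription steps, and the event.key field is an assumption made only for this example.

```javascript
// Hypothetical placeholder for the real work (download + transcription).
// It resolves after a short delay to mimic pending I/O.
function doSlowWork(input) {
  return new Promise((resolve) => {
    setTimeout(() => resolve(`transcribed:${input}`), 50);
  });
}

// An async handler returns a promise; the invocation result is whatever
// that promise settles with, so the work is awaited before returning.
const handler = async (event) => {
  const result = await doSlowWork(event.key);
  return { status: 'Nice', result };
};

exports.handler = handler;
```

Because the handler awaits the work before returning, there is no separate callback that could fire too early.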

jdmdevdotnet