I am having trouble figuring out how to fix the snippet below. Right now it returns the values before the 'request(scanurl, ...' section of the for...of loop has run. The loop itself does run. I also do not think the 'lines200' variable is being updated by the counter. I am a learner, so any explanation would be greatly appreciated.

async function processLineByLine(baseData) {
    console.log('enter async func')
    try {
        const fileStream = fs.createReadStream('./file.txt');
        let linesTotal = 0;
        let lines200 = 0;

        const rl = readline.createInterface({
            input: fileStream
        });
        for await (let line of rl) {
            console.log('in loop')
            const scanurl = (baseData.match(/http/gi)) ? (baseData) : ('https://' + baseData + line);
            linesTotal++;

            request(scanurl, {json: true}, function (error, response, body) {
                let statusCode = response.statusCode;
                let htmlBody = body;
                //console.log(htmlBody)
                //console.log(statusCode)
                if (statusCode == "200") {
                    console.log('in 2nd if')
                    let $ = cheerio.load(htmlBody);
                    let titleScrape = $('title').html();
                    console.log(titleScrape)
                    if (titleScrape.match(/404 | test/gi)) {
                        console.log('Matched')
                    } else {
                        lines200++;
                        console.log(lines200)
                    }
                } else {
                    // Do nothing
                }
            });

        }
        return {
            total: linesTotal,
            count200: lines200,
        };
    } catch (error) {
        console.error(error)
    }

}

router.get('/:reqTarget', async (req, res) => {
    console.log('starting')
    var baseUrl = req.params.reqTarget;
    try {
        console.log('in the try')
        const initTest = await processLineByLine(baseUrl);
        const {total, count200} = initTest;
        console.log(total, count200)
        if (initTest) return res.status(200).send({
            message: 'STATUS 200 COUNT: ' + count200 + ' ' + 'TOTAL: ' + total });

    } catch (error) {
        console.log(error)
    }

});

Current Output:

starting
in the try
enter async func
in loop
in loop
in loop
in loop
in loop
in loop
in loop
33 0 //this is the return that is too early
in 2nd if
404 | test
Matched
in 2nd if
404 | test
Matched
mister.cake
  • `request(scanurl, { json: true }, function (error, response, body) {` is asynchronous ... you're not waiting for it to start let alone finish ... `await` doesn't magically wait for asynchronous code to complete, it awaits a promise - and you have no promise being returned by request – Jaromanda X Dec 06 '19 at 04:04
  • There's a version of this asked about 10 times a day now. Perhaps this is a dup of [How do I return the response from an asynchronous call?](https://stackoverflow.com/questions/14220321/how-do-i-return-the-response-from-an-asynchronous-call/14220323#14220323), but it's even crazier now than it used to be because apparently people now think `async` or `await` have magic powers to know when callback-based asynchronous operations are done (hint: they don't - `async` functions only help when using `await` on a promise that is connected to completion of your asynchronous operation). – jfriend00 Dec 06 '19 at 04:09
  • Instead of await, you can use the setTimeout function – LDS Dec 06 '19 at 05:26
  • @LDS that's terrible advice – Jaromanda X Dec 06 '19 at 12:36

1 Answer

When you have a loop containing asynchronous operations, you have two options. You can run them all in parallel and track when they are all done (for example with Promise.all()). Or, you can run them sequentially, one after the other. It appears your loop could be constructed either way, but I'll illustrate the sequential option.
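For comparison, here is a minimal sketch of the parallel option, which the code further down does not use. `fakeRequest` and `countParallel` are hypothetical names invented for this illustration, and `fakeRequest` is a timer-based stand-in for a real promise-returning HTTP call so the sketch runs without network access.

```javascript
// Hypothetical stand-in for a promise-returning HTTP request. It resolves
// after a short delay and reports 404 for any URL containing "missing".
function fakeRequest(url) {
    return new Promise((resolve) => {
        const statusCode = url.includes('missing') ? 404 : 200;
        setTimeout(() => resolve({ url, statusCode }), 10);
    });
}

// Parallel option: start every request first, then wait for all of them.
async function countParallel(urls) {
    const responses = await Promise.all(urls.map((url) => fakeRequest(url)));
    return {
        total: urls.length,
        count200: responses.filter((r) => r.statusCode === 200).length,
    };
}

countParallel(['a.example/ok', 'a.example/missing', 'a.example/home'])
    .then((result) => console.log(result)); // { total: 3, count200: 2 }
```

Note that Promise.all() rejects as soon as any one promise rejects, so a real version would probably catch errors per request if one failing URL should not abort the whole batch.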

The advent of async/await allows us to "pause" a for loop in the middle with an appropriate await. But, in order to do that, every asynchronous operation has to be promise-based so you can await its promise. To that end, I've switched from the request() library to the request-promise-native library, which is a promise wrapper around the request library that uses native, built-in promises. It also has another nice feature: by default it rejects the promise on any non-2xx status code, so you don't have to check the status yourself.
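To make the "promise-based" requirement concrete, here is a small sketch of wrapping a callback-style function in a promise so that await can actually pause the loop. `legacyFetch`, `fetchAsync`, and `fetchSequentially` are hypothetical names made up for this illustration; `legacyFetch` only imitates a Node-style callback API with a timer and is not the request library.

```javascript
const { promisify } = require('util');

// Hypothetical callback-style API (error-first callback), simulated with a
// timer. URLs starting with "bad" fail; everything else returns fake HTML.
function legacyFetch(url, callback) {
    setTimeout(() => {
        if (url.startsWith('bad')) {
            callback(new Error(`cannot fetch ${url}`));
        } else {
            callback(null, `<title>${url}</title>`);
        }
    }, 10);
}

// util.promisify turns an error-first callback function into one that
// returns a promise, which is what await needs.
const fetchAsync = promisify(legacyFetch);

// Sequential option: each iteration waits for its own request to settle
// before the next one starts.
async function fetchSequentially(urls) {
    const log = [];
    for (const url of urls) {
        try {
            const body = await fetchAsync(url);
            log.push(`ok: ${body}`);
        } catch (e) {
            log.push(`failed: ${e.message}`);
        }
    }
    return log;
}

fetchSequentially(['one', 'bad-two', 'three']).then((log) => console.log(log));
```

Because the function only returns after the loop has finished, the caller sees the final results rather than the values as they were before any request completed.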

Here's what that code would look like:

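// Modules used below (the question's code assumes these are already loaded)
const fs = require('fs');
const readline = require('readline');
const cheerio = require('cheerio');
const rp = require('request-promise-native');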

async function processLineByLine(baseData) {
    console.log('enter async func')
    try {
        const fileStream = fs.createReadStream('./file.txt');
        let linesTotal = 0;
        let lines200 = 0;

        const rl = readline.createInterface({
            input: fileStream
        });
        for await (let line of rl) {
            console.log('in loop')
            const scanurl = (baseData.match(/http/gi)) ? (baseData) : ('https://' + baseData + line);
            linesTotal++;

            try {
                let htmlBody = await rp(scanurl, {json: true});
                let $ = cheerio.load(htmlBody);
                let titleScrape = $('title').html();
                console.log(titleScrape);
                if (titleScrape.match(/404 | test/gi)) {
                    console.log('Matched')
                } else {
                    lines200++;
                    console.log(lines200)
                }
            } catch(e) {
                console.log(`error on request(${scanurl})`, e);
                // like your original code, this will only log the error
                //   and then continue with the rest of the URLs
            }
        }
        return {
            total: linesTotal,
            count200: lines200,
        };
    } catch (error) {
        console.error(error)
    }

}

router.get('/:reqTarget', async (req, res) => {
    console.log('starting')
    var baseUrl = req.params.reqTarget;
    try {
        console.log('in the try')
        const initTest = await processLineByLine(baseUrl);
        const {total, count200} = initTest;
        console.log(total, count200)
        res.status(200).send({message: 'STATUS 200 COUNT: ' + count200 + ' ' + 'TOTAL: ' + total});

    } catch (error) {
        console.log(error)
        res.sendStatus(500);
    }

});
jfriend00