I'm writing an extension for Firefox to allow me to scrape some data from a web site.

A simplified description is that the site contains an index page with a list of links to subsidiary pages that contain the data I want. I browse to the index page where the extension detects the URL and asks if I want to scrape it. If I say yes, the content script scrapes the list of subsidiary page URLs and passes them to the background process which should open each page in turn in a new tab, scrape the data, then close the tab. To avoid issues at the server I set up a series of timeouts to create the new tabs, currently at 10 second intervals.

Once all the data is scraped it needs to be passed to a PHP script on a server, but that's beyond the scope of this question.

All this works as described, except that I see this message appear on the console for each tab opened and closed (the tab number varies):

15:06:27.426 Uncaught (in promise) Error: Invalid tab ID: 171

I've puzzled over this most of the day but I can't track down the source of this error. It's doubly confusing because the code does exactly what I want.

Question: where is this error coming from, and what do I need to do to fix it?

Here's the code:

async function backgroundProcess() {
    "use strict";
    // delay in ms between each new tab. Set to some reasonable value after testing.
    const   newTabDelay = 5000;


    let tabList = [];
    let resultList = [];


    async function createTab(url, resolve, reject) {

        try {
            //Create a tab and open the link. Return a promise so that we can wait for everything to complete
            let tabId = await browser.tabs.create({url: url});
            console.log("executing script in tab " + tabId.id);
            let tabData = await browser.tabs.executeScript(
                tabId.id,
                {
                    file: 'scrapeMileage.js'
                });

            console.log("Content script executed");
            console.log(tabData);
            //resultList.push(tabData);
            let tabNumber = tabId.id;
            console.log("Removing tab Tab ID " + tabNumber);
            await browser.tabs.remove(tabNumber);
            console.log("Removed tab Tab ID " + tabNumber);

            resolve(tabData);
        } catch (e) {
            console.log("createTab catch "+e);
            reject(e);
        }
    }
    async function tabOpener(linkList) {

        try {
            console.log("Tab opener now running");
            //linkList.forEach((el)=>{console.log("(background) Link found:"+el)});

            // Loop through the list, opening a new tab every few seconds
            for (let i = 0; i < Math.min(5, linkList.length); i++) {
                console.log("Creating tab for " + linkList[i]);
                tabList.push(new Promise((resolve, reject) => {
                    setTimeout(function () {
                        createTab(linkList[i], resolve, reject);
                    }, newTabDelay * i);
                }));
            }
            resultList = await Promise.all(tabList);
            console.log("Scraping complete");
            console.log(resultList);
        } catch (e) {
            console.log("tabOpener catch:"+e);
        }
    }

    function listener(message, sender, respond) {

        console.log("Received message: "+message.messageType);
        console.log(message);

        switch (message.messageType) {
            case 'mileageData':
                break;
            case 'scrapeRequest':
                console.log("Calling tab opener");
                tabOpener(message.data);
                break;

        }

    }
    console.log("Setting up message listener");
    browser.runtime.onMessage.addListener(listener);


}
console.log("Running background process");
backgroundProcess();

The code is liberally sprinkled with console.log() to help debug. Here's the console output.

15:05:57.459 Webconsole context has changed
15:05:57.462 Running background process background.js:81:9
15:05:57.463 Setting up message listener background.js:76:13
15:06:05.357 Received message: scrapeRequest background.js:62:17
15:06:05.357
Object { messageType: "scrapeRequest", data: (17) […] }
background.js:63:17
15:06:05.358 Calling tab opener background.js:69:25
15:06:05.358 Tab opener now running background.js:40:21
15:06:05.358 Creating tab for https://drivers.uber.com/p3/payments/trips/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx background.js:45:25
15:06:05.359 Creating tab for https://drivers.uber.com/p3/payments/trips/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx background.js:45:25
15:06:05.359 Creating tab for https://drivers.uber.com/p3/payments/trips/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx background.js:45:25
15:06:05.359 Creating tab for https://drivers.uber.com/p3/payments/trips/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx background.js:45:25
15:06:05.359 Creating tab for https://drivers.uber.com/p3/payments/trips/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx background.js:45:25
15:06:05.387 executing script in tab 167 background.js:16:21
15:06:06.895 Content script executed background.js:23:21
15:06:06.895
Array [ {…} ]
background.js:24:21
15:06:06.896 Removing tab Tab ID 167 background.js:27:21
15:06:06.905 Removed tab Tab ID 167 background.js:29:21
15:06:06.906 Uncaught (in promise) Error: Invalid tab ID: 167 undefined
15:06:10.371 executing script in tab 168 background.js:16:21
15:06:11.451 Content script executed background.js:23:21
15:06:11.451
Array [ {…} ]
background.js:24:21
15:06:11.452 Removing tab Tab ID 168 background.js:27:21
15:06:11.461 Removed tab Tab ID 168 background.js:29:21
15:06:11.461 Uncaught (in promise) Error: Invalid tab ID: 168 undefined
15:06:15.372 executing script in tab 169 background.js:16:21
15:06:16.751 Content script executed background.js:23:21
15:06:16.751
Array [ {…} ]
background.js:24:21
15:06:16.752 Removing tab Tab ID 169 background.js:27:21
15:06:16.762 Removed tab Tab ID 169 background.js:29:21
15:06:16.765 Uncaught (in promise) Error: Invalid tab ID: 169 undefined
15:06:20.385 executing script in tab 170 background.js:16:21
15:06:21.481 Content script executed background.js:23:21
15:06:21.481
Array [ {…} ]
background.js:24:21
15:06:21.482 Removing tab Tab ID 170 background.js:27:21
15:06:21.489 Removed tab Tab ID 170 background.js:29:21
15:06:21.490 Uncaught (in promise) Error: Invalid tab ID: 170 undefined
15:06:25.382 executing script in tab 171 background.js:16:21
15:06:27.414 Content script executed background.js:23:21
15:06:27.414
Array [ {…} ]
background.js:24:21
15:06:27.414 Removing tab Tab ID 171 background.js:27:21
15:06:27.423 Removed tab Tab ID 171 background.js:29:21
15:06:27.423 Scraping complete background.js:53:21
15:06:27.423
Array(5) [ (1) […], (1) […], (1) […], (1) […], (1) […] ]
background.js:54:21
15:06:27.426 Uncaught (in promise) Error: Invalid tab ID: 171 undefined

A couple of notes:

  • The scrapeMileage.js script currently does nothing except return a fixed value.
  • The data is data relating to me, retrieved for my purposes only. Once this is working I'd expect to scrape each page just once.
  • I've obfuscated the actual URLs involved, for privacy reasons.
  • Have to suspect calling `createTab()` in a `setTimeout`, in a `new Promise()`. You can unstitch that pattern by awaiting a promisified `setTimeout()` followed by `await createTab(linkList[i])`. Rewrite `createTab()` not to accept/use `resolve` and `reject`. Timing will be slightly different but safer. – Roamer-1888 Nov 17 '20 at 05:46
  • I'll see if I can get my head around your suggestion. The reason for encapsulating the timeout and create tab together was so that I can use a `Promise.all()` to wait for the whole scraping operation to complete before handing off the results to my server. –  Nov 17 '20 at 09:18
  • If everything asynchronous within the `for` loop is awaited, and with your results accumulated in the `resultsList` array, then `Promise.all()` isn't necessary. `tabList` becomes redundant. – Roamer-1888 Nov 17 '20 at 13:10
  • [How to make a promise from setTimeout](https://stackoverflow.com/a/22707551/3478010). – Roamer-1888 Nov 17 '20 at 13:13
  • Well, I refactored the code in line with your suggestions. It's simpler, so that's a good thing, but it's made no difference to the 'Uncaught error' message. The code is doing what I expect, and doesn't seem to have any adverse side effects, so I'm moving on to the rest of the project. Thanks for your input :) –  Nov 17 '20 at 22:42
  • Odd one. Sorry I couldn't be more helpful. – Roamer-1888 Nov 17 '20 at 22:44
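
For readers following the comment thread, the refactor suggested above might look roughly like the sketch below. It is not the code the asker ended up with (that was never posted); it assumes the same `browser.tabs` calls and `scrapeMileage.js` file as the question, and the `delay()` helper and the `newTabDelay` parameter default are illustrative choices. Note that the delay now runs between tabs rather than from a common start time, so the overall timing differs slightly from the original.

```javascript
// Promisified setTimeout: resolves after `ms` milliseconds.
function delay(ms) {
    return new Promise((resolve) => setTimeout(resolve, ms));
}

// Open a tab, run the content script in it, close the tab, and return
// whatever executeScript produced. No resolve/reject parameters are needed:
// failures simply reject the promise returned by this async function.
async function createTab(url) {
    const tab = await browser.tabs.create({ url: url });
    try {
        return await browser.tabs.executeScript(tab.id, { file: "scrapeMileage.js" });
    } finally {
        // Unlike the code in the question, this also closes the tab
        // when executeScript fails (an assumption about the desired behaviour).
        await browser.tabs.remove(tab.id);
    }
}

// Visit each link in turn, pausing `newTabDelay` ms between tabs.
// Every step is awaited, so Promise.all() and tabList are no longer needed.
async function tabOpener(linkList, newTabDelay = 5000) {
    const resultList = [];
    for (let i = 0; i < linkList.length; i++) {
        if (i > 0) {
            await delay(newTabDelay);
        }
        resultList.push(await createTab(linkList[i]));
    }
    return resultList;
}
```

As the asker reports, a refactor along these lines simplifies the code but did not make the "Invalid tab ID" message go away.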

1 Answer

Judging by the full log, the error is definitely generated by browser.tabs.remove. And indeed, running it manually in devtools console with a non-existent tab id will produce the same error message. This can happen if something else already closed the tab, for example.

This error is trivially intercepted with a standard try/catch in async code, just as you already do. If your code doesn't intercept this error, the only explanations I see are a bug in Firefox or an incorrect polyfill for `browser`. You don't need this polyfill in Firefox.
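
As a minimal sketch of that interception, assuming the rejection really does come from your own `browser.tabs.remove()` call: the helper name `removeTabQuietly` is hypothetical, and the message text it matches is taken from the log in the question. It can only catch rejections from the call it wraps, so it would not necessarily silence a message raised elsewhere inside the browser.

```javascript
// Hypothetical helper: close a tab, treating "tab already gone" as a
// non-fatal condition. Returns true if the tab was closed by this call,
// false if the browser reported the ID as invalid.
async function removeTabQuietly(tabId) {
    try {
        await browser.tabs.remove(tabId);
        return true;
    } catch (e) {
        // Message format assumed from the question's log: "Invalid tab ID: <n>"
        if (/Invalid tab ID/.test(String(e && e.message))) {
            console.warn("Tab " + tabId + " was already closed");
            return false;
        }
        // Anything else is unexpected, so let the caller see it.
        throw e;
    }
}
```

Calling `await removeTabQuietly(tab.id)` in place of the bare `browser.tabs.remove()` keeps an already-closed tab from rejecting the surrounding promise chain.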

wOxxOm
  • I'm not using a polyfill, and I've scoured the code for something that might close the tab early but not found anything unless something in the async code is grabbing the wrong ID somewhere. A bug in Firefox is a possibility, but I want to be sure I've exhausted all the JavaScript options before I go down that route. –  Nov 17 '20 at 09:20
  • If there is no other call to browser.tabs.remove **and** your posted code doesn't catch the error, there's only one explanation: a bug in Firefox. – wOxxOm Nov 17 '20 at 11:49
  • I refactored the code in line with a suggestion in the comments - tabs still open and close, Uncaught Error still appears. I can't find a route through the code that would call `browser.tabs.remove()` out of sequence or with the wrong tab ID, so I'll go with a Firefox bug and treat the message as spurious. If I get time I'll work up an example and post a bug report. Thanks for your input. –  Nov 17 '20 at 22:46
  • This was, indeed, a bug in Firefox. I've just upgraded to Firefox 84.0b2 and the message has disappeared. –  Nov 19 '20 at 22:10