0

I'm using an npm package called url-metadata to scrape metadata from a URL. What I'm trying to do is loop through a list of links from a JSON feed and scrape the metadata of each link.

code

    doc.edited_feed.items.forEach(item => {
      // get the metadata of each item link
      urlMetadata(item.link).then(metadata => {
        console.log("running urlmetadata function");
        // add a new item to the feed with the scraped metadata.image
        feed1.addItem({
          title: item.title,
          link: item.link,
          description: item.contentSnippet,
          content: item.content,
          id: item.link,
          date: new Date(item.isoDate),
          image: metadata.image
        });
      });
    }); // End forEach


    console.log("after foreach block");
    response.type("application/xml");
    response.send(feed1.rss2());

The problem is that the urlMetadata callbacks run after the response has already been sent.

output

    after foreach block
    running urlmetadata function
    running urlmetadata function
    running urlmetadata function

This is the exact opposite of the order I wrote. I guess it has something to do with async functions or promises.

Any help please :/ ?

Az Emna
  • Your guess is (almost) correct. The urlMetadata function is not RUNNING after sending the response; rather, it is FINISHING after sending the response. This is due to the asynchronous nature of the function call. Solution: put the response.send() into the .then() part of the promise – devnull69 Jul 04 '19 at 13:49
  • If I put the response.send() there, then it will only scrape the metadata of the first link and not the other links – Az Emna Jul 04 '19 at 13:51
  • Right, you should follow the link mentioned by @ponury-kostek and change forEach to a for or for-of loop – devnull69 Jul 04 '19 at 14:13

3 Answers

3

Create an async function and await each response in a for loop:

    const funcName = async (items) => {
        for (let i = 0; i < items.length; i++) {
            const item = items[i];
            // execution pauses here until this link's metadata has arrived
            const metadata = await urlMetadata(item.link);
            // do stuff with metadata
        }
    };

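
To see the ordering this gives you, here is a minimal self-contained sketch. It assumes a hypothetical `fakeUrlMetadata` stub in place of the real `urlMetadata` (so it runs without network access), and `collectImages` and `demo` are illustrative names only:

```javascript
// Hypothetical stand-in for urlMetadata: resolves after a short delay.
const fakeUrlMetadata = (link) =>
  new Promise((resolve) =>
    setTimeout(() => resolve({ image: `${link}/image.png` }), 10)
  );

// Await each lookup in turn, collecting the results in input order.
const collectImages = async (items) => {
  const images = [];
  for (const item of items) {
    const metadata = await fakeUrlMetadata(item.link);
    images.push(metadata.image);
  }
  return images;
};

// Code placed after `await collectImages(...)` only runs once every
// lookup has finished, which is where the response would be sent.
const demo = async () => {
  const images = await collectImages([
    { link: "https://example.com/a" },
    { link: "https://example.com/b" },
  ]);
  console.log("all metadata collected:", images.length);
  return images;
};

demo();
```

The key point is that the "send the response" step has to live inside the same async function, after the loop, rather than after a call that is not awaited.
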
Hassan Saleh
1

At a minimum, a Promise is an object with a then method, which accepts a callback function to operate on the returned eventual value. ...

For your situation, I recommend fetching the links one at a time, waiting for each request to finish before starting the next.

The Promise works by something of a race between resolve/reject and then. It tracks its own state of progress in a closure, knowing whether it is pending, resolved, or rejected.
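
A small sketch of that behaviour, using only the built-in `Promise` (the names `settled`, `order`, and `done` are just for illustration): even when a promise is already resolved, a callback passed to `then` is queued and runs only after the current synchronous code has finished. This is why the line after the `forEach` in the question runs first.

```javascript
// A promise that is already resolved by the time handlers are attached.
const settled = Promise.resolve("metadata");

const order = [];

// The callback is queued; it never runs synchronously inside .then().
const done = settled.then((value) => {
  order.push(`then: ${value}`);
  return order;
});

// This line runs before the callback above, despite coming later.
order.push("sync code after .then()");

done.then((result) => console.log(result));
// prints [ 'sync code after .then()', 'then: metadata' ]
```
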

Abdulbosid
0

    (async () => {
      try {
        console.log("Start");

        // a plain for loop lets await pause each iteration, unlike forEach
        for (let index = 0; index < doc.edited_feed.items.length; index++) {
          const item = doc.edited_feed.items[index];
          const metadata = await urlMetadata(item.link);
          feed1.addItem({
            title: item.title,
            link: item.link,
            description: item.contentSnippet,
            content: item.content,
            id: item.link,
            date: new Date(item.isoDate),
            image: metadata.image
          });
          console.log(metadata.image);
        }

        // every item has been added by now, so the feed is complete
        response.type("application/xml");
        response.send(feed1.rss2());

        console.log("End");
      } catch (e) {
        console.error(e);
      }
    })();

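
If the links do not need to be fetched one after another, a possible variation is to start all lookups at once and wait for them together with `Promise.all`. This is only a sketch under assumptions: `stubMetadata` is a hypothetical stand-in for `urlMetadata` and `buildFeedItems` is an illustrative name, neither comes from the code above.

```javascript
// Hypothetical stand-in for urlMetadata so the sketch runs on its own.
const stubMetadata = (link) =>
  new Promise((resolve) =>
    setTimeout(() => resolve({ image: `${link}/cover.png` }), 10)
  );

// Start every lookup at once, then wait for all of them.
// Promise.all keeps the results in the same order as the input array.
const buildFeedItems = async (items) => {
  const metadataList = await Promise.all(
    items.map((item) => stubMetadata(item.link))
  );
  return items.map((item, i) => ({
    title: item.title,
    link: item.link,
    image: metadataList[i].image,
  }));
};

buildFeedItems([
  { title: "First", link: "https://example.com/1" },
  { title: "Second", link: "https://example.com/2" },
]).then((feedItems) => console.log("items ready:", feedItems.length));
```

Note that `Promise.all` rejects as soon as any single lookup fails, so one bad link would fail the whole batch unless each call handles its own error.
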
Az Emna