I am trying to build a personal scraper for food recipes. I can extract every element except the ingredients, which are in an unordered list. Here is a snippet of the page HTML: [image: pagehtml]

My code so far prints "Ingredients found" but does not find any `strong` elements:

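    // Runs once for every element with the class "ingredients".
    collectorDish.OnHTML(".ingredients", func(element *colly.HTMLElement) {
        fmt.Println("Ingredients found")
        // Visit each list item inside the ingredients list.
        element.ForEach("li", func(_ int, el *colly.HTMLElement) {
            // Print the combined text of all <strong> children.
            fmt.Println(el.ChildText("strong"))
            // Print the text of each <strong> child individually.
            el.ForEach("strong", func(_ int, elem *colly.HTMLElement) {
                fmt.Println(elem.Text)
            })
        })
    })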

I have tried different ways to get these elements, but no luck so far. I noticed that the HTML differs depending on where I inspect it. Under "Inspect -> Elements" the HTML is as shown in the image, but under "Inspect -> Sources -> pagename" the HTML reads:

    <ul class="ingredients">
        <div class="ellipsis">
            <div></div>
            <div></div>
            <div></div>
            <div></div>

So is the reason I don't receive the ingredients my code, or the way the page is built? I am a complete newbie and don't understand why the HTML looks different in Elements vs. Sources. I am looking for any kind of clue to get this working. Thanks and all the best!

M2R10
  • I think the problem might be due to [static vs dynamic web pages](https://www.google.com/search?client=firefox-b-d&q=static+web+page+vs+dynamic). Modern web pages, or should I say applications, tend to load their content dynamically. Perhaps you should check out [rod](https://github.com/go-rod/rod). FYI, I just searched for `go pupeteer equivalent`. – Nae Nov 29 '21 at 19:51
  • Good callout! After further inspection it indeed seems that these elements are loaded afterwards. Will check out rod. Thanks! – M2R10 Nov 29 '21 at 22:38
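
One way to confirm the diagnosis from the comments is to look at the raw response body, which is what a plain HTTP client receives before any JavaScript has run (the same thing the "Sources" view shows). The following is only a minimal sketch using the standard library: `staticSource`, `fetchRaw`, and `hasListItems` are hypothetical names, the closing tags in `staticSource` are assumed because the snippet in the question is truncated, and a local test server stands in for the real recipe page.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"regexp"
)

// staticSource mimics the "Sources" view from the question. The closing tags
// are an assumption, added only so the snippet is well-formed.
const staticSource = `<ul class="ingredients">
    <div class="ellipsis">
        <div></div><div></div><div></div><div></div>
    </div>
</ul>`

// liTag matches an opening <li> tag but not other tags such as <link>.
var liTag = regexp.MustCompile(`(?i)<li[\s>]`)

// hasListItems reports whether the raw HTML contains at least one <li> tag.
func hasListItems(rawHTML string) bool {
	return liTag.MatchString(rawHTML)
}

// fetchRaw returns the response body exactly as the server sent it.
// No JavaScript is executed, so dynamically added elements are absent.
func fetchRaw(url string) (string, error) {
	resp, err := http.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()

	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// A local server stands in for the recipe site (hypothetical setup).
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, staticSource)
	}))
	defer srv.Close()

	raw, err := fetchRaw(srv.URL)
	if err != nil {
		fmt.Println("fetch failed:", err)
		return
	}
	fmt.Println("raw HTML contains <li>:", hasListItems(raw))
}
```

If the check reports `false` for the real page URL, the `li` and `strong` elements are most likely inserted by JavaScript after load, so the `.ingredients` callback fires but the inner `ForEach("li", ...)` has nothing to iterate over. In that case a browser-driving library such as rod, as suggested above, or the endpoint the page itself requests for the ingredient data, would be the places to look.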

0 Answers