I am scraping Grubhub with Selenium and I am not able to get the full menu.

https://www.grubhub.com/restaurant/buca-di-beppo-1875-s-bascom-ave-campbell/335944

For example, on the page above it only scrapes the appetizers. Scrolling is required to load the rest, but the site's CAPTCHA detects that the browser is automated (with Selenium), and after that I cannot scrape anything.

Here is what I have:

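import time

from bs4 import BeautifulSoup

# `driver` is an existing Selenium WebDriver; `link` is the restaurant URL
driver.get(link)
time.sleep(2)  # fixed pause to let the page render
page = driver.page_source
soup = BeautifulSoup(page, 'html.parser')

# Dish names and descriptions, matched by CSS class
dishes = soup.find_all('div', class_='menuItemNew-name')
descs = soup.find_all('div', class_='padding-y-2')
dishes_ = []
descs_ = []
# Collect every text node inside each matched element
for items in dishes:
    dishes_ += items.find_all(string=True)
for items in descs:
    descs_ += items.find_all(string=True)

print(dishes_)
print(descs_)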

descs are the descriptions of each dish, which I also want to scrape.

How do I get the full menu (and, if possible, the Google Maps link at the very bottom of the page)?

1 Answer

To scrape the full menu and the Google Maps link at the very bottom of the page, you need to induce WebDriverWait for visibility_of_element_located() on that link, and you can use the following locator strategy:

  • Code Block:

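    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options
    from selenium.webdriver.chrome.service import Service
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    options = Options()
    options.add_argument("start-maximized")
    # Both switches go in one list: a later call with the same option name replaces the earlier value
    options.add_experimental_option("excludeSwitches", ["enable-automation", "enable-logging"])
    options.add_experimental_option('useAutomationExtension', False)
    options.add_argument('--disable-blink-features=AutomationControlled')
    s = Service('C:\\BrowserDrivers\\chromedriver.exe')
    driver = webdriver.Chrome(service=s, options=options)
    driver.get('https://www.grubhub.com/restaurant/buca-di-beppo-1875-s-bascom-ave-campbell/335944')
    # Wait up to 20 seconds for the Google Maps link at the bottom of the page to become visible
    WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "a[data-testid='restaurant-about-google-map-link']")))
    print(driver.page_source)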
    
  • Console Output:

    <html lang="en" class=" async-hide"><head><script type="text/javascript" async="" charset="utf-8" id="utag_367" src="//d.impactradius-event.com/A1231534-f0ec-4c6c-b14f-75a55231a9591.js"></script><script src="https://ext.chtbl.com/trackable.js"></script><script type="text/javascript" async="" charset="utf-8" src="https://www.googletagmanager.com/gtag/js?id=G-7YX8989VK2" id="utag_628"></script><script type="text/javascript" async="" src="https://www.googletagmanager.com/gtag/destination?id=G-7YX8989VK2&amp;l=dataLayer&amp;cx=c"></script><script type="text/javascript" async="" src="https://www.googletagmanager.com/gtag/js?id=G-7YX8989VK2&amp;l=dataLayer&amp;cx=c"></script><script type="text/javascript" async="" src="https://www.googletagmanager.com/gtag/destination?id=DC-11687855&amp;l=dataLayer&amp;cx=c"></script><script type="text/javascript" async="" src="https://www.googletagmanager.com/gtag/js?id=DC-11687855&amp;l=dataLayer&amp;cx=c"></script><script type="text/javascript" async="" charset="utf-8" id="utag_577" src="//js.adsrvr.org/up_loader.1.1.0.js"></script><script type="text/javascript" async="" charset="utf-8" src="//analytics.tiktok.com/i18n/pixel/events.js?sdkid=undefinedttq" id="utag_568"></script><script type="text/javascript" async="" charset="utf-8" id="utag_550" src="//mi.grubhub.com/p/js/1.js"></script><script src="https://www.redditstatic.com/ads/pixel.js" async=""></script><script type="text/javascript" async="" charset="utf-8" src="https://pixel.mathtag.com/event/js?version=1.1&amp;delimiter=%2C&amp;industry=Internet%20Services&amp;event_type=catchall&amp;mt_id=1427886&amp;mt_pp=1&amp;mt_adid=227305" id="utag_430"></script><script async="" src="//px.airpr.com/airpr.js"></script><script type="text/javascript" async="" src="https://www.googletagmanager.com/gtag/js?id=AW-987205382&amp;l=dataLayer&amp;cx=c"></script><script async="" src="https://sc-static.net/scevent.min.js"></script><script type="text/javascript" async="" charset="utf-8" 
id="utag_566" src="https://connect.facebook.net/en_US/fbevents.js"></script><script type="text/javascript" defer="" async="" src="https://collector-21091.us.tvsquared.com/tv2track.js"></script><script type="text/javascript" async="" charset="utf-8" src="//bat.bing.com/bat.js" id="utag_171"></script><script type="text/javascript" async="" charset="utf-8" src="https://www.google-analytics.com/analytics.js" id="tealium-tag-7110"></script><script type="text/javascript" async="" src="https://www.google-analytics.com/plugins/ua/linkid.js"></script><script type="text/javascript" src="https://bam-cell.nr-data.net/1/5923691cbd?a=11156950&amp;sa=1&amp;v=1216.487a282&amp;t=Unnamed%20Transaction&amp;ct=https://www.grubhub.com/restaurant&amp;rst=2434&amp;ck=1&amp;ref=https://www.grubhub.com/restaurant/buca-di-beppo-1875-s-bascom-ave-campbell/335944&amp;be=541&amp;fe=2213&amp;dc=986&amp;af=err,xhr,stn,ins,spa&amp;perf=%7B%22timing%22:%7B%22of%22:1661037628304,%22n%22:0,%22f%22:1,%22dn%22:2,%22dne%22:71,%22c%22:71,%22s%22:100,%22ce%22:166,%22rq%22:166,%22rp%22:479,%22rpe%22:560,%22dl%22:485,%22di%22:987,%22ds%22:987,%22de%22:987,%22dc%22:2213,%22l%22:2213,%22le%22:2218%7D,%22navigation%22:%7B%7D%7D&amp;fp=826&amp;fcp=1572&amp;ja=%7B%22diner_type%22:%22diner_unknown%22,%22umami_app_version%22:%224.2.3852%22,%22ab_testing_status%22:%22optimize%20enabled%22,%22clickstream_browser_id%22:%22dec60c6c-11f2-4a3f-9f08-18cc784d5682%22,%22ad_block_enabled%22:true,%22is_spider_bot%22:false,%22clickstream_session_id%22:%22ae778399-20de-11ed-a9d5-23c0dcc7cb7b%22,%22first-paint%22:826.5,%22first-contentful-paint%22:1572.7999999523163,%22fetchStart%22:1,%22domainLookupStart%22:2,%22domainLookupEnd%22:71,%22connectStart%22:71,%22connectEnd%22:166,%22secureConnectionStart%22:100,%22requestStart%22:166,%22responseStart%22:479,%22responseEnd%22:560,%22domLoading%22:485,%22domInteractive%22:987,%22domContentLoadedEventStart%22:987,%22domContentLoadedEventEnd%22:987,%22domComplete%22:2213,%22loadEventS
tart%22:2213%7D&amp;jsonp=NREUM.setToken"></script><script src="https://js-agent.newrelic.com/nr-spa-1216.min.js"></script><script type="text/javascript" async="" src="https://www.google-analytics.com/gtm/js?id=GTM-58CKX3J&amp;t=teal_grubhublabs_UniversalproductionStandard&amp;cid=1361115206.1661037630"></script><script src="https://cdn.ravenjs.com/3.26.4/raven.min.js"></script><script src="https://assets.grubhub.com/assets/dll/load-uuid-740f2944b2a1abda6733.js"></script>
    
        <link rel="manifest" href="https://assets.grubhub.com/manifest.json">
    
    
        <link rel="search" type="application/opensearchdescription+xml" title="Find food" href="/opensearch.xml">
    
    
        <meta http-equiv="X-UA-Compatible" content="IE=edge">
        <meta charset="utf-8">
    .
    <div class="menuItemNew-price u-rounded--large s-textBox"><cb-icon class="menuItem-loading"><svg class="cb-icon cb-icon-svg cb-icon--sm" aria-hidden="true"><use xlink:href="#clock-back"></use></svg></cb-icon><span class="menuItem-priceAmount h6 s-textBox-title u-margin-bottom-cancel"><span class="" data-testid="menu-item-price" itemprop="price">$39.60</span><span data-testid="menu-item-price-plus" class="menuItem-pricePlus">+</span></span></div></div></button></article></div></div></div></div></div></div></span></div></div></span></div></div></div></div></div></div></main><a name="reviews"></a><div><div data-testid="restaurant-about-reviews-sections" class="s-container-lg u-block u-inset-3"><div class="s-row"><div class="s-col-xs-12"><div id="navSection-about" class="navSection" tabindex="0"><span data-testid="restaurant-about" id="ghs-restaurant-about"><div class="restaurantAbout"><h2 data-testid="restaurantAbout-header">Buca di Beppo Menu Info</h2><div class="restaurantAbout-details"><div data-testid="restaurantAbout-cuisines" class="s-col-xs-12"><a data-testid="restaurantAbout-cuisines--Dinner" class="restaurantAbout-details-cuisines-link u-padding-cancel s-link" href="/delivery/ca-campbell/dinner">Dinner,♂</a><a data-testid="restaurantAbout-cuisines--Lunch Specials" class="restaurantAbout-details-cuisines-link u-padding-cancel s-link" href="/delivery/ca-campbell/lunch_specials">Lunch Specials,♂</a><a data-testid="restaurantAbout-cuisines--Pasta" class="restaurantAbout-details-cuisines-link u-padding-cancel s-link" href="/delivery/ca-campbell/pasta">Pasta,♂</a><a data-testid="restaurantAbout-cuisines--Pizza" class="restaurantAbout-details-cuisines-link u-padding-cancel s-link" href="/delivery/ca-campbell/pizza">Pizza</a></div><span class=""><div data-testid="restaurant-price-rating" class="price-scale priceRating" title="$$$"><div data-testid="restaurant-price-rating-base" class="priceRating-base">$$$$$</div><div data-testid="restaurant-price-rating-value" 
class="priceRating-value" itemprop="priceRange">$$$</div></div></span></div><div class="restaurantAbout-info u-stack-y-4"><div class="restaurantAbout-info-contact"><a data-testid="restaurant-about-google-map-link" href="https://maps.google.com?daddr=1875%20S%20Bascom%20Ave%20Campbell%20CA%2095008" target="_blank" rel="noopener"><span data-testid="static-map" class="restaurantAbout-info-map"></span></a><a target="_blank" rel="noopener" data-testid="restaurant-about-address" href="http://maps.google.com/maps?daddr=1875 S Bascom Ave, Campbell, CA, 95008" class="restaurantAbout-info-address u-line-bottom u-line--thin u-line--light"><div>1875 S Bascom Ave</div>Campbell, CA 95008</a><div class="u-line-bottom u-line--thin u-line--light restaurantAbout-info-phone"><button data-testid="restaurant-phone-button" itemprop="telephone" content="4083777722" class="s-btn s-btn-tertiary u-padding-cancel restaurant-phone-button type"><span class="">(408) 377-7722</span></button></div><a href="/food/buca_di_beppo" data-testid="restaurantAbout-chainUrl"><div class="restaurantAbout-info-bottom restaurantAbout-info-chainLink u-line-bottom u-line--thin u-line--light"><span>View more about </span>Buca di Beppo</div></a></div><div class="restaurant-hours" data-testid="restaurant-hours"><h5 class="u-background--tinted u-inset-squished-4 u-text-secondary">Hours</h5><div class="u-inset-4 body u-flex u-flex-direction-row u-flex-justify-xs--between copy u-line-bottom u-line--thin u-line--light"><span data-testid="days0">Today</span><div class="u-text-right u-flex u-flex-direction-column"><div class="u-flexbox-order-2 u-text-secondary" data-testid="pickupHours00">Pickup: 10:30am–9:30pm</div><div class="u-flexbox-order-1" data-testid="deliveryHours00">Delivery: 10:30am–9:30pm</div></div></div><button data-testid="show-full-schedule-link" class="s-btn s-btn-tertiary u-inset-squished-4">See the full schedule</button></div></div></div></span><span data-testid="ghs-impression-tracker" style="width: 
100%;"><div data-testid="taking-orders-carousel"><span data-testid="restaurant-section-data" type="sponsored" class="restaurant-section-data restaurant-sponsored"><div data-testid="in-view" class=""><span class="r2p"><ghs-restaurant-carousel><div class=" carousel-container s-container"><span class="p2r"><div data-testid="carousel" class="ghsCarousel"><div class="ghsCarousel"><span data-testid="carousel-scroll-wrapper" class="ghsCarousel-content ghsCarousel-content-scroll ghsCarousel-slides promo-carousel"></span></div></div></span></div></ghs-restaurant-carousel></span></div></span></div></span></div><span id="navSection-reviews" class="navSection" data-testid="ghs-impression-tracker"><div id="ghs-restaurant-reviews" class="u-block" data-testid="restaurant-reviews"><div data-testid="in-view" class=""><div class="u-background restaurantReviews clearfix"><div class="u-section-6" data-testid="restaurantReviews-container" id="restaurantPage-reviewHighlights"><div class="clearfix u-unclickable restaurantReviews-heading"><div class="s-row restaurantReviews-heading-content"><div data-testid="facet-header" class="s-col-md-8 s-form-group"><h2> Reviews for Buca di Beppo</h2><div class="u-stack-y-4"><span data-testid="star-rating-id"><div class="" data-testid="starRating"><span class="" data-testid="stars"><div class="stars stars--sm" data-testid="stars-static" style="background-position: 0px -168px;"></div></span><span data-testid="star-rating-text" class="u-text-secondary caption u-margin-cancel">208 <span>ratings</span></span></div></span></div><div class="restaurantReviews-ratingFacets u-stack-y-4"><span data-testid="review-section-rating-facets"><div class="ratingsFacets" data-testid="ratingfacets"><div class="" data-testid="ratingsfacet-details"><p class="ratingsFacet-header u-stack-y-4 body" data-testid="ratingsfacet-header">Here's what people are saying:</p><ul data-testid="ratingsfacet-facetlist" class="ratingsFacet-facetList s-row u-gutterless-3"><li 
class="ratingsFacet-facetList-listItem s-col-xs-4 u-gutter-3"><span class="u-stack-y-1 ratingsFacet-percent h5 u-margin-bottom-cancel">88</span> <span class="ratingsFacet-facetDesc u-text-secondary u-margin-bottom-cancel caption secondary">Food was good</span></li><li class="ratingsFacet-facetList-listItem s-col-xs-4 u-gutter-3"><span class="u-stack-y-1 ratingsFacet-percent h5 u-margin-bottom-cancel">79</span> <span class="ratingsFacet-facetDesc u-text-secondary u-margin-bottom-cancel caption secondary">Delivery was on time</span></li><li class="ratingsFacet-facetList-listItem s-col-xs-4 u-gutter-3"><span class="u-stack-y-1 ratingsFacet-percent h5 u-margin-bottom-cancel">88</span> <span class="ratingsFacet-facetDesc u-text-secondary u-margin-bottom-cancel caption secondary">Order was accurate</span></li></ul></div></div></span></div></div><div class="s-col-md-4 u-stack-y-3"></div></div></div><div class="restaurantReviews-restaurantPagePadding" data-testid="restaurantReviews-body" impressionid="reviewBodyId"><div class="review-container--loading"><div class="" data-testid="allReviews-sortBar"><span class="caption u-text-primary u-margin-bottom-cancel u-flex u-flex-align-xs--center"></span></div></div></div><span></span></div></div></div></div></span><span data-testid="faqs"><div data-testid="faqs-container" class="u-background u-inset-squished-3"><div class="u-padding-top-large"><div class="s-row"><div data-testid="faqs-heading" class="s-col-xs-12"><h2 class="u-stack-y-4 h1">FAQs</h2></div><div data-testid="faqs-body-container" itemscope="" itemtype="http://schema.org/FAQPage"><div data-testid="faq-question" class="s-col-xs-12 u-stack-y-4" itemprop="mainEntity" itemscope="" itemtype="http://schema.org/Question"><h6 itemprop="name"><span>Q) </span>Does Buca di Beppo (1875 S Bascom Ave) deliver?</h6><div class="faq-answer" data-testid="faq-answer" itemprop="acceptedAnswer" itemscope="" itemtype="http://schema.org/Answer"><span>A) </span><span itemprop="text"><span 
data-testid="safe-html"><div xmlns="http://www.w3.org/1999/xhtml" id="safeHtmlWrapper0">Yes, Buca di Beppo (1875 S Bascom Ave) delivery is available on Grubhub.</div></span></span></div></div><div data-testid="faq-question" class="s-col-xs-12 u-stack-y-4" itemprop="mainEntity" itemscope="" itemtype="http://schema.org/Question"><h6 itemprop="name"><span>Q) </span>Does Buca di Beppo (1875 S Bascom Ave) offer contact-free delivery?</h6><div class="faq-answer" data-testid="faq-answer" itemprop="acceptedAnswer" itemscope="" itemtype="http://schema.org/Answer"><span>A) </span><span itemprop="text"><span data-testid="safe-html"><div xmlns="http://www.w3.org/1999/xhtml" id="safeHtmlWrapper1">Yes, Buca di Beppo (1875 S Bascom Ave) provides contact-free delivery with Grubhub.</div></span></span></div></div><div data-testid="faq-question" class="s-col-xs-12 u-stack-y-4" itemprop="mainEntity" itemscope="" itemtype="http://schema.org/Question"><h6 itemprop="name"><span>Q) </span>What type of food is Buca di Beppo (1875 S Bascom Ave)?</h6><div class="faq-answer" data-testid="faq-answer" itemprop="acceptedAnswer" itemscope="" itemtype="http://schema.org/Answer"><span>A) </span><span itemprop="text"><span data-testid="safe-html"><div xmlns="http://www.w3.org/1999/xhtml" id="safeHtmlWrapper2">Buca di Beppo (1875 S Bascom Ave) is a Italian restaurant.</div></span></span></div></div><div data-testid="faq-question" class="s-col-xs-12 u-stack-y-4" itemprop="mainEntity" itemscope="" itemtype="http://schema.org/Question"><h6 itemprop="name"><span>Q) </span>Is Buca di Beppo (1875 S Bascom Ave) eligible for Grubhub+ free delivery?</h6><div class="faq-answer" data-testid="faq-answer" itemprop="acceptedAnswer" itemscope="" itemtype="http://schema.org/Answer"><span>A) </span><span itemprop="text"><span data-testid="safe-html"><div xmlns="http://www.w3.org/1999/xhtml" id="safeHtmlWrapper3">Yes, Grubhub offers free delivery for Buca di Beppo (1875 S Bascom Ave) with a <a 
href="https://www.grubhub.com/plus">Grubhub+</a> membership.</div></span></span></div></div></div></div></div></div></span>
    .
    <script type="text/javascript" id="tealium-script" src="https://tags.tiqcdn.com/utag/grubhubseamless/grubhub/prod/utag.js"></script><div><span data-testid="popover-content" id="ghs-popover-content-0"><aside class="ghsPopover  rightHAlign floatingCartDropDown floatingCart groupOrder-convertLink-onTop ghsPopover--undefined-theme isClosed fade" role="tooltip" style="inset: -10000px auto auto;"><div class="ghsPopover-spacer"></div><div class="popover-content"><span data-testid="closed-bag-popover" class="u-block" style="min-height: 150px; min-width: 300px;"><aside id="ghs-globalCart-container"><span><span data-testid="global-cart"><div data-testid="global-cart-body" id="global-cart" class="globalCart-panel body" tabindex="-1"><span data-testid="sev-one"></span><section class="globalCart-panel-contents"><div class="cart-error"><div class="globalCart-symbol u-text-secondary"></div><div class="cart-error-emptyCart u-text-center"><h5 class="cart-error-title">Your bag is empty.</h5></div></div></section></div></span></span></aside></span></div><span class="popover-caret popover-caret--undefined-theme"></span></aside></span></div><script type="text/javascript" id="clickstream-tag" src="https://assets.grubhub.com/libs/clickstreamjs/2.0.21/clickstream2.min.js"></script><script type="text/javascript" id="perimeter-x-script-tag" src="https://sensor.grubhub.com/O97ybH4J/init.js"></script><script type="text/javascript" id="app-boy-script" src="//assets.grubhub.com/libs/appboy/1.6/appboy.min.js"></script><script type="text/javascript" id="inauth-script-tag" src="https://www.cdn-net.com/cc.js?ts=1661037629801"></script><div><span data-testid="popover-content" id="ghs-popover-content-1"><aside class="ghsPopover  centerHAlign  ghsPopover--undefined-theme isClosed fade" role="tooltip" style="inset: -10000px auto auto;"><div class="ghsPopover-spacer"></div><div class="popover-content"><div class="ratingsFacet-popover"><span data-testid="review-section-rating-facets"><div 
class="ratingsFacets ratingsFacets--popover" data-testid="ratingfacets"><div class="u-inset-4 u-text-center" data-testid="ratingsfacet-details"><p class="ratingsFacet-header u-stack-y-4 body" data-testid="ratingsfacet-header">Here's what people are saying:</p><ul data-testid="ratingsfacet-facetlist" class="ratingsFacet-facetList s-row u-gutterless-3"><li class="ratingsFacet-facetList-listItem u-line-right u-line--thin s-col-xs-4 u-gutter-3"><span class="u-stack-y-1 ratingsFacet-percent h5">88</span> <span class="ratingsFacet-facetDesc u-text-secondary u-margin-bottom-cancel caption">Food was good</span></li><li class="ratingsFacet-facetList-listItem u-line-right u-line--thin s-col-xs-4 u-gutter-3"><span class="u-stack-y-1 ratingsFacet-percent h5">79</span> <span class="ratingsFacet-facetDesc u-text-secondary u-margin-bottom-cancel caption">Delivery was on time</span></li><li class="ratingsFacet-facetList-listItem u-line-right u-line--thin s-col-xs-4 u-gutter-3"><span class="u-stack-y-1 ratingsFacet-percent h5">88</span> <span class="ratingsFacet-facetDesc u-text-secondary u-margin-bottom-cancel caption">Order was accurate</span></li></ul></div></div></span></div></div><span class="popover-caret popover-caret--undefined-theme"></span></aside></span></div><script src="https://cdn.branch.io/branch-latest.min.js" id="branch loader" async="true"></script><div id="ttdUniversalPixelTag" style="display: none;"></div></body></html>
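
Once the page source is available, the dish names, prices and the map link can be pulled out of it. The following is only a minimal sketch, not a tested scraper for the live site. It uses the standard library's html.parser so it runs without extra installs; BeautifulSoup from the question should work just as well. The data-testid values come from the console output above, while the menuItemNew-name class is taken from the question and is an assumption here. MenuParser, parse_menu and SAMPLE_HTML are illustrative names, and the dish name in the sample markup is made up.

```python
from html.parser import HTMLParser

MAP_LINK_TESTID = "restaurant-about-google-map-link"
# Tags without a closing tag; they must not change the nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class MenuParser(HTMLParser):
    """Collects dish names, prices and the Google Maps link from page HTML."""

    def __init__(self):
        super().__init__()
        self.dishes = []
        self.prices = []
        self.map_link = None
        self._target = None  # list that receives the text being captured
        self._depth = 0      # nesting depth inside the captured element
        self._chunks = []

    def _start(self, target):
        self._target, self._depth, self._chunks = target, 1, []

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._target is not None:
            self._depth += 1  # nested element inside a captured one
            return
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        testid = attrs.get("data-testid")
        if tag == "a" and testid == MAP_LINK_TESTID:
            self.map_link = attrs.get("href")
        elif "menuItemNew-name" in classes:  # class name assumed from the question
            self._start(self.dishes)
        elif testid == "menu-item-price":
            self._start(self.prices)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._target is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the captured element has just closed
            self._target.append("".join(self._chunks).strip())
            self._target = None

    def handle_data(self, data):
        if self._target is not None:
            self._chunks.append(data)


def parse_menu(html):
    """Return (dishes, prices, map_link) found in an HTML string."""
    parser = MenuParser()
    parser.feed(html)
    parser.close()
    return parser.dishes, parser.prices, parser.map_link


# Hypothetical stand-in for driver.page_source; the dish name is invented.
SAMPLE_HTML = (
    '<div class="menuItemNew-name"><a>Fried Calamari</a></div>'
    '<span class="" data-testid="menu-item-price" itemprop="price">$39.60</span>'
    '<a data-testid="restaurant-about-google-map-link" '
    'href="https://maps.google.com?daddr=1875%20S%20Bascom%20Ave%20Campbell%20CA%2095008"></a>'
)

dishes, prices, map_link = parse_menu(SAMPLE_HTML)
print(dishes, prices, map_link)
```

In the real script you would call parse_menu(driver.page_source) after the WebDriverWait line instead of printing the whole page.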
    