I am unable to get the link for each eBay listing. I searched Google and Stack Overflow and found a suggested solution, which is to use "link = items.findAll('a', href=True)". Unfortunately, it does not work for me: the code below returns far more than I want. I only want the eBay listing links.

for items in bs.findAll('li', {'class': 's-item'}):
    links = bs.find_all('a', href=True)
    print(links)

What I want my code to return.

My entire code: https://pastebin.com/uNd1th8Z

Thank you so much. Seriously.

Kay

1 Answer

@Crissal has given you a good reference, and I tested scraping from your URL myself. The tricky part is picking out the relevant hrefs, since the page contains many irrelevant ones. If you are stuck, you may refer to my code.

from bs4 import BeautifulSoup, SoupStrainer
from selenium import webdriver

# Selenium 3 style call; newer Selenium releases expect the driver path via a Service object
browser = webdriver.Chrome('(your_path_here)/chromedriver')
browser.get('https://www.ebay.com/sch/i.html?_from=R40&_nkw=watches&_sacat=0&_pgn=1')
bs = BeautifulSoup(browser.page_source, 'lxml', parse_only=SoupStrainer('a'))

# One-liner to get the relevant hrefs: anchors with class 's-item__link' and a data-track attribute
ls = [a['href'] for a in bs.find_all('a', href=True)
      if a.has_attr('data-track') and 's-item__link' in a.get('class', [])]

#print or save to a text file
#print(ls)

#with open('links.txt', 'w', encoding='utf-8') as outfile:
#    outfile.write('\n'.join(ls))

Output from print

https://www.ebay.com/itm/SKMEI-Watch-Mens-Womens-Watches-Waterproof-Sport-Outdoor-LED-Digital-Wristwatch/392260574708?hash=item5b548d5df4:m:myDW5IBI631t2UBz65C2Ztw
https://www.ebay.com/itm/LED-Watches-Men-Women-Sport-Casual-Digital-Army-Military-Silicone-Wrist-Watch/193040878143?hash=item2cf2220a3f:m:m7VH5iopUucVEYPr9y5dfxA
https://www.ebay.com/itm/Men-Women-Leather-Strap-Line-Analog-Quartz-Ladies-Wrist-Watches-Fashion-Watch/202788343863?hash=item2f37209037:m:mOWAjJgHKIYzv8YqDPxLZqA
.
.
.
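If you would rather keep the loop from your question, here is a minimal sketch of one way to do it. The likely reason your version prints so much is that it calls find_all on bs (the whole page) inside the loop instead of on items (the current li). The HTML below is a made-up, heavily simplified stand-in for the results page, and SAMPLE_HTML, listing_links, and the s-item__seller class are hypothetical names used only for illustration. The real page markup may differ, so treat this as a sketch rather than a tested scraper.

```python
from bs4 import BeautifulSoup

# Hypothetical, simplified stand-in for an eBay search results page.
SAMPLE_HTML = """
<ul>
  <li class="s-item">
    <a class="s-item__link" href="https://www.ebay.com/itm/111">Watch A</a>
    <a class="s-item__seller" href="https://www.ebay.com/usr/someone">seller</a>
  </li>
  <li class="s-item">
    <a class="s-item__link" href="https://www.ebay.com/itm/222">Watch B</a>
  </li>
  <li class="other"><a href="https://www.ebay.com/help">Help</a></li>
</ul>
"""


def listing_links(html):
    """Return the href of the 's-item__link' anchor in each 's-item' <li>."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    for item in soup.find_all("li", {"class": "s-item"}):
        # Search inside this <li> only (item, not soup), and keep only
        # the anchor that carries the listing-link class.
        anchor = item.find("a", class_="s-item__link", href=True)
        if anchor is not None:
            links.append(anchor["href"])
    return links


print(listing_links(SAMPLE_HTML))
```

With Selenium you would pass browser.page_source to listing_links instead of SAMPLE_HTML.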

QuantStats