
I'm trying to collect a list of URLs (starting with "https://...") and store them in a CSV file. I can do this manually, for example by using Excel and copying the URLs from the website of interest and pasting them one by one, but that is tedious and would definitely take a lot of time.

Can someone suggest a faster way to do this?

WanJin
  • This is a very general question. You need to show what you've tried and give examples of data, code etc. to ask a more specific question. See here: https://stackoverflow.com/help/how-to-ask . In general, given the little information you've provided, I'm guessing you'd need a browser-automation tool like Selenium, and a Python package like Beautiful Soup or Scrapy. – michjnich Jul 23 '20 at 09:21

1 Answer


If you just need the addresses quickly from one page, you could run this JavaScript snippet in the console of your browser: `Array.from(document.links).forEach(link => console.log(link.href))`. This will output all of the links on that page. (Note that `document.links` returns an HTMLCollection, which has no `forEach` method of its own, so it is converted to an array first.)

If you want to use Python to scrape the page, I would suggest taking a look at this question on Stack Overflow, which uses the BeautifulSoup library.
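As a rough illustration of that approach, here is a minimal sketch using the requests and beautifulsoup4 packages (both assumed to be installed, e.g. via pip; the URL and output filename are placeholders):

```python
import csv

import requests
from bs4 import BeautifulSoup

url = "https://example.com"  # placeholder: the page you want to scrape
response = requests.get(url)
soup = BeautifulSoup(response.text, "html.parser")

# Collect every href attribute that starts with "https://"
links = [a["href"] for a in soup.find_all("a", href=True)
         if a["href"].startswith("https://")]

# Write one URL per row to a CSV file
with open("links.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for link in links:
        writer.writerow([link])
```

This only sees the HTML returned by the server, which is why it won't catch links injected later by JavaScript.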

If there is dynamic content loaded on the page with JavaScript, it's probably better to use something like Selenium; see this relevant Stack Overflow question.
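Along the same lines, here is a minimal sketch using Selenium's Python bindings (assumed installed via pip, with a matching Chrome driver available; again the URL and filename are placeholders):

```python
import csv

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()  # assumes a Chrome driver is on your PATH
try:
    driver.get("https://example.com")  # placeholder URL
    # find_elements returns all matching elements; here, every anchor tag
    hrefs = [a.get_attribute("href")
             for a in driver.find_elements(By.TAG_NAME, "a")]
    with open("links.csv", "w", newline="") as f:
        writer = csv.writer(f)
        for href in hrefs:
            if href and href.startswith("https://"):
                writer.writerow([href])
finally:
    driver.quit()
```

Because Selenium drives a real browser, the links collected here include any added by JavaScript after the page loads.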

stilllearning