-3

Is there a way to extract all links from a single page (for example, a page that has 3 links)?

Then, can Python open those 3 links and extract all links from each of them (each of those pages has 2 links, for example)?

Can someone tell me what to use to extract links like that? I also want to do something with those links when they are opened, for example extract all the numbers from each site. I would like to use BeautifulSoup for some part of it, but can it be done with BS4 alone?

YoungBoi
    You are about to get downvoted or have this post closed because you are asking for the SO community to just write code for you. Please show the code you have tried so far and explain what does/does not work. – kstullich Jul 18 '18 at 01:16
  • I am sorry, but I am not asking anyone to write the code. I want to see what I need to learn to do that, which modules, etc. – YoungBoi Jul 18 '18 at 01:17
  • @YoungBoi, I have understood that and answered accordingly – rawwar Jul 18 '18 at 01:22
  • BS4 doesn't have any code to download links, so you also need some way to do that—you can use `urllib.request` in the stdlib, or `requests`, or you can write low-level sockets code, but you have to use something other than BS4. (BS4 also requires a parser, but for that, it's just a matter of adding the name of one to the constructor, so I don't think that really counts as needing to use anything else.) – abarnert Jul 18 '18 at 01:23
  • But anyway, if this question isn't too broad to answer, it's basically a library-recommendation or tutorial-recommendation question. Which are both perfectly good questions to ask, but unfortunately Stack Overflow is not the right place to ask them. – abarnert Jul 18 '18 at 01:25

1 Answer

-1

You can use Scrapy for this.

First, to get all the links from a page, you can go through the following SO posts:
1. retrieve links from web page using python and BeautifulSoup
2. Fetch all href link using selenium in python
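The link-extraction step those posts cover can be sketched like this (a minimal example, assuming `bs4` is installed; the HTML string stands in for a page you would download with `requests` or `urllib.request`, and the URLs are made up):

```python
from bs4 import BeautifulSoup

html = """
<html><body>
  <a href="https://example.com/page1">First</a>
  <a href="https://example.com/page2">Second</a>
  <a href="https://example.com/page3">Third</a>
</body></html>
"""

# Parse the page and collect the href attribute of every <a> tag.
soup = BeautifulSoup(html, "html.parser")
links = [a["href"] for a in soup.find_all("a", href=True)]
print(links)
```

Passing `href=True` to `find_all` skips anchor tags that have no `href` attribute, so the list comprehension never raises a `KeyError`.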

Once you understand how to get the links, go through the Scrapy tutorial: https://doc.scrapy.org/en/latest/intro/tutorial.html
It covers exactly what you need: on every page, you `yield` both the data you extracted and a new request for each link you find on that particular page.
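Before reaching for Scrapy, the two-level idea from the question can be sketched in plain Python with BS4 (a minimal sketch; `fetch` is a stand-in for whatever downloader you use, e.g. `lambda url: requests.get(url).text`, and it is injected here so the crawl logic stays testable):

```python
import re
from bs4 import BeautifulSoup

def crawl(start_url, fetch, max_depth=2):
    """Breadth-first crawl: visit the start page, then the pages it
    links to, up to max_depth levels, collecting every number seen."""
    seen, numbers = set(), []
    frontier = [start_url]
    for _ in range(max_depth):
        next_frontier = []
        for url in frontier:
            if url in seen:          # avoid fetching the same page twice
                continue
            seen.add(url)
            soup = BeautifulSoup(fetch(url), "html.parser")
            # extract all numbers from the page's visible text
            numbers += re.findall(r"\d+", soup.get_text(" "))
            # queue every link on this page for the next level
            next_frontier += [a["href"] for a in soup.find_all("a", href=True)]
        frontier = next_frontier
    return numbers
```

Scrapy does the same thing for you at scale: its `parse` callback yields items for the extracted data and follows links by yielding new requests, with scheduling and deduplication handled by the framework.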

rawwar