
Thanks to the help of this forum, I have finally arrived at this Python 3 code:

import urllib.request
from bs4 import BeautifulSoup

url= 'https://www.inforge.net/xi/forums/liste-proxy.1118/'
soup = BeautifulSoup(urllib.request.urlopen(url), "lxml")

for tag in soup.find_all('a', {'class':'PreviewTooltip'}):
    links = (tag.get('href'))
    print (links)

It prints the links of all the topics on the webpage https://www.inforge.net/xi/forums/liste-proxy.1118/. The last thing I need is this: how do I tell Python to write, on every line, the other part of the link (https://www.inforge.net/xi/) before the word "threads"? Thanks in advance!

Allexj

1 Answer

You just have to concatenate the base url with each link.

Try this code:

import urllib.request
from bs4 import BeautifulSoup

url = 'https://www.inforge.net/xi/forums/liste-proxy.1118/'
soup = BeautifulSoup(urllib.request.urlopen(url), "lxml")

# Prefix shared by every thread link on this forum
base = 'https://www.inforge.net/xi/'

for tag in soup.find_all('a', {'class': 'PreviewTooltip'}):
    link = tag.get('href')
    # Prepend the base URL to the relative href
    full_url = base + link
    print(full_url)

Output:

https://www.inforge.net/xi/threads/dichvusocks-us-23h10-pm-update-24-24-good-socks.455661/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-11h11-pm.455660/
https://www.inforge.net/xi/threads/dichvusocks-us-22h10-pm-update-24-24-good-socks.455656/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-9h45-pm.455655/
https://www.inforge.net/xi/threads/dichvusocks-us-18h30-pm-update-24-24-good-socks.455651/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-6h25-pm.455650/
https://www.inforge.net/xi/threads/dichvusocks-us-13h00-pm-update-24-24-good-socks.455634/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-1h00-pm.455633/
https://www.inforge.net/xi/threads/dichvusocks-us-09h15-am-update-24-24-good-socks.455631/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-8h00-am.455627/
https://www.inforge.net/xi/threads/dichvusocks-us-01h35-am-update-24-24-good-socks.455614/
https://www.inforge.net/xi/threads/dichvusocks-us-23h10-pm-update-24-24-good-socks.455610/
https://www.inforge.net/xi/threads/dichvusocks-us-20h15-pm-update-24-24-good-socks.455601/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-8h00-pm.455596/
https://www.inforge.net/xi/threads/dichvusocks-us-15h10-pm-update-24-24-good-socks.455588/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-1h25-pm.455587/
https://www.inforge.net/xi/threads/dichvusocks-us-10h45-am-update-24-24-good-socks.455585/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-10h40-am.455584/
https://www.inforge.net/xi/threads/vn5socks-net-auto-update-24-7-good-socks-7h30-am.455583/
https://www.inforge.net/xi/threads/dichvusocks-us-01h40-am-update-24-24-good-socks.455569/
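
As a possible variation on the concatenation above, the standard library's urllib.parse.urljoin can build the full URLs instead of the + operator. This is only a sketch: the helper name to_absolute and the sample list are made up for illustration, and it assumes the hrefs are relative to https://www.inforge.net/xi/ as in the output shown. It also skips tags whose href is missing, since tag.get('href') returns None in that case and string concatenation would then fail.

```python
from urllib.parse import urljoin

# Base taken from the answer above; every thread link is assumed to be relative to it.
BASE = 'https://www.inforge.net/xi/'


def to_absolute(hrefs, base=BASE):
    """Resolve each href against base, skipping missing (None or empty) values."""
    return [urljoin(base, href) for href in hrefs if href]


# Hypothetical sample standing in for the values of tag.get('href').
sample = [
    'threads/dichvusocks-us-23h10-pm-update-24-24-good-socks.455661/',
    None,  # a tag without an href attribute
]

for full_url in to_absolute(sample):
    print(full_url)
```

Unlike plain concatenation, urljoin also leaves already absolute URLs untouched and handles root-relative paths such as /xi/threads/..., so the same loop keeps working if the site changes how it writes its links.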

dot.Py
  • Oh... I'm so stupid! I thought that if I did that, it would be added only to the first line, but it was very easy. Thanks! – Allexj Jul 18 '16 at 17:10