
I am developing the following script:

from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


browser =webdriver.Firefox(executable_path=r'C:/path/geckodriver.exe')
browser.get('https://brainly.com.br/app/ask?entry=hero&q=jhyhv+vjh')

html = browser.execute_script("return document.documentElement.outerHTML")
p=[]
soup=BeautifulSoup(html,'html.parser')
for link in soup.select('div > a[href*=""]'):
    ref=link.get('href')
    rt = ('https://brainly.com.br'+str(ref))
    ar = p.append(rt)
    print(ar) 

Everything works, with one exception. If I run the script without using *append* to build the list, it works normally, but when I use it, the output is None.

My question: what do I need to change to get a valid, ordered list as the output?

Note: expected output:

['https://link1', 'https://link2']

2 Answers

from selenium import webdriver
from bs4 import BeautifulSoup
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


browser =webdriver.Firefox(executable_path=r'C:/path/geckodriver.exe')
browser.get('https://brainly.com.br/app/ask?entry=hero&q=jhyhv+vjh')

html = browser.execute_script("return document.documentElement.outerHTML")
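p = []  # collects the absolute URLs
soup = BeautifulSoup(html, 'html.parser')
for link in soup.select('div > a[href*=""]'):
    ref = link.get('href')
    rt = 'https://brainly.com.br' + str(ref)
    p.append(rt)  # append returns None, so do not assign its result
print(p)  # print the finished list once, after the loop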
Vikas Sharma

append modifies the list in place and returns None, so the variable that holds the result is p, not the return value of the call.

# (...)

p=[]
soup=BeautifulSoup(html,'html.parser')
for link in soup.select('div > a[href*=""]'):
    ref=link.get('href')
    rt = ('https://brainly.com.br'+str(ref))
    p.append(rt)
print(p)
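
To see the behaviour without starting a browser, here is a minimal sketch. The hrefs list is a hypothetical stand-in for the values that link.get('href') would return, and the names BASE_URL, links, result and all_links are only for illustration:

```python
BASE_URL = "https://brainly.com.br"  # base URL taken from the question

# Hypothetical hrefs standing in for the values of link.get('href')
hrefs = ["/tarefa/1", "/tarefa/2"]

links = []
# append mutates `links` in place and returns None
result = links.append(BASE_URL + hrefs[0])
print(result)  # None
print(links)   # ['https://brainly.com.br/tarefa/1']

# The same kind of list can be built in one expression
all_links = [BASE_URL + href for href in hrefs]
print(all_links)
```

A list comprehension like the last one avoids the temptation to assign the result of append at all, but the loop with p.append(rt) followed by print(p) is just as valid.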
kendriu