This is a script for my personal study and educational purposes. It connects through Tor and uses Selenium.

Both the scraping (team name list) and the Tor connection worked fine.

Then I added code that uses WebDriverWait to press the cookie button, and now nothing works correctly anymore. The added code seems to conflict with the Tor code.

How can I solve this while keeping both the Tor code and the WebDriverWait code active?

from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
import sqlite3

### TOR CONNECTION ###
from selenium import webdriver
from selenium.webdriver.firefox.firefox_profile import FirefoxProfile
import os

torexe = os.popen('/home/mypc/.local/share/torbrowser/tbb/x86_64/tor-browser_en-US') 

profile = FirefoxProfile('/home/mypc/.local/share/torbrowser/tbb/x86_64/tor-browser_en-US/Browser/TorBrowser/Data/Browser/profile.default')
profile.set_preference('network.proxy.type', 1)
profile.set_preference('network.proxy.socks', '127.0.0.1')
profile.set_preference('network.proxy.socks_port', 9050)
profile.set_preference("network.proxy.socks_remote_dns", False)
profile.update_preferences()

firefox_options = webdriver.FirefoxOptions()
firefox_options.binary_location = '/usr/bin/firefox' 

driver = webdriver.Firefox(
    firefox_profile=profile, options=firefox_options, 
    executable_path='/usr/bin/geckodriver') 

#Scraping SerieA
driver.maximize_window()
wait = WebDriverWait(driver, 20)
driver.get("link")
wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[id='onetrust-accept-btn-handler']"))).click()

for SerieA in driver.find_elements(By.CSS_SELECTOR, "a[href^='/squadra'][class^='rowCellParticipantName']"):
    print(SerieA.text)

### SAVE TO DATABASE: league names
con = sqlite3.connect('/home/mypc/Scrivania/folder/Database.db')
cursor = con.cursor()
records_added_Risultati = 0

Values = SerieA
sqlite_insert_query = 'INSERT INTO ARCHIVIO_Squadre_Campionato (Nome_Squadra) VALUES (?);'
count = cursor.executemany(sqlite_insert_query, Values)   # executemany, not execute
con.commit()
print("Record inserted successfully ", cursor.rowcount)
records_added_Risultati = records_added_Risultati + 1
cursor.close()

The error is:

Traceback (most recent call last):
  File "/usr/lib/python3.8/idlelib/run.py", line 559, in runcode
    exec(code, self.locals)
  File "/home/mypc/Scrivania/folder/example.py", line 31, in <module>
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[id='onetrust-accept-btn-handler']"))).click()
  File "/home/mypc/.local/lib/python3.8/site-packages/selenium/webdriver/support/wait.py", line 80, in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message: 

1 Answer

You are mixing implicit waits with explicit waits. This is not recommended and causes problems.
I'm quite sure your problems are caused by this.
You can read more about this here
Also see this

UPD
You have a typo in both element locators.
Just extra spaces, but they totally break the locators.
Also, in case the accept-cookies button is not stable, i.e. it sometimes does not appear, put the click inside a try-except block as follows:

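from selenium.common.exceptions import TimeoutException

driver.maximize_window()
wait = WebDriverWait(driver, 10)
driver.get("https://www.thesiteurl/bla/bla/discrete")
try:
    # Click the cookie button if it becomes clickable within the timeout
    wait.until(EC.element_to_be_clickable((By.CSS_SELECTOR, "button[id='onetrust-accept-btn-handler']"))).click()
except TimeoutException:
    # The button did not appear in time; continue with the scraping
    pass

for SerieA in driver.find_elements(By.CSS_SELECTOR, "a[href^='/squadra'][class^='rowCellParticipantName']"):
    print(SerieA.text)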
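
A possible refinement of the snippet above, as a minimal sketch: catch only the timeout and report whether the click actually happened, so a missing cookie button is visible in the output instead of being silently skipped. The helper name click_if_present and its parameters are hypothetical. It assumes wait is a WebDriverWait, condition is an expected condition such as EC.element_to_be_clickable(...), and timeout_error is the TimeoutException class from selenium.common.exceptions.

```python
def click_if_present(wait, condition, timeout_error):
    """Click the element matched by `condition`.

    Returns True if the element became available and was clicked,
    False if the wait ran out (timeout_error was raised).
    """
    try:
        # wait.until returns the element once the condition is satisfied
        wait.until(condition).click()
    except timeout_error:
        # The element never became clickable within the wait's timeout
        return False
    return True
```

With Selenium this could be called as click_if_present(wait, EC.element_to_be_clickable((By.CSS_SELECTOR, "button[id='onetrust-accept-btn-handler']")), TimeoutException), printing the result to see on which runs the button was missing.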
Prophet
  • I got it. Thank you for your kind reply. However, I note that the answers are for Java. I am new to Python, I have read but little understood. If it doesn't bother you, could you please write me an answer showing me what I need to change? I would be very grateful to you. Thank you :) – Frederick Man Jul 07 '21 at 21:35
  • I tried to remove driver.implicitly_wait(30), but I always get the same error. So the error is not caused by driver.implicitly_wait(30), but by something else – Frederick Man Jul 07 '21 at 21:46
  • Thanks for the replies. You are very kind. Tried. I always get the same error: in until raise TimeoutException (message, screen, stacktrace) selenium.common.exceptions.TimeoutException: Message: – Frederick Man Jul 07 '21 at 22:32
  • Ok, thank you so much. I hope we can solve the problem. I've been going crazy for days. Thank you :) – Frederick Man Jul 08 '21 at 06:11
  • I changed the code in the question and added/changed the error. Now everything is exactly as I have it on the PC, that is, the same as the file in my Python IDE and console. The number of lines and the line numbers in the error are the same as mine, so the line reported in the error now matches the script code in the question. – Frederick Man Jul 08 '21 at 06:37
  • That black element ONLY appears during scraping. If you open your browser normally, it's not there. – Frederick Man Jul 08 '21 at 06:50
  • PERFECT. Thank you. You are greattttttttt. I'm happy. Just one thing: database insert doesn't work. The results of the scraping are not saved in the database. I would like to save each team in a row. The column of the team name is called "Team_Name", while the table ARCHIVE_Squadre_Campionato. The code for inserting into the database is already present, but something is wrong. Why doesn't it insert? Is this a .text problem? How can I solve? Thank you – Frederick Man Jul 08 '21 at 07:04
  • @Prophet OK, of course I accept the answer. You helped me. You have been very kind. Thank you. I really appreciate your help. Did you also answer the other similar question? I didn't notice, because I deleted it yesterday. I didn't think posting two similar questions was right, so I deleted it. I hadn't seen your answer. I'm sorry. Thanks again for your help. Thank you very much. Thank you – Frederick Man Jul 08 '21 at 07:13
  • Thanks you are really kind. Being new to stackoverflow, I don't know how to use it well. If I ask a question and get no answers, is there a way to ask you? To get my question to you? – Frederick Man Jul 08 '21 at 07:23
  • Only now, analyzing calmly, did I realize that your kind response was not the solution to my problem. The problem was that scraping sometimes worked and sometimes it didn't. For example, the scraping worked once and twice or thrice. I assumed this was due to the cookie button not being pressed. When I tested the code of your answer, it worked. Now I have tried 5 more attempts, but 2 times it was successful and 3 times not. It means that the cookie button must necessarily be pressed. Try, except and pass may not have been the right solution. Excuse me, could you help me? Thank you – Frederick Man Jul 08 '21 at 16:01
  • Wait, there's a little misunderstanding :) I haven't discovered a new problem: it's always the same as the question. Basically you have added only Try, except and pass: as you know try and pass only skip the problem if there is one. The problem is, because the cookie button was not pressed. So I return to the unsolved problem. I posted the question, because due to cookies the scraping happened correctly 1 time yes and 2 times no. This is not a new problem. You have made sure that the key is not pressed, but the execution is skipped that already did not work as per question – Frederick Man Jul 08 '21 at 17:02
  • All right. However I set the question as solved, even if you just skipped the problem, without solving it. As I said, I don't want to start a controversy and I respect you. Thanks anyway for everything :) – Frederick Man Jul 08 '21 at 17:53
  • Done! green tick :) – Frederick Man Jul 08 '21 at 17:58
  • @Prophet : bruh ! let it be ! he took that WebdriverWait and css selector (Given by me) from the post he deleted without any acknowledgment . SO does not allow blocking, otherwise I would have blocked him by now. – cruisepandey Jul 08 '21 at 18:07
  • @Prophet Sorry, in the answer to this question from the other day, could I please delete the link in driver.get? By typing driver.get ("link") or something similar. For my peace of mind and security reasons, I would prefer that there was not a link in the application (since it is scraping, even if I do it for my personal study and formative purposes). Thanks :) – Frederick Man Jul 14 '21 at 03:12
  • @Prophet I don't want to delete it. It's not right. It's not respectful to you either. I just want to delete the link. You were kind and you answered me. Thanks again. I would just like to remove a link for my safety and peace of mind. The link is in your kind reply, so I can't delete it. Could I please delete the link in driver.get? By typing driver.get ("link") or something similar? Thanks and sorry – Frederick Man Jul 14 '21 at 15:20
  • @Prophet I can't comment, maybe I can't. It is your answer that I voted and accepted. It is the only answer. It's the third line of the code in your answer, where it says driver.get. Can you remove the link inside and write for example just driver.get ("link") or something similar to your liking? Thank you and excuse me – Frederick Man Jul 18 '21 at 04:28
  • Now I understand. changed it, sorry for inconvenience. – Prophet Jul 18 '21 at 06:53
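
On the database insert raised in the comments: in the question's code, Values = SerieA holds only the loop variable (a single WebElement, not its text), while executemany expects a sequence with one parameter tuple per row. A minimal sketch of one way to store each team name in its own row, assuming the table and column names from the question's INSERT statement; the function name save_team_names is hypothetical.

```python
import sqlite3


def save_team_names(con, names):
    """Insert each non-empty team name as its own row; return how many rows were added."""
    # executemany wants one tuple per row, even for a single column
    rows = [(name,) for name in names if name]
    cursor = con.cursor()
    cursor.executemany(
        "INSERT INTO ARCHIVIO_Squadre_Campionato (Nome_Squadra) VALUES (?);",
        rows,
    )
    con.commit()
    cursor.close()
    return len(rows)
```

With Selenium, names could be built as [el.text for el in driver.find_elements(By.CSS_SELECTOR, "a[href^='/squadra'][class^='rowCellParticipantName']")] and passed in together with the open connection.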