
I need to open 30 web pages and get data from each of them. If I open each one separately, I get no errors and clean tables.

    import os
    import requests
    import pandas as pd

    def overall(web, country, season, Competition, comp_id, import_csv: bool):
        season_id = f'{season}_{Competition}_{country}'
        name = "overall"
        excel_name = f'{name}_{Competition}_{country}_{season}'
        table = 0  # index of the table to keep from the page
        html = requests.get(web).content
        df_list = pd.read_html(html, header=0)  # parse every table on the page
        data = df_list[table]
        # Prepend identifying columns so rows can be traced back to their source
        data.insert(0, "Country", country, False)
        data.insert(0, "season", season, False)
        data.insert(0, "Competition", Competition, False)
        data.insert(0, "Comp_id", comp_id, False)
        data.insert(0, "season_id", season_id, False)
        data.rename(columns={"Squad": "Team", "Pts/MP": "PPG"}, inplace=True)
        if import_csv:
            data.to_csv(os.path.join("C:\\All fotball project\\fbref\\overall_2021-2022",
                                     f'{excel_name}_data.csv'), index=False)
        else:
            return data
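
One thing the function never checks is whether the request actually succeeded: `requests.get()` will happily return a CAPTCHA or error page, and `pd.read_html` then fails with "No tables found" because that page contains no tables. A minimal sketch of a more defensive fetch step, assuming the failures are transient (`fetch_tables` is a hypothetical helper, and the retry count and delay are guesses, not tested limits):

    import time
    import requests
    import pandas as pd

    def fetch_tables(url, retries=3, delay=5):
        """Fetch `url` and parse its tables, retrying when none are found."""
        for attempt in range(retries):
            resp = requests.get(url)
            resp.raise_for_status()   # surface real HTTP errors (403, 429, ...)
            try:
                return pd.read_html(resp.content, header=0)
            except ValueError:        # "No tables found" -- possibly a block page
                time.sleep(delay)     # back off before trying again
        raise ValueError(f"No tables found at {url} after {retries} attempts")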

If I run 10 such calls consecutively, I get random "No tables found" errors, even though each call works fine on its own. The failure point is random: sometimes on the 3rd call, sometimes on the 5th, sometimes on the 1st.

fbref.overall(web1, "England", "2022", "1", "105", True)
fbref.overall(web2, "England", "2022", "2", "106", True)
fbref.overall(web3, "England", "2022", "3", "107", True)
fbref.overall(web4, "England", "2022", "4", "108", True)
fbref.overall(web5, "England", "2022", "5", "109", True)
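
If the failures turn out to be rate-related, one low-effort experiment is to space these calls out. A sketch reusing `web1`–`web5` and `fbref` from above; the 5-second pause is an assumption, not a documented fbref limit:

    import time

    pages = [(web1, "1", "105"), (web2, "2", "106"), (web3, "3", "107"),
             (web4, "4", "108"), (web5, "5", "109")]
    for web, comp, comp_id in pages:
        fbref.overall(web, "England", "2022", comp, comp_id, True)
        time.sleep(5)  # throttle so the site is less likely to serve a bot check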


--> 552     raise ValueError("No tables found")
554 result = []
555 unique_tables = set()

ValueError: No tables found
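
This ValueError is pandas reporting that the fetched HTML contains no `<table>` elements at all, which is exactly what happens when the server returns a block or CAPTCHA page instead of the stats page. A minimal reproduction:

    import pandas as pd
    from io import StringIO

    # Any HTML document without a <table> element triggers the same error:
    pd.read_html(StringIO("<html><body>Please verify you are human</body></html>"))
    # ValueError: No tables found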

Sometimes I get the error on the first call; on a new run, calls 1, 2, 3 are fine and the error comes on the 4th. On the next run the error shows up on the 2nd call, even though it was fine before. My PC is not very powerful, but I don't think that's the problem here.

  • It is a bit difficult to see what you have in mind, but it seems like you are executing a loop `for w in [web1, web2, web3, web4, web5, web6]: fbref.overall(w, ...)`. It seems that `pd.read_html` can't parse the HTML structure to detect the table you want to scrape. There can be several reasons for that. It would help to be more specific about the URLs you want to scrape. See whether [this](https://stackoverflow.com/questions/53398785/pandas-read-html-valueerror-no-tables-found) gives an answer by using `Selenium`. – Yannis P. Aug 10 '22 at 19:24
  • I found the answer. After a while I get a page where I have to confirm that I am not a computer. – Darjus Vasiukevic Aug 21 '22 at 20:50
  • Good to know. However, I think Selenium can help in those cases, at least when it boils down to just clicking a checkbox. – Yannis P. Aug 23 '22 at 16:00
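
For reference, a minimal version of the Selenium route Yannis P. suggests might look like the sketch below. It assumes a local Chrome/chromedriver setup and reuses `web1` from the question; a real bot check may still require manually interacting with the opened browser window:

    from io import StringIO

    import pandas as pd
    from selenium import webdriver

    driver = webdriver.Chrome()     # assumes chromedriver is available on PATH
    try:
        driver.get(web1)            # web1 as in the question
        html = driver.page_source   # HTML after the browser has rendered the page
        df_list = pd.read_html(StringIO(html), header=0)
    finally:
        driver.quit()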

0 Answers