I am trying to retrieve data from an html table but the CSS selector I've defined can't find the elements of the whole table. The entire code is at the bottom but the problem is the following:

for match in driver.find_elements_by_css_selector("div[id='all_games'] td.right"):

That results in a NoSuchElementException.

If I change it to:

for match in driver.find_elements_by_css_selector("div[id='all_games'] tr[data-row='16']"):

I receive the data from row 16, so the opposition and date lines must work. However, I can't work out a CSS selector that returns every row of the table for those lines to be matched against.

Any ideas would be greatly appreciated!

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

import pandas as pd

url = 'https://www.pro-football-reference.com/teams/nwe/2020.htm'

data = []

driver = webdriver.Chrome('/Users/zachbeaulieu/Downloads/chromedriver')
driver.implicitly_wait(10)
wait = WebDriverWait(driver, 10)

driver.get(url)
# wait for the page to load
wait.until(EC.visibility_of_element_located((By.CSS_SELECTOR, "div[id='all_games'] td.right")))

for match in driver.find_elements_by_css_selector("div[id='all_games'] td.right"):
    opposition = match.find_element_by_css_selector("td[data-stat~='opp']").text
    date = match.find_element_by_css_selector("td[data-stat~='game_date']").text

    data.append({
        "opposition": opposition.strip(),
        "date": date.strip()
            })

driver.close()

df = pd.DataFrame(data)
print(df)
imtrying

1 Answer

Any reason you are using Selenium for this? It is far easier, quicker, and more efficient to use pandas to parse the table. Then, if you want just the date and opponent columns, you can slice them out.

import pandas as pd
import requests
from bs4 import BeautifulSoup, Comment

url = 'https://www.pro-football-reference.com/teams/nwe/2020.htm'

# This will put all the tables not in the comments into a list of tables
tables = pd.read_html(url, header=1)

print('Found %s tables in html.' %(len(tables)))

# Now let's get the other tables (hidden in HTML comments) and append them to that list
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/88.0.4324.104 Safari/537.36'}
response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.text, 'html.parser')

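# Collect every HTML comment node in the page
comments = soup.find_all(string=lambda text: isinstance(text, Comment))
for each in comments:
    # Only try to parse comments that might contain a table
    if 'table' in str(each):
        try:
            tables.append(pd.read_html(str(each))[0])
        except ValueError:
            # read_html raises ValueError when no table can be parsed
            continue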


print('Found %s tables in html with additional in comments.' %(len(tables)))

Output:

You can see that you initially get 3 tables; the loop then pulls in the other 8 tables found in the comments of the HTML.

Found 3 tables in html.
Found 11 tables in html with additional in comments.
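
If it helps to see the "tables hidden in comments" idea in isolation, here is a minimal sketch that runs without any network access. It deliberately swaps BeautifulSoup for the standard library's html.parser so it has no extra dependencies, and the SAMPLE_HTML page and the CommentCollector class are made up for illustration; they are not taken from the real site.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect the body of every HTML comment that contains a <table> tag."""

    def __init__(self):
        super().__init__()
        self.table_comments = []

    def handle_comment(self, data):
        # `data` is the comment body without the <!-- and --> delimiters.
        if "<table" in data:
            self.table_comments.append(data)


# Hypothetical miniature page: one visible table, one table inside a comment.
SAMPLE_HTML = """
<div id="all_games"><table id="games"><tr><td>visible</td></tr></table></div>
<div id="all_passing">
<!--
<table id="passing"><tr><td>hidden</td></tr></table>
-->
</div>
<!-- an unrelated comment -->
"""

collector = CommentCollector()
collector.feed(SAMPLE_HTML)
collector.close()

# Only the commented-out table markup is collected.
print(len(collector.table_comments))
```

Each collected string is ordinary table markup, so it could then be handed to a table parser in the same way the loop above hands str(each) to pd.read_html.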

All 11 Tables:

for df in tables:
    print(df)
            Player   PF   Yds    Ply  Y/P  ...     Start  Time  Plays  Yds.4    Pts
0       Team Stats  326  5236  979.0  5.3  ...  Own 28.5  2:57   6.41   33.1   1.92
1       Opp. Stats  353  5660  982.0  5.8  ...  Own 24.6  3:10   6.40   35.9   2.16
2  Lg Rank Offense   27    27    NaN  NaN  ...        15     7   5.00   17.0  24.00
3  Lg Rank Defense    7    15    NaN  NaN  ...         1    32  24.00   25.0  16.00

[4 rows x 31 columns]
    Week  Day          Date Unnamed: 3  ... TO.1 Offense  Defense Sp. Tms
0      1  Sun  September 13  1:00PM ET  ...  3.0   14.11     2.20   -5.64
1      2  Sun  September 20  8:20PM ET  ...  1.0   19.82   -14.54   -6.98
2      3  Sun  September 27  1:00PM ET  ...  3.0   11.03    -0.71    5.81
3      4  Mon     October 5  7:05PM ET  ...  1.0  -10.82    -5.72    3.78
4      5  NaN           NaN        NaN  ...  NaN     NaN      NaN     NaN
5      6  Sun    October 18  1:00PM ET  ...  2.0  -13.30     6.67    1.59
6      7  Sun    October 25  4:25PM ET  ...  2.0   -9.49   -22.44    6.55
7      8  Sun    November 1  1:00PM ET  ...  1.0    8.26    -8.85   -0.85
8      9  Mon    November 9  8:15PM ET  ...  1.0   18.70   -18.71    2.16
9     10  Sun   November 15  8:20PM ET  ...  1.0   13.95    -3.45   -1.86
10    11  Sun   November 22  1:00PM ET  ...  NaN   13.67   -15.69   -0.27
11    12  Sun   November 29  1:00PM ET  ...  1.0   -4.68     0.00    9.06
12    13  Sun    December 6  4:25PM ET  ...  2.0    9.66    16.13   22.19
13    14  Thu   December 10  8:20PM ET  ...  1.0  -29.77    -1.19    8.58
14    15  Sun   December 20  1:00PM ET  ...  1.0   -1.00   -16.71    6.16
15    16  Mon   December 28  8:15PM ET  ...  NaN   -2.39   -26.17    1.10
16    17  Sun     January 3  1:00PM ET  ...  2.0   18.13    -3.92   -0.05

[17 rows x 25 columns]
            Player  3DAtt  3DConv   3D%  ...   4D%  RZAtt  RZTD  RZPct
0       Team Stats  186.0    76.0  40.9  ...  52.9   48.0  26.0   54.2
1       Opp. Stats  186.0    76.0  40.9  ...  56.3   49.0  32.0   65.3
2  Lg Rank Offense    NaN     NaN  17.0  ...  22.0    NaN   NaN   24.0
3  Lg Rank Defense    NaN     NaN  16.0  ...  16.0    NaN   NaN   27.0

[4 rows x 10 columns]
    No.           Player   Age  Pos   G  ...   NY/A  ANY/A  Sk%  4QC  GWD
0   1.0       Cam Newton  31.0   QB  15  ...   6.17   5.44  7.8  1.0  3.0
1   4.0  Jarrett Stidham  24.0  NaN   5  ...   4.90   2.92  8.3  NaN  NaN
2   2.0      Brian Hoyer  35.0   qb   1  ...   4.31   2.58  7.7  NaN  NaN
3  11.0   Julian Edelman  34.0   wr   6  ...  19.00  19.00  0.0  NaN  NaN
4  16.0    Jakobi Meyers  24.0   wr  14  ...  21.50  41.50  0.0  NaN  NaN
5   NaN       Team Total  26.5  NaN  16  ...   6.06   5.24  7.8  1.0  3.0
6   NaN        Opp Total   NaN  NaN  16  ...   6.90   6.20  4.6  NaN  NaN

[7 rows x 29 columns]
   Unnamed: 0_level_0  ... Unnamed: 27_level_0
                  No.  ...                 Fmb
0                37.0  ...                   1
1                 1.0  ...                   6
2                26.0  ...                   0
3                34.0  ...                   0
4                28.0  ...                   1
5                42.0  ...                   0
6                 4.0  ...                   0
7                80.0  ...                   1
8                16.0  ...                   1
9                10.0  ...                   3
10               15.0  ...                   1
11               11.0  ...                   0
12               19.0  ...                   0
13               14.0  ...                   0
14                2.0  ...                   1
15               85.0  ...                   1
16               47.0  ...                   1
17               44.0  ...                   1
18               86.0  ...                   0
19                NaN  ...                  19
20                NaN  ...                  11

[21 rows x 28 columns]
  Unnamed: 0_level_0 Unnamed: 1_level_0  ... Kick Returns Unnamed: 16_level_0
                 No.             Player  ...         Y/Rt                APYd
0               80.0  Gunner Olszewski+  ...         23.2               849.0
1               14.0     Donte Moncrief  ...         23.6               184.0
2               42.0        J.J. Taylor  ...         21.8               212.0
3               10.0       Damiere Byrd  ...          NaN               619.0
4               35.0        Kyle Dugger  ...         23.5                47.0
5                NaN         Team Total  ...         23.1              6672.0
6                NaN          Opp Total  ...         21.3                 NaN

[7 rows x 17 columns]
  Unnamed: 0_level_0 Unnamed: 1_level_0 Unnamed: 2_level_0  ... Punting           
                 No.             Player                Age  ...     Lng Blck   Y/P
0                6.0          Nick Folk               36.0  ...     NaN  NaN   NaN
1                7.0      Jake Bailey*+               23.0  ...    71.0  0.0  48.7
2                NaN         Team Total               26.5  ...    71.0  0.0  48.7
3                NaN          Opp Total                NaN  ...     NaN  0.0  46.0

[4 rows x 32 columns]
   Unnamed: 0_level_0 Unnamed: 1_level_0  ... Tackles Unnamed: 22_level_0
                  No.             Player  ...  QBHits                Sfty
0                21.0    Adrian Phillips  ...     1.0                 NaN
1                51.0   Ja'Whaun Bentley  ...     3.0                 NaN
2                31.0     Jonathan Jones  ...     1.0                 NaN
3                32.0     Devin McCourty  ...     0.0                 NaN
4                35.0        Kyle Dugger  ...     0.0                 NaN
5                93.0       Lawrence Guy  ...     7.0                 NaN
6                55.0         John Simon  ...     4.0                 NaN
7                50.0     Chase Winovich  ...    12.0                 NaN
8                27.0       J.C. Jackson  ...     0.0                 NaN
9                59.0         Terez Hall  ...     0.0                 NaN
10               91.0  Deatrich Wise Jr.  ...    11.0                 NaN
11               30.0     Jason McCourty  ...     0.0                 NaN
12               24.0   Stephon Gilmore*  ...     0.0                 NaN
13               70.0        Adam Butler  ...     7.0                 NaN
14               99.0       Byron Cowart  ...     3.0                 NaN
15               25.0    Terrence Brooks  ...     1.0                 NaN
16               90.0   Shilique Calhoun  ...     3.0                 NaN
17               58.0  Anfernee Jennings  ...     1.0                 NaN
18               33.0   Joejuan Williams  ...     0.0                 NaN
19               29.0      Justin Bethel  ...     0.0                 NaN
20               41.0       Myles Bryant  ...     0.0                 NaN
21               96.0      Tashawn Bower  ...     0.0                 NaN
22               52.0   Brandon Copeland  ...     0.0                 NaN
23               53.0          Josh Uche  ...     7.0                 NaN
24               92.0       Nick Thurman  ...     0.0                 NaN
25               22.0         Cody Davis  ...     0.0                 NaN
26               52.0       Akeem Spence  ...     0.0                 NaN
27               18.0    Matthew Slater*  ...     0.0                 NaN
28                1.0         Cam Newton  ...     NaN                 NaN
29                NaN       Derek Rivers  ...     4.0                 NaN
30               98.0         Carl Davis  ...     0.0                 NaN
31               80.0  Gunner Olszewski+  ...     0.0                 NaN
32                7.0      Jake Bailey*+  ...     0.0                 NaN
33               10.0       Damiere Byrd  ...     NaN                 NaN
34               37.0      Damien Harris  ...     NaN                 NaN
35                NaN    Xavier Williams  ...     0.0                 NaN
36               43.0       Rashod Berry  ...     0.0                 NaN
37               28.0    Michael Jackson  ...     0.0                 NaN
38               71.0     Michael Onwenu  ...     NaN                 NaN
39               28.0        James White  ...     NaN                 NaN
40               66.0      James Ferentz  ...     NaN                 NaN
41               15.0       N'Keal Harry  ...     NaN                 NaN
42                2.0        Brian Hoyer  ...     NaN                 NaN
43               85.0          Ryan Izzo  ...     NaN                 NaN
44               47.0      Jakob Johnson  ...     NaN                 NaN
45               44.0       Dalton Keene  ...     NaN                 NaN
46               94.0        Isaiah Mack  ...     0.0                 NaN
47               69.0         Shaq Mason  ...     NaN                 NaN
48               16.0      Jakobi Meyers  ...     NaN                 NaN
49               76.0        Isaiah Wynn  ...     NaN                 NaN
50                NaN         Team Total  ...    65.0                 NaN
51                NaN          Opp Total  ...     NaN                 NaN

[52 rows x 23 columns]
     No.             Player   Age  Pos   G  ...   FGM   FGA  Sfty  Pts  Pts/G
0    6.0          Nick Folk  36.0    k  16  ...  26.0  28.0   NaN  108    6.8
1    1.0         Cam Newton  31.0   QB  15  ...   NaN   NaN   NaN   78    5.2
2   34.0       Rex Burkhead  30.0  NaN  10  ...   NaN   NaN   NaN   36    3.6
3   28.0        James White  28.0  NaN  14  ...   NaN   NaN   NaN   18    1.3
4   37.0      Damien Harris  23.0   rb  10  ...   NaN   NaN   NaN   12    1.2
5   15.0       N'Keal Harry  23.0   wr  14  ...   NaN   NaN   NaN   12    0.9
6   32.0     Devin McCourty  33.0   FS  16  ...   NaN   NaN   NaN   12    0.8
7   26.0        Sony Michel  25.0   rb   9  ...   NaN   NaN   NaN   12    1.3
8   80.0  Gunner Olszewski+  24.0   wr  13  ...   NaN   NaN   NaN   12    0.9
9   86.0       Devin Asiasi  23.0   te   9  ...   NaN   NaN   NaN    6    0.7
10  10.0       Damiere Byrd  27.0   WR  16  ...   NaN   NaN   NaN    6    0.4
11  47.0      Jakob Johnson  26.0   FB  16  ...   NaN   NaN   NaN    6    0.4
12  91.0  Deatrich Wise Jr.  26.0   de  16  ...   NaN   NaN   NaN    6    0.4
13  16.0      Jakobi Meyers  24.0   wr  14  ...   NaN   NaN   NaN    2    0.1
14   NaN         Team Total  26.5  NaN  16  ...  26.0  28.0   NaN  326    NaN
15   NaN          Opp Total   NaN  NaN  16  ...  22.0  27.0   NaN  353    NaN

[16 rows x 23 columns]
    Rk  ...                                             Detail
0    1  ...            Cam Newton 4 yard rush (Nick Folk kick)
1    2  ...           Cam Newton 11 yard rush (Nick Folk kick)
2    3  ...           Sony Michel 1 yard rush (Nick Folk kick)
3    4  ...  Devin McCourty 43 yard interception return (Ni...
4    5  ...            Cam Newton 1 yard rush (Nick Folk kick)
5    6  ...  Jakob Johnson 1 yard pass from Cam Newton (run...
6    7  ...            Cam Newton 1 yard rush (Nick Folk kick)
7    8  ...  Rex Burkhead 11 yard pass from Cam Newton (Nic...
8    9  ...          Rex Burkhead 5 yard rush (Nick Folk kick)
9   10  ...   Rex Burkhead 2 yard rush (Nick Folk kick failed)
10  11  ...  Deatrich Wise Jr. 0 yard fumble return (Nick F...
11  12  ...  N'Keal Harry 4 yard pass from Jarrett Stidham ...
12  13  ...                Cam Newton 1 yard rush (run failed)
13  14  ...  Damien Harris 22 yard rush (Jakobi Meyers pass...
14  15  ...            Cam Newton 2 yard rush (Nick Folk kick)
15  16  ...            Cam Newton 5 yard rush (Nick Folk kick)
16  17  ...          Rex Burkhead 1 yard rush (Nick Folk kick)
17  18  ...            Cam Newton 1 yard rush (Nick Folk kick)
18  19  ...  Rex Burkhead 7 yard pass from Cam Newton (Nick...
19  20  ...  Rex Burkhead 24 yard pass from Jakobi Meyers (...
20  21  ...            Cam Newton 4 yard rush (Nick Folk kick)
21  22  ...         Damien Harris 9 yard rush (Nick Folk kick)
22  23  ...  Damiere Byrd 42 yard pass from Cam Newton (Nic...
23  24  ...           James White 7 yard rush (Nick Folk kick)
24  25  ...           James White 1 yard rush (Nick Folk kick)
25  26  ...            Cam Newton 1 yard rush (Nick Folk kick)
26  27  ...  Gunner Olszewski 70 yard punt return (Nick Fol...
27  28  ...            Cam Newton 2 yard rush (Nick Folk kick)
28  29  ...  Devin McCourty 44 yard blocked field goal retu...
29  30  ...  N'Keal Harry 5 yard pass from Cam Newton (Nick...
30  31  ...  Gunner Olszewski 38 yard pass from Jarrett Sti...
31  32  ...     Cam Newton 9 yard rush (Nick Folk kick failed)
32  33  ...  James White 7 yard pass from Cam Newton (Nick ...
33  34  ...  Cam Newton 19 yard pass from Jakobi Meyers (Ni...
34  35  ...  Devin Asiasi 26 yard pass from Cam Newton (Nic...
35  36  ...  Sony Michel 31 yard pass from Cam Newton (Nick...

[36 rows x 9 columns]
    Rk  ...                                             Detail
0    1  ...   Jordan Howard 1 yard rush (Ryan Fitzpatrick run)
1    2  ...  Tyler Lockett 4 yard pass from Russell Wilson ...
2    3  ...  D.K. Metcalf 54 yard pass from Russell Wilson ...
3    4  ...  David Moore 38 yard pass from Russell Wilson (...
4    5  ...  Freddie Swain 21 yard pass from Russell Wilson...
5    6  ...  Chris Carson 18 yard pass from Russell Wilson ...
6    7  ...  Foster Moreau 1 yard pass from Derek Carr (Dan...
7    8  ...  Hunter Renfrow 13 yard pass from Derek Carr (D...
8    9  ...  Tyreek Hill 6 yard pass from Patrick Mahomes (...
9   10  ...  Mecole Hardman 6 yard pass from Patrick Mahome...
10  11  ...  Tyrann Mathieu 25 yard interception return (Ha...
11  12  ...        Jeff Wilson 3 yard rush (Robbie Gould kick)
12  13  ...  Kyle Juszczyk 4 yard rush (Robbie Gould kick f...
13  14  ...       Jeff Wilson 16 yard rush (Robbie Gould kick)
14  15  ...        Jeff Wilson 7 yard rush (Robbie Gould kick)
15  16  ...            Zack Moss 8 yard rush (Tyler Bass kick)
16  17  ...            Zack Moss 4 yard rush (Tyler Bass kick)
17  18  ...           Josh Allen 2 yard rush (Tyler Bass kick)
18  19  ...  Breshad Perriman 50 yard pass from Joe Flacco ...
19  20  ...  Jamison Crowder 20 yard pass from Joe Flacco (...
20  21  ...  Breshad Perriman 15 yard pass from Joe Flacco ...
21  22  ...  Willie Snead 6 yard pass from Lamar Jackson (J...
22  23  ...  Willie Snead 18 yard pass from Lamar Jackson (...
23  24  ...  Randall Cobb 3 yard pass from Deshaun Watson (...
24  25  ...  Deshaun Watson 4 yard rush (Ka'imi Fairbairn k...
25  26  ...  Keke Coutee 6 yard pass from Deshaun Watson (K...
26  27  ...      Kenyan Drake 1 yard rush (Zane Gonzalez kick)
27  28  ...      Kenyan Drake 1 yard rush (Zane Gonzalez kick)
28  29  ...             Jared Goff 1 yard rush (Matt Gay kick)
29  30  ...  Kenny Young 79 yard interception return (Matt ...
30  31  ...  Cooper Kupp 2 yard pass from Jared Goff (Matt ...
31  32  ...      Salvon Ahmed 1 yard rush (Jason Sanders kick)
32  33  ...  Tua Tagovailoa 3 yard rush (Salvon Ahmed pass ...
33  34  ...    Tua Tagovailoa 1 yard rush (Jason Sanders kick)
34  35  ...            Zack Moss 5 yard rush (Tyler Bass kick)
35  36  ...  Lee Smith 4 yard pass from Josh Allen (Tyler B...
36  37  ...  Stefon Diggs 50 yard pass from Josh Allen (Tyl...
37  38  ...  Stefon Diggs 18 yard pass from Josh Allen (Tyl...
38  39  ...  Stefon Diggs 8 yard pass from Josh Allen (Tyle...
39  40  ...  Chris Herndon 21 yard pass from Sam Darnold (C...
40  41  ...     Josh Adams 1 yard rush (Chase McLaughlin kick)

[41 rows x 9 columns]
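
To come back to the slicing mentioned at the top of this answer: a rough sketch of pulling just the date and opponent out of the schedule table might look like the following. The small DataFrame is a hand-made stand-in for the parsed schedule table, the opponent names are placeholders, and the column names "Date" and "Opp" are assumptions about the parsed header, so check them against df.columns on the real table.

```python
import pandas as pd

# Hypothetical stand-in for the parsed schedule table.
# "Date" and "Opp" are assumed column names; opponents are placeholders.
schedule = pd.DataFrame(
    {
        "Week": [4, 5, 6],
        "Day": ["Mon", None, "Sun"],
        "Date": ["October 5", None, "October 18"],
        "Opp": ["Team A", "Bye Week", "Team B"],
    }
)

matches = (
    schedule[["Date", "Opp"]]          # keep only the two columns of interest
    .dropna(subset=["Date"])           # drop rows without a date (the bye week)
    .rename(columns={"Date": "date", "Opp": "opposition"})
    .reset_index(drop=True)
)

print(matches)
```

The rename step only mirrors the "opposition" and "date" keys used in the question; it can be dropped if the original column names are fine.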
chitown88
  • It's not the case with this table but on this page the tables don't all load when accessed through pandas, so I think the page needs to be rendered to access all tables. Additionally, I was taking from this [post](https://stackoverflow.com/questions/35261899/selenium-scraping-with-multiple-urls/35262031#35262031) and I hope to do something similar where the code goes through multiple team pages and I can create one df to work with. Maybe I'm wrong but I had thought that it would be a messier process to do the whole thing with pandas. I will look into that as an option. – imtrying Mar 02 '21 at 15:48
  • Nope. It doesn't need to be rendered to access all the tables. Those other tables are within the comments of the html. It's just a matter of using beautiful soup to pull out the comments and parse those tables. I'll update the code in a moment to show you that you can pull all the tables out. – chitown88 Mar 02 '21 at 16:51
  • As far as iterating through URLs goes, again, just use pandas to get each table from each URL. You can then append them all together into one dataframe. – chitown88 Mar 02 '21 at 22:16
  • Thank you, I never considered that, I still have a lot to discover here! I really appreciate it! – imtrying Mar 03 '21 at 01:25
  • No worries. I had to learn all that stuff too. When I first started pulling data from the reference.com sites, I was doing the same and had no idea how to see or pull out the comments. – chitown88 Mar 03 '21 at 08:52
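
For the "append them all together" step discussed in the comments, a minimal sketch could look like this. The helper name combine_team_tables, the "team" column, and the "xyz" team code are all hypothetical, and the per-team frames are built by hand here so the example runs offline; in practice each frame would presumably come from pd.read_html on that team's page.

```python
import pandas as pd


def combine_team_tables(frames_by_team):
    """Stack one DataFrame per team into a single frame with a 'team' column."""
    labelled = [
        frame.assign(team=team)  # tag each row with the team it came from
        for team, frame in frames_by_team.items()
    ]
    return pd.concat(labelled, ignore_index=True)


# Hypothetical per-team frames standing in for tables parsed from each URL.
frames_by_team = {
    "nwe": pd.DataFrame({"Week": [1, 2], "Date": ["September 13", "September 20"]}),
    "xyz": pd.DataFrame({"Week": [1], "Date": ["September 13"]}),
}

combined = combine_team_tables(frames_by_team)
print(combined)
```

Tagging each frame before concatenating keeps track of which page a row came from once everything sits in one dataframe.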