What I am trying to do is export this table as a CSV, for all 7 pages of 100 rows each, from within a Python script, but I am running into the error shown below the script.

"http://www.nhl.com/stats/player?aggregate=1&gameType=2&report=points&pos=S&reportType=game&startDate=2017-10-19&endDate=2017-10-29&filter=gamesPlayed,gte,1&sort=points,goals"

import pandas as pd

dfs = pd.read_html('http://www.nhl.com/stats/player?aggregate=1&gameType=2&report=skatersummary&pos=S&reportType=game&startDate=2017-10-19&endDate=2017-10-29&filter=gamesPlayed,gte,1&sort=points,goals,assists')
df = pd.concat(dfs, ignore_index=True)
df.to_csv("1019_1029.csv", index=False)
print(df)

ValueError: No tables found matching pattern '.+'

Michael T Johnson
  • From the code you should get an error about an undefined `df`, because you do not assign it before use. Are you using Jupyter Notebook to edit and run your code? Keep in mind that it stores global state until you do a "kernel restart". – Timofey Chernousov Oct 31 '17 at 02:22
  • I did not mean to comment that out. I was trying something and left it in by accident. I just use the Python shell. – Michael T Johnson Oct 31 '17 at 12:49

1 Answer

This site won't work with pandas.read_html. According to the pandas documentation:

This function searches for <table> elements and only for <tr> and <th> rows and <td> elements within each <tr> or <th> element in the table. <td> stands for “table data”.

But the site you are trying to parse uses <div> elements to structure the data into a table (see the source code of the referenced page).

Hence, you will need a custom parsing solution to read data from this site.
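
As a rough illustration of what a custom parser could look like, here is a minimal sketch that uses only the standard library. It assumes you already have the rendered HTML as a string. The class names `grid`, `row` and `cell`, the sample markup, and the helper names `DivGridParser` and `div_grid_to_csv` are all hypothetical; you would need to replace the class names with whatever the real page uses.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical stand-in for the rendered page: rows and cells are <div>s
# marked with assumed class names ("row", "cell") instead of <tr>/<td>.
SAMPLE_HTML = """
<div class="grid">
  <div class="row"><div class="cell">Player</div><div class="cell">GP</div><div class="cell">P</div></div>
  <div class="row"><div class="cell">Player A</div><div class="cell">5</div><div class="cell">9</div></div>
  <div class="row"><div class="cell">Player B</div><div class="cell">4</div><div class="cell">7</div></div>
</div>
"""


class DivGridParser(HTMLParser):
    """Collect the text of "cell" divs, grouped by their enclosing "row" div."""

    def __init__(self, row_class="row", cell_class="cell"):
        super().__init__()
        self.row_class = row_class
        self.cell_class = cell_class
        self.rows = []
        self._stack = []  # role of every open <div>: "row", "cell" or None
        self._text = []   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.row_class in classes:
            self.rows.append([])
            self._stack.append("row")
        elif self.cell_class in classes and "row" in self._stack:
            self._text = []
            self._stack.append("cell")
        else:
            self._stack.append(None)

    def handle_data(self, data):
        # Only keep text that sits somewhere inside an open cell.
        if "cell" in self._stack:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "div" or not self._stack:
            return
        if self._stack.pop() == "cell":
            self.rows[-1].append("".join(self._text).strip())


def div_grid_to_csv(html):
    """Return the div-based grid in `html` as CSV text."""
    parser = DivGridParser()
    parser.feed(html)
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(parser.rows)
    return buffer.getvalue()


csv_text = div_grid_to_csv(SAMPLE_HTML)
print(csv_text)
```

To cover several pages you would call `div_grid_to_csv` (or reuse `parser.rows`) once per page and append the rows to the same file, skipping the repeated header row.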

Timofey Chernousov
  • Using the class names you could convert this html into `<table>`, `<tr>`, `<td>` etc. You could use an HTML parser library such as BeautifulSoup to convert it, and then pass the output to `pandas.read_html`. https://stackoverflow.com/questions/5289189/how-to-change-tag-name-with-beautifulsoup – Håken Lid Oct 31 '17 at 13:10
  • How would that look in this scenario exactly, Håken? – Michael T Johnson Oct 31 '17 at 18:33
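
The tag-renaming idea from the comments could be sketched roughly as follows. This is only an illustration under assumptions: the class names in `TAG_FOR_CLASS`, the sample markup, and the helper `divs_to_table_html` are hypothetical and would need to match the real page, and the HTML is assumed to be available already as a string. It needs `beautifulsoup4` plus a parser backend that `pandas.read_html` can use (for example `lxml`).

```python
import io

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the rendered page; the class names are assumptions.
SAMPLE_HTML = """
<div class="grid">
  <div class="row"><div class="header-cell">Player</div><div class="header-cell">GP</div><div class="header-cell">P</div></div>
  <div class="row"><div class="cell">Player A</div><div class="cell">5</div><div class="cell">9</div></div>
  <div class="row"><div class="cell">Player B</div><div class="cell">4</div><div class="cell">7</div></div>
</div>
"""

# Assumed mapping from CSS class to the table tag it should become.
TAG_FOR_CLASS = {"grid": "table", "row": "tr", "header-cell": "th", "cell": "td"}


def divs_to_table_html(html):
    """Rename class-marked <div>s to table tags and return the new HTML."""
    soup = BeautifulSoup(html, "html.parser")
    for div in soup.find_all("div"):
        for css_class in div.get("class", []):
            if css_class in TAG_FOR_CLASS:
                div.name = TAG_FOR_CLASS[css_class]  # rename the tag in place
                break
    return str(soup)


table_html = divs_to_table_html(SAMPLE_HTML)

# Wrap the string in StringIO so read_html treats it as file-like input.
dfs = pd.read_html(io.StringIO(table_html))
df = pd.concat(dfs, ignore_index=True)
print(df.to_csv(index=False))
```

Once the markup contains real `<table>` elements, the rest of the original script (`pd.concat` followed by `df.to_csv("1019_1029.csv", index=False)`) should work unchanged.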