I am trying to fix the code I've posted below. It works properly and my CSV file exports as intended, but I get a warning in my terminal: "The frame.append method is deprecated and will be removed from pandas in a future version. Use pandas.concat instead."

I reviewed the examples in the documentation and another thread on Stack Overflow, but I am having trouble getting concat to work. It won't accept a DataFrame object, so I tried converting to a list, but I can't get it to behave the way my current code does. Could anyone help me replace this bit of code with the more future-proof concat method?

import pandas as pd

result = pd.DataFrame()

for i in range (2013,2023):
    year = str(i)
    url = 'https://www.hockey-reference.com/leagues/NHL_'+year+'_skaters.html'

    df = pd.read_html(url,header=1)[0]
    df['year'] = year
    result = result.append(df, sort=False)

result = result[~result['Age'].str.contains("Age")]
result = result.reset_index(drop=True)

result.to_csv('hdb_data.csv',index=False)

1 Answer

Actually, DataFrame.append has been removed in pandas 2.0.0 (see here and GH35407). Collect the per-year frames in a list and call pandas.concat once:

import pandas as pd

list_dfs = []

for year in range(2013, 2023):
    url = f"https://www.hockey-reference.com/leagues/NHL_{year}_skaters.html"
    # Tag each season's table with its year and collect it in the list
    list_dfs.append(pd.read_html(url, header=1)[0].assign(year=year))

# Concatenate once, then drop the header rows repeated inside the table
result = (pd.concat(list_dfs).loc[lambda x: ~x["Age"].str.contains("Age")]
              .reset_index(drop=True))

result.to_csv("hdb_data.csv", index=False)

Output:

print(result)

         Rk             Player Age   Tm Pos  ...  HIT FOW FOL    FO%  year
0         1  Justin Abdelkader  25  DET  LW  ...  120  65  60   52.0  2013
1         2          Luke Adam  22  BUF  LW  ...    3   1   0  100.0  2013
2         3        Craig Adams  35  PIT  RW  ...  107  51  47   52.0  2013
...     ...                ...  ..  ...  ..  ...  ...  ..  ..    ...   ...
10370  1002          Artem Zub  26  OTT   D  ...  155   0   0    NaN  2022
10371  1003    Mats Zuccarello  34  MIN  LW  ...   36  21  34   38.2  2022
10372  1004       Jason Zucker  30  PIT  LW  ...   66   4  11   26.7  2022

[10373 rows x 29 columns]
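
As for concat not accepting a data frame: pd.concat expects an iterable of frames (a list, for example), not a single DataFrame. Here is a minimal offline sketch of the same pattern, assuming small hand-made frames in place of the scraped tables. The frames_by_year dict, the extra frame and the sample values are hypothetical and only there for illustration.

```python
import pandas as pd

# Hypothetical stand-ins for the tables returned by pd.read_html.
# The rows holding "Player"/"Age" mimic header rows repeated in the page.
frames_by_year = {
    2013: pd.DataFrame({"Player": ["A", "Player", "B"], "Age": ["25", "Age", "22"]}),
    2014: pd.DataFrame({"Player": ["C", "Player"], "Age": ["30", "Age"]}),
}

# Passing a single DataFrame raises TypeError: concat wants an iterable of frames.
try:
    pd.concat(frames_by_year[2013])
    single_frame_accepted = True
except TypeError:
    single_frame_accepted = False

# Collect the per-year frames in a list, then concatenate once.
list_dfs = [df.assign(year=year) for year, df in frames_by_year.items()]
result = pd.concat(list_dfs, ignore_index=True)

# Drop the repeated header rows and renumber the index.
result = result[~result["Age"].str.contains("Age")].reset_index(drop=True)

# One-off replacement for result.append(extra): wrap both frames in a list.
extra = pd.DataFrame({"Player": ["D"], "Age": ["28"], "year": [2015]})
appended = pd.concat([result, extra], ignore_index=True)
```

Building the list first and concatenating once avoids copying the growing result on every loop iteration, which is what repeated append (or repeated concat inside the loop) would do.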
Timeless