I need to read a list of HTML files into pandas DataFrames.
- Each HTML file contains multiple tables, which pd.read_html returns as a list of DataFrames (I have used pd.concat to combine them).
- Each HTML file name contains a string which I would like to add as a column.
import glob
import pandas as pd

# Collect the matching file names into a list
files = glob.glob('monthly_*.html')
# Zip the dfs with the desired string segment
zipped_dfs = [zip(pd.concat(pd.read_html(file)), file.split('_')[1]) for file in files]
I am having trouble unpacking the zipped list of (df, product) pairs.
dfs = []
# Loop through the list of zips
for _zip in zipped_dfs:
    # Unpack the zip
    for _df, product in _zip:
        # Add the product string as a new column
        _df['Product'] = product
        dfs.append(_df)
However, I am getting the error: TypeError: 'str' object does not support item assignment
Could someone explain the best way to add the new column?
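
For context on the error: zip(df, some_string) iterates over the DataFrame's column labels and over the characters of the string, so each _df in the inner loop is a column label (a str) rather than a DataFrame. One possible way around this, sketched below, is to skip zip and attach the column as each file is read. The helper names (product_from_name, tag_tables, load_monthly) are hypothetical and not part of pandas, and the sketch assumes the wanted string is the second underscore-separated piece of the file name with the extension removed, which differs slightly from file.split('_')[1] when the name has only one underscore.

```python
import glob
import os

import pandas as pd


def product_from_name(path):
    """Return the second '_'-separated piece of the file name, without extension.

    Assumption: names look like 'monthly_<product>.html' or
    'monthly_<product>_<more>.html'.
    """
    stem = os.path.splitext(os.path.basename(path))[0]
    return stem.split('_')[1]


def tag_tables(tables, product):
    """Concatenate one file's tables and label every row with the product."""
    combined = pd.concat(tables, ignore_index=True)
    # assign returns a new DataFrame with the extra column
    return combined.assign(Product=product)


def load_monthly(pattern='monthly_*.html'):
    """Return one DataFrame per matching HTML file, each with a Product column."""
    return [
        tag_tables(pd.read_html(path), product_from_name(path))
        for path in glob.glob(pattern)
    ]
```

Calling dfs = load_monthly() would give a list comparable to the dfs list built in the loop above, and pd.concat(dfs, ignore_index=True) could then combine everything into a single DataFrame if that is what is needed.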