I have two dataframes:
df1 is a reference table with a list of individual codes and their corresponding values.
df2 is an excerpt from a larger dataset, in which one of the columns contains several of these codes per row as a comma-separated string. That column also contains other values I want to ignore, e.g. blanks and 'Not Applicable'.
I need to split out each individual code from df2 and look up its corresponding value in the reference table df1. I then want to add a column to df2 holding the maximum value found across each row's string of codes.
import pandas as pd
df1 = [['H302', 18],
       ['H312', 17],
       ['H315', 16],
       ['H316', 15],
       ['H319', 14],
       ['H320', 13],
       ['H332', 12],
       ['H304', 11]]
df1 = pd.DataFrame(df1, columns=['Code', 'Value'])
df2 = [['H302,H304'],
       ['H332,H319,H312,H320,H316,H315,H302,H304'],
       ['H315,H312,H316'],
       ['H320,H332,H316,H315,H304,H302,H312'],
       ['H315,H319,H312,H316,H332'],
       ['H312'],
       ['Not Applicable'],
       ['']]
df2 = pd.DataFrame(df2, columns=['Code'])
I had previously used the following:
df3 = []
for i in range(len(df2)):
    df3.append(df2['Code'][i].split(","))
max_values = []
for i in range(len(df3)):
    for j in range(len(df3[i])):
        for index in range(len(df1)):
            if df1['Code'][index] == df3[i][j]:
                df3[i][j] = df1['Value'][index]
    max_values.append(max(df3[i]))
df2["Max Value"] = max_values
However, I understand that the .append function is being removed, and when I run this code I get the following error: "'>' not supported between instances of 'numpy.ndarray' and 'str'"
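A minimal sketch of one loop-free way to get the intended result, assuming a pandas version that provides Series.explode (0.25 or later). The names `lookup` and `values` are illustrative only and not part of the original code. As far as the deprecation goes, it concerns pandas' own DataFrame.append and Series.append; Python's built-in list.append is not affected. This sketch also keeps strings and numbers from ever being compared with each other, since anything that is not a known code becomes NaN before the maximum is taken.

```python
import pandas as pd

# Same sample data as in the question.
df1 = pd.DataFrame(
    [['H302', 18], ['H312', 17], ['H315', 16], ['H316', 15],
     ['H319', 14], ['H320', 13], ['H332', 12], ['H304', 11]],
    columns=['Code', 'Value'],
)
df2 = pd.DataFrame(
    [['H302,H304'],
     ['H332,H319,H312,H320,H316,H315,H302,H304'],
     ['H315,H312,H316'],
     ['H320,H332,H316,H315,H304,H302,H312'],
     ['H315,H319,H312,H316,H332'],
     ['H312'],
     ['Not Applicable'],
     ['']],
    columns=['Code'],
)

# Build a Code -> Value lookup Series from the reference table.
lookup = df1.set_index('Code')['Value']

# Split each string into a list, give every code its own row (the original
# row index is repeated), strip stray whitespace, and map codes to values.
# Entries that are not in the lookup ('Not Applicable', '') become NaN.
values = df2['Code'].str.split(',').explode().str.strip().map(lookup)

# Take the maximum per original row; rows with no valid code stay NaN.
df2['Max Value'] = values.groupby(level=0).max()

print(df2)
```

With the sample data, the first six rows should come out as 18, 18, 17, 18, 17 and 17, and the 'Not Applicable' and blank rows as NaN. Because of the NaN entries the new column is float rather than integer.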