Working with python 3.7 and pandas. I have two dataframes, one has information on a list of samples, 30000 rows and a log of the error recorded along with some information about the sample. The second dataframe is a dictionary of error number, contains more information about that error. There are 5000 rows The first dataframe has the following format:
Sample_ID Error_ID Material Mass (kg) Length (cm)
1 0004 steel 54 132
2 0002 wood 66 103
3 0003 aluminium 51 166
4 0002 plastic 35 144
5 0004 aluminium 41 156
6 0002 granite 200 199
The second dataframe:
Error_ID Info
0001 error 1
0002 error 2
0003 error 3
0004 error 4
I want to know if there is a straightforward way of creating a dataframe which is the same as the first but then appends the relevant error information to each row. Which doesn't require a loop through every element of dataframe 1.
Desired output:
Sample_ID Error_ID Material Mass (kg) Length (cm) Error_ID Info
1 0004 steel 54 132 0004 error 4
2 0002 wood 66 103 0002 error 2
3 0003 aluminium 51 166 0003 error 3
4 0002 plastic 35 144 0002 error 2
5 0004 aluminium 41 156 0004 error 4
6 0002 granite 200 199 0002 error 2
Here is the code to create the first 2 dataframes:
data_1 = [['1', '0004', 'steel', 54, 132], ['2', '0002', 'wood', 66, 103], ['3', '0003', 'aluminium', 51, 166], ['4', '0002', 'plastic', 35, 144], ['5', '0004', 'aluminium', 41, 156], ['6', '0002', 'granite', 200, 199]]
df_1 = pd.DataFrame(data_1, columns=['Sample_ID', 'Error_ID', 'Material', 'Mass (kg)', 'Length (cm)'])
data_2 = [['0001', 'error 1'], ['0002', 'error 2'], ['0003', 'error 3'], ['0004', 'error 4']]
df_2 = pd.DataFrame(data_2, columns=['Error_ID', 'Info'])
Thank you