I would like to optimize some code that uses two nested for loops.
I have the following dataframes:
import pandas as pd
import numpy as np
df_Original = pd.DataFrame({'System Model': ['System 100', 'System 101', 'System 108', 'System 200'],
                            'ID Sensor': [54, 55, 75, 100],
                            'Sensor Type': ['Analog', 'Digital', 'Analog', 'Digital']})
df_Second = pd.DataFrame({'ID Sensor': [54, 2, 55, 100],
                          'Sensor_Max': [1024, 1, 1, 1],
                          'Sensor_Min': [0, 0, 0, 0]})
I need to create a new column in df_Second indicating which 'System Model' each 'ID Sensor' belongs to. I implemented the following code:
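# Initialize the new column (object dtype, so it can hold strings later)
df_Second['new_columns_System_Model'] = pd.Series(np.nan, index=df_Second.index, dtype=object)
# Compare every row of df_Original with every row of df_Second
for i in range(len(df_Original)):
    for j in range(len(df_Second)):
        # Condition: the sensor IDs match
        if df_Original['ID Sensor'].iloc[i] == df_Second['ID Sensor'].iloc[j]:
            # Fill the new column in a single indexing step (no chained assignment)
            df_Second.loc[df_Second.index[j], 'new_columns_System_Model'] = df_Original['System Model'].iloc[i]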
The code works, and the output is as desired:
ID Sensor  Sensor_Max  Sensor_Min  new_columns_System_Model
       54        1024           0                System 100
        2           1           0                       NaN
       55           1           0                System 101
      100           1           0                System 200
However, the nested loops compare every row of df_Original with every row of df_Second, so the run time grows with the product of the two lengths and becomes slow for larger dataframes. How can I make this code more efficient? Thank you.
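
For reference, the lookup described above can probably be written without explicit loops. The following is a minimal sketch, not a benchmarked solution, and it assumes that each 'ID Sensor' value appears at most once in df_Original. The names lookup and merged are introduced here only for illustration. It builds a Series indexed by 'ID Sensor' and passes it to Series.map, then shows a left merge as an alternative.

```python
import pandas as pd

df_Original = pd.DataFrame({
    'System Model': ['System 100', 'System 101', 'System 108', 'System 200'],
    'ID Sensor': [54, 55, 75, 100],
    'Sensor Type': ['Analog', 'Digital', 'Analog', 'Digital'],
})
df_Second = pd.DataFrame({
    'ID Sensor': [54, 2, 55, 100],
    'Sensor_Max': [1024, 1, 1, 1],
    'Sensor_Min': [0, 0, 0, 0],
})

# Lookup table (hypothetical name): index = 'ID Sensor', values = 'System Model'.
# Assumption: 'ID Sensor' is unique in df_Original; duplicates would have to be
# resolved first (for example with drop_duplicates).
lookup = df_Original.set_index('ID Sensor')['System Model']

# Vectorized lookup: every ID in df_Second is translated through the table;
# IDs that are missing from df_Original (here: 2) become NaN.
df_Second['new_columns_System_Model'] = df_Second['ID Sensor'].map(lookup)

# Alternative: a left merge keeps every row of df_Second in its original order
# and attaches the matching 'System Model' where one exists.
merged = df_Second[['ID Sensor', 'Sensor_Max', 'Sensor_Min']].merge(
    df_Original[['ID Sensor', 'System Model']],
    on='ID Sensor',
    how='left',
)

print(df_Second)
print(merged)
```

With map the result lands directly in df_Second, while merge returns a new dataframe and names the column 'System Model', so it would need a rename to match the desired output.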