The method below is time- and CPU-intensive, and there has to be a better way. Can someone help me vectorize the following code so that it doesn't use a loop? I have a DataFrame where each subject has multiple rows, and each row has a value. I want to add a column that holds the highest value for each subject (so it will be the same on every row for that subject).
import pandas as pd
import numpy as np
from numpy import nan
compare_table = pd.DataFrame({
    'id': [1, 1, 1, 2, 2, 3, 3, 3],
    'day#': [1, 2, 3, 1, 2, 1, 2, 3],
    'random#': [2, 5, 1, 6, 4, 5, 9, 3],
    'highest_random#': [nan, nan, nan, nan, nan, nan, nan, nan]},
    columns=['id', 'day#', 'random#', 'highest_random#'])
for element in list(compare_table['id'].unique()):
    highest_random = max(compare_table.loc[compare_table.loc[:,'id']==element, 'random#'])
    compare_table.loc[compare_table.loc[:,'id']==element, 'highest_random#']= highest_random
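For comparison, here is a minimal sketch of what a loop-free version of the same operation might look like, assuming pandas' `groupby(...).transform('max')` is acceptable for this case. It rebuilds the same sample data without the placeholder NaN column, which is an assumption made to keep the example self-contained; `transform` returns one value per original row, so the result can be assigned directly as a new column.

```python
import pandas as pd

# Same sample data as in the question, minus the placeholder NaN column
compare_table = pd.DataFrame({
    'id': [1, 1, 1, 2, 2, 3, 3, 3],
    'day#': [1, 2, 3, 1, 2, 1, 2, 3],
    'random#': [2, 5, 1, 6, 4, 5, 9, 3],
})

# Group rows by subject id, take the max of 'random#' within each group,
# and broadcast that max back onto every row of the group.
compare_table['highest_random#'] = (
    compare_table.groupby('id')['random#'].transform('max')
)

print(compare_table)
```

On this sample data, the new column should hold 5 for every row with id 1, 6 for id 2, and 9 for id 3, matching what the loop produces (though as integers rather than floats, since no NaN placeholder is involved).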