I am currently learning Python and pandas. I am creating a DataFrame with a lot of repetition in the column calculations, so I wrote a loop that runs the calculations on a selection of columns and creates the corresponding new columns. The first time I run the code it works as intended, but the script has to be run repeatedly as new data arrives. On the second run, the loop duplicates the new columns instead of reusing the columns created previously.
I'm sure I'm missing something simple, but I can't find anything in the SO archives that explains how to make the loop reuse the existing named columns instead of duplicating them.
import pandas as pd

result = pd.read_csv('/Users/Documents/Base.csv')

smas = [100, 50]  # rolling window sizes
headers_to_calc = ['nupl', 'funding rate']  # columns to average
h_count = len(headers_to_calc)
s_count = len(smas)

for h in headers_to_calc:
    for s in smas:
        sma = 'sma'
        # one new column per (header, window) pair
        result[h, sma, s] = result[h].rolling(s).mean()
        if s == s_count:
            break
    if h == h_count:
        break

result
result.to_csv('/Users/Documents/Base.csv')
This creates the columns with the correct rolling averages (100 and 50) for both the nupl and funding rate columns: nupl sma 100, nupl sma 50, funding rate sma 100 and funding rate sma 50.
When the script is run again, however, all of the above columns are duplicated rather than being recalculated in the existing columns.
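The behaviour can be reproduced without the real file. The sketch below is only an illustration under assumptions: the three-row frame is made up, an in-memory buffer stands in for Base.csv, and the helper names add_sma and round_trip are hypothetical. It uses the same style of key as the loop (result[h, sma, s]), which pandas stores as a tuple label, and saves with the default to_csv settings.

```python
import io

import pandas as pd


def add_sma(df):
    """Add one rolling-mean column using the same tuple-style key as the loop."""
    # The key below is the tuple ('nupl', 'sma', 2), not a single string.
    df["nupl", "sma", 2] = df["nupl"].rolling(2).mean()
    return df


def round_trip(df):
    """Write the frame to CSV text and read it back, as a second run would."""
    buffer = io.StringIO()  # stand-in for Base.csv (assumption)
    df.to_csv(buffer)       # default settings, so the index is written too
    buffer.seek(0)
    return pd.read_csv(buffer)


# Made-up data in place of the real file (assumption).
first = add_sma(pd.DataFrame({"nupl": [1.0, 2.0, 3.0]}))
second = add_sma(round_trip(first))

first_labels = list(first.columns)
second_labels = list(second.columns)
print(first_labels)
print(second_labels)
```

Comparing the two printed lists shows which labels come back from the file as text and which are added again on the second pass.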
I think I may need an if statement that skips creating a column when it already exists, or perhaps a step in the loop that immediately merges the duplicate columns based on their nearly identical titles.
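A minimal sketch of the "use the existing column" idea, under stated assumptions: each derived column gets a single string name such as "nupl sma 100", so assigning to it on a later run overwrites the existing column rather than adding a new one. The helper name add_smas, the small made-up frame, the short window sizes and the index=False argument are all illustrative choices, not part of the original script.

```python
import io

import pandas as pd


def add_smas(df, headers, windows):
    """Add or overwrite one rolling-mean column per (header, window) pair."""
    for h in headers:
        for s in windows:
            # A plain string label reads back from CSV unchanged, so a later
            # run assigns to the same column instead of creating another one.
            df[f"{h} sma {s}"] = df[h].rolling(s).mean()
    return df


# Made-up data standing in for Base.csv (assumption).
df = pd.DataFrame(
    {"nupl": [1.0, 2.0, 3.0, 4.0], "funding rate": [0.1, 0.2, 0.3, 0.4]}
)

# First run: compute and save. index=False (an assumption) keeps the row
# index from being written out as an extra column.
df = add_smas(df, ["nupl", "funding rate"], [2, 3])
buffer = io.StringIO()  # in-memory stand-in for the CSV file
df.to_csv(buffer, index=False)

# Second run: reload the saved data and recompute the same columns.
buffer.seek(0)
reloaded = add_smas(pd.read_csv(buffer), ["nupl", "funding rate"], [2, 3])
columns_after_second_run = list(reloaded.columns)
print(columns_after_second_run)
```

With string names no explicit existence check is needed in this sketch, because assigning to a label that is already present replaces that column's values.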