I am trying to append rows to the existing DataFrame called dfTest from inside the second elif branch of the function below. For some reason I get:
local variable 'dfTest' referenced before assignment
What would be the correct way to achieve this?
The objective is to add a row with placeholder values (here, qty 0) for the previous day whenever dfTest has no row for that item one day before the current row.
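For context, Python raises this error (an `UnboundLocalError`) when a function assigns to a name that it also reads before the assignment has run. The minimal sketch below reproduces that behaviour. It assumes the failing code reassigned `dfTest` inside the function, which the code in this question does not show, and the names `df`, `broken_append` and `working_append` are hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for dfTest, used only for illustration.
df = pd.DataFrame({'qty': [1, 2]})

def broken_append():
    # Assigning to `df` anywhere in the body makes `df` local to the function,
    # so the read on the right-hand side happens before a local value exists.
    df = pd.concat([df, pd.DataFrame({'qty': [0]})], ignore_index=True)
    return df

def working_append(frame):
    # Passing the frame in and returning a new one avoids the scoping problem.
    return pd.concat([frame, pd.DataFrame({'qty': [0]})], ignore_index=True)

try:
    broken_append()
except UnboundLocalError as err:
    error_name = type(err).__name__

# Rebinding at module level is fine because no function scope is involved.
df = working_append(df)
```

Declaring `global df` inside the function would also silence the error, but returning a new frame is usually easier to reason about.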
import pandas as pd

d = {
    'itemID': [2, 2, 2, 2, 2, 2],
    'orderedDate': [pd.to_datetime('12/1/21'), pd.to_datetime('12/2/21'), pd.to_datetime('12/3/21'), pd.to_datetime('12/4/21'), pd.to_datetime('12/6/21'), pd.to_datetime('12/7/21')],
    'qty': [1, 4, 1, 1, 2, 2]
}
dfTest = pd.DataFrame(d)

def addRowIfNotAdayBefore(row):
    startDate = pd.to_datetime('12/1/21')
    rowMinusOneDay = row['orderedDate'] - pd.Timedelta(days=1)
    # All rows for the same item as the current row
    itemSeries = dfTest.loc[dfTest['itemID'] == row['itemID']]
    # True where this item already has a row dated one day earlier
    exists = itemSeries['orderedDate'] == rowMinusOneDay
    data = [0, 0, 0]
    if row['orderedDate'] == startDate:
        print('This is the first row!')
    elif exists.any():
        print('1st elif')
    elif exists.any() == False:
        print('2nd elif')
        # Row for the missing previous day
        row = {'itemID': row['itemID'], 'orderedDate': rowMinusOneDay, 'qty': 0}
        # Note: dfNewRows is not defined anywhere in this snippet
        print(dfNewRows)
        return dfNewRows

dfTest.apply(addRowIfNotAdayBefore, axis=1)
dfTest
Current output of dfTest:
itemID | orderedDate | qty |
---|---|---|
2 | 12/1/21 | 1 |
2 | 12/2/21 | 4 |
2 | 12/3/21 | 1 |
2 | 12/4/21 | 1 |
2 | 12/6/21 | 2 |
2 | 12/7/21 | 2 |
Output of dfTest should be like this:
itemID | orderedDate | qty |
---|---|---|
2 | 12/1/21 | 1 |
2 | 12/2/21 | 4 |
2 | 12/3/21 | 1 |
2 | 12/4/21 | 1 |
2 | 12/5/21 | 0 |
2 | 12/6/21 | 2 |
2 | 12/7/21 | 2 |
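One possible way to get this output without modifying dfTest from inside `apply` is to compute the missing previous-day rows in one pass and concatenate them. This is a minimal sketch, not necessarily the only or best approach. The helper name `add_missing_previous_days` is hypothetical, and it assumes each (itemID, orderedDate) pair should appear at most once.

```python
import pandas as pd

dfTest = pd.DataFrame({
    'itemID': [2, 2, 2, 2, 2, 2],
    'orderedDate': pd.to_datetime(
        ['12/1/21', '12/2/21', '12/3/21', '12/4/21', '12/6/21', '12/7/21'],
        format='%m/%d/%y'),
    'qty': [1, 4, 1, 1, 2, 2],
})

def add_missing_previous_days(df, start_date):
    """Return a new frame with a qty-0 row for each missing previous day."""
    keys = ['itemID', 'orderedDate']
    # Every row except the start date proposes "its date minus one day".
    candidates = df.loc[df['orderedDate'] != start_date, keys].copy()
    candidates['orderedDate'] = candidates['orderedDate'] - pd.Timedelta(days=1)
    # A left merge with indicator=True marks candidates absent from df.
    merged = candidates.merge(df[keys].drop_duplicates(), on=keys,
                              how='left', indicator=True)
    missing = (merged.loc[merged['_merge'] == 'left_only', keys]
               .drop_duplicates()
               .assign(qty=0))
    # Build a new frame instead of mutating df while iterating over it.
    result = pd.concat([df, missing], ignore_index=True)
    return result.sort_values(keys).reset_index(drop=True)

dfTest = add_missing_previous_days(dfTest, pd.Timestamp('2021-12-01'))
print(dfTest)
```

Like the row-wise logic in the question, this only adds the single day before each existing row, so a gap of several days would still have holes. Reindexing each item against `pd.date_range` would be an alternative if every day must be present.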