I am trying to calculate the values of the Cat column "recursively".
Each pass should calculate the maximum value of the Cat column (Catz) within each group of x. If the date range is <= 60, the Cat value should be updated to Catz + 1. I already have this process working in arcpy. However, I have thousands of other data sets outside of it that should not need to be converted to an arcpy-friendly format, and I am not very familiar with pandas.
I referred to [1]: Calculate DataFrame values recursively and [2]: python pandas- apply function with two arguments to columns, but I still haven't quite understood the Series/DataFrame concept or how to apply either answer.
import pandas as pd
import numpy as np
from datetime import datetime
from datetime import datetime as dt
from datetime import timedelta
import time
from datetime import date
dict = {'x':["ASPELBJNMI", "JUNRNEXCRG", "ASPELBJNMI", "JUNRNEXCRG"],
        'start': ["6/27/2018", "8/4/2018", "8/22/2018", "8/12/2018"],
        'finish':["8/11/2018", "10/3/2018", "8/31/2018", "10/26/2018"],
        'DateRange':[0,0,0,0],
        'Cat':[-1,-1,-1,-1],
        'ID':[1,2,3,4]}
df = pd.DataFrame(dict)
df.set_index('ID')
def classd(houp):
    Catz = houp.Cat.min()
    Catz +=1
    houp = houp.groupby('x')
    for x, houp2 in houp:
        houp.DateRange = (pd.to_datetime(houp.finish.loc[:]).min()- houp.start.loc[:]).astype('timedelta64[D]')
        houp.Cat = np.where(houp.DateRange<=60, Catz , -1)
    return houp
df['Cat'] = df[['x','DateRange','Cat']].apply(classd, axis=1).Cat
print df
I get the following traceback when I run my code:
Catz = houp.Cat.min()
AttributeError: ("'long' object has no attribute 'min'", u'occurred at index 0')
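For what it's worth, the traceback looks consistent with how `apply` behaves when `axis=1`: the function is called once per row, so `houp` is a single row and `houp.Cat` is one number, not a column. The following is only a minimal sketch of that behaviour using the Cat, DateRange and x columns from above; `row_types`, `cat_is_scalar` and `group_min` are names made up for the illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'x': ["ASPELBJNMI", "JUNRNEXCRG", "ASPELBJNMI", "JUNRNEXCRG"],
    'DateRange': [0, 0, 0, 0],
    'Cat': [-1, -1, -1, -1],
})

# With axis=1, apply calls the function once per row and passes that row
# as a Series, so row.Cat is one scalar value rather than a column.
row_types = df.apply(lambda row: type(row).__name__, axis=1)
cat_is_scalar = df.apply(lambda row: pd.api.types.is_scalar(row.Cat), axis=1)

# groupby works on whole columns, so a per-group reduction is available.
group_min = df.groupby('x')['Cat'].min()

print(row_types.tolist())
print(group_min)
```

If that reading is right, the per-group logic probably belongs in a `groupby` on the whole DataFrame rather than inside a row-wise `apply`.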
Desired outcome:
OBJECTID_1  Conc        ID     start      finish      DateRange  Cat
1           ASPELBJNMI  LAPMT  6/27/2018  8/11/2018   45          0
2           ASPELBJNMI  KLKIY  8/22/2018  8/31/2018   9           1
15          JUNRNEXCRG  CGCHK  8/4/2018   10/3/2018   60          1
16          JUNRNEXCRG  IQYGJ  8/12/2018  10/26/2018  83         -1
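A possible starting point, sketched under assumptions rather than as a confirmed solution: it takes DateRange to be finish minus start of the same row in days, and does a single pass of "group maximum of Cat plus one where DateRange <= 60". The names `start`, `finish` and `catz` are introduced only for this illustration. Note that this rule gives 75 rather than 83 for the last row of the table above, and one pass does not reproduce the Cat values 1 shown there, so the intended rule and the repeated passes may differ from what is shown here.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'x': ["ASPELBJNMI", "JUNRNEXCRG", "ASPELBJNMI", "JUNRNEXCRG"],
    'start': ["6/27/2018", "8/4/2018", "8/22/2018", "8/12/2018"],
    'finish': ["8/11/2018", "10/3/2018", "8/31/2018", "10/26/2018"],
    'Cat': [-1, -1, -1, -1],
})

# Parse the date strings once, then subtract whole columns at a time.
start = pd.to_datetime(df['start'], format='%m/%d/%Y')
finish = pd.to_datetime(df['finish'], format='%m/%d/%Y')

# Assumption: DateRange is finish minus start of the same row, in days.
df['DateRange'] = (finish - start).dt.days

# Maximum Cat of each x group, broadcast back to every row of that group.
catz = df.groupby('x')['Cat'].transform('max')

# Single pass only: rows with DateRange <= 60 get Catz + 1, others keep Cat.
df['Cat'] = np.where(df['DateRange'] <= 60, catz + 1, df['Cat'])

print(df)
```

Because everything here operates on whole columns, no row-wise `apply` is needed; repeating the last two steps in a loop would be one way to approximate the "recursive" update, though the stopping condition would have to come from the original arcpy logic.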