I am trying to calculate the values of the Cat column "recursively".

Each loop should calculate the maximum value of the Cat column (Catz) within a group of x. If the date range (DateRange) is <= 60, the Cat value should be updated to Catz + 1. I already have this process working in arcpy. However, I have thousands of other external data sets that I would rather not convert to an arcpy-friendly format, and I am not very familiar with pandas.

I referred to [1]: Calculate DataFrame values recursively and [2]: python pandas- apply function with two arguments to columns. I still haven't quite understood the Series/DataFrame concept or how to apply either answer to my case.

import pandas as pd
import numpy as np
from datetime import datetime
from datetime import datetime as dt
from datetime import timedelta
import time
from datetime import date
dict = {'x':["ASPELBJNMI", "JUNRNEXCRG", "ASPELBJNMI", "JUNRNEXCRG"], 
        'start': ["6/27/2018", "8/4/2018", "8/22/2018", "8/12/2018"], 
        'finish':["8/11/2018", "10/3/2018", "8/31/2018", "10/26/2018"],
        'DateRange':[0,0,0,0],
        'Cat':[-1,-1,-1,-1],
        'ID':[1,2,3,4]} 

df = pd.DataFrame(dict)

df.set_index('ID')
def classd(houp):
    Catz = houp.Cat.min()
    Catz +=1

    houp  = houp.groupby('x')
    for x, houp2 in houp:


        houp.DateRange  = (pd.to_datetime(houp.finish.loc[:]).min()- houp.start.loc[:]).astype('timedelta64[D]')

    houp.Cat = np.where(houp.DateRange<=60, Catz , -1)
    return houp

df['Cat'] =  df[['x','DateRange','Cat']].apply(classd, axis=1).Cat
print df

I get the following traceback when I run my code:

Catz = houp.Cat.min() AttributeError: ("'long' object has no attribute 'min'", u'occurred at index 0')

Desired outcome:

OBJECTID_1 *  Conc *      ID     start      finish      DateRange  Cat
1             ASPELBJNMI  LAPMT  6/27/2018  8/11/2018   45         0
2             ASPELBJNMI  KLKIY  8/22/2018  8/31/2018   9          1
15            JUNRNEXCRG  CGCHK  8/4/2018   10/3/2018   60         1
16            JUNRNEXCRG  IQYGJ  8/12/2018  10/26/2018  83         -1
wwnde

1 Answer

Your program is a little complicated to follow, but I would suggest trying something simple with the apply function first:

s.apply(lambda x: x ** 2)

Here s is a Series.
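
The Series/DataFrame distinction is probably also what triggers the AttributeError in the question. As far as I can tell, `DataFrame.apply(..., axis=1)` hands the function one row at a time as a Series, so `houp.Cat` is a single number with no group to take `.min()` of, whereas `groupby` works on whole groups. A minimal sketch on a cut-down copy of the question's data (the variable names `rows_are_series`, `group_types` and `group_min` are made up for illustration):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "x": ["ASPELBJNMI", "JUNRNEXCRG", "ASPELBJNMI", "JUNRNEXCRG"],
    "Cat": [-1, -1, -1, -1],
})

# With axis=1 the function receives one row at a time, as a Series.
# row["Cat"] is then a single scalar value, not a column.
rows_are_series = df.apply(
    lambda row: isinstance(row, pd.Series) and np.ndim(row["Cat"]) == 0,
    axis=1,
)

# Iterating over a groupby yields (key, sub-DataFrame) pairs instead,
# so column-wise reductions such as min() make sense there.
group_types = {name: type(group).__name__ for name, group in df.groupby("x")}
group_min = df.groupby("x")["Cat"].min()
```

So any per-group logic would normally go through `df.groupby('x')` rather than a row-wise `apply`.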

https://pandas.pydata.org/docs/reference/api/pandas.Series.apply.html
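
As a starting point for the question's data, here is a rough sketch of `Series.apply` on a computed column. It assumes DateRange simply means finish minus start in days, and it only flags rows with 0 or -1; both are assumptions on my part, and it does not try to reproduce the per-group numbering in the desired outcome.

```python
import pandas as pd

df = pd.DataFrame({
    "x": ["ASPELBJNMI", "JUNRNEXCRG", "ASPELBJNMI", "JUNRNEXCRG"],
    "start": ["6/27/2018", "8/4/2018", "8/22/2018", "8/12/2018"],
    "finish": ["8/11/2018", "10/3/2018", "8/31/2018", "10/26/2018"],
})

# Parse the month/day/year strings into real timestamps.
start = pd.to_datetime(df["start"], format="%m/%d/%Y")
finish = pd.to_datetime(df["finish"], format="%m/%d/%Y")

# Assumption: DateRange is the whole number of days between start and finish.
df["DateRange"] = (finish - start).dt.days

# Series.apply calls the function once per value in the column.
# Assumption: 0 marks a range of at most 60 days, -1 marks anything longer.
df["Cat"] = df["DateRange"].apply(lambda days: 0 if days <= 60 else -1)
```

Under that assumption the last row comes out as 75 days rather than the 83 shown in the desired outcome, so the actual rule for DateRange must be something other than a plain finish minus start.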

Mahmud