
I have a DataFrame that I split into column Series (col_series in the snippet below) and use apply to run tests on each value in each Series. But I would like to report which row in the Series is affected when I detect an error.

...
            col_series.apply(self.testdatelimits, args= \
                (datetime.strptime('2018-01-01', '%Y-%m-%d'), key))


def testlimits(self, row_id, x, lowerlimit, col_name):
    low_error = None
    d = float(x)
    if lowerlimit != 'NA' and d < float(lowerlimit):
        low_error = 'Following record has column ' + col_name + ' lower than range check'
    if low_error is not None:
        self.set_error(col_index, row_id, low_error)

Of course the above fails because x is a str and does not have the name property. I am thinking that maybe I can pass in the row index in the Series, but I am not clear on how to do that.
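For illustration, here is a minimal sketch of one way to get the row label next to each value: iterate over `Series.items()`, which yields (index label, value) pairs, instead of using `apply`, which passes only the value plus any extra `args`. The function `check_lower_limit`, the sample data, and the limit of 1 are hypothetical and not taken from the code above.

```python
import pandas as pd


def check_lower_limit(row_id, value, lower_limit, col_name):
    """Return an error message if value is below lower_limit, else None."""
    if lower_limit == 'NA':
        return None
    if float(value) < float(lower_limit):
        return f'Row {row_id}: column {col_name} lower than range check'
    return None


# Hypothetical column of string values with a non-default index.
col_series = pd.Series(['5', '0.5', '12'], index=[10, 11, 12], name='weight')

errors = []
# Series.items() yields (index label, value) pairs, so the row is known.
for row_id, value in col_series.items():
    msg = check_lower_limit(row_id, value, 1, col_series.name)
    if msg is not None:
        errors.append(msg)

print(errors)
```

Because the index label travels with the value, the error message can name the affected row without adding a helper column.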

Edit: I switched to a list comprehension to solve this issue rather than pandas apply. It is significantly faster too.

col_series = col_series.apply(pd.to_datetime, errors='ignore')
dfwithrow = pd.DataFrame(col_series)
dfwithrow.insert(0, 'rowid', range(0, len(dfwithrow)))
dfwithrow['lowerlimit'] = lowlimit
dfwithrow['colname'] = 'fred'

list(map(self.testdatelimits, dfwithrow['rowid'], dfwithrow[colvalue[0]], \
    dfwithrow['lowerlimit'], dfwithrow['colname']))
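A self-contained sketch of the same `map` idea follows, under a few assumptions: `check_date_limit` is a hypothetical stand-in for `testdatelimits`, the sample dates are made up, and `errors='coerce'` is swapped in for `errors='ignore'` so that unparseable values become NaT and can be reported. It also maps over the Series index directly rather than inserting a `rowid` column.

```python
from datetime import datetime
from itertools import repeat

import pandas as pd


def check_date_limit(row_id, value, lower_limit, col_name):
    """Return (row_id, message) for a bad value, else None."""
    if pd.isna(value):
        return (row_id, f'column {col_name} is not a valid date')
    if value < lower_limit:
        return (row_id, f'column {col_name} lower than range check')
    return None


# Hypothetical input column; the last entry cannot be parsed as a date.
raw = pd.Series(['2018-03-01', '2017-12-31', 'not a date'], name='start_date')
dates = pd.to_datetime(raw, errors='coerce')  # unparseable values become NaT
lower_limit = datetime(2018, 1, 1)

# map walks the index and the values in parallel; repeat() supplies the
# constant arguments without building helper columns.
results = list(map(check_date_limit, dates.index, dates,
                   repeat(lower_limit), repeat(dates.name)))
date_errors = [r for r in results if r is not None]

print(date_errors)
```

Using `repeat` avoids materialising the `lowerlimit` and `colname` columns, though the DataFrame approach works just as well when those values differ per row.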
oldDave
  • You could use `iterrows` in place of `apply`. See here: https://stackoverflow.com/q/16476924/8146556 – rahlf23 Jul 30 '18 at 15:04
  • There is probably a better way to do this than apply, definitely a better way to do this than iterrows. Can you post a sample input and a desired output? – user3483203 Jul 30 '18 at 15:05
  • In the end I used list comprehension to solve my problem, see edit – oldDave Aug 20 '18 at 17:04
