I have data like:

import numpy as np
import pandas as pd

df1 = pd.DataFrame(columns=list('XY'))
df1['X'] = np.arange(0,100,0.1)
df1['Y'] = np.cos(df1['X']) + 30

df2 = pd.DataFrame(columns=list('AB'))
df2['A'] = [22, 43, 64, 86]

And I am defining a function as:

def find_nearest(df1, df1['X'], df2['A'], df1['Y']):
        array = np.asarray(df1['X'])
        idx = (np.abs(array - df2['A'])).argmin()
        return df1.iloc[idx][df1['Y']]

But I get a syntax error on the line that defines the function, where I reference the dataframe columns in the parameter list:

def find_nearest(df1, df1['X'], df2['A'], df1['Y']):

It seems the function definition doesn't accept dataframe columns directly as parameters. If I assign the columns to their own variables first, it works fine, but for memory's sake I am trying to avoid that.

Does anyone know a workaround? If anything needs clarification, let me know.

Cody Smith
  • This could help with a more efficient approach: https://stackoverflow.com/questions/45349561/. – Divakar Jul 26 '19 at 13:45
  • Calling a dataframe column in that case still produces a syntax error. Although, seeing as it is more efficient, once I get the syntax error resolved, I may use that instead of my original. Thanks @Divakar – Cody Smith Jul 26 '19 at 14:04
  • The linked answer expects arrays and finds, for each element of one array, the index of the closest element in the other. So you would need to feed in the inputs accordingly: `df1['X'].values`, etc. – Divakar Jul 26 '19 at 14:06
  • You are missing a parenthesis in your function definition; that might be it. – Ayoub ZAROU Jul 26 '19 at 14:08
  • Just checked the original code, and that wasn't it. I just forgot to put it here. Thanks for the catch though @AyoubZAROU – Cody Smith Jul 26 '19 at 14:14
  • Still got an error doing that @Divakar – Cody Smith Jul 26 '19 at 14:17
  • You should just pass `df1` and `df2` as arguments rather than `df1['X'] ...`, which is not a valid variable name. – Ayoub ZAROU Jul 26 '19 at 14:17

1 Answer

`df1['X']` is not a valid parameter name in Python: the names in a `def` line must be plain identifiers, not expressions. You could do this instead:


def find_nearest(df1, df1_X, df2_A, df1_Y):
        array = np.asarray(df1_X)
        idx = (np.abs(array - df2_A)).argmin()
        return df1.iloc[idx][df1_Y]
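
As a rough usage sketch of this first version: the roles of the parameters are an assumption here, with `df1_X` taken to be the column to search, `df2_A` a single target value, and `df1_Y` the label of the column to return. The small `same_object` helper is hypothetical and only illustrates that an argument is passed by reference, so giving the column a name should not duplicate its data.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'X': np.arange(0, 100, 0.1)})
df1['Y'] = np.cos(df1['X']) + 30

def find_nearest(df1, df1_X, df2_A, df1_Y):
    array = np.asarray(df1_X)
    idx = (np.abs(array - df2_A)).argmin()   # position of the X closest to df2_A
    return df1.iloc[idx][df1_Y]

# Assumed call: one scalar target (22) and a column label ('Y')
nearest_y = find_nearest(df1, df1['X'], 22, 'Y')

x_col = df1['X']   # a name bound to the column object

def same_object(col):
    # Hypothetical helper: the function receives the caller's object, not a copy
    return col is x_col

passed_by_reference = same_object(x_col)
```
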

Or just pass the dataframes themselves and select the columns inside the function:


def find_nearest(df1, df2):
        array = np.asarray(df1['X'])
        idx = (np.abs(array - df2['A'])).argmin()
        return df1.iloc[idx]['Y']
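
Note that subtracting the whole `df2['A']` column from `array` will not broadcast when the two lengths differ (1000 rows against 4 targets in the question's data). A minimal sketch of one way to handle every target at once is shown below; `find_nearest_all` is a hypothetical name, and the `df2['A']` values are assumed from the question.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'X': np.arange(0, 100, 0.1)})
df1['Y'] = np.cos(df1['X']) + 30
df2 = pd.DataFrame({'A': [22, 43, 64, 86]})   # assumed target values

def find_nearest_all(df1, df2):
    # Hypothetical helper: returns the Y value at the nearest X for each A
    x = df1['X'].to_numpy()
    targets = df2['A'].to_numpy()
    # Pairwise distances |x_i - a_j|, shape (len(df1), len(df2))
    dist = np.abs(x[:, None] - targets[None, :])
    idx = dist.argmin(axis=0)   # row of the nearest X for each target
    return df1['Y'].to_numpy()[idx]

nearest = find_nearest_all(df1, df2)
```

The pairwise distance matrix temporarily holds `len(df1) * len(df2)` values, so for large inputs the approach linked in the comments may be the better fit.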
Ayoub ZAROU