I have a dataframe with two columns, A and B. I want to get the scalar value from column B based on the value in column A. I used loc and .values[0].
My dataset is relatively small; my main concern is whether the syntax of the code is correct. Also, .values seems to be deprecated.
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(10).reshape((5, 2)), columns=['A', 'B'])
df1 = df.loc[df['A'] == 4, 'B'].values[0]
print(df1)
The result is
5
Can this code be optimized?
df1 = df.loc[df['A'] == 4, 'B'].values[0]
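Converting the selection to a NumPy array with .to_numpy() is slightly faster than chained indexing, although .loc followed by .iloc[0] is within measurement noise of it: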
%timeit df1 = df[df['A'] == 4].B.iloc[0]
723 µs ± 15.3 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit df1 = df.loc[df['A'] == 4, 'B'].to_numpy()[0]
513 µs ± 4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit df1 = df.loc[df['A'] == 4, 'B'].iloc[0]
521 µs ± 20.2 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
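One thing the timings do not show: all three variants raise an IndexError when no row matches, because [0] is applied to an empty result. Below is a minimal sketch of a guarded lookup built on the .to_numpy() variant. The helper name first_match and its default parameter are hypothetical, introduced only for illustration, and the column names A and B are taken from the question.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question: A = 0, 2, 4, 6, 8 and B = 1, 3, 5, 7, 9.
df = pd.DataFrame(np.arange(10).reshape((5, 2)), columns=["A", "B"])


def first_match(frame, key, default=None):
    """Return the first B value where A == key, or default if no row matches.

    Hypothetical helper, not part of pandas.
    """
    # Boolean mask on A, select column B, then drop down to a NumPy array.
    matches = frame.loc[frame["A"] == key, "B"].to_numpy()
    # An empty array means no row matched, so fall back to the default.
    return matches[0] if matches.size else default


print(first_match(df, 4))  # 5
print(first_match(df, 3))  # None, because no row has A == 3
```

If several rows match, this returns only the first one, which is the same behaviour as the original .values[0] expression.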