0

My data frame has six columns of float data and around 80,000 rows. One of the columns is "Current", and its values are sometimes negative. I want to find the index locations where the "Current" value is negative. My code is given below:

current_index = my_df[(my_df["Current"]<0)].index.tolist()
print(current_index[:5])

This gives the output I wanted:

[0, 124, 251, 381, 512]

This is fine. Is it possible to write this code using the iloc method? I tried the following code, but it raises an error. I would also like to know which of the two approaches is better and faster.

current_index = my_df.iloc[(my_df["Current"]<0)]

The output is:

NotImplementedError: iLocation based boolean indexing on an integer type is not available
jpp
Msquare
  • https://stackoverflow.com/questions/31593201/pandas-iloc-vs-ix-vs-loc-explanation-how-are-they-different iloc gives you rows based on the positions you pass in; it won't return indexes. Also, as the error says, it expects integers, whereas the condition produces Boolean values. To make sense of it, add .index after the () – iamklaus Sep 09 '18 at 13:43
  • @SarthakNegi, I just tried this `current_index = my_df.iloc[(my_df["Current"]<0).index]`. It gives an error. – Msquare Sep 09 '18 at 13:52
  • Your objective was to get the indexes, right? Also, can you post the error message? It works fine for me. – iamklaus Sep 09 '18 at 14:04
  • If the indices are not unique identifiers you can simply use `np.where` – Bharath M Shetty Sep 09 '18 at 14:30
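
The `np.where` suggestion in the last comment can be sketched roughly as follows. The frame `df` and its values are made up for illustration and are not the asker's data. The point is that `np.where` returns integer positions rather than index labels, so the two differ whenever the index is not the default 0..n-1 range.

```python
import numpy as np
import pandas as pd

# Illustrative data only; the index labels are deliberately non-default.
df = pd.DataFrame({"Current": [1.0, -2.0, 3.0, -4.0]}, index=[10, 20, 30, 40])

# Plain NumPy Boolean array marking the negative rows.
mask = (df["Current"] < 0).to_numpy()

# Integer positions of the negative rows (what iloc understands).
positions = np.where(mask)[0]

# Index labels of the same rows (what loc understands).
labels = df.index[mask].tolist()

print(positions.tolist())  # [1, 3]
print(labels)              # [20, 40]
```

With the default integer index, as in the question, positions and labels coincide.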

2 Answers

1

With iloc you need to use a Boolean array rather than a Boolean series. For this, you can use pd.Series.values. Here's a demo:

import pandas as pd

df = pd.DataFrame({'Current': [1, 3, -4, 9, -3, 1, -2]})

# .values turns the Boolean series into a Boolean array, which iloc accepts
res = df.iloc[df['Current'].lt(0).values].index

# Int64Index([2, 4, 6], dtype='int64')

Incidentally, loc works with either an array or a series.
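
As a rough illustration of that last remark, the sketch below reuses the demo frame and passes the same mask to `loc` as a series and as an array, then to `iloc` as an array. The variable names are hypothetical, and `to_numpy()` is used here as an alternative to `.values`.

```python
import pandas as pd

df = pd.DataFrame({"Current": [1, 3, -4, 9, -3, 1, -2]})

# Boolean Series: True where "Current" is negative.
mask = df["Current"].lt(0)

# loc accepts the Boolean Series directly ...
via_loc_series = df.loc[mask].index.tolist()

# ... and also the underlying Boolean array.
via_loc_array = df.loc[mask.to_numpy()].index.tolist()

# iloc is given the Boolean array, as in the answer above.
via_iloc_array = df.iloc[mask.to_numpy()].index.tolist()

print(via_loc_series, via_loc_array, via_iloc_array)
```

All three selections pick out the same rows, so the choice between them is mostly a matter of readability.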

jpp
  • Thanks. It did work. How can I measure the execution time of these two methods? I mean, which one is more efficient and the better code to use? Please let me know. – Msquare Sep 10 '18 at 05:20
  • @Msquare, You can use `timeit`, e.g. [see here](https://stackoverflow.com/questions/8220801/how-to-use-timeit-module). I do not expect you to see a significant performance difference; you should first check whether this is truly the bottleneck in your application. – jpp Sep 10 '18 at 08:38
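
A minimal sketch of the `timeit` suggestion, under some assumptions: `my_df` is replaced by a synthetic single-column frame of 80,000 random floats (the real frame has six columns), and the helper names `with_mask` and `with_iloc` are made up for this example. Timings depend on the machine and the pandas version, so no figures are shown.

```python
import timeit

import numpy as np
import pandas as pd

# Synthetic stand-in for the asker's data: 80,000 random floats.
rng = np.random.default_rng(0)
my_df = pd.DataFrame({"Current": rng.standard_normal(80_000)})


def with_mask():
    # Plain Boolean-mask indexing, as in the question.
    return my_df[my_df["Current"] < 0].index.tolist()


def with_iloc():
    # iloc with a Boolean array, as in the answer.
    return my_df.iloc[(my_df["Current"] < 0).to_numpy()].index.tolist()


# Time 100 calls of each approach.
t_mask = timeit.timeit(with_mask, number=100)
t_iloc = timeit.timeit(with_iloc, number=100)

print(f"boolean mask: {t_mask:.4f} s, iloc: {t_iloc:.4f} s (100 runs each)")
```

Both functions return the same list, so any difference in the printed times reflects only the indexing path.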
0

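You can simply use the following. Note that .ix was deprecated in pandas 0.20 and removed in 1.0, so on current versions use .loc, which accepts the same Boolean mask:

my_df.loc[my_df['Current']<0].index.values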
Anant Gupta