0

I'm trying to change all values in the slice except the first one, but it does not work. What am I doing wrong?

print(test)
test.loc[(test.col_1==-5)&(test.index>'2018-07-17 13:00:00')&(test.index<'2018-07-17 14:00:00'),['col_1']][1:]=-1
print(test)

This produces the output below (first print, then second print):

17/07/2018 13:51:00 -5
17/07/2018 13:52:00 -1
17/07/2018 13:53:00 -5
17/07/2018 13:54:00 -5
17/07/2018 13:55:00 -5
17/07/2018 13:56:00 -5
17/07/2018 13:57:00 -5
17/07/2018 13:58:00 -5
17/07/2018 13:59:00 -5

17/07/2018 13:51:00 -5
17/07/2018 13:52:00 -1
17/07/2018 13:53:00 -5
17/07/2018 13:54:00 -5
17/07/2018 13:55:00 -5
17/07/2018 13:56:00 -5
17/07/2018 13:57:00 -5
17/07/2018 13:58:00 -5
17/07/2018 13:59:00 -5

whereas I was expecting the second output to be:

17/07/2018 13:51:00 -5
17/07/2018 13:52:00 -1
17/07/2018 13:53:00 -1
17/07/2018 13:54:00 -1
17/07/2018 13:55:00 -1
17/07/2018 13:56:00 -1
17/07/2018 13:57:00 -1
17/07/2018 13:58:00 -1
17/07/2018 13:59:00 -1
jpp
  • Can you add a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve)? Because it seems both DataFrames are the same. – jezrael Jul 26 '18 at 07:56
  • @jezrael: I used an image to avoid having to write some HTML, but OK, fair enough. For the sake of argument, though, this is not code but the output results. – RogerLePatissier Jul 26 '18 at 09:23
  • @jezrael: the example was explicit if only you had read it properly in the first place. Obviously you did not, and to downvote the question because you didn't spend more than 1 minute to understand it is quite poor. – RogerLePatissier Jul 26 '18 at 09:24
  • Thanks, sometimes the real data is different ;) So do you need to filter out the first row of the `DataFrame`, or the first True value of the boolean mask? – jezrael Jul 26 '18 at 10:23

3 Answers

1

You can use numpy.where to get the positions where the criterion is True, then slice them with [1:] to exclude the first match. Here's a minimal example:

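import numpy as np
import pandas as pd

df = pd.DataFrame([[1, -5], [2, -5], [3, -1], [4, -5], [5, -5], [6, -1]],
                  columns=['col1', 'col2'])

# np.where(...)[0] holds the row positions that match; [1:] skips the first match
df.iloc[np.where(df['col1'].between(2, 5))[0][1:], 1] = -1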

print(df)

   col1  col2
0     1    -5
1     2    -5
2     3    -1
3     4    -1
4     5    -1
5     6    -1
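As a rough sketch of how the same idea might look on data shaped like the question's, the block below rebuilds a small datetime-indexed frame. The frame contents and the `positions` name are assumptions made for illustration; only the mask expression is taken from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data resembling the question's output (one row per minute)
idx = pd.date_range('2018-07-17 13:51:00', periods=9, freq='min')
test = pd.DataFrame({'col_1': [-5, -1, -5, -5, -5, -5, -5, -5, -5]}, index=idx)

# Same criterion as in the question
mask = ((test['col_1'] == -5)
        & (test.index > '2018-07-17 13:00:00')
        & (test.index < '2018-07-17 14:00:00'))

# Integer positions of the matching rows, minus the first match
positions = np.where(mask)[0][1:]

# Assign through a single .iloc call so the original frame is modified
test.iloc[positions, test.columns.get_loc('col_1')] = -1

print(test)
```

The first matching row (13:51) keeps its -5 and every later match becomes -1, which is the result the question was expecting.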
jpp
  • Thank you jpp, I think this is indeed what I was doing wrong, i.e. I should have looked at using iloc and np.where! – RogerLePatissier Jul 26 '18 at 10:54
  • @RogerLePatissier, Yes, sometimes it's useful to explore ways to *avoid* double masks. Usually it's possible in a vectorised way. – jpp Jul 26 '18 at 10:59
0

The problem is combining boolean indexing (filtering) with a further selection. One possible solution is to add a new condition to the mask:

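import numpy as np
import pandas as pd

test.index = pd.to_datetime(test.index)
mask = (test.col_1==-5)&(test.index>'2018-07-17 13:00:00')&(test.index<'2018-07-17 14:00:00')

# positional condition: True for every row of test except positions 0 and 1
m1 = np.arange(len(test)) > 1
test.loc[mask & m1, 'col_1']=-1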

print (test)
                     col_1
2018-07-17 13:51:00     -5
2018-07-17 13:52:00     -1
2018-07-17 13:53:00     -1
2018-07-17 13:54:00     -1
2018-07-17 13:55:00     -1
2018-07-17 13:56:00     -1
2018-07-17 13:57:00     -1
2018-07-17 13:58:00     -1
2018-07-17 13:59:00     -1
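If the goal is to skip the first True value of the mask rather than rows at fixed positions (the distinction raised in the comments on the question), one possible variant is a running count of matches. This is a sketch on made-up data; `skip_first` is a hypothetical name, and `cumsum` is used here as a swapped-in technique, not something from the answer above.

```python
import pandas as pd

# Hypothetical data where the first match is NOT in the first rows of the frame
idx = pd.date_range('2018-07-17 13:49:00', periods=6, freq='min')
test = pd.DataFrame({'col_1': [0, 0, -5, -1, -5, -5]}, index=idx)

mask = (test['col_1'] == -5) & (test.index > '2018-07-17 13:00:00')

# cumsum counts the matches seen so far; > 1 drops only the first match
skip_first = mask & (mask.cumsum() > 1)

test.loc[skip_first, 'col_1'] = -1

print(test)
```

Because the count follows the mask itself, this keeps working wherever the first match happens to fall in the frame.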
jezrael
  • The question was not how to do this in a different way but why it does not work in its current form. Basically the problem is a lot more complex than that, so I reduced my question to just this bit. Using a two-step mask approach does not fit into the overall, more complex picture I'm facing. – RogerLePatissier Jul 26 '18 at 10:49
0

An actual answer to the problem:

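pandas aligns all axes when setting a Series or DataFrame through .loc and .iloc. The assignment in the question will not modify the DataFrame, because the chained [1:] operates on the object returned by .loc rather than on the original.

So select rows and columns in a single call, df.loc[1, column], rather than chaining df.loc[1][column] or df[column].loc[1].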

https://pandas.pydata.org/docs/user_guide/indexing.html
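A small sketch of why the question's form leaves the frame untouched, assuming a toy frame with a unique index. The names `after_chained`, `after_single` and `rows` are made up for illustration.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'col_1': [-5, -5, -5, -5]})
mask = df['col_1'] == -5

# Chained form, as in the question: df.loc[...] builds a new object first,
# and the [1:] assignment then writes into that temporary object only.
with warnings.catch_warnings():
    warnings.simplefilter('ignore')  # pandas may warn about chained assignment
    df.loc[mask, ['col_1']][1:] = -1
after_chained = df['col_1'].tolist()  # df itself is unchanged

# Single .loc call: pick the row labels first, then assign once.
rows = df.index[mask.to_numpy()][1:]  # labels of all matches except the first
df.loc[rows, 'col_1'] = -1
after_single = df['col_1'].tolist()

print(after_chained)
print(after_single)
```

Selecting labels this way relies on the index being unique; with duplicate labels, positional assignment through .iloc is probably the safer choice.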

John