
What is a simple and direct way to set the index of every second row of my dataframe to, say, ''? The method I used to use, df.loc[1::2, 'index'] = '', no longer works. I'm using Pandas version 1.1.0.

It now gives the following error:

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()                                                                                  
> lib/python3.6/site-packages/pandas/core/indexes/multi.py(1902)__getitem__()      

Here's my test setup:

#!/usr/bin/python3
import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.random(10), range(10), columns=['foo'])
df.index.name = 'bar'

which gives:

          foo
bar          
0    0.818489
1    0.525593
2    0.741739
3    0.250103
4    0.304080
5    0.206198
6    0.982070
7    0.476621
8    0.053609
9    0.726157

but the following does nothing:

df.loc[1::2].index= ['']*len(df.loc[1::2].index)

i.e, the result is still

          foo
bar          
0    0.818489
1    0.525593
2    0.741739
3    0.250103
4    0.304080
5    0.206198
6    0.982070
7    0.476621
8    0.053609
9    0.726157

Why does that not work?

Similarly, this does not work:

df.index = df.index.to_numpy()
df.loc[1::2].index= ['']*len(df.loc[1::2].index)

Why not?

(This effort is motivated by the fact that the index no longer appears to be a plain sequence of integers, as I think it used to be:

df.index                                                                                                                              
Out[]: RangeIndex(start=0, stop=10, step=1, name='bar')

)

This doesn't work, either: df.loc[1::2,'bar']= ''.

The following does work (in Pandas 1.0.4 but not 1.1.0), but it involves moving the index to a column. Surely that isn't necessary?

df.reset_index(inplace=True)
df.loc[1::2,'bar']= ''
df.set_index('bar', inplace=True)

which gives me what I want, viz:

          foo
bar          
0    0.653306
     0.866628
2    0.356007
     0.393833
4    0.770817
     0.131656
6    0.314990
     0.419762
8    0.944348
     0.454487

I'm looking for a clean, clear, and concise way to carry out this simple modification to matching index values by acting on the index directly.

(n.b. the title of this question isn't perfect. I don't want to use iloc; I want to address certain rows' indices all to the same value. So maybe the problem is slightly more general).

CPBL

1 Answer


One way:

df = df.set_axis(pd.Index([index if i not in range(1, df.shape[0], 2) else '' 
                           for i, index in enumerate(df.index)], 
                          name=df.index.name))
print(df)

Output

          foo
bar          
0    0.302340
     0.744609
2    0.489255
     0.542356
4    0.072797
     0.810690
6    0.738350
     0.939177
8    0.827072
     0.751731

We could also use DataFrame.index.values, but we would first need to replace the RangeIndex. So the cleanest way is DataFrame.rename:

df = df.rename(index=dict.fromkeys(df.index[1::2],''))
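
As for why the attempts in the question have no effect, a likely explanation is that df.loc[1::2] returns a new DataFrame, so assigning to its .index only changes that temporary object. The sketch below illustrates this and then replaces the whole index in one assignment, selecting every second row by position. The names temp and labels are illustrative only and do not come from the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.random(10), index=range(10), columns=['foo'])
df.index.name = 'bar'

# df.loc[1::2] is a separate DataFrame, so setting its index
# leaves the index of df untouched.
temp = df.loc[1::2]
temp.index = [''] * len(temp)
unchanged = list(df.index) == list(range(10))

# Copy the labels into an object array (so ints and strings can coexist),
# blank every second position, and assign the result back as the index.
labels = df.index.to_numpy(dtype=object, copy=True)
labels[1::2] = ''
df.index = pd.Index(labels, name=df.index.name)

print(unchanged)
print(df)
```

This approach works by position rather than by label, so it does not depend on the original index being a RangeIndex. The resulting index has object dtype because it mixes integers and empty strings.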
ansev
  • Hey, sorry. I edited my question for better readability, setting those indices to "" rather than 0, so I edited your answer accordingly! – CPBL Aug 02 '20 at 20:40
  • Your second option is a lot prettier to me than the first, but your answer addresses rows by index. That is not what I'm after. I want to set them by value, ideally in one line / directly in the index. – CPBL Aug 02 '20 at 21:08
  • @CPBL check now – ansev Aug 02 '20 at 22:07
  • Nice. What about just the last answer, using `inplace=True` ? – CPBL Aug 03 '20 at 01:27
  • you can use it, but you should see https://stackoverflow.com/questions/45570984/in-pandas-is-inplace-true-considered-harmful-or-not – ansev Aug 03 '20 at 08:56