I have a pandas DataFrame like this:

df.head(10)
7   RT (min)    Area (Ab*s) Quality patch   similarity
8   10.167      23278313    64      NaN     NaN
9   10.167      23278313    47      NaN     NaN
10  10.167      23278313    38      NaN     NaN
28  10.333      3407159     49      10.167  0.983935
29  10.333      3407159     22      10.167  0.983935
30  10.333      3407159     16      10.167  0.983935
48  10.390      3299202     38      10.333  0.994514
49  10.390      3299202     35      10.333  0.994514
50  10.390      3299202     32      10.333  0.994514
68  10.516      2015786     50      10.390  0.988018

I want to set df['RT (min)'] = df['patch'] wherever df['similarity'] > 0.99. For example, df should look like this:

7   RT (min)    Area (Ab*s) Quality patch   similarity
8   10.167      23278313    64      NaN     NaN
9   10.167      23278313    47      NaN     NaN
10  10.167      23278313    38      NaN     NaN
28  10.333      3407159     49      10.167  0.983935
29  10.333      3407159     22      10.167  0.983935
30  10.333      3407159     16      10.167  0.983935
48  10.333      3299202     38      10.333  0.994514
49  10.333      3299202     35      10.333  0.994514
50  10.333      3299202     32      10.333  0.994514
68  10.516      2015786     50      10.390  0.988018

That is, the RT (min) values in rows 48, 49 and 50 are replaced with the patch values from the same rows.

I have tried:

p = df[df['similarity']>0.99].index.tolist()
df['RT (min)'][p] =df['patch'][p]

but I get the error:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects

I don't know how to fix this.

X.tang

2 Answers

Something like this:

mask = df['similarity'] > 0.99
df.loc[mask, 'RT'] = df['patch']

As an example:

import pandas as pd

df = pd.DataFrame({"RT":[10.1,10.2,10.4],"patch":[float("NaN"),10.3,10.3],"similarity":[float("NaN"),0.9,0.998]})

Producing:

     RT  patch  similarity
0  10.1    NaN         NaN
1  10.2   10.3       0.900
2  10.4   10.3       0.998

Create a mask and use it to assign values from patch:

mask = df['similarity'] > 0.99
df.loc[mask, 'RT'] = df['patch']

Result:

     RT  patch  similarity
0  10.1    NaN         NaN
1  10.2   10.3       0.900
2  10.3   10.3       0.998
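
The InvalidIndexError in the question suggests that the real index contains duplicate labels, which the head(10) output does not show. As a minimal sketch under that assumption, the same mask approach can be written against the question's 'RT (min)' column, passing the right-hand side as a NumPy array so that pandas assigns by position instead of aligning on labels. The sample values and the repeated label 28 below are made up for illustration.

```python
import pandas as pd

nan = float("nan")

# Hypothetical sample: the index label 28 is repeated on purpose to mimic
# a non-unique index (an assumption about the asker's real data).
df = pd.DataFrame(
    {
        "RT (min)": [10.167, 10.333, 10.390, 10.516],
        "patch": [nan, 10.167, 10.333, 10.390],
        "similarity": [nan, 0.983935, 0.994514, 0.988018],
    },
    index=[8, 28, 28, 68],
)

# NaN > 0.99 evaluates to False, so rows without a similarity are skipped.
mask = df["similarity"] > 0.99

# to_numpy() strips the index from the right-hand side, so the values are
# written positionally into the masked rows and no reindexing is needed.
df.loc[mask, "RT (min)"] = df.loc[mask, "patch"].to_numpy()

print(df)
```

If the duplicate labels are not meaningful, calling df.reset_index(drop=True) first is another option; that discards the original labels, so it only fits when they are not needed later.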
Wes Doyle

pd.Series.mask

You can assign as follows:

df['RT'] = df['RT'].mask(df['similarity'] > 0.99, df['patch'])
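
A self-contained sketch of that call on the small example frame from the other answer (the column is named RT here; the question's frame would use 'RT (min)'). Series.mask replaces values where the condition is True and keeps them elsewhere. Comparisons against NaN evaluate to False, so rows with a missing similarity are left unchanged.

```python
import pandas as pd

nan = float("nan")

# Same toy data as in the other answer.
df = pd.DataFrame(
    {
        "RT": [10.1, 10.2, 10.4],
        "patch": [nan, 10.3, 10.3],
        "similarity": [nan, 0.9, 0.998],
    }
)

# Where similarity > 0.99, take the value from patch; otherwise keep RT.
df["RT"] = df["RT"].mask(df["similarity"] > 0.99, df["patch"])

print(df)
```

Only the last row meets the condition, so only its RT value is replaced.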
jpp