Replace DataFrame subset with other set (DataFrame, Series, list...)

Question

I am trying to do a tricky modification. I have a DataFrame with 100k rows. I have generated strings and added them to a new DataFrame.

At the end I have the following:

df[df['Col1'] == value1]:
        Col1           Col2

6200    value1         string1
6201    value1         string2
6202    value1         string3


stringdf:

         Col2
0        goodstring1
1        goodstring2

Ideally stringdf would be the same length as the subset of df for a particular value of Col1, but it may be shorter.

I would like to replace as many rows in df as stringdf allows. In this example that means changing 2 rows.

I would get:

df[df['Col1'] == value1]:
        Col1           Col2

6200    value1         goodstring1
6201    value1         goodstring2
6202    value1         string3

My approach was:

for i in range(0,len(stringdf)):
     df['Col2'][df['Col1'] == value1].iloc[i] = stringdf['Col2'].iloc[i]

but this runs without affecting the DataFrame df.

Any suggestions, explanations or advice? I would like the processing to be very fast.

The other methods I tried are described here: How to replace part of dataframe in pandas

Thank you for your help !

score 1 · Accepted Answer · answered Aug 03 '18 at 20:06

Set the index of stringdf to the index of the filtered sub-DataFrame (truncated to the length of stringdf), then call update on the original DataFrame. update aligns on index labels and modifies df in place.

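import pandas as pd

df = pd.DataFrame(
    {'Col1': ['value1'] * 3,
     'Col2': ['string1', 'string2', 'string3']},
    index=[6200, 6201, 6203])

stringdf = pd.DataFrame({'Col2': ['goodstring1', 'goodstring2']})

# Index labels of the rows to overwrite: the first len(stringdf) matches
idx = df[df['Col1'] == 'value1'].index[:len(stringdf)]
# Give stringdf those labels, then update df in place (aligned on index)
df.update(stringdf.set_index(idx))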

>>> df
        Col1         Col2
6200  value1  goodstring1
6201  value1  goodstring2
6203  value1      string3
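
As for why the loop in the question has no effect: df['Col2'][df['Col1'] == value1] selects with a boolean mask, which returns a copy, so the .iloc[i] assignment writes to a temporary object rather than to df. A possible alternative to update, sketched below, is a single .loc assignment. It assumes the index labels of df are unique, and the extra 'value2' row and the name new_values are added here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {'Col1': ['value1'] * 3 + ['value2'],
     'Col2': ['string1', 'string2', 'string3', 'other']},
    index=[6200, 6201, 6203, 6204])

stringdf = pd.DataFrame({'Col2': ['goodstring1', 'goodstring2']})

# Labels of the matching rows, limited to the number of replacements available
idx = df[df['Col1'] == 'value1'].index[:len(stringdf)]

# Limit the replacements too, in case stringdf is longer than the subset
new_values = stringdf['Col2'].to_numpy()[:len(idx)]

# One .loc call with row labels and a column name writes into df itself
# (assumes unique index labels; duplicated labels would select extra rows)
df.loc[idx, 'Col2'] = new_values
```

Converting to a NumPy array sidesteps index alignment, so the values are assigned by position within the selected rows. Rows with other Col1 values, and any matching rows beyond the length of stringdf, are left unchanged.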
