Pandas - Compare two dataframes and replace values matching condition

Question

I have two pandas DataFrames (df1 and df2) with exactly the same number of columns and rows; the column and index names are the same as well. The values in the two DataFrames may or may not differ.

I want to compare every value in df1 with the value in the corresponding position in df2, and if the value in df2 is equal to or bigger than the value in df1, I want to replace the value in df1 with a random integer.

So I thought I would want something like this (but preferably without any loops at all):

for every value in df1
    df1.value - df2.value
    if df1.value < 1
        df1.value = np.random()

I tried looking at the pandas df.replace function in combination with the df.where function, but I just can't seem to get it to work.

Edit: I want to add something I forgot previously. When assigning my random int, I want it to be within a range based on the corresponding value. So it will be:

for every value in df1
    df1.value - df2.value
    if df1.value < 1
        df1.value = np.random(in range(df1.value - 10, df1.value + 10))

I believe this is not possible with Pietro Tortella's answer, as I'm processing the dataframe as a whole.

Does anyone know how to solve this?

I can build a new dataframe based on df1, transform the values the way I want, and then use that as a substitution. – PyuPyuPyu, Nov 30 '17 at 15:01
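A minimal sketch of the idea in this comment, applied to the range requirement from the edit. It assumes integer-valued data, reads "within a range based on the corresponding value" as the closed interval [value - 10, value + 10] around the original df1 value, and uses NumPy's `Generator.integers`, which accepts array-valued bounds. The sample frames and the names `candidates` and `result` are made up for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

# Hypothetical integer-valued sample data with identical labels
df1 = pd.DataFrame({"a": [5, 50, 20], "b": [30, 7, 90]})
df2 = pd.DataFrame({"a": [9, 10, 20], "b": [1, 70, 80]})

mask = df2 >= df1  # True where the df1 value should be replaced

# One random integer per cell, drawn from [value - 10, value + 10]
# (assumption: both ends inclusive, hence endpoint=True)
base = df1.to_numpy()
candidates = pd.DataFrame(
    rng.integers(base - 10, base + 10, size=base.shape, endpoint=True),
    index=df1.index,
    columns=df1.columns,
)

# Take the candidate where the mask is True, keep df1 elsewhere
result = df1.mask(mask, candidates)
```

This still materialises a full frame of candidates, so it has the same memory cost as building a third DataFrame.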

Pietro Tortella · Answer 1 · 2017-11-30T11:02:45.807

2

If memory is not a concern, I would create a third DataFrame of random numbers, and make a substitution using the difference as a mask.

For instance, something like:

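import numpy as np
import pandas as pd

# Random numbers with the same shape and labels as df1. Note that randn
# draws standard-normal floats; np.random.randint would give integers.
randoms = pd.DataFrame(
    np.random.randn(*df1.values.shape),
    index=df1.index,
    columns=df1.columns
)

# Replace the values in df1 wherever df2 is greater than or equal to df1
df1[df2 >= df1] = randoms[df2 >= df1]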
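Since the question asks for random integers and mentions `df.where`, here is a hedged variant of the same idea. `DataFrame.where` keeps the cells where the condition is True and takes the rest from the second argument, so the condition is the inverse of the replacement mask. The sample data and the integer bounds 0 to 99 are arbitrary assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # seeded only so the sketch is reproducible

# Hypothetical sample data with identical labels
df1 = pd.DataFrame({"a": [5, 50, 20], "b": [30, 7, 90]})
df2 = pd.DataFrame({"a": [9, 10, 20], "b": [1, 70, 80]})

# Random integers in [0, 100); the bounds are arbitrary for this sketch
randoms = pd.DataFrame(
    rng.integers(0, 100, size=df1.shape),
    index=df1.index,
    columns=df1.columns,
)

# Keep df1 where it is strictly greater than df2, else use the random value
replaced = df1.where(df1 > df2, randoms)
```

Unlike the boolean assignment, this returns a new DataFrame and leaves df1 untouched.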

edited Nov 30 '17 at 11:02

answered Nov 30 '17 at 10:42

Pietro Tortella


Why not df2 >= df1 ? – ziggy jones Nov 30 '17 at 10:53
Thanks! I really like this solution; it seems clean. It works for now, though memory will be an issue in the future, since I will be using very large dataframes (possibly >1 GB in size). – PyuPyuPyu Nov 30 '17 at 11:05
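If memory does become a problem, one possible workaround (a sketch, not something proposed in the answer) is to go column by column and draw only as many random integers as there are cells to replace, so that no full-size frame of randoms is ever built. The sample data and the bounds 0 to 99 are assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)  # seeded only so the sketch is reproducible

# Hypothetical sample data with identical labels
df1 = pd.DataFrame({"a": [5, 50, 20], "b": [30, 7, 90]})
df2 = pd.DataFrame({"a": [9, 10, 20], "b": [1, 70, 80]})

for col in df1.columns:
    # Boolean mask for this column only
    mask = (df2[col] >= df1[col]).to_numpy()
    if mask.any():
        # Draw exactly one integer per cell that needs replacing
        draws = rng.integers(0, 100, size=int(mask.sum()))
        df1.loc[mask, col] = draws  # modifies df1 in place
```

The loop runs over columns rather than individual values, so each step is still vectorised, and the extra memory at any moment should be roughly one column's mask plus its draws.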
