I have two pandas dataframes (df1 and df2) with exactly the same number of columns and rows. (The column and index names are the same as well.) The values in these two dataframes may or may not differ.

I want to compare every value in df1 with the value in the corresponding position in df2, and if the value in df2 is equal to or greater than the value in df1, I want to replace the value in df1 with a random integer.

So I thought I would want something like this (but preferably without any loops at all):

for every value in df1
    diff = df1.value - df2.value
    if diff < 1
        df1.value = random integer

I tried looking at pandas' df.replace function in combination with the df.where function, but I just can't seem to get it to work.

Edit: I want to add something I forgot previously. When assigning my random int, I want it to be within a range based on the corresponding value. So it will be:

for every value in df1
    diff = df1.value - df2.value
    if diff < 1
        df1.value = random integer in range(df1.value - 10, df1.value + 10)

I believe this is not possible with Pietro Tortella's answer, because I'm processing the dataframe as a whole.

Does anyone know how to solve this?

PyuPyuPyu
  • I can build a new dataframe based on df1, transform the values the way I want and then use that as a substitution. – PyuPyuPyu Nov 30 '17 at 15:01

1 Answer

If memory is not a concern, I would create a third DataFrame of random numbers, and make a substitution using the difference as a mask.

For instance, something like

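import numpy as np
import pandas as pd

# Random replacement values, shaped and labelled like df1.
# Note: randn draws standard-normal floats, not integers.
randoms = pd.DataFrame(
    np.random.randn(*df1.shape),
    index=df1.index,
    columns=df1.columns,
)

# Overwrite df1 wherever df2 is greater than or equal to df1.
mask = df2 >= df1
df1[mask] = randoms[mask]
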
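
The range requirement from the question's edit should still fit this whole-dataframe approach, because NumPy's integer generator accepts array-valued bounds. The following is only a sketch under stated assumptions: the sample frames are made up, the range is taken to be centred on the original df1 value with both ends inclusive (the question's pseudocode is ambiguous on this), and the names `rng`, `candidates` and `result` are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real df1 and df2 come from the question.
df1 = pd.DataFrame({"a": [5, 20, 7], "b": [50, 3, 9]})
df2 = pd.DataFrame({"a": [1, 25, 7], "b": [10, 8, 2]})

rng = np.random.default_rng(seed=0)  # seeded only to make the sketch repeatable

# Positions where the value in df2 is greater than or equal to the value in df1.
mask = df2 >= df1

# One candidate integer per cell, drawn from [value - 10, value + 10].
# Assumption: the range is centred on the original df1 value, ends inclusive.
# low and high are arrays, so every cell gets its own range.
values = df1.to_numpy()
candidates = pd.DataFrame(
    rng.integers(values - 10, values + 10, endpoint=True),
    index=df1.index,
    columns=df1.columns,
)

# Keep df1 where the mask is False, take the candidate where it is True.
result = df1.mask(mask, candidates)
print(result)
```

Here `DataFrame.mask` returns a new frame instead of modifying df1 in place; assigning with `df1[mask] = candidates[mask]` would be the in-place equivalent. A candidate is generated for every cell, including those that are never used, which is the same memory trade-off mentioned at the top of this answer.
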
Pietro Tortella