I have a question about replacing values in a DataFrame. My dataset contains negative, positive, and zero values. I want to replace all negatives with '*', zeros with '_', and positives with '**'.

The data looks something like this:

      C1     C2     C3    C4
R1   1.2    0.0   -0.2   3
R2  -0.5    3.4    0.0   4
R3   0.5   -4     -2.8   0.0

The result should look like this:

     C1    C2    C3    C4
R1   **    _     *     **
R2   *     **    _     **
R3   **    *     *     _

I tried a few options, such as those given in "Replace negative values by zero". Unfortunately, once I replace values in a numeric DataFrame, I can no longer apply logical operations to it, so I could only replace one of the three kinds of values.

If anyone has an idea how to solve this, please suggest it. Thank you so much in advance.

mozway
sidrah maryam

2 Answers

Use np.sign to reduce each value to its sign, then replace to map the signs to symbols:

import numpy as np

out = np.sign(df).replace({-1: '*', 0: '_', 1: '**'})

Output:

    C1  C2 C3  C4
R1  **   _  *  **
R2   *  **  _  **
R3  **   *  *   _
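
For readers who want to run this end to end, here is a minimal sketch that splits the one-liner into two steps. The DataFrame is rebuilt by hand from the sample in the question, and the intermediate name `signs` is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "C1": [1.2, -0.5, 0.5],
        "C2": [0.0, 3.4, -4.0],
        "C3": [-0.2, 0.0, -2.8],
        "C4": [3.0, 4.0, 0.0],
    },
    index=["R1", "R2", "R3"],
)

# Step 1: np.sign maps every value to -1.0, 0.0 or 1.0 and keeps the labels.
signs = np.sign(df)

# Step 2: translate the three possible signs into the requested symbols.
out = signs.replace({-1: "*", 0: "_", 1: "**"})

print(signs)
print(out)
```

Because the comparison logic happens on the numeric frame before any strings are written, this avoids the problem described in the question, where a partly replaced DataFrame can no longer be compared against zero.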
mozway

Use numpy.select with the DataFrame constructor if performance is important:

df1 = pd.DataFrame(np.select([df.gt(0), df.lt(0)], ['**','*'], '_'), 
                   index=df.index, 
                   columns=df.columns)
print(df1)
    C1  C2 C3  C4
R1  **   _  *  **
R2   *  **  _  **
R3  **   *  *   _
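
A self-contained sketch of the same idea, again using the sample from the question. The name `symbols` and the explicit `.to_numpy()` calls are my own additions for clarity and are not required by the snippet above.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame(
    {
        "C1": [1.2, -0.5, 0.5],
        "C2": [0.0, 3.4, -4.0],
        "C3": [-0.2, 0.0, -2.8],
        "C4": [3.0, 4.0, 0.0],
    },
    index=["R1", "R2", "R3"],
)

# np.select checks the conditions in order and takes the first one that holds.
# Cells matching neither condition get the default "_": zeros here, and also
# NaN if the data contained any, since NaN is neither > 0 nor < 0.
symbols = np.select(
    [df.gt(0).to_numpy(), df.lt(0).to_numpy()],
    ["**", "*"],
    default="_",
)

# np.select returns a plain array, so the labels are restored by hand.
df1 = pd.DataFrame(symbols, index=df.index, columns=df.columns)
print(df1)
```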

Or, if overwriting the original values is acceptable:

df[:] = np.select([df.gt(0), df.lt(0)], ['**','*'], '_')
print(df)
    C1  C2 C3  C4
R1  **   _  *  **
R2   *  **  _  **
R3  **   *  *   _

Timing on a DataFrame with 1,000 rows and 1,000 columns:

np.random.seed(1230)

df = pd.DataFrame(np.random.random(size=(1000,1000))).sub(.5).mask(lambda x: x.gt(.2), 0)

In [200]: %timeit np.sign(df).replace({-1: '*', 0: '_', 1: '**'})
206 ms ± 1.99 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

In [201]: %timeit pd.DataFrame(np.select([df.gt(0), df.lt(0)], ['**','*'], '_'),  
                               index=df.index, columns=df.columns)
52.2 ms ± 203 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
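
Outside IPython, a comparison along these lines can be run with the standard timeit module. This is only a rough sketch: the helper names `with_sign_replace` and `with_select` are hypothetical, the random data comes from `np.random.default_rng` rather than the seed call above, and absolute timings will vary by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

# 1,000 x 1,000 values in [-0.5, 0.5); values above 0.2 are set to 0 so that
# the data contains exact zeros as well as positives and negatives.
rng = np.random.default_rng(1230)
df = pd.DataFrame(rng.random((1000, 1000)) - 0.5)
df = df.mask(df.gt(0.2), 0)


def with_sign_replace(frame):
    """np.sign followed by DataFrame.replace."""
    return np.sign(frame).replace({-1: "*", 0: "_", 1: "**"})


def with_select(frame):
    """np.select wrapped back into a DataFrame."""
    values = np.select([frame.gt(0), frame.lt(0)], ["**", "*"], "_")
    return pd.DataFrame(values, index=frame.index, columns=frame.columns)


# Check that both approaches agree before comparing their speed.
same = bool((with_sign_replace(df).to_numpy() == with_select(df).to_numpy()).all())

runs = 3
t_replace = timeit.timeit(lambda: with_sign_replace(df), number=runs) / runs
t_select = timeit.timeit(lambda: with_select(df), number=runs) / runs

print(f"identical results: {same}")
print(f"sign + replace: {t_replace * 1000:.1f} ms per call")
print(f"np.select:      {t_select * 1000:.1f} ms per call")
```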
jezrael