Replacing -999 with a number but I want all replaced number to be different

Question

I have a pandas DataFrame named df. In the df['salary'] column, 400 values are represented by the same placeholder, -999. I want to replace each -999 with a number between 200 and 500, and all 400 replacements should be different from one another. So far I have written this code:

df['salary'] = df['salary'].replace(-999, random.randint(200, 500))

but this code replaces every -999 with the same value. I want all the replaced values to be different from each other. How can I do this?

Henry Yik · Accepted Answer · 2021-06-19T08:24:37.350


You can use Series.mask with np.random.randint (its upper bound is exclusive, so this draws integers from 200 to 499):

import numpy as np
import pandas as pd

df = pd.DataFrame({"salary":[0,1,2,3,4,5,-999,-999,-999,1,3,5,-999]})

df['salary'] = df["salary"].mask(df["salary"].eq(-999), np.random.randint(200, 500, size=len(df)))

print(df)

    salary
0        0
1        1
2        2
3        3
4        4
5        5
6      413
7      497
8      234
9        1
10       3
11       5
12     341

If you want non-repeating numbers instead:

s = pd.Series(range(200, 500)).sample(frac=1).reset_index(drop=True)

df['salary'] = df["salary"].mask(df["salary"].eq(-999), s)
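A note on the non-repeating variant: `mask` aligns `s` with `df` by index label, so the snippet above relies on `df` having the default 0..n-1 index and at most 300 rows (the number of integers in `range(200, 500)`). Below is a minimal sketch of an alternative that draws exactly one distinct value per -999 entry and assigns through the boolean mask. The helper name `fill_unique`, its parameters, and the use of `np.random.default_rng` are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd


def fill_unique(series, sentinel=-999, low=200, high=500, seed=None):
    """Replace each `sentinel` with a distinct random integer in [low, high]."""
    rng = np.random.default_rng(seed)
    is_missing = series.eq(sentinel)
    n_missing = int(is_missing.sum())
    candidates = np.arange(low, high + 1)  # upper bound is included here
    if n_missing > candidates.size:
        raise ValueError(
            f"need {n_missing} distinct values but only {candidates.size} exist"
        )
    out = series.copy()
    # Sampling without replacement guarantees no repeats among the fills.
    out[is_missing] = rng.choice(candidates, size=n_missing, replace=False)
    return out


df = pd.DataFrame({"salary": [0, 1, 2, 3, 4, 5, -999, -999, -999, 1, 3, 5, -999]})
df["salary"] = fill_unique(df["salary"], seed=0)
print(df)
```

With 400 values to replace, this raises a ValueError, because there are only 301 integers from 200 to 500 inclusive. All-distinct integer replacements would need a wider range (or non-integer values).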

edited Jun 19 '21 at 08:24

answered Jun 19 '21 at 08:11

Henry Yik


Are you sure `np.random.randint` does not repeat numbers? – 99_m4n Jun 19 '21 at 08:14
I just re-read his Q and you might be right, added in the above. – Henry Yik Jun 19 '21 at 08:24
`np.random.randint(200, 500, size=len(df))` replaces the -999 values with different values every time I run it. How can I apply something like random_state to it? – Adarsh Wase Jun 21 '21 at 05:17
See [`np.random.seed`](https://stackoverflow.com/questions/21494489/what-does-numpy-random-seed0-do). – Henry Yik Jun 21 '21 at 08:06
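
To make the replacement reproducible, one option is to seed the generator before drawing. Here is a small sketch, assuming NumPy's `default_rng` (the helper name `fill_random` and the seed value 42 are arbitrary choices for illustration). Calling `np.random.seed(42)` right before `np.random.randint` should work in the same spirit with the legacy API.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"salary": [0, 1, 2, 3, 4, 5, -999, -999, -999, 1, 3, 5, -999]})


def fill_random(series, seed):
    """Replace -999 with random integers; the same seed gives the same fills."""
    rng = np.random.default_rng(seed)
    # The upper bound of rng.integers is exclusive, so this draws from 200..499.
    fills = rng.integers(200, 500, size=len(series))
    return series.mask(series.eq(-999), fills)


first = fill_random(df["salary"], seed=42)
second = fill_random(df["salary"], seed=42)
print(first.equals(second))  # True: identical seed, identical replacements
```

Note that the values can still repeat here; seeding only fixes which values are drawn.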
