Is there a way to replace specific values in a Dataframe respectively with others?

Question

I'm currently working on a machine learning project where I have to replace -99 values (which stand for NaN) with the mean of each column. However, I cannot get the correct values replaced: only the first mean is applied across all columns. What I need is for each -99 to be replaced by the mean of its own column.

I produce the means for each column first:

mean_miss = []

for i in df_train[vars_ind_numeric]:
    mean_miss = df_train[vars_ind_numeric].mean()

Then I run the replacement:

for var in df_train[vars_ind_numeric]:
    df_train[vars_ind_numeric] = df_train[vars_ind_numeric]\
        .replace(nan, mean_miss[var])

Any idea how to fix this? Thanks in advance.

Welcome to SO. Could you please add sample input data and the required output? Please see https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples. – Roy2012, Jun 24 '20 at 17:44

score 0 · Answer 1 · answered Jun 24 '20 at 17:49

If what you're looking for is to fill NA values in each column with the column mean, here's a solution (for dummy data):

import numpy as np
import pandas as pd

# Dummy data with a few missing values in each column
df = pd.DataFrame({"a": range(10), "b": range(10, 20)})
df.loc[5, "a"] = np.nan
df.loc[9, "a"] = np.nan
df.loc[7, "b"] = np.nan

The resulting data is:

     a     b
0  0.0  10.0
1  1.0  11.0
2  2.0  12.0
3  3.0  13.0
4  4.0  14.0
5  NaN  15.0
6  6.0  16.0
7  7.0   NaN
8  8.0  18.0
9  NaN  19.0

The mean values are:

print(df.mean())
a     3.875000
b    14.222222
dtype: float64

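Now fill each column's missing values with that column's mean (fillna returns a new DataFrame, so assign the result if you want to keep it):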

df.fillna(df.mean())

       a          b
0  0.000  10.000000
1  1.000  11.000000
2  2.000  12.000000
3  3.000  13.000000
4  4.000  14.000000
5  3.875  15.000000
6  6.000  16.000000
7  7.000  14.222222
8  8.000  18.000000
9  3.875  19.000000
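
To apply the same idea to only some columns and keep the result, one option is to assign back to the selected columns. This is a minimal sketch, assuming a list of numeric column names like the question's vars_ind_numeric; the data and the "label" column are made up for illustration:

```python
import numpy as np
import pandas as pd

# Hypothetical data: two numeric columns with gaps and one non-numeric column.
df_train = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [10.0, 20.0, np.nan],
    "label": ["x", "y", "z"],
})
vars_ind_numeric = ["a", "b"]  # assumed list of numeric column names

# mean() skips NaN by default and returns one value per column;
# fillna() aligns that Series with the columns by name.
col_means = df_train[vars_ind_numeric].mean()
df_train[vars_ind_numeric] = df_train[vars_ind_numeric].fillna(col_means)
```

No loop over the columns is needed, because fillna matches each mean to its column by label.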

answered Jun 24 '20 at 17:49

Roy2012


Thanks, but my NaN values are encoded as the number -99, so wherever that appears in a column it has to be replaced by the column's mean. – Mr. C. Developer Jun 25 '20 at 12:09
I have managed to convert all the -99 and infinite values to NaN and then fill them with the mean. Thanks again. – Mr. C. Developer Jun 25 '20 at 12:19
If this answers your question, do you mind accepting it for future generations? (click the checkmark next to the answer) – Roy2012 Jun 25 '20 at 12:45
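
The approach described in the comments above (turn -99 and infinite values into NaN, then fill with the column mean) could look roughly like this. It is only a sketch on made-up data; the sentinel -99 comes from the question, and everything else is illustrative:

```python
import numpy as np
import pandas as pd

# Hypothetical data using -99 as a missing-value marker, plus one infinity.
df = pd.DataFrame({
    "a": [1.0, -99.0, 3.0, np.inf],
    "b": [10.0, 20.0, -99.0, 30.0],
})

# Mark the sentinel and infinities as missing first,
# so they do not distort the column means.
cleaned = df.replace([-99, np.inf, -np.inf], np.nan)

# Each column's NaN is then filled with that column's own mean.
filled = cleaned.fillna(cleaned.mean())
```

Converting to NaN before computing the means matters: if -99 were still in the data, it would be averaged in and pull every mean down.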
