I want to replace all numerical values less than 120 in a column with the average value of that same column, using data from a CSV file. I imported the CSV file with pandas and got the complete table as a data frame named data. To get one column I write data.steam, and to calculate the average of that column I write average_steam = data.steam.mean(); print average_steam then returns 123. So I want every value in the steam column that is less than 120 to be replaced by 123. For example, if I have 12, 90, 130, 128, 110, I want to get 123, 123, 130, 128, 123. All necessary libraries are imported.

The code I tried:

data.steam 
average_steam=data.steam.mean() 
print average_steam
data.steamin.replace(data.steamin<=120,average_steam, inplace=True)

Prosper
  • Your answer should work for Python 3 at least. I just tried it. – Rohith Feb 25 '20 at 00:07
  • What's wrong with that code? Also, why are you using Python 2? – AMC Feb 25 '20 at 00:08
  • Does this answer your question? [replace values in column in data frame with average value](https://stackoverflow.com/questions/60341417/replace-values-in-column-in-data-frame-with-average-value) – Caperneoignis Feb 26 '20 at 18:53
  • This is a duplicate of a question you already asked. [here](https://stackoverflow.com/questions/60341417/replace-values-in-column-in-data-frame-with-average-value) – Caperneoignis Feb 26 '20 at 18:54

2 Answers

If df is your pd.DataFrame and x is the column to modify, then try:

df.x[df.x<120]=df.x.mean()
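A note on the one-liner above: it indexes twice (first df.x, then [...]), and depending on the pandas version that kind of chained assignment may emit a warning or write to a temporary copy instead of df. As a hedged alternative, not the method from this answer, the sketch below uses Series.mask and assigns the result back to the column. The sample frame is an assumption built from the five example values in the question.

```python
import pandas as pd

# Hypothetical sample data: the five example values from the question.
df = pd.DataFrame({"x": [12.0, 90.0, 130.0, 128.0, 110.0]})

# Series.mask replaces values where the condition is True.
# The mean is computed on the original column before anything is replaced.
df["x"] = df["x"].mask(df["x"] < 120, df["x"].mean())

print(df["x"].tolist())
```

With only these five values the mean is 94, so the small values become 94 here rather than the 123 reported for the full column in the question.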

ipj
  • This is wrong. It will replace all columns instead of just the column x. You should replace the first df with df.x – Rohith Feb 25 '20 at 00:15

This was answered here.

In your case, it would be:

data.loc[data['steam'] < 120, 'steam'] = average_steam

Here is what is happening:

By using data.loc you select rows and columns in one step. The first argument to loc selects the rows: here, the rows where the value in the steam column is less than 120. The second argument selects the columns: here, just steam. Together they pick out every cell in the steam column whose value is less than 120, and the assignment then sets those cells to average_steam.
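To make the row and column selection concrete, here is a minimal, self-contained sketch. The frame is an assumption built from the five example values in the question (your real data comes from the CSV), so the average works out to 94 here instead of 123.

```python
import pandas as pd

# Hypothetical stand-in for the CSV data: the example values from the question.
data = pd.DataFrame({"steam": [12.0, 90.0, 130.0, 128.0, 110.0]})

# Compute the average first, before any value is replaced.
average_steam = data["steam"].mean()

# Boolean mask: True for rows whose steam value is below 120.
below_threshold = data["steam"] < 120

# Rows = the mask, column = 'steam'; only those cells are overwritten.
data.loc[below_threshold, "steam"] = average_steam

print(data["steam"].tolist())
```

Storing the mask in a variable is optional; it just makes the two arguments to loc easier to read.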

Sinan Kurmus