2

I have a sample dataset here. In real case, it has a train and test dataset. Both of them have around 300 columns and 800 rows. I want to filter out all those rows based on a certain value in one column and then set all values in that row from column 3 e.g. to column 50 to zero. How can I do it?

Sample dataset:

import pandas as pd 

data = {'Name':['Jai', 'Princi', 'Gaurav','Princi','Anuj','Nancy'], 
    'Age':[27, 24, 22, 32,66,43], 
    'Address':['Delhi', 'Kanpur', 'Allahabad', 'Kannauj', 'Katauj', 'vbinauj'], 
'Payment':[15,20,40,50,3,23],
    'Qualification':['Msc', 'MA', 'MCA', 'Phd','MA','MS']} 

df = pd.DataFrame(data) 
df

Here is the output of sample dataset:

    Name    Age   Address    Payment    Qualification
0   Jai     27     Delhi      15          Msc
1   Princi  24     Kanpur     20          MA
2   Gaurav  22     Allahabad  40          MCA
3   Princi  32     Kannauj    50          Phd
4   Anuj    66     Katauj     3           MA
5   Nancy   43     vbinauj    23          MS

As you can see, in the first column, there values =="Princi", So if I find rows that Name column value =="Princi", then I want to set column "Address" and "Payment" in those rows to zero.

Here is the expected output:

    Name    Age   Address    Payment    Qualification
0   Jai     27     Delhi      15          Msc
1   Princi  24     0          0           MA     #this row
2   Gaurav  22     Allahabad  40          MCA
3   Princi  32     0          0           Phd     #this row
4   Anuj    66     Katauj     3           MA
5   Nancy   43     vbinauj    23          MS

In my real dataset, I tried:

    train.loc[:, 'got':'tod']# get those columns # I could select all those columns
and train.loc[df['column_wanted'] == "that value"] # I got all those rows

But how can I combine them? Thanks for your help!

almo
  • 561
  • 2
  • 16

1 Answers1

4

Use the loc accessor; df.loc[boolean selection, columns]

df.loc[df['Name'].eq('Princi'),'Address':'Payment']=0



      Name  Age    Address  Payment Qualification
0     Jai   27      Delhi       15           Msc
1  Princi   24          0        0            MA
2  Gaurav   22  Allahabad       40           MCA
3  Princi   32          0        0           Phd
4    Anuj   66     Katauj        3            MA
5   Nancy   43    vbinauj       23            MS
wwnde
  • 26,119
  • 6
  • 18
  • 32
  • thanks for your help! In the real dataset, it covers many columns that needs to be changed to 0 i.e. about 200 columns. I used like ['column xxx': 'column ccc'] with that approach, it did not accept – almo Oct 04 '20 at 21:26
  • Do those columns follow each other? if they do, you can use `df.loc[df['Name'].eq('Princi'),'Address':'Payment']`, if they dont, put those columns in a list as the solution indicated. It should work – wwnde Oct 04 '20 at 21:30
  • Yes, it worked. I made some mistakes in my original one. Can you change it as Address':'Payment in your answer? This is a multiple columns problem. Thank you! – almo Oct 04 '20 at 21:36