I need to clean a DataFrame and would like to select only the columns that have a specific value in one of the rows. For instance, extracting only those columns where the value in row number 3 is NaN.

Glassmanet
  • Does this answer your question? [How to select rows from a DataFrame based on column values](https://stackoverflow.com/questions/17071871/how-to-select-rows-from-a-dataframe-based-on-column-values) – Joe Ferndz Nov 27 '20 at 23:59
  • for future reference please review these docs: [how to provide a great pandas example](http://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples) as well as how to provide a [minimal, complete, and verifiable example](http://stackoverflow.com/help/mcve). These tips on [how to ask a good question](http://stackoverflow.com/help/how-to-ask) may also be useful. – piterbarg Nov 28 '20 at 00:27

1 Answer

Joe's answer shows how to get rows based on column values, but it seems like you want to get columns based on row values. Here's a simple way to achieve this using a list comprehension.

In [44]: import pandas as pd
In [45]: df = pd.DataFrame({'one': [2, 3, 4], 'two': [5, 6, 7], 'three': [8, 6, 1]})
In [46]: df
Out[46]: 
   one  two  three
0    2    5      8
1    3    6      6
2    4    7      1

Now we'll assign variables for the row we're looking at and the value that must be in that row for a column to be kept. Then we build the list of matching column names with a list comprehension and give the filtered df a new name.

In [50]: row = 1
In [51]: value = 6
In [53]: list_comp = [c for c in df.columns if df.loc[row, c] == value]
In [54]: filtered_df = df[list_comp]
In [55]: filtered_df
Out[55]: 
   two  three
0    5      8
1    6      6
2    7      1
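
For larger frames, the same selection can probably be done without a Python-level loop by building a boolean mask from the row and passing it to `df.loc`. The sketch below is one way to do that, not the only one. The helper names `columns_where_row_equals` and `columns_where_row_is_nan` and the `df_nan` example frame are made up for illustration, and it assumes `row_label` is an index label (with the default index, that is the same as the row number). For the NaN case in the question, `isna()` is used because comparing with `== np.nan` is always False.

```python
import numpy as np
import pandas as pd


def columns_where_row_equals(df, row_label, value):
    """Keep the columns whose entry in the row labelled `row_label` equals `value`."""
    # df.loc[row_label] is a Series indexed by column name; comparing it
    # gives one boolean per column.
    mask = df.loc[row_label] == value
    return df.loc[:, mask]


def columns_where_row_is_nan(df, row_label):
    """Keep the columns whose entry in the row labelled `row_label` is NaN."""
    mask = df.loc[row_label].isna()
    return df.loc[:, mask]


# Same data as in the answer above.
df = pd.DataFrame({'one': [2, 3, 4], 'two': [5, 6, 7], 'three': [8, 6, 1]})
filtered_df = columns_where_row_equals(df, 1, 6)

# Hypothetical frame with missing values, for the NaN case.
df_nan = pd.DataFrame({'a': [1.0, np.nan], 'b': [2.0, 3.0], 'c': [np.nan, np.nan]})
nan_cols = columns_where_row_is_nan(df_nan, 1)
```

Here `filtered_df` holds the `two` and `three` columns, matching the list-comprehension result, and `nan_cols` holds the `a` and `c` columns of `df_nan`.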

  • Is checking each column manually the fastest way to do this? Is there no method native to pandas that is optimized for such searches? – shaha Dec 02 '21 at 23:49