9

I am trying to select data, read in from a file, that is represented by the values one and zero. I want to select rows from a list of values and, at the same time, select any column in which each of the selected rows has a value of one. To make it more complex, I also want to select rows from a list of values where all values in a column for these rows are zero. Is this possible? If a method other than a pandas DataFrame would work better, I would be willing to try that.

To be clear, any column may be selected and I do not know which ones ahead of time.

Thanks!

burkesquires

1 Answer

11

You can use `all()`, `any()`, and `iloc[]`. Check the official documentation, or this thread, for more details.

import pandas as pd
import random
import numpy as np


# Create dummy data, since the question did not provide any
df = pd.DataFrame({'col1':  [random.getrandbits(1) for i in range(10)], 'col2':  [random.getrandbits(1) for i in range(10)], 'col3': [1]*10})
print(df)

# You can select a single value directly with iloc[]
# df.iloc selects by position; df.loc selects by label
row_indexer, column_indexer = 3, 1
print(df.iloc[row_indexer,column_indexer])

# Filter rows by the value in a specific column
print(df[df['col1']==1])
print(df[df['col2']==1])

# Rows in which at least one column equals 1 (this filters rows, not columns)
print(df[(df == 1).any(axis=1)])

# To filter on specific columns, combine the row conditions with | (or)
print(df[(df['col1']==1)|(df['col2']==1)])

# Rows in which every column equals 0 (empty here, since col3 is always 1)
print(df[(df == 0).all(axis=1)])

# To filter on specific columns, combine the row conditions with & (and)
print(df[(df['col1']==0) & (df['col2']==0)])
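The filters above keep or drop rows. If the goal is the one described in the question (pick rows from a list, then keep only the columns where every picked row is 1, or every picked row is 0), something along these lines may be closer. This is only a sketch: the data, the row labels `r0`..`r3`, and the column names are made up for illustration.

```python
import pandas as pd

# Hypothetical 0/1 data; row labels and column names are invented for this example.
df = pd.DataFrame(
    {"a": [1, 1, 0, 1], "b": [1, 0, 0, 1], "c": [0, 0, 1, 0], "d": [1, 1, 1, 1]},
    index=["r0", "r1", "r2", "r3"],
)

rows = ["r0", "r3"]      # the list of rows to select
subset = df.loc[rows]    # label-based row selection

# Keep the columns in which every selected row is 1.
# (subset == 1).all() reduces down each column, giving one boolean per column.
all_ones = subset.loc[:, (subset == 1).all()]

# Keep the columns in which every selected row is 0.
all_zeros = subset.loc[:, (subset == 0).all()]

print(all_ones)
print(all_zeros)
```

The columns do not need to be known ahead of time, because the boolean mask is computed from whichever rows are selected. If the rows are identified by position rather than by label, `df.iloc[[0, 3]]` could be used in place of `df.loc[rows]`.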
Alireza Mazochi
  • 2
    I think we're trying to encourage people to use `.loc` or `.iloc` instead of `.ix` these days, because of `.ix`'s hard-to-explain semantics. – DSM Nov 12 '14 at 21:56
  • Good point @DSM **.loc/.iloc** were introduced in 0.11 and encouraged to be used for user indexing choice. –  Nov 12 '14 at 22:04