I have some protein structural data, which I have cast as a Pandas DataFrame.

There is a column holding the amino acid residue name, which is labeled resi_name.

I'd like to select all rows whose resi_name value appears in a given list. For now, I can select the rows that contain ALA using:

hydrophobic_residues = ['ALA', 'VAL', 'LEU', 'ILE', 'MET', 'PHE', 'TRP', 'PRO', 'TYR']
resi1 = resi1[(resi1['resi_name'].str.contains('ALA'))]['resi_num'].values

How do I select every row containing one of the hydrophobic residues, without writing more conditionals inside the data frame selector? From what I can see, the string methods described in the Pandas documentation don't accept a list of values.

ericmjl
  • Bummer... it looks like I found the answer here: http://stackoverflow.com/questions/22485375/efficiently-select-rows-that-match-one-of-several-values-in-pandas-dataframe?rq=1. Moderators: should I delete this question? – ericmjl May 22 '15 at 02:25

1 Answer

Try this:

resi1.loc[resi1['resi_name'].isin(hydrophobic_residues), :]
Alexander
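
For anyone who wants to try the isin approach end to end, here is a minimal sketch. The toy DataFrame is made up for illustration; only the column names resi_name and resi_num and the hydrophobic_residues list come from the question. The names mask, hydrophobic_rows and hydrophobic_nums are hypothetical.

```python
import pandas as pd

hydrophobic_residues = ['ALA', 'VAL', 'LEU', 'ILE', 'MET', 'PHE', 'TRP', 'PRO', 'TYR']

# Hypothetical toy data standing in for the real structural table.
resi1 = pd.DataFrame({
    'resi_name': ['ALA', 'GLY', 'LEU', 'SER', 'TRP'],
    'resi_num': [1, 2, 3, 4, 5],
})

# Boolean mask: True where resi_name exactly equals one of the listed residues.
mask = resi1['resi_name'].isin(hydrophobic_residues)

# All columns of the matching rows.
hydrophobic_rows = resi1.loc[mask, :]

# Just the residue numbers, as a NumPy array (as in the question's one-liner).
hydrophobic_nums = resi1.loc[mask, 'resi_num'].values
```

Note that isin tests for exact equality with the list entries, whereas str.contains does substring (by default regex) matching, so the two can differ if the residue names carry extra characters.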