from Bio.PDB import PDBParser, Selection

# Load the structure from a local PDB file
structure = PDBParser().get_structure('4GBX', '4GBX.pdb')
# Unfold chain 'E' of the first model down to atom level ('A' = atoms)
atom_list = Selection.unfold_entities(structure[0]['E'], 'A')

When I unfold chain E of PDB entry 4GBX using the code above, the last two oxygen atoms in atom_list belong to water heteroatoms in the same chain. How can I get a list containing only the protein residue atoms, excluding ligands and water molecules?

mahaswap
  • I seem to be a bit blind - 4gbx (https://www.rcsb.org/structure/4GBX) ... chain E? Are you referring to atom numbers 6089 and 6091? Those belong to THR, though, no? – quant Nov 15 '18 at 20:44
  • 4gbx: chain E: water residue numbers 101 and 102: atom numbers 6190 and 6191. Thanks. – mahaswap Nov 15 '18 at 21:21
  • `atom_list[-1].get_full_id() Out[21]: ('4GBX', 0, 'E', ('W', 102, ' '), ('O', ' '))` The 'W' here is for water heteroatom. – mahaswap Nov 15 '18 at 21:33
  • Yes, saw that too. However, are you sure that the 'W' only occurs for heteroatoms? If yes, you can use something like `print([atom for atom in atom_list if atom.get_full_id()[3][0] == " "])` to filter your list ... – quant Nov 15 '18 at 21:35
  • Thanks @quant. I think that should do the trick. I guess I need a coffee or two. I remember now that I had done the exact same thing some months ago. – mahaswap Nov 15 '18 at 21:58
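
The filter suggested in the comments can be sketched as a small helper. This is only a minimal sketch, not an official Biopython recipe: the function names `is_standard_residue_atom`, `protein_atoms` and `load_chain_atoms` are made up for illustration. It assumes that the first field of the residue id (the hetero flag) is a blank `' '` for ordinary ATOM residues, `'W'` for waters, and an `'H_...'` string for other HETATM residues.

```python
def is_standard_residue_atom(atom):
    """Return True if the atom's parent residue has a blank hetero flag."""
    # get_full_id() returns
    #   (structure_id, model_id, chain_id, residue_id, atom_id)
    # and residue_id is (hetero_flag, sequence_number, insertion_code).
    hetero_flag = atom.get_full_id()[3][0]
    return hetero_flag == " "


def protein_atoms(atoms):
    """Keep only atoms from non-hetero residues (drops waters and ligands)."""
    return [atom for atom in atoms if is_standard_residue_atom(atom)]


def load_chain_atoms(pdb_path, structure_id, chain_id, model_id=0):
    """Hypothetical convenience wrapper around the code from the question."""
    # Imported lazily so the two filter helpers above work without Biopython.
    from Bio.PDB import PDBParser, Selection

    structure = PDBParser(QUIET=True).get_structure(structure_id, pdb_path)
    # 'A' unfolds the chain down to atom level.
    return Selection.unfold_entities(structure[model_id][chain_id], "A")
```

With the file from the question this would be called as `atom_list = protein_atoms(load_chain_atoms('4GBX.pdb', '4GBX', 'E'))`. One caveat: modified residues that the PDB file stores as HETATM records (selenomethionine, MSE, is a common case) also carry an `'H_...'` flag and are dropped by this filter. If those should be kept, `Bio.PDB.Polypeptide.is_aa(residue)` applied to `atom.get_parent()` may be a better test.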

0 Answers