
As part of parsing a PDB file, I've extracted a set of coordinates (x, y, z) for particular atoms, which I want to store as floats. However, I also need to know how many sets of coordinates I have extracted.

Below is my code through the coordinate extraction, and what I thought would produce the count of how many sets of three coordinates I've extracted.

When using `len(coordinates)`, I unfortunately get back 3 for every set, i.e. the number of values in each tuple (the x, y, and z coordinates), rather than the number of sets.

Any insight into how to properly count the number of sets would be helpful. I'm quite new to Python and am still at the stage of being unsure whether I am even asking this correctly!

from sys import argv

with open(argv[1]) as pbd:
    print()
    for line in pbd:
        if line[:4] == 'ATOM':
            atom_type = line[13:16]
            if atom_type == "CA" or "N" or "C":

                x = float(line[31:38])
                y = float(line[39:46])
                z = float(line[47:54])

                coordinates = (x, y, z)

                # printing (coordinates) gives
                # (36.886, 53.177, 21.887)
                # (38.323, 52.817, 21.996)
                # (38.493, 51.553, 22.83)
                # (37.73, 51.314, 23.77)

                print(len(coordinates)) 

                # printing len(coordinates) gives
                # 3
                # 3
                # 3
                # 3

Thank you for any insight!

biop91
  • Aside: `atom_type == "CA" or "N" or "C"` doesn't do what you think it does. See the answer [here](https://stackoverflow.com/questions/20002503/why-does-a-b-or-c-or-d-always-evaluate-to-true). – DSM Jul 22 '18 at 20:37
  • @DSM wow, incredibly helpful. I did not realize that I had been pulling specific atom identities incorrectly. Thank you so much for your comment. – biop91 Jul 22 '18 at 22:24

2 Answers


If you want to count how many atoms of specific types appear in your file, try this:

from sys import argv

with open(argv[1]) as pbd:
    print()
    atomCount = 0
    for line in pbd:
        if line[:4] == 'ATOM':
            atom_type = line[13:16]
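            # Strip the padding from the fixed-width field and test membership;
            # `atom_type == "CA" or "N" or "C"` is always truthy.
            if atom_type.strip() in ("CA", "N", "C"):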
                atomCount += 1
    print(atomCount)

This walks through the whole PDB file and checks the type of each atom (it seems to be the fourth column in your data). Each time one of the desired atom types is encountered, the counter variable is incremented by 1.
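For reference, the difference between the chained `or` condition flagged in the comments on the question and a membership test can be checked in isolation. This is only a minimal sketch: `matches_chained_or` and `matches_membership` are hypothetical helper names, and the padded strings stand in for what `line[13:16]` might return.

```python
# Hypothetical helpers, for illustration only.
def matches_chained_or(atom_type):
    # Parsed as (atom_type == "CA") or "N" or "C".
    # The non-empty string "N" is truthy, so this is True for any input.
    return bool(atom_type == "CA" or "N" or "C")


def matches_membership(atom_type):
    # Strip padding from the fixed-width field, then test membership.
    return atom_type.strip() in ("CA", "N", "C")


print(matches_chained_or("O  "))   # True, even though "O" is not wanted
print(matches_membership("O  "))   # False
print(matches_membership("CA "))   # True
```

With the chained `or`, the counter would end up counting every `ATOM` line rather than only the selected atom types.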

unlut

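Your `coordinates` variable is a tuple that is overwritten on every pass through the loop, so `len(coordinates)` is always 3 (x, y, and z). Tuples are ordered and immutable; to count the sets, append each one to a list and take the length of that list.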

coordinates = []
for ....:
    coordinates.append([x, y, z])
len(coordinates)  # should be 4 I guess (the four sets printed in the question)
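Put together, the list approach might look like the sketch below. It is not a definitive implementation: `make_atom_line` and `collect_coordinates` are hypothetical helpers, the sample records are generated in memory rather than read from a file, and the atom names attached to the question's coordinate values are assumptions. The slices are the ones used in the question.

```python
def make_atom_line(serial, name, x, y, z):
    # Hypothetical helper: builds a fixed-width ATOM record so the
    # sketch runs without a real PDB file.
    return f"ATOM  {serial:>5}  {name:<3} MET A   1    {x:8.3f}{y:8.3f}{z:8.3f}"


def collect_coordinates(lines, wanted=("N", "CA", "C")):
    # Gather one (x, y, z) tuple per matching ATOM record.
    coordinates = []
    for line in lines:
        if line[:4] == "ATOM" and line[13:16].strip() in wanted:
            x = float(line[31:38])
            y = float(line[39:46])
            z = float(line[47:54])
            coordinates.append((x, y, z))
    return coordinates


# Made-up sample records; the atom names are assumed for illustration.
sample = [
    make_atom_line(1, "N", 36.886, 53.177, 21.887),
    make_atom_line(2, "CA", 38.323, 52.817, 21.996),
    make_atom_line(3, "C", 38.493, 51.553, 22.830),
    make_atom_line(4, "O", 37.730, 51.314, 23.770),
]

coordinates = collect_coordinates(sample)
print(len(coordinates))  # 3: the O record is filtered out
```

Here `len(coordinates)` is the number of sets, while `len(coordinates[0])` is still 3, the number of values in one set.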
avermaet