-1

I have a class that is essentially supposed to be able to initialize itself and the set internal values based on a line of text (string). This seems to work fine when creating a single instance, however, upon creating a second instance, the line of text fed into the first instance is apparently getting fed into one of the internal variables on the second instance! The init constructor for the class is defined with default values for all relevant parameters which get passed to corresponding internal variables. Specifically, the 'prefixComments' parameter has a default of [] which means that the 'self.PrefixComments' should be set to the same thing (an empty list)... unfortunately, it is apparently getting set to be the line of text that was used to create the previous object (or, at least, that is my best guess).

I am truly puzzled by what is going on here. Any ideas on how to fix it. code and output follows:

code:

import collections
from collections import OrderedDict
import numpy as np
import string
import re
import gc

class AtomEntry:
    def __init__(self,atomName="",atomType="",charge=0.0,
                 comment="",prefixComments=[],
                 verbose=False):
        self.Name=atomName
        self.Type=atomType
        self.Charge=charge
        self.Comment=comment
        self.PrefixComments=prefixComments

    def from_line_string(self,line):
        #returns 1 if an error is encountered, 0 if successful
        lineTokens=line.split()
        if len(lineTokens)<4:
            print("Error: too few entries to construct ATOM record")
            return(1)
        elif lineTokens[0] != "ATOM":
            print("ERROR: atom entries must begin with the keyword 'ATOM'")
            return(1)
        else:
            self.Name=lineTokens[1]
            self.Type=lineTokens[2]
            self.Charge=float(lineTokens[3])
            if len(lineTokens) >=5:
                self.Comment=string.replace(
                    s=' '.join(lineTokens[5:len(lineTokens)]),
                    old='!',new='')
        return(0)

    def to_str(self,nameOnly=False):
        if nameOnly:
            return "%s"%(self.Name)
        else: 
            return repr(self)

    def __repr__(self):
        outStrs=self.PrefixComments
        outStrs.append(
            "ATOM %6s %6s %6.3f !%s"%(
                self.Name,self.Type,self.Charge,self.Comment))
        return ''.join(outStrs)

tempAtom1=AtomEntry()
tempAtom1.from_line_string("ATOM S1     SG2R50 -0.067 !   93.531")
print tempAtom1
print ""
gc.collect()
tempAtom2=AtomEntry()
tempAtom2.from_line_string("ATOM C1     CG2R53  0.443 !   83.436")
print tempAtom2
print""

print tempAtom2.Name
print tempAtom2.Type
print tempAtom2.Charge
print tempAtom2.Comment
print tempAtom2.PrefixComments

gc.collect()

output:

ATOM     S1 SG2R50 -0.067 !93.531

ATOM     S1 SG2R50 -0.067 !93.531ATOM     C1 CG2R53  0.443 !83.436

C1
CG2R53
0.443
83.436
['ATOM     S1 SG2R50 -0.067 !93.531', 'ATOM     C1 CG2R53  0.443 !83.436']
wmsmith
  • 542
  • 4
  • 15
  • Please try to post a [mcve], emphasis on the **Minimal**. – Eb946207 Dec 20 '18 at 01:54
  • 1
    @Prune: It's actually two different mutable `list` issues; one matches your duplicate, but they have a second problem too. I've reopened to ensure that part isn't missed. – ShadowRanger Dec 20 '18 at 01:57
  • Should the 'mutablelist' keyword be added to this post now that I know this was the core issue? – wmsmith Dec 20 '18 at 18:23

1 Answers1

3

You've got two issues, both relating to reusing lists. One, you used a mutable default argument for prefixComments/self.PrefixComments. Don't do that. Change the initializer to:

def __init__(self,atomName="",atomType="",charge=0.0,
             comment="",prefixComments=(),
             verbose=False):
    self.Name=atomName
    self.Type=atomType
    self.Charge=charge
    self.Comment=comment
    self.PrefixComments = list(prefixComments)

to avoid receiving a mutable argument in the first place, and to explicitly shallow copy it to a new list so the caller's argument is unlinked from the attribute.

Secondly, your __repr__ is modifying this attribute, so __repr__ isn't idempotent; it will build up and up each time you call it. Fix that too:

def __repr__(self):
    outStrs=self.PrefixComments[:]  # Shallow copy the l
    outStrs.append(
        "ATOM %6s %6s %6.3f !%s"%(
            self.Name,self.Type,self.Charge,self.Comment))
    return ''.join(outStrs)

Side-note: from_line_string should really be an alternate constructor, so you can directly use it to make a new instance from a string, rather than making a blank object only to reinitialize it on the next line. It's an easy fix; just change it to a classmethod that parses, then calls the regular constructor (and raises exceptions on error, rather than using C style return codes that make it easy to miss errors):

# Makes sure this works either on the class or an instance of the class
# as a constructor of a brand new instance
@classmethod
def from_line_string(cls, line):
    # Returns a new instance, or raises an exception if an error is encountered
    lineTokens = line.split()
    if len(lineTokens) < 4:
        raise ValueError("too few entries to construct ATOM record")
    elif lineTokens[0] != "ATOM":
        raise ValueError("atom entries must begin with the keyword 'ATOM'")

    name=lineTokens[1]
    type=lineTokens[2]
    charge=float(lineTokens[3])
    # This works fine, producing an empty string, even if lineTokens is
    # doesn't have an index 5 or higher; comment will be the empty string
    comment = ' '.join(lineTokens[5:]).replace('!', '')
    return cls(name, type, charge, comment)

That would simplify your use from:

tempAtom1=AtomEntry()
tempAtom1.from_line_string("ATOM S1     SG2R50 -0.067 !   93.531")

to:

tempAtom1 = AtomEntry.from_line_string("ATOM S1     SG2R50 -0.067 !   93.531")

You'd probably also want to make most of the arguments to __init__ mandatory (no defaults, aside from comment and prefixComment), since the other three attributes are required, and you no longer need to make empty instances just to reinitialize them with from_line_string.

ShadowRanger
  • 143,180
  • 12
  • 188
  • 271
  • Fascinating. So essentially the [] i set as a default is being used and modified. Did not realize that could happen. Good to know. Thanks for the help. For that last point though, I actually do need to be able to create 'dummy' atoms that have some default and / or blank values. I could certainly just feed those in every time, but that seems cumbersome. When i read it in from a file, the fields need to all be present as a data integrity check, but this will be used elsewhere where it may need to be, essentially, an empty placeholder. – wmsmith Dec 20 '18 at 02:43
  • In the case of this AtomEntry class, using the read_from_file command as an alternative constructor is perfect... How would this work for more complicated classes, however, that may need to iteratively load in data from multiple files? E.g. to build up lists of atoms, potentially in a dynamic manner. – wmsmith Dec 20 '18 at 03:00
  • 1
    @wmsmith: If you're building up many atoms into a single `list`, you'd use the alternate constructor to make each atom, but not to load whole files (that's not an alternate constructor, because it's not making a single instance of the class). If the file is sufficiently regular, a `read_from_file` alternate constructor that read a fixed number of lines (or just one) and parsed the line to a single instance would be feasible. – ShadowRanger Dec 20 '18 at 03:19