After reading the question "Least Astonishment" and the Mutable Default Argument and its various answers, as well as the official documentation (https://docs.python.org/3/tutorial/controlflow.html#default-argument-values), I've written my ResultsClass so that each instance gets its own separate lists instead of sharing mutable defaults (at least, that is what should be happening according to my newly gained understanding):
    class ResultsClass:
        def __init__(self,
                     project = None,
                     badpolicynames = None,
                     nonconformpolicydisks = None,
                     diskswithoutpolicies = None,
                     dailydifferences = None,
                     weeklydifferences = None):
            self.project = project
            if badpolicynames is None:
                self.badpolicynames = []
            if nonconformpolicydisks is None:
                self.nonconformpolicydisks = []
            if diskswithoutpolicies is None:
                self.diskswithoutpolicies = []
            if dailydifferences is None:
                self.dailydifferences = []
            if weeklydifferences is None:
                self.weeklydifferences = []
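Note that, as written, each list attribute is only assigned when its argument is None; passing an actual list would leave the attribute unset. A minimal sketch of the fuller pattern follows. The class name `ResultsSketch` and the reduction to two fields are assumptions made for illustration, not part of the real script:

```python
class ResultsSketch:
    """Hypothetical, shortened stand-in for ResultsClass (two fields only)."""

    def __init__(self, project=None, badpolicynames=None):
        self.project = project
        # Assign the attribute in both cases: a fresh list when nothing was
        # passed, otherwise a copy of the caller's list.
        self.badpolicynames = [] if badpolicynames is None else list(badpolicynames)


a = ResultsSketch()
b = ResultsSketch(badpolicynames=["old-policy"])
a.badpolicynames.append("x")

print(a.badpolicynames)  # ['x']
print(b.badpolicynames)  # ['old-policy']
```

Copying the passed-in list with `list(...)` is a design choice in this sketch: it keeps the instance from sharing state with the caller's list.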
By itself, this works as expected:
    i = 0
    for result in results:
        result.diskswithoutpolicies.append("count is " + str(i))
        print(result.diskswithoutpolicies)
        i = i+1

    ['count is 0']
    ['count is 1']
    ['count is 2']
    ['count is 3']
etc.
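A quick way to confirm that the instances really do hold separate lists is to compare the identities of the list objects. This is only a sketch; `MiniResult` is a hypothetical one-field stand-in for ResultsClass:

```python
class MiniResult:
    """Hypothetical one-field stand-in for ResultsClass."""

    def __init__(self, diskswithoutpolicies=None):
        if diskswithoutpolicies is None:
            self.diskswithoutpolicies = []


results = [MiniResult() for _ in range(3)]

# Every instance should own a distinct list object, so the number of
# unique ids should equal the number of instances.
list_ids = {id(r.diskswithoutpolicies) for r in results}
print(len(list_ids) == len(results))  # True when no list is shared
```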
For context, this script gathers information from each project in our Google Cloud infrastructure. In this instance that mainly means: the disks that have a snapshot schedule associated with them, the scheduled snapshots of each disk within the last 24 hours, the schedules whose names do not fit our naming convention, and the disks that have no snapshot schedule associated with them at all.
In the full script I use exactly the same ResultsClass, yet inside several nested for loops the appends again seem to be adding to a shared default list, and in all honesty I don't understand why. The shortened version of the code is as follows:
    # Code to obtain a list of projects
    results = [ResultsClass() for i in range(len(projects))]
    for result in results:
        for project in projects:
            result.project = project
            # Code to obtain each zone in the project
            for zone in zones:
                # Code to get each disk in zone
                for disk in disks:
                    resourcepolicy = disk.get('resourcePolicies')
                    if resourcepolicy:
                        # Code to action if a resource policy exists
                        # (the inner `if` that the next `else` belongs to is elided)
                        else:
                            result.badpolicynames.append(resourcepolicy[0].split('/')[-1])
                            result.nonconformpolicydisks.append(disk['id'])
                    else:
                        result.diskswithoutpolicies.append(disk['id'])
            pprint(vars(result))
This then comes back with the results:
    {'badpolicynames': [],
     'dailydifferences': None,
     'diskswithoutpolicies': ['**1098762112354315432**'],
     'nonconformpolicydisks': [],
     'project': '**project0**',
     'weeklydifferences': None}
    {'badpolicynames': [],
     'dailydifferences': None,
     'diskswithoutpolicies': ['**1098762112354315432**',
                              '**1031876156872354739**'],
     'nonconformpolicydisks': [],
     'project': '**project1**',
     'weeklydifferences': None}
Does a for loop (or multiple for loops) somehow negate the separate lists created within the ResultsClass? I need to understand why this is happening within Python and then how I can correct it.
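One possible explanation, offered as a sketch rather than a confirmed diagnosis: in the shortened code the loop over projects sits inside the loop over results, so a single result object is filled with the disks of every project. That would produce this kind of accumulation without any list being shared between instances. The example below contrasts the nested loops with pairing each result to one project via `zip`. `MiniResult`, `disks_by_project`, and the disk names are made-up stand-ins for the real class and the Google Cloud API responses:

```python
class MiniResult:
    """Hypothetical two-field stand-in for ResultsClass."""

    def __init__(self, project=None, diskswithoutpolicies=None):
        self.project = project
        if diskswithoutpolicies is None:
            self.diskswithoutpolicies = []


# Made-up data standing in for what the API calls would return.
disks_by_project = {
    "project0": ["disk-a"],
    "project1": ["disk-b"],
}
projects = list(disks_by_project)

# Nested loops: every result visits every project, so each result
# ends up holding the disks of all projects.
nested = [MiniResult() for _ in projects]
for result in nested:
    for project in projects:
        result.project = project
        result.diskswithoutpolicies.extend(disks_by_project[project])

# Paired loops: zip matches one result with one project.
paired = [MiniResult() for _ in projects]
for result, project in zip(paired, projects):
    result.project = project
    result.diskswithoutpolicies.extend(disks_by_project[project])

print([r.diskswithoutpolicies for r in nested])
# [['disk-a', 'disk-b'], ['disk-a', 'disk-b']]
print([r.diskswithoutpolicies for r in paired])
# [['disk-a'], ['disk-b']]
```

If this reading is right, the lists are separate after all, and it is the loop structure that routes every project's disks into each result.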