I am trying to automate populating a town by randomly generating households. I generate the name of the town, generate the number of households, the last name of each household and number of occupants in each. That much is fine. I am now, however, trying to create each individual, to generate a first name, a sex, an age and an occupation, and I'd like to store this data in a list as well, one list containing the attributes of each person. The problem I'm running into is that I want to use a for loop, something like:

    #houseArray[currentFam][1] is the number of members in the current house. 
    for currentFam in range(houseArray[currentFam][1]):
        uniquelyNamedArray[0] = genSex()
        uniquelyNamedArray[1] = genFirstName()
        uniquelyNamedArray[2] = genAge()

So... look at the data of the first household, use a for loop to iterate through each member assigning stats, then go to the next household and do the same, progressing through each household. My problem lies in not knowing how to assign a unique name to each array created by the for loop. It doesn't really matter what the name is, it could be anything as long as each person has their own uniquely named array storing their attributes.

CromerMW

5 Answers

Use a dictionary with the person's name as the key. Like:

people = {}
people["Billy Bloggs"] = ['23','Male','263 Evergreen Tce'] # store to dict
print ( people["Billy Bloggs"] ) # get stuff out of dict

Better still, give the attributes names by storing those as a dict as well:

people["Billy Bloggs"] = { 'Age':23, 'Gender':'M', 'Address':'263 Evergreen Tce' }
print ( people["Billy Bloggs"]['Age'] ) # Get billy's age

You can loop through the elements of a dictionary using the following syntax:

>>> mydict = {'a':'Apple', 'b':'Banana', 'c':'Cumquat'}
>>> for key, value in mydict.items():
...     print ('Key is :' + key + ' Value is:' + value)
... 
Key is :a Value is:Apple
Key is :c Value is:Cumquat
Key is :b Value is:Banana

Note that there is no guarantee on the order of the data in Python versions before 3.7; from 3.7 onwards, dicts preserve insertion order. On older versions you may insert data in the order A, B, C and get A, C, B back.

Note: The keys of a dict, in this case the person's name, are constrained to be unique. So if you store data under the same name twice, the second value overwrites the first.

mydict["a"] = 5
mydict["a"] = 10
print (mydict["a"]) # prints 10
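
One way around the overwrite problem, sketched here on the assumption that two people in the town may share a name, is to key the dict on something guaranteed to be unique, such as a running ID number. The `add_person` helper and the field names are hypothetical, not part of the original code.

```python
import itertools

people = {}
_next_id = itertools.count(1)  # hypothetical running ID: 1, 2, 3, ...

def add_person(name, age, gender):
    """Store a person under a fresh integer ID and return that ID."""
    person_id = next(_next_id)
    people[person_id] = {'Name': name, 'Age': age, 'Gender': gender}
    return person_id

first = add_person('Billy Bloggs', 23, 'M')
second = add_person('Billy Bloggs', 61, 'M')  # same name, separate record

print(people[first]['Age'])   # 23
print(people[second]['Age'])  # 61
```

Both records survive because the key is the ID rather than the name.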

Sidenote: some of your gen*() functions could almost certainly be replaced by random.choice():

import random
first_names = ['Alice','Bob','Charlie','Dick','Eliza']
random_first_name = random.choice(first_names)
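
As a rough sketch of how the gen*() functions from the question might look when built on the random module: the function names come from the question, but their bodies, the name pools and the age range are all assumptions. In particular, passing the sex into genFirstName() is a design choice made here, not something the question specifies.

```python
import random

# Illustrative name pools; replace with your own data.
FIRST_NAMES = {
    'M': ['Bob', 'Charlie', 'Dick'],
    'F': ['Alice', 'Eliza'],
}

def genSex():
    return random.choice(['M', 'F'])

def genFirstName(sex):
    # Assumption: names are picked from a pool matching the person's sex.
    return random.choice(FIRST_NAMES[sex])

def genAge(low=0, high=90):
    # random.randint includes both endpoints.
    return random.randint(low, high)

sex = genSex()
person = [sex, genFirstName(sex), genAge()]
print(person)
```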
Li-aung Yip

You are mixing program data with variable names. It is okay to call a variable something generic; you do this all the time: e.g. in your for-loop, you use currentFam rather than the name of the family. Asking to uniquely name the array makes (no offense) as much sense as either asking what to name currentFam (it doesn't matter what you name it), or alternatively trying to do:

Andersons[0] = genSex()
Andersons[1] = genFirstName()
Andersons[2] = genAge()
Longs[0] = genSex()
Longs[1] = genFirstName()
Longs[2] = genAge()
Smiths[0] = genSex()
Smiths[1] = genFirstName()
Smiths[2] = genAge()
...

Variables are separate from program data.


You should just name your array person, and store it with other arrays. Even better would be to define a class Person(object): ..., so you could do things like x.name and x.age, but you don't need to do that. For example:

class Person(object):
    def __init__(self, **kw):
        self.data = kw
        self.__dict__.update(kw)
    def __repr__(self):
        return str('Person(**{})'.format(self.data))
    __str__ = __repr__

M = Person.M = 'm'
F = Person.F = 'f'

ALL_PEOPLE = set()
for ...:
    person = Person(name=..., age=..., sex=...)
    ALL_PEOPLE.add(person)

Then to find people:

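def findPeople(name=None, age=None, custom=None):
    # Copy the caller's matchers so that neither their set nor a shared
    # default argument is mutated between calls
    matchers = set(custom) if custom else set()
    if name is not None:
        matchers.add(lambda x: name.lower() in x.name.lower())
    if age is not None:
        matchers.add(lambda x: age == x.age)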

    return set(p for p in ALL_PEOPLE if all(m(p) for m in matchers))

Demo:

ALL_PEOPLE = set([
 Person(name='Alex', age=5, sex=M),
 Person(name='Alexander', age=33, sex=M),
 Person(name='Alexa', age=21, sex=F)
])

>>> pprint.pprint( findPeople(name='alex', custom={lambda p: p.age>10}) )
{Person(**{'age': 33, 'name': 'Alexander', 'sex': 'm'}),
 Person(**{'age': 21, 'name': 'Alexa', 'sex': 'f'})}
ninjagecko
  • 1
    I would actually argue that if you want to do something like `findPeople()`, you would be better off storing your data in a database and using an SQL query `SELECT * FROM people WHERE name=? AND age=?`. `sqlite3` is cheap, easy and fun. – Li-aung Yip Mar 15 '12 at 03:08
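
A minimal sketch of the sqlite3 approach suggested in the comment above, using an in-memory database and the same three demo people. The table layout and column names are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(':memory:')  # in-memory database, nothing written to disk
conn.execute('CREATE TABLE people (name TEXT, age INTEGER, sex TEXT)')
conn.executemany(
    'INSERT INTO people VALUES (?, ?, ?)',
    [('Alex', 5, 'm'), ('Alexander', 33, 'm'), ('Alexa', 21, 'f')],
)

# Parameterized query: the ? placeholders are filled from the tuple.
rows = conn.execute(
    'SELECT name, age FROM people WHERE name LIKE ? AND age > ? ORDER BY age',
    ('%alex%', 10),
).fetchall()
print(rows)  # [('Alexa', 21), ('Alexander', 33)]
conn.close()
```

SQLite's LIKE is case-insensitive for ASCII characters by default, which is why the lowercase pattern matches the capitalized names.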

Keep data out of your variable names and just store them in a dict.
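
For instance, a minimal sketch of that idea, assuming each household is identified by its last name; the household sizes and the person fields here are made up for illustration:

```python
import random

# Hypothetical households: last name -> number of occupants.
household_sizes = {'Anderson': 3, 'Long': 1, 'Smith': 2}

town = {}
for last_name, size in household_sizes.items():
    members = []
    for _ in range(size):
        # The variable is always called `person`; the data tells them apart.
        person = {
            'sex': random.choice(['M', 'F']),
            'age': random.randint(0, 90),
        }
        members.append(person)
    town[last_name] = members

print(len(town['Anderson']))  # 3
```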

wim

First, while you haven't shown us the surrounding code, you are probably relying too much on global variables. Rather than trying to create uniquely named arrays for each family member, simply do something like this:

Don't really do this (I'll tell you why in a minute)

#houseArray[currentFam][1] is the number of members in the current house. 
for member_number in range(houseArray[currentFam][1]):
    # Build the list in one go; assigning to an index of an empty list raises IndexError
    family_member_info = [genSex(), genFirstName(), genAge()]
    # Pretend 2 is where we are storing the family member information list
    houseArray[currentFam][2].append(family_member_info)

A better way

Don't use an array for this sort of thing - it gets very difficult very quickly to tell what is actually stored in which index. Even in your example you have to note that houseArray[currentFam][1] is storing the number of members in the current house.

I would use either a dictionary or a named tuple and store your information in there. That way you can do something like this:

from collections import namedtuple

# Create a class called "household"
# with three fields, "owner", "size" and "members"
household = namedtuple("household", "owner size members")

household_array = []
# Create some households and put them in the array
household_array.append(household("Family #1", 3, []))
household_array.append(household("Family #2", 1, []))
household_array.append(household("Family #3", 7, []))

# Loop over every household in the household_array
for family in household_array:
    # Each `household` namedtuple's values can be accessed by
    # attribute as well as by index number
    # family[1] == family.size == 3
    # (for Family #1)
    for member_number in range(family.size):
        # family[2] == family.members == []
        # (before we put anything in it)
        family.members.append(generate_family_member())
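
The code above calls generate_family_member() without defining it. A self-contained sketch of one possible version follows; the Person namedtuple, the name pool and the age range are assumptions made for illustration, not part of the original answer.

```python
import random
from collections import namedtuple

Household = namedtuple('Household', 'owner size members')
Person = namedtuple('Person', 'sex first_name age')

# Illustrative name pool, not from the original post.
FIRST_NAMES = ['Alice', 'Bob', 'Charlie', 'Dick', 'Eliza']

def generate_family_member():
    """Return one randomly generated Person (hypothetical helper)."""
    return Person(
        sex=random.choice(['M', 'F']),
        first_name=random.choice(FIRST_NAMES),
        age=random.randint(0, 90),
    )

household_array = [
    Household('Family #1', 3, []),
    Household('Family #2', 1, []),
]

for family in household_array:
    for _ in range(family.size):
        family.members.append(generate_family_member())

for family in household_array:
    print(family.owner, [p.first_name for p in family.members])
```

A namedtuple's fields cannot be reassigned, but the list stored in `members` is still mutable, which is why appending to it works.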
Sean Vieira

Wow, I really enjoyed reading all of the other answers.
So many great suggestions including, but not limited to:

  • @Sean Vieira suggests named-tuples -- an excellent, light-weight choice;
  • @ninjagecko uses a neat trick to dynamically assign instance attributes;
  • @Li-aung Yip mentions using the built-in sqlite3 module.

Much if not all of what's here has already been suggested.
If nothing else I hope this answer is an introduction to what classes may provide beyond what is provided by other data-structures.

Caveat: If performance is a huge concern, modeling each entity as a class might be overkill.

from __future__ import division, print_function

class Town(object):
    def __init__(self, name=None, country=None, area_km2=0, population=0):
        self.name = name
        self.country = country
        self.area_km2 = area_km2
        self.area_mi2 = self.area_km2 * 0.38610217499077215
        self.population = population
        self.households = []

    @property
    def total_households(self):
        return len(self.households)

    @property
    def population_density_per_km2(self):
        try: 
            return self.population / self.area_km2
        except ZeroDivisionError: 
            return 0

    @property
    def population_density_per_mi2(self):
        try: 
            return self.population / self.area_mi2
        except ZeroDivisionError: 
            return 0

class Household(object):
    def __init__(self, primary_lang='Esperanto'):
        self.primary_lang = primary_lang
        self.members = []

    @property
    def total_members(self):
        return len(self.members)

class Person(object):
    def __init__(self, age=0, gender=None, first_name=None):
        self.age = age
        self.gender = gender
        self.first_name = first_name

if __name__ == '__main__':
    londontown = Town(name='London', 
                      country='UK', 
                      area_km2=1572,
                      population=7753600)

    print(londontown.population_density_per_km2)
    print(londontown.population_density_per_mi2)

    a_household = Household()
    a_household.members.append(
        Person(age=10, gender='m', first_name='john'),
    )
    a_household.members.append(
        Person(age=10, gender='f', first_name='jane')
    )

    londontown.households.append(a_household)
    print(londontown.total_households)
    print(a_household.total_members)
mechanical_meat