11

I'm working with datasets from two different webpages that describe the same individual; the datasets contain legal information about that person. Some of the data is available on the first page, so I initialize a Defendant object with that info and set the attributes I don't have data for yet to null. This is the class:

class Defendant(object):
    """holds data for each individual defendant"""
    def __init__(self,full_name,first_name,last_name,type_of_appeal,county,case_number,date_of_filing,
                 race,sex,dc_number,hair_color,eye_color,height,weight,birth_date,initial_receipt_date,current_facility,current_custody,current_release_date,link_to_page):
        self.full_name = full_name
        self.first_name = first_name
        self.last_name = last_name
        self.type_of_appeal = type_of_appeal
        self.county = county
        self.case_number = case_number
        self.date_of_filing = date_of_filing
        self.race = 'null'
        self.sex = 'null'
        self.dc_number = 'null'
        self.hair_color = 'null'
        self.eye_color = 'null'
        self.height = 'null'
        self.weight = 'null'
        self.birth_date = 'null'
        self.initial_receipt_date = 'null'
        self.current_facility = 'null'
        self.current_custody = 'null'
        self.current_release_date = 'null'
        self.link_to_page = link_to_page

And this is what it looks like when I add a half-filled out Defendant object to a list of defendants:

list_of_defendants.append(Defendant(name_final,'null','null',type_of_appeal_final,county_parsed_final,case_number,date_of_filing,'null','null','null','null','null','null','null','null','null','null','null','null',link_to_page))

Then, when I get the rest of the data from the other page, I update the attributes that were set to null like so:

        for defendant in list_of_defendants:
            defendant.sex = location_of_sex_on_page
            defendant.first_name = location_of_first_name_on_page
            ## Etc.

My question is: is there a more pythonic way to either add attributes to a class or a less ugly way of initializing the class object when I only have half of the information that I want to store in it?

n1c9
  • 1
    You can have the parameters default to `'null'` so that you don't need to specify them on initialization, you can specify the last one as `link_to_page = link_to_page` and skip all the ones in between. – Tadhg McDonald-Jensen Apr 07 '16 at 19:58
  • 2
    Null values are represented in Python as `None`, not as the string `'null'`. Please don't make baseless accusations against [Mr. Null](http://stackoverflow.com/q/4456438/744178). – jwodder Apr 07 '16 at 20:04

4 Answers

4

First, use default values for any arguments that you're setting to null. This way you don't need to specify these arguments when instantiating the object, and you can pass any you do need in any order by using the argument name. Use the Python value None rather than the string "null" for these, unless there is a specific reason for using the string. Positional parameters with default values must come after those without, so link_to_page needs to be moved before these.

Then, you can set your attributes by updating the instance's __dict__ attribute, which stores the attributes attached to the instance. Each argument will be set as an attribute of the instance having the same name.

def __init__(self, full_name, first_name, last_name, type_of_appeal, county, case_number,
             date_of_filing, link_to_page, race=None, sex=None, dc_number=None,
             hair_color=None, eye_color=None, height=None, weight=None, birth_date=None,
             initial_receipt_date=None, current_facility=None, current_custody=None,
             current_release_date=None):

    # set all arguments as attributes of this instance
    # (__code__ works on Python 2.6+ and 3; func_code exists only on Python 2)
    code     = self.__init__.__func__.__code__
    argnames = code.co_varnames[1:code.co_argcount]   # skip self
    locs     = locals()
    self.__dict__.update((name, locs[name]) for name in argnames)

You might also consider synthesizing the full_name from the two other name arguments. Then you don't have to pass in redundant information and it can never not match. You can do this on the fly via a property:

@property
def full_name(self):
    return self.first_name + " " + self.last_name

For updating, I'd add a method to do that, but accept keyword-only arguments using **. To help protect the integrity of the data, we will change only attributes that already exist and are set to None.

def update(self, **kwargs):
    self.__dict__.update((k, kwargs[k]) for k in kwargs
                          if self.__dict__.get(k, False) is None)

Then you can easily update all the ones you want with a single call:

defendant.update(eye_color="Brown", hair_color="Black", sex="Male")

To make sure an instance has been completely filled out, you can add a method or property that checks to make sure all attributes are not None:

@property
def valid(self):
    return all(self.__dict__[k] is not None for k in self.__dict__)
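Putting these pieces together, a minimal sketch might look like the following. The field list is trimmed for brevity, and the case number and URL are made-up values, so treat it as an illustration of the pattern rather than a drop-in class.

```python
class Defendant(object):
    """Sketch: first-page fields are required, the rest default to None."""

    def __init__(self, first_name, last_name, case_number, link_to_page,
                 sex=None, eye_color=None, hair_color=None):
        self.first_name = first_name
        self.last_name = last_name
        self.case_number = case_number
        self.link_to_page = link_to_page
        self.sex = sex
        self.eye_color = eye_color
        self.hair_color = hair_color

    @property
    def full_name(self):
        # Computed on the fly, so it always matches the two name parts.
        return self.first_name + " " + self.last_name

    def update(self, **kwargs):
        # Only fill attributes that already exist and are still None.
        self.__dict__.update((k, v) for k, v in kwargs.items()
                             if self.__dict__.get(k, False) is None)

    @property
    def valid(self):
        return all(v is not None for v in self.__dict__.values())


# Hypothetical sample values, for illustration only.
d = Defendant("John", "Doe", "12-345", "http://example.com/doe")
d.update(sex="Male", eye_color="Brown", first_name="Jane", shoe_size=10)
print(d.full_name)  # John Doe (first_name was already set, so it is kept)
print(d.valid)      # False (hair_color is still None)
d.update(hair_color="Black")
print(d.valid)      # True
```

Note that unknown names such as shoe_size are silently ignored here; raising an error instead would be a reasonable alternative if typos in field names are a concern.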
kindall
2

If you're okay with passing every attribute in as a name-value pair, you can use something like:

class Defendant(object):
    fields = ['full_name', 'first_name', 'last_name', 'type_of_appeal', 
              'county', 'case_number', 'date_of_filing', 'race', 'sex',
              'dc_number', 'hair_color', 'eye_color', 'height', 'weight', 
              'birth_date', 'initial_receipt_date', 'current_facility', 
              'current_custody', 'current_release_date', 'link_to_page']

    def __init__(self, **kwargs):
        self.update(**kwargs)

    def update(self, **kwargs):
        self.__dict__.update(kwargs)

    def blank_fields(self):
        return [field for field in self.fields if field not in self.__dict__]

    def verify(self):
        blanks = self.blank_fields()
        if blanks:
            print('The fields {} have not been set.'.format(', '.join(blanks)))
            return False
        return True

The usage would look something like:

defendant = Defendant(full_name='John Doe', first_name='John', last_name='Doe')
defendant.update(county='Here', height='5-11', birth_date='1000 BC')
defendant.verify()
# The fields type_of_appeal, case_number, date_of_filing, race... have not been set.

Extending this to use required fields and optional fields would be easy. Or, you could add required arguments to the initialization. Or, you could check to make sure that each name-value pair has a valid name. And so on...
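As a rough sketch of those extensions, the variant below requires some fields at construction time and rejects unknown names. The split into `required` and `optional` and the sample values are assumptions made for illustration, not something taken from the question.

```python
class Defendant(object):
    # Hypothetical split of the field list; adjust to the real data.
    required = ('full_name', 'case_number', 'link_to_page')
    optional = ('race', 'sex', 'height', 'weight')

    def __init__(self, **kwargs):
        missing = [name for name in self.required if name not in kwargs]
        if missing:
            raise TypeError('missing required fields: ' + ', '.join(missing))
        self.update(**kwargs)

    def update(self, **kwargs):
        # Reject names that are not declared, which catches typos early.
        unknown = [name for name in kwargs
                   if name not in self.required + self.optional]
        if unknown:
            raise TypeError('unknown fields: ' + ', '.join(sorted(unknown)))
        self.__dict__.update(kwargs)

    def blank_fields(self):
        return [name for name in self.optional if name not in self.__dict__]


d = Defendant(full_name='John Doe', case_number='12-345',
              link_to_page='http://example.com/doe')
d.update(sex='Male')
print(d.blank_fields())  # ['race', 'height', 'weight']
```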

Jared Goguen
1

Here is a simpler example to illustrate how you could do it:

class Foo:
  def __init__(self, a, b, e, c=None, d=None):
    self.a = a
    self.b = b
    self.c = c
    self.d = d
    self.e = e

But if you never have c and d when you need to instantiate, I would recommend this instead:

class Foo:
  def __init__(self, a, b, e):
    self.a = a
    self.b = b
    self.c = None
    self.d = None
    self.e = e

EDIT: Another method could be:

class Defendant(object):
    __attrs = (
        'full_name',
        'first_name',
        'last_name',
        'type_of_appeal',
        'county',
        'case_number',
        'date_of_filing',
        'race',
        'sex',
        'dc_number',
        'hair_color',
        'eye_color',
        'height',
        'weight',
        'birth_date',
        'initial_receipt_date',
        'current_facility',
        'current_custody',
        'current_release_date',
        'link_to_page'
    )

    def __update(self, *args, **kwargs):
        self.__dict__.update(dict(zip(self.__attrs, args)))
        self.__dict__.update(kwargs)

    def __init__(self, *args, **kwargs):
        self.__dict__ = dict.fromkeys(Defendant.__attrs, None)
        self.__update(*args, **kwargs)

    update_from_data = __update


if __name__ == '__main__':
    test = Defendant('foo bar', 'foo', 'bar', height=180, weight=85)
    test.update_from_data('Superman', 'Clark', 'Kent', hair_color='red', county='SmallVille')
DevLounge
1

I would say the most pythonic way is something that looks like this:

class Defendant(Model):
    full_name = None  # Some default value
    first_name = None
    last_name = None
    type_of_appeal = None
    county = None
    case_number = None
    date_of_filing = None
    race = None
    sex = None
    dc_number = None
    hair_color = None
    eye_color = None
    height = None
    weight = None
    birth_date = None
    initial_receipt_date = None
    current_facility = None
    current_custody = None
    current_release_date = None
    link_to_page = None

Clean, everything is defined only once and works automagically.

About that Model super class... If you are using any web framework like Django, by all means, inherit from their model and you're done. It has all the wiring you need.

Otherwise, an easy way to implement something short and sweet is to have your Defendant class inherit from:

class Model(object):
    def __init__(self, **kwargs):
        for k, v in kwargs.items():
            setattr(self, k, v)

And instantiate based on the fields you have available:

d1 = Defendant(height=1.75)
print(d1.height)

d2 = Defendant(full_name='Peter')
print(d2.full_name)

You can achieve a lot of cooler things with a bit of metaprogramming, such as field type checking, value checking, detecting duplicated declarations, and more. If you are using Python 3, you can easily allow passing the values to the __init__ method either by args (based on the order of declaration) or kwargs.
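One possible way to accept positional arguments in declaration order is sketched below. It assumes Python 3.6 or later, where a class body preserves definition order, and it only looks at attributes declared directly on the subclass; the shortened field list is for illustration only.

```python
class Model(object):
    def __init__(self, *args, **kwargs):
        # Public, non-callable class attributes in declaration order.
        # Only the immediate class is inspected, not its parents.
        fields = [name for name, value in vars(type(self)).items()
                  if not name.startswith('_') and not callable(value)]
        if len(args) > len(fields):
            raise TypeError('expected at most %d positional arguments'
                            % len(fields))
        values = dict(zip(fields, args))
        for name, value in kwargs.items():
            if name not in fields:
                raise TypeError('unknown field: ' + name)
            if name in values:
                raise TypeError('got multiple values for field: ' + name)
            values[name] = value
        for name, value in values.items():
            setattr(self, name, value)


class Defendant(Model):
    full_name = None
    first_name = None
    last_name = None
    height = None


d = Defendant('Peter Pan', 'Peter', height=1.75)
print(d.full_name, d.first_name, d.last_name, d.height)
# Peter Pan Peter None 1.75
```

Fields that are not passed keep the class-level default, so d.last_name still reads as None.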

fips