I want to take a list of variable names and assign all of them as instance variables of a class.

I would also like to populate these instance variables with values from a database.

For example: I have a dataframe with the headers ('col1', 'col2', 'col3', 'col4'). Each row should become a class instance, and each column should become an instance variable of that class. The values in each row should then be assigned to the corresponding instance variables of that row's instance.

How can I accomplish this?

Here is the list of variables:

Index(['Id', 'MSSubClass', 'MSZoning', 'LotFrontage', 'LotArea', 'Street',
       'Alley', 'LotShape', 'LandContour', 'Utilities', 'LotConfig',
       'LandSlope', 'Neighborhood', 'Condition1', 'Condition2', 'BldgType',
       'HouseStyle', 'OverallQual', 'OverallCond', 'YearBuilt', 'YearRemodAdd',
       'RoofStyle', 'RoofMatl', 'Exterior1st', 'Exterior2nd', 'MasVnrType',
       'MasVnrArea', 'ExterQual', 'ExterCond', 'Foundation', 'BsmtQual',
       'BsmtCond', 'BsmtExposure', 'BsmtFinType1', 'BsmtFinSF1',
       'BsmtFinType2', 'BsmtFinSF2', 'BsmtUnfSF', 'TotalBsmtSF', 'Heating',
       'HeatingQC', 'CentralAir', 'Electrical', '1stFlrSF', '2ndFlrSF',
       'LowQualFinSF', 'GrLivArea', 'BsmtFullBath', 'BsmtHalfBath', 'FullBath',
       'HalfBath', 'BedroomAbvGr', 'KitchenAbvGr', 'KitchenQual',
       'TotRmsAbvGrd', 'Functional', 'Fireplaces', 'FireplaceQu', 'GarageType',
       'GarageYrBlt', 'GarageFinish', 'GarageCars', 'GarageArea', 'GarageQual',
       'GarageCond', 'PavedDrive', 'WoodDeckSF', 'OpenPorchSF',
       'EnclosedPorch', '3SsnPorch', 'ScreenPorch', 'PoolArea', 'PoolQC',
       'Fence', 'MiscFeature', 'MiscVal', 'MoSold', 'YrSold', 'SaleType',
       'SaleCondition', 'SalePrice'],
      dtype='object')

Here is an example dataframe:

import pandas as pd
from numpy import nan
d = {'name': pd.Series(['steve', 'jeff', 'bob'], index=['1', '2', '3']),
     'salary': pd.Series([34, 85, 213], index=['1', '2', '3']),
     'male': pd.Series([1, nan, 0], index=['1', '2', '3']),
     'score': pd.Series([1.46, 0.8, 3.], index=['1', '2', '3'])}

df = pd.DataFrame(d)
Clay Chester
  • Is this pretty much a duplicate of this question-answer: https://stackoverflow.com/questions/1639174/creating-class-instance-properties-from-a-dictionary – Bill Oct 08 '17 at 01:05
  • Possible duplicate of [Creating class instance properties from a dictionary?](https://stackoverflow.com/questions/1639174/creating-class-instance-properties-from-a-dictionary) – toonarmycaptain Oct 08 '17 at 01:13
  • In this post, the objects should be created automatically from the dataframe rather than defined individually. For example, `>>> class AllMyFields: ... def __init__(self, dictionary): ... for k, v in dictionary.items(): ... setattr(self, k, v) ... >>> o = AllMyFields({'a': 1, 'b': 2}) >>> o.a 1` has to name the object `o`; I want the objects to be named by the index so that I can look them up at will. – Clay Chester Oct 08 '17 at 19:10

2 Answers

This is a natural fit for namedtuples.

#! /usr/bin/env python3


import collections
import pandas as pd


if __name__ == '__main__':

    Person = collections.namedtuple('Person', 'male name salary score')

    d = {'name': pd.Series(['steve', 'jeff', 'bob'], index=['1', '2', '3']),
         'salary': pd.Series([34, 85, 213], index=['1', '2', '3']),
         'male': pd.Series([1, float('NaN'), 0], index=['1', '2', '3']),
         'score': pd.Series([1.46, 0.8, 3.], index=['1', '2', '3'])}
    df = pd.DataFrame(d, columns=sorted(d.keys()))
    print(df)

    for row in df.values:
        print(Person(*row.tolist()))

Output:

   male   name  salary  score
1   1.0  steve      34   1.46
2   NaN   jeff      85   0.80
3   0.0    bob     213   3.00
Person(male=1.0, name='steve', salary=34, score=1.46)
Person(male=nan, name='jeff', salary=85, score=0.8)
Person(male=0.0, name='bob', salary=213, score=3.0)
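
A variant of the same idea, as a minimal sketch: `DataFrame.itertuples` builds the named tuples itself, so the field list does not have to be typed out by hand. The `name='Person'` argument and the variable names are illustrative choices. Note that pandas replaces column names that are not valid Python identifiers (such as '1stFlrSF' or '3SsnPorch' in the question's list) with positional names, so those fields would have to be read by position instead.

```python
import pandas as pd

df = pd.DataFrame(
    {'name': ['steve', 'jeff', 'bob'],
     'salary': [34, 85, 213],
     'male': [1, float('nan'), 0],
     'score': [1.46, 0.8, 3.0]},
    index=['1', '2', '3'])

# Each row becomes a namedtuple called Person; the row label is stored
# in the extra first field, Index.
people = list(df.itertuples(index=True, name='Person'))

first = people[0]
print(first.Index, first.name, first.salary)
```

Unlike `df.values`, this iterates column by column, so each field keeps the type of its own column instead of being coerced to a common one.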
J_H

You can use df.to_dict('records') to generate a list of dictionaries, one per row:

[{'male': 1.0, 'name': 'steve', 'salary': 34, 'score': 1.46},
 {'male': nan, 'name': 'jeff', 'salary': 85, 'score': 0.8},
 {'male': 0.0, 'name': 'bob', 'salary': 213, 'score': 3.0}]

Then you can do something like this to create a list of instances:

class Person(object):
    def __init__(self, **kwargs):
        # Store every keyword argument as an instance attribute.
        self.__dict__.update(kwargs)

people = [Person(**x) for x in df.to_dict('records')]
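
The question's follow-up comment asks for objects that can be looked up by the dataframe index. A minimal sketch of one way to do that, assuming the index labels are unique: `df.to_dict('index')` returns a dict that maps each index label to that row's record, which can then be turned into a dict of `Person` instances. The name `people_by_index` is illustrative.

```python
import pandas as pd


class Person:
    def __init__(self, **kwargs):
        # Store every keyword argument as an instance attribute.
        self.__dict__.update(kwargs)


df = pd.DataFrame(
    {'name': ['steve', 'jeff', 'bob'],
     'salary': [34, 85, 213],
     'male': [1, float('nan'), 0],
     'score': [1.46, 0.8, 3.0]},
    index=['1', '2', '3'])

# orient='index' gives {index_label: {column: value}}.
people_by_index = {label: Person(**record)
                   for label, record in df.to_dict('index').items()}

steve = people_by_index['1']
print(steve.name, steve.salary)
```

Column names that are not valid identifiers, such as '1stFlrSF', still end up in the instance's `__dict__`, but they can only be read with `getattr(obj, '1stFlrSF')`, not with dot syntax.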
Aldehir
  • When you do `people = [Person(**x) for x in df.to_dict('df')]`, what does `**x` mean? Is that saying "all class instances"? When I run this I receive the following error: `TypeError: type object argument after ** must be a mapping, not str` – Clay Chester Oct 08 '17 at 18:23
  • @ClayChester, should be `df.to_dict('records')`, not `df.to_dict('df')`. Take a look at the documentation for [DataFrame.to_dict()](https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_dict.html) – Aldehir Oct 08 '17 at 23:49
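
To illustrate the `**x` syntax asked about above: `**` unpacks a mapping into keyword arguments, so `Person(**{'name': 'steve'})` is the same call as `Person(name='steve')`. The sketch below uses a made-up record. The error quoted in the comment is what Python raises when the value after `**` is a string rather than a mapping, which would happen if each `x` were a column name instead of a row record.

```python
class Person:
    def __init__(self, **kwargs):
        # Store every keyword argument as an instance attribute.
        self.__dict__.update(kwargs)


record = {'name': 'steve', 'salary': 34}

# These two calls are equivalent: ** spreads the dict into keyword arguments.
a = Person(**record)
b = Person(name='steve', salary=34)
print(a.__dict__ == b.__dict__)

# Unpacking a string instead of a mapping raises TypeError.
try:
    Person(**'name')
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```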