I have a Pandas pivot table of the format:
income_category  age_category  income       age
High             Middle aged   123,564.235  23.456
Medium           Old           18,324.356   65.432
I have a category hierarchy with matching labels in a self-referencing table called dimension, i.e.:
dimension_id  label           parent_dimension_id
1             Age categories
2             Young           1
3             Middle aged     1
4             Old             1
...and similarly for income
I'm really struggling to pick one row at a time and access arbitrary cells in that row.
I have the parent category's dimension_id (in the code below it is already in cat_id_age). So I want to iterate through the Numpy array, get the matching category dimension_id for each row, and insert it into a value table along with its corresponding value. However, I've no idea how to do this Pythonically or Djangonically. (There are only a few categories, so I think the dictionary approach below for looking up dimension_id is best.) To my iterative mind the process is:
# populate a Dictionary to find dimension_ids
age_dims = Dimension.objects.filter(parent_id=cat_id_age).values('label', 'id')
for row in Numpy_array:
    dim_id = Dimension.get(row.age_category)
    # Or is the Dict approach incorrect? I'm trying to do: SELECT dimension_id FROM dimension WHERE parent_dimension_id=cat_id_age AND label=row.age_category
    # Djangonically? dim = Dimension.objects.get(parent_id=cat_id_age, label=row.age_category)
    # Then insert categorized value, ie, INSERT INTO float_value (value, dimension_id) VALUES (row.age, dimension_id)
    float_val = FloatValue(value=row.age, dimension_id=dim_id)
    float_val.save()
...then repeat for income_category and income.
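To make the intended loop concrete, here is a minimal sketch using plain-Python stand-ins so that it runs without Django or Pandas. The Row namedtuple, the hard-coded age_dims dictionary, and the helper categorized_values are all hypothetical names for illustration. The assumption is that recent Pandas versions yield rows of this shape from df.itertuples(index=False), and that the dictionary would really be built from the Dimension model.

```python
from collections import namedtuple

# Hypothetical stand-in for one row of the pivot table. With Pandas,
# df.itertuples(index=False) is assumed to yield namedtuples of this shape.
Row = namedtuple("Row", ["income_category", "age_category", "income", "age"])

rows = [
    Row("High", "Middle aged", 123564.235, 23.456),
    Row("Medium", "Old", 18324.356, 65.432),
]

# Hypothetical stand-in for the label -> dimension_id dictionary.
# In Django it could presumably be built with something like:
#   dict(Dimension.objects.filter(parent_id=cat_id_age).values_list("label", "id"))
age_dims = {"Young": 2, "Middle aged": 3, "Old": 4}


def categorized_values(rows, dims, category_field, value_field):
    """Return one (value, dimension_id) pair per row."""
    pairs = []
    for row in rows:
        label = getattr(row, category_field)  # e.g. row.age_category
        pairs.append((getattr(row, value_field), dims[label]))
    return pairs


age_values = categorized_values(rows, age_dims, "age_category", "age")
print(age_values)  # [(23.456, 3), (65.432, 4)]
```

Each pair could then become a FloatValue(value=..., dimension_id=...) instance, and the same helper could be called again with "income_category" and "income" and an income dictionary.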
However, I'm struggling with iterating like this. That may be my only problem, but I've included the rest to communicate what I'm trying to do, as I often seem a paradigm away from Python (e.g., something like cursor.executemany("""insert values(?, ?, ?)""", map(tuple, numpy_arr[x:].tolist()))?).
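For reference, the executemany idea could look roughly like this with the standard-library sqlite3 module. This is only a sketch under assumptions: the in-memory database and the two-column float_value table are stand-ins, and the pairs list is hard-coded sample data rather than real output.

```python
import sqlite3

# Sample (value, dimension_id) tuples, hard-coded for illustration.
pairs = [(23.456, 3), (65.432, 4)]

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute("CREATE TABLE float_value (value REAL, dimension_id INTEGER)")

# One INSERT statement executed once per parameter tuple.
conn.executemany(
    "INSERT INTO float_value (value, dimension_id) VALUES (?, ?)", pairs
)
conn.commit()

stored = conn.execute(
    "SELECT value, dimension_id FROM float_value ORDER BY dimension_id"
).fetchall()
print(stored)  # [(23.456, 3), (65.432, 4)]
```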
Any pointers really appreciated. (I'm using Django 1.7 and Python 3.4.)