I am doing an assignment for a machine learning class in Python. I started learning Python just yesterday, so I am not aware of the practices commonly used in Python.
Part of my task is to load data from a CSV file (a 2D array, let's call it arr_2d) and normalize it. I've found sklearn and numpy solutions online, but they expect a 2D array as input. My approach, after loading arr_2d, is to parse it into an array of objects (data: [HealthRecord]).
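For context, the loading step itself is roughly this (a minimal sketch; the file name data.csv and the use of the csv module are my assumptions, not part of the assignment):

import csv

# hypothetical loading step: the file name is made up
with open('data.csv', newline='') as f:
    arr_2d = [[float(cell) for cell in row] for row in csv.reader(f)]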
My solution was code similar to this (note: it is kind of pseudocode):
result = []  # list of per-property rows
for key in ['age', 'height', 'weight']:
    # getattr takes (object, name): read one property from every record
    tmp = [getattr(item, key) for item in data]
    result.append(tmp)
result now contains 3 rows of len(data) items each, and I would use sklearn to normalize each single row of result, then rotate it back and parse the normalized values into HealthRecord objects.
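Spelled out, the round trip I have in mind is something like this (just a sketch; normalize from sklearn.preprocessing and reusing the arr_2_obj helper from below are my guesses at the wiring):

import numpy as np
from sklearn.preprocessing import normalize

mat = np.array(result, dtype=float)  # shape (3, n_records): one row per property
scaled = normalize(mat, axis=1)      # scale each property row to unit norm
# rotate back to (n_records, 3) and rebuild one object per record
normalized = [arr_2_obj(row) for row in scaled.T]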
I see this as overcomplicated, and I would like an option to do it an easier way, like sending the [HealthRecord] list to sklearn.normalize directly.
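Since sklearn presumably only accepts plain 2D arrays, maybe the conversion can at least be squeezed into one step, something like this (again only a guess at a tidier shape, not tested against the real data):

import numpy as np
from sklearn.preprocessing import normalize

keys = ['age', 'height', 'weight']
# build an (n_records, 3) matrix straight from the objects...
mat = np.array([[getattr(r, k) for k in keys] for r in data])
# ...and normalize each property column across records
scaled = normalize(mat, axis=0)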
The code below shows my (simplified) loading and parsing:
class Person:
    age: int
    height: int
    weight: int

def arr_2_obj(data: list) -> Person:
    # data is one row: [age, height, weight]
    person = Person()
    person.age = data[0]
    person.height = data[1]
    person.weight = data[2]
    return person

# age (days), height (cm), weight (kg)
rows = [
    [60*365, 125, 65],
    [30*365, 195, 125],
    [13*365, 116, 53],
    [16*365, 164, 84],
    [12*365, 125, 96],
    [10*365, 90, 46],
]

parsed = []
for row in rows:
    parsed.append(arr_2_obj(row))
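For completeness, parsed is the data list that the extraction loop above iterates over:

data = parsed  # list of Person/HealthRecord objects for the loop above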
note: the Person class here stands in for HealthRecord
Thank you for any input or insights.
Edit: typo sci-learn -> sklearn