0

I have a pandas df column with a list of dictionaries for each each company name. Like below:

company  |  growth_scores
comp xyz | [{u'score': u'198', u'recorded_at': u'2016-09},{u'score': u'190', u'recorded_at': u'2016-08} 

I understand how to extract the keys and I'm familiar with the pd.apply method but I can't seem to piece together anything that will go row-by-row and perform the calculation. Ultimately, I need to perform a calculation and store the result in a new column, for each company.

The output should look like this:

company  |  growth_score_diff
comp xyz |  10%

Would love some guidance here!

Kevin
  • 37
  • 7

1 Answers1

1

Say you have the following DataFrame:

df = pd.DataFrame.from_dict({'company': 'Pandology', 'metrics': [[{'score': 10}, {'score': 20}, {'score': 35}]]})

which looks like this:

enter image description here

To compute a total score, you can map the metrics column to a new column called score_total. To perform the actual calculation, you define a function calculate_score which takes a row of metrics data as input and outputs a total score value. (In this case it's just a trivial sum calculation)

def calculate_score(metrics):
    total_score = 0
    for metric in metrics:
        total_score += metric['score']
    return total_score                

df['score_total'] = df['metrics'].map(calculate_score)

Now you have a new column containing the result:

enter image description here

kmandov
  • 3,130
  • 16
  • 15
  • hey @kmandov thanks for the help here! One issue I'm running into though, is that the scores are actually strings and thus, I can't perform the calc. Is there a way to work around that? – Kevin Sep 22 '16 at 19:58
  • You need to cast the strings into a numeric value. It can be as simple as: int(metric['score']) if scores are integers. Here is how you can do this: http://stackoverflow.com/questions/5608702/how-can-i-convert-a-string-to-either-int-or-float-with-priority-on-int – kmandov Sep 23 '16 at 08:10