Boxing data on Pandas based on 2 columns

Question

Imagine the following DF:

import pandas as pd
data = {'Person': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'C', 'C'], 'Field': ['Age', 'Weight', 'Age', 'Height', 'Height', 'year', 'month', 'day', 'city']}
df = pd.DataFrame(data)

  Field Person
    Age      A
 Weight      A
    Age      B
 Height      B
 Height      C
   year      C
  month      C
    day      C
   city      C

Imagine I wanted to reduce the number of queries I need to make to get each field from each person. I would first get A and B in a room and ask them their age, then ask A for their weight, then get B and C together and ask them for their height, and finally ask C for all the remaining fields.

This may sound more complicated than simply asking A, B and C separately. But imagine I had:

  Field Person
    Age      A
    Age      B
 Height      B
 Weight      B
   year      B
  month      B
    Age      C
 Height      C
 Weight      C
   year      C
  month      C

It is clear here that asking each person separately is less efficient than asking A, B and C for their age and then asking B and C for their height, weight, year and month.

I can think of many ways of doing this programmatically, but I was wondering which one is the most efficient.
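To make the intended grouping concrete, here is a minimal sketch of one possible approach, not necessarily the most efficient one. It assumes that "boxing" means: collect the set of people for each field, then merge fields whose sets of people are identical into a single query. The names `people_per_field` and `queries` are only illustrative.

```python
import pandas as pd

data = {
    'Person': ['A', 'A', 'B', 'B', 'C', 'C', 'C', 'C', 'C'],
    'Field': ['Age', 'Weight', 'Age', 'Height', 'Height',
              'year', 'month', 'day', 'city'],
}
df = pd.DataFrame(data)

# Step 1: for each field, collect the set of people who must be asked for it.
# sort=False keeps the fields in order of first appearance.
people_per_field = df.groupby('Field', sort=False)['Person'].agg(frozenset)

# Step 2: fields that share exactly the same set of people go into one query.
# (Assumption: only identical sets are merged; overlapping sets are not.)
queries = {}
for field, people in people_per_field.items():
    queries.setdefault(people, []).append(field)

for people, fields in queries.items():
    print(sorted(people), fields)
```

On the first example this yields four queries: Age for A and B, Weight for A, Height for B and C, and the remaining fields for C. It only merges fields asked of exactly the same people, so it is not guaranteed to give the minimum number of queries in every case.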

Really, this question is a dupe of this: http://stackoverflow.com/questions/34233455/using-panda-for-comparing-column-values-and-creating-column-based-on-the-values and this: http://stackoverflow.com/questions/41481208/python-string-to-integer-as-a-key and countless others, but your wording is a little different - EdChum, Mar 14 '17 at 11:51
Are you wanting something different from my answer here or the linked posts? - EdChum, Mar 14 '17 at 11:53
@EdChum Thanks for your reply. I was not aware that factorize did something similar, but it is not exactly what I need, so I have reworded the question - Yona, Mar 14 '17 at 13:45
I think you need to explain the logic here more clearly. I will remove my answer - EdChum, Mar 14 '17 at 13:49

0 Answers