Please be advised, I am a beginning programmer and a beginning Python/pandas user. I'm a behavioral scientist learning to use pandas to process and organize my data. As a result, some of this may seem completely obvious, and the question may not seem worthy of the forum. Please be patient! To me, this is days of work, and I have already spent hours trying to figure out the answer. Thanks in advance for any help.
My data look like the sample below. The "real" Actor and Recipient1 data are always 5-digit numbers, and the "Behavior" data are always letter codes. My problem is that I also use this format for special lines, denoted by markers like "date" or "s" in the Actor column. These markers indicate that the Behavior column holds this special type of data rather than actual Behavior data. So, I want to replace the markers in the Actor column with NaN values, and move the special data from the Behavior column into another column (in this example, the empty Activity column).
   follow  Activity  Actor    Behavior  Recipient1
0       1       NaN   date  2.1.3.2012         NaN
1       1       NaN      s       ss.hx         NaN
2       1       NaN  50505          vo       51608
3       1       NaN  51608          vr       50505
4       1       NaN      s       ss.he         NaN
So far, I have written some code in pandas to select out the "s" lines into a new dataframe:
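def get_act_line(group):
    # Keep only the rows whose Actor column holds the "s" marker.
    # .loc replaces the older .ix indexer, which current pandas no longer provides.
    return group.loc[group.Actor == 's']

result = trimdata.groupby('follow').apply(get_act_line)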
I've copied over the Behavior column in this dataframe to the Activity column, and replaced the Actor and Behavior values with NaN:
import numpy as np

result.Activity = result.Behavior  # move the special data into Activity
result.Behavior = np.nan
result.Actor = np.nan
result.head()
So my new dataframe looks like this:
follow        follow  Activity  Actor  Behavior  Recipient1
1       2          1     ss.hx    NaN       NaN         NaN
        34         1     hf.xa    NaN       NaN     f.53702
        74         1     hf.fe    NaN       NaN         NaN
10      1287      10     ss.hf    NaN       NaN          db
        1335      10        fe    NaN       NaN          db
What I would like to do now is to combine this dataframe with the original, replacing all of the values in these selected rows, but maintaining values for the other rows in the original dataframe.
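To make the goal concrete, here is a minimal sketch of the end result I am after. It is not necessarily the right way to do it. Instead of building a separate dataframe and merging it back, it assigns through a boolean mask on the original. The small `trimdata` frame here is a hypothetical stand-in built from the five sample rows above, and `is_activity` is a name made up for illustration. It also assumes the Activity column can be held as object dtype so that it accepts strings.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real `trimdata`, using the five sample rows.
trimdata = pd.DataFrame({
    "follow": [1, 1, 1, 1, 1],
    "Activity": [np.nan] * 5,
    "Actor": ["date", "s", "50505", "51608", "s"],
    "Behavior": ["2.1.3.2012", "ss.hx", "vo", "vr", "ss.he"],
    "Recipient1": [np.nan, np.nan, "51608", "50505", np.nan],
})

# Assumption: store Activity as object dtype so string codes can be assigned.
trimdata["Activity"] = trimdata["Activity"].astype(object)

# Boolean mask marking the special "s" lines (name is illustrative).
is_activity = trimdata["Actor"] == "s"

# Copy the special data into Activity, only on the marked rows.
trimdata.loc[is_activity, "Activity"] = trimdata.loc[is_activity, "Behavior"]

# Blank out the marker and the moved value on those same rows.
trimdata.loc[is_activity, ["Actor", "Behavior"]] = np.nan

print(trimdata)
```

Rows that do not match the mask are left untouched, so nothing has to be merged back afterwards. The same pattern would presumably work for the "date" marker with a different mask and target column.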
This may seem like a simple question with an obvious solution, or perhaps I have gone about it all wrong to begin with!
I've worked through Wes McKinney's book and read the documentation on the different types of merges, mapping, joining, transformations, concatenations, etc. I have also browsed the forums, but I have not found an answer that helps me figure this out. Your help will be very much appreciated.