I have a pd.DataFrame object that contains tweets and re-tweets from different users. What I'm trying to accomplish is to populate a column of rt_uid (i.e. retweet user id) with the corresponding uid of the user being retweeted. So the desired output will be:

Desired Output

   tw_id  tw_uid  rt_uid tw_uname rt_uname
0      0      10    12.0       u1       u3
1      1      10    12.0       u1       u3
2      2      12     NaN       u3     None
3      3      13     NaN       u4     None
4      4      14    10.0       u5       u1
5      5      15    10.0       u6       u1
6      6      16    10.0       u7       u1
7      7      16     NaN       u7     None
8      8      16     NaN       u7     None
9      9      12    13.0       u3       u4

The rt_uid column contains the user ids of the users who were retweeted.

Code 1 presents a toy example of the dataset with my solution that didn't work out:

Code 1

import numpy as np
import pandas as pd

tw_df = pd.DataFrame(dict(
        tw_id=np.arange(10),
        tw_uid=[10, 10, 12, 13, 14, 15, 16, 16, 16, 12],
        rt_uid=[None]*10,
        tw_uname=['u1', 'u1', 'u3', 'u4', 'u5', 'u6', 'u7', 'u7', 'u7', 'u3'],
        rt_uname=['u3', 'u3', None, None, 'u1', 'u1', 'u1', None, None, 'u4'],
    )
)
tw_df.loc[~tw_df.loc[:, 'rt_uname'].isnull(), 'rt_uid'] = tw_df.loc[tw_df.loc[:, 'tw_uname'].isin(tw_df.loc[:, 'rt_uname']), 'tw_uid']
tw_df

Wrong Output

[Image of the resulting DataFrame]

As you can see, the rt_uid column merely mirrors the tw_uid column.

  • I've looked at this post, but in my case the data needs to be filtered for all usernames (which may change, repeat, etc.) rather than for a specific one, so I couldn't find the answer there.

What am I missing here? Thanks in advance.


1 Answer

Create a dictionary that maps each tw_uname to its tw_uid using dict(zip(...)), then map rt_uname through that dictionary:

tw_df['rt_uid'] = tw_df['rt_uname'].map(dict(zip(tw_df.tw_uname, tw_df.tw_uid)))

   tw_id  tw_uid  rt_uid tw_uname rt_uname
0      0      10    12.0       u1       u3
1      1      10    12.0       u1       u3
2      2      12     NaN       u3     None
3      3      13     NaN       u4     None
4      4      14    10.0       u5       u1
5      5      15    10.0       u6       u1
6      6      16    10.0       u7       u1
7      7      16     NaN       u7     None
8      8      16     NaN       u7     None
9      9      12    13.0       u3       u4
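
For reference, the attempt in the question fails because an assignment through .loc aligns the right-hand side on the row index, not on the usernames, so each row can only receive its own tw_uid (or NaN). A lookup keyed by username avoids that. Below is a minimal self-contained sketch of the same idea using a lookup Series instead of a dict; the name uid_by_name is introduced here for illustration only, and the sketch assumes every username corresponds to exactly one uid.

```python
import numpy as np
import pandas as pd

# Toy data from the question (rt_uid is created below).
tw_df = pd.DataFrame(dict(
    tw_id=np.arange(10),
    tw_uid=[10, 10, 12, 13, 14, 15, 16, 16, 16, 12],
    tw_uname=['u1', 'u1', 'u3', 'u4', 'u5', 'u6', 'u7', 'u7', 'u7', 'u3'],
    rt_uname=['u3', 'u3', None, None, 'u1', 'u1', 'u1', None, None, 'u4'],
))

# Lookup Series: index = username, value = user id.
# drop_duplicates keeps the first row per username so the index is unique
# (assumption: a username never maps to more than one uid).
uid_by_name = (
    tw_df.drop_duplicates('tw_uname')
         .set_index('tw_uname')['tw_uid']
)

# Look up each retweeted username; rows with no rt_uname become NaN.
tw_df['rt_uid'] = tw_df['rt_uname'].map(uid_by_name)
print(tw_df)
```

If a username could ever appear with more than one uid, the two variants differ: dict(zip(...)) keeps the last uid seen, while drop_duplicates keeps the first.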