Python & Pandas: writing columns based on matches in different csvs

Question

New to Python and Pandas, any help or guidance appreciated.

I'm working with two different csv's, roster.csv and assignment.csv, with fields structured like so:

roster.csv:

first_name, last_name,  user_id,    user
John,   Doe,    0001,   abcd
Jane,   Doe,    0002,   efgh
John,   Smith,  0003,   ijkl

assignment.csv:

user,   text
abcd,   text1
efgh,   text2
ijkl,   text3

I'd like to use roster.csv to create new columns in assignment.csv when the user fields match, adding the first_name, last_name, and user_id fields. The output would look like:

user,   text,   first_name, last_name,  user_id
abcd,   text1,  John,   Doe,    0001
efgh,   text2,  Jane,   Doe,    0002
ijkl,   text3,  John,   Smith,  0003

So far I'm able to create an output that writes the headers for the fields above, but none of the data. Here's my working code thus far:

import pandas as pd

df_roster = pd.read_csv('roster.csv')
df_assignment = pd.read_csv('assignment.csv')

df3 = pd.concat([df_assignment, df_roster[~df_roster.user.isin(df_assignment.user)]])

Any insights on how to proceed? All assistance appreciated.

score 1 · Accepted Answer · answered Aug 22 '21 at 15:03

1

Why dont you use merge ?

Here is the guide for you. https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

answered Aug 22 '21 at 15:03

alex smolyakov

198
2
8

score 1 · Answer 2 · answered Aug 22 '21 at 15:05

1

Use merge, not concat, this ensures that the key will be aligned:

df_roster.merge(df_assignment, left_on='user_id', right_on='user')

answered Aug 22 '21 at 15:05

mozway

194,879
13
39
75

Python & Pandas: writing columns based on matches in different csvs

2 Answers2