0

New to Python and Pandas, any help or guidance appreciated.

I'm working with two different csv's, roster.csv and assignment.csv, with fields structured like so:

roster.csv:

first_name, last_name,  user_id,    user
John,   Doe,    0001,   abcd
Jane,   Doe,    0002,   efgh
John,   Smith,  0003,   ijkl

assignment.csv:

user,   text
abcd,   text1
efgh,   text2
ijkl,   text3

I'd like to use roster.csv to create new columns in assignment.csv when the user fields match, adding the first_name, last_name, and user_id fields. The output would look like:

user,   text,   first_name, last_name,  user_id
abcd,   text1,  John,   Doe,    0001
efgh,   text2,  Jane,   Doe,    0002
ijkl,   text3,  John,   Smith,  0003

So far I'm able to create an output that writes the headers for the fields above, but none of the data. Here's my working code thus far:

import pandas as pd

df_roster = pd.read_csv('roster.csv')
df_assignment = pd.read_csv('assignment.csv')

df3 = pd.concat([df_assignment, df_roster[~df_roster.user.isin(df_assignment.user)]])

Any insights on how to proceed? All assistance appreciated.

2 Answers2

1

Why dont you use merge ?

Here is the guide for you. https://pandas.pydata.org/pandas-docs/stable/user_guide/merging.html

alex smolyakov
  • 198
  • 2
  • 8
1

Use merge, not concat, this ensures that the key will be aligned:

df_roster.merge(df_assignment, left_on='user_id', right_on='user')
mozway
  • 194,879
  • 13
  • 39
  • 75