Problem: Add a new column to a DataFrame and populate it with values from a column of another DataFrame wherever a condition holds, in one line of code, similar to a list comprehension.
Example code:
I create a DataFrame called df with some pupil information:
import pandas as pd

data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz',
                               'Maricopa', 'Yuma'])
Then a second DataFrame called df_extra which has a string representation of the year:
extra_data = {'year': [2012, 2013, 2014],
              'yr_string': ['twenty twelve', 'twenty thirteen', 'twenty fourteen']}
df_extra = pd.DataFrame(extra_data)
Now, how can I add the values of yr_string as a new column to df, matched on the numerical year, in one line of code?
I can easily do this with a couple of for loops, but I would really like to know whether it is possible in one line, similar to a list comprehension.
I have already searched the existing questions here, but found nothing that discusses adding a new column to an existing DataFrame from another DataFrame, based on a condition, in one line.
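For reference, here is a minimal sketch of one possible one-line approach, not necessarily the only or best one. It assumes every year in df_extra appears only once, so that df_extra can act as a lookup table: set_index turns the year column into the index, and Series.map then looks up each year in df against it. Years in df with no match in df_extra would come out as NaN.

```python
import pandas as pd

# Same example data as in the question.
data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'],
        'year': [2012, 2012, 2013, 2014, 2014],
        'reports': [4, 24, 31, 2, 3]}
df = pd.DataFrame(data, index=['Cochice', 'Pima', 'Santa Cruz',
                               'Maricopa', 'Yuma'])

extra_data = {'year': [2012, 2013, 2014],
              'yr_string': ['twenty twelve', 'twenty thirteen', 'twenty fourteen']}
df_extra = pd.DataFrame(extra_data)

# The one-liner: build a year -> yr_string lookup Series from df_extra
# and map each value of df['year'] through it.
# Assumption: the years in df_extra are unique.
df['yr_string'] = df['year'].map(df_extra.set_index('year')['yr_string'])

print(df)
```

A left merge, df.merge(df_extra, on='year', how='left'), should give the same column, but merging on a column replaces the original index labels (Cochice, Pima, ...) with a default integer index, so the map version seems the better fit when the index has to be kept.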