How to change column values based on another column value?

Question

I am trying to change the column values in my csv file that have state codes (state_cd) to state name (state_name). Here are my two csv files and my code.

states.csv

"state_name","state_cd"
"Alabama","AL"
"Alaska","AK"
"Arizona","AZ"
"Arkansas","AR"
"California","CA"
"Colorado","CO"
"Connecticut","CT"
"Delaware","DE"
"District of Columbia","DC"
"Florida","FL"
"Georgia","GA"
"Hawaii","HI"
"Idaho","ID"
"Illinois","IL"
"Indiana","IN"
"Iowa","IA"
"Kansas","KS"
"Kentucky","KY"
"Louisiana","LA"
"Maine","ME"
"Maryland","MD"
"Massachusetts","MA"
"Michigan","MI"
"Minnesota","MN"
"Mississippi","MS"
"Missouri","MO"
"Montana","MT"
"Nebraska","NE"
"Nevada","NV"
"New Hampshire","NH"
"New Jersey","NJ"
"New Mexico","NM"
"New York","NY"
"North Carolina","NC"
"North Dakota","ND"
"Ohio","OH"
"Oklahoma","OK"
"Oregon","OR"
"Pennsylvania","PA"
"Rhode Island","RI"
"South Carolina","SC"
"South Dakota","SD"
"Tennessee","TN"
"Texas","TX"
"Utah","UT"
"Vermont","VT"
"Virginia","VA"
"Washington","WA"
"West Virginia","WV"

schools.csv

"parent_category_nm","category_nm","place_name","address","city","state_cd"
"Schools","Elementary Schools","123 YOU N ME PRESCHOOL","809 W DETWEILLER DR STE A","PEORIA","IL"............................

Code.py

import pandas as pd

df1 = pd.read_csv("school.csv")

df2 = pd.read_csv("state.csv")

df3 = df1.merge(df2, left_on="state_cd", right_on="state_name", how="left")

I am not receiving any errors but the code also isn't doing anything. The expected output would be something like this with the state code changed to the state name.

"parent_category_nm","category_nm","place_name","address","city","state_cd"
"Schools","Elementary Schools","123 YOU N ME PRESCHOOL","809 W DETWEILLER DR STE A","PEORIA","Illinois"............................

Can anyone help with this?

@MichaelSzczesny meaning its an enormous csv and I can't post all of it — Steve, Sep 28 '21 at 17:50
Why are the values in your csv file surrounded in quotes? Seems to me like improper csv formatting. — Jab, Sep 28 '21 at 17:52
Does this answer your question? [Pandas Merging 101](https://stackoverflow.com/questions/53645882/pandas-merging-101) — Michael Szczesny, Sep 28 '21 at 18:02

Juan C · Answer 1 · 2021-09-28T18:14:04.720

4

What you want to do is:

df3 = df1.merge(df2, on="state_cd", how="left")

This will add the state_name column to your school dataframe.

If you want to keep state_cd as the column with the complete names:

 df1['state_cd'] = df1.state_cd.map(df2.set_index('state_cd').state_name)

edited Sep 28 '21 at 18:14

answered Sep 28 '21 at 17:50

Juan C

5,846
2
17
51

I have removed those – Steve Sep 28 '21 at 17:57
I have ran this code and the state_cd column in schools.csv still have the state code instead of the state name – Steve Sep 28 '21 at 18:01
But you want the names in the `state_cd` column? Might be a little confusing down the way – Juan C Sep 28 '21 at 18:02
Basically I want all the columns with the state_cd in the schools.csv to actually have the state name instead of the abbreviation. For example the first row that has "IL", I wish to change to Illinois. The column header is less important and could either be state_name or stay state_cd – Steve Sep 28 '21 at 18:06
@MichaelSzczesny I can update with no leading spaces – Steve Sep 28 '21 at 18:06
1

Then this is the correct answer. This solution does exactly what you want. – Michael Szczesny Sep 28 '21 at 18:06
@MichaelSzczesny hmmm I wonder why it's not doing anything for me – Steve Sep 28 '21 at 18:11
Edited another option, I think it's more confusing and less readable, but seems to be what you're looking for – Juan C Sep 28 '21 at 18:14

score 0 · Answer 2 · answered Sep 28 '21 at 17:52

0

df1['state'] = df1['state_cd'].merge(df2,how='left')

answered Sep 28 '21 at 17:52

SCCJS

96
8

How to change column values based on another column value?

2 Answers2