How to combine info from two different sized tibbles?

Question

New to R, just learning.

I have two tibbles. One, `statecodes`, has 67 rows and maps each Census Bureau (CB) state code to a state name and abbreviation. The other, `shapedata`, has 3233 rows with information about the size of each county in the US, and uses the same state codes as the first tibble. I would like to add the name and abbreviation to the second tibble.

> head(statecodes)
# A tibble: 6 × 3
  Name       Abbrev Code 
  <chr>      <chr>  <chr>
1 Alabama    AL     01   
2 Alaska     AK     02   
3 Arizona    AZ     04   
4 Arkansas   AR     05   
5 California CA     06   
6 Colorado   CO     08 
  
> head(shapedata)
# A tibble: 6 × 9
  STATEFP COUNTYFP COUNTYNS AFFGEOID       GEOID NAME       LSAD  ALAND      AWATER   
  <chr>   <chr>    <chr>    <chr>          <chr> <chr>      <chr> <chr>      <chr>    
1 39      071      01074048 0500000US39071 39071 Highland   06    1432479992 12194983 
2 06      003      01675840 0500000US06003 06003 Alpine     06    1912292630 12557304 
3 12      033      00295737 0500000US12033 12033 Escambia   06    1701544502 563927612
4 17      101      00424252 0500000US17101 17101 Lawrence   06    963936864  5077783  
5 28      153      00695797 0500000US28153 28153 Wayne      06    2099745573 7255476  
6 28      141      00695791 0500000US28141 28141 Tishomingo 06    1098938845 52360190 


> nrow(statecodes)
[1] 67

> nrow(shapedata)
[1] 3233

I can't use `mutate` directly because the input and output are differently sized. I've looked at purrr, but don't see an obvious way to use it. I was trying to do something like this:

shapedata <- shapedata %>% mutate(statename = statecodes[statecodes$Code == STATEFP,]$Name)

where `STATEFP` holds the same state code as `Code` in `statecodes`.

I could write a for loop, I'm just wondering if there's a more R-like method.

TIA

If you want "tidyverse" functions use `shapedata <- left_join(shapedata, statecodes, by = c('STATEFP' = 'Code'))` — qdread, Feb 16 '22 at 20:21
Here's a possible base R solution: `statecodes$Name[match(shapedata$STATEFP, statecodes$Code)]` Are you obtaining objects `statecodes` and `shapedata` from a package? A minimal reproducible example would be helpful — Alexander Christensen, Feb 16 '22 at 20:22
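
For reference, the `match()` idea from the comment above can be sketched in base R as follows. This is only an illustration on toy data: the two small data frames stand in for the real `statecodes` and `shapedata`, and the new column names `statename` and `abbrev` are assumptions, not something from the question.

```r
# Toy stand-ins for the real tables (a few rows only, for illustration)
statecodes <- data.frame(
  Name   = c("Alabama", "California"),
  Abbrev = c("AL", "CA"),
  Code   = c("01", "06"),
  stringsAsFactors = FALSE
)
shapedata <- data.frame(
  STATEFP = c("06", "01", "06"),
  stringsAsFactors = FALSE
)

# match() returns, for each STATEFP, the row position of that code in statecodes
idx <- match(shapedata$STATEFP, statecodes$Code)

# Index the lookup columns with those positions; unmatched codes would give NA
shapedata$statename <- statecodes$Name[idx]
shapedata$abbrev    <- statecodes$Abbrev[idx]
```

Because `idx` has one entry per row of `shapedata`, the looked-up vectors have the right length to be assigned as new columns, which is what the attempted `mutate()` call was missing.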

score 1 · Accepted Answer · answered Feb 16 '22 at 20:22


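I guess you are looking for a join. `left_join()` keeps every row of `shapedata` and adds the matching `Name` and `Abbrev` columns from `statecodes`: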

shapedata %>%
    left_join(statecodes, 
              by = c("STATEFP" = "Code"))
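
A minimal self-contained sketch of the same join on toy data, assuming the dplyr and tibble packages are installed. The rows here are made up for illustration (the code "99" is deliberately invented so that it has no match), and `joined` is just a name chosen for the example.

```r
library(dplyr)
library(tibble)

# Toy lookup table: state code -> name and abbreviation
statecodes <- tibble(
  Name   = c("Alabama", "California"),
  Abbrev = c("AL", "CA"),
  Code   = c("01", "06")
)

# Toy county table; "99" is an invented code that is absent from statecodes
shapedata <- tibble(
  STATEFP = c("06", "01", "99"),
  NAME    = c("Alpine", "Autauga", "Nowhere")
)

# Keep every row of shapedata and attach Name/Abbrev where STATEFP equals Code
joined <- shapedata %>%
  left_join(statecodes, by = c("STATEFP" = "Code"))

print(joined)
```

The result has the same number of rows as `shapedata`. Rows whose code is not found in `statecodes` get `NA` in `Name` and `Abbrev`, so checking for `NA` afterwards is a quick way to spot codes that failed to match.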


Josep Pueyo


Wow! I haven't really been able to grok joins yet, but that answer certainly helped me along. Works – whdaffer Feb 16 '22 at 20:35
So, could you mark as the right answer and vote it? Thank you. ;) – Josep Pueyo Feb 17 '22 at 21:03
