2

I have a list of observations in one column that I'd like to rename. The dataframe is shown below. The column GEO has 40 distinct names that I'd like to rename. I've created another list with 40 alternative names that I'd like to use. Though I have thousands of observations, I only have 40 distinct names in there.

DataFrame

**Code to Reproduce**

library(cansim)
library(dplyr)

HPI_LIST <- c("v111955442", "v111955445",   "v111955448",   "v111955451",   "v111955454",   "v111955457",   "v111955460", "v111955463", "v111955466",   "v111955469",   "v111955472",   "v111955475",   "v111955478",   "v111955481",   "v111955484",   "v111955487",   "v111955490",   "v111955493",   "v111955496",   "v111955499",   "v111955502",   "v111955505",   "v111955508",   "v111955511",   "v111955514",   "v111955517",   "v111955520",   "v111955523",   "v111955526",   "v111955529",   "v111955532",   "v111955535",   "v111955538",   "v111955541",   "v111955544",   "v111955547",   "v111955550",   "v111955553",   "v111955556",   "v111955559")

NewHPIData <- get_cansim("18-10-0205-01")

NewHPI <- NewHPIData %>% filter(VECTOR %in% HPI_LIST, REF_DATE > 2018) %>% 
          arrange((COORDINATE)) %>% 
          select(REF_DATE, GEO, VALUE) %>% mutate(VALUE = VALUE / 100) %>%
          arrange(GEO)

I'd like to update names in GEO for example I'd like to rename Alberta with "New Housing Price Index - AB (x 100)"

M--
  • 25,431
  • 8
  • 61
  • 93
Dal
  • 317
  • 1
  • 8
  • 8
    Please do not post an image of data: it cannot be copied or searched (SEO), it breaks screen-readers, and it may not fit well on some mobile devices. Ref: https://meta.stackoverflow.com/a/285557/3358272. Instead, please make this question reproducible by including copy-able data (e.g., output from `dput(head(x))`), what code you've tried so far, and what your expected output is. Refs: https://stackoverflow.com/questions/5963269, https://stackoverflow.com/help/mcve, and https://stackoverflow.com/tags/r/info. – r2evans Nov 06 '18 at 22:07
  • 2
    One option is to use `merge()` to add your new names onto your dataframe. Then work with the merged-on column instead of the `GEO` column. If you provide example data (using, e.g., `dput()` as suggested by @r2evans), I'd be happy to provide you with the code to do so. – DanY Nov 06 '18 at 22:12
  • 1
    We can use named lookup vector, see `setNames` example in [my answer here](https://stackoverflow.com/a/47463563/680068). – zx8754 Nov 06 '18 at 22:16
  • Though there are 40 distinct names, the number of observations run in thousands and I'd like to just map the new names to these 40 distinct ones. – Dal Nov 06 '18 at 22:31

1 Answers1

1

Define your name map as a list

nameMap <- list( "Alberta" = "New Housing Price Index - AB (x 100)",
                "Atlantic Region" = "New Name 1",
                "British Columbia" = "New Name 2"
                ## and so on )

Then call recode on your GEO column using mutate_at:

NewHPI %>% mutate_at( "GEO", recode, !!!nameMap )
# # A tibble: 320 x 3
#    REF_DATE GEO                                  VALUE
#    <chr>    <chr>                                <dbl>
#  1 2018-01  New Housing Price Index - AB (x 100) 1.00 
#  2 2018-02  New Housing Price Index - AB (x 100) 0.998
#  3 2018-03  New Housing Price Index - AB (x 100) 0.996
#  4 2018-04  New Housing Price Index - AB (x 100) 0.996
#  5 2018-05  New Housing Price Index - AB (x 100) 0.996
#  6 2018-06  New Housing Price Index - AB (x 100) 0.998
#  7 2018-07  New Housing Price Index - AB (x 100) 0.999
#  8 2018-08  New Housing Price Index - AB (x 100) 0.994
#  9 2018-01  New Name 1                           1.00 
# 10 2018-02  New Name 1                           1.00 

The !!! operator is needed, because recode accepts an arbitrary number of arguments using ..., but you have a predefined list. This mechanism is known as unquote-splicing.

Artem Sokolov
  • 13,196
  • 4
  • 43
  • 74