Split values in a column and reassign it to a new column

Question

In my data frame, one of the columns holds values that are a combination of [state,country].

I tried this code:

voivodeshipdf <- voivodeshipdf %>% mutate(state =  as.character(unlist(str_split(voivodeship, ','))[1]))

but it only reassigns the value of the first row.

How do I update my code so that it splits out the right value for each row?

akrun · Accepted Answer · 2019-08-21T15:05:48.317

An option would be separate

library(tidyverse)
voivodeshipdf %>%
   separate(voivodeship, into = c('state', 'newcol'), sep=",", remove = FALSE) %>%
   select(-newcol)

Or extract

voivodeshipdf %>%
     extract(voivodeship, into = 'state', '^([^,]+),.*', remove = FALSE)

or with word

voivodeshipdf %>%
     mutate(state = word(voivodeship, 1, sep=","))

The issue in the OP's code is that unlist flattens the list returned by str_split into a single vector covering all rows, so [1] selects only the first piece of the first row. That single value is then assigned to the whole column through recycling.

Instead, what we need is to extract the first element from each element of the list returned by str_split, using map or lapply (map would be more appropriate in a tidyverse context):

voivodeshipdf %>% 
        mutate(state =  map_chr(str_split(voivodeship, ','), first))
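To make the difference concrete, here is a minimal sketch comparing the two approaches. The two-row data frame is made up for illustration and is not the OP's data; the names pieces, recycled, per_row and result are likewise hypothetical.

```r
library(dplyr)
library(stringr)
library(purrr)

# Hypothetical sample data, not taken from the original question
voivodeshipdf <- tibble(
  voivodeship = c("Greater Poland voivodeship, poland",
                  "Masovian voivodeship, poland")
)

# str_split() returns a list with one character vector per row
pieces <- str_split(voivodeshipdf$voivodeship, ",")

# unlist() flattens all rows into one vector, so [1] is a single string
# (the first piece of the first row) that mutate() would recycle
recycled <- unlist(pieces)[1]

# map_chr() applies first() to each row's pieces: one value per row
per_row <- map_chr(pieces, first)

result <- voivodeshipdf %>%
  mutate(state = map_chr(str_split(voivodeship, ","), first))

print(recycled)
print(result)
```

Here recycled has length 1, while per_row and result$state have one entry per row, which is what the new column needs.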

score 2 · Answer 2 · answered Aug 21 '19 at 15:01

We can try using sub here for a base R option:

voivodeshipdf$state <- sub("^.*, ", "", voivodeshipdf$voivodeship)
voivodeshipdf$voivodeship <- sub(",.*$", "", voivodeshipdf$voivodeship)

Sample script:

voivodeship <- "Greater Poland voivodeship, poland"
sub("^.*, ", "", voivodeship)
sub(",.*$", "", voivodeship)

[1] "poland"
[1] "Greater Poland voivodeship"
