
I'm new here, so forgive me if this sounds dumb.

I am currently working on a project where the data set has a variable that consists of two digits, for example 12 or 34. The second digit ranges from 1 to 5, and that is the one I need to isolate. There are roughly 18,000 observations, so doing this manually is not an option for me.

I tried the separate function from tidyr.

demog %>% separate(socioEcon, c("A", "B"))

I also tried a couple of other things which I already deleted.

How would you guys try to split the data?

Jon Spring
Chris

4 Answers

3
demog %>% mutate(B = socioEcon %% 10)
Dmitry Zotikov
0

Most separation functions expect a character vector, so you just need to convert the data type back and forth. Here's a riff on my answer here: using strsplit and subset in dplyr and mutate

library(tidyverse)

tribble(
  ~id, ~to_split,
  1, 15,
  2, 42,
  3, 53
) %>% 
  dplyr::mutate(second_digit = stringr::str_split(as.character(to_split), "") %>% 
                  purrr::map_chr(., 2))

As you can see, we're using str_split on a character vector and splitting on the empty string (so you get each character separately), then using purrr's map_chr to grab the second character. Note that second_digit comes back as a character; wrap it in as.numeric() if you need a number. This is more efficient at a larger scale than tidyr's separate, but you could also use separate once the column is converted to a character vector.

However, the most efficient way is definitely to do the math, as the other answer shows. :)
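
To spell out what "doing the math" means, here is a minimal base R sketch. It assumes socioEcon is stored as a two-digit number; the toy data frame and the column names first_digit and second_digit are made up for illustration.

```r
# Toy data standing in for the real demog data frame (values are invented)
demog <- data.frame(socioEcon = c(12, 34, 55))

# Integer division by 10 drops the last digit;
# modulo 10 keeps only the last digit
demog$first_digit  <- demog$socioEcon %/% 10
demog$second_digit <- demog$socioEcon %% 10

demog
```

Both operators are vectorised, so this should scale to 18,000 rows without any string conversion.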

GenesRus
0
demog <- data.frame(socioEcon = c(12,34))

library(dplyr)
demog %>%
  mutate(socioEcon = as.character(socioEcon)) %>%
  tidyr::separate(socioEcon, c("A", "B"), sep = 1, remove = FALSE)


  socioEcon A B
1        12 1 2
2        34 3 4
Jon Spring
0
demog <- data.frame(socioEcon = c(12,34))

library(stringr)
demog[, c("A", "B")] <- as.numeric(str_split_fixed(demog$socioEcon, "", 2))
carl