I searched Stack Overflow and couldn't find an answer. Sorry if this has already been asked.

I have a character vector of numbers, some of which have letters appended:

x = c("1", "12", "14A", "12B", "6")

I want to split each element into two separate columns, one with the numbers and one with the letters:

x = c(1, 12, 14, 12, 6)
y = c(NA, NA, "A", "B", NA)

I would appreciate any help.

R-MASHup
  • Why does the 6 at the end stay? Try `gsub('\\D+', '', x)` and `gsub('\\d+', '', x)` – Sotos Dec 13 '22 at 07:53
  • My first intuition, after using regex, was to use `parse_number` (which works) and `parse_character` (which unfortunately does not work), both from the `readr` package. See: . It would be cool if `parse_character` could parse only character text. – TarJae Dec 13 '22 at 09:28
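
The `gsub` suggestion in the first comment can be sketched in base R as follows. This is only an illustration, not something the commenter posted in full: the names `num`, `let`, and `result` are made up here, and the steps that convert to numeric and replace empty strings with `NA` are assumptions added to match the output the question asks for.

```r
x <- c("1", "12", "14A", "12B", "6")

# Remove every run of non-digits to keep the numeric part, then convert to numeric.
num <- as.numeric(gsub("\\D+", "", x))

# Remove every run of digits to keep the letter part.
# Elements without letters become empty strings.
let <- gsub("\\d+", "", x)

# Replace the empty strings with NA (an added step, to match the desired y).
let[let == ""] <- NA_character_

result <- data.frame(x = num, y = let, stringsAsFactors = FALSE)
print(result)
```

Note that `gsub` removes all matching characters wherever they occur, so this sketch assumes each element is a run of digits optionally followed by letters, as in the example data.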

2 Answers

You can use the `extract()` function from tidyr:

library(tidyr)
# Group 1 captures the digits; group 2 captures optional trailing uppercase letters.
data.frame(x) %>%
  extract(x, into = c("x","y"), regex = "(\\d+)([A-Z]+)?")
   x y
1  1  
2 12  
3 14 A
4 12 B
5  6  
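
In the output above, `y` prints as blank where there is no letter, whereas the question asks for `NA` and for `x` to be numeric. A possible follow-up, which is not part of the original answer, is sketched below. It uses the `convert = TRUE` argument of `extract()` to convert the new columns to their natural types; the name `out` and the blank-to-`NA` step are assumptions made for illustration.

```r
library(tidyr)

x <- c("1", "12", "14A", "12B", "6")

# convert = TRUE asks extract() to run type conversion on the new columns,
# so the digit column comes back as a number rather than a string.
out <- tidyr::extract(
  data.frame(x, stringsAsFactors = FALSE),
  x,
  into = c("x", "y"),
  regex = "(\\d+)([A-Z]+)?",
  convert = TRUE
)

# Turn any remaining empty strings in y into NA (added step).
# %in% is used so that existing NA values are left alone.
out$y[out$y %in% ""] <- NA

print(out)
```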
Chris Ruehlemann
Using `str_extract()` from stringr, we can try:

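library(stringr)

# Extract the letters first, because the next line overwrites x.
y <- str_extract(x, "\\D+")
# Extract the digits; str_extract() returns NA where nothing matches.
x <- str_extract(x, "\\d+")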
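
The two calls above return character vectors, so `x` still holds strings such as "14". To get the two-column result the question asks for, one option is sketched below. The `as.numeric()` conversion and the names `raw` and `result` are additions for illustration, not part of the original answer.

```r
library(stringr)

raw <- c("1", "12", "14A", "12B", "6")

result <- data.frame(
  # First run of digits in each element, converted from character to numeric.
  x = as.numeric(str_extract(raw, "\\d+")),
  # First run of non-digits; NA where an element has no letters.
  y = str_extract(raw, "\\D+"),
  stringsAsFactors = FALSE
)

print(result)
```

Keeping the input in a separate variable (`raw`) avoids depending on the order of the two extractions.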
Tim Biegeleisen