
I have a vector that has a series of numbers and words.

df <- as.character(c(1234, "Other", 5678, "Abstain"))

I would like to remove the last two digits of each number without affecting the words, so that the result is:

df <- as.character(c(12, "Other", 56, "Abstain"))

Marco Pastor Mayo
  • 1
    Well, vectors in R can't contain a mix of numbers and character values. It has probably already been converted to all characters. The code you provided above isn't actually valid R code (unless `Other` and `Abstain` are variables defined elsewhere). It would be better if you provided a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) just to make things more clear. – MrFlick Oct 08 '18 at 16:51
  • 3
    Lacking that, you might be able to use `gsub("[0-9]{2}$", "", df)`. Note that this will change `"56"` to `""`. – r2evans Oct 08 '18 at 16:53
  • @r2evans was right. That code worked perfectly. Thanks! – Marco Pastor Mayo Oct 08 '18 at 17:26

1 Answer


Probably a bit more robust/versatile/safe than the solution suggested by @r2evans in the comments.

gsub("(^\\d{2,})\\d{2}$", "\\1", df)

what it does:

pattern = "(^\\d{2,})\\d{2}$"

  • ^ matches the start of the string
  • \\d{2,} matches a run of at least two digits (delete the comma if you only want to match strings of exactly four digits)
  • (^\\d{2,}) the parentheses capture the start-of-string anchor and the following run of at least two digits as group 1
  • \\d{2} matches exactly two digits
  • $ matches the end of the string

in short: it matches any string that consists solely of digits, starts with at least two digits, and ends with two more digits (so the minimum length of the digit string is 4)

replacement = "\\1"

  • replaces the entire matched string with the first defined group ( (^\\d{2,}) ) from the pattern described above.
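
As a rough illustration of how the pattern and the replacement work together, here is a minimal sketch. The vector `x` and the name `pattern` are made up for this example and are not from the question.

```r
# Illustrative inputs (hypothetical, not from the original post)
x <- c("1234", "123456", "12", "Other")
pattern <- "(^\\d{2,})\\d{2}$"

# Which elements match at all? Only strings made of four or more digits.
grepl(pattern, x)
#[1]  TRUE  TRUE FALSE FALSE

# The pattern is anchored at both ends, so it can match at most once per
# string; sub() therefore gives the same result as gsub() here.
sub(pattern, "\\1", x)
#[1] "12"    "1234"  "12"    "Other"
```

Note that `"12"` is left alone because it is shorter than four digits, which is the main difference from a pattern that simply strips two trailing digits.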

sample data

df <- c(123, "Other", 5678, "Abstain", "b12345", 123456, "123aa345")

gsub("(^\\d{2,})\\d{2}$", "\\1", df)
#[1] "123"      "Other"    "56"       "Abstain"  "b12345"   "1234"     "123aa345" 
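
To see what the anchoring changes in practice, the sketch below runs the simpler pattern from the comments on the same sample data and compares it with the anchored one. The variable names `simple` and `anchored` are only for illustration.

```r
df <- c(123, "Other", 5678, "Abstain", "b12345", 123456, "123aa345")

# Pattern from the comments: strips any two trailing digits,
# even from mixed strings and from numbers shorter than four digits
simple <- gsub("[0-9]{2}$", "", df)
simple
#[1] "1"       "Other"   "56"      "Abstain" "b123"    "1234"    "123aa3"

# Anchored pattern: only touches strings consisting of four or more digits
anchored <- gsub("(^\\d{2,})\\d{2}$", "\\1", df)

# Elements where the two approaches disagree
df[simple != anchored]
#[1] "123"      "b12345"   "123aa345"
```

Which behaviour is preferable depends on the data: if every numeric entry is known to have at least four digits and no mixed strings occur, both patterns give the same result.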
Wimpel