I have a column of data that looks like this:
**varX**
Q1#_1
Q1#_5
Q1#_10
I would like to edit the data to look like this:
**varX**
1
5
10
Is there a command I could use to simply keep all information after the underscore?
If you want a tidyverse solution, you can use `str_extract` from the `stringr` package:
data %>%
  mutate(varX = str_extract(varX, "[0-9]+$")) %>%
  mutate(varX = as.numeric(varX)) # include this last line if you want a number rather than a character string
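For reference, here is a self-contained sketch of the same pipeline. The data frame `data` is a made-up stand-in for the one in the question, and it assumes `dplyr` and `stringr` are installed. Note that `str_extract` returns `NA` for any value that does not end in digits.

```r
library(dplyr)
library(stringr)

# Hypothetical data frame matching the question's example
data <- tibble(varX = c("Q1#_1", "Q1#_5", "Q1#_10"))

# Extract the trailing run of digits, then convert it to a number
result <- data %>%
  mutate(varX = as.numeric(str_extract(varX, "[0-9]+$")))

result$varX  # 1 5 10
```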
In case you always have the Q1#_ string, you can do:
gsub("Q1#_", "", df$varX)
I think you're looking for `sub`, which substitutes a certain part of a string with something else. You can give it a regular expression if you want to go fancy, or just give it a literal:
VarX <- sub('Q1#_', '', VarX, fixed = TRUE)
The fancy way ("remove everything before and including the underscore") would be
VarX <- sub('^.*_', '', VarX)
And you may want to convert it to a numeric or an integer:
VarX <- as.integer(sub('Q1#_', '', VarX, fixed = TRUE)) # or as.numeric
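To see how the literal and regular-expression variants differ, here is a small sketch on made-up values. The vector `x` and the extra value `"Q2_a_7"` are hypothetical, added only to show what happens when a string contains more than one underscore; the third pattern is an additional variant, not part of the answer above.

```r
# Hypothetical sample values; the last one has two underscores
x <- c("Q1#_1", "Q1#_5", "Q1#_10", "Q2_a_7")

# Literal match: only strips the exact prefix "Q1#_"
literal <- sub("Q1#_", "", x, fixed = TRUE)

# Greedy regex: strips everything up to and including the LAST underscore
greedy <- sub("^.*_", "", x)

# Negated character class: strips only up to the FIRST underscore
first_only <- sub("^[^_]*_", "", x)

literal     # "1" "5" "10" "Q2_a_7"
greedy      # "1" "5" "10" "7"
first_only  # "1" "5" "10" "a_7"
```

If every value has exactly one underscore, as in the question, all three give the same result.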
You could use regular expressions:
df[["varX"]] <- sub(".+_", "", df[["varX"]])
df
  varX
1    1
2    5
3   10
Or, without regular expressions, with `strsplit()`:
df[["varX"]] <- sapply(df[["varX"]], function(x) strsplit(x, "_")[[c(1,2)]])
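A slightly different way to write the `strsplit()` approach, as a sketch: `strsplit()` is already vectorised, so it can be called once on the whole column, and `vapply()` then picks the second piece of each split. The data frame below is a made-up copy of the question's example, and it assumes `varX` is a character column (not a factor) whose values all contain an underscore.

```r
# Hypothetical data frame matching the question's example
df <- data.frame(varX = c("Q1#_1", "Q1#_5", "Q1#_10"), stringsAsFactors = FALSE)

# Split every value on a literal "_" (fixed = TRUE disables regex matching)
parts <- strsplit(df[["varX"]], "_", fixed = TRUE)

# Keep the second piece of each split; vapply enforces one string per element
df[["varX"]] <- vapply(parts, function(p) p[2], character(1))

# Optionally convert to integer
df[["varX"]] <- as.integer(df[["varX"]])
df[["varX"]]  # 1 5 10
```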