I'm preparing a dataset for analysis in RStudio. The code below duplicates the data so that every variable column gets a new column next to it, where I put the cleaned data. This works well when the original column holds a single number, which I copy into the new column and then translate into a written answer, but I don't know how to pull a number out of an answer that also contains text. In the code below, column v1 holds sentences with numbers in them. My mutate code doesn't transfer anything, because the values are treated as text. Is there a way to take the number from each value and put it into the new column? My goal is for column v1.1 to contain 11 and 22 rather than the whole sentences in column v1.
library(tidyr)
library(dplyr)
df <- data.frame(v1 = c("11 because of reason x", "22 but I like this"),
                 pages = c(32, 45),
                 name = c("spark", "python"))
df
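# Duplicate every column, then sort so that each copy sits next to its original
df2 <- cbind(df, df)
df2 <- df2[, sort(names(df2))]
# Blank out the copies (every second column) so they can hold the cleaned values
df2[, seq(2, 6, by = 2)] <- NA
# Note: this replaces "." with ".", so the column names are left unchanged
names(df2) <- sub("\\.", ".", names(df2))
# Copy v1 into v1.1 only where v1 equals 11 or 22
df2 <- df2 %>%
  mutate(v1.1 = ifelse((v1 == 11) | (v1 == 22), v1, v1.1))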
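For reference, here is a small base R check of why the comparison in the mutate call never matches, using only the two example strings from v1. It is a sketch rather than a fix, and `has_number` is just an illustrative name:

```r
v1 <- c("11 because of reason x", "22 but I like this")

# `==` converts the number 11 to the string "11" and compares whole strings,
# so a sentence that merely starts with 11 is never equal to it.
v1 == 11
# [1] FALSE FALSE

# grepl() instead reports whether a digit occurs anywhere in each string.
has_number <- grepl("[0-9]", v1)
has_number
# [1] TRUE TRUE
```

So a condition based on grepl() can detect that a number is present, but the number itself still has to be extracted separately.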
I'm hoping to use the mutate call from above with some kind of condition that detects whether a cell contains a number at all, even when there is text around it, and puts only the number in the corresponding new column. I found the code below for separating numbers from text, but it didn't work for me. I could probably make it work if I could somehow include it in the mutate call.
# Split v1 wherever a letter is immediately followed by a digit
df2 <- df %>%
  separate(v1,
           into = c("text", "num"),
           sep = "(?<=[A-Za-z])(?=[0-9])"
  )
When I used the code above to separate numbers and text, it didn't work either, because my numbers sit inside sentences with parentheses and other characters, and the pattern seems to only handle values like "AB55". I need a way to reduce something like "(5+6)I think" to just "5" or "6". Is that at all possible? Thank you! I hope you all have a great day!
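One possible approach, as a minimal sketch: pull the first run of digits out of each cell with base R regular expressions and call that from mutate. The helper name `first_number`, the data frame name `df_demo`, and the two extra example rows are made up for illustration. The sketch assumes the numbers are plain runs of digits (no decimals or minus signs) and that the first number in a cell is the one to keep.

```r
library(dplyr)

# Hypothetical helper: return the first run of digits in each string as a
# number, or NA when the string contains no digits at all.
first_number <- function(x) {
  x <- as.character(x)
  m <- regexpr("[0-9]+", x)          # position of the first digit run, -1 if none
  hit <- !is.na(m) & m != -1
  out <- rep(NA_character_, length(x))
  out[hit] <- regmatches(x, m)       # regmatches() returns only the matched pieces
  as.numeric(out)
}

df_demo <- data.frame(
  v1 = c("11 because of reason x", "22 but I like this",
         "(5+6)I think", "no number here"),
  stringsAsFactors = FALSE
)

# The cleaned column sits next to the original and holds only the number.
df2 <- df_demo %>%
  mutate(v1.1 = first_number(v1))
df2$v1.1
# [1] 11 22  5 NA

# If every number in a cell is needed instead, gregexpr() finds all matches.
all_numbers <- regmatches(df_demo$v1, gregexpr("[0-9]+", df_demo$v1))
all_numbers[[3]]
# [1] "5" "6"
```

Cells without any digits come back as NA, so the same call can be applied to a whole column without a separate ifelse condition.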