I have a simple problem:

I have a column with thousands of values and I'm trying to convert it into a dichotomous variable (Yes|No). Replacing strings with 'No' was easy enough, as the value I was converting was a single asterisk:

Data$Complete <- gsub("\\*", "No", Data$Complete)

But when I attempt to replace everything apart from 'No', the following code replaces every value in the column with 'Yes'. I don't understand why, as I'm specifying that only values other than 'No' should be replaced:

Data$Complete <- Data[!Data$Complete %in% c("No"), "Complete"] <- "Yes" 

Any pointers would be appreciated.

Artem
Pryore

1 Answer

You can use a combination of the ifelse and grepl functions to build the Yes/No column in one step, as below:

library(stringi)

# data simulation
set.seed(123)
n <- 1000
data <- data.frame(
  complete = stri_rand_strings(n = n, length = 20, pattern = "[A-Za-z0-9\\*]")
)

# string matching
data$yes_no <- ifelse(grepl("\\*", data$complete), "No", "Yes")
head(data)

Output:

              complete yes_no
1 HmOsw1WtXRxRfZ5tE1Jx    Yes
2 tgdzehXaH8xtgn0TkCJD    Yes
3 7PPM87DSFr1Qn6YC7ktM    Yes
4 e4NGoRoonQkch*SCMbL6     No
5 EfPm5QztsA7eKeJAm4SV    Yes
6 aJTxTtubO8vH2wi7XxZO    Yes
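
Note that the grepl call above flags any string that contains an asterisk. If, as in the question, the cell is exactly a single asterisk, an equality test should be enough. The sketch below uses a made-up four-row Data frame (an assumption, not the asker's real data) and also shows the question's two-step approach with the leading Data$Complete <- removed. In the original line, assignment chains from right to left, so "Yes" is first written into the selected rows and then that same single value "Yes" is assigned to the whole column, which is why every row ends up as Yes.

```r
# Hypothetical toy data standing in for the asker's column
Data <- data.frame(Complete = c("*", "abc", "*", "2019-01-01"),
                   stringsAsFactors = FALSE)

# Keep a copy of the raw values for the one-step variant further down
raw <- Data$Complete

# Step 1 from the question: a lone asterisk becomes "No"
Data$Complete <- gsub("\\*", "No", Data$Complete)

# Step 2: assign into the selected rows only.
# There is no extra `Data$Complete <-` in front, so the other rows are untouched.
Data[!Data$Complete %in% "No", "Complete"] <- "Yes"

# One-step alternative on the raw values: exact match on a single asterisk
one_step <- ifelse(raw == "*", "No", "Yes")

print(Data$Complete)
# [1] "No"  "Yes" "No"  "Yes"
```

Both routes give the same result on this toy data. One difference to keep in mind: ifelse returns NA for missing values, whereas the %in% version turns them into Yes.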
Artem