I know that in R, for loops should generally be avoided and vectorized operations used instead.
Still, I want to solve this with a for loop first, then try the apply family, and then also Rcpp.
I load a dataset containing one column of alphanumeric passwords.
Once it is loaded (I take a sample, for speed), I want to create new columns with values 0 or 1 based on conditions such as "contains_lower_chars", "contains_numbers", and so on.
Here is what I tried, but it doesn't work: every column I create ends up with the same values.
library(tidyverse)
set.seed(123)

# load dataset from url, skip the first 16 rows
df <- read.csv('http://datashaping.com/passwords.txt', header = FALSE, skip = 16) %>%
  sample_frac(.001) %>%
  rename(password = V1)

patterns <- c("[a-z]", "[A-Z]", "[0-9]+")

df$has_lower <- 0
df$has_upper <- 0
df$has_numeric <- 0

for (i in 1:nrow(df)) {
  for (j in patterns) {
    n <- ifelse(grepl(j, df$password[i]), 1, 0)
  }
  df$has_lower[i] <- n
  df$has_upper[i] <- n
  df$has_numeric[i] <- n
}
The output I have in mind is:

password    has_lower  has_upper  has_numeric
Bigmaccas           1          1            0
0127515559          0          0            1
dbqky73p            1          0            1
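
For reference, here is a minimal sketch of one way the loop could be repaired, followed by a vectorized alternative. In the loop above, `n` is overwritten on every pass of the inner loop, so once the inner loop finishes it holds only the result for the last pattern (`"[0-9]+"`), and that single value is then written to all three columns. The sketch pairs each pattern with its own column by giving `patterns` names. Those names, the three-row stand-in data frame (taken from the expected output rather than the real download), and the `df_vec` variable are assumptions made for illustration.

```r
# Stand-in for the sampled data: the three passwords from the expected output
df <- data.frame(password = c("Bigmaccas", "0127515559", "dbqky73p"),
                 stringsAsFactors = FALSE)

# Named patterns: each name is the column that receives that pattern's result
patterns <- c(has_lower = "[a-z]", has_upper = "[A-Z]", has_numeric = "[0-9]")

# Loop version: initialise each column, then fill it row by row
for (col in names(patterns)) {
  df[[col]] <- 0L
}
for (i in seq_len(nrow(df))) {
  for (col in names(patterns)) {
    # Store this pattern's result in this pattern's own column
    df[[col]][i] <- as.integer(grepl(patterns[[col]], df$password[i]))
  }
}

# Vectorized version: grepl() accepts the whole password column at once,
# so only the (short) vector of patterns needs to be iterated over
df_vec <- df["password"]
df_vec[names(patterns)] <- lapply(
  patterns,
  function(p) as.integer(grepl(p, df_vec$password))
)

print(df)
print(df_vec)
```

Both versions should produce the same three indicator columns. The `lapply()` form avoids the row loop entirely because `grepl()` is already vectorized over its text argument, and it keeps working when the data frame has a single row.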