I have a list of 25 passwords that I want to check against another list of 50, returning the matches. This is for a university project on passwords. The list of 25 holds the most commonly used passwords, and I would like R to tell me which of my 50 passwords match any of those 25. However, I keep receiving the following error:
Error in `$<-.data.frame`(`*tmp*`, "Percent", value = character(0)) :
replacement has 0 rows, data has 25
I am using the following code:
makeCounts <- function(x) {
return(x=list("count"=sum(grepl(x, Final_DF$pswd, ignore.case=TRUE))))
}
# creates a local variable named tmp which is removed afterwards
printCounts <- function(ct) {
tmp <- data.frame(Term=names(ct), Count=as.numeric(unlist(ct)))
tmp$Percent <- sprintf("%3.2f%%", ((tmp$Count / nrow(Final_DF$Pswd) * 100)))
print(tmp[order(-tmp$Count),], row.names=FALSE)
}
# create the 25 most commonly used passwords
worst.pass <- c("password", "123456", "12345678", "qwerty", "abc123",
"monkey", "1234567", "Qwertyuiop", "123", "dragon",
"000000", "1111111", "iloveyou", "1234", "12345",
"1234567890", "1q2w3e4r5t", "ashely", "shadow", "123123",
"654321", "superman", "sunshine", "tinkle", "football")
worst.ct <- sapply(worst.pass, makeCounts, simplify=FALSE)
printCounts(worst.ct)
My 50 passwords are stored in the Pswd column of my data frame, Final_DF$Pswd, which looks like this:
> Final_DF$Pswd
[1] "monkey" "iloveyou" "dragon" "jbI2pnK$xi" "password" "computer" "!qessw"
[8] "tUNh&SSm6!" "sunshine" "wYrUeWV" "superman" "samsung" "utoXGe6$" "master"
[15] "wjZC&OvXX" "0R1cNTm9sGir" "Fbuu2bs89?" "pokemon" "secret" "x&W1TjO59" "buster"
[22] "purple" "shine" "flower" "marina" "Tg%OQT$0" "SbDUV&nOX" "peanut"
[29] "angel" "?1LOEc4Zfk" "computer" "spiderman" "nothing" "$M6LgmQgv$" "orange"
[36] "knight" "american" "outback" "TfuRpt3PiZ" "air" "surf" "lEi2a$$eyz"
[43] "date" "V$683rx$p" "newcastle" "estate" "foxy" "ginger" "coffee"
[50] "legs"
The traceback of the error when I run printCounts(worst.ct) reads:
Error in `$<-.data.frame`(`*tmp*`, "Percent", value = character(0)) :
replacement has 0 rows, data has 25
4. stop(sprintf(ngettext(N, "replacement has %d row, data has %d",
       "replacement has %d rows, data has %d"), N, nrows), domain = NA)
3. `$<-.data.frame`(`*tmp*`, "Percent", value = character(0))
2. `$<-`(`*tmp*`, "Percent", value = character(0))
1. printCounts(worst.ct)
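The `value = character(0)` in the traceback suggests the right-hand side of the `tmp$Percent` assignment is empty. A minimal sketch of one likely cause, assuming it comes from calling `nrow()` on a single column: `nrow()` of a plain vector is `NULL`, and arithmetic with `NULL` yields a zero-length result. The names `demo_df`, `n_bad`, `n_ok`, `pct_bad` and `pct_ok` are made up for illustration and are not part of the original code.

```r
# Hypothetical stand-in for Final_DF, reduced to three passwords
demo_df <- data.frame(Pswd = c("monkey", "iloveyou", "legs"),
                      stringsAsFactors = FALSE)

# nrow() is meant for data frames and matrices; a column is a plain vector
n_bad <- nrow(demo_df$Pswd)        # NULL
n_ok  <- length(demo_df$Pswd)      # 3, same as nrow(demo_df)

# Dividing by NULL gives numeric(0), so sprintf() returns character(0)
pct_bad <- sprintf("%3.2f%%", (1 / n_bad) * 100)
pct_ok  <- sprintf("%3.2f%%", (1 / n_ok) * 100)

print(pct_bad)
print(pct_ok)
```

If this is what is happening, swapping `nrow(Final_DF$Pswd)` for `nrow(Final_DF)` or `length(Final_DF$Pswd)` should give `sprintf()` one value per row of `tmp`.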
I have read a couple of forum posts, and I am not sure whether this has something to do with NA values. I am new to R and have been scratching my head over this for some time.
Can anybody please tell me where I am going wrong?
> dput(Final_DF)
structure(list(gender = c("female", "male", "male", "female",
"female", "male", "male", "male", "male", "female", "male", "male",
"female", "female", "female", "female", "male", "female", "male",
"male", "female", "female", "female", "female", "female", "female",
"male", "female", "female", "female", "female", "female", "female",
"female", "male", "male", "female", "female", "male", "female",
"female", "male", "female", "female", "male", "male", "male",
"male", "male", "male"), age = structure(c(47L, 43L, 65L, 24L,
44L, 60L, 26L, 25L, 62L, 23L, 44L, 61L, 27L, 47L, 18L, 23L, 34L,
77L, 71L, 19L, 64L, 61L, 22L, 55L, 45L, 29L, 21L, 64L, 43L, 20L,
32L, 55L, 68L, 21L, 81L, 43L, 63L, 72L, 38L, 20L, 66L, 39L, 64L,
20L, 73L, 21L, 53L, 75L, 69L, 82L), class = c("variable", "integer"
), varname = "Age"), web_browser = structure(c(1L, 1L, 4L, 1L,
3L, 3L, 2L, 1L, 4L, 1L, 1L, 1L, 3L, 4L, 1L, 2L, 1L, 3L, 3L, 2L,
1L, 1L, 1L, 3L, 4L, 3L, 4L, 4L, 1L, 2L, 1L, 1L, 3L, 1L, 1L, 2L,
1L, 2L, 3L, 4L, 2L, 3L, 1L, 1L, 1L, 1L, 3L, 3L, 4L, 1L), .Label = c("Chrome",
"Internet Explorer", "Firefox", "Netscape"), class = c("variable",
"factor"), varname = "Browser"), Pswd = c("monkey", "iloveyou",
"dragon", "jbI2pnK$xi", "password", "computer", "!qessw", "tUNh&SSm6!",
"sunshine", "wYrUeWV", "superman", "samsung", "utoXGe6$", "master",
"wjZC&OvXX", "0R1cNTm9sGir", "Fbuu2bs89?", "pokemon", "secret",
"x&W1TjO59", "buster", "purple", "shine", "flower", "marina",
"Tg%OQT$0", "SbDUV&nOX", "peanut", "angel", "?1LOEc4Zfk", "computer",
"spiderman", "nothing", "$M6LgmQgv$", "orange", "knight", "american",
"outback", "TfuRpt3PiZ", "air", "surf", "lEi2a$$eyz", "date",
"V$683rx$p", "newcastle", "estate", "foxy", "ginger", "coffee",
"legs"), pswd_length = c(6L, 8L, 6L, 10L, 8L, 8L, 6L, 10L, 8L,
7L, 8L, 7L, 8L, 6L, 9L, 12L, 10L, 7L, 6L, 9L, 6L, 6L, 5L, 6L,
6L, 8L, 9L, 6L, 5L, 10L, 8L, 9L, 7L, 10L, 6L, 6L, 8L, 7L, 10L,
3L, 4L, 10L, 4L, 9L, 9L, 6L, 4L, 6L, 6L, 4L), last.num = c(NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA, NA, 9, NA, NA, NA, NA, NA, 0, NA, NA, NA, NA, NA, NA, NA,
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,
NA)), row.names = c(NA, -50L), class = "data.frame")
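
Two things in the data above may be worth checking, sketched here under assumptions rather than as a confirmed diagnosis. First, `makeCounts` refers to `Final_DF$pswd` (lower-case p), while the column is named `Pswd`; because `$` on a data frame matches names partially, `pswd` may silently resolve to `pswd_length`. Second, `grepl()` does pattern (substring) matching, so a term like "123" would also count longer passwords that merely contain it. The sketch below swaps in exact, case-insensitive matching with `%in%`; `demo_df`, `common`, `matched` and `pct` are hypothetical names on a reduced data set.

```r
# Hypothetical, reduced version of Final_DF with the two similarly named columns
demo_df <- data.frame(
  Pswd        = c("monkey", "iloveyou", "jbI2pnK$xi", "sunshine", "computer"),
  pswd_length = c(6L, 8L, 10L, 8L, 8L),
  stringsAsFactors = FALSE
)

# Hypothetical short list standing in for worst.pass
common <- c("password", "monkey", "iloveyou", "sunshine", "123")

# `$` does partial name matching: "pswd" resolves to pswd_length, not Pswd
partial_hit <- identical(demo_df$pswd, demo_df$pswd_length)

# Exact, case-insensitive matching of whole passwords
matched <- demo_df$Pswd[tolower(demo_df$Pswd) %in% tolower(common)]

# Share of passwords that appear in the common list
pct <- 100 * length(matched) / length(demo_df$Pswd)

print(partial_hit)
print(matched)
print(pct)
```

Using `demo_df[["Pswd"]]` instead of `$` avoids the partial-matching surprise, since `[[` matches names exactly by default.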