I have a data frame with a column of strings representing a 'Full Name'. Some values are complete, normal full names; others are not 'complete' or 'accurate', which shows up as non-letter characters in the string.
Example of the data frame:
Full name
----------
Mikki Clancy
Hermsdorfer, Mark (retired)
CSP, PSECU Lan Unit (typo)
Clifton Gurlen
G�mez, Oscar Prieto
Sj�¶strand, Anders
Lisa Terry
Meloy, Wilson {old}
Gregory Stevens
Charles Gruenberg
df <- structure(list(Full_name = c("Jane Clancy",
"Hermsdorfer, Mark (retired)",
"CSP, PSECU Lan Unit (typo)",
"Clif Gurlen",
"G�mez, Oscar Prieto",
"Sj�¶strand, Anders",
"Liza Terry",
"Meloy, Will {old}",
"Garret Stevens",
"Charly Ruenberg"), Group = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j")), class = "data.frame", row.names = c(NA, -10L))
The ask is to subset the complete data frame to the rows whose name contains special characters: brackets, braces, ampersands, or non-ASCII characters (for example, from the values above: '{}', '()', '&', '�').
The desired output is the column of names that contain those characters, plus the total count of such rows, so I can calculate the percentage of the complete data frame that is 'not complete' or 'accurate'.
Not Complete Full name
----------------------
Hermsdorfer, Mark (retired)
CSP, PSECU Lan Unit (typo)
G�mez, Oscar Prieto
Sj�¶strand, Anders
Meloy, Wilson {old}
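For reference, here is a minimal sketch of one way the subset and the percentage described above could be computed in base R. It is not a definitive solution. The regular expression is an assumption about what counts as 'not complete': any of the characters `( ) { } &`, or any character outside the printable ASCII range. The names `pattern`, `flagged`, `not_complete`, `n_flagged` and `pct_flagged` are illustrative only, and the sample data is rebuilt with `\u` escapes so the block runs on its own.

```r
# Sample data mirroring the question's data frame (escapes stand in for the
# garbled characters so the script itself stays plain ASCII).
df <- data.frame(
  Full_name = c("Jane Clancy",
                "Hermsdorfer, Mark (retired)",
                "CSP, PSECU Lan Unit (typo)",
                "Clif Gurlen",
                "G\uFFFDmez, Oscar Prieto",
                "Sj\uFFFD\u00B6strand, Anders",
                "Liza Terry",
                "Meloy, Will {old}",
                "Garret Stevens",
                "Charly Ruenberg"),
  Group = letters[1:10],
  stringsAsFactors = FALSE
)

# Assumed rule: flag a name if it contains ( ) { } & or any character
# outside the printable ASCII range (space through tilde).
pattern <- "[(){}&]|[^\\x20-\\x7E]"

# TRUE for every row whose Full_name matches the pattern.
flagged <- grepl(pattern, df$Full_name, perl = TRUE)

# Rows considered 'not complete', kept as a data frame.
not_complete <- df[flagged, , drop = FALSE]

# Count of flagged rows and their share of the complete data frame.
n_flagged <- sum(flagged)
pct_flagged <- 100 * n_flagged / nrow(df)

print(not_complete$Full_name)
print(n_flagged)
print(pct_flagged)
```

Commas are deliberately left out of the pattern, since a "Last, First" name would otherwise be flagged on the comma alone. If other punctuation should also count, it can be added inside the first bracket expression.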