My raw data has 3 columns; one of them is called First_Name. The First_Name column contains actual first names such as Prabhat and Tony, but also a lot of invalid strings, i.e., strings that do not represent actual first names, such as email addresses like Prabhat@gmail.com or strings with numbers and special characters like aaa261. What I want to do is keep only the valid First_Name strings.
Here are the steps I am taking:
1st step:
c <- read.csv("Test_Data.csv", TRUE, ",")
2nd step:
First_Name <- pull(c, firstname) # pulling First_Name column from Raw Data.
3rd step:
df[] <- lapply(df[], as.character)
4th step:
df$new <- ifelse(grepl("[^A-z]", df$First_Name), "NA", df$First_Name)
But it's not working; step 4 fails with this error:
"Error in $<-.data.frame(*tmp*, new, value = logical(0)) : replacement has 0 rows, data has 50000"
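
For reference, the "value = logical(0)" part of the message is what ifelse() returns when its test has length zero. That would happen if df$First_Name is NULL, i.e. if df has no column with exactly that name. Steps 1 and 2 store the data in c and First_Name, not in df, so this looks likely but is an assumption. Below is a minimal sketch of what the four steps seem to be aiming for. The sample data frame and the names raw, is_valid and valid_rows are made up for illustration, and the real column name in Test_Data.csv may differ.

```r
# Made-up sample standing in for Test_Data.csv (assumption).
raw <- data.frame(
  First_Name = c("Prabhat", "Tony", "Prabhat@gmail.com", "aaa261"),
  stringsAsFactors = FALSE
)

# TRUE when the string consists only of ASCII letters.
# In ASCII order the range A-z also covers [ \ ] ^ _ and the backtick,
# which sit between "Z" and "a", so an explicit class is used instead.
is_valid <- grepl("^[A-Za-z]+$", raw$First_Name)

# Use a real missing value rather than the string "NA".
raw$new <- ifelse(is_valid, raw$First_Name, NA_character_)

# Keep only the rows with a valid first name.
valid_rows <- raw[is_valid, , drop = FALSE]
print(valid_rows)
```

Checking names(raw) (or names(df) in the original code) before step 4 shows whether the column is really called First_Name, firstname, or something else.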