I'm trying to write a loop to perform the following actions on a data frame:
For every name in the 'Name' column, check whether a rough match (found with agrep()) exists in the 'Referral' column. If one does, replace every cell in the 'Referral' column that roughly matches the name with 'referral'.
Here is my code so far:
for (i in 1:1000) {
  for (q in 1:length(agrep(c$Name[i], c$Referal))) {
    if (length(agrep(c$Name[i], c$Referal)) > 0) {
      c$Referal[agrep(c$Name[i], c$Referal)[q]] <- 'referral'
    }
  }
}
However, this code (after taking 20 minutes to run) replaces ALL cells in the 'Referral' column with 'referral'. I'm wondering whether the 'i' in the first line stays the same throughout the whole loop? It's obviously a clunky chunk of code, but I can't think of why it would do this...
An example would be:
Name <- c('michael jordan', 'carrot', 'ginger')
Referral <- c('internet', 'facebook', 'mike jordan')
df <- data.frame(Name, Referral)
After running the code, ideally df$Referral[3] == 'referral' would be TRUE.
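To pin down the result I'm after, here is a minimal sketch on the example data. It is not the loop from my code: it swaps agrep() for adist() (whole-string edit distance, from the utils package). The 30% cutoff and the names d, limit and is_match are assumptions made up for illustration, so treat it as a description of the target behaviour rather than a finished solution.

```r
# Example data; stringsAsFactors = FALSE keeps both columns as character
Name <- c('michael jordan', 'carrot', 'ginger')
Referral <- c('internet', 'facebook', 'mike jordan')
df <- data.frame(Name, Referral, stringsAsFactors = FALSE)

# d[i, j] is the edit distance between Name[i] and Referral[j]
d <- adist(df$Name, df$Referral)

# Assumed cutoff: allow edits up to 30% of each name's length.
# limit has one entry per row of d, so d <= limit compares row i with limit[i]
limit <- 0.3 * nchar(df$Name)

# TRUE for every Referral entry that is close enough to at least one Name
is_match <- colSums(d <= limit) > 0
df$Referral[is_match] <- 'referral'
```

All distances are computed against the original 'Referral' values before anything is replaced, so an earlier replacement cannot affect a later match.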