I am trying to loop through a large address data set (300,000+ lines), grouping by a factor shared across observations, ID2. The data set contains addresses from two different sources, and I am trying to find matches between them. To determine a match, I want to go through each ID2 level and search for a line from each of the two sources (the building and property data sets). Here is a picture of my desired output: [picture of desired output]

Here is sample code showing what I have tried:
PROPERTYNAME=c("Vista 1","Vista 1","Vista 1","Chesnut Street","Apple Street","Apple Street")
CITY=c("Pittsburgh","Pittsburgh","Pittsburgh","Boston","New York","New York")
STATE= c("PA","PA","PA","MA","NY","NY")
ID2=c(1,1,1,2,3,3)
IsBuild=c(1,0,0,0,1,1)
IsProp=c(0,1,1,1,0,0)
df=data.frame(PROPERTYNAME,CITY,STATE,ID2,IsBuild,IsProp)
for(i in levels(as.factor(df$ID2))){
  for(row in 1:nrow(df)){
    df$Any_Build[row][i]<-ifelse(as.numeric(df$IsBuild[row][i])==1)
    df$Any_Prop[row][i]<-ifelse(as.numeric(df$IsProp[row][i])==1)
  }
}
I've tried nested for loops but have had no luck, and I am struggling with the apply family of functions in R. I would appreciate any help. Thank you!
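
For reference, here is a minimal sketch of one loop-free way this might be done in base R. It assumes that Any_Build and Any_Prop are meant to flag, on every row, whether that row's ID2 group contains at least one building line and at least one property line; the picture of the desired output is not reproduced here, so that reading is a guess. The Matched column is a hypothetical name added only for illustration.

```r
# Sample data from the question
df <- data.frame(
  PROPERTYNAME = c("Vista 1", "Vista 1", "Vista 1",
                   "Chesnut Street", "Apple Street", "Apple Street"),
  CITY    = c("Pittsburgh", "Pittsburgh", "Pittsburgh",
              "Boston", "New York", "New York"),
  STATE   = c("PA", "PA", "PA", "MA", "NY", "NY"),
  ID2     = c(1, 1, 1, 2, 3, 3),
  IsBuild = c(1, 0, 0, 0, 1, 1),
  IsProp  = c(0, 1, 1, 1, 0, 0)
)

# ave() applies FUN within each ID2 group and returns a vector
# aligned with the original rows, so no explicit loop is needed.
# Because the flags are 0/1, max() is 1 when any row in the group is 1.
df$Any_Build <- ave(df$IsBuild, df$ID2, FUN = max)
df$Any_Prop  <- ave(df$IsProp,  df$ID2, FUN = max)

# Hypothetical extra column (assumption): 1 when the ID2 group
# has a line from both the building and the property source.
df$Matched <- as.integer(df$Any_Build == 1 & df$Any_Prop == 1)

print(df)
```

With the sample data, only ID2 = 1 has lines from both sources, so only its three rows would be flagged in Matched. On 300,000+ lines a grouped, vectorized call like this should be considerably faster than nested for loops, since each group is summarized once rather than once per row.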