My goal is to get the number of common opponents for each car in a data set in a quick manner. For example, if the Datsun raced the Mazda RX4 and the Merc 230 and the Mazda RX4 also raced the Merc 230, we would return '1' for the number of common opponents. Provided is a sample using the mtcars data and I run my function on that. For such a small data set it takes around .31 seconds, but for larger data sets it takes quite a while.
#setting up sample dataset#
set.seed(44)
data(mtcars)
mtcars$car<-row.names(mtcars)
mtcars<-as.data.table(mtcars)
for(i in 1:nrow(mtcars)){
mtcars[i,count:=sample(1:5,1)]
}
expanded <- data.table(car = rep(mtcars$car, mtcars$count),
opponent = sample(mtcars$car, mtcars$count),
wins=sample(1:mtcars$count,mtcars$count))
head(expanded)
car opponent wins
1: Mazda RX4 Chrysler Imperial 1
2: Mazda RX4 Merc 280 2
3: Mazda RX4 Toyota Corolla 4
4: Mazda RX4 Merc 230 3
5: Mazda RX4 Wag Chrysler Imperial 1
6: Mazda RX4 Wag Merc 280 2
#this is the function i use now which takes a while#
commonCars<-function(carA,carB){
tA<-unique(expanded[car==carA,opponent]) #getting unique opponents of first car
tB<-unique(expanded[car==carB,opponent]) #getting unique opponents of first car's opponent
commonTeams<-tB[tB %in% tA] #getting their common opponents
return(nrow(expanded[car==carA & opponent %in% commonTeams,])) #returning the number of commons
}
ptm<-proc.time()
for(i in unique(expanded[,car])) { #looping through each individual car
for(j in unique(expanded[car==i,opponent])){ #getting the cars they raced#
expanded[car==i & opponent==j,common:=commonCars(i,j)]
}
}
proc.time()-ptm
user system elapsed
0.29 0.00 0.30
head(expanded)
car opponent wins common
1: Mazda RX4 Maserati Bora 4 3
2: Mazda RX4 Hornet 4 Drive 2 3
3: Mazda RX4 Datsun 710 3 3
4: Mazda RX4 Chrysler Imperial 1 1
5: Mazda RX4 Wag Maserati Bora 4 1
6: Mazda RX4 Wag Hornet 4 Drive 2 1