I have two dataset A and B that contains values and coordonates
A:
╔═══╦════════════╦═════════════╦═════════════╗
║ ║ name ║ x ║ y ║
╠═══╬════════════╬═════════════╬═════════════╣
║ 1 ║ city ║ 50.3 ║ 4.2 ║
║ 2 ║ farm ║ 14.8 ║ 8.6 ║
║ 3 ║ lake ║ 18.7 ║ 9.8 ║
║ 3 ║ Mountain ║ 44 ║ 9.8 ║
╚═══╩════════════╩═════════════╩═════════════╝
B:
╔═══╦════════════╦═════════════╦═════════════╗
║ ║ Temp ║ x ║ y ║
╠═══╬════════════╬═════════════╬═════════════╣
║ 1 ║ 18 ║ 50.7 ║ 6.2 ║
║ 2 ║ 17,3 ║ 20 ║ 11 ║
║ 3 ║ 15 ║ 15 ║ 9 ║
╚═══╩════════════╩═════════════╩═════════════╝
I would like this, C:
╔═══╦════════════╦═════════════╦═════════════╗
║ ║ Name ║ Temp ║ Distance ║
╠═══╬════════════╬═════════════╬═════════════╣
║ 1 ║ city ║ 18 ║ 2.039608 ║
║ 2 ║ farm ║ 15 ║ 0.447214 ║
║ 3 ║ lake ║ 17.3 ║ 1.769181 ║
║ 4 ║ Mountain ║ 18 ║ 7.605919 ║
╚═══╩════════════╩═════════════╩═════════════╝
I tried this :
A<- read.table(header = TRUE, text = "
Name x y
city 50.3 4.2
farm 14.8 8.6
lake 18.7 9.8
mountain 44 9.8")
B<- read.table(header = TRUE, text = "
Temp x y
18 50.7 6.2
17.3 20 11
15 15 9")
C<- data.frame(Name=character(),
Temp=numeric(),
Distance=numeric())
for(i in 1:nrow(A)) {
x1<- A[i,]$x
y1<- A[i,]$y
min = 100
index = 0
for(j in 1:nrow(B)) {
x2<- B[j,]$x
y2<- B[j,]$y
tmp = sqrt((((x2-x1)^2)+((y2-y1)^2)))
if (tmp < min) {
index = j
min = tmp
}
}
df <- list(Name=A[i,]$Name, Temp=B[index,]$Temp, Distance=min)
C <- rbind(C, df)
}
print(C)
But my first dataset is about 1,500,000 rows and my second one is about 5000 and this algorythm is very very slow. Is there a better way to do it ?