I have an Excel workbook with two tabs. Each tab has a ZIP code column, a latitude column, a longitude column, and a ZIP code type column. I am trying to find the closest real (standard) ZIP code to each PO box or university ZIP code. I want to use R to compute the distance from every PO box/university ZIP code to every real ZIP code, then return each PO box/university ZIP code together with its closest real ZIP code (the replacement ZIP). I already tried to have Excel calculate the values, but the dataset is too big: 29,557 real ZIP codes by 10,019 PO box/university ZIP codes.
So far, I have converted the Excel tabs to CSV files and read them into R as matrices and data frames, but I am at a dead end.
Any ideas on how to accomplish this? I can elaborate on my problem if needed. Any help is appreciated, thank you.
Edit:
`>real_zip <-read.csv(file="Real_ZIP_Codes.csv",header=TRUE,row.names=1,sep=",")
>po_zip <- read.csv(file="PO_ZIP_Codes.csv",header=TRUE,row.names=1,sep=",")
>#Defining some variables. Unneeded ones are commented out
>#real_zip_dataframe <- data.frame(abs(real_zip))
>#po_zip_dataframe <- data.frame(po_zip)
>real_zip_matrix <- as.matrix(real_zip)
>po_zip_matrix <- as.matrix(po_zip)
>real_zip_list <- as.list(real_zip)
>po_zip_list <- as.list(po_zip)
>real_zip_matrix_lat <- as.matrix(abs(real_zip$LAT))
>real_zip_matrix_long <- as.matrix(abs(real_zip$LONG))
>po_zip_matrix_lat <- as.matrix(abs(po_zip$LAT))
>po_zip_matrix_long <- as.matrix(abs(po_zip$LONG))
>LatCalc <-mapply(diff,real_zip_matrix_lat,po_zip_matrix_lat,SIMPLIFY = FALSE)
>LongCalc <-mapply(diff,real_zip_matrix_long,po_zip_matrix_long,SIMPLIFY = FALSE)
>test<-apply(LatCalc,1:2,((real_zip_matrix_lat)-(po_zip_matrix_lat)))
>diff_lat<-real_zip_matrix_lat-po_zip_matrix_lat
>diff_long<-diff(po_zip_matrix_long,real_zip_matrix_long)
`
This is my script as a whole so far.
Some more info:
`> nrow(po_zip)
[1] 10019
> nrow(real_zip)
[1] 29557`
When I run:
`> diff_long<-diff(real_zip_matrix_long,po_zip_matrix_long)`
I get this error:
`Error in diff.default(real_zip_matrix_long, po_zip_matrix_long) :
'lag' and 'differences' must be integers >= 1`
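For context on this error: base R's `diff(x, lag, differences)` takes lagged differences within a single vector, so the second matrix is being interpreted as `lag`. If the goal is every real-minus-PO difference, `outer()` is one way to build that pairwise matrix. A minimal sketch, using short made-up latitude vectors (`real_lat` and `po_lat` are illustrative names, not objects from the script above):

```r
# Made-up latitudes standing in for real_zip$LAT and po_zip$LAT
real_lat <- c(40.75, 34.05, 41.88)
po_lat   <- c(40.71, 34.10)

# diff() works within ONE vector: it returns x[i + 1] - x[i]
within_diff <- diff(real_lat)

# outer() builds the full pairwise matrix: one row per real ZIP,
# one column per PO ZIP, each cell real_lat[i] - po_lat[j]
lat_diff <- outer(real_lat, po_lat, "-")
dim(lat_diff)
```

On the full data this matrix would have 29,557 x 10,019 cells, which is roughly 2.4 GB of doubles, so computing one PO ZIP at a time may be more practical than storing every pair.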
When I run:
`> test<-apply(LatCalc,1:2,real_zip_matrix_lat-as.vector(po_zip_matrix_lat))`
I get this error:
`Error in match.fun(FUN) :
'real_zip_matrix_lat - as.vector(po_zip_matrix_lat)' is not a function, character or symbol
In addition: Warning message:
In real_zip_matrix_lat - as.vector(po_zip_matrix_lat) :
longer object length is not a multiple of shorter object length`
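Two things seem to be going on here: `apply()` expects a function as its third argument rather than an already evaluated expression, and subtracting a length-29557 vector from a length-10019 vector recycles the shorter one instead of pairing every real ZIP with every PO ZIP. To make the intended result concrete, here is a minimal sketch of the nearest-ZIP lookup in base R. It rests on several assumptions: `haversine_km` is a hypothetical helper (great-circle distance with a 6371 km mean Earth radius), the two small data frames are made-up stand-ins with the same `LAT`/`LONG` columns and ZIP codes as row names, and the coordinates keep their sign rather than going through `abs()`.

```r
# Hypothetical helper: great-circle (haversine) distance in km.
# Inputs are decimal degrees; the arguments are vectorised.
haversine_km <- function(lat1, lon1, lat2, lon2, radius = 6371) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Made-up stand-ins for the two CSV files (ZIP codes as row names,
# as produced by read.csv(..., row.names = 1)); not real data
real_zip <- data.frame(LAT  = c(40.75, 34.05, 41.88),
                       LONG = c(-73.99, -118.25, -87.63),
                       row.names = c("10001", "90012", "60601"))
po_zip <- data.frame(LAT  = c(40.71, 34.10),
                     LONG = c(-74.01, -118.30),
                     row.names = c("10008", "90030"))

# For each PO/University ZIP, compute its distance to every real ZIP
# and keep the index of the smallest one. Only one distance vector is
# held in memory at a time, never the full pairwise matrix.
nearest_idx <- vapply(seq_len(nrow(po_zip)), function(i) {
  which.min(haversine_km(po_zip$LAT[i], po_zip$LONG[i],
                         real_zip$LAT, real_zip$LONG))
}, integer(1))

# One row per PO/University ZIP with its replacement ZIP and distance
replacement <- data.frame(
  po_zip          = rownames(po_zip),
  replacement_zip = rownames(real_zip)[nearest_idx],
  distance_km     = haversine_km(po_zip$LAT, po_zip$LONG,
                                 real_zip$LAT[nearest_idx],
                                 real_zip$LONG[nearest_idx])
)
print(replacement)
```

With the real files, the two `data.frame()` calls would be replaced by the existing `read.csv()` results. Rows with missing coordinates would need to be dropped first, since `which.min()` returns nothing when every distance is `NA`.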