I'm trying to obtain GPS coordinates for each species in a data frame of species names using a package-specific function (red::records), which pulls coordinates from a database of species distributions.

My for loop is shown below, where iterations is nrow(names) and the function records returns lat/long coordinates:

for(i in 1:iterations){
  gbif[i,1] <- names[i,] ## grab names

  try(temp1 <- records(names[i,]))
  try(temp1$scientificName <- names[i,])

  try(temp2 <- merge(gbif, temp1, by.x="V1", by.y="scientificName"))
  datalist[[i]] <- temp2
}

After executing this loop, I am able to obtain data for species; however, it is not appropriately merged with the namelist. For example, calling records("Agyneta flibuscrocus") correctly returns 5 unique lat/long coordinates while calling records("Agyneta mongolica") produces an error with 0 records found (this is valid for each species when checked online).

After this loop, I bind all of the obtained records into a single data frame using:

dat = do.call(rbind, datalist) ## merge all occurrence data from GBIF into one data frame
dat <- unique(dat)

When I go to verify this data frame, I get the following sample data:

Agyneta flibuscrocus        -115.58400        49.72
Agyneta flibuscrocus        -117.58400        51.299
...
Agyneta mongolica           -115.58400        49.72
Agyneta mongolica           -117.58400        51.299

These erroneous replications are also repeated throughout the rest of the 200 names. As a side note, I wrapped everything in try statements because the code will not execute if it runs into a record that produces 0 results from the database.

I feel like I am overlooking something very obvious here?

Reproducible Data & Code:

install.packages("red")
library(red)

names = data.frame(V1 = c("Acantheis variatus", "Agyneta flibuscrocus", "Agyneta mongolica",
                          "Alpaida alticeps", "Alpaide venilliae", "Amaurobius transversus",
                          "Apochinomma nitidum"),
                   stringsAsFactors = FALSE) ## one species name per row

iterations = nrow(names)
datalist = list()

temp1 <- data.frame() ## temporary data frame for joining occurrence data from GBIF

for(i in 1:iterations){
  gbif <- names[i,] ## grab name

  try(temp1 <- records(gbif))
  try(temp1$V1 <- gbif)

  datalist[[i]] <- temp1

}

dat = do.call(rbind, datalist)

1 Answer

I adapted some parts of your script and now it seems to work properly (with your example data the function only successfully retrieves data for one species, the one that got replicated in your code, but that's not a coding issue).

The main reason for the erroneous duplications is that the variable temp1 is reused across iterations. When try(temp1 <- records(gbif)) fails, temp1 still holds the records from the last successful iteration, so try(temp1$V1 <- gbif) succeeds and relabels those stale records with the current species name. Make sure that variables defined in one iteration of a loop don't get carried over to the next.
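To see the carry-over in isolation, here is a minimal sketch that runs without the red package or a network connection. fake_records() is a hypothetical stand-in for records() that simply errors for one species, and the coordinates are the two sample rows from your question:

```r
## fake_records() is a hypothetical stand-in for red::records():
## it raises an error for a species without data, as described in the question.
fake_records <- function(species) {
  if (species == "Agyneta mongolica") stop("0 records found")
  data.frame(long = c(-115.584, -117.584), lat = c(49.72, 51.299))
}

species <- c("Agyneta flibuscrocus", "Agyneta mongolica")
datalist <- list()

for (i in seq_along(species)) {
  ## On i = 2 the call errors, so the assignment never happens
  ## and temp1 keeps the value from i = 1.
  try(temp1 <- fake_records(species[i]), silent = TRUE)
  ## This still succeeds and relabels the stale rows with the new name.
  try(temp1$V1 <- species[i], silent = TRUE)
  datalist[[i]] <- temp1
}

dat <- do.call(rbind, datalist)
print(dat)  ## rows 3-4 repeat the coordinates of rows 1-2 under the second name
```

The second species ends up with the first species' coordinates, which is the same pattern as in your sample output.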

iterations = nrow(myNames) ## myNames: the one-column data frame of species names (names in the question)
datalist = list()

for(i in 1:iterations){
    gbif <- myNames[i,] ## grab name
    try_result <- try(records(gbif))
    if(!inherits(try_result, "try-error")){ ## inherits() also works when class() has length > 1
        temp1 <- try_result
        temp1$V1 <- gbif
        datalist[[i]] <- temp1
        rm(temp1)
    }else{
        datalist[[i]] <- NA
    }
    rm(try_result)
}

dat <- do.call(rbind, datalist[!is.na(datalist)])
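If you prefer to avoid loop variables altogether, the same idea can be sketched with tryCatch() and lapply(). This is only one possible variant, not something the red package prescribes. fetch_one() and fetch_all() are hypothetical helper names, and the lookup function is passed in as an argument so the sketch can be run offline with a mock; with the package loaded you would pass records instead. I am assuming the result of records can be converted with as.data.frame().

```r
## fetch_one() and fetch_all() are hypothetical helpers, not part of red.
## `fetch` is the lookup function; with the package loaded, pass red::records.
fetch_one <- function(species, fetch) {
  tryCatch({
    out <- as.data.frame(fetch(species))
    out$V1 <- species          # label rows with the species that was queried
    out
  }, error = function(e) NULL) # NULL for species without records
}

fetch_all <- function(species, fetch) {
  ## rbind() skips NULL entries, so failed lookups simply drop out.
  do.call(rbind, lapply(species, fetch_one, fetch = fetch))
}

## Offline demonstration with a mock lookup (coordinates from the question's sample).
mock_fetch <- function(species) {
  if (species != "Agyneta flibuscrocus") stop("0 records found")
  data.frame(long = c(-115.584, -117.584), lat = c(49.72, 51.299))
}

dat <- fetch_all(c("Agyneta flibuscrocus", "Agyneta mongolica"), mock_fetch)
print(dat)
```

Because each call builds its own local out, nothing from a previous species can leak into the next one.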
tobiasegli_te