I'm working with occurrence records and want to remove records that are missing coordinates, with the line:

records0 <- records[records$decimalLatitude == 0 | records$decimalLongitude == 0,]

(decimalLatitude and decimalLongitude are two of my columns)

But when I look at records0, it shows the right number of rows for which the coordinates are missing, but all of the other columns are empty as well (shows NA when they should still contain the rest of my data). Why could that be? (The object records containing my dataset looks as it should.)

records <- fread("Occurrences-Example.csv")

dput(records[1:4, ])
structure(list(
  rightsHolder = c("", "", "Naturalis Biodiversity Center", "Naturalis Biodiversity Center"),
  type = c(NA, NA, NA, NA),
  decimalLatitude = c(-4.565474, NA, NA, -0.832667),
  decimalLongitude = c(12.480469, NA, NA, 13.9735),
  scientificName = c("Gigasiphon gossweileri (Baker f.) Torre & Hillc.",
    "Gigasiphon gossweileri (Baker f.) Torre & Hillc.",
    "Bauhinia humblotiana Baill.",
    "Gigasiphon gossweileri (Baker f.) Torre & Hillc.")),
  row.names = c(NA, -4L),
  class = c("data.table", "data.frame"),
  .internal.selfref = <pointer: 0x7f8dfc812ee0>)

records0 <- records[records$decimalLatitude == 0 | records$decimalLongitude == 0,]
records0
Empty data.table (0 rows) of 5 cols: rightsHolder,type,decimalLatitude,decimalLongitude,scientificName
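
A minimal sketch of what seems to be going on, assuming missing coordinates are stored as `NA` (as in the `dput` sample) rather than as 0. Comparing `NA` with `==` gives `NA`, not `TRUE`. A plain data.frame turns each `NA` in the row index into an all-`NA` row, while a data.table treats it as `FALSE` and returns no rows. The vectors and the `name` column below are made up for illustration.

```r
library(data.table)

# Hypothetical stand-in for the coordinate columns of the dput() sample
lat <- c(-4.565474, NA, NA, -0.832667)
lon <- c(12.480469, NA, NA, 13.9735)

# NA == 0 is NA, so rows with missing coordinates give NA, not TRUE
cond <- lat == 0 | lon == 0
print(cond)

# 'name' is an illustrative column, not part of the original data
df <- data.frame(lat = lat, lon = lon, name = c("a", "b", "c", "d"))
dt <- as.data.table(df)

# data.frame: every NA in the logical row index yields a row filled with NA
df_sub <- df[cond, ]

# data.table: NA in a logical row index is treated as FALSE, so nothing matches
dt_sub <- dt[cond, ]

print(nrow(df_sub))
print(nrow(dt_sub))
```

This would explain both symptoms described in the question: rows of `NA` when `records` is a data.frame, and an empty result when it is a data.table read with `fread`.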

prosoitos
Charlotte
  • Can you post a [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) of your data? – stlba Sep 30 '20 at 18:00
  • If a record is missing coordinates, are the longitude and latitude zero (0) or `NA`? Your code is checking if equal to zero, but your example data has `NA` I assume for missing elements? Perhaps you might want: `records0 <- records[is.na(records$decimalLatitude) | is.na(records$decimalLongitude),]`? That would use `is.na` to check for `NA` in your data. – Ben Sep 30 '20 at 21:04
  • If a record is missing coordinates, they are `NA` in this case, but if I replace `0` with `NA` in `records0 <- records[records$decimalLatitude == 0 | records$decimalLongitude == 0,]`, it still seems to read the same rows as having missing coordinates (it does the same thing). `records0 <- records[is.na(records$decimalLatitude) | is.na(records$decimalLongitude),]` also gives the same result (records is empty). – Charlotte Sep 30 '20 at 21:10
  • Hmmm...I used your example data with the line of code using `is.na` and got two rows of data back... – Ben Sep 30 '20 at 22:25
  • Thanks. It works on the example but not on my full dataset – Charlotte Oct 01 '20 at 01:09
  • So it works for `records[1:4,]` but not `records`? Are there other differences to consider? – Ben Oct 01 '20 at 02:20
  • Actually, it's working for records, but when I try to remove the missing occurrences from records, records becomes empty. `recordsNA <- records[is.na(records$decimalLatitude) | is.na(records$decimalLongitude),]` followed by `records <- records[!records$ID %in% recordsNA$ID,]`, and then `records` prints `Empty data.table (0 rows) of 241 cols: gbifID,abstract,accessRights,accrualMethod,accrualPeriodicity,accrualPolicy...` – Charlotte Oct 01 '20 at 14:56
  • This isn't reproducible for me as the example data does not have an `ID` column. Also, your statement `records <- records[...]` overwrites `records`, so be aware that may also make debugging more confusing. If you are willing, please edit your question with a reproducible example having `ID`. – Ben Oct 01 '20 at 21:34
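
The suggestions in the comments can be sketched as follows, on made-up data. The `gbifID` values and the object names `missing_coords`, `keep`, and `recordsClean` are assumptions for illustration. The last part shows one possible reason the `%in%` step empties the table: if the full dataset has no column called `ID` (the column listing in the comment starts with `gbifID`), then `records$ID` is `NULL` and the condition has length zero. Whether that is the actual cause cannot be confirmed from the posted data.

```r
library(data.table)

# Hypothetical stand-in data; gbifID mirrors the first column named in the comments
records <- data.table(
  gbifID           = 1:4,
  decimalLatitude  = c(-4.565474, NA, NA, -0.832667),
  decimalLongitude = c(12.480469, NA, NA, 13.9735)
)

# TRUE where at least one coordinate is missing
missing_coords <- is.na(records$decimalLatitude) | is.na(records$decimalLongitude)
keep <- !missing_coords

# Write to new objects so the original 'records' is not overwritten
recordsNA    <- records[missing_coords]
recordsClean <- records[keep]

# Assumed pitfall: there is no ID column here, so records$ID is NULL
# and the whole condition collapses to a zero-length logical vector
bad_cond <- !records$ID %in% recordsNA$ID
print(length(bad_cond))

# A zero-length logical row index selects no rows
print(nrow(records[bad_cond]))
```

Filtering directly on `keep` avoids the round trip through an identifier column altogether, and keeping `records` untouched makes it easier to compare before and after.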

0 Answers