I use R to access bioinformatic resources from NCBI. In recent months, my ability to do this has become less and less consistent. Today, I'm generally unable to run the following code:

readr::read_tsv("https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/Homo_sapiens.gene_info.gz")

and get the error:

Error in open.connection(3L, "rb") :
 Could not resolve host: ftp.ncbi.nlm.nih.gov

but sometimes (even occasionally today) this works. I know the server is up, because I can download this file from a browser. This is just the simplest example of many requests that don't work consistently. I also doubt that the server is overloaded, because the error comes back immediately, without any timeout message.

Is there anything I can do to improve the consistency?

R version 4.2.2 (2022-10-31)
Platform: x86_64-apple-darwin21.6.0 (64-bit)
Running under: macOS Monterey 12.6

Matrix products: default
BLAS:   /usr/local/Cellar/openblas/0.3.21/lib/libopenblasp-r0.3.21.dylib
LAPACK: /usr/local/Cellar/r/4.2.2/lib/R/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] devtools_2.4.5 usethis_2.1.6

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.9        pillar_1.8.1      compiler_4.2.2    later_1.3.0
 [5] urlchecker_1.0.1  prettyunits_1.1.1 profvis_0.3.7     remotes_2.4.2
 [9] tools_4.2.2       bit_4.0.5         digest_0.6.31     pkgbuild_1.4.0
[13] pkgload_1.3.2     tibble_3.1.8      memoise_2.0.1     lifecycle_1.0.3
[17] pkgconfig_2.0.3   rlang_1.0.6       shiny_1.7.3       cli_3.4.1
[21] curl_4.3.3        parallel_4.2.2    fastmap_1.1.0     withr_2.5.0
[25] stringr_1.5.0     fs_1.5.2          htmlwidgets_1.5.4 vctrs_0.5.1
[29] hms_1.1.2         tidyselect_1.2.0  bit64_4.0.5       glue_1.6.2
[33] R6_2.5.1          processx_3.8.0    fansi_1.0.3       vroom_1.6.0
[37] sessioninfo_1.2.2 tzdb_0.3.0        callr_3.7.3       purrr_0.3.5
[41] readr_2.1.3       magrittr_2.0.3    ps_1.7.2          promises_1.2.0.1
[45] ellipsis_0.3.2    htmltools_0.5.4   mime_0.12         xtable_1.8-4
[49] httpuv_1.6.7      utf8_1.2.2        stringi_1.7.8     miniUI_0.1.1.1
[53] cachem_1.0.6      crayon_1.5.2
Ngwenyama
  • Loads fine for me, perhaps they are rate-limiting you? – Axeman Dec 15 '22 at 20:32
  • Axeman, I thought about that but have made very few (<30) requests within the last 24 hours. Also, is the error message consistent with that? – Ngwenyama Dec 15 '22 at 20:38
  • Maybe it's only a matter of reconnecting your WIFI: https://stackoverflow.com/a/34777941/20513099 – I_O Dec 15 '22 at 21:05
  • The error message "Could not resolve host" sounds like a DNS problem. This is unlikely to be specific to any R code. When the error occurs, try the URL in your web browser or with `curl` to see if that works as well. There's no reason to assume anything other than internet connectivity problems for this particular error unless you can provide other evidence. – MrFlick Dec 15 '22 at 21:23

1 Answer

Thank you for all your help! Resetting my router did not help, but I tried using curl directly and hit the same error. I could get to this file via the Opera browser, but not via Chrome or Safari. For those, I got this error:

This site can’t be reached
Check if there is a typo in ftp.ncbi.nlm.nih.gov.
DNS_PROBE_FINISHED_NXDOMAIN

I eventually solved it by flushing my DNS cache:

sudo dscacheutil -flushcache; sudo killall -HUP mDNSResponder

This seemed to help only briefly, so I manually changed the DNS servers as described in step 3 here, and this seems to be better.
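To tell a DNS failure apart from a problem in the R code itself, it may help to test name resolution directly from R. The following is a minimal sketch, assuming the curl package is installed (it appears in the session info above). The helper name `resolves` is hypothetical, and `curl::nslookup()` goes through the system resolver, so its result may not match what every browser sees.

```r
# Hypothetical helper: TRUE if the system resolver can find the host.
# With error = FALSE, curl::nslookup() returns NULL instead of raising
# an error when the name cannot be resolved.
resolves <- function(host) {
  !is.null(curl::nslookup(host, error = FALSE))
}

host <- "ftp.ncbi.nlm.nih.gov"

if (resolves(host)) {
  message(host, " resolves; the problem is probably not DNS")
} else {
  message("DNS lookup failed for ", host)
}
```

Running this a few times before and after changing the DNS servers gives a rough idea of whether resolution has become more reliable.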

Since this was an intermittent problem (lately more present than not), it remains to be seen if this fully fixed it, but so far so good.
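Because the failure is intermittent, a retry wrapper can make scripts more tolerant while the underlying DNS issue is being sorted out. This is only a sketch using base R: `with_retry` and its `tries` and `wait` parameters are hypothetical names, and it works around the symptom rather than fixing the resolver.

```r
# Hypothetical helper: call f() up to `tries` times, sleeping `wait`
# seconds between failed attempts, and re-raise the last error if
# every attempt fails.
with_retry <- function(f, tries = 5, wait = 2) {
  stopifnot(tries >= 1)
  for (i in seq_len(tries)) {
    result <- tryCatch(f(), error = function(e) e)
    if (!inherits(result, "error")) {
      return(result)
    }
    message("Attempt ", i, " failed: ", conditionMessage(result))
    if (i < tries) Sys.sleep(wait)
  }
  stop(result)
}

# Example usage with the download from the question (needs network access):
# url <- "https://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/Homo_sapiens.gene_info.gz"
# gene_info <- with_retry(function() readr::read_tsv(url))
```

A short fixed wait is used here for simplicity; if the lookups fail for minutes at a time, a longer or increasing wait would probably be more useful.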

Ngwenyama