2

I'm working with R 3.6.3 and PostgreSQL 12. I have several tables in a database. One of them has 8,000,000+ rows and I'm trying to get it to R. I'm using the RPostgreSQL package, which relies on DBI.

> library(RPostgreSQL)
Loading required package: DBI
Warning message:
package ‘RPostgreSQL’ was built under R version 3.6.2 
> drv <- dbDriver("PostgreSQL")
> con <- dbConnect(drv, dbname="ebird_work",host="localhost", port=5432, user="postgres")
> dbListTables(con)
[1] "spatial_ref_sys" "grid_sampl"      "gridpt_of_buf"   "or_counties"    
[5] "e_grid"          "ebird_sel_spt"   "ebird_or_cov"    "ebird_dct"      
[9] "ebird_sim"       "fin_pt_no"       "eb_new"          "samp_pt"        
[13] "or_buffered"  

I'm trying to pull in ebird_sel_spt. But

> dbExistsTable(con, "ebird_sel_spt")
[1] FALSE

So queries like the following don't work.

> dbSendQuery(con, "SELECT * FROM ebird_sel_spt LIMIT 1")
Error in postgresqlExecStatement(conn, statement, ...) : 
RS-DBI driver: (could not Retrieve the result : ERROR:  relation "ebird_sel_spt" does not exist
LINE 1: SELECT * FROM ebird_sel_spt LIMIT 1
                  ^
)

From PostgreSQL:

SELECT
   *
FROM
   pg_catalog.pg_tables
WHERE
   schemaname != 'pg_catalog'
AND schemaname != 'information_schema';

"p_loc" "ebird_sel_spt" "postgres"      false   false   false   false

I'm new to PostgreSQL and using it with R. What am I doing wrong?

EDIT - after the answer from @Parfait: I tried using "p_loc" with the . connector and I still get an error. I updated R and RPostgreSQL, since there was a warning at first that it was built under an older version. The error remains. What else should I be looking for?

> library(RPostgreSQL)
Loading required package: DBI
> drv <- dbDriver("PostgreSQL")
> con <- dbConnect(drv, dbname="ebird_work",host="localhost", port=5432, user="postgres")
> dbListTables(con)
[1] "spatial_ref_sys" "grid_sampl"      "gridpt_of_buf"   "or_counties"    
[5] "e_grid"          "ebird_sel_spt"   "ebird_or_cov"    "ebird_dct"      
[9] "ebird_sim"       "fin_pt_no"       "eb_new"          "samp_pt"        
[13] "or_buffered"    
> dbExistsTable(con, "ebird_sel_spt")
[1] FALSE
> dbExistsTable(con, "p_loc.ebird_sel_spt")
[1] FALSE
> t <- Id(schema = "p_loc", table = "ebird_sel_spt")
> dbExistsTable(con, t)
[1] FALSE
> con
<PostgreSQLConnection>

> sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 17763)

Matrix products: default

locale:
[1] LC_COLLATE=English_United States.1252 
[2] LC_CTYPE=English_United States.1252   
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.1252    

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] RPostgreSQL_0.6-2 DBI_1.1.0        

loaded via a namespace (and not attached):
[1] compiler_3.6.3

EDIT-in response to request from @Parfait:

#tried a different table to see if the problem was unique to my
#to the table I was interested in. it's not.
> dbExistsTable(con, "p_loc.or_counties")
[1] FALSE

> dbSendQuery(con, "SELECT * FROM p_loc.ebird_sel_spt LIMIT 1")
<PostgreSQLResult>
Warning message:
In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type geometry (id:29533) in column 18)

I made the pgtable query again:

schemename tablename         tableowner tablespace hasindexes hasrules hastriggers rowsecurity
"public"   "spatial_ref_sys" "postgres" [NULL]     true        false      false       false
"p_loc"    "or_counties"     "postgres" [NULL]     true        false  false       false
"p_loc"    "e_grid"          "postgres" [NULL]     true        false  false       false
"p_loc"    "ebird_sel_spt"   "postgres" [NULL]     false       false  false       false

2nd EDIT-This comes after the edit that @Parfait made to his answer. The answer seems to work and I will mark it so momentarily.

> dbExistsTable(con, c("p_loc", "ebird_sel_spt"))
[1] TRUE
> df <- dbGetQuery(con, "SELECT * FROM p_loc.ebird_sel_spt LIMIT 1")
Warning message:
In postgresqlExecStatement(conn, statement, ...) :
RS-DBI driver warning: (unrecognized PostgreSQL field type geometry (id:29533) in column 18)
> df
GLOBAL_UNIQUE_IDENTIFIER                         COMMON_NAME
URN:CornellLabOfOrnithology:EBIRD:OBS543103161 Band-tailed Pigeon
SCIENTIFIC_NAME         OBSERVATION_COUNT STATE_PROVINCE     COUNTY LOCALITY
Patagioenas fasciata                  1 Oregon             Washington   CRNA 5
LONGITUDE LATITUDE OBSERVATION_DATE TIME_OBSERVATIONS_STARTED
-123.0843  45.4498       2012-06-03                  06:22:00
FIRST_NAME                              LAST_NAME PROTOCOL_TYPE
Metro Parks and Nature Avian Monitoring      Weil    Stationary
ALL_SPECIES_REPORTED SAMPLING_EVENT_IDENTIFIER            OBS_TIME  point
                   1                 S40033350 2012-06-03 06:22:00 120747
                                             geom   pt_lon  pt_lat
01010000201E690000F851624162191E41A6F66655B8325341 493144.6 5032673
John Polo
  • 547
  • 1
  • 8
  • 25
  • see the accepted answer to : https://stackoverflow.com/questions/36753568/postgresql-tables-exists-but-getting-relation-does-not-exist-when-querying – Brian Mar 19 '20 at 14:49
  • You are repeating your original attempted query without qualifying schema name, `p_loc.` in `dbSendQuery`. – Parfait Mar 20 '20 at 15:53
  • @Parfait, sorry about making the wrong query. I edited the question to reflect the right query.. There is an error now I don't understand though. – John Polo Mar 21 '20 at 03:57
  • Avoid using `SELECT *` but explicitly define columns `SELECT Col1, Col2, Col3, ...`. Error appears to say column 18 is the issue. But that may be another question. You might be using PostGIS? Also, see if below works now so we can close this out. Thanks. – Parfait Mar 21 '20 at 04:03
  • @Parfait, yes, I'm using PostGIS too. Thanks for all of your efforts! – John Polo Mar 21 '20 at 04:07

1 Answers1

1

As your pg_catalog.pg_tables query shows which you omitted the important column headers, that table exists in a non-default schema (i.e., non-public schema), specifically in p_loc:

| schemaname | tablename     | tableowner | tablespace | hasindexes | hasrule | hastriggers |
|------------|---------------|------------|------------|------------|---------|-------------|
| p_loc      | ebird_sel_spt | postgres   | NULL       | false      | false   | false       |

Therefore consider identifying schema in needed method calls or with period-qualifier in SQL following pattern: database.schema.table.column:

# SCHEMA NAMESPACE
dbExistsTable(con, c("p_loc", "ebird_sel_spt"))
df <- dbReadTable(con, c("p_loc", "ebird_sel_spt"))

# SQL PERIOD-QUALIFIER
df <- dbGetQuery(con, "SELECT * FROM p_loc.ebird_sel_spt LIMIT 1")
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • thanks for answer. I tried it, but it doesn't work. – John Polo Mar 20 '20 at 03:31
  • What about `SELECT * FROM p_loc.ebird_sel_spt LIMIT 1`? Please post the columns of your Postgres `pg_tables` query. – Parfait Mar 20 '20 at 03:54
  • From your edit, you did not quality the schema name. Please run above query exactly with `p_loc.` on table name. Also, regardless of `dbExistsTable`, does `dbReadTable` as I post here work? – Parfait Mar 20 '20 at 15:52
  • I edit solution to avoid use of `Id()` but passing schema info directly in method calls. – Parfait Mar 20 '20 at 16:05