I am a Beginner when it comes to work with external databases in R.
A few month ago I already asked how to import a huge dataset into PostgreSqL and got the perfect answer. So I thought I try it again here.
Is there a simple way to do some plots, diagrams or boxplots for external data in R?
Here is my Code:
- First I connect to the database, do a join and get some mean values, which is slow but works fine
- The Problem is with the last bit of code, where i want to do a plot with the years on xlab and the price on the ylab.
db_tankdata <- 'tankdaten'
host_db <- 'localhost'
db_port <- '5432'
db_user <- 'postgres'
db_password <- 'xxx'
drv <- dbDriver("PostgreSQL")
con <- dbConnect(drv, dbname = db_tankdata, host=host_db,
port=db_port, user=db_user, password=db_password)
#do a join with tables from database, filter a city
ms_stations_comb <- tbl(con, "prices") %>% left_join(tbl(con, "stations"), by = c("station_uuid" ="uuid")) %>% filter(str_detect(post_code, "^481"))
#get mean prices for different types of fuel
ms_stations_comb %>% summarize(mean_diesel = mean(diesel), mean_e5 = mean(e5), mean_e10 = mean(e10))
#do a plot with years on xlab and price on ylab
ms_stations_comb %>% dbplot_line(year(date), e5)
The code does give me an error saying:
ERROR: column »dbplyr_016.e5« hast to be in GROUP-BY-clause or appear in a aggregate function LINE 1: SELECT "year(date)", "e5"
Edit: Basically I want to do a plot with years on xlab and price on y lab. The dataset contains for example fuel prices (e5) and dates with the format "2018-04-13 23:17:06".
Thanks in advance!