I'm trying to make a time-series plot by year with ggplot2. My issue is that some columns contain "." instead of NA. It also seems that my variables are factors rather than numeric.

Dataset

DATE        IR      IQ
9/1/1983    77.6    85.7
10/1/1983   .       .
11/1/1983   .       .
12/1/1983   78      85.4

df_temp <- read.csv("",na.strings = "")

IR.factor <- factor(IR)
IQ.factor <- factor(IQ)
as.numeric(IR.factor)
as.numeric(IQ.factor)

head(df_temp)
str(df_temp)

df_temp <- df_temp[rowSums(is.na(df_temp)) != ncol(df_temp), ]

ggplot(aes(x=date, weight=value, fill=variable), data=df_temp) +
geom_bar() + 
labs(x='DATE', y='Index 2000=100, Not Seasonally Adjusted') +
labs(color='Legend') +
scale_fill_discrete(labels = c('IR.factor',
                             'IQ.factor'
                             )) +
 scale_y_continuous(breaks = round(seq(0, 200, by = 20), 1)) +
 scale_x_date(date_breaks = '5 year', date_minor_breaks = '5 year', 
 date_labels = '%Y') +
 theme(plot.title = element_text(lineheight=.7, face='bold')) +
 theme(legend.position='bottom')

Any suggestions are much appreciated

    Read "." as NA, see this answer for example: https://stackoverflow.com/a/19126403/680068 – zx8754 Feb 19 '18 at 16:13
    In addition to reading your periods as NA, you might want to read the file with a function that doesn't read columns as factors. I believe that readr::read_csv() is a good option for this. – Fred Boehm Feb 19 '18 at 16:30
  • Thank you for the "." as NA tip. However I'm still getting the error "Don't know how to automatically pick scale for object of type function. Defaulting to continuous. Error in FUN(X[[i]], ...) : object 'variable' not found" –  Feb 19 '18 at 16:53

1 Answer


Final code looked like this:

library(ggplot2)

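# Read the file, treating both empty cells and "." as missing values
df_temp <- read.csv(".csv", na.strings = c("", "."))
# Non-missing values of each series (not used by the plot below)
IR <- df_temp$IR[!is.na(df_temp$IR)]
IQ <- df_temp$IQ[!is.na(df_temp$IQ)]
# as.Date() without a format argument expects YYYY-MM-DD strings
df_temp$YEAR <- as.Date(df_temp$YEAR)
# Keep only the rows from 1998 through the start of 2018
ADJ_DF <- df_temp[df_temp$YEAR >= "1998-01-01" & df_temp$YEAR <= "2018-01-01", ]

# Only x and group are shared; each geom_line() supplies its own y and legend label
ggplot(ADJ_DF, aes(YEAR, group = 1)) +
    geom_line(aes(y = IR, color = "Import Price Index (IR)")) +
    geom_line(aes(y = IQ, color = "Export Price Index (IQ)")) +
    labs(x = "Year", y = "Index 2000=100, Not Seasonally Adjusted",
         title = "United States Import and Export Price Indexes: All Commodities (1998-2018)",
         color = "Legend") +
    scale_x_date(date_breaks = "2 years", date_labels = "%Y") +
    theme(plot.title = element_text(lineheight = .7, face = 'bold')) +
    theme(legend.position = 'bottom')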

Notes for any other beginners:

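- use group=1 (needed when x is treated as discrete, e.g. a factor; harmless when x is a Date)
- dates must be converted with as.Date, and should be in YYYY-MM-DD format unless you pass a format argument
- only include in the ggplot() call what applies to both lines
- the geom_line() calls then contain the y variables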
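
For anyone starting from the sample data in the question, here is a minimal sketch of the reading and conversion steps. It assumes the dates are month/day/year as shown there; the names raw, df, long and p are only illustrative, and text= stands in for the real CSV file. It also reshapes to long format, which is what a mapping like fill=variable expects: the error in the comments arises because the data has no column called variable, and because x=date picks up the base R function date() rather than the DATE column.

```r
library(ggplot2)

# Sample rows copied from the question (assumed comma-separated here)
raw <- "DATE,IR,IQ
9/1/1983,77.6,85.7
10/1/1983,.,.
11/1/1983,.,.
12/1/1983,78,85.4"

df <- read.csv(text = raw,
               na.strings = c("", "."),   # treat "." and empty cells as NA
               stringsAsFactors = FALSE)  # keep text as character, not factor

# The dates are month/day/year, so as.Date() needs an explicit format
df$DATE <- as.Date(df$DATE, format = "%m/%d/%Y")

str(df)  # DATE should now be Date; IR and IQ numeric

# If a column was already read as a factor, go through character first:
# as.numeric() on a factor returns the internal level codes, not the values
f <- factor(c("77.6", "78"))
as.numeric(as.character(f))

# Long format: one row per (date, series) pair, built with base R only
long <- data.frame(
  DATE     = rep(df$DATE, times = 2),
  variable = rep(c("IR", "IQ"), each = nrow(df)),
  value    = c(df$IR, df$IQ)
)
long <- long[!is.na(long$value), ]  # drop the missing observations

# One geom_line() now draws both series, coloured by the variable column
p <- ggplot(long, aes(x = DATE, y = value, color = variable)) +
  geom_line() +
  scale_x_date(date_labels = "%Y")
```

Printing p draws the plot. With the long layout, adding a third series only means adding rows, not another geom_line() call.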