I need help making a really simple plot: a line graph with two lines, one for each good's prices. Both are time series (x = time, y = prices). My data set follows the format:
#Date prices1 prices2
The dates all follow the format YYYY-MM-DD, and the two price columns are numbers. I have checked the class of all three columns to make sure they are what they are supposed to be ("Date", "numeric", and "numeric", respectively). A few other things I feel I should mention:
The data was retrieved with Quandl() calls, and the initial data frames had different lengths, so I had to join them using full_join(). I still checked the class() of each column in the final data frame, and they are correct.
The price1 column has a length of 91, while the price2 column has a length of 100. I initially thought this was the source of the problem, but after setting df$price2[92:100] = NA, I still have the same problem (I can plot each of the lines separately, but neither shows up when I use the lines() function).
Furthermore, I made a separate script with a three-column data frame that has 100 rows, with NAs for the first ten values of col1, NAs for the 11th to 20th values of col2, etc.
Now, I did not make them time-series objects and tried graphing them simply as normal data frames. I can plot each of them on its own, but I cannot for the life of me plot one and use the lines() function for the other. What could I be missing? If NAs are the issue, then why am I unable to do the two-line plot with the Quandl data while my test data came out fine?
Due to the circumstances of the problem, I've decided to share the Quandl script and the test script.
#Original Script with issues
#Retrieving Data1
library(dplyr)
library(zoo)
library("Quandl")
data.1 = Quandl("JODI/OIL_TCPRKL_VEN")
#Putting data in chronological order
#not in order
print(data.1$Date[1])
print(data.1$Date[length(data.1$Date)])
data.1 = data.frame(
  data.1$Date[length(data.1$Date):1],
  data.1$Value[length(data.1$Value):1]
)
names(data.1) = c("Date", "Value1")
#Now in order
print(data.1$Date[1])
print(data.1$Date[length(data.1$Date)])
#Retrieving data2
data.2 = Quandl("JODI/OIL_TCPRKB_IRQ")
#not in order
print(data.2$Date[1])
print(data.2$Date[length(data.2$Date)])
data.2 = data.frame(
  data.2$Date[length(data.2$Date):1],
  data.2$Value[length(data.2$Value):1]
)
names(data.2) = c("Date", "Value2")
#now in order
print(data.2$Date[1])
print(data.2$Date[length(data.2$Date)])
#join the data
data.join = data.frame(full_join(data.1, data.2))
plot(data.join$Date, data.join$Value1,
     col = "blue",
     main = "Should have both lines",
     type = "l",
     sub = "only one of them shows up though. Why?",
     xlab = "Date",
     ylab = "Values")
lines(data.join$Value2)
#plot only has one line. Why??
Here is also a test script I made where I do not seem to be having the issue.
library(dplyr)
library(zoo)
time.a = as.Date(c(10:30))
time.b = as.Date(c(20:40))
time.c = as.Date(c(30:50))
value.a = as.numeric(seq(10,30,1))
value.b = as.numeric(seq(20,60,2))
value.c = as.numeric(seq(20,30,.5))
length(time.a)
length(time.b)
length(time.c)
length(value.a)
length(value.b)
length(value.c)
print(time.a)
print(time.b)
print(time.c)
print(value.a)
print(value.b)
print(value.c)
data.a = data.frame(time.a, value.a)
data.b = data.frame(time.b, value.b)
data.c = data.frame(time.c, value.c)
names(data.a) = c("Date", "Value.a")
names(data.b) = c("Date", "Value.b")
names(data.c) = c("Date", "Value.c")
all.data = full_join(data.a, data.b)
all.data = full_join(all.data, data.c)
plot(all.data$Date, all.data$Value.a,
     type = "l",
     main = "plot",
     xlab = "Date",
     ylab = "Values")
lines(all.data$Date, all.data$Value.b,
      col = "blue")
lines(all.data$Date, all.data$Value.c,
      col = "red")
I am really trying to understand why the first script doesn't work while the second one does. Any help or hints would be greatly appreciated.
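For anyone without Quandl access, the one structural difference between the two scripts can be sketched with made-up data: the first script calls lines() with only the y values, while the test script also passes the dates. The dates and values below are hypothetical stand-ins, not the JODI series, and this is only a minimal sketch of how base R treats the two calls, not a confirmed diagnosis of the Quandl data.

```r
# Hypothetical stand-in data (NOT the Quandl/JODI series)
dates  <- seq(as.Date("2015-01-01"), by = "month", length.out = 12)
value1 <- seq(100, 210, by = 10)
value2 <- seq(150, 260, by = 10)

# First series against the dates; ylim widened so both series fit vertically
plot(dates, value1, type = "l", col = "blue",
     ylim = range(value1, value2), xlab = "Date", ylab = "Values")

# When lines() gets a single vector, xy.coords() uses the positions
# 1, 2, ..., n as x and the vector itself as y
idx_coords <- xy.coords(value2)
range(idx_coords$x)        # index positions
range(as.numeric(dates))   # the x axis of the plot: days since 1970-01-01

# Same call shape as the first script: x = 1..12, far outside the date axis
lines(value2)

# Same call shape as the test script: x = dates, on the same scale as the plot
lines(dates, value2, col = "red")
```

Comparing the two range() results shows whether the index positions fall inside the x range of the existing plot; if they do not, the line is drawn outside the visible region rather than raising an error.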