I have a list of 701 CSV files. Each one has the same number of columns (7) but a different number of rows (between 25000 and 28000).
Here is an extract of the first file:
Date,Week,Week Day,Hour,Price,Volume,Sale/Purchase
18/03/2011,11,5,1,-3000.00,17416,Sell
18/03/2011,11,5,1,-1001.10,17427,Sell
18/03/2011,11,5,1,-1000.00,18055,Sell
18/03/2011,11,5,1,-500.10,18057,Sell
18/03/2011,11,5,1,-500.00,18064,Sell
18/03/2011,11,5,1,-400.10,18066,Sell
18/03/2011,11,5,1,-400.00,18066,Sell
18/03/2011,11,5,1,-300.10,18068,Sell
18/03/2011,11,5,1,-300.00,18118,Sell
I fitted a nonlinear regression to the supply curve of the ninth hour for the year 2012. The data for 2012 are in CSV files 290 to 654.
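library(minpack.lm)  # provides nlsLM()

# Read all CSV files into a list of data frames
allenamen <- dir(pattern="*.csv")
alledat <- lapply(allenamen, read.csv, header = TRUE, sep = ",", stringsAsFactors = FALSE)

# Supply-curve model: volume as an arctan function of price p
g <- function(a, b, c, d, p) {a*atan(b*p+c)+d}

h <- list()
for(i in 290:654) {
  # Fit hour 9, sell side only, excluding prices equal to a whole number from -50 to 150
  f <- nlsLM(Volume ~ g(a,b,c,d,Price), data=subset(alledat[[i-289]], (Hour==9) & (Sale.Purchase == "Sell") & (!Price %in% as.character(-50:150))), start = list(a=4000, b=0.1, c=-5, d=32000))
  h[[i-289]] <- coef(f)  # one named vector (a, b, c, d) per fit, stored at positions 1..365
}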
This works and I get the coefficients a, b, c and d for every day in 2012.
This is head(h):
[[1]]
a b c d
2.513378e+03 4.668218e-02 -3.181322e+00 2.637142e+04
[[2]]
a b c d
2.803172e+03 6.696201e-02 -4.576432e+00 2.574454e+04
[[3]]
a b c d
3.298991e+03 5.817949e-02 -3.425728e+00 2.393888e+04
[[4]]
a b c d
2.150487e+03 3.810406e-02 -2.658772e+00 2.675609e+04
[[5]]
a b c d
2.326199e+03 3.044967e-02 -1.780965e+00 2.604374e+04
[[6]]
a b c d
2934.0193270 0.0302937 -1.9912913 26283.0300823
And this is dput(head(h)):
list(structure(c(2513.37818972349, 0.0466821822063123, -3.18132213466142,
26371.4241646124), .Names = c("a", "b", "c", "d")), structure(c(2803.17230054557,
0.0669620116294894, -4.57643230249848, 25744.5376725213), .Names = c("a",
"b", "c", "d")), structure(c(3298.99066895304, 0.0581794881246528,
-3.42572804902504, 23938.8754575156), .Names = c("a", "b", "c",
"d")), structure(c(2150.48734655237, 0.0381040636898022, -2.65877160023262,
26756.0907073567), .Names = c("a", "b", "c", "d")), structure(c(2326.19873555633,
0.0304496684589379, -1.7809654498454, 26043.735374657), .Names = c("a",
"b", "c", "d")), structure(c(2934.01932702805, 0.0302937043170001,
-1.99129130343521, 26283.0300823458), .Names = c("a", "b", "c",
"d")))
Now I am trying to extract just the a values with h$a, but I get NULL. How can I get just the a column?
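To make the problem reproducible without the CSV files, here is a minimal sketch that uses only the first two elements of the dput output above (values rounded). It shows that `$` on the list returns NULL, because the names a, b, c and d belong to the numeric vectors stored inside the list and not to the list itself. The `sapply` and `do.call(rbind, ...)` lines are one possible way to pull the values out, shown only on this small sample and not checked against the full data; `h_small`, `a_values` and `coef_mat` are names made up for this example.

```r
# First two elements of h, copied (rounded) from the dput output above
h_small <- list(
  c(a = 2513.378, b = 0.04668218, c = -3.181322, d = 26371.42),
  c(a = 2803.172, b = 0.06696201, c = -4.576432, d = 25744.54)
)

# The list itself has no element named "a", so this prints NULL
print(h_small$a)

# Each element is a named numeric vector, so "a" can be picked per element
a_values <- sapply(h_small, function(v) v[["a"]])
print(a_values)

# Alternatively, stack the vectors into a matrix with columns a, b, c, d
coef_mat <- do.call(rbind, h_small)
print(coef_mat[, "a"])
```

The matrix form may be more convenient if all four coefficients are needed later, since each one is then a column.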
In addition, I want to plot each coefficient against Date. I tried this code:
koeffreihe <- function(x) {
  files <- list.files(pattern="*.csv")
  df <- data.frame()
  for(i in 1:length(files)){
    xx <- read.csv(as.character(files[i]))
    xx <- subset(xx, Sale.Purchase == "Sell" & Hour == 3)
    df <- rbind(df, xx)
    g <- function(a, b, c, d, p) {a*atan(b*p+c)+d}
    f <- nlsLM(Volume ~ g(a,b,c,d,Price), data=subset(alledat[[i]], (Hour==9) & (Sale.Purchase == "Sell") & (!Price %in% as.character(-50:150))), start = list(a=4000, b=0.1, c=-5, d=32000))
    h[[i]] <- coef(f)
  }
  df$Date <- as.Date(as.character(df$Date), format="%d/%m/%Y")
  plot(h$x ~ Date, df, xlim = as.Date(c("2012-01-01", "2012-12-31")))
}
koeffreihe(a)
But I get this error:
invalid type (NULL) for variable 'h$x'
So the problem is that h$a is NULL. If someone can fix this problem, I guess the code will work too.
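For reference, this is roughly the result I am after, sketched on the six rounded coefficient vectors from head(h) so that it runs without the CSV files. It is only a sketch under assumptions: the dates are hypothetical (it assumes that element k of the list belongs to the k-th day of 2012, which may not match the real files), and `h_head`, `dates` and `coef_df` are names invented for this example. The coefficients are first stacked into a data frame with one row per day, so that a coefficient can be referred to by column name in the plot formula.

```r
# Six coefficient vectors as in head(h) above (rounded)
h_head <- list(
  c(a = 2513.378, b = 0.04668218, c = -3.181322, d = 26371.42),
  c(a = 2803.172, b = 0.06696201, c = -4.576432, d = 25744.54),
  c(a = 3298.991, b = 0.05817949, c = -3.425728, d = 23938.88),
  c(a = 2150.487, b = 0.03810406, c = -2.658772, d = 26756.09),
  c(a = 2326.199, b = 0.03044967, c = -1.780965, d = 26043.74),
  c(a = 2934.019, b = 0.03029370, c = -1.991291, d = 26283.03)
)

# Hypothetical dates: assumes one fit per consecutive day starting 2012-01-01
dates <- seq(as.Date("2012-01-01"), by = "day", length.out = length(h_head))

# One row per day, with columns Date, a, b, c, d
coef_df <- data.frame(Date = dates, do.call(rbind, h_head))

# Plot coefficient a against Date over the whole of 2012
plot(a ~ Date, data = coef_df, type = "b",
     xlim = as.Date(c("2012-01-01", "2012-12-31")))
```

Replacing `a` in the formula with `b`, `c` or `d` would plot the other coefficients in the same way.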
Thank you for your help!