I have 701 CSV files. Each one has the same number of columns (7) but a different number of rows (between 25000 and 28000).

Here is an extract of the first file:

Date,Week,Week Day,Hour,Price,Volume,Sale/Purchase
18/03/2011,11,5,1,-3000.00,17416,Sell
18/03/2011,11,5,1,-1001.10,17427,Sell
18/03/2011,11,5,1,-1000.00,18055,Sell
18/03/2011,11,5,1,-500.10,18057,Sell
18/03/2011,11,5,1,-500.00,18064,Sell
18/03/2011,11,5,1,-400.10,18066,Sell
18/03/2011,11,5,1,-400.00,18066,Sell
18/03/2011,11,5,1,-300.10,18068,Sell
18/03/2011,11,5,1,-300.00,18118,Sell

I fitted a nonlinear regression to the supply curve for the ninth hour of each day in 2012. The data for 2012 are in CSV files 290 to 654.

library(minpack.lm)  # provides nlsLM()

allenamen <- dir(pattern="*.csv")
alledat <- lapply(allenamen, read.csv, header = TRUE, sep = ",", stringsAsFactors = FALSE)
# Model: arctangent-shaped supply curve as a function of the price p
g <- function(a, b, c, d, p) {a*atan(b*p+c)+d}
h <- list()
for(i in 290:654) {
  f <- nlsLM(Volume ~ g(a,b,c,d,Price), data=subset(alledat[[i-289]], (Hour==9) & (Sale.Purchase == "Sell") & (!Price %in% as.character(-50:150))), start = list(a=4000, b=0.1, c=-5, d=32000))
  h[[i-289]] <- coef(f)  # store the fitted coefficients a, b, c, d
}

This works and I get the coefficients a, b, c and d for every day in 2012.

This is the head(h):

[[1]]
        a             b             c             d 
2.513378e+03  4.668218e-02 -3.181322e+00  2.637142e+04 

[[2]]
        a             b             c             d 
2.803172e+03  6.696201e-02 -4.576432e+00  2.574454e+04 

[[3]]
        a             b             c             d 
 3.298991e+03  5.817949e-02 -3.425728e+00  2.393888e+04 

[[4]]
        a             b             c             d 
 2.150487e+03  3.810406e-02 -2.658772e+00  2.675609e+04 

[[5]]
        a             b             c             d 
2.326199e+03  3.044967e-02 -1.780965e+00  2.604374e+04 

[[6]]
        a             b             c             d 
2934.0193270     0.0302937    -1.9912913 26283.0300823

And this is dput(head(h)):

list(structure(c(2513.37818972349, 0.0466821822063123, -3.18132213466142, 
26371.4241646124), .Names = c("a", "b", "c", "d")), structure(c(2803.17230054557, 
0.0669620116294894, -4.57643230249848, 25744.5376725213), .Names = c("a", 
"b", "c", "d")), structure(c(3298.99066895304, 0.0581794881246528, 
-3.42572804902504, 23938.8754575156), .Names = c("a", "b", "c", 
"d")), structure(c(2150.48734655237, 0.0381040636898022, -2.65877160023262, 
26756.0907073567), .Names = c("a", "b", "c", "d")), structure(c(2326.19873555633, 
0.0304496684589379, -1.7809654498454, 26043.735374657), .Names = c("a", 
"b", "c", "d")), structure(c(2934.01932702805, 0.0302937043170001, 
-1.99129130343521, 26283.0300823458), .Names = c("a", "b", "c", 
"d")))

Now I am trying to extract just the a coefficients with h$a, but I get NULL. How can I get just the a column?

In addition, I want to plot each coefficient against Date. I tried this code:

koeffreihe <- function(x) {
  files <- list.files(pattern="*.csv")
  df <- data.frame()
  for(i in 1:length(files)){
    xx <- read.csv(as.character(files[i]))
    xx <- subset(xx, Sale.Purchase == "Sell" & Hour == 3)
    df <- rbind(df, xx)
    g <- function(a, b, c, d, p) {a*atan(b*p+c)+d}
    f <- nlsLM(Volume ~ g(a,b,c,d,Price), data=subset(alledat[[i]], (Hour==9) & (Sale.Purchase == "Sell") & (!Price %in% as.character(-50:150))), start = list(a=4000, b=0.1, c=-5, d=32000))
    h[[i]] <- coef(f)
  }
  df$Date <- as.Date(as.character(df$Date), format="%d/%m/%Y")
  plot(h$x ~ Date, df, xlim = as.Date(c("2012-01-01", "2012-12-31")))
}

koeffreihe(a)

But I get this error:

invalid type (NULL) for variable 'h$x'

So the problem is that h$a is NULL. If someone can fix this problem I guess the code will work too.

Thank you for your help!

Roland
fYpsE
  • Try `do.call(rbind, h)$a`. – Roland Jun 22 '14 at 14:13
  • @roland I tried j <- do.call(rbind,h) and then j$a but I get this error: error in j$a : $ operator is invalid for atomic vectors. class(j$a) is a matrix. I changed the class to data.frame and list but then I get NULL. – fYpsE Jun 23 '14 at 07:17
  • Try `do.call(rbind.data.frame, h)$a`. If you provided a [minimal reproducible example](http://stackoverflow.com/a/5963610/1412059), I could help you better. – Roland Jun 23 '14 at 07:23
  • I edited my post. I hope it will help understanding my problem. – fYpsE Jun 23 '14 at 09:10

2 Answers

1

First transform your list into a data.frame:

h.df <- setNames(do.call(rbind.data.frame, h), names(h[[1]]))
#         a          b         c        d
#1 2513.378 0.04668218 -3.181322 26371.42
#2 2803.172 0.06696201 -4.576432 25744.54
#3 3298.991 0.05817949 -3.425728 23938.88
#4 2150.487 0.03810406 -2.658772 26756.09
#5 2326.199 0.03044967 -1.780965 26043.74
#6 2934.019 0.03029370 -1.991291 26283.03

Then you can extract variables easily:

h.df$a
#[1] 2513.378 2803.172 3298.991 2150.487 2326.199 2934.019

Alternatively you can iterate over the list to extract the variable:

sapply(h, "[", "a")
#       a        a        a        a        a        a 
#2513.378 2803.172 3298.991 2150.487 2326.199 2934.019 
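
As a minimal sketch of why h$a returns NULL in the first place: h is an unnamed list whose elements are named numeric vectors, so `$` finds no list element called "a". The example below rebuilds two of the vectors from the question (values rounded) and also shows the matrix route mentioned in the comments; the names a_values, h_mat and a_column are only illustrative.

```r
# Two of the coefficient vectors from the question (values rounded).
h <- list(
  c(a = 2513.378, b = 0.04668218, c = -3.181322, d = 26371.42),
  c(a = 2803.172, b = 0.06696201, c = -4.576432, d = 25744.54)
)

# `$` on a list looks up a list element by name; the elements of h are unnamed.
is.null(names(h))   # the list itself carries no names
is.null(h$a)        # so there is no element called "a"

# Each element is a named numeric vector, so index inside each element.
a_values <- vapply(h, function(v) v[["a"]], numeric(1))

# rbind() yields a numeric matrix; `$` does not work on matrices,
# so select the column with [, "a"] instead.
h_mat <- do.call(rbind, h)
a_column <- h_mat[, "a"]
```

vapply() is used here instead of sapply() only because it checks that every element yields exactly one number; either works for this data.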
Roland
  • Thank you for your help. The first problem is solved. But I still get an error in my plot function "koeffreihe". I edited my original post. Do you have an idea for it? – fYpsE Jun 24 '14 at 12:37
  • If your problem is solved now you should consider accepting and upvoting an answer. I've reverted your edit. If you have a new question, ask a new question. However, you should first try to debug your code yourself. – Roland Jun 24 '14 at 12:42
0

In this line, although x is a variable, h$x is looking for a column named x in h:

plot(h$x ~ Date, df, xlim = as.Date(c("2012-01-01", "2012-12-31")))

You probably want h[[x]] instead.

From ?'[[':

x$name is equivalent to x[["name", exact = FALSE]].

That is, you are looking for a column literally named x.
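
The difference can be sketched on a small stand-in for the coefficient table (three rows, values rounded from the question's output). The helper coef_series() and the date sequence are hypothetical: the sketch assumes one fit per day starting on 2012-01-01, which the question does not confirm.

```r
# Small stand-in for the coefficient table (values rounded).
h.df <- data.frame(a = c(2513.378, 2803.172, 3298.991),
                   b = c(0.04668218, 0.06696201, 0.05817949))

x <- "a"            # name of the wanted coefficient, held in a variable
is.null(h.df$x)     # TRUE: `$` looks for a column literally called "x"
h.df[[x]]           # `[[` evaluates x first, so this is the "a" column

# Hypothetical helper: pair one coefficient with its dates, ready for plot().
coef_series <- function(coefs, dates, name) {
  stopifnot(is.character(name), name %in% names(coefs))
  data.frame(Date = dates, value = coefs[[name]])
}

# Assumption: one fit per day, starting on 2012-01-01.
dates <- seq(as.Date("2012-01-01"), by = "day", length.out = nrow(h.df))
series <- coef_series(h.df, dates, "a")
```

The result can then be passed to plot(value ~ Date, data = series). Note that the coefficient name has to be given as a string such as "a"; a bare a is evaluated as an R object named a, not as a column name.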

Matthew Lundberg
  • I tried h[[x]] in the plot line but I get this: error in eval(expr, envir, enclos) : argument 'x' is missing with no default. I also tried it without the variable x and tried h[[a]] and h$a but R does not find a in both cases. – fYpsE Jun 23 '14 at 08:02