
Suppose I have an xts object, a data frame, or data in CSV format, and I would like to filter it by dropping the illiquid days, i.e. the days with fewer observations than a fixed threshold (say, fewer than 5,000 observations per day). How can I do this? I am working with tick-by-tick data.

Posting data at the community's request. I would like to filter out 5 April, since it has fewer than 3 observations:

"DateTime","spy.prices.Open","spy.prices.High","spy.prices.Low","spy.prices.Close"
2007-04-02 09:34:59,142.16,142.34,142.13,142.2
2007-04-02 09:39:59,142.19,142.32,142.14,142.16
2007-04-02 09:44:58,142.16,142.27,142.03,142.25
2007-04-02 09:49:59,142.26,142.28,142.16,142.18
2007-04-02 09:54:57,142.17,142.24,142.15,142.2
2007-04-02 09:59:57,142.2,142.23,142.09,142.13

2007-04-05 14:19:57,144.3,144.34,144.29,144.33
2007-04-05 14:24:59,144.33,144.43,144.31,144.42

2007-04-10 14:34:58,144.64,144.71,144.59,144.62
2007-04-10 14:39:56,144.62,144.69,144.62,144.67
2007-04-10 14:44:59,144.67,144.72,144.67,144.71
2007-04-10 14:49:59,144.7,144.73,144.66,144.73
2007-04-10 14:54:59,144.73,144.75,144.69,144.7
2007-04-10 14:59:58,144.701,144.72,144.7,144.71
2007-04-10 15:04:58,144.72,144.78,144.71,144.74
2007-04-10 15:09:58,144.7499,144.79,144.74,144.77
2007-04-10 15:14:59,144.77,144.7799,144.69,144.69
2007-04-10 15:19:57,144.69,144.73,144.66,144.719
2007-04-10 15:24:59,144.71,144.79,144.71,144.79
2007-04-10 15:29:59,144.79,144.79,144.72,144.725
2007-04-10 15:34:59,144.73,144.79,144.73,144.78
2007-04-10 15:39:57,144.78,144.83,144.76,144.77
2007-04-10 15:44:59,144.78,144.81,144.73,144.77
2007-04-10 15:49:59,144.78,144.78,144.73,144.74
2007-04-10 15:54:57,144.74,144.8,144.73,144.79
2007-04-10 15:59:59,144.79,144.82,144.79,144.8
Brian Tompsett - 汤莱恩
aajkaltak
  • Post some data. See: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – harkmug May 31 '13 at 19:56
  • Are you serious you down voted? I am not looking for an exact solution. I am looking for methods. I don't see what is unclear about my question. I am not asking you to solve a homework problem for me. – aajkaltak May 31 '13 at 19:59

1 Answer


Seems silly to just throw away data... but here you go:

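library(xts)
# Read the CSV into a zoo object with a POSIXct index, then convert it to xts
x <- as.xts(read.zoo("data.csv",sep=",",header=TRUE,FUN=as.POSIXct))
# Add a column N with each day's row count, back-filled from the day's last observation
x <- merge(x,N=apply.daily(x,nrow),fill=function(f) na.locf(f,fromLast=TRUE))
# Keep only the rows that belong to days with more than 2 observations
x <- x[x$N > 2,]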
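
As a possible variation on the code above (a sketch only, not part of the original answer), the per-day count can also be computed with base R's ave() on a calendar-day key, which avoids adding a helper column to the object. The sample values, the UTC time zone, and the names min_obs, day and n_per_row are assumptions made for illustration.

```r
library(xts)

# Hypothetical sample: three rows on 2007-04-02 and one row on 2007-04-05.
idx <- as.POSIXct(c("2007-04-02 09:34:59", "2007-04-02 09:39:59",
                    "2007-04-02 09:44:58", "2007-04-05 14:19:57"),
                  tz = "UTC")
x <- xts(c(142.20, 142.16, 142.25, 144.33), order.by = idx)
colnames(x) <- "Close"

min_obs <- 3  # assumed threshold: keep days with at least this many rows

# Calendar-day key for every row, in the index's own time zone.
day <- format(index(x), "%Y-%m-%d")

# For each row, the number of rows that share its calendar day.
n_per_row <- ave(seq_along(day), day, FUN = length)

# Keep only rows from days that meet the threshold.
filtered <- x[n_per_row >= min_obs, ]
```

With tick data the threshold would be much larger (the question mentions 5,000 per day). The day boundaries depend on the time zone of the index, so it is worth checking that it matches the exchange's trading day.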
Joshua Ulrich