
The data frame looks like this (about 10,000 timestamps):

Timestamp            OFR     OFRSIZ  BID    BIDSIZ
2015-04-01 09:00:00  375     100     365    10
2015-04-01 09:00:33  369.9   10      365    10
2015-04-01 09:00:36  366     100     367.8  55
2015-04-01 09:00:42  367.45  30      366.4  130
2015-04-01 09:00:43  369.9   10      365    10
2015-04-01 09:00:44  365     5       367.8  55
2015-04-01 09:00:49  369.9   10      365    10

The requirement is a new data frame (New_df) with the same timestamps and one additional column, depth, computed as (OFRSIZ + BIDSIZ). Can the same be done with xts objects?

Jaap
gaurav kumar

2 Answers

1

If your original object is called m1 (an xts object, so that index(m1) returns its timestamps):

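library(xts)
# Sum the two size columns; as.numeric() because the columns may not be stored as numbers
depth <- data.frame("Depth" = as.numeric(m1$OFRSIZ) + as.numeric(m1$BIDSIZ))
# Attach the timestamps of m1 as the index, then merge on that index
depth_xts <- xts(depth, order.by = index(m1))
New_df <- merge.xts(m1, depth_xts)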

It looks like you're using TAQ data from WRDS. You might find some interesting functions in the packages quantmod and highfrequency.

kristang
  • @kristang Yes, I am using the highfrequency package. – gaurav kumar Aug 13 '15 at 11:30
  • @kristang I am computing depth from tqdata as (OFRSIZ+BIDSIZ)/2, since the highfrequency package has no direct function for it. tqdata is an xts with OFRSIZ and BIDSIZ as columns and one row per unique timestamp. Can a new xts be constructed with the same timestamps as rows and (OFRSIZ+BIDSIZ)/2 as its column? – gaurav kumar Aug 13 '15 at 11:38
  • I don't think I understand your question. Do you want an xts with only the time, OFRSIZ, BIDSIZ and (OFRSIZ+BIDSIZ)/2 as columns? – kristang Aug 13 '15 at 12:29
  • @kristang Yes, the requirement is an xts with the time and (OFRSIZ+BIDSIZ)/2 as columns. I am doing it as `dep_xts <- xts(TIMESTAMP = Old_df$TIMESTAMP, Depth = Old_df$OFRSIZ + Old_df$BIDSIZ)`, but it creates an xts object of zero width, and dep_xts shows no entries. – gaurav kumar Aug 13 '15 at 12:56
  • Just change `depth` in my answer to:`depth <- data.frame("Depth" = (as.numeric(m1$OFRSIZ) + as.numeric(m1$BIDSIZ))/2)` – kristang Aug 13 '15 at 12:57
  • @kristang I am running `dep_xts <- xts(TIMESTAMP = tqdata$TIMESTAMP, Depth = (tqdata$OFRSIZ) + (tqdata$BIDSIZ))`, where tqdata is the xts containing OFRSIZ and BIDSIZ. It still shows an xts object of zero width. Please help. – gaurav kumar Aug 13 '15 at 13:04
  • You can't do that. `tqdata$TIMESTAMP` needs to be added as the argument to `order.by` which creates the index in which the data is added. Additionally, if OFRSIZ and BIDSIZ are factors, you cannot perform numerical manipulations, and you need to class them as numerics. – kristang Aug 13 '15 at 13:14
  • @kristang I am running the following code: `depth <- data.frame(date = index(tqdata), coredata(tqdata))` (converting the tqdata xts to a data frame), then `depth2 <- data.frame("TIMESTAMP" = depth$TIMESTAMP, "Depth" = as.numeric(depth$OFRSIZ) + as.numeric(depth$BIDSIZ))`. This gives me an xts object of zero width, while `depth2 <- data.frame("Depth" = as.numeric(depth$OFRSIZ) + as.numeric(depth$BIDSIZ))` gives me a data frame of depth only. I need the corresponding time as well. Please help. – gaurav kumar Aug 13 '15 at 13:18
  • Do you want the final result to be a `data.frame` or an `xts` ? – kristang Aug 13 '15 at 13:30
  • @kristang I think the problem lies in `depth_xts <- xts(depth, order.by = index(m1))`, as the error says order.by requires an appropriate time-based object. I tried using depth$TIMESTAMP instead of m1, since depth is my data frame, but in vain: `depth_xts21 <- xts(depth21, order.by = index(as.POSIXct(depth$TIMESTAMP)))` gives `Error in as.POSIXct.default(depth$TIMESTAMP) : do not know how to convert 'depth$TIMESTAMP' to class “POSIXct”` – gaurav kumar Aug 13 '15 at 13:33
  • @kristang The output should be an xts. – gaurav kumar Aug 13 '15 at 13:45
  • What are the classes of your columns in the `depth` data frame? – ctloftin Aug 13 '15 at 13:46
  • 'data.frame': 7212 obs. of 8 variables:
    $ date  : POSIXct, format: "2015-04-01 09:00:00"
    $ PRICE : Factor w/ 472 levels "364.2",..
    $ SIZE  : Factor w/ 874 levels "0","1",..
    $ OFR   : Factor w/ 576 levels "365",..
    $ OFRSIZ: Factor w/ 1625 levels "1","10","100",..
    $ BID   : Factor w/ 571 levels "365.15","365.25",..
    $ BIDSIZ: Factor w/ 1638 levels "1","10","100",..
    – gaurav kumar Aug 13 '15 at 13:59
  • The principle behind `xts` is that you create a skeleton with the time-stamp as the index, and in this skeleton you put in the data you want. I would suggest reading these (old) slides: http://www.rinfinance.com/RinFinance2009/presentations/xts_quantmod_workshop.pdf – kristang Aug 13 '15 at 14:13
  • Part of the problem might be that you're converting factors to numeric incorrectly. Calling `as.numeric()` directly on a factor returns its internal integer level codes, not the numbers shown in the labels. You have to convert to character first and then to numeric: `depth$OFRSIZ <- as.numeric(as.character(depth$OFRSIZ))`. I'd start by converting all of your factors to numeric values and then see what kind of errors you're getting. – ctloftin Aug 13 '15 at 14:20
  • There is no variable named `TIMESTAMP` in that dataframe, `depth`. Use `depth$date` as the argument to `order.by` – kristang Aug 13 '15 at 15:27
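
To make the points from these comments concrete, here is a minimal sketch of building the depth xts from a data frame shaped like `depth`. The three-row data frame is made up for illustration (the real one has 7212 rows), and it assumes the size columns arrive as factors, as the `str()` output above suggests. The data goes in the first argument of `xts()` and the timestamps go in `order.by`. Names such as `TIMESTAMP =` and `Depth =` are not arguments of `xts()`, which would be consistent with the zero-width result reported above.

```r
library(xts)

# Hypothetical stand-in for `depth`: a POSIXct `date` column plus factor columns.
depth <- data.frame(
  date   = as.POSIXct(c("2015-04-01 09:00:00",
                        "2015-04-01 09:00:33",
                        "2015-04-01 09:00:36"), tz = "UTC"),
  OFRSIZ = factor(c("100", "10", "100")),
  BIDSIZ = factor(c("10", "10", "55"))
)

# Factor -> character -> numeric, so the labels (not the level codes) are used.
ofrsiz <- as.numeric(as.character(depth$OFRSIZ))
bidsiz <- as.numeric(as.character(depth$BIDSIZ))

# One-column matrix named "Depth" as the data; timestamps as the index.
dep_xts <- xts(cbind(Depth = (ofrsiz + bidsiz) / 2), order.by = depth$date)
```

If the original columns should be kept alongside the new one, `merge(tqdata, dep_xts)` joins the two objects on their shared index, along the lines of the `merge.xts()` call in the answer.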
0

What about something like this:

New_df <- data.frame(Timestamp = Old_df$Timestamp, Depth = Old_df$OFRSIZ + Old_df$BIDSIZ)
ctloftin
  • @ctloftin It is not working on the xts data set. I am running `depth <- xts(TIMESTAMP = tqdata$TIMESTAMP, Depth = tqdata$OFRSIZ + tqdata$BIDSIZ)`. It does not show an error, but it creates an xts object of zero width. – gaurav kumar Aug 12 '15 at 14:46
  • `depth_df <- data.frame(TIMESTAMP = depth$TIMESTAMP, Depth = depth$OFRSIZ + depth$BIDSIZ)` gives `Error in data.frame(TIMESTAMP = depth$TIMESTAMP, Depth = depth$OFRSIZ + : arguments imply differing number of rows: 0, 7212`. In addition: `Warning message: In Ops.factor(depth$OFRSIZ, depth$BIDSIZ) : ‘+’ not meaningful for factors` – gaurav kumar Aug 12 '15 at 14:56
  • 1
    @gauravkumar that means your OFRSIZ and BIDSIZ variables are actually factors, not numeric. Could you update your original post with a [minimal reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example)? – josliber Aug 12 '15 at 15:17
  • Agree with what @josliber said about the reproducible example. For adding the two columns together, you apparently need to convert them to numeric first: `as.numeric(as.character(depth$OFRSIZ))` – ctloftin Aug 12 '15 at 15:28
  • @ctloftin and @josliber I converted them to numeric with `depth$OFRSIZ <- as.integer(depth$OFRSIZ)` and `depth$BIDSIZ <- as.integer(depth$BIDSIZ)`, then ran `depth_df <- data.frame(TIMESTAMP = depth$TIMESTAMP, Depth = depth$OFRSIZ + depth$BIDSIZ)`. The error is: `Error in data.frame(TIMESTAMP = depth$TIMESTAMP, Depth = depth$OFRSIZ + : arguments imply differing number of rows: 0, 7212` – gaurav kumar Aug 13 '15 at 03:11
  • When you run `class(depth$OFRSIZ)` and `class(depth$BIDSIZ)`, what do you get? – ctloftin Aug 13 '15 at 13:14
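
The two symptoms in this comment thread can be reproduced in base R. The sketch below uses a made-up three-value factor and a toy data frame, so the names and values are illustrative only. It shows that `as.integer()` on a factor silently returns level codes, and that referring to a column that does not exist yields `NULL`, which has length 0 and would account for the "differing number of rows: 0, 7212" message.

```r
# Hypothetical factor column, e.g. sizes that were read in as text.
ofrsiz <- factor(c("100", "10", "5"))

# Levels are sorted as strings ("10", "100", "5"), so the codes are 2, 1, 3.
codes  <- as.integer(ofrsiz)                 # 2 1 3: level codes, not sizes
values <- as.numeric(as.character(ofrsiz))   # 100 10 5: the sizes themselves

# Toy data frame whose time column is called `date`, not `TIMESTAMP`.
depth <- data.frame(date = Sys.time() + 0:2, OFRSIZ = ofrsiz)

# A missing column is NULL (length 0), so data.frame() sees 0 rows for it.
missing_len <- length(depth$TIMESTAMP)       # 0
```

Checking `names(depth)` and `class(depth$OFRSIZ)` before building the new data frame catches both problems early.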