
I have output streamed as text in the following form:

[2] "TWS OrderStatus: orderId=12048 status=PreSubmitted 
                      filled=0 remaining=300 averageFillPrice=0 "

[3] "TWS OrderStatus: orderId=12049 status=PreSubmitted 
                      filled=0 remaining=300 averageFillPrice=0 "

I would like to capture such output and convert it to a data frame with columns: orderId, status, filled, remaining, averageFillPrice.

What is the most efficient way to do this?

I tried capturing it with capture.output, but I am not sure how to convert the result to a data frame.

coffeinjunky
kalka
  • What do you mean "streamed"? – nrussell Mar 04 '16 at 14:13
  • The function connects to a financial website and returns information as it happens. I close the connection within 5 seconds anyway. – kalka Mar 04 '16 at 14:14
  • It is difficult for us to reproduce your procedure. You are talking about "output streamed" by "the function" etc. This makes it very difficult for us to think about ways to help you. We don't even know what kind of object your captured output is. Please read http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – coffeinjunky Mar 04 '16 at 14:21
  • You are right. Let's forget about the streaming part. Suppose I have captured the output I need in a character vector of length n whose values are all in the same form as reported above... How can I convert it to a data frame? – kalka Mar 04 '16 at 14:24

2 Answers


I think you can do this with a few base string functions. If you had your strings stored in a list, as in the example below, you could create a function to extract the information you need and then apply it to the list and output a data frame:

a <- "TWS OrderStatus: orderId=12048 status=PreSubmitted filled=0 remaining=300 averageFillPrice=0 "
b <- "TWS OrderStatus: orderId=12049 status=PreSubmitted filled=0 remaining=300 averageFillPrice=0 "
dat <- list(a, b)

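extract <- function(x) {
    # Split on whitespace and drop the two leading tokens ("TWS", "OrderStatus:")
    tokens <- strsplit(trimws(x), "\\s+")[[1]][-(1:2)]
    # The text after "=" is the value; the text before it becomes the column name
    values <- sub("^[^=]*=", "", tokens)
    names(values) <- sub("=.*$", "", tokens)
    values
}

as.data.frame(t(sapply(dat, extract)), stringsAsFactors = FALSE)

Every column comes back as character, so you may still want to convert the numeric ones. This works as long as all your data follows the same pattern, i.e. two leading tokens followed by whitespace-separated name=value fields, where you keep the name as the column and the part after the equals sign as the value.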
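As an alternative sketch that does not depend on token positions, the name=value pairs can be matched directly with a regular expression. This also copes with lines that are wrapped and padded the way the question prints them. The names `status_lines`, `parse_status` and `orders` are made up for illustration, and the input is reconstructed from the question's printed output, so treat the exact whitespace as an assumption.

```r
# Hypothetical input: two status lines, wrapped as in the question's printout
status_lines <- c(
  "TWS OrderStatus: orderId=12048 status=PreSubmitted \n                      filled=0 remaining=300 averageFillPrice=0 ",
  "TWS OrderStatus: orderId=12049 status=PreSubmitted \n                      filled=0 remaining=300 averageFillPrice=0 "
)

# Extract every name=value token, regardless of wrapping or padding
parse_status <- function(x) {
  pairs <- regmatches(x, gregexpr("[^[:space:]=]+=[^[:space:]]*", x))[[1]]
  values <- sub("^[^=]*=", "", pairs)       # text after the first "="
  names(values) <- sub("=.*$", "", pairs)   # text before the first "="
  values
}

# One row per input string; all columns are character at this point
orders <- as.data.frame(do.call(rbind, lapply(status_lines, parse_status)),
                        stringsAsFactors = FALSE)
orders
```

Because the pattern only matches tokens that contain an equals sign, the "TWS OrderStatus:" prefix is skipped without counting tokens. This assumes that values never contain whitespace, which holds for the sample lines in the question.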

Dan Lewer

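Another possible solution, using the input txt defined under DATA below. Note that rbindlist comes from data.table, so that package is loaded explicitly here:

library("splitstackshape")  # cSplit()
library("stringr")          # str_split()
library("data.table")       # rbindlist()

makedf <- function(x) {
    # Drop the "TWS OrderStatus:" prefix, then split the rest on spaces
    v1 <- str_split(trimws(sub(".*?:(.+)", "\\1", x)), " ")
    v3 <- as.data.frame(sapply(v1, function(i) t(i)))
    # Split each "name=value" token at "=" and transpose: row 1 = names, row 2 = values
    v4 <- as.data.frame(t(cSplit(v3, "V1", "=")))
    v4[] <- lapply(v4, as.character)
    # Promote the first row to column names, then drop it and return the rest
    colnames(v4) <- unlist(v4[1, ])
    v4[-1, ]
}
FinalDF <- rbindlist(lapply(txt, makedf))
FinalDF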
#   orderId       status filled remaining averageFillPrice
#1:   12048 PreSubmitted      0       300                0
#2:   12049 PreSubmitted      0       300                0
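The values in a result like the one printed above are all character strings. A minimal sketch of converting them to sensible types with base R's type.convert follows; `orders_chr` is a hypothetical stand-in for the parsed result, built by hand here so the example runs without the packages above.

```r
# Hypothetical stand-in for the parsed result: every column is character
orders_chr <- data.frame(
  orderId = c("12048", "12049"),
  status = c("PreSubmitted", "PreSubmitted"),
  filled = c("0", "0"),
  remaining = c("300", "300"),
  averageFillPrice = c("0", "0"),
  stringsAsFactors = FALSE
)

# type.convert() picks a type per column; as.is = TRUE keeps text as character
orders_typed <- orders_chr
orders_typed[] <- lapply(orders_chr, type.convert, as.is = TRUE)

str(orders_typed)
```

With this sample data, averageFillPrice is read as an integer because every value is "0"; on real data with fractional prices it would come out as numeric, so as.numeric on that column may be the safer choice.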

DATA

txt <- list("TWS OrderStatus: orderId=12048 status=PreSubmitted filled=0 remaining=300 averageFillPrice=0 ", 
    "TWS OrderStatus: orderId=12049 status=PreSubmitted filled=0 remaining=300 averageFillPrice=0 ")
Sotos