I'm given some data in an .rData file. The format is an xts
object in character mode. (I realise this is an unusual format, but I have no control over it)
> head(trades)
SYMBOL EX PRICE SIZE COND BID BIDSIZ OFR
2012-05-04 09:30:00 "BAC" "T" "7.89" "38538" "F" "7.89" "523" "7.9"
2012-05-04 09:30:01 "BAC" "Z" "7.885" "288" "@" "7.88" "61033" "7.9"
2012-05-04 09:30:03 "BAC" "X" "7.89" "1000" "@" "7.88" "1974" "7.89"
2012-05-04 09:30:07 "BAC" "T" "7.89" "19052" "F" "7.88" "1058" "7.89"
2012-05-04 09:30:08 "BAC" "Y" "7.89" "85053" "F" "7.88" "108101" "7.9"
2012-05-04 09:30:09 "BAC" "D" "7.8901" "10219" "@" "7.89" "268" "7.9"
> mode(trades)
'character'
I would like to process this data by converting into a saner format, namely a tibble, so that I can store the columns as datetimes, doubles and integers.
I've managed to achieve this with the following code:
> trades_ = bind_cols(data_frame(DATE=index(trades)), as_data_frame(coredata(trades))) %>%
mutate_at(as.numeric, .cols=vars(PRICE, BID, OFR)) %>%
mutate_at(as.integer, .cols=vars(SIZE, BIDSIZ, OFRSIZ))
> head(trades_)
# A tibble: 6 × 10
DATE SYMBOL EX PRICE SIZE COND BID BIDSIZ OFR
<dttm> <chr> <chr> <dbl> <int> <chr> <dbl> <int> <dbl>
1 2012-05-04 09:30:00 BAC T 7.8900 38538 F 7.89 523 7.90
2 2012-05-04 09:30:01 BAC Z 7.8850 288 @ 7.88 61033 7.90
3 2012-05-04 09:30:03 BAC X 7.8900 1000 @ 7.88 1974 7.89
4 2012-05-04 09:30:07 BAC T 7.8900 19052 F 7.88 1058 7.89
5 2012-05-04 09:30:08 BAC Y 7.8900 85053 F 7.88 108101 7.90
6 2012-05-04 09:30:09 BAC D 7.8901 10219 @ 7.89 268 7.90
I'm wondering if there's already a built-in function for this. Something that looks at each column of the trades
matrix and figures out whether it's a column of integers, doubles, etc and converts it to the appropriate type.
This is the sort of thing that a csv parser would do.