0

Is there a better way of reshaping dataframe data?

temp <- bdh(conn,c("AUDUSD Curncy","EURUSD Curncy"),"PX_LAST","20110101")

gives

head(temp)
         ticker       date PX_LAST
1 AUDUSD Curncy 2011-01-01      NA
2 AUDUSD Curncy 2011-01-02      NA
3 AUDUSD Curncy 2011-01-03  1.0205
4 AUDUSD Curncy 2011-01-04  1.0040
5 AUDUSD Curncy 2011-01-05  1.0014
6 AUDUSD Curncy 2011-01-06  0.9969

and

tail(temp)
            ticker       date PX_LAST
2127 EURUSD Curncy 2013-11-26  1.3557
2128 EURUSD Curncy 2013-11-27  1.3570
2129 EURUSD Curncy 2013-11-28  1.3596
2130 EURUSD Curncy 2013-11-29  1.3591
2131 EURUSD Curncy 2013-11-30      NA
2132 EURUSD Curncy 2013-12-01      NA

in other words, the data are just vertically tacked on to each other and further processing is necessary in order to get them working. how can i regroup this data into the various tickers, i.e.

head(temp)
           AUDUSD.Curncy EURUSD.Curncy
2011-01-01            NA            NA
2011-01-02            NA            NA
2011-01-03        1.0205        1.3375
2011-01-04        1.0040        1.3315
2011-01-05        1.0014        1.3183
2011-01-06        0.9969        1.3028

All the reshaping questions I googled didnt have the kind of reshaping I wanted. I have implemented my own piecemeal solution given below but for learning's sake I wanted to ask you guys if there is a more elegant solution for this?

Henrik
  • 65,555
  • 14
  • 143
  • 159
swyx
  • 2,378
  • 5
  • 24
  • 39
  • 1
    Please make it easier for people to help you by `dput` a relevant, small sample of your data. See [here how to easily create a **minimal, reproducible example**](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610) – Henrik Dec 01 '13 at 00:35
  • thanks, will do. didnt know this existed. – swyx Dec 01 '13 at 15:40
  • `dput` is especially useful for data like yours above, where you have space delimiters both between and within columns. That's why I had to remove the " Curncy" part. On the same page you also find `reproduce`, which gives you some more control when you -yes- 'reproduce' your data for SO questions. Cheers. – Henrik Dec 01 '13 at 19:22

3 Answers3

5

You could try read.zoo. Use index.column to specify in which column index/time is stored, and reshape data according to splitcolumnn, . The result is a zoo time series

library(zoo)

z <- read.zoo(text = "ticker     date PX_LAST
1 AUDUSD 2011-01-01      NA
2 AUDUSD  2011-01-02      NA
3 AUDUSD 2011-01-03  1.0205
4 AUDUSD 2011-01-04  1.0040
5 AUDUSD  2011-01-05  1.0014
6 AUDUSD 2011-01-06  0.9969
2127 EURUSD  2013-11-26  1.3557
2128 EURUSD  2013-11-27  1.3570
2129 EURUSD  2013-11-28  1.3596
2130 EURUSD  2013-11-29  1.3591
2131 EURUSD  2013-11-30      NA
2132 EURUSD  2013-12-01      NA", index.column = "date", split = "ticker")

z
#            AUDUSD EURUSD
# 2011-01-01     NA     NA
# 2011-01-02     NA     NA
# 2011-01-03 1.0205     NA
# 2011-01-04 1.0040     NA
# 2011-01-05 1.0014     NA
# 2011-01-06 0.9969     NA
# 2013-11-26     NA 1.3557
# 2013-11-27     NA 1.3570
# 2013-11-28     NA 1.3596
# 2013-11-29     NA 1.3591
# 2013-11-30     NA     NA
# 2013-12-01     NA     NA

str(z)
Henrik
  • 65,555
  • 14
  • 143
  • 159
  • thanks very much, that is exactly what I was looking for, an elegant solution. – swyx Dec 01 '13 at 15:39
  • Glad to hear that you found my answer helpful. Yes, `read.zoo` is a really versatile function. Check out the `zoo` vignettes (e.g. `browseVignettes("zoo")`), heaps of nice stuff in addition to the 'ordinary' help texts. One especially dedicated to reading data in various forms. Good luck! – Henrik Dec 01 '13 at 19:16
1

This is exactly why we have created the RbbgExtension package. It is a wrapper around the Rbbg package that handles many issues when dealing with financial data - issues we have come across in our daily work with backtesting trading strategies etc. for a financial institution.

As you can see the output is a xts object, but if the query is across multiple tickers and multiple fields, then the output will an array - but you can read about why that is in the documentation.

We have made the package open source and publicly available on GitHub. Just use Hadley's devtools' function install_github("pgarnry/RbbgExtension") to get the package. It has a few dependencies including "Rbbg".

> require(RbbgExtension)
Loading required package: RbbgExtension
> 
> tickers <- c("AUDUSD", "EURUSD")
> 
> prices <- HistData(tickers = tickers,
+                    type = "Curncy",
+                    fields = "PX_LAST",
+                    startdate = "20110101")
R version 3.1.2 (2014-10-31) 
rJava Version 0.9-6 
Rbbg Version 0.5.3 
Java environment initialized successfully.
Looking for most recent blpapi3.jar file...
Adding C:\blp\API\APIv3\JavaAPI\v3.7.1.1\lib\blpapi3.jar to Java classpath
Bloomberg API Version 3.7.1.1 
> class(prices)
[1] "xts" "zoo"
> head(prices)
           AUDUSD EURUSD
2011-01-03 1.0168 1.3361
2011-01-04 1.0051 1.3308
2011-01-05 0.9995 1.3149
2011-01-06 0.9944 1.3003
2011-01-07 0.9959 1.2907
2011-01-10 0.9956 1.2951
> tail(prices)
           AUDUSD EURUSD
2015-01-26 0.7925 1.1238
2015-01-27 0.7937 1.1381
2015-01-28 0.7889 1.1287
2015-01-29 0.7762 1.1320
2015-01-30 0.7762 1.1291
2015-02-02 0.7806 1.1351
P. Garnry
  • 344
  • 1
  • 6
  • 12
-1

rbbg's blh (now bdh) is dumb. this outputs time series correctly.

bdhx <- function(conn,securities,start_date,end_date=NULL,fields="PX_LAST",override_fields = NULL,overrides = NULL) {
  temp <- bdh(conn=conn,securities=securities,fields=fields,start_date=start_date,end_date=end_date,override_fields=override_fields)
  if (colnames(temp)[1]=="date")
    {temp <- as.xts(temp)[,-1];colnames(temp) <- securities;res <- temp;}
  else
    {cn <- unique(temp[,1]);fil <- temp[,1]==cn[1];
     res <- xts(temp[fil,3],as.Date(temp[fil,2]));colnames(res) <- securities[1];
         for (i in 4:(length(cn)+2)){
          fil <- temp[,1]==cn[i-2]
          temp2 <- xts(temp[fil,3],as.Date(temp[fil,2]));colnames(temp2) <- securities[i-2];
          res <- merge.xts(res,temp2)}
     }
  res}
Community
  • 1
  • 1
swyx
  • 2,378
  • 5
  • 24
  • 39