2

I'm having trouble viewing a data table in a tabular format. If I use the PCRA package and export a dataset as an xts object (see the minimal example below), the xts object resembles a table:

class(stocksDat)
[1] "xts" "zoo"
> head(stocksDat[, 1:3])
                   AAN         ABM         ABT
1993-01-31  0.28947368 -0.11250000 -0.08971193
1993-02-28 -0.04081633  0.14893617 -0.05000000
1993-03-31 -0.08510638  0.01234568 -0.01435407
1993-04-30 -0.04651163  0.03658536  0.08912621
1993-05-31  0.08829268 -0.08875740 -0.02690583
1993-06-30 -0.01123596 -0.10389610 -0.05529954

But if I export it as a data.table (second part of minimal example), I get a 3 column output:

> class(stocksDat2)
[1] "data.table" "data.frame"
> stocksDat2
       TickerLast       Date      Return
    1:        AAN 1993-01-31  0.28947368
    2:        ABM 1993-01-31 -0.11250000
    3:        ABT 1993-01-31 -0.08971193
    4:       ADBE 1993-01-31  0.31999999
    5:        ADI 1993-01-31  0.01538462
   ---                                  
81140:        WGO 2015-12-31 -0.11555557
81141:        WHR 2015-12-31 -0.09629589
81142:        WMT 2015-12-31  0.05013595
81143:        WTS 2015-12-31 -0.10520627
81144:        XOM 2015-12-31 -0.04543236

What should I be doing to get the data.table to print just like the xts object, with a column of dates followed by columns of returns for each of the tickers? Converting the data.table to a data. Frame does not help. Is the PCRA output incorrectly formatted?

Sincerely and with many thanks in advance

Thomas Philips

Minimal example: Install package PCRA from CRAN

library(PCRA)
library(xts)

stockItems <- c("Date","TickerLast","Return")
# No need to specify begin and end dates if you want all

# Get the data as an xts object (the default)
stocksDat  <- selectCRSPandSPGMI("monthly",
          stockItems   = stockItems,
          factorItems  = NULL,
          subsetType   = NULL,
          subsetValues = NULL,
          outputType   = "xts")

names(stocksDat)
dim(stocksDat)
head(stocksDat[,1:3])
range(index(stocksDat)) # To see begin and end dates

# Now for a data.table output
stocksDat2  <- selectCRSPandSPGMI("monthly",
                        stockItems   = stockItems,
                        factorItems  = NULL,
                        subsetType   = NULL,
                        subsetValues = NULL,
                        outputType   = "data.table")
names(stocksDat2)
class(stocksDat2)
dim(stocksDat2)
head(stocksDat2)
Thomas Philips
  • 935
  • 2
  • 11
  • 22
  • 3
    According to ots help page, section Value, function `selectCRSPandSPGMI` returns "Either a *multivariate* xts object" or a data.table. I interpret multivariate as being wide format and the data.table is in the long format. So this is documented behavior (though not completely clear in the docs). – Rui Barradas Aug 24 '23 at 14:30
  • Thank you Rui, this makes everything clear. I have in the past converted between long and wide formats using the reshape2 format, but this is now deprecated. It may be necessary to modify the function or to offer the user a choice between long and wide formats when exporting. In my mind, everything should be exported in a wide format, but it is entirely possible that the long format has some advantages that I am not aware of. – Thomas Philips Aug 24 '23 at 14:48
  • You could use `as.data.table(stocksDat)` to convert the xts object to a wide data.table. I don't like working with time series in long format, but that's the preferred format for tidyverse-style functions and why it might be provided. Long-format will be more memory efficient for high-frequency data with lots of series (e.g. tick data) because most series won't have an observation at the same time. – Joshua Ulrich Aug 24 '23 at 15:03
  • See here [reshape data from long to wide in R](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) and here [Reshaping data.frame from wide to long format](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format). My favorites are package `tidyr`'s `pivot_wider` and `pivot_longer`. – Rui Barradas Aug 24 '23 at 17:26
  • 1
    Thank you Joshua. as.data.table() from the data.table library works perfectly. – Thomas Philips Aug 26 '23 at 02:14

1 Answers1

1

To reshape your data.table it should be as easy as:

stocksDat2[, tapply(Return, TickerLast, list), Date]

Which reads as:

  • By Date
    • Take all Return
    • Group it by TickerLast
    • And return a list

As list elements in the j slot of a data.table become new columns you get exactly one column for each TickerLast.

thothal
  • 16,690
  • 3
  • 36
  • 71