2

I need to use 'PerformanceAnalytics' package of R and to use this package, it requires me to convert the data into xts data. The data can be downloaded from this link: https://drive.google.com/file/d/0B8usDJAPeV85elBmWXFwaXB4WUE/edit?usp=sharing . Hence, I have created an xts data by using the following commands:

data<-read.csv('monthly.csv')
dataxts <- xts(data[,-1],order.by=as.Date(data$datadate,format="%d/%m/%Y"))

But after doing this, it looses the panel data structure. I tried to sort the xts data to get it back in panel data form but failed.

Can anyone please help me to reorganize the xts data to look like a panel data. I need to sort them by firm id (gvkey) and data(datadate).

cookie monster
  • 10,671
  • 4
  • 31
  • 45
Jairaj Gupta
  • 347
  • 4
  • 16
  • 2
    Hi and welcome to SO! Please update your post with a [**minimal**, reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-exam.ple/5963610#5963610). It is not very tempting to download a 152.8 MB file...Cheers. – Henrik Jan 27 '14 at 23:32

2 Answers2

2

xts objects are sorted by time index only. They cannot be sorted by anything else.

I would encourage you to split your data.frame into a list, by gvkey. Then convert each list element to xts and remove the columns that do not vary across time, storing them as xtsAttributes. You might also want to consider using the yearmon class, since you're dealing with monthly data.

You will have to determine how you want to encode non-numeric, time-varying values, since you cannot mix types in xts objects.

Data <- read.csv('monthly.csv', nrow=1000, as.is=TRUE)
DataList <- split(Data, Data$gvkey)
xtsList <- lapply(DataList, function(x) {
  attrCol <- c("iid","tic","cusip","conm","exchg","secstat","tpci",
    "cik","fic","conml","costat","idbflag","dldte")
  numCol <- c("ajexm","ajpm","cshtrm","prccm","prchm","prclm",
    "trfm", "trt1m", "rawpm", "rawxm", "cmth", "cshom", "cyear")
  toEncode <- c("isalrt","curcdm")
  y <- xts(x[,numCol], as.Date(x$datadate,format="%d/%m/%Y"))
  xtsAttributes(y) <- as.list(x[1,attrCol])
  y
})

Each list element is now an xts object, and is much more compact, since you do not repeat completely redundant data. And you can easily run analysis on each gvkey via lapply and friends.

> str(xtsList[["1004"]])
An ‘xts’ object on 1983-01-31/2012-12-31 containing:
  Data: num [1:360, 1:13] 3.38 3.38 3.38 3.38 3.38 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:13] "ajexm" "ajpm" "cshtrm" "prccm" ...
  Indexed by objects of class: [Date] TZ: UTC
  xts Attributes:  
List of 13
 $ iid    : int 1
 $ tic    : chr "AIR"
 $ cusip  : int 361105
 $ conm   : chr "AAR CORP"
 $ exchg  : int 11
 $ secstat: chr "A"
 $ tpci   : chr "0"
 $ cik    : int 1750
 $ fic    : chr "USA"
 $ conml  : chr "AAR Corp"
 $ costat : chr "A"
 $ idbflag: chr "D"
 $ dldte  : chr ""

And you can access the attributes via xtsAttributes:

> xtsAttributes(xtsList[["1004"]])$fic
[1] "USA"
> xtsAttributes(xtsList[["1004"]])$tic
[1] "AIR"
Joshua Ulrich
  • 173,410
  • 32
  • 338
  • 418
  • Thanks a lot Joshua. I will try it and get back if I need additional help. – Jairaj Gupta Jan 30 '14 at 10:26
  • Dear Joshua, I have tried your suggestion and its working fine, but beyond that I am getting stuck. Can you please tell me how can I calculate returns using Return.Calculate function of 'PerformanceAnalytics' package and save the output as a variable (monthly return) to my dataframe? Subsequently I also need to calculate the various downside risk measures. – Jairaj Gupta Jan 30 '14 at 11:26
  • prccm is the monthly closing price in the dataset. – Jairaj Gupta Jan 30 '14 at 11:32
0

An efficient way to achieve this goal is to covert the Panel Data (long format) into wide format using 'reshape2' package. After performing the estimations, convert it back to long format or panel data format. Here is an example:

library(foreign)
library(reshape2)
dd <- read.dta("DDA.dta") // DDA.dta is Stata data; keep only date, id and variable of interest (i.e. three columns in total)
wdd<-dcast(dd, datadate~gvkey) // gvkey is the id
require(PerformanceAnalytics)
wddxts <- xts(wdd[,-1],order.by=as.Date(wdd$datadate,format= "%Y-%m-%d"))

ssd60A<-rollapply(wddxts,width=60,SemiDeviation,by.column=TRUE,fill=NA) // e.g of rolling window calculation
ssd60A.df<-as.data.frame(ssd60A.xts) // convert dataframe to xts
ssd60A.df$datadate=rownames(ssd60A.df) // insert time index
lssd60A.df<-melt(ssd60A.df, id.vars=c('datadate'),var='gvkey') // convert back to panel format
write.dta(lssd60A.df,"ssd60A.dta",convert.factors = "string") // export as Stata file

Then simply merge it with the master database to perform some regression.

Jairaj Gupta
  • 347
  • 4
  • 16