5

I have this data which is basically a matrix but is in data.table format:

Lets call this matA:

Date    compA compB compC 
200101      1     2     3 
200102      2     4     1    

How do I make the matrix into a database type in data.table?

Lets call this dataset dtB:

Date    Company    Data
200101    compA       1
200101    compB       2
200101    compC       3
200102    compA       2
200102    compB       4
200102    compC       1

I have thought about a very stupid way to make a list of companies:

comp= as.data.table(c("compA", "compB", "compC"))
date= as.data.table(rep(matA[1,1],3))
data= as.data.table(matA[1,])

dtb= date[,Company := comp]
dtb= dtb[, Data := data]

And obviously I can at most get the data in date 200101. To get 200102, I write a for loop. Then, merge both 200101 and 200102 into dtB according to the colnames.

Is there a clever way to do this in data.table? Many thanks

Gabriel
  • 423
  • 6
  • 21
  • How do you guys seems to remember all the functions?? I am always trial and error... That works btw – Gabriel Jan 22 '18 at 19:44
  • 1
    Many thanks by the way! More trial and error! – Gabriel Jan 22 '18 at 19:49
  • 1
    I think we're all in trial and error phase. Some of us have just spent so much time on the error side that we are starting to get faster at moving to the trial side. :) – Benjamin Jan 22 '18 at 19:50
  • 1
    It also helps to search for the right terms on google/SO. For example, this would be a "reshape from wide to long format" type of question, which would give you SO questions like [this](https://stackoverflow.com/questions/2185252/reshaping-data-frame-from-wide-to-long-format). – acylam Jan 22 '18 at 19:52
  • Yes the problem is I constantly on stackoverflow, but it always hard to search what you need with the right term! – Gabriel Jan 22 '18 at 19:55
  • The data.table package has a `melt` command as well. – G. Grothendieck Jan 22 '18 at 20:02

2 Answers2

3

If you already have a true matrix, as in:

matA <- 
  as.matrix(
    read.table(text=
      "Date    compA compB compC 
      200101      1     2     3 
      200102      2     4     1",
      header=TRUE,
      row.names=1)
  )
names(dimnames(matA)) <- c('Date','Company')
matA
#        Company
#  Date   compA compB compC
#  200101     1     2     3
#  200102     2     4     1

...then you can use the base as.data.frame( as.table(x) ).

as.data.frame(
  as.table(matA),
  responseName = 'Data'
)

#     Date Company Data
# 1 200101   compA    1
# 2 200102   compA    2
# 3 200101   compB    2
# 4 200102   compB    4
# 5 200101   compC    3
# 6 200102   compC    1

No dependencies required.

This takes advantage of as.data.frame() method for the table class:

The as.data.frame method for objects inheriting from class "table" can be used to convert the array-based representation of a contingency table to a data frame containing the classifying factors and the corresponding entries (the latter as component named by responseName). This is the inverse of xtabs.

Of course, if you want, you can do as.data.table( as.table(x) ), or do a melt.

C8H10N4O2
  • 18,312
  • 8
  • 98
  • 134
2

Base R

data.frame(Date = matA[,"Date"],
           Company = rep(colnames(matA)[-1], each = NROW(matA)),
           Data = as.vector(matA[,-1]))
#    Date Company Data
#1 200101   compA    1
#2 200102   compA    2
#3 200101   compB    2
#4 200102   compB    4
#5 200101   compC    3
#6 200102   compC    1

data.table

library(data.table)
melt(data = as.data.table(matA),
     id.vars = "Date",
     variable.name = "Company",
     value.name = "Data")
#     Date Company Data
#1: 200101   compA    1
#2: 200102   compA    2
#3: 200101   compB    2
#4: 200102   compB    4
#5: 200101   compC    3
#6: 200102   compC    1

DATA

matA = structure(c(200101L, 200102L, 1L, 2L, 2L, 4L, 3L, 1L), .Dim = c(2L, 
4L), .Dimnames = list(NULL, c("Date", "compA", "compB", "compC")))
d.b
  • 32,245
  • 6
  • 36
  • 77