I'd like to create a new data table from the following vectors. I have two tables: a list of 100 people and a list of 5 tests. I want to combine them so that there is a row for each test for each subject, giving 500 rows in the new table. A simplified example is below. This is in R.

person  <- data.frame(c("a", "b", "c", "d"))
test <- data.frame(c("1", "2", "3", "4", "5"))

I want to make a new table from these, like the one below:

person   test
 a          1  
 a          2
 a          3
 a          4
 a          5
 b          1
 b          2
 b          3
 b          4
 b          5

... etc.

I initially tried joining, but there are no variables to match on. I also tried pasting, with no luck. I feel I'm missing something, because this sounds simple. Any help would be appreciated!

www
Numan Karim

2 Answers

Here you go.

expand.grid(person = c("a", "b", "c", "d"),
            test = c("1", "2", "3", "4", "5"))

To make it a data.table:

library(data.table)
library(magrittr)  # provides the %>% pipe

expand.grid(person = c("a", "b", "c", "d"),
            test = c("1", "2", "3", "4", "5")) %>%
  as.data.table()

This gives all possible pairings between person and test.

Also this is probably a duplicate of some post.
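
As a minimal sketch, the same idea can be applied to the `person` and `test` data frames from the question, assuming the values of interest sit in the first column of each. Because `expand.grid()` varies its first argument fastest, this sketch passes `test` first and then reorders the columns, which is one way (not the only one) to get the rows grouped by person as in the desired output. The names `grid` and `result` are just illustrative.

```r
# Example data frames from the question (one unnamed column each)
person <- data.frame(c("a", "b", "c", "d"))
test <- data.frame(c("1", "2", "3", "4", "5"))

# [[1]] extracts the first column as a plain vector.
# expand.grid() varies its first argument fastest, so test goes first.
grid <- expand.grid(test = test[[1]], person = person[[1]],
                    stringsAsFactors = FALSE)

# Put person first, as in the desired output
result <- grid[, c("person", "test")]
head(result)
```

With the 100 people and 5 tests described in the question, the same call should produce 500 rows.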

InfiniteFlash
  • Thanks for your response. I'm getting a warning: `Warning message: In format.data.frame(x, digits = digits, na.encode = FALSE) : corrupt data frame: columns will be truncated or padded with NAs` – Numan Karim Dec 28 '17 at 04:30
  • You should be able to coerce your data.frame to a `data.table()`. I don't know why that warning is happening – InfiniteFlash Dec 28 '17 at 20:00

We can also use CJ (cross join) from the data.table package. The output is a data.table.

library(data.table)

CJ(person[, 1], test[, 1])
#     V1 V2
#  1:  a  1
#  2:  a  2
#  3:  a  3
#  4:  a  4
#  5:  a  5
#  6:  b  1
#  7:  b  2
#  8:  b  3
#  9:  b  4
# 10:  b  5
# 11:  c  1
# 12:  c  2
# 13:  c  3
# 14:  c  4
# 15:  c  5
# 16:  d  1
# 17:  d  2
# 18:  d  3
# 19:  d  4
# 20:  d  5
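
A small variation, sketched under the assumption that the values sit in the first column of each data frame: naming the arguments to `CJ()` gives the result `person` and `test` columns instead of `V1` and `V2`. Using `[[1]]` rather than `[, 1]` always returns a plain vector, even when the input is a tibble. Whether that relates to the "it has dimensions" error reported in the comments is only a guess, since the real data is not shown. The name `combos` is illustrative.

```r
library(data.table)

# Example data frames from the question (one unnamed column each)
person <- data.frame(c("a", "b", "c", "d"))
test <- data.frame(c("1", "2", "3", "4", "5"))

# Named arguments become the column names of the result.
# [[1]] extracts the first column as a plain vector.
combos <- CJ(person = person[[1]], test = test[[1]])
combos
```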
www
  • Thanks for your response. I'm getting the error `Error in FUN(X[[i]], ...) : Invalid column: it has dimensions. Can't format it. If it's the result of data.table(table()), use as.data.table(table()) instead.` Do my columns need to be of specific types? My tests should be factors, I'd think, but what format should my person column be in? – Numan Karim Dec 28 '17 at 04:45
  • Please make sure the code works for your example data frames `person` and `test`. If it does, make sure your real-world data frames are similar to the `person` and `test` data frames you provided. – www Dec 28 '17 at 04:47