
Given, for example, the Orange data set, I would like to arrange the observations in a matrix in which the circumference measurements taken on each tree form one row (5 rows in total).

One unsatisfactory way of obtaining this result is as follows:

mat <- matrix(Orange[, 3], nrow = 5, ncol = 7, byrow = TRUE, dimnames = list(as.character(unique(Orange$Tree)), 1:7))
mimmo970
  • This might be of interest: [How to reshape data from long to wide format](https://stackoverflow.com/questions/5890584/how-to-reshape-data-from-long-to-wide-format) – harre Aug 31 '22 at 12:28

2 Answers


An alternative is the dcast( ) function from the data.table package, which converts data from long to wide format. Here I've created an ID that counts the records within each Tree.

In the re-shaped data, Tree becomes our primary column and circumference is recorded in 7 unique columns (one for each age).

library(data.table)

Orange <- data.table(Orange)[, ID := seq_len(.N), by = Tree]

Orange2 <- dcast(
  data = Orange,
  formula = Tree ~ ID,
  value.var = "circumference")

Orange2

   Tree  1  2   3   4   5   6   7
1:    3 30 51  75 108 115 139 140
2:    1 30 58  87 115 120 142 145
3:    5 30 49  81 125 142 174 177
4:    2 33 69 111 156 172 203 203
5:    4 32 62 112 167 179 209 214

EDIT (in response to additional comments/questions): Technically the data is already ordered by Tree. The variable Tree is an ordered factor with preset levels, and the rows follow that level order rather than the numeric order of the labels. To order numerically, there are two options: (1) order by as.character( ), or (2) re-level the variable.

Orange2[order(as.character(Tree)), ]
   Tree  1  2   3   4   5   6   7
1:    1 30 58  87 115 120 142 145
2:    2 33 69 111 156 172 203 203
3:    3 30 51  75 108 115 139 140
4:    4 32 62 112 167 179 209 214
5:    5 30 49  81 125 142 174 177

class(Orange$Tree)
[1] "ordered" "factor"

levels(Orange$Tree)
[1] "3" "1" "5" "2" "4"

Orange2[,Tree := factor(Tree, c("1","2","3","4","5"), ordered = FALSE)]

Orange2[order(Tree),]
   Tree  1  2   3   4   5   6   7
1:    1 30 58  87 115 120 142 145
2:    2 33 69 111 156 172 203 203
3:    3 30 51  75 108 115 139 140
4:    4 32 62 112 167 179 209 214
5:    5 30 49  81 125 142 174 177
RyanF
  • Thanks, this certainly works. I was wondering if there was a way to 1) obtain the same outcome with base R (but without loops) and 2) have the rows sorted according to Tree number – mimmo970 Aug 31 '22 at 07:43
  • @mimmo970, It's likely this could be done in base R using the `reshape( )` function, but I am less familiar with that function. Are you not able to access the `data.table` package? Also, I edited the answer above to provide insight into your question on sorting the data. – RyanF Aug 31 '22 at 12:00

In base R, you could simply do:

aggregate(circumference ~ Tree, Orange, I)

If you don't want to order it afterwards: `aggregate(circumference ~ as.character(Tree), Orange, I)` (this strips the factor ordering, so the rows come out sorted by tree number).
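Since the question asks for a matrix, it may help to see how one could be pulled out of the `aggregate()` result. The following is only a sketch: it assumes the `circumference` column of the result is a matrix column (which is how `aggregate()` simplifies equal-length group results), and the names `res` and `mat` are just illustrative.

```r
# Orange ships with base R (package 'datasets'), so no extra packages are needed
res <- aggregate(circumference ~ Tree, data = Orange, FUN = I)

# Assumption: the aggregated column is itself a 5 x 7 matrix, one row per tree
mat <- res$circumference

# Label each row with its tree, then sort rows by the tree label
rownames(mat) <- as.character(res$Tree)
mat <- mat[order(rownames(mat)), ]
mat
```

The rows are labelled before sorting, so the result does not depend on the order in which `aggregate()` returns the groups.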

Or similar to @RyanF:

Orange$id <- sequence(rle(as.character(Orange$Tree))$lengths)

reshape(Orange[,-2],
        idvar = "Tree",
        timevar = "id",
        direction = "wide")

Output:

   Tree circumference.1 circumference.2 circumference.3 circumference.4 circumference.5 circumference.6 circumference.7
1     1              30              58              87             115             120             142             145
8     2              33              69             111             156             172             203             203
15    3              30              51              75             108             115             139             140
22    4              32              62             112             167             179             209             214
29    5              30              49              81             125             142             174             177
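If a plain matrix with rows sorted by tree number is the goal, another base R route (not part of the answers above, just a possible sketch) is `split()` followed by `rbind()`. It assumes every tree has the same number of measurements and that they appear in age order within each tree, which holds for Orange; `by_tree` and `mat` are illustrative names.

```r
# Group the circumferences by tree; as.character() drops the factor's preset
# level order, so the groups come out sorted as "1", "2", ..., "5"
by_tree <- split(Orange$circumference, as.character(Orange$Tree))

# Stack the per-tree vectors as rows; the list names become the row names
mat <- do.call(rbind, by_tree)

# Number the columns by measurement occasion
colnames(mat) <- seq_len(ncol(mat))
mat
```

Unlike `matrix(..., byrow = TRUE)`, this does not rely on the rows of Orange being grouped by tree, only on the order within each tree.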
harre