-1

I have this data frame:

data <- data.frame(id=sample(1:10,2000,replace = T),value=sample(100:10000,2000,replace = T))
> head(data)
  id value
1  4  2032
2  3  2512
3  9  8925
4  8  8527
5  6  5176
6  9  8182

Now I want value for each id as colnames and the values are to be rows that correspond to the id.

What I want is not to summarise but to group the values according to id and need to convert the id into columns.

jay.sf
  • 60,139
  • 8
  • 53
  • 110
sai saran
  • 737
  • 9
  • 32

4 Answers4

1

This should work:

library(tidyverse)

data %>% 
  group_by(id = paste("id", id, sep = "_")) %>%
  mutate(rn = row_number()) %>%
  spread(id, value) %>%
  select(-rn)

Output (first 10 rows):

    id_1 id_10  id_2  id_3  id_4  id_5  id_6  id_7  id_8  id_9
   <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
 1  8161   576  4921  5965  8969  8419  7898  5724  6513  7475
 2  8526  8121  5200  7847  4033  9348  5051  4430  9320  2973
 3  4587  4505  1747  6179  6358   234  5649  5780  3579  4986
 4  2609  9058  5709  4284  4068   523  9156  3253  6753  5570
 5  1261  4533  5954  7703  2460  2171  4196  7576  7118  8702
 6  3125  8303  2364  9305  9094  1211  3439  8201  5268  6794
 7  3464   657  2917  4831  6154  3125  9964  9324  1917  7439
 8  6601  2297  4163  7866  6701  6336   262  6725  7646  5361
 9  3042  4296  9312  8990   366  5891  3984  4675  7289  9549
10  4829  5565  8841   775  5482  9519  1084  1845  4735  3467
# ... with 203 more rows

The tail of the dataset looks like:

   id_1 id_10  id_2  id_3  id_4  id_5  id_6  id_7  id_8  id_9
  <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1  2723    NA    NA    NA    NA    NA    NA    NA  7147    NA
2  7746    NA    NA    NA    NA    NA    NA    NA  1809    NA
3  4281    NA    NA    NA    NA    NA    NA    NA  8140    NA
4    NA    NA    NA    NA    NA    NA    NA    NA  6564    NA
5    NA    NA    NA    NA    NA    NA    NA    NA  6001    NA
6    NA    NA    NA    NA    NA    NA    NA    NA  3471    NA
arg0naut91
  • 14,574
  • 2
  • 17
  • 38
  • thanks for the response but i dont want to summarise and added the little bit information in title....i want to group the values according to id and need to convert the id as columns – sai saran Nov 10 '18 at 10:14
  • In this case just skip the summarise part? – arg0naut91 Nov 10 '18 at 10:15
  • simple words ...below thread have clear example https://stackoverflow.com/questions/53236989/excel-how-do-i-arrange-my-data-into-the-form-i-want – sai saran Nov 10 '18 at 10:19
  • @saisaran, in this case take a look at my edit, it should work fine now. – arg0naut91 Nov 10 '18 at 10:33
1

First split partial data frames by ID into a temporary list.

ls1 <- lapply(sort(unique(data$id)), function(x) data[data$id == x, ])

Second, number the values for each ID and summarize everything back into the original data frame structure.

data <- do.call(rbind, 
                lapply(1:(length(ls1)), 
                       function(x) transform(ls1[[x]], 
                                             time=1:length(ls1[[x]][[1]]))))
rm(ls1)  # remove tmp list

Finally use reshape().

result <- reshape(data, idvar="time", timevar="id", direction="wide")

Yields:

> head(result)
   time value.1 value.2 value.3 value.4 value.5 value.6 value.7 value.8 value.9 value.10
25    1    8097    8445    7029    3001    2823    7371    8359    6504    8902     9901
35    2     565    6701    6765    1187     116    9527    1680    3701    8514     4441
37    3    5383    5311    1073    9261    7899    6894    2297    1335    2910     5700
43    4    4885    6716    1608    6547    7379    5821    1295     866     702     8029
55    5    7721    8430    5324    6937     195    5758    1704    8017    9744     2062
71    6    4537    7004    8477    2071    9130    2072    4455    6628    6076     3888

> dim(result)
[1] 226  11

Data:

set.seed(42)
data <- data.frame(id=sample(1:10, 2000, replace=TRUE),
                   value=sample(100:10000, 2000, replace=TRUE))
jay.sf
  • 60,139
  • 8
  • 53
  • 110
0

The thing is you have to have a unique id values i.e. columns can not be duplicated.

library(tidyr)

set.seed(999)
data<-data.frame(id=sample(1:10,2000,replace = T),value=sample(100:10000,2000,replace = T))

# reshape to wide format
oo <- data %>% 
  distinct(id, .keep_all = TRUE) %>% 
  spread(id, value)

# rename columns, add prefix 'id'
colnames(oo) <- sapply(colnames(oo), function(x) paste0("id_", x))

Output

  id_1 id_2 id_3 id_4 id_5 id_6 id_7 id_8 id_9 id_10
1 9850 9160  407 4846 6612 9174 8294 1277 8854  9941
Aleksandr
  • 1,814
  • 11
  • 19
0

The first step is to create a list, where each element corresponds to one id:

l <- tapply(data$value, data$id, list)
l["2"]
# $`2`
#   [1] 3961 2644 4194 3630 2485  353 6801 4487 9770 5793 9291 7071 1842
#  [14] 1970 6200 6499 4067 2968 3879 1677 3964 4934 5891 7502 7333 7742
#  ....

Actually, for most purposes it would be recommended to use this data structure rather than what you are asking in your question. Given that, we have multiple vector of unequal length which we want to cbind. There have been multiple proposals how to do that (see, e.g., here). For instance,

library(qpcR)
result <- do.call(qpcR:::cbind.na, l)
head(result, 2)
#         1    2    3    4    5    6    7    8    9   10
# [1,] 3118 6938 2360 9680 1540 4900 1427  680 3020 3824
# [2,] 4430 9265 4275 3689  624 6713  196 4605 9439  190
tail(result, 2)
#         1  2  3  4    5  6  7  8  9 10
# [212,] NA NA NA NA 1775 NA NA NA NA NA
# [213,] NA NA NA NA 9398 NA NA NA NA NA
Julius Vainora
  • 47,421
  • 9
  • 90
  • 102