0

Data frame :

200.01  117:10520   227137.56097561
200.01  155:24  227137.56097561
200.01  265:47  227137.56097561
200.01  266:37  227137.56097561
200.01  281:568 227137.56097561
200.01  282:246 227137.56097561
200.31  190:3374    227360
200.56  110:1261    227545.365853659
200.56  186:571 227545.365853659
200.66  114:969 227619.512195122
200.66  118:3886    227619.512195122

The data is the one presented. The question : I want to make one row from the duplicated columns. Example :

200.01  117:10520 155:24 265:47  266:37 281:568 282:246 227137.56097561
alap
  • 646
  • 1
  • 11
  • 24
  • do you want to have them all appear in one column? or in many columns? it sounds like you might instead want a list where each element can be any length – Justin Nov 29 '12 at 17:31
  • I want it like in the example. So basically the example would be 1 row and after the algorithm it would be like 4 rows with : 200.01 200.31 200.56 200.66. And i would create a new frame – alap Nov 29 '12 at 17:34
  • 1
    I think that @Justin's question was "should 117:10520 ... 282:246" be one column or six columns. This looks to me like a very simple application of `aggregate()` if a single column is desired. – A5C1D2H2I1M1N2O1R2T1 Nov 29 '12 at 17:36
  • @mrdwab correct. especially if the number of duplicate entries can vary substantially you'll be better served with a named list rather than a many columned "sparse" data.frame. – Justin Nov 29 '12 at 17:37
  • It should be one column. Yeah. Can you propose a solution? – alap Nov 29 '12 at 17:46

1 Answers1

2

As mentioned in the comments, this is a pretty straightforward aggregate question:

Your data:

dat <- read.table(header = FALSE, stringsAsFactors=FALSE, text = "
                    200.01  117:10520   227137.56097561
                    200.01  155:24  227137.56097561
                    200.01  265:47  227137.56097561
                    200.01  266:37  227137.56097561
                    200.01  281:568 227137.56097561
                    200.01  282:246 227137.56097561
                    200.31  190:3374    227360
                    200.56  110:1261    227545.365853659
                    200.56  186:571 227545.365853659
                    200.66  114:969 227619.512195122
                    200.66  118:3886    227619.512195122")

Two options for aggregation. In the first one, V2 is a list. In the second option, V2 is a character string.

aggregate(V2 ~ V1 + V3, dat, c)
#       V1       V3                                                  V2
# 1 200.01 227137.6 117:10520, 155:24, 265:47, 266:37, 281:568, 282:246
# 2 200.31 227360.0                                            190:3374
# 3 200.56 227545.4                                   110:1261, 186:571
# 4 200.66 227619.5                                   114:969, 118:3886
aggregate(V2 ~ V1 + V3, dat, paste, collapse=" ")
#       V1       V3                                             V2
# 1 200.01 227137.6 117:10520 155:24 265:47 266:37 281:568 282:246
# 2 200.31 227360.0                                       190:3374
# 3 200.56 227545.4                               110:1261 186:571
# 4 200.66 227619.5                               114:969 118:3886

See also: R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate


If multiple columns are required, you might still want to aggregate and then split the columns up later using a custom function. One example function is tableFlatten shared by @RicardoSaporta, which will create as many columns as your longest list item. But, as @Justin mentioned in the comments, a list might be more useful depending on what you're trying to do.

dat2 <-  aggregate(V2 ~ V1 + V3, dat, c)
(dat2 <- tableFlatten(dat2))
#       V1       V3     V2.01    V2.02  V2.03  V2.04   V2.05   V2.06
# 1 200.01 227137.6 117:10520   155:24 265:47 266:37 281:568 282:246
# 2 200.31 227360.0  190:3374                                       
# 3 200.56 227545.4  110:1261  186:571                              
# 4 200.66 227619.5   114:969 118:3886 
Community
  • 1
  • 1
A5C1D2H2I1M1N2O1R2T1
  • 190,393
  • 28
  • 405
  • 485