0

I want to create a table that summarizes the countries in my sample. The table should have exactly two rows: the first row holds one column per region, and the second row lists the countries located in each region.

To give you an example, this is what my data.frame XYZ looks like:

    wvs5red2.s003names  wvs5red2.regiondummies
21  "Hong Kong"         Asian Tigers
45  "South Korea"       Asian Tigers
49  "Taiwan"            Asian Tigers
66  "China"             East Asia & Pacific
80  "Indonesia"         East Asia & Pacific
86  "Malaysia"          East Asia & Pacific

My aim is to obtain a table that looks similar to this:

region     Asian Tigers                    East Asia & Pacific
countries  Hong Kong, South Korea, Taiwan  China, Indonesia, etc.

Do you have any idea how to obtain such a table? I spent hours searching for something similar.

Tobias
  • Your data.frame hurts my eyes. Please take the advice given here: http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Joris Meys May 12 '11 at 13:46
  • @Joris yes, mine too, sorry for that and thanks for the link. – Tobias May 12 '11 at 15:24

3 Answers

4

The simplest way is tapply:

XYZ <- structure(list(
    names = structure(c(2L, 5L, 6L, 1L, 3L, 4L), .Label = c("China", "Hong Kong", "Indonesia", "Malaysia", "South Korea", "Taiwan"), class = "factor"),
    region = structure(c(1L, 1L, 1L, 2L, 2L, 2L), .Label = c("Asian Tigers", "East Asia & Pacific"), class = "factor")),
    .Names = c("names", "region"), row.names = c(NA, -6L), class = "data.frame")

tapply(XYZ$names, XYZ$region, paste, collapse=", ")
#                     Asian Tigers              East Asia & Pacific 
# "Hong Kong, South Korea, Taiwan"     "China, Indonesia, Malaysia" 
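
The asker's comment on this answer mentions that a named array is less convenient than a table with one column per region. As a rough sketch only (not part of the original answer), the same grouping can be turned into a one-row data.frame in base R. The names `by_region` and `wide` are made up for illustration, and the data is rebuilt as plain character columns:

```r
# Rebuild the example data as plain character columns (assumption: no factors needed).
XYZ <- data.frame(
  names  = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
  region = rep(c("Asian Tigers", "East Asia & Pacific"), each = 3),
  stringsAsFactors = FALSE
)

# Split the country names by region, then collapse each group into one string.
by_region <- vapply(split(XYZ$names, XYZ$region), paste, character(1), collapse = ", ")

# Build a one-row data.frame; check.names = FALSE keeps the spaces and "&"
# in the region names instead of converting them to syntactic names.
wide <- do.call(
  data.frame,
  c(as.list(by_region), check.names = FALSE, stringsAsFactors = FALSE)
)
wide
```

The result has the regions as column names and the comma-separated countries in its single row, which is the layout asked for in the question.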
Marek
  • also great, thanks, same problem with Andrie's solution that I can only vote one best answer (and the array I obtain when implementing your solution does not correspond as nicely with toLatex from the memisc package as mpiktas' code does) – Tobias May 12 '11 at 15:39
3

Recreate the data:

dat <- data.frame(
    country = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
    region = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3))
)
dat

      country              region
1   Hong Kong        Asian Tigers
2 South Korea        Asian Tigers
3      Taiwan        Asian Tigers
4       China East Asia & Pacific
5   Indonesia East Asia & Pacific
6    Malaysia East Asia & Pacific

Use ddply in package plyr combined with paste to summarise the data:

library(plyr)
ddply(dat, .(region), function(x)paste(x$country, collapse= ","))

               region                           V1
1        Asian Tigers Hong Kong,South Korea,Taiwan
2 East Asia & Pacific     China,Indonesia,Malaysia
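
If adding plyr as a dependency is not desirable, a similar summary can be sketched with base R's aggregate. This is an alternative to the ddply call above, not the answer's own method; the name `res` is made up for illustration:

```r
# Same example data as above, built with character columns.
dat <- data.frame(
  country = c("Hong Kong", "South Korea", "Taiwan", "China", "Indonesia", "Malaysia"),
  region  = c(rep("Asian Tigers", 3), rep("East Asia & Pacific", 3)),
  stringsAsFactors = FALSE
)

# For each region, paste the country names together;
# extra arguments such as collapse are passed on to FUN.
res <- aggregate(country ~ region, data = dat, FUN = paste, collapse = ",")
res
```

Like the ddply result, this keeps one row per region, so it still needs reshaping if the regions should appear as column names.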
Andrie
  • works also perfectly, thank you. (mpiktas' solution had the minimal advantage of displaying the regions as column names and I could only flag one answer as best...) – Tobias May 12 '11 at 15:30
2

First create data:

> country<-c("Hong Kong","Taiwan","China","Indonesia")
> region<-rep(c("Asian Tigers","East Asia & Pacific"),each=2)
> df<-data.frame(country=country,region=region)

Then group the rows by the region column and collect the countries in each group. We could use tapply, but I will use dlply from package plyr, since it retains list names.

> library(plyr)
> ll<-dlply(df,~region,function(d)paste(d$country,collapse=","))
> ll
$`Asian Tigers`
[1] "Hong Kong,Taiwan"

$`East Asia & Pacific`
[1] "China,Indonesia"

attr(,"split_type")
[1] "data.frame"
attr(,"split_labels")
               region
1        Asian Tigers
2 East Asia & Pacific

Now convert the list to a data.frame using do.call. To keep the region names unchanged as column names, we need to pass the argument check.names=FALSE:

> ll$check.names <- FALSE
> do.call("data.frame",ll)
      Asian Tigers East Asia & Pacific
1 Hong Kong,Taiwan     China,Indonesia
mpiktas