1

I have binned data reflecting the width of rivers across each continent. Below is a sample dataset. I pretty much just want to get the data into the form I have shown.

 dat <- read.table(text =    
                      "width continent bin
                      5.32     Africa  10
                      6.38     Africa  10
                      10.80    Asia    20
                      9.45     Africa  10
                      22.66    Africa  30
                      9.45     Asia    10",header = TRUE)

How do I melt the above toy dataset to create this dataframe?

Bin Count Continent
10  3     Africa
10  1     Asia
20  1     Asia
30  1     Africa
akrun
  • 874,273
  • 37
  • 540
  • 662
Jason Matney
  • 552
  • 6
  • 24

2 Answers2

5

We could use either one of the aggregate by group.

The data.table option would be to convert the 'data.frame' to 'data.table' (setDT(dat)), grouped by 'continent' and 'bin' variables, we get the number of elements per group (.N)

library(data.table)
setDT(dat)[,list(Count=.N) ,.(continent,bin)]
#    continent bin Count
#1:    Africa  10     3
#2:      Asia  20     1
#3:    Africa  30     1
#4:      Asia  10     1

Or a similar option with dplyr by grouping the variables and then use n() instead of .N to get the count.

 library(dplyr)
 dat %>%
      group_by(continent, bin) %>%
      summarise(Count=n())

Or we can use aggregate from base R and using the formula method, we get the length.

 aggregate(cbind(Count=width)~., dat, FUN=length)
 #   continent bin Count
 #1    Africa  10     3
 #2      Asia  10     1
 #3      Asia  20     1
 #4    Africa  30     1

From @Frank's and @David Arenburg's comments, some additional options using data.table and dplyr. We convert the dataset to data.table (setDT(dat)), convert to 'wide' format with dcast, then reconvert it back to 'long' using melt, and subset the roww (value>0)

 library(data.table)
 melt(dcast(setDT(dat),continent~bin))[value>0]

Using count from dplyr

 library(dplyr)
 count(dat, bin, continent)
akrun
  • 874,273
  • 37
  • 540
  • 662
2

With sqldf:

library(sqldf)
sqldf("SELECT bin, continent, COUNT(continent) AS count 
      FROM dat 
      GROUP BY bin, continent")

Output:

  bin continent count
1  10    Africa     3
2  10      Asia     1
3  20      Asia     1
4  30    Africa     1
mpalanco
  • 12,960
  • 2
  • 59
  • 67