I would like to count the number of rows per group, where the groups are defined by all the variables in the data frame.

Here are a few methods that I have tried, using the available starwars dataset as an example:

library(dplyr)
myData <- starwars %>% select(skin_color, gender, species)

# Method 1: using add_count
myData %>%
  add_count(1:ncol(myData))

# Method 2: using aggregate
aggregate(. ~ 1:ncol(myData), data = myData, FUN = function(x){NROW(x)})

Both attempts fail with an error saying that the length is incorrect. I suspect that I am using the wrong syntax. Is there a proper way to refer to all the columns in my data frame without typing each of them, so that add_count and aggregate produce the desired result?

  • @RonakShah As I have more than 10 variables, I needed to call all variables instead of writing them out one by one. The answer provided by Maurits Evers below works well by using group_by_all(). – Rais Kamis Dec 06 '18 at 03:00

1 Answer

Are you after this?

myData %>% group_by_all() %>% add_count()
# A tibble: 87 x 4
# Groups:   skin_color, gender, species [59]
   skin_color  gender species     n
   <chr>       <chr>  <chr>   <int>
 1 fair        male   Human      13
 2 gold        NA     Droid       1
 3 white, blue NA     Droid       1
 4 white       male   Human       1
 5 light       female Human       6
 6 light       male   Human       5
 7 light       female Human       6
 8 white, red  NA     Droid       1
 9 light       male   Human       5
10 fair        male   Human      13
# ... with 77 more rows
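
As a side note, group_by_all() is marked as superseded in dplyr 1.0.0 and later, where across(everything()) is the suggested way to select every column. The following is a minimal sketch of the same idea, assuming dplyr >= 1.0.0; the `toy` tibble is a hypothetical stand-in for myData so that the result does not depend on the version of the starwars dataset.

```r
library(dplyr)

# Hypothetical toy data standing in for myData
toy <- tibble(
  skin_color = c("fair", "fair", "gold", "light"),
  gender     = c("male", "male", NA, "female"),
  species    = c("Human", "Human", "Droid", "Human")
)

# Keep every row and append the size of its group as column n;
# the groups are defined by all columns
counted <- toy %>% add_count(across(everything()))

# Collapse to one row per distinct combination instead
summarised <- toy %>% count(across(everything()))
```

Here add_count() keeps all four rows, while count() returns one row per combination. In both cases dplyr treats NA as its own group value, which matches the Droid rows in the output above.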

Or, using aggregate:

aggregate(count ~ ., data = transform(myData, count = 1), FUN = sum)
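
One caveat with the aggregate approach: its formula method uses na.action = na.omit by default, so rows containing NA in any column are dropped before counting. The sketch below illustrates this on a hypothetical `toy` data frame (not part of the original question) and shows one possible way to attach the counts back to every row with merge(), roughly mimicking add_count.

```r
# Hypothetical toy data standing in for myData
toy <- data.frame(
  skin_color = c("fair", "fair", "gold", "light"),
  gender     = c("male", "male", NA, "female"),
  species    = c("Human", "Human", "Droid", "Human")
)

# One row per complete combination; the row with gender = NA is
# dropped by the formula method's default na.action = na.omit
counts <- aggregate(count ~ ., data = transform(toy, count = 1), FUN = sum)

# Left-join the counts back onto the original rows; rows that were
# dropped by aggregate end up with count = NA
with_counts <- merge(toy, counts, all.x = TRUE)
```

If groups containing NA need to be counted as well, the dplyr version is probably the simpler choice, since it treats NA as a regular group value.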
Maurits Evers