I have a df with 900k rows and each row has an action (about 80 different actions in total) and a number (about 500 different numbers in total), so it looks something like this:

       Action       Number 
         a            1
         b            3
         a            7
         b            3
         b            1

How can I create a new df in R that has one row per combination of number and action, together with the number of rows that have that combination? It should look something like this:

       Number       Action         Total
         1            a              1
         1            b              1
         3            b              2
         7            a              1
HL589

1 Answer

Try with dplyr:

library(dplyr)
# Count the rows in each Number/Action combination
newdf <- df %>%
  group_by(Number, Action) %>%
  summarise(N = n())

Output:

# A tibble: 4 x 3
# Groups:   Number [3]
  Number Action     N
   <int> <chr>  <int>
1      1 a          1
2      1 b          1
3      3 b          2
4      7 a          1

Or in base R, by creating an indicator variable N and summing it with aggregate():

#Base R
# Every row contributes 1, so the sum of N per group is the row count
df$N <- 1
newdf <- aggregate(N ~ ., data = df, sum)

Output:

  Action Number N
1      a      1 1
2      b      1 1
3      b      3 2
4      a      7 1
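
Both approaches can probably be shortened. The sketch below rebuilds the five sample rows from the question (the real df is assumed to have the same two columns) and counts each combination with dplyr's count(), then with aggregate() and length in base R. The column name Total is taken from the desired output in the question, and newdf_base is just an illustrative name.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  Action = c("a", "b", "a", "b", "b"),
  Number = c(1L, 3L, 7L, 3L, 1L)
)

# dplyr: count() groups by the listed columns, tallies the rows,
# and returns an ungrouped result; `name` sets the count column's name
newdf <- count(df, Number, Action, name = "Total")

# Base R: length() of any column within each group is the row count,
# so no indicator column has to be added to df
newdf_base <- aggregate(
  x   = data.frame(Total = df$Action),
  by  = list(Number = df$Number, Action = df$Action),
  FUN = length
)
```

Neither version modifies df, and count() returns a result without leftover grouping, unlike the group_by()/summarise() pipeline. The two results hold the same counts but may list the rows in a different order.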
Duck
  • In base R there is `length` function. In dplyr there is a `count` function. All in all, no need to answer 10 year old dupes – David Arenburg Nov 19 '20 at 15:26