-6

I have a thousand-row table like the one below and need to calculate the sum and mean of a continuous "count" variable for each level of a categorical "df" variable.

I tried to solve this with the table() function, but since "count" is a continuous variable, I couldn't work my way to a solution.

   df count
1   a     5
2   f     3
3   g     8
4   l     2
5   a    10
6   s     4
7   l     6
8   s     8
9   a     2
10  g     1
zx8754
Slavo
  • 2
    are you looking for `aggregate`? i.e. `aggregate(count ~ df, yourDF, mean)` – Sotos Aug 03 '16 at 09:00
  • 3
    Probably a duplicate of http://stackoverflow.com/questions/1660124/how-to-sum-a-variable-by-group or http://stackoverflow.com/questions/21982987/mean-per-group-in-a-data-frame – talat Aug 03 '16 at 09:02

3 Answers

1

If I am not mistaken, you are looking for the following code:

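library(dplyr)
daf %>%
  group_by(df) %>%
  # Sum and Mean are the statistics asked for; Count is the number of rows per group
  summarise(Sum = sum(count), Mean = mean(count), Count = n()) %>%
  ungroup() %>%
  arrange(df)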

"daf" is the data set that I am working on.

Enjoy R programming!!!

vaibhavnag
0

Maybe this will help you out:

> df3 <- aggregate(count ~ df , df, mean)
> df3
  df    count
1  a 5.666667
2  f 3.000000
3  g 4.500000
4  l 4.000000
5  s 6.000000

> df2 <- aggregate(count ~ df , df, sum)
> df2
  df count
1  a    17
2  f     3
3  g     9
4  l     8
5  s    12

The aggregate() function can do it. The count column in df3 is the mean and the count column in df2 is the sum.
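If you would rather get both statistics from one call, a minimal sketch along these lines should work. The name `dat` and the anonymous summary function are my own choices for illustration; the data is copied from the question.

```r
# Sample data copied from the question; `dat` is a hypothetical name.
dat <- data.frame(
  df = c("a", "f", "g", "l", "a", "s", "l", "s", "a", "g"),
  count = c(5, 3, 8, 2, 10, 4, 6, 8, 2, 1)
)

# The summary function returns a named vector, so aggregate() stores
# both statistics in a single matrix column called `count`.
res <- aggregate(count ~ df, data = dat,
                 FUN = function(x) c(sum = sum(x), mean = mean(x)))

# Flatten the matrix column into ordinary columns
# named count.sum and count.mean.
res <- do.call(data.frame, res)
res
```

The do.call() step is only there because aggregate() keeps the two values together in one matrix column; without it, res$count.sum would not exist as a separate column.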

Pankaj Kaundal
0

This isn't an especially unique question, but the suggested duplicate questions each ask for only a single summary statistic. As this is a simple problem to solve in dplyr, I thought I'd throw this in.

dframe <- data.frame(df = c("a", "f", "g", "l", "a", "s", "l", "s", "a", "g"), count = c(5, 3, 8, 2, 10, 4, 6, 8, 2, 1))
dframe
   df count
1   a     5
2   f     3
3   g     8
4   l     2
5   a    10
6   s     4
7   l     6
8   s     8
9   a     2
10  g     1

library(dplyr)
dframe %>% group_by(df) %>% summarise(sum = sum(count), mean = mean(count))
Source: local data frame [5 x 3]

      df   sum     mean
  (fctr) (dbl)    (dbl)
1      a    17 5.666667
2      f     3 3.000000
3      g     9 4.500000
4      l     8 4.000000
5      s    12 6.000000

You can see that summarise() lets you calculate whichever summary statistics you like, and as many of them as you like, for each group.
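As a rough sketch of that point, the same pipeline can carry a few more statistics. The extra columns (n, median, max) and the name `stats` are just examples I picked, not something the question asked for.

```r
library(dplyr)

# Same sample data as above.
dframe <- data.frame(
  df = c("a", "f", "g", "l", "a", "s", "l", "s", "a", "g"),
  count = c(5, 3, 8, 2, 10, 4, 6, 8, 2, 1)
)

# One row per group, one column per statistic.
stats <- dframe %>%
  group_by(df) %>%
  summarise(
    n      = n(),            # number of rows in the group
    sum    = sum(count),
    mean   = mean(count),
    median = median(count),
    max    = max(count)
  )
stats
```

Each argument to summarise() becomes one output column, so adding or dropping a statistic is a one-line change.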

doctorG