
Newbie R question: I have a data frame that I can subset into 5-6 categories based on one of its features. Is there an easy way to get the sum of the numbers in another column for each category, and then display a barplot with the categories on the x axis and the sums as the heights of the bars?

In other words: I can call `split(dataframe, dataframe$feature)`, but I have no idea how to get `sum` to total each category separately.

Could not find anything useful on the web.

Thanks,

Py_Rad
  • You are expected to provide data examples constructed in code or delivered with the R `dput` function. – IRTFM Dec 04 '15 at 19:19
  • Welcome to SO. First, you should read [here](http://stackoverflow.com/help/how-to-ask) about how to ask a good question; a good question has a better chance of being solved. Reading [this](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) is also worthwhile: it explains how to create a reproducible example in R. Help users help you by providing a piece of your data, the desired output, and what you have tried so far. – SabDeM Dec 04 '15 at 20:25

2 Answers


Without knowing the specifics of your problem, I can offer an example that should get you thinking in the right direction about how to group a data.frame by one column and get the sum of another:

library(dplyr)  # also provides the %>% pipe, so magrittr is not needed separately

age <- c(1, 2, 3, 4, 5)
name <- c("Jasmine", "Jane", "Jake", "Julie", "Jenna")
grade <- c("A", "A", "B", "B", "C")
gender <- c("F", "F", "M", "F", "F")
pet <- c(TRUE, FALSE, FALSE, FALSE, TRUE)

# data.frame() takes its column names from the vectors, so no colnames() call is needed
df <- data.frame(age, name, grade, gender, pet)

# Sum age within each pet group (the column is called "count" but holds a sum)
df %>%
  group_by(pet) %>%
  summarise(count = sum(age))

Your output would be:

Source: local data frame [2 x 2]

    pet count      
   (lgl) (dbl)     
    1 FALSE     9
    2  TRUE     6

You could easily put that into a bar graph, if that is indeed what you are looking for. I used this technique recently to summarise a very large data frame with many levels per factor, where I needed the counts based on another covariate for generating bar graphs. And I'm new-ish too!
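As a minimal sketch of that last step, the summarised result can be passed to base R's `barplot`. The trimmed-down data frame and the axis labels here are illustrative assumptions, not part of the original answer:

```r
library(dplyr)

# Same toy values as above, keeping only the two columns that matter here
df <- data.frame(
  age = c(1, 2, 3, 4, 5),
  pet = c(TRUE, FALSE, FALSE, FALSE, TRUE)
)

# One row per category, with the sum of age in each
sums <- df %>%
  group_by(pet) %>%
  summarise(count = sum(age))

# Bar heights are the sums; bar labels are the categories
barplot(sums$count,
        names.arg = as.character(sums$pet),
        xlab = "pet", ylab = "sum of age")
```

`names.arg` is what puts the category labels under the bars; without it, `barplot` would draw unlabelled bars for a plain numeric vector.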

jasdumas

Thank you for the replies. Here is what I figured out:

```r
# aggregate() applies a function (here sum) to Feature1 within each category of Feature2
tableofTwoFeatures <- aggregate(dataFrame$Feature1,
                                by = list(Category = dataFrame$Feature2),
                                FUN = sum, na.rm = TRUE)
# Drop the Category column and transpose the sums into a one-row matrix
bpmat <- t(tableofTwoFeatures[-1])
# Use the categories as column names so they label the bars
colnames(bpmat) <- tableofTwoFeatures[, 1]
barplot(bpmat)
```
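A shorter base R route may also work here: `tapply` returns a named vector of per-category sums, which `barplot` accepts directly. The sketch below uses a made-up stand-in for `dataFrame` (the values are assumptions for illustration) and also shows the `split` approach from the question:

```r
# Hypothetical stand-in for dataFrame; values are invented for illustration
dataFrame <- data.frame(
  Feature1 = c(10, 20, NA, 5, 15),
  Feature2 = c("a", "b", "a", "c", "b")
)

# tapply gives one sum per category, named by category
sums <- tapply(dataFrame$Feature1, dataFrame$Feature2, sum, na.rm = TRUE)

# The same result via split(), as attempted in the question
sums_split <- sapply(split(dataFrame$Feature1, dataFrame$Feature2),
                     sum, na.rm = TRUE)

# The names of the vector become the bar labels
barplot(sums, xlab = "Feature2", ylab = "sum of Feature1")
```

Because the result already carries its category names, no transposing or `colnames` assignment is needed before plotting.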
Py_Rad