I have a data frame like this:
data.frame(age=c("(0,5]", "(5,10]", "(10,15]", "(15,20]", "(20,25]", "(25,30]"),
C1=c(0, 0, 0, 0, 0, 0),
C2=c(0, 0, 0, 0, 0, 0),
C3=c(0, 270, 30, 4, 0, 0),
C4=c(0, 30, 30, 4, 0, 0))
The real data has more than 50 columns starting with C. I'm going to use https://stackoverflow.com/a/10139458/792066 to create a Pareto chart from the C columns, but the sheer number of labels makes the chart pretty much worthless. The usual solution is to lump everything outside the top 5 to 10 into a new column called "others". I suppose I'm looking for something like what summarize() does for factor columns with categorical variables. How can I sum all columns whose totals fall outside the top X into a single new column?
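To make the goal concrete, here is a minimal base-R sketch of the lumping I have in mind. The names `df`, `top_n`, `keep`, `rest` and `result` are only illustrative, and I'm assuming "top X" means ranking the C columns by their column totals. The cutoff is set to 1 so that the tiny example still has something left to lump.

```r
# Example data from the question (only four C columns; the real data has 50+).
df <- data.frame(
  age = c("(0,5]", "(5,10]", "(10,15]", "(15,20]", "(20,25]", "(25,30]"),
  C1 = c(0, 0, 0, 0, 0, 0),
  C2 = c(0, 0, 0, 0, 0, 0),
  C3 = c(0, 270, 30, 4, 0, 0),
  C4 = c(0, 30, 30, 4, 0, 0)
)

top_n <- 1  # illustrative cutoff; would be 5 to 10 for the real data

# Total of every column whose name starts with "C".
c_cols <- grep("^C", names(df), value = TRUE)
totals <- colSums(df[c_cols])

# Names of the top_n columns by total; the remaining columns get lumped.
keep <- names(sort(totals, decreasing = TRUE))[seq_len(min(top_n, length(totals)))]
rest <- setdiff(c_cols, keep)

# Keep age and the top columns, then add the row-wise sum of the remainder.
result <- df[c("age", keep)]
result$others <- rowSums(df[rest])
```

With this data, `result` should have the columns `age`, `C3` and `others`, where `others` is the row-wise sum of C1, C2 and C4. Ties between column totals are not handled in any particular way here.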