
I have a dataset with a category column and a numeric column. I need to do a conditional sum, the way the SUMIFS formula does in Excel.

p <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b")
q <- c(6, 8, 5, 2, 5, 1, 3, 7, 7, 6, 4, 4, 3, 3)
t <- data.frame(p, q)
values <- seq(1, 10, by = 2)

I need to create a table of sums over the given intervals. The rows should be the intervals [1,3), [3,5), etc., the columns should be a, b and c, and each cell should hold the sum of q for that interval and category.

I have tried using cut inside apply, but it doesn't work, or I am doing something wrong.

Sanjay

1 Answer


To get a conditional sum by interval and category, you can use the dplyr package for the aggregation and the tidyr package for reshaping. Here's an example:

library(dplyr)
library(tidyr)  # provides pivot_wider

# Sample data
p <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b")
q <- c(6, 8, 5, 2, 5, 1, 3, 7, 7, 6, 4, 4, 3, 3)
t <- data.frame(p, q)
values <- seq(1, 10, by = 2)

# Assign each q to a half-open interval [1,3), [3,5), [5,7), [7,9).
# right = FALSE closes the intervals on the left; any q outside [1,9) would become NA.
# Then sum q within each interval and category.
result <- t %>%
  mutate(interval = cut(q, breaks = values, right = FALSE)) %>%
  group_by(interval, p) %>%
  summarize(sum_q = sum(q), .groups = "drop")

# Pivot so that categories become columns; empty combinations are filled with 0
table_result <- pivot_wider(result, names_from = p, values_from = sum_q,
                            values_fill = 0, names_sort = TRUE)

# Print the result
print(table_result)

This code creates the intervals with cut, computes the conditional sums with group_by and summarize from dplyr, and then reshapes the result with pivot_wider from tidyr so that the categories (a, b, c) become columns and the intervals become rows.
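If you would rather avoid extra packages, the same table can probably be built in base R. The following is a minimal sketch, assuming the question's sample data and the half-open intervals [1,3), [3,5), [5,7), [7,9); the column name interval is my own choice, not something from the question. xtabs sums the left-hand side of the formula within each combination of the right-hand factors and fills empty combinations with 0.

```r
# Sample data from the question
p <- c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b")
q <- c(6, 8, 5, 2, 5, 1, 3, 7, 7, 6, 4, 4, 3, 3)
t <- data.frame(p, q)
values <- seq(1, 10, by = 2)

# Bin q into left-closed intervals; "interval" is an illustrative column name
t$interval <- cut(t$q, breaks = values, right = FALSE)

# Sum q for every interval/category pair (rows = intervals, columns = categories)
sums <- xtabs(q ~ interval + p, data = t)
print(sums)

# A single cell written out by hand, closest in spirit to one SUMIFS call:
# sum of q where p is "a" and 3 <= q < 5
cell_a_3_5 <- sum(t$q[t$p == "a" & t$q >= 3 & t$q < 5])
```

The hand-written cell is handy as a spot check: it should match the entry in row [3,5), column a of the table.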

The output is a table with the intervals down the left side, the categories across the top, and the sums aggregated accordingly. Hope you find this useful.