Questions tagged [aggregate]

Aggregate refers to the process of summarizing grouped data, commonly used in statistics. Typically this involves replacing each group of data with a single value (e.g. sum, mean, or standard deviation). In SQL databases and data manipulation libraries, this is accomplished with GROUP BY (or an equivalent grouping operation) and aggregate functions.
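
As a rough illustration of the group-then-summarize idea described above, here is a minimal sketch in plain Python. The sample rows and the helper name `aggregate` are made up for this example and do not come from any particular library; real data libraries and SQL engines provide their own grouping operations.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows: (category, value) pairs, invented for illustration.
rows = [("A", 1), ("A", 2), ("B", 5), ("B", 5), ("B", 4), ("C", 6)]


def aggregate(rows, func):
    """Group rows by their first element, then reduce each group's values with func."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Replace every group of values with a single summary value.
    return {key: func(values) for key, values in groups.items()}


sums = aggregate(rows, sum)    # one total per category
means = aggregate(rows, mean)  # one average per category
print(sums)
print(means)
```

Passing a different reducer (for example `max` or `len`) gives a different summary over the same groups, which is roughly what swapping the aggregate function does in a GROUP BY query.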

8256 questions
1172 votes, 14 answers

Group By Multiple Columns

How can I do GroupBy multiple columns in LINQ? Something similar to this in SQL: SELECT * FROM GROUP BY , How can I convert this to LINQ: QuantityBreakdown ( MaterialID int, ProductID int, Quantity…
Sreedhar
834 votes, 12 answers

LINQ Aggregate algorithm explained

This might sound lame, but I have not been able to find a really good explanation of Aggregate. Good means short, descriptive, comprehensive with a small and clear example.
Alexander Beletsky
684 votes, 6 answers

What are Aggregates and PODs and how/why are they special?

This FAQ is about Aggregates and PODs and covers the following material: What are Aggregates? What are PODs (Plain Old Data)? How are they related? How and why are they special? What changes for C++11?
Armen Tsirunyan
552 votes, 16 answers

How to group dataframe rows into list in pandas groupby

I have a pandas data frame df like: a b A 1 A 2 B 5 B 5 B 4 C 6 I want to group by the first column and get second column as lists in rows: A [1,2] B [5,5,4] C [6] Is it possible to do something like this using pandas groupby?
Abhishek Thakur
502 votes, 18 answers

How to sum a variable by group

I have a data frame with two columns. First column contains categories such as "First", "Second", "Third", and the second column has numbers that represent the number of times I saw the specific groups from "Category". For example: Category …
boo-urns
448 votes, 6 answers

How to use GROUP BY to concatenate strings in MySQL?

Basically the question is how to get from this: foo_id foo_name 1 A 1 B 2 C to this: foo_id foo_name 1 A B 2 C
Paweł Hajdan
429 votes, 2 answers

C# Linq Group By on multiple columns

public class ConsolidatedChild { public string School { get; set; } public string Friend { get; set; } public string FavoriteColor { get; set; } public List Children { get; set; } } public class Child { public string…
Kasy
381 votes, 11 answers

How do I Pandas group-by to get sum?

I am using this dataframe: Fruit Date Name Number Apples 10/6/2016 Bob 7 Apples 10/6/2016 Bob 8 Apples 10/6/2016 Mike 9 Apples 10/7/2016 Steve 10 Apples 10/7/2016 Bob 1 Oranges 10/7/2016 Bob 2 Oranges 10/6/2016 Tom …
Trying_hard
265 votes, 5 answers

Multiple aggregations of the same column using pandas GroupBy.agg()

Is there a pandas built-in way to apply two different aggregating functions f1, f2 to the same column df["returns"], without having to call agg() multiple times? Example dataframe: import pandas as pd import datetime as dt import numpy as…
ely
222 votes, 8 answers

Mean per group in a data.frame

I have a data.frame and I need to calculate the mean per group (i.e. per Month, below). Name Month Rate1 Rate2 Aira 1 12 23 Aira 2 18 73 Aira 3 19 45 Ben 1 53 19 Ben …
Ianthe
194 votes, 10 answers

Aggregate / summarize multiple variables per group (e.g. sum, mean)

From a data frame, is there an easy way to aggregate (sum, mean, max etc) multiple variables simultaneously? Below are some sample data: library(lubridate) days = 365*2 date = seq(as.Date("2000-01-01"), length = days, by = "day") year =…
MikeTP
189 votes, 5 answers

Summarizing multiple columns with dplyr?

I'm struggling a bit with the dplyr-syntax. I have a data frame with different variables and one grouping variable. Now I want to calculate the mean for each column within each group, using dplyr in R. df <- data.frame( a = sample(1:5, n,…
Daniel
188 votes, 4 answers

aggregate() vs annotate() in Django

Django's QuerySet has two methods, annotate and aggregate. The documentation says that: Unlike aggregate(), annotate() is not a terminal clause. The output of the annotate() clause is a…
151 votes, 18 answers

Count number of rows within each group

I have a dataframe and I would like to count the number of rows within each group. I regularly use the aggregate function to sum data as follows: df2 <- aggregate(x ~ Year + Month, data = df1, sum) Now, I would like to count observations but can't…
MikeTP
147 votes, 7 answers

Django: Calculate the Sum of the column values through query

I have a model: class ItemPrice(models.Model): price = models.DecimalField(max_digits=8, decimal_places=2) # ... I tried this to calculate the sum of price in this queryset: items = ItemPrice.objects.all().annotate(Sum('price')) What's…
Ahsan