How do I find the average of each data_point, within the data frame (df) in R?

Question

Data Frame (df):

Data_point | Measurement
--- | ---
a | 2
a | 4
b | 6
b | 8
c | 4
c | 10

Your question doesn't make sense to me. What would the "average of each data point" be? Also, finding either the mean, median or other statistic of a column of data (if that is what you want) is something you could easily find via either ?mean in R or by searching Google. Please also read the instructions for asking questions. — Elin, Jun 07 '18 at 10:27
@Elin I know I can use the mean function to find the mean of a column, but I do not want to find the mean of the whole column. I just want to find the mean corresponding to the data point a, then a mean corresponding to the data point b etc..... — Jed, Jun 07 '18 at 10:31
If point a only has one value the mean is that value. Please edit your question to explain what you mean after reading how to ask a question, how to create a minimum viable example. — Elin, Jun 07 '18 at 10:33
Also just to summarize you need to make it so that someone could copy your code and data and paste it into their R instance and you should include the desired results. — Elin, Jun 07 '18 at 10:36
Elin is right. Next time please read - [How do I ask a good question?](https://stackoverflow.com/help/how-to-ask) and [How to make a great R reproducible example?](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) — Shique, Jun 07 '18 at 10:43

Shique · Answer 1 · 2018-06-07T10:38:50.107

0

Data

Data_point <- c('a','a','b','b','c','c')
Measurement <- c(2,4,6,8,4,10)

df <- data.frame(Data_point=Data_point, Measurement=Measurement)

You can combine `mean` with `which` to pick out the measurements that belong to one group. Wrapping that in a function and calling it once per distinct `Data_point` gives the per-group means you want. As mentioned, there is probably already a function for this (likely somewhere in the tidyverse), but this version uses only base R.

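# Mean of Measurement for the rows whose Data_point equals x
# (note: df is read from the global environment)
average <- function(x) {
  mean(df$Measurement[which(df$Data_point == x)])
}

# Call average() once per distinct Data_point (a, b, c)
sapply(unique(df$Data_point), average)
#[1] 3 7 7
# With R >= 4.0, where Data_point is character, the result is also named a, b, c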
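
The existing base R function alluded to above may well be `tapply`, which applies a function to one vector after splitting it by the groups in another. A minimal sketch, assuming the same example data (rebuilt here so the snippet runs on its own; the name `group_means` is introduced only for illustration):

```r
# Rebuild the example data so the snippet is self-contained
df <- data.frame(
  Data_point  = c("a", "a", "b", "b", "c", "c"),
  Measurement = c(2, 4, 6, 8, 4, 10)
)

# Split Measurement by Data_point and take the mean of each group
group_means <- tapply(df$Measurement, df$Data_point, mean)
print(group_means)
```

The result is indexed by group name, so `group_means["b"]` picks out the mean for group b.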

edited Jun 07 '18 at 10:38

answered Jun 07 '18 at 10:33

Shique


This is a code-only answer; you need to explain what you did and why you did it. – Elin Jun 07 '18 at 10:35
@shique Thank you! Please could you clarify what "x" represents in this case? – Jed Jun 07 '18 at 10:39
`x` is the argument of the function defined in this code as `average`. `sapply` calls that function once for each value returned by `unique(df$Data_point)`, passing the value in as `x`. – Shique Jun 07 '18 at 10:41
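
To make the role of `x` concrete, the `sapply` call can be unrolled into an explicit loop. This is only an illustrative sketch: the names `groups` and `result` are introduced here and are not part of the answer, and the data and the `average` function are repeated so the snippet runs on its own.

```r
df <- data.frame(
  Data_point  = c("a", "a", "b", "b", "c", "c"),
  Measurement = c(2, 4, 6, 8, 4, 10)
)

average <- function(x) {
  mean(df$Measurement[which(df$Data_point == x)])
}

# The distinct groups; each one is passed to average() as x in turn
groups <- as.character(unique(df$Data_point))

result <- numeric(length(groups))
for (i in seq_along(groups)) {
  result[i] <- average(groups[i])  # x is "a", then "b", then "c"
}
names(result) <- groups
print(result)
```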

score 0 · Answer 2 · answered Jun 07 '18 at 10:47


As Elin suggests, dplyr can do this. Using the data from Shique's answer:

library(dplyr)
df %>% group_by(Data_point) %>% summarise_all(mean)
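
When the data frame has more columns than the one being averaged, naming the summarised column explicitly may be easier to read. A sketch assuming dplyr is installed; the output column name `Mean_measurement` and the object name `means` are chosen here purely for illustration:

```r
library(dplyr)

# Same example data, repeated so the snippet runs on its own
df <- data.frame(
  Data_point  = c("a", "a", "b", "b", "c", "c"),
  Measurement = c(2, 4, 6, 8, 4, 10)
)

# One row per Data_point, holding the mean of that group's measurements
means <- df %>%
  group_by(Data_point) %>%
  summarise(Mean_measurement = mean(Measurement))

print(means)
```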

answered Jun 07 '18 at 10:47

MartijnVanAttekum


score 0 · Answer 3 · answered Jun 07 '18 at 12:09


A base R option would be `aggregate` (shown here with the `df` defined in the first answer):

aggregate(. ~ Data_point, df, mean)
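
In the formula, `.` stands for every column other than `Data_point`. When only one column should be averaged, it can be named explicitly instead. A minimal sketch, assuming the example data from the first answer (repeated here so it runs on its own; `means` is a name introduced for illustration):

```r
df <- data.frame(
  Data_point  = c("a", "a", "b", "b", "c", "c"),
  Measurement = c(2, 4, 6, 8, 4, 10)
)

# Mean of Measurement within each level of Data_point
means <- aggregate(Measurement ~ Data_point, data = df, FUN = mean)
print(means)
```

The result is an ordinary data frame with one row per group, so it can be merged back onto other tables or plotted directly.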

answered Jun 07 '18 at 12:09

akrun
