Consider the following dataset:
a <- c(1,23,18,47,15,56,67,43,9)
b <- c("A","B","B","C","C","B","D","A","C")
df <- data.frame(var1=a, var2=b)
I need to run a function (for example mean()) on subsets of df, defined by the value of var2, like this:
df_A <- subset(df,var2=="A")
mean_A <- mean(df_A$var1)
df_B <- subset(df,var2=="B")
mean_B <- mean(df_B$var1)
df_C <- subset(df,var2=="C")
mean_C <- mean(df_C$var1)
df_D <- subset(df,var2=="D")
mean_D <- mean(df_D$var1)
The main difficulty is that I don't know in advance how many different values var2 contains. In my example there are 4 possibilities: "A", "B", "C" and "D". In practice the number varies from one dataset to the next: sometimes var2 has 2 different values, sometimes 15, sometimes more.
I think a loop could be a good solution, but I am a bit lost.
Can you please help? Thanks in advance.
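
One possible approach, sketched here with base R only and the example data from the question: take the group labels from the data itself with unique(), so the code never needs to know how many there are. The names groups, means_loop, means_tapply and means_split are made up for this illustration. The loop version mirrors the idea in the question; tapply() and split() with sapply() should give the same per-group result without an explicit loop.

```r
# Example data from the question
a <- c(1, 23, 18, 47, 15, 56, 67, 43, 9)
b <- c("A", "B", "B", "C", "C", "B", "D", "A", "C")
df <- data.frame(var1 = a, var2 = b)

# Group labels are read from the data, so their number does not matter.
# as.character() guards against var2 being stored as a factor.
groups <- as.character(unique(df$var2))

# Explicit loop: one mean per distinct value of var2
means_loop <- numeric(length(groups))
names(means_loop) <- groups
for (g in groups) {
  means_loop[g] <- mean(df$var1[df$var2 == g])
}

# Loop-free alternative: apply mean() to var1 within each level of var2
means_tapply <- tapply(df$var1, df$var2, mean)

# Another alternative: split var1 by var2, then apply the function to each piece
means_split <- sapply(split(df$var1, df$var2), mean)

print(means_loop)
```

To use a function other than mean(), replace it in any of the three variants; if the function returns more than one value per group, lapply() over the split() result is probably the safer choice, since it keeps each result as a separate list element.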