I have a dataframe which looks as follows:
head(test_df, n =15)
# print the first 15 rows of the dataframe
value frequency index
1 -2.90267705917358 1 1
2 -2.90254878997803 1 1
3 -2.90252590179443 1 1
4 -2.90219354629517 1 1
5 -2.90201354026794 1 1
6 -2.9016375541687 1 1
7 -2.90107154846191 1 1
8 -2.90089440345764 1 1
9 -2.89996957778931 1 1
10 -2.89970088005066 1 1
11 -2.89928865432739 1 2
12 -2.89920520782471 1 2
13 -2.89907360076904 1 2
14 -2.89888191223145 1 2
15 -2.8988630771637 1 2
The dataframe has 3 columns and 61819 rows. To aggregate the dataframe, I want to compute the mean of the columns 'value' and 'frequency' over all rows that share the same 'index'.
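To make the goal concrete, here is a minimal sketch on made-up data. The names small_df and small_ag are hypothetical and the numbers are invented for illustration; it assumes all three columns are plain numeric.

```r
# Hypothetical toy data with the same three columns as test_df (values are made up)
small_df <- data.frame(
  value     = c(-2.90, -2.80, -2.70, -2.60),
  frequency = c(1, 1, 1, 1),
  index     = c(1, 1, 2, 2)
)

# One row per 'index', holding the group means of 'value' and 'frequency'
small_ag <- stats::aggregate(cbind(value, frequency) ~ index,
                             data = small_df, FUN = mean)
print(small_ag)
```

This is the shape of result I am after: one row per 'index' with numeric means in the other two columns.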
I already found some useful links, see:
https://www.r-bloggers.com/2018/07/how-to-aggregate-data-in-r/
However, I have not been able to solve the problem yet.
test_df_ag <- stats::aggregate(test_df[1:2], by = test_df[3], FUN = 'mean')
# aggregate the dataframe based on the 'index' column (build the mean)
index value frequency
1 1 NA 1
2 2 NA 1
3 3 NA 1
4 4 NA 1
5 5 NA 1
6 6 NA 1
7 7 NA 1
8 8 NA 1
9 9 NA 1
10 10 NA 1
11 11 NA 1
12 12 NA 1
13 13 NA 1
14 14 NA 1
15 15 NA 1
Since I only get NA values for the 'value' column, I wonder whether this might be a data type issue. However, my attempts to convert the data type also failed.
base::typeof(test_df$value)
# query the data type of the 'value' column
[1] "integer"
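A type of "integer" for a column that prints decimal numbers is the pattern a factor column would produce, since factors are stored internally as integer codes. The following is only a sketch under that assumption, which is not confirmed for test_df. The names toy_df and toy_ag are hypothetical and the data is made up.

```r
# Hypothetical toy data: 'value' is deliberately stored as a factor (assumption)
toy_df <- data.frame(
  value     = factor(c("-2.90", "-2.80", "-2.70", "-2.60")),
  frequency = c(1, 1, 1, 1),
  index     = c(1, 1, 2, 2)
)

# typeof() reports the internal storage, class() reports how R treats the column
storage_type <- typeof(toy_df$value)  # integer codes behind the factor
column_class <- class(toy_df$value)   # "factor"

# Convert via character so the labels are parsed, not the internal integer codes
toy_df$value <- as.numeric(as.character(toy_df$value))

# Same aggregate() call shape as above, now on a numeric 'value' column
toy_ag <- stats::aggregate(toy_df[1:2], by = toy_df[3], FUN = mean)
print(toy_ag)
```

Checking class(test_df$value) in addition to typeof() would show whether this assumption applies to the real data. Calling as.numeric() directly on a factor returns the integer codes rather than the printed numbers, which is why the sketch goes through as.character() first.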