I have this table (an excerpt of the whitespace-separated file BP.txt):
3702 GO:0009611 0.682
3711 GO:0009611 35.418
4081 GO:0009611 18.072
3702 GO:0033554 0.400
3702 GO:0006812 0.378
3702 GO:0006412 0.373
3702 GO:0009058 0.346
3702 GO:0051641 0.312
29760 GO:0009611 28.697
I don't care about the first column. Column 2 has some repeated values. What I'd like to get is a data.frame
whose first column holds each distinct value from column 2 of my initial table, and whose second column holds the mean of the corresponding values in column 3.
Something like this (computed from the full file, so the terms and values differ from the excerpt above):
GO:0051179 1.7398
GO:0016311 2.1595
GO:0010467 1.45633
GO:0044093 15.483
GO:0006811 2.4175
GO:0044238 0.927667
GO:0006812 3.0138
GO:0006807 1.048
In fact, I got this output using awk, by summing column 3 and counting rows per value of column 2:
awk '{sum[$2] += $3; n[$2]++} END {for (k in sum) print k "\t" sum[k]/n[k]}' BP.txt
but I have no clue how to do this in R.
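In case it helps, here is a small sketch that makes the excerpt reproducible in R. The name bp is just my choice, and V1, V2, V3 are the default column names read.table assigns when there is no header. The last line checks a single term by hand (GO:0009611 occurs four times in the excerpt); what I'm missing is how to do this for every term at once.

```r
# Load the excerpt shown above; with header = FALSE the columns are named V1, V2, V3.
bp <- read.table(text = "
3702 GO:0009611 0.682
3711 GO:0009611 35.418
4081 GO:0009611 18.072
3702 GO:0033554 0.400
3702 GO:0006812 0.378
3702 GO:0006412 0.373
3702 GO:0009058 0.346
3702 GO:0051641 0.312
29760 GO:0009611 28.697
", header = FALSE, stringsAsFactors = FALSE)

# Hand check for one term: the mean of all V3 values whose V2 is GO:0009611,
# i.e. (0.682 + 35.418 + 18.072 + 28.697) / 4.
expected <- mean(bp$V3[bp$V2 == "GO:0009611"])
print(expected)
```

For this excerpt the desired result would have six rows, one per distinct GO term, with GO:0009611 paired with roughly 20.717 and every other term paired with its single value.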