I have the following dataset with multiple ids in column 1, and I want to calculate the mean and standard deviation of column 2 for each id:
123456 0.1234
123456 0.5673
123456 0.0011
123456 -0.0947
123457 0.9938
123457 0.0001
123457 0.2839
I have the following code to get the mean per id, but I am struggling to amend it to get the SD as well:
awk '{sum4[$1] += $2; count4[$1]++}; END{ for (id in sum4) { print id, sum4[id]/count4[id] } }' < want3.txt > mean_id.txt
The desired output is a file with one line per id containing the id, mean, and SD:
123456 0.149275 0.2926
123457 0.425933 0.5118
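For reference, one possible way to extend the mean one-liner is to also accumulate the sum of squares per id and derive the SD in the END block. This is only a sketch under some assumptions: the SD values above appear to be the sample SD (dividing by n-1), the output file name mean_sd_id.txt is made up for illustration, and the sample rows are written to want3.txt only if that file does not already exist.

```shell
# Create the sample input from the question only if want3.txt is missing
# (assumption: this lets the sketch run on its own without touching real data).
[ -f want3.txt ] || printf '%s\n' \
    '123456 0.1234' '123456 0.5673' '123456 0.0011' '123456 -0.0947' \
    '123457 0.9938' '123457 0.0001' '123457 0.2839' > want3.txt

awk '
{
    n[$1]++                 # number of rows seen for this id
    sum[$1]   += $2         # running sum of column 2
    sumsq[$1] += $2 * $2    # running sum of squares of column 2
}
END {
    for (id in n) {
        mean = sum[id] / n[id]
        # Sample variance (n-1 denominator); divide by n[id] instead for population SD.
        # A single observation has no sample variance, so report 0 in that case.
        var = (n[id] > 1) ? (sumsq[id] - sum[id] * sum[id] / n[id]) / (n[id] - 1) : 0
        if (var < 0) var = 0    # guard against tiny negative values from rounding
        printf "%s %.6f %.4f\n", id, mean, sqrt(var)
    }
}' want3.txt | sort -k1,1 > mean_sd_id.txt   # hypothetical output name; sort because awk's "for in" order is unspecified

cat mean_sd_id.txt
```

On the sample rows this should reproduce the desired output shown above. The sum-of-squares shortcut can lose precision when values are large relative to their spread; a two-pass approach (compute the mean first, then the squared deviations) would be a more careful alternative.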
Any advice would be much appreciated. Thanks