How do I calculate the mean of a column

Question

Does anyone know how I can calculate the mean of one of these columns (on Linux)?

sda               2.91    20.44    6.13    2.95   217.53   186.67    44.55     0.84   92.97
sda               0.00     0.00    2.00    0.00    80.00     0.00    40.00     0.22  110.00 
sda               0.00     0.00    2.00    0.00   144.00     0.00    72.00     0.71  100.00 
sda               0.00    64.00    0.00    1.00     0.00     8.00     8.00     2.63   10.00
sda               0.00     1.84    0.31    1.38    22.09   104.29    74.91     3.39 2291.82 
sda               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00    0.00

For example: mean(column 2)

http://unix.stackexchange.com/questions/13731/is-there-a-way-to-get-the-min-max-median-and-average-of-a-list-of-numbers-in — Ciro Santilli OurBigBook.com, Nov 21 '15 at 11:14

score 102 · Accepted Answer · answered Jun 26 '10 at 02:21

Awk:

awk '{ total += $2 } END { print total/NR }' yourFile.whatever

Read as:

For each line, add column 2 to a variable 'total'.
At the end of the file, print 'total' divided by the number of records.

answered Jun 26 '10 at 02:21

porges

@Porges: How can I restrict this to a specific interval? Let's say I want the mean of elements 2 to 6 in the second column. – SKPS Sep 26 '16 at 17:21

@SathishKrishnan this is a bit late, but for anyone else: you would prefix the first part with `NR==2,NR==6 { total += .....` (see: https://www.gnu.org/software/gawk/manual/html_node/Ranges.html) – porges Feb 10 '17 at 21:36
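
One caveat with the range approach: once only some rows are summed, the total has to be divided by the number of rows actually selected rather than by NR. A minimal Python sketch of that idea, assuming whitespace-separated input and 1-based, inclusive row numbers (the name `mean_of_rows` and its parameters are illustrative, not from the thread):

```python
def mean_of_rows(lines, column, first, last):
    """Mean of a 1-based column over the 1-based, inclusive row range first..last."""
    total = 0.0
    count = 0
    for row_number, line in enumerate(lines, start=1):
        if first <= row_number <= last:
            # Convert the requested field and count only the rows we actually used.
            total += float(line.split()[column - 1])
            count += 1
    if count == 0:
        raise ValueError("no rows in the requested range")
    return total / count
```

Passing an open file object as `lines`, for example `mean_of_rows(f, 2, 2, 6)`, would average rows 2 to 6 of the second column.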

Chris Koknat · Answer 2 · 2016-10-18T23:55:14.893

Perl solution:

perl -lane '$total += $F[1]; END{print $total/$.}' file

-a autosplits each line into the @F array, which is indexed starting at 0, so $F[1] is the second column
$. is the current line number; in the END block it holds the total number of lines read

If your fields are separated by commas instead of whitespace:

perl -F, -lane '$total += $F[1]; END{print $total/$.}' file

To print mean values of all columns, assign totals to array @t:

perl -lane 'for $c (0..$#F){$t[$c] += $F[$c]}; END{for $c (0..$#t){print $t[$c]/$.}}' file

Output (the first value is 0 because the non-numeric "sda" column counts as 0):

0
0.485
14.38
1.74
0.888333333333333
77.27
49.8266666666667
39.91
1.29833333333333
434.131666666667
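
For comparison, here is a rough Python sketch of the same per-column averaging. It assumes whitespace-separated fields and counts any field that does not parse as a number as 0, which mirrors what happens to the "sda" label in the output above. The helper name `column_means` is hypothetical, and unlike the Perl one-liner it skips blank lines instead of counting them.

```python
def column_means(lines):
    """Return the mean of every whitespace-separated column as a list of floats."""
    totals = []
    rows = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines so they do not inflate the row count
        rows += 1
        # Grow the totals list if this row has more columns than seen so far.
        totals.extend([0.0] * (len(fields) - len(totals)))
        for i, field in enumerate(fields):
            try:
                totals[i] += float(field)
            except ValueError:
                pass  # non-numeric field (e.g. "sda") contributes 0
    return [t / rows for t in totals] if rows else []
```

Called with an open file, for example `column_means(open("file"))`, it returns one mean per column in column order.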

score 1 · Answer 3 · edited May 23 '17 at 11:46

You can use Python for that; it is available on Linux.

If the data comes from a file, take a look at this question; just use float instead.

For instance:

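#mean.py
def main():
    # Read every non-blank row as a list of floats.
    with open("mean.txt", 'r') as f:
        data = [[float(x) for x in line.split()] for line in f if line.strip()]

    # Collect the second field (index 1) of each row.
    columnTwo = []
    for row in data:
        columnTwo.append(row[1])

    # Python 3 print(); a plain list is used above because map() is lazy in Python 3.
    print(sum(columnTwo, 0.0) / len(columnTwo))


if __name__=="__main__":
    main()

Prints 14.38 (up to floating-point rounding in the last digits)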

(I included only the data in the mean.txt file, not the row label "sda".)

edited May 23 '17 at 11:46

Community

answered Jun 26 '10 at 02:19

OscarRyz

My first thought would probably have been Python as well... but making the list might be overly inefficient here, since you only really need the sum and the number of lines. (Also, for the fun of it: `with open("mean.txt", 'r') as f: n,t = map(sum, zip(*((1, float(line.split()[1])) for line in f))); print t/n`) – David Z Jun 26 '10 at 02:43
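
The point made in this comment, that only a running sum and a line count are needed, can be sketched in Python 3 without building any list. This assumes whitespace-separated fields and the same 0-based index as `row[1]` in the answer; `streaming_mean` is an illustrative name, not something from the thread.

```python
def streaming_mean(lines, index=1):
    """Mean of the field at 0-based `index`, keeping only a running sum and count."""
    total = 0.0
    count = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # ignore blank lines
        total += float(fields[index])
        count += 1
    if count == 0:
        raise ValueError("no data lines")
    return total / count
```

Because a file object yields one line at a time, something like `with open("mean.txt") as f: print(streaming_mean(f))` keeps memory use constant regardless of file size.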

score 0 · Answer 4 · edited Jun 28 '16 at 09:21

Simple-r will calculate the mean with the following line:

r -k2 mean file.txt

for the second column. It can also perform much more sophisticated statistical analysis, since it runs all of its calculations in the R environment.

edited Jun 28 '16 at 09:21

kenorb

answered Oct 01 '13 at 13:58

Tom

score 0 · Answer 5 · edited May 23 '17 at 12:10

David Zaslavsky's version, for the fun of it:

# n is the number of lines, t is the sum of the second field
with open("mean.txt", 'r') as f:
    n, t = map(sum, zip(*((1, float(line.split()[1])) for line in f)))
print(t / n)

edited May 23 '17 at 12:10

Community

answered Jun 26 '10 at 03:25

OscarRyz
