1

I am really really new on python. I have three col and about 7000~8000 rows on the excel file. How can I calculate the mean and std for each col after importing the data to python.

import xlrd
file_location = "C:/Users/Roy/Desktop/table.xlsx"
workbook = xlrd.open_workbook(file_location)
sheet=workbook.sheet_by_index(0)
AChampion
  • 29,683
  • 4
  • 59
  • 75
roy chung
  • 21
  • 1
  • 1
  • 2
  • 2
    Perhaps you are looking for [Pandas](http://pandas.pydata.org/). You can use the `read_excel` function and then it is very simple to calculate the mean and standard deviation for each column. You can find many examples out there on the web. Also, delete that "Urget" from your question... – Peque Oct 09 '15 at 09:14
  • You can use numpy too – chamzz.dot Jun 27 '17 at 06:56

3 Answers3

5

You can use pandas to do this:

import pandas as pd
df = pd.read_excel("C:/Users/Roy/Desktop/table.xlsx")
print(df.mean())
print(df.std())
AChampion
  • 29,683
  • 4
  • 59
  • 75
0

You should check out the functions in the Sheet class of xlrd.

If you are using python 3.4+ there is a standard library module for statistics which will do mean and standard deviation. Check out the documentation here and decide whether you need stdev or pstdev. Otherwise here is a previous answer that provides some example code for the mean and standard deviation operations.

from statistics import mean, stdev
my_stats = []
for i in range(sheet.ncols):
    m = mean(c.value for c in sheet.col(i))
    s = stdev(c.value for c in sheet.col(i))
    my_stats.append((m,s))
Community
  • 1
  • 1
machine yearning
  • 9,889
  • 5
  • 38
  • 51
0

You can use numpy library for calculate the std of a huge number of values in a excel sheet. See this code below:

import csv

import numpy as np
import math

#read csv file and convert it in to an array
csv= np.genfromtxt
csv = np.genfromtxt('your excel sheet name.csv',delimiter=",")
N=len(csv[:])
#print N(N is number of raws in the sheet)
#X is x bar which is the mean of each column
X = np.mean(csv,axis=0)
sum = 0

for j in range(len (X[:])):
    for i in range(N):
        val= csv[i,j]-X[j]
        squre = val**2
        sum= sum+squre
    final = math.sqrt(sum/(N-1))
    print final

The output of this code will show an array of stds of each column.

chamzz.dot
  • 607
  • 2
  • 12
  • 24