2

I am starting to use openpyxl and I want to copy the sum of a row. In Excel the value is 150, but when I try to print it, the output I get is the formula, not the actual value:

=SUM(B1:B19)

The script I use is:

print(ws["B20"].value)

Using "data_only" didn't work.

wb = ("First_File_b.xlsx" , data_only=True)

Any idea how I can solve to obtain the numerical value? Help would be greatly appreciated.

rainer
  • 3,295
  • 5
  • 34
  • 50
  • 1
    duplicate of http://stackoverflow.com/questions/23350581/openpyxl-1-8-5-reading-the-result-of-a-formula-typed-in-a-cell-using-openpyxl. In a nutshell, you can't! – Jean-François Fabre Aug 09 '16 at 16:39
  • Is there a workaround? – rainer Aug 09 '16 at 16:51
  • Well, you could try to parse the expression. For simple expression like `SUM` it would work, provided that the B1:B19 rows contain real data, not formulas again! I could try to write some workaround in your case if you're interested. – Jean-François Fabre Aug 09 '16 at 16:53
  • Jean, that would be magnificent, if it's not taking too much of your time. Thanks in sdvance – rainer Aug 09 '16 at 17:10

2 Answers2

1

Okay, here's a simple example

I have created a spreadsheet with first spreadsheet "Feuil1" (french version) which contains A1,...,A7 as 1,2,3,4,5,6,7 and A8=SUM(A1:A7)

Here's the code, that could be adapted maybe to other operators. maybe not so simply. It also supports ranges from A1:B12 for instance, untested and no parsing support for cols like AA although could be done.

import openpyxl,re

fre = re.compile(r"=(\w+)\((\w+):(\w+)\)$")
cre = re.compile(r"([A-Z]+)(\d+)")

def the_sum(a,b):
    return a+b

d=dict()
d["SUM"] = the_sum

def get_evaluated_value(w,sheet_name,cell_name):
    result = w[sheet_name][cell_name].value
    if isinstance(result,int) or isinstance(result,float):
        pass
    else:
        m = fre.match(result)
        if m:
            g = m.groups()
            operator=d[g[0]]    # ATM only sum is supported

            # compute range
            mc1 = cre.match(g[1])
            mc2 = cre.match(g[2])
            start_col = ord(mc1.group(1))
            end_col = ord(mc2.group(1))
            start_row = int(mc1.group(2))
            end_row = int(mc2.group(2))

            result = 0
            for i in range(start_col,end_col+1):
                for j in range(start_row,end_row+1):
                    c = chr(i)+str(j)
                    result = operator(result,w["Feuil1"][c].value)

    return result


w = openpyxl.load_workbook(r"C:\Users\dartypc\Desktop\test.xlsx")
print(get_evaluated_value(w,"Feuil1","A2"))
print(get_evaluated_value(w,"Feuil1","A8"))

output:

2
28

yay!

Jean-François Fabre
  • 137,073
  • 23
  • 153
  • 219
0

I have solved the matter using a combination of openpyxl and pandas:

import pandas as pd
import openpyxl
from openpyxl import Workbook , load_workbook

source_file = "Test.xlsx"
# write to file
wb = load_workbook (source_file)
ws = wb.active
ws.title = "hello world"
ws.append ([10,10])
wb.save(source_file)

# read from file
df = pd.read_excel(source_file)
sum_jan = df ["Jan"].sum() 
print (sum_jan)
rainer
  • 3,295
  • 5
  • 34
  • 50