how to read xlsx as pandas dataframe with formulas as strings

Question

I have a excel file with some calculated columns.

for example, I have some data in columns 'a' and column 'b' is calculated using values in column 'a'.

i need to append new data to column 'a' and calculate column 'b' and save the file.

import pandas as pd
df = pd.DataFrame({'a':[1,2,3],'b':["=a2","=a3","=a4"]})
df.to_excel('test.xlsx',index=False)

when i try to read the file using pandas read excel it reads the column 'b' as NaN.

df = pd.read_excel(r'test.xlsx')

how do i achieve this. may be if i can read the file as string and append the formulas as string. when i open the file in excel the excel will do the calculations?

score 4 · Answer 1 · answered Sep 30 '19 at 07:05

4

Use OpenPyXL to load the excel worksheet instead of directly with pandas

from openpyxl import load_workbook
import pandas as pd
wb = load_workbook(filename = 'test.xlsx')
sheet_name = wb.get_sheet_names()[0]
ws = wb[sheet_name]
df = pd.DataFrame(ws.values)

answered Sep 30 '19 at 07:05

henrywongkk

1,840
3
17
26

One difference to `read_excel` seems to be that this method doesn't automatically use the first row as the column names. If you need that, then see this answer: https://stackoverflow.com/a/56981317/1295595 – craq Jan 25 '21 at 22:59

score 0 · Answer 2 · answered Sep 30 '19 at 04:12

import pandas as pd
import xlsxwriter

name = '123.xlsx'
writer = pd.ExcelWriter(name,engine='xlsxwriter')
pd.DataFrame({}).to_excel(writer,sheet_name='Sheet1')
workbook = writer.book
worksheet = writer.sheets['Sheet1']
worksheet.write('A1',1)
worksheet.write('A2','=A1')
writer.save()

how to read xlsx as pandas dataframe with formulas as strings

2 Answers2

Linked