0

I'm creating a PY script to pull data from my database and generate an Excel spreadsheet. The PY script uses a bunch of different modules, but the modules in question are pandas and openpyxl. I would like to be able to autofit the column width for all of the columns in a similar fashion to how it is done in Excel (based on content). The problem is that some of the data uses formulas to answer some common data analysis questions for the data, like "How many unique keywords are there?", and to my knowledge openpyxl really doesn't like formulas.

I've looked through a lot of Stack Overflow and what I think is the closest to what I would like is something like this: How to access the real value of a cell using the openpyxl module for python. However, none of the options (including the one above) I've tried have worked.

The following code is one of the methods I have tried:

writer = pandas.ExcelWriter(file_name, engine = "openpyxl")
writer.book = openpyxl.load_workbook(file_name, data_only = True)
writer.sheets = dict((ws.title, ws) for ws in writer.book.worksheets)
lastRow = worksheet.max_row

worksheet['L1'] = f"=\"Count of Unique Keywords = \"&sum(--(frequency(match(A2:A{lastRow},A2:A{lastRow},0),row(A2:A{lastRow})-row(A2)+1)>0))"

print (worksheet['L1'].value)

With the above code, what I'm looking to have returned from the print is: Count of Unique Keywords = 123

But instead, I am getting the Excel function:

="Count of Unique Keywords = "&sum(--(frequency(match(A2:A100,A2:A100, 0),row(A2:A100)-row(A2)+1)>0))

Sorry if this has been answered already/if the format of this question isn't perfect. This is the first question I've asked on Stack Overflow.

1 Answers1

0

Have you tried..

worksheet['L1'] = f"=\"CONCATENATE("Count of Unique Keywords = ", SUM(--(FREQUENCY(MATCH(A2:A100,A2:A100, 0),ROW(A2:A100)-ROW(A2)+1)>0)))"

This will produce a single excel cell with "Count of Unique Keywords=" + the output from the formula assuming your function and the supporting data are working properly.

Angel Roman
  • 598
  • 1
  • 3
  • 13
  • Thanks for the response, but this did not work. Just to reiterate the problem, I want to be able to retrieve the value in Python, from Excel, after Python creates the data. In the example above I used a print to return the value of worksheet['L1'], and what I want the print to equal in python is 123 (in the example), but what I'm getting back is the Excel formula instead. If I manage to get the actual value back through Python, I can set the len of whatever worksheet['A1'].value through worksheet['L1'].value equals to an array, then loop through the columns of Excel to make their widths match. – Troy O'Donnell Apr 15 '19 at 16:41
  • I believe I understand your desired output. Can you show me the data set you are working (`print(worksheet)`) with so I may modify my answer to address your question correctly. – Angel Roman Apr 17 '19 at 02:04