2

I'm having an issue with saving an Excel file in openpyxl. I'm trying to create a processing script which would grab data from one excel file, dump it into a dump excel file, and after some tweaking around with formulas in excel, I will have all of the processed data in the dump excel file. My current code is as so.

from openpyxl import load_workbook
import os
import datetime
from openpyxl.cell import get_column_letter, Cell, column_index_from_string, coordinate_from_string

dump = dumplocation
desktop = desktoplocation
date = datetime.datetime.now().strftime("%Y-%m-%d")


excel = load_workbook(dump+date+ ".xlsx", use_iterators = True)

sheet = excel.get_sheet_by_name("Sheet1")

try:
    query = raw_input('How many rows of data is there?\n')
except ValueError:
    print 'Not a number'

#sheetname = raw_input('What is the name of the worksheet in the data?\n')
for filename in os.listdir(desktop):
    if filename.endswith(".xlsx"):
        print filename

        data = load_workbook(filename, use_iterators = True)
        ws = data.get_sheet_by_name(name = '17270115')

    #copying data from excel to data excel
        n=16
        for row in sheet.iter_rows():
            for cell in row:
                for rows in ws.iter_rows():
                    for cells in row:
                        n+=1
                        if (n>=17) and (n<=32):
                            cell.internal_value = cells.internal_value


    #adding column between time in UTC and the data
        column_index = 1
        new_cells = {}
        sheet.column_dimensions = {}
        for coordinate, cell in sheet._cells.iteritems():
            column_letter, row = coordinate_from_string(coordinate)
            column = column_index_from_string(column_letter)

    # shifting columns
            if column >= column_index:
                column += 1

            column_letter = get_column_letter(column)
            coordinate = '%s%s' % (column_letter, row)

    # it's important to create new Cell object
            new_cells[coordinate] = Cell(sheet, column_letter, row, cell.value)

        sheet.cells = new_cells
    #setting columns to be hidden
        for coordinate, cell in sheet._cells.iteritems():
            column_letter, row = coordinate_from_string(coordinate)
            column = column_index_from_string(column_letter)

            if (column<=3) and (column>=18):
               column.set_column(column, options={'hidden': True})

A lot of my code is messy I know since I just started Python two or three weeks ago. I also have a few outstanding issues which I can deal with later on. It doesn't seem like a lot of people are using openpyxl for my purposes. I tried using the normal Workbook module but that didn't seem to work because you can't iterate in the cell items. (which is required for me to copy and paste relevant data from one excel file to another)

UPDATE: I realised that openpyxl can only create workbooks but can't edit current ones. So I have decided to change tunes and edit the new workbook after I have transferred data into there. I have resulted to using back to Workbook to transfer data:

from openpyxl import Workbook
from openpyxl import worksheet
from openpyxl import load_workbook
import os
from openpyxl.cell import get_column_letter, Cell, column_index_from_string, coordinate_from_string

dump = "c:/users/y.lai/desktop/data/201501.xlsx"
desktop = "c:/users/y.lai/desktop/"



excel = Workbook()

sheet = excel.add_sheet

try:
    query = raw_input('How many rows of data is there?\n')
except ValueError:
    print 'Not a number'

#sheetname = raw_input('What is the name of the worksheet in the data?\n')
for filename in os.listdir(desktop):
    if filename.endswith(".xlsx"):
        print filename

        data = load_workbook(filename, use_iterators = True)
        ws = data.get_sheet_by_name(name = '17270115')

    #copying data from excel to data excel
        n=16
        q=0
        for x in range(6,int(query)):
            for s in range(65,90):   
                for cell in Cell(sheet,chr(s),x):
                    for rows in ws.iter_rows():
                        for cells in rows:
                            q+=1
                            if q>=5:
                                n+=1
                                if (n>=17) and (n<=32):
                                    cell.value = cells.internal_value

But this doesn't seem to work still

    Traceback (most recent call last):
      File "xxx\Desktop\xlspostprocessing.py", line 40, in <module>
    for cell in Cell(sheet,chr(s),x):
      File "xxx\AppData\Local\Continuum\Anaconda\lib\site-packages\openpyxl\cell.py", line 181, in __init__
    self._shared_date = SharedDate(base_date=worksheet.parent.excel_base_date)
    AttributeError: 'function' object has no attribute 'parent'

Went through the API but..I'm overwhelmed by the coding in there so I couldn't make much sense of the API. To me it looks like I have used the Cell module wrongly. I read the definition of the Cell and its attributes, thus having the chr(s) to give the 26 alphabets A-Z.

bomdaworld
  • 23
  • 1
  • 4
  • openpyxl can edit existing workbooks, just not if you open them in read-only mode. This discussion has now moved to the mailing list. – Charlie Clark Feb 02 '15 at 11:33
  • @CharlieClark I'm having this trouble saving, even though I'm not in read-only http://stackoverflow.com/questions/31237229/why-is-the-iteration-over-this-loop-not-adding-cells-in-openpyxl. Unable to read the Excel log files, unfortunately. Any idea what might be the issue? Happens whether I try save to a new workbook OR the one I've loaded. – Pyderman Jul 06 '15 at 03:44
  • @CharlieClark Another saving issue on Windows, this time with an absolute filename and the filename wrapped in `r''`: http://stackoverflow.com/questions/31362887/even-with-r-prepended-to-filename-openpyxl-not-able-to-save-on-windows – Pyderman Jul 12 '15 at 06:29

2 Answers2

1

You can iterate using the standard Workbook mode. use_iterators=True has been renamed read_only=True to emphasise what this mode is used for (on demand reading of parts).

Your code as it stands cannot work with this method as the workbook is read-only and cell.internal_value is always a read only property.

However, it looks like you're not getting that far because there is a problem with your Excel files. You might want to submit a bug with one of the files. Also the mailing list might be a better place for discussion.

Charlie Clark
  • 18,477
  • 4
  • 49
  • 55
  • So for my code the problem is that the one i'm writing in is "read only"? (ie excel in my code) +it's hard because the data comes in a .tst format which i delimit it manually before doing this process it seems like after i apply the code that the file gets corrupted Thanks for the help though :) – bomdaworld Jan 29 '15 at 22:10
0

You could try using xlrd and xlwt instead of pyopenxl but you might find exactly what you are looking to do already available in xlutil - all are from python-excel.

Steve Barnes
  • 27,618
  • 6
  • 63
  • 73