26

I want to be able to read an Excel file in Python, keep the Python script running doing something else after the reading is finished, and be able to edit the Excel file in another process in the meantime. I'm using python 2.7 and openpyxl.

Currently it looks like:

from openpyxl import load_workbook

def get_excel_data():
    OESwb = load_workbook(filename = OESconfigFile, data_only=True, 
                          read_only=True)
    ws = OESwb.get_sheet_by_name('MC01')
    aValue = ws['A1'].value
    return aValue

val = get_excel_data()

After I run the function, the Excel file is still locked for access from other processes (it gives the error "'filename' is currently in use. Try again later") even when I do not want to read it in Python anymore.

How can I close the file from my script? I've tried OESwb.close() but it gives the error "'Workbook' object has no attribute 'close'". I found this post but it doesn't seem to be helping.

EDIT: It appears OESwb.save('filename.xlsx') works, but only if read_only=False. However, it would be ideal to be able to close the file and still be in readonly mode. It appears this is a bug with openpyxl since it should close the file after load_workbook is finished.

Community
  • 1
  • 1
wordsforthewise
  • 13,746
  • 5
  • 87
  • 117
  • have you tried `OESwb.save()`? – Jeremy Jul 14 '15 at 20:47
  • This is not a bug. In read-only mode the file handler has to be kept open. Any changes to the file would not be noticed by openpyxl so there is little point in trying to edit the file with Excel while reading it with openpyxl. – Charlie Clark Jul 15 '15 at 07:16
  • 1
    I don't understand, why does the file handler have to be kept open in read-only mode? It's not that I want to read in changes to the file once it has been read, I want to read the file as read-only, then be able to edit it so other processes can read the changes. Regardless, I've modified my code to remove the read-only, use iterators, and data only, then save the file to close it. Haven't yet tested if the file closes without saving it if read-only is not enabled. – wordsforthewise Jul 16 '15 at 20:56
  • 1
    @CharlieClark this actually is a problem if for example your server needs to read a file (lets say initial configuration) from an Excel and then keep running for the next 50 days. The Excel file is locked for 50 days even when you have long discarded the workbook handler and would not be reading from it. – Mr. Napik Apr 06 '16 at 10:32
  • @Mr.Napik reading config from an Excel file? Well, you can use one of the proposed solutions to close the archive when you're done with it. – Charlie Clark Apr 06 '16 at 13:51
  • @CharlieClark "Reading config from an Excel file?" Assume that "config file" can mean something like a meeting agenda or an input for pay calculation or other data prepared by normal users before starting the script. – Mr. Napik Apr 07 '16 at 20:27
  • None of those sound like particularly likely use cases for a long-running process. In standard mode the file handle will be garbage collected and read-only mode also sounds unusual for the situation. Whatever, please feel free to submit a PR with tests. – Charlie Clark Apr 08 '16 at 07:41

8 Answers8

32

For some draconian reason, stackoverflow will allow me to post an answer but I don't have enough 'rep' to comment or vote -- so here we are.

The accepted answer of wb._archive.close() did not work for me. Possibly this is because I am using read-only mode. It may work fine when in 'normal' mode.

bmiller's answer is the only answer that worked for me as well:

with open(xlsx_filename, "rb") as f:
    in_mem_file = io.BytesIO(f.read())

wb = load_workbook(in_mem_file, read_only=True)

And as he said, it is faster when loading with open() versus only using read-only.

My working code based on bmiller's answer:

import openpyxl
import io

xlsx_filename=r'C:/location/of/file.xlsx'
with open(xlsx_filename, "rb") as f:
    in_mem_file = io.BytesIO(f.read())

wb = openpyxl.load_workbook(in_mem_file, read_only=True)
R Tsch
  • 313
  • 3
  • 14
Patrick Conwell
  • 630
  • 1
  • 5
  • 10
  • what is the pint of this? Why are you using read only mode to read the whole file into memory? According to the docs this mode is for optimized reads of large notebooks. If you can store the whole thing in memory why not drop the read_only flag? – Jamie Marshall Jan 07 '21 at 19:41
  • In my experience, even if the file is read into memory, large excel files still perform poorly without the read only flag. The read only flag significantly increases ops on large files, regardless of where they are read from. – Patrick Conwell Apr 13 '21 at 16:30
  • Hmmm, I'm not not quite understanding what your saying. I think you might not be fully understanding the operation here. – Jamie Marshall Apr 14 '21 at 04:09
  • There are two different operations in play here. The python operation of reading the file into memory, and the openpyxl feature to optimize reading large excel files. These operations are independent of each other. – Patrick Conwell May 11 '21 at 13:40
  • 2
    Thank you. I have been scratching my head for hours as why is python not unloading this excel from the memory and why I wasn't able to delete that bloody file using stutil in the next line. Your solution worked! – Prashant Kumar Sep 20 '21 at 17:26
21

I've tried all these solutions for closing an xlsx file in read-only mode and none seem to do the job. I finally ended up using an in-mem file:

with open(xlsx_filename, "rb") as f:
    in_mem_file = io.BytesIO(f.read())

wb = load_workbook(in_mem_file, read_only=True)

Might even load faster and no need to worry about closing anything.

bmiller
  • 1,454
  • 1
  • 14
  • 14
14
wb._archive.close()

Works with use_iterator too.

user5859387
  • 180
  • 1
  • 2
  • 6
    Calling `.close()` on a read-only workbook does not seem to be enough. Until the Python process exits, the Excel itself cannot write to the file. – Jan Dolejsi May 12 '20 at 20:46
5

For your latest information, openpyxl 2.4.4+ provides Workbook.close() method. Below are references.

http://openpyxl.readthedocs.io/en/stable/changes.html?highlight=close#id86
https://bitbucket.org/openpyxl/openpyxl/issues/673

Leonard2
  • 894
  • 8
  • 21
  • 3
    Calling `.close()` on a read-only workbook does not seem to be enough. Until the Python process exits, the Excel itself cannot write to the file. – Jan Dolejsi May 12 '20 at 20:45
3

I also found this to be a problem, and think it is strange that workbooks have no close method.

The solution I came up with was a context manager which "closes" the file for me so that I don't have put meaningless saves in my code every time I read a spreadsheet.

@contextlib.contextmanager
def load_worksheet_with_close(filename, *args, **kwargs):
    '''
    Open an openpyxl worksheet and automatically close it when finished.
    '''
    wb = openpyxl.load_workbook(filename, *args, **kwargs)
    yield wb
    # Create path in temporary directory and write workbook there to force
    # it to close
    path = os.path.join(tempfile.gettempdir(), os.path.basename(filename))
    wb.save(path)
    os.remove(path)

To use it:

with load_worksheet_with_close('myworkbook.xlsx') as wb:
    # Do things with workbook
  • Maybe consider merging your context solution with @user5859387's method to avoid having to save and delete! – CodeJockey Apr 07 '17 at 20:04
2

You can try:

wb = None

to free the resources, and load it again as soon as you need it again, in the same or other variable.

MASR
  • 21
  • 2
  • 1
    @lodebari it only works if you don't have other variables associated with that particular workbook in memory. For example, if you have a=workbook() and b=a, setting a=None is not enough to cause python to automatically collect garbage. You have to set b=None as well. – Ken T Jan 21 '17 at 08:56
  • 1
    I added `import gc; gc.collect(); os.remove(filename)` after setting my ref to `None`. It worked well. Thank you. – phyatt Mar 30 '22 at 04:48
2

Use OESwb._archive.close() This will close the additional ZipFile filehandle which was hold open in 'read_only=True' Mode. Be aware, after close you could not read more Data from OESwb. Be also aware, this ist a workaround and _archive could be removed in a future Version.

fill͡pant͡
  • 1,147
  • 2
  • 12
  • 24
stovfl
  • 14,998
  • 7
  • 24
  • 51
1

To close, I believe you need to save the file:

OESwb.save('filename.xlsx')

Hope this helps.

jharrison12
  • 150
  • 8
  • This works if not in read only mode, but the file should be closed anyway after load_workbook. Opening an issue on bitbucket for it, maybe I can fix it if I can figure it out. – wordsforthewise Jul 14 '15 at 21:11
  • this doesn't work for me. I am in read/write, and saving still leaves the file open.. THis is a big problem for me... – clg4 Mar 11 '16 at 18:48
  • @wordsforthewise openpyxl streams content from the file rather than loading it all into memory at once, so no, it can't close after load_workbook as it's still reading the content you request from the workbook. – sage88 Sep 28 '17 at 05:26