1

The following code works correctly the first time creating tabs x1 and x2 and fills with data. When executed again for new data it doesn't check if the tabs already exist and adds new erroneous tabs x11 and x21.

import pandas as pd
import numpy as np
from openpyxl import load_workbook
import os

filename = os.path.dirname(os.path.abspath(__file__))+ '\\Test_excel.xlsx' # path to current directory adding file name

book = load_workbook(filename)
writer = pd.ExcelWriter(filename, engine='openpyxl')
writer.book = book

x1 = np.random.randn(10, 4)
df1 = pd.DataFrame(x1)

x2 = np.random.randn(20, 2)
df2 = pd.DataFrame(x2)

df1.to_excel(writer, sheet_name = 'x1')
df2.to_excel(writer, sheet_name = 'x2')
writer.save()
writer.close()

I need it to overwrite the existing tabs data if the tab already exists otherwise create the tab. Any ideas?

ReKx
  • 996
  • 2
  • 10
  • 23
Keith W
  • 43
  • 2
  • 12
  • Why are you manually calling `load_workbook` and assigning the book to `writer.book`? If you won't do it you will get the exact behaviour you are looking for – DeepSpace Jul 11 '18 at 18:10
  • I don't want it to remove any other tabs that may already exist. When I remove `load_workbook` and `writer.book` commands it removes all but the two I am writing. – Keith W Jul 11 '18 at 18:20

2 Answers2

0

You asked a question that I also was going to run into soon in my own project. One solution is to look for the sheet and use the exception read_excel() raises. If there's no exception, then the sheet exists, so delete it. Here's the code I just tried out on Jupiter:

from xlrd import XLRDError

try:
    df = pd.read_excel( writer, sheet_name='x1')
except XLRDError: 
    print ("No sheet x1 found")
else:
    print ("Removing sheet")
    del book['x1']
David Gaertner
  • 386
  • 2
  • 7
  • I'm not using Jupyter, My Python 3.6 is not recognizing the `XLRDError`. I get `NameError: name 'XLRDError' is not defined`. – Keith W Jul 11 '18 at 18:40
  • Add `from xlrd import XLRDError` to your code. That solved it for me. – David Gaertner Jul 11 '18 at 18:49
  • I appreciate your assistance. I'm looking at this and trying to understand why a function from an otherwise unused module is needed to find an error for the try function? It looks like it might work but a mystery as to understanding what is happening. Any insight? – Keith W Jul 11 '18 at 20:50
  • Okay, so I ran it to force the error and the trace shows that `pandas` uses `xlrd` in the background to check the sheet name. When it doesn't exist it kicks up the `XLRDError`. Your code trys that and bypasses the error if it is not there. The next command creates the sheet. Wow! Still learning python and now a way to try something and take advantage of possible errors! – Keith W Jul 11 '18 at 21:16
  • I help others in order to learn more myself. I'm glad this helped you! – David Gaertner Jul 11 '18 at 22:04
  • I later am having trouble with the above suggestion. I found the following link that provides a full function that works consistently. Thanks to @MaxU ! [link](https://stackoverflow.com/a/49521465/9886639) – Keith W Jul 14 '18 at 20:02
  • @KeithW, Thanks for that link to a more complete answer than mine. What trouble did you have? – David Gaertner Jul 15 '18 at 16:43
  • I was getting the same error as the title of the referenced link. _Getting AttributeError 'Workbook' object has no attribute 'add_worksheet' - while writing data frame to excel sheet_. The function by @MaxU seems quite versatile with how data is inserted into excel. I hope it works well for you too! – Keith W Jul 16 '18 at 00:28
0

The quickest way to find out whether a sheet exists in an Excel file is probably to use openpyxl's read-only mode.

from openpyxl import load_workbook
wb = load_workbook("file.xlsx", read_only=True)
'x1' in wb.sheetnames

For full control over working with dataframes openpyxl also provides some utilities that will let you place and format a dataframe exactly where you want to.

Charlie Clark
  • 18,477
  • 4
  • 49
  • 55