0

I have to copy data from different workbooks and paste it into a master workbook. All the workbooks are located in a folder: C:\Users\f65651\data transfer. The copied data should be merged into one and then overwritten into the Master wkbk cells. Subsequently also, data from updated workbooks should be overwritten in the Master wkbk.

After some help, I have been able to incorporate all the excel workbooks together

import openpyxl as xl
import os

path1 ='C:\\Users\\f65651\Rresult.xlsx' #Master workbook
wb1 = xl.load_workbook(filename=path1)
ws1 = wb1.worksheets[0]

#iterating over the workbooks
for filename in os.listdir(directory):
    if filename.endswith(".xlsx"):
        g= os.path.join(directory, filename)
        f =xl.load_workbook(filename=g)
        f1 = f.worksheets[0]
        print (filename, f1)
        for row in f1:
            values=[cell.value for cell in row]
            ws1.append(values)
wb1.save(path1)
print ('Process finished!')

However with this code above, the data is appended under the Master wkbk existing table format instead of being overwritten directly into the cells

I have tried fixing this issue but i dont know how. I feel i am not doing the copying of the workbooks into the Master wkbk right. I also dont want to lose the formatting in the Master sheet. Please help!

For better understanding of the problem, I have attached a snippet of what i am trying to achieve, Data 1&2 are examples of the workbks and the Result file is the master sheet.

https://i.stack.imgur.com/0G4lM.png

Fathi
  • 15
  • 7
  • do you have to use only openpyxl? with pandas it will be easier to concat all the xl worbooks into one – Murali Aug 27 '19 at 07:42
  • @Abdlfatah: If you copy from **multiple** sources into **one** worksheet, you have to increment either the `row` or the `colum` of the target worksheet. You can't use the same `[cell.coordinate]` again and again. – stovfl Aug 27 '19 at 09:22
  • @Abdlfatah: ***"modify my code to how I can do that"***: Without your [edit]ed question that show your **expected result** and without `example data`, at least `2 tables with 3 rows and 3 columns` - **No**. Don't use `.load_workbook(...`, all have to be [mcve]. – stovfl Aug 27 '19 at 10:46
  • @Abdlfatah: Read [Worksheet.append](https://openpyxl.readthedocs.io/en/stable/api/openpyxl.worksheet.worksheet.html?highlight=append#openpyxl.worksheet.worksheet.Worksheet.append), simple loop the `row` create a `iterable .values` from the row as described and do `ws_results.append(...`. – stovfl Aug 27 '19 at 12:14
  • @Abdlfatah: Replace your `for cell in row:...` with `values = [cell.value for cell in row]; ws2.append(values)`. Repeat this for all your other Workbooks using [Find all files in a directory](https://stackoverflow.com/a/3964691/7414759) – stovfl Aug 27 '19 at 14:47
  • @stovfl So I tried what you said but I always keep getting AttributeError: 'str' object has no attribute 'value' I have tried to debug this error to no avail – Fathi Sep 02 '19 at 06:56
  • @Abdlfatah: ***"AttributeError: 'str' ..."***: You have changed from `for row in ws1:` to `for row in data:`. Your previous **outer** `for ...` is neede here. – stovfl Sep 02 '19 at 07:31

1 Answers1

0
from openpyxl import load_workbook
import os

directory = "workbooks"
master = Workbook()
master_sheet = master.active
master_sheet.title = "master_sheet"

for filename in os.listdir(directory):
    if filename.endswith(".xlsx"):
        file_path = os.path.join(directory, filename)
        sheet = load_workbook(file_path).active

        # Read each column's value of each excel sheet starting from row 3
        for index, row in enumerate(sheet.iter_rows()):
            if (index <= 1):
                for cell in row:
                    master_sheet[cell.coordinate].value = cell.value
            else:
                row_dict = {cell.coordinate[:1]:cell.value for cell in row}
                master_sheet.append(row_dict)
master.save("sheet3.xlsx")
Mazimia
  • 192
  • 2
  • 11