0

I have a folder with multiple *.xlsm-files for example "80-384sec -_november_2017.xlsm", "80-384sec -_december_2017.xlsm", ..... I can read a specific sheet from this file with python like this:

df_xlsmtocsv = pd.read_excel('80-384sec -_november_2017.xlsm', 'sheet3, index_col=None )

And my first solution is something like this:

for file in os.listdir():
    if file.endswith(".xlsm"):
        df_qctocsv = pd.read_excel(file, 'sheet3', index_col=None )
        print(df_qctocsv)
        with open('all.csv', 'a') as f:
            df_qctocsv.to_csv(f, index=True, header=None)

How can I read multiple xlsm-files and append all new messages to a csv-file and order this for example by first column?

After converting I want to copy all this rows from the csv-file to a new sheet in an existing file "messages.xlsx".

Brad Larson
  • 170,088
  • 45
  • 397
  • 571
TigerClaw
  • 167
  • 1
  • 2
  • 13
  • What do you mean by messages? Do you want to append a particular column from each of the files to a dataframe? Or do you want to merge certain sheets from every file and save all this to a new csv file? – jjj Nov 27 '17 at 09:36

1 Answers1

1

There is a lot of ways in which you can join data frames. One possible way is this:

import pandas as pd

df = pd.DataFrame()
for file in os.listdir():
    if file.endswith(".xlsm"):
        df_tmp = pd.read_excel(file, 'Sheet1', index_col=None)
        df = df.append(df_tmp)

df.to_csv('all.csv')

EDIT: If you want to add your dataframe to an existing xlsx file (adapted from here):

from openpyxl import load_workbook

book = load_workbook('<your-xlsx-file>')
wrt = pd.ExcelWriter('<your-output-file>', engine='openpyxl')
wrt.book = book
wrt.sheets = dict((ws.title, ws) for ws in book.worksheets)
df.to_excel(wrt, '<name-of-your-sheet>')
wrt.save()
jjj
  • 1,067
  • 1
  • 15
  • 30
  • But if I want to add the new csv file to an existing xlsx-file but in a new sheet, it overwrites me the file/existing sheet – TigerClaw Nov 27 '17 at 12:02
  • @TigerClaw could you update your question with the relevant part of your code? It should work if you pass an existing `ExcelWriter` object. See notes [here](http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.to_excel.html). – jjj Nov 27 '17 at 12:20
  • @TigerClaw also have a look at this answer: https://stackoverflow.com/a/20221655/4190925 – jjj Nov 27 '17 at 12:56
  • No, if I have a file with sheet1 and sheet2 and use df1.to_excel(writer,'sheet1') with Excel Writer it gaves me a file only with sheet1. Existing sheet2 ist aftere that not available. – TigerClaw Nov 27 '17 at 13:20
  • @TigerClaw I believe it was explained in the answer I linked to. I've added a modified version of it to my answer. – jjj Nov 27 '17 at 13:42
  • I solved this with two different xlsx files. But I write this two files to another file as sheets I lose my color-format ... – TigerClaw Nov 28 '17 at 08:04
  • @TigerClaw No wonder you loose formatting if you convert them to dataframes. If you want to keep your formatting have a look at the `openpyxl` package, but then I don't think you can use `pandas` for merging. Also, if my answer helped you, please accept it. – jjj Nov 28 '17 at 17:15