0

I want append multiple csv files data into same sheet of single excel sheet with one empty row between data.

1.csv

ID  Currency    Val1    Val2        Month
101 INR     57007037.32 1292025.24  2021-03
102 INR     49171143.9  1303785.98  2021-02

2.csv

ID  Currency    Val1    Val2        Month
103 INR     67733998.9  1370086.78  2020-12
104 INR     48838409.39 1203648.32  2020-11

Now I want to write into same sheet of excel sheet with one empty row like below.

output.xlsx

ID  Currency    Val1    Val2        Month
101 INR     57007037.32 1292025.24  2021-03
102 INR     49171143.9  1303785.98  2021-02

103 INR     67733998.9  1370086.78  2020-12
104 INR     48838409.39 1203648.32  2020-11

Error:

enter image description here

Rummy
  • 51
  • 8
  • Does this answer your question? [Python convert csv to xlsx](https://stackoverflow.com/questions/17684610/python-convert-csv-to-xlsx) – Tomerikoo Mar 08 '21 at 13:20
  • Your csv need to be right delimited (comma, semicolon, ...) without extra spaces, quoted or not, etc. – Corralien Apr 25 '21 at 17:29

4 Answers4

1

1.csv:

ID;Currency;Val1;Val2;Month
101;INR;57007037.32;1292025.24;2021-03
102;INR;49171143.9;1303785.98;2021-02

2.csv:

ID;Currency;Val1;Val3;Month;Year
103;INR;67733998.9;1370086.78;2020-12;2020
104;INR;48838409.39;1203648.32;2020-11;2020

3.csv

ID;Currency;Val2;Year
105;INR;34325309.92;2020
106;INR;18098469.39;2020
import pandas as pd
import numpy as np

dfs = []
files = ["1.csv", "2.csv", "3.csv"]

for csv in files:
    df = pd.read_csv(csv, delimiter=";")
    df = df.append(pd.DataFrame([[np.NaN] * df.shape[1]], columns=df.columns))
    dfs.append(df)

dfs = pd.concat(dfs).to_excel("output.xlsx", na_rep="", index=False)

enter image description here

Edit: problem of columns order

>>> df
    2019-01   2020-01   2020-09  ...   2021-03  2021-03.1   Name  id  currency
0  0.665912  0.140293  0.501259  ...  0.714760   0.586644    Ram   A       INR
1  0.217433  0.950174  0.618288  ...  0.699932   0.219194  Krish   A       INR
2  0.419540  0.788270  0.490949  ...  0.315056   0.312781  Sandy   A       INR
3  0.034803  0.335773  0.563574  ...  0.580068   0.949062  Dhanu   A       INR
>>> BASECOLS = ["id", "currency", "Name"]
>>> cols = BASECOLS + list(reversed(df.columns[~df.columns.isin(BASECOLS)]))
>>> df[cols]
  id currency   Name  2021-03.1   2021-03  ...  2020-09   2020-01   2019-01
0  A      INR    Ram   0.586644  0.714760  ...  0.501259  0.140293  0.665912
1  A      INR  Krish   0.219194  0.699932  ...  0.618288  0.950174  0.217433
2  A      INR  Sandy   0.312781  0.315056  ...  0.490949  0.788270  0.419540
3  A      INR  Dhanu   0.949062  0.580068  ...  0.563574  0.335773  0.034803
Corralien
  • 109,409
  • 8
  • 28
  • 52
  • @ corralien, I'm getting error NameError: name 'np' is not defined. – Rummy Apr 26 '21 at 08:15
  • @ corralien, Now I'm getting error NameError: name 'dfs' is not defined and also my source csv files are delimited by '\t' space. – Rummy Apr 26 '21 at 09:05
  • @ corralien, Thanks for your kind help !! I have made some changes and here is the full code for some one. – Rummy Apr 26 '21 at 10:14
  • @ corralien, Below answer is perfectly working when all csv files columns are same and not working when columns are different. Request your help when we have different columns. – Rummy Apr 27 '21 at 11:28
  • I don't understand it works for me. What's the error? I will update my post soon. – Corralien Apr 27 '21 at 12:12
  • @ corralien, I'm getting in reverse order. attached the screenshot in question. Please help on this !! – Rummy Apr 27 '21 at 13:56
  • Is it just a problem of columns order (or a problem with the column `2021-03.1`)? – Corralien Apr 27 '21 at 14:05
  • @ Corralien, It is not problem with 2021-03.1.It is problem with columns order only. Order should be Id, currency, Name, 2020-03, 2020-03.1, 2020-02, 2020-01, 2019-12. – Rummy Apr 27 '21 at 14:20
  • Next time, open a new question please because there it's a little mess... Don't forget to upvote too if it helps :). I think the question may be close. – Corralien Apr 27 '21 at 14:52
  • @ Corralien, Thank You so Much !!...Can you please put all code lines in single snippet. – Rummy Apr 27 '21 at 15:04
  • No, I can't sorry because answers need two input data types (not same columns names, data format). I use purposely `df` as variable name so you can integrate easily the second snippet code into the loop of the first answer. You have to adapt the code a bit to suit your needs. – Corralien Apr 27 '21 at 15:44
  • @ corralien, It's Ok....I observed that data got reversed when writing into excel sheet. Is there any way to re reverse the data when writing data into excel ?? Any help ?? – Rummy Apr 27 '21 at 16:00
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/231665/discussion-between-corralien-and-rummy). – Corralien Apr 27 '21 at 16:54
0

The pandas package makes this much easier in my opinion:

import pandas as pd

files = [
    'path/to/file1.csv',
    'path/to/file2.csv',
    'path/to/file3.csv',
]


spreadsheet = pd.ExcelWriter('path/to/output.xlsx')

for file in files:
    sheet_name = file.split('.')[0]
    data = pd.read_csv(file)
    data.to_excel(spreadsheet, sheet_name=sheet_name, index=None)

spreadsheet.save()
jfaccioni
  • 7,099
  • 1
  • 9
  • 25
  • @ jfaccioni, I have updated mu question. Please help on this. – Rummy Apr 25 '21 at 15:14
  • @Rummy you should open a new question rather than update an old question. It makes the existing answers obsolete for anyone that searches this topic in the future. – jfaccioni Apr 25 '21 at 15:18
0

I suggest use to use pandas. It has an excellent xlsx writer thant can do the job for you very simple. Basically you have to initialize your excel writer then loop through csvs, read one by one and write to file. I suggest you to use pd.ExcelWriter so xlsx file will be touched only one time. Also mode='a' lets you append sheets to existing excel file, remove it if you want to overwrite the entire file. See docs.

import pandas as pd
with pd.ExcelWriter('output.xlsx', mode='a') as writer:
    #here you loop through csvs and load
    for csv in csvs:
        df = pd.read_csv(csv)
        df.to_excel(writer, sheet_name=csv)
Zeno Dalla Valle
  • 957
  • 5
  • 16
0
import pandas as pd
import numpy as np
import os

try:

    spreadsheet = pd.ExcelWriter('/home/Report.xlsx',engine='xlsxwriter')
    for root, dirs, files in os.walk('/home/'):
        final_data=pd.DataFrame()
        for csv in files:
            df = pd.read_csv(csv)
            df1 = pd.DataFrame([[np.nan] * df.shape[1]], columns=df.columns) 
            final_data=final_data.append(df) 
            final_data=final_data.append(df1)
            final_data.to_excel(spreadsheet,na_rep="",sheet_name='report',index=False)
    
    spreadsheet.save()

except (RuntimeError, TypeError, NameError):
        print("Unable to load data into Excel Sheet")
        raise
Rummy
  • 51
  • 8
  • While this code snippet may solve the problem, it doesn't explain why or how it answers the question. Please [include an explanation for your code](https://meta.stackexchange.com/q/114762/269535), as that really helps to improve the quality of your post. Remember that you are answering the question for readers in the future, and those people might not know the reasons for your code suggestion. You can use the [edit] button to improve this answer to get more votes and reputation! – Brian Tompsett - 汤莱恩 Apr 26 '21 at 11:02