0

Please find the code below:

import pandas as pd
import csv

# Reading the csv file 
df_new = pd.read_csv('source.csv') 

# saving xlsx file 
GFG = pd.ExcelWriter('source.xlsx') 
df_new.to_excel(GFG, index = False) 

GFG.save() 

# read excel
xl = pd.ExcelFile("source.xlsx")
df = xl.parse("Sheet1")

# get the column you want to copy
column = df["Marks"]

# paste it in the new excel file
with pd.ExcelWriter('Target_excel.xlsx', mode='A') as writer:
    column.to_excel(writer, sheet_name= "new sheet name", index = False)

writer.close()

In this code, it is replacing the existing contents of the target excel file. I want to update a column in sheet 2 without changing other columns.

Example:

Excel file 1--> column_name = 'Marks'

Marks = 10,20,30

Excel file 2--> there are two columns present in this file

Subject_name = Math, English, Science

Marks = 50, 20, 40

So I want to copy "Marks" column from Excel file 1 and paste it into "Marks" column of Excel file 2(Without changing the data of "Subject" column)

  • So you want to INSERT a new column and then copy the contents to the inserted one? – Joonyoung Park Oct 07 '20 at 05:44
  • @JoonyoungPark....No, I want to insert data into an existing column of sheet 2. – Pratik Wagh Oct 07 '20 at 06:43
  • Looks like a similar problem to [this](https://stackoverflow.com/questions/20219254/how-to-write-to-an-existing-excel-file-without-overwriting-data-using-pandas). I see "cols" parameter in "to_excel" function. – Joonyoung Park Oct 07 '20 at 07:53
  • But there is no "cols" parameter available in "to_excel" function – Pratik Wagh Oct 07 '20 at 09:27
  • [pandas to_excel()](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.to_excel.html) The document says it's actually "columns". Sorry if I confused you. Thought it was "cols" due to the link in previous comment. – Joonyoung Park Oct 08 '20 at 01:56
  • I tried to give column name in "columns " but it is giving error.....I gave column.to_excel(writer, sheet_name= "new sheet name", index = False, columns = ['Marks']) – Pratik Wagh Oct 08 '20 at 06:25
  • Ah so it was about overwriting problem. [This article](https://medium.com/@soulsinporto/how-to-add-new-worksheets-to-excel-workbooks-with-pandas-47122704fb75) seems to give solutions for excel file overwriting problem. – Joonyoung Park Oct 08 '20 at 06:58
  • The solution is quite good....but my problem is different.....I want to copy data from one excel file and paste it into 2nd excel file(into a particular column...say column name as 'Test_1') without replacing the other items of 2nd excel file – Pratik Wagh Oct 08 '20 at 09:58
  • Found a simpler way [Here](https://stackoverflow.com/questions/31395058/how-to-write-to-a-new-cell-in-python-using-openpyxl). – Joonyoung Park Oct 09 '20 at 17:22

1 Answers1

0
import pandas as pd
import openpyxl as pxl

def get_col_idx(worksheet, col_name):
    return next((i for i, col in enumerate(worksheet.iter_cols(1, worksheet.max_column)) if col[0].value == col_name), -1)

### ----- 0. csv -> xlsx (no change from your code)

df_new = pd.read_csv("source.csv")

GFG = pd.ExcelWriter("source.xlsx")
df_new.to_excel(GFG, index=False)
GFG.save()

### ----- 1. getting data to copy

# open file and get sheet of interest
source_workbook = pxl.load_workbook("source.xlsx")
source_sheet = source_workbook["Sheet1"]

# get "Marks" column index
col_idx = get_col_idx(source_sheet, "Marks")

# get contents in each cell
col_contents = [row[col_idx].value for row in source_sheet.iter_rows(min_row=2)]

### ----- 2. copy contents to target excel file

target_workbook = pxl.load_workbook("Target_excel.xlsx")
target_sheet = target_workbook["new sheet name"]

col_idx = get_col_idx(target_sheet, "Marks")

for i, value in enumerate(col_contents):
    cell = target_sheet.cell(row=i+2, column=col_idx+1)
    cell.value = value

target_workbook.save("Target_excel.xlsx")
Joonyoung Park
  • 474
  • 3
  • 6