2

I want to copy an specific rows and columns one sheet to another with "openpyxl" method. But my main excel file is .xlsb file and "openpyxl" is not supporting .xlsb file.So I build that complicated way. (* I can not change the .xlsb from Microsoft Excel according to company rules).

main document.xlsb file->temporary document.xlsx->my analyse document.xlsx

-Firstly, I' m changing the data format .xlsb to .xlsx with pandas.

-After that, from temporary document.xlsx, I' m taking specific columns and rows with openpyxl method and paste to my analyse document.xlsx

-My question is: I want to change D column format from "general" to "short date" and I am beginner at Python. Would you please help me about codes? Also if I can change the format cell in ".xlsb to .xlsx tranformation period" maybe I can take input from user: "which date do you want to append 'my analyse document.xlsx?'"

'main document.xlsx'

'temporary document.xlsx'

'my analyse document.xlsx'

import pandas as pd
import openpyxl

df = pd.read_excel("main document.xlsb",sheet_name="Data", engine="pyxlsb")
df.to_excel("temporary document.xlsx")

#! Python 3
# - Copy and Paste Ranges using OpenPyXl library

# Prepare the spreadsheets to copy from and paste too.

# File to be copied
wb = openpyxl.load_workbook("temporary document.xlsx")  # Add file name
sheet = wb["Sheet1"]  # Add Sheet name

# File to be pasted into
template = openpyxl.load_workbook("my analyse document.xlsx")  # Add file name
temp_sheet = template["Sheet2"]  # Add Sheet name


# Copy range of cells as a nested list
# Takes: start cell, end cell, and sheet you want to copy from.
def copyRange(startCol, startRow, endCol, endRow, sheet):
    rangeSelected = []
    # Loops through selected Rows
    for i in range(startRow, endRow + 1, 1):
        # Appends the row to a RowSelected list
        rowSelected = []
        for j in range(startCol, endCol + 1, 1):
            rowSelected.append(sheet.cell(row=i, column=j).value)
        # Adds the RowSelected List and nests inside the rangeSelected
        rangeSelected.append(rowSelected)

    return rangeSelected


# Paste range
# Paste data from copyRange into template sheet
def pasteRange(startCol, startRow, endCol, endRow, sheetReceiving, copiedData):
    countRow = 0
    for i in range(startRow, endRow + 1, 1):
        countCol = 0
        for j in range(startCol, endCol + 1, 1):
            sheetReceiving.cell(row=i, column=j).value = copiedData[countRow][countCol]
            countCol += 1
        countRow += 1


def createData():
    print("Processing...")
    selectedRange = copyRange(2, 2011, 183, 2274, sheet)  # Change the 4 number values (startCol, startRow, endCol, endRow, sheet)
    pastingRange = pasteRange(2, 4573, 182, 4836, temp_sheet, selectedRange)  # Change the 4 number values (startCol, startRow, endCol, endRow, sheet)
    # You can save the template as another file to create a new file here too.s
    template.save("my analyse document.xlsx")
    print("Range copied and pasted!")

go= createData()
piseynir
  • 237
  • 1
  • 4
  • 14

3 Answers3

3

Yeah have a look here in the docs:

import xlsxwriter

workbook = xlsxwriter.Workbook('your_file.xlsx')
worksheet = workbook.add_worksheet()

cell_format05 = workbook.add_format()
cell_format05.set_num_format('mm/dd/yy')
worksheet.write(5, 0, 36892.521, cell_format05)  # output -> 01/01/01

# untested code for you, get your cells into column_D8_downwards array
# this is rather a structural code, not functional!
row = 0
for cell in column_D8_downwards:
    worksheet.write(row, 'D8', cell, cell_format_05)
    row=+1

workbook.close()

So iterate over all cells in your column (D8 downwards) and write the old value into the cell with the new format.

Here is a good example.

zypro
  • 1,158
  • 3
  • 12
  • 33
  • Thanks for reply. I am new at python, according to that, I can not apply the codes very well. The codes which you mention is not worked. This error occured:" NameError: name 'cell_format05' is not defined" – piseynir Mar 30 '20 at 14:53
  • yeah, it was just copied out of the examples in the docs. You get a format if you add one to the workbook. – zypro Mar 30 '20 at 15:18
0

You can also do the same after reading the data with pandas.DataFrame with help of xlrd:

import xlrd
import pandas as pd

df = pd.read_csv('Your_File.csv')
df = df[6:].reset_index()
df.columns = df.iloc[0]

df['Date'] = df['Date_Int'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))

print(df)

   TARİH       Date
0     43891 2020-03-01
1     43892 2020-03-02
2     43893 2020-03-03
3     43894 2020-03-04
4     43895 2020-03-05

Further you can change the date format as per your requirement.

ManojK
  • 1,570
  • 2
  • 9
  • 17
  • Thanks, but this error occured; "KeyError: 'Date_Int" with your solution – piseynir Mar 30 '20 at 14:52
  • Change this `df['Date'] = df['TARIH'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))` because `Date_Int` is not the name of column in your file. It is a column which I added in my example data. – ManojK Mar 30 '20 at 14:54
  • File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item KeyError: 'TARİH' this error occured. 'TARİH' is not column name, maybe it is related with that. – piseynir Mar 30 '20 at 15:00
  • `import pandas as pd import xlrd df = pd.read_excel("read file.xlsb",sheet_name="Data", engine="pyxlsb") df.to_excel("2nd file.xlsx") df['Date'] = df['TARİH'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))` I' m doing like this, because after that i will use "openpyxl" and openpyxl doesn't work with .xlsb file. I will take specific rows and columns from "2nd file.xlsx" Here is my specific rows,columns copy method: https://yagisanatode.com/2017/11/18/copy-and-paste-ranges-in-excel-with-openpyxl-and-python-3/ – piseynir Mar 30 '20 at 15:48
  • Looks like you are trying to read the excel file with some blank rows at the top as shown in the picture, please delete rows from 1 to 6 in excel file and read again, ensure the row no. 7 is the header in your pandas dataframe. Then run the code `df['Date'] = df['TARİH'].apply(lambda x: xlrd.xldate.xldate_as_datetime(x, 0))` – ManojK Mar 30 '20 at 15:55
  • Yes it will work, but that i can not change "read file.xlsb" because that file is using by another department in company. Due to that i need to solve with python. Maybe i can delete it with another python code. Do you have any suggestions to delete first 6 columns? – piseynir Mar 30 '20 at 15:59
  • Read the excel in pandas dataframe as it is, then see the index number for the expected header (Python index starts from 0), then run this code if required modify the number `df = df[6:].reset_index()` and ensure that your dataframe has correct headers at the top then do the following `df.columns = df.iloc[0]` now your df should have correct headers then run the remaining code. – ManojK Mar 30 '20 at 16:10
  • df = df[6:].reset_index() df.columns = df.iloc[0] I tried but didn't work. Still headers are same with first image. – piseynir Mar 30 '20 at 16:22
  • I have edited my answer above, you can run the code in the same sequence. – ManojK Mar 30 '20 at 16:32
  • Last thing you mentioned didn't work also. I edited my question, maybe there is a different way to solve. – piseynir Mar 30 '20 at 17:45
  • I will check and reply tomorrow – ManojK Mar 30 '20 at 21:11
0

Problem solved in very simple way, I shocked when I solved it.

1-Open "my analyse document.xlsx"

2-Select the whole "D" column

3-Click home button

4-Format cell-> Short Date

Even I update the excel data with python, column format type is not changing.

Thanks for all supports.

piseynir
  • 237
  • 1
  • 4
  • 14