I have a problem converting an .xlsx file to .csv using the pandas library. Here is the code:

import pandas as pd


# If pandas is not installed: pip install pandas openpyxl
# (pandas reads .xlsx files through the openpyxl engine)

class Program:
    def __init__(self):
        # file = input("Insert file name (without extension): ")
        file = "Daty"
        self.namexlsx = "D:\\" + file + ".xlsx"
        self.namecsv = "D:\\" + file + ".csv"
        Program.export(self.namexlsx, self.namecsv)

    @staticmethod
    def export(namexlsx, namecsv):
        try:
            read_file = pd.read_excel(namexlsx, sheet_name='Sheet1', index_col=0)
            read_file.to_csv(namecsv, index=False, sep=',')
            print("Conversion to .csv file has been successful.")

        except FileNotFoundError:
            print("File not found, check file name again.")
            print("Conversion to .csv file has failed.")


Program()

After running the code, the console shows the error `ValueError: File is not a recognized excel file`. The file I have in that directory is "Daty.xlsx". I tried a couple of things, such as looking through the documentation and other examples on the internet, but most had similar code.

Edit & update: What I intend to do afterwards is use the created .csv file for conversion to a .db file, so in the end the import chain will be .xlsx -> .csv -> .db. The idea for this program came up as a training exercise, but I can't get past the point described above.
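
For the later .csv -> .db step, here is a minimal sketch, assuming ".db" means a SQLite database and that the CSV has a header row. The function name `csv_to_sqlite` and the default table name `data` are hypothetical, not part of the original program. It uses only the standard library, so it does not depend on the Excel read working.

```python
import csv
import sqlite3


def csv_to_sqlite(csv_path, db_path, table="data"):
    """Load a CSV file with a header row into a SQLite table; return the row count.

    Hypothetical helper: assumes every data row has as many fields as the header.
    """
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row holds the column names
        rows = list(reader)

    # Quote identifiers so column names containing spaces still work.
    columns = ", ".join('"{}"'.format(name.replace('"', '""')) for name in header)
    placeholders = ", ".join("?" for _ in header)
    quoted_table = '"{}"'.format(table.replace('"', '""'))

    connection = sqlite3.connect(db_path)
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute("DROP TABLE IF EXISTS {}".format(quoted_table))
            connection.execute("CREATE TABLE {} ({})".format(quoted_table, columns))
            connection.executemany(
                "INSERT INTO {} VALUES ({})".format(quoted_table, placeholders),
                rows,
            )
    finally:
        connection.close()
    return len(rows)
```

Calling `csv_to_sqlite("D:\\Daty.csv", "D:\\Daty.db")` would then complete the chain once the .csv file exists. All values are stored as text in this sketch; typed columns would need an explicit schema.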

  • Can you use some separate, direct code to read the excel file? That is, can you just run `pd.read_excel(filename, sheet_name='Sheet1', index_col=0)`, and does that work? – scotscotmcc Jun 17 '21 at 15:43
  • Possible duplicate: https://stackoverflow.com/questions/65250207/pandas-cannot-open-an-excel-xlsx-file - try an `engine`. – doctorlove Jun 17 '21 at 15:45
  • @doctorlove Adding `engine='openpyxl'` produces yet another error: `zipfile.BadZipFile: File is not a zip file` – Mr. Hishprung Jun 17 '21 at 16:10
  • @scotscotmcc With only that line after `try:` (without `read_file.to_csv`), I still get the same error – Mr. Hishprung Jun 17 '21 at 16:19
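
The `BadZipFile` error in the comments suggests a quick check that needs no pandas at all: a genuine .xlsx file is a ZIP archive, so the standard `zipfile` module can tell whether the file on disk really has that format. This is a rough sketch; the helper name `looks_like_xlsx` is hypothetical, and checking for `[Content_Types].xml` is only a heuristic for "is an Office Open XML package", not a full validation.

```python
import zipfile


def looks_like_xlsx(path):
    """Return True if the file is a ZIP archive that looks like an OOXML package.

    Hypothetical helper: a heuristic pre-check, not a full workbook validation.
    """
    if not zipfile.is_zipfile(path):
        # For example a CSV or an old-style .xls file that was renamed to .xlsx.
        return False
    with zipfile.ZipFile(path) as archive:
        # Office Open XML packages carry this part at the archive root.
        return "[Content_Types].xml" in archive.namelist()
```

If `looks_like_xlsx("D:\\Daty.xlsx")` returns False, the file is not a real .xlsx workbook regardless of its extension, and re-saving it from a spreadsheet application as .xlsx is worth trying before changing the pandas code.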

2 Answers


You can do it like this:

import pandas as pd
data_xls = pd.read_excel('excelfile.xlsx', sheet_name='Sheet1', index_col=None)
data_xls.to_csv('csvfile.csv', encoding='utf-8', index=False)
scotscotmcc
Manish Kumar
  • Still the same problem. I even skipped the `__init__` part and wrote the file path directly into the call: `data_xls = pd.read_excel('D:\\Daty.xlsx', 'Sheet1', index_col=None)`, and got the same error – Mr. Hishprung Jun 17 '21 at 16:03

I checked the .xlsx file itself, and apparently for some reason it was corrupted, with the columns of the initial file merged into one column. After opening the file and correcting the cells, everything runs smoothly.

Thank you for your time, and apologies for the inconvenience.