0

I try to read a excel file using pandas with the code below:

path = "QVI_transaction_data.xlsx"

I also tried using "./QVI_transaction_data.xlsx" rather than the one above, the name is just copy pasted from os.listdir() so there is no transcription problems

pd.read_excel(path, sheet_name = "in")

but it didn't worked, it outputs this error:

OSError: [Errno 22] Invalid argument

I also tried without the sheet_name argument, others posts say that there is a problem with the filename but I had worked with pandas before and I don't think there is something wrong with the name. anyone knows what is wrong about this?

this is how the file looks like:

enter image description here

  • Can you post the explicit path for `path = "QVI_transaction_data.xlsx" ` like it may have `path = "\home\QVI_transaction_data.xlsx" ` or `path = "/home/QVI_transaction_data.xlsx" `, or better if you could show us how you are reading the file from Directory path? – Karn Kumar Jul 13 '21 at 04:31
  • I am using colab, this is the path, /content/QVI_transaction_data.xlsx" I also tried using old python notebook on my pc having the same error – alexander garcia Jul 13 '21 at 04:36
  • using the full path I also get the same error – alexander garcia Jul 13 '21 at 04:37
  • Do you have a particular sheet name which you are trying to read? if so then pls make sure you are using right name. – Karn Kumar Jul 13 '21 at 04:44
  • yes, the sheet name is "in", I used it – alexander garcia Jul 13 '21 at 04:47
  • Okay, can you try like `df = pd.read_excel(r'/content/QVI_transaction_data.xlsx', sheet_name = 'in')`, does it complain about `xlrd`? if yes, then you can do `import xlrd`. or maybe you can try in case version issue `df = pd.read_excel(r'/content/QVI_transaction_data.xlsx', sheet_name = 'in', engine='openpyxl')` – Karn Kumar Jul 13 '21 at 04:56
  • Is there a possibility to share that file somewhere to reproduced? – Karn Kumar Jul 13 '21 at 05:11
  • What i sence from the Error code , you are using old version of python , you should try `df = pd.read_excel(path, sheetname=in)` – Karn Kumar Jul 13 '21 at 05:40
  • @KarnKumar: you mean older (pre-0.21) version of pandas. Not of Python. – smci Jul 13 '21 at 06:26
  • @smci, oh yes. i forgot to mention pandas, thx for the catch. – Karn Kumar Jul 13 '21 at 06:28

3 Answers3

0

One possible thing that could be done is to convert the excel file (.xlsx) file to .csv file which can be done through file and export it with csv file and then it could be loaded like: -

import pandas as pd
Data=pd.read_csv("File Name...")
print(Data)

Or if you want to load only the excel file directly this could be done: -

import pandas as pds
file =('path_of_excel_file')
newData = pds.read_excel(file)
newData
L Prathyusha
  • 260
  • 1
  • 3
  • 14
0

I have tried in the both possible ways in form of csv and as well as excel. Try something like this: -

import pandas as pds
file =('path_of_excel_file')
newData = pds.read_excel(file)
newData
L Prathyusha
  • 260
  • 1
  • 3
  • 14
  • @L Prathyusha , Reading Excel files may have issues if the `xlrd` or `openpyxl` is running old on the system. – Karn Kumar Jul 13 '21 at 05:06
  • one more possiblity is that to read the file like this: - `Data=pd.read_csv("File Name..."r)` by putting 'r' at the end of the code it might be helpful – L Prathyusha Jul 13 '21 at 05:16
0

As we can not have Data to reproduce, hence there might be different situation and different solutions.

I'm enlisting few situations which may lead you in a right direction...

Situation 1:

If you are using old python Version, then you should try simply below as its sheetname with older version and sheet_name with new version.

import pandas as pd
df = pd.read_excel(file_with_data, sheetname=sheet_with_data)

OR

You can use pd.ExcelFile instead ..

xls = pd.ExcelFile('path_to_file.xls')
df1 = pd.read_excel(xls, 'in')

OR

xl = pd.ExcelFile(path)
# xl = pd.ExcelFile("Full_Path_of _file")

xl.sheet_names
[u'in', u'in1', u'in2']
df = xl.parse("in")
df.head()

OR

df = pd.read_excel(open('your_xls_xlsx_filename','rb'), sheet_name='Sheet 1')
# or using sheet index starting 0
df = pd.read_excel(open('your_xls_xlsx_filename','rb'), sheet_name=1)

Note: Opting sheetname argument needs be opted meticulously for python pandas version-wise>

For Older python Version: use sheetname

For New python version: use sheet_name

Situation 2:

Copy the address directly from the right-click file properties-security will cause this problem So, copying and pasting the file path also produce these issues while there is no other evident issues

solve

It has nothing to do with the backslash /forward slash in the path, and it has nothing to do with whether the path contains. There are two solutions

1 Enter the path manually
2 Open this path in Explorer, then copy

General reading convention:

# When the parameter is None, all tables are returned, which is a dictionary of tables;
sheet = pd.read_excel('example.xls',sheet_name= None)
# When the parameter is list = [0, 1, 2, 3], the returned multi-table is also a dictionary
sheet = pd.read_excel('example.xls',sheet_name= 0)
sheet = pd.read_excel('example.xls',sheet_name= [0,1])
#The data of the table can also be read according to the name of the table header or the position of the table
sheet = pd.read_excel('example.xls',sheet_name= 'Sheet0')
sheet = pd.read_excel('example.xls',sheet_name= ['Sheet0','Sheet1'])
sheet = pd.read_excel('example.xls',sheet_name=[0,1,'Sheet3'])
Karn Kumar
  • 8,518
  • 3
  • 27
  • 53
  • 1
    When the OP isn't clear, ask them clarifications, and close the question if they still won't provide them. Better than an answer speculating about multiple causes. This one wasn't about the path. – smci Jul 13 '21 at 06:35
  • @smci, agreed.. will keep a note. – Karn Kumar Jul 13 '21 at 06:49