Read xlsx stored on sharepoint location with openpyxl in python?

Question

quick one.

I have XLSX file located on sharepoint drive and cannot open it using openpyxl in python, it works well if it is stored on my local drive.

I tried this.

from openpyxl import load_workbook
wb = load_workbook('https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx')

Throws up this exception:

C:\Anaconda\lib\site-packages\openpyxl\reader\excel.py in load_workbook(filename, use_iterators, keep_vba, guess_types, data_only)
    123     except (BadZipfile, RuntimeError, IOError, ValueError):
    124         e = exc_info()[1]
--> 125         raise InvalidFileException(unicode(e))
    126     wb = Workbook(guess_types=guess_types, data_only=data_only)
    127 

InvalidFileException: [Errno 22] invalid mode ('rb') or filename: 'https://...

Am I missing something? I need to read the content of one of the sheets in python.

EDIT:

Using crussell's advice, I receive 401 UNAUTHORIZED:

import requests
import urllib
from openpyxl import load_workbook
from requests.auth import HTTPBasicAuth

file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"

username = 'PotatoUser'
password = 'PotatoPassword'

resp=requests.get(file, auth=HTTPBasicAuth(username, password))
print(resp.content)

Seems like sharepoint and requests are not compatible, with both Digest Authentication and Basic Authentication http://docs.python-requests.org/en/latest/user/authentication/

score 4 · Answer 1 · edited Apr 09 '22 at 10:37

4

Instead of trying to load directly from a web-address, try using urllib.

import urllib
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
urllib.urlretrieve(file,"test.xlsx")

From further research, requests may be preferred over urllib. Try this:

import requests
from requests.auth import HTTPBasicAuth
file = "https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx"
        
username = 'myUsername'
password = 'myPassword'
        
resp=requests.get(file, auth=HTTPBasicAuth(username, password))
output = open('test.xlsx', 'wb')
output.write(resp.content)
output.close()

To get requests installed:

pip install requests

edited Apr 09 '22 at 10:37

linux_pangolin

61
1
7

answered Dec 10 '15 at 14:12

crussell

179
4

One forward slash throws this exception: URLError: – Gunay Anach Dec 10 '15 at 14:20
Thanks crussell, this seems to be right direction for me, now need to fight with authentication request while accessing the file. IOError: ('http error', 401, 'Unauthorized' ... – Gunay Anach Dec 10 '15 at 15:29
Is the file encrpyted? – crussell Dec 10 '15 at 15:40
The web server seems to be expecting some authentication credentials, do you have a username and password for the website? – crussell Dec 10 '15 at 15:46
Yes i do have credentials and the file is not encrypted – Gunay Anach Dec 10 '15 at 15:58
1

Thank you for your suggestions crussell, It seems like there is authentication issues which cannot be handled by **requests**. Will keep looking for another method. – Gunay Anach Dec 11 '15 at 12:11
1

Instead of basic auth since this is internal sharepoint, I used this instead - from requests_negotiate_sspi import HttpNegotiateAuth resp=requests.get(file, auth=HttpNegotiateAuth()) – ThinkCode Jun 28 '20 at 23:01

score 0 · Answer 2 · answered Dec 10 '15 at 14:08

0

You probably first need to download it first rather than opening it directly. The following approach should work:

import urllib2
from openpyxl import load_workbook
import StringIO

data = urllib2.urlopen("https://content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx")
xlsx = data.read()
wb = load_workbook(StringIO.StringIO(xlsx))

Python's StringIO could be used to make the downloaded data appear as a file interface.

answered Dec 10 '15 at 14:08

Martin Evans

45,791
17
81
97

Thanks Martin for your suggestion, however I get : 560 class HTTPRedirectHandler(BaseHandler): HTTPError: HTTP Error 400: Bad Request – Gunay Anach Dec 10 '15 at 14:18
I can open it in a browser, otherwise I get this exception: HTTPError: HTTP Error 401: Unauthorized – Gunay Anach Dec 10 '15 at 14:21
Are you going through a proxy? – Martin Evans Dec 10 '15 at 14:24
No, it supposed to be internal sharepoint location with web interface. – Gunay Anach Dec 10 '15 at 14:26
2

Apart from the headaches of logging into Sharepoint I don't think openpyxl supports the buffer interface. It's restricted to locally accessible files for reasons of simplicity. – Charlie Clark Dec 10 '15 at 17:47

score 0 · Answer 3 · answered Mar 08 '18 at 20:58

0

If the SP is internal, it might work by removing "https:" in the name you put in the load_workbook().

from openpyxl import load_workbook
file = '//content.potatocompany.com/workspaces/PotatoTeam/Shared Documents/XYZ errors/XYZ Errors_Confirm.xlsx'
wb = load_workbook(file)

without authentication if your account at work is directly connected to SP. Otherwise, at my work, we use NTML authentification, that you can do by using HttpNtlmAuth from the library request_ntml.

Let me know if it works or if you are still interested by this issue, I could give you example with request_ntml

answered Mar 08 '18 at 20:58

Ben.T

29,160
6
32
54

I would be interested in seeing an example with request_ntml, please. – Calab Nov 15 '19 at 16:22
1

I have no issues opening the spreadsheet from SharePoint in a web browser, but trying to do so from a python script is giving me a 403 forbidden error. What am I missing? `import requests_ntlm import requests file = "https://ourcompany.sharepoint.com/abcd/5UJJPA3D/FILE" u=r"domain\username" p="password" resp = requests.get(file, auth=requests_ntlm.HttpNtlmAuth(u,p)) print(resp.content) b'403 FORBIDDEN'` – Calab Nov 15 '19 at 19:43
@Calab Hi, unfortunately, I'm not at the same work anymore and I had to leave my scripts so I can't answer your question. sorry – Ben.T Nov 18 '19 at 20:29

Read xlsx stored on sharepoint location with openpyxl in python?

3 Answers3

Linked