Using the Office365-REST-Python-Client library, I have written the following Python function to download Excel spreadsheets from SharePoint (based on the answer to "How to read SharePoint Online (Office365) Excel files in Python with Work or School Account?"):
import sys
from urlparse import urlparse
from office365.runtime.auth.authentication_context import AuthenticationContext
from office365.sharepoint.client_context import ClientContext
from office365.sharepoint.file import File

# Prefix of the XML error document returned in place of the file contents
xmlErrText = "<?xml version=\"1.0\" encoding=\"utf-8\"?><m:error"

def download(sourceURL, destPath, username, password):
    print "Download URL: {}".format(sourceURL)
    # Split the URL into scheme + host (baseURL) and path + query (relativeURL)
    urlParts = urlparse(sourceURL)
    baseURL = urlParts.scheme + "://" + urlParts.netloc
    relativeURL = urlParts.path
    if len(urlParts.query):
        relativeURL = relativeURL + "?" + urlParts.query

    ctx_auth = AuthenticationContext(baseURL)
    if ctx_auth.acquire_token_for_user(username, password):
        # Load the web object first to confirm that the login actually works
        try:
            ctx = ClientContext(baseURL, ctx_auth)
            web = ctx.web
            ctx.load(web)
            ctx.execute_query()
        except:
            print "Failed to execute Sharepoint query (possibly bad username/password?)"
            return False
        print "Logged into Sharepoint: {0}".format(web.properties['Title'])
        response = File.open_binary(ctx, relativeURL)
        # An XML error document means the file itself was not returned
        if response.content.startswith(xmlErrText):
            print "ERROR response document received. Possibly permissions or wrong URL? Document content follows:\n\n{}\n".format(response.content)
            return False
        else:
            with open(destPath, 'wb') as f:
                f.write(response.content)
            print "Downloaded to: {}".format(destPath)
    else:
        print ctx_auth.get_last_error()
        return False
    return True
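The URL-splitting step at the top of the function can be tried in isolation, without contacting SharePoint, to see exactly which base URL and relative URL a given link produces. The following is only a minimal sketch: the helper name `split_sharepoint_url` and the example URL are made up for illustration and are not part of the original function.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2, as used in the function above


def split_sharepoint_url(source_url):
    """Return (base_url, relative_url), derived the same way download() does."""
    parts = urlparse(source_url)
    base_url = parts.scheme + "://" + parts.netloc
    relative_url = parts.path
    if parts.query:
        relative_url = relative_url + "?" + parts.query
    return base_url, relative_url


# Hypothetical URL, for illustration only.
base, rel = split_sharepoint_url(
    "https://example.sharepoint.com/sites/team/Shared%20Documents/report.xlsx"
)
print(base)
print(rel)
```

With this split, the base URL holds only the scheme and host, so any `/sites/...` prefix ends up in the relative URL, still percent-encoded. Comparing the two values for a URL that works and one that fails may show a pattern.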
This function works fine for some URLs but fails for others, printing the following "file does not exist" document content on failure (newlines and whitespace added for readability):
<?xml version="1.0" encoding="utf-8"?>
<m:error xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">
  <m:code>
    -2130575338, Microsoft.SharePoint.SPException
  </m:code>
  <m:message xml:lang="en-US">
    The file /sites/path/to/document.xlsx does not exist.
  </m:message>
</m:error>
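Instead of matching the start of the response against a fixed string, the error document could be parsed so that the code and message can be logged separately. This is a sketch only, assuming the response body always has the shape shown above; `parse_error_document` is a hypothetical helper, and the namespace is copied from the error document itself.

```python
import xml.etree.ElementTree as ET

# Namespace taken from the error document shown above.
M_NS = "{http://schemas.microsoft.com/ado/2007/08/dataservices/metadata}"


def parse_error_document(content):
    """Return (code, message) from an <m:error> body, or None if it is not one."""
    try:
        root = ET.fromstring(content)
    except ET.ParseError:
        return None  # not XML at all, e.g. the bytes of a real .xlsx file
    if root.tag != M_NS + "error":
        return None  # XML, but not the error document
    code = root.findtext(M_NS + "code", default="").strip()
    message = root.findtext(M_NS + "message", default="").strip()
    return code, message


# Sample body copied from the error output above (whitespace removed).
sample = (
    b'<?xml version="1.0" encoding="utf-8"?>'
    b'<m:error xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata">'
    b'<m:code>-2130575338, Microsoft.SharePoint.SPException</m:code>'
    b'<m:message xml:lang="en-US">The file /sites/path/to/document.xlsx does not exist.</m:message>'
    b'</m:error>'
)
print(parse_error_document(sample))
```

In `download`, `parse_error_document(response.content)` could replace the `startswith` check: a return value of `None` would mean the content is treated as the file.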
I know that the username and password are correct. Indeed, changing the password results in a completely different error.
I have found that this error can occur when either the document doesn't exist, or when there are insufficient permissions to access the document.
However, using the same username/password, I can download the document with the same URL in a web browser.
Note that this same function consistently works for some .xlsx URLs and consistently fails for other .xlsx URLs in the same SharePoint repository.
My only guess is that there are some more fine-grained permissions that need to be managed, but I know nothing about these, if they exist.
Can anybody help me work out why the failure is occurring and how to get the function working for all the required files, which I can already download in a web browser?
Additional Notes From Comments Below
- The failures are consistent for some URLs. The successes are consistent for other URLs. I.e., for any one URL, the result is always the same - it does not come and go.
- The files have not moved or been deleted. I can download them using browsers/PCs which have never accessed those files previously.
- The source of the URLs is SharePoint itself. Searching in SharePoint lists those files in the results, with a URL below each file. This is the URL that I'm using for each file. (For some files the script works and for others it does not; for all files the browser works with the same URL.)
- The URLs are all correctly encoded. In particular, spaces are encoded as %20.
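The encoding claim in the last note can be double-checked per URL by decoding the path and re-encoding it. The sketch below assumes a hypothetical helper, `describe_path_encoding`, and made-up paths. A `False` result only flags that the path is not in the form `quote` would produce; it says nothing about whether SharePoint accepts it.

```python
try:
    from urllib.parse import quote, unquote  # Python 3
except ImportError:
    from urllib import quote, unquote  # Python 2


def describe_path_encoding(path):
    """Return (decoded_path, round_trips) for a percent-encoded URL path."""
    decoded = unquote(path)
    # Re-encode, leaving "/" alone so path separators survive.
    reencoded = quote(decoded, safe="/")
    return decoded, reencoded == path


# Hypothetical paths, for illustration only.
print(describe_path_encoding("/sites/team/Shared%20Documents/report.xlsx"))
print(describe_path_encoding("/sites/team/Shared Documents/report.xlsx"))
```

Running this over one working and one failing URL shows the decoded form of each path, which can then be compared against the path quoted in the "does not exist" message.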