0

I am trying to get all the information, such as the name and content, of the file that is downloaded when I hit a GET API programmatically.

Context :

I have a GET API. Whenever I hit it from a browser, it automatically downloads the file to my system, and I can see that file in the filesystem. I would like to achieve the same thing with a Python program.

Tested Approach :

I have tried almost every approach to get the contents of the file, but every time the downloaded content turns out to be an HTML page containing JavaScript instead of the file's text.

Python Code :

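import os
import requests

# SubpoenaJobpath, caseDocumentsFolderName, completeZipDownloadUrl and headers are defined earlier
caseDocumentFolderPath = os.path.join(SubpoenaJobpath, caseDocumentsFolderName)
os.chdir(caseDocumentFolderPath)
documentResponse = requests.post(completeZipDownloadUrl, headers=headers)
print(documentResponse.content)
with open('temp.txt', 'wb') as f:  # close the file handle once the bytes are written
    f.write(documentResponse.content)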

Output file :

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <meta HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">
<script>
function redirectOnLoad() {
if (this.SfdcApp && this.SfdcApp.projectOneNavigator) { SfdcApp.projectOneNavigator.handleRedirect('https://texastech.my.salesforce.com?ec=302&startURL=%2Fvisualforce%2Fsession%3Furl%3Dhttps%253A%252F%252Ftexastech.lightning.force.com%252Fcontent%252Fsession%253Furl%253Dhttps%25253A%25252F%25252Ftexastech.file.force.com%25252Fsfc%25252Fservlet.shepherd%25252Fdocument%25252Fdownload%25252F0696T00000OcugCQAR%25253FoperationContext%25253DS1'); }  else 
if (window.location.replace){ 
window.location.replace('https://texastech.my.salesforce.com?ec=302&startURL=%2Fvisualforce%2Fsession%3Furl%3Dhttps%253A%252F%252Ftexastech.lightning.force.com%252Fcontent%252Fsession%253Furl%253Dhttps%25253A%25252F%25252Ftexastech.file.force.com%25252Fsfc%25252Fservlet.shepherd%25252Fdocument%25252Fdownload%25252F0696T00000OcugCQAR%25253FoperationContext%25253DS1');
} else {
window.location.href ='https://texastech.my.salesforce.com?ec=302&startURL=%2Fvisualforce%2Fsession%3Furl%3Dhttps%253A%252F%252Ftexastech.lightning.force.com%252Fcontent%252Fsession%253Furl%253Dhttps%25253A%25252F%25252Ftexastech.file.force.com%25252Fsfc%25252Fservlet.shepherd%25252Fdocument%25252Fdownload%25252F0696T00000OcugCQAR%25253FoperationContext%25253DS1';
} 
} 
redirectOnLoad();
</script>

</head>
</html>

Note: The link that downloads the file directly in the browser is from Salesforce, and I have verified that it works. Here is the proof:

[Two screenshots, not reproduced here]

Brian Tompsett - 汤莱恩
Jakka rohith
  • I think it's probably an issue relating to the browser being capable of running scripts on the page that's returned, whereas you're directly getting the result of the POST, which certainly can't execute any code in the result. Maybe you need to be using a headless browser and not just directly doing a web request? – Random Davis Aug 30 '23 at 17:10

1 Answer

0

It works in the browser because you're logged in to Salesforce and have a valid session id (in a cookie). Your Python program can emulate a browser/curl/whatever, but then it must send that cookie too (meaning you'd need to log in to Salesforce through the API first), or you can use a library like "simple-salesforce".
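The cookie route described above could look roughly like the sketch below. It is not a tested recipe for this org. The function names are hypothetical, `session` is assumed to be a `requests.Session`, and the code assumes the download host accepts the session id as a cookie named `sid`. The HTML check is only a heuristic for spotting the login redirect page shown in the question.

```python
def looks_like_login_redirect(content_type, body):
    """Heuristic: True when the response looks like an HTML page, not a file."""
    if "text/html" in (content_type or "").lower():
        return True
    start = body.lstrip()[:32].lower()
    return start.startswith((b"<!doctype html", b"<html"))


def download_with_sid(session, url, session_id):
    """GET `url`, sending the Salesforce session id as the `sid` cookie.

    `session` is expected to behave like requests.Session (assumption).
    Returns the raw bytes of the response body.
    """
    response = session.get(url, cookies={"sid": session_id})
    response.raise_for_status()
    if looks_like_login_redirect(response.headers.get("Content-Type"),
                                 response.content):
        # Same symptom as in the question: a redirect page, not the file.
        raise RuntimeError("Received an HTML page; the session was "
                           "probably not accepted for this URL")
    return response.content
```

With simple-salesforce, the session id is available as `sf.session_id` after constructing `Salesforce(...)` with your credentials, so a call might be `download_with_sid(requests.Session(), completeZipDownloadUrl, sf.session_id)`.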

(There is a way to store documents in SF that doesn't require authentication, but don't store anything sensitive that way, and your variable names suggest this data is sensitive.)

The "pro" way to fetch documents is to log in ("simple-salesforce" has lots of options for that, but raw HTTP requests are an option too) and then pull them over either the SOAP or the REST API. SOAP will give you the file base64-encoded; REST will require a separate call to get the raw binary payload.
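As a rough illustration of the REST variant, the sketch below makes the two calls mentioned above: one query for the file's metadata and one request for the binary `VersionData`. The function names and the API version are assumptions. `session` is assumed to behave like a `requests.Session`, and `instance_url` is something like `https://texastech.my.salesforce.com`. The id in the question's URL starts with `069`, which is a ContentDocument id, so the latest ContentVersion is looked up first.

```python
import re

API_VERSION = "v58.0"  # assumption: use whichever version your org supports


def latest_version_query(content_document_id):
    """Build the SOQL that finds the latest ContentVersion of a document."""
    # Salesforce ids are 15 or 18 alphanumeric characters; reject anything
    # else so the id can be embedded in the query string safely.
    if not re.fullmatch(r"[A-Za-z0-9]{15}|[A-Za-z0-9]{18}", content_document_id):
        raise ValueError("not a valid Salesforce id")
    return ("SELECT Id, Title, FileExtension FROM ContentVersion "
            f"WHERE ContentDocumentId = '{content_document_id}' "
            "AND IsLatest = true")


def fetch_document(session, instance_url, session_id, content_document_id):
    """Return (file_name, file_bytes) for a ContentDocument via the REST API."""
    headers = {"Authorization": f"Bearer {session_id}"}
    base = f"{instance_url}/services/data/{API_VERSION}"

    # Call 1: metadata (name and extension) of the latest version.
    meta = session.get(f"{base}/query", headers=headers,
                       params={"q": latest_version_query(content_document_id)})
    meta.raise_for_status()
    records = meta.json()["records"]
    if not records:
        raise LookupError("no ContentVersion found for this document")
    record = records[0]

    # Call 2: the raw binary payload of that version.
    blob = session.get(
        f"{base}/sobjects/ContentVersion/{record['Id']}/VersionData",
        headers=headers)
    blob.raise_for_status()

    name = record["Title"]
    ext = record.get("FileExtension")
    if ext and not name.lower().endswith("." + ext.lower()):
        name = f"{name}.{ext}"
    return name, blob.content
```

The returned name and bytes can then be written to disk with `open(name, 'wb')`. With simple-salesforce, `session_id` would come from `sf.session_id` and the instance URL from `"https://" + sf.sf_instance`.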

https://stackoverflow.com/a/60284736/313628 and https://stackoverflow.com/a/56268939/313628 might be a good start (shameless promotion)

eyescream
  • Hi, I have tried adding the session_id using simple-salesforce, but now I get an error saying the data is not available and to contact Salesforce, even though I can download the file when I paste the same URL into the browser. ERROR: An error has occurred while processing your request. Please indicate the URL of the page you were requesting as well as any other related information. We apologize for the inconvenience.

    Thank you again for your patience and assistance. And thanks for using salesforce.com!

    Error ID: Data Not Available

    – Jakka rohith Aug 31 '23 at 19:45
  • How did you pass the session id? When pretending to be a browser did you set a cookie with "sid=00Dsessionidgoeshere"? Can you update the question with new code? – eyescream Aug 31 '23 at 19:52
  • I did not use a headless browser; I passed it as a bearer token in the headers. I have since found and fixed a mistake, and now I get an HTML page as the response. It contains a few links which, when opened in the browser, do the same thing: they download the file directly. – Jakka rohith Aug 31 '23 at 20:27