1

Using a get request through a browser I can download a report from Salesforce as a .csv.

However, when I try to replicate this action using Python and the requests module I can not generate the report either as a .csv or a Data Frame.

The request follows this pattern:

url = '{0}/{1}?export=1&enc=ISO-8859-1&xf=csv&isdtp=nv'.format(base_url, report_id)

I have authenticated as follows:

import requests
import logging
# Function to establish a connection to Salesforce: sforce_connect => function() 
def sforce_connect(c_id, cs_id, sforce_uname, sforce_pwd):
    """Function to create a connection with Salesforce.
        Dependent on the requests module.
    Args:
        c_id (str): A string scalar denoting the Consumer (Client) ID.
        cs_id (str): A string scalar denoting the Consumer (Client) Secret ID
        sforce_uname (str): A string scalar denoting the user's salesforce username.
        sforce_pwd (str): A string scalar denoting the user's salesforce password.
    Returns:
        (list): A list comprised of the instance_url (str) scalar, and the oauth details (dict).
    """
    # Store the auth url as a string scalar: auth_url => string scalar
    # Visit https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/intro_understanding_oauth_endpoints.htm
    auth_url = 'https://login.salesforce.com/services/oauth2/token'    
    # Callback URL: redirect_url => string scalar
    # redirect_uri = 'http://localhost/'
    # Defensively attempt to:
    try:
        # Issue a POST request in order to retrieve access token: response => response object
        response = requests.post(auth_url, data = {
                            'client_id': c_id,
                            'client_secret': cs_id,
                            'grant_type': 'password',
                            'username': sforce_uname,
                            'password': sforce_pwd})
    # If an error occurs:
    except Exception as error:
        # Log the error: str => stdout(logfile)
        logging.critical("Failed to retrieve Bearer token from Salesforce API, rest of run will be problematic, the "
                         + " error caught: " + repr(error))
    # Extract the json from the response: json_res => json object
    json_res = response.json()
    # Extract the access token from the json response object: access_token => string scalar
    access_token = json_res['access_token']
    # Extract the token type from the json response object: token_type => string scalar
    token_type = json_res['token_type']
    # Extract the session id: session_id => string scalar
    session_id = json_res['id']
    # A dictionary denoting the Authorisation string required, token type concatenated with the token: auth => dictionary
    oauth = {'Authorization': ' '.join([token_type, access_token])}
    # Extract the instance url from the response.json(): instance_url => string scalar 
    instance_url = json_res['instance_url']
    # Explicitly define the return object: list => GlobalEnv() 
    return([instance_url, oauth, session_id])

# Apply the function to establish a connection with salesforce: 
# instance_url => string scalar, auth => dictionary, session_id => string scalar
instance_url, auth, session_id = sforce_connect(
    client_id, client_secret, sfdc_user, sfdc_pass
)

And I have attempted to retrieve the .csv as follows but it is not generating:

req = requests.get(url, headers = auth)
df = pd.read_csv(req.content) 

Due to security concerns I can not use the simple-salesforce package or Selenium (and am limited to requests, urllib3). Any help would be greatly appreciated.

Also, this is the returned object:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <meta HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">





<script>
function redirectOnLoad() {
var escapedHash = '';
var url = '/saml/authn-request.jsp?saml_request_id=PERSONAL INFO;
if (window.location.hash) {
   escapedHash = '%23' + window.location.hash.slice(1);
}
if (window.location.replace){ 
window.location.replace(url + escapedHash);
} else {
window.location.href = url + escapedHash;
} 
} 
redirectOnLoad();
</script>

</head>


</html>





<!-- Body events -->
<script type="text/javascript">function bodyOnLoad(){if(window.PreferenceBits){window.PreferenceBits.prototype.csrfToken="null";};}function bodyOnBeforeUnload(){}function bodyOnFocus(){}function bodyOnUnload(){}</script>
            
</body>
</html>


<!--
...................................................................................................
...................................................................................................
...................................................................................................
...................................................................................................
-->
hello_friend
  • 5,682
  • 1
  • 11
  • 15

1 Answers1

1

You need to pass a HTTP Header Cookie sid={access_token}. This is hacky, not an officially supported API yadda yadda, can break anytime. You've been warned.

Check these out

  1. How to download a report as a CSV directly from Salesforce Lightning? (Python example)
  2. https://stackoverflow.com/a/56162619/313628 (not Python but I added screenshot of raw request & response)
  3. https://stackoverflow.com/a/57745683/313628 (read the answer, comments, upvote an idea to do this more official way)

Edit

OK, it's OAuth2's fault ;) I experimented a bit.

If you make a proper OAuth2 login call - you'll get a limited session_id / access_token back. It won't have permission to download reports, even if "connected app" allowed web in the scopes. To properly impersonate human & browser traffic you need old school SOAP call.

What you're about to see is a steaming pile of ... No error handling, probably not following Python best practices, parsing XML with regular expressions (what could possibly go wrong)... Examine it but try to salvage more production-ready code from simple-salesforce's source code, even if you can't actually use it on the project.

import requests
import re

auth_url="https://test.salesforce.com/services/Soap/u/51.0"
headers = {'Content-Type': 'text/xml;charset=UTF-8', 'SOAPAction' : "login"}
body = """<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/" xmlns:urn="urn:partner.soap.sforce.com">
   <soapenv:Body>
      <urn:login>
         <urn:username>username goes here</urn:username>
         <urn:password>password and optionally security token</urn:password>
      </urn:login>
   </soapenv:Body>
</soapenv:Envelope>"""

login_response = requests.post(auth_url,data=body,headers=headers)
print(login_response.content)
session_id = re.search('<sessionId>(.*)</sessionId>', login_response.content.decode('utf-8')).group(1)
print(session_id)

report_url = 'https://your instance.my.salesforce.com/reportgoeshere?isdtp=p1&export=1&enc=UTF-8&xf=csv'

reportReq = requests.get(report_url, cookies={'sid': session_id})
reportData = reportReq.content.decode('utf-8')
print(reportData)
eyescream
  • 18,088
  • 2
  • 34
  • 46
  • thank you so much for your reply (+1). I have already seen your previous posts. However, I can't get the response object to be the report itself and subsequently, can't get the report as a Data Frame. Unfortunately it returns some automatically generated javascript. I will amend my post to show the output I receive. – hello_friend May 21 '21 at 06:54
  • also, should argument provided to the `cookies` parameter not be `{'sid': 'session_id'}` ? – hello_friend May 21 '21 at 06:55
  • i.e. the get request now be: `s = requests.get( url, headers = oauth, cookies = {'sid': session_id} )` – hello_friend May 21 '21 at 06:56
  • Try now but if you can - make this more Pythonic and post as your own answer, I don't want people blindly copying this into their projects... – eyescream May 21 '21 at 09:45
  • thank you very much for all of your help, unfortunately i still can't seem to get the data. From your post request I get a 500 error. – hello_friend May 24 '21 at 02:27
  • Go to SF Setup, find your user, scroll all the way down to login history and find the error. Might be that your IP is not in trusted range (if you're at office and have static IP why not add it in Setup -> Network. Alternatively https://help.salesforce.com/articleView?id=sf.user_security_token.htm&type=5) – eyescream May 24 '21 at 06:38