https://pypi.org/project/simple-salesforce/ is a popular choice. Read up on the various login options, including how to connect to a sandbox.
You'll need to experiment to get the right query first. Something like this should be a good start:
select id, name,
    (select contentdocument.LatestPublishedVersion.Title,
        contentdocument.LatestPublishedVersion.FileExtension,
        contentdocument.LatestPublishedVersion.VersionData
    from contentdocumentlinks)
from account
where id = '0017000000Lg8WgAAJ'
When called via the REST API in my Developer Edition org, it returns the account plus 3 attached files:
[
  {
    "attributes": {
      "type": "Account",
      "url": "/services/data/v56.0/sobjects/Account/0017000000Lg8WgAAJ"
    },
    "Id": "0017000000Lg8WgAAJ",
    "Name": "University of Arizona",
    "ContentDocumentLinks": {
      "totalSize": 3,
      "done": true,
      "records": [
        {
          "attributes": {
            "type": "ContentDocumentLink",
            "url": "/services/data/v56.0/sobjects/ContentDocumentLink/06A4u00000jMtEHEA0"
          },
          "ContentDocument": {
            "attributes": {
              "type": "ContentDocument",
              "url": "/services/data/v56.0/sobjects/ContentDocument/0694u00000VJqPmAAL"
            },
            "LatestPublishedVersion": {
              "attributes": {
                "type": "ContentVersion",
                "url": "/services/data/v56.0/sobjects/ContentVersion/0684u00000WWgsIAAT"
              },
              "Title": "apex-07L3a00001NVoMZEA1",
              "FileExtension": "log",
              "VersionData": "/services/data/v56.0/sobjects/ContentVersion/0684u00000WWgsIAAT/VersionData"
            }
          }
        },
        {
          "attributes": {
            "type": "ContentDocumentLink",
            "url": "/services/data/v56.0/sobjects/ContentDocumentLink/06A4u00000jMtFrEAK"
          },
          "ContentDocument": {
            "attributes": {
              "type": "ContentDocument",
              "url": "/services/data/v56.0/sobjects/ContentDocument/0694u00000VJqQjAAL"
            },
            "LatestPublishedVersion": {
              "attributes": {
                "type": "ContentVersion",
                "url": "/services/data/v56.0/sobjects/ContentVersion/0684u00000WWgtKAAT"
              },
              "Title": "combinepdf",
              "FileExtension": "pdf",
              "VersionData": "/services/data/v56.0/sobjects/ContentVersion/0684u00000WWgtKAAT/VersionData"
            }
          }
        },
        {
          "attributes": {
            "type": "ContentDocumentLink",
            "url": "/services/data/v56.0/sobjects/ContentDocumentLink/06A4u00000jMtGkEAK"
          },
          "ContentDocument": {
            "attributes": {
              "type": "ContentDocument",
              "url": "/services/data/v56.0/sobjects/ContentDocument/0694u00000VJqRhAAL"
            },
            "LatestPublishedVersion": {
              "attributes": {
                "type": "ContentVersion",
                "url": "/services/data/v56.0/sobjects/ContentVersion/0684u00000WWguFAAT"
              },
              "Title": "Capture",
              "FileExtension": "png",
              "VersionData": "/services/data/v56.0/sobjects/ContentVersion/0684u00000WWguFAAT/VersionData"
            }
          }
        }
      ]
    }
  }
]
This should be enough info to create the folder name and, for each file, determine the filename and extension. The only remaining problem is the actual payload: you can see that VersionData is just a URL rather than, say, a base64-encoded payload. See my other answer https://stackoverflow.com/a/60284736/313628 for an explanation. https://salesforce.stackexchange.com/q/300845/799 is neat too.
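To make the "folder name plus filename" step concrete, here is a minimal sketch that walks a response shaped like the JSON above. The helper names `safe_name` and `list_files` and the character-replacement rule are my own illustrative assumptions, not part of simple-salesforce. It also guards against the subquery coming back as null, which I'd expect for an account with no files.

```python
import re


def safe_name(name):
    """Replace characters that are awkward in file names (illustrative rule only)."""
    return re.sub(r'[\\/:*?"<>|]', "_", name).strip()


def list_files(records):
    """Yield (folder, filename, version_data_url) for every linked file.

    `records` is the list of Account rows from the query result.
    """
    for account in records:
        folder = safe_name(account["Name"])
        # Guard against a missing or null child relationship
        links = account.get("ContentDocumentLinks") or {}
        for link in links.get("records", []):
            version = (link.get("ContentDocument") or {}).get("LatestPublishedVersion")
            if not version:
                continue  # no published version to download
            ext = version.get("FileExtension")
            filename = safe_name(version["Title"]) + (f".{ext}" if ext else "")
            yield folder, filename, version["VersionData"]
```

Calling `list(list_files(records))` on the rows returned by the query gives one tuple per file, with the VersionData URL still to be fetched.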
Anyway, what's left is iterating over that JSON, fetching the raw payload, and creating the files.
Something like this (I'm sure it's naive, with no null checks; I'm not well-versed in Python):
from simple_salesforce import Salesforce

sf = Salesforce(username='user@example.com', password='hunter2', security_token='')
data = sf.query("""select id, name,
    (select contentdocument.LatestPublishedVersion.Title,
        contentdocument.LatestPublishedVersion.FileExtension,
        contentdocument.LatestPublishedVersion.VersionData
    from contentdocumentlinks)
from account
where id = '0017000000Lg8WgAAJ'""")
#print(data)

for row in data['records']:
    print(row['Name'])
    for doc in row['ContentDocumentLinks']['records']:
        d = doc['ContentDocument']['LatestPublishedVersion']
        print(" " + d['Title'] + "." + d['FileExtension'])
        # VersionData is a path starting with /services/data/vNN.0/; keep only the part
        # after the version and append it to sf.base_url
        response = sf._call_salesforce(method='GET', url=sf.base_url + d['VersionData'].split(".0/")[1])
        print(response)
        #print(response.content)
This outputs:
University of Arizona
apex-07L3a00001NVoMZEA1.log
<Response [200]>
combinepdf.pdf
<Response [200]>
Capture.png
<Response [200]>
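The remaining step, actually writing the files, could look roughly like the sketch below. It is a sketch under assumptions: `save_files` and its `fetch_bytes` parameter are hypothetical names of mine, and `fetch_bytes` stands for any callable that takes the VersionData URL and returns the raw bytes. The download is kept separate so the file handling can be tried without a Salesforce connection. Names are used as-is, so a title containing a path separator would need sanitizing first.

```python
from pathlib import Path


def save_files(records, fetch_bytes, root="."):
    """Write each linked file to <root>/<account name>/<title>.<extension>.

    fetch_bytes: callable taking the VersionData URL and returning bytes
    (hypothetical hook; plug the real download in here).
    Returns the list of paths that were written.
    """
    saved = []
    for row in records:
        folder = Path(root) / row["Name"]
        folder.mkdir(parents=True, exist_ok=True)
        # The child relationship may be null when the account has no files
        links = row.get("ContentDocumentLinks") or {}
        for doc in links.get("records", []):
            d = doc["ContentDocument"]["LatestPublishedVersion"]
            ext = d.get("FileExtension")
            target = folder / (d["Title"] + (f".{ext}" if ext else ""))
            target.write_bytes(fetch_bytes(d["VersionData"]))
            saved.append(target)
    return saved
```

With simple-salesforce, `fetch_bytes` could wrap the same `_call_salesforce` call used in the loop above and return `response.content`. Since `_call_salesforce` is a private method, it may change between library versions.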