I'm trying to export a PDF to DOCX using Adobe's REST API: https://documentcloud.adobe.com/document-services/index.html#post-exportPDF
The issue I'm facing is that I can't save the result correctly locally (the file comes out corrupted). I found another thread with a similar goal, but the solution there isn't working for me. Here are the relevant parts of my script:
import json
import requests

url = "https://cpf-ue1.adobe.io/ops/:create?respondWith=%7B%22reltype%22%3A%20%22http%3A%2F%2Fns.adobe.com%2Frel%2Fprimary%22%7D"
payload = {}
payload['contentAnalyzerRequests'] = json.dumps(
    {
        "cpf:engine": {
            "repo:assetId": "urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"
        },
        "cpf:inputs": {
            "params": {
                "cpf:inline": {
                    "targetFormat": "docx"
                }
            },
            "documentIn": {
                "dc:format": "application/pdf",
                "cpf:location": "InputFile"
            }
        },
        "cpf:outputs": {
            "documentOut": {
                "dc:format": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
                "cpf:location": docx_filename,
            }
        }
    }
)
myfile = {'InputFile': open(filename,'rb')}
response = requests.request("POST", url, headers=headers, data=payload, files=myfile)
location = response.headers['location']
# ... polling here to make sure export is complete ...
if response.status_code == 200:
    print('Export complete, saving file locally.')
    write_to_file(docx_filename, response)

def write_to_file(filename, response):
    with open(filename, 'wb') as f:
        for chunk in response.iter_content(1024 * 1024):
            f.write(chunk)
What I think is the issue (or at least a clue towards a solution) is the following text at the beginning of response.content:
--Boundary_357737_1222103332_1635257304781
Content-Type: application/json
Content-Disposition: form-data; name="contentAnalyzerResponse"
{"cpf:inputs":{"params":{"cpf:inline":{"targetFormat":"docx"}},"documentIn":{"dc:format":"application/pdf","cpf:location":"InputFile"}},"cpf:engine":{"repo:assetId":"urn:aaid:cpf:Service-26c7fda2890b44ad9a82714682e35888"},"cpf:status":{"completed":true,"type":"","status":200},"cpf:outputs":{"documentOut":{"cpf:location":"output/pdf_test.docx","dc:format":"application/vnd.openxmlformats-officedocument.wordprocessingml.document"}}}
--Boundary_357737_1222103332_1635257304781
Content-Type: application/octet-stream
Content-Disposition: form-data; name="output/pdf_test.docx"
... actual byte content starts here...
Why is this being sent? Am I writing the content to the file incorrectly? (I've tried f.write(response.content) as well, with the same results.) Should I be sending a different request to Adobe?
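
For what it's worth, the dump above has the shape of a multipart body: a JSON part followed by the binary DOCX part, separated by a boundary line. Below is a minimal sketch of pulling those parts apart before writing to disk. It assumes the response's Content-Type header carries the boundary parameter and that the parts use CRLF line endings; neither is confirmed by the output shown. `split_multipart` is a hypothetical helper written for illustration, not part of requests or of Adobe's API.

```python
import re


def split_multipart(body: bytes, content_type: str) -> dict:
    """Return {part name: raw bytes} for a multipart body.

    Hypothetical helper: assumes CRLF line endings and a boundary
    parameter in the Content-Type header value.
    """
    match = re.search(r'boundary="?([^";]+)"?', content_type)
    if match is None:
        raise ValueError("no boundary parameter in Content-Type")
    delimiter = b"\r\n--" + match.group(1).encode("ascii")

    parts = {}
    # Prepend CRLF so the very first delimiter matches like the others;
    # the first split element is the (usually empty) preamble.
    for segment in (b"\r\n" + body).split(delimiter)[1:]:
        if segment.startswith(b"--"):
            break  # closing delimiter "--boundary--"
        # Part headers end at the first blank line; the rest is content.
        raw_headers, _, content = segment.partition(b"\r\n\r\n")
        name = re.search(rb'\bname="([^"]*)"', raw_headers)
        if name:
            parts[name.group(1).decode("utf-8")] = content
    return parts
```

In the script above this could be called as `parts = split_multipart(response.content, response.headers["Content-Type"])`, after which the bytes stored under the DOCX part's name (`output/pdf_test.docx` in the dump) would be what gets written to the file, rather than the whole body.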