How to download a Confluence page attachment with Python?

Question

With the atlassian-python-api 1.15.1 module and Python 3.6, how can I download a file attached to a Confluence page?

The page actions section of the API documentation mentions an API get_attachments_from_content, with which I can successfully obtain a list of all page attachments, with their metadata. There's an example at the end of this question of what I can obtain by printing one of the items in the results key.

What I have already tried is using the wget module to download the attachment:

fname = wget.download(base_server_name + attachment['_links']['download'])

However, the downloaded file is not the one attached to the page; instead I get a large HTML file that looks like a lightweight login page. Also, I'm not sure wget is the right tool here; I'd prefer a solution that uses the atlassian-python-api module itself, since it handles authentication on its own.

One item from the "results" key:

{'id': '56427526', 'type': 'attachment', 'status': 'current', 'title': 'main.c', 'metadata': {'mediaType': 'application/octet-stream', 'labels': {'results': [], 'start': 0, 'limit': 200, 'size': 0, '_links': {'self': 'https://foo.bar.com/confluence/rest/api/content/56427526/label'}}, '_expandable': {'currentuser': '', 'properties': '', 'frontend': '', 'editorHtml': ''}}, 'extensions': {'mediaType': 'application/octet-stream', 'fileSize': 363, 'comment': ''}, '_links': {'webui': '/pages/viewpage.action?pageId=14648850&preview=%2F14648850%2F56427526%2Fmain.c', 'download': '/download/attachments/14648850/main.c?version=1&modificationDate=1580726185883&api=v2', 'self': 'https://foo.bar.com/confluence/rest/api/content/56427526'}, '_expandable': {'container': '/rest/api/content/14648850', 'operations': '', 'children': '/rest/api/content/56427526/child', 'restrictions': '/rest/api/content/56427526/restriction/byOperation', 'history': '/rest/api/content/56427526/history', 'ancestors': '', 'body': '', 'version': '', 'descendants': '/rest/api/content/56427526/descendant', 'space': '/rest/api/space/~Tim'}}

score 10 · Accepted Answer · answered Feb 03 '20 at 12:52

While I didn't find a way to download the files directly with the atlassian-python-api module, I managed to do it with the requests module, thanks to this answer. Here's the code used to download all attachments visible in the page:

from atlassian import Confluence
import requests

confluence = Confluence(
    url="https://my.server.com/Confluence",
    username='MyUsername',
    password="MyPassword")

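# Fetch the attachment metadata for the page (up to 500 entries).
attachments_container = confluence.get_attachments_from_content(page_id=12345678, start=0, limit=500)
attachments = attachments_container['results']
for attachment in attachments:
    fname = attachment['title']
    download_link = confluence.url + attachment['_links']['download']
    # Pass the credentials explicitly; an unauthenticated request gets the login page instead.
    r = requests.get(download_link, auth=(confluence.username, confluence.password))
    if r.status_code == 200:
        with open(fname, "wb") as f:
            # Write in chunks; the default chunk size is a single byte.
            for bits in r.iter_content(chunk_size=8192):
                f.write(bits)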

I used `confluence.session.get(download_link)` instead of `requests.get(download_link, auth=...)`. This way it worked for the case of having used token authentication with `Confluence`. — Christian Baumann, Jun 05 '23 at 15:44
I used `confluence._session.get(f'{confluence.url}/{download_link}')` and it worked well with token auth. It seems in newer versions of the library, `session` is a private property available as `_session`. — Overbryd, Jun 09 '23 at 09:57
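
Putting the answer and the two comments together, here is a minimal sketch of a helper that downloads through an already-authenticated session rather than passing a username and password to `requests.get`. The function name `download_attachments` and its `dest_dir` parameter are made up for illustration and are not part of atlassian-python-api. It only assumes that the session object has a requests-style `get` method, which is what the comments report for `confluence.session` or `confluence._session`, depending on the library version.

```python
import os


def download_attachments(session, base_url, attachments, dest_dir="."):
    """Save each attachment into dest_dir and return the list of saved paths.

    `session` is assumed to behave like a requests.Session that already
    carries the authentication (basic or token).
    """
    saved = []
    for attachment in attachments:
        # basename() guards against a title that contains path separators.
        fname = os.path.join(dest_dir, os.path.basename(attachment["title"]))
        # The download link from the API is relative and starts with "/".
        url = base_url.rstrip("/") + attachment["_links"]["download"]
        response = session.get(url, stream=True)
        # Raise on HTTP errors instead of silently skipping the file.
        response.raise_for_status()
        with open(fname, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
        saved.append(fname)
    return saved
```

With the `confluence` object and `attachments` list from the answer, a call would look something like `download_attachments(confluence._session, confluence.url, attachments)`. Whether the attribute is `session` or `_session` depends on the installed version, as noted in the comments above.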
