How do I create this python datastructure based on a directory?

Question

I am currently following the WebODM API documentation for a drone mapping project. My goal is to point a function at a directory containing any number of images, but I am unfamiliar with the data structure shown below. How do I build something in the same format dynamically? The data structure looks like this:

images = [
    ('images', ('image1.jpg', open('image1.jpg', 'rb'), 'image/jpg')), 
    ('images', ('image2.jpg', open('image2.jpg', 'rb'), 'image/jpg')),
    ('images', ('image3.jpg', open('image3.jpg', 'rb'), 'image/jpg')),
    # ...
]

So basically, how do I put all the image files in a directory into the above structure without hardcoding anything? Is there a library or parser that already does this? Any suggestions would be appreciated. Thanks

This is just a list of nested tuples: (folder?, (filename, file object opened for reading, mime-type)) — Scott Hunter, Jul 19 '22 at 12:32
Use this [Find all files in a directory with extension .txt in Python](https://stackoverflow.com/q/3964681/6045800) to get all necessary files from a folder, then it's just a matter of outputting the relevant info — Tomerikoo, Jul 19 '22 at 12:43

score 0 · Answer 1 · answered Jul 19 '22 at 12:47

You can use the listdir function from the os module to get a list of all the contents of a directory. You can then filter the image files you are interested in and populate your image list.

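from os import listdir
from os.path import basename, isfile, join

def get_image_list(path, image_type):
    # Full paths of the regular files in `path` whose names end with `image_type`
    return [join(path, f) for f in listdir(path) if isfile(join(path, f)) and f.endswith(image_type)]


path = "path/to/images"
# Include the dot so that only names with a real ".jpg" extension match
image_list = get_image_list(path, ".jpg")

# Send only the base filename; the file itself is opened from its full path
images = [('images', (basename(i), open(i, 'rb'), 'image/jpg')) for i in image_list]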

print(images)

Anentropic · Accepted Answer · 2022-07-19T16:14:50.657

I doubt there is a library for exactly this, but you can do it by breaking down the problem into parts.

One part will be to iterate over all of the image files in your directory, see https://stackoverflow.com/a/3215392/202168 for a starting point.

Another part is to understand the datastructure above. We can see it is a list of tuples. https://realpython.com/python-lists-tuples/

Each row is a tuple of two elements: first element is the string "images" and the second element is another tuple of three elements - these appear to be the filename, an opened handle to the file, and the MIME-type of the file.
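
To make that shape concrete, here is a small sketch that builds one such row and unpacks it. It uses an in-memory io.BytesIO object as a stand-in for a real file opened with open(..., 'rb'), and the bytes are placeholder data, not a real image.

```python
import io

# Stand-in for open('image1.jpg', 'rb'); the bytes are made up for illustration.
fake_file = io.BytesIO(b"not really a jpeg")

row = ('images', ('image1.jpg', fake_file, 'image/jpg'))

# The outer tuple has two elements...
first, file_info = row
# ...and the inner tuple has three.
filename, file_obj, mime_type = file_info

print(first)      # images
print(filename)   # image1.jpg
print(mime_type)  # image/jpg
```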

Looking at the WebODM docs you linked, we can see that their example code uses the popular requests library to send the request. For instance, they show this example:

images = [
    ('images', ('image1.jpg', open('image1.jpg', 'rb'), 'image/jpg')), 
    ('images', ('image2.jpg', open('image2.jpg', 'rb'), 'image/jpg')),
    # ...
]
options = json.dumps([
    {'name': "orthophoto-resolution", 'value': 24}
])

res = requests.post('http://localhost:8000/api/projects/{}/tasks/'.format(project_id), 
            headers={'Authorization': 'JWT {}'.format(token)},
            files=images,
            data={
                'options': options
            }).json()

task_id = res['id']

So we can find more details about what the elements of these tuples represent by checking the requests docs: https://requests.readthedocs.io/en/latest/user/advanced/#post-multiple-multipart-encoded-files

There, the files argument is described as a list of tuples of (form_field_name, file_info).

So "images" is the name of the form field we're sending in the POST request. Presumably the API expects this name, so we shouldn't change it.

Now we can put our two parts together and dynamically generate the list of images to send:

import glob
from pathlib import Path

images = []
for path_str in glob.glob("/your/base/dir/*.jpg"):
    path = Path(path_str)
    images.append(
        ("images", (path.name, open(path, 'rb'), "image/jpeg"))
    )
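
If the directory may hold more than one image format, the MIME type could be guessed per file instead of hardcoded. The following is a minimal sketch using the standard library's mimetypes module; the helper name build_images and the default set of extensions are assumptions made for illustration, not something the WebODM docs specify.

```python
import mimetypes
from pathlib import Path


# Hypothetical helper; the default extensions are an assumption.
def build_images(directory, extensions=(".jpg", ".jpeg", ".png")):
    images = []
    # sorted() gives a stable order; iterdir() order is otherwise arbitrary
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in extensions:
            # guess_type returns (type, encoding); type is None if unknown
            mime_type, _ = mimetypes.guess_type(path.name)
            images.append(
                ("images", (path.name, open(path, "rb"),
                            mime_type or "application/octet-stream"))
            )
    return images
```

Calling build_images("/your/base/dir") returns a list in the same shape as the hardcoded one. Note that the files are still left open here, so the caller remains responsible for closing them.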

We can improve on the example code a bit by closing the files after uploading them, using a context manager.

Typically in Python we do this as:

with open("myfile.jpg", "rb") as f:
    data = f.read()  # do something with f

# when we exit the with block, the file is automatically closed
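
A quick way to see this behaviour is to check the file object's closed attribute inside and after the block. This sketch first writes a throwaway temporary file so that it can run anywhere; the file name and its contents are placeholders.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained.
tmp_dir = tempfile.mkdtemp()
file_path = os.path.join(tmp_dir, "myfile.jpg")
with open(file_path, "wb") as f:
    f.write(b"placeholder bytes")

with open(file_path, "rb") as f:
    inside = f.closed   # False: the file is open inside the block
    data = f.read()

after = f.closed        # True: the with block closed it on exit
print(inside, after)    # False True
```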

Here it is a bit trickier as we have a dynamic list of files. We can find an answer here which points the way: https://stackoverflow.com/a/53363923/202168

import glob
import json
from contextlib import ExitStack
from pathlib import Path

import requests


options = json.dumps([
    {'name': "orthophoto-resolution", 'value': 24}
])

images = []
with ExitStack() as stack:
    for path_str in glob.glob("/your/base/dir/*.jpg"):
        path = Path(path_str)
        f = stack.enter_context(open(path, "rb"))
        images.append(
            ("images", (path.name, f, "image/jpeg"))
        )
    # make the request within the stack context block
    # because it needs access to the opened file objects
    response = requests.post(
        f"http://localhost:8000/api/projects/{project_id}/tasks/", 
        headers={"Authorization": f"JWT {token}"},
        files=images,
        data={"options": options}
    ).json()

task_id = response['id']
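
The same idea could be wrapped in a reusable context manager, so that the calling code only sees the finished list. This is a sketch under assumptions: the name open_images and the "*.jpg" default pattern are illustrative, and the request itself is left out so that the example needs only the standard library.

```python
from contextlib import ExitStack, contextmanager
from pathlib import Path


# Hypothetical helper, not part of requests or WebODM.
@contextmanager
def open_images(directory, pattern="*.jpg"):
    """Yield a requests-style `files` list and close every file on exit."""
    with ExitStack() as stack:
        images = [
            ("images", (path.name, stack.enter_context(open(path, "rb")), "image/jpeg"))
            for path in sorted(Path(directory).glob(pattern))
        ]
        yield images
    # Leaving the ExitStack block closes all registered files,
    # including when the caller's block raised an exception.
```

It would be used as "with open_images('/your/base/dir') as images:", with the requests.post(..., files=images, ...) call placed inside that block, since the file objects are only open while the block is active.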

This is a great explanation. Thanks! — Oom_Ben, Jul 19 '22 at 13:24
