
I'm slowly getting my feet wet in Python but still seem to be missing some fundamentals, especially with lists and dictionaries.

I'm building an importer and want to check a directory for the files I can import. Here's the function I'm trying to build for this:

def check_files(directory=os.path.dirname(os.path.realpath(__file__))):
    file_number = 0
    files = {}
    for file in os.listdir(directory):
        if os.path.isfile(file):
            file_name = os.fsdecode(file)
            --> files = {file_number: {'file_name': file_name}}
            with open(file_name,'r', encoding='utf-8', errors='ignore') as f:
                line = f.readline()
                if line == firstline['one']:
                    --> files = {file_number: {'file_type': 'one'}}
                elif line == firstline['two']:
                    --> files = {file_number: {'file_type': 'two'}}
                else:
                    --> files = {file_number: {'file_type': 'unknown'}}
        file_number += 1

    return files

As you can see, each marked line replaces the whole dictionary instead of adding to it, so I'm failing to build the dictionary I had in mind to carry some file information and return it.

Regarding the dictionary structure I was thinking about something like this:

files = {
    0: {'file_name': 'test1.csv', 'file_type': 'one'},
    1: {'file_name': 'test2.csv', 'file_type': 'two'}
}

My question is: how do I build the dictionary step by step as I get the values, adding new dictionaries into it as I go? I have read quite a few dictionary explanations for beginners, but they mostly don't cover this multi-level case, at least not building it step by step.

Helmi
    I gave you a theoretical answer, your code with it applied, and a suggested alternative that uses a `list` of `dict`s instead of a `dict` of `dict`s, since your outer `dict` uses incremental integers as keys. – Adirio Oct 20 '17 at 07:45

1 Answer


Instead of rebuilding the dict with a literal each time, you should assign to a key using square brackets:

base_dict = {} # same as base_dict = dict()

for i in range(10):
    base_dict[i] = {'file_name': 'test' + str(i+1) + '.csv', 'file_type': i+1}

The first line creates an empty dict.

The loop iterates over i = 0..9. On each pass we assign a new dict to key i of base_dict with base_dict[i] = .... Square brackets are used both to read and to modify (or create) key-value pairs inside a dict.
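The same square-bracket assignment works one level down, which is what the multi-level case needs. Here is a minimal sketch; the `found` list is hypothetical sample data standing in for the values discovered while looping over files:

```python
# Hypothetical sample data: (file name, detected type) pairs.
found = [("test1.csv", "one"), ("test2.csv", "two")]

files = {}
for file_name, file_type in found:
    i = len(files)                        # next free integer key
    files[i] = {"file_name": file_name}   # create the inner dict under key i
    files[i]["file_type"] = file_type     # add a second key to that same inner dict

print(files)
```

The first assignment creates the inner dict; the second one looks it up again with files[i] and adds another key to it, so nothing that was stored earlier gets overwritten.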

Your code will be:

import os

def check_files(directory=os.path.dirname(os.path.realpath(__file__))):
    # firstline is your own mapping of file type -> expected first line
    files = {}
    for file in os.listdir(directory):
        # listdir() returns bare names, so join them with the directory
        path = os.path.join(directory, file)
        if os.path.isfile(path):
            i = len(files)
            file_name = os.fsdecode(file)
            files[i] = {'file_name': file_name}
            with open(path, 'r', encoding='utf-8', errors='ignore') as f:
                line = f.readline()
                if line == firstline['one']:
                    files[i]['file_type'] = 'one'
                elif line == firstline['two']:
                    files[i]['file_type'] = 'two'
                else:
                    files[i]['file_type'] = 'unknown'
    return files

As you can see, I removed the manual counter you were using and instead get the number of already existing elements with i = len(files). I then use square brackets to enter the information as it becomes available.
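A possible variation, shown only as a sketch: build the inner dict completely and store it once at the end. The `firstline` contents below are assumed placeholders (the real mapping is not shown in the question), `types_by_line` is a helper name introduced here, and the temporary directory exists only to make the example runnable.

```python
import os
import tempfile

# Assumed stand-in for the question's firstline mapping.
firstline = {"one": "header-one\n", "two": "header-two\n"}
# Reverse lookup: first line of a file -> file type.
types_by_line = {line: kind for kind, line in firstline.items()}

def check_files(directory):
    files = {}
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if not os.path.isfile(path):
            continue
        entry = {"file_name": name}
        with open(path, "r", encoding="utf-8", errors="ignore") as f:
            line = f.readline()
        # Unrecognised first lines fall back to 'unknown'.
        entry["file_type"] = types_by_line.get(line, "unknown")
        files[len(files)] = entry   # store the finished inner dict
    return files

# Demo on three throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    samples = [("test1.csv", "header-one\n"),
               ("test2.csv", "header-two\n"),
               ("test3.csv", "something else\n")]
    for name, header in samples:
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write(header + "data\n")
    result = check_files(tmp)

print(result)
```

Note that readline() keeps the trailing newline, so the expected first lines in this sketch include it as well.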

IMPORTANT NOTE

Your case may be more complex than this, but a dictionary whose keys are auto-incremented integers makes little sense; that is what lists are for. Your code with lists would look like this:

import os

def check_files(directory=os.path.dirname(os.path.realpath(__file__))):
    # firstline is your own mapping of file type -> expected first line
    files = []
    for file in os.listdir(directory):
        # listdir() returns bare names, so join them with the directory
        path = os.path.join(directory, file)
        if os.path.isfile(path):
            file_name = os.fsdecode(file)
            files.append({'file_name': file_name})
            with open(path, 'r', encoding='utf-8', errors='ignore') as f:
                line = f.readline()
                if line == firstline['one']:
                    files[-1]['file_type'] = 'one'
                elif line == firstline['two']:
                    files[-1]['file_type'] = 'two'
                else:
                    files[-1]['file_type'] = 'unknown'
    return files

As you can see, it is very similar to the code above, but it doesn't need to compute the length on every iteration, because the built-in method list.append() already inserts the new data at the next position. When the keys would just be auto-incremented integers, lists offer some advantages over dicts, such as negative indexing and slicing.

The output would be:

files = [ {'file_name': 'test1.csv', 'file_type': 'one'}, {'file_name': 'test2.csv', 'file_type': 'two'} ]

Remember that even though the output doesn't explicitly show the index numbers, lists let you access the data the same way. Additionally, negative integers index from the end, so files[-1] means the last element we inserted. That's why I don't need to know the position of the element being added in this example: we just append it at the end and access the last value appended.
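A short sketch of that access pattern, using the sample output from above. The name `numbered` is introduced here only to show that the integer-keyed dict can still be derived from the list if something really needs it:

```python
files = [
    {"file_name": "test1.csv", "file_type": "one"},
    {"file_name": "test2.csv", "file_type": "two"},
]

first = files[0]["file_name"]    # same lookup you would write with integer dict keys
last = files[-1]["file_name"]    # negative index: the most recently appended entry

# enumerate() yields (index, entry) pairs, so the dict-of-dicts form is one line away.
numbered = dict(enumerate(files))

print(first, last, numbered[1]["file_type"])
```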

Adirio