I wrote a function to list all files and directories recursively.

import os
def walk_dir(path,dir_list=[],file_list=[]):
    for fname in os.listdir(path):
        fullname = os.path.join(path, fname)
        if  os.path.isdir(fullname) :
            dir_list.append(fullname)
            walk_dir(fullname,dir_list,file_list)
        elif os.path.isfile(fullname) :
            file_list.append(fullname)
    return {'dir':dir_list,'file':file_list}

It returns a dictionary.
I created a target directory to test my code.

 mkdir -p /tmp/test
 cd /tmp/test
 mkdir -p  test{1..3}
 cd test1
 touch test1{1..3}
 cd ../test2
 touch test2{1..2}

Here is the target directory:

tree  /tmp/test
/tmp/test
├── test1
│   ├── test11
│   ├── test12
│   └── test13
├── test2
│   ├── test21
│   └── test22
├── test3

To get all directories and files in /tmp/test:

x = walk_dir('/tmp/test')
x['dir']
['/tmp/test/test1', '/tmp/test/test3', '/tmp/test/test2']

Now reset x to {}, an empty dictionary.

x = {}
x
{}
x['dir']
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
KeyError: 'dir'
dir_list
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'dir_list' is not defined

So it seems that every variable has been cleared.
Now get the files and directories a second time:

x = walk_dir('/tmp/test')
x['dir']
['/tmp/test/test1', '/tmp/test/test3', '/tmp/test/test2', '/tmp/test/test1', '/tmp/test/test3', '/tmp/test/test2']

I reset x to {}, so why does x['dir'] contain 6 directories instead of 3?

showkey
    Does this answer your question? ["Least Astonishment" and the Mutable Default Argument](https://stackoverflow.com/questions/1132941/least-astonishment-and-the-mutable-default-argument) – dspencer Apr 07 '20 at 04:17

1 Answer

This is one of those Python gotchas.

The default value, an empty list, is created once, when the function definition is executed, not each time the function is called. The same list object therefore persists between function calls.

You can see it demonstrated here. test_func is a function whose parameter defaults to a list. Each call appends to that same list, so the change made by one call is still there on the next.

>>> def test_func(param=[]):
...     param.append(1)
...     print(param)
... 
>>> test_func()
[1]
>>> test_func()
[1, 1]
>>> test_func()
[1, 1, 1]
>>> test_func()
[1, 1, 1, 1]
>>> test_func()
[1, 1, 1, 1, 1]
>>> test_func()
[1, 1, 1, 1, 1, 1]

If you look at your output, you'll see duplicates: each path appears twice. If you run it again, each path should appear three times, and so on. This only bites with mutable default values, such as lists and dictionaries, because the function modifies the shared object in place; immutable defaults such as numbers, strings, and tuples can't be modified that way. Unless you actually want this behavior, avoid mutable objects as default arguments.

Instead, set the default parameter to None and check for it within the function body.

>>> def test_func2(param=None):
...     if param is None:
...         param = []
...     param.append(1)
...     print(param)
... 
>>> test_func2()
[1]
>>> test_func2()
[1]
>>> test_func2()
[1]
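
Applied to your function, the same pattern might look like the sketch below. The body is your walk_dir unchanged apart from the None defaults; the temporary directory is only there to make the example self-contained and stands in for /tmp/test.

```python
import os
import tempfile

def walk_dir(path, dir_list=None, file_list=None):
    # Create fresh lists on each top-level call instead of sharing defaults.
    if dir_list is None:
        dir_list = []
    if file_list is None:
        file_list = []
    for fname in os.listdir(path):
        fullname = os.path.join(path, fname)
        if os.path.isdir(fullname):
            dir_list.append(fullname)
            # Recursive calls pass the lists explicitly, so they keep accumulating.
            walk_dir(fullname, dir_list, file_list)
        elif os.path.isfile(fullname):
            file_list.append(fullname)
    return {'dir': dir_list, 'file': file_list}

# Stand-in for /tmp/test: three subdirectories in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    for name in ("test1", "test2", "test3"):
        os.mkdir(os.path.join(root, name))
    first = walk_dir(root)
    second = walk_dir(root)
    print(len(first['dir']), len(second['dir']))  # 3 3
```

The recursion still works because the inner calls receive the caller's lists; only a call that omits them gets new ones.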
ProfOak