
I need to replicate the functionality of a file directory tree as a list. I have to be able to search for specific "documents" inside the "folders", and the same name may appear again at other depths. I also have to be able to add new files and folders dynamically at runtime. For example, take a file tree like this:

MyFiles
    Important
        doc1
        doc2
    LessImportant
        doc3
        doc4
    LowPriority
        Important
            doc1
        LessImportant
            doc4

If I use nested lists, the above tree would end up looking like:

["MyFiles", ["Important", ["doc1", "doc2"], "LessImportant", ["doc3", "doc4"], "LowPriority",
    ["Important", ["doc1"], "LessImportant", ["doc4"]]]]

I would then have to loop through every nested list to search for anything, and use .append to add new "folders" or "documents".
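For reference, the looping described above can be written as one recursive generator. This is only a sketch: it assumes the layout shown in the nested list (a name may be followed by a list holding its children), and `find_paths` is a hypothetical helper name, not something from a library.

```python
def find_paths(tree, target, prefix=()):
    """Yield the path of every entry named `target` in a nested-list tree.

    Assumed layout: a name may be followed by a list holding its children.
    """
    for i, item in enumerate(tree):
        if isinstance(item, list):
            continue  # child lists are visited via the name that precedes them
        path = prefix + (item,)
        if item == target:
            yield "/".join(path)
        # A list directly after a name holds that name's children.
        if i + 1 < len(tree) and isinstance(tree[i + 1], list):
            yield from find_paths(tree[i + 1], target, path)


tree = ["MyFiles", ["Important", ["doc1", "doc2"],
                    "LessImportant", ["doc3", "doc4"],
                    "LowPriority", ["Important", ["doc1"],
                                    "LessImportant", ["doc4"]]]]

for hit in find_paths(tree, "doc4"):
    print(hit)
```

It works, but every search visits the whole structure, and the "name followed by list" convention is easy to break when appending.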

Is there a better / more efficient way than nested lists?

BerickCook
  • Do you need to distinguish between files and directories? – NewNewton Apr 26 '18 at 01:09
  • And: Do you need to get the path of the file you're looking for as well? – NewNewton Apr 26 '18 at 01:35
  • If you need a tree-like structure, this might help: [How can I implement a tree in Python? Are there any built in data structures in Python like in Java](https://stackoverflow.com/q/2358045/2745495). – Gino Mempin Apr 26 '18 at 01:47
  • Have you considered using [ElementTree](https://docs.python.org/2/library/xml.etree.elementtree.html#elementtree-objects)? It might be overkill, but it gives you search and iteration functions for free. – Mike Robins Apr 26 '18 at 08:53
  • I don't need to distinguish between files and directories, I just need to be able to find the "files" that are organized via the "directories" – BerickCook Apr 26 '18 at 09:51

3 Answers


Using ElementTree gives you search and iteration functions for free. The function below mirrors a directory on disk as a tree of elements:

import os
import xml.etree.ElementTree as ET

def ls(p):
    if os.path.isdir(p):
        node = ET.Element(os.path.basename(p), type='dir')
        node.extend([ls(os.path.join(p, f)) for f in os.listdir(p)])
    else:
        node = ET.Element(os.path.basename(p), type='file')
    return node

To test it, write the tree out as XML, which is easy to do from ElementTree:

root = ET.ElementTree(ls(r"C:\test\Myfiles"))

from xml.dom import minidom

def pp(tree):
    # Pretty-print via minidom and drop the XML declaration on the first line.
    xml_text = minidom.parseString(ET.tostring(tree.getroot())).toprettyxml(indent='  ')
    print(''.join(xml_text.splitlines(True)[1:]))

pp(root)

Gives

<Myfiles type="dir">
  <Important type="dir">
    <doc1 type="file"/>
    <doc2 type="file"/>
  </Important>
  <LessImportant type="dir">
    <doc1 type="file"/>
    <doc2 type="file"/>
  </LessImportant>
  <LowPriority type="dir">
    <Important type="dir">
      <doc1 type="file"/>
    </Important>
    <LessImportant type="dir">
      <doc4 type="file"/>
    </LessImportant>
  </LowPriority>
</Myfiles>

You can experiment to decide whether "dir" or "file" should be the element tag or an attribute.
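As a rough illustration of the search and runtime-add side, here is a sketch that builds a small tree by hand instead of reading the disk. The names come from the question; `doc5` is made up for the example. It uses only `Element`, `SubElement`, `Element.iter` and `Element.find` from the standard library.

```python
import xml.etree.ElementTree as ET

# Build a small tree by hand (names taken from the question).
root = ET.Element("MyFiles", type="dir")
important = ET.SubElement(root, "Important", type="dir")
ET.SubElement(important, "doc1", type="file")
ET.SubElement(important, "doc2", type="file")
low = ET.SubElement(root, "LowPriority", type="dir")
nested = ET.SubElement(low, "Important", type="dir")
ET.SubElement(nested, "doc1", type="file")

# Element.iter(tag) walks the whole subtree, so duplicates at any depth are found.
matches = list(root.iter("doc1"))
print(len(matches))

# Adding a "document" at runtime is a single SubElement call
# ("doc5" is a made-up name for this example).
ET.SubElement(nested, "doc5", type="file")

# A path expression addresses one specific location.
specific = root.find("./LowPriority/Important/doc5")
print(specific.tag, specific.get("type"))
```

One caveat: names used as tags have to be valid XML names if you ever serialize and re-parse the tree, which is a reason to consider storing the name in an attribute instead.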

Mike Robins

What about a flat structure of dict records, like this:

{"ID": 0, "Type": 'Folder', "Name": 'MyFiles', "Subdirectories": [1, 2, 3]}
{"ID": 1, "Type": 'Folder', "Name": 'Important', "Subdirectories": []}
{"ID": 2, "Type": 'Folder', "Name": 'LessImportant', "Subdirectories": []}
{"ID": 3, "Type": 'Folder', "Name": 'LowPriority', "Subdirectories": [4, 5]}
{"ID": 4, "Type": 'Folder', "Name": 'Important', "Subdirectories": []}
{"ID": 5, "Type": 'Folder', "Name": 'LessImportant', "Subdirectories": []}

{"ID": 0, "Type": 'File', "Name": 'doc1', 'ParentDirectory': 1}
{"ID": 1, "Type": 'File', "Name": 'doc2', 'ParentDirectory': 1}
{"ID": 2, "Type": 'File', "Name": 'doc3', 'ParentDirectory': 2}
{"ID": 3, "Type": 'File', "Name": 'doc4', 'ParentDirectory': 2}
{"ID": 4, "Type": 'File', "Name": 'doc1', 'ParentDirectory': 4}
{"ID": 5, "Type": 'File', "Name": 'doc4', 'ParentDirectory': 5}

This lets you walk the data recursively. The files are numbered separately from the folders. Each file has a 'ParentDirectory' entry holding the ID of the folder it is in. Each folder has a list of the IDs of its subdirectories, so all elements are linked through the ID field.
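A possible way to search such records is sketched below. It assumes the folder records are kept in a dict keyed by their ID and the file records in a plain list; that storage choice, and the helper names `folder_paths` and `find_files`, are my assumptions and not part of the structure above.

```python
# Folder records keyed by ID (an assumption about how the records are stored).
folders = {
    0: {"Name": "MyFiles", "Subdirectories": [1, 2, 3]},
    1: {"Name": "Important", "Subdirectories": []},
    2: {"Name": "LessImportant", "Subdirectories": []},
    3: {"Name": "LowPriority", "Subdirectories": [4, 5]},
    4: {"Name": "Important", "Subdirectories": []},
    5: {"Name": "LessImportant", "Subdirectories": []},
}
files = [
    {"ID": 0, "Name": "doc1", "ParentDirectory": 1},
    {"ID": 1, "Name": "doc2", "ParentDirectory": 1},
    {"ID": 2, "Name": "doc3", "ParentDirectory": 2},
    {"ID": 3, "Name": "doc4", "ParentDirectory": 2},
    {"ID": 4, "Name": "doc1", "ParentDirectory": 4},
    {"ID": 5, "Name": "doc4", "ParentDirectory": 5},
]


def folder_paths(folder_id=0, prefix=""):
    """Map every reachable folder ID to its full path via Subdirectories."""
    path = prefix + folders[folder_id]["Name"]
    paths = {folder_id: path}
    for child_id in folders[folder_id]["Subdirectories"]:
        paths.update(folder_paths(child_id, path + "/"))
    return paths


def find_files(name):
    """Return the full path of every file record called `name`."""
    paths = folder_paths()
    return [paths[f["ParentDirectory"]] + "/" + f["Name"]
            for f in files if f["Name"] == name]


print(find_files("doc4"))

# Adding a file at runtime is one append with the right parent ID.
files.append({"ID": 6, "Name": "doc7", "ParentDirectory": 4})
print(find_files("doc7"))
```

Adding a folder takes two steps: insert the new record, then append its ID to the parent's "Subdirectories" list.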

What

The OOP Approach

At first sight you might get the impression "Nah, that's too many lines of code" but it does have some great advantages (e.g. you're way more flexible).

Class / Basic Construct

class FileOrFolder:

    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children else []

    def search_for(self, f_name):
        global hits  # defined later on

        for child in self.children:

            if child.name == f_name:
                hits.append(child.name)

            if child.children:
                child.search_for(f_name)

Recreating the File Tree

TREE = FileOrFolder("MyFiles", [
    FileOrFolder("Important", [
        FileOrFolder("doc1"),
        FileOrFolder("doc2")
    ]),
    FileOrFolder("LessImportant", [
        FileOrFolder("doc3"),
        FileOrFolder("doc4")
    ]),
    FileOrFolder("LowPriority", [
        FileOrFolder("Important", [
            FileOrFolder("doc1")
        ]),
        FileOrFolder("LessImportant", [
            FileOrFolder("doc4")
        ])
    ])
])

Application & Output

>>> hits = []
>>> TREE.search_for("doc4")
>>> print(hits)

['doc4', 'doc4']
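If you also need the location of each hit, or want to avoid the global list, a variant along these lines might work. `Node`, `add` and the `prefix` parameter are hypothetical names for this sketch. Unlike the class above, it also checks the node the search starts from.

```python
class Node:
    """Sketch of a FileOrFolder variant that returns paths, with no global."""

    def __init__(self, name, children=None):
        self.name = name
        self.children = list(children) if children else []

    def add(self, child):
        """Attach a new file or folder at runtime and return it."""
        self.children.append(child)
        return child

    def search_for(self, f_name, prefix=""):
        """Return the path of every node named f_name in this subtree."""
        path = prefix + self.name
        hits = [path] if self.name == f_name else []
        for child in self.children:
            hits.extend(child.search_for(f_name, path + "/"))
        return hits


tree = Node("MyFiles", [
    Node("LessImportant", [Node("doc3"), Node("doc4")]),
    Node("LowPriority", [
        Node("LessImportant", [Node("doc4")]),
    ]),
])

print(tree.search_for("doc4"))

# Add a new "document" while the program is running ("doc5" is made up).
tree.children[0].add(Node("doc5"))
print(tree.search_for("doc5"))
```

Returning the list keeps each search independent, so you don't have to reset `hits` between calls.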

NOTE: I don't know whether your overall goal is to create a file tree manually or to iterate automatically through an existing, real one and "copy" it. If it's the latter, you would need to make some slight changes.
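For the "copy an existing tree" case, one possible change is a small recursive builder like the one below. `from_disk` is a hypothetical helper, and the class is repeated (without the search method) only so the snippet runs on its own.

```python
import os


class FileOrFolder:
    def __init__(self, name, children=None):
        self.name = name
        self.children = children if children else []


def from_disk(path):
    """Recursively mirror the directory at `path` as FileOrFolder objects."""
    node = FileOrFolder(os.path.basename(os.path.normpath(path)))
    if os.path.isdir(path):
        # sorted() only makes the child order predictable.
        for entry in sorted(os.listdir(path)):
            node.children.append(from_disk(os.path.join(path, entry)))
    return node
```

Calling `from_disk` on a real directory gives the same kind of object as the hand-written TREE. Note that `os.path.isdir` follows symbolic links, so a directory containing a link back to one of its parents would make this sketch recurse forever.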

NewNewton