2

I have tried this:

ListFiles = list(os.walk(os.getcwd()))
SplitTypes = []

for i in range(0, len(ListFiles)):
    SplitTypes.extend(ListFiles[i].rsplit(".")[1])

print(SplitTypes)

but the result I got is this:

['p', 'y', 't', 'x', 't', '3', '2', ' ', 'p', 'm', '4', '0', '1', '2', '2', '4', '1', '4', 's', 'k', 'p', 'm', 'p', '4']

The digits come from the screenshots I had in that directory: their file names contain a time whose parts are separated by '.', and every character ends up as a separate list element.

I want the result to display:

['py', 'txt', 'skp', 'png']

or something similar to that.

Also, a file name such as 'Screen Shot 2017-05-24 at 11.24.33 am.png' should yield only the 'png' part. After this I will use set() to remove the duplicates.

Adrian

2 Answers

4

This can be done by:

import os

ListFiles = os.walk(os.getcwd())
SplitTypes = []
for walk_output in ListFiles:
    for file_name in walk_output[-1]:
        SplitTypes.append(file_name.split(".")[-1])

print(SplitTypes)

os.walk yields 3-element tuples (dirpath, dirnames, filenames): one for the starting directory and one for each sub-directory beneath it. The last element of each tuple is the list of file names in that directory. So we iterate over the output of os.walk and take the last element of each tuple, which gives us a list of file names. We then iterate over those names, split each one on '.', and take the last element of the resulting list as the extension. In Python, subscripting any sequence with -1 returns its last element. Finally, we append the extracted extension to SplitTypes.
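As a variation on the loop described above, here is a minimal sketch that unpacks each os.walk tuple by name and uses os.path.splitext instead of str.split, so that names without an extension (for example 'README' or '.DS_Store') are skipped rather than returned whole. The helper name extensions_in_tree is hypothetical, and collecting into a set is an assumption based on the asker's plan to remove duplicates afterwards.

```python
import os


def extensions_in_tree(root):
    """Collect the distinct file extensions found under root (hypothetical helper)."""
    found = set()
    # Each item produced by os.walk is a (dirpath, dirnames, filenames) tuple.
    for _dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # splitext splits at the last dot only, so a name such as
            # 'Screen Shot 2017-05-24 at 11.24.33 am.png' gives '.png'.
            ext = os.path.splitext(name)[1]
            if ext:  # empty for 'README' and for dotfiles like '.DS_Store'
                found.add(ext[1:])  # drop the leading dot
    return found
```

Calling sorted(extensions_in_tree(os.getcwd())) would then give a de-duplicated, ordered list of extensions for the current directory and everything below it.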

If your folder contains no sub-folders, the problem can be solved more simply:

import os

SplitTypes=[]
for file in os.listdir(os.curdir):
    SplitTypes.append(file.split('.')[-1])
print(SplitTypes)

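os.listdir(path) returns the names of the entries directly inside the path directory without descending into sub-directories. Note that the sub-directories themselves are listed as entries, so this snippet only gives clean results when the folder contains files only.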
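If the folder does contain sub-folders and only the files directly inside it should be considered, one possible approach is to filter the os.listdir output with os.path.isfile. This is a sketch under that assumption; the helper name extensions_in_dir is hypothetical.

```python
import os


def extensions_in_dir(path="."):
    """Distinct extensions of the files directly inside path (hypothetical helper)."""
    found = set()
    for name in os.listdir(path):
        # listdir returns bare names, so rebuild the full path before testing it.
        full_path = os.path.join(path, name)
        if not os.path.isfile(full_path):
            continue  # skip sub-directories and other non-file entries
        ext = os.path.splitext(name)[1]
        if ext:  # ignore names that have no extension
            found.add(ext[1:])  # drop the leading dot
    return found
```

Because it never descends into sub-directories, a file such as an mp3 stored one level down would not appear in the result.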

Abhinav Gupta
  • This code works if there are no folders. With the first snippet I get results that aren't much different from the result of my original code; the only difference is that it gives just the last letter of the extension, e.g. txt becomes 't' and py becomes 'y'. The second snippet works well except when there are folders: os.listdir(path) lists the folders too, not only file names (FYI I'm using a Mac, so that might be the reason). – Adrian Jul 24 '17 at 01:41
  • Yes, as mentioned in the answer, second code snippet will only work when you don't have any other folders. I tested my first code snippet again and it worked fine on my Ubuntu machine. I took a quick look into the documentation and couldn't find anything pointing to difference in behaviour on Ubuntu and Mac for the functions I have used. Probably, I am overlooking something. – Abhinav Gupta Jul 24 '17 at 03:46
  • But I am not familiar with Mac OS at all. Can you print the walk_output variable from inside the loop and post the output here for some sample folder. That will help me find the issue. – Abhinav Gupta Jul 24 '17 at 03:53
  • I have found the problem in my code, I used os.listdir instead of walk, now it works but it prints out everything in the directories. Even files in the sub directories. – Adrian Jul 24 '17 at 04:55
  • I think I will be able to solve your issue if you provide me the value of walk_output from my code in each iteration. If that's not possible, can you provide me the new code you are using along with the output? – Abhinav Gupta Jul 24 '17 at 08:30
  • ['DS_Store', 'txt', 'py', 'py', 'txt', 'py', 'txt', 'jpg', 'txt', 'png', 'png', 'png', 'png', 'skp', 'mp4', 'py', 'mp3'] The mp3 is in a sub directory but it still shows it, I want it to show only this directory. – Adrian Jul 24 '17 at 09:34
  • See [this](https://stackoverflow.com/questions/22207936/python-how-to-find-files-and-skip-directories-in-os-listdir) question on filtering out the folders from the output of os.listdir(). – Abhinav Gupta Jul 24 '17 at 11:28
0

Path in pathlib is convenient for getting extensions in Python 3.4+

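import os
from pathlib import Path

def find_extensions(dir_path, excluded=('', '.txt', '.lnk')):
    """Return the set of lower-cased file extensions found under dir_path."""
    extensions = set()
    # os.walk yields (dirpath, dirnames, filenames); only the file names are needed
    for _, _, files in os.walk(dir_path):
        for f in files:
            # suffix keeps the leading dot ('.png') and is '' when there is no extension
            ext = Path(f).suffix.lower()
            if ext not in excluded:
                extensions.add(ext)
    return extensions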
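The same idea can also be written with pathlib alone. The following is only a sketch, not the answer's own code: the name find_extensions_pathlib and the recursive flag are hypothetical additions. With recursive=False it looks only at the files directly inside the directory, which is what the asker requested in the comments.

```python
from pathlib import Path


def find_extensions_pathlib(dir_path, excluded=("",), recursive=True):
    """Lower-cased suffixes (leading dot kept) of the files under dir_path.

    Hypothetical variant: 'recursive' chooses between walking the whole
    tree (rglob) and listing only the top-level entries (iterdir).
    """
    root = Path(dir_path)
    entries = root.rglob("*") if recursive else root.iterdir()
    return {
        p.suffix.lower()
        for p in entries
        # keep regular files only, and drop excluded or empty suffixes
        if p.is_file() and p.suffix.lower() not in excluded
    }
```

For example, find_extensions_pathlib(".", recursive=False) would report the extensions of the current directory only, with the leading dot kept as in Path.suffix.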
LetzerWille