
I am new to Python. I need to traverse the files in a directory and build a 2D list that pairs each file (the key) with a value. Then I need to sort the list by value and delete the files whose values fall in the lower half. How can I do that?

This is what I have so far. I can't figure out how to create such a 2D array.

dir = "images"
num_files=len(os.listdir(dir))
for file in os.listdir(dir):
    print(file)
    value = my_function(file)
    #this is wrong:
    _list[0][0].append(value)

#and then sorting, and removing the files associated with lower half

Basically, the 2D array should look like [[file1, 0.876], [file2, 0.5], [file3, 1.24]], and it then needs to be sorted by the second element of each pair.

DYZ
Tina J
  • Dupe of [How to sort (list/tuple) of lists/tuples?](https://stackoverflow.com/questions/3121979/how-to-sort-list-tuple-of-lists-tuples)? – DYZ Sep 28 '18 at 20:54
  • Don't name lists `list`. Avoid all built-in names, because the variable shadows the built-in. – Patrick Artner Sep 28 '18 at 20:55
  • @DYZ That only covers the sorting part. I know that part. – Tina J Sep 28 '18 at 20:55
  • @PatrickArtner OK, will do. – Tina J Sep 28 '18 at 20:56
  • Create a list of pairs with `yourList.append( [file,value] )` or `yourList.append( (file,value) )`, then use the dupe to sort it. Then throw away the first n/2 elements of your sorted list. – Patrick Artner Sep 28 '18 at 20:56
  • Possible duplicate of [How to sort (list/tuple) of lists/tuples?](https://stackoverflow.com/questions/3121979/how-to-sort-list-tuple-of-lists-tuples) – Patrick Artner Sep 28 '18 at 21:04

2 Answers


Based on the comments, it looks like I need to append each pair like this:

mylist.append([file, value])

And to sort by the value, I need to do this:

mylist.sort(key=lambda mylist: mylist[1])
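Putting the two pieces together with the deletion step, the whole task could be sketched roughly as follows. This is only a sketch under assumptions: `delete_lower_half` and its `score` parameter are hypothetical names, `score` stands in for `my_function` and is assumed to take a file name and return a number, and "lower half" is read as the first half of the sorted list.

```python
import os


def delete_lower_half(directory, score):
    """Score every file in `directory`, then delete the lower-scoring half.

    `score` is a hypothetical stand-in for my_function: it takes a file
    name and returns a number.
    """
    pairs = []  # the "2D list": [[file1, 0.876], [file2, 0.5], ...]
    for name in os.listdir(directory):
        # Skip subdirectories; only regular files are scored and deleted.
        if os.path.isfile(os.path.join(directory, name)):
            pairs.append([name, score(name)])

    # Sort ascending by the second element of each pair.
    pairs.sort(key=lambda item: item[1])

    # The first half of the sorted list holds the lowest values.
    lower_half = pairs[:len(pairs) // 2]
    for name, _value in lower_half:
        os.remove(os.path.join(directory, name))

    # Report which files were removed.
    return [name for name, _value in lower_half]
```

Because of the integer division, a directory with an odd number of files keeps the middle file. Since the deletion is permanent, it is probably worth trying this on a copy of the directory first.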
Tina J
  • Also see https://docs.python.org/3.3/howto/sorting.html#operator-module-functions – Jeff Sep 28 '18 at 21:13
  • You might want to give the lambda variable a different name from your list: `mylist.sort(key=lambda item: item[1])`. Reusing the same name causes confusion ;) – Patrick Artner Sep 28 '18 at 21:15

I don't understand what this phrase means:

delete the files with lower half of values

Does it mean that you have to delete the files whose value is below the midpoint between the minimum and maximum values, or that you have to delete the lower half of the files once they are ranked by value?
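To make the difference concrete, here is a small sketch with made-up scores. The file names and the outlier value 5.0 are invented for illustration; the other three values are the ones from the question.

```python
# Hypothetical scores for four files (names and values are illustrative only).
scores = {"a.png": 0.5, "b.png": 0.876, "c.png": 1.24, "d.png": 5.0}

# Interpretation 1: delete everything below the midpoint of min and max.
midpoint = (min(scores.values()) + max(scores.values())) / 2
below_midpoint = sorted(f for f, v in scores.items() if v < midpoint)

# Interpretation 2: delete the lower half of the files, ranked by value.
ranked = sorted(scores, key=scores.get)
lower_half = ranked[:len(ranked) // 2]

print(midpoint)        # 2.75
print(below_midpoint)  # ['a.png', 'b.png', 'c.png']
print(lower_half)      # ['a.png', 'b.png']
```

With one large outlier the midpoint rule removes three of the four files, while the half-list rule always removes half of them, so the two readings can give quite different results.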

There is no need for a 2D array, since the second element of each pair can be computed from the first with my_function. Here is a function that handles both interpretations:

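from os import listdir as ls
from os import remove as rm
from os.path import isfile, join

def delete_low_score_files(directory, func, criterion="midpoint"):
    """Delete files having a low score according to func.

       Args:
           directory (str): path of the directory;
           func (callable): function that scores a file name;
           criterion (str): can be "midpoint" or "half-list";
       Returns:
           (list) names of the deleted files.
    """

    # Only consider regular files, and score each one exactly once.
    files = [f for f in ls(directory) if isfile(join(directory, f))]
    scores = {f: func(f) for f in files}
    sorted_files = sorted(files, key=scores.get)

    # Nothing to do for an empty directory.
    if not sorted_files:
        return []

    if criterion == "midpoint":
        # Midpoint between the lowest and the highest score.
        midpoint = (scores[sorted_files[0]] + scores[sorted_files[-1]]) / 2
        files_to_delete = [f for f in sorted_files if scores[f] < midpoint]
    elif criterion == "half-list":
        # Integer division, because slice indices must be integers.
        n = len(sorted_files) // 2
        files_to_delete = sorted_files[:n]
    else:
        raise ValueError('criterion must be "midpoint" or "half-list"')

    for f in files_to_delete:
        # Build the path from the directory, not the current working directory.
        rm(join(directory, f))

    return files_to_delete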
Weird Mike
Scrooge McDuck