I have to delete the oldest file in a directory when the newest added file exceeds the used-space limit. I don't know why the sorted list files = sorted(os.listdir(DIR), key=os.path.getctime) does not have the oldest file (in this case the file named 'file_1') as its first element.

code

    print('START: Cycle no. ', cycle)
    time.sleep(1)
    print('Saving {0} files. {1} MB each'.format(FILES_NUM, MB_FILE_SIZE))
    i = 1
    while (i < FILES_NUM):
        usage = psutil.disk_usage(DIR)
        used = usage.used // (2**20)
        # print('Used memory: ', used)
        if (used < 50): # 50 MB
            print('Saving file_{}'.format(i))
            with open("file_{}".format(i), 'wb') as f:
                f.write(os.urandom(FILE_SIZE))           
        else:
            files = sorted(os.listdir(DIR), key=os.path.getctime)
            print('Files list: ', files)
            os.remove(files[0])
            print('Deleted oldest file: ',files[0])
        i = i + 1

    print('END: Cycle no. ', cycle)
    print('Deleting the content of the card...')

results

[Image: stress test results]

EDIT: I know that the file saved right after a deletion should have a number one higher than the previously saved file. In this example the output should be 'Saving file_22' instead of 'Saving file_23'. The index i = 22 is consumed by the deletion step. How can I avoid this?

1 Answer

You sort the files by ctime, not alphabetically, so don't assume the oldest file will be file_1. Let's see this with a simplified version of your code:

import os

from datetime import datetime

FILES_NUM = 10
FILE_SIZE = 10  # bytes


def main():
    for i in range(1, FILES_NUM + 1):
        if i == 5:  # Assume disk usage has been exceeded
            # Sort by ctime, oldest first. Bare names work here only
            # because the listed directory is the current one.
            names = sorted(os.listdir('.'), key=os.path.getctime)
            files = [
                f'{name} - {datetime.fromtimestamp(os.path.getctime(name))}'
                for name in names
                if name.startswith('file_')
            ]
            print(f'Files list: {files}')
            # Demo only: the oldest file is reported, not actually removed.
            print(f'Deleted oldest file: {files[0]}')

        # Save on every iteration so that no index is skipped.
        print(f'Saving file{i}')
        with open(f'file_{i}', 'wb') as f:
            f.write(os.urandom(FILE_SIZE))


if __name__ == '__main__':
    main()

First run (no files):

$ python test.py
Saving file1
Saving file2
Saving file3
Saving file4
Files list: ['file_1 - 2019-05-30 15:36:36.366754', 'file_2 - 2019-05-30 15:36:36.367754', 'file_4 - 2019-05-30 15:36:36.367754', 'file_3 - 2019-05-30 15:36:36.367754']
Deleted oldest file: file_1 - 2019-05-30 15:36:36.366754
Saving file5
Saving file6
Saving file7
Saving file8
Saving file9
Saving file10

Second run (old files already exist):

$ python test.py
Saving file1
Saving file2
Saving file3
Saving file4
Files list: ['file_6 - 2019-05-30 15:36:36.367754', 'file_5 - 2019-05-30 15:36:36.367754', 'file_7 - 2019-05-30 15:36:36.367754', 'file_8 - 2019-05-30 15:36:36.368754', 'file_10 - 2019-05-30 15:36:36.368754', 'file_9 - 2019-05-30 15:36:36.368754', 'file_1 - 2019-05-30 15:37:00.360535', 'file_2 - 2019-05-30 15:37:00.361535', 'file_4 - 2019-05-30 15:37:00.361535', 'file_3 - 2019-05-30 15:37:00.361535']
Deleted oldest file: file_6 - 2019-05-30 15:36:36.367754
Saving file5
Saving file6
Saving file7
Saving file8
Saving file9
Saving file10

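As you can see, the oldest file during the second run is file_6. When the disk-usage threshold is hit, we enter the if branch and sort the existing files. At that point only files 1-4 have been rewritten in the current run, so files 5-10, left over from the first run, are still older. (file_5 and file_6 share the same timestamp, so their relative order is arbitrary.)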
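
If leftovers from an earlier run are not wanted, one option is to clear them before the cycle starts, so that the oldest file by timestamp always belongs to the current run. A minimal sketch, assuming the test files all share the file_ prefix used in this example; clear_previous_files is a hypothetical helper name, not part of the original code:

```python
import os


def clear_previous_files(directory, prefix='file_'):
    """Remove regular files in `directory` whose names start with `prefix`.

    Returns the sorted names of the removed files.
    """
    # Read the whole listing first so that we do not delete entries
    # while the directory is still being iterated.
    with os.scandir(directory) as it:
        entries = list(it)

    removed = []
    for entry in entries:
        # Skip subdirectories and files that do not match the prefix.
        if entry.is_file() and entry.name.startswith(prefix):
            os.remove(entry.path)
            removed.append(entry.name)
    return sorted(removed)
```

Calling it once at the start of each cycle (for example clear_previous_files('.')) leaves unrelated files such as the script itself untouched.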

Please also note that on UNIX systems ctime is the time of the last metadata change (for example to file ownership or permissions), not the file's creation time. To sort by the last content modification instead, use mtime (os.path.getmtime).
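
A sketch of sorting by modification time, assuming the test files share the file_ prefix; files_oldest_first is a hypothetical helper name. It also builds full paths, because os.listdir returns bare names, so os.path.getmtime (or getctime) would otherwise resolve them against the current working directory rather than the listed directory:

```python
import os


def files_oldest_first(directory, prefix='file_'):
    """Return full paths of matching files, oldest modification first."""
    paths = [
        os.path.join(directory, name)
        for name in os.listdir(directory)
        if name.startswith(prefix)
    ]
    # st_mtime_ns avoids float rounding; the path breaks exact ties
    # so that the resulting order is deterministic.
    return sorted(paths, key=lambda p: (os.stat(p).st_mtime_ns, p))
```

The first element, files_oldest_first(DIR)[0], is then the path to pass to os.remove.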

The index issue is also fixed in the simplified code above: the file is saved on every iteration, after the cleanup check, instead of in an else branch, so no index is consumed by a deletion.
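
One way to avoid the skipped index is to free space first and then save on every iteration. A minimal sketch under assumptions that are not in the original code: the limit is measured as the total size of the files written in this run rather than with psutil.disk_usage, the files are tracked in creation order instead of relying on timestamps, and the directory holds no earlier test files. save_with_rotation is a hypothetical name:

```python
import os
from collections import deque


def save_with_rotation(directory, files_num, file_size, limit_bytes):
    """Save file_1..file_N, deleting the oldest files to stay within limit_bytes.

    Returns (saved, deleted): file names in the order they were handled.
    """
    kept = deque()  # paths written in this run, oldest first
    used = 0        # bytes currently occupied by the files in `kept`
    saved, deleted = [], []

    for i in range(1, files_num + 1):
        # Make room before writing, so the index i is never skipped.
        while kept and used + file_size > limit_bytes:
            oldest = kept.popleft()
            used -= os.path.getsize(oldest)
            os.remove(oldest)
            deleted.append(os.path.basename(oldest))

        path = os.path.join(directory, f'file_{i}')
        with open(path, 'wb') as f:
            f.write(os.urandom(file_size))
        kept.append(path)
        used += file_size
        saved.append(os.path.basename(path))

    return saved, deleted
```

With files_num=10, file_size=10 and limit_bytes=50, files 1 to 5 are deleted, files 6 to 10 remain, and every index from 1 to 10 is saved exactly once.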

Note: the example uses Python 3.7+.

HTF