
I have n log files that a script regularly downloads and uploads to Slack for monitoring purposes. However, with recent improvements to our PostgreSQL database, some of the log files are now empty (meaning no errors or long queries were recorded). That being said, I need to segregate the empty files from the non-empty ones: if a file is empty, skip uploading it entirely and proceed with the files that are not empty.

348 postgresql.log.2021-09-28-0000
679 postgresql.log.2021-09-28-0100
  0 postgresql.log.2021-09-28-0200
  0 postgresql.log.2021-09-28-0300
  0 postgresql.log.2021-09-28-0400
  0 postgresql.log.2021-09-28-0500
  0 postgresql.log.2021-09-28-0600
  0 postgresql.log.2021-09-28-0700
  0 postgresql.log.2021-09-28-0800
  0 postgresql.log.2021-09-28-0900
  0 postgresql.log.2021-09-28-1000
  0 postgresql.log.2021-09-28-1100
  0 postgresql.log.2021-09-28-1200
  0 postgresql.log.2021-09-28-1300
  0 postgresql.log.2021-09-28-1400
  0 postgresql.log.2021-09-28-1500
  0 postgresql.log.2021-09-28-1600
  0 postgresql.log.2021-09-28-1700
  0 postgresql.log.2021-09-28-1800
  0 postgresql.log.2021-09-28-1900
  0 postgresql.log.2021-09-28-2000
  0 postgresql.log.2021-09-28-2100
  0 postgresql.log.2021-09-28-2200
  0 postgresql.log.2021-09-28-2300

In this case we can see that only the files

348 postgresql.log.2021-09-28-0000
679 postgresql.log.2021-09-28-0100

contain 348 bytes and 679 bytes of data, respectively.

How do I make the Python script that is currently in use validate whether a file is empty before uploading it?

The closest thing I have found right now is

import os
if os.stat("postgresql.log.2021-09-28-2300").st_size == 0: 
    print('empty')

but this only checks one file at a time, and I would rather check the whole directory, since the file names (the date and time in particular) change.

I'm relatively new at this; the task was just handed down to me at work, and I'd appreciate any guidance on how to make it work. Thank you so much.

  • Does this answer your question? [How can I iterate over files in a given directory?](https://stackoverflow.com/questions/10377998/how-can-i-iterate-over-files-in-a-given-directory) – mkrieger1 Sep 29 '21 at 09:19

2 Answers


Your approach is already good; you just have to combine it with a way to check all the files in a directory, such as glob.glob.

from glob import glob
import os

path_of_directory = "..."
list_of_files = glob(path_of_directory+"/postgresql.log.*")
for f in list_of_files:
    if os.stat(f).st_size != 0:
        # do something with file f
        pass
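
For the original use case, here is a minimal sketch of how that check could wrap the upload step. Note that upload_to_slack is a hypothetical stand-in for whatever call your script already makes, and the directory path is an assumption:

from glob import glob
import os

def upload_to_slack(path):
    # hypothetical placeholder for the script's existing Slack upload call
    print(f"would upload {path}")

log_dir = "/var/log/postgresql"  # assumed location of the downloaded logs
for f in sorted(glob(log_dir + "/postgresql.log.*")):
    if os.stat(f).st_size == 0:
        continue  # skip empty logs entirely
    upload_to_slack(f)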
Marius Wallraff

You can use glob to find files matching a particular pattern, then use os.stat to find their sizes. See the following code for example:

import glob
import os

# TODO: Modify this accordingly
file_pattern = 'postgresql.log.2021*'

# Get a list of files (file paths) 
files = glob.glob(file_pattern)

# Find out their sizes 
files_and_size = [ (file_path, os.stat(file_path).st_size) 
                    for file_path in files ]

# Printing the file paths and corresponding sizes
for file_path, file_size in files_and_size:
    print(file_path, '==', file_size)
    if file_size > 0:
        # process the file with path "file_path"
        pass
    else:
        # delete the file
        os.remove(file_path)
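
With the directory listing from the question, the loop would print something like this (sizes taken from that listing), and the 0-byte files would then be deleted:

postgresql.log.2021-09-28-0000 == 348
postgresql.log.2021-09-28-0100 == 679
postgresql.log.2021-09-28-0200 == 0
...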
Eshwar S R
  • Thanks! This gave an output; however, I'm trying to keep only the files with more than 0 bytes in them, so I used this instead: ```files_and_size = [ (file_path, os.stat(file_path).st_size != 0) ```. Now I'm getting a boolean output like so: ```('postgresql.log.2021-09-28-1300', '==', False) ('postgresql.log.2021-09-28-0000', '==', True)``` (I deleted some extra outputs to make the comment fit). How do I make it so that I only work on the files that are "True", and the ones that are "False" are deleted? – Gifter Villanueva Sep 29 '21 at 13:56
  • @GifterVillanueva Now that you have file_paths and their corresponding sizes, add an `if else` condition in the last for loop. Editing the answer with the same. – Eshwar S R Sep 29 '21 at 20:13
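
A compact way to express that split, assuming the files_and_size list of (path, size) tuples built in the answer above:

# keep only the non-empty files for further processing
non_empty = [path for path, size in files_and_size if size > 0]

# delete the empty ones
for path, size in files_and_size:
    if size == 0:
        os.remove(path)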