I have two directories - one with older data and one with newer data. As files are moved over between directories, the filenames are updated from 'old-filename.ext' to 'old-filename-updated.ext'. The file extensions aren't always the same length. I am trying to figure out how to check the new directory to see if an updated file with the proper filename exists, and if it doesn't, then create a new file.

    import os

    lst_updated_files = []
    new_files = set(os.listdir(new_dir))
    for file in os.listdir(old_dir):
        # splitext copes with extensions of any length
        stem, ext = os.path.splitext(file)
        if stem + "-updated" + ext in new_files:
            lst_updated_files.append(file)

This produces a list with just the filenames that matched in the new directory. But this is where I get stuck. How would I take the remaining files and create new versions of them in the updated directory?

Ryan

2 Answers

You can try using os.listdir(): https://www.geeksforgeeks.org/python-os-listdir-method/

The os.listdir() method in Python returns a list of the names of all files and directories in the specified directory. If no directory is specified, it lists the contents of the current working directory.

    import os
    old_directory = os.listdir("path/to/old/directory")
    new_directory = os.listdir("path/to/new/directory")

Next we need to compare the directories to see which files have not been moved over to the new directory. If we know that all files in the "new" directory will have the suffix -updated such as old-filename-updated.ext, then we can remove -updated from each file name string in new_directory so that the names match what is in old_directory, allowing an easier comparison to find what is missing.

    import os
    old_directory = os.listdir("path/to/old/directory")
    new_directory = os.listdir("path/to/new/directory")

    # Use list comprehension to modify each element in new directory and remove "-updated"
    new_directory_modified = [element.replace("-updated", "") for element in new_directory]
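One caveat with str.replace: it removes every occurrence of "-updated", including one that happens to be part of the original file name. Here is a minimal sketch that strips the suffix only where it sits directly before the extension. The helper name strip_updated_suffix and the SUFFIX constant are made up for illustration, and it assumes single-part extensions as split by os.path.splitext (so something like .tar.gz would not be handled).

```python
import os

SUFFIX = "-updated"  # assumed naming convention, taken from the question


def strip_updated_suffix(filename):
    """Map an updated name back to the original, e.g.
    'old-filename-updated.ext' -> 'old-filename.ext'.
    Names that do not carry the suffix are returned unchanged."""
    stem, ext = os.path.splitext(filename)
    if stem.endswith(SUFFIX):
        stem = stem[: -len(SUFFIX)]
    return stem + ext
```

It could be used in the list comprehension in place of element.replace("-updated", "").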

Now we can find the missing files using set arithmetic:

    import os
    old_directory = os.listdir("path/to/old/directory")
    new_directory = os.listdir("path/to/new/directory")

    # Use list comprehension to modify each element in new directory and remove "-updated"
    new_directory_modified = [element.replace("-updated", "") for element in new_directory]

    # A set difference avoids scanning the whole list with `in` for every file.
    # Sets are unordered, so sort the result to get a stable order.
    missing_elements = sorted(set(old_directory) - set(new_directory_modified))

From here, all you need to do is iterate over this list, add -updated to each element, and save into the new directory.
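That last step might look roughly like the sketch below. It assumes that "creating a new version" means copying the old file under its updated name; the function name copy_missing_files and its parameters are hypothetical. It builds the expected new name from each old name rather than stripping the suffix, which comes to the same comparison.

```python
import os
import shutil


def copy_missing_files(old_dir, new_dir, suffix="-updated"):
    """Copy every file in old_dir that has no updated counterpart in new_dir.

    Returns the list of new file names that were created."""
    existing = set(os.listdir(new_dir))
    created = []
    for name in sorted(os.listdir(old_dir)):
        src = os.path.join(old_dir, name)
        if not os.path.isfile(src):
            continue  # skip subdirectories
        stem, ext = os.path.splitext(name)
        new_name = stem + suffix + ext
        if new_name not in existing:
            shutil.copy2(src, os.path.join(new_dir, new_name))
            created.append(new_name)
    return created
```

Calling copy_missing_files("path/to/old/directory", "path/to/new/directory") a second time should create nothing, since every old file then has a counterpart.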

Tom Hood

You could do something like the following to compare the file names in both directories. Note that this does not compare the file contents nor the modification dates. So, this snippet does not check if the files in the new directory are really an updated version of the files in the old directory. You could check if the file in the new directory is actually newer than the corresponding file in the old directory using e.g. os.stat (see python: which file is newer & by how much time).

    import os
    import shutil

    dir_old = "path/to/old/directory"
    dir_new = "path/to/new/directory"

    # Get all files in the old and new directory.
    # os.listdir returns the bare names of everything inside the given directory
    # (including other directories), so each name is joined with its directory
    # before os.path.isfile checks whether it is a file.
    files_old = [f for f in os.listdir(dir_old) if os.path.isfile(os.path.join(dir_old, f))]
    files_new = [f for f in os.listdir(dir_new) if os.path.isfile(os.path.join(dir_new, f))]

    # Artificial lists for testing purposes. Uncomment to try the matching logic
    # without real directories (and disable the file operations below).
    # files_old = ["file1.txt", "file2.jpeg", "file3.so", "file4.a"]
    # files_new = ["file1-updated.txt", "file3-updated.so"]

    files_updated = []
    files_not_updated = []

    for file_old in files_old:
        # Split file_old into its base name and its extension.
        fname, extension = os.path.splitext(file_old)
        # Assemble the corresponding new file name ("-updated" as in the question).
        file_new = fname + "-updated" + extension
        if file_new in files_new:
            # The file has already been updated.
            files_updated.append(file_new)
        else:
            # The file has not been updated yet.
            files_not_updated.append(file_new)
            # Alternative 1: Create an empty file with the updated file name.
            # open(os.path.join(dir_new, file_new), "x").close()
            # Alternative 2: Copy the old file to the new directory.
            shutil.copy(os.path.join(dir_old, file_old), os.path.join(dir_new, file_new))
            # Or do whatever you do to actually update the file.
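
The modification-time check mentioned at the top could be sketched like this, using os.stat as suggested. The function name is_newer is made up for illustration, and it only compares timestamps, so it says nothing about whether the contents actually differ.

```python
import os


def is_newer(path_new, path_old):
    """Return True if path_new was modified more recently than path_old."""
    return os.stat(path_new).st_mtime > os.stat(path_old).st_mtime
```

Inside the loop this would be called as is_newer(os.path.join(dir_new, file_new), os.path.join(dir_old, file_old)) before adding the file to files_updated.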
andthum