0

Right now I have a program which moves files from the sub-directories in the SOURCE folder to sub-directories in the DESTINATION folder. The files contain information like this: content of file before the move.

Now, during the move from SOURCE to DESTINATION, I want to modify the files being moved in two places.

  • I want to copy the Time line and insert it as TimeDif directly under Time. The type stays Y, and the value has to be the current time minus the value of Time.
  • I want to multiply the value of Power by 10. If the value of Power is <= 1000, the type stays N; otherwise the type of Power becomes Y.

So after the file has been moved from SOURCE to DESTINATION it has to look like this:

content of file after the move.

Here is the code I have right now for moving the files. The moving itself works smoothly; I just don't know where to start when it comes to modifying the content of the file:

import os
import time

# Make source, destination and archive paths.
source = r'c:\data\AS\Desktop\Source'
destination = r'c:\data\AS\Desktop\Destination'
archive = r'c:\data\AS\Desktop\Archive'

# Make directory paths and make sure to consider only directories under source.
for subdir in os.listdir(source):
    subdir_path = os.path.join(source, subdir)
    if not os.path.isdir(subdir_path):
        continue

    # Now we want to get the absolute paths of the files inside those directories
    # and store them in a list.
    all_file_paths = [os.path.join(subdir_path, file) for file in os.listdir(subdir_path)]
    all_file_paths = [p for p in all_file_paths if os.path.isfile(p)]

    # Exclude empty sub-directories.
    if len(all_file_paths) == 0:
        continue

    # Get only the newest file of each directory.
    newest_file_paths = max(all_file_paths, key=os.path.getctime)

    # Now we are selecting the files which will be moved
    # and make a destination path for them.
    for file_path in all_file_paths:
        if file_path == newest_file_paths and os.path.getctime(newest_file_paths) < time.time() - 120:
            dst_root = destination
        else:
            dst_root = archive

        # Now it's time to make the move.
        dst_path = os.path.join(dst_root, subdir, os.path.basename(file_path))
        os.rename(file_path, dst_path)
ManCity10

3 Answers

0

If the files are small then instead of moving the files you can simply:

  1. read the information from each file
  2. find the data you want to replace
  3. write the file with the new data in the destination directory
  4. delete the old file

Something like:

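import os

def move_file(file_path, dst_path, modify_line=None, additional_lines=()):
    # modify_line and additional_lines are placeholders for your own logic:
    # a function that returns the (possibly changed) line, and extra lines to append.
    with open(file_path, "r") as input_file, open(dst_path, "w") as output_file:
        for line in input_file:
            line = line.rstrip("\n")  # print() adds the newline back
            if modify_line is not None:
                line = modify_line(line)
            print(line, file=output_file)
        for extra_line in additional_lines:
            print(extra_line, file=output_file)

    # remove the old file once the new one has been written
    os.remove(file_path)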

Then, instead of os.rename in your original code, call the move_file function:

    # Now we are selecting the files which will be moved
    # and make a destination path for them.
    for file_path in all_file_paths:
        if file_path == newest_file_paths and os.path.getctime(newest_file_paths) < time.time() - 120:
            dst_root = destination
        else:
            dst_root = archive
        # Now it's time to make the move.
        dst_path = os.path.join(dst_root, subdir, os.path.basename(file_path))
        move_file(file_path, dst_path)

You might implement this like so:

import os
import time
from datetime import datetime

SOURCE = r'c:\data\AS\Desktop\Source'
DESTINATION = r'c:\data\AS\Desktop\Destination'
ARCHIVE = r'c:\data\AS\Desktop\Archive'

def get_time_difference(date, time_string):
    """
    You may want to modify this logic to change the way the time difference is calculated.
    """
    time_difference = datetime.now() - datetime.strptime(f"{date} {time_string}", "%d-%m-%Y %H:%M")
    hours = time_difference.total_seconds() // 3600
    minutes = (time_difference.total_seconds() % 3600) // 60
    # Pad the minutes to two digits so that e.g. 1 h 5 min reads "1:05", not "1:5".
    return f"{int(hours)}:{int(minutes):02d}"

def move_and_transform_file(file_path, dst_path, delimiter="\t"):
    """
    Reads the data from the old file, writes it into the new file and then 
    deletes the old file.
    """
    with open(file_path, "r") as input_file, open(dst_path, "w") as output_file:
        data = {
            "Date": None,
            "Time": None,
            "Power": None,
        }
        time_difference_seen = False
        for line in input_file:
            if not line.strip():
                continue  # skip blank lines, which cannot be unpacked into four fields
            (line_id, item, line_type, value) = line.strip().split()
            if item in data:
                data[item] = value
                if item == "Power":
                    value = str(int(value) * 10)
            # Write the (possibly modified) line first ...
            print(delimiter.join((line_id, item, line_type, value)), file=output_file)
            # ... then add TimeDif once, directly under the line that completed Date and Time.
            if not time_difference_seen and data["Date"] is not None and data["Time"] is not None:
                time_difference = get_time_difference(data["Date"], data["Time"])
                time_difference_seen = True
                print(delimiter.join([line_id, "TimeDif", line_type, time_difference]), file=output_file)

    os.remove(file_path)

def process_files(all_file_paths, newest_file_path, subdir):
    """
    For each file, decide where to send it, then perform the transformation.
    """
    for file_path in all_file_paths:
        if file_path == newest_file_path and os.path.getctime(newest_file_path) < time.time() - 120:
            dst_root = DESTINATION
        else:
            dst_root = ARCHIVE

        dst_path = os.path.join(dst_root, subdir, os.path.basename(file_path))
        move_and_transform_file(file_path, dst_path)

def main():
    """
    Gather the files from the directories and then process them.
    """
    for subdir in os.listdir(SOURCE):
        subdir_path = os.path.join(SOURCE, subdir)
        if not os.path.isdir(subdir_path):
            continue

        all_file_paths = [
            os.path.join(subdir_path, p) 
            for p in os.listdir(subdir_path) 
            if os.path.isfile(os.path.join(subdir_path, p))
        ]

        if all_file_paths:
            newest_path = max(all_file_paths, key=os.path.getctime)
            process_files(all_file_paths, newest_path, subdir)

if __name__ == "__main__":
    main()
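
The script above multiplies Power by 10 but leaves the type column of that line untouched. Below is a minimal sketch of the type rule from the question. It assumes the threshold is compared against the value after multiplication, which the question does not state either way. `transform_power` is a hypothetical helper, not part of the script above.

```python
def transform_power(line_type, value, factor=10, threshold=1000):
    """Return the new (type, value) pair for a Power line.

    Assumption: the threshold is compared against the multiplied value.
    """
    new_value = int(value) * factor
    # The type stays as it was (N) up to the threshold, otherwise it becomes Y.
    new_type = line_type if new_value <= threshold else "Y"
    return new_type, str(new_value)


print(transform_power("N", "50"))   # ('N', '500')
print(transform_power("N", "250"))  # ('Y', '2500')
```

If this matches what you need, it could replace the single `value = str(int(value) * 10)` line, assigning both `line_type` and `value` from the returned pair.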
  • Moving the files is a MUST; that's the core business of the program. But how does the change to the files take place? – ManCity10 May 25 '20 at 11:25
  • This approach raises a yellow flag for me, as it opens the `C` stream twice, keeps data in memory for quite a long time, and uses more disk (for a moment). I would prefer not to copy the files. – Yogesh Aggarwal May 25 '20 at 11:50
  • I disagree. The data isn't held in memory for very long; it's written into the output file just after it's read from the input file. If the input files were particularly large you might not want to take this approach, but given the examples they are not. – MindOfMetalAndWheels May 25 '20 at 12:34
0

@MindOfMetalAndWheels Surely your code is just modifying the files and not moving them? I want to move and modify them. By the way, if I try to insert your piece of code into mine I get an invalid syntax error.

ManCity10
  • I can see you're quite new to Python. You have put the function definition inside your script; let me refactor your script to help you out – MindOfMetalAndWheels May 25 '20 at 13:28
  • @MindOfMetalAndWheels Much appreciated, man. Hope you can help me out :) – ManCity10 May 25 '20 at 13:44
  • I've amended my answer. I don't know exactly how your text files are structured, or what all the possible line types might be, so this may not work without some tweaking. I also don't know exactly how you want to calculate the time difference. Be careful: this will delete your original files if you run it. You might want to remove the os.remove() line when testing. – MindOfMetalAndWheels May 25 '20 at 14:22
  • First of all, thank you for the time you've spent on it. I have run your code with some test files. The files get moved directly to the archive folder (instead of destination) and the content of the file stays the same. I consistently use https://i.stack.imgur.com/aFfFu.png as the test file. – ManCity10 May 25 '20 at 14:36
  • Spotted the mistake: I never called the new function `move_and_transform_file`! – MindOfMetalAndWheels May 25 '20 at 14:41
  • @MindOfMetalAndWheels It seems to do what it has to do. Now walk me through it, man; your code is way more complex than mine. Thank you indeed. – ManCity10 May 26 '20 at 06:28
  • Don't worry, it's not really much more complex. The script that you made has been split into a chain of functions. The main() and process_files() functions are essentially all written by you: they search for the files and find the file paths and the destination paths. Then process_files() calls move_and_transform_file() on each file path; this is the new bit, which reads the old files and writes new ones. Does that make sense? – MindOfMetalAndWheels May 26 '20 at 13:17
  • Is there no way to chat with each other? Your explanation makes sense, yes. I just thought the if __name__ == "__main__" statement could only be used between 2 Python files, not in 1 Python file. – ManCity10 May 26 '20 at 13:46
  • If you've got a specific question I will try to answer it. I'm sure that as you learn more about Python it will make more and more sense to you. It seems from your file paths that you are using Windows; can I recommend that you use VS Code as your IDE? This is an intro video to debugging which will help you to understand what the script is doing: https://www.youtube.com/watch?time_continue=38&v=w8QHoVam1-I&feature=emb_title – MindOfMetalAndWheels May 26 '20 at 14:03
  • Well, I'm using it on my work laptop, which has Spyder installed; we can't just make changes to that. When I have a specific question I will tell you. How can I contact you? – ManCity10 May 26 '20 at 14:14
  • What does the if __name__ == "__main__" statement do? I thought it could only be used if you use at least 2 Python files for your program. By the way, I need to accept your reaction as an answer, but that's not possible because it's my own post :) – ManCity10 May 26 '20 at 14:44
  • https://stackoverflow.com/questions/419163/what-does-if-name-main-do You could accept the answer I wrote - it has the same script in it. – MindOfMetalAndWheels May 26 '20 at 14:46
  • @MindOfMetalAndWheels If I ask too much, let me know. What is your get_time_difference doing? I've been looking at it for some time now, but it looks complex... – ManCity10 May 27 '20 at 07:39
  • This calculates the TimeDif from the time and date in the file. I wasn't sure exactly how you wanted this calculated. Currently it just returns the hours and minutes elapsed between the date and time in the file and the time of execution – MindOfMetalAndWheels May 27 '20 at 08:49
  • @MindOfMetalAndWheels I actually see you are not calling move_and_transform_file? How are you using it? Would like to hear from you... – ManCity10 Jun 02 '20 at 13:39
  • I call it at the bottom of `process_files` – MindOfMetalAndWheels Jun 04 '20 at 09:31
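
Two points from the comments above can be illustrated together in one small file. This is a sketch that assumes the same "%d-%m-%Y %H:%M" format as the script. The `now` parameter is an addition for illustration, so the result can be checked against a fixed moment, and the minutes are padded to two digits here. The `if __name__ == "__main__"` guard works in a single file too: its block runs when the file is executed directly and is skipped when the file is imported as a module.

```python
from datetime import datetime


def get_time_difference(date, time_string, now=None):
    """Return the time elapsed since the file's timestamp as 'H:MM'."""
    if now is None:
        now = datetime.now()
    stamp = datetime.strptime(f"{date} {time_string}", "%d-%m-%Y %H:%M")
    total_seconds = (now - stamp).total_seconds()
    hours = int(total_seconds // 3600)           # whole hours elapsed
    minutes = int((total_seconds % 3600) // 60)  # remaining whole minutes
    return f"{hours}:{minutes:02d}"


if __name__ == "__main__":
    # Runs only when this file is executed directly, not when it is imported.
    fixed_now = datetime(2020, 5, 25, 14, 5)
    print(get_time_difference("25-05-2020", "11:00", now=fixed_now))  # 3:05
```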
-1

You can't modify a file while it is being moved. You first have to move it; then you can do your work. To do that, store the final destinations of your files (including sub-directories in the name) in a list and iterate over it later to open the files and do your work.

Here's a minimal example:

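import os

def changeFile(fileName):
    # do your desired work here
    pass

# (source path, destination path) pairs; all of these paths are placeholders
files = [("dir/subdir1/file1", "newdir/subdir1/file1"), ("dir/file", "newdir/file")]

for file, newPath in files:
    os.rename(file, newPath)  # move the file first ...
    changeFile(newPath)       # ... then modify it at its new location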
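
As one possible way to fill in `changeFile`, here is a sketch that rewrites an already-moved file in place. It assumes each non-empty line has four whitespace-separated columns (id, item, type, value), which is how the script in the other answer unpacks them. The name `change_file`, the tab delimiter, and the sample rows are illustrative choices, not something the question confirms.

```python
import os
import tempfile


def change_file(file_name, delimiter="\t"):
    """Multiply the value of every Power line by 10, rewriting the file in place."""
    with open(file_name, "r") as handle:
        rows = [line.split() for line in handle if line.strip()]

    for row in rows:
        # Assumed layout: id, item, type, value
        if len(row) == 4 and row[1] == "Power":
            row[3] = str(int(row[3]) * 10)

    with open(file_name, "w") as handle:
        for row in rows:
            handle.write(delimiter.join(row) + "\n")


# Usage on a throwaway file with made-up rows
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    with open(path, "w") as handle:
        handle.write("1\tTime\tY\t11:00\n2\tPower\tN\t50\n")
    change_file(path)
    with open(path) as handle:
        print(handle.read())
```

Because the whole file is read into memory before it is written back, this suits small files like the ones described in the question.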
Yogesh Aggarwal