0

I am writing a python script to transfer large files via sftp with the pysftp module. I have a massive amount of data to transfer, a total of around 36Tb, divided in 54 runs, or batches.

I want only to carry out these transfers between certain hours of the day, for this example, between 6pm and 7am. So my idea is to use a for loop to iterate over all the runs/ batches. Upon each iteration, I would check what hour it is. If it is between 6pm and 7am I would transfer. Else the script would sleep until it is 6pm minimum. The code that I wrote looks like so:

runsList = 'runA runB runC'.split() # these are directories

# time constraints
bottomLimit = 7
upperLimit = 18
doNotUploadRange = range(bottomLimit, upperLimit)

for run in runsList:
    hour = dt.datetime.now().hour

    while hour in doNotUploadRange:
        print('do not upload now')
        
        time.sleep(1800)
        hour = dt.datetime.now().hour

    # when I leave the while condition above
    # do the transfer via pysftp (large amount of data) per run

The question here does not concern the code itself not I want to check whether or not the script is running (which can be checked with htop), but I am concerned that my script will crash, for whatever reason, before it finishes (perhaps it would be running for a full week if nothing crashes).

I do sometimes call scripts that run for a very long time and they do crash sometimes, with no obvious reasons for crash.

So my question is whether it is, for whatever reason, obvious that the script will crash after running for 6-7 days of can I expect it to finish provided that there is no error in the code itself? My idea is to call this script on the background, inside tmux I would python script.py &

BCArg
  • 2,094
  • 2
  • 19
  • 37
  • Does this answer your question? [Check if a program in a specific path is running](https://stackoverflow.com/questions/29260576/check-if-a-program-in-a-specific-path-is-running) – crissal Jun 24 '21 at 13:16
  • As an alternative to `tmux`, you can have a look at the `screen` command. It's fairly simple and well documented. – Alexis Jun 24 '21 at 13:18
  • An issue with your code is that you will not upload from 7h to 9h, because the loop will wait for 3h at 18h, 21h, 24h, 3h, 6h, and finally 9h. – Alexis Jun 24 '21 at 13:21
  • Last but not least, I assume that you have exhausted all resources available to compress the file's content before transfer? 36 Tb is indeed massive – Alexis Jun 24 '21 at 13:26
  • @crissal checking whether it is running or not is fine, I kinda of want to have an idea whether a script can run for such a long time without crashing, provided that there will be no error in it – BCArg Jun 24 '21 at 13:26
  • 1
    @BCArg No, python will not crash just because execution time is long. It is very common to run programs for multiple days. – Alexis Jun 24 '21 at 13:30
  • @Alexis `time.sleep()` is in seconds, so 1800 seconds would be half an hour, which is fine, though I can decrease yes, to check after 10 minutes i.e. `time.sleep(600)`. I will compress the files yes and I am actually looking for a faster alternative than simply `zip` – BCArg Jun 24 '21 at 13:32
  • If you think there is potential for compression, I think you should definitely start there and ask that as a separate question where you tell us a bit more about what type of data is stored. – Alexis Jun 24 '21 at 13:35
  • Are you by chance transferring John McAfee's [file stash](https://twitter.com/officialmcafee/status/1137777348919681024)? – Chillie Jun 24 '21 at 13:41

0 Answers0