
I'm watching a folder on my Synology where all scanned documents are dropped, using a scheduled bash script that runs every minute. My idea is to copy each document to two places: the paperless-ng consume folder and an unsorted folder, so I can later move it to the correct folder by hand.

#!/bin/bash
dpath=/volume1/scanned/*
for FILE in $dpath
do
if [[ -f $FILE ]]
then
    cp $FILE /volume1/unsorted_documents/
    mv $FILE /volume1/docker/paperless/consume/
else
    echo "There are no files in the given path."
fi
done

This script ends up producing corrupt documents most of the time. My thought is that the copy isn't finished before the move command is executed.

Is there a way to make sure that the copy is done before the move is executed? Or another, better solution?

Erik van de Ven
    Corrupted how? Do your filenames contain blanks? – Benjamin W. Jan 26 '22 at 17:42
  • @BenjaminW. Corrupted as in the copied PDF file cannot be opened anymore. The moved file seems to be fine – Erik van de Ven Jan 26 '22 at 17:53
    yeah, sounds like you're processing the files before they've completed downloading; if the source is able to generate a secondary file (eg, `downloaded.file.DONE` - you only process files where an associated `.DONE` exists) then that works; another option is to check the size (`wc -c`), wait n seconds and see if the size has changed; a variation would be to note each file's current size, write the details to a file, and on the next script run, if a file's size has not changed since the last run, process it and remove it from the file (list of filenames and sizes) ... – markp-fuso Jan 26 '22 at 18:01
  • since your script is running every minute I'm assuming there's no 'hurry' to process a file and that you can wait an extra minute to verify its size is not changing – markp-fuso Jan 26 '22 at 18:02
  • `cp` completes before `mv` is run, but is it possible your script doesn't complete in a minute, meaning a second instance of your script starts before the previous one is done? – chepner Jan 26 '22 at 19:15
  • It would probably be better to have a single long-running script with a loop that sleeps for 60 seconds between iterations. – chepner Jan 26 '22 at 19:15
  • Is the script running directly on the NAS? – Fravadona Jan 26 '22 at 19:18
  • Make sure that you copy `$FILE` only when it is completely written and not before. – Cyrus Jan 26 '22 at 19:30
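The size-stability check suggested in the comments can be sketched like this. This is a hedged sketch, not a tested Synology script: the function name, the `STABLE_WAIT` variable, and the wait duration are my own choices; the paths in the usage line are the ones from the question.

```shell
#!/bin/bash
# Sketch: only process a file once its size has stopped changing,
# i.e. the scanner has (presumably) finished writing it.
# process_stable_files SRC UNSORTED CONSUME
process_stable_files() {
  local src=$1 unsorted=$2 consume=$3 f size1 size2
  for f in "$src"/*; do
    [ -f "$f" ] || continue          # skip non-files (and an unmatched glob)
    size1=$(wc -c < "$f")
    sleep "${STABLE_WAIT:-5}"        # give the scanner time to keep writing
    size2=$(wc -c < "$f")
    if [ "$size1" -eq "$size2" ]; then
      cp "$f" "$unsorted"/ && mv "$f" "$consume"/
    fi                               # still growing: leave it for the next run
  done
}

# Usage with the paths from the question:
# process_stable_files /volume1/scanned /volume1/unsorted_documents /volume1/docker/paperless/consume
```

Note that `mv` only runs if `cp` succeeded, and a still-growing file is simply left in place for the next scheduled run.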

2 Answers


I think your problem is related to the `dpath` variable assignment. Try using `find` to get a list of files:

#!/bin/bash
dpath=$(find /volume1/scanned -maxdepth 1 -type f)
# dpath=/volume1/scanned/*
for FILE in $dpath; do
  cp "$FILE" /volume1/unsorted_documents/
  mv "$FILE" /volume1/docker/paperless/consume/
done
nntrn
    That's still a string and relying on word splitting. A more robust approach would be using an array, see [this Q&A](https://stackoverflow.com/q/23356779/32668470), combined with quoting *all* expansions. – Benjamin W. Jan 26 '22 at 18:58
    Oh I see, do you think spaces in `$FILE` are causing the corruption? – nntrn Jan 26 '22 at 19:08
  • the filenames are generated by my Kyocera printer/scanner, so they do not contain any blanks. It is just odd that the cp command turns out not to be any problem, but the mv results in a corrupted file. Not every time, but sometimes – Erik van de Ven Jan 26 '22 at 19:27
  • Blanks would result in trying to access non-existing files, I'd say. – Benjamin W. Jan 26 '22 at 20:05
  • what kind of files are they, by the way? I'm curious what `file $FILE` outputs – nntrn Jan 27 '22 at 02:25
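The array-based approach Benjamin W.'s comment links to can be sketched as follows. It reads `find`'s NUL-delimited output into a bash array, so filenames containing blanks (or even newlines) survive intact; the function name is my own invention.

```shell
#!/bin/bash
# Sketch: collect files into an array via NUL-delimited find output,
# then copy and move each one with all expansions quoted.
# copy_then_move SRC UNSORTED CONSUME
copy_then_move() {
  local src=$1 unsorted=$2 consume=$3
  local files=() f
  while IFS= read -r -d '' f; do
    files+=("$f")
  done < <(find "$src" -maxdepth 1 -type f -print0)

  for f in "${files[@]}"; do
    cp "$f" "$unsorted"/
    mv "$f" "$consume"/
  done
}

# Usage with the question's paths:
# copy_then_move /volume1/scanned /volume1/unsorted_documents /volume1/docker/paperless/consume
```

This avoids the word-splitting that the string-based `dpath=$(find ...)` version relies on, though it does not by itself address the partially-written-file issue.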

Thank you for all your answers! I took every answer into consideration and wrote another script, which should work much better. If there are any comments or recommendations, please let me know: https://gist.github.com/ErikvdVen/95009cc0fd9267deaeae6ddbeae31e54
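For reference, the overlapping-run safeguard chepner mentioned in the comments (a second cron instance starting before the previous one finishes) can be sketched with `flock(1)`. The lock file path and file descriptor number here are arbitrary choices of mine:

```shell
#!/bin/bash
# Sketch: take an exclusive, non-blocking lock so that a second
# scheduled run exits immediately instead of racing the first one.
exec 9> /tmp/scan-mover.lock
if ! flock -n 9; then
  echo "previous run still active, skipping" >&2
  exit 0
fi
# ... copy/move logic goes here; the lock is released when the script exits ...
```

Since the script runs every minute anyway, simply skipping a run while the previous one is still busy loses nothing.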

Erik van de Ven