We have a main Linux server, M, that holds files like the ones below (two months' worth, with new files arriving daily):

Folder1

PROCESS1_20211117.txt.gz
PROCESS1_20211118.txt.gz
..
..
PROCESS1_20220114.txt.gz
PROCESS1_20220115.txt.gz

We want to copy only the latest file to our processing server, P. Until now, we have been running the command below on the processing server:

rsync --ignore-existing -azvh -rpgoDe ssh user@M:${TargetServerPath}/${PROCSS_NAME}_*txt.gz ${SourceServerPath}

This worked fine until now. Going forward, however, the processing server can keep files for only 3 days, while the main server keeps them for 2 months.

So once we remove older files from the processing server, the rsync command copies all of them from the main server again, because --ignore-existing only skips files that are still present on the destination.

How can I change the rsync command to copy only the latest file from the main server?

Note: the example above covers only one file. We have to run the same command for multiple files, so we cannot hardcode any filename.

What I tried: there are multiple solutions out there, but they all seem to cover copying the latest file from the server rsync runs on, not from the remote server. I also tried running the command below to get the latest file from the main server, but I cannot pass variables to SSH at my company, as that is not allowed. So the command works if I pass a literal path and filename, but not with variables.

 ssh M 'ls -1 ${TargetServerPath}/${PROCSS_NAME}_*txt.gz|tail -1'

Would really appreciate any suggestions on how to implement this solution.

OS: Linux 3.10.0-1160.31.1.el7.x86_64

Cyrus

1 Answer

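Quoting for ssh is confusing: the command is parsed once by the local shell and again by the remote shell. For your local variables to expand, the command has to be in double quotes locally.

The printf %q trick is handy here: use it to quote the parts that come from variables.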

file=$(
   ssh M "ls -1 $(printf "%q" "${TargetServerPath}/${PROCSS_NAME}")_*.txt.gz" |
   tail -1
)
rsync --ignore-existing -azvh -rpgoDe ssh user@M:"$file" "${SourceServerPath}"
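
A quick way to convince yourself that printf %q does the job is to replay the double parsing locally. In the sketch below, a nested bash -c stands in for the remote shell (an assumption for illustration; no ssh is involved), and the path with a space in it is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical path containing a space, to make the quoting problem visible.
path='/data/in coming/PROCESS1'

# Quote the value so that it survives one extra round of shell parsing.
quoted=$(printf '%q' "$path")

# bash -c re-parses its argument, much like the remote shell re-parses
# the command line that ssh sends over.
roundtrip=$(bash -c "printf '%s' $quoted")
unquoted=$(bash -c "printf '%s' $path")

printf 'quoted:   %s\n' "$roundtrip"
printf 'unquoted: %s\n' "$unquoted"
```

The quoted value comes back unchanged, while the unquoted one is split into two words at the space before printf ever sees it.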

Or, perhaps nicer, run tail -n1 on the remote side so that a minimal amount of data is transferred (we need only one filename, not all of them). Invoke an explicit shell and pass the variables as shell arguments:

file=$(ssh M "$(printf "%q " bash -c \
   'ls -1 "$1"_*.txt.gz | tail -n1' \
   '_' "${TargetServerPath}/${PROCSS_NAME}"
)")

Overall, I recommend writing a function and shipping it to the remote side with declare -f:

sshqfunc() { echo "bash -c $(printf "%q" "$(declare -f "$1"); $1 \"\$@\"")"; };
work() {
   ls -1 "$1"_*txt.gz | tail -1
}
file=$(ssh M "$(sshqfunc work)" _ "${TargetServerPath}/${PROCSS_NAME}")
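
To see what sshqfunc actually sends, it can be exercised without a server. This is only a sketch: a local bash -c plays the part of the remote shell, and the body of work is a hypothetical placeholder that reports its argument instead of listing files.

```bash
#!/usr/bin/env bash
# Same helper as above: emit a bash -c command line that carries the
# function definition and then calls it with the remaining arguments.
sshqfunc() { echo "bash -c $(printf "%q" "$(declare -f "$1"); $1 \"\$@\"")"; }

# Hypothetical stand-in for the real work: report the argument it received.
work() {
   printf 'remote got: %s\n' "$1"
}

# ssh joins its command arguments with spaces and the remote shell parses
# the result; here a local bash -c plays that role.
cmdline="$(sshqfunc work) _ PROCESS1"
out=$(bash -c "$cmdline")
printf '%s\n' "$out"
```

Because the function body spans several lines, printf %q emits it in $'...' form, so this assumes the shell that parses the command line on the other side understands that syntax (bash does).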

Or you can use the mighty declare -p to transfer the variables to the remote side, then run your command inside single quotes:

ssh M "
   $(declare -p TargetServerPath PROCSS_NAME);
   "'
   ls -1 "${TargetServerPath}/${PROCSS_NAME}"_*txt.gz | tail -1
'
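
The declare -p variant can be checked locally in the same spirit. In this sketch the values are hypothetical, and a child bash stands in for the remote shell; it knows the variables only from the declare -p output, since they are not exported.

```bash
#!/usr/bin/env bash
# Hypothetical values; the space shows that declare -p quotes them safely.
TargetServerPath='/data/in coming'
PROCSS_NAME='PROCESS1'

# First half is expanded locally (double quotes), second half is literal
# (single quotes) and is only expanded by the shell that runs the script.
script="
   $(declare -p TargetServerPath PROCSS_NAME);
   "'
   printf "%s\n" "${TargetServerPath}/${PROCSS_NAME}"
'

# A child bash does not inherit these unexported variables, so it sees
# them only through the declare -p output, like the remote shell would.
out=$(bash -c "$script")
printf '%s\n' "$out"
```

Note that declare is a bash builtin, so this approach assumes the remote command is run by bash.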
KamilCuk
  • That is the problem. If we are using variable inside `ssh` command, it is being evaluated on ssh host, where it is not defined. Hence it return some filename from the home directory on ssh shell. – user15998661 Jan 17 '22 at 10:06
  • Fixed. I just read `So below command works` and assumed it works, but there was an `if` following. – KamilCuk Jan 17 '22 at 10:13
  • Perfect. As soon as you mentioned double quotes, I realized what I was doing wrong. Your initial answer works perfectly if I just replace single quote with double. One question. Among first 2 commands, why do you say it would be nicer to `invoke explicit shell and pass the variables as shell arguments`? What would you choose? – user15998661 Jan 17 '22 at 10:27
  • `What would you choose?` uuhm, I would use `printf "%s\n"` instead of `ls`, there is no value in `ls` and it should not be parsed. And I would go with the `sshqfunc` one or the last one, but it really depends how "fast" I am writing the script, as in, how much time I have to write it. – KamilCuk Jan 17 '22 at 10:38
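
The suggestion in the last comment, letting the glob do the listing instead of parsing ls, could look roughly like the sketch below. The function name latest_match and the demo files are made up for illustration; the function relies on the YYYYMMDD stamp in the filename, so that glob order is also date order.

```bash
#!/usr/bin/env bash
# Print the last match of PREFIX_*txt.gz in glob (lexicographic) order.
# With YYYYMMDD in the name, that is also the newest file.
latest_match() {
   local prefix=$1
   local files=( "$prefix"_*txt.gz )
   # Without a match the pattern stays literal, so check that it exists.
   [ -e "${files[0]}" ] || return 1
   printf '%s\n' "${files[${#files[@]}-1]}"
}

# Demo on hypothetical files in a scratch directory.
demo_dir=$(mktemp -d)
touch "$demo_dir"/PROCESS1_20220113.txt.gz \
      "$demo_dir"/PROCESS1_20220114.txt.gz \
      "$demo_dir"/PROCESS1_20220115.txt.gz
latest=$(latest_match "$demo_dir/PROCESS1")
printf '%s\n' "${latest##*/}"
rm -rf "$demo_dir"
```

A function like this could take the place of work in the sshqfunc approach, so that only the one filename travels back over ssh.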