On a Linux server, data files are dumped continuously into a directory at intermittent intervals, say every 5, 10, or even 15 minutes. I want to preprocess/cleanse these files one by one and SCP them to another server.

How should I process all these files recursively?

Should I write a single bash script that runs continuously and processes files recursively in that directory, or should I schedule a script to run every 10 minutes?

For a single continuously running script, what should the loop condition be? An infinite while loop?


1 Answer


I'd go for a scheduled script with cron, as infinite loops are, in a sense, bugs.
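
A crontab entry like the one below would run the script every 10 minutes. The script path is a placeholder, and the flock wrapper (from util-linux) is an optional guard that skips a run if the previous one is still working, in case a pass ever takes longer than the interval:

# run every 10 minutes; skip if the previous run still holds the lock
*/10 * * * * flock -n /tmp/process_files.lock /path/to/process_files.sh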

For the processing part, I'm not sure this is exactly what you asked for, but you can do something like this:

#!/bin/bash
FILES=/your/dir/*
for file in $FILES
do
  echo "I'm doing something with $file"
done
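
The loop above only echoes each name. As a rough sketch of the full preprocess-and-ship flow, and only a sketch: the directory, the sed cleanup step, the remote destination, and the processed/ subdirectory below are all placeholder assumptions to adapt:

#!/bin/bash
# Sketch: cleanse each regular file, scp it, then move it aside so the
# next cron run does not pick it up again. All paths are placeholders.
src=/your/dir
done_dir=$src/processed
mkdir -p "$done_dir"

for file in "$src"/*; do
  [ -f "$file" ] || continue              # skip subdirs and an unmatched glob
  sed -i '/^$/d' "$file"                  # placeholder cleanup: drop blank lines
  if scp "$file" user@remotehost:/dest/; then
    mv "$file" "$done_dir"/               # ship succeeded; do not reprocess
  fi
done

Moving (or deleting) each file after a successful copy is what keeps repeated cron runs from shipping the same data twice.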
  • An infinite loop is not a bug. With a scheduled script, files that arrive just after a run sit unprocessed until the next one. – chepner Sep 10 '14 at 12:01
  • BTW, if you want to store a list of filenames in a variable, you need to use an array. As it is, `$FILES` is storing only the glob expression itself, not storing any actual filenames; `files=( /your/dir/* )` would be storing actual names, after which point one could iterate over those names with `for file in "${files[@]}"`. – Charles Duffy Dec 29 '15 at 23:01
  • 1
    Also, using all-caps names for your own variables is bad form. See fourth paragraph of http://pubs.opengroup.org/onlinepubs/009695399/basedefs/xbd_chap08.html for POSIX conventions on environment variable names, keeping in mind that environment variables and shell variables share a namespace (so a poorly-named shell variable can unintentionally override an environment variable -- not just for the current process, but all subprocesses as well). – Charles Duffy Dec 29 '15 at 23:02
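
Picking up on chepner's point about latency: if files must not wait for the next cron tick, an event-driven loop is the usual alternative. Here is a minimal sketch using inotifywait from the inotify-tools package (the watched directory and scp destination are placeholders); it blocks until a file is fully written, so it is not a busy infinite loop:

#!/bin/bash
# React to each file as soon as the writer closes it (close_write),
# rather than polling on a schedule. /your/dir is a placeholder.
inotifywait -m -e close_write --format '%w%f' /your/dir |
while read -r file; do
  echo "processing $file"
  scp "$file" user@remotehost:/dest/
done

This also sidesteps the glob-in-a-variable pitfall from the comments above, since each filename arrives on its own line; note it still assumes filenames without embedded newlines.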