
I've got a bash script that runs on OSX.
It needs to manipulate some files on a network share (an AFP share on a Synology NAS).

Unfortunately those files are sometimes still being written when the script runs.

How do I determine if the file is in use or not?

The normal method is by using "lsof", but that doesn't seem to work on network files if the other user is coming from another client on the LAN.

I could just attempt to rename the file. I suppose that will fail if the file is in use, but that is far from elegant.
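Just to illustrate what I mean, a minimal sketch (the path is only an example):

```
#!/bin/bash
# Example path on the mounted AFP share (illustrative only)
f="/Volumes/Share/incoming/example.dat"

# Try to claim the file by renaming it; if the rename fails,
# assume the writer is still busy and skip the file for now.
if mv "$f" "$f.processing" 2>/dev/null; then
    echo "claimed: $f.processing"
else
    echo "still in use (or gone), skipping: $f" >&2
fi
```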

Anybody have a better solution?

Tonny
  • Does anything in this [Apple Discussions](https://discussions.apple.com/thread/3252589?start=0&tstart=0) thread help? – summea May 11 '13 at 17:01
  • And this discussion about concurrent access to network-shared files in general: http://serverfault.com/questions/61086/concurrent-nfs-access – Dyno Fu May 11 '13 at 17:04
  • @summea That thread discusses things from the perspective of the server hosting the share. I'm on the client side. – Tonny May 11 '13 at 18:44
  • @DynoHongjunFu Not much help for me. I have no control over the creation of the files. I just need to read them but have to be certain that the creator is done with the file. – Tonny May 11 '13 at 18:47

2 Answers


This is not a generally solvable problem. The typical solution is to write the file to a temporary location and then move it to the final processing directory (since a move within a filesystem is generally atomic). If you cannot control how or where the file is written, then you are left with heuristics, such as watching the file and checking that it hasn't grown in a while, but none of these are particularly good compared to separating the writing from the enqueuing.
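If you are forced into the heuristic route, a rough sketch of the "hasn't grown in a while" check might look like this (the 30-second window is an arbitrary assumption; tune it to your uploads):

```
#!/bin/bash
# Heuristic: treat a file as "done" only if its size is unchanged over a
# waiting period. Not bulletproof, just a best effort.
f="$1"
wait_seconds=30                      # arbitrary; longer than a typical write pause

size1=$(stat -f %z "$f") || exit 1   # BSD stat as found on OS X
sleep "$wait_seconds"
size2=$(stat -f %z "$f") || exit 1

if [ "$size1" -eq "$size2" ]; then
    echo "size stable, assuming the writer is finished: $f"
else
    echo "still growing, try again later: $f" >&2
    exit 1
fi
```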

Rob Napier
  • I reached that conclusion myself already. Just posted here to see if I missed something, but apparently not. So I will attempt to rename the file first (atomic operation, as you say) and then work with the renamed file. An additional benefit is that on rename I can get rid of spaces in the filename, which makes the rest of the script significantly easier; it would be a quoting nightmare with spaces. – Tonny May 12 '13 at 09:27
  • Spaces are *far* from the only dangerous characters in file names. You have to do proper quoting in scripts. There's no getting around it. – Ken Thomases May 12 '13 at 19:20
  • @KenThomases I know. Believe me, I know... Fortunately for me the origins of all files are Windows or OSX, so most of the dangerous stuff is already disallowed by the OS. If something does come through, it will cause the rename to throw a fit, which the error handling will catch, and the file will be skipped. That is good enough for this. The logfile made by the script will be checked by an operator anyway, and he/she can deal with the problem cases manually. – Tonny May 13 '13 at 18:26
  • 1
    The renaming failing is not the worst failure mode. Evaluating improperly quoted file names in a shell script can result in execution of code! If somebody makes a file called "foo$(rm -rf ~)bar" and you fail to quote it, you'll wipe the users home folder. Also, the only characters prohibited in file names in Mac OS X are the slash ('/') and null (ASCII 0) characters. Some file systems impose additional restrictions. – Ken Thomases May 13 '13 at 22:44
  • @KenThomases Took me a while to revisit this question. 99% of users are Windows users that are barely computer-literate. I'll take my chances that they manage to generate such a filename. The other 1% knows better than to do that. – Tonny Jun 13 '13 at 10:20
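To make the quoting point from these comments concrete, here is a minimal sketch (the paths are only examples, and the space-stripping rename is optional):

```
#!/bin/bash
# Illustrates the quoting point above: expand file names inside double quotes
# everywhere so spaces and shell metacharacters stay inert.
dir="/Volumes/Share/incoming"        # example path

# -print0 / read -d '' survives any character a file name may legally contain
find "$dir" -type f ! -name '*.processing' -print0 |
while IFS= read -r -d '' f; do
    base=$(basename "$f")
    safe="${base// /_}"              # drop spaces from the name only
    # Quoted throughout: a name like 'foo$(rm -rf ~)bar' is never evaluated
    mv "$f" "$dir/$safe.processing" 2>/dev/null && echo "claimed: $safe.processing"
done
```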

Are the other potential accesses being done by arbitrary programs or can it be assumed that it's being done by other instances of your program running on other clients?

If the file is private to your program, then all instances of your program can participate in a cooperative locking scheme. You might use the `lockfile` command, for example. Be very sure to clean up your lock files even in the face of signals/exceptions. You can use the `trap` built-in command to help with that. See here for an explanation.
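A rough sketch of that scheme (assuming the lockfile utility that ships with procmail is installed and that the lock can live next to the data file; the path is only an example):

```
#!/bin/bash
# Cooperative locking sketch: every instance of the script, on any client,
# agrees to take the same lock file before touching the shared data file.
f="/Volumes/Share/data/example.dat"   # example path, adjust as needed
lock="$f.lock"

# lockfile comes with procmail; -r 10 retries ten times before giving up
lockfile -r 10 "$lock" || { echo "could not get lock" >&2; exit 1; }

# Remove the lock on normal exit; signals are routed through exit so the
# EXIT trap still runs and cleans up.
trap 'rm -f "$lock"' EXIT
trap 'exit 1' INT TERM

# ... read/modify "$f" safely here ...
```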

Ken Thomases
  • I wish I could. Files are created by various means of upload (ftp, scp, rsync). My script needs to read them once to convert them to another format and then (if conversion is successful) move the original to an archival location. Conversion will succeed (without giving an error) if the file has a read-lock only, but the output will be broken (and therefore useless) if the file wasn't fully uploaded yet, and I have no way to detect that condition. – Tonny May 12 '13 at 09:24