
Can somebody tell me what I am doing wrong in the following shell function?

NOTES_PATH="/home/temp"

function open_note {

    local files=()                                                          
    while IFS=  read -r -d $'\0' filename; do                                        
        files+=("$filename")                                                   
    done < <(find "$NOTES_PATH" -maxdepth 1 -type f -print0 | xargs -0 ls -t)

    echo "There is/are ${#files[@]} file(s) available"
}

I get "There is/are 0 file(s) available" every time, even though I can see 2 text files in the /home/temp directory.

Please note that I want to use the while-loop approach to read filenames, not any other way, unless I cannot achieve what I want that way. [credits: John1024 answer]

P.S. If I run `find /home/temp -maxdepth 1 -type f -print0 | xargs -0 ls -t` from the command line, I get the expected result.

shaffooo
  • `-d $'\0'` is the same as `-d ''`. I don't know what specifying an empty read delimiter does. – melpomene Dec 05 '16 at 19:50
  • that input coming from `find` command will be null separated – shaffooo Dec 05 '16 at 19:52
  • That input comes from `ls`, not `find`. And no, it won't be. – melpomene Dec 05 '16 at 19:54
  • If `find` emits more filenames than will fit on one `ls -t` command, they won't in fact be correctly sorted. `ls` has no place whatsoever for automated use. – Charles Duffy Dec 05 '16 at 20:14
  • Aside: `NOTES_PATH` shouldn't be all-uppercase. POSIX conventions for variable names have all-uppercase names reserved for variables with meaning to POSIX-specified tools (thus, the shell or other operating system components). This is most explicitly specified for environment variables, but since setting a shell variable overwrites any like-named environment variable, the convention necessarily applies there as well. – Charles Duffy Dec 05 '16 at 20:19

1 Answer


The original code had several bugs:

  • The advantages of NUL-delimiting content prior to xargs -0 were lost, because the output of ls -t was not NUL-delimited.
  • Because xargs split results into multiple ls -t invocations, sorting took place only within these batches: Given enough filenames to require two or more invocations, those invocations would only be individually sorted, not globally sorted.
  • Parsing output of ls is generally error-prone.
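The first bullet is the root cause of the "0 file(s)" output. A minimal sketch (with made-up filenames) shows that `read -d ''` expects NUL-delimited records, so newline-delimited input — which is what `ls -t` produces — never triggers the loop body:

```shell
#!/usr/bin/env bash
# Hypothetical demo: 'read -d ""' looks for a NUL record delimiter,
# but the input below is newline-delimited, just like 'ls -t' output.
files=()
while IFS= read -r -d '' filename; do
    files+=("$filename")
done < <(printf '%s\n' 'a.txt' 'b.txt')   # newline-delimited, no NULs

# read hits EOF without ever seeing a NUL, returns nonzero,
# and the loop body never runs:
echo "${#files[@]}"   # prints 0
```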

A cleaner implementation (assuming GNU find and GNU sort) would look like the following:

open_note() {
  local filename mtime
  local -a files=( )

  while IFS= read -r -d ' ' mtime && IFS= read -r -d '' filename; do
    files+=( "$filename" )
  done < <(find "$NOTES_PATH" -maxdepth 1 -type f -printf '%T@ %P\0' | sort -z -n)

  echo "There is/are ${#files[@]} file(s) available"
}

The find action given emits a stream of the format:

<epoch time> <filename><NUL>

...which `sort` then orders numerically; the `while read` loop then discards the timestamp portion of each record, keeping only the filename.
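The parsing step can be sketched against a hand-built stream (the timestamps and filenames below are made up) instead of live `find` output:

```shell
#!/usr/bin/env bash
# Simulate two '<epoch time> <filename>\0' records, one of which
# contains a space in the filename, then parse them with paired reads.
files=()
while IFS= read -r -d ' ' mtime && IFS= read -r -d '' filename; do
    files+=("$filename")
done < <(printf '%s %s\0' 1480000000.0 'old note.txt' 1480100000.0 'new.txt')

# The first read consumes the timestamp up to the space; the second
# consumes everything up to the NUL, so embedded spaces survive.
echo "${#files[@]}"           # prints 2
printf '%s\n' "${files[@]}"
```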


A few other notes:

  • Don't use the function keyword; it makes your code gratuitously incompatible with POSIX sh, but (unlike other, more useful bashisms) adds no advantage whatsoever over the compliant syntax.
  • If you're trying to keep your locals scoped, be sure that you get all of them -- that includes names being looped over.
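Both notes can be illustrated in one small sketch (the `count_words` helper is hypothetical, and `local` itself is a common extension rather than strict POSIX):

```shell
#!/usr/bin/env bash
# POSIX-style definition: 'name() { ... }', no 'function' keyword.
count_words() {
  # Scope *every* name the function touches, including the loop
  # variable 'word', so callers' variables are never clobbered.
  local word count=0
  for word in "$@"; do
    count=$((count + 1))
  done
  echo "$count"
}

count_words alpha beta gamma   # prints 3
```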
Charles Duffy
  • Assume that null-delimiting was originally applied to handle filenames containing whitespace. – vp_arth Dec 05 '16 at 20:24
  • @vp_arth, ...and that contradicts anything I said how? Using `ls` (without an implementation implementing the currently-proposal-stage `-0` flag) is still throwing away that advantage: A filename with a literal newline will be indistinguishable from two separate names (or will have that newline replaced by a placeholder character, meaning that the output doesn't correctly represent the file's name) in the output from `ls -t`. – Charles Duffy Dec 05 '16 at 20:26
  • @vp_arth, ...if you're referring only to those advantages that could be also gained (on GNU platforms) with `xargs -d $'\n'` with conventional newline-delimited content, then I grant that you have a point, but it's silly to fix only a subset of corner cases when there's a readily-available way to fix all of them. – Charles Duffy Dec 05 '16 at 20:28
  • Your answer is awesome. I just can't understand why you still use `\0` as the delimiter (and `sort -z`), while `read` and `sort` both default to `\n`? – vp_arth Dec 05 '16 at 20:38
  • @vp_arth, a newline literal can be present in a filename. NUL is the only character that can't. – Charles Duffy Dec 05 '16 at 20:39
  • @vp_arth, ...moreover, if you have a directory that multiple users have access to, filenames can easily constitute a privilege boundary, and thus be a point of interest for security purposes. If someone runs `mkdir -p $'/tmp/\n/etc/passwd'`, you don't want a script cleaning out cruft from `/tmp` to get confused about what's a standalone filename and what isn't. (I've seen data loss events caused by buffer overflows throwing content into filenames that wasn't even *intentionally* malicious; it's very much worth cultivating defensive habits). – Charles Duffy Dec 05 '16 at 20:42
  • `touch $(echo -e '/tmp/test\0zero')` just skips a null... Good to know, thanks :) – vp_arth Dec 05 '16 at 20:47
  • @vp_arth, no POSIX system will let you create a file whose name contains a literal NUL (as in `/tmp/test\0zero`), so it's moot: the building blocks of UNIX are NUL-terminated strings, so you can't have a literal NUL within them. Whether the NUL gets skipped or truncates the output varies with circumstances (what you demonstrated above is specifically bash's command-substitution behavior), but it'll always be one of the two. – Charles Duffy Dec 05 '16 at 20:51
  • @vp_arth, ...on a different point, consider using `printf` instead of `echo -e` -- bash literally violates the POSIX standard by having it do anything other than print `-e` on output (most extensions are in undefined space, but this one is actually a violation, and moreover goes away when both posix and xpg_echo flags are enabled, making it unreliable even on bash). See http://pubs.opengroup.org/onlinepubs/009695399/utilities/echo.html, particularly the APPLICATION USAGE section. – Charles Duffy Dec 05 '16 at 20:51
  • @CharlesDuffy Thanks for the detailed answer. Wouldn't it be more appropriate to reduce the while-loop condition to `IFS=$' ' read -r -d'' mtime filename`? I executed the script with this condition and it did not change the global value of IFS. – shaffooo Dec 09 '16 at 15:48
  • Correct, it won't change the global value of IFS, but it does trim trailing whitespace from filenames. – Charles Duffy Dec 09 '16 at 16:10
  • @shaffooo, ...actually, there's a bigger bug in there too: `-d''` generates exactly the same argv as `-d`, so you're consuming the `mtime` argument and making `m` the record delimiter. The whitespace in `-d ''` is critical. – Charles Duffy Dec 09 '16 at 16:37