
Syntax question: if I have a number of subdirectories within a target directory and I want to output their names to a text file, I can easily run:

ls > filelist.txt

on the target. But say all of my subdirectories are named with a 7-character prefix, like:

JR-5426_mydir
JR-5487_mydir2
JR-5517_mydir3
...

and I just want the prefixes. Is there an option to "ls" that will only output n characters per line?

kjarsenal
  • Have you tried [`cut(1)`](http://linux.die.net/man/1/cut)? – Kevin Dec 08 '14 at 17:15
  • You shouldn't be using `ls` for programmatic use anyhow. See http://mywiki.wooledge.org/ParsingLs – Charles Duffy Dec 08 '14 at 17:16
  • If you want to be really safe from strange filename characters, check out http://stackoverflow.com/questions/8677546/bash-for-in-looping-on-null-delimited-string-variable – Zan Lynx Dec 08 '14 at 18:23

2 Answers


Don't use ls in any programmatic context; it should be used strictly for presentation to humans -- ParsingLs (http://mywiki.wooledge.org/ParsingLs) gives details on why.

On bash 4.0 or later, the below will provide a deduplicated list of filename prefixes:

declare -A prefixes_seen=( )     # create an associative array -- aka "hash" or "map"
for file in *; do                # iterate over all non-hidden directory entries
  prefixes_seen[${file:0:2}]=1   # add the first two chars of each as a key in the map
done
printf '%s\n' "${!prefixes_seen[@]}" # print all keys in the map separated by newlines

That said, if instead of wanting a 2-character prefix you want everything before the first -, you can write something cleaner:

declare -A prefixes_seen=( )
for file in *-*; do
  prefixes_seen[${file%%-*}]=1   # "${file%%-*}" cuts off "$file" at the first dash
done
printf '%s\n' "${!prefixes_seen[@]}"

...and if you don't care about deduplication:

for file in *-*; do
  printf '%s\n' "${file%%-*}"
done

...or, sticking with the two-character rule:

for file in *; do
  printf '%s\n' "${file:0:2}"
done
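
The question's example names carry a seven-character prefix and only subdirectories are of interest, so the two-character loop above can be adapted along these lines. This is a minimal sketch, not part of the original answer: the function name list_dir_prefixes and its two parameters are hypothetical, and it relies on a trailing slash in the glob to match directories only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first N characters (default 7) of the name
# of each non-hidden subdirectory of the given target directory.
list_dir_prefixes() {
  local target=$1 len=${2:-7} dir name
  for dir in "$target"/*/; do       # the trailing slash limits the glob to directories
    [[ -d $dir ]] || continue       # skip the literal pattern when nothing matched
    name=${dir%/}                   # drop the trailing slash
    name=${name##*/}                # drop the leading path, keeping only the name
    printf '%s\n' "${name:0:$len}"  # emit the first $len characters
  done
}
```

Calling it as list_dir_prefixes . 7 > filelist.txt would write one prefix per line, with the newline caveat discussed next still applying.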

That said -- if you're trying to Do It Right, you shouldn't be using newlines to separate lists of filenames (or filename prefixes) either, because newlines are valid inside filenames on POSIX filesystems. Think about a file named f$'\n'oobar -- that is, with a literal newline as its second character; carelessly written code would see f as one prefix and oo as a second one, both from this single name. Iterating over the associative array's keys, as the deduplicating answers do, is safer in this case, because it doesn't rely on any delimiter character.

To demonstrate the difference -- if instead of writing

printf '%s\n' "${!prefixes_seen[@]}"

you wrote

printf '%q\n' "${!prefixes_seen[@]}"

it would emit the prefix of the hypothetical file f$'\n'oobar as

$'f\n'

instead of

f

...with an extra newline below it.
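
The same mismatch can be reproduced in a scratch directory. The sketch below is an illustration under assumptions: the variable names are made up, and it presumes mktemp -d is available and that the filesystem accepts newlines in names. It creates one file and compares the number of array elements with the number of lines a newline-delimited consumer would see.

```shell
#!/usr/bin/env bash
# Create a throwaway directory holding ONE file whose name contains a newline.
demo_dir=$(mktemp -d)
touch "$demo_dir"/$'f\noobar'

names=( "$demo_dir"/* )        # each glob result becomes one array element
names=( "${names[@]##*/}" )    # strip the directory part from every element

# Count the lines that newline-delimited output would present to a reader.
line_count=$(printf '%s\n' "${names[@]}" | wc -l | tr -d ' ')

echo "array elements: ${#names[@]}"
echo "lines printed: $line_count"

rm -rf "$demo_dir"
```

The array holds a single element while the printed output spans two lines, which is exactly the ambiguity a line-oriented reader cannot resolve.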


If you want to pass lists of filenames (or, as here, filename prefixes) between programs, the safe way to do it is to NUL-delimit the elements -- as NUL is the only character that can't possibly exist in a valid UNIX path. (A filename also can't contain /, but a path obviously can.)

A NUL-delimited list can be written like so:

printf '%s\0' "${!prefixes_seen[@]}"

...and read back into an identical data structure on the receiving end (should the receiving code be written in bash) like so:

declare -A prefixes_seen=( )
while IFS= read -r -d '' prefix; do
  prefixes_seen[$prefix]=1
done
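
The reading loop above takes its input from stdin, so it needs to be connected to a writer. Putting the two halves together might look like the following sketch; the emit_prefixes helper, the sample names, and the seven-character cut are assumptions chosen for illustration, and process substitution stands in for whatever pipe or file would connect two real programs.

```shell
#!/usr/bin/env bash
# Sketch of a NUL-delimited round trip; needs bash 4.0+ for associative arrays.

# Writer (hypothetical helper): print the deduplicated 7-character prefixes
# of the names passed as arguments, each one terminated by a NUL byte.
emit_prefixes() {
  local -A seen=( )
  local name
  for name in "$@"; do
    [[ -n $name ]] || continue   # an empty string cannot be used as a key
    seen[${name:0:7}]=1
  done
  if (( ${#seen[@]} > 0 )); then
    printf '%s\0' "${!seen[@]}"
  fi
}

# Reader: rebuild the same set of keys from the NUL-delimited stream.
declare -A prefixes_seen=( )
while IFS= read -r -d '' prefix; do
  prefixes_seen[$prefix]=1
done < <(emit_prefixes 'JR-5426_mydir' 'JR-5426_other' 'JR-5487_mydir2' $'f\noobar')

echo "distinct prefixes: ${#prefixes_seen[@]}"
```

The name containing a newline arrives on the reading side as a single key, because only the NUL byte ends a record.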
Charles Duffy

No, you use the cut command:

ls | cut -c1-7
Andy Lester