I am trying to create an array of file names so I can sort through a large file and take only the ones I want. My script looks like this:
set STATIONS = Kanto-station-names
where the folder "Kanto-station-names" has the names of the files I want. This is followed by
myarr=($(awk '{print $1}') $STATIONS)
but the terminal immediately goes into what seems to be a subshell that looks more like a work document, where commands do not work and which I cannot exit. Any suggestions?
Brock
- Please ensure that your question contains enough information to reproduce the problem. – Charles Duffy Jun 16 '16 at 17:31
- Please take a look: http://www.shellcheck.net/ – Cyrus Jun 16 '16 at 17:31
- ...also, in general, to read *anything* into a shell array, the appropriate tools are `readarray -t` / `mapfile -t`, or at least `read -a`, or a `while read` loop performing an append operation. `arr=( $(...) )` is very, *very* error-prone and best avoided. – Charles Duffy Jun 16 '16 at 17:32
- See also http://mywiki.wooledge.org/BashFAQ/001 – Charles Duffy Jun 16 '16 at 17:33
- ...also related: http://stackoverflow.com/questions/30988586/creating-an-array-from-a-text-file-in-bash -- noting you can use `<(awk ...)` in place of a filename to treat the output of `awk` as input when reading from said "file". – Charles Duffy Jun 16 '16 at 17:34
- You have two horrible mistakes here. First, the line `set STATIONS = Kanto-station-names` doesn't do what you think it does! It sets the positional parameters `1`, `2` and `3` to `STATIONS`, `=` and `Kanto-station-names` respectively. Second, your command `myarr=($(awk '{print $1}') $STATIONS)` will populate `myarr` with the (word-split and glob-expanded) output of the command `awk '{print $1}'`, followed by the (word-split and glob-expanded) expansion of `$STATIONS`. Now the command `awk '{print $1}'` reads from standard input… so you're in the state where `awk` is waiting for standard input… – gniourf_gniourf Jun 16 '16 at 18:52
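The two mistakes described in the last comment can be reproduced in a few lines. This is only a minimal sketch, assuming bash 4.0 or newer; the temporary sample file and its contents are invented for illustration and stand in for whatever the station list really contains.

```bash
#!/usr/bin/env bash
# Hypothetical demo: the sample file and its contents are made up.
unset STATIONS

# Mistake 1: `set` with arguments replaces the positional parameters;
# it does not create a variable called STATIONS.
set STATIONS = Kanto-station-names
positional_count=$#                    # number of positional parameters now set
first_positional=$1                    # the literal word STATIONS
stations_after_set=${STATIONS-unset}   # STATIONS itself is still unset

# A bash assignment uses no `set` and no spaces around `=`.
STATIONS=Kanto-station-names

# Mistake 2: awk with no file operand reads standard input, so it sits
# waiting on the terminal. Pass the file name to awk itself instead
# (here a temporary sample file).
sample=$(mktemp)
printf '%s\n' 'tokyo 35.68' 'yokohama 35.44' > "$sample"
readarray -t myarr < <(awk '{print $1}' "$sample")
rm -f "$sample"

echo "positional parameters: $positional_count, \$1=$first_positional, STATIONS was: $stations_after_set"
declare -p myarr
```

If `awk` is already waiting for input at the terminal, pressing Ctrl-D (end of input) or Ctrl-C returns you to the prompt.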
3 Answers
Why not just create an array using the find command to get your list of filenames?
Ex:
#!/bin/bash
IFS=$'\n'
MyArr=($(find /location/to/Kanto-station-names/ -type f -print0 | xargs -0 ls))
echo "${MyArr[@]}"
echo "${MyArr[1]}"
echo "${MyArr[2]}"
echo "${MyArr[3]}"

IT_User
- Why not? because it's completely broken! Broken in case filenames contain spaces, newlines or glob characters. – gniourf_gniourf Jun 16 '16 at 18:47
- Well… it's still broken! you're still subject to pathname expansion (this can be fixed with `set -f`), and it's still broken with filenames containing newlines (and this can't be fixed with this method); it's actually always a bad idea to parse such commands; however, if you _really_ want to use (GNU) `find` in a 100% safe way, do this: `MyArr=(); while IFS= read -r -d '' file; do MyArr+=( "$file" ); done < <(find /location/to/Kanto-station-names/ -type f -print0)`. – gniourf_gniourf Jun 16 '16 at 21:31
- Another method (which doesn't rely on GNU `find`) is: `shopt -s dotglob nullglob; MyArr=(); for file in /location/to/Kanto-station-names/*; do [[ ! -L $file && -f $file ]] && MyArr+=( "$file" ); done` – gniourf_gniourf Jun 16 '16 at 21:31
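For readability, the NUL-delimited `find` loop from the comments can be laid out as a script. The following is a sketch, not a drop-in solution: it assumes a `find` that supports `-print0` (GNU `find` does), and the temporary directory and deliberately awkward file names are hypothetical stand-ins for `/location/to/Kanto-station-names/`.

```bash
#!/usr/bin/env bash
# Hypothetical demo directory containing awkward file names.
dir=$(mktemp -d)
touch "$dir/plain" "$dir/with space" "$dir/star*name" "$dir/"$'new\nline'

MyArr=()
# find terminates each path with a NUL byte; `read -d ''` consumes one
# NUL-terminated record per iteration, so spaces, glob characters and
# newlines inside names are preserved as-is.
while IFS= read -r -d '' file; do
    MyArr+=( "$file" )
done < <(find "$dir" -type f -print0)

echo "found ${#MyArr[@]} files"
# %q prints each element in a shell-quoted form, one per line.
printf '  %q\n' "${MyArr[@]}"

rm -rf "$dir"
```

Each array element holds exactly one path, regardless of what characters the name contains; the order of the elements is whatever order `find` happens to produce.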
Your question is difficult to understand, but if you just want to make an array of all the filenames in Kanto-station-names directory, this should work (note that the leading '$' is a prompt, not something you should type):
$ myArray=()
$ for FILE in `ls Kanto-station-names` ; do myArray+=($FILE) ; done
After that, you can access array elements:
$ echo ${myArray[0]}
$ echo ${myArray[1]}
etc. To see all elements of the array:
$ echo ${myArray[@]}

Juan Tomas
- @gniourf It's true, if any of your filenames contains whitespace, the above won't work. That example you linked to even has filenames with newlines in them, which is horrible. That said, I parse `ls` all the time with no trouble. I simply have a policy of not naming files with whitespace. IMO, it's evil to have filenames that are more than one token (commonly seen in Win and Mac, not so much in *nix). – Juan Tomas Jun 17 '16 at 15:49
- What's evil is to assume anything about filenames. A filename is a C-string that doesn't contain `/`. Period. If you find yourself parsing `ls`, then you're obviously doing something wrong: you're not understanding the _word splitting and globbing_ part of how shells parse commands, and hence not using the proper semantic! Moreover, you're spawning a subshell and using an external command, and that's really not efficient. Use globs instead: it's 100% safe, shorter to type and more efficient (and it's semantically correct). – gniourf_gniourf Jun 17 '16 at 16:02
- And don't only focus on spaces: your command is _also_ broken for glob characters. – gniourf_gniourf Jun 17 '16 at 16:04
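The disagreement in these comments is easy to check side by side. This is a small sketch assuming bash; the temporary directory and the two file names are invented for the comparison.

```bash
#!/usr/bin/env bash
# Hypothetical demo directory; the file names are invented.
dir=$(mktemp -d)
touch "$dir/ueno" "$dir/shin osaka"

# Approach from the answer: the unquoted command substitution is
# word-split, so "shin osaka" ends up as two separate elements.
from_ls=()
for FILE in `ls "$dir"`; do from_ls+=($FILE); done

# Glob-based approach: each match becomes exactly one element,
# and no external command is run.
shopt -s nullglob
from_glob=( "$dir"/* )

echo "ls parsing: ${#from_ls[@]} elements"
echo "globbing:   ${#from_glob[@]} elements"

rm -rf "$dir"
```

Note that the glob yields paths that include the directory prefix, whereas `ls` prints bare names; if only the names are needed, `"${from_glob[@]##*/}"` strips the leading directory from every element.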
#!/usr/bin/env bash
# cause globs to expand to an empty list if no matches exist
shopt -s nullglob
# Use a glob expression to populate the array
stations=( Kanto-station-names/* )
(( ${#stations[@]} )) || {
  echo "ERROR: No stations exist (make sure Kanto-station-names contains files)" >&2
  exit 1
}
## read into an array (bash 4.0 or newer)
#readarray -t myarr < <(awk '{print $1}' "${stations[@]}")
# read into an array (bash 3.x-compatible)
IFS=$'\n' read -r -d '' -a myarr < <(awk '{print $1}' "${stations[@]}" && printf '\0') || {
  echo "Error extracting first column from station list" >&2
  exit 1
}
# print definition of our populated array as output
declare -p myarr
Items of note:
- We don't use `ls` anywhere in this code. All generation of lists of filenames is performed with globbing. See Why you shouldn't parse the output of ls(1).
- `"${stations[@]}"` is expanded inside the process substitution that's running `awk`, so names are passed to `awk` -- this was one of the bugs in the original code.
- The `readarray` code in the commented-out alternative for bash 4.0 does not detect a failed exit status from `awk`. With bash 4.4 or newer, you can detect when a process substitution fails by running `wait "$!" || { echo "Process failed" >&2; }` or similar -- prior to that point exit status of a process substitution can't be detected, so there are compelling reasons to use the bash 3.x-compatible approach even with bash 4.0 through 4.3.
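The `wait "$!"` technique from the last item might look like the following. This is a sketch that assumes bash 4.4 or newer; `failing_producer` is a hypothetical stand-in for an `awk` invocation that prints something and then exits with a non-zero status.

```bash
#!/usr/bin/env bash
# Sketch for bash 4.4 or newer. failing_producer is a made-up stand-in
# for an awk command that produces output and then fails.
failing_producer() { printf '%s\n' 'partial output'; return 3; }

readarray -t lines < <(failing_producer)

# In bash 4.4+, $! holds the PID of the most recent process substitution,
# so `wait` can collect its exit status after readarray has finished.
if wait "$!"; then
    producer_status=0
else
    producer_status=$?
fi

echo "read ${#lines[@]} line(s); producer exited with status $producer_status"
```

Here `readarray` itself succeeds even though the producer failed, which is why the separate `wait` is needed to notice the failure.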

Charles Duffy