282

I am trying to read a file containing lines into a Bash array.

I have tried the following so far:

Attempt 1

a=( $( cat /path/to/filename ) )

Attempt 2

index=0
while read line ; do
    MYARRAY[$index]="$line"
    index=$(($index+1))
done < /path/to/filename

Both attempts only return a one element array containing the first line of the file. What am I doing wrong?

I am running bash 4.1.5

codeforester
Homunculus Reticulli
  • 1
    You don't need to maintain an index with your `while` loop. You can append to an array like this: `myarray+=("$line")`. If you need to increment an integer, you can do `(( index++ ))` or `(( index += 1 ))`. – Dennis Williamson Jul 09 '12 at 11:15
  • 3
    @DennisWilliamson or `let index++` – nhed Jul 09 '12 at 11:23
  • 2
    @DennisWilliamson `((index++))` has a return value, which will likely terminate the script if run in `set -e` mode. The same applies to `let index++`. Using `A=$((A+1))` is safe. – ceving May 08 '14 at 11:03
  • @ceving: You should never use [`set -e`](http://mywiki.wooledge.org/BashFAQ/105); it's a useless relic. Use proper error handling. – Dennis Williamson May 08 '14 at 11:24
  • 3
    @DennisWilliamson I like it, because it is efficient and because of that very useful. `set -eu` is my standard prelude. – ceving May 08 '14 at 13:03
  • Actually "Attempt 2" works (with bash 4.3), provided you set `MYARRAY=()` before. Just add `printf '. %q .\n' "${MYARRAY[@]}"` at the end and you will see this. The problem probably was looking at `echo "$MYARRAY"` instead of `echo "${MYARRAY[@]}"`. Also note: `read -r line` probably is better as it does no ` \ ` interpolation. Perhaps consider `IFS='' read -r line` to read in leading and trailing spaces, too. – Tino Feb 10 '17 at 10:09
  • Strangely using your first Attempt I also got an array that only contains the first element when I tried to `echo` it. But I found that using the indexing such as `echo ${mylist[0]}` and `echo ${mylist[1]}` gives me different outputs. – Phoenix Mu Dec 04 '20 at 16:03
  • Make sure to set the Internal Field Separator (IFS) variable to `$'\n'` so that it does not put each word into a new array entry: `OLDIFS=${IFS}; IFS=$'\n'; declare -a ARR; ARR=( $(cat "file.txt") )` – Chris Reid Oct 13 '22 at 20:04
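
The suggestions in the comments above (append with `+=`, use `read -r`, clear `IFS` for the read, print with `"${MYARRAY[@]}"`) can be combined into a loop that should also work on bash 3.x. This is only a sketch: the temporary file and its three sample lines are made up for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical sample input: leading spaces and a backslash, to show
# what IFS= and -r preserve.
tmpfile=$(mktemp)
printf '%s\n' '  indented line' 'back\slash' 'last' > "$tmpfile"

MYARRAY=()                       # start from an empty array
while IFS= read -r line; do      # IFS= keeps leading/trailing blanks, -r keeps backslashes
    MYARRAY+=( "$line" )         # quoted append: one element per line
done < "$tmpfile"                # redirect, not a pipe, so the loop runs in this shell

echo "count: ${#MYARRAY[@]}"
printf '. %q .\n' "${MYARRAY[@]}"   # one element per output line

rm -f "$tmpfile"
```

Note that a final line with no trailing newline would be skipped by this loop; `while IFS= read -r line || [ -n "$line" ]` is a common way to catch it.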

6 Answers

412

The readarray command (also spelled mapfile) was introduced in bash 4.0.

readarray -t a < /path/to/filename
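
As a quick check (a sketch, not part of the original answer; the temporary file and its contents are invented for illustration), the following loads a file with `readarray -t` and contrasts `$a` with `"${a[@]}"`. Expanding the bare name is probably what made the arrays in the question look as if they held a single element. It needs bash 4.0 or later.

```bash
#!/usr/bin/env bash
# Hypothetical three-line sample file.
tmpfile=$(mktemp)
printf '%s\n' 'first line' 'second line' 'third line' > "$tmpfile"

# -t strips the trailing newline from each element
readarray -t a < "$tmpfile"

echo "count: ${#a[@]}"         # number of elements in the array
echo "bare name: $a"           # $a is the same as ${a[0]}: first element only
printf '<%s>\n' "${a[@]}"      # every element, one per output line

rm -f "$tmpfile"
```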
x-yuri
chepner
  • 11
    I think you are right about being introduced in bash 4, this doesn't work in bash 3.2 – trip0d199 Jul 31 '13 at 14:43
  • 5
    When I made that comment, I may not have been sure if it was in 4.0, or 4.1, or 4.2. Anyway, the [bash release notes](http://tiswww.case.edu/php/chet/bash/NEWS) confirm it was added in 4.0. – chepner Jul 31 '13 at 14:49
  • Easy to see on my MacOS as `/bin/bash` is 3 and `env bash` is 4 ... so I test expressions on my shell and they work fine, add expressions to script; but the scripts have a shebang `#!/bin/bash` which causes readarray to fail :( – nhed Mar 10 '14 at 23:14
  • For what it's worth, the shebang is only used when you execute the script as `myscript`. You can use `env bash myscript` to run your script with the newer version. – chepner Mar 10 '14 at 23:43
  • using any output: `bkpIFS="$IFS";IFS=$'\n';readarray astr < <(echo -e "a b c\nd e\nf");IFS="$bkpIFS";` checking `for str in "${astr[@]}";do echo $str;done;` – Aquarius Power Aug 10 '14 at 18:26
  • 4
    `readarray` doesn't use `IFS`; it only populates the named array with one line per element, with no field splitting. – chepner Aug 10 '14 at 19:03
  • yes I just had a problem with it concerning NAUTILUS_SCRIPT_SELECTED_FILE_PATHS that has '\012' (\n) char on it, still testing... – Aquarius Power Aug 10 '14 at 22:40
  • [this one](http://stackoverflow.com/a/11394045/1422630) worked with NAUTILUS_SCRIPT_SELECTED_FILE_PATHS! wonder if readarray could work too? – Aquarius Power Aug 10 '14 at 22:46
  • 95
    I would suggest adding `-t` to the answer to strip off the newline characters. Makes it easier to use the array (e.g. for string comparisons) and it is not often that you'll want to keep the newline anyway. – morloch Feb 24 '15 at 02:55
  • 3
    @AquariusPower `bash` 4.4 will add a `-d` flag to `readarray` to specify an alternate character to terminate each line of the input. – chepner Jul 12 '16 at 16:15
  • @morloch thank you, not realizing there is a newline, I wasted many minutes trying to figure out what was wrong with my input, as sed couldn't process the lines. What a horrible default. – qubodup Aug 05 '16 at 03:18
  • 2
    A gotcha: Don't forget to use process substitution when you're trying to do this with the output of a command. Otherwise it doesn't work. Example: `readarray my_array < <(command)` – starbeamrainbowlabs May 27 '19 at 11:44
  • To clarify @starbeamrainbowlabs 's comment: If you write the line like `command | readarray -t varname` , the `varname` will be empty because `readarray` was run in a subshell. – Edheldil Aug 10 '21 at 08:27
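
The subshell gotcha from the last two comments can be reproduced with a short sketch. The array names `piped` and `subst` are made up for illustration, and `lastpipe` is switched off explicitly because with that option enabled (and job control off) the piped form would behave differently.

```bash
#!/usr/bin/env bash
shopt -u lastpipe    # the default; stated explicitly for this demo

# Piped: readarray runs in a subshell, so the parent's array is untouched.
piped=()
printf '%s\n' one two three | readarray -t piped
echo "piped elements: ${#piped[@]}"

# Process substitution: readarray runs in the current shell.
readarray -t subst < <(printf '%s\n' one two three)
echo "subst elements: ${#subst[@]}"
```

Under these assumptions the first array stays empty and the second holds three elements.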
152

Latest revision based on BinaryZebra's comment and tested here. Adding `command eval` lets the array assignment persist in the current execution environment, while the variable assignments placed before it (`IFS`, `GLOBIGNORE`) are only held for the duration of the eval.

Use an $IFS that contains no spaces or tabs, just newline/CR:

$ IFS=$'\r\n' GLOBIGNORE='*' command eval  'XYZ=($(cat /etc/passwd))'
$ echo "${XYZ[5]}"
sync:x:5:0:sync:/sbin:/bin/sync

Also note that you may be setting the array just fine but reading it wrong - be sure to use both double-quotes "" and braces {} as in the example above


Edit:

Please note the many warnings about my answer in the comments regarding possible glob expansion, specifically gniourf_gniourf's comments about my prior attempts to work around it.

With all those warnings in mind I'm still leaving this answer here (yes, bash 4 has been out for many years but I recall that some macs only 2/3 years old have pre-4 as default shell)

Other notes:

Can also follow drizzt's suggestion below and replace a forked subshell+cat with

$(</etc/passwd)

The other option I sometimes use is to save IFS into XIFS, set it, then restore it afterwards. See also Sorpigal's answer, which does not need to bother with this.
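
A minimal sketch of that save-and-restore variant follows. It swaps in `set -f` (mentioned in the comments) instead of `GLOBIGNORE` to keep a literal `*` from being expanded; the temporary file and its contents are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sample file: lines with spaces and a bare '*'.
tmpfile=$(mktemp)
printf '%s\n' 'a b c' '*' 'd e' > "$tmpfile"

XIFS=$IFS               # save the current IFS
IFS=$'\r\n'             # split on CR/LF only
set -f                  # assumption: globbing was on before; turn it off for the split
XYZ=( $(< "$tmpfile") )
set +f                  # turn globbing back on
IFS=$XIFS               # restore IFS

printf '<%s>\n' "${XYZ[@]}"
rm -f "$tmpfile"
```

As with the main example, empty lines in the file are silently dropped by the word splitting.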

Community
nhed
  • 2
    Why set `IFS` to carriage return and line feed? `\r` will not appear in files with proper line endings, which will certainly include `passwd`. – sorpigal Jul 09 '12 at 11:17
  • 1
    The IFS tells bash how to parse text, it defines the set of characters that break up tokens in the parsing process. By default it includes whitespaces (space & tab) as well as newline/CR - so my code above removes them just for the current parse - so that it is one line per array index (thats what I thought you were looking for) – nhed Jul 09 '12 at 11:18
  • 1
    @Sorpigal /etc/passwd is just an example, I dont know what file he is parsing, he could be on cygwin for all I know, pulling in files from anywhere on the system ... I would not want strays in there – nhed Jul 09 '12 at 11:19
  • @nhed: Thanks, I looked up IFS in the mean time. Yes, your explanation is how I am trying to implement it. However, when I subst /etc/password for my file. The array variable **XYZ** still contains only the first line in the file. – Homunculus Reticulli Jul 09 '12 at 11:20
  • @HomunculusReticulli Can you show how you test your output? – nhed Jul 09 '12 at 11:22
  • @nhed: I simply type `echo $array` at the command line (where array is the array read into). I am expecting to see `a b c d e` if the file read in contains the lines 'a','b','c','d','e' with each line separated by a new line. – Homunculus Reticulli Jul 09 '12 at 11:25
  • @HomunculusReticulli Please note the end of my post (2nd or 3rd edit :) ) - you have to double-quote and use braces, if you want all lines - you would `echo "${XYZ[@]}"` – nhed Jul 09 '12 at 11:27
  • 4
    `echo "${XYZ[@]}"` will print all elements as a single line; to get each element on a separate line use `printf "%s\n" "${XYZ[@]}"`. – Gordon Davisson Jul 09 '12 at 15:39
  • 2
    why use useless fork? Just use $( – drizzt Apr 18 '14 at 10:16
  • @drizzt Agree, but the OP asked about parsing the output so I'd rather answer with minimal changes to his attempts if it works, but I'll add an suggestion in the answer – nhed Apr 21 '14 at 17:31
  • Please note that this is not a good general method! For example, if a file contains a `*`, this will go through filename expansions! Try it yourself: `echo '*' > file; cat file; IFS=$'\r\n'; a=( $(cat file) ); echo "${a[@]}"` Surprise! – gniourf_gniourf Apr 21 '14 at 17:55
  • @gniourf_gniourf one set of double quotes solves that, doesn't it? `echo -e '*\nhi\n*' > file; IFS=$'\r\n' a=( "$(cat file)" ); echo "${a[@]}"`, BTW you had an extra `;` after IFS – nhed Apr 23 '14 at 13:56
  • @nhed no, because now you have an array with only one field! not an array the fields of which are the lines of `file`. Check with `declare -p a`. And btw, the `;` after `IFS` is optional (same effect with or without it). – gniourf_gniourf Apr 23 '14 at 14:39
  • @gniourf_gniourf hmm I thought I tested that, but I'll double check – nhed Apr 24 '14 at 11:28
  • @gniourf_gniourf how about `echo -e '*\nhi\n*\n?' > file; GLOBIGNORE='*' IFS=$'\r\n' a=($(cat file)); echo "${a[@]}"` – nhed Apr 24 '14 at 14:04
  • Your method is now much longer that Sorpigal's or chepner's, much less natural, looks like a really dirty hack, globally sets `IFS` and `GLOBIGNORE` (I definitely don't want `GLOBIGNORE` or `set -f` in my script since my scripts are mainly here for file stuff and I rely a lot on globbing), and might still break in some obscure cases (actually, your method silently discards empty lines). Moreover this is slower than `mapfile`. Please just don't do that! – gniourf_gniourf Apr 24 '14 at 14:21
  • @gniourf_gniourf globally? no it does not! These are only set for the duration of the array assignment (no `;`!) Just as IFS in Sorpigal's! So the globbing is only disabled for the read. As I mentioned to @drizzt above - I'd rather answer with minimal changes to the OP attempts. As for chepner's solution, its not portable, requires a current version of bash which even the default /bin/bash on a relatively modern MacOS is still not supported – nhed Apr 24 '14 at 20:31
  • No you're wrong. `a=b c=d` is _equivalent_ to `a=b; c=d`. Please try it. That's just the way it is in Bash. Bash 4 was released in 2009 IIRC. We're now in 2014. If you don't have access to Bash 4, use Sorgipal's solution. BTW, in Sorgipal's solution `IFS` is _not_ globally set. And chepner's solution is just the way to do it in bash ≥4. – gniourf_gniourf Apr 24 '14 at 20:35
  • @gniourf_gniourf compare `a=b c=d date; echo $c` (result is just the date), then compare to `a=b; c=d; date; echo $c` (result is date plus "d"). IFS is not set globally neither mine or sorpigals answer, only in your comment with the `;`! – nhed Apr 28 '14 at 19:44
  • Please try what you say: in a new clean environment: `IFS=lol GLOBIGNORE='*' XYZ=($(cat /etc/passwd))` and then `echo "$IFS"` (with the quotes!). Do you still say that you're not globally setting `IFS`? I can't explain to you in a comment why this sets the `IFS` but not Sorpigal's answer; it's just the way it is. – gniourf_gniourf Apr 28 '14 at 19:47
  • @gniourf_gniourf yes you are correct. I was testing `echo ${IFS}` vs `echo "${IFS}"`. That difference I can't explain. I can explain why IFS is set globally in my example, because the line just has assignments, and not assignments that precede a command, i'll update my answer – nhed Apr 29 '14 at 13:54
  • 2
    That's how variable assignment has worked in bash for the 25 years I've been using it. "X=a Y=b" sets the variables for the shell. "X=a Y=b executable" sets the variables only for the fork and exec environment of that one execution. The ":;" in the given answer is counter-productive and pollutes the shell. – Blaine Oct 23 '14 at 20:05
  • Why not update the "answer" given with the suggested fixes and optimizations: IFS=$'\r\n' GLOBIGNORE='*' XYZ=($(< /etc/passwd)) – Blaine Oct 23 '14 at 20:17
  • Correction: Can't combine the exec-only assignments with the $(<...) usage because that command has no executable. Efficient work-around is to run the "true" executable: IFS=$'\r\n' XYZ=($(< /etc/passwd)) true – Blaine Oct 23 '14 at 20:39
  • APOLOGIES (wish this thing let me edit my comments for 10 minutes instead of 5). Correction: The exec-only assignment won't work for this use case because the goal of assigning XYZ would not persist. Have to set IFS then restore it afterward. – Blaine Oct 23 '14 at 20:49
  • 1
    @Blaine Sure you can't edit, but I think you can delete your own comments - and then resubmit one cohesive one. In practice I **usually** just set IFS before, then reset it after just to make my scripts less obscure (for example http://stackoverflow.com/a/5279465/652904) – nhed Oct 24 '14 at 15:23
  • The current answer with `:;` is broken -- the variables `IFS` and `GLOBIGNORE` are only set while executing the `:` (`true`) command. Is there any way to set `IFS` and `GLOBIGNORE` temporarily but still allow `XYZ` to persist? – Hugues Dec 18 '15 at 20:18
  • For instance, if you try `IFS=$'\r\n' GLOBIGNORE='*' :; XYZ=($(echo '*'))` you should find that the `GLOBIGNORE` variable was set only for the evaluation of `:` (`true`) and not for the next command which assigns `XYZ`. (I am using `4.1.17(9)-release` but that should not matter.) – Hugues Dec 21 '15 at 18:45
  • @Hugues you are right, I had not used my original test cases when testing for my response, i.e. did not have wildcards – nhed Dec 22 '15 at 01:01
  • Assignments alone are set for *this* shell. Assignments for commands vary. If the command is external, the assignments are for that command environment only: `unset a b c d; a=b c=d bash -c 'echo "|$c|"'; echo "<$c>"`. But [for **special built-ins**, POSIX requires](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_14) (Variable assignments specified with special built-in utilities remain in effect after the built-in completes) that the value is set for **this** shell: `sh -c 'unset a b c d; a=b c=d export f=g ; echo "<$a::$c::$f>"'` outputs ``. –  Dec 22 '15 at 11:37
  • It is possible to set variables for only one command with `a=b c=d command eval 'f=g'; echo "<$a::$c::$f>"`. In such case vars `a` and `c` are set for the environment of the command `command`, but discarded after that. I'll edit the answer to include this solution. –  Dec 22 '15 at 11:41
  • 2
    In case the edit is rejected, what I added was `Placing variables in the environment of the split is done with command eval: IFS=$'\r\n' GLOBIGNORE='*' command eval 'XYZ=($(cat /etc/passwd))'` as the first two lines. Feel free to edit as this is your answer anyway. nJoy! –  Dec 22 '15 at 11:53
  • Thanks @BinaryZebra ... – nhed Dec 22 '15 at 17:40
  • 1
    What a solution .. finally. Reading it hurts me somehow, however I was not able to spot any flaw in it like bad sideffects etc. It even returns the error code in case the file is unreadable, so works with `set -e`. And `command eval` wow, what a find! I'd rather suggest `readarray` but for pre-4.0 `bash` this is really a way to do it. (Note that it needs `bash` due to `GLOBIGNORE`) – Tino Feb 10 '17 at 09:58
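
The scoping behaviour debated in the comments above can be checked with a small sketch. The variable names `a`, `c` and `f` follow BinaryZebra's comment; the temporary file and the `before` variable are invented for illustration, and the file deliberately contains no glob characters, so this does not exercise `GLOBIGNORE`.

```bash
#!/usr/bin/env bash
unset a c f

# Assignments before `command eval` are temporary; the one inside persists.
a=b c=d command eval 'f=g'
echo "<${a-unset}::${c-unset}::${f-unset}>"

# Same pattern with IFS: split a file on newlines without changing IFS afterwards.
tmpfile=$(mktemp)
printf '%s\n' 'one two' 'three four' > "$tmpfile"
before=$IFS
IFS=$'\r\n' command eval 'XYZ=( $(cat "$tmpfile") )'

echo "elements: ${#XYZ[@]}"
[ "$IFS" = "$before" ] && echo "IFS unchanged"
rm -f "$tmpfile"
```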
149

The simplest way to read each line of a file into a bash array is this:

IFS=$'\n' read -d '' -r -a lines < /etc/passwd

Now just index into the array lines to retrieve each line, e.g.

printf "line 1: %s\n" "${lines[0]}"
printf "line 5: %s\n" "${lines[4]}"

# all lines
echo "${lines[@]}"
sorpigal
  • 15
    All lines, one per line: `printf '%s\n' "${lines[@]}"`. – gniourf_gniourf Apr 21 '14 at 17:57
  • this worked with NAUTILUS_SCRIPT_SELECTED_FILE_PATHS that has '\012' (\n) char on it, thx! using any output: `IFS=$'\n' read -d '' -r -a astr < <(echo -e "a b c\nd e\nf");` checking: `for str in "${astr[@]}";do echo $str;done;` – Aquarius Power Aug 10 '14 at 22:42
  • 5
    This will discard blank lines in the file: http://mywiki.wooledge.org/BashFAQ/005#Loading_lines_from_a_file_or_stream – glenn jackman Jun 22 '15 at 20:06
  • 4
    In this context, `read` returns `false`, so you cannot distinguish correct function from a failure such as a read error. `readarray` is a better way to go. – Tino Feb 10 '17 at 09:28
  • It's not clear to me what `IFS=$'\n' ` is doing in this expression; it seems to work just as well on a Mac without it – Magnus Mar 19 '19 at 19:12
  • 1
    @Magnus: It's making read split the input in to fields on newline. This will also happen if you omit it, but you will additionally split on the other default input field separator: space. If your file's lines may have spaces this will lead to different results. If your `bash` is new enough you should in any case use `mapfile -t lines < /etc/passwd` instead which is more efficient and just as safe. – sorpigal Mar 19 '19 at 21:00
  • 4
    If you have an older version of Bash that doesn't have `mapfile` or `readarray` (e.g. Mac's ancient default version Bash), then you have to use this method. Since `read` returns `false` here, you could add `|| true` to the end of the command to avoid having your program exit here if you have error checking (`set -e`) enabled. – ishmael Apr 24 '20 at 19:05
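
Two points raised in the comments above, the non-zero exit status and the dropped blank lines, can be seen in a short sketch. The temporary file, its contents and the `status` variable are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical sample file with a blank line in the middle.
tmpfile=$(mktemp)
printf '%s\n' 'alpha beta' '' 'gamma' > "$tmpfile"

# read hits end-of-file before finding the NUL delimiter, so it returns non-zero;
# capturing the status with || also keeps `set -e` scripts from aborting here.
status=0
IFS=$'\n' read -d '' -r -a lines < "$tmpfile" || status=$?

echo "exit status: $status"
echo "elements: ${#lines[@]}"      # the blank line does not become an element
printf '<%s>\n' "${lines[@]}"

rm -f "$tmpfile"
```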
29

An alternative, if the file contains one string per line and the strings contain no spaces:

fileItemString=$(cat  filename |tr "\n" " ")

fileItemArray=($fileItemString)

Check:

Print the whole array:

echo "${fileItemArray[*]}"

Get its length:

Length=${#fileItemArray[@]}
imp25
Karanjot
  • 3
    **Beware!** This expands shell metacharacters, for example if `fileItemString='*'`. Can only be used safely when globbing is turned off, which in turn renders the shell mostly useless. – Tino Feb 10 '17 at 09:30
  • 1
    This will treat every whitespace in the file as separator (not only `\n`). I.e. if Nth line in the file is "foo bar", the resulting array will contain separate `foo` and `bar` entries, not a single `foo bar` entry. – Sasha Jan 07 '19 at 10:03
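
The whitespace caveat from the comment above is easy to reproduce. In this sketch the temporary file and its two lines are made up; the variable names follow the answer.

```bash
#!/usr/bin/env bash
# Hypothetical two-line file whose first line contains a space.
tmpfile=$(mktemp)
printf '%s\n' 'foo bar' 'baz' > "$tmpfile"

fileItemString=$(tr '\n' ' ' < "$tmpfile")
fileItemArray=( $fileItemString )   # unquoted on purpose: splits on every blank

echo "lines in file: 2"
echo "array entries: ${#fileItemArray[@]}"
printf '<%s>\n' "${fileItemArray[@]}"

rm -f "$tmpfile"
```

The array ends up with more entries than the file has lines, because `foo` and `bar` are stored separately.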
26

Your first attempt was close. Here is the simplistic approach using your idea.

file="somefileondisk"
lines=`cat $file`
for line in $lines; do
        echo "$line"
done
Atttacat
  • 7
    This was close but didn't answer the part about populating an array. – ioscode May 29 '15 at 18:00
  • lines **is** the array and can be referenced as such. – Atttacat Jun 11 '15 at 17:49
  • 9
    No, `lines` is *not* an array here; it's just a string. Sure, you're splitting that string on whitespace to iterate over it (and also expanding any globs it contains), but that doesn't make it into an array. – Charles Duffy Jun 22 '15 at 20:02
  • 11
    **Beware!** This expands shell metacharacters, for example if `lines='*'`. – Tino Feb 10 '17 at 09:34
  • 5
    While not being direct answer to question, this snippet actually solves the problem I had when google led me to this page. – urmaul Nov 13 '17 at 14:54
  • 3
    This will treat every whitespace in the file as separator (not only \n). I.e. if Nth line in the file is "foo bar", the resulting output will contain `foo` and `bar` as separate line, not a single `foo bar` line. – Sasha Jan 07 '19 at 22:52
  • There's so many problems with this example, i don't understand how it got 24 votes. Where is the IFS variable? Where is the quotations around variables? – Owl Mar 28 '22 at 12:51
4
#!/bin/bash
IFS=$'\n' read  -d '' -r -a inlines  < testinput
IFS=$'\n' read  -d '' -r -a  outlines < testoutput
counter=0
cat testinput | while read line; 
do
    echo "$((${inlines[$counter]}-${outlines[$counter]}))"
    counter=$(($counter+1))
done
# OR Do like this
counter=0
readarray a < testinput
readarray b < testoutput
cat testinput | while read myline; 
do
    echo value is: $((${a[$counter]}-${b[$counter]}))
    counter=$(($counter+1))
done
  • 3
    Stick to `readarray`, because `read` returns `false`, so first solution fails under `set -e`. Also note, that `counter` is still `0` after the loop, because it is done in a subshell (due to pipe). – Tino Feb 10 '17 at 09:32
  • 5
    OSX does not have `readarray` – cmcginty May 13 '18 at 23:51
  • @cmcginty, ...because Apple is shipping a copy of bash that's almost old enough to get a driver's license, yes; for anyone who cares about bash, they really should be installing a newer one rather than using the OS-vendor-provided copy. – Charles Duffy May 28 '21 at 19:21
  • 2
    One issue here: `read -d'' -r` is exactly the same as `read -d -r`; the `-` in `-r` becomes the delimiter. That's incorrect; what it _should_ be is `read -d '' -r`; the space between the `-d` and the `''` is critical. – Charles Duffy May 28 '21 at 19:23
  • To test that, try reading sample input that contains dashes -- it'll be truncated at them. – Charles Duffy May 28 '21 at 19:24
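
Following the comments above, here is one possible rework of the second variant. It is a sketch rather than the answerer's method: it iterates over the array indices instead of piping `cat` into `while`, so nothing runs in a subshell. The two temporary files stand in for `testinput` and `testoutput`, and the `diffs` array is invented for illustration. It requires bash 4.0 or later for `readarray`.

```bash
#!/usr/bin/env bash
# Hypothetical stand-ins for testinput and testoutput.
infile=$(mktemp)
outfile=$(mktemp)
printf '%s\n' 10 20 30 > "$infile"
printf '%s\n' 1 2 3 > "$outfile"

readarray -t a < "$infile"
readarray -t b < "$outfile"

diffs=()
for i in "${!a[@]}"; do                 # loop over the indices of a
    diffs+=( "$(( a[i] - b[i] ))" )     # element-wise difference
done

printf 'value is: %s\n' "${diffs[@]}"
echo "processed: ${#diffs[@]}"          # still available after the loop

rm -f "$infile" "$outfile"
```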