62

I'm trying to store the files listing into an array and then loop through the array again. Below is what I get when I run ls -ls command from the console.

total 40
36 -rwxrwxr-x 1 amit amit 36720 2012-03-31 12:19 1.txt
4 -rwxrwxr-x 1 amit amit  1318 2012-03-31 14:49 2.txt

The following is the bash script I've written to store the above data in a bash array.

i=0
ls -ls | while read line
do
    array[ $i ]="$line"        
    (( i++ ))
done

But when I echo $array, I get nothing!

FYI, I run the script this way: ./bashscript.sh

Alex Raj Kaliamoorthy
codef0rmer
  • bash runs the pipeline in a subshell, so your assignment to the array is only available inside `do .. done`. – yuanjianpeng Apr 27 '19 at 14:43
  • I would suggest that the question here is really "How to iterate over a directory list"? **Arrays are NOT universally supported in shell scripts**. – BuvinJ Oct 16 '19 at 12:44
  • And even if you have a shell with arrays, you don't want or need to keep the file names in memory just to loop over them one by one. An array is useful if you want to compare every file to every other file, for example, but to just loop over files, use a regular `for file in *` or whatever, and don't squander memory on keeping a copy of the information the shell is perfectly capable of producing at any time. – tripleee Feb 06 '21 at 12:17

10 Answers

140

I'd use

files=(*)

And then if you need data about the file, such as size, use the stat command on each file.
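For instance, a minimal sketch (assuming GNU `stat`; on BSD/macOS the equivalent flag is `stat -f '%z'`):

```shell
#!/bin/bash
# Collect the directory entries into an array, then query each one with stat.
shopt -s nullglob                # an empty dir yields an empty array, not a literal '*'
files=(*)
for f in "${files[@]}"; do
    size=$(stat -c '%s' "$f")    # file size in bytes (GNU stat)
    printf '%10d  %s\n' "$size" "$f"
done
```

The quotes around `${files[@]}` are what keep names containing spaces intact, as discussed in the comments.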

glenn jackman
  • Fantastic! Even works with directory names prepended, to get files from more than one directory (i.e. `files_in_dirs=(dir/* other_dir/*)`). Very useful, thanks. – Gus Shortz Sep 06 '13 at 21:15
  • And then to list all elements in this files array: `echo ${files[@]}` or use that in for loop – HankCa May 17 '15 at 22:54
  • On macs this seems to not work when files contain spaces (which is unfortunately frequent). There may be a work around. If anyone can find what this syntax is doing in the bash documentation, that may help us figure out a solution to this issue. – David Jun 30 '15 at 23:42
  • empty dirs will cause unpredictable errors without "shopt -s nullglob". – Asain Kujovic Apr 22 '16 at 19:31
  • @David, there is no problem with files with spaces: they will be stored in the array properly as a single element. It is extracting the file from the array as a single unit that requires care: `for filename in "${files[@]}"` -- where the quotes are crucial. See [this answer](http://stackoverflow.com/a/12316565/7552) for examples. – glenn jackman May 16 '17 at 17:47
  • @glennjackman hey, `for filename in "${files[@]}"` treats all elements as one unit, at least in win7... – user2959760 Oct 13 '17 at 16:48
  • Where do you have bash on win7? – glenn jackman Oct 13 '17 at 17:18
  • Can this be used to get (only files) or (everything except symlinks). – RatDon Jun 11 '18 at 18:29
  • On Macs and elsewhere, this works fine if you then also quote your variable inside the loop. See [When to wrap quotes around a shell variable](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Feb 06 '21 at 12:21
  • @RatDon Look into `find` options; it can do all that and lots more. – tripleee Feb 06 '21 at 12:21
  • what's the order of the files being stored in array, it feels like a plain `ls` output order. But I failed to find this in any documentation to confirm my guess. – ychz Jan 08 '23 at 06:19
41

Try with:

#! /bin/bash

i=0
while read line
do
    array[ $i ]="$line"        
    (( i++ ))
done < <(ls -ls)

echo ${array[1]}

In your version, the `while` runs in a subshell, so the variables you modify in the loop are not visible outside it.

(Do keep in mind that parsing the output of ls is generally not a good idea at all.)
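On bash 4+, the counting loop can be collapsed into a single `readarray` call (a sketch; the caveat about parsing `ls` still applies):

```shell
#!/bin/bash
# readarray (a.k.a. mapfile) fills the array in the current shell,
# so there is no subshell problem to work around.
readarray -t array < <(ls -ls)
echo "${array[1]}"
```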

Mat
7

Here's a variant that lets you use a regex pattern for initial filtering; change the regex to get the filtering you desire.

files=($(find -E . -type f -regex "^.*$"))
for item in ${files[*]}
do
  printf "   %s\n" $item
done
harschware
  • Word-splitting applies to the expansion of the command substitution, so each space creates a separate array element. If there is a file called `foo bar.txt`, `item` will be set to `foo`, then `bar.txt`. – chepner Apr 29 '13 at 23:37
  • ah yes, after reading your comment I only focused on the regex.. nice catch. – harschware May 01 '13 at 15:50
  • So how would you catch spaces? – Jonny May 24 '13 at 02:23
  • This is horribly broken. [When to wrap quotes around a shell variable](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) – tripleee Feb 06 '21 at 12:20
5

This might work for you:

OIFS=$IFS; IFS=$'\n'; array=($(ls -ls)); IFS=$OIFS; echo "${array[1]}"
potong
  • Or simpler: `IFS=$'\n' array=($(ls -ls))` – Guss Aug 17 '15 at 19:00
  • @Guss Simpler, yes. Correct, no. The use of `OIFS` here involves restoring the internal field separator after you've changed it temporarily. Failing to do that is likely to cause you a lot of subsequent problems! – BuvinJ Oct 16 '19 at 12:48
  • Ok, @BuvinJ, granted. Then how about `(IFS=$'\n' array=($(ls -ls)))` ? the `$IFS` change is now limited to a subshell and won't propagate further. Still simpler ;-) – Guss Oct 16 '19 at 13:42
  • Great idea. I'm not sure about the syntax though. I believe you mean `$(IFS='\n'; array=($(ls -ls)))` would make it run in a subshell, but then your `array` variable is also lost upon return to the outer script! Each way I think about solving this ends up being at least as long / complicated as the original approach from @potong. And his way is, in fact, the "canonical" approach to messing with IFS. – BuvinJ Oct 16 '19 at 17:49
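A third option, sketched here, scopes the IFS change with `local` inside a function, so neither a save/restore dance nor a subshell is needed:

```shell
#!/bin/bash
# `local IFS` restores the caller's IFS automatically when the function returns;
# the array, not being declared local, survives the call.
list_to_array() {
    local IFS=$'\n'
    array=($(ls -ls))
}
list_to_array
echo "${array[1]}"
```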
3

Running any shell command inside `$(...)` captures its output in a variable. Using that, we can convert the file listing into an array with IFS.

IFS=' ' read -r -a array <<< $(ls /path/to/dir)
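Note that `read` stops at the first newline, so the one-liner above only captures the first line of a multi-line `ls` listing. A hedged variant that reads everything, using an appended NUL as the delimiter (`"$PWD"` stands in for the directory you want to list):

```shell
#!/bin/bash
# Read until the appended NUL, splitting fields on newlines only,
# so names with spaces stay intact as single array elements.
IFS=$'\n' read -r -d '' -a array < <(ls "$PWD"; printf '\0')
printf '%s\n' "${array[@]}"
```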
rashok
1

You may be tempted to use (*) but what if a directory contains the * character? It's very difficult to handle special characters in filenames correctly.

You can use ls -ls. However, it fails to handle newline characters.

# Store ls -ls output as an array
readarray -t files <<< $(ls -ls)
for (( i=1; i<${#files[@]}; i++ ))
{
    # Convert current line to an array
    line=(${files[$i]})
    # Get the filename, joining back together any parts split on spaces
    fileName=${line[@]:9}
    echo $fileName
}

If all you want is the file name, then just use ls:

for fileName in $(ls); do
    echo $fileName
done

See this article or this post for more information about some of the difficulties of dealing with special characters in file names.
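For comparison, a sketch of the glob-based loop the linked articles recommend; it handles spaces and even newlines in names, because no word splitting is involved:

```shell
#!/bin/bash
shopt -s nullglob                 # an empty directory means zero iterations
for fileName in *; do
    printf '%q\n' "$fileName"     # %q renders special characters unambiguously
done
```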

Dan Bray
  • No, it's not hard at all. Just [don't use `ls`](https://mywiki.wooledge.org/ParsingLs) and [quote your variables](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable) you'll be fine. – tripleee Feb 06 '21 at 12:18
  • @tripleee if I quote properly using `*`, then it displays newline characters in the filename. Therefore, it offers no advantages over `ls`, but if used incorrectly `*` can be dangerous. Using `ls` might not be perfect, but what would you suggest as a better way? – Dan Bray Feb 06 '21 at 13:53
  • If a name contains a newline character, storing and displaying that newline character is exactly the right thing to do, isn't it? The first link in my comment contains extensive documentation for why you should basically never use `ls` in scripts, and what to use instead. – tripleee Feb 06 '21 at 16:35
  • @tripleee I already read that article before I posted and I disagree with it. I'm sure there are times when `ls` is definitely the wrong tool for the job, but to assume that it never is absurd, especially, when the article does not present good alternatives. It says not to use `ls` and to use `(*)` instead, but provides no good reason to. Using `(*)' can be dangerous if not quoted and still has the same problems `ls` does. – Dan Bray Feb 06 '21 at 20:57
  • I'm trying to understand your argument, but you are not helping. Can you provide an example where `ls` works and `(*)` does not, or ends up being much more complex? – tripleee Feb 07 '21 at 06:29
1

My two cents

The asker wanted to parse output of ls -ls

Below is what I get when I run ls -ls command from the console.

total 40
36 -rwxrwxr-x 1 amit amit 36720 2012-03-31 12:19 1.txt
4 -rwxrwxr-x 1 amit amit  1318 2012-03-31 14:49 2.txt

But there are few answers addressing this parsing operation.

ls's output

Before trying to parse something, we have to ensure the command output is as consistent, stable, and easy to parse as possible:

  • To ensure the output won't be altered by some alias, you may prefer to specify the full path of the command: /bin/ls.
  • To avoid variations in output due to locales, prefix your command with LANG=C LC_ALL=C.
  • Use the --time-style switch to print UNIX EPOCH timestamps, which are easier to parse.
  • Use the -b switch to escape special characters.

So we will prefer

LANG=C LC_ALL=C /bin/ls -lsb --time-style='+%s.%N'

to just

ls -ls

Full bash sample

#!/bin/bash

declare -a bydate=() bysize=() byname=() details=()
declare -i cnt=0 vtotblk=0 totblk
{
    read -r _ totblk # ignore 1st line
    while read -r blk perm lnk usr grp sze date file;do
        byname[cnt]="${file//\\ / }"
        details[cnt]="$blk $perm $lnk $usr $grp $sze $date"
        bysize[sze]+="$cnt "
        bydate[${date/.}]+="$cnt "
        cnt+=1 vtotblk+=blk
    done
} < <(LANG=C LC_ALL=C /bin/ls -lsb --time-style='+%s.%N')

From there, you could easily sort by dates, sizes of names (sorted by ls command).

echo "Path '$PWD': Total: $vtotblk, sorted by dates"
for dte in ${!bydate[@]};do
    printf -v msec %.3f .${dte: -9}
    for idx in ${bydate[dte]};do
        read -r blk perm lnk usr grp sze date <<<"${details[idx]}"
        printf ' %11d %(%a %d %b %T)T%s %s\n' \
               $sze "${date%.*}" ${msec#0} "${byname[idx]}"
    done
done

echo "Path '$PWD': Total: $vtotblk, sorted by sizes"
for sze in ${!bysize[@]};do
    for idx in ${bysize[sze]};do
        read -r blk perm lnk usr grp sze date <<<"${details[idx]}"
        printf -v msec %.3f .${date#*.}
        printf ' %11d %(%a %d %b %T)T%s %s\n' \
               $sze "${date%.*}" ${msec#0} "${byname[idx]}"
    done
done

echo "Path '$PWD': Total: $vtotblk, sorted by names"
for((idx=0;idx<cnt;idx++));{
    read -r blk perm lnk usr grp sze date <<<"${details[idx]}"    
    printf -v msec %.3f .${date#*.}
    printf ' %11d %(%a %d %b %T)T%s %s\n' \
           $sze "${date%.*}" ${msec#0} "${byname[idx]}"
}

(As a bonus, you could check whether the total block count printed by ls matches the sum of the per-line block counts:

(( vtotblk == totblk )) ||
    echo "WARN: Total blocks: $totblk != Block count: $vtotblk" >&2

Of course, this could be inserted before the first echo "Path... ;)

Here is an output sample. (Note: there is a filename with a newline)

Path '/tmp/so': Total: 16, sorted by dates
           0 Sun 04 Sep 10:09:18.221 2.txt
         247 Mon 05 Sep 09:11:50.322 Filename with\nsp\303\251cials characters
          13 Mon 05 Sep 10:12:24.859 1.txt
        1313 Mon 05 Sep 11:01:00.855 parseLs.00
        1913 Thu 08 Sep 08:20:20.836 parseLs
Path '/tmp/so': Total: 16, sorted by sizes
           0 Sun 04 Sep 10:09:18.221 2.txt
          13 Mon 05 Sep 10:12:24.859 1.txt
         247 Mon 05 Sep 09:11:50.322 Filename with\nsp\303\251cials characters
        1313 Mon 05 Sep 11:01:00.855 parseLs.00
        1913 Thu 08 Sep 08:20:20.836 parseLs
Path '/tmp/so': Total: 16, sorted by names
          13 Mon 05 Sep 10:12:24.859 1.txt
           0 Sun 04 Sep 10:09:18.221 2.txt
         247 Mon 05 Sep 09:11:50.322 Filename with\nsp\303\251cials characters
        1913 Thu 08 Sep 08:20:20.836 parseLs
        1313 Mon 05 Sep 11:01:00.855 parseLs.00

And if you want to render the escaped characters (with care: there could be issues if you don't know who created the contents of the path). But if the folder is yours, you could:

echo "Path '$PWD': Total: $vtotblk, sorted by dates, with special chars"
printf -v spaces '%*s' 37 ''
for dte in ${!bydate[@]};do
    printf -v msec %.3f .${dte: -9}
    for idx in ${bydate[dte]};do
        read -r blk perm lnk usr grp sze date <<<"${details[idx]}"
        printf ' %11d %(%a %d %b %T)T%s %b\n' $sze \
            "${date%.*}" ${msec#0} "${byname[idx]//\\n/\\n$spaces}"
    done
done

Could output:

Path '/tmp/so': Total: 16, sorted by dates, with special chars
           0 Sun 04 Sep 10:09:18.221 2.txt
         247 Mon 05 Sep 09:11:50.322 Filename with
                                     spécials characters
          13 Mon 05 Sep 10:12:24.859 1.txt
        1313 Mon 05 Sep 11:01:00.855 parseLs.00
        1913 Thu 08 Sep 08:20:20.836 parseLs
F. Hauri - Give Up GitHub
0

Aren't these 2 code lines, either using scandir or including the dir pull in the declaration line, supposed to work?

src_dir="/3T/data/MySQL";
# src_ray=scandir($src_dir);
declare -a src_ray ${src_dir/*.sql}
printf ( $src_ray );
OldManRiver
0

In the conversation over at https://stackoverflow.com/a/9954738/11944425 the behavior can be wrapped into a convenience function which applies some action to entries of the directory as string values.

#!/bin/bash
iterfiles() {
    i=0
    while read filename
    do 
        files[ $i ]="$filename"
        (( i++ ))
    done < <( ls -l )
    for (( idx=0 ; idx<${#files[@]} ; idx++ ))
    do 
        $@ "${files[$idx]}" &
        wait $!
    done
}

where $@ is the complete list of arguments passed to the function. This lets the function take an arbitrary command as a partial function of sorts to operate on the filename:

iterfiles head -n 1 | tee -a header_check.out

When a script needs to iterate over files, a bash function cannot return an array of them. The workaround is to define the array outside the function scope (and possibly unset it later), modifying it inside the function's scope. Then, after the function is called by the script, the array variable is available. For instance, the mutation on files demonstrates how this could be done.

declare -a files # or just `files= ` (nothing)
iterfiles() {
    # ...
    files=...
}

Extending the conversation above, @Jean-BaptistePoittevin pointed out a valuable detail.

#!/bin/bash
# Adding a section to unset certain variable names that
# may already be active in the shell.
unset i
unset files
unset omit

i=0
omit='^([\n]+)$'
while read file
do
    files[ $i ]="$file"
    (( i++ ))
done < <(ls -l | grep -Pov ${omit} )

 

Note: This can be tested using echo ${files[0]} or for entry in ${files[@]}; do ... ; done

Often, the circumstances require an absolute path in double quotes, where the file (or ancestor directories) have spaces or unusual characters in the name. find is one answer here. The simplest usage might look like the above one, except done < <(ls -l ... ) is replaced with:

done < <(find /path/to/directory ! -path /path/to/directory -type d)

It's convenient, when you need absolute paths in double quotes as an iterable collection, to use a recipe like the one below. (The export is not strictly required here: the shell expands $DIRECTORY itself before find runs, so even unexported variables are visible in the process substitution.)

#!/bin/bash

export DIRECTORY="$PWD" # For example
declare -a files

i=0
while read -r filename; do 
    files[ $i ]="$filename"
    (( i++ ))
done < <(find "$DIRECTORY" ! -path "$DIRECTORY" -type d)

for (( idx=0; idx<${#files[@]}; idx++ )); do

    # Make a templated string for macro script generation
    quoted_path="\"${files[$idx]}\""
    
    if [[ "$(echo $quoted_path | grep some_substring | wc -c)" != "0" ]]; then
        echo "mv $quoted_path /some/other/watched/folder/" >> run_nightly.sh
    fi

done

Upon running this, ./run_nightly.sh will be populated with bulk commands to move a quoted path to /some/other/watched/folder/. This kind of scripting pattern will make it possible to supercharge your scripts.
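If filenames may contain newlines, a NUL-delimited `find` pipeline is the robust variant of the read loops above (a sketch; `mapfile -d ''` requires bash 4.4+):

```shell
#!/bin/bash
# -print0 terminates each path with NUL; mapfile -d '' -t splits on NUL
# and strips it, so any filename, spaces and newlines included, arrives intact.
mapfile -d '' -t files < <(find "$PWD" -mindepth 1 -print0)
for f in "${files[@]}"; do
    printf '%q\n' "$f"
done
```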

-1

You can simply use the for loop below (do not forget the quoting needed to handle filenames with spaces):

declare -a arr
arr=()
for file in "*.txt"
do
    arr=(${arr[*]} "$file")
done

Run

for file in ${arr[*]}
do
    echo "<$file>"
done

to test.

David Tonhofer
  • Improved a bit but "shellcheck" still complains about potential errors. bash syntax is a historical trash heap. – David Tonhofer Aug 13 '22 at 17:15
  • Wrong on at least two counts. Firstly, `for file in "*.txt"` can be replaced by a non-loop `file=*.txt`, since the name isn't in a globbing context there. It's certainly not what was asked for. And both uses of `${arr[*]}` should be `"${arr[@]}"` so that they don't undergo word splitting. – Toby Speight Aug 16 '22 at 14:38