I have a string in a Bash shell script that I want to split into an array of characters, not based on a delimiter but just one character per array index. How can I do this? Ideally it would not use any external programs. Let me rephrase that. My goal is portability, so things like `sed` that are likely to be on any POSIX compatible system are fine.

- `bash` is not a given if your platform is POSIX. – tripleee Aug 04 '12 at 07:41
- @tripleee neither are arrays. – lhunath Jun 11 '15 at 16:10
- Of course. I'm trying to make sense of the question. Maybe the OP means to target Bash on an otherwise-POSIX system? – tripleee Jun 11 '15 at 16:41
- The original intention was to create a shell script that could be shared online without knowing much about the user's platform. So I wanted as much compatibility as possible, across OS X, Ubuntu, etc. I don't need 100% compatibility with exotic variations of Unix. – n s Jun 12 '15 at 17:22
- You can extract a single character with `${string:1:1}` _in bash_, but if you need to support POSIX sh you can't assume you have that, _or that you have arrays at all_. – Charles Duffy Dec 20 '22 at 18:14
- And POSIX sh is not some weird exotic thing you never see in modern real-world systems. I mean, Debian's dash is basically a POSIX shell. – Charles Duffy Dec 20 '22 at 18:15
20 Answers
Try
echo "abcdefg" | fold -w1
Edit: Added a more elegant solution suggested in comments.
echo "abcdefg" | grep -o .
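To get the characters into an array rather than onto the terminal, the output of either pipeline can be read line by line. A minimal sketch, assuming bash 4+ (for `mapfile`) and a `grep` that supports `-o`; the variable names `s` and `chars` are only illustrative, and characters that `grep` treats as line terminators (newlines) will not survive:

```shell
#!/usr/bin/env bash
# Assumption: bash 4+ for mapfile, and grep with -o support.
s="abcdefg"

# grep -o . prints each single-character match on its own line;
# mapfile -t reads those lines into the array and strips the newlines.
mapfile -t chars < <(printf '%s\n' "$s" | grep -o .)

declare -p chars
```

As the comments below discuss, the result for non-ASCII input depends on the locale and on the tool's implementation.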
- Despite the fact that an external command is used, +1 because of conciseness. – Dimitre Radoulov Sep 28 '11 at 11:09
- http://unstableme.blogspot.fi/2009/07/split-string-to-characters-in-bash.html has the rather elegant suggestion `echo "abcdefg" | grep -o .` – tripleee Aug 04 '12 at 07:32
- @xdazz it doesn't work on Unicode. Try this `echo "عمر" | fold -w1` It prints spaces and question marks. However @tripleee's solution `echo "عمر" | grep -o .` does work fine. Funny how small programs don't pass the http://stackoverflow.com/q/796986/161278 :). Thanks anyway for your elegant answer. – Omar Al-Ithawi Mar 12 '14 at 14:19
- @OmarIthawi: both of those variants work for me, on Mac OS X and Linux CentOS 6.5, so it seems it's not as simple as "the fold solution doesn't work with unicode". – erik.weathers Oct 21 '15 at 09:19
- Not unexpectedly the `fold` solution appears to be much (>6 times) faster than grep. – alephreish Sep 10 '17 at 12:16
- @OmarAl-Ithawi: your locale is maybe different than the one of eric.weathers or har-wradim? You can "force" a change of locale by using for example: `echo "something" | LC_ALL='C' fold -w1` (or different locales instead of C; use `locale -a` to see all those available on your system) – Olivier Dulac Dec 28 '17 at 17:51
- @OlivierDulac I tried to change the locale but it didn't work `$ fold`, mine is "fold (GNU coreutils) 8.25" on Ubuntu. This might be _the_ difference. – Omar Al-Ithawi Dec 29 '17 at 21:14
- For @OmarAl-Ithawi's example, I get different results with different combinations of locale and command. On Debian GNU/Linux, only the combination of `LANG=en_US.UTF-8 grep -o .` works correctly and all the others just print rows of `?`. On Darwin, `LANG=C fold -w1` just prints out the string without any modification, `LANG=C grep -o .` prints rows of `?`, and the rest (`LANG=en_US.UTF-8`) work correctly. – musiphil May 26 '18 at 04:29
You can access each letter individually already without an array conversion:
$ foo="bar"
$ echo ${foo:0:1}
b
$ echo ${foo:1:1}
a
$ echo ${foo:2:1}
r
If that's not enough, you could use something like this:
$ bar=($(echo $foo|sed 's/\(.\)/\1 /g'))
$ echo ${bar[1]}
a
If you can't even use `sed` or something like that, you can use the first technique above combined with a while loop using the original string's length (`${#foo}`) to build the array.
Warning: the code below does not work if the string contains whitespace. I think Vaughn Cato's answer has a better chance at surviving with special chars.
thing=($(i=0; while [ $i -lt ${#foo} ] ; do echo ${foo:$i:1} ; i=$((i+1)) ; done))
- the loop you suggested: `for i in $(seq ${#foo}); do echo "${foo:$i-1:1}"; done` – wjandrea Aug 17 '16 at 06:19
As an alternative to iterating over `0 .. ${#string}-1` with a for/while loop, there are two other ways I can think of to do this with only bash: using `=~` and using `printf`. (There's a third possibility using `eval` and a `{..}` sequence expression, but this lacks clarity.)
With the correct environment and NLS enabled in bash these will work with non-ASCII as hoped, removing potential sources of failure with older system tools such as `sed`, if that's a concern. These will work from bash-3.0 (released 2004).
Using `=~` and regular expressions, converting a string to an array in a single expression:
string="wonkabars"
[[ "$string" =~ ${string//?/(.)} ]] # splits into array
printf "%s\n" "${BASH_REMATCH[@]:1}" # loop free: reuse fmtstr
declare -a arr=( "${BASH_REMATCH[@]:1}" ) # copy array for later
The way this works is to perform an expansion of `string` which replaces each single character with `(.)`, then match this generated regular expression with grouping to capture each individual character into `BASH_REMATCH[]`. Index 0 is set to the entire string; since that special array is read-only you cannot remove it. Note the `:1` when the array is expanded to skip over index 0, if needed.
Some quick testing for non-trivial strings (>64 chars) shows this method is substantially faster than one using bash string and array operations.
The above will work with strings containing newlines: `=~` supports POSIX ERE where `.` matches anything except NUL by default, i.e. the regex is compiled without `REG_NEWLINE`. (The behaviour of POSIX text processing utilities is allowed to be different by default in this respect, and usually is.)
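To make the generated pattern concrete, here is a small sketch on a three-character string. It only restates the technique above; the variable names `pattern` and `chars` are illustrative and not part of the original answer:

```shell
#!/usr/bin/env bash
string="abc"

# Every character of the string is replaced by the group "(.)".
pattern="${string//?/(.)}"
printf '%s\n' "$pattern"      # prints (.)(.)(.)

# Matching the string against that pattern fills one capture group
# per character; index 0 holds the whole match and is skipped here.
if [[ $string =~ $pattern ]]; then
  chars=("${BASH_REMATCH[@]:1}")
fi

declare -p chars
```

The pattern grows with the string (four characters of regex per input character), so very long inputs may be better served by one of the loop-based approaches.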
Second option, using `printf`:
string="wonkabars"
ii=0
while printf "%s%n" "${string:ii++:1}" xx; do
((xx)) && printf "\n" || break
done
This loop increments index `ii` to print one character at a time, and breaks out when there are no characters left. This would be even simpler if the bash `printf` returned the number of characters printed (as in C) rather than an error status; instead the number of characters printed is captured in `xx` using `%n`. (This works at least back as far as bash-2.05b.)
With bash-3.1 and `printf -v var` you have slightly more flexibility, and can avoid falling off the end of the string should you be doing something other than printing the characters, e.g. to create an array:
declare -a arr
ii=0
while printf -v cc "%s%n" "${string:(ii++):1}" xx; do
((xx)) && arr+=("$cc") || break
done

- I like to consider myself fairly decently knowledgeable in bash, but man there are constantly new things I'm learning. These are both really cool tricks, thanks! – Shaun Mitchell Feb 21 '23 at 20:32
If your string is stored in variable x, this produces an array y with the individual characters:
i=0
while [ $i -lt ${#x} ]; do y[$i]=${x:$i:1}; i=$((i+1));done

- This: `for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done` looks more idiomatic for bash. – Aug 16 '15 at 22:48
- This is old but I want to say that bash reads the whole string every time it calculates its length, so putting it in a var would be good. Alternatively, starting from the length and going down also works. – phicr Aug 06 '22 at 01:31
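The last suggestion can be sketched as follows: counting down means `${#x}` is expanded only once, in the loop initialiser. This is an illustration of that comment, not a measured optimisation, and it additionally resets `y` first (an assumption on my part about what is wanted):

```shell
#!/usr/bin/env bash
x="abc"

y=()                                   # drop any previous contents
for (( i = ${#x} - 1; i >= 0; i-- )); do
  y[i]=${x:i:1}                        # assign by index, highest first
done

declare -p y
```

Indexed arrays are ordered by index, so filling them from the end still yields the characters in their original order.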
The most simple, complete and elegant solution:
$ read -a ARRAY <<< $(echo "abcdefg" | sed 's/./& /g')
and test
$ echo ${ARRAY[0]}
a
$ echo ${ARRAY[1]}
b
Explanation: `read -a` reads stdin as an array and assigns it to the variable ARRAY, treating spaces as the delimiter for each array item.
Piping the string through sed just adds the needed space after each character.
We are using a Here String (`<<<`) to feed the stdin of the read command.
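One consequence of using spaces as the delimiter is that whitespace characters in the input cannot become array elements themselves. The following sketch (same technique, with an illustrative array name `parts` and `-r` added so backslashes are kept) shows the effect on a string that contains a space:

```shell
#!/usr/bin/env bash
s="a b"

# sed turns "a b" into "a   b "; read -a then splits on runs of whitespace.
read -r -a parts <<< "$(printf '%s\n' "$s" | sed 's/./& /g')"

printf '%s\n' "${#parts[@]}"   # 2, not 3: the space character is lost
declare -p parts
```

If spaces must be preserved, one of the index-based loops from the other answers is a safer choice.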

I have found that the following works the best:
array=( `echo string | grep -o . ` )
(note the backticks)
then if you do: `echo ${array[@]}`, you get: s t r i n g
or: `echo ${array[2]}`, you get: r

- Good solution. How do the backticks work (in this case and in other use cases)? – Itzik Chaimov Oct 20 '22 at 05:33
- The backticks mean: execute the command between the two backticks and replace the backticks and what's in between with the output of that command. The parentheses create an array and assign it to the variable "array" explicitly, as the person asking the question requested. – AZAhmed Oct 21 '22 at 19:59
Pure Bash solution with no loop:
#!/usr/bin/env bash
str='The quick brown fox jumps over a lazy dog.'
# Need extglob for the replacement pattern
shopt -s extglob
# Split string characters into array (skip first record)
# Character 037 (octal) is the ASCII Unit Separator; it is unlikely to appear
# in the string, so every other character, including spaces, is captured.
IFS= mapfile -s1 -t -d $'\37' array <<<"${str//?()/$'\37'}"
# Strip out captured trailing newline of here-string in last record
array[-1]="${array[-1]%?}"
# Debug print array
declare -p array

- Nice, you could create a function and post some samples! – F. Hauri - Give Up GitHub Aug 25 '21 at 14:11
string=hello123
for i in $(seq 0 $((${#string} - 1)))
do array[$i]=${string:$i:1}
done
echo "zero element of array is [${array[0]}]"
echo "entire array is [${array[@]}]"
The zero element of array is [h]. The entire array is [h e l l o 1 2 3].

- These substring extraction operations are superior to equivalent solutions that involve piping the string through a subprocess. – sdenham Feb 17 '19 at 16:19
Yet another one :). The stated question simply says 'Split string into character array' and doesn't say much about the state of the receiving array, or about special chars and control chars.
My assumption is that if I want to split a string into an array of chars, I want the receiving array to contain just that string, with no leftovers from previous runs, yet preserve any special chars.
For instance, the proposed solution family like
for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
leaves leftovers in the target array:
$ y=(1 2 3 4 5 6 7 8)
$ x=abc
$ for (( i=0 ; i < ${#x} ; i++ )); do y[i]=${x:i:1}; done
$ printf '%s ' "${y[@]}"
a b c 4 5 6 7 8
Besides, writing that long line each time we want to split a string gets tedious, so why not hide all of this in a function that we can keep in a package source file, with an API like
s2a "Long string" ArrayName
I got this one that seems to do the job.
$ s2a()
> { [ "$2" ] && typeset -n __=$2 && unset $2;
> [ "$1" ] && __+=("${1:0:1}") && s2a "${1:1}"
> }
$ a=(1 2 3 4 5 6 7 8 9 0) ; printf '%s ' "${a[@]}"
1 2 3 4 5 6 7 8 9 0
$ s2a "Split It" a ; printf '%s ' "${a[@]}"
S p l i t I t
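For comparison, the same API can be sketched without recursion. This is a hypothetical helper (the name `str_to_chars` and its internals are mine, not the answer's), assuming bash 4.3+ for namerefs; the caller's array must not itself be named `_out_ref`:

```shell
#!/usr/bin/env bash
# Hypothetical helper: str_to_chars STRING ARRAY_NAME
# Assumption: bash 4.3+ (local -n).
str_to_chars() {
  local -n _out_ref=$2        # nameref to the caller's array
  local s=$1 i
  _out_ref=()                 # discard leftovers from previous runs
  for (( i = 0; i < ${#s}; i++ )); do
    _out_ref+=("${s:i:1}")    # quoted, so spaces are preserved
  done
}

a=(1 2 3 4 5 6 7 8 9 0)
str_to_chars "Split It" a
declare -p a
```

An iterative loop avoids one function call per character, which may matter for long strings.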

If the text can contain spaces:
eval a=( $(echo "this is a test" | sed "s/\(.\)/'\1' /g") )

- Use the information from http://stackoverflow.com/a/7581114/394952 to display the characters stored in the array "a". Like this: `eval a=( $(echo "this is a test" | sed "s/\(.\)/'\1' /g") );v=0; echo Array: "${a[@]}"; while [[ $v -lt ${#a[@]} ]];do echo -ne "$v:\t" ; echo ${a[$v]}; let v=v+1;done` – Menachem Jul 18 '13 at 06:56
$ echo hello | awk NF=NF FS=
h e l l o
Or
$ echo hello | awk '$0=RT' RS=[[:alnum:]]
h
e
l
l
o

- Warning: The result of using a null FS changes with awk implementation. It is explicitly [avoided by POSIX](http://pubs.opengroup.org/onlinepubs/9699919799/utilities/awk.html#tag_20_06_13_04): «1.- If FS is a null string, the behavior is unspecified.». More specific: This fails: `echo hello | original-awk NF=NF FS=` – Aug 18 '15 at 22:38
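Given that a null `FS` is unspecified, a variant that relies only on `length` and `substr` should behave the same across awk implementations. A minimal sketch (the wrapper name `split_chars_awk` is hypothetical); note that in some awks and locales `substr` works on bytes rather than characters:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: prints each character of every input line on its own line.
split_chars_awk() {
  awk '{ n = length($0); for (i = 1; i <= n; i++) print substr($0, i, 1) }'
}

out=$(printf '%s\n' hello | split_chars_awk)
printf '%s\n' "$out"
```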
This is an old thread, but bash v5.2+ adds a new feature, the shell option `patsub_replacement`, which can be combined with the `=~` regex operator. It is more or less the same as @mr.spuratic's answer.
str='There can be only one, the Highlander.'
regexp="${str//?/(&)}"
[[ "$str" =~ $regexp ]] &&
printf '%s\n' "${BASH_REMATCH[@]:1}"
Or simply dump the whole array (which includes the whole string at index 0):
declare -p BASH_REMATCH
If that is not desired, one can remove the first element (index 0) with
unset -v 'BASH_REMATCH[0]'
instead of skipping it when printing the array `BASH_REMATCH` with `printf` or `echo`.
One can inspect the value of the variable `regexp` with either
declare -p regexp
Output
declare -- regexp="(T)(h)(e)(r)(e)( )(c)(a)(n)( )(b)(e)( )(o)(n)(l)(y)( )(o)(n)(e)(,)( )(t)(h)(e)( )(H)(i)(g)(h)(l)(a)(n)(d)(e)(r)(.)"
or
echo "$regexp"
When using it in a script, one might want to test whether the `shopt` option is enabled, although the manual says it is enabled by default. Something like:
if ! shopt -q patsub_replacement; then
shopt -s patsub_replacement
fi
But yeah, check the `bash` version too, if you're not sure which version of `bash` is in use.
if ! (( BASH_VERSINFO[0] > 5 || (BASH_VERSINFO[0] == 5 && BASH_VERSINFO[1] >= 2) )); then
printf 'No dice! bash version 5.2+ is required!\n' >&2
exit 1
fi
Spaces can be excluded from the `regexp` variable; change it from
regexp="${str//?/(&)}"
To
regexp="${str//[! ]/(&)}"
and the output is:
declare -- regexp="(T)(h)(e)(r)(e) (c)(a)(n) (b)(e) (o)(n)(l)(y) (o)(n)(e)(,) (t)(h)(e) (H)(i)(g)(h)(l)(a)(n)(d)(e)(r)(.)"
- Maybe not as efficient as the other post/answer but it is still a solution/option.

For those who landed here searching how to do this in fish:
We can use the builtin `string` command (since v2.3.0) for string manipulation.
↪ string split '' abc
a
b
c
The output is a list, so array operations will work.
↪ for c in (string split '' abc)
echo char is $c
end
char is a
char is b
char is c
Here's a more complex example iterating over the string with an index.
↪ set --local chars (string split '' abc)
for i in (seq (count $chars))
echo $i: $chars[$i]
end
1: a
2: b
3: c

If you also need support for strings with newlines, you can do:
str2arr(){ local string="$1"; mapfile -d $'\0' Chars < <(for i in $(seq 0 $((${#string}-1))); do printf '%s\u0000' "${string:$i:1}"; done); printf '%s' "(${Chars[*]@Q})" ;}
string=$(printf '%b' "apa\nbepa")
declare -a MyString=$(str2arr "$string")
declare -p MyString
# prints declare -a MyString=([0]="a" [1]="p" [2]="a" [3]=$'\n' [4]="b" [5]="e" [6]="p" [7]="a")
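As a response to Alexandro de Oliveira, I think the following is more elegant or at least more intuitive (`IFS=` keeps spaces, and feeding the string through `printf` avoids the empty trailing element that the newline appended by a here-string would produce):
while IFS= read -r -n1 c ; do arr+=("$c") ; done < <(printf '%s' "hejsan")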

declare -r some_string='abcdefghijklmnopqrstuvwxyz'
declare -a some_array
declare -i idx
for ((idx = 0; idx < ${#some_string}; ++idx)); do
some_array+=("${some_string:idx:1}")
done
for idx in "${!some_array[@]}"; do
echo "$((idx)): ${some_array[idx]}"
done

I know this is a "bash" question, but please let me show you the perfect solution in zsh, a shell very popular these days:
string='this is a string'
string_array=(${(s::)string}) #Parameter expansion. And that's it!
print ${(t)string_array} # -> array (the type)
print $#string_array # -> 16 (number of items)

Pure bash, no loop.
Another solution, similar to/adapted from Léa Gris' solution, but using `read -a` instead of `readarray`/`mapfile`:
#!/usr/bin/env bash
str='azerty'
# Need extglob for the replacement pattern
shopt -s extglob
# Split string characters into array
# ${str//?()/$'\x1F'} replace each character "c" with "^_c".
# ^_ (Control-_, 0x1f) is Unit Separator (US), you can choose another
# character.
IFS=$'\x1F' read -ra array <<< "${str//?()/$'\x1F'}"
# now, array[0] contains an empty string and the rest of array (starting
# from index 1) contains the original string characters :
declare -p array
# Or, if you prefer to keep the array "clean", you can delete
# the first element and pack the array :
unset array[0]
array=("${array[@]}")
declare -p array
However, I prefer the shorter version (which is also easier for me to understand), where we remove the initial 0x1f before assigning the array:
#!/usr/bin/env bash
str='azerty'
shopt -s extglob
tmp="${str//?()/$'\x1F'}" # same as code above
tmp=${tmp#$'\x1F'} # remove initial 0x1f
IFS=$'\x1F' read -ra array <<< "$tmp" # assign array
declare -p array # verification

If you want to store this in an array, you can do this:
string=foo
unset chars
declare -a chars
while read -N 1
do
chars[${#chars[@]}]="$REPLY"
done <<<"$string"x
unset chars[$((${#chars[@]} - 1))]
unset chars[$((${#chars[@]} - 1))]
echo "Array: ${chars[@]}"
Array: f o o
echo "Array length: ${#chars[@]}"
Array length: 3
The final `x` is necessary to handle the fact that a newline is appended after `$string` if it doesn't contain one.
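A possible way to avoid both the sentinel and the two `unset` calls is to feed the loop from `printf '%s'`, which adds no trailing newline. This is a sketch under the assumption that bash 4.1+ is available (for `read -N`); `ch` is an illustrative variable name:

```shell
#!/usr/bin/env bash
string=$'a\nb'

chars=()
# read -N 1 takes exactly one character and does not treat newline specially;
# IFS= and -r keep whitespace and backslashes as they are.
while IFS= read -r -N 1 ch; do
  chars+=("$ch")
done < <(printf '%s' "$string")

declare -p chars
```

The loop ends when `read` hits end of input, so nothing needs to be removed from the array afterwards. NUL bytes still cannot be stored, since bash variables cannot hold them.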
If you want to use NUL-separated characters, you can try this:
echo -n "$string" | while read -N 1
do
printf %s "$REPLY"
printf '\0'
done
