I want underscores between words to be replaced with spaces, and leading and trailing underscores left alone. For example:
__hello_world_a_b___
hello___world
should become
__hello world a b___
hello world
Using Bash with its regular expression support:
string='__hello_world_a_b___'
[[ $string =~ ^(_*)(.*[^_])(_*)$ ]]
echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"
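To see how the pattern splits the string, here is a small sketch that prints each capture group. The helper name show_groups is made up for illustration; only the regex comes from the answer above.

```bash
#!/bin/bash
# Hypothetical helper: show what each capture group holds after the match.
show_groups() {
    local string=$1
    if [[ $string =~ ^(_*)(.*[^_])(_*)$ ]]; then
        # Group 1: leading underscores, group 2: everything up to the last
        # non-underscore character, group 3: trailing underscores.
        printf 'leading=[%s] middle=[%s] trailing=[%s]\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}"
    else
        printf 'no match\n'
    fi
}

show_groups '__hello_world_a_b___'
# leading=[__] middle=[hello_world_a_b] trailing=[___]
```

Only the middle group goes through the ${...//_/ } substitution, which is why the outer underscores survive.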
To check that it works, let's make a script that will take the string as argument:
#!/bin/bash
string=$1
[[ $string =~ ^(_*)(.*[^_])(_*)$ ]]
echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"
Call this script banana, run chmod +x banana, and let's go:
$ ./banana '__hello_world_a_b___'
__hello world a b___
$ ./banana '__hello_world_a_b'
__hello world a b
$ ./banana 'hello_world_a_b___'
hello world a b___
$ ./banana 'hello_world_a_b'
hello world a b
$ ./banana '___'
$ # the previous output is empty
$ ./banana $'___hello_world_with\na_newline___'
___hello world with
a newline___
$ ./banana 'hello___world'
hello world
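The output for '___' is empty because the pattern requires at least one non-underscore character, so the match fails and BASH_REMATCH stays empty. If you would rather have such strings printed unchanged, a minimal sketch is to check the result of the match. The function name inner_underscores_to_spaces is hypothetical, and printing the input as is on a failed match is an assumption about what you want.

```bash
#!/bin/bash
# Hypothetical helper: replace inner underscores with spaces,
# keeping leading and trailing underscores.
inner_underscores_to_spaces() {
    local string=$1
    if [[ $string =~ ^(_*)(.*[^_])(_*)$ ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"
    else
        # No non-underscore character (empty or all underscores):
        # assumed behavior is to print the string unchanged.
        printf '%s\n' "$string"
    fi
}

inner_underscores_to_spaces '__hello_world_a_b___'   # __hello world a b___
inner_underscores_to_spaces '___'                    # ___
```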
You could simply use the Perl command below, which uses the backtracking control verbs (*SKIP)(*F).
$ echo "hello___world" | perl -pe 's/(?:^_+|_+$)(*SKIP)(*F)|_/ /g'
hello world
$ echo "__hello_world_a_b___" | perl -pe 's/(?:^_+|_+$)(*SKIP)(*F)|_/ /g'
__hello world a b___
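The first alternative matches the leading or trailing run of underscores, and (*SKIP)(*F) then forces that match to fail without retrying inside the run. So the regex above ends up matching every _ except the leading and trailing ones.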
Another pure Bash possibility that uses extended globs instead of regular expressions, in a very pedestrian way:
#!/bin/bash
shopt -s extglob
string=$1
wo_leading=${string##+(_)}
wo_underscore=${wo_leading%%+(_)}
printf -v leading '%*s' "$((${#string}-${#wo_leading}))"
printf -v trailing '%*s' "$((${#wo_leading}-${#wo_underscore}))"
echo "${leading// /_}${wo_underscore//_/ }${trailing// /_}"
The variable wo_leading will contain the string without leading underscores, and the variable wo_underscore will contain the string without leading and trailing underscores. From there, it's easy to get the number of leading and trailing underscores from the differences in length, replace underscores with spaces in wo_underscore, and put everything back together.
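To make those intermediate values visible, here is a small sketch that prints them for one input. The helper name trace_split is made up for illustration; the expansions are the same ones used in the script above.

```bash
#!/bin/bash
shopt -s extglob   # enable +(_) patterns before they are used

# Hypothetical helper: print the intermediate values for one input string.
trace_split() {
    local string=$1 wo_leading wo_underscore
    wo_leading=${string##+(_)}          # strip the longest leading run of _
    wo_underscore=${wo_leading%%+(_)}   # then strip the longest trailing run
    printf 'wo_leading=%s\n' "$wo_leading"
    printf 'wo_underscore=%s\n' "$wo_underscore"
    # The length differences give the number of underscores removed.
    printf 'leading=%d trailing=%d\n' \
        "$(( ${#string} - ${#wo_leading} ))" \
        "$(( ${#wo_leading} - ${#wo_underscore} ))"
}

trace_split '__hello_world_a_b___'
# wo_leading=hello_world_a_b___
# wo_underscore=hello_world_a_b
# leading=2 trailing=3
```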
Another Perl answer:
perl -pe 's/(?<=[^\W_])(_+)(?=[^\W_])/ " " x length($1) /ge' <<END
__hello_world_a_b___
hello___world
END
__hello world a b___
hello world
That is: replace each sequence of underscores that is preceded by a word character other than underscore and followed by a word character other than underscore with the same number of spaces.
If you have GNU awk, you can do it with
awk '{match($0,"^(_*)(.*[^_])(_*)$",arr); print arr[1] gensub("_"," ","g",arr[2]) arr[3]}'