Bash - replacing underscores with spaces, except leading/trailing ones

Question

I want underscores between words to be replaced with spaces, and leading and trailing underscores left alone. For example:

__hello_world_a_b___
hello___world

should become

__hello world a b___
hello   world

-1 for "with arrays and functions". If that is a legitimate requirement, you should explain it; if not, you should remove it (and simply ask for the best approach). — ruakh, Nov 02 '14 at 01:30

score 1 · Accepted Answer · answered Nov 02 '14 at 00:20

Using Bash with its regular expression support:

string='__hello_world_a_b___'
[[ $string =~ ^(_*)(.*[^_])(_*)$ ]]
echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"

To check that it works, let's make a script that takes the string as an argument:

#!/bin/bash

string=$1
[[ $string =~ ^(_*)(.*[^_])(_*)$ ]]
echo "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"

Save this script as banana, make it executable with chmod +x banana, and run it:

$ ./banana '__hello_world_a_b___'
__hello world a b___
$ ./banana '__hello_world_a_b'
__hello world a b
$ ./banana 'hello_world_a_b___'
hello world a b___
$ ./banana 'hello_world_a_b'
hello world a b
$ ./banana '___'

$ # the previous output is empty: the regex requires at least one non-underscore character, so nothing matches
$ ./banana $'___hello_world_with\na_newline___'
___hello world with
a newline___
$ ./banana 'hello___world'
hello   world
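
The empty output for '___' comes from the regex not matching, which leaves BASH_REMATCH empty. A minimal sketch of a guarded variant follows; the function name underscores_to_spaces and the fallback branch are assumptions added for illustration, not part of the original answer:

```bash
#!/bin/bash

# Hypothetical helper: same regex as above, plus a fallback for inputs
# that contain no non-underscore character (where the regex cannot match).
underscores_to_spaces() {
    local string=$1
    local re='^(_*)(.*[^_])(_*)$'
    if [[ $string =~ $re ]]; then
        # Group 1: leading underscores, group 2: the words, group 3: trailing underscores.
        printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"
    else
        # Empty or all-underscore input: no underscore sits between words, print it as is.
        printf '%s\n' "$string"
    fi
}

underscores_to_spaces '___'
underscores_to_spaces '__hello_world_a_b___'
```

With this guard, the two calls print ___ and __hello world a b___ respectively. Testing the return status of [[ ... =~ ... ]] also avoids reusing stale BASH_REMATCH contents from an earlier match.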

score 0 · Answer 2 · edited May 23 '17 at 10:25

You could simply use the Perl command below, which uses the PCRE verbs (*SKIP)(*F).

$ echo "hello___world" | perl -pe 's/(?:^_+|_+$)(*SKIP)(*F)|_/ /g'
hello   world
$ echo "__hello_world_a_b___" | perl -pe 's/(?:^_+|_+$)(*SKIP)(*F)|_/ /g'
__hello world a b___

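The first alternative matches a leading or trailing run of underscores; (*SKIP)(*F) then forces that match to fail and stops the regex engine from retrying inside the run. Only the remaining underscores reach the second alternative, _, so the regex matches every _ except the leading and trailing ones.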

answered Nov 02 '14 at 01:04 · Avinash Raj · edited by Community

The PCRE verbs are all a bit subtle, and not very widely understood. So, this is subjective, but I think it's better to write something like `perl -pe 'if (m/^(_*)([^_].*[^_\n])(_*\n?)\z/) { my ($leading, $words, $trailing) = ($1, $2, $3); $words =~ s/_/ /g; $_ = "$leading$words$trailing" }'`. – ruakh Nov 02 '14 at 01:44
@ruakh That was already suggested by eckes, but he deleted his answer. If you understand the PCRE verbs above, you don't need to write long code like that. – Avinash Raj Nov 02 '14 at 01:51

score 0 · Answer 3 · answered Nov 02 '14 at 01:24

Another pure Bash possibility that uses extended globs instead of regular expressions, in a very pedestrian way:

#!/bin/bash

shopt -s extglob

string=$1

wo_leading=${string##+(_)}
wo_underscore=${wo_leading%%+(_)}

printf -v leading '%*s' "$((${#string}-${#wo_leading}))"
printf -v trailing '%*s' "$((${#wo_leading}-${#wo_underscore}))"

echo "${leading// /_}${wo_underscore//_/ }${trailing// /_}"

The variable wo_leading will contain the string without its leading underscores, and the variable wo_underscore will contain the string without its leading and trailing underscores. From there, it's easy to get the number of leading and trailing underscores from the differences in length, replace the underscores with spaces in wo_underscore, and put everything back together.
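
To make the intermediate values concrete, here is a small sketch that wraps the same steps in a function and prints each of them. The function name trace_underscores and the count variables n_leading and n_trailing are hypothetical names introduced for this illustration only:

```bash
#!/bin/bash
shopt -s extglob   # needed for the +(_) patterns

# Hypothetical wrapper around the steps above that prints every intermediate value.
trace_underscores() {
    local string=$1
    local wo_leading=${string##+(_)}          # leading underscores stripped
    local wo_underscore=${wo_leading%%+(_)}   # trailing underscores stripped too
    local n_leading=$(( ${#string} - ${#wo_leading} ))
    local n_trailing=$(( ${#wo_leading} - ${#wo_underscore} ))
    local leading trailing
    # Build runs of spaces of the right width, turned into underscores below.
    printf -v leading '%*s' "$n_leading" ''
    printf -v trailing '%*s' "$n_trailing" ''
    printf 'wo_leading=%s\n' "$wo_leading"
    printf 'wo_underscore=%s\n' "$wo_underscore"
    printf 'leading=%d trailing=%d\n' "$n_leading" "$n_trailing"
    printf 'result=%s\n' "${leading// /_}${wo_underscore//_/ }${trailing// /_}"
}

trace_underscores '__hello_world_a_b___'
```

For this input the trace shows wo_leading=hello_world_a_b___, wo_underscore=hello_world_a_b, leading=2 trailing=3, and result=__hello world a b___. An all-underscore string such as '___' is counted entirely as leading underscores and comes back unchanged.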

score 0 · Answer 4 · answered Nov 02 '14 at 02:39

Another Perl answer:

perl -pe 's/(?<=[^\W_])(_+)(?=[^\W_])/ " " x length($1) /ge' <<END
__hello_world_a_b___
hello___world
END

__hello world a b___
hello   world

That is: each sequence of underscores that is preceded and followed by a word character other than an underscore is replaced by the same number of spaces.

score 0 · Answer 5 · answered Nov 02 '14 at 20:03

If you have GNU awk, you can do it with

awk '{match($0,"^(_*)(.*[^_])(_*)$",arr); print arr[1] gensub("_"," ","g",arr[2]) arr[3]}'
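
If GNU awk is not available, roughly the same line-by-line filtering can be sketched in plain Bash with the same three-group regex. The function name underscore_filter is hypothetical, and the fallback branch (printing a line unchanged when the regex does not match) is an addition for this sketch:

```bash
#!/bin/bash

# Hypothetical filter: applies the three-group regex to each line of stdin.
underscore_filter() {
    local re='^(_*)(.*[^_])(_*)$' line
    # "|| [[ -n $line ]]" also processes a last line that lacks a trailing newline.
    while IFS= read -r line || [[ -n $line ]]; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}${BASH_REMATCH[2]//_/ }${BASH_REMATCH[3]}"
        else
            printf '%s\n' "$line"   # empty or all-underscore line: left unchanged
        fi
    done
}

printf '%s\n' '__hello_world_a_b___' 'hello___world' | underscore_filter
```

A shell loop like this is usually much slower than awk on large inputs, so it is best suited to a handful of lines.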

answered Nov 02 '14 at 20:03 · Vytenis Bivainis
