28

I am trying to write a regular expression that replaces each run of one or more '+' symbols in a file with a single space. I tried the following:

 echo This++++this+++is+not++done | awk '{ sub(/\++/, " "); print }'
 This this+++is+not++done

Expected:

This this is not done

Any ideas why this did not work?

alex
Ajay Nair

8 Answers

45

Use gsub which does global substitution:

echo This++++this+++is+not++done | awk '{gsub(/\++/," ");}1'

The sub function replaces only the first match; to replace all matches, use gsub.
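
To make the difference concrete, here is a minimal sketch that runs the question's input through both functions. The variable names (`line`, `first_only`, `all_runs`) are only illustrative and are not part of the original answer.

```shell
line='This++++this+++is+not++done'

# sub() stops after the first run of '+' it finds on the line.
first_only=$(printf '%s\n' "$line" | awk '{ sub(/\++/, " "); print }')

# gsub() keeps going and replaces every run of '+' on the line.
all_runs=$(printf '%s\n' "$line" | awk '{ gsub(/\++/, " "); print }')

printf '%s\n' "$first_only"   # This this+++is+not++done
printf '%s\n' "$all_runs"     # This this is not done
```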

Guru
  • 1
    what is the `1` at the end for? – Elie G. Oct 17 '19 at 17:11
  • 6
    @DrunkenPoney The `1` at the end tells AWK to print out the line after it's been `gsub`ed on. You can avoid it by explicitly adding a print statement to the same action as the `gsub`, e.g. `{gsub(/\++/," "); print;}` – Andrey Kaipov Oct 22 '19 at 18:14
  • Ugh in powershell you have to backslash the double quotes. – js2010 May 01 '21 at 15:17
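
As a rough illustration of the point about the trailing `1`: an awk pattern with no action defaults to printing the record, and `1` is a pattern that is always true. The sketch below compares the three variants; the variable names are assumptions made for the example.

```shell
line='This++++this+++is+not++done'

# Pattern "1" with no action: awk falls back to the default action, { print }.
with_one=$(printf '%s\n' "$line" | awk '{ gsub(/\++/, " ") } 1')

# The same thing with the print written out inside the action.
with_print=$(printf '%s\n' "$line" | awk '{ gsub(/\++/, " "); print }')

# With neither, the substitution still happens but nothing is written.
silent=$(printf '%s\n' "$line" | awk '{ gsub(/\++/, " ") }')

printf '[%s]\n' "$with_one" "$with_print" "$silent"
```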
15

Or the tr command:

echo This++++this+++is+not++done | tr -s '+' ' '
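
A small sketch of what `-s` contributes here, using illustrative variable names. Without it, every '+' becomes its own space; with it, repeated output spaces are squeezed into one. One caveat worth knowing: the squeeze also applies to runs of spaces that were already in the input.

```shell
line='This++++this+++is+not++done'

# Plain translation: one space per '+', so the runs stay as wide as they were.
plain=$(printf '%s\n' "$line" | tr '+' ' ')

# -s squeezes repeated spaces in the translated output down to a single space.
squeezed=$(printf '%s\n' "$line" | tr -s '+' ' ')

# Caveat: spaces that were already doubled in the input get squeezed as well.
mixed=$(printf '%s\n' 'a  b++c' | tr -s '+' ' ')

printf '[%s]\n' "$plain" "$squeezed" "$mixed"
```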
radical7
14

The idiomatic awk solution would be just to translate the input field separator to the output separator:

$ echo This++++this+++is+not++done | awk -F'++' '{$1=$1}1'
This this is not done
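
A hedged sketch of why this works: assigning to any field (even `$1=$1`) makes awk rebuild the record by joining the fields with OFS, which defaults to a single space. The example below writes the separator as the bracket expression `[+]+`, which is my own substitution for portability, since a regex that begins with a bare `+` is not handled the same way by every awk. The variable names are illustrative.

```shell
line='This++++this+++is+not++done'

# Split on runs of '+', then force the record to be rebuilt with the default OFS.
rebuilt=$(printf '%s\n' "$line" | awk -F'[+]+' '{ $1 = $1 } 1')

# The same rebuild with a different output separator.
dashed=$(printf '%s\n' "$line" | awk -F'[+]+' -v OFS='-' '{ $1 = $1 } 1')

# The number of fields awk found with this separator.
count=$(printf '%s\n' "$line" | awk -F'[+]+' '{ print NF }')

printf '%s\n' "$rebuilt" "$dashed" "$count"
```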
Ed Morton
7

Try this

echo "This++++this+++is+not++done" | sed -re 's/(\+)+/ /g'

Mirage
5

You could use sed too.

echo This++++this+++is+not++done | sed -e 's/+\{1,\}/ /g'

This matches one or more + and replaces it with a space.
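
For comparison, a sketch of equivalent spellings of "one or more +" in sed. In basic regular expressions a bare `+` is a literal character, so the repetition has to come from `\{1,\}` or from `++*`; with `-E` (extended syntax, assumed here to be supported by your sed, as it is in GNU and BSD sed) the `+` is a quantifier and the literal must be escaped. The variable names are illustrative.

```shell
line='This++++this+++is+not++done'

# BRE: literal '+' followed by the interval "one or more".
bre=$(printf '%s\n' "$line" | sed 's/+\{1,\}/ /g')

# BRE: literal '+' followed by zero or more further '+'.
bre_star=$(printf '%s\n' "$line" | sed 's/++*/ /g')

# ERE: escaped literal '+' followed by the '+' quantifier.
ere=$(printf '%s\n' "$line" | sed -E 's/\++/ /g')

printf '%s\n' "$bre" "$bre_star" "$ere"
```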

alex
3

For this case I recommend sed: it is powerful for substitution and has a concise syntax.

Solution sed:

echo This++++this+++is+not++done | sed -En 's/\++/ /gp'

Result:

This this is not done

For awk: you must use the gsub function for global substitution, that is, to replace every match on the line rather than only the first. The syntax is gsub(regexp, replacement [, target]). If the third parameter is omitted, $0 is the target. The target must be a variable, field, or array element; gsub modifies the target in place, overwriting each match with the replacement.

Solution awk:

echo This++++this+++is+not++done | awk 'gsub(/\++/," ")'

Result:

This this is not done
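
The optional third argument of gsub(regexp, replacement [, target]) and its return value can be sketched as follows. This is only an illustration with made-up inputs and variable names: gsub returns the number of substitutions it made, and the target can be a single field or an ordinary variable instead of $0.

```shell
# Return value: the number of substitutions made on the record.
count=$(printf '%s\n' 'This++++this+++is+not++done' |
  awk '{ n = gsub(/\++/, " "); print n }')

# Target a single field: only $2 is changed, $1 keeps its '+' characters.
field_out=$(printf '%s\n' 'a++b c++d' |
  awk '{ gsub(/\++/, "-", $2); print }')

# Target a variable instead of the input record.
var_out=$(awk 'BEGIN { s = "x+++y"; gsub(/\++/, " ", s); print s }')

printf '%s\n' "$count" "$field_out" "$var_out"
```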
buddemat
  • 1
    Your sed example fails because the backslash is not removed within single quotes. The example would work if you had used double quotes (or removed one of the backslashes). The awk example fails because there is no trailing quote mark. I did not test whether it would fail due to the double backslashes. – Mike Gleen Feb 16 '22 at 17:14
-1
echo "This++++this+++is+not++done" | sed 's/++*/ /g'
Opal
dsri
-7

If you have access to node on your computer you can do it by installing rexreplace

npm install -g rexreplace

and then run

rexreplace '\++' ' ' myfile.txt

Or, if you have several files in a directory named data, you can do

rexreplace '\++' ' ' data/*.txt
mathiasrw
    If you compare the size of npm to awk and how fast tr can replace compared with rexreplace then you might understand that this is using a bomb to open a nut. – Alexx Roche Sep 15 '17 at 11:00
  • 1
    It makes no sense to compare npm to awk. I assume you mean the size of rexreplace. The speed is for sure better with tr. The answer seeks to offer the convenience of using a tool where the syntax is more easy to grasp. It reminds me of the discussions when C was introduced, and people argued that the generated code was messy and slow compared to making it in assembly. – mathiasrw Oct 18 '17 at 20:34
  • 4
    @mathiasrw Often it's not about size or speed. It's usually about dependencies and maintaining a small footprint that will reliably work everywhere. awk is a rather small system utility already installed on just about every distro you'll encounter. node/npm are not yet there. – nonrectangular Jun 05 '19 at 18:29