-1

How can I match every whole word in the input text and replace it with a replacement string? For example, I would like a sed command that matches each word in the input and replaces it with "XXX".

echo "bar embarrassment recommended" | sed "Please fill right answer here"

I would like output like this

XXX XXX XXX

I am new to Linux, so there may be another command that works better in this situation. Any recommendations?

The definition of a word in this question is very simple: it can contain the letters a-z or A-Z and underscores, and any two words are separated by whitespace.

Examples

this is a sample text
This Is A Sample Text
this_ is_ a_ sample_ text_
jww
brownfox
  • `sed "s/please_fill_your_regexp_here/XXX/g"` – Karoly Horvath May 14 '15 at 15:05
  • The answers so far all assume a hyphenated word that spans lines should be treated as two words. Recognizing that special case and treating it differently is very difficult in sed, and not likely to be necessary. But you should be aware of the assumption. – mpez0 May 14 '15 at 15:43
  • What is a `word` in your context? Is `17` a word? Is `that's` one word or 2? How about `as-is`? Are words ALWAYS separated by spaces? Should non-word characters be left as-is or deleted or replaced with blanks? Do you care about preserving the white space between words? Edit your question to define a `word` and include a few multi-word lines of sample input that you think would be difficult for a script to get right and the associated output. – Ed Morton May 14 '15 at 15:48
  • I see from your posted example that your input consists of a series of letters/underscores/blanks surrounded by double quotes (themselves separated from other double-quoted strings by blank chars again) and contained entirely on one line. OK, got it. Now what does the expected output look like? Does it remain all on one line? Are the double quotes to be retained or removed? Please edit your question to show PRECISELY the output associated with that input so we can test our possible solutions against your posted example input. – Ed Morton May 14 '15 at 19:57
  • Possible duplicate of [sed whole word search and replace](https://stackoverflow.com/q/1032023/608639) – jww May 27 '19 at 00:01

4 Answers

2

EDIT:

echo "Bar Embarrassment Recommended." | sed -r 's/\w+/XXX/g'

is what you are after. Your need to simply convert a full word to 3 X's was not clear upon my initial reading of your question.

matchew
  • The above command returns `XXX XXXXXXXXXXXXX XXXXXXXXXXX`, but I need `XXX XXX XXX`, where the number of XXX's equals the number of words in the input text – brownfox May 14 '15 at 15:10
  • there's nothing wrong with the original question (well, lack of effort maybe?:).. I mean, it's crystal clear... – Karoly Horvath May 14 '15 at 15:11
  • @Phaneendra I mean almost, but the punctuation is retained. That's what you asked for, no? Replace letters with X? Oh, your comment has been edited. OK, I see what you mean now. – matchew May 14 '15 at 15:13
  • @KarolyHorvath I didn't mean to indicate that his question wasn't clear. Simply indicating that his input might not always be so regular. And that the concept he is looking for is regular expressions. – matchew May 14 '15 at 15:14
  • @Phaneendra I've updated my answer to reflect your comment. Sorry about that. – matchew May 14 '15 at 15:16
  • @matchew I am running your latest command in the Mac terminal. The -r option is not supported on Mac, I suppose, so I will try your solution on Linux and check the result. My original question concerns Linux only, though. – brownfox May 14 '15 at 15:27
  • @Phaneendra sure, I'm using sed (GNU sed) 4.2.2 if it helps any – matchew May 14 '15 at 15:31
  • Don't use `[A-Za-z]` as its meaning is locale-dependent. If you want all letters then use the character class `[[:alpha:]]`. – Ed Morton May 14 '15 at 15:49
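
Pulling the comments above together, here is a minimal sketch that sticks to the question's own definition of a word (letters and underscores only) and avoids both `-r` and `\w`, so it should not depend on GNU sed. The function name `mask_words` is made up for illustration and is not part of any answer.

```shell
# Hypothetical helper: replace every run of letters/underscores with XXX.
# [[:alpha:]_] follows the question's definition, so digits and
# punctuation are left untouched. Only POSIX basic regex syntax is used.
mask_words() {
  sed 's/[[:alpha:]_][[:alpha:]_]*/XXX/g'
}

echo "this_ is_ a_ sample_ text_" | mask_words
echo "Bar Embarrassment Recommended." | mask_words
```

The first line comes out as `XXX XXX XXX XXX XXX`; in the second, the trailing period is kept because it is not a word character under this definition.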
2

matchew's helpful answer contains a GNU sed solution, which is appropriate, given that your question is about Linux.

In a comment, you say you're on a Mac (OS X), which comes with BSD sed, which behaves differently in many respects.

The POSIX-compliant (and thus cross-platform) equivalent of matchew's command is:

echo "Bar Embarrassment Recommended." | sed 's/[[:alnum:]_]\{1,\}/XXX/g'

However, note that this will match words on any boundary, not just whitespace; i.e., a run of word characters adjoined by any non-word character will match, so that abc-de will turn into XXX-XXX, for instance.

mklement0
0

You can use awk. This replaces every word with xxx:

echo "bar embarrassment recommended" | awk '{for (i=1;i<NF;i++) printf "xxx ";print "xxx"}'
xxx xxx xxx
Jotne
  • This replaces all runs of _non-whitespace_ characters, which is not quite the same as what the OP asked for. Also, it normalizes whitespace, which may or may not be desired. – mklement0 May 14 '15 at 21:10
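
If preserving the original whitespace matters, one possible awk variant is to substitute in place with `gsub` instead of rebuilding the line field by field. This is only a sketch: the function name `mask_words_awk` is hypothetical, and `LC_ALL=C` is set on the assumption that the input is plain ASCII, so that the range `A-Za-z` means exactly the ASCII letters.

```shell
# Hypothetical helper: replace runs of letters/underscores in place,
# leaving spacing and punctuation exactly as they were.
# LC_ALL=C is an assumption that keeps the A-Za-z range ASCII-only.
mask_words_awk() {
  LC_ALL=C awk '{ gsub(/[A-Za-z_]+/, "XXX"); print }'
}

printf 'bar   embarrassment, recommended\n' | mask_words_awk
```

Unlike the field loop, this keeps the three spaces and the comma from the input and only swaps out the word characters.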
-1

You can match whole words using word boundaries, like this:

echo "bar embarrassment recommended" | sed 's/\b\S\+\b/XXX/g'
sotona
  • Does not change anything. – Jotne May 14 '15 at 15:28
  • The metacharacter `+` needs to be escaped, i.e. `\+`, unless the `-r` switch has been invoked. Also the `\b` metacharacters are unnecessary as `\S` means any character that is not a white space, i.e. the complement of `\s`. – potong May 14 '15 at 15:46
  • and only some seds will support `\b` or `\S`. – Ed Morton May 14 '15 at 15:50
  • Well, that's because I forgot to put my example in code tags. As for \S: it matches ANY non-whitespace character, but between \b and \b it shortens the notation and doesn't affect punctuation – sotona May 14 '15 at 15:54
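
To make the escaping point from the comments concrete, here is a hedged sketch of two spellings that avoid `\S`, `\b` and `\+`, which not every sed supports. It treats each whitespace-delimited token as one word, which is an assumption that differs slightly from the `\b`-based command above. The variable names are illustrative only.

```shell
input="bar embarrassment recommended"

# Basic regex (the default): a POSIX interval expression stands in for \+.
bre_script='s/[^[:space:]]\{1,\}/XXX/g'
echo "$input" | sed "$bre_script"

# Extended regex: a plain + needs no backslash. -E is accepted by both
# GNU sed and BSD sed.
ere_script='s/[^[:space:]]+/XXX/g'
echo "$input" | sed -E "$ere_script"
```

Both commands print `XXX XXX XXX` for this input.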