5

I'm trying to detect a pattern that has three parts:

  1. A space
  2. Either an "m" or a "t"
  3. Either a space or the end of a line

I want to keep #2 and #3. For example, I'd like to change "i m sure he doesn t" to "im sure he doesnt"

I'm having trouble expressing #3, since [ $] only seems to match spaces, not line-ends. Here's what I've tried:

$ echo "i m sure he doesn t" | sed 's/ \([mt]\)\([ $]\)/\1\2/g'
im sure he doesn t

How should I express "either a space or end of line" in the expression above? Thanks!

Moira

3 Answers

5

Space or end of line? Use |:

s/ \([mt]\)\( \|$\)/\1\2/g
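Inside a bracket expression, `$` is a literal dollar sign, which is why `[ $]` never matches the end of the line. As a minimal sketch of the same substitution in extended-regex syntax, assuming a sed that accepts `-E` (GNU and BSD sed both do) and using a hypothetical helper name, `fix_contractions`:

```shell
# Hypothetical helper (not from the original answer): joins a stray " m" or " t"
# onto the preceding word when it is followed by a space or the end of the line.
fix_contractions() {
  # -E selects extended regexes, so ( ) and | are written without backslashes.
  # Group 1 keeps the letter; group 2 keeps the space (or nothing at end of line).
  sed -E 's/ ([mt])( |$)/\1\2/g'
}

echo "i m sure he doesn t" | fix_contractions
# prints: im sure he doesnt
```

The `g` flag matters here: without it only the first match on each line is replaced.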
choroba
  • This one doesn't work for me. It still ignores the end of line: `$ echo "i m sure he doesn t" | sed 's/ \([mt]\)\( \|$\)/\1\2/'` prints `im sure he doesn t`. – Moira Jan 03 '13 at 00:46
  • Are there different flavors of unix or sed that would be giving me different results? If so, how do I figure out which one I'm using? – Moira Jan 03 '13 at 00:52
  • @Moira: Just add the `/g` at the end. – choroba Jan 03 '13 at 12:32
  • It took me quite a while to discover that this didn't work for me, because I was using the pipe (`|`) as a separator, which doesn't seem to mix well with the logical OR (`\|`). Using `/` (or `:`) as a separator fixes that. – AstroFloyd Jan 06 '23 at 18:26
  • @AstroFloyd: You can always use `|` and `\|` to mean the different things, but it makes the expression even less readable than usual. – choroba Jan 09 '23 at 09:39
  • @choroba Your `echo "i m sure he doesn t" | sed 's/ \([mt]\)\( \|$\)/\1\2/g'` works for me (using GNU sed), but as soon as I replace your `/` with `|` it stops working... – AstroFloyd Jan 10 '23 at 09:23
  • @AstroFloyd You need to switch to extended regexes for it to work: `sed -E 's| ([mt])( \|$)|\1\2|g'` – choroba Jan 10 '23 at 10:33
  • @choroba Indeed (and remove the backslashes from `\(...\)`). The fact remains that `|` as a delimiter and `\|` as logical OR each work on their own without `-E`, but not together. I thought it might be a useful comment for others using `|` to read this instead of figuring it out. No criticism of your answer, just a comment :-) – AstroFloyd Jan 10 '23 at 10:59
2

Just matching space, then m or t, then space or newline won't catch cases with punctuation, e.g. a missing ' in "please don t!". A more general solution is to use word boundaries instead:

echo "i m sure he doesn t test test don t." | sed 's/ \([mt]\)[[:>:]]/\1/g'

The funky [[:>:]] (end of word) is required on OS X (which I use); see Larry Gerndt's answer to "sed whole word search and replace". On other sed flavors you may be able to use \b (any word boundary) or \> (end of word) instead.

# example with word boundary
echo "i m sure he doesn t test test don t." | sed 's/ \([mt]\)[[:>:]]/\1/g'
im sure he doesnt test test dont.
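For comparison, a sketch of the same idea with `\>`, assuming GNU sed (`\>` is a GNU extension and is not expected to work with the stock OS X sed); the function name `join_mt` is hypothetical:

```shell
# Assumes GNU sed: \> matches the empty string at the end of a word,
# so a following space, punctuation mark, or end of line all qualify.
join_mt() {
  sed 's/ \([mt]\)\>/\1/g'
}

echo "please don t! i m sure he doesn t" | join_mt
# prints: please dont! im sure he doesnt
```

Words that merely start with "m" or "t", such as "test", are left alone because the letter is not followed by an end-of-word boundary.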
Anders Johansson
0

Make the last space optional:

sed 's/[ ]\([mt][ ]\?\)$/\1/' input

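POSIX-friendly version (POSIX intervals need an explicit lower bound, so `\{0,1\}` rather than the GNU shorthand `\{,1\}`):

sed 's/[ ]\([mt][ ]\{0,1\}\)$/\1/' input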
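Both commands above are anchored with `$`, so they only act on an " m" or " t" at the end of a line. To cover the mid-line case as well without GNU-only operators, one option is two plain substitutions. This is a sketch rather than part of the original answer, and the function name `join_mt_portable` is hypothetical:

```shell
# Hypothetical helper using only basic POSIX regex features.
join_mt_portable() {
  # First expression: " m " or " t " in the middle of a line (keep the trailing space).
  # Second expression: " m" or " t" at the very end of a line.
  sed -e 's/ \([mt]\) /\1 /g' -e 's/ \([mt]\)$/\1/'
}

echo "i m sure he doesn t" | join_mt_portable
# prints: im sure he doesnt
```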
perreal