I've looked through some related questions and haven't been able to find a solution to my problem, so I'm sorry if this is a duplicate of something I missed.

Basically, I have the following GNU sed command:

sed -E -imr 's/^(\w)+/(\w)+$/g' file

which is supposed to replace the first word of a line with the last word of the line.

The first regex, ^(\w)+, works and matches the first word of each line. The problem is that the command replaces that first word with the literal string (w)+$.

I've tried to escape the backslash, the parentheses, the operators, but I've had no luck in making the regex work in the output part of the command.

Can one use a regex capture group to replace a different regex capture group? What needs to be escaped, or what alternative syntax needs to be used?

NOTE

I'm using GNU sed on macOS, installed with brew (coreutils), so answers to this question might not work with other versions of sed, such as the native BSD sed on macOS.

Ezra Goss
  • Try `sed -E -imr 's/^(\w+)(.*\W)(\w+)$/\3\2\1/g' file` (if `\w` and `\W` work at all, maybe `[[:alnum:]]` and `[^[:alnum:]]` will be better). – Wiktor Stribiżew Jul 01 '17 at 21:05
  • Ah, I see. So is the answer that you can't use regex except for backreferences in the output? Correct me if I'm wrong but basically you are segmenting the whole line into different capture groups and just switched the first and last using the backreferences, right? – Ezra Goss Jul 01 '17 at 21:10
  • You can't use regex syntax in the replacement field; it takes plain text plus a few special symbols (e.g. `\1` returns the first capturing group). – logi-kal Jul 01 '17 at 21:10
  • Thanks for answering. On a side note, why is my question being downvoted? – Ezra Goss Jul 01 '17 at 21:15
  • What is the m switch? – Casimir et Hippolyte Jul 01 '17 at 21:22
  • @CasimiretHippolyte I think it's because he's using GNU sed on macOS. The native sed command has no such option. There is, of course, a solution that will work with BSD sed, but he's not using that version. – Jeff Holt Jul 01 '17 at 21:24
  • @CasimiretHippolyte It's for reading a string as multiline -- and yes, it's a GNU sed command – Ezra Goss Jul 01 '17 at 21:25
  • @jeff6times7: I didn't find any reference to that. If you have one, feel free to share it. I use GNU sed on Linux myself, and the man page doesn't mention it. – Casimir et Hippolyte Jul 01 '17 at 21:36
  • @ezra I upvoted your question taking it from -2 to -1. I think one of the reasons it got voted down might be the nouns you used in the title. If I had written the title it would have read "Replace one capture group with another using GNU sed", possibly even including the version of sed. – Jeff Holt Jul 01 '17 at 21:38
  • @jeff6times7 Thanks for the vote, that makes sense. I feel like part of the issue of not being able to find the answer online is not having the vocabulary to search for the right answer. It's interesting that some people downvote over that but I also get why a misused phrase in a title can be annoying. Thanks for the heads up. edit -- also I changed the title to reflect your suggestion. – Ezra Goss Jul 01 '17 at 21:40
  • I have the same problem with the whole learning process. Patience is a virtue for ALL involved. Getting kinda meta so maybe I should stop. – Jeff Holt Jul 01 '17 at 21:42
  • To bring it back to the question at hand, I'm also having trouble finding the man page entry for the -m flag. I think the flag might actually be JavaScript-specific, as that's the dominant search result and the mode I was using on regextester. Nonetheless, it doesn't seem important or necessary for what I was trying, and I'm not sure why it works as a flag when I'm not using JavaScript. – Ezra Goss Jul 01 '17 at 21:48
  • @CasimiretHippolyte I'm sorry if I'm not up to speed, but when you used the word "that" I got lost. Do you mean the GNU/Linux man page doesn't refer to the -m option? If so, I agree; I don't see it documented there either. Then again, sed is notorious for its documentation. But I got a syntax error on my Fedora 20 system when I tried sed -E -m 's/abc/def/' /dev/null, so I have no idea which version of sed the OP is using: I'm on 4.2.2 and it complains about -m. Odd. Very odd. – Jeff Holt Jul 01 '17 at 21:54
  • @jeff6times7: no problem. When you have doubts about the meaning of "that", translate it to `$_` in Perl. – Casimir et Hippolyte Jul 01 '17 at 22:18
  • @CasimiretHippolyte Now THAT's funny. – Jeff Holt Jul 01 '17 at 22:37
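
On the -m question raised in the comments above: as far as I can tell from the GNU sed documentation, -i accepts an optional backup suffix written directly after it, so -imr would be read as -i with the suffix "mr" rather than as three separate flags. That would also explain why a standalone -m is rejected. A small sketch to check this; the directory, file name, and contents are made up for illustration, and check_imr is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: see how sed treats the option cluster "-imr" from the question.
# check_imr DIR writes a sample file into DIR and edits it in place.
check_imr() {
    printf 'alpha beta gamma\n' > "$1/file"
    # Same options as in the question, with a harmless substitution.
    sed -E -imr 's/alpha/ALPHA/' "$1/file"
}

workdir=$(mktemp -d)
check_imr "$workdir"

# If "mr" is taken as a backup suffix, the untouched original shows up as "filemr".
ls "$workdir"
cat "$workdir/file"
rm -rf "$workdir"
```

If that reading is right, the command in the question has been leaving a backup copy named filemr next to the edited file, and the "m" and "r" never acted as flags at all.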

1 Answer

The replacement is not a regex; it is a string that can reference capture groups defined in the regex. To replace the first word with the last one, you need to capture the last word and the rest of the line:

sed -r 's/^\w+(.*\b)(\w+)$/\2\1/'  
            |    |    |
         Matches |    |
         the 1st |    Matches the last word
         word    |
                Matches everything in the middle
                up to a "word boundary"

Note that -r, \w, and \b might not work in all sed versions, but they should work in recent GNU sed.
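
A quick way to compare the behaviours, as a sketch that assumes GNU sed (for \w and \b) and uses made-up sample text. The function names are hypothetical, and the third variant is only an assumption about what "replace the first word with the last" might be meant to produce:

```shell
#!/bin/sh
# Sketch for GNU sed; \w and \b are GNU extensions. Sample text is made up.

# The question's attempt: the replacement side is taken as text, not as a regex.
literal_attempt() { sed -E 's/^(\w)+/(\w)+$/'; }

# The command above: the whole line is matched and rewritten as
# "last word + middle", so the last word moves to the front.
move_last_to_front() { sed -E 's/^\w+(.*\b)(\w+)$/\2\1/'; }

# Variant (assumption about the intent): also keep the last word where it was.
copy_last_to_front() { sed -E 's/^\w+(.*\b)(\w+)$/\2\1\2/'; }

printf 'alpha beta gamma\n' | literal_attempt
printf 'alpha beta gamma\n' | move_last_to_front   # "gamma beta " (trailing space)
printf 'alpha beta gamma\n' | copy_last_to_front   # "gamma beta gamma"
```

Note that the pattern matches the entire line, so whatever is not written back in the replacement is dropped. A line with a single word does not match at all and is left unchanged.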

choroba
  • The OP is using GNU sed, but the native BSD sed will throw an error if this command is given. Because the OP mentioned the GNU part only as a trailing comment, other macOS users who read the question and your answer might try your solution (as I did), see the error, and then be tempted (as I was) to scratch their heads in your general direction. My recommendation would be to note this in your answer and then ask the OP to make the distinction very clear. – Jeff Holt Jul 01 '17 at 21:31
  • ... and `sed 's/\([^[:space:]]*\)\(.*[[:space:]]\)\([^[:space:]]*\)/\3\2\1/'` with any POSIX sed. – Ed Morton Jul 02 '17 at 11:04
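
For readers without GNU sed, the POSIX-only command from the last comment can be tried like this. It is a sketch that assumes words are separated by whitespace; the sample text and the function name swap_first_last are made up. Unlike the answer's command, it swaps the first and last words rather than dropping one of them:

```shell
#!/bin/sh
# Portable sketch: uses only POSIX basic regex syntax and bracket classes,
# so it should not depend on GNU extensions such as \w, \b, or -r.
swap_first_last() {
    # \1 = first word, \2 = everything up to the last whitespace, \3 = last word
    sed 's/\([^[:space:]]*\)\(.*[[:space:]]\)\([^[:space:]]*\)/\3\2\1/'
}

printf 'alpha beta gamma\n' | swap_first_last   # "gamma beta alpha"
```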