
I am trying to remove the text between the first and last slash of a string using sed. This is what I was able to come up with using regular expressions with an online regex tester:

(?<=\/)(.*)(?=\/[a-zA-Z])

and this is the sed command:

echo "1stFolder/2ndFolder/3rdFolder/file" | sed 's/(?<=\/)(.*)(?=\/[a-zA-Z])//'

However, this does not work with sed. Basically, I am trying to get these outputs:

Test Case: 1stFolder/2ndFolder/3rdFolder/file
Output: 1stFolder/file

Test Case: 1stFolder/3rdFolder/file
Output: 1stFolder/file

Test Case: 1stFolder/file
Output: 1stFolder/file

I want to use sed or any shell command to get the text between the first and last slash of these filepaths removed.

ChewieSW77

2 Answers


sed does not support the various Perl regex extensions you tried to use. But really all you need is

sed 's:/.*/:/:'

sed's regular expressions perform leftmost-longest matching, so /.*/ by definition matches from the first slash to the last. A path with only one slash does not match and is left unchanged.
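As a quick check, here is a minimal sketch that runs the three test cases from the question through that command. The function name `strip_middle` is only an illustrative wrapper, not part of sed.

```shell
#!/bin/sh
# strip_middle is a hypothetical helper name; it just wraps the sed command.
# The colon is used as the s-command delimiter so the slashes need no escaping.
strip_middle() {
    sed 's:/.*/:/:'
}

# Feed the three paths from the question through the helper, one per line.
printf '%s\n' \
    '1stFolder/2ndFolder/3rdFolder/file' \
    '1stFolder/3rdFolder/file' \
    '1stFolder/file' |
    strip_middle
# prints "1stFolder/file" three times
```

The last case has only one slash, so the pattern finds nothing to replace and the line passes through untouched.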

"Why doesn't tool X support the regex dialect of tool Y" is a frequently asked question; see e.g. "Why are there so many different regular expression dialects?" But really, these online regex testers should be more explicit about which tools they support.

tripleee

You have to leave in one of the forward slashes that would otherwise be removed.
I think these work, but I can't test them to find out.

With ssed, a Perl-enhanced sed, you can use:

ssed 's~^([^/\r\n]*)/.*(?=/[a-zA-Z][^/\r\n]*$)~\1~'

Or, the plain sed version:

sed 's~^([^/\r\n]*)/.*(/[a-zA-Z][^/\r\n]*$)~\1\2~'

  1. It anchors the match from the first slash to the last slash in the string.
  2. It requires a letter immediately after the last slash.

Just a warning: if you don't do it like this, it will fail,
no matter how else you try it.


Doing it this way keeps it from matching 'aa/bb/' and, because a
letter (the file name) must follow the last slash, from matching 'aa/bb/cc/'.
So BOS and EOS handling is required as well.
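
A sketch of what a working anchored variant might look like, assuming a sed that accepts `-E` for extended regular expressions (GNU and BSD sed do). The function names are hypothetical, and the `\r\n` parts are dropped on the assumption that sed sees one line at a time. Without `-E`, parentheses are literal, so the second function writes the same groups with backslashes instead.

```shell
#!/bin/sh
# Hypothetical helper: -E selects extended regular expressions, so
# unescaped ( ) act as capture groups. ~ is only the s-command delimiter.
strip_middle_anchored() {
    sed -E 's~^([^/]*)/.*(/[a-zA-Z][^/]*)$~\1\2~'
}

# Same expression in basic regular expression syntax: groups need backslashes.
strip_middle_anchored_bre() {
    sed 's~^\([^/]*\)/.*\(/[a-zA-Z][^/]*\)$~\1\2~'
}

# One normal path plus the two trailing-slash cases mentioned above.
printf '%s\n' '1stFolder/2ndFolder/3rdFolder/file' 'aa/bb/' 'aa/bb/cc/' |
    strip_middle_anchored
# prints:
#   1stFolder/file
#   aa/bb/
#   aa/bb/cc/
```

Both functions leave the trailing-slash inputs unchanged because the second group requires a letter right after the final slash.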

  • I don't know of any `sed` version which supports the Perl lookahead `(?=...)` and `~` is not a valid `sed` command. The `\r\n` are meaningless here even if some `sed` dialects support them, as it only processes one line at a time anyway (unless you create a much more complex script; but why should you). – tripleee Aug 04 '19 at 20:08
  • I tried this solution but I got an error. `sed: -e expression #1, char 2: unknown command: `^'` – ChewieSW77 Aug 04 '19 at 20:31
  • @tripleee - `~` is a delimiter. I see that sed doesn't support assertions. I'll post an alternative. –  Aug 04 '19 at 20:35
  • You probably forgot the `s` command before the delimiter. But as it currently stands, this is pretty unsalvageable. – tripleee Aug 04 '19 at 20:42
  • I did and put it in, but you just could have pointed that out .. –  Aug 04 '19 at 20:43
  • I felt encouraged to fix some typos but perhaps you want to review. – tripleee Aug 04 '19 at 20:50
  • @tripleee - Why modify from `[^/\r\n]` to `[^/]` ? Does _sed_ not work with multiline strings ? –  Aug 04 '19 at 20:55
  • GNU sed with the `-z` option works with multi-line strings; other seds just work 1 line at a time unless you manually write the cryptic runes to load lines into the "hold space", and then you're simply using the wrong tool for the job. `a Perl enhanced sed` - shudder :-)! – Ed Morton Aug 05 '19 at 00:45
  • I encourage you to review the very first comment I left. – tripleee Aug 05 '19 at 04:03
  • @tripleee - Maybe you can rephrase whatever it is you are talking about so as to be understood. –  Aug 05 '19 at 15:21
  • In my original comment on your answer, I explained that `sed` only processes a line at a time and therefore `\r\n` is superfluous. Six comments later you ask why I removed `\r\n` and how this actually works in `sed`. – tripleee Aug 05 '19 at 15:23
  • When you reverted that change, you also removed the trailing `~` so your `sed` script has a syntax error again. Also, `$1$2` is a literal string in the replacement, not a backref (`sed` requires `\1\2` ... which is why I changed that). And plain `sed` without `-E` or `-r` matches round parentheses literally; you want to backslash the parens to make them grouping metacharacters. But why don't you google a `sed` tutorial? – tripleee Aug 05 '19 at 15:27
  • @tripleee - I don't know about your _sed-ness_, I'm just a regex expert. If you say the dot `.` matches anything, that's a different story. But, `[^\S\r\n]` is perfectly harmless. Maybe you should read the last line in my post to find out the error in your own answer. –  Aug 05 '19 at 15:29
  • Ok, fixed the replacement side to `\1\2`, etc, anything else ? I'm a windows programmer, so I never have, never will sed anything. –  Aug 05 '19 at 15:32
  • The parentheses without backslashes are still wrong. You can test on any decent REPL like https://repl.it/ or https://ideone.com/; here's a demo: https://ideone.com/AmRrNQ – tripleee Aug 05 '19 at 15:42
  • And `sed` doesn't support `\S` in most dialects either. It's very basic traditional regex from the early days of Unix. – tripleee Aug 05 '19 at 15:45
  • I have read the last line in your post multiple times but I'm afraid I still don't see what it attempts to explain or how that would render my answer invalid. What do you mean by BOS and EOS? Beginning/ending of string? – tripleee Aug 05 '19 at 15:46
  • If you are trying to say that the behavior for `folder1/folder2/` is incorrect, then I challenge your interpretation of the OP's very explicit request. To me "delete text between the first and last slash" should produce `folder1`, not `folder1/folder2/` (though admittedly I restore one of the slashes based on their specific examples). – tripleee Aug 05 '19 at 16:10
  • And yes, I am getting a little impatient, but really, if you want to criticize my answer based on incomplete understanding of the tools we are discussing, at least spell out what you think is wrong with it. – tripleee Aug 05 '19 at 16:13