-1

I've been trying hard to get this Regex to work, but am simply not good enough at this stuff apparently :(

Regex - Trying to extract sources

I thought this would work... I'm trying to get all of the content where:

  1. It starts with ds://
  2. Ends with either carriage return or line feed

That's it! I then plan to use a negative lookahead in Notepad++ (which supports regex search/replace) to remove all content that does NOT conform to the above.

Andrew

4 Answers

1
  1. Search for lines that contain the pattern, and mark them
    • Search menu > Mark
    • Find what: ds://.*\R
    • Check Regular expression
    • Check Mark the lines
    • Find all
  2. Remove the non marked lines
    • Search menu > Bookmark
    • Remove unmarked lines
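
Outside Notepad++, the same mark-then-remove idea can be sketched in Python. This is only an illustration, not part of the Notepad++ workflow: the sample text and the helper name `keep_marked_lines` are made up, and the trailing `\R` is dropped because the text is split into lines first.

```python
import re

# Hypothetical sample input; the real file contents are not shown in the question.
SAMPLE = "header line\nds://alpha/one\nnoise\nprefix ds://beta/two\n"


def keep_marked_lines(text: str) -> str:
    """Keep only lines containing 'ds://' (the 'marked' lines), drop the rest."""
    pattern = re.compile(r"ds://.*")
    # keepends=True preserves each line's original line ending.
    kept = [line for line in text.splitlines(keepends=True) if pattern.search(line)]
    return "".join(kept)


print(keep_marked_lines(SAMPLE), end="")
```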
Toto
0

You don't need to add the \w specifier to look for a word after the ds:// in the lookahead. Removing that and altering the final specification from "zero or one carriage return, then zero or one newline" to "either a carriage return or a newline" in a non-capturing group should do it for you:

(?=ds:\/\/).*(?:\r|\n)

Update: Carriage return or Line feed group does not need to be captured.

Update 2: The following regex will actually work for your proposed use case in the comments, matching everything but the pattern you described in the question.

^(?:(?!ds:\/\/.*(?:\r|\n)).)*$
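
A rough way to see what this pattern selects is to run it in Python with `re.MULTILINE`, so that `^` and `$` work per line. The sample text is hypothetical, and Python's `re` engine is an assumption here (Notepad++ uses a different engine), so treat this as a sketch rather than a guarantee of identical behavior.

```python
import re

# The "Update 2" pattern; slashes are left unescaped, which Python allows.
NOT_DS = re.compile(r"^(?:(?!ds://.*(?:\r|\n)).)*$", re.MULTILINE)

sample = "keep out\nds://alpha/one\nalso out\n"  # hypothetical sample text

# Collect the non-empty matches: these are the lines WITHOUT ds://.
matches = [m.group(0) for m in NOT_DS.finditer(sample) if m.group(0)]
print(matches)

# Caveat: the lookahead requires a \r or \n after the ds:// text, so a final
# ds:// line with no trailing newline is still matched by this pattern.
print(NOT_DS.search("ds://last") is not None)
```

If the last line of the file can be a ds:// line without a line ending, the caveat in the final comment is worth keeping in mind before replacing.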
Tim Klein
  • Awesome. That does find all of them. I would have thought changing ?= to ?! would have made it the inverse (i.e., finding everything else)... I must be missing something – Andrew Jan 05 '19 at 00:08
  • If your goal is to find everything else (not *entirely* clear from your question) then you would have to use a "negative lookaround" to *kind of* do what you want. Take a look at [this detailed post](https://stackoverflow.com/a/406408/4362829) for more info. Let me know if you want me to take a crack at it. – Tim Klein Jan 05 '19 at 00:10
  • Updated Regex as it does not need to capture \n or \r group. – Deep Jan 05 '19 at 00:11
0

Your regex (?=ds:\w+).*\r?\n? does not match because the content contains ds:// and \w does not match a forward slash. To make your regex work you could change it to:

(?=ds://\w+).*\r?\n? which can be shortened to ds://.*\R?

Note that you don't have to escape the forward slash.
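
The difference can be checked quickly in Python. This is a sketch under two assumptions: the sample line is made up, and since Python's `re` module has no `\R`, the shortened form uses `(?:\r\n|\r|\n)?` as a stand-in.

```python
import re

line = "ds://alpha/one\n"  # hypothetical sample line

# Original attempt: \w cannot match '/', so the lookahead never succeeds.
original = re.search(r"(?=ds:\w+).*\r?\n?", line)

# Fixed version: the two slashes are matched literally before \w+.
fixed = re.search(r"(?=ds://\w+).*\r?\n?", line)

# Shortened version, with (?:\r\n|\r|\n)? standing in for \R?.
short = re.search(r"ds://.*(?:\r\n|\r|\n)?", line)

print(original)              # no match object
print(repr(fixed.group(0)))
print(repr(short.group(0)))
```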

If you want to do a find and replace to keep the lines that contain ds:// you could use a negative lookahead:

Find what

^(?!.*ds://).*\R?

Replace with

Leave empty

Explanation

  • ^ Start of the line
  • (?!.*ds://) Negative lookahead to assert the line does not contain ds://
  • .* Match any character except a newline 0+ times
  • \R? An optional unicode newline sequence to also match the last line if it is not followed by a newline
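
The find-and-replace above can be approximated in Python as follows. It is a sketch, not the Notepad++ behavior itself: the sample text is hypothetical, `re.MULTILINE` is added so `^` anchors at each line, and `(?:\r?\n)?` replaces `\R?` because Python's `re` module does not support `\R`.

```python
import re

# Matches every line that does NOT contain ds://, plus its line ending if any.
DROP_NON_DS = re.compile(r"^(?!.*ds://).*(?:\r?\n)?", re.MULTILINE)

sample = "a\nds://x\nb\nds://y"  # hypothetical sample text

# Replacing each match with the empty string leaves only the ds:// lines.
cleaned = DROP_NON_DS.sub("", sample)
print(cleaned)
```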

See the Regex demo

The fourth bird
-1

Here you go, Andrew:

Regex: ds:\/\/.*

Link: https://regex101.com/r/ulO9GO/2
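
For extracting the sources directly rather than deleting the rest, the same pattern could be used with `re.findall` in Python. The sample text below is made up for illustration, and the slashes are left unescaped since Python does not require escaping them.

```python
import re

# Hypothetical sample text; only the ds:// parts are of interest.
sample = "name=one src=ds://alpha/one\nnoise\nds://beta/two\n"

# '.' does not match '\n' by default, so each match stops at the end of its line.
sources = re.findall(r"ds://.*", sample)
print(sources)
```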

Let me know if you have any questions.

Deep