
I have a long string containing many HTTP URLs separated by commas, and I'm trying to write a regular expression in VSCode that finds spaces and replaces them with %20, but only before "zip". Spaces after the "zip" should be left alone up to the next URL (that is, up to the following comma ",").

I've tried using "\s(?=[^zip]*zip)" based on the following question; however, it's not working.

Regex to select the white space before a certain character

Sample Text: http://abcd.com/old file/anotherfile.zip/some contents/sample.txt,http://abcd.com/new file/anotherfile.zip/some more contents/new sample.txt,http://abcd.com/newer file/another file.zip/some more contents/old sample.txt,http://abcd.com/newer file/another file.txt

Expected Output: http://abcd.com/old%20file/anotherfile.zip/some contents/sample.txt,http://abcd.com/new%20file/anotherfile.zip/some more contents/new sample.txt,http://abcd.com/newer%20file/another%20file.zip/some more contents/old sample.txt,http://abcd.com/newer%20file/another%20file.txt

P.S. The string also contains URLs to files that are not inside a zip, like the last URL in the above sample. With the regular expression, I want to replace a space with '%20' whenever it occurs in the path before "zip", and everywhere when the path doesn't contain a zip at all.

Wiktor Stribiżew
Rahul Kunjappa

1 Answer


You can use:

\.zip\b[^,]*(*SKIP)(*F)|\s

Details:

  • \.zip\b[^,]*(*SKIP)(*F) - match .zip followed by a word boundary and then zero or more non-comma chars; (*SKIP)(*F) then discards that match and resumes the search from the position where it failed, so nothing inside that stretch can be matched
  • | - or
  • \s - a whitespace

See the regex demo.
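(*SKIP) and (*F) are PCRE verbs and are not part of JavaScript regex syntax. Where only a JavaScript engine is available, the same idea can be sketched with a capturing group and a replacement callback, along the lines of the JavaScript one-liner in the comments below. This is a minimal sketch, not a definitive implementation: the helper name `encodeSpacesBeforeZip` is hypothetical, and the input is the sample text from the question.

```javascript
// Sample text from the question.
const text =
  "http://abcd.com/old file/anotherfile.zip/some contents/sample.txt," +
  "http://abcd.com/new file/anotherfile.zip/some more contents/new sample.txt," +
  "http://abcd.com/newer file/another file.zip/some more contents/old sample.txt," +
  "http://abcd.com/newer file/another file.txt";

// Hypothetical helper name, introduced only for this illustration.
function encodeSpacesBeforeZip(input) {
  // Alternative 1 captures ".zip" plus the rest of the current URL (up to the
  // next comma) so that stretch is consumed and put back unchanged.
  // Alternative 2 matches any remaining whitespace, which becomes "%20".
  return input.replace(/(\.zip\b[^,]*)|\s/g, (match, zipTail) =>
    zipTail !== undefined ? match : "%20"
  );
}

const result = encodeSpacesBeforeZip(text);
console.log(result);
```

The capturing group plays the role of the skipped part: whatever it consumes is returned as-is, so only whitespace outside it is rewritten.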

Wiktor Stribiżew
  • "\s(?=[^,]*\/[^\/,]*zip\b)" from the above link wouldn't select spaces when the zip filename itself contains spaces and also when the path has a different extension, as in the case of the last two URLs. The expected output for the last URLs is: "...,abcd.com/newer%20file/another%20file.zip/some more contents/old sample.txt,abcd.com/newer%20file/another%20file.txt" – Rahul Kunjappa Nov 25 '22 at 07:10
  • @RahulKunjappa Use this in JavaScript: `text.replace(/(\.zip\b[^,]*)|\s/g, (x,y) => y ? x : "%20")` – Wiktor Stribiżew Nov 25 '22 at 10:35
  • @RahulKunjappa In PCRE, you can use `\.zip\b[^,]*(*SKIP)(*F)|\s`, see [this regex demo](https://regex101.com/r/9oKf4t/2). In .NET, you can use `\s(?<!\.zip\b[^,]*)` (see [this demo](https://regex101.com/r/9oKf4t/3)) – Wiktor Stribiżew Nov 25 '22 at 11:02
  • Perfect, it works flawlessly now. – Rahul Kunjappa Nov 25 '22 at 18:24
  • @RahulKunjappa What exact solution do you need then? What should be in the answer so that the question can be finalized? – Wiktor Stribiżew Nov 25 '22 at 18:35
  • "\.zip\b[^,]*(*SKIP)(*F)|\s", this one worked for me. – Rahul Kunjappa Nov 26 '22 at 03:22
  • @RahulKunjappa I updated the answer with the PCRE solution. – Wiktor Stribiżew Nov 26 '22 at 08:51
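
The lookbehind pattern given for .NET in the comments above can also be tried in JavaScript engines that support lookbehind assertions (ES2018 and later), since JavaScript lookbehinds may contain variable-length patterns. The following is a hedged sketch under that assumption; the helper name `encodeWithLookbehind` is hypothetical, and the input is the last two URLs of the question's sample.

```javascript
// Hypothetical helper name, introduced only for this illustration.
function encodeWithLookbehind(input) {
  // Match a whitespace char only when the text ending at that char is NOT
  // preceded by ".zip" within the same comma-separated URL.
  return input.replace(/\s(?<!\.zip\b[^,]*)/g, "%20");
}

const lastTwoUrls =
  "http://abcd.com/newer file/another file.zip/some more contents/old sample.txt," +
  "http://abcd.com/newer file/another file.txt";

const encoded = encodeWithLookbehind(lastTwoUrls);
console.log(encoded);
```

Unlike the callback version, this needs no replacement function, at the cost of requiring lookbehind support in the engine.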