I have a regex which looks like this:

((C1)(.*?))?((C2)(.*?))?((C3)(.*?))?((C4)(.*?))?(\.zip|$)

"C" stands for Constant here; the constants are always the same. What comes in between them is different every time (and is sometimes nothing or just whitespace). Please see my working example here:

https://regex101.com/r/R1Btde/2

This matches anything from

C1 Anything C2 Anything C3 Anything C4 Anything.zip
or
C3 Anything C4 Anything.zip

to something without an extension like:

C1 Anything C4 Anything

The problem is that at the moment I have to end my REGEX with (\.zip|$) to make the .zip file extension optional (as some files will be directories). Unfortunately, this also produces an empty match ('Nothing') at the end of every line.
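The empty match can be reproduced with a small Python sketch. It assumes Python 3.7+ and its `re` module rather than the flavour selected in the regex101 link, and the two sample lines are made up for illustration.

```python
import re

# Pattern from the question; C1..C4 stand in for the real constants.
PATTERN = re.compile(
    r"((C1)(.*?))?((C2)(.*?))?((C3)(.*?))?((C4)(.*?))?(\.zip|$)",
    re.MULTILINE,
)

# Hypothetical sample input: one name with an extension, one without.
text = "C1 A C2 B C3 C C4 D.zip\nC1 A C4 B"

# Collect the full text of every match, including zero-length ones.
matches = [m.group(0) for m in PATTERN.finditer(text)]
empty_matches = [m for m in matches if m == ""]

for m in PATTERN.finditer(text):
    # Group 13 is the trailing (\.zip|$) group.
    print(repr(m.group(0)), repr(m.group(13)))
```

The zero-length entries in `matches` come from `$` succeeding on its own once all four optional groups have been skipped.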

Is there a way of stopping Group 13 matching with the end of every line, whilst still maintaining this group structure? You can see the groups on my working example. This isn't a duplication of my previous question, it's just a continuation as I try to improve this REGEX to support directories and files without extensions. I really appreciate the help you guys have given me and I wouldn't have got this far without you!

Many thanks!

UnluckyForSome9
  • @Wiktor I don't think that duplicate is correct here. He is asking about making a subexpression optional. The fact that every group is optional and that as a consequence the pattern *does* match empty strings was not his original question... – Paolo Aug 18 '18 at 19:01
  • @UnbearableLightness The regex in the question can match an empty string. OP asks how to make it stop matching empty strings. The post I linked to is a perfect dupe reason. – Wiktor Stribiżew Aug 18 '18 at 21:13

1 Answer

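You have the MULTILINE flag (i.e. “m”) turned on, which makes $ match at the end of every line (just before each newline), not only at the end of the input. Because every group in your regex is optional, the pattern can then match the empty string at each of those positions.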
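As a rough illustration of what the flag changes, the Python sketch below lists the positions where a bare `$` matches with and without `re.MULTILINE`. The sample text is hypothetical, and other regex flavours may differ in detail.

```python
import re

# Two lines, no trailing newline: "first" occupies 0-4, the newline is at 5.
text = "first\nsecond"

# Without MULTILINE, `$` only matches at the very end of this string.
without_flag = [m.start() for m in re.finditer(r"$", text)]

# With MULTILINE, `$` also matches just before each newline.
with_flag = [m.start() for m in re.finditer(r"$", text, re.MULTILINE)]

print(without_flag, with_flag)
```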

To make .zip optional at the end, end your regex with:

(\.zip)?

instead of (\.zip|$).

To avoid matching the empty string, add a negative lookahead at the start that rejects empty lines:

^(?!$)...your regex here...(\.zip)?
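A minimal Python sketch of this suggestion follows. One assumption goes beyond the pattern above: a trailing `$` is appended, because without an end anchor the lazy `.*?` groups can stop as early as possible and leave the rest of the line unmatched. The sample input is hypothetical.

```python
import re

# Assumption: the final `$` is NOT in the answer's pattern. It is added here
# so the lazy `.*?` groups are forced to run to the end of the line.
PATTERN = re.compile(
    r"^(?!$)((C1)(.*?))?((C2)(.*?))?((C3)(.*?))?((C4)(.*?))?(\.zip)?$",
    re.MULTILINE,
)

# Hypothetical input: a file name, an empty line, and a directory name.
text = "C3 A C4 B.zip\n\nC1 A C4 B"

# Pair each full match with group 13, the optional extension.
results = [(m.group(0), m.group(13)) for m in PATTERN.finditer(text)]

for full, extension in results:
    print(repr(full), repr(extension))
```

With this variant the group numbering stays the same as in the question. Group 13 is `None` when there is no extension, and the empty line produces no match.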
Bohemian