0

I have a string like this:

<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>

and I want a Regex that will match whenever there's a situation like this:

<https://example.com/code/INSTRUMENTS/AGD<1>
<https://example.com/code/INSTRUMENTS/AGD>1>

to detect such invalidly serialized IRIs (similar to URIs/URLs).

I've looked at existing answers on SO, but none of them seem to work when the inner character is the same as one of the outer characters.

  • We would need to see the context of these tags in a larger text, but in any case, you would probably need a balanced parser to handle this in general. – Tim Biegeleisen Apr 09 '21 at 06:43

1 Answers1

0

Given the assumption that < always has to be closed with a > and they cannot be "nested":

(<[^>]*<|>[^<]*>)

This will detect them if its just about boolean checking/validation.

If you also want to capture the full string

(<[^>]*<[^<]*>|<[^>]*>[^<]*>)

(I hope I don't have a typo)