So I've been trying to come up with a regex that separates these kinds of strings: A100, A-100, A1-100, A1_100, A1A100, "A-100", and many other examples.
The strings always "end" with digits only. I say "end" because they can be wrapped in quotation marks, so the digits aren't technically at the end of the string, but they do sit at a word boundary.
What I need is both parts: whatever comes before the trailing digits, and the trailing digits themselves. I need them separated because I might need to add to the numeric part.
What I've tried is:
At the very start it was easy. A100 was easily separated with something like
([a-zA-Z]+)(\d+)
but then I needed to separate A_100, where I need one string that has the A_ and another that has the 100. Or, if it's A1-100, I would need A1- and then the number part 100.
After many iterations of this problem I ended up with this messy regex:
([a-zA-Z\+\.\?\!_\-\\\d]+[a-zA-Z\+\.\?\!_\-\\]+)(\d+)
It separates a lot of the stuff I need EXCEPT for the simpler A100. That's because, to handle a first part that has a number in it (like A1A100), the first group needs to end with something other than a digit, or else I would just get A1 and A100.
But this is very messy, and I would rather do something simple like
([^\n])(\d+)
(this obviously doesn't work): capture any run of characters other than newlines, then capture the part that consists exclusively of the trailing digits.
I tried to implement lookaheads, but I'm not very good with them.
((?=\d+)\d+)
would get me exclusively the number part of A100, but I can't for the life of me manage to combine it with a pattern for the characters that come before it.
All of this needs to work in C# / .NET. Any guidance?
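For reference, here is a minimal console sketch of where I currently stand, in case it helps to reproduce the problem. It only runs my messy regex over the sample strings and prints the two captured groups. The class and method names (SplitRepro, Split) are made up for this post and are not from my real code.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical repro harness; names are for illustration only.
public static class SplitRepro
{
    // The "messy" pattern from above, unchanged.
    public const string Pattern =
        @"([a-zA-Z\+\.\?\!_\-\\\d]+[a-zA-Z\+\.\?\!_\-\\]+)(\d+)";

    // Returns "prefix|number" for the first match, or "no match".
    public static string Split(string input)
    {
        Match m = Regex.Match(input, Pattern);
        return m.Success
            ? m.Groups[1].Value + "|" + m.Groups[2].Value
            : "no match";
    }

    public static void Main()
    {
        string[] samples =
        {
            "A100",       // the failing case: I want A|100, I get "no match"
            "A-100",      // A-|100
            "A1-100",     // A1-|100
            "A1_100",     // A1_|100
            "A1A100",     // A1A|100
            "\"A-100\"",  // quoted: A-|100
        };

        foreach (string s in samples)
        {
            Console.WriteLine($"{s} -> {Split(s)}");
        }
    }
}
```

The only sample that does not come out the way I want is the plain A100, because the first group is forced to end in a non-digit character.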