You've got two ways to do it: validate the correct lines (A) or find the invalid ones (B).
I would use option A, but it depends on what you need to do.
A: Regex to validate correct lines
In this case, we'll use the `m` modifier to set the regex engine to search by line (m = multiline). This means that `^` matches the beginning of a line and `$` matches the end of a line.
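To see what the `m` flag changes, here is a minimal sketch in Python (my choice of language for illustration; the sample string is made up). Python's `re.MULTILINE` corresponds to the `m` flag, and `re.findall` already returns every match, which plays the role of `g`.

```python
import re

# Made-up sample input with three lines.
text = "first;\nsecond;\nthird"

# Without MULTILINE, ^ matches only at the very start of the string.
without_m = re.findall(r"^\w+", text)

# With MULTILINE (the m flag), ^ also matches at the start of every line.
with_m = re.findall(r"^\w+", text, flags=re.MULTILINE)

print(without_m)  # ['first']
print(with_m)     # ['first', 'second', 'third']
```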
Then we want to match characters that are not the semicolon itself. For this we use a negated character class, `[^...]`, which means "anything that is not in the provided list of characters". So to say any character except the semicolon, we use `[^;]`.
This character won't be alone, as there will probably be many of them. To express that, we can use either the `*` or the `+` quantifier, which mean "0 or more times" and "1 or more times" respectively. If the data before the semicolon is mandatory, we use `+`. This leads to `[^;]+`: any character that is not a semicolon, 1 or more times.
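The difference between the two quantifiers shows up on a line that contains nothing but the semicolon. A small Python sketch, with made-up sample lines:

```python
import re

star = re.compile(r"[^;]*;")  # 0 or more non-semicolon characters, then ;
plus = re.compile(r"[^;]+;")  # 1 or more non-semicolon characters, then ;

for line in ["value;", ";"]:
    # fullmatch requires the whole line to fit the pattern.
    print(repr(line), bool(star.fullmatch(line)), bool(plus.fullmatch(line)))

# 'value;' True True
# ';' True False
```

So `*` accepts an empty value before the semicolon, while `+` rejects it.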
Then we capture this with parentheses, `()`. This gives us direct access to the value, without having to take the whole line and strip the semicolon ourselves.
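As a quick illustration of what the capture gives you, here is a Python sketch (the sample line is made up): group 0 is the whole match, group 1 is only the part inside the parentheses.

```python
import re

line = "some value;"

match = re.fullmatch(r"([^;]+);", line)
if match:
    print(match.group(0))  # whole match: some value;
    print(match.group(1))  # captured part only: some value
```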
After this capture group, we have the semicolon, then possibly some whitespace, and then the end of the line. Whether to allow the trailing whitespace is up to you. If you do, it would be `\s*`: any space, tab or other whitespace character, 0 or more times.
In the end we get this regex: `^([^;]+);\s*$` with the `m` and `g` flags: `m` for multiline and `g` for global, which means don't stop at the first match but look for all of them.
Test it here: https://regex101.com/r/sT59eu/1/
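Here is how this could look in Python, as a sketch with made-up input (`re.MULTILINE` stands in for `m`, `finditer` for `g`). It also makes one caveat visible: `[^;]` matches a line break too, so a line without a semicolon can get glued to the valid line that follows it. The second pattern, which excludes `\n` and only allows spaces or tabs after the semicolon, is a variant I'm adding as an assumption; it is not the regex from the link above.

```python
import re

# Made-up sample: lines 1 and 3 are valid, lines 2 and 4 are not.
text = "alpha;\nno semicolon\nbeta;  \ngamma; extra"

# The regex as written above.
loose = re.compile(r"^([^;]+);\s*$", re.MULTILINE)

# Assumed stricter variant: the captured value may not span a line break.
strict = re.compile(r"^([^;\n]+);[ \t]*$", re.MULTILINE)

loose_values = [m.group(1) for m in loose.finditer(text)]
strict_values = [m.group(1) for m in strict.finditer(text)]

print(loose_values)   # ['alpha', 'no semicolon\nbeta']
print(strict_values)  # ['alpha', 'beta']
```

If your input never has a semicolon-less line in front of a valid one, both patterns give the same result.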
B: Regex to find invalid lines
Well, this could be rather easy too: `;.+$`. Here `.` means any char (except a line break), so we'll find the lines with something after the semicolon.
Test it here: https://regex101.com/r/ocDofm/1/
But you will NOT find lines with missing semicolons!
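A short Python sketch of option B on made-up input shows both the hit and the blind spot: the line with extra text is reported, the line without any semicolon is not.

```python
import re

# Made-up sample: line 2 has text after the semicolon, line 3 has no semicolon.
text = "alpha;\nbeta; extra\ngamma"

invalid = re.compile(r";.+$", re.MULTILINE)

found = [m.group(0) for m in invalid.finditer(text)]
print(found)  # ['; extra']
```

Note that trailing spaces after the semicolon would also be reported by this pattern, since `.` matches a space.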