RegEx that matches characters after semicolon in the same line

Question

I need some help with regular expressions. I need a regex that matches characters when they come after a semicolon AND are on the same line as a preceding word.

Let me explain that:

I need something like this. I have to write a function that does not allow characters to be entered after a semicolon on the same line, and I think I could do it with this sort of regex.

Thank you.

Have you tried anything before asking? There are lots of tutorials of regular expressions out there. — Miguel, Feb 26 '21 at 13:26

score 2 · Accepted Answer · answered Feb 26 '21 at 14:20


I am not sure I understood your question, but would something like this help? This regular expression


UnleashMe69


score 2 · Answer 2 · answered Feb 26 '21 at 14:31

Well, you've got two ways to do it:

A: Create a regular expression to validate correct input.
B: Create a regular expression to find incorrect input.

I would use option A, but it depends on what you need to do.

A: Regex to validate correct lines

In this case, we'll use the m modifier to set the regex engine to search by line (m = multiline). This means that ^ matches the beginning of a line and $ matches the end of a line.
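To see what the m flag changes, here is a small Python sketch (Python is my assumption here, the question does not name a language; the sample text is made up). In Python the flag is spelled re.MULTILINE.

```python
import re

text = "first;\nsecond;"

# Without re.MULTILINE, ^ only matches at the very start of the string.
default_hits = re.findall(r"^\w+", text)

# With re.MULTILINE, ^ also matches right after every newline.
multiline_hits = re.findall(r"^\w+", text, re.MULTILINE)

print(default_hits)    # ['first']
print(multiline_hits)  # ['first', 'second']
```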

Then we want to match some characters which are not the semicolon itself. To do this we use a negated character class, [^...], meaning "any character which is not in the provided list". So to say any char except the semicolon we'll have to use [^;].

Now, this char is not alone, as there will probably be many of them. To handle that we can use either the * or the + quantifier, which respectively mean "0 or more times" and "1 or more times". If the data before the semicolon is mandatory then we'll use the + quantifier. This leads to [^;]+ to say any char which is not a semicolon, 1 or more times.
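A quick way to see the difference between * and + is to test both against a line that contains only a semicolon. This is just an illustrative sketch in Python (my choice of language); the variable names are made up.

```python
import re

# fullmatch() requires the whole string to match the pattern.
star = re.compile(r"[^;]*;")  # zero or more non-semicolon chars, then ";"
plus = re.compile(r"[^;]+;")  # one or more non-semicolon chars, then ";"

lone_semicolon_star = star.fullmatch(";") is not None
lone_semicolon_plus = plus.fullmatch(";") is not None
word_plus = plus.fullmatch("word;") is not None

print(lone_semicolon_star)  # True: nothing before ";" is accepted
print(lone_semicolon_plus)  # False: at least one char is required
print(word_plus)            # True
```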

Then we'll capture this with a capturing group, written with parentheses (). This gives us direct access to the value without having to take the line and strip the semicolon ourselves.

After this capture group comes the semicolon, then optionally some whitespace, and then the end of the line. Whether to allow whitespace after the semicolon is up to you. It would be \s* to say any kind of space, tab or blank char, 0 or more times.

At the end we get this regex: ^([^;]+);\s*$ with the m and g flags

m for multiline and g for global, which means don't stop at the first match but look for all of them.

Test it here: https://regex101.com/r/sT59eu/1/
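If you are in Python (an assumption on my side), option A could be sketched roughly like this. Note that [^;] also matches newline characters, so on a multi-line string the pattern can run across a line break when a line has no semicolon. To sidestep that, this sketch checks one line at a time, which also means the m and g flags are not needed. The function name check_lines and the sample text are made up for illustration.

```python
import re

# Pattern from the answer; group 1 captures the text before the semicolon.
VALID_LINE = re.compile(r"^([^;]+);\s*$")

def check_lines(text):
    """Return (line, captured value or None) for every line in text."""
    results = []
    for line in text.splitlines():
        match = VALID_LINE.match(line)
        results.append((line, match.group(1) if match else None))
    return results

sample = "int a = 1;\nint b = 2; oops\nint c = 3;   "
for line, value in check_lines(sample):
    print(repr(line), "->", value)
```

Lines that come back with None are the ones to reject: here only the second line, because something other than whitespace follows the semicolon.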

B: Regex to find invalid lines

Well, this could be rather easy too: ;.+$ (again with the m flag, so that $ matches at the end of every line)

. means any char (except a line break, by default). So here we'll find the lines with something after the semicolon.

Test it here: https://regex101.com/r/ocDofm/1/

But you will NOT find lines with missing semicolons!
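A rough Python sketch of option B, assuming Python and a made-up sample. The third line has no semicolon at all and, as said above, it is not reported.

```python
import re

# Pattern from the answer: a semicolon followed by at least one more char.
AFTER_SEMICOLON = re.compile(r";.+$", re.MULTILINE)

sample = "a = 1;\nb = 2; oops\nc = 3"

# findall() returns every match in the text (what the g flag does elsewhere).
offending = AFTER_SEMICOLON.findall(sample)
print(offending)  # ['; oops']
```

Keep in mind that . also matches spaces, so a line with only trailing whitespace after the semicolon would be reported too, unlike with option A.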

score 1 · Answer 3 · answered Feb 26 '21 at 14:26


If I understand it correctly, (?<=;)[A-Za-z]+ might do the job. The Python documentation is helpful: https://docs.python.org/3/library/re.html
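For example, a minimal sketch with a made-up sample string. The lookbehind (?<=;) only checks that a semicolon sits immediately before the letters, so letters separated from the semicolon by a space are not matched.

```python
import re

sample = "a;bc\nd; ef"

# Letters that directly follow a semicolon; the semicolon is not part of the match.
result = re.findall(r"(?<=;)[A-Za-z]+", sample)
print(result)  # ['bc']
```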


Johannes Benoit


Thank you! You were really helpful! – Carlos Pérez Feb 26 '21 at 14:45
