how can I specify that each row must have at least numbers AND letter?
That can be done with the help of positive lookaheads.
pattern = r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}"
The (?=[^A-Z]*[A-Z]) lookahead is triggered at the start of the string and requires at least one A-Z letter in the string. The (?=\D*\d) lookahead is also triggered at the start of the string (after the preceding lookahead returns true) and requires at least one digit. If there is no digit in the string, the match fails (no match is found).
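As a minimal sketch of how the two lookaheads behave, the snippet below runs the pattern against a few sample rows. The rows are made up for illustration and are not taken from the question.

```python
import re

# Pattern from the answer, written as a raw string.
pattern = re.compile(r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}")

# Hypothetical sample rows (assumed data, for illustration only).
rows = [
    "AB12-CD item 12,50",  # has an uppercase letter and a digit
    "1234-56 item 12,50",  # no uppercase letter anywhere: first lookahead fails
    "ABCD-EF item",        # no digit anywhere: second lookahead fails
]

results = [bool(pattern.search(row)) for row in rows]
print(results)  # [True, False, False]
```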
Also, if the number must be at the end of the "row", add a $ anchor (end of string) to the end of the pattern.
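A short sketch of the anchored variant, again with made-up rows: once $ is appended, a row with anything after the number is rejected.

```python
import re

# Same pattern with a trailing $ so the number must end the row.
anchored = re.compile(r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}$")

# Hypothetical rows (assumed data).
ends_with_number = bool(anchored.search("AB12-CD item 12,50"))
has_trailing_text = bool(anchored.search("AB12-CD item 12,50 EUR"))

print(ends_with_number, has_trailing_text)  # True False
```

Note that in Python, $ also matches just before a single trailing newline; \Z can be used instead if that matters.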
Besides, note that .* will "eat up" the digits (supposed to be matched with \d+,\d{2}) up to the one before the comma, since the .* pattern is greedy. It makes no difference here unless you want to capture the float number. In that case, use lazy matching: .*?
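To illustrate the difference, here is a sketch that wraps the number in a capturing group (the group is an addition for demonstration, and the row is made up) and compares the greedy and lazy versions.

```python
import re

# Capturing group added around the number for illustration.
greedy = re.compile(r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*(\d+,\d{2})")
lazy = re.compile(r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*?(\d+,\d{2})")

row = "AB12-CD item 12,50"  # hypothetical row

greedy_number = greedy.search(row).group(1)
lazy_number = lazy.search(row).group(1)

print(greedy_number)  # 2,50  (the greedy .* consumed the leading "1")
print(lazy_number)    # 12,50
```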
In case the pattern should be case insensitive, pass the case-insensitive flag re.I when compiling the pattern, or add the (?i) inline modifier to the pattern start.
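Both options can be sketched as follows, using a made-up lowercase row. Without the flag, the letter lookahead finds no A-Z character and the match fails.

```python
import re

base = r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}"
row = "ab12-cd item 12,50"  # hypothetical lowercase row

case_sensitive = bool(re.search(base, row))
with_flag = bool(re.compile(base, re.I).search(row))
with_inline = bool(re.search("(?i)" + base, row))  # modifier must come first

print(case_sensitive, with_flag, with_inline)  # False True True
```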
UPDATE
If you want to limit the condition to the first non-whitespace chunk, you can use
^(?=[0-9-]*[A-Z])(?=[A-Z-]*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}
    ^^^^^^^         ^^^^^^^
where we check if there is a letter after 0+ digits or hyphens, and a digit after 0+ letters or hyphens, or
^(?=\S*[A-Z])(?=\S*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}
where we check for letters and digits after 0+ non-whitespace characters (\S*).
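The sketch below compares the original whole-string pattern with the two chunk-limited variants. The variable names and sample rows are assumptions for illustration. The rows are chosen so that the required letter or digit appears only outside the first chunk.

```python
import re

whole_string = re.compile(r"^(?=[^A-Z]*[A-Z])(?=\D*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}")
strict_chunk = re.compile(r"^(?=[0-9-]*[A-Z])(?=[A-Z-]*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}")
nonspace_chunk = re.compile(r"^(?=\S*[A-Z])(?=\S*\d)[A-Z0-9-]{4,10}.*\d+,\d{2}")

# Hypothetical rows (assumed data).
rows = [
    "AB12-CD item 12,50",  # first chunk has both a letter and a digit
    "ABCDEF item 12,50",   # the only digits are in the trailing number
    "123456 Item 12,50",   # the only uppercase letter is in the second word
]

def check(regex):
    """Return one True/False per row for the given compiled pattern."""
    return [bool(regex.search(row)) for row in rows]

whole_results = check(whole_string)
strict_results = check(strict_chunk)
nonspace_results = check(nonspace_chunk)

print(whole_results)     # [True, True, True]
print(strict_results)    # [True, False, False]
print(nonspace_results)  # [True, False, False]
```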