I'm trying to use prxmatch to verify if postcode format (UK) is correct. The ('/^[A-Z]{1,2}\d{2,3}[A-Z]{2}|[A-Z]{1,2}\d[A-Z]\d[A-Z]{2}$/') bit covers (I think) all the possible post code formats used in UK, however I only want exact and not partial matches and no additional chars before or after match.
data pc_flag ; set abc ;
format pc_correct_flag $1. compressed_postcode $100.;
compressed_postcode = compress(postcode);
pc_regex = prxparse('/^[A-Z]{1,2}\d{2,3}[A-Z]{2}|[A-Z]{1,2}\d[A-Z]\d[A-Z]{2}$/');
if prxmatch(pc_regex,compressed_postcode)>0
then pc_correct_flag='Y';
else pc_correct_flag='N';run;
I was expecting 'Y' only on exact matches on full string, i.e. with no additional characters before and after regex. However, I'm also getting false positives, where a part of 'compressed_postcode' matches regex, but there are additional characters after the match, which I thought using $ would prevent. I.e. I'd expect only something like AA11AA to match, but not AA11AAAA. I suspect this has to do with $ positioning but can't figure out exactly what's wrong. Any idea what I've missed?