I'm trying to build a complicated regex. I want to match a string with the following structure:
- .+ (any character, at least once)
- either "del" or "ins" or "dup" or [ATGC]
- .* (the string either ends here or is followed by anything)
I have tried several variants. This is my current attempt, which doesn't work:
import re
hgvs = "c.*1017delT"
a = re.match('(.*)(del|ins|dup|[ATGC]).*', hgvs)
a.groups()
('c.*1017del', 'T')
I expected "(.*)" to capture everything before the "del", but the regex seems to prefer the [ATGC] match over the "del" match.
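
For comparison, here is a minimal sketch of what I think is going on, assuming the cause is that the first group is greedy: it consumes as much as it can and only gives back the final "T" so that the second group has something to match. The non-greedy quantifier `+?` in the second pattern is my own guess at a fix, not something I have confirmed is the right approach for all HGVS strings.

```python
import re

hgvs = "c.*1017delT"

# Greedy first group: takes as much as possible, so the alternation
# is only left with the last character, which [ATGC] matches.
greedy = re.match(r'(.*)(del|ins|dup|[ATGC]).*', hgvs)
print(greedy.groups())

# Non-greedy first group (assumed fix): expands one character at a time
# and stops at the first position where the alternation can match.
lazy = re.match(r'(.+?)(del|ins|dup|[ATGC]).*', hgvs)
print(lazy.groups())
```

With the non-greedy version the first group stops right before "del", which is the split I was expecting.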