How can I improve the following regex so that it distinguishes the C, CLK, and CE ports and prints the message specific to each port?

Right now the code below detects C, CLK, and CE, but they are all labeled as the C port, which is not intended.

Sample code below:

import re
test_array = ['jclk','jce','ja','jb','jc','jd','je']

for thisEntry in test_array:
    if re.match(r'jc', thisEntry):
        print("this is C port")
    elif re.match(r'jce', thisEntry):
        print("this is CE port")
    elif re.match(r'jclk', thisEntry):
        print("this is CLK port")
    else:
        print("dont care")

Result:

this is C port      #for 'jclk', incorrect
this is C port      #for 'jce', incorrect
dont care           #for 'ja', correct
dont care           #for 'jb', correct
this is C port      #for 'jc', correct
dont care           #for 'jd', correct
dont care           #for 'je', correct

Thanks in advance.

Meeyaw
  • You need to use anchors. Btw, I didn't understand why you are using regex here. Can you not use a `thisEntry == 'jc'` condition? – anubhava Apr 18 '17 at 06:25
  • Hi, thanks for the answer. The reason I am not using `thisEntry == 'jc'` is that in the real application I am accessing a dictionary from a class, in which the values are not just simple 'j*' arguments. – Meeyaw Apr 18 '17 at 06:28
  • OK, then use a word boundary or anchors, as in: `re.match(r'\bjc\b', thisEntry)` – anubhava Apr 18 '17 at 06:30
  • I tried that and it works now; I was able to differentiate the C, CE and CLK ports. Thanks! – Meeyaw Apr 18 '17 at 06:34

1 Answer

Translating my comment to an answer.

You are running into this problem because jc is a prefix of several of your input strings (jclk and jce), so the first branch matches them before the later branches are ever tried.
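
A quick way to see this is to list which entries the unanchored pattern accepts. This is only a minimal sketch; the name `prefix_hits` is made up for illustration and is not part of the original code.

```python
import re

test_array = ['jclk', 'jce', 'ja', 'jb', 'jc', 'jd', 'je']

# re.match anchors only at the start of the string, so r'jc'
# accepts every entry that merely begins with "jc".
prefix_hits = [s for s in test_array if re.match(r'jc', s)]
print(prefix_hits)  # ['jclk', 'jce', 'jc']
```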

You need to use anchors or word boundaries like this to avoid matching extra text:

import re
test_array = ['jclk','jce','ja','jb','jc','jd','je']

for thisEntry in test_array:
    if re.match(r'\bjc\b', thisEntry):
        print("this is C port")
    elif re.match(r'\bjce\b', thisEntry):
        print("this is CE port")
    elif re.match(r'\bjclk\b', thisEntry):
        print("this is CLK port")
    else:
        print("dont care")
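
If the whole string must equal the port name, `re.fullmatch` (available since Python 3.4) is another option worth considering. The sketch below is one possible arrangement, not the only way to do it; the `PORT_PATTERNS` table and the `classify_port` helper are hypothetical names introduced here for illustration.

```python
import re

# Hypothetical pattern/label table, for illustration only.
PORT_PATTERNS = [
    (r'jc', "this is C port"),
    (r'jce', "this is CE port"),
    (r'jclk', "this is CLK port"),
]

def classify_port(entry):
    """Return the label of the first pattern that matches the entire entry."""
    for pattern, label in PORT_PATTERNS:
        # fullmatch succeeds only if the pattern covers the whole string.
        if re.fullmatch(pattern, entry):
            return label
    return "dont care"

test_array = ['jclk', 'jce', 'ja', 'jb', 'jc', 'jd', 'je']
results = [classify_port(entry) for entry in test_array]
for entry, label in zip(test_array, results):
    print(entry, '->', label)
```

One difference to keep in mind: with `re.match`, `\bjc\b` still accepts a string such as 'jc.x', because the boundary only requires a non-word character after jc, whereas `re.fullmatch` rejects anything beyond the pattern itself.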
anubhava