1

Programming in Python3.

I am having difficulty checking whether a string meets a specific format.

I know that Python does not have a .contains() method like Java, but that we can use regex. My code will probably look something like this, where lowpan_headers is a dictionary with a field holding a string that should match a specific format:

import re

lowpan_headers = self.converter.lowpan_string_to_headers(lowpan_string)
pattern = re.compile("^([A-Z][0-9]+)+$")
pattern.match(lowpan_headers[dest_addrS])

However, my issue is with the pattern, which I have not been able to get right. The format should be like bbbb00000000000000170d0000306fb6, where the first 4 characters are bbbb and the remaining 28 characters are hexadecimal digits (0-9 and a-f).

So two questions: (1) Is there an easier way of doing this than importing re? (2) If not, can you help me out with the regex?

  • Python **does** have a "contains" equivalent, though: https://stackoverflow.com/questions/3437059/does-python-have-a-string-contains-substring-method – Asunez Aug 09 '17 at 07:45
  • Python *does* have something like `Java.lang.String.contains`, but it uses the `in` operator: `string1 in string2`, although given your specification it sounds like you *actually want regex anyway* – juanpa.arrivillaga Aug 09 '17 at 07:45
  • Does this regex work for you: `r'^b{4}[0-9a-f]{28}$'`? – ikkuh Aug 09 '17 at 07:47
  • @ikkuh hexadecimal values contain the letters `a-f`, not `a-z` – Ma0 Aug 09 '17 at 07:47
  • Your regular expression matches a single uppercase character followed by some digits, and repetitions of this pattern. Your string has more than one lowercase character. So it doesn't match. – Barmar Aug 09 '17 at 07:47
  • Ah yes, the in operator. That is the equivalent, never considered it like that (terminology differences). Thanks! –  Aug 09 '17 at 07:49

4 Answers

5

As for the regex you're looking for, I believe that

^bbbb[0-9a-f]{28}$

should validate correctly for your requirements.

As for whether there is an easier way than using the re module, I would say there isn't really one for what you're trying to achieve. The in keyword works the way you would expect a contains method to work for a string, but what you actually want to know is whether a string is in the correct format. The best solution, as it is relatively simple, is a regular expression, and thus the re module.
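A minimal sketch of how the pattern above could be applied. The names ADDRESS_PATTERN and is_valid_address are made up for illustration, and re.fullmatch is used here so that the ^ and $ anchors can be left out:

```python
import re

# Hypothetical names; the pattern is the one suggested above, minus the
# anchors, because fullmatch() only succeeds if the whole string matches.
ADDRESS_PATTERN = re.compile(r'bbbb[0-9a-f]{28}')


def is_valid_address(value):
    """Return True for 'bbbb' followed by exactly 28 lowercase hex digits."""
    return ADDRESS_PATTERN.fullmatch(value) is not None


print(is_valid_address('bbbb00000000000000170d0000306fb6'))  # True
print(is_valid_address('bbbb00000000000000170d0000306fx6'))  # False
```

One reason to prefer fullmatch here: with re.match and a trailing $, a string that ends in a newline would still be accepted, because $ also matches just before a final newline.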

2

In fact, Python does have an equivalent to the .contains() method. You can use the in operator:

if 'substring' in long_string:
    return True

A similar question has already been answered here.

For your case, however, I'd still stick with regex, as you're trying to validate a specific string format. To ensure that your string only has hexadecimal values, i.e. 0-9 and a-f, the following regex should do it: ^[a-fA-F0-9]+$. The additional "complication" is the four 'b' characters at the start of your string. An easy fix is to put them in front of the character class: ^bbbb[a-fA-F0-9]+$.

>>> import re
>>> pattern = re.compile('^bbbb[a-fA-F0-9]+$')
>>> test_1 = 'bbbb00000000000000170d0000306fb6'
>>> test_2 = 'bbbb00000000000000170d0000306fx6'
>>> pattern.match(test_1)
<_sre.SRE_Match object; span=(0, 32), match='bbbb00000000000000170d0000306fb6'>
>>> pattern.match(test_2)
>>>

The part that is currently missing is checking for the exact length of the string, for which you could either use len() or extend the regex -- but I'm sure you can take it from here :-)
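For reference, the len() route might look roughly like this. It is only a sketch: it assumes the bbbb prefix is mandatory, as the question states, and the names HEX_WITH_PREFIX, EXPECTED_LENGTH and has_valid_format are invented for the example.

```python
import re

# Hypothetical names. The pattern requires the 'bbbb' prefix and accepts
# upper- or lower-case hex digits after it; the length is checked separately.
HEX_WITH_PREFIX = re.compile(r'bbbb[a-fA-F0-9]+')
EXPECTED_LENGTH = 32  # 4 prefix characters + 28 hex digits


def has_valid_format(value):
    """Check the total length first, then the prefix and hex digits."""
    return (len(value) == EXPECTED_LENGTH
            and HEX_WITH_PREFIX.fullmatch(value) is not None)
```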

m_____z
1

Here is a solution that does not use regex:

lowpan_headers = 'bbbb00000000000000170d0000306fb6'
if lowpan_headers[:4] == 'bbbb' and len(lowpan_headers) == 32:
    try:
        int(lowpan_headers[4:], 16)  # tries interpreting the last 28 characters as hexadecimal
        print('Input is valid!')
    except ValueError:
        print('Invalid Input')  # hex test failed!
else:
    print('Invalid Input')  # either length test or 'bbbb' prefix test failed!
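One caveat with the int() approach: int(s, 16) is more lenient than the format in the question. It also accepts a leading sign, a 0x prefix and surrounding whitespace, so a few malformed strings of the right length would pass. A stricter regex-free variant could check every character explicitly. This is a sketch only; HEX_DIGITS and is_valid_address are names chosen for the example, and lowercase-only digits are assumed from the question.

```python
# Hypothetical names; lowercase hex digits only, as described in the question.
HEX_DIGITS = set('0123456789abcdef')


def is_valid_address(value):
    """Return True for 'bbbb' followed by exactly 28 lowercase hex digits."""
    return (
        len(value) == 32
        and value.startswith('bbbb')
        and all(ch in HEX_DIGITS for ch in value[4:])
    )


print(is_valid_address('bbbb00000000000000170d0000306fb6'))  # True
print(is_valid_address('bbbb' + '0x' + '0' * 26))            # False
```

The second call is the kind of input that int(..., 16) would accept, since '0x' followed by hex digits parses fine in base 16.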
Ma0
0

As I mentioned in the comment, Python does have an equivalent of contains().

if "blah" not in somestring: 
    continue

(source) (PythonDocs)

If you would prefer to use a regex instead to validate your input, you can use this:

^b{4}[0-9a-f]{28}$ - Regex101 Demo with explanation

Asunez