0

I have a Python list in which some elements are \x ASCII hex string literals and the rest are regular strings. Is there an easy way to split this list into the two different types of strings? Example data below.

I have tried searching for the \x substring within the string and that did not work correctly.

['\xFF', '\x42', 'A', '\xDE', '@', '\x1F']

Edit: Currently using Python 2.7.9

This is what I have tried so far

>>> list=['\xFF', '\x42', 'A', '\xDE', '@', '\x1F']
>>> print [s for s in list if '\x' in s]
ValueError: invalid \x escape
>>> print [s for s in list if '\\x' in s]
[]
>>> print [s for s in list if '\x' in s]
ValueError: invalid \x escape
>>> print [s for s in list if 'x' in s]
[]
>>> 
Matt

3 Answers

3

You could use a list comprehension with re.search. For example, to get a new list of only the strings that contain a word character:

import re
x = ['\xFF', '\x42', 'A', '\xDE', '@', '\x1F']
# keep only the strings that contain at least one word character
print([i for i in x if re.search(r'\w', i)])

Or, to select only the strings that contain a character in a specific ASCII range, something like:

print([i for i in x if re.search('[\x05-\x40]',i)])

where I picked an arbitrary range above.
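As a rough sketch of how the range idea could produce both lists at once: the snippet below assumes that "regular" means every character is printable ASCII (0x20 to 0x7E), which is my assumption and not something the question states. The names `printable`, `regular` and `special` are made up for illustration. Keep in mind that '\x42' is the same one-character string as 'B' once Python has parsed it, so it ends up with the regular strings.

```python
import re

# Sample data from the question; '\x42' is stored as the single character 'B'.
data = ['\xFF', '\x42', 'A', '\xDE', '@', '\x1F']

# Assumption: a "regular" string consists only of printable ASCII (0x20-0x7E).
# \Z anchors at the very end of the string; match() anchors at the start.
printable = re.compile(r'[\x20-\x7e]+\Z')

regular = [s for s in data if printable.match(s)]
special = [s for s in data if not printable.match(s)]

print(regular)
print(special)
```

Adjust the character class if your notion of a regular string is different (for example, if tabs or newlines should count as regular).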

Brian
1

You could look at the repr of each string to check whether it contains an \x escape. Note that '\x42' is simply 'B', so it lands in the other list:

xs = ['\xFF', '\x42', 'A', '\xDE', '@', '\1F', 'hello\xffworld']  
hexes = []                                                        
others = []                                                       

for x in xs:                                                      
    if r'\x' in repr(x):                                      
        hexes.append(x)                                           
    else:                                                         
        others.append(x) 

print "hexes", hexes                                              
print "others", others                                            

Output:

hexes ['\xff', '\xde', '\x01F', 'hello\xffworld']
others ['B', 'A', '@']
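The repr trick depends on how the interpreter chooses to display a string. On Python 3, as far as I know, repr('\xff') shows the printable character itself instead of an escape, so the check above would classify it differently there. A sketch that does not depend on repr is to test the character codes directly. The function name `split_by_printable` and the 0x20 to 0x7E cut-off are assumptions of mine, not part of the original approach.

```python
def split_by_printable(strings):
    """Return (hexes, others); a string goes to `others` only if every
    character is printable ASCII (assumed range: 0x20-0x7E)."""
    hexes, others = [], []
    for s in strings:
        if all(0x20 <= ord(c) <= 0x7E for c in s):
            others.append(s)
        else:
            hexes.append(s)
    return hexes, others


# Same data as above; '\1F' is the octal escape \1 followed by the letter F.
xs = ['\xFF', '\x42', 'A', '\xDE', '@', '\1F', 'hello\xffworld']
hexes, others = split_by_printable(xs)
print(hexes)
print(others)
```

For this input the two lists contain the same elements as the output shown above, and the check behaves the same way on Python 2 byte strings and Python 3 strings.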
pyrospade
0

I am going to assume that you would put non-hex numbers together with the hex values. If you want a decimal number string (such as '25') to be rejected, you can check for the hex indicator after you have identified the string as numeric, as shown below.

It would seem that the code shown in How do I check if a string is a number (float) in Python? would be a good way of making this test. Just loop through the list and put each string in the proper list based on the result of the test.

The same function is also shown at [Checking if a Python String is a Number](http://pythoncentral.io/how-to-check-if-a-string-is-a-number-in-python-including-unicode/).

The difference is that the second version also checks for Unicode numeric characters in addition to the regular float conversion.

def is_number(s):
  # First try the ordinary float conversion
  try:
    float(s)
    return True
  except ValueError:
    pass

  # Then try a single Unicode numeric character
  try:
    import unicodedata
    unicodedata.numeric(s)
    return True
  except (TypeError, ValueError):
    pass

  return False
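The loop described above might look like the following sketch. The sample strings and the list names `numbers` and `others` are hypothetical and chosen only to show the behaviour; they are not the data from the question.

```python
import unicodedata


def is_number(s):
    # Ordinary float conversion first.
    try:
        float(s)
        return True
    except ValueError:
        pass
    # Then a single Unicode numeric character.
    try:
        unicodedata.numeric(s)
        return True
    except (TypeError, ValueError):
        pass
    return False


# Hypothetical sample data, not taken from the question.
samples = ['25', '3.14', '0x1F', 'A', '@']
numbers = [s for s in samples if is_number(s)]
others = [s for s in samples if not is_number(s)]
print(numbers)
print(others)
```

Note that a string such as '0x1F' is not accepted by float(), and that an element like '\x42' is just the letter 'B' at runtime, so this test on its own would not separate the escaped elements in the question's list from the plain ones.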
Community
sabbahillel