3

I want to remove all words containing numbers, for example:

LW23 London W98 String

From the string above, the only thing I want to remain is "London String". Can this be done with a regex?

I'm currently using Python but PHP code is fine too.

Thanks!

EDIT:

Here is what I can do for now:

>>> a = "LW23 London W98 String"
>>> b = a.split(' ')
>>> b
['LW23', 'London', 'W98', 'String']
SilentGhost
prototype
  • [Regex to delete all words containing numbers from a sentence](http://stackoverflow.com/questions/11024174/regex-to-delete-all-words-containing-numbers-from-a-sentence) – loler Nov 19 '12 at 12:41

6 Answers

6

Yes, you can:

import re

result = re.sub(
    r"""(?x) # verbose regex
    \b    # Start of word
    (?=   # Look ahead to ensure that this word contains...
     \w*  # (after any number of alphanumeric characters)
     \d   # ...at least one digit.
    )     # End of lookahead
    \w+   # Match the alphanumeric word
    \s*   # Match any following whitespace""", 
    "", subject)
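
A minimal, self-contained way to try this pattern on the sample string from the question. The function name `remove_words_with_digits` is only illustrative, and the final `strip()` is an addition that is not part of the pattern above: it drops the space left behind when the last word of the input is removed.

```python
import re

# Same pattern as above, compiled once with re.VERBOSE instead of inline (?x).
WORD_WITH_DIGIT = re.compile(
    r"""
    \b          # start of word
    (?=\w*\d)   # look ahead: the word contains at least one digit
    \w+         # the word itself
    \s*         # any following whitespace
    """,
    re.VERBOSE,
)

def remove_words_with_digits(text):
    # strip() is an addition: it removes the space that remains when the
    # final word of the input is the one being deleted.
    return WORD_WITH_DIGIT.sub("", text).strip()

print(remove_words_with_digits("LW23 London W98 String"))
```

For the question's input this prints `London String`.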
Tim Pietzcker
3

You can try a preg_replace with this pattern:

/(\w*\d+\w*)/

Something like $esc_string = preg_replace('/(\w*\d+\w*)/', '', $old_string);
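
Since the question is about Python, here is a rough Python equivalent of the same pattern using `re.sub`. Note that the pattern removes only the words themselves, so the surrounding spaces stay behind; the whitespace clean-up step at the end is an addition, not part of the pattern above.

```python
import re

old_string = "LW23 London W98 String"

# Same pattern as the preg_replace call: a run of word characters
# that contains at least one digit.
raw = re.sub(r"\w*\d+\w*", "", old_string)

# The spaces around the removed words are still there, so collapse them.
cleaned = " ".join(raw.split())

print(repr(raw))
print(repr(cleaned))
```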

Maxime
3

Depends on what a 'word' is I guess, but if we're talking whitespace as separators and if it doesn't have to be a regex:

>>> ' '.join(filter(str.isalpha, a.split()))
'London String'
Jon Clements
  • @SilentGhost so it does - good catch - I was focusing on the example string - mea culpa – Jon Clements Nov 19 '12 at 13:00
  • The question doesn't say anything about punctuation - what should happen to `LW23, London` for example? As long as only whitespace is concerned, this is the best answer to me. – georg Nov 19 '12 at 13:42
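
To make the punctuation point concrete, here is a small sketch comparing the plain `str.isalpha` filter with one possible variant that ignores leading and trailing punctuation. The helper name `keep_alpha_words` and the choice to keep the punctuation in the output are assumptions for illustration; the question does not say what should happen in this case.

```python
import string

def keep_alpha_words(text):
    # Strip leading/trailing punctuation before the isalpha() test, but keep
    # the original token (punctuation included) in the output.
    return " ".join(
        w for w in text.split() if w.strip(string.punctuation).isalpha()
    )

sample = "LW23, London W98 String."

# Plain isalpha filter: "String." is dropped because of the trailing dot.
print(" ".join(filter(str.isalpha, sample.split())))

# Punctuation-tolerant variant.
print(keep_alpha_words(sample))
```

The first call prints `London`, the second `London String.`.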
1

I'm not 100% sure, and this is only a suggestion for a possible solution. I'm not a Python expert, and I'd probably have a better idea of what to do if I saw the full code.

My suggestion would be to split the string into a list of words, go through the words one at a time, and use an if statement to check each one for digits: drop the word if it contains a digit, and add it to a new list if it does not. You could then re-order the list to have the words in the appropriate order.

Sorry if this doesn't help; I just know that if I encountered the problem, this sort of solution is where I would start.
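
The approach described above could be sketched roughly as follows. The function name `drop_words_with_digits` is hypothetical, and whitespace is assumed to be the only word separator. Because words are appended in the order they are visited, the new list already has the original order.

```python
def drop_words_with_digits(sentence):
    kept = []
    for word in sentence.split():
        # Keep the word only if none of its characters is a digit.
        if not any(ch.isdigit() for ch in word):
            kept.append(word)
    # Words were appended in their original order, so just join them.
    return " ".join(kept)

print(drop_words_with_digits("LW23 London W98 String"))
```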

Scott Browne
  • As this is your first answer, I'll give you +1, but for the future, post some working code instead of describing how you would do that. – georg Nov 19 '12 at 14:05
1

You could do this with a regex plus comprehension:

clean = [w for w in test.split(' ') if not re.search(r"\d", w)]

or

words = test.split(' ')
regex = re.compile(r"\d")
clean = [w for w in words if not regex.search(w)]

Input:

"LW23 London W98 String X5Y 99AP Okay"

Output:

['London', 'String', 'Okay']
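
A runnable version of the second variant, with the import and the input defined, plus a `join` to get back the single string the question asks for. The variable name `result` and the join step are additions for illustration.

```python
import re

test = "LW23 London W98 String X5Y 99AP Okay"

# Raw string so that \d reaches the regex engine unchanged.
regex = re.compile(r"\d")
clean = [w for w in test.split(' ') if not regex.search(w)]

# Join the surviving words back into one string.
result = ' '.join(clean)

print(clean)
print(result)
```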
LSerni
0

You can match a word containing numbers with

/\w*\d+\w*/

or you could match all the words without numbers (and keep them). Note that plain /\w+/ also matches digits, so the digits have to be excluded explicitly:

/\b[^\W\d]+\b/
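
A quick check of both ideas in Python with `re.findall`. Keep in mind that `\w` also matches digits, so a "keep the words" pattern has to exclude them; the `\b[^\W\d]+\b` pattern used here is one possible way to do that and is an assumption of this sketch (it also treats `_` as a letter).

```python
import re

text = "LW23 London W98 String"

# Words that contain at least one digit (the first pattern above).
with_digits = re.findall(r"\w*\d+\w*", text)

# \w matches digits too, so plain \w+ returns every word.
all_words = re.findall(r"\w+", text)

# [^\W\d] is "a word character that is not a digit"; the \b anchors stop
# the pattern from matching just the letter part of a mixed word.
without_digits = re.findall(r"\b[^\W\d]+\b", text)

print(with_digits)
print(all_words)
print(" ".join(without_digits))
```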
Ramfjord