-2

I have a field that looks like this:

------->Total cash dispensed: 40000 MGA

I want to extract only the "MGA" part using a regex, without using split.

regex = r"Total cash dispensed:\s*([^ 0-9]*)"

My pattern, which is meant to capture anything that is not a digit or whitespace, does not work. How do I fix this?

MARK

2 Answers

2

You might use a capture group:

\bTotal cash dispensed:\s*\d+\s+([A-Z]+)\b
  • \bTotal cash dispensed:\s* Match the literal text (including the colon), preceded by a word boundary and followed by optional whitespace chars
  • \d+\s+ Match 1+ digits and 1+ whitespace chars
  • ([A-Z]+) Capture group 1, match 1+ chars A-Z
  • \b A word boundary to prevent a partial match


import re
 
pattern = r"\bTotal cash dispensed:\s*\d+\s+([A-Z]+)\b"
s = "------>Total cash dispensed: 40000 MGA"
 
matches = re.search(pattern, s)
 
if matches:
    print(matches.group(1))

Output

MGA
The fourth bird
  • Just to clarify: `\d` is not the same as `[0-9]` :) Example: `re.match('\d', '١')` – Riccardo Bucco May 31 '21 at 14:44
  • 2
    @RiccardoBucco Right, [`(?a)\d` = `[0-9]`](https://ideone.com/Cvavwr) in Python 3. `\d` matches any Unicode digits by default. `re.ASCII` / `re.A` or `(?a)` can be used to make `\d` equal to `[0-9]`. – Wiktor Stribiżew May 31 '21 at 15:01
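To illustrate the point made in these comments, here is a minimal sketch comparing `\d` with `[0-9]` on a non-ASCII digit. The variable names are only for illustration, and it assumes Python 3 with a `str` pattern:

```python
import re

# '١' is ARABIC-INDIC DIGIT ONE (U+0661), a Unicode decimal digit
arabic_indic_one = "\u0661"

# By default, \d matches any Unicode decimal digit for str patterns
unicode_match = re.match(r"\d", arabic_indic_one)

# With re.ASCII (or the inline flag (?a)), \d is restricted to [0-9]
ascii_match = re.match(r"\d", arabic_indic_one, re.ASCII)
inline_match = re.match(r"(?a)\d", arabic_indic_one)

# An explicit character class never matches the non-ASCII digit
class_match = re.match(r"[0-9]", arabic_indic_one)

print(unicode_match is not None)  # True
print(ascii_match is None)        # True
print(inline_match is None)       # True
print(class_match is None)        # True
```

If the amount is always written with ASCII digits, either `[0-9]` or `re.ASCII` makes that intent explicit.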
0

You can match anything except whitespace and digits with the following expression:

[^\s\d]+

"\s" is whitespace, including spaces, tabs, and newlines

"\d" is digits: [0-9] plus, by default in Python 3, decimal digits from other scripts

Example:

import re

re.search(r'[^\s\d]+', '123 \t')
# -> None (no match: the string contains only digits and whitespace)
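Applied to the string from the question, this class can be combined with the literal prefix. The sketch below is one possible way to do it, not the only one. It also shows why the original pattern appears not to work: the capture group starts right at the digits, so `[^ 0-9]*` matches zero characters and the group is empty.

```python
import re

s = "------->Total cash dispensed: 40000 MGA"

# Pattern from the question: after \s* the next character is "4",
# which the class excludes, so the group captures an empty string.
original = re.search(r"Total cash dispensed:\s*([^ 0-9]*)", s)
print(repr(original.group(1)))  # ''

# Consume the amount first, then capture the run of characters
# that are neither whitespace nor digits.
fixed = re.search(r"Total cash dispensed:\s*\d+\s*([^\s\d]+)", s)
if fixed:
    print(fixed.group(1))  # MGA
```

Note that `[^\s\d]+` also accepts punctuation and lowercase letters, so it is looser than `[A-Z]+` if the currency code is always uppercase.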
mousetail