How can I extract the name, id, and type from a string using re?

string = "Wacom Bamboo Connect Pen stylus   id: 15  type: STYLUS"

Expected result: ("Wacom Bamboo Connect Pen stylus", 15, "STYLUS")

2 Answers

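Use re.findall with the regex r'^(.*\S)\s+id:\s*(\d+)\s+type:\s*(.+)', which means:

^ : start of the string.
.* : any character (except a newline), repeated 0 or more times.
\S : a non-whitespace character.
\s+ : a whitespace character, repeated 1 or more times.
\s* : a whitespace character, repeated 0 or more times.
\d+ : a digit, repeated 1 or more times.
.+ : any character (except a newline), repeated 1 or more times.
(PATTERN) : capture the text matched by PATTERN and return it. Here there are 3 capture groups, so re.findall returns a list of 3-tuples.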

import re

string = "Wacom Bamboo Connect Pen stylus   id: 15  type: STYLUS"

lst = re.findall(r'^(.*\S)\s+id:\s*(\d+)\s+type:\s*(.+)', string)

# The first match (list element) is a tuple. Extract it:
lst = list(lst[0])
lst[1] = int(lst[1])
print(lst)
# ['Wacom Bamboo Connect Pen stylus', 15, 'STYLUS']
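
If you need the tuple from the question and want to handle lines that don't match, here is a minimal sketch of the same regex with named groups and re.search. The helper name parse_device and the group names are my own for illustration, not part of the answer above, and stripping trailing whitespace from the type is an assumption about the input.

```python
import re

# Same pattern as above, with named groups (the names are illustrative).
PATTERN = re.compile(r'^(?P<name>.*\S)\s+id:\s*(?P<id>\d+)\s+type:\s*(?P<type>.+)')

def parse_device(line):
    """Return (name, id, type), or None if the line does not match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    # Convert the id to int; strip trailing whitespace from the type (assumption).
    return (m.group('name'), int(m.group('id')), m.group('type').strip())

print(parse_device("Wacom Bamboo Connect Pen stylus   id: 15  type: STYLUS"))
# ('Wacom Bamboo Connect Pen stylus', 15, 'STYLUS')
```

Unlike indexing lst[0], this returns None instead of raising IndexError when the input has no match.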
Timur Shtatland

To match the first string before id you need:

.*(?=(id:))

To match the id you need:

(?<=id:.*)(\d*)(?=.*type)

To match the type you need:

(?<=type:.*)(\w+)

I would suggest you have a look at lookaheads and lookbehinds.
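
One caveat if you are using Python's re module, as the question title suggests: it only accepts fixed-width lookbehinds, so (?<=id:.*) and (?<=type:.*) are rejected there, although engines with variable-width lookbehind (for example .NET) accept them. Below is a rough sketch of a Python-compatible variant. It assumes exactly one whitespace character after "id:" and "type:", as in the example string, and the variable names are just for illustration.

```python
import re

s = "Wacom Bamboo Connect Pen stylus   id: 15  type: STYLUS"

# Variable-width lookbehind is not supported by Python's re module.
try:
    re.compile(r'(?<=id:.*)(\d*)(?=.*type)')
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Name: everything up to the last non-space character before "id:".
name = re.search(r'^.*\S(?=\s+id:)', s).group()

# Id and type: fixed-width lookbehinds ("id:" or "type:" plus one whitespace).
dev_id = int(re.search(r'(?<=id:\s)\d+', s).group())
dev_type = re.search(r'(?<=type:\s)\w+', s).group()

print((name, dev_id, dev_type))
# ('Wacom Bamboo Connect Pen stylus', 15, 'STYLUS')
```

If the spacing after the colons can vary, a single regex with capture groups is probably simpler than lookarounds.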

kny_92
  • Uh... `^[^id]*` matches any run of characters starting from the beginning of the string, consisting solely of characters that are neither `i` nor `d`. This technically works for the OP's specific example (before `id`, there are no `i`s or `d`s), but I suspect the inputs in general aren't that restrictive (`"Apple iPod id: 21 type: Player"` is going to fail). – ShadowRanger Jun 17 '22 at 15:32
  • Right, that is a major flaw, I updated my answer accordingly. Thanks! – kny_92 Jun 22 '22 at 11:42