2

The input contains multiple space-separated characters, e.g. string = "a b c d a s e "

What should the pattern be so that, when I call re.search on the input with it, .group(j) returns the j-th character together with the space that follows it?

I tried something like "^(([a-zA-Z])\s)+", but it does not work. What should I do?

EDIT: My actual question is the one in the title; the body describes only a special case of it. Here is the general version: if I want to extract all matches of a specific pattern (the initial question used "[a-zA-Z]\s") from a string, what should I do?

Soham

3 Answers

6

Use findall() instead and get the j-th match by index:

>>> import re
>>> string = "a b c d a s e "
>>> j = 2
>>> re.findall(r"[a-zA-Z]\s", string)[j]
'c '

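where [a-zA-Z]\s matches a lowercase or uppercase ASCII letter followed by a single whitespace character (\s also matches tabs and newlines, not only spaces). Note that the list index j is 0-based here.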
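
For the general version in the question's EDIT (collecting every match of some pattern), the same idea can be sketched as below. This is only an illustration: `nth_match` is a hypothetical helper name, not part of the `re` module. One caveat worth knowing is that `re.findall` returns the captured groups rather than the whole match when the pattern contains capturing groups, so the sketch uses `finditer` and `group(0)` for the lazy lookup.

```python
import re

string = "a b c d a s e "
pattern = re.compile(r"[a-zA-Z]\s")

# All non-overlapping matches, in order of appearance.
matches = pattern.findall(string)


def nth_match(pattern, text, j):
    """Return the j-th (0-based) whole match of pattern in text, or None.

    Hypothetical helper: it walks the matches lazily with finditer, so it
    stops as soon as the j-th match is found.
    """
    for i, m in enumerate(pattern.finditer(text)):
        if i == j:
            return m.group(0)
    return None


print(matches)
print(nth_match(pattern, string, 2))
```

If the index is out of range, `nth_match` returns None instead of raising IndexError as the list indexing would.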

alecxe
5

Why use a regex when you can simply use the str.split() method and access the characters by index?

>>> string = "a b c d a s e "
>>> new = string.split()
>>> new
['a', 'b', 'c', 'd', 'a', 's', 'e']
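
One thing to keep in mind is that split() discards the whitespace, while the question asks for the character together with the space after it. A minimal sketch of getting that back, assuming every character in the input is followed by exactly one space (as in the example string):

```python
string = "a b c d a s e "

# split() with no arguments splits on runs of whitespace and drops them.
tokens = string.split()

j = 2  # 0-based index into the token list
# Re-attach the space; this is only valid under the one-space assumption.
with_space = tokens[j] + " "

print(repr(with_space))
```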
Mazdak
1

You could do:

>>> import re
>>> string = "a b c d a s e "
>>> j = 2
>>> re.search(r'([a-zA-Z]\s){%i}' % j, string).group(1)
'b '

Explanation:

  1. With the pattern ([a-zA-Z]\s) you capture a letter followed by a whitespace character;
  2. With the repetition {j} added (here {2}), the group keeps only the last iteration of the repetition, in this case the second one, so j counts from 1 here rather than from 0.
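
The same behaviour explains why the pattern in the question does not give one group per character: a repeated capturing group retains only its final iteration. A small sketch of both points follows; `jth_by_repetition` is a hypothetical helper name used only for illustration.

```python
import re

string = "a b c d a s e "

# The pattern from the question: the + repeats the group, but each group
# number still holds only the text from the last iteration.
m = re.search(r"^(([a-zA-Z])\s)+", string)
last_pair = m.group(1)    # last letter plus its trailing space
last_letter = m.group(2)  # last letter only


def jth_by_repetition(text, j):
    """Return the j-th (1-based) letter+whitespace pair, or None.

    Hypothetical helper: repeats the group exactly j times, so group(1)
    ends up holding the j-th iteration.
    """
    found = re.search(r"([a-zA-Z]\s){%d}" % j, text)
    return found.group(1) if found else None


print(repr(last_pair), repr(last_letter))
print(repr(jth_by_repetition(string, 2)))
```

Because re.search is not anchored, this relies on the pairs being consecutive from the first match onward, which holds for the example input.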

dawg