
I have a string with the format exp = '(( 200 + (4 * 3.14)) / ( 2 ** 3 ))'

I would like to split the string into tokens with re.split() and keep the separators as well. However, I cannot keep ** together as one token; it ends up being split into two separate * tokens instead.

This is my code: tokens = re.split(r'([+|-|**?|/|(|)])',exp)

My Output (wrong):

['(', '(', '200', '+', '(', '4', '*', '3.14', ')', ')', '/', '(', '2', '*', '*', '3', ')', ')']

Is there a way to make the split distinguish between * and **? Thank you so much!

Desired Output:

['(', '(', '200', '+', '(', '4', '*', '3.14', ')', ')', '/', '(', '2', '**', '3', ')', ')']

wink
  • Ugly but simple: first replace `**` with a single character that cannot appear in the input, such as an underscore. Then split, and afterwards replace it back. – Pablo Henkowski Jan 23 '21 at 12:10
  • @PabloHenkowski oh yeah, I never thought of that! Thank you so much! – wink Jan 23 '21 at 12:17
  • Try this regexp: `r'(\*\*|\*|\+|-|/|\(|\))'` (see the `ast` module for a better alternative) https://stackoverflow.com/questions/5049489/evaluating-mathematical-expressions-in-python – VPfB Jan 23 '21 at 12:27
  • @VPfB yup this works! Thank you so much! – wink Jan 24 '21 at 03:52
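
The placeholder idea from the first comment can be sketched as follows. This is only an illustration, not the commenter's own code: the name `PLACEHOLDER` and the choice of an underscore are assumptions, and the approach only works if that character never occurs in the input.

```python
import re

exp = '(( 200 + (4 * 3.14)) / ( 2 ** 3 ))'

# Hypothetical placeholder: any single character that cannot occur in the input.
PLACEHOLDER = '_'

# 1. Collapse the two-character operator into one character.
masked = exp.replace('**', PLACEHOLDER)

# 2. Split on single-character separators; the capture group keeps them.
#    The '-' is listed first so it is a literal, not part of a range.
parts = re.split(r'([-+*/()_])', masked)

# 3. Drop whitespace-only pieces and restore the original operator.
tokens = [p.strip().replace(PLACEHOLDER, '**') for p in parts if p.strip()]
print(tokens)
```

The whitespace filtering in step 3 is needed because re.split() also returns the spaces and empty strings that sit between separators.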

1 Answer


The [...] notation only lets you specify individual characters. To match alternatives of different lengths, you need to use the | operator outside of the brackets. This also means that you need to escape the regular expression operators, and that you need to place the longer patterns before the shorter ones (i.e. ** before *), because alternatives are tried from left to right:

tokens = re.split(r'(\*\*|\*|\+|\-|/|\(|\))',exp)

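or even shorter (the `-` is escaped because an unescaped `+-/` inside brackets is a range that also matches `,` and `.`, which would split 3.14 apart):

tokens = re.split(r'(\*\*|[*+\-/()])',exp)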
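
For completeness, here is a minimal runnable sketch of the whole tokenization, assuming the goal is the list shown under "Desired Output". The function name `tokenize` is made up for illustration. Because re.split() also returns the whitespace and empty strings found between separators, the sketch filters those out afterwards.

```python
import re

def tokenize(expression):
    """Split an arithmetic expression into operator and operand tokens."""
    # '**' is listed before the character class so it wins over a single '*'.
    # The '-' is escaped so that it is a literal rather than a range operator.
    pattern = r'(\*\*|[*+\-/()])'
    pieces = re.split(pattern, expression)
    # Keep only non-blank pieces, with surrounding spaces removed.
    return [piece.strip() for piece in pieces if piece.strip()]

exp = '(( 200 + (4 * 3.14)) / ( 2 ** 3 ))'
print(tokenize(exp))
```

With the expression from the question, this prints the desired token list, with '**' kept as a single token and '3.14' left intact.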
Alain T.