I want to remove semicolons, parentheses and dashes at the same time. How can I do this with split()?
For example, converting (shipment_id[s]-ef234;sender[s]-Taiwan Electronics Co) into ["shipment_id[s]", "ef234", "sender[s]", "Taiwan Electronics Co"]
What if an item already contains dashes, like "JB HI-FI"? How should that be handled?

Jody Hou
1 Answer
This is simple to achieve with a regex: re.split accepts a character class that lists every delimiter.
import re

# Inside a character class, '|' is a literal pipe, not "or", so list only the delimiters.
pattern = r'[-;]'
string = 'shipment_id[s]-ef234;sender[s]-Taiwan Electronics Co'
re.split(pattern, string)
# ['shipment_id[s]', 'ef234', 'sender[s]', 'Taiwan Electronics Co']
If the () are included, we can add them to the pattern and filter the result with a list comprehension that keeps only the non-empty (truthy) strings.
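string = '(shipment_id[s]-ef234;sender[s]-Taiwan Electronics Co)'
# Parentheses need no escaping inside a character class.
pattern = r'[-;()]'
# Splitting at the leading '(' and trailing ')' yields empty strings; filter them out.
print([part for part in re.split(pattern, string) if part])
# ['shipment_id[s]', 'ef234', 'sender[s]', 'Taiwan Electronics Co']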
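The pattern above splits on every dash, so a value such as "JB HI-FI" (raised in the question) would be broken apart. Here is a minimal sketch of one possible workaround. It assumes that pairs are separated by `;`, that keys never contain a dash, and that only the first dash in each pair separates key from value. The helper name `split_pairs` is hypothetical, not part of the original answer.

```python
def split_pairs(text):
    """Split '(key-value;key-value)' into a flat [key, value, ...] list.

    Assumption: only the first '-' in each pair is a separator.
    """
    parts = []
    # Remove the surrounding parentheses, then separate the pairs on ';'.
    for pair in text.strip('()').split(';'):
        if not pair:
            continue  # skip empty pieces, e.g. from a trailing ';'
        # partition() splits on the first '-' only, so later dashes stay in the value.
        key, sep, value = pair.partition('-')
        parts.append(key)
        if sep:
            parts.append(value)
    return parts


print(split_pairs('(shipment_id[s]-ef234;sender[s]-JB HI-FI)'))
# ['shipment_id[s]', 'ef234', 'sender[s]', 'JB HI-FI']
```

Note that `strip('()')` would also remove a parenthesis that legitimately ends the last value, so this only fits input shaped like the example.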

PacketLoss
- You'll have extra work if you keep the starting and ending `(` and `)` as the question asks because of the empty strings it will create. – Mark Apr 28 '21 at 04:02
- A simple list comprehension can remove empty strings if that's really a huge concern. Or just use `strip` to remove the parentheses before doing the `split`. – Aelarion Apr 28 '21 at 04:06
- Added to account for `()` @Mark – PacketLoss Apr 28 '21 at 04:08
- I wonder if `re.findall(r'[^;()-]+',s)` or similar would be simpler. – Mark Apr 28 '21 at 04:08
- What if an item contains dashes, like "JB HI-FI"? How should that be handled? – Jody Hou Apr 28 '21 at 07:21
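For comparison, the two alternatives suggested in the comments (stripping the parentheses before splitting, and `re.findall` with a negated character class) could be sketched as follows. The variable names are illustrative only. Both variants still split on every dash, so neither handles a value like "JB HI-FI".

```python
import re

s = '(shipment_id[s]-ef234;sender[s]-Taiwan Electronics Co)'

# Option 1: strip the outer parentheses first, so no empty strings are produced.
stripped_then_split = re.split(r'[-;]', s.strip('()'))

# Option 2: collect every run of characters that is not a delimiter.
found = re.findall(r'[^;()-]+', s)

print(stripped_then_split)
# ['shipment_id[s]', 'ef234', 'sender[s]', 'Taiwan Electronics Co']
print(found)
# ['shipment_id[s]', 'ef234', 'sender[s]', 'Taiwan Electronics Co']
```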