I have the following strings in a list:

listx=['info/base','tri-gen']

I am trying to replace both the '/' and the '-' with a space in a single pass.

Currently I have two separate blocks of code (shown below), one for each separator:

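    listx=['info/base','tri-gen']
    # `'/' in listx` only tests for an element equal to '/', so check each string instead
    if any('/' in item for item in listx):
        listmain = '/'.join(listx).split('/')
        listmain = list(filter(None, listmain))

    if any('-' in item for item in listx):
        listmain = '-'.join(listx).split('-')
        listmain = list(filter(None, listmain))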

How do I achieve this with a single if condition, or is there a way to pass several separators at once, for example like below?

'-','/'.join(listx).split('-','/')

Expected output

listx=['info base','tri gen']
Ridhima Kumar

1 Answer

The quick way to do this is with the re module, which provides regular-expression support. See the documentation: https://docs.python.org/3/library/re.html

import re
listx=['info/base','tri-gen']

# Raw string with a character class: neither '/' nor a trailing '-' needs escaping
[re.sub(r"[/-]", " ", i) for i in listx]

Output:

['info base', 'tri gen']
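
As a rough sketch of the same idea, the substitution can be written with a precompiled character class, or without regex at all via `str.translate`. The names `SEPARATORS`, `TABLE`, `with_regex`, and `with_translate` are illustrative and not part of the original answer.

```python
import re

listx = ['info/base', 'tri-gen']

# Option A: a character class lists every separator once.
SEPARATORS = re.compile(r"[/-]")
with_regex = [SEPARATORS.sub(" ", item) for item in listx]

# Option B: no regex; str.translate maps each separator to a space.
TABLE = str.maketrans({"/": " ", "-": " "})
with_translate = [item.translate(TABLE) for item in listx]

print(with_regex)      # ['info base', 'tri gen']
print(with_translate)  # ['info base', 'tri gen']
```

Either option extends to more separators by adding characters to the class or to the mapping, so no if condition per separator is needed.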

EDIT

Regarding your comment, I think you can get away without an if statement.

This regex will find all the words you need while ignoring the ones in parentheses:

\b\w+\b(?![\(\w+\)])

See it at work: https://regex101.com/r/YqhJDb/1

You can implement something like this:

[" ".join(re.findall(r"\b\w+\b(?![\(\w+\)])", i)) for i in listx]

Output:

['info base', 'tri gen', 'century tech limited']
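
A minimal sketch of how this pattern behaves, assuming the third list entry was `'century tech limited (CTL)'` (inferred from the output above, since the input is not shown). The name `WORD_NOT_BEFORE_PAREN` is illustrative.

```python
import re

# Assumed input: the third entry is inferred from the output shown above.
listx = ['info/base', 'tri-gen', 'century tech limited (CTL)']

# A whole word, not directly followed by '(', ')', '+', or a word character.
WORD_NOT_BEFORE_PAREN = re.compile(r"\b\w+\b(?![\(\w+\)])")

cleaned = [" ".join(WORD_NOT_BEFORE_PAREN.findall(item)) for item in listx]
print(cleaned)  # ['info base', 'tri gen', 'century tech limited']

# Limitation: a space before the closing parenthesis defeats the lookahead.
spaced = " ".join(WORD_NOT_BEFORE_PAREN.findall("tri-gen ( CTL )"))
print(spaced)  # tri gen CTL
```

Note that the square brackets make the lookahead a character class, so it only inspects the single character right after each word.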

Fourier
  • @RidhimaKumar please check my updated answer – Fourier Dec 03 '19 at 12:30
  • I have a question about your solution above. When I have a list like `['Dynamic', 'Case', 'Management', '(', 'DCM', ')']`, your solution `[" ".join(re.findall(r"\b\w+\b(?![\(\w+\)])", i)) for i in listx]` does not work. It returns `['Dynamic', 'Case', 'Management', '', 'DCM', '']`, while the ideal output is `['Dynamic', 'Case', 'Management']` – Ridhima Kumar Dec 04 '19 at 10:52
  • This was not contained in your original example. Try: `(?![\w\s]*[\)])\w+` – Fourier Dec 04 '19 at 15:27
  • Yes, you are right, this was not contained in the original example. The issue only came up when I tried a list of the type I mentioned above. Your initial solution was perfect for my problem at the time. As for `(?![\w\s]*[\)])\w+`, I tried it as below (not sure I am doing it right): `[" ".join(re.findall(r"\b\w+\b((?![\w\s]*[\)])\w+)", i)) for i in listx]` and I am getting all blank elements `['', '', '', '', '', '']`. Also, I hope that `(?![\w\s]*[\)])\w+` still removes / and - as stated in the original question. – Ridhima Kumar Dec 04 '19 at 16:04
  • The above gives me an invalid syntax error with a ^ pointing at `i)`. Also, does the above expression remove '-' and '/' as well? When I add an extra bracket after `i`, like this: `[" ".join(re.findall(r"(?![\w\s]*[\)])\w+", i)) for i in listx]`, I get the output `['Dynamic', 'Case', 'Management', '', 'DCM', '']` – Ridhima Kumar Dec 04 '19 at 16:35
  • I am sorry, there was a `)` missing. Your example was a list that contained already-split items, which the pattern does not handle. See here: https://regex101.com/r/YqhJDb/2 . It works perfectly when using `listx = ["century tech limited (CTL)", "tri-gen ( CTL )"]` with `[" ".join(re.findall(r"(?![\w\s]*[\)])\w+", i)) for i in listx]` – Fourier Dec 04 '19 at 16:42
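
For the already-split list raised in the comments, one possible approach (a sketch, not something proposed in the thread) is to join the tokens into a single string before applying the pattern, so the lookahead can see the closing parenthesis. `OUTSIDE_PARENS` and `strip_parenthesised` are hypothetical names introduced for illustration.

```python
import re

# Pattern from the comments: a word is skipped when only word characters
# and whitespace separate its start from a later ')'.
OUTSIDE_PARENS = re.compile(r"(?![\w\s]*[\)])\w+")

def strip_parenthesised(tokens):
    """Hypothetical helper: join the tokens, then keep words outside (...)."""
    return OUTSIDE_PARENS.findall(" ".join(tokens))

tokens = ['Dynamic', 'Case', 'Management', '(', 'DCM', ')']
print(strip_parenthesised(tokens))  # ['Dynamic', 'Case', 'Management']

listx = ["century tech limited (CTL)", "tri-gen ( CTL )"]
print([" ".join(OUTSIDE_PARENS.findall(item)) for item in listx])
# ['century tech limited', 'tri gen']
```

Because `\w+` never matches '/' or '-', the same pattern also drops those separators. One caveat: it skips any word that is followed by a ')' with only words and spaces in between, even when no opening parenthesis is present.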