
I have a string in a list, and I want to split it on each of my separators. I don't want to use a regular expression: a regex does it in a single operation, but I want to use for loops and split() to achieve it. How can I do this? Here's my code:

aa = ['prinec-how,are_you&&smile#isfine1']

separator = ["-",",","_","&","#"]
l1 = []

for sep in separator:
    for i in aa:
        #print("i:",i)
        split_list = i.split(sep)
        aa = split_list
        print("aa:",aa)
        #print("split_list:",split_list)
    l1 =l1 + split_list
print(l1)

Required output:

['prinec','how','are','you','smile','isfine1']
Smack Alpha
  • Just out of interest, why don't you want to use a regex? – Robin Zigmond Apr 25 '19 at 11:18
  • "I have a string in a list. I want to split values based on my separator. I don't wanna use any regex expression." is sort of like "I have a nail. I want to drive it into a block of wood. I don't wanna use a hammer." – John Coleman Apr 25 '19 at 11:19
  • I want to check the separators one by one. A regex performs it in a single operation; I want to use for loops and split() to achieve it. – Smack Alpha Apr 25 '19 at 11:20
  • You do realise there won't be any observable difference in the result, right? (There may be a performance difference; I would guess the regex would be faster, but I can't be sure.) – Robin Zigmond Apr 25 '19 at 11:21

3 Answers


Replace each separator with a space using str.replace(), then split on whitespace with str.split().

Example:

aa = ['prinec-how,are_you&&smile#isfine1']
separator = ["-",",","_","&","#"]

for i in aa:
    for sep in separator:
        i = i.replace(sep, " ")
    print(i.split())

Output:

['prinec', 'how', 'are', 'you', 'smile', 'isfine1']
Rakesh
  • Nice idea, but this assumes that the sentence should actually _also_ be split at whitespace, which is not explicitly stated. (i.e., should `'a&b c'` become `['a', 'b', 'c']` or `['a', 'b c']`?) Instead, you could just replace all the special characters by the first (or any other) from that list, e.g. `'-'`, and then do `split('-')`. – tobias_k Apr 25 '19 at 11:29
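
The variant suggested in the comment above can be sketched as follows. This is only an illustration, not part of the original answer: the helper name split_on_any is hypothetical, the separator list is assumed to be non-empty, and the empty strings produced by adjacent separators such as "&&" are assumed to be unwanted.

```python
def split_on_any(text, separators):
    """Split text on any of the separators, leaving whitespace untouched."""
    first = separators[0]  # assumes at least one separator is given
    for sep in separators[1:]:
        # Rewrite every other separator as the first one.
        text = text.replace(sep, first)
    # Adjacent separators such as "&&" yield empty strings; drop them.
    return [part for part in text.split(first) if part]


aa = ['prinec-how,are_you&&smile#isfine1']
separator = ["-", ",", "_", "&", "#"]

print(split_on_any(aa[0], separator))
# ['prinec', 'how', 'are', 'you', 'smile', 'isfine1']
print(split_on_any('a&b c', separator))
# ['a', 'b c']
```

Unlike replacing with a space, this keeps 'b c' as one item because whitespace is never treated as a separator.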

Instead of using a regular expression (which would be the sensible thing to do here), you could, for example, use itertools.groupby to group characters by whether they are separators or not, and then keep only the groups that are not separators.

aa = ['prinec-how,are_you&&smile#isfine1']
separator = ["-",",","_","&","#"]

from itertools import groupby
res = [''.join(g) for k, g in groupby(aa[0], key=separator.__contains__) if not k]
# res: ['prinec', 'how', 'are', 'you', 'smile', 'isfine1']
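
To see what groupby yields before the separator groups are filtered out, here is a small sketch on a shorter, made-up input (the string 'a-b&&c' and the variable names are illustrative only):

```python
from itertools import groupby

separator = ["-", ",", "_", "&", "#"]
text = 'a-b&&c'  # short illustrative input, not from the question

# Each group is a run of consecutive characters sharing the same key:
# True for separator characters, False for everything else.
groups = [(is_sep, ''.join(chars))
          for is_sep, chars in groupby(text, key=separator.__contains__)]
print(groups)
# [(False, 'a'), (True, '-'), (False, 'b'), (True, '&&'), (False, 'c')]
```

Keeping only the groups whose key is False gives the words, and a run like '&&' collapses into a single separator group, so no empty strings appear.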

As I understand your approach, you want to iteratively split the strings in the list by the different separators and add their parts back to the list. This way, it also makes sense for aa to be a list initially holding a single string. You could do this much more easily with a list comprehension, replacing aa with a new list holding the words from the previous aa split by the next separator:

aa = ['prinec-how,are_you&&smile#isfine1']
separator = ["-",",","_","&","#"]

for s in separator:
    aa = [x for a in aa for x in a.split(s) if x]
# aa: ['prinec', 'how', 'are', 'you', 'smile', 'isfine1']
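
For readers who prefer plain for loops and split(), as the question asks, the same iterative idea can be unrolled like this. It is a sketch only: the function name split_all and its local variable names are hypothetical and not part of the original answer.

```python
def split_all(strings, separators):
    """Split every string by each separator in turn, dropping empty pieces."""
    parts = list(strings)
    for sep in separators:
        next_parts = []
        for item in parts:
            for piece in item.split(sep):
                if piece:  # skip '' produced by adjacent separators like "&&"
                    next_parts.append(piece)
        # The pieces from this pass become the input for the next separator.
        parts = next_parts
    return parts


aa = ['prinec-how,are_you&&smile#isfine1']
separator = ["-", ",", "_", "&", "#"]

print(split_all(aa, separator))
# ['prinec', 'how', 'are', 'you', 'smile', 'isfine1']
```

The key difference from the code in the question is that the result list is rebuilt once per separator and then fed into the next pass, instead of being reassigned while it is being iterated.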
tobias_k

Using regex

import re

# Match runs of one or more characters that are not separators
a = re.compile(r'[^-,_&#]+')

ST = 'prinec-how,are_you&&smile#isfine1'
b = a.findall(ST)
print(b)
"""
output

['prinec', 'how', 'are', 'you', 'smile', 'isfine1']

"""

Using a for loop

aa = ['prinec-how,are_you&&smile#isfine1','prinec-how,are_you&&smile#isfi-ne1']

separator = ["-",",","_","&","#"]

for i in range(len(aa)):
    j = aa[i]
    for sep in separator:
        j = j.replace(sep, ' ')
    aa[i] = j.split()

print(aa)

Output:

[['prinec', 'how', 'are', 'you', 'smile', 'isfine1'], ['prinec', 'how', 'are', 'you', 'smile', 'isfi', 'ne1']]
sahasrara62