
For example, if the string is `hello %$ world %^& let me ^@ love && you`, the expected result would be `hello` in one variable and the remaining words in other variables, e.g. a="hello", b="world", and so on.

Vipul Rao

3 Answers


Use a regular expression, like this:

import re
a = "hello %$ world %^& let me ^@ love && you"
print(re.findall(r'\w+',a))
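# ['hello', 'world', 'let', 'me', 'love', 'you']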
Narendra

You can use regular expressions to retrieve the words from the string:

import re
my_string = "hello %$ world %^& let me ^@ love && you"
re.findall(r'\w+\b', my_string)
# ['hello', 'world', 'let', 'me', 'love', 'you']

See the Regular Expression HOWTO for more about regular expressions.

Update

As asked in the comments, here is a regexp to retrieve groups of words separated by special characters:

my_string = "hello world #$$ i love you #$@^ welcome to world"
re.findall(r'(\w+[\s\w]*)\b', my_string)  
# ['hello world', 'i love you', 'welcome to world']
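
If each group should end up in its own variable, as requested in the comments, tuple unpacking works; a minimal sketch, assuming the string yields exactly three groups:

a, b, c = re.findall(r'(\w+[\s\w]*)\b', my_string)
# a == 'hello world', b == 'i love you', c == 'welcome to world'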
Andriy Ivaneyko
  • I know this, but how do I store the string after a special character? For example, for `hello world #$$ i love you #$@^ welcome to world` the output must be a="hello world", b="i love you", c="welcome to world". – Vipul Rao Feb 14 '18 at 10:47
  • Or what if I import a CSV file which has something like this in a column and want to save each output to a particular column? – Vipul Rao Feb 14 '18 at 10:56
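
As a rough sketch for the CSV case raised in the comment above, the same pattern can be applied to each row with the standard csv module; the file name data.csv and the column index 0 are placeholders:

import csv
import re

with open('data.csv', newline='') as f:
    for row in csv.reader(f):
        groups = re.findall(r'(\w+[\s\w]*)\b', row[0])  # split the chosen column into word groups
        print(groups)  # one list per row, e.g. ['hello world', 'i love you', 'welcome to world']

Writing each group back to its own column would work the same way with csv.writer.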

The basic answer would be a regexp. I would recommend looking into the tokenizers from NLTK; they encompass research on the topic and give you the flexibility to switch to something more sophisticated later on. Guess what? It offers a regexp-based tokenizer too!

from nltk.tokenize import RegexpTokenizer

# keep runs of letters, digits and spaces as tokens
tokenizer = RegexpTokenizer(r'([A-Za-z0-9 ]+)')
corpus = tokenizer.tokenize("hello %$ world %^& let me ^@ love && you")
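
Note that the tokens can keep leading and trailing spaces here, because the space character is part of the character class; stripping them afterwards is straightforward:

corpus = [token.strip() for token in corpus]
# ['hello', 'world', 'let me', 'love', 'you']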
S van Balen