-2

I was wondering how I would extract a certain part of a string in Python.

Say I have a list of lines like this (example):

1 || awdawd@awdawd.com || awlkdjawldkjalwdkda
2 || aawdawd@awd.com || awdadwawdawdawdawd

I know I could use indexing and take the last 10 or so characters from each line, but that wouldn't work if they're different lengths. And it wouldn't work at all for the email.

I'm thinking regular expressions, but once I find the portion of the string, how would I copy just that part and append it to, say, a list?

The regular expression is simple for the email, but not so simple for the string after the '||'. So how would I do that? I'm having trouble getting my head around it. Maybe search for '||' and get everything after it? But then there are two '||' separators.

Any help is appreciated.

Awn

4 Answers

3

Split on || and take the last element with a negative index:

>>> L = ["|| awdawd@awdawd.com || awlkdjawldkjalwdkda", "|| aawdawd@awd.com || awdadwawdawdawdawd"]
>>> for x in L:
...     print(x.split('||')[-1].strip())
... 
awlkdjawldkjalwdkda
awdadwawdawdawdawd
TerryA
2

First, if you know the exact format of the strings, you can use the split() method. For example:

>>> string1 = "1 || awdawd@awdawd.com || awlkdjawldkjalwdkda"
>>> list1 = string1.split("||")
>>> list1
['1 ', ' awdawd@awdawd.com ', ' awlkdjawldkjalwdkda']
>>> list1[1].strip()
'awdawd@awdawd.com'

If you split the string on the substring "||", you get a list of three elements. The email is the second one, and strip() returns it without the surrounding whitespace.

If you don't know the exact structure of the strings but do know which substrings you want to extract, you can use regular expressions. There are quite a few recipes for this, including ones for emails.
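As a rough sketch of the regular-expression route, the snippet below pulls an email-like substring out of each line and appends it to a list. The pattern and the names EMAIL_RE, lines, and emails are my own assumptions for illustration; the pattern is deliberately loose and is not a full email validator.

```python
import re

# Assumed, deliberately loose pattern: runs of characters that are not
# whitespace, "|" or "@", joined by one "@" and containing a dot after it.
# It is only a sketch, not a complete email validator.
EMAIL_RE = re.compile(r"[^\s|@]+@[^\s|@]+\.[^\s|@]+")

lines = [
    "1 || awdawd@awdawd.com || awlkdjawldkjalwdkda",
    "2 || aawdawd@awd.com || awdadwawdawdawdawd",
]

emails = []
for line in lines:
    match = EMAIL_RE.search(line)
    if match:  # skip lines that contain no email-like substring
        emails.append(match.group(0))

print(emails)
```

match.group(0) is the matched text itself, so appending it to a list is all the "copying" that is needed.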

1

I think you want the first part. This splits the input on || and then prints the stripped content at index 1.

>>> s = '1 || awdawd@awdawd.com || awlkdjawldkjalwdkda'
>>> s.split('||')[1].strip()
'awdawd@awdawd.com'
>>> L = ["|| awdawd@awdawd.com || awlkdjawldkjalwdkda", "|| aawdawd@awd.com || awdadwawdawdawdawd"]
>>> for x in L:
...     print(x.split('||')[1].strip())
...
awdawd@awdawd.com
aawdawd@awd.com
Avinash Raj
1

I think str.split('||') is there for exactly this use case.

To remove the remaining whitespace, use str.strip() on the elements of the returned list.
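Put together, a minimal sketch could look like this, assuming every line has exactly three fields separated by "||" as in the question's example. The names lines, emails, and tokens are just placeholders I picked.

```python
lines = [
    "1 || awdawd@awdawd.com || awlkdjawldkjalwdkda",
    "2 || aawdawd@awd.com || awdadwawdawdawdawd",
]

emails = []
tokens = []
for line in lines:
    # Split on the separator, then strip the whitespace around each field.
    fields = [part.strip() for part in line.split("||")]
    if len(fields) != 3:
        # Assumption: lines that do not have three fields are skipped.
        continue
    _, email, token = fields
    emails.append(email)
    tokens.append(token)

print(emails)
print(tokens)
```

Since the fields are picked by position, it does not matter how long each one is or that the separator appears twice.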

deyhle