79

I'd like to remove all characters before a designated character or set of characters (for example):

intro = "<>I'm Tom."

Now I'd like to remove the <> before I'm (or more specifically, I). Any suggestions?

Saroekin

12 Answers

69

Use re.sub. Match everything up to the first I (the lazy .*? stops at the earliest I), then replace the matched characters with I.

re.sub(r'^.*?I', 'I', stri)
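The difference between stopping at the first and at the last occurrence comes down to lazy versus greedy matching. Here is a minimal sketch of both; the helper names `strip_before_first` and `strip_before_last` are hypothetical, and it uses a lookahead instead of re-inserting the marker, plus `re.escape` and `re.DOTALL`, which are assumptions beyond the one-liner above.

```python
import re

def strip_before_first(text, marker):
    """Remove everything before the first occurrence of marker."""
    # Lazy .*? stops at the earliest position where the lookahead succeeds.
    pattern = r'^.*?(?=' + re.escape(marker) + r')'
    return re.sub(pattern, '', text, count=1, flags=re.DOTALL)

def strip_before_last(text, marker):
    """Remove everything before the last occurrence of marker."""
    # Greedy .* runs to the end, then backtracks to the latest marker.
    pattern = r'^.*(?=' + re.escape(marker) + r')'
    return re.sub(pattern, '', text, count=1, flags=re.DOTALL)

sample = "<>I'm Tom. I am."
print(strip_before_first(sample, "I"))  # I'm Tom. I am.
print(strip_before_last(sample, "I"))   # I am.
```

If the marker does not occur at all, neither pattern matches and the string comes back unchanged.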
Wiktor Stribiżew
Avinash Raj
  • I'm fairly new to `re`, I'll look into it some more; I appreciate the answer, thanks! – Saroekin Jun 19 '15 at 19:25
  • 2
    Note that you can switch between the first and the last `I`: `re.sub(r'.*?I', 'I', stri)`. The other answers won't allow this. – Avinash Raj Jun 19 '15 at 19:26
  • 1
    So you're saying `re` is the best option? Do you have any good tutorials/articles explaining the fundamentals of `re`? And thanks for your help. – Saroekin Jun 19 '15 at 19:36
  • 2
    Choosing the answer is entirely up to you. Yes, learning regex is a must for every developer, since only a few languages don't use regex. – Avinash Raj Jun 19 '15 at 19:39
  • @AvinashRaj do you know how I could do this without knowing the length or characters? –  May 06 '18 at 12:39
  • @CoderPE you can use my answer which won't expect you to know the chars length. – Avinash Raj May 06 '18 at 13:03
  • Won't this remove all characters before the last "I"? Not all characters before the first "I"? – Ken May 16 '21 at 16:56
  • 2
    missing `import re` – quent May 17 '22 at 17:45
  • input and output strings are not properly indicated: `output_str = re.sub(r'^.*?I', 'I', input_str)` – quent May 17 '22 at 17:48
47

str.find returns the index of the first occurrence of a substring, so you can slice from there:

intro[intro.find('I'):]
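Because `str.find` returns `-1` when the substring is absent, the bare slice silently keeps only the last character in that case. A small guarded sketch follows; the helper name `drop_before` is hypothetical, and returning the input unchanged on a miss is an assumption about the desired behaviour.

```python
def drop_before(text, marker):
    """Return text starting at the first marker; unchanged if marker is absent."""
    pos = text.find(marker)
    # str.find returns -1 when marker is missing; text[-1:] would then
    # keep only the last character, so check for that case explicitly.
    return text if pos == -1 else text[pos:]

intro = "<>I'm Tom."
print(drop_before(intro, "I"))  # I'm Tom.
print(drop_before(intro, "Z"))  # <>I'm Tom.
```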
duan
  • 6
    If the character is missing from the string, this will return just the last character of the input string because `.find` will return `-1` and `some_str[-1:]` is "return all characters starting from the last one". – Boris Verkhovskiy Apr 29 '20 at 15:18
  • 3
    Thank you, that helped me partially. If you also want to remove the found character itself, use `intro[intro.find('I')+1:]` – Fisal Assubaieye Jan 26 '22 at 07:02
29

Since string.index(char) returns the index of the first occurrence of the character, you can simply slice from it: string[string.index(char):].

For example, in this case intro.index("I") is 2, and intro[2:] is "I'm Tom."
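Unlike `str.find`, `str.index` raises `ValueError` when the character is missing, so the slice needs a `try`/`except` if that can happen. A minimal sketch, with a hypothetical helper name `slice_from` and the assumption that a miss should leave the string untouched:

```python
def slice_from(text, marker):
    """Slice text from the first marker, leaving it untouched on a miss."""
    try:
        return text[text.index(marker):]
    except ValueError:
        # str.index raises instead of returning -1 when marker is absent.
        return text

intro = "<>I'm Tom."
print(slice_from(intro, "I"))  # I'm Tom.
print(slice_from(intro, "Z"))  # <>I'm Tom.
```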

Ashkay
  • 1
    No problem. This will work for any string as well. Note that 1) You will probably have to make sure the index is valid, i.e. not -1 and 2) `index` only returns the first occurrence of the given string. – Ashkay Jun 19 '15 at 19:26
  • 25
    Actual example would be: `intro[intro.index('I'):]` – mattalxndr Apr 05 '18 at 13:31
  • 1
    This will raise a `ValueError` if the character doesn't appear in the string. – Boris Verkhovskiy Apr 29 '20 at 15:20
8

If you know the character position of where to start deleting, you can use slice notation:

intro = intro[2:]

Alternatively, if you know which characters to remove rather than where to start, you can use the lstrip() method, which strips any leading characters that appear in its argument:

intro = intro.lstrip("<>")
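One caveat worth illustrating: `lstrip` treats its argument as a set of characters, not as a literal prefix, so it can remove more than intended. The sketch below contrasts it with `str.removeprefix`, which assumes Python 3.9 or newer; the sample strings are made up for illustration.

```python
intro = "<>I'm Tom."

# lstrip treats its argument as a set of characters, not a literal prefix.
stripped = intro.lstrip("<>")               # "I'm Tom."
mixed = "><<>I'm Tom.".lstrip("<>")         # "I'm Tom." (any order, any count)
too_much = "<><Tom> says hi".lstrip("<>")   # "Tom> says hi"

# str.removeprefix (Python 3.9+) removes one exact prefix, if present.
exact = "<><Tom> says hi".removeprefix("<>")  # "<Tom> says hi"

print(stripped, mixed, too_much, exact, sep="\n")
```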
Brent Washburne
3
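intro = "<>I'm Tom."  # avoid naming the variable str, which shadows the built-in
parts = intro.split("I", 1)  # split once, at the first "I"
parts[0] = parts[0].replace("<>", "")  # remove "<>" from the part before "I"
intro = "I".join(parts)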
ahmad valipour
2

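I looped over the string by index and skipped the unwanted characters. Note that this removes every < and > in the string, not only those before the I.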

intro_list = []

intro = "<>I'm Tom."
for i in range(len(intro)):
    if intro[i] == '<' or intro[i] == '>':
        pass
    else:
        intro_list.append(intro[i])

intro = ''.join(intro_list)
print(intro)
Mafematic
2
import re

date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"

up_to_word = ":"
rx_to_first = r'^.*?{}'.format(re.escape(up_to_word))
rx_to_last = r'^.*{}'.format(re.escape(up_to_word))

# (Dot.) In the default mode, this matches any character except a newline. 
# If the DOTALL flag has been specified, this matches any character including a newline.

print("Remove all up to the first occurrence of the word including it:")
print(re.sub(rx_to_first, '', date_div, flags=re.DOTALL).strip())

print("Remove all up to the last occurrence of the word including it:")
print(re.sub(rx_to_last, '', date_div, flags=re.DOTALL).strip())
2
>>> intro = "<>I'm Tom."
>>> # Just split the string at the special symbol
>>> intro.split("<>")
['', "I'm Tom."]
>>> new = intro.split("<>")
>>> new[1]
"I'm Tom."
Heiko Becker
0

This solution also works when the character is not in the string, at the cost of an explicit if check.

if 'I' in intro:
    # maxsplit=1 keeps everything after the first 'I', including any later 'I'
    print('I' + intro.split('I', 1)[1])
else:
    print(intro)
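The same check-then-split idea can also be written with `str.partition`, which splits at the first occurrence and reports whether the separator was found. This is a sketch of an alternative, not the approach used above; the helper name `drop_before_partition` is hypothetical.

```python
def drop_before_partition(text, marker):
    """Keep marker and everything after its first occurrence."""
    # partition returns (before, marker, after); if marker is missing it
    # returns (text, '', ''), so an empty separator signals "not found".
    _before, sep, after = text.partition(marker)
    return sep + after if sep else text

print(drop_before_partition("<>I'm Tom. I am.", "I"))  # I'm Tom. I am.
print(drop_before_partition("<>Tom.", "I"))            # <>Tom.
```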
Cade Harger
0

You can use itertools.dropwhile to drop all the characters that come before the character you want to stop at. Then, you can use ''.join() to turn the resulting iterable back into a string:

from itertools import dropwhile
intro = "<>I'm Tom."
stop = "I"  # drop characters until one contained in stop is seen
''.join(dropwhile(lambda x: x not in stop, intro))

This outputs:

I'm Tom.
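Since the question mentions a set of characters, the same idea extends to several stop characters at once. A minimal sketch, where the helper name `drop_until_any` is hypothetical; note that when none of the stop characters occurs, everything is dropped and the result is an empty string.

```python
from itertools import dropwhile

def drop_until_any(text, stop_chars):
    """Drop leading characters until one of stop_chars is reached."""
    stop = set(stop_chars)
    # dropwhile discards items while the predicate is true, then yields
    # the first failing item and everything after it unchanged.
    return ''.join(dropwhile(lambda ch: ch not in stop, text))

print(drop_until_any("<>I'm Tom.", "IT"))    # I'm Tom.
print(drop_until_any("<>Tom, I am.", "IT"))  # Tom, I am.
print(repr(drop_until_any("<>abc", "IT")))   # ''
```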
BrokenBenchmark
0

Based on @AvinashRaj's answer, you can use re.sub to substitute the part of a string matched by a regex with another string or character. That answer is missing import re and does not name its input and output strings:

import re
output_str = re.sub(r'^.*?I', 'I', input_str)
quent
-3
import re
intro = "<>I'm Tom."
re.sub(r'<>I', 'I', intro)
Tunaki
  • 1
    doesn't remove everything before a designated character (OP), e.g. with `intro = "junk<>I'm Tom."`, yields `"junkI'm Tom."` – PatrickT May 08 '20 at 22:21