How to remove all characters before a specific character in Python?

Question

I'd like to remove all characters before a designated character or set of characters (for example):

intro = "<>I'm Tom."

Now I'd like to remove the <> before I'm (or more specifically, I). Any suggestions?

I get that but in other cases? How do we know where the text starts? — Simeon Visser, Jun 19 '15 at 19:23
Well, I'm filtering through what I'm looking for in the text; so in response, you'd know where it starts by using loops, splitting text/words, etc. — Saroekin, Jun 19 '15 at 19:26

score 69 · Accepted Answer · edited Apr 29 '20 at 12:31

Use re.sub. Match all the characters up to the first I, then replace the matched characters with I.

re.sub(r'^.*?I', 'I', stri)
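
A minimal runnable sketch of this approach, assuming the input is held in a variable named stri as in the answer (the sample sentence is made up for illustration). The second call, with a greedy `.*`, is a variation that keeps everything from the last I instead of the first.

```python
import re

# Sample input (an assumption for illustration) with two capital "I"s.
stri = "<>I'm Tom. I like Python."

# Lazy .*? stops at the first "I": everything before it is dropped.
from_first = re.sub(r'^.*?I', 'I', stri)
print(from_first)   # I'm Tom. I like Python.

# Greedy .* runs to the last "I" on the line.
from_last = re.sub(r'^.*I', 'I', stri)
print(from_last)    # I like Python.
```

If the string contains no I at all, the pattern does not match and re.sub returns the input unchanged.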

edited Apr 29 '20 at 12:31 · Wiktor Stribiżew

answered Jun 19 '15 at 19:22 · Avinash Raj

I'm fairly new to `re`, I'll look into it some more; I appreciate the answer, thanks! – Saroekin Jun 19 '15 at 19:25
Note that you can switch between the first and the last `I`: `re.sub(r'.*?I', 'I', stri)`. The other answers don't allow this. – Avinash Raj Jun 19 '15 at 19:26
So you're saying `re` is the best option? Do you have any good tutorials/articles explaining the fundamentals of `re`? And thanks for your help. – Saroekin Jun 19 '15 at 19:36
Choosing the answer is entirely up to you. Yes, learning regex is a must for every developer, since only a few languages don't use regex. – Avinash Raj Jun 19 '15 at 19:39
@AvinashRaj do you know how I could do this without knowing the length or characters? – May 06 '18 at 12:39
@CoderPE you can use my answer, which doesn't require you to know the length of the characters. – Avinash Raj May 06 '18 at 13:03
Won't this remove all characters before the last "I"? Not all characters before the first "I"? – Ken May 16 '21 at 16:56
missing `import re` – quent May 17 '22 at 17:45
input and output strings are not properly indicated: `output_str = re.sub(r'^.*?I', 'I', input_str)` – quent May 17 '22 at 17:48

score 47 · Answer 2 · edited Jun 07 '18 at 13:01

str.find returns the index of the first occurrence of a substring:

intro[intro.find('I'):]
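
One thing to watch with this slice: str.find returns -1 when the substring is absent, and slicing from -1 keeps only the last character. A minimal sketch of a guarded version follows; drop_before is a hypothetical helper name, and returning the text unchanged on a miss is an assumption about the desired behaviour.

```python
def drop_before(text: str, marker: str) -> str:
    """Hypothetical helper: return text starting at the first marker."""
    idx = text.find(marker)
    # str.find returns -1 when marker is missing; text[-1:] would keep only
    # the last character, so return the text unchanged in that case.
    return text if idx == -1 else text[idx:]

intro = "<>I'm Tom."
print(drop_before(intro, "I"))   # I'm Tom.
print(drop_before(intro, "Z"))   # <>I'm Tom.
print(intro[intro.find("Z"):])   # .   (the unguarded slice)
```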

edited Jun 07 '18 at 13:01

answered Jun 07 '18 at 12:56 · duan

If the character is missing from the string, this will return just the last character of the input string, because `.find` will return `-1` and `some_str[-1:]` means "return all characters starting from the last one". – Boris Verkhovskiy Apr 29 '20 at 15:18
Thank you, that helped me partially. Also, if you want to remove everything up to and including the found character: `intro[intro.find('I')+1:]` – Fisal Assubaieye Jan 26 '22 at 07:02

score 29 · Answer 3 · answered Jun 19 '15 at 19:22

Since string.index(char) gets you the index of the first occurrence of the character, you can simply do string[string.index(char):].

For example, in this case intro.index("I") == 2, and intro[2:] == "I'm Tom."
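
A hedged sketch of the same idea as runnable code. Unlike str.find, str.index raises ValueError when the character is absent, so the hypothetical helper slice_from below catches it; returning the string unchanged in that case is an assumption, and re-raising may suit other uses better.

```python
def slice_from(text: str, char: str) -> str:
    """Hypothetical helper: keep text from the first occurrence of char."""
    try:
        return text[text.index(char):]
    except ValueError:
        # str.index raises ValueError when char does not occur in text.
        return text

intro = "<>I'm Tom."
print(intro.index("I"))        # 2
print(slice_from(intro, "I"))  # I'm Tom.
print(slice_from(intro, "Z"))  # <>I'm Tom.
```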

answered Jun 19 '15 at 19:22 · Ashkay

No problem. This will work for any string as well. Note that 1) You will probably have to make sure the index is valid, i.e. not -1 and 2) `index` only returns the first occurrence of the given string. – Ashkay Jun 19 '15 at 19:26
Actual example would be: `intro[intro.index('I'):]` – mattalxndr Apr 05 '18 at 13:31
This will raise a `ValueError` if the character doesn't appear in the string. – Boris Verkhovskiy Apr 29 '20 at 15:20

score 8 · Answer 4 · answered Jun 19 '15 at 20:15

If you know the character position of where to start deleting, you can use slice notation:

intro = intro[2:]

If you know the characters to remove rather than the position, you could use the lstrip() method:

intro = intro.lstrip("<>")
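
A short sketch of how lstrip behaves, since its argument is treated as a set of characters rather than a literal prefix. The extra sample strings are made up for illustration, and str.removeprefix is shown as an alternative that assumes Python 3.9 or later.

```python
intro = "<>I'm Tom."

# lstrip treats its argument as a set of characters, not a literal prefix:
# it removes any leading run made up of '<' and '>' in any order.
print("><<>I'm Tom.".lstrip("<>"))        # I'm Tom.
print("<html>".lstrip("<>"))              # html>

# str.removeprefix (Python 3.9+) removes one exact prefix, if present.
print(intro.removeprefix("<>"))           # I'm Tom.
print("><I'm Tom.".removeprefix("<>"))    # ><I'm Tom.
```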

answered Jun 19 '15 at 20:15 · Brent Washburne

score 3 · Answer 5 · answered Jun 19 '15 at 19:26

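intro = "<>I'm Tom."
# Split once at the first "I": temp[0] is the text before it, temp[1] the rest.
temp = intro.split("I", 1)
# Remove the unwanted "<>" from the part before "I".
temp[0] = temp[0].replace("<>", "")
# Rejoin the two parts, putting the "I" back.
intro = "I".join(temp)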
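
The split-and-rejoin idea can also be written with str.partition, which splits at the first occurrence of the separator and returns the part before it, the separator itself, and the part after it. A minimal sketch, where keep_from is a hypothetical helper name and returning the text unchanged on a miss is an assumption:

```python
def keep_from(text: str, marker: str) -> str:
    """Hypothetical helper: drop everything before the first marker."""
    _before, sep, after = text.partition(marker)
    # sep is empty when marker was not found; return the text unchanged then.
    return sep + after if sep else text

print(keep_from("<>I'm Tom.", "I"))   # I'm Tom.
print(keep_from("no marker", "I"))    # no marker
```

Unlike the version above, this drops everything before the marker, not only the "<>" characters.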

answered Jun 19 '15 at 19:26 · ahmad valipour

Not the downvoter but you may use this `'I' + intro.split('I', 1)[1]` – Avinash Raj Jun 19 '15 at 19:28
@AvinashRaj Neither am I, though (genuinely curious) how would this shape the function differently? In my understanding, you're splitting off everything before the `I`? Also, what does the `[1]` represent? – Saroekin Jun 19 '15 at 19:34
Index 1 of the list returned by `split`. – Avinash Raj Jun 19 '15 at 19:36

score 2 · Answer 6 · answered Apr 28 '20 at 07:22

I looped through the string and skipped the unwanted characters. Note that this removes every '<' and '>' in the string, not only the ones before the I.

intro = "<>I'm Tom."
intro_list = []

# Keep every character that is not '<' or '>'.
for ch in intro:
    if ch not in "<>":
        intro_list.append(ch)

intro = ''.join(intro_list)
print(intro)

score 2 · Answer 7 · answered Aug 24 '20 at 11:13

import re

date_div = "Blah blah\nblah, Updated: Aug. 23, 2012 Blah blah Updated: Feb. 13, 2019"

up_to_word = ":"
rx_to_first = r'^.*?{}'.format(re.escape(up_to_word))
rx_to_last = r'^.*{}'.format(re.escape(up_to_word))

# (Dot.) In the default mode, this matches any character except a newline. 
# If the DOTALL flag has been specified, this matches any character including a newline.

print("Remove all up to the first occurrence of the word including it:")
print(re.sub(rx_to_first, '', date_div, flags=re.DOTALL).strip())

print("Remove all up to the last occurrence of the word including it:")
print(re.sub(rx_to_last, '', date_div, flags=re.DOTALL).strip())

How would I keep the first and last occurrence in this approach? – joey11235 Dec 21 '22 at 13:39

score 2 · Answer 8 · edited Feb 01 '21 at 15:24

>>> intro = "<>I'm Tom."
>>> # Just split the string at the special symbol
>>> intro.split("<>")
['', "I'm Tom."]

>>> new = intro.split("<>")

>>> new[1]
"I'm Tom."

edited Feb 01 '21 at 15:24 · Heiko Becker

answered Feb 01 '21 at 10:57 · Chethan Raj

score 0 · Answer 9 · answered Jul 11 '21 at 05:06

This solution also works when the character is not in the string, though it needs an explicit if check.

if 'I' in intro:
    # maxsplit=1 keeps any later 'I' in the remainder intact
    print('I' + intro.split('I', 1)[1])
else:
    print(intro)

answered Jul 11 '21 at 05:06 · Cade Harger

score 0 · Answer 10 · answered Apr 10 '22 at 23:26

You can use itertools.dropwhile to drop all the characters that come before the character to stop at. Then, you can use ''.join() to turn the resulting iterable back into a string:

from itertools import dropwhile
intro = "<>I'm Tom."
stop = "I"  # the character (or set of characters) to stop at
print(''.join(dropwhile(lambda x: x not in stop, intro)))

This outputs:

I'm Tom.

score 0 · Answer 11 · answered May 17 '22 at 17:56

Based on @AvinashRaj's answer, you can use re.sub to replace a substring with a string or a character using a regex:

import re  # missing from the original answer

output_str = re.sub(r'^.*?I', 'I', input_str)  # input_str is the string to clean

answered May 17 '22 at 17:56 · quent

score -3 · Answer 12 · edited Jun 05 '16 at 12:00

import re
intro = "<>I'm Tom."
re.sub(r'<>I', 'I', intro)

edited Jun 05 '16 at 12:00 · Tunaki

answered Jun 05 '16 at 05:58 · Satheesh Alathiyur

doesn't remove everything before a designated character (OP), e.g. with `intro = "junk<>I'm Tom."`, yields `"junkI'm Tom."` – PatrickT May 08 '20 at 22:21
