Python: removing everything including and after a certain character on a line

Question

I have some text like so:

1.6 # blah blah blah
# fjsadfklj slkjf yes 3.4
1.8*
1.9 1.10 #blah
#blah
1.11

I want to clean it up by removing all # characters plus anything following them on the same line. In other words, I desire:

1.6
1.8*
1.9 1.10
1.11

What is the best way to approach this? Via simple methods like partition, or maybe regexes?

Possible duplicate of http://stackoverflow.com/questions/1706198/python-how-to-ignore-comment-lines-when-reading-in-a-file Note that the best answer is not the top-rated, probably look at http://stackoverflow.com/a/27178714/2284490 for the most robust answer — Cireo, May 02 '17 at 02:31

score 3 · Accepted Answer · answered Jul 31 '15 at 16:52

You can try this:

import re
re.sub(r'\s*#.*', '', s)

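\s* also matches any whitespace immediately before the #, both horizontal (spaces, tabs) and vertical (newline, carriage return). When s is the whole multi-line text, a line that contains only a comment is therefore removed together with the newline before it.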
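To make that concrete, here is a small sketch that applies the pattern to the sample text from the question. The variable names text and cleaned are just for illustration.

```python
import re

text = """1.6 # blah blah blah
# fjsadfklj slkjf yes 3.4
1.8*
1.9 1.10 #blah
#blah
1.11"""

# \s* swallows the whitespace (including a newline) in front of each '#',
# and .* swallows the rest of that line, since '.' does not match '\n'.
cleaned = re.sub(r'\s*#.*', '', text)
print(cleaned)
```

One edge case to be aware of: if the very first line is a comment, there is no preceding newline to swallow, so the result starts with an empty line. Calling .lstrip() on the result is one way to handle that, if it matters for your data.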

Avinash Raj

score 2 · Answer 2 · answered Jul 31 '15 at 17:11

This should do what you want:

example = '''1.6 # blah blah blah
# fjsadfklj slkjf yes 3.4
1.8*
1.9 1.10 #blah
#blah
1.11'''

for line in example.splitlines():
    print(line.split('#', 1)[0])
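
As written, the loop prints "1.6 " with a trailing space and an empty line for every comment-only line. If you want exactly the output shown in the question, one possible variation is to strip the trailing whitespace and skip lines that end up empty. The helper name strip_comments and its marker parameter are made up for this sketch, and note that it also drops lines that were blank to begin with.

```python
def strip_comments(text, marker="#"):
    """Return text without comments, trailing whitespace, or empty lines."""
    kept = []
    for line in text.splitlines():
        # Keep only what precedes the first marker, minus trailing whitespace.
        code = line.split(marker, 1)[0].rstrip()
        if code:  # skip comment-only (and blank) lines
            kept.append(code)
    return "\n".join(kept)


example = '''1.6 # blah blah blah
# fjsadfklj slkjf yes 3.4
1.8*
1.9 1.10 #blah
#blah
1.11'''

print(strip_comments(example))
```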

If you also want the comment text, the code is easy to modify to capture it as well.
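
For instance, str.partition (which the question mentions) returns the part before the separator, the separator itself, and the part after it, so it can hand back both halves in one call. This is only a sketch: the list name pairs and the choice of None for "no comment on this line" are assumptions made for illustration.

```python
example = '''1.6 # blah blah blah
# fjsadfklj slkjf yes 3.4
1.8*
1.9 1.10 #blah
#blah
1.11'''

pairs = []
for line in example.splitlines():
    # sep is '#' if the line had a comment, '' otherwise.
    code, sep, comment = line.partition('#')
    pairs.append((code.rstrip(), comment.strip() if sep else None))

for code, comment in pairs:
    print(repr(code), repr(comment))
```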

Noctis Skytower

This is the superior method because it is simple and explicit. – Josh J Jul 31 '15 at 17:51
A naive `timeit` shows split is also ~4x as fast. `python -m timeit 'strs = ("x"*(100 - i%101) + "#" + "y"*100 for i in xrange(10000)); import re' 'for s in strs: re.sub(r"\s*#.*", "", s)'` vs `s.split("#", 1)[0]`. 31.5 msec vs 7.02 msec on my machine – Cireo May 02 '17 at 02:28
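
If you want to repeat that comparison on Python 3, something like the following might do. It is a rough sketch, not a careful benchmark: xrange becomes range, and the test strings are stored in a list rather than a generator, because a generator built in the setup would be used up after the first timed pass. The function names and the number of repetitions are arbitrary choices, and the timings will differ from machine to machine.

```python
import re
import timeit

# Same test strings as in the comment above, but materialised as a list
# so that every timed pass sees all 10000 of them.
strs = ["x" * (100 - i % 101) + "#" + "y" * 100 for i in range(10000)]


def with_regex():
    return [re.sub(r"\s*#.*", "", s) for s in strs]


def with_split():
    return [s.split("#", 1)[0] for s in strs]


number = 20  # passes over the list per measurement
print("regex:", timeit.timeit(with_regex, number=number))
print("split:", timeit.timeit(with_split, number=number))
```

Both functions return the same results for these strings because none of them has whitespace in front of the #. On real input the regex also removes that whitespace, while split leaves it in place.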
