Do comments slow down an interpreted language?

Question

I am asking this because I use Python, but it could apply to other interpreted languages as well (Ruby, PHP, JavaScript).

Am I slowing down the interpreter whenever I leave a comment in my code? According to my limited understanding of an interpreter, it reads program expressions in as strings and then converts those strings into code. It seems that every time it parses a comment, that is wasted time.

Is this the case? Is there some convention for comments in interpreted languages, or is the effect negligible?

This was certainly an issue in BASIC on my old Commodore 64. Languages and hardware both have improved dramatically since then. — Fred Larson, Apr 28 '10 at 16:09
You should be aware that the term 'interpreted' doesn't mean much. Python is bytecode-compiled, and not interpreted directly from source. — Thomas Wouters, Apr 28 '10 at 16:12
It might be interesting to consider JavaScript in regard to this question. I believe JQuery, for instance, has a version that's stripped of comments and extra whitespace to minimize the transfer time. — Fred Larson, Apr 28 '10 at 16:17
Stripping comments and whitespace (and crunching stuff together as much as possible) is pretty common in JavaScript, but not really to speed up parsing or execution; it's all about network transfer time (and bandwidth, for busy sites.) — Thomas Wouters, Apr 28 '10 at 16:38
e.g. The source for http://www.google.com/index.html is practically obfuscated, as Google has crushed every JS variable to 3 letters max and stripped out every bit of whitespace possible. — Nick T, Apr 28 '10 at 17:17

score 107 · Accepted Answer · answered Apr 28 '10 at 15:54

For the case of Python, source files are compiled before being executed (the .pyc files), and the comments are stripped in the process. So comments could slow down the compilation time if you have gazillions of them, but they won't impact the execution time.
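One way to check this yourself is to compile the same statements with and without a comment and compare the resulting code objects. This is only a minimal sketch, assuming CPython; the two source strings are made up for illustration.

```python
# Compile two versions of the same (made-up) snippet and compare their bytecode.
plain = "a = 30\nb = a * 2\n"
commented = "a = 30\n# this comment never reaches the bytecode\nb = a * 2\n"

code_plain = compile(plain, "<plain>", "exec")
code_commented = compile(commented, "<commented>", "exec")

# co_code holds the raw instructions; co_consts holds the constants they use.
same_instructions = code_plain.co_code == code_commented.co_code
same_constants = code_plain.co_consts == code_commented.co_consts
print(same_instructions, same_constants)
```

The line-number table does differ between the two code objects, since the comment shifts the second assignment down a line, but that table is stored separately from the instructions.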

Luper Rouch


46

+1, because I really liked the `gazillion` use in this context – M. Williams Apr 28 '10 at 16:55
3

It's hard to imagine how high the comment:code ratio would have to be before this was detectable. – Mike Graham Apr 28 '10 at 16:56
4

@Mike: possibly 1 gazillion:1 ? – Seth Johnson Apr 28 '10 at 16:58
1

Not quite sure about multiple gazillions, but I think you're thinking in the right way. – M. Williams Apr 28 '10 at 17:21
I'm just noting that even compilation time only happens once and is then cached. – Ian Bicking Apr 30 '10 at 05:28

score 33 · Answer 2 · answered Apr 28 '10 at 16:03

Well, I wrote a short Python program like this:

for i in range (1,1000000):
    a = i*10

The idea is to do a simple calculation loads of times.

Timing that, it took 0.35±0.01 seconds to run.

I then rewrote it with the whole of the King James Bible inserted like this:

for i in range (1,1000000):
    """
The Old Testament of the King James Version of the Bible

The First Book of Moses:  Called Genesis


1:1 In the beginning God created the heaven and the earth.

1:2 And the earth was without form, and void; and darkness was upon
the face of the deep. And the Spirit of God moved upon the face of the
waters.

1:3 And God said, Let there be light: and there was light.

...
...
...
...

Even so, come, Lord Jesus.

22:21 The grace of our Lord Jesus Christ be with you all. Amen.
    """
    a = i*10

This time it took 0.4±0.05 seconds to run.

So the answer is yes. 4MB of comments in a loop make a measurable difference.
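As the comments under this answer point out, a wall-clock measurement of the whole script mixes compile time with run time. A rough sketch of how the two could be timed separately is below. The `measure` helper and the filler comment lines are my own stand-ins (not the Bible text), and the numbers will vary by machine.

```python
import time

def measure(source):
    """Return (compile_seconds, run_seconds, namespace) for a source string."""
    start = time.perf_counter()
    code = compile(source, "<measured>", "exec")  # parsing + bytecode generation
    compiled = time.perf_counter()
    namespace = {}
    exec(code, namespace)                         # execution only
    finished = time.perf_counter()
    return compiled - start, finished - compiled, namespace

loop = "for i in range(1, 1000000):\n    a = i * 10\n"
# Hypothetical stand-in for a huge text: real '#' comment lines inside the loop body.
padding = "    # filler comment line\n" * 100_000
with_comments = "for i in range(1, 1000000):\n" + padding + "    a = i * 10\n"

for label, src in (("plain", loop), ("commented", with_comments)):
    compile_s, run_s, ns = measure(src)
    print(f"{label}: compile {compile_s:.4f}s, run {run_s:.4f}s")
```

If comments only cost anything at compile time, the extra time should show up in the first number and not the second.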

Rich Bradshaw


26

+1 for a scientific experiment and The Holy Bible in the same post. 8vD – Fred Larson Apr 28 '10 at 16:07
58

That's not a comment. It's a string literal. Furthermore, if you look at the actual bytecode for your two blocks of code, you will see *no difference*. The string is parsed once, and not involved in the calculations at all. You should see the same slowdown if you place the string outside of the loop. – Thomas Wouters Apr 28 '10 at 16:09
1

Thomas Wouters has a good point. Now go back and prefix each line of The Bible with '#'. ;v) – Fred Larson Apr 28 '10 at 16:14
1

Come to think of it, it wouldn't be hard to write a Python script to do just that. – Fred Larson Apr 28 '10 at 16:19
I've always used that notation for comments - it's a multiline comment. – Rich Bradshaw Apr 28 '10 at 17:29
Rich: it's not a comment, it's a string literal. The difference is that your multiline string is incorporated into the compiled bytecode, a comment is not. – Ned Batchelder Apr 28 '10 at 17:46
Whoa, you typed in the entire Bible? You must be *really* bored and *really, really* fast. – Mike Graham Apr 28 '10 at 17:52
1

@Ned, Yes and no. Obviously, it isn't a comment. However, this particular string literal will (in my CPython and I would bet money yours) be optimized out of the bytecode. The difference is the actual compile time. – Mike Graham Apr 28 '10 at 17:53
@Mike Graham - Ever heard of Project Gutenberg? http://www.gutenberg.org/wiki/Main_Page – Chris Lutz Apr 28 '10 at 18:06
16

+1 to counter a stupid downvote, and props for actually experimenting, despite the flawed approach. TIAS (Try it and see) often provides better answers than abstract discussion. – 3Dave Apr 28 '10 at 18:21
6

@David, the case this tests is not the one described by OP nor is it representative of anything like any code that people actually write. – Mike Graham Apr 28 '10 at 19:41
Try changing the loop counter. It might be taking ~4 seconds just to parse the string literal. – wisty Apr 29 '10 at 02:11
4

@Rich, can you convert the string to a comment and post the new timing? – smci Aug 12 '11 at 23:58
My guy, that runtime also includes compile time, at least before the .pyc file is generated. In addition, that's a STRING LITERAL, not a COMMENT. – mike Oct 05 '22 at 22:32

score 24 · Answer 3 · answered Apr 28 '10 at 15:49

Comments are usually stripped out in or before the parsing stage, and parsing is very fast, so effectively comments will not slow down the initialization time.
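To see comments being recognised at the tokenizing stage, the standard library's `tokenize` module can be used. Note that this module reports comments as tokens so that tools can inspect them, whereas the compiler itself, as far as I know, simply throws them away. The sample source string is made up.

```python
import io
import tokenize

source = "x = 1  # set x\n# a full-line comment\ny = x + 1\n"

# generate_tokens() wants a readline callable that returns str lines.
tokens = list(tokenize.generate_tokens(io.StringIO(source).readline))

# Comments show up as their own token type, separate from names and operators.
comments = [tok.string for tok in tokens if tok.type == tokenize.COMMENT]
print(comments)
```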

kennytm


11

Comments have to be stripped, so with big enough comments, they will slow down the program. But you got to have enormous comments (MBs? GBs?) before you can even measure it. – Henrik Hansen Apr 28 '10 at 15:52
3

Having megabytes of comments means there are more than megabytes of code. Time for actual parsing and compiling would overwhelm the "little" comment stripping time. – kennytm Apr 28 '10 at 16:05
12

I went ahead and tried this out. On my particular testing system parsing and executing about 10 megs of Python comments (and one assignment statement) takes 349 ms. The ratio of source bytes to time in this case seems to be fairly constant, at about 28,000 bytes per msec. The same script on Codepad is (as I imagined) slower: http://codepad.org/Ckevfqmq – AKX Apr 28 '10 at 16:08
Well, I'm sure one can construct a pathological example to the contrary. Oh look, see the answer by Rich Bradshaw. For all practical purposes, you're entirely right, of course. – janneb Apr 28 '10 at 16:20

score 6 · Answer 4 · answered Apr 28 '10 at 15:50

The effect is negligible for everyday usage. It's easy to test, but consider a simple loop such as:

For N = 1 To 100000: Next

Your computer can process that (count to 100,000) quicker than you can blink. Ignoring a line of text that starts with a certain character will be more than 10,000 times faster.

Don't worry about it.

Jerry Coffin · Answer 5 · 2018-05-10T13:20:00.297

It depends on how the interpreter is implemented. Most reasonably modern interpreters do at least a bit of pre-processing on the source code before any actual execution, and that will include stripping out the comments so they make no difference from that point onward.

At one time, when memory was severely constrained (e.g., 64K total addressable memory, and cassette tapes for storage) you couldn't take things like that for granted. Back in the day of the Apple II, Commodore PET, TRS-80, etc., it was fairly routine for programmers to explicitly remove comments (and even white-space) to improve execution speed. This was also only one of many source code-level hacks routinely employed at the time¹.

Of course, it also helped that those machines had CPUs that could only execute one instruction at a time, had clock speeds around 1 MHz, and had only 8-bit processor registers. Even a machine you'd now find only in a dumpster is so much faster than those were that it's not even funny...

¹ For another example, in Applesoft you could gain or lose a little speed depending on how you numbered lines. If memory serves, the speed gain was when the target of a goto statement was a multiple of 16.

Nick T · Answer 6 · 2010-04-28T18:43:36.723

Did up a script like Rich's with some comments (only about 500 KB of text):

# -*- coding: iso-8859-15 -*-
import timeit

no_comments = """
a = 30
b = 40
for i in range(10):
    c = a**i * b**i
"""
yes_comment = """
a = 30
b = 40

# full HTML from http://en.wikipedia.org/
# wiki/Line_of_succession_to_the_British_throne

for i in range(10):
    c = a**i * b**i
"""
loopcomment = """
a = 30
b = 40

for i in range(10):
    # full HTML from http://en.wikipedia.org/
    # wiki/Line_of_succession_to_the_British_throne

    c = a**i * b**i
"""

t_n = timeit.Timer(stmt=no_comments)
t_y = timeit.Timer(stmt=yes_comment)
t_l = timeit.Timer(stmt=loopcomment)

print("Uncommented block takes %.2f usec/pass" % (
    1e6 * t_n.timeit(number=100000)/1e5))
print("Commented block takes %.2f usec/pass" % (
    1e6 * t_y.timeit(number=100000)/1e5))
print("Commented block (in loop) takes %.2f usec/pass" % (
    1e6 * t_l.timeit(number=100000)/1e5))

C:\Scripts>timecomment.py
Uncommented block takes 15.44 usec/pass
Commented block takes 15.38 usec/pass
Commented block (in loop) takes 15.57 usec/pass

C:\Scripts>timecomment.py
Uncommented block takes 15.10 usec/pass
Commented block takes 14.99 usec/pass
Commented block (in loop) takes 14.95 usec/pass

C:\Scripts>timecomment.py
Uncommented block takes 15.52 usec/pass
Commented block takes 15.42 usec/pass
Commented block (in loop) takes 15.45 usec/pass

Edit as per David's comment:

# -*- coding: iso-8859-15 -*-
import timeit

init = "a = 30\nb = 40\n"
for_ = "for i in range(10):"
loop = "%sc = a**%s * b**%s"
historylesson = """
# <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
# blah blah...
# --></body></html> 
"""
tabhistorylesson = """
    # <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" 
    # blah blah...
    # --></body></html> 
"""

s_looped = init + "\n" + for_ + "\n" + tabhistorylesson + loop % ('   ','i','i')
s_unroll = init + "\n"
for i in range(10):
    s_unroll += historylesson + "\n" + loop % ('',i,i) + "\n"
t_looped = timeit.Timer(stmt=s_looped)
t_unroll = timeit.Timer(stmt=s_unroll)

print("Looped length: %i, unrolled: %i." % (len(s_looped), len(s_unroll)))

print("For block takes %.2f usec/pass" % (
    1e6 * t_looped.timeit(number=100000)/1e5))
print("Unrolled it takes %.2f usec/pass" % (
    1e6 * t_unroll.timeit(number=100000)/1e5))

C:\Scripts>timecomment_unroll.py
Looped length: 623604, unrolled: 5881926.
For block takes 15.12 usec/pass
Unrolled it takes 14.21 usec/pass

C:\Scripts>timecomment_unroll.py
Looped length: 623604, unrolled: 5881926.
For block takes 15.43 usec/pass
Unrolled it takes 14.63 usec/pass

C:\Scripts>timecomment_unroll.py
Looped length: 623604, unrolled: 5881926.
For block takes 15.10 usec/pass
Unrolled it takes 14.22 usec/pass

@Nick, I'd expect any non-naive interpreter to only parse the comments for the first pass through the loop. Have you tried this either with an unrolled loop, or by, say, pasting a couple of hundred lines of comments in the code? — 3Dave, Apr 28 '10 at 18:19

score 3 · Answer 7 · edited Jul 25 '21 at 20:59

My limited understanding of an interpreter is that it reads program expressions in as strings and converts those strings into code.

Most interpreters read the text (code) in the file and produce an abstract syntax tree (AST), since it can be easily read by the next stage of compilation. That structure contains no code in text form and, of course, no comments either. The tree alone is enough to execute programs, but for efficiency interpreters go one step further and produce bytecode. Python does exactly that.

We could say that the code and the comments, in the form you wrote them, are simply not present when the program is running. So no, comments do not slow down programs at run time.

Note: interpreters that keep the code only as text, with no inner structure such as a syntax tree, must do exactly what you mentioned: interpret the code again and again at run time.
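For Python, this can be checked with the standard `ast` module: parsing a commented and an uncommented version of the same code should give the same tree. A small sketch, with made-up source strings:

```python
import ast

plain = "total = 0\nfor n in range(3):\n    total += n\n"
commented = (
    "# running total\n"
    "total = 0\n"
    "for n in range(3):  # loop over 0, 1, 2\n"
    "    total += n\n"
)

# ast.dump() leaves out line and column numbers by default,
# so only the structure of the two trees is compared.
tree_plain = ast.dump(ast.parse(plain))
tree_commented = ast.dump(ast.parse(commented))
print(tree_plain == tree_commented)
```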

score 2 · Answer 8 · answered Apr 28 '10 at 15:54

Having comments will slow down the startup time, as the scripts will get parsed into an executable form. However, in most cases comments don't slow down runtime.

Additionally, in Python, you can compile the .py files into .pyc files, which won't contain the comments (I should hope). This means that you won't get a startup hit either if the script is already compiled.
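Whether the .pyc really drops the comments can be checked directly. The sketch below assumes CPython 3.7 or later, where a .pyc starts with a 16-byte header followed by the marshalled code object. The file names and the `compile_to_pyc_bytes` helper are made up for the example.

```python
import marshal
import pathlib
import py_compile
import tempfile

def compile_to_pyc_bytes(source_text, name, workdir):
    """Write source_text to workdir/name, byte-compile it, and return the raw .pyc bytes."""
    src = pathlib.Path(workdir) / name
    src.write_text(source_text)
    pyc_path = py_compile.compile(str(src), cfile=str(src) + "c", doraise=True)
    return pathlib.Path(pyc_path).read_bytes()

with tempfile.TemporaryDirectory() as tmp:
    plain_bytes = compile_to_pyc_bytes("a = 30\nb = 40\n", "plain.py", tmp)
    commented_bytes = compile_to_pyc_bytes(
        "# lots of commentary\na = 30\n# more commentary\nb = 40\n", "commented.py", tmp
    )

# Skip the 16-byte header (assumption: CPython 3.7+) to get at the code object.
plain_code = marshal.loads(plain_bytes[16:])
commented_code = marshal.loads(commented_bytes[16:])

print(b"lots of commentary" in commented_bytes)          # is the comment text stored?
print(plain_code.co_code == commented_code.co_code)      # are the instructions the same?
```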

MarkR


`s/will slow down the startup time/will slow down the startup time immeasurably`. `s/in most cases comments don't slow down runtime/in all cases comments don't slow down runtime` – Mike Graham Apr 28 '10 at 17:57

score 0 · Answer 9 · answered Sep 23 '14 at 18:11

I wonder if it matters how comments are used. For example, a triple-quoted block is really a string literal (a docstring when it comes first in a module, class, or function), so its content is validated. I ran into a problem a while back where I was importing a library into my Python 3 code and got a syntax error regarding \N. I looked at the line number, and it was content within a triple-quoted comment. I was somewhat surprised. Being new to Python, I never thought a block comment would be checked for syntax errors.

Simply, if you type:

'''
(i.e. \Device\NPF_..)
'''

Python 2 doesn't throw an error, but Python 3 reports: SyntaxError: (unicode error) 'unicodeescape' codec can't decode bytes in position 14-15: malformed \N character escape

So Python 3 is evidently interpreting the triple quote, making sure it's valid syntax.

However, if it is turned into a single-line comment, `# (i.e. \Device\NPF_..)`, no error results.

I wonder whether a performance change would be seen if the triple-quoted comments were replaced with single-line comments.
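The difference between the two forms can be reproduced without touching any files. This is a small sketch for Python 3; the `compiles` helper is made up for the example, and the warning filter is there because `\D` on its own only triggers an invalid-escape warning rather than an error.

```python
import warnings

def compiles(source):
    """Return True if source compiles, False if it raises SyntaxError."""
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("ignore")  # silence the invalid-escape warning for '\D'
            compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

# The same text, once as a triple-quoted string literal and once as a real comment.
as_string = "'''\n(i.e. \\Device\\NPF_..)\n'''\n"
as_comment = "# (i.e. \\Device\\NPF_..)\n"

print(compiles(as_string), compiles(as_comment))
```

The string literal has its escape sequences decoded, so the malformed `\N` is rejected, while the text of the comment is never examined.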

score 0 · Answer 10 · answered Apr 28 '10 at 18:13

As the other answers have already stated, a modern interpreted language like Python first parses and compiles the source into bytecode, and the parser simply ignores the comments. This clearly means that any loss of speed would only occur at startup when the source is actually parsed.

Because the parser ignores comments, the compiling phase is basically unaffected by any comments you put in. But the bytes in the comments themselves are actually being read in, and then skipped over during parsing. This means, if you have a crazy amount of comments (e.g. many hundreds of megabytes), this would slow down the interpreter. But then again this would slow any compiler as well.

I'm not sure I'd call this an "interpreted language" in the strictest sense of the word. Something like dynamically-compiled or JIT seems more appropriate. — 3Dave, Apr 28 '10 at 18:18

Code Pope · Answer 11 · 2019-08-30T13:16:41.963

0

This question is really old, but the accepted answer claims that comments won't impact the execution time, which is wrong. Here is a simple example where you can see and check how much they do influence the execution time.
I have a file called constants.py. It contains all the different chess actions in a list:

LABELS = [ "a1b1",
    "a1c1", 
    "a1d1", 
    "a1e1", 
    "a1f1",....]

The list LABELS contains 2272 elements. In another file I call:

import numpy as np
import constants
np.array(constants.LABELS)

I measured it ten times and the execution of the code takes about 0.597 ms. Now I changed the file and inserted next to each element (2272 times) a comment:

LABELS = [ "a1b1",  # 0 
            "a1c1", # 1
            "a1d1", # 2
            "a1e1", # 3
            "a1f1", # 4
             ...,
            "Q@h8", # 2271
]

Now after measuring the execution time of np.array(constants.LABELS) ten times, I have an average execution time of 4.28 ms, thus, about 7 times slower.
Therefore, yes, it impacts the execution time if you have lots of comments.
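For anyone who wants to repeat this kind of measurement, here is a sketch that keeps the building of the list out of the timed region by putting it in `timeit`'s `setup`. The 64 labels are a hypothetical stand-in for the real 2272 entries, and `tuple(LABELS)` stands in for `np.array(constants.LABELS)` so that the example needs only the standard library.

```python
import timeit

# Hypothetical stand-in labels; the real constants.py has 2272 entries.
squares = [f"{col}{row}" for col in "abcdefgh" for row in range(1, 9)]

plain = "LABELS = [\n" + "".join(f'    "{s}",\n' for s in squares) + "]\n"
commented = "LABELS = [\n" + "".join(
    f'    "{s}",  # {i}\n' for i, s in enumerate(squares)
) + "]\n"

def time_statement(module_source, number=10_000):
    """Time only the statement; the source is compiled and run once, untimed, as setup."""
    return timeit.timeit("tuple(LABELS)", setup=module_source, number=number)

print(f"plain:     {time_statement(plain):.4f}s")
print(f"commented: {time_statement(commented):.4f}s")
```

Measured this way, any cost of parsing the comments falls outside the timed statement, so a remaining difference would have to come from somewhere else.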

edited Aug 30 '19 at 13:16

answered Jul 19 '19 at 15:56

Code Pope


What does "testing np.array(constants.LABELS)" actually mean? Do you see a difference in compiled .pyc files? – Luper Rouch Aug 29 '19 at 08:09
@LuperRouch with "testing np.array(constants.LABELS)" I mean to run the statement `np.array(constants.LABELS)` ten times and measuring the average execution time of the statement. I will clarify that in the text. – Code Pope Aug 30 '19 at 13:15
How do you run this statement? Maybe you could push your test setup to github so we can see how exactly you run your test, as the difference you see is probably due to the fact that you don't reuse compiled .pyc files (as I said, comments do impact compilation time, but they should not impact execution time). – Luper Rouch Sep 06 '19 at 17:07
