What exactly is the token count in functions/methods used for?

Question

I've been using some tools to measure code quality and CCN (Cyclomatic Complexity Number), and some of those tools provide a token count for each function. What does that count say about my function or method? What is it used for?

I'm using `Lizard` and `OCLint` but also tried `Clang analyzer` — Black Sheep, Dec 25 '15 at 02:16

score 3 · Answer 1 · edited May 23 '17 at 11:51

Cyclomatic Complexity Number (CCN) is a metric that indicates the complexity of a function, procedure, or program. The best (sufficiently detailed and intuitive) explanation I have found is provided here.

I think that tokens refer to the conditional-statement tokens that are actually taken into account when computing the cyclomatic complexity.

[later edit]

A high CCN means complex code that:

is (much) harder to read and understand
is harder to maintain
makes unit tests harder to maintain, since decent code coverage is more difficult to reach
might lead to more bugs

CCN can be reduced using various techniques. Some examples can be seen here or here.
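
As one illustration of such a technique (a minimal sketch, not taken from the linked examples; the function names and rates are hypothetical), a chain of branches can be replaced by a table lookup. Under the usual definition each `if`/`elif` adds one decision point, so the first version below has a CCN of 4, while the second has no branches of its own.

```python
# Hypothetical shipping rates, used only for illustration.
def shipping_cost_branchy(region):
    # Three decision points (if / elif / elif) on top of the base path.
    if region == "EU":
        return 5
    elif region == "US":
        return 8
    elif region == "ASIA":
        return 12
    return 20


RATES = {"EU": 5, "US": 8, "ASIA": 12}


def shipping_cost_table(region):
    # Same behavior, but the branching is moved into data.
    return RATES.get(region, 20)
```

Both functions return the same values; only the number of independent paths through the code changes.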

answered Dec 22 '15 at 21:31

Alexei - check Codidact

Thanks, I already know the meaning of CCN. What I actually want to know is about the token count: how is it used to say something about the code, or how is it used in code metrics? – Black Sheep Dec 25 '15 at 02:18

score 1 · Answer 2 · answered Feb 09 '21 at 15:37

The OP has not declared which tool they're using, but for Lizard this has been asked as an issue on the project, so it might help someone:

Token is the word and operators, etc.

For example: `if (abc % 3 != 0)` has ['if', '(', 'abc', '%', '3', '!=', '0', ')'], which is 8 tokens.
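
To make the counting concrete, here is a minimal sketch of a token counter for C-like code. The regular expression is a simplified assumption written for this example; it is not Lizard's actual tokenizer, which handles many more cases (comments, character literals, preprocessor lines, and so on).

```python
import re

# Hypothetical, simplified token pattern for C-like code. Real tools such as
# Lizard use their own, more complete tokenizers.
TOKEN_PATTERN = re.compile(
    r'''
    "(?:\\.|[^"\\])*"        # a double-quoted string literal is one token
    | [A-Za-z_]\w*           # identifiers and keywords
    | \d+(?:\.\d+)?          # integer or decimal literals
    | ==|!=|<=|>=|&&|\|\|    # a few two-character operators
    | [^\s\w]                # any other single punctuation character
    ''',
    re.VERBOSE,
)


def tokenize_snippet(code):
    """Split a snippet into tokens, skipping whitespace."""
    return TOKEN_PATTERN.findall(code)


def count_tokens(code):
    """Return the number of tokens found in the snippet."""
    return len(tokenize_snippet(code))


tokens = tokenize_snippet("if (abc % 3 != 0)")
print(tokens, count_tokens("if (abc % 3 != 0)"))
```

For the snippet from the quote, this sketch splits the line into the same 8 tokens listed above.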

Also another source that has similar description:

One program can have a maximum of 8192 tokens. Each token is a word (e.g. variable name) or operator. Pairs of brackets, and strings count as 1 token. commas, periods, LOCALs, semi-colons, ENDs, and comments are not counted.

Now the next question is: does the number of tokens matter the way CCN does? With the disclaimer that I am not an expert in code quality, it depends on the language. For example, in compiled languages you might want to break a complex line into multiple lines, which increases the number of tokens but significantly improves the readability of the code. You should go for it; modern compilers are smart enough to optimize the result.

However, this might be less true in interpreted languages. Again, you should look into the specific language you are using to find out whether any optimization happens behind the scenes. That being said, some languages, such as Python, provide syntax that reduces the number of tokens. This is great as long as it is part of the language's design.

TL;DR: This factor doesn't matter as much as code readability. Double-check your code if the token count is high, but don't mangle the code just to lower it.
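
The trade-off between token count and readability can be sketched in Python using the standard library's `tokenize` module. The snippets being compared and the helper name `count_python_tokens` are made up for this illustration, and the choice of which token types to ignore is an assumption; a metrics tool may count differently.

```python
import io
import tokenize

# Token types that carry layout or bookkeeping rather than code content.
# Ignoring them is a choice made for this sketch.
IGNORED = {
    tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
    tokenize.COMMENT, tokenize.ENCODING, tokenize.ENDMARKER,
}


def count_python_tokens(source):
    """Count the significant tokens in a snippet of Python source."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return sum(1 for tok in tokens if tok.type not in IGNORED)


# One dense line using a generator expression.
compact = "total = sum(x * x for x in values if x % 2 == 0)\n"

# The same computation spread over three named steps.
spread_out = (
    "evens = [x for x in values if x % 2 == 0]\n"
    "squares = [x * x for x in evens]\n"
    "total = sum(squares)\n"
)

print(count_python_tokens(compact), count_python_tokens(spread_out))
```

The spread-out version has more tokens than the one-liner, yet some readers will find it easier to follow, which is why the token count on its own is a weak signal.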

score 0 · Answer 3 · answered Dec 29 '18 at 10:35

In the context of CCN tools, a token is any distinct operator or operand. How this is implemented depends on the tool. Since the page on Lizard doesn't go into details, you will have to examine the source code (it's not many lines):

https://github.com/terryyin/lizard/tree/master/lizard_languages

If you search the source for 'token', you will see how the tool is parsing the code. In most cases it is looking for code blocks, expressions, annotations and accessing of methods/objects.

For example, according to java.py, Java is only parsed for '{', '@', and '.'. Not sure why it isn't looking for expressions...?
