20

Is there a quick and dirty way, in either Python or a Bash script, to recursively descend a directory and count the total number of lines of code? We would also like to be able to exclude certain directories.

For example:

start at: /apps/projects/reallycoolapp
exclude: lib/, frameworks/

The excluded directories should be recursive as well. For example:

/app/projects/reallycool/lib SHOULD BE EXCLUDED
/app/projects/reallycool/modules/apple/frameworks SHOULD ALSO BE EXCLUDED

This would be a really useful utility.

Justin

3 Answers

41

Found an awesome utility, cloc: https://github.com/AlDanial/cloc

Here is the command we ran:

perl cloc.pl /apps/projects/reallycoolapp --exclude-dir=lib,frameworks

And here is the output:

--------------------------------------------------------------------------------
Language                      files          blank        comment           code   
--------------------------------------------------------------------------------
PHP                              32            962           1352           2609
Javascript                        5            176            225            920
Bourne Again Shell                4             45             70            182
Bourne Shell                     12             52            113            178
HTML                              1              0              0             25
--------------------------------------------------------------------------------
SUM:                             54           1235           1760           3914
--------------------------------------------------------------------------------
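To give a rough idea of what the blank, comment, and code columns separate, here is a minimal Python sketch. It is not how cloc works internally: `classify_lines` and `comment_prefix` are names made up for illustration, and it only recognizes full-line comments with a single prefix (no block comments, no comment markers inside strings).

```python
def classify_lines(text, comment_prefix="#"):
    """Return (blank, comment, code) counts for a piece of source text.

    Hypothetical helper: only full-line comments starting with
    comment_prefix are treated as comments.
    """
    blank = comment = code = 0
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            blank += 1        # empty or whitespace-only line
        elif stripped.startswith(comment_prefix):
            comment += 1      # the whole line is a comment
        else:
            code += 1         # anything else, including code with a trailing comment
    return blank, comment, code
```

A line such as `y = 2  # note` is counted as code here, since it contains a statement.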
Ivan Perevezentsev
Justin
15

The find and wc commands alone can solve your problem.

With find you can specify very complex logic like this:

find /apps/projects/reallycoolapp -type f -iname '*.py' ! -path '*/lib/*' ! -path '*/frameworks/*' | xargs wc -l

Here the ! inverts the test that follows it, so this command counts the lines of every Python file that is not under a lib/ or frameworks/ directory, at any depth.

Just don't forget the * wildcards around the directory names, or the -path tests will not match anything.
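Since the question also mentions Python, the same idea can be sketched with `os.walk`. This is an illustration under stated assumptions, not a drop-in replacement: `count_lines` and its parameters are hypothetical names, directories are excluded by name at any depth, and a final line without a trailing newline is counted (which `wc -l` would not do).

```python
import os

def count_lines(root, exclude_dirs=("lib", "frameworks"), suffix=".py"):
    """Total line count of files under root whose name ends with suffix.

    Any directory whose name is in exclude_dirs is skipped, at any depth.
    suffix is assumed to be lowercase.
    """
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into excluded directories.
        dirnames[:] = [d for d in dirnames if d not in exclude_dirs]
        for name in filenames:
            if name.lower().endswith(suffix):  # case-insensitive, like -iname
                with open(os.path.join(dirpath, name), "rb") as fh:
                    total += sum(1 for _ in fh)
    return total
```

For the example in the question, this would be called as `count_lines("/apps/projects/reallycoolapp")`.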

Lynch
  • Excellent! I just changed the command a little: find /apps/projects/reallycoolapp -type f -iname '*.py' | xargs wc -l. And now I have a great counter! – Kamilla Holanda Oct 31 '13 at 19:21
  • This includes blank lines and comments, which is not a standard way of measuring source lines of code (SLOC) – Jivan Mar 09 '21 at 19:07
4
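# wc prints a "total" line when given several files; after cut, only that
# line has two fields, so NF == 1 keeps it from being counted twice.
find ./app/projects/reallycool -type f | \
     grep -v -e /app/projects/reallycool/lib \
             -e /app/projects/reallycool/modules/apple/frameworks | \
     xargs wc -l | \
     cut -d '.' -f 1 | \
     awk 'NF == 1 {total += $1} END {print total + 0}'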

A few notes...

  1. The . at the start of the find path is important, since that is how the cut command separates the count from the file name.
  2. This is a multiline command, so make sure there are no spaces after the trailing backslashes.
  3. You might need to exclude other files, such as .svn metadata. This will also give odd values for binary files, so you might want to use grep to whitelist the specific file types you are interested in, e.g.: grep -e '\.html$' -e '\.css$'
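The last note (whitelisting file types and avoiding binary files) could look roughly like this in Python. It is only a sketch: `looks_binary` and `total_lines` are hypothetical helpers, and "contains a NUL byte in the first few KB" is a common heuristic for binary detection, not a guarantee.

```python
def looks_binary(path, probe_size=8192):
    """Heuristic: treat a file as binary if its first bytes contain a NUL."""
    with open(path, "rb") as fh:
        return b"\0" in fh.read(probe_size)

def total_lines(paths, extensions=(".html", ".css")):
    """Sum newline counts over whitelisted, non-binary files in paths."""
    total = 0
    for path in paths:
        # Keep only whitelisted extensions, then drop anything that looks binary.
        if not path.endswith(extensions) or looks_binary(path):
            continue
        with open(path, "rb") as fh:
            # Count newline bytes chunk by chunk, the same quantity wc -l reports.
            total += sum(chunk.count(b"\n")
                         for chunk in iter(lambda: fh.read(65536), b""))
    return total
```

The list of paths can come from anywhere, for example the output of find split into lines.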
mlathe