
I am a bit stuck and hope you can help.

I am trying to count the total number of lines across all files in a directory (and all of its subdirectories).

We receive data hourly, and it is partitioned into folders like this:

DATE>HOUR>COMPANY

I want a count for all files within a date, so I need to count the lines in all files within all of that date's subdirectories.

I can do this for a single file with the code below, but I have been unable to make a multi-file version work.

Can anyone advise? :)

count = len(open('Desktop/travel.csv').readlines())

This is what I tried for all files:

In [11]: os.chdir(Desktop)
    ...: names={}
    ...: count= 0
    ...: for fn in glob.glob(‘*.csv’):
    ...:     countfile = len(open(f).readlines(  ))
    ...:      count = count + countfile
  File "<ipython-input-11-2e1a69754276>", line 4
    for fn in glob.glob(‘*.csv’):

But I get

    for fn in glob.glob(‘*.csv’):
                        ^
SyntaxError: invalid syntax
kikee1222
  • It seems to me you are using typographic quotes (`‘` and `’`) instead of actual quotes (double `"` or single `'`) – palvarez Aug 13 '19 at 07:42
  • Thanks for replying! :) I have tried ' and " but both have the same error! – kikee1222 Aug 13 '19 at 07:45
  • I think it is advisable to also close the file, or better, use the `with` statement, see [here](https://stackoverflow.com/questions/845058/how-to-get-line-count-cheaply-in-python) – FObersteiner Aug 13 '19 at 08:01

1 Answer


The first comment was right: there was something strange with the formatting of the quotes. Thanks!!

This works:

In [21]: import os
    ...: import glob
    ...: 
    ...: count = 0
    ...: for file in glob.glob('*.csv'):
    ...:     # the with block closes each file once its lines are counted
    ...:     with open(file) as f:
    ...:         count += len(f.readlines())
    ...: 

In [22]: count
Out[22]: 709343
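
Note that `glob.glob('*.csv')` only looks at the current directory. To also cover the DATE>HOUR>COMPANY subfolders from the question, something along these lines might work. It is only a sketch: the function name `count_lines` and its `pattern` parameter are made up for illustration, and it assumes the files are plain CSVs whose lines end in `\n` (or `\r\n`).

```python
from pathlib import Path


def count_lines(root, pattern="*.csv"):
    """Return the total line count of all files under root that match pattern.

    Hypothetical helper, not part of any library.
    """
    total = 0
    # rglob walks root and every subdirectory below it
    for path in Path(root).rglob(pattern):
        if path.is_file():
            # binary mode avoids decoding errors; lines are split on b"\n".
            # Iterating the handle streams the file instead of loading it all,
            # and the with block closes it afterwards.
            with path.open("rb") as fh:
                total += sum(1 for _ in fh)
    return total
```

Calling it on a single date folder, for example `count_lines("2019-08-13")` (a made-up folder name), would give the total for that date across all of its hour and company subfolders.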
kikee1222