3

I have this command on Linux that I am having trouble converting to use `type` on Windows:

row = run('cat '+'C:/Users/Kyle/Documents/final/VocabCorpus.txt'+" | wc -l").split()[0]

The `wc -l` part counts how many lines the file contains. If I change the command to use `type` instead, what should it be?

I tried this and it doesn't work:

 row = run('type '+'C:/Users/Kyle/Documents/final/VocabCorpus.txt'+" | wc -l").split()[0]

The run command is below:

import subprocess

def run(command):
    # Run the command through the system shell and return its raw output
    output = subprocess.check_output(command, shell=True)
    return output

Please help me. Thank you.

OneCricketeer
windboy
  • What is the run method? And `wc` isn't a native Windows binary – OneCricketeer Feb 13 '16 at 05:30
  • This statement finds the number of rows in the vocabulary file and prints it out. Yes, I was wondering if there is an equivalent for this statement, as it is run on Linux. Thanks – windboy Feb 13 '16 at 05:33
  • If it is run on Linux, why are you giving it a Windows filesystem path? – OneCricketeer Feb 13 '16 at 05:35
  • I'm not sure if I am correct, but I was told the code can be run on Python for Mac, so I tried it on Python for Windows. I changed it a little but it would not run. – windboy Feb 13 '16 at 05:38
  • Obviously, because `wc` isn't a command on Windows. You should really try not to use OS-specific functions if you don't have to and the standard Python library has functions that do the same thing (as I've shown in my answer) – OneCricketeer Feb 13 '16 at 05:41
  • Sorry you are right, there is a run command, it is: – windboy Feb 13 '16 at 05:42
  • def run(command): output = subprocess.check_output(command, shell=True) return output – windboy Feb 13 '16 at 05:43
  • Once again, the default shell on a Windows computer is `cmd`; if you type `wc` in `cmd`, you'll get a command-not-found error. You haven't shown your error, but I can guarantee you that that is the problem – OneCricketeer Feb 13 '16 at 05:44
  • I tried the solution and deleted the line. It gives an error as well and I am not sure what to do now. – windboy Feb 13 '16 at 05:48
  • What error? Can you please update your question with that information? – OneCricketeer Feb 13 '16 at 05:48
  • It's more of a logic error; I have found the problem. Thank you very much for your help :) – windboy Feb 13 '16 at 05:54

2 Answers

5

You're trying to count the number of lines in a file? Why can't you do that in pure Python?

Something like this?

with open('C:/Users/Kyle/Documents/final/VocabCorpus.txt') as f:
    row = len(f.readlines())
OneCricketeer
  • 4
    A slightly better version is ``sum(1 for line in f)``, which consumes less memory: ``f.readlines()`` reads the whole file while iterating over ``f`` yields only one line at a time. – y0prst Feb 13 '16 at 06:23
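The generator approach from this comment could look roughly like the sketch below. The helper name `count_lines` and the throwaway demo file are assumptions made for illustration; they are not part of the original answer.

```python
import os
import tempfile

def count_lines(filepath):
    """Count lines by iterating over the file object, one line at a time.

    Hypothetical helper: only the current line is held in memory,
    unlike f.readlines(), which builds a list of every line.
    """
    with open(filepath) as f:
        return sum(1 for _ in f)

# Demo on a temporary file standing in for VocabCorpus.txt
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
    tmp.write('alpha\nbeta\ngamma\n')

row = count_lines(tmp.name)
os.remove(tmp.name)
print(row)  # 3
```

Because nothing here shells out to `cat`, `type`, or `wc`, the same code should behave the same way on Windows, Linux, and macOS.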
0

Actually, `wc` counts `\n` characters in your file (proof). If you have big files and want to save memory, it is better to read the file in chunks so that memory consumption stays O(1):

CHUNK_SIZE = 4096

def wc_l(filepath):
    nlines = 0
    with open(filepath, 'rb') as f:
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            # Count newline bytes in this chunk (the file is opened in binary mode)
            nlines += chunk.count(b'\n')
    return nlines
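As a small illustration of the point about `\n` characters, the sketch below compares counting newline bytes with counting lines by iteration. The two can differ by one when the last line has no trailing newline. The helper names `count_newlines` and `count_lines` and the sample file contents are assumptions made for this example only.

```python
import os
import tempfile

def count_newlines(filepath, chunk_size=4096):
    """Count b'\\n' bytes in fixed-size chunks, like `wc -l` does."""
    nlines = 0
    with open(filepath, 'rb') as f:
        # iter() with a sentinel stops once read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b''):
            nlines += chunk.count(b'\n')
    return nlines

def count_lines(filepath):
    """Count lines by iteration; a final unterminated line still counts."""
    with open(filepath, 'rb') as f:
        return sum(1 for _ in f)

# Sample file whose last line has no trailing newline
with tempfile.NamedTemporaryFile('wb', delete=False) as tmp:
    tmp.write(b'one\ntwo\nthree')

newline_count = count_newlines(tmp.name)
line_count = count_lines(tmp.name)
os.remove(tmp.name)
print(newline_count, line_count)  # 2 3
```

Which number is the right one depends on whether an unterminated final line should count as a row.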
Community
y0prst