When I run wc -l on a file in Linux (a CSV file with a couple million rows), it reports a line count that is more than a thousand lower than what this Python code reports (it simply iterates over the lines in the file). What would be the reason for that?
with open(csv) as csv_lines:
    num_lines = 0
    for line in csv_lines:
        num_lines += 1
print(num_lines)
I've had cases where wc reports one less than the above, which makes sense when the file has no terminating newline character: wc seems to count only complete lines (those ending in a newline), while this code counts every line, including a final unterminated one. But what would explain a difference of over a thousand lines?
I don't know much about line endings and the like, so maybe I've misunderstood how wc and this Python code count lines; perhaps someone could clarify. The question "linux lines counting not working with python code" says that wc works by counting the number of \n characters in the file. But then what is this Python code doing exactly?
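To make the "counting \n characters" description concrete, here is a minimal sketch of that behaviour in Python. It is only an approximation of what wc -l reports, not wc's actual implementation; the function name count_newline_bytes and the chunk size are assumptions made for illustration.

```python
def count_newline_bytes(path, chunk_size=1 << 20):
    """Count b"\\n" bytes in a file, reading it in binary chunks.

    Hypothetical helper: it mimics the "count newline characters"
    description of wc -l and does not interpret \\r in any way.
    """
    count = 0
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            count += chunk.count(b"\n")
    return count
```

Because the file is opened in binary mode, no newline translation happens, so a lone \r is never counted and a final line without a trailing \n is not counted either.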
Is there a way to reconcile the two numbers and figure out exactly what is causing the difference? For example, a way to count lines from Python in the same way that wc does.
The file was possibly generated on a platform other than Linux; I'm not sure whether that is related.
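One possible way to reconcile the two numbers is to tally which line terminators the file actually contains. In its default text mode, Python's open() treats \n, \r\n and a lone \r each as the end of a line, whereas wc -l only counts \n, so lone \r characters (for example inside CSV fields) could explain a large gap. The sketch below assumes the file uses an ASCII-compatible encoding; latin-1 is chosen only so that every byte decodes, and tally_line_endings is a hypothetical helper name.

```python
from collections import Counter


def tally_line_endings(path):
    """Tally lines by terminator type: CRLF, lone CR, lone LF, or none.

    newline="" keeps universal-newline splitting (the same line
    boundaries as the default text mode) but returns the original
    terminators untranslated, so they can be inspected.
    Assumption: an ASCII-compatible encoding; latin-1 never fails to decode.
    """
    tally = Counter()
    with open(path, newline="", encoding="latin-1") as f:
        for line in f:
            if line.endswith("\r\n"):
                tally["CRLF"] += 1
            elif line.endswith("\r"):
                tally["CR"] += 1
            elif line.endswith("\n"):
                tally["LF"] += 1
            else:
                # Only the last line can lack a terminator.
                tally["no terminator"] += 1
    return tally
```

With this tally, the sum of all four counts should match what the iteration loop prints, while CRLF + LF should match wc -l. A large "CR" count would point to lone carriage returns as the source of the difference.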