-4

Thank you for all your help on Friday. Sorry if my questions are simple: I am a beginner in Python, so I sometimes run into problems that are easy for more experienced people, but I am trying to improve.

Let me clarify my previous question in more detail. I have some text files and, with your guidance, I am able to count the lines in them. I would like to create a new text file as output in which each line contains the name of an input file followed by a space and its number of lines, and the last line contains the sum of the line counts. For instance, given the files points1.txt, points2.txt and points3.txt, the output file would be:

point1 144798
point2 100000
point3 258627
sum 503425

The code I have is:

import os
folder = 'E:/MLS_HFT/TEST/Stuttgart_2009_pointclouds/'

def total_lines():

    count_line = 0

    for filename in os.listdir(folder):
        infilename = os.path.join(folder,filename)
        if not os.path.isfile(infilename): continue
        infile= open(infilename, 'r')

        for lines in infile:
            i+=1

        outfile = ["%s " %i]
        return i
        outfile = ["%s " %i]
        outfile.write("\n".join(output))
        outfile.close()
        return outfile

        total_lines (infile,i)
        count_line = count_line + i

        output = ["%s  %s" %(item.strip() ,count_line) for item in outfile]
    outfile.write("\n".join(output))

I would be grateful for your guidance.

mari mm

4 Answers

2

If you only want the total number of lines across all the files:

>>> import fileinput
>>> i = 0 # default value 
>>> for line in fileinput.input(files=('test.txt', 'test2.txt')):
        i += 1


>>> i
20

This can be simplified to:

sum(1 for line in fileinput.input(files=('test.txt', 'test2.txt')))

If you need the count for each file separately, and want to add the counts up as well, use this in a function:

with open('test.txt') as f:
    sum(1 for line in f)
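
One possible way to wrap that per-file count in a function and write the output file described in the question is sketched below. The names `count_lines` and `write_line_counts` are made up for illustration, and the sketch assumes that every regular file in the folder should be counted and that the file extension should be dropped from the name.

```python
import os


def count_lines(path):
    """Count the lines of one file without loading it all into memory."""
    with open(path) as f:
        return sum(1 for line in f)


def write_line_counts(folder, out_path):
    """Write '<name> <count>' per file, then a final 'sum <total>' line."""
    total = 0
    rows = []
    for filename in sorted(os.listdir(folder)):
        path = os.path.join(folder, filename)
        if not os.path.isfile(path):
            continue  # skip sub-directories
        n = count_lines(path)
        name = os.path.splitext(filename)[0]  # 'points1.txt' -> 'points1'
        rows.append("%s %d" % (name, n))
        total += n
    rows.append("sum %d" % total)
    with open(out_path, 'w') as out:
        out.write("\n".join(rows) + "\n")
    return total
```

The output path should point outside the input folder; otherwise a second run would count the output file as well.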
jamylak
1

To get the number of lines in a file, open it and read the lines like this:

fid = open('your input filename', 'r')
lines = fid.readlines()
nLines = len(lines)

You could then put the above in a loop that opens each of your files and sums all the nLines values to calculate a total.

EDIT:

Loop over the files like this:

infiles = ['file1.txt', 'file2.txt', 'file3.txt', ... , 'fileN.txt' ]
totalLines = 0

# Loop over array of files
for filename in infiles:
    # Open file:
    fid = open(filename, 'r')

    # Read lines and get length of returned array (array of lines):
    lines = fid.readlines()
    nLines = len(lines)
    totalLines += nLines   # Sum with total lines

    # Close the file
    fid.close()

# Show total:
print("Total lines from all files: " + str(totalLines))
ccbunney
0
def Total_Lines(File_Name):
    Open_File = open(File_Name, 'r').read()
    Last_Line = Open_File.count('\n')
    print(Last_Line)

The function above counts the number of lines in a single file.

For multiple files:

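import glob
import os

def Total_Lines(path):
    Count_Line = 0
    # os.path.join builds the pattern portably instead of hard-coding '\*'
    a = glob.glob(os.path.join(path, '*'))
    for i in a:
        if not os.path.isfile(i):
            continue  # glob also matches sub-directories
        print(i)
        Open_File = open(i, 'r').read()
        Last_Line = Open_File.count('\n')  # newlines in this file
        print(Last_Line)
        Count_Line = Count_Line + Last_Line
    print(Count_Line)

Total_Lines(r"F:\stack")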
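
Note that `read()` pulls the whole file into memory, and counting `'\n'` misses a final line that has no trailing newline. A possible variant that scans the file in fixed-size binary chunks is sketched here; the function name `count_lines_chunked` and the 1 MiB default chunk size are arbitrary choices for illustration.

```python
def count_lines_chunked(path, chunk_size=1024 * 1024):
    """Count lines by scanning the file in fixed-size binary chunks."""
    newlines = 0
    last_byte = b''
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break  # end of file
            newlines += chunk.count(b'\n')
            last_byte = chunk[-1:]  # slicing keeps this a bytes object
    # A non-empty file whose last line lacks '\n' still has that line.
    if last_byte and last_byte != b'\n':
        newlines += 1
    return newlines
```

Only one chunk is held in memory at a time, so the size of the file should not matter much.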
user19911303
0

Assuming your files are structured as follows, with the header 'points' in each file:

file1.txt     file2.txt    file3.txt

points        points       points
23            43           12
34            32           45
45            21           99
56                         100
                           123

The code could be as follows:

import pandas as pd # pandas is powerful
import glob # finds all the pathnames matching a specified pattern
import os
file_count_and_sum = []
os.chdir("E:/Pointfiles") # the folder where your text files are
for file in glob.glob("*.txt"):
    data = pd.read_csv(file)
    file_count_and_sum.append([file, len(data['points'].values), sum(data['points'].values)])
print(file_count_and_sum)

The output would show the filename, count and sum for each file, like this:

[['file1.txt', 4, 158], ['file2.txt', 3, 96], ['file3.txt', 5, 379]]

EDIT: In that case will this work?

count = 0
for file in glob.glob("*.txt"):
    data = pd.read_csv(file)
    count = count + len(data['points'].values)
print(count)
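
If a file is too large for `pd.read_csv` to load in one go, pandas can read it in pieces through the `chunksize` parameter, and the loop can be wrapped in a function for reuse. The following is only a sketch: the function name `count_rows`, the default pattern and the chunk size are assumptions, and it counts data rows (the header line is excluded, as in the loop above).

```python
import glob

import pandas as pd


def count_rows(pattern="*.txt", chunksize=100000):
    """Sum the number of data rows over all files matching `pattern`."""
    total = 0
    for path in glob.glob(pattern):
        # With chunksize set, read_csv yields DataFrames of at most
        # `chunksize` rows instead of loading the whole file at once.
        for chunk in pd.read_csv(path, chunksize=chunksize):
            total += len(chunk)
    return total
```

A call such as `count_rows("E:/Pointfiles/*.txt")` would then return the total for that folder without needing `os.chdir`.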
richie
  • Thank you very much for your help. I think the code above needs a small change. In your example each line holds a different value, whereas in my files the number of lines is the number of points, so once I have the line count of each file I need to sum the counts across the files. – mari mm Apr 19 '13 at 11:02
  • Check the edit. Does that work for you? – richie Apr 19 '13 at 11:12
  • Sorry, just to be on the safe side: should I put the above code in a function? Or how can I make it general? – mari mm Apr 19 '13 at 11:43
  • Why would you need a function? What is your primary objective? – richie Apr 19 '13 at 11:45
  • Maybe I should put it this way: I would like to save this code and apply it in different cases, so shouldn't it be a function? – mari mm Apr 19 '13 at 12:03
  • 2
    Check @user20044033's answer. That should help. – richie Apr 19 '13 at 12:10
  • I have tried your code and it works for most files. There is just one file that is too big (915.766 KB) and it gives a memory error. What can I do about it? – mari mm Apr 19 '13 at 14:06