I have 1,000-odd files that I am trying to concatenate into one large file. They are all .txt files and all contain the same data format.
I can read each line in each file and write it to an output file (I am running this in PyCharm). Counting the lines as I write them gives 393,225, which seems roughly right (I haven't counted manually). However, when I count the lines in the output file afterwards, there are only 401, and I don't know why it hasn't written them all.
Here is the current code I am using in Python:
import datetime
from pathlib import Path

start_date = datetime.date(year=2019, month=3, day=22)
end_date = datetime.date(year=2015, month=12, day=1)

list = []
if start_date == end_date:
    for n in range((end_date - start_date).days + 1):
        list.append(start_date + datetime.timedelta(n))
else:
    for n in range((start_date - end_date).days + 1):
        list.append(start_date - datetime.timedelta(n))

countFile = 0
countLines = 0
for d in reversed(list):
    date = str(d)
    path = '/Users/stephankokkas/notuploading/TESTFILES/PRICEDATA/' + date + '/Race.txt'
    raceFile = Path(path)
    if raceFile.is_file():
        with open('/Users/stephankokkas/notuploading/TESTFILES/finalRaceFile/FinalRace.txt', 'w') as outfile:
            with open(path) as infile:
                for line in infile:
                    outfile.write(line)
                    countLines = countLines + 1
                    print(line)
    else:
        print("file NOT FOUND")
print(countLines)

countLines = 0
with open('/Users/stephankokkas/notuploading/TESTFILES/finalRaceFile/FinalRace.txt', 'r') as infile:
    for line in infile:
        countLines = countLines + 1
print(countLines)
And here is the output:
393225
401
I am not sure why the two numbers differ; I expected them to be the same.
When I open the output file, the data only covers a single date, 2019-03-22. It seems to contain only the last file the loop finds.
Most likely something obvious, but some help would be good. Thanks
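
For reference, a minimal sketch of one likely cause, assuming the problem is the open mode: 'w' truncates the file every time it is opened, and the code above reopens FinalRace.txt inside the loop, so each input file would replace whatever the previous one wrote. The helper name concatenate_files and its parameters are hypothetical and only illustrate opening the output file once, before the loop:

```python
from pathlib import Path


def concatenate_files(input_paths, output_path):
    """Copy every line of each existing input file into output_path.

    Hypothetical helper for illustration; returns the number of lines written.
    """
    lines_written = 0
    # Open the output once, outside the loop. Mode 'w' truncates the file
    # on every open, so opening it per input file would keep only the
    # contents of the last input file.
    with open(output_path, 'w') as outfile:
        for path in input_paths:
            if not Path(path).is_file():
                print("file NOT FOUND:", path)
                continue
            with open(path) as infile:
                for line in infile:
                    outfile.write(line)
                    lines_written += 1
    return lines_written
```

With the date loop above, input_paths would be the list of per-date Race.txt paths in the order they should appear. Keeping the original structure and opening the output with mode 'a' (append) should also work, but the output file would then need to be deleted or emptied before each run, otherwise repeated runs keep adding to it.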