I have a folder named myclientcard that contains 69 subfolders. Each of those subfolders has subfolders of its own, one of which is an error folder holding a number of .txt files. I want to collect the contents of the text files in the error folders of all 69 subfolders, restricted to the date range 17/01/2019 to 24/01/2019, and convert the result into an Excel file.

import os
import numpy as np
from os import listdir
from os.path import join
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
# Raw string so the backslash is not treated as an escape character
mypath = r"D:\myclientcard"
# Top-level entries that are not .txt files (the 69 subfolders)
folders = [join(mypath,f) for f in listdir(mypath) if '.txt' not in f]
for folder in folders:
    # Subfolders whose name contains ERROR
    error_dirs = [join(folder,f) for f in listdir(folder) if 'ERROR' in f]
    #print(error_dirs)
    for error_dir in error_dirs:
        textfiles = [join(error_dir,f) for f in listdir(error_dir) if '.txt' in f]
        for text_file in textfiles:
            # "with" closes both files, so result.txt is fully written before it is read
            with open(text_file,'r') as reading_file, open('result.txt','a') as writing_file:
                writing_file.write(reading_file.read())
# Build the Excel file once, after every text file has been appended
# (sep assumes whitespace-delimited text, as the original delim_whitespace argument suggests)
read_files = pd.read_csv('result.txt',sep=r'\s+')
with ExcelWriter('output.xlsx') as writer:
    read_files.to_excel(writer,sheet_name='Sheet1',index=False)
Rahul Chawla
Shreya H
  • Yes, sure, I'll send the code, please wait. – Shreya H Jan 31 '19 at 11:40
  • Check [this](https://stackoverflow.com/questions/19932130/iterate-through-folders-then-subfolders-and-print-filenames-with-path-to-text-f). Your requirement is mostly answered in the link provided. – vmaroli Jan 31 '19 at 11:45
  • @vmaroli Sorry, that does not meet my constraint. – Shreya H Jan 31 '19 at 11:50
  • @Venkatesh Garnepudi Can you help me add a line of code so that the files are extracted based on the date range? – Shreya H Jan 31 '19 at 11:59
  • If `filename` and `textfiles` are in order, everything can be done. How can they be ordered? If there is a timestamp in the file name, it can be done. Check this: https://stackoverflow.com/a/36318986/6113743 – Venkatesh Garnepudi Jan 31 '19 at 12:15
  • @VenkateshGarnepudi I want to extract the file contents based on the date the file was created, not on the file name. The date (for example 05-01-2019) is not part of the file name; I want the contents of the files that were created on 05-01-2019. – Shreya H Jan 31 '19 at 12:22

1 Answer

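This uses the answers from here and here, and assumes you are on a Windows platform: on Windows, os.path.getctime returns the file's creation time, whereas on Unix it returns the time of the last metadata change.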

import os
import numpy as np
from os import listdir
from os.path import join
# Importing datetime module
from datetime import datetime as dt, timedelta
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
# Raw string so the backslash is not treated as an escape character
mypath = r"D:\myclientcard"

# os.path.getctime returns a float timestamp, so the dates are converted to timestamps too
# Add start date here
start_date = dt.strptime('17/01/2019', '%d/%m/%Y').timestamp()
# Add end date here (one day is added so that all of 24/01/2019 is included)
end_date = (dt.strptime('24/01/2019', '%d/%m/%Y') + timedelta(days=1)).timestamp()
folders = [join(mypath,f) for f in listdir(mypath) if '.txt' not in f]
for folder in folders:
    error_dirs = [join(folder,f) for f in listdir(folder) if 'ERROR' in f]
    #print(error_dirs)
    for error_dir in error_dirs:
        textfiles = [join(error_dir,f) for f in listdir(error_dir) if '.txt' in f]
        # Filtering on the basis of date
        textfiles = [f for f in textfiles if start_date <= os.path.getctime(f) < end_date]
        for text_file in textfiles:
            # "with" closes both files, so result.txt is fully written before it is read
            with open(text_file,'r') as reading_file, open('result.txt','a') as writing_file:
                writing_file.write(reading_file.read())
# Build the Excel file once, after every matching text file has been appended
# (sep assumes whitespace-delimited text, as the original delim_whitespace argument suggests)
read_files = pd.read_csv('result.txt',sep=r'\s+')
with ExcelWriter('output.xlsx') as writer:
    read_files.to_excel(writer,sheet_name='Sheet1',index=False)
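
Keep in mind that os.path.getctime returns a float (seconds since the epoch), not a datetime, so it cannot be compared with or subtracted from a datetime object directly. A minimal sketch of a date-range check that converts the value first is below. The helper name created_between is hypothetical, and treating the end date as inclusive for the whole day is an assumption about what you want:

```python
import os
from datetime import datetime, timedelta

def created_between(path, start, end):
    """Return True if the ctime of path falls on a day from start to end inclusive."""
    # getctime gives seconds since the epoch; turn it into a local datetime
    created = datetime.fromtimestamp(os.path.getctime(path))
    # end is midnight at the start of the last day, so extend it by one day
    return start <= created < end + timedelta(days=1)

start_date = datetime.strptime('17/01/2019', '%d/%m/%Y')
end_date = datetime.strptime('24/01/2019', '%d/%m/%Y')
```

It could then be used as a filter, for example `[f for f in textfiles if created_between(f, start_date, end_date)]`.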

On a side note, consider optimizing your code. Also try os.walk, which traverses a directory tree recursively and could replace the nested listdir loops.
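
As a rough illustration of the os.walk suggestion, the sketch below collects every .txt file that sits in a directory whose name contains 'ERROR'. The function name find_error_txt_files is made up for this example. Unlike the nested listdir loops, it matches such directories at any depth under the root, which is an assumption about your folder layout:

```python
import os

def find_error_txt_files(root):
    """Yield paths of .txt files in any directory whose name contains 'ERROR'."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Only look at files whose immediate parent directory is an ERROR folder
        if 'ERROR' in os.path.basename(dirpath):
            for name in filenames:
                if name.endswith('.txt'):
                    yield os.path.join(dirpath, name)
```

Calling `list(find_error_txt_files(mypath))` would give one flat list of text files, which can then be filtered by date before the contents are read.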

Rahul Chawla
  • It's throwing an error like this, can you help me out? `start_date = dt.strptime('17/01/2019', '%d/%m/%Y').total_seconds()` AttributeError: 'datetime.datetime' object has no attribute 'total_seconds' – Shreya H Jan 31 '19 at 12:50
  • Please accept the answer if this was what you were looking for. – Rahul Chawla Jan 31 '19 at 13:04
  • It's throwing an error again, can you please help me out? `textfiles = [f for f in textfiles if ((os.path.getctime(f) - start_date) >= 0 and (os.path.getctime(f) - end_date) <= 0)]` TypeError: unsupported operand type(s) for -: 'float' and 'datetime.datetime' – Shreya H Jan 31 '19 at 13:07