What I want to do:

  1. Extract the sample1.tgz file.
  2. Store the contents in the 'sample1' directory.
  3. Search for a string in the text files under sample1/nvram2/logs

Complete path => C:\Users\username\scripts\sample1\nvram2\logs\version.txt

Note: the text files have different extensions (or none at all)

Example:

textFile.txt 
textFile.txt.0 
textFile.txt.1 
textFile.log 
textFile

What I have tried:

import os,tarfile, glob

string_to_search=input("Enter the string you want to search : ")

#all_files holds all the files in current directory
all_files = [f for f in os.listdir('.') if os.path.isfile(f)]
for current_file in all_files:
    print("Reading " + current_file)

    if (current_file.endswith(".tgz")) or (current_file.endswith("tar.gz")):
        tar = tarfile.open(current_file, "r:gz")
        #file_name contains only name by removing the extension
        file_name=os.path.splitext(current_file)[0]
        os.makedirs(file_name) #make directory with the file name
        output_file_path=file_name  #Path to store the files after extraction
        tar.extractall(output_file_path) #extract the current file
        tar.close()
        #---Following code is to find  the string from all the files in a directory---
        path=output_file_path + '\nvram2\logs\*'
        files=glob.glob(path)

        for file1 in files:
            with open(file1) as f2:
                for line in f2:
                    if string_to_search in line:
                        #print file name which contains the string
                        print(file1)
                        #print the line which contains the string
                        print(str(line))

Issue:

I think the problem is with the path. The code works when I use the following path instead:

path='\nvram2\logs\*.txt'

But that only checks files with the '.txt' extension, and I want to search files with any extension.

It does not work when I use the following path. Here output_file_path contains sample1, i.e. the directory name:

path=output_file_path + '\nvram2\logs\*'
Dipankar Nalui

3 Answers

1

After extracting the files into a folder, you can use os.walk to visit every file under the given path and do your comparison.

Example Code:

import os

# Extract tar file
# ...
# ...

path = os.path.join(output_file_path, 'nvram2', 'logs')

for dirpath, dirs, files in os.walk(path):
    # dirpath : current directory path
    # dirs : directories found in the current directory
    # files : files found in the current directory

    # iterate over each file
    for file in files:

        # build actual path of the file by joining to dirpath
        file_path = os.path.join(dirpath, file)

        # open file
        with open(file_path) as file_desc:

            # iterate over each line; enumerate is used to get the line number
            for ln_no, line in enumerate(file_desc):
                if string_to_search in line:
                    print('Filename: {}'.format(file))
                    print('Text: {}'.format(line.strip()))
                    print('Line No: {}\n'.format(ln_no + 1))
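
The same os.walk approach can be wrapped in a reusable function. This is only a minimal sketch: the name search_tree and its return format are hypothetical, not something from the question, and errors='ignore' is an assumption that the logs may contain bytes that do not decode cleanly.

```python
import os

def search_tree(root, needle):
    """Collect (file_path, line_number, line_text) for every line under root that contains needle."""
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            # os.walk yields bare file names, so join them with dirpath
            file_path = os.path.join(dirpath, name)
            # errors='ignore' drops undecodable bytes instead of raising UnicodeDecodeError
            with open(file_path, errors='ignore') as handle:
                for line_no, line in enumerate(handle, start=1):
                    if needle in line:
                        hits.append((file_path, line_no, line.strip()))
    return hits
```

Because no extension filter is applied, files such as textFile.txt.0, textFile.log and textFile are all searched. A call would look like search_tree(os.path.join('sample1', 'nvram2', 'logs'), string_to_search).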
Thejesh PR
1

Here is the full code that solved the issue:

import os, tarfile, glob

string_to_search = input("Enter the string you want to search : ")

# all_files holds all the files in the current directory
all_files = [f for f in os.listdir('.') if os.path.isfile(f)]
for current_file in all_files:
    if current_file.endswith((".tgz", ".tar.gz")):
        # file_name is the archive name without its extension;
        # splitext alone would turn "x.tar.gz" into "x.tar"
        if current_file.endswith(".tar.gz"):
            file_name = current_file[:-len(".tar.gz")]
        else:
            file_name = os.path.splitext(current_file)[0]
        os.makedirs(file_name, exist_ok=True)  # make directory with the file name
        output_file_path = file_name  # path to store the files after extraction
        with tarfile.open(current_file, "r:gz") as tar:
            tar.extractall(output_file_path)  # extract the current file

        # ---- Find the string in all the files in the logs directory
        path1 = os.path.join(output_file_path, 'nvram2', 'logs')
        for my_file1 in glob.glob(os.path.join(path1, "*")):
            if os.path.isfile(my_file1):  # to discard folders
                with open(my_file1, errors='ignore') as my_file2:
                    # start=1 so the reported line number is 1-based
                    for line_no, line in enumerate(my_file2, start=1):
                        if string_to_search in line:
                            print(string_to_search + " is found in " + my_file1 + "; Line Number = " + str(line_no))

Got help from this answer. The path and "file not found" issue was resolved by joining the directory with the file name.
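
One likely reason the original path misbehaved is that, in a normal string literal, '\n' is an escape sequence for a newline, so '\nvram2' does not start with a backslash at all. The sketch below illustrates the difference between a normal and a raw string, and how os.path.join sidesteps the problem. The variable names are only for illustration.

```python
import os

plain = 'sample1' + '\nvram2'   # '\n' is parsed as a single newline character
raw = 'sample1' + r'\nvram2'    # raw string keeps the backslash and the 'n'

print('\n' in plain)  # True
print('\n' in raw)    # False

# os.path.join inserts the platform's separator, so no backslashes are typed by hand
logs_dir = os.path.join('sample1', 'nvram2', 'logs')
pattern = os.path.join(logs_dir, '*')  # matches every entry, whatever its extension
print(pattern)
```

Passing pattern to glob.glob returns directories as well as files, which is why the code above still filters with os.path.isfile.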

Dipankar Nalui
0

You could add a condition that checks whether '.txt' is present in the file name (file1):

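import os

logs_dir = os.path.join(output_file_path, 'nvram2', 'logs')
files = os.listdir(logs_dir)

for file1 in files:
    if '.txt' in file1:
        # os.listdir returns bare names, so join with the directory before opening
        with open(os.path.join(logs_dir, file1)) as f2:
            for line in f2:
                if string_to_search in line:
                    # print file name which contains the string
                    print(file1)
                    # print the line which contains the string
                    print(str(line))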
Venkatachalam