0

I have below data in excel file (data.xlsx) more than 6000 rows and i have many text files in my ubuntu system.

Input excel file

Text files directory structure:-

/home/user/excel/report/ML_PROJECT_APPLICATION_WRITE.txt
/home/user/excel/report/folder-1/ML_PROJECT_APPLICATION_OPEN.txt
/home/user/excel/report/folder-1/filepath/ML_PROJECT_UBUNTU_OPEN.txt
/home/user/excel/report/folder-2/ML_PROJECT_CENTOS_WRITE.txt
/home/user/excel/report/folder-3/ML_PROJECT_RHEL_WRITE.txt

One of the text file format as follows,

ML_PROJECT_APPLICATION_WRITE.txt

# //DEPOT/ABCD/PROJECT/Jerd
# Permission: WRITE

dreac.leoson
ritu.bhangale
makyen
markerikson.s
bernardo.pereira 
elitezen

The filename of the text files are matching excel sheet's D column. For each row, i want to a search text file as per D column and need to look the User-ID in H column, if user-id exists in that particular row matched text file then user-id need to be removed from that text file. Need help to achieve this in automated way. Thank you!

user4948798
  • 1,924
  • 4
  • 43
  • 89

1 Answers1

0

First, import data.xlsx into a pandas.DataFrame and create a dict to map the file names and user ids:

import pandas as pd
data = pd.read_excel("data.xlsx")
d = dict(zip(data["File Name"], data["User-ID"]))

Use a custom function to return the path of your file (with sub-directory):

import os
def find(name):
    for root, _, files in os.walk("/home/user/excel/report"):
        if name in files:
            return os.path.join(root, name)

Then, loop through the dict and transform your files as needed. First, read the file. Then, remove the user-id if it exists and write back to the file.

for file in d:
    fullname = find(f"{file}.txt")
    if fullname is not None:
        with open(fullname, "r") as f:
            contents = f.read().strip()
    
        with open(fullname, "w") as f:
            f.write(contents.replace(d[file], ""))

Note, if your files are in a different directory, make sure you specify the entire file path in pd.read_excel and open.

not_speshal
  • 22,093
  • 2
  • 15
  • 30