
I need to delete a string, or a list of strings, from a file based on user input. I followed the linked question below, and it works.

Deleting a specific line in a file (python)

However, that approach reads the entire file into memory and then writes every line that is not being deleted back to the same file. This is not suitable when dealing with files that hold a huge amount of confidential data.

All I want to know is: is there a better way to do the same thing?

  valid_List=["10.1.2.3","10.2.3.4","10.2.4.5","10.2.3.7"]
  filename="abc.txt"
  for i in valid_List:
    f = open(filename,"r")
    lines = f.readlines()
    f.close()
    f = open(filename,"w")
    for line in lines:
      if line!=i+" "+ "ok"+"\n":
        #print("Writing ip not to be deleted")
        f.write(line)
      else:
        print(i," Deleted")
        user_response.append(i+" Deleted")
        logger.info('Response returned to user%s',user_response)
    f.close()
Iram Khan

3 Answers


You can read from one file and write to a second one, processing the input line by line.

Afterwards, replace the input file with the output file:

import shutil

valid_List = ["10.1.2.3", "10.2.3.4", "10.2.4.5", "10.2.3.7"]
filename = "abc.txt"
outfile = "outfile.txt"

with open(filename, "r") as f:
    with open(outfile, "w") as o:
        for line in f:
            if all([line != "%s ok\n" % i for i in valid_List]):
                o.write(line)
            else:
                print("%s Deleted" % line.strip())

shutil.move(outfile, filename)

Caveat: This uses a fixed filename for the output, which might cause collisions if you run the program several times in parallel. If you use this atomic save recipe, you can simplify the code to:

valid_List = ["10.1.2.3", "10.2.3.4", "10.2.4.5", "10.2.3.7"]
filename = "abc.txt"

with atomic_open(filename, "w") as o:
    with open(filename, "r") as f:
        for line in f:
            if all([line != "%s ok\n" % i for i in valid_List]):
                o.write(line)
            else:
                print("%s Deleted" % line.strip())

This will automatically choose a temporary file (collision-free) for you and replace the input file with the output file on completion.
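
The recipe itself is not reproduced in this answer. As a rough sketch only, and not the linked recipe's actual code, a helper with the same name could be built from the standard library's tempfile and os.replace. The body below is an assumption made for illustration:

```python
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def atomic_open(path, mode="w"):
    """Yield a temporary file and move it over `path` if the block succeeds."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file next to the target so the rename stays on one filesystem.
    tmp = tempfile.NamedTemporaryFile(mode, dir=directory, delete=False)
    try:
        with tmp:
            yield tmp
        # The temp file is closed here; os.replace is an atomic rename on POSIX.
        os.replace(tmp.name, path)
    except BaseException:
        # Discard the partial output and leave the original file untouched.
        os.unlink(tmp.name)
        raise
```

With this sketch the original file is only replaced once the whole output has been written. Note that the new file carries the temp file's permissions (owner read/write only), not those of the original.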

You will also notice that I replaced your outer loop (which reopened the file once for each entry in valid_List) with a single all() call. This saves a lot of overhead, too.
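
If valid_List grows large, the all() test still compares each line against every entry. One possible refinement is to build the set of exact lines to delete once, so that each check becomes a single set lookup. The helper name keep_lines and the sample lines are made up for illustration:

```python
def keep_lines(lines, ips, suffix=" ok\n"):
    """Yield only the lines that do not exactly match '<ip> ok\\n'."""
    # Build the full target lines once; set membership is O(1) on average.
    targets = {ip + suffix for ip in ips}
    for line in lines:
        if line not in targets:
            yield line


valid_List = ["10.1.2.3", "10.2.3.4", "10.2.4.5", "10.2.3.7"]
# Hypothetical file contents, used only to demonstrate the filter.
sample = ["10.1.2.3 ok\n", "10.9.9.9 ok\n", "10.2.3.7 ok\n", "10.2.3.7 failed\n"]
kept = list(keep_lines(sample, valid_List))
print(kept)
```

Because keep_lines accepts any iterable of lines, it can be fed an open file object directly, so the file is still processed one line at a time.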

Nils Werner

You're opening and closing the huge file multiple times, once for each element in valid_List. You should instead open the file just once and check whether each line matches any entry in valid_List.

Try something like this (the code is untested, but it should work):

valid_List=["10.1.2.3","10.2.3.4","10.2.4.5","10.2.3.7"]
filename="abc.txt"

f = open(filename,"r")
lines = f.readlines()
f.close()

f = open(filename,"w")
for line in lines:
    flag = True
    deleted = ''
    for i in valid_List:
        if line == i+" "+ "ok"+"\n":
            flag = False
            deleted = i
            break
    if flag:
        #print("Writing ip not to be deleted")
        f.write(line)
    else:
        print(deleted," Deleted")
f.close()  

EDIT
Added check for not-found IPs.

valid_List=["10.1.2.3","10.2.3.4","10.2.4.5","10.2.3.7"]
filename="abc.txt"

# One flag per entry in valid_List: set to True once that IP is seen in the file
if_found = [False for v in valid_List]

f = open(filename,"r")
lines = f.readlines()
f.close()

f = open(filename,"w")
for line in lines:
    flag = True
    deleted = ''
    for idx,i in enumerate(valid_List):
        if line == i+" "+ "ok"+"\n":
            flag = False
            if_found[idx] = True
            deleted = i
            break
    if flag:
        #print("Writing ip not to be deleted")
        f.write(line)
    else:
        print(deleted," Deleted")
f.close()

# Report every IP whose flag was never set
for idx,found in enumerate(if_found):
    if not found:
        print(valid_List[idx]," Not Found")
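
The same bookkeeping can also be done without a parallel list of flags. The following is only a sketch, with the function name split_lines made up for illustration. It records the matched IPs in a set and derives the not-found ones at the end:

```python
def split_lines(lines, ips, suffix=" ok\n"):
    """Return (kept_lines, deleted_ips, not_found_ips)."""
    # Map each exact line to delete back to the IP it belongs to.
    targets = {ip + suffix: ip for ip in ips}
    kept, found = [], set()
    for line in lines:
        ip = targets.get(line)
        if ip is None:
            kept.append(line)   # no match: the line stays in the file
        else:
            found.add(ip)       # match: drop the line, remember the IP
    deleted = [ip for ip in ips if ip in found]
    not_found = [ip for ip in ips if ip not in found]
    return kept, deleted, not_found


kept, deleted, not_found = split_lines(
    ["10.1.2.3 ok\n", "10.9.9.9 ok\n"],  # hypothetical file contents
    ["10.1.2.3", "10.2.3.4"],
)
```

The kept lines can then be written back to the file, and the two lists can be used for the "Deleted" and "Not Found" messages.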
Abdul Fatir
  • Sorry buddy, the mistakes were on my side. Thanks a lot, it helped me. However, I have a question: can you tell me how I can show whether the IP entered for deletion is "not in the file" or "is invalid"? I tried a few ways but it isn't working. – Iram Khan May 19 '16 at 09:56
  • I have upvoted you, but I do not have enough reputation yet. Once I get it, you will see the change :) – Iram Khan May 19 '16 at 09:57
  • If this worked, you can mark this as the answer by clicking on the tick mark. For checking if the IP is in file just keep a variable associated with each IP entered and change its value once the IP is found. Finally check if the variable did not change state, which will mean the IP is not in the file. – Abdul Fatir May 19 '16 at 10:05
  • For checking valid IP, you can validate using regex. Please let me know if you don't know what regex is or you need code. – Abdul Fatir May 19 '16 at 10:06
  • I have a list of invalid IPs, which I checked using the IPy library. So now I just loop over my invalid list after the "for line in lines" loop has ended, like this ---> for i in invalid_List: user_response.append(i+" Invalid") logger.info('Response returned to user%s',user_response)... – Iram Khan May 19 '16 at 10:12
  • However, I am unable to handle the case where the IP entered is not found in the file :( – Iram Khan May 19 '16 at 10:13
  • Go to pastebin.com, paste your code, and provide the link. – Abdul Fatir May 19 '16 at 10:19
  • Access to it is denied from my work location. I tried this: I took a variable and initialized it just below the flag variable --> ip_found=0. In the code you suggested, below flag=false, I set ip_found=1, and below the if-else on flag I checked as follows --> if ip_found==0: user_response.append(deleted+" IP not found") logger.info('Response returned to user%s',user_response)... However, with this I get my invalid IP printed three times because it is inside the loop. If I put it outside the for loop, I don't get the value of the IP, because the deleted variable is inside the for loop. :( – Iram Khan May 19 '16 at 10:25
  • I've added modified code to the answer. This should do the task. – Abdul Fatir May 19 '16 at 10:36
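
The comments above suggest a regex for validating the entered IPs, and the asker mentions the IPy library. As an alternative sketch (not what either of them used), Python's standard ipaddress module can do the same check. The helper name is_valid_ip and the sample input are hypothetical:

```python
import ipaddress


def is_valid_ip(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


# Hypothetical user input, split into valid and invalid entries.
entered = ["10.1.2.3", "10.2.3.999", "not-an-ip"]
valid_List = [ip for ip in entered if is_valid_ip(ip)]
invalid_List = [ip for ip in entered if not is_valid_ip(ip)]
```

Only the entries in valid_List then need to be looked up in the file; the ones in invalid_List can be reported as invalid straight away.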

I created this script. You put a bunch of line strings in a list, and any line that matches one of them gets deleted. It works in batch: it opens multiple files, and you input the number of files. Obviously it is only for personal use, not for end users, because it has no input checking, and the files need to be in the same directory as the script:

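n = int(input('enter the number of files:'))
strings_to_remove = ['Edited at', 'test']
# Assumes the files are named 1.txt, 2.txt, ..., n.txt;
# range(1, n) would skip the last file, hence n + 1
for i in range(1, n + 1):
    with open(f"{i}.txt", "r") as f:
        lines = f.readlines()
    with open(f"{i}.txt", "w") as f:
        for line in lines:
            # Keep the line unless it exactly matches an entry (ignoring surrounding whitespace)
            if line.strip() not in strings_to_remove:
                f.write(line)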