I have a large SQL file with about 1 million inserts in it. Some of the inserts (about 6000) are corrupted with weird characters that I need to remove so I can insert them into my DB.
Ex:
INSERT INTO BX-Books
VALUES ('2268032019','Petite histoire de la d�©sinformation','Vladimir Volkoff',1999,'Editions du Rocher','http://images.amazon.com/images/P/2268032019.01.THUMBZZZ.jpg','http://images.amazon.com/images/P/2268032019.01.MZZZZZZZ.jpg','http://images.amazon.com/images/P/2268032019.01.LZZZZZZZ.jpg');
I want to remove only the weird characters and leave all of the normal ones.
I tried using the following code to do so:
import fileinput
import string
fileOld = open('text1.txt', 'r+')
file = open("newfile.txt", "w")
for line in fileOld: #in fileinput.input(['C:\Users\Vashista\Desktop\BX-SQL-Dump\test1.txt']):
    print(line)
    s = line
    printable = set(string.printable)
    filter(lambda x: x in printable, s)
    print(s)
    file.write(s)
But it doesn't seem to be working: when I print s, it is the same as what was printed for line, and, stranger still, nothing gets written to the file.
Any advice or tips on how to solve this would be useful.
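
For reference, here is a minimal Python 3 sketch of what the loop above seems to be aiming for. Two things look like likely causes: filter() returns a new value instead of changing s in place, so its result has to be kept, and the output file is never closed, so buffered writes may never reach the disk. The helper names strip_nonprintable and clean_file are hypothetical, and decoding as latin-1 is an assumption made only so that every byte decodes without raising an error. Filtering on string.printable also drops legitimate accented letters, not just the corrupted bytes.

```python
import string

# Characters considered "normal": ASCII letters, digits, punctuation, whitespace.
PRINTABLE = set(string.printable)


def strip_nonprintable(text):
    """Return a copy of text with every character outside string.printable removed."""
    # Strings are immutable, so the filtered result must be built and returned.
    return "".join(ch for ch in text if ch in PRINTABLE)


def clean_file(src_path, dst_path):
    """Copy src_path to dst_path line by line, dropping non-printable characters."""
    # Assumption: latin-1 maps every possible byte to a character, so reading
    # never fails on corrupted input. The "with" block closes both files,
    # which flushes the buffered output to disk.
    with open(src_path, "r", encoding="latin-1") as src, \
            open(dst_path, "w", encoding="ascii") as dst:
        for line in src:
            dst.write(strip_nonprintable(line))
```

With the file names from the question, this would be called as clean_file('text1.txt', 'newfile.txt'). Reading line by line keeps memory use small even with around a million inserts.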