I'm currently teaching myself Python and am in the process of writing my first script. It is a Linux file-search script that recognizes duplicate files by their MD5 hash. It is purely a learning exercise, not a real project.
Here's my code:

    from subprocess import Popen, PIPE
    import os

    def index(directory):
        stack = [directory]
        files = []
        while stack:
            directory = stack.pop()
            for file in os.listdir(directory):
                fullname = os.path.join(directory, file)
                if search_term in fullname:
                    files.append(fullname)
                if os.path.isdir(fullname) and not os.path.islink(fullname):
                    stack.append(fullname)
        return files

    from collections import defaultdict

    def check(directory):
        files = index(directory)
        if len(files) < 1:
            print("No file(s) meets your search criteria")
        else:
            print ("List of files that match your criteria:")
            for x in files:
                print (x)
            print ("-----------------------------------------------------------------")
            values = []
            for x in files:
                cmd = ['md5sum', x]
                proc = Popen(cmd, stdout=PIPE)
                (out, err) = proc.communicate()
                a = out.split(' ', 1)
                values.append(a[0])
                proc.stdout.close()
                stat = os.waitpid(proc.pid, 0)
            D = defaultdict(list)
            for i,item in enumerate(values):
                D[item].append(i)
            D = {k:v for k,v in D.items() if len(v)>1}
            for x in D:
                if len(D[x]) > 1:
                    print ("File", files[D[x][0]], "is same file(s) as:")
                    for y in range(1, len(D[x])):
                        print (files[D[x][y]])

    search_term = input('Enter a (part of) file name for search:')
    a = input('Where to look for a file? (enter full path)')
    check(a)

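For comparison, the duplicate-grouping step can also be done without spawning md5sum at all. The following is only a sketch that swaps in the standard-library hashlib module for the external command; the helper names md5_of_file and group_duplicates and the chunk_size parameter are made up for illustration and are not part of the script above.

```python
import hashlib
import os
from collections import defaultdict


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def group_duplicates(paths):
    """Map each digest to the paths sharing it, keeping groups of 2 or more."""
    by_digest = defaultdict(list)
    for path in paths:
        by_digest[md5_of_file(path)].append(path)
    return {d: same for d, same in by_digest.items() if len(same) > 1}
```

Each value of the returned dict is a list of paths with identical content, so the first entry can be reported as the "original" and the rest as its duplicates, much like the printing loop in check().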
My questions regarding the code:
1. I've been advised to replace the deprecated os.popen() with subprocess.Popen().
Yet I don't have a clue how to do it. I tried several solutions that I found here on Stack Overflow, but none seems to work in my case, and each one produces some kind of error. For example, when I do it like this:

    from subprocess import Popen, PIPE
    ...
    cmd = ['md5sum', f]
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    proc.stdout.close()
    stat = os.waitpid(proc.pid, 0)

I get this error: NameError: global name 'subprocess' is not defined
I'm really lost on this one, so any help is appreciated.
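As far as I understand it, `from subprocess import Popen, PIPE` binds only the names Popen and PIPE, so the name subprocess itself is never defined and `subprocess.Popen` fails with the NameError quoted above. A minimal sketch of one consistent way to write it follows, assuming Python 3.5 or later (for subprocess.run) and that md5sum is on the PATH; first_field and md5sum_of are hypothetical helper names.

```python
import subprocess


def first_field(cmd):
    """Run cmd and return the first whitespace-separated field of its stdout."""
    # "import subprocess" defines the module name, so subprocess.run and
    # subprocess.PIPE both resolve. run() waits for the process to finish,
    # so no manual stdout.close() or os.waitpid() is needed afterwards.
    result = subprocess.run(cmd, stdout=subprocess.PIPE, check=True)
    # stdout is bytes in Python 3; decode it before splitting on whitespace.
    return result.stdout.decode(errors="replace").split(None, 1)[0]


def md5sum_of(path):
    """Hash of one file as printed by the md5sum command-line tool."""
    return first_field(["md5sum", path])
```

The other consistent option is to keep `from subprocess import Popen, PIPE` and call `Popen(cmd, stdout=PIPE)` without the `subprocess.` prefix.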
2. How can I make this program search from the top (root) directory?
If I enter "/" as the search path, I get PermissionError: [Errno 1] Operation not permitted: '/proc/1871/map_files'
Does my script need sudo privileges?
I'm trying to learn Python on my own using the Internet. Thanks for your help!
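One way to get past that error without elevated privileges is to skip any directory that cannot be listed. The sketch below is a variant of index() under that assumption; passing search_term as a parameter and printing a "Skipping" message are illustrative choices, not something the script above already does.

```python
import os


def index(directory, search_term):
    """Collect paths under directory whose full name contains search_term."""
    stack = [directory]
    matches = []
    while stack:
        current = stack.pop()
        try:
            names = os.listdir(current)
        except OSError as exc:
            # PermissionError is a subclass of OSError. Catching OSError also
            # covers directories that disappear mid-scan, as can happen
            # under /proc. Report the problem and carry on.
            print("Skipping", current, "-", exc)
            continue
        for name in names:
            fullname = os.path.join(current, name)
            if search_term in fullname:
                matches.append(fullname)
            if os.path.isdir(fullname) and not os.path.islink(fullname):
                stack.append(fullname)
    return matches
```

Directories that the current user cannot read are then simply left out of the results rather than aborting the whole search.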