Long story short: I am building a database of all the quotations ever made in our company. I am looking for files with one particular extension: *.prc. One piece of information I would like to retrieve is the owner of each file. I am using the following code (showing only part of it):
import os
import subprocess
import win32security
from threading import Thread
from time import time


def GET_THE_OWNER(FILENAME):
    # Opening the file first makes an unreadable file fail early.
    open(FILENAME, "r").close()
    # Read only the owner part of the file's security descriptor.
    sd = win32security.GetFileSecurity(FILENAME, win32security.OWNER_SECURITY_INFORMATION)
    owner_sid = sd.GetSecurityDescriptorOwner()
    # Translate the SID into an account name (None: the lookup starts on the local computer).
    name, domain, sid_type = win32security.LookupAccountSid(None, owner_sid)
    return name


starttime = time()
path = "C:/Users/cbabycv/Documents/Python/0. Quotations/Example"
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".prc"):
            # getting data from the file information
            Filename = os.path.join(root, file)
            try:
                Owner = GET_THE_OWNER(Filename)
            except Exception:
                Owner = "Could not get the owner."
            print(Owner)
endtime = time()
print(endtime - starttime, " sec")
The process is slow (especially when you have to read around 100,000 files). Is there another way to make it faster? Please note that I am asking about Windows only (I cannot use os.stat() in this case, because it does not work for this on Windows). I have also tried the approach from Paal Pedersen's answer to "how to find the owner of a file or directory in python", but it is even slower than using the Windows API.
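One thing that might be worth measuring: if most files belong to a small number of users, the SID-to-name translation could be done once per distinct owner instead of once per file. Below is a minimal sketch of that idea, not a tested Windows solution. The helper `make_cached_owner_lookup` and its two callables are hypothetical names; on Windows, `read_sid_string` could wrap `GetFileSecurity`, `GetSecurityDescriptorOwner` and `win32security.ConvertSidToStringSid`, and `resolve_name` could wrap `win32security.ConvertStringSidToSid` plus `LookupAccountSid`. Whether the name lookup is actually the slow step here is an assumption.

```python
def make_cached_owner_lookup(read_sid_string, resolve_name):
    """Return owner_of(path) that resolves each distinct SID only once.

    read_sid_string(path) -> str   (hypothetical: the owner's SID as a string)
    resolve_name(sid_string) -> str (hypothetical: the account name for that SID)
    """
    cache = {}  # SID string -> account name

    def owner_of(path):
        sid = read_sid_string(path)
        if sid not in cache:
            # Only the first file of each owner pays for the name lookup.
            cache[sid] = resolve_name(sid)
        return cache[sid]

    return owner_of
```

With 100,000 files and, say, a few dozen distinct owners, this would cut the name lookups to a few dozen, while the security descriptor is still read once per file.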
I am using os.walk() to find the files on the server. I do not know the exact location of the files; they could be in any folder, so I look at every file in all folders and subfolders and check whether it is a *.prc file. Someone suggested multiprocessing, many thanks :) I will try to optimize the whole code, but my question still stands: is there a faster or better way to find the owner of a file on Windows?
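For reference, the walk-and-filter step combined with the parallelism suggestion could look roughly like the sketch below. It uses a thread pool rather than multiprocessing, which is my own substitution; `find_files`, `collect_owners` and the `get_owner` parameter are hypothetical names, and `get_owner` stands for any function that takes a path and returns the owner (for example the GET_THE_OWNER function above). Whether threads help depends on how much of the time is spent waiting on the file server, so this would need to be measured.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def find_files(root, suffix=".prc"):
    """Yield the full path of every file under root whose name ends with suffix."""
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if name.endswith(suffix):
                yield os.path.join(folder, name)


def collect_owners(root, get_owner, suffix=".prc", max_workers=8):
    """Return {path: owner}, calling get_owner(path) from a pool of threads.

    A file whose lookup raises is recorded with owner None instead of
    stopping the whole run.
    """
    def safe_owner(path):
        try:
            return path, get_owner(path)
        except Exception:
            return path, None

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return dict(pool.map(safe_owner, find_files(root, suffix)))
```

Usage would be something like `owners = collect_owners(path, GET_THE_OWNER)`, followed by a loop over `owners.items()` to store the results.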
@theCreator suggested using PowerShell. I have tried that; it is approximately 14 times slower:
import os
import subprocess
from pathlib import Path
from time import time

starttime = time()


def GET_THE_OWNER(cmd):
    # Hide the console window that would otherwise flash for each call.
    startupinfo = subprocess.STARTUPINFO()
    startupinfo.dwFlags |= subprocess.STARTF_USESHOWWINDOW
    # Note: a new powershell.exe process is started for every single file.
    completed = subprocess.run(["powershell.exe", "-Command", "Get-Acl ", cmd, " | Select-Object Owner"], capture_output=True, startupinfo=startupinfo)
    return completed


path = Path('C:/Users/cbabycv/Documents/Python/0. Quotations/Example')
for root, dirs, files in os.walk(path):
    for file in files:
        if file.endswith(".prc"):
            # getting data from the file information
            Filename = os.path.join(root, file)
            # Quote the path so that spaces survive the PowerShell command line.
            Filename = "\"" + Filename + "\""
            Owner = GET_THE_OWNER(Filename)
            if Owner.returncode != 0:
                print("An error occurred: %s" % Owner.stderr)
            else:
                print(Owner.stdout)
endtime = time()
print(endtime - starttime, " sec")