I'm building a malware classifier for a class, and I'm trying to implement a loop that goes through every file in a folder using os.listdir(), where the folder is passed as a command-line argument (sys.argv). I have been stuck on this problem for hours and can't figure out what the issue is.

The idea of my code is to "train" with a set of known goodware and known malware, then test it on a set of "unknown" files to see if it can correctly classify the files based on a specified feature; the final output should print the labels of the unknown files.

I have tried both the absolute and relative paths as my command line arguments thinking the issue could be related to accessing the subfolders, but that didn't fix the problem. [screenshot of code and directories][1]

Command line: python3 PEexample.py gw/ mw/ unknown/

Full error:

Traceback (most recent call last):
  File "/Users/kyleefriederichs/Desktop/School/CSCE_698_CyberDefense/CSCE689_CyberDefense/Cyber_practice/PEexample.py", line 39, in <module>
    pe1 = pefile.PE(file)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pefile.py", line 2895, in __init__
    self.__parse__(name, data, fast_load)
  File "/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/pefile.py", line 2970, in __parse__
    stat = os.stat(fname)
FileNotFoundError: [Errno 2] No such file or directory: 'gw1.exe'

Code:

idx_gw = 0
for file in os.listdir(sys.argv[1]):
    # pe1=pefile.PE(file)
    if not file.startswith("."):
        try:
            pe1 = pefile.PE(file)
        except pefile.PEFormatError:
            sys.exit("[-] Not PE file:" + file)

        pe_gw_imps = len(pe1.DIRECTORY_ENTRY_IMPORT)
        idx_gw += 1
Vyxxen
  • Can you please provide the code that you've tried so far? – WArnold Mar 02 '23 at 10:46
  • Now that you have edited in your code, can you also re-create it with a minimal example? Most of the stuff in that file is not needed to reproduce the error, I presume. When asking questions think: What is the minimum amount of code I need to reproduce the problem? Including, for example. file structure. – Egeau Mar 02 '23 at 11:02
  • I reduced it, since the original code just uses the same for loop and I know that it's getting stopped on the first iteration of the loop. Is it better? – Vyxxen Mar 02 '23 at 11:10
  • The problem is that os.listdir('gw') gives the names *within* directory *gw*. But when you open the files listed there, you don't give the directory name: not 'gw/gw1.exe', just 'gw1.exe'. Since your current directory is not inside gw, but one step above it, pefile.PE() can't find the exe. See the answer from AKX. Also, you are trying to learn two things at once: how to deal with directories, and how to deal with the PE Python library. I suggest you search for how to recursively scan directories. For instance https://stackoverflow.com/questions/18394147 – Prof. Falken Mar 02 '23 at 11:31

1 Answer

os.listdir() returns filenames, not pathnames.
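
To see the difference, here is a small diagnostic sketch. It does not use pefile at all; the helper name describe_entries and the demo file name are made up for illustration. For each entry it reports whether the bare name and the directory-qualified path resolve to a file from the current working directory:

```python
import os
import tempfile


def describe_entries(directory):
    """Return (name, bare_name_is_file, joined_path_is_file) for each entry.

    The bare name is resolved against the current working directory,
    which is usually NOT the directory that was listed.
    """
    report = []
    for name in sorted(os.listdir(directory)):
        joined = os.path.join(directory, name)
        report.append((name, os.path.isfile(name), os.path.isfile(joined)))
    return report


# Demo: a throwaway directory stands in for gw/ (hypothetical file name).
with tempfile.TemporaryDirectory() as demo_dir:
    open(os.path.join(demo_dir, "gw1_listdir_demo.exe"), "wb").close()
    for name, bare_ok, joined_ok in describe_entries(demo_dir):
        print(f"{name}: bare name found={bare_ok}, joined path found={joined_ok}")
```

Unless you happen to run this from inside the listed directory, the bare name should not be found while the joined path is, which matches the FileNotFoundError for 'gw1.exe' in the question.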

You'll want to prepend the path to the filename to get the path (and the right tool for manipulating paths is os.path.join):

directory = sys.argv[1]
for filename in os.listdir(directory):
    path = os.path.join(directory, filename)
    # ...
    pe1 = pefile.PE(path)
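
Putting that together with the hidden-file check from the question, one possible shape for the loop is sketched below. The helper name iter_file_paths is hypothetical, and the sorting and the os.path.isfile check are additions of this sketch rather than anything the question requires. The demo uses a temporary directory, so it runs without pefile installed:

```python
import os
import tempfile


def iter_file_paths(directory):
    """Yield full paths of the non-hidden regular files in `directory`."""
    for filename in sorted(os.listdir(directory)):
        if filename.startswith("."):
            continue  # skip hidden entries such as .DS_Store
        path = os.path.join(directory, filename)
        if os.path.isfile(path):  # ignore subdirectories
            yield path


# Demo with a throwaway directory; in the real script `directory`
# would be sys.argv[1] and each path would go to pefile.PE(path).
with tempfile.TemporaryDirectory() as demo_dir:
    for name in ("gw1.exe", "gw2.exe", ".DS_Store"):
        open(os.path.join(demo_dir, name), "wb").close()
    for path in iter_file_paths(demo_dir):
        print(os.path.basename(path))
```

Because the generator yields paths that already include the directory, they resolve correctly whether the script is given a relative argument like gw/ or an absolute one.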

You could also

AKX