I'm working on a Windows kernel driver that reports the command-line arguments of started processes. While getting the command-line string is easy, I'm having trouble interpreting it as separate arguments.

I'm using ProcessNotifyExCallback which gives me a PS_CREATE_NOTIFY_INFO for every started process. It contains a PCUNICODE_STRING CommandLine.

However, I'm unsure how this string is split into individual arguments by the Windows kernel. Is there a kernel function that can do that for me? Is the splitting done by the userland processes themselves? Is there a way to query the (already split) arguments?

I'd like to get the arguments exactly the same way as the user-land process would in its argc/argv parameters. So writing the "split" function myself is a no-go (doing the splitting/escaping is non-trivial).

Another interesting detail that I don't quite understand:

Assume I want to start the executable calc.exe with 2 arguments: a and b c (note the space). When running the command in cmd.exe, I write `calc.exe a "b c"`. However, inside the ProcessNotifyExCallback callback I receive the string `calc.exe  a "b c"`, with two spaces between the process name and the argument list. Why is that? When starting the process normally (no cmd.exe), there is only one space. So I assume cmd is doing some magic there?

maja
  • Each process has its own rules for splitting the command line string into argument strings. So if you want to split the command line string into a list of argument strings in your Windows kernel-mode driver, you would need to know how the started process does that. `cmd.exe` has a different interpretation of the command line string than `powershell.exe` or `bash` (in WSL). `reg.exe`, `robocopy.exe` and `findstr.exe` interpret strings on the command line differently from other Windows commands (executables in the Windows system directory). – Mofi May 09 '22 at 10:17
  • Very commonly, a series of one or more spaces separates arguments, unless the spaces are within a double-quoted argument string. So in general it does not matter how many spaces are between `calc.exe` (argument 0) and `a` (argument 1) and `"b c"` (argument 2). But run `cmd /?` in a command prompt window and read the usage help explaining how `cmd.exe` interprets the string after option `/C` or option `/K`. I really wish you good luck if you want your Windows kernel driver to interpret this string the way `cmd.exe` does to list the arguments separately. – Mofi May 09 '22 at 10:22
  • The Windows Command Processor also interprets a comma, a semicolon, an equal sign and an OEM-encoded no-break space (code point value 255 decimal) as argument string separators when processing the strings passed to a batch file, unless they are enclosed in `"`. Example: `Test.cmd "Hello world!",; ,Arg2=Arg3 ", ... comma | ; ... semicolon"` is interpreted as `Test.cmd` (argument 0), `"Hello world!"` (argument 1), `Arg2` (argument 2), `Arg3` (argument 3), `", ... comma | ; ... semicolon"` (argument 4). – Mofi May 09 '22 at 10:31
  • But put `echo ` in front and `cmd.exe` interprets `echo` as argument 0 and everything after the first space after `echo` as one argument string. `reg.exe`, `robocopy.exe` and `findstr.exe` interpret a ``\`` to the left of one more ``\`` or a `"` as an escape character for the backslash or the double quote character. A backslash to the left of any other character is interpreted as a literal character by `reg.exe` and `robocopy.exe`. `explorer.exe` also has special command line string parsing which differs from the common command line string parsing used by most executables. – Mofi May 09 '22 at 10:36
  • I am quite sure that the Windows Sysinternals gurus Mark Russinovich and Bryce Cogswell had good reasons not to try to split the command line string into a list of argument strings when writing [Process Monitor](https://learn.microsoft.com/en-us/sysinternals/downloads/procmon) and [Process Explorer](https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer), and instead to display the command line string in these tools as it was used on creating the process. – Mofi May 09 '22 at 10:41
  • @Mofi I know that cmd/PowerShell/... interpret the input differently, but that should be irrelevant. At some point, a syscall must be made (with an array of arguments), and Windows needs to populate the new process's PE header, which will eventually be used by the userland process in its argc/argv parameters. But I don't know how to access this argument array inside ProcessNotifyExCallback. I can read the PE header memory, but that feels like an ugly hack – maja May 09 '22 at 14:40
  • Every Windows executable capable of starting another executable uses the Windows kernel library function [CreateProcess](https://learn.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createprocessw), which has two parameters of interest for you: `lpApplicationName` and `lpCommandLine`. A running process can use just one of these two parameters or both. There is no array of argument strings passed to `CreateProcess`. BTW: The additional space between the application name and its arguments often comes from `CreateProcess`. – Mofi May 09 '22 at 16:37
  • The function parameters `argc` and `argv` of function `main` are created by code in the started program that is executed before the function `main` is called. This startup code usually comes from a library installed with the compiler and is linked into the executable when the source code is compiled. But every C/C++ programmer can write their own startup code. I have done that in the past and so have many others. Look at [this answer](https://stackoverflow.com/a/24008269/3074564) for a comparison of command line parsing with three different C compilers. – Mofi May 09 '22 at 16:37
  • @Mofi Thanks for the info. It seems that the kernel provides a helper function, https://learn.microsoft.com/en-us/windows/win32/api/shellapi/nf-shellapi-commandlinetoargvw, but that is for C-like user-space programs and doesn't need to be used. (It also explains how splitting is performed.) I still don't know why the PE header of a running process (that I can read in the kernel) contains the split arguments (`PUNICODE_STRING`, with args separated by `0x0000`). Anyway, if you post your comments as an answer, I'll accept/upvote it. – maja May 10 '22 at 07:11
  • After reading a bit more documentation (including https://learn.microsoft.com/en-us/previous-versions/ms880421(v=msdn.10)?redirectedfrom=MSDN), it seems that this is a Windows thing. On Linux, the syscall gets a list of arguments, rather than a single string. And languages like Go concatenate the list of arguments into a single string according to the above rules for Windows, but forward them directly on Linux. – maja May 10 '22 at 07:32
  • It is good that you now have a better understanding of how executables are started on Windows. I don't want to write an answer to your question, but feel free to answer it yourself if you think it could be useful for other programmers, too. I would delete all my comments here to clean up the question after you have written an answer. For that reason, don't give credit to me in your answer. This is really not necessary. Thanks and good luck with the further development of your Windows kernel driver. – Mofi May 10 '22 at 18:30
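
The splitting rules discussed in the comments (runs of spaces or tabs separate arguments, double quotes group text, and backslashes only act as escapes directly before a double quote) can be sketched in portable C. This is a minimal sketch, not what any particular program really does: it assumes the rules documented for `CommandLineToArgvW` and the Microsoft C startup code, the name `split_command_line` is hypothetical, it works on `char` rather than the `WCHAR` buffer of a `UNICODE_STRING`, and it ignores the special handling of the program name and of `""` inside a quoted string. A program with its own startup code may split the same string differently.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a Windows API): splits cmdline into at most
 * max_args heap-allocated strings stored in argv and returns their count,
 * or -1 if an allocation fails. The caller frees the strings. */
static int split_command_line(const char *cmdline, char **argv, int max_args)
{
    int argc = 0;
    const char *p = cmdline;

    while (argc < max_args) {
        /* A run of spaces/tabs is one separator, however long it is. */
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '\0')
            break;

        /* An argument is never longer than the remaining input. */
        char *out = malloc(strlen(p) + 1);
        if (out == NULL)
            return -1;
        size_t n = 0;
        int in_quotes = 0;

        while (*p != '\0' && (in_quotes || (*p != ' ' && *p != '\t'))) {
            if (*p == '\\') {
                size_t slashes = 0;
                while (*p == '\\') {
                    slashes++;
                    p++;
                }
                if (*p == '"') {
                    /* 2n backslashes + quote: n backslashes, and the quote
                     * is handled as a delimiter on the next iteration.
                     * 2n+1 backslashes + quote: n backslashes and a
                     * literal quote. */
                    for (size_t i = 0; i < slashes / 2; i++)
                        out[n++] = '\\';
                    if (slashes % 2 == 1) {
                        out[n++] = '"';
                        p++;
                    }
                } else {
                    /* Backslashes not followed by a quote are literal. */
                    for (size_t i = 0; i < slashes; i++)
                        out[n++] = '\\';
                }
            } else if (*p == '"') {
                in_quotes = !in_quotes;   /* quote toggles grouping */
                p++;
            } else {
                out[n++] = *p++;
            }
        }
        out[n] = '\0';
        argv[argc++] = out;
    }
    return argc;
}
```

Under these assumptions, the string with two spaces from the question, `calc.exe  a "b c"`, splits into the same three arguments (`calc.exe`, `a`, `b c`) as the single-space variant, which is why the extra space is harmless for programs that follow these rules.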
