28

I'm looking for alternative ways to obtain the command line parameters argc and argv provided to a process without having direct access to the variables passed into main().

I want to make a class that is independent of main() so that argc and argv don't have to be passed explicitly to the code that uses them.

EDIT: Some clarification seems to be in order. I have this class.

class Application
{
  int const argc_;
  char const** const argv_;

public:
  explicit Application(int, char const*[]);
};

Application::Application(int const argc, char const* argv[]) :
  argc_(argc),
  argv_(argv)
{
}

But I'd like a default constructor Application::Application(), with some (most probably) C code, that pulls argc and argv from somewhere.

technosaurus
user1095108
  • 1
    What do you mean with "obtain"? Where else from? – user3078414 May 03 '16 at 07:29
  • 11
    There's no portable or standard way of getting arguments to the program except the arguments to `main`. – Some programmer dude May 03 '16 at 07:29
  • @PaulR yes, yes, the APIs are in C, as you know, but class is in C++. I didn't want an answer featuring python, even though it would be cool. – user1095108 May 03 '16 at 07:32
  • 4
    @JoachimPileborg True, but the question is OS-specific, non-portable. – user1095108 May 03 '16 at 07:37
  • 1
    As for the user not needing to pass `argc` and `argv` to the "class", take a look at just about all portable and platform-independent GUI frameworks: they all need the user of the framework to pass `argc` and `argv` to the framework explicitly. So it's common and not something programmers are unused to. – Some programmer dude May 03 '16 at 07:38
  • 14
    You tag this question `linux`, `windows`, `posix` and `bsd` and say this question is OS-specific? If it was OS-specific you would mention only one OS, the one you target. – Some programmer dude May 03 '16 at 07:40
  • @JoachimPileborg The frameworks contain OS-specific code, not just portable code; there is no reason whatsoever not to make the command-line gathering part OS-specific too. I'm trying to collect as many ways to obtain the command line as possible, but not infinitely many and not opinion-based ones. – user1095108 May 03 '16 at 07:43
  • Chicken-and-egg problem: if you want code that provides the arguments, then that code has to rely on main; directly or indirectly, you will have a main. – CoffeDeveloper May 03 '16 at 17:00
  • 9
    We should stop upvoting ill-formed questions before getting the question right. The question is still not clear to me: you basically want to get those parameters without needing to pass them? You can create a wrapper that, if injected, provides the parameters indirectly. I just think that once the OP's requirements are clear, we can check whether what he wants is even possible. – CoffeDeveloper May 03 '16 at 17:06

12 Answers

37

On Linux, you can get this information from the process's proc file system, namely /proc/$$/cmdline:

pid_t pid = getpid();
char fname[PATH_MAX];
char cmdline[ARG_MAX];
snprintf(fname, sizeof fname, "/proc/%d/cmdline", (int)pid);
FILE *fp = fopen(fname, "r");   // fopen() requires a mode argument
if (fp != NULL) {
    fgets(cmdline, sizeof cmdline, fp);
    fclose(fp);
}
// the arguments are in cmdline, separated by NUL bytes
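Because the entries in the cmdline file are NUL-separated, the buffer has to be split on '\0' rather than treated as a single C string. A minimal C++ sketch of that step, assuming Linux with /proc mounted; the helper names split_nul and read_proc_cmdline are made up for illustration:

```cpp
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Split a buffer of NUL-terminated entries ("a\0b\0c\0") into strings.
std::vector<std::string> split_nul(const std::string& raw) {
    std::vector<std::string> out;
    std::string::size_type start = 0;
    while (start < raw.size()) {
        std::string::size_type end = raw.find('\0', start);
        if (end == std::string::npos) end = raw.size();
        out.push_back(raw.substr(start, end - start));
        start = end + 1;
    }
    return out;
}

// Linux-specific: returns an empty vector if the file cannot be read.
std::vector<std::string> read_proc_cmdline() {
    std::ifstream in("/proc/self/cmdline", std::ios::binary);
    std::string raw((std::istreambuf_iterator<char>(in)),
                    std::istreambuf_iterator<char>());
    return split_nul(raw);
}
```

The result of read_proc_cmdline() can be used like an argv copy. It reflects whatever the process's argument memory currently holds, so it may differ from the original arguments if the program has overwritten them.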
fluter
  • 7
    1. you can read /proc/self/cmdline 2. args could have been destroyed 3. it is super odd to deal with /proc files with FILE abstraction 4. ARG_MAX is only a limit for one argument, not all args in total 5. missing error checks, arguably could be omitted in the example. However, the biggest issue is that OP is likely trying to do something wrong and such answer like this should not be posted in the first place without further clarification. –  May 03 '16 at 14:30
  • 2
    I believe that /proc/*/cmdline is truncated to some max length constant. This is not a limitation of the size of the process command-line that the OS can start. Instead, this is a limitation of the process list record keeping in Linux. So you can start a process with a longer list of arguments, but the kernel is not going to remember them all for the purpose of trying to read arguments from the process list. – Noah Spurrier May 03 '16 at 19:42
  • The kernel does not memorize arguments. Instead it stores addresses for both start and end of said args and then reads them from target process address space. I don't see any code truncating the result either (unless the requested amount is too small of course). http://lxr.free-electrons.com/source/fs/proc/base.c#L199 –  May 03 '16 at 20:18
  • Trying to use this, but only getting the command name (with path, if was explicitly specified). So `cat /proc/self/cmdline x y z` returns `cat/proc/self/cmdlinexyz`, but `fgets` on that file returns just `cat`. Why is that? – Evgen Mar 20 '19 at 04:08
  • (answering myself) - Turns out the args are [NULL-separated](https://stackoverflow.com/questions/24127416/parsing-command-line-arguments-from-proc-pid-cmdline). Can't just use a string :( – Evgen Mar 20 '19 at 04:46
  • This is such a hack... Such an amazingly beautiful, delightful hack! – Sergey Kalinichenko Sep 08 '21 at 13:34
30

The arguments to main are defined by the C runtime, and they are the only standard/portable way to obtain the command line arguments. Don't fight the system. :)

If all you want to do is to provide access to command line parameters in other parts of the program with your own API, there are many ways to do so. Just initialise your custom class using argv/argc in main and from that point onward you can ignore them and use your own API. The singleton pattern is great for this sort of thing.

To illustrate, one of the most popular C++ frameworks, Qt uses this mechanism:

int main(int argc, char* argv[])
{
    QCoreApplication app(argc, argv);

    std::cout << app.arguments().at(0) << std::endl;

    return app.exec();
}

The arguments are captured by the app and copied into a QStringList. See QCoreApplication::arguments() for more details.

Similarly, Cocoa on the Mac has a special function which captures the command line arguments and makes them available to the framework:

#import <Cocoa/Cocoa.h>

int main(int argc, char *argv[])
{
    return NSApplicationMain(argc, (const char **)argv);
}

The arguments are then available anywhere in the app using the NSProcessInfo.arguments property.

I notice in your updated question that your class directly stores a copy of argc/argv verbatim in its instance:

int const argc_;
char const** const argv_;

While this should be safe (the argv pointers should remain valid for the full lifetime of the process), it is not very C++-like. Consider creating a vector of strings (std::vector<std::string>) as a container and copying the strings in. Then they can even be safely mutable (if you want!).
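A minimal sketch of that suggestion, reusing the Application name from the question; the arguments() accessor is an assumption made for illustration, not part of the original class:

```cpp
#include <string>
#include <vector>

class Application {
    std::vector<std::string> args_;  // owned copies, safe to keep or modify

public:
    // Copies every argv[i] into its own std::string.
    Application(int argc, char const* const argv[])
        : args_(argv, argv + argc) {}

    std::vector<std::string> const& arguments() const { return args_; }
};
```

From main this would be constructed as Application app(argc, argv); after that, nothing else needs to touch the raw pointers.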

I want to make a class that is independent of main() so that argc and argv don't have to be passed explicitly to the code that uses them.

It is not clear why passing this info from main is somehow a bad thing that is to be avoided. This is just how the major frameworks do it.

I suggest you look at using a singleton to ensure there is only one instance of your Application class. The arguments can be passed in via main but no other code need know or care that this is where they came from.
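One possible shape for this, as a sketch only: a small holder that main fills once and that any other code can read. The name ProgramArgs and its init/get functions are hypothetical.

```cpp
#include <string>
#include <vector>

class ProgramArgs {
public:
    // Call once, at the top of main().
    static void init(int argc, char const* const argv[]) {
        storage().assign(argv, argv + argc);
    }

    // Readable from anywhere afterwards; empty until init() has run.
    static std::vector<std::string> const& get() { return storage(); }

private:
    static std::vector<std::string>& storage() {
        // A function-local static avoids static-initialization-order issues.
        static std::vector<std::string> args;
        return args;
    }
};
```

main would start with ProgramArgs::init(argc, argv); and the rest of the program would call ProgramArgs::get() without knowing where the values came from.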

And if you really want to hide the fact that main's arguments are being passed to your Application constructor, you can hide them with a macro.

gavinb
  • Classes in C? Say it ain't so! – sfdcfox May 03 '16 at 18:09
  • Upvoting this. The typical method I see used for API's that want CL access to parse their own arguments (eg: QT) is to just ask for argv and argc. The nice thing about using the standard idiom is that experienced programmers will know what you are doing at a glance. – T.E.D. May 03 '16 at 19:24
  • I understand the idea of using a global (it's ugly, but it gets the job done), however WHY enforce uniqueness? What's wrong with allowing anyone to create a fake set of arguments and pass those instead? – Matthieu M. May 04 '16 at 07:57
  • It's exactly what I don't want. It may be wrong to find `argc` and `argv` using alternative means, but why should people be discouraged from doing so anyway? There are situations apart from the one I described, where it seems to be useful to be able to do so. – user1095108 May 04 '16 at 08:51
  • @MatthieuM. The singleton is merely to preserve the same semantics as the data it encapsulates; there is one set of read-only arguments. Not a strong requirement. – gavinb May 04 '16 at 12:14
  • From a strategic point of view, many SE users consider any singleton use to be code smell, and will downvote an otherwise good answer just for mentioning them. As someone who upvoted this answer, I think that would be a shame. If it's not central to your answer, there's no point in suggesting something controversial. – T.E.D. May 04 '16 at 12:41
  • @user1095108 You say that you don't want to access the arguments the standard way, but not *why*. Accessing the arguments from another part of the application is possible, regardless of how they are obtained in the first place. If you don't want to use the defined mechanism, you would need to circumvent the C runtime startup code, which opens a whole can of worms. – gavinb May 07 '16 at 06:23
  • @gavinb Interestingly, on macOS, `NSApplicationMain` actually ignores the passed in `argc` and `argv` and gets it directly from [`_NSGetArgc` and `_NSGetArgv`](https://developer.apple.com/documentation/appkit/1428499-nsapplicationmain). – saagarjha Mar 10 '18 at 05:49
  • @SaagarJha Interesting, yes I just looked up the implementation in https://opensource.apple.com/source/Libc/Libc-763.13/sys/crt_externs.c.auto.html . It relies on very specific and detailed knowledge of `dyld` and `libSystem` internals. – gavinb Mar 13 '18 at 02:01
18

I totally agree with @gavinb and others. You really should use the arguments from main and store them or pass them where you need them. That's the only portable way.

However, for educational purposes only, the following works for me with clang on OS X and gcc on Linux:

#include <stdio.h>

__attribute__((constructor)) void stuff(int argc, char **argv)
{
    for (int i=0; i<argc; i++) {
        printf("%s: argv[%d] = '%s'\n", __FUNCTION__, i, argv[i]);
    }
}

int main(int argc, char **argv)
{
    for (int i=0; i<argc; i++) {
        printf("%s: argv[%d] = '%s'\n", __FUNCTION__, i, argv[i]);
    }
    return 0;
}

which will output:

$ gcc -std=c99 -o test test.c && ./test this will also get you the arguments
stuff: argv[0] = './test'
stuff: argv[1] = 'this'
stuff: argv[2] = 'will'
stuff: argv[3] = 'also'
stuff: argv[4] = 'get'
stuff: argv[5] = 'you'
stuff: argv[6] = 'the'
stuff: argv[7] = 'arguments'
main: argv[0] = './test'
main: argv[1] = 'this'
main: argv[2] = 'will'
main: argv[3] = 'also'
main: argv[4] = 'get'
main: argv[5] = 'you'
main: argv[6] = 'the'
main: argv[7] = 'arguments'

This works because the stuff function is marked with __attribute__((constructor)), which makes it run when the object containing it is loaded by the dynamic linker. In the main program that means it runs even before main, in a similar environment, and on these platforms the startup code happens to pass it the same argc and argv. Therefore, you're able to get the arguments.
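Applied to the question's default constructor, the same trick could look roughly like the sketch below. It assumes GCC or Clang on a platform whose startup code passes argc and argv to constructor functions (glibc and OS X do; other C libraries may not, in which case this reads garbage). The captured() and capture_args names are made up for illustration.

```cpp
#include <string>
#include <vector>

namespace {
std::vector<std::string>& captured() {
    static std::vector<std::string> args;  // created on first use
    return args;
}

// Compiler extension, not standard C++. Receiving argc/argv here is a
// platform behaviour, not a guarantee.
__attribute__((constructor)) void capture_args(int argc, char** argv) {
    captured().assign(argv, argv + argc);
}
}  // namespace

class Application {
    std::vector<std::string> args_;

public:
    // Default constructor: copies what was captured before main() ran.
    Application() : args_(captured()) {}

    std::vector<std::string> const& arguments() const { return args_; }
};
```

The same caveats as for the C version apply: this is for experimentation, not for production code.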

But let me repeat: This is for educational purposes only and shouldn't be used in any production code. It won't be portable and might break at any point in time without warning.

Johannes Weiss
16

To answer the question in part, concerning Windows: the command line can be obtained, without explicit access to the arguments of the main function, as the return value of the GetCommandLine function, which is documented here.
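A short sketch of how this might be used from C++, pairing GetCommandLineW with CommandLineToArgvW to split the string into individual arguments. The function name command_line_args is hypothetical, and the non-Windows branch exists only so the sketch compiles elsewhere; there it simply returns an empty vector.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <shellapi.h>  // CommandLineToArgvW; link with shell32
#endif
#include <string>
#include <vector>

std::vector<std::wstring> command_line_args() {
    std::vector<std::wstring> args;
#ifdef _WIN32
    int argc = 0;
    LPWSTR* argv = CommandLineToArgvW(GetCommandLineW(), &argc);
    if (argv != nullptr) {
        args.assign(argv, argv + argc);  // copy before releasing the block
        LocalFree(argv);
    }
#endif
    return args;
}
```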

Codor
  • 6
    This is a popular answer. You may consider adding an example in-line of how to use `GetCommandLine`. – Cory Klein May 03 '16 at 15:50
  • 2
    Or just with [__argc, __argv and __wargv](https://msdn.microsoft.com/en-us/library/dn727674.aspx). – isanae May 03 '16 at 18:15
5

In Windows, if you need to get the arguments as wchar_t *, you can use CommandLineToArgvW():

int main()
{
    LPWSTR *sz_arglist;
    int n_args;
    int result;
    sz_arglist = CommandLineToArgvW(GetCommandLineW(), &n_args);
    if (sz_arglist == NULL)
    {
        fprintf(stderr, _("CommandLineToArgvW() failed.\n"));
        return 1;
    }
    else
    {
        result = wmain(n_args, sz_arglist);
    }
    LocalFree(sz_arglist);
    return result;
}

This is very convenient when using MinGW because gcc does not recognize int wmain(int, wchar_t **) as a valid main prototype.

jdarthenay
5

Passing values doesn't constitute creating a dependency. Your class doesn't care about where those argc or argv values come out of - it just wants them passed. You may want to copy the values somewhere, though - there's no guarantee that they are not changed (the same applies to alternate methods like GetCommandLine).

Quite the opposite, in fact - you're creating a hidden dependency when you use something like GetCommandLine. Suddenly, instead of a simple "pass a value" semantics, you have "magically take their inputs from elsewhere" - combined with the aforementioned "the values can change at any time", this makes your code a lot more brittle, not to mention impossible to test. And parsing command line arguments is definitely one of the cases where automated testing is quite beneficial. It's a global variable vs. a method argument approach, if you will.
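To make the testing point concrete, here is a small sketch of a parser helper that takes its arguments as parameters; has_flag is a hypothetical name. Because it never reaches for the real command line, a test can feed it any array it likes.

```cpp
#include <cstring>

// Pure function of its inputs: no hidden dependency on the real command line.
bool has_flag(int argc, char const* const argv[], char const* flag) {
    for (int i = 1; i < argc; ++i) {  // argv[0] is the program name
        if (std::strcmp(argv[i], flag) == 0) return true;
    }
    return false;
}
```

A test then only needs something like char const* fake[] = {"prog", "--verbose"}; and a call to has_flag(2, fake, "--verbose"), with no process launch involved.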

Luaan
  • But the OP states that this is in C, so there are no classes. You could pass them to an init function, and have the values stored in static variables in the module code. – jamesqf May 03 '16 at 17:29
  • @jamesqf Well, I'm just using the same name the OP used :) Passing the values is the important part, not how exactly it is implemented. – Luaan May 03 '16 at 17:54
3

In C/C++, if main() doesn't export them, then there isn't a direct way to access them; however, that doesn't mean there isn't an indirect way. Many POSIX-like systems use the ELF format, which passes argc, argv and envp on the stack in order to be initialized by _start() and passed into main() via the normal calling convention. This is typically done in assembly (because there is still no portable way to get the stack pointer) and put in a "start file", typically with some variation of the name crt.o.

If you don't have access to main() so that you can just export the symbols, you probably aren't going to have access to _start() either. So why do I even mention it? Because of that third parameter, envp: environ is a standard exported variable that gets set during _start() using envp. On many ELF systems, if you take the base address of environ and walk it backwards using negative array indices, you can deduce the argc and argv parameters. The first entry should be NULL, followed by the last argv parameter, and so on until you get to the first. When the pointed-to value, cast to long, equals the number of argv entries you have walked past, you have found argc, and the next slot (one index higher) is argv, i.e. &argv[0].
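The walk can be sketched as follows. This is an illustration of the idea only: it assumes the initial stack layout argc, argv[0..argc-1], NULL, envp[...], assumes environ still points at the original envp, and reads memory outside any array the language knows about. RawArgs and args_from_environ are hypothetical names, and the iteration limit is arbitrary.

```cpp
#include <cstddef>

extern "C" char** environ;

struct RawArgs {
    int argc;
    char** argv;
};

// Returns {0, nullptr} when the layout does not look as expected.
RawArgs args_from_environ() {
    RawArgs none = {0, nullptr};
    // The NULL that terminates argv should sit directly below envp[0].
    if (environ == nullptr || environ[-1] != nullptr) return none;

    char** slot = environ - 1;  // currently points at argv[argc]
    const long limit = 4096;    // arbitrary safety bound for this sketch
    for (long count = 0; count < limit; ++count) {
        // Below the argv entries walked so far lies either another argv
        // pointer or, once all have been passed, argc itself.
        if (reinterpret_cast<long>(slot[-1]) == count) {
            return {static_cast<int>(count), slot};
        }
        --slot;
    }
    return none;
}
```

The assumptions are easy to violate: the layout check fails, or worse, passes by accident, when the code lives in a library loaded by another runtime, and the C library may move environ once the environment has been modified (for example after setenv).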

technosaurus
  • I wrote an Android app in FreePascal, and got reports that it was crashing on Android 13. In Pascal there is a function that you can call to get the args at any time. Normally, FreePascal copies the args in its main. But on Android, it is not an executable, but a .so which is loaded by the JVM, so it cannot access the args. So FreePascal has to search the args later, and it does this exactly in this way by walking backwards from `environ`. And apparently this is causing the crash. It is a rather bad idea – BeniBela Jan 04 '23 at 17:59
2

I know of a few common scenarios with functions that require arguments of type int argc, char *argv[]. One obvious example is GLUT, whose initializing function takes over these arguments from main(), which is a kind of "nested main" scenario. This may or may not be your desired behavior. If not, since there is no convention for naming these arguments, as long as your function has its own argument parser and you know what you're doing, you can supply whatever you need, hardcoded:

int foo = 1;
char * bar[1] = {" "};

or read from user input or generated otherwise, AFAIK.

int myFunc(int foo, char *bar[]) {
    // argument parser
    {…   …}
    return 0;
}

Please, see this SO post.

Community
user3078414
2

The most portable way would be to use a global variable for storing the parameters. You can make this less ugly by using a Singleton (like your class in the question, but a singleton initialized by main) or, similarly, a Service Locator, which is basically the same thing: create an object in main, pass and store the params statically, and have another class (or the same one) access them.

Non-portable ways are using GetCommandLine on Windows, accessing /proc/<pid>/cmdline (or /proc/self/cmdline), or using compiler-specific extensions like __attribute__((constructor)).

Note that on Linux, getting the unparsed command line via an equivalent of GetCommandLine is not possible (TL;DR: the command line is not passed to the Linux kernel as a single string; it has already been parsed and split by the invoking process, e.g. the shell).

Flamefire
  • Actually, on Windows, as far as I've been able to tell, the command line for a process *is* passed to the kernel as a string. The Win32 API function family CreateProcess takes a string and seems to pass it through a few other functions to the raw system call untouched. The _exec and similar functions actually combine the vector they take into a single string (!) without even bothering to quote it (!!!) which is passed to CreateProcess. – Roflcopter4 Jul 22 '22 at 07:37
  • The link I posted refers to Linux. I edited the answer to clarify this, thanks for the info! – Flamefire Aug 04 '22 at 18:59
2

Here is a proper C++ way of doing so:

#include <iostream>
#include <fstream>
#include <iterator>
#include <unistd.h>
#include <sstream>
#include <string>
#include <vector>

using namespace std;


template <typename T>
string to_str(T value ){
    ostringstream ss;
    ss << value;
    return ss.str();
    }

int main(int argc, char** argv){
    ifstream reader("/proc/" + to_str(getpid()) + "/cmdline", ios::binary);
    vector<unsigned char> buffer(istreambuf_iterator<char>(reader), {});
    
    int length = buffer.size();
    for(int i = 0; i < length; i++){
        if(!buffer[i]){
            cout << endl;
            }else{cout << buffer[i];}
        }
    
    return 0;
    }
  • It's important to not just post code, but to also include a description of what the code does and why you are suggesting it. This helps others understand the context and purpose of the code, and makes it more useful for others who may be reading the question or answer – DSDmark Dec 22 '22 at 11:03
  • The code is just an interpretation of the C answer by @fluter written in c++ as the question was clearly asking for a c++ way. The rest is explained in the original answer. – Василий максимов Dec 29 '22 at 22:33
1

It sounds like what you want is a global variable; what you should do is just pass argc and argv as parameters.

1

An instructor had a challenge to use the gcc option nostartfiles and then try to access argc and argv. The nostartfiles option causes argc and argv to not be populated. This was my best solution for 64 bit Linux as it accesses argc and argv directly using the base pointer:

// Compile with gcc -nostartfiles -e main args.c -o args
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) // argc and argv are not available when compiled with -nostartfiles
{
    register void *rbp asm ("rbp");

    printf("argc is %ld\n", *(long *)(rbp + 8)); // %ld expects a signed long

    for(int count = 0 ; count < *(long *)(rbp + 8) ; count++)
    {
        printf("argv[%d] is %s\n", count, *(char **)(rbp + 16 + count * 8));
    }

    exit(0);
}