
I'm currently learning C by working through K&R's The C Programming Language, and have reached the point in the book where command-line arguments are discussed. In the book, the main routine would be written something like this:

int main(int argc, char *argv[])
{
    /* do something */
}

From my understanding, at some point the number of arguments passed to the program must be counted and stored in argc. Also, the arguments themselves must be stored, and pointers to the first characters of each are stored in the array argv, where argv[0] is a pointer to the name of the command and argv[argc] is a null pointer. These steps can't just magically occur, this behaviour must be defined somewhere!

As an example, imagine that I want to store the first character of each argument passed to the program, firstc, and discard the remainder of that argument (let's pretend that I had a really, really good reason for doing this). I could write main() like so:

int main(char firstc[])
{
    /* do something */
}

Clearly, this can already be done quite easily with the default argc and argv, and I wouldn't actually do it in practice. I can't even imagine a scenario in which this would actually be necessary, but I'm curious to know if it's possible.

So my (entirely theoretical, completely impractical) question is this: is it possible to define my own behaviour for the command line arguments? If it is, how would one go about doing so? If it's relevant, I'm using Ubuntu 16.04 and the GNOME Terminal.

P.S.

I just realized while writing this question that it is entirely possible (perhaps probable) that the C script is completely blind to what's going on outside, and that the terminal emulator is what prepares the command-line arguments for the C program.

kruschk
  • It's not actually the terminal emulator, it's the shell. The shell is a program that reads command lines from the terminal emulator, parses those command lines, and then launches other programs as needed. – user3386109 Aug 18 '16 at 05:10
  • See [What should `main()` return in C and C++?](http://stackoverflow.com/questions/204476/) which also discusses arguments etc, and there are quotes from the standard which should help you. Then look up the [`execve()`](http://pubs.opengroup.org/onlinepubs/9699919799/functions/execve.html) and related POSIX functions. These are what pass arguments on to executed programs on Unix-like (POSIX-compliant) systems — and the behaviour has to be emulated on other systems. The `exec*()` functions prevent you from using an alternative interface on Unix-like systems. – Jonathan Leffler Aug 18 '16 at 05:26
  • Thanks for the correction, user3386109, I think that this is an important distinction. Jonathan, I'll definitely be looking into those links that you posted, thank you! – kruschk Aug 19 '16 at 00:16
  • Thank-you to everyone who took the time to post answers and comments, I appreciate it! I up-voted the posts that I found useful and marked the one that I thought was the most complete as the answer. – kruschk Aug 19 '16 at 00:17

4 Answers


The setup of arguments is not actually within the purview of the C standard; the standard simply dictates the allowable forms of main that you can use. There are two canonical forms (assuming a hosted implementation): the argc/argv form and the void form, although an implementation is free to provide others.

Typically, there is code that runs before main is called, such as from startup code in an object file like crt0.o.

However, as stated, the standard doesn't dictate anything that happens at that stage; it's the responsibility of the "environment" to set things up correctly so that main can be called.

In terms of doing what you request, I suspect the easiest solution would be to provide a main taking the canonical form and simply call a myMain with the first character of each argument, though you would probably need to intelligently handle any number of arguments that may be given to main.

An example follows which can handle between zero and two arguments (that is, argc values of one to three):

#include <stdio.h>

int myMain0(void) {
    printf ("myMain0\n");
    return 0;
}

int myMain1(char p1) {
    printf ("myMain1 [%c]\n", p1);
    return 0;
}

int myMain2(char p1, char p2) {
    printf ("myMain2 [%c] [%c]\n", p1, p2);
    return 0;
}

int main(int argc, char *argv[]) {
    switch (argc) {
        case 1: return myMain0();
        case 2: return myMain1(argv[1][0]);
        case 3: return myMain2(argv[1][0], argv[2][0]);
    }
    printf ("Invalid argument count of %d\n", argc - 1);
    return 1;
}
paxdiablo
  • The specification states that *"The parameters argc and argv and the strings pointed to by the argv array shall be modifiable by the program"*. Which means that this is legal: `for ( int i=1; i – user3386109 Aug 18 '16 at 06:20
  • Yes, assuming you don't need the original arguments, that approach would be fine. It won't let you declare `main` without an `argc` or with `char`s rather than `char**` but it will give a similar effect. – paxdiablo Aug 18 '16 at 06:26

The operating system is the one that passes the number of arguments and the arguments themselves from the command line to the C program.

In practice, the function main is not picky about its parameters. You can declare it with no parameters at all, with only argc, or with both argc and argv.

You can even declare 3 or 4 parameters with whatever types you want, but they will contain garbage. The operating system will always pass the number of arguments and the argument strings as an int and an array of pointers to strings.

Israel Unterman
  • Note that on most Unix-like systems, a third argument, `char **envp`, is passed to `main()` anyway (even if your code doesn't use it, just as the arguments are passed even if you used `int main(void)`, although you can't access the arguments if you said they don't exist). The `envp` is like `argv`: it points to a series of pointers to the environment variable strings. Apple on Mac OS X provides a 4th argument; I forget what it provides. I believe it is another `char **` variable. – Jonathan Leffler Aug 18 '16 at 05:30
  • _[…continuing…]_ The fourth argument on Mac OS X is another `char **` and I get the executable path as the first entry: `executable_path=./a4` (for a program `a4`), plus three empty strings (whose value escapes me completely) and then a null pointer. – Jonathan Leffler Aug 18 '16 at 05:36
  • [`int main(int argc, char **argv, char **envp, char **apple);`](https://en.wikipedia.org/wiki/Entry_point#C_and_C.2B.2B) – phuclv Aug 18 '16 at 05:37

Your main() declaration does not actually define what parameters you get. That is the responsibility of the program's environment: the operating system and the calling program, usually the command line processor.

The caller (typically a shell program) prepares the parameters and passes them to the appropriate operating system routine for launching the program. That OS routine makes the data available to the callee, typically on the stack, and jumps to your program's entry point, which in turn calls the main function.

Your main() declaration just declares what your main expects on the stack, which indirectly defines how you can use those data, but not what they are. That's also why you can declare main without parameters as main(void) – that simply means 'whatever is passed to me, I'll ignore it anyway'.

CiaPan

There are standards (like ANSI C, C89, etc.) that define the rules for main and impose a set of restrictions, and there are conventions that do not violate the standards and give you some extra possibilities.

First, I have one more example for you:

#include <stdio.h>

int main(int argc, char * argv[], char * envs[])
{
    int i = 0;  /* start at the first environment string */
    while (envs[i] != NULL)
    { 
        printf("%d : %s\n", i, envs[i]);
        i++;
    }
    return 0;
}

Try it and see how this third argument of main can be useful. It is a common extension on Unix-like systems rather than something the C standard requires.

Also, I want to explain my approach to processing command-line arguments. I write a ParseCommandLine (or EvaluateParameters) function and call it at the beginning of main. That function analyzes the strings from the command line and stores all the settings for convenient use later. E.g. if I expect my program to be run as

  prog.exe -i input_file_name -o output_file_name -e

I will do something like:

#include <stdbool.h>  /* bool, true, false */
#include <string.h>
#include <stdio.h>

#define FNAME_LEN 20

struct settings
{
    char inpFName[FNAME_LEN];
    char outFName[FNAME_LEN];
    bool isEncoded;
} globalSettings;

bool ParseCommandLine(int argc, char * argv[])
{
    /* step two at a time: each option is normally followed by its value */
    for (int c = 1; c < argc; c += 2)
    {
        if (!strcmp(argv[c], "-i") && c < argc - 1)
        {
            strncpy(globalSettings.inpFName, argv[c + 1], FNAME_LEN - 1);
            continue;
        }
        if (!strcmp(argv[c], "-o") && c < argc - 1)
        {
            strncpy(globalSettings.outFName, argv[c + 1], FNAME_LEN - 1);
            continue;
        }
        if (!strcmp(argv[c], "-e"))
        {
            globalSettings.isEncoded = true;
            c--;  /* -e takes no value, so the loop's c += 2 should advance by one only */
            continue;
        }
    }
    // rules to check mandatory values
    if (strlen(globalSettings.inpFName) == 0 || strlen(globalSettings.outFName) == 0)
    {
        return false;
    }
    return true;
}

int main(int argc, char * argv[])
{
    if (ParseCommandLine(argc, argv))
    {
        // do something
    }
    else
    {
        // explain how to run program
    }
    return 0;
}
VolAnd