13

I'm new to C and I completed a small exercise that iterates through the letters in an argument passed to it and identifies the vowels. The initial code only worked for one argument (argv[1]). I want to expand it to be able to iterate through all arguments in argv[] and repeat the same process of identifying vowels.

The code:

#include <stdio.h>

int main(int argc, char *argv[])
{
    if (argc < 2) {
        printf("ERROR: You need at least one argument.\n");
        return 1;
    }

    if (argc == 2) {
        int i = 0;
        for (i = 0; argv[1][i] != '\0'; i++) {
            char letter = argv[1][i];

            if (letter == 'A' || letter == 'a') {
                printf("%d: 'A'\n", i);
                //so on
            }
        }
    } else {
        int i = 0;
        int t = 2;
        for (t = 2; argv[t] != '\0'; t++) {
            for (i = 0; argv[t][i] != '\0'; i++) {
                char letter = argv[t][i];

                if //check for vowel
            }

        }

        return 0;
    }
}

I read this answer and it seems the best solution is to use pointers, a concept I'm still a bit shaky with. I was hoping someone could use the context of this question to help me understand pointers better (by explaining how using pointers in this instance solves the problem at hand). Many thanks in advance.

jwdonahue
Jon Behnken

4 Answers

28

I was hoping someone could use the context of this question to help me understand pointers better....

In context of your program:

int main(int argc, char *argv[])

First, understand what argc and argv are here.

argc (argument count): the number of arguments passed into the program from the command line, including the name of the program.

argv (argument vector): an array of character pointers pointing to the string arguments passed.

A couple of points about argv:

  • The string pointed to by argv[0] represents the program name.

  • argv[argc] is a null pointer.

For better understanding, let's consider an example:

Say you are passing some command line arguments to a program -

# test have a nice day

test is the name of the executable file, and have, a, nice and day are the arguments passed to it. In this case, the argument count (argc) will be 5.

The in-memory view of the argument vector (argv) will be something like this:

      argv                       --
        +----+    +-+-+-+-+--+     |
 argv[0]|    |--->|t|e|s|t|\0|     |
        |    |    +-+-+-+-+--+     |
        +----+    +-+-+-+-+--+     |
 argv[1]|    |--->|h|a|v|e|\0|     |
        |    |    +-+-+-+-+--+     |
        +----+    +-+--+           |
 argv[2]|    |--->|a|\0|            > Null terminated char array (string)
        |    |    +-+--+           |
        +----+    +-+-+-+-+--+     |
 argv[3]|    |--->|n|i|c|e|\0|     |
        |    |    +-+-+-+-+--+     |
        +----+    +-+-+-+--+       |
 argv[4]|    |--->|d|a|y|\0|       |
        |    |    +-+-+-+--+       |
        +----+                   --
 argv[5]|NULL|
        |    |
        +----+

A point to note about a string (a null-terminated character array): when used in an expression, it decays into a pointer to its first character, which has type char *.

Since argv (argument vector) is an array of pointers to the string arguments passed:

argv+0 --> will give address of first element of array.
argv+1 --> will give address of second element of array.
...
...
and so on.

We can also get the address of the first element of the array like this - &argv[0].

That means:

argv+0 and &argv[0] are same.

Similarly,

argv+1 and &argv[1] are same.
argv+2 and &argv[2] are same.
...
...
and so on.

When you dereference them, you will get the string they are pointing to:

*(argv+0) --> "test"
*(argv+1) --> "have"
....
....
and so on.

Similarly,

*(&argv[0]) --> "test"

*(&argv[0]) can also be written as argv[0].

which means:

*(argv+0) can also be written as argv[0].

So,

*(argv+0) and argv[0] are same
*(argv+1) and argv[1] are same
...
...
and so on.

When printing them:

printf ("%s", argv[0]);   //---> print "test"
printf ("%s", *(argv+0)); //---> print "test"
printf ("%s", argv[3]);   //---> print "nice"
printf ("%s", *(argv+3)); //---> print "nice"

And since the last element of the argument vector is NULL, accessing argv[argc] gives NULL.

To access characters of a string:

argv[1] is a string --> "have"
argv[1][0] represents first character of string --> 'h'

As we have already seen:
argv[1] is same as *(argv+1)

So,
argv[1][0] is same as *(*(argv+1)+0)

To access the second character of string "have", you can use:

argv[1][1] --> 'a'
or,
*(*(argv+1)+1) --> 'a'

I hope this will help you out in understanding pointers better in context of your question.

To identify the vowels in the arguments passed to the program, you can do:

#include <stdio.h>

int main(int argc, char *argv[])
{

    if (argc < 2) {
            printf("ERROR: You need at least one argument.\n");
            return -1;
    }

    for (char **pargv = argv+1; *pargv != argv[argc]; pargv++) {
            /* Explanation:
             * Initialization -
             * char **pargv = argv+1; --> pargv points to the second element of argv
             *                            The first element of the argument vector is the program name
             * Condition -
             * *pargv != argv[argc]; --> *pargv is the current element of the argv array
             *                            argv[argc] is a null pointer
             *                            So, the condition is *pargv != NULL
             *                            It is spelled out (*pargv != argv[argc]) for your understanding
             *                            Using only *pargv as the condition is also okay
             * Loop iterator increment -
             * pargv++
             */

            printf ("Vowels in string \"%s\" : ", *pargv);
            for (char *ptr = *pargv; *ptr != '\0'; ptr++) {
                    if (*ptr == 'a' || *ptr == 'e' || *ptr == 'i' || *ptr == 'o' || *ptr == 'u'
                    ||  *ptr == 'A' || *ptr == 'E' || *ptr == 'I' || *ptr == 'O' || *ptr == 'U') {
                            printf ("%c ", *ptr);
                    }
            }
            printf ("\n");
    }

    return 0;
}

Output:

#./a.out have a nice day
Vowels in string "have" : a e 
Vowels in string "a" : a 
Vowels in string "nice" : i e 
Vowels in string "day" : a
H.S.
  • This is incredibly insightful and helpful -- exactly the type of thing I was looking for. I'm going to have to read this answer a few times over just to digest everything properly but can I start by asking this -- if `argv[1]` represents the **address** in memory of the argument `test`, why does `argv[1] --> test`? Wouldn't `argv[1]` return the integer value of the address in memory, not the string value itself? – Jon Behnken Jan 19 '18 at 18:49
  • @J.Behnken: _"why does argv[1] --> test? "_ In the context of my answer, `argv[1] --> have` and not `test`. Read my answer carefully, I have already answered it - `argv[1]` can also be written as `*(argv+1)`, which means dereferencing the pointer `argv+1`. To get the address of the second element of `argv`, you can do `&argv[1]`. – H.S. Jan 19 '18 at 19:01
  • Right, of course, apologies for that mix-up. So `*(argv+1)` which can be written as `argv[1]` is a pointer to the _value_ located at memory address of `argv+1`? To get the integer value of the address in memory of `argv[1]` we write `&argv[1]`? – Jon Behnken Jan 19 '18 at 19:11
  • `argv[1]` is a pointer to the string "have". That means `argv[1]` holds the base address of the _null-terminated char array "have"_. Check the diagram I have drawn in my answer. `argv+1` is the address of the second element of the array `argv`, which is an array of pointers. – H.S. Jan 19 '18 at 19:28
  • Your code invokes UB. It should be `for (char **pargv = argv + 1; pargv < argv + argc; pargv++) {` – 0___________ Nov 22 '22 at 00:08
  • @0___________ _Your code invokes UB._ -> Please explain, how is it? – H.S. Nov 22 '22 at 01:50
  • @0___________ There is no UB in my code. From C11 §5.1.2.2.1p2 - `argv[argc] shall be a null pointer.`. What my program is doing - initialising a pointer `pargv` to the second element of the array `argv` and, if the pointer at this location is not `NULL`, then executing the loop body and moving `pargv` to the next element of `argv`. This repeats while `*pargv` is not equal to `argv[argc]`, i.e. while `*pargv != NULL`. @0___________ Either justify why you think my code invokes UB, or delete your comment. – H.S. Nov 25 '22 at 06:20
  • maybe it doesn't invoke UB but it's less performant as it has to check the memory pointed to, while the proposed alternative checks the pointer value – Jean-François Fabre Dec 24 '22 at 17:47
  • maybe, the question is not about performance but about `argv` and, in my program, I wanted to demonstrate about `argv`. I agree, loop condition comparing pointers can also work here. – H.S. Jun 02 '23 at 02:44
4

You can use nested for loops to loop through all the arguments. argc tells you the number of arguments, while argv holds an array of pointers to the argument strings. I also used the strlen() function from <string.h>, which tells you how long a string is. This way you can check any number of arguments. Your if statement can also be reduced to two cases: either argc is less than 2, or it is not.

#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[]) {
    if (argc < 2) {
        printf("ERROR: You need at least one argument.\n");
        return 1;
    } else {
        int i, x;
        int ch = 0;
        for (i=1; i<argc; i++) {
            for (x = 0; x < strlen(argv[i]); x++) {
                ch = argv[i][x];
                if (ch == 'A' || ch == 'a' || ch == 'e')
                    printf("Vowel\n");
            }
        }
    }
}

Python Equivalent for the nested loop

for i in range(1, argc):
    for x in range(0, len(argv[i])):
        ch = argv[i][x]
        if ch in ('A', 'a', 'e'):
            print('Vowel')
thuyein
  • 1
    You should probably start the outer `for` loop at 1 to skip the command name (otherwise, there's not much virtue in demanding at least 2 arguments). You should ideally not call `strlen()` in the condition of the inner loop. – Jonathan Leffler Jan 19 '18 at 06:00
  • 1
    @JonathanLeffler Thanks. I forgot to take into account of the first argv. May I ask what is the reasoning behind not calling `strlen()` in loop condition? – thuyein Jan 19 '18 at 07:36
  • 1
    It depends on whether the compiler can recognize that the string doesn't change, but it takes time to compute the length of a 20 KiB string, and if you do that on every iteration, your supposedly linear algorithm becomes quadratic. (Once upon a long time ago (c. 1990), I came across a `strstr()` implementation which had a bug like this in it — and my program was scanning 20 KiB buffers. I ended up writing my own surrogate function for that platform until the bug was fixed, later the same year. This was in the days before such functions were builtins — if, indeed, `strstr()` is a built-in now.) – Jonathan Leffler Jan 19 '18 at 07:47
  • Thank you for this answer. Can you just walk me through the nested loop with a bit more precision? I'm coming from Python so I have a fairly strong grasp of how nested loops work, I'm just hung up on the syntax. Starting with the inner loop, while `x` is less than the length of the string in `argv[i]`, identify the vowel, then increment `x`. Once `x` is no longer less than the length of the string, return to the outer loop, which will then move on to the next argument in `argv[]` (by incrementing `i`) -- is that correct? – Jon Behnken Jan 19 '18 at 18:09
  • I've added a python code for that for loop. I hope it's easier to understand. First loops through the first dimension of the 2-dimensional array. The second goes through the second dimension, which is the length of each argument value. `strlen` will return you the number of characters in the current loop argument. @J.Behnken – thuyein Jan 20 '18 at 13:19
1

While you can use multiple conditional expressions to test if the current character is a vowel, it is often beneficial to create a constant string containing all possible members of a set to test against (vowels here) and loop over your string of vowels to determine if the current character is a match.

Instead of looping, you can simply use your constant string as the string to test against in a call to strchr to determine if the current character is a member of the set.

The following simply uses a loop and a pointer to iterate over each character in each argument, and in like manner to iterate over each character in our constant string char *vowels = "aeiouAEIOU"; to determine if the current character is a vowel (handling both lower and uppercase forms).

#include <stdio.h>

int main (int argc, char *argv[]) {

    int i, nvowels = 0;
    char *vowels = "aeiouAEIOU";

    if (argc < 2) {
        fprintf (stderr, "ERROR: You need at least one argument.\n");
        return 1;
    }

    for (i = 1; i < argc; i++) {    /* loop over each argument        */
        char *p = argv[i];          /* pointer to 1st char in argv[i] */
        int vowelsinarg = 0;        /* vowels per argument (optional) */
        while (*p) {                /* loop over each char in arg     */
            char *v = vowels;       /* pointer to 1st char in vowels  */
            while (*v) {            /* loop over each char in vowels  */
                if (*v == *p) {     /* if char is vowel */
                    vowelsinarg++;  /* increment number */
                    break;          /* bail */
                }
                v++;                /* increment pointer to vowels */
            }
            p++;                    /* increment pointer to arg */
        }
        printf ("argv[%2d] : %-16s (%d vowels)\n", i, argv[i], vowelsinarg);
        nvowels += vowelsinarg;     /* update total number of vowels */
    }
    printf ("\n  Total: %d vowels\n", nvowels);

    return 0;
}

Example Use/Output

$ ./bin/argvowelcnt My dog has FLEAS.
argv[ 1] : My               (0 vowels)
argv[ 2] : dog              (1 vowels)
argv[ 3] : has              (1 vowels)
argv[ 4] : FLEAS.           (2 vowels)

  Total: 4 vowels

If you decide to use the strchr function from string.h to check whether the current char is in your set of vowels (which requires adding #include <string.h>), your inner loop over each character reduces to:

        while (*p) {                /* loop over each char in arg     */
            if (strchr (vowels, *p))    /* check if char is in vowels */
                vowelsinarg++;          /* increment number */
            p++;                        /* increment pointer to arg */
        }

Look things over and let me know if you have further questions.

David C. Rankin
0

If anyone else is coming to this from Learn C The Hard Way, the answers above using pointers get a bit far ahead of us, as we won't be covering pointers until chapter 15. However, you can do this piece of extra credit with what we've learned so far. I won't spoil the fun, but you can use a for loop over the arguments with a nested for loop over the letters in each word; getting a letter is then as simple as argv[j][i], where j is the jth argument passed and i is the ith letter in it. No need to include extra header files or use pointers. I have to say I did come back to H.S.'s answer when I got to pointers in chapter 15.

CClarke