5

I'm writing an implementation of the Moore Voting algorithm for finding the majority element (i.e. the element which occurs more than size/2 times) in an array. The code should return the majority element if it exists or else it should return -1. Now my version of the majorityElement(int size, int arr[]) seems to work perfectly fine if I directly hardcode the integer array in the main() function and invoke it from there.

int majorityElement(int size, int arr[])
{
    int majorityindex = 0;
    int votes = 1;
    int index;
    for (index = 1; index < size; index++)
    {
        if (arr[index] == arr[majorityindex])
            votes++;
        else 
            votes--;
        if (votes == 0)
        {
            majorityindex = index;
            votes = 1;
        }
    }
    int count = 0;
    int i;
    for (i = 0; i < size; i++)
    {
        if(arr[majorityindex] == arr[i])
        count++;
    }
    if (count > (size/2))
        return arr[majorityindex];
    return -1;    
}

However, I'm facing some issues if I try to read an input stream like these:

2
5
3 1 3 3 2
3
1 2 3

The first line of the input contains the number of test cases. The first line of the test case will be the size of the array and the second line will be the elements of the array.

I tried to read the input stream from within the main() function like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX 100
int majorityElement(int size, int arr[]);

int main() 
{
   char buf[3];
   fgets(buf, MAX, stdin);
   int n = atoi(buf);
   char a[3];
   char b[MAX];
   int i;
   int count;
   int* num;
   for (i = 0; i < n; i++)
   {
    count = 0; 
    fgets(a, MAX, stdin);
    fgets(b, MAX, stdin);
    int x = atoi(a);
    char* num[x];
    int arr[x];
    int k = 0;
    char* token = strtok(b, " ");
      while (token != NULL)
        { 
          num[k] = token; 
          arr[k] = atoi(num[k]);
          token = strtok(NULL, " "); 
          k++;
        }
    printf("%d\n", majorityElement(x, arr));
    }

   return 1;
}

I took the size of buf[] and a[] as 3 during declaration as they must have sufficient space for the \n character read by fgets() as well as the terminating \0 character. As far as I know, the atoi() function ignores the \n character while converting the character array (string) into an integer. I tried to store the first entry of the input (i.e. the number of entries) in a character array buf, converted it into a string and stored it in a variable n. Similarly, I tried to obtain the size of a test array in a variable x and the test arrays (second line of test case) in an integer array arr. While buf and n seem to obtain the correct values in all cases, I'm not quite sure about arr. I'm aware that fgets() leaves a terminal \n character and that might be causing some havoc during tokenization using strtok, although I can't finger at why. I tried submitting this code on GeeksForGeeks. It gives absolutely correct outputs for the sample test case:

2
5
3 1 3 3 2
3
1 2 3

that is

3
-1

However, when I try to "submit" my solution it says:

Possibly your code doesn't work correctly for multiple test-cases (TCs).

The first test case where your code failed:

    Input:
    4
    1 2 2 1

    Its Correct output is:
    -1

    And Your Code's output is:
    1

I can't seem to make sense of this. If I manually write this in stdin:

1
4
1 2 2 1

the code outputs

-1

which is indeed the correct solution. This doesn't match with the output claimed during the submission i.e. 1. So I'm not really sure where I'm going wrong. Have I used fgets() or strtok() incorrectly in the main() function? Or is it something else?


Updated the main() function according to suggestions in the comments.

int main() 
{
   char buf[MAX];
   fgets(buf, MAX, stdin);
   int n = atoi(buf);
   char a[MAX];
   char b[MAX];
   int i;
   int count;
   int* num;
   for (i = 0; i < n; i++)
   {
    count = 0; 
    fgets(a, MAX, stdin);
    fgets(b, sizeof(a), stdin);
    a[sizeof(a)-1] = '\0';
    b[sizeof(b)-1] = '\0';
    int x = atoi(a);
    int arr[x];
    int k = 0;
    char* token = strtok(b, " ");
      while (token != NULL)
        { 
          if (k > x)
          break;
          arr[k] = atoi(token);
          token = strtok(NULL, " "); 
          k++;
        }
    printf("%d\n", majorityElement(x, arr));
    }

   return 1;
}

As pointed out by @Vlad, the MAX was set too low in my original array. The question says that the number of entries in an array is upper bounded by 10^7 and each array entry is upper bounded by 10^6 (7 digits). So MAX needs to be of the order 10^8. According to the suggestions in the comments, I'm now using dynamic allocation instead of variable length arrays.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX 10000000

int majorityElement(int size, int arr[])
{
    int majorityindex = 0;
    int votes = 1;
    int index;
    for (index = 1; index < size; index++)
    {
        if (arr[index] == arr[majorityindex])
            votes++;
        else 
            votes--;
        if (votes == 0)
        {
            majorityindex = index;
            votes = 1;
        }
    }
    int count = 0;
    int i;
    for (i = 0; i < size; i++)
    {
        if(arr[majorityindex] == arr[i])
        count++;
    }
    if (count > (size/2))
        return arr[majorityindex];
    return -1;    
}

int main() 
{
   char* buf = calloc (MAX, sizeof(char));
   fgets(buf, MAX, stdin);
   int n = atoi(buf);
   char* a = calloc (MAX, sizeof(char));
   char* b = calloc(MAX, sizeof(char));
   int i;
   for (i = 0; i < n; i++)
   {
    fgets(a, MAX, stdin);
    fgets(b, MAX, stdin);
    a[strlen(a)-1] = '\0';
    b[strlen(b)-1] = '\0';
    int x = atoi(a);
    int *arr = calloc(x, sizeof(int));
    int k = 0;
    char* token = strtok(b, " ");
      while (token != NULL)
        { 
          if (k > x)
          break;
          arr[k] = atoi(token);
          token = strtok(NULL, " "); 
          k++;
        }
    printf("%d\n", majorityElement(x, arr));
    free(arr)
    }
   free(buf);
   free(a);
   free(b);
   return 1;
}

If I set MAX to 10^7 then the code passes all the test cases and is accepted for submission. However, if I set MAX to 10^8 (as required), I get a segmentation fault. How to overcome this?

  • 2
    Your use of `fgets` is wrong, because you always use `MAX` as the size, no matter the actual size of the buffer. – Some programmer dude Aug 23 '19 at 12:11
  • @Someprogrammerdude Does that really matter though? `fgets()` automatically stops reading after it encounters the first newline `\n` character in stdin. I took `100` because as a general maximum for the numbers of characters in a single line. Cf. https://stackoverflow.com/a/2879610 –  Aug 23 '19 at 12:14
  • On an unrelated note, the array `num` isn't needed. You could do `arr[k] = atoi(token);` directly. – Some programmer dude Aug 23 '19 at 12:17
  • The array `a[]` can store 3 characters because the declaration was `char a[3]`. Now if the first line of a test contains say `7`, `fgets()` will read that, encounter a newline `\n` and then store `'7'` in `a[0]` and `'\n'` in `a[1]`. Although, I guess I should have manually inserted a `\0` in `a[2]` to terminate the character array (string) (?). –  Aug 23 '19 at 12:19
  • 2
    I'm sorry, I thought about the wrong array. But still, you should never use a non-matching size. Use e.g. `sizeof a` instead (buffer overflows is a thing you really need to take care of). – Some programmer dude Aug 23 '19 at 12:20
  • @Someprogrammerdude I see. Then what should I write as the size of `a[]` during declaration...`char a[1];`? And `fgets(a, 2, stdin);`? Or, do I have to include an extra space for the terminating null during the declaration itself i.e. `char a[2];`, and thus `fgets(a, 3, stdin);` and `a[2]='\0'`? –  Aug 23 '19 at 12:26
  • No no, your definition of `a` is okay for one-digit input (the size in the definition needs to be at least `3`). But you should use `sizeof a` in the call to `fgets`. As in `fgets(a, sizeof a, stdin)`. I think you're overthinking this, especially in relation to `fgets` and what it writes to your arrays. – Some programmer dude Aug 23 '19 at 12:28
  • 2
    Your loop should throw an error if `k > x` – stark Aug 23 '19 at 12:31
  • @Someprogrammerdude Umm, I don't understand why `a`'s size needs to be *at least* `3` for a `1` digit input. `fgets(a, n, stdin)` will read only `n-1` characters from `stdin`. If I don't want to read the newline character `\n` I could directly take `n = 2` and use `fgets(a, 2, stdin)`. Then `a[0]` contain the single-digit character and `a[1]` would contain `\0` if the declaring was `char a[2]`. By the way, the question did not specify that the number of elements in a test-case array will necessarily be single-digit...so that might be the source of error. –  Aug 23 '19 at 12:34
  • One for the digit, one for the newline, one for the terminator. Three. If you don't include space for the newline, it will be left in the input buffer to be read by the next call to `fgets`, which will read it as an "empty" line. And if there might be numbers larger than `9` in the input, then why skimp with such small buffers? Use `MAX` for *all* strings. – Some programmer dude Aug 23 '19 at 12:37
  • @Someprogrammerdude Ah, I see. By the way, it's not guaranteed that the number of elements in a test case array will be expressible by a single-digit integer....so even declaring the size of `a` as `3` would not be sufficient I guess... –  Aug 23 '19 at 12:42
  • Yup, I now used MAX for all strings and also wrote `if(k>x) break;` within `while` as suggested by @stark. However, it still isn't getting accepted for submission and still shows the same error. –  Aug 23 '19 at 12:43
  • @Someprogrammerdude I've updated the modified `main()` in the OP. Any more suggestions? Can't figure out the error. Are the trailing newline characters doing something? –  Aug 23 '19 at 12:49
  • @Blue I can not reproduce the wrong output. Using your original code and providing that each input line starts from the first position I got the expected result -1. So it is possible that input lines do not start from the first position. – Vlad from Moscow Aug 23 '19 at 13:01
  • @Vlad I can't reproduce the wrong output either. However, it's not getting accepted on the GeeksForGeeks challenge I linked, which is strange. (Link: https://practice.geeksforgeeks.org/problems/majority-element/0) Perhaps the GFGs compiler is faulty? I don't know the complete set of test cases they are using though. –  Aug 23 '19 at 13:03
  • @Blue So why are you saying that you got a wrong result when it is correct? Show the test that you did not pass. – Vlad from Moscow Aug 23 '19 at 13:08
  • @Vlad Please read my question again. The test case passes when given directly via stdin, but **GFGs claims** that it doesn't pass that same test case during submission. So I'm confused. Please read my question again...I've written this clearly –  Aug 23 '19 at 13:14
  • @Blue In fact your function is incorrect because -1 can be a valid value of the array. – Vlad from Moscow Aug 23 '19 at 13:16
  • Did you notice the "Submit" button on the GFGs page I linked? I mean that when I say "during submission". –  Aug 23 '19 at 13:18
  • @Vlad As far as I understand, the question only deals with positive integers. Hence they ask you to return -1 if no majority element exists. –  Aug 23 '19 at 13:19
  • Ick. Why are you using fgets and strtok at all? For this problem, do freads in a loop and parse the data with strtol. It's probably easier to use getchar and compute the integers than it is to use strtok and fgets. I don't think I've ever advocated the use of scanf, but even that is probably better than strtok in this case. – William Pursell Aug 23 '19 at 14:57
  • https://stackoverflow.com/questions/1488569/atoi-string-to-int – William Pursell Aug 23 '19 at 14:59
  • When you pass the incorrect size of the buffer to fgets and you do not control the input (and when your code runs on GFG, you don't control the input), you have no way of knowing what is happening. Fix that error first. – William Pursell Aug 23 '19 at 15:01

1 Answers1

0

Your program has several drawbacks.

For example within the function main there are unused variables declared like

int count;
int* num;

The function does take into account that -1 can be a valid value of the array.

There is a problem with the number of elements that can be specified in a test. It is a very big number (according to the description 1 <= N <= 10000000). So the value of MAX equal to 100 is too low. As a result the data can be read incorrectly and not completely. Also there can occur problems with the variable length arrays.

There is no need to use the function fgets because each integer number can be read using scanf.

I could suggest the following solution. Try it and see whether it will pass the tests.

#include <stdio.h>
#include <stdlib.h>

size_t majorityElement( const int a[], size_t n )
{
    size_t majority_index = 0;

    for ( size_t i = 1, votes = 1; i < n; i++ )
    {
        if ( a[majority_index] == a[i] )
        {
            ++votes;
        }
        else
        {
            --votes;
        }

        if ( votes == 0 )
        {
            majority_index = i;
            ++votes;
        }
    }

    size_t count = 0;

    for ( size_t i = 0; i < n; i++ ) count += a[i] == a[majority_index];

    return n / 2 < count ? majority_index : n;
}

int main(void) 
{
    size_t n = 0;

    scanf( "%zu", &n );

    for ( size_t i = 0; i < n; i++ )
    {
        size_t m = 0;

        scanf( "%zu", &m );

        if ( m != 0 )
        {
            int *a = calloc( m, sizeof( int ) );

            for ( size_t j = 0; j < m; j++ ) scanf( "%d", a + j );

            size_t majority_index = majorityElement( a, m );

            printf( "%d\n", majority_index == m ? -1 : a[majority_index] );

            free( a );
        }           
    }

    return 0;
}

If it will not pass the tests then it seems there is a bug in tests.:)

Or if the function return type may not be changed then the function definition can look like

int majorityElement( const int a[], size_t n )
{
    size_t majority_index = 0;

    for ( size_t i = 1, votes = 1; i < n; i++ )
    {
        if ( a[majority_index] == a[i] )
        {
            ++votes;
        }
        else
        {
            --votes;
        }

        if ( votes == 0 )
        {
            majority_index = i;
            ++votes;
        }
    }

    size_t count = 0;

    for ( size_t i = 0; i < n; i++ ) count += a[i] == a[majority_index];

    return n / 2 < count ? a[majority_index] : -1;
}
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
  • Yep...your solution with `scanf()` does pass all the test cases and is accepted for submission. But my solution with `fgets()` doesn't. It would be helpful if someone could point out the exact source of error in my modified program (leaving aside the issue of extra variables). –  Aug 23 '19 at 14:34
  • @Blue Maybe the problem is either with variable length arrays or you set MAX too low. According to the description the number of elements in the array can be very big: 1 <= N <= 107 – Vlad from Moscow Aug 23 '19 at 14:36
  • Aha, increasing MAX to `10^6` solves the problem. However, if I set it to `10^7` I get a segmentation fault. Which is bad since it says that N may be `10^7` (and each `A_i` can be `10^6` i.e. 7 digits at maximum)! Is there any way to overcome this restriction imposed on the size of character array? Would using dynamic memory allocation (malloc/calloc) help? –  Aug 23 '19 at 14:50
  • @Blue It is due to using variable length arrays that are created in the stack. Use the dynamic memory allocation as it is shown in my demonstrative program. – Vlad from Moscow Aug 23 '19 at 14:52
  • I tried using dynamic allocation now, as updated in the question body. Still, if I set MAX to `10^7` it passes all test cases and is accepted. But if I set it to `10^8` (as required in this case), it produces a segmentation fault. So perhaps GFGs isn't providing sufficient heap size? Or is there some other error? –  Aug 23 '19 at 15:08
  • @Blue As I already said it is a bad idea to use fgets. You are allocating too much memory in total for three arrays. The problem is solved. I showed how the program can be written. So close the question selecting the best (single) answer. Otherwise it is a bad idea to change your question periodically. – Vlad from Moscow Aug 23 '19 at 15:29