1

I know this question has been asked many times before, however I am a complete beginner to C arrays and would like to avoid using pointers (if possible) and keep it as simple as possible. I have used the input from the user for a char array and would like to remove any duplicate string values in my program

#include<string.h>
#include<stdio.h>

int main(){
    int N, i, j, k;
        int flag=1; 
    scanf("%d", &N);
    if(1<=N<=10^6){
    char a[N][5];
        for(i=0; i<N; i++){
            scanf("%s", &a[i]);
        }
        for(i=0; i<N; i++){
            for(j=i+1;j<N;){
                if(strcmp(a[i],a[j])==0){
                    for(k=j+1; k<N; k++){
                        memcpy(a[k], a[k+1], 5);
                    }
                    N--;
                }else{
                    j++;
                }
            }   
        }
        for(i=0; i<N; i++){
            printf("%s\n", a[i]);
        }   
    }
    return 0;
}

enter image description here

An input of 3 and {"abc", "abc", "as"} returns the value only {"abc"}. I would like to get the array as {"abc", "as"}. I cannot understand where in the code or logic, I have gone wrong.

Update

I changed the code as mentioned below however for larger examples it is concatenating the strings

#include<string.h>
#include<stdio.h>

int main(){
    int N, i, j, k;

    scanf("%d", &N);
    if(1 <= N && N <= 1000000){
    char a[N][5];
        for(i=0; i<N; i++){
            scanf("%5s", a[i]);
        }
        for(i=0; i<N; i++){
            for(j=i+1;j<N;){
                if(strcmp(a[i],a[j])==0){
                    for(k=j+1; k<N; k++){
                        strcpy(a[k-1], a[k]);
                    }
                    N--;
                }else{
                    j++;
                }
            }
        }
        printf("%d\n", N);
        for(i=0; i<N; i++){
            printf("%s\n", a[i]);
        }
    }
    return 0;
}

enter image description here

Community
  • 1
  • 1
rut_0_1
  • 761
  • 1
  • 11
  • 34
  • is this question for C or C++? They are not the same language. Please select one and delete the other tag. – xaxxon Jul 07 '17 at 08:41
  • 1
    Your innermost loop is accessing the array *out of bounds* at `a[k+1]`. There might be more bugs, but this is already undefined behavior. –  Jul 07 '17 at 08:47
  • 1
    BTW, this is in fact C code (C++ has `std::string` and `std::vector` and so on which you should use in C++). Would be better to get rid of `iostream` (you don't use it) and ask this as a C question. -- Oh, and use a C compiler for such code, not a C++ compiler. –  Jul 07 '17 at 08:48
  • Next bug: why are you copying 6 bytes if your array members only have 5? (you could just use `strcpy()` btw) -- and then, `scanf("%s", ...)` is **always** a buffer overflow, don't use it. It will not magically stop parsing if you enter more characters. –  Jul 07 '17 at 08:51
  • @FelixPalmen I have made the changes regarding the buffer overflow. However the output yet remains the same. How should I approach the problem of array _out of bounds_ at a[k+1] ? – rut_0_1 Jul 07 '17 at 09:03

2 Answers2

1

Here's a description of all your errors in the comments:

#include<string.h>
#include<stdio.h>

int main(){
    int N, i, j, k;

    // check the return value!
    if (scanf("%d", &N) != 1) return 1;

    // if(1<=N<=10^6){
    // this is completely wrong:
    // 1. ^ doesn't mean "power" but bitwise exclusive or
    // 2. 1<=N evaluates to 0 (false) or 1 (true), this is ALWAYS <= 10
    // 3. so you have finally 1^6 = 7 -- ALWAYS true as a boolean
    //
    // what you want is:
    if(1 <= N && N <= 1000000){

        char a[N][5];
        for(i=0; i<N; i++){
            // scanf("%s", &a[i]);
            // at least limit the number of characters read (one less
            // than your buffer size because there's a 0 byte added)
            // then, taking a *pointer* of an array is wrong, the array
            // already decays as a pointer, so leave out the `&`
            scanf("%4s", a[i]);
        }
        for(i=0; i<N; i++){
            for(j=i+1;j<N;){
                if(strcmp(a[i],a[j])==0){
                    for (k=j+1; k<N; k++){
                        // memcpy(a[k], a[k+1], 6);
                        // two errors here:
                        // 1. You only have 5 bytes per element, so copy only 5
                        // 2. this is *off by one* for the array index
                        // correct version:
                        memcpy(a[k-1], a[k], 5);
                        // (or use strcpy())
                    }
                    N--;
                }else{
                    j++;
                }
            }   
        }
        for(i=0; i<N; i++){
            printf("%s\n", a[i]);
        }   
    }
    return 0;
}

In general, always enable compiler warnings. Your compiler would have spotted most of these bugs for you, see what happens when you compile your original code with gcc and warnings enabled:

$ gcc -std=c11 -Wall -Wextra -pedantic -o2darr 2darr.c
2darr.c: In function 'main':
2darr.c:8:12: warning: comparison of constant '10' with boolean expression is always true [-Wbool-compare]
     if(1<=N<=10^6){
            ^~
2darr.c:8:9: warning: comparisons like 'X<=Y<=Z' do not have their mathematical meaning [-Wparentheses]
     if(1<=N<=10^6){
        ~^~~
2darr.c:8:12: warning: suggest parentheses around comparison in operand of '^' [-Wparentheses]
     if(1<=N<=10^6){
        ~~~~^~~~
2darr.c:11:21: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'char (*)[5]' [-Wformat=]
             scanf("%s", &a[i]);
                     ^
2darr.c:11:21: warning: format '%s' expects argument of type 'char *', but argument 2 has type 'char (*)[5]' [-Wformat=]
2darr.c:6:13: warning: unused variable 'flag' [-Wunused-variable]
         int flag=1;
             ^~~~
  • 1.) returning non-zero from `main()` indicates error, `1` is just common, the OS gets this back as exit status (e.g. a script could check for that) 2.) `a[i]` is an array itself (of dimension 5) because `a` is a 2-dimenstional array. When you pass an array to a function, it *decays* as a pointer (arrays can't be passed). You'll find a lot on that topic, [e.g. here](https://stackoverflow.com/a/1462352/2371524) –  Jul 07 '17 at 11:05
  • There are few more issues coming up for larger data samples as given in the update above – rut_0_1 Jul 07 '17 at 11:08
  • @rut_0_1 no, you didn't read properly: "*at least limit the number of characters read (one less than your buffer size because there's a 0 byte added)*". Your edited code allows `scanf()` to write 5 bytes **plus** the 0-byte, which is too large for your arrays. –  Jul 07 '17 at 11:11
  • when I set it as %4s, it causes program to exit after 5 characters are entered. I want N<=5. Hence when I change it to %5s and the char a[N][6], it works fine however after the last index, garbage values start coming – rut_0_1 Jul 07 '17 at 11:25
0

My first reaction would be to write it in c++ using standard algorithms.

#include<iostream>
#include<string>
#include<algorithm>
#include<cmath>
#include<vector>
#include<unordered_set>
#include<algorithm>

template<class T, class A>
auto deduplictate_keep_order(std::vector<T, A> &vec) -> std::vector<T, A> &
{
    std::unordered_set<T> seen;
    seen.reserve(vec.size());

    auto check_seen = [&seen](T const &val) {
        return !seen.insert(val).second;
    };

    vec.erase(std::remove_if(vec.begin(), vec.end(), check_seen), vec.end());
    return vec;
};

template<class T, class A>
auto deduplictate_any_order(std::vector<T, A> &vec) -> std::vector<T, A> &
{
    std::sort(vec.begin(), vec.end());
    vec.erase(std::unique(vec.begin(), vec.end()), vec.end());
    return vec;
};


int main() {
    int N, i, j, k;
    int flag = 1;
    std::cin >> N;

    int limit = std::pow(10, 6);
    if (1 <= N && N <= limit) {  //  1 <= N <= will not do what you want. 10^6 is 10 XOR 6. You don't want that.
        std::vector<std::string> a;
        for (i = 0; i < N; i++) {
            a.emplace_back();
            std::cin >> a.back();
        }

        // remove duplicates
        deduplictate_keep_order(a);

        // or this
//        deduplictate_any_order(a);

        for (std::string const &s : a)
            std::cout << s << '\n';
    }
    return 0;
}
Richard Hodges
  • 68,278
  • 7
  • 90
  • 142
  • Best demonstration why questions about "C/C++" typically make no sense :) –  Jul 07 '17 at 09:12