-1

I'm doing some exercises for my class, and I'm stuck with a problem. The exercise asks to fill a dynamic array with data from a file, order the data and then delete the duplicate data. My problem is in the last one, I cannot delete the duplicated numbers.

Here is my code:

int main(){
    FILE *f;
    int *v;
    int n=0,i,j;

    f=fopen("InteirosComRepetidos.txt","r");
    v=(int*)malloc(sizeof(int));

    while(!feof(f)){
        v=(int*)realloc(v,(n+1)*sizeof(int));
        fscanf(f,"%d\n",&v[n]);
        n++;
    }

    OrdenarQuicksort(v,0,n);

    for(i=0;i<n;i++){
        printf("%d\n",v[i]);
    }
    //Trying to remove the duplicate
    for(j=n;j>=0;j--){
        if(v[j]==v[j-1])
            v=(int*)realloc(v,(n-1)*sizeof(int));
    }

    printf("Retirados:\n");
    for(i=0;i<n;i++){
        printf("%d\n",v[i]);
    }
    system("pause");
    return 0;}  

Can anyone help me?

Jonathan Leffler
  • 730,956
  • 141
  • 904
  • 1,278
  • 1
    https://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong – M.M May 30 '18 at 01:29
  • You read out of bounds of your allocation (you resize to `n-1` but you keep reading `n` items). Maybe you meant to decrement `n` in the "trying to remove the duplicate" case, but that loop won't correctly remove duplicates that are not at the end – M.M May 30 '18 at 01:32
  • 1
    You will want to look at [**Why is while ( !feof (file) ) always wrong?**](http://stackoverflow.com/questions/5431941/why-is-while-feof-file-always-wrong). – David C. Rankin May 30 '18 at 05:23
  • You will also need to look at [Why trailing white space in a `scanf()` format string is a bad idea](https://stackoverflow.com/questions/15740024/). – Jonathan Leffler May 30 '18 at 07:15
  • Don't cast the result of a call to `malloc()`, `realloc()`, etc. It's unnecessary (`void *` is automatically converted to any pointer type) and potentially dangerous (if the correct prototype is not in scope, the cast may silence the compiler's otherwise-helpful warning and generate bad code for you). – mlp May 30 '18 at 20:48

3 Answers3

1

This:

for(j=n;j>=0;j--){
    ...
}

should be:

for(j=n-1;j>=0;j--){
    ...
}

Otherwise you are going out of bound.

Than this:

if(v[j]==v[j-1])
     v=(int*)realloc(v,(n-1)*sizeof(int));

is reallocating but not removing doubles. Meaning that you if you have

1 2 2 5

all this code does is remove 5 so you end up with

1 2 2

And in case where you reallocate and downsize your array in memory is not going to move it is going to stay where it is, thus in many cases you can still access 5.

0

You have a couple of problems. First, your look and if conditions invoke Undefined Behavior by reading outside the bounds of v in

for(j=n;j>=0;j--){
    if(v[j]==v[j-1])

When j = n you read 1-past the end of the array, and where j = 0, you read 1-before the beginning of the array. A better approach would be:

    i = n - 1;      /* set i to last element */
    while (i--)     /* loop removing the duplicates */
        /* if current equals next */
        if (v[i] == v[i + 1]) { 
        ...

Next, rather than using realloc to trim block from the end, you should use memmove to remove duplicates. (realloc can do nothing for 1, 2, 2, 3, 4) Instead simply backup over each element, checking for duplicates, and moving the end elements forward to overwrite the duplicate. Don't worry about realloc until you have removed all duplicates -- then realloc once, e.g.

    i = n - 1;      /* set i to last element */
    while (i--)     /* loop removing the duplicates */
        /* if current equals next */
        if (v[i] == v[i + 1]) { /* use memmove - blocks overlap */
            memmove (&v[i], &v[i + 1], (n - i - 1) * sizeof *v);
            n--;    /* decrement count */
        }

    /* final reallocation */
    tmp = realloc (v, n * sizeof *v);   /* never realloc to original ptr */
    if (tmp == NULL) {                  /* validate reallocation */
        perror ("realloc-final");
        return 1;
    }
    v = tmp;        /* then assign new block to original pointer */

note: do not realloc to the original pointer itself. Instead use a temporary pointer, and then validate that the realloc succeeded before assigning the address for the new block of memory to the original pointer.

Putting together a short example that uses an initialized array of jumbled numbers rather than reading from a file (since none was provided), you could do something similar to:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* simple qsort compare - sort int ascending */
int compare (const void *a, const void *b)
{
    return (*(int*)a > *(int*)b) - (*(int*)a < *(int*)b);
}

int main (void) {

    int a[] = {5,1,5,5,2,8,3,2,6,9,4,8,1,4,3,6,9,2,7,7},  /* numbers */
        n = sizeof a / sizeof *a,   /* number of elements */
        *v = NULL,                  /* pointer to allocate */
        i;
    void *tmp;  /* temp pointer to realloc, never realloc w/original */

    v = malloc (n * sizeof *v); /* don't cast return of malloc, always
                                * use 'n * sizeof dereferenced pointer'
                                */
    memcpy (v, a, n * sizeof *v);       /* copy a to v */
    qsort (v, n, sizeof *v, compare);   /* sort v */

    for (i = 0; i < n; i++){            /* output sorted v */
        printf ("a[%2d]: %d\n", i, v[i]);
    }

    i = n - 1;      /* set i to last element */
    while (i--)     /* loop removing the duplicates */
        /* if current equals next */
        if (v[i] == v[i + 1]) { /* use memmove - blocks overlap */
            memmove (&v[i], &v[i + 1], (n - i - 1) * sizeof *v);
            n--;    /* decrement count */
        }

    /* final reallocation */
    tmp = realloc (v, n * sizeof *v);   /* never realloc to original ptr */
    if (tmp == NULL) {                  /* validate reallocation */
        perror ("realloc-final");
        return 1;
    }
    v = tmp;        /* then assign new block to original pointer */

    printf ("Retirados:\n");
    for (i = 0; i < n; i++){
        printf ("a[%2d]: %d\n", i, v[i]);
    }

    free (v);   /* don't forget to free memory you allocate */
#if defined (_WIN32) || defined (_WIN64)
    getchar();  /* just use getchar() to hold terminal open on windows */
#endif

    return 0;
}

(note: qsort was used in place of your OrdenarQuicksort since you did not provide the OrdenarQuicksort code)

Example Use/Output

$ ./bin/array_rm_dups
a[ 0]: 1
a[ 1]: 1
a[ 2]: 2
a[ 3]: 2
a[ 4]: 2
a[ 5]: 3
a[ 6]: 3
a[ 7]: 4
a[ 8]: 4
a[ 9]: 5
a[10]: 5
a[11]: 5
a[12]: 6
a[13]: 6
a[14]: 7
a[15]: 7
a[16]: 8
a[17]: 8
a[18]: 9
a[19]: 9
Retirados:
a[ 0]: 1
a[ 1]: 2
a[ 2]: 3
a[ 3]: 4
a[ 4]: 5
a[ 5]: 6
a[ 6]: 7
a[ 7]: 8
a[ 8]: 9

Memory Use/Error Check

In any code you write that dynamically allocates memory, you have 2 responsibilities regarding any block of memory allocated: (1) always preserve a pointer to the starting address for the block of memory so, (2) it can be freed when it is no longer needed.

It is imperative that you use a memory error checking program to insure you do not attempt to access memory or write beyond/outside the bounds of your allocated block, attempt to read or base a conditional jump on an uninitialized value, and finally, to confirm that you free all the memory you have allocated.

For Linux valgrind is the normal choice. There are similar memory checkers for every platform. They are all simple to use, just run your program through it.

$ valgrind ./bin/array_rm_dups
==26625== Memcheck, a memory error detector
==26625== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==26625== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==26625== Command: ./bin/array_rm_dups
==26625==
a[ 0]: 1
a[ 1]: 1
a[17]: 8
<snip>
a[18]: 9
a[19]: 9
Retirados:
a[ 0]: 1
a[ 1]: 2
a[ 2]: 3
a[ 3]: 4
a[ 4]: 5
a[ 5]: 6
a[ 6]: 7
a[ 7]: 8
a[ 8]: 9
==26625==
==26625== HEAP SUMMARY:
==26625==     in use at exit: 0 bytes in 0 blocks
==26625==   total heap usage: 2 allocs, 2 frees, 116 bytes allocated
==26625==
==26625== All heap blocks were freed -- no leaks are possible
==26625==
==26625== For counts of detected and suppressed errors, rerun with: -v
==26625== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Always confirm that you have freed all memory you have allocated and that there are no memory errors.

Validate All File Opens & Reads

Reading from a file, you must validate the fopen and you must validate each fscanf. There is no reason to include the '\n' in your format string, the "%d" (and in fact all numeric) format specifiers consume leading whitespace (which includes ' ', '\t' and '\n').

When using malloc or realloc, do not cast the result -- there is no reason to. Both return void* which can be assigned to any pointer type without a cast, see: Do I cast the result of malloc?

Putting those pieces together, your open and read could be done something like:

    f = fopen ("InteirosComRepetidos.txt","r");
    if (f == NULL) {    /* validate file open for reading */
        perror ("fopen-InteirosComRepetidos.txt");
        return 1;   /* if it isn't, exit */
    }
    v = malloc (sizeof *v); /* don't cast return of malloc, and if you always
                             * use 'sizeof dereferenced pointer', times some
                             * amount, you will always get your allocations
                             * correct */

    while (fscanf (f, "%d", &v[n]) == 1) {  /* loop on validated scanf return */
        tmp = realloc (v, (n+1) * sizeof *v); /* never realloc ptr iself */
        if (tmp == NULL) {  /* validate realloc succeeded/handle failure */
            perror ("realloc-v");
            return 1;
        }
        v = tmp;    /* assign reallocated pointer to v */
        n++;        /* advance counter */
    }

note: you never realloc and assign the result to the original pointer (e.g. v = realloc (v, ...)). realloc can fail and when it does, it returns NULL. In that case, you have just lost the reference to your original block of memory -- resulting in not being able to use or free that memory (you have created a memory leak). Instead, always realloc with a temporary pointer (e.g. void *tmp = realloc (v, ...)) and validate that the reallocation succeeds before assigning v = tmp;. That way, in the event realloc fails, you still have your reference to the original block of memory in v (which still holds (or points to) all your original values before your attempted realloc)

David C. Rankin
  • 81,885
  • 6
  • 58
  • 85
-1

What you will have to do is move the repeated elements to the end of the list and then remove them.

Armali
  • 18,255
  • 14
  • 57
  • 171
Lorhan Sohaky
  • 19
  • 1
  • 4
  • 5
    This answer might well be appropriate on [Stack Overflow em Português](https://pt.stackoverflow.com/) but is not appropriate on the English-language site. – Jonathan Leffler May 30 '18 at 04:56
  • The names of the files are in Portuguese, so I answered in Portuguese – Lorhan Sohaky May 30 '18 at 17:16
  • 1
    I understand the urge — but the question was in English, as it is supposed to be, and the answer should be in English, too (though I'd have no problem with English and Portugese saying the same). I don't entirely agree with the policy — I certainly have limited sympathy with those demanding that the code be in English — but it is the SO policy that questions are asked in English. I've not down-voted you either. I got abused in comments recently for translating a question with Google Translate — apparently, there are those who think you can't do for others what they could/should do themselves. – Jonathan Leffler May 30 '18 at 17:54