0

I need to filter some specific elements from an array. I write the code which work perfectly:

#include <stdio.h>
#define NELEMS(x)  (sizeof(x) / sizeof((x)[0]))
int main (int argc, char *argv[]) {
    // our initial array
    int x[] = {1,2,-3,4,5};
    // initialize filtered_array with pointer
    int *filtered;
    int upper_bound = 4, lower_bound = 1, i, j=0, total = -1,result;
    size_t n = NELEMS(x);
    printf("size of the main array: %d\n",n);

    // check which element satisfy the condition and count them
    for (i=0;i<n;++i)
    {
        total = ((x[i] >= lower_bound) && (x[i] <= upper_bound)) ? total+1 : total;
    };

    // allocate array size for filtered array
    filtered = (int*) calloc(total,sizeof(int));
    for (i=0;i<n;++i)
    {
        // filter element from main array and store them in filtered array
        result = ((x[i] >= lower_bound) && (x[i] <= upper_bound)) ? 1 : 0;
        if(result) {
            filtered[j] = x[i];
            ++j;
        };
    };
    for (i = 0; i<total+1; ++i){
        printf("%d ",filtered[i]);
    };
    return 0;
}

But can I avoid to create a new array like I used filtered and dynamically do this for the main array by some overwrite trick?

Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
falamiw
  • 426
  • 4
  • 16
  • If none of your arry elements are within bounds, `total` is negative. `total + 1` in the printing loop also raises some red flags. – M Oehm Apr 13 '22 at 10:17
  • Aha, I just start with `-1` for some silly reason. I am aware of that. Thanks for the catch @MOehm – falamiw Apr 13 '22 at 10:19

2 Answers2

1

What you are looking for is called modification (in this case filtering) in place.

That's even pretty simple:

int* filter(int* begin, int* end)
{
    int* pos = begin; // start at the beginning of the array
    for(; begin != end; ++begin)
    {
         if(*begin != 0) // your appropriate condition!
                         // this one filters out all zero values...
         {
             *pos++ = *begin; // copy current element to retain into
                              // first position not yet used
                              // (pos always points to one element past
                              //  the last one retained!)
         }
    }
    return pos; // as above, points to one past the last element retained
}

The return value is important to know how many elements remained.

If you prefer, you can instead write an index-based variant avoiding pointer arithmetics...

The basic idea about the filtering algorithm is that you just copy the elements towards the front, overwriting those values that either are filtered away or have already been copied.

Edit: Additional explanations:

begin and end pointers are to be passed as pointer to first element of the array and pointer to one past the array (typical C++ semantics when iterating over C++ STL containers, though these come with their own iterator types instead of pointers...).

So you'd call the function like:

int array[N];
int* newEnd = filter(array, array + sizeof(array)/sizeof(*array));

This is typical C++ semantics when iterating over STL containers (though these come with their specific iterator types instead of poitners).

If you don't like:

size_t filter(size_t length, int array[length]) // note: length as array subscript
                                                // is ignored anyway...
{
    int* pos = array;
    int* begin = array;
    int* end = array + length;

    // now following the identical loop as above

    return pos - array; // pointer difference; if pos points one past the last
                        // element you get the *number* of elements retained
}

About *pos++ = *begin: That's nothing special and any good C book should explain that nicely to you...

It copies the value begin points to to the address pos points to and increments pos afterwards.

An indexing loop doing the same might look as follows:

size_t pos = 0;
for(size_t i = 0; i < length; ++i)
{
    array[pos++] = array[i]
}
return pos;
Aconcagua
  • 24,880
  • 4
  • 34
  • 59
  • I am not super familiar with pointer. It will be a great help if you describe each pointer expression little bit. like what `*pos++ = *begin` stand for or do here. Sorry for asking that but I really interested to learn pointer based solution. @Aconcagua And what is begin and end here? How this filter function access the elements of my main array? – falamiw Apr 13 '22 at 10:25
1

For starters your program is incorrect.

Firstly you need to include the header <stdlib.h>

#include <stdlib.h>

Secondly, initially total is set to -1

total = -1

So if the original array contains only one element that satisfies the condition then total will be equal to 0 due to this statement

 total = ((x[i] >= lower_bound) && (x[i] <= upper_bound)) ? total+1 : total;

As a result this statement

filtered = (int*) calloc(total,sizeof(int));

is trying to allocate a memory with the size equal to 0. If the memory will be allocated (it is implementation defined) then you may not write anything in this memory. Otherwise the program will have undefined behavior.

In any case you are counting elements that satisfy the condition incorrectly and hence allocating a block of memory of an incorrect size.

Also there is no sense to insert a null statement after compound statements as you are doing

for (i=0;i<n;++i)
{
    //...
};

^^^

Remove such redundant semicolons.

And you are using an incorrect conversion specifier in this call of printf.

printf("size of the main array: %d\n",n);
                                ^^^

You have to write

printf("size of the main array: %zu\n",n);
                                ^^^

As for your question

But can I avoid to create a new array like I used filtered and dynamically do this for the main array by some overwrite trick?

then what you need is to change the array in place and to track the number of actual elements that satisfy the condition.

For example

int x[] = {1,2,-3,4,5};
// initialize filtered_array with pointer
size_t n = NELEMS(x);

int upper_bound = 4, lower_bound = 1;

printf( "size of the main array: %zu\n", n );

size_t m = 0;

for ( size_t i = 0; i < n; ++i )
{
    if ( x[i] >= lower_bound) && x[i] <= upper_bound )
    {
        if ( m != i )
        {
            x[m] = x[i];
        }
        ++m;
    }
}

for ( size_t i = 0; i < m; ++i )
{
    printf( "%d ", x[i] );
}

putchar( '\n' );

If you need to reallocate the original array according to the number of elements that satisfy the condition then the code can look for example the following way

#include <stdlib.h>
#include <string.h>
#include <stdio.h>

//...

size_t n = 5;
int *x = malloc( n * sizeof( int ) );

memcpy( x, ( int[] ){ 1, 2, -3, 4, 5 }, n * sizeof( int ) );

int upper_bound = 4, lower_bound = 1;

printf( "size of the main array: %zu\n", n );

size_t m = 0;

for ( size_t i = 0; i < n; ++i )
{
    if ( x[i] >= lower_bound) && x[i] <= upper_bound )
    {
        if ( m != i )
        {
            x[m] = x[i];
        }
        ++m;
    }
}

if ( m != n )
{
    int *tmp = realloc( x, m * sizeof( int ) );

    if ( tmp != NULL ) 
    {
        x = tmp;

        for ( size_t i = 0; i < m; ++i )
        {
            printf( "%d ", x[i] );
        }

        putchar( '\n' );
    }
}

free( x );
Vlad from Moscow
  • 301,070
  • 26
  • 186
  • 335
  • Thanks for your suggestion, I am learning C programming. I have a question, the main array still have some elements (after m indices) which we don't need at all. Could we remove them? I mean I was thinking to convert my main array into an dynamic array which turn into filtered array without those unnecessary elements. Another thing is, What is the best way to test this program? Like I need to implement this with a huge array size. Could I seed the test cases quickly. I was a python developer that's why facing trouble with such mid level programming language. Thanks again for your response – falamiw Apr 13 '22 at 11:38
  • @falamiw You could reallocate the array if it would be allocated dynamically. – Vlad from Moscow Apr 13 '22 at 11:45
  • Aha, I was thinking to do that without using tmp variable that's why confused how to do that. Thanks again(+1) @Vlad from Moscow. – falamiw Apr 13 '22 at 11:59