0

I am having trouble with the push_back() function in C++. For a reason which I don't understand, the push_back function will "not accept" the value I am telling it to append, but the code works perfectly until I try to display the values I want (end of code). I have checked the value type, which is double, but it still won't append it.

The value I am trying to insert comes from a function which calculates the mean of a vector, taking out the NaN values. The code works perfectly but when I am trying to display the values I want, I always get: Segmentation fault (core dumped). This mean function first iterates over a range and creates a vector in which the NaN will be taken away. Then the mean will be calculated. I have spent quite some time on trying to figure out where the error could come from but wasn't able to figure out anything so any help would be highly appreciated. The following mean function:

double mean_func(double **arr, int iterations, int header, int start){

    std::vector<double> vec;
    for (int i=start-iterations; i < start; i++){
         vec.push_back(arr[i][header]);
         }

    vec.erase(std::remove_if(std::begin(vec),
                           std::end(vec),
                           [](const auto& value) { return std::isnan(value); }),
            std::end(vec));

    double sum = std::accumulate(vec.begin(), vec.end(), 0.0);
    double mean = sum / vec.size();

    return mean;
}

The whole code where transf_array_2_vec transforms a vector to an array.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <omp.h>
#include <cmath>
#include <vector>
#include <algorithm>
#include <iostream>
#include <numeric>
#include <typeinfo>


double** transf_vec_2_array(std::vector<std::vector<double> > vals, int N, int M)
{
   double** temp;
   temp = new double*[N];
   #pragma omp parallel for
   for(int i=0; (i < N); i++)
   {
      temp[i] = new double[M];
      for(int j=0; j < M; j++)
      {
          temp[i][j] = vals[i][j];
      }
   }
   return temp;
 }


double mean_func(double **arr, int iterations, int header, int start){

    std::vector<double> vec;
    for (int i=start-iterations; i < start; i++){
         vec.push_back(arr[i][header]);
         }

    vec.erase(std::remove_if(std::begin(vec),
                           std::end(vec),
                           [](const auto& value) { return std::isnan(value); }),
            std::end(vec));

    double sum = std::accumulate(vec.begin(), vec.end(), 0.0);
    double mean = sum / vec.size();

    return mean;
}


double** cfun(double **indata, unsigned int rows, unsigned int cols, int max_inputs, int daily_inputs, int weekly_inputs, int inputs_short_term, const char **header_1, const char **header_2, unsigned int size_header_1, unsigned int size_header_2, double **outdata) {

    std::vector< std::vector <double>> seq_collec;
    std::vector<double>seq;
    unsigned int i, j, k, l;
    unsigned int temp= 5080;

    int num = omp_get_thread_num();
    omp_set_num_threads(num);

    //#pragma omp parallel for //private(seq)
    for(i = max_inputs + weekly_inputs; i < temp ; i++) {
        for(j = 0; j < size_header_1; j++ ){
            for(k = 0; k < size_header_2 ; k++){
                for(l = i-max_inputs; l < i; l++){
                   if((strcmp(header_2[k],"price")==0)|| (strcmp(header_2[k], "change")==0)){
                    seq.push_back(indata[l][j+k]);
                      seq.push_back(mean_func(indata, inputs_short_term, j*size_header_2, l));
                      //std::cout << i << " " << j << " " << k <<std::endl;
                      //std::cout << header_1[j] << " " << header_2[k] << std::endl;
                      //std::cout << typeid(mean).name() << std::endl;
                      //std::cout << mean_func(indata, inputs_short_term, j*size_header_2, l) << std::endl;
                   }else{
                      seq.push_back(indata[l][j+k]);
                    //std::cout << header_1[k] << " " << header_2[k] << std::endl;
                  }
                }
            }
        seq.push_back(mean_func(indata, daily_inputs, j+k, l));
        seq.push_back(mean_func(indata, weekly_inputs, j+k, l));
       }
       //std::cout << seq.size() << std::endl;
       seq_collec.push_back(seq);
       seq.clear();
    }

    outdata = transf_vec_2_array(seq_collec, seq_collec.size(), seq_collec[0].size());
    //std::cout << seq_collec.size() << std::endl;
    //std::cout << seq_collec[0].size() << std::endl;
    return outdata;
}

int main(){

    int rows = 10846, cols = 12, max_inputs = 20, daily_inputs = 1000, weekly_inputs=5000, inputs_short_term=4;
    unsigned int size_header_1 = 3, size_header_2 = 4;
    const char *header_1[size_header_1] = {"CH:SMI","DJIA","RUI"};
    const char *header_2[size_header_2] = {"change","delta_vol","price","volume"};

    double* *indata = new double*[rows];
    double* *outdata = new double*[rows];
    for (int i=0; i < rows; i++){
        indata[i] = new double[cols];
        outdata[i] = new double[cols];
    }

    for (int i=0; i < rows; i++){
        for (int j=0; j < cols; j++){
          indata[i][j]=i + j ;
        }
    }

    outdata = cfun(indata, rows, cols, max_inputs, daily_inputs, weekly_inputs, inputs_short_term, header_1, header_2, size_header_1, size_header_2, outdata);

    for(int j = 0; j < 1; j++){
      for(int i = 0; i < 366; i++){
        std::cout << outdata[i][j] << std::endl; // PROBLEM HERE !!!
      }
    }

    return 0;
}
Joachim
  • 490
  • 5
  • 24
  • Just glancing at it, dollars to donuts your problem is the raw pointers and not the vector. I mean, it could be division-by-zero with the `.size()`, but your pointer arithmetic looks dangerous. – Yakk - Adam Nevraumont Oct 10 '19 at 18:55
  • Also you leak like a sieve. Work out where `outdata` comes from -- it is allocated, then discarded and replaced by the return value of `transf_vec_2_array`. Who knows what `N` and `M` are, I haven't decoded it. But it is a mess. – Yakk - Adam Nevraumont Oct 10 '19 at 18:59
  • If you're going to allocate a 2D array in that nested `for` loop, you certainly picked the worst way to do it. All of the row pointers are strewn all over the heap. – PaulMcKenzie Oct 10 '19 at 18:59
  • @RemyLebeau iterations is like 20, start is like 5020, so I want to iterate from 5000 to 5020 – Joachim Oct 10 '19 at 18:59
  • 2
    The returned `outdata` only contains about 60 rows while you are iterating on 366 rows when displaying. Consider refactoring your code to only use `std::vector`. It will be a lot easier to debug and maintain. – Gilles-Philippe Paillé Oct 10 '19 at 18:59
  • @JoachimBsh that is an odd way to write a loop, but OK – Remy Lebeau Oct 10 '19 at 19:02
  • Using `std::vector` will also solve the memory leak that you are having now (but maybe not realized). – Gilles-Philippe Paillé Oct 10 '19 at 19:03
  • Your program is ill-formed with `unsigned int size_header_1`. Array sizes need to be known at compile time. A `const` var should work for this. –  Oct 10 '19 at 19:04
  • @Gilles-PhilippePaillé Indeed, that's the most plausible explanation. But I don't understand why, because the `transf_vec_2_array` iterates over the whole size of `seq_collec` which is something like 5000. – Joachim Oct 10 '19 at 19:08
  • @Gilles-PhilippePaillé Where is the memory leak ? – Joachim Oct 10 '19 at 19:09
  • 1
    @JoachimBsh `outdata = cfun(...)` and `outdata = transf_vec_2_array(...)` work together to leak data because you reassign it to a new pointer, but don't delete the data already stored there first. –  Oct 10 '19 at 19:12
  • `#pragma omp parallel for` -- Hopefully you are using this just to test for slowness and seeing if there is any effect . Not only do you need only two allocations (not a loop with an allocation in it), setting the items equal can be done with very fast with `N` `memcpy` or `std::copy` calls. – PaulMcKenzie Oct 10 '19 at 19:15
  • Ok, now I see where the memory leak is. Indeed I will correct this. But still, I don't think it explains the error I get, because the vector I transform has enough iterations – Joachim Oct 10 '19 at 19:16
  • @JoachimBsh *I have spent quite some time on trying to figure out where the error could come* -- So you're not using a debugger? What line is the one that causes the crash? – PaulMcKenzie Oct 10 '19 at 19:20
  • @PaulMcKenzie There is actually no crash, except if I try to display the values of `outdata`, which is specified at the end of the code line 116 – Joachim Oct 10 '19 at 19:22
  • Also, I see a lot of comments about the double pointer structure. I am aware I could only use `std::vector`. However, The `cfun` function will later be called in my project from `ctypes` in `python. And this is why I have to use double pointers. – Joachim Oct 10 '19 at 19:31
  • 1
    `std::cout << outdata[i][j] << std::endl;` -- How many row and columns does `outdata` have? Do you know just by looking at that line, or do you need to run your program step-by-step to figure this out? There is no `size` member or property for arrays or ones allocated from the heap. There is no way you are sure that `i` and `j` are in bounds by faith alone. You need to debug your code. Also my issue with the double pointers isn't a vector issue -- it is an efficiency issue. See [this](https://stackoverflow.com/questions/21943621/how-to-create-a-contiguous-2d-array-in-c/21944048#21944048) – PaulMcKenzie Oct 10 '19 at 19:32
  • OK, well... you were all right, it was a leak problem. Thanks to all for this hint, I would have never had the idea to see the problem this way. I was blocked on the push_back() function. Thanks again – Joachim Oct 11 '19 at 16:55

0 Answers0