0

I'm trying to build an R package implementing Dijkstra algorithm using Rcpp and RcppParallel. You can see my work here. Now I want to add a new function and a strange behavior appears. When I compile this function via sourceCpp function and try it, it works well but when I rebuilt the package with this function, R crashes almost immediately when I call the function. I have no error when I built the package via Rstudio (Clean and rebuild).
The algorithm is bidirectional Dijkstra, so it uses two graphs (forward and reverse) and two priority queues from STL. A graph is simply a vector of vector of pairs.

Here is the code (sorry but I don't know how to simplify it):

// [[Rcpp::depends(RcppParallel)]]
// [[Rcpp::plugins(cpp11)]]

#include <queue>
#include <vector>
#include <limits>
#include <functional>
#include <RcppParallel.h>
#include <Rcpp.h>
using namespace RcppParallel;

struct Pardijkstra : public Worker
{
  //input
  const std::vector<std::vector<std::pair<int, double> > > m_graph;
  const std::vector<std::vector<std::pair<int, double> > > m_graphr;
  RVector<int> m_dep;
  RVector<int> m_arr;
  const int m_nbnodes;

  //output
  RcppParallel::RVector<double> m_result;

  //constructor
  Pardijkstra(const std::vector<std::vector<std::pair<int, double> > >  &graph,
              const std::vector<std::vector<std::pair<int, double> > >  &graphr,
              Rcpp::IntegerVector & dep,
              Rcpp::IntegerVector & arr,
              const int nbnodes,
              Rcpp::NumericVector result) : m_graph(graph),m_graphr(graphr),m_dep(dep), m_arr(arr),m_nbnodes(nbnodes),m_result(result)
  {

  }

  //overload () operator
  void operator()(std::size_t begin, std::size_t end){
    struct comp{

      //Custom comparator included in priority queues

      bool operator()(const std::pair<int, double> &a, const std::pair<int, double> &b){
        return a.second > b.second;
      }
    };

    //

    for (std::size_t k=begin; k!=end;k++){

      //Here is the algorithm (bidirectional search)

      int StartNode=m_dep[k];
      int EndNode=m_arr[k];

      std::vector<double> Distances(m_nbnodes, std::numeric_limits<double>::max()); 
      std::vector<double> Distances2(m_nbnodes, std::numeric_limits<double>::max()); 

      //Distances are stored here
      Distances[StartNode] = 0.0;  
      Distances2[EndNode] = 0.0;

      std::vector <int> Visited(m_nbnodes,0);

      std::priority_queue<std::pair<int, double>, std::vector<std::pair<int, double> >, comp > Q;
      std::priority_queue<std::pair<int, double>, std::vector<std::pair<int, double> >, comp > Qr;
      Q.push(std::make_pair(StartNode, 0.0)); 
      Qr.push(std::make_pair(EndNode, 0.0));
      Visited[StartNode]+=1;
      Visited[EndNode]+=1;

      double mu=std::numeric_limits<double>::max();

      while (!Q.empty() && !Qr.empty()) {  

        //Stopping criterion

        if (Q.top().second+Qr.top().second >= mu){
          break;
        }  

        //Forward search
        if (!Q.empty()){
          int v=Q.top().first;
          int w=Q.top().second;
          Q.pop();

          if (w <= Distances[v]) {
            for (int i=0; i< m_graph[v].size(); i++){
              int v2 = m_graph[v][i].first;                                                     
              double w2 = m_graph[v][i].second;

              if (Distances[v] + w2 < Distances[v2]) {                               
                Distances[v2] = Distances[v] + w2;                                   

                Q.push(std::make_pair(v2, Distances[v2]));
                Visited[v2]+=1;

              }
            }
          }
          if ((Visited[v]>1)  && (Distances[v]+Distances2[v]) < mu){

            mu=Distances[v]+Distances2[v];

          }
        }

        //Backward search
        if (!Qr.empty()){
          int vv=Qr.top().first;
          int ww=Qr.top().second;
          Qr.pop();

          Visited[vv]+=1;

          if (ww <= Distances2[vv]) {
            for (int i=0; i< m_graphr[vv].size(); i++){
              int vv2 = m_graphr[vv][i].first;                                                     
              double ww2 = m_graphr[vv][i].second;

              if (Distances2[vv] + ww2 < Distances2[vv2]) {                               
                Distances2[vv2] = Distances2[vv] + ww2;                                  

                Qr.push(std::make_pair(vv2, Distances2[vv2]));
                Visited[vv2]+=1;
              }
            }
          }
          if ((Visited[vv]> 1) && (Distances[vv]+Distances2[vv]) < mu){
            mu=Distances[vv]+Distances2[vv];
          }
        }
      }

      if (mu >= std::numeric_limits<double>::max()){
        m_result[k] = Rcpp::NumericVector::get_na();
      }
      else {
        m_result[k]=mu;
      }

    }

  }

};


//Fonction exported in R

// [[Rcpp::export]]
Rcpp::NumericVector Bidir_par(Rcpp::IntegerVector dep, Rcpp::IntegerVector arr,Rcpp::IntegerVector gfrom,Rcpp::IntegerVector gto,Rcpp::NumericVector gw,int NbNodes){


  //dep : origin nodes
  //arr: destination nodes
  //gfrom: first nodes of graphs edges
  //gto: last nodes of graphs edges
  //gw: weights of graphs edges
  //NbNodes: number of nodes in the graph

  Rcpp::NumericVector result(dep.size());


  //Construction of the two graphs
  int NbEdges=gfrom.size();

  std::vector<std::vector<std::pair<int, double> > > G(NbNodes);   
  std::vector<std::vector<std::pair<int, double> > > Gr(NbNodes);
  for (int i = 0; i != NbEdges; ++i) {

    G[gfrom[i]].push_back(std::make_pair(gto[i], gw[i]));
    Gr[gto[i]].push_back(std::make_pair(gfrom[i], gw[i]));

  }


  Pardijkstra Dijfunc(G,Gr,dep,arr,NbNodes,result);
  parallelFor(0,dep.length(),Dijfunc);

  return Rcpp::wrap(result);



}



I know that inputs should be RVector or RMatrix but unfortunately, directed graphs cannot be simplified in that form without losing efficiency. std::vectors are contiguous in memory right ?

I have noticed that if I remove an if statement (for example all the backward search), it works.

If you want to see the other functions implemented in the package, you can check my github repository. All other functions works well.
Here is the Makevars file :

CXX_STD = CXX11
PKG_LIBS += $(shell ${R_HOME}/bin/Rscript -e "RcppParallel::RcppParallelLibs()")

the Makevars.win file :

CXX_STD = CXX11
PKG_CXXFLAGS += -DRCPP_PARALLEL_USE_TBB=1
PKG_LIBS += $(shell "${R_HOME}/bin${R_ARCH_BIN}/Rscript.exe" -e "RcppParallel::RcppParallelLibs()")

and the DESCRIPTION file :

Imports: Rcpp (>= 1.0.1), RcppParallel (>= 4.4.2)
LinkingTo: Rcpp, RcppParallel
SystemRequirements: GNU make

So what am I doing wrong ?
Why the function works with sourceCpp function and not in the package ?

EDIT : According to @thc, my implementation is not thread-safe so is there a way to rewrite this algorithm in a safety way ? Knowing that priority queues are mandatory here.

Any help is appreciated!

vlarmet
  • 71
  • 1
  • 3
  • 1
    Have you tried running it in a debugger? – Ralf Stubner Jun 13 '19 at 19:00
  • @RalfStubner No how can I do it ? – vlarmet Jun 14 '19 at 08:10
  • @thc I'm fairly new in C++ programing... So using priority queues are prohibited in parallel processing ? I thought that each threads would construct its own queues and only input and output are shared – vlarmet Jun 14 '19 at 08:13
  • I assume you are using Windows, since your GitHub repo contains a `dll` file (which should not be there, BTW). If that is correct, have a look at https://stackoverflow.com/questions/53622354/how-to-debug-line-by-line-rcpp-generated-code-in-windows/53627146#53627146, adjusting the procedure as necessary for package use. – Ralf Stubner Jun 14 '19 at 08:16
  • 1
    Finally, I chose to parallelize all my algorithms via `parallel` package. Despite a bit more memory consumption due to copies of input in each thread, it seems to be safer. – vlarmet Jun 16 '19 at 12:21

0 Answers0