1

I need to sort a table by column. My tables are represents by a single vector.

example :

col_name A B C

vector : 1 2 3 6 5 4 7 8 9

that give me the table :

A B C

1 6 7
2 5 8
3 4 9

After a sort on column B , I need to obtain :

A B C

3 4 9
2 5 8
1 6 7

my code :

#include <iostream>
#include <string>
#include <vector>
#include <algorithm>

int main()
{

  std::vector<std::string> vec = {"1","8","1","2","3","2","3",
                                  "5","5","2","5","6","5","6",
                                  "9","3","3","4","8","3","9"};

  std::vector<std::string> rec = {"1","1","8","2","2","3","3",
                                  "2","5","5","5","5","6","6",
                                  "3","9","3","4","3","8","9"};

  int size = 7;
  int col_idx = 1;


  for(int i = 0; i<3;++i)
    {
      if(i==col_idx)
        continue;

      std::sort(vec.begin() + i*size, vec.begin() + (i+1)*size,
                [col_idx, size, i](std::string& s1,std::string& s2)
                       {
                         std::cout << s1 << " "
                                   << s2 << " "
                                   << *(&s1 +(col_idx - i)*size) << " "
                                   << *(&s2 +(col_idx - i)*size) << " "
                                   << (*(&s1 +(col_idx - i)*size) < *(&s2 +(col_idx - i)*size)) << std::endl;
                         return *(&s1 +(col_idx - i)*size) < *(&s2 +(col_idx - i)*size);
                       });
    }
  std::sort(vec.begin() + col_idx*size, vec.begin() + (col_idx+1)*size);
}

assert(vec==res);

I have a segmentation fault error : only the first line appear from the cout.

Kafka
  • 720
  • 6
  • 21
  • 1
    Any reason why you're not simply using `std::vector >`? – theWiseBro Feb 28 '20 at 14:37
  • Use a debugger! – Matthieu Brucher Feb 28 '20 at 14:43
  • 3
    If your `std::sort` predicate looks convoluted for something that is conceptually simple (sort columns), then take a step back and rethink your approach (how you structured your data, the data types you chose, etc.) – PaulMcKenzie Feb 28 '20 at 14:45
  • It's an example code. In the real code, that allow some theorical speed up, and I can't change all the approch in the whole code. – Kafka Feb 28 '20 at 14:48
  • can you please include the actual and expected output for the numbers in the code? – 463035818_is_not_an_ai Feb 28 '20 at 14:48
  • Don't see the point for the debugger ??? that just say there is a problemee with *(&s1 +(col_idx - i)*size) < *(&s2 +(col_idx - i)*size) – Kafka Feb 28 '20 at 14:50
  • "that allow some theorical speed up" frankly that sounds like a very vague basis to use the "wrong" data type. Can you explain why you expect speed up here? In any case, when there are such constraints you should mention them in the question – 463035818_is_not_an_ai Feb 28 '20 at 14:50
  • @Kafka *that allow some theorical speed up, and I can't change all the approch in the whole code.* -- Honestly, your speedup looks more like it will "slow down". All of those shenanigans you're doing with pointers will eventually, if not already, render the compiler's optimization opportunities null and void. Thus your code may run *slower* than what the compiler would have produced. – PaulMcKenzie Feb 28 '20 at 14:54

1 Answers1

3

Honestly, your approach looks rather complicated to me. Most of its complexity is due to the fact that you have rows in your code but they are present only implicitly. Making stuff explicit in code not only helps to increase readability but also makes code easier to write.

Lets say you use std::array<std::string,3> for rows, then your code could be as leightweight as this:

#include <vector>
#include <array>
#include <algorithm>
#include <iostream>

int main() {
    using row_t = std::array<std::string,3>;
    std::vector<row_t> vec = { {"1","8","1"},{"2","3","2"},{"3","5","5"},{"2","5","6"}};

    std::sort(vec.begin(),vec.end(),[](const row_t& a, const row_t& b) { return a[2] < b[2]; });

    for (const auto& row : vec) {
        for (const auto& e : row) std::cout << e << " ";
        std::cout << '\n';
    }
}

Output:

1 8 1 
2 3 2 
3 5 5 
2 5 6 

Ok, that's probably a good approch for this problem, and maybe it's what should I do, but I can't pass 2 months to change all the code ...

You could have made the requirements more clear in the question. I think if you have 10k lines of code that depend on this particular issue using a flat vector, when a different data structure would be more appropriate, then you have a bigger problem than how to sort rows. Anyhow...

Using a flat std::vector is usually not a bad idea. What I miss from your code is something along the line of

template <int stride>
std::string& access_at(std::vector<std::string>& vec,size_t row,size_t col) { 
    return vec[ row * stride + col ]; 
}
template <int stride>
const std::string& access_at(const std::vector<std::string>& vec,size_t row,size_t col) { 
    return vec[ row * stride + col ]; 
}

That lets you iterate the table like this:

for (size_t i=0;i < vec.size()/3;++i) {
    for (size_t j=0;j<3;++j) {
        std::cout << access_at<3>(vec,i,j) << " ";
    }
    std::cout << '\n';        
}

Next I am going to shamelessly steal and modify code from this answer. The basic idea is to sort a vector of indices instead of sorting the vector directly:

using index_t = std::vector<size_t>;
template <int stride>
index_t make_sorted_index(const std::vector<std::string>& values,size_t col) {
    index_t index(values.size() / stride);
    std::iota(index.begin(), index.end(), 0);
    std::sort(index.begin(), 
              index.end(), 
              [&values,&col](size_t a, size_t b) { 
                  return access_at<stride>(values,a,col) < access_at<stride>(values,b,col); 
              } 
    );
    return index;
}

Once you have that, the loop to print the sorted table need only a minor modification:

for (size_t i=0;i < vec.size()/3;++i) {
    for (size_t j=0;j<3;++j) {
        std::cout << access_at<3>(vec,index[i],j) << " ";
    }
    std::cout << '\n';        
}

Putting everything together:

#include <vector>
#include <numeric>
#include <algorithm>
#include <iostream>

template <int stride>
std::string& access_at(std::vector<std::string>& vec,size_t row,size_t col) { return vec[ row * stride + col ]; }
template <int stride>
const std::string& access_at(const std::vector<std::string>& vec,size_t row,size_t col) { return vec[ row * stride + col ]; }

using index_t = std::vector<size_t>;
template <int stride>
index_t make_sorted_index(const std::vector<std::string>& values,size_t col) {
    index_t index(values.size() / stride);
    std::iota(index.begin(), index.end(), 0);
    std::sort(index.begin(), 
              index.end(), 
              [&values,&col](size_t a, size_t b) { return access_at<stride>(values,a,col) < access_at<stride>(values,b,col); } 
    );
    return index;
}


int main() {
    std::vector<std::string> vec = { "1","8","1","2","3","2","3","5","5","2","5","6"};
    for (size_t i=0;i < vec.size()/3;++i) {
        for (size_t j=0;j<3;++j) {
            std::cout << access_at<3>(vec,i,j) << " ";
        }
        std::cout << '\n';        
    }
    std::cout << '\n';
    auto index = make_sorted_index<3>(vec,1);
    for (size_t i=0;i < vec.size()/3;++i) {
        for (size_t j=0;j<3;++j) {
            std::cout << access_at<3>(vec,index[i],j) << " ";
        }
        std::cout << '\n';        
    }
}

With output:

1 8 1 
2 3 2 
3 5 5 
2 5 6 

2 3 2 
3 5 5 
2 5 6 
1 8 1 

I'll leave it to you to actually copy the vector to get the sorted one, if you really need that.

PS: In the first version above I sorted with respect to column C, the last part sorts with respect to B as requested.

PPS: I still dont understand your code. I don't understand why you have std::cout inside the predicate and to be honest I have no clue how your call to sort is supposed to achieve what you want.

463035818_is_not_an_ai
  • 109,796
  • 11
  • 89
  • 185
  • Ok, that's probably a good approch for this problem, and maybe it's what should I do, but I can't pass 2 months to change all the code ... – Kafka Feb 28 '20 at 14:58