10

This phenomenon is found when I programmed for the LeetCode problem N-Queens.

I have two versions of accepted code, the only difference between which is the way I stored the hash table, one is using vector<int> and the other is using vector<bool>. To be specific, the two versions of code are as follows:

Version 1, vector<int>, Running Time: 4 ms
class Solution {
public:
void dfs(vector<string>& crtrst, vector<vector<string>>& finalrsts, int row, vector<int>& mup, vector<int>& m45dgr, vector<int>& m135dgr)
{
    int n = crtrst.size();
    
    if (row == n)
    {
        finalrsts.push_back(crtrst);
        return;
    }
    
    for (int j=0; j<n; j++)
    {
        if (mup[j] && m45dgr[j-row+n-1] && m135dgr[row+j])
        {
            crtrst[row][j] = 'Q';
            mup[j] = m45dgr[j-row+n-1] = m135dgr[row+j] = 0;
        
            dfs(crtrst,finalrsts,row+1,mup,m45dgr,m135dgr);
        
            crtrst[row][j] = '.';
            mup[j] = m45dgr[j-row+n-1] = m135dgr[row+j] = 1;
        }
    }
}

vector<vector<string>> solveNQueens(int n) 
{
    vector<vector<string>> finalrsts;
    vector<string> crtrst(n,string(n,'.'));
    vector<int> mup(n,1);
    vector<int> m45dgr(2*n-1,1); // degree 45: '\'
    vector<int> m135dgr(2*n-1,1); // degree 135: '/';
    int row = 0;
    dfs(crtrst,finalrsts,row,mup,m45dgr,m135dgr);
    return finalrsts;
}
};
Version 2, vector<bool>, Running time: 12 ms
class Solution {
public:
void dfs(vector<string>& crtrst, vector<vector<string>>& finalrsts, int row, 
    vector<bool>& mup, vector<bool>& m45dgr, vector<bool>& m135dgr)
{
    int n = crtrst.size();
    
    if (row == n)
    {
        finalrsts.push_back(crtrst);
        return;
    }
    
    for (int j=0; j<n; j++)
    {
        if (mup[j] && m45dgr[j-row+n-1] && m135dgr[row+j])
        {
            crtrst[row][j] = 'Q';
            mup[j] = m45dgr[j-row+n-1] = m135dgr[row+j] = false;
        
            dfs(crtrst,finalrsts,row+1,mup,m45dgr,m135dgr);
        
            crtrst[row][j] = '.';
            mup[j] = m45dgr[j-row+n-1] = m135dgr[row+j] = true;
        }
    }
}

vector<vector<string>> solveNQueens(int n) 
{
    vector<vector<string>> finalrsts;
    vector<string> crtrst(n,string(n,'.'));
    vector<bool> mup(n,true);
    vector<bool> m45dgr(2*n-1,true); // degree 45: '\'
    vector<bool> m135dgr(2*n-1,true); // degree 135: '/';
    int row = 0;
    dfs(crtrst,finalrsts,row,mup,m45dgr,m135dgr);
    return finalrsts;
}
};

As I know that, vector<bool> stores each element using 1 bit rather than a bool variable (may be 2 Bytes), and vector<int> stores each element using 4 Bytes. So vector<bool> seems tinier than vector<int>. However, why it is slower than vector<int>?

Community
  • 1
  • 1
C. Wang
  • 2,516
  • 5
  • 29
  • 46
  • 4
    Intuitively, it takes *time* to pack/unpack a bit from compressed storage, so I would *expect* `std::vector` to outperform `std::vector`, especially since what the CPU actually works with are registers (`int`-sized or larger). – Angew is no longer proud of SO Sep 28 '15 at 10:50
  • May I know what kind of compiler are you using? – DawidPi Sep 28 '15 at 11:05
  • This is a good example as to how programmer intuition is frequently wrong when it comes to predicting performance behaviour. – Andre Kostur Sep 28 '15 at 13:50
  • @DawidPi, I reported the time according to LeetCode Online Judge. So I don't know what compiler it uses. – C. Wang Sep 28 '15 at 15:44

2 Answers2

12

Access to single bits is usually slower than to complete addressable units (bytes in the lingo of C++). For example, to write a byte, you just issue a write instruction (mov on x86). To write a bit, you need to load the byte containing it, use bitwise operators to set the right bit within the byte, and then store the resulting byte.

The compact size of a bit vector is nice for storage requirements, but it will result in a slowdown except when your data becomes large enough that caching issues play a role.

If you want to have speed and still be more efficient than 4 bytes per value, try a vector<unsigned char>.

Sebastian Redl
  • 69,373
  • 8
  • 123
  • 157
  • Why `vector` will be faster than `vector` ? I tested the differences with my desktop (both `unsigned char` and `bool` evaluated to `1` byte ), result of `vector` is 3 times slower than `vector`. I can't figure out the details since both of them are 1 byte. – Bin Jul 03 '16 at 16:37
  • 3
    No, they're not. Specifically, `vector` doesn't store each bool as an individual byte; instead it packs them together as bits. That makes it more storage-efficient, but slower. That's the entire point of the question. – Sebastian Redl Jul 03 '16 at 18:16
3

My interpretation:

There's a vector<type> specialization for bool, which is a bit map; that's highly efficient with regard to storage (1Byte of vector storage = 8 bool), but worse at actually accessing the data; for any other vector, you can just access the element at the nth position by *([address of first element + n * sizeof(element)]), but for the getting bool-out-of-byte storage, you'll inevitably will need to do some bit operations, which, in times of large caches, might be significantly slower than just having an array of ints, representing one truth value each.

Marcus Müller
  • 34,677
  • 4
  • 53
  • 94