I have a representation of a large bit matrix and would like to efficiently retrieve the majority bit for each matrix column, i.e. the bit value that occurs most often in that column. The background is that the matrix rows represent ORB feature descriptors, and the value I'm looking for resembles the mean in the Hamming domain.
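To make the target concrete, here is a minimal sketch of the operation on plain byte vectors, without OpenCV. The name `majorityBits` is hypothetical and only used for illustration; the sketch assumes MSB-first bit order within each byte and that a tie between 0s and 1s resolves to 1.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper for illustration: returns the packed majority row of a
// bit matrix whose rows are byte arrays of equal length.
// Assumptions: bits are MSB-first within each byte, and ties resolve to 1.
std::vector<std::uint8_t> majorityBits(
    const std::vector<std::vector<std::uint8_t>>& rows)
{
    if (rows.empty()) return {};

    const std::size_t bytes = rows[0].size();
    std::vector<int> sum(bytes * 8, 0);

    // Count the set bits of every bit column.
    for (const auto& row : rows)
        for (std::size_t j = 0; j < bytes; ++j)
            for (int b = 0; b < 8; ++b)
                sum[j * 8 + b] += (row[j] >> (7 - b)) & 1;

    // ceil(rows / 2): a bit is set if at least half of the rows have it set.
    const int threshold = (static_cast<int>(rows.size()) + 1) / 2;

    std::vector<std::uint8_t> out(bytes, 0);
    for (std::size_t i = 0; i < sum.size(); ++i)
        if (sum[i] >= threshold)
            out[i / 8] |= static_cast<std::uint8_t>(1u << (7 - (i % 8)));
    return out;
}
```

For the three one-byte rows `0xF0`, `0xCC` and `0xAA`, the per-column counts are 3, 2, 2, 1, 2, 1, 1, 0, so the majority byte is `0xE8`.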
The implementation I'm currently working with looks like this:
    // holds the column sum for each bit (32 bytes per descriptor, 8 bits each)
    std::vector<int> sum(32 * 8, 0);
    // cv::Mat mat is a CV_8U matrix of values ∈ [0, 255] filled elsewhere,
    // one descriptor per row
    for (int i = 0; i < mat.rows; ++i)
    {
        const cv::Mat &d = mat.row(i);
        const unsigned char *p = d.ptr<unsigned char>();
        // count bits set column-wise
        for (int j = 0; j < d.cols; ++j, ++p)
        {
            if (*p & (1 << 7)) ++sum[j * 8];
            if (*p & (1 << 6)) ++sum[j * 8 + 1];
            if (*p & (1 << 5)) ++sum[j * 8 + 2];
            if (*p & (1 << 4)) ++sum[j * 8 + 3];
            if (*p & (1 << 3)) ++sum[j * 8 + 4];
            if (*p & (1 << 2)) ++sum[j * 8 + 5];
            if (*p & (1 << 1)) ++sum[j * 8 + 6];
            if (*p & (1)) ++sum[j * 8 + 7];
        }
    }
    cv::Mat mean = cv::Mat::zeros(1, 32, CV_8U);
    unsigned char *p = mean.ptr<unsigned char>();
    // threshold is ceil(rows / 2)
    const int N2 = mat.rows / 2 + mat.rows % 2;
    for (size_t i = 0; i < sum.size(); ++i)
    {
        if (sum[i] >= N2)
        {
            // set bit in mean only if the corresponding matrix column
            // contains at least as many 1s as 0s (ties count as 1)
            *p |= 1 << (7 - (i % 8));
        }
        if (i % 8 == 7) ++p;
    }
The bottleneck is the big loop with all the bit shifting. Is there a known bit-twiddling trick, or any other way, to make this faster?