In case you don't want to use the C++ std::bitset for performance reasons, your code can be rewritten like this:
#include <cstdio>
#include <cstdint>
// buffer definition
constexpr size_t clustersTotal = 83;
constexpr size_t clustersTotalBytes = (clustersTotal+7)>>3; //ceiling(n/8)
uint8_t clustersSet[clustersTotalBytes] = {0x07, 0};
// clusters 0, 1 and 2 are already set (just for show)
// helper constant bit masks for faster bit setting
// could be extended to uint64_t and array of qwords on 64b architectures
// but I couldn't be bothered to write all masks by hand.
// also I wonder when these lookup tables would become large enough
// to disturb cache locality, so shifting in code would be faster.
const uint8_t bitmaskStarting[8] = {0xFF, 0xFE, 0xFC, 0xF8, 0xF0, 0xE0, 0xC0, 0x80};
const uint8_t bitmaskEnding[8] = {0x01, 0x03, 0x07, 0x0F, 0x1F, 0x3F, 0x7F, 0xFF};
constexpr uint8_t bitmaskFull = 0xFF;
int main()
{
    // Input values
    size_t firstCluster = 6;
    size_t sizeCluster = 16;

    // set bits (like "void setBits(size_t firstIndex, size_t count);" )
    auto lastCluster = firstCluster + sizeCluster - 1;
    printf("From cluster %zu, size %zu => last cluster is %zu\n",
        firstCluster, sizeCluster, lastCluster);
    if (0 == sizeCluster || clustersTotal <= lastCluster)
        return 1; // Invalid input values

    auto firstClusterByte = firstCluster>>3; // div 8
    auto firstClusterBit = firstCluster&7;   // remainder (mod 8)
    auto lastClusterByte = lastCluster>>3;
    auto lastClusterBit = lastCluster&7;

    if (firstClusterByte < lastClusterByte) {
        // Set the first byte of sequence (by mask from lookup table (LUT))
        clustersSet[firstClusterByte] |= bitmaskStarting[firstClusterBit];
        // Set bytes between first and last (simple 0xFF - all bits set)
        while (++firstClusterByte < lastClusterByte)
            clustersSet[firstClusterByte] = bitmaskFull;
        // Set the last byte of sequence (by mask from ending LUT)
        clustersSet[lastClusterByte] |= bitmaskEnding[lastClusterBit];
    } else { // firstClusterByte == lastClusterByte special case
        // Intersection of starting/ending LUT masks is set
        clustersSet[firstClusterByte] |=
            bitmaskStarting[firstClusterBit] & bitmaskEnding[lastClusterBit];
    }

    for (size_t i = 0; i < clustersTotalBytes; ++i)
        printf("0x%X ", clustersSet[i]); // Your debug display of buffer
    return 0;
}
Unfortunately I didn't profile either version (yours vs. mine), so I have no idea how good the optimized compiler output is in either case. In the age of lame C compilers and 386-586 processors my version would have been much faster. With a modern C++ compiler the LUT usage can be a bit counterproductive, but unless somebody proves me wrong with some profiling results, I still think my version is considerably more efficient.
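If somebody wants to produce those profiling results, a rough micro-benchmark sketch with <chrono> could look like this (setBitsLUT and setBitsBitset are hypothetical wrappers around the two variants; make sure the resulting buffer is actually read afterwards, otherwise the compiler may optimize the whole loop away):

#include <chrono>
#include <cstdio>

template <typename F>
long long runNs(F &&work, int repeats)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repeats; ++i)
        work();                       // repeat the variant under test
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}

// hypothetical wrappers around the two implementations:
// printf("LUT variant:    %lld ns\n", runNs([]{ setBitsLUT(6, 16); }, 1000000));
// printf("bitset variant: %lld ns\n", runNs([]{ setBitsBitset(6, 16); }, 1000000));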
That said, as writing to the file system is probably involved ahead of this, setting the bits will likely take about 0.1% of the CPU time even with your variant; the I/O waiting will be the major factor.
So I'm posting this more as an example of how things can be done in a different way.
Edit:
Also, if you believe in the clib optimization, this loop:
// Set bytes between first and last (simple 0xFF - all bits set)
while (++firstClusterByte < lastClusterByte)
clustersSet[firstClusterByte] = bitmaskFull;
can reuse the clib memset magic:
//#include <cstring>
// Set bytes between first and last (simple 0xFF - all bits set)
if (++firstClusterByte < lastClusterByte)
    memset(clustersSet + firstClusterByte, bitmaskFull, lastClusterByte - firstClusterByte);
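If you would rather stay with the C++ standard library, std::fill_n expresses the same intent, and compilers commonly lower such a byte fill to the same memset anyway; a minimal sketch assuming the same variables as above:

//#include <algorithm>
// Set bytes between first and last (simple 0xFF - all bits set)
if (++firstClusterByte < lastClusterByte)
    std::fill_n(clustersSet + firstClusterByte, lastClusterByte - firstClusterByte, bitmaskFull);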