
I'm reading through the internals of the SeqAn library (which handles biology-specific file formats and data structures) and I've come across what must be a C++ idiom that I don't quite understand.

There's a unique id variable record.rID that is an __int32. A pointer to it gets passed to another function that reads a bunch of data from a file and mutates the id.

Here's the call:

res = streamReadBlock(reinterpret_cast<char *>(&record.rID), stream, 4);

Here's the function implementation:

inline size_t
streamReadBlock(char * target, Stream<Bgzf> & stream, size_t maxLen)
{
    if (!(stream._openMode & OPEN_RDONLY))
        return 0;  // File not open for reading.

    // Memoize number of read bytes and pointer into the buffer target.
    size_t bytesRead = 0;
    char * destPtr = target;

    // Read at most maxLen characters, each loop iteration corresponds to reading the end of the first, the beginning of
    // the last or the whole "middle" buffers.  Of course, the first and only iteration can also only read parts of the
    // first buffer.
    while (bytesRead < maxLen)
    {
        // If there are no more bytes left in the current block then read and decompress the next block.
        int available = stream._blockLength - stream._blockOffset;
        if (available <= 0)
        {
            if (_bgzfReadBlock(stream) != 0)
                return -1;  // Could not read next block.
            available = stream._blockLength - stream._blockOffset;
            if (available <= 0)
                break;
        }

        // Copy out the number of bytes to be read or the number of available bytes in the next buffer, whichever number
        // is smaller.
        int copyLength = std::min(static_cast<int>(maxLen - bytesRead), available);
        char * buffer = &stream._uncompressedBlock[0];
        memcpy(destPtr, buffer + stream._blockOffset, copyLength);

        // Advance to next block.
        stream._blockOffset += copyLength;
        destPtr += copyLength;
        bytesRead += copyLength;
    }

    // If we read to the end of the block above then switch the block address to the next block and mark it as unread.
    if (stream._blockOffset == stream._blockLength)
    {
        stream._blockPosition = tell(stream._file);
        stream._blockOffset = 0;
        stream._blockLength = 0;
    }

    return bytesRead;
}

Doing a bit of tracing, I can see that record.rID gets assigned in there, presumably at the call memcpy(destPtr, buffer + stream._blockOffset, copyLength). However, I don't understand what is going on or how a meaningful record ID ends up being assigned (I don't have much experience with this kind of deserialization code).

daj
  • I'm guessing the first 4 bytes of stream._uncompressedblock holds the id – James Mar 09 '14 at 23:17
  • dumb question - if I'm stepping through `streamReadBlock`, what should I be watching to monitor the state of `record.rID`? I'm a little lost between the casting, referencing and dereferencing. I was guessing `static_cast<__int32>(*target)` but it doesn't seem to give me the value that I get upon exiting the function. – daj Mar 09 '14 at 23:25
  • record.rID is not passed by reference. The ampersand in this context means "the address of", and its address is cast to a char pointer, which in turn is passed by value. – JonPall Mar 09 '14 at 23:45
  • oh right sorry about that. Question about watching record.rID state remains though... – daj Mar 10 '14 at 00:37

1 Answer


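It's a clever way of writing to an int. By casting the address of record.rID to a char pointer, you can write bytes directly into the integer's storage with memcpy. The call passes 4 as maxLen, so the function copies four raw bytes from the decompressed block over the four bytes of record.rID.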
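To make the idiom concrete, here is a minimal sketch that leaves the BGZF stream out entirely. Record, readBlock and peekInt32 are hypothetical names invented for this illustration, not part of SeqAn, and std::int32_t stands in for __int32. readBlock only imitates the memcpy step of streamReadBlock by copying from an in-memory buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical record type; only the 32-bit id field matters here.
struct Record
{
    std::int32_t rID;
};

// Hypothetical stand-in for streamReadBlock: copies len raw bytes from an
// in-memory source into target and returns the number of bytes copied.
inline std::size_t readBlock(char * target, const char * source, std::size_t len)
{
    assert(target != nullptr && source != nullptr);
    std::memcpy(target, source, len);
    return len;
}

// Reassembles a 32-bit integer from the four bytes that target points to.
// Looking at *target alone would show only the first of those bytes.
inline std::int32_t peekInt32(const char * target)
{
    std::int32_t value;
    std::memcpy(&value, target, sizeof value);
    return value;
}
```

A call such as readBlock(reinterpret_cast<char *>(&record.rID), bytes, 4) overwrites the storage of record.rID byte by byte, so the resulting value depends on the byte order of the data matching that of the machine. This is probably also why static_cast<__int32>(*target) looked wrong in the debugger: *target is a single char, so the cast converts only the first byte. Watching *reinterpret_cast<__int32 *>(target), or calling a helper like peekInt32 above, shows the whole integer, and it holds the final value only once all four bytes have been copied.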

JonPall