Abstract: I need to copy all elements of a struct containing a float array into a byte buffer in order to send it out via UART. The next call of malloc() after the copy operation always leads to a hard fault, which is a good indicator that memory gets corrupted somewhere, but I have no clue where this could happen (after 2 days of debugging ...).
Description: I have a nested typedef that contains a float array:
#define DRV_SCALE_MAXSZ 32
#define DRV_CHANNELS 2
typedef struct {
    float x1;
    float step;
    uint8_t aSz;
    float Y[DRV_SCALE_MAXSZ];
} DRV_linscale_TDS;
typedef struct {
    DRV_linscale_TDS scale;
    uint32_t active;
} DRV_ChScale_TDS;
DRV_ChScale_TDS DRV_scale[DRV_CHANNELS] = {0,}; // Channel Scales
And I need to copy the whole content of either DRV_scale[0] or DRV_scale[1] into a byte buffer in order to send it out via UART.
As a little extra complication, I copy it element by element with a copy function that reverses the bytes of the value if necessary:
#define TXBUFSZ 255
volatile uint8_t TxBuf[TXBUFSZ] = {0,};
void FillTxBuf(uint8_t idx, uint8_t *pBo) {
    if(idx < DRV_CHANNELS) {
        volatile uint8_t *pDst = TxBuf;
        *pDst++ = DRV_SCALE_MAXSZ;
        *pDst++ = DRV_scale[idx].active;
        pDst += COM_ElementCopyU32((uint8_t*)&DRV_scale[idx].scale.x1, pDst, pBo);
        pDst += COM_ElementCopyU32((uint8_t*)&DRV_scale[idx].scale.step, pDst, pBo);
        *pDst++ = DRV_scale[idx].scale.aSz;
        uint8_t i = *pDst;
        float *pSrc = DRV_scale[idx].scale.Y;
        while(i--) {
            pDst += COM_ElementCopyU32((uint8_t*)pSrc, pDst, pBo);
            pSrc++;
        }
    }
}
Note: the code above is a shortened version just for explanation. In reality, TxBuf[TXBUFSZ] is a static, preallocated byte buffer (declared extern in the header file and defined in the .c file).
The function COM_ElementCopyU32 looks like this:
uint8_t COM_ElementCopyU32(volatile uint8_t* pSrc, volatile uint8_t* pDst, uint8_t* ByteOrder) {
    // @brief copy data from Source to Destination and reverse the bytes if necessary
    // @param U8* pSrc: Pointer to data Source Buffer
    // @param U8* pDst: Pointer to Destination Buffer
    // @param U8* ByteOrder: Pointer to requested byte order, 0 = little endian, 1 = big endian
    // @return U8 number of copied bytes
    if(pSrc && pDst) {
        if(*ByteOrder != isBigEndian) {
            pDst[0] = pSrc[3];
            pDst[1] = pSrc[2];
            pDst[2] = pSrc[1];
            pDst[3] = pSrc[0];
        } else {
            pDst[0] = pSrc[0];
            pDst[1] = pSrc[1];
            pDst[2] = pSrc[2];
            pDst[3] = pSrc[3];
        }
    }
    return(sizeof(uint32_t));
}
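For readers who want to try the byte-order handling on a host machine, here is a minimal, self-contained sketch of the same idea. The names HostIsBigEndian and ElementCopyU32 are hypothetical, and the run-time endianness probe is an assumption, since the post does not show how isBigEndian is defined.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns 1 on a big-endian host, 0 on a little-endian one. */
uint8_t HostIsBigEndian(void) {
    const uint16_t probe = 0x0102;
    uint8_t first;
    memcpy(&first, &probe, 1);          /* byte stored at the lowest address */
    return (uint8_t)(first == 0x01);
}

/* Copy one 32-bit element to dst. The bytes are reversed when the requested
 * order (0 = little endian, 1 = big endian) differs from the host order.
 * Returns the number of bytes written. */
uint8_t ElementCopyU32(const void *src, uint8_t *dst, uint8_t wantBigEndian) {
    uint8_t tmp[4];
    memcpy(tmp, src, sizeof tmp);       /* avoids aliasing and alignment issues */
    if ((wantBigEndian != 0) != (HostIsBigEndian() != 0)) {
        dst[0] = tmp[3];
        dst[1] = tmp[2];
        dst[2] = tmp[1];
        dst[3] = tmp[0];
    } else {
        memcpy(dst, tmp, sizeof tmp);
    }
    return (uint8_t)sizeof(uint32_t);
}
```

With a source value of 0x11223344, requesting big endian should put 0x11 first in the destination and requesting little endian should put 0x44 first, regardless of the host's own byte order.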
The issue: as soon as the line
pDst += COM_ElementCopyU32((uint8_t*)pSrc, pDst, pBo);
is involved, the call of FillTxBuf() leads to a hard fault at the next call of malloc(). The next malloc() comes immediately after FillTxBuf(), when the CRC32 is appended to the byte stream. The general workflow is: check the incoming request, fill the Tx buffer, append the CRC32 and send it out.
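To make the "append the CRC32" step concrete, the following is a rough sketch only. It assumes the common reflected CRC-32 (polynomial 0xEDB88320, initial value and final XOR 0xFFFFFFFF) and little-endian placement of the checksum; the real project may use a different variant. Crc32 and AppendCrc32 are hypothetical names, and this version works in place on the buffer instead of calling malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32, reflected polynomial 0xEDB88320 (assumed variant). */
uint32_t Crc32(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all ones if the lowest bit is set, otherwise zero */
            uint32_t mask = (uint32_t)0 - (crc & 1u);
            crc = (crc >> 1) ^ (0xEDB88320u & mask);
        }
    }
    return ~crc;
}

/* Hypothetical helper: appends the CRC of buf[0..len) behind the payload,
 * least significant byte first. Returns the new length, or 0 if the
 * checksum does not fit into the buffer capacity cap. */
size_t AppendCrc32(uint8_t *buf, size_t len, size_t cap) {
    if (len > cap || cap - len < sizeof(uint32_t)) {
        return 0;
    }
    uint32_t crc = Crc32(buf, len);
    buf[len + 0] = (uint8_t)(crc & 0xFFu);
    buf[len + 1] = (uint8_t)((crc >> 8) & 0xFFu);
    buf[len + 2] = (uint8_t)((crc >> 16) & 0xFFu);
    buf[len + 3] = (uint8_t)((crc >> 24) & 0xFFu);
    return len + sizeof(uint32_t);
}
```

For the ASCII string "123456789" this CRC variant yields the well-known check value 0xCBF43926, which is a handy way to verify the routine on a PC before running it on the target.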
What have I tried to solve this so far? Well, I tried a lot:
- I removed the line mentioned above. As long as no bytes are copied from DRV_scale[idx].scale.Y to TxBuf[] (i.e. the copy in the while loop is disabled), everything works fine.
- I replaced float *pSrc = DRV_scale[idx].scale.Y; with float *pSrc = DebugArray; where DebugArray is a "stand-alone" static preallocated float array of the same size as DRV_scale[idx].scale.Y (32 elements), and everything works fine.
- I tried to copy the elements from DRV_scale[idx].scale.Y to another float array (let's call it "DupArray"), which worked fine, but when I tried to copy "DupArray" bytewise into TxBuf[], it crashed.
- I tried to copy the elements from DRV_scale[idx].scale.Y to TxBuf[] in another function, right after hardware initialisation, using the same code (copy & paste), and it worked fine.
- I tried several versions of the DRV_linscale_TDS typedef, with the byte variable at the end and at the beginning, with no effect.
- I checked whether there is a buffer overflow in the while loop, but as expected there is none: the total number of copied bytes is ~100, so there are 155 bytes "free" (note: the overrun prevention code is still in the original code but left out here for better readability).
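Since the overrun prevention is left out of the shortened listing, here is one way such a check could look. This is only a sketch under assumptions: CopyFloatsBounded is a hypothetical helper, it copies in host byte order without the byte reversal, and it is not the check used in the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TXBUFSZ 255

/* Hypothetical helper: copies count floats from src to dst, starting at
 * offset *used, only if they fit completely into a buffer of cap bytes.
 * Returns 1 and advances *used on success; returns 0 and leaves both the
 * buffer and *used untouched if the data would not fit. */
int CopyFloatsBounded(uint8_t *dst, size_t cap, size_t *used,
                      const float *src, size_t count) {
    if (*used > cap || count > (cap - *used) / sizeof(float)) {
        return 0;                        /* would overrun the buffer */
    }
    memcpy(dst + *used, src, count * sizeof(float));
    *used += count * sizeof(float);
    return 1;
}
```

Passing the element count in explicitly keeps the loop bound independent of whatever bytes happen to be stored in the destination buffer, and the return value makes an attempted overrun visible in the debugger.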
I have no clue what's going on here. Each part of the code works fine when I debug it separately. Copying the original array to another preallocated float array works fine; copying the float array to a byte array and writing it back works fine. Only when I do exactly what works fine everywhere else in that particular function does it generate a hard fault.
All the testing and debugging points clearly to one thing: the hard fault only happens when I try to copy DRV_scale[idx].scale.Y into TxBuf[]; everything else works without problems.
One might say: well, then TxBuf[] gets corrupted somewhere before FillTxBuf(). But why does everything work flawlessly in FillTxBuf() when I use a different float array than DRV_scale[idx].scale.Y?
Remarks: One possible workaround would most probably be to split up the struct and use separate preallocated "stand-alone" float arrays. The reason why I glued it together in one variable is that this variable is written to flash, and I'd really like the approach FlashWrite(VariablePointer, SizeInBytes) ... If there is no other option, I will have to separate it, but I'd really like to understand which pitfall I stumbled into ...
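Regarding the FlashWrite(VariablePointer, SizeInBytes) approach, the sketch below shows how the in-memory size of the combined struct can be compared with the number of payload bytes that an element-by-element copy produces. PaddingAfterASz and PayloadBytes are hypothetical helpers introduced only for illustration; the actual padding depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DRV_SCALE_MAXSZ 32

typedef struct {
    float x1;
    float step;
    uint8_t aSz;
    float Y[DRV_SCALE_MAXSZ];
} DRV_linscale_TDS;

typedef struct {
    DRV_linscale_TDS scale;
    uint32_t active;
} DRV_ChScale_TDS;

/* Padding bytes the compiler inserted between aSz and Y. */
size_t PaddingAfterASz(void) {
    return offsetof(DRV_linscale_TDS, Y)
         - (offsetof(DRV_linscale_TDS, aSz) + sizeof(uint8_t));
}

/* Bytes occupied by the members themselves, without any padding. */
size_t PayloadBytes(void) {
    return 2 * sizeof(float)                    /* x1, step */
         + sizeof(uint8_t)                      /* aSz      */
         + DRV_SCALE_MAXSZ * sizeof(float)      /* Y        */
         + sizeof(uint32_t);                    /* active   */
}
```

On a typical 32-bit target with 4-byte floats one would expect 3 padding bytes after aSz, so sizeof(DRV_ChScale_TDS) is larger than the payload. A whole-struct flash write therefore stores those padding bytes too, while the element-wise UART copy does not send them.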
The Question: Where could I search?