C++ variable changed during interrupt resets after interrupt

Question

I'm trying to use a flag to see if an event has finished.

Via the debugger, I have seen that the interrupt triggers correctly and does set the value of transmitCompleteI2c to 1. When I return to the if statement which checks whether or not the flag has been set to 1, it has been reset to 0.

The only 2 locations where i alter the value of transmitCompleteI2c are after the if statement and in the interrupt routine.

I'm running the following bits of code. The declaration of transmitCompleteI2c is done in the header file of the class. fireEvent() is a member function of the class I2c.

volatile uint8_t transmitCompleteI2c;

void I2c::foo() {
    if (transmitCompleteI2c) {
        transmitCompleteI2c = 0;

        // trigger event which sets transmitComplete back to 0 when done
        fireEvent();
    }
}

void I2c::sendHandler(XIic *InstancePtr) {
    transmitCompleteI2c = 1;
}

After some extensive digging, it turned out that the address the interrupt routine writes a transmitCompleteI2c to is not the same as the variable in the class. Even after renaming the variable, it is still happening. Below are the header file and source code of my class.

Header file:

#define SLAVE_ADDRESS   0xA0/2
#define SEND_COUNT      16
#define RECEIVE_COUNT   16
#define IIC_INTR_ID     XPAR_FABRIC_AXI_IIC_0_IIC2INTC_IRPT_INTR
#define IIC_DEVICE_ID   XPAR_AXI_IIC_0_DEVICE_ID
#define INTC_DEVICE_ID  XPAR_SCUGIC_SINGLE_DEVICE_ID


class I2c{
private:
    void sendHandler(void* InstancePtr, int);
    void receiveHandler(XIic* InstancePtr, int);
    void statusHandler(XIic* InstancePtr, int);

    XIic_Config* configPtr;
    const XScuGic& intcInterrupt;
    XIic iicDevice;
    int status;
    uint8_t writeBuffer[SEND_COUNT];
    uint8_t readBuffer[RECEIVE_COUNT];
    volatile uint8_t transmitCompleteI2c, receiveComplete;
    volatile bool test;    

public:
    I2c() : intcInterrupt{IntcInstance} {};
    uint8_t initialize();
    int writeData(u16 ByteCount);
    int readData(u8 *BufferPtr, u16 ByteCount);    

};

Implementation: Initialization function:

uint8_t I2c::initialize() {
    // init driver
    configPtr = XIic_LookupConfig(IIC_DEVICE_ID);
    XIic_CfgInitialize(&iicDevice, configPtr, configPtr->BaseAddress);


    //init interrupt system
    XScuGic_SetPriorityTriggerType((XScuGic*) &intcInterrupt, IIC_INTR_ID, 0xA0, 0x3);

    status = XScuGic_Connect((XScuGic*) &intcInterrupt, IIC_INTR_ID, (Xil_ExceptionHandler) XIic_InterruptHandler, &iicDevice);

    XScuGic_Enable((XScuGic*) &intcInterrupt, IIC_INTR_ID);

    Xil_ExceptionInit();
    Xil_ExceptionRegisterHandler(XIL_EXCEPTION_ID_IRQ_INT, (Xil_ExceptionHandler) XScuGic_InterruptHandler, &IntcInstance);

    Xil_ExceptionEnable();

    // set transmit flag
    transmitCompleteI2c = true;
    xil_printf("%08x\n", &transmitCompleteI2c);

    // attach interrupts to i2c device
    XIic_SetSendHandler(&iicDevice, &iicDevice, (XIic_Handler) (&I2c::sendHandler));
    XIic_SetRecvHandler(&iicDevice, &iicDevice, (XIic_Handler) &I2c::receiveHandler);
    XIic_SetStatusHandler(&iicDevice, &iicDevice, (XIic_StatusHandler) &I2c::statusHandler);

    // set slave address
    status = XIic_SetAddress(&iicDevice, XII_ADDR_TO_SEND_TYPE, SLAVE_ADDRESS);

    // start device
    status = XIic_Start(&iicDevice);
    if (status != XST_SUCCESS) {
        return XST_FAILURE;
    }

    return 0;
}

Write data function:

int I2c::writeData(u16 ByteCount) {

    xil_printf("%08x\n", &transmitCompleteI2c);

    /*
     * Set flag
     */
    transmitCompleteI2c = false;

    /*
     * Send the data
     */
    status = XIic_MasterSend(&iicDevice, &writeBuffer[0], 6);
    if (status != XST_SUCCESS) {
        xil_printf("%d\n", status);
        return XST_FAILURE;
    }

    return XST_SUCCESS;
}

Interrupt routine:

void I2c::sendHandler(void *InstancePtr, int byteCount) {
    transmitCompleteI2c = true;
    xil_printf("%08x\n", &transmitCompleteI2c);
}

Why are you using inverted logic for transmitComplete? Wouldn't it make more sense to declare it as bool, and set it to true after transmission is complete? Of course this won't actually help with your problem, but still... — H. Guijt, Jul 27 '16 at 10:41
@H.Guijt Yeah that's true. Changing it in the question and code for better readability — Etruscian, Jul 27 '16 at 10:42
This part of code doesn't show with evedence, is this thread safe? Such use of variable provoke hazards. — Jacek Cz, Jul 27 '16 at 10:51
@JacekCz I'm running a single thread application on an embedded arm on a single core. Right now the only code running is the code you see, with an initialize function to attach the interrupt routine to the interrupt generated by the event — Etruscian, Jul 27 '16 at 10:53
This seems relevant: http://stackoverflow.com/questions/9606725/linux-why-is-sig-atomic-t-typedefed-to-int — Andrew Henle, Jul 27 '16 at 11:00
@AndrewHenle although I see how the `=0` inside foo() may get interrupted halfway trough, I don't think the `=1` in interrupt may have a problem, ie. if he would set breakpoint inside the if in foo, and run the interrupt, at least first time it should work well and get into the if. Etruscian: can you set also breakpoints for memory change? Maybe you can make sure there's 1 stored in variable a set memory breakpoint to see how it gets zeroed. — Ped7g, Jul 27 '16 at 11:43
@AndrewHenle That does seem useful to take into account. I dug into the disassembler view of the debugger and there it showed that it only took 1 line of assembly code to alter or read the value of the variable. Can this still be the problem? — Etruscian, Jul 27 '16 at 11:44
@Ped7g The `=0` won't execute if the interrupt has not triggered, so it won't be interrupted halfway through. The moment I leave the `sendHandler` call, and return to the loop calling foo, the value has been reset. — Etruscian, Jul 27 '16 at 11:46
@Etruscian ... run interrupt .. set it to 1 ... finish interrupt... run foo() ... start setting it to 0 ... interrupt, setting it to 1, finish interrupt ... back to foo: finishing setting it to 0. (can't affect the first time 0->1->if(true) .. but the zeroing after first 1 can be affected .. in very rare case). I think on ARM the `volatile int` is safe bet (atomic), but that special type from Andrew is even better (didn't know it before). — Ped7g, Jul 27 '16 at 12:21
@Ped7g NOT AGREE. volatile has only meaning "pls not optimize in registers". It has nothing to do with synchronised access. This is very popular mistake uP programmers. — Jacek Cz, Jul 27 '16 at 13:23
@JacekCz Dare you explain? I mean on ARM. (I'm aware the PowerPC read ahead cache needs memory barriers even in case of single core, but on ARM the atomic memory value should work well for custom mutex/lock, even without using the mem barriers, although that's probably safer, if you use some optimization enabled in compiler) — Ped7g, Jul 27 '16 at 13:29
@Ped7g declaring it as atomic does not work. I did have it working for a brief amount of time, after which it stopped working literally without any change in the code. — Etruscian, Jul 27 '16 at 13:32
Etruscian: because volatile variable IS NOT atomic variable. Atomic can be single operation on (short) variable (get/probably set) but not sequence. Java/C# has keyword synchronized, working like You think, but C++ volatile NOT. @Ped7g My english isn't fluent, but this is general known. — Jacek Cz, Jul 27 '16 at 13:39
Have you tried using gdb to set a watch point on transmitComplete and see when/how it gets reset to 0? — Always Confused, Jul 27 '16 at 14:46
@JacekCz but `int` on most of the platforms *is* atomic (can't actually recall some where it is not). The OP is using `uint8_t`, that may be problem on ARM (on x86 it would be atomic). Did you ever code your own lock/mutex in assembler? — Ped7g, Jul 27 '16 at 14:50
@Etruscian: I have no idea what is the problem with your code, looks more complicated than broken atomicity (the discussion about atomicity is pretty much irrelevant, if it doesn't work at least "sometimes" randomly). Debug it (maybe on assembler level). — Ped7g, Jul 27 '16 at 14:53
@AlwaysConfused How do I add a watch point? I'm using an altered eclipse from Xilinx. I did find a gdb terminal and used `watch transmitComplete`, but it does not pop up anywhere — Etruscian, Jul 27 '16 at 15:05
@Ped7g atomic can be / probably is SINGLE operation. Sequence of code in situation of interrupt isn't automagically protected on 'atomic' basis. Interrupt can occur between. — Jacek Cz, Jul 27 '16 at 15:25
@JacekCz you are confusing atomic and mutex/lock/synchronization things together.. overall we speak different language, I'm not used to high level language concurrency worries, most of the time I was doing MT code in assembler solving synchronization on my own, so single volatile flag in memory is OK for me to sync by even in C++. But I know it may become tricky for others, so I'm not recommending it publicly. OP did choose to go that way on his own. — Ped7g, Jul 27 '16 at 15:53
@Etruscian wait, so can you watch at least some log, or something? Maybe you can emit logs in each function at the beginning, to see if the interrupt did run, and did set the value? Just keep in mind the emitting log may take some time, so if your interrupt is not re-entry protected, the debug logging may cause additional havoc. — Ped7g, Jul 27 '16 at 15:57
@Ped7g I don't know how to emit logs, but I used `printf` to check the value of `transmitComplete`. Inside the interrupt it is set to `true`, while in the main loop it remains `false`. I do have to stress that at some point, it did work with exactly the same code. And it remained stable for about half an hour before breaking again without changing the code. A simple restart broke the system. — Etruscian, Jul 28 '16 at 09:52
Yeah, that may be pretty much anything elsewhere (from simple putting printf in wrong place up to crazy things like lack of stack space or buffer overflow bug). Hard to tell without full binary dump (not even mentioning the source) and detailed platform description, or without debugging it step by step. Maybe there's some emulator capable to run your binary too, allowing you to step trough it in debugger? — Ped7g, Jul 28 '16 at 10:10
@Ped7g I have found the problem. The memory address of `transmitComplete` in the main loop is not the same of the memory address of `transmitComplete` in the interrupt. So it's setting a different variable to 1. This does not make sense, since it should all be in the same class. Any idea why it would declare another variable? — Etruscian, Jul 28 '16 at 12:24
post the real source first. BTW, you very likely eclipse the member variable with local function parameter, or local definition. — Ped7g, Jul 28 '16 at 12:58
@Ped7g source code added. I do have another class called SPI which does exactly the same (it is a carbon copy except for Iic being Spi and some minor adjustments for the differences in i2c and spi protocol) but gets instantiated multiple times. This does work flawlessly. — Etruscian, Jul 28 '16 at 14:24
@Etruscian doesn't make sense, can you add to the address print also address of `this`, so you make sure the same instance of `I2c` is used in both cases? Like: `xil_printf("%08x->%08x\n", this, &transmitCompleteI2c);` — Ped7g, Jul 28 '16 at 14:35
"The memory address of `transmitComplete` in the main loop is not the same of the memory address of `transmitComplete` in the interrupt." What is "the main loop"? Anyway, of course not, because the interrupt and most other segments of code here use **`transmitCompleteI2c`**. You realise, of course, that these two names are not the same? — underscore_d, Jul 28 '16 at 14:41
@underscore_d I changed the name of `transmitComplete` during this problem to see if it was interfering with other variables with the same name for some reason. It did not, but I did not change the names back afterwards. So yes, they are different in name, but the code below the horizontal line in the post is the actual implemented code. — Etruscian, Jul 28 '16 at 14:49
@Ped7g The results of that print are as follows: `0019344C->001934D8` for inside the init function, `00193454->001934E0` for inside the interrupt routine. It seems like it is indeed making a new instance for just the interrupt routine. — Etruscian, Jul 28 '16 at 14:56
It is not doing anything on it's own, YOU are doing new instance. (outside of the source you posted). So either use static variable, if you want to have two+ instances, or fix your instance owning mechanism, so the class responsible for owning I2c instance owns it only once. — Ped7g, Jul 28 '16 at 15:00
The file that owns the instance is the main.cc file. This file only has 1 instance of the class called `i2c`. In the `main()` routine, the class gets initialized by calling `i2c.initialize()`, after which a `while(true)` loop starts which calls `i2c.writeData(16)`. Here the 16 gets ignored by the method `writeData`. — Etruscian, Jul 28 '16 at 15:07
What are `XIic_blahBlahBlah`? You never `#include` any header for them. How does that work? Without seeing all the definitions for them all, how do we know that one of those functions doesn't take the received instance pointer, dereference it, copy the referred object, and pass a _copy_ to the handler? or any other such inadvertent mishap. — underscore_d, Jul 28 '16 at 15:08
@Etruscian so you think you are more correct than your CPU? :D Print then `&i2c` address too, so you can see which one of them is the "true" one. Then search for the other instance elsewhere, you have somewhere some bug, like missing `&` or `*` or overall confusing API design forcing you to create/copy other instance involuntarily. — Ped7g, Jul 28 '16 at 15:10
@Ped7g I'm definitely more frustrated right now haha. The address the print in the main returns is the same as the address this points to. Both of them are `00193444` in this case. This would mean that the interrupt has a "wrong" instance. Could it be that the interrupt function which calls the interrupt handler which I wrote does not know the i2c class and thus the compiler creates a new instance? — Etruscian, Jul 28 '16 at 15:19
@Etruscian ...What? If a function "does not know" a class, i.e. its declaration, how can it possibly create a new instance? No, you'll get a compile-time error. There's no in-between. That question doesn't make any sense. Also, please, just **use a debugger**: set breakpoints in all constructors and anywhere else suspect, find the part where the errant extra instance is created, and then backtrace from there. That doesn't seem difficult, and if it's not, it really doesn't justify all this wild speculation. — underscore_d, Jul 28 '16 at 15:33
@underscore_d I am using a debugger with breakpoints to see what is going on. I just noticed that the address the `interruptHandler` sees as `this` is actually the address of the struct `iicDevice`. — Etruscian, Jul 28 '16 at 15:43
@Etruscian Of course it is! Look: `// attach interrupts to i2c device XIic_SetSendHandler(&iicDevice, &iicDevice, (XIic_Handler) (&I2c::sendHandler));`. If it's meant to receive the `this` pointer of an `I2c` instance, then you need to pass `this` - not `&iicDevice`. I'm guessing the same applies to all such callback registrations. If so, you're invoking serious UB here, by passing a pointer to the wrong (type of) object, and it's a wonder any of it ever worked. — underscore_d, Jul 28 '16 at 15:51
the struct `iicDevice` does not however have a field called `transmitCompleteI2c`. How is it possible that the code does not break on such a thing? or is the address of `transmitCompleteI2c` set as an offset from the address `this` points to? — Etruscian, Jul 28 '16 at 15:52
@Etruscian It's possible because it's undefined behaviour, which means anything is possible - including seemingly working perfectly, seemingly creating a copy, crashing horribly, or anything in between. — underscore_d, Jul 28 '16 at 15:53
@Etruscian I think you are misusing terms `class` and `instance`. Also from underscore comment you are casting different types unsafely. Well, I probably sound harsh, don't take this personally, it's just your code is full of bugs and ugly constructs, but that's normal, first version of my source is usually much worse, and it takes few iterations and refactoring to have something cleaner, leaner and better... and with less bugs. — Ped7g, Jul 28 '16 at 15:53
if you cast instance of std::vector to std::string, and then call on such pointer a function from std::string, the C++ compiler will happily let you (except the cast should yield a warning of incompatible types). What happens then? The function from std::string will access the memory of vector instance, as if it was string instance. Ie. anything may happen, but surely not something valid or expected. — Ped7g, Jul 28 '16 at 15:56
Actually, I presume `XIic_SetSendHandler` **should** take `&iicDevice` i.e. an `XIic`, so your real problem is assuming wrongly that you can call an `I2c` instance function via it. So the `I2c` function gets an `XIic` pointer as its (non-)`this`, and all UB breaks loose. As @Ped7g said, you're doing things that make no sense, so you shouldn't expect sense in return. The compiler just doesn't stop you because it presumes you're passing the right thing as your (presumably `void *`) pointer and doesn't want to waste cycles asking you whether you're really sure. — underscore_d, Jul 28 '16 at 15:59
Thanks for the help everybody! So the fact that it was suddenly working and that it is still working in one of my other classes was just mere coincidence? I'll change the other class based on this as well. I get that my code is ugly. I copied most of it from the example which came with the XIic driver from the board support package, which makes you wonder.. haha @Ped7g Isn't a class a blueprint of which an instance lives in the memory and can be worked with? Or am I completely wrong on that? — Etruscian, Jul 28 '16 at 16:01
Yeah, class is definition, instance is real memory allocated to be used by class as single item of it. So your _"So it's setting a different variable to 1. This does not make sense, since it should all be in the same class."_ ... is talking about class (static variable), or instance (bad wording)? You are mixing them more times in these comments. It's subtle, but it creates lot of confusion, as we were not sure whether you want multiple instances working over static variable, or you have second "rogue" instance by mistake. (as the irony wanted, neither was true, you were using other memory) — Ped7g, Jul 28 '16 at 16:04
@underscore_d `XIic_SetSendHandler` requires 3 inputs, a pointer to the `XIic struct`, a `void* as CallBackRef` and the interrupt routine function. I mistakenly copied the `XIic struct` as `CallBackRef` from the example, which does not use classes. — Etruscian, Jul 28 '16 at 16:05
@Ped7g Thanks for the clarification, I'll watch my words more closely next time. I do get one more warning I cannot seem to get rid of in the `XIic_SetSendHandler` lines which says: "converting from void (I2c::*) to void (*)". Is there a clean way to do this which does not give this warning? — Etruscian, Jul 28 '16 at 16:09
Do you cast that pointer to `(void *)`, or not? It's the `CallBackRef` one? Try `..., (void*), ...` (but make sure you are using valid callback) In the example it may be, that the struct is written in a way where it can function both as struct and callback, but that doesn't make much sense, so I'm probably just not paying attention (or example is bad). BTW, in C++ `struct` and `class` are the same, only default visibility in struct is public, and private in class. It's just a programmer convenience to use struct for simple memory structures, and class for types with functions. — Ped7g, Jul 28 '16 at 16:24
I did do a very quick look on your git and project web... so in worst case your code may break few bones of human? I love how they let students work on this just so. I hope there's some review process going on... oh, whom am I fooling now... :D Let's hope you will pick 5 years of experience in single next month, that will do. :) — Ped7g, Jul 28 '16 at 16:26
It's the final argument actually, since I'm casting a pointer to the member function of the class to a void *. It isn't the best solution but it seems to work reliable enough. I started programming seriously in February this yeah and basically built everything that's on there with 1 other guy in about 3 weeks. At that point I had about 4 weeks of experience, so at that rate, it should be doable :D — Etruscian, Jul 28 '16 at 17:21

score 2 · Accepted Answer · answered Jul 28 '16 at 14:35

2

"...it should all be in the same class"

but transmitCompleteI2c isn't a static member, so it has to be the same object, not just the same class. Every instance of I2c gets its own copy of each non-static data member.

You can just make it static if you know only one copy is required (this effectively makes it a global, although still only accessible by I2c) - but don't forget it needs a single non-inline definition.

Alternatively, you need to figure out what I2c objects exist, and where they're created. You can implement all constructor and assignment operator overloads with logging (or to set breakpoints) if you're having trouble figuring out where they get created.

answered Jul 28 '16 at 14:35

Useless

64,155
6
88
132

True, but that does not explain why exactly this is working in a different class with the same code, except for some protocol specific changes. – Etruscian Jul 28 '16 at 14:44
1

I thought you said it had worked with the _same_ code? Either way, the answer is in whatever creates, stores, copies etc. instances of this class, not inside the class itself. – Useless Jul 28 '16 at 14:45
This class did work before with the same code. In my comment on your solution, I was talking about a different class which also has the same code, but implements a different protocol. – Etruscian Jul 28 '16 at 14:51
2

The ultimate answer is _no-one can tell you what's happening_, because your code as posted is incomplete. This means no-one can reproduce the problem even in principle. (As an aside, it's neither complete nor minimal, meaning that important stuff is missing, and irrelevant stuff is included. This doesn't help anyone, although I don't think it slowed things down too much in this instance.) However, the only explanation for printing two different addresses for a member is having two different objects. All non-static methods are passed a `this` pointer, and we can't see where yours comes from. – Useless Jul 28 '16 at 14:55
As I am debugging this code, the only other code that is running aside for an entire library of code written by the retailer of my zynq module to communicate with the hardware is a main file with the inclusion of the header files, the declaration of the I2c class the initialization of that class through the initialization function and then a `while(true)` loop which calls `i2c.writeData(16)`. The 16 in this last call is not being used by the `writeData` function. – Etruscian Jul 28 '16 at 15:01
If you want to take a look at the entire source code, feel free to read through our github repository [link](https://github.com/project-march/march-framework/tree/dev) – Etruscian Jul 28 '16 at 15:04
I added a breakpoint in the constructor and it gets called only once in the main file when I create the first instance. After this break, it continues and has 2 different addresses in the `initialize` and `writeData` (which both are the same as the one in the code which owns the instance) and the `sendInterruptHandler`. – Etruscian Jul 28 '16 at 15:28
1

Re github: you're missing my point. Saying _please read my entire codebase so you can help fix my problem for free_ is not a tempting proposition. If you want to get eyes on the problem, it's your job to come up with an [MCVE](https://stackoverflow.com/help/mcve). – Useless Jul 28 '16 at 15:29
1

BTW, if you have two different addresses after the constructor ... probably another (copy) constructor was called later. – Useless Jul 28 '16 at 15:30
Yeah that is true, I'm sorry. I got a bit frustrated as I'm breaking my head over trying to get I2c to work for about 2 weeks now. – Etruscian Jul 28 '16 at 15:32
We've all been there :) Take a step back, figure out where that second instance is coming from and who owns it, and go from there. Good luck! – Useless Jul 28 '16 at 15:37
1

Not to devalue a decent post, but @Etruscian, why did you accept this? It doesn't answer the question by identifying the problem, i.e. you passing the wrong kind of instance pointer to your handlers. You don't have to, and shouldn't, just mark the first available thing as the accepted answer. In this case, it would be better to write your own post (as it looks like neither myself nor Ped7g feel like doing so) summarising the 3-way discussion in the comments above, and mark that as the answer. Or at least leave it without an accepted answer, if there isn't any that answer it. – underscore_d Jul 28 '16 at 19:31
@underscore_d I marked this as the answer since it brought me on the right path looking for which instances of the class were in the memory. Because of that, I found out that the interrupt did not know the instance of the class it should have known, but instead linked to a struct. – Etruscian Jul 29 '16 at 17:11

C++ variable changed during interrupt resets after interrupt

1 Answers1