This has a loop-carried dependency and not enough independent work to run in parallel, so if the loop is executed at all, what you measure is not FLOPs but most likely the latency of a floating-point addition. The loop-carried dependency chain serializes all of those additions. That chain has small side-chains of multiplications hanging off it, but those don't depend on each other, so only their throughput matters, and multiplication throughput beats addition latency on any reasonable processor.
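To make that concrete (your exact code isn't repeated here, so this is only a hypothetical reduction of the pattern described above), the problem looks like this:

```c
/* Hypothetical shape of the measured loop (not your exact code): every
 * `+=` must wait for the previous one -- a loop-carried chain through
 * `sum` -- while each multiply is independent of the others and limited
 * only by throughput. The adds therefore retire at 1 per add-latency,
 * regardless of how wide the core is. */
double serial_dot(const double *x, const double *y, long n) {
    double sum = 0.0;
    for (long i = 0; i < n; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```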
To actually measure FLOPs, there is no single recipe. The optimal conditions depend strongly on the microarchitecture: how many independent dependency chains you need, the best add/mul ratio, whether you should use FMA, all of it varies. Typically you have to write something more elaborate than what you have, and if you're set on using a high-level language, you also have to somehow trick the compiler into not optimizing the work away entirely.
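As a rough illustration of what "more complicated" means, here is a minimal scalar-only sketch in C: several independent accumulators so the adds are not stuck on a single chain, and the results are printed so the compiler can't delete the loop. The accumulator count (8 here) is a guess; the right number depends on the add latency and throughput of your core, and without SIMD or FMA this still reaches only a fraction of peak.

```c
/* Build e.g.: cc -O2 flops.c -o flops  (avoid -ffast-math here) */
#include <stdio.h>
#include <time.h>

#define N 100000000L

int main(void) {
    /* Several independent accumulators so the additions don't all sit on
     * one loop-carried chain. 8 is a guess; the optimum is roughly
     * add-latency * adds-per-cycle for the target core. */
    double a0 = 0.0, a1 = 0.0, a2 = 0.0, a3 = 0.0;
    double a4 = 0.0, a5 = 0.0, a6 = 0.0, a7 = 0.0;
    double x = 1.000000001;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < N; i++) {
        a0 += x; a1 += x; a2 += x; a3 += x;
        a4 += x; a5 += x; a6 += x; a7 += x;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (double)(t1.tv_sec - t0.tv_sec)
                + (double)(t1.tv_nsec - t0.tv_nsec) * 1e-9;
    double flops = 8.0 * (double)N / secs;   /* 8 adds per iteration */

    /* Print the sum so the compiler can't prove the loop is dead code. */
    printf("sum = %f, approx %.3g FLOP/s (scalar adds only)\n",
           a0 + a1 + a2 + a3 + a4 + a5 + a6 + a7, flops);
    return 0;
}
```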
For inspiration, see "How do I achieve the theoretical maximum of 4 FLOPs per cycle?"