Class function changes other data in class cpp

Question

SOLVED! I Found my error... I posted my solution at the end. Sorry for wasting time.

It has been a while since I have played with c++. It has never been my primary language, so I am not familiar with the finer details. I am having a weird bug pop up. When I call factorize(), it is resetting the sign of the number. This is despite the fact that sign is never touched.

I have found a work around. In my working code, I've added an integer to preserve and reset the value, but I don't think I should have to. I removed those two lines from the code sample below.

Places I reset sign: sign is set to 0 by the constructor of this class. It can be set to 0 in the * and *= operators (if they each have the same sign). It is set to zero by the = operator only if assigning a (unsigned long long) value to the object (it preserves the sign if setting it equal to another FactorNumber).

And that's it. Those are the only places that I am setting sign to zero. I don't see why any of those would be called by this function. But, I am not really attuned to the finer points of how c++ does things when working with classes. I can't see why sign keeps getting reset to 0, but maybe I am doing something wrong. Does anyone see why it is happening?

class FactorNumber {
    private:
        unsigned long long number;
        unsigned long long factor_list[63];     // this is the max possible amount
        int number_of_factors;
        int sign;                               // 0=positive 1=negative
        void factorize();
[snipped irrelevant public function calls]
};    

void FactorNumber::factorize() {
    int x=0;
    for(x=0;x<64;x++) {
        factor_list[x]=0;
    }
    number_of_factors=0;
    unsigned long long current_factor=2;    // 64 bits in c++
    unsigned long long current_number=number;
    unsigned long max_factor=0;  // can never be more than 32 bits
    if (number>3) {
        max_factor=sqrt(current_number);
        while (current_factor<=max_factor) {
            if(current_number%current_factor) {
                if(current_factor>2) {
                    current_factor+=2;
                } else {
                    current_factor=3;   
                }
            } else {
                factor_list[number_of_factors++]=current_factor;
                current_number=current_number/current_factor;
                if(current_number%current_factor) {
                    max_factor=sqrt(current_number);
                }
            }
        }

        // If there is a number larger than one, add it to the array.
        if(current_number>1) {
            factor_list[number_of_factors]=current_number;
        } else {
            number_of_factors--;        // If not, we ignore this last number
        }
    } else {
        number_of_factors=0;
        factor_list[0]=number;
    }
}

My error was an obiwan error. I was writing past the end of my array (factor_list[63] does not actually exist) and that was over-writing my data. It is just coincidence that this reset sign and didn't screw up other stuff (or maybe it does but I didn't catch it yet). This is why I asked the question. It wasn't that I couldn't work around it, I knew there was a bug somewhere in my code.

Changing the for loop condition to x<63 cleared the bug up.

`factor_list` goes until index 62, whereas you're going until index 64 in the first `for` in `factor_list` — Iosif Murariu, May 12 '14 at 14:29
@brasofilo there are already 2 answers there with the solution :) — Iosif Murariu, May 12 '14 at 14:55
@IosifMurariu, no problem ;)... but the Question is not the place to post an Answer. The normal procedure if the OP doesn't do that is to rollback the question and post an answer as Community Wiki. — brasofilo, May 12 '14 at 14:58
I can't post an answer for 8 hours, because I am too new here. Editing the original problem was the quickest way for me to alert people that it had been solved. When the 8 hours have passed, I will put an answer in the proper place. — Brunk, May 12 '14 at 15:07

Iosif Murariu · Answer 1 · 2014-05-12T14:37:14.093

2

You're overflowing in the 1st for in factorize(), because you're going until index 63, whereas the maximum index (as declared in the class is 62 (size 63)). Actually whenever you call factor_list[X]=Y you have the chance to overflow over the next members in the class. You always need to validate the array index!

unsigned long long factor_list[63]; // <----- indexes from 0 to 62 // <code omitted> factor_list[666] = 0; // <----- Oops! Overflowing (but it's still valid code)

Also, why use C-style arrays instead of C++-style arrays in C++? std::array is a better approach.

edited May 12 '14 at 14:37

answered May 12 '14 at 14:31

Iosif Murariu

2,019
1
21
27

Thanks, I found it right before you posted and was editing it. LOL C is the original language I learned. So, I tend to think of things in that way. I don't know how c++ does arrays differently. I will look into it. Thanks for the hint. – Brunk May 12 '14 at 14:37
@Brunk There is no difference from C at all, so you probably have quite a few bugs lurking in your C code. – molbdnilo May 12 '14 at 14:40
@Brunk, if you use MVS you can put a memory breakpoint to trigger when a memory zone (assigned for a variable, i.e. `sign` in your case) gets modified. – Iosif Murariu May 12 '14 at 14:43
Thanks. It has been a while since I have coded much at all. I've been away from it. Usually I am aware of these errors and am careful to avoid them. This problem actually arose because I originally had the array as a size 64. I changed it and added that comment when I realized that there could never be 64 prime factors in 2^64-1, the max would be 63. I never went and changed the loop. What I should have done, is used a #define for that hard-coded value, so they both changed with each other. – Brunk May 12 '14 at 14:52
or instead of a `#define` you could've used a constant (`const int max = 3; int myArray[max]; // <--- this works :)` – Iosif Murariu May 12 '14 at 14:54
I assume the the compiler would not waste memory on that const int and would hard-code it during the optimization phase, just like it hard-codes the #defines? – Brunk May 12 '14 at 15:04
@Brunk i think it depends on the compiler, but i'm sure they get optimized somehow (and if they don't it's for a good reason). here's a link about this http://stackoverflow.com/questions/1637332/static-const-vs-define – Iosif Murariu May 13 '14 at 06:27

score 1 · Answer 2 · answered May 12 '14 at 14:39

1

There are 63 items in factor_list.

This loop

for(x=0;x<64;x++) {
     factor_list[x]=0;
}

Writes to 64 long longs.

The last assignment, factor_list[63], overwrites the variables stored after the array, setting sign to 0.

Change the loop index.

You probably also want to add checks that you don't increment number_of_factors too far.

answered May 12 '14 at 14:39

molbdnilo

64,751
3
43
82

number_of_factors should never exceed 62 primarily because the maximum amount of factors between 0 and (2^64-1) would be 2^63. All the other numbers will have factor counts lower than that. 2^64-1 has 7 factors (3,5,17,257,641,65537,6700417) for example. – Brunk May 12 '14 at 14:45
Well, in this case, it's a pretty solid assumption. It's actually a mathematical certainty that there exists no number between 0 and 2^64-1 with more than 63 prime factors. The only issue that could arise is if C++ decided to change the size of an (unsigned long long), in which case numbers larger than 2^64-1 would be possible, and there could be more than 63 prime factors. – Brunk May 12 '14 at 14:58
@Brunk It's to protect against bugs actually, since it's difficult to tell at a glance whether code is correct. Working under the assumption that your code will not contain bugs is dangerous. – molbdnilo May 12 '14 at 15:11
I understand that. In this case, I have analyzed the loop closely enough to be sure that number_of_factors can only reach 63 in one case (2^63) and the current_number at that point would be 1, which would immediately decrement number_of_factors back to 62. I've tested this loop with edge cases, like that one, to be sure. Adding a test for something that can't happen would only make this function less efficient than it already is. – Brunk May 12 '14 at 15:19
I don't mean to be contrary. You're certainly right that a test would make this code more bullet-proof. But, it's not production code and it's not meant to be. There's known bugs (for example, my * operator makes no check to be sure the product fits in 64 bits) already. I will clean that up eventually. This loop could run as many are 2^30 times for a single number, which makes me want to keep it as lean as possible. It will be used fairly frequently as well. – Brunk May 12 '14 at 15:53

score 0 · Answer 3 · answered May 12 '14 at 23:01

0

As has been noted, I was writing past the end of my array.

answered May 12 '14 at 23:01

Brunk

1
1

Class function changes other data in class cpp

3 Answers3