0

I am tasked to implement from scratch addition and subtraction of signed integers of arbitrary size. Such an integer is stored in an array of 64-bit unsigned integers. The least significant bit of the array's first element is the least significant bit of the whole integer. An array of size d represents a signed integer of at most 64 * d - 1 bits. Format of the integer should not be changed.

I came up with the following:

// Add huge integers: z[] = x[] + y[]
template<typename ing>
inline void addHint(uint64_t *z, uint64_t *x, uint64_t *y, ing d)
{
  uint64_t carry = 0;
  for (ing i = 0; i < d; ++i)
  {
    uint64_t zi = x[i] + y[i];
    // uint64_t carryNew = zi < x[i] or zi < y[i];
    uint64_t carryNew = zi < x[i]; // zi < y[i] unnecessary.
    z[i] = zi + carry;
    carry = carryNew or z[i] < zi;
  }
}


// Subtract huge integers: x[] = z[] - y[]
template<typename ing>
inline void subHint(uint64_t *x, uint64_t *z, uint64_t *y, ing d)
{
  uint64_t carry = 0;
  for (ing i = 0; i < d; ++i)
  {
    uint64_t xi = z[i] - y[i];
    // uint64_t carryNew = z[i] < y[i];
    uint64_t carryNew = z[i] < xi; // Somehow x86-64 g++ 8.3 -O2 dumps fewer assembly lines than the above according to godbolt.org
    x[i] = xi - carry;
    carry = carryNew or xi < x[i];
  }
}

My team is unsatisfied with the speed. Can it be improved?

Thanks!

user2961927
  • 1,290
  • 1
  • 14
  • 22
  • *My team is unsatisfied with the speed.* -- Then it's a discussion between you and your team. What if we say "the code looks good"? – PaulMcKenzie Mar 24 '22 at 20:50
  • `or` ? does this even compile ? – 463035818_is_not_an_ai Mar 24 '22 at 20:50
  • 6
    @463035818_is_not_a_number -- Believe it or not ,`or` is a valid C++ keyword. You hardly see it used, so I don't blame you if it looks strange. – PaulMcKenzie Mar 24 '22 at 20:51
  • 1
    Why is `ing` a template type? Why not just use `std::size_t`? – Ranoiaetep Mar 24 '22 at 20:54
  • @PaulMcKenzie They just asked if it can be faster :) – user2961927 Mar 24 '22 at 21:00
  • @Ranoiaetep In another module sizes of the integers are stored in a separate array of unsigned char or short, so just for matching it up, but might not be necessary – user2961927 Mar 24 '22 at 21:05
  • @mch yeah that was one point of the question, so how to efficiently check if `x[i] + y[i] + carry` overflows? I've been bugged by this case: suppose integers are 10-based, then 9 + 0 + 1 will incur a carry.. Please let me know if you have a way to reduce those comparison operations – user2961927 Mar 24 '22 at 21:31
  • @mch But if that, 9 + 9 + 1 = (+1)9 while carry = (9 < 9 || 9 < 9) = 0 ? – user2961927 Mar 24 '22 at 22:21
  • Your implementation also forgot about the carry in the last addition. Is it expected to be ignored? – mch Mar 24 '22 at 22:53
  • @mch yeah it's intentional since the allocation guaranteed the last bit of the last 64-bit integer is the sign bit – user2961927 Mar 24 '22 at 23:00
  • 1
    This may help [Producing good add with carry code from clang](https://stackoverflow.com/q/33690791/555045) – harold Mar 25 '22 at 00:41
  • How are you handing the sign? – Joseph Larson Mar 25 '22 at 03:46
  • 2
    If your team is unsatisfied with speed, then you should all consider using [GNU MP](https://gmplib.org/), because it contains highly optimized assembly code for many relevant targets, which deal with carry and multidigit arithmetic the best way possible considering specific machine code, including vectorized instructions where available. – Arc Mar 25 '22 at 17:26
  • 1
    To see how all of this multidigit arithmetic is highly-machine dependent, check GMP manual on [assembly code](https://gmplib.org/manual/Assembly-Coding). – Arc Mar 25 '22 at 17:36
  • @JosephLarson The last 64-bit limb "acts" like a signed 64-bit integer – user2961927 Mar 25 '22 at 20:17
  • @user2961927 Have you tested adding +1 and -1? It reaches zero? – Joseph Larson Mar 26 '22 at 14:04
  • @JosephLarson Yes it does. The test kit is comprehensive. What was your concern? – user2961927 Mar 26 '22 at 19:08
  • @user2961927 *My team is unsatisfied with the speed. Can it be improved?* -- Get a library that already does this, as mentioned previously. What is the goal of your team anyway? Is it to give you busy work to do? If so, they must not care about the time spent for all the research required, when all they have to do is get a library that does this (if it were me, I would say "there are people that already spent their time on this, and created a library called GNU GMP -- I'm sure I could spend my time developing other aspects of the program"). – PaulMcKenzie Mar 27 '22 at 03:04
  • @PaulMcKenzie thanks for the advice. App is related to some cryptography research. Reasons for not using GMP in the first place are: (i) likely an overkill since only +- are needed, and (ii) operands are of varying sizes (code above is just a demo) and are stored in a large memory block. Using GMP needs conversion into the library's integer format, which incurs allocation and copying overhead that's likely overwhelming --- or in-place conversion is actually possible? I'll run some tests later. My team are friendly:) They were just surprised to see 4 lines of code in the loop body – user2961927 Mar 27 '22 at 04:37

1 Answers1

0
mp_limb_t mpn_add_n (mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n)

and

mp_limb_t mpn_sub_n (mp_limb_t *rp, const mp_limb_t *s1p, const mp_limb_t *s2p, mp_size_t n)

in GMP are what you were looking for.

user2961927
  • 1,290
  • 1
  • 14
  • 22