Questions tagged [montgomery-multiplication]

A modular multiplication algorithm invented by Montgomery that allows modular arithmetic to be performed efficiently when the modulus is large (typically several hundred bits).

The Montgomery reduction algorithm Redc(T) calculates TR^{-1} mod {N} as follows:

m := (k (T mod R)) mod {R}
t := (T + mN)/R
if  t >= N return t - N else return t.

Using this method to calculate c is generally less efficient than a naive multiplication and reduction, as the cost of conversions to and from residue representation (multiplications by R and R^{-1} modulo N) outweigh the savings from the reduction step.

The advantage of this method becomes apparent when dealing with a sequence of multiplications, as required for modular exponentiation (e.g. exponentiation by squaring).

Many important cryptosystems such as RSA and DSA are based on arithmetic operations, such as multiplications, modulo a large number. The classical method of calculating a modular product involves first multiplying the numbers as if they were integers and then taking the modulus of the result. However, modular reduction is very expensive computationally—equivalent to dividing two numbers.

The situation is even worse when the algorithm requires modular exponentiation.

15 questions
3
votes
1 answer

Final subtraction in montgomery modular multiplication for an RSA cryptosystem

I'm confused about how one might supposedly bypass the final subtraction of the modulus in radix-2 montgomery modular multiplication, when used in a modular exponentiation algorithm. The following two papers put forward the conditions for bypassing…
3
votes
1 answer

Bizzare identical incorrect results across different MWR2MM algorithms for RSA montgomery multiplication

Background I'm trying to implement RSA 2048 in hardware (xilinx ZYNQ FPGA) using a variety of different Montgomery methods. I am implementing the algorithm using Xilinx HLS (essentially C++ code that is synthesized into hardware). Note: For the…
3
votes
2 answers

RSA hardware implementation: radix-2 montgomery multiplication issues

I'm implementing RSA 1024 in hardware (xilinx ZYNQ FPGA), and am unable to figure out a few curious issues. Most notably, I am finding that my implementation only works for certain base/exponent/modulus combinations, but have not found any reason…
asmvolatile
  • 522
  • 5
  • 22
3
votes
1 answer

Montgomery Reduction form using OpenSSL library

I have N of 1024 bits. I need to convert a message M ( 512 bits ) into Montgomery reduction form as below. M' = M * R^{-1} mod N where , R = 2 ^ 512 (mod N) How can I achieve the result ?
dudedev
  • 451
  • 1
  • 5
  • 19
2
votes
0 answers

Reducing LUT utilization in a Vivado HLS design (RSA cryptosystem using montgomery multiplication)

A question/problem for anyone experienced with Xilinx Vivado HLS and FPGA design: I need help reducing the utilization numbers of a design within the confines of HLS (i.e. can't just redo the design in an HDL). I am targeting the Zedboard (Zynq…
asmvolatile
  • 522
  • 5
  • 22
2
votes
2 answers

How do I convert an array of UInt64 to an array of UInt16 to perform multi-precision multiplication?

I need to perform fast Galois field arithmetic in my application. I have a multiplication function written in assembly that has been optimized for my platform, an MSP430 microcontroller. That function computes the product of two large numbers of…
Keron
  • 167
  • 8
1
vote
1 answer

Can Montgomery multiplication be used to speed up the computation of (large number)! % (some prime)

This question originates in a comment I almost wrote below this question, where Zack is computing the factorial of a large number modulo a large number (that we will assume to be prime for the sake of this question). Zack is using the traditional…
Pascal Cuoq
  • 79,187
  • 7
  • 161
  • 281
0
votes
0 answers

Confusion around point addition on secp256k1

I am using https://github.com/HareInWeed/gec for point addition on the secp256k1. The code below adds two points on secp256k1 curve and displays the results. #include #include #include #include…
0
votes
0 answers

What n represents in K= 2^2n mod m

I am doing a bit of studying of RSA algorithm implementation using montgomery modular multiplier and i am not sure what 2^2n stands for , is n number of bits of the message or is 2^2n number of bits of the message or is it something else. page 6 in…
0
votes
1 answer

Montgomery Multiplication - 32-bit register vs. 64-bit register

I need to calculate the speed difference between performing a Montgomery Multiplication page 602-603 with a word-size/register of size 32 vs. 64. So far, this is what I understand: x and y are represented by multiple-word arrays of length n where…
0
votes
1 answer

Montgomery Multiplication Algorithm on Python

I try Montgomery Multiplication Algorithm on Python 3.x. This algorithm pseudo-code is given below: Input: Modulus N(n bit), gcd(n, 2) = 1, Multipler: A (n bit), Multiplicant: B (n bit) Output: R = (A x B x 2 ^ (-n)) mod N R = 0 for (i = 0; i < n;…
0
votes
1 answer

multiplication in GF(p)

I am developing software in JavaCard to addition points in ECC. the issue is I need some basis operations, so for the moment, I need multiplication and inversion, I already have addition and subtraction. I was trying to develop montgomery…
Raul Benitez
  • 35
  • 1
  • 7
0
votes
1 answer

Montgomery Multiplication in RSA: c=m^e%n

How does Montgomery Multiplication work in speeding up the encryption process for computing c=m^e%n as used in RSA encryption? I understand that Montgomery multiplication can efficiently multiply a*b%n but when trying to find m^e%n, is there a more…
-1
votes
1 answer

Frequency of Montgomery Multiplier

I have designed a 16*16 Montgomery multiplier. The code uses a 16*16 multiplier to perform three multiplications. The multiplications are performed one after the other using the same multiplier and the result of each multiplication is stored in the…
-2
votes
1 answer

Verilog code - compiles fine, but simulation does not run

I have had some fairly good experiences with structural modelling in Verilog, but I barely have any with other modelling methods. So, kindly help me out. The code compiles fine, but when simulating it just hangs. Nothing happens. If it is important,…