Interesting task you have! Thanks!
It was a pleasure for me to spend two almost whole days to write very advanced solution. I implemented from scratch three factorization algorithms: Trial Division, Pollard's Rho, Lenstra Elliptic Curve Method (ECM).
It is well known that ECM method (with elliptic curves) is one of the fastest methods for mid-range factors. While trial division is good for very small factors (up to 2^20 factor per second), Pollard Rho is good for bigger yet small factors (up to 2^40 per second), while ECM is good for mid-range factors (up to 2^60 per 10 seconds).
There are also very advanced methods like General Number Field Sieve (GNFS) (factors up to 2^700 per month), but they are very difficult to implement. Also Quadratic Sieve method is advanced (probably up to 2^400 per month), I also implemented this from sratch but it has very big code, yet manageable to understand, but due to its size I don't attach it here. ECM method was the only method quite easy to implement among advanced methods.
Besides mentioned above 3 methods of factorization that I implemented, I also used following algorithms inside code: Modular Exponentiation, Fermat Primality Test, Barrett Reduction, Euclidean Algorithm, Extended Euclidean Algorithm, Modular Multiplicative Inverse, Elliptic Curve Point Addition and Multiplication.
Actually your task as it is is very easy to solve fast: it is known that for bit size 2048 there appears a prime once approximately every ln(2048) = 1420
number, so you just generate fast around 1500 numbers while checking if they are prime, for example using Fermat Primality Test which is very fast. And if a number is prime then by definition it is already factored as it is.
I extended in my mind your task further, to make it more interesting. I don't search for prime number, but instead trying to find such 2048-bit random-generated numbers such that they have at least several big prime factors. This kind of numbers I will call "interesting". Of course if a number has several tiny factors and one large prime then it is not that interesting. But if it has 60-bit prime factor then it is interesting to catch such number, that's what I do in my code.
You can see in my code that I adopted it for two kinds of libraries, Boost Multiprecision and GMP. Both are included in my code (see #include <boost/multiprecision/cpp_int.hpp>
and #include <gmpxx.h>
), so you should install and link both. Under Linux it is very easy to install both through sudo apt install libboost-all-dev libgmp-dev
. But Windows is a bit tricky, install Chocolatey Packet Manager first and then do command choco install boost-msvc-14.3
. And for GMP install VCPKG as described here and then vcpkg install gmp
. If you want you may install Boost through VCPKG too: vcpkg install boost
.
ECM (elliptic curve) method is very interesting and simple:
You generate many random curves, with random X, Y, A, B, N params, where N is your input number that needs to be factored, and other params are random that fit curve equation Y^2 = X^3 + A * x + B (mod N)
.
You multiply each curve by all growing prime numbers (with some small power).
At some point you'll get a multiple of curve order for some first curve, and by a property of curve order in such a case you'll get so-called Infinite point on curve.
If you look at Point Addition formula then you may see that there is a denominator in formula lambda = (y_q - y_p) / (x_q - x_p)
. This denominator is computed as Modular Multiplicative Inverse modulus N. For Infinite point it becomes non-invertible, and by property of inverse number non-invertibility is only possible when GCD(N, x_q - x_p) != 1, in which case GCD gives some non-trivial factor (sometimes also N), hence our N is factored successfully by giving first factor.
If we don't get Infinite point, then we continue to generate more curves and to divide by more (bigger and bigger) prime numbers. More curves we generate and more primes we multiply by, the higher is success of factorization.
Try it online!
SOURCE CODE HERE. As StackOverflow has limit of 30K symbols per post and my code alone is about 44K bytes, I couldn't inline it here, but instead sharing it through Github Gist (link below). Also same code is above available through Try it online!
link on GodBolt server.
GitHub Gist code
Example console output:
TrialDiv time 8.230 sec
Num: 4343663370925180057127849780941698665126534031938076094687921681578209757551374613160773985436765755919464255163981465381983273353052491 (2^453.90)
Factored: 13 (2^3.70, prime), 167 (2^7.38, prime), 3853 (2^11.91, prime), 53831 (2^15.72, prime), 916471 (2^19.81, prime), 9255383 (2^23.14, prime),
UnFactored: 11372390351822722497588418782148940973499109818654526670537593527638523385195910987808859992169924704037636069779 (2^372.24, composite),
PollardRho time 8.51 sec
Num: 11372390351822722497588418782148940973499109818654526670537593527638523385195910987808859992169924704037636069779 (2^372.24)
Factored: 189379811 (2^27.50, prime), 2315962907 (2^31.11, prime), 50213994043 (2^35.55, prime),
UnFactored: 5163708449171395447719565208770850251589387410889704005960043195676697732937073689 (2^278.09, composite),
Curves 1, Ops 12.965 Ki, ExtraPrimes 512, Primes 0.500 Ki, IterTime 0.410 sec
Curves 2, Ops 50.912 Ki, ExtraPrimes 512, Primes 1.000 Ki, IterTime 8.062 sec
Curves 3, Ops 112.586 Ki, ExtraPrimes 464, Primes 1.453 Ki, IterTime 15.093 sec
Curves 4, Ops 162.931 Ki, ExtraPrimes 120, Primes 1.570 Ki, IterTime 4.853 sec
Curves 5, Ops 193.699 Ki, ExtraPrimes 80, Primes 1.648 Ki, IterTime 4.201 sec
ECM time 32.624 sec
Num: 5163708449171395447719565208770850251589387410889704005960043195676697732937073689 (2^278.09)
Factored: 955928964443 (2^39.80, prime),
UnFactored: 540177004907359979630305557131905764121354872876442621652476639261690523 (2^238.29, composite),
Final time 49.385 sec
Num: 4343663370925180057127849780941698665126534031938076094687921681578209757551374613160773985436765755919464255163981465381983273353052491 (2^453.90)
Factored: 13 (2^3.70, prime), 167 (2^7.38, prime), 3853 (2^11.91, prime), 53831 (2^15.72, prime), 916471 (2^19.81, prime), 9255383 (2^23.14, prime), 189379811 (2^27.50, prime), 2315962907 (2^31.11, prime), 50213994043 (2^35.55, prime), 955928964443 (2^39.80, prime),
UnFactored: 540177004907359979630305557131905764121354872876442621652476639261690523 (2^238.29, composite),