Computing the sum of the totient function up to 10^16

Question

Reposting it in a clear, understandable way without any complicated MathJax that doesn't appear properly:

I explored some computer science/number theory challenges sites for fun, and they presented the following problem, exactly as follows:

Let P(n) = sum{1<=k<=n} φ(k)

Find P(10^16)

I searched for quite a while about this and tried different approaches:

Using the formula for φ(n)= n * product{1<=i<=k} (Pi-1)/Pi, I tried to calculate the φ(n) in range, but this becomes very inefficient for large n. I could get as far as 10^7 with this approach. Beyond this it just gets too slow.
I tried a different one, more direct. Wikipedia and Wolfram Alpha suggest similiar formulas for directly calculating P(n):

P(n) = sum {1<=k<=n} φ(k)= 0.5⋅(1+∑{1<=k<=n} μ(k)⋅⌊n/k⌋^2)

This formula seemed a lot more promising. I tried it and managed to get alot further than 10^7 but still far from the target. With pre-calculating a sieve of the Moebius function, I could get to a bit less than 10^9. My memory was insufficient, and couldn't compute anymore values in a sieve. And even if I could, it still takes a long time and is very far from 10^16.

Here is part of the code that I used for my 2nd approach written in Java:

public static BigInteger PhiSummatoryFunction (long limit)
{
    BigInteger sum = BigInteger.ZERO;
    int [] m = MoebiusSieve(limit);
    for (int i=1;i<m.length;i++)
        sum=sum.add(BigInteger.valueOf((long) (m[i]*Math.floor(limit/i)*Math.floor(limit/i))));
    return sum.add(BigInteger.ONE).divide(BigInteger.ONE.add(BigInteger.ONE));
}

Where MoebiusSieve is a function that computes the Moebius function values up to a certain limit in a sieve, using an eratosthenes-like method.

After understanding and implementing a new recursive method for this computation that I found on the internet:

P(n)=n(n+1)/2−∑{2<=i<=√n} P(⌊n/i⌋)−∑{1<=j<=√n} P(j)⋅(⌊n/j⌋−⌊n/(j+1)⌋)

I can compute values up to P(10^11), and with maximum memory allocation, pre-computing as many φ(n) as possible and consequently all P(n) that I can for memoization, I can compute P(10^12) in just over 20 minutes. A major improvement but still a little far from P(10^16). It's ok if the computation takes a bit longer, but I fear P(10^16) would take exponentially longer time, judging by the "jump" in computation time between P(10^11) and P(10^12). My memory allows me to "save" up to 350,000,000 φ(n) values OR up to 700,000,000 μ(k) values. Perhaps there is a way to perform the summation using μ(k) values rather than φ(n)?.

All my computations suggest and show that my recursion is the prominent time consumer. This is obvious, but I am sure it takes longer than it should. I am posting below the recursion code, with some documentation. It seems to me this is the right way to do this computation, but my implementation is not optimal.

public static BigInteger phiR (long limit, long [] s) // limit is 10^t, s is the sieve of precomputed values of `P(n)`. Can store maximum 350,000,000 values
    {                                                                                                                                                       
        if (limit<s.length)                                 
            return BigInteger.valueOf(s[(int) limit]);
        BigInteger sum = BigInteger.valueOf(limit).multiply(BigInteger.valueOf(limit).add(BigInteger.ONE)).divide(BigInteger.valueOf(2)); // this corresponds to the n'th triangular number
        BigInteger midsum1=BigInteger.ZERO; // the first sum
        BigInteger midsum2=BigInteger.ZERO; // the second sum
        long m = 2;
        while (limit/m != limit/(m+1) && m*m<=limit) // computing the first sum, first for changing floor(limit/m) values
        {
            midsum1=midsum1.add(phiR((long) Math.floor(limit/m),s));
            m++;
        }
        for (long k = m;k*k<=limit;k++) // once the floors become constant for some values,-->
        {                               //  can check how many times the value appears, and multiply accordingly,--> 
            BigInteger midPhi = phiR((long) Math.floor(limit/k),s);  // rather than compute the Phi every time
            long q = 1;
            while (limit/k==limit/(k+1)&&k*k<=limit)
            {
                q++;
                k++;
            }
            k--;
            midPhi=midPhi.multiply(BigInteger.valueOf(q));
            midsum1=midsum1.add(midPhi);
        }
        for (long d=1;d*d<=limit;d++) // computing the second sum
            if ((double)d!=Math.floor(limit/d))
                midsum2=midsum2.add(BigInteger.valueOf((long) (Math.floor(limit/d)-Math.floor(limit/(d+1)))).multiply(phiR(d,s)));
        sum=sum.subtract(midsum1).subtract(midsum2);
        return sum;
    }

I was suggested to use dictinaries, in addition to the array, for big values of n, but I don't know anything about it. Can another improvement be made to make the time frame a day or so?

Can you please share a link to method 3 and/or an explanation of the theory behind it? — גלעד ברקן, Apr 30 '19 at 13:36
@גלעדברקן Sure, see answer by `qwr` here: https://math.stackexchange.com/questions/316376/how-to-calculate-these-totient-summation-sums-efficiently/1740370#1740370 — MC From Scratch, Apr 30 '19 at 13:47
Thanks. Interestingly, the algorithm there was offered by, Daniel Fischer, the same person who wrote the [SO answer](https://stackoverflow.com/a/16380998/2034787) I shared in my (now deleted) answer :) — גלעד ברקן, Apr 30 '19 at 13:52
Yes, I noticed as well. I think my recursion is for some reason slower than it should be. I don't know how to make it faster. Someone suggested the use of dictionaries to store `P(n)` values. — MC From Scratch, Apr 30 '19 at 13:59
I'm confused: doesn't using `P(floor(n/i))` (in method 3) mean we would need a value for `P(5*10^15)` right from the start? — גלעד ברקן, May 01 '19 at 00:15
@גלעדברקן No. Through recursion method, it would need to compute `5*10^15` in order to compute `10^16`. It relies on that value, but doesn't have it. In order to get `5*10^15` it would require many more recursions. Essentialy, one of the good things about this method, is that you get lots of extra values along the way. — MC From Scratch, May 01 '19 at 11:00
Oh, right. I wasn't thinking about it as a recursion, in part because of all the discussion about precomputation. — גלעד ברקן, May 01 '19 at 11:19
From what I could find on the internet, the time complexity for method 3 is O(n^(2/3)). For 10^16, that's about one trillion iterations. With current technology, that's still going to take a while. [This blog](https://mathproblems123.wordpress.com/2018/05/10/sum-of-the-euler-totient-function/amp) mentions reaching 10^11 in under a minute with this method. — גלעד ברקן, May 03 '19 at 10:44
I optimized my storage handling quite a bit (still using method 3). With this optimization I can compute `P(10^11)` almost instantly, `P(10^12)` in about 30 seconds, `P(10^13)` in 3.5 minutes and `P(10^14)` in a few hours. This is a major improvement. Still falls a bit short of the 10^16 mark, but I am still working on it. Every day I manage to improve it a little. @גלעדברקן — MC From Scratch, May 03 '19 at 13:07
Memory use becomes an issue with large n, and restricting memory makes the time turn linear. I used this algorithm in C with 128-bit types on an M1 Macbook to get the time to `P(10^13)` in 3.3s, `P(10^14)` in 19s, `P(10^15)` in 3 minutes, and `P(10^16)` in 33 minutes. The last could be faster with a machine with more memory (this used 13GB). Results match OEIS. — DanaJ, May 06 '23 at 03:08

score 1 · Answer 1 · answered Apr 30 '19 at 00:18

If you want to know the totient of a single number n, the best way to find it is to factor n and take the product of 1 less than each factor; for instance, 30 = 2 * 3 * 5, and subtracting 1 from each factor, then multiplying, gives the totient 1 * 2 * 4 = 8. But if you want to find the totients of all the numbers less than a given n, a better approach than factoring each of them is sieving. The idea is simple: Set up an array X from 0 to n, store i in each X_i, then run through the array starting from 0 and whenever X_i = i loop over the multiples of i, multiplying each by (i − 1)/i. You can compute the sum at the end, or accumulate it as you go. Since your sieve will be large, you will need to segment it.

Here are some useful pages from my blog: Sieving For Totients and Segmented Sieve of Eratosthenes. If you poke around there you might also find some other interesting things.

As I mentioned, I tried this approach but it's rather slow compared to others. But even if I do that, how can I apply the segmentation? Each following segment relies on the previous one, so it seems to me that once I 'throw' the previous segment, I can't really produce a following one. — MC From Scratch, Apr 30 '19 at 05:06

score 0 · Answer 2 · answered Apr 08 '20 at 13:04

For lower orders:

public class Solution {

    static Map<Long,Long> X2 = new HashMap<>();

    static long F2(long N){
        return N*(N-1)/2;
    }

    static long R2(long N){
        if(N==1) return 0;
        if(X2.containsKey(N)) return X2.get(N);
        long sum = F2(N);
        long m=2;
        while(true){
            long x = N/m;
            long nxt = N/x;
            if(nxt >= N) {
                X2.put(N, sum - (N-m+1)*R2(N/m));
                return X2.get(N);
            }
            sum -= (nxt-m+1) * R2(N/m);
            m = nxt+1;
        }
    }

    public long odradi(int N){
        return R2(N)+1;
    }

    public static void main(String[] args) {
        System.out.println(R2(100000)+1);
    }
}

Computing the sum of the totient function up to 10^16

2 Answers2