Generate a list a(n) is not of the form prime + a(k), k < n

Question

How to generate this list in Python? a(n) is not of the form prime + a(k), k < n.

Here is the list on oeis http://oeis.org/A025043

It goes as 0, 1, 9, 10, 25, 34, 35, 49, 55, 85, 91, 100, 115, 121.

I've tried the bold way, didn't turn out well. Now I'm looking for a sophisticated solution, like Sieve of Eratosthenes for primes. The bold way requires iterating over every prime, and for every iteration of prime to iterate over every number already in the sequence which takes a really long time.

This table has been generated by someone smart: http://oeis.org/A025043/b025043.txt They either used a lot of computational power, or used a sophisticated algorithm, for which I'm looking for.

To explain what this sequence is, every number that isn't present in it can be represented as a sum of a prime and a number in that sequence. For example 8 = 7(prime) + 1(in sequence), 54 = 53(prime)+1(in sequence), etc.

I'm not looking for primes, they are simple. I'm looking for every number that isn't a sum of a prime and some number in the sequence(which starts at 0,1) — Stepan Filonov, Sep 03 '18 at 10:05
Yes, I already have it. But what next? Getting a list of primes is easy, as is making a short sequence. But when you go over a certain threshold, there are too many iterations. — Stepan Filonov, Sep 03 '18 at 10:08
@StepanFilonov: where did you find that problem?? description is not very clear — blue_note, Sep 03 '18 at 10:09
I need this list for research purposes, it happened to match a certain other sequence. The problem is generating this list with a good algorithm. — Stepan Filonov, Sep 03 '18 at 10:11
@MBo, around 10^7 range at least, need to go way beyond what's present here: http://oeis.org/A025043/b025043.txt — Stepan Filonov, Sep 03 '18 at 10:14
Have you tried asking David W. Wilson how he got that table? (He is listed in OEIS as the author of the OEIS entry and as the contributor of that table.) — Rory Daulton, Sep 03 '18 at 11:45
My C code generates the table up to 10k in 0.043s using brute force. It doesn't do anything clever. (Note, there's a confusion between limits -- whether its all the elements of the sequence up to N, or whether it's the first N terms of the sequence). — Paul Hankin, Sep 03 '18 at 11:58

score 7 · Accepted Answer · answered Sep 03 '18 at 11:52

The obvious way to generate this sequence is to generate all primes, and then to use a sieve. For each new element, x of the sequence you find, strike out x+p for all primes p in the desired range.

A mathoverflow comment guesses that the density of the sequence tends to N/sqrt(ln N). If that's right, then this code runs in time O(N^2/(ln N)^3/2). (Using that the number of primes less than N is approximately N/ln N).

That makes it pretty slow once N gets around 10^6, but running the code on my desktop suggests that even for N=10^7, you'll get the full list in a few hours. Note that the algorithm gets increasingly fast, so don't be too put off by the initial slowness.

Python 3 code:

import array

N = 10000

def gen_primes(N):
    a = array.array('b', [0] * (N+1))
    for i in range(2, N+1):
        if a[i]: continue
        yield i
        for j in range(i, N+1, i):
            a[j] = 1

def seq(N):
    primes = list(gen_primes(N))
    a = array.array('b', [0] * (N+1))
    for i in range(N+1):
        if a[i]: continue
        yield i
        for p in primes:
            if i + p > N:
                break
            a[i+p] = 1

for i in seq(N):
    print(i, end=' ', flush=True)
print()

Re-writing it in C, and compiling with gcc -O2 gives a much faster solution. This code generates the list up to 10^7 in 7m30s on my machine:

#include <stdio.h>
#include <string.h>

#define N 10000000

unsigned char A[N+1];
int primes[N];
int p_count=0;

int main(int argc, char **argv) {
    memset(A, 0, sizeof(A));
    for (int i=2; i<=N; i++) {
        if(A[i])continue;
        primes[p_count++] = i;
        for (int j=i; j<=N; j+=i)A[j]=1;
    }
    memset(A, 0, sizeof(A));
    for(int i=0; i<=N; i++) {
        if(A[i])continue;
        printf("%d ", i);
        fflush(stdout);
        for(int j=0; j<p_count && i+primes[j]<=N; j++)A[i+primes[j]]=1;
    }
    return 0;
}

Thank you for that C code, I think I'm going to implement something of that speed, it'll solve the problem. Much gratitude. — Stepan Filonov, Sep 03 '18 at 12:23
Sieve part of the code can be optimised by starting the `j` loop from `i*i` and the step can be updated as `2*i` — dodobhoot, Sep 04 '18 at 18:47
@dodobhoot the prime sieve is already a lot faster than the second part, so the optimizations don't really help much. — Paul Hankin, Sep 04 '18 at 19:19
@Paul Yes you are correct, that won't add much to the optimisation — dodobhoot, Sep 04 '18 at 19:24

Bernhard Barker · Answer 2 · 2018-09-03T13:33:21.410

The Sieve of Eratosthenes is going to look very similar to the one for primes. But you'll need to start off with a list of primes.

With primes you have a bunch of i * prime(k) terms, where prime is our sequence, and i is what we increase for each element in the sequence.

Here you have a bunch of prime(i) + a(k) terms, where a is our sequence and i is what we increase for each element in the sequence.

Of course the + instead of * makes a big difference in the overall efficiency.

The below code gets up to 100k in a few seconds, so it's going to be slow if you want to generate particularly large numbers.

You can wait and see if someone comes up with a faster method.

# taken from https://stackoverflow.com/questions/3939660
def primes_sieve(limit):
    a = [True] * limit
    a[0] = a[1] = False

    for (i, isprime) in enumerate(a):
        if isprime:
            yield i
            for n in range(i*i, limit, i):
                a[n] = False

def a_sieve(limit, primes):
    a = [True] * limit
    for (i, in_seq) in enumerate(a):
        if in_seq:
            yield i
            for n in primes:
                if i+n >= limit:
                    break
                a[i+n] = False

primes = list(primes_sieve(100000))
a = list(a_sieve(100000, primes))
# a = [0, 1, 9, 10, 25, 34, 35, 49, 55, 85, 91, 100, 115, 121, 125, 133, 145, ...]

For comparison, I wrote the below brute force methods, one which iterates over primes and checks if subtracting that gives an element in our sequence, and another which iterates over our sequence and checks if we get a prime, both of which takes about twice as long for 100k.

It does look somewhat similar to the Sieve, but it looks backwards instead of setting values forwards.

def a_brute(limit, primes):
    a = [True] * limit
    for i in range(limit):
        for p in primes:
            if i-p < 0:
                yield i
                break
            if a[i-p]:
                a[i] = False
                break
        else:
            yield i

def a_brute2(limit, primes_sieve):
    a = [True] * limit
    l = []
    for i in range(limit):
        for j in l:
            if i-j < 0:
                l.append(i)
                break
            if primes_sieve[i-j]:
                a[i] = False
                break
        else:
            l.append(i)
    return l

Result:

Primes sieve: 0.02 seconds
Sieve: 6.26 seconds
Brute force 1: 14.14 seconds
Brute force 2: 12.34 seconds

Generate a list a(n) is not of the form prime + a(k), k < n

2 Answers2