2

I need to find all divisors of all numbers between 1 and n (including 1 and n). where n equals 10^6 and I want to store them in the vector.

vector< vector<int> > divisors(1000000);
void abc()
{
    long int n=1,num;
    while(n<1000000)
    {
        num=n;
        int limit=sqrt(num);
        for(long int i=1;i<limit;i++)
        {
            if(num%i==0)
            {
                divisors[n].push_back(i);
                divisors[n].push_back(num/i);
            }
        }
        n++;
    }
}

This is too much time taking as well. Can i optimize it in any way?

hasan
  • 23,815
  • 10
  • 63
  • 101
vivek sehgal
  • 59
  • 2
  • 4
  • @shapiroyaacov why? I need all the divisors of numbers from 1 to 1000000 – vivek sehgal Oct 11 '15 at 05:55
  • If I understand correctly, you want all the divisors for each number in the range `[1,..,1000000]`? If this is so, why are you checking all the possible divisors for each number in the range? whatever is the list of divisors for `10` is also in the list for `20, 30, .., 90, 100, .. , 1000` etc. – shapiro yaacov Oct 11 '15 at 05:57
  • @shapiroyaacov then how can I improve my code ? – vivek sehgal Oct 11 '15 at 06:01
  • Try implementing an algorithm based on the idea shapiro mentioned. That is a bit more than tweaking your code though - it actually requires a different logic. – Peter Oct 11 '15 at 06:38
  • @viveksehgal what is too much time and on what platform? – Spektre Oct 11 '15 at 11:20

4 Answers4

2
const int N = 1000000;
vector<vector<int>> divisors(N+1);

for (int i = 2; i <= N; i++) {
  for (j = i; j <= N; j += i) {
    divisors[j].push_back(i);
  }
}

this runs in O(N*log(N))

Intuition is that upper N/2 numbers are run only once. Then from remaining numbers upper half are run once more ...

Other way around. If you increase N from lets say 10^6 to 10^7, than you have as many opertions as at 10^6 times 10. (that is linear), but what is extra are numbers from 10^6 to 10^7 that doesnt run more than 10 times each at worst.

number of operaions is

sum (N/n for n from 1 to N)

this becomes then N * sum(1/n for n from 1 to N) and this is N*log(N) that can be shown using integration of 1/x over dx from 1 to N

We can see that algorhitm is optimal, because there is as many operation as is number of divisors. Size of result or total number of divisors is same as complexity of algorhitm.

Luka Rahne
  • 10,336
  • 3
  • 34
  • 56
1

I think this might not be the best solution, but it is much better than the one presented, so here we go:

Go over all the numbers (i) from 1 to n, and for each number:

  1. Add the number to the list of itself.
  2. Set multiplier to 2.
  3. Add i to the list of i * multiplier.
  4. increase multiplier.
  5. Repeat steps 3 & 4 until i * multiplier is greater than n.
shapiro yaacov
  • 2,308
  • 2
  • 26
  • 39
  • Dont you think that this solution is O(n^2)? – vivek sehgal Oct 11 '15 at 06:08
  • Definitely is `O(n^2)`. But you want faster code, and this seems to me as something that will work faster. Why? there is no waste of actions. Every time #3 is done, you're adding a number somewhere (`if (num % i == 0)` fails most times in your code - for every `n` larger than 2). Also, I mentioned that AFAIK, is is not the optimal solution, just an improvement... – shapiro yaacov Oct 11 '15 at 06:20
  • I think this is `O(n log n)`, not `O(n^2)`. For each successive outer-loop iteration, there are fewer inner-loop iterations. It seems like the complexity here involves the sum of [harmonic series](https://en.wikipedia.org/wiki/Harmonic_series_(mathematics)), which is `log n`. – anatolyg Oct 11 '15 at 09:17
0

[Edit3] complete reedit

Your current approach is O(n^1.5) not O(n^2)

Originally I suggested to see Why are my nested for loops taking so long to compute?

But as Oliver Charlesworth suggested to me to read About Vectors growth That should not be much of an issue here (also the measurements confirmed it).

So no need to preallocating of memroy for the list (it would just waste memory and due to CACHE barriers even lower the overall performance at least on mine setup).

So how to optimize?

  • either lower the constant time so the runtime is better of your iteration (even with worse complexity)
  • or lower the complexity so much that overhead is not bigger to actually have some speedup

I would start with SoF (Sieve of Eratosthenes)

But instead setting number as divisible I would add currently iterated sieve to the number divisor list. This should be O(n^2) but with much lower overhead (no divisions and fully parallelisable) if coded right.

  1. start computing SoF for all numbers i=2,3,4,5,...,n-1

  2. for each number x you hit do not update SoF table (you do not need it). Instead add the iterated sieve i to the divisor list of x. Something like:

C++ source:

const int n=1000000;
List<int> divs[n];
void divisors()
    {
    int i,x;
    for (i=1;i<n;i++)
     for (x=i;x<n;x+=i)
      divs[x].add(i);
    }
  • This took 1.739s and found 13969984 divisors total, max 240 divisors per number (including 1 and x). As you can see it does not use any divisions. and the divisors are sorted ascending.

  • List<int> is dynamic list of integers template (something like your vector<>)

You can adapt this to your kind of iteration so you can check up to nn=sqrt(n) and add 2 divisors per iteration that is O(n^1.5*log(n)) with different constant time (overhead) a bit slower due to single division need and duplicity check (log(n) with high base) so you need to measure if it speeds things up or not on my setup is this way slower (~2.383s even if it is with better complexity).

const int n=1000000;
List<int> divs[n];
int i,j,x,y,nn=sqrt(n);

for (i=1;i<=nn;i++)
 for (x=i;x<n;x+=i)
    {
    for (y=divs[x].num-1;y>=0;y--)
     if (i==divs[x][y]) break;
    if (y<0) divs[x].add(i);
    j=x/i;
    for (y=divs[x].num-1;y>=0;y--)
     if (j==divs[x][y]) break;
    if (y<0) divs[x].add(j);
    }

Next thing is to use direct memory access (not sure you can do that with vector<>) my list is capable of such thing do not confuse it with hardware DMA this is just avoidance of array range checking. This speeds up the constant overhead of the duplicity check and the result time is [1.793s] which is a little bit slower then the raw SoF O(n^2) version. So if you got bigger n this would be the way.

[Notes]

If you want to do prime decomposition then iterate i only through primes (in that case you need the SoF table) ...

If you got problems with the SoF or primes look at Prime numbers by Eratosthenes quicker sequential than concurrently? for some additional ideas on this

Community
  • 1
  • 1
Spektre
  • 49,595
  • 11
  • 110
  • 380
  • 2
    Assuming this is C++, then the list growth thing is incorrect; `vector::push_back` is amortized O(1). – Oliver Charlesworth Oct 11 '15 at 09:30
  • @OliverCharlesworth That is true only if `vector<>` is linked list instead of linear array... As I do not use it I do not know ... – Spektre Oct 11 '15 at 09:32
  • `vector` is a standard linear array, not a linked-list. – Oliver Charlesworth Oct 11 '15 at 09:33
  • @OliverCharlesworth in that case without preallocation my point stands ... imagine you adding 1000 numbers if the array was 16 items on start you need to reallocate over and over (how often depends on the growth policy of the template I double the size usually)... – Spektre Oct 11 '15 at 09:36
  • Indeed - assuming an exponential growth policy, then you still end up with O(1) amortized. – Oliver Charlesworth Oct 11 '15 at 09:36
  • @OliverCharlesworth that is not true for heavy list push using and as this program does not do anything else ... then I estimate it is more like `O(log(m))` with very small `c` and `m` is the total size of the list at the end, base of logarithm is 2 if doubling the size but correct me if I am wrong in this . ... like to know. (that is at least my experience with linear array lists) – Spektre Oct 11 '15 at 09:39
  • @OliverCharlesworth WoW that is so cool :) so I am actually lovering the `c` constant instead of complexity did not look at this in this way I would try to rewrite the growth section please check it later and repair if still something wrong there – Spektre Oct 11 '15 at 09:47
  • @OliverCharlesworth done reediting ... I think we should also clear the comment mess here and leave all in one comment ... – Spektre Oct 11 '15 at 10:22
0

Another optimization is not to use -vector- nor -list- , but a large array of divisors, see http://oeis.org/A027750

First step: Sieve of number of divisors

Second step: Sieve of divisors with the total number of divisors

Note: A maximum of 20-fold time increase for 10-fold range. --> O(N*log(N))

Dev-C++ 5.11 , in C

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int SieveNbOfDiv(int NumberOfDivisors[], int IndexCount[], int Limit) {
    for (int i = 1; i*i <= Limit; i++) {
        NumberOfDivisors[i*i] += 1;
        for (int j = i*(i+1); j <= Limit; j += i )
            NumberOfDivisors[j] += 2;
    }
    int Count = 0;
    for (int i = 1; i <= Limit; i++) {
        Count += NumberOfDivisors[i];
        NumberOfDivisors[i] = Count;
        IndexCount[i] = Count;
    }
    return Count;
}

void SieveDivisors(int IndexCount[], int NumberOfDivisors[], int Divisors[], int Limit) {
    for (int i = 1; i <= Limit; i++) {
        Divisors[IndexCount[i-1]++] = 1;
        Divisors[IndexCount[i]-1] = i;
    }
    for (int i = 2; i*i <= Limit; i++) {
        Divisors[IndexCount[i*i-1]++] = i;
        for (int j = i*(i+1); j <= Limit; j += i ) {
            Divisors[IndexCount[j-1]++] = i;
            Divisors[NumberOfDivisors[j-1] + NumberOfDivisors[j] - IndexCount[j-1]] = j/i;
        }
    }
}

int main(int argc, char *argv[]) {
    int N = 1000000;
    if (argc > 1) N = atoi(argv[1]);
    int ToPrint = 0;
    if (argc > 2) ToPrint = atoi(argv[2]); 

    clock_t Start = clock();
    printf("Using sieve of divisors from 1 to %d\n\n", N);

    printf("Evaluating sieve of number of divisors ...\n");
    int *NumberOfDivisors = (int*) calloc(N+1, sizeof(int));
    int *IndexCount = (int*) calloc(N+1, sizeof(int));
    int size = SieveNbOfDiv(NumberOfDivisors, IndexCount, N);

    printf("Total number of divisors = %d\n", size);
    printf("%0.3f second(s)\n\n", (clock() - Start)/1000.0);

    printf("Evaluating sieve of divisors ...\n");
    int *Divisors = (int*) calloc(size+1, sizeof(int));
    SieveDivisors(IndexCount, NumberOfDivisors, Divisors, N);

    printf("%0.3f second(s)\n", (clock() - Start)/1000.0); 

    if (ToPrint == 1)
        for (int i = 1; i <= N; i++) {
            printf("%d(%d) = ", i, NumberOfDivisors[i] - NumberOfDivisors[i-1]);
            for (int j = NumberOfDivisors[i-1]; j < NumberOfDivisors[i]; j++)
                printf("%d ", Divisors[j]);
            printf("\n");
        }

    return 0;
}

With some results:

Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 100000
Using sieve of divisors from 1 to 100000

Evaluating sieve of number of divisors ...
Total number of divisors = 1166750
0.000 second(s)

Evaluating sieve of divisors ...
0.020 second(s)

c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 1000000
Using sieve of divisors from 1 to 1000000

Evaluating sieve of number of divisors ...
Total number of divisors = 13970034
0.060 second(s)

Evaluating sieve of divisors ...
0.610 second(s)

c:\Users\Ab\Documents\gcc\sievedivisors>sievedivisors 10000000
Using sieve of divisors from 1 to 10000000

Evaluating sieve of number of divisors ...
Total number of divisors = 162725364
0.995 second(s)

Evaluating sieve of divisors ...
11.900 second(s)

c:\Users\Ab\Documents\gcc\sievedivisors>
StackPC
  • 36
  • 2