Is this algorithm of O(n^4) avoidable?

Question

DISCLAIMER: This is not a homework question. I don't even go to school.

#include <stdio.h>
void printIntersect(float, float, float, float);

int main(){
   int x, y, z, a;
   float slopes[] = {5, 9, 4/3.0f, 2/3.0f, 3};
   float inter[] = {3, 2, 1, 0, 5/3.0f};  

   for(x = 0; x < (sizeof(slopes) / sizeof(slopes[0])) - 1; x++)
       for(y = 1; y < (sizeof(slopes) / sizeof(slopes[0])); y++)
           for(z = 0; z < sizeof(inter) / sizeof(inter[0]); z++)
                  for(a = 0; a < sizeof(inter) / sizeof(inter[0]); a++)
                      if(slopes[x] != slopes[y])
                          printIntersect(slopes[x], slopes[y], inter[z], inter[a]);

   return 0;
}

void printIntersect(float m_one, float m_two, float b_one, float b_two){
    float m_final, b_final, x_intersect;

    m_final = m_one - m_two;
    b_final = b_two - b_one;

    if(m_final < 0 && b_final < 0)
        m_final *= -1.0f, b_final *= -1.0f;

    if (b_final != 0)
        x_intersect = b_final / m_final;
    else
        x_intersect = 0;  

        printf("The intersection of y = %.2fx %c %.2f and y = %.2fx %c %.2f is x = %.2f\n",
            m_one, (b_one < 0) ? '-' : '+', b_one, m_two, (b_two < 0) ? '-' : '+', b_two, x_intersect);


    return;
}

Scenario: There was an exercise in one of my C books that I'm unsure of. What I got out of the question was that I was to have two arrays: one representing the possible slopes of a line, the other representing all possible y intercepts. The goal was to use all possible combinations of slopes and intercepts with two lines to find their intersection. I was to ignore parallel lines and duplicate lines (which is implicitly ignored anyway considering if they can't be parallel, then there's no way that they can be the same line).

Assuming that's premise (and I really don't care at this point, it's just an exercise), the program I wrote uses 4 nested for loops. You can see why that concerns me, but then again, maybe the 4 of them are necessary.

Since I can't have parallel lines, I iterate the slopes by starting with that of the first line and then iterate through all other slopes in the array as the second line's slope. It's this way that I get all possible combinations of slopes.

Intercepts are different because I can have a lines of the same intercepts and still have them be different. So iteration between those doesn't need to account for duplicates. That being said, the way I iterate through the array of intercepts should account for every possible pair in those two lines.

If the lines are not parallel, I call a function which will print the x intercept.

My question is, could this O(n^4) solution been avoided or at least simplified? Do you have any tips on processing arrays like these?

This isn't really an n^4 problem, it's an n^2 problem. It's just that your definition of n is off - it's the number of lines you're comparing, not the number of slopes and intercepts. Obviously the number of lines is slopes*intercepts. — Mark Ransom, Jul 16 '13 at 02:33
I thought n denoted algorithm efficiency, in which case 4 nested for loops would provide an algorithm of O(n^4), not O(n^2). — Theo Chronic, Jul 16 '13 at 02:40
@MarkRansom No, I think OP is right this time; the slopes and intercepts are the input here, so the algorithm runtime should be in terms of the sizes of those arrays, not the number of lines. — Dennis Meng, Jul 16 '13 at 03:49
@TheoChronic, more specifically it's the combination of two different O(n^2) algorithms, one for determining the lines and one for determining their intersections. The combination might be O(n^4) but I think it's useful to think of the problem in its broken down form. — Mark Ransom, Jul 16 '13 at 04:10

Claudiu · Answer 1 · 2013-07-16T04:48:12.577

Given a slopes and b intersects, you can make a*b lines. There are a*b choose 2 pairs of lines. This is about half as much as a*b*a*b (it is (a*b*(a*b-1)/2). Given no additional information, it seems like you have to check all of them - so yep your algo is indeed O(a*a*b*b).

EDIT: Let's look at the question differently, in terms of the answer. Given a slopes and b intersects, and making a*b lines by combining them, how many intersections will we have? Let's assume for simplicity that the set of slopes is distinct.

This is a different question than asking how many intersects we have given n lines, because of the way the lines are constructed. Given one slope, we create b parallel lines. Given the next, we create another b parallel lines, none of which are parallel to the first set of lines. Repeat a times.

Within one set, we have no intersections since they're all parallel.

Between two sets, how many intersections do we have? None of the lines in one set are parallel to the lines in the other set, so each line in one set will intersect once with each line of the second set. As each set has b lines, there will be b*b intersections between any two sets.

Here we note that all these sets of lines also share the same set of b intersects. So, for each additional set of lines you intersect with the current set of intersected lines, you will add (b*b - b)*c = b*c*(b-1) intersections where c is the number of sets of lines already included, because the intersections at the b y-intercepts are already accounted for, so we only add b*(b - 1) intersections for each set of lines already there.

So we have our initial b^2 for two sets, plus b*(b - 1)*2 for adding the third, plus b*(b - 1)*3 for adding the fourth, etc., up to b*(b-1)*(a-1) for adding the ath set. That is, the number of intersections I is:

I = b^2 + 2b(b-1) + 3b(b-1) + ... + (a-1)b(b-1)

We can re-write b^2 as b + b(b-1):

I = b + b(b-1) + 2b(b-1) + ... + (a-1)b(b-1)

Factor out the common b(b-1) in the last a-1 terms:

I = b + b(b-1)[1 + 2 + ... + (a-1)]

We note that the sum of numbers from 1 to x is x*(x+1)/2:

                (a-1)*a 
I = b + b(b-1)  ------- 
                   2

        a*(a-1)*b*(b-1)
  = b + ---------------
               2

        (a^2 - a)(b^2 - b)
  = b + ------------------
                2

    a^2*b^2 - a^2*b - a*b^2 + a*b + 2b
  = ----------------------------------
                    2

Thus, whatever algorithm you use, you're going to have to generate that many separate pieces of output, which is indeed less than a^2*b^2, but not by a significant amount.

Zong · Answer 2 · 2013-07-16T02:31:18.510

2

Well we can already halve the loop by changing the starting index of the second loop.

for (y = 1, ...)

to

for (y = x + 1; ...)

giving

for (x = 0; x < (sizeof(slopes) / sizeof(slopes[0])) - 1; x++)
    for (y = x + 1; y < (sizeof(slopes) / sizeof(slopes[0])); y++)
        for (z = 0; z < sizeof(inter) / sizeof(inter[0]); z++)
            for (a = 0; a < sizeof(inter) / sizeof(inter[0]); a++)
                printIntersect(slopes[x], slopes[y], inter[z], inter[a]);

I've also removed the equality check, even though as jxh pointed out, slopes may contain duplicate entries. Well, I did this because you can remove duplicates in O(n).

Of course this doesn't change the overall time complexity of the algorithm, but it should help with the runtime.

edited Jul 16 '13 at 02:31

answered Jul 16 '13 at 02:22

Zong

6,160
5
32
46

Problem asked the program to ignore parallel and duplicate lines, but the slopes array may still have more than one entry with the same slope. – jxh Jul 16 '13 at 02:27
Oh man, thank you. I totally missed that. In my version, on my second iteration, y = x = 1, which is something I tried to avoid. – Theo Chronic Jul 16 '13 at 02:32

score 0 · Answer 3 · edited May 23 '17 at 11:49

I don't believe you can get away from O(M² x Y²) (M distinct slopes and Y distinct intercepts), but you can filter your arrays so that they are distinct first. This allows your program to run faster in the case that the input contains many duplicates. Also, without removing the duplicate y-intercepts, you may also be generating and printing duplicated line intersections.

So, using qsort you can sort the input, and then remove the duplicates. Since sorting is O(N x log₂N), it does't affect the overall complexity of your algorithm (and speeds it up in the case where there are many duplicates). It is possible to eliminate duplicates faster using a hashtable, but the standard C library does not provide one (you would have to find one or implement one yourself).

#define ARRAY_SZ(x) (sizeof(x)/((void *)x == &x ? sizeof(x[0]) : 0))

static int is_equal_float (float a, float b) { /*...*/ }

static int cmp_float (const void *a, const void *b) {
    float af = *(const float *)a;
    float bf = *(const float *)b;
    if (is_equal_float(af, bf)) return 0;
    return (af > bf) - (af < bf);
}

/* ... */
qsort(slopes, ARRAY_SZ(slopes), sizeof(slopes[0]), cmp_float);
qsort(inter, ARRAY_SZ(inter), sizeof(slopes[0]), cmp_float);

for (x = 0, y = 1; y < ARRAY_SZ(slopes) - 1; y++) 
    if (slopes[y] != slopes[x]) slopes[++x] slopes[y];
for (z = 0, a = 1; a < ARRAY_SZ(inter) - 1; a++) 
    if (inter[z] != inter[a]) inter[++z] slopes[a];
M = x+1;
Y = z+1;

for (x = 0; x < M - 1; x++) 
    for (y = x + 1; y < M; y++)
        for (z = 0; z < Y; z++)
            for (a = 0; a < Y; a++)
                printIntersect(slopes[x], slopes[y], inter[z], inter[a]);

I gloss over how to compare floats, but you can consult Most effective way for float and double comparison for answers on how to do it in a way that meets your application's needs.

Just to be a little pedantic, you can remove duplicates using a hashtable in O(n). Also, log2(n) = ln(n) / ln(2) so you don't need to specify the base as it's a constant factor. — Zong, Jul 16 '13 at 03:30
@ZongZhengLi: You're right, but the size of the hashtable to achieve 0 collisions is not known. The log base is just a hint to the nature of the algorithm (binary divide and conquer). — jxh, Jul 16 '13 at 03:35

Is this algorithm of O(n^4) avoidable?

3 Answers3