3

I'm trying to create a randomly generated "planet" (circle), and I want the areas of water, land and foliage to be decided by perlin noise, or something similar. Currently I have this (psudo)code:

for (int radius = 0; radius < circleRadius; radius++) {
    for (float theta = 0; theta < TWO_PI; theta += 0.1) {
        float x = radius * cosine(theta);
        float y = radius * sine(theta);
        int colour = whateverFunctionIMake(x, y);
        setPixel(x, y, colour);
    }
}

Not only does this not work (there are "gaps" in the circle because of precision issues), it's incredibly slow. Even if I increase the resolution by changing the increment to 0.01, it still has missing pixels and is even slower (I get 10fps on my mediocre computer using Java (I know not the best) and an increment of 0.01. This is certainly not acceptable for a game).

How might I achieve a similar result whilst being much less computationally expensive?

Thanks in advance.

Jachdich
  • 782
  • 8
  • 23

1 Answers1

1

Why not use:

(x-x0)^2 + (y-y0)^2 <= r^2

so simply:

int x0=?,y0=?,r=?; // your planet position and size
int x,y,xx,rr,col;
for (rr=r*r,x=-r;x<=r;x++)
 for (xx=x*x,y=-r;y<=r;y++)
  if (xx+(y*y)<=rr)
   {
   col = whateverFunctionIMake(x, y);
   setPixel(x0+x, y0+y, col);   
   }

all on integers, no floating or slow operations, no gaps ... Do not forget to use randseed for the coloring function ...

[Edit1] some more stuff

Now if you want speed than you need direct pixel access (in most platforms Pixels, SetPixel, PutPixels etc are slooow. because they perform a lot of stuff like range checking, color conversions etc ... ) In case you got direct pixel access or render into your own array/image whatever you need to add clipping with screen (so you do not need to check if pixel is inside screen on each pixel) to avoid access violations if your circle is overlapping screen.

As mentioned in the comments you can get rid of the x*x and y*y inside loop using previous value (as both x,y are only incrementing). For more info about it see:

the math is like this:

(x+1)^2 = (x+1)*(x+1) = x^2 + 2x + 1

so instead of xx = x*x we just do xx+=x+x+1 for not incremented yet x or xx+=x+x-1 if x is already incremented.

When put all together I got this:

void circle(int x,int y,int r,DWORD c)
    {
    // my Pixel access
    int **Pixels=Main->pyx;         // Pixels[y][x]
    int   xs=Main->xs;              // resolution
    int   ys=Main->ys;
    // circle
    int sx,sy,sx0,sx1,sy0,sy1;      // [screen]
    int cx,cy,cx0,    cy0    ;      // [circle]
    int rr=r*r,cxx,cyy,cxx0,cyy0;   // [circle^2]
    // BBOX + screen clip
    sx0=x-r; if (sx0>=xs) return; if (sx0<  0) sx0=0;
    sy0=y-r; if (sy0>=ys) return; if (sy0<  0) sy0=0;
    sx1=x+r; if (sx1<  0) return; if (sx1>=xs) sx1=xs-1;
    sy1=y+r; if (sy1<  0) return; if (sy1>=ys) sy1=ys-1;
    cx0=sx0-x; cxx0=cx0*cx0;
    cy0=sy0-y; cyy0=cy0*cy0;
    // render
    for (cxx=cxx0,cx=cx0,sx=sx0;sx<=sx1;sx++,cxx+=cx,cx++,cxx+=cx)
     for (cyy=cyy0,cy=cy0,sy=sy0;sy<=sy1;sy++,cyy+=cy,cy++,cyy+=cy)
      if (cxx+cyy<=rr)
       Pixels[sy][sx]=c;
    }

This renders a circle with radius 512 px in ~35ms so 23.5 Mpx/s filling on mine setup (AMD A8-5500 3.2GHz Win7 64bit single thread VCL/GDI 32bit app coded by BDS2006 C++). Just change the direct pixel access to style/api you use ...

[Edit2]

to measure speed on x86/x64 you can use RDTSC asm instruction here some ancient C++ code I used ages ago (on 32bit environment without native 64bit stuff):

double _rdtsc()
    {
    LARGE_INTEGER x; // unsigned 64bit integer variable from windows.h I think
    DWORD l,h;       // standard unsigned 32 bit variables
    asm {
        rdtsc
        mov l,eax
        mov h,edx
        }
    x.LowPart=l;
    x.HighPart=h;
    return double(x.QuadPart);
    }

It returns clocks your CPU has elapsed since power up. Beware you should account for overflows as on fast machines the 32bit counter is overflowing in seconds. Also each core has separate counter so set affinity to single CPU. On variable speed clock before measurement heat upi CPU by some computation and to convert to time just divide by CPU clock frequency. To obtain it just do this:

t0=_rdtsc()
sleep(250);
t1=_rdtsc();
fcpu = (t1-t0)*4;

and measurement:

t0=_rdtsc()
mesured stuff
t1=_rdtsc();
time = (t1-t0)/fcpu

if t1<t0 you overflowed and you need to add the a constant to result or measure again. Also the measured process must take less than overflow period. To enhance precision ignore OS granularity. for more info see:

Spektre
  • 49,595
  • 11
  • 110
  • 380
  • My god why didn't I think of that?! I totally forgot that you don't even need the square root. – Jachdich Apr 08 '20 at 09:41
  • @Jachdich :) do not worry you will think like that with time... btw you can get rid of the multiplication too if you think that x and y are incrementing you can use previous value ... you know `(a+1)^2 = a^2 + da` ... but that is overkill on most modern CPUs as multiplication is not that slow anymore – Spektre Apr 08 '20 at 09:44
  • @Jachdich see my [simple random map generator with height and biomes](https://stackoverflow.com/a/36647622/2521214) for the coloring of planet ideas and [32bit SQRT in 16T without multiplication](https://stackoverflow.com/a/34657972/2521214) for the trick I described in previous comment – Spektre Apr 08 '20 at 10:57
  • @Jachdich was currious so I tried to code this ... see edit1 with result – Spektre Apr 08 '20 at 12:34
  • Thanks for the extra code, I'll certainly use the optimisation about the incrementing x value. Unfortunately, the game engine that I'm using seems to only have a Draw(x, y, colour) function but it does seem to be decently fast (I'm getting well over 2000fps drawing a small 15 radius circle on a potato (2nd gen i3, Linux might make it a bit faster though)) – Jachdich Apr 08 '20 at 21:36
  • @Jachdich the slooow versions of pixel access is like 10000 times slower. On some environments is faster to render into own array and then pass it to gfx api at once when done. if I recompute pixel filling speeds Yours is around `1.41 MPx/s` and mine `23.5 Mpx/s`. So yours would need `580 ms` to fill `512 px` disc and mine take `35 ms`. But with small radius you got big overhead, also JAVA is slow vs C++ and different gfx memory access by CPU throughoutput kiks in a lot too... so try to measure bigger radius.... I would expect JAVA is ~10 times slower than C++ – Spektre Apr 09 '20 at 08:24
  • I am actually using C++ now, forgot to mention. I'll try some more precise timing, as doing "nothing" I only get ~3000fps, so the overhead of handling the events etc needs to be taken into account. – Jachdich Apr 09 '20 at 08:39
  • @Jachdich yep you need to measure just the circle time ... in Windows there are `PerformanceCounters` that I using... On Linux I have no idea ... but you can still use RDTSC asssembly instruction see [Measuring Cache Latencies](https://stackoverflow.com/a/21548494/2521214) – Spektre Apr 09 '20 at 08:42