
Recently I've been playing around with computer vision and neural networks, and came across experimental object detection within a 3D application.
Surprisingly to me, I ran into the problem of converting one coordinate system to another (AFAIK Cartesian to polar/spherical).

Let me explain.
For example, we have a screenshot of a 3D application window (some 3D game).

Now, using OpenCV or a neural network, I'm able to detect the round spheres (in-game targets), as well as their X, Y coordinates within the game window (x, y pixel offsets).

If I programmatically move the mouse cursor to the given X, Y coordinates in order to aim at one of the targets, it works only in the desktop environment (moving the cursor on the desktop).
But when I switch to the 3D game, so that my mouse cursor is now inside the 3D game world, it does not work and does not aim at the target.

So I did some research on the topic, and what I found is that the mouse cursor is locked inside the 3D game.
Because of this, we cannot move the cursor using the MOUSEEVENTF_MOVE (0x0001) + MOUSEEVENTF_ABSOLUTE (0x8000) flags in the mouse_event win32 call.

We can only move the mouse programmatically using relative movement.
In theory, to get the relative movement offsets, we can calculate the offset of a detection from the middle of the 3D game window.
In that case, the relative movement vector would be something like (x=-100, y=0) if the target point is 100 px to the left of the middle of the screen.
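
As a sketch, the relative move itself can be sent like this (assuming Windows and ctypes; `mouse_event` is the legacy win32 call mentioned above, and `offset_from_center` is just a helper name I'm using for illustration):

import ctypes

MOUSEEVENTF_MOVE = 0x0001  # relative movement flag

def move_relative(dx, dy):
    # Relative movement still works while the game has captured the cursor;
    # MOUSEEVENTF_ABSOLUTE does not.
    ctypes.windll.user32.mouse_event(MOUSEEVENTF_MOVE, int(dx), int(dy), 0, 0)

def offset_from_center(target_xy, window_size):
    # Vector from the window center to the detected target,
    # e.g. (-100, 0) for a target 100 px left of center.
    return (target_xy[0] - window_size[0] / 2,
            target_xy[1] - window_size[1] / 2)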

The thing is, the crosshair inside the 3D game will not move 100 px to the left as expected, and will not aim at the given target.
It only moves a bit in the given direction.

After that, I did more research on the topic.
As I understand it, the crosshair inside a 3D game moves using angles in 3D space.
Specifically, there are only two of them: a horizontal movement angle and a vertical movement angle.

So the game engine takes our mouse movement and converts it into movement angles within the 3D world space.
That's how the crosshair movement is done inside a 3D game.
But we don't have access to that; all we can do is move the mouse with win32 calls externally.
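
Conceptually, the engine-side handling looks something like the sketch below (an assumption about typical engines, not this game's actual code; the 0.022 sensitivity value is made up):

def apply_mouse_delta(yaw, pitch, dx, dy, sensitivity=0.022):
    # Each relative mouse count is scaled into degrees of rotation;
    # pitch is usually clamped so the camera cannot flip over.
    yaw = (yaw + dx * sensitivity) % 360.0
    pitch = max(-89.0, min(89.0, pitch + dy * sensitivity))
    return yaw, pitch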

Then I decided to calculate pixels per degree: the number of pixels to pass to the win32 relative mouse movement in order to rotate the crosshair by 1 degree inside the game.
To do this, I wrote a simple calibration routine.

It turns out we need to move the mouse relatively with win32 by 16400 pixels horizontally in order to rotate the crosshair inside the game by 360 degrees.
And indeed, it works: 16400/2 moves the crosshair by 180 degrees, and so on.
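
Roughly, the calibration can be sketched like this (a hypothetical reconstruction, assuming Pillow and NumPy for the screenshot comparison): keep issuing relative moves until the view wraps back to the starting frame, then divide the total by 360.

import ctypes
import numpy as np
from PIL import ImageGrab  # assumes Pillow is available

MOUSEEVENTF_MOVE = 0x0001

def frames_match(a, b, tol=5.0):
    # Mean absolute pixel difference; a small value means the view
    # has wrapped around to where it started.
    return np.abs(np.asarray(a, dtype=float) - np.asarray(b, dtype=float)).mean() < tol

def calibrate_pixels_per_degree(step=50, max_pixels=30000):
    start = ImageGrab.grab()
    moved = 0
    while moved < max_pixels:
        ctypes.windll.user32.mouse_event(MOUSEEVENTF_MOVE, step, 0, 0, 0)
        moved += step
        if frames_match(start, ImageGrab.grab()):
            return moved / 360.0  # pixels per degree of crosshair rotation
    raise RuntimeError("no full 360-degree turn detected")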

What I did next was try to convert the screen X, Y target offsets to percentages (from the middle of the screen), and then convert those to degrees.

The overall formula looked like this (example for horizontal movement only):

w = 100           # screen width
x_offset = 10     # target x offset
hor_fov = 106.26  # horizontal field of view in degrees

degs = (hor_fov / 2) * (x_offset / w)  # 5.313 degrees

And indeed, it worked, but not quite as expected: the aiming precision varied depending on how far the target was from the middle of the screen.
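
For illustration (my own numbers, using the same FOV), here is how far a linear pixel-to-degree mapping drifts from the true perspective angle as the target moves away from the center:

import math

HOR_FOV = 106.26  # horizontal field of view in degrees

def linear_angle(frac):
    # frac: target offset as a fraction of the half-width (0 = center, 1 = edge)
    return (HOR_FOV / 2) * frac

def true_angle(frac):
    # Under a perspective projection, screen offset grows with tan(angle),
    # so the angle to a screen point involves atan, not a linear scale.
    return math.degrees(math.atan(frac * math.tan(math.radians(HOR_FOV / 2))))

for frac in (0.25, 0.5, 0.75, 1.0):
    print(f"{frac:.2f}: linear {linear_angle(frac):6.2f} deg, true {true_angle(frac):6.2f} deg")
# The two agree only at the center and at the very edge; in between,
# the linear estimate undershoots, matching the accuracy drift I saw.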

I'm not that great with trigonometry, but as far as I can tell, it has something to do with polar/spherical coordinates, because we can see only part of the game world both horizontally and vertically.
This is also called the FOV (field of view).

Because of this, in the given 3D game we can only see 106.26 degrees horizontally and 73.74 degrees vertically.

My guess is that I'm trying to convert coordinates from a linear system to something non-linear, and as a result the overall accuracy is not good enough.

I've also tried using math.atan in Python.
It works, but is still not accurate.

Here is the code:

from math import atan, degrees

def point_get_difference(source_point, dest_point):
    # Vector from source to destination, e.g.
    # source_point = (960, 540), dest_point = (833, 645) -> (-127, 105)
    x = dest_point[0] - source_point[0]
    y = dest_point[1] - source_point[1]
    return x, y

def get_move_angle__new(aim_target, gwr, pixels_per_degree, fov):
    # gwr is the game window rect: (left, top, width, height)
    game_window_rect__center = (gwr[2] / 2, gwr[3] / 2)
    rel_diff = list(point_get_difference(game_window_rect__center, aim_target))

    # Convert pixel offsets to degrees, then back to relative mouse pixels.
    # (Note: y is also divided by the horizontal half-size here.)
    x_degs = degrees(atan(rel_diff[0] / game_window_rect__center[0])) * ((fov[0] / 2) / 45)
    y_degs = degrees(atan(rel_diff[1] / game_window_rect__center[0])) * ((fov[1] / 2) / 45)
    rel_diff[0] = pixels_per_degree * x_degs
    rel_diff[1] = pixels_per_degree * y_degs

    return rel_diff, (x_degs + y_degs)

get_move_angle__new((900, 540), (0, 0, 1920, 1080), 16364/360, (106.26, 73.74))
# Output: ([-191.93420990140876, 0.0], -4.222458785413539)
# But it's not accurate; the true x_degs must be somewhat more or less than -4.22.

Is there a way to precisely convert 2D screen X, Y coordinates into 3D game crosshair movement degrees?
There must be a way, I just can't figure it out ...

Abraham Tugalov
  • It could be the FOV distorting things the further away things are from the center of the screen. – Leo Dec 17 '22 at 23:32
  • it depends on the game rendering method... If it's [3D perspective projection you have to take into account the `znear` value (or focal length) along with `FOVx,FOVy` angles of camera to adjust your angle computation](https://stackoverflow.com/a/61631066/2521214)... If it's [2.5D ray casting like wolfenstein](https://stackoverflow.com/a/47251071/2521214) then you need to take into account that it usually projects the x axis using `cos(x_angle)` or its inversion instead of the `x` coordinate directly – Spektre Dec 18 '22 at 12:42
  • @Spektre It's pretty much default 3D perspective projection, the camera renders everything on a 2D plane (screen) using near & far clipping planes. The thing is, we do not have actual `znear` & `zfar` values, as well as distance, size or position of the target object in 3D space. – Abraham Tugalov Dec 18 '22 at 14:46
  • @AbrahamTugalov weird site did not notify me... for perspective I would try: `FOVx = 2.0*atan(0.5*screen_x_resolution/znear); angle_x = atan(screen_x_from_center/znear);` so you might want to measure/fit the stuff you do not know to obtain znear,FOVx ... also the `screen_x` might need to be scaled (by screen resolution and sometimes even with zoom). In case you hook into the game process you might even [obtain video and depth buffers](https://stackoverflow.com/a/38549548/2521214) from there you can obtain even [original 3D position of the target](https://stackoverflow.com/a/51764105/2521214) – Spektre Dec 19 '22 at 06:04
  • Since perspective projection maps 3d points in the same direction on the same point on the 2d screen, depth wouldn't matter. Naturally thinking, `x_degs = degrees(atan(rel_diff[0] / game_window_rect__center[0] * tan(radians(fov[0] / 2))))` and `y_degs = degrees(atan(rel_diff[1] / game_window_rect__center[1] * tan(radians(fov[1] / 2))))`. – ardget Dec 21 '22 at 05:03
  • @ardget Yeah, it works, but still not accurately. Your solution suffers from the same inaccuracy as the answer given by `Simon Lundberg`. I guess the formula is missing something (maybe to do with `PHI/Theta`). – Abraham Tugalov Dec 21 '22 at 18:51
  • It seems to me that the main difficulty here is working out how the game engine is projecting your 3D space to the 2D screen. If you can't find that out in a more direct way, you could scatter a few points around in 3D space and record their 2D screen coordinates. Then, hopefully, it shouldn't be that hard to work out which method it is using. – Simon Goater Dec 24 '22 at 17:09

1 Answer


The angle to the half-way point between the center and the edge of the screen is not equal to the field of view divided by four. As you noticed, the relationship is nonlinear.

The angle between a fractional position on the screen (0-1) and the middle of the screen can be calculated as follows. This is for the horizontal rotation (i.e., around the vertical axis), so we're only considering the X position on the screen.

# angle is the angle in radians that the camera needs to
# rotate to aim at the point.

# px is the point's x position on the screen, normalised by
# the resolution (so 0.0 for the left-most pixel, 0.5 for
# the centre and 1.0 for the right-most).

# FOV is the field of view in the x dimension, in radians.
angle = math.atan((px - 0.5) * 2 * math.tan(FOV / 2))

For a field of view of 100 degrees and a px of zero, that gives us -50 degrees of rotation (exactly half the field of view). For a px of 0.25 (half-way between the edge and the middle), we get a rotation of around -31 degrees.
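
A quick check of those numbers using the formula above:

import math

FOV = math.radians(100)  # horizontal field of view
for px in (0.0, 0.25):
    angle = math.atan((px - 0.5) * 2 * math.tan(FOV / 2))
    print(px, math.degrees(angle))
# 0.0  -> -50.0
# 0.25 -> about -30.79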

Note that the 2*math.tan(FOV/2) part is constant for any given field of view, so you can calculate it in advance and store it. Then it just becomes (assuming we named it z):

angle = math.atan((px - 0.5) * z)

Just do that for both x and y and it should work.

Edit / update:

Here is a complete function. I've tested it, and it seems to work.

import math

def get_angles(aim_target, window_size, fov):
    """
    Get (x, y) angles from center of image to aim_target.

    Args:
        aim_target: pair of numbers (x, y) where to aim
        window_size: size of area (x, y)
        fov: field of view in degrees, (horizontal, vertical)

    Returns:
        Pair of floating point angles (x, y) in degrees
    """
    fov = (math.radians(fov[0]), math.radians(fov[1]))

    x_pos = aim_target[0] / (window_size[0] - 1)
    y_pos = aim_target[1] / (window_size[1] - 1)

    x_angle = math.atan((x_pos - 0.5) * 2 * math.tan(fov[0] / 2))
    y_angle = math.atan((y_pos - 0.5) * 2 * math.tan(fov[1] / 2))

    return (math.degrees(x_angle), math.degrees(y_angle))


print(get_angles(
    (0, 0), (1920, 1080), (100, 67.67)
), "should be around -50, -33.835")

print(get_angles(
    (1919, 1079), (1920, 1080), (100, 67.67)
), "should be around 50, 33.835")

print(get_angles(
    (959.5, 539.5), (1920, 1080), (100, 67.67)
), "should be around 0, 0")

print(get_angles(
    (479.75, 269.75), (1920, 1080), (100, 67.67)
), "should be around -30.79, -18.53")
Simon Lundberg
  • Looks like the provided formula is missing something. For the given input `get_move_angle__new((441, 446), (0, 0, 1920, 1080), 16364/360, [106.26, 73.74])` the output angles must be closer to `[-41.88894918065101, -8.580429158509922]`, but it's `[-41.588949180651014, -12.680429158509924]`. As you can see, a correction of `[-0.30000000000000004, 4.100000000000001]` was made. In practice, most of the time it aims perfectly on the horizontal axis *(but not if the target is too far away; then accuracy falls)*. And the vertical axis is very inaccurate *(maybe it's something to do with PHI/Theta?)*. – Abraham Tugalov Dec 21 '22 at 18:49
  • It should also be noted that vertical rotation (aka `pitch`) is not the same as horizontal (aka `yaw`). The vertical axis simply cannot be rotated 360 degrees. I guess it's limited to 180 degrees, so the crosshair can only look straight up and straight down. – Abraham Tugalov Dec 21 '22 at 18:58
  • Did you account for the fact that indexing is from zero to one less than size? I.e, the normalisation shouldn't be `xpos/xsize`, but rather `xpos/(xsize-1)`? – Simon Lundberg Dec 21 '22 at 19:15
  • Made a little mistake in the comment above. The example target position is not `(441, 446)`, it's `(321, 378)` for the provided angles. Returning to your updated answer: yes, I did everything as you describe. Just to be sure, I copied & pasted your function and tested it again. Still, the output of `get_angles((321, 378), (1920, 1080), (106.26, 73.74))` is `(-41.58150209957484, -12.653891692768738)` but it's wrong. It must be closer to `[-41.88894918065101, -8.580429158509922]` or it misses the target. – Abraham Tugalov Dec 21 '22 at 20:40
  • Can you provide some reason as to why you think my numbers are wrong? How did you calculate -8.58 degrees in that example? – Simon Lundberg Dec 21 '22 at 23:05
  • I've corrected output angles manually in order to aim the given target in 3D space accurately. – Abraham Tugalov Dec 22 '22 at 06:34
  • Well, I have no explanation for that. I'm fairly confident in my trigonometry here. It's possible that the game you're testing uses some kind of distortion. That would make it near-impossible to figure out an exact formula. You might need to use a heuristic technique like a lookup table or fit a polynomial to correct it. – Simon Lundberg Dec 22 '22 at 10:32
  • Maybe, dunno. Let me test it in my own 3D application (I can make it using Unity or WebGL), and then I tell you if it's the case. – Abraham Tugalov Dec 22 '22 at 17:29