
I have an array (lons) of longitude values in the range [-180, 180]. I need to find the mean of the time series. This is easily done with

np.mean(lons)

This straightforward mean, of course, doesn't work if the series contains values on either side of the dateline. What is the correct way of calculating the mean for all possible cases? Note that I would rather not have a condition that treats dateline-crossing cases differently.

I've played around with np.unwrap after converting from degrees to radians, but I know my calculations are wrong because a small percentage of cases give mean longitudes somewhere near 0 degrees (the prime meridian), over Africa. These aren't possible as this is an ocean data set.
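For reference, a minimal sketch of the np.unwrap approach, set against the naive mean. It assumes consecutive samples differ by less than 180 degrees, and the helper name unwrap_mean_lon is hypothetical:

```python
import numpy as np

def unwrap_mean_lon(lons_deg):
    """Mean longitude via np.unwrap; hypothetical helper for illustration."""
    rad = np.deg2rad(np.asarray(lons_deg, dtype=float))
    # Remove 2*pi jumps between consecutive samples
    # (assumes successive steps are smaller than 180 degrees).
    unwrapped = np.unwrap(rad)
    mean_deg = np.rad2deg(np.mean(unwrapped))
    # Wrap the result back into [-180, 180).
    return (mean_deg + 180.0) % 360.0 - 180.0

lons = np.array([170.0, 175.0, -175.0, -170.0])
print(np.mean(lons))          # naive arithmetic mean, lands on the prime meridian
print(unwrap_mean_lon(lons))  # close to the dateline (+/-180)
```

Because unwrapping depends on sample order, this only makes sense for a continuous track, not for an unordered set of positions.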

Thanks.

EDIT: I now realise a more precise way of calculating the mean [lat, lon] position of a time series might be to convert to a cartesian grid. I may go down this route.

2 Answers


This is an application of directional statistics, where the angular mean is computed in the complex plane (see this section). Each angle is mapped to a unit vector exp(i * angle), and the vectors are averaged. The result is a complex number whose argument (phase) is the mean angle:

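import numpy as np

def angular_mean(angles_deg):
    """Return the circular mean of angles given in degrees, in (-180, 180]."""
    angles_rad = np.deg2rad(np.asarray(angles_deg, dtype=float))
    # Average the unit vectors exp(i * angle) in the complex plane.
    mean_c = np.mean(np.exp(1j * angles_rad))
    # The argument (phase) of the mean vector is the mean angle.
    return np.angle(mean_c, deg=True)

lons = [
    np.array([-175, -170, 170, 175]),  # broad distribution (straddles the dateline)
    np.random.rand(1000)               # narrow distribution in [0, 1) degrees
]

for lon in lons:
    print(angular_mean(lon), np.mean(lon))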

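As you can see, the arithmetic mean and the angular mean are quite similar for a narrow distribution, whereas they differ significantly for a broad distribution. For the first array, which straddles the dateline, the arithmetic mean is 0 while the angular mean is ±180.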

Using Cartesian coordinates is not appropriate, as the centre of mass will be located inside the Earth, whereas with surface data I assume you want the mean to lie on the surface.
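To illustrate the point about the centre of mass: the mean of unit vectors on the sphere has a length below 1 unless all points coincide. One possible remedy, sketched here as an assumption rather than a recommendation, is to keep only the direction of the mean vector. The function name surface_mean is hypothetical:

```python
import numpy as np

def surface_mean(lats_deg, lons_deg):
    """Direction of the mean unit vector, plus its length (hypothetical helper)."""
    lat = np.deg2rad(np.asarray(lats_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lons_deg, dtype=float))
    # Unit vectors for every point on the sphere, shape (3, N).
    xyz = np.stack([np.cos(lat) * np.cos(lon),
                    np.cos(lat) * np.sin(lon),
                    np.sin(lat)])
    m = xyz.mean(axis=1)
    # Length of the mean vector: 1 only if all points coincide.
    length = np.linalg.norm(m)
    # Using arctan2 on the components projects the direction back onto the surface.
    lat_mean = np.rad2deg(np.arctan2(m[2], np.hypot(m[0], m[1])))
    lon_mean = np.rad2deg(np.arctan2(m[1], m[0]))
    return lat_mean, lon_mean, length

print(surface_mean([10.0, 10.0], [179.0, -179.0]))
```

The returned length also gives a rough measure of spread: the closer it is to 1, the more tightly clustered the points are.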

sfinkens
  • Thanks for your solution. I actually did manage to work this out using a transformation to Cartesian coordinates to find the mean latitude and longitude. I don't need the z value, so this is not really a 3D centre of mass, just an average of x and y. I tested against your code and the results are identical or within around 1%. – InitialConditions Nov 21 '17 at 12:01
  • I was wrong regarding the transformation. Longitudes in [-180, 180] are just fine – sfinkens Nov 21 '17 at 12:58
  • Did the original code work though? It did for me. Are you just saying that the transformation to [0, 360] and then back again was unneeded? – InitialConditions Nov 21 '17 at 13:23
  • I think so, yes – sfinkens Nov 21 '17 at 13:26
  • Regarding the cartesian method: You can't reconstruct the latitude without the z component. lat ~ arcsin(z/R) – sfinkens Nov 21 '17 at 13:33
  • No you can't. But I can still get what I want i.e. the mean latitude and longitude of a set of [lat, lon] coordinates. See the second answer down, here: https://stackoverflow.com/questions/1185408/converting-from-longitude-latitude-to-cartesian-coordinates – InitialConditions Nov 21 '17 at 13:38

Here is my solution. Note that I calculate the mean latitude and longitude, but also the mean distance (mean_dist) of the [lat, lon] coordinates from the calculated mean latitude (lat_mean) and mean longitude (lon_mean). The reason is that I'm also interested in how much variation there is from the central [lat, lon]. I believe this is correct but I'm open to discussion!

import numpy as np
import geopy.distance

# The original snippet omitted the enclosing def; the name mean_position is a placeholder.
def mean_position(lats, lons):
    """Return (lat_mean, lon_mean, mean_dist) for coordinates in degrees."""
    lon_rad = np.deg2rad(lons)  # lons in degrees [-180, 180]
    lat_rad = np.deg2rad(lats)  # lats in degrees [-90, 90]
    R = 6371  # Approx radius of Earth (km)
    x = R * np.cos(lat_rad) * np.cos(lon_rad)
    y = R * np.cos(lat_rad) * np.sin(lon_rad)
    z = R * np.sin(lat_rad)

    x_mean = np.mean(x)
    y_mean = np.mean(y)
    z_mean = np.mean(z)

    # Note: the mean vector lies inside the sphere (norm <= R), so this is the
    # arcsin of the mean of sin(lat), not a projection onto the surface.
    lat_mean = np.rad2deg(np.arcsin(z_mean / R))
    lon_mean = np.rad2deg(np.arctan2(y_mean, x_mean))

    # Calculate distance from centre point for each [lat, lon] pair.
    # geopy.distance.vincenty was removed in geopy 2.0; geodesic replaces it.
    dist_list = np.array([
        geopy.distance.geodesic((lat, lon), (lat_mean, lon_mean)).km
        for lat, lon in zip(lats, lons)
    ])
    mean_dist = np.mean(dist_list)
    return lat_mean, lon_mean, mean_dist
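If the geopy dependency or the per-point distance computation above is a concern, the distances could instead be computed in one vectorised step with the haversine formula. This is only a sketch: it assumes a spherical Earth with the same approximate radius as above, so results will differ slightly from geopy's ellipsoidal distances. The function name haversine_km is hypothetical:

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # approximate mean radius, spherical assumption

def haversine_km(lats_deg, lons_deg, lat0_deg, lon0_deg, radius_km=EARTH_RADIUS_KM):
    """Great-circle distance (km) from each point to (lat0, lon0); hypothetical helper."""
    lat = np.deg2rad(np.asarray(lats_deg, dtype=float))
    lon = np.deg2rad(np.asarray(lons_deg, dtype=float))
    lat0 = np.deg2rad(lat0_deg)
    lon0 = np.deg2rad(lon0_deg)
    # Haversine of the central angle between each point and the reference.
    a = (np.sin((lat - lat0) / 2.0) ** 2
         + np.cos(lat) * np.cos(lat0) * np.sin((lon - lon0) / 2.0) ** 2)
    # Clip guards against rounding pushing a marginally outside [0, 1].
    return 2.0 * radius_km * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

# Two points one degree either side of the dateline, measured from lon = 180.
dists = haversine_km([0.0, 0.0], [179.0, -179.0], 0.0, 180.0)
print(dists, dists.mean())
```

The squared sine of the half-difference in longitude is periodic, so dateline crossings need no special handling here.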