2

I have a table like the following:

X | Y | Z | node
----------------
1 | 2 | 3 | 100
2 | 2 | 3 | 
2 | 2 | 4 | 
2 | 2 | 5 | 200
3 | 2 | 5 | 
4 | 2 | 5 | 
5 | 2 | 5 | 300

X, Y, Z are 3D space coordinates of some points, a curve passes through all the corresponding points from the first row to the last row. I need to calculate the curve length between two adjacent points whose "node" column aren't null.

If would be great if I can directly insert the result into another table that has three columns: "first_node", "second_node", "curve_length".

I don't need to interpolate extra points into the curve, just need to accumulate lengths all the straight lines, for example, in order to calculate the curve length between node 100 and 200, I need to sum the lengths of 3 straight lines: (1,2,3)<->(2,2,3), (2,2,3)<->(2,2,4), (2,2,4)<->(2,2,5)

EDIT The table has an ID column, which is in increasing order from the first row to the last row.

Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
user416983
  • 974
  • 3
  • 18
  • 28

1 Answers1

1

To get a previous value in SQL, use the lag window function, e.g.

SELECT 
  x, 
  lag(x) OVER (ORDER BY id) as prev_x, ...
FROM ...
ORDER BY id;

That lets you get the previous and next points in 3-D space for a given segment. From there you can trivially calculate the line segment length using regular geometric maths.

You'll now have the lengths of each segment (sqlfiddle query). You can use this as input into other queries, using SELECT ... FROM (SELECT ...) subqueries or a CTE (WITH ....) term.

It turns out to be pretty awkward to go from the node segment lengths to node-to-node lengths. You need to create a table that spans the null entries, using a recursive CTE or with a window function.

I landed up with this monstrosity:

SELECT
  array_agg(from_id) AS seg_ids,
  -- 'max' is used here like 'coalese' for an aggregate,
  -- since non-null is greater than null
  max(from_node) AS from_node,
  max(to_node) AS to_node,
  sum(seg_length) AS seg_length
FROM (
  -- lengths of all sub-segments with the null last segment
  -- removed and a partition counter added
  SELECT
   *,
   -- A running counter that increments when the
   -- node ID changes. Allows us to group by series
   -- of nodes in the outer query.
   sum(CASE WHEN from_node IS NULL THEN 0 ELSE 1 END) OVER (ORDER BY from_id) AS partition_id
  FROM
  (
    -- lengths of all sub-segments
    SELECT
      id AS from_id,
      lead(id, 1) OVER (ORDER BY id) AS to_id,
      -- length of sub-segment
      sqrt(
        (x - lead(x, 1) OVER (ORDER BY id)) ^ 2 +
        (y - lead(y, 1) OVER (ORDER BY id)) ^ 2 +
        (z - lead(z, 1) OVER (ORDER BY id)) ^ 2
      ) AS seg_length,
      node AS from_node,
      lead(node, 1) OVER (ORDER BY id) AS to_node
    FROM
      Table1
  ) sub
  -- filter out the last row
  WHERE to_id IS NOT NULL
) seglengths
-- Group into series of sub-segments between two nodes
GROUP BY partition_id;

Credit to How do I efficiently select the previous non-null value? for the partition trick.

Result:

 seg_ids | to_node | from_node | seg_length 
---------+---------+---------+------------
 {1,2,3} |     100 |     200 |          3
 {4,5,6} |     200 |     300 |          3
(2 rows)

To insert directly into another table, use INSERT INTO ... SELECT ....

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Craig Ringer
  • 307,061
  • 76
  • 688
  • 778
  • Thanks for the hint. It's not homework, the tables used in the project are too complicated, so I simplified the problem. I managed to get all the lengths of the line segments (with 3 nested "WITH" clause !), but I still haven't figured out how to sum the lengths of the line segments – user416983 Aug 26 '15 at 03:51
  • The actual problem is to calculate path lengths between bus stops, given the trajectory of the bus. The "node" column is actually ID of bus stops. – user416983 Aug 26 '15 at 04:04
  • @user416983 OK. Take a look at the edit. It's a hairy problem. This solution expects the `id` to be ordered the same as the line segments. If that's not the case you'll need to adjust the inner query appropriately. – Craig Ringer Aug 26 '15 at 04:21