4

Consider a big grid 10^5 * 10^5. Given about n = 10^8 data (x, y, v) meaning that the cell (x, y) contains the value v. All other cells contain 0. About q = 10^5 queries are asked giving an axis-parallel rectangle. The output should be sum of all values contained in the cells inside that rectangle. How to handle such queries? Any suitable data structure?

I can sort the data by x-coordinates, which would be good on average. But that would be O(nq) in the worst case. Any help?

Artur
  • 591
  • 3
  • 14

1 Answers1

2

You don't have modification queries, which means you don't necessarily have to use complicated data structures such as 2D binary trees (in fact 2D Fenwick Trees exist and are simple, but the memory consumption would be too big for this problem).

If your queries are offline, there is a sweep-line algorithm:

  • Sort both points and queries (both left and right sides of the rectangles, treated as "events") using x coordinates.
  • Maintain a data structure used for dynamic 1D range queries (e.g. a Fenwick Tree). Let's say you have two functions add(y, value) and sum(y1, y2) for that.
  • Store also a kind of "cumulative counter": A[i] for each rectangle.
  • For all events from left to right:
    • If it's a point (x, y, value): call add(y, value).
    • If it's a left corner of a rectangle with (y_down, y_up, ...) coordinates: A[i] := sum(ydown, yup).
    • If it's a right corner: the answer for the given query is sum(ydown, yup) - A[i].

Time complexity: O((n + q)(log(q) + log(n))).
Memory complexity: O(n + q).

md5
  • 23,373
  • 3
  • 44
  • 93