
In Caché ObjectScript (InterSystems' dialect of MUMPS), is there a way to efficiently skip to the approximate midpoint, or to some other proportional point, of a global's subscript key range? By "proportional" I mean measured by the number of records.

I want to divide the subscript key range into approximately equal chunks and then process each chunk in parallel.

Knowing that the keys in a global are arranged in a binary tree of some kind, this should be a simple operation for the underlying data storage engine but I'm not sure if there is an interface to do this.

I can do it by scanning the global's whole keyspace but that would defeat the purpose of trying to run the operation in parallel. A sequential scan takes hours on this global. I need the keyspace divided up BEFORE I begin scanning.

I want each thread to take an approximately equal-sized contiguous chunk of the keyspace to scan individually; the problem is calculating what key range to give each thread.
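For illustration, the intended division of work can be sketched as follows. This is Python rather than ObjectScript, and it assumes the boundary keys are already known, which is exactly the missing piece. The global is modelled as an in-memory sorted list, and `scan_range` and `scan_in_parallel` are hypothetical names.

```python
from bisect import bisect_left
from concurrent.futures import ThreadPoolExecutor


def scan_range(sorted_keys, start, end):
    """Scan the half-open key range [start, end); None means open-ended.

    Counting the keys stands in for the real per-record processing.
    """
    lo = 0 if start is None else bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if end is None else bisect_left(sorted_keys, end)
    return hi - lo


def scan_in_parallel(sorted_keys, ranges):
    """Give each contiguous key range to its own worker thread."""
    with ThreadPoolExecutor(max_workers=len(ranges)) as pool:
        futures = [pool.submit(scan_range, sorted_keys, s, e) for s, e in ranges]
        return [f.result() for f in futures]


keys = list(range(100))  # stand-in for the global's keys
ranges = [(None, 25), (25, 50), (50, 75), (75, None)]  # assumed known boundaries
counts = scan_in_parallel(keys, ranges)
```

Each worker touches only its own range, so no coordination is needed beyond handing out the boundaries.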

Chris Smith
  • Did you ever try, instead of N threads getting R records each, one thread getting N*R records each (and possibly threading at that point if it helps)? You aren't I/O bound, clearly, but my guess would be you are network latency bound or something-in-the-ODBC-client bound. 16 records is a pretty small chunk. – psr Jul 26 '12 at 21:23
  • I did 16 records at a time so I could use a single prepared statement with parameter placeholders (?s) in an IN clause. 16 seemed like enough to make up for the round trip overhead without being a pain to program. The problem with doing N at a time is that I'd either have to build a dynamic statement or find another way to upload the keys I want. – Chris Smith Jul 27 '12 at 13:27

2 Answers


You can use the second parameter, "direction" (1 or -1), of the $order or $query function.

DAiMor
  • I am aware of this option. Unfortunately, those functions still move only one key position forward or backward. You can't skip to the midpoint of a key range that way, which is what the question is asking for. – Chris Smith Jul 24 '12 at 13:35

For my particular need, I found that the application I'm using has what I would call an index global: another global, maintained by the app with different keys, that links back to the main table. I can scan that in a fraction of the time and break up the keyset from there.
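Once a cheap-to-scan list of keys exists, the split itself is simple. Here is a minimal sketch in Python rather than ObjectScript. It assumes the main-table keys recovered from the index scan are already in a list sorted in the same order the global uses (the zero-padded sample keys keep Python's string order unambiguous). `split_points` and `key_ranges` are hypothetical names, not part of any Caché API.

```python
def split_points(sorted_keys, n_chunks):
    """Pick up to n_chunks - 1 boundary keys that cut sorted_keys into
    contiguous chunks of roughly equal record count."""
    if n_chunks < 1:
        raise ValueError("n_chunks must be at least 1")
    total = len(sorted_keys)
    boundaries = []
    for i in range(1, n_chunks):
        pos = i * total // n_chunks  # index of the first key of chunk i
        # Skip empty leading chunks and repeated boundaries (few keys, many chunks).
        if 0 < pos < total and (not boundaries or sorted_keys[pos] != boundaries[-1]):
            boundaries.append(sorted_keys[pos])
    return boundaries


def key_ranges(boundaries):
    """Turn boundary keys into half-open (start, end) ranges.
    None marks an open end at the start or finish of the keyspace."""
    edges = [None] + list(boundaries) + [None]
    return list(zip(edges[:-1], edges[1:]))


keys = [f"K{n:04d}" for n in range(10)]  # stand-in for keys from the index scan
bounds = split_points(keys, 4)
ranges = key_ranges(bounds)
```

Each `(start, end)` pair can then be handed to one worker, which scans from `start` up to but not including `end`.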

If someone comes up with a way to do what I want given only the main global, I'll change the accepted answer to that.

Chris Smith