5

I have small vectors. Each of them is made of 10 integers which are between 0 and 15. This means that every element in a vector can be written using 4 bits. Hence I can concatenate my vector elements and store the whole vector in a single long type (in C, C++, java...)

Vector v1 dominates vector v2 if for each i in 0,...,9, v1[i] >= v2[i]

I want to write a method compare(long v1, long v2) that would return 0 if non of the vectors dominates the other, 1 if the first one dominates and -1 if the second one dominates.

Is there any efficient way to implement compare other than getting every i component and doing 10 times the normal integer comparison?

EDIT

if v1 is exactly the same as v2 returning 1 or -1 are both fine

Issam T.
  • 1,677
  • 1
  • 14
  • 32
  • If you can assume x86-only then SSE is probably the way to go - store your vectors as 16 x 8 bit ints and then it's pretty simple to implement the comparison. – Paul R Jun 15 '15 at 07:52
  • 4
    Are you sure you've defined your problem correctly? What should `compare(v, v)` return? I assume you want 0 for that, i.e. v1 only dominates v2 if at least one element is bigger? – duncan Jun 15 '15 at 09:04

2 Answers2

5

It's possible to do this using bit-manipulation. Space your values out so that each takes up 5 bits, with 4 bits for the value and an empty 0 in the most significant position as a kind of spacing bit.

Placing a spacing bit between each value stops borrows/carries from propagating between adjacent values and means you can do certain SIMD-like arithmetic operations on the vector just by using regular integer addition or subtraction. We can use subtraction to do a vector comparison.

To do the test you can set all the spacing bits to 1 in one of the vectors and then subtract the second one. If the value in the 4 bits below the spacing bit is greater in the second one then it will carry the bit from the spacing bit and set it to zero in the result, if not then it will remain a one (the first value is greater than or equal to the second). If the first vector dominates the second then all the spacing bits will be one after the subtraction.

Simple demonstration using ints:

#define SPACING_BITS ((1<<4)|(1<<9)|(1<<14)|(1<<19))
int createVector(int v0, int v1, int v2, int v3)
{
    return v0 | (v1 << 5) | (v2 << 10) | (v3 << 15);
}

int vectorDominates(int vectorA, int vectorB)
{
     // returns 1 if vectorA dominates vectorB:
     return (((vectorA | SPACING_BITS) - vectorB) & SPACING_BITS) == SPACING_BITS;
}

int compare(int vectorA, int vectorB)
{
    if(vectorDominates(vectorA, vectorB))
        return 1;
    else if(vectorDominates(vectorB, vectorA))
        return -1;
    return 0;
}

You can extend it to use 64 bit values using 50 bits to store the 10 values. You can also inline the calls to vectorDominates in the compare function.

Demo

samgak
  • 23,944
  • 4
  • 60
  • 82
  • 3
    In you dominates function, do you not need to AND with the spacing bits before comparison? Something like `return ((vectorA | SPACING_BITS) - vectorB) & SPACING_BITS == SPACING_BITS;` ? If I understand your (quite clever) logic, you're only looking to see if the spacing bits are unaffected by the subtraction. – TripeHound Jun 15 '15 at 09:10
  • @TripeHound yes that's absolutely correct. I was previously checking for zero but then flipped the comparison around to handle the "or equals to" and forgot to mask the results. Thanks! – samgak Jun 15 '15 at 09:12
  • Thinking further, if you stored `(((vectorA | SPACING_BITS) - vectorB) & SPACING_BITS)`, then wouldn't this be `SPACING_BITS` if `vectorA` dominates (as you have) or all zeroes if `vectorB` dominates (i.e. all "elements" of B have caused a "borrow"? You would thus only need one call to the comparison function? – TripeHound Jun 15 '15 at 09:16
  • 1
    vectorB can still dominate vectorA without causing a borrow (and therefore still having some ones), because the elements only need to be equal to or greater than, not necessarily greater than. So testing for all zeros isn't going to work unfortunately. – samgak Jun 15 '15 at 09:18
  • 1
    Actually, the problem seems ill-defined, since "_dominates_" is defined as all elements of one vector being greater-than-_or_equal-to_ the other: if all elements were equal, both would, by this definition, be dominant! – TripeHound Jun 15 '15 at 09:20
  • @TripeHound true, perhaps it would be better to return a bitfield with bit 0 set if a dominates b and bit 1 set if b dominates a. Then all 4 possibilities can be returned. – samgak Jun 15 '15 at 09:27
  • @samgak, TripeHound, thank you both for this instructive solution and discussion. I would like to just answer TripHound about the definition of the problem. In fact we want to keep a minimal set of "interesting" vectors. if two vectors are equal, declaring any of them as dominating the other is fine. – Issam T. Jun 15 '15 at 10:15
0

Well, in C you can likely leverage vectorization to do this. I don't think it's directly possible to compare on 4-bit operands, so you're going to have to re-pack (either on the fly or just keep your data in a more suitable format) up to 8-bit before doing the comparison. Since 10 * 8 = 80 which is more than 64, you're going to need 128-bit vector instructions.

Not sure if Java VMs support that yet, but this question suggests that JNI is the answer, i.e. call C code from Java.

Community
  • 1
  • 1
unwind
  • 391,730
  • 64
  • 469
  • 606