14

I'm using a circular buffer to push data onto either end of a list. After I'm done I want to align the buffer so the first element in the list is at position zero and can be used like a regular array without any fancy indexing overhead.

So I have my circular list with capacity N, it has n elements starting at arbitrary index f.

enter image description here

What's the fastest way to shift/rotate all the elements such that f = 0?

The catch is I want to do this in-place (though of course some registers/temporaries will be needed). The buffer may be full (n = N), [EDIT] but I'm also interested in efficiently handling the cases where it's nearly empty.

jozxyqk
  • 16,424
  • 12
  • 91
  • 180
  • Why are these your requirements? Is this buffer taking up so much of the available memory that you couldn't reasonably allocate two of them? Have you profiled your program with and without the rotation and found significant performance gains? – user2357112 Jan 31 '14 at 12:46
  • 1
    @user2357112 42, yes and yes. I have limited memory and I've found heavily reading a circular buffer is more expensive, but I'll know if its worth it after I've played around with these helpful answers. – jozxyqk Jan 31 '14 at 12:50
  • 3
    related: http://stackoverflow.com/q/4457277/951890 – Vaughn Cato Jan 31 '14 at 15:09
  • That's way simpler than the algorithm I was thinking of, and a lot more cache-friendly than the answers posted so far for this question. – user2357112 Jan 31 '14 at 15:18
  • 2
    @jozxyqk: A lot of the posts here rely on those grey elements in the middle being "alive"/"construtected". Are they? – Mooing Duck Jan 31 '14 at 22:05
  • @MooingDuck In my case, not always. I may have one element or it might be full, so yes unnecessary swaps may be done. – jozxyqk Feb 03 '14 at 05:13
  • @jozxyqk: I'm not sure you understand what I'm asking. Are you using "placement new" in this container, or do you even know what that means? – Mooing Duck Feb 03 '14 at 05:31
  • 1
    @MooingDuck aah I see, no my array is static. I'm actually running this on a GPU but didn't want to overcomplicate the question, which I think is interesting in and of itself. – jozxyqk Feb 03 '14 at 05:38

5 Answers5

9

This algorithm taken from the std::rotate implementation on cplusplus.com is quite nice:

template <class ForwardIterator>
  void rotate (ForwardIterator first, ForwardIterator middle,
               ForwardIterator last)
{
  ForwardIterator next = middle;
  while (first!=next)
  {
    swap (*first++,*next++);
    if (next==last) next=middle;
    else if (first==middle) middle=next;
  }
}

http://www.cplusplus.com/reference/algorithm/rotate/

Vaughn Cato
  • 63,448
  • 5
  • 82
  • 132
  • 2
    +1 Very nice :) It occurs to me that there is potentially a lot of empty slots in the buffer between the back and the front that don't need to be rotated. I've adapted this and added an early out as my answer. – Will Jan 31 '14 at 20:09
  • This is a pretty cool trick and very simple to implement. @Will, how would you go about an early-out, not copying the unused data? – jozxyqk Feb 10 '14 at 08:45
  • Off the top of my head, if the data doesn't wrap past the end of the array then the solution is trivial. If it does, shift the end bit next to the end of the tail and then use the above method. I have to assume there's a way to combine these steps into a single operation though. – jozxyqk Feb 10 '14 at 09:23
  • @jozxyqk if *f=0* is all you want, you can just break the loop after *n* iterations. This works even if it wraps past the end. But rotating to an arbitrary *f* is a bit more complicated. (*f* and *n* are the parameters used in your question) – osvein May 01 '18 at 19:31
3

I will first assume that n=N. You can put the element f to 0, then 0 will go to n-f, n-f to n-2f, etc. You will, after a finite number of steps, go back to 0, thanks to group theory. You will have gone to all items at most 1 time (except of course f, which is the beginning and the end).

Will you have gone to all items at least 1? Well, it depends if l=pgcd(N, f)==0. If l==1, you are done. If not, you have to do the same procedure l-1 times, starting from (f+1)...(f+l-1). This procedure only use one variable and do it in-place. Each loop is runned on exacty n/f items.

The following figure illustrate the algorithm, with n=N=12 and f=3. We have l=3 so we have three loops : marron, blue, green. The initial numbers are in black, the final are colored.

enter image description here

If n<N the same algorihm still works, because the empty region before f in the circular array will be reported to the empty region at the end of the linear array. See the following schema for an illustration of this trick. We have N=12, n=10 and f=3, so 1 and 2 are missing (ie they are in the grey region or the ring). l=gcd(3, 12)=3 here.

enter image description here

For cache-friendlyness you may want to invert the two loops. You will need l variables in that case.

In pseudo-Python

from fractions import gcd
l = gcd(N, f)
niter = N//l                   # integer because l = gcd(N, f)
temp = t[f:(f+l-1)]
pos = f
for i=1 to niter
    for j=0 to l-1
        pos -= 1
        pos = (pos + N) % N    # in [0...N-1]
        swap(temp[j],  t[pos])
MatthieuBizien
  • 1,647
  • 1
  • 10
  • 19
  • 2
    great explanation, thanks! gcd isn't the most trivial of things to compute but great to see how the operation unfolds. I quite like rsp's answer, checking for completion via the number of moves. – jozxyqk Feb 10 '14 at 08:43
2

In the destination buffer after rotation buffer position n gets the contens of position (n + f) % N. The tricky part is the fact that all sorts of sequences of replacement can occur. This can be handled by traversing these sequences until the original position occurs. Keeping track of how many replacements have been done allows the algorithm to stop in time.

Following test method acts on a char array as that is easiest to setup:

private char[] rotate(char[] buf, int start) {

    int len = buf.length;
    int count = 0;
    int offset = 0;

    while (count < len) {

        int index = offset;
        char tmp = buf[index];
        int index2 = (start + index) % len;

        while (index2 != offset) {

            buf[index] = buf[index2];
            count++;

            index = index2;
            index2 = (start + index) % len;
        }

        buf[index] = tmp;
        count++;

        offset++;
    }

    return buf;
}

The following tests succeed:

public void testRotate() {

    assertEquals("A",       rotate("A",         0));
    assertEquals("AB",      rotate("AB",        0));
    assertEquals("AB",      rotate("BA",        1));

    assertEquals("ABCD",        rotate("DABC",      1));
    assertEquals("ABCDE",       rotate("DEABC",     2));
    assertEquals("ABCDEF",      rotate("DEFABC",    3));
    assertEquals("ABCDEF1",     rotate("DEF1ABC",   4));
    assertEquals("ABCDEF12",    rotate("DEF12ABC",  5));
    assertEquals("ABCDEF123",   rotate("DEF123ABC",     6));
}

private String rotate(String buf, int start) {

    return new String(rotate(buf.toCharArray(), start));
}

Update:

The above algorithm rotates a full buffer, to optimise rotating buffers thats are not full you can pick out the quick wins and use a full rotation for what is left:

private char[] realign(char[] buf, int start, int items) {

    int len = buf.length;
    int offset = 0;

    if (0 == start) {

        // done

    } else if (items <= len - start) {

        // simply move to front
        while (offset < items) {
            buf[offset++] = buf[start++];
        }

    } else if (items * 2 <= len) {

        // move lead out of the way first
        int last = start;
        int end = items - len + start;

        while (0 < end) {
            buf[--last] = buf[--end];
        }

        while (offset < items && start < len) {

            buf[offset++] = buf[start++];
        }

        while (offset < items) {

            buf[offset++] = buf[last++];
        }

    } else {

        // use full rotate on the rest
        buf = rotate(buf, start);
    }

    return buf;
}

This will take care of most of the situations, those where the buffer if more than half full and where it wraps over the end of the buffer are being rotated in full. The following tests succeed:

public void testRealign() {

    assertEquals("A",       realign("A",        0, 1));
    assertEquals("AB",      realign("BA",       1, 2));

    assertEquals("ABCD",        realign("DABC",     1, 4));
    assertEquals("ABCDE",       realign("DEABC",    2, 5));
    assertEquals("ABCDEF",      realign("DEF123ABC",    6, 6));

    assertEquals("0123456789",  realign("4567890123", 6, 10));

    assertEquals("ABC",         realign("ABC",  0, 3));
    assertEquals("ABC",         realign("012ABC3", 3, 3));
    assertEquals("ABC",         realign("01234ABC", 5, 3));
    assertEquals("ABCD",        realign("D1234ABC", 5, 4));
    assertEquals("ABCD",        realign("CD1234AB", 6, 4));
    assertEquals("ABCD",        realign("BCD1234A", 7, 4));
}

private String realign(String buf, int start, int items) {

    return (new String(realign(buf.toCharArray(), start, items))).substring(0,  items);
}
rsp
  • 23,135
  • 6
  • 55
  • 69
  • Great implementation and easy to follow. Can you think of a nice way to ignore unused parts of the buffer? `if (index < n) buf[index] = ...` saves a main memory copy at the cost of a register compare but being able to skip this entirely would be great for the cases where the circular buffer is nearly empty. – jozxyqk Feb 10 '14 at 08:51
0
fn swapItems arr index_a index_b = (

    local item_a = arr[index_a]
    local item_b = arr[index_b]
    arr[index_a] = item_b
    arr[index_b] = item_a
),
fn rotateItems arr cnt way:#right = (

    for i=1 to cnt do ( --how many times we shift

        case way of (

            #right:(
                local next = 2
                for j=1 to arr.count-1 do ( --shift each except last

                    swapItems arr 1 next --swap first with next
                    next+=1
                )
            )
            #left:(
                local prev = arr.count - 1
                for j=arr.count to 2 by -1 do ( --shift each except first

                    swapItems arr arr.count prev --swap last with prev
                    prev-=1
                )
            )
        )
    )
)
ar = #(1, 2, 3, 4)
mcArray.rotateItems ar 1 > #(4, 1, 2, 3)
mcArray.rotateItems ar 2 > #(3, 4, 1, 2)
mcArray.rotateItems ar 1 way:#left > #(2, 3, 4, 1)
Orien
  • 47
  • 2
  • 7
-4

First of all you should not align the buffer. Looks like an unnecessary overhead.

The tastes way to implement this should be

1) Create a new array
2) Copy elements (std::copy) from n=0 to end of array
3) Copy begining of the array from f=0 to n=6
4) std::swap arrays

vlg789
  • 751
  • 6
  • 20
  • 2
    sorry, but I'm after an *in-place* algorithm. also I believe avoiding heavy modulo arithmetic is worthwhile – jozxyqk Jan 31 '14 at 12:25
  • @jozxyqk: Aligning is way more heavy than arithmetic. And most circular buffers use iterators to avoid doing modulo arithmetic. Does yours not? – Mooing Duck Jan 31 '14 at 22:07
  • 1
    @MooingDuck it's a choice of a mod in each access or a conditional on each increment. In the cases I don't need random access and iterator would probably be faster. What you say sounds reasonable, but I'd like to test it to be sure. – jozxyqk Feb 03 '14 at 05:30
  • 1
    @Mooing Duck, the overall cost entirely depends on the use case. – Sz. Nov 25 '15 at 00:13