9

Many years ago while working on a tight graphics I/O problem, Tom Duff unrolled a loop and created his Duff's Device as follows:

dsend(to, from, count)
char *to, *from;
int count;
{
    int n = (count + 7) / 8;
    switch (count % 8) {
    case 0: do { *to = *from++;
    case 7:      *to = *from++;
    case 6:      *to = *from++;
    case 5:      *to = *from++;
    case 4:      *to = *from++;
    case 3:      *to = *from++;
    case 2:      *to = *from++;
    case 1:      *to = *from++;
           } while (--n > 0);
    }
}

(Note this uses old style function parameters - that's not an error.)

This coding comes directly out of thinking in assembler and coding in C and is dependent on C's case statement fall-through. Can this kind of creativity in interlacing control structures work in any other languages?

dwitvliet
  • 7,242
  • 7
  • 36
  • 62
Shannon Nelson
  • 2,090
  • 14
  • 14

5 Answers5

6

You can do it in any language that supports computed GOTO statements (Fortran, some BASICs, etc.)

Dave
  • 10,369
  • 1
  • 38
  • 35
  • but if you can't dynamically compute the address that the GOTO should jump too, it get's so cumbersome that it's not really worth it anymore. So I don't think it'd be a good idea in dynamic languages like PHP... – Janus Troelsen Dec 27 '11 at 23:08
5

It works in C++.

Note though the code generated depends on your compiler. In particular, when I compiled Duff's device using GCC targeting ARM architectures, the resulting ARM assembler was sub-optimal (I think GCC turned it into a jump table) at any optimization level.

I ended up just handing coding the assembler.

tpdi
  • 34,554
  • 11
  • 80
  • 120
  • 4
    Yes, it was good when compilers didn't optimise much (which is when Duff came up with it). The problem is that at every step there's both the drop-through flow and the 'case' label, and it takes a very good compiler to work out that it doesn't need to flush registers etc. And a compiler that good will probably be able to unroll the loop of the naive implementation anyway. Still, makes a good interview question :) – The Archetypal Paul May 03 '09 at 18:19
3

Duff's device is essentially a computed goto which can be done in many other languages - assembly (of course) and FORTRAN being a couple that support them directly.

Michael Burr
  • 333,147
  • 50
  • 533
  • 760
3

I used it very successfully in JavaScript to speed up large array processing. I wish I could use it in C#.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Gary
  • 1,847
  • 2
  • 14
  • 22
  • 2
    You can. There's no need for the cleanup case to be the same as the loop. Do the unrolled loop n/8 times, then 7 conditional steps at the end. Given it's for large arrays, any minor inefficiency in the last 7 is lost in the noise. – The Archetypal Paul May 03 '09 at 18:16
  • 3
    Paul, if the cleanup case isn't the same as the loop, then I don't think it can rightly be called Duff's device anymore. Duff's device isn't just about loop unrolling. It's about using a `switch` statement to jump into the middle of an unrolled loop. – Rob Kennedy Jun 03 '11 at 20:54
0

Even if it cannot be use this way you may still have two loops:

dsend(to, from, count)
char *to, *from;
int count;
{
    int n;
    for(n=0; n!=count/8; n+=8){
        *to = *from++;
        *to = *from++;
        *to = *from++;
        *to = *from++;
        *to = *from++;
        *to = *from++;
        *to = *from++;
        *to = *from++;
    }
    for(; n!=count; n++)
    {
        *to = *from++;
    }
}

Sure this will be somewhat slower with smaller count but it is somewhat more readable, somewhat more portable across languages and produces very similar benefir with large count.

Euri Pinhollow
  • 332
  • 2
  • 17