compsci: Duff's device for loop unrolling
Loop unrolling is a performance optimisation technique that aims to reduce comparisons between executions of a loop body.
Duff's device is a beautiful/surprising technique for implementing that in C, relying on the ablity to intertwine case
statements inside a do while
loop:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch (count % 8) {
case 0: do { *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
} while (--n > 0);
}
}
The idea is to handle both an initial <N
cases for the loop unrolling with the same structure that handles the per N
unrolled loop iterations.
The partially separated first case and unrolled loop version makes it clearer how this magic is working:
send(to, from, count)
register short *to, *from;
register count;
{
register n = (count + 7) / 8;
switch (count % 8) {
case 0: *to = *from++;
case 7: *to = *from++;
case 6: *to = *from++;
case 5: *to = *from++;
case 4: *to = *from++;
case 3: *to = *from++;
case 2: *to = *from++;
case 1: *to = *from++;
}
while (--n > 0) {
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
*to = *from++;
}
}
Published on: 31 Jul 2023