When the trip count is known up front
I still remember the first max-scan that fooled me. The count was no secret, n sat right there, and MWCC noticed. Watch what it did. n - 1 went straight into the count register via mtctr, and from then on bdnz ("decrement CTR, branch if non-zero") carried the loop. Search the body all you like, you won't find a counter compare.
addi r0, r4, -1 # trip count = n - 1
addi r5, r3, 4 # pointer past first element
lwz r3, 0(r3) # load seed value
mtctr r0 # CTR = n - 1
cmpwi r4, 1
blelr- # early return when nothing to scan
body:
lwz r0, 0(r5) # load candidate
cmpw r0, r3 # compare against running result
ble- skip
mr r3, r0 # conditional update
skip:
addi r5, r5, 4 # advance pointer
bdnz+ body # CTR--, loop while non-zero
blr
What fooled me was expecting an unrolled body. Nope. My running max only budged when a candidate beat it, and MWCC won't unroll across a dependency like that, so the loop stayed put even at -O4,p.
So, two souvenirs. mtctr cuddled up next to bdnz? Trip count was known before the loop ever ran. blelr-? Compare-and-return, fused, the early bail when there's nothing to scan. Keep that mtctr/bdnz couple in view, it walks back on stage in the break lesson driving a loop that bails early too.
Your task
Write amax, returning the largest of the n elements of a (assume n >= 1).