Paired-Single FPR Saves in the Prologue

Advanced Idioms

paired-singles, psq_st, callee-save

Why a float function stores 64 + 64 bits per register

FPRs f14–f31 are callee-saved on the Gekko. If a function holds float values live across a call, it has to stash them somewhere safe and put them back before returning. The odd part is what MWCC does to save one. It writes the register out twice.

stwu   r1, -96(r1)
mflr   r0
stw    r0, 100(r1)
stfd   f31, 80(r1)        # save the 64-bit double view of f31...
psq_st f31, 88(r1), 0, 0  # ...AND the paired-single (two 32-bit) view
stfd   f30, 64(r1)
psq_st f30, 72(r1), 0, 0
...

Why twice? A Gekko FPR can hold two packed 32-bit floats (a paired single), not just a double. An stfd (store floating-point double) only preserves the 64-bit double view, so on its own it would quietly drop the second paired-single half. MWCC covers itself by following every stfd with a psq_st (paired-single quantized store), which writes both 32-bit halves. The epilogue runs the same pairing backwards, psq_l then lfd, restoring the highest-numbered register first.
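The offsets in the listing follow a regular pattern: each saved register takes 16 bytes at the top of the 96-byte frame, with the stfd slot first and the psq_st slot 8 bytes above it. The sketch below just restates that arithmetic in C. The names FprSaveSlot and fpr_save_slot are hypothetical, and the assumption that the pattern continues unchanged for f29 and below is an extrapolation from the two registers shown, not something the listing confirms.

```c
/* Hypothetical helper: byte offsets (from r1) of the two save slots
 * that the prologue uses for one callee-saved FPR. */
typedef struct {
    int stfd_offset;   /* where stfd writes the 64-bit double view */
    int psq_st_offset; /* where psq_st writes the two 32-bit halves */
} FprSaveSlot;

/* Assumption: saved FPRs are packed downward from the top of the frame,
 * 16 bytes each, f31 highest, as in the listing (frame_size = 96). */
static FprSaveSlot fpr_save_slot(int frame_size, int fpr)
{
    int index = 31 - fpr; /* f31 -> 0, f30 -> 1, ... */
    FprSaveSlot slot;
    slot.stfd_offset = frame_size - 16 * (index + 1);
    slot.psq_st_offset = slot.stfd_offset + 8;
    return slot;
}
```

With frame_size = 96 this reproduces the listing: f31 lands at 80 and 88, f30 at 64 and 72. When reading an unfamiliar prologue, the same arithmetic run in reverse gives a quick count of how many FPRs the function saves.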

So a prologue that marches stfd/psq_st down through f31, f30, f29... isn't doing anything exotic. Those are callee-saved float registers, saved because the function keeps that many float values live across calls. Write C that keeps enough f32 values alive across calls and the saves appear on their own.
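As a smaller illustration of the shape of C that creates this pressure, here is a minimal sketch with three values instead of six. The names scale and sum_scaled are hypothetical and are not part of the exercise. On the real target the callee would need to be an external, non-inlined function, and whether MWCC actually saves FPRs for code like this depends on the compiler version and flags.

```c
typedef float f32;

/* Hypothetical callee. It is defined here only so the sketch compiles on
 * its own; on the target it is assumed to be an opaque external call. */
f32 scale(f32 x)
{
    return x * 2.0f;
}

f32 sum_scaled(const f32 *p)
{
    f32 a = scale(p[0]); /* a must survive the next two calls */
    f32 b = scale(p[1]); /* b must survive the next call */
    f32 c = scale(p[2]); /* c is used right away */
    return a * b + c;    /* a multiply-add over the saved values */
}
```

Because a and b have to survive later calls, they cannot stay in volatile registers, which makes them natural candidates for callee-saved FPRs. Adding more locals that stay live across calls extends the same pattern.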

Your task

Write mix(f32 *p): call the provided transform on p[0]..p[5] and store the results in six locals (avoid the name f so it isn't confused with the f32 type), then combine those locals into a return value. Holding six float results live across the six calls forces several callee-saved FPRs into use, so watch the stfd/psq_st pairs appear in the prologue. Use the fmadds/fmuls/fadds instructions at the end of the function body, just before the epilogue, to reconstruct which products are added together.

Hints

match mix
mwcceppc.exe -O4,p


Hit “Compile & Check” to diff your code against the target.