A sum of two products
A sum of two products (the heart of a 2D dot product, a determinant, a complex multiply) reads in source as two multiplies and an add. The compiler here emits only one fmuls, though, because the second product folds into the add via fmadds, a fused multiply-add.
Consider cross(p, q, r, s) computing p*r + q*s:
fmuls f0, f2, f4 # f0 = q * s (one product, standalone)
fmadds f1, f1, f3, f0 # f1 = p*r + f0 = p*r + q*s
blr
The four float arguments arrive in f1–f4, so p, q, r, s sit in f1, f2, f3, f4. The compiler chose to compute q*s first with fmuls, then fold p*r and the sum into one fmadds (recall that fmadds fD, fA, fC, fB computes fD = fA*fC + fB). Which product becomes the standalone fmuls and which rides in the fmadds is the compiler's choice. What matters is reading the operand registers to see which two arguments pair up in each multiply.
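To check that reading, the two instructions can be traced by hand, one C statement per instruction. This is only a sketch under stated assumptions: f32 is assumed to be a typedef for float, trace_cross is a hypothetical helper name, and plain C arithmetic does not model the single rounding that a hardware fmadds performs.

```c
typedef float f32; /* assumption: f32 is an alias for a 32-bit float */

/* Hypothetical helper: mirrors the listing register by register. */
f32 trace_cross(f32 p, f32 q, f32 r, f32 s) {
    /* The four float arguments arrive in f1..f4. */
    f32 f1 = p, f2 = q, f3 = r, f4 = s;
    f32 f0;

    f0 = f2 * f4;      /* fmuls  f0, f2, f4     -> f0 = q * s            */
    f1 = f1 * f3 + f0; /* fmadds f1, f1, f3, f0 -> f1 = p * r + q * s    */

    return f1;         /* blr: the float result is returned in f1       */
}
```

For example, trace_cross(1.0f, 2.0f, 3.0f, 4.0f) puts 2*4 = 8 in f0, then 1*3 + 8 = 11 in f1.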
The target assembly has the same fmuls → fmadds skeleton. Decode the operand registers to recover the two products and confirm they're added together.
Your task
Write dot2, taking four f32s, to reproduce the assembly above.