What -O4,p Actually Does

Optimization & Scheduling

optimization-level, scheduling, mental-model

The compiler that fights back

Everything on this site is built at -O4,p, which is about as aggressive as MWCC GC/2.0 gets. The 4 is the optimization level. That alone buys you full inlining, common-subexpression elimination, strength reduction, and a pile of loop work. The ,p is the part that bites here. It tells the compiler to optimize for speed rather than size (the size-first counterpart is ,s), which on this target means scheduling instructions around the Gekko's execution units instead of just packing them small. Coming from GCC? The ,p is an MWCC quirk, and there is no comma-suffix like it on GCC's -O flags.

So the rules change completely. Under -O0 the assembly tracks your C line for line. Under -O4,p the compiler reads your code as intent rather than layout, free to reorder, fuse, delete, and rematerialize instructions however it likes, as long as the result you can observe stays identical. Two consequences fall out of that.

The instruction order in the binary often won't match your source order at all. Independent work gets hoisted upward to hide latency.
Byte-matching isn't about wrestling the optimizer down. You feed it C whose optimized shape already equals the target, and that knack is what this whole chapter drills.
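Here is a minimal sketch of that freedom in plain C. The function names are hypothetical, and the expression is deliberately not the one from this lesson's target. Both functions return the same value for every input, so an optimizer is allowed to treat the first as if it had been written like the second. Whether a given compiler actually does so is something to check in its output rather than assume.

```c
/* Hypothetical helpers for illustration only; not the lesson's target. */

/* Source order: finish the first difference, then start the second. */
int diff_xor_source_order(const int *p)
{
    int a = p[0] - p[1];
    int b = p[2] - p[3];
    return a ^ b;
}

/* Same computation with every load written first, the way a
   scheduler might arrange the work. */
int diff_xor_loads_first(const int *p)
{
    int w = p[0];
    int x = p[1];
    int y = p[2];
    int z = p[3];
    return (w - x) ^ (y - z);
}
```

Nothing a caller can observe distinguishes the two, which is exactly the condition under which reordering is legal.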

Below, two independent load-and-add pairs feed a single dependent multiply.

lwz   r6, 0(r3)    # all four loads hoisted to the top...
lwz   r5, 4(r3)
lwz   r4, 8(r3)
lwz   r0, 12(r3)
add   r3, r6, r5   # ...then the two adds...
add   r0, r4, r0
mullw r3, r3, r0   # ...then the dependent multiply
blr

Look at how all four lwz got pulled to the front, even though the natural source finishes computing its first sum before it loads anything for the second. That reshuffle is ,p scheduling in action, and the next lesson takes it apart.
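To get a feel for why loads-first can be faster, here is a toy cost model in C. Everything in it is an assumption made for illustration: a single-issue, in-order machine, a load latency of 2 cycles, and 1 cycle for add and multiply. These are not the Gekko's real timings, and the type and function names are hypothetical. Virtual registers 0 to 3 hold the loaded words, 4 and 5 the sums, 6 the product.

```c
#include <stddef.h>

/* One instruction in the toy model: destination register, two source
   registers (-1 means "no source"), and cycles until the result is usable. */
typedef struct {
    int dst, src1, src2, latency;
} ToyInsn;

enum { TOY_REGS = 8, TOY_LOAD = 2, TOY_ALU = 1 }; /* assumed latencies */

/* Returns the cycle at which the last result becomes ready. One
   instruction issues per cycle, in order, stalling until its sources
   are ready. */
int toy_cycles(const ToyInsn *code, size_t n)
{
    int ready[TOY_REGS] = {0};
    int next_issue = 0;
    int done = 0;
    for (size_t i = 0; i < n; i++) {
        int issue = next_issue;
        if (code[i].src1 >= 0 && ready[code[i].src1] > issue)
            issue = ready[code[i].src1];
        if (code[i].src2 >= 0 && ready[code[i].src2] > issue)
            issue = ready[code[i].src2];
        ready[code[i].dst] = issue + code[i].latency;
        if (ready[code[i].dst] > done)
            done = ready[code[i].dst];
        next_issue = issue + 1;
    }
    return done;
}

/* Load, load, add, load, load, add, multiply. */
const ToyInsn toy_source_order[7] = {
    {0, -1, -1, TOY_LOAD}, {1, -1, -1, TOY_LOAD}, {4, 0, 1, TOY_ALU},
    {2, -1, -1, TOY_LOAD}, {3, -1, -1, TOY_LOAD}, {5, 2, 3, TOY_ALU},
    {6, 4, 5, TOY_ALU},
};

/* All four loads first, then both adds, then the multiply. */
const ToyInsn toy_loads_first[7] = {
    {0, -1, -1, TOY_LOAD}, {1, -1, -1, TOY_LOAD},
    {2, -1, -1, TOY_LOAD}, {3, -1, -1, TOY_LOAD},
    {4, 0, 1, TOY_ALU}, {5, 2, 3, TOY_ALU}, {6, 4, 5, TOY_ALU},
};
```

Under these assumed latencies, toy_cycles(toy_source_order, 7) returns 9 and toy_cycles(toy_loads_first, 7) returns 7. In the loads-first order, the later loads issue while the earlier ones are still in flight, so each add spends less time stalled. The absolute numbers mean nothing for real hardware; only the direction of the difference is the point.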

Your task

Write combine(int *p), which reads four consecutive integers through the pointer p. Work out from the assembly which array slots feed each add and how the two sums are combined, then write the natural C and let -O4,p schedule it.


match combine | mwcceppc.exe -O4,p


Hit “Compile & Check” to diff your code against the target.