Filling the top bits with the sign
Read a signed narrow value into a 32-bit register and zeroing the top bits won't do, since a negative s8 such as -1 (0xFF) has to widen into 0xFFFFFFFF rather than 0x000000FF. Halfwords get their own instruction for that, lha (load halfword algebraic), which fills the high bits with the sign as it reads:
lha r3, 0(r3) # halfword, sign-extended into r3
blr
The byte case is the awkward exception, because there is no "load byte algebraic" to match it. A signed byte that needs widening comes in on lbz and then gets a separate extsb (extend sign byte) to clean it up, a pairing you'll spot any time an s8 is widened to an int:
lbz r3, 0(r3)
extsb r3, r3 # sign-extend the byte to 32 bits
blr
Your task
Write load_s16, taking an s16* and returning the pointed-to value widened to int. Because the result is used as a 32-bit int, the load must sign-extend.