This is the other intentionally parallelized solution I mentioned in SIGNAL AMPLIFIER. In my first attempt, the unit that reads the command from IN.S did all the logic and math, and it ran in 294 cycles.
This second solution hands the decision-making to the two units that read the inputs IN.A and IN.B; each passes down either its input value or 0 as appropriate, and the middle unit in the middle row adds the two together and passes the result out. This runs in only 204 cycles, 90 fewer than the first version (a reduction of about 31%).
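
To make the split concrete, here is a rough Python sketch of the dataflow, not the actual node code. It assumes the selector convention that IN.S = -1 selects A, 1 selects B, and 0 selects A + B, and that each input unit somehow gets to see the selector value; the function names are made up for illustration.

```python
def gate_a(a: int, s: int) -> int:
    """Unit reading IN.A: forward A when the selector is -1 or 0 (assumed convention)."""
    return a if s <= 0 else 0


def gate_b(b: int, s: int) -> int:
    """Unit reading IN.B: forward B when the selector is 0 or 1 (assumed convention)."""
    return b if s >= 0 else 0


def middle_unit(from_a: int, from_b: int) -> int:
    """Middle unit: no branching at all, it only adds the two gated values."""
    return from_a + from_b


def multiplex(a: int, s: int, b: int) -> int:
    # The two gates are independent of each other, which is what lets
    # the real nodes do their decision-making in parallel.
    return middle_unit(gate_a(a, s), gate_b(b, s))


if __name__ == "__main__":
    for s in (-1, 0, 1):
        print(s, multiplex(5, s, 7))
```

Because each gate only needs its own input and the selector, neither has to wait on the other, and the adder never has to branch. That is presumably where the saved cycles come from.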