http://graphics.stanford.edu/projec...fe.12-10-03.ppt
Slide:
Pentium 4 SSE
theoretical*
3GHz * 4 wide * .5 inst / cycle = 6 GFLOPS
(*from Intel P4 Optimization Manual)
GeForce FX 5900 (NV35) fragment shader
obtained:
MULR R0, R0, R0:
20 GFLOPS
equivalent to a 10 GHz P4
and getting faster: 3x improvement over NV30 (6 months)
----
So that's how it's calculated.
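For anyone who wants to check the slide's numbers, here is a rough sketch of the arithmetic. The function names are my own, and it assumes "equivalent to a 10 GHz P4" means dividing the measured 20 GFLOPS by the P4's per-clock throughput (4 wide * .5 inst / cycle):

```python
# Reproduces the arithmetic on the slide. Function names are illustrative,
# not taken from the slide or from any vendor documentation.

def peak_gflops(clock_ghz, simd_width, inst_per_cycle):
    """Theoretical peak = clock (GHz) * SIMD width * instructions per cycle."""
    return clock_ghz * simd_width * inst_per_cycle

def equivalent_clock_ghz(gflops, simd_width, inst_per_cycle):
    """Clock a CPU with this per-cycle throughput would need to reach `gflops`."""
    return gflops / (simd_width * inst_per_cycle)

# Pentium 4 SSE: 3 GHz, 4-wide, 0.5 instructions per cycle
p4_peak = peak_gflops(3.0, 4, 0.5)

# Clock a P4 would need to match the 20 GFLOPS measured on the NV35
p4_equiv = equivalent_clock_ghz(20.0, 4, 0.5)

print(p4_peak, "GFLOPS")    # 6.0 GFLOPS
print(p4_equiv, "GHz")      # 10.0 GHz
```

Keep in mind the P4 figure is a theoretical peak while the NV35 figure is a measured one, so this is not quite an apples-to-apples comparison.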
That said, it reminds me of 白算盤's Single Instruction computer....
An OS written using nothing but the MULR instruction would probably be pretty sweet!?
Hmm, I'm overthinking it. XD