  #24  
02-19-2006, 09:01
Maximus
The Pentium family breaks operations up into several micro-ops (e.g. load/store/ALU etc.), whereas AMD uses a 'reduced' microcode set. The microcode is (as far as I know) hardwired on the die.
The real differences show up when one processor needs to execute 4 micro-ops where another needs only 2...
The Pentium mobile (PM) is based more on the P3 design than on the P4 in this respect, as it uses bigger hardwired micro-ops.

The P1 wasn't really 1 op per cycle. An op usually took 6 cycles, and it was said to execute in one cycle only because the ops were fed through the pipeline back to back:
Code:
xxxxx1
 xxxxx1
  xxxxx1
12345|||
So, once the operation had entered the pipe, and as long as there was no AGI, no read-after-write dependency and no other stall condition (e.g. bad U/V pairing), the op was executed in '1 cycle'.
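To put rough numbers on the diagram above, here is a minimal sketch of an ideal in-order pipeline. The function name `pipeline_cycles`, the `stall_cycles` parameter and the default depth of 6 (taken from the figure quoted in this post) are my own assumptions for illustration, not a model of any real CPU:

```python
def pipeline_cycles(num_ops, depth=6, stall_cycles=0):
    """Total cycles to complete num_ops in an idealized in-order pipeline.

    depth        -- latency of a single op in cycles (assumed 6, as in the post)
    stall_cycles -- total bubbles added by AGI stalls, dependencies, etc.
    """
    if num_ops == 0:
        return 0
    # The first op pays the full latency; every later op completes one
    # cycle after the previous one, unless stalls insert extra bubbles.
    return depth + (num_ops - 1) + stall_cycles


for n in (1, 10, 1000):
    total = pipeline_cycles(n)
    print(n, "ops:", total, "cycles,", total / n, "cycles per op")
```

With one op the cost is the full 6 cycles; with 1000 ops and no stalls the average drops to about 1 cycle per op, which is where the '1 cycle' claim comes from. Each stall simply adds its bubbles to the total.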
AMD is faster than the Px family for this reason, but it is more 'sensitive' to first-level cache misses because of its shorter pipeline.
I noticed that the PM also uses a larger L1 cache for this, so I can only (wildly) guess that the time lost accessing L2 is much bigger (in %) with a shorter pipe than with a longer one.
Does anyone know more on the subject?

Last edited by Maximus; 02-19-2006 at 09:06.