Mike Schmit's Top Ten Rules for Pairing Pentium Instructions

1. Both instructions must be simple.

2. Shifts or rotates can only pair in the U pipe.
   (SHL, SHR, SAL, SAR, ROL, ROR, RCL or RCR)

3. ADC and SBB can only pair in the U pipe.

4. JMP, CALL and Jcc can only pair in the V pipe. (Jcc = jump on condition code).

5. Neither instruction can contain BOTH a displacement and an immediate operand. For example:

mov     [bx+2], 3  ; 2 is a displacement, 3 is immediate
mov     mem1, 4    ; mem1 is a displacement, 4 is immediate

6. Prefixed instructions can only pair in the U pipe. This includes extended instructions that start with 0Fh except for the special case of the 16-bit conditional jumps of the 386 and above. Examples of prefixed instructions:

mov     ES:[bx], ax ; segment override prefix (AX assumed; operand missing in source)
mov     eax, [si]  ; 32-bit operand in 16-bit code segment
mov     ax, [esi]  ; 16-bit operand in 32-bit code segment

7. Unless the U pipe instruction is only 1 byte long, the two instructions will not pair until the second time they execute from the cache.

8. There can be no read-after-write or write-after-write register dependencies between the instructions except for special cases for the flags register and the stack pointer (rules 9 and 10).

mov     ebx, 2   ; writes to EBX
add     ecx, ebx ; reads EBX and ECX, writes to ECX
                 ; EBX is read after being written, no pairing

mov     ebx, 1   ; writes to EBX
mov     ebx, 2   ; writes to EBX
                 ; write after write, no pairing

9. The flags register exception allows an ALU instruction to be paired with a Jcc even though the ALU instruction writes the flags and Jcc reads the flags. For example:

cmp     al, 0    ; CMP modifies the flags
je      addr     ; JE reads the flags, but pairs

dec     cx       ; DEC modifies the flags
jnz     loop1    ; JNZ reads the flags, but pairs

10. The stack pointer exception allows two PUSHes or two POPs to be paired even though both instructions read and write the SP (or ESP) register.

push    eax      ; ESP is read and modified
push    ebx      ; ESP is read and modified, but still pairs 
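
As an illustration of how rules 2 and 8 interact, the sketch below reorders a short sequence so that more of its instructions pair. It is not part of the original list: the registers and constants are arbitrary, and the code is assumed to be 32-bit, already in the cache, and free of other stalls. The pipe assignments in the comments follow only from the ten rules above.

```asm
; Before: only the last two instructions pair
mov     eax, 1     ; U pipe; issues alone, the next instruction reads EAX (rule 8)
mov     ebx, eax   ; U pipe; issues alone, SHL cannot go in the V pipe (rule 2)
shl     ecx, 1     ; U pipe
mov     edx, 2     ; V pipe; pairs with SHL

; After: same register and flag results, two pairs
shl     ecx, 1     ; U pipe
mov     eax, 1     ; V pipe; pairs with SHL
mov     ebx, eax   ; U pipe
mov     edx, 2     ; V pipe; pairs with the MOV above
```

Moving the shift to the front puts it in the U pipe and separates the write to EAX from the instruction that reads it.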

Simple Instructions (for Pentium pairing)

The following is a list of simple instructions, as required by rule #1 above.

Instruction format 16-bit example     32-bit example
------------------------------------------------------------
MOV reg, reg       mov ax, bx         mov eax, edx
MOV reg, mem       mov ax, [bx]       mov eax, [edx]
MOV reg, imm       mov ax, 1          mov eax, 1
MOV mem, reg       mov [bx], ax       mov [edx], eax
MOV mem, imm       mov [bx], 1        mov [edx], 1
alu reg, reg       add ax, bx         cmp eax, edx
alu reg, mem       add ax, [bx]       cmp eax, [edx]
alu reg, imm       add ax, 1          cmp eax, 1
alu mem, reg       add [bx], ax       cmp [edx], eax
alu mem, imm       add [bx], 1        cmp [edx], 1

where alu = add, adc, and, or, xor, sub, sbb, cmp, test

INC  reg           inc  ax            inc  eax
INC  mem           inc  var1          inc  [eax]
DEC  reg           dec  bx            dec  ebx
DEC  mem           dec  [bx]          dec  var2
PUSH reg           push ax            push eax
POP  reg           pop  ax            pop  eax
LEA  reg, mem      lea  ax, [si+2]    lea  eax, [eax+4*esi+8]
JMP  near          jmp  label         jmp  label2
CALL near          call proc          call proc2
Jcc  near          jz   lbl           jnz  lbl2

where Jcc = ja, jae, jb, jbe, jg, jge, jl, jle, je, jne, jc, js,
            jnp, jo, jp, jnbe, jnb, jnae, jna, jnle, jnl, jnge,
            jng, jz, jnz, jnc, jns, jpo, jno, jpe

NOP                nop                nop
shift reg, 1       shl  ax, 1         rcl  eax, 1
shift mem, 1       shr  [bx], 1       rcr  [ebx], 1
shift reg, imm     sal  ax, 2         rol  esi, 2
shift mem, imm     sar  [bx], 15      ror  [esi], 31

where shift = shl, shr, sal, sar, rcl, rcr, rol, ror
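
Instructions that do not appear in the list are not simple, so by rule 1 they cannot pair. As a hedged sketch, LOOP is not in the list, while DEC and Jcc are, so a counted loop can be rewritten to end in a pairable DEC/JNZ (rule 9). The labels and the elided loop bodies are placeholders, and 32-bit code is assumed. The two forms are not exactly equivalent: DEC modifies the flags and LOOP does not.

```asm
; Form 1: LOOP is not a simple instruction, so it cannot pair
next1:
        ; ... loop body ...
        loop    next1        ; decrements ECX, jumps if ECX is not zero

; Form 2: DEC and JNZ are both simple and can pair (rule 9)
next2:
        ; ... loop body ...
        dec     ecx          ; writes the flags
        jnz     next2        ; reads the flags, but can still pair with DEC
```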

Notes:
