X86lite Documentation

X86lite is a 32-bit, signed-integer-only subset of the Intel IA-32 machine architecture. The X86lite instruction set is tiny compared to full X86, yet it still provides a sufficient compilation target for the CIS341 course compiler projects.

This document explains the X86lite machine model and its instruction set, and is intended for use as a reference manual. Information about the full X86 architecture can be found on the Intel web pages. The CIS341 course infrastructure provides OCaml interfaces for manipulating X86lite programs and tools for assembling X86lite programs into executables.

The X86lite machine state consists of eight 32-bit registers, three condition flags, and a memory consisting of 2^32 bytes. There is also a program counter register EIP that holds a pointer to the current instruction. EIP can only be manipulated indirectly through control-flow instructions like Jmp.

The eight 32-bit registers in X86lite and their common uses in the full X86 architecture are given below. In X86lite, most of the registers can be used for general purpose calculation, but some X86lite instructions make special use of some of the registers; see the instruction descriptions below.

Register Description / common use on X86
EAX General purpose accumulator
EBX Base register, pointer to data
ECX Counter register for strings and loops
EDX Data register for I/O
ESI Pointer register, string source register
EDI Pointer register, string destination register
EBP Base pointer, points to the stack frame
ESP Stack pointer, points to the value at the top of the stack

The X86 architecture provides conditional branch and conditional move instructions. The processor maintains a set of bit-sized flags to keep track of conditions arising from arithmetic and comparison operations. These condition flags are tested by the conditional jump and move instructions; the flags are set by the arithmetic instructions. X86lite provides only three condition flags (the full X86 architecture has several more).

Condition Flag Description
OF Overflow: set when the result is too big or too small to fit in a 32-bit value and cleared otherwise. This is overflow/underflow for signed (two's complement) 32-bit arithmetic.
SF Sign: equal to the most significant bit of the result (0=positive, 1=negative)
ZF Zero: set if the result is 0 and cleared otherwise.

The X86lite memory consists of 2^32 bytes numbered 0x00000000 through 0xffffffff. X86lite treats the memory as consisting of 32-bit (4-byte) words. As a consequence, X86lite memory locations are addressed by 32-bit, word-aligned pointers: every valid memory address is evenly divisible by 4.

By convention on X86 machines, the program stack starts at the high addresses of virtual memory and grows toward the low addresses. The register ESP points to the value at the top of the stack. Push decrements ESP and Pop increments it, as needed to maintain this invariant.
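The machine state described above can be pictured with a small Python model. This is only an illustrative sketch: the class name `Machine`, its methods, and the choice that unwritten memory reads as 0 are assumptions made for the example, not part of X86lite or the course infrastructure.

```python
REGISTERS = ("EAX", "EBX", "ECX", "EDX", "ESI", "EDI", "EBP", "ESP")
MEM_SIZE = 2 ** 32  # bytes, addresses 0x00000000 through 0xffffffff


class Machine:
    """Hypothetical model of the X86lite machine state (illustration only)."""

    def __init__(self):
        self.regs = {r: 0 for r in REGISTERS}  # eight 32-bit registers
        self.flags = {"OF": False, "SF": False, "ZF": False}
        self.eip = 0   # program counter
        self.mem = {}  # sparse memory: word-aligned address -> 32-bit word

    def _check(self, addr):
        # Valid addresses lie in range and are evenly divisible by 4.
        if not 0 <= addr < MEM_SIZE:
            raise ValueError("address out of range: %#x" % addr)
        if addr % 4 != 0:
            raise ValueError("address not word aligned: %#x" % addr)

    def load(self, addr):
        self._check(addr)
        return self.mem.get(addr, 0)  # assumption: unwritten memory reads as 0

    def store(self, addr, word):
        self._check(addr)
        self.mem[addr] = word & 0xFFFFFFFF  # keep only the low 32 bits
```

A sparse dictionary stands in for the 2^32-byte memory so that the model stays cheap to create; only words that have been written are stored.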

This section describes the X86lite instruction set.

X86lite instructions manipulate data stored in memory or in registers. The values operated on by a given instruction are described by operands, which are either constant values, like integers and statically known memory addresses, or dynamic values, such as the contents of a register or a computed memory address.

Operands can take one of four forms, described below:

Operand kind Description
CImm : int32 An immediate, constant value of size 32-bits.
CLbl : Label A label value generated by the compiler. The assembler and linker resolve this to a constant memory address at load time. Label values typically denote targets of Jmp or Call instructions.
Reg : reg One of the eight machine registers. The value of a register is its contents.
Ind : ind An indirect address. The type ind consists of three optional components:

[base : reg] [index : reg, scale : int32] [disp : (int32 | Label)]

The effective address denoted by an indirect address is calculated by:

addr(Ind) = base + (index * scale) + disp.

In the formula above, a missing optional component's value is 0.

The index component cannot be ESP.

When an Ind operand is used as a value (not a location) the operand denotes Mem[addr(Ind)], the contents of the machine memory at the effective address denoted by Ind.
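The effective-address formula can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name `effective_address` and its keyword arguments are invented for the example, the displacement is taken to be an integer (labels are not modeled), and wrapping the sum to 32 bits is an assumption rather than something the text above specifies.

```python
def effective_address(regs, base=None, index=None, scale=1, disp=0):
    """Compute addr(Ind) = base + (index * scale) + disp.

    Missing optional components contribute 0, as in the formula above.
    """
    if index == "ESP":
        raise ValueError("the index component cannot be ESP")
    b = regs[base] if base is not None else 0
    i = regs[index] * scale if index is not None else 0
    return (b + i + disp) & 0xFFFFFFFF  # assumption: wrap to 32 bits


regs = {"EBP": 0x1000, "ECX": 3}

# base = EBP, index = ECX, scale = 4, disp = 8
addr = effective_address(regs, base="EBP", index="ECX", scale=4, disp=8)

# Used as a value, the operand denotes Mem[addr(Ind)].
mem = {0x1014: 99}
value = mem[addr]
```

Here `addr` is 0x1000 + 3 * 4 + 8 = 0x1014, and `value` is the word stored at that address.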

The X86lite Cmp SRC1 SRC2 instruction is used to compare two 32-bit operands (SRC1 and SRC2). It works by subtracting SRC2 from SRC1, setting the condition flags according to the result (the actual result of the subtraction is ignored).

The X86lite conditional branch (J) and conditional set (Setb) instructions specify condition codes that look at the condition flags to determine whether or not the condition is satisfied. The eight condition codes and their interpretation in terms of condition flags are given in the following table:

Condition code Description
Eq Equals: This condition holds when ZF is set. (Intuitively SRC1 = SRC2 when SRC1 - SRC2 = 0.)
Zero Zero: This condition holds when ZF is set. (This condition is the same as Eq.)
NotEq Not equals: This condition holds when ZF is not set.
NotZero Not zero: This condition holds when ZF is not set. (This condition is the same as NotEq.)
Slt (Signed) less than: This condition holds when SF does not equal OF. Equivalently, this condition holds when ((SF = 1 and OF = 0) or (SF = 0 and OF = 1)). The first case holds when the result of SRC1 - SRC2 is negative and there has been no overflow; the second case holds when the result of SRC1 - SRC2 is positive and there has been an overflow.
Sge (Signed) greater than or equal: This condition holds when SF equals OF. This is the negation of the conditions for Slt (see above).
Sle (Signed) less than or equal: This condition holds when (SF is not equal to OF) or ZF is set. This is just (Slt or Eq).
Sgt (Signed) greater than: This condition is just the negation of Sle.
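The condition-code table above translates directly into a predicate over the three flags. The following Python sketch is illustrative only; the function name `cc_holds` and the use of strings for condition codes are assumptions made for the example.

```python
def cc_holds(cc, sf, zf, of):
    """Return True if condition code `cc` holds for the given flag values."""
    if cc in ("Eq", "Zero"):
        return zf
    if cc in ("NotEq", "NotZero"):
        return not zf
    if cc == "Slt":
        return sf != of
    if cc == "Sge":
        return sf == of
    if cc == "Sle":
        return (sf != of) or zf
    if cc == "Sgt":
        return not ((sf != of) or zf)  # negation of Sle
    raise ValueError("unknown condition code: %r" % (cc,))


# Second Slt case from the table: the subtraction result looks positive
# (SF = 0) but an overflow occurred (OF = 1), so SRC1 < SRC2 still holds.
slt_after_overflow = cc_holds("Slt", sf=False, zf=False, of=True)
```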

There are 22 instructions in the X86lite architecture. Together, they provide basic signed arithmetic over 32-bit integers, logical operations, data movement between registers and memory, and control-flow operations for branches and jumps. In general, instructions that involve two operands must not use two memory (Ind) operands.

When an operand appears on the right-hand side of the ← symbol in the instruction descriptions below, it is interpreted as a value, computed as described above. When an operand appears on the left-hand side of the ← symbol, it is interpreted as a location. The location of a register operand is the register itself; the location of an Ind operand is Mem[addr(Ind)]. Immediate values and labels do not denote valid locations.

In the following tables, the Flags column indicates which condition flags are affected by the operation. The symbol --- means that no condition flags are affected. The presence of the symbols SF, ZF, and OF indicates that these flags are set as described in the condition codes section. A * next to a flag indicates special handling. Note that overflow conditions for all arithmetic operations are defined per instruction.

Instruction | Operation | Flags | Comments
Neg DEST | DEST ← -DEST | SF ZF OF* | Two's complement negation. Flag OF is set if DEST is MIN_INT.
Add DEST SRC | DEST ← DEST +32 SRC | SF ZF OF* | Signed integer addition. Let D64 be the 64-bit sign extension of DEST and S64 be the 64-bit sign extension of SRC. The result R32 = DEST +32 SRC is the 32-bit truncation of R64, which is obtained by 64-bit addition R64 = D64 +64 S64. Flag OF is set when S64 and D64 have the same sign but R32 and S64 do not.
Sub DEST SRC | DEST ← DEST -32 SRC | SF ZF OF* | Signed integer subtraction. This operation can be computed using arithmetic negation and addition. Let D64 be the 64-bit sign extension of DEST and S64 be the 64-bit sign extension of SRC. The result R32 = DEST -32 SRC is the 32-bit truncation of R64, which is obtained by the 64-bit computation R64 = D64 +64 (-S64). Flag OF is set when D64 and -S64 have the same sign but R32 and -S64 do not, or when SRC = MIN_INT.
Imul Reg SRC | Reg ← Reg *32 SRC | SF* ZF* OF* | Signed integer multiply. Let D64 be the 64-bit sign extension of Reg and S64 be the 64-bit sign extension of SRC. The result R32 = Reg *32 SRC is the 32-bit truncation of R64, which is obtained by 64-bit multiplication R64 = D64 *64 S64. Flag OF is set when R64 cannot be represented as a 32-bit sign-extended integer. Flags ZF and SF are undefined.
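The overflow rules for Neg, Add, and Imul in the table above can be sketched in Python, whose unbounded integers stand in for the 64-bit intermediate results. The function names (`to_signed32`, `same_sign`, `neg32`, `add32`, `imul32`) and the convention of returning a (result, flags) pair are assumptions made for this example; zero is treated as having a non-negative sign.

```python
MIN_INT = -2 ** 31
MAX_INT = 2 ** 31 - 1


def to_signed32(x):
    """Truncate to 32 bits and reinterpret as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - 2 ** 32 if x >= 2 ** 31 else x


def same_sign(a, b):
    return (a < 0) == (b < 0)


def neg32(dest):
    r32 = to_signed32(-dest)
    return r32, {"SF": r32 < 0, "ZF": r32 == 0, "OF": dest == MIN_INT}


def add32(dest, src):
    r64 = dest + src          # the "64-bit" sum; never overflows in Python
    r32 = to_signed32(r64)    # 32-bit truncation
    # OF: the operands agree in sign but the truncated result does not.
    of = same_sign(dest, src) and not same_sign(r32, src)
    return r32, {"SF": r32 < 0, "ZF": r32 == 0, "OF": of}


def imul32(dest, src):
    r64 = dest * src
    r32 = to_signed32(r64)
    # OF: the full product is not representable in 32 bits.
    # SF and ZF are undefined for Imul, so they are not reported here.
    return r32, {"OF": r64 != r32}
```

For instance, `add32(MAX_INT, 1)` wraps around to `MIN_INT` and reports OF, while `add32(-1, 1)` yields 0 with ZF set and OF clear.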
Instruction | Operation | Flags | Comments
Not DEST | DEST ← not DEST | --- | One's complement (logical) negation.
And DEST SRC | DEST ← DEST AND SRC | SF ZF OF* | Logical AND. Flag OF is always set to 0.
Or DEST SRC | DEST ← DEST OR SRC | SF ZF OF* | Logical OR. Flag OF is always set to 0.
Xor DEST SRC | DEST ← DEST XOR SRC | SF ZF OF* | Logical XOR. Flag OF is always set to 0.
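As a brief illustration of the logic instructions above, the following Python sketch applies a bitwise operation and derives the flags from the result. The names `logic32` and `not32` are hypothetical, and operands are assumed to be Python integers already in the signed 32-bit range.

```python
def to_signed32(x):
    """Truncate to 32 bits and reinterpret as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - 2 ** 32 if x >= 2 ** 31 else x


def logic32(op, dest, src):
    """And / Or / Xor: SF and ZF follow the result, OF is always cleared."""
    ops = {
        "And": lambda a, b: a & b,
        "Or": lambda a, b: a | b,
        "Xor": lambda a, b: a ^ b,
    }
    r32 = to_signed32(ops[op](dest, src))
    return r32, {"SF": r32 < 0, "ZF": r32 == 0, "OF": False}


def not32(dest):
    """Not: one's complement; no flags are affected, so none are returned."""
    return to_signed32(~dest)
```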
Instruction | Operation | Flags | Comments
Sar DEST AMT | DEST ← DEST >> AMT | SF* ZF* OF* | Arithmetic shift DEST right by AMT, replicating the sign bit for the vacated spaces. AMT must be a CImm or ECX operand. If AMT = 0 flags are unaffected. Otherwise the flags SF and ZF are set as usual. The OF flag is set to 0 if the shift amount is 1 (and is otherwise unaffected).
Shl DEST AMT | DEST ← DEST << AMT | SF* ZF* OF* | Bitwise shift DEST left by AMT. AMT must be a CImm or ECX operand. If AMT = 0 flags are unaffected. Otherwise, flags SF and ZF are set as usual. OF is set if the top two bits of DEST are different and the shift amount is 1 (and is otherwise unaffected).
Shr DEST AMT | DEST ← DEST >>> AMT | SF* ZF* OF* | Bitwise shift DEST right by AMT, inserting 0's for the vacated spaces. AMT must be a CImm or ECX operand. Flags are set as in the Shl instruction, where OF is set to the most-significant bit of the original operand if the shift amount is 1 (and is otherwise unaffected).
Setb DEST CC | DEST's lower byte ← if CC then 1 else 0 | --- | If condition code CC holds in the current state, move 1 into the lower byte of DEST; otherwise move 0 into the lower byte.
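The difference between the two right shifts, and their special flag handling, can be sketched in Python. The function names `sar32` and `shr32` are hypothetical, and the sketch assumes 0 <= AMT < 32, since the table above does not say how larger shift amounts behave. Each function takes the current flags and returns the result together with the updated flags.

```python
def to_signed32(x):
    """Truncate to 32 bits and reinterpret as a two's complement value."""
    x &= 0xFFFFFFFF
    return x - 2 ** 32 if x >= 2 ** 31 else x


def sar32(dest, amt, flags):
    """Arithmetic right shift: the sign bit fills the vacated positions."""
    r32 = dest >> amt  # Python's >> on negative ints is an arithmetic shift
    if amt == 0:
        return r32, flags          # flags unaffected
    new = dict(flags, SF=r32 < 0, ZF=r32 == 0)
    if amt == 1:
        new["OF"] = False          # OF cleared only for a 1-bit shift
    return r32, new


def shr32(dest, amt, flags):
    """Logical right shift: zeros fill the vacated positions."""
    r32 = to_signed32((dest & 0xFFFFFFFF) >> amt)
    if amt == 0:
        return r32, flags          # flags unaffected
    new = dict(flags, SF=r32 < 0, ZF=r32 == 0)
    if amt == 1:
        new["OF"] = dest < 0       # most-significant bit of the original operand
    return r32, new


old = {"OF": True, "SF": False, "ZF": False}
sar_result, sar_flags = sar32(-8, 1, old)
shr_result, shr_flags = shr32(-8, 1, old)
```

Shifting -8 (0xfffffff8) right by one gives -4 under Sar, because the sign bit is replicated, and 0x7ffffffc under Shr, because a zero is shifted in.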
Instruction | Operation | Flags | Comments
Lea DEST Ind | DEST ← addr(Ind) | --- | Load effective address of Ind, which must be an operand of type ind (see operands). This instruction calculates a pointer into memory and stores it in DEST.
Mov DEST SRC | DEST ← SRC | --- | Copy the value of SRC to the location denoted by DEST.
Push SRC | ESP ← ESP - 4; Mem[ESP] ← SRC | --- | Push a 32-bit value onto the stack: decrement ESP by 4 to allocate the new stack slot and then store SRC in the resulting memory address.
Pop DEST | DEST ← Mem[ESP]; ESP ← ESP + 4 | --- | Pop the top of the stack into DEST: load the value pointed to by ESP from memory and then increment ESP by 4.
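The Push and Pop operations above can be traced with a short Python sketch. The helper names `push` and `pop`, the dictionary-based registers and memory, and the starting stack address 0x10000 are all assumptions chosen for the example.

```python
def push(regs, mem, value):
    """ESP <- ESP - 4; Mem[ESP] <- value."""
    regs["ESP"] = (regs["ESP"] - 4) & 0xFFFFFFFF
    mem[regs["ESP"]] = value


def pop(regs, mem):
    """Return Mem[ESP], then ESP <- ESP + 4."""
    value = mem[regs["ESP"]]
    regs["ESP"] = (regs["ESP"] + 4) & 0xFFFFFFFF
    return value


regs = {"ESP": 0x10000}  # hypothetical initial stack pointer
mem = {}

push(regs, mem, 7)       # ESP is now 0xfffc
push(regs, mem, 9)       # ESP is now 0xfff8
esp_after_pushes = regs["ESP"]

first = pop(regs, mem)   # last value pushed comes off first
second = pop(regs, mem)
```

The stack grows toward lower addresses, so two pushes move ESP down by 8, and the two pops return the values in reverse order and restore ESP to its starting value.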
Instruction | Operation | Flags | Comments
Cmp SRC1 SRC2 | SRC1 -32 SRC2 | SF* ZF* OF* | Compare SRC1 to SRC2 by setting all condition flags as though the instruction Sub SRC1 SRC2 were executed. Does not change register or memory contents.
Jmp SRC | EIP ← SRC | --- | Jump to the machine instruction at the address given by the value of SRC. Sets the program counter.
Call SRC | Push EIP; EIP ← SRC | --- | Call a procedure: push the program counter to the stack (decrementing ESP) and then jump to the machine instruction at the address given by the value of SRC.
Ret | Pop EIP | --- | Return from a procedure: pop the current top of the stack into EIP (incrementing ESP); this instruction effectively jumps to the address at the top of the stack.
J CC CLbl | EIP ← if CC then CLbl else the address of the next instruction | --- | Conditional jump: if the condition code CC holds in the current state, set EIP to CLbl; otherwise set EIP to the next instruction (i.e. fallthrough).
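The interplay of Call, Ret, and J with EIP and the stack can be sketched as follows. This is a simplified illustration: the state dictionary, the function names `call`, `ret`, and `jcc`, the concrete addresses, and the assumption that the pushed EIP value is the address of the instruction following the Call (passed in as `return_addr`) are all choices made for the example.

```python
def call(state, target, return_addr):
    """Push EIP; EIP <- target. `return_addr` stands for the pushed EIP value."""
    state["ESP"] -= 4
    state["mem"][state["ESP"]] = return_addr
    state["EIP"] = target


def ret(state):
    """Pop EIP: jump to the address stored at the top of the stack."""
    state["EIP"] = state["mem"][state["ESP"]]
    state["ESP"] += 4


def jcc(state, cond_holds, target, fallthrough):
    """Conditional jump: take `target` if the condition holds, else fall through."""
    state["EIP"] = target if cond_holds else fallthrough


state = {"EIP": 0x100, "ESP": 0x2000, "mem": {}}

call(state, target=0x400, return_addr=0x104)
eip_in_callee, esp_in_callee = state["EIP"], state["ESP"]

ret(state)
eip_after_ret, esp_after_ret = state["EIP"], state["ESP"]
```

After the call, EIP holds the procedure address and the return address sits at the new top of the stack; Ret undoes both effects.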