LOOPNEZ — Loop if Not-Equal Zero

Instruction Word

Slot: Inst
Format x24 - 24 bit(s)

Bit      23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
LOOPNEZ                           1  0  0  1              0  1  1  1  0  1  1  0
s                                             3  2  1  0
imm8      7  6  5  4  3  2  1  0
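
As a rough illustration of the encoding, the following Python sketch packs and unpacks the 24-bit instruction word. The bit positions used here (imm8 in bits 23..16, the fixed pattern 1001 in bits 15..12, s in bits 11..8, and the fixed pattern 01110110 in bits 7..0) are an assumption inferred from the field order shown above, and the names encode_loopnez and decode_loopnez are hypothetical.

```python
# Fixed opcode bits of LOOPNEZ: 1001 in bits 15..12, 01110110 in bits 7..0.
# These bit positions are an assumption, not taken verbatim from the table.
LOOPNEZ_FIXED = (0b1001 << 12) | 0b01110110
LOOPNEZ_MASK = 0x00F0FF  # bits covered by the fixed opcode pattern


def encode_loopnez(s: int, imm8: int) -> int:
    """Pack register number s (0..15) and offset imm8 (0..255) into a 24-bit word."""
    if not 0 <= s <= 15:
        raise ValueError("s must be a 4-bit register number")
    if not 0 <= imm8 <= 255:
        raise ValueError("imm8 must be an unsigned 8-bit offset")
    return (imm8 << 16) | (s << 8) | LOOPNEZ_FIXED


def decode_loopnez(word: int):
    """Return (s, imm8) if word matches the LOOPNEZ pattern, else None."""
    if word & LOOPNEZ_MASK != LOOPNEZ_FIXED:
        return None
    return (word >> 8) & 0xF, (word >> 16) & 0xFF
```

For example, under these assumptions encode_loopnez(3, 0x10) yields the word 0x109376, and decode_loopnez recovers (3, 0x10) from it.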

Assembler Syntax

LOOPNEZ as, label

Description

(Please consult the Xtensa® Instruction Set Architecture Reference Manual for any cross references and additional information.)

LOOPNEZ sets up a zero-overhead loop by setting the LCOUNT, LBEG, and LEND special registers, which control instruction fetch. The loop iterates the number of times specified by address register as; if as contains zero, the loop is skipped altogether by branching directly to the loop end address. LCOUNT, the current loop iteration counter, is loaded with the contents of address register as minus 1. LEND, the loop end address, is loaded with the address of the LOOPNEZ instruction plus four plus the zero-extended 8-bit offset encoded in the instruction (therefore, the loop code may be up to 256 bytes in length). LBEG is loaded with the address of the following instruction. LCOUNT, LEND, and LBEG are loaded even when the loop is skipped.
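
The relationship between the loop end address and the encoded offset can be sketched in Python as follows. This is only an illustration of the arithmetic described above; the function name loopnez_imm8 and the example addresses are hypothetical.

```python
def loopnez_imm8(loop_pc: int, end_addr: int) -> int:
    """Return the imm8 value for which loop_pc + 4 + imm8 equals end_addr.

    loop_pc is the address of the LOOPNEZ instruction and end_addr is the
    desired LEND value (the address just past the last loop instruction).
    """
    offset = end_addr - (loop_pc + 4)
    if not 0 <= offset <= 0xFF:
        # The offset is an unsigned 8-bit field, so the end address must lie
        # between loop_pc + 4 and loop_pc + 259 inclusive.
        raise ValueError("loop end address out of range for LOOPNEZ")
    return offset
```

With a LOOPNEZ at 0x1000 and a loop end at 0x1010, the encoded offset is 12. An end address beyond 0x1103 cannot be encoded and would require a different loop construct.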

When the processor fetches an instruction that advances the PC to the value contained in LEND and LCOUNT is not zero, it loads the PC with the contents of LBEG and decrements LCOUNT. LOOPNEZ is intended to be implemented with help from the instruction fetch engine of the processor, and therefore should not incur a mispredict or taken-branch penalty. Branches and jumps to the address contained in LEND do not cause a loop back, and therefore may be used to exit the loop prematurely. Similarly, a return from a call instruction as the last instruction of the loop would not trigger loop back; this case should be avoided.
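
The loop-back rule can be modeled with a small Python sketch that walks the PC through a straight-line loop body and counts how often the body runs. It is a simplified model, not a description of any particular implementation: it assumes LBEG is the LOOPNEZ address plus three (a 24-bit instruction), ignores branches and PS.EXCM, and the function name and default address are hypothetical.

```python
def count_iterations(as_value: int, body_sizes: list, loop_pc: int = 0x1000) -> int:
    """Count executions of a straight-line loop body set up by LOOPNEZ.

    as_value is the content of the address register; body_sizes lists the
    byte length of each instruction in the loop body.
    """
    lbeg = loop_pc + 3                      # address of the following instruction
    lend = lbeg + sum(body_sizes)           # first address past the loop body
    lcount = (as_value - 1) & 0xFFFFFFFF    # loaded even when the loop is skipped
    pc = lend if as_value == 0 else lbeg    # zero count: go straight to the loop end
    iterations = 0
    index = 0
    while pc != lend:
        if index == 0:
            iterations += 1                 # starting another pass over the body
        pc += body_sizes[index]             # sequential fetch advances the PC
        index += 1
        if pc == lend and lcount != 0:
            pc = lbeg                       # loop back
            lcount -= 1
            index = 0
    return iterations
```

In this model a register value of 3 runs the body three times, a value of 1 runs it once with no loop back, and a value of 0 skips the body entirely.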

There is no mechanism to proceed to the next iteration of the loop from the middle of the loop. The compiler may insert a branch to a NOP placed as the last instruction of the loop to implement this function if required.

Because LCOUNT, LBEG, and LEND are single registers, zero-overhead loops may not be nested. Using conditional branch instructions to implement outer level loops is typically not a performance issue. Because loops cannot be nested, it is usually inappropriate to include a procedure call inside a loop (the callee might itself use a zero-overhead loop).

To simplify the implementation of zero-overhead loops, the LBEG address must be such that the first instruction of the loop fits entirely within a naturally aligned four-byte region or, if the fetch width is larger than four bytes, within a naturally aligned region whose size is the next power of two equal to or larger than the fetch width. Some implementations require, in addition, that the fetch width is any greater than the naturally aligned power of two region (of four bytes or larger) which is no smaller than that first instruction. When the LOOPNEZ instruction would not naturally be placed at such an address, inserting NOP instructions or adjusting which instructions are 16-bit density instructions is sufficient to give it the required alignment.
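
A tool that places loops might check the first alignment requirement along the following lines. This Python sketch follows only the first sentence of the paragraph above as worded there; it does not model the additional requirement of some implementations, and the function names and the fetch_width parameter are hypothetical.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is greater than or equal to n."""
    p = 1
    while p < n:
        p *= 2
    return p


def first_instruction_fits(lbeg: int, inst_size: int, fetch_width: int = 4) -> bool:
    """Check that the instruction at lbeg lies within one naturally aligned region.

    The region is four bytes, or the next power of two at or above the fetch
    width when the fetch width (in bytes) is larger than four.
    """
    region = 4 if fetch_width <= 4 else next_power_of_two(fetch_width)
    first_byte = lbeg
    last_byte = lbeg + inst_size - 1
    # Both ends of the instruction must fall in the same aligned region.
    return first_byte // region == last_byte // region
```

For instance, a three-byte instruction at 0x1003 straddles two four-byte regions and fails the check, while the same instruction at 0x1004 passes. Padding with a NOP before the loop instruction is one way to move LBEG to a passing address.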

The automatic loop-back when the PC increments to match LEND is disabled when PS.EXCM is set. This prevents non-privileged code from affecting the operation of the privileged exception vector code. Dynamic loaders need to avoid mixing new code and old register values as the combination may execute in unexpected ways.

Operation

LCOUNT ← AR[s] − 1
LBEG ← nextPC
LEND ← PC + (0^24||imm8) + 4
if AR[s] = 0^32 then
	nextPC ← PC + (0^24||imm8) + 4
endif
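
The operation above can be transcribed into Python roughly as follows. This is a sketch for checking values by hand, not a reference model: it assumes 32-bit wraparound arithmetic and that the default nextPC is PC + 3 because the instruction is 24 bits long, and the function name loopnez is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def loopnez(pc: int, ar_s: int, imm8: int) -> dict:
    """Return the register values produced by LOOPNEZ at address pc."""
    next_pc = (pc + 3) & MASK32        # assumed default: next sequential instruction
    lcount = (ar_s - 1) & MASK32       # wraps to 0xFFFFFFFF when ar_s is zero
    lbeg = next_pc                     # captured before the skip test below
    lend = (pc + imm8 + 4) & MASK32    # imm8 is zero-extended to 32 bits
    if ar_s == 0:
        next_pc = lend                 # skip the loop body entirely
    return {"LCOUNT": lcount, "LBEG": lbeg, "LEND": lend, "nextPC": next_pc}
```

With pc = 0x1000, ar_s = 5, and imm8 = 0x10 this gives LCOUNT = 4, LBEG = 0x1003, and LEND = 0x1014. With ar_s = 0 the same registers are still loaded, LCOUNT wraps to 0xFFFFFFFF, and nextPC becomes 0x1014.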

Exceptions

EveryInstR Group (see EveryInstR Group)

Implementation Pipeline

In            Out
ars Estage    LBEG Estage, LEND Estage