AE_MULZSAD32X16.H3.L2_S2 — 32x16-bit signed integer dual MAC with Subtraction/Addition and 64-bit result, without saturation, zeroed accumulator: slot 2 version.

Instruction Word

Slot: ae_slot2_0 (bits 63..0)
Format ae_format - 64 bit(s): 1111
AE_MULZSAD32X16.H3.L2_S2: 11010111
ae_fld_mul_x2_S2_q0: field bits 3..0
ae_fld_mul_x2_S2_d0: field bits 3..0
ae_fld_mul_x2_S2_d1: field bits 3..0

Assembler Syntax

AE_MULZSAD32X16.H3.L2_S2 aed0..15(ae_mul_S2_q0), aed0..15(ae_mul_S2_d0), aed0..15(ae_mul_S2_d1)

C Syntax

#include <xtensa/tie/xt_hifi2.h>

extern ae_int64 AE_MULZSAD32X16_H3_L2_S2(ae_int32x2 d1, ae_int16x4 d0);

Description

AE_MULZSAD32X16.H3.L2 is a 32x16-bit signed integer dual MAC with 64-bit result, without saturation. The extension H3.L2 indicates that the two multiplication results are (1) the product of the H element of the 32-bit AE_DR operand with element 3 of the 16-bit AE_DR operand, and (2) the product of the L element of the 32-bit AE_DR operand with element 2 of the 16-bit AE_DR operand. The initial accumulator contents are zeroed before the results are written into the accumulator. SA indicates that the first multiply result is subtracted from the accumulator and the second multiply result is added to the accumulator, so the 64-bit result is product (2) minus product (1).

This is equivalent to AE_MULZSAD32X16.H3.L2 and will be automatically used by the compiler as needed.
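The arithmetic in the description can be sketched as a scalar reference model in portable C. This is only an illustration, not the vendor intrinsic: the function name mulzsad32x16_h3_l2_model and its four scalar parameters are hypothetical, and the caller is assumed to have already extracted the H and L elements of the 32-bit operand and elements 3 and 2 of the 16-bit operand.

```c
#include <stdint.h>

/* Hypothetical scalar model of the described arithmetic (not the real intrinsic).
 * d1_h, d1_l : H and L elements of the 32-bit operand (assumed already extracted)
 * d0_3, d0_2 : elements 3 and 2 of the 16-bit operand (assumed already extracted)
 */
static int64_t mulzsad32x16_h3_l2_model(int32_t d1_h, int32_t d1_l,
                                        int16_t d0_3, int16_t d0_2)
{
    int64_t acc = 0;                    /* zeroed accumulator */

    acc -= (int64_t)d1_h * d0_3;        /* first product is subtracted */
    acc += (int64_t)d1_l * d0_2;        /* second product is added */

    /* Each 32x16 product fits in 48 bits, so the 64-bit result cannot
     * overflow and no saturation step is needed. */
    return acc;
}
```

For example, with H = 3, L = 5, element 3 = 2 and element 2 = 4, the model returns -(3*2) + (5*4) = 14.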

Implementation Pipeline

In: ae_mul_S2_d0 Mstage, ae_mul_S2_d1 Mstage
Out: ae_mul_S2_q0 Wstage

Protos that use AE_MULZSAD32X16.H3.L2_S2

proto AE_MULZSAD32X16.H3.L2_S2 { out ae_int64 d, in ae_int32x2 d1, in ae_int16x4 d0 }{}{
AE_MULZSAD32X16.H3.L2_S2 d, d1, d0;
}