AE_MULZASD32X16.H3.L2 — 32x16-bit signed integer dual MAC with addition/subtraction and 64-bit result, without saturation, zeroed accumulator.

Instruction Word

Slot
ae2_slot1
Bit positions: 63 down to 0
Format ae_format2 - 64 bit(s) 0000 1110
AE_MULZASD32X16.H3.L2 1101 1011 0100
ae_fld_mul_q0 3210
ae_fld_mul_d0 3210
ae_fld_mul_d1 3210

Slot
ae_slot1
Bit positions: 63 down to 0
Format ae_format - 64 bit(s) 1111
AE_MULZASD32X16.H3.L2 11010000
ae_fld_mul_q0 3210
ae_fld_mul_d0 3210
ae_fld_mul_d1 3210

Slot
ae_slot1
Bit positions: 63 down to 0
Format ae_format1 - 64 bit(s) 1 1110
AE_MULZASD32X16.H3.L2 11010000
ae_fld_mul_q0 3210
ae_fld_mul_d0 3210
ae_fld_mul_d1 3210

Assembler Syntax

AE_MULZASD32X16.H3.L2 aed0..15(ae_mul_q0), aed0..15(ae_mul_d0), aed0..15(ae_mul_d1)

C Syntax

#include <xtensa/tie/xt_hifi2.h>

extern ae_int64 AE_MULZASD32X16_H3_L2(ae_int32x2 d1, ae_int16x4 d0);

Description

AE_MULZASD32X16.H3.L2 is a 32x16-bit signed integer dual MAC with 64-bit result, without saturation. The extension H3.L2 indicates that the two multiplication results are (1) the product of the H element of the 32-bit AE_DR operand with element 3 of the 16-bit AE_DR operand, and (2) the product of the L element of the 32-bit AE_DR operand with element 2 of the 16-bit AE_DR operand. The initial accumulator contents are zeroed before writing into the accumulator. AS indicates that the first multiply result is added to the accumulator, and the second multiply result is subtracted from the accumulator.
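
The arithmetic described above can be sketched as a portable C reference model. This is an illustration only, not the intrinsic itself: the function name ref_mulzasd32x16_h3_l2 is hypothetical, and the selected elements are passed as separate scalar arguments so that no assumption is made about how H, L, 3 and 2 are laid out inside the AE_DR registers.

```c
#include <stdint.h>

/* Hypothetical reference model of the arithmetic described for
 * AE_MULZASD32X16.H3.L2. It is not part of the Xtensa headers.
 *   d1_h, d1_l : H and L elements of the 32-bit operand (d1)
 *   d0_3, d0_2 : elements 3 and 2 of the 16-bit operand (d0)
 */
static int64_t ref_mulzasd32x16_h3_l2(int32_t d1_h, int32_t d1_l,
                                      int16_t d0_3, int16_t d0_2)
{
    int64_t acc = 0;                        /* accumulator starts zeroed */
    acc += (int64_t)d1_h * (int64_t)d0_3;   /* first product is added */
    acc -= (int64_t)d1_l * (int64_t)d0_2;   /* second product is subtracted */
    return acc;                             /* 64-bit result, no saturation */
}
```

Each 32x16-bit product has a magnitude of at most 2^46, so the difference of two products always fits in a signed 64-bit value and the model needs no saturation or overflow handling.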

Implementation Pipeline

In: ae_mul_d0 Mstage, ae_mul_d1 Mstage
Out: ae_mul_q0 Wstage

Protos that use AE_MULZASD32X16.H3.L2

proto AE_MULZASD32X16.H3.L2 { out ae_int64 d, in ae_int32x2 d1, in ae_int16x4 d0 }{}{
AE_MULZASD32X16.H3.L2 d, d1, d0;
}