AE_MULZSAFD32X16.H3.L2: 32x16-bit signed fractional dual MAC with subtraction/addition and 64-bit (17.47) result, without saturation, with zeroed accumulator.

Instruction Word

Slot
ae2_slot1
Bit positions: 63 down to 0
Format ae_format2 - 64 bit(s) 0000 1110
AE_MULZSAFD32X16.H3.L2 1101 1011 1001
ae_fld_mul_q0 3210
ae_fld_mul_d0 3210
ae_fld_mul_d1 3210

Slot
ae_slot1
Bit positions: 63 down to 0
Format ae_format - 64 bit(s) 1111
AE_MULZSAFD32X16.H3.L2 11011000
ae_fld_mul_q0 3210
ae_fld_mul_d0 3210
ae_fld_mul_d1 3210

Slot
ae_slot1
Bit positions: 63 down to 0
Format ae_format1 - 64 bit(s) 1 1110
AE_MULZSAFD32X16.H3.L2 11011000
ae_fld_mul_q0 3210
ae_fld_mul_d0 3210
ae_fld_mul_d1 3210

Assembler Syntax

AE_MULZSAFD32X16.H3.L2 aed0..15(ae_mul_q0), aed0..15(ae_mul_d0), aed0..15(ae_mul_d1)

C Syntax

#include <xtensa/tie/xt_hifi2.h>

extern ae_f64 AE_MULZSAFD32X16_H3_L2(ae_f32x2 d1, ae_f16x4 d0);

Description

AE_MULZSAFD32X16.H3.L2 is a 32x16-bit signed fractional dual MAC with a 64-bit (17.47) result, without saturation. The suffix H3.L2 indicates that the two products are (1) the H element of the 32-bit AE_DR operand times element 3 of the 16-bit AE_DR operand, and (2) the L element of the 32-bit AE_DR operand times element 2 of the 16-bit AE_DR operand. Z indicates that the initial accumulator contents are zeroed before accumulation, so the previous value of the destination register does not contribute to the result. SA indicates that the first product is subtracted from the accumulator and the second product is added to it.
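The arithmetic described above can be sketched as a small portable C model. This is an illustration under stated assumptions, not the vendor's reference implementation: the types model_f32x2 and model_f16x4 and both function names are hypothetical stand-ins (they are not part of xt_hifi2.h), H is assumed to be the upper 32-bit element, element 3 is assumed to be the most significant 16-bit element, and each fractional product is assumed to be doubled so that a Q1.31 by Q1.15 product lines up with the 47 fraction bits of the 17.47 result.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the two AE_DR operand views (not the real HiFi types). */
typedef struct { int32_t h, l; } model_f32x2;            /* assumed: H = upper 32 bits, L = lower 32 bits */
typedef struct { int16_t e3, e2, e1, e0; } model_f16x4;  /* assumed: element 3 is the most significant */

/* Assumed fractional product: Q1.31 * Q1.15 gives Q2.46; doubling aligns it to 47 fraction bits.
   The largest magnitude is 2^47, so the 64-bit intermediate cannot overflow. */
static int64_t model_mulf32x16(int32_t a, int16_t b)
{
    return ((int64_t)a * (int64_t)b) * 2;
}

/* Model of the H3.L2 variant: the accumulator starts at zero (Z), the H*3 product
   is subtracted (S), the L*2 product is added (A), and nothing is saturated. */
static int64_t model_mulzsafd32x16_h3_l2(model_f32x2 d1, model_f16x4 d0)
{
    int64_t acc = 0;
    acc -= model_mulf32x16(d1.h, d0.e3);
    acc += model_mulf32x16(d1.l, d0.e2);
    return acc;
}
```

Under these assumptions, d1 = (H = 0.5, L = 0.25) with d0 elements 3 and 2 equal to 0.5 and 0.25 gives -0.25 + 0.0625 = -0.1875, which is 2^43 - 2^45 in the 17.47 format. Elements 1 and 0 of the 16-bit operand do not affect the result.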

Implementation Pipeline

In: ae_mul_d0 Mstage, ae_mul_d1 Mstage
Out: ae_mul_q0 Wstage

Protos that use AE_MULZSAFD32X16.H3.L2

proto AE_MULZSAFD32X16.H3.L2 { out ae_f64 d, in ae_f32x2 d1, in ae_f16x4 d0 }{}{
AE_MULZSAFD32X16.H3.L2 d, d1, d0;
}