AE_MULZSAFD32X16.H3.L2_S2 - 32x16-bit signed fraction dual MAC with subtraction/addition and 64-bit (17.47) result, without saturation, zeroed accumulator: slot 2 version.

Instruction Word

Slot ae_slot2_0, bits 63..0 (the bit positions of the rows below are not recoverable from the original diagram)
Format ae_format - 64 bit(s): 1111
AE_MULZSAFD32X16.H3.L2_S2: 11011010
ae_fld_mul_x2_S2_q0: 3210
ae_fld_mul_x2_S2_d0: 3210
ae_fld_mul_x2_S2_d1: 3210

Assembler Syntax

AE_MULZSAFD32X16.H3.L2_S2 aed0..15(ae_mul_S2_q0), aed0..15(ae_mul_S2_d0), aed0..15(ae_mul_S2_d1)

C Syntax

#include <xtensa/tie/xt_hifi2.h>

extern ae_f64 AE_MULZSAFD32X16_H3_L2_S2(ae_f32x2 d1, ae_f16x4 d0);

Description

AE_MULZSAFD32X16.H3.L2 is a 32x16-bit signed fraction dual MAC with 64-bit (17.47) result, without saturation. The extension H3.L2 indicates that the two multiplication results are (1) the product of the H element of the 32-bit AE_DR operand with element 3 of the 16-bit AE_DR operand, and (2) the product of the L element of the 32-bit AE_DR operand with element 2 of the 16-bit AE_DR operand. Z indicates that the initial accumulator contents are zeroed before the results are accumulated. SA indicates that the first multiply result is subtracted from the accumulator, and the second multiply result is added to the accumulator.
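The arithmetic described above can be sketched as a portable C reference model. This is not the Xtensa intrinsic and does not use the AE_DR register types: the function name and its scalar parameters are hypothetical, and the four inputs stand in for the H and L elements of the 32-bit operand and elements 3 and 2 of the 16-bit operand. The sketch also assumes that the fraction multiply aligns each Q1.31 x Q1.15 product to the 17.47 format with a one-bit left shift, a common convention for fractional MACs that this page does not state explicitly.

```c
#include <stdint.h>

/*
 * Hypothetical host-side model of the H3.L2 "zero, subtract, add" dual MAC.
 * d1_h, d1_l : H and L 32-bit elements of the 32-bit operand (Q1.31).
 * d0_3, d0_2 : elements 3 and 2 of the 16-bit operand (Q1.15).
 * Returns the 64-bit result, interpreted as 17.47 fixed point.
 */
static int64_t mulzsafd32x16_h3_l2_model(int32_t d1_h, int32_t d1_l,
                                         int16_t d0_3, int16_t d0_2)
{
    /* Each raw product has 46 fractional bits and magnitude at most 2^46. */
    int64_t p_h3 = (int64_t)d1_h * d0_3;
    int64_t p_l2 = (int64_t)d1_l * d0_2;

    /* Z: the accumulator starts at zero. S: subtract the first product.
     * A: add the second product. No saturation is applied. */
    int64_t acc = 0 - p_h3 + p_l2;

    /* Assumed fractional alignment: doubling moves 46 fractional bits to 47.
     * The magnitude stays below 2^49, so this cannot overflow int64_t. */
    return acc * 2;
}
```

For example, with H = 0.5 (0x40000000), L = 0.25 (0x20000000) and both 16-bit elements equal to 0.5 (0x4000), the model returns -2^44, which is -0.125 in 17.47 format, matching -(0.5 x 0.5) + (0.25 x 0.5).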

This slot 2 version is functionally equivalent to AE_MULZSAFD32X16.H3.L2; the compiler selects it automatically as needed.

Implementation Pipeline

In: ae_mul_S2_d0 Mstage, ae_mul_S2_d1 Mstage
Out: ae_mul_S2_q0 Wstage

Protos that use AE_MULZSAFD32X16.H3.L2_S2

proto AE_MULZSAFD32X16.H3.L2_S2 { out ae_f64 d, in ae_f32x2 d1, in ae_f16x4 d0 }{}{
AE_MULZSAFD32X16.H3.L2_S2 d, d1, d0;
}