AE_MULAF16X4SS: Four-way SIMD 16x16-bit signed fractional (1.15) MAC with a 32-bit (1.31) result, saturating both the intermediate product and the accumulator.

Instruction Word

Slot: ae2_slot1
Format: ae_format2 - 64 bit(s), bits 63..0; encoding bits shown: 0000 1110
Opcode: AE_MULAF16X4SS; encoding bits shown: 1100 0001
Operand fields (4 bits each, field bits 3..0): ae_fld_mul_x4_q1, ae_fld_mul_q0, ae_fld_mul_d1, ae_fld_mul_d0
(Bit-field diagram; the positions of the fields within the 64-bit word are not shown here.)

Assembler Syntax

AE_MULAF16X4SS aed0..15(ae_mul_q1), aed0..15(ae_mul_q0), aed0..15(ae_mul_d1), aed0..15(ae_mul_d0)

C Syntax

#include <xtensa/tie/xt_hifi2.h>

extern void AE_MULAF16X4SS(ae_f32x2 d0 /*inout*/, ae_f32x2 d1 /*inout*/, ae_f16x4 d2, ae_f16x4 d3);

Description

AE_MULAF16X4SS is a four-way SIMD 16x16-bit signed fractional (1.15) multiply-accumulate (MAC) that produces 32-bit (1.31) results, saturating both the intermediate product and the accumulator. The operation is bit-exact with the ITU-T L_mac basic operation.
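
The L_mac behaviour named in the Description can be sketched as a scalar C reference model. This is a minimal sketch for host-side checking, not the vendor's implementation. All function names are hypothetical. Two points are assumptions not stated on this page: that lane i of the first 16-bit operand is multiplied by lane i of the second and accumulated into accumulator lane i (the mapping of lanes onto the halves of d0 and d1 is not modelled), and that AE_OVERFLOW behaves as a sticky flag set whenever either saturation step clips.

```c
#include <assert.h>
#include <stdint.h>

/* Clip a 64-bit intermediate to the signed 32-bit (1.31) range.
 * *ovf stands in for AE_OVERFLOW (assumed sticky: set on clipping, never cleared). */
static int32_t sat32_model(int64_t v, int *ovf)
{
    if (v > INT32_MAX) { *ovf = 1; return INT32_MAX; }
    if (v < INT32_MIN) { *ovf = 1; return INT32_MIN; }
    return (int32_t)v;
}

/* ITU-T style L_mult: 1.15 x 1.15 -> 1.31, i.e. (a * b) * 2.
 * The only overflowing case is (-1.0) * (-1.0), which clips to 0x7FFFFFFF. */
static int32_t l_mult_model(int16_t a, int16_t b, int *ovf)
{
    int64_t p = (int64_t)a * (int64_t)b * 2;
    return sat32_model(p, ovf);
}

/* ITU-T style L_mac: saturated product added to the accumulator,
 * with the sum saturated again. */
static int32_t l_mac_model(int32_t acc, int16_t a, int16_t b, int *ovf)
{
    int32_t p = l_mult_model(a, b, ovf);
    return sat32_model((int64_t)acc + (int64_t)p, ovf);
}

/* Four-lane model: acc[i] = L_mac(acc[i], x[i], y[i]) for i = 0..3.
 * The lane pairing is an assumption (see the paragraph above). */
static void mulaf16x4ss_model(int32_t acc[4], const int16_t x[4],
                              const int16_t y[4], int *ovf)
{
    for (int i = 0; i < 4; ++i)
        acc[i] = l_mac_model(acc[i], x[i], y[i], ovf);
}

/* Small self-check: 0.5 * 0.5 accumulated onto 0 gives 0.25 in 1.31. */
static void mulaf16x4ss_model_check(void)
{
    int ovf = 0;
    int32_t acc[4] = {0, 0, 0, 0};
    const int16_t x[4] = {0x4000, 0, 0, 0};
    const int16_t y[4] = {0x4000, 0, 0, 0};
    mulaf16x4ss_model(acc, x, y, &ovf);
    assert(acc[0] == 0x20000000);
    assert(ovf == 0);
}
```

Doing the arithmetic in 64 bits and clipping afterwards keeps the model free of signed-overflow undefined behaviour in C, while still saturating the product and the sum as two separate steps.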

Implementation Pipeline

In: AE_OVERFLOW Wstage, ae_mul_q1 Wstage, ae_mul_q0 Wstage, ae_mul_d1 Mstage, ae_mul_d0 Mstage
Out: AE_OVERFLOW Wstage, ae_mul_q1 Wstage, ae_mul_q0 Wstage

Protos that use AE_MULAF16X4SS

proto AE_MULAF16X4SS { inout ae_f32x2 d0, inout ae_f32x2 d1, in ae_f16x4 d2, in ae_f16x4 d3 }{}{
    AE_MULAF16X4SS d0, d1, d2, d3;
}
proto AE_MULAF16X4SS_vector { out ae_int32x4 pout, in ae_int32x4 pin, in ae_int16x4 d0, in ae_int16x4 d1 }{ae_int32x2 t0, ae_int32x2 t1}{
    AE_MOV t0, pin->d1;
    AE_MOV t1, pin->d0;
    AE_MULAF16X4SS t0, t1, d0, d1;
    AE_MOV pout->d1, t0;
    AE_MOV pout->d0, t1;
}