AI Engine-ML Intrinsics User Guide  (v2023.2)
Shift-Round-Saturate

Overview

Intrinsics for moving values from accumulator data-types to vector data-types.

Moving data from accumulator data-types back to standard vector data-types requires a reduction in precision. For fixed-point arithmetic, an appropriate transformation involving shifting out lower-order bits, rounding and/or saturation can be applied using the SRS family of intrinsics. The shift amount is specified as a parameter (in the range -4 to 59), while rounding and saturation are applied according to the global mode registers of the processor (see Mode Settings).

Note
The shift values -4 to -2 are unsafe: they produce a correct result only if truncation is selected or saturation against 0 is required.

There are three main variants of the SRS intrinsics, distinguished by the width of the input and output data-types:

  1. ssrs is used to convert integer
    • 32-bit accumulator data into a corresponding 8-bit vector
    • 64-bit accumulator data into a corresponding 16-bit vector
  2. lsrs is used to convert integer
    • 32-bit accumulator data into a corresponding 16-bit vector
    • 64-bit accumulator data into a corresponding 32-bit vector
  3. srs is used to convert floating-point accumulators into a corresponding bfloat16 vector

Both ssrs and lsrs can be prefixed with 'u' (for example, ussrs), in which case the resulting data-type is unsigned.

Example

Using the ssrs intrinsic, the 32 accumulator lanes of a v32acc32 are shifted directly to the 32 output lanes of a v32int8. Each lane is shifted, rounded and saturated independently (depending on the parameters):

v32int8 o0 = ssrs(acc0,0)
v32uint8 o1 = ussrs(acc0,0)

As the name indicates, each SRS intrinsic performs three operations: shifting (down, that is, to the right), rounding and saturation. The first step is to compute saturation:

input_datatype saturation ( input_datatype ival , int shift , bool & has_sat )
{
    input_datatype oval
    input_datatype max
    input_datatype min
    if ( get_sat() ) // Please see set_sat() and get_sat()
    {
        min = - 2^( output_precision - 1 )
        max = 2^( output_precision - 1 ) - 1
        if ( is_unsigned( output_datatype ) )
        {
            min = 0
            max = 2^( output_precision ) - 1
        }
        else if ( get_symsat() ) // Please see set_sym_sat() and get_sym_sat()
        {
            min = - 2^( output_precision - 1 ) + 1
        }
        max = max << shift
        min = min << shift
        if ( ival > max )
        {
            oval = max
            has_sat = True // See set_srs_sat()
        }
        else if ( ival < min )
        {
            oval = min
            has_sat = True // See set_srs_sat()
        }
        else
        {
            oval = ival
        }
    }
    else
        oval = ival
    return oval
}
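
The saturation step above can be modelled on a host machine with ordinary integers. The following is a minimal C++ sketch, not the intrinsic implementation: `SatConfig`, `SatResult` and `saturate_lane` are hypothetical names introduced here, the mode registers are passed in as plain booleans instead of being read with get_sat() and get_symsat(), and only non-negative shifts up to 30 are accepted so that the scaled bounds fit in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the mode registers and the output data-type.
struct SatConfig {
    bool saturate;     // plays the role of get_sat()
    bool symmetric;    // plays the role of get_symsat()
    bool is_unsigned;  // true when the output data-type is unsigned
    int  out_bits;     // output precision in bits, e.g. 8 or 16
};

struct SatResult {
    std::int64_t value;  // accumulator value after clamping (not yet shifted)
    bool saturated;      // true when the value was clamped
};

SatResult saturate_lane(std::int64_t ival, int shift, const SatConfig& cfg) {
    // Assumption of this sketch: restricted ranges keep all bounds in 64 bits.
    assert(shift >= 0 && shift <= 30);
    assert(cfg.out_bits >= 2 && cfg.out_bits <= 32);

    if (!cfg.saturate) return {ival, false};

    const std::int64_t one = 1;
    std::int64_t max = (one << (cfg.out_bits - 1)) - 1;
    std::int64_t min = -(one << (cfg.out_bits - 1));
    if (cfg.is_unsigned) {
        min = 0;
        max = (one << cfg.out_bits) - 1;
    } else if (cfg.symmetric) {
        min = -(one << (cfg.out_bits - 1)) + 1;
    }

    // Scale the bounds to accumulator precision. Multiplication is used
    // instead of << so that the negative bound is handled portably.
    const std::int64_t scale = one << shift;
    max *= scale;
    min *= scale;

    if (ival > max) return {max, true};
    if (ival < min) return {min, true};
    return {ival, false};
}
```

With an 8-bit signed output and a shift of 2, the bounds become -512 and 508, so an input of 1000 is clamped to 508 and reported as saturated. With an unsigned 8-bit output and a shift of 0, negative inputs are clamped to 0 and inputs above 255 are clamped to 255.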

The rounding factor is then determined according to the selected rounding mode (see Rounding modes). Finally, the shift is performed and the rounding factor is applied, as follows:

output_datatype lane_srs ( input_datatype ival , int shift, bool & sat)
{
    input_datatype oval_aux
    output_datatype oval
    bool round = False
    sat = False
    oval_aux = saturation( ival, shift, sat )
    round = rounding ( ival, shift ) // Please see the rounding modes available
    oval = oval_aux >> shift
    if ( round )
        oval += 1
    return oval
}
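
The per-lane sequence above (saturate, compute the rounding decision, shift, add) can be tried out with a scalar host model. The sketch below is illustrative only and makes several assumptions that are not taken from this guide: it fixes the data-types to a 32-bit accumulator lane and a signed 8-bit output, always saturates asymmetrically, uses a hypothetical round-half-up rule (`round_half_up`) in place of the configurable rounding mode, skips the rounding increment on a saturated lane so that the result stays in range, and relies on `>>` being an arithmetic shift for negative values (guaranteed from C++20, and the usual behavior on earlier compilers).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical rounding rule for this sketch only: round up when the most
// significant discarded bit is 1.
bool round_half_up(std::int32_t ival, int shift) {
    if (shift == 0) return false;  // nothing is discarded
    return ((ival >> (shift - 1)) & 1) != 0;
}

// Scalar model of one lane: 32-bit accumulator value to signed 8-bit output.
std::int8_t lane_srs_i8(std::int32_t ival, int shift, bool& sat) {
    assert(shift >= 0 && shift <= 31);  // negative shifts are not modelled

    // Output bounds scaled up to accumulator precision.
    const std::int64_t scale = std::int64_t{1} << shift;
    const std::int64_t max = 127 * scale;
    const std::int64_t min = -128 * scale;

    // Step 1: saturation on the unshifted value.
    std::int64_t v = ival;
    sat = false;
    if (v > max) {
        v = max;
        sat = true;
    } else if (v < min) {
        v = min;
        sat = true;
    }

    // Step 2: rounding decision, taken from the original input.
    const bool round = round_half_up(ival, shift);

    // Step 3: shift, then apply the rounding increment.
    std::int64_t out = v >> shift;
    if (round && !sat) out += 1;  // assumption: no increment once saturated
    return static_cast<std::int8_t>(out);
}
```

For example, with a shift of 2 the input 100 maps to 25 (no discarded bits set), 102 maps to 26 (the discarded bits are binary 10, so the value rounds up), and 1000 exceeds the scaled bound 508 and is clamped to 127 with the saturation flag set.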

The full srs call then applies the above algorithm to every lane of a vector and sets the saturation status bit if saturation is triggered in any lane:

vec_output_datatype srs ( vec_input_datatype ival , int shift, bool & sat)
{
    vec_output_datatype out
    bool sat_aux = False
    sat = False
    for i in 0 .. lanes(ival) - 1
    {
        r = lane_srs(ival[i], shift, sat_aux) // ival[i] is the i-th lane
        sat |= sat_aux
        out = upd_elem(out,i,r)
    }
    if sat
        set_srs_sat()
    return out
}
Note
Saturation status is not cleared automatically. If set, it will remain set until the user clears the status bit.
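
The sticky behavior of the saturation status can be illustrated with a small host-side model. This is a sketch under stated assumptions rather than the hardware mechanism: `g_srs_sat_status`, `clear_srs_sat_model`, `lane_model` and `srs_model` are hypothetical names, the status bit is modelled as a global boolean, and each lane uses a truncating shift with asymmetric saturation to a signed 8-bit result.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the saturation status bit.
static bool g_srs_sat_status = false;

// In this model, only an explicit call clears the status.
void clear_srs_sat_model() { g_srs_sat_status = false; }

// Simplified lane: saturate against the scaled int8 bounds, then truncate.
std::int8_t lane_model(std::int32_t ival, int shift, bool& sat) {
    assert(shift >= 0 && shift <= 31);  // negative shifts are not modelled
    const std::int64_t scale = std::int64_t{1} << shift;
    std::int64_t v = ival;
    sat = false;
    if (v > 127 * scale) {
        v = 127 * scale;
        sat = true;
    } else if (v < -128 * scale) {
        v = -128 * scale;
        sat = true;
    }
    // Arithmetic right shift assumed for negative values (guaranteed in C++20).
    return static_cast<std::int8_t>(v >> shift);
}

// Applies the lane model to every element and updates the sticky status.
template <std::size_t N>
std::array<std::int8_t, N> srs_model(const std::array<std::int32_t, N>& in,
                                     int shift) {
    std::array<std::int8_t, N> out{};
    bool any_sat = false;
    for (std::size_t i = 0; i < N; ++i) {
        bool lane_sat = false;
        out[i] = lane_model(in[i], shift, lane_sat);
        any_sat |= lane_sat;
    }
    if (any_sat) g_srs_sat_status = true;  // set, but never cleared here
    return out;
}
```

In this model, a call in which one lane saturates sets the status, and a later call with no saturating lanes leaves it set. Code that needs to know whether a particular call saturated would therefore clear the status before that call and read it afterwards.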
See also
'ups' intrinsics (Upshift)

Modules

 AIE interface
 
 Floating-point interface
 
 Size interface
 