## SSE - Saturation Arithmetic

In a previous article I presented the SSE instruction set and how to use it in C/C++ with simple examples.
In this article I will introduce other instructions that have a really nice property: they allow you to use saturation arithmetic.

### Saturation Arithmetic

With saturation arithmetic the result of an operation is bounded in a range between a minimum and a maximum value. For example, with saturation arithmetic in the range [0, 255], we have: 200 + 70 = 255 and 20 - 70 = 0.

This property would be great in C/C++, because regular arithmetic on unsigned types is modular. For example, when the result is stored back in an unsigned char, we have: 200 + 70 = 14 and 20 - 70 = 206. This phenomenon is called overflow (or wraparound), and at the CPU level it can be detected using the carry/overflow flags.
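To make the difference concrete, here is a minimal scalar sketch of both behaviours for 8-bit unsigned values. The helper names (`add_wrap_u8`, `add_sat_u8`, and so on) are hypothetical and only serve the illustration. The arithmetic is done in `int`, which is what C/C++ integer promotion does anyway, and the result is then either truncated or clamped.

```cpp
#include <cstdint>

// Hypothetical helper names, for illustration only.

// Modular (wrapping) arithmetic: the int result is truncated to 8 bits.
constexpr std::uint8_t add_wrap_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

constexpr std::uint8_t sub_wrap_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a - b);
}

// Saturation arithmetic: the int result is clamped to [0, 255].
constexpr std::uint8_t add_sat_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b > 255 ? 255 : a + b);
}

constexpr std::uint8_t sub_sat_u8(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a - b < 0 ? 0 : a - b);
}

// The values used in the text above.
static_assert(add_wrap_u8(200, 70) == 14, "modular addition wraps around");
static_assert(sub_wrap_u8(20, 70) == 206, "modular subtraction wraps around");
static_assert(add_sat_u8(200, 70) == 255, "saturating addition clamps to 255");
static_assert(sub_sat_u8(20, 70) == 0, "saturating subtraction clamps to 0");
```

The comparison and clamp in the saturating versions are exactly the "clip by hand" cost that a saturating instruction avoids.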

### Saturation with SSE

As mentioned in the previous article, SSE was initially designed for signal processing and graphics processing. For example, imagine we want to add or subtract two grayscale images: we would lose a lot of time if we had to clip each result by hand after the operation.
Fortunately, SSE provides special instructions for saturation arithmetic. With a single assembly instruction you can add several values at the same time and clip all the results.
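The grayscale case could be sketched as follows. This is only an illustration under a few assumptions: the function name `add_images_saturated` is hypothetical, the images are plain arrays of 8-bit pixels of the same length, and unaligned loads and stores (`_mm_loadu_si128` / `_mm_storeu_si128`) are used so the buffers need no particular alignment. Pixels that do not fill a whole 16-byte block are clipped by hand in a scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <emmintrin.h>  // SSE2 integer intrinsics

// Hypothetical helper: dst[i] = min(a[i] + b[i], 255) for n grayscale pixels.
void add_images_saturated(const std::uint8_t* a, const std::uint8_t* b,
                          std::uint8_t* dst, std::size_t n) {
    assert(a != nullptr && b != nullptr && dst != nullptr);

    std::size_t i = 0;

    // Main loop: 16 pixels per iteration, added and clipped by one instruction.
    for (; i + 16 <= n; i += 16) {
        const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a + i));
        const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b + i));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), _mm_adds_epu8(va, vb));
    }

    // Tail loop: remaining pixels, clipped by hand.
    for (; i < n; ++i) {
        const int sum = a[i] + b[i];
        dst[i] = static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
    }
}
```

Replacing `_mm_adds_epu8` with `_mm_subs_epu8` (and clamping the tail at 0 instead of 255) would give the saturating difference of the two images.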

### Example

In the last article we used the _mm_add_epi8 function to add 16 chars at the same time. To perform the same operation with saturation, we simply use _mm_adds_epi8 (notice the additional 's') for signed bytes, or _mm_adds_epu8 for unsigned bytes.

In the following example we add two unsigned char values and print the result (but remember you can add 16 values at the same time).

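```cpp
// Requires <emmintrin.h> (SSE2) and <iostream>.
// 16-byte aligned arrays; only the first element is set, the others are zero.
unsigned char a[16] __attribute__ ((aligned (16))) = { 200 };
unsigned char b[16] __attribute__ ((aligned (16))) = { 70 };
__m128i l = _mm_load_si128((__m128i*)a);
__m128i r = _mm_load_si128((__m128i*)b);

// Saturating unsigned addition of all 16 bytes, stored back into a.
_mm_store_si128((__m128i*)a, _mm_adds_epu8(l, r));
std::cout << (int)a[0] << std::endl;
```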

This small program should print 255, instead of 14 with modular arithmetic.

We can also subtract unsigned bytes using _mm_subs_epu8:

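```cpp
// Requires <emmintrin.h> (SSE2) and <iostream>.
// 16-byte aligned arrays; only the first element is set, the others are zero.
unsigned char a[16] __attribute__ ((aligned (16))) = { 20 };
unsigned char b[16] __attribute__ ((aligned (16))) = { 70 };
__m128i l = _mm_load_si128((__m128i*)a);
__m128i r = _mm_load_si128((__m128i*)b);

// Saturating unsigned subtraction of all 16 bytes, stored back into a.
_mm_store_si128((__m128i*)a, _mm_subs_epu8(l, r));
std::cout << (int)a[0] << std::endl;
```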

This program should output 0 (206 with modular arithmetic).

### Limitations

With SSE, saturation arithmetic is limited to signed and unsigned 8-bit and 16-bit operands: epi8, epu8, epi16, epu16. The available arithmetic operations are adds and subs.
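As an illustration of the signed 16-bit variants, the sketch below wraps `_mm_adds_epi16` and `_mm_subs_epi16`, which clamp each of the eight lanes to [-32768, 32767]. The wrapper names `add_sat_i16x8` and `sub_sat_i16x8` are hypothetical, and each pointer is assumed to reference at least eight values.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 integer intrinsics

// Hypothetical helper: out[i] = clamp(a[i] + b[i], -32768, 32767) for i in 0..7.
void add_sat_i16x8(const std::int16_t* a, const std::int16_t* b, std::int16_t* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_adds_epi16(va, vb));
}

// Hypothetical helper: out[i] = clamp(a[i] - b[i], -32768, 32767) for i in 0..7.
void sub_sat_i16x8(const std::int16_t* a, const std::int16_t* b, std::int16_t* out) {
    assert(a != nullptr && b != nullptr && out != nullptr);
    const __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    const __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out), _mm_subs_epi16(va, vb));
}
```

With signed operands the result saturates at both ends of the range, so 30000 + 10000 gives 32767 and -30000 - 10000 gives -32768.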