Fast convolutional decoding is provided through SSE operations based
on x86 intrinsics. SSE3, found on virtually all modern x86 processors,
is the minimum requirement. SSE4.1 and AVX2 are used if available.
In addition, the original code was extended with runtime SIMD
detection, so that only extensions supported by the target CPU are
used. This makes the library more portable, which is important for
distributing binary packages. Runtime SIMD detection is currently
implemented through the __builtin_cpu_supports call.
Change-Id: I1da6d71ed0564f1d684f3a836e998d09de5f0351
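The runtime detection described above can be sketched roughly as
follows. This is a minimal illustration, not the code from this commit:
the names simd_level, detect_simd and simd_name are hypothetical, and
only __builtin_cpu_init and __builtin_cpu_supports are real GCC/Clang
built-ins. On non-x86 targets or other compilers the sketch falls back
to the generic path.

```c
#include <assert.h>

/* Hypothetical SIMD levels, ordered from least to most capable. */
enum simd_level {
	SIMD_NONE = 0,
	SIMD_SSE3,
	SIMD_SSE41,
	SIMD_AVX2,
	SIMD_LEVEL_COUNT
};

/* Return the best extension the running CPU supports (assumed helper). */
static enum simd_level detect_simd(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	/* Needed if this may run before static constructors; harmless otherwise. */
	__builtin_cpu_init();

	if (__builtin_cpu_supports("avx2"))
		return SIMD_AVX2;
	if (__builtin_cpu_supports("sse4.1"))
		return SIMD_SSE41;
	if (__builtin_cpu_supports("sse3"))
		return SIMD_SSE3;
#endif
	/* No supported extension detected: use the portable code path. */
	return SIMD_NONE;
}

/* Human-readable name for a level, e.g. for logging the chosen path. */
static const char *simd_name(enum simd_level level)
{
	static const char *const names[SIMD_LEVEL_COUNT] = {
		"generic", "sse3", "sse4.1", "avx2"
	};

	assert((unsigned)level < SIMD_LEVEL_COUNT);
	return names[level];
}
```

A library would typically call such a detection routine once at
initialization and store a function pointer to the matching decoder, so
that a single binary package runs on CPUs with and without the newer
extensions.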
|
|
Add a separate, faster convolutional decoding implementation for rates
up to N=4 and constraint lengths of K=5 and K=7, which covers most
codes used in GSM. The decoding algorithm exploits the symmetric
structure of the Viterbi add-compare-select (ACS) operation, commonly
known as the ACS butterfly. This shift-register optimization can be
found in the well-known text by Dave Forney.
Forney, G.D., "The Viterbi Algorithm," Proc. of the IEEE, March 1973.
The implementation is not architecture-specific and improves
performance on x86 as well as ARM processors. The existing API is
unchanged, with the optimized code being called internally for
supported codes.
The original code was relicensed under GPLv2-or-later with permission
of the copyright holder, Tom Tsou.
Change-Id: I74d355274b4176a7d924f91ef3c96912ce338fb2
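The ACS butterfly mentioned above can be illustrated with a short
sketch. It is not the code from this commit; it assumes a maximizing
path metric, a shift register in which old states 2j and 2j+1 both feed
new states j and j + num_states/2, and generator polynomials whose
first and last taps are both set, so that the four branches of one
butterfly share a single branch metric up to sign. The names
acs_butterfly and acs_step are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * One butterfly: old states 2j (pm0) and 2j+1 (pm1) feed new states
 * j (lo) and j + half (hi). Under the stated assumptions the branch
 * metrics are +bm, -bm for lo and -bm, +bm for hi.
 * A decision of 0 means the survivor came from 2j, 1 from 2j+1.
 */
static void acs_butterfly(int pm0, int pm1, int bm,
			  int *new_lo, int *new_hi,
			  uint8_t *dec_lo, uint8_t *dec_hi)
{
	int lo_a = pm0 + bm;	/* 2j   -> j        */
	int lo_b = pm1 - bm;	/* 2j+1 -> j        */
	int hi_a = pm0 - bm;	/* 2j   -> j + half */
	int hi_b = pm1 + bm;	/* 2j+1 -> j + half */

	/* Add, compare, select: keep the larger candidate (ties pick 2j). */
	*dec_lo = lo_b > lo_a;
	*new_lo = *dec_lo ? lo_b : lo_a;
	*dec_hi = hi_b > hi_a;
	*new_hi = *dec_hi ? hi_b : hi_a;
}

/* One trellis step: num_states/2 butterflies, one branch metric each. */
static void acs_step(const int *old_pm, int *new_pm, uint8_t *dec,
		     const int *bm, int num_states)
{
	int half = num_states / 2;
	int j;

	assert(num_states >= 2 && num_states % 2 == 0);

	for (j = 0; j < half; j++)
		acs_butterfly(old_pm[2 * j], old_pm[2 * j + 1], bm[j],
			      &new_pm[j], &new_pm[j + half],
			      &dec[j], &dec[j + half]);
}
```

Because each butterfly needs only one branch metric and touches
adjacent old states, the loop body is regular and free of
data-dependent addressing, which is what makes it a good fit for both
plain C and SIMD implementations.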
|