Table of contents

Overview | Supported environment | Credit | Partners | License | History | Related | Publication

Overview

SLEEF stands for SIMD Library for Evaluating Elementary Functions. It implements manually vectorized versions of all C99 real floating-point math functions and can utilize the SIMD instructions available on modern processors. SLEEF is designed to perform computation efficiently with SIMD instructions by reducing the use of conditional branches and scatter/gather memory access. Our benchmarks show that the performance of SLEEF is comparable to that of the best commercial library.
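To illustrate what reducing conditional branches can look like, here is a minimal scalar sketch in plain C. It is not taken from SLEEF: the names `mask_from`, `select_double` and `relu_branchless` are hypothetical, and the 64-bit integer mask only imitates the all-ones/all-zeros lane masks that SIMD comparison instructions typically produce.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its bit pattern and back (memcpy avoids aliasing issues). */
static uint64_t bits_of(double x) { uint64_t u; memcpy(&u, &x, sizeof u); return u; }
static double double_of(uint64_t u) { double x; memcpy(&x, &u, sizeof x); return x; }

/* All-ones mask if cond is nonzero, all-zeros otherwise. */
static uint64_t mask_from(int cond) { return (uint64_t)0 - (uint64_t)(cond != 0); }

/* Branch-free select: yields a where the mask is all ones, b where it is all zeros. */
static double select_double(uint64_t mask, double a, double b) {
    return double_of((bits_of(a) & mask) | (bits_of(b) & ~mask));
}

/* Replaces every non-positive element with 0.0 without an if statement in the loop body. */
void relu_branchless(const double *in, double *out, size_t n) {
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        uint64_t m = mask_from(in[i] > 0.0);
        out[i] = select_double(m, in[i], 0.0);
    }
}
```

Because every element follows the same instruction sequence, a loop of this shape maps naturally onto SIMD lanes, where a data-dependent `if` would not.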

Unlike closed-source commercial libraries, SLEEF is designed to work with various architectures, operating systems and compilers. It is distributed under the Boost Software License, which is a permissive open source license. SLEEF can be easily ported to other architectures by writing a helper file, which is a thin abstraction layer over SIMD intrinsics. SLEEF also provides dispatchers that automatically choose the best subroutines for the computer on which the library is executed. To further optimize application code that calls SLEEF functions, link time optimization (LTO) can be used to reduce the overhead of function calls, and the SLEEF build system supports LTO. The library can also generate header files in which all library functions are defined as inline functions. With these header files, SLEEF can be used for GPGPU and WebAssembly. In addition to the vectorized functions, SLEEF provides scalar functions. Calls to these scalar SLEEF functions can be auto-vectorized by GCC.
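The idea of a helper file can be sketched as follows. This is an illustration under stated assumptions, not SLEEF's actual helper interface: the type `vdouble` and the functions `vd_load`, `vd_store`, `vd_set1`, `vd_mla` and `poly2_array` are hypothetical names chosen for this example, and the "vector" is a plain C struct standing in for a real SIMD register.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper layer: a portable C stand-in for a 2-lane SIMD type.
   A real port would implement these same operations with intrinsics. */
#define VLEN 2
typedef struct { double lane[VLEN]; } vdouble;

static vdouble vd_load(const double *p) {
    vdouble v;
    for (int i = 0; i < VLEN; i++) v.lane[i] = p[i];
    return v;
}

static void vd_store(double *p, vdouble v) {
    for (int i = 0; i < VLEN; i++) p[i] = v.lane[i];
}

static vdouble vd_set1(double c) {
    vdouble v;
    for (int i = 0; i < VLEN; i++) v.lane[i] = c;
    return v;
}

/* Per-lane multiply-add: x * y + z. */
static vdouble vd_mla(vdouble x, vdouble y, vdouble z) {
    vdouble v;
    for (int i = 0; i < VLEN; i++) v.lane[i] = x.lane[i] * y.lane[i] + z.lane[i];
    return v;
}

/* Kernel written only against the helper layer: evaluates the polynomial
   1 + x + x^2/2 for every element, using Horner's scheme. */
void poly2_array(const double *in, double *out, size_t n) {
    assert(n % VLEN == 0);  /* this sketch does not handle a remainder */
    for (size_t i = 0; i < n; i += VLEN) {
        vdouble x = vd_load(in + i);
        vdouble r = vd_mla(x, vd_set1(0.5), vd_set1(1.0)); /* 0.5*x + 1 */
        r = vd_mla(r, x, vd_set1(1.0));                    /* (0.5*x + 1)*x + 1 */
        vd_store(out + i, r);
    }
}
```

The kernel never touches an intrinsic directly, so porting it to another architecture would only require reimplementing the four helper functions.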

The library contains implementations of all C99 real FP math functions in double precision and single precision. For a subset of the elementary functions, different accuracy levels can be chosen: there are versions with up to 1 ULP of error (this is the maximum error, not the average) and even faster versions with a few ULPs of error. For non-finite inputs and outputs, the functions return correct results as specified in the C99 standard. All functions in the library are thoroughly tested, and their evaluation error is confirmed to be within the designed limit by comparing the returned values against high-precision evaluation using the GNU MPFR Library.
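As a rough illustration of what an error bound in ULPs means, the following sketch measures the error of a single-precision result against a more precise reference. It is an assumption-laden example, not SLEEF's tester: it uses a `double` reference instead of MPFR, the names `ulp_size_f` and `ulp_error_f` are hypothetical, and it adopts one common definition of ULP (the gap between the reference rounded to `float` and the next larger `float` in magnitude), which may differ from the definition SLEEF uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Size of one ULP at x: the gap between |x| and the next larger float. */
static double ulp_size_f(float x) {
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    u &= 0x7fffffffu;            /* clear the sign bit: work with |x| */
    assert(u < 0x7f7fffffu);     /* reject FLT_MAX, infinity and NaN */
    float a, next;
    memcpy(&a, &u, sizeof a);
    u += 1;                      /* bit pattern of the next float above |x| */
    memcpy(&next, &u, sizeof next);
    return (double)next - (double)a;
}

/* Error of a single-precision result, in ULPs, against a double reference. */
double ulp_error_f(float got, double ref) {
    double diff = (double)got - ref;
    if (diff < 0.0) diff = -diff;
    return diff / ulp_size_f((float)ref);
}
```

Under this definition, a function advertised with a 1 ULP bound would keep `ulp_error_f` at or below 1.0 for every tested input, which is a statement about the worst case rather than the typical case.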

As of version 3.6, SLEEF also includes a quad-precision math library. This library includes fully vectorized IEEE 754 quadruple-precision (QP) functions that correspond to the standard C math functions. It also includes I/O functions for converting between QP numbers and strings.

SLEEF also includes a library for the discrete Fourier transform (DFT). These subroutines are fully vectorized, heavily unrolled, and parallelized so that modern SIMD instructions and multiple cores can be utilized for efficient computation. The library has an API similar to that of FFTW for easy migration. The subroutines can utilize long vectors of up to 2048 bits. The helper files for abstracting SIMD intrinsics are shared with SLEEF libm, so it is easy to port the DFT subroutines to other architectures. Preliminary benchmark results are now available.

Supported environments

This library supports the following architectures:

The supported combinations of the architecture, operating system and compiler are shown in Table 1.1.

Table 1.1: Environment support matrix
GCC Clang Intel Compiler MSVC
x86_64, Linux Supported Supported Supported N/A
AArch64, Linux Supported Supported N/A N/A
x86_64, macOS Supported(*2) Supported(*2) N/A
x86_64, Windows Supported(Cygwin)(*3) Supported Supported
AArch32, Linux Supported(*1) Supported(*1) N/A N/A
PowerPC (64 bit), Linux Supported Supported N/A N/A
System/390 (64 bit), Linux Supported Supported N/A N/A
x86_64, FreeBSD Supported N/A N/A
x86 (32 bit), Linux Supported Supported N/A
AArch64, macOS Supported Supported N/A N/A
AArch64, Android Preliminary N/A N/A
AArch64, iOS Preliminary N/A N/A
RISC-V (64-bit), Linux Supported Supported N/A N/A

The supported compiler versions are as follows.

  • GCC : version 5 and later
  • Clang : version 6 and later
  • Intel Compiler : ICC version 17
  • MSVC : Visual Studio 2019

(*1) NEON has only single precision support. The computation results are not fully accurate because NEON is not IEEE 754-compliant.

(*2) LTO is not supported.

(*3) AVX functions are not supported on Cygwin, because the Cygwin ABI does not support AVX. SLEEF also builds with MinGW for Windows on x86, but only DFT can be tested for now.

(*4) Some compiler versions do not support certain vector extensions. For instance, SVE is only supported from GCC version 9 onwards. Similarly, the RISC-V interface in SLEEF is based on version 1.0 of the intrinsics, which is only supported from LLVM version 17 and GCC version 14 onwards. The toolchain files provide some information on supported compiler versions.

All functions in the library are thread safe unless otherwise noted.

Credit

Partner institutes and corporations

   
As the leading company in a wide range of information technologies, IBM participates through David Edelsohn.
As the leading IP company in semiconductor design, ARM participates through Pierre Blanchard, Joe Ramsay and Joana Cruz.
As the leading company in developing a video game engine, Unity Technologies participates through Alexandre Mutel.

License

SLEEF is distributed under the Boost Software License, Version 1.0.

   
The Boost Software License is OSI-certified. See this page for more information about the Boost Software License.

History

See Changelog for a full history of changes to SLEEF.

Publication

  • Naoki Shibata and Francesco Petrogalli : SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions, in IEEE Transactions on Parallel and Distributed Systems, DOI:10.1109/TPDS.2019.2960333 (Dec. 2019). [PDF]
  • Francesco Petrogalli and Paul Walker : LLVM and the automatic vectorization of loops invoking math routines: -fsimdmath, 2018 IEEE/ACM 5th Workshop on the LLVM Compiler Infrastructure in HPC (LLVM-HPC), pp. 30-38, DOI:10.1109/LLVM-HPC.2018.8639354 (Nov. 2018). [PDF]