llvm-project

Commit Graph

Author	SHA1	Message	Date
Tue Ly	a752460d73	[libc][math] Implement exp10f function correctly rounded to all rounding modes. Implement exp10f function correctly rounded to all rounding modes. Algorithm: perform range reduction to reduce ``` 10^x = 2^(hi + mid) * 10^lo ``` where: ``` hi is an integer, 0 <= mid * 2^5 < 2^5, -log10(2) / 2^6 <= lo <= log10(2) / 2^6 ``` Then `2^mid` is stored in a table of 32 entries and the product `2^hi * 2^mid` is computed by adding `hi` into the exponent field of `2^mid`. `10^lo` is then approximated by a degree-5 minimax polynomial generated by Sollya with: ``` > P = fpminimax((10^x - 1)/x, 4, [\|D...\|], [-log10(2)/64, log10(2)/64]); ``` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 10.215 System LIBC reciprocal throughput : 7.944 LIBC reciprocal throughput : 38.538 LIBC reciprocal throughput : 12.175 (with `-msse4.2` flag) LIBC reciprocal throughput : 9.862 (with `-mfma` flag) $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh exp10f --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 40.744 System LIBC latency : 37.546 BEFORE LIBC latency : 48.989 LIBC latency : 44.486 (with `-msse4.2` flag) LIBC latency : 40.221 (with `-mfma` flag) ``` This patch relies on https://reviews.llvm.org/D134002 Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D134104	2022-09-19 10:01:40 -04:00
Tue Ly	463dcc8749	[libc][math] Implement acosf function correctly rounded for all rounding modes. Implement acosf function correctly rounded for all rounding modes. We perform range reduction as follows: - When `\|x\| < 2^(-10)`, we use cubic Taylor polynomial: ``` acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 / 6. ``` - When `2^(-10) <= \|x\| <= 0.5`, we use the same approximation that is used for `asinf(x)` when `\|x\| <= 0.5`: ``` acos(x) = pi/2 - asin(x) ~ pi/2 - x - x^3 * P(x^2). ``` - When `0.5 < x <= 1`, we use the double angle formula: `cos(2y) = 1 - 2 * sin^2 (y)` to reduce to: ``` acos(x) = 2 * asin( sqrt( (1 - x)/2 ) ) ``` - When `-1 <= x < -0.5`, we reduce to the positive case above using the formula: ``` acos(x) = pi - acos(-x) ``` Performance benchmark using perf tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh acosf GNU libc version: 2.35 GNU libc release: stable CORE-MATH reciprocal throughput : 28.613 System LIBC reciprocal throughput : 29.204 LIBC reciprocal throughput : 24.271 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 55.554 System LIBC latency : 76.879 LIBC latency : 62.118 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133550	2022-09-09 09:55:30 -04:00
Tue Ly	e2f065c2a3	[libc][math] Implement asinf function correctly rounded for all rounding modes. Implement asinf function correctly rounded for all rounding modes. For `\|x\| <= 0.5`, we approximate `asin(x)` by ``` asin(x) = x * P(x^2) ``` where `P(X^2) = Q(X)` is a degree-20 minimax even polynomial approximating `asin(x)/x` on `[0, 0.5]` generated by Sollya with: ``` > Q = fpminimax(asin(x)/x, [\|0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20\|], [\|1, D...\|], [0, 0.5]); ``` When `\|x\| > 0.5`, we perform range reduction as follows: Assume further that `0.5 < x <= 1`, and let: ``` y = asin(x) ``` We will use the double angle formula: ``` cos(2X) = 1 - 2 sin^2(X) ``` and the complement angle identity: ``` x = sin(y) = cos(pi/2 - y) = 1 - 2 sin^2 (pi/4 - y/2) ``` So: ``` sin(pi/4 - y/2) = sqrt( (1 - x)/2 ) ``` And hence: ``` pi/4 - y/2 = asin( sqrt( (1 - x)/2 ) ) ``` Equivalently: ``` asin(x) = y = pi/2 - 2 * asin( sqrt( (1 - x)/2 ) ) ``` Let `u = (1 - x)/2`, then ``` asin(x) = pi/2 - 2 * asin( sqrt(u) ) ``` Moreover, since `0.5 < x <= 1`, ``` 0 <= u < 1/4, and 0 <= sqrt(u) < 0.5. ``` And hence we can reuse the same polynomial approximation of `asin(x)` when `\|x\| <= 0.5`: ``` asin(x) = pi/2 - 2 * sqrt(u) * P(u). ``` Performance benchmark using `perf` tool from the CORE-MATH project on Ryzen 1700: ``` $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf CORE-MATH reciprocal throughput : 23.418 System LIBC reciprocal throughput : 27.310 LIBC reciprocal throughput : 22.741 $ CORE_MATH_PERF_MODE="rdtsc" ./perf.sh asinf --latency GNU libc version: 2.35 GNU libc release: stable CORE-MATH latency : 58.884 System LIBC latency : 62.055 LIBC latency : 62.037 ``` Reviewed By: orex, zimmermann6 Differential Revision: https://reviews.llvm.org/D133400	2022-09-07 19:27:47 -04:00
Kirill Okhotnikov	77e1d9beed	[libc][math] Added atanf function. Performance by core-math (core-math/glibc 2.31/current llvm-14): 28.879/20.843/20.15 Differential Revision: https://reviews.llvm.org/D132842	2022-08-30 22:39:54 +02:00
Kirill Okhotnikov	6c1fc7e430	[libc][math] Added atanhf function. Performance by core-math (core-math/glibc 2.31/current llvm-14): 10.845/43.174/13.467 The review is done on top of D132809. Differential Revision: https://reviews.llvm.org/D132811	2022-08-30 22:39:54 +02:00
Guillaume Chatelet	1e5b3ce707	[reland][NFC][libc] standardize string_view	2022-08-23 12:40:49 +00:00
Guillaume Chatelet	eebe9b2964	Revert "[reland][NFC][libc] standardize string_view" This reverts commit `df99774ef7`.	2022-08-23 12:40:48 +00:00
Guillaume Chatelet	df99774ef7	[reland][NFC][libc] standardize string_view	2022-08-23 12:10:28 +00:00
Guillaume Chatelet	54cfe5f778	Revert "[reland][NFC][libc] standardize string_view" This reverts commit `522d29a6a7`.	2022-08-23 11:50:56 +00:00
Guillaume Chatelet	522d29a6a7	[reland][NFC][libc] standardize string_view	2022-08-23 11:48:53 +00:00
Guillaume Chatelet	0df7e1b0e5	Revert "[reland][NFC][libc] standardize string_view" This reverts commit `187099da1c`.	2022-08-23 11:00:22 +00:00
Guillaume Chatelet	187099da1c	[reland][NFC][libc] standardize string_view	2022-08-23 10:55:57 +00:00
Guillaume Chatelet	aa59c9810a	[libc][NFC] Use STL case for string_view	2022-08-22 15:25:14 +00:00
Kirill Okhotnikov	5ef987c985	[libc][math] Added tanhf function. Correct rounding function. Performance ~2x faster than glibc analog. Performance (llvm 12 intel): ``` CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='' ./perf.sh tanhf GNU libc version: 2.31 GNU libc release: stable 13.279 37.492 18.145 CORE_MATH_PERF_MODE=rdtsc PERF_ARGS='--latency' ./perf.sh tanhf GNU libc version: 2.31 GNU libc release: stable 40.658 109.582 66.568 ``` Differential Revision: https://reviews.llvm.org/D130780	2022-08-01 22:43:00 +02:00
Kirill Okhotnikov	a7f55f0805	[libc][math] Added sinhf function. Differential Revision: https://reviews.llvm.org/D129278	2022-07-29 17:20:53 +02:00
Kirill Okhotnikov	fcb9d7e2cf	[libc][math] Added coshf function. Differential Revision: https://reviews.llvm.org/D129275	2022-07-29 16:57:28 +02:00
Guillaume Chatelet	f72261508a	[libc][NFC] Use STL case for type_traits Migrating all private STL code to the standard STL case but keeping it under the CPP namespace to avoid confusion. Starting with the type_traits header. Differential Revision: https://reviews.llvm.org/D130727	2022-07-29 09:57:03 +00:00
Alex Brachet	04c681d195	[libc] Specify rounding mode for strto[f\|d] tests The specified rounding mode will be used and restored to what it was before the test ran. Additionally, it moves ForceRoundingMode and RoundingMode out of MPFRUtils to be used in more places. Differential Revision: https://reviews.llvm.org/D129685	2022-07-13 20:20:30 +00:00
Kirill Okhotnikov	b8e8012aa2	[libc][math] fmod/fmodf implementation. This is an implementation of the floating-point remainder function fmod from the standard libm. The underlying algorithm was developed by me, but it has probably been invented before. Some features of the implementation: 1. The code is written in more-or-less modern C++. 2. One general implementation for both float and double precision numbers. 3. Platform/architecture dependent and independent code and tests are split. 4. Tests cover 100% of the code for both float and double numbers. Test cases with NaN/Inf etc. are copied from glibc. 5. The new implementation is in general 2-4 times faster for “regular” x,y values. It can be 20 times faster for huge x/y values, but can also be 2 times slower in the double denormalized range (according to the perf tests provided). 6. Two different implementations of the division loop are provided. On some platforms division can be a very time-consuming operation. Depending on the platform, it can be 3-10 times slower than multiplication. Performance tests: The test is based on the core-math project (https://gitlab.inria.fr/core-math/core-math). At Tue Ly's suggestion, I took the hypot function and used it as a template for fmod, preserving all test cases. `./check.sh <--special\|--worst> fmodf` passed. `CORE_MATH_PERF_MODE=rdtsc ./perf.sh fmodf` results are ``` GNU libc version: 2.35 GNU libc release: stable 21.166 <-- FPU 51.031 <-- current glibc 37.659 <-- this fmod version. ```	2022-06-24 23:09:14 +02:00
Guillaume Chatelet	c28a522fc7	[libc][NFC] moving template specialization outside class declaration This is necessary to get llvm-libc to compile with GCC. This patch is extracted from D119002. Differential Revision: https://reviews.llvm.org/D119142	2022-02-08 10:35:44 +00:00
Tue Ly	9e7688c71e	[libc] Implement log1pf correctly rounded to all rounding modes. Implement log1pf correctly rounded to all rounding modes relying on logf implementation for exponent > 2^(-8). Reviewed By: sivachandra, zimmermann6 Differential Revision: https://reviews.llvm.org/D118962	2022-02-07 16:17:18 -05:00
Tue Ly	e581841e8c	[libc] Implement log10f correctly rounded for all rounding modes. Based on RLIBM implementation similar to logf and log2f. Most of the exceptional inputs are the exact powers of 10. Reviewed By: sivachandra, zimmermann6, santoshn, jpl169 Differential Revision: https://reviews.llvm.org/D118093	2022-01-25 10:33:39 -05:00
Tue Ly	63d2df003e	[libc] Implement correctly rounded log2f based on RLIBM library. Implement log2f based on RLIBM library correctly rounded for all rounding modes. Reviewed By: sivachandra, michaelrj, santoshn, jpl169, zimmermann6 Differential Revision: https://reviews.llvm.org/D115828	2022-01-14 12:40:49 -05:00
Tue Ly	c386d6eb2d	[libc] Fix precision constants for long double in MPFRUtils.cpp.	2022-01-13 21:33:05 -05:00
Tue Ly	8cd81274ff	[libc] Add multithreading support for exhaustive testing and MPFRUtils. Add threading support for exhaustive testing and MPFRUtils. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D117028	2022-01-13 13:46:14 -05:00
Tue Ly	cce6507767	[libc] Add rounding mode support for MPFR testing macros. Add an extra argument for rounding mode to EXPECT_MPFR_MATCH and ASSERT_MPFR_MATCH macros. Reviewed By: sivachandra, michaelrj Differential Revision: https://reviews.llvm.org/D116777	2022-01-13 13:28:50 -05:00
Siva Chandra Reddy	60509623c4	[libc][obvious] Fix style of MPFRWrapper.	2021-12-23 23:19:42 +00:00
Tue Ly	d08a801b5f	[libc] Implement correctly rounded logf based on RLIBM library. Implement correctly rounded logf based on RLIBM library: https://people.cs.rutgers.edu/~sn349/rlibm/. Reviewed By: sivachandra, santoshn, jpl169, zimmermann6 Differential Revision: https://reviews.llvm.org/D115408	2021-12-16 13:43:15 -05:00
Michael Jones	1c92911e9e	[libc] apply new lint rules This patch applies the lint rules described in the previous patch. There was also a significant amount of effort put into manually fixing things, since all of the templated functions, or structs defined in /spec, were not updated and had to be handled manually. Reviewed By: sivachandra, lntue Differential Revision: https://reviews.llvm.org/D114302	2021-12-07 10:49:47 -08:00
Tue Ly	32568fc95e	[libc] Fix a bug in MPFRUtils making ULP values off by 2^(-mantissaWidth). Fix a bug in MPFRUtils making ULP values off by 2^(-mantissaWidth) and incorrect eps for denormal numbers. Differential Revision: https://reviews.llvm.org/D114878	2021-12-02 09:07:46 -05:00
Guillaume Chatelet	0aea170b97	[libc] Add more robust compile time architecture detection We may want to restrict the detected platforms to only `x86_64` and `aarch64`. There is still custom detection in api.td, but I don't think we can handle these: - config/linux/api.td:205 - config/linux/api.td:199 Differential Revision: https://reviews.llvm.org/D112818	2021-11-02 11:00:33 +00:00
Guillaume Chatelet	fe953b15cf	Revert "[libc] Add more robust compile time architecture detection" This reverts commit `a72e249986`.	2021-10-29 20:25:55 +00:00
Guillaume Chatelet	a72e249986	[libc] Add more robust compile time architecture detection We may want to restrict the detected platforms to only `x86_64` and `aarch64`. There is still custom detection in api.td, but I don't think we can handle these: - config/linux/api.td:205 - config/linux/api.td:199 Differential Revision: https://reviews.llvm.org/D112818	2021-10-29 20:15:12 +00:00
Siva Chandra Reddy	6c3f53c7ba	[libc][NFC] Move test related pieces from FPUtil to util/UnitTest. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D112673	2021-10-29 15:37:30 +00:00
Siva Chandra Reddy	f362aea42d	[libc][NFC] Move utils/CPP to src/__support/CPP. The idea is to move all pieces related to the actual libc sources to the "src" directory. This allows downstream users to ship and build just the "src" directory. Reviewed By: michaelrj Differential Revision: https://reviews.llvm.org/D112653	2021-10-28 15:50:00 +00:00
Siva Chandra Reddy	ca6b354229	[libc] Add range reduction functions based on the Payne and Hanek algorithm. These functions will be used in a future patch to implement trigonometric functions. Unit tests have been added, but to the libc-long-running-tests suite. The unit tests are long running because we compare against MPFR computations performed at 1280 bits of precision. Some cleanups or elimination of repeated patterns can be done as follow-up changes. Differential Revision: https://reviews.llvm.org/D104817	2021-08-23 05:18:41 +00:00
Michael Jones	c120edc7b3	[libc][nfc] move ctype_utils and FPUtils to __support Some ctype functions are called from other libc functions (e.g. isspace is used in atoi). By moving ctype_utils.h to __support it becomes easier to include just the implementations of these functions. For these reasons the implementation for isspace was moved into ctype_utils as well. FPUtils was moved to simplify the build order, and to clarify which files are a part of the actual libc. Many files were modified to accommodate these changes, mostly changing the #include paths. Reviewed By: sivachandra Differential Revision: https://reviews.llvm.org/D107600	2021-08-06 17:29:41 +00:00
Siva Chandra Reddy	dba74c6817	[libc] Make ULP error reflect the bit distance more closely. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D105334	2021-07-02 16:56:01 +00:00
Siva Chandra Reddy	d5700bb694	[libc] Calculate ulp error after rounding MPFR result to the result type. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D104615	2021-06-23 20:29:46 +00:00
Tue Ly	4e5f8b4d8d	[libc] Add implementation of expm1f. Use expm1f(x) = exp(x) - 1 for \|x\| > ln(2). For \|x\| <= ln(2), divide it into 3 subintervals: [-ln2, -1/8], [-1/8, 1/8], [1/8, ln2] and use a degree-6 polynomial approximation generated by Sollya's fpminimax for each interval. Errors < 1.5 ULPs when we use fma to evaluate the polynomials. Differential Revision: https://reviews.llvm.org/D101134	2021-06-10 14:58:34 -04:00
Siva Chandra Reddy	861dc75906	[libc] Add x86_64 implementations of double precision cos, sin and tan. The implementations use the x86_64 FPU instructions. These instructions are extremely slow compared to a polynomial based software implementation. Also, their accuracy falls drastically once the input goes beyond 2PI. To improve both the speed and accuracy, we will be taking the following approach going forward: 1. As a follow up to this CL, we will implement a range reduction algorithm which will expand the accuracy to the entire double precision range. 2. After that, we will replace the HW instructions with a polynomial implementation to improve the run time. After step 2, the implementations will be accurate, performant and target architecture independent. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D102384	2021-05-13 19:02:00 +00:00
Siva Chandra Reddy	6666e0d7a2	[libc] Make FPBits a union. This helps us avoid the uncomfortable reinterpret-casts. Avoiding the reinterpret casts prevents us from tripping the sanitizers as well. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D100360	2021-04-13 09:21:35 -07:00
Siva Chandra Reddy	dbb131d53a	[libc] Add a standalone flavor of an equivalent of std::string_view. This class is to serve as a replacement for llvm::StringRef as part of the plans to limit dependency on other parts of LLVM. One use of llvm::StringRef in MPFRWrapper has been replaced with the new class. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D97330	2021-02-23 15:40:26 -08:00
Siva Chandra Reddy	881402ce62	[libc][NFC] Eliminate a couple of dependencies on llvm/ADT/StringExtras.h.	2021-02-22 21:41:24 -08:00
Tue Ly	4726bec8f2	[libc] Add implementation of fmaf. Differential Revision: https://reviews.llvm.org/D94018	2021-01-06 17:14:20 -05:00
Siva Chandra Reddy	ff6fd38552	[libc] Add implementations of rounding functions which depend on rounding mode. Namely, implementations for rint, rintf, rintl, lrint, lrintf, lrintl, llrint, llrintf and llrintl have been added. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D93889	2020-12-29 22:22:02 -08:00
Siva Chandra Reddy	7aeb3804c4	[libc] Add implementations of lround[f\|l] and llround[f\|l]. A new function to MPFRWrapper has been added, which is used to set up the unit tests. Reviewed By: lntue Differential Revision: https://reviews.llvm.org/D93007	2020-12-11 11:12:40 -08:00
Tue Ly	abf1c82dcc	[libc] Extend MPFRMatcher to handle 2-input-1-output and support hypot function. Differential Revision: https://reviews.llvm.org/D87514	2020-09-14 14:53:46 -04:00
Siva Chandra Reddy	fb542b0b8c	[libc][MPFRWrapper] Provide a way to include MPFR header in downstream repos. Reviewed By: asteinhauser Differential Revision: https://reviews.llvm.org/D87412	2020-09-09 12:58:58 -07:00
Siva Chandra Reddy	fe44992b79	[libc][NFC] For remquo quotient, compare only 3 bits of MPFR and libc results.	2020-08-25 23:42:06 -07:00
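
The range reduction described in the asinf (e2f065c2a3) and acosf (463dcc8749) commit messages in the table above can be checked numerically with a short sketch. This is not the libc implementation: the polynomial coefficients are not listed in the commit messages, so the hypothetical helper `asin_core` uses `math.asin` as a stand-in for `t * P(t^2)` on `|t| <= 0.5`. The function names, and the use of odd symmetry for negative inputs to `asin_reduced`, are assumptions made for illustration.

```python
import math


def asin_core(t):
    """Stand-in for t * P(t^2), valid only for |t| <= 0.5 (hypothetical helper)."""
    assert abs(t) <= 0.5
    return math.asin(t)


def asin_reduced(x):
    """asin(x) on [-1, 1] using only asin_core on [-0.5, 0.5]."""
    if abs(x) <= 0.5:
        return asin_core(x)
    # For 0.5 < |x| <= 1: u = (1 - |x|)/2 lies in [0, 1/4), so sqrt(u) < 0.5.
    u = (1.0 - abs(x)) / 2.0
    r = math.pi / 2 - 2.0 * asin_core(math.sqrt(u))
    # asin is odd, so the negative case reuses the positive one (assumption
    # of this sketch; the commit message only discusses 0.5 < x <= 1).
    return math.copysign(r, x)


def acos_reduced(x):
    """acos(x) on [-1, 1] following the case split in the acosf commit."""
    if abs(x) <= 0.5:
        return math.pi / 2 - asin_core(x)
    if x > 0.5:
        return 2.0 * asin_core(math.sqrt((1.0 - x) / 2.0))
    return math.pi - acos_reduced(-x)
```

Comparing `asin_reduced` and `acos_reduced` against `math.asin` and `math.acos` at a few points in `[-1, 1]` confirms that the identities hold; it says nothing about correct rounding, which depends on the actual polynomial and its evaluation.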


63 Commits
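
The exp10f range reduction from commit a752460d73, `10^x = 2^(hi + mid) * 10^lo` with a 32-entry table for `2^mid`, can be sketched in double precision as follows. This is an illustration under stated assumptions, not the libc code: `10.0 ** lo` stands in for the Sollya-generated polynomial, `math.ldexp` stands in for adding `hi` into the exponent field, and all names (`TABLE`, `exp10_reduced`) are hypothetical.

```python
import math

# 2^(i/32) for i = 0..31, i.e. 2^mid with mid * 2^5 an integer in [0, 32).
TABLE = [2.0 ** (i / 32.0) for i in range(32)]


def exp10_reduced(x):
    """Evaluate 10^x as 2^hi * 2^mid * 10^lo (sketch, moderate |x| only)."""
    # k = 32 * (hi + mid) is the nearest integer to 32 * x * log2(10).
    k = round(x * math.log2(10.0) * 32.0)
    # hi is an integer, idx = mid * 2^5 is in 0..31 (floor division also
    # handles negative k correctly in Python).
    hi, idx = divmod(k, 32)
    # Remainder: |lo| <= log10(2) / 64 up to rounding error.
    lo = x - k * (math.log10(2.0) / 32.0)
    # 10.0 ** lo is a stand-in for the degree-5 polynomial approximation.
    return math.ldexp(TABLE[idx] * 10.0 ** lo, hi)
```

The table lookup and `ldexp` step are exact scalings by a power of two, so in this sketch the approximation error comes from the table entries and the stand-in for `10^lo`.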