llvm-project

Commit Graph

Author	SHA1	Message	Date
David Tenty	c455961479	[compiler-rt][AIX] Add CMake support for 32-bit Power builds This patch enables support for building compiler-rt builtins for 32-bit Power arch on AIX. For now, we leave out the specialized ppc builtin implementations for 128-bit long double and friends since those will need some special handling for AIX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D87383	2020-09-22 16:08:58 -04:00
David Tenty	89074bdc81	[AIX][compiler-rt] Use the AR/ranlib mode flag for 32-bit and 64-bit mode since we will be building both 32-bit and 64-bit compiler-rt builtins from a single configuration. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D87113	2020-09-22 11:10:47 -04:00
Alex Richardson	aa85c6f2a5	[compiler-rt] Fix atomic support functions on 32-bit architectures The code currently uses __c11_atomic_is_lock_free() to detect whether an atomic operation is natively supported. However, this can result in a runtime function call to determine whether the given operation is lock-free and clang generating a call to e.g. __atomic_load_8 since the branch is not a constant zero. Since we are implementing those runtime functions, we must avoid those calls. This patch replaces __c11_atomic_is_lock_free() with __atomic_always_lock_free() which always results in a compile-time constant value. This problem was found while compiling atomic.c for MIPS32 since the -Watomic-alignment warning was being triggered and objdump showed an undefined reference to __atomic_is_lock_free. In addition to fixing 32-bit platforms this also enables the 16-byte case that was disabled in r153779 (`185f2edd70`). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86510	2020-09-21 10:21:11 +01:00
Craig Topper	c9af34027b	Add __divmodti4 to match libgcc. gcc has used this on x86-64 since at least version 7. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D80506	2020-09-16 21:56:01 -07:00
Stephen Hines	516a01b5f3	Implement __isOSVersionAtLeast for Android Add the implementation of __isOSVersionAtLeast for Android. Currently, only the major version is checked against the API level of the platform which is an integer. The API level is retrieved by reading the system property ro.build.version.sdk (and optionally ro.build.version.codename to see if the platform is released or not). Patch by jiyong@google.com Bug: 150860940 Bug: 134795810 Test: m Reviewed By: srhines Differential Revision: https://reviews.llvm.org/D86596	2020-09-15 12:54:06 -07:00
Craig Topper	f5ad9c2e0e	[builtins] Write __divmoddi4/__divmodsi4 in terms of __udivmod instead of __div and multiply. Previously we calculated the remainder by multiplying the quotient and divisor and subtracting from the dividend. __udivmod can calculate the remainder while calculating the quotient. We just need to correct the sign afterward. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D87433	2020-09-10 08:08:55 -07:00
Craig Topper	35f708a3c9	[builtins] Inline __paritysi2 into __paritydi2 and inline __paritydi2 into __parityti2. No point in making __parityti2 go through 2 calls to get to __paritysi2. Reviewed By: MaskRay, efriedma Differential Revision: https://reviews.llvm.org/D87218	2020-09-07 17:57:39 -07:00
Brad Smith	8542dab909	[compiler-rt] Implement __clear_cache() on OpenBSD/arm	2020-09-06 15:54:24 -04:00
Anatoly Trosinenko	93eed63d2f	[builtins] Make __div[sdt]f3 handle denormal results This patch introduces denormal result support to soft-float division implementation unified by D85031. Reviewed By: sepavloff Differential Revision: https://reviews.llvm.org/D85032	2020-09-01 21:52:34 +03:00
Anatoly Trosinenko	0e90d8d4fe	[builtins] Unify the softfloat division implementation This patch replaces three different pre-existing implementations of __div[sdt]f3 LibCalls with a generic one - like it is already done for many other LibCalls. Reviewed By: sepavloff Differential Revision: https://reviews.llvm.org/D85031	2020-09-01 19:05:50 +03:00
Anatoly Trosinenko	11cf6346fd	[NFC][compiler-rt] Factor out __div[sdt]i3 and __mod[dt]i3 implementations Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D86400	2020-08-30 16:14:08 +03:00
Anatoly Trosinenko	fce035eae9	[NFC][compiler-rt] Factor out __mulo[sdt]i4 implementations to .inc file The existing implementations are almost identical except for width of the integer type. Factor them out to int_mulo_impl.inc for better maintainability. This patch is almost identical to D86277. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D86289	2020-08-27 14:33:48 +03:00
Anatoly Trosinenko	182d14db07	[NFC][compiler-rt] Factor out __mulv[sdt]i3 implementations to .inc file The existing implementations are almost identical except for width of the integer type. Factor them out to int_mulv_impl.inc for better maintainability. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D86277	2020-08-27 14:33:48 +03:00
David Tenty	f8454d60b8	[AIX][compiler-rt][builtins] Don't add ppc builtin implementations that require __int128 on AIX since __int128 currently isn't supported on AIX. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D85972	2020-08-25 11:35:38 -04:00
Freddy Ye	e02d081f2b	[X86] Support -march=sapphirerapids Support -march=sapphirerapids for x86. Compared with Icelake Server, it includes 14 more new features. They are amxtile, amxint8, amxbf16, avx512bf16, avx512vp2intersect, cldemote, enqcmd, movdir64b, movdiri, ptwrite, serialize, shstk, tsxldtrk, waitpkg. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D86503	2020-08-25 14:21:21 +08:00
Shoaib Meenai	2c80e2fe51	[runtimes] Use llvm-libtool-darwin for runtimes build It's full featured now and we can use it for the runtimes build instead of relying on an external libtool, which means the CMAKE_HOST_APPLE restriction serves no purpose either now. Restrict llvm-lipo to Darwin targets while I'm here, since it's only needed there. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D86367	2020-08-24 13:48:30 -07:00
Luís Marques	57903cf093	[compiler-rt][RISCV] Use muldi3 builtin assembly implementation D80465 added an assembly implementation of muldi3 for RISC-V but it didn't add it to the cmake `*_SOURCES` list, so the C implementation was being used instead. This patch fixes that. Differential Revision: https://reviews.llvm.org/D86036	2020-08-21 13:06:35 +01:00
Craig Topper	df9a9bb7be	[X86] Correct the implementation of the testFeature macro in getIntelProcessorTypeAndSubtype to do a proper bit test. Instead of ANDing with a one hot mask representing the bit to be tested, we were ANDing with just the bit number. This tests multiple bits, none of them the correct one. This caused skylake-avx512, cascadelake and cooperlake to all be misdetected. Based on experiments with the Intel SDE, it seems that all of these CPUs are being detected as being cooperlake. This is bad since it's the newest CPU of the 3.	2020-08-20 23:50:45 -07:00
Louis Dionne	afa1afd410	[CMake] Bump CMake minimum version to 3.13.4 This upgrade should be friction-less because we've already been ensuring that CMake >= 3.13.4 is used. This is part of the effort discussed on llvm-dev here: http://lists.llvm.org/pipermail/llvm-dev/2020-April/140578.html Differential Revision: https://reviews.llvm.org/D78648	2020-07-22 14:25:07 -04:00
Nico Weber	669b070936	cmake list formatting fix	2020-07-16 18:29:48 -04:00
Ryan Prichard	15b37e1cfa	[builtins] Omit 80-bit builtins on Android and MSVC long double is a 64-bit double-precision type on: - MSVC (32- and 64-bit x86) - Android (32-bit x86) long double is a 128-bit quad-precision type on x86_64 Android. The assembly variants of the 80-bit builtins are correct, but some of the builtins are implemented in C and require that long double be the 80-bit type passed via an x87 register. Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D82153	2020-07-16 15:11:26 -07:00
Ryan Prichard	8cbb6ccc7f	[builtins] Cleanup generic-file filtering Split filter_builtin_sources into two functions: - filter_builtin_sources that removes generic files when an arch-specific file is selected. - darwin_filter_builtin_sources that implements the EXCLUDE/INCLUDE lists (using the files in lib/builtins/Darwin-excludes). darwin_filter_builtin_sources delegates to filter_builtin_sources. Previously, lib/builtins/CMakeLists.txt had a number of calls to filter_builtin_sources (with a confusing/broken use of the `excluded_list` parameter), as well as a redundant arch-vs-generic filtering for the non-Apple code path at the end of the file. Replace all of this with a single call to filter_builtin_sources. Remove i686_SOURCES. Previously, this list contained only the arch-specific files common to 32-bit and 64-bit x86, which is a strange set. Normally the ${ARCH}_SOURCES list contains everything needed for the arch. "i686" isn't in ALL_BUILTIN_SUPPORTED_ARCH. NFCI, but i686_SOURCES won't be defined, and the order of files in ${arch}_SOURCES lists will change. Differential Revision: https://reviews.llvm.org/D82151	2020-07-13 16:53:07 -07:00
Ryan Prichard	f398e0f3d1	[builtins][Android] Define HAS_80_BIT_LONG_DOUBLE to 0 Android 32-bit x86 uses a 64-bit long double. Android 64-bit x86 uses a 128-bit quad-precision long double. Differential Revision: https://reviews.llvm.org/D82152	2020-07-13 16:53:07 -07:00
Craig Topper	b92c2bb6a2	[X86] Add CPU name strings to getIntelProcessorTypeAndSubtype and getAMDProcessorTypeAndSubtype in compiler-rt. These aren't used in compiler-rt, but I plan to make a similar change to the equivalent code in Host.cpp where the mapping from type/subtype is an unnecessary complication. Having the CPU strings here will help keep the code somewhat synchronized.	2020-07-12 12:59:25 -07:00
Danila Kutenin	68c011aa08	[builtins] Optimize udivmodti4 for many platforms. Summary: While benchmarking uint128 division we found out that it has huge latency for small divisors https://reviews.llvm.org/D83027
```
Benchmark                                                  Time(ns)  CPU(ns)  Iterations
--------------------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>     13.0      13.0     55000000
BM_DivideIntrinsic128UniformDivisor<__int128>              14.3      14.3     50000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>  13.5      13.5     52000000
BM_RemainderIntrinsic128UniformDivisor<__int128>           14.1      14.1     50000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>       153       153      5000000
BM_DivideIntrinsic128SmallDivisor<__int128>                170       170      3000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>    153       153      5000000
BM_RemainderIntrinsic128SmallDivisor<__int128>             155       155      5000000
```
This patch suggests a more optimized version of the division: If the divisor is 64 bit, we can proceed with the divq instruction on x86 or constant multiplication mechanisms for other platforms. Once both divisor and dividend are not less than 2**64, we use a branch-free subtract algorithm, which has at most 64 cycles. After that our benchmarks improved significantly
```
Benchmark                                                  Time(ns)  CPU(ns)  Iterations
--------------------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>     11.0      11.0     64000000
BM_DivideIntrinsic128UniformDivisor<__int128>              13.8      13.8     51000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>  11.6      11.6     61000000
BM_RemainderIntrinsic128UniformDivisor<__int128>           13.7      13.7     52000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>       27.1      27.1     26000000
BM_DivideIntrinsic128SmallDivisor<__int128>                29.4      29.4     24000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>    27.9      27.8     26000000
BM_RemainderIntrinsic128SmallDivisor<__int128>             29.1      29.1     25000000
```
If not using divq intrinsics, it is still much better
```
Benchmark                                                  Time(ns)  CPU(ns)  Iterations
--------------------------------------------------------------------------------------------------
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>     12.2      12.2     58000000
BM_DivideIntrinsic128UniformDivisor<__int128>              13.5      13.5     52000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>  12.7      12.7     56000000
BM_RemainderIntrinsic128UniformDivisor<__int128>           13.7      13.7     51000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>       30.2      30.2     24000000
BM_DivideIntrinsic128SmallDivisor<__int128>                33.2      33.2     22000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>    31.4      31.4     23000000
BM_RemainderIntrinsic128SmallDivisor<__int128>             33.8      33.8     21000000
```
PowerPC benchmarks: Was
```
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>     22.3      22.3     32000000
BM_DivideIntrinsic128UniformDivisor<__int128>              23.8      23.8     30000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>  22.5      22.5     32000000
BM_RemainderIntrinsic128UniformDivisor<__int128>           24.9      24.9     29000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>       394       394      2000000
BM_DivideIntrinsic128SmallDivisor<__int128>                397       397      2000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>    399       399      2000000
BM_RemainderIntrinsic128SmallDivisor<__int128>             397       397      2000000
```
With this patch
```
BM_DivideIntrinsic128UniformDivisor<unsigned __int128>     21.7      21.7     33000000
BM_DivideIntrinsic128UniformDivisor<__int128>              23.0      23.0     31000000
BM_RemainderIntrinsic128UniformDivisor<unsigned __int128>  21.9      21.9     33000000
BM_RemainderIntrinsic128UniformDivisor<__int128>           23.9      23.9     30000000
BM_DivideIntrinsic128SmallDivisor<unsigned __int128>       32.7      32.6     23000000
BM_DivideIntrinsic128SmallDivisor<__int128>                33.4      33.4     21000000
BM_RemainderIntrinsic128SmallDivisor<unsigned __int128>    31.1      31.1     22000000
BM_RemainderIntrinsic128SmallDivisor<__int128>             33.2      33.2     22000000
```
My email: danilak@google.com, I don't have commit rights Reviewers: howard.hinnant, courbet, MaskRay Reviewed By: courbet Subscribers: steven.zhang, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D81809	2020-07-10 09:59:16 +02:00
Sid Manning	baca8f977e	[compiler-rt][Hexagon] Remove fma/fmin/max code This code should reside in the c-library. Differential Revision: https://reviews.llvm.org/D82263	2020-07-07 19:50:04 -05:00
Anatoly Trosinenko	0ee439b705	[builtins] Change si_int to int in some helper declarations This patch changes types of some integer function arguments or return values from `si_int` to the default `int` type to make it more compatible with `libgcc`. The compiler-rt/lib/builtins/README.txt has a link to the [libgcc specification](http://gcc.gnu.org/onlinedocs/gccint/Libgcc.html#Libgcc). This specification has an explicit note on `int`, `float` and other such types being just illustrations in some cases while the actual types are expressed with machine modes. Such usage of always-32-bit-wide integer type may lead to issues on 16-bit platforms such as MSP430. Provided [libgcc2.h](https://gcc.gnu.org/git/?p=gcc.git;a=blob_plain;f=libgcc/libgcc2.h;hb=HEAD) can be used as a reference for all targets supported by the libgcc, this patch fixes some existing differences in helper declarations. This patch is expected to not change behavior at all for targets with 32-bit `int` type. Differential Revision: https://reviews.llvm.org/D81285	2020-06-30 11:07:02 +03:00
Anatoly Trosinenko	a4e8f7fe3f	[builtins] Improve compatibility with 16 bit targets Some parts of existing codebase assume the default `int` type to be (at least) 32 bit wide. On 16 bit targets such as MSP430 this may cause Undefined Behavior or results being defined but incorrect. Differential Revision: https://reviews.llvm.org/D81408	2020-06-26 15:31:11 +03:00
Anatoly Trosinenko	a931ec7ca0	[builtins] Move more float128-related helpers to GENERIC_TF_SOURCES list There are two different _generic_ lists of source files in the compiler-rt/lib/builtins/CMakeLists.txt. Now there is no simple way to not use the tf-variants of helpers at all. Since there exists a separate `GENERIC_TF_SOURCES` list, it seems quite natural to move all float128-related helpers there. If it is not possible for some reason, it would be useful to have an explanation of that reason somewhere near the `GENERIC_TF_SOURCES` definition. Differential Revision: https://reviews.llvm.org/D81282	2020-06-25 22:32:49 +03:00
Craig Topper	23654d9e7a	Recommit "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum." Hopefully this version will fix the previous buildbot failure	2020-06-22 13:32:03 -07:00
Craig Topper	bebea4221d	Revert "[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum." Seems to be breaking the build. This reverts commit `5ac144fe64`.	2020-06-22 12:20:40 -07:00
Craig Topper	5ac144fe64	[X86] Calculate the needed size of the feature arrays in _cpu_indicator_init and getHostCPUName using the size of the feature enum. Move 0 initialization up to the caller so we don't need to know the size.	2020-06-22 11:46:20 -07:00
Craig Topper	90406d62e5	[X86] Add cooperlake and tigerlake to the enum in cpu_model.c I forgot to do this when I added them to _cpu_indicator_init.	2020-06-21 16:20:26 -07:00
Craig Topper	0e6c9316d4	[X86] Add cooperlake detection to _cpu_indicator_init. libgcc has this enum encoding defined for a while, but their detection code is missing. I've raised a bug with them so that should get fixed soon.	2020-06-21 13:02:33 -07:00
Craig Topper	35f7d58328	[X86] Set the cpu_vendor in __cpu_indicator_init to VENDOR_OTHER if cpuid isn't supported on the CPU. We need to set the cpu_vendor to a non-zero value to indicate that we already called __cpu_indicator_init once. This should only happen on a 386 or 486 CPU.	2020-06-20 15:36:04 -07:00
Ryan Prichard	8627190f31	[builtins] Fix typos in comments Differential Revision: https://reviews.llvm.org/D82146	2020-06-19 16:08:04 -07:00
David Tenty	8aef01eed4	[AIX][compiler-rt] Pick the right form of COMPILER_RT_ALIAS for AIX Summary: we use the alias attribute, similar to what is done for ELF. Reviewers: ZarkoCA, jasonliu, hubert.reinterpretcast, sfertile Reviewed By: jasonliu Subscribers: dberris, aheejin, mstorsjo, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D81120	2020-06-16 14:10:40 -04:00
Craig Topper	033bf61cc5	[X86] Remove brand_id check from cpu_indicator_init. Brand index was a feature of some Pentium III and Pentium 4 CPUs. It provided an index into a software lookup table to provide a brand name for the CPU. This is separate from the family/model. It's unclear to me why this index being non-zero was used to block checking family/model. None of the CPUs that had a non-zero brand index are supported by __builtin_cpu_is or target multi-versioning so this should have no real effect.	2020-06-12 20:35:48 -07:00
Craig Topper	94ccb2acbf	[X86] Combine the two feature variables in __cpu_indicator_init into an array and pass them around as a pointer we can treat as an array. This simplifies the indexing code to set and test bits.	2020-06-12 18:30:41 -07:00
Craig Topper	e424a3526a	[X86] Explicitly initialize __cpu_features2 global in compiler-rt to 0. Seems like this may be needed in order for the linker to find the symbol. At least on my Mac.	2020-06-12 18:30:34 -07:00
kamlesh kumar	e31ccee1b0	[RISCV-V] Provide muldi3 builtin assembly implementation Provides an assembly implementation of muldi3 for RISC-V, to solve bug 43388. Since the implementation is the same as for mulsi3, that code was moved to `riscv/int_mul_impl.inc` and is now reused by both `mulsi3.S` and `muldi3.S`. Differential Revision: https://reviews.llvm.org/D80465	2020-06-02 21:04:55 +01:00
Kazushi (Jam) Marukawa	dedaf3a2ac	[VE] Dynamic stack allocation Summary: This patch implements dynamic stack allocation for the VE target. Changes: * compiler-rt: `__ve_grow_stack` to request stack allocation on the VE. * VE: base pointer support, dynamic stack allocation. Differential Revision: https://reviews.llvm.org/D79084	2020-05-27 10:11:06 +02:00
Craig Topper	2bb822bc90	[X86] Add family/model for Intel Comet Lake CPUs for -march=native and function multiversioning This adds the family/model returned by CPUID for some Intel Comet Lake CPUs. Instruction set and tuning wise these are the same as "skylake". These are not in the Intel SDM yet, but these should be correct.	2020-05-24 00:29:25 -07:00
Craig Topper	95bc21f32f	[X86] Add avx512vp2intersect feature to compiler-rt's feature detection to match libgcc.	2020-05-21 21:54:54 -07:00
Kamil Rytarowski	f61f6ffe11	[compiler-rt] [builtin] Switch the return type of __atomic_compare_exchange_##n to bool Summary: Synchronize the function definition with the LLVM documentation. https://llvm.org/docs/Atomics.html#libcalls-atomic GCC also returns bool for the same atomic builtin. Reviewers: theraven Reviewed By: theraven Subscribers: theraven, dberris, jfb, #sanitizers Tags: #sanitizers Differential Revision: https://reviews.llvm.org/D79845	2020-05-13 14:09:02 +02:00
Ayke van Laethem	4d41df6482	[builtins] Support architectures with 16-bit int This is the first patch in a series to add support for the AVR target. This patch includes changes to make compiler-rt more target independent by not relying on the width of an int or long. Differential Revision: https://reviews.llvm.org/D78662	2020-04-26 01:22:10 +02:00
Fangrui Song	17772995d4	[builtins] Add missing header in D77912 and make __builtin_clzll more robust	2020-04-17 08:29:58 -07:00
Ayke van Laethem	d9e5691843	[builtins] Fix unprototyped function declarations The following declarations were missing a prototype: FE_ROUND_MODE __fe_getround(); int __fe_raise_inexact(); Discovered while fixing a bug in Clang related to unprototyped function calls (see the previous commit). Differential Revision: https://reviews.llvm.org/D78205	2020-04-15 23:44:51 +02:00
Fangrui Song	b541196eb4	[builtins] Make __umodsi3/__udivdi3/__umoddi3 standalone (shift and subtract) @kamleshbhalui reported that when the Standard Extension M (Multiplication and Division) is disabled for RISC-V, `__udivdi3` will call __udivmodti4 which will in turn call `__udivdi3`. This patch moves __udivsi3 (shift and subtract) to int_div_impl.inc `__udivXi3`, optimizes it a bit, adds a `__umodXi3`, and uses `__udivXi3` and `__umodXi3` to define `__udivsi3` `__umodsi3` `__udivdi3` `__umoddi3`. Reviewed By: kamleshbhalui Differential Revision: https://reviews.llvm.org/D77912	2020-04-14 10:38:37 -07:00
Shoaib Meenai	f481256bfe	[builtins] Build for arm64e for Darwin https://github.com/apple/swift/pull/30112/ makes the Swift standard library for iOS build for arm64e. If you're building Swift against your own LLVM, this in turn requires having the builtins built for arm64e, otherwise you won't be able to use the builtins (which will in turn lead to an undefined symbol for `__isOSVersionAtLeast`). Make the builtins build for arm64e to fix this. Differential Revision: https://reviews.llvm.org/D76041	2020-03-11 22:01:44 -07:00
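
The bit-test bug described in commit df9a9bb7be (ANDing with the bit number instead of a one-hot mask) can be illustrated with a small sketch. This is not the compiler-rt code: the function names `test_feature_buggy` and `test_feature_fixed` and the single 32-bit feature word are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not the actual testFeature macro from compiler-rt. */

/* Buggy form: ANDs the feature word with the bit *number* itself,
 * so it tests whichever bits happen to be set in that number. */
static int test_feature_buggy(uint32_t features, unsigned bit) {
  return (features & bit) != 0;
}

/* Fixed form: builds a one-hot mask for the requested bit first. */
static int test_feature_fixed(uint32_t features, unsigned bit) {
  assert(bit < 32); /* this sketch assumes a single 32-bit feature word */
  return (features & (UINT32_C(1) << bit)) != 0;
}
```

With only bit 5 set, the buggy form computes `32 & 5`, which is zero, and misses the feature. With only bit 0 set, it computes `1 & 5`, which is non-zero, and reports bit 5 as present. The fixed form gives the expected answer in both cases.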


475 Commits