llvm-project

Commit Graph

Author	SHA1	Message	Date
Yaxun (Sam) Liu	622eaa4a4c	[HIP] Support __managed__ attribute This patch implements codegen for __managed__ variable attribute for HIP. Diagnostics will be added later. Differential Revision: https://reviews.llvm.org/D94814	2021-01-22 11:43:58 -05:00
Akira Hatanaka	3d349ed7e1	[CodeGen][ObjC] Fix broken IR generated when there is a nil receiver check This patch fixes a bug in emitARCOperationAfterCall where it inserts the fall-back call after a bitcast instruction and then replaces the bitcast's operand with the result of the fall-back call. The generated IR without this patch looks like this: msgSend.call: ; preds = %entry %call = call i8* bitcast (i8* (i8, i8, ...)* @objc_msgSend br label %msgSend.cont msgSend.null-receiver: ; preds = %entry call void @llvm.objc.release(i8* %4) br label %msgSend.cont msgSend.cont: %8 = phi i8* [ %call, %msgSend.call ], [ null, %msgSend.null-receiver ] %9 = bitcast i8* %10 to %0* %10 = call i8* @llvm.objc.retain(i8* %8) Notice that `%9 = bitcast i8* %10 to %0*` is taking operand %10 which is defined after it. To fix the bug, this patch modifies the insert point to point to the bitcast instruction so that the fall-back call is inserted before the bitcast. In addition, it teaches the function to look at phi instructions that are generated when there is a check for a null receiver and insert the retainRV/claimRV instruction right after the call instead of inserting a fall-back call right after the phi instruction. rdar://73360225 Differential Revision: https://reviews.llvm.org/D95181	2021-01-21 17:38:46 -08:00
Jon Roelofs	1deee5cacb	Fix crash when emitting NullReturn guards for functions returning BOOL CodeGenModule::EmitNullConstant() creates constants with their "in memory" type, not their "in vregs" type. The one place where this difference matters is when the type is _Bool, as that is an i1 when in vregs and an i8 in memory. Fixes: rdar://73361264	2021-01-21 14:29:36 -08:00
Joseph Huber	e4eaf9d820	[OpenMP] Add support for mapping names in mapper API Summary: The custom mapper API did not previously support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D94806	2021-01-21 09:26:44 -05:00
Amy Huang	a3d7cee7f9	[CodeView] Emit function types in -gline-tables-only. This change adds function types to further differentiate between FUNC_IDs in -gline-tables-only. The size increase of object files in clang is: Before: 917990 kb After: 999312 kb Bug: https://bugs.llvm.org/show_bug.cgi?id=48432 Differential Revision: https://reviews.llvm.org/D95001	2021-01-20 12:47:35 -08:00
Thomas Lively	11802eced5	[WebAssembly] Prototype new f64x2 conversions As proposed in https://github.com/WebAssembly/simd/pull/383. Differential Revision: https://reviews.llvm.org/D95012	2021-01-20 11:28:06 -08:00
Hans Wennborg	8ba442bc21	Revert "Following up on PR48517, fix handling of template arguments that refer" Combined with 'da98651 - Revert "DR2064: decltype(E) is only a dependent', this change (`5a391d3`) caused verifier errors when building Chromium. See https://crbug.com/1168494#c1 for a reproducer. Additionally it reverts changes that were dependent on this one, see below. > Following up on PR48517, fix handling of template arguments that refer > to dependent declarations. > > Treat an id-expression that names a local variable in a templated > function as being instantiation-dependent. > > This addresses a language defect whereby a reference to a dependent > declaration can be formed without any construct being value-dependent. > Fixing that through value-dependence turns out to be problematic, so > instead this patch takes the approach (proposed on the core reflector) > of allowing the use of pointers or references to (but not values of) > dependent declarations inside value-dependent expressions, and instead > treating template arguments as dependent if they evaluate to a constant > involving such dependent declarations. > > This ends up affecting a bunch of OpenMP tests, due to OpenMP > imprecisely handling instantiation-dependent constructs, bailing out > early instead of processing dependent constructs to the extent possible > when handling the template. > > Previously committed as `8c1f2d15b8`, and > reverted because a dependency commit was reverted. This reverts commit `5a391d38ac`. It also restores clang/test/SemaCXX/coroutines.cpp to its state before `da986511fb`. Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type." > Previously committed as `9e08e51a20`, and > reverted because a dependency commit was reverted. This incorporates the > following follow-on commits that were also reverted: > > `7e84aa1b81` by Simon Pilgrim > `ed13d8c667` by me > `95c7b6cadb` by Sam McCall > `430d5d8429` by Dave Zarzycki This reverts commit `4b574008ae`. Revert "[msabi] Mangle a template argument referring to array-to-pointer decay" > [msabi] Mangle a template argument referring to array-to-pointer decay > applied to an array the same as the array itself. > > This follows MS ABI, and corrects a regression from the implementation > of generalized non-type template parameters, where we "forgot" how to > mangle this case. This reverts commit `18e093faf7`.	2021-01-20 15:55:35 +01:00
Alexey Bataev	b272698de7	[OPENMP]Do not use OMP_MAP_TARGET_PARAM for data movement directives. OMP_MAP_TARGET_PARAM flag is used to mark the data that should be passed as arguments to the target kernels, nothing else. But the compiler still marks the data with OMP_MAP_TARGET_PARAM flags even if the data is passed to the data movement directives, like target data, target update etc. This flag is just ignored for these directives and the compiler does not need to emit it. Reviewed By: cchen Differential Revision: https://reviews.llvm.org/D91261	2021-01-19 12:41:15 -08:00
Shilei Tian	82e537a9d2	[Clang][OpenMP] Fixed an issue that clang crashed when compiling OpenMP program in device only mode without host IR D94745 rewrites the `deviceRTLs` using OpenMP and compiles it by directly calling the device compilation. `clang` crashes because entry in `OffloadEntriesDeviceGlobalVar` is uninitialized. Current design supposes the device compilation can only be invoked after host compilation with the host IR such that `clang` can initialize `OffloadEntriesDeviceGlobalVar` from host IR. This prevents us from using device compilation directly, especially when we only have code wrapped into `declare target` which are all device code. The same issue also exists for `OffloadEntriesInfoManager`. In this patch, we simply initialized an entry if it is not in the maps. Not sure we need an option to tell the device compiler that it is invoked standalone. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94871	2021-01-19 14:18:42 -05:00
Richard Smith	4b574008ae	[c++20] P1907R1: Support for generalized non-type template arguments of scalar type. Previously committed as `9e08e51a20`, and reverted because a dependency commit was reverted. This incorporates the following follow-on commits that were also reverted: `7e84aa1b81` by Simon Pilgrim `ed13d8c667` by me `95c7b6cadb` by Sam McCall `430d5d8429` by Dave Zarzycki	2021-01-18 21:05:01 -08:00
Amy Huang	6227069bdc	[DebugInfo][CodeView] Change in line tables only mode to emit type information for function scopes, rather than using the qualified name. In line-tables-only mode, we used to emit qualified names as the display name for functions when using CodeView. This patch changes to emitting the parent scopes instead, with forward declarations for class types. The total object file size ends up being slightly smaller than if we use the full qualified names. Differential Revision: https://reviews.llvm.org/D94639	2021-01-15 09:28:27 -08:00
Qiu Chaofan	168be42083	[Clang] Mutate long-double math builtins into f128 under IEEE-quad Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic is used for long double. This patch mutates call to related builtins into f128 version on PowerPC. And in theory, this should be applied to other targets when their backend supports IEEE 128-bit style libcalls. GCC already has these mutations except nansl, which is not available on PowerPC along with other variants (nans, nansf). Reviewed By: RKSimon, nemanjai Differential Revision: https://reviews.llvm.org/D92080	2021-01-15 16:56:20 +08:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Zequan Wu	e53bbd9951	[IR] move nomerge attribute from function declaration/definition to callsites Move nomerge attribute from function declaration/definition to callsites to allow virtual function calls to attach the attribute. Differential Revision: https://reviews.llvm.org/D94537	2021-01-12 12:10:46 -08:00
David Truby	e5f51fdd65	[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate MSVC on WoA64 includes isCXX14Aggregate in its definition. This is de-facto specification on that platform, so match msvc's behaviour. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611 Co-authored-by: Peter Waller <peter.waller@arm.com> Differential Revision: https://reviews.llvm.org/D92751	2021-01-12 19:44:01 +00:00
Bevin Hansson	c4944a6f53	[Fixed Point] Add codegen for conversion between fixed-point and floating point. The patch adds the required methods to FixedPointBuilder for converting between fixed-point and floating point, and uses them from Clang. This depends on D54749. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D86632	2021-01-12 13:53:01 +01:00
Fangrui Song	b88c8f1aab	CGDebugInfo: Delete unused parameters	2021-01-11 13:39:03 -08:00
Fangrui Song	f4cec703ec	Add an assert to CGDebugInfo::getTypeOrNull	2021-01-11 13:25:20 -08:00
Joe Ellis	8ea72b3887	[clang][AArch64][SVE] Avoid going through memory for coerced VLST return values VLST return values are coerced to VLATs in the function epilog for consistency with the VLAT ABI. Previously, this coercion was done through memory. It is preferable to use the llvm.experimental.vector.insert intrinsic to avoid going through memory here. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D94290	2021-01-11 12:10:59 +00:00
Fangrui Song	b8d2842088	CGDebugInfo: Delete unneeded UnwrapTypeForDebugInfo Tested with stage 2 -DCMAKE_BUILD_TYPE=Debug clang, byte identical.	2021-01-10 22:22:07 -08:00
Fangrui Song	6215c1b778	CGDebugInfo: Delete redundant test	2021-01-10 22:22:06 -08:00
Fangrui Song	02bc320545	CGDebugInfo: Delete unused DIFile* parameter	2021-01-10 15:03:40 -08:00
Fangrui Song	abfe348e6b	[test] Improve CodeGenCXX/difile_entry.cpp The test added in D87147 did not actually test PR47391. Use an absolute path to test the canonicalization.	2021-01-10 12:24:49 -08:00
Fangrui Song	e2e82c9983	[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking its address (similar to a data symbol). The emitted code follows the traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced. If the function is not defined in the executable, a canonical PLT entry will be needed at link time. This is similar to a copy relocation and is incompatible with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in a shared object). This patch gives -fno-pic code a way to avoid such a canonical PLT entry. The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie -fdirect-access-external-data). While we could set dso_local to avoid GOT when taking the address of a function declaration (there is an ignorable difference about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit and can just cause trouble, so we don't make the generalization.	2021-01-09 16:31:56 -08:00
Fangrui Song	38a716c30f	Make -fno-pic respect -fno-direct-access-external-data D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-pic and -fpic.) This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from global variable declarations. This usually causes the backend to emit a GOT indirection for external data access. With a GOT relocation, the subsequent -no-pie link will not have copy relocation even if the data symbol turns out to be defined by a shared object. Differential Revision: https://reviews.llvm.org/D92714	2021-01-09 00:32:02 -08:00
Fangrui Song	1d3ebbf537	Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference external data symbols by default. Clang adopted -mpie-copy-relocations D19996 as a flexible alternative. The name -mpie-copy-relocations can be improved [1] and does not capture the idea that this option can apply to -fno-pic and -fpic [2], so this patch introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations their aliases for compatibility. [1] For ``` extern int var; int get() { return var; } ``` if var is defined in another translation unit in the link unit, there is no copy relocation. [2] -fno-pic -fno-direct-access-external-data is useful to avoid copy relocations. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888 If a shared object is linked with -Bsymbolic or --dynamic-list and exports a data symbol, normally the data symbol cannot be accessed by -fno-pic code (because by default an absolute relocation is produced which will lead to a copy relocation). -fno-direct-access-external-data can prevent copy relocations. -fpic -fdirect-access-external-data can avoid GOT indirection. This is like the undefined counterpart of -fno-semantic-interposition. However, the user should define var in another translation unit and link with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared link. Generally the user has better tools for their goal but I want to mention that this combination is valid. On COFF, the behavior is like always -fdirect-access-external-data. `__declspec(dllimport)` is needed to enable indirect access. There is currently no plan to affect non-ELF behaviors or -fpic behaviors. -fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch. GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D92633	2021-01-09 00:32:01 -08:00
Umesh Kalappa	33c8e16f66	PR47391: Canonicalize DIFiles Like @aprantl suggested, modify to use the canonicalized DIFile, if we don't know the loc info and filename for the compiler generated functions for example static initialization functions. Reviewed By: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D87147	2021-01-08 22:11:16 -08:00
Arthur Eubanks	756dd70766	[NewPM] Run ObjC ARC passes Match the legacy PM in running various ObjC ARC passes. This requires making some module passes into function passes. These were initially ported as module passes since they add function declarations (e.g. https://reviews.llvm.org/D86178), but that's still up for debate and other passes do so. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D93743	2021-01-08 15:47:11 -08:00
Hongtao Yu	0e23fd676c	[Driver] Add DWARF64 flag: -gdwarf64 @ikudrin enabled support for dwarf64 in D87011. Adding a clang flag so it can be used through that compilation pass. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D90507	2021-01-08 12:58:38 -08:00
Heejin Ahn	7be271537e	[WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin `wasm_rethrow_in_catch` intrinsic and builtin are used in order to rethrow an exception when the exception is caught but there is no matching clause within the current `catch`. For example, ``` try { foo(); } catch (int n) { ... } ``` If the caught exception does not correspond to C++ `int` type, it should be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch` because at the time I thought there would be another intrinsic for C++'s `throw` keyword, which rethrows an exception. It turned out that `throw` keyword doesn't require wasm's `rethrow` instruction, so we rename `rethrow_in_catch` to just `rethrow` here. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D94038	2021-01-08 06:55:04 -08:00
David Sherwood	38d18d9353	[SVE] Add support to vectorize_width loop pragma for scalable vectors This patch adds support for two new variants of the vectorize_width pragma: 1. vectorize_width(X[, fixed|scalable]) where an optional second parameter is passed to the vectorize_width pragma, which indicates if the user wishes to use fixed width or scalable vectorization. For example the user can now write something like: #pragma clang loop vectorize_width(4, fixed) or #pragma clang loop vectorize_width(4, scalable) In the absence of a second parameter it is assumed the user wants fixed width vectorization, in order to maintain compatibility with existing code. 2. vectorize_width(fixed|scalable) where the width is left unspecified, but the user hints what type of vectorization they prefer, either fixed width or scalable. I have implemented this by making use of the LLVM loop hint attribute: llvm.loop.vectorize.scalable.enable Tests were added to clang/test/CodeGenCXX/pragma-loop.cpp for both the 'fixed' and 'scalable' optional parameter. See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html Differential Revision: https://reviews.llvm.org/D89031	2021-01-08 11:37:27 +00:00
Wang, Pengfei	c102b9697b	[X86] Correct the comments about comparison intrinsics. NFCI.	2021-01-08 15:36:15 +08:00
Reid Kleckner	ad55d5c3f3	Simplify vectorcall argument classification of HVAs, NFC This reduces the number of `WinX86_64ABIInfo::classify` call sites from 3 to 1. The call sites were similar, but passed different values for FreeSSERegs. Use variables instead of `if`s to manage that argument.	2021-01-07 11:14:18 -08:00
Pushpinder Singh	4909cb1a0f	[OpenMP][AMDGPU] Use AMDGPU_KERNEL calling convention for entry function AMDGPU backend requires entry functions/kernels to have AMDGPU_KERNEL calling convention for proper linking. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94060	2021-01-06 02:03:30 -05:00
Thomas Lively	497026c902	[WebAssembly] Prototype prefetch instructions As proposed in https://github.com/WebAssembly/simd/pull/352 and using the opcodes used in the V8 prototype: https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions are only usable via intrinsics and clang builtins to make them opt-in while they are being benchmarked. Differential Revision: https://reviews.llvm.org/D93883	2021-01-05 11:32:03 -08:00
Simon Pilgrim	55488bd3cd	CGExpr - EmitMatrixSubscriptExpr - fix getAs<> null-dereference static analyzer warning. NFCI. getAs<> can return null if the cast is invalid, which can lead to null pointer dereferences. Use castAs<> instead which will assert that the cast is valid.	2021-01-05 17:08:11 +00:00
Alan Phipps	9f2967bcfe	[Coverage] Add support for Branch Coverage in LLVM Source-Based Code Coverage This is an enhancement to LLVM Source-Based Code Coverage in clang to track how many times individual branch-generating conditions are taken (evaluate to TRUE) and not taken (evaluate to FALSE). Individual conditions may comprise larger boolean expressions using boolean logical operators. This functionality is very similar to what is supported by GCOV except that it is very closely anchored to the ASTs. Differential Revision: https://reviews.llvm.org/D84467	2021-01-05 09:51:51 -06:00
Joe Ellis	3d5b18a3fd	[clang][AArch64][SVE] Avoid going through memory for coerced VLST arguments VLST arguments are coerced to VLATs at the function boundary for consistency with the VLAT ABI. They are then bitcast back to VLSTs in the function prolog. Previously, this conversion was done through memory. With the introduction of the llvm.vector.{insert,extract} intrinsic, we can avoid going through memory here. Depends on D92761 Differential Revision: https://reviews.llvm.org/D92762	2021-01-05 15:18:21 +00:00
Thorsten Schütt	2fd11e0b1e	Revert "[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)" This reverts commit `efc82c4ad2`.	2021-01-04 23:17:45 +01:00
Thorsten Schütt	efc82c4ad2	[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II) Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D93765	2021-01-04 22:58:26 +01:00
Hongtao Yu	4034f9273e	Switching Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm. As a follow-up to D93656, I'm switching the Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm. Test Plan: Reviewed By: aeubanks, tmsriram Differential Revision: https://reviews.llvm.org/D94019	2021-01-04 12:04:46 -08:00
Jon Chesterfield	76bfbb74d3	[libomptarget][amdgpu] Call into deviceRTL instead of ockl Amdgpu codegen presently emits a call into ockl. The same functionality is already present in the deviceRTL. Adds an amdgpu specific entry point to avoid the dependency. This lets simple openmp code (specifically, that which doesn't use libm) run without rocm device libraries installed. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D93356	2021-01-04 16:48:47 +00:00
Brandon Bergren	6cee9d0cf8	[PowerPC] Support powerpcle target in Clang [3/5] Add powerpcle support to clang. For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment. For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC. Adjust driver to match current binutils behavior regarding machine naming. Adjust and expand tests. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93919	2021-01-02 12:17:58 -06:00
Fangrui Song	d1fd72343c	Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar. The refactoring is made available by `2820a2ca3a`. Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states: * -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out. Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.	2020-12-31 13:59:45 -08:00
Fangrui Song	809a1e0ffd	[CodeGenModule] Set dso_local for Mach-O GlobalValue * static relocation model: always * other relocation models: if isStrongDefinitionForLinker This will make LLVM IR emitted for COFF/Mach-O and executable ELF similar.	2020-12-30 20:52:01 -08:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Fangrui Song	2820a2ca3a	Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule This simplifies TargetMachine::shouldAssumeDSOLocal and gives frontend the decision to use dso_local. For LLVM synthesized functions/globals, they may lose inferred dso_local but such optimizations are probably not very useful. Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now. (llvm/CodeGen/X86/semantic-interposition-comdat.ll) (Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)	2020-12-29 23:37:55 -08:00
James Y Knight	4ddf140c00	Fix PR35902: incorrect alignment used for ubsan check. UBSan was using the complete-object align rather than nv alignment when checking the "this" pointer of a method. Furthermore, CGF.CXXABIThisAlignment was also being set incorrectly, due to an incorrectly negated test. The latter doesn't appear to have had any impact, due to it not really being used anywhere. Differential Revision: https://reviews.llvm.org/D93072	2020-12-28 18:11:17 -05:00
Thomas Lively	5e09e9979b	[WebAssembly] Prototype extending pairwise add instructions As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes the new instructions available only via clang builtins and LLVM intrinsics to make their use opt-in while they are still being evaluated for inclusion in the SIMD proposal. Depends on D93771. Differential Revision: https://reviews.llvm.org/D93775	2020-12-28 14:11:14 -08:00
Akira Hatanaka	34405b41d6	[CodeGen][ObjC] Destroy callee-destroyed arguments in the caller function when the receiver is nil Callee-destroyed arguments to a method have to be destroyed in the caller function when the receiver is nil as the method doesn't get executed. This fixes PR48207. rdar://71808391 Differential Revision: https://reviews.llvm.org/D93273	2020-12-28 11:52:27 -08:00
Alexandre Ganea	69132d12de	[Clang] Reverse test to save on indentation. NFC.	2020-12-23 19:24:53 -05:00
Alan Phipps	bbd758a791	Revert "This is a test commit" This reverts commit `b920adf3b4`.	2020-12-23 13:04:37 -06:00
Alan Phipps	b920adf3b4	This is a test commit	2020-12-23 12:57:27 -06:00
Arthur O'Dwyer	22cf54a7fb	Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC. Differential Revision: https://reviews.llvm.org/D76572	2020-12-22 19:54:29 -05:00
Arthur Eubanks	2080232333	Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type." This reverts commit `9e08e51a20`. This is part of 5 commits being reverted due to https://crbug.com/1161059. See bug for repro.	2020-12-22 10:18:08 -08:00
Richard Smith	9e08e51a20	[c++20] P1907R1: Support for generalized non-type template arguments of scalar type.	2020-12-18 01:08:41 -08:00
Rong Xu	3733463dbb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will be used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrites user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
Tom Stellard	3203143f13	CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int) Add a special case for handling __builtin_mul_overflow with unsigned inputs and a signed output to avoid emitting the __muloti4 library call on x86_64. __muloti4 is not implemented in libgcc, so avoiding this call fixes compilation of some programs that call __builtin_mul_overflow with these arguments. For example, this fixes the build of cpio with clang, which includes code from gnulib that calls __builtin_mul_overflow with these argument types. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D84405	2020-12-17 14:30:31 -08:00
Baptiste Saleil	c2892978e9	[PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ On PPC, the vector pair instructions are independent from MMA. This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names. We also move the vector pair type/intrinsic/builtin tests to their own files. Differential Revision: https://reviews.llvm.org/D91974	2020-12-17 13:19:27 -05:00
Zequan Wu	fb0f728805	[Clang] Make nomerge attribute a function attribute as well as a statement attribute. Differential Revision: https://reviews.llvm.org/D92800	2020-12-17 07:45:38 -08:00
dfukalov	9ed8e0caab	[NFC] Reduce include files dependency and AA header cleanup (part 2). Continuing work started in https://reviews.llvm.org/D92489: Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h". Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92852	2020-12-17 14:04:48 +03:00
Fangrui Song	c70f36865e	Use basic_string::find(char) instead of basic_string::find(const char *s, size_type pos=0) Many (StringRef) cannot be detected by clang-tidy performance-faster-string-find.	2020-12-16 23:28:32 -08:00
Joe Ellis	dad07baf12	[clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts This change makes use of the llvm.vector.extract intrinsic to avoid going through memory when performing bitcasts between vector-length agnostic types and vector-length specific types. Depends on D91362 Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D92761	2020-12-16 12:24:32 +00:00
Johannes Doerfert	b9c77542e2	[Clang][Attr] Introduce the `assume` function attribute The `assume` attribute is a way to provide additional, arbitrary information to the optimizer. For now, assumptions are restricted to strings which will be accumulated for a function and emitted as comma separated string function attribute. The key of the LLVM-IR function attribute is `llvm.assume`. Similar to `llvm.assume` and `__builtin_assume`, the `assume` attribute provides a user defined assumption to the compiler. A follow up patch will introduce an LLVM-core API to query the assumptions attached to a function. We also expect to add more options, e.g., expression arguments, to the `assume` attribute later on. The `omp [begin] assumes` pragma will leverage this attribute and expose the functionality in the absence of OpenMP. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D91979	2020-12-15 16:51:34 -06:00
Fangrui Song	59decf8e9c	[clang] Migrate deprecated DebugInfo::get to DILocation::get	2020-12-15 13:59:31 -08:00
Baptiste Saleil	57d83c3a90	[PowerPC] Enable paired vector type and intrinsics when MMA is disabled This patch enables the Clang type __vector_pair and its associated LLVM intrinsics even when MMA is disabled. With this patch, the type is now controlled by the PPC paired-vector-memops option. The builtins and intrinsics will be renamed to drop the mma prefix in another patch. Differential Revision: https://reviews.llvm.org/D91819	2020-12-15 15:14:11 -06:00
Jan Svoboda	f24e58df7d	[clang][cli] Create accessors for exception models in LangOptions This abstracts away the members that are being replaced in a follow-up patch. Depends on D83979. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D93214	2020-12-15 10:15:58 +01:00
Rong Xu	c36f31c4db	[PGO] remove unintentional code in early commit Remove unintentional code in commit 54e03d [PGO] Verify BFI counts after loading profile data.	2020-12-14 18:41:49 -08:00
Rong Xu	54e03d03a7	[PGO] Verify BFI counts after loading profile data This patch adds the functionality to compare BFI counts with real profile counts right after reading the profile. It will print remarks under -Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo. Differential Revision: https://reviews.llvm.org/D91813	2020-12-14 15:56:10 -08:00
Gulfem Savrun Yeniceri	7c0e3a77bc	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Matt Arsenault	ef4da3c2ba	clang: Add byval on x86_intrcc parameter 0 This will allow removing the special case treatment of the parameter and avoid depending on the pointer's element type.	2020-12-14 16:34:37 -05:00
Simon Pilgrim	4855a1004d	[X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well. Differential Revision: https://reviews.llvm.org/D92940	2020-12-13 15:37:35 +00:00
Alexey Bader	a500a43587	[CodeGen][AMDGPU] Fix ICE for static initializer IR generation Differential Revision: https://reviews.llvm.org/D92782	2020-12-12 23:26:54 +03:00
Melanie Blower	320af6b138	Create SPIRABIInfo to enable SPIR_FUNC calling convention. Background: Calls to library arithmetic functions for div are emitted by the compiler, which sets the wrong “C” calling convention for calls to these functions, whereas the library functions are declared with the `spir_function` calling convention. InstCombine optimization replaces such calls with an “unreachable” instruction. It looks like clang lacks a SPIRABIInfo class which should specify default calling conventions for “system” function calls. SPIR supports only the SPIR_FUNC and SPIR_KERNEL calling conventions. Reviewers: Erich Keane, Anastasia Differential Revision: https://reviews.llvm.org/D92721	2020-12-12 05:48:20 -08:00
Arthur Eubanks	ff7e1da68f	[NPM] Support -fmerge-functions I tried to put it in the same place in the pipeline as the legacy PM. Fixes PR48399. Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D93002	2020-12-10 11:45:08 -08:00
Florian Hahn	9c4cddb53a	[Clang] Add vcmla and rotated variants for Arm ACLE. This patch adds vcmla and the rotated variants as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] The _lane_ are still missing, but they can be added separately. This patch only adds the builtin mapping for AArch64. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92930	2020-12-10 16:54:08 +00:00
Fangrui Song	f9c0d1b056	[Driver] Add -f[no-]legacy-pass-manager to supersede -f[no-]experimental-new-pass-manager The new PM is considered stable and many downstream groups have adopted it (some have adopted it for more than two years). Add -f[no-]legacy-pass-manager to reflect the fact that it is no longer experimental and the legacy pass manager is something we strive to retire. In the future, when the legacy PM eventually goes away, -fno-experimental-new-pass-manager and -flegacy-pass-manager will be removed. This patch also changes -f[no-]legacy-pass-manager to pass `-plugin-opt={new,legacy}-pass-manager` to the linker (supported by both ld.lld and LLVMgold.so) when -flto/-flto=thin is specified Reviewed By: aeubanks, rsmith Differential Revision: https://reviews.llvm.org/D92915	2020-12-09 16:57:36 -08:00
Reid Kleckner	df282215d4	Don't setup inalloca for swiftcc on i686-windows-msvc Swiftcall does its own target-independent argument type classification, since it is not designed to be ABI compatible with anything local on the target that isn't LLVM-based. This means it never uses inalloca. However, we have duplicate logic for checking for inalloca parameters that runs before call argument setup. This logic needs to know ahead of time if inalloca will be used later, and we can't move the CGFunctionInfo calculation earlier. This change gets the calling convention from either the FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so skips the stackbase setup. Depends on D92883. Differential Revision: https://reviews.llvm.org/D92944	2020-12-09 11:08:48 -08:00
Reid Kleckner	d7098ff29c	De-templatify EmitCallArgs argument type checking, NFCI This template exists to abstract over FunctionPrototype and ObjCMethodDecl, which have similar APIs for storing parameter types. In place of a template, use a PointerUnion with two cases to handle this. Hopefully this improves readability, since the type of the prototype is easier to discover. This allows me to sink this code, which is mostly assertions, out of the header file and into the cpp file. I can also simplify the overloaded methods for computing isGenericMethod, and get rid of the second EmitCallArgs overload. Differential Revision: https://reviews.llvm.org/D92883	2020-12-09 11:08:00 -08:00
Yuanfang Chen	1821265db6	[Time-report] Add a flag -ftime-report={per-pass,per-pass-run} to control the pass timing aggregation Currently, -ftime-report + new pass manager emits one line of report for each pass run. This potentially causes huge output text especially with regular LTO or large single file (Observed in private tests and was reported in D51276). The behaviour of -ftime-report + legacy pass manager is emitting one line of report for each pass object which has relatively reasonable text output size. This patch adds a flag `-ftime-report=` to control time report aggregation for new pass manager. The flag is for new pass manager only. Using it with legacy pass manager gives an error. It is a driver and cc1 flag. `per-pass` is the new default so `-ftime-report` is aliased to `-ftime-report=per-pass`. Before this patch, functionality-wise `-ftime-report` is aliased to `-ftime-report=per-pass-run`. * Adds a boolean variable TimePassesHandler::PerRun to control per-pass vs per-pass-run. * Adds a new clang CodeGen flag CodeGenOptions::TimePassesPerRun to work with the existing CodeGenOptions::TimePasses. * Remove FrontendOptions::ShowTimers, its uses are replaced by the existing CodeGenOptions::TimePasses. * Remove FrontendTimesIsEnabled (It was introduced in D45619 which was largely reverted.) Differential Revision: https://reviews.llvm.org/D92436	2020-12-08 10:13:19 -08:00
Kevin P. Neal	acd4950d4f	[FPEnv] Correct constrained metadata in fp16-ops-strict.c This test shows we're in some cases not getting strictfp information from the AST. Correct that. Differential Revision: https://reviews.llvm.org/D92596	2020-12-08 10:18:32 -05:00
Tim Northover	c5978f42ec	UBSAN: emit distinctive traps Sometimes people get minimal crash reports after a UBSAN incident. This change tags each trap with an integer representing the kind of failure encountered, which can aid in tracking down the root cause of the problem.	2020-12-08 10:28:26 +00:00
Luís Marques	3af354e863	[Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex Fixes bug 44904. Differential Revision: https://reviews.llvm.org/D91278	2020-12-08 09:19:05 +00:00
Luís Marques	fa8f5bfa4e	[Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct The code seemed not to account for the field 1 offset. Differential Revision: https://reviews.llvm.org/D91270	2020-12-08 09:19:05 +00:00
Vitaly Buka	6e614b0c7e	[NFC][MSan] Round up OffsetPtr in PoisonMembers getFieldOffset(layoutStartOffset) is expected to point to the first trivial field or the one which follows non-trivial. So it must be byte aligned already. However this is not obvious without assumptions about callers. This patch will avoid the need for such assumptions. Depends on D92727. Differential Revision: https://reviews.llvm.org/D92728	2020-12-07 19:57:49 -08:00
Vitaly Buka	3e1cb0db8a	[CodeGen][MSan] Don't use offsets of zero-sized fields Such fields will likely have offset zero making __sanitizer_dtor_callback poisoning wrong regions. E.g. it can poison base class member from derived class constructor. Differential Revision: https://reviews.llvm.org/D92727	2020-12-07 13:37:40 -08:00
Jinsong Ji	b49b8f096c	[PowerPC][Clang] Remove QPX support Clean up QPX code in clang missed in https://reviews.llvm.org/D83915 Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D92329	2020-12-07 10:15:39 -05:00
Vitaly Buka	1f21f6d6a4	[NFC][CodeGen] Simplify SanitizeDtorMembers::Emit	2020-12-05 21:11:27 -08:00
Fangrui Song	1ab9327d1c	[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead. Actually Clang ppc32 does produce a pair of absolute relocations which match GCC. This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).	2020-12-05 00:42:07 -08:00
Nico Weber	0cbf61be8b	[mac/arm] Fix rtti codegen tests when running on an arm mac shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes RTTI objects to be emitted hidden. Update two tests that didn't expect this to happen for the default triple. Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for arm64-apple-macos triples too. Part of PR46644. Differential Revision: https://reviews.llvm.org/D91904	2020-12-03 09:11:03 -05:00
Yaxun (Sam) Liu	3a781b912f	Fix assertion in tryEmitAsConstant due to `cd95338ee3` Need to check if result is LValue before getLValueBase.	2020-12-02 19:10:01 -05:00
Hongtao Yu	24d4291ca7	[CSSPGO] Pseudo probes for function calls. An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work. One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place , leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution. With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. 
Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a callsite probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst. To avoid collision with the baseline AutoFDO in various places that handle DWARF discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reserved for pseudo probe use. Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`. Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91756	2020-12-02 13:45:20 -08:00
jasonliu	a65d8c5d72	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX does not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality routine, because the interoperability is not there yet on AIX. 3. AIX does not use eh_frame sections. Instead, it would use an eh_info section (compact unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
Yaxun (Sam) Liu	cd95338ee3	[CUDA][HIP] Fix capturing reference to host variable In C++ when a reference variable is captured by copy, the lambda is supposed to make a copy of the referenced variable in the captures and refer to the copy in the lambda. Therefore, it is valid to capture a reference to a host global variable in a device lambda since the device lambda will refer to the copy of the host global variable instead of accessing the host global variable directly. However, clang tries to avoid capturing a reference to a host global variable if it determines the use of the reference variable in the lambda function is not odr-use. Clang also tries to emit load of the reference to a global variable as load of the global variable if it determines that the reference variable is a compile-time constant. For a device lambda to capture a reference variable to host global variable and use the captured value, clang needs to be taught that in such cases the use of the reference variable is odr-use and the reference variable is not compile-time constant. This patch fixes that. Differential Revision: https://reviews.llvm.org/D91088	2020-12-02 10:14:46 -05:00
Alex Zinenko	240dd92432	[OpenMPIRBuilder] forward arguments as pointers to outlined function OpenMPIRBuilder::createParallel outlines the body region of the parallel construct into a new function that accepts any value previously defined outside the region as a function argument. This function is called back by OpenMP runtime function __kmpc_fork_call, which expects trailing arguments to be pointers. If the region uses a value that is not of a pointer type, e.g. a struct, the produced code would be invalid. In such cases, make createParallel emit IR that stores the value on stack and pass the pointer to the outlined function instead. The outlined function then loads the value back and uses as normal. Reviewed By: jdoerfert, llitchev Differential Revision: https://reviews.llvm.org/D92189	2020-12-02 14:59:41 +01:00
Qiu Chaofan	3fca6a7844	[Clang] Don't adjust align for IBM extended double Commit `6b1341eb` fixed alignment for 128-bit FP types on PowerPC. However, the quadword alignment adjustment shouldn't be applied to IBM extended double (ppc_fp128 in IR) values. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92278	2020-12-02 17:02:26 +08:00
David Chisnall	d1ed67037d	[GNU ObjC] Fix a regression listing methods twice. Methods synthesized from declared properties were being added to the method lists twice. This came from the change to list them in the class's method list, which missed removing the place in CGObjCGNU that added them again. Reviewed By: lanza Differential Revision: https://reviews.llvm.org/D91874	2020-12-01 09:50:18 +00:00
Leonard Chan	cf8ff75bad	[clang][RelativeVTablesABI] Use dso_local_equivalent rather than emitting stubs Thanks to D77248, we can bypass the use of stubs altogether and use PLT relocations if they are available for the target. LLVM and LLD support the R_AARCH64_PLT32 relocation, so we can also guarantee a static PLT relocation on AArch64. Not emitting these stubs saves a lot of extra binary size. Differential Revision: https://reviews.llvm.org/D83812	2020-11-30 16:02:35 -08:00
Fangrui Song	164410324d	[CodeGen] -fno-delete-null-pointer-checks: change dereferenceable to dereferenceable_or_null After D17993, with -fno-delete-null-pointer-checks we add the dereferenceable attribute to the `this` pointer. We have observed that one internal target which worked before fails even with -fno-delete-null-pointer-checks. Switching to dereferenceable_or_null fixes the problem. dereferenceable currently does not always respect NullPointerIsValid and may imply nonnull and lead to aggressive optimization. The optimization may be related to `CallBase::isReturnNonNull`, `Argument::hasNonNullAttr`, or `Value::getPointerDereferenceableBytes`. See D66664 and D66618 for some discussions. Reviewed By: bkramer, rsmith Differential Revision: https://reviews.llvm.org/D92297	2020-11-30 12:44:35 -08:00
Hongtao Yu	c083fededf	[CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe instrumentation. This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient. The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86502	2020-11-30 10:16:54 -08:00
Kevin P. Neal	abfbc5579b	[FPEnv] clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the non-target specific builtins. Differential Revision: https://reviews.llvm.org/D92122	2020-11-30 11:59:37 -05:00
Zarko Todorovski	ff8e8c1b14	[AIX] Enabling vector type arguments and return for AIX This patch enables vector type arguments on AIX. All non-aggregate Altivec vector types are 16bytes in size and are 16byte aligned. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D92117	2020-11-27 09:55:52 -05:00
Reid Kleckner	1e843a987d	[MS] Add more 128bit cmpxchg intrinsics for AArch64 The MSVC STL requires this on ARM64. Requested in https://llvm.org/pr47099 Depends on D92061 Differential Revision: https://reviews.llvm.org/D92062	2020-11-25 12:07:28 -08:00
Reid Kleckner	3bd0672726	[MS] Fix double evaluation of MSVC builtin arguments This code got quite twisted because we consider some MSVC builtins to be target agnostic, and some to be target specific. Target specific intrinsics have a pattern of doing up-front argument evaluation, while general intrinsics do not evaluate their arguments up front. As we tried to share codepaths between the target-specific and target-agnostic handling, we ended up doing double evaluation. Instead, have each target handle MSVC intrinsics consistently before up front argument evaluation. This requires passing less data around and is more consistent with target independent intrinsic handling. See D50979 for past examples of this bug. I noticed this while looking into adding some more intrinsics. Differential Revision: https://reviews.llvm.org/D92061	2020-11-25 11:55:01 -08:00
Simon Pilgrim	9d996c01aa	TargetInfo.cpp - use castAs<> instead of getAs<> as we dereference the pointer directly. NFCI. castAs<> will assert the correct cast type instead of just returning null, which we then try to dereference immediately.	2020-11-25 11:38:29 +00:00
Simon Pilgrim	eb7ea5aa1a	CGCall.cpp - use castAs<> instead of getAs<> as we dereference the pointer directly. NFCI. castAs<> will assert the correct cast type instead of just returning null, which we then try to dereference immediately in the setUsedBits call.	2020-11-25 11:38:29 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Teresa Johnson	0768b0576a	Avoid redundant work when computing vtable vcall visibility Add a Visited set to avoid repeatedly processing the same base classes in complex class hierarchies. This cut down the compile time of one source file from >12min to ~1min. Differential Revision: https://reviews.llvm.org/D91676	2020-11-24 12:06:24 -08:00
Yaxun (Sam) Liu	cb08558caa	[HIP] Fix regressions due to fp contract change Recently HIP toolchain made a change to use clang instead of opt/llc to do compilation (https://reviews.llvm.org/D81861). The intention is to make HIP toolchain canonical like other toolchains. However, this change introduced an unintentional change regarding backend fp fuse option, which caused regressions in some HIP applications. Basically before the change, HIP toolchain used clang to generate bitcode, then used opt/llc to optimize bitcode and generate ISA. As such, the amdgpu backend took the default fp fuse mode which is 'Standard'. This mode respects the contract flag of fmul/fadd instructions and does not fuse fmul/fadd instructions without the contract flag. However, after the change, HIP toolchain now uses clang to generate IR, do optimization, and generate ISA as one process. Now amdgpu backend fp fuse option is determined by -ffp-contract option, which is 'fast' by default. And this -ffp-contract=fast language option is translated to 'Fast' fp fuse option in backend. Suddenly backend starts to fuse fmul/fadd instructions without contract flag. This causes wrong results for some device library functions, e.g. tan(-1e20), which should return 0.8446, now returns -0.933. What is worse is that since backend with 'Fast' fp fuse option does not respect contract flag, there is no way to use #pragma clang fp contract directive to enforce fp contract requirements. This patch fixes the regression by introducing a new value 'fast-honor-pragmas' for -ffp-contract and using it for HIP by default. 'fast-honor-pragmas' is equivalent to 'fast' in frontend but lets the backend use 'Standard' fp fuse option. 'fast-honor-pragmas' is useful since 'Fast' fp fuse option in backend does not honor contract flag, it is of little use to HIP applications since all code with #pragma STDC FP_CONTRACT or any IR from a source compiled with -ffp-contract=on is broken.
Differential Revision: https://reviews.llvm.org/D90174	2020-11-24 08:10:06 -05:00
Ben Dunbobbin	e42021d5cc	[Clang][-fvisibility-from-dllstorageclass] Set DSO Locality from final visibility Ensure that the DSO Locality of the globals in the IR is derived from their final visibility when using -fvisibility-from-dllstorageclass. To accomplish this we reset the DSO locality of globals (before setting their visibility from their dllstorageclass) at the end of IRGen in Clang. This removes any effects that visibility options or annotations may have had on the DSO locality. The resulting DSO locality of the globals will be pessimistic w.r.t. the normal compiler IRGen. Differential Revision: https://reviews.llvm.org/D91779	2020-11-24 00:32:14 +00:00
Zequan Wu	15a3ae1ab1	[Clang] Add __STDCPP_THREADS__ to standard predefine macros According to https://eel.is/c++draft/cpp.predefined#2.6, `__STDCPP_THREADS__` is a predefined macro. Differential Revision: https://reviews.llvm.org/D91747	2020-11-22 16:05:53 -08:00
Alexey Bataev	c964f30814	[OPENMP]Use the real pointer value as base, not indexed value. After fix for PR48174 the base pointer for pointer-based array-sections/array-subscripts will be emitted as `&ptr[idx]`, but actually it should be just `ptr`, i.e. the address stored in the pointer to point correctly to the beginning of the array. Currently it may lead to a crash in the runtime. Differential Revision: https://reviews.llvm.org/D91805	2020-11-20 11:34:14 -08:00
Alex Richardson	51e09e1d5a	[AMDGPU] Set the default globals address space to 1 This will ensure that passes that add new global variables will create them in address space 1 once the passes have been updated to no longer default to the implicit address space zero. This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't already present to ensure bitcode backwards compatibility. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D84345	2020-11-20 15:46:53 +00:00
Arthur Eubanks	72badbcdcc	[NPM] Move more O0 pass building into PassBuilder This moves handling of alwaysinline, coroutines, matrix lowering, PGO, and LTO-required passes into PassBuilder. Much of this is replicated between Clang and opt. Other out-of-tree users also replicate some of this, such as Rust [1] replicating the alwaysinline, LTO, and PGO passes. The LTO passes are also now run in build(Thin)LTOPreLinkDefaultPipeline() since they are semantically required for (Thin)LTO. [1]: `f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D91585	2020-11-19 11:22:23 -08:00
Joseph Huber	da8bec47ab	[OpenMP] Add Location Fields to Libomptarget Runtime for Debugging Summary: Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values. Reviewers: jdoerfert grokos Differential Revision: https://reviews.llvm.org/D87946	2020-11-19 12:01:53 -05:00
Xiangling Liao	17497ec514	[AIX][FE] Support constructor/destructor attribute Support attribute((constructor)) and attribute((destructor)) on AIX Differential Revision: https://reviews.llvm.org/D90892	2020-11-19 09:24:01 -05:00
Qiu Chaofan	6b1341eb5b	[PowerPC] [Clang] Fix alignment of 128-bit float types According to ELF v2 ABI, both IEEE 128-bit and IBM extended floating point variables should be quad-word (16 bytes) aligned. Previously, only vector types are considered aligned as quad-word on PowerPC. This patch will fix incorrectness of IEEE 128-bit float argument in va_arg cases. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D91596	2020-11-19 14:22:14 +08:00
Joseph Huber	97e55cfef5	[OpenMP] Add Passing in Original Declaration Names To Mapper API Summary: This patch adds support for passing in the original declaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information is passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;" Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D89802	2020-11-18 15:28:39 -05:00
Alexey Bataev	5ba324ccad	[OPENMP]Fix PR48174: compile-time crash with target enter data on a global struct. The compiler should treat array subscript with base pointer as a first pointer in complex data, it is used only for member expression with base pointer. Differential Revision: https://reviews.llvm.org/D91660	2020-11-18 07:48:58 -08:00
Florian Hahn	680931af27	[Matrix] Adjust matrix pointer type for inline asm arguments. Matrix types in memory are represented as arrays, but accessed through vector pointers, with the alignment specified on the access operation. For inline assembly, update pointer arguments to use vector pointers. Otherwise there will be a mis-match if the matrix is also an input-argument which is represented as vector. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D91631	2020-11-18 11:44:11 +00:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Alexey Bataev	5292187a2d	[OPENMP]Fix PR48076: mapping of data member pointer. If the data member pointer is mapped, the compiler tries to optimize the mapping of such data by discarding explicit mapping flags and trying to emit combined data instead. In some cases, this optimization is not quite correctly implemented and it leads to a program crash at the runtime. Instead, if the data member is mapped, just emit it as is and do not emit combined mapping flags for it. Differential Revision: https://reviews.llvm.org/D91552	2020-11-17 07:18:32 -08:00
Yaxun (Sam) Liu	3f4b5893ef	[AMDGPU] Add option -munsafe-fp-atomics Add an option -munsafe-fp-atomics for AMDGPU target. When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics" to any functions for amdgpu target. This allows amdgpu backend to use unsafe fp atomic instructions in these functions. Differential Revision: https://reviews.llvm.org/D91546	2020-11-16 21:52:12 -05:00
CJ Johnson	69cd776e1e	[CodeGen] Apply 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments. * Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments * Gates 'nonnull' on -f(no-)delete-null-pointer-checks * Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to explicitly test the behavior of this change * Refactors hundreds of over-constrained clang tests to permit these attributes, where needed * Updates Clang12 patch notes mentioning this change Reviewed-by: rsmith, jdoerfert Differential Revision: https://reviews.llvm.org/D17993	2020-11-16 17:39:17 -08:00
Florian Hahn	ca2e7e5999	[IRGen] Add !annotation metadata for auto-init stores. This patch updates Clang's IRGen to add !annotation nodes with an "auto-init" annotation to all stores for auto-initialization. As discussed in 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) this allows using optimization remarks to track down where auto-init code was inserted (and not removed by optimizations). There are a few cases in the tests where !annotation gets dropped by optimizations. Those optimizations will be updated in subsequent patches. This patch is based on a patch by Francis Visoiu Mistrih. Reviewed By: thegameg, paquette Differential Revision: https://reviews.llvm.org/D91417	2020-11-16 10:37:02 +00:00
Richard Smith	dc58cd1480	PR48169: Fix crash generating debug info for class non-type template parameters. It appears that LLVM isn't able to generate a DW_AT_const_value for a constant of class type, but if it could, we'd match GCC's debug info in this case, and in the interim we no longer crash.	2020-11-15 17:43:26 -08:00
Roman Lebedev	6861d938e5	Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the implementation is known-broken for certain inputs, the bugreport was up for a significant amount of time, and there has been no activity to address it. Therefore, just completely rip out all of misexpect handling. I suspect, fixing it requires redesigning the internals of MD_misexpect. Should anyone commit to fixing the implementation problem, starting from clean slate may be better anyways. This reverts commit `7bdad08429`, and some of its follow-ups, that don't stand on their own.	2020-11-14 13:12:38 +03:00
Mehdi Amini	42e88bd6b1	Replace sequences of v.push_back(v[i]); v.erase(&v[i]); with std::rotate (NFC) The code has a few sequences that looked like: Ops.push_back(Ops[0]); Ops.erase(Ops.begin()); And are equivalent to: std::rotate(Ops.begin(), Ops.begin() + 1, Ops.end()); The latter has the advantage of never reallocating the vector, which would be a bug in the original code as push_back would read from the memory it deallocated.	2020-11-14 00:55:33 +00:00
Arthur Eubanks	6e098189db	[DFSan][NewPM] Handle dfsan under NPM Make it required. Since it's a module pass, optnone won't test it, so extend the clang test to also use opt-bisect now that it's supported. 14/16 check-dfsan tests failed with NPM enabled, now all pass. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D91385	2020-11-13 13:41:38 -08:00
Heejin Ahn	902ea588ea	[WebAssembly] Rename atomic.notify and *.atomic.wait - atomic.notify -> memory.atomic.notify - i32.atomic.wait -> memory.atomic.wait32 - i64.atomic.wait -> memory.atomic.wait64 See https://github.com/WebAssembly/threads/pull/149. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D91447	2020-11-13 12:04:48 -08:00
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
Michael Liao	8920ef06a1	[hip] Remove the coercion on aggregate kernel arguments. - If an aggregate argument is indirectly accessed within kernels, direct passing results in unpromotable `alloca`, which degrades performance significantly. InferAddrSpace pass is enhanced in [D91121](https://reviews.llvm.org/D91121) to take the assumption that generic pointers loaded from the constant memory could be regarded as global ones. The need for the coercion on aggregate arguments is mitigated. Differential Revision: https://reviews.llvm.org/D89980	2020-11-12 21:19:30 -05:00
Arthur Eubanks	3a7b57b7ca	[NFC][NewPM] Reuse PassBuilder callbacks with -O0 This removes lots of duplicated code which was necessary before https://reviews.llvm.org/D89158. Now we can use PassBuilder::runRegisteredEPCallbacks(). This is mostly sanitizers. There is likely more that can be done to simplify, but let's start with this. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90870	2020-11-12 12:42:59 -08:00
Alexey Bataev	3c6b457bee	[OPENMP]Fix PR48076: Check map types array before accessing its front. Need to check if there are map types for the components before trying to access them when trying to modify type mappings for combined partial mappings. Differential Revision: https://reviews.llvm.org/D91370	2020-11-12 12:00:29 -08:00
Hans Wennborg	a088766508	[dllexport] Instantiate default ctor default args for explicit specializations (PR45811) For dllexported default constructors with default arguments, we export default constructor closures which pass in the default args. (See D8331 for a good explanation.) For templates, that means those default args must be instantiated even if the function isn't called. That is done by the InstantiateDefaultCtorDefaultArgs() function, but it wasn't done for explicit specializations, causing asserts (see bug). Differential revision: https://reviews.llvm.org/D91089	2020-11-12 13:29:34 +01:00
Arthur Eubanks	b6ccff3d5f	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Since callbacks may end up not adding passes, we need to check if the pass managers are empty before adding them, so PassManager now has an isEmpty() function. For example, polly adds callbacks but doesn't always add passes in those callbacks, so this is necessary to keep -debug-pass-manager tests' output from changing depending on if polly is enabled or not. Tests are a continuation of those added in https://reviews.llvm.org/D89083. Reviewed By: asbirlea, Meinersbur Differential Revision: https://reviews.llvm.org/D89158	2020-11-11 15:10:27 -08:00
Richard Smith	5f12f4ff90	Suppress printing of inline namespace names in diagnostics by default, except where they are necessary to disambiguate the target. This substantially improves diagnostics from the standard library, which are otherwise full of `::__1::` noise.	2020-11-11 15:05:51 -08:00
Akira Hatanaka	874b0a0b9d	[CodeGen] Mark calls to objc_autorelease as tail This enables a method sending an autorelease message to an object and returning the object in MRR to avoid adding the object to an autorelease pool if a call to objc_retainAutoreleasedReturnValue in the caller function accepts the hand off of the retain count. rdar://problem/50678052 Differential Revision: https://reviews.llvm.org/D91111	2020-11-10 13:48:25 -08:00
Richard Smith	b637148ecb	[c++20] For P0732R2 / P1907R1: Basic code generation and name mangling support for non-type template parameters of class type and template parameter objects. The Itanium side of this follows the approach I proposed in https://github.com/itanium-cxx-abi/cxx-abi/issues/47 on 2020-09-06. The MSVC side of this was determined empirically by observing MSVC's output. Differential Revision: https://reviews.llvm.org/D89998	2020-11-09 22:10:27 -08:00
Michael Kruse	e5dba2d7e5	[OMPIRBuilder] Start 'Create' methods with lower case. NFC. For consistency with the IRBuilder, OpenMPIRBuilder has method names starting with 'Create'. However, the LLVM coding style has method names starting with lower case letters, as all other OpenMPIRBuilder methods already do. The clang-tidy configuration used by Phabricator also warns about the naming violation, adding noise to the reviews. This patch renames all `OpenMPIRBuilder::CreateXYZ` methods to `OpenMPIRBuilder::createXYZ`, and updates all in-tree callers. I tested check-llvm, check-clang, check-mlir and check-flang to ensure that I did not miss a caller. Reviewed By: mehdi_amini, fghanim Differential Revision: https://reviews.llvm.org/D91109	2020-11-09 19:35:11 -06:00
Fangrui Song	e625f9c5d1	-fbasic-block-sections=list=: Suppress output if failed to open the file Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D90815	2020-11-09 09:26:37 -08:00
Atmn Patel	fd3cad7a60	[clang] Fix ForStmt mustprogress handling D86841 had an error where for statements with no conditional were required to make progress. This is not true; this patch removes that line and adds regression tests. Differential Revision: https://reviews.llvm.org/D91075	2020-11-09 11:38:06 -05:00
Tyker	d093401a26	[NFC] Remove string parameter of annotation attribute from AST childs. this simplifies using annotation attributes when using clang as library	2020-11-09 16:39:59 +01:00
Simon Pilgrim	8930032f53	Don't dereference a dyn_cast<> result - use cast<> instead. NFCI. We were relying on the dyn_cast<> succeeding - better use cast<> and have it assert that it's the correct type than dereference a null result.	2020-11-08 13:06:07 +00:00
Arthur Eubanks	226e179f74	Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0" This reverts commit `ae38540042`. As well as some follow-up test fixes. The original change causes new-pass-manager.ll to fail when polly is enabled.	2020-11-08 00:32:35 -08:00
Fangrui Song	f2e479db92	[OpenMP] Fix -Wmisleading-indentation after D84192	2020-11-06 20:09:43 -08:00
cchen	0cab91140f	[OpenMP5.0] map item can be non-contiguous for target update In order not to modify the `tgt_target_data_update` information but still be able to pass the extra information for non-contiguous map item (offset, count, and stride for each dimension), this patch overloads `arg` when the maptype is set as `OMP_MAP_DESCRIPTOR`. The original `arg` is for passing the pointer information, however, the overloaded `arg` is an array of descriptor_dim: struct descriptor_dim { int64_t offset; int64_t count; int64_t stride; }; and the array size is the same as dimension size. In addition, since we have count and stride information in descriptor_dim, we can replace/overload the `arg_size` parameter by using dimension size. For supporting `stride` in array section, we use a dummy dimension in descriptor to store the unit size. The formula for counting the stride in dimension D_n: `unit size * (D_0 * D_1 ... * D_n-1) * D_n.stride`. Demonstrate how it works: ``` double arr[3][4][5]; D0: { offset = 0, count = 1, stride = 8 } // offset, count, dimension size always be 0, 1, 1 for this extra dimension, stride is the unit size D1: { offset = 0, count = 2, stride = 8 * 1 * 2 = 16 } // stride = unit size * (product of dimension size of D0) * D1.stride = 4 * 1 * 2 = 8 D2: { offset = 2, count = 2, stride = 8 * (1 * 5) * 1 = 40 } // stride = unit size * (product of dimension size of D0, D1) * D2.stride = 4 * 5 * 1 = 20 D3: { offset = 0, count = 2, stride = 8 * (1 * 5 * 4) * 2 = 320 } // stride = unit size * (product of dimension size of D0, D1, D2) * D3.stride = 4 * 25 * 2 = 200 // X here means we need to offload this data, therefore, runtime will transfer // data from offset 80, 96, 120, 136, 400, 416, 440, 456 // Runtime patch: https://reviews.llvm.org/D82245 // OOOOO OOOOO OOOOO // OOOOO OOOOO OOOOO // XOXOO OOOOO XOXOO // XOXOO OOOOO XOXOO ``` Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D84192	2020-11-06 21:04:37 -06:00
Kevin P. Neal	2069403cdf	[FPEnv] Use strictfp metadata in casting nodes The strictfp metadata was added to the casting AST nodes in D85960, but we aren't using that metadata yet. This patch adds that support. In order to avoid lots of ad-hoc passing around of the strictfp bits I updated the IRBuilder when moving from a function that has the Expr* to a function that lacks it. I believe we should switch to this pattern to keep the strictfp support from being overly invasive. For the purpose of testing that we're picking up the right metadata, I also made my tests use a pragma to make the AST's strictfp metadata not match the global strictfp metadata. This exposes issues that we need to deal with in subsequent patches, and I believe this is the right method for most all of our clang strictfp tests. Differential Revision: https://reviews.llvm.org/D88913	2020-11-06 11:56:12 -05:00
Jan Ole Hüser	d2e7dca5ca	[CodeGen] Fix Bug 47499: __unaligned extension inconsistent behaviour with C and C++ For the language C++ the keyword __unaligned (a Microsoft extension) had no effect on pointers. The reason why there was a difference between C and C++ for the keyword __unaligned: For C, the method getAsCXXRecordDecl() returns nullptr. That guarantees that hasUnaligned() is called. If the language is C++, it is not guaranteed that hasUnaligned() is called and evaluated. Here are some links: The Bug: https://bugs.llvm.org/show_bug.cgi?id=47499 Thread on the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066783.html Diff, that introduced the check hasUnaligned() in getNaturalTypeAlignment(): https://reviews.llvm.org/D30166 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D90630	2020-11-05 12:57:17 -08:00
Arthur Eubanks	ae38540042	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Tests are a continuation of those added in https://reviews.llvm.org/D89083. In order to prevent TargetMachines from adding unnecessary optimization passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be changed to take an OptimizationLevel, but that will be done separately. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89158	2020-11-04 22:27:16 -08:00
Atmn Patel	ac73b73c16	[clang] Add mustprogress and llvm.loop.mustprogress attribute deduction Since C++11, the C++ standard has a forward progress guarantee [intro.progress], so all such functions must have the `mustprogress` requirement. In addition, from C11 and onwards, loops without a non-zero constant conditional or no conditional are also required to make progress (C11 6.8.5p6). This patch implements these attribute deductions so they can be used by the optimization passes. Differential Revision: https://reviews.llvm.org/D86841	2020-11-04 22:03:14 -05:00
Arthur Eubanks	ab0ddbc38a	Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 13:11:40 -08:00
cchen	d0d43b58b1	[OpenMP] target nested `use_device_ptr() if()` and is_device_ptr trigger asserts Clang now asserts for the below case: ``` void clang::CodeGen::CGOpenMPRuntime::createOffloadEntriesAndInfoMetadata(): Assertion `std::get<0>(E) && "All ordered entries must exist!"' failed. ``` The reason why Clang hits the assert is that in `emitTargetDataCalls`, both `BeginThenGen` and `BeginElseGen` call `registerTargetRegionEntryInfo` and try to register the Entry in OffloadEntriesTargetRegion with the same key. If the expression in the if clause is changed to any constant expression, the assert disappears. (https://godbolt.org/z/TW7haj) The assert itself is to prevent the user from accessing elements out of bounds inside `OrderedEntries` in `createOffloadEntriesAndInfoMetadata`. In this patch, I add a check in `registerTargetRegionEntryInfo` to avoid registering the target region more than once. A test case that triggers assert: https://godbolt.org/z/4cnGW8 Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D90704	2020-11-04 12:36:57 -06:00
Qiu Chaofan	7faf62a80b	[Clang] Add more fp128 math library function builtins Since glibc has supported math library functions conforming to IEEE 128-bit floating point types on some platforms (like ppc64le), we can fix clang's math builtins missing this type. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D90593	2020-11-04 17:58:42 +08:00
Baptiste Saleil	daa127d77e	[PowerPC] Add MMA builtin decoding and definitions Add MMA builtin decoding. These builtins use the new PowerPC-specific types __vector_pair and __vector_quad. So to avoid pervasive changes, we use custom type descriptors and custom decoding for these builtins. We also use custom code generation to expand builtin calls with pointers to simpler intrinsic calls with non-pointer types. Differential Revision: https://reviews.llvm.org/D81748	2020-11-03 15:08:46 -06:00
Ben Dunbobbin	7ad6010f58	Fix - [Clang] Add the ability to map DLL storage class to visibility `415f7ee883` had a silly typo introduced when I inlined some code into a loop from its own function. Original commit message: For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-03 19:13:54 +00:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Yaxun (Sam) Liu	abd8cd9199	[CUDA][HIP] Fix linkage for -fgpu-rdc Currently for explicit template function instantiation in CUDA/HIP device compilation clang emits instantiated kernel with external linkage and instantiated device function with internal linkage. This is fine for -fno-gpu-rdc since there is only one TU. However this causes duplicate symbols for kernels for -fgpu-rdc if the same instantiation happen in multiple TU. Or missing symbols if a device function calls an explicitly instantiated template function in a different TU. To make explicit template function instantiation work for -fgpu-rdc we need to follow the C++ linkage paradigm, i.e. use weak_odr linkage. Differential Revision: https://reviews.llvm.org/D90311	2020-11-03 08:07:19 -05:00
Alex Lorenz	701456b523	[darwin] add support for __isPlatformVersionAtLeast check for if (@available) The __isPlatformVersionAtLeast routine is an implementation of `if (@available)` check that uses the _availability_version_check API on Darwin that's supported on macOS 10.15, iOS 13, tvOS 13 and watchOS 6. Differential Revision: https://reviews.llvm.org/D90367	2020-11-02 16:28:09 -08:00
Ben Dunbobbin	ae9231ca2a	Reland - [Clang] Add the ability to map DLL storage class to visibility `415f7ee883` had LIT test failures on any build where the clang executable was not called "clang". I have adjusted the LIT CHECKs to remove the binary name to fix this. Original commit message: For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-02 23:24:49 +00:00
Ben Dunbobbin	5024d3aa18	Revert "[Clang] Add the ability to map DLL storage class to visibility" This reverts commit `415f7ee883`. The added tests were failing on the build bots!	2020-11-02 17:33:54 +00:00
Ben Dunbobbin	415f7ee883	[Clang] Add the ability to map DLL storage class to visibility For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-02 17:08:23 +00:00
Teresa Johnson	0949f96dc6	[MemProf] Pass down memory profile name with optional path from clang Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module flag metadata. LLVM in turn provides this name to the runtime via the new __memprof_profile_filename variable. Additionally, always pass a default filename (in $cwd if a directory name is not specified via the = form of the option). This is also consistent with the behavior of the PGO instrumentation. Since the memory profiles will generally be fairly large, it doesn't make sense to dump them to stderr. Also, importantly, the memory profiles will eventually be dumped in a compact binary format, which is another reason why it does not make sense to send these to stderr by default. Change the existing memprof tests to specify log_path=stderr when that was being relied on. Depends on D89086. Differential Revision: https://reviews.llvm.org/D89087	2020-11-01 17:38:23 -08:00
Mark de Wever	b46fddf75f	[CodeGen] Implement [[likely]] and [[unlikely]] for while and for loop. The attribute has no effect on a do statement since the path of execution will always include its substatement. It adds a diagnostic when the attribute is used on an infinite while loop since the codegen omits the branch here. Since the likelihood attributes have no effect on a do statement no diagnostic will be issued for do [[unlikely]] {...} while(0); Differential Revision: https://reviews.llvm.org/D89899	2020-10-31 17:51:29 +01:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Thomas Lively	a787e09779	[WebAssembly] Prototype i64x2.bitmask As proposed in https://github.com/WebAssembly/simd/pull/368. Differential Revision: https://reviews.llvm.org/D90514	2020-10-30 17:23:30 -07:00
Thomas Lively	0a512a555a	[WebAssembly] Prototype i64x2.eq As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still in the prototyping phase, it is only accessible via a target builtin function and a target intrinsic. Depends on D90504. Differential Revision: https://reviews.llvm.org/D90508	2020-10-30 16:38:15 -07:00
Thomas Lively	1cb0b56607	[WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u} As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these instructions are available only via builtin functions and intrinsics while they are in the prototyping stage. Differential Revision: https://reviews.llvm.org/D90504	2020-10-30 15:44:04 -07:00
Arthur Eubanks	2e31727a88	[NFC] Clean up PassBuilder Make DebugLogging a member variable so that users of PassBuilder don't need to pass it around so much. Move call to TargetMachine::registerPassBuilderCallbacks() within PassBuilder so users don't need to remember to call it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90437	2020-10-30 10:03:59 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
David Sherwood	cea69fa4dc	[SVE] Add fatal error for unnamed SVE variadic arguments We don't currently support passing unnamed variadic SVE arguments so I've added a fatal error if we hit such cases to prevent any silent ABI issues in future. Differential Revision: https://reviews.llvm.org/D90230	2020-10-30 13:35:47 +00:00
Liu, Chen3	00090a2b82	Support complex target features combinations This patch is mainly doing two things: 1. Adding support for parentheses, making the combination of target features more diverse; 2. Making the priority of ',' higher than that of '|' by default. So I need to make some changes to the PTX builtin functions. Differential Revision: https://reviews.llvm.org/D89184	2020-10-30 10:32:53 +08:00
Thomas Lively	be6f50798e	[WebAssembly] Implement SIMD signselect instructions As proposed in https://github.com/WebAssembly/simd/pull/124, using the opcodes adopted by V8 in https://chromium-review.googlesource.com/c/v8/v8/+/2486235/2/src/wasm/wasm-opcodes.h. Uses new builtin functions and a new target intrinsic exclusively to ensure that the new instructions are only emitted when a user explicitly opts in to using them since they are still in the prototyping and evaluation phase. Differential Revision: https://reviews.llvm.org/D90357	2020-10-29 11:06:20 -07:00
Jon Chesterfield	dee7704829	[AMDGPU] Add __builtin_amdgcn_grid_size Similar to D76772, loads the data from the dispatch pointer. Marked invariant. Patch also updates the openmp devicertl to use this builtin. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D90251	2020-10-29 16:25:13 +00:00
Amy Huang	7669f3c0f6	Recommit "[CodeView] Emit static data members as S_CONSTANTs." We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. This changes CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072 This reverts commit `504615353f`.	2020-10-28 16:35:59 -07:00
Shilei Tian	0661328d7e	[Clang][OpenMP] Added the support for target data nowait Previously we added support for target nowait, but target data nowait has not been supported yet. In this patch, target data nowait will also be wrapped into a task. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90099	2020-10-28 15:53:30 -04:00
Mircea Trofin	6fa35541a0	[NFC][ThinLTO] Change command line passing to EmbedBitcodeInModule Changing to pass by ref - less null checks to worry about. Differential Revision: https://reviews.llvm.org/D90330	2020-10-28 12:33:39 -07:00
Baptiste Saleil	40dd4d5233	[Clang][PowerPC] Add __vector_pair and __vector_quad types Define the __vector_pair and __vector_quad types that are used to manipulate the new accumulator registers introduced by MMA on PowerPC. Because these two types are specific to PowerPC, they are defined in a separate new file so it will be easier to add other PowerPC specific types if we need to in the future. Differential Revision: https://reviews.llvm.org/D81508	2020-10-28 13:19:20 -05:00
Thomas Lively	5b464f2aa5	[WebAssembly] Fix incorrectly named target builtin Rename __builtin_wasm_q15mulr_saturate_s_i8x16 to __builtin_wasm_q15mulr_saturate_s_i16x8, fixing the implied lane interpretation of the result.	2020-10-28 10:22:43 -07:00
Heejin Ahn	98941279b9	[WebAssembly] Clang-format builtins generation (NFC) Differential Revision: https://reviews.llvm.org/D90294	2020-10-28 10:01:21 -07:00
Thomas Lively	31e944556f	[WebAssembly] Prototype extending multiplication SIMD instructions As proposed in https://github.com/WebAssembly/simd/pull/376. This commit implements new builtin functions and intrinsics for these instructions, but does not yet add them to wasm_simd128.h because they have not yet been merged to the proposal. These are the first instructions with opcodes greater than 0xff, so this commit updates the MC layer and disassembler to handle that correctly. Differential Revision: https://reviews.llvm.org/D90253	2020-10-28 09:38:59 -07:00
JonChesterfield	5d02ca49a2	[libomptarget][nvptx] Undef, weak shared variables Shared variables on nvptx, and LDS on amdgcn, are uninitialized at the start of kernel execution. Therefore create the variables with undef instead of zeros, motivated in part by the amdgcn back end rejecting LDS+initializer. Common is zero initialized, which seems incompatible with shared. Thus change them to weak, following the direction of https://reviews.llvm.org/rG7b3eabdcd215 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90248	2020-10-28 14:25:36 +00:00
Benjamin Kramer	90a9f97cbd	[openmp] Use front() instead of *begin() to not hide bugs when CurTypes is empty.	2020-10-28 13:58:23 +01:00
Benjamin Kramer	207cf71fa9	Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API" This reverts commit `d981c7b758` and `a87d7b3d44`. Test fails under msan.	2020-10-28 13:58:14 +01:00
Joseph Huber	a87d7b3d44	[OpenMP] Add Passing in Original Declaration Names To Mapper API Summary: This patch adds support for passing in the original declaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information is passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;". See clang/test/OpenMP/target_map_names.cpp for an example of the generated output for a given map clause. Reviewers: jdoervert Differential Revision: https://reviews.llvm.org/D89802	2020-10-27 16:09:19 -04:00
Amy Huang	504615353f	Revert "[CodeView] Emit static data members as S_CONSTANTs." Seems like there's an assert in here that we shouldn't be running into. This reverts commit `515973222e`.	2020-10-27 11:29:58 -07:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Shilei Tian	d38788b357	[Clang][OpenMP] Avoid unnecessary privatization of mapper array when there is no user defined mapper In the current implementation, if it requires an outer task, the mapper array will be privatized no matter whether it has a mapper. In fact, when there is no mapper, the mapper array only contains a number of nullptr entries. In libomptarget, the use of the mapper array is `if (mappers_array && mappers_array[i])`, which means we can directly set the mapper array to nullptr if there is no mapper. This can avoid an unnecessary data copy. In this patch, the data privatization will not be emitted if the mapper array is nullptr. When it comes to emitting the task body, the nullptr will be used directly. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90101	2020-10-27 00:02:32 -04:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Shilei Tian	e20d64c3d9	[Clang][OpenMP] Fixed an issue of segmentation fault when using target nowait The implementation of target nowait just wraps the target region into a task. The essential four parameters (base ptr, ptr, size, mapper) are taken as firstprivate such that they will be copied to the private location. When there is no user-defined mapper, the mapper variable will be nullptr. However, it will still be copied to the corresponding place. Therefore, a memcpy will be generated and the source pointer will be nullptr, causing a segmentation fault. The root cause is that when calling `emitOffloadingArraysArgument`, the last argument `Options` has a field about whether it requires a task. It only takes the depend clause into account. In this patch, the nowait clause is also included. There are two things that will be done in other patches: 1. target data nowait has not been supported yet. D90099 added the support. 2. When there is no mapper, the mapper array can be nullptr no matter whether it requires an outer task or not. It can avoid an unnecessary data copy. This is an optimization that is covered in D90101. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89844	2020-10-26 22:33:22 -04:00
Amy Huang	515973222e	[CodeView] Emit static data members as S_CONSTANTs. We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. I changed CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072	2020-10-26 15:30:35 -07:00
Duncan P. N. Exon Smith	d4c667c9af	Avoid unnecessary uses of `MDNode::getTemporary`, NFC This is a long-delayed follow-up to `5e5b85098d`. `TempMDNode` includes a bunch of machinery for RAUW, and should only be used when necessary. RAUW wasn't being used in any of these cases... it was just a placeholder for a self-reference. Where the real node was using `MDNode::getDistinct`, just replace the temporary argument with `nullptr`. Where the real node was using `MDNode::get`, the `replaceOperandWith` call was "promoting" the node to a distinct one implicitly due to self-reference detection in `MDNode::handleChangedOperand`. The `TempMDNode` was serving a purpose by delaying uniquing, but it's way simpler to just call `MDNode::getDistinct` in the first place. Note that using a self-reference at all in these places is a hold-over from before `distinct` metadata existed. It was an old trick to create distinct nodes. It would be intrusive to change, including bitcode upgrades, etc., and it's harmless so I'm not sure there's much value in removing it from existing schemas. After this commit it still has a tiny memory cost (in the extra metadata operand) but no more overhead in construction. Differential Revision: https://reviews.llvm.org/D90079	2020-10-26 17:03:25 -04:00
Zequan Wu	e56e7bd469	Revert "Revert "Ensure that checkInitIsICE is called exactly once for every variable"" This reverts commit `a2ac64dd90`.	2020-10-26 12:08:57 -07:00
Zequan Wu	a2ac64dd90	Revert "Ensure that checkInitIsICE is called exactly once for every variable" This is causing `Assertion Result && "Could not evaluate expression"' failed` at https://bugs.chromium.org/p/chromium/issues/detail?id=1142009 This reverts commit `76c0092665`.	2020-10-26 11:59:55 -07:00
Nick Desaulniers	c8f84bd094	[Clang][CodeGen] fix failed assertion Ensure we can emit symbol aliases via function attribute even when function signatures contain incomplete types. Via bugreport: https://reviews.llvm.org/D66492#2350947 Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D90073	2020-10-26 11:37:55 -07:00
Tyker	d3205bbca3	[Annotation] Allows annotation to carry some additional constant arguments. This allows using annotations in many more contexts than is currently possible, especially when annotating with templates or constexpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D88645	2020-10-26 10:50:05 +01:00
Melanie Blower	2e204e2391	[clang] Enable support for #pragma STDC FENV_ACCESS Reviewers: rjmccall, rsmith, sepavloff Differential Revision: https://reviews.llvm.org/D87528	2020-10-25 06:46:25 -07:00
Akira Hatanaka	71e1a56de1	[CodeGen] Emit destructor calls to destruct non-trivial C struct temporaries created by conditional and assignment operators rdar://problem/64989559 Differential Revision: https://reviews.llvm.org/D83448	2020-10-23 14:46:17 -07:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. Code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) commonly wants to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, that is generally less ergonomic than a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Venkataramanan Kumar	57cdc52c4d	Initial support for vectorization using Libmvec (GLIBC vector math library) Differential Revision: https://reviews.llvm.org/D88154	2020-10-22 16:01:39 -04:00
Xiang1 Zhang	7c3fea7721	[X86] Support customizing stack protector guard Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D88631	2020-10-22 10:08:14 +08:00
Richard Smith	ba4768c966	[c++20] For P0732R2 / P1907R1: Basic frontend support for class types as non-type template parameters. Create a unique TemplateParamObjectDecl instance for each such value, representing the globally unique template parameter object to which the template parameter refers. No IR generation support yet; that will follow in a separate patch.	2020-10-21 13:21:41 -07:00
Jonas Paulsson	42a82862b6	Reapply "[clang] Improve handling of physical registers in inline assembly operands." Earlyclobbers are now excepted from this change (original commit: `c78da03`). Review: Ulrich Weigand, Nick Desaulniers Differential Revision: https://reviews.llvm.org/D87279	2020-10-21 10:53:40 +02:00
Mikhail Maltsev	7819411837	[clang] Use SourceLocation as key in hash maps, NFCI The patch adjusts the existing `llvm::DenseMap<unsigned, T>` and `llvm::DenseSet<unsigned>` objects that store source locations, so that they use `SourceLocation` directly instead of `unsigned`. This patch relies on the `DenseMapInfo` trait added in D89719. It also replaces the construction of `SourceLocation` objects from the constants -1 and -2 with calls to the trait's methods `getEmptyKey` and `getTombstoneKey` where appropriate. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D69840	2020-10-20 16:24:09 +01:00
Richard Smith	08c8d5bc51	Properly track whether a variable is constant-initialized. This fixes miscomputation of __builtin_is_constant_evaluated in the initializer of a variable that's not usable in constant expressions, but is readable when constant-folding. If evaluation of a constant initializer fails, we throw away the evaluated result instead of keeping it as a non-constant-initializer value for the variable, because it might not be a correct value. To avoid regressions for initializers that are foldable but not formally constant initializers, we now try constant-evaluating some globals in C++ twice: once to check for a constant initializer (in a mode where is_constant_evaluated returns true) and again to determine the runtime value if the initializer is not a constant initializer.	2020-10-19 23:59:11 -07:00
Fangrui Song	0ab222e7d7	[gcov] Delete CC1 option -test-coverage The name is unfortunate because it is similar to the driver option -ftest-coverage. It turns out aside from one occurrence in a test, this option is not used.	2020-10-19 21:48:51 -07:00
Richard Smith	3692d20d2b	Refactor tracking of constant initializers for variables. Instead of framing the interface around whether the variable is an ICE (which is only interesting in C++98), primarily track whether the initializer is a constant initializer (which is interesting in all C++ language modes). No functionality change intended.	2020-10-19 21:31:19 -07:00
Richard Smith	76c0092665	Ensure that checkInitIsICE is called exactly once for every variable for which it matters. This is a step towards separating checking for a constant initializer (in which std::is_constant_evaluated returns true) and any other evaluation of a variable initializer (in which it returns false).	2020-10-19 19:04:04 -07:00
Douglas Yung	774ab60125	Add option to use older clang ABI behavior when passing certain union types as function arguments Recently, D78699 (commit `26cfb6e562`) fixed clang's behavior with respect to passing a union type through a register to correctly follow the ABI. However, this is an ABI breaking change with earlier versions of the clang compiler, so we should add an -fclang-abi-compat option to address this. Additionally, the PS4 ABI requires the older behavior, so that is added as well. This change adds a Ver11 value to the ClangABI enum; when it is set (or the target is the PS4 triple), we skip the ABI fix introduced in D78699. Differential Revision: https://reviews.llvm.org/D89747	2020-10-19 18:17:34 -07:00
Simon Pilgrim	7fe7d9b130	Fix MSVC "not all control paths return a value" warning. NFCI.	2020-10-19 11:48:31 +01:00
Hans Wennborg	0628bea513	Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting" This broke Chromium's PGO build, it seems because hot-cold-splitting got turned on unintentionally. See comment on the code review for repro etc. > This patch adds -f[no-]split-cold-code CC1 options to clang. This allows > the splitting pass to be toggled on/off. The current method of passing > `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose > correctly (say, with `-O0` or `-Oz`). > > To implement the -fsplit-cold-code option, an attribute is applied to > functions to indicate that they may be considered for splitting. This > removes some complexity from the old/new PM pipeline builders, and > behaves as expected when LTO is enabled. > > Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> > Differential Revision: https://reviews.llvm.org/D57265 > Reviewed By: Aditya Kumar, Vedant Kumar > Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar This reverts commit `273c299d5d`.	2020-10-19 12:31:14 +02:00
Mark de Wever	389c8d5b20	[NFC] Make non-modifying members const. Implementing the likelihood attributes for the iteration statements adds a new helper function. This function can't be const qualified since these non-modifying members aren't const qualified.	2020-10-18 18:50:21 +02:00
Mark de Wever	2bcda6bb28	[Sema, CodeGen] Implement [[likely]] and [[unlikely]] in SwitchStmt This implements the likelihood attribute for the switch statement. Based on the discussion in D85091 and D86559 it only handles the attribute when placed on the case labels or the default labels. It also marks the likelihood attribute as feature complete. There are more QoI patches in the pipeline. Differential Revision: https://reviews.llvm.org/D89210	2020-10-18 13:48:42 +02:00
Richard Smith	d4aac67859	Make the check for whether we should memset(0) an aggregate initialization a little smarter. Look through casts that preserve zero-ness when determining if an initializer is zero, so that we can handle cases like an {0} initializer whose corresponding field is a type other than 'int'.	2020-10-16 16:48:22 -07:00
Richard Smith	48c70c1664	Extend memset-to-zero optimization to C++11 aggregate functional casts Aggr{...}. We previously missed these cases due to not stepping over the additional AST nodes representing their syntactic form.	2020-10-16 13:21:08 -07:00
Matt Arsenault	0a7cd99a70	Reapply "OpaquePtr: Add type to sret attribute" This reverts commit `eb9f7c28e5`. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.	2020-10-16 11:05:02 -04:00
Caroline Concatto	e8d9ee9c7c	[SVE][CodeGen]Use getFixedSize() function for TypeSize comparison in clang This patch makes sure that the instance of TypeSize comparison operator is done with a fixed type size. Differential Revision: https://reviews.llvm.org/D89312	2020-10-16 10:56:39 +01:00
Vedant Kumar	273c299d5d	[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting This patch adds -f[no-]split-cold-code CC1 options to clang. This allows the splitting pass to be toggled on/off. The current method of passing `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose correctly (say, with `-O0` or `-Oz`). To implement the -fsplit-cold-code option, an attribute is applied to functions to indicate that they may be considered for splitting. This removes some complexity from the old/new PM pipeline builders, and behaves as expected when LTO is enabled. Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> Differential Revision: https://reviews.llvm.org/D57265 Reviewed By: Aditya Kumar, Vedant Kumar Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar	2020-10-15 23:13:33 +00:00
Fangrui Song	5a338599fb	[CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with specialized emitting (e.g. memcpy, various math functions) still do not work. This patch makes these functions work for `asm()` and `#pragma redefine_extname`. glibc uses `asm()` to redirect internal libc function calls to hidden aliases. Limitation: such a function is a builtin in clang, but will not be recognized as a libcall in optimization passes because Clang does not annotate the renamed function as a libcall. In GCC -O1 or above, `abs` can be optimized out but we can't. Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example: double sin(double x) asm("real_sin"); double f(double d) { return __builtin_sin(d); } --- According to @rsmith, the following three statements cannot be simultaneously true: (1) The frontend function foo has known, builtin semantics X. (2) The symbol foo has known, builtin semantics X. (3) It's not correct to lower a call to the frontend function foo to the symbol foo. People do want (1) (if it is profitable to expand a memcpy, do it). This also means that people do not want to add -fno-builtin-memcpy. People do want (3): that is why they use asm("__GI_memcpy") in the first place. So unfortunately we make a compromise by not refuting (2) (see the limitation above). For most libcalls, there is a small loss because compilers don't synthesize them. For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make the assembly level redirection. (Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable). Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D88712	2020-10-15 15:14:38 -07:00
Reid Kleckner	5fbab4025e	[MS] Apply `inreg` to AArch64 sret parms on instance methods The documentation rules indicate that instance methods should return large, trivially copyable aggregates via X1/X0 and not X8 as is normally done when returning such structs from free functions: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values Fixes PR47836, a bug in the initial implementation of these rules. I tried to simplify the logic a bit as well while I'm here. Differential Revision: https://reviews.llvm.org/D89362	2020-10-15 14:54:42 -07:00
Yaxun (Sam) Liu	e384e94fbe	Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024" This reverts commit `187658b8a6` due to AMDGPU backend issues.	2020-10-15 17:25:55 -04:00
Leonard Chan	79829a4704	Revert "[clang] Add -fc++-abi= flag for specifying which C++ ABI to use" This reverts commits `683b308c07` and `8487bfd4e9`. We will go for a more restricted approach that does not give freedom to everyone to change ABIs on whichever platform. See the discussion on https://reviews.llvm.org/D85802.	2020-10-15 14:24:38 -07:00
Thomas Lively	1992e30c2d	[WebAssembly] Prototype i8x16.popcnt As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target builtin and intrinsic rather than normal codegen patterns to make the instruction opt-in until it is merged to the proposal and stabilized in engines. Differential Revision: https://reviews.llvm.org/D89446	2020-10-15 21:18:22 +00:00
Stanislav Mekhanoshin	d1beb95d12	[AMDGPU] gfx1032 target Differential Revision: https://reviews.llvm.org/D89487	2020-10-15 12:41:18 -07:00
Thomas Lively	3f738d1f5e	Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c8385a352` with a typing fix to an instruction selection pattern.	2020-10-15 19:32:34 +00:00
Thomas Lively	7c8385a352	Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c6bfd90ab`.	2020-10-15 15:49:36 +00:00
Thomas Lively	7c6bfd90ab	[WebAssembly] v128.load{8,16,32,64}_lane instructions Prototype the newly proposed load_lane instructions, as specified in https://github.com/WebAssembly/simd/pull/350. Since these instructions are not available to origin trial users on Chrome stable, make them opt-in by only selecting them from intrinsics rather than normal ISel patterns. Since we only need rough prototypes to measure performance right now, this commit does not implement all the load and store patterns that would be necessary to make full use of the offset immediate. However, the full suite of offset tests is included to make it easy to track improvements in the future. Since these are the first instructions to have a memarg immediate as well as an additional immediate, the disassembler needed some additional hacks to be able to parse them correctly. Making that code more principled is left as future work. Differential Revision: https://reviews.llvm.org/D89366	2020-10-15 15:33:10 +00:00
Caroline Concatto	145e44bb18	[SVE]Fix implicit TypeSize casts in EmitCheckValue Using TypeSize::getFixedSize() instead of relying upon the implicit TypeSize->uint64_cast as the type is always fixed width. Differential Revision: https://reviews.llvm.org/D89313	2020-10-15 13:25:46 +01:00
Simon Pilgrim	d7fa9030d4	[CodeGen][X86] Emit fshl/fshr ir intrinsics for shiftleft128/shiftright128 ms intrinsics Now that funnel shift handling is pretty good, we can use the intrinsics directly and avoid a lot of zext/trunc issues. https://godbolt.org/z/YqhnnM Differential Revision: https://reviews.llvm.org/D89405	2020-10-15 10:22:41 +01:00
Duncan P. N. Exon Smith	dde4e0318c	clang/CodeGen: Stop using SourceManager::getBuffer, NFC Update `clang/lib/CodeGen` to use a `MemoryBufferRef` from `getBufferOrNone` instead of `MemoryBuffer*` from `getBuffer`. No functionality change here. Differential Revision: https://reviews.llvm.org/D89411	2020-10-14 23:32:43 -04:00
Leonard Chan	683b308c07	[clang] Add -fc++-abi= flag for specifying which C++ ABI to use This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html. The goal is to add a way to override the default target C++ ABI through a compiler flag. This makes it easier to test and transition between different C++ ABIs through compile flags rather than build flags. In this patch: - Store `-fc++-abi=` in a LangOpt. This isn't stored in a CodeGenOpt because there are instances outside of codegen where Clang needs to know what the ABI is (particularly through ASTContext::createCXXABI), and we should be able to override the target default if the flag is provided at that point. - Expose the existing ABIs in TargetCXXABI as values that can be passed through this flag. - Create a .def file for these ABIs to make it easier to check flag values. - Add an error for diagnosing bad ABI flag values. Differential Revision: https://reviews.llvm.org/D85802	2020-10-14 12:31:21 -07:00
Jonas Paulsson	625fa47617	Revert "[clang] Improve handling of physical registers in inline assembly operands." This reverts commit `c78da03778`. Temporarily reverted due to https://bugs.llvm.org/show_bug.cgi?id=47837.	2020-10-14 08:42:51 +02:00
Jonas Paulsson	c78da03778	[clang] Improve handling of physical registers in inline assembly operands. Change EmitAsmStmt() to - Not tie physregs with the "+r" constraint, but instead add the hard register as an input constraint. This makes "+r" and "=r":"r" look the same in the output. Background: Macro intensive user code may contain inline assembly statements with multiple operands constrained to the same physreg. Such a case (with the operand constraints "+r" : "r") currently triggers the TwoAddressInstructionPass assertion against any extra use of a tied register. Furthermore, TwoAddress will insert a COPY to that physreg even though isel has already done so (for the non-tied use), which may lead to a second redundant instruction currently. A simple fix for this is to not emit tied physreg uses in the first place for the "+r" constraint, which is what this patch does. - Give an error on multiple outputs to the same physical register. This should be reported and this is also what GCC does. Review: Ulrich Weigand, Aaron Ballman, Jennifer Yu, Craig Topper Differential Revision: https://reviews.llvm.org/D87279	2020-10-13 15:09:52 +02:00
Bevin Hansson	101309fe04	[AST] Change return type of getTypeInfoInChars to a proper struct instead of std::pair. Followup to D85191. This changes getTypeInfoInChars to return a TypeInfoChars struct instead of a std::pair of CharUnits. This lets the interface match getTypeInfo more closely. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86447	2020-10-13 13:26:56 +02:00
Bevin Hansson	9fa7f48459	[Fixed Point] Add fixed-point to floating point cast types and consteval. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D86631	2020-10-13 13:26:56 +02:00
Ties Stuij	208987844f	[ARM] Follow AAPCS standard for volatile bit-fields access width This patch resumes the work of D16586. According to the AAPCS, volatile bit-fields should be accessed using containers of the width of their declarative type. In such case: ``` struct S1 { short a : 1; } ``` should be accessed using loads and stores of the width (sizeof(short)), where now the compiler does only load the minimum required width (char in this case). However, as discussed in D16586, that could overwrite non-volatile bit-fields, which conflicted with C and C++ object models by creating data race conditions that are not part of the bit-field, e.g. ``` struct S2 { short a; int b : 16; } ``` Accessing `S2.b` would also access `S2.a`. The AAPCS Release 2020Q2 (https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=) section 8.1 Data Types, page 36, "Volatile bit-fields - preserving number and width of container accesses" has been updated to avoid conflict with the C++ Memory Model. Now it reads in the note: ``` This ABI does not place any restrictions on the access widths of bit-fields where the container overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field placed between two other bit-fields. This is because the C/C++ memory model defines these as being separate memory locations, which can be accessed by two threads simultaneously. For this reason, compilers must be permitted to use a narrower memory access width (including splitting the access into multiple instructions) to avoid writing to a different memory location. For example, in struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };, writes to a or b must not overwrite each other.
``` I've updated the patch D16586 to follow such behavior by verifying that we only change volatile bit-field access when: - it won't overlap with any other non-bit-field member - we only access memory inside the bounds of the record - avoid overlapping zero-length bit-fields. Regarding the number of memory accesses, that should be preserved, that will be implemented by D67399. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D72932	2020-10-13 10:31:48 +01:00
Simon Pilgrim	6c23cbc560	[X86] Convert integer _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences. The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions. Differential Revision: https://reviews.llvm.org/D87604	2020-10-13 09:28:39 +01:00
Richard Smith	913f600566	Canonicalize declaration pointers when forming APValues. References to different declarations of the same entity aren't different values, so shouldn't have different representations. Recommit of `e6393ee813`, most recently reverted in `9a33f027ac` due to a bug caused by ObjCInterfaceDecls not propagating availability attributes along their redeclaration chains; that bug was fixed in `e2d4174e9c`.	2020-10-12 19:32:57 -07:00
Arthur Eubanks	9a33f027ac	Revert "Canonicalize declaration pointers when forming APValues." This reverts commit `9dcd96f728`. See https://crbug.com/1134762.	2020-10-12 12:37:24 -07:00
Tim Renouf	666ef0db20	[AMDGPU] Add gfx602, gfx705, gfx805 targets At AMD, in an internal audit of our code, we found some corner cases where we were not quite differentiating targets enough for some old hardware. This commit is part of fixing that by adding three new targets: * The "Oland" and "Hainan" variants of gfx601 are now split out into gfx602. LLPC (in the GPUOpen driver) and other front-ends could use that to avoid using the shaderZExport workaround on gfx602. * One variant of gfx703 is now split out into gfx705. LLPC and other front-ends could use that to avoid using the shaderSpiCsRegAllocFragmentation workaround on gfx705. * The "TongaPro" variant of gfx802 is now split out into gfx805. TongaPro has a faster 64-bit shift than its former friends in gfx802, and a subtarget feature could be set up for that to take advantage of it. This commit does not make that change; it just adds the target. V2: Add clang changes. Put TargetParser list in order. V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order, so fix the GPUKind order. Differential Revision: https://reviews.llvm.org/D88916 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-10-10 17:22:22 +01:00
Thomas Lively	d8f58bf53a	[WebAssembly] Prototype i16x8.q15mulr_sat_s This saturating, rounding, Q-format multiplication instruction is proposed in https://github.com/WebAssembly/simd/pull/365. Differential Revision: https://reviews.llvm.org/D88968	2020-10-09 21:17:53 +00:00
Liu, Chen3	26cfb6e562	[X86] Passing union type through register For example: union M256 { double d; __m256 m; }; extern void foo1(union M256 A); union M256 m1; void test() { foo1(m1); } clang will pass m1 through stack which does not follow the ABI. Differential Revision: https://reviews.llvm.org/D78699	2020-10-09 11:24:29 +08:00
Alexandre Ganea	66face6aa0	Re-land [DebugInfo] Add debug location to stubs generated by CGDeclCXX and mark them as artificial Previously, when clang was compiled with -DLLVM_ENABLE_ASSERTIONS=ON, the added tests were displaying: inlinable function call in a function with debug info must have a !dbg location call void @"??1?$c@UB@@@@QEAA@XZ"(%struct.c* @"?f@?1??d@@YAPEAU?$c@UB@@@@XZ@4U2@A") fatal error: error in backend: Broken module found, compilation aborted! Stack dump: 0. Program arguments: <f:\svn\buildninja\bin\clang -cc1 -emit-llvm debug-info-no-location.cpp> -gcodeview -debug-info-kind=limited 1. <eof> parser at end of file 2. Per-function optimization Fixes PR43012 Differential Revision: https://reviews.llvm.org/D66328	2020-10-08 20:49:17 -04:00
Arthur Eubanks	afff74e5c2	[HWAsan][NewPM] Handle hwasan like other sanitizers Move it as an EP callback (-O[123]) or in addSanitizersAtO0. This makes it not run in ThinLTO pre-link (like the other sanitizers), so don't check LTO runs in hwasan-new-pm.c. Changing its position also seems to change the generated IR. I think we just need to make sure the pass runs. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D88936	2020-10-08 14:43:21 -07:00
Joseph Huber	3cc1f1fc1d	[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def Summary: Replace the OpenMP Runtime Library functions used in CGOpenMPRuntimeGPU for OpenMP device code generation with ones in OMPKinds.def and use OMPIRBuilder for generating runtime calls. This allows us to consolidate more OpenMP code generation into the OMPIRBuilder. Future additions to the GPU runtime functions should now go in OMPKinds.def Reviewers: jdoerfert Subscribers: aaron.ballman cfe-commits guansong llvm-commits sstefan1 yaxunl Tags: #OpenMP #LLVM #clang Differential Revision: https://reviews.llvm.org/D88430	2020-10-08 14:00:22 -04:00
diggerlin	92bca12843	[AIX] add new option -mignore-xcoff-visibility SUMMARY: In the IBM compiler xlclang, there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html. We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option, llvm does not emit any visibility attribute to the ASM or XCOFF object file. The option only works on the AIX OS; on other, non-AIX OSes, using the option reports an unsupported option error. In AIX OS: 1.1 The option -mignore-xcoff-visibility is enabled by default if neither -fvisibility=* nor -mignore-xcoff-visibility is given explicitly in the clang command. 1.2 If -fvisibility=* is given explicitly but -mignore-xcoff-visibility is not, visibility attributes are generated. 1.3 If both -fvisibility=* and -mignore-xcoff-visibility are given explicitly in the clang command, the option "-mignore-xcoff-visibility" wins and the visibility attribute is not emitted. The option -mignore-xcoff-visibility has no effect on the visibility attribute when compiling with the -emit-llvm option to generate LLVM IR. Reviewer: daltenty,Jason Liu Differential Revision: https://reviews.llvm.org/D87451	2020-10-08 09:34:58 -04:00
Pushpinder Singh	3a12ff0dac	[OpenMP][RTL] Remove dead code RequiresDataSharing was always 0, resulting in dead code in the device runtime library. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D88829	2020-10-06 05:43:47 -04:00
Fangrui Song	a2cc883368	[CUDA] Don't call __cudaRegisterVariable on C++17 inline variables D17779: host-side shadow variables of external declarations of device-side global variables have internal linkage and are referenced by `__cuda_register_globals`. nvcc from CUDA 11 does not allow `__device__ inline` or `__device__ constexpr` (C++17 inline variables) but clang has incorrectly supported them for a while: ``` error: A __device__ variable cannot be marked constexpr error: An inline __device__/__constant__/__managed__ variable must have internal linkage when the program is compiled in whole program mode (-rdc=false) ``` If such a variable (which has a comdat group) is discarded (a copy from another translation unit is prevailing and selected), accessing the variable from outside the section group (`__cuda_register_globals`) is a violation of the ELF specification and will be rejected by linkers: > A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed. As a workaround, don't register such inline variables for now. (If we register the variables in all TUs, we will keep multiple instances of the shadow and break the C++ semantics for inline variables). We should reject such variables in Sema but our internal users need some time to migrate. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D88786	2020-10-05 12:53:59 -07:00
Craig Topper	a02b449bb1	[X86] Sync AESENC/DEC Key Locker builtins with gcc. For the wide builtins, pass a single input and output pointer to the builtins. Emit the GEPs and input loads from CGBuiltin.	2020-10-04 12:09:41 -07:00
Craig Topper	230c57b0bd	[X86] Synchronize the encodekey builtins with gcc. Don't assume void* is 16 byte aligned. We were taking multiple pointer arguments in the builtin. gcc accepts a single void*. The cast from void* to _m128i* caused the IR generation to assume the pointer was aligned. Instead make the builtin take a single void*, emit i8 GEPs to adjust then cast to <2 x i64>* and perform a store with align of 1.	2020-10-04 12:09:35 -07:00
Mark de Wever	1113fbf44c	[CodeGen] Improve likelihood branch weights Bruno De Fraine discovered some issues with D85091. The branch weights generated for `logical not` and `ternary conditional` were wrong. The `logical and` and `logical or` differed from the code generated of `__builtin_predict`. Adjusted the generated code for the likelihood to match `__builtin_predict`. The patch is based on Bruno's suggestions. Differential Revision: https://reviews.llvm.org/D88363	2020-10-04 14:24:27 +02:00
Yaxun (Sam) Liu	cbd420c5ed	[CUDA][HIP] Fix bound arch for offload action for fat binary Currently CUDA/HIP toolchain uses "unknown" as bound arch for offload action for fat binary. This causes -mcpu or -march with "unknown" added in HIPToolChain::TranslateArgs or CUDAToolChain::TranslateArgs. This causes issue for https://reviews.llvm.org/D88377 since HIP toolchain needs to check -mcpu in HIPToolChain::TranslateArgs. The bound arch of offload action for fat binary is not really used, therefore set it to CudaArch::UNUSED. Differential Revision: https://reviews.llvm.org/D88524	2020-10-02 19:05:51 -04:00
Richard Smith	8fb2a235b0	Don't reject calls to MinGW's unusual _setjmp declaration. We now recognize this function as a builtin despite it having an unexpected number of parameters; make sure we don't enforce that it has only 1 argument for its 2 parameters.	2020-10-02 15:12:15 -07:00
Yaxun (Sam) Liu	dc6a0b0ec7	[HIP] Align device binary To facilitate faster loading of device binaries and share them among processes, HIP runtime favors their alignment being 4096 bytes. HIP runtime can load unaligned device binaries, however, aligning them at 4096 bytes results in faster loading and less shared memory usage. This patch adds an option -bundle-align to clang-offload-bundler which allows bundles to be aligned at specified alignment. By default it is 1, which is NFC compared to existing format. This patch then aligns embedded fat binary and device binary inside fat binary at 4096 bytes. It has been verified this change does not cause significant overall file size increase for typical HIP applications (less than 1%). Differential Revision: https://reviews.llvm.org/D88734	2020-10-02 18:10:44 -04:00
Nathan Lanza	14f6bfcb52	[clang] Implement objc_non_runtime_protocol to remove protocol metadata Summary: Motivated by the new objc_direct attribute, this change adds a new attribute that removes metadata from Protocols that the programmer knows isn't going to be used at runtime. We simply have the frontend skip generating any protocol metadata entries (e.g. OBJC_CLASS_NAME, _OBJC_$_PROTOCOL_INSTANCE_METHODS, _OBJC_PROTOCOL, etc) for a protocol marked with `__attribute__((objc_non_runtime_protocol))`. There are a few APIs used to retrieve a protocol at runtime. `@protocol(SomeProtocol)` will now error out if the requested protocol is marked with the attribute. `objc_getProtocol` will return `NULL`, which is consistent with the behavior of a non-existing protocol. Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75574	2020-10-02 17:35:50 -04:00
Michael Liao	8c36eaf037	[clang][opencl][codegen] Remove the insertion of `correctly-rounded-divide-sqrt-fp-math` fn-attr. - `-cl-fp32-correctly-rounded-divide-sqrt` is already handled in a per-instruction manner by annotating the accuracy required. There's no need to add that fn-attr. So far, there's no in-tree backend handling that attr and that OpenCL specific option. - In case that out-of-tree backends are broken, this change could be reverted if those backends could not be fixed. Differential Revision: https://reviews.llvm.org/D88424	2020-10-01 11:07:39 -04:00
Arthur Eubanks	ce5379f0f0	[NPM] Add target specific hook to add passes for New Pass Manager The patch adds a new TargetMachine member "registerPassBuilderCallbacks" for targets to add passes to the pass pipeline using the New Pass Manager (similar to adjustPassManager for the Legacy Pass Manager). Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D88138	2020-09-30 13:29:43 -07:00
Joseph Huber	1b60f63e4f	Revert "[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def" Failing tests on Arm due to the tests automatically populating incompatible pointer width architectures. Reverting until the tests are updated. Failing tests: OpenMP/distribute_parallel_for_num_threads_codegen.cpp OpenMP/distribute_parallel_for_if_codegen.cpp OpenMP/distribute_parallel_for_simd_if_codegen.cpp OpenMP/distribute_parallel_for_simd_num_threads_codegen.cpp OpenMP/target_teams_distribute_parallel_for_if_codegen.cpp OpenMP/target_teams_distribute_parallel_for_simd_if_codegen.cpp OpenMP/teams_distribute_parallel_for_if_codegen.cpp OpenMP/teams_distribute_parallel_for_simd_if_codegen.cpp This reverts commit `90eaedda9b`.	2020-09-30 15:12:21 -04:00
Joseph Huber	90eaedda9b	[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def Summary: Replace the OpenMP Runtime Library functions used in CGOpenMPRuntimeGPU for OpenMP device code generation with ones in OMPKinds.def and use OMPIRBuilder for generating runtime calls. This allows us to consolidate more OpenMP code generation into the OMPIRBuilder. This patch also invalidates specifying target architectures with conflicting pointer sizes. Reviewers: jdoerfert Subscribers: aaron.ballman cfe-commits guansong llvm-commits sstefan1 yaxunl Tags: #OpenMP #Clang #LLVM Differential Revision: https://reviews.llvm.org/D88430	2020-09-30 14:00:01 -04:00
Xiangling Liao	3a7487f903	[FE] Use preferred alignment instead of ABI alignment for complete object when applicable On some targets, preferred alignment is larger than ABI alignment in some cases. For example, on AIX we have special power alignment rules which would cause that. Previously, to support those cases, we added a “PreferredAlignment” field in the `RecordLayout` to store the AIX special alignment values, as the community suggested. However, that patch alone is not enough. There are places in Clang where `PreferredAlignment` should have been used instead of ABI-specified alignment. This patch is aimed at fixing those spots. Differential Revision: https://reviews.llvm.org/D86790	2020-09-30 10:48:28 -04:00
Xiang1 Zhang	413577a879	[X86] Support Intel Key Locker Key Locker provides a mechanism to encrypt and decrypt data with an AES key without having access to the raw key value by converting AES keys into “handles”. These handles can be used to perform the same encryption and decryption operations as the original AES keys, but they only work on the current system and only until they are revoked. If software revokes Key Locker handles (e.g., on a reboot), then any previous handles can no longer be used. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D88398	2020-09-30 18:08:45 +08:00
Amy Huang	5c4fc581d5	[DebugInfo] Add types from constructor homing to the retained types list. Add class types to the retained types list to make sure they don't get dropped if the constructor is optimized out later. Differential Revision: https://reviews.llvm.org/D88522	2020-09-29 17:00:45 -07:00
John McCall	984744a131	Fix a variety of minor issues with ObjC method mangling: - Fix a memory leak accidentally introduced yesterday by using CodeGen's existing mangling context instead of creating a new context afresh. - Move GNU-runtime ObjC method mangling into the AST mangler; this will eventually be necessary to support direct methods there, but is also just the right architecture. - Make the Apple-runtime method mangling work properly when given an interface declaration, fixing a bug (which had solidified into a test) where mangling a category method from the interface could cause it to be mangled as if the category name was a class name. (Category names are namespaced within their class and have no global meaning.) - Fix a code cross-reference in dsymutil. Based on a patch by Ellis Hoag.	2020-09-29 19:51:53 -04:00
Fangrui Song	3681be876f	Add -fprofile-update={atomic,prefer-atomic,single} GCC 7 introduced -fprofile-update={atomic,prefer-atomic} (prefer-atomic is for best efforts (some targets do not support atomics)) to increment counters atomically, which is exactly what we have done with -fprofile-instr-generate (D50867) and -fprofile-arcs (`b5ef137c11`). This patch adds the option to clang to surface the internal options at driver level. GCC 7 also turned on -fprofile-update=prefer-atomic when -pthread is specified, but it has performance regression (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89307). So we don't follow suit. Differential Revision: https://reviews.llvm.org/D87737	2020-09-29 10:43:23 -07:00
Tres Popp	eb9f7c28e5	Revert "OpaquePtr: Add type to sret attribute" This reverts commit `55c4ff91bd`. Issues were introduced as discussed in https://reviews.llvm.org/D88241 where this change made previous bugs in the linker and BitCodeWriter visible.	2020-09-29 10:31:04 +02:00
Ellis Hoag	98ef7e29b0	This reduces code duplication between CGObjCMac.cpp and Mangle.cpp for generating the mangled name of an Objective-C method. This has no intended functionality change. https://reviews.llvm.org/D88329	2020-09-29 02:26:51 -04:00
Yaxun (Sam) Liu	187658b8a6	Recommit "[HIP] Change default --gpu-max-threads-per-block value to 1024" Recommit `04abbb3a78`	2020-09-28 22:43:17 -04:00
Craig Topper	288c5776c9	[X86] Use inlineasm flag output for the _bittest* intrinsics. Instead of explicitly emitting a setc in the inline asm instructions, we can use flag output. This allows the backend to use the flag directly if it is needed by a branch. Previously we needed a test instruction to convert the register back to a flag. If the flag can't be used directly, the backend will emit a setcc. Differential Revision: https://reviews.llvm.org/D87888	2020-09-28 13:33:22 -07:00
Vedant Kumar	06bc685fa2	[ubsan] nullability-arg: Fix crash on C++ member pointers Extend -fsanitize=nullability-arg to handle call sites which accept C++ member pointers. rdar://62476022 Differential Revision: https://reviews.llvm.org/D88336	2020-09-28 09:41:18 -07:00
Michael Liao	5dbf80cad9	[clang][codegen] Annotate `correctly-rounded-divide-sqrt-fp-math` fn-attr for OpenCL only. - `-cl-fp32-correctly-rounded-divide-sqrt` is an OpenCL-specific option and `correctly-rounded-divide-sqrt-fp-math` should be added for OpenCL at most. Differential revision: https://reviews.llvm.org/D88303	2020-09-28 11:40:32 -04:00
David Sherwood	bafdd11326	[SVE] Replace / operator in TypeSize/ElementCount with divideCoefficientBy After some recent upstream discussion we decided that it was best to avoid having the / operator for both ElementCount and TypeSize, since this could give the impression that these classes can be used in the same way as basic integer types. However, division for scalable types is a bit odd because we are only dividing the minimum quantity by a value, as opposed to something like: (MinSize * Vscale) / SomeValue This is why when performing division it's important the caller first establishes whether the operation makes sense, perhaps by calling isKnownMultipleOf() prior to division. The caller must now explicitly call divideCoefficientBy() on the class to perform the operation. Differential Revision: https://reviews.llvm.org/D87700	2020-09-28 08:03:00 +01:00
Richard Smith	9dcd96f728	Canonicalize declaration pointers when forming APValues. References to different declarations of the same entity aren't different values, so shouldn't have different representations. Recommit of `e6393ee813` with fixed handling for weak declarations. We now look for attributes on the most recent declaration when determining whether a declaration is weak. (Second recommit with further fixes for mishandling of weak declarations. Our behavior here is fundamentally unsound -- see PR47663 -- but this approach attempts to not make things worse.)	2020-09-27 19:05:26 -07:00
Shilei Tian	ebb1092a28	[Clang][OpenMP] Added support for nowait target in CodeGen via regular task Previously for nowait target, CG emitted a function call to `__tgt_target_nowait`, etc. However, in OpenMP RTL, these functions just directly call the no-nowait version, which means nowait is not working as expected. The OpenMP specification says a target is actually a target task, which is an untied and detachable task. It is natural to generate a task for a nowait target. However, an OpenMP task must be within a parallel region; otherwise the task will be executed immediately. As a result, if we directly wrap to a regular task, the `target nowait` outside of a parallel region is still a synchronous version. In D77609, I added the support for unshackled task in OpenMP RTL. Basically, an unshackled task is a task that is not bound to any parallel region. So every nowait target will be transformed into an unshackled task. In order to distinguish it from a regular task, a new flag bit is set for an unshackled task. This flag will be used by RTL for later processing. All target tasks are allocated via `__kmpc_omp_target_task_alloc`, and in current `libomptarget`, `__kmpc_omp_target_task_alloc` just calls `__kmpc_omp_task_alloc`. Therefore, we can modify the flag in `__kmpc_omp_target_task_alloc` so that we don't need to modify the FE too much. If users choose to opt out of the feature, they just need to use a RTL w/o support of unshackled threads. As a result, in this patch, the `target nowait` region is simply wrapped into a regular task. Later once we have RTL support for unshackled tasks, the wrapped tasks can be executed by unshackled threads w/o changes in the FE. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D78075	2020-09-25 22:10:36 -04:00
Matt Arsenault	55c4ff91bd	OpaquePtr: Add type to sret attribute Make the corresponding change that was made for byval in `b7141207a4`. Like byval, this requires a bulk update of the test IR tests to include the type before this can be mandatory.	2020-09-25 14:07:30 -04:00
Chris Bowler	f330d9f163	[PPC] [AIX] Implement calling convention IR for C99 complex types on AIX Add AIX calling convention logic to Clang for C99 complex types on AIX Differential Revision: https://reviews.llvm.org/D88130	2020-09-25 07:43:31 -04:00
Momchil Velikov	a88c722e68	[AArch64] PAC/BTI code generation for LLVM generated functions PAC/BTI-related codegen in the AArch64 backend is controlled by a set of LLVM IR function attributes, added to the function by Clang, based on command-line options and GCC-style function attributes. However, functions generated in the LLVM middle end (for example, asan.module.ctor or __llvm_gcov_write_out) do not get any attributes and the backend incorrectly does not do any PAC/BTI code generation. This patch records the default state of PAC/BTI codegen in a set of LLVM IR module-level attributes, based on command-line options: * "sign-return-address", with non-zero value means generate code to sign return addresses (PAC-RET), zero value means disable PAC-RET. * "sign-return-address-all", with non-zero value means enable PAC-RET for all functions, zero value means enable PAC-RET only for functions which spill LR. * "sign-return-address-with-bkey", with non-zero value means use B-key for signing, zero value means use A-key. This set of attributes is always added for AArch64 targets (as opposed, for example, to interpreting a missing attribute as having a value 0) in order to be able to check for conflicts when combining module attributes during LTO. Module-level attributes are overridden by function level attributes. All the decision making about whether or not to generate PAC and/or BTI code is factored out into AArch64FunctionInfo, there shouldn't be any places left, other than AArch64FunctionInfo, which directly examine PAC/BTI attributes, except AArch64AsmPrinter.cpp, which is/will-be handled by a separate patch. Differential Revision: https://reviews.llvm.org/D85649	2020-09-25 11:47:14 +01:00
Ian Levesque	6f7fbdd285	[xray] Function coverage groups Add the ability to selectively instrument a subset of functions by dividing the functions into N logical groups and then selecting a group to cover. By selecting different groups over time you could cover the entire application incrementally with lower overhead than instrumenting the entire application at once. Differential Revision: https://reviews.llvm.org/D87953	2020-09-24 22:09:53 -04:00
Reid Kleckner	ecfc9b9712	[MS] For unknown ISAs, pass non-trivially copyable arguments indirectly Passing them directly is likely to be non-conforming, since it usually involves copying the bytes of the record. For unknown architectures, we don't know what MSVC does or will do, but we should at least try to conform as well as we can.	2020-09-24 16:29:48 -07:00
Reid Kleckner	b8a50e9207	[MS] Simplify rules for passing C++ records Regardless of the target architecture, we should always use the C rules (RAA_Default) for records that "canBePassedInRegisters". Those are trivially copyable things, and things marked with [[trivial_abi]]. This should be NFC, although it changes where the final decision about x86_32 overaligned records is made. The current x86_32 C rules say that overaligned things are passed indirectly, so there is no functional difference.	2020-09-24 16:29:47 -07:00
Amy Huang	c8df781e54	[DebugInfo] Fix bug in constructor homing with classes with trivial constructors. This changes the code to avoid using constructor homing for aggregate classes and classes with trivial default constructors, instead of trying to loop through the constructors. Differential Revision: https://reviews.llvm.org/D87808	2020-09-24 14:43:48 -07:00
Erich Keane	f8a92adfa2	Remove dead branch identified by @rsmith on post-commit for D88236	2020-09-24 13:05:15 -07:00
Erich Keane	606a734755	[PR47636] Fix tryEmitPrivate to handle non-constantarraytypes As mentioned in the bug report, tryEmitPrivate chokes on the MaterializeTemporaryExpr in the reproducers, since it assumes that if there are elements, then it must be a ConstantArrayType. However, the MaterializeTemporaryExpr (which matches exactly the AST when it is NOT a global/static) has an incomplete array type. This changes the section where the number-of-elements is non-zero to properly handle non-CAT types by just extracting it as an array type (since all we needed was the element type out of it).	2020-09-24 12:09:22 -07:00
Amy Kwan	6b136b19cb	[Power10] Implement custom codegen for the vec_replace_elt and vec_replace_unaligned builtins. This patch implements custom codegen for the vec_replace_elt and vec_replace_unaligned builtins. These builtins map to the @llvm.ppc.altivec.vinsw and @llvm.ppc.altivec.vinsd intrinsics depending on the arguments. The main motivation for doing custom codegen for these intrinsics is that there are float and double versions of the builtin. Normally, converting the float to an integer would be done via fptoui in the IR. This is incorrect as fptoui truncates the value and we must ensure the value is not truncated. Therefore, we provide custom codegen to utilize bitcast instead as bitcasts do not truncate. Differential Revision: https://reviews.llvm.org/D83500	2020-09-23 22:55:25 -05:00
Craig Topper	d9717d8ee7	[X86] Add a memory clobber to the bittest intrinsic inline asm. Get default clobbers from the target I believe the inline asm emitted here should have a memory clobber since it writes to memory. It was also missing the dirflag clobber that we use by default along with flags and fpsr. To avoid missing defaults in the future, get the default list from the target Differential Revision: https://reviews.llvm.org/D88121	2020-09-23 14:54:39 -07:00
Amy Kwan	2e7117f847	[PowerPC] Implement the 128-bit vec_[all\|any]_[eq \| ne \| lt \| gt \| le \| ge] builtins in Clang/LLVM This patch implements the vec_[all\|any]_[eq \| ne \| lt \| gt \| le \| ge] builtins for vector signed/unsigned __int128. Differential Revision: https://reviews.llvm.org/D87910	2020-09-23 16:49:40 -04:00
Stanislav Mekhanoshin	59691dc874	[AMDGPU] Make ds fp atomics overloadable Differential Revision: https://reviews.llvm.org/D87947	2020-09-23 11:39:50 -07:00
Sriraman Tallam	7d0bbe4090	Re-apply https://reviews.llvm.org/D87921 , was reverted to triage a PPC bot failure. D87921 was reverted in commit `b89059a313` as it was causing an unknown llvm PPC bot failure. Reapplying the patch after confirming that this is not responsible. Build bot failure: https://reviews.llvm.org/D87921#2286644 which caused the revert. The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled. Fixed the placement of the MPM.addpass for UniqueInternalLinkageNames to make it work correctly with -O2 and new pass manager. Updated the tests to explicitly check O0 and O1. Differential Revision: https://reviews.llvm.org/D87921	2020-09-23 10:28:40 -07:00
Yaxun (Sam) Liu	301e23305d	[CUDA][HIP] Fix static device var used by host code only A static device variable may be accessed in host code through cudaMemCpyFromSymbol etc. Currently clang does not emit the static device variable if it is only referenced by host code, which causes host code to fail at run time. This patch fixes that. Differential Revision: https://reviews.llvm.org/D88115	2020-09-23 08:18:19 -04:00
Mircea Trofin	cf112382dd	[ThinLTO] Option to bypass function importing. This completes the circle, complementing -lto-embed-bitcode (specifically, post-merge-pre-opt). Using -thinlto-assume-merged skips function importing. The index file is still needed for the other data it contains. Differential Revision: https://reviews.llvm.org/D87949	2020-09-22 13:12:11 -07:00
Sriraman Tallam	b89059a313	Revert "The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled." This reverts commit `6950db36d3`.	2020-09-22 12:32:43 -07:00
Zequan Wu	9caa3fbe03	[Coverage] Add empty line regions to SkippedRegions Differential Revision: https://reviews.llvm.org/D84988	2020-09-21 12:42:53 -07:00
Reid Kleckner	3b3a165485	[MS] On x86_32, pass overaligned, non-copyable arguments indirectly This updates the C++ ABI argument classification code to use the logic from D72114, fixing an ABI incompatibility with MSVC. Part of PR44395. Differential Revision: https://reviews.llvm.org/D87923	2020-09-21 11:49:17 -07:00
Sriraman Tallam	6950db36d3	The wrong placement of add pass with optimizations led to -funique-internal-linkage-names being disabled. Fixed the placement of the MPM.addpass for UniqueInternalLinkageNames to make it work correctly with -O2 and new pass manager. Updated the tests to explicitly check O0 and O2. Previously, the addPass was placed before BackendUtil.cpp#L1373 which is wrong as MPM gets assigned at this point and any additions to the pass vector before this is wrong. This change just moves it after MPM is assigned and places it at a point where O0 and O0+ can share it. Differential Revision: https://reviews.llvm.org/D87921	2020-09-21 10:00:12 -07:00
Alexey Bataev	d5ce8233bf	[OpenMP 5.0] Fix user-defined mapper privatization in tasks This patch fixes the problem that the user-defined mapper array is not correctly privatized inside a task. This problem causes openmp/libomptarget/test/offloading/target_depend_nowait.cpp to fail. Differential Revision: https://reviews.llvm.org/D84470	2020-09-17 11:21:10 -04:00
Michael Liao	4d4f092283	[clang][codegen] Skip adding default function attributes on intrinsics. - After loading builtin bitcode for linking, skip adding default function attributes on LLVM intrinsics as their attributes are well-defined and retrieved directly from internal definitions. Adding extra attributes on intrinsics produces inconsistent results when `-save-temps` is present. It also makes a few optimizations conservative. Differential Revision: https://reviews.llvm.org/D87761	2020-09-16 14:10:05 -04:00
Sam McCall	f5c7102dbc	Update dead links to Itanium and ARM ABIs. NFC	2020-09-16 13:42:01 +02:00
Simon Pilgrim	4abb5cd839	CGBlocks.cpp - assert non-null CGF pointer. NFCI. Fixes static analyzer warning.	2020-09-16 12:30:24 +01:00
Mircea Trofin	61fc10d6a5	[ThinLTO] add post-thinlto-merge option to -lto-embed-bitcode This will embed bitcode after (Thin)LTO merge, but before optimizations. In the case the thinlto backend is called from clang, the .llvmcmd section is also produced. Doing so in the case where the caller is the linker doesn't yet have a motivation, and would require plumbing through command line args. Differential Revision: https://reviews.llvm.org/D87636	2020-09-15 15:56:11 -07:00
Alexey Bataev	9e3842d603	[OPENMP]Fix codegen for is_device_ptr component, captured by reference. The component needs to be mapped as TO instead of the literal, because a reference to the component must be passed if the pointer is overaligned. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D84887	2020-09-15 17:21:38 -04:00
Snehasish Kumar	f1a3ab9044	[clang] Add a command line flag for the Machine Function Splitter. This patch adds a command line flag for the machine function splitter (added in rG94faadaca4e1). -fsplit-machine-functions Split machine functions using profile information (x86 ELF). On other targets an error is emitted. If profile information is not provided a warning is emitted notifying the user that profile information is required. Differential Revision: https://reviews.llvm.org/D87047	2020-09-15 12:41:58 -07:00
Zequan Wu	f975ae4867	[CodeGen][typeid] Emit typeinfo directly if type is known at compile-time Differential Revision: https://reviews.llvm.org/D87425	2020-09-15 12:15:47 -07:00
Alexey Bataev	738bab743b	[OPENMP]Add support for allocate vars in untied tasks. Local vars marked with pragma allocate must be allocated by a call to the runtime function and cannot be allocated like other local variables. Instead, we allocate space for the pointer in the private record and store the address returned by the kmpc_alloc call in this pointer. So, for untied tasks ``` #pragma omp task untied { S s; #pragma omp allocate(s) allocator(allocator) s = x; } ``` compiler generates something like this: ``` struct task_with_privates { S ptr; }; void entry(task_with_privates p) { S s = p->s; switch(partid) { case 1: p->s = (S)kmpc_alloc(); kmpc_omp_task(); br exit; case 2: s = x; kmpc_omp_task(); br exit; case 2: ~S(s); kmpc_free((void)s); br exit; } exit: } ``` Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D86558	2020-09-15 13:39:14 -04:00
Simon Pilgrim	9eab73fa17	[X86] Update SSE/AVX integer MINMAX intrinsics to emit llvm.smax.* etc. (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm smax/smin/umax/umin intrinsics. This patch updates the SSE/AVX integer MINMAX intrinsics to emit the generic equivalents instead of the icmp+select code pattern. Differential Revision: https://reviews.llvm.org/D87603	2020-09-15 11:19:08 +01:00
Teresa Johnson	226d80ebe2	[MemProf] Rename HeapProfiler to MemProfiler for consistency This is consistent with the clang option added in `7ed8124d46`, and the comments on the runtime patch in D87120. Differential Revision: https://reviews.llvm.org/D87622	2020-09-14 13:14:57 -07:00
Simon Pilgrim	3b7708e2de	Assert we've found the size of each (non-overlapping) structure. NFCI. Fixes clang static analyzer warning.	2020-09-14 16:10:52 +01:00
Serge Pavlov	f1cd6593da	[AST][FPEnv] Keep FP options in trailing storage of CastExpr This is a recommit of `6c8041aa0f`, reverted in `de044f7562` because of some failures. Original commit message is below. This change allows a CastExpr to have an optional FPOptionsOverride object, stored in trailing storage. Of all cast nodes only ImplicitCastExpr, CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed to have FPOptions. Differential Revision: https://reviews.llvm.org/D85960	2020-09-14 12:15:21 +07:00
Florian Hahn	a874d63344	[Clang] Add option to allow marking pass-by-value args as noalias. After the recent discussion on cfe-dev 'Can indirect class parameters be noalias?' [1], it seems like using noalias is problematic for current C++, but should be allowed for C-only code. This patch introduces a new option to let the user indicate that it is safe to mark indirect class parameters as noalias. Note that this also applies to external callers, e.g. it might not be safe to use this flag for C functions that are called by C++ functions. In targets that allocate indirect arguments in the called function, this enables more aggressive optimizations with respect to memory operations and brings a ~1% - 2% codesize reduction for some programs. [1] : http://lists.llvm.org/pipermail/cfe-dev/2020-July/066353.html Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D85473	2020-09-12 14:56:13 +01:00
Tyker	78de7297ab	Reland [AssumeBundles] Use operand bundles to encode alignment assumptions NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining.	2020-09-12 15:36:06 +02:00
Serge Pavlov	de044f7562	Revert "[AST][FPEnv] Keep FP options in trailing storage of CastExpr" This reverts commit `6c8041aa0f`. It caused some failures on buildbots.	2020-09-12 17:06:42 +07:00
Serge Pavlov	6c8041aa0f	[AST][FPEnv] Keep FP options in trailing storage of CastExpr This change allows a CastExpr to have an optional FPOptionsOverride object, stored in trailing storage. Of all cast nodes only ImplicitCastExpr, CStyleCastExpr, CXXFunctionalCastExpr and CXXStaticCastExpr are allowed to have FPOptions. Differential Revision: https://reviews.llvm.org/D85960	2020-09-12 14:30:44 +07:00
Cullen Rhodes	002f5ab3b1	[clang][aarch64] Fix ILP32 ABI for arm_sve_vector_bits The element types of scalable vectors are defined in terms of stdint types in the ACLE. This patch fixes the mapping to builtin types for the ILP32 ABI when creating VLS types with the arm_sve_vector_bits, where the mapping is as follows: int32_t -> LongTy int64_t -> LongLongTy uint32_t -> UnsignedLongTy uint64_t -> UnsignedLongLongTy This is implemented by leveraging getBuiltinVectorTypeInfo which is target agnostic since it calls ASTContext::getIntTypeForBitwidth for integer types. The element type for svfloat16_t is changed from Float16Ty to HalfTy when creating VLS types since this is what is used elsewhere. For more information, see: https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#types-varying-by-data-model https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-support-for-scalable-vectors Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D87358	2020-09-11 09:46:35 +00:00
Michael Liao	b22d450496	Remove dependency on clangASTMatchers. - It seems no longer required for shared library builds.	2020-09-10 22:17:48 -04:00
Mark de Wever	08196e0b2e	Implements [[likely]] and [[unlikely]] in IfStmt. This is the initial part of the implementation of the C++20 likelihood attributes. It handles the attributes in an if statement. Differential Revision: https://reviews.llvm.org/D85091	2020-09-09 20:48:37 +02:00
Qiu Chaofan	88ff4d2ca1	[PowerPC] Fix STRICT_FRINT/STRICT_FNEARBYINT lowering In the standard C library, both rint and nearbyint return the rounded result in the current rounding mode, but nearbyint never raises the inexact exception. On PowerPC, x(v|s)r(d|s)pic may modify FPSCR XX, raising the inexact exception. So we can't select constrained fnearbyint into xvrdpic. One exception here is xsrqpi, which will not raise the inexact exception, so fnearbyint f128 is okay here. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D87220	2020-09-09 22:40:58 +08:00
Ties Stuij	d6f3f61231	Revert "[ARM] Follow AACPS standard for volatile bit-fields access width" This reverts commit `514df1b2bb`. Some of the buildbots got llvm-lit errors on CodeGen/volatile.c	2020-09-08 18:46:27 +01:00
Ties Stuij	514df1b2bb	[ARM] Follow AACPS standard for volatile bit-fields access width This patch resumes the work of D16586. According to the AAPCS, volatile bit-fields should be accessed using containers of the width of their declarative type. In such case: ``` struct S1 { short a : 1; } ``` should be accessed using load and stores of the width (sizeof(short)), where now the compiler does only load the minimum required width (char in this case). However, as discussed in D16586, that could overwrite non-volatile bit-fields, which conflicted with C and C++ object models by creating data race conditions that are not part of the bit-field, e.g. ``` struct S2 { short a; int b : 16; } ``` Accessing `S2.b` would also access `S2.a`. The AAPCS Release 2020Q2 (https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=) section 8.1 Data Types, page 36, "Volatile bit-fields - preserving number and width of container accesses" has been updated to avoid conflict with the C++ Memory Model. Now it reads in the note: ``` This ABI does not place any restrictions on the access widths of bit-fields where the container overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field placed between two other bit-fields. This is because the C/C++ memory model defines these as being separate memory locations, which can be accessed by two threads simultaneously. For this reason, compilers must be permitted to use a narrower memory access width (including splitting the access into multiple instructions) to avoid writing to a different memory location. For example, in struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };, writes to a or b must not overwrite each other. 
``` Patch D16586 was updated to follow such behavior by verifying that we only change volatile bit-field access when: - it won't overlap with any other non-bit-field member - we only access memory inside the bounds of the record - avoid overlapping zero-length bit-fields. Regarding the number of memory accesses, that should be preserved, that will be implemented by D67399. Differential Revision: https://reviews.llvm.org/D72932 The following people contributed to this patch: - Diogo Sampaio - Ties Stuij	2020-09-08 17:49:49 +01:00
Simon Pilgrim	58970eb7d1	[OpenMP] Fix typo in CodeGenFunction::EmitOMPWorksharingLoop (PR46412) Fixes issue noticed by static analysis where we have a copy+paste typo, testing ScheduleKind.M1 twice instead of ScheduleKind.M2. Differential Revision: https://reviews.llvm.org/D87250	2020-09-08 11:59:38 +01:00
Simon Pilgrim	2853ae3c1b	[X86] Update SSE/AVX ABS intrinsics to emit llvm.abs.* (PR46851) We're now getting close to having the necessary analysis/combines etc. for the new generic llvm.abs.* intrinsics. This patch updates the SSE/AVX ABS vector intrinsics to emit the generic equivalents instead of the icmp+sub+select code pattern. Differential Revision: https://reviews.llvm.org/D87101	2020-09-07 13:54:12 +01:00
Simon Pilgrim	a8a91533dd	[X86] Replace EmitX86AddSubSatExpr with EmitX86BinaryIntrinsic generic helper. NFCI. Feed the Intrinsic::ID value directly instead of via the IsSigned/IsAddition bool flags.	2020-09-07 13:33:48 +01:00
Eduardo Caldas	1a7a2cd747	[Ignore Expressions][NFC] Refactor to better use `IgnoreExpr.h` and nits This change groups * Rename: `ignoreParenBaseCasts` -> `IgnoreParenBaseCasts` for uniformity * Rename: `IgnoreConversionOperator` -> `IgnoreConversionOperatorSingleStep` for uniformity * Inline `IgnoreNoopCastsSingleStep` into a lambda inside `IgnoreNoopCasts` * Refactor `IgnoreUnlessSpelledInSource` to make adequate use of `IgnoreExprNodes` Differential Revision: https://reviews.llvm.org/D86880	2020-09-07 09:32:30 +00:00
Amy Huang	aaf1a96408	[DebugInfo] Add size to class declarations in debug info. This adds the size to forward declared class DITypes, if the size is known. Fixes an issue where we determine whether to emit fragments based on the type size, so fragments would sometimes be incorrectly emitted if there was no size. Bug: https://bugs.llvm.org/show_bug.cgi?id=47338 Differential Revision: https://reviews.llvm.org/D87062	2020-09-03 15:42:27 -07:00
Yaxun (Sam) Liu	62dbb7e54c	Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024" Temporarily revert commit `04abbb3a78` due to regressions in some HIP apps caused by backend issues revealed by this change. Will re-commit it when backend issues are fixed.	2020-09-02 16:12:28 -04:00
Erik Pilkington	2d11ae0a40	Fix a -Wparenthesis warning in `8ff44e644b`, NFC	2020-09-02 15:01:54 -04:00
Erik Pilkington	8ff44e644b	[IRGen] Fix an assert when __attribute__((used)) is used on an ObjC method This assert doesn't really make sense for functions in general, since they start life as declarations, and there isn't really any reason to require them to be defined before attributes are applied to them. rdar://67895846	2020-09-02 12:19:11 -04:00
Erik Pilkington	a9a6e62ddf	[CodeGen] Make sure the EH cleanup for block captures is conditional when the block literal is in a conditional context Previously, clang was crashing on the attached test because the EH cleanup for the block capture was incorrectly emitted under the assumption that the expression wasn't conditionally evaluated. This was because before 9a52de00260, pushLifetimeExtendedDestroy was mainly used with C++ automatic lifetime extension, where a conditionally evaluated expression wasn't possible. Now that we're using this path for block captures, we need to handle this case. rdar://66250047 Differential revision: https://reviews.llvm.org/D86854	2020-08-31 10:12:17 -04:00
Arnold Schwaighofer	41634497d4	Teach the swift calling convention about _Atomic types rdar://67351073 Differential Revision: https://reviews.llvm.org/D86218	2020-08-31 07:07:25 -07:00
Fangrui Song	b5ef137c11	[gcov] Increment counters with atomicrmw if -fsanitize=thread Without this patch, `clang --coverage -fsanitize=thread` may fail spuriously because non-atomic counter increments can be detected as data races.	2020-08-28 16:32:35 -07:00
Cullen Rhodes	2ddf795e8c	Reland "[CodeGen][AArch64] Support arm_sve_vector_bits attribute" This relands D85743 with a fix for test CodeGen/attr-arm-sve-vector-bits-call.c that disables the new pass manager with '-fno-experimental-new-pass-manager'. Test was failing due to IR differences with the new pass manager which broke the Fuchsia builder [1]. Reverted in `2e7041f`. [1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375 Original summary: This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following: * Implicit casting between VLA <-> VLS types. * Coercion of VLS types in function args/return. * Mangling of VLS types. Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here. Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. 
VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. For example: #if __ARM_FEATURE_SVE_BITS==512 // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE typedef svint32_t vec __attribute__((arm_sve_vector_bits(512))); // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE typedef svbool_t pred __attribute__((arm_sve_vector_bits(512))); #endif The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information. [1] https://developer.arm.com/documentation/100987/latest [2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743	2020-08-28 15:57:09 +00:00
David Sherwood	f4257c5832	[SVE] Make ElementCount members private This patch changes ElementCount so that the Min and Scalable members are now private and can only be accessed via the get functions getKnownMinValue() and isScalable(). In addition I've added some other member functions for more commonly used operations. Hopefully this makes the class more useful and will reduce the need for calling getKnownMinValue(). Differential Revision: https://reviews.llvm.org/D86065	2020-08-28 14:43:53 +01:00
JF Bastien	82d29b397b	Add an unsigned shift base sanitizer It's not undefined behavior for an unsigned left shift to overflow (i.e. to shift bits out), but it has been the source of bugs and exploits in certain codebases in the past. As we do in other parts of UBSan, this patch adds a dynamic checker which acts beyond UBSan and checks other sources of errors. The option is enabled as part of -fsanitize=integer. The flag is named: -fsanitize=unsigned-shift-base This matches shift-base and shift-exponent flags. <rdar://problem/46129047> Differential Revision: https://reviews.llvm.org/D86000	2020-08-27 19:50:10 -07:00
Cullen Rhodes	2e7041fdc2	Revert "[CodeGen][AArch64] Support arm_sve_vector_bits attribute" Test CodeGen/attr-arm-sve-vector-bits-call.c is failing on some builders [1][2]. Reverting whilst I investigate. [1] http://lab.llvm.org:8011/builders/fuchsia-x86_64-linux/builds/10375 [2] https://luci-milo.appspot.com/p/fuchsia/builders/ci/clang-linux-x64/b8870800848452818112 This reverts commit `42587345a3`.	2020-08-27 21:31:05 +00:00
Craig Topper	17ceda99d3	[CodeGen] Use an AttrBuilder to bulk remove 'target-cpu', 'target-features', and 'tune-cpu' before re-adding in CodeGenModule::setNonAliasAttributes. I think the removeAttributes interface should be faster than calling removeAttribute 3 times.	2020-08-27 12:54:20 -07:00
Mikhail Maltsev	ae1396c7d4	[ARM][BFloat16] Change types of some Arm and AArch64 bf16 intrinsics This patch adjusts the following ARM/AArch64 LLVM IR intrinsics: - neon_bfmmla - neon_bfmlalb - neon_bfmlalt so that they take and return bf16 and float types. Previously these intrinsics used <8 x i8> and <4 x i8> vectors (a rudiment from implementation lacking bf16 IR type). The neon_vbfdot[q] intrinsics are adjusted similarly. This change required some additional selection patterns for vbfdot itself and also for vector shuffles (in a previous patch) because of SelectionDAG transformations kicking in and mangling the original code. This patch makes the generated IR cleaner (less useless bitcasts are produced), but it does not affect the final assembly. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D86146	2020-08-27 18:43:16 +01:00
Teresa Johnson	7ed8124d46	[HeapProf] Clang and LLVM support for heap profiling instrumentation See RFC for background: http://lists.llvm.org/pipermail/llvm-dev/2020-June/142744.html Note that the runtime changes will be sent separately (hopefully this week, need to add some tests). This patch includes the LLVM pass to instrument memory accesses with either inline sequences to increment the access count in the shadow location, or alternatively to call into the runtime. It also changes calls to memset/memcpy/memmove to the equivalent runtime version. The pass is modeled on the address sanitizer pass. The clang changes add the driver option to invoke the new pass, and to link with the upcoming heap profiling runtime libraries. Currently there is no attempt to optimize the instrumentation, e.g. to aggregate updates to the same memory allocation. That will be implemented as follow on work. Differential Revision: https://reviews.llvm.org/D85948	2020-08-27 08:50:35 -07:00
Cullen Rhodes	42587345a3	[CodeGen][AArch64] Support arm_sve_vector_bits attribute This patch implements codegen for the 'arm_sve_vector_bits' type attribute, defined by the Arm C Language Extensions (ACLE) for SVE [1]. The purpose of this attribute is to define vector-length-specific (VLS) versions of existing vector-length-agnostic (VLA) types. VLSTs are represented as VectorType in the AST and fixed-length vectors in the IR everywhere except in function args/return. Implemented in this patch is codegen support for the following: * Implicit casting between VLA <-> VLS types. * Coercion of VLS types in function args/return. * Mangling of VLS types. Casting is handled by the CK_BitCast operation, which has been extended to support the two new vector kinds for fixed-length SVE predicate and data vectors, where the cast is implemented through memory rather than a bitcast which is unsupported. Implementing this as a normal bitcast would require relaxing checks in LLVM to allow bitcasting between scalable and fixed types. Another option was adding target-specific intrinsics, although codegen support would need to be added for these intrinsics. Given this, casting through memory seemed like the best approach as it's supported today and existing optimisations may remove unnecessary loads/stores, although there is room for improvement here. Coercion of VLSTs in function args/return from fixed to scalable is implemented through the AArch64 ABI in TargetInfo. The VLA and VLS types are defined by the ACLE to map to the same machine-level SVE vectors. VLS types are mangled in the same way as: __SVE_VLS<typename, unsigned> where the first argument is the underlying variable-length type and the second argument is the SVE vector length in bits. 
For example: #if __ARM_FEATURE_SVE_BITS==512 // Mangled as 9__SVE_VLSIu11__SVInt32_tLj512EE typedef svint32_t vec __attribute__((arm_sve_vector_bits(512))); // Mangled as 9__SVE_VLSIu10__SVBool_tLj512EE typedef svbool_t pred __attribute__((arm_sve_vector_bits(512))); #endif The latest ACLE specification (00bet5) does not contain details of this mangling scheme, it will be specified in the next revision. The mangling scheme is otherwise defined in the appendices to the Procedure Call Standard for the Arm Architecture, see [2] for more information. [1] https://developer.arm.com/documentation/100987/latest [2] https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#appendix-c-mangling Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D85743	2020-08-27 15:11:58 +00:00
Sander de Smalen	4e9b66de3f	[AArch64][SVE] Add missing debug info for ACLE types. This patch adds type information for SVE ACLE vector types, by describing them as vectors, with a lower bound of 0, and an upper bound described by a DWARF expression using the AArch64 Vector Granule register (VG), which contains the runtime multiple of 64bit granules in an SVE vector. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86101	2020-08-27 10:56:42 +01:00
Christopher Tetreault	19e883fc59	[SVE] Remove calls to VectorType::getNumElements from clang Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D82582	2020-08-26 11:12:26 -07:00
Zequan Wu	9500a72091	Revert "[Coverage] Enable emitting gap area between macros" This reverts commit `a31c89c1b7`.	2020-08-25 15:28:42 -07:00
Amy Huang	b1009ee84f	Reland "[DebugInfo] Move constructor homing case in shouldOmitDefinition." For some reason the ctor homing case was before the template specialization case, and could have returned false too early. I moved the code out into a separate function to avoid this. This reverts commit `05777ab941`.	2020-08-25 12:36:11 -07:00
Jeremy Morse	121a49d839	[LiveDebugValues] Add switches for using instr-ref variable locations This patch adds the -Xclang option "-fexperimental-debug-variable-locations" and same LLVM CodeGen option, to pick which variable location tracking solution to use. Right now all the switch does is pick which LiveDebugValues implementation to use, the normal VarLoc one or the instruction referencing one in rGae6f78824031. Over time, the aim is to add fragments of support in aid of the value-tracking RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-February/139440.html also controlled by this command line switch. That will slowly move variable locations to be defined by an instruction calculating a value, and a DBG_INSTR_REF instruction referring to that value. Thus, this is going to grow into a "use the new kind of variable locations" switch, rather than just "use the new LiveDebugValues implementation". Differential Revision: https://reviews.llvm.org/D83048	2020-08-25 14:58:48 +01:00
Eric Christopher	05777ab941	Temporarily Revert "[DebugInfo] Move constructor homing case in shouldOmitDefinition." as it's causing test failures. This reverts commit `589ce5f705`.	2020-08-24 21:51:31 -07:00
Amy Huang	589ce5f705	[DebugInfo] Move constructor homing case in shouldOmitDefinition. For some reason the ctor homing case was before the template specialization case, and could have returned false too early. I moved the code out into a separate function to avoid this. Also added a run line to the template specialization test. I guess all the -debug-info-kind=limited tests should still pass with =constructor, but it's probably unnecessary to test for all of those. Differential Revision: https://reviews.llvm.org/D86491	2020-08-24 20:17:59 -07:00
Raphael Isemann	105151ca56	Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)" The original patch with the missing 'REQUIRES: asserts' as there is a debug-only flag used in the test. Original summary: D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5 uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations that creates the dwoID in the module from the ASTFileSignature (`Buffer->Signature` being the array subclass that is now `std::array<uint8_t, 20>` instead of `std::array<uint32_t, 5>`). ``` uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0] ``` This code works with the old ASTFileSignature (where two uint32_t are enough to fill the uint64_t), but after the patch this only took two bytes from the ASTFileSignature and only partly filled the Signature uint64_t. As a result, the dwoID in the module ref and the dwoID in the actual module no longer match (which in turn causes LLDB to keep warning about the dwoIDs not matching when debugging -gmodules-compiled binaries). This patch just unifies the logic for turning the ASTFileSignature into a uint64_t, which makes the dwoID match again (and should prevent issues like that in the future). Reviewed By: aprantl, dang Differential Revision: https://reviews.llvm.org/D84013	2020-08-24 14:52:53 +02:00
Bevin Hansson	577f8b157a	[Fixed Point] Add codegen for fixed-point shifts. This patch adds codegen to Clang for fixed-point shift operations. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D83294	2020-08-24 14:37:16 +02:00
Bevin Hansson	808ac54645	[Fixed Point] Use FixedPointBuilder to codegen fixed-point IR. This changes the methods in CGExprScalar to use FixedPointBuilder to generate IR for fixed-point conversions and operations. Since FixedPointBuilder emits padded operations slightly differently than the original code, some tests change. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D86282	2020-08-24 14:37:07 +02:00
Raphael Isemann	2b3074c0d1	Revert "Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)"" This reverts commit `ada2e8ea67`. Still breaking on Fuchsia (and also Fedora) with exit code 1, so back to investigating.	2020-08-24 12:54:25 +02:00
Raphael Isemann	ada2e8ea67	Reland "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)" This relands D84013 but with a test that relies on fewer shell features to hopefully make the test pass on Fuchsia (where the test from the previous patch version strangely failed with a plain "Exit code 1"). Original summary: D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5 uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations that creates the dwoID in the module from the ASTFileSignature (`Buffer->Signature` being the array subclass that is now `std::array<uint8_t, 20>` instead of `std::array<uint32_t, 5>`). ``` uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0] ``` This code works with the old ASTFileSignature (where two uint32_t are enough to fill the uint64_t), but after the patch this only took two bytes from the ASTFileSignature and only partly filled the Signature uint64_t. As a result, the dwoID in the module ref and the dwoID in the actual module no longer match (which in turn causes LLDB to keep warning about the dwoIDs not matching when debugging -gmodules-compiled binaries). This patch just unifies the logic for turning the ASTFileSignature into a uint64_t, which makes the dwoID match again (and should prevent issues like that in the future). Reviewed By: aprantl, dang Differential Revision: https://reviews.llvm.org/D84013	2020-08-24 11:51:32 +02:00
Raphael Isemann	c1dd5df425	Revert "Correctly emit dwoIDs after ASTFileSignature refactoring (D81347)" This reverts commit `a4c3ed42ba`. The test is curiously failing with a plain exit code 1 on Fuchsia.	2020-08-21 16:08:37 +02:00
Raphael Isemann	a4c3ed42ba	Correctly emit dwoIDs after ASTFileSignature refactoring (D81347) D81347 changes the ASTFileSignature to be an array of 20 uint8_t instead of 5 uint32_t. However, it didn't update the code in ObjectFilePCHContainerOperations that creates the dwoID in the module from the ASTFileSignature (`Buffer->Signature` being the array subclass that is now `std::array<uint8_t, 20>` instead of `std::array<uint32_t, 5>`). ``` uint64_t Signature = [..] (uint64_t)Buffer->Signature[1] << 32 | Buffer->Signature[0] ``` This code works with the old ASTFileSignature (where two uint32_t are enough to fill the uint64_t), but after the patch this only took two bytes from the ASTFileSignature and only partly filled the Signature uint64_t. As a result, the dwoID in the module ref and the dwoID in the actual module no longer match (which in turn causes LLDB to keep warning about the dwoIDs not matching when debugging -gmodules-compiled binaries). This patch just unifies the logic for turning the ASTFileSignature into a uint64_t, which makes the dwoID match again (and should prevent issues like that in the future). Reviewed By: aprantl, dang Differential Revision: https://reviews.llvm.org/D84013	2020-08-21 15:05:02 +02:00
Bevin Hansson	1a995a0af3	[ADT] Move FixedPoint.h from Clang to LLVM. This patch moves FixedPointSemantics and APFixedPoint from Clang to LLVM ADT. This will make it easier to use the fixed-point classes in LLVM for constructing an IR builder for fixed-point and for reusing the APFixedPoint class for constant evaluation purposes. RFC: http://lists.llvm.org/pipermail/llvm-dev/2020-August/144025.html Reviewed By: leonardchan, rjmccall Differential Revision: https://reviews.llvm.org/D85312	2020-08-20 10:29:45 +02:00
Craig Topper	724f570ad2	[X86] Add support 'tune' in target attribute This adds parsing and codegen support for tune in target attribute. I've implemented this so that arch in the target attribute implicitly disables tune from the command line. I'm not sure what gcc does here, but since -march implies -mtune, I assume 'arch' in the target attribute implies tune in the target attribute. Differential Revision: https://reviews.llvm.org/D86187	2020-08-19 15:58:19 -07:00
Aaron Puchert	916b750a8d	[CodeGen] Use existing EmitLambdaVLACapture (NFC)	2020-08-19 15:20:05 +02:00
Sander de Smalen	0353848cc9	[Clang][SVE] NFC: Move info about ACLE types into separate function. This function returns a struct `BuiltinVectorTypeInfo` that contains the builtin vector's element type, element count and number of vectors (used for vector tuples). Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D86100	2020-08-19 11:04:20 +01:00
Craig Topper	4cbceb74bb	[X86] Add basic support for -mtune command line option in clang Building on the backend support from D85165. This parses the command line option in the driver, passes it on to CC1 and adds a function attribute. -Still need to support tune on the target attribute. -Need to use "generic" as the tuning by default. But need to change generic in the backend first. -Need to set tune if march is specified and mtune isn't. -May need to disable getHostCPUName's ability to guess CPU name from features when it doesn't have a family/model match for mtune=native. That's what gcc appears to do. Differential Revision: https://reviews.llvm.org/D85384	2020-08-18 15:13:19 -07:00
Zequan Wu	84fffa6728	[Coverage] Adjust skipped regions only if {Prev,Next}TokLoc is in the same file as regions' {start, end}Loc Fix a bug that occurs if {Prev, Next}TokLoc is in a different file from the skipped regions' {start, end}Loc Differential Revision: https://reviews.llvm.org/D86116	2020-08-18 13:26:19 -07:00
Eli Friedman	673dbe1b5e	[clang codegen] Use IR "align" attribute for static array arguments. Without the "align" attribute, marking the argument dereferenceable is basically useless. See also D80166. Fixes https://bugs.llvm.org/show_bug.cgi?id=46876 . Differential Revision: https://reviews.llvm.org/D84992	2020-08-18 12:51:16 -07:00
Johannes Doerfert	95a25e4c32	[OpenMP][FIX] Do not use TBAA in type punning reduction GPU code PR46156 When we implement OpenMP GPU reductions we use type punning a lot during the shuffle and reduce operations. This is not always compatible with language rules on aliasing. So far we generated TBAA, which later allowed some of the reduce code to be removed because accesses and initialization were "known to not alias". With this patch we avoid TBAA in this step, hopefully for all accesses that we need to. Verified on the reproducer of PR46156 and QMCPack. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D86037	2020-08-16 14:38:31 -05:00
Gui Andrade	909a851dbf	[CGAtomic] Mark atomic libcall functions `nounwind` These functions won't ever unwind. This is useful for MemorySanitizer as it simplifies handling __atomic_load in particular. Differential Revision: https://reviews.llvm.org/D85573	2020-08-14 07:46:43 +00:00
Zequan Wu	a31c89c1b7	[Coverage] Enable emitting gap area between macros Differential Revision: https://reviews.llvm.org/D85176	2020-08-12 16:25:27 -07:00
Craig Topper	5c1fe4e20f	[Target] Cache the command line derived feature map in TargetOptions. We can use this to remove some calls to initFeatureMap from Sema and CodeGen when a function doesn't have a target attribute. This reduces compile time of the linux kernel where this map is needed to diagnose some inline assembly constraints based on whether sse, avx, or avx512 is enabled. Differential Revision: https://reviews.llvm.org/D85807	2020-08-12 12:37:23 -07:00
Alexey Bataev	fbd6d2c54e	[OPENMP] Fix PR47063: crash when trying to get captured statement. Need to call getRawStmt() function instead, when trying to get inner associated statement for the executable directive. Not all directives use captured statements.	2020-08-12 12:05:58 -04:00
Alexey Bataev	f4f3f678f1	[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks. In untied tasks, need to allocate the space for local variables, declared in task region, when the memory for task data is allocated. The function can be interrupted and we can exit from the function in untied task switch. Need to keep the state of the local variables in this case. Also, the compiler should not call cleanup when exiting in untied task switch until the real exit out of the declaration scope is met during execution. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D84457	2020-08-12 11:28:19 -04:00
Alexey Bataev	ddbd21d288	[OPENMP]Do not add TGT_OMP_TARGET_PARAM flag to non-captured mapped arguments. If the arguments are mapped, but are actually not used in the target region, the compiler still adds attribute TGT_OMP_TARGET_PARAM for such arguments. It makes the libomptarget to add such parameters to the list of arguments, passed to the kernel at the runtime, and may lead to incorrect results/crashes during execution. Differential Revision: https://reviews.llvm.org/D85755	2020-08-12 10:06:52 -04:00
Alexey Bataev	3651658bdd	Revert "[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks." This reverts commit `ec9563c54e` to investigate compiler crash revealed by the buildbots.	2020-08-12 09:50:32 -04:00
Alexey Bataev	ec9563c54e	[OPENMP]Fix PR37671: Privatize local(private) variables in untied tasks. Summary: In untied tasks, need to allocate the space for local variables, declared in task region, when the memory for task data is allocated. The function can be interrupted and we can exit from the function in untied task switch. Need to keep the state of the local variables in this case. Also, the compiler should not call cleanup when exiting in untied task switch until the real exit out of the declaration scope is met during execution. Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D84457	2020-08-12 09:37:24 -04:00
Kai Nacke	b3aece0531	[SystemZ/ZOS] Add binary format goff and operating system zos to the triple Adds the binary format goff and the operating system zos to the triple class. goff is selected as default binary format if zos is chosen as operating system. No further functionality is added. Reviewers: efriedma, tahonermann, hubert.reinterpertcast, MaskRay Reviewed By: efriedma, tahonermann, hubert.reinterpertcast Differential Revision: https://reviews.llvm.org/D82081	2020-08-11 05:26:26 -04:00
Wang, Pengfei	9512525947	[X86][FPEnv] Teach X86 mask compare intrinsics to respect strict FP semantics. When we use mask compare intrinsics under strict FP option, the masked elements shouldn't raise any exception. So, we can't replace the intrinsic with a full compare + "and" operation. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D85385	2020-08-11 10:28:41 +08:00
Johannes Doerfert	fa5d22a045	[OpenMP][NFC] Reuse OMPIRBuilder `struct ident_t` handling in Clang Replace the `ident_t` handling in Clang with the methods offered by the OMPIRBuilder. This cuts down on the clang code as well as the differences between the two, making further transitions easier. Tests have changed but there should not be a real functional change. The most interesting difference is probably that we stop generating local ident_t allocations for now and just use globals. Given that this happens only with debug info, the location part of the `ident_t` is probably bigger than the test anyway. As the location part is already a global, we can avoid the allocation, memcpy, and store in favor of a constant global that is slightly bigger. This can be revisited if there are complications. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D80735	2020-08-10 17:13:26 -05:00
Nick Desaulniers	4f2ad15db5	[Clang] implement -fno-eliminate-unused-debug-types Fixes pr/11710. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Resubmit after breaking Windows and OSX builds. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D80242	2020-08-10 15:08:48 -07:00
Michael Liao	c7b683c126	[PGO][CUDA][HIP] Skip generating profile on the device stub and wrong-side functions. - Skip generating profile data on `__global__` function in the host compilation. It's a host-side stub function only and doesn't have profile instrumentation generated on the real function body. The extra profile data results in malformed instrumentation profile data. - Skip generating region mapping on functions in the wrong-side, i.e., + For the device compilation, skip host-only functions; and, + For the host compilation, skip device-only functions (including `__global__` functions.) - As the device-side profiling is not ready yet, only host-side profile code generation is checked. Differential Revision: https://reviews.llvm.org/D85276	2020-08-10 11:01:46 -04:00
Xiangling Liao	6ef801aa6b	[AIX] Static init frontend recovery and backend support On the frontend side, this patch recovers AIX static init implementation to use the linkage type and function names Clang chooses for sinit related function. On the backend side, this patch sets correct linkage and function names on aliases created for sinit/sterm functions. Differential Revision: https://reviews.llvm.org/D84534	2020-08-10 10:10:49 -04:00
Nick Desaulniers	abb9bf4bcf	Revert "[Clang] implement -fno-eliminate-unused-debug-types" This reverts commit `e486921fd6`. Breaks windows builds and osx builds.	2020-08-07 16:11:41 -07:00
Nick Desaulniers	e486921fd6	[Clang] implement -fno-eliminate-unused-debug-types Fixes pr/11710. Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D80242	2020-08-07 14:13:48 -07:00
Alexey Bataev	4a7aedb843	[OPENMP]Simplify representation for atomic, critical, master and section constructs. Several constructs may be represented without relying on CapturedStmt. It saves memory and improves compilation speed.	2020-08-07 09:58:23 -04:00
Matt Arsenault	30eeb742f1	clang: Use byref for aggregate kernel arguments Add address space to indirect abi info and use it for kernels. Previously, indirect arguments assumed a stack passed object in the alloca address space using byval. A stack pointer is unsuitable for kernel arguments, which are passed in a separate, constant buffer with a different address space. Start using the new byref for aggregate kernel arguments. Previously these were emitted as raw struct arguments, and turned into loads in the backend. These will lower identically, although with byref you now have the option of applying an explicit alignment. In the future, a reasonable implementation would use byref for all kernel arguments (this would be a practical problem at the moment due to losing things like noalias on pointer arguments). This is mostly to avoid fighting the optimizer's treatment of aggregate load/store. SROA and instcombine both turn aggregate loads and stores into a long sequence of element loads and stores, rather than the optimizable memcpy I would expect in this situation. Now an explicit memcpy will be introduced up-front which is better understood and helps eliminate the alloca in more situations. This skips using byref in the case where HIP kernel pointer arguments in structs are promoted to global pointers. At minimum an additional patch is needed to allow coercion with indirect arguments. This also skips using it for OpenCL due to the current workaround used to support kernels calling kernels. Distinct function bodies would need to be generated up front instead of emitting an illegal call.	2020-08-06 15:52:26 -04:00
Alexey Bataev	0af7835eae	[OPENMP]Redesign of OMPExecutableDirective/OMPDeclarativeDirective representation. Summary: Introduced OMPChildren class to handle all associated clauses, statement and child expressions/statements. It allows some directives to be represented more correctly (like flush, depobj etc. with pseudo clauses, ordered depend directives, which are standalone, and target data directives). Also, it will make it easier to avoid using CapturedStmt in directives, if required (atomic, tile etc. directives). Also, it simplifies serialization/deserialization of the executable/declarative directives. Reduces the number of allocation operations for mapper declarations. Reviewers: jdoerfert Subscribers: yaxunl, guansong, jfb, cfe-commits, sstefan1, aaron.ballman, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83261	2020-08-06 12:25:19 -04:00
Anatoly Trosinenko	5a07490d76	[ABI][NFC] Fix the confusion of ByVal and ByRef argument names The second argument of getNaturalAlignIndirect() was `bool ByRef`, but the implementation was just delegating to getIndirect() with `ByRef` passed unchanged to `bool ByVal` parameter of getIndirect(). Fix a couple of /ByRef=/ comments as well. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D85113	2020-08-06 15:20:18 +03:00
Stanislav Mekhanoshin	105608a4c2	[AMDGPU] Added missing gfx1031 cases to CGOpenMPRuntimeGPU.cpp	2020-08-05 12:39:03 -07:00
Erich Keane	2143a90b34	Fix _ExtInt(1) to be an i1 in memory. The _ExtInt(1) in getTypeForMem was hitting the bool logic for expanding to an 8 bit value. The result was an assert, or store i1 %0, i8* %2, align 1 since the parameter IS an i1. This patch changes the 'forMem' test to exclude ext-int from the bool test.	2020-08-05 10:54:51 -07:00
Joel E. Denny	002d61db2b	[OpenMP] Fix `present` for exit from `omp target data` Without this patch, the following example fails but shouldn't according to OpenMP TR8: ``` #pragma omp target enter data map(alloc:i) #pragma omp target data map(present, alloc: i) { #pragma omp target exit data map(delete:i) } // fails presence check here ``` OpenMP TR8 sec. 2.22.7.1 "map Clause", p. 321, L23-26 states: > If the map clause appears on a target, target data, target enter > data or target exit data construct with a present map-type-modifier > then on entry to the region if the corresponding list item does not > appear in the device data environment an error occurs and the > program terminates. There is no corresponding statement about the exit from a region. Thus, the `present` modifier should: 1. Check for presence upon entry into any region, including a `target exit data` region. This behavior is already implemented correctly. 2. Should not check for presence upon exit from any region, including a `target` or `target data` region. Without this patch, this behavior is not implemented correctly, breaking the above example. In the case of `target data`, this patch fixes the latter behavior by removing the `present` modifier from the map types Clang generates for the runtime call at the end of the region. In the case of `target`, we have not found a valid OpenMP program for which such a fix would matter. It appears that, if a program can guarantee that data is present at the beginning of a `target` region so that there's no error there, that data is also guaranteed to be present at the end. This patch adds a comment to the runtime to document this case. Reviewed By: grokos, RaviNarayanaswamy, ABataev Differential Revision: https://reviews.llvm.org/D84422	2020-08-05 10:03:31 -04:00
Yonghong Song	00602ee7ef	BPF: simplify IR generation for __builtin_btf_type_id() This patch simplifies IR generation for __builtin_btf_type_id(). For __builtin_btf_type_id(obj, flag), previously the IR builtin looked like if (obj is a lvalue) llvm.bpf.btf.type.id(obj.ptr, 1, flag) !type else llvm.bpf.btf.type.id(obj, 0, flag) !type The purpose of the 2nd argument is to differentiate __builtin_btf_type_id(obj, flag) where obj is a lvalue vs. __builtin_btf_type_id(obj.ptr, flag) Note that obj or obj.ptr is never used by the backend and the `obj` argument is only used to derive the type. This code sequence is subject to potential llvm CSE when - obj is the same, e.g., nullptr - flag is the same - metadata type is different, e.g., typedef of struct "s" and struct "s". In the above, we don't want CSE since their metadata is different. This patch changes the IR builtin to llvm.bpf.btf.type.id(seq_num, flag) !type and seq_num is always increasing. This will prevent potential llvm CSE. Also report an error if the type name is empty for remote relocation since remote relocation needs non-empty type name to do relocation against vmlinux. Differential Revision: https://reviews.llvm.org/D85174	2020-08-04 16:29:42 -07:00
Thorsten Schuett	e18c6ef6b4	[clang] improve diagnostics for misaligned and large atomics "Listing the alignment and access size (== expected alignment) in the warning seems like a good idea." solves PR 46947 struct Foo { struct Bar { void * a; void * b; }; Bar bar; }; struct ThirtyTwo { struct Large { void * a; void * b; void * c; void * d; }; Large bar; }; void braz(Foo foo, ThirtyTwo braz) { Foo::Bar bar; __atomic_load(&foo->bar, &bar, __ATOMIC_RELAXED); ThirtyTwo::Large foobar; __atomic_load(&braz->bar, &foobar, __ATOMIC_RELAXED); } repro.cpp:21:3: warning: misaligned atomic operation may incur significant performance penalty; the expected (16 bytes) exceeds the actual alignment (8 bytes) [-Watomic-alignment] __atomic_load(&foo->bar, &bar, __ATOMIC_RELAXED); ^ repro.cpp:24:3: warning: misaligned atomic operation may incur significant performance penalty; the expected (32 bytes) exceeds the actual alignment (8 bytes) [-Watomic-alignment] __atomic_load(&braz->bar, &foobar, __ATOMIC_RELAXED); ^ repro.cpp:24:3: warning: large atomic operation may incur significant performance penalty; the access size (32 bytes) exceeds the max lock-free size (16 bytes) [-Watomic-alignment] 3 warnings generated. Differential Revision: https://reviews.llvm.org/D85102	2020-08-04 11:10:29 -07:00
Yonghong Song	6d67506964	[clang][BPF] support type exist/size and enum exist/value relocations This patch added the following additional compile-once run-everywhere (CO-RE) relocations: - existence/size of typedef, struct/union or enum type - enum value and enum value existence These additional relocations will make CO-RE bpf programs more adaptive for potential kernel internal data structure changes. For existence/size relocations, the following two code patterns are supported: 1. uint32_t __builtin_preserve_type_info((<type> )0, flag); 2. <type> var; uint32_t __builtin_preserve_field_info(var, flag); flag = 0 for existence relocation and flag = 1 for size relocation. For enum value existence and enum value relocations, the following code pattern is supported: uint64_t __builtin_preserve_enum_value((<enum_type> )<enum_value>, flag); flag = 0 means existence relocation and flag = 1 means enum value relocation. In the above <enum_type> can be an enum type or a typedef to enum type. The <enum_value> needs to be an enumerator value from the same enum type. The return type is uint64_t to permit potential 64bit enumerator values. Differential Revision: https://reviews.llvm.org/D83242	2020-08-04 08:39:53 -07:00
Kazushi (Jam) Marukawa	045e79e77c	[VE] Extend integer arguments and return values smaller than 64 bits In order to follow NEC Aurora SX VE ABI correctly, change to sign/zero extend integer arguments and return values smaller than 64 bits in clang. Also update regression test. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D85071	2020-08-04 08:07:05 +09:00
Thomas Lively	cb32792210	[WebAssembly] Implement prototype v128.load{32,64}_zero instructions Specified in https://github.com/WebAssembly/simd/pull/237, these instructions load the first vector lane from memory and zero the other lanes. Since these instructions are not officially part of the SIMD proposal, they are only available on an opt-in basis via LLVM intrinsics and clang builtin functions. If these instructions are merged to the proposal, this implementation will change so that the instructions will be generated from normal IR. At that point the intrinsics and builtin functions would be removed. This PR also changes the opcodes for the experimental f32x4.qfm{a,s} instructions because their opcodes conflicted with those of the v128.load{32,64}_zero instructions. The new opcodes were chosen to match those used in V8. Differential Revision: https://reviews.llvm.org/D84820	2020-08-03 13:54:00 -07:00
Akira Hatanaka	41b1e97b12	[CodeGen][ObjC] Mark calls to objc_unsafeClaimAutoreleasedReturnValue as notail on x86-64 This is needed because the epilogue code inserted before tail calls on x86-64 breaks the handshake between the caller and callee. Calls to objc_retainAutoreleasedReturnValue used to have the same problem, which was fixed in https://reviews.llvm.org/D59656. rdar://problem/66029552 Differential Revision: https://reviews.llvm.org/D84540	2020-08-03 13:25:25 -07:00
Saiyedul Islam	160ff83765	[OpenMP][AMDGCN] Support OpenMP offloading for AMDGCN architecture - Part 3 Provides AMDGCN and NVPTX specific specialization of getGPUWarpSize, getGPUThreadID, and getGPUNumThreads methods. Adds tests for AMDGCN codegen for these methods in generic and simd modes. Also changes the precondition in InitTempAlloca to be slightly more permissive. Useful for AMDGCN OpenMP codegen where allocas are created with a cast to an address space. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D84260	2020-08-03 05:38:39 +00:00
Eli Friedman	8dfb5d767e	[clang codegen][AArch64] Use llvm.aarch64.neon.fcvtzs/u where it's necessary fptosi/fptoui have similar, but not identical, semantics. In particular, the behavior on overflow is different. Fixes https://bugs.llvm.org/show_bug.cgi?id=46844 for 64-bit. (The corresponding patch for 32-bit is more involved because the equivalent intrinsics don't exist, as far as I can tell.) Differential Revision: https://reviews.llvm.org/D84703	2020-07-30 15:41:54 -07:00
Richard Smith	1e7f026c3b	PR46908: Emit undef destroying_delete_t as an aggregate RValue. We previously used a non-aggregate RValue to represent the passed value, which violated the assumptions of call arg lowering in some cases, in particular on 32-bit Windows, where we'd end up producing an FCA store with TBAA metadata, that the IR verifier would reject.	2020-07-30 14:50:01 -07:00
Johannes Doerfert	ebad64dfe1	[OpenMP][FIX] Consistently use OpenMPIRBuilder if requested When we use the OpenMPIRBuilder for the parallel region we need to also use it to get the thread ID (among other things) in the body. This is because CGOpenMPRuntime::getThreadID() and CGOpenMPRuntime::emitUpdateLocation implicitly assume that if they are called from within a parallel region there is a certain structure to the code and certain members of the OMPRegionInfo are initialized. It might make sense to initialize them even if we use the OpenMPIRBuilder but we would preferably get rid of such state instead. Bug reported by Anchu Rajendran Sudhakumari. Depends on D82470. Reviewed By: anchu-rajendran Differential Revision: https://reviews.llvm.org/D82822	2020-07-30 10:19:40 -05:00
Johannes Doerfert	19756ef53a	[OpenMP][IRBuilder] Support allocas in nested parallel regions We need to keep track of the alloca insertion point (which we already communicate via the callback to the user) as we place allocas as well. Reviewed By: fghanim, SouraVX Differential Revision: https://reviews.llvm.org/D82470	2020-07-30 10:19:39 -05:00
Alexey Bataev	622e46156d	[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region. Need to map the base pointer for all directives, not only target data-based ones. The base pointer is mapped for array sections, array subscript, array shaping and other array-like constructs with the base pointer. Also, codegen for use_device_ptr clause was modified to correctly handle mapping combination of array like constructs + use_device_ptr clause. The data for use_device_ptr clause is emitted as the last records in the data mapping array. Reviewed By: ye-luo Differential Revision: https://reviews.llvm.org/D84767	2020-07-30 11:18:33 -04:00
Alexey Bataev	b69357c2f4	Revert "[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region." This reverts commit `142d0d3ed8` to investigate undefined behavior revealed by buildbots.	2020-07-30 10:57:56 -04:00
Alexey Bataev	142d0d3ed8	[OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region. Need to map the base pointer for all directives, not only target data-based ones. The base pointer is mapped for array sections, array subscript, array shaping and other array-like constructs with the base pointer. Also, codegen for use_device_ptr clause was modified to correctly handle mapping combination of array like constructs + use_device_ptr clause. The data for use_device_ptr clause is emitted as the last records in the data mapping array. It applies only for global pointers. Differential Revision: https://reviews.llvm.org/D84767	2020-07-30 09:40:05 -04:00
Amy Huang	f71deb43ab	[DebugInfo] Fix to ctor homing to ignore classes with trivial ctors. Previously ctor homing was omitting debug info for classes if they have both trivial and nontrivial constructors, but we should only omit debug info if the class doesn't have any trivial constructors. retained types list. bug: https://bugs.llvm.org/show_bug.cgi?id=46537 Differential Revision: https://reviews.llvm.org/D84870	2020-07-29 19:55:20 -07:00
Arthur Eubanks	71d0a2b8a3	[DFSan][NewPM] Port DataFlowSanitizer to NewPM Reviewed By: ychen, morehouse Differential Revision: https://reviews.llvm.org/D84707	2020-07-29 10:19:15 -07:00
Joel E. Denny	9f2f3b9de6	[OpenMP] Implement TR8 `present` motion modifier in Clang (1/2) This patch implements Clang front end support for the OpenMP TR8 `present` motion modifier for `omp target update` directives. The next patch in this series implements OpenMP runtime support. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D84711	2020-07-29 12:18:45 -04:00
Alexey Bader	8d27be8dba	[OpenCL] Add global_device and global_host address spaces This patch introduces 2 new address spaces in OpenCL: global_device and global_host which are a subset of a global address space, so the address space scheme will be looking like: ``` generic->global->host ->device ->private ->local constant ``` Justification: USM allocations may be associated with both host and device memory. We want to give users a way to tell the compiler the allocation type of a USM pointer for optimization purposes. (Link to the Unified Shared Memory extension: https://github.com/intel/llvm/blob/sycl/sycl/doc/extensions/USM/cl_intel_unified_shared_memory.asciidoc) Before this patch USM pointer could be only in opencl_global address space, hence a device backend can't tell if a particular pointer points to host or device memory. On FPGAs at least we can generate more efficient hardware code if the user tells us where the pointer can point - being able to distinguish between these types of pointers at compile time allows us to instantiate simpler load-store units to perform memory transactions. Patch by Dmitry Sidorov. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D82174	2020-07-29 17:24:53 +03:00
Thomas Lively	11bb7eef41	[WebAssembly] Remove intrinsics for SIMD widening ops Instead, pattern match extends of extract_subvectors to generate widening operations. Since extract_subvector is not a legal node, this is implemented via a custom combine that recognizes extract_subvector nodes before they are legalized. The combine produces custom ISD nodes that are later pattern matched directly, just like the intrinsic was. Also removes the clang builtins for these operations since the instructions can now be generated from portable code sequences. Differential Revision: https://reviews.llvm.org/D84556	2020-07-28 18:25:55 -07:00
Joel E. Denny	69fc33f0cd	Revert "[OpenMP] Implement TR8 `present` motion modifier in Clang (1/2)" This reverts commit `3c3faae497`. It breaks a number of bots.	2020-07-28 20:30:05 -04:00
Joel E. Denny	3c3faae497	[OpenMP] Implement TR8 `present` motion modifier in Clang (1/2) This patch implements Clang front end support for the OpenMP TR8 `present` motion modifier for `omp target update` directives. The next patch in this series implements OpenMP runtime support. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D84711	2020-07-28 19:15:18 -04:00
Zahira Ammarguellat	80bd6ae13e	On Windows build, making the /bigobj flag global, instead of passing it per file. To avoid having this flag be passed in a per-file manner, we are instead passing it globally. This fixes this bug: https://bugs.llvm.org/show_bug.cgi?id=46733 Reviewed-by: aaron.ballman, beanz, meinersbur Differential Revision: https://reviews.llvm.org/D84038	2020-07-28 18:04:36 -05:00
Richard Smith	740a164dec	PR46377: Fix dependence calculation for function types and typedef types. We previously did not treat a function type as dependent if it had a parameter pack with a non-dependent type -- such a function type depends on the arity of the pack so is dependent even though none of the parameter types is dependent. In order to properly handle this, we now treat pack expansion types as always being dependent types (depending on at least the pack arity), and always canonically being pack expansion types, even in the unusual case when the pattern is not a dependent type. This does mean that we can have canonical types that are pack expansions that contain no unexpanded packs, which is unfortunate but not inaccurate. We also previously did not treat a typedef type as instantiation-dependent if its canonical type was not instantiation-dependent. That's wrong because instantiation-dependence is a property of the type sugar, not of the type; an instantiation-dependent type can have a non-instantiation-dependent canonical type.	2020-07-28 13:23:13 -07:00
Zequan Wu	b46176bbb0	Reland [Coverage] Add comment to skipped regions Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757. Add comment to skipped regions so we don't track execution count for lines containing only comments. Differential Revision: https://reviews.llvm.org/D83592	2020-07-28 13:20:57 -07:00
Richard Smith	6c18f7db73	For PR46800, implement the GCC __builtin_complex builtin. glibc's implementation of the CMPLX macro uses it (with -fgnuc-version set to 4.7 or later).	2020-07-22 13:43:10 -07:00
Hans Wennborg	238bbd48c5	Revert `abd45154b` "[Coverage] Add comment to skipped regions" This caused assertions during Chromium builds. See comment on the code review. > Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757. > Add comment to skipped regions so we don't track execution count for lines containing only comments. > > Differential Revision: https://reviews.llvm.org/D84208 This reverts commit `abd45154bd` and the follow-up `87d7254733`.	2020-07-22 17:09:20 +02:00
Joel E. Denny	aa82c40f0a	[OpenMP] Implement TR8 `present` map type modifier in Clang (1/2) This patch implements Clang front end support for the OpenMP TR8 `present` map type modifier. The next patch in this series implements OpenMP runtime support. This patch does not attempt to implement TR8 sec. 2.22.7.1 "map Clause", p. 319, L14-16: > If a map clause with a present map-type-modifier is present in a map > clause, then the effect of the clause is ordered before all other > map clauses that do not have the present modifier. Compare to L10-11, which Clang does not appear to implement yet: > For a given construct, the effect of a map clause with the to, from, > or tofrom map-type is ordered before the effect of a map clause with > the alloc, release, or delete map-type. This patch also does not implement the `present` implicit-behavior for `defaultmap` or the `present` motion-modifier for `target update`. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D83061	2020-07-22 10:15:32 -04:00
Sjoerd Meijer	5567c62afa	[Matrix] Add LowerMatrixIntrinsics to the NPM Pass LowerMatrixIntrinsics wasn't yet running under the new pass manager, and this adds LowerMatrixIntrinsics to the pipeline (to the same place as where it is running in the old PM). Differential Revision: https://reviews.llvm.org/D84180	2020-07-22 09:47:53 +01:00
David Blaikie	36036aa70e	Reapply "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression" Reapply `49e5f603d4` which had been reverted in `c94332919b`. Originally reverted because I hadn't updated it in quite a while when I got around to committing it, so there were a bunch of missing changes to new code since I'd written the patch. Reviewers: aaron.ballman Differential Revision: https://reviews.llvm.org/D76646	2020-07-21 20:57:12 -07:00
Zequan Wu	abd45154bd	[Coverage] Add comment to skipped regions Bug filed here: https://bugs.llvm.org/show_bug.cgi?id=45757. Add comment to skipped regions so we don't track execution count for lines containing only comments. Differential Revision: https://reviews.llvm.org/D84208	2020-07-21 17:34:18 -07:00
Wang, Pengfei	18581fd2c4	[CFE] Add nomerge function attribute to inline assembly. Sometimes we also want to avoid merging inline assembly. This patch adds the nomerge function attribute to inline assembly. Reviewed By: zequanwu Differential Revision: https://reviews.llvm.org/D84225	2020-07-22 08:22:58 +08:00
Alexey Bataev	13bfe4b226	[OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region. Summary: Need to avoid an optimization for base pointer mapping for target data directives. Reviewers: jdoerfert, ye-luo Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D84182	2020-07-21 15:48:32 -04:00
Arthur Eubanks	b13b858182	[NewPM] Support optnone under new pass manager OptNoneInstrumentation is part of StandardInstrumentations. It skips functions (or loops) that are marked optnone. The feature of skipping optional passes for optnone functions under NPM is gated on a -enable-npm-optnone flag. Currently it is by default false. That is because we still need to mark all required passes to be required. Otherwise optnone functions will start having incorrect semantics. After that is done in following changes, we can remove the flag and always enable this. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D83519	2020-07-21 09:53:43 -07:00
Saiyedul Islam	fc7d2908ab	[OpenMP] Use common interface to access GPU Grid Values Use common interface for accessing target specific GPU grid values in NVPTX OpenMP codegen as proposed in https://reviews.llvm.org/D80917 Originally authored by Greg Rodgers (@gregrodgers). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D83492	2020-07-21 05:25:46 +00:00
Logan Smith	8b6179f48c	[NFC] Add missing 'override's	2020-07-20 14:39:36 -07:00
Joel E. Denny	cbf64b5834	[OpenMP] Fix map clause for unused var: don't ignore it For example, without this patch: ``` $ cat test.c int main() { int x[3]; #pragma omp target map(tofrom:x[0:3]) #ifdef USE x[0] = 1 #endif ; return 0; } $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -S -emit-llvm test.c $ grep '^@.offload_maptypes' test.ll $ echo $? 1 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -S -emit-llvm test.c \ -DUSE $ grep '^@.offload_maptypes' test.ll @.offload_maptypes = private unnamed_addr constant [1 x i64] [i64 35] ``` With this patch, both greps produce the same result. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D83922	2020-07-17 21:37:27 -04:00
Michele Scandale	53880b8cb9	[CMake] Make `intrinsics_gen` dependency unconditional. The `intrinsics_gen` target exists in the CMake exports since r309389 (see LLVMConfig.cmake.in), hence projects can depend on `intrinsics_gen` even if they are built separately from LLVM. Reviewed By: MaskRay, JDevlieghere Differential Revision: https://reviews.llvm.org/D83454	2020-07-17 16:43:17 -07:00
Xiangling Liao	ec6ada6264	[AIX] report_fatal_error on `-fregister_global_dtors_with_atexit` for static init On AIX, the semantic of global_dtors contains __sterm functions associated with C++ cleanup actions and user-declared __attribute__((destructor)) functions. We should never merely register __sterm with atexit(), so currently -fregister_global_dtors_with_atexit does not work well on AIX: It would cause finalization actions to not occur when unloading shared libraries. We need to figure out a way to handle that when we start supporting user-declared __attribute__((destructor)) functions. Currently we report_fatal_error on this option temporarily. Differential Revision: https://reviews.llvm.org/D83974	2020-07-17 16:14:49 -04:00
Saiyedul Islam	c7562e77b3	[OpenMP][NFC] Generalize CGOpenMPRuntimeNVPTX as CGOpenMPRuntimeGPU Refactors CGOpenMPRuntimeNVPTX as CGOpenMPRuntimeGPU to make it a generalization for OpenMP GPU Codegen. Target specific specialized methods for NVPTX are defined in class CGOpenMPRuntimeNVPTX. This paves the way for a clean and maintainable extension to more GPU targets for OpenMP Codegen. For original author (git blame) list of CGOpenMPRuntimeGPU code, look in history of CGOpenMPRuntimeNVPTX.cpp and .h, after this commit. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D83723	2020-07-17 14:38:04 +00:00
Eric Christopher	7bfaa40086	Temporarily Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753. An SROA change soon may obviate some of these problems. This reverts commit `8d09f20798`.	2020-07-16 11:54:04 -07:00
George Rokos	fc47c0e0a6	[clang] Fix compilation warnings in OpenMP declare mapper codegen. This patch fixes the compilation warnings that L is not a reference. Thanks to Lingda Li for providing the patch. Differential Revision: https://reviews.llvm.org/D83959	2020-07-16 11:04:12 -07:00
Xiangling Liao	69f3378ad6	[AIX]Generate debug info for static init related functions Set the debug location for static init related functions (__dtor and __finalize) so we can generate valid debug info on AIX by invoking -g with clang or -debug-info-kind=limited with clang_cc1. This also works for any other future targets that may use sinit and sterm functions for static initialization, where a direct call to dtor will be generated within finalize function body. This patch also aims at validating that the debug info generated is correct for AIX sinit related functions. Differential Revision: https://reviews.llvm.org/D83702	2020-07-16 10:43:10 -04:00
George Rokos	537b16e9b8	[OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime This patch implements the code generation to use OpenMP 5.0 declare mapper (a.k.a. user-defined mapper) constructs. Patch written by Lingda Li. Differential Revision: https://reviews.llvm.org/D67833	2020-07-15 18:11:43 -07:00
Akira Hatanaka	ed6b578040	[CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind This fixes cases where an invoke is emitted, despite the called llvm function being marked nounwind, because ConstructAttributeList failed to add the attribute to the attribute list. llvm optimization passes turn invokes into calls and optimize away the exception handling code, but it's better to avoid emitting the code in the front-end if the called function is known not to raise an exception. Differential Revision: https://reviews.llvm.org/D83906	2020-07-15 14:47:45 -07:00
Alexey Bataev	41d0af0074	[OPENMP]Fix PR46593: Reduction initializer missing constructor call. Summary: If user-defined reductions with the initializer are used with classes, the compiler misses the constructor call when trying to create a private copy of the reduction variable. Reviewers: jdoerfert Subscribers: cfe-commits, yaxunl, guansong, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83334	2020-07-15 15:14:22 -04:00
Alexey Bataev	9dc327d1b7	[OPENMP]Fix PR46688: cast the type of the allocated variable to the initial one. Summary: If the original variable is marked for allocation in the different address space using #pragma omp allocate, need to cast the allocated variable to its original type with the original address space. Otherwise, the compiler may crash trying to bitcast the type of the new allocated variable to the original type in some cases, like passing this variable as an argument in function calls. Reviewers: jdoerfert Subscribers: jholewinski, cfe-commits, yaxunl, guansong, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83696	2020-07-15 14:54:19 -04:00
Tim Northover	9697a9e2d3	Fix typo in identifier in assert.	2020-07-15 09:57:53 +01:00
Tim Northover	5165b2b5fd	AArch64+ARM: make LLVM consider system registers volatile. Some of the system registers readable on AArch64 and ARM platforms return different values with each read (for example a timer counter), these shouldn't be hoisted outside loops or otherwise interfered with, but the normal @llvm.read_register intrinsic is only considered to read memory. This introduces a separate @llvm.read_volatile_register intrinsic and maps all system-registers on ARM platforms to use it for the __builtin_arm_rsr calls. Registers declared with asm("r9") or similar are unaffected.	2020-07-15 09:47:36 +01:00
Tyker	8d09f20798	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: thopre, yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-07-14 01:05:58 +02:00
Vedant Kumar	8c4a65b9b2	[ubsan] Check implicit casts in ObjC for-in statements Check that the implicit cast from `id` used to construct the element variable in an ObjC for-in statement is valid. This check is included as part of a new `objc-cast` sanitizer, outside of the main 'undefined' group, as (IIUC) the behavior it's checking for is not technically UB. The check can be extended to cover other kinds of invalid casts in ObjC. Partially addresses: rdar://12903059, rdar://9542496 Differential Revision: https://reviews.llvm.org/D71491	2020-07-13 15:11:18 -07:00
Alexey Bataev	7075c056e9	[OPENMP]Fix compiler crash for target data directive without actual target codegen. Summary: Need to privatize addresses of the captured variables when trying to emit the body of the target data directive in no target codegen mode. Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits, sstefan1, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D83478	2020-07-13 10:52:24 -04:00
David Blaikie	c94332919b	Revert "Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression" Broke buildbots since I hadn't updated this patch in a while. Sorry for the noise. This reverts commit `49e5f603d4`.	2020-07-12 20:29:19 -07:00
David Blaikie	49e5f603d4	Rename/refactor isIntegerConstantExpression to getIntegerConstantExpression There is a version that just tests (also called isIntegerConstantExpression) & whereas this version is specifically used when the value is of interest (a few call sites were actually refactored to calling the test-only version) so let's make the API look more like it. Reviewers: aaron.ballman Differential Revision: https://reviews.llvm.org/D76646	2020-07-12 19:43:24 -07:00
Craig Topper	b4dbb37f32	[X86] Rename X86_CPU_TYPE_COMPAT_ALIAS/X86_CPU_TYPE_COMPAT/X86_CPU_SUBTYPE_COMPAT macros. NFC Remove _COMPAT. Drop the ARCHNAME. Remove the non-COMPAT versions that are no longer needed. We now only use these macros in places where we need compatibility with libgcc/compiler-rt. So we don't need to call out _COMPAT specifically.	2020-07-12 17:00:24 -07:00
Ten Tzen	66f1dcd872	[Windows SEH] Fix the frame-ptr of a nested-filter within a _finally This change fixed a SEH bug (exposed by test58 & test61 in MSVC test xcpt4u.c); when an Except-filter is located inside a finally, the frame-pointer generated today via intrinsic @llvm.eh.recoverfp is the frame-pointer of the immediate parent _finally, not the frame-ptr of outermost host function. The fix is to retrieve the Establisher's frame-pointer that was previously saved in parent's frame. The prolog of a filter inside a _finally should be like code below: %0 = call i8* @llvm.eh.recoverfp(i8* bitcast (@"?fin$0@0@main@@"), i8* %frame_pointer) %1 = call i8* @llvm.localrecover(i8* bitcast (@"?fin$0@0@main@@"), i8* %0, i32 0) %2 = bitcast i8* %1 to i8** %3 = load i8*, i8** %2, align 8 Differential Revision: https://reviews.llvm.org/D77982	2020-07-12 01:37:56 -07:00
Johannes Doerfert	c98699582a	[OpenMP][NFC] Remove unused (always fixed) arguments There are various runtime calls in the device runtime with unused, or always fixed, arguments. This is bad for all sorts of reasons. Clean up two before as we match them in OpenMPOpt now. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D83268	2020-07-11 00:51:51 -05:00
Yaxun (Sam) Liu	849d4405f5	[HIP] Fix rocm detection Do not detect device library by default in rocm detector. Only detect device library in Rocm and HIP toolchain. Separate detection of HIP runtime and Rocm device library. Detect rocm path by version file in host toolchains. Also added detecting rocm version and printing rocm installation path and version with -v. Fixed include path and device library detection for ROCm 3.5. Added --hip-version option. Renamed --hip-device-lib-path to --rocm-device-lib-path. Fixed default value for -fhip-new-launch-api. Added default -std option for HIP. Differential Revision: https://reviews.llvm.org/D82930	2020-07-10 23:20:15 -04:00
Akira Hatanaka	3a5617c02e	Fix build error	2020-07-10 17:40:37 -07:00
Akira Hatanaka	e9bf0a710c	[CodeGen] Store the return value of the target function call to the thunk's return value slot directly when the return type is an aggregate instead of doing so via a temporary This fixes PR45997 (https://bugs.llvm.org/show_bug.cgi?id=45997), which is caused by a bug that has existed since we started passing and returning C++ structs with ObjC strong pointer members (see https://reviews.llvm.org/D44908) or structs annotated with trivial_abi directly. rdar://problem/63740936 Differential Revision: https://reviews.llvm.org/D82513	2020-07-10 17:24:13 -07:00
Aaron Ballman	006c49d890	Change behavior with zero-sized static array extents Clang previously diagnosed this code by default: void f(int a[static 0]); saying that "static has no effect on zero-length arrays", which was accurate. However, static array extents require that the caller of the function pass a nonnull pointer to an array of at least that number of elements, but it can pass more (see C17 6.7.6.3p6). Given that we allow zero-sized arrays as a GNU extension and that it's valid to pass more elements than specified by the static array extent, we now support zero-sized static array extents with the usual semantics because it can be useful in cases like: void my_bzero(char p[static 0], int n); my_bzero(&c+1, 0); //ok my_bzero(t+k,n-k); //ok, pattern from actual code	2020-07-10 15:58:11 -04:00
Zequan Wu	1fbb719470	[LPM] Port CGProfilePass from NPM to LPM Reviewers: hans, chandlerc!, asbirlea, nikic Reviewed By: hans, nikic Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D83013	2020-07-10 09:04:51 -07:00
Ulrich Weigand	4c5a93bd58	[ABI] Handle C++20 [[no_unique_address]] attribute Many platform ABIs have special support for passing aggregates that either just contain a single member of floating-point type, or else a homogeneous set of members of the same floating-point type. When making this determination, any extra "empty" members of the aggregate type will typically be ignored. However, in C++ (at least in all prior versions), no data member would actually count as empty, even if its type is an empty record -- it would still be considered to take up at least one byte of space, and therefore make those ABI special cases not apply. This is now changing in C++20, which introduced the [[no_unique_address]] attribute. Members of empty record type, if they also carry this attribute, now do not take up any space in the type, and therefore the ABI special cases for single-element or homogeneous aggregates should apply. The C++ Itanium ABI has been updated accordingly, and GCC 10 has added support for this new case. This patch now adds support to LLVM. This is cross-platform; it affects all platforms that use the single-element or homogeneous aggregate ABI special case and implement this using any of the following common subroutines in lib/CodeGen/TargetInfo.cpp: isEmptyField isEmptyRecord isSingleElementStruct isHomogeneousAggregate	2020-07-10 14:01:05 +02:00
Fangrui Song	c025bdf25a	Revert D83013 "[LPM] Port CGProfilePass from NPM to LPM" This reverts commit `c92a8c0a0f`. It breaks builds and has unaddressed review comments.	2020-07-09 13:34:04 -07:00
Zequan Wu	c92a8c0a0f	[LPM] Port CGProfilePass from NPM to LPM Reviewers: hans, chandlerc!, asbirlea, nikic Reviewed By: hans, nikic Subscribers: steven_wu, dexonsmith, nikic, echristo, void, zhizhouy, cfe-commits, aeubanks, MaskRay, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D83013	2020-07-09 13:03:42 -07:00
cchen	2da9572a9b	[OPENMP50] extend array section for stride (Parsing/Sema/AST) Reviewers: ABataev, jdoerfert Reviewed By: ABataev Subscribers: yaxunl, guansong, arphaman, sstefan1, cfe-commits, sandoval, dreachem Tags: #clang Differential Revision: https://reviews.llvm.org/D82800	2020-07-09 13:28:51 -05:00
Anatoly Trosinenko	67422e4294	[MSP430] Align the _Complex ABI with current msp430-gcc Assembler output is checked against msp430-gcc 9.2.0.50 from TI. Reviewed By: asl Differential Revision: https://reviews.llvm.org/D82646	2020-07-09 18:28:48 +03:00
sstefan1	6aab27ba85	[OpenMPIRBuilder][Fix] Move llvm::omp::types to OpenMPIRBuilder. Summary: D82193 exposed a problem with global type definitions in `OMPConstants.h`. This causes a race when running in thinLTO mode. Types now live inside of OpenMPIRBuilder to prevent this from happening. Reviewers: jdoerfert Subscribers: yaxunl, hiraditya, guansong, dexonsmith, aaron.ballman, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D83176	2020-07-08 17:23:55 +02:00
Ulrich Weigand	80a1b95b8e	[SystemZ ABI] Allow class types in GetSingleElementType The SystemZ ABI specifies that aggregate types with just a single member of floating-point type shall be passed as if they were just a scalar of that type. This applies to both struct and class types (but not unions). However, the current ABI support code in clang only checks this case for struct types, which means that for class types, generated code does not adhere to the platform ABI. Fixed by accepting both struct and class types in the SystemZABIInfo::GetSingleElementType routine.	2020-07-07 19:56:19 +02:00
Jennifer Yu	6cf0dac1ca	Correctly generate invert xor value for Binary Atomics of int size > 64 When using __sync_nand_and_fetch with __int128, the wrong 'invert' value gets emitted to the xor when the int size is greater than 64 bits. This is because the code uses llvm::ConstantInt::get, which zero-extends the value beyond 64 bits, so instead of the -1 that we require, it ends up getting 18446744073709551615. This patch replaces the call to llvm::ConstantInt::get with a call to llvm::Constant::getAllOnesValue, which works for all integer types. Reviewers: jfp, erichkeane, rjmccall, hfinkel Differential Revision: https://reviews.llvm.org/D82832	2020-07-07 10:20:14 -07:00
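Note on `6cf0dac1ca`: the arithmetic behind this fix can be modelled outside the compiler. The following is a minimal sketch in plain Python integers, not Clang's actual code; the helper names (`zext_minus_one_from_64`, `all_ones`, `nand_and_fetch`) are hypothetical and exist only for illustration. It assumes the nand result is formed as `(old & operand) xor mask`, as the commit message describes, and compares a mask obtained by zero-extending a 64-bit -1 with a true all-ones mask.

```python
BITS = 128  # width of __int128 in this model


def all_ones(bits: int) -> int:
    """All-ones bit pattern of the given width (what -1 looks like at that width)."""
    return (1 << bits) - 1


def zext_minus_one_from_64() -> int:
    """Model of a 64-bit -1 that is zero-extended to a wider type.

    The upper bits stay zero, so the value is 2**64 - 1 rather than -1.
    """
    return (-1) & all_ones(64)


def nand_and_fetch(old: int, operand: int, invert_mask: int, bits: int = BITS) -> int:
    """Compute ~(old & operand) with the NOT expressed as an xor against invert_mask."""
    return ((old & operand) ^ invert_mask) & all_ones(bits)


if __name__ == "__main__":
    good = nand_and_fetch(0, 0, all_ones(BITS))
    bad = nand_and_fetch(0, 0, zext_minus_one_from_64())
    print(hex(good))  # all 128 bits set
    print(hex(bad))   # only the low 64 bits set
```

With the zero-extended mask only the low 64 bits are inverted, which matches the 18446744073709551615 value quoted in the commit message.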
Wouter van Oortmerssen	16d83c395a	[WebAssembly] Added 64-bit memory.grow/size/copy/fill This covers both the existing memory functions as well as the new bulk memory proposal. Added new test files since changes were also required in the inputs. Also removes unused init/drop intrinsics rather than trying to make them work for 64-bit. Differential Revision: https://reviews.llvm.org/D82821	2020-07-06 12:49:50 -07:00
Chuanqi Xu	8849831d55	[Coroutines] Warning if return type of coroutine_handle::address is not void* Users can supply a version of coroutine_handle::address() whose return type is not void* by using template specialization for coroutine_handle<> for some promise_type. In this case, the code may break compatibility with existing async C APIs that accept a void* data parameter which is then passed back to the user-provided callback. Patch by ChuanqiXu Differential Revision: https://reviews.llvm.org/D82442	2020-07-06 13:46:01 +08:00
Roman Lebedev	7ea46aee36	Revert "[AssumeBundles] Use operand bundles to encode alignment assumptions" Assume bundle can have more than one entry with the same name, but at least AlignmentFromAssumptionsPass::extractAlignmentInfo() uses getOperandBundle("align"), which internally assumes that it isn't the case, and happily crashes otherwise. Minimal reduced reproducer: run `opt -alignment-from-assumptions` on target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128" target triple = "x86_64-unknown-linux-gnu" %0 = type { i64, %1, i8, i64, %2, i32, %3, i8 } %1 = type opaque %2 = type { i8, i8, i16 } %3 = type { i32, i32, i32, i32 } ; Function Attrs: nounwind define i32 @f(%0* noalias nocapture readonly %arg, %0* noalias %arg1) local_unnamed_addr #0 { bb: call void @llvm.assume(i1 true) [ "align"(%0* %arg, i64 8), "align"(%0* %arg1, i64 8) ] ret i32 0 } ; Function Attrs: nounwind willreturn declare void @llvm.assume(i1) #1 attributes #0 = { nounwind "reciprocal-estimates"="none" } attributes #1 = { nounwind willreturn } This is what we'd have with -mllvm -enable-knowledge-retention This reverts commit `c95ffadb24`.	2020-07-04 23:49:23 +03:00
Bruno Ricci	473fbc90d1	[clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper In general there is no way to get to the ASTContext from most AST nodes (Decls are one of the exceptions). This will be a problem when implementing the rest of APValue::dump since we need the ASTContext to dump some kinds of APValues. The ASTContext* in ASTDumper and TextNodeDumper is not always non-null. This is because we still want to be able to use the various dump() functions in a debugger. No functional changes intended. Reverted in `fcf4d5e449` since a few dump() functions in lldb were missed.	2020-07-03 13:59:22 +01:00
Bruno Ricci	fcf4d5e449	Revert "[clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper" This reverts commit `aa7fd905e4`. I missed some dump() functions.	2020-07-02 19:40:09 +01:00
Bruno Ricci	aa7fd905e4	[clang][NFC] Store a pointer to the ASTContext in ASTDumper and TextNodeDumper In general there is no way to get to the ASTContext from most AST nodes (Decls are one of the exceptions). This will be a problem when implementing the rest of APValue::dump since we need the ASTContext to dump some kinds of APValues. The ASTContext* in ASTDumper and TextNodeDumper is not always non-null. This is because we still want to be able to use the various dump() functions in a debugger. No functional changes intended.	2020-07-02 19:29:02 +01:00
Alexander Belyaev	2a36f29fce	[clang] Re-add deleted forward declaration.	2020-07-02 08:57:48 +02:00
Valentin Clement	2ddba3082c	[flang][openmp] Use common Directive and Clause enum from llvm/Frontend Summary: This patch is removing the custom enumeration for OpenMP Directives and Clauses and replace them with the newly tablegen generated one from llvm/Frontend. This is a first patch and some will follow to share the same infrastructure where possible. The next patch should use the clauses allowance defined in the tablegen file. Reviewers: jdoerfert, DavidTruby, sscalpone, kiranchandramohan, ichoyjx Reviewed By: DavidTruby, ichoyjx Subscribers: jholewinski, cfe-commits, dblaikie, MaskRay, ymandel, ichoyjx, mgorny, yaxunl, guansong, jfb, sstefan1, aaron.ballman, llvm-commits Tags: #llvm, #flang, #clang Differential Revision: https://reviews.llvm.org/D82906	2020-07-01 20:58:11 -04:00
zoecarver	e7c5da57a5	[CodeGen] Add public function to emit C++ destructor call. Adds `CodeGen::getCXXDestructorImplicitParam`, to retrieve a C++ destructor's implicit parameter (after the "this" pointer) based on the ABI in the given CodeGenModule. This will allow other frontends (Swift, for example) to easily emit calls to object destructors with correct ABI semantics and calling conventions. This is needed for Swift C++ interop. Here's the corresponding Swift change: https://github.com/apple/swift/pull/32291 Differential Revision: https://reviews.llvm.org/D82392	2020-07-01 11:01:23 -07:00
Xun Li	565e37c770	[Coroutines] Fix code coverage for coroutine Summary: Previously, source-based coverage analysis does not work properly for coroutine. This patch adds processing of coroutine body and co_return in the coverage analysis, so that we can handle them properly. For coroutine body, we should only look at the actual function body and ignore the compiler-generated things; for co_return, we need to terminate the region similar to return statement. Added a test, and confirms that it now works properly. (without this patch, the statement after the if statement will be treated wrongly) Reviewers: lewissbaker, modocache, junparser Reviewed By: modocache Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D82928	2020-07-01 10:11:40 -07:00
Erich Keane	2831a317b6	Implement AVX ABI Warning/error The x86-64 "avx" feature changes how >128 bit vector types are passed, instead of being passed in separate 128 bit registers, they can be passed in 256 bit registers. "avx512f" does the same thing, except it switches from 256 bit registers to 512 bit registers. The result of both of these is an ABI incompatibility between functions compiled with and without these features. This patch implements a warning/error pair upon an attempt to call a function that would run afoul of this. First, if a function is called that would have its ABI changed, we issue a warning. Second, if said call is made in a situation where the caller and callee are known to have different calling conventions (such as the case of 'target'), we instead issue an error. Differential Revision: https://reviews.llvm.org/D82562	2020-07-01 07:14:31 -07:00
Simon Pilgrim	36aaffbf56	Fix Wdocumentation warnings due to outdated parameter list. NFC.	2020-07-01 12:01:18 +01:00
Richard Smith	4eff2beefb	[c++20] consteval functions don't get vtable slots. For the Itanium C++ ABI, this implements the rule added in https://github.com/itanium-cxx-abi/cxx-abi/pull/83 For the MS C++ ABI, this implements the direction that seemed most plausible based on personal correspondence with MSVC developers, but is subject to change as they decide their ABI rule.	2020-06-30 18:22:09 -07:00
Craig Topper	3537939cda	[X86] Move frontend CPU feature initialization to a look up table based implementation. NFCI This replaces the switch statement implementation in the clang's X86.cpp with a lookup table in X86TargetParser.cpp. I've used constexpr and copy of the FeatureBitset from SubtargetFeature.h to store the features in a lookup table. After the lookup the bitset is translated into strings for use by the rest of the frontend code. I had to modify the implementation of the FeatureBitset to avoid bugs in gcc 5.5 constexpr handling. It seems to not like the same array entry to be used on the left side and right hand side of an assignment or &= or \|=. I've also used uint32_t instead of uint64_t and sized based on the X86::CPU_FEATURE_MAX. I've initialized the features for different CPUs outside of the table so that we can express inheritance in an adhoc way. This was one of the big limitations of the switch and we had resorted to labels and gotos. Differential Revision: https://reviews.llvm.org/D82731	2020-06-30 12:04:58 -07:00
Francesco Petrogalli	67e4330fac	[sve][acle] Implement some of the C intrinsics for brain float. Summary: The following intrinsics have been extended to support brain float types: svbfloat16_t svclasta[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data) bfloat16_t svclasta[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data) bfloat16_t svlasta[_bf16](svbool_t pg, svbfloat16_t op) svbfloat16_t svclastb[_bf16](svbool_t pg, svbfloat16_t fallback, svbfloat16_t data) bfloat16_t svclastb[_n_bf16](svbool_t pg, bfloat16_t fallback, svbfloat16_t data) bfloat16_t svlastb[_bf16](svbool_t pg, svbfloat16_t op) svbfloat16_t svdup[_n]_bf16(bfloat16_t op) svbfloat16_t svdup[_n]_bf16_m(svbfloat16_t inactive, svbool_t pg, bfloat16_t op) svbfloat16_t svdup[_n]_bf16_x(svbool_t pg, bfloat16_t op) svbfloat16_t svdup[_n]_bf16_z(svbool_t pg, bfloat16_t op) svbfloat16_t svdupq[_n]_bf16(bfloat16_t x0, bfloat16_t x1, bfloat16_t x2, bfloat16_t x3, bfloat16_t x4, bfloat16_t x5, bfloat16_t x6, bfloat16_t x7) svbfloat16_t svdupq_lane[_bf16](svbfloat16_t data, uint64_t index) svbfloat16_t svinsr[_n_bf16](svbfloat16_t op1, bfloat16_t op2) Reviewers: sdesmalen, kmclaughlin, c-rhodes, ctetreau, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82345	2020-06-29 16:09:08 +00:00
Bevin Hansson	fefa34faf5	[CodeGen] Use the common semantic for fixed-point codegen, not the result semantic. Summary: Using the result semantic is wrong in some cases, such as unsigned fixed-point + signed integer. In this case, the result semantic is unsigned and the common semantic is signed. Reviewers: leonardchan Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D82662	2020-06-29 16:22:29 +02:00
Fady Ghanim	80e15b4574	[Clang][OpenMP][OMPBuilder] Moving OMP allocation and cache creation code to OMPBuilderCBHelpers Summary: Modified the OMPBuilderCBHelpers in the following ways: - Moved location of class definition and deleted all constructors - Moved OpenMP-specific address allocation of local variables - Moved threadprivate variable creation for the current thread Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79676	2020-06-28 19:04:20 -04:00
Melanie Blower	f4aaed3bf1	Reland D81869 "Modify FPFeatures to use delta not absolute settings" This reverts commit `defd43a5b3`. with correction to solve msan report To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the floating point settings in PCH files aren't compatible, rewrite FPFeatures to use a delta in the settings rather than absolute settings. With this patch, these floating point options can be benign. Reviewers: rjmccall Differential Revision: https://reviews.llvm.org/D81869	2020-06-27 01:34:57 -07:00
Matt Arsenault	9e03bdebc1	AMDGPU: Add llvm.amdgcn.sqrt intrinsic I spread the GlobalISel test into the regular one, which I've been avoiding so far.	2020-06-26 15:07:07 -04:00
Melanie Blower	defd43a5b3	Revert "Revert "Revert "Modify FPFeatures to use delta not absolute settings""" This reverts commit `9518763d71`. Memory sanitizer fails in CGFPOptionsRAII::CGFPOptionsRAII dtor	2020-06-26 08:47:04 -07:00
Melanie Blower	9518763d71	Revert "Revert "Modify FPFeatures to use delta not absolute settings"" This reverts commit `b55d723ed6`. Reapply Modify FPFeatures to use delta not absolute settings To solve https://bugs.llvm.org/show_bug.cgi?id=46166 where the floating point settings in PCH files aren't compatible, rewrite FPFeatures to use a delta in the settings rather than absolute settings. With this patch, these floating point options can be benign. Reviewers: rjmccall Differential Revision: https://reviews.llvm.org/D81869	2020-06-26 08:00:08 -07:00
Melanie Blower	b55d723ed6	Revert "Modify FPFeatures to use delta not absolute settings" This reverts commit `3a748cbf86`. I'm reverting this commit because I forgot to format the commit message properly. Sorry for the thrash.	2020-06-26 07:52:57 -07:00
Melanie Blower	3a748cbf86	Modify FPFeatures to use delta not absolute settings	2020-06-26 07:41:09 -07:00
Francesco Petrogalli	7200fa38a9	[sve][acle] Add some C intrinsics for brain float types. Summary: The following intrinsics have been added: svuint16_t svcnt[_bf16]_m(svuint16_t inactive, svbool_t pg, svbfloat16_t op) svuint16_t svcnt[_bf16]_x(svbool_t pg, svbfloat16_t op) svuint16_t svcnt[_bf16]_z(svbool_t pg, svbfloat16_t op) svbfloat16_t svtbl[_bf16](svbfloat16_t data, svuint16_t indices) svbfloat16_t svtbl2[_bf16](svbfloat16x2_t data, svuint16_t indices) svbfloat16_t svtbx[_bf16](svbfloat16_t fallback, svbfloat16_t data, svuint16_t indices) Reviewers: c-rhodes, kmclaughlin, efriedma, sdesmalen, ctetreau Subscribers: tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D82429	2020-06-25 16:31:01 +00:00
Andrew Wock	15edd7aaa7	[FPEnv] PowerPC-specific builtin constrained FP enablement This change enables PowerPC compiler builtins to generate constrained floating point operations when clang is indicated to do so. A couple of possibly unexpected backend divergences between constrained floating point and regular behavior are highlighted under the test tag FIXME-CHECK. This may be something for those on the PPC backend to look at. Patch by: Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D82020	2020-06-25 11:42:58 -04:00
Alexey Bataev	32ea3397be	[OPENMP]Dynamic globalization for parallel target regions. Summary: Added support for dynamic memory allocation for globalized variables in case if execution of target regions in parallel is required. Reviewers: jdoerfert Subscribers: jholewinski, yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D82324	2020-06-25 08:25:24 -04:00
Tyker	c95ffadb24	[AssumeBundles] Use operand bundles to encode alignment assumptions Summary: NOTE: There is a mailing list discussion on this: http://lists.llvm.org/pipermail/llvm-dev/2019-December/137632.html Complementary to the assumption outliner prototype in D71692, this patch shows how we could simplify the code emitted for an alignment assumption. The generated code is smaller, less fragile, and it makes it easier to recognize the additional use as an "assumption use". As mentioned in D71692 and on the mailing list, we could adopt this scheme, and similar schemes for other patterns, without adopting the assumption outlining. Reviewers: hfinkel, xbolva00, lebedev.ri, nikic, rjmccall, spatel, jdoerfert, sstefan1 Reviewed By: jdoerfert Subscribers: yamauchi, kuter, fhahn, merge_guards_bot, hiraditya, bollu, rkruppe, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71739	2020-06-25 12:59:44 +02:00
Nigel Perks	dc3f8913d2	Fix crash on XCore on unused inline in EmitTargetMetadata EmitTargetMetadata passed to emitTargetMD a null pointer as returned from GetGlobalValue, for an unused inline function which has been removed from the module at that point. A FIXME in CodeGenModule.cpp commented that the calling code in EmitTargetMetadata should be moved into the one target that needs it (XCore). A review comment agreed. So the calling loop has been moved into the XCore subclass. The check for null is done in that loop. Differential Revision: https://reviews.llvm.org/D77068	2020-06-24 12:48:17 -07:00
Michael Liao	ebc9e0f1f0	Fix coding style. NFC. - Remove `else` after `return`.	2020-06-24 13:13:42 -04:00
Cullen Rhodes	05e10ee0ae	[AArch64][SVE2] Add bfloat16 support to whilerw/whilewr intrinsics Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82399	2020-06-24 10:06:31 +00:00
Cullen Rhodes	fd2c4b8999	[AArch64][SVE] Add bfloat16 support to svlen intrinsic Reviewed By: fpetrogalli Differential Revision: https://reviews.llvm.org/D82186	2020-06-24 10:05:51 +00:00
Kazushi (Jam) Marukawa	96d4ccf00c	[VE] Clang toolchain for VE Summary: This patch enables compilation of C code for the VE target with Clang. Differential Revision: https://reviews.llvm.org/D79411	2020-06-24 10:12:09 +02:00
Eli Friedman	bf8b63ed29	[clang codegen] Fix alignment of "Address" for incomplete array pointer. The code was assuming all incomplete types don't have meaningful alignment, but incomplete arrays do have meaningful alignment. Fixes https://bugs.llvm.org/show_bug.cgi?id=45710 Differential Revision: https://reviews.llvm.org/D79052	2020-06-23 17:16:17 -07:00
David Blaikie	4935419d77	Remove clang::Codegen::EHPadEndScope as unused Unused since r255423 / D15140 / `4e52d6f811` Found indirectly by assessing -debug-info-kind=constructors and observing the EHPadEndScope type was never emitted because the constructor is never called. (all credit to Amy Huang for identifying this issue)	2020-06-23 15:18:49 -07:00
Mikhail Maltsev	3f353a2e5a	[BFloat] Add convert/copy intrinsic support This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a Specifically it adds intrinsic support in clang and llvm for Arm and AArch64. The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Alexandros Lamprineas - Luke Cheeseman - Mikhail Maltsev - Momchil Velikov - Luke Geeson Differential Revision: https://reviews.llvm.org/D80928	2020-06-23 14:27:05 +00:00
Alexey Bataev	cb90e6a7c0	[OPENMP50]Codegen for scan directives in parallel for simd regions. Summary: Added codegen for scan directives in parallel for simd regions. Emits the code for the directive with inscan reductions. Original code: ``` #pragma omp parallel for simd reduction(inscan, op : ...) for() { <input phase>; #pragma omp scan (in)exclusive(...) <scan phase> } ``` is transformed to something: ``` #pragma omp parallel { size num_iters = <num_iters>; <type> buffer[num_iters]; #pragma omp for simd for (i: 0..<num_iters>) { <input phase>; buffer[i] = red; } #pragma omp barrier for (int k = 0; k != ceil(log2(num_iters)); ++k) for (size cnt = last_iter; cnt >= pow(2, k); --k) buffer[i] op= buffer[i-pow(2,k)]; #pragma omp for simd for (0..<num_iters>) { red = InclusiveScan ? buffer[i] : buffer[i-1]; <scan phase>; } } ``` Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D82115	2020-06-23 08:41:11 -04:00
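Note on `cb90e6a7c0`: the middle loop nest in the transformed code is a log-step prefix scan over the per-iteration buffer. The sketch below is one reading of that pseudo-code in plain Python, not the code Clang emits; it assumes the inner loop walks the element index downward from the last iteration and combines each element with the one `2**k` positions before it. The function names and the `identity` parameter are hypothetical.

```python
from operator import add


def inclusive_scan_in_place(buffer, op=add):
    """Log-step inclusive scan over buffer, updating it in place.

    Round k combines element i with element i - 2**k. Walking i downward
    means buffer[i - 2**k] still holds its value from the previous round
    when it is read.
    """
    n = len(buffer)
    rounds = (n - 1).bit_length() if n > 1 else 0  # equals ceil(log2(n))
    for k in range(rounds):
        step = 1 << k
        for i in range(n - 1, step - 1, -1):
            buffer[i] = op(buffer[i - step], buffer[i])
    return buffer


def scan_phase_values(buffer, inclusive, identity):
    """Value of the reduction variable seen by the scan phase of each iteration.

    For an exclusive scan, iteration 0 has no predecessor, so this sketch
    substitutes a caller-supplied identity value (an assumption).
    """
    if inclusive:
        return list(buffer)
    return [identity] + list(buffer[:-1])


if __name__ == "__main__":
    buf = inclusive_scan_in_place([1, 2, 3, 4, 5])
    print(buf)
    print(scan_phase_values(buf, inclusive=False, identity=0))
```

The operands are passed to `op` in left-to-right order, so the sketch also works for associative operations that are not commutative, such as string concatenation.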
Mikhail Maltsev	9c579540ff	[ARM] BFloat MatMul Intrinsics&CodeGen Summary: This patch adds support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch32. This includes IR intrinsics. Tests are provided as needed. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman - Simon Tatham Reviewers: stuij, t.p.northover, SjoerdMeijer, sdesmalen, fpetrogalli, LukeGeeson, simon_tatham, dmgreen, MarkMurrayARM Reviewed By: MarkMurrayARM Subscribers: MarkMurrayARM, danielkiss, kristof.beyls, hiraditya, cfe-commits, llvm-commits, chill, miyuki Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81740	2020-06-23 12:06:37 +00:00
Sander de Smalen	121e585ec8	[AArch64][SVE] ACLE: Add bfloat16 to struct load/stores. This patch contains: - Support in LLVM CodeGen for bfloat16 types for ld2/3/4 and st2/3/4. - New bfloat16 ACLE builtins for svld(2\|3\|4)[_vnum] and svst(2\|3\|4)[_vnum] Reviewers: stuij, efriedma, c-rhodes, fpetrogalli Reviewed By: fpetrogalli Tags: #clang, #lldb, #llvm Differential Revision: https://reviews.llvm.org/D82187	2020-06-23 12:12:35 +01:00
Craig Topper	0dfc8e1837	[X86] Remove encoding value from the X86_FEATURE and X86_FEATURE_COMPAT macro. NFCI This was originally done so we could separate the compatibility values and the llvm internal only features into separate entries in the feature array. This was needed when we explicitly had to convert the feature into the proper 32-bit chunk at every reference and we didn't want things moving around. Now everything is in an array and we have helper functions or macros to convert encoding to index. So renumbering is no longer an issue.	2020-06-22 11:46:21 -07:00
Mikhail Maltsev	3a4feb1d53	[ARM][BFloat] Implement bf16 get/set_lane without casts to i16 vectors Currently, in order to extract an element from a bf16 vector, we cast the vector to an i16 vector, perform the extraction, and cast the result to bfloat. This behavior was copied from the old fp16 implementation. The goal of this patch is to achieve optimal code generation for lane copying intrinsics in a subsequent patch (LLVM fails to fold certain combinations of bitcast, insertelement, extractelement and shufflevector instructions leading to the generation of suboptimal code). Differential Revision: https://reviews.llvm.org/D82206	2020-06-22 17:35:43 +00:00
Zhi Zhuang	37fb860301	Add support of __builtin_expect_with_probability Add a new builtin function __builtin_expect_with_probability and intrinsic llvm.expect.with.probability. The interface is __builtin_expect_with_probability(long expr, long expected, double probability). It behaves like __builtin_expect but takes one more argument giving the probability that the expression equals the expected value. The probability should be a constant floating-point expression in the range [0.0, 1.0] inclusive. It is similar to the __builtin_expect_with_probability function among GCC's built-in functions. Differential Revision: https://reviews.llvm.org/D79830	2020-06-22 10:21:28 -07:00
Eric Christopher	0861889be1	[clang/llvm] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 16:03:58 -07:00
Eric Christopher	10563e16aa	[Analysis/Transforms/Sanitizers] As part of using inclusive language within the llvm project, migrate away from the use of blacklist and whitelist.	2020-06-20 00:42:26 -07:00
Fangrui Song	2a4317bfb3	[SanitizeCoverage] Rename -fsanitize-coverage-{white,black}list to -fsanitize-coverage-{allow,block}list Keep deprecated -fsanitize-coverage-{white,black}list as aliases for compatibility for now. Reviewed By: echristo Differential Revision: https://reviews.llvm.org/D82244	2020-06-19 22:22:47 -07:00
Xiangling Liao	3f2e61c1fe	[AIX] Default AIX to using -fno-use-cxa-atexit On AIX, we use __atexit to register dtor functions rather than __cxa_atexit. So a driver change is needed to default AIX to using -fno-use-cxa-atexit. The Windows platform does not use __cxa_atexit either. Following its precedent, we remove the assertion for when -fuse-cxa-atexit is specified by the user, do not produce a message and silently default to -fno-use-cxa-atexit behavior. Differential Revision: https://reviews.llvm.org/D82136	2020-06-19 08:27:07 -04:00
Xiangling Liao	22337bfe7d	[AIX][Frontend] Static init implementation for AIX considering no priority 1. Provides no priority support and disables three priority related attributes: init_priority, ctor attr, dtor attr; 2. '-qunique' in XL compiler equivalent behavior of emitting sinit and sterm functions name using getUniqueModuleId() util function in LLVM (currently no support for InternalLinkage and WeakODRLinkage symbols); 3. Add testcases to emit IR sample with __sinit80000000, __dtor, and __sterm80000000; 4. Temporarily side-steps the need to implement the functionality of llvm.global_ctors and llvm.global_dtors arrays. The uses of that functionality in this patch (with respect to the name of the functions involved) are not representative of how the functionality will be used once implemented. Differential Revision: https://reviews.llvm.org/D74166	2020-06-19 08:27:07 -04:00
Sander de Smalen	ad828e3f4d	[SveEmitter] Add builtins for struct loads/stores (ld2/ld3/etc) The struct store intrinsics in LLVM IR take the individual parts as arguments, so this patch uses the intrinsics used for `svget` to break the tuples into individual parts. Reviewers: c-rhodes, efriedma, ctetreau, david-arm Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D81466	2020-06-19 10:35:42 +01:00
Xiangling Liao	ed1b556954	[NFC] Cleanup of EmitCXXGlobalInitFunc() and EmitCXXGlobalDtorFunc() Tidy up some code of EmitCXXGlobalInitFunc() and EmitCXXGlobalDtorFunc() as the pre-work of D74166 patch. Differential Revision: https://reviews.llvm.org/D81972	2020-06-18 18:49:23 -04:00
Ties Stuij	035795659b	[ARM][bfloat] Do not coerce bfloat arguments and returns to integers Summary: As part of moving the argument lowering handling for bfloat arguments and returns to the backend, this patch removes the code that was responsible for handling the coercion of those arguments in Clang's Codegen. Subscribers: kristof.beyls, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81837	2020-06-18 18:26:01 +01:00
Francesco Petrogalli	3e59dfc301	[llvm][SveEmitter] Emit the bfloat version of `svld1ro`. Summary: The new SVE builtin type `__SVBFloat16_t` is used to represent scalable vectors of bfloat elements. Reviewers: sdesmalen, efriedma, stuij, ctetreau, shafik, rengolin Subscribers: tschuett, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81304	2020-06-18 16:36:31 +00:00
Alexey Bataev	4971d0b8ec	[OPENMP50]Allow nonmonotonic modifier for all schedule kinds. Summary: According to OpenMP 5.0, nonmonotonic modifier can be used with all schedule kinds, not only dynamic and guided as in OpenMP 4.5. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D82026	2020-06-18 12:30:50 -04:00
Alexey Bataev	1ec469cf4c	[OPENMP50]Codegen for scan directives in parallel for regions. Summary: Added codegen for scan directives in parallel for regions. Emits the code for the directive with inscan reductions. Original code: ``` #pragma omp parallel for reduction(inscan, op : ...) for() { <input phase>; #pragma omp scan (in)exclusive(...) <scan phase> } ``` is transformed to something: ``` #pragma omp parallel { size num_iters = <num_iters>; <type> buffer[num_iters]; #pragma omp for for (i: 0..<num_iters>) { <input phase>; buffer[i] = red; } #pragma omp barrier for (int k = 0; k != ceil(log2(num_iters)); ++k) for (size cnt = last_iter; cnt >= pow(2, k); --k) buffer[i] op= buffer[i-pow(2,k)]; #pragma omp for for (0..<num_iters>) { red = InclusiveScan ? buffer[i] : buffer[i-1]; <scan phase>; } } ``` Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D81478	2020-06-18 11:56:55 -04:00
Alexandre Ganea	89ea0b0520	[MC] Pass down argv0 & cc1 cmd-line to the back-end and store in MCTargetOptions When targeting CodeView, the goal is to store argv0 & cc1 cmd-line in the emitted .OBJ, in order to allow a reproducer from the .OBJ alone. This patch is to simplify https://reviews.llvm.org/D80833	2020-06-18 09:17:14 -04:00
Lucas Prates	ada4c9dc4a	[ARM][Clang] Removing lowering of half-precision FP arguments and returns from Clang's CodeGen Summary: As part of moving the argument lowering handling for half-precision floating point arguments and returns to the backend, this patch removes the code that was responsible for handling the coercion of those arguments in Clang's Codegen. Reviewers: rjmccall, chill, ostannard, dnsampaio Reviewed By: ostannard Subscribers: stuij, kristof.beyls, dmgreen, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81451	2020-06-18 13:17:07 +01:00
Florian Hahn	b5e082e728	[Matrix] Add __builtin_matrix_column_store to Clang. This patch adds __builtin_matrix_column_major_store to Clang, as described in clang/docs/MatrixTypes.rst. In the initial version, the stride is not optional yet. Reviewers: rjmccall, jfb, rsmith, Bigcheese Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72782	2020-06-18 11:39:02 +01:00
Sander de Smalen	4ea8e27a64	[SveEmitter] Add builtins to insert/extract subvectors from tuples (svget/svset) For example: svint32_t svget4(svint32x4_t tuple, uint64_t imm_index) returns the subvector at `index`, which must be in range `0..3`. svint32x3_t svset3(svint32x3_t tuple, uint64_t index, svint32_t vec) returns a tuple vector with `vec` inserted into `tuple` at `index`, which must be in range `0..2`. Reviewers: c-rhodes, efriedma Reviewed By: c-rhodes Tags: #clang Differential Revision: https://reviews.llvm.org/D81464	2020-06-18 11:06:16 +01:00
Florian Hahn	934bcaf10b	[Matrix] Add __builtin_matrix_column_load to Clang. This patch adds __builtin_matrix_column_major_load to Clang, as described in clang/docs/MatrixTypes.rst. In the initial version, the stride is not optional yet. Reviewers: rjmccall, rsmith, jfb, Bigcheese Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72781	2020-06-18 10:47:55 +01:00
Sander de Smalen	1d7b4a7e5e	[SveEmitter] Add builtins for tuple creation (svcreate2/svcreate3/etc) The svcreate builtins allow constructing a tuple from individual vectors, e.g. `svint32x2_t svcreate2(svint32_t v2, svint32_t v2)` Reviewers: c-rhodes, david-arm, efriedma Reviewed By: c-rhodes, efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D81463	2020-06-18 10:07:09 +01:00
Huihui Zhang	9d8d0646d7	[NFC] Silence compiler warning [-Wmissing-braces]. clang/lib/CodeGen/CGNonTrivialStruct.cpp:330:7: warning: suggest braces around initialization of subobject [-Wmissing-braces] Address(CGF->Builder.CreateLoad(CGF->GetAddrOfLocalVar(Args[Ints])), ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ {	2020-06-17 13:01:53 -07:00
Ian Levesque	7c7c8e0da4	[xray] Option to omit the function index Summary: Add a flag to omit the xray_fn_idx to cut size overhead and relocations roughly in half at the cost of reduced performance for single function patching. Minor additions to compiler-rt support per-function patching without the index. Reviewers: dberris, MaskRay, johnislarry Subscribers: hiraditya, arphaman, cfe-commits, #sanitizers, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D81995	2020-06-17 13:49:01 -04:00
Alexey Bataev	34ee2549a7	[OPENMP50]Codegen for scan directive in for simd regions. Summary: Added codegen for scan directives in parallel for regions. Emits the code for the directive with inscan reductions. Original code: ``` #pragma omp for simd reduction(inscan, op : ...) for(...) { <input phase>; #pragma omp scan (in)exclusive(...) <scan phase> } ``` is transformed to something: ``` size num_iters = <num_iters>; <type> buffer[num_iters]; #pragma omp for simd for (i: 0..<num_iters>) { <input phase>; buffer[i] = red; } #pragma omp barrier for (int k = 0; k != ceil(log2(num_iters)); ++k) for (size cnt = last_iter; cnt >= pow(2, k); --k) buffer[i] op= buffer[i-pow(2,k)]; #pragma omp for simd for (0..<num_iters>) { red = InclusiveScan ? buffer[i] : buffer[i-1]; <scan phase>; } ``` Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D81658	2020-06-17 08:43:17 -04:00
Sander de Smalen	e51c1d06a9	[SveEmitter] Add builtins for svtbl2 Reviewers: david-arm, efriedma, c-rhodes Reviewed By: c-rhodes Tags: #clang Differential Revision: https://reviews.llvm.org/D81462	2020-06-17 09:41:38 +01:00
Jun Ma	4a1776979f	[CodeGen][TLS] Set TLS Model for __tls_guard as well. Differential Revision: https://reviews.llvm.org/D81543	2020-06-17 08:31:13 +08:00
Christopher Tetreault	eb81c85afd	[SVE] Deprecate default false variant of VectorType::get Reviewers: efriedma, fpetrogalli, kmclaughlin, huntergr Reviewed By: fpetrogalli Subscribers: cfe-commits, tschuett, rkruppe, psnobl, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D80342	2020-06-16 15:16:11 -07:00
Alexey Bataev	0f631bd3bb	Revert "[OPENMP50]Codegen for scan directive in for simd regions." This reverts commit `6e78a3086a` to solve the problem with mem leak.	2020-06-16 17:01:59 -04:00
Alexey Bataev	6e78a3086a	[OPENMP50]Codegen for scan directive in for simd regions. Summary: Added codegen for scan directives in parallel for regions. Emits the code for the directive with inscan reductions. Original code: ``` #pragma omp for simd reduction(inscan, op : ...) for(...) { <input phase>; #pragma omp scan (in)exclusive(...) <scan phase> } ``` is transformed to something: ``` size num_iters = <num_iters>; <type> buffer[num_iters]; #pragma omp for simd for (i: 0..<num_iters>) { <input phase>; buffer[i] = red; } #pragma omp barrier for (int k = 0; k != ceil(log2(num_iters)); ++k) for (size cnt = last_iter; cnt >= pow(2, k); --k) buffer[i] op= buffer[i-pow(2,k)]; #pragma omp for simd for (0..<num_iters>) { red = InclusiveScan ? buffer[i] : buffer[i-1]; <scan phase>; } ``` Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D81658	2020-06-16 16:13:27 -04:00
Luke Geeson	10b6567f49	[AArch64]: BFloat MatMul Intrinsics&CodeGen This patch upstreams support for BFloat Matrix Multiplication Intrinsics and Code Generation from __bf16 to AArch64. This includes IR intrinsics. Unittests are provided as needed. AArch32 Intrinsics + CodeGen will come after this patch. This patch is part of a series implementing the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: Luke Geeson - Momchil Velikov - Mikhail Maltsev - Luke Cheeseman Reviewers: SjoerdMeijer, t.p.northover, sdesmalen, labrinea, miyuki, stuij Reviewed By: miyuki, stuij Subscribers: kristof.beyls, hiraditya, danielkiss, cfe-commits, llvm-commits, miyuki, chill, pbarrio, stuij Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80752 Change-Id: I174f0fd0f600d04e3799b06a7da88973c6c0703f	2020-06-16 15:23:30 +01:00
Stanislav Mekhanoshin	9ee272f13d	[AMDGPU] Add gfx1030 target Differential Revision: https://reviews.llvm.org/D81886	2020-06-15 16:18:05 -07:00
Akira Hatanaka	2cfb027369	[CodeGen][NFC] Add a helper function that returns the addresses of parameters of non-trivial C struct special functions This removes the need to pass std::array of Addresses to getFunction, which were overwritten in the function.	2020-06-15 15:59:16 -07:00
Arnold Schwaighofer	4a8120ca9f	Fix ConstantAggregateBuilderBase::getRelativeOffset Summary: If a record has a mix of relative pointers and other fields they wouldn't necessarily be the same. Fallout from D77592. rdar://64309883 Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81857	2020-06-15 12:23:20 -07:00
Jeff Mott	8799ebbc1f	[clang] Fix or emit diagnostic for checked arithmetic builtins with _ExtInt types - Fix computed size for _ExtInt types passed to checked arithmetic builtins. - Emit diagnostic when signed _ExtInt larger than 128-bits is passed to __builtin_mul_overflow. - Change Sema checks for builtins to accept placeholder types. Differential Revision: https://reviews.llvm.org/D81420	2020-06-15 06:51:54 -07:00
Tyker	51e4aa87e0	Attempt to fix failing buildbots after `3bab88b7ba` Prevent IR-gen from emitting consteval declarations Summary: With this patch, instead of emitting calls to consteval functions, IR-gen will emit a store of the already computed result.	2020-06-15 12:58:37 +02:00
Kirill Bobyrev	550c4562d1	Revert "Prevent IR-gen from emitting consteval declarations" This reverts commit `3bab88b7ba`. This patch causes test failures: http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/17260	2020-06-15 12:14:15 +02:00
Tyker	3bab88b7ba	Prevent IR-gen from emitting consteval declarations Summary: With this patch, instead of emitting calls to consteval functions, IR-gen will emit a store of the already computed result. Reviewers: rsmith Reviewed By: rsmith Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76420	2020-06-15 10:47:14 +02:00
Nikita Popov	7cac7e0cfc	[IR] Prefer hasFnAttribute() where possible (NFC) When checking for an enum function attribute, use hasFnAttribute() rather than hasAttribute() at FunctionIndex, because it is significantly faster (and more concise to boot).	2020-06-15 09:30:35 +02:00
Sander de Smalen	91a4a592ed	[SveEmitter] Add SVE tuple types and builtins for svundef. This patch adds new SVE types to Clang that describe tuples of SVE vectors. For example `svint32x2_t` which maps to the twice-as-wide vector `<vscale x 8 x i32>`. Similarly, `svint32x3_t` will map to `<vscale x 12 x i32>`. It also adds builtins to return an `undef` vector for a given SVE type. Reviewers: c-rhodes, david-arm, ctetreau, efriedma, rengolin Reviewed By: c-rhodes Tags: #clang Differential Revision: https://reviews.llvm.org/D81459	2020-06-15 07:36:01 +01:00
Alex Bradbury	3dcfd482cb	[CodeGen] Increase applicability of ffine-grained-bitfield-accesses for targets with limited native integer widths As pointed out in PR45708, -ffine-grained-bitfield-accesses doesn't trigger in all cases you think it might for RISC-V. The logic in CGRecordLowering::accumulateBitFields checks that OffsetInRecord is a legal integer according to the datalayout. RISC targets will typically only have the native width as a legal integer type, so this check will fail for an OffsetInRecord of 8 or 16 when you would expect the transformation to still be worthwhile. This patch changes the logic to check for an OffsetInRecord of at least 1 byte that fits in a legal integer and is a power of 2. We would prefer to query whether native load/store operations are available, but I don't believe that is possible. Differential Revision: https://reviews.llvm.org/D79155	2020-06-12 10:33:47 +01:00
Akira Hatanaka	c9a52de002	[CodeGen] Simplify the way lifetime of block captures is extended Rather than pushing inactive cleanups for the block captures at the entry of a full expression and activating them during the creation of the block literal, just call pushLifetimeExtendedDestroy to ensure the cleanups are popped at the end of the scope enclosing the block expression. rdar://problem/63996471 Differential Revision: https://reviews.llvm.org/D81624	2020-06-11 16:06:22 -07:00
John McCall	7fac1acc61	Set the LLVM FP optimization flags conservatively. Functions can have local pragmas that override the global settings. We set the flags eagerly based on global settings, but if we emit an expression under the influence of a pragma, we clear the appropriate flags from the function. In order to avoid doing a ton of redundant work whenever we emit an FP expression, configure the IRBuilder to default to global settings, and only reconfigure it when we see an FP expression that's not using the global settings. Patch by Michele Scandale! https://reviews.llvm.org/D80462	2020-06-11 18:16:41 -04:00
Alexey Bataev	43101d10db	[OPENMP50]Codegen for scan directive in simd loops. Added codegen for scan directives in simd loop. The codegen transforms original code: ``` int x = 0; #pragma omp simd reduction(inscan, +: x) for (..) { <first part> #pragma omp scan inclusive(x) <second part> } ``` into ``` int x = 0; for (..) { int x_priv = 0; <first part> x = x_priv + x; x_priv = x; <second part> } ``` and ``` int x = 0; #pragma omp simd reduction(inscan, +: x) for (..) { <first part> #pragma omp scan exclusive(x) <second part> } ``` into ``` int x = 0; for (..) { int x_priv = 0; <second part> int temp = x; x = x_priv + x; x_priv = temp; <first part> } ``` Differential revision: https://reviews.llvm.org/D78232	2020-06-11 14:48:43 -04:00
Leonard Chan	71568a9e28	[clang] Frontend components for the relative vtables ABI (round 2) This patch contains all of the clang changes from D72959. - Generalize the relative vtables ABI such that it can be used by other targets. - Add an enum VTableComponentLayout which controls whether components in the vtable should be pointers to other structs or relative offsets to those structs. Other ABIs can change this enum to restructure how components in the vtable are laid out/accessed. - Add methods to ConstantInitBuilder for inserting relative offsets to a specified position in the aggregate being constructed. - Fix failing tests under new PM and ASan and MSan issues. See D72959 for background info. Differential Revision: https://reviews.llvm.org/D77592	2020-06-11 11:17:08 -07:00
Alexey Bataev	fac7259c81	Revert "[OPENMP50]Codegen for scan directive in simd loops." This reverts commit `fb80e67f10` to resolve the issue with asan buildbots.	2020-06-11 11:22:51 -04:00
Alexey Bataev	90b54fa045	[OPENMP50]Codegen for use_device_addr clauses. Summary: Added codegen for use_device_addr clause. The components of the list items are mapped as a kind of RETURN components and then the returned base address is used instead of the real address of the base declaration used in the use_device_addr expressions. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D80730	2020-06-11 09:54:51 -04:00
Alexey Bataev	fb80e67f10	[OPENMP50]Codegen for scan directive in simd loops. Added codegen for scan directives in simd loop. The codegen transforms original code: ``` int x = 0; #pragma omp simd reduction(inscan, +: x) for (..) { <first part> #pragma omp scan inclusive(x) <second part> } ``` into ``` int x = 0; for (..) { int x_priv = 0; <first part> x = x_priv + x; x_priv = x; <second part> } ``` and ``` int x = 0; #pragma omp simd reduction(inscan, +: x) for (..) { <first part> #pragma omp scan exclusive(x) <second part> } ``` into ``` int x = 0; for (..) { int x_priv = 0; <second part> int temp = x; x = x_priv + x; x_priv = temp; <first part> } ``` Differential revision: https://reviews.llvm.org/D78232	2020-06-11 09:01:23 -04:00
Daniel Grumberg	e87e55edbc	Make ASTFileSignature an array of 20 uint8_t instead of 5 uint32_t Reviewers: aprantl, dexonsmith, Bigcheese Subscribers: arphaman, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81347	2020-06-11 09:12:29 +01:00
Craig Topper	ed34140e11	[X86] Move X86 stuff out of TargetParser.h and into the recently created X86TargetParser.h. NFC	2020-06-10 22:06:34 -07:00
Leonard Chan	7201272d4c	Revert "[clang] Frontend components for the relative vtables ABI" This reverts commit `2e009dbcb3`. Reverting since there were some test failures on buildbots that used the new pass manager. ASan and MSan are also finding some bugs in this that I'll need to address.	2020-06-10 13:50:05 -07:00
Leonard Chan	2e009dbcb3	[clang] Frontend components for the relative vtables ABI This patch contains all of the clang changes from D72959. - Generalize the relative vtables ABI such that it can be used by other targets. - Add an enum VTableComponentLayout which controls whether components in the vtable should be pointers to other structs or relative offsets to those structs. Other ABIs can change this enum to restructure how components in the vtable are laid out/accessed. - Add methods to ConstantInitBuilder for inserting relative offsets to a specified position in the aggregate being constructed. See D72959 for background info. Differential Revision: https://reviews.llvm.org/D77592	2020-06-10 12:48:10 -07:00
Arthur Eubanks	bc38793852	Change debuginfo check for addHeapAllocSiteMetadata Summary: Move check inside of addHeapAllocSiteMetadata(). Change check to DebugInfo <= DebugLineTablesOnly. Reviewers: akhuang Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81481	2020-06-09 11:01:06 -07:00
Thomas Lively	b7d369280b	[WebAssembly] Implement prototype SIMD rounding instructions Summary: As specified in https://github.com/WebAssembly/simd/pull/232. These instructions are implemented as LLVM intrinsics for now rather than normal ISel patterns to make these instructions opt-in. Once the instructions are merged to the spec proposal, the intrinsics will be replaced with proper ISel patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D81222	2020-06-09 10:14:14 -07:00
Saiyedul Islam	675cefbf60	[AMDGPU] Introduce Clang builtins to be mapped to AMDGCN atomic inc/dec intrinsics Summary: __builtin_amdgcn_atomic_inc32(int Ptr, int Val, unsigned MemoryOrdering, const char SyncScope) __builtin_amdgcn_atomic_inc64(int64_t Ptr, int64_t Val, unsigned MemoryOrdering, const char SyncScope) __builtin_amdgcn_atomic_dec32(int Ptr, int Val, unsigned MemoryOrdering, const char SyncScope) __builtin_amdgcn_atomic_dec64(int64_t Ptr, int64_t Val, unsigned MemoryOrdering, const char SyncScope) The first and second arguments get transparently passed to the amdgcn atomic inc/dec intrinsic. The fifth argument of the intrinsic is set to true if the first argument of the builtin is a volatile pointer. The third argument of this builtin is one of the memory-ordering specifiers ATOMIC_ACQUIRE, ATOMIC_RELEASE, ATOMIC_ACQ_REL, or ATOMIC_SEQ_CST following C++11 memory model semantics. This is mapped to the corresponding LLVM atomic memory ordering for the atomic inc/dec instruction using the Clang atomic C ABI. The fourth argument is an AMDGPU-specific synchronization scope defined as a string. Reviewers: arsenm, sameerds, JonChesterfield, jdoerfert Reviewed By: arsenm, sameerds Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, jfb, kerbowa, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80804	2020-06-09 17:02:58 +00:00
Arthur Eubanks	ce7d3e1c55	Reland (again) D80966 [codeview] Put !heapallocsite on calls to operator new Check that getDebugInfo() is not null, as in the first revision, before calling getDebugInfo()->addHeapAllocSiteMetadata(). Else would cause a crash with a new expression in a default arg. --- Clang marks calls to operator new as heap allocation sites, but the operator declared at global scope returns a void pointer. There is no explicit cast in the code, so the compiler has to write down the allocated type itself. Also generalize a cast to use CallBase, so that we mark heap alloc sites when exceptions are enabled. Differential Revision: https://reviews.llvm.org/D80966	2020-06-09 09:27:32 -07:00
Alexey Bataev	cb9191c042	[OPENMP]Improve code readability, NFC. Reuse existing function instead of code duplication and use better type.	2020-06-09 08:50:36 -04:00
Florian Hahn	3323a628ec	[Matrix] Add __builtin_matrix_transpose to Clang. This patch adds __builtin_matrix_transpose to Clang, as described in clang/docs/MatrixTypes.rst. Reviewers: rjmccall, jfb, rsmith, Bigcheese Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72778	2020-06-09 10:14:37 +01:00
Arthur Eubanks	a92ce3b706	Revert "Reland D80966 [codeview] Put !heapallocsite on calls to operator new" This reverts commit `b6e143aa54`. Causes https://bugs.chromium.org/p/chromium/issues/detail?id=1092370#c5. Will investigate and reland (again).	2020-06-08 12:49:41 -07:00
Jian Cai	4db2b70248	Add a flag to debug automatic variable initialization Summary: Add -ftrivial-auto-var-init-stop-after= to limit the number of times stack variables are initialized when -ftrivial-auto-var-init= is used to initialize stack variables to zero or a pattern. This flag can be used to bisect uninitialized uses of a stack variable exposed by automatic variable initialization, such as http://crrev.com/c/2020401. Reviewers: jfb, vitalybuka, kcc, glider, rsmith, rjmccall, pcc, eugenis, vlad.tsyrklevich Reviewed By: jfb Subscribers: phosek, hubert.reinterpretcast, srhines, MaskRay, george.burgess.iv, dexonsmith, inglorion, gbiv, llozano, manojgupta, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77168	2020-06-08 12:30:56 -07:00
Arthur Eubanks	c07339c675	Move San module passes later in the NPM pipeline Summary: This fixes pr33372.cpp under the new pass manager. ASan adds padding to globals. For example, it will change a {i32, i32, i32} to a {{i32, i32, i32}, [52 x i8]}. However, when loading from the {i32, i32, i32}, InstCombine may (after various optimizations) end up loading 16 bytes instead of 12, likely because it thinks the [52 x i8] padding is ok to load from. But ASan checks that padding should not be loaded from. Ultimately this is an issue of San passes wanting to be run after all optimizations. This change moves the module passes right next to the corresponding function passes. Also remove comment that's no longer relevant, this is the last ASan/MSan/TSan failure under the NPM (hopefully...). As mentioned in https://reviews.llvm.org/rG1285e8bcac2c54ddd924ffb813b2b187467ac2a6, NPM doesn't support LTO + sanitizers, so modified some tests that test for that. Reviewers: leonardchan, vitalybuka Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D81323	2020-06-08 12:08:49 -07:00
Fangrui Song	fc935fc35b	Reland D80979 [clang] Implement VectorType logic not operator With a fix to use -triple %itanium_abi_triple Differential Revision: https://reviews.llvm.org/D80979	2020-06-08 09:32:30 -07:00
Nico Weber	abca3b7b2c	Revert "[clang] Implement VectorType logic not operator." This reverts commit `a0de3335ed`. Breaks check-clang on Windows, see e.g. https://reviews.llvm.org/D80979#2078750 (but fails on all other Windows bots too).	2020-06-08 06:45:21 -04:00
Jun Ma	a0de3335ed	[clang] Implement VectorType logic not operator. Differential Revision: https://reviews.llvm.org/D80979	2020-06-08 08:41:01 +08:00
Fangrui Song	b6e143aa54	Reland D80966 [codeview] Put !heapallocsite on calls to operator new With a change to use `CGM.getCodeGenOpts().getDebugInfo() != codegenoptions::NoDebugInfo` instead of `getDebugInfo()`, to fix `Profile-<arch> :: instrprof-gcov-multithread_fork.test` See CodeGenModule::CodeGenModule, `EmitGcovArcs \|\| EmitGcovNotes` can set `clang::CodeGen::CodeGenModule::DebugInfo`. --- Clang marks calls to operator new as heap allocation sites, but the operator declared at global scope returns a void pointer. There is no explicit cast in the code, so the compiler has to write down the allocated type itself. Also generalize a cast to use CallBase, so that we mark heap alloc sites when exceptions are enabled. Differential Revision: https://reviews.llvm.org/D80966	2020-06-07 13:35:20 -07:00
Florian Hahn	4affc444b4	[Matrix] Implement * binary operator for MatrixType. This patch implements the * binary operator for values of MatrixType. It adds support for matrix * matrix, scalar * matrix and matrix * scalar. For the matrix, matrix case, the number of columns of the first operand must match the number of rows of the second. For the scalar,matrix variants, the element type of the matrix must match the scalar type. Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D76794	2020-06-07 11:11:27 +01:00
Douglas Yung	059ba74bb6	Revert "[codeview] Put !heapallocsite on calls to operator new" This reverts commit `672ed53860`. This commit is hitting an assertion failure across multiple bots in the test: Profile-<arch> :: instrprof-gcov-multithread_fork.test Failing bots include: http://lab.llvm.org:8011/builders/llvm-avr-linux/builds/2205 http://lab.llvm.org:8011/builders/clang-cmake-aarch64-lld/builds/8967 http://lab.llvm.org:8011/builders/clang-cmake-armv7-full/builds/10789 http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/27750 http://lab.llvm.org:8011/builders/sanitizer-ppc64be-linux/builds/16751	2020-06-06 23:30:46 +00:00
Richard Smith	f39e12a06b	PR34581: Don't remove an 'if (p)' guarding a call to 'operator delete(p)' under -Oz. Summary: This transformation is correct for a builtin call to 'free(p)', but not for 'operator delete(p)'. There is no guarantee that a user replacement 'operator delete' has no effect when called on a null pointer. However, the principle behind the transformation is correct, and can be applied more broadly: a 'delete p' expression is permitted to unconditionally call 'operator delete(p)'. So do that in Clang under -Oz where possible. We do this whether or not 'p' has trivial destruction, since the destruction might turn out to be trivial after inlining, and even for a class-specific (but non-virtual, non-destroying, non-array) 'operator delete'. Reviewers: davide, dnsampaio, rjmccall Reviewed By: dnsampaio Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79378	2020-06-05 17:13:43 -07:00
Reid Kleckner	672ed53860	[codeview] Put !heapallocsite on calls to operator new Clang marks calls to operator new as heap allocation sites, but the operator declared at global scope returns a void pointer. There is no explicit cast in the code, so the compiler has to write down the allocated type itself. Also generalize a cast to use CallBase, so that we mark heap alloc sites when exceptions are enabled. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D80966	2020-06-05 12:52:38 -07:00
Ties Stuij	8b137a4306	[clang][BFloat] Add create/set/get/dup intrinsics Summary: This patch is part of a series that adds support for the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile The following people contributed to this patch: - Luke Cheeseman - Momchil Velikov - Luke Geeson - Ties Stuij - Mikhail Maltsev Reviewers: t.p.northover, sdesmalen, fpetrogalli, LukeGeeson, stuij, labrinea Reviewed By: labrinea Subscribers: miyuki, dmgreen, labrinea, kristof.beyls, ilya-biryukov, MaskRay, jkorous, arphaman, usaxena95, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79710	2020-06-05 14:35:10 +01:00
Ties Stuij	ecd682bbf5	[ARM] Add __bf16 as new Bfloat16 C Type Summary: This patch upstreams support for a new storage only bfloat16 C type. This type is used to implement primitive support for bfloat16 data, in line with the Bfloat16 extension of the Armv8.6-a architecture, as detailed here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a The bfloat type, and its properties are specified in the Arm Architecture Reference Manual: https://developer.arm.com/docs/ddi0487/latest/arm-architecture-reference-manual-armv8-for-armv8-a-architecture-profile In detail this patch: - introduces an opaque, storage-only C-type __bf16, which introduces a new bfloat IR type. This is part of a patch series, starting with command-line and Bfloat16 assembly support. The subsequent patches will upstream intrinsics support for BFloat16, followed by Matrix Multiplication and the remaining Virtualization features of the armv8.6-a architecture. The following people contributed to this patch: - Luke Cheeseman - Momchil Velikov - Alexandros Lamprineas - Luke Geeson - Simon Tatham - Ties Stuij Reviewers: SjoerdMeijer, rjmccall, rsmith, liutianle, RKSimon, craig.topper, jfb, LukeGeeson, fpetrogalli Reviewed By: SjoerdMeijer Subscribers: labrinea, majnemer, asmith, dexonsmith, kristof.beyls, arphaman, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76077	2020-06-05 10:32:43 +01:00
Alexey Bataev	4e3d4622b1	Fix undefined behaviour when trying to deref nullptr.	2020-06-04 17:52:06 -04:00
Alexey Bataev	bd1c03d7b7	[OPENMP50]Codegen for inscan reductions in worksharing directives. Summary: Implemented codegen for reduction clauses with inscan modifiers in worksharing constructs. Emits the code for the directive with inscan reductions. The code is the following: ``` size num_iters = <num_iters>; <type> buffer[num_iters]; for (i: 0..<num_iters>) { <input phase>; buffer[i] = red; } for (int k = 0; k != ceil(log2(num_iters)); ++k) for (size cnt = last_iter; cnt >= pow(2, k); --k) buffer[i] op= buffer[i-pow(2,k)]; for (0..<num_iters>) { red = InclusiveScan ? buffer[i] : buffer[i-1]; <scan phase>; } ``` Reviewers: jdoerfert Subscribers: yaxunl, guansong, arphaman, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D79948	2020-06-04 16:29:33 -04:00
Alexey Bataev	9ca5a6d3b5	[OPENMP]Fix PR46146: Do not consider globalized variables as NRVO candidates. Summary: If a variable must be globalized in OpenMP mode (local automatic variable, GPU compilation mode, the variable may escape its declaration context by reference or by pointer), it should not be considered an NRVO candidate. Otherwise, the return value of the function might not be updated correctly. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D80936	2020-06-04 12:33:25 -04:00
Craig Topper	dd863ccae1	[X86] Separate X86_CPU_TYPE_COMPAT_WITH_ALIAS from X86_CPU_TYPE_COMPAT. NFC Add a separate X86_CPU_TYPE_COMPAT_ALIAS that carries alias string and the enum from X86_CPU_TYPE_COMPAT.	2020-06-03 14:13:12 -07:00
Yaxun (Sam) Liu	04abbb3a78	[HIP] Change default --gpu-max-threads-per-block value to 1024 Differential Revision: https://reviews.llvm.org/D76795	2020-06-03 11:09:22 -04:00
Andrew Wock	15a1780a10	[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins This is a re-revert with a corrected test. This patch adds a test for the PowerPC fma compiler builtins, some variations of which negate inputs and outputs. The code to generate IR for these builtins was untested before this patch. Originally, the code used the outdated method of subtracting floating point values from -0.0 as floating point negation. This patch remedies that. Patch by: Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D76949	2020-06-03 09:45:27 -04:00
Alexey Bataev	59e0987a06	[OPENMP]Fix PR46170: partial mapping for array sections of data members. Summary: If the data member is mapped as an array section, need to emit the pointer to the last element of this array section and use this pointer as the highest element in partial struct data. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D81037	2020-06-03 09:10:20 -04:00
Lucas Prates	8beaba13b8	[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts Summary: During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly assuming all the pointers from which loads were being generated for vld1 intrinsics were aligned according to the intrinsics result type, causing alignment faults on the code generated by the backend. This patch updates vld1 intrinsics' CodeGen to properly capture the correct load alignment based on the type of the pointer provided as input for the intrinsic. Reviewers: t.p.northover, ostannard, pcc, efriedma Reviewed By: ostannard, efriedma Subscribers: echristo, plotfi, nickdesaulniers, efriedma, kristof.beyls, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79721	2020-06-03 11:39:27 +01:00
Wei Mi	7a6c89427c	[SampleFDO] Add use-sample-profile function attribute. When sampleFDO is enabled, people may expect they can use -fno-profile-sample-use to opt-out using sample profile for a certain file. That could be either for debugging purpose or for performance tuning purpose. However, when thinlto is enabled, if a function in file A compiled with -fno-profile-sample-use is imported to another file B compiled with -fprofile-sample-use, the inlined copy of the function in file B may still get its profile annotated. The inconsistency may even introduce profile unused warning because if the target is not compiled with explicit debug information flag, the function in file A won't have its debug information enabled (debug information will be enabled implicitly only when -fprofile-sample-use is used). After it is imported into file B which is compiled with -fprofile-sample-use, profile annotation for the outline copy of the function will fail because the function has no debug information, and that will trigger profile unused warning. We add a new attribute use-sample-profile to control whether a function will use its sample profile no matter for its outline or inline copies. That will make the behavior of -fno-profile-sample-use consistent. Differential Revision: https://reviews.llvm.org/D79959	2020-06-02 17:23:17 -07:00
Vitaly Buka	232d348c6e	[MTE] Convert StackSafety into analysis This lets us remove !stack-safe metadata and better control when to perform StackSafety analysis. Reviewers: eugenis Subscribers: hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80771	2020-06-02 16:08:14 -07:00
Min-Yih Hsu	4431d64c10	Support ExtVectorType conditional operator Extension vectors can now be used in the element-wise conditional selector. For example: ``` R[i] = C[i]? A[i] : B[i] ``` This feature was previously only enabled in OpenCL C. Now it's also available in C. Note that it behaves differently from GNU vectors (i.e. __vector_size__). Extension vectors select on the signedness of the vector. GNU vectors, on the other hand, do normal bool conversions. Also, this feature is not available in C++. Differential Revision: https://reviews.llvm.org/D80574	2020-06-02 16:35:42 +00:00
Alexey Bataev	89d9dba2c6	[OPENMP50]Initial codegen for 'affinity' clauses. Summary: Added initial codegen for 'affinity' clauses on task directives. Emits the following code: ``` kmp_task_affinity_info_t affs[<num_elems>]; void *td = __kmpc_task_alloc(..); affs[<i>].base = &data_i; affs[<i>].size = sizeof(data_i); __kmpc_omp_reg_task_with_affinity(&loc, <gtid>, td, <num_elems>, affs); ``` The result returned by the call to the `__kmpc_omp_reg_task_with_affinity` function is currently ignored, since the runtime currently ignores args and returns 0 unconditionally. Reviewers: jdoerfert Subscribers: yaxunl, guansong, sstefan1, llvm-commits, cfe-commits, caomhin Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80240	2020-06-02 10:50:08 -04:00
Sriraman Tallam	e0bca46b08	Options for Basic Block Sections, enabled in D68063 and D73674. This patch adds clang options: -fbasic-block-sections={all,<filename>,labels,none} and -funique-basic-block-section-names. LLVM Support for basic block sections is already enabled. + -fbasic-block-sections={all, <file>, labels, none} : Enables/Disables basic block sections for all or a subset of basic blocks. "labels" only enables basic block symbols. + -funique-basic-block-section-names: Enables unique section names for basic block sections, disabled by default. Differential Revision: https://reviews.llvm.org/D68049	2020-06-02 00:23:32 -07:00
John McCall	8a8d703be0	Fix how cc1 command line options are mapped into FP options. Canonicalize on storing FP options in LangOptions instead of redundantly in CodeGenOptions. Incorporate -ffast-math directly into the values of those LangOptions rather than considering it separately when building FPOptions. Build IR attributes from those options rather than a mix of sources. We should really simplify the driver/cc1 interaction here and have the driver pass down options that cc1 directly honors. That can happen in a follow-up, though. Patch by Michele Scandale! https://reviews.llvm.org/D80315	2020-06-01 22:00:30 -04:00
Joseph Huber	1a4fb2edcb	[OpenMP] Replace Clang's OpenMP RTL Definitions with OMPKinds.def Summary: This changes Clang's generation of OpenMP runtime functions to use the types and functions defined in OpenMPKinds and OpenMPConstants. New OpenMP runtime function information should now be added to OMPKinds.def. This patch also changed the definitions of __kmpc_push_num_teams and __kmpc_copyprivate to match those found in the runtime. Reviewers: jdoerfert Reviewed By: jdoerfert Subscribers: jfb, AndreyChurbanov, openmp-commits, fghanim, hiraditya, sstefan1, cfe-commits, llvm-commits Tags: #openmp, #clang, #llvm Differential Revision: https://reviews.llvm.org/D80222	2020-06-01 16:23:10 -04:00
Florian Hahn	8f3f88d2f5	[Matrix] Implement matrix index expressions ([][]). This patch implements matrix index expressions (matrix[RowIdx][ColumnIdx]). It does so by introducing a new MatrixSubscriptExpr(Base, RowIdx, ColumnIdx). MatrixSubscriptExprs are built in 2 steps in ActOnMatrixSubscriptExpr. First, if the base of a subscript is of matrix type, we create a incomplete MatrixSubscriptExpr(base, idx, nullptr). Second, if the base is an incomplete MatrixSubscriptExpr, we create a complete MatrixSubscriptExpr(base->getBase(), base->getRowIdx(), idx) Similar to vector elements, it is not possible to take the address of a MatrixSubscriptExpr. For CodeGen, a new MatrixElt type is added to LValue, which is very similar to VectorElt. The only difference is that we may need to cast the type of the base from an array to a vector type when accessing it. Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D76791	2020-06-01 20:08:49 +01:00
Christopher Tetreault	796898172c	[SVE] Eliminate calls to default-false VectorType::get() from Clang Reviewers: efriedma, david-arm, fpetrogalli, ddunbar, rjmccall Reviewed By: fpetrogalli, rjmccall Subscribers: tschuett, rkruppe, psnobl, dmgreen, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80323	2020-06-01 10:02:14 -07:00
Nick Desaulniers	ef1d4bec89	[Clang][CGM] style cleanups NFC Summary: Forked from: https://reviews.llvm.org/D80242 Use the getter for access to DebugInfo consistently. Use break in switch in CodeGenModule::EmitTopLevelDecl consistently. Reviewers: dblaikie Reviewed By: dblaikie Subscribers: cfe-commits, srhines Tags: #clang Differential Revision: https://reviews.llvm.org/D80840	2020-06-01 09:33:08 -07:00
Djordje Todorovic	40a3fcb05c	[DebugInfo][CallSites] Remove decl subprograms from 'retainedTypes:' After the D70350, the retainedTypes: isn't being used for the purpose of call site debug info for extern calls, so it is safe to delete it from IR representation. We are also adding a test to ensure the subprogram isn't stored within the retainedTypes: from corresponding DICompileUnit. Differential Revision: https://reviews.llvm.org/D80369	2020-06-01 09:10:05 +02:00
Florian Hahn	6f6e91d193	[Matrix] Implement + and - operators for MatrixType. This patch implements the + and - binary operators for values of MatrixType. It adds support for matrix +/- matrix, scalar +/- matrix and matrix +/- scalar. For the matrix, matrix case, the types must initially be structurally equivalent. For the scalar,matrix variants, the element type of the matrix must match the scalar type. Reviewers: rjmccall, anemet, Bigcheese, rsmith, martong Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D76793	2020-05-29 20:42:22 +01:00
Arthur Eubanks	1285e8bcac	Run Coverage pass before other *San passes under new pass manager, round 2 Summary: This was attempted once before in https://reviews.llvm.org/D79698, but was reverted due to the coverage pass running in the wrong part of the pipeline. This commit puts it in the same place as the other sanitizers. This changes PassBuilder.OptimizerLastEPCallbacks to work on a ModulePassManager instead of a FunctionPassManager. That is because SanitizerCoverage cannot (easily) be split into a module pass and a function pass like some of the other sanitizers since in its current implementation it conditionally inserts module constructors based on whether or not it successfully modified functions. This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass manager (last check-msan test). Currently sanitizers + LTO don't work together under the new pass manager, so I removed tests that checked that this combination works for sancov. Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80692	2020-05-28 17:04:47 -07:00
Arthur Eubanks	e3fb8446f2	Revert "Run Coverage pass before other *San passes under new pass manager, round 2" This reverts commit `922fa2fce3`.	2020-05-28 14:38:05 -07:00
Arthur Eubanks	922fa2fce3	Run Coverage pass before other *San passes under new pass manager, round 2 Summary: This was attempted once before in https://reviews.llvm.org/D79698, but was reverted due to the coverage pass running in the wrong part of the pipeline. This commit puts it in the same place as the other sanitizers. This changes PassBuilder.OptimizerLastEPCallbacks to work on a ModulePassManager instead of a FunctionPassManager. That is because SanitizerCoverage cannot (easily) be split into a module pass and a function pass like some of the other sanitizers since in its current implementation it conditionally inserts module constructors based on whether or not it successfully modified functions. This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass manager (last check-msan test). Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80692	2020-05-28 14:25:23 -07:00
Vitaly Buka	2f430f7a51	[StackSafety] Remove SetMetadata parameter	2020-05-28 13:32:57 -07:00
Sam McCall	d283fc4f9d	[DebugInfo] Use SplitTemplateClosers (foo<bar<baz> >) in DWARF too Summary: D76801 caused some regressions in debuginfo compatibility by changing how certain functions were named. For CodeView we try to mirror MSVC exactly: this was fixed in `a549c0d004` For DWARF the situation is murkier. Per David Blaikie: > In general DWARF doesn't specify this at all. > [...] > This isn't the only naming divergence between GCC and Clang Nevertheless, including the space seems to provide better compatibility with GCC and GDB. E.g. cpexprs.cc in the GDB testsuite requires this formatting. And there was no particular desire to change the printing of names in debug info in the first place (just in diagnostics and other more user-facing text). Fixes PR46052 Reviewers: dblaikie, labath Subscribers: aprantl, cfe-commits, dyung Tags: #clang Differential Revision: https://reviews.llvm.org/D80554	2020-05-28 12:30:38 +02:00
Alok Kumar Sharma	d20bf5a725	[DebugInfo] Upgrade DISubrange to support Fortran dynamic arrays This patch upgrades DISubrange to support Fortran requirements. Summary: Below are the updates/additions of fields. lowerBound - Now accepts signed integer or DIVariable or DIExpression, earlier it accepted only signed integer. upperBound - This field is now added and accepts signed integer or DIVariable or DIExpression. stride - This field is now added and accepts signed integer or DIVariable or DIExpression. This is required to describe bounds of arrays which are known at runtime. Testing: unit test cases added (hand-written) check clang check llvm check debug-info Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D80197	2020-05-28 13:46:41 +05:30
James Y Knight	aca3d067ef	Fix Darwin 'constinit thread_local' variables. Unlike other platforms using ItaniumCXXABI, Darwin does not allow the creation of a thread-wrapper function for a variable in the TU of users. Because of this, it can set the linkage of the thread-local symbol to internal, with the assumption that no TUs other than the one defining the variable will need it. However, constinit thread_local variables do not require the use of the thread-wrapper call, so users reference the variable directly. Thus, it must not be converted to internal, or users will get a link failure. This was a regression introduced by the optimization in `00223827a9`. Differential Revision: https://reviews.llvm.org/D80417	2020-05-27 11:59:30 -04:00
Alexey Bataev	a888fc6b34	[OPENMP50]Initial support for use_device_addr clause. Summary: Added parsing/sema analysis/serialization support for use_device_addr clauses. Reviewers: jdoerfert Subscribers: yaxunl, guansong, arphaman, sstefan1, llvm-commits, cfe-commits, caomhin Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80404	2020-05-27 11:35:31 -04:00
serge-sans-paille	de02a75e39	[PGO] Fix computation of function Hash And bump its version number accordingly. This is a patched recommit of `7c298c104b` Previous hash implementation was incorrectly passing an uint64_t, that got converted to an uint8_t, to finalize the hash computation. This led to different functions having the same hash if they only differ by the remaining statements, which is incorrect. Added a new test case that trivially tests that a small function change is reflected in the hash value. Note that because this patch fixes the hash computation, it invalidates all hashes computed before the patch is applied, which is why we bumped the version number. Update profile data hash entries due to hash function update, except for binary version, in which case we keep the buggy behavior for backward compatibility. Differential Revision: https://reviews.llvm.org/D79961	2020-05-27 09:15:21 +02:00
Eric Christopher	97a133f157	Temporarily Revert "[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts" as it's causing crashes on code generation and https://bugs.llvm.org/show_bug.cgi?id=46084 This reverts commit `98cad555e2`.	2020-05-26 18:51:00 -07:00
Adrian Prantl	b59b3640bc	Debug Info: Mark os_log helper functions as artificial The os_log helper functions are linkonce_odr and supposed to be uniqued across TUs, so attaching a DW_AT_decl_line to them is highly misleading. By setting the function decl to implicit, CGDebugInfo properly marks the functions as artificial and uses a default file / line 0 location for the function. rdar://problem/63450824 Differential Revision: https://reviews.llvm.org/D80463	2020-05-26 09:08:27 -07:00
Lucas Prates	98cad555e2	[Clang][AArch64] Capturing proper pointer alignment for Neon vld1 intrinsicts Summary: During CodeGen for AArch64 Neon intrinsics, Clang was incorrectly assuming all the pointers from which loads were being generated for vld1 intrinsics were aligned according to the intrinsics result type, causing alignment faults on the code generated by the backend. This patch updates vld1 intrinsics' CodeGen to properly capture the correct load alignment based on the type of the pointer provided as input for the intrinsic. Reviewers: t.p.northover, ostannard, pcc Reviewed By: ostannard Subscribers: kristof.beyls, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79721	2020-05-26 10:09:35 +01:00
Fangrui Song	9d55e4ee13	Make explicit -fno-semantic-interposition (in -fpic mode) infer dso_local -fno-semantic-interposition is currently the CC1 default. (The opposite disables some interprocedural optimizations.) However, it does not infer dso_local: on most targets accesses to ExternalLinkage functions/variables defined in the current module still need PLT/GOT. This patch makes explicit -fno-semantic-interposition infer dso_local, so that PLT/GOT can be eliminated if targets implement local aliases for AsmPrinter::getSymbolPreferLocal (currently only x86). Currently we check whether the module flag "SemanticInterposition" is 0. If yes, infer dso_local. In the future, we can infer dso_local unless "SemanticInterposition" is 1: frontends other than clang will also benefit from the optimization if they don't bother setting the flag. (There will be risks if they do want ELF interposition: they need to set "SemanticInterposition" to 1.)	2020-05-25 20:48:18 -07:00
Benjamin Kramer	2b8d6fa0ac	Revert "[PGO] Fix computation of function Hash" This reverts commit `7c298c104b`. Fails make check-clang. Failing Tests (8): Clang :: Profile/c-counter-overflows.c Clang :: Profile/c-general.c Clang :: Profile/c-unprofiled-blocks.c Clang :: Profile/cxx-rangefor.cpp Clang :: Profile/cxx-throws.cpp Clang :: Profile/misexpect-switch-default.c Clang :: Profile/misexpect-switch-nonconst.c Clang :: Profile/misexpect-switch.c	2020-05-25 20:14:28 +02:00
serge-sans-paille	7c298c104b	[PGO] Fix computation of function Hash Previous implementation was incorrectly passing an uint64_t, that got converted to an uint8_t, to finalize the hash computation. This led to different functions having the same hash if they only differ by the remaining statements, which is incorrect. Added a new test case that trivially tests that a small function change is reflected in the hash value. Note that because this patch fixes the hash computation, it invalidates all hashes computed before the patch is applied, which could be an issue for large build systems that pre-compute the profile data and let clients download it as part of the build process. Differential Revision: https://reviews.llvm.org/D79961	2020-05-25 17:17:29 +02:00
ISHIGURO, Hiroshi	ac2c5af67f	[OPENMP] Fix mixture of omp and clang pragmas Fixes PR45753 When a program that contains a loop to which both `omp parallel for` pragma and `clang loop` pragma are associated is compiled with the -fopenmp option, `clang loop` pragma did not take effect. The example below should not be vectorized by the `clang loop` pragma but it was actually vectorized. The cause is that `llvm.loop.vectorize.width` was not output to the IR when -fopenmp is specified. The fix attaches attributes if they exist for the loop. [example.c] ``` int a[100], b[100]; void foo() { #pragma omp parallel for #pragma clang loop vectorize(disable) for (int i = 0; i < 100; i++) a[i] += b[i] * i; } ``` [compile] ``` $ clang -O2 -fopenmp example.c -c -Rpass=vect example.c:3:11: remark: vectorized loop (vectorization width: 4, interleaved count: 2) [-Rpass=loop-vectorize] #pragma omp parallel for ^ ``` [IR with -fopenmp] ``` $ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fopenmp \| grep 'vectorize\.width' ``` [IR with -fno-openmp] ``` $ clang -O2 example.c -S -emit-llvm -mllvm -disable-llvm-optzns -o - -fno-openmp \| grep 'vectorize\.width' !7 = !{!"llvm.loop.vectorize.width", i32 1} ``` Differential Revision: https://reviews.llvm.org/D79921	2020-05-22 12:53:37 +09:00
Heejin Ahn	48acac3629	[WebAssembly] Warn on exception spec only when Wasm EH is used Summary: In D80061 we added warning for exception specifications with types (such as `throw(int)`), but it was enabled every time the target was wasm, which means it warned (and ignored) exception specifications even if wasm EH was not used. This fixes it and we only have the warning when we enable `-fwasm-exceptions`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80362	2020-05-21 17:08:35 -07:00
Zequan Wu	e36076ee3a	[clang] Add nomerge function attribute to clang Differential Revision: https://reviews.llvm.org/D79121	2020-05-21 17:07:39 -07:00
Zequan Wu	b0a0f01bc1	Revert "Add nomerge function attribute to clang" This reverts commit `307e853954`.	2020-05-21 16:13:18 -07:00
Zequan Wu	307e853954	Add nomerge function attribute to clang	2020-05-21 15:28:27 -07:00
Yaxun (Sam) Liu	361e4f14e3	Fix debug info for NoDebug attr NoDebug attr does not totally eliminate debug info about a function when inlining is enabled. This is inconsistent with when inlining is disabled. This patch fixes that. Differential Revision: https://reviews.llvm.org/D79967	2020-05-21 09:02:56 -04:00
Alexey Bataev	414afdf940	[OPENMP]Fix PR45911: Data sharing and lambda capture. Summary: No need to generate inlined OpenMP region for variables captured in lambdas or block decls, only for implicitly captured variables in the OpenMP region. Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D79966	2020-05-20 15:01:02 -04:00
Melanie Blower	827be690dc	[clang] FastMathFlags.allowContract should be initialized only from FPFeatures.allowFPContractAcrossStatement Summary: Fix bug introduced in D72841 adding support for pragma float_control Reviewers: rjmccall, Anastasia Differential Revision: https://reviews.llvm.org/D79903	2020-05-20 06:19:10 -07:00
Eli Friedman	62f3ef2b53	[CGCall] Annotate references with "align" attribute. If we're going to assume references are dereferenceable, we should also assume they're aligned: otherwise, we can't actually dereference them. See also D80072. Differential Revision: https://reviews.llvm.org/D80166	2020-05-19 20:21:30 -07:00
Erich Keane	74ef6a1147	Fix X86_64 complex-returns for regcall. D35259 introduced a case where complex types of non-long-double would result in FI.getReturnInfo() to not be initialized properly. This resulted in a crash under some very specific circumstances when dereferencing the LLVMContext. This patch makes sure that these types have the intended getReturnInfo initialization.	2020-05-19 13:21:15 -07:00
jasonliu	7f5d91d3ff	[clang][AIX] Implement ABIInfo and TargetCodeGenInfo for AIX Summary: Created AIXABIInfo and AIXTargetCodeGenInfo for AIX ABI. Reviewed By: Xiangling_L, ZarkoCA Differential Revision: https://reviews.llvm.org/D79035	2020-05-19 15:00:48 +00:00
Alexey Bataev	2e499eee58	[OPENMP50]Add initial support for 'affinity' clause. Summary: Added parsing/sema/serialization support for affinity clause in task directives. Reviewers: jdoerfert Subscribers: yaxunl, guansong, arphaman, llvm-commits, cfe-commits, caomhin Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D80148	2020-05-19 08:19:09 -04:00
Heejin Ahn	d94bacbcf8	[WebAssembly] Handle exception specifications Summary: Wasm currently does not fully handle exception specifications. Rather than crashing, - This treats `throw()` in the same way as `noexcept`. - This ignores and prints a warning for `throw(type, ..)`, for a temporary measure. This warning is controlled by `-Wwasm-exception-spec`, which is on by default. You can suppress the warning by using `-Wno-wasm-exception-spec`. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D80061	2020-05-19 01:16:09 -07:00
Martin Böhme	4c09289f63	[clang] Add an API to retrieve implicit constructor arguments. Summary: This is needed in Swift for C++ interop -- see here for the corresponding Swift change: https://github.com/apple/swift/pull/30630 As part of this change, I've had to make some changes to the interface of CGCXXABI to return the additional parameters separately rather than adding them directly to a `CallArgList`. Reviewers: rjmccall Reviewed By: rjmccall Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79942	2020-05-19 09:21:26 +02:00
Francesco Petrogalli	b593bfd4d8	[clang][SveEmitter] SVE builtins for `svusdot` and `svsudot` ACLE. Summary: Intrinsics, guarded by `__ARM_FEATURE_SVE_MATMUL_INT8`: * svusdot[_s32] * svusdot[_n_s32] * svusdot_lane[_s32] * svsudot[_s32] * svsudot[_n_s32] * svsudot_lane[_s32] Reviewers: sdesmalen, efriedma, david-arm, rengolin Subscribers: tschuett, kristof.beyls, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79877	2020-05-18 23:07:23 +00:00
Anastasia Stulova	a6a237f204	[OpenCL] Added addrspace_cast operator in C++ mode. This operator is intended for casting between pointers to objects in different address spaces and follows similar logic as const_cast in C++. Tags: #clang Differential Revision: https://reviews.llvm.org/D60193	2020-05-18 12:07:54 +01:00
Jim Lin	7ee479a760	[RISCV] Fix passing two floating-point values in complex separately by two GPRs on RV64 Summary: This patch fixes an error in counting the remaining FPRs. Complex floating-point values should be passed by two FPRs for the hard-float ABI. If no two FPRs are available, it should be passed via a 64-bit GPR (fp+fp). `ArgFPRsLeft` was only decreased by one when the type is complex floating-point, which caused the two floating-point values in the complex to be passed separately by two GPRs. Reviewers: asb, luismarques, lenary Reviewed By: asb Subscribers: rbar, johnrusso, simoncook, sabuasal, niosHD, kito-cheng, shiva0217, jrtc27, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, rkruppe, PkmX, jocewei, psnobl, benna, s.egerton, pzheng, sameer.abuasal, apazos, evandro, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79770	2020-05-18 13:13:22 +08:00
John McCall	32870a84d9	Expose IRGen API to add the default IR attributes to a function definition. I've also made a stab at imposing some more order on where and how we add attributes; this part should be NFC. I wasn't sure whether the CUDA use case for libdevice should propagate CPU/features attributes, so there's a bit of unnecessary duplication.	2020-05-16 14:44:54 -04:00
Heejin Ahn	945ad141ce	Revert "[WebAssembly] Handle exception specifications" This reverts commit `bca347508c`. This broke clang/test/Misc/warning-flags.c, because the newly added warning option in this commit didn't have a matching flag.	2020-05-15 21:33:44 -07:00
Heejin Ahn	bca347508c	[WebAssembly] Handle exception specifications Summary: Wasm currently does not fully handle exception specifications. Rather than crashing, this treats `throw()` in the same way as `noexcept`, and ignores and prints a warning for `throw(type, ..)`, for a temporary measure. Reviewers: dschuff Subscribers: sbc100, jgravelle-google, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79655	2020-05-15 21:03:38 -07:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
Nikita Popov	f89f7da999	[IR] Convert null-pointer-is-valid into an enum attribute The "null-pointer-is-valid" attribute needs to be checked by many pointer-related combines. To make the check more efficient, convert it from a string into an enum attribute. In the future, this attribute may be replaced with data layout properties. Differential Revision: https://reviews.llvm.org/D78862	2020-05-15 19:41:07 +02:00
Yonghong Song	072cde03aa	[Clang][BPF] implement __builtin_btf_type_id() builtin function Such a builtin function is mostly useful to preserve btf type id for non-global data. For example, extern void foo(..., void *data, int size); int test(...) { struct t { int a; int b; int c; } d; d.a = ...; d.b = ...; d.c = ...; foo(..., &d, sizeof(d)); } The function "foo" above only sees raw data and does not know the type of the data. In certain cases, e.g., logging, the additional type information will help pretty print. This patch implements a BPF specific builtin u32 btf_type_id = __builtin_btf_type_id(param, flag) which will return a btf type id for the "param". flag == 0 will indicate a BTF local relocation, which means btf type_id is only adjusted when bpf program BTF changes. flag == 1 will indicate a BTF remote relocation, which means btf type_id is adjusted against linux kernel or future other entities. Differential Revision: https://reviews.llvm.org/D74668	2020-05-15 09:44:54 -07:00
Leonard Chan	e9802aa422	Revert "Run Coverage pass before other *San passes under new pass manager" This reverts commit `7d5bb94d78`. Reverting since this leads to a linker error we're seeing on Fuchsia. The underlying issue seems to be that inlining is run after sanitizers and causes different comdat groups instrumented by Sancov to reference non-key symbols defined in other comdat groups. Will re-land this patch after a fix for that is landed.	2020-05-14 15:19:27 -07:00
Alexey Bataev	0363ae97ab	[OPENMP50]Codegen for uses_allocators clause. Summary: Predefined allocators should not be mapped at all (they are just enumeration constants). For user-defined allocators, only the traits need to be mapped, as firstprivates; the allocator itself is private. At the beginning of the target region the user-defined allocators must be created, and then destroyed at the end of the target region: ``` omp_allocator_handle_t my_allocator = __kmpc_init_allocator(<gtid>, /*default memhandle*/ 0, <number_of_traits>, &<traits>); ... call void @__kmpc_destroy_allocator(<gtid>, my_allocator); ``` Reviewers: jdoerfert, aaron.ballman Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D79257	2020-05-14 18:02:12 -04:00
Eli Friedman	428d0b6f77	Fix clang test failures from D77454	2020-05-14 14:10:51 -07:00
Adrian McCarthy	a549c0d004	Fix template class debug info for Visual Studio visualizers An earlier change eliminated spaces between the close brackets of nested template lists. Unfortunately that prevents the Windows debuggers from matching some types to their corresponding visualizers (e.g., std::map). This selects the SeparateTemplateClosers flag when generating CodeView. Note that we were already making formatting adjustments under similar circumstances for similar reasons. This wasn't caught by existing tests because they were using only -std=c++98. Differential Revision: https://reviews.llvm.org/D79274	2020-05-13 14:20:18 -07:00
Sylvain Audi	7a8edcb212	[Clang] Restore replace_path_prefix instead of startswith In D49466, sys::path::replace_path_prefix was used instead of startswith for -f[macro/debug/file]-prefix-map options. However those were reverted later (commit rG3bb24bf25767ef5bbcef958b484e7a06d8689204) due to broken Windows tests. This patch restores those replace_path_prefix calls. It also modifies the prefix matching to be case-insensitive under Windows. Differential Revision : https://reviews.llvm.org/D76869	2020-05-13 13:49:14 -04:00
Shengchen Kan	ad60ff70eb	[NFC] Code cleanup in TargetInfo.cpp Fix the signed/unsigned mismatch issue	2020-05-13 14:48:46 +08:00
Thomas Lively	3d49d1cfa7	[WebAssembly] Implement pseudo-min/max SIMD instructions Summary: As proposed in https://github.com/WebAssembly/simd/pull/122. Since these instructions are not yet merged to the SIMD spec proposal, this patch makes them entirely opt-in by surfacing them only through LLVM intrinsics and clang builtins. If these instructions are made official, these intrinsics and builtins should be replaced with simple instruction patterns. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D79742	2020-05-12 09:39:01 -07:00
Fangrui Song	b56b1e67e3	[gcov] Default coverage version to '408' and delete CC1 option -coverage-exit-block-before-body gcov 4.8 (r189778) moved the exit block from the last to the second. The .gcda format is compatible with 4.7 but decoding libgcov 4.7 produced .gcda with gcov [4.7,8) can mistake the exit block, emit bogus `%s:'%s' has arcs from exit block\n` warnings, and print wrong `" returned %s` for branch statistics (-b). * decoding libgcov 4.8 produced .gcda with gcov 4.7 has similar issues. Also, rename "return block" to "exit block" because the latter is the appropriate term.	2020-05-12 09:14:03 -07:00
Sander de Smalen	d6936be2ef	[SveEmitter] Add builtins for svdup and svindex Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79357	2020-05-12 11:02:32 +01:00
Arthur Eubanks	7d5bb94d78	Run Coverage pass before other *San passes under new pass manager Summary: This fixes compiler-rt/test/msan/coverage-levels.cpp under the new pass manager (final check-msan test!). Under the old pass manager, the coverage pass would run before the MSan pass. The opposite happened under the new pass manager. The MSan pass adds extra basic blocks, changing the number of coverage callbacks. Reviewers: vitalybuka, leonardchan Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79698	2020-05-11 12:59:09 -07:00
Florian Hahn	1065869195	[Matrix] Add matrix type to Clang. This patch adds a matrix type to Clang as described in the draft specification in clang/docs/MatrixSupport.rst. It introduces a new option -fenable-matrix, which can be used to enable the matrix support. The patch adds new MatrixType and DependentSizedMatrixType types along with the plumbing required. Loads of and stores to pointers to matrix values are lowered to memory operations on 1-D IR arrays. After loading, the loaded values are cast to a vector. This ensures matrix values use the alignment of the element type, instead of LLVM's large vector alignment. The operators and builtins described in the draft spec will be added in follow-up patches. Reviewers: martong, rsmith, Bigcheese, anemet, dexonsmith, rjmccall, aaron.ballman Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D72281	2020-05-11 18:55:45 +01:00
Thomas Lively	8e3e56f2a3	[WebAssembly] Add wasm-specific vector shuffle builtin and intrinsic Summary: Although using `__builtin_shufflevector` and the `shufflevector` instruction works fine, they are not opaque to the optimizer. As a result, DAGCombine can potentially reduce the number of shuffles and change the shuffle masks. This is unexpected behavior for users of the WebAssembly SIMD intrinsics who have crafted their shuffles to optimize the code generated by engines. This patch solves the problem by adding a new shuffle intrinsic that is opaque to the optimizers in line with the decision of the WebAssembly SIMD contributors at https://github.com/WebAssembly/simd/issues/196#issuecomment-622494748. In the future we may implement custom DAG combines to properly optimize shuffles and replace this solution. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D66983	2020-05-11 10:01:55 -07:00
Sander de Smalen	4cad97595f	[SveEmitter] Add builtins for svmovlb and svmovlt These builtins are expanded in CGBuiltin to use intrinsics for (signed/unsigned) shift left long top/bottom. Reviewers: efriedma, SjoerdMeijer Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79579	2020-05-11 09:41:58 +01:00
Fangrui Song	25544ce2df	[gcov] Default coverage version to '407' and delete CC1 option -coverage-cfg-checksum Defaulting to -Xclang -coverage-version='407' makes .gcno/.gcda compatible with gcov [4.7,8) In addition, delete clang::CodeGenOptionsBase::CoverageExtraChecksum and GCOVOptions::UseCfgChecksum. We can infer the information from the version. With this change, .gcda files produced by `clang --coverage a.o` linked executable can be read by gcov 4.7~7. We don't need other -Xclang -coverage* options. There may be a mismatching version warning, though. (Note, GCC r173147 "split checksum into cfg checksum and line checksum" made gcov 4.7 incompatible with previous versions.)	2020-05-10 16:14:07 -07:00
Fangrui Song	13a633b438	[gcov] Delete CC1 option -coverage-no-function-names-in-data rL144865 incorrectly wrote function names for GCOV_TAG_FUNCTION (this might be part of the reasons the header says "We emit files in a corrupt version of GCOV's "gcda" file format"). rL176173 and rL177475 realized the problem and introduced -coverage-no-function-names-in-data to work around the issue. (However, the description is wrong. libgcov never writes function names, even before GCC 4.2). In reality, the linker command line has to look like: clang --coverage -Xclang -coverage-version='407*' -Xclang -coverage-cfg-checksum -Xclang -coverage-no-function-names-in-data Failing to pass -coverage-no-function-names-in-data can make gcov 4.7~7 either produce wrong results (for one gcov-4.9 program, I see "No executable lines") or segfault (gcov-7). (gcov-8 uses an incompatible format.) This patch deletes -coverage-no-function-names-in-data and the related function names support from libclang_rt.profile	2020-05-10 12:37:44 -07:00
Jinsong Ji	a72b9dfd45	[sanitizer] Enable whitelist/blacklist in new PM https://reviews.llvm.org/D63616 added `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang. However, it was done only for the legacy pass manager. This patch enables it for the new pass manager as well. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D79653	2020-05-10 02:34:29 +00:00
Matt Arsenault	a881dc1103	Fix typo	2020-05-09 16:00:17 -04:00
Simon Pilgrim	0b9783350b	LTO.h - reduce includes to forward declarations. NFC. Add missing ToolOutputFile.h dependency to BackendUtil.cpp	2020-05-09 15:10:51 +01:00
Matt Arsenault	03cb328d6f	clang: Cleanup usage of CreateMemCpy It handles the pointee type casts in preparation for opaque pointers.	2020-05-08 20:57:56 -04:00
Sriraman Tallam	e8147ad822	Unique Names for Internal Linkage Symbols. This is a standalone patch and this would help Propeller do a better job of code layout as it can accurately attribute the profiles to the right internal linkage function. This also helps SampledFDO/AutoFDO correctly associate sampled profiles to the right internal function. Currently, if there is more than one internal symbol foo, their profiles are aggregated by SampledFDO. This patch adds a new clang option, -funique-internal-funcnames, to generate unique names for functions with internal linkage. This patch appends the md5 hash of the module name to the function symbol as a best effort to generate a unique name for symbols with internal linkage. Differential Revision: https://reviews.llvm.org/D73307	2020-05-07 18:18:37 -07:00
Arthur Eubanks	48451ee6a7	[MSan] Pass MSan command line options under new pass manager Summary: Properly forward TrackOrigins and Recover user options to the MSan pass under the new pass manager. This makes the number of check-msan failures when ENABLE_EXPERIMENTAL_NEW_PASS_MANAGER is TRUE go from 52 to 2. Based on https://reviews.llvm.org/D77249. Reviewers: nemanjai, vitalybuka, leonardchan Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79445	2020-05-07 08:21:35 -07:00
Alexey Bataev	8026394d3c	[OPENMP]Consider 'omp_null_allocator' as a predefined allocator. Summary: omp.h header file defines omp_null_allocator as a predefined allocator, need to consider it also as a predefined allocator. Reviewers: jdoerfert Subscribers: jholewinski, yaxunl, guansong, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D79186	2020-05-07 10:11:06 -04:00
Sander de Smalen	3cb8b4c193	[SveEmitter] Add builtins for SVE2 Polynomial arithmetic This patch adds builtins for: - sveorbt - sveortb - svpmul - svpmullb, svpmullb_pair - svpmullt, svpmullt_pair The svpmullb and svpmullt builtins are expressed using the svpmullb_pair and svpmullt_pair LLVM IR intrinsics, respectively. Reviewers: SjoerdMeijer, efriedma, rengolin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79480	2020-05-07 11:53:04 +01:00
Akira Hatanaka	dc4e25d4f2	[CodeGen][ObjC] Don't try to retain a __unsafe_unretained ARC pointer passed to __builtin_os_log_format to extend its lifetime to the end of its enclosing block Extend only lifetimes of pointers returned by function calls or message sends instead. In the long term, we should lifetime-extend pointers in more complex expressions and non-ARC objects (e.g., C++ temporaries) too. rdar://problem/61846261	2020-05-06 12:47:17 -07:00
Melanie Blower	c355bec749	Add support for #pragma clang fp reassociate(on\|off) Reviewers: rjmccall, erichkeane, sepavloff Differential Revision: https://reviews.llvm.org/D78827	2020-05-06 08:05:44 -07:00
Erich Keane	8a1c999c9b	Implement _ExtInt ABI for all ABIs in Clang, enable type for ABIs This is the result of an audit of all of the ABIs in clang to implement and enable the type for those targets. Additionally, this finds an issue with integer-promotion passing for a few platforms when using _ExtInt of < int, so this also corrects that, resulting in signext/zeroext being on params of those types on some platforms. Differential Revisions: https://reviews.llvm.org/D79118	2020-05-06 06:52:18 -07:00
Michael Liao	9142c0b46b	[clang][codegen] Hoist parameter attribute setting in function prolog. Summary: - If the coerced type is still a pointer, it should be set with proper parameter attributes, such as `noalias`, `nonnull`, and etc. Hoist that (pointer) parameter attribute setting so that the coerced pointer parameter could be marked properly. Depends on D79394 Reviewers: rjmccall, kerbowa, yaxunl Subscribers: jvesely, nhaehnle, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79395	2020-05-05 15:31:51 -04:00
Michael Liao	276c8dde0b	[clang][codegen] Refactor argument loading in function prolog. NFC. Summary: - Skip copying function arguments and unnecessary casting by using them directly. Reviewers: rjmccall, kerbowa, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D79394	2020-05-05 15:31:51 -04:00
Hans Wennborg	55b9b11fea	Don't assert about missing profile info in createProfileWeightsForLoop The compiler shouldn't crash if the profile info is slightly off. We hit this in Chromium. Differential revision: https://reviews.llvm.org/D79417	2020-05-05 18:59:00 +02:00
Francesco Petrogalli	4fa13a3dac	[clang][OpenMP] Fix getNDSWDS for aarch64. Summary: This change fixes an aarch64-specific bug in the generation of the NDS and WDS values used to compute the signature of the vector functions out of OpenMP directives like `declare simd`. When the directive is used in conjunction with the `linear` clause, the size of the pointee must be used instead of the size of the pointer to compute NDS and WDS. The code-fix is strictly related to the behavior for `linear`, but given that the only way we have to test the NDS and WDS values is to check the resulting `<vlen>` token in the mangled name of the vector function, the tests have been extended to cover all the possible values of WDS and NDS as defined in the ABI at https://github.com/ARM-software/abi-aa/tree/master/vfabia64. Reviewers: ABataev, jdoerfert, andwar Reviewed By: jdoerfert Subscribers: yaxunl, kristof.beyls, guansong, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78969	2020-05-05 16:27:20 +00:00
Sander de Smalen	5ba329059f	[SveEmitter] Add builtins for svreinterpret The reinterpret builtins are generated separately because they need the cross product of all types, 121 functions in total, which is inconvenient to specify in the arm_sve.td file. Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78756	2020-05-05 13:04:44 +01:00
Sander de Smalen	aed6bd6f42	Reland D78750: [SveEmitter] Add builtins for svdupq and svdupq_lane Edit: Changed a few CHECK lines into CHECK-DAG lines. This reverts commit `90f3f62cb0`.	2020-05-05 10:42:11 +01:00
Sander de Smalen	90f3f62cb0	Revert "[SveEmitter] Add builtins for svdupq and svdupq_lane" It seems this patch broke some buildbots, so reverting until I have had a chance to investigate. This reverts commit `6b90a6887d`.	2020-05-04 21:31:55 +01:00
Sander de Smalen	6b90a6887d	[SveEmitter] Add builtins for svdupq and svdupq_lane * svdupq builtins that duplicate scalars to every quadword of a vector are defined using builtins for svld1rq (load and replicate quadword). * svdupq builtins that duplicate boolean values to fill a predicate vector are defined using `svcmpne`. Reviewers: SjoerdMeijer, efriedma, ctetreau Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78750	2020-05-04 20:38:47 +01:00
Zarko	cb78376433	Test commit. Modified comment to add a period at the end.	2020-05-04 13:06:21 -04:00
Melanie Blower	f5360d4bb3	Reapply "Add support for #pragma float_control" with buildbot fixes Add support for #pragma float_control Reviewers: rjmccall, erichkeane, sepavloff Differential Revision: https://reviews.llvm.org/D72841 This reverts commit `fce82c0ed3`.	2020-05-04 05:51:25 -07:00
Thomas Lively	e0f52842c8	[WebAssembly] Renumber SIMD opcodes Summary: As described in https://github.com/WebAssembly/simd/pull/209. This is the final reorganization of the SIMD opcode space before standardization. It has been landed in concert with corresponding changes in other projects in the WebAssembly SIMD ecosystem. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79224	2020-05-01 17:20:49 -07:00
Francesco Petrogalli	7585ba208e	[clang][OpenMP] Fix mangling of linear parameters. Summary: The linear parameter token in the mangling function must be multiplied by the pointee size in bytes when the parameter is a pointer. Reviewers: ABataev, andwar, jdoerfert Subscribers: yaxunl, guansong, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78965	2020-05-01 21:19:00 +00:00
Melanie Blower	fce82c0ed3	Revert "Reapply "Add support for #pragma float_control" with improvements to" This reverts commit `69aacaf699`.	2020-05-01 10:31:09 -07:00
Melanie Blower	69aacaf699	Reapply "Add support for #pragma float_control" with improvements to test cases Add support for #pragma float_control Reviewers: rjmccall, erichkeane, sepavloff Differential Revision: https://reviews.llvm.org/D72841 This reverts commit `85dc033cac`, and makes corrections to the test cases that failed on buildbots.	2020-05-01 10:03:30 -07:00
Alexey Bataev	8c2f4e0e85	[OPENMP50]Codegen for reduction clauses with 'task' modifier. Summary: Added codegen for reduction clause with task modifier. ``` #pragma omp ... reduction(task, +: a) { #pragma omp ... in_reduction(+: a) } ``` is translated into something like this: ``` #pragma omp ... reduction(+:a) { struct red_input_t { void *reduce_shar; void *reduce_orig; size_t reduce_size; void *reduce_init; void *reduce_fini; void *reduce_comb; unsigned flags; } r_var; r_var.reduce_shar = &a; r_var.reduce_orig = &original a; r_var.reduce_size = sizeof(a); r_var.reduce_init = [](void *l,void*){return *(int*)l=0;}; r_var.reduce_fini = nullptr; r_var.reduce_comb = [](void *l,void* r){return *(int*)l += *(int*)r;}; void *tg = __kmpc_taskred_modifier_init(<loc_addr>,<gtid>, <flag - 0 for parallel, 1 for worksharing>, <1 - number of reduction elements>, &r_var); { #pragma omp ... in_reduction(+: a) firstprivate(tg) ... } __kmpc_task_reduction_modifier_fini(<loc_addr>,<gtid>, <flag - 0 for parallel, 1 for worksharing>); } ``` Reviewers: jdoerfert Subscribers: yaxunl, guansong, jfb, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D79034	2020-05-01 11:40:27 -04:00
Melanie Blower	85dc033cac	Revert "Add support for #pragma float_control" This reverts commit `4f1e9a17e9` due to a buildbot failure; sorry for the noise.	2020-05-01 06:36:58 -07:00
Melanie Blower	4f1e9a17e9	Add support for #pragma float_control Reviewers: rjmccall, erichkeane, sepavloff Differential Revision: https://reviews.llvm.org/D72841	2020-05-01 06:14:24 -07:00
Alexey Bataev	b5be1c5419	[OPENMP50]Basic support for uses_allocators clause. Summary: Added parsing/sema/serialization support for uses_allocators clause. Reviewers: jdoerfert Subscribers: yaxunl, guansong, arphaman, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D78577	2020-04-30 16:24:36 -04:00
Alexey Bataev	b737b814fe	[OPENMP]Allow cancellation constructs in target parallel regions. Summary: omp cancellation point parallel and omp cancel parallel directives are allowed in target parallel regions. Reviewers: jdoerfert Subscribers: yaxunl, guansong, caomhin, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78941	2020-04-30 15:10:52 -04:00
Aaron Smith	4eabd00612	[Windows SEH] Fix abnormal-exits in _try Summary: Per Windows SEH Spec, except _leave, all other early exits of a _try (goto/return/continue/break) are considered abnormal exits. In those cases, the first parameter passes to its _finally funclet should be TRUE to indicate an abnormal-termination. One way to implement abnormal exits in _try is to invoke Windows runtime _local_unwind() (MSVC approach) that will invoke _dtor funclet where abnormal-termination flag is always TRUE when calling _finally. Obviously this approach is less optimal and is complicated to implement in Clang. Clang today has a NormalCleanupDestSlot mechanism to dispatch multiple exits at the end of _try. Since _leave (or try-end fall-through) is always Indexed with 0 in that NormalCleanupDestSlot, this fix takes the advantage of that mechanism and just passes NormalCleanupDest ID as 1st Arg to _finally. Reviewers: rnk, eli.friedman, JosephTremoulet, asmith, efriedma Reviewed By: efriedma Subscribers: efriedma, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77936	2020-04-30 09:38:19 -07:00
jasonliu	e0c356582d	[NFC][clang] Replace raw new/delete with unique_ptr to store ABIInfo in TargetCodeGenInfo Use unique_ptr to manage the lifetime of ABIInfo member inside TargetCodeGenInfo. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D79033	2020-04-30 12:31:50 +00:00
Erich Keane	5a1d9c0f5a	Fix x86/x86_64 calling convention for _ExtInt After speaking with Craig Topper about some recent defects, he pointed out that _ExtInts should be passed indirectly if larger than the largest int register, and like ints when smaller than that. This patch implements that. Note that this changed the way vaargs worked quite a bit, but they still work. Differential Revision: https://reviews.llvm.org/D78785	2020-04-29 11:04:25 -07:00
Sander de Smalen	a4dac6d4e0	[SveEmitter] Add builtins for svmov_b and svnot_b. These are custom expanded in CGBuiltin: svmov_b_z(pg, op) <=> svand_b_z(pg, op, op) svnot_b_z(pg, op) <=> sveor_b_z(pg, op, pg) Reviewers: SjoerdMeijer, efriedma, ctetreau, rengolin Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D79039	2020-04-29 13:33:18 +01:00
Simon Pilgrim	db97a12454	Fix Wparentheses gcc warning. NFC. It should be either a float(32) or an int(32).	2020-04-29 12:21:05 +01:00
Sander de Smalen	42a56bf63f	[SveEmitter] Add builtins for gather prefetches Patch by Andrzej Warzynski Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78677	2020-04-29 11:52:49 +01:00
David Blaikie	9b77242c9a	CodeGenTypes::CGRecordLayouts: Use unique_ptr to simplify memory management	2020-04-28 22:31:16 -07:00
Christopher Tetreault	ef3678cfee	[SVE] Update EmitSVEPredicateCast to take a ScalableVectorType Summary: Removes usage of VectorType::getNumElements identified by test located at CodeGen/aarch64-sve-intrinsics/acle_sve_abs.c. Since the type is an SVE predicate vector, it makes sense to specialize the code for scalable vectors only. Reviewers: rengolin, efriedma Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, rkruppe, psnobl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78958	2020-04-28 11:22:20 -07:00
Momchil Velikov	102b4105e3	[CMSE] Clear padding bits of struct/unions/fp16 passed by value When passing a value of a struct/union type from secure to non-secure state (that is returning from a CMSE entry function or passing an argument to CMSE-non-secure call), there is a potential sensitive information leak via the padding bits in the structure. It is not possible in the general case to ensure those bits are cleared by using Standard C/C++. This patch makes the compiler emit code to clear such padding bits. Since type information is lost in LLVM IR, the code generation is done by Clang. For each interesting record type, we build a bitmask, in which all the bits, corresponding to user declared members, are set. Values of record types are returned by coercing them to an integer. After the coercion, the coerced value is masked (with bitwise AND) and then returned by the function. In a similar manner, values of record types are passed as arguments by coercing them to an array of integers, and the coerced values themselves are masked. For union types, we effectively clear only bits, which aren't part of any member, since we don't know which is the currently active one. The compiler will issue a warning, whenever a union is passed to non-secure state. Values of half-precision floating-point types are passed in the least significant bits of a 32-bit register (GPR or FPR) with the most significant bits unspecified. Since this is also a potential leak of sensitive information, this patch also clears those unspecified bits. Differential Revision: https://reviews.llvm.org/D76369	2020-04-28 17:05:58 +01:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, to remove use of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
Christopher Tetreault	da8918f27e	[SVE][NFC] Use ScalableVectorType in CGBuiltin Summary: * Upgrade some usages of VectorType to use ScalableVectorType Reviewers: efriedma, david-arm, fpetrogalli, kmclaughlin Reviewed By: efriedma Subscribers: tschuett, rkruppe, psnobl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D78842	2020-04-27 16:29:45 -07:00
Sander de Smalen	e4872d7f08	[SveEmitter] Add builtins for svlen The svlen builtins return the number of elements in a vector and are implemented using `llvm.vscale`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78755	2020-04-27 21:27:32 +01:00
Matt Arsenault	5c03beefa7	clang: Allow backend unsupported warnings Currently this asserts on anything other than errors. In one workaround scenario, AMDGPU emits DiagnosticInfoUnsupported as a warning for functions that can't be correctly codegened, but should never be executed.	2020-04-27 12:14:51 -04:00
Sander de Smalen	03f419f3eb	[SveEmitter] IsInsertOp1SVALL and builtins for svqdec[bhwd] and svqinc[bhwd] Some ACLE builtins leave out the argument to specify the predicate pattern, which is expected to be expanded to an SV_ALL pattern. This patch adds the flag IsInsertOp1SVALL to insert SV_ALL as the second operand. Reviewers: efriedma, SjoerdMeijer Reviewed By: SjoerdMeijer Tags: #clang Differential Revision: https://reviews.llvm.org/D78401	2020-04-27 11:45:10 +01:00
Saiyedul Islam	06bdffb2bb	[AMDGPU] Expose llvm fence instruction as clang intrinsic Expose llvm fence instruction as clang builtin for AMDGPU target __builtin_amdgcn_fence(unsigned int memoryOrdering, const char *syncScope) The first argument of this builtin is one of the memory-ordering specifiers __ATOMIC_ACQUIRE, __ATOMIC_RELEASE, __ATOMIC_ACQ_REL, or __ATOMIC_SEQ_CST following C++11 memory model semantics. This is mapped to corresponding LLVM atomic memory ordering for the fence instruction using LLVM atomic C ABI. The second argument is an AMDGPU-specific synchronization scope defined as string. Reviewed By: sameerds Differential Revision: https://reviews.llvm.org/D75917	2020-04-27 09:39:03 +05:30
Sander de Smalen	3817ca7dbf	[SveEmitter] Add IsAppendSVALL and builtins for svptrue and svcnt[bhwd] Some ACLE builtins leave out the argument to specify the predicate pattern, which is expected to be expanded to an SV_ALL pattern. This patch adds the flag IsAppendSVALL to append SV_ALL as the final operand. Reviewers: SjoerdMeijer, efriedma, rovka, rengolin Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D77597	2020-04-26 12:44:26 +01:00
Craig Topper	0ed5b0d517	[X86] Don't use types when getting the intrinsic declaration for x86_avx512_mask_vcvtph2ps_512. This intrinsic isn't overloaded so we shouldn't query with types. Doing so causes the backend to miss the intrinsic and not codegen it. This eventually leads to a linker error.	2020-04-24 11:01:22 -07:00
Luke Geeson	7da1905125	[AArch32] Armv8.6-a Matrix Mult Assembly + Intrinsics This patch upstreams support for the Armv8.6-a Matrix Multiplication Extension. A summary of the features can be found here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a This patch includes: - Assembly support for AArch32 - Intrinsics Support for AArch32 Neon Intrinsics for Matrix Multiplication Note: these extensions are optional in the 8.6a architecture and so have to be enabled by default No additional IR types or C Types are needed for this extension. This is part of a patch series, starting with BFloat16 support and the other components in the armv8.6a extension (in previous patches linked in phabricator) Based on work by: - Luke Geeson - Oliver Stannard - Luke Cheeseman Reviewers: t.p.northover, miyuki Reviewed By: miyuki Subscribers: miyuki, ostannard, kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77872	2020-04-24 15:54:06 +01:00
Luke Geeson	832cd74913	[AArch64] Armv8.6-a Matrix Mult Assembly + Intrinsics This patch upstreams support for the Armv8.6-a Matrix Multiplication Extension. A summary of the features can be found here: https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/arm-architecture-developments-armv8-6-a This patch includes: - Assembly support for AArch64 only (no SVE or Neon) - Intrinsics Support for AArch64 Armv8.6a Matrix Multiplication Instructions (No bfloat16 matrix multiplication) No IR types or C Types are needed for this extension. This is part of a patch series, starting with BFloat16 support and the other components in the armv8.6a extension (in previous patches linked in phabricator) Based on work by: - Luke Geeson - Oliver Stannard - Luke Cheeseman Reviewers: ostannard, t.p.northover, rengolin, kmclaughlin Reviewed By: kmclaughlin Subscribers: kmclaughlin, kristof.beyls, hiraditya, danielkiss, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77871	2020-04-24 15:54:06 +01:00
Alexey Bataev	e9bfa1dd38	[OPENMP]Use new interface for task reduction. Summary: Patch forces codegen to use the new runtime functions for task reductions where the issue with passing the address of the original variables to the UDR initializers is fixed. Also, this patch is required for upcoming support of task modifier inreduction clause. Reviewers: jdoerfert Subscribers: yaxunl, guansong, cfe-commits, caomhin Tags: #clang Differential Revision: https://reviews.llvm.org/D78733	2020-04-24 09:41:48 -04:00
Sander de Smalen	0ddb2034c1	[SveEmitter] Add builtins for compares and ReverseCompare flag. The IsReverseCompare flag tells CGBuiltin to swap the operands, so that LT/LE intrinsics can be expressed in terms of GE/GT intrinsics. This patch also adds builtins for the wide-variants of the compares. Reviewers: SjoerdMeijer, efriedma, ctetreau Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78747	2020-04-24 14:33:47 +01:00
Sander de Smalen	823e2a670a	[SveEmitter] Add builtins for contiguous prefetches This patch also adds the enum `sv_prfop` for the prefetch operation specifier and checks to ensure the passed enum values are valid. Reviewers: SjoerdMeijer, efriedma, ctetreau Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78674	2020-04-24 11:35:59 +01:00
serge-sans-paille	8f766e382b	Update compiler extension integration into the build system The approach here is to create a new (empty) component, `Extensions`, where all statically compiled extensions dynamically register their dependencies. That way we're more natively compatible with LLVMBuild and llvm-config. Fixes: https://bugs.llvm.org/show_bug.cgi?id=44870 Differential Revision: https://reviews.llvm.org/D78192	2020-04-24 09:40:14 +02:00
Puyan Lotfi	9721fbf85b	[NFC] Refactoring PropertyAttributeKind for ObjCPropertyDecl and ObjCDeclSpec. This is a code clean up of the PropertyAttributeKind and ObjCPropertyAttributeKind enums in ObjCPropertyDecl and ObjCDeclSpec that are exactly identical. This non-functional change consolidates these enums into one. The changes are to many files across clang (and comments in LLVM) so that everything refers to the new consolidated enum in DeclObjCCommon.h. 2nd Landing Attempt... Differential Revision: https://reviews.llvm.org/D77233	2020-04-23 17:21:25 -04:00
Sander de Smalen	7003a1da37	[SveEmitter] Use llvm.aarch64.sve.ld1/st1 for contiguous load/store builtins This patch changes the codegen of the builtins for contiguous loads/stores to map onto the SVE-specific IR intrinsics llvm.aarch64.sve.ld1/st1. Reviewers: SjoerdMeijer, efriedma, kmclaughlin, rengolin Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78673	2020-04-23 15:15:41 +01:00
Sander de Smalen	a5e0389b2a	[AArch64] Define ACLE FP conversion intrinsics with more specific predicate. This patch changes the FP conversion intrinsics to take a predicate that matches the number of lanes for the vector with the widest element type as opposed to using <vscale x 16 x i1>. For example: ```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f16(<vscale x 4 x float>, <vscale x 4 x i1>, <vscale x 8 x half>)``` now uses <vscale x 4 x i1> instead of <vscale x 16 x i1> And similar for: ```<vscale x 4 x float> @llvm.aarch64.sve.fcvt.f32f64(<vscale x 4 x float>, <vscale x 2 x i1>, <vscale x 2 x double>)``` where the predicate now matches the wider type, so <vscale x 2 x i1>. Reviewers: efriedma, SjoerdMeijer, paulwalker-arm, rengolin Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78402	2020-04-23 10:53:23 +01:00
Sander de Smalen	002164461b	[SveEmitter] Add builtins for FP conversions This adds the flag IsOverloadCvt which tells CGBuiltin to use the result type and the type of the last operand as the overloaded types for the LLVM IR intrinsic. This also adds the flag IsFPConvert, which is needed to avoid converting the predicate of the operation from svbool_t to a predicate with fewer lanes, as the LLVM IR intrinsics use <vscale x 16 x i1> as the predicate. Reviewers: SjoerdMeijer, efriedma Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78239	2020-04-23 10:49:06 +01:00
Puyan Lotfi	bbf386f02b	Revert "[NFC] Refactoring PropertyAttributeKind for ObjCPropertyDecl and ObjCDeclSpec." This reverts commit `2aa044ed08`. Reverting due to bot failure in lldb.	2020-04-23 00:05:08 -04:00
Puyan Lotfi	2aa044ed08	[NFC] Refactoring PropertyAttributeKind for ObjCPropertyDecl and ObjCDeclSpec. This is a code clean up of the PropertyAttributeKind and ObjCPropertyAttributeKind enums in ObjCPropertyDecl and ObjCDeclSpec that are exactly identical. This non-functional change consolidates these enums into one. The changes are to many files across clang (and comments in LLVM) so that everything refers to the new consolidated enum in DeclObjCCommon.h. Differential Revision: https://reviews.llvm.org/D77233	2020-04-22 23:27:06 -04:00
Sander de Smalen	2d1baf606a	[SveEmitter] Add builtins for svwhilerw/svwhilewr This also adds the IsOverloadWhileRW flag which tells CGBuiltin to use the result predicate type and the first pointer type as the overloaded types for the LLVM IR intrinsic. Reviewers: SjoerdMeijer, efriedma Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D78238	2020-04-22 21:49:18 +01:00
Sander de Smalen	1559485e60	[SveEmitter] Add builtins for svwhile This also adds the IsOverloadWhile flag which tells CGBuiltin to use both the default type (predicate) and the type of the second operand (scalar) as the overloaded types for the LLVM IR intrinsic. Reviewers: SjoerdMeijer, efriedma, rovka Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D77595	2020-04-22 21:47:47 +01:00
Sander de Smalen	662cbaf647	[SveEmitter] Add IsOverloadNone flag and builtins for svpfalse and svcnt[bhwd]_pat Add the IsOverloadNone flag to tell CGBuiltin that it does not have an overloaded type. This is used for e.g. svpfalse which does not take any arguments and always returns a svbool_t. This patch also adds builtins for svcntb_pat, svcnth_pat, svcntw_pat and svcntd_pat, as those don't require custom codegen. Reviewers: SjoerdMeijer, efriedma, rovka Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D77596	2020-04-22 16:42:08 +01:00
Sander de Smalen	41d52662d5	[SveEmitter] Add support for _n form builtins The ACLE has builtins that take a scalar value that is to be expanded into a vector by the operation. While the ISA may have an instruction that takes an immediate or a scalar to represent this, the LLVM IR intrinsic may not, so Clang will have to splat the scalar value. This patch also adds the _n forms for svabd, svadd, svdiv, svdivr, svmax, svmin, svmul, svmulh, svsub and svsubr. Reviewers: SjoerdMeijer, efriedma, rovka Reviewed By: SjoerdMeijer Tags: #clang Differential Revision: https://reviews.llvm.org/D77594	2020-04-22 14:23:54 +01:00
Andrzej Warzynski	72f565899d	[SveEmitter] Implement builtins for gathers/scatters This patch adds builtins for: * regular, first-faulting and non-temporal gather loads * regular and non-temporal scatter stores Differential Revision: https://reviews.llvm.org/D77735	2020-04-22 13:21:39 +01:00
Justin Hibbits	4ca2cad947	[PowerPC] Add clang -msvr4-struct-return for 32-bit ELF Summary: Change the default ABI to be compatible with GCC. For 32-bit ELF targets other than Linux, Clang now returns small structs in registers r3/r4. This affects FreeBSD, NetBSD, OpenBSD. There is no change for 32-bit Linux, where Clang continues to return all structs in memory. Add clang options -maix-struct-return (to return structs in memory) and -msvr4-struct-return (to return structs in registers) to be compatible with gcc. These options are only for PPC32; reject them on PPC64 and other targets. The options are like -fpcc-struct-return and -freg-struct-return for X86_32, and use similar code. To actually return a struct in registers, coerce it to an integer of the same size. LLVM may optimize the code to remove unnecessary accesses to memory, and will return i32 in r3 or i64 in r3:r4. Fixes PR#40736 Patch by George Koehler! Reviewed By: jhibbits, nemanjai Differential Revision: https://reviews.llvm.org/D73290	2020-04-21 20:17:25 -05:00
Aaron Ballman	6a30894391	C++2a -> C++20 in some identifiers; NFC.	2020-04-21 15:37:19 -04:00
Craig Topper	68b2e507e4	[Local] Update getOrEnforceKnownAlignment/getKnownAlignment to use Align/MaybeAlign. Differential Revision: https://reviews.llvm.org/D78443	2020-04-20 21:31:44 -07:00
Sander de Smalen	06c980df46	[SveEmitter] Implement zeroing of false lanes This implements zeroing of false lanes for binary operations, where instead of merging into the first operand vector (_m) a `select` is placed on the first input vector. This approach easily translates to the use of the `zeroing movprfx` instruction. This patch also adds builtins for svabd, svadd, svdiv, svdivr, svmax, svmin, svmul, svmulh, svsub and svsubr. Reviewers: SjoerdMeijer, efriedma, rovka Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D77593	2020-04-20 17:02:48 +01:00
Sander de Smalen	9986b3de26	[SveEmitter] Explicitly merge with zero/undef Builtins that have the merge type MergeAnyExp or MergeZeroExp merge into an 'undef' or 'zero' vector respectively, which enables the _x and _z behaviour for unary operations. This patch also adds builtins for svabs and svneg. Reviewers: SjoerdMeijer, efriedma, rovka Reviewed By: efriedma Tags: #clang Differential Revision: https://reviews.llvm.org/D77591	2020-04-20 16:26:20 +01:00
Sander de Smalen	515020c091	[SveEmitter] Add more immediate operand checks. This patch adds a number of intrinsics that take immediates with varying ranges based on the element size of one of the operands. svext: immediate ranging 0 to (2048/sizeinbits(elt) - 1) svasrd: immediate ranging 1..sizeinbits(elt) svqshlu: immediate ranging 1..sizeinbits(elt)/2 ftmad: immediate ranging 0..(sizeinbits(elt) - 1) Reviewers: efriedma, SjoerdMeijer, rovka, rengolin Reviewed By: SjoerdMeijer Tags: #clang Differential Revision: https://reviews.llvm.org/D76679	2020-04-20 14:41:58 +01:00
Christopher Tetreault	c858debebc	Remove asserting getters from base Type Summary: Remove asserting vector getters from Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: efriedma Subscribers: cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D77278	2020-04-17 14:03:31 -07:00
Erich Keane	5f0903e9be	Reland Implement _ExtInt as an extended int type specifier. I fixed the LLDB issue, so re-applying the patch. This reverts commit `a4b88c0449`.	2020-04-17 10:45:48 -07:00
Sterling Augustine	a4b88c0449	Revert "Implement _ExtInt as an extended int type specifier." This reverts commit `61ba1481e2`. I'm reverting this because it breaks the lldb build with incomplete switch coverage warnings. I would fix it forward, but am not familiar enough with lldb to determine the correct fix. lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:3958:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch] switch (qual_type->getTypeClass()) { ^ lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4633:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch] switch (qual_type->getTypeClass()) { ^ lldb/source/Plugins/TypeSystem/Clang/TypeSystemClang.cpp:4889:11: error: enumeration values 'DependentExtInt' and 'ExtInt' not handled in switch [-Werror,-Wswitch] switch (qual_type->getTypeClass()) {	2020-04-17 10:29:40 -07:00
Benjamin Kramer	b639091c02	Change users of CreateShuffleVector to pass the masks as int instead of Constants No functionality change intended.	2020-04-17 16:34:29 +02:00
Erich Keane	61ba1481e2	Implement _ExtInt as an extended int type specifier. Introduction/Motivation: LLVM-IR supports integers of non-power-of-2 bitwidth, in the iN syntax. Integers of non-power-of-two bitwidth aren't particularly interesting or useful on most hardware, so much so that no language in Clang has been motivated to expose them before. However, on FPGA hardware, using normal integer types where the full bitwidth isn't needed is extremely wasteful and has severe performance/space concerns. Because of this, Intel has introduced this functionality in the High Level Synthesis compiler[0] under the name "Arbitrary Precision Integer" (ap_int for short). This has been extremely useful and effective for our users, permitting them to optimize their storage and operation space on an architecture where both can be extremely expensive. We are proposing upstreaming a more palatable version of this to the community, in the form of this proposal and accompanying patch. We are proposing the syntax _ExtInt(N). We intend to propose this to the WG14 committee[1], and the underscore-capital seems like the active direction for a WG14 paper's acceptance. An alternative that Richard Smith suggested on the initial review was __int(N); however, we believe that is much less likely to be accepted by WG14. We considered _Int, however _Int is used as an identifier in libstdc++ and there is no good way to fall back to an identifier (since _Int(5) is indistinguishable from an unnamed initializer of a template type named _Int). [0]https://www.intel.com/content/www/us/en/software/programmable/quartus-prime/hls-compiler.html [1]http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2472.pdf Differential Revision: https://reviews.llvm.org/D73967	2020-04-17 07:10:57 -07:00
George Burgess IV	94908088a8	[CodeGen] fix inline builtin-related breakage from D78162 In cases where we have multiple decls of an inline builtin, we may need to go hunting for the one with a definition when setting function attributes. An additional test-case was provided on https://github.com/ClangBuiltLinux/linux/issues/979	2020-04-16 11:54:10 -07:00
Ehud Katz	03a9526fe5	[CGExprAgg] Fix infinite loop in `findPeephole` Simplify the function using IgnoreParenNoopCasts. Fix PR45476 Differential Revision: https://reviews.llvm.org/D78098	2020-04-16 13:26:23 +03:00
Benjamin Kramer	3ee1ec0b9d	LangOptions cannot depend on ASTContext, make it not use ASTContext directly Fixes a layering violation introduced in `2ba4e3a459`.	2020-04-16 11:46:35 +02:00
Ayke van Laethem	215dc2e203	[AVR] Use the correct address space for non-prototyped function calls Some function declarations, like this: void foo(); do not have a prototype. To declare a prototype you'd use: void foo(void); Clang internally bitcasts the variadic function declaration to a function pointer, but doesn't use the correct address space on AVR. This commit fixes that. This fix is necessary to let Clang compile compiler-rt for AVR. Differential Revision: https://reviews.llvm.org/D78125	2020-04-15 23:44:51 +02:00
Melanie Blower	2ba4e3a459	Move BinaryOperators.FPOptions to trailing storage Reviewers: rjmccall Differential Revision: https://reviews.llvm.org/D76384	2020-04-15 12:57:31 -07:00
Richard Smith	bab6df86ae	Rework how UuidAttr, CXXUuidofExpr, and GUID template arguments and constants are represented. Summary: Previously, we treated CXXUuidofExpr as quite a special case: it was the only kind of expression that could be a canonical template argument, it could be a constant lvalue base object, and so on. In addition, we represented the UUID value as a string, whose source form we did not preserve faithfully, and that we partially parsed in multiple different places. With this patch, we create an MSGuidDecl object to represent the implicit object of type 'struct _GUID' created by a UuidAttr. Each UuidAttr holds a pointer to its 'struct _GUID' and its original (as-written) UUID string. A non-value-dependent CXXUuidofExpr behaves like a DeclRefExpr denoting that MSGuidDecl object. We cache an APValue representation of the GUID on the MSGuidDecl and use it from constant evaluation where needed. This allows removing a lot of the special-case logic to handle these expressions. Unfortunately, many parts of Clang assume there are only a couple of interesting kinds of ValueDecl, so the total amount of special-case logic is not really reduced very much. This fixes a few bugs and issues: * PR38490: we now support reading from GUID objects returned from __uuidof during constant evaluation. * Our Itanium mangling for a non-instantiation-dependent template argument involving __uuidof no longer depends on which CXXUuidofExpr template argument we happened to see first. * We now predeclare ::_GUID, and permit use of __uuidof without any header inclusion, better matching MSVC's behavior. We do not predefine ::__s_GUID, though; that seems like a step too far. * Our IR representation for GUID constants now uses the correct IR type wherever possible. We will still fall back to using the {i32, i16, i16, [8 x i8]} layout if a definition of struct _GUID is not available. This is not ideal: in principle the two layouts could have different padding. 
Reviewers: rnk, jdoerfert Subscribers: arphaman, cfe-commits, aeubanks Tags: #clang Differential Revision: https://reviews.llvm.org/D78171	2020-04-15 12:20:42 -07:00
George Burgess IV	2dd17ff081	[CodeGen] only add nobuiltin to inline builtins if we'll emit them There are some inline builtin definitions that we can't emit (isTriviallyRecursive & callers go into why). Marking these nobuiltin is only useful if we actually emit the body, so don't mark these as such unless we _do_ plan on emitting that. This suboptimality was encountered in Linux (see some discussion on D71082, and https://github.com/ClangBuiltLinux/linux/issues/979). Differential Revision: https://reviews.llvm.org/D78162	2020-04-15 11:05:22 -07:00
Benjamin Kramer	316b49d373	Pass shufflevector indices as int instead of unsigned. No functionality change intended.	2020-04-15 15:52:49 +02:00
Benjamin Kramer	6f64daca8f	Upgrade calls to CreateShuffleVector to use the preferred form of passing an array of ints No functionality change intended.	2020-04-15 12:51:38 +02:00
George Burgess IV	91c8c74180	[CodeGen] clarify a comment; NFC Prompted by discussion on https://reviews.llvm.org/D78148.	2020-04-14 14:33:01 -07:00
Joerg Sonnenberger	9d2d6e71f0	Emit Objective-C constructors as writable They end up as .init_array sections and those need to be writable, otherwise bad merging will happen.	2020-04-14 22:32:34 +02:00
Christopher Tetreault	670f2f694b	[SVE] Remove calls to getBitWidth from clang Reviewers: efriedma Reviewed By: efriedma Subscribers: tschuett, rkruppe, psnobl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77903	2020-04-14 13:29:09 -07:00
Sander de Smalen	c8a5b30bac	[SveEmitter] Add range checks for immediates and predicate patterns. Summary: This patch adds a mechanism to easily add range checks for a builtin's immediate operands. This patch is tested with the qdech intrinsic, which takes both an enum for the predicate pattern, as well as an immediate for the multiplier. Reviewers: efriedma, SjoerdMeijer, rovka Reviewed By: efriedma, SjoerdMeijer Subscribers: mgorny, tschuett, mgrang, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D76678	2020-04-14 16:49:32 +01:00
Sander de Smalen	17a68c61a9	[SveEmitter] Implement builtins for contiguous loads/stores This adds builtins for all contiguous loads/stores, including non-temporal, first-faulting and non-faulting. Reviewers: efriedma, SjoerdMeijer Reviewed By: SjoerdMeijer Tags: #clang Differential Revision: https://reviews.llvm.org/D76238	2020-04-14 15:24:57 +01:00
Georgii Rymar	1647ff6e27	[ADT/STLExtras.h] - Add llvm::is_sorted wrapper and update callers. It can be used to avoid passing the begin and end of a range. This makes the code shorter and it is consistent with other wrappers we already have. Differential revision: https://reviews.llvm.org/D78016	2020-04-14 14:11:02 +03:00
Ayke van Laethem	cfc002714a	[AVR] Support aliases in non-zero address space This fixes code like the following on AVR: void foo(void) { } void bar(void) __attribute__((alias("foo"))); Code like this is present in compiler-rt, which I'm trying to build. Differential Revision: https://reviews.llvm.org/D76182	2020-04-14 00:42:19 +02:00
Christopher Tetreault	f22fbe3a15	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: sdesmalen, efriedma, krememek Reviewed By: sdesmalen, efriedma Subscribers: dexonsmith, Charusso, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77257	2020-04-13 13:01:40 -07:00
Mehdi Amini	ed03d9485e	Revert "[TLI] Per-function fveclib for math library used for vectorization" This reverts commit `60c642e74b`. This patch is making the TLI "closed" for a predefined set of VecLib while at the moment it is extensible for anyone to customize when using LLVM as a library. Reverting while we figure out a way to re-land it without losing the generality of the current API. Differential Revision: https://reviews.llvm.org/D77925	2020-04-11 01:05:01 +00:00
Matt Morehouse	bef187c750	Implement `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist` for clang Summary: This commit adds two command-line options to clang. These options let the user decide which functions will receive SanitizerCoverage instrumentation. This is most useful in the libFuzzer use case, where it enables targeted coverage-guided fuzzing. Patch by Yannis Juglaret of DGA-MI, Rennes, France libFuzzer tests its target against an evolving corpus, and relies on SanitizerCoverage instrumentation to collect the code coverage information that drives corpus evolution. Currently, libFuzzer collects such information for all functions of the target under test, and adds to the corpus every mutated sample that finds a new code coverage path in any function of the target. We propose instead to let the user specify which functions' code coverage information is relevant for building the upcoming fuzzing campaign's corpus. To this end, we add two new command line options for clang, enabling targeted coverage-guided fuzzing with libFuzzer. We see targeted coverage guided fuzzing as a simple way to leverage libFuzzer for big targets with thousands of functions or multiple dependencies. We publish this patch as work from DGA-MI of Rennes, France, with proper authorization from the hierarchy. Targeted coverage-guided fuzzing can accelerate bug finding for two reasons. First, the compiler will avoid costly instrumentation for non-relevant functions, accelerating fuzzer execution for each call to any of these functions. Second, the built fuzzer will produce and use a more accurate corpus, because it will not keep the samples that find new coverage paths in non-relevant functions. The two new command line options are `-fsanitize-coverage-whitelist` and `-fsanitize-coverage-blacklist`. They accept files in the same format as the existing `-fsanitize-blacklist` option <https://clang.llvm.org/docs/SanitizerSpecialCaseList.html#format>. 
The new options influence SanitizerCoverage so that it will only instrument a subset of the functions in the target. We explain these options in detail in `clang/docs/SanitizerCoverage.rst`. Consider now the woff2 fuzzing example from the libFuzzer tutorial <https://github.com/google/fuzzer-test-suite/blob/master/tutorial/libFuzzerTutorial.md>. We are aware that we cannot conclude much from this example because mutating compressed data is generally a bad idea, but let us use it anyway as an illustration for its simplicity. Let us use an empty blacklist together with one of the three following whitelists: ``` # (a) src:* fun:* # (b) src:SRC/* fun:* # (c) src:SRC/src/woff2_dec.cc fun:* ``` Running the built fuzzers shows how many instrumentation points the compiler adds: the fuzzer will output //XXX PCs//. Whitelist (a) is the instrument-everything whitelist; it produces 11912 instrumentation points. Whitelist (b) focuses coverage to instrument woff2 source code only, ignoring the dependency code for brotli (de)compression; it produces 3984 instrumentation points. Whitelist (c) focuses coverage to only instrument functions in the main file that deals with WOFF2 to TTF conversion, resulting in 1056 instrumentation points. For experimentation purposes, we ran each fuzzer approximately 100 times, single process, with the initial corpus provided in the tutorial. We let the fuzzer run until it either found the heap buffer overflow or went out of memory. On this simple example, whitelists (b) and (c) found the heap buffer overflow more reliably and 5x faster than whitelist (a). The average execution times when finding the heap buffer overflow were as follows: (a) 904 s, (b) 156 s, and (c) 176 s. We explain these results by the fact that WOFF2 to TTF conversion calls the brotli decompression algorithm's functions, which are mostly irrelevant for finding bugs in WOFF2 font reconstruction but nevertheless instrumented and used by whitelist (a) to guide fuzzing. 
This results in longer execution time for these functions and a partially irrelevant corpus. Contrary to whitelist (a), whitelists (b) and (c) will execute brotli-related functions without instrumentation overhead, and ignore new code paths found in them. This results in faster bug finding for WOFF2 font reconstruction. The results for whitelist (b) are similar to the ones for whitelist (c). Indeed, WOFF2 to TTF conversion calls functions that are mostly located in SRC/src/woff2_dec.cc. The 2892 extra instrumentation points allowed by whitelist (b) do not tamper with bug finding, even though they are mostly irrelevant, simply because most of these functions do not get called. We get a slightly faster average time for bug finding with whitelist (b), which might indicate that some of the extra instrumentation points are actually relevant, or might just be random noise. Reviewers: kcc, morehouse, vitalybuka Reviewed By: morehouse, vitalybuka Subscribers: pratyai, vitalybuka, eternalsakura, xwlin222, dende, srhines, kubamracek, #sanitizers, lebedev.ri, hiraditya, cfe-commits, llvm-commits Tags: #clang, #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D63616	2020-04-10 10:44:03 -07:00
Kevin P. Neal	7f38812d5b	[FPEnv][AArch64] Platform-specific builtin constrained FP enablement When constrained floating point is enabled the AArch64-specific builtins don't use constrained intrinsics in some cases. Fix that. Neon is part of this patch, so ARM is affected as well. Differential Revision: https://reviews.llvm.org/D77074	2020-04-10 13:02:00 -04:00
Wenlei He	60c642e74b	[TLI] Per-function fveclib for math library used for vectorization Summary: Encode `-fveclib` setting as per-function attribute so it can be threaded through to LTO backends. Accordingly per-function TLI now reads the attributes and selects the available vector function list based on that. Now we also populate function list for all supported vector libraries for the shared per-module `TargetLibraryInfoImpl`, so each function can select its available vector list independently but without duplicating the vector function lists. Inlining between incompatible veclib attributes is also prohibited now. Subscribers: hiraditya, dexonsmith, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77632	2020-04-09 18:26:38 -07:00
Pratyai Mazumder	ced398fdc8	[SanitizerCoverage] Add -fsanitize-coverage=inline-bool-flag Reviewers: kcc, vitalybuka Reviewed By: vitalybuka Subscribers: cfe-commits, llvm-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77637	2020-04-09 02:40:55 -07:00
Serge Pavlov	c7ff5b38f2	[FPEnv] Use single enum to represent rounding mode Currently the compiler defines 5 sets of constants to represent rounding mode. These are: 1. `llvm::APFloatBase::roundingMode`. It specifies all 5 rounding modes defined by IEEE-754 and is used in `APFloat` implementation. 2. `clang::LangOptions::FPRoundingModeKind`. It specifies 4 of 5 IEEE-754 rounding modes and a special value for dynamic rounding mode. It is used in clang frontend. 3. `llvm::fp::RoundingMode`. Defines the same values as `clang::LangOptions::FPRoundingModeKind` but in different order. It is used to specify rounding mode in IR and functions that operate on IR. 4. Rounding mode representation used by `FLT_ROUNDS` (C11, 5.2.4.2.2p7). Besides constants for rounding mode it also uses a special value to indicate error. It is convenient to use in intrinsic functions, as it is a platform-independent representation of the rounding mode. In this role it is used in some pending patches. 5. Values like `FE_DOWNWARD` and others, which specify rounding mode in library calls `fesetround` and `fegetround`. Often they represent bits of some control register, so they are target-dependent. The same names (not values) and a special name `FE_DYNAMIC` are used in `#pragma STDC FENV_ROUND`. The first 4 sets of constants are target independent and could have the same numerical representation. It would simplify conversion between the representations. Also, `clang::LangOptions::FPRoundingModeKind` and `llvm::fp::RoundingMode` currently do not contain the value for IEEE-754 rounding direction `roundTiesToAway`, although it is supported natively on some targets. This change defines all the rounding mode types via one `llvm::RoundingMode`, which also contains rounding mode for IEEE rounding direction `roundTiesToAway`. Differential Revision: https://reviews.llvm.org/D77379	2020-04-09 13:26:47 +07:00
Erich Keane	30588a7395	Make target features check work with ctor and dtor- The problem was reported in PR45468, applying target features to an always_inline constructor/destructor runs afoul of GlobalDecl construction assert when checking for target-feature compatibility. The core problem is fixed by using the version of the check that takes a FunctionDecl rather than the GlobalDecl. However, while writing the test, I discovered that source locations weren't properly set for this check on ctors/dtors. This patch also fixes constructors and CALLED destructors. Unfortunately, it doesn't seem too possible to get a meaningful source location for a 'cleanup' destructor, so those are still 'frontend' level errors unfortunately. A fixme was added to the test to cover that situation.	2020-04-08 13:19:55 -07:00
Raul Tambre	878d96011a	[clang][CodeGen] Handle throw expression in conditional operator constant folding Summary: We're smart and do constant folding when emitting conditional operators. Thus we emit the live value as a lvalue. This doesn't work if the live value is a throw expression. Handle this by emitting the throw and returning the dead value as the lvalue. Fixes PR28184. Reviewers: rsmith Reviewed By: rsmith Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77502	2020-04-08 12:32:21 -07:00
Artem Belevich	a9627b7ea7	[CUDA] Add partial support for recent CUDA versions. Generate PTX using newer versions of PTX and allow using sm_80 with CUDA-11. None of the new features of CUDA-10.2+ have been implemented yet, so using these versions will still produce a warning. Differential Revision: https://reviews.llvm.org/D77670	2020-04-08 11:19:44 -07:00
Bevin Hansson	313461f6d8	[CodeGen] Emit IR for compound assignment with fixed-point operands. Reviewers: rjmccall, leonardchan Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73184	2020-04-08 14:33:04 +02:00
Bevin Hansson	39baaabf6d	[CodeGen] Emit IR for fixed-point unary operators. Reviewers: rjmccall, leonardchan Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73183	2020-04-08 14:33:04 +02:00
Bevin Hansson	0b9922e67a	[CodeGen] Emit IR for fixed-point multiplication and division. Reviewers: rjmccall, leonardchan Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D73182	2020-04-08 14:33:04 +02:00
Alexey Bataev	be99c61588	[OPENMP50]Codegen for iterator construct. Implemented codegen for the iterator expression in the depend clauses. Iterator construct is emitted the following way: iterator(cnt1, cnt2, ...), in : <dep> <TotalNumDeps> = <cnt1_size> * <cnt2_size> * ...; kmp_depend_t deps[<TotalNumDeps>]; deps_counter = 0; for (cnt1) { for (cnt2) { ... deps[deps_counter].base_addr = &<dep>; deps[deps_counter].size = sizeof(<dep>); deps[deps_counter].flags = in; deps_counter += 1; ... } } For depobj construct the codegen is very similar, but the memory is allocated dynamically and added extra first item reserved for internal use.	2020-04-07 15:26:00 -04:00
Amy Huang	bcf66084ed	[DebugInfo] Fix for adding "returns cxx udt" option to functions in CodeView. Summary: This change adds DIFlagNonTrivial to forward declarations of DICompositeType. It adds the flag to nontrivial types and types with unknown triviality. It fixes adding the "CxxReturnUdt" flag to functions inconsistently, since it is added based on whether the return type is marked NonTrivial, and that changes if the return type was a forward declaration. continues the discussion at https://reviews.llvm.org/D75215 Bug: https://bugs.llvm.org/show_bug.cgi?id=44785 Reviewers: rnk, dblaikie, aprantl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77436	2020-04-07 09:10:27 -07:00
Michael Liao	c97be2c377	[hip] Remove `hip_pinned_shadow`. Summary: - Use `device_builtin_surface` and `device_builtin_texture` for surface/texture reference support. So far, both the host and device use the same reference type, which could be revised later when interface/implementation is stabilized. Reviewers: yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77583	2020-04-07 09:51:49 -04:00
Florian Hahn	338be9c595	[Clang] Add llvm.loop.unroll.disable to loops with -fno-unroll-loops. Currently Clang does not respect -fno-unroll-loops during LTO. During D76916 it was suggested to respect -fno-unroll-loops on a TU basis. This patch uses the existing llvm.loop.unroll.disable metadata to disable loop unrolling explicitly for each loop in the TU if unrolling is disabled. This should ensure that loops from TUs compiled with -fno-unroll-loops are skipped by the unroller during LTO. This also means that if a loop from a TU with -fno-unroll-loops gets inlined into a TU without this option, the loop won't be unrolled. Due to the fact that some transforms might drop loop metadata, there potentially are cases in which we still unroll loops from TUs with -fno-unroll-loops. I think we should fix those issues rather than introducing a function attribute to disable loop unrolling during LTO. Improving the metadata handling will benefit other use cases, like various loop pragmas, too. And it is an improvement to clang completely ignoring -fno-unroll-loops during LTO. If that direction looks good, we can use a similar approach to also respect -fno-vectorize during LTO, at least for LoopVectorize. In the future, this might also allow us to remove the UnrollLoops option LLVM's PassManagerBuilder. Reviewers: Meinersbur, hfinkel, dexonsmith, tejohnson Reviewed By: Meinersbur, tejohnson Differential Revision: https://reviews.llvm.org/D77058	2020-04-07 14:01:55 +01:00
Eli Friedman	68b03aee1a	Remove SequentialType from the type hierarchy. Now that we have scalable vectors, there's a distinction that isn't getting captured in the original SequentialType: some vectors don't have a known element count, so counting the number of elements doesn't make sense. In some cases, there's a better way to express the commonality using other methods. If we're dealing with GEPs, there's GEP methods; if we're dealing with a ConstantDataSequential, we can query its element type directly. In the relatively few remaining cases, I just decided to write out the type checks. We're talking about relatively few places, and I think the abstraction doesn't really carry its weight. (See thread "[RFC] Refactor class hierarchy of VectorType in the IR" on llvmdev.) Differential Revision: https://reviews.llvm.org/D75661	2020-04-06 17:03:49 -07:00
Erik Pilkington	d33c7de8e1	[CodeGenObjC] Fix a crash when attempting to copy a zero-sized bit-field in a non-trivial C struct Zero sized bit-fields aren't included in the CGRecordLayout, so we shouldn't be calling EmitLValueForField for them. rdar://60695105 Differential revision: https://reviews.llvm.org/D76782	2020-04-06 16:04:13 -04:00
Reid Kleckner	b36c19bc4f	[AST] Remove DeclCXX.h dep on ASTContext.h Saves only 36 includes of ASTContext.h and related headers. There are two deps on ASTContext.h: - C++ method overrides iterator types (TinyPtrVector) - getting LangOptions For #1, duplicate the iterator type, which is TinyPtrVector<>::const_iterator. For #2, add an out-of-line accessor to get the language options. Getting the ASTContext from a Decl is already an out of line method that loops over the parent DeclContexts, so if it is ever performance critical, the proper fix is to pass the context (or LangOpts) into the predicate in question. Other changes are just header fixups.	2020-04-06 10:09:01 -07:00
Amy Huang	11a04a64aa	[DebugInfo] Change to constructor homing debug info mode: skip literal types Summary: In constructor type homing mode sometimes complete debug info for constexpr types was missing, because there was not a constructor emitted. This change makes constructor type homing ignore constexpr types. Reviewers: rnk, dblaikie Subscribers: aprantl, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77432	2020-04-06 09:52:53 -07:00
Alexey Bataev	1c92448656	[OPENMP]Fix PR45439: `omp for collapse(2) ordered(2)` generates invalid IR. Fixed a crash because of the not quite correct casting of the value of iterations.	2020-04-06 12:07:43 -04:00
Johannes Doerfert	419a559c5a	[OpenMP][NFCI] Move OpenMP clause information to `lib/Frontend/OpenMP` This is a cleanup and normalization patch that also enables reuse with Flang later on. A follow up will clean up and move the directive -> clauses mapping. Reviewed By: fghanim Differential Revision: https://reviews.llvm.org/D77112	2020-04-05 22:30:29 -05:00
David Blaikie	e9644e6f4f	DebugInfo: Fix default template parameter computation for dependent non-type template parameters This addresses the immediate bug, though in theory we could still produce a default parameter for the DWARF in this test case - but other cases will be definitely unachievable (you could have a default parameter that cannot be evaluated - so long as the user overrode it with another value rather than relying on that default)	2020-04-05 16:31:30 -07:00
Eli Friedman	83fa811e5b	[clang][opaque pointers] Fix up a bunch of "getType()->getElementType()" In contexts where we know an LLVM type is a pointer, there's generally some simpler way to get the pointee type.	2020-04-03 18:00:33 -07:00
Eli Friedman	b11decc221	[clang codegen][opaque pointers] Remove use of deprecated constructor (See also D76269.)	2020-04-03 18:00:33 -07:00
Kevin P. Neal	9f1c35d8b1	Revert "[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins" The new test case causes bot failures. This reverts commit `ba87430cad`.	2020-04-03 15:47:19 -04:00
Andrew Wock	ba87430cad	[PowerPC] Replace subtract-from-zero float in version with fneg in PowerPC special fma compiler builtins This patch adds a test for the PowerPC fma compiler builtins, some variations of which negate inputs and outputs. The code to generate IR for these builtins was untested before this patch. Originally, the code used the outdated method of subtracting floating point values from -0.0 as floating point negation. This patch remedies that. Patch by: Drew Wock <drew.wock@sas.com> Differential Revision: https://reviews.llvm.org/D76949	2020-04-03 14:59:33 -04:00
Michael Liao	b952d799ca	[cuda][hip] Fix `RegisterVar` function prototype. Summary: - `RegisterVar` has `void` return type and `size_t` in its variable size parameter in HIP or CUDA 9.0+. Reviewers: tra, yaxunl Subscribers: cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D77398	2020-04-03 12:57:09 -04:00


14730 Commits