llvm-project

Commit Graph

Author	SHA1	Message	Date
Nikita Popov	5071360eb1	[OpaquePtr] Remove uses of CGF.Builder.CreateConstInBoundsGEP1_64() without type Remove uses of to-be-deprecated API.	2021-07-17 17:07:46 +02:00
Nikita Popov	357756ecf6	[OpaquePtr] Remove uses of CreateConstGEP1_64() without element type Remove uses of to-be-deprecated API.	2021-07-17 16:43:20 +02:00
Nikita Popov	4737eebc0d	[OpaquePtr] Remove uses of CreateConstInBoundsGEP2_64() without type Remove uses of to-be-deprecated API.	2021-07-17 16:42:10 +02:00
Giorgis Georgakoudis	e9c7291cb2	[OpenMP] Codegen aggregate for outlined function captures Parallel regions are outlined as functions with capture variables explicitly generated as distinct parameters in the function's argument list. That complicates the fork_call interface in the OpenMP runtime: (1) the fork_call is variadic since there is a variable number of arguments to forward to the outlined function, (2) wrapping/unwrapping arguments happens in the OpenMP runtime, which is sub-optimal, has been a source of ABI bugs, and has a hardcoded limit (16) in the number of arguments, (3) forwarded arguments must cast to pointer types, which complicates debugging. This patch avoids those issues by aggregating captured arguments in a struct to pass to the fork_call. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102107	2021-07-16 23:27:44 -07:00
Nemanja Ivanovic	35a18a981f	[PowerPC] Implement intrinsics for mtfsf[i] This provides intrinsics for emitting instructions that set the FPSCR (`mtfsf/mtfsfi`). The patch also conservatively marks the rounding mode as an implicit def for both since they both may set the rounding mode depending on the operands. Reviewed By: #powerpc, qiucf Differential Revision: https://reviews.llvm.org/D105957	2021-07-16 16:26:11 -05:00
Victor Huang	4eb107ccba	[PowerPC] Add PowerPC population count, reversed load and store related builtins and instrinsics for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and instrisics for population count, reversed load and store related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D106021	2021-07-15 17:23:56 -05:00
Artem Belevich	d774b4aa5e	[NVPTX, CUDA] Add .and.popc variant of the b1 MMA instruction. That should allow clang to compile mma.h from CUDA-11.3. Differential Revision: https://reviews.llvm.org/D105384	2021-07-15 12:02:09 -07:00
Quinn Pham	de3956605a	[PowerPC] Fix popcntb XL Compat Builtin for 32bit This patch implements the `__popcntb` XL compatibility builtin for 32bit in the frontend and backend. This patch also updates tests for `__popcntb` and other XL Compat sync related builtins. Reviewed By: #powerpc, nemanjai, amyk Differential Revision: https://reviews.llvm.org/D105360	2021-07-15 13:19:47 -05:00
Victor Huang	d40e8091bd	[PowerPC] Add PowerPC rotate related builtins and emit target independent code for XL compatibility This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds the builtins and emit target independent code for rotate related operations. Reviewed By: nemanjai, #powerpc Differential revision: https://reviews.llvm.org/D104744	2021-07-15 10:23:54 -05:00
Chuanqi Xu	8a1727ba51	[Coroutines] Run coroutine passes by default This patch make coroutine passes run by default in LLVM pipeline. Now the clang and opt could handle IR inputs containing coroutine intrinsics without special options. It should be fine. On the one hand, the coroutine passes seems to be stable since there are already many projects using coroutine feature. On the other hand, the coroutine passes should do nothing for IR who doesn't contain coroutine intrinsic. Test Plan: check-llvm Reviewed by: lxfind, aeubanks Differential Revision: https://reviews.llvm.org/D105877	2021-07-15 14:33:40 +08:00
Thomas Lively	4a4229f70f	[WebAssembly] Codegen for v128.storeX_lane instructions Replace the experimental clang builtins and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50435. Differential Revision: https://reviews.llvm.org/D106019	2021-07-14 16:15:25 -07:00
Thomas Lively	970e090010	[WebAssembly] Codegen for v128.loadX_lane instructions Replace the experimental clang builtin and LLVM intrinsics for these instructions with normal codegen patterns. Resolves PR50433. Differential Revision: https://reviews.llvm.org/D105950	2021-07-14 11:31:53 -07:00
Albion Fung	f1aca5ac96	[PowerPC] Fix L[D\|W]ARX Implementation LDARX and LWARX sometimes gets optimized out by the compiler when it is critical to the correctness of the code. This inline asm generation ensures that it preserved. Differential Revision: https://reviews.llvm.org/D105754	2021-07-13 11:02:07 -05:00
Dave MacLachlan	45ffe6341d	[clang/objc] Optimize getters for non-atomic, copied properties Properties that were declared `@property(copy, nonatomic) id foo` make an unnecessary call to objc_get_property(). This call can be replaced with a direct access to the backing variable identical to how a `@property(nonatomic) id foo` would do it. This reduces codegen by 4 bytes (x86_64/arm64) and removes a cross linkage unit function call per property declared as copy/nonatomic. Differential Revision: https://reviews.llvm.org/D105311	2021-07-13 09:22:13 -04:00
Thomas Lively	cbabfc63b1	[WebAssembly] Custom combines for f32x4.demote_zero_f64x2 Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern. Differential Revision: https://reviews.llvm.org/D105755	2021-07-12 10:32:18 -07:00
Johannes Doerfert	e2cfbfcc0c	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 17:53:56 -05:00
Nico Weber	d3e7491333	Revert Attributor patch series Broke check-clang, see https://reviews.llvm.org/D102307#2869065 Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`	2021-07-10 16:15:55 -04:00
Johannes Doerfert	1d5711c3ee	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 12:32:50 -05:00
Thomas Lively	e5220104d0	[WebAssembly] Custom combines for f64x2.promote_low_f32x4 Replace the clang builtin function and LLVM intrinsic previously used to select the f64x2.promote_low_f32x4 instruction with custom combines from standard SelectionDAG nodes. Implement the new combines to share code with the similar combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232. Differential Revision: https://reviews.llvm.org/D105675	2021-07-09 18:59:29 -07:00
Alexey Bataev	ab8989ab87	[OPENMP]Fix overlapped mapping for dereferenced pointer members. If the base is used in a map clause and later we have a memberexpr with this base, and the member is a pointer, and this pointer is dereferenced anyhow (subscript, array section, dereference, etc.), such components should be considered as overlapped, otherwise it may lead to incorrect size computations, since we try to map a pointee as a part of the whole struct, which is not true for the pointer members. Differential Revision: https://reviews.llvm.org/D105562	2021-07-09 12:51:26 -07:00
David Blaikie	768e3af634	PR51034: Debug Info: Remove 'prototyped' from K&R function declarations Regression caused by `6c9559b67b`.	2021-07-09 12:07:36 -07:00
Varun Gandhi	92dcb1d2db	[Clang] Introduce Swift async calling convention. This change is intended as initial setup. The plan is to add more semantic checks later. I plan to update the documentation as more semantic checks are added (instead of documenting the details up front). Most of the code closely mirrors that for the Swift calling convention. Three places are marked as [FIXME: swiftasynccc]; those will be addressed once the corresponding convention is introduced in LLVM. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D95561	2021-07-09 11:50:10 -07:00
David Blaikie	1def2579e1	PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23 C++23 will make these conversions ambiguous - so fix them to make the codebase forward-compatible with C++23 (& a follow-up change I've made will make this ambiguous/invalid even in <C++23 so we don't regress this & it generally improves the code anyway)	2021-07-08 13:37:57 -07:00
Nikita Popov	a0ea367562	[CodeGen] Avoid nullptr arg to CreateStructGEP (NFC) For now just make the getPointerElementType() explicit.	2021-07-08 21:21:43 +02:00
Alexey Bataev	f57d396dca	[OPENMP]Do no privatize const firstprivates in target regions. No need to emit private copyfor firstprivate constants in target regions, we can use the original copy instead. Differential Revision: https://reviews.llvm.org/D105647	2021-07-08 11:55:37 -07:00
Nikita Popov	693251fb2f	[CodeGen] Avoid CreateGEP with nullptr type (NFC) In preparation for dropping support for it. I've replaced it with a proper type where the correct type was obvious and left an explicit getPointerElementType() where it wasn't.	2021-07-08 20:38:54 +02:00
Alexey Bataev	b3c80dd894	[OPENMP]Remove const firstprivate allocation as a variable in a constant space. Current implementation is not compatible with asynchronous target regions, need to remove it. Differential Revision: https://reviews.llvm.org/D105375	2021-07-07 05:56:48 -07:00
Hsiangkai Wang	593bf9b4de	[Clang][RISCV] Implement vlseg and vlsegff. Differential Revision: https://reviews.llvm.org/D103527	2021-07-07 13:44:40 +08:00
David Blaikie	6c9559b67b	DebugInfo: Mangle K&R declarations for debug info linkage names This fixes a gap in the `overloadable` attribute support (K&R declared functions would get mangled symbol names, but that name wouldn't be represented in the debug info linkage name field for the function) and in -funique-internal-linkage-names (this came up in review discussion on D98799) where K&R static declarations would not get the uniqued linkage names.	2021-07-06 16:28:02 -07:00
Xiang1 Zhang	a39bb960fc	[X86] Refine code of generating BB labels in Keylocker Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105336	2021-07-05 09:29:51 +08:00
Nikita Popov	fabc17192e	[IRBuilder] Add type argument to CreateMaskedLoad/Gather Same as other CreateLoad-style APIs, these need an explicit type argument to support opaque pointers. Differential Revision: https://reviews.llvm.org/D105395	2021-07-04 12:17:59 +02:00
Jun Ma	3afbf89804	[clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast This change fixes the crash that PRValue cannot be handled by EmitLValue. Differential Revision: https://reviews.llvm.org/D105097	2021-07-01 10:09:47 +08:00
Yaxun (Sam) Liu	434bd5bf54	[AMDGPU] Add builtin functions image_bvh_intersect_ray Reviewed by: Stanislav Mekhanoshin, Matt Arsenault Differential Revision: https://reviews.llvm.org/D104946	2021-06-30 13:10:47 -04:00
Melanie Blower	e773216f46	[clang][patch] Add builtin __arithmetic_fence and option fprotect-parens This patch adds a new clang builtin, __arithmetic_fence. The purpose of the builtin is to provide the user fine control, at the expression level, over floating point optimization when -ffast-math (-ffp-model=fast) is enabled. The builtin prevents the optimizer from rearranging floating point expression evaluation. The new option fprotect-parens has the same effect on parenthesized expressions, forcing the optimizer to respect the parentheses. Reviewed By: aaron.ballman, kpn Differential Revision: https://reviews.llvm.org/D100118	2021-06-30 09:58:06 -04:00
Akira Hatanaka	6cda73e3c4	[CodeGen] Add ParmVarDecls to FunctionDecls that are created to generate ObjC property getter/setter functions This is needed to prevent clang from crashing when we make the changes proposed in https://reviews.llvm.org/D98799. Differential Revision: https://reviews.llvm.org/D104883	2021-06-29 16:27:24 -07:00
Steffen Larsen	3644726a78	[Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX 6.5 and 7.0 WMMA and MMA instructions Adds NVPTX builtins and intrinsics for the CUDA PTX `wmma.load`, `wmma.store`, `wmma.mma`, and `mma` instructions added in PTX 6.5 and 7.0. PTX ISA description of - `wmma.load`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-ld - `wmma.store`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-st - `wmma.mma`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma - `mma`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-mma Overview of `wmma.mma` and `mma` matrix shape/type combinations added with specific PTX versions: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-shape Authored-by: Steffen Larsen <steffen.larsen@codeplay.com> Co-Authored-by: Stuart Adams <stuart.adams@codeplay.com> Reviewed By: tra Differential Revision: https://reviews.llvm.org/D104847	2021-06-29 15:44:07 -07:00
Akira Hatanaka	8d21d54725	[CodeGen] Stop creating fake FunctionDecls when generating IR for functions implicitly generated by the compiler These fake functions would cause clang to crash if the changes proposed in https://reviews.llvm.org/D98799 were made.	2021-06-29 14:22:33 -07:00
Akira Hatanaka	952944c12c	[ObjC][ARC] Don't add operand bundle clang.arc.attachedcall to a call if the call already has the operand bundle This bug was causing the call to `replaceAllUsesWith` to crash because the old call instruction and the new call instruction were the same. rdar://74957948 Differential Revision: https://reviews.llvm.org/D97824	2021-06-29 10:23:01 -07:00
Bruno De Fraine	4d8871a898	PR50767: clear non-distinct debuginfo for function with nodebug definition after undecorated declaration Fix suggested by Yuanfang Chen: Non-distinct debuginfo is attached to the function due to the undecorated declaration. Later, when seeing the function definition and `nodebug` attribute, the non-distinct debuginfo should be cleared. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D104777	2021-06-29 10:26:45 +02:00
Xiang1 Zhang	6d234a6908	[X86] Zero some outputs of Kelocker intrinsics in error case Reviewed By: WangPengfei Differential Revision: https://reviews.llvm.org/D104766	2021-06-29 13:35:40 +08:00
Ben Shi	c94c8d8b5d	[AVR][clang] Fix wrong calling convention in functions return struct type According to AVR ABI (https://gcc.gnu.org/wiki/avr-gcc), returned struct value within size 1-8 bytes should be returned directly (via register r18-r25), while larger ones should be returned via an implicit struct pointer argument. Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D99237	2021-06-29 11:32:39 +08:00
Michael Liao	948308ef34	Fix `-Wunused-variable` warning. NFC.	2021-06-28 22:50:36 -04:00
Hongtao Yu	633ca3ff2f	[UniqueLinkageName] Use exsiting GlobalDecl object instead of reconstructing one. C++ constructors/destructors need to go through a different constructor to construct a GlobalDecl object in order to retrieve their linkage type. This causes an assert failure in the default constructor of GlobalDecl. I'm chaning it to using the exsiting GlobalDecl object. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102356	2021-06-28 14:50:41 -07:00
Melanie Blower	c27e5a2a8e	Revert "[clang][patch][fpenv] Add builtin __arithmetic_fence and option fprotect-parens" This reverts commit `4f1238e44d`. Buildbot fails on predecessor patch	2021-06-28 12:42:59 -04:00
Melanie Blower	4f1238e44d	[clang][patch][fpenv] Add builtin __arithmetic_fence and option fprotect-parens This patch adds a new clang builtin, __arithmetic_fence. The purpose of the builtin is to provide the user fine control, at the expression level, over floating point optimization when -ffast-math (-ffp-model=fast) is enabled. The builtin prevents the optimizer from rearranging floating point expression evaluation. The new option fprotect-parens has the same effect on parenthesized expressions, forcing the optimizer to respect the parentheses. Reviewed By: aaron.ballman, kpn Differential Revision: https://reviews.llvm.org/D100118	2021-06-28 12:26:53 -04:00
Jinsong Ji	eb237ffca8	[PowerPC] Add XL Compat fetch builtins Prototype ``` unsigned int __fetch_and_add (volatile unsigned int* addr, unsigned int val); unsigned long __fetch_and_addlp (volatile unsigned long* addr, unsigned long val); ``` Ref: https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-fetch Reviewed By: #powerpc, w2yehia, lkail Differential Revision: https://reviews.llvm.org/D104991	2021-06-28 02:52:32 +00:00
Craig Topper	7a112356e4	[X86] Correct the conversion of VALIGND/Q intrinsics to shufflevector. We need to mask the immediate to the width of a single vector rather than 2 vectors. If we use the width of 2 vectors then any shift larger than the length of 1 vector is going to overflow the shuffle indices. Fixes PR50895.	2021-06-26 19:06:00 -07:00
Joseph Huber	9ce02ea8c9	[OpenMP] Add Module metadata for OpenMP compilation This patch adds a module level metadata flag indicating that the module was compiled with the `-fopenmp` flag. This will make it easier for passes like OpenMPOpt to determine if it should be run. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102361	2021-06-25 16:34:19 -04:00
Stuart Brady	e47027d091	[OpenCL] Use DW_LANG_OpenCL language tag for OpenCL C Note regarding C++ for OpenCL: When compiling C++ for OpenCL, DW_LANG_C_plus_plus* is emitted. There is no DWARF language code defined for C++ for OpenCL as of yet, but DWARF issue 210514.1 has been raised to request one. In the mean time, continuing to emit DW_LANG_C_plus_plus* for C++ for OpenCL allows the potential to distinguish between C++ for OpenCL and OpenCL C in !DICompileUnit nodes, whereas using DW_LANG_OpenCL for C++ for OpenCL would prevent this. This change therefore leaves C++ for OpenCL as-is. Reviewed By: shchenz, Anastasia Differential Revision: https://reviews.llvm.org/D104118	2021-06-25 11:48:42 +01:00
Jinsong Ji	f3ef4f5bff	[PowerPC] Add XL compat __compare_and_swap builtins Prototype int __compare_and_swap (volatile int* addr, int* old_val_addr, int new_val); int __compare_and_swaplp (volatile long* addr, long* old_val_addr, long new_val); Refer to https://www.ibm.com/docs/en/xl-c-and-cpp-aix/16.1?topic=functions-compare-swap-compare-swaplp Reviewed By: w2yehia Differential Revision: https://reviews.llvm.org/D104837	2021-06-25 01:08:48 +00:00
Martin Storsjö	e5c7c171e5	[clang] Rename StringRef _lower() method calls to _insensitive() This is mostly a mechanical change, but a testcase that contains parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp) isn't touched.	2021-06-25 00:22:01 +03:00
Akira Hatanaka	8db0dbbe2c	[CodeGen] Don't create fake FunctionDecls when generating block/byref copy/dispose helper functions We found out that these fake functions would cause clang to crash if the changes proposed in https://reviews.llvm.org/D98799 were made. The original patch was reverted in `f681fd927e` because debug locations were missing in the body of the block byref helper functions. This patch fixes the bug by calling CreateArtificial after the calls to StartFunction. Differential Revision: https://reviews.llvm.org/D104082	2021-06-24 11:45:52 -07:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Zequan Wu	f681fd927e	Revert "[CodeGen] Don't create fake FunctionDecls when generating block/byref" That commit causes crash with error "!dbg attachment points at wrong subprogram for function" on iOS platforms. This reverts commit `f4c06bcb67`.	2021-06-22 21:48:00 -07:00
Akira Hatanaka	f4c06bcb67	[CodeGen] Don't create fake FunctionDecls when generating block/byref copy/dispose helper functions We found out that these fake functions would cause clang to crash if the changes proposed in https://reviews.llvm.org/D98799 were made. Differential Revision: https://reviews.llvm.org/D104082	2021-06-22 11:42:53 -07:00
Fangrui Song	948016228f	Improve clang -Wframe-larger-than= diagnostic Match the style in D104667. This commit is for non-LTO diagnostics, while D104667 is for LTO and llc diagnostics.	2021-06-22 11:20:49 -07:00
Joseph Huber	68d133a3e8	[OpenMP] Simplify GPU memory globalization Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads so must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type. This patch replaces this functinality with a single allocation command, effectively mimicing an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97680	2021-06-22 10:52:46 -04:00
Graham Hunter	a83ce95b09	[clang] Remove unused capture in closure `c6a91ee6aa` removed uses of IsMonotonic from OpenMP SIMD codegen, but that left a capture of the variable unused which upset buildbots using -Werror.	2021-06-22 15:09:39 +01:00
Graham Hunter	c6a91ee6aa	[Clang][OpenMP] Monotonic does not apply to SIMD The codegen for simd constructs was affected by the presence (or absence) of the 'monotonic' schedule modifier for worksharing loops. The modifier is only intended to apply to the scheduling of chunks for a thread, not iterations of a loop inside a chunk. In addition, the monotonic modifier was applied to worksharing loops by default if no schedule clause was present; the referenced part of the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the user specified a schedule clause with a static kind but no modifier. Without a user-specified schedule clause we should default to nonmonotonic scheduling. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D103793	2021-06-22 10:24:11 +01:00
Nick Desaulniers	8ace121305	[IR] convert warn-stack-size from module flag to fn attr Otherwise, this causes issues when building with LTO for object files that use different values. Link: https://github.com/ClangBuiltLinux/linux/issues/1395 Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D104342	2021-06-21 15:09:25 -07:00
Nick Desaulniers	193e41c987	[Clang][Codegen] Add GNU function attribute 'no_profile' and lower it to noprofile noprofile IR attribute already exists to prevent profiling with PGO; emit that when a function uses the newly added no_profile function attribute. The Linux kernel would like to avoid compiler generated code in functions annotated with such attribute. We already respect this for libcalls to fentry() and mcount(). Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223 Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/ Reviewed By: MaskRay, void, phosek, aaron.ballman Differential Revision: https://reviews.llvm.org/D104475	2021-06-18 13:42:32 -07:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Chen Zheng	4590b406c0	[Debug-Info] guard DW_LANG_C_plus_plus_14 under strict dwarf Reviewed By: stuart Differential Revision: https://reviews.llvm.org/D104291	2021-06-16 03:17:56 +00:00
Kevin Athey	e0b469ffa1	[clang-cl][sanitizer] Add -fsanitize-address-use-after-return to clang. Also: - add driver test (fsanitize-use-after-return.c) - add basic IR test (asan-use-after-return.cpp) - (NFC) cleaned up logic for generating table of __asan_stack_malloc depending on flag. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D104076	2021-06-11 12:07:35 -07:00
Simon Pilgrim	61cdaf66fe	[ADT] Remove APInt/APSInt toString() std::string variants <string> is currently the highest impact header in a clang+llvm build: https://commondatastorage.googleapis.com/chromium-browser-clang/llvm-include-analysis.html One of the most common places this is being included is the APInt.h header, which needs it for an old toString() implementation that returns std::string - an inefficient method compared to the SmallString versions that it actually wraps. This patch replaces these APInt/APSInt methods with a pair of llvm::toString() helpers inside StringExtras.h, adjusts users accordingly and removes the <string> from APInt.h - I was hoping that more of these users could be converted to use the SmallString methods, but it appears that most end up creating a std::string anyhow. I avoided trying to use the raw_ostream << operators as well as I didn't want to lose having the integer radix explicit in the code. Differential Revision: https://reviews.llvm.org/D103888	2021-06-11 13:19:15 +01:00
Nick Desaulniers	fc018ebb60	[IR] make -warn-frame-size into a module attr -Wframe-larger-than= is an interesting warning; we can't know the frame size until PrologueEpilogueInsertion (PEI); very late in the compilation pipeline. -Wframe-larger-than= was propagated through CC1 as an -mllvm flag, then was a cl::opt in LLVM's PEI pass; this meant it was dropped during LTO and needed to be re-specified via -plugin-opt. Instead, make it part of the IR proper as a module level attribute, similar to D103048. Introduce -fwarn-stack-size CC1 option. Reviewed By: rsmith, qcolombet Differential Revision: https://reviews.llvm.org/D103928	2021-06-10 16:15:27 -07:00
Michael Kruse	a22236120f	[OpenMP] Implement '#pragma omp unroll'. Implementation of the unroll directive introduced in OpenMP 5.1. Follows the approach from D76342 for the tile directive (i.e. AST-based, not using the OpenMPIRBuilder). Tries to use `llvm.loop.unroll.*` metadata where possible, but has to fall back to an AST representation of the outer loop if the partially unrolled generated loop is associated with another directive (because it needs to compute the number of iterations). Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99459	2021-06-10 14:30:17 -05:00
Joseph Huber	0c32ffceed	[OpenMP] Add type to firstprivate symbol for const firstprivate values Clang will create a global value put in constant memory if an aggregate value is declared firstprivate in the target device. The symbol name only uses the name of the firstprivate variable, so symbol name conflicts will occur if the variable is allowed to have different types through templates. An example of this behvaiour is shown in https://godbolt.org/z/EsMjYh47n. This patch adds the mangled type name to the symbol to avoid such naming conflicts. This fixes https://bugs.llvm.org/show_bug.cgi?id=50642. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D103995	2021-06-10 09:02:20 -04:00
Hongtao Yu	64b2fb7967	[CSSPGO] Emit mangled dwarf names for line tables debug option under -fpseudo-probe-for-profiling Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D103909	2021-06-09 10:46:03 -07:00
AndreyChurbanov	9ce2e5e700	Revert "[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type" This reverts commit `a1f550e052`. Revert in order to fix backwards compatibility breakage caused by type size change for task dependence flag.	2021-06-09 17:38:38 +03:00
Matheus Izvekov	aef5d8fdc7	[clang] NFC: Rename rvalue to prvalue This renames the expression value categories from rvalue to prvalue, keeping nomenclature consistent with C++11 onwards. C++ has the most complicated taxonomy here, and every other language only uses a subset of it, so it's less confusing to use the C++ names consistently, and mentally remap to the C names when working on that context (prvalue -> rvalue, no xvalues, etc). Renames: * VK_RValue -> VK_PRValue * Expr::isRValue -> Expr::isPRValue * SK_QualificationConversionRValue -> SK_QualificationConversionPRValue * JSON AST Dumper Expression nodes value category: "rvalue" -> "prvalue" Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D103720	2021-06-09 12:27:10 +02:00
Brendon Cahoon	294efbbd3e	Reland "[AMDGPU] Add gfx1013 target" This reverts commit `211e584fa2`. Fixed a use-after-free error that caused the sanitizers to fail.	2021-06-08 21:15:35 -04:00
Brendon Cahoon	211e584fa2	Revert "[AMDGPU] Add gfx1013 target" This reverts commit `ea10a86984`. A sanitizer buildbot reports an error.	2021-06-08 16:29:41 -04:00
Nathan Sidwell	b2d0c16e91	[clang] p1099 using enum part 2 This implements the 'using enum maybe-qualified-enum-tag ;' part of 1099. It introduces a new 'UsingEnumDecl', subclassed from 'BaseUsingDecl'. Much of the diff is the boilerplate needed to get the new class set up. There is one case where we accept ill-formed, but I believe this is merely an extended case of an existing bug, so consider it orthogonal. AFAICT in class-scope the c++20 rule is that no 2 using decls can bring in the same target decl ([namespace.udecl]/8). But we already accept: struct A { enum { a }; }; struct B : A { using A::a; }; struct C : B { using A::a; using B::a; }; // same enumerator this patch permits mixtures of 'using enum Bob;' and 'using Bob::member;' in the same way. Differential Revision: https://reviews.llvm.org/D102241	2021-06-08 11:11:46 -07:00
Nick Desaulniers	3787ee4571	reland [IR] make -stack-alignment= into a module attr Relands commit `433c8d950c` with fixes for MIPS. Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO. Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079 Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103048	2021-06-08 10:59:46 -07:00
Brendon Cahoon	ea10a86984	[AMDGPU] Add gfx1013 target Differential Revision: https://reviews.llvm.org/D103663	2021-06-08 12:49:49 -04:00
Nick Desaulniers	a596b54d47	Revert "[IR] make -stack-alignment= into a module attr" This reverts commit `433c8d950c`. Breaks the MIPS build.	2021-06-08 08:55:50 -07:00
Nick Desaulniers	433c8d950c	[IR] make -stack-alignment= into a module attr Similar to D102742, specifying the stack alignment via CodegenOpts means that this flag gets dropped during LTO, unless the command line is re-specified as a plugin opt. Instead, encode this information as a module level attribute so that we don't have to expose this llvm internal flag when linking the Linux kernel with LTO. Looks like external dependencies might need a fix: * https://github.com/llvm-hs/llvm-hs/issues/345 * https://github.com/halide/Halide/issues/6079 Link: https://github.com/ClangBuiltLinux/linux/issues/1377 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D103048	2021-06-08 08:31:04 -07:00
Yaxun (Sam) Liu	054cc3b1b4	[CUDA][HIP] Fix store of vtbl in ctor vtbl itself is in default global address space. When clang emits ctor, it gets a pointer to the vtbl field based on the this pointer, then stores vtbl to the pointer. Since this pointer can point to any address space (e.g. an object created in stack), this pointer points to default address space, therefore the pointer to vtbl field in this object should also be in default address space. Currently, clang incorrectly casts the pointer to vtbl field in this object to global address space. This caused assertions in backend. This patch fixes that by removing the incorrect addr space cast. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D103835	2021-06-08 10:24:44 -04:00
Martin Storsjö	b34da6ff9c	[clang] Apply MS ABI details on __builtin_ms_va_list on non-windows platforms on x86_64 This fixes inconsistencies in the ms_abi.c testcase. Also add a couple cases of missing double pointers in the windows part of the testcase; the outcome of building that testcase on windows hasn't changed, but the previous form of the test was imprecise (checking for "%[[STRUCT_FOO]]" when clang actually generates "%[[STRUCT_FOO]]*"), which still used to match. Ideally this would share code with the native Windows case, but X86_64ABIInfo and WinX86_64ABIInfo aren't superclasses/subclasses of each other so it's impractical, and the code to share currently only consists of a couple lines. Differential Revision: https://reviews.llvm.org/D103837	2021-06-08 12:14:12 +03:00
Martin Storsjö	6de45b9e6a	[clang] Fix reading long doubles with va_arg on x86_64 mingw On x86_64 mingw, long doubles are always passed indirectly as arguments (see an existing case in WinX86_64ABIInfo::classify); generalize the existing code for reading varargs - any non-aggregate type that is larger than 64 bits (which would be both long double in mingw, and __int128) are passed indirectly too. This makes reading varargs consistent with how they're passed, fixing interop with both gcc and clang callers, for long double and __int128. Differential Revision: https://reviews.llvm.org/D103452	2021-06-07 22:34:10 +03:00
AndreyChurbanov	a1f550e052	[OpenMP] libomp: implement OpenMP 5.1 inoutset task dependence type Refactored code of dependence processing and added new inoutset dependence type. Compiler can set dependence flag to 0x8 when call __kmpc_omp_task_with_deps. Size of type of the dependence flag changed from 1 to 4 bytes in clang. All dependence flags library gets so far and corresponding dependence types: 1 - IN, 2 - OUT, 3 - INOUT, 4 - MUTEXINOUTSET, 8 - INOUTSET. Differential Revision: https://reviews.llvm.org/D97085	2021-06-07 21:42:51 +03:00
Hsiangkai Wang	2b13ff6979	[Clang][CodeGen] Set the size of llvm.lifetime to unknown for scalable types. If the memory object is scalable type, we do not know the exact size of it at compile time. Set the size of lifetime marker to unknown if the object is scalable one. Differential Revision: https://reviews.llvm.org/D102822	2021-06-07 23:30:13 +08:00
Nathan Sidwell	ddda05add5	[clang][NFC] Break out BaseUsingDecl from UsingDecl This is a pre-patch for adding using-enum support. It breaks out the shadow decl handling of UsingDecl to a new intermediate base class, BaseUsingDecl, altering the decl hierarchy to def BaseUsing : DeclNode<Named, "", 1>; def Using : DeclNode<BaseUsing>; def UsingPack : DeclNode<Named>; def UsingShadow : DeclNode<Named>; def ConstructorUsingShadow : DeclNode<UsingShadow>; Differential Revision: https://reviews.llvm.org/D101777	2021-06-07 06:29:28 -07:00
Bradley Smith	60c9b5f35c	[AArch64][SVE] Improve codegen for dupq SVE ACLE intrinsics Use llvm.experimental.vector.insert instead of storing into an alloca when generating code for these intrinsics. This defers the codegen of the generated vector to instruction selection, allowing existing shufflevector style optimizations to apply. Additionally, introduce a new target transform that can recognise fixed predicate patterns in the svbool variants of these intrinsics. Differential Revision: https://reviews.llvm.org/D103082	2021-06-07 12:21:38 +01:00
Andrew Savonichev	b31f41e78b	[Clang] Support a user-defined __dso_handle This fixes PR49198: Wrong usage of __dso_handle in user code leads to a compiler crash. When Init is an address of the global itself, we need to track it across RAUW. Otherwise the initializer can be destroyed if the global is replaced. Differential Revision: https://reviews.llvm.org/D101156	2021-06-07 12:54:08 +03:00
Ten Tzen	33ba8bd2c9	[Windows SEH]: Fix -O2 crash for Windows -EHa This patch fixes a Windows -EHa crash induced by previous commit `797ad70152`. The crash was caused by "LifetimeMarker" scope (with option -O2) that should not be considered as SEH Scope. This change also turns off -fasync-exceptions by default under -EHa option for now. Differential Revision: https://reviews.llvm.org/D103664#2799944	2021-06-04 14:07:44 -07:00
Alexey Bataev	c84a5448b5	[OPENMP]Fix PR50129: omp cancel parallel not working as expected. Need to emit a call for __kmpc_cancel_barrier in the exit block for __kmpc_cancel function call if cancellation of the parallel block is requested. Differential Revision: https://reviews.llvm.org/D103646	2021-06-04 08:24:55 -07:00
Michael Kruse	07a6beb402	[Clang][OpenMP] Emit dependent PreInits before directive. The PreInits of a loop transformation (atm moment only tile) include the computation of the trip count. The trip count is needed by any loop-associated directives that consumes the transformation-generated loop. Hence, we must ensure that the PreInits of consumed loop transformations are emitted with the consuming directive. This is done by addinging the inner loop transformation's PreInits to the outer loop-directive's PreInits. The outer loop-directive will consume the de-sugared AST such that the inner PreInits are not emitted twice. The PreInits of a loop transformation are still emitted directly if its generated loop(s) are not associated with another loop-associated directive. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D102180	2021-06-02 16:59:35 -05:00
Erik Pilkington	369c648399	[clang] Implement the using_if_exists attribute This attribute applies to a using declaration, and permits importing a declaration without knowing if that declaration exists. This is useful for libc++ C wrapper headers that re-export declarations in std::, in cases where the base C library doesn't provide all declarations. This attribute was proposed in http://lists.llvm.org/pipermail/cfe-dev/2020-June/066038.html. rdar://69313357 Differential Revision: https://reviews.llvm.org/D90188	2021-06-02 10:30:24 -04:00
Eli Friedman	577fea4e1a	[CGAtomic] Delete outdated code comparing success/failure ordering for cmpxchg. This doesn't actually have any effect: we only call this code with SequentiallyConsistent orderings. But delete it anyway for consistency with other recent changes.	2021-05-28 15:36:01 -07:00
Tim Northover	e94fada045	SwiftAsync: add Clang attribute to apply the LLVM `swiftasync` one. Expected to be used by Swift runtime developers.	2021-05-28 12:31:12 +01:00
Martin Storsjö	0e4cf807ae	[clang] [MinGW] Don't mark emutls variables as DSO local These actually can be automatically imported from another DLL. (This works properly as long as the actual implementation of emutls is linked dynamically from e.g. libgcc; if the implementation comes from compiler-rt or a statically linked libgcc, it doesn't work as intended.) This fixes PR50146 and https://github.com/msys2/MINGW-packages/issues/8706 (fixing calling std::call_once in a dynamically linked libstdc++); since `f731839584` the dso_local attribute on the TLS variable affected the actual generated code for accessing the emutls variable. The dso_local attribute on the emutls variable made those accesses to use 32 bit relative addressing in code, which requires runtime pseudo relocations in the text section, and breaks entirely if the actual other variable ends up loaded too far away in the virtual address space. Differential Revision: https://reviews.llvm.org/D102970	2021-05-27 23:51:22 +03:00
Erich Keane	eba69b59d1	Reimplement __builtin_unique_stable_name- The original version of this was reverted, and @rjmcall provided some advice to architect a new solution. This is that solution. This implements a builtin to provide a unique name that is stable across compilations of this TU for the purposes of implementing the library component of the unnamed kernel feature of SYCL. It does this by running the Itanium mangler with a few modifications. Because it is somewhat common to wrap non-kernel-related lambdas in macros that aren't present on the device (such as for logging), this uniquely generates an ID for all lambdas involved in the naming of a kernel. It uses the lambda-mangling number to do this, except replaces this with its own number (starting at 10000 for readabililty reasons) for lambdas used to name a kernel. Additionally, this implements itself as constexpr with a slight catch: if a name would be invalidated by the use of this lambda in a later kernel invocation, it is diagnosed as an error (see the Sema tests). Differential Revision: https://reviews.llvm.org/D103112	2021-05-27 07:12:20 -07:00
Marco Elver	280333021e	[SanitizeCoverage] Add support for NoSanitizeCoverage function attribute We really ought to support no_sanitize("coverage") in line with other sanitizers. This came up again in discussions on the Linux-kernel mailing lists, because we currently do workarounds using objtool to remove coverage instrumentation. Since that support is only on x86, to continue support coverage instrumentation on other architectures, we must support selectively disabling coverage instrumentation via function attributes. Unfortunately, for SanitizeCoverage, it has not been implemented as a sanitizer via fsanitize= and associated options in Sanitizers.def, but rolls its own option fsanitize-coverage. This meant that we never got "automatic" no_sanitize attribute support. Implement no_sanitize attribute support by special-casing the string "coverage" in the NoSanitizeAttr implementation. To keep the feature as unintrusive to existing IR generation as possible, define a new negative function attribute NoSanitizeCoverage to propagate the information through to the instrumentation pass. Fixes: https://bugs.llvm.org/show_bug.cgi?id=49035 Reviewed By: vitalybuka, morehouse Differential Revision: https://reviews.llvm.org/D102772	2021-05-25 12:57:14 +02:00
Marco Elver	ca6df73406	[NFC][CodeGenOptions] Refactor checking SanitizeCoverage options Refactor checking SanitizeCoverage options into CodeGenOptions::hasSanitizeCoverage(). Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D102927	2021-05-25 12:57:14 +02:00
Chen Zheng	99d45ed22f	[Debug-Info] handle DW_TAG_rvalue_reference_type at strict DWARF. When -gstrict-dwarf is specified, generate DW_TAG_rvalue_reference_type at DWARF 4 or above Reviewed By: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D100630	2021-05-23 21:24:13 -04:00
Nick Desaulniers	033138ea45	[IR] make stack-protector-guard-* flags into module attrs D88631 added initial support for: - -mstack-protector-guard= - -mstack-protector-guard-reg= - -mstack-protector-guard-offset= flags, and D100919 extended these to AArch64. Unfortunately, these flags aren't retained for LTO. Make them module attributes rather than TargetOptions. Link: https://github.com/ClangBuiltLinux/linux/issues/1378 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D102742	2021-05-21 15:53:30 -07:00
Yaxun (Sam) Liu	4cb42564ec	[CUDA][HIP] Fix device variables used by host variables emitted on both host and device side with different addresses when ODR-used by host function should not cause device side counter-part to be force emitted. This fixes the regression caused by https://reviews.llvm.org/D102237 Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D102801	2021-05-20 17:04:29 -04:00
Heejin Ahn	3eb12b0ae1	[WebAssembly] Warn on exception spec for Emscripten EH It turns out we have not correctly supported exception spec all along in Emscripten EH. Emscripten EH supports `throw()` but not `throw` with types. See https://bugs.llvm.org/show_bug.cgi?id=50396. Wasm EH also only supports `throw()` but not `throw` with types, and we have been printing a warning message for the latter. This prints the same warning message for `throw` with types when Emscripten EH is used, or more precisely, when Wasm EH is not used. (So this will print the warning messsage even when `-fno-exceptions` is used but I think that should be fine. It's cumbersome to do a complilcated option checking in CGException.cpp and options checkings are mostly done in elsewhere.) Reviewed By: dschuff, kripken Differential Revision: https://reviews.llvm.org/D102791	2021-05-20 13:00:20 -07:00
Reid Kleckner	8f20ac9595	[PGO] Don't reference functions unless value profiling is enabled This reduces the size of chrome.dll.pdb built with optimizations, coverage, and line table info from 4,690,210,816 to 2,181,128,192, which makes it possible to fit under the 4GB limit. This change can greatly reduce binary size in coverage builds, which do not need value profiling. IR PGO builds are unaffected. There is a minor behavior change for frontend PGO. PGO and coverage both use InstrProfiling to create profile data with counters. PGO records the address of each function in the __profd_ global. It is used later to map runtime function pointer values back to source-level function names. Coverage does not appear to use this information. Recording the address of every function with code coverage drastically increases code size. Consider this program: void foo(); void bar(); inline void inlineMe(int x) { if (x > 0) foo(); else bar(); } int getVal(); int main() { inlineMe(getVal()); } With code coverage, the InstrProfiling pass runs before inlining, and it captures the address of inlineMe in the __profd_ global. This greatly increases code size, because now the compiler can no longer delete trivial code. One downside to this approach is that users of frontend PGO must apply the -mllvm -enable-value-profiling flag globally in TUs that enable PGO. Otherwise, some inline virtual method addresses may not be recorded and will not be able to be promoted. My assumption is that this mllvm flag is not popular, and most frontend PGO users don't enable it. Differential Revision: https://reviews.llvm.org/D102818	2021-05-20 11:09:24 -07:00
Fangrui Song	37561ba89b	-fno-semantic-interposition: Don't set dso_local on GlobalVariable `clang -fpic -fno-semantic-interposition` may set dso_local on variables for -fpic. GCC folks consider there are 'address interposition' and 'semantic interposition', and 'disabling semantic interposition' can optimize function calls but cannot change variable references to use local aliases (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100483). This patch removes dso_local for variables in `clang -fpic -fno-semantic-interposition` mode so that the built shared objects can work with copy relocations. Building llvm-project tiself with -fno-semantic-interposition (D102453) should now be safe with trunk Clang. Example: ``` // a.c int var; int *addr() { return var; } // old: cannot be interposed movslq .Lvar$local(%rip), %rax // new: can be interposed movq var@GOTPCREL(%rip), %rax movslq (%rax), %rax ``` The local alias lowering for `GlobalVariable`s is kept in case there is a future option allowing local aliases. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D102583	2021-05-19 16:08:28 -07:00
Arthur Eubanks	0c509dbc7e	[NewPM] Add options to PrintPassInstrumentation To bring D99599's implementation in line with the existing PrintPassInstrumentation, and to fix a FIXME, add more customizability to PrintPassInstrumentation. Introduce three new options. The first takes over the existing "-debug-pass-manager-verbose" cl::opt. The second and third option are specific to -fdebug-pass-structure. They allow indentation, and also don't print analysis queries. To avoid more golden file tests than necessary, prune down the -fdebug-pass-structure tests. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D102196	2021-05-18 20:59:35 -07:00
Benjamin Kramer	3f3642a763	[CodeGen] Avoid unused variable warning in Release builds. NFCI.	2021-05-18 11:09:12 +02:00
Ten Tzen	797ad70152	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 1 This patch is the Part-1 (FE Clang) implementation of HW Exception handling. This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling). This is the first step of this project; only X86_64 target is enabled in this patch. Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change. The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules: * First, no exception can move in or out of _try region., i.e., no "potential faulty instruction can be moved across _try boundary. * Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). * Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs. The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa ) is called by a SEH C function, a hardware exception occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process. Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph preciously. However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial. This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option. One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block succeeds various states from its predecessors, the lowest State triumphs others. If some exits flow to unreachable, propagation on those paths terminate, not affecting remaining blocks. For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed: Warning: jump bypasses variable with a non-trivial destructor In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject a eha_scope_begin() invoke in the side entry block to ensure a correct State. Implementation: Part-1: Clang implementation described below. Two intrinsic are created to track CPP object scopes; eha_scope_begin() and eha_scope_end(). _scope_begin() is immediately added after ctor() is called and EHStack is pushed. So it must be an invoke, not a call. With that it's also guaranteed an EH-cleanup-pad is created regardless whether there exists a call in this scope. _scope_end is added before dtor(). These two intrinsics make the computation of Block-State possible in downstream code gen pass, even in the presence of ctor/dtor inlining. Two intrinsic, seh_try_begin() and seh_try_end(), are added for C-code to mark _try boundary and to prevent from exceptions being moved across _try boundary. All memory instructions inside a _try are considered as 'volatile' to assure 2nd and 3rd rules for C-code above. This is a little sub-optimized. But it's acceptable as the amount of code directly under _try is very small. Part-2 (will be in Part-2 patch): LLVM implementation described below. For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also rely on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (note the purpose of that is to ensure the return address of a call is in the same scope as the call address. The handler for catch(...) for -EHa must handle HW exception. So it is 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW exceptions can be passed through. Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html Differential Revision: https://reviews.llvm.org/D80344/new/	2021-05-17 22:42:17 -07:00
Arthur Eubanks	9f7d552cff	[NFC] Pass GV value type instead of pointer type to GetOrCreateLLVMGlobal For opaque pointers, to avoid PointerType::getElementType(). Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102638	2021-05-17 18:41:17 -07:00
Eli Friedman	698568b74c	[clang CodeGen] Don't crash on large atomic function parameter. I wouldn't recommend writing code like the testcase; a function parameter isn't atomic, so using an atomic type doesn't really make sense. But it's valid, so clang shouldn't crash on it. The code was assuming hasAggregateEvaluationKind(Ty) implies Ty is a RecordType, which isn't true. Just use isRecordType() instead. Differential Revision: https://reviews.llvm.org/D102015	2021-05-17 13:18:23 -07:00
Nick Desaulniers	0f41778919	[AArch64] Support customizing stack protector guard Follow up to D88631 but for aarch64; the Linux kernel uses the command line flags: 1. -mstack-protector-guard=sysreg 2. -mstack-protector-guard-reg=sp_el0 3. -mstack-protector-guard-offset=0 to use the system register sp_el0 for the stack canary, enabling the kernel to have a unique stack canary per task (like a thread, but not limited to userspace as the kernel can preempt itself). Address pr/47341 for aarch64. Fixes: https://github.com/ClangBuiltLinux/linux/issues/289 Signed-off-by: Nick Desaulniers <ndesaulniers@google.com> Reviewed By: xiangzhangllvm, DavidSpickett, dmgreen Differential Revision: https://reviews.llvm.org/D100919	2021-05-17 11:49:22 -07:00
Irina Dobrescu	50511df32e	[AArch64] Lower bitreverse in ISel Adding lowering support for bitreverse. Previously, lowering bitreverse would expand it into a series of other instructions. This patch makes it so this produces a single rbit instruction instead. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D102397	2021-05-17 13:35:27 +01:00
Raphael Isemann	888ce70af2	[DebugInfo] Fix DWARF expressions for __block vars that are not on the heap `__block` variables used to be always stored on the head instead of stack. D51564 allowed `__block` variables to the stored on the stack like normal variablesif they not captured by any escaping block, but the debug-info generation code wasn't made aware of it so we still unconditionally emit DWARF expressions pointing to the heap. This patch makes CGDebugInfo use the `EscapingByref` introduced in D51564 that tracks whether the `__block` variable is actually on the heap. If it's stored on the stack instead we just use the debug info we would generate for normal variables instead. Reviewed By: ahatanak, aprantl Differential Revision: https://reviews.llvm.org/D99946	2021-05-17 14:32:07 +02:00
Florian Hahn	803c52d0db	Recommit "[Clang,Driver] Add -fveclib=Darwin_libsystem_m support." Recommit D102489, with the test case requiring the AArch64 backend. This reverts the revert `59b419adc6`.	2021-05-16 18:49:53 +01:00
Pengxuan Zheng	c9b36a041f	Support GCC's -fstack-usage flag This patch adds support for GCC's -fstack-usage flag. With this flag, a stack usage file (i.e., .su file) is generated for each input source file. The format of the stack usage file is also similar to what is used by GCC. For each function defined in the source file, a line with the following information is produced in the .su file. <source_file>:<line_number>:<function_name> <size_in_byte> <static/dynamic> "Static" means that the function's frame size is static and the size info is an accurate reflection of the frame size. While "dynamic" means the function's frame size can only be determined at run-time because the function manipulates the stack dynamically (e.g., due to variable size objects). The size info only reflects the size of the fixed size frame objects in this case and therefore is not a reliable measure of the total frame size. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D100509	2021-05-15 10:22:49 -07:00
Douglas Yung	59b419adc6	Revert "[Clang,Driver] Add -fveclib=Darwin_libsystem_m support." This reverts commit `187a14e1f3`. The test added in this commit is failing on several build bots: https://lab.llvm.org/buildbot/#/builders/139/builds/4059 https://lab.llvm.org/buildbot/#/builders/132/builds/5605	2021-05-14 22:39:12 -07:00
Arthur Eubanks	e8448a5985	[NFC] Directly get GV type	2021-05-14 14:27:07 -07:00
Florian Hahn	187a14e1f3	[Clang,Driver] Add -fveclib=Darwin_libsystem_m support. Support for Darwin's libsystem_m's vector functions has been added to LLVM in `93a9a8a8d9`. This patch adds support for -fveclib=Darwin_libsystem_m to Clang. Reviewed By: arphaman Differential Revision: https://reviews.llvm.org/D102489	2021-05-14 21:00:13 +01:00
Michael Kruse	83ff0ff463	[Clang][OpenMP] Allow unified_shared_memory for Pascal-generation GPUs. The Pascal architecture supports the page migration engine required for unified_shared_memory, as indicated by NVIDIA: * https://developer.nvidia.com/blog/unified-memory-cuda-beginners/ * https://developer.nvidia.com/blog/beyond-gpu-memory-limits-unified-memory-pascal/ * https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#um-requirements The limitation was introduced in D54493 which justified the cut-off by the requirement for unified addressing. However, Unified Virtual Addressing (UVA) is already available with sm20 (Fermi, Kepler, Maxwell): * https://docs.nvidia.com/cuda/gpudirect-rdma/index.html#basics-of-uva-cuda-memory-management Unified shared memory might even be possible with these, but with migration of entire allocations on kernel startup. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D101595	2021-05-13 17:15:34 -05:00
Aakanksha Patil	464e4dc50f	[AMDGPU] Add gfx1034 target Differential Revision: https://reviews.llvm.org/D102306	2021-05-13 14:25:18 -04:00
cynecx	8ec9fd4839	Support unwinding from inline assembly I've taken the following steps to add unwinding support from inline assembly: 1) Add a new `unwind` "attribute" (like `sideeffect`) to the asm syntax: ``` invoke void asm sideeffect unwind "call thrower", "~{dirflag},~{fpsr},~{flags}"() to label %exit unwind label %uexit ``` 2.) Add Bitcode writing/reading support + LLVM-IR parsing. 3.) Emit EHLabels around inline assembly lowering (SelectionDAGBuilder + GlobalISel) when `InlineAsm::canThrow` is enabled. 4.) Tweak InstCombineCalls/InlineFunction pass to not mark inline assembly "calls" as nounwind. 5.) Add clang support by introducing a new clobber: "unwind", which lower to the `canThrow` being enabled. 6.) Don't allow unwinding callbr. Reviewed By: Amanieu Differential Revision: https://reviews.llvm.org/D95745	2021-05-13 19:13:03 +01:00
Roman Lebedev	16d0381841	Return "[CGCall] Annotate `this` argument with alignment" The original change was reverted because it was discovered that clang mishandles thunks, and they receive wrong attributes for their this/return types - the ones for the function they will call, not the ones they have. While i have tried to fix this in https://reviews.llvm.org/D100388 that patch has been up and stuck for a month now, with little signs of progress. So while it will be good to solve this for real, for now we can simply avoid introducing the bug, by not annotating this/return for thunks. This reverts commit `6270b3a1ea`, relanding `0aa0458f14`.	2021-05-13 20:33:14 +03:00
Roman Lebedev	a624cec56d	[Clang][Codegen] Do not annotate thunk's this/return types with align/deref/nonnull attrs As it was discovered in post-commit feedback for `0aa0458f14`, we handle thunks incorrectly, and end up annotating their this/return with attributes that are valid for their callees, not for thunks themselves. While it would be good to fix this properly, and keep annotating them on thunks, i've tried doing that in https://reviews.llvm.org/D100388 with little success, and the patch is stuck for a month now. So for now, as a stopgap measure, subj.	2021-05-13 20:33:08 +03:00
Vassil Vassilev	92f9852fc9	[clang-repl] Recommit "Land initial infrastructure for incremental parsing" Original commit message: In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have mentioned our plans to make some of the incremental compilation facilities available in llvm mainline. This patch proposes a minimal version of a repl, clang-repl, which enables interpreter-like interaction for C++. For instance: ./bin/clang-repl clang-repl> int i = 42; clang-repl> extern "C" int printf(const char*,...); clang-repl> auto r1 = printf("i=%d\n", i); i=42 clang-repl> quit The patch allows very limited functionality, for example, it crashes on invalid C++. The design of the proposed patch follows closely the design of cling. The idea is to gather feedback and gradually evolve both clang-repl and cling to what the community agrees upon. The IncrementalParser class is responsible for driving the clang parser and codegen and allows the compiler infrastructure to process more than one input. Every input adds to the “ever-growing” translation unit. That model is enabled by an IncrementalAction which prevents teardown when HandleTranslationUnit. The IncrementalExecutor class hides some of the underlying implementation details of the concrete JIT infrastructure. It exposes the minimal set of functionality required by our incremental compiler/interpreter. The Transaction class keeps track of the AST and the LLVM IR for each incremental input. That tracking information will be later used to implement error recovery. The Interpreter class orchestrates the IncrementalParser and the IncrementalExecutor to model interpreter-like behavior. It provides the public API which can be used (in future) when using the interpreter library. Differential revision: https://reviews.llvm.org/D96033	2021-05-13 06:30:29 +00:00
Vassil Vassilev	f6907152db	Revert "[clang-repl] Land initial infrastructure for incremental parsing" This reverts commit `44a4000181`. We are seeing build failures due to missing dependency to libSupport and CMake Error at tools/clang/tools/clang-repl/cmake_install.cmake file INSTALL cannot find	2021-05-13 04:44:19 +00:00
Vassil Vassilev	44a4000181	[clang-repl] Land initial infrastructure for incremental parsing In http://lists.llvm.org/pipermail/llvm-dev/2020-July/143257.html we have mentioned our plans to make some of the incremental compilation facilities available in llvm mainline. This patch proposes a minimal version of a repl, clang-repl, which enables interpreter-like interaction for C++. For instance: ./bin/clang-repl clang-repl> int i = 42; clang-repl> extern "C" int printf(const char*,...); clang-repl> auto r1 = printf("i=%d\n", i); i=42 clang-repl> quit The patch allows very limited functionality, for example, it crashes on invalid C++. The design of the proposed patch follows closely the design of cling. The idea is to gather feedback and gradually evolve both clang-repl and cling to what the community agrees upon. The IncrementalParser class is responsible for driving the clang parser and codegen and allows the compiler infrastructure to process more than one input. Every input adds to the “ever-growing” translation unit. That model is enabled by an IncrementalAction which prevents teardown when HandleTranslationUnit. The IncrementalExecutor class hides some of the underlying implementation details of the concrete JIT infrastructure. It exposes the minimal set of functionality required by our incremental compiler/interpreter. The Transaction class keeps track of the AST and the LLVM IR for each incremental input. That tracking information will be later used to implement error recovery. The Interpreter class orchestrates the IncrementalParser and the IncrementalExecutor to model interpreter-like behavior. It provides the public API which can be used (in future) when using the interpreter library. Differential revision: https://reviews.llvm.org/D96033	2021-05-13 04:23:24 +00:00
Chen Zheng	a0ca4c46ca	[Debug-Info] add -gstrict-dwarf support in backend Reviewed By: dblaikie, probinson Differential Revision: https://reviews.llvm.org/D100826	2021-05-12 23:00:52 -04:00
Roman Lebedev	2d84195d60	[NFCI][clang][Codegen] CodeGenVTables::addVTableComponent(): use getGlobalDecl It does the same thing. Split off from https://reviews.llvm.org/D100388	2021-05-12 20:39:54 +03:00
Yaxun (Sam) Liu	98575708da	[CUDA][HIP] Fix device template variables Currently clang does not emit device template variables instantiated only in host functions, however, nvcc is able to do that: https://godbolt.org/z/fneEfferY This patch fixes this issue by refactoring and extending the existing mechanism for emitting static device var ODR-used by host only. Basically clang records device variables ODR-used by host code and force them to be emitted in device compilation. The existing mechanism makes sure these device variables ODR-used by host code are added to llvm.compiler-used, therefore they are guaranteed not to be deleted. It also fixes non-ODR-use of static device variable by host code causing static device variable to be emitted and registered, which should not. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D102237	2021-05-12 11:13:29 -04:00
Momchil Velikov	5c7b43aa82	[clang][AArch32] Correctly align HA arguments when passed on the stack Analogously to https://reviews.llvm.org/D98794 this patch uses the `alignstack` attribute to fix incorrect passing of homogeneous aggregate (HA) arguments on AArch32. The EABI/AAPCS was recently updated to clarify how VFP co-processor candidates are aligned: `4488e34998` Differential Revision: https://reviews.llvm.org/D100853	2021-05-10 16:28:46 +01:00
Alexey Bataev	230953d577	[OPENMP]Fix PR48851: the locals are not globalized in SPMD mode. Follow the more general patch for now, do not try to SPMDize the kernel if the variable is used and local. Differential Revision: https://reviews.llvm.org/D101911	2021-05-10 06:34:11 -07:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
Arthur Eubanks	6f7131002b	[NewPM] Move analysis invalidation/clearing logging to instrumentation We're trying to move DebugLogging into instrumentation, rather than being part of PassManagers/AnalysisManagers. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D102093	2021-05-07 15:25:31 -07:00
Olivier Goffart	c4adc49a1c	[SEH] Fix regression with SEH in noexpect functions Commit `5baea05601` set the CurCodeDecl because it was needed to pass the assert in CodeGenFunction::EmitLValueForLambdaField, But this was not right to do as CodeGenFunction::FinishFunction passes it to EmitEndEHSpec and cause corruption of the EHStack. Revert the part of the commit that changes the CurCodeDecl, and instead adjust the assert to check for a null CurCodeDecl. Differential Revision: https://reviews.llvm.org/D102027	2021-05-07 13:27:59 -07:00
Ahsan Saghir	25bbff632d	[PowerPC] Provide MMA builtins for compatibility Vector pair intrinsics and builtins were renamed in https://reviews.llvm.org/D91974 to replace the _mma_ prefix by _vsx_. However, some projects used the _mma_ version, so this patch adds these intrinsics to provide compatibility. Fixes Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=50159 Reviewed By: nemanjai, amyk Differential Revision: https://reviews.llvm.org/D100482	2021-05-07 09:10:16 -05:00
Bruno Cardoso Lopes	819e0d105e	[CGAtomic] Lift strong requirement for remaining compare_exchange combinations Follow up on `431e3138a` and complete the other possible combinations. Besides enforcing the new behavior, it also mitigates TSAN false positives when combining orders that used to be stronger.	2021-05-06 21:05:20 -07:00
Johannes Doerfert	df729e2b82	[OpenMP] Overhaul `declare target` handling This patch fixes various issues with our prior `declare target` handling and extends it to support `omp begin declare target` as well. This started with PR49649 in mind, trying to provide a way for users to avoid the "ref" global use introduced for globals with internal linkage. From there it went down the rabbit hole, e.g., all variables, even `nohost` ones, were emitted into the device code so it was impossible to determine if "ref" was needed late in the game (based on the name only). To make it really useful, `begin declare target` was needed as it can carry the `device_type`. Not emitting variables eagerly had a ripple effect. Finally, the precedence of the (explicit) declare target list items needed to be taken into account, that meant we cannot just look for any declare target attribute to make a decision. This caused the handling of functions to require fixup as well. I tried to clean up things while I was at it, e.g., we should not "parse declarations and defintions" as part of OpenMP parsing, this will always break at some point. Instead, we keep track what region we are in and act on definitions and declarations instead, this is what we do for declare variant and other begin/end directives already. Highlights: - new diagnosis for restrictions specificed in the standard, - delayed emission of globals not mentioned in an explicit list of a declare target, - omission of `nohost` globals on the host and `host` globals on the device, - no explicit parsing of declarations in-between `omp [begin] declare variant` and the corresponding end anymore, regular parsing instead, - precedence for explicit mentions in `declare target` lists over implicit mentions in the declaration-definition-seq, and - `omp allocate` declarations will now replace an earlier emitted global, if necessary. --- Notes: The patch is larger than I hoped but it turns out that most changes do on their own lead to "inconsistent states", which seem less desirable overall. After working through this I feel the standard should remove the explicit declare target forms as the delayed emission is horrible. That said, while we delay things anyway, it seems to me we check too often for the current status even though that is often not sufficient to act upon. There seems to be a lot of duplication that can probably be trimmed down. Eagerly emitting some things seems pretty weak as an argument to keep so much logic around. --- Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D101030	2021-05-06 02:10:41 -05:00
Giorgis Georgakoudis	f97b843d88	[OpenMP] Fix non-determinism in clang copyin codegen Codegen for OpeMP copyin has non-deterministic IR output due to the unspecified evaluation order in a codegen conditional branch, which makes automatic test generation unreliable. This patch refactors codegen code to avoid this non-determinism. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101952	2021-05-05 19:24:03 -07:00
Giorgis Georgakoudis	313ee609e1	[OpenMP] Fix non-determinism in clang task codegen (lastprivates) Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101800	2021-05-04 11:56:31 -07:00
Leonard Chan	84c4754372	[clang] Add -fc++-abi= flag for specifying which C++ ABI to use This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html. The goal is to add a way to override the default target C++ ABI through a compiler flag. This makes it easier to test and transition between different C++ ABIs through compile flags rather than build flags. In this patch: - Store -fc++-abi= in a LangOpt. This isn't stored in a CodeGenOpt because there are instances outside of codegen where Clang needs to know what the ABI is (particularly through ASTContext::createCXXABI), and we should be able to override the target default if the flag is provided at that point. - Expose the existing ABIs in TargetCXXABI as values that can be passed through this flag. - Create a .def file for these ABIs to make it easier to check flag values. - Add an error for diagnosing bad ABI flag values. Differential Revision: https://reviews.llvm.org/D85802	2021-05-04 10:52:13 -07:00
Andrew Savonichev	b451ecd86e	[Clang][AArch64] Disable rounding of return values for AArch64 If a return value is explicitly rounded to 64 bits, an additional zext instruction is emitted, and in some cases it prevents tail call optimization. As discussed in D100225, this rounding is not necessary and can be disabled. Differential Revision: https://reviews.llvm.org/D100591	2021-05-04 20:29:01 +03:00
Nico Weber	d7ec48d71b	[clang] accept -fsanitize-ignorelist= in addition to -fsanitize-blacklist= Use that for internal names (including the default ignorelists of the sanitizers). Differential Revision: https://reviews.llvm.org/D101832	2021-05-04 10:24:00 -04:00
serge-sans-paille	b83b23275b	Introduce -Wreserved-identifier Warn when a declaration uses an identifier that doesn't obey the reserved identifier rule from C and/or C++. Differential Revision: https://reviews.llvm.org/D93095	2021-05-04 11:19:01 +02:00
Alex Lorenz	2669abaecf	[clang][CodeGen] Use llvm::stable_sort for multi version resolver options The use of llvm::sort causes periodic failures on the bot with EXPENSIVE_CHECKS enabled, as the regular sort pre-shuffles the array in the expensive checks mode, leading to a non-deterministic test result which causes the CodeGenCXX/attr-cpuspecific-outoflinedefs.cpp testcase to fail on the bot (http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-expensive/).	2021-05-03 20:07:00 -07:00
Valentin Clement	63f8226f25	[OpenMPIRBuilder] Add createOffloadMaptypes and createOffloadMapnames functions Add function to create the offload_maptypes and the offload_mapnames globals. These two functions are used in clang. They will be used in the Flang/MLIR lowering as well. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D101503	2021-05-03 15:42:32 -04:00
Giorgis Georgakoudis	a27ca15dd0	[OpenMP] Fix non-determinism in clang task codegen Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D101739	2021-05-03 10:34:38 -07:00
Saurabh Jha	696becbd13	[Matrix] Remove bitcast when casting between matrices of the same size In matrix type casts, we were doing bitcast when the matrices had the same size. This was incorrect and this patch fixes that. Also added some new CodeGen tests for signed <-> usigned conversions Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D101754	2021-05-03 15:31:43 +01:00
Yaxun (Sam) Liu	c58a6a6fb4	[HIP] Fix device lib selection Choose optimized device lib bitcode by fp options for performance. Reviewed by: Artem Belevich, Fangrui Song Differential Revision: https://reviews.llvm.org/D101654	2021-05-01 20:31:11 -04:00
Yaxun (Sam) Liu	0175999805	[AMDGPU] Add options -mamdgpu-ieee -mno-amdgpu-ieee AMDGPU backend need to know whether floating point opcodes that support exception flag gathering quiet and propagate signaling NaN inputs per IEEE754-2008, which is conveyed by a function attribute "amdgpu-ieee". "amdgpu-ieee"="false" turns this off. Without this function attribute backend assumes it is on for compute functions. -mamdgpu-ieee and -mno-amdgpu-ieee are added to Clang to control this function attribute. By default it is on. -mno-amdgpu-ieee requires -fno-honor-nans or equivalent. Reviewed by: Matt Arsenault Differential Revision: https://reviews.llvm.org/D77013	2021-05-01 09:02:55 -04:00
Nemanja Ivanovic	c3da07d216	[PowerPC] Provide fastmath sqrt and div functions in altivec.h This adds the long overdue implementations of these functions that have been part of the ABI document and are now part of the "Power Vector Intrinsic Programming Reference" (PVIPR). The approach is to add new builtins and to emit code with the fast flag regardless of whether fastmath was specified on the command line. Differential revision: https://reviews.llvm.org/D101209	2021-04-30 19:17:48 -05:00
Joel E. Denny	82e99f5035	[OpenMP] Fix second debug name from map clause This patch fixes a bug from D89802. For example, without it, Clang generates x as the debug map name for both x and y in the following example: ``` #pragma omp target map(to: x, y) x = y = 1; ``` Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D101564	2021-04-30 16:26:59 -04:00
Florian Hahn	6c31295493	[clang] Refactor mustprogress handling, add it to all loops in c++11+. Currently Clang does not add mustprogress to inifinite loops with a known constant condition, matching C11 behavior. The forward progress guarantee in C++11 and later should allow us to add mustprogress to any loop (http://eel.is/c++draft/intro.progress#1). This allows us to simplify the code dealing with adding mustprogress a bit. Reviewed By: aaron.ballman, lebedev.ri Differential Revision: https://reviews.llvm.org/D96418	2021-04-30 14:13:47 +01:00
Akira Hatanaka	2e1d9ebd46	[ObjC][ARC] Don't enter the cleanup scope if the initializer expression isn't an ExprWithCleanups This patch fixes a bug where a temporary ObjC pointer is released before the end of the full expression. This fixes PR50043. rdar://77030453 Differential Revision: https://reviews.llvm.org/D101502	2021-04-29 16:04:30 -07:00
Chirag Khandelwal	c204106188	[Clang][OpenMP] Frontend work for sections - D89671 This patch is child of D89671, contains the clang implementation to use the OpenMP IRBuilder's section construct. Co-author: @anchu-rajendran Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D91054	2021-04-29 19:52:27 +05:30
Evgeny Leviant	6a0283d0d2	[NewPM] Add an option to dump pass structure Patch adds -debug-pass-structure option to dump pass structure when new pass manager is used. Differential revision: https://reviews.llvm.org/D99599	2021-04-29 10:29:42 +03:00
Dan Liew	1bbbcff99d	[NFC] Rename SanitizeAddressDtorKind codegen opt to not have `Kind` suffix. This is post commit follow up based on discussions in https://reviews.llvm.org/D101122. Differential Revision: https://reviews.llvm.org/D101490 (cherry picked from commit f4c7e82d1b21e637c4e0c53125b126c407d8bdbf)	2021-04-28 18:37:16 -07:00
Ryan Santhirarajan	0395f9e70b	[ARM] Neon Polynomial vadd Intrinsic fix The Neon vadd intrinsics were added to the ARMSIMD intrinsic map, however due to being defined under an AArch64 guard in arm_neon.td, were not previously useable on ARM. This change rectifies that. It is important to note that poly128 is not valid on ARM, thus it was extracted out of the original arm_neon.td definition and separated for the sake of AArch64. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D100772	2021-04-28 11:59:40 -07:00
Nico Weber	0f1137ba79	[clang/Basic] Make TargetInfo.h not use DataLayout again Reverts parts of https://reviews.llvm.org/D17183, but keeps the resetDataLayout() API and adds an assert that checks that datalayout string and user label prefix are in sync. Approach 1 in https://reviews.llvm.org/D17183#2653279 Reduces number of TUs build for 'clang-format' from 689 to 575. I also implemented approach 2 in D100764. If someone feels motivated to make us use DataLayout more, it's easy to revert this change here and go with D100764 instead. I don't plan on doing more work in this area though, so I prefer going with the smaller, more self-consistent change. Differential Revision: https://reviews.llvm.org/D100776	2021-04-27 22:26:10 -04:00
Yonghong Song	a2a3ca8d97	BPF: emit debuginfo for Function of DeclRefExpr if requested Commit `e3d8ee35e4` ("reland "[DebugInfo] Support to emit debugInfo for extern variables"") added support to emit debugInfo for extern variables if requested by the target. Currently, only BPF target enables this feature by default. As BPF ecosystem grows, callback function started to get support, e.g., recently bpf_for_each_map_elem() is introduced (https://lwn.net/Articles/846504/) with a callback function as an argument. In the future we may have something like below as a demonstration of use case : extern int do_work(int); long bpf_helper(void callback_fn, void callback_ctx, ...); long prog_main() { struct { ... } ctx = { ... }; return bpf_helper(&do_work, &ctx, ...); } Basically bpf helper may have a callback function and the callback function is defined in another file or in the kernel. In this case, we would like to know the debuginfo types for do_work(), so the verifier can proper verify the safety of bpf_helper() call. For the following example, extern int do_work(int); long bpf_helper(void callback_fn); long prog() { return bpf_helper(&do_work); } Currently, there is no debuginfo generated for extern function do_work(). In the IR, we have, ... define dso_local i64 @prog() local_unnamed_addr #0 !dbg !7 { entry: %call = tail call i64 @bpf_helper(i8 bitcast (i32 (i32)* @do_work to i8*)) #2, !dbg !11 ret i64 %call, !dbg !12 } ... declare dso_local i32 @do_work(i32) #1 ... This patch added support for the above callback function use case, and the generated IR looks like below: ... declare !dbg !17 dso_local i32 @do_work(i32) #1 ... !17 = !DISubprogram(name: "do_work", scope: !1, file: !1, line: 1, type: !18, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2) !18 = !DISubroutineType(types: !19) !19 = !{!20, !20} !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) The TargetInfo.allowDebugInfoForExternalVar is renamed to TargetInfo.allowDebugInfoForExternalRef as now it guards both extern variable and extern function debuginfo generation. Differential Revision: https://reviews.llvm.org/D100567	2021-04-26 16:53:25 -07:00
Alexey Bader	7818906ca1	[SYCL] Implement SYCL address space attributes handling Default address space (applies when no explicit address space was specified) maps to generic (4) address space. Added SYCL named address spaces `sycl_global`, `sycl_local` and `sycl_private` defined as sub-sets of the default address space. Static variables without address space now reside in global address space when compile for SPIR target, unless they have an explicit address space qualifier in source code. Differential Revision: https://reviews.llvm.org/D89909	2021-04-26 13:44:10 +03:00
Levy Hsu	8cf54c7ff5	[RISCV] [1/2] Add IR intrinsic for Zbe extension RV32/64: bcompress bdecompress RV64 ONLY: bcompressw bdecompressw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101143	2021-04-25 19:14:34 -07:00
Thomas Lively	502f54049d	[WebAssembly] Finalize wasm_simd128.h intrinsics Adds new intrinsics for instructions that are in the final SIMD spec but did not previously have intrinsics. Also updates the names of existing intrinsics to reflect the final names of the underlying instructions in the spec. Keeps the old names as deprecated functions to ease the transition to the new names. Differential Revision: https://reviews.llvm.org/D101112	2021-04-23 13:37:27 -07:00
Fangrui Song	a92dbadffe	[OpenMP] Fix -Wdeprecated-copy	2021-04-23 10:49:19 -07:00
Dávid Bolvanský	2cae7025c1	Reland "[Clang] Propagate guaranteed alignment for malloc and others" This relands commit `6914a0ed2b`. Crash in InstCombine was fixed.	2021-04-23 14:05:57 +02:00
Dávid Bolvanský	6914a0ed2b	Revert "[Clang] Propagate guaranteed alignment for malloc and others" This reverts commit `c2297544c0`. Some buildbots are broken.	2021-04-23 11:33:33 +02:00
Dávid Bolvanský	c2297544c0	[Clang] Propagate guaranteed alignment for malloc and others LLVM should be smarter about known malloc's alignment and this knowledge may enable other optimizations. Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D100879	2021-04-23 11:07:14 +02:00
Fangrui Song	2786e673c7	[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all} The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it. Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan) Also document the function attribute "frame-pointer" which is long overdue. Differential Revision: https://reviews.llvm.org/D101016	2021-04-22 18:07:30 -07:00
Levy Hsu	b49337bbb9	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Fangrui Song	ef5e7f90ea	Temporarily revert the code part of D100981 "Delete le32/le64 targets" This partially reverts commit `77ac823fd2`. Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934). Temporarily brings back the code part to give them some time for migration.	2021-04-22 10:18:44 -07:00
Hamza Mahfooz	be2277fbf2	[Matrix] Support #pragma clang fp From https://bugs.llvm.org/show_bug.cgi?id=49739: Currently, `#pragma clang fp` are ignored for matrix types. For the code below, the `contract` fast-math flag should be added to the generated call to `llvm.matrix.multiply` and `fadd` ``` typedef float fx2x2_t __attribute__((matrix_type(2, 2))); void foo(fx2x2_t &A, fx2x2_t &C, fx2x2_t &B) { #pragma clang fp contract(fast) C = A*B + C; } ``` Reviewed By: fhahn, mibintc Differential Revision: https://reviews.llvm.org/D100834	2021-04-22 11:45:34 +01:00
Giorgis Georgakoudis	a2dbfb6b72	[OpenMP] Simplify offloading parallel call codegen This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D95976	2021-04-21 18:46:07 -07:00
Fangrui Song	77ac823fd2	Delete le32/le64 targets They are unused now. Note: NaCl is still used and is currently expected to be needed until 2022-06 (https://blog.chromium.org/2020/08/changes-to-chrome-app-support-timeline.html). Differential Revision: https://reviews.llvm.org/D100981	2021-04-21 18:44:12 -07:00
Fangrui Song	775a9483e5	[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module. Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified. (i.e. If -g[123], every function gets `.eh_frame`. This behavior is strange but that is the status quo on GCC and Clang.) Let's take asan as an example. Other sanitizers are similar. `asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true, so every function gets `.eh_frame` if `-g[123]` is specified. This is the root cause that `-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame while `-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame. This patch * sets the nounwind attribute on sanitizer module ctor/dtor. * let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute. The "uwtable" mechanism is generic: synthesized functions not cloned/specialized from existing ones should consider `Function::createWithDefaultAttr` instead of `Function::create` if they want to get some default attributes which have more of module semantics. Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955 https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc. Differential Revision: https://reviews.llvm.org/D100251	2021-04-21 15:58:20 -07:00
Alexey Bataev	079884225a	[OPENMP]Fix PR49698: OpenMP declare mapper causes segmentation fault. The implicitly generated mappings for allocation/deallocation in mappers runtime should be mapped as implicit, also no need to clear member_of flag to avoid ref counter increment. Also, the ref counter should not be incremented for the very first element that comes from the mapper function. Differential Revision: https://reviews.llvm.org/D100673	2021-04-21 10:38:31 -07:00
Erich Keane	0ed613612c	Ensure target-multiversioning emits deferred declarations As reported in PR50025, sometimes we would end up not emitting functions needed by inline multiversioned variants. This is because we typically use the 'deferred decl' mechanism to emit these. However, the variants are emitted after that typically happens. This fixes that by ensuring we re-run deferred decls after this happens. Also, the multiversion emission is done recursively to ensure that MV functions that require other MV functions to be emitted get emitted.	2021-04-20 08:10:26 -07:00
Jan Svoboda	6a72ed239c	[clang] NFC: Fix range-based for loop warnings related to decl lookup	2021-04-19 18:31:31 +02:00
Xun Li	5faba87938	Revert "[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass" This reverts commit `fa6b54c44a`. The commited patch broke mlir tests. It seems that mlir tests depend on coroutine function properties set in CoroEarly pass.	2021-04-18 17:22:28 -07:00
Xun Li	fa6b54c44a	[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining. The presplit coroutine attributes are set in CoroEarly pass. However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine. This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920 To fix this, we set the attributes in the Clang front-end instead of in CoroEarly pass. Reviewed By: rjmccall, ChuanqiXu Differential Revision: https://reviews.llvm.org/D100282	2021-04-18 15:41:09 -07:00
Xun Li	c0211e8d7d	Revert "[Coroutines] Move CoroEarly pass to before AlwaysInliner" This reverts commit `2b50f5a434`. Forgot to update the description of the commit to sync with phabricator. Going to redo the commit.	2021-04-18 15:38:19 -07:00
Xun Li	2b50f5a434	[Coroutines] Move CoroEarly pass to before AlwaysInliner Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining. The presplit coroutine attributes are set in CoroEarly pass. However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine. This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920 Differential Revision: https://reviews.llvm.org/D100282	2021-04-18 14:54:04 -07:00
Yaxun (Sam) Liu	d5c0f00e21	[CUDA][HIP] Mark device var used by host only Add device variables to llvm.compiler.used if they are ODR-used by either host or device functions. This is necessary to prevent them from being eliminated by whole-program optimization where the compiler has no way to know a device variable is used by some host code. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D98814	2021-04-17 11:25:25 -04:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalize the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
Thomas Lively	5c729750a6	[WebAssembly] Remove saturating fp-to-int target intrinsics Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead. This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u}, which are now represented in IR as a saturating truncation to a v2i32 followed by a concatenation with a zero vector. Differential Revision: https://reviews.llvm.org/D100596	2021-04-16 12:11:20 -07:00
Alexey Bataev	10c7b9f64f	[OPENMP]Fix PR49115: Incorrect results for scan directive. For combined worksharing directives need to emit the temp arrays outside of the parallel region and update them in the master thread only. Differential Revision: https://reviews.llvm.org/D100187	2021-04-16 06:25:35 -07:00
Joshua Haberman	8344675908	Implemented [[clang::musttail]] attribute for guaranteed tail calls. This is a Clang-only change and depends on the existing "musttail" support already implemented in LLVM. The [[clang::musttail]] attribute goes on a return statement, not a function definition. There are several constraints that the user must follow when using [[clang::musttail]], and these constraints are verified by Sema. Tail calls are supported on regular function calls, calls through a function pointer, member function calls, and even pointer to member. Future work would be to throw a warning if a users tries to pass a pointer or reference to a local variable through a musttail call. Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D99517	2021-04-15 17:12:21 -07:00
Momchil Velikov	f9d932e673	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e..g byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role, previously perfomed by align, falling back to align if stackalign` is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
Martin Storsjö	8e0f2e89ff	[clang] [AArch64] Fix handling of HFAs passed to Windows variadic functions The documentation says that for variadic functions, all composites are treated similarly, no special handling of HFAs/HVAs, not even for the fixed arguments of a variadic function. Differential Revision: https://reviews.llvm.org/D100467	2021-04-15 22:21:27 +03:00
cchen	e0c2125d1d	[OpenMP] Added codegen for masked directive Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100514	2021-04-15 12:55:07 -05:00
Thomas Lively	6a18cc23ef	[WebAssembly] Codegen for i64x2.extend_{low,high}_i32x4_{s,u} Removes the builtins and intrinsics used to opt in to using these instructions and replaces them with normal ISel patterns now that they are no longer prototypes. Differential Revision: https://reviews.llvm.org/D100402	2021-04-14 13:43:09 -07:00
Thomas Lively	af7925b4dd	[WebAssembly] Codegen for f64x2.convert_low_i32x4_{s,u} Add a custom DAG combine and ISD opcode for detecting patterns like (uint_to_fp (extract_subvector ...)) before the extract_subvector is expanded to ensure that they will ultimately lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions are no longer prototypes and can now be produced via standard IR, this commit also removes the target intrinsics and builtins that had been used to prototype the instructions. Differential Revision: https://reviews.llvm.org/D100425	2021-04-14 10:42:45 -07:00
Thomas Lively	af7ab81ce3	[WebAssembly] Use standard intrinsics for f32x4 and f64x2 ops Now that these instructions are no longer prototypes, we do not need to be careful about keeping them opt-in and can use the standard LLVM infrastructure for them. This commit removes the bespoke intrinsics we were using to represent these operations in favor of the corresponding target-independent intrinsics. The clang builtins are preserved because there is no standard way to easily represent these operations in C/C++. For consistency with the scalar codegen in the Wasm backend, the intrinsic used to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though @llvm.roundeven better captures the semantics of the underlying Wasm instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is left to a potential future patch. Differential Revision: https://reviews.llvm.org/D100411	2021-04-14 09:19:27 -07:00
Martin Storsjö	3637c5c8ec	[clang] [AArch64] Fix Windows va_arg handling for larger structs Aggregate types over 16 bytes are passed by reference. Contrary to the x86_64 ABI, smaller structs with an odd (non power of two) are padded and passed in registers. Differential Revision: https://reviews.llvm.org/D100374	2021-04-14 14:51:53 +03:00
Liu, Chen3	1c4108ab66	[i386] Modify the alignment of __m128/__m256/__m512 vector type according i386 abi. According to i386 System V ABI: 1. when __m256 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 32 byte boundary at the time of the call. 2. when __m512 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 64 byte boundary at the time of the call. The current method of clang passing __m512 parameter are as follow: 1. when target supports avx512, passing it with 64 byte alignment; 2. when target supports avx, passing it with 32 byte alignment; 3. Otherwise, passing it with 16 byte alignment. Passing __m256 parameter are as follow: 1. when target supports avx or avx512, passing it with 32 byte alignment; 2. Otherwise, passing it with 16 byte alignment. This pach will passing __m128/__m256/__m512 following i386 System V ABI and apply it to Linux only since other System V OS (e.g Darwin, PS4 and FreeBSD) don't want to spend any effort dealing with the ramifications of ABI breaks at present. Differential Revision: https://reviews.llvm.org/D78564	2021-04-14 16:44:54 +08:00
Ben Dunbobbin	eae2d4b852	[Windows Itanium][PS4] handle dllimport/export w.r.t vtables/rtti The existing Windows Itanium patches for dllimport/export behaviour w.r.t vtables/rtti can't be adopted for PS4 due to backwards compatibility reasons (see comments on https://reviews.llvm.org/D90299). This commit adds our PS4 scheme for this to Clang. Differential Revision: https://reviews.llvm.org/D93203	2021-04-13 11:41:10 +01:00
Esme-Yi	dff922f39b	Reland [DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions."" This reverts commit `c965e14a12`.	2021-04-12 11:05:55 +00:00
Esme-Yi	c965e14a12	Revert "[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions." This reverts commit `62fa9b9388`.	2021-04-12 10:36:46 +00:00
Esme-Yi	62fa9b9388	[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions. Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in Dwarf-5, are unexpected in previous versions. Fixing the mismathing doesn't have any drawbacks for any other debuggers, but helps dbx. Reviewed By: aprantl, shchenz Differential Revision: https://reviews.llvm.org/D99250	2021-04-12 07:42:54 +00:00
yifeng.dongyifeng	3a6a80b641	[Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info variables for parameters and move-parameters. The first one is the real parameters of the coroutine function, the other one just for copying parameters to the coroutine frame. Considering the following c++ code: ``` struct coro { ... }; coro foo(struct test & t) { ... co_await suspend_always(); ... co_await suspend_always(); ... co_await suspend_always(); } int main(int argc, char *argv[]) { auto c = foo(...); c.handle.resume(); ... } ``` Function foo is the standard coroutine function, and it has only one parameter named t (ignoring this at first), when we use the llvm code to compile this function, we can get the following ir: ``` !2921 = distinct !DISubprogram(name: "foo", linkageName: "_ZN6Object3fooE4test", scope: !2211, file: !45, li\ ne: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped \| DIFlagAllCallsDescribed, spFlags: DISPFlagDefi\ nition \| DISPFlagOptimized, unit: !44, declaration: !2328, retainedNodes: !2922) !2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45, line: 48, type: !838) ... !2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags: DIFlagArtificial) ``` We can find there are two `the same` DIVariable named t in the same dwarf scope for foo.resume. And when we try to use llvm-dwarfdump to dump the dwarf info of this elf, we get the following output: ``` 0x00006684: DW_TAG_subprogram DW_AT_low_pc (0x00000000004013a0) DW_AT_high_pc (0x00000000004013a8) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_object_pointer (0x0000669c) DW_AT_GNU_all_call_sites (true) DW_AT_specification (0x00005b5c "_ZN6Object3fooE4test") 0x000066a5: DW_TAG_formal_parameter DW_AT_name ("t") DW_AT_decl_file ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp") DW_AT_decl_line (48) DW_AT_type (0x00004146 "test") 0x000066ba: DW_TAG_variable DW_AT_name ("t") DW_AT_type (0x00004146 "test") DW_AT_artificial (true) ``` The elf also has two 't' in the same scope. But unluckily, it might let the debugger confused. And failed to print parameters for O0 or above. This patch will make coroutine parameters and move parameters use the same DIVar and try to fix the problems that I mentioned before. Test Plan: check-clang Reviewed By: aprantl, jmorse Differential Revision: https://reviews.llvm.org/D97533	2021-04-12 11:10:47 +08:00
Saurabh Jha	71ab6c98a0	[Matrix] Implement C-style explicit type conversions for matrix types. This implements C-style type conversions for matrix types, as specified in clang/docs/MatrixTypes.rst. Fixes PR47141. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D99037	2021-04-10 11:48:41 +01:00
Roman Lebedev	6270b3a1ea	Temporairly revert "[CGCall] Annotate `this` argument with alignment" As per @jyknight, "It seems like there's a bug with vtable thunks getting the wrong information." See https://reviews.llvm.org/D99790#2680857, https://godbolt.org/z/MxhYMe1q7 This reverts commit `0aa0458f14`.	2021-04-10 10:43:16 +03:00
Ben Shi	4f173c0c42	[clang][AVR] Support variable decorator '__flash' Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D96853	2021-04-10 11:23:55 +08:00
cchen	1a43fd2769	[OpenMP51] Initial support for masked directive and filter clause Adds basic parsing/sema/serialization support for the #pragma omp masked directive. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99995	2021-04-09 14:00:36 -05:00
Soumi Manna	5d7cb79416	RISCVABIInfo::classifyArgumentType: Fix static analyzer warnings with uninitialized variables warnings - NFCI Differential Revision: https://reviews.llvm.org/D100172	2021-04-09 15:23:32 +01:00
Yaxun (Sam) Liu	25942d7c49	[AMDGPU] Allow relaxed/consume memory order for atomic inc/dec Reviewed by: Jon Chesterfield Differential Revision: https://reviews.llvm.org/D100144	2021-04-09 09:23:41 -04:00
David Blaikie	eb8a28e2cf	DebugInfo: Include inline namespaces in template specialization parameter names This ensures these types have distinct names if they are distinct types (eg: if one is an instantiation with a type in one inline namespace, and another from a type with the same simple name, but in a different inline namespace).	2021-04-08 17:37:55 -07:00
Xiangling Liao	d508561798	[AIX] Support init priority attribute Differential Revision: https://reviews.llvm.org/D99291	2021-04-08 15:40:09 -04:00
Dávid Bolvanský	2cb8c10342	Revert "Reduce the number of attributes attached to each function" This reverts commit `053dc95839`. It causes perf regressions - see discussion in D97116.	2021-04-08 17:28:57 +02:00
Saurabh Jha	7d8513b7f2	[clang] Move int <-> float scalar conversion to a separate function As prelude to this patch https://reviews.llvm.org/D99037, we want to move the int-float conversion into a separate function that can be reused by matrix cast Differential Revision: https://reviews.llvm.org/D100051	2021-04-07 12:34:16 -07:00
Roman Lebedev	0aa0458f14	[CGCall] Annotate `this` argument with alignment As it is being noted in D99249, lack of alignment information on `this` has been preventing LICM from happening. For some time now, lack of alignment attribute does not imply natural alignment, but an alignment of `1`. Also, we used to treat dereferenceable as implying alignment, but we no longer do, so it's a bugfix. Differential Revision: https://reviews.llvm.org/D99790	2021-04-07 11:02:01 +03:00
Yaxun (Sam) Liu	61d065e21f	Let clang atomic builtins fetch add/sub support floating point types Recently atomicrmw started to support fadd/fsub: https://reviews.llvm.org/D53965 However clang atomic builtins fetch add/sub still does not support emitting atomicrmw fadd/fsub. This patch adds that. Reviewed by: John McCall, Artem Belevich, Matt Arsenault, JF Bastien, James Y Knight, Louis Dionne, Olivier Giroux Differential Revision: https://reviews.llvm.org/D71726	2021-04-06 15:44:00 -04:00
Simon Pilgrim	2901dc7575	Don't directly dereference getAs<> casts to avoid potential null dereferences. NFCI. Replace with castAs<> which asserts the cast is valid. Fixes a number of static analyzer warnings.	2021-04-06 12:24:19 +01:00
Yevgeny Rouban	39e3e3aa51	[NewPM] Redesign of PreserveCFG Checker The reason for the NewPM redesign is described in the commit cba3e783389a: [NewPM] Disable PreservedCFGChecker ... The checker introduces an internal custom CFG analysis that tracks current up-to date CFG snapshot. The analysis is invalidated along any other CFG related analysis (the key is CFGAnalyses). If the CFG analysis is not invalidated at a functional pass exit then the checker asserts that the CFG snapshot taken from this analysis is equals to a snapshot of the current CFG. Along the way: - the function CFG::printDiff() is simplified by removing function name calculation. The name is printed by the caller; - fixed CFG invalidated condition (see CFG::invalidate()); - StandardInstrumentations::registerCallbacks() gets additional optional parameter of type FunctionAnalysisManager*, which is needed by the checker to get the custom CFG analysis; - several PM related tests updated to explicitly set -verify-cfg-preserved=1 as they need. This patch is safe to land as the CFGChecker is left switched off (the options -verify-cfg-preserved is false by default). It will be switched on by a separate patch to minimize possible reverts. Reviewed By: skatkov, kuhar Differential Revision: https://reviews.llvm.org/D91327	2021-04-06 12:35:49 +07:00
Jennifer Yu	7078ef4722	[OPENMP51]Initial support for nocontext clause. Added basic parsing/sema/serialization support for the 'nocontext' clause. Differential Revision: https://reviews.llvm.org/D99848	2021-04-05 11:45:49 -07:00
Craig Topper	b4f2e80600	[RISCV] Refactor conversion of B extensions to IR intrinsics a little to reduce clang binary size. These all pass 1 type to getIntrinsic. So rather than assigning IntrinsicTypes for each builtin which invokes the SmallVector constructor, just select the intrinsic ID with a switch and share a single assignment of IntrinsicTypes.	2021-04-02 23:49:44 -07:00
Jennifer Yu	0fe8af9468	Fix build bot problem with missing OMPC_novariants in switch.	2021-04-02 13:58:39 -07:00
Levy Hsu	f78d932cf2	[RISCV] Add IR intrinsics for Zbc extension Head files are included in a separate patch in case the name needs to be changed. RV32 / 64: clmul clmulh clmulr Differential Revision: https://reviews.llvm.org/D99711	2021-04-02 12:09:13 -07:00
Levy Hsu	944adbf285	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	1f0b309f24	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit `1808194590`. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Craig Topper	1808194590	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Levy Hsu	b001d574d7	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Joseph Huber	69ca50bd7d	[OpenMP] Pass mapping names to add components in a user defined mapper Summary: Currently the mapping names are not passed to the mapper components that set up the array region. This means array mappings will not have their names availible in the runtime. This patch fixes this by passing the argument name to the region correctly. This means that the mapped variable's name will be the declared mapper that placed it on the device. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D99681	2021-04-01 15:51:03 -04:00
Raphael Isemann	60854c328d	Avoid calling ParseCommandLineOptions in BackendUtil if possible Calling `ParseCommandLineOptions` should only be called from `main` as the CommandLine setup code isn't thread-safe. As BackendUtil is part of the generic Clang FrontendAction logic, a process which has several threads executing Clang FrontendActions will randomly crash in the unsafe setup code. This patch avoids calling the function unless either the debug-pass option or limit-float-precision option is set. Without these two options set the `ParseCommandLineOptions` call doesn't do anything beside parsing the command line `clang` which doesn't set any options. See also D99652 where LLDB received a workaround for this crash. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D99740	2021-04-01 19:41:16 +02:00
Alexey Bataev	a28e835e94	[OPENMP]Fix PR48885: Crash in passing firstprivate args to tasks on Apple M1. Need to bitcast the function pointer passed as a parameter to the real type to avoid possible problem with calling conventions. Differential Revision: https://reviews.llvm.org/D99521	2021-03-31 13:00:58 -07:00
Alexey Bataev	66da4f6fc9	[OPENMP]Fix PR48658: [OpenMP 5.0] Compiler crash when OpenMP atomic sync hints used. No need to consider hint clause kind as the main atomic clause kind at the codegen. Differential Revision: https://reviews.llvm.org/D99611	2021-03-31 12:58:24 -07:00
Thomas Lively	45783d0e8a	[WebAssembly] Implement i64x2 comparisons Removes the prototype builtin and intrinsic for i64x2.eq and implements that instruction as well as the other i64x2 comparison instructions in the final SIMD spec. Unsigned comparisons were not included in the final spec, so they still need to be scalarized via a custom lowering. Differential Revision: https://reviews.llvm.org/D99623	2021-03-31 10:46:17 -07:00
Wei Mi	d535a05ca1	[ThinLTO] During module importing, close one source module before open another one for distributed mode. Currently during module importing, ThinLTO opens all the source modules, collect functions to be imported and append them to the destination module, then leave all the modules open through out the lto backend pipeline. This patch refactors it in the way that one source module will be closed before another source module is opened. All the source modules will be closed after importing phase is done. It will save some amount of memory when there are many source modules to be imported. Note that this patch only changes the distributed thinlto mode. For in process thinlto mode, one source module is shared acorss different thinlto backend threads so it is not changed in this patch. Differential Revision: https://reviews.llvm.org/D99554	2021-03-30 14:37:29 -07:00
Mike Rice	b7899ba0e8	[OPENMP51]Initial support for the dispatch directive. Added basic parsing/sema/serialization support for dispatch directive. Differential Revision: https://reviews.llvm.org/D99537	2021-03-30 14:12:53 -07:00
Alexey Bataev	e2c7bf08cc	[OPENMP]Fix PR48607: Crash during clang openmp codegen for firstprivate() of `float _Complex`. Need to cast the argument for the debug wrapper function call to the corresponding parameter type to avoid crash. Differential Revision: https://reviews.llvm.org/D99617	2021-03-30 13:39:45 -07:00
Alexey Bataev	1696b8ae96	[OPENMP]Fix PR48740: OpenMP declare reduction in C does not require an initializer If no initializer-clause is specified, the private variables will be initialized following the rules for initialization of objects with static storage duration. Need to adjust the implementation to the current version of the standard. Differential Revision: https://reviews.llvm.org/D99539	2021-03-30 05:38:20 -07:00
Raphael Isemann	1cbba533ec	[ObjC][CodeGen] Fix missing debug info in situations where an instance and class property have the same identifier Since the introduction of class properties in Objective-C it is possible to declare a class and an instance property with the same identifier in an interface/protocol. Right now Clang just generates debug information for whatever property comes first in the source file. The second property is ignored as it's filtered out by the set of already emitted properties (which is just using the identifier of the property to check for equivalence). I don't think generating debug info in this case was never supported as the identifier filter is in place since `7123bca7fb` (which precedes the introduction of class properties). This patch expands the filter to take in account identifier + whether the property is class/instance. This ensures that both properties are emitted in this special situation. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99512	2021-03-30 11:07:16 +02:00
Johannes Doerfert	03cc8a1ba0	[OpenMP][NFC] Move the `noinline` to the parallel entry point The `noinline` for non-SPMD parallel functions is probably not necessary but as long as we use it we should put it on the outermost parallel function, which is the wrapper, not the actual outlined function. Resolves PR49752 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D99506	2021-03-30 01:12:45 -05:00
Alexey Bataev	0411b23319	[OPENMP]Map data field with l-value reference types. Added initial support dfor the mapping of the data members with l-value reference types. Differential Revision: https://reviews.llvm.org/D98812	2021-03-29 07:07:09 -07:00
Alexey Bataev	f6f21dcd6c	[OPENMP]Fix PR49636: Assertion `(!Entry.getAddress() \|\| Entry.getAddress() == Addr) && "Resetting with the new address."' failed. The original issue is caused by the fact that the variable is allocated with incorrect type i1 instead of i8. This causes the bitcasting of the declaration to i8 type and the bitcast expression does not match the original variable. To fix the problem, the UndefValue initializer and the original variable should be emitted with type i8, not i1. Differential Revision: https://reviews.llvm.org/D99297	2021-03-29 06:55:57 -07:00
Alexey Bataev	dcf96178cb	[OPENMP]Fix PR49052: Clang crashed when compiling target code with assert(0). Need to insert a basic block during generation of the target region to avoid crash for the GPU to be able always calling a cleanup action. This cleanup action is required for the correct emission of the target region for the GPU. Differential Revision: https://reviews.llvm.org/D99445	2021-03-29 06:36:06 -07:00
Matt Arsenault	9a0c9402fa	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `07e46367ba`.	2021-03-29 08:55:30 -04:00
Oliver Stannard	07e46367ba	Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute"" Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit `fc9df30991`.	2021-03-29 11:32:22 +01:00
Matt Arsenault	fc9df30991	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `20d5c42e0e`.	2021-03-28 13:35:21 -04:00
Nico Weber	20d5c42e0e	Revert "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `4fefed6563`. Broke check-clang everywhere.	2021-03-28 13:02:52 -04:00
Matt Arsenault	4fefed6563	OpaquePtr: Turn inalloca into a type attribute I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.	2021-03-28 11:12:23 -04:00
Xun Li	f490a5969b	[OpenMP][InstrProfiling] Fix a missing instr profiling counter When emitting a function body there needs to be a instr profiling counter emitted. Otherwise instr profiling won't work for this function. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98135	2021-03-25 13:52:36 -07:00
Xun Li	c7a39c833a	[Coroutine][Clang] Force emit lifetime intrinsics for Coroutines tl;dr Correct implementation of Corouintes requires having lifetime intrinsics available. Coroutine functions are functions that can be suspended and resumed latter. To do so, data that need to stay alive after suspension must be put on the heap (i.e. the coroutine frame). The optimizer is responsible for analyzing each AllocaInst and figure out whether it should be put on the stack or the frame. In most cases, for data that we are unable to accurately analyze lifetime, we can just conservatively put them on the heap. Unfortunately, there exists a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze those data's lifetime. To dig into more details, there exists cases where at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access would be allowed beyond that point. The following is a common code pattern called "Symmetric Transfer" in coroutine: ``` auto tmp = await_suspend(); __builtin_coro_resume(tmp.address()); return; ``` In the above code example, `await_suspend()` returns a new coroutine handle, which we will obtain the address and then resume that coroutine. This essentially "transfered" from the current coroutine to a different coroutine. During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards. However when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on tmp. `address` function is a member function of coroutine, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler to think that `tmp` might escape. To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived. I made some attempt in D98638, but it appears to be way too complex and is basically doing the same thing as inserting lifetime intrinsics in coroutines. Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893 Differential Revision: https://reviews.llvm.org/D99227	2021-03-25 13:46:20 -07:00
Yaxun (Sam) Liu	cc9477166a	[CUDA][HIP] add __builtin_get_device_side_mangled_name Add builtin function __builtin_get_device_side_mangled_name to get device side manged name for functions and global variables, which can be used to get symbol address of kernels or variables by mangled name in dynamically loaded bundled code objects at run time. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D99301	2021-03-25 15:25:29 -04:00
Djordje Todorovic	8420a53324	[Debugify] Expose original debug info preservation check as CC1 option In order to test the preservation of the original Debug Info metadata in your projects, a front end option could be very useful, since users usually report that a concrete entity (e.g. variable x, or function fn2()) is missing debug info. The [0] is an example of running the utility on GDB Project. This depends on: D82546 and D82545. Differential Revision: https://reviews.llvm.org/D82547	2021-03-25 05:29:42 -07:00
Alexey Bataev	7654bb6303	[OPENMP]Fix PR48571: critical/master in outlined contexts cause crash. If emit inlined region for master/critical directives, no need to clear lambda/block context data, otherwise the variables cannot be found and it causes a crash at compile time. Differential Revision: https://reviews.llvm.org/D99280	2021-03-24 10:15:24 -07:00
Nemanja Ivanovic	4020932706	[PowerPC] Make altivec.h work with AIX which has no __int128 There are a number of functions in altivec.h that use vector __int128 which isn't supported on AIX. Those functions need to be guarded for targets that don't support the type. Furthermore, the functions that produce quadword instructions without using the type need a builtin. This patch adds the macro guards to altivec.h using the __SIZEOF_INT128__ which is only defined on targets that support the __int128 type.	2021-03-24 00:35:51 -05:00
Bruno Cardoso Lopes	431e3138a1	[CGAtomic] Lift stronger requirements on cmpxch and support acquire failure mode - Fix `emitAtomicCmpXchgFailureSet` to support release/acquire (succ/fail) memory order. - Remove stronger checks for cmpxch. Effectively, this addresses http://wg21.link/p0418 Differential Revision: https://reviews.llvm.org/D98995	2021-03-23 16:45:37 -07:00
Bradley Smith	48f5a392cb	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Roman Lebedev	be87321280	[clang][Codegen] EmitBranchOnBoolExpr(): emit prof branch counts even at -O0 This restores the original behaviour before i unadvertedly broke it in `e3a4701627` and clang/test/Profile/ caught it.	2021-03-21 23:24:27 +03:00
Roman Lebedev	e3a4701627	[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights `08196e0b2e` exposed LowerExpectIntrinsic's internal implementation detail in the form of LikelyBranchWeight/UnlikelyBranchWeight options to the outside. While this isn't incorrect from the results viewpoint, this is suboptimal from the layering viewpoint, and causes confusion - should transforms also use those weights, or should they use something else, D98898? So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight internal again, and fixing all the code that used it directly, which currently is only clang codegen, thankfully, to emit proper @llvm.expect intrinsics instead.	2021-03-21 22:50:21 +03:00
Roman Lebedev	37d6be9052	Revert "[BranchProbability] move options for 'likely' and 'unlikely'" Upon reviewing D98898 i've come to realization that these are implementation detail of LowerExpectIntrinsicPass, and they should not be exposed to outside of it. This reverts commit `ee8b53815d`.	2021-03-21 22:50:21 +03:00
Sanjay Patel	ee8b53815d	[BranchProbability] move options for 'likely' and 'unlikely' This makes the settings available for use in other passes by housing them within the Support lib, but NFC otherwise. See D98898 for the proposed usage in SimplifyCFG (where this change was originally included). Differential Revision: https://reviews.llvm.org/D98945	2021-03-20 14:46:46 -04:00
Benjamin Kramer	19d2c65ddd	[CodeGen] Don't crash on for loops with cond variables and no increment This looks like an oversight from `a875721d8a`, creating IR that refers to `for.inc` even if it doesn't exist. Differential Revision: https://reviews.llvm.org/D98980	2021-03-19 20:43:52 +01:00
Hongtao Yu	fc1812a0ad	[UniqueLinkageName] Use consistent checks when mangling symbo linkage name and debug linkage name. C functions may be declared and defined in different prototypes like below. This patch unifies the checks for mangling names in symbol linkage name emission and debug linkage name emission so that the two names are consistent. static int go(int); static int go(a) int a; { return a; } Test Plan: Differential Revision: https://reviews.llvm.org/D98799	2021-03-18 22:11:16 -07:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Thomas Lively	2f2ae08da9	[WebAssembly] Remove experimental SIMD instructions Removes the instruction definitions, intrinsics, and builtins for qfma/qfms, signselect, and prefetch instructions, which were not included in the final WebAssembly SIMD spec. Depends on D98457. Differential Revision: https://reviews.llvm.org/D98466	2021-03-18 11:21:24 -07:00
Alex Lorenz	d672d5219a	Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue" This reverts commit `809a1e0ffd`. Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local. Differential Revision: https://reviews.llvm.org/D98458	2021-03-17 17:27:41 -07:00
Richard Smith	a875721d8a	PR49585: Emit the jump destination for a for loop 'continue' from within the scope of the condition variable. The condition variable is in scope in the loop increment, so we need to emit the jump destination from wthin the scope of the condition variable. For GCC compatibility (and compatibility with real-world 'FOR_EACH' macros), 'continue' is permitted in a statement expression within the condition of a for loop, though, so there are two cases here: * If the for loop has no condition variable, we can emit the jump destination before emitting the condition. * If the for loop has a condition variable, we must defer emitting the jump destination until after emitting the variable. We diagnose a 'continue' appearing in the initializer of the condition variable, because it would jump past the initializer into the scope of that variable. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D98816	2021-03-17 16:24:04 -07:00
Mike Rice	410f09af09	[OPENMP51]Initial support for the interop directive. Added basic parsing/sema/serialization support for interop directive. Support for the 'init' clause. Differential Revision: https://reviews.llvm.org/D98558	2021-03-17 09:42:07 -07:00
Bradley Smith	cf0da91ba5	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Vassil Vassilev	0cb7e7ca0c	Make iteration over the DeclContext::lookup_result safe. The idiom: ``` DeclContext::lookup_result R = DeclContext::lookup(Name); for (auto D : R) {...} ``` is not safe when in the loop body we trigger deserialization from an AST file. The deserialization can insert new declarations in the StoredDeclsList whose underlying type is a vector. When the vector decides to reallocate its storage the pointer we hold becomes invalid. This patch replaces a SmallVector with an singly-linked list. The current approach stores a SmallVector<NamedDecl, 4> which is around 8 pointers. The linked list is 3, 5, or 7. We do better in terms of memory usage for small cases (and worse in terms of locality -- the linked list entries won't be near each other, but will be near their corresponding declarations, and we were going to fetch those memory pages anyway). For larger cases: the vector uses a doubling strategy for reallocation, so will generally be between half-full and full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth of space allocated currently and will be 2N pointers with the linked list. So we break even when there are N=6 entries and slightly lose in terms of memory usage after that. We suspect that's still a win on average. Thanks to @rsmith! Differential revision: https://reviews.llvm.org/D91524	2021-03-17 08:59:04 +00:00
Luke Drummond	440f6bdf34	[OpenCL][NFCI] Prefer CodeGenFunction::EmitRuntimeCall `CodeGenFunction::EmitRuntimeCall` automatically sets the right calling convention for the callee so we can avoid setting it ourselves. As requested in https://reviews.llvm.org/D98411 Reviewed by: anastasia Differential Revision: https://reviews.llvm.org/D98705	2021-03-16 16:22:19 +00:00
Amy Huang	f5352dd9da	Emit inline implementation of __builtin__wmemchr on MSVCRT platforms. The MSVC runtime library doesn't have a definition for wmemchr, so provide an inline implementation. Differential Revision: https://reviews.llvm.org/D98472	2021-03-15 15:30:55 -07:00
Jonas Paulsson	9cfd301ec8	[SystemZ] Test for isinf and isfinite in testFPKind(). Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes for finite) in testFPKind() and handle with TDC. Review: Ulrich Weigand. Differential Revision: https://reviews.llvm.org/D97901	2021-03-15 15:02:39 -06:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Luke Drummond	fcfd3fda71	[OpenCL] Respect calling convention for builtin `__translate_sampler_initializer` has a calling convention of `spir_func`, but clang generated calls to it using the default CC. Instruction Combining was lowering these mismatching calling conventions to `store i1* undef` which itself was subsequently lowered to a trap instruction by simplifyCFG resulting in runtime `SIGILL` There are arguably two bugs here: but whether there's any wisdom in converting an obviously invalid call into a runtime crash over aborting with a sensible error message will require further discussion. So for now it's enough to set the right calling convention on the runtime helper. Reviewed By: svenh, bader Differential Revision: https://reviews.llvm.org/D98411	2021-03-15 17:26:51 +00:00
Thomas Preud'homme	f60b35340f	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-15 15:38:08 +00:00
xling-Liao	0bf2da53c1	[NFC] Adjust SmallVector.h header to workaround XL build compiler issue In order to prevent further building issues related to the usage of SmallVector in other compilation unit, this patch adjusts the llvm.h header as a workaround instead. Besides, this patch reverts previous workarounds: 1. Revert "[NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX" This reverts commit `561fb7f60a`. 2.Revert "[clang][cli] Fix build failure in CompilerInvocation" This reverts commit `8dc70bdcd0`. Differential Revision: https://reviews.llvm.org/D98552	2021-03-12 21:41:36 -06:00
Amy Huang	d7cd208f08	[DebugInfo] Add an attribute to force type info to be emitted for types that are required to be complete. This was motivated by the fact that constructor type homing (debug info optimization that we want to turn on by default) drops some libc++ types, so an attribute would allow us to override constructor homing and emit them anyway. I'm currently looking into the particular libc++ issue, but even if we do fix that, this issue might come up elsewhere and it might be nice to have this. As I've implemented it now, the attribute isn't specific to the constructor homing optimization and overrides all of the debug info optimizations. Open to discussion about naming, specifics on what the attribute should do, etc. Differential Revision: https://reviews.llvm.org/D97411	2021-03-12 12:30:01 -08:00
Nikita Popov	42eb658f65	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC) This removes some (but not all) uses of type-less CreateGEP() and CreateInBoundsGEP() APIs, which are incompatible with opaque pointers. There are a still a number of tricky uses left, as well as many more variation APIs for CreateGEP.	2021-03-12 21:01:16 +01:00
Simon Pilgrim	731b3d7664	[clang] Use Constant::getAllOnesValue helper. NFCI. Avoid -1ULL which MSVC tends to complain about	2021-03-12 15:16:36 +00:00
Sriraman Tallam	cdb42a4cc4	Disable unique linkage suffixes ifor global vars until demanglers can be fixed. D96109 added support for unique internal linkage names for both internal linkage functions and global variables. There was a lot of discussion on how to get the demangling right for functions but I completely missed the point that demanglers do not support suffixes for global vars. For example: $ c++filt _ZL3foo foo $ c++filt _ZL3foo.uniq.123 _ZL3foo.uniq.123 The demangling for functions works as expected. I am not sure of the impact of this. I don't understand how debuggers and other tools depend on the correctness of global variable demangling so I am pre-emptively disabling it until we can get the demangling support added. Importantly, uniquefying global variables is not needed right now as we do not do profile attribution to global vars based on sampling. It was added for completeness and so this feature is not exactly missed. Differential Revision: https://reviews.llvm.org/D98392	2021-03-11 20:59:30 -08:00
David Blaikie	bd2bdad19e	void cast to suppress -Wunused-variable in non-asserts build	2021-03-11 17:51:31 -08:00
Florian Hahn	c92ec0dd92	[Matrix] Add support for matrix-by-scalar division. This patch extends the matrix spec to allow matrix-by-scalar division. Originally support for `/` was left out to avoid ambiguity for the matrix-matrix version of `/`, which could either be elementwise or specified as matrix multiplication M1 * (1/M2). For the matrix-scalar version, no ambiguity exists; `*` is also an elementwise operation in that case. Matrix-by-scalar division is commonly supported by systems including Matlab, Mathematica or NumPy. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97857	2021-03-11 22:21:23 +00:00
Joseph Huber	807466ef28	[OpenMP] Restore backwards compatibility for libomptarget Summary: The changes introduced in D87946 changed the API for libomptarget functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x but was not given a backward-compatible interface. This change will require people using Clang 13.x or 12.x to recompile their offloading programs. Reviewed By: jdoerfert cchen Differential Revision: https://reviews.llvm.org/D98358	2021-03-11 09:52:11 -05:00
Nikita Popov	46354bac76	[OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC) Explicitly pass loaded type when creating loads, in preparation for the deprecation of these APIs. There are still a couple of uses left.	2021-03-11 14:40:57 +01:00
Nikita Popov	68e01339cc	[CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC) These are incompatible with opaque pointers. This is in preparation of dropping this API on the IRBuilder side as well. Instead explicitly pass the loaded type.	2021-03-11 10:41:23 +01:00
Olivier Goffart	5baea05601	[SEH] Fix capture of this in lambda functions Commit `1b04bdc2f3` added support for capturing the 'this' pointer in a SEH context (__finally or __except), But the case in which the 'this' pointer is part of a lambda capture was not handled properly Differential Revision: https://reviews.llvm.org/D97687	2021-03-11 09:12:42 +01:00
Zakk Chen	d6a0560bf2	[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics. Demonstrate how to generate vadd/vfadd intrinsic functions 1. add -gen-riscv-vector-builtins for clang builtins. 2. add -gen-riscv-vector-builtin-codegen for clang codegen. 3. add -gen-riscv-vector-header for riscv_vector.h. It also generates ifdef directives with extension checking, base on D94403. 4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h. Generate overloading version Header for generic api. https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface 5. update tblgen doc for riscv related options. riscv_vector.td also defines some unused type transformers for vadd, because I think it could demonstrate how tranfer type work and we need them for the whole intrinsic functions implementation in the future. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D95016	2021-03-10 18:43:43 -08:00
Arthur Eubanks	c8227f06b3	[clang] Don't assert in EmitAggregateCopy on trivial_abi types Fixes PR42961. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D97872	2021-03-10 10:16:06 -08:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
serge-sans-paille	ea8e5b87ac	[NFC] Remove duplicate isNoBuiltinFunc method It's available both in CodeGenOptions and in LangOptions, and LangOptions implementation is slightly better as it uses a StringRef instead of a char pointer, so use it. Differential Revision: https://reviews.llvm.org/D98175	2021-03-10 09:18:55 +01:00
Xiangling Liao	561fb7f60a	[NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX LLVM is recommending to use SmallVector (that is, omitting the N), in the absence of a well-motivated choice for the number of inlined elements N. However, this doesn't work well with XL compiler on AIX since some header(s) aren't properly picked up with it. We need to take a further look into the real issue underneath and fix it in a later patch. But currently we'd like to use this patch to unblock the build compiler issue first. Differential Revision: https://reviews.llvm.org/D98265	2021-03-09 13:03:52 -05:00
diggerlin	46d4d1fea4	[AIX] do not emit visibility attribute into IR when there is -mignore-xcoff-visibility SUMMARY: n the patch https://reviews.llvm.org/D87451 "add new option -mignore-xcoff-visibility" we did as "The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR." in these patch we let -mignore-xcoff-visibility effect on generating IR too. the new feature only work on AIX OS Reviewer: Jason Liu, Differential Revision: https://reviews.llvm.org/D89986	2021-03-09 10:38:00 -05:00
Akira Hatanaka	dca5737945	Move ObjCARCUtil.h back to llvm/Analysis Instead of adding the header to llvm/IR, just duplicate the marker string in the auto upgrader.	2021-03-08 16:35:24 -08:00
Min-Yih Hsu	5eb7a5814a	[cfe][M68k](7/8) Clang basic support This is the first patch supporting M68k in Clang - Register M68k as a target - Target specific CodeGen support - Target specific attribute support Authors: myhsu, m4yers, glaubitz Differential Revision: https://reviews.llvm.org/D88393	2021-03-08 12:30:57 -08:00
Sriraman Tallam	78d0e91865	Refactor -funique-internal-linakge-names implementation. The option -funique-internal-linkage-names was added in D73307 and D78243 as a LLVM early pass to insert a unique suffix to internal linkage functions and vars. The unique suffix was the hash of the module path. However, we found that this can be done more cleanly in clang early and the fixes that need to be done later can be completely avoided. The fixes in particular are trying to modify the DW_AT_linkage_name and finding the right place to insert the pass. This patch ressurects the original implementation proposed in D73307 which was reviewed and then ditched in favor of the pass based approach. Differential Revision: https://reviews.llvm.org/D96109	2021-03-05 13:32:17 -08:00
Jingu Kang	9b302513f6	[AArch64] Add missing intrinsics for vrnd	2021-03-05 11:26:12 +00:00
Michael Kruse	b119120673	[clang][OpenMP] Use OpenMPIRBuilder for workshare loops. Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to: * Only the worksharing-loop directive. * Recognizes only the nowait clause. * No loop nests with more than one loop. * Untested with templates, exceptions. * Semantic checking left to the existing infrastructure. This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics: * The distance function: How many loop iterations there will be before entering the loop nest. * The loop variable function: Conversion from a logical iteration number to the loop variable. These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang. The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does. For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting. Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D94973	2021-03-04 22:52:59 -06:00
David Blaikie	cedc53254a	Fix clang for header move in LLVM/IR	2021-03-04 16:20:44 -08:00
Heejin Ahn	561abd83ff	[WebAssembly] Disable uses of __clang_call_terminate Background: Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses Itanium-based libraries and ABIs with some modifications. `__clang_call_terminate` is a wrapper generated in Clang's Itanium C++ ABI implementation. It contains this code, in C-style pseudocode: ``` void __clang_call_terminate(void *exn) { __cxa_begin_catch(exn); std::terminate(); } ``` So this function is a wrapper to call `__cxa_begin_catch` on the exception pointer before termination. In Itanium ABI, this function is called when another exception is thrown while processing an exception. The pointer for this second, violating exception is passed as the argument of this `__clang_call_terminate`, which calls `__cxa_begin_catch` with that pointer and calls `std::terminate` to terminate the program. The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch` says, ``` When the personality routine encounters a termination condition, it will call __cxa_begin_catch() to mark the exception as handled and then call terminate(), which shall not return to its caller. ``` In wasm EH's Clang implementation, this function is called from cleanuppads that terminates the program, which we also call terminate pads. Cleanuppads normally don't access the thrown exception and the wasm backend converts them to `catch_all` blocks. But because we need the exception pointer in this cleanuppad, we generate `wasm.get.exception` intrinsic (which will eventually be lowered to `catch` instruction) as we do in the catchpads. But because terminate pads are cleanup pads and should run even when a foreign exception is thrown, so what we have been doing is: 1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure terminate pads are in this simple shape: ``` %exn = catch call @__clang_call_terminate(%exn) unreachable ``` 2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the pipeline, we attach a `catch_all` to terminate pads, so they will be in this form: ``` %exn = catch call @__clang_call_terminate(%exn) unreachable catch_all call @std::terminate() unreachable ``` In `catch_all` part, we don't have the exception pointer, so we call `std::terminate()` directly. The reason we ran HandleEHTerminatePads at the end of the pipeline, separate from LateEHPrepare, was it was convenient to assume there was only a single `catch` part per `try` during CFGSort and CFGStackify. --- Problem: While it thinks terminate pads could have been possibly split or calls to `__clang_call_terminate` could have been duplicated, `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate pads contain no more than calls to `__clang_call_terminate` and `unreachable` instruction. I assumed that because in LLVM very limited forms of transformations are done to catchpads and cleanuppads to maintain the scoping structure. But it turned out to be incorrect; passes can merge cleanuppads into one, including terminate pads, as long as the new code has a correct scoping structure. One pass that does this I observed was `SimplifyCFG`, but there can be more. After this transformation, a single cleanuppad can contain any number of other instructions with the call to `__clang_call_terminate` and can span many BBs. It wouldn't be practical to duplicate all these BBs within the cleanuppad to generate the equivalent `catch_all` blocks, only with calls to `__clang_call_terminate` replaced by calls to `std::terminate`. Unless we do more complicated transformation to split those calls to `__clang_call_terminate` into a separate cleanuppad, it is tricky to solve. --- Solution (?): This CL just disables the generation and use of `__clang_call_terminate` and calls `std::terminate()` directly in its place. The possible downside of this approach can be, because the Itanium ABI intended to "mark" the violating exception handled, we don't do that anymore. What `__cxa_begin_catch` actually does is increment the exception's handler count and decrement the uncaught exception count, which in my opinion do not matter much given that we are about to terminate the program anyway. Also it does not affect info like stack traces that can be possibly shown to developers. And while we use a variant of Itanium EH ABI, we can make some deviations if we choose to; we are already different in that in the current version of the EH spec we don't support two-phase unwinding. We can possibly consider a more complicated transformation later to reenable this, but I don't think that has high priority. Changes in this CL contains: - In Clang, we don't generate a call to `wasm.get.exception()` intrinsic and `__clang_call_terminate` function in terminate pads anymore; we simply generate calls to `std::terminate()`, which is the default implementation of `CGCXXABI::emitTerminateForUnexpectedException`. - Remove `WebAssembly::ensureSingleBBTermPads() function and `WebAssemblyHandleEHTerminatePads` pass, because terminate pads are already `catch_all` now (because they don't need the exception pointer) and we don't need these transformations anymore. - Change tests to use `std::terminate` directly. Also removes tests that tested `LateEHPrepare::ensureSingleBBTermPads` and `HandleEHTerminatePads` pass. - Drive-by fix: Add some function attributes to EH intrinsic declarations Fixes https://github.com/emscripten-core/emscripten/issues/13582. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D97834	2021-03-04 14:26:35 -08:00
Reid Kleckner	1c2e7d200d	[MS] Fix crash involving gnu stmt exprs and inalloca Use a WeakTrackingVH to cope with the stmt emission logic that cleans up unreachable blocks. This invalidates the reference to the deferred replacement placeholder. Cope with it. Fixes PR25102 (from 2015!)	2021-03-04 13:57:46 -08:00
Gui Andrade	10264a1b21	Introduce noundef attribute at call sites for stricter poison analysis This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits. In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch. Differential Revision: https://reviews.llvm.org/D81678	2021-03-04 12:15:12 -08:00
Zequan Wu	9783e20988	Revert "Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements."" Reland with update on test case ContinuousSyncmode/basic.c. This reverts commit `fe5c2c3ca6`.	2021-03-04 11:52:43 -08:00
Akira Hatanaka	1900503595	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies `ed4718eccb`, which was reverted because it was causing a miscompile. The bug that was causing the miscompile has been fixed in `75805dce5f`. Original commit message: Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-03-04 11:22:30 -08:00
Alexey Bataev	711179b581	[OPENMP]Fix PR48759: "fatal error" when compile with preprocessed file. If the file in line directive does not exist on the system we need, to use the original file to get its file id. Differential Revision: https://reviews.llvm.org/D97945	2021-03-04 07:26:57 -08:00
Nico Weber	fe5c2c3ca6	Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements." This reverts commit `2d7374a0c6`. Breaks ContinuousSyncMode/basic.c in check-profile on macOS.	2021-03-04 08:53:30 -05:00
Thomas Preud'homme	b7aeece47c	Revert "Stop traping on sNaN in __builtin_isinf" This reverts commit `1b6eb56aa0` because the invert logic for isfinite is incorrect.	2021-03-04 12:07:35 +00:00
Wang, Pengfei	e7e67c930a	Add Windows ehcont section support (/guard:ehcont). Add option /guard:ehcont Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D96709	2021-03-04 11:47:29 +08:00
Soumi Manna	eec7f8f7b1	[WebAssembly] Add missing default cases in switch statements unsigned variable 'IntNo' has been declared but not been defined inside function EmitWebAssemblyBuiltinExpr(). static code analysis tool complains about uninitialized variable "IntNo" since this enters to default branch without setting any intrinsics and calls Function *Callee = CGM.getIntrinsic(IntNo). This patch fixes the problem by adding default cases in switch statements.	2021-03-03 13:15:23 -08:00
Zequan Wu	2d7374a0c6	[Coverage] Emit gap region between statements if first statements contains terminate statements. Differential Revision: https://reviews.llvm.org/D97101	2021-03-03 11:25:49 -08:00
Hans Wennborg	0a5dd06718	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR" This caused miscompiles of Chromium tests for iOS due clobbering of live registers. See discussion on the code review for details. > Background: > > This fixes a longstanding problem where llvm breaks ARC's autorelease > optimization (see the link below) by separating calls from the marker > instructions or retainRV/claimRV calls. The backend changes are in > https://reviews.llvm.org/D92569. > > https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue > > What this patch does to fix the problem: > > - The front-end adds operand bundle "clang.arc.attachedcall" to calls, > which indicates the call is implicitly followed by a marker > instruction and an implicit retainRV/claimRV call that consumes the > call result. In addition, it emits a call to > @llvm.objc.clang.arc.noop.use, which consumes the call result, to > prevent the middle-end passes from changing the return type of the > called function. This is currently done only when the target is arm64 > and the optimization level is higher than -O0. > > - ARC optimizer temporarily emits retainRV/claimRV calls after the calls > with the operand bundle in the IR and removes the inserted calls after > processing the function. > > - ARC contract pass emits retainRV/claimRV calls after the call with the > operand bundle. It doesn't remove the operand bundle on the call since > the backend needs it to emit the marker instruction. The retainRV and > claimRV calls are emitted late in the pipeline to prevent optimization > passes from transforming the IR in a way that makes it harder for the > ARC middle-end passes to figure out the def-use relationship between > the call and the retainRV/claimRV calls (which is the cause of > PR31925). > > - The function inliner removes an autoreleaseRV call in the callee if > nothing in the callee prevents it from being paired up with the > retainRV/claimRV call in the caller. It then inserts a release call if > claimRV is attached to the call since autoreleaseRV+claimRV is > equivalent to a release. If it cannot find an autoreleaseRV call, it > tries to transfer the operand bundle to a function call in the callee. > This is important since the ARC optimizer can remove the autoreleaseRV > returning the callee result, which makes it impossible to pair it up > with the retainRV/claimRV call in the caller. If that fails, it simply > emits a retain call in the IR if retainRV is attached to the call and > does nothing if claimRV is attached to it. > > - SCCP refrains from replacing the return value of a call with a > constant value if the call has the operand bundle. This ensures the > call always has at least one user (the call to > @llvm.objc.clang.arc.noop.use). > > - This patch also fixes a bug in replaceUsesOfNonProtoConstant where > multiple operand bundles of the same kind were being added to a call. > > Future work: > > - Use the operand bundle on x86-64. > > - Fix the auto upgrader to convert call+retainRV/claimRV pairs into > calls with the operand bundles. > > rdar://71443534 > > Differential Revision: https://reviews.llvm.org/D92808 This reverts commit `ed4718eccb`.	2021-03-03 15:51:40 +01:00
Thomas Preud'homme	1b6eb56aa0	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-02 15:54:56 +00:00
Alexey Bataev	0caf736d7e	[OPENMP50]Mapping of the subcomponents with the 'default' mappers. If the mapped structure has data members, which have 'default' mappers, need to map these members individually using their 'default' mappers. Differential Revision: https://reviews.llvm.org/D92195	2021-03-02 07:11:06 -08:00
Richard Smith	9e2579dbf4	Fix infinite recursion during IR emission if a constant-initialized lifetime-extended temporary object's initializer refers back to the same object. `GetAddrOfGlobalTemporary` previously tried to emit the initializer of a global temporary before updating the global temporary map. Emitting the initializer could recurse back into `GetAddrOfGlobalTemporary` for the same temporary, resulting in an infinite recursion. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97733	2021-03-01 22:19:21 -08:00
Yuanfang Chen	5de2d189e6	[Diagnose] Unify MCContext and LLVMContext diagnosing The situation with inline asm/MC error reporting is kind of messy at the moment. The errors from MC layout are not reliably propagated and users have to specify an inlineasm handler separately to get inlineasm diagnose. The latter issue is not a correctness issue but could be improved. * Kill LLVMContext inlineasm diagnose handler and migrate it to use DiagnoseInfo/DiagnoseHandler. * Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This covers use cases like inlineasm, MC, and any clients using SourceMgr. * Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all use cases, only one of them is used. * If LLVMContext is available, let MCContext uses LLVMContext's diagnose handler; if LLVMContext is not available, MCContext uses its own default diagnose handler which just prints SMDiagnostic. * Change a few clients(Clang, llc, lldb) to use the new way of reporting. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D97449	2021-03-01 15:58:37 -08:00
Yaxun (Sam) Liu	53d30381f5	Fix build failure due to dump() Change-Id: I86b534223d63bf8bb8f49af5a64b300efbeba77b	2021-03-01 16:44:08 -05:00
Yaxun (Sam) Liu	5cf2a37f12	[HIP] Emit kernel symbol Currently clang uses stub function to launch kernel. This is inconvenient to interop with C++ programs since the stub function has different name as kernel, which is required by ROCm debugger. This patch emits a variable symbol which has the same name as the kernel and uses it to register and launch the kernel. This allows C++ program to launch a kernel by using the original kernel name. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D86376	2021-03-01 16:31:40 -05:00
Sean Fertile	3f40dbbbc7	[PowerPC][AIX] Enable passing vectors in variadic functions. Differential Revision: https://reviews.llvm.org/D97474	2021-03-01 13:08:28 -05:00
Arthur Eubanks	040c1b49d7	Move EntryExitInstrumentation pass location This seems to be more of a Clang thing rather than a generic LLVM thing, so this moves it out of LLVM pipelines and as Clang extension hooks into LLVM pipelines. Move the post-inline EEInstrumentation out of the backend pipeline and into a late pass, similar to other sanitizer passes. It doesn't fit into the codegen pipeline. Also fix up EntryExitInstrumentation not running at -O0 under the new PM. PR49143 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D97608	2021-03-01 10:08:10 -08:00
Olivier Goffart	1b04bdc2f3	[SEH] capture 'this' Simply make sure that the CodeGenFunction::CXXThisValue and CXXABIThisValue are correctly initialized to the recovered value. For lambda capture, we also need to make sure to fill the LambdaCaptureFields Differential Revision: https://reviews.llvm.org/D97534	2021-03-01 11:57:35 +01:00
Fangrui Song	8afdacba9d	Add GNU attribute 'retain' For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a `__attribute__((retain))` function/variable to prevent linker garbage collection. (See AttrDocs.td for the linker support). This patch adds `retain` functions/variables to the `llvm.used` list, which has the desired linker GC semantics. Note: `retain` does not imply `used`, so an unused function/variable can be dropped by Sema. Before 'retain' was introduced, previous ELF solutions require inline asm or linker tricks, e.g. `asm volatile(".reloc 0, R_X86_64_NONE, target");` (architecture dependent) or define a non-local symbol in the section and use `ld -u`. There was no elegant source-level solution. With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets. Differential Revision: https://reviews.llvm.org/D97447	2021-02-26 16:37:50 -08:00
Fangrui Song	28cb620321	Change some addUsedGlobal to addUsedOrCompilerUsedGlobal An global value in the `llvm.used` list does not have GC root semantics on ELF targets. This will be changed in a subsequent backend patch. Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to prevent undesired GC root semantics. Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets. GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections", which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior). For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that the future LLD change will not affect them. rnk kindly categorized the changes: ``` ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac. MS C++ ABI stuff: wants GC root semantics, no change OpenMP: unsure, but GC root semantics probably don't hurt CodeGenModule: affected in this patch to not use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics Coverage: Probably want GC root semantics CGExpr.cpp: refers to LTO, wants GC root CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run) CGDecl.cpp: Changed in this patch for __attribute__((used)) ``` Differential Revision: https://reviews.llvm.org/D97446	2021-02-26 10:42:07 -08:00
Petr Hosek	8459b8ef39	[Driver] Rename -fprofile-{prefix-map,compilation-dir} to -fcoverage-{prefix-map,compilation-dir} These flags affect coverage mapping (-fcoverage-mapping), not -fprofile-[instr-]generate so it makes more sense to use the -fcoverage-* prefix. Differential Revision: https://reviews.llvm.org/D97434	2021-02-25 21:40:12 -08:00
James Y Knight	24539f1ef2	Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. And then push those change throughout LLVM. Keep the old signature in Clang's CGBuilder for now -- that will be updated in a follow-on patch (D97224). The MLIR LLVM-IR dialect is not updated to support the new alignment attribute, but preserves its existing behavior. Differential Revision: https://reviews.llvm.org/D97223	2021-02-25 18:29:42 -05:00
Vitaly Buka	9a887f652c	[clang,NFC] Fix typos in file headers	2021-02-25 12:47:02 -08:00
Akira Hatanaka	ec4408ad69	[CodeGen] Call ConvertTypeForMem instead of ConvertType This fixes a crash that occurs when the type passed to the method is `_Bool`. rdar://74493389	2021-02-25 12:11:18 -08:00
Dan Liew	5d64dd8e3c	[Clang][ASan] Introduce `-fsanitize-address-destructor-kind=` driver & frontend option. The new `-fsanitize-address-destructor-kind=` option allows control over how module destructors are emitted by ASan. The new option is consumed by both the driver and the frontend and is propagated into codegen options by the frontend. Both the legacy and new pass manager code have been updated to consume the new option from the codegen options. It would be nice if the new utility functions (`AsanDtorKindToString` and `AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be consumed by other language frontends. Unfortunately that doesn't work because the clang driver doesn't link against the LLVM instrumentation library. rdar://71609176 Differential Revision: https://reviews.llvm.org/D96572	2021-02-25 12:02:21 -08:00
Jan Svoboda	a25e4a6da3	[clang][cli] Store additional optimization remarks info After a revision of D96274 changed `DiagnosticOptions` to not store all remark arguments as-written, it is no longer possible to reconstruct the arguments accurately from the class. This is caused by the fact that for `-Rpass=regexp` and friends, `DiagnosticOptions` store only the group name `pass` and not `regexp`. This is the same representation used for the plain `-Rpass` argument. Note that each argument must be generated exactly once in `CompilerInvocation::generateCC1CommandLine`, otherwise each subsequent call would produce more arguments than the previous one. Currently this works out because of the way `RoundTrip` splits the responsibilities for certain arguments based on what arguments were queried during parsing. However, this invariant breaks when we move to single round-trip for the whole `CompilerInvocation`. This patch ensures that for one `-Rpass=regexp` argument, we don't generate two arguments (`-Rpass` from `DiagnosticOptions` and `-Rpass=regexp` from `CodeGenOptions`) by shifting the responsibility for handling both cases to `CodeGenOptions`. To distinguish between the cases correctly, additional information is stored in `CodeGenOptions`. The `CodeGenOptions` parser of `-Rpass[=regexp]` arguments also looks at `-Rno-pass` and `-R[no-]everything`, which is necessary for generating the correct argument regardless of the ordering of `CodeGenOptions`/`DiagnosticOptions` parsing/generation. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D96847	2021-02-25 11:02:49 +01:00
Yaxun (Sam) Liu	47acdec1dd	[CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc For -fgpu-rdc mode, static device vars in different TU's may have the same name. To support accessing file-scope static device variables in host code, we need to give them a distinct name and external linkage. This can be done by postfixing each static device variable with a distinct CUID (Compilation Unit ID) hash. Since the static device variables have different name across compilation units, now we let them have external linkage so that they can be looked up by the runtime. Reviewed by: Artem Belevich, and Jon Chesterfield Differential Revision: https://reviews.llvm.org/D85223	2021-02-24 18:23:45 -05:00
Vitaly Buka	8560c2d426	[ThinLTO, NewPM] Run OptimizerLastEPCallbacks from buildThinLTOPreLinkDefaultPipeline -O1 and above do dont call real optimizer pipeline in ThinLTO PreLink. Also clang can't add PostLink OptimizerLastEPCallbacks for in-process ThinLTO. This results in missing sanitizer passes with ThinLTO. Simple working solution is just call OptimizerLastEPCallbacks at the end of buildThinLTOPreLinkDefaultPipeline. Differential Revision: https://reviews.llvm.org/D96320	2021-02-23 22:14:41 -08:00
Dávid Bolvanský	053dc95839	Reduce the number of attributes attached to each function Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D97116	2021-02-24 07:08:44 +01:00
Yaxun (Sam) Liu	a3ce7f5cd2	[HIP] Fix managed variable linkage Currently managed variables are emitted as undefined symbols, which causes difficulty for diagnosing undefined symbols for non-managed variables. This patch transforms managed variables in device compilation so that they can be emitted as normal variables. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D96195	2021-02-23 22:34:45 -05:00
Hsiangkai Wang	1a35a1b074	[RISCV] Add vadd with mask and without mask builtin. Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D93446	2021-02-24 07:57:31 +08:00
James Y Knight	fe2dcd89ac	DebugInfo: Emit "LocalToUnit" flag on local member function decls. Previously, the definition was so-marked, but the declaration was not. This resulted in LLVM's dwarf emission treating the function as being external, and incorrectly emitting DW_AT_external. Differential Revision: https://reviews.llvm.org/D96044	2021-02-22 17:55:25 -05:00
Melanie Blower	e64fcdf8d5	[clang][patch] Inclusive language, modify filename SanitizerBlacklist.h to NoSanitizeList.h This patch responds to a comment from @vitalybuka in D96203: suggestion to do the change incrementally, and start by modifying this file name. I modified the file name and made the other changes that follow from that rename. Reviewers: vitalybuka, echristo, MaskRay, jansvoboda11, aaron.ballman Differential Revision: https://reviews.llvm.org/D96974	2021-02-22 15:11:37 -05:00
Ryan Santhiraraja	2c25efcbd3	[AArch64] Adding SHA3 Intrinsics support This patch adds the following SHA3 Intrinsics: vsha512hq_u64, vsha512h2q_u64, vsha512su0q_u64, vsha512su1q_u64 veor3q_u8 veor3q_u16 veor3q_u32 veor3q_u64 veor3q_s8 veor3q_s16 veor3q_s32 veor3q_s64 vrax1q_u64 vxarq_u64 vbcaxq_u8 vbcaxq_u16 vbcaxq_u32 vbcaxq_u64 vbcaxq_s8 vbcaxq_s16 vbcaxq_s32 vbcaxq_s64 Note need to include +sha3 and +crypto when building from the front-end Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D96381	2021-02-22 12:09:20 +00:00
Dávid Bolvanský	ee51c42e00	Reduce the number of attributes attached to each function This takes advantage of the implicit default behavior to reduce the number of attributes.	2021-02-20 06:57:47 +01:00
Petr Hosek	3275b18f89	[Coverage] Normalize compilation dir as well This matches debug info behavior. Differential Revision: https://reviews.llvm.org/D97001	2021-02-19 15:29:03 -08:00
Christopher Tetreault	55448ab540	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett, ctetreau Differential Revision: https://reviews.llvm.org/D96825	2021-02-19 14:48:12 -08:00
Teresa Johnson	0923a60ea7	[clang] Emit type metadata on available_externally vtables for WPD When WPD is enabled, via WholeProgramVTables, emit type metadata for available_externally vtables. Additionally, add the vtables to the llvm.compiler.used global so that they are not prematurely eliminated (before *LTO analysis). This is needed to avoid devirtualizing calls to a function overriding a class defined in a header file but with a strong definition in a shared library. Without type metadata on the available_externally vtables from the header, the WPD analysis never sees what a derived class is overriding. Even if the available_externally base class functions are pure virtual, because shared library definitions are already treated conservatively (committed patches D91583, D96721, and D96722) we will not devirtualize, which would be unsafe since the library might contain overrides that aren't visible to the LTO unit. An example is std::error_category, which is overridden in LLVM and causing failures after a self build with WPD enabled, because libstdc++ contains hidden overrides of the virtual base class methods. Differential Revision: https://reviews.llvm.org/D96919	2021-02-19 12:42:34 -08:00
Petr Hosek	5fbd1a333a	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 14:34:39 -08:00
Sterling Augustine	fc97a63db0	Move a second variable only used in an assert into the assert. This prevents unused variable warnings when building without asserts.	2021-02-18 13:26:07 -08:00
Sterling Augustine	4544a63b77	Move variable only used in an assert into the assert. This prevents unused variable warnings when building without asserts.	2021-02-18 13:04:58 -08:00
Petr Hosek	fbf8b957fd	Revert "[Coverage] Store compilation dir separately in coverage mapping" This reverts commit `97ec8fa5bb` since the test is failing on some bots.	2021-02-18 12:50:24 -08:00
Pengxuan Zheng	0ec32f1326	Revert "[AArch64] Adding Neon Polynomial vadd Intrinsics" Revert the patch due to buildbot failures. This reverts commit `d9645059c5`.	2021-02-18 12:38:16 -08:00
Petr Hosek	97ec8fa5bb	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 12:27:42 -08:00
Zequan Wu	d83511dd26	[Coverage] Emit gap region after conditions when macro is present.	2021-02-18 11:41:04 -08:00
Pengxuan Zheng	d9645059c5	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett Differential Revision: https://reviews.llvm.org/D96825	2021-02-18 11:33:24 -08:00
Jonas Paulsson	e57bd1ff4f	[CFE, SystemZ] New target hook testFPKind() for checks of FP values. The recent commit `00a6254` "Stop traping on sNaN in builtin_isnan" changed the lowering in constrained FP mode of builtin_isnan from an FP comparison to integer operations to avoid trapping. SystemZ has a special instruction "Test Data Class" which is the preferred way to do this check. This patch adds a new target hook "testFPKind()" that lets SystemZ emit the s390_tdc intrinsic instead. testFPKind() takes the BuiltinID as an argument and is expected to soon handle more opcodes than just 'builtin_isnan'. Review: Thomas Preud'homme, Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96568	2021-02-18 12:36:46 -06:00
Jeroen Dobbelaere	46757ccb49	[clang] functions with the 'const' or 'pure' attribute must always return. As described in * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute An `__attribute__((pure))` function must always return, as well as an `__attribute__((const))` function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96960	2021-02-18 17:29:46 +01:00
Hsiangkai Wang	766ee1096f	[Clang][RISCV] Define RISC-V V builtin types Add the types for the RISC-V V extension builtins. These types will be used by the RISC-V V intrinsics which require types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or <vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size attribute does not work for us as it doesn't create a scalable vector type. We want these types to be opaque and have no operators defined for them. We want them to be sizeless. This makes them similar to the ARM SVE builtin types. But we will have quite a bit more types. This patch adds around 60. Later patches will add another 230 or so types representing tuples of these types similar to the x2/x3/x4 types in ARM SVE. But with extra complexity that these types are combined with the LMUL concept that is unique to RISCV. For more background see this RFC http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D92715	2021-02-18 10:17:31 +08:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
Igor Kudrin	aa84289629	[DebugInfo] Keep the DWARF64 flag in the module metadata This allows the option to affect the LTO output. Module::Max helps to generate debug info for all modules in the same format. Differential Revision: https://reviews.llvm.org/D96597	2021-02-17 17:03:34 +07:00
Alexey Bataev	60d71a286b	[OPENMP50]Allow overlapping mapping in target constructs. OpenMP 5.0 removed a lot of restriction for overlapped mapped items comparing to OpenMP 4.5. Patch restricts the checks for overlapped data mappings only for OpenMP 4.5 and less and reorders mapping of the arguments so, that present and alloc mappings are processed first and then all others. Differential Revision: https://reviews.llvm.org/D86119	2021-02-16 14:42:08 -08:00
Michael Kruse	6c05005238	[OpenMP] Implement '#pragma omp tile', by Michael Kruse (@Meinersbur). The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard. This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult. A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once. I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest). Differential Revision: https://reviews.llvm.org/D76342	2021-02-16 09:45:07 -08:00
serge-sans-paille	3c8bf29f14	Reduce the number of attributes attached to each function This takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time. I've observed -3% in instruction count when compiling sqlite3 amalgamation with -O0 Differential Revision: https://reviews.llvm.org/D96400	2021-02-16 16:19:54 +01:00
Marco Antognini	e54811ff7e	Restore diagnostic handler after CodeGenAction::ExecuteAction Fix dangling pointer to local variable and address some typos. Reviewed By: xur Differential Revision: https://reviews.llvm.org/D96487	2021-02-15 10:33:00 +00:00
Wang, Pengfei	61da20575d	[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) This is a follow up of D92940. We have successfully converted fadd/fmul _mm_reduce_* intrinsics to llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax too, i.e. llvm.reduction + nnan flag. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93179	2021-02-15 08:52:06 +08:00
Malhar	74ddacd30d	[Clang] Ensure vector predication loop metadata is always emitted when pragma is specified. This patch ensures that vector predication and vectorization width pragmas work together correctly/as expected. Specifically, this patch fixes the issue that when vectorization_width > 1, the vector predication behaviour (this would matter if it has NOT been disabled explicitly by a pragma) was getting ignored, which was incorrect. The fix here removes the dependence of vector predication on the vectorization width. The loop metadata corresponding to clang loop pragma vectorize_predicate is always emitted, if the pragma is specified, even if vectorization is disabled by vectorize_width(1) or vectorize(disable) since the option is also used for interleaving by the LoopVectorize pass. Reviewed By: dmgreen, Meinersbur Differential Revision: https://reviews.llvm.org/D94779	2021-02-13 17:35:54 -06:00
Artur Gainullin	ff50b121e3	[SYCL] Ignore file-scope asm during device-side SYCL compilation. Reviewed By: bader, eandrews Differential Revision: https://reviews.llvm.org/D96538	2021-02-12 17:00:45 -08:00
Florian Hahn	6280bb4cd8	[clang] Remove redundant condition (NFC).	2021-02-12 20:14:24 +00:00
Florian Hahn	51bf4c0e6d	[clang] Add -ffinite-loops & -fno-finite-loops options. This patch adds 2 new options to control when Clang adds `mustprogress`: 1. -ffinite-loops: assume all loops are finite; mustprogress is added to all loops, regardless of the selected language standard. 2. -fno-finite-loops: assume no loop is finite; mustprogress is not added to any loop or function. We could add mustprogress to functions without loops, but we would have to detect that in Clang, which is probably not worth it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96419	2021-02-12 19:25:49 +00:00
Amy Huang	3fe465fb2c	Revert "[DebugInfo] Add an attribute to force type info to be emitted for" Didn't mean to commit this. This reverts commit `1b5c2915a2`.	2021-02-12 10:18:17 -08:00

... 5 6 7 8 9 ...

14730 Commits