llvm-project

Commit Graph

Author	SHA1	Message	Date
Thomas Lively	cbabfc63b1	[WebAssembly] Custom combines for f32x4.demote_zero_f64x2 Replace the clang builtin function and LLVM intrinsic for f32x4.demote_zero_f64x2 with combines from normal SDNodes. Also add missing combines for i32x4.trunc_sat_zero_f64x2_{s,u}, which share the same pattern. Differential Revision: https://reviews.llvm.org/D105755	2021-07-12 10:32:18 -07:00
Johannes Doerfert	e2cfbfcc0c	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 17:53:56 -05:00
Nico Weber	d3e7491333	Revert Attributor patch series Broke check-clang, see https://reviews.llvm.org/D102307#2869065 Ran `git revert -n ebbe149a6f08535ede848a531a601ae6591cfbc5..269416d41908bb670f67af689155d5ab8eea689a`	2021-07-10 16:15:55 -04:00
Johannes Doerfert	1d5711c3ee	[OpenMP] Unified entry point for SPMD & generic kernels in the device RTL In the spirit of TRegions [0], this patch provides a simpler and uniform interface for a kernel to set up the device runtime. The OMPIRBuilder is used for reuse in Flang. A custom state machine will be generated in the follow up patch. The "surplus" threads of the "master warp" will not exit early anymore so we need to use non-aligned barriers. The new runtime will not have an extra warp but also require these non-aligned barriers. [0] https://link.springer.com/chapter/10.1007/978-3-030-28596-8_11 This was in parts extracted from D59319. Reviewed By: ABataev, JonChesterfield Differential Revision: https://reviews.llvm.org/D101976	2021-07-10 12:32:50 -05:00
Thomas Lively	e5220104d0	[WebAssembly] Custom combines for f64x2.promote_low_f32x4 Replace the clang builtin function and LLVM intrinsic previously used to select the f64x2.promote_low_f32x4 instruction with custom combines from standard SelectionDAG nodes. Implement the new combines to share code with the similar combines for f64x2.convert_low_i32x4_{s,u}. Resolves PR50232. Differential Revision: https://reviews.llvm.org/D105675	2021-07-09 18:59:29 -07:00
Alexey Bataev	ab8989ab87	[OPENMP]Fix overlapped mapping for dereferenced pointer members. If the base is used in a map clause and later we have a memberexpr with this base, and the member is a pointer, and this pointer is dereferenced anyhow (subscript, array section, dereference, etc.), such components should be considered as overlapped, otherwise it may lead to incorrect size computations, since we try to map a pointee as a part of the whole struct, which is not true for the pointer members. Differential Revision: https://reviews.llvm.org/D105562	2021-07-09 12:51:26 -07:00
David Blaikie	768e3af634	PR51034: Debug Info: Remove 'prototyped' from K&R function declarations Regression caused by `6c9559b67b`.	2021-07-09 12:07:36 -07:00
Varun Gandhi	92dcb1d2db	[Clang] Introduce Swift async calling convention. This change is intended as initial setup. The plan is to add more semantic checks later. I plan to update the documentation as more semantic checks are added (instead of documenting the details up front). Most of the code closely mirrors that for the Swift calling convention. Three places are marked as [FIXME: swiftasynccc]; those will be addressed once the corresponding convention is introduced in LLVM. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D95561	2021-07-09 11:50:10 -07:00
David Blaikie	1def2579e1	PR51018: Remove explicit conversions from SmallString to StringRef to future-proof against C++23 C++23 will make these conversions ambiguous - so fix them to make the codebase forward-compatible with C++23 (& a follow-up change I've made will make this ambiguous/invalid even in <C++23 so we don't regress this & it generally improves the code anyway)	2021-07-08 13:37:57 -07:00
Nikita Popov	a0ea367562	[CodeGen] Avoid nullptr arg to CreateStructGEP (NFC) For now just make the getPointerElementType() explicit.	2021-07-08 21:21:43 +02:00
Alexey Bataev	f57d396dca	[OPENMP]Do no privatize const firstprivates in target regions. No need to emit private copyfor firstprivate constants in target regions, we can use the original copy instead. Differential Revision: https://reviews.llvm.org/D105647	2021-07-08 11:55:37 -07:00
Nikita Popov	693251fb2f	[CodeGen] Avoid CreateGEP with nullptr type (NFC) In preparation for dropping support for it. I've replaced it with a proper type where the correct type was obvious and left an explicit getPointerElementType() where it wasn't.	2021-07-08 20:38:54 +02:00
Alexey Bataev	b3c80dd894	[OPENMP]Remove const firstprivate allocation as a variable in a constant space. Current implementation is not compatible with asynchronous target regions, need to remove it. Differential Revision: https://reviews.llvm.org/D105375	2021-07-07 05:56:48 -07:00
Hsiangkai Wang	593bf9b4de	[Clang][RISCV] Implement vlseg and vlsegff. Differential Revision: https://reviews.llvm.org/D103527	2021-07-07 13:44:40 +08:00
David Blaikie	6c9559b67b	DebugInfo: Mangle K&R declarations for debug info linkage names This fixes a gap in the `overloadable` attribute support (K&R declared functions would get mangled symbol names, but that name wouldn't be represented in the debug info linkage name field for the function) and in -funique-internal-linkage-names (this came up in review discussion on D98799) where K&R static declarations would not get the uniqued linkage names.	2021-07-06 16:28:02 -07:00
Xiang1 Zhang	a39bb960fc	[X86] Refine code of generating BB labels in Keylocker Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105336	2021-07-05 09:29:51 +08:00
Nikita Popov	fabc17192e	[IRBuilder] Add type argument to CreateMaskedLoad/Gather Same as other CreateLoad-style APIs, these need an explicit type argument to support opaque pointers. Differential Revision: https://reviews.llvm.org/D105395	2021-07-04 12:17:59 +02:00
Jun Ma	3afbf89804	[clang][AArch64][SVE] Handle PRValue under VLAT <-> VLST cast This change fixes the crash that PRValue cannot be handled by EmitLValue. Differential Revision: https://reviews.llvm.org/D105097	2021-07-01 10:09:47 +08:00
Yaxun (Sam) Liu	434bd5bf54	[AMDGPU] Add builtin functions image_bvh_intersect_ray Reviewed by: Stanislav Mekhanoshin, Matt Arsenault Differential Revision: https://reviews.llvm.org/D104946	2021-06-30 13:10:47 -04:00
Melanie Blower	e773216f46	[clang][patch] Add builtin __arithmetic_fence and option fprotect-parens This patch adds a new clang builtin, __arithmetic_fence. The purpose of the builtin is to provide the user fine control, at the expression level, over floating point optimization when -ffast-math (-ffp-model=fast) is enabled. The builtin prevents the optimizer from rearranging floating point expression evaluation. The new option fprotect-parens has the same effect on parenthesized expressions, forcing the optimizer to respect the parentheses. Reviewed By: aaron.ballman, kpn Differential Revision: https://reviews.llvm.org/D100118	2021-06-30 09:58:06 -04:00
Akira Hatanaka	6cda73e3c4	[CodeGen] Add ParmVarDecls to FunctionDecls that are created to generate ObjC property getter/setter functions This is needed to prevent clang from crashing when we make the changes proposed in https://reviews.llvm.org/D98799. Differential Revision: https://reviews.llvm.org/D104883	2021-06-29 16:27:24 -07:00
Steffen Larsen	3644726a78	[Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX 6.5 and 7.0 WMMA and MMA instructions Adds NVPTX builtins and intrinsics for the CUDA PTX `wmma.load`, `wmma.store`, `wmma.mma`, and `mma` instructions added in PTX 6.5 and 7.0. PTX ISA description of - `wmma.load`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-ld - `wmma.store`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-st - `wmma.mma`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-wmma-mma - `mma`: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-instructions-mma Overview of `wmma.mma` and `mma` matrix shape/type combinations added with specific PTX versions: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#warp-level-matrix-shape Authored-by: Steffen Larsen <steffen.larsen@codeplay.com> Co-Authored-by: Stuart Adams <stuart.adams@codeplay.com> Reviewed By: tra Differential Revision: https://reviews.llvm.org/D104847	2021-06-29 15:44:07 -07:00
Akira Hatanaka	8d21d54725	[CodeGen] Stop creating fake FunctionDecls when generating IR for functions implicitly generated by the compiler These fake functions would cause clang to crash if the changes proposed in https://reviews.llvm.org/D98799 were made.	2021-06-29 14:22:33 -07:00
Akira Hatanaka	952944c12c	[ObjC][ARC] Don't add operand bundle clang.arc.attachedcall to a call if the call already has the operand bundle This bug was causing the call to `replaceAllUsesWith` to crash because the old call instruction and the new call instruction were the same. rdar://74957948 Differential Revision: https://reviews.llvm.org/D97824	2021-06-29 10:23:01 -07:00
Bruno De Fraine	4d8871a898	PR50767: clear non-distinct debuginfo for function with nodebug definition after undecorated declaration Fix suggested by Yuanfang Chen: Non-distinct debuginfo is attached to the function due to the undecorated declaration. Later, when seeing the function definition and `nodebug` attribute, the non-distinct debuginfo should be cleared. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D104777	2021-06-29 10:26:45 +02:00
Xiang1 Zhang	6d234a6908	[X86] Zero some outputs of Kelocker intrinsics in error case Reviewed By: WangPengfei Differential Revision: https://reviews.llvm.org/D104766	2021-06-29 13:35:40 +08:00
Ben Shi	c94c8d8b5d	[AVR][clang] Fix wrong calling convention in functions return struct type According to AVR ABI (https://gcc.gnu.org/wiki/avr-gcc), returned struct value within size 1-8 bytes should be returned directly (via register r18-r25), while larger ones should be returned via an implicit struct pointer argument. Reviewed By: dylanmckay Differential Revision: https://reviews.llvm.org/D99237	2021-06-29 11:32:39 +08:00
Michael Liao	948308ef34	Fix `-Wunused-variable` warning. NFC.	2021-06-28 22:50:36 -04:00
Hongtao Yu	633ca3ff2f	[UniqueLinkageName] Use exsiting GlobalDecl object instead of reconstructing one. C++ constructors/destructors need to go through a different constructor to construct a GlobalDecl object in order to retrieve their linkage type. This causes an assert failure in the default constructor of GlobalDecl. I'm chaning it to using the exsiting GlobalDecl object. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D102356	2021-06-28 14:50:41 -07:00
Melanie Blower	c27e5a2a8e	Revert "[clang][patch][fpenv] Add builtin __arithmetic_fence and option fprotect-parens" This reverts commit `4f1238e44d`. Buildbot fails on predecessor patch	2021-06-28 12:42:59 -04:00
Melanie Blower	4f1238e44d	[clang][patch][fpenv] Add builtin __arithmetic_fence and option fprotect-parens This patch adds a new clang builtin, __arithmetic_fence. The purpose of the builtin is to provide the user fine control, at the expression level, over floating point optimization when -ffast-math (-ffp-model=fast) is enabled. The builtin prevents the optimizer from rearranging floating point expression evaluation. The new option fprotect-parens has the same effect on parenthesized expressions, forcing the optimizer to respect the parentheses. Reviewed By: aaron.ballman, kpn Differential Revision: https://reviews.llvm.org/D100118	2021-06-28 12:26:53 -04:00
Jinsong Ji	eb237ffca8	[PowerPC] Add XL Compat fetch builtins Prototype ``` unsigned int __fetch_and_add (volatile unsigned int* addr, unsigned int val); unsigned long __fetch_and_addlp (volatile unsigned long* addr, unsigned long val); ``` Ref: https://www.ibm.com/docs/en/xl-c-and-cpp-linux/16.1.1?topic=functions-fetch Reviewed By: #powerpc, w2yehia, lkail Differential Revision: https://reviews.llvm.org/D104991	2021-06-28 02:52:32 +00:00
Craig Topper	7a112356e4	[X86] Correct the conversion of VALIGND/Q intrinsics to shufflevector. We need to mask the immediate to the width of a single vector rather than 2 vectors. If we use the width of 2 vectors then any shift larger than the length of 1 vector is going to overflow the shuffle indices. Fixes PR50895.	2021-06-26 19:06:00 -07:00
Joseph Huber	9ce02ea8c9	[OpenMP] Add Module metadata for OpenMP compilation This patch adds a module level metadata flag indicating that the module was compiled with the `-fopenmp` flag. This will make it easier for passes like OpenMPOpt to determine if it should be run. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D102361	2021-06-25 16:34:19 -04:00
Stuart Brady	e47027d091	[OpenCL] Use DW_LANG_OpenCL language tag for OpenCL C Note regarding C++ for OpenCL: When compiling C++ for OpenCL, DW_LANG_C_plus_plus* is emitted. There is no DWARF language code defined for C++ for OpenCL as of yet, but DWARF issue 210514.1 has been raised to request one. In the mean time, continuing to emit DW_LANG_C_plus_plus* for C++ for OpenCL allows the potential to distinguish between C++ for OpenCL and OpenCL C in !DICompileUnit nodes, whereas using DW_LANG_OpenCL for C++ for OpenCL would prevent this. This change therefore leaves C++ for OpenCL as-is. Reviewed By: shchenz, Anastasia Differential Revision: https://reviews.llvm.org/D104118	2021-06-25 11:48:42 +01:00
Jinsong Ji	f3ef4f5bff	[PowerPC] Add XL compat __compare_and_swap builtins Prototype int __compare_and_swap (volatile int* addr, int* old_val_addr, int new_val); int __compare_and_swaplp (volatile long* addr, long* old_val_addr, long new_val); Refer to https://www.ibm.com/docs/en/xl-c-and-cpp-aix/16.1?topic=functions-compare-swap-compare-swaplp Reviewed By: w2yehia Differential Revision: https://reviews.llvm.org/D104837	2021-06-25 01:08:48 +00:00
Martin Storsjö	e5c7c171e5	[clang] Rename StringRef _lower() method calls to _insensitive() This is mostly a mechanical change, but a testcase that contains parts of the StringRef class (clang/test/Analysis/llvm-conventions.cpp) isn't touched.	2021-06-25 00:22:01 +03:00
Akira Hatanaka	8db0dbbe2c	[CodeGen] Don't create fake FunctionDecls when generating block/byref copy/dispose helper functions We found out that these fake functions would cause clang to crash if the changes proposed in https://reviews.llvm.org/D98799 were made. The original patch was reverted in `f681fd927e` because debug locations were missing in the body of the block byref helper functions. This patch fixes the bug by calling CreateArtificial after the calls to StartFunction. Differential Revision: https://reviews.llvm.org/D104082	2021-06-24 11:45:52 -07:00
Aakanksha Patil	3453f3dd46	[AMDGPU] Add gfx1035 target Differential Revision: https://reviews.llvm.org/D104804	2021-06-24 14:32:41 -04:00
Zequan Wu	f681fd927e	Revert "[CodeGen] Don't create fake FunctionDecls when generating block/byref" That commit causes crash with error "!dbg attachment points at wrong subprogram for function" on iOS platforms. This reverts commit `f4c06bcb67`.	2021-06-22 21:48:00 -07:00
Akira Hatanaka	f4c06bcb67	[CodeGen] Don't create fake FunctionDecls when generating block/byref copy/dispose helper functions We found out that these fake functions would cause clang to crash if the changes proposed in https://reviews.llvm.org/D98799 were made. Differential Revision: https://reviews.llvm.org/D104082	2021-06-22 11:42:53 -07:00
Fangrui Song	948016228f	Improve clang -Wframe-larger-than= diagnostic Match the style in D104667. This commit is for non-LTO diagnostics, while D104667 is for LTO and llc diagnostics.	2021-06-22 11:20:49 -07:00
Joseph Huber	68d133a3e8	[OpenMP] Simplify GPU memory globalization Summary: Memory globalization is required to maintain OpenMP standard semantics for data sharing between worker and master threads. The GPU cannot share data between its threads so must allocate global or shared memory to store the data in. Currently this is implemented fully in the frontend using the `__kmpc_data_sharing_push_stack` and __kmpc_data_sharing_pop_stack` functions to emulate standard CPU stack sharing. The front-end scans the target region for variables that escape the region and must be shared between the threads. Each variable then has a field created for it in a global record type. This patch replaces this functinality with a single allocation command, effectively mimicing an alloca instruction for the variables that must be shared between the threads. This will be much slower than the current solution, but makes it much easier to optimize as we can analyze each variable independently and determine if it is not captured. In the future, we can replace these calls with an `alloca` and small allocations can be pushed to shared memory. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D97680	2021-06-22 10:52:46 -04:00
Graham Hunter	a83ce95b09	[clang] Remove unused capture in closure `c6a91ee6aa` removed uses of IsMonotonic from OpenMP SIMD codegen, but that left a capture of the variable unused which upset buildbots using -Werror.	2021-06-22 15:09:39 +01:00
Graham Hunter	c6a91ee6aa	[Clang][OpenMP] Monotonic does not apply to SIMD The codegen for simd constructs was affected by the presence (or absence) of the 'monotonic' schedule modifier for worksharing loops. The modifier is only intended to apply to the scheduling of chunks for a thread, not iterations of a loop inside a chunk. In addition, the monotonic modifier was applied to worksharing loops by default if no schedule clause was present; the referenced part of the OpenMP 4.5 spec in the code (section 2.7.1) only applies if the user specified a schedule clause with a static kind but no modifier. Without a user-specified schedule clause we should default to nonmonotonic scheduling. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D103793	2021-06-22 10:24:11 +01:00
Nick Desaulniers	8ace121305	[IR] convert warn-stack-size from module flag to fn attr Otherwise, this causes issues when building with LTO for object files that use different values. Link: https://github.com/ClangBuiltLinux/linux/issues/1395 Reviewed By: dblaikie, MaskRay Differential Revision: https://reviews.llvm.org/D104342	2021-06-21 15:09:25 -07:00
Nick Desaulniers	193e41c987	[Clang][Codegen] Add GNU function attribute 'no_profile' and lower it to noprofile noprofile IR attribute already exists to prevent profiling with PGO; emit that when a function uses the newly added no_profile function attribute. The Linux kernel would like to avoid compiler generated code in functions annotated with such attribute. We already respect this for libcalls to fentry() and mcount(). Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80223 Link: https://lore.kernel.org/lkml/CAKwvOdmPTi93n2L0_yQkrzLdmpxzrOR7zggSzonyaw2PGshApw@mail.gmail.com/ Reviewed By: MaskRay, void, phosek, aaron.ballman Differential Revision: https://reviews.llvm.org/D104475	2021-06-18 13:42:32 -07:00
Bjorn Pettersson	4c7f820b2b	Update @llvm.powi to handle different int sizes for the exponent This can be seen as a follow up to commit `0ee439b705`, that changed the second argument of __powidf2, __powisf2 and __powitf2 in compiler-rt from si_int to int. That was to align with how those runtimes are defined in libgcc. One thing that seem to have been missing in that patch was to make sure that the rest of LLVM also handle that the argument now depends on the size of int (not using the si_int machine mode for 32-bit). When using __builtin_powi for a target with 16-bit int clang crashed. And when emitting libcalls to those rtlib functions, typically when lowering @llvm.powi), the backend would always prepare the exponent argument as an i32 which caused miscompiles when the rtlib was compiled with 16-bit int. The solution used here is to use an overloaded type for the second argument in @llvm.powi. This way clang can use the "correct" type when lowering __builtin_powi, and then later when emitting the libcall it is assumed that the type used in @llvm.powi matches the rtlib function. One thing that needed some extra attention was that when vectorizing calls several passes did not support that several arguments could be overloaded in the intrinsics. This patch allows overload of a scalar operand by adding hasVectorInstrinsicOverloadedScalarOpd, with an entry for powi. Differential Revision: https://reviews.llvm.org/D99439	2021-06-17 09:38:28 +02:00
Chen Zheng	4590b406c0	[Debug-Info] guard DW_LANG_C_plus_plus_14 under strict dwarf Reviewed By: stuart Differential Revision: https://reviews.llvm.org/D104291	2021-06-16 03:17:56 +00:00
Kevin Athey	e0b469ffa1	[clang-cl][sanitizer] Add -fsanitize-address-use-after-return to clang. Also: - add driver test (fsanitize-use-after-return.c) - add basic IR test (asan-use-after-return.cpp) - (NFC) cleaned up logic for generating table of __asan_stack_malloc depending on flag. for issue: https://github.com/google/sanitizers/issues/1394 Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D104076	2021-06-11 12:07:35 -07:00

1 2 3 4 5 ...

14416 Commits