llvm-project

Commit Graph

Author	SHA1	Message	Date
Chirag Khandelwal	c204106188	[Clang][OpenMP] Frontend work for sections - D89671 This patch is child of D89671, contains the clang implementation to use the OpenMP IRBuilder's section construct. Co-author: @anchu-rajendran Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D91054	2021-04-29 19:52:27 +05:30
Evgeny Leviant	6a0283d0d2	[NewPM] Add an option to dump pass structure Patch adds -debug-pass-structure option to dump pass structure when new pass manager is used. Differential revision: https://reviews.llvm.org/D99599	2021-04-29 10:29:42 +03:00
Dan Liew	1bbbcff99d	[NFC] Rename SanitizeAddressDtorKind codegen opt to not have `Kind` suffix. This is post commit follow up based on discussions in https://reviews.llvm.org/D101122. Differential Revision: https://reviews.llvm.org/D101490 (cherry picked from commit f4c7e82d1b21e637c4e0c53125b126c407d8bdbf)	2021-04-28 18:37:16 -07:00
Ryan Santhirarajan	0395f9e70b	[ARM] Neon Polynomial vadd Intrinsic fix The Neon vadd intrinsics were added to the ARMSIMD intrinsic map, however due to being defined under an AArch64 guard in arm_neon.td, were not previously useable on ARM. This change rectifies that. It is important to note that poly128 is not valid on ARM, thus it was extracted out of the original arm_neon.td definition and separated for the sake of AArch64. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D100772	2021-04-28 11:59:40 -07:00
Nico Weber	0f1137ba79	[clang/Basic] Make TargetInfo.h not use DataLayout again Reverts parts of https://reviews.llvm.org/D17183, but keeps the resetDataLayout() API and adds an assert that checks that datalayout string and user label prefix are in sync. Approach 1 in https://reviews.llvm.org/D17183#2653279 Reduces number of TUs build for 'clang-format' from 689 to 575. I also implemented approach 2 in D100764. If someone feels motivated to make us use DataLayout more, it's easy to revert this change here and go with D100764 instead. I don't plan on doing more work in this area though, so I prefer going with the smaller, more self-consistent change. Differential Revision: https://reviews.llvm.org/D100776	2021-04-27 22:26:10 -04:00
Yonghong Song	a2a3ca8d97	BPF: emit debuginfo for Function of DeclRefExpr if requested Commit `e3d8ee35e4` ("reland "[DebugInfo] Support to emit debugInfo for extern variables"") added support to emit debugInfo for extern variables if requested by the target. Currently, only BPF target enables this feature by default. As BPF ecosystem grows, callback function started to get support, e.g., recently bpf_for_each_map_elem() is introduced (https://lwn.net/Articles/846504/) with a callback function as an argument. In the future we may have something like below as a demonstration of use case : extern int do_work(int); long bpf_helper(void callback_fn, void callback_ctx, ...); long prog_main() { struct { ... } ctx = { ... }; return bpf_helper(&do_work, &ctx, ...); } Basically bpf helper may have a callback function and the callback function is defined in another file or in the kernel. In this case, we would like to know the debuginfo types for do_work(), so the verifier can proper verify the safety of bpf_helper() call. For the following example, extern int do_work(int); long bpf_helper(void callback_fn); long prog() { return bpf_helper(&do_work); } Currently, there is no debuginfo generated for extern function do_work(). In the IR, we have, ... define dso_local i64 @prog() local_unnamed_addr #0 !dbg !7 { entry: %call = tail call i64 @bpf_helper(i8 bitcast (i32 (i32)* @do_work to i8*)) #2, !dbg !11 ret i64 %call, !dbg !12 } ... declare dso_local i32 @do_work(i32) #1 ... This patch added support for the above callback function use case, and the generated IR looks like below: ... declare !dbg !17 dso_local i32 @do_work(i32) #1 ... !17 = !DISubprogram(name: "do_work", scope: !1, file: !1, line: 1, type: !18, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, retainedNodes: !2) !18 = !DISubroutineType(types: !19) !19 = !{!20, !20} !20 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed) The TargetInfo.allowDebugInfoForExternalVar is renamed to TargetInfo.allowDebugInfoForExternalRef as now it guards both extern variable and extern function debuginfo generation. Differential Revision: https://reviews.llvm.org/D100567	2021-04-26 16:53:25 -07:00
Alexey Bader	7818906ca1	[SYCL] Implement SYCL address space attributes handling Default address space (applies when no explicit address space was specified) maps to generic (4) address space. Added SYCL named address spaces `sycl_global`, `sycl_local` and `sycl_private` defined as sub-sets of the default address space. Static variables without address space now reside in global address space when compile for SPIR target, unless they have an explicit address space qualifier in source code. Differential Revision: https://reviews.llvm.org/D89909	2021-04-26 13:44:10 +03:00
Levy Hsu	8cf54c7ff5	[RISCV] [1/2] Add IR intrinsic for Zbe extension RV32/64: bcompress bdecompress RV64 ONLY: bcompressw bdecompressw Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D101143	2021-04-25 19:14:34 -07:00
Thomas Lively	502f54049d	[WebAssembly] Finalize wasm_simd128.h intrinsics Adds new intrinsics for instructions that are in the final SIMD spec but did not previously have intrinsics. Also updates the names of existing intrinsics to reflect the final names of the underlying instructions in the spec. Keeps the old names as deprecated functions to ease the transition to the new names. Differential Revision: https://reviews.llvm.org/D101112	2021-04-23 13:37:27 -07:00
Fangrui Song	a92dbadffe	[OpenMP] Fix -Wdeprecated-copy	2021-04-23 10:49:19 -07:00
Dávid Bolvanský	2cae7025c1	Reland "[Clang] Propagate guaranteed alignment for malloc and others" This relands commit `6914a0ed2b`. Crash in InstCombine was fixed.	2021-04-23 14:05:57 +02:00
Dávid Bolvanský	6914a0ed2b	Revert "[Clang] Propagate guaranteed alignment for malloc and others" This reverts commit `c2297544c0`. Some buildbots are broken.	2021-04-23 11:33:33 +02:00
Dávid Bolvanský	c2297544c0	[Clang] Propagate guaranteed alignment for malloc and others LLVM should be smarter about known malloc's alignment and this knowledge may enable other optimizations. Originally started as LLVM patch - https://reviews.llvm.org/D100862 but this logic should be really in Clang. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D100879	2021-04-23 11:07:14 +02:00
Fangrui Song	2786e673c7	[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all} The Linux kernel objtool diagnostic `call without frame pointer save/setup` arise in multiple instrumentation passes (asan/tsan/gcov). With the mechanism introduced in D100251, it's trivial to respect the command line -m[no-]omit-leaf-frame-pointer/-f[no-]omit-frame-pointer, so let's do it. Fix: https://github.com/ClangBuiltLinux/linux/issues/1236 (tsan) Fix: https://github.com/ClangBuiltLinux/linux/issues/1238 (asan) Also document the function attribute "frame-pointer" which is long overdue. Differential Revision: https://reviews.llvm.org/D101016	2021-04-22 18:07:30 -07:00
Levy Hsu	b49337bbb9	[RISCV] [1/2] Add IR intrinsic for Zbp extension RV32/64: grev grevi gorc gorci shfl shfli unshfl unshfli RV64 ONLY: grevw greviw gorcw gorciw shflw shfli (For non-existing shfliw) unshfli (For non-existing unshfliw) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D100830	2021-04-22 16:34:51 -07:00
Fangrui Song	ef5e7f90ea	Temporarily revert the code part of D100981 "Delete le32/le64 targets" This partially reverts commit `77ac823fd2`. Halide uses le32/le64 (https://github.com/halide/Halide/pull/5934). Temporarily brings back the code part to give them some time for migration.	2021-04-22 10:18:44 -07:00
Hamza Mahfooz	be2277fbf2	[Matrix] Support #pragma clang fp From https://bugs.llvm.org/show_bug.cgi?id=49739: Currently, `#pragma clang fp` are ignored for matrix types. For the code below, the `contract` fast-math flag should be added to the generated call to `llvm.matrix.multiply` and `fadd` ``` typedef float fx2x2_t __attribute__((matrix_type(2, 2))); void foo(fx2x2_t &A, fx2x2_t &C, fx2x2_t &B) { #pragma clang fp contract(fast) C = A*B + C; } ``` Reviewed By: fhahn, mibintc Differential Revision: https://reviews.llvm.org/D100834	2021-04-22 11:45:34 +01:00
Giorgis Georgakoudis	a2dbfb6b72	[OpenMP] Simplify offloading parallel call codegen This revision simplifies Clang codegen for parallel regions in OpenMP GPU target offloading and corresponding changes in libomptarget: SPMD/non-SPMD parallel calls are unified under a single `kmpc_parallel_51` runtime entry point for parallel regions (which will be commonized between target, host-side parallel regions), data sharing is internalized to the runtime. Tests have been auto-generated using `update_cc_test_checks.py`. Also, the revision contains changes to OpenMPOpt for remark creation on target offloading regions. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D95976	2021-04-21 18:46:07 -07:00
Fangrui Song	77ac823fd2	Delete le32/le64 targets They are unused now. Note: NaCl is still used and is currently expected to be needed until 2022-06 (https://blog.chromium.org/2020/08/changes-to-chrome-app-support-timeline.html). Differential Revision: https://reviews.llvm.org/D100981	2021-04-21 18:44:12 -07:00
Fangrui Song	775a9483e5	[IR][sanitizer] Set nounwind on module ctor/dtor, additionally set uwtable if -fasynchronous-unwind-tables On ELF targets, if a function has uwtable or personality, or does not have nounwind (`needsUnwindTableEntry`), it marks that `.eh_frame` is needed in the module. Then, a function gets `.eh_frame` if `needsUnwindTableEntry` or `-g[123]` is specified. (i.e. If -g[123], every function gets `.eh_frame`. This behavior is strange but that is the status quo on GCC and Clang.) Let's take asan as an example. Other sanitizers are similar. `asan.module_[cd]tor` has no attribute. `needsUnwindTableEntry` returns true, so every function gets `.eh_frame` if `-g[123]` is specified. This is the root cause that `-fno-exceptions -fno-asynchronous-unwind-tables -g` produces .debug_frame while `-fno-exceptions -fno-asynchronous-unwind-tables -g -fsanitize=address` produces .eh_frame. This patch * sets the nounwind attribute on sanitizer module ctor/dtor. * let Clang emit a module flag metadata "uwtable" for -fasynchronous-unwind-tables. If "uwtable" is set, sanitizer module ctor/dtor additionally get the uwtable attribute. The "uwtable" mechanism is generic: synthesized functions not cloned/specialized from existing ones should consider `Function::createWithDefaultAttr` instead of `Function::create` if they want to get some default attributes which have more of module semantics. Other candidates: "frame-pointer" (https://github.com/ClangBuiltLinux/linux/issues/955 https://github.com/ClangBuiltLinux/linux/issues/1238), dso_local, etc. Differential Revision: https://reviews.llvm.org/D100251	2021-04-21 15:58:20 -07:00
Alexey Bataev	079884225a	[OPENMP]Fix PR49698: OpenMP declare mapper causes segmentation fault. The implicitly generated mappings for allocation/deallocation in mappers runtime should be mapped as implicit, also no need to clear member_of flag to avoid ref counter increment. Also, the ref counter should not be incremented for the very first element that comes from the mapper function. Differential Revision: https://reviews.llvm.org/D100673	2021-04-21 10:38:31 -07:00
Erich Keane	0ed613612c	Ensure target-multiversioning emits deferred declarations As reported in PR50025, sometimes we would end up not emitting functions needed by inline multiversioned variants. This is because we typically use the 'deferred decl' mechanism to emit these. However, the variants are emitted after that typically happens. This fixes that by ensuring we re-run deferred decls after this happens. Also, the multiversion emission is done recursively to ensure that MV functions that require other MV functions to be emitted get emitted.	2021-04-20 08:10:26 -07:00
Jan Svoboda	6a72ed239c	[clang] NFC: Fix range-based for loop warnings related to decl lookup	2021-04-19 18:31:31 +02:00
Xun Li	5faba87938	Revert "[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass" This reverts commit `fa6b54c44a`. The commited patch broke mlir tests. It seems that mlir tests depend on coroutine function properties set in CoroEarly pass.	2021-04-18 17:22:28 -07:00
Xun Li	fa6b54c44a	[Coroutines] Set presplit attribute in Clang instead of CoroEarly pass Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining. The presplit coroutine attributes are set in CoroEarly pass. However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine. This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920 To fix this, we set the attributes in the Clang front-end instead of in CoroEarly pass. Reviewed By: rjmccall, ChuanqiXu Differential Revision: https://reviews.llvm.org/D100282	2021-04-18 15:41:09 -07:00
Xun Li	c0211e8d7d	Revert "[Coroutines] Move CoroEarly pass to before AlwaysInliner" This reverts commit `2b50f5a434`. Forgot to update the description of the commit to sync with phabricator. Going to redo the commit.	2021-04-18 15:38:19 -07:00
Xun Li	2b50f5a434	[Coroutines] Move CoroEarly pass to before AlwaysInliner Presplit coroutines cannot be inlined. During AlwaysInliner we check if a function is a presplit coroutine, if so we skip inlining. The presplit coroutine attributes are set in CoroEarly pass. However in O0 pipeline, AlwaysInliner runs before CoroEarly, so the attribute isn't set yet and will still inline the coroutine. This causes Clang to crash: https://bugs.llvm.org/show_bug.cgi?id=49920 Differential Revision: https://reviews.llvm.org/D100282	2021-04-18 14:54:04 -07:00
Yaxun (Sam) Liu	d5c0f00e21	[CUDA][HIP] Mark device var used by host only Add device variables to llvm.compiler.used if they are ODR-used by either host or device functions. This is necessary to prevent them from being eliminated by whole-program optimization where the compiler has no way to know a device variable is used by some host code. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D98814	2021-04-17 11:25:25 -04:00
Serge Guelton	d6de1e1a71	Normalize interaction with boolean attributes Such attributes can either be unset, or set to "true" or "false" (as string). throughout the codebase, this led to inelegant checks ranging from if (Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") to if (Fn->hasAttribute("no-jump-tables") && Fn->getFnAttribute("no-jump-tables").getValueAsString() == "true") Introduce a getValueAsBool that normalize the check, with the following behavior: no attributes or attribute set to "false" => return false attribute set to "true" => return true Differential Revision: https://reviews.llvm.org/D99299	2021-04-17 08:17:33 +02:00
Thomas Lively	5c729750a6	[WebAssembly] Remove saturating fp-to-int target intrinsics Use the target-independent @llvm.fptosi and @llvm.fptoui intrinsics instead. This includes removing the instrinsics for i32x4.trunc_sat_zero_f64x2_{s,u}, which are now represented in IR as a saturating truncation to a v2i32 followed by a concatenation with a zero vector. Differential Revision: https://reviews.llvm.org/D100596	2021-04-16 12:11:20 -07:00
Alexey Bataev	10c7b9f64f	[OPENMP]Fix PR49115: Incorrect results for scan directive. For combined worksharing directives need to emit the temp arrays outside of the parallel region and update them in the master thread only. Differential Revision: https://reviews.llvm.org/D100187	2021-04-16 06:25:35 -07:00
Joshua Haberman	8344675908	Implemented [[clang::musttail]] attribute for guaranteed tail calls. This is a Clang-only change and depends on the existing "musttail" support already implemented in LLVM. The [[clang::musttail]] attribute goes on a return statement, not a function definition. There are several constraints that the user must follow when using [[clang::musttail]], and these constraints are verified by Sema. Tail calls are supported on regular function calls, calls through a function pointer, member function calls, and even pointer to member. Future work would be to throw a warning if a users tries to pass a pointer or reference to a local variable through a musttail call. Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D99517	2021-04-15 17:12:21 -07:00
Momchil Velikov	f9d932e673	[clang][AArch64] Correctly align HFA arguments when passed on the stack When we pass a AArch64 Homogeneous Floating-Point Aggregate (HFA) argument with increased alignment requirements, for example struct S { __attribute__ ((__aligned__(16))) double v[4]; }; Clang uses `[4 x double]` for the parameter, which is passed on the stack at alignment 8, whereas it should be at alignment 16, following Rule C.4 in AAPCS (https://github.com/ARM-software/abi-aa/blob/master/aapcs64/aapcs64.rst#642parameter-passing-rules) Currently we don't have a way to express in LLVM IR the alignment requirements of the function arguments. The align attribute is applicable to pointers only, and only for some special ways of passing arguments (e..g byval). When implementing AAPCS32/AAPCS64, clang resorts to dubious hacks of coercing to types, which naturally have the needed alignment. We don't have enough types to cover all the cases, though. This patch introduces a new use of the stackalign attribute to control stack slot alignment, when and if an argument is passed in memory. The attribute align is left as an optimizer hint - it still applies to pointer types only and pertains to the content of the pointer, whereas the alignment of the pointer itself is determined by the stackalign attribute. For byval arguments, the stackalign attribute assumes the role, previously perfomed by align, falling back to align if stackalign` is absent. On the clang side, when passing arguments using the "direct" style (cf. `ABIArgInfo::Kind`), now we can optionally specify an alignment, which is emitted as the new `stackalign` attribute. Patch by Momchil Velikov and Lucas Prates. Differential Revision: https://reviews.llvm.org/D98794	2021-04-15 22:58:14 +01:00
Martin Storsjö	8e0f2e89ff	[clang] [AArch64] Fix handling of HFAs passed to Windows variadic functions The documentation says that for variadic functions, all composites are treated similarly, no special handling of HFAs/HVAs, not even for the fixed arguments of a variadic function. Differential Revision: https://reviews.llvm.org/D100467	2021-04-15 22:21:27 +03:00
cchen	e0c2125d1d	[OpenMP] Added codegen for masked directive Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D100514	2021-04-15 12:55:07 -05:00
Thomas Lively	6a18cc23ef	[WebAssembly] Codegen for i64x2.extend_{low,high}_i32x4_{s,u} Removes the builtins and intrinsics used to opt in to using these instructions and replaces them with normal ISel patterns now that they are no longer prototypes. Differential Revision: https://reviews.llvm.org/D100402	2021-04-14 13:43:09 -07:00
Thomas Lively	af7925b4dd	[WebAssembly] Codegen for f64x2.convert_low_i32x4_{s,u} Add a custom DAG combine and ISD opcode for detecting patterns like (uint_to_fp (extract_subvector ...)) before the extract_subvector is expanded to ensure that they will ultimately lower to f64x2.convert_low_i32x4_{s,u} instructions. Since these instructions are no longer prototypes and can now be produced via standard IR, this commit also removes the target intrinsics and builtins that had been used to prototype the instructions. Differential Revision: https://reviews.llvm.org/D100425	2021-04-14 10:42:45 -07:00
Thomas Lively	af7ab81ce3	[WebAssembly] Use standard intrinsics for f32x4 and f64x2 ops Now that these instructions are no longer prototypes, we do not need to be careful about keeping them opt-in and can use the standard LLVM infrastructure for them. This commit removes the bespoke intrinsics we were using to represent these operations in favor of the corresponding target-independent intrinsics. The clang builtins are preserved because there is no standard way to easily represent these operations in C/C++. For consistency with the scalar codegen in the Wasm backend, the intrinsic used to represent {f32x4,f64x2}.nearest is @llvm.nearbyint even though @llvm.roundeven better captures the semantics of the underlying Wasm instruction. Replacing our use of @llvm.nearbyint with use of @llvm.roundeven is left to a potential future patch. Differential Revision: https://reviews.llvm.org/D100411	2021-04-14 09:19:27 -07:00
Martin Storsjö	3637c5c8ec	[clang] [AArch64] Fix Windows va_arg handling for larger structs Aggregate types over 16 bytes are passed by reference. Contrary to the x86_64 ABI, smaller structs with an odd (non power of two) are padded and passed in registers. Differential Revision: https://reviews.llvm.org/D100374	2021-04-14 14:51:53 +03:00
Liu, Chen3	1c4108ab66	[i386] Modify the alignment of __m128/__m256/__m512 vector type according i386 abi. According to i386 System V ABI: 1. when __m256 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 32 byte boundary at the time of the call. 2. when __m512 are required to be passed on the stack, the stack pointer must be aligned on a 0 mod 64 byte boundary at the time of the call. The current method of clang passing __m512 parameter are as follow: 1. when target supports avx512, passing it with 64 byte alignment; 2. when target supports avx, passing it with 32 byte alignment; 3. Otherwise, passing it with 16 byte alignment. Passing __m256 parameter are as follow: 1. when target supports avx or avx512, passing it with 32 byte alignment; 2. Otherwise, passing it with 16 byte alignment. This pach will passing __m128/__m256/__m512 following i386 System V ABI and apply it to Linux only since other System V OS (e.g Darwin, PS4 and FreeBSD) don't want to spend any effort dealing with the ramifications of ABI breaks at present. Differential Revision: https://reviews.llvm.org/D78564	2021-04-14 16:44:54 +08:00
Ben Dunbobbin	eae2d4b852	[Windows Itanium][PS4] handle dllimport/export w.r.t vtables/rtti The existing Windows Itanium patches for dllimport/export behaviour w.r.t vtables/rtti can't be adopted for PS4 due to backwards compatibility reasons (see comments on https://reviews.llvm.org/D90299). This commit adds our PS4 scheme for this to Clang. Differential Revision: https://reviews.llvm.org/D93203	2021-04-13 11:41:10 +01:00
Esme-Yi	dff922f39b	Reland [DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions."" This reverts commit `c965e14a12`.	2021-04-12 11:05:55 +00:00
Esme-Yi	c965e14a12	Revert "[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions." This reverts commit `62fa9b9388`.	2021-04-12 10:36:46 +00:00
Esme-Yi	62fa9b9388	[DebugInfo] Fix the mismatching between C++ language tags and Dwarf versions. Summary: The tags DW_LANG_C_plus_plus_14 and DW_LANG_C_plus_plus_11, introduced in Dwarf-5, are unexpected in previous versions. Fixing the mismathing doesn't have any drawbacks for any other debuggers, but helps dbx. Reviewed By: aprantl, shchenz Differential Revision: https://reviews.llvm.org/D99250	2021-04-12 07:42:54 +00:00
yifeng.dongyifeng	3a6a80b641	[Clang][Coroutine][DebugInfo] In c++ coroutine, clang will emit different debug info variables for parameters and move-parameters. The first one is the real parameters of the coroutine function, the other one just for copying parameters to the coroutine frame. Considering the following c++ code: ``` struct coro { ... }; coro foo(struct test & t) { ... co_await suspend_always(); ... co_await suspend_always(); ... co_await suspend_always(); } int main(int argc, char *argv[]) { auto c = foo(...); c.handle.resume(); ... } ``` Function foo is the standard coroutine function, and it has only one parameter named t (ignoring this at first), when we use the llvm code to compile this function, we can get the following ir: ``` !2921 = distinct !DISubprogram(name: "foo", linkageName: "_ZN6Object3fooE4test", scope: !2211, file: !45, li\ ne: 48, type: !2329, scopeLine: 48, flags: DIFlagPrototyped \| DIFlagAllCallsDescribed, spFlags: DISPFlagDefi\ nition \| DISPFlagOptimized, unit: !44, declaration: !2328, retainedNodes: !2922) !2924 = !DILocalVariable(name: "t", arg: 2, scope: !2921, file: !45, line: 48, type: !838) ... !2926 = !DILocalVariable(name: "t", scope: !2921, type: !838, flags: DIFlagArtificial) ``` We can find there are two `the same` DIVariable named t in the same dwarf scope for foo.resume. And when we try to use llvm-dwarfdump to dump the dwarf info of this elf, we get the following output: ``` 0x00006684: DW_TAG_subprogram DW_AT_low_pc (0x00000000004013a0) DW_AT_high_pc (0x00000000004013a8) DW_AT_frame_base (DW_OP_reg7 RSP) DW_AT_object_pointer (0x0000669c) DW_AT_GNU_all_call_sites (true) DW_AT_specification (0x00005b5c "_ZN6Object3fooE4test") 0x000066a5: DW_TAG_formal_parameter DW_AT_name ("t") DW_AT_decl_file ("/disk1/yifeng.dongyifeng/my_code/llvm/build/bin/coro-debug-1.cpp") DW_AT_decl_line (48) DW_AT_type (0x00004146 "test") 0x000066ba: DW_TAG_variable DW_AT_name ("t") DW_AT_type (0x00004146 "test") DW_AT_artificial (true) ``` The elf also has two 't' in the same scope. But unluckily, it might let the debugger confused. And failed to print parameters for O0 or above. This patch will make coroutine parameters and move parameters use the same DIVar and try to fix the problems that I mentioned before. Test Plan: check-clang Reviewed By: aprantl, jmorse Differential Revision: https://reviews.llvm.org/D97533	2021-04-12 11:10:47 +08:00
Saurabh Jha	71ab6c98a0	[Matrix] Implement C-style explicit type conversions for matrix types. This implements C-style type conversions for matrix types, as specified in clang/docs/MatrixTypes.rst. Fixes PR47141. Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D99037	2021-04-10 11:48:41 +01:00
Roman Lebedev	6270b3a1ea	Temporairly revert "[CGCall] Annotate `this` argument with alignment" As per @jyknight, "It seems like there's a bug with vtable thunks getting the wrong information." See https://reviews.llvm.org/D99790#2680857, https://godbolt.org/z/MxhYMe1q7 This reverts commit `0aa0458f14`.	2021-04-10 10:43:16 +03:00
Ben Shi	4f173c0c42	[clang][AVR] Support variable decorator '__flash' Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D96853	2021-04-10 11:23:55 +08:00
cchen	1a43fd2769	[OpenMP51] Initial support for masked directive and filter clause Adds basic parsing/sema/serialization support for the #pragma omp masked directive. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D99995	2021-04-09 14:00:36 -05:00
Soumi Manna	5d7cb79416	RISCVABIInfo::classifyArgumentType: Fix static analyzer warnings with uninitialized variables warnings - NFCI Differential Revision: https://reviews.llvm.org/D100172	2021-04-09 15:23:32 +01:00
Yaxun (Sam) Liu	25942d7c49	[AMDGPU] Allow relaxed/consume memory order for atomic inc/dec Reviewed by: Jon Chesterfield Differential Revision: https://reviews.llvm.org/D100144	2021-04-09 09:23:41 -04:00
David Blaikie	eb8a28e2cf	DebugInfo: Include inline namespaces in template specialization parameter names This ensures these types have distinct names if they are distinct types (eg: if one is an instantiation with a type in one inline namespace, and another from a type with the same simple name, but in a different inline namespace).	2021-04-08 17:37:55 -07:00
Xiangling Liao	d508561798	[AIX] Support init priority attribute Differential Revision: https://reviews.llvm.org/D99291	2021-04-08 15:40:09 -04:00
Dávid Bolvanský	2cb8c10342	Revert "Reduce the number of attributes attached to each function" This reverts commit `053dc95839`. It causes perf regressions - see discussion in D97116.	2021-04-08 17:28:57 +02:00
Saurabh Jha	7d8513b7f2	[clang] Move int <-> float scalar conversion to a separate function As prelude to this patch https://reviews.llvm.org/D99037, we want to move the int-float conversion into a separate function that can be reused by matrix cast Differential Revision: https://reviews.llvm.org/D100051	2021-04-07 12:34:16 -07:00
Roman Lebedev	0aa0458f14	[CGCall] Annotate `this` argument with alignment As it is being noted in D99249, lack of alignment information on `this` has been preventing LICM from happening. For some time now, lack of alignment attribute does not imply natural alignment, but an alignment of `1`. Also, we used to treat dereferenceable as implying alignment, but we no longer do, so it's a bugfix. Differential Revision: https://reviews.llvm.org/D99790	2021-04-07 11:02:01 +03:00
Yaxun (Sam) Liu	61d065e21f	Let clang atomic builtins fetch add/sub support floating point types Recently atomicrmw started to support fadd/fsub: https://reviews.llvm.org/D53965 However clang atomic builtins fetch add/sub still does not support emitting atomicrmw fadd/fsub. This patch adds that. Reviewed by: John McCall, Artem Belevich, Matt Arsenault, JF Bastien, James Y Knight, Louis Dionne, Olivier Giroux Differential Revision: https://reviews.llvm.org/D71726	2021-04-06 15:44:00 -04:00
Simon Pilgrim	2901dc7575	Don't directly dereference getAs<> casts to avoid potential null dereferences. NFCI. Replace with castAs<> which asserts the cast is valid. Fixes a number of static analyzer warnings.	2021-04-06 12:24:19 +01:00
Yevgeny Rouban	39e3e3aa51	[NewPM] Redesign of PreserveCFG Checker The reason for the NewPM redesign is described in the commit cba3e783389a: [NewPM] Disable PreservedCFGChecker ... The checker introduces an internal custom CFG analysis that tracks current up-to date CFG snapshot. The analysis is invalidated along any other CFG related analysis (the key is CFGAnalyses). If the CFG analysis is not invalidated at a functional pass exit then the checker asserts that the CFG snapshot taken from this analysis is equals to a snapshot of the current CFG. Along the way: - the function CFG::printDiff() is simplified by removing function name calculation. The name is printed by the caller; - fixed CFG invalidated condition (see CFG::invalidate()); - StandardInstrumentations::registerCallbacks() gets additional optional parameter of type FunctionAnalysisManager*, which is needed by the checker to get the custom CFG analysis; - several PM related tests updated to explicitly set -verify-cfg-preserved=1 as they need. This patch is safe to land as the CFGChecker is left switched off (the options -verify-cfg-preserved is false by default). It will be switched on by a separate patch to minimize possible reverts. Reviewed By: skatkov, kuhar Differential Revision: https://reviews.llvm.org/D91327	2021-04-06 12:35:49 +07:00
Jennifer Yu	7078ef4722	[OPENMP51]Initial support for nocontext clause. Added basic parsing/sema/serialization support for the 'nocontext' clause. Differential Revision: https://reviews.llvm.org/D99848	2021-04-05 11:45:49 -07:00
Craig Topper	b4f2e80600	[RISCV] Refactor conversion of B extensions to IR intrinsics a little to reduce clang binary size. These all pass 1 type to getIntrinsic. So rather than assigning IntrinsicTypes for each builtin which invokes the SmallVector constructor, just select the intrinsic ID with a switch and share a single assignment of IntrinsicTypes.	2021-04-02 23:49:44 -07:00
Jennifer Yu	0fe8af9468	Fix build bot problem with missing OMPC_novariants in switch.	2021-04-02 13:58:39 -07:00
Levy Hsu	f78d932cf2	[RISCV] Add IR intrinsics for Zbc extension Head files are included in a separate patch in case the name needs to be changed. RV32 / 64: clmul clmulh clmulr Differential Revision: https://reviews.llvm.org/D99711	2021-04-02 12:09:13 -07:00
Levy Hsu	944adbf285	Recommit "[RISCV] Add IR intrinsic for Zbb extension" Forgot to amend the Author. Original commit message: Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b Differential Revision: https://reviews.llvm.org/D99320	2021-04-02 11:50:19 -07:00
Craig Topper	1f0b309f24	Revert "[RISCV] Add IR intrinsic for Zbb extension" This reverts commit `1808194590`. I forgot to change the author.	2021-04-02 11:47:02 -07:00
Craig Topper	1808194590	[RISCV] Add IR intrinsic for Zbb extension Header files are included in a separate patch in case the name needs to be changed. RV32 / 64: orc.b	2021-04-02 11:23:57 -07:00
Levy Hsu	b001d574d7	[RISCV] Add IR intrinsic for Zbr extension Implementation for RISC-V Zbr extension intrinsic. Header files are included in separate patch in case the name needs to be changed RV32 / 64: crc32b crc32h crc32w crc32cb crc32ch crc32cw RV64 Only: crc32d crc32cd Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D99009	2021-04-02 10:58:45 -07:00
Joseph Huber	69ca50bd7d	[OpenMP] Pass mapping names to add components in a user defined mapper Summary: Currently the mapping names are not passed to the mapper components that set up the array region. This means array mappings will not have their names availible in the runtime. This patch fixes this by passing the argument name to the region correctly. This means that the mapped variable's name will be the declared mapper that placed it on the device. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D99681	2021-04-01 15:51:03 -04:00
Raphael Isemann	60854c328d	Avoid calling ParseCommandLineOptions in BackendUtil if possible Calling `ParseCommandLineOptions` should only be called from `main` as the CommandLine setup code isn't thread-safe. As BackendUtil is part of the generic Clang FrontendAction logic, a process which has several threads executing Clang FrontendActions will randomly crash in the unsafe setup code. This patch avoids calling the function unless either the debug-pass option or limit-float-precision option is set. Without these two options set the `ParseCommandLineOptions` call doesn't do anything beside parsing the command line `clang` which doesn't set any options. See also D99652 where LLDB received a workaround for this crash. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D99740	2021-04-01 19:41:16 +02:00
Alexey Bataev	a28e835e94	[OPENMP]Fix PR48885: Crash in passing firstprivate args to tasks on Apple M1. Need to bitcast the function pointer passed as a parameter to the real type to avoid possible problem with calling conventions. Differential Revision: https://reviews.llvm.org/D99521	2021-03-31 13:00:58 -07:00
Alexey Bataev	66da4f6fc9	[OPENMP]Fix PR48658: [OpenMP 5.0] Compiler crash when OpenMP atomic sync hints used. No need to consider hint clause kind as the main atomic clause kind at the codegen. Differential Revision: https://reviews.llvm.org/D99611	2021-03-31 12:58:24 -07:00
Thomas Lively	45783d0e8a	[WebAssembly] Implement i64x2 comparisons Removes the prototype builtin and intrinsic for i64x2.eq and implements that instruction as well as the other i64x2 comparison instructions in the final SIMD spec. Unsigned comparisons were not included in the final spec, so they still need to be scalarized via a custom lowering. Differential Revision: https://reviews.llvm.org/D99623	2021-03-31 10:46:17 -07:00
Wei Mi	d535a05ca1	[ThinLTO] During module importing, close one source module before open another one for distributed mode. Currently during module importing, ThinLTO opens all the source modules, collect functions to be imported and append them to the destination module, then leave all the modules open through out the lto backend pipeline. This patch refactors it in the way that one source module will be closed before another source module is opened. All the source modules will be closed after importing phase is done. It will save some amount of memory when there are many source modules to be imported. Note that this patch only changes the distributed thinlto mode. For in process thinlto mode, one source module is shared acorss different thinlto backend threads so it is not changed in this patch. Differential Revision: https://reviews.llvm.org/D99554	2021-03-30 14:37:29 -07:00
Mike Rice	b7899ba0e8	[OPENMP51]Initial support for the dispatch directive. Added basic parsing/sema/serialization support for dispatch directive. Differential Revision: https://reviews.llvm.org/D99537	2021-03-30 14:12:53 -07:00
Alexey Bataev	e2c7bf08cc	[OPENMP]Fix PR48607: Crash during clang openmp codegen for firstprivate() of `float _Complex`. Need to cast the argument for the debug wrapper function call to the corresponding parameter type to avoid crash. Differential Revision: https://reviews.llvm.org/D99617	2021-03-30 13:39:45 -07:00
Alexey Bataev	1696b8ae96	[OPENMP]Fix PR48740: OpenMP declare reduction in C does not require an initializer If no initializer-clause is specified, the private variables will be initialized following the rules for initialization of objects with static storage duration. Need to adjust the implementation to the current version of the standard. Differential Revision: https://reviews.llvm.org/D99539	2021-03-30 05:38:20 -07:00
Raphael Isemann	1cbba533ec	[ObjC][CodeGen] Fix missing debug info in situations where an instance and class property have the same identifier Since the introduction of class properties in Objective-C it is possible to declare a class and an instance property with the same identifier in an interface/protocol. Right now Clang just generates debug information for whatever property comes first in the source file. The second property is ignored as it's filtered out by the set of already emitted properties (which is just using the identifier of the property to check for equivalence). I don't think generating debug info in this case was never supported as the identifier filter is in place since `7123bca7fb` (which precedes the introduction of class properties). This patch expands the filter to take in account identifier + whether the property is class/instance. This ensures that both properties are emitted in this special situation. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D99512	2021-03-30 11:07:16 +02:00
Johannes Doerfert	03cc8a1ba0	[OpenMP][NFC] Move the `noinline` to the parallel entry point The `noinline` for non-SPMD parallel functions is probably not necessary but as long as we use it we should put it on the outermost parallel function, which is the wrapper, not the actual outlined function. Resolves PR49752 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D99506	2021-03-30 01:12:45 -05:00
Alexey Bataev	0411b23319	[OPENMP]Map data field with l-value reference types. Added initial support dfor the mapping of the data members with l-value reference types. Differential Revision: https://reviews.llvm.org/D98812	2021-03-29 07:07:09 -07:00
Alexey Bataev	f6f21dcd6c	[OPENMP]Fix PR49636: Assertion `(!Entry.getAddress() \|\| Entry.getAddress() == Addr) && "Resetting with the new address."' failed. The original issue is caused by the fact that the variable is allocated with incorrect type i1 instead of i8. This causes the bitcasting of the declaration to i8 type and the bitcast expression does not match the original variable. To fix the problem, the UndefValue initializer and the original variable should be emitted with type i8, not i1. Differential Revision: https://reviews.llvm.org/D99297	2021-03-29 06:55:57 -07:00
Alexey Bataev	dcf96178cb	[OPENMP]Fix PR49052: Clang crashed when compiling target code with assert(0). Need to insert a basic block during generation of the target region to avoid crash for the GPU to be able always calling a cleanup action. This cleanup action is required for the correct emission of the target region for the GPU. Differential Revision: https://reviews.llvm.org/D99445	2021-03-29 06:36:06 -07:00
Matt Arsenault	9a0c9402fa	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `07e46367ba`.	2021-03-29 08:55:30 -04:00
Oliver Stannard	07e46367ba	Revert "Reapply "OpaquePtr: Turn inalloca into a type attribute"" Reverting because test 'Bindings/Go/go.test' is failing on most buildbots. This reverts commit `fc9df30991`.	2021-03-29 11:32:22 +01:00
Matt Arsenault	fc9df30991	Reapply "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `20d5c42e0e`.	2021-03-28 13:35:21 -04:00
Nico Weber	20d5c42e0e	Revert "OpaquePtr: Turn inalloca into a type attribute" This reverts commit `4fefed6563`. Broke check-clang everywhere.	2021-03-28 13:02:52 -04:00
Matt Arsenault	4fefed6563	OpaquePtr: Turn inalloca into a type attribute I think byval/sret and the others are close to being able to rip out the code to support the missing type case. A lot of this code is shared with inalloca, so catch this up to the others so that can happen.	2021-03-28 11:12:23 -04:00
Xun Li	f490a5969b	[OpenMP][InstrProfiling] Fix a missing instr profiling counter When emitting a function body there needs to be a instr profiling counter emitted. Otherwise instr profiling won't work for this function. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D98135	2021-03-25 13:52:36 -07:00
Xun Li	c7a39c833a	[Coroutine][Clang] Force emit lifetime intrinsics for Coroutines tl;dr Correct implementation of Corouintes requires having lifetime intrinsics available. Coroutine functions are functions that can be suspended and resumed latter. To do so, data that need to stay alive after suspension must be put on the heap (i.e. the coroutine frame). The optimizer is responsible for analyzing each AllocaInst and figure out whether it should be put on the stack or the frame. In most cases, for data that we are unable to accurately analyze lifetime, we can just conservatively put them on the heap. Unfortunately, there exists a few cases where certain data MUST be put on the stack, not on the heap. Without lifetime intrinsics, we are unable to correctly analyze those data's lifetime. To dig into more details, there exists cases where at certain code points, the current coroutine frame may have already been destroyed. Hence no frame access would be allowed beyond that point. The following is a common code pattern called "Symmetric Transfer" in coroutine: ``` auto tmp = await_suspend(); __builtin_coro_resume(tmp.address()); return; ``` In the above code example, `await_suspend()` returns a new coroutine handle, which we will obtain the address and then resume that coroutine. This essentially "transfered" from the current coroutine to a different coroutine. During the call to `await_suspend()`, the current coroutine may be destroyed, which should be fine because we are not accessing any data afterwards. However when LLVM is emitting IR for the above code, it needs to emit an AllocaInst for `tmp`. It will then call the `address` function on tmp. `address` function is a member function of coroutine, and there is no way for the LLVM optimizer to know that it does not capture the `tmp` pointer. So when the optimizer looks at it, it has to conservatively assume that `tmp` may escape and hence put it on the heap. Furthermore, in some cases `address` call would be inlined, which will generate a bunch of store/load instructions that move the `tmp` pointer around. Those stores will also make the compiler to think that `tmp` might escape. To summarize, it's really difficult for the mid-end to figure out that the `tmp` data is short-lived. I made some attempt in D98638, but it appears to be way too complex and is basically doing the same thing as inserting lifetime intrinsics in coroutines. Also, for reference, we already force emitting lifetime intrinsics in O0 for AlwaysInliner: https://github.com/llvm/llvm-project/blob/main/llvm/lib/Passes/PassBuilder.cpp#L1893 Differential Revision: https://reviews.llvm.org/D99227	2021-03-25 13:46:20 -07:00
Yaxun (Sam) Liu	cc9477166a	[CUDA][HIP] add __builtin_get_device_side_mangled_name Add builtin function __builtin_get_device_side_mangled_name to get device side manged name for functions and global variables, which can be used to get symbol address of kernels or variables by mangled name in dynamically loaded bundled code objects at run time. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D99301	2021-03-25 15:25:29 -04:00
Djordje Todorovic	8420a53324	[Debugify] Expose original debug info preservation check as CC1 option In order to test the preservation of the original Debug Info metadata in your projects, a front end option could be very useful, since users usually report that a concrete entity (e.g. variable x, or function fn2()) is missing debug info. The [0] is an example of running the utility on GDB Project. This depends on: D82546 and D82545. Differential Revision: https://reviews.llvm.org/D82547	2021-03-25 05:29:42 -07:00
Alexey Bataev	7654bb6303	[OPENMP]Fix PR48571: critical/master in outlined contexts cause crash. If emit inlined region for master/critical directives, no need to clear lambda/block context data, otherwise the variables cannot be found and it causes a crash at compile time. Differential Revision: https://reviews.llvm.org/D99280	2021-03-24 10:15:24 -07:00
Nemanja Ivanovic	4020932706	[PowerPC] Make altivec.h work with AIX which has no __int128 There are a number of functions in altivec.h that use vector __int128 which isn't supported on AIX. Those functions need to be guarded for targets that don't support the type. Furthermore, the functions that produce quadword instructions without using the type need a builtin. This patch adds the macro guards to altivec.h using the __SIZEOF_INT128__ which is only defined on targets that support the __int128 type.	2021-03-24 00:35:51 -05:00
Bruno Cardoso Lopes	431e3138a1	[CGAtomic] Lift stronger requirements on cmpxch and support acquire failure mode - Fix `emitAtomicCmpXchgFailureSet` to support release/acquire (succ/fail) memory order. - Remove stronger checks for cmpxch. Effectively, this addresses http://wg21.link/p0418 Differential Revision: https://reviews.llvm.org/D98995	2021-03-23 16:45:37 -07:00
Bradley Smith	48f5a392cb	[IR] Add vscale_range IR function attribute This attribute represents the minimum and maximum values vscale can take. For now this attribute is not hooked up to anything during codegen, this will be added in the future when such codegen is considered stable. Additionally hook up the -msve-vector-bits=<x> clang option to emit this attribute. Differential Revision: https://reviews.llvm.org/D98030	2021-03-22 12:05:06 +00:00
Roman Lebedev	be87321280	[clang][Codegen] EmitBranchOnBoolExpr(): emit prof branch counts even at -O0 This restores the original behaviour before i unadvertedly broke it in `e3a4701627` and clang/test/Profile/ caught it.	2021-03-21 23:24:27 +03:00
Roman Lebedev	e3a4701627	[clang][CodeGen] Lower Likelihood attributes to @llvm.expect intrin instead of branch weights `08196e0b2e` exposed LowerExpectIntrinsic's internal implementation detail in the form of LikelyBranchWeight/UnlikelyBranchWeight options to the outside. While this isn't incorrect from the results viewpoint, this is suboptimal from the layering viewpoint, and causes confusion - should transforms also use those weights, or should they use something else, D98898? So go back to status quo by making LikelyBranchWeight/UnlikelyBranchWeight internal again, and fixing all the code that used it directly, which currently is only clang codegen, thankfully, to emit proper @llvm.expect intrinsics instead.	2021-03-21 22:50:21 +03:00
Roman Lebedev	37d6be9052	Revert "[BranchProbability] move options for 'likely' and 'unlikely'" Upon reviewing D98898 i've come to realization that these are implementation detail of LowerExpectIntrinsicPass, and they should not be exposed to outside of it. This reverts commit `ee8b53815d`.	2021-03-21 22:50:21 +03:00
Sanjay Patel	ee8b53815d	[BranchProbability] move options for 'likely' and 'unlikely' This makes the settings available for use in other passes by housing them within the Support lib, but NFC otherwise. See D98898 for the proposed usage in SimplifyCFG (where this change was originally included). Differential Revision: https://reviews.llvm.org/D98945	2021-03-20 14:46:46 -04:00
Benjamin Kramer	19d2c65ddd	[CodeGen] Don't crash on for loops with cond variables and no increment This looks like an oversight from `a875721d8a`, creating IR that refers to `for.inc` even if it doesn't exist. Differential Revision: https://reviews.llvm.org/D98980	2021-03-19 20:43:52 +01:00
Hongtao Yu	fc1812a0ad	[UniqueLinkageName] Use consistent checks when mangling symbo linkage name and debug linkage name. C functions may be declared and defined in different prototypes like below. This patch unifies the checks for mangling names in symbol linkage name emission and debug linkage name emission so that the two names are consistent. static int go(int); static int go(a) int a; { return a; } Test Plan: Differential Revision: https://reviews.llvm.org/D98799	2021-03-18 22:11:16 -07:00
Thomas Lively	f5764a8654	[WebAssembly] Finalize SIMD names and opcodes Updates the names (e.g. widen => extend, saturate => sat) and opcodes of all SIMD instructions to match the finalized SIMD spec. Deliberately does not change the public interface in wasm_simd128.h yet; that will require more care. Depends on D98466. Differential Revision: https://reviews.llvm.org/D98676	2021-03-18 11:21:25 -07:00
Thomas Lively	2f2ae08da9	[WebAssembly] Remove experimental SIMD instructions Removes the instruction definitions, intrinsics, and builtins for qfma/qfms, signselect, and prefetch instructions, which were not included in the final WebAssembly SIMD spec. Depends on D98457. Differential Revision: https://reviews.llvm.org/D98466	2021-03-18 11:21:24 -07:00
Alex Lorenz	d672d5219a	Revert "[CodeGenModule] Set dso_local for Mach-O GlobalValue" This reverts commit `809a1e0ffd`. Mach-O doesn't support dso_local and this change broke XNU because of the use of dso_local. Differential Revision: https://reviews.llvm.org/D98458	2021-03-17 17:27:41 -07:00
Richard Smith	a875721d8a	PR49585: Emit the jump destination for a for loop 'continue' from within the scope of the condition variable. The condition variable is in scope in the loop increment, so we need to emit the jump destination from wthin the scope of the condition variable. For GCC compatibility (and compatibility with real-world 'FOR_EACH' macros), 'continue' is permitted in a statement expression within the condition of a for loop, though, so there are two cases here: * If the for loop has no condition variable, we can emit the jump destination before emitting the condition. * If the for loop has a condition variable, we must defer emitting the jump destination until after emitting the variable. We diagnose a 'continue' appearing in the initializer of the condition variable, because it would jump past the initializer into the scope of that variable. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D98816	2021-03-17 16:24:04 -07:00
Mike Rice	410f09af09	[OPENMP51]Initial support for the interop directive. Added basic parsing/sema/serialization support for interop directive. Support for the 'init' clause. Differential Revision: https://reviews.llvm.org/D98558	2021-03-17 09:42:07 -07:00
Bradley Smith	cf0da91ba5	[AArch64][SVE/NEON] Add support for FROUNDEVEN for both NEON and fixed length SVE Previously NEON used a target specific intrinsic for frintn, given that the FROUNDEVEN ISD node now exists, move over to that instead and add codegen support for that node for both NEON and fixed length SVE. Differential Revision: https://reviews.llvm.org/D98487	2021-03-17 11:41:22 +00:00
Vassil Vassilev	0cb7e7ca0c	Make iteration over the DeclContext::lookup_result safe. The idiom: ``` DeclContext::lookup_result R = DeclContext::lookup(Name); for (auto D : R) {...} ``` is not safe when in the loop body we trigger deserialization from an AST file. The deserialization can insert new declarations in the StoredDeclsList whose underlying type is a vector. When the vector decides to reallocate its storage the pointer we hold becomes invalid. This patch replaces a SmallVector with an singly-linked list. The current approach stores a SmallVector<NamedDecl, 4> which is around 8 pointers. The linked list is 3, 5, or 7. We do better in terms of memory usage for small cases (and worse in terms of locality -- the linked list entries won't be near each other, but will be near their corresponding declarations, and we were going to fetch those memory pages anyway). For larger cases: the vector uses a doubling strategy for reallocation, so will generally be between half-full and full. Let's say it's 75% full on average, so there's N * 4/3 + 4 pointers' worth of space allocated currently and will be 2N pointers with the linked list. So we break even when there are N=6 entries and slightly lose in terms of memory usage after that. We suspect that's still a win on average. Thanks to @rsmith! Differential revision: https://reviews.llvm.org/D91524	2021-03-17 08:59:04 +00:00
Luke Drummond	440f6bdf34	[OpenCL][NFCI] Prefer CodeGenFunction::EmitRuntimeCall `CodeGenFunction::EmitRuntimeCall` automatically sets the right calling convention for the callee so we can avoid setting it ourselves. As requested in https://reviews.llvm.org/D98411 Reviewed by: anastasia Differential Revision: https://reviews.llvm.org/D98705	2021-03-16 16:22:19 +00:00
Amy Huang	f5352dd9da	Emit inline implementation of __builtin__wmemchr on MSVCRT platforms. The MSVC runtime library doesn't have a definition for wmemchr, so provide an inline implementation. Differential Revision: https://reviews.llvm.org/D98472	2021-03-15 15:30:55 -07:00
Jonas Paulsson	9cfd301ec8	[SystemZ] Test for isinf and isfinite in testFPKind(). Recognize BI__builtin_isinf and BI__builtin_isfinite (and a few other opcodes for finite) in testFPKind() and handle with TDC. Review: Ulrich Weigand. Differential Revision: https://reviews.llvm.org/D97901	2021-03-15 15:02:39 -06:00
Stelios Ioannou	ab86edbc88	[AArch64] Implement __rndr, __rndrrs intrinsics This patch implements the __rndr and __rndrrs intrinsics to provide access to the random number instructions introduced in Armv8.5-A. They are only defined for the AArch64 execution state and are available when __ARM_FEATURE_RNG is defined. These intrinsics store the random number in their pointer argument and return a status code if the generation succeeded. The difference between __rndr __rndrrs, is that the latter intrinsic reseeds the random number generator. The instructions write the NZCV flags indicating the success of the operation that we can then read with a CSET. [1] https://developer.arm.com/docs/101028/latest/data-processing-intrinsics [2] https://bugs.llvm.org/show_bug.cgi?id=47838 Differential Revision: https://reviews.llvm.org/D98264 Change-Id: I8f92e7bf5b450e5da3e59943b53482edf0df6efc	2021-03-15 17:51:48 +00:00
Luke Drummond	fcfd3fda71	[OpenCL] Respect calling convention for builtin `__translate_sampler_initializer` has a calling convention of `spir_func`, but clang generated calls to it using the default CC. Instruction Combining was lowering these mismatching calling conventions to `store i1* undef` which itself was subsequently lowered to a trap instruction by simplifyCFG resulting in runtime `SIGILL` There are arguably two bugs here: but whether there's any wisdom in converting an obviously invalid call into a runtime crash over aborting with a sensible error message will require further discussion. So for now it's enough to set the right calling convention on the runtime helper. Reviewed By: svenh, bader Differential Revision: https://reviews.llvm.org/D98411	2021-03-15 17:26:51 +00:00
Thomas Preud'homme	f60b35340f	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-15 15:38:08 +00:00
xling-Liao	0bf2da53c1	[NFC] Adjust SmallVector.h header to workaround XL build compiler issue In order to prevent further building issues related to the usage of SmallVector in other compilation unit, this patch adjusts the llvm.h header as a workaround instead. Besides, this patch reverts previous workarounds: 1. Revert "[NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX" This reverts commit `561fb7f60a`. 2.Revert "[clang][cli] Fix build failure in CompilerInvocation" This reverts commit `8dc70bdcd0`. Differential Revision: https://reviews.llvm.org/D98552	2021-03-12 21:41:36 -06:00
Amy Huang	d7cd208f08	[DebugInfo] Add an attribute to force type info to be emitted for types that are required to be complete. This was motivated by the fact that constructor type homing (debug info optimization that we want to turn on by default) drops some libc++ types, so an attribute would allow us to override constructor homing and emit them anyway. I'm currently looking into the particular libc++ issue, but even if we do fix that, this issue might come up elsewhere and it might be nice to have this. As I've implemented it now, the attribute isn't specific to the constructor homing optimization and overrides all of the debug info optimizations. Open to discussion about naming, specifics on what the attribute should do, etc. Differential Revision: https://reviews.llvm.org/D97411	2021-03-12 12:30:01 -08:00
Nikita Popov	42eb658f65	[OpaquePtrs] Remove some uses of type-less CreateGEP() (NFC) This removes some (but not all) uses of type-less CreateGEP() and CreateInBoundsGEP() APIs, which are incompatible with opaque pointers. There are a still a number of tricky uses left, as well as many more variation APIs for CreateGEP.	2021-03-12 21:01:16 +01:00
Simon Pilgrim	731b3d7664	[clang] Use Constant::getAllOnesValue helper. NFCI. Avoid -1ULL which MSVC tends to complain about	2021-03-12 15:16:36 +00:00
Sriraman Tallam	cdb42a4cc4	Disable unique linkage suffixes ifor global vars until demanglers can be fixed. D96109 added support for unique internal linkage names for both internal linkage functions and global variables. There was a lot of discussion on how to get the demangling right for functions but I completely missed the point that demanglers do not support suffixes for global vars. For example: $ c++filt _ZL3foo foo $ c++filt _ZL3foo.uniq.123 _ZL3foo.uniq.123 The demangling for functions works as expected. I am not sure of the impact of this. I don't understand how debuggers and other tools depend on the correctness of global variable demangling so I am pre-emptively disabling it until we can get the demangling support added. Importantly, uniquefying global variables is not needed right now as we do not do profile attribution to global vars based on sampling. It was added for completeness and so this feature is not exactly missed. Differential Revision: https://reviews.llvm.org/D98392	2021-03-11 20:59:30 -08:00
David Blaikie	bd2bdad19e	void cast to suppress -Wunused-variable in non-asserts build	2021-03-11 17:51:31 -08:00
Florian Hahn	c92ec0dd92	[Matrix] Add support for matrix-by-scalar division. This patch extends the matrix spec to allow matrix-by-scalar division. Originally support for `/` was left out to avoid ambiguity for the matrix-matrix version of `/`, which could either be elementwise or specified as matrix multiplication M1 * (1/M2). For the matrix-scalar version, no ambiguity exists; `*` is also an elementwise operation in that case. Matrix-by-scalar division is commonly supported by systems including Matlab, Mathematica or NumPy. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97857	2021-03-11 22:21:23 +00:00
Joseph Huber	807466ef28	[OpenMP] Restore backwards compatibility for libomptarget Summary: The changes introduced in D87946 changed the API for libomptarget functions. `__kmpc_push_target_tripcount` was a function in Clang 11.x but was not given a backward-compatible interface. This change will require people using Clang 13.x or 12.x to recompile their offloading programs. Reviewed By: jdoerfert cchen Differential Revision: https://reviews.llvm.org/D98358	2021-03-11 09:52:11 -05:00
Nikita Popov	46354bac76	[OpaquePtrs] Remove some uses of type-less CreateLoad APIs (NFC) Explicitly pass loaded type when creating loads, in preparation for the deprecation of these APIs. There are still a couple of uses left.	2021-03-11 14:40:57 +01:00
Nikita Popov	68e01339cc	[CGBuilder] Remove type-less CreateAlignedLoad() APIs (NFC) These are incompatible with opaque pointers. This is in preparation of dropping this API on the IRBuilder side as well. Instead explicitly pass the loaded type.	2021-03-11 10:41:23 +01:00
Olivier Goffart	5baea05601	[SEH] Fix capture of this in lambda functions Commit `1b04bdc2f3` added support for capturing the 'this' pointer in a SEH context (__finally or __except), But the case in which the 'this' pointer is part of a lambda capture was not handled properly Differential Revision: https://reviews.llvm.org/D97687	2021-03-11 09:12:42 +01:00
Zakk Chen	d6a0560bf2	[Clang][RISCV] Add custom TableGen backend for riscv-vector intrinsics. Demonstrate how to generate vadd/vfadd intrinsic functions 1. add -gen-riscv-vector-builtins for clang builtins. 2. add -gen-riscv-vector-builtin-codegen for clang codegen. 3. add -gen-riscv-vector-header for riscv_vector.h. It also generates ifdef directives with extension checking, base on D94403. 4. add -gen-riscv-vector-generic-header for riscv_vector_generic.h. Generate overloading version Header for generic api. https://github.com/riscv/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md#c11-generic-interface 5. update tblgen doc for riscv related options. riscv_vector.td also defines some unused type transformers for vadd, because I think it could demonstrate how tranfer type work and we need them for the whole intrinsic functions implementation in the future. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Zakk Chen <zakk.chen@sifive.com> Reviewed By: jrtc27, craig.topper, HsiangKai, Jim, Paul-C-Anagnostopoulos Differential Revision: https://reviews.llvm.org/D95016	2021-03-10 18:43:43 -08:00
Arthur Eubanks	c8227f06b3	[clang] Don't assert in EmitAggregateCopy on trivial_abi types Fixes PR42961. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D97872	2021-03-10 10:16:06 -08:00
Jingu Kang	25951c5ab8	[AArch64] Add missing intrinsics for scalar FP rounding Differential Revision: https://reviews.llvm.org/D98269	2021-03-10 13:22:29 +00:00
serge-sans-paille	ea8e5b87ac	[NFC] Remove duplicate isNoBuiltinFunc method It's available both in CodeGenOptions and in LangOptions, and LangOptions implementation is slightly better as it uses a StringRef instead of a char pointer, so use it. Differential Revision: https://reviews.llvm.org/D98175	2021-03-10 09:18:55 +01:00
Xiangling Liao	561fb7f60a	[NFC] Use llvm::SmallVector to workaround XL compiler problem on AIX LLVM is recommending to use SmallVector (that is, omitting the N), in the absence of a well-motivated choice for the number of inlined elements N. However, this doesn't work well with XL compiler on AIX since some header(s) aren't properly picked up with it. We need to take a further look into the real issue underneath and fix it in a later patch. But currently we'd like to use this patch to unblock the build compiler issue first. Differential Revision: https://reviews.llvm.org/D98265	2021-03-09 13:03:52 -05:00
diggerlin	46d4d1fea4	[AIX] do not emit visibility attribute into IR when there is -mignore-xcoff-visibility SUMMARY: n the patch https://reviews.llvm.org/D87451 "add new option -mignore-xcoff-visibility" we did as "The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR." in these patch we let -mignore-xcoff-visibility effect on generating IR too. the new feature only work on AIX OS Reviewer: Jason Liu, Differential Revision: https://reviews.llvm.org/D89986	2021-03-09 10:38:00 -05:00
Akira Hatanaka	dca5737945	Move ObjCARCUtil.h back to llvm/Analysis Instead of adding the header to llvm/IR, just duplicate the marker string in the auto upgrader.	2021-03-08 16:35:24 -08:00
Min-Yih Hsu	5eb7a5814a	[cfe][M68k](7/8) Clang basic support This is the first patch supporting M68k in Clang - Register M68k as a target - Target specific CodeGen support - Target specific attribute support Authors: myhsu, m4yers, glaubitz Differential Revision: https://reviews.llvm.org/D88393	2021-03-08 12:30:57 -08:00
Sriraman Tallam	78d0e91865	Refactor -funique-internal-linakge-names implementation. The option -funique-internal-linkage-names was added in D73307 and D78243 as a LLVM early pass to insert a unique suffix to internal linkage functions and vars. The unique suffix was the hash of the module path. However, we found that this can be done more cleanly in clang early and the fixes that need to be done later can be completely avoided. The fixes in particular are trying to modify the DW_AT_linkage_name and finding the right place to insert the pass. This patch ressurects the original implementation proposed in D73307 which was reviewed and then ditched in favor of the pass based approach. Differential Revision: https://reviews.llvm.org/D96109	2021-03-05 13:32:17 -08:00
Jingu Kang	9b302513f6	[AArch64] Add missing intrinsics for vrnd	2021-03-05 11:26:12 +00:00
Michael Kruse	b119120673	[clang][OpenMP] Use OpenMPIRBuilder for workshare loops. Initial support for using the OpenMPIRBuilder by clang to generate loops using the OpenMPIRBuilder. This initial support is intentionally limited to: * Only the worksharing-loop directive. * Recognizes only the nowait clause. * No loop nests with more than one loop. * Untested with templates, exceptions. * Semantic checking left to the existing infrastructure. This patch introduces a new AST node, OMPCanonicalLoop, which becomes parent of any loop that has to adheres to the restrictions as specified by the OpenMP standard. These restrictions allow OMPCanonicalLoop to provide the following additional information that depends on base language semantics: * The distance function: How many loop iterations there will be before entering the loop nest. * The loop variable function: Conversion from a logical iteration number to the loop variable. These allow the OpenMPIRBuilder to act solely using logical iteration numbers without needing to be concerned with iterator semantics between calling the distance function and determining what the value of the loop variable ought to be. Any OpenMP logical should be done by the OpenMPIRBuilder such that it can be reused MLIR OpenMP dialect and thus by flang. The distance and loop variable function are implemented using lambdas (or more exactly: CapturedStmt because lambda implementation is more interviewed with the parser). It is up to the OpenMPIRBuilder how they are called which depends on what is done with the loop. By default, these are emitted as outlined functions but we might think about emitting them inline as the OpenMPRuntime does. For compatibility with the current OpenMP implementation, even though not necessary for the OpenMPIRBuilder, OMPCanonicalLoop can still be nested within OMPLoopDirectives' CapturedStmt. Although OMPCanonicalLoop's are not currently generated when the OpenMPIRBuilder is not enabled, these can just be skipped when not using the OpenMPIRBuilder in case we don't want to make the AST dependent on the EnableOMPBuilder setting. Loop nests with more than one loop require support by the OpenMPIRBuilder (D93268). A simple implementation of non-rectangular loop nests would add another lambda function that returns whether a loop iteration of the rectangular overapproximation is also within its non-rectangular subset. Reviewed By: jdenny Differential Revision: https://reviews.llvm.org/D94973	2021-03-04 22:52:59 -06:00
David Blaikie	cedc53254a	Fix clang for header move in LLVM/IR	2021-03-04 16:20:44 -08:00
Heejin Ahn	561abd83ff	[WebAssembly] Disable uses of __clang_call_terminate Background: Wasm EH, while using Windows EH (catchpad/cleanuppad based) IR, uses Itanium-based libraries and ABIs with some modifications. `__clang_call_terminate` is a wrapper generated in Clang's Itanium C++ ABI implementation. It contains this code, in C-style pseudocode: ``` void __clang_call_terminate(void *exn) { __cxa_begin_catch(exn); std::terminate(); } ``` So this function is a wrapper to call `__cxa_begin_catch` on the exception pointer before termination. In Itanium ABI, this function is called when another exception is thrown while processing an exception. The pointer for this second, violating exception is passed as the argument of this `__clang_call_terminate`, which calls `__cxa_begin_catch` with that pointer and calls `std::terminate` to terminate the program. The spec (https://libcxxabi.llvm.org/spec.html) for `__cxa_begin_catch` says, ``` When the personality routine encounters a termination condition, it will call __cxa_begin_catch() to mark the exception as handled and then call terminate(), which shall not return to its caller. ``` In wasm EH's Clang implementation, this function is called from cleanuppads that terminates the program, which we also call terminate pads. Cleanuppads normally don't access the thrown exception and the wasm backend converts them to `catch_all` blocks. But because we need the exception pointer in this cleanuppad, we generate `wasm.get.exception` intrinsic (which will eventually be lowered to `catch` instruction) as we do in the catchpads. But because terminate pads are cleanup pads and should run even when a foreign exception is thrown, so what we have been doing is: 1. In `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()`, we make sure terminate pads are in this simple shape: ``` %exn = catch call @__clang_call_terminate(%exn) unreachable ``` 2. In `WebAssemblyHandleEHTerminatePads` pass at the end of the pipeline, we attach a `catch_all` to terminate pads, so they will be in this form: ``` %exn = catch call @__clang_call_terminate(%exn) unreachable catch_all call @std::terminate() unreachable ``` In `catch_all` part, we don't have the exception pointer, so we call `std::terminate()` directly. The reason we ran HandleEHTerminatePads at the end of the pipeline, separate from LateEHPrepare, was it was convenient to assume there was only a single `catch` part per `try` during CFGSort and CFGStackify. --- Problem: While it thinks terminate pads could have been possibly split or calls to `__clang_call_terminate` could have been duplicated, `WebAssemblyLateEHPrepare::ensureSingleBBTermPads()` assumes terminate pads contain no more than calls to `__clang_call_terminate` and `unreachable` instruction. I assumed that because in LLVM very limited forms of transformations are done to catchpads and cleanuppads to maintain the scoping structure. But it turned out to be incorrect; passes can merge cleanuppads into one, including terminate pads, as long as the new code has a correct scoping structure. One pass that does this I observed was `SimplifyCFG`, but there can be more. After this transformation, a single cleanuppad can contain any number of other instructions with the call to `__clang_call_terminate` and can span many BBs. It wouldn't be practical to duplicate all these BBs within the cleanuppad to generate the equivalent `catch_all` blocks, only with calls to `__clang_call_terminate` replaced by calls to `std::terminate`. Unless we do more complicated transformation to split those calls to `__clang_call_terminate` into a separate cleanuppad, it is tricky to solve. --- Solution (?): This CL just disables the generation and use of `__clang_call_terminate` and calls `std::terminate()` directly in its place. The possible downside of this approach can be, because the Itanium ABI intended to "mark" the violating exception handled, we don't do that anymore. What `__cxa_begin_catch` actually does is increment the exception's handler count and decrement the uncaught exception count, which in my opinion do not matter much given that we are about to terminate the program anyway. Also it does not affect info like stack traces that can be possibly shown to developers. And while we use a variant of Itanium EH ABI, we can make some deviations if we choose to; we are already different in that in the current version of the EH spec we don't support two-phase unwinding. We can possibly consider a more complicated transformation later to reenable this, but I don't think that has high priority. Changes in this CL contains: - In Clang, we don't generate a call to `wasm.get.exception()` intrinsic and `__clang_call_terminate` function in terminate pads anymore; we simply generate calls to `std::terminate()`, which is the default implementation of `CGCXXABI::emitTerminateForUnexpectedException`. - Remove `WebAssembly::ensureSingleBBTermPads() function and `WebAssemblyHandleEHTerminatePads` pass, because terminate pads are already `catch_all` now (because they don't need the exception pointer) and we don't need these transformations anymore. - Change tests to use `std::terminate` directly. Also removes tests that tested `LateEHPrepare::ensureSingleBBTermPads` and `HandleEHTerminatePads` pass. - Drive-by fix: Add some function attributes to EH intrinsic declarations Fixes https://github.com/emscripten-core/emscripten/issues/13582. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D97834	2021-03-04 14:26:35 -08:00
Reid Kleckner	1c2e7d200d	[MS] Fix crash involving gnu stmt exprs and inalloca Use a WeakTrackingVH to cope with the stmt emission logic that cleans up unreachable blocks. This invalidates the reference to the deferred replacement placeholder. Cope with it. Fixes PR25102 (from 2015!)	2021-03-04 13:57:46 -08:00
Gui Andrade	10264a1b21	Introduce noundef attribute at call sites for stricter poison analysis This change adds a new IR noundef attribute, which denotes when a function call argument or return val may never contain uninitialized bits. In MemorySanitizer, this attribute enables optimizations which decrease instrumented code size by up to 17% (measured with an instrumented build of clang) . I'll introduce the change allowing msan to take advantage of this information in a separate patch. Differential Revision: https://reviews.llvm.org/D81678	2021-03-04 12:15:12 -08:00
Zequan Wu	9783e20988	Revert "Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements."" Reland with update on test case ContinuousSyncmode/basic.c. This reverts commit `fe5c2c3ca6`.	2021-03-04 11:52:43 -08:00
Akira Hatanaka	1900503595	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies `ed4718eccb`, which was reverted because it was causing a miscompile. The bug that was causing the miscompile has been fixed in `75805dce5f`. Original commit message: Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-03-04 11:22:30 -08:00
Alexey Bataev	711179b581	[OPENMP]Fix PR48759: "fatal error" when compile with preprocessed file. If the file in line directive does not exist on the system we need, to use the original file to get its file id. Differential Revision: https://reviews.llvm.org/D97945	2021-03-04 07:26:57 -08:00
Nico Weber	fe5c2c3ca6	Revert "[Coverage] Emit gap region between statements if first statements contains terminate statements." This reverts commit `2d7374a0c6`. Breaks ContinuousSyncMode/basic.c in check-profile on macOS.	2021-03-04 08:53:30 -05:00
Thomas Preud'homme	b7aeece47c	Revert "Stop traping on sNaN in __builtin_isinf" This reverts commit `1b6eb56aa0` because the invert logic for isfinite is incorrect.	2021-03-04 12:07:35 +00:00
Wang, Pengfei	e7e67c930a	Add Windows ehcont section support (/guard:ehcont). Add option /guard:ehcont Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D96709	2021-03-04 11:47:29 +08:00
Soumi Manna	eec7f8f7b1	[WebAssembly] Add missing default cases in switch statements unsigned variable 'IntNo' has been declared but not been defined inside function EmitWebAssemblyBuiltinExpr(). static code analysis tool complains about uninitialized variable "IntNo" since this enters to default branch without setting any intrinsics and calls Function *Callee = CGM.getIntrinsic(IntNo). This patch fixes the problem by adding default cases in switch statements.	2021-03-03 13:15:23 -08:00
Zequan Wu	2d7374a0c6	[Coverage] Emit gap region between statements if first statements contains terminate statements. Differential Revision: https://reviews.llvm.org/D97101	2021-03-03 11:25:49 -08:00
Hans Wennborg	0a5dd06718	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR" This caused miscompiles of Chromium tests for iOS due clobbering of live registers. See discussion on the code review for details. > Background: > > This fixes a longstanding problem where llvm breaks ARC's autorelease > optimization (see the link below) by separating calls from the marker > instructions or retainRV/claimRV calls. The backend changes are in > https://reviews.llvm.org/D92569. > > https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue > > What this patch does to fix the problem: > > - The front-end adds operand bundle "clang.arc.attachedcall" to calls, > which indicates the call is implicitly followed by a marker > instruction and an implicit retainRV/claimRV call that consumes the > call result. In addition, it emits a call to > @llvm.objc.clang.arc.noop.use, which consumes the call result, to > prevent the middle-end passes from changing the return type of the > called function. This is currently done only when the target is arm64 > and the optimization level is higher than -O0. > > - ARC optimizer temporarily emits retainRV/claimRV calls after the calls > with the operand bundle in the IR and removes the inserted calls after > processing the function. > > - ARC contract pass emits retainRV/claimRV calls after the call with the > operand bundle. It doesn't remove the operand bundle on the call since > the backend needs it to emit the marker instruction. The retainRV and > claimRV calls are emitted late in the pipeline to prevent optimization > passes from transforming the IR in a way that makes it harder for the > ARC middle-end passes to figure out the def-use relationship between > the call and the retainRV/claimRV calls (which is the cause of > PR31925). > > - The function inliner removes an autoreleaseRV call in the callee if > nothing in the callee prevents it from being paired up with the > retainRV/claimRV call in the caller. It then inserts a release call if > claimRV is attached to the call since autoreleaseRV+claimRV is > equivalent to a release. If it cannot find an autoreleaseRV call, it > tries to transfer the operand bundle to a function call in the callee. > This is important since the ARC optimizer can remove the autoreleaseRV > returning the callee result, which makes it impossible to pair it up > with the retainRV/claimRV call in the caller. If that fails, it simply > emits a retain call in the IR if retainRV is attached to the call and > does nothing if claimRV is attached to it. > > - SCCP refrains from replacing the return value of a call with a > constant value if the call has the operand bundle. This ensures the > call always has at least one user (the call to > @llvm.objc.clang.arc.noop.use). > > - This patch also fixes a bug in replaceUsesOfNonProtoConstant where > multiple operand bundles of the same kind were being added to a call. > > Future work: > > - Use the operand bundle on x86-64. > > - Fix the auto upgrader to convert call+retainRV/claimRV pairs into > calls with the operand bundles. > > rdar://71443534 > > Differential Revision: https://reviews.llvm.org/D92808 This reverts commit `ed4718eccb`.	2021-03-03 15:51:40 +01:00
Thomas Preud'homme	1b6eb56aa0	Stop traping on sNaN in __builtin_isinf __builtin_isinf currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: mibintc Differential Revision: https://reviews.llvm.org/D97125	2021-03-02 15:54:56 +00:00
Alexey Bataev	0caf736d7e	[OPENMP50]Mapping of the subcomponents with the 'default' mappers. If the mapped structure has data members, which have 'default' mappers, need to map these members individually using their 'default' mappers. Differential Revision: https://reviews.llvm.org/D92195	2021-03-02 07:11:06 -08:00
Richard Smith	9e2579dbf4	Fix infinite recursion during IR emission if a constant-initialized lifetime-extended temporary object's initializer refers back to the same object. `GetAddrOfGlobalTemporary` previously tried to emit the initializer of a global temporary before updating the global temporary map. Emitting the initializer could recurse back into `GetAddrOfGlobalTemporary` for the same temporary, resulting in an infinite recursion. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D97733	2021-03-01 22:19:21 -08:00
Yuanfang Chen	5de2d189e6	[Diagnose] Unify MCContext and LLVMContext diagnosing The situation with inline asm/MC error reporting is kind of messy at the moment. The errors from MC layout are not reliably propagated and users have to specify an inlineasm handler separately to get inlineasm diagnose. The latter issue is not a correctness issue but could be improved. * Kill LLVMContext inlineasm diagnose handler and migrate it to use DiagnoseInfo/DiagnoseHandler. * Introduce `DiagnoseInfoSrcMgr` to diagnose SourceMgr backed errors. This covers use cases like inlineasm, MC, and any clients using SourceMgr. * Move AsmPrinter::SrcMgrDiagInfo and its instance to MCContext. The next step is to combine MCContext::SrcMgr and MCContext::InlineSrcMgr because in all use cases, only one of them is used. * If LLVMContext is available, let MCContext uses LLVMContext's diagnose handler; if LLVMContext is not available, MCContext uses its own default diagnose handler which just prints SMDiagnostic. * Change a few clients(Clang, llc, lldb) to use the new way of reporting. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D97449	2021-03-01 15:58:37 -08:00
Yaxun (Sam) Liu	53d30381f5	Fix build failure due to dump() Change-Id: I86b534223d63bf8bb8f49af5a64b300efbeba77b	2021-03-01 16:44:08 -05:00
Yaxun (Sam) Liu	5cf2a37f12	[HIP] Emit kernel symbol Currently clang uses stub function to launch kernel. This is inconvenient to interop with C++ programs since the stub function has different name as kernel, which is required by ROCm debugger. This patch emits a variable symbol which has the same name as the kernel and uses it to register and launch the kernel. This allows C++ program to launch a kernel by using the original kernel name. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D86376	2021-03-01 16:31:40 -05:00
Sean Fertile	3f40dbbbc7	[PowerPC][AIX] Enable passing vectors in variadic functions. Differential Revision: https://reviews.llvm.org/D97474	2021-03-01 13:08:28 -05:00
Arthur Eubanks	040c1b49d7	Move EntryExitInstrumentation pass location This seems to be more of a Clang thing rather than a generic LLVM thing, so this moves it out of LLVM pipelines and as Clang extension hooks into LLVM pipelines. Move the post-inline EEInstrumentation out of the backend pipeline and into a late pass, similar to other sanitizer passes. It doesn't fit into the codegen pipeline. Also fix up EntryExitInstrumentation not running at -O0 under the new PM. PR49143 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D97608	2021-03-01 10:08:10 -08:00
Olivier Goffart	1b04bdc2f3	[SEH] capture 'this' Simply make sure that the CodeGenFunction::CXXThisValue and CXXABIThisValue are correctly initialized to the recovered value. For lambda capture, we also need to make sure to fill the LambdaCaptureFields Differential Revision: https://reviews.llvm.org/D97534	2021-03-01 11:57:35 +01:00
Fangrui Song	8afdacba9d	Add GNU attribute 'retain' For ELF targets, GCC 11 will set SHF_GNU_RETAIN on the section of a `__attribute__((retain))` function/variable to prevent linker garbage collection. (See AttrDocs.td for the linker support). This patch adds `retain` functions/variables to the `llvm.used` list, which has the desired linker GC semantics. Note: `retain` does not imply `used`, so an unused function/variable can be dropped by Sema. Before 'retain' was introduced, previous ELF solutions require inline asm or linker tricks, e.g. `asm volatile(".reloc 0, R_X86_64_NONE, target");` (architecture dependent) or define a non-local symbol in the section and use `ld -u`. There was no elegant source-level solution. With D97448, `__attribute__((retain))` will set `SHF_GNU_RETAIN` on ELF targets. Differential Revision: https://reviews.llvm.org/D97447	2021-02-26 16:37:50 -08:00
Fangrui Song	28cb620321	Change some addUsedGlobal to addUsedOrCompilerUsedGlobal An global value in the `llvm.used` list does not have GC root semantics on ELF targets. This will be changed in a subsequent backend patch. Change some `llvm.used` in the ELF code path to use `llvm.compiler.used` to prevent undesired GC root semantics. Change one extern "C" alias (due to `__attribute__((used))` in extern "C") to use `llvm.compiler.used` on all targets. GNU ld has a rule "`__start_/__stop_` references from a live input section retain the associated C identifier name sections", which LLD may drop entirely (currently refined to exclude SHF_LINK_ORDER/SHF_GROUP) in a future release (the rule makes it clumsy to GC metadata sections; D96914 added a way to try the potential future behavior). For `llvm.used` global values defined in a C identifier name section, keep using `llvm.used` so that the future LLD change will not affect them. rnk kindly categorized the changes: ``` ObjC/blocks: this wants GC root semantics, since ObjC mainly runs on Mac. MS C++ ABI stuff: wants GC root semantics, no change OpenMP: unsure, but GC root semantics probably don't hurt CodeGenModule: affected in this patch to not use GC root semantics so that __attribute__((used)) behavior remains the same on ELF, plus two other minor use cases that don't want GC semantics Coverage: Probably want GC root semantics CGExpr.cpp: refers to LTO, wants GC root CGDeclCXX.cpp: one is MS ABI specific, so yes GC root, one is some other C++ init functionality, which should form GC roots (C++ initializers can have side effects and must run) CGDecl.cpp: Changed in this patch for __attribute__((used)) ``` Differential Revision: https://reviews.llvm.org/D97446	2021-02-26 10:42:07 -08:00
Petr Hosek	8459b8ef39	[Driver] Rename -fprofile-{prefix-map,compilation-dir} to -fcoverage-{prefix-map,compilation-dir} These flags affect coverage mapping (-fcoverage-mapping), not -fprofile-[instr-]generate so it makes more sense to use the -fcoverage-* prefix. Differential Revision: https://reviews.llvm.org/D97434	2021-02-25 21:40:12 -08:00
James Y Knight	24539f1ef2	Add Alignment argument to IRBuilder CreateAtomicRMW and CreateAtomicCmpXchg. And then push those change throughout LLVM. Keep the old signature in Clang's CGBuilder for now -- that will be updated in a follow-on patch (D97224). The MLIR LLVM-IR dialect is not updated to support the new alignment attribute, but preserves its existing behavior. Differential Revision: https://reviews.llvm.org/D97223	2021-02-25 18:29:42 -05:00
Vitaly Buka	9a887f652c	[clang,NFC] Fix typos in file headers	2021-02-25 12:47:02 -08:00
Akira Hatanaka	ec4408ad69	[CodeGen] Call ConvertTypeForMem instead of ConvertType This fixes a crash that occurs when the type passed to the method is `_Bool`. rdar://74493389	2021-02-25 12:11:18 -08:00
Dan Liew	5d64dd8e3c	[Clang][ASan] Introduce `-fsanitize-address-destructor-kind=` driver & frontend option. The new `-fsanitize-address-destructor-kind=` option allows control over how module destructors are emitted by ASan. The new option is consumed by both the driver and the frontend and is propagated into codegen options by the frontend. Both the legacy and new pass manager code have been updated to consume the new option from the codegen options. It would be nice if the new utility functions (`AsanDtorKindToString` and `AsanDtorKindFromString`) could live in LLVM instead of Clang so they could be consumed by other language frontends. Unfortunately that doesn't work because the clang driver doesn't link against the LLVM instrumentation library. rdar://71609176 Differential Revision: https://reviews.llvm.org/D96572	2021-02-25 12:02:21 -08:00
Jan Svoboda	a25e4a6da3	[clang][cli] Store additional optimization remarks info After a revision of D96274 changed `DiagnosticOptions` to not store all remark arguments as-written, it is no longer possible to reconstruct the arguments accurately from the class. This is caused by the fact that for `-Rpass=regexp` and friends, `DiagnosticOptions` store only the group name `pass` and not `regexp`. This is the same representation used for the plain `-Rpass` argument. Note that each argument must be generated exactly once in `CompilerInvocation::generateCC1CommandLine`, otherwise each subsequent call would produce more arguments than the previous one. Currently this works out because of the way `RoundTrip` splits the responsibilities for certain arguments based on what arguments were queried during parsing. However, this invariant breaks when we move to single round-trip for the whole `CompilerInvocation`. This patch ensures that for one `-Rpass=regexp` argument, we don't generate two arguments (`-Rpass` from `DiagnosticOptions` and `-Rpass=regexp` from `CodeGenOptions`) by shifting the responsibility for handling both cases to `CodeGenOptions`. To distinguish between the cases correctly, additional information is stored in `CodeGenOptions`. The `CodeGenOptions` parser of `-Rpass[=regexp]` arguments also looks at `-Rno-pass` and `-R[no-]everything`, which is necessary for generating the correct argument regardless of the ordering of `CodeGenOptions`/`DiagnosticOptions` parsing/generation. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D96847	2021-02-25 11:02:49 +01:00
Yaxun (Sam) Liu	47acdec1dd	[CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc For -fgpu-rdc mode, static device vars in different TU's may have the same name. To support accessing file-scope static device variables in host code, we need to give them a distinct name and external linkage. This can be done by postfixing each static device variable with a distinct CUID (Compilation Unit ID) hash. Since the static device variables have different name across compilation units, now we let them have external linkage so that they can be looked up by the runtime. Reviewed by: Artem Belevich, and Jon Chesterfield Differential Revision: https://reviews.llvm.org/D85223	2021-02-24 18:23:45 -05:00
Vitaly Buka	8560c2d426	[ThinLTO, NewPM] Run OptimizerLastEPCallbacks from buildThinLTOPreLinkDefaultPipeline -O1 and above do dont call real optimizer pipeline in ThinLTO PreLink. Also clang can't add PostLink OptimizerLastEPCallbacks for in-process ThinLTO. This results in missing sanitizer passes with ThinLTO. Simple working solution is just call OptimizerLastEPCallbacks at the end of buildThinLTOPreLinkDefaultPipeline. Differential Revision: https://reviews.llvm.org/D96320	2021-02-23 22:14:41 -08:00
Dávid Bolvanský	053dc95839	Reduce the number of attributes attached to each function Patch takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D97116	2021-02-24 07:08:44 +01:00
Yaxun (Sam) Liu	a3ce7f5cd2	[HIP] Fix managed variable linkage Currently managed variables are emitted as undefined symbols, which causes difficulty for diagnosing undefined symbols for non-managed variables. This patch transforms managed variables in device compilation so that they can be emitted as normal variables. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D96195	2021-02-23 22:34:45 -05:00
Hsiangkai Wang	1a35a1b074	[RISCV] Add vadd with mask and without mask builtin. Demonstrate how to add RISC-V V builtins and lower them to IR intrinsics for V extension. Authored-by: Roger Ferrer Ibanez <rofirrim@gmail.com> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D93446	2021-02-24 07:57:31 +08:00
James Y Knight	fe2dcd89ac	DebugInfo: Emit "LocalToUnit" flag on local member function decls. Previously, the definition was so-marked, but the declaration was not. This resulted in LLVM's dwarf emission treating the function as being external, and incorrectly emitting DW_AT_external. Differential Revision: https://reviews.llvm.org/D96044	2021-02-22 17:55:25 -05:00
Melanie Blower	e64fcdf8d5	[clang][patch] Inclusive language, modify filename SanitizerBlacklist.h to NoSanitizeList.h This patch responds to a comment from @vitalybuka in D96203: suggestion to do the change incrementally, and start by modifying this file name. I modified the file name and made the other changes that follow from that rename. Reviewers: vitalybuka, echristo, MaskRay, jansvoboda11, aaron.ballman Differential Revision: https://reviews.llvm.org/D96974	2021-02-22 15:11:37 -05:00
Ryan Santhiraraja	2c25efcbd3	[AArch64] Adding SHA3 Intrinsics support This patch adds the following SHA3 Intrinsics: vsha512hq_u64, vsha512h2q_u64, vsha512su0q_u64, vsha512su1q_u64 veor3q_u8 veor3q_u16 veor3q_u32 veor3q_u64 veor3q_s8 veor3q_s16 veor3q_s32 veor3q_s64 vrax1q_u64 vxarq_u64 vbcaxq_u8 vbcaxq_u16 vbcaxq_u32 vbcaxq_u64 vbcaxq_s8 vbcaxq_s16 vbcaxq_s32 vbcaxq_s64 Note need to include +sha3 and +crypto when building from the front-end Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D96381	2021-02-22 12:09:20 +00:00
Dávid Bolvanský	ee51c42e00	Reduce the number of attributes attached to each function This takes advantage of the implicit default behavior to reduce the number of attributes.	2021-02-20 06:57:47 +01:00
Petr Hosek	3275b18f89	[Coverage] Normalize compilation dir as well This matches debug info behavior. Differential Revision: https://reviews.llvm.org/D97001	2021-02-19 15:29:03 -08:00
Christopher Tetreault	55448ab540	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett, ctetreau Differential Revision: https://reviews.llvm.org/D96825	2021-02-19 14:48:12 -08:00
Teresa Johnson	0923a60ea7	[clang] Emit type metadata on available_externally vtables for WPD When WPD is enabled, via WholeProgramVTables, emit type metadata for available_externally vtables. Additionally, add the vtables to the llvm.compiler.used global so that they are not prematurely eliminated (before *LTO analysis). This is needed to avoid devirtualizing calls to a function overriding a class defined in a header file but with a strong definition in a shared library. Without type metadata on the available_externally vtables from the header, the WPD analysis never sees what a derived class is overriding. Even if the available_externally base class functions are pure virtual, because shared library definitions are already treated conservatively (committed patches D91583, D96721, and D96722) we will not devirtualize, which would be unsafe since the library might contain overrides that aren't visible to the LTO unit. An example is std::error_category, which is overridden in LLVM and causing failures after a self build with WPD enabled, because libstdc++ contains hidden overrides of the virtual base class methods. Differential Revision: https://reviews.llvm.org/D96919	2021-02-19 12:42:34 -08:00
Petr Hosek	5fbd1a333a	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 14:34:39 -08:00
Sterling Augustine	fc97a63db0	Move a second variable only used in an assert into the assert. This prevents unused variable warnings when building without asserts.	2021-02-18 13:26:07 -08:00
Sterling Augustine	4544a63b77	Move variable only used in an assert into the assert. This prevents unused variable warnings when building without asserts.	2021-02-18 13:04:58 -08:00
Petr Hosek	fbf8b957fd	Revert "[Coverage] Store compilation dir separately in coverage mapping" This reverts commit `97ec8fa5bb` since the test is failing on some bots.	2021-02-18 12:50:24 -08:00
Pengxuan Zheng	0ec32f1326	Revert "[AArch64] Adding Neon Polynomial vadd Intrinsics" Revert the patch due to buildbot failures. This reverts commit `d9645059c5`.	2021-02-18 12:38:16 -08:00
Petr Hosek	97ec8fa5bb	[Coverage] Store compilation dir separately in coverage mapping We currently always store absolute filenames in coverage mapping. This is problematic for several reasons. It poses a problem for distributed compilation as source location might vary across machines. We are also duplicating the path prefix potentially wasting space. This change modifies how we store filenames in coverage mapping. Rather than absolute paths, it stores the compilation directory and file paths as given to the compiler, either relative or absolute. Later when reading the coverage mapping information, we recombine relative paths with the working directory. This approach is similar to handling ofDW_AT_comp_dir in DWARF. Finally, we also provide a new option, -fprofile-compilation-dir akin to -fdebug-compilation-dir which can be used to manually override the compilation directory which is useful in distributed compilation cases. Differential Revision: https://reviews.llvm.org/D95753	2021-02-18 12:27:42 -08:00
Zequan Wu	d83511dd26	[Coverage] Emit gap region after conditions when macro is present.	2021-02-18 11:41:04 -08:00
Pengxuan Zheng	d9645059c5	[AArch64] Adding Neon Polynomial vadd Intrinsics This patch adds the following intrinsics: vadd_p8 vadd_p16 vadd_p64 vaddq_p8 vaddq_p16 vaddq_p64 vaddq_p128 Reviewed By: t.p.northover, DavidSpickett Differential Revision: https://reviews.llvm.org/D96825	2021-02-18 11:33:24 -08:00
Jonas Paulsson	e57bd1ff4f	[CFE, SystemZ] New target hook testFPKind() for checks of FP values. The recent commit `00a6254` "Stop traping on sNaN in builtin_isnan" changed the lowering in constrained FP mode of builtin_isnan from an FP comparison to integer operations to avoid trapping. SystemZ has a special instruction "Test Data Class" which is the preferred way to do this check. This patch adds a new target hook "testFPKind()" that lets SystemZ emit the s390_tdc intrinsic instead. testFPKind() takes the BuiltinID as an argument and is expected to soon handle more opcodes than just 'builtin_isnan'. Review: Thomas Preud'homme, Ulrich Weigand Differential Revision: https://reviews.llvm.org/D96568	2021-02-18 12:36:46 -06:00
Jeroen Dobbelaere	46757ccb49	[clang] functions with the 'const' or 'pure' attribute must always return. As described in * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-pure-function-attribute * https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-const-function-attribute An `__attribute__((pure))` function must always return, as well as an `__attribute__((const))` function. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96960	2021-02-18 17:29:46 +01:00
Hsiangkai Wang	766ee1096f	[Clang][RISCV] Define RISC-V V builtin types Add the types for the RISC-V V extension builtins. These types will be used by the RISC-V V intrinsics which require types of the form <vscale x 1 x i64>(LMUL=1 element size=64) or <vscale x 4 x i32>(LMUL=2 element size=32), etc. The vector_size attribute does not work for us as it doesn't create a scalable vector type. We want these types to be opaque and have no operators defined for them. We want them to be sizeless. This makes them similar to the ARM SVE builtin types. But we will have quite a bit more types. This patch adds around 60. Later patches will add another 230 or so types representing tuples of these types similar to the x2/x3/x4 types in ARM SVE. But with extra complexity that these types are combined with the LMUL concept that is unique to RISCV. For more background see this RFC http://lists.llvm.org/pipermail/llvm-dev/2020-October/145850.html Authored-by: Roger Ferrer Ibanez <roger.ferrer@bsc.es> Co-Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Differential Revision: https://reviews.llvm.org/D92715	2021-02-18 10:17:31 +08:00
Stanislav Mekhanoshin	a8d9d50762	[AMDGPU] gfx90a support Differential Revision: https://reviews.llvm.org/D96906	2021-02-17 16:01:32 -08:00
Igor Kudrin	aa84289629	[DebugInfo] Keep the DWARF64 flag in the module metadata This allows the option to affect the LTO output. Module::Max helps to generate debug info for all modules in the same format. Differential Revision: https://reviews.llvm.org/D96597	2021-02-17 17:03:34 +07:00
Alexey Bataev	60d71a286b	[OPENMP50]Allow overlapping mapping in target constructs. OpenMP 5.0 removed a lot of restriction for overlapped mapped items comparing to OpenMP 4.5. Patch restricts the checks for overlapped data mappings only for OpenMP 4.5 and less and reorders mapping of the arguments so, that present and alloc mappings are processed first and then all others. Differential Revision: https://reviews.llvm.org/D86119	2021-02-16 14:42:08 -08:00
Michael Kruse	6c05005238	[OpenMP] Implement '#pragma omp tile', by Michael Kruse (@Meinersbur). The tile directive is in OpenMP's Technical Report 8 and foreseeably will be part of the upcoming OpenMP 5.1 standard. This implementation is based on an AST transformation providing a de-sugared loop nest. This makes it simple to forward the de-sugared transformation to loop associated directives taking the tiled loops. In contrast to other loop associated directives, the OMPTileDirective does not use CapturedStmts. Letting loop associated directives consume loops from different capture context would be difficult. A significant amount of code generation logic is taking place in the Sema class. Eventually, I would prefer if these would move into the CodeGen component such that we could make use of the OpenMPIRBuilder, together with flang. Only expressions converting between the language's iteration variable and the logical iteration space need to take place in the semantic analyzer: Getting the of iterations (e.g. the overload resolution of `std::distance`) and converting the logical iteration number to the iteration variable (e.g. overload resolution of `iteration + .omp.iv`). In clang, only CXXForRangeStmt is also represented by its de-sugared components. However, OpenMP loop are not defined as syntatic sugar. Starting with an AST-based approach allows us to gradually move generated AST statements into CodeGen, instead all at once. I would also like to refactor `checkOpenMPLoop` into its functionalities in a follow-up. In this patch it is used twice. Once for checking proper nesting and emitting diagnostics, and additionally for deriving the logical iteration space per-loop (instead of for the loop nest). Differential Revision: https://reviews.llvm.org/D76342	2021-02-16 09:45:07 -08:00
serge-sans-paille	3c8bf29f14	Reduce the number of attributes attached to each function This takes advantage of the implicit default behavior to reduce the number of attributes, which in turns reduces compilation time. I've observed -3% in instruction count when compiling sqlite3 amalgamation with -O0 Differential Revision: https://reviews.llvm.org/D96400	2021-02-16 16:19:54 +01:00
Marco Antognini	e54811ff7e	Restore diagnostic handler after CodeGenAction::ExecuteAction Fix dangling pointer to local variable and address some typos. Reviewed By: xur Differential Revision: https://reviews.llvm.org/D96487	2021-02-15 10:33:00 +00:00
Wang, Pengfei	61da20575d	[X86] Convert fmin/fmax _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) This is a follow up of D92940. We have successfully converted fadd/fmul _mm_reduce_* intrinsics to llvm.reduction + reassoc flag. We can do the same approach for fmin/fmax too, i.e. llvm.reduction + nnan flag. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93179	2021-02-15 08:52:06 +08:00
Malhar	74ddacd30d	[Clang] Ensure vector predication loop metadata is always emitted when pragma is specified. This patch ensures that vector predication and vectorization width pragmas work together correctly/as expected. Specifically, this patch fixes the issue that when vectorization_width > 1, the vector predication behaviour (this would matter if it has NOT been disabled explicitly by a pragma) was getting ignored, which was incorrect. The fix here removes the dependence of vector predication on the vectorization width. The loop metadata corresponding to clang loop pragma vectorize_predicate is always emitted, if the pragma is specified, even if vectorization is disabled by vectorize_width(1) or vectorize(disable) since the option is also used for interleaving by the LoopVectorize pass. Reviewed By: dmgreen, Meinersbur Differential Revision: https://reviews.llvm.org/D94779	2021-02-13 17:35:54 -06:00
Artur Gainullin	ff50b121e3	[SYCL] Ignore file-scope asm during device-side SYCL compilation. Reviewed By: bader, eandrews Differential Revision: https://reviews.llvm.org/D96538	2021-02-12 17:00:45 -08:00
Florian Hahn	6280bb4cd8	[clang] Remove redundant condition (NFC).	2021-02-12 20:14:24 +00:00
Florian Hahn	51bf4c0e6d	[clang] Add -ffinite-loops & -fno-finite-loops options. This patch adds 2 new options to control when Clang adds `mustprogress`: 1. -ffinite-loops: assume all loops are finite; mustprogress is added to all loops, regardless of the selected language standard. 2. -fno-finite-loops: assume no loop is finite; mustprogress is not added to any loop or function. We could add mustprogress to functions without loops, but we would have to detect that in Clang, which is probably not worth it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D96419	2021-02-12 19:25:49 +00:00
Amy Huang	3fe465fb2c	Revert "[DebugInfo] Add an attribute to force type info to be emitted for" Didn't mean to commit this. This reverts commit `1b5c2915a2`.	2021-02-12 10:18:17 -08:00
Amy Huang	1b5c2915a2	[DebugInfo] Add an attribute to force type info to be emitted for class types. The goal is to provide a way to bypass constructor homing when emitting class definitions and force class definitions in the debug info. Not sure about the wording of the attribute, or whether it should be specific to classes with constructors	2021-02-12 10:16:49 -08:00
Akira Hatanaka	ed4718eccb	[ObjC][ARC] Use operand bundle 'clang.arc.attachedcall' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This fixes a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.attachedcall" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if claimRV is attached to the call since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since the ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if retainRV is attached to the call and does nothing if claimRV is attached to it. - SCCP refrains from replacing the return value of a call with a constant value if the call has the operand bundle. This ensures the call always has at least one user (the call to @llvm.objc.clang.arc.noop.use). - This patch also fixes a bug in replaceUsesOfNonProtoConstant where multiple operand bundles of the same kind were being added to a call. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-12 09:51:57 -08:00
Yaxun (Sam) Liu	053e61d54e	Relands "[HIP] Change default --gpu-max-threads-per-block value to 1024" This reverts commit `e384e94fbe`.	2021-02-12 10:53:59 -05:00
Vitaly Buka	686b65f85f	[Msan, NewPM] Reduce size of msan binaries EarlyCSEPass called after msan redices code size by about 10%. Similar optimization exists for legacy pass manager in addGeneralOptsForMemorySanitizer. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D96406	2021-02-11 16:07:18 -08:00
Vitaly Buka	f2f59d2a06	[NFC] Extract function which registers sanitizer passes Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D96481	2021-02-11 15:29:48 -08:00
Pengxuan Zheng	61cca0f2e5	[AArch64] Adding Neon Sm3 & Sm4 Intrinsics This adds SM3 and SM4 Intrinsics support for AArch64, specifically: vsm3ss1q_u32 vsm3tt1aq_u32 vsm3tt1bq_u32 vsm3tt2aq_u32 vsm3tt2bq_u32 vsm3partw1q_u32 vsm3partw2q_u32 vsm4eq_u32 vsm4ekeyq_u32 Reviewed By: labrinea Differential Revision: https://reviews.llvm.org/D95655	2021-02-11 14:20:20 -08:00
Vitaly Buka	b6051f52ac	[Clang, NewPM] Add KMSan support Depends on D96320. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D96328	2021-02-10 14:07:49 -08:00
Artem Belevich	2aa01ccec3	[CUDA, NVPTX] Allow targeting sm_86 GPUs. The patch only plumbs through the option necessary for targeting sm_86 GPUs w/o adding any new functionality. Differential Revision: https://reviews.llvm.org/D95974	2021-02-09 11:01:10 -08:00
Nico Weber	de1966e542	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly" This reverts commit `4a64d8fe39`. Makes clang crash when buildling trivial iOS programs, see comment after https://reviews.llvm.org/D92808#2551401	2021-02-09 11:06:32 -05:00
Anastasia Stulova	79b222c39f	[OpenCL] Fix types with signed prefix in arginfo metadata. Signed prefix is removed and the single word spelling is printed for the scalar types. Tags: #clang Differential Revision: https://reviews.llvm.org/D96161	2021-02-09 15:13:19 +00:00
Wang, Pengfei	dd2460ed5d	[X86] Always assign reassoc flag for intrinsics reduce_add/mul_ps/pd. Intrinsics reduce_add/mul_ps/pd have assumption that the elements in the vector are reassociable. So we need to always assign the reassoc flag when we call _mm_reduce_* intrinsics. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D96231	2021-02-09 21:14:06 +08:00
Xiangling Liao	6b1e2fc893	[FE] Manipulate the first byte of guard variable type in both load and store operation As Itanium ABI[http://itanium-cxx-abi.github.io/cxx-abi/abi.html#once-ctor] points out: "The size of the guard variable is 64 bits. The first byte (i.e. the byte at the address of the full variable) shall contain the value 0 prior to initialization of the associated variable, and 1 after initialization is complete." Differential Revision: https://reviews.llvm.org/D95822	2021-02-08 11:14:34 -05:00
Anastasia Stulova	ecc8ac3f08	[OpenCL] Fix pipe type printing in arg info metadata Pipe element type spelling for arg info metadata should follow the same behavior as normal type spelling. We should only use the canonical type spelling in the base type field. This patch also removed duplication in type handling. Tags: #clang Differential Revision: https://reviews.llvm.org/D96151	2021-02-08 16:05:13 +00:00
Petr Hosek	9fd9b5a9c9	Don't emit coverage mapping for excluded functions When a function or a file is excluded using -fprofile-list= option, don't emit coverage mapping as doing so confuses users since those functions would always have zero count. This also reduces the binary size considerably in cases where only a few functions or files are being instrumented. Differential Revision: https://reviews.llvm.org/D96000	2021-02-05 13:03:57 -08:00
Yaxun (Sam) Liu	b008ea304d	[CUDA][HIP] Fix device variable linkage For -fgpu-rdc, shadow variables should not be internalized, otherwise they cannot be accessed by other TUs. This is necessary because the shadow variable of external device variables are always emitted as undefined symbols, which need to resolve to a global symbols. Managed variables need to be emitted as undefined symbols in device compilations. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D95901	2021-02-05 15:11:12 -05:00
Thomas Preud'homme	00a62547da	Stop traping on sNaN in __builtin_isnan __builtin_isnan currently generates a floating-point compare operation which triggers a trap when faced with a signaling NaN in StrictFP mode. This commit uses integer operations instead to not generate any trap in such a case. Reviewed By: kpn Differential Revision: https://reviews.llvm.org/D95948	2021-02-05 18:28:48 +00:00
Michael Liao	01bf529db2	Recommit of `a2fdf9d4d7`. - The failures are all cc1-based tests due to the missing `-aux-triple` options, which is always prepared by the driver in CUDA/HIP compilation. - Add extra check on the missing aux-targetinfo to prevent crashing. [hip][cuda] Enable extended lambda support on Windows. - On Windows, extended lambda has extra issues due to the numbering schemes are different between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI. Additional device side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name. - A hybrid numbering context `MSHIPNumberingContext` is introduced to number a lambda for both host- and device-compilations. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D69322 This reverts commit `4874ff0241`.	2021-02-05 11:27:30 -05:00
Akira Hatanaka	4a64d8fe39	[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly emitting retainRV or claimRV calls in the IR This reapplies `3fe3946d9a` without the changes made to lib/IR/AutoUpgrade.cpp, which was violating layering. Original commit message: Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.rv" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if the call is annotated with claimRV since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the implicit call is a call to retainRV and does nothing if it's a call to claimRV. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-05 06:09:42 -08:00
Akira Hatanaka	2fbbb18c1d	Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly" This reverts commit `3fe3946d9a`. The commit violates layering by including a header from Analysis in lib/IR/AutoUpgrade.cpp.	2021-02-05 06:00:05 -08:00
Akira Hatanaka	3fe3946d9a	[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly emitting retainRV or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end adds operand bundle "clang.arc.rv" to calls, which indicates the call is implicitly followed by a marker instruction and an implicit retainRV/claimRV call that consumes the call result. In addition, it emits a call to @llvm.objc.clang.arc.noop.use, which consumes the call result, to prevent the middle-end passes from changing the return type of the called function. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the calls with the operand bundle in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the call with the operand bundle. It doesn't remove the operand bundle on the call since the backend needs it to emit the marker instruction. The retainRV and claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes an autoreleaseRV call in the callee if nothing in the callee prevents it from being paired up with the retainRV/claimRV call in the caller. It then inserts a release call if the call is annotated with claimRV since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the operand bundle to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV returning the callee result, which makes it impossible to pair it up with the retainRV/claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the implicit call is a call to retainRV and does nothing if it's a call to claimRV. Future work: - Use the operand bundle on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the operand bundles. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-02-05 05:55:18 -08:00
Nico Weber	4874ff0241	Revert "[hip][cuda] Enable extended lambda support on Windows." This reverts commit `a2fdf9d4d7`. Slightly speculative, seeing several cuda tests fail on this Windows bot: http://45.33.8.238/win/32620/step_7.txt	2021-02-04 07:10:46 -05:00
Michael Liao	a2fdf9d4d7	[hip][cuda] Enable extended lambda support on Windows. - On Windows, extended lambda has extra issues due to the numbering schemes are different between the host compilation (Microsoft C++ ABI) and the device compilation (Itanium C++ ABI. Additional device side lambda number is required per lambda for the host compilation to correctly mangle the device-side lambda name. - A hybrid numbering context `MSHIPNumberingContext` is introduced to number a lambda for both host- and device-compilations. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D69322	2021-02-04 01:38:29 -05:00
Nico Weber	b995314143	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit `97ba5cde52`. Still breaks tests: https://reviews.llvm.org/D76802#2540647	2021-02-03 19:14:34 -05:00
Zequan Wu	4dc08cc3aa	[Coverage] Propogate counter to condition of conditional operator Clang usually propagates counter mapping region for conditions of `if`, `while`, `for`, etc from parent counter. We should do the same for condition of conditional operator. Differential Revision: https://reviews.llvm.org/D95918	2021-02-03 13:33:22 -08:00
Yaxun (Sam) Liu	0b2af1a288	[NFC][CUDA] Refactor registering device variable Extract registering device variable to CUDA runtime codegen function since it will be called in multiple places. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D95558	2021-02-03 14:29:51 -05:00
Kevin P. Neal	81b69879c9	[FPEnv][X86] Platform builtins edition: clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the X86 specific builtins. Differential Revision: https://reviews.llvm.org/D94614	2021-02-03 11:49:17 -05:00
Petr Hosek	97ba5cde52	[InstrProfiling] Use !associated metadata for counters, data and values C identifier name input sections such as __llvm_prf_* are GC roots so they cannot be discarded. In LLD, the SHF_LINK_ORDER flag overrides the C identifier name semantics. The !associated metadata may be attached to a global object declaration with a single argument that references another global object, and it gets lowered to SHF_LINK_ORDER flag. When a function symbol is discarded by the linker, setting up !associated metadata allows linker to discard counters, data and values associated with that function symbol. Note that !associated metadata is only supported by ELF, it does not have any effect on non-ELF targets. Differential Revision: https://reviews.llvm.org/D76802	2021-02-02 23:19:51 -08:00
Tom Weaver	4f1320b77d	Revert "[InstrProfiling] Use !associated metadata for counters, data and values" This reverts commit `df3e39f60b`. introduced failing test instrprof-gc-sections.c causing build bot to fail: http://lab.llvm.org:8011/#/builders/53/builds/1184	2021-02-02 14:19:31 +00:00
Hans Wennborg	0479c53b6c	[dllimport] Honor always_inline when deciding whether a dllimport function should be available for inlining (PR48925) Normally, Clang will not make dllimport functions available for inlining if they reference non-imported symbols, as this can lead to confusing link errors. But if the function is marked always_inline, the user presumably knows what they're doing and the attribute should be honored. Differential revision: https://reviews.llvm.org/D95673	2021-02-02 10:28:32 +01:00
Petr Hosek	df3e39f60b	[InstrProfiling] Use !associated metadata for counters, data and values C identifier name input sections such as __llvm_prf_* are GC roots so they cannot be discarded. In LLD, the SHF_LINK_ORDER flag overrides the C identifier name semantics. The !associated metadata may be attached to a global object declaration with a single argument that references another global object, and it gets lowered to SHF_LINK_ORDER flag. When a function symbol is discarded by the linker, setting up !associated metadata allows linker to discard counters, data and values associated with that function symbol. Note that !associated metadata is only supported by ELF, it does not have any effect on non-ELF targets. Differential Revision: https://reviews.llvm.org/D76802	2021-02-01 15:01:43 -08:00
xgupta	94fac81fcc	[Branch-Rename] Fix some links According to the [[ https://foundation.llvm.org/docs/branch-rename/ \| status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021. Reviewed By: mehdi_amini Differential Revision: https://reviews.llvm.org/D95766	2021-02-01 16:43:21 +05:30
Amy Huang	d5f5deee9e	Reland "[DebugInfo][CodeView] Use <lambda_n> as the display name for lambdas" with fix to test case and stringrefs. Currently (for codeview) lambdas have a string like `<lambda_0>` in their mangled name, and don't have any display name. This change uses the `<lambda_0>` as the display name, which helps distinguish between lambdas in -gline-tables-only, since there are no linkage names there. It also changes how we display lambda names; previously we used `<unnamed-tag>`; now it will show `<lambda_0>`. I added a function to the mangling context code to create this string; for Itanium it just returns an empty string. Bug: https://bugs.llvm.org/show_bug.cgi?id=48432 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D95187 This reverts `9b21d4b943`	2021-01-28 18:44:48 -08:00
Amy Huang	9b21d4b943	Revert "[DebugInfo][CodeView] Use <lambda_n> as the display name for lambdas." for test failures. This reverts commit `d73564c510`.	2021-01-28 16:41:26 -08:00
Amy Huang	d73564c510	[DebugInfo][CodeView] Use <lambda_n> as the display name for lambdas. Currently (for codeview) lambdas have a string like `<lambda_0>` in their mangled name, and don't have any display name. This change uses the `<lambda_0>` as the display name, which helps distinguish between lambdas in -gline-tables-only, since there are no linkage names there. It also changes how we display lambda names; previously we used `<unnamed-tag>`; now it will show `<lambda_0>`. I added a function to the mangling context code to create this string; for Itanium it just returns an empty string. Bug: https://bugs.llvm.org/show_bug.cgi?id=48432 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D95187	2021-01-28 16:30:38 -08:00
Thomas Lively	4b68b64dcc	[WebAssembly] Prototype i8x16 to i32x4 widening instructions As proposed in https://github.com/WebAssembly/simd/pull/395 and matching the opcodes used in V8: https://chromium-review.googlesource.com/c/v8/v8/+/2617385/4/src/wasm/wasm-opcodes.h Differential Revision: https://reviews.llvm.org/D95557	2021-01-28 10:59:32 -08:00
Freddy Ye	1edb76cc91	[X86] merge "={eax}" and "~{eax}" into "=&eax" for MSInlineASM Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D94466	2021-01-27 22:54:17 +08:00
Petr Hosek	bb9eb19829	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 17:13:34 -08:00
Fangrui Song	34b60d8a56	Add -fbinutils-version= to gate ELF features on the specified binutils version There are two use cases. Assembler We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some features are supported by latest GNU as, but we have to use MCAsmInfo::useIntegratedAs() because the newer versions have not been widely adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26). Linker We want to use features supported only by LLD or very new GNU ld, or don't want to work around older GNU ld. We currently can't represent that "we don't care about old GNU ld". You can find such workarounds in a few other places, e.g. Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276), R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969) Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001; GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available). This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table). This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc. It changes one codegen place in SHF_MERGE to demonstrate its usage. `-fbinutils-version=2.35` means the produced object file does not care about GNU ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced assembly can be consumed by GNU as>=2.35, but older versions may not work. `-fbinutils-version=none` means that we can use all ELF features, regardless of GNU as/ld support. Both clang and llc need `parseBinutilsVersion`. Such command line parsing is usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen), however, ClangCodeGen does not depend on LLVMCodeGen. So I add `parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget). Differential Revision: https://reviews.llvm.org/D85474	2021-01-26 12:28:23 -08:00
Petr Hosek	1e634f3952	Revert "Support for instrumenting only selected files or functions" This reverts commit `4edf35f11a` because the test fails on Windows bots.	2021-01-26 12:25:28 -08:00
Fangrui Song	189f311130	CGDebugInfo CreatedLimitedType: Drop file/line for RecordType with invalid location For Clang synthesized `__va_list_tag` (`CreateX86_64ABIBuiltinVaListDecl`), its DW_AT_decl_file/DW_AT_decl_line are arbitrarily set from `CurLoc`. In a stage 2 `-DCMAKE_BUILD_TYPE=Debug` clang build, I observe that in driver.cpp, DW_AT_decl_file/DW_AT_decl_line may be set to an `#include` line (the transitively included file uses va_arg (`__builtin_va_arg`)). This seems arbitrary. Drop that. Reviewed By: #debug-info, dblaikie Differential Revision: https://reviews.llvm.org/D94735	2021-01-26 11:53:25 -08:00
Fangrui Song	31d375f178	CGDebugInfo: Drop Loc.isInvalid() special case from getLineNumber `getLineNumber()` picks CurLoc if the parameter is invalid. This appears to mainly work around missing SourceLocation information for some constructs, but sometimes adds unintended locations. * For `CodeGenObjC/debug-info-blocks.m`, `CurLoc` has been advanced to the closing brace. The debug line of `ImplicitVarParameter` is set to the line of `}` because this implicit parameter has an invalid `SourceLocation`. The debug line is a bit arbitrary - perhaps the location of `^{` is better. * The file/line of Clang synthesized `__va_list_tag` is arbitrarily attached a `#include` line. D94735 Drop the special case to make getLineNumber less magic and add CurLoc fallback in its callers instead. Tested with stage 2 -DCMAKE_BUILD_TYPE=Debug clang, byte identical. Reviewed By: #debug-info, aprantl Differential Revision: https://reviews.llvm.org/D94391	2021-01-26 11:44:41 -08:00
Petr Hosek	4edf35f11a	Support for instrumenting only selected files or functions This change implements support for applying profile instrumentation only to selected files or functions. The implementation uses the sanitizer special case list format to select which files and functions to instrument, and relies on the new noprofile IR attribute to exclude functions from instrumentation. Differential Revision: https://reviews.llvm.org/D94820	2021-01-26 11:11:39 -08:00
Freddy Ye	b3b0acdc6f	[NFC] Refine some uninitialized used variables. These warning are reported by static code analysis tool: Klocwork Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D95421	2021-01-26 16:51:05 +08:00
Johannes Doerfert	bd756286d2	[OpenMP][FIX] Enforce a function boundary for a new data environment Whenever we enter a new OpenMP data environment we want to enter a function to simplify reasoning. Later we probably want to remove the entire specialization wrt. the if clause and pass the result to the runtime, for now this should fix PR48686. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D94315	2021-01-25 22:43:37 -06:00
Richard Smith	925ae8c790	Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV" This reverts commit `53176c1680`, which introduceed a layering violation. LLVM's IR library can't include headers from Analysis.	2021-01-25 13:53:38 -08:00
Akira Hatanaka	53176c1680	[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV or claimRV calls in the IR Background: This patch makes changes to the front-end and middle-end that are needed to fix a longstanding problem where llvm breaks ARC's autorelease optimization (see the link below) by separating calls from the marker instructions or retainRV/claimRV calls. The backend changes are in https://reviews.llvm.org/D92569. https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue What this patch does to fix the problem: - The front-end annotates calls with attribute "clang.arc.rv"="retain" or "clang.arc.rv"="claim", which indicates the call is implicitly followed by a marker instruction and a retainRV/claimRV call that consumes the call result. This is currently done only when the target is arm64 and the optimization level is higher than -O0. - ARC optimizer temporarily emits retainRV/claimRV calls after the annotated calls in the IR and removes the inserted calls after processing the function. - ARC contract pass emits retainRV/claimRV calls after the annotated calls. It doesn't remove the attribute on the call since the backend needs it to emit the marker instruction. The retainRV/claimRV calls are emitted late in the pipeline to prevent optimization passes from transforming the IR in a way that makes it harder for the ARC middle-end passes to figure out the def-use relationship between the call and the retainRV/claimRV calls (which is the cause of PR31925). - The function inliner removes the autoreleaseRV call in the callee that returns the result if nothing in the callee prevents it from being paired up with the calls annotated with "clang.arc.rv"="retain/claim" in the caller. If the call is annotated with "claim", a release call is inserted since autoreleaseRV+claimRV is equivalent to a release. If it cannot find an autoreleaseRV call, it tries to transfer the attributes to a function call in the callee. This is important since ARC optimizer can remove the autoreleaseRV call returning the callee result, which makes it impossible to pair it up with the retainRV or claimRV call in the caller. If that fails, it simply emits a retain call in the IR if the call is annotated with "retain" and does nothing if it's annotated with "claim". - This patch teaches dead argument elimination pass not to change the return type of a function if any of the calls to the function are annotated with attribute "clang.arc.rv". This is necessary since the pass can incorrectly determine nothing in the IR uses the function return, which can happen since the front-end no longer explicitly emits retainRV/claimRV calls in the IR, and change its return type to 'void'. Future work: - Use the attribute on x86-64. - Fix the auto upgrader to convert call+retainRV/claimRV pairs into calls annotated with the attributes. rdar://71443534 Differential Revision: https://reviews.llvm.org/D92808	2021-01-25 11:57:08 -08:00
Keith Smiley	c3324450b2	[clang] Add -fprofile-prefix-map This flag allows you to re-write absolute paths in coverage data analogous to -fdebug-prefix-map. This flag is also implied by -ffile-prefix-map.	2021-01-25 10:14:04 -08:00
George Koehler	018984ae68	[PowerPC] Fix va_arg in C++, Objective-C on 32-bit ELF targets In the PPC32 SVR4 ABI, a va_list has copies of registers from the function call. va_arg looked in the wrong registers for (the pointer representation of) an object in Objective-C, and for some types in C++. Fix va_arg to look in the general-purpose registers, not the floating-point registers. Also fix va_arg for some C++ types, like a member function pointer, that are aggregates for the ABI. Anthony Richardby found the problem in Objective-C. Eli Friedman suggested part of this fix. Fixes https://bugs.llvm.org/show_bug.cgi?id=47921 Reviewed By: efriedma, nemanjai Differential Revision: https://reviews.llvm.org/D90329	2021-01-23 00:13:36 -05:00
Bjorn Pettersson	ea2cfda386	[CGExpr] Use getCharWidth() more consistently in CCGExprConstant. NFC Most of CGExprConstant.cpp is using the CharUnits abstraction and is using getCharWidth() (directly of indirectly) when converting between size of a char and size in bits. This patch is making that abstraction more consistent by adding CharTy to the CodeGenTypeCache (honoring getCharWidth() when mapping from char to LLVM IR types, instead of using Int8Ty directly). Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D94979	2021-01-22 21:12:17 +01:00
Bjorn Pettersson	72f863fd37	[CodeGen] Use getCharWidth() more consistently in CGRecordLowering. NFC When using getByteArrayType the requested size is calculated in char units, but the type used for the array was hardcoded to the Int8Ty. This patch is using getCharWIdth a bit more consistently by using getIntNTy in combination with getCharWidth, instead of explictly using getInt8Ty. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D94977	2021-01-22 21:12:17 +01:00
Yaxun (Sam) Liu	622eaa4a4c	[HIP] Support __managed__ attribute This patch implements codegen for __managed__ variable attribute for HIP. Diagnostics will be added later. Differential Revision: https://reviews.llvm.org/D94814	2021-01-22 11:43:58 -05:00
Akira Hatanaka	3d349ed7e1	[CodeGen][ObjC] Fix broken IR generated when there is a nil receiver check This patch fixes a bug in emitARCOperationAfterCall where it inserts the fall-back call after a bitcast instruction and then replaces the bitcast's operand with the result of the fall-back call. The generated IR without this patch looks like this: msgSend.call: ; preds = %entry %call = call i8* bitcast (i8* (i8, i8, ...)* @objc_msgSend br label %msgSend.cont msgSend.null-receiver: ; preds = %entry call void @llvm.objc.release(i8* %4) br label %msgSend.cont msgSend.cont: %8 = phi i8* [ %call, %msgSend.call ], [ null, %msgSend.null-receiver ] %9 = bitcast i8* %10 to %0* %10 = call i8* @llvm.objc.retain(i8* %8) Notice that `%9 = bitcast i8* %10` to %0* is taking operand %10 which is defined after it. To fix the bug, this patch modifies the insert point to point to the bitcast instruction so that the fall-back call is inserted before the bitcast. In addition, it teaches the function to look at phi instructions that are generated when there is a check for a null receiver and insert the retainRV/claimRV instruction right after the call instead of inserting a fall-back call right after the phi instruction. rdar://73360225 Differential Revision: https://reviews.llvm.org/D95181	2021-01-21 17:38:46 -08:00
Jon Roelofs	1deee5cacb	Fix crash when emitting NullReturn guards for functions returning BOOL CodeGenModule::EmitNullConstant() creates constants with their "in memory" type, not their "in vregs" type. The one place where this difference matters is when the type is _Bool, as that is an i1 when in vregs and an i8 in memory. Fixes: rdar://73361264	2021-01-21 14:29:36 -08:00
Joseph Huber	e4eaf9d820	[OpenMP] Add support for mapping names in mapper API Summary: The custom mapper API did not previously support the mapping names added previously. This means they were not present if a user requested debugging information while using the mapper functions. This adds basic support for passing the mapped names to the runtime library. Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D94806	2021-01-21 09:26:44 -05:00
Amy Huang	a3d7cee7f9	[CodeView] Emit function types in -gline-tables-only. This change adds function types to further differentiate between FUNC_IDs in -gline-tables-only. Size increase of object files in clang are Before: 917990 kb After: 999312 kb Bug: https://bugs.llvm.org/show_bug.cgi?id=48432 Differential Revision: https://reviews.llvm.org/D95001	2021-01-20 12:47:35 -08:00
Thomas Lively	11802eced5	[WebAssembly] Prototype new f64x2 conversions As proposed in https://github.com/WebAssembly/simd/pull/383. Differential Revision: https://reviews.llvm.org/D95012	2021-01-20 11:28:06 -08:00
Hans Wennborg	8ba442bc21	Revert "Following up on PR48517, fix handling of template arguments that refer" Combined with 'da98651 - Revert "DR2064: decltype(E) is only a dependent', this change (`5a391d3`) caused verifier errors when building Chromium. See https://crbug.com/1168494#c1 for a reproducer. Additionally it reverts changes that were dependent on this one, see below. > Following up on PR48517, fix handling of template arguments that refer > to dependent declarations. > > Treat an id-expression that names a local variable in a templated > function as being instantiation-dependent. > > This addresses a language defect whereby a reference to a dependent > declaration can be formed without any construct being value-dependent. > Fixing that through value-dependence turns out to be problematic, so > instead this patch takes the approach (proposed on the core reflector) > of allowing the use of pointers or references to (but not values of) > dependent declarations inside value-dependent expressions, and instead > treating template arguments as dependent if they evaluate to a constant > involving such dependent declarations. > > This ends up affecting a bunch of OpenMP tests, due to OpenMP > imprecisely handling instantiation-dependent constructs, bailing out > early instead of processing dependent constructs to the extent possible > when handling the template. > > Previously committed as `8c1f2d15b8`, and > reverted because a dependency commit was reverted. This reverts commit `5a391d38ac`. It also restores clang/test/SemaCXX/coroutines.cpp to its state before `da986511fb`. Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type." > Previously committed as `9e08e51a20`, and > reverted because a dependency commit was reverted. This incorporates the > following follow-on commits that were also reverted: > > `7e84aa1b81` by Simon Pilgrim > `ed13d8c667` by me > `95c7b6cadb` by Sam McCall > `430d5d8429` by Dave Zarzycki This reverts commit `4b574008ae`. Revert "[msabi] Mangle a template argument referring to array-to-pointer decay" > [msabi] Mangle a template argument referring to array-to-pointer decay > applied to an array the same as the array itself. > > This follows MS ABI, and corrects a regression from the implementation > of generalized non-type template parameters, where we "forgot" how to > mangle this case. This reverts commit `18e093faf7`.	2021-01-20 15:55:35 +01:00
Alexey Bataev	b272698de7	[OPENMP]Do not use OMP_MAP_TARGET_PARAM for data movement directives. OMP_MAP_TARGET_PARAM flag is used to mark the data that shoud be passed as arguments to the target kernels, nothing else. But the compiler still marks the data with OMP_MAP_TARGET_PARAM flags even if the data is passed to the data movement directives, like target data, target update etc. This flag is just ignored for this directives and the compiler does not need to emit it. Reviewed By: cchen Differential Revision: https://reviews.llvm.org/D91261	2021-01-19 12:41:15 -08:00
Shilei Tian	82e537a9d2	[Clang][OpenMP] Fixed an issue that clang crashed when compiling OpenMP program in device only mode without host IR D94745 rewrites the `deviceRTLs` using OpenMP and compiles it by directly calling the device compilation. `clang` crashes because entry in `OffloadEntriesDeviceGlobalVar` is unintialized. Current design supposes the device compilation can only be invoked after host compilation with the host IR such that `clang` can initialize `OffloadEntriesDeviceGlobalVar` from host IR. This avoids us using device compilation directly, especially when we only have code wrapped into `declare target` which are all device code. The same issue also exists for `OffloadEntriesInfoManager`. In this patch, we simply initialized an entry if it is not in the maps. Not sure we need an option to tell the device compiler that it is invoked standalone. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D94871	2021-01-19 14:18:42 -05:00
Richard Smith	4b574008ae	[c++20] P1907R1: Support for generalized non-type template arguments of scalar type. Previously committed as `9e08e51a20`, and reverted because a dependency commit was reverted. This incorporates the following follow-on commits that were also reverted: `7e84aa1b81` by Simon Pilgrim `ed13d8c667` by me `95c7b6cadb` by Sam McCall `430d5d8429` by Dave Zarzycki	2021-01-18 21:05:01 -08:00
Amy Huang	6227069bdc	[DebugInfo][CodeView] Change in line tables only mode to emit type information for function scopes, rather than using the qualified name. In line-tables-only mode, we used to emit qualified names as the display name for functions when using CodeView. This patch changes to emitting the parent scopes instead, with forward declarations for class types. The total object file size ends up being slightly smaller than if we use the full qualified names. Differential Revision: https://reviews.llvm.org/D94639	2021-01-15 09:28:27 -08:00
Qiu Chaofan	168be42083	[Clang] Mutate long-double math builtins into f128 under IEEE-quad Under -mabi=ieeelongdouble on PowerPC, IEEE-quad floating point semantic is used for long double. This patch mutates call to related builtins into f128 version on PowerPC. And in theory, this should be applied to other targets when their backend supports IEEE 128-bit style libcalls. GCC already has these mutations except nansl, which is not available on PowerPC along with other variants (nans, nansf). Reviewed By: RKSimon, nemanjai Differential Revision: https://reviews.llvm.org/D92080	2021-01-15 16:56:20 +08:00
Lucas Prates	2b1e25befe	[AArch64] Adding ACLE intrinsics for the LS64 extension This introduces the ARMv8.7-A LS64 extension's intrinsics for 64 bytes atomic loads and stores: `__arm_ld64b`, `__arm_st64b`, `__arm_st64bv`, and `__arm_st64bv0`. These are selected into the LS64 instructions LD64B, ST64B, ST64BV and ST64BV0, respectively. Based on patches written by Simon Tatham. Reviewed By: tmatheson Differential Revision: https://reviews.llvm.org/D93232	2021-01-14 09:43:58 +00:00
Zequan Wu	e53bbd9951	[IR] move nomerge attribute from function declaration/definition to callsites Move nomerge attribute from function declaration/definition to callsites to allow virtual function calls attach the attribute. Differential Revision: https://reviews.llvm.org/D94537	2021-01-12 12:10:46 -08:00
David Truby	e5f51fdd65	[clang][aarch64] Precondition isHomogeneousAggregate on isCXX14Aggregate MSVC on WoA64 includes isCXX14Aggregate in its definition. This is de-facto specification on that platform, so match msvc's behaviour. Fixes: https://bugs.llvm.org/show_bug.cgi?id=47611 Co-authored-by: Peter Waller <peter.waller@arm.com> Differential Revision: https://reviews.llvm.org/D92751	2021-01-12 19:44:01 +00:00
Bevin Hansson	c4944a6f53	[Fixed Point] Add codegen for conversion between fixed-point and floating point. The patch adds the required methods to FixedPointBuilder for converting between fixed-point and floating point, and uses them from Clang. This depends on D54749. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D86632	2021-01-12 13:53:01 +01:00
Fangrui Song	b88c8f1aab	CGDebugInfo: Delete unused parameters	2021-01-11 13:39:03 -08:00
Fangrui Song	f4cec703ec	Add an assert to CGDebugInfo::getTypeOrNull	2021-01-11 13:25:20 -08:00
Joe Ellis	8ea72b3887	[clang][AArch64][SVE] Avoid going through memory for coerced VLST return values VLST return values are coerced to VLATs in the function epilog for consistency with the VLAT ABI. Previously, this coercion was done through memory. It is preferable to use the llvm.experimental.vector.insert intrinsic to avoid going through memory here. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D94290	2021-01-11 12:10:59 +00:00
Fangrui Song	b8d2842088	CGDebugInfo: Delete unneeded UnwrapTypeForDebugInfo Tested with stage 2 -DCMAKE_BUILD_TYPE=Debug clang, byte identical.	2021-01-10 22:22:07 -08:00
Fangrui Song	6215c1b778	CGDebugInfo: Delete redundant test	2021-01-10 22:22:06 -08:00
Fangrui Song	02bc320545	CGDebugInfo: Delete unused DIFile* parameter	2021-01-10 15:03:40 -08:00
Fangrui Song	abfe348e6b	[test] Improve CodeGenCXX/difile_entry.cpp The test added in D87147 did not actually test PR47391. Use an absolute path to test the canonicalization.	2021-01-10 12:24:49 -08:00
Fangrui Song	e2e82c9983	[CodeGenModule] Drop dso_local on function declarations for ELF -fno-pic -fno-direct-access-external-data ELF -fno-pic sets dso_local on a function declaration to allow direct accesses when taking its address (similar to a data symbol). The emitted code follows the traditional GCC/Clang -fno-pic behavior: an absolute relocation is produced. If the function is not defined in the executable, a canonical PLT entry will be needed at link time. This is similar to a copy relocation and is incompatible with (-Bsymbolic or --dynamic-list linked shared objects / protected symbols in a shared object). This patch gives -fno-pic code a way to avoid such a canonical PLT entry. The FIXME was about a generalization for -fpie -mpie-copy-relocations (now -fpie -fdirect-access-external-data). While we could set dso_local to avoid GOT when taking the address of a function declaration (there is an ignorable difference about R_386_PC32 vs R_386_PLT32 on i386), it likely does not provide any benefit and can just cause trouble, so we don't make the generalization.	2021-01-09 16:31:56 -08:00
Fangrui Song	38a716c30f	Make -fno-pic respect -fno-direct-access-external-data D92633 added -f[no-]direct-access-external-data to supersede -m[no-]pie-copy-relocations. (The option works for -fpie but is a no-op for -fno-pic and -fpic.) This patch makes -fno-pic -fno-direct-access-external-data drop dso_local from global variable declarations. This usually causes the backend to emit a GOT indirection for external data access. With a GOT relocation, the subsequent -no-pie link will not have copy relocation even if the data symbol turns out to be defined by a shared object. Differential Revision: https://reviews.llvm.org/D92714	2021-01-09 00:32:02 -08:00
Fangrui Song	1d3ebbf537	Add -f[no-]direct-access-external-data to supersede -mpie-copy-relocations GCC r218397 "x86-64: Optimize access to globals in PIE with copy reloc" made -fpie code emit R_X86_64_PC32 to reference external data symbols by default. Clang adopted -mpie-copy-relocations D19996 as a flexible alternative. The name -mpie-copy-relocations can be improved [1] and does not capture the idea that this option can apply to -fno-pic and -fpic [2], so this patch introduces -f[no-]direct-access-external-data and makes -mpie-copy-relocations their aliases for compatibility. [1] For ``` extern int var; int get() { return var; } ``` if var is defined in another translation unit in the link unit, there is no copy relocation. [2] -fno-pic -fno-direct-access-external-data is useful to avoid copy relocations. https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65888 If a shared object is linked with -Bsymbolic or --dynamic-list and exports a data symbol, normally the data symbol cannot be accessed by -fno-pic code (because by default an absolute relocation is produced which will lead to a copy relocation). -fno-direct-access-external-data can prevent copy relocations. -fpic -fdirect-access-external-data can avoid GOT indirection. This is like the undefined counterpart of -fno-semantic-interposition. However, the user should define var in another translation unit and link with -Bsymbolic or --dynamic-list, otherwise the linker will error in a -shared link. Generally the user has better tools for their goal but I want to mention that this combination is valid. On COFF, the behavior is like always -fdirect-access-external-data. `__declspec(dllimport)` is needed to enable indirect access. There is currently no plan to affect non-ELF behaviors or -fpic behaviors. -fno-pic -fno-direct-access-external-data will be implemented in the subsequent patch. GCC feature request https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D92633	2021-01-09 00:32:01 -08:00
Umesh Kalappa	33c8e16f66	PR47391: Canonicalize DIFiles Like @aprantl suggested, modify to use the canonicalized DIFile, if we don't know the loc info and filename for the compiler generated functions for example static initialization functions. Reviewed By: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D87147	2021-01-08 22:11:16 -08:00
Arthur Eubanks	756dd70766	[NewPM] Run ObjC ARC passes Match the legacy PM in running various ObjC ARC passes. This requires making some module passes into function passes. These were initially ported as module passes since they add function declarations (e.g. https://reviews.llvm.org/D86178), but that's still up for debate and other passes do so. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D93743	2021-01-08 15:47:11 -08:00
Hongtao Yu	0e23fd676c	[Driver] Add DWARF64 flag: -gdwarf64 @ikudrin enabled support for dwarf64 in D87011. Adding a clang flag so it can be used through that compilation pass. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D90507	2021-01-08 12:58:38 -08:00
Heejin Ahn	7be271537e	[WebAssembly] Rename wasm_rethrow_in_catch intrinsic/builtin `wasm_rethrow_in_catch` intrinsic and builtin are used in order to rethrow an exception when the exception is caught but there is no matching clause within the current `catch`. For example, ``` try { foo(); } catch (int n) { ... } ``` If the caught exception does not correspond to C++ `int` type, it should be rethrown. These intrinsic/builtin were renamed `rethrow_in_catch` because at the time I thought there would be another intrinsic for C++'s `throw` keyword, which rethrows an exception. It turned out that `throw` keyword doesn't require wasm's `rethrow` instruction, so we rename `rethrow_in_catch` to just `rethrow` here. Reviewed By: dschuff, tlively Differential Revision: https://reviews.llvm.org/D94038	2021-01-08 06:55:04 -08:00
David Sherwood	38d18d9353	[SVE] Add support to vectorize_width loop pragma for scalable vectors This patch adds support for two new variants of the vectorize_width pragma: 1. vectorize_width(X[, fixed\|scalable]) where an optional second parameter is passed to the vectorize_width pragma, which indicates if the user wishes to use fixed width or scalable vectorization. For example the user can now write something like: #pragma clang loop vectorize_width(4, fixed) or #pragma clang loop vectorize_width(4, scalable) In the absence of a second parameter it is assumed the user wants fixed width vectorization, in order to maintain compatibility with existing code. 2. vectorize_width(fixed\|scalable) where the width is left unspecified, but the user hints what type of vectorization they prefer, either fixed width or scalable. I have implemented this by making use of the LLVM loop hint attribute: llvm.loop.vectorize.scalable.enable Tests were added to clang/test/CodeGenCXX/pragma-loop.cpp for both the 'fixed' and 'scalable' optional parameter. See this thread for context: http://lists.llvm.org/pipermail/cfe-dev/2020-November/067262.html Differential Revision: https://reviews.llvm.org/D89031	2021-01-08 11:37:27 +00:00
Wang, Pengfei	c102b9697b	[X86] Correct the comments about comparison intrinsics. NFCI.	2021-01-08 15:36:15 +08:00
Reid Kleckner	ad55d5c3f3	Simplify vectorcall argument classification of HVAs, NFC This reduces the number of `WinX86_64ABIInfo::classify` call sites from 3 to 1. The call sites were similar, but passed different values for FreeSSERegs. Use variables instead of `if`s to manage that argument.	2021-01-07 11:14:18 -08:00
Pushpinder Singh	4909cb1a0f	[OpenMP][AMDGPU] Use AMDGPU_KERNEL calling convention for entry function AMDGPU backend requires entry functions/kernels to have AMDGPU_KERNEL calling convention for proper linking. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D94060	2021-01-06 02:03:30 -05:00
Thomas Lively	497026c902	[WebAssembly] Prototype prefetch instructions As proposed in https://github.com/WebAssembly/simd/pull/352 and using the opcodes used in the V8 prototype: https://chromium-review.googlesource.com/c/v8/v8/+/2543167. These instructions are only usable via intrinsics and clang builtins to make them opt-in while they are being benchmarked. Differential Revision: https://reviews.llvm.org/D93883	2021-01-05 11:32:03 -08:00
Simon Pilgrim	55488bd3cd	CGExpr - EmitMatrixSubscriptExpr - fix getAs<> null-dereference static analyzer warning. NFCI. getAs<> can return null if the cast is invalid, which can lead to null pointer deferences. Use castAs<> instead which will assert that the cast is valid.	2021-01-05 17:08:11 +00:00
Alan Phipps	9f2967bcfe	[Coverage] Add support for Branch Coverage in LLVM Source-Based Code Coverage This is an enhancement to LLVM Source-Based Code Coverage in clang to track how many times individual branch-generating conditions are taken (evaluate to TRUE) and not taken (evaluate to FALSE). Individual conditions may comprise larger boolean expressions using boolean logical operators. This functionality is very similar to what is supported by GCOV except that it is very closely anchored to the ASTs. Differential Revision: https://reviews.llvm.org/D84467	2021-01-05 09:51:51 -06:00
Joe Ellis	3d5b18a3fd	[clang][AArch64][SVE] Avoid going through memory for coerced VLST arguments VLST arguments are coerced to VLATs at the function boundary for consistency with the VLAT ABI. They are then bitcast back to VLSTs in the function prolog. Previously, this conversion is done through memory. With the introduction of the llvm.vector.{insert,extract} intrinsic, we can avoid going through memory here. Depends on D92761 Differential Revision: https://reviews.llvm.org/D92762	2021-01-05 15:18:21 +00:00
Thorsten Schütt	2fd11e0b1e	Revert "[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II)" This reverts commit `efc82c4ad2`.	2021-01-04 23:17:45 +01:00
Thorsten Schütt	efc82c4ad2	[NFC, Refactor] Modernize StorageClass from Specifiers.h to a scoped enum (II) Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D93765	2021-01-04 22:58:26 +01:00
Hongtao Yu	4034f9273e	Switching Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm. As a follow-up to D93656, I'm switching the Clang UniqueInternalLinkageNamesPass scheduling to using the LLVM one with newpm. Test Plan: Reviewed By: aeubanks, tmsriram Differential Revision: https://reviews.llvm.org/D94019	2021-01-04 12:04:46 -08:00
Jon Chesterfield	76bfbb74d3	[libomptarget][amdgpu] Call into deviceRTL instead of ockl [libomptarget][amdgpu] Call into deviceRTL instead of ockl Amdgpu codegen presently emits a call into ockl. The same functionality is already present in the deviceRTL. Adds an amdgpu specific entry point to avoid the dependency. This lets simple openmp code (specifically, that which doesn't use libm) run without rocm device libraries installed. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D93356	2021-01-04 16:48:47 +00:00
Brandon Bergren	6cee9d0cf8	[PowerPC] Support powerpcle target in Clang [3/5] Add powerpcle support to clang. For FreeBSD, assume a freestanding environment for now, as we only need it in the first place to build loader, which runs in the OpenFirmware environment instead of the FreeBSD environment. For Linux, recognize glibc and musl environments to match current usage in Void Linux PPC. Adjust driver to match current binutils behavior regarding machine naming. Adjust and expand tests. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D93919	2021-01-02 12:17:58 -06:00
Fangrui Song	d1fd72343c	Refactor how -fno-semantic-interposition sets dso_local on default visibility external linkage definitions The idea is that the CC1 default for ELF should set dso_local on default visibility external linkage definitions in the default -mrelocation-model pic mode (-fpic/-fPIC) to match COFF/Mach-O and make output IR similar. The refactoring is made available by `2820a2ca3a`. Currently only x86 supports local aliases. We move the decision to the driver. There are three CC1 states: * -fsemantic-interposition: make some linkages interposable and make default visibility external linkage definitions dso_preemptable. * (default): selected if the target supports .Lfoo$local: make default visibility external linkage definitions dso_local * -fhalf-no-semantic-interposition: if neither option is set or the target does not support .Lfoo$local: like -fno-semantic-interposition but local aliases are not used. So references can be interposed if not optimized out. Add -fhalf-no-semantic-interposition to a few tests using the half-based semantic interposition behavior.	2020-12-31 13:59:45 -08:00
Fangrui Song	809a1e0ffd	[CodeGenModule] Set dso_local for Mach-O GlobalValue * static relocation model: always * other relocation models: if isStrongDefinitionForLinker This will make LLVM IR emitted for COFF/Mach-O and executable ELF similar.	2020-12-30 20:52:01 -08:00
Juneyoung Lee	9b29610228	Use unary CreateShuffleVector if possible As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`. Let's update them. Actually, it would have been more natural if the patches were made in this order: (1) let them use unary CreateShuffleVector first (2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793) The order is swapped, but in terms of correctness it is still fine. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D93923	2020-12-30 22:36:08 +09:00
Fangrui Song	2820a2ca3a	Move -fno-semantic-interposition dso_local logic from TargetMachine to Clang CodeGenModule This simplifies TargetMachine::shouldAssumeDSOLocal and and gives frontend the decision to use dso_local. For LLVM synthesized functions/globals, they may lose inferred dso_local but such optimizations are probably not very useful. Note: the hasComdat() condition in canBenefitFromLocalAlias (D77429) may be dead now. (llvm/CodeGen/X86/semantic-interposition-comdat.ll) (Investigate whether we need test coverage when Fuchsia C++ ABI is clearer)	2020-12-29 23:37:55 -08:00
James Y Knight	4ddf140c00	Fix PR35902: incorrect alignment used for ubsan check. UBSan was using the complete-object align rather than nv alignment when checking the "this" pointer of a method. Furthermore, CGF.CXXABIThisAlignment was also being set incorrectly, due to an incorrectly negated test. The latter doesn't appear to have had any impact, due to it not really being used anywhere. Differential Revision: https://reviews.llvm.org/D93072	2020-12-28 18:11:17 -05:00
Thomas Lively	5e09e9979b	[WebAssembly] Prototype extending pairwise add instructions As proposed in https://github.com/WebAssembly/simd/pull/380. This commit makes the new instructions available only via clang builtins and LLVM intrinsics to make their use opt-in while they are still being evaluated for inclusion in the SIMD proposal. Depends on D93771. Differential Revision: https://reviews.llvm.org/D93775	2020-12-28 14:11:14 -08:00
Akira Hatanaka	34405b41d6	[CodeGen][ObjC] Destroy callee-destroyed arguments in the caller function when the receiver is nil Callee-destroyed arguments to a method have to be destroyed in the caller function when the receiver is nil as the method doesn't get executed. This fixes PR48207. rdar://71808391 Differential Revision: https://reviews.llvm.org/D93273	2020-12-28 11:52:27 -08:00
Alexandre Ganea	69132d12de	[Clang] Reverse test to save on indentation. NFC.	2020-12-23 19:24:53 -05:00
Alan Phipps	bbd758a791	Revert "This is a test commit" This reverts commit `b920adf3b4`.	2020-12-23 13:04:37 -06:00
Alan Phipps	b920adf3b4	This is a test commit	2020-12-23 12:57:27 -06:00
Arthur O'Dwyer	22cf54a7fb	Replace `T(x)` with `reinterpret_cast<T>(x)` everywhere it means reinterpret_cast. NFC. Differential Revision: https://reviews.llvm.org/D76572	2020-12-22 19:54:29 -05:00
Arthur Eubanks	2080232333	Revert "[c++20] P1907R1: Support for generalized non-type template arguments of scalar type." This reverts commit `9e08e51a20`. This is part of 5 commits being reverted due to https://crbug.com/1161059. See bug for repro.	2020-12-22 10:18:08 -08:00
Richard Smith	9e08e51a20	[c++20] P1907R1: Support for generalized non-type template arguments of scalar type.	2020-12-18 01:08:41 -08:00
Rong Xu	3733463dbb	[IR][PGO] Add hot func attribute and use hot/cold attribute in func section Clang FE currently has hot/cold function attribute. But we only have cold function attribute in LLVM IR. This patch adds support of hot function attribute to LLVM IR. This attribute will be used in setting function section prefix/suffix. Currently .hot and .unlikely suffix only are added in PGO (Sample PGO) compilation (through isFunctionHotInCallGraph and isFunctionColdInCallGraph). This patch changes the behavior. The new behavior is: (1) If the user annotates a function as hot or isFunctionHotInCallGraph is true, this function will be marked as hot. Otherwise, (2) If the user annotates a function as cold or isFunctionColdInCallGraph is true, this function will be marked as cold. The changes are: (1) user annotated function attribute will used in setting function section prefix/suffix. (2) hot attribute overwrites profile count based hotness. (3) profile count based hotness overwrite user annotated cold attribute. The intention for these changes is to provide the user a way to mark certain function as hot in cases where training input is hard to cover all the hot functions. Differential Revision: https://reviews.llvm.org/D92493	2020-12-17 18:41:12 -08:00
Tom Stellard	3203143f13	CodeGen: Improve generated IR for __builtin_mul_overflow(uint, uint, int) Add a special case for handling __builtin_mul_overflow with unsigned inputs and a signed output to avoid emitting the __muloti4 library call on x86_64. __muloti4 is not implemented in libgcc, so avoiding this call fixes compilation of some programs that call __builtin_mul_overflow with these arguments. For example, this fixes the build of cpio with clang, which includes code from gnulib that calls __builtin_mul_overflow with these argument types. Reviewed By: vsk Differential Revision: https://reviews.llvm.org/D84405	2020-12-17 14:30:31 -08:00
Baptiste Saleil	c2892978e9	[PowerPC] Rename the vector pair intrinsics and builtins to replace the _mma_ prefix by _vsx_ On PPC, the vector pair instructions are independent from MMA. This patch renames the vector pair LLVM intrinsics and Clang builtins to replace the _mma_ prefix by _vsx_ in their names. We also move the vector pair type/intrinsic/builtin tests to their own files. Differential Revision: https://reviews.llvm.org/D91974	2020-12-17 13:19:27 -05:00
Zequan Wu	fb0f728805	[Clang] Make nomerge attribute a function attribute as well as a statement attribute. Differential Revision: https://reviews.llvm.org/D92800	2020-12-17 07:45:38 -08:00
dfukalov	9ed8e0caab	[NFC] Reduce include files dependency and AA header cleanup (part 2). Continuing work started in https://reviews.llvm.org/D92489: Removed a bunch of includes from "AliasAnalysis.h" and "LoopPassManager.h". Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D92852	2020-12-17 14:04:48 +03:00
Fangrui Song	c70f36865e	Use basic_string::find(char) instead of basic_string::find(const char *s, size_type pos=0) Many (StringRef) cannot be detected by clang-tidy performance-faster-string-find.	2020-12-16 23:28:32 -08:00
Joe Ellis	dad07baf12	[clang][AArch64][SVE] Avoid going through memory for VLAT <-> VLST casts This change makes use of the llvm.vector.extract intrinsic to avoid going through memory when performing bitcasts between vector-length agnostic types and vector-length specific types. Depends on D91362 Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D92761	2020-12-16 12:24:32 +00:00
Johannes Doerfert	b9c77542e2	[Clang][Attr] Introduce the `assume` function attribute The `assume` attribute is a way to provide additional, arbitrary information to the optimizer. For now, assumptions are restricted to strings which will be accumulated for a function and emitted as comma separated string function attribute. The key of the LLVM-IR function attribute is `llvm.assume`. Similar to `llvm.assume` and `__builtin_assume`, the `assume` attribute provides a user defined assumption to the compiler. A follow up patch will introduce an LLVM-core API to query the assumptions attached to a function. We also expect to add more options, e.g., expression arguments, to the `assume` attribute later on. The `omp [begin] asssumes` pragma will leverage this attribute and expose the functionality in the absence of OpenMP. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D91979	2020-12-15 16:51:34 -06:00
Fangrui Song	59decf8e9c	[clang] Migrate deprecated DebugInfo::get to DILocation::get	2020-12-15 13:59:31 -08:00
Baptiste Saleil	57d83c3a90	[PowerPC] Enable paired vector type and intrinsics when MMA is disabled This patch enables the Clang type __vector_pair and its associated LLVM intrinsics even when MMA is disabled. With this patch, the type is now controlled by the PPC paired-vector-memops option. The builtins and intrinsics will be renamed to drop the mma prefix in another patch. Differential Revision: https://reviews.llvm.org/D91819	2020-12-15 15:14:11 -06:00
Jan Svoboda	f24e58df7d	[clang][cli] Create accessors for exception models in LangOptions This abstracts away the members that are being replaced in a follow-up patch. Depends on D83979. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D93214	2020-12-15 10:15:58 +01:00
Rong Xu	c36f31c4db	[PGO] remove unintentional code in early commit Remove unintentional code in commit 54e03d [PGO] Verify BFI counts after loading profile data.	2020-12-14 18:41:49 -08:00
Rong Xu	54e03d03a7	[PGO] Verify BFI counts after loading profile data This patch adds the functionality to compare BFI counts with real profile counts right after reading the profile. It will print remarks under -Rpass-analysis=pgo, or the internal option -pass-remarks-analysis=pgo. Differential Revision: https://reviews.llvm.org/D91813	2020-12-14 15:56:10 -08:00
Gulfem Savrun Yeniceri	7c0e3a77bc	[clang][IR] Add support for leaf attribute This patch adds support for leaf attribute as an optimization hint in Clang/LLVM. Differential Revision: https://reviews.llvm.org/D90275	2020-12-14 14:48:17 -08:00
Matt Arsenault	ef4da3c2ba	clang: Add byval on x86_intrcc parameter 0 This will allow removing the special case treatment of the parameter and avoid depending on the pointer's element type.	2020-12-14 16:34:37 -05:00
Simon Pilgrim	4855a1004d	[X86] Convert fadd/fmul _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Followup to D87604, having confirmed on PR47506 that we can use the llvm codegen expansion for fadd/fmul as well. Differential Revision: https://reviews.llvm.org/D92940	2020-12-13 15:37:35 +00:00
Alexey Bader	a500a43587	[CodeGen][AMDGPU] Fix ICE for static initializer IR generation Differential Revision: https://reviews.llvm.org/D92782	2020-12-12 23:26:54 +03:00
Melanie Blower	320af6b138	Create SPIRABIInfo to enable SPIR_FUNC calling convention. Background: Call to library arithmetic functions for div is emitted by the compiler and it set wrong “C” calling convention for calls to these functions, whereas library functions are declared with `spir_function` calling convention. InstCombine optimization replaces such calls with “unreachable” instruction. It looks like clang lacks SPIRABIInfo class which should specify default calling conventions for “system” function calls. SPIR supports only SPIR_FUNC and SPIR_KERNEL calling convention. Reviewers: Erich Keane, Anastasia Differential Revision: https://reviews.llvm.org/D92721	2020-12-12 05:48:20 -08:00
Arthur Eubanks	ff7e1da68f	[NPM] Support -fmerge-functions I tried to put it in the same place in the pipeline as the legacy PM. Fixes PR48399. Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D93002	2020-12-10 11:45:08 -08:00
Florian Hahn	9c4cddb53a	[Clang] Add vcmla and rotated variants for Arm ACLE. This patch adds vcmla and the rotated variants as defined in "Arm Neon Intrinsics Reference for ACLE Q3 2020" [1] The _lane_ are still missing, but they can be added separately. This patch only adds the builtin mapping for AArch64. [1] https://developer.arm.com/documentation/ihi0073/latest Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D92930	2020-12-10 16:54:08 +00:00
Fangrui Song	f9c0d1b056	[Driver] Add -f[no-]legacy-pass-manager to supersede -f[no-]experimental-new-pass-manager The new PM is considered stable and many downstream groups have adopted it (some have adopted it for more than two years). Add -f[no-]legacy-pass-manager to reflect the fact that it is no longer experimental and the legacy pass manager is something we strive to retire. In the future, when the legacy PM eventually goes away, -fno-experimental-new-pass-manager and -flegacy-pass-manager will be removed. This patch also changes -f[no-]legacy-pass-manager to pass `-plugin-opt={new,legacy}-pass-manager` to the linker (supported by both ld.lld and LLVMgold.so) when -flto/-flto=thin is specified Reviewed By: aeubanks, rsmith Differential Revision: https://reviews.llvm.org/D92915	2020-12-09 16:57:36 -08:00
Reid Kleckner	df282215d4	Don't setup inalloca for swiftcc on i686-windows-msvc Swiftcall does it's own target-independent argument type classification, since it is not designed to be ABI compatible with anything local on the target that isn't LLVM-based. This means it never uses inalloca. However, we have duplicate logic for checking for inalloca parameters that runs before call argument setup. This logic needs to know ahead of time if inalloca will be used later, and we can't move the CGFunctionInfo calculation earlier. This change gets the calling convention from either the FunctionProtoType or ObjCMethodDecl, checks if it is swift, and if so skips the stackbase setup. Depends on D92883. Differential Revision: https://reviews.llvm.org/D92944	2020-12-09 11:08:48 -08:00
Reid Kleckner	d7098ff29c	De-templatify EmitCallArgs argument type checking, NFCI This template exists to abstract over FunctionPrototype and ObjCMethodDecl, which have similar APIs for storing parameter types. In place of a template, use a PointerUnion with two cases to handle this. Hopefully this improves readability, since the type of the prototype is easier to discover. This allows me to sink this code, which is mostly assertions, out of the header file and into the cpp file. I can also simplify the overloaded methods for computing isGenericMethod, and get rid of the second EmitCallArgs overload. Differential Revision: https://reviews.llvm.org/D92883	2020-12-09 11:08:00 -08:00
Yuanfang Chen	1821265db6	[Time-report] Add a flag -ftime-report={per-pass,per-pass-run} to control the pass timing aggregation Currently, -ftime-report + new pass manager emits one line of report for each pass run. This potentially causes huge output text especially with regular LTO or large single file (Obeserved in private tests and was reported in D51276). The behaviour of -ftime-report + legacy pass manager is emitting one line of report for each pass object which has relatively reasonable text output size. This patch adds a flag `-ftime-report=` to control time report aggregation for new pass manager. The flag is for new pass manager only. Using it with legacy pass manager gives an error. It is a driver and cc1 flag. `per-pass` is the new default so `-ftime-report` is aliased to `-ftime-report=per-pass`. Before this patch, functionality-wise `-ftime-report` is aliased to `-ftime-report=per-pass-run`. * Adds an boolean variable TimePassesHandler::PerRun to control per-pass vs per-pass-run. * Adds a new clang CodeGen flag CodeGenOptions::TimePassesPerRun to work with the existing CodeGenOptions::TimePasses. * Remove FrontendOptions::ShowTimers, its uses are replaced by the existing CodeGenOptions::TimePasses. * Remove FrontendTimesIsEnabled (It was introduced in D45619 which was largely reverted.) Differential Revision: https://reviews.llvm.org/D92436	2020-12-08 10:13:19 -08:00
Kevin P. Neal	acd4950d4f	[FPEnv] Correct constrained metadata in fp16-ops-strict.c This test shows we're in some cases not getting strictfp information from the AST. Correct that. Differential Revision: https://reviews.llvm.org/D92596	2020-12-08 10:18:32 -05:00
Tim Northover	c5978f42ec	UBSAN: emit distinctive traps Sometimes people get minimal crash reports after a UBSAN incident. This change tags each trap with an integer representing the kind of failure encountered, which can aid in tracking down the root cause of the problem.	2020-12-08 10:28:26 +00:00
Luís Marques	3af354e863	[Clang][CodeGen][RISCV] Fix hard float ABI for struct with empty struct and complex Fixes bug 44904. Differential Revision: https://reviews.llvm.org/D91278	2020-12-08 09:19:05 +00:00
Luís Marques	fa8f5bfa4e	[Clang][CodeGen][RISCV] Fix hard float ABI test cases with empty struct The code seemed not to account for the field 1 offset. Differential Revision: https://reviews.llvm.org/D91270	2020-12-08 09:19:05 +00:00
Vitaly Buka	6e614b0c7e	[NFC][MSan] Round up OffsetPtr in PoisonMembers getFieldOffset(layoutStartOffset) is expected to point to the first trivial field or the one which follows non-trivial. So it must be byte aligned already. However this is not obvious without assumptions about callers. This patch will avoid the need in such assumptions. Depends on D92727. Differential Revision: https://reviews.llvm.org/D92728	2020-12-07 19:57:49 -08:00
Vitaly Buka	3e1cb0db8a	[CodeGen][MSan] Don't use offsets of zero-sized fields Such fields will likely have offset zero making __sanitizer_dtor_callback poisoning wrong regions. E.g. it can poison base class member from derived class constructor. Differential Revision: https://reviews.llvm.org/D92727	2020-12-07 13:37:40 -08:00
Jinsong Ji	b49b8f096c	[PowerPC][Clang] Remove QPX support Clean up QPX code in clang missed in https://reviews.llvm.org/D83915 Reviewed By: #powerpc, steven.zhang Differential Revision: https://reviews.llvm.org/D92329	2020-12-07 10:15:39 -05:00
Vitaly Buka	1f21f6d6a4	[NFC][CodeGen] Simplify SanitizeDtorMembers::Emit	2020-12-05 21:11:27 -08:00
Fangrui Song	1ab9327d1c	[TargetMachine][CodeGenModule] Delete unneeded ppc32 special case from shouldAssumeDSOLocal PPCMCInstLower does not actually call shouldAssumeDSOLocal for ppc32 so this is dead. Actually Clang ppc32 does produce a pair of absolute relocations which match GCC. This also fixes a comment (R_PPC_COPY and R_PPC64_COPY do exist).	2020-12-05 00:42:07 -08:00
Nico Weber	0cbf61be8b	[mac/arm] Fix rtti codegen tests when running on an arm mac shouldRTTIBeUnique() returns false for iOS64CXXABI, which causes RTTI objects to be emitted hidden. Update two tests that didn't expect this to happen for the default triple. Also rename iOS64CXXABI to AppleARM64CXXABI, since it's used for arm64-apple-macos triples too. Part of PR46644. Differential Revision: https://reviews.llvm.org/D91904	2020-12-03 09:11:03 -05:00
Yaxun (Sam) Liu	3a781b912f	Fix assertion in tryEmitAsConstant due to `cd95338ee3` Need to check if result is LValue before getLValueBase.	2020-12-02 19:10:01 -05:00
Hongtao Yu	24d4291ca7	[CSSPGO] Pseudo probes for function calls. An indirect call site needs to be probed for its potential call targets. With CSSPGO a direct call also needs a probe so that a calling context can be represented by a stack of callsite probes. Unlike pseudo probes for basic blocks that are in form of standalone intrinsic call instructions, pseudo probes for callsites have to be attached to the call instruction, thus a separate instruction would not work. One possible way of attaching a probe to a call instruction is to use a special metadata that carries information about the probe. The special metadata will have to make its way through the optimization pipeline down to object emission. This requires additional efforts to maintain the metadata in various places. Given that the `!dbg` metadata is a first-class metadata and has all essential support in place , leveraging the `!dbg` metadata as a channel to encode pseudo probe information is probably the easiest solution. With the requirement of not inflating `!dbg` metadata that is allocated for almost every instruction, we found that the 32-bit DWARF discriminator field which mainly serves AutoFDO can be reused for pseudo probes. DWARF discriminators distinguish identical source locations between instructions and with pseudo probes such support is not required. In this change we are using the discriminator field to encode the ID and type of a callsite probe and the encoded value will be unpacked and consumed right before object emission. When a callsite is inlined, the callsite discriminator field will go with the inlined instructions. The `!dbg` metadata of an inlined instruction is in form of a scope stack. The top of the stack is the instruction's original `!dbg` metadata and the bottom of the stack is for the original callsite of the top-level inliner. Except for the top of the stack, all other elements of the stack actually refer to the nested inlined callsites whose discriminator field (which actually represents a calliste probe) can be used together to represent the inline context of an inlined PseudoProbeInst or CallInst. To avoid collision with the baseline AutoFDO in various places that handles dwarf discriminators where a check against the `-pseudo-probe-for-profiling` switch is not available, a special encoding scheme is used to tell apart a pseudo probe discriminator from a regular discriminator. For the regular discriminator, if all lowest 3 bits are non-zero, it means the discriminator is basically empty and all higher 29 bits can be reversed for pseudo probe use. Callsite pseudo probes are inserted in `SampleProfileProbePass` and a target-independent MIR pass `PseudoProbeInserter` is added to unpack the probe ID/type from `!dbg`. Note that with this work the switch -debug-info-for-profiling will not work with -pseudo-probe-for-profiling anymore. They cannot be used at the same time. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D91756	2020-12-02 13:45:20 -08:00
jasonliu	a65d8c5d72	[XCOFF][AIX] Generate LSDA data and compact unwind section on AIX Summary: AIX uses the existing EH infrastructure in clang and llvm. The major differences would be 1. AIX do not have CFI instructions. 2. AIX uses a new personality routine, named __xlcxx_personality_v1. It doesn't use the GCC personality rountine, because the interoperability is not there yet on AIX. 3. AIX do not use eh_frame sections. Instead, it would use a eh_info section (compat unwind section) to store the information about personality routine and LSDA data address. Reviewed By: daltenty, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D91455	2020-12-02 18:42:44 +00:00
Yaxun (Sam) Liu	cd95338ee3	[CUDA][HIP] Fix capturing reference to host variable In C++ when a reference variable is captured by copy, the lambda is supposed to make a copy of the referenced variable in the captures and refer to the copy in the lambda. Therefore, it is valid to capture a reference to a host global variable in a device lambda since the device lambda will refer to the copy of the host global variable instead of access the host global variable directly. However, clang tries to avoid capturing of reference to a host global variable if it determines the use of the reference variable in the lambda function is not odr-use. Clang also tries to emit load of the reference to a global variable as load of the global variable if it determines that the reference variable is a compile-time constant. For a device lambda to capture a reference variable to host global variable and use the captured value, clang needs to be taught that in such cases the use of the reference variable is odr-use and the reference variable is not compile-time constant. This patch fixes that. Differential Revision: https://reviews.llvm.org/D91088	2020-12-02 10:14:46 -05:00
Alex Zinenko	240dd92432	[OpenMPIRBuilder] forward arguments as pointers to outlined function OpenMPIRBuilder::createParallel outlines the body region of the parallel construct into a new function that accepts any value previously defined outside the region as a function argument. This function is called back by OpenMP runtime function __kmpc_fork_call, which expects trailing arguments to be pointers. If the region uses a value that is not of a pointer type, e.g. a struct, the produced code would be invalid. In such cases, make createParallel emit IR that stores the value on stack and pass the pointer to the outlined function instead. The outlined function then loads the value back and uses as normal. Reviewed By: jdoerfert, llitchev Differential Revision: https://reviews.llvm.org/D92189	2020-12-02 14:59:41 +01:00
Qiu Chaofan	3fca6a7844	[Clang] Don't adjust align for IBM extended double Commit `6b1341eb` fixed alignment for 128-bit FP types on PowerPC. However, the quadword alignment adjustment shouldn't be applied to IBM extended double (ppc_fp128 in IR) values. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92278	2020-12-02 17:02:26 +08:00
David Chisnall	d1ed67037d	[GNU ObjC] Fix a regression listing methods twice. Methods synthesized from declared properties were being added to the method lists twice. This came from the change to list them in the class's method list, which missed removing the place in CGObjCGNU that added them again. Reviewed By: lanza Differential Revision: https://reviews.llvm.org/D91874	2020-12-01 09:50:18 +00:00
Leonard Chan	cf8ff75bad	[clang][RelativeVTablesABI] Use dso_local_equivalent rather than emitting stubs Thanks to D77248, we can bypass the use of stubs altogether and use PLT relocations if they are available for the target. LLVM and LLD support the R_AARCH64_PLT32 relocation, so we can also guarantee a static PLT relocation on AArch64. Not emitting these stubs saves a lot of extra binary size. Differential Revision: https://reviews.llvm.org/D83812	2020-11-30 16:02:35 -08:00
Fangrui Song	164410324d	[CodeGen] -fno-delete-null-pointer-checks: change dereferenceable to dereferenceable_or_null After D17993, with -fno-delete-null-pointer-checks we add the dereferenceable attribute to the `this` pointer. We have observed that one internal target which worked before fails even with -fno-delete-null-pointer-checks. Switching to dereferenceable_or_null fixes the problem. dereferenceable currently does not always respect NullPointerIsValid and may imply nonnull and lead to aggressive optimization. The optimization may be related to `CallBase::isReturnNonNull`, `Argument::hasNonNullAttr`, or `Value::getPointerDereferenceableBytes`. See D66664 and D66618 for some discussions. Reviewed By: bkramer, rsmith Differential Revision: https://reviews.llvm.org/D92297	2020-11-30 12:44:35 -08:00
Hongtao Yu	c083fededf	[CSSPGO] A Clang switch -fpseudo-probe-for-profiling for pseudo-probe instrumentation. This change introduces a new clang switch `-fpseudo-probe-for-profiling` to enable AutoFDO with pseudo instrumentation. Please refer to https://reviews.llvm.org/D86193 for the whole story. One implication from pseudo-probe instrumentation is that the profile is now sensitive to CFG changes. We perform the pseudo instrumentation very early in the pre-LTO pipeline, before any CFG transformation. This ensures that the CFG instrumented and annotated is stable and optimization-resilient. The early instrumentation also allows the inliner to duplicate probes for inlined instances. When a probe along with the other instructions of a callee function are inlined into its caller function, the GUID of the callee function goes with the probe. This allows samples collected on inlined probes to be reported for the original callee function. Reviewed By: wmi Differential Revision: https://reviews.llvm.org/D86502	2020-11-30 10:16:54 -08:00
Kevin P. Neal	abfbc5579b	[FPEnv] clang should get from the AST the metadata for constrained FP builtins Currently clang is not correctly retrieving from the AST the metadata for constrained FP builtins. This patch fixes that for the non-target specific builtins. Differential Revision: https://reviews.llvm.org/D92122	2020-11-30 11:59:37 -05:00
Zarko Todorovski	ff8e8c1b14	[AIX] Enabling vector type arguments and return for AIX This patch enables vector type arguments on AIX. All non-aggregate Altivec vector types are 16bytes in size and are 16byte aligned. Reviewed By: Xiangling_L Differential Revision: https://reviews.llvm.org/D92117	2020-11-27 09:55:52 -05:00
Reid Kleckner	1e843a987d	[MS] Add more 128bit cmpxchg intrinsics for AArch64 The MSVC STL for requires this on ARM64. Requested in https://llvm.org/pr47099 Depends on D92061 Differential Revision: https://reviews.llvm.org/D92062	2020-11-25 12:07:28 -08:00
Reid Kleckner	3bd0672726	[MS] Fix double evaluation of MSVC builtin arguments This code got quite twisted because we consider some MSVC builtins to be target agnostic, and some to be target specific. Target specific intrinsics have a pattern of doing up-front argument evaluation, while general intrinsics do not evaluate their arguments up front. As we tried to share codepaths between the target-specific and target-agnostic handling, we ended up doing double evaluation. Instead, have each target handle MSVC intrinsics consistently before up front argument evaluation. This requires passing less data around and is more consistent with target independent intrinsic handling. See D50979 for past examples of this bug. I noticed this while looking into adding some more intrinsics. Differential Revision: https://reviews.llvm.org/D92061	2020-11-25 11:55:01 -08:00
Simon Pilgrim	9d996c01aa	TargetInfo.cpp - use castAs<> instead of getAs<> as we dereference the pointer directly. NFCI. castAs<> will assert the correct cast type instead of just returning null, which we then try to dereference immediately.	2020-11-25 11:38:29 +00:00
Simon Pilgrim	eb7ea5aa1a	CGCall.cpp - use castAs<> instead of getAs<> as we dereference the pointer directly. NFCI. castAs<> will assert the correct cast type instead of just returning null, which we then try to dereference immediately in the setUsedBits call.	2020-11-25 11:38:29 +00:00
Zarko Todorovski	c92f29b05e	[AIX] Add mabi=vec-extabi options to enable the AIX extended and default vector ABIs. Added support for the options mabi=vec-extabi and mabi=vec-default which are analogous to qvecnvol and qnovecnvol when using XL on AIX. The extended Altivec ABI on AIX is enabled using mabi=vec-extabi in clang and vec-extabi in llc. Reviewed By: Xiangling_L, DiggerLin Differential Revision: https://reviews.llvm.org/D89684	2020-11-24 18:17:53 -05:00
Teresa Johnson	0768b0576a	Avoid redundant work when computing vtable vcall visibility Add a Visited set to avoid repeatedly processing the same base classes in complex class hierarchies. This cut down the compile time of one source file from >12min to ~1min. Differential Revision: https://reviews.llvm.org/D91676	2020-11-24 12:06:24 -08:00
Yaxun (Sam) Liu	cb08558caa	[HIP] Fix regressions due to fp contract change Recently HIP toolchain made a change to use clang instead of opt/llc to do compilation (https://reviews.llvm.org/D81861). The intention is to make HIP toolchain canonical like other toolchains. However, this change introduced an unintentional change regarding backend fp fuse option, which caused regressions in some HIP applications. Basically before the change, HIP toolchain used clang to generate bitcode, then use opt/llc to optimize bitcode and generate ISA. As such, the amdgpu backend takes the default fp fuse mode which is 'Standard'. This mode respect contract flag of fmul/fadd instructions and do not fuse fmul/fadd instructions without contract flag. However, after the change, HIP toolchain now use clang to generate IR, do optimization, and generate ISA as one process. Now amdgpu backend fp fuse option is determined by -ffp-contract option, which is 'fast' by default. And this -ffp-contract=fast language option is translated to 'Fast' fp fuse option in backend. Suddenly backend starts to fuse fmul/fadd instructions without contract flag. This causes wrong result for some device library functions, e.g. tan(-1e20), which should return 0.8446, now returns -0.933. What is worse is that since backend with 'Fast' fp fuse option does not respect contract flag, there is no way to use #pragma clang fp contract directive to enforce fp contract requirements. This patch fixes the regression by introducing a new value 'fast-honor-pragmas' for -ffp-contract and use it for HIP by default. 'fast-honor-pragmas' is equivalent to 'fast' in frontend but let the backend to use 'Standard' fp fuse option. 'fast-honor-pragmas' is useful since 'Fast' fp fuse option in backend does not honor contract flag, it is of little use to HIP applications since all code with #pragma STDC FP_CONTRACT or any IR from a source compiled with -ffp-contract=on is broken. Differential Revision: https://reviews.llvm.org/D90174	2020-11-24 08:10:06 -05:00
Ben Dunbobbin	e42021d5cc	[Clang][-fvisibility-from-dllstorageclass] Set DSO Locality from final visibility Ensure that the DSO Locality of the globals in the IR is derived from their final visibility when using -fvisibility-from-dllstorageclass. To accomplish this we reset the DSO locality of globals (before setting their visibility from their dllstorageclass) at the end of IRGen in Clang. This removes any effects that visibility options or annotations may have had on the DSO locality. The resulting DSO locality of the globals will be pessimistic w.r.t. to the normal compiler IRGen. Differential Revision: https://reviews.llvm.org/D91779	2020-11-24 00:32:14 +00:00
Zequan Wu	15a3ae1ab1	[Clang] Add __STDCPP_THREADS__ to standard predefine macros According to https://eel.is/c++draft/cpp.predefined#2.6, `__STDCPP_THREADS__` is a predefined macro. Differential Revision: https://reviews.llvm.org/D91747	2020-11-22 16:05:53 -08:00
Alexey Bataev	c964f30814	[OPENMP]Use the real pointer value as base, not indexed value. After fix for PR48174 the base pointer for pointer-based array-sections/array-subscripts will be emitted as `&ptr[idx]`, but actually it should be just `ptr`, i.e. the address stored in the ponter to point correctly to the beginning of the array. Currently it may lead to a crash in the runtime. Differential Revision: https://reviews.llvm.org/D91805	2020-11-20 11:34:14 -08:00
Alex Richardson	51e09e1d5a	[AMDGPU] Set the default globals address space to 1 This will ensure that passes that add new global variables will create them in address space 1 once the passes have been updated to no longer default to the implicit address space zero. This also changes AutoUpgrade.cpp to add -G1 to the DataLayout if it wasn't already to present to ensure bitcode backwards compatibility. Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D84345	2020-11-20 15:46:53 +00:00
Arthur Eubanks	72badbcdcc	[NPM] Move more O0 pass building into PassBuilder This moves handling of alwaysinline, coroutines, matrix lowering, PGO, and LTO-required passes into PassBuilder. Much of this is replicated between Clang and opt. Other out-of-tree users also replicate some of this, such as Rust [1] replicating the alwaysinline, LTO, and PGO passes. The LTO passes are also now run in build(Thin)LTOPreLinkDefaultPipeline() since they are semantically required for (Thin)LTO. [1]: `f5230fbf76/compiler/rustc_llvm/llvm-wrapper/PassWrapper.cpp (L896)` Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D91585	2020-11-19 11:22:23 -08:00
Joseph Huber	da8bec47ab	[OpenMP] Add Location Fields to Libomptarget Runtime for Debugging Summary: Add support for passing source locations to libomptarget runtime functions using the ident_t struct present in the rest of the libomp API. This will allow the runtime system to give much more insightful error messages and debugging values. Reviewers: jdoerfert grokos Differential Revision: https://reviews.llvm.org/D87946	2020-11-19 12:01:53 -05:00
Xiangling Liao	17497ec514	[AIX][FE] Support constructor/destructor attribute Support attribute((constructor)) and attribute((destructor)) on AIX Differential Revision: https://reviews.llvm.org/D90892	2020-11-19 09:24:01 -05:00
Qiu Chaofan	6b1341eb5b	[PowerPC] [Clang] Fix alignment of 128-bit float types According to ELF v2 ABI, both IEEE 128-bit and IBM extended floating point variables should be quad-word (16 bytes) aligned. Previously, only vector types are considered aligned as quad-word on PowerPC. This patch will fix incorrectness of IEEE 128-bit float argument in va_arg cases. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D91596	2020-11-19 14:22:14 +08:00
Joseph Huber	97e55cfef5	[OpenMP] Add Passing in Original Declaration Names To Mapper API Summary: This patch adds support for passing in the original delcaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information in passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;" Reviewers: jdoerfert Differential Revision: https://reviews.llvm.org/D89802	2020-11-18 15:28:39 -05:00
Alexey Bataev	5ba324ccad	[OPENMP]Fix PR48174: compile-time crash with target enter data on a global struct. The compiler should treat array subscript with base pointer as a first pointer in complex data, it is used only for member expression with base pointer. Differential Revision: https://reviews.llvm.org/D91660	2020-11-18 07:48:58 -08:00
Florian Hahn	680931af27	[Matrix] Adjust matrix pointer type for inline asm arguments. Matrix types in memory are represented as arrays, but accessed through vector pointers, with the alignment specified on the access operation. For inline assembly, update pointer arguments to use vector pointers. Otherwise there will be a mis-match if the matrix is also an input-argument which is represented as vector. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D91631	2020-11-18 11:44:11 +00:00
Nick Desaulniers	f4c6080ab8	Revert "[IR] add fn attr for no_stack_protector; prevent inlining on mismatch" This reverts commit `b7926ce6d7`. Going with a simpler approach.	2020-11-17 17:27:14 -08:00
Alexey Bataev	5292187a2d	[OPENMP]Fix PR48076: mapping of data member pointer. If the data member pointer is mapped, the compiler tries to optimize the mapping of such data by discarding explicit mapping flags and trying to emit combined data instead. In some cases, this optimization is not quite correctly implemented and it leads to a program crash at the runtime. Instead, if the data member is mapped, just emit it as is and do not emit combined mapping flags for it. Differential Revision: https://reviews.llvm.org/D91552	2020-11-17 07:18:32 -08:00
Yaxun (Sam) Liu	3f4b5893ef	[AMDGPU] Add option -munsafe-fp-atomics Add an option -munsafe-fp-atomics for AMDGPU target. When enabled, clang adds function attribute "amdgpu-unsafe-fp-atomics" to any functions for amdgpu target. This allows amdgpu backend to use unsafe fp atomic instructions in these functions. Differential Revision: https://reviews.llvm.org/D91546	2020-11-16 21:52:12 -05:00
CJ Johnson	69cd776e1e	[CodeGen] Apply 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments. * Adds 'nonnull' and 'dereferenceable(N)' to 'this' pointer arguments * Gates 'nonnull' on -f(no-)delete-null-pointer-checks * Introduces this-nonnull.cpp and microsoft-abi-this-nullable.cpp tests to explicitly test the behavior of this change * Refactors hundreds of over-constrained clang tests to permit these attributes, where needed * Updates Clang12 patch notes mentioning this change Reviewed-by: rsmith, jdoerfert Differential Revision: https://reviews.llvm.org/D17993	2020-11-16 17:39:17 -08:00
Florian Hahn	ca2e7e5999	[IRGen] Add !annotation metadata for auto-init stores. This patch updates Clang's IRGen to add !annotation nodes with an "auto-init" annotation to all stores for auto-initialization. As discussed in 'RFC: Combining Annotation Metadata and Remarks' (http://lists.llvm.org/pipermail/llvm-dev/2020-November/146393.html) this allows using optimization remarks to track down where auto-init code was inserted (and not removed by optimizations). There are a few cases in the tests where !annotation gets dropped by optimizations. Those optimizations will be updated in subsequent patches. This patch is based on a patch by Francis Visoiu Mistrih. Reviewed By: thegameg, paquette Differential Revision: https://reviews.llvm.org/D91417	2020-11-16 10:37:02 +00:00
Richard Smith	dc58cd1480	PR48169: Fix crash generating debug info for class non-type template parameters. It appears that LLVM isn't able to generate a DW_AT_const_value for a constant of class type, but if it could, we'd match GCC's debug info in this case, and in the interim we no longer crash.	2020-11-15 17:43:26 -08:00
Roman Lebedev	6861d938e5	Revert "clang-misexpect: Profile Guided Validation of Performance Annotations in LLVM" See discussion in https://bugs.llvm.org/show_bug.cgi?id=45073 / https://reviews.llvm.org/D66324#2334485 the implementation is known-broken for certain inputs, the bugreport was up for a significant amount of timer, and there has been no activity to address it. Therefore, just completely rip out all of misexpect handling. I suspect, fixing it requires redesigning the internals of MD_misexpect. Should anyone commit to fixing the implementation problem, starting from clean slate may be better anyways. This reverts commit `7bdad08429`, and some of it's follow-ups, that don't stand on their own.	2020-11-14 13:12:38 +03:00
Mehdi Amini	42e88bd6b1	Replace sequences of v.push_back(v[i]); v.erase(&v[i]); with std::rotate (NFC) The code has a few sequence that looked like: Ops.push_back(Ops[0]); Ops.erase(Ops.begin()); And are equivalent to: std::rotate(Ops.begin(), Ops.begin() + 1, Ops.end()); The latter has the advantage of never reallocating the vector, which would be a bug in the original code as push_back would read from the memory it deallocated.	2020-11-14 00:55:33 +00:00
Arthur Eubanks	6e098189db	[DFSan][NewPM] Handle dfsan under NPM Make it required. Since it's a module pass, optnone won't test it, so extend the clang test to also use opt-bisect now that it's supported. 14/16 check-dfsan tests failed with NPM enabled, now all pass. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D91385	2020-11-13 13:41:38 -08:00
Heejin Ahn	902ea588ea	[WebAssembly] Rename atomic.notify and *.atomic.wait - atomic.notify -> memory.atomic.notify - i32.atomic.wait -> memory.atomic.wait32 - i64.atomic.wait -> memory.atomic.wait64 See https://github.com/WebAssembly/threads/pull/149. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D91447	2020-11-13 12:04:48 -08:00
Baptiste Saleil	3f78605a8c	[PowerPC] Add paired vector load and store builtins and intrinsics This patch adds the Clang builtins and LLVM intrinsics to load and store vector pairs. Differential Revision: https://reviews.llvm.org/D90799	2020-11-13 12:35:10 -06:00
Michael Liao	8920ef06a1	[hip] Remove the coercion on aggregate kernel arguments. - If an aggregate argument is indirectly accessed within kernels, direct passing results in unpromotable `alloca`, which degrade performance significantly. InferAddrSpace pass is enhanced in [D91121](https://reviews.llvm.org/D91121) to take the assumption that generic pointers loaded from the constant memory could be regarded global ones. The need for the coercion on aggregate arguments is mitigated. Differential Revision: https://reviews.llvm.org/D89980	2020-11-12 21:19:30 -05:00
Arthur Eubanks	3a7b57b7ca	[NFC][NewPM] Reuse PassBuilder callbacks with -O0 This removes lots of duplicated code which was necessary before https://reviews.llvm.org/D89158. Now we can use PassBuilder::runRegisteredEPCallbacks(). This is mostly sanitizers. There is likely more that can be done to simplify, but let's start with this. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90870	2020-11-12 12:42:59 -08:00
Alexey Bataev	3c6b457bee	[OPENMP]Fix PR48076: Check map types array before accessing its front. Need to check if there are map types for the components before trying to access them when trying to modify type mappings for combined partial mappings. Differential Revision: https://reviews.llvm.org/D91370	2020-11-12 12:00:29 -08:00
Hans Wennborg	a088766508	[dllexport] Instantiate default ctor default args for explicit specializations (PR45811) For dllexported default constructors with default arguments, we export default constructor closures which pass in the default args. (See D8331 for a good explanation.) For templates, that means those default args must be instantiated even if the function isn't called. That is done by the InstantiateDefaultCtorDefaultArgs() function, but it wasn't done for explicit specializations, causing asserts (see bug). Differential revision: https://reviews.llvm.org/D91089	2020-11-12 13:29:34 +01:00
Arthur Eubanks	b6ccff3d5f	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Since callbacks may end up not adding passes, we need to check if the pass managers are empty before adding them, so PassManager now has an isEmpty() function. For example, polly adds callbacks but doesn't always add passes in those callbacks, so this is necessary to keep -debug-pass-manager tests' output from changing depending on if polly is enabled or not. Tests are a continuation of those added in https://reviews.llvm.org/D89083. Reviewed By: asbirlea, Meinersbur Differential Revision: https://reviews.llvm.org/D89158	2020-11-11 15:10:27 -08:00
Richard Smith	5f12f4ff90	Suppress printing of inline namespace names in diagnostics by default, except where they are necessary to disambiguate the target. This substantially improves diagnostics from the standard library, which are otherwise full of `::__1::` noise.	2020-11-11 15:05:51 -08:00
Akira Hatanaka	874b0a0b9d	[CodeGen] Mark calls to objc_autorelease as tail This enables a method sending an autorelease message to an object and returning the object in MRR to avoid adding the object to an autorelease pool if a call to objc_retainAutoreleasedReturnValue in the caller function accepts the hand off of the retain count. rdar://problem/50678052 Differential Revision: https://reviews.llvm.org/D91111	2020-11-10 13:48:25 -08:00
Richard Smith	b637148ecb	[c++20] For P0732R2 / P1907R1: Basic code generation and name mangling support for non-type template parameters of class type and template parameter objects. The Itanium side of this follows the approach I proposed in https://github.com/itanium-cxx-abi/cxx-abi/issues/47 on 2020-09-06. The MSVC side of this was determined empirically by observing MSVC's output. Differential Revision: https://reviews.llvm.org/D89998	2020-11-09 22:10:27 -08:00
Michael Kruse	e5dba2d7e5	[OMPIRBuilder] Start 'Create' methods with lower case. NFC. For consistency with the IRBuilder, OpenMPIRBuilder has method names starting with 'Create'. However, the LLVM coding style has methods names starting with lower case letters, as all other OpenMPIRBuilder already methods do. The clang-tidy configuration used by Phabricator also warns about the naming violation, adding noise to the reviews. This patch renames all `OpenMPIRBuilder::CreateXYZ` methods to `OpenMPIRBuilder::createXYZ`, and updates all in-tree callers. I tested check-llvm, check-clang, check-mlir and check-flang to ensure that I did not miss a caller. Reviewed By: mehdi_amini, fghanim Differential Revision: https://reviews.llvm.org/D91109	2020-11-09 19:35:11 -06:00
Fangrui Song	e625f9c5d1	-fbasic-block-sections=list=: Suppress output if failed to open the file Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D90815	2020-11-09 09:26:37 -08:00
Atmn Patel	fd3cad7a60	[clang] Fix ForStmt mustprogress handling D86841 had an error where for statements with no conditional were required to make progress. This is not true, this patch removes that line, and adds regression tests. Differential Revision: https://reviews.llvm.org/D91075	2020-11-09 11:38:06 -05:00
Tyker	d093401a26	[NFC] Remove string parameter of annotation attribute from AST childs. this simplifies using annotation attributes when using clang as library	2020-11-09 16:39:59 +01:00
Simon Pilgrim	8930032f53	Don't dereference a dyn_cast<> result - use cast<> instead. NFCI. We were relying on the dyn_cast<> succeeding - better use cast<> and have it assert that its the correct type than dereference a null result.	2020-11-08 13:06:07 +00:00
Arthur Eubanks	226e179f74	Revert "[NewPM] Provide method to run all pipeline callbacks, used for -O0" This reverts commit `ae38540042`. As well as some follow-up test fixes. The original change causes new-pass-manager.ll to fail when polly is enabled.	2020-11-08 00:32:35 -08:00
Fangrui Song	f2e479db92	[OpenMP] Fix -Wmisleading-indentation after D84192	2020-11-06 20:09:43 -08:00
cchen	0cab91140f	[OpenMP5.0] map item can be non-contiguous for target update In order not to modify the `tgt_target_data_update` information but still be able to pass the extra information for non-contiguous map item (offset, count, and stride for each dimension), this patch overload `arg` when the maptype is set as `OMP_MAP_DESCRIPTOR`. The origin `arg` is for passing the pointer information, however, the overloaded `arg` is an array of descriptor_dim: struct descriptor_dim { int64_t offset; int64_t count; int64_t stride }; and the array size is the same as dimension size. In addition, since we have count and stride information in descriptor_dim, we can replace/overload the `arg_size` parameter by using dimension size. For supporting `stride` in array section, we use a dummy dimension in descriptor to store the unit size. The formula for counting the stride in dimension D_n: `unit size * (D_0 * D_1 ... * D_n-1) * D_n.stride`. Demonstrate how it works: ``` double arr[3][4][5]; D0: { offset = 0, count = 1, stride = 8 } // offset, count, dimension size always be 0, 1, 1 for this extra dimension, stride is the unit size D1: { offset = 0, count = 2, stride = 8 * 1 * 2 = 16 } // stride = unit size * (product of dimension size of D0) * D1.stride = 4 * 1 * 2 = 8 D2: { offset = 2, count = 2, stride = 8 * (1 * 5) * 1 = 40 } // stride = unit size * (product of dimension size of D0, D1) * D2.stride = 4 * 5 * 1 = 20 D3: { offset = 0, count = 2, stride = 8 * (1 * 5 * 4) * 2 = 320 } // stride = unit size * (product of dimension size of D0, D1, D2) * D3.stride = 4 * 25 * 2 = 200 // X here means we need to offload this data, therefore, runtime will transfer // data from offset 80, 96, 120, 136, 400, 416, 440, 456 // Runtime patch: https://reviews.llvm.org/D82245 // OOOOO OOOOO OOOOO // OOOOO OOOOO OOOOO // XOXOO OOOOO XOXOO // XOXOO OOOOO XOXOO ``` Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D84192	2020-11-06 21:04:37 -06:00
Kevin P. Neal	2069403cdf	[FPEnv] Use strictfp metadata in casting nodes The strictfp metadata was added to the casting AST nodes in D85960, but we aren't using that metadata yet. This patch adds that support. In order to avoid lots of ad-hoc passing around of the strictfp bits I updated the IRBuilder when moving from a function that has the Expr* to a function that lacks it. I believe we should switch to this pattern to keep the strictfp support from being overly invasive. For the purpose of testing that we're picking up the right metadata, I also made my tests use a pragma to make the AST's strictfp metadata not match the global strictfp metadata. This exposes issues that we need to deal with in subsequent patches, and I believe this is the right method for most all of our clang strictfp tests. Differential Revision: https://reviews.llvm.org/D88913	2020-11-06 11:56:12 -05:00
Jan Ole Hüser	d2e7dca5ca	[CodeGen] Fix Bug 47499: __unaligned extension inconsistent behaviour with C and C++ For the language C++ the keyword __unaligned (a Microsoft extension) had no effect on pointers. The reason, why there was a difference between C and C++ for the keyword __unaligned: For C, the Method getAsCXXREcordDecl() returns nullptr. That guarantees that hasUnaligned() is called. If the language is C++, it is not guaranteed, that hasUnaligend() is called and evaluated. Here are some links: The Bug: https://bugs.llvm.org/show_bug.cgi?id=47499 Thread on the cfe-dev mailing list: http://lists.llvm.org/pipermail/cfe-dev/2020-September/066783.html Diff, that introduced the check hasUnaligned() in getNaturalTypeAlignment(): https://reviews.llvm.org/D30166 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D90630	2020-11-05 12:57:17 -08:00
Arthur Eubanks	ae38540042	[NewPM] Provide method to run all pipeline callbacks, used for -O0 Some targets may add required passes via TargetMachine::registerPassBuilderCallbacks(). We need to run those even under -O0. As an example, BPFTargetMachine adds BPFAbstractMemberAccessPass, a required pass. This also allows us to clean up BackendUtil.cpp (and out-of-tree Rust usage of the NPM) by allowing us to share added passes like coroutines and sanitizers between -O0 and other optimization levels. Tests are a continuation of those added in https://reviews.llvm.org/D89083. In order to prevent TargetMachines from adding unnecessary optimization passes at -O0, TargetMachine::registerPassBuilderCallbacks() will be changed to take an OptimizationLevel, but that will be done separately. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D89158	2020-11-04 22:27:16 -08:00
Atmn Patel	ac73b73c16	[clang] Add mustprogress and llvm.loop.mustprogress attribute deduction Since C++11, the C++ standard has a forward progress guarantee [intro.progress], so all such functions must have the `mustprogress` requirement. In addition, from C11 and onwards, loops without a non-zero constant conditional or no conditional are also required to make progress (C11 6.8.5p6). This patch implements these attribute deductions so they can be used by the optimization passes. Differential Revision: https://reviews.llvm.org/D86841	2020-11-04 22:03:14 -05:00
Arthur Eubanks	ab0ddbc38a	Reland [NewPM] Add OptimizationLevel param to registerPipelineStartEPCallback This allows targets to skip optional optimization passes at -O0. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D90777	2020-11-04 13:11:40 -08:00
cchen	d0d43b58b1	[OpenMP] target nested `use_device_ptr() if()` and is_device_ptr trigger asserts Clang now asserts for the below case: ``` void clang::CodeGen::CGOpenMPRuntime::createOffloadEntriesAndInfoMetadata(): Assertion `std::get<0>(E) && "All ordered entries must exist!"' failed. ``` The reason why Clang hit the assert is because in `emitTargetDataCalls`, both `BeginThenGen` and `BeginElseGen` call `registerTargetRegionEntryInfo` and try to register the Entry in OffloadEntriesTargetRegion with same key. If changing the expression in if clause to any constant expression, then the assert disappear. (https://godbolt.org/z/TW7haj) The assert itself is to avoid user from accessing elements out of bound inside `OrderedEntries` in `createOffloadEntriesAndInfoMetadata`. In this patch, I add a check in `registerTargetRegionEntryInfo` to avoid register the target region more than once. A test case that triggers assert: https://godbolt.org/z/4cnGW8 Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D90704	2020-11-04 12:36:57 -06:00
Qiu Chaofan	7faf62a80b	[Clang] Add more fp128 math library function builtins Since glibc has supported math library functions conforming IEEE 128-bit floating point types on some platform (like ppc64le), we can fix clang's math builtins missing this type. Reviewed By: bkramer Differential Revision: https://reviews.llvm.org/D90593	2020-11-04 17:58:42 +08:00
Baptiste Saleil	daa127d77e	[PowerPC] Add MMA builtin decoding and definitions Add MMA builtin decoding. These builtins use the new PowerPC-specific types __vector_pair and __vector_quad. So to avoid pervasive changes, we use custom type descriptors and custom decoding for these builtins. We also use custom code generation to expand builtin calls with pointers to simpler intrinsic calls with non-pointer types. Differential Revision: https://reviews.llvm.org/D81748	2020-11-03 15:08:46 -06:00
Ben Dunbobbin	7ad6010f58	Fix - [Clang] Add the ability to map DLL storage class to visibility `415f7ee883` had a silly typo introduced when I inlined some code into a loop from its own function. Original commit message: For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-03 19:13:54 +00:00
Tim Renouf	89d41f3a2b	[AMDGPU] Add gfx1033 target Differential Revision: https://reviews.llvm.org/D90447 Change-Id: If2650fc7f31bbdd49c76e74a9ca8e3734d769761	2020-11-03 16:27:48 +00:00
Tim Renouf	ee3e642627	[AMDGPU] Add gfx90c target This differentiates the Ryzen 4000/4300/4500/4700 series APUs that were previously included in gfx909. Differential Revision: https://reviews.llvm.org/D90419 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-11-03 16:27:43 +00:00
Yaxun (Sam) Liu	abd8cd9199	[CUDA][HIP] Fix linkage for -fgpu-rdc Currently for explicit template function instantiation in CUDA/HIP device compilation clang emits instantiated kernel with external linkage and instantiated device function with internal linkage. This is fine for -fno-gpu-rdc since there is only one TU. However this causes duplicate symbols for kernels for -fgpu-rdc if the same instantiation happen in multiple TU. Or missing symbols if a device function calls an explicitly instantiated template function in a different TU. To make explicit template function instantiation work for -fgpu-rdc we need to follow the C++ linkage paradigm, i.e. use weak_odr linkage. Differential Revision: https://reviews.llvm.org/D90311	2020-11-03 08:07:19 -05:00
Alex Lorenz	701456b523	[darwin] add support for __isPlatformVersionAtLeast check for if (@available) The __isPlatformVersionAtLeast routine is an implementation of `if (@available)` check that uses the _availability_version_check API on Darwin that's supported on macOS 10.15, iOS 13, tvOS 13 and watchOS 6. Differential Revision: https://reviews.llvm.org/D90367	2020-11-02 16:28:09 -08:00
Ben Dunbobbin	ae9231ca2a	Reland - [Clang] Add the ability to map DLL storage class to visibility `415f7ee883` had LIT test failures on any build where the clang executable was not called "clang". I have adjusted the LIT CHECKs to remove the binary name to fix this. Original commit message: For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-02 23:24:49 +00:00
Ben Dunbobbin	5024d3aa18	Revert "[Clang] Add the ability to map DLL storage class to visibility" This reverts commit `415f7ee883`. The added tests were failing on the build bots!	2020-11-02 17:33:54 +00:00
Ben Dunbobbin	415f7ee883	[Clang] Add the ability to map DLL storage class to visibility For PlayStation we offer source code compatibility with Microsoft's dllimport/export annotations; however, our file format is based on ELF. To support this we translate from DLL storage class to ELF visibility at the end of codegen in Clang. Other toolchains have used similar strategies (e.g. see the documentation for this ARM toolchain: https://developer.arm.com/documentation/dui0530/i/migrating-from-rvct-v3-1-to-rvct-v4-0/changes-to-symbol-visibility-between-rvct-v3-1-and-rvct-v4-0) This patch adds the ability to perform this translation. Options are provided to support customizing the mapping behaviour. Differential Revision: https://reviews.llvm.org/D89970	2020-11-02 17:08:23 +00:00
Teresa Johnson	0949f96dc6	[MemProf] Pass down memory profile name with optional path from clang Similar to -fprofile-generate=, add -fmemory-profile= which takes a directory path. This is passed down to LLVM via a new module flag metadata. LLVM in turn provides this name to the runtime via the new __memprof_profile_filename variable. Additionally, always pass a default filename (in $cwd if a directory name is not specified vi the = form of the option). This is also consistent with the behavior of the PGO instrumentation. Since the memory profiles will generally be fairly large, it doesn't make sense to dump them to stderr. Also, importantly, the memory profiles will eventually be dumped in a compact binary format, which is another reason why it does not make sense to send these to stderr by default. Change the existing memprof tests to specify log_path=stderr when that was being relied on. Depends on D89086. Differential Revision: https://reviews.llvm.org/D89087	2020-11-01 17:38:23 -08:00
Mark de Wever	b46fddf75f	[CodeGen] Implement [[likely]] and [[unlikely]] for while and for loop. The attribute has no effect on a do statement since the path of execution will always include its substatement. It adds a diagnostic when the attribute is used on an infinite while loop since the codegen omits the branch here. Since the likelihood attributes have no effect on a do statement no diagnostic will be issued for do [[unlikely]] {...} while(0); Differential Revision: https://reviews.llvm.org/D89899	2020-10-31 17:51:29 +01:00
Arthur Eubanks	5c31b8b94f	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `10f2a0d662`. More uint64_t overflows.	2020-10-31 00:25:32 -07:00
Thomas Lively	a787e09779	[WebAssembly] Prototype i64x2.bitmask As proposed in https://github.com/WebAssembly/simd/pull/368. Differential Revision: https://reviews.llvm.org/D90514	2020-10-30 17:23:30 -07:00
Thomas Lively	0a512a555a	[WebAssembly] Prototype i64x2.eq As proposed in https://github.com/WebAssembly/simd/pull/381. Since it is still in the prototyping phase, it is only accessible via a target builtin function and a target intrinsic. Depends on D90504. Differential Revision: https://reviews.llvm.org/D90508	2020-10-30 16:38:15 -07:00
Thomas Lively	1cb0b56607	[WebAssembly] Prototype i64x2.widen_{low,high}_i32x4_{s,u} As proposed in https://github.com/WebAssembly/simd/pull/290. As usual, these instructions are available only via builtin functions and intrinsics while they are in the prototyping stage. Differential Revision: https://reviews.llvm.org/D90504	2020-10-30 15:44:04 -07:00
Arthur Eubanks	2e31727a88	[NFC] Clean up PassBuilder Make DebugLogging a member variable so that users of PassBuilder don't need to pass it around so much. Move call to TargetMachine::registerPassBuilderCallbacks() within PassBuilder so users don't need to remember to call it. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D90437	2020-10-30 10:03:59 -07:00
Arthur Eubanks	10f2a0d662	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-30 10:03:46 -07:00
David Sherwood	cea69fa4dc	[SVE] Add fatal error for unnamed SVE variadic arguments We don't currently support passing unnamed variadic SVE arguments so I've added a fatal error if we hit such cases to prevent any silent ABI issues in future. Differential Revision: https://reviews.llvm.org/D90230	2020-10-30 13:35:47 +00:00
Liu, Chen3	00090a2b82	Support complex target features combinations This patch is mainly doing two things: 1. Adding support for parentheses, making the combination of target features more diverse; 2. Making the priority of ’,‘ is higher than that of '\|' by default. So I need to make some change with PTX Builtin function. Differential Revision: https://reviews.llvm.org/D89184	2020-10-30 10:32:53 +08:00
Thomas Lively	be6f50798e	[WebAssembly] Implement SIMD signselect instructions As proposed in https://github.com/WebAssembly/simd/pull/124, using the opcodes adopted by V8 in https://chromium-review.googlesource.com/c/v8/v8/+/2486235/2/src/wasm/wasm-opcodes.h. Uses new builtin functions and a new target intrinsic exclusively to ensure that the new instructions are only emitted when a user explicitly opts in to using them since they are still in the prototyping and evaluation phase. Differential Revision: https://reviews.llvm.org/D90357	2020-10-29 11:06:20 -07:00
Jon Chesterfield	dee7704829	[AMDGPU] Add __builtin_amdgcn_grid_size [AMDGPU] Add __builtin_amdgcn_grid_size Similar to D76772, loads the data from the dispatch pointer. Marked invariant. Patch also updates the openmp devicertl to use this builtin. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D90251	2020-10-29 16:25:13 +00:00
Amy Huang	7669f3c0f6	Recommit "[CodeView] Emit static data members as S_CONSTANTs." We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. This changes CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072 This reverts commit `504615353f`.	2020-10-28 16:35:59 -07:00
Shilei Tian	0661328d7e	[Clang][OpenMP] Added the support for target data nowait Previously we added support for target nowait, but target data nowait has not been supported yet. In this patch, target data nowait will also be wrapped into a task. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90099	2020-10-28 15:53:30 -04:00
Mircea Trofin	6fa35541a0	[NFC][ThinLTO] Change command line passing to EmbedBitcodeInModule Changing to pass by ref - less null checks to worry about. Differential Revision: https://reviews.llvm.org/D90330	2020-10-28 12:33:39 -07:00
Baptiste Saleil	40dd4d5233	[Clang][PowerPC] Add __vector_pair and __vector_quad types Define the __vector_pair and __vector_quad types that are used to manipulate the new accumulator registers introduced by MMA on PowerPC. Because these two types are specific to PowerPC, they are defined in a separate new file so it will be easier to add other PowerPC specific types if we need to in the future. Differential Revision: https://reviews.llvm.org/D81508	2020-10-28 13:19:20 -05:00
Thomas Lively	5b464f2aa5	[WebAssembly] Fix incorrectly named target builtin Rename __builtin_wasm_q15mulr_saturate_s_i8x16 to __builtin_wasm_q15mulr_saturate_s_i16x8, fixing the implied lane interpretation of the result.	2020-10-28 10:22:43 -07:00
Heejin Ahn	98941279b9	[WebAssembly] Clang-format builtins generation (NFC) Differential Revision: https://reviews.llvm.org/D90294	2020-10-28 10:01:21 -07:00
Thomas Lively	31e944556f	[WebAssembly] Prototype extending multiplication SIMD instructions As proposed in https://github.com/WebAssembly/simd/pull/376. This commit implements new builtin functions and intrinsics for these instructions, but does not yet add them to wasm_simd128.h because they have not yet been merged to the proposal. These are the first instructions with opcodes greater than 0xff, so this commit updates the MC layer and disassembler to handle that correctly. Differential Revision: https://reviews.llvm.org/D90253	2020-10-28 09:38:59 -07:00
JonChesterfield	5d02ca49a2	[libomptarget][nvptx] Undef, weak shared variables [libomptarget][nvptx] Undef, weak shared variables Shared variables on nvptx, and LDS on amdgcn, are uninitialized at the start of kernel execution. Therefore create the variables with undef instead of zeros, motivated in part by the amdgcn back end rejecting LDS+initializer. Common is zero initialized, which seems incompatible with shared. Thus change them to weak, following the direction of https://reviews.llvm.org/rG7b3eabdcd215 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90248	2020-10-28 14:25:36 +00:00
Benjamin Kramer	90a9f97cbd	[openmp] Use front() instead of *begin() to not hide bugs when CurTypes is empty.	2020-10-28 13:58:23 +01:00
Benjamin Kramer	207cf71fa9	Revert "[OpenMP] Add Passing in Original Declaration Names To Mapper API" This reverts commit `d981c7b758` and `a87d7b3d44`. Test fails under msan.	2020-10-28 13:58:14 +01:00
Joseph Huber	a87d7b3d44	[OpenMP] Add Passing in Original Declaration Names To Mapper API Summary: This patch adds support for passing in the original delcaration name in the source file to the libomptarget runtime. This will allow the runtime to provide more intelligent debugging messages. This patch takes the original expression parsed from the OpenMP map / update clause and provides a textual representation if it was explicitly mapped, otherwise it takes the name of the variable declaration as a fallback. The information in passed to the runtime in a global array of strings that matches the existing ident_t source location strings using ";name;filename;column;row;;". See clang/test/OpenMP/target_map_names.cpp for an example of the generated output for a given map clause. Reviewers: jdoervert Differential Revision: https://reviews.llvm.org/D89802	2020-10-27 16:09:19 -04:00
Amy Huang	504615353f	Revert "[CodeView] Emit static data members as S_CONSTANTs." Seems like there's an assert in here that we shouldn't be running into. This reverts commit `515973222e`.	2020-10-27 11:29:58 -07:00
Nico Weber	2a4e704c92	Revert "Use uint64_t for branch weights instead of uint32_t" This reverts commit `e5766f25c6`. Makes clang assert when building Chromium, see https://crbug.com/1142813 for a repro.	2020-10-27 09:26:21 -04:00
Shilei Tian	d38788b357	[Clang][OpenMP] Avoid unnecessary privatization of mapper array when there is no user defined mapper In current implementation, if it requires an outer task, the mapper array will be privatized no matter whether it has mapper. In fact, when there is no mapper, the mapper array only contains number of nullptr. In the libomptarget, the use of mapper array is `if (mappers_array && mappers_array[i])`, which means we can directly set mapper array to nullptr if there is no mapper. This can avoid unnecessary data copy. In this patch, the data privatization will not be emitted if the mapper array is nullptr. When it comes to the emit of task body, the nullptr will be used directly. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D90101	2020-10-27 00:02:32 -04:00
Arthur Eubanks	e5766f25c6	Use uint64_t for branch weights instead of uint32_t CallInst::updateProfWeight() creates branch_weights with i64 instead of i32. To be more consistent everywhere and remove lots of casts from uint64_t to uint32_t, use i64 for branch_weights. Reviewed By: davidxl Differential Revision: https://reviews.llvm.org/D88609	2020-10-26 20:24:04 -07:00
Shilei Tian	e20d64c3d9	[Clang][OpenMP] Fixed an issue of segment fault when using target nowait The implementation of target nowait just wraps the target region into a task. The essential four parameters (base ptr, ptr, size, mapper) are taken as firstprivate such that they will be copied to the private location. When there is no user-defined mapper, the mapper variable will be nullptr. However, it will be still copied to the corresponding place. Therefore, a memcpy will be generated and the source pointer will be nullptr, causing a segmentation fault. The root cause is when calling `emitOffloadingArraysArgument`, the last argument `Options` has a field about whether it requires a task. It only takes depend clause into account. In this patch, the nowait clause is also included. There're two things that will be done in another patches: 1. target data nowait has not been supported yet. D90099 added the support. 2. When there is no mapper, the mapper array can be nullptr no matter whether it requires outer task or not. It can avoid an unnecessary data copy. This is an optimization that is covered in D90101. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D89844	2020-10-26 22:33:22 -04:00
Amy Huang	515973222e	[CodeView] Emit static data members as S_CONSTANTs. We used to only emit static const data members in CodeView as S_CONSTANTS when they were used; this patch makes it so they are always emitted. I changed CodeViewDebug.cpp to find the static const members from the class debug info instead of creating DIGlobalVariables in the IR whenever a static const data member is used. Bug: https://bugs.llvm.org/show_bug.cgi?id=47580 Differential Revision: https://reviews.llvm.org/D89072	2020-10-26 15:30:35 -07:00
Duncan P. N. Exon Smith	d4c667c9af	Avoid unnecessary uses of `MDNode::getTemporary`, NFC This is a long-delayed follow-up to `5e5b85098d`. `TempMDNode` includes a bunch of machinery for RAUW, and should only be used when necessary. RAUW wasn't being used in any of these cases... it was just a placeholder for a self-reference. Where the real node was using `MDNode::getDistinct`, just replace the temporary argument with `nullptr`. Where the real node was using `MDNode::get`, the `replaceOperandWith` call was "promoting" the node to a distinct one implicitly due to self-reference detection in `MDNode::handleChangedOperand`. The `TempMDNode` was serving a purpose by delaying uniquing, but it's way simpler to just call `MDNode::getDistinct` in the first place. Note that using a self-reference at all in these places is a hold-over from before `distinct` metadata existed. It was an old trick to create distinct nodes. It would be intrusive to change, including bitcode upgrades, etc., and it's harmless so I'm not sure there's much value in removing it from existing schemas. After this commit it still has a tiny memory cost (in the extra metadata operand) but no more overhead in construction. Differential Revision: https://reviews.llvm.org/D90079	2020-10-26 17:03:25 -04:00
Zequan Wu	e56e7bd469	Revert "Revert "Ensure that checkInitIsICE is called exactly once for every variable"" This reverts commit `a2ac64dd90`.	2020-10-26 12:08:57 -07:00
Zequan Wu	a2ac64dd90	Revert "Ensure that checkInitIsICE is called exactly once for every variable" This causing `Assertion Result && "Could not evaluate expression"' failed` at https://bugs.chromium.org/p/chromium/issues/detail?id=1142009 This reverts commit `76c0092665`.	2020-10-26 11:59:55 -07:00
Nick Desaulniers	c8f84bd094	[Clang][CodeGen] fix failed assertion Ensure we can emit symbol aliases via function attribute even when function signatures contain incomplete types. Via bugreport: https://reviews.llvm.org/D66492#2350947 Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D90073	2020-10-26 11:37:55 -07:00
Tyker	d3205bbca3	[Annotation] Allows annotation to carry some additional constant arguments. This allows using annotation in a much more contexts than it currently has. especially when annotation with template or constexpr. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D88645	2020-10-26 10:50:05 +01:00
Melanie Blower	2e204e2391	[clang] Enable support for #pragma STDC FENV_ACCESS Reviewers: rjmccall, rsmith, sepavloff Differential Revision: https://reviews.llvm.org/D87528	2020-10-25 06:46:25 -07:00
Akira Hatanaka	71e1a56de1	[CodeGen] Emit destructor calls to destruct non-trivial C struct temporaries created by conditional and assignment operators rdar://problem/64989559 Differential Revision: https://reviews.llvm.org/D83448	2020-10-23 14:46:17 -07:00
Nick Desaulniers	b7926ce6d7	[IR] add fn attr for no_stack_protector; prevent inlining on mismatch It's currently ambiguous in IR whether the source language explicitly did not want a stack a stack protector (in C, via function attribute no_stack_protector) or doesn't care for any given function. It's common for code that manipulates the stack via inline assembly or that has to set up its own stack canary (such as the Linux kernel) would like to avoid stack protectors in certain functions. In this case, we've been bitten by numerous bugs where a callee with a stack protector is inlined into an __attribute__((__no_stack_protector__)) caller, which generally breaks the caller's assumptions about not having a stack protector. LTO exacerbates the issue. While developers can avoid this by putting all no_stack_protector functions in one translation unit together and compiling those with -fno-stack-protector, it's generally not very ergonomic or as ergonomic as a function attribute, and still doesn't work for LTO. See also: https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/ https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u Typically, when inlining a callee into a caller, the caller will be upgraded in its level of stack protection (see adjustCallerSSPLevel()). By adding an explicit attribute in the IR when the function attribute is used in the source language, we can now identify such cases and prevent inlining. Block inlining when the callee and caller differ in the case that one contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`. Fixes pr/47479. Reviewed By: void Differential Revision: https://reviews.llvm.org/D87956	2020-10-23 11:55:39 -07:00
Venkataramanan Kumar	57cdc52c4d	Initial support for vectorization using Libmvec (GLIBC vector math library) Differential Revision: https://reviews.llvm.org/D88154	2020-10-22 16:01:39 -04:00
Xiang1 Zhang	7c3fea7721	[X86] Support customizing stack protector guard Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D88631	2020-10-22 10:08:14 +08:00
Richard Smith	ba4768c966	[c++20] For P0732R2 / P1907R1: Basic frontend support for class types as non-type template parameters. Create a unique TemplateParamObjectDecl instance for each such value, representing the globally unique template parameter object to which the template parameter refers. No IR generation support yet; that will follow in a separate patch.	2020-10-21 13:21:41 -07:00
Jonas Paulsson	42a82862b6	Reapply "[clang] Improve handling of physical registers in inline assembly operands." Earlyclobbers are now excepted from this change (original commit: `c78da03`). Review: Ulrich Weigand, Nick Desaulniers Differential Revision: https://reviews.llvm.org/D87279	2020-10-21 10:53:40 +02:00
Mikhail Maltsev	7819411837	[clang] Use SourceLocation as key in hash maps, NFCI The patch adjusts the existing `llvm::DenseMap<unsigned, T>` and `llvm::DenseSet<unsigned>` objects that store source locations, so that they use `SourceLocation` directly instead of `unsigned`. This patch relies on the `DenseMapInfo` trait added in D89719. It also replaces the construction of `SourceLocation` objects from the constants -1 and -2 with calls to the trait's methods `getEmptyKey` and `getTombstoneKey` where appropriate. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D69840	2020-10-20 16:24:09 +01:00
Richard Smith	08c8d5bc51	Properly track whether a variable is constant-initialized. This fixes miscomputation of __builtin_constant_evaluated in the initializer of a variable that's not usable in constant expressions, but is readable when constant-folding. If evaluation of a constant initializer fails, we throw away the evaluated result instead of keeping it as a non-constant-initializer value for the variable, because it might not be a correct value. To avoid regressions for initializers that are foldable but not formally constant initializers, we now try constant-evaluating some globals in C++ twice: once to check for a constant initializer (in an mode where is_constannt_evaluated returns true) and again to determine the runtime value if the initializer is not a constant initializer.	2020-10-19 23:59:11 -07:00
Fangrui Song	0ab222e7d7	[gcov] Delete CC1 option -test-coverage The name is unfortunate because it is similar to the driver option -ftest-coverage. It turns out aside from one occurrence in a test, this option is not used.	2020-10-19 21:48:51 -07:00
Richard Smith	3692d20d2b	Refactor tracking of constant initializers for variables. Instead of framing the interface around whether the variable is an ICE (which is only interesting in C++98), primarily track whether the initializer is a constant initializer (which is interesting in all C++ language modes). No functionality change intended.	2020-10-19 21:31:19 -07:00
Richard Smith	76c0092665	Ensure that checkInitIsICE is called exactly once for every variable for which it matters. This is a step towards separating checking for a constant initializer (in which std::is_constant_evaluated returns true) and any other evaluation of a variable initializer (in which it returns false).	2020-10-19 19:04:04 -07:00
Douglas Yung	774ab60125	Add option to use older clang ABI behavior when passing certain union types as function arguments Recently commit D78699 (commit `26cfb6e562`), fixed clang's behavior with respect to passing a union type through a register to correctly follow the ABI. However, this is an ABI breaking change with earlier versions of the clang compiler, so we should add an -fclang-abi-compat option to address this. Additionally, the PS4 ABI requires the older behavior, so that is added as well. This change adds a Ver11 value to the ClangABI enum that when it is set (or the target is the PS4 triple), we skip the ABI fix introduced in D78699. Differential Revision: https://reviews.llvm.org/D89747	2020-10-19 18:17:34 -07:00
Simon Pilgrim	7fe7d9b130	Fix MSVC "not all control paths return a value" warning. NFCI.	2020-10-19 11:48:31 +01:00
Hans Wennborg	0628bea513	Revert "[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting" This broke Chromium's PGO build, it seems because hot-cold-splitting got turned on unintentionally. See comment on the code review for repro etc. > This patch adds -f[no-]split-cold-code CC1 options to clang. This allows > the splitting pass to be toggled on/off. The current method of passing > `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose > correctly (say, with `-O0` or `-Oz`). > > To implement the -fsplit-cold-code option, an attribute is applied to > functions to indicate that they may be considered for splitting. This > removes some complexity from the old/new PM pipeline builders, and > behaves as expected when LTO is enabled. > > Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> > Differential Revision: https://reviews.llvm.org/D57265 > Reviewed By: Aditya Kumar, Vedant Kumar > Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar This reverts commit `273c299d5d`.	2020-10-19 12:31:14 +02:00
Mark de Wever	389c8d5b20	[NFC] Make non-modifying members const. Implementing the likelihood attributes for the iteration statements adds a new helper function. This function can't be const qualified since these non-modifying members aren't const qualified.	2020-10-18 18:50:21 +02:00
Mark de Wever	2bcda6bb28	[Sema, CodeGen] Implement [[likely]] and [[unlikely]] in SwitchStmt This implements the likelihood attribute for the switch statement. Based on the discussion in D85091 and D86559 it only handles the attribute when placed on the case labels or the default labels. It also marks the likelihood attribute as feature complete. There are more QoI patches in the pipeline. Differential Revision: https://reviews.llvm.org/D89210	2020-10-18 13:48:42 +02:00
Richard Smith	d4aac67859	Make the check for whether we should memset(0) an aggregate initialization a little smarter. Look through casts that preserve zero-ness when determining if an initializer is zero, so that we can handle cases like an {0} initializer whose corresponding field is a type other than 'int'.	2020-10-16 16:48:22 -07:00
Richard Smith	48c70c1664	Extend memset-to-zero optimization to C++11 aggregate functional casts Aggr{...}. We previously missed these cases due to not stepping over the additional AST nodes representing their syntactic form.	2020-10-16 13:21:08 -07:00
Matt Arsenault	0a7cd99a70	Reapply "OpaquePtr: Add type to sret attribute" This reverts commit `eb9f7c28e5`. Previously this was incorrectly handling linking of the contained type, so this merges the fixes from D88973.	2020-10-16 11:05:02 -04:00
Caroline Concatto	e8d9ee9c7c	[SVE][CodeGen]Use getFixedSize() function for TypeSize comparison in clang This patch makes sure that the instance of TypeSize comparison operator is done with a fixed type size. Differential Revision: https://reviews.llvm.org/D89312	2020-10-16 10:56:39 +01:00
Vedant Kumar	273c299d5d	[PM/CC1] Add -f[no-]split-cold-code CC1 option to toggle splitting This patch adds -f[no-]split-cold-code CC1 options to clang. This allows the splitting pass to be toggled on/off. The current method of passing `-mllvm -hot-cold-split=true` to clang isn't ideal as it may not compose correctly (say, with `-O0` or `-Oz`). To implement the -fsplit-cold-code option, an attribute is applied to functions to indicate that they may be considered for splitting. This removes some complexity from the old/new PM pipeline builders, and behaves as expected when LTO is enabled. Co-authored by: Saleem Abdulrasool <compnerd@compnerd.org> Differential Revision: https://reviews.llvm.org/D57265 Reviewed By: Aditya Kumar, Vedant Kumar Reviewers: Teresa Johnson, Aditya Kumar, Fedor Sergeev, Philip Pfaffe, Vedant Kumar	2020-10-15 23:13:33 +00:00
Fangrui Song	5a338599fb	[CGBuiltin] Respect asm labels and redefine_extname for builtins with specialized emitting rL131311 added `asm()` support for builtin functions, but `asm()` for builtins with specialized emitting (e.g. memcpy, various math functions) still do not work. This patch makes these functions work for `asm()` and `#pragma redefine_extname`. glibc uses `asm()` to redirect internal libc function calls to hidden aliases. Limitation: such a function is a builtin in clang, but will not be recognized as a libcall in optimization passes because Clang does not annotate the renamed function as a libcall. In GCC -O1 or above, `abs` can be optimized out but we can't. Additionally, we cannot redirect `__builtin_sin` to `real_sin` in the following example: double sin(double x) asm("real_sin"); double f(double d) { return __builtin_sin(d); } --- According to @rsmith, the following three statements cannot be simultaneously true: (1) The frontend function foo has known, builtin semantics X. (2) The symbol foo has known, builtin semantics X. (3) It's not correct to lower a call to the frontend function foo to the symbol foo. People do want (1) (if it is profitable to expand a memcpy, do it). This also means that people do not want to add -fno-builtin-memcpy. People do want (3): that is why they use asm("__GI_memcpy") in the first place. So unfortunately we make a compromise by not refuting (2) (see the limitation above). For most libcalls, there is a small loss because compilers don't synthesize them. For the few glibc cares about, it uses `asm("memcpy = __GI_memcpy");` to make the assembly level redirection. (Changing function names (e.g. `__memcpy`) is a hit to ergonomics which is not acceptable). Reviewed By: rsmith Differential Revision: https://reviews.llvm.org/D88712	2020-10-15 15:14:38 -07:00
Reid Kleckner	5fbab4025e	[MS] Apply `inreg` to AArch64 sret parms on instance methods The documentation rules indicate that instance methods should return large, trivially copyable aggregates via X1/X0 and not X8 as is normally done when returning such structs from free functions: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values Fixes PR47836, a bug in the initial implementation of these rules. I tried to simplify the logic a bit as well while I'm here. Differential Revision: https://reviews.llvm.org/D89362	2020-10-15 14:54:42 -07:00
Yaxun (Sam) Liu	e384e94fbe	Revert "[HIP] Change default --gpu-max-threads-per-block value to 1024" This reverts commit `187658b8a6` due to AMDGPU backend issues.	2020-10-15 17:25:55 -04:00
Leonard Chan	79829a4704	Revert "[clang] Add -fc++-abi= flag for specifying which C++ ABI to use" This reverts commits `683b308c07` and `8487bfd4e9`. We will go for a more restricted approach that does not give freedom to everyone to change ABIs on whichever platform. See the discussion on https://reviews.llvm.org/D85802.	2020-10-15 14:24:38 -07:00
Thomas Lively	1992e30c2d	[WebAssembly] Prototype i8x16.popcnt As proposed at https://github.com/WebAssembly/simd/pull/379. Use a target builtin and intrinsic rather than normal codegen patterns to make the instruction opt-in until it is merged to the proposal and stabilized in engines. Differential Revision: https://reviews.llvm.org/D89446	2020-10-15 21:18:22 +00:00
Stanislav Mekhanoshin	d1beb95d12	[AMDGPU] gfx1032 target Differential Revision: https://reviews.llvm.org/D89487	2020-10-15 12:41:18 -07:00
Thomas Lively	3f738d1f5e	Reland "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c8385a352` with a typing fix to an instruction selection pattern.	2020-10-15 19:32:34 +00:00
Thomas Lively	7c8385a352	Revert "[WebAssembly] v128.load{8,16,32,64}_lane instructions" This reverts commit `7c6bfd90ab`.	2020-10-15 15:49:36 +00:00
Thomas Lively	7c6bfd90ab	[WebAssembly] v128.load{8,16,32,64}_lane instructions Prototype the newly proposed load_lane instructions, as specified in https://github.com/WebAssembly/simd/pull/350. Since these instructions are not available to origin trial users on Chrome stable, make them opt-in by only selecting them from intrinsics rather than normal ISel patterns. Since we only need rough prototypes to measure performance right now, this commit does not implement all the load and store patterns that would be necessary to make full use of the offset immediate. However, the full suite of offset tests is included to make it easy to track improvements in the future. Since these are the first instructions to have a memarg immediate as well as an additional immediate, the disassembler needed some additional hacks to be able to parse them correctly. Making that code more principled is left as future work. Differential Revision: https://reviews.llvm.org/D89366	2020-10-15 15:33:10 +00:00
Caroline Concatto	145e44bb18	[SVE]Fix implicit TypeSize casts in EmitCheckValue Using TypeSize::getFixedSize() instead of relying upon the implicit TypeSize->uint64_cast as the type is always fixed width. Differential Revision: https://reviews.llvm.org/D89313	2020-10-15 13:25:46 +01:00
Simon Pilgrim	d7fa9030d4	[CodeGen][X86] Emit fshl/fshr ir intrinsics for shiftleft128/shiftright128 ms intrinsics Now that funnel shift handling is pretty good, we can use the intrinsics directly and avoid a lot of zext/trunc issues. https://godbolt.org/z/YqhnnM Differential Revision: https://reviews.llvm.org/D89405	2020-10-15 10:22:41 +01:00
Duncan P. N. Exon Smith	dde4e0318c	clang/CodeGen: Stop using SourceManager::getBuffer, NFC Update `clang/lib/CodeGen` to use a `MemoryBufferRef` from `getBufferOrNone` instead of `MemoryBuffer*` from `getBuffer`. No functionality change here. Differential Revision: https://reviews.llvm.org/D89411	2020-10-14 23:32:43 -04:00
Leonard Chan	683b308c07	[clang] Add -fc++-abi= flag for specifying which C++ ABI to use This implements the flag proposed in RFC http://lists.llvm.org/pipermail/cfe-dev/2020-August/066437.html. The goal is to add a way to override the default target C++ ABI through a compiler flag. This makes it easier to test and transition between different C++ ABIs through compile flags rather than build flags. In this patch: - Store `-fc++-abi=` in a LangOpt. This isn't stored in a CodeGenOpt because there are instances outside of codegen where Clang needs to know what the ABI is (particularly through ASTContext::createCXXABI), and we should be able to override the target default if the flag is provided at that point. - Expose the existing ABIs in TargetCXXABI as values that can be passed through this flag. - Create a .def file for these ABIs to make it easier to check flag values. - Add an error for diagnosing bad ABI flag values. Differential Revision: https://reviews.llvm.org/D85802	2020-10-14 12:31:21 -07:00
Jonas Paulsson	625fa47617	Revert "[clang] Improve handling of physical registers in inline assembly operands." This reverts commit `c78da03778`. Temporarily reverted due to https://bugs.llvm.org/show_bug.cgi?id=47837.	2020-10-14 08:42:51 +02:00
Jonas Paulsson	c78da03778	[clang] Improve handling of physical registers in inline assembly operands. Change EmitAsmStmt() to - Not tie physregs with the "+r" constraint, but instead add the hard register as an input constraint. This makes "+r" and "=r":"r" look the same in the output. Background: Macro intensive user code may contain inline assembly statements with multiple operands constrained to the same physreg. Such a case (with the operand constraints "+r" : "r") currently triggers the TwoAddressInstructionPass assertion against any extra use of a tied register. Furthermore, TwoAddress will insert a COPY to that physreg even though isel has already done so (for the non-tied use), which may lead to a second redundant instruction currently. A simple fix for this is to not emit tied physreg uses in the first place for the "+r" constraint, which is what this patch does. - Give an error on multiple outputs to the same physical register. This should be reported and this is also what GCC does. Review: Ulrich Weigand, Aaron Ballman, Jennifer Yu, Craig Topper Differential Revision: https://reviews.llvm.org/D87279	2020-10-13 15:09:52 +02:00
Bevin Hansson	101309fe04	[AST] Change return type of getTypeInfoInChars to a proper struct instead of std::pair. Followup to D85191. This changes getTypeInfoInChars to return a TypeInfoChars struct instead of a std::pair of CharUnits. This lets the interface match getTypeInfo more closely. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D86447	2020-10-13 13:26:56 +02:00
Bevin Hansson	9fa7f48459	[Fixed Point] Add fixed-point to floating point cast types and consteval. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D86631	2020-10-13 13:26:56 +02:00
Ties Stuij	208987844f	[ARM] Follow AACPS standard for volatile bit-fields access width This patch resumes the work of D16586. According to the AAPCS, volatile bit-fields should be accessed using containers of the widht of their declarative type. In such case: ``` struct S1 { short a : 1; } ``` should be accessed using load and stores of the width (sizeof(short)), where now the compiler does only load the minimum required width (char in this case). However, as discussed in D16586, that could overwrite non-volatile bit-fields, which conflicted with C and C++ object models by creating data race conditions that are not part of the bit-field, e.g. ``` struct S2 { short a; int b : 16; } ``` Accessing `S2.b` would also access `S2.a`. The AAPCS Release 2020Q2 (https://documentation-service.arm.com/static/5efb7fbedbdee951c1ccf186?token=) section 8.1 Data Types, page 36, "Volatile bit-fields - preserving number and width of container accesses" has been updated to avoid conflict with the C++ Memory Model. Now it reads in the note: ``` This ABI does not place any restrictions on the access widths of bit-fields where the container overlaps with a non-bit-field member or where the container overlaps with any zero length bit-field placed between two other bit-fields. This is because the C/C++ memory model defines these as being separate memory locations, which can be accessed by two threads simultaneously. For this reason, compilers must be permitted to use a narrower memory access width (including splitting the access into multiple instructions) to avoid writing to a different memory location. For example, in struct S { int a:24; char b; }; a write to a must not also write to the location occupied by b, this requires at least two memory accesses in all current Arm architectures. In the same way, in struct S { int a:24; int:0; int b:8; };, writes to a or b must not overwrite each other. ``` I've updated the patch D16586 to follow such behavior by verifying that we only change volatile bit-field access when: - it won't overlap with any other non-bit-field member - we only access memory inside the bounds of the record - avoid overlapping zero-length bit-fields. Regarding the number of memory accesses, that should be preserved, that will be implemented by D67399. Reviewed By: ostannard Differential Revision: https://reviews.llvm.org/D72932	2020-10-13 10:31:48 +01:00
Simon Pilgrim	6c23cbc560	[X86] Convert integer _mm_reduce_* intrinsics to emit llvm.reduction intrinsics (PR47506) Emit the equivalent integer reduction intrinsics in IR instead of expanding to shuffle+arithmetic sequences. The fadd/fmul reductions might be trickier as they assume a similar bisection reduction while the generic intrinsics assume a sequential reduction (intel docs are ambiguous on the correct approach) - I'm not sure if we want to always tag them with reassoc? Anyway, that issue can wait until a separate fp patch along with the fmin/fmax reductions. Differential Revision: https://reviews.llvm.org/D87604	2020-10-13 09:28:39 +01:00
Richard Smith	913f600566	Canonicalize declaration pointers when forming APValues. References to different declarations of the same entity aren't different values, so shouldn't have different representations. Recommit of `e6393ee813`, most recently reverted in `9a33f027ac` due to a bug caused by ObjCInterfaceDecls not propagating availability attributes along their redeclaration chains; that bug was fixed in `e2d4174e9c`.	2020-10-12 19:32:57 -07:00
Arthur Eubanks	9a33f027ac	Revert "Canonicalize declaration pointers when forming APValues." This reverts commit `9dcd96f728`. See https://crbug.com/1134762.	2020-10-12 12:37:24 -07:00
Tim Renouf	666ef0db20	[AMDGPU] Add gfx602, gfx705, gfx805 targets At AMD, in an internal audit of our code, we found some corner cases where we were not quite differentiating targets enough for some old hardware. This commit is part of fixing that by adding three new targets: * The "Oland" and "Hainan" variants of gfx601 are now split out into gfx602. LLPC (in the GPUOpen driver) and other front-ends could use that to avoid using the shaderZExport workaround on gfx602. * One variant of gfx703 is now split out into gfx705. LLPC and other front-ends could use that to avoid using the shaderSpiCsRegAllocFragmentation workaround on gfx705. * The "TongaPro" variant of gfx802 is now split out into gfx805. TongaPro has a faster 64-bit shift than its former friends in gfx802, and a subtarget feature could be set up for that to take advantage of it. This commit does not make that change; it just adds the target. V2: Add clang changes. Put TargetParser list in order. V3: AMDGCNGPUs table in TargetParser.cpp needs to be in GPUKind order, so fix the GPUKind order. Differential Revision: https://reviews.llvm.org/D88916 Change-Id: Ia901a7157eb2f73ccd9f25dbacec38427312377d	2020-10-10 17:22:22 +01:00
Thomas Lively	d8f58bf53a	[WebAssembly] Prototype i16x8.q15mulr_sat_s This saturating, rounding, Q-format multiplication instruction is proposed in https://github.com/WebAssembly/simd/pull/365. Differential Revision: https://reviews.llvm.org/D88968	2020-10-09 21:17:53 +00:00
Liu, Chen3	26cfb6e562	[X86] Passing union type through register For example: union M256 { double d; __m256 m; }; extern void foo1(union M256 A); union M256 m1; void test() { foo1(m1); } clang will pass m1 through stack which does not follow the ABI. Differential Revision: https://reviews.llvm.org/D78699	2020-10-09 11:24:29 +08:00
Alexandre Ganea	66face6aa0	Re-land [DebugInfo] Add debug location to stubs generated by CGDeclCXX and mark them as artificial Previously, when clang was compiled with -DLLVM_ENABLE_ASSERTIONS=ON, the added tests were displaying: inlinable function call in a function with debug info must have a !dbg location call void @"??1?$c@UB@@@@QEAA@XZ"(%struct.c* @"?f@?1??d@@YAPEAU?$c@UB@@@@XZ@4U2@A") fatal error: error in backend: Broken module found, compilation aborted! Stack dump: 0. Program arguments: <f:\svn\buildninja\bin\clang -cc1 -emit-llvm debug-info-no-location.cpp> -gcodeview -debug-info-kind=limited 1. <eof> parser at end of file 2. Per-function optimization Fixes PR43012 Differential Revision: https://reviews.llvm.org/D66328	2020-10-08 20:49:17 -04:00
Arthur Eubanks	afff74e5c2	[HWAsan][NewPM] Handle hwasan like other sanitizers Move it as an EP callback (-O[123]) or in addSanitizersAtO0. This makes it not run in ThinLTO pre-link (like the other sanitizers), so don't check LTO runs in hwasan-new-pm.c. Changing its position also seems to change the generated IR. I think we just need to make sure the pass runs. Reviewed By: leonardchan Differential Revision: https://reviews.llvm.org/D88936	2020-10-08 14:43:21 -07:00
Joseph Huber	3cc1f1fc1d	[OpenMP] Replace OpenMP RTL Functions With OMPIRBuilder and OMPKinds.def Summary: Replace the OpenMP Runtime Library functions used in CGOpenMPRuntimeGPU for OpenMP device code generation with ones in OMPKinds.def and use OMPIRBuilder for generating runtime calls. This allows us to consolidate more OpenMP code generation into the OMPIRBuilder. Future additions to the GPU runtime functions should now go in OMPKinds.def Reviewers: jdoerfert Subscribers: aaron.ballman cfe-commits guansong llvm-commits sstefan1 yaxunl Tags: #OpenMP #LLVM #clang Differential Revision: https://reviews.llvm.org/D88430	2020-10-08 14:00:22 -04:00
diggerlin	92bca12843	[AIX] add new option -mignore-xcoff-visibility SUMMARY: In IBM compiler xlclang , there is an option -fnovisibility which suppresses visibility. For more details see: https://www.ibm.com/support/knowledgecenter/SSGH3R_16.1.0/com.ibm.xlcpp161.aix.doc/compiler_ref/opt_visibility.html. We need to add the option -mignore-xcoff-visibility for compatibility with the IBM AIX OS (as the option is enabled by default in AIX). With this option llvm does not emit any visibility attribute to ASM or XCOFF object file. The option only work on the AIX OS, for other non-AIX OS using the option will report an unsupported options error. In AIX OS: 1.1 the option -mignore-xcoff-visibility is enabled by default , if there is not -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command . 1.2 if there is -fvisibility=* explicitly but not -mignore-xcoff-visibility explicitly in the clang command. it will generate visibility attributes. 1.3 if there are both -fvisibility=* and -mignore-xcoff-visibility explicitly in the clang command. The option "-mignore-xcoff-visibility" wins , it do not emit the visibility attribute. The option -mignore-xcoff-visibility has no effect on visibility attribute when compile with -emit-llvm option to generated LLVM IR. Reviewer: daltenty,Jason Liu Differential Revision: https://reviews.llvm.org/D87451	2020-10-08 09:34:58 -04:00
Pushpinder Singh	3a12ff0dac	[OpenMP][RTL] Remove dead code RequiresDataSharing was always 0, resulting dead code in device runtime library. Reviewed By: jdoerfert, JonChesterfield Differential Revision: https://reviews.llvm.org/D88829	2020-10-06 05:43:47 -04:00
Fangrui Song	a2cc883368	[CUDA] Don't call __cudaRegisterVariable on C++17 inline variables D17779: host-side shadow variables of external declarations of device-side global variables have internal linkage and are referenced by `__cuda_register_globals`. nvcc from CUDA 11 does not allow `__device__ inline` or `__device__ constexpr` (C++17 inline variables) but clang has incorrectly supported them for a while: ``` error: A __device__ variable cannot be marked constexpr error: An inline __device__/__constant__/__managed__ variable must have internal linkage when the program is compiled in whole program mode (-rdc=false) ``` If such a variable (which has a comdat group) is discarded (a copy from another translation unit is prevailing and selected), accessing the variable from outside the section group (`__cuda_register_globals`) is a violation of the ELF specification and will be rejected by linkers: > A symbol table entry with STB_LOCAL binding that is defined relative to one of a group's sections, and that is contained in a symbol table section that is not part of the group, must be discarded if the group members are discarded. References to this symbol table entry from outside the group are not allowed. As a workaround, don't register such inline variables for now. (If we register the variables in all TUs, we will keep multiple instances of the shadow and break the C++ semantics for inline variables). We should reject such variables in Sema but our internal users need some time to migrate. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D88786	2020-10-05 12:53:59 -07:00
Craig Topper	a02b449bb1	[X86] Sync AESENC/DEC Key Locker builtins with gcc. For the wide builtins, pass a single input and output pointer to the builtins. Emit the GEPs and input loads from CGBuiltin.	2020-10-04 12:09:41 -07:00

... 8 9 10 11 12 ...

14730 Commits