llvm-project

Commit Graph

Author	SHA1	Message	Date
Vitaly Buka	9905dae5e1	Revert "[Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type" Breaks Windows bot. This reverts commit `3ad2fe913a`.	2022-09-03 13:12:49 -07:00
Kazu Hirata	89f1433225	Use llvm::lower_bound (NFC)	2022-09-03 11:17:37 -07:00
yronglin	3ad2fe913a	[Clang][CodeGen] Avoid __builtin_assume_aligned crash when the 1st arg is array type Avoid __builtin_assume_aligned crash when the 1st arg is array type(or string literal). Open issue: https://github.com/llvm/llvm-project/issues/57169 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D133202	2022-09-03 23:26:01 +08:00
Fangrui Song	1a4d851d27	[MinGW] Ignore -fvisibility/-fvisibility-inlines-hidden for dllexport Similar to `123ce97fac` for dllimport: dllexport expresses a non-hidden visibility intention. We can consider it explicit and therefore it should override the global visibility setting (see AST/Decl.cpp "NamedDecl Implementation"). Adding the special case to CodeGenModule::setGlobalVisibility is somewhat weird, but allows us to add the code in one place instead of many in AST/Decl.cpp. Differential Revision: https://reviews.llvm.org/D133180	2022-09-02 09:59:16 -07:00
serge-sans-paille	e0746a8a8d	[clang] cleanup -fstrict-flex-arrays implementation This is a follow up to https://reviews.llvm.org/D126864, addressing some remaining comments. It also considers union with a single zero-length array field as FAM for each value of -fstrict-flex-arrays. Differential Revision: https://reviews.llvm.org/D132944	2022-09-01 15:06:21 +02:00
Chuanqi Xu	7e19d53da4	[NFC] Emit builtin coroutine calls uniformly All the coroutine builtins were emitted in EmitCoroutineIntrinsic except __builtin_coro_size. This patch tries to emit all the coroutine builtins uniformly.	2022-09-01 16:31:51 +08:00
Vitaly Buka	960e7a5513	[msan] Use Debug Info to point to affected fields Reviewed By: kstoimenov Differential Revision: https://reviews.llvm.org/D132909	2022-08-31 13:12:17 -07:00
Sanjay Patel	cdf3de45d2	[CodeGen] fix misnamed "not" operation; NFC Seeing the wrong instruction for this name in IR is confusing. Most of the tests are not even checking a subsequent use of the value, so I just deleted the over-specified CHECKs.	2022-08-31 15:11:48 -04:00
Vitaly Buka	c059ede28e	[msan] Add more specific messages for use-after-destroy Reviewed By: kda, kstoimenov Differential Revision: https://reviews.llvm.org/D132907	2022-08-30 19:52:32 -07:00
Luke Nihlen	c9aba60074	[clang] Don't emit debug vtable information for consteval functions Fixes https://github.com/llvm/llvm-project/issues/55065 Reviewed By: shafik Differential Revision: https://reviews.llvm.org/D132874	2022-08-30 19:10:15 +00:00
Rong Xu	db18f26567	[llvm-profdata] Handle internal linkage functions in profile supplementation This patch has the following changes: (1) Handling of internal linkage functions (static functions) Static functions in FDO have a prefix of source file name, while they do not have one in SampleFDO. Current implementation does not handle this and we are not updating the profile for static functions. This patch fixes this. (2) Handling of -funique-internal-linkage-names Again this is for the internal linkage functions. Option -funique-internal-linkage-names can now be applied to both FDO and SampleFDO compilation. When it is used, it demangles internal linkage function names and adds a hash value as the postfix. When both SampleFDO and FDO profiles use this option, or both do not use this option, changes in (1) should handle this. Here we also handle the case where the SampleFDO profile uses this option while the FDO profile does not, or vice versa. There is one case where this patch won't work: if one of the profiles uses mangled names and the other does not. For example, if the SampleFDO profile uses the clang C compiler without -funique-internal-linkage-names, while the FDO profile uses -funique-internal-linkage-names. The SampleFDO profile contains unmangled names while the FDO profile contains mangled names. If both profiles use the C++ compiler, this won't happen. We think this use case is rare and does not justify the effort to fix. Differential Revision: https://reviews.llvm.org/D132600	2022-08-29 16:15:12 -07:00
Yuanfang Chen	70248bfdea	[Clang] Implement function attribute nouwtable To have finer control of IR uwtable attribute generation. For target code generation, IR nounwind and uwtable may have some interaction. However, for the frontend, there are no semantic interactions, so this new `nouwtable` is marked "SimpleHandler = 1". Differential Revision: https://reviews.llvm.org/D132592	2022-08-29 12:12:19 -07:00
Kazu Hirata	86bc4587e1	Use std::clamp (NFC) This patch replaces clamp idioms with std::clamp where the range is obviously valid from the source code (that is, low <= high) to avoid introducing undefined behavior.	2022-08-27 09:53:13 -07:00
Jun Zhang	a4f84f1b2e	[CodeGen] Track DeferredDecls that have been emitted If we run into a first usage or definition of a mangled name, and there's a DeferredDecl that is associated with it, we should remember that we need to emit it later on. Without this patch, clang-repl hits a JIT symbol not found error: clang-repl> extern "C" int printf(const char *, ...); clang-repl> auto l1 = []() { printf("ONE\n"); return 42; }; clang-repl> auto l2 = []() { printf("TWO\n"); return 17; }; clang-repl> auto r1 = l1(); ONE clang-repl> auto r2 = l2(); TWO clang-repl> auto r3 = l2(); JIT session error: Symbols not found: [ l2 ] error: Failed to materialize symbols: { (main, { r3, orc_init_func.incr_module_5, $.incr_module_5.inits.0 }) } Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D130831	2022-08-27 22:32:47 +08:00
Leonard Chan	cdb30f7a26	[clang] Do not instrument the rtti_proxies under hwasan We run into a duplicate symbol error when instrumenting the rtti_proxies generated as part of the relative vtables ABI with hwasan: ``` ld.lld: error: duplicate symbol: typeinfo for icu_71::UObject (.rtti_proxy) >>> defined at brkiter.cpp >>> arm64-hwasan-shared/obj/third_party/icu/source/common/libicuuc.brkiter.cpp.o:(typeinfo for icu_71::UObject (.rtti_proxy)) >>> defined at locavailable.cpp >>> arm64-hwasan-shared/obj/third_party/icu/source/common/libicuuc.locavailable.cpp.o:(.data.rel.ro..L_ZTIN6icu_717UObjectE.rtti_proxy.hwasan+0xE00000000000000) ``` The issue here is that the hwasan alias carries over the visibility and linkage of the original proxy, so we have duplicate external symbols that participate in linking. Similar to D132425 we can just disable hwasan for the proxies for now. Differential Revision: https://reviews.llvm.org/D132691	2022-08-26 18:22:17 +00:00
Leonard Chan	93e5cf6b9c	[clang] Do not instrument relative vtables under hwasan Full context in https://bugs.fuchsia.dev/p/fuchsia/issues/detail?id=107017. Instrumenting hwasan with globals results in a linker error under the relative vtables abi: ``` ld.lld: error: libunwind.cpp:(.rodata..L_ZTVN9libunwind12UnwindCursorINS_17LocalAddressSpaceENS_15Registers_arm64EEE.hwasan+0x8): relocation R_AARCH64_PLT32 out of range: 6845471433603167792 is not in [-2147483648, 2147483647]; references libunwind::AbstractUnwindCursor::~AbstractUnwindCursor() >>> defined in libunwind/src/CMakeFiles/unwind_shared.dir/libunwind.cpp.obj ``` This is because the tag is included in the vtable address when calculating the offset between the vtable and virtual function. A temporary solution until we can resolve this is to just disable hwasan instrumentation on relative vtables specifically, which can be done in the frontend. Differential Revision: https://reviews.llvm.org/D132425	2022-08-26 18:21:40 +00:00
Xiang Li	a0ecb4a299	[HLSL] Move DXIL validation version out of ModuleFlags Put the DXIL validation version into a separate NamedMetadata to avoid updating ModuleFlags. Currently the DXIL validation version is saved in ModuleFlags in clang CodeGen. Then in the DirectX backend, the data is extracted from ModuleFlags, which causes a rebuild of ModuleFlags. This patch builds NamedMetadata for the DXIL validation version and removes the code that rebuilds ModuleFlags. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D130207	2022-08-26 09:20:45 -07:00
Corentin Jabot	463e30f51f	[Clang] Fix crash in coverage of if consteval. Clang crashes when encountering an `if consteval` statement. This is the minimum fix not to crash. The fix is consistent with the current behavior of if constexpr, which does generate coverage data for the discarded branches. This is of course not correct and a better solution is needed for both if constexpr and if consteval. See https://github.com/llvm/llvm-project/issues/54419. Fixes #57377 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D132723	2022-08-26 17:46:53 +02:00
Chris Bieneman	22c477f934	[HLSL] Initial codegen for SV_GroupIndex Semantic parameters aren't passed as actual parameters, instead they are populated from intrinsics which are generally lowered to reads from dedicated hardware registers. This change modifies clang CodeGen to emit the intrinsic calls and populate the parameter's LValue with the result of the intrinsic call for SV_GroupIndex. The result of this is to make the actual passed argument ignored, which will make it easy to clean up later in an IR pass. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D131203	2022-08-25 11:17:54 -05:00
David Majnemer	bd28bd59a3	[clang-cl] /kernel should toggle bit 30 in @feat.00 The linker is supposed to detect when an object with /kernel is linked with another object which is not compiled with /kernel. The linker detects this by checking bit 30 in @feat.00.	2022-08-25 14:17:26 +00:00
Zahira Ammarguellat	5def954a5b	Support of expression granularity for _Float16. Differential Revision: https://reviews.llvm.org/D113107	2022-08-25 08:26:53 -04:00
Sami Tolvanen	cff5bef948	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. 
See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Relands `67504c9549` with a fix for 32-bit builds. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 22:41:38 +00:00
Sami Tolvanen	a79060e275	Revert "KCFI sanitizer" This reverts commit `67504c9549` as using PointerEmbeddedInt to store 32 bits breaks 32-bit arm builds.	2022-08-24 19:30:13 +00:00
Sami Tolvanen	67504c9549	KCFI sanitizer The KCFI sanitizer, enabled with `-fsanitize=kcfi`, implements a forward-edge control flow integrity scheme for indirect calls. It uses a !kcfi_type metadata node to attach a type identifier for each function and injects verification code before indirect calls. Unlike the current CFI schemes implemented in LLVM, KCFI does not require LTO, does not alter function references to point to a jump table, and never breaks function address equality. KCFI is intended to be used in low-level code, such as operating system kernels, where the existing schemes can cause undue complications because of the aforementioned properties. However, unlike the existing schemes, KCFI is limited to validating only function pointers and is not compatible with executable-only memory. KCFI does not provide runtime support, but always traps when a type mismatch is encountered. Users of the scheme are expected to handle the trap. With `-fsanitize=kcfi`, Clang emits a `kcfi` operand bundle to indirect calls, and LLVM lowers this to a known architecture-specific sequence of instructions for each callsite to make runtime patching easier for users who require this functionality. A KCFI type identifier is a 32-bit constant produced by taking the lower half of xxHash64 from a C++ mangled typename. If a program contains indirect calls to assembly functions, they must be manually annotated with the expected type identifiers to prevent errors. To make this easier, Clang generates a weak SHN_ABS `__kcfi_typeid_<function>` symbol for each address-taken function declaration, which can be used to annotate functions in assembly as long as at least one C translation unit linked into the program takes the function address. For example on AArch64, we might have the following code: ``` .c: int f(void); int (*p)(void) = f; p(); .s: .4byte __kcfi_typeid_f .global f f: ... ``` Note that X86 uses a different preamble format for compatibility with Linux kernel tooling. 
See the comments in `X86AsmPrinter::emitKCFITypeId` for details. As users of KCFI may need to locate trap locations for binary validation and error handling, LLVM can additionally emit the locations of traps to a `.kcfi_traps` section. Similarly to other sanitizers, KCFI checking can be disabled for a function with a `no_sanitize("kcfi")` function attribute. Reviewed By: nickdesaulniers, kees, joaomoreira, MaskRay Differential Revision: https://reviews.llvm.org/D119296	2022-08-24 18:52:42 +00:00
Vitaly Buka	b5a9adf1f5	[clang] Create alloca to pass into static lambda "this" parameter of lambda is undef, notnull and dereferenceable. So we need to pass something consistent. Any alloca will work. It will be eliminated as unused later by the optimizer. Otherwise we generate code which Msan is expected to catch. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D132275	2022-08-23 13:53:17 -07:00
Joseph Huber	2b8f722e63	[OpenMP] Add option to assert no nested OpenMP parallelism on the GPU The OpenMP device runtime needs to support the OpenMP standard. However, constructs like nested parallelism are very uncommon in real applications yet lead to complexity in the runtime that is sometimes difficult to optimize out. As a stop-gap for performance we should supply an argument that selectively disables this feature. This patch adds the `-fopenmp-assume-no-nested-parallelism` argument which explicitly disables the use of nested parallelism in OpenMP. Reviewed By: carlo.bertolli Differential Revision: https://reviews.llvm.org/D132074	2022-08-23 14:09:51 -05:00
utsumi	2e2caea37f	[Clang][OpenMP] Make copyin clause on combined and composite construct work (patch by Yuichiro Utsumi (utsumi.yuichiro@fujitsu.com)) Make copyin clause on the following constructs work. - parallel for - parallel for simd - parallel sections Fixes https://github.com/llvm/llvm-project/issues/55547 Patch by Yuichiro Utsumi (utsumi.yuichiro@fujitsu.com) Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D132209	2022-08-23 07:58:35 -07:00
David Majnemer	2c923b8863	[clang-cl] Expose the /volatile:{iso,ms} choice via _ISO_VOLATILE MSVC allows interpreting volatile loads and stores, when combined with /volatile:iso, as having acquire/release semantics. MSVC also exposes a define, _ISO_VOLATILE, which allows users to enquire if this feature is enabled or disabled.	2022-08-23 14:29:52 +00:00
Yuanfang Chen	f9969a3d28	[CodeGen] Sort llvm.global_ctors by lexing order before emission Fixes https://github.com/llvm/llvm-project/issues/55804 The lexing order is already bookkept in DelayedCXXInitPosition but we were not using it, based on the wrong assumption that inline variables are unordered. This patch fixes it by ordering entries in llvm.global_ctors by their order in DelayedCXXInitPosition. Entries in llvm.global_ctors without a lexing order are ordered by insertion order. (This mostly orders the template instantiation in https://reviews.llvm.org/D126341 intuitively, minus one tweak for which I'll submit a separate patch.) Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127233	2022-08-22 16:00:14 -07:00
Yaxun (Sam) Liu	9f6cb3e9fd	[AMDGPU] Add builtin s_sendmsg_rtn Reviewed by: Brian Sumner, Artem Belevich Differential Revision: https://reviews.llvm.org/D132140 Fixes: SWDEV-352017	2022-08-22 18:29:23 -04:00
Chris Bieneman	9a478d5232	[NFC] Rename dx.shader to hlsl.shader This metadata annotation is HLSL-specific not DirectX specific. It will need to be attached for shaders regardless of whether they are targeting DXIL.	2022-08-22 16:03:40 -05:00
Kazu Hirata	8b1b0d1d81	Revert "Use std::is_same_v instead of std::is_same (NFC)" This reverts commit `c5da37e42d`. This patch seems to break builds with some versions of MSVC.	2022-08-20 23:00:39 -07:00
Kazu Hirata	c5da37e42d	Use std::is_same_v instead of std::is_same (NFC)	2022-08-20 22:36:26 -07:00
Kazu Hirata	8e494b85a5	Use llvm::drop_begin (NFC)	2022-08-20 21:18:30 -07:00
Alex Bradbury	bc53832080	[clang][RISCV] Fix incorrect ABI lowering for inherited structs under hard-float ABIs The hard float ABIs have a rule that if a flattened struct contains either a single fp value, or an int+fp, or fp+fp then it may be passed in a pair of registers (if sufficient GPRs+FPRs are available). detectFPCCEligibleStruct and the helper it calls, detectFPCCEligibleStructHelper examine the type of the argument/return value to determine if it complies with the requirements for this ABI rule. As reported in bug #57084, this logic produces incorrect results for C++ structs that inherit from other structs. This is because only the fields of the struct were examined, but enumerating RD->fields misses any fields in inherited C++ structs. This patch corrects that issue by adding appropriate logic to enumerate any included base structs. Differential Revision: https://reviews.llvm.org/D131677	2022-08-19 20:31:06 +01:00
Craig Topper	1a60e003df	[RISCV] Use Triple::isRISCV/isRISCV32/isRISCV64 helpers in some places. NFC Reviewed By: reames Differential Revision: https://reviews.llvm.org/D132197	2022-08-19 09:11:22 -07:00
Caroline Concatto	9f21d6e953	[Clang][AArch64] Use generic extract/insert vector for svget/svset/svcreate tuples This patch replaces svget, svset and svcreate aarch64 intrinsics for tuple types with the generic llvm-ir intrinsics extract/insert vector Differential Revision: https://reviews.llvm.org/D131547	2022-08-19 12:58:59 +01:00
Caroline Concatto	4ef1f014a1	[Clang][AArch64] Replace aarch64_sve_ldN intrinsic by aarch64_sve_ldN.sret Differential Revision: https://reviews.llvm.org/D131687	2022-08-19 11:42:18 +01:00
Yonghong Song	481d67d310	[Clang][BPF] Support record argument with direct values Currently, record arguments are always passed by reference by allocating space for record values in the caller. This is less efficient for small records which may take one or two registers. For example, for x86_64 and aarch64, for a record size up to 16 bytes, the record values can be passed by values directly on the registers. This patch added BPF support of record argument with direct values for up to 16 byte record size. If record size is 0, that record will not take any register, which is the same behavior for x86_64 and aarch64. If the record size is greater than 16 bytes, the record argument will be passed by reference. Differential Revision: https://reviews.llvm.org/D132144	2022-08-18 19:11:50 -07:00
Prabhdeep Singh Soni	bce94ea551	[OMPIRBuilder] Add support for safelen clause This patch adds OMPIRBuilder support for the safelen clause for the simd directive. Reviewed By: shraiysh, Meinersbur Differential Revision: https://reviews.llvm.org/D131526	2022-08-18 15:43:08 -04:00
Wolfgang Pieb	8564e2fea5	[Inlining] Add a clang option to limit inlining of functions Add the clang option -finline-max-stacksize=<N> to suppress inlining of functions whose stack size exceeds the given value. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D131986	2022-08-18 11:56:24 -07:00
Ties Stuij	27cbfa7cc8	[Clang] Propagate const context info when emitting compound literal This patch fixes a crash when trying to emit a constant compound literal. For C++ Clang evaluates either casts or binary operations at translation time, but doesn't pass on the InConstantContext information that was inferred when parsing the statement. Because of this, strict FP evaluation (-ftrapping-math) which shouldn't be in effect yet, then causes checkFloatingpointResult to return false, which in tryEmitGlobalCompoundLiteral will trigger an assert that the compound literal wasn't constant. The discussion here around 'manifestly constant evaluated contexts' was very helpful to me when trying to understand what LLVM's position is on what evaluation context should be in effect, together with the explanatory text in that patch itself: https://reviews.llvm.org/D87528 Reviewed By: rjmccall, DavidSpickett Differential Revision: https://reviews.llvm.org/D131555	2022-08-18 11:25:20 +01:00
Vitaly Buka	36c9f5a58b	[NFC][OpenMP] Simplify `2f9be69d84`	2022-08-17 18:59:48 -07:00
David Blaikie	06c70e9b99	DebugInfo: Remove auto return type representation support Seems this complicated lldb sufficiently for some cases that it hasn't been worth supporting/fixing there - and it so far hasn't provided any new use cases/value for debug info consumers, so let's remove it until someone has a use case for it. (side note: the original implementation of this still had a bug (I should've caught it in review) that we still didn't produce auto-returning function declarations in types where the function wasn't instantiated (that requires a fix to remove the `if getContainedAutoType` condition in `CGDebugInfo::CollectCXXMemberFunctions` - without that, auto returning functions were still being handled the same as member function templates and special member functions - never added to the member list, only attached to the type via the declaration chain from the definition)) Further discussion about this in D123319 This reverts commit 5ff992bca208a0e37ca6338fc735aec6aa848b72: [DEBUG-INFO] Change how we handle auto return types for lambda operator() to be consistent with gcc This reverts commit c83602fdf51b2692e3bacb06bf861f20f74e987f: [DWARF5][clang]: Added support for DebugInfo generation for auto return type for C++ member functions. Differential Revision: https://reviews.llvm.org/D131933	2022-08-17 00:35:05 +00:00
Yonghong Song	d9198f64d9	[Clang][BPF]: Force sign/zero extension for return values in caller Currently bpf supports calling kernel functions (x86_64, arm64, etc.) in bpf programs. Tejun discovered a problem where the x86_64 func return value (an unsigned char type) is stored in 8-bit subregister %al and the other 56 bits in %rax might be garbage. But based on the current bpf ABI, the bpf program assumes the whole %rax holds the correct value as the callee is supposed to do the necessary sign/zero extension. This mismatch between bpf and x86_64 caused incorrect results. To resolve this problem, this patch forces the caller to do the needed sign/zero extension for 8/16-bit return values as well. Note that 32-bit return values already had sign/zero extension even without this patch. For example, for the test case attached to this patch: $ cat t.c _Bool bar_bool(void); unsigned char bar_char(void); short bar_short(void); int bar_int(void); int foo_bool(void) { if (bar_bool() != 1) return 0; else return 1; } int foo_char(void) { if (bar_char() != 10) return 0; else return 1; } int foo_short(void) { if (bar_short() != 10) return 0; else return 1; } int foo_int(void) { if (bar_int() != 10) return 0; else return 1; } Without this patch, generated call insns in IR look like: %call = call zeroext i1 @bar_bool() %call = call zeroext i8 @bar_char() %call = call signext i16 @bar_short() %call = call i32 @bar_int() So it is assumed that zero extension has been done for return values of bar_bool() and bar_char(). Sign extension has been done for the return value of bar_short(). The return value of bar_int() does not have any assumption so the caller needs to do the necessary shifting to get correct 32-bit values. 
With this patch, generated call insns in IR look like: %call = call i1 @bar_bool() %call = call i8 @bar_char() %call = call i16 @bar_short() %call = call i32 @bar_int() There are no assumptions for return values of the above four function calls, so shifting is necessary for all of them. The following is the objdump file difference for function foo_char(). Without this patch: 0000000000000010 <foo_char>: 2: 85 10 00 00 ff ff ff ff call -1 3: bf 01 00 00 00 00 00 00 r1 = r0 4: b7 00 00 00 01 00 00 00 r0 = 1 5: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2> 6: b7 00 00 00 00 00 00 00 r0 = 0 0000000000000038 <LBB1_2>: 7: 95 00 00 00 00 00 00 00 exit With this patch: 0000000000000018 <foo_char>: 3: 85 10 00 00 ff ff ff ff call -1 4: bf 01 00 00 00 00 00 00 r1 = r0 5: 57 01 00 00 ff 00 00 00 r1 &= 255 6: b7 00 00 00 01 00 00 00 r0 = 1 7: 15 01 01 00 0a 00 00 00 if r1 == 10 goto +1 <LBB1_2> 8: b7 00 00 00 00 00 00 00 r0 = 0 0000000000000048 <LBB1_2>: 9: 95 00 00 00 00 00 00 00 exit The zero extension of the return 'char' value is done here. Differential Revision: https://reviews.llvm.org/D131598	2022-08-16 16:08:01 -07:00
Saleem Abdulrasool	585f62be1a	CodeGen: correct handling of debug info generation for aliases When aliasing a static array, the aliasee is going to be a GEP which points to the value. We should strip pointer casts before forming the reference. This was occluded by the use of opaque pointers. This problem has existed since the introduction of the debug info generation for aliases in `b1ea0191a4`. The test case would assert due to the invalid cast with or without `-no-opaque-pointers` at that revision. Fixes: #57179	2022-08-16 21:27:05 +00:00
Arthur Eubanks	9181ce623f	[Windows] Put init_seg(compiler/lib) in llvm.global_ctors Currently we treat initializers with init_seg(compiler/lib) as similar to any other init_seg, they simply have a global variable in the proper section (".CRT$XCC" for compiler/".CRT$XCL" for lib) and are added to llvm.used. However, this doesn't match with how LLVM sees normal (or init_seg(user)) initializers via llvm.global_ctors. This causes issues like incorrect init_seg(compiler) vs init_seg(user) ordering due to GlobalOpt evaluating constructors, and the ability to remove init_seg(compiler/lib) initializers at all. Currently we use 'A' for priorities less than 200. Use 200 for init_seg(compiler) (".CRT$XCC") and 400 for init_seg(lib) (".CRT$XCL"), which do not append the priority to the section name. Priorities between 200 and 400 use ".CRT$XCC${Priority}". This allows for some wiggle room for people/future extensions that want to add initializers between compiler and lib. Fixes #56922 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D131910	2022-08-16 08:16:18 -07:00
Kazu Hirata	2b43bd0bd9	Remove unused forward declarations (NFC)	2022-08-13 12:55:47 -07:00
Vitaly Buka	2f9be69d84	[OpenMP] Fix another use after scope after D129608 https://lab.llvm.org/buildbot/#/builders/5/builds/26770	2022-08-13 12:13:54 -07:00
Vitaly Buka	f385eaf48f	[OpenMP] Fix use after scope after D129608 Broken builder https://lab.llvm.org/buildbot/#/builders/5/builds/26764	2022-08-13 09:40:51 -07:00
Jennifer Yu	2ca27206f9	[OpenMP] Fix segmentation fault when data field is used in is_device_ptr Currently, the field just emits map info for this pointer variable, which fails at run time. For the fields, the PartialStruct is created and it needs a call to emitCombinedEntry, which creates the base that covers all the pieces. The change is to generate map info as for regular fields. Differential Revision: https://reviews.llvm.org/D129608	2022-08-12 17:10:26 -07:00
Aaron Ballman	b48fb85fe6	Fix crash-on-valid with consteval temporary construction through list initialization Clang currently crashes when lowering a consteval list initialization of a temporary. This is partially working around an issue in the template instantiation code (TreeTransform::TransformCXXTemporaryObjectExpr()) that does not yet know how to handle list initialization of temporaries in all cases. However, it's also helping reduce fragility by ensuring we always have a valid QualType when trying to emit a constant expression during IR generation. Fixes #55871 Differential Revision: https://reviews.llvm.org/D131194	2022-08-11 13:44:24 -04:00
Florian Hahn	ef110a491f	[Builtins] Do not claim most libfuncs are readnone with trapping math. At the moment, Clang only considers errno when deciding if a builtin is const. This ignores the fact that some library functions may raise floating point exceptions, which may modify global state, e.g. when updating FP status registers. To model the fact that some library functions/builtins may raise floating point exceptions, this patch adds a new 'g' modifier for builtins. If a builtin is marked with 'g', it cannot be considered const, unless FP exceptions are ignored. So far I've not added CHECK lines for all calls in math-libcalls.c. I'll do that once we agree on the overall direction. A consequence seems to be that we fail to select some of the constrained math builtins now, but I am not entirely sure what's going on there. Reviewed By: john.brawn Differential Revision: https://reviews.llvm.org/D129231	2022-08-11 12:29:01 +01:00
Freddy Ye	e4888a37d3	[X86][BF16] Enable __bf16 for x86 targets. X86 psABI has updated to support __bf16 type, the ABI of which is the same as FP16. See https://discourse.llvm.org/t/patch-add-optional-bfloat16-support/63149 Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D130964	2022-08-10 09:00:47 +08:00
Fangrui Song	32197830ef	[clang][clang-tools-extra] LLVM_NODISCARD => [[nodiscard]]. NFC	2022-08-09 07:11:18 +00:00
Fangrui Song	3f18f7c007	[clang] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D131346	2022-08-08 09:12:46 -07:00
Sergei Barannikov	87dc7d4b61	[clang][CodeGen] Factor out Swift ABI hooks (NFCI) Swift calling conventions stand out in that they are lowered in a mostly target-independent manner, with very few customization points. As such, swift-related methods of ABIInfo do not reference the rest of ABIInfo and vice versa. This change follows the interface segregation principle; it removes the dependency of SwiftABIInfo on ABIInfo. Targets must now implement SwiftABIInfo separately if they support Swift calling conventions. Almost all targets implemented `shouldPassIndirectly` the same way. This de-facto default implementation has been moved into the base class. `isSwiftErrorInRegister` used to be virtual, now it is not. It didn't accept any arguments which could have an effect on the returned value. This is now a static property of the target ABI. Reviewed By: rusyaev-roman, inclyc Differential Revision: https://reviews.llvm.org/D130394	2022-08-08 00:23:23 +08:00
Shilei Tian	e21202dac1	[Clang][OpenMP] Fix the issue that `llvm.lifetime.end` is emitted too early for variables captured in linear clause Currently, if an OpenMP program uses a `linear` clause and is compiled with optimization, `llvm.lifetime.end` for variables listed in the `linear` clause is emitted too early, so there can still be uses after it. Let's take the following code as an example: ``` // loop.c int j; int *u; void loop(int n) { int i; for (i = 0; i < n; ++i) { ++j; u = &j; } } ``` We compile using the command: ``` clang -cc1 -fopenmp-simd -O3 -x c -triple x86_64-apple-darwin10 -emit-llvm loop.c -o loop.ll ``` The following IR (simplified) will be generated: ``` @j = local_unnamed_addr global i32 0, align 4 @u = local_unnamed_addr global ptr null, align 8 define void @loop(i32 noundef %n) local_unnamed_addr { entry: %j = alloca i32, align 4 %cmp = icmp sgt i32 %n, 0 br i1 %cmp, label %simd.if.then, label %simd.if.end simd.if.then: ; preds = %entry call void @llvm.lifetime.start.p0(i64 4, ptr nonnull %j) store ptr %j, ptr @u, align 8 call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j) %0 = load i32, ptr %j, align 4 store i32 %0, ptr @j, align 4 br label %simd.if.end simd.if.end: ; preds = %simd.if.then, %entry ret void } ``` The most important part is: ``` call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j) %0 = load i32, ptr %j, align 4 store i32 %0, ptr @j, align 4 ``` `%j` is still loaded after `@llvm.lifetime.end.p0(i64 4, ptr nonnull %j)`. This could cause the backend to optimize the code incorrectly and generate incorrect code. The root cause is that, when we emit a construct that could have a `linear` clause, it usually has the following pattern: ``` EmitOMPLinearClauseInit(S) { OMPPrivateScope LoopScope(this); ... EmitOMPLinearClause(S, LoopScope); ... (void)LoopScope.Privatize(); ... 
} EmitOMPLinearClauseFinal(S, [](CodeGenFunction &) { return nullptr; }); ``` Variables that need to be privatized are added into `LoopScope`, which also serves as a RAII object. When `LoopScope` is destroyed and optimization is enabled, a `@llvm.lifetime.end` is also emitted for each privatized variable. However, the write-back to the original variables in the `linear` clause happens after the scope, in `EmitOMPLinearClauseFinal`, causing the issue we see above. A quick "fix" would seem to be moving `EmitOMPLinearClauseFinal` inside the scope. However, that doesn't work, because the local variable map has been updated by `LoopScope` such that a variable declaration is mapped to the privatized variable instead of the actual one. In that case, the following code will be generated: ``` %0 = load i32, ptr %j, align 4 store i32 %0, ptr %j, align 4 call void @llvm.lifetime.end.p0(i64 4, ptr nonnull %j) ``` Now the lifetime is correct, but the write-back is broken. In this patch, a new function `OMPPrivateScope::restoreMap` is added and called before calling `EmitOMPLinearClauseFinal`. This makes sure that `EmitOMPLinearClauseFinal` can find the original variables to write back. Fixes #56913. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D131272	2022-08-06 16:50:37 -04:00
Xiang Li	b2c9ff7273	[NFC][HLSL] Fix build error caused by a missed typo update. Change setHLSLFnuctionAttributes to setHLSLFunctionAttributes. Differential Revision: https://reviews.llvm.org/D131240	2022-08-04 23:20:25 -07:00
Xiang Li	6134629af0	[NFC][HLSL] Fix typo in CGHLSLRuntime. Change setHLSLFnuctionAttributes to setHLSLFunctionAttributes. Differential Revision: https://reviews.llvm.org/D131238	2022-08-04 23:08:40 -07:00
Xiang Li	906e41f4e3	[HLSL] clang codeGen for HLSLShaderAttr. Translate HLSLShaderAttr to IR level. 1. Skip mangle for hlsl entry functions. 2. Add function attribute for hlsl entry functions. Reviewed By: Anastasia Differential Revision: https://reviews.llvm.org/D124752	2022-08-04 21:23:57 -07:00
Ellis Hoag	6f4c3c0f64	[InstrProf][attempt 2] Add new format for -fprofile-list= In D130807 we added the `skipprofile` attribute. This commit changes the format so we can either `forbid` or `skip` profiling functions by adding the `noprofile` or `skipprofile` attributes, respectively. The behavior of the original format remains unchanged. Also, add the `skipprofile` attribute when using `-fprofile-function-groups`. This was originally landed as https://reviews.llvm.org/D130808 but was reverted due to a Windows test failure. Differential Revision: https://reviews.llvm.org/D131195	2022-08-04 17:12:56 -07:00
Matt Arsenault	c5b36ab1d6	AMDGPU/clang: Remove dead code The order has to be a constant and should be enforced by the builtin definition. The fallthrough behavior would have been broken anyway. There's still an existing issue/assert if you try to use garbage for the ordering. The IRGen should be broken, but we also hit another assert before that. Fixes issue 56832	2022-08-04 19:02:56 -04:00
Nico Weber	0eb7d86f58	Revert "[InstrProf] Add new format for -fprofile-list=" This reverts commit `b692312ca4`. Breaks tests on Windows, see https://reviews.llvm.org/D130808#3699952	2022-08-04 13:04:59 -04:00
Ellis Hoag	b692312ca4	[InstrProf] Add new format for -fprofile-list= In D130807 we added the `skipprofile` attribute. This commit changes the format so we can either `forbid` or `skip` profiling functions by adding the `noprofile` or `skipprofile` attributes, respectively. The behavior of the original format remains unchanged. Also, add the `skipprofile` attribute when using `-fprofile-function-groups`. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D130808	2022-08-04 08:49:43 -07:00
Ellis Hoag	12e78ff881	[InstrProf] Add the skipprofile attribute As discussed in [0], this diff adds the `skipprofile` attribute to prevent the function from being profiled while allowing profiled functions to be inlined into it. The `noprofile` attribute remains unchanged. The `noprofile` attribute is used for functions where it is dangerous to add instrumentation to while the `skipprofile` attribute is used to reduce code size or performance overhead. [0] https://discourse.llvm.org/t/why-does-the-noprofile-attribute-restrict-inlining/64108 Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D130807	2022-08-04 08:45:27 -07:00
Matt Jacobson	c8b2f3f51b	[ObjC] type method metadata `_imp`, messenger routine at callsite with program address space On targets with non-default program address space (e.g., Harvard architectures), clang crashes when emitting Objective-C method metadata, because the address of the method IMP cannot be bitcast to i8. It similarly crashes at messenger callsite with a failed bitcast. Define the _imp field instead as i8 addrspace(1) (or whatever the target's program address space is). And in getMessageSendInfo(), create signatureType by specifying the program address space. Add a regression test using the AVR target. Test failed previously and passes now. Checked codegen of the test for x86_64-apple-darwin19.6.0 and saw no difference, as expected. Reviewed By: rjmccall, dylanmckay Differential Revision: https://reviews.llvm.org/D112113	2022-08-04 05:40:32 -04:00
Corentin Jabot	127bf44385	[Clang][C++20] Support capturing structured bindings in lambdas This completes the implementation of P1091R3 and P1381R1. This patch allows the capture of structured bindings both for C++20+ and C++17, with extension/compat warning. In addition, capturing an anonymous union member, a bitfield, or a structured binding thereof now has a better diagnostic. We only support structured bindings - as opposed to other kinds of structured statements/blocks. We still emit an error for those. In addition, support for structured bindings capture is entirely disabled in OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there. Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented. At the request of @shafik, I can confirm the correct behavior of lldb with this change. Fixes https://github.com/llvm/llvm-project/issues/54300 Fixes https://github.com/llvm/llvm-project/issues/52720 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D122768	2022-08-04 10:12:53 +02:00
Phoebe Wang	6f867f9102	[X86] Support ``-mindirect-branch-cs-prefix`` for call and jmp to indirect thunk This is to address feature request from https://github.com/ClangBuiltLinux/linux/issues/1665 Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D130754	2022-08-04 15:12:15 +08:00
Corentin Jabot	a274219600	Revert "[Clang][C++20] Support capturing structured bindings in lambdas" This reverts commit `44f2baa380`. Breaks self builds and seems to have conformance issues.	2022-08-03 21:00:29 +02:00
Corentin Jabot	44f2baa380	[Clang][C++20] Support capturing structured bindings in lambdas This completes the implementation of P1091R3 and P1381R1. This patch allows the capture of structured bindings both for C++20+ and C++17, with extension/compat warning. In addition, capturing an anonymous union member, a bitfield, or a structured binding thereof now has a better diagnostic. We only support structured bindings - as opposed to other kinds of structured statements/blocks. We still emit an error for those. In addition, support for structured bindings capture is entirely disabled in OpenMP mode as this needs more investigation - a specific diagnostic indicates the feature is not yet supported there. Note that the rest of P1091R3 (static/thread_local structured bindings) was already implemented. At the request of @shafik, I can confirm the correct behavior of lldb with this change. Fixes https://github.com/llvm/llvm-project/issues/54300 Fixes https://github.com/llvm/llvm-project/issues/52720 Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D122768	2022-08-03 20:00:01 +02:00
Yuanfang Chen	92c1bc6158	[CodeGen][inlineasm] assume the flag output of inline asm is boolean value The GCC inline asm documentation says that "... the general rule is that the output variable must be a scalar integer, and the value is boolean." Commit `e5c37958f9` lowers flag output of inline asm on X86 with setcc, hence it is guaranteed that the flag is a boolean value. Clang does not support ARM inline asm flag output yet, so there is nothing to worry about for ARM. See "Flag Output" section at https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html#OutputOperands Fixes https://github.com/llvm/llvm-project/issues/56568 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D129954	2022-08-02 11:49:01 -07:00
Alok Kumar Sharma	5ec6ea3dfd	[clang][OpenMP][DebugInfo] Mark OpenMP generated functions as artificial The Clang compiler generates internal functions for OpenMP. This patch marks these functions as artificial. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D111521	2022-08-02 21:24:46 +05:30
Chuanqi Xu	6d10733d44	[C++20] [Modules] Handle initializer for Header Units Previously, when we added module initializers, we forgot to handle header units. As a result, we couldn't compile a Hello World example with header units. This patch tries to fix this. Reviewed By: iains Differential Revision: https://reviews.llvm.org/D130871	2022-08-02 11:24:46 +08:00
Chuanqi Xu	39cfde2366	Revert "[C++20] [Modules] Handle initializer for Header Units" This reverts commit `db6152ad66`. This commit fails on ppc64. Since we want to backport it to 15.x, revert it now to keep the patch complete.	2022-08-02 11:09:38 +08:00
Chuanqi Xu	db6152ad66	[C++20] [Modules] Handle initializer for Header Units Previously, when we added module initializers, we forgot to handle header units. As a result, we couldn't compile a Hello World example with header units. This patch tries to fix this. Reviewed By: iains Differential Revision: https://reviews.llvm.org/D130871	2022-08-02 10:27:02 +08:00
Zakk Chen	71fd66161d	[RISCV][Clang] Support RVV policy functions. 1. Add policy functions support and tests for vadd, vmv, vfmv and all load instructions except segment load. I didn't add all combinations of policy functions in the tests because that does not seem to make sense. 2. Rename HasUnMaskedOverloaded to SupportOverloading. 3. vmv.s.x for ta policy could not have overloaded API. 4. This patch does not support all operations; follow-up patches will support the rest. [RFC] https://github.com/riscv-non-isa/rvv-intrinsic-doc/pull/137 Reviewed By: kito-cheng, fakepaper56, fakepaper56 Differential Revision: https://reviews.llvm.org/D126742	2022-08-01 17:32:08 +00:00
Gabriel Ravier	5674a3c880	Fixed a number of typos I went over the output of the following mess of a command: (ulimit -m 2000000; ulimit -v 2000000; git ls-files -z \| parallel --xargs -0 cat \| aspell list --mode=none --ignore-case \| grep -E '^[A-Za-z][a-z]*$' \| sort \| uniq -c \| sort -n \| grep -vE '.{25}' \| aspell pipe -W3 \| grep : \| cut -d' ' -f2 \| less) and proceeded to spend a few days looking at it to find probable typos and fixed a few hundred of them in all of the llvm project (note, the ones I found are not anywhere near all of them, but it seems like a good start). Differential Revision: https://reviews.llvm.org/D130827	2022-08-01 13:13:18 -04:00
Chris Bieneman	5dbb92d8cd	[HLSL] CodeGen HLSL Resource annotations HLSL Resource types need special annotations that the backend will use to build out metadata and resource annotations that are required by DirectX and Vulkan drivers in order to provide correct data bindings for shader execution. This patch adds some of the required data for unordered-access-views (UAV) resource binding into the module flags. This data will evolve over time to cover all the required use cases, but this should get things started. Depends on D130018. Differential Revision: https://reviews.llvm.org/D130019	2022-08-01 11:19:43 -05:00
Dominik Adamski	d90b7bf2c5	Add support for lowering simd if clause to LLVM IR Scope of changes: 1) Added new function to generate loop versioning 2) Added support for if clause to applySimd function 3) Added tests which confirm that lowering is successful If ifCond is specified, then the collapsed loop is duplicated and an if branch is added. The duplicated loop is executed if the simd ifCond evaluates to false. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D129368 Signed-off-by: Dominik Adamski <dominik.adamski@amd.com>	2022-08-01 04:43:32 -05:00
Chuanqi Xu	bacdf80f42	Use @llvm.threadlocal.address intrinsic to access TLS variable This is the successor to D125291. This revision uses @llvm.threadlocal.address in clang to access TLS variables. The OpenMP tests contain a lot of changes because they use utils/update_cc_test_checks.py to update their tests. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D129833	2022-08-01 11:05:00 +08:00
Jun Zhang	3da1395383	[CodeGen][NFC] Use isa_and_nonnull instead of explicit check Signed-off-by: Jun Zhang <jun@junz.org>	2022-07-31 13:03:24 +08:00
skc7	09c4121123	Revert "Revert "[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values"" This reverts commit `4e1fe96`. Reverting this commit and fix the tests that caused failures due to `a35c64c`.	2022-07-29 19:07:07 +00:00
Amy Kwan	4e1fe968c9	Revert "[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values" This reverts commit `a35c64ce23`. Reverting this commit as it causes various failures on LE and BE PPC bots.	2022-07-29 13:28:48 -05:00
skc7	a35c64ce23	[Clang][Attribute] Introduce maybe_undef attribute for function arguments which accepts undef values Add the ability to put __attribute__((maybe_undef)) on function arguments. Clang codegen introduces a freeze instruction on the argument. Differential Revision: https://reviews.llvm.org/D130224	2022-07-29 02:27:26 +00:00
Shafik Yaghmour	b364535304	[Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values DR2338 clarified that it was undefined behavior to set the value outside the range of the enumerations values for an enum without a fixed underlying type. We should diagnose this with a constant expression context. Differential Revision: https://reviews.llvm.org/D130058	2022-07-28 15:27:50 -07:00
David Blaikie	4e719e0f16	DebugInfo: Prefer vtable homing over ctor homing. Vtables will be emitted in fewer places than ctors (every ctor references the vtable, so at worst it's the same places - but at best the type has a non-inline key function and the vtable is emitted in one place) Pulling this fix out of `517bbc64db` which was reverted in `4821508d4d`	2022-07-28 00:07:35 +00:00
Shafik Yaghmour	28cd7f86ed	Revert "[Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values" This reverts commit `a3710589f2`.	2022-07-27 15:31:41 -07:00
Shafik Yaghmour	a3710589f2	[Clang] Diagnose ill-formed constant expression when setting a non fixed enum to a value outside the range of the enumeration values DR2338 clarified that it was undefined behavior to set the value outside the range of the enumerations values for an enum without a fixed underlying type. We should diagnose this with a constant expression context. Differential Revision: https://reviews.llvm.org/D130058	2022-07-27 14:59:35 -07:00
Matheus Izvekov	15f3cd6bfc	[clang] Implement ElaboratedType sugaring for types written bare Without this patch, clang will not wrap in an ElaboratedType node types written without a keyword and nested name qualifier, which goes against the intent that we should produce an AST which retains enough details to recover how things are written. The lack of this sugar is incompatible with the intent of the type printer default policy, which is to print types as written, but to fall back and print them fully qualified when they are desugared. An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still requires pointer alignment due to a pre-existing bug in the TypeLoc buffer handling. --- Troubleshooting list to deal with any breakage seen with this patch: 1) The most likely effect one would see by this patch is a change in how a type is printed. The type printer will, by design and default, print types as written. There are customization options there, but not that many, and they mainly apply to how to print a type that we somehow failed to track how it was written. This patch fixes a problem where we failed to distinguish between a type that was written without any elaborated-type qualifiers, such as 'struct'/'class' tags and name specifiers such as 'std::', and one that has been stripped of any 'metadata' that identifies such, the so-called canonical types. Example: ``` namespace foo { struct A {}; A a; }; ``` If one were to print the type of `foo::a`, prior to this patch, this would result in `foo::A`. This is how the type printer would have, by default, printed the canonical type of A as well. As soon as you add any name qualifiers to A, the type printer would suddenly start accurately printing the type as written. This patch will make it print it accurately even when written without qualifiers, so we will just print `A` for the initial example, as the user did not really write that `foo::` namespace qualifier. 
2) This patch could expose a bug in some AST matcher. Matching types is harder to get right when there is sugar involved. For example, if you want to match a type against being a pointer to some type A, then you have to account for getting a type that is sugar for a pointer to A, or being a pointer to sugar to A, or both! Usually you would get the second part wrong, and this would work for a very simple test where you don't use any name qualifiers, but you would discover it is broken when you do. The usual fix is to use the matcher which strips sugar, which is annoying to use as for example if you match an N level pointer, you have to put N+1 such matchers in there, beginning to end and between all those levels. But in a lot of cases, if the property you want to match is present in the canonical type, it's easier and faster to just match on that... This goes with what is said in 1), if you want to match against the name of a type, and you want the name string to be something stable, perhaps matching on the name of the canonical type is the better choice. 3) This patch could expose a bug in how you get the source range of some TypeLoc. For some reason, a lot of code is using getLocalSourceRange(), which only looks at the given TypeLoc node. This patch introduces a new, and more common TypeLoc node which contains no source locations on itself. This is not an innovation here, and some other, more rare TypeLoc nodes could also have this property, but if you use getLocalSourceRange on them, it's not going to return any valid locations, because it doesn't have any. The right fix here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive into the inner TypeLoc to get the source range if it doesn't find it on the top level one. You can use getLocalSourceRange if you are really into micro-optimizations and you have some outside knowledge that the TypeLocs you are dealing with will always include some source location. 
4) This patch could expose a bug somewhere in the use of the normal clang type class API, where you have some type, you want to see if that type is some particular kind, you try a `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an ElaboratedType which has a TypedefType inside of it, which is what you wanted to match. Again, like 2), this would usually have been tested poorly with some simple tests with no qualifications, and would have been broken had there been any other kind of type sugar, be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType. The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper into the type. Or use `getAsAdjusted` when dealing with TypeLocs. For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast. 5) It could be a bug in this patch perhaps. Let me know if you need any help! Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D112374	2022-07-27 11:10:54 +02:00
Argyrios Kyrtzidis	8dfaecc4c2	[CGDebugInfo] Access the current working directory from the `VFS` ...instead of calling `llvm::sys::fs::current_path()` directly. Differential Revision: https://reviews.llvm.org/D130443	2022-07-26 13:48:39 -07:00
Fangrui Song	de1b5c9145	[AArch64] Simplify BTI/PAC-RET module flags These module flags use the Min merge behavior with a default value of zero, so we don't need to emit them if zero. Reviewed By: danielkiss Differential Revision: https://reviews.llvm.org/D130145	2022-07-26 09:48:36 -07:00
Stefan Gränitz	1e30820483	[WinEH] Apply funclet operand bundles to nounwind intrinsics that lower to function calls in the course of IR transforms WinEHPrepare marks any function call from EH funclets as unreachable, if it's not a nounwind intrinsic or has no proper funclet bundle operand. This affects ARC intrinsics on Windows, because they are lowered to regular function calls in the PreISelIntrinsicLowering pass. It caused silent binary truncations and crashes during unwinding with the GNUstep ObjC runtime: https://github.com/gnustep/libobjc2/issues/222 This patch adds a new function `llvm::IntrinsicInst::mayLowerToFunctionCall()` that aims to collect all affected intrinsic IDs. * Clang CodeGen uses it to determine whether or not it must emit a funclet bundle operand. * PreISelIntrinsicLowering asserts that the function returns true for all ObjC runtime calls it lowers. * LLVM uses it to determine whether or not a funclet bundle operand must be propagated to inlined call sites. Reviewed By: theraven Differential Revision: https://reviews.llvm.org/D128190	2022-07-26 17:52:43 +02:00
Arthur Eubanks	2eade1dba4	[WPD] Use new llvm.public.type.test intrinsic for potentially publicly visible classes Turning on opaque pointers has uncovered an issue with WPD where we currently pattern match away `assume(type.test)` in WPD so that a later LTT doesn't resolve the type test to undef and introduce an `assume(false)`. The pattern matching can fail in cases where we transform two `assume(type.test)`s into `assume(phi(type.test.1, type.test.2))`. Currently we create `assume(type.test)` for all virtual calls that might be devirtualized. This is to support `-Wl,--lto-whole-program-visibility`. To prevent this, all virtual calls that may not be in the same LTO module instead use a new `llvm.public.type.test` intrinsic in place of the `llvm.type.test`. Then when we know if `-Wl,--lto-whole-program-visibility` is passed or not, we can either replace all `llvm.public.type.test` with `llvm.type.test`, or replace all `llvm.public.type.test` with `true`. This prevents WPD from trying to pattern match away `assume(type.test)` for public virtual calls when failing the pattern matching will result in miscompiles. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D128955	2022-07-26 08:01:08 -07:00
Kazu Hirata	3f3930a451	Remove redundant virtual specifiers (NFC) Identified with tidy-modernize-use-override.	2022-07-25 23:00:59 -07:00
Jun Zhang	58c9480845	[CodeGen] Consider MangleCtx when moving lazy emission states Also move MangleCtx when moving some lazy emission states in CodeGenModule. Without this patch, clang-repl hits an invalid address access when passing the `-Xcc -O2` flag. Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D130420	2022-07-26 12:34:03 +08:00
Kazu Hirata	95a932fb15	Remove redundant override specifiers (NFC) Identified with modernize-use-override.	2022-07-24 22:28:11 -07:00
Kazu Hirata	3650615fb2	[clang] Remove unused forward declarations (NFC)	2022-07-24 20:51:06 -07:00
David Chisnall	94c3b16978	Fix crash in ObjC codegen introduced with `5ab6ee7599` `5ab6ee7599` assumed that if `RValue::isScalar()` returns true then `RValue::getScalarVal` will return a valid value. This is not the case when the return value is `void` and so void message returns would crash if they hit this path. This is triggered only for cases where the nil-handling path needs to do something non-trivial (destroy arguments that should be consumed by the callee). Reviewed By: triplef Differential Revision: https://reviews.llvm.org/D123898	2022-07-24 13:59:45 +01:00
Dmitri Gribenko	aba43035bd	Use llvm::sort instead of std::sort where possible llvm::sort is beneficial even when we use the iterator-based overload, since it can optionally shuffle the elements (to detect non-determinism). However llvm::sort is not usable everywhere, for example, in compiler-rt. Reviewed By: nhaehnle Differential Revision: https://reviews.llvm.org/D130406	2022-07-23 15:19:05 +02:00
Jun Zhang	1a3a2eec71	[NFC] Move function definition to cpp file Signed-off-by: Jun Zhang <jun@junz.org>	2022-07-23 13:43:42 +08:00
Shangwu Yao	31d8dbd1e5	[CUDA/SPIR-V] Force passing aggregate type byval This patch forces aggregate types in kernel arguments to be passed by value when compiling CUDA targeting SPIR-V. The original behavior was to not pass by value when a destructor, copy constructor, or move constructor is defined by the user. This patch makes the behavior of SPIR-V generated from CUDA follow the CUDA spec (https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#global-function-argument-processing), and match the NVPTX implementation ( `41958f76d8/clang/lib/CodeGen/TargetInfo.cpp (L7241)`). Differential Revision: https://reviews.llvm.org/D130387	2022-07-22 20:30:15 +00:00
Sergei Barannikov	37502e042f	[clang][CodeGen] Only include ABIInfo.h where required (NFC) Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D130322	2022-07-22 10:45:02 -07:00
Iain Sandoe	afda39a566	re-land [C++20][Modules] Build module static initializers per P1874R1. The re-land fixes module map module dependencies seen on Greendragon, but not in the clang test suite. --- Currently we only implement this for the Itanium ABI since the correct mangling for the initializers in other ABIs is not yet known. Intended result: For a module interface [which includes partition interface and implementation units] (instead of the generic CXX initializer) we emit a module init that: - wraps the contained initializations in a control variable to ensure that the inits only happen once, even if a module is imported many times by imports of the main unit. - calls module initializers for imported modules first. Note that the order of module import is not significant, and therefore neither is the order of imported module initializers. - We then call initializers for the Global Module Fragment (if present) - We then call initializers for the current module. - We then call initializers for the Private Module Fragment (if present) For a module implementation unit, or a non-module TU that imports at least one module we emit a regular CXX init that: - Calls the initializers for any imported modules first. - Then proceeds as normal with remaining inits. For all module unit kinds we include a global constructor entry, this allows for the (in most cases unusual) possibility that a module object could be included in a final binary without a specific call to its initializer. Implementation: - We provide the module pointer in the AST Context so that CodeGen can act on it and its sub-modules. - We need to account for module build lines like this: ` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or ` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o` - in order to do this, we add to ParseAST to set the module pointer in the ASTContext, once we establish that this is a module build and we know the module pointer. 
To be able to do this, we make the query for current module public in Sema. - In CodeGen, we determine if the current build requires a CXX20-style module init and, if so, we defer any module initializers during the "Eagerly Emitted" phase. - We then walk the module initializers at the end of the TU but before emitting deferred inits (which adds any hidden and static ones, fixing https://github.com/llvm/llvm-project/issues/51873 ). - We then proceed to emit the deferred inits and continue to emit the CXX init function. Differential Revision: https://reviews.llvm.org/D126189	2022-07-22 08:38:07 +01:00
Shraiysh Vaishay	61fa7a88c7	[clang][OpenMP] Add IRBuilder support for taskgroup This patch makes use of OMPIRBuilder support for codegen of taskgroup construct in clang. Depends on D128203 Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D129992	2022-07-21 11:13:57 +05:30
Fangrui Song	23ba688f02	[X86] Use Min behavior for cf-protection-{return,branch}/ibt-seal module flags These features require that all object files are compiled with the support. When the feature is disabled for an object file, the merge behavior should treat the file as having a value of 0 (see D129911). Reviewed By: xiangzhangllvm Differential Revision: https://reviews.llvm.org/D130065	2022-07-19 21:20:02 -07:00
serge-sans-paille	f764dc99b3	[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays Some code [0] considers that trailing arrays are flexible, whatever their size. Support for this legacy code was introduced in `f8f6324983`, but it prevents evaluation of __builtin_object_size and __builtin_dynamic_object_size in some legitimate cases. Introduce -fstrict-flex-arrays=<n> to have stricter conformance when it is desirable. n = 0: current behavior, any trailing array member is a flexible array. The default. n = 1: any trailing array member of undefined, 0 or 1 size is a flexible array member n = 2: any trailing array member of undefined or 0 size is a flexible array member This takes into account two specificities of clang: array bounds as macro id disqualify FAM, as well as non standard layout. A similar patch for gcc is discussed here: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101836 [0] https://docs.freebsd.org/en/books/developers-handbook/sockets/#sockets-essential-functions	2022-07-18 12:45:52 +02:00
Fangrui Song	0d5a62faca	[sanitizer] Add "mainfile" prefix to sanitizer special case list When an issue exists in the main file (caller) instead of an included file (callee), using a `src` pattern applying to the included file may be inappropriate if it's the caller's responsibility. Add `mainfile` prefix to check the main filename. For the example below, the issue may reside in a.c (foo should not be called with a misaligned pointer or foo should switch to an unaligned load), but with `src` we can only apply to the innocent callee a.h. With this patch we can use the more appropriate `mainfile:a.c`. ``` //--- a.h // internal linkage static inline int load(int x) { return x; } //--- a.c, -fsanitize=alignment #include "a.h" int foo(void *x) { return load(x); } ``` See the updated clang/docs/SanitizerSpecialCaseList.rst for a caveat due to C++ vague linkage functions. Reviewed By: #sanitizers, kstoimenov, vitalybuka Differential Revision: https://reviews.llvm.org/D129832	2022-07-15 10:39:26 -07:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though; I will follow up with patches to remove limitations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Jonas Devlieghere	888673b6e3	Revert "[clang] Implement ElaboratedType sugaring for types written bare" This reverts commit `7c51f02eff` because it still breaks the LLDB tests. This was re-landed without addressing the issue or even agreement on how to address the issue. More details and discussion in https://reviews.llvm.org/D112374.	2022-07-14 21:17:48 -07:00
Matheus Izvekov	7c51f02eff	[clang] Implement ElaboratedType sugaring for types written bare Without this patch, clang will not wrap in an ElaboratedType node types written without a keyword and nested name qualifier, which goes against the intent that we should produce an AST which retains enough details to recover how things are written. The lack of this sugar is incompatible with the intent of the type printer default policy, which is to print types as written, but to fall back and print them fully qualified when they are desugared. An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still requires pointer alignment due to a pre-existing bug in the TypeLoc buffer handling. --- Troubleshooting list to deal with any breakage seen with this patch: 1) The most likely effect one would see by this patch is a change in how a type is printed. The type printer will, by design and default, print types as written. There are customization options there, but not that many, and they mainly apply to how to print a type that we somehow failed to track how it was written. This patch fixes a problem where we failed to distinguish between a type that was written without any elaborated-type qualifiers, such as 'struct'/'class' tags and name specifiers such as 'std::', and one that has been stripped of any 'metadata' that identifies such, the so-called canonical types. Example: ``` namespace foo { struct A {}; A a; }; ``` If one were to print the type of `foo::a`, prior to this patch, this would result in `foo::A`. This is how the type printer would have, by default, printed the canonical type of A as well. As soon as you add any name qualifiers to A, the type printer would suddenly start accurately printing the type as written. This patch will make it print it accurately even when written without qualifiers, so we will just print `A` for the initial example, as the user did not really write that `foo::` namespace qualifier. 
2) This patch could expose a bug in some AST matcher. Matching types is harder to get right when there is sugar involved. For example, if you want to match a type against being a pointer to some type A, then you have to account for getting a type that is sugar for a pointer to A, or being a pointer to sugar to A, or both! Usually you would get the second part wrong, and this would work for a very simple test where you don't use any name qualifiers, but you would discover it is broken when you do. The usual fix is to use the matcher which strips sugar, which is annoying to use: for example, if you match an N level pointer, you have to put N+1 such matchers in there, beginning to end and between all those levels. But in a lot of cases, if the property you want to match is present in the canonical type, it's easier and faster to just match on that... This goes with what is said in 1), if you want to match against the name of a type, and you want the name string to be something stable, perhaps matching on the name of the canonical type is the better choice. 3) This patch could expose a bug in how you get the source range of some TypeLoc. For some reason, a lot of code is using getLocalSourceRange(), which only looks at the given TypeLoc node. This patch introduces a new, and more common TypeLoc node which contains no source locations on itself. This is not an innovation here, and some other, more rare TypeLoc nodes could also have this property, but if you use getLocalSourceRange on them, it's not going to return any valid locations, because it doesn't have any. The right fix here is to always use getSourceRange() or getBeginLoc/getEndLoc which will dive into the inner TypeLoc to get the source range if it doesn't find it on the top level one. You can use getLocalSourceRange if you are really into micro-optimizations and you have some outside knowledge that the TypeLocs you are dealing with will always include some source location. 
4) Exposed a bug somewhere in the use of the normal clang type class API, where you have some type, you want to see if that type is some particular kind, you try a `dyn_cast` such as `dyn_cast<TypedefType>` and that fails because now you have an ElaboratedType which has a TypedefType inside of it, which is what you wanted to match. Again, like 2), this would usually have been tested poorly with some simple tests with no qualifications, and would have been broken had there been any other kind of type sugar, be it an ElaboratedType or a TemplateSpecializationType or a SubstTemplateParmType. The usual fix here is to use `getAs` instead of `dyn_cast`, which will look deeper into the type. Or use `getAsAdjusted` when dealing with TypeLocs. For some reason the API is inconsistent there and on TypeLocs getAs behaves like a dyn_cast. 5) It could be a bug in this patch perhaps. Let me know if you need any help! Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D112374	2022-07-15 04:16:55 +02:00
Ellis Hoag	af58684f27	[InstrProf] Add options to profile function groups Add two options, `-fprofile-function-groups=N` and `-fprofile-selected-function-group=i` used to partition functions into `N` groups and only instrument the functions in group `i`. Similar options were added to xray in https://reviews.llvm.org/D87953 and the goal is the same; to reduce instrumented size overhead by spreading the overhead across multiple builds. Raw profiles from different groups can be added like normal using the `llvm-profdata merge` command. Reviewed By: ianlevesque Differential Revision: https://reviews.llvm.org/D129594	2022-07-14 11:41:30 -07:00
Nick Desaulniers	140bfdca60	[clang][CodeGen] add fn_ret_thunk_extern to synthetic fns Follow up fix to commit `2240d72f15` ("[X86] initial -mfunction-return=thunk-extern support") https://reviews.llvm.org/D129572 @nathanchance reported that -mfunction-return=thunk-extern was failing to annotate the asan and tsan constructors. https://lore.kernel.org/llvm/Ys7pLq+tQk5xEa%2FB@dev-arch.thelio-3990X/ I then noticed the same occurring for gcov synthetic functions. Similar to commit `2786e67` ("[IR][sanitizer] Add module flag "frame-pointer" and set it for cc1 -mframe-pointer={non-leaf,all}") define a new module level MetaData, "fn_ret_thunk_extern", then when set adds the fn_ret_thunk_extern IR Fn Attr to synthetically created Functions. Fixes https://github.com/llvm/llvm-project/issues/56514 Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129709	2022-07-14 11:25:24 -07:00
Kazu Hirata	cb2c8f694d	[clang] Use value instead of getValue (NFC)	2022-07-13 23:39:33 -07:00
Joseph Huber	b370be37cc	[CUDA] Allow the new driver to compile CUDA in non-RDC mode The new driver primarily allows us to support RDC-mode compilations with proper linking. This is not needed for non-RDC mode compilation, but we still would like the new driver to be able to handle this mode so we can transition away from the old driver in the future. This patch adds the necessary code to support creating a fatbinary for CUDA code generation as well as removing old assumptions and errors about RDC-mode with the new driver. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D129655	2022-07-13 21:49:15 -04:00
Jonas Devlieghere	3968936b92	Revert "[clang] Implement ElaboratedType sugaring for types written bare" This reverts commit `bdc6974f92` because it breaks all the LLDB tests that import the std module. import-std-module/array.TestArrayFromStdModule.py import-std-module/deque-basic.TestDequeFromStdModule.py import-std-module/deque-dbg-info-content.TestDbgInfoContentDequeFromStdModule.py import-std-module/forward_list.TestForwardListFromStdModule.py import-std-module/forward_list-dbg-info-content.TestDbgInfoContentForwardListFromStdModule.py import-std-module/list.TestListFromStdModule.py import-std-module/list-dbg-info-content.TestDbgInfoContentListFromStdModule.py import-std-module/queue.TestQueueFromStdModule.py import-std-module/stack.TestStackFromStdModule.py import-std-module/vector.TestVectorFromStdModule.py import-std-module/vector-bool.TestVectorBoolFromStdModule.py import-std-module/vector-dbg-info-content.TestDbgInfoContentVectorFromStdModule.py import-std-module/vector-of-vectors.TestVectorOfVectorsFromStdModule.py https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45301/	2022-07-13 09:20:30 -07:00
Mitch Phillips	7045519359	Add missing sanitizer metadata plumbing from CFE. clang misses attaching sanitizer metadata for external globals. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D129492	2022-07-13 08:54:41 -07:00
Mitch Phillips	90e5a8ac47	Remove 'no_sanitize_memtag'. Add 'sanitize_memtag'. For MTE globals, we should have clang emit the attribute for all GV's that it creates, and then use that in the upcoming AArch64 global tagging IR pass. We need a positive attribute for this sanitizer (rather than implicit sanitization of all globals) because it needs to interact with other parts of LLVM, including: 1. Suppressing certain global optimisations (like merging), 2. Emitting extra directives by the ASM writer, and 3. Putting extra information in the symbol table entries. While this does technically make the LLVM IR / bitcode format non-backwards-compatible, nobody should have used this attribute yet, because it's a no-op. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D128950	2022-07-13 08:54:41 -07:00
Jun Zhang	8082a00286	[CodeGen] Keep track of decls that were deferred and have been emitted. This patch adds a new field called EmittedDeferredDecls in CodeGenModule that keeps track of decls that were deferred and have been emitted. The intention of this patch is to solve an issue in incremental C++: we lose info about decls that are lazily emitted when we undo their usage. See example below: clang-repl> inline int foo() { return 42;} clang-repl> int bar = foo(); clang-repl> %undo clang-repl> int baz = foo(); JIT session error: Symbols not found: [ _Z3foov ] error: Failed to materialize symbols: { (main, { baz, $.incr_module_2.inits.0, orc_init_func.incr_module_2 }) } Signed-off-by: Jun Zhang <jun@junz.org> Differential Revision: https://reviews.llvm.org/D128782	2022-07-13 20:00:59 +08:00
Matheus Izvekov	bdc6974f92	[clang] Implement ElaboratedType sugaring for types written bare Without this patch, clang will not wrap in an ElaboratedType node types written without a keyword and nested name qualifier, which goes against the intent that we should produce an AST which retains enough details to recover how things are written. The lack of this sugar is incompatible with the intent of the type printer default policy, which is to print types as written, but to fall back and print them fully qualified when they are desugared. An ElaboratedTypeLoc without keyword / NNS uses no storage by itself, but still requires pointer alignment due to pre-existing bug in the TypeLoc buffer handling. Signed-off-by: Matheus Izvekov <mizvekov@gmail.com> Differential Revision: https://reviews.llvm.org/D112374	2022-07-13 02:10:09 +02:00
Nick Desaulniers	2240d72f15	[X86] initial -mfunction-return=thunk-extern support Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute Where the supported <value>s are: * keep (disable) * thunk-extern (enable) thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern. The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk! This new MachineFunctionPass is very similar to "x86-lvi-ret". The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere. Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk"). Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED. * CVE-2022-23816 * CVE-2022-28693 * CVE-2022-29901 See also: * "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." * AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion * TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12 * Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE. 
This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release. Many thanks to the folks who provided discreet review off list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information. Link: https://www.youtube.com/watch?v=IF6HbCKQHK8 Link: https://github.com/llvm/llvm-project/issues/54404 Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1 Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60 Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html Reviewed By: aaron.ballman, craig.topper Differential Revision: https://reviews.llvm.org/D129572	2022-07-12 09:17:54 -07:00
Xiang1 Zhang	a45dd3d814	[X86] Support -mstack-protector-guard-symbol Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D129346	2022-07-12 10:17:00 +08:00
Xiang1 Zhang	643786213b	Revert "[X86] Support -mstack-protector-guard-symbol" This reverts commit `efbaad1c4a` because the review info was missing.	2022-07-12 10:14:32 +08:00
Xiang1 Zhang	efbaad1c4a	[X86] Support -mstack-protector-guard-symbol	2022-07-12 10:13:48 +08:00
Joseph Huber	e88d53d25f	[HIP] Generate offloading entries for HIP with the new driver. This patch adds the small change required to output offloading entries for HIP instead of CUDA. These should be placed in different sections because they need to be distinct to the offloading toolchain; otherwise we'd have HIP trying to register CUDA kernels or vice-versa. This patch will precede support for HIP in the linker wrapper. Reviewed By: yaxunl, tra Differential Revision: https://reviews.llvm.org/D128850	2022-07-11 15:49:21 -04:00
Mitch Phillips	f18de7619e	Update DynInit generation for ASan globals. Address a follow-up TODO for Sanitizer Metadata. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D128672	2022-07-11 12:23:37 -07:00
Iain Sandoe	b19d3ee712	Revert "[C++20][Modules] Build module static initializers per P1874R1." This reverts commit `ac507102d2`. Reverting while we figure out why one of the green dragon lldb tests fails.	2022-07-11 19:50:31 +01:00
Prabhdeep Singh Soni	ac892c70a4	[OMPIRBuilder] Add support for simdlen clause This patch adds OMPIRBuilder support for the simdlen clause for the simd directive. It uses the simdlen support in OpenMPIRBuilder when it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by generating the loop.vectorize.width metadata. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D129149	2022-07-11 13:29:06 -04:00
Iain Sandoe	ac507102d2	[C++20][Modules] Build module static initializers per P1874R1. Currently we only implement this for the Itanium ABI since the correct mangling for the initializers in other ABIs is not yet known. Intended result: For a module interface [which includes partition interface and implementation units] (instead of the generic CXX initializer) we emit a module init that: - wraps the contained initializations in a control variable to ensure that the inits only happen once, even if a module is imported many times by imports of the main unit. - calls module initializers for imported modules first. Note that the order of module import is not significant, and therefore neither is the order of imported module initializers. - We then call initializers for the Global Module Fragment (if present) - We then call initializers for the current module. - We then call initializers for the Private Module Fragment (if present) For a module implementation unit, or a non-module TU that imports at least one module we emit a regular CXX init that: - Calls the initializers for any imported modules first. - Then proceeds as normal with remaining inits. For all module unit kinds we include a global constructor entry, this allows for the (in most cases unusual) possibility that a module object could be included in a final binary without a specific call to its initializer. Implementation: - We provide the module pointer in the AST Context so that CodeGen can act on it and its sub-modules. - We need to account for module build lines like this: ` clang -cc1 -std=c++20 Foo.pcm -emit-obj -o Foo.o` or ` clang -cc1 -std=c++20 -xc++-module Foo.cpp -emit-obj -o Foo.o` - in order to do this, we add to ParseAST to set the module pointer in the ASTContext, once we establish that this is a module build and we know the module pointer. To be able to do this, we make the query for current module public in Sema. 
- In CodeGen, we determine if the current build requires a CXX20-style module init and, if so, we defer any module initializers during the "Eagerly Emitted" phase. - We then walk the module initializers at the end of the TU but before emitting deferred inits (which adds any hidden and static ones, fixing https://github.com/llvm/llvm-project/issues/51873 ). - We then proceed to emit the deferred inits and continue to emit the CXX init function. Differential Revision: https://reviews.llvm.org/D126189	2022-07-09 09:09:09 +01:00
Joseph Huber	5300263c70	[OpenMP] Add loop tripcount argument to kernel launch and remove push function Previously we added the `push_target_tripcount` function to send the loop tripcount to the device runtime so we knew how to configure the teams / threads to execute the loop for a teams distribute construct. This was implemented as a separate function mostly to avoid changing the interface for backwards compatibility. Now that we've changed it anyway and the new interface can take an arbitrary number of arguments via the struct without changing the ABI, we can move this to the new interface. This will simplify the runtime by removing unnecessary state between calls. Depends on D128550 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D128816	2022-07-08 14:44:16 -04:00
Joseph Huber	1fff116645	[OpenMP] Change OpenMP code generation for target region entries This patch changes the code we generate to enter a target region on the device. This is in-line with the new definition in the runtime that was added previously. Additionally we implement this in the OpenMPIRBuilder so that this code can be shared with Flang in the future. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D128550	2022-07-08 14:44:11 -04:00
Shilei Tian	83837a6198	[Clang][OpenMP] Enable floating-point operation for `atomic compare` series D127041 introduced support for `fmax` and `fmin` such that we can also represent `atomic compare` and `atomic compare capture` with the `atomicrmw` instruction. This patch simply lifts the limitation we set before. Depends on D127041. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D127042	2022-07-06 13:05:11 -04:00
Nikola Tesic	b5b6d3a41b	[Debugify] Port verify-debuginfo-preserve to NewPM Debugify in OriginalDebugInfo mode, introduced with D82545, runs only with legacy PassManager. This patch enables this utility for the NewPM. Differential Revision: https://reviews.llvm.org/D115351	2022-07-06 17:07:20 +02:00
Alexey Bader	923b56e7ca	[NFC] Add a TODO comment to apply nounwind attribute in all GPU modes.	2022-07-06 06:20:09 -07:00
Bruno De Fraine	5b3247bf9f	[tbaa] Handle base classes in struct tbaa This is a fix for the miscompilation reported in https://github.com/llvm/llvm-project/issues/55384 Not adding a new test case since existing test cases already cover base classes (including new-struct-path tbaa). Reviewed By: jeroen.dobbelaere Differential Revision: https://reviews.llvm.org/D126956	2022-07-06 14:37:59 +02:00
Serge Pavlov	f7819ce166	[FPEnv] Allow CompoundStmt to keep FP options This is a recommit of `b822efc740`, reverted in `dc34d8df4c`. The commit caused failures because the test ast-print-fp-pragmas.c did not specify a particular target, and it failed on targets which do not support constrained intrinsics. The original commit message is below. AST does not have special nodes for pragmas. Instead a pragma modifies some state variables of Sema, which in turn results in modified attributes of AST nodes. This technique applies to floating point operations as well. Every AST node that can depend on FP options keeps the current set of them. This technique works well for options like exception behavior or fast math options. They represent instructions to the compiler how to modify code generation for the affected nodes. However, treatment of FP control modes has problems with this technique. Modifying an FP control mode (like rounding direction) usually requires operations on hardware, like writing to control registers. It must be done prior to the first operation that depends on the control mode. In particular, such operations are required for implementation of `pragma STDC FENV_ROUND`: the compiler should set up the necessary rounding direction at the beginning of the compound statement where the pragma occurs. As there is no representation for pragmas in AST, the code generation becomes a complicated task in this case. To solve this issue FP options are kept inside CompoundStmt. Unlike FP options in expressions, these do not affect any operation on FP values, but only inform the codegen about the FP options that act in the body of the statement. As all pragmas that modify the FP environment may occur only at the start of a compound statement or at global level, such a solution works for all relevant pragmas. The options are kept as a difference from the options in the enclosing compound statement or default options; this helps codegen to set only changed control modes. 
Differential Revision: https://reviews.llvm.org/D123952	2022-07-03 17:06:26 +07:00
Fazlay Rabbi	38bcd483dd	[OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd' construct This patch gives basic parsing and semantic support for "parallel masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.10) Differential Revision: https://reviews.llvm.org/D128946	2022-07-01 08:57:15 -07:00
Serge Pavlov	dc34d8df4c	Revert "[FPEnv] Allow CompoundStmt to keep FP options" On some buildbots test `ast-print-fp-pragmas.c` fails, need to investigate it. This reverts commit `0401fd12d4`. This reverts commit `b822efc740`.	2022-07-01 15:42:39 +07:00
Serge Pavlov	b822efc740	[FPEnv] Allow CompoundStmt to keep FP options AST does not have special nodes for pragmas. Instead a pragma modifies some state variables of Sema, which in turn results in modified attributes of AST nodes. This technique applies to floating point operations as well. Every AST node that can depend on FP options keeps the current set of them. This technique works well for options like exception behavior or fast math options. They represent instructions to the compiler how to modify code generation for the affected nodes. However, treatment of FP control modes has problems with this technique. Modifying an FP control mode (like rounding direction) usually requires operations on hardware, like writing to control registers. It must be done prior to the first operation that depends on the control mode. In particular, such operations are required for implementation of `pragma STDC FENV_ROUND`: the compiler should set up the necessary rounding direction at the beginning of the compound statement where the pragma occurs. As there is no representation for pragmas in AST, the code generation becomes a complicated task in this case. To solve this issue FP options are kept inside CompoundStmt. Unlike FP options in expressions, these do not affect any operation on FP values, but only inform the codegen about the FP options that act in the body of the statement. As all pragmas that modify the FP environment may occur only at the start of a compound statement or at global level, such a solution works for all relevant pragmas. The options are kept as a difference from the options in the enclosing compound statement or default options; this helps codegen to set only changed control modes. Differential Revision: https://reviews.llvm.org/D123952	2022-07-01 14:32:33 +07:00
Nikita Popov	9ac386495d	[ConstExpr] Don't create insertvalue expressions In preparation for the removal in D128719, this stops creating insertvalue constant expressions (well, unless they are directly used in LLVM IR). Differential Revision: https://reviews.llvm.org/D128792	2022-07-01 09:23:28 +02:00
Piotr Sobczak	4a78225212	[AMDGPU] Add WMMA clang builtins Add WMMA clang builtins and tests. Extra changes in code are needed to handle function overloads. WavefrontSize 32: __builtin_amdgcn_wmma_f32_16x16x16_f16_w32 __builtin_amdgcn_wmma_f32_16x16x16_bf16_w32 __builtin_amdgcn_wmma_f16_16x16x16_f16_w32 __builtin_amdgcn_wmma_bf16_16x16x16_bf16_w32 __builtin_amdgcn_wmma_i32_16x16x16_iu8_w32 __builtin_amdgcn_wmma_i32_16x16x16_iu4_w32 WavefrontSize 64: __builtin_amdgcn_wmma_f32_16x16x16_f16_w64 __builtin_amdgcn_wmma_f32_16x16x16_bf16_w64 __builtin_amdgcn_wmma_f16_16x16x16_f16_w64 __builtin_amdgcn_wmma_bf16_16x16x16_bf16_w64 __builtin_amdgcn_wmma_i32_16x16x16_iu8_w64 __builtin_amdgcn_wmma_i32_16x16x16_iu4_w64 Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D128952	2022-07-01 08:55:25 +02:00
Fazlay Rabbi	d64ba896d3	[OpenMP] Initial parsing and sema support for 'parallel masked taskloop' construct This patch gives basic parsing and semantic support for "parallel masked taskloop" construct introduced in OpenMP 5.1 (section 2.16.9) Differential Revision: https://reviews.llvm.org/D128834	2022-06-30 11:44:17 -07:00
Richard Smith	dcea10c3c6	Fix miscompile with [[no_unique_address]] struct fields. If a zero-sized field has a non-trivial initializer, it should prevent the overall struct initialization from being folded to a constant during IR generation. Don't just ignore zero-sized fields entirely in IR constant emission.	2022-06-29 13:08:40 -07:00
Fazlay Rabbi	73e5d7bdff	[OpenMP] Initial parsing and sema support for 'masked taskloop simd' construct This patch gives basic parsing and semantic support for "masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.8) Differential Revision: https://reviews.llvm.org/D128693	2022-06-28 15:27:49 -07:00
Nikita Popov	5548e807b5	[IR] Remove support for extractvalue constant expression This removes the extractvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. extractvalue is already not supported in bitcode, so we do not need to worry about bitcode auto-upgrade. Uses of ConstantExpr::getExtractValue() should be replaced with IRBuilder::CreateExtractValue() (if the fact that the result is constant is not important) or ConstantFoldExtractValueInstruction() (if it is). Though for this particular case, it is also possible and usually preferable to use getAggregateElement() instead. The C API function LLVMConstExtractValue() is removed, as the underlying constant expression no longer exists. Instead, LLVMBuildExtractValue() should be used (which will constant fold or create an instruction). Depending on the use-case, LLVMGetAggregateElement() may also be used instead. Differential Revision: https://reviews.llvm.org/D125795	2022-06-28 10:40:17 +02:00
Mitch Phillips	dacfa24f75	Delete 'llvm.asan.globals' for global metadata. Now that we have the sanitizer metadata that is actually on the global variable, and now that we use debuginfo in order to do symbolization of globals, we can delete the 'llvm.asan.globals' IR synthesis. This patch deletes the 'location' part of the __asan_global that's embedded in the binary as well, because it's unnecessary. This saves about ~1.7% of the optimised non-debug with-asserts clang binary. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D127911	2022-06-27 14:40:40 -07:00
Vitaly Buka	cdfa15da94	Revert "[clang] Introduce -fstrict-flex-arrays=<n> for stricter handling of flexible arrays" This reverts D126864 and related fixes. This reverts commit `572b08790a`. This reverts commit `886715af96`.	2022-06-27 14:03:09 -07:00
Yuanfang Chen	6678f8e505	[ubsan] Using metadata instead of prologue data for function sanitizer Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated, the self (global value) references inside `Prologue Data` still point to the original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`. This patch detaches the information from function `Prologue Data` and attaches it to a function metadata node. This and D116130 fix https://github.com/llvm/llvm-project/issues/49689. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D115844	2022-06-27 12:09:13 -07:00
Ritanya B Bharadwaj	8322fe200d	Adding support for target in_reduction Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6. So, this ``` for (int i=0; i<N; i++) { res = res+i; } ``` will become ``` #pragma omp task in_reduction(+:res) if(0) #pragma omp target map(res) for (int i=0; i<N; i++) { res = res+i; } ``` Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D125669	2022-06-27 10:36:46 -05:00
Florian Hahn	ca47ab128b	[Clang] Remove unused function declaration after `77475ffd22`.	2022-06-27 14:17:53 +01:00
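
The `-fstrict-flex-arrays=<n>` commit listed above (`f764dc99b3`) distinguishes trailing arrays by their declared size. As a rough illustration only, the sketch below declares the two portable shapes and uses the standard C99 form, which the commit message treats as a flexible array member at every level. The struct and function names (`legacy_buf`, `c99_buf`, `c99_buf_new`, `c99_buf_sum`) are hypothetical and do not come from the LLVM tree.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Trailing array of size 1: per the commit message, this counts as a
 * flexible array member for n = 0 and n = 1, but not for n = 2. */
struct legacy_buf {
    size_t len;
    int data[1];
};

/* Trailing array of unspecified size: the standard C99 form, which the
 * commit message treats as a flexible array member for every value of n. */
struct c99_buf {
    size_t len;
    int data[];
};

/* Hypothetical helper: allocate a c99_buf holding len ints, filled 0..len-1. */
struct c99_buf *c99_buf_new(size_t len) {
    struct c99_buf *b = malloc(sizeof *b + len * sizeof b->data[0]);
    if (b == NULL)
        return NULL;
    b->len = len;
    for (size_t i = 0; i < len; i++)
        b->data[i] = (int)i;
    return b;
}

/* Hypothetical helper: sum the elements stored in the trailing array. */
long c99_buf_sum(const struct c99_buf *b) {
    assert(b != NULL);
    long total = 0;
    for (size_t i = 0; i < b->len; i++)
        total += b->data[i];
    return total;
}
```

Code that indexes past element 0 of a `legacy_buf`-style struct relies on the legacy interpretation, which is why, per the commit message, the default (n = 0) keeps treating any trailing array as flexible.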


15545 Commits
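
The function-group options from commit `af58684f27` in the list above (`-fprofile-function-groups=N`, `-fprofile-selected-function-group=i`) partition functions into `N` groups and instrument only group `i`. The commit message does not say how a function is assigned to a group, so the sketch below assumes a simple name hash taken modulo `N`; the FNV-1a hash and the names `function_group` and `should_instrument` are illustrative assumptions, not clang's implementation.

```c
#include <stdint.h>

/* ASSUMPTION: group assignment by a name hash modulo the group count.
 * FNV-1a is used here only because it is short and deterministic. */
static uint32_t fnv1a(const char *s) {
    uint32_t h = 2166136261u;          /* FNV offset basis */
    for (; *s != '\0'; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;                /* FNV prime */
    }
    return h;
}

/* Hypothetical: map a function name to one of num_groups groups. */
unsigned function_group(const char *name, unsigned num_groups) {
    return (unsigned)(fnv1a(name) % num_groups);
}

/* Hypothetical: instrument only the functions that fall in the selected group. */
int should_instrument(const char *name, unsigned num_groups, unsigned selected) {
    return function_group(name, num_groups) == selected;
}
```

Under this scheme each function lands in exactly one group, so `N` builds (one per selected group) together cover every function once, which matches the stated goal of spreading instrumentation overhead across multiple builds.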