llvm-project

Commit Graph

Author	SHA1	Message	Date
Vitaly Buka	d7df3f0a4b	[NFC] Exctract getNoSanitizeMask lambda	2022-06-07 14:08:43 -07:00
Vitaly Buka	f32ad5703e	[NFC] Move part of SanitizerMetadata into private method	2022-06-07 14:08:43 -07:00
Mitch Phillips	5d9de5f446	[NFC] Clang-format parts of D126929 and D126100	2022-06-07 14:08:43 -07:00
Mitch Phillips	f49a5844b6	[NFC][CodeGen] Rename method Extracted from D84652.	2022-06-07 14:08:42 -07:00
Erich Keane	5c3bde9625	[CodeGen] Fix an issue when the 'extern C' replacement names broke Originally broken by me in D122608, this is a regression where we attempt to replace an extern-C thing with 'itself'. The problem is that we end up deleting it, causing the value to fail when it gets put into llvm.used.	2022-06-07 11:30:59 -07:00
Akira Hatanaka	3ba6ace3cc	[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in DebugTypeVisitor This recommits `d1346e2`. I've added a line to the test case to enable it only on assert builds. Differential Revision: https://reviews.llvm.org/D125839	2022-06-06 19:12:26 -07:00
Akira Hatanaka	834e5d12c7	Revert "[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in" This reverts commit `d1346e2ee2`. The commit broke a few bots.	2022-06-06 18:48:24 -07:00
Sam Clegg	47039a1a4b	[WebAssembly] Remove restriction on main name mangling Summary: Emscripten now handles/supports this new mode. Subscribers: dschuff, jgravelle-google, aheejin, sunfish, cfe-commits Tags: #clang Differential Revision: https://reviews.llvm.org/D75277	2022-06-06 14:04:27 -07:00
Akira Hatanaka	d1346e2ee2	[gmodules] Skip CXXDeductionGuideDecls when visiting FunctionDecls in DebugTypeVisitor Differential Revision: https://reviews.llvm.org/D125839	2022-06-06 12:51:36 -07:00
Fangrui Song	d86a206f06	Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options	2022-06-05 00:31:44 -07:00
Anders Waldenborg	dd2362a8ba	[clang] Allow const variables with weak attribute to be overridden A variable with `weak` attribute signifies that it can be replaced with a "strong" symbol link time. Therefore it must not emitted with "weak_odr" linkage, as that allows the backend to use its value in optimizations. The frontend already considers weak const variables as non-constant (note_constexpr_var_init_weak diagnostic) so this change makes frontend and backend consistent. This commit reverses the `f49573d1` weak globals that are const should get weak_odr linkage. commit from 2009-08-05 which introduced this behavior. Unfortunately that commit doesn't provide any details on why the change was made. This was discussed in https://discourse.llvm.org/t/weak-attribute-semantics-on-const-variables/62311 Differential Revision: https://reviews.llvm.org/D126324	2022-06-03 23:44:15 +02:00
Guillaume Chatelet	c698189696	[NFC] Format CGBuilder.h	2022-06-03 07:54:01 +00:00
Shilei Tian	c4a90db720	[Clang][OpenMP] Add the codegen support for `atomic compare capture` This patch adds the codegen support for `atomic compare capture` in clang. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D120290	2022-06-02 21:38:21 -04:00
Shilei Tian	3a96256b7e	[Clang][OpenMP] Avoid using `IgnoreImpCasts` if possible This patch removes all `IgnoreImpCasts` in Sema, and only uses it if necessary. If the expression is not of the same type as the pointer value, a cast is inserted. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D126602	2022-06-02 17:45:02 -04:00
Paul Robinson	dc5175adef	[PS5] Make passing unions in registers match PS4 ABI	2022-06-02 11:00:54 -07:00
Paul Robinson	cc756f91c3	[PS5] Classify __m64 as integer, matching PS4 ABI	2022-06-02 11:00:53 -07:00
Hans Wennborg	d42fe9aa84	Revert "[clang][AIX] add option mdefault-visibility-export-mapping" This caused assertions, see comment on the code review: llvm/clang/lib/AST/Decl.cpp:1510: clang::LinkageInfo clang::LinkageComputer::getLVForDecl(const clang::NamedDecl , clang::LVComputationKind): Assertion `D->getCachedLinkage() == LV.getLinkage()' failed. > The option mdefault-visibility-export-mapping is created to allow > mapping default visibility to an explicit shared library export > (e.g. dllexport). Exactly how and if this is manifested is target > dependent (since it depends on how they map dllexport in the IR). > > Three values are provided for the option: > > none: the default and behavior without the option, no additional export linkage information is created. > * explicit: add the export for entities with explict default visibility from the source, including RTTI > * all: add the export for all entities with default visibility > > This option is useful for targets which do not export symbols as part of > their usual default linkage behaviour (e.g. AIX), such targets > traditionally specified such information in external files (e.g. export > lists), but this mapping allows them to use the visibility information > typically used for this purpose on other (e.g. ELF) platforms. > > Reviewed By: MaskRay > > Differential Revision: https://reviews.llvm.org/D126340 This reverts commit `8c8a2679a2`.	2022-06-02 15:09:39 +02:00
Martin Storsjö	f730749e85	[clang] [ARM] Add __builtin_sponentry like on aarch64 This is used for calling the SEH aware setjmp on MinGW. Differential Revision: https://reviews.llvm.org/D126764	2022-06-02 12:29:59 +03:00
Shilei Tian	eb673be5ac	[OMPIRBuilder] Add the support for compare capture This patch adds the support for `compare capture` in `OMPIRBuilder`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120007	2022-06-01 19:53:43 -04:00
Joseph Huber	afd2f7e991	[Binary] Promote OffloadBinary to inherit from Binary We use the `OffloadBinary` to create binary images of offloading files and their corresonding metadata. This patch changes this to inherit from the base `Binary` class. This allows us to create and insepect these more generically. This patch includes all the necessary glue to implement this as a new binary format, along with added the magic bytes we use to distinguish the offloading binary to the `file_magic` implementation. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D126812	2022-06-01 18:40:57 -04:00
David Tenty	8c8a2679a2	[clang][AIX] add option mdefault-visibility-export-mapping The option mdefault-visibility-export-mapping is created to allow mapping default visibility to an explicit shared library export (e.g. dllexport). Exactly how and if this is manifested is target dependent (since it depends on how they map dllexport in the IR). Three values are provided for the option: * none: the default and behavior without the option, no additional export linkage information is created. * explicit: add the export for entities with explict default visibility from the source, including RTTI * all: add the export for all entities with default visibility This option is useful for targets which do not export symbols as part of their usual default linkage behaviour (e.g. AIX), such targets traditionally specified such information in external files (e.g. export lists), but this mapping allows them to use the visibility information typically used for this purpose on other (e.g. ELF) platforms. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D126340	2022-06-01 18:07:17 -04:00
Nikita Popov	858e6273d9	[Clang] Always set opaque pointers mode Always set the opaque pointers mode, to make sure that -no-opaque-pointers continues working when the default on the LLVM side is flipped.	2022-05-31 15:43:05 +02:00
Zi Xuan Wu (Zeson)	563cc3fda9	[Clang][CSKY] Add support about CSKYABIInfo According to the CSKY ABIv2 document, https://github.com/c-sky/csky-doc/blob/master/C-SKY_V2_CPU_Applications_Binary_Interface_Standards_Manual.pdf construct the ABIInfo to handle argument passing and return of clang data type. It also includes how to emit and expand VAArg intrinsic. Differential Revision: https://reviews.llvm.org/D126451	2022-05-31 10:53:30 +08:00
Joel E. Denny	d2e3cb7374	[OpenMP][Clang] Fix atomic compare for signed vs. unsigned Without this patch, arguments to the `llvm::OpenMPIRBuilder::AtomicOpValue` initializer are reversed. Reviewed By: ABataev, tianshilei1992 Differential Revision: https://reviews.llvm.org/D126619	2022-05-30 11:02:20 -04:00
Enna1	52992f136b	Add !nosanitize to FixedMetadataKinds This patch adds !nosanitize metadata to FixedMetadataKinds.def, !nosanitize indicates that LLVM should not insert any sanitizer instrumentation. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D126294	2022-05-27 09:46:13 +08:00
Bruno Cardoso Lopes	ce54b22657	[Clang][CoverageMapping] Fix switch counter codegen compile time explosion C++ generated code with huge amount of switch cases chokes badly while emitting coverage mapping, in our specific testcase (~72k cases), it won't stop after hours. After this change, the frontend job now finishes in 4.5s and shrinks down `@__covrec_` by 288k when compared to disabling simplification altogether. There's probably no good way to create a testcase for this, but it's easy to reproduce, just add thousands of cases in the below switch, and build with `-fprofile-instr-generate -fcoverage-mapping`. ``` enum type : int { FEATURE_INVALID = 0, FEATURE_A = 1, ... }; const char *to_string(type e) { switch (e) { case type::FEATURE_INVALID: return "FEATURE_INVALID"; case type::FEATURE_A: return "FEATURE_A";} ... } ``` Differential Revision: https://reviews.llvm.org/D126345	2022-05-26 11:05:15 -07:00
Mike Rice	0a5cfbf7b2	[OpenMP] Use the align clause value from 'omp allocate' for globals Refactor the code that handles the align clause of 'omp allocate' so it can be used with globals as well as local variables. Differential Revision: https://reviews.llvm.org/D126426	2022-05-26 09:51:48 -07:00
Joseph Huber	1bae02b773	[Cuda] Use fallback method to mangle externalized decls if no CUID given CUDA requires that static variables be visible to the host when offloading. However, The standard semantics of a stiatc variable dictate that it should not be visible outside of the current file. In order to access it from the host we need to perform "externalization" on the static variable on the device. This requires generating a semi-unique name that can be affixed to the variable as to not cause linker errors. This is currently done using the CUID functionality, an MD5 hash value set up by the clang driver. This allows us to achieve is mostly unique ID that is unique even between multiple compilations of the same file. However, this is not always availible. Instead, this patch uses the unique ID from the file to generate a unique symbol name. This will create a unique name that is consistent between the host and device side compilations without requiring the CUID to be entered by the driver. The one downside to this is that we are no longer stable under multiple compilations of the same file. However, this is a very niche use-case and is not supported by Nvidia's CUDA compiler so it likely to be good enough. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D125904	2022-05-26 09:18:22 -04:00
Aaron Ballman	9368bf9023	Removing this as part of the revert done in `69da3b6aea` This appears to have been added in a follow-up commit that I missed.	2022-05-25 13:45:17 -04:00
Adrian Kuegel	9698a445c6	Fix warning by handling OMPC_fail in switch statement.	2022-05-25 09:33:41 +02:00
Mike Rice	239094cdee	[OpenMP] Add codegen for 'omp_all_memory' reserved locator. This creates an entry with address=nullptr and flag=0x80. When an 'omp_all_memory' entry is specified any other 'out' or 'inout' entries are not needed and are not passed to the runtime. Differential Revision: https://reviews.llvm.org/D126321	2022-05-24 15:26:23 -07:00
Mike Rice	9ba937112f	[OpenMP] Add parsing/sema support for omp_all_memory reserved locator Adds support for the reserved locator 'omp_all_memory' for use in depend clauses with 'out' or 'inout' dependence-types. Differential Revision: https://reviews.llvm.org/D125828	2022-05-24 10:28:59 -07:00
Stephen Long	4f1e64b54f	[MSVC, ARM64] Add __readx18 intrinsics https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170 unsigned char __readx18byte(unsigned long) unsigned short __readx18word(unsigned long) unsigned long __readx18dword(unsigned long) unsigned __int64 __readx18qword(unsigned long) Given the lack of documentation of the intrinsics, we chose to align the offset with just `CharUnits::One()` when calling `IRBuilderBase::CreateAlignedLoad()` Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D126024	2022-05-23 10:59:12 -07:00
Stephen Long	3e0be5610f	[MSVC, ARM64] Add __writex18 intrinsics https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170 void __writex18byte(unsigned long, unsigned char) void __writex18word(unsigned long, unsigned short) void __writex18dword(unsigned long, unsigned long) void __writex18qword(unsigned long, unsigned __int64) Given the lack of documentation of the intrinsics, we chose to align the offset with just `CharUnits::One()` when calling `IRBuilderBase::CreateAlignedStore()`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D126023	2022-05-23 07:01:11 -07:00
Stephen Long	ae80024fbe	[clang] Honor __attribute__((no_builtin("foo"))) on functions Support for `__attribute__((no_builtin("foo")))` was added in https://reviews.llvm.org/D68028, but builtins were still being used even when the attribute was placed on a function. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D124701	2022-05-20 06:41:47 -07:00
Jon Chesterfield	83c431fb9e	[amdgpu] Add amdgpu_kernel calling conv attribute to clang Allows emitting define amdgpu_kernel void @func() IR from C or C++. This replaces the current workflow which is to write a stub in opencl that calls an external C function implemented in C++ combined through llvm-link. Calling the resulting function still requires a manual implementation of the ABI from the host side. The primary application is for more rapid debugging of the amdgpu backend by permuting a C or C++ test file instead of manually updating an IR file. Implementation closely follows D54425. Non-amd reviewers from there. Reviewed By: yaxunl Differential Revision: https://reviews.llvm.org/D125970	2022-05-20 08:50:37 +01:00
Yaxun (Sam) Liu	cefe472c51	[clang] Fix __has_builtin Fix __has_builtin to return 1 only if the requested target features of a builtin are enabled by refactoring the code for checking required target features of a builtin and use it in evaluation of __has_builtin. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D125829	2022-05-19 11:34:42 -04:00
Jay Foad	6bec3e9303	[APInt] Remove all uses of zextOrSelf, sextOrSelf and truncOrSelf Most clients only used these methods because they wanted to be able to extend or truncate to the same bit width (which is a no-op). Now that the standard zext, sext and trunc allow this, there is no reason to use the OrSelf versions. The OrSelf versions additionally have the strange behaviour of allowing extending to a smaller width, or truncating to a larger width, which are also treated as no-ops. A small amount of client code relied on this (ConstantRange::castOp and MicrosoftCXXNameMangler::mangleNumber) and needed rewriting. Differential Revision: https://reviews.llvm.org/D125557	2022-05-19 11:23:13 +01:00
Mitch Phillips	7aa1fa0a0a	Reland "[dwarf] Emit a DIGlobalVariable for constant strings." An upcoming patch will extend llvm-symbolizer to provide the source line information for global variables. The goal is to move AddressSanitizer off of internal debug info for symbolization onto the DWARF standard (and doing a clean-up in the process). Currently, ASan reports the line information for constant strings if a memory safety bug happens around them. We want to keep this behaviour, so we need to emit debuginfo for these variables as well. Reviewed By: dblaikie, rnk, aprantl Differential Revision: https://reviews.llvm.org/D123534	2022-05-18 13:56:45 -07:00
Mitch Phillips	ed2c3218f5	Revert "[dwarf] Emit a DIGlobalVariable for constant strings." This reverts commit `4680982b36`. Broke a fuchsia windows bot. More details in the review: https://reviews.llvm.org/D123534	2022-05-16 19:07:38 -07:00
Mitch Phillips	4680982b36	[dwarf] Emit a DIGlobalVariable for constant strings. An upcoming patch will extend llvm-symbolizer to provide the source line information for global variables. The goal is to move AddressSanitizer off of internal debug info for symbolization onto the DWARF standard (and doing a clean-up in the process). Currently, ASan reports the line information for constant strings if a memory safety bug happens around them. We want to keep this behaviour, so we need to emit debuginfo for these variables as well. Reviewed By: dblaikie, rnk, aprantl Differential Revision: https://reviews.llvm.org/D123534	2022-05-16 16:52:16 -07:00
Egor Zhdan	2f04e703bf	[Clang] Add DriverKit support This is the second patch that upstreams the support for Apple's DriverKit. The first patch: https://reviews.llvm.org/D118046. Differential Revision: https://reviews.llvm.org/D121911	2022-05-13 20:34:57 +01:00
Joseph Huber	af757f8980	[OpenMP] Don't set device runtime debugging flags if using '-nogpulib' We use globals to configure debugging at compile-time for the device runtime. Because these are only used by the OpenMP runtime we shouldn't define them if we aren't using the device runtime. When a user passes in '-nogpulib' this indicates that we are not using the device runtime, so we should check for the precense of this flag and not emit these globals if used. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D125314	2022-05-13 14:38:43 -04:00
Mike Rice	772b0c44a4	[OpenMP] Fix mangling for linear parameters with negative stride The 'n' character is used in place of '-' in the mangled name. Differential Revision: https://reviews.llvm.org/D125406	2022-05-11 14:02:09 -07:00
Joseph Huber	26eb04268f	[Clang] Introduce clang-offload-packager tool to bundle device files In order to do offloading compilation we need to embed files into the host and create fatbainaries. Clang uses a special binary format to bundle several files along with their metadata into a single binary image. This is currently performed using the `-fembed-offload-binary` option. However this is not very extensibile since it requires changing the command flag every time we want to add something and makes optional arguments difficult. This patch introduces a new tool called `clang-offload-packager` that behaves similarly to CUDA's `fatbinary`. This tool takes several input files with metadata and embeds it into a single image that can then be embedded in the host. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D125165	2022-05-11 09:39:13 -04:00
Matt Devereau	75bb815231	[AArch64][SVE] Add aarch64_sve_pcs attribute to Clang Enable function attribute aarch64_sve_pcs at the C level, which correspondes to aarch64_sve_vector_pcs at the LLVM IR level. This requirement was created by this addition to the ARM C Language Extension: https://github.com/ARM-software/acle/pull/194 Differential Revision: https://reviews.llvm.org/D124998	2022-05-11 13:33:56 +00:00
Joseph Huber	0035f7154c	[CUDA] Create offloading entries when using the new driver The changes made in D123460 generalized the code generation for OpenMP's offloading entries. We can use the same scheme to register globals for CUDA code. This patch adds the code generation to create these offloading entries when compiling using the new offloading driver mode. The offloading entries are simple structs that contain the information necessary to register the global. The struct used is as follows: ``` Type struct __tgt_offload_entry { void addr; // Pointer to the offload entry info. // (function or global) char name; // Name of the function or global. size_t size; // Size of the entry info (0 if it a function). int32_t flags; int32_t reserved; }; ``` Currently CUDA handles RDC code generation by deferring the registration of globals in the current TU to a callback function containing the modules ID. Later all the module IDs will be used to register all of the globals at once. Rather than mimic this, offloading entries allow us to mimic the way OpenMP registers globals. That is, we create a simple global struct for each device global to be registered. These are placed at a special section `cuda_offloading_entires`. Because this section is a valid C-identifier, the linker will profide a `__start` and `__stop` pointer that we can use to iterate and register all globals at runtime. the registration requires a flag variable to indicate which registration function to use. I have assigned the flags somewhat arbitrarily, but these use the following values. Kernel: 0 Variable: 0 Managed: 1 Surface: 2 Texture: 3 Depends on D120272 Reviewed By: tra Differential Revision: https://reviews.llvm.org/D123471	2022-05-11 07:30:21 -04:00
Mike Rice	0dbaef61b5	[OpenMP] Fix mangling for linear modifiers with variable stride This adds support for variable stride with the val, uval, and ref linear modifiers. Previously only the no modifer type ls<argno> was supported. val -> Ls<argno> uval -> Us<argno> ref -> Rs<argno> Differential Revision: https://reviews.llvm.org/D125330	2022-05-10 14:12:44 -07:00
Mike Rice	1a02519bc5	[OpenMP] Add mangling support for linear modifiers (ref,uval,val) Add mangling for linear parameters specified with ref, uval, and val for 'omp declare simd' vector functions. Add missing stride for linear this parameters. Differential Revision: https://reviews.llvm.org/D125269	2022-05-10 09:56:55 -07:00
Daniel Bertalan	93a8225da1	[CodeGen] Use ABI alignment for C++ new expressions In case of placement new, if we do not know the alignment of the operand, we can't assume it has the preferred alignment. It might be e.g. a pointer to a struct member which follows ABI alignment rules. This makes UBSAN no longer report "constructor call on misaligned address" when constructing a double into a struct field of type double on i686. The psABI specifies an alignment of 4 bytes, but the preferred alignment used by Clang is 8 bytes. We now use ABI alignment for allocating new as well, as the preferred alignment should be used for over-aligning e.g. local variables, which isn't relevant for ABI code dealing with operator new. AFAICT there wouldn't be problems either way though. Fixes #54845. Differential Revision: https://reviews.llvm.org/D124736	2022-05-10 16:02:23 +01:00
Simon Pilgrim	ec6024d081	[X86] Replace avx512f integer mul reduction builtins with generic builtin D117829 added the generic "__builtin_reduce_mul" which we can use to replace the x86 specific integer mul reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required. Differential Revision: https://reviews.llvm.org/D125222	2022-05-09 14:10:28 +01:00
Simon Pilgrim	8a92c45e07	[Clang] Add integer mul reduction builtin Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.mul intrinsic call. For other reductions, we've tried to share builtins for float/integer vectors, but the fmul reduction intrinsic also take a starting value argument and can either do unordered or serialized, but not reduction-trees as specified for the builtins. However we address fmul support this shouldn't affect the integer case. Differential Revision: https://reviews.llvm.org/D117829	2022-05-09 12:12:53 +01:00
Richard Smith	c4f95ef86a	Reimplement `__builtin_dump_struct` in Sema. Compared to the old implementation: * In C++, we only recurse into aggregate classes. * Unnamed bit-fields are not printed. * Constant evaluation is supported. * Proper conversion is done when passing arguments through `...`. * Additional arguments are supported and are injected prior to the format string; this directly supports use with `fprintf`, for example. * An arbitrary callable can be passed rather than only a function pointer. In particular, in C++, a function template or overload set is acceptable. * All text generated by Clang is printed via `%s` rather than directly; this avoids issues where Clang's pretty-printing output might itself contain a `%` character. * Fields of types that we don't know how to print are printed with a `"%p"` format and passed by address to the print function. No return value is produced. Reviewed By: aaron.ballman, erichkeane, yihanaa Differential Revision: https://reviews.llvm.org/D124221	2022-05-05 14:55:47 -07:00
Yaxun (Sam) Liu	62501bc45a	[NFC][CUDA][HIP] rework mangling number for aux target CUDA/HIP needs to mangle for aux target. When mangling for aux target, the mangler should use mangling number for aux target. Previously in https://reviews.llvm.org/D122734 a state was introduced in ASTContext to let the mangler get mangling number for aux target from ASTContext. This patch removes that state from ASTConext and add an IsAux member to MangleContext to indicate that the mangle context is for aux target. This reflects the reality that the mangle context is created for mangling aux target and makes ASTContext cleaner. Reviewed by: Artem Belevich, Reid Kleckner Differential Revision: https://reviews.llvm.org/D124842	2022-05-04 13:05:33 -04:00
David Pagan	37471cf2c3	[clang][OpenMP] Local variable alignment incorrect with align clause If alignment specified with align clause is less than natural alignment for list item type, the alignment should be set to the natural alignment. See OMP5.1 specification, page 185, lines 7-10 Differential Revision: https://reviews.llvm.org/D124676	2022-05-03 13:10:01 -07:00
Shilei Tian	9c1085c7e2	[Clang][OpenMP] Add the support for floating-point variables for specific atomic clauses Currently when using `atomic update` with floating-point variables, if the operation is add or sub, `cmpxchg`, instead of `atomicrmw` is emitted, as shown in [1]. In fact, about three years ago, llvm-svn: 351850 added the support for FP operations. This patch adds the support in OpenMP as well. [1] https://godbolt.org/z/M7b4ba9na Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D124724	2022-05-03 11:30:54 -04:00
David Truby	8bc29d1427	[clang][AArch64][SVE] Implement conditional operator for SVE vectors This patch adds support for the conditional (ternary) operator on SVE scalable vector types in C++, matching the behaviour for NEON vector types. Like the conditional operator for NEON types, this is disabled in C mode. Differential Revision: https://reviews.llvm.org/D124091	2022-05-03 13:10:32 +00:00
Fangrui Song	4d34c4e0e6	[OpenMP] Fix -Wswitch (due to new OMPC_cancellation_construct_type) after D123828	2022-05-02 12:10:09 -07:00
Simon Pilgrim	9a14c369c4	[X86] Replace avx512f integer add reduction builtins with generic builtin D124741 added the generic "__builtin_reduce_add" which we can use to replace the x86 specific integer add reduction builtins - internally these were mapping to the same intrinsic already so there are no test changes required. Differential Revision: https://reviews.llvm.org/D124757	2022-05-02 14:39:17 +01:00
Simon Pilgrim	a23291b7db	[Clang] Add integer add reduction builtin Similar to the existing bitwise reduction builtins, this lowers to a llvm.vector.reduce.add intrinsic call. For other reductions, we've tried to share builtins for float/integer vectors, but the fadd reduction intrinsics also take a starting value argument and can either do unordered or serialized, but not reduction-trees as specified for the builtins. However we address fadd support this shouldn't affect the integer case. (Split off from D117829) Differential Revision: https://reviews.llvm.org/D124741	2022-05-02 11:03:25 +01:00
joker881	19978e0874	[RISCV]Add CTZ Intrinsic for ZBB in Clang Add Intrinsics and test for B extension (updating coming soon (: Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124348	2022-04-30 08:18:10 +08:00
python3kgae	73417c5176	[HLSL][clang][Driver] Support validator version command line option. The DXIL validator version option(/validator-version) decide the validator version when compile hlsl. The format is major.minor like 1.0. In normal case, the value of validator version should be got from DXIL validator. Before we got DXIL validator ready for llvm/main, DXIL validator version option is added first to set validator version. It will affect code generation for DXIL, so it is treated as a code gen option. A new member std::string DxilValidatorVersion is added to clang::CodeGenOptions. Then CGHLSLRuntime is added to clang::CodeGenModule. It is used to translate clang::CodeGenOptions::DxilValidatorVersion into a ModuleFlag under key "dx.valver" at end of clang code generation. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D123884	2022-04-29 16:48:08 -07:00
Joe Nash	8bdfc73f63	[AMDGPU][clang] Definition of gfx11 subtarget Contributors: Jay Foad <jay.foad@amd.com> Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> Patch 2/N for upstreaming of AMDGPU gfx11 architecture Depends on D124536 Reviewed By: foad, kzhuravl, #amdgpu, arsenm Differential Revision: https://reviews.llvm.org/D124537	2022-04-29 13:55:56 -04:00
Joseph Huber	643c9b22ef	[OpenMP] Make generating offloading entries more generic This patch moves the logic for generating the offloading entries to the OpenMPIRBuilder. This makes it easier to re-use in other places, such as for OpenMP support in Flang or using the same method for generating offloading entires for other languages like Cuda. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D123460	2022-04-29 09:14:31 -04:00
jonasyhwang	eaca933c59	[Clang][CodeGen]Fix __builtin_dump_struct missing record type field name Thanks for @rsmith to point this. I'm sorry for introducing this bug. See @rsmith 's comment in https://reviews.llvm.org/D122248 Eg:(By @rsmith ) https://godbolt.org/z/o7vcbWaEf I have added a test case struct: ``` struct U19A { int a; }; struct U19B { struct U19A a; }; struct U19B a = { .a.a = 2022 }; ``` Dump result: ``` struct U19B { struct U19A a = { int a = 2022 } } ``` Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122920	2022-04-29 12:58:53 +08:00
Yaxun (Sam) Liu	11d3e31c60	[CUDA][HIP] Fix mangling number for local struct MSVC and Itanium mangling use different mangling numbers for function-scope structs, which causes inconsistent mangled kernel names in device and host compilations. This patch uses Itanium mangling number for structs in for mangling device side names in CUDA/HIP host compilation on Windows to fix this issue. A state is added to ASTContext to indicate whether the current name mangling is for device side names in host compilation. Device and host mangling number are encoded/decoded as upper and lower half of 32 bit unsigned integer to fit into the original mangling number field for AST. Diagnostic will be emitted if a manglining number exceeds limit. Reviewed by: Artem Belevich, Reid Kleckner Differential Revision: https://reviews.llvm.org/D122734 Fixes: SWDEV-328515	2022-04-28 19:54:43 -04:00
Alexey Bataev	1462e63f67	[OPENMP]PR53344: Emit code for final update of the inscan reduction vars in worksharing loops. Need to emit final update of the inscan reduction variables. For worksharing loops, the reduction values are stored in the temp array, need to copy the last element to the original var at the end of the construct. Differential Revision: https://reviews.llvm.org/D121156	2022-04-28 10:41:28 -07:00
Yaxun (Sam) Liu	57a210e5b7	[CUDA][HIP] Fix linkage of __clang_gpu_used_external Different TU's may have this globl var. appending linkage can only be used with lld recognized special variables. Change it to internal linkage. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D124466	2022-04-26 20:43:39 -04:00
Michael Kruse	ff289feeba	[OpenMPIRBuilder] Remove ContinuationBB argument from Body callback. The callback is expected to create a branch to the ContinuationBB (sometimes called FiniBB in some lambdas) argument when finishing. This creates problems: 1. The InsertPoint used for CodeGenIP does not need to be the end of a block. If it is not, a naive callback will insert a branch instruction into the middle of the block. 2. The BasicBlock the CodeGenIP is pointing to may or may not have a terminator. There is an conflict where to branch to if the block already has a terminator. 3. Some API functions work only with block having a terminator. Some workarounds have been used to insert a temporary terminator that is removed again. 4. Some callbacks are sensitive to whether the BasicBlock has a terminator or not. This creates a callback ordering problem where different callback may have different behaviour depending on whether a previous callback created a terminator or not. The problem also exists for FinalizeCallbackTy where some callbacks do create branch to another "continue" block, but unlike BodyGenCallbackTy does not receive the target as argument. This is not addressed in this patch. With this patch, the callback receives an CodeGenIP into a BasicBlock where to insert instructions. If it has to insert control flow, it can split the block at that position as needed but otherwise no separate ContinuationBB is needed. In particular, a callback can be empty without breaking the emitted IR. If the caller needs the control flow to branch to a specific target, it can insert the branch instruction itself and pass an InsertPoint before the terminator to the callback. Certain frontends such as Clang may expect the current IRBuilder position to be at the end of a basic block. In this case its callbacks must split the block at CodeGenIP before setting the IRBuilder position such that the instructions after CodeGenIP are moved to another basic block and before returning create a new branch instruction to the split block. Some utility functions such as `splitBB` are supporting correct splitting of BasicBlocks, independent of whether they have a terminator or not, returning/setting the InsertPoint of an IRBuilder to the end of split predecessor block, and optionally omitting creating a branch to the split successor block to be added later. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D118409	2022-04-26 16:35:01 -05:00
Andrew Savonichev	0a27622a1d	[NVPTX] Disable DWARF .file directory for PTX Default behavior for .file directory was changed in D105856, but ptxas (CUDA 11.5 release) refuses to parse it: $ llc -march=nvptx64 llvm/test/DebugInfo/NVPTX/debug-file-loc.ll $ ptxas debug-file-loc.s ptxas debug-file-loc.s, line 42; fatal : Parsing error near '"foo.h"': syntax error Added a new field to MCAsmInfo to control default value of UseDwarfDirectory. This value is used if -dwarf-directory command line option is not specified. Differential Revision: https://reviews.llvm.org/D121299	2022-04-26 21:40:36 +03:00
Jonas Paulsson	9b38e2efa0	[SystemZ] Fix C++ ABI for passing args of structs containing zero width bitfield. A struct like { float a; int :0; } should per the SystemZ ABI be passed in a GPR, but to match a bug in GCC it has been passed in an FPR (see `759449c`). GCC has now corrected the C++ ABI for this case, and this patch for clang follows suit. Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D122388	2022-04-26 17:16:14 +02:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Alok Kumar Sharma	a48300aee5	[clang][OpenMP][DebugInfo] Debug support for TLS variables present in OpenMP consruct In case of OpenMP programs, thread local variables can be present in any clause pertaining to OpenMP constructs, as we know that compiler generates artificial functions and in some cases values are passed to those artificial functions thru parameters. For an example, if thread local variable is present in copyin clause (testcase attached with the patch), parameter with same name is generated as parameter to artificial function. When user inquires the thread Local variable, its debug info is hidden by the parameter. User never gets the actual TLS variable when inquires it, instead gets the artificial parameter. Current patch suppresses the debug info for such artificial parameter to enable correct debugging of TLS variables. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D123787	2022-04-23 12:29:32 +05:30
Yaxun (Sam) Liu	04fb81674e	[CUDA][HIP] Externalize kernels with internal linkage This patch is a continuation of https://reviews.llvm.org/D123353. Not only kernels in anonymous namespace, but also template kernels with template arguments in anonymous namespace need to be externalized. To be more generic, this patch checks the linkage of a kernel assuming the kernel does not have __global__ attribute. If the linkage is internal then clang will externalize it. This patch also fixes the postfix for externalized symbol since nvptx does not allow '.' in symbol name. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D124189 Fixes: https://github.com/llvm/llvm-project/issues/54560	2022-04-22 17:05:36 -04:00
Ying Yi	b09ba42620	Bug 51277: [DWARF] DW_AT_alignment incorrect when attribute((__aligned__)) is present but ignored` In the original code, the 'getDeclAlignIfRequired' function is used. The 'getDeclAlignIfRequired' function will return the max alignment of all aligned attributes if the type has aligned attributes. The function doesn't consider the type at all. The 'getTypeAlignIfRequired' function uses the type's alignment value, which also used by the 'alignof' function. I think we should use the function of 'getTypeAlignIfRequired'. Reviewed By: dblaikie, jmorse, wolfgangp Differential Revision: https://reviews.llvm.org/D124006	2022-04-22 12:15:00 +01:00
Shafik Yaghmour	5ff992bca2	[DEBUG-INFO] Change how we handle auto return types for lambda operator() to be consistent with gcc D70524 added support for auto return types for C++ member functions. I was implementing support on the LLDB side for looking up the deduced type. I ran into trouble with some cases with respect to lambdas. I looked into how gcc was handling these cases and it appears gcc emits the deduced return type for lambdas. So I am changing out behavior to match that. Differential Revision: https://reviews.llvm.org/D123319	2022-04-21 14:58:50 -07:00
Chuanqi Xu	483efc9ad0	[Pipelines] Remove Legacy Passes in Coroutines The legacy passes are deprecated now and would be removed in near future. This patch tries to remove legacy passes in coroutines. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D123918	2022-04-21 10:59:11 +08:00
Richard Smith	72315d02c4	Treat `std::move`, `forward`, etc. as builtins. This is extended to all `std::` functions that take a reference to a value and return a reference (or pointer) to that same value: `move`, `forward`, `move_if_noexcept`, `as_const`, `addressof`, and the libstdc++-specific function `__addressof`. We still require these functions to be declared before they can be used, but don't instantiate their definitions unless their addresses are taken. Instead, code generation, constant evaluation, and static analysis are given direct knowledge of their effect. This change aims to reduce various costs associated with these functions -- per-instantiation memory costs, compile time and memory costs due to creating out-of-line copies and inlining them, code size at -O0, and so on -- so that they are not substantially more expensive than a cast. Most of these improvements are very small, but I measured a 3% decrease in -O0 object file size for a simple C++ source file using the standard library after this change. We now automatically infer the `const` and `nothrow` attributes on these now-builtin functions, in particular meaning that we get a warning for an unused call to one of these functions. In C++20 onwards, we disallow taking the addresses of these functions, per the C++20 "addressable function" rule. In earlier language modes, a compatibility warning is produced but the address can still be taken. The same infrastructure is extended to the existing MSVC builtin `__GetExceptionInfo`, which is now only recognized in namespace `std` like it always should have been. This is a re-commit of `fc30901096`, `a571f82a50`, `64c045e25b`, and `de6ddaeef3`, and reverts `aa643f455a`. This change also includes a workaround for users using libc++ 3.1 and earlier (!!), as apparently happens on AIX, where std::move sometimes returns by value. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D123345 Revert "Fixup D123950 to address revert of D123345" This reverts commit `aa643f455a`.	2022-04-20 17:58:31 -07:00
David Tenty	aa643f455a	Fixup D123950 to address revert of D123345 Since D123345 got reverted Builtin::BIaddressof and Builtin::BI__addressof don't exist and cause build breaks.	2022-04-20 19:59:07 -04:00
David Tenty	98d911e01f	Revert "Treat `std::move`, `forward`, etc. as builtins." This reverts commit `b27430f9f4` as the parent https://reviews.llvm.org/D123345 breaks the AIX CI: https://lab.llvm.org/buildbot/#/builders/214/builds/819	2022-04-20 19:14:37 -04:00
Pengxuan Zheng	38612fbc89	Reland "[COFF, ARM64] Add __break intrinsic" https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170 Reland after fixing the test failure. The failure was due to conflict with a change (D122983) which was merged right before this patch. Reviewed By: rnk, mstorsjo Differential Revision: https://reviews.llvm.org/D124032	2022-04-20 13:01:30 -07:00
Pengxuan Zheng	bff8356b19	Revert "[COFF, ARM64] Add __break intrinsic" This reverts commit `8a9b4fb4aa`.	2022-04-20 11:57:49 -07:00
Eli Friedman	ecc8479a01	Look through calls to std::addressof to compute pointer alignment. This is sort of a followup to D37310; that basically fixed the same issue, but then the libstdc++ implementation of <atomic> changed. Re-fix the the issue in essentially the same way: look through the addressof operation to find the alignment of the underlying object. Differential Revision: https://reviews.llvm.org/D123950	2022-04-20 11:30:11 -07:00
Pengxuan Zheng	8a9b4fb4aa	[COFF, ARM64] Add __break intrinsic https://docs.microsoft.com/en-us/cpp/intrinsics/arm64-intrinsics?view=msvc-170 Reviewed By: rnk, mstorsjo Differential Revision: https://reviews.llvm.org/D124032	2022-04-20 11:20:26 -07:00
Fangrui Song	a57d16bf80	[CodeGen] Fix -Wswitch after D116462	2022-04-19 17:33:15 -07:00
Fangrui Song	8b0e7f2293	[CodeGen] Fix -Wswitch after D116462	2022-04-19 17:28:54 -07:00
Paul Kirth	bac6cd5bf8	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-04-19 21:23:48 +00:00
Yaxun (Sam) Liu	cac4e2fe25	[CUDA][HIP] Fix gpu.used.external Rename gpu.used.external as __clang_gpu_used_external as ptxas does not allow . in global variable name. Fixes: https://github.com/llvm/llvm-project/issues/54934 Reviewed by: Joseph Huber, Artem Belevich Differential Revision: https://reviews.llvm.org/D123946	2022-04-18 23:10:31 -04:00
Michael Kruse	2d92ee97f1	Reapply "[OpenMP] Refactor OMPScheduleType enum." This reverts commit `af0285122f`. The test "libomp::loop_dispatch.c" on builder openmp-gcc-x86_64-linux-debian fails from time-to-time. See #54969. This patch is unrelated.	2022-04-18 21:56:47 -05:00
Michael Kruse	af0285122f	Revert "[OpenMP] Refactor OMPScheduleType enum." This reverts commit `9ec501da76`. It may have caused the openmp-gcc-x86_64-linux-debian buildbot to fail. https://lab.llvm.org/buildbot/#/builders/4/builds/20377	2022-04-18 14:38:31 -05:00
Michael Kruse	9ec501da76	[OpenMP] Refactor OMPScheduleType enum. The OMPScheduleType enum stores the constants from libomp's internal sched_type in kmp.h and are used by several kmp API functions. The enum values have an internal structure, namely each scheduling algorithm (e.g.) exists in four variants: unordered, orderend, normerge unordered, and nomerge ordered. This patch (basically a followup to D114940) splits the "ordered" and "nomerge" bits into separate flags, as was already done for the "monotonic" and "nonmonotonic", so we can apply bit flags operations on them. It also now contains all possible combinations according to kmp's sched_type. Deriving of the OMPScheduleType enum from clause parameters has been moved form MLIR's OpenMPToLLVMIRTranslation.cpp to OpenMPIRBuilder to make available for clang as well. Since the primary purpose of the flag is the binary interface to libomp, it has been made more private to LLVMFrontend. The primary interface for generating worksharing-loop using OpenMPIRBuilder code becomes `applyWorkshareLoop` which derives the OMPScheduleType automatically and calls the appropriate emitter function. While this is mostly a NFC refactor, it still applies the following functional changes: * The logic from OpenMPToLLVMIRTranslation to derive the OMPScheduleType also applies to clang. Most notably, it now applies the nonmonotonic flag for non-static schedules by default. * In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was previously not applied if the simd modifier was used. I assume this was a bug, since the effect was due to `loop.schedule_modifier()` returning `mlir::omp::ScheduleModifier::none` instead of `llvm::Optional::None`. * In OpenMPToLLVMIRTranslation, the nonmonotonic default flag was set even if ordered was specified, in breach to what the comment before citing the OpenMP specification says. I assume this was an oversight. The ordered flag with parameter was not considered in this patch. Changes will need to be made (e.g. adding/modifying function parameters) when support for it is added. The lengthy names of the enum values can be discussed, for the moment this is avoiding reusing previously existing enum value names such as `StaticChunked` to avoid confusion. Reviewed By: peixin Differential Revision: https://reviews.llvm.org/D123403	2022-04-18 14:03:17 -05:00
Richard Smith	b27430f9f4	Treat `std::move`, `forward`, etc. as builtins. This is extended to all `std::` functions that take a reference to a value and return a reference (or pointer) to that same value: `move`, `forward`, `move_if_noexcept`, `as_const`, `addressof`, and the libstdc++-specific function `__addressof`. We still require these functions to be declared before they can be used, but don't instantiate their definitions unless their addresses are taken. Instead, code generation, constant evaluation, and static analysis are given direct knowledge of their effect. This change aims to reduce various costs associated with these functions -- per-instantiation memory costs, compile time and memory costs due to creating out-of-line copies and inlining them, code size at -O0, and so on -- so that they are not substantially more expensive than a cast. Most of these improvements are very small, but I measured a 3% decrease in -O0 object file size for a simple C++ source file using the standard library after this change. We now automatically infer the `const` and `nothrow` attributes on these now-builtin functions, in particular meaning that we get a warning for an unused call to one of these functions. In C++20 onwards, we disallow taking the addresses of these functions, per the C++20 "addressable function" rule. In earlier language modes, a compatibility warning is produced but the address can still be taken. The same infrastructure is extended to the existing MSVC builtin `__GetExceptionInfo`, which is now only recognized in namespace `std` like it always should have been. This is a re-commit of `fc30901096`, `a571f82a50`, and `64c045e25b` which were reverted in `e75d8b7037` due to a crasher bug where CodeGen would emit a builtin glvalue as an rvalue if it constant-folds. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D123345	2022-04-17 13:26:16 -07:00
Vitaly Buka	e75d8b7037	Revert "Treat `std::move`, `forward`, and `move_if_noexcept` as builtins." Revert "Extend support for std::move etc to also cover std::as_const and" Revert "Update test to handle opaque pointers flag flip." It crashes on libcxx tests https://lab.llvm.org/buildbot/#/builders/85/builds/8174 This reverts commit `fc30901096`. This reverts commit `a571f82a50`. This reverts commit `64c045e25b`.	2022-04-16 00:27:51 -07:00
Joseph Huber	984a0dc386	[OpenMP] Use new offloading binary when embedding offloading images The previous patch introduced the offloading binary format so we can store some metada along with the binary image. This patch introduces using this inside the linker wrapper and Clang instead of the previous method that embedded the metadata in the section name. Differential Revision: https://reviews.llvm.org/D122683	2022-04-15 20:35:26 -04:00
Richard Smith	fc30901096	Extend support for std::move etc to also cover std::as_const and std::addressof, plus the libstdc++-specific std::__addressof. This brings us to parity with the corresponding GCC behavior. Remove STDBUILTIN macro that ended up not being used.	2022-04-15 16:31:39 -07:00
Richard Smith	64c045e25b	Treat `std::move`, `forward`, and `move_if_noexcept` as builtins. We still require these functions to be declared before they can be used, but don't instantiate their definitions unless their addresses are taken. Instead, code generation, constant evaluation, and static analysis are given direct knowledge of their effect. This change aims to reduce various costs associated with these functions -- per-instantiation memory costs, compile time and memory costs due to creating out-of-line copies and inlining them, code size at -O0, and so on -- so that they are not substantially more expensive than a cast. Most of these improvements are very small, but I measured a 3% decrease in -O0 object file size for a simple C++ source file using the standard library after this change. We now automatically infer the `const` and `nothrow` attributes on these now-builtin functions, in particular meaning that we get a warning for an unused call to one of these functions. In C++20 onwards, we disallow taking the addresses of these functions, per the C++20 "addressable function" rule. In earlier language modes, a compatibility warning is produced but the address can still be taken. The same infrastructure is extended to the existing MSVC builtin `__GetExceptionInfo`, which is now only recognized in namespace `std` like it always should have been. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D123345	2022-04-15 14:09:45 -07:00
Eli Friedman	4802edd1ac	Fix size of flexible array initializers, and re-enable assertions. In D123649, I got the formula for getFlexibleArrayInitChars slightly wrong: the flexible array elements can be contained in the tail padding of the struct. Fix the formula to account for that. With the fixed formula, we run into another issue: in some cases, we were emitting extra padding for flexible arrray initializers. Fix CGExprConstant so it uses a packed struct when necessary, to avoid this extra padding. Differential Revision: https://reviews.llvm.org/D123826	2022-04-15 12:09:57 -07:00
Jan Svoboda	9d98f58959	[clang][CodeGen] NFCI: Use FileEntryRef This patch removes use of the deprecated `DirectoryEntry::getName()` from clangCodeGen by using `{File,Directory}EntryRef` instead. Reviewed By: bnbarham Differential Revision: https://reviews.llvm.org/D123768	2022-04-15 15:16:17 +02:00
Eli Friedman	6cf0b1b3da	Comment out assertions about initializer size added in D123649. They're causing failures in LLVM test-suite. Added some regression tests that explain the issue.	2022-04-14 13:58:17 -07:00
Eli Friedman	5955a0f937	Allow flexible array initialization in C++. Flexible array initialization is a C/C++ extension implemented in many compilers to allow initializing the flexible array tail of a struct type that contains a flexible array. In clang, this is currently restricted to C. But this construct is used in the Microsoft SDK headers, so I'd like to extend it to C++. For now, this doesn't handle dynamic initialization; probably not hard to implement, but it's extra code, and I don't think it's necessary for the expected uses. And we explicitly fail out of constant evaluation. I've added some additional code to assert that initializers have the correct size, with or without flexible array init. This might catch issues unrelated to flexible array init. Differential Revision: https://reviews.llvm.org/D123649	2022-04-14 11:56:40 -07:00
David Truby	53fd8db791	[Clang][AArch64][SVE] Allow subscript operator for SVE types Undefined behaviour is just passed on to extract_element when the index is out of bounds. Subscript on svbool_t is not allowed as this doesn't really have meaningful semantics. Differential Revision: https://reviews.llvm.org/D122732	2022-04-14 13:20:50 +01:00
Jan Svoboda	d79ad2f1db	[clang][lex] NFCI: Use FileEntryRef in PPCallbacks::InclusionDirective() This patch changes type of the `File` parameter in `PPCallbacks::InclusionDirective()` from `const FileEntry *` to `Optional<FileEntryRef>`. With the API change in place, this patch then removes some uses of the deprecated `FileEntry::getName()` (e.g. in `DependencyGraph.cpp` and `ModuleDependencyCollector.cpp`). Reviewed By: dexonsmith, bnbarham Differential Revision: https://reviews.llvm.org/D123574	2022-04-14 10:46:12 +02:00
joker881	a4f47a99aa	RISCV] Add clang builtins for CLZ instruction. add intrinsic for CLZ Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121915	2022-04-14 12:29:15 +08:00
Eli Friedman	d791de0e25	Restrict lvalue-to-rvalue conversions in CGExprConstant. We were generating wrong code for cxx20-consteval-crash.cpp: instead of loading a value of a variable, we were using its address as the initializer. Found while adding code to verify the size of constant initializers. Differential Revision: https://reviews.llvm.org/D123648	2022-04-13 12:34:57 -07:00
Erich Keane	38823b7f5f	Fix Werror build issue from `6f20744b7f`	2022-04-13 11:00:09 -07:00
Erich Keane	6f20744b7f	Add support for ignored bitfield conditional codegen. Currently we emit an error in just about every case of conditionals with a 'non simple' branch if treated as an LValue. This patch adds support for the special case where this is an 'ignored' lvalue, which permits the side effects from happening. It also splits up the emit for conditional LValue in a way that should be usable to handle simple assignment expressions in similar situations. Differential Revision: https://reviews.llvm.org/D123680	2022-04-13 10:33:55 -07:00
Yaxun (Sam) Liu	0424b5115c	[CUDA][HIP] Fix host used external kernel in archive For -fgpu-rdc, a host function may call an external kernel which is defined in an archive of bitcode. Since this external kernel is only referenced in host function, the device bitcode does not contain reference to this external kernel, then the linker will not try to resolve this external kernel in the archive. To fix this issue, host-used external kernels and device variables are tracked. A global array containing pointers to these external kernels and variables is emitted which serves as an artificial references to the external kernels and variables used by host. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D123441	2022-04-13 10:47:16 -04:00
Nikita Popov	2978d02681	[Clang] Remove support for legacy pass manager This removes the -flegacy-pass-manager and -fno-experimental-new-pass-manager options, and the corresponding support code in BackendUtil. The -fno-legacy-pass-manager and -fexperimental-new-pass-manager options are retained as no-ops. Differential Revision: https://reviews.llvm.org/D123609	2022-04-13 10:21:42 +02:00
Daniel Kiss	b0343a38a5	Support the min of module flags when linking, use for AArch64 BTI/PAC-RET LTO objects might compiled with different `mbranch-protection` flags which will cause an error in the linker. Such a setup is allowed in the normal build with this change that is possible. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D123493	2022-04-13 09:31:51 +02:00
Quinn Pham	7d7022fb0c	[PowerPC] Fix EmitPPCBuiltinExpr to emit arguments once This patch changes `EmitPPCBuiltinExpr` in `CGBuiltin.cpp` to remove the loop at the beginning of the function that emits the arguments and to delay emitting the arguments until inside the switch statement. These changes will put `EmitPPCBuiltinExpr` in line with the strategy of the target independent function `EmitBuiltinExpr`. Also, this patch ensures that arguments are only emitted once. Tests that included builtins affected by these changes have been modified to match expected behaviour. Reviewed By: #powerpc, nemanjai, amyk Differential Revision: https://reviews.llvm.org/D121637	2022-04-12 15:33:20 -05:00
Nikita Popov	b72fd1a84d	[CGCall] Check store type in findDominatingStoreToReturnValue() We need to make sure that the stored type matches the return type.	2022-04-11 12:08:29 +02:00
Yaxun (Sam) Liu	4ea1d43509	[CUDA][HIP] Externalize kernels in anonymous name space kernels in anonymous name space needs to have unique name to avoid duplicate symbols. Fixes: https://github.com/llvm/llvm-project/issues/54560 Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D123353	2022-04-10 21:56:28 -04:00
Jonas Hahnfeld	e4903d8be3	[CUDA/HIP] Remove argument from module ctor/dtor signatures In theory, constructors can take arguments when called via .init_array where at least glibc passes in (argc, argv, envp). This isn't used in the generated code and if it was, the first argument should be an integer, not a pointer. For destructors registered via atexit, the function should never take an argument. Differential Revision: https://reviews.llvm.org/D123370	2022-04-09 12:34:41 +02:00
Jennifer Yu	187ccc66fa	[clang][OpenMP5.1] Initial parsing/sema for has_device_addr Added basic parsing/sema/ support for the 'has_device_addr' clause. Differential Revision: https://reviews.llvm.org/D123402	2022-04-08 21:19:38 -07:00
Mitch Phillips	fa34951fbc	Reland "[MTE] Add -fsanitize=memtag* and friends." Differential Revision: https://reviews.llvm.org/D118948	2022-04-08 14:28:33 -07:00
Aaron Ballman	4aaf25b4f7	Revert "[MTE] Add -fsanitize=memtag* and friends." This reverts commit `8aa1490513`. Broke testing: https://lab.llvm.org/buildbot/#/builders/109/builds/36233	2022-04-08 16:15:58 -04:00
Mitch Phillips	8aa1490513	[MTE] Add -fsanitize=memtag* and friends. Currently, enablement of heap MTE on Android is specified by an ELF note, which signals to the linker to enable heap MTE. This change allows -fsanitize=memtag-heap to synthesize these notes, rather than adding them through the build system. We need to extend this feature to also signal the linker to do special work for MTE globals (in future) and MTE stack (currently implemented in the toolchain, but not implemented in the loader). Current Android uses a non-backwards-compatible ELF note, called ".note.android.memtag". Stack MTE is an ABI break anyway, so we don't mind that we won't be able to run executables with stack MTE on Android 11/12 devices. The current expectation is to support the verbiage used by Android, in that "SYNC" means MTE Synchronous mode, and "ASYNC" effectively means "fast", using the Kernel auto-upgrade feature that allows hardware-specific and core-specific configuration as to whether "ASYNC" would end up being Asynchronous, Asymmetric, or Synchronous on that particular core, whichever has a reasonable performance delta. Of course, this is platform and loader-specific. Differential Revision: https://reviews.llvm.org/D118948	2022-04-08 12:13:15 -07:00
Nikita Popov	692a147bf4	[CGCall] Make findDominatingStoreToReturnValue() more robust This was skipping specific lifetime + bitcast patterns, but with opaque pointers the bitcast will not be present, and we did not perform this fold. Instead skip over lifetime.end and bitcasts generally, without trying to correlate them.	2022-04-08 15:18:12 +02:00
serge-sans-paille	301e0d9135	[Clang][Fortify] drop inline decls when redeclared When an inline builtin declaration is shadowed by an actual declaration, we must reference the actual declaration, even if it's not the last, following GCC behavior. This fixes #54715 Differential Revision: https://reviews.llvm.org/D123308	2022-04-08 09:31:51 +02:00
Chuanqi Xu	74b56e02bd	[NFC] Remove unused variable in CodeGenModules This eliminates an unused-variable warning	2022-04-08 11:52:31 +08:00
David Blaikie	1cee3d9db7	DebugInfo: Consider the type of NTTP when simplifying template names Since the NTTP may need to be cast to the type when rebuilding the name, check that the type can be rebuilt when determining whether a template name can be simplified.	2022-04-08 00:00:46 +00:00
Quinn Pham	fef56f79ac	Revert "[PowerPC] Fix EmitPPCBuiltinExpr to emit arguments once" This reverts commit `2aae5b1fac`. Because it breaks tests on windows.	2022-04-07 16:45:19 -05:00
Quinn Pham	2aae5b1fac	[PowerPC] Fix EmitPPCBuiltinExpr to emit arguments once This patch changes `EmitPPCBuiltinExpr` in `CGBuiltin.cpp` to remove the loop at the beginning of the function that emits the arguments and to delay emitting the arguments until inside the switch statement. These changes will put `EmitPPCBuiltinExpr` in line with the strategy of the target independent function `EmitBuiltinExpr`. Also, this patch ensures that arguments are only emitted once. Tests that included builtins affected by these changes have been modified to match expected behaviour. Reviewed By: #powerpc, nemanjai, amyk Differential Revision: https://reviews.llvm.org/D121637	2022-04-07 16:00:12 -05:00
Pavel Samolysov	b4ac84901e	[clang][NFC] Extract EmitAssemblyHelper::shouldEmitRegularLTOSummary The code to check if the regular LTO summary should be emitted and to add the corresponding module flags was duplicated in the 'EmitAssemblyHelper::EmitAssemblyWithLegacyPassManager' and 'EmitAssemblyHelper::RunOptimizationPipeline' methods. In order to eliminate these code duplications, the 'EmitAssemblyHelper::shouldEmitRegularLTOSummary' method has been extracted. The method returns a bool value, the value is 'true' if the module summary should be emitted. The patch keeps the setting of the module flags inline. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D123026	2022-04-07 10:38:46 -07:00
Kavitha Natarajan	b1ea0191a4	[clang][DebugInfo] Support debug info for alias variable clang to emit DWARF information for global alias variable as DW_TAG_imported_declaration. This change also handles nested (recursive) imported declarations. Reviewed by: dblaikie, aprantl Differential Revision: https://reviews.llvm.org/D120989	2022-04-07 17:15:40 +05:30
David Blaikie	6b306233f7	DebugInfo: Make the simplified template names prefix more unique	2022-04-06 18:25:46 +00:00
Ting Wang	b389354b28	[Clang][PowerPC] Add max/min intrinsics to Clang and PPC backend Add support for builtin_[max\|min] which has below prototype: A builtin_max (A1, A2, A3, ...) All arguments must have the same type; they must all be float, double, or long double. Internally use SelectCC to get the result. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D122478	2022-04-05 22:43:48 -04:00
Tom Honermann	5531abaf71	[clang] Corrections for target_clones multiversion functions. This change merges code for emit of target and target_clones multiversion resolver functions and, in doing so, corrects handling of target_clones functions that are declared but not defined. Previously, a use of such a target_clones function would result in an attempted emit of an ifunc that referenced an undefined resolver function. Ifunc references to undefined resolver functions are not allowed and, when the LLVM verifier is not disabled (via '-disable-llvm-verifier'), resulted in the verifier issuing a "IFunc resolver must be a definition" error and aborting the compilation. With this change, ifuncs and resolver function definitions are always emitted for used target_clones functions regardless of whether the target_clones function is defined (if the function is defined, then the ifunc and resolver are emitted regardless of whether the function is used). This change has the side effect of causing target_clones variants and resolver functions to be emitted in a different order than they were previously. This is harmless and is reflected in the updated tests. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122958	2022-04-05 19:50:22 -04:00
Tom Honermann	40af8df6fe	[clang] NFC: Preparation for merging code to emit target and target_clones resolvers. This change modifies CodeGenModule::emitMultiVersionFunctions() in preparation for a change that will merge support for emitting target_clones resolvers into this function. This change mostly serves to isolate indentation changes from later behavior modifying changes. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122957	2022-04-05 19:50:22 -04:00
Tom Honermann	0ace0100ae	[clang] NFC: Simplify the interface to CodeGenModule::GetOrCreateMultiVersionResolver(). Previously, GetOrCreateMultiVersionResolver() required the caller to provide a GlobalDecl along with an llvm::type and FunctionDecl. The latter two can be cheaply obtained from the first, and the llvm::type parameter is not always used, so requiring the caller to provide them was unnecessary and created the possibility that callers would pass an inconsistent set. This change simplifies the interface to only require the GlobalDecl value. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122956	2022-04-05 19:50:22 -04:00
Tom Honermann	bed5ee3f4b	[clang] NFC: Enhance comments in CodeGen for multiversion function support. Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122955	2022-04-05 19:50:22 -04:00
Shangwu Yao	15a1769631	Emit OpenCL metadata when targeting SPIR-V This is required for converting function calls such as get_global_id() into SPIR-V builtins. Differential Revision: https://reviews.llvm.org/D123049	2022-04-05 20:58:32 +00:00
Tom Honermann	7c53fc4fe1	[clang] Emit target_clones resolver functions as COMDAT. Previously, resolver functions synthesized for target_clones multiversion functions were not emitted as COMDAT. Now fixed.	2022-04-05 15:34:35 -04:00
David Blaikie	bb3980ae9f	DebugInfo: Don't use enumerators in template names for debug info as they are not canonical Since enumerators may not be available in every translation unit they can't be reliably used to name entities. (this also makes simplified template name roundtripping infeasible - since the expected name could only be rebuilt if the enumeration definition could be found (or only if it couldn't be found, depending on the context of the original name))	2022-04-05 17:16:42 +00:00
David Truby	4be1ec9fb5	[clang][AArc64][SVE] Add support for comparison operators on SVE types Comparison operators on SVE types return a signed integer vector of the same width as the incoming SVE type. This matches the existing behaviour for NEON types. Differential Revision: https://reviews.llvm.org/D122404	2022-04-05 13:56:27 +01:00
Nikita Popov	46cfbe561b	[LLVMContext] Replace enableOpaquePointers() with setOpaquePointers() This allows both explicitly enabling and explicitly disabling opaque pointers, in anticipation of the default switching at some point. This also slightly changes the rules by allowing calls if either the opaque pointer mode has not yet been set (explicitly or implicitly) or if the value remains unchanged.	2022-04-05 12:02:48 +02:00
Nikita Popov	ff18b158ed	[CodeGen] Avoid unnecessary ConstantExpr cast With opaque pointers, this is not necessarily a ConstantExpr. And we don't need one here either, just Constant is sufficient.	2022-04-05 11:28:40 +02:00
Nikita Popov	d69e9f9d89	[OpaquePtrs][Clang] Add -opaque-pointers/-no-opaque-pointers cc1 options This adds cc1 options for enabling and disabling opaque pointers on the clang side. This is not super useful now (because -mllvm -opaque-pointers and -Xclang -opaque-pointers have the same visible effect) but will be important once opaque pointers are enabled by default in clang. In that case, it will only be possible to disable them using the cc1 -no-opaque-pointers option. Differential Revision: https://reviews.llvm.org/D123034	2022-04-05 10:15:41 +02:00
Pavel Samolysov	87b28f5092	[clang][NFC] Extract the EmitAssemblyHelper::TargetTriple member Few times in different methods of the EmitAssemblyHelper class the following code snippet is used to get the TargetTriple and then use it's single method to check some conditions: TargetTriple(TheModule->getTargetTriple()) The parsing of a target triple string is not a trivial operation and it takes time to repeat the parsing many times in different methods of the class and even numerous times in one method just to call a getter (llvm::Triple(TheModule->getTargetTriple()).getVendor()), for example. The patch extracts the TargetTriple member of the EmitAssemblyHelper class to parse the triple only once in the class' constructor. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D122587	2022-04-04 12:16:39 +03:00
Luo, Yuanke	979d876bb4	[X86][AMX] enable amx cast intrinsics in FE. We have some discission in D99152 and llvm-dev and finially come up with a solution to add amx specific cast intrinsics. We've support the intrinsics in llvm IR. This patch is to replace bitcast with amx cast intrinsics in code emitting in FE. Differential Revision: https://reviews.llvm.org/D122567	2022-04-02 14:02:35 +08:00
Erich Keane	9ba8c4024b	Fix behavior of ifuncs with 'used' extern "C" static functions We expect that `extern "C"` static functions to be usable in things like inline assembly, as well as ifuncs: See the bug report here: https://github.com/llvm/llvm-project/issues/54549 However, we were diagnosing this as 'not defined', because the ifunc's attempt to look up its resolver would generate a declared IR function. Additionally, as background, the way we allow these static extern "C" functions to work in inline assembly is by making an alias with the C mangling in MOST situations to the version we emit with internal-linkage/mangling. The problem here was multi-fold: First- We generated the alias after the ifunc was checked, so the function by that name didn't exist yet. Second, the ifunc's generation caused a symbol to exist under the name of the alias already (the declared function above), which suppressed the alias generation. This patch fixes all of this by moving the checking of ifuncs/CFE aliases until AFTER we have generated the extern-C alias. Then, it does a 'fixup' around the GlobalIFunc to make sure we correct the reference. Differential Revision: https://reviews.llvm.org/D122608	2022-04-01 13:00:59 -07:00
Jorge Gorbe Moya	fc7573f29c	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit `46774df307`.	2022-03-31 14:54:41 -07:00
Paul Kirth	46774df307	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-03-31 17:38:21 +00:00
Aaron Ballman	2267549296	Fix the build after `cd26190a10` These variables were being used uninitialized and it caused a significant number of test failures on Windows.	2022-03-31 12:03:53 -04:00
wangyihan	907d3acefc	[Clang][CodeGen]Beautify dump format, add indent for nested struct and struct members Beautify dump format, add indent for nested struct and struct members, also fix test cases in dump-struct-builtin.c for example: struct: ``` struct A { int a; struct B { int b; struct C { struct D { int d; union E { int x; int y; } e; } d; int c; } c; } b; }; ``` Before: ``` struct A { int a = 0 struct B { int b = 0 struct C { struct D { int d = 0 union E { int x = 0 int y = 0 } } int c = 0 } } } ``` After: ``` struct A { int a = 0 struct B { int b = 0 struct C { struct D { int d = 0 union E { int x = 0 int y = 0 } } int c = 0 } } } ``` Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D122704	2022-03-31 07:38:37 +08:00
Chris Bieneman	dfde354958	NFC. Fixing warnings from adding DXContainer Adds DXContainer to switch statements in Clang and LLDB to silence warnings.	2022-03-29 14:46:24 -05:00
wangyihan	de7cd3ccf5	[Clang][CodeGen]Remove anonymous tag locations Remove anonymous tag locations, powered by 'PrintingPolicy', @aaron.ballman once suggested removing this extra information in https://reviews.llvm.org/D122248 struct: struct S { int a; struct /* Anonymous*/ { int x; } b; int c; }; Before: struct S { int a = 0 struct S::(unnamed at ./builtin_dump_struct.c:20:3) { int x = 0 } int c = 0 } After: struct S { int a = 0 struct S::(unnamed) { int x = 0 } int c = 0 } Differntial Revision: https://reviews.llvm.org/D122670	2022-03-29 11:38:29 -07:00
Paul Kirth	90cb325abd	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit `2add3fbd97`.	2022-03-29 06:20:30 +00:00
Phoebe Wang	cd26190a10	[X86][regcall] Support passing / returning structures Currently, the regcall calling conversion in Clang doesn't match with ICC when passing / returning structures. https://godbolt.org/z/axxKMKrW7 This patch tries to fix the problem to match with ICC. Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D122104	2022-03-29 11:29:57 +08:00
Paul Kirth	2add3fbd97	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D115907	2022-03-28 23:30:04 +00:00
James Y Knight	d614874900	[Clang] Implement __builtin_source_location. This builtin returns the address of a global instance of the `std::source_location::__impl` type, which must be defined (with an appropriate shape) before calling the builtin. It will be used to implement std::source_location in libc++ in a future change. The builtin is compatible with GCC's implementation, and libstdc++'s usage. An intentional divergence is that GCC declares the builtin's return type to be `const void` (for ease-of-implementation reasons), while Clang uses the actual type, `const std::source_location::__impl`. In order to support this new functionality, I've also added a new 'UnnamedGlobalConstantDecl'. This artificial Decl is modeled after MSGuidDecl, and is used to represent a generic concept of an lvalue constant with global scope, deduplicated by its value. It's possible that MSGuidDecl itself, or some of the other similar sorts of things in Clang might be able to be refactored onto this more-generic concept, but there's enough special-case weirdness in MSGuidDecl that I gave up attempting to share code there, at least for now. Finally, for compatibility with libstdc++'s <source_location> header, I've added a second exception to the "cannot cast from void* to T* in constant evaluation" rule. This seems a bit distasteful, but feels like the best available option. Reviewers: aaron.ballman, erichkeane Differential Revision: https://reviews.llvm.org/D120159	2022-03-28 18:29:02 -04:00
Joseph Huber	9d3550c517	[OpenMP] Add AMDGPU calling convention to ctor / dtor functions This patch adds the necessary AMDGPU calling convention to the ctor / dtor kernels. These are fundamentally device kenels called by the host on image load. Without this calling convention information the AMDGPU plugin is unable to identify them. Depends on D122504 Fixes #54091 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D122515	2022-03-25 22:44:20 -04:00
Joseph Huber	3c6d32ec6c	[OpenMP] Make Ctor / Dtor functions have external visibility The default construction of constructor functions by LLVM tends to make them have internal linkage. When we call a ctor / dtor function in the target region we are actually creating a kernel that is called at registration. Because the ctor is a kernel we need to make sure it's externally visible so we can actually call it. This prevented AMDGPU from correctly using constructors while NVPTX could use them simply because it ignored internal visibility. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D122504	2022-03-25 22:44:17 -04:00
William S. Moses	89525cbf28	[Clang] Add helper method to determine if a nonvirtual base has an entry in the LLVM struct This patch adds a helper method to determine if a nonvirtual base has an entry in the LLVM struct. Such a base may not have an entry if the base does not have any fields/bases itself that would change the size of the struct. This utility method is useful for other frontends (Polygeist) that use Clang as an API to generate code. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122502	2022-03-25 16:32:12 -04:00
Joseph Huber	b9f67d44ba	[OpenMP] Replace device kernel linkage with weak_odr Currently the device kernels all have weak linkage to prevent linkage errors on multiple defintions. However, this prevents some optimizations from adequately analyzing them because of the nature of weak linkage. This patch replaces the weak linkage with weak_odr linkage so we can statically assert that multiple declarations of the same kernel will have the same definition. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D122443	2022-03-25 11:29:15 -04:00
Jennifer Yu	a6cdac48ff	Eliminate extra set of simd variant function attribute. Current clang generates extra set of simd variant function attribute with extra 'v' encoding. For example: _ZGVbN2v__Z5add_1Pf vs _ZGVbN2vv__Z5add_1Pf The problem is due to declaration of ParamAttrs following: llvm::SmallVector<ParamAttrTy, 8> ParamAttrs(ParamPositions.size()); where ParamPositions.size() is grown after following assignment: Pos = ParamPositions[PVD]; So the PVD is not find in ParamPositions. The problem is ParamPositions need to set for each FD decl. To fix this Move ParamPositions's init inside while loop for each FD. Differential Revision: https://reviews.llvm.org/D122338	2022-03-24 13:27:28 -07:00
wangyihan	7faa95624e	[clang][CodeGen]Fix clang crash and add bitfield support in __builtin_dump_struct Fix clang crash and add bitfield support in __builtin_dump_struct. In clang13.0.x, a struct with three or more members and a bitfield at the same time will cause a crash. In clang15.x, as long as the struct has one bitfield, it will cause a crash in clang. Open issue: https://github.com/llvm/llvm-project/issues/54462 Differential Revision: https://reviews.llvm.org/D122248	2022-03-24 12:23:29 -07:00
David Blaikie	7b498beef0	DebugInfo: Classify noreturn function types as non-reconstructible This information isn't preserved in the DWARF description of function types (though probably should be - it's preserved on the function declarations/definitions themselves through the DW_AT_noreturn attribute - but we should move or also include that in the subroutine type itself too - but for now, with it not being there, the DWARF is lossy and can't be reconstructed)	2022-03-24 18:53:14 +00:00
Mike Rice	f82ec5532b	[OpenMP] Initial parsing/sema for the 'omp target parallel loop' construct Adds basic parsing/sema/serialization support for the #pragma omp target parallel loop directive. Differential Revision: https://reviews.llvm.org/D122359	2022-03-24 09:19:00 -07:00
Aaron Ballman	488c772920	Fix a crash with variably-modified parameter types in a naked function Naked functions have no prolog, so it's not valid to emit prolog code to evaluate the variably-modified type. This fixes Issue 50541.	2022-03-24 10:39:14 -04:00
Dávid Bolvanský	a683ba4ff5	[NFCI] Fix set-but-unused warning in CGOpenMPRuntime.cpp	2022-03-24 07:49:21 +01:00
Ben Shi	51585aa240	[clang][AVR] Implement standard calling convention for AVR and AVRTiny This patch implements avr-gcc's calling convention: https://gcc.gnu.org/wiki/avr-gcc#Calling_Convention Reviewed By: aykevl Differential Revision: https://reviews.llvm.org/D120720	2022-03-24 02:08:22 +00:00
Julian Lettner	64902d335c	Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-23 18:36:55 -07:00
Zequan Wu	581dc3c729	Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" This reverts commit `22570bac69`.	2022-03-23 16:11:54 -07:00
Joseph Huber	0d16c23af1	[OpenMP] Do not create offloading entries for internal or hidden symbols Currently we create offloading entries to register device variables with the host. When we register a variable we will look up the symbol in the device image and map the device address to the host address. This is a problem when the symbol is declared with hidden visibility or internal linkage. This means the symbol is not accessible externally and we cannot get its address. We should still allow static variables to be declared on the device, but ew should not create an offloading entry for them so they exist independently on the host and device. Fixes #54309 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D122352	2022-03-23 18:27:16 -04:00
Erich Keane	3fb101a691	[NFC] Replace a not-null-check && isa with isa_and_nonnull	2022-03-23 13:09:28 -07:00
Nikita Popov	a8690ba9d0	[CGExpr] Perform bitcast unconditionally The way the check is written is not compatible with opaque pointers -- while we don't need to change the IR pointer type, we do need to change the element type stored in the Address.	2022-03-23 15:39:39 +01:00
Nikita Popov	5c6752d4ad	[CGObjCMac] Check global value type instead of poitner type As we're going to reassign the initializer, we actually need the value types to match, not just the pointer types. This is only relevant with opaque pointers.	2022-03-23 15:39:39 +01:00
Nikita Popov	beee09687f	[CGBlocks] Don't assume presence of bitcast With opaque pointers, the bitcast constexpr will not be present.	2022-03-23 15:39:39 +01:00
David Truby	683fc6203c	[clang][AArc64][SVE] Implement vector-scalar operators This patch extends the support for C/C++ operators for SVE types to allow one of the arguments to be a scalar, in which case a vector splat is performed. Differential Revision: https://reviews.llvm.org/D121829	2022-03-23 14:20:48 +00:00
Nikita Popov	c070d5ceff	[CGOpenMPRuntime] Remove uses of deprecated Address constructor And as these are the last remaining uses, also remove the constructor itself.	2022-03-23 12:40:44 +01:00
Nikita Popov	8b62dd3cd6	Reapply [CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer() This requires some adjustment in caller code, because there was a confusion regarding the meaning of the PtrTy argument: This argument is the type of the pointer being loaded, not the addresses being loaded from. Reapply after fixing the specified pointer type for one call in `47eb4f7dcd`, where the used type is important for determining alignment.	2022-03-23 12:06:11 +01:00
Nikita Popov	47eb4f7dcd	[CGOpenMPRuntime] Specify correct type in EmitLoadOfPointerLValue() Perform a bitcast first, so we can specify the correct pointer type inf EmitLoadOfPointerLValue(), rather than using a dummy void pointer.	2022-03-23 11:51:14 +01:00
Nikita Popov	ba2be802b0	[CGOpenMPRuntime] Reuse getDepobjElements() (NFC) There were two more places repeating this code, reuse the helper. This requires moving the static functions into the class.	2022-03-23 11:31:49 +01:00
Nikita Popov	27f6cee12d	Revert "[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer()" This reverts commit `767ec883e3`. This results in a some incorrect alignments which are not covered by existing tests.	2022-03-23 10:24:39 +01:00
Phoebe Wang	32103608fc	[Inline-asm] Add diagnosts for unsupported inline assembly arguments GCC supports power-of-2 size structures for the arguments. Clang supports fewer than GCC. But Clang always crashes for the unsupported cases. This patch adds sema checks to do the diagnosts to solve these crashes. Reviewed By: jyu2 Differential Revision: https://reviews.llvm.org/D107141	2022-03-23 11:25:19 +08:00
Akira Hatanaka	818e72d1b0	[NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in TargetInfo.cpp Differential Revision: https://reviews.llvm.org/D122199	2022-03-22 18:39:16 -07:00
Mike Rice	2cedaee6f7	[OpenMP] Initial parsing/sema for the 'omp parallel loop' construct Adds basic parsing/sema/serialization support for the #pragma omp parallel loop directive. Differential Revision: https://reviews.llvm.org/D122247	2022-03-22 13:55:47 -07:00
Nikita Popov	cd6d9ae263	[CGOpenMPRuntime] Remove some uses of deprecated Adddress ctor	2022-03-22 16:29:35 +01:00
Nikita Popov	4f5640cad3	[CGOpenMPRuntime] Remove some uses of deprecated Address ctor	2022-03-22 15:35:45 +01:00
Nikita Popov	73c0d05e6a	[CGOpenMPRuntimeGPU] Remove uses of deprecated address constructor Worth noting that the code marked with FIXME is dead and would produce invalid IR if hit. Someone familiar with this code should probably look into that.	2022-03-22 15:02:45 +01:00
Djordje Todorovic	73777b4c35	[Debugify] Optimize debugify original mode Before we start addressing the issue with having a lot of false positives when using debugify in the original mode, we have made a few patches that should speed up the execution of the testing utility Passes. For example, when testing a large project (let's say LLVM project itself), we can face a lot of potential DI issues. Usually, we use -verify-each-debuginfo-preserve (that is very similar to -debugify-each) -- it collects DI metadata before each Pass, and after the Pass it checks if the Pass preserved the DI metadata. However, we can speed up this process, since we don't need to collect DI metadata before each Pass -- we could use the DI metadata that are collected after the previous Pass from the pipeline as an input for the next Pass. This patch speeds up the utility for ~2x. Differential Revision: https://reviews.llvm.org/D115622	2022-03-22 12:14:00 +01:00
Nikita Popov	51ba13b1ae	[CGStmtOpenMP] Remove uses of deprecated Address constructor	2022-03-22 11:00:08 +01:00
Nikita Popov	b8f0e12847	[CodeGen] Remove some uses of deprecated Address constructor Remove two stray uses in CodeGenModule and CGCUDANV.	2022-03-22 10:02:35 +01:00
Nikita Popov	767ec883e3	[CodeGen] Avoid deprecated Address ctor in EmitLoadOfPointer() This requires some adjustment in caller code, because there was a confusion regarding the meaning of the PtrTy argument: This argument is the type of the pointer being loaded, not the addresses being loaded from.	2022-03-22 09:42:31 +01:00
Nikita Popov	a9656bd1bc	[CodeGen][OpenMP] Make EmitLoadOfPointer() type consistent If necessary insert a bitcast beforehand, so the LLVM-level pointer type and the Clang-level pointer type line up.	2022-03-22 09:37:48 +01:00
Nikita Popov	7a2e12e0a7	[CodeGen][OpenMP] Use correct type in EmitLoadOfPointer() The EmitLoadOfPointer() call already specified the right pointer type, but it did not match the Address we're loading from, so we need to insert a bitcast first.	2022-03-21 15:22:37 +01:00
Nikita Popov	b6f85d8539	[CodeGen][OpenMP] Use correct type in EmitLoadOfPointer() Rather than using a dummy void pointer type, we should specify the correct private type and perform the bitcast beforehand rather than afterwards. This way, the Address will have correct alignment information.	2022-03-21 12:08:05 +01:00
Mike Rice	6bd8dc91b8	[OpenMP] Initial parsing/sema for the 'omp target teams loop' construct Adds basic parsing/sema/serialization support for the #pragma omp target teams loop directive. Differential Revision: https://reviews.llvm.org/D122028	2022-03-18 13:48:32 -07:00
Alan Zhao	8cd8bd4a5c	Implement __cpuid and __cpuidex as Clang builtins https://reviews.llvm.org/D23944 implemented the #pragma intrinsic from MSVC. This causes the statement #pragma intrinsic(cpuid) to fail [0] on Clang because cpuid is currently implemented in intrin.h instead of a Clang builtin. Reimplementing cpuid (as well as it's releated function, cpuidex) should resolve this. [0]: https://crbug.com/1279344 Differential revision: https://reviews.llvm.org/D121653	2022-03-18 18:13:52 +01:00
Nikita Popov	52cc65d474	[OpenMPRuntime] Specify correct pointer type Rather than specifying a dummy type in EmitLoadOfPointer() and then casting it to the correct one, we should instead specify the correct type and cast beforehand. Otherwise the computed alignment will be incorrect.	2022-03-18 14:25:51 +01:00
Nikita Popov	74992f4a5b	[CodeGen] Store element type in DominatingValue<RValue> For aggregate rvalues, we need to store the element type in the dominating value, so we can recover the element type for the address.	2022-03-18 11:13:25 +01:00
Nikita Popov	33d020d010	[CodeGen] Remove some uses of deprecated Address constructor	2022-03-18 11:01:25 +01:00
Benjamin Kramer	5d2ce7663b	Use llvm::append_range instead of push_back loops where applicable. NFCI.	2022-03-18 01:25:34 +01:00
Paul Kirth	964398ccb1	Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""" This reverts commit `6cf560d69a`.	2022-03-18 00:21:33 +00:00
Paul Kirth	6cf560d69a	Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"" I mistakenly reverted my commit, so I'm relanding it. This reverts commit `10866a1df4`.	2022-03-18 00:04:22 +00:00
Paul Kirth	10866a1df4	Revert "[misexpect] Re-implement MisExpect Diagnostics" This reverts commit `e7749d4713`.	2022-03-17 23:54:26 +00:00
Paul Kirth	e7749d4713	[misexpect] Re-implement MisExpect Diagnostics Reimplements MisExpect diagnostics from D66324 to reconstruct its original checking methodology only using MD_prof branch_weights metadata. New checks rely on 2 invariants: 1) For frontend instrumentation, MD_prof branch_weights will always be populated before llvm.expect intrinsics are lowered. 2) for IR and sample profiling, llvm.expect intrinsics will always be lowered before branch_weights are populated from the IR profiles. These invariants allow the checking to assume how the existing branch weights are populated depending on the profiling method used, and emit the correct diagnostics. If these invariants are ever invalidated, the MisExpect related checks would need to be updated, potentially by re-introducing MD_misexpect metadata, and ensuring it always will be transformed the same way as branch_weights in other optimization passes. Frontend based profiling is now enabled without using LLVM Args, by introducing a new CodeGen option, and checking if the -Wmisexpect flag has been passed on the command line. Differential Revision: https://reviews.llvm.org/D115907	2022-03-17 23:46:23 +00:00
Changpeng Fang	dd5895cc39	AMDGPU: Use the implicit kernargs for code object version 5 Summary: Specifically, for trap handling, for targets that do not support getDoorbellID, we load the queue_ptr from the implicit kernarg, and move queue_ptr to s[0:1]. To get aperture bases when targets do not have aperture registers, we load private_base or shared_base directly from the implicit kernarg. In clang, we use implicitarg_ptr + offsets to implement __builtin_amdgcn_workgroup_size_{xyz}. Reviewers: arsenm, sameerds, yaxunl Differential Revision: https://reviews.llvm.org/D120265	2022-03-17 14:12:36 -07:00
Johannes Doerfert	f02550bdd9	Reapply "[OpenMP][FIX] Allow device constructors for AMD GPU" This reverts commit `a597d6a780` and reapplies `07b1766461`. In AMD GPU device code the globals are in AS(1). Before, we crashed if the global was a structure. Now we simply cast away the AS before we generate the code to initialize the global. Differential Revision: https://reviews.llvm.org/D121837 Fixes: https://github.com/llvm/llvm-project/issues/54421	2022-03-17 12:53:47 -05:00
Julian Lettner	22570bac69	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121736	2022-03-17 10:47:13 -07:00
Nikita Popov	6e1e99dc07	[CodeGen] Avoid pointer element type access for blocks Pass the block struct type down to the TargetInfo hooks.	2022-03-17 16:56:31 +01:00
Nikita Popov	6c0af92612	[CodeGen] Avoid some pointer element type accesses	2022-03-17 16:36:14 +01:00
Nikita Popov	2edac9d962	[CodeGen] Avoid some pointer element type accesses	2022-03-17 16:32:45 +01:00
Nikita Popov	bf1a99861c	[CodeGen] Avoid some pointer element type accesses	2022-03-17 15:25:55 +01:00
Nikita Popov	799643f7f0	[CGObjCGNU] Remove pointer element type uses	2022-03-17 14:53:34 +01:00
Evgenii Stepanov	cb96464f12	Stricter use-after-dtor detection for trivial members. Poison trivial class members one-by-one in the reverse order of their construction, instead of all-at-once at the very end. For example, in the following code access to `x` from `~B` will produce an undefined value. struct A { struct B b; int x; }; Reviewed By: kda Differential Revision: https://reviews.llvm.org/D119600	2022-03-16 18:20:27 -07:00
Evgenii Stepanov	c5ea8e9138	Use-after-dtor detection for trivial base classes. -fsanitize-memory-use-after-dtor detects memory access after a subobject is destroyed but its memory is not yet deallocated. This is done by poisoning each object memory near the end of its destructor. Subobjects (members and base classes) do this in their respective destructors, and the parent class does the same for its members with trivial destructors. Inexplicably, base classes with trivial destructors are not handled at all. This change fixes this oversight by adding the base class poisoning logic to the parent class destructor. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D119300	2022-03-16 18:20:27 -07:00
Eli Friedman	04ba344176	[CodeGen] Inline _byteswap_* builtins. As discussed in D57915. Fixes https://github.com/llvm/llvm-project/issues/39999 . Differential Revision: https://reviews.llvm.org/D121865	2022-03-16 16:18:51 -07:00
Johannes Doerfert	a597d6a780	Revert "[OpenMP][FIX] Allow device constructors for AMD GPU" This reverts commit `07b1766461` as it broke the buildbots: https://lab.llvm.org/buildbot#builders/193/builds/8594	2022-03-16 17:35:54 -05:00
Johannes Doerfert	07b1766461	[OpenMP][FIX] Allow device constructors for AMD GPU In AMD GPU device code the globals are in AS(1). Before, we crashed if the global was a structure. Now we simply cast away the AS before we generate the code to initialize the global. Differential Revision: https://reviews.llvm.org/D121837	2022-03-16 17:04:28 -05:00
Mike Rice	79f661edc1	[OpenMP] Initial parsing/sema for the 'omp teams loop' construct Adds basic parsing/sema/serialization support for the #pragma omp teams loop directive. Differential Revision: https://reviews.llvm.org/D121713	2022-03-16 14:39:18 -07:00
Arthur Eubanks	2371c5a0e0	[OpaquePtr][ARM] Use elementtype on ldrex/ldaex/stlex/strex Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Basically the same as D120527. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D121847	2022-03-16 14:11:53 -07:00
Thomas Lively	7e8913d775	[WebAssembly] Fix names of SIMD instructions containing '_zero' Fix the instruction names to match the WebAssembly spec: - `i32x4.trunc_sat_zero_f64x2_{s,u}` => `i32x4.trunc_sat_f64x2_{s,u}_zero` - `f32x4.demote_zero_f64x2` => `f32x4.demote_f64x2_zero` Also rename related things like intrinsics, builtins, and test functions to match. Reviewed By: aheejin Differential Revision: https://reviews.llvm.org/D121661	2022-03-16 13:34:57 -07:00
Yonghong Song	3251ba2d0f	[Attr] Fix a btf_type_tag AST generation Current ASTContext.getAttributedType() takes attribute kind, ModifiedType and EquivType as the hash to decide whether an AST node has been generated or note. But this is not enough for btf_type_tag as the attribute might have the same ModifiedType and EquivType, but still have different string associated with attribute. For example, for a data structure like below, struct map_value { int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) a; int __attribute__((btf_type_tag("tag2"))) __attribute__((btf_type_tag("tag4"))) b; }; The current ASTContext.getAttributedType() will produce an AST similar to below: struct map_value { int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) a; int __attribute__((btf_type_tag("tag1"))) __attribute__((btf_type_tag("tag3"))) b; }; and this is incorrect. It is very difficult to use the current AttributedType as it is hard to get the tag information. To fix the problem, this patch introduced BTFTagAttributedType which is similar to AttributedType in many ways but with an additional BTFTypeTagAttr. The tag itself can be retrieved with BTFTypeTagAttr. With the new BTFTagAttributed type, the debuginfo code can be greatly simplified compared to previous TypeLoc based approach. Differential Revision: https://reviews.llvm.org/D120296	2022-03-16 08:46:52 -07:00
Simon Moll	0aab344104	[Clang] Allow "ext_vector_type" applied to Booleans This is the `ext_vector_type` alternative to D81083. This patch extends Clang to allow 'bool' as a valid vector element type (attribute ext_vector_type) in C/C++. This is intended as the canonical type for SIMD masks and facilitates clean vector intrinsic declarations. Vectors of i1 are supported on IR level and below down to many SIMD ISAs, such as AVX512, ARM SVE (fixed vector length) and the VE target (NEC SX-Aurora TSUBASA). The RFC on cfe-dev: https://lists.llvm.org/pipermail/cfe-dev/2020-May/065434.html Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D88905	2022-03-16 11:10:32 +01:00
Keith Smiley	a2db7d5e9c	reland: [clang] Don't append the working directory to absolute paths This fixes a bug that happens when using -fdebug-prefix-map to remap an absolute path to a relative path. Since the path was absolute before remapping, it is safe to assume that concatenating the remapped working directory would be wrong. This was originally submitted as https://reviews.llvm.org/D113718, but reverted because when testing with dwarf 5 enabled, the tests were too strict. Differential Revision: https://reviews.llvm.org/D121663	2022-03-15 13:42:35 -07:00
Simon Pilgrim	7262eacd41	Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO" Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'	2022-03-15 13:01:35 +00:00
Keith Smiley	cb22d71806	[clang] Fix DIFile directory root on Windows On unix systems this logic would not separate the file and directory of the DIFile unless they shared more components at the start than just the root path character. The logic to do this was unix specific so it didn't work on Windows. Now we check if the entire root_path is the same as what you were going to set as the Dir and use the full filepath in that case. Differential Revision: https://reviews.llvm.org/D111579	2022-03-14 20:07:01 -07:00
Julian Lettner	9c542a5a4e	Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with `__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`. Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this. Enable fallback to the old behavior via Clang driver flag (`-fregister-global-dtors-with-atexit`) or llc / code generation flag (`-lower-global-dtors-via-cxa-atexit`). This escape hatch will be removed in the future. Differential Revision: https://reviews.llvm.org/D121327	2022-03-14 17:51:18 -07:00
Joseph Huber	806bbc49dc	[OpenMP] Try to embed offloading objects after codegen Currently we use the `-fembed-offload-object` option to embed a binary file into the host as a named section. This is currently only used as a codegen action, meaning we only handle this option correctly when the input is a bitcode file. This patch adds the same handling to embed an offloading object after we complete code generation. This allows us to embed the object correctly if the input file is source or bitcode. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120270	2022-03-14 20:08:24 -04:00
Dávid Bolvanský	003c0b9307	[Clang] always_inline statement attribute Motivation: ``` int test(int x, int y) { int r = 0; [[clang::always_inline]] r += foo(x, y); // force compiler to inline this function here return r; } ``` In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others). This patch solves this problem with call site attribute. "noinline" statement attribute already landed in D119061. Also, some LLVM Inliner fixes landed so call site attribute is stronger than function attribute. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D120717	2022-03-14 21:45:31 +01:00
Arthur Eubanks	250620f76e	[OpaquePtr][AArch64] Use elementtype on ldxr/stxr Includes verifier changes checking the elementtype, clang codegen changes to emit the elementtype, and ISel changes using the elementtype. Reviewed By: #opaque-pointers, nikic Differential Revision: https://reviews.llvm.org/D120527	2022-03-14 10:09:59 -07:00
Erich Keane	dc152659b4	Have cpu-specific variants set 'tune-cpu' as an optimization hint Due to various implementation constraints, despite the programmer choosing a 'processor' cpu_dispatch/cpu_specific needs to use the 'feature' list of a processor to identify it. This results in the identified processor in source-code not being propogated to the optimizer, and thus, not able to be tuned for. This patch changes to use the actual cpu as written for tune-cpu so that opt can make decisions based on the cpu-as-spelled, which should better match the behavior expected by the programmer. Note that the 'valid' list of processors for x86 is in llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list contains only Intel processors, but other vendors may wish to add their own entries as 'alias'es (or with different feature lists!). If this is not done, there is two potential performance issues with the patch, but I believe them to be worth it in light of the improvements to behavior and performance. 1- In the event that the user spelled "ProcessorB", but we only have the features available to test for "ProcessorA" (where A is B minus features), AND there is an optimization opportunity for "B" that negatively affects "A", the optimizer will likely choose to do so. 2- In the event that the user spelled VendorI's processor, and the feature list allows it to run on VendorA's processor of similar features, AND there is an optimization opportunity for VendorIs that negatively affects "A"s, the optimizer will likely choose to do so. This can be fixed by adding an alias to X86TargetParser.def. Differential Revision: https://reviews.llvm.org/D121410	2022-03-14 06:14:30 -07:00
Kazushi (Jam) Marukawa	b1b4b6f366	[Clang][VE] Add vector load intrinsics Add vector load intrinsic instructions for VE. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D121049	2022-03-12 09:09:57 +09:00
Akira Hatanaka	aa4ea0ee54	[NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in a couple more files Differential Revision: https://reviews.llvm.org/D121135	2022-03-11 09:30:31 -08:00
Simon Pilgrim	d258196f5f	[clang] ScalarExprEmitter::VisitCastExpr - use castAs<> instead of getAs<> to avoid dereference of nullptr The pointers are always dereferenced, so assert the cast is correct instead of returning nullptr	2022-03-09 11:40:37 +00:00
Ryan Senanayake	b3dae59b9d	[clang] Fix CodeGenAction for LLVM IR MemBuffers Replaces use of getCurrentFile with getCurrentFileOrBufferName in CodeGenAction. This avoids an assertion error or an incorrect name chosen for the output file when assertions are disabled. This error previously occurred when the FrontendInputFile was a MemoryBuffer instead of a file. Reviewed By: jlebar Differential Revision: https://reviews.llvm.org/D121259	2022-03-09 00:39:48 +00:00
Akira Hatanaka	9bb8c80bea	[NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in CGBuiltin.cpp Differential Revision: https://reviews.llvm.org/D121153	2022-03-08 09:45:15 -08:00
Stanislav Mekhanoshin	932f628121	[AMDGPU] new gfx940 fp atomics Differential Revision: https://reviews.llvm.org/D121028	2022-03-07 12:32:02 -08:00
David Blaikie	c0a6433f2b	Simplify OpenMP Lambda use * Use default ref capture for non-escaping lambdas (this makes maintenance easier by allowing new uses, removing uses, having conditional uses (such as in assertions) not require updates to an explicit capture list) * Simplify addPrivate API not to take a lambda, since it calls it unconditionally/immediately anyway - most callers are simply passing in a named value or short expression anyway and the lambda syntax just adds noise/overhead Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D121077	2022-03-07 18:23:20 +00:00
Qiu Chaofan	b2497e5435	[PowerPC] Add generic fnmsub intrinsic Currently in Clang, we have two types of builtins for fnmsub operation: one for float/double vector, they'll be transformed into IR operations; one for float/double scalar, they'll generate corresponding intrinsics. But for the vector version of builtin, the 3 op chain may be recognized as expensive by some passes (like early cse). We need some way to keep the fnmsub form until code generation. This patch introduces ppc.fnmsub.* intrinsic to unify four fnmsub intrinsics. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D116015	2022-03-07 13:00:06 +08:00
Shao-Ce SUN	fa9c8bab0c	[RISCV] Support k-ext clang intrinsics Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D112774	2022-03-05 13:57:18 +08:00
Akira Hatanaka	3717b9661f	[NFC][Clang][OpaquePtr] Remove calls to Address::deprecated in CGBlocks.cpp Differential Revision: https://reviews.llvm.org/D120856	2022-03-03 08:54:46 -08:00
Aakanksha	840695814a	[AMDGPU] Add gfx1036 target Differential Revision: https://reviews.llvm.org/D120846	2022-03-02 23:26:38 +00:00
Stanislav Mekhanoshin	2e2e64df4a	[AMDGPU] Add gfx940 target This is target definition only. Differential Revision: https://reviews.llvm.org/D120688	2022-03-02 13:54:48 -08:00
Tong Zhang	f76d3b800f	[clang][CGStmt] fix crash on invalid asm statement Clang is crashing on the following statement char var[9]; __asm__ ("" : "=r" (var) : "0" (var)); This is similar to existing test: crbug_999160_regtest The issue happens when EmitAsmStmt is trying to convert input to match output type length. However, that is not guaranteed to be successful all the time and if the statement itself is invalid like having an array type in the example, we should give a regular error message here instead of using assert(). Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D120596	2022-03-02 11:18:55 -08:00
Akira Hatanaka	d112cc2756	[NFC][Clang][OpaquePtr] Remove the call to Address::deprecated in CreatePointerBitCastOrAddrSpaceCast Differential Revision: https://reviews.llvm.org/D120757	2022-03-02 08:58:00 -08:00
Tong Zhang	17ce89fa80	[SanitizerBounds] Add support for NoSanitizeBounds function Currently adding attribute no_sanitize("bounds") isn't disabling -fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang frontend handles fsanitize=array-bounds which can already be disabled by no_sanitize("bounds"). However, instrumentation added by the BoundsChecking pass in the middle-end cannot be disabled by the attribute. The fix is very similar to D102772 that added the ability to selectively disable sanitizer pass on certain functions. In this patch, if no_sanitize("bounds") is provided, an additional function attribute (NoSanitizeBounds) is attached to IR to let the BoundsChecking pass know we want to disable local-bounds checking. In order to support this feature, the IR is extended (similar to D102772) to make Clang able to preserve the information and let BoundsChecking pass know bounds checking is disabled for certain function. Reviewed By: melver Differential Revision: https://reviews.llvm.org/D119816	2022-03-01 18:47:02 +01:00
Michael Kruse	a66f7769a3	[OpenMPIRBuilder] Implement static-chunked workshare-loop schedules. Add applyStaticChunkedWorkshareLoop method implementing static schedule when chunk-size is specified. Unlike a static schedule without chunk-size (where chunk-size is chosen by the runtime such that each thread receives one chunk), we need two nested loops: one for looping over the iterations of a chunk, and a second for looping over all chunks assigned to the threads. This patch includes the following related changes: * Adapt applyWorkshareLoop to triage between the schedule types, now possible since all schedules have been implemented. The default schedule is assumed to be non-chunked static, as without OpenMPIRBuilder. * Remove the chunk parameter from applyStaticWorkshareLoop, it is ignored by the runtime. Change the value for the value passed to the init function to 0, as without OpenMPIRBuilder. * Refactor CanonicalLoopInfo::setTripCount and CanonicalLoopInfo::mapIndVar as used by both, applyStaticWorkshareLoop and applyStaticChunkedWorkshareLoop. * Enable Clang to use the OpenMPIRBuilder in the presence of the schedule clause. Differential Revision: https://reviews.llvm.org/D114413	2022-02-28 18:18:33 -06:00
Dávid Bolvanský	223b824022	[Clang] noinline call site attribute Motivation: ``` int foo(int x, int y) { // any compiler will happily inline this function return x / y; } int test(int x, int y) { int r = 0; [[clang::noinline]] r += foo(x, y); // for some reason we don't want any inlining here return r; } ``` In 2018, @kuhar proposed "Introduce per-callsite inline intrinsics" in https://reviews.llvm.org/D51200 to solve this motivation case (and many others). This patch solves this problem with call site attribute. The implementation is "smaller" wrt approach which uses new intrinsics and thanks to https://reviews.llvm.org/D79121 (Add nomerge statement attribute to clang), we have got some basic infrastructure to deal with attrs on statements with call expressions. GCC devs are more inclined to call attribute solution as well, as builtins are problematic for them - https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104187. But they have no patch proposal yet so.. We have free hands here. If this approach makes sense, next future steps would be support for call site attributes for always_inline / flatten. Reviewed By: aaron.ballman, kuhar Differential Revision: https://reviews.llvm.org/D119061	2022-02-28 21:21:17 +01:00
Itay Bookstein	f3480390be	[clang][CodeGen] Avoid emitting ifuncs with undefined resolvers The purpose of this change is to fix the following codegen bug: ``` // main.c __attribute__((cpu_specific(generic))) int foo(void) { static int z; return &z;} int main() { return foo() = 5; } // other.c __attribute__((cpu_dispatch(generic))) int foo(void); // run: clang main.c other.c -o main; ./main ``` This will segfault prior to the change, and return the correct exit code 5 after the change. The underlying cause is that when a translation unit contains a cpu_specific function without the corresponding cpu_dispatch the generated code binds the reference to foo() against a GlobalIFunc whose resolver is undefined. This is invalid: the resolver must be defined in the same translation unit as the ifunc, but historically the LLVM bitcode verifier did not check that. The generated code then binds against the resolver rather than the ifunc, so it ends up calling the resolver rather than the resolvee. In the example above it treats its return value as an int , therefore trying to write to program text. The root issue at the representation level is that GlobalIFunc, like GlobalAlias, does not support a "declaration" state. The object which provides the correct semantics in these cases is a Function declaration, but unlike Functions, changing a declaration to a definition in the GlobalIFunc case constitutes a change of the object type, as opposed to simply emitting code into a Function. I think this limitation is unlikely to change, so I implemented the fix by returning a function declaration rather than an ifunc when encountering cpu_specific, and upgrading it to an ifunc when emitting cpu_dispatch. This uses `takeName` + `replaceAllUsesWith` in similar vein to other places where the correct IR object type cannot be known locally/up-front, like in `CodeGenModule::EmitAliasDefinition`. Previous discussion in: https://reviews.llvm.org/D112349 Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D120266	2022-02-26 11:17:49 +02:00
Adrian Prantl	bc7aeea854	Revert "Don't append the working directory to absolute paths" This reverts commit `2cd9a86da5`.	2022-02-25 17:00:10 -08:00
Adrian Prantl	2cd9a86da5	Don't append the working directory to absolute paths This fixes a bug that happens when using -fdebug-prefix-map to remap an absolute path to a relative path. Since the path was absolute before remapping, it is safe to assume that concatenating the remapped working directory would be wrong. Differential Revision: https://reviews.llvm.org/D113718	2022-02-25 13:03:59 -08:00
Alexey Bataev	d04d9220e1	[OPENMP]Fix PR50347: Mapping of global scope deep object fails. Changed the we handle llvm::Constants in sizes arrays. ConstExprs and GlobalValues cannot be used as initializers, need to put them at the runtime, otherwise there wight be the compilation errors. Differential Revision: https://reviews.llvm.org/D105297	2022-02-25 10:54:24 -08:00
Shangwu Yao	c2f501f395	[CUDA][SPIRV] Assign global address space to CUDA kernel arguments (resubmit https://reviews.llvm.org/D119207 after fixing the test for some build settings) This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential revision: https://reviews.llvm.org/D120366	2022-02-24 20:51:43 -08:00
Alexey Bataev	ca6fa71b7e	Revert "[OPENMP]Fix PR50347: Mapping of global scope deep object fails." This reverts commit `638938117a`. Need to fix reported fail https://lab.llvm.org/buildbot/#/builders/193/builds/7496	2022-02-24 12:04:39 -08:00
Alexey Bataev	638938117a	[OPENMP]Fix PR50347: Mapping of global scope deep object fails. Changed the we handle llvm::Constants in sizes arrays. ConstExprs and GlobalValues cannot be used as initializers, need to put them at the runtime, otherwise there wight be the compilation errors. Differential Revision: https://reviews.llvm.org/D105297	2022-02-24 11:49:14 -08:00
Joseph Huber	7aef8b3754	[OpenMP] Make section variable external to prevent collisions Summary: We use a section to embed offloading code into the host for later linking. This is normally unique to the translation unit as it is thrown away during linking. However, if the user performs a relocatable link the sections will be merged and we won't be able to access the files stored inside. This patch changes the section variables to have external linkage and a name defined by the section name, so if two sections are combined during linking we get an error.	2022-02-24 10:57:09 -05:00
Yaxun (Sam) Liu	9d899d8f01	[HIP] Support `-fgpu-default-stream` Introduce -fgpu-default-stream={legacy\|per-thread} option to support per-thread default stream for HIP runtime. When -fgpu-default-stream=per-thread, HIP kernels are launched through hipLaunchKernel_spt instead of hipLaunchKernel. Also HIP_API_PER_THREAD_DEFAULT_STREAM=1 is defined by the preprocessor to enable other per-thread stream API's. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D120298	2022-02-23 22:28:29 -05:00
Fangrui Song	0477cac332	[asan] Allow -fsanitize-address-globals-dead-stripping with -fno-data-sections for ELF -fdata-sections decides whether global variables go into different sections. This is orthogonal to whether we place their metadata (`.data` or `asan_globals`) into different sections. With -fno-data-sections, `-fsanitize-address-globals-dead-stripping` can still: * deduplicate COMDAT `asan.module_ctor` and `asan.module_dtor` * (with ld --gc-sections): for a data section (e.g. `.data`), if all global variables defined relative to it are unreferenced, discard them and associated `asan_globals` sections (rare but no need to exclude this case) Similar to `c7b90947bd` for PE/COFF. Reviewed By: #sanitizers, kstoimenov, vitalybuka Differential Revision: https://reviews.llvm.org/D120394	2022-02-23 16:08:25 -08:00
Joseph Huber	119d71cb73	[OpenMP][NFC] Address warnings and lint messages in CGOpenMPRuntime Summary: This patch addressed the warnings and linting messages for the CGOpenMPRuntime.cpp file. This was causing some -Werror builds to fail.	2022-02-23 18:07:25 -05:00
Reid Kleckner	1d1b089c5d	Fix more unused lambda capture warnings, NFC	2022-02-23 14:07:04 -08:00
Reid Kleckner	cd37594c03	Fix unused lambda capture warning, NFC	2022-02-23 14:01:01 -08:00
Joseph Huber	2b97b16f29	[OpenMP] Add option to make offloading mandatory Currently when we generate OpenMP offloading code we always make fallback code for the CPU. This is necessary for implementing features like conditional offloading and ensuring that unhandled pragmas don't result in missing symbols. However, this is problematic for a few cases. For offloading tests we can silently fail to the host without realizing that offloading failed. Additionally, this makes it impossible to provide interoperabiility to other offloading schemes like HIP or CUDA because those methods do not provide any such host fallback guaruntee. this patch adds the `-fopenmp-offload-mandatory` flag to prevent generating the fallback symbol on the CPU and instead replaces the function with a dummy global and the failed branch with 'unreachable'. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D120353	2022-02-23 16:45:36 -05:00
Arthur Eubanks	4cb24ef90a	[clang] Remove Address::deprecated() from CGClass.cpp	2022-02-23 13:31:56 -08:00
Arthur Eubanks	6eec483584	[clang] Remove getPointerElementType() in EmitVTableTypeCheckedLoad()	2022-02-23 09:38:33 -08:00
Nikita Popov	b1863d8245	[Clang][OpenMP] Remove use of getPointerElementType() This new pointer element type use snuck in via D118632.	2022-02-23 16:14:24 +01:00
Arthur Eubanks	36e335eeb5	[clang] Remove Address::deprecated() calls in CodeGenFunction.cpp	2022-02-22 18:28:49 -08:00
Arthur Eubanks	cde658fa1f	[clang] Remove Address::deprecated() calls in CGVTables.cpp	2022-02-22 16:54:28 -08:00
Arthur Eubanks	3ef7e6c53c	[clang] Remove an Address::deprecated() call in CGClass.cpp	2022-02-22 16:19:06 -08:00
Shilei Tian	104d9a6743	[Clang][OpenMP] Add the codegen support for `atomic compare` This patch adds the codegen support for `atomic compare` in clang. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D118632	2022-02-22 13:01:39 -05:00
Shilei Tian	ccebf8ac8c	[Clang][OpenMP] Add support for compare capture in parser This patch adds the support for `atomic compare capture` in parser and part of sema. We don't create an AST node for this because the spec doesn't say `compare` and `capture` clauses should be used tightly, so we cannot look one more token ahead in the parser. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D116261	2022-02-18 10:23:59 -05:00
Joseph Huber	0870a4f59a	[OpenMP] Add flag for disabling thread state in runtime The runtime uses thread state values to indicate when we use an ICV or are in nested parallelism. This is done for OpenMP correctness, but it not needed in the majority of cases. The new flag added is `-fopenmp-assume-no-thread-state`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120106	2022-02-18 08:35:05 -05:00
Alexander Potapenko	c85a26454d	[asan] Add support for disable_sanitizer_instrumentation attribute For ASan this will effectively serve as a synonym for __attribute__((no_sanitize("address"))). Adding the disable_sanitizer_instrumentation to functions will drop the sanitize_XXX attributes on the IR level. This is the third reland of https://reviews.llvm.org/D114421. Now that TSan test is fixed (https://reviews.llvm.org/D120050) there should be no deadlocks. Differential Revision: https://reviews.llvm.org/D120055	2022-02-18 09:51:54 +01:00
hyeongyukim	b529744c29	[Clang] Rename `disable-noundef-analysis` flag to `-[no-]enable-noundef-analysis` This flag was previously renamed `enable_noundef_analysis` to `disable-noundef-analysis,` which is not a conventional name. (Driver and CC1's boolean options are using [no-] prefix) As discussed at https://reviews.llvm.org/D105169, this patch reverts its name to `[no-]enable_noundef_analysis` and enables noundef-analysis as default. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D119998	2022-02-18 17:02:41 +09:00
Arthur Eubanks	0b5fe2c9f2	[clang] Remove Address::deprecated() in emitVoidPtrDirectVAArg()	2022-02-17 15:05:50 -08:00
Matthew Voss	9ce09099bb	Revert "[CUDA][SPIRV] Assign global address space to CUDA kernel arguments" This reverts commit `9de4fc0f2d`. Reverting due to test failure: https://lab.llvm.org/buildbot/#/builders/139/builds/17199	2022-02-17 14:32:10 -08:00
Arthur Eubanks	ba9944ea1d	[clang] Remove Address::deprecated() in CGCXXABI.h	2022-02-17 14:23:02 -08:00
Arthur Eubanks	0e219af475	[clang] Remove Address::deprecated() call in CGExprCXX.cpp	2022-02-17 13:58:26 -08:00
Shafik Yaghmour	f56cb520d8	[DEBUGINFO] [LLDB] Add support for generating debug-info for structured bindings of structs and arrays Currently we are not emitting debug-info for all cases of structured bindings a C++17 feature which allows us to bind names to subobjects in an initializer. A structured binding is represented by a DecompositionDecl AST node and the binding are represented by a BindingDecl. It looks the original implementation only covered the tuple like case which be represented by a DeclRefExpr which contains a VarDecl. If the binding is to a subobject of the struct the binding will contain a MemberExpr and in the case of arrays it will contain an ArraySubscriptExpr. This PR adds support emitting debug-info for the MemberExpr and ArraySubscriptExpr cases as well as llvm and lldb tests for these cases as well as the tuple case. Differential Revision: https://reviews.llvm.org/D119178	2022-02-17 11:14:14 -08:00
Shangwu Yao	9de4fc0f2d	[CUDA][SPIRV] Assign global address space to CUDA kernel arguments This patch converts CUDA pointer kernel arguments with default address space to CrossWorkGroup address space (__global in OpenCL). This is because Generic or Function (OpenCL's private) is not supported as storage class for kernel pointer types. Differential Revision: https://reviews.llvm.org/D119207	2022-02-17 09:38:06 -08:00
Simon Pilgrim	57fc9798d7	[clang] CGDebugInfo::getOrCreateMethodType - use castAs<> instead of getAs<> to avoid dereference of nullptr The pointer is always dereferenced, so assert the cast is correct instead of returning nullptr	2022-02-17 13:18:23 +00:00
Simon Pilgrim	2614de8202	[clang] CGCXXABI::EmitLoadOfMemberFunctionPointer - use castAs<> instead of getAs<> to avoid dereference of nullptr The pointer is always dereferenced by arrangeCXXMethodType, so assert the cast is correct instead of returning nullptr	2022-02-17 13:18:23 +00:00
Nikita Popov	5065076698	[CodeGen] Rename deprecated Address constructor To make uses of the deprecated constructor easier to spot, and to ensure that no new uses are introduced, rename it to Address::deprecated(). While doing the rename, I've filled in element types in cases where it was relatively obvious, but we're still left with 135 calls to the deprecated constructor.	2022-02-17 11:26:42 +01:00
Nikita Popov	fe3407a91b	[CGBuilder] Assert that CreateAddrSpaceCast does not change element type Address space casts in general may change the element type, but don't allow it in the method working on Address, so we can preserve the element type. CreatePointerBitCastOrAddrSpaceCast() still needs to be addressed.	2022-02-16 15:17:08 +01:00
Chuanqi Xu	d30ca5e2e2	[C++20] [Coroutines] Implement return value optimization for get_return_object This patch tries to implement RVO for coroutine's return object got from get_return_object. From [dcl.fct.def.coroutine]/p7 we could know that the return value of get_return_object is either a reference or a prvalue. So it makes sense to do copy elision for the return value. The return object should be constructed directly into the storage where they would otherwise be copied/moved to. Test Plan: folly, check-all Reviewed By: junparser Differential revision: https://reviews.llvm.org/D117087	2022-02-16 13:38:00 +08:00
David Blaikie	9980a3f831	DebugInfo: Disable simplified template names for -gmlt and below Since -gmlt doesn't carry any type information necessary to rebuild template names.	2022-02-15 11:58:40 -08:00
David Blaikie	1ea326634b	DebugInfo: Don't simplify template names using _BitInt(N) _BitInt(N) only encodes the byte size in DWARF, not the bit size, so can't be reconstituted.	2022-02-15 11:58:40 -08:00
Alexander Potapenko	05ee1f4af8	Revert "[asan] Add support for disable_sanitizer_instrumentation attribute" This reverts commit `dd145f953d`. https://reviews.llvm.org/D119726, like https://reviews.llvm.org/D114421, still causes TSan to fail, see https://lab.llvm.org/buildbot/#/builders/70/builds/18020 Differential Revision: https://reviews.llvm.org/D119838	2022-02-15 15:04:53 +01:00
Alexander Potapenko	dd145f953d	[asan] Add support for disable_sanitizer_instrumentation attribute For ASan this will effectively serve as a synonym for __attribute__((no_sanitize("address"))) This is a reland of https://reviews.llvm.org/D114421 Reviewed By: melver, eugenis Differential Revision: https://reviews.llvm.org/D119726	2022-02-15 14:06:12 +01:00
Momchil Velikov	6398903ac8	Extend the `uwtable` attribute with unwind table kind We have the `clang -cc1` command-line option `-funwind-tables=1\|2` and the codegen option `VALUE_CODEGENOPT(UnwindTables, 2, 0) ///< Unwind tables (1) or asynchronous unwind tables (2)`. However, this is encoded in LLVM IR by the presence or the absence of the `uwtable` attribute, i.e. we lose the information whether to generate want just some unwind tables or asynchronous unwind tables. Asynchronous unwind tables take more space in the runtime image, I'd estimate something like 80-90% more, as the difference is adding roughly the same number of CFI directives as for prologues, only a bit simpler (e.g. `.cfi_offset reg, off` vs. `.cfi_restore reg`). Or even more, if you consider tail duplication of epilogue blocks. Asynchronous unwind tables could also restrict code generation to having only a finite number of frame pointer adjustments (an example of not having a finite number of `SP` adjustments is on AArch64 when untagging the stack (MTE) in some cases the compiler can modify `SP` in a loop). Having the CFI precise up to an instruction generally also means one cannot bundle together CFI instructions once the prologue is done, they need to be interspersed with ordinary instructions, which means extra `DW_CFA_advance_loc` commands, further increasing the unwind tables size. That is to say, async unwind tables impose a non-negligible overhead, yet for the most common use cases (like C++ exceptions), they are not even needed. This patch extends the `uwtable` attribute with an optional value: - `uwtable` (default to `async`) - `uwtable(sync)`, synchronous unwind tables - `uwtable(async)`, asynchronous (instruction precise) unwind tables Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D114543	2022-02-14 14:35:02 +00:00
Nikita Popov	1aeb4c6b50	[ItaniumCXXABI] Avoid pointer element type accesses	2022-02-14 15:17:14 +01:00
Nikita Popov	f208644ed3	[CGBuilder] Remove CreateBitCast() method Use CreateElementBitCast() instead, or don't work on Address where not necessary.	2022-02-14 15:06:04 +01:00
phyBrackets	de4e855204	Refactor nested if else with ternary operator in CGExprScalar.cpp Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D119364	2022-02-13 00:15:35 +05:30
Evgenii Stepanov	a730b6a41a	[NFC] clang-format one function. fix code formatting Differential Revision: https://reviews.llvm.org/D119299	2022-02-11 15:00:29 -08:00
Weverything	d5c314cdf4	[Clang][OpaquePtr] Remove deprecated Address constructor calls Remove most calls to deprcated Address constructor in CGExpr.cpp Differential Revision: https://reviews.llvm.org/D119496	2022-02-11 13:02:09 -08:00
Arthur Eubanks	87dd3d350c	[clang][OpaquePtr] Remove call to getPointerElementType() in CodeGenModule::GetAddrOfGlobalTemporary()	2022-02-11 10:39:49 -08:00
Sameer Sahasrabuddhe	d8f99bb6e0	[AMDGPU] replace hostcall module flag with function attribute The module flag to indicate use of hostcall is insufficient to catch all cases where hostcall might be in use by a kernel. This is now replaced by a function attribute that gets propagated to top-level kernel functions via their respective call-graph. If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the default behaviour is to emit kernel metadata indicating that the kernel uses the hostcall buffer pointer passed as an implicit argument. The attribute may be placed explicitly by the user, or inferred by the AMDGPU attributor by examining the call-graph. The attribute is inferred only if the function is not being sanitized, and the implictarg_ptr does not result in a load of any byte in the hostcall pointer argument. Reviewed By: jdoerfert, arsenm, kpyzhov Differential Revision: https://reviews.llvm.org/D119216	2022-02-11 22:51:56 +05:30
Simon Pilgrim	9ece72c159	[clang] VisitCastExpr - use cast<> instead of dyn_cast<> to avoid dereference of nullptr The pointer is always dereferenced, so assert the cast is correct (which it should be as we just created that ScalableVectorType) instead of returning nullptr	2022-02-11 10:51:34 +00:00
Sander de Smalen	0b41238ae7	[AArch64] Emit TBAA metadata for SVE load/store intrinsics In Clang we can attach TBAA metadata based on the load/store intrinsics based on the operation's element type. This also contains changes to InstCombine where the AArch64-specific intrinsics are transformed into generic LLVM load/store operations, to ensure that all metadata is transferred to the new instruction. There will be some further work after this patch to also emit TBAA metadata for SVE's gather/scatter- and struct load/store intrinsics. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D119319	2022-02-11 09:00:29 +00:00
Arthur Eubanks	e487ddc5c6	[clang][OpaquePtr] Use proper Address constructor in AtomicInfo::getAtomicAddress()	2022-02-10 18:29:51 -08:00
David Blaikie	389f67b35b	DebugInfo: Don't simplify names referencing local enums Due to the way type units work, this would lead to a declaration in a type unit of a local type in a CU - which is ambiguous. Rather than trying to resolve that relative to the CU that references the type unit, let's just not try to simplify these names. Longer term this should be fixed by not putting the template instantiation in a type unit to begin with - since it references an internal linkage type, it can't legitimately be duplicated/in more than one translation unit, so skip the type unit overhead. (but the right fix for that is to move type unit management into a DICompositeType flag (dropping the "identifier" field is not a perfect solution since it breaks LLVM IR linking decl/def merging during IR linking))	2022-02-10 15:51:47 -08:00
David Blaikie	26c5cf8fa0	Fix Windows build that fails if a class has a member with the same naem	2022-02-10 15:27:31 -08:00
Yuanfang Chen	f927021410	Reland "[clang-cl] Support the /JMC flag" This relands commit `b380a31de0`. Restrict the tests to Windows only since the flag symbol hash depends on system-dependent path normalization.	2022-02-10 15:16:17 -08:00
David Blaikie	f3a2cfc103	DebugInfo: Don't simplify any template referencing a lambda Lambda names aren't entirely canonical (as demonstrated by the cross-project-test added here) at the moment (we should fix that for a bunch of reasons) - even if the template referencing them is non-simplified, other names referencing /that/ template can't be simplified either because type units might cause a different template to be picked up that would conflict with the expected name. (other than for roundtripping precision, it'd be OK to simplify types that reference types that reference lambdas - but best be consistent between the roundtrip/verify mode and the actual simplified template names mode)	2022-02-10 14:56:54 -08:00
Yuanfang Chen	b380a31de0	Revert "[clang-cl] Support the /JMC flag" This reverts commit `bd3a1de683`. Break bots: https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-windows-x64/b8822587673277278177/overview	2022-02-10 14:17:37 -08:00
Yuanfang Chen	bd3a1de683	[clang-cl] Support the /JMC flag The introduction and some examples are on this page: https://devblogs.microsoft.com/cppblog/announcing-jmc-stepping-in-visual-studio/ The `/JMC` flag enables these instrumentations: - Insert at the beginning of every function immediately after the prologue with a call to `void __fastcall __CheckForDebuggerJustMyCode(unsigned char *JMC_flag)`. The argument for `__CheckForDebuggerJustMyCode` is the address of a boolean global variable (the global variable is initialized to 1) with the name convention `__<hash>_<filename>`. All such global variables are placed in the `.msvcjmc` section. - The `<hash>` part of `__<hash>_<filename>` has a one-to-one mapping with a directory path. MSVC uses some unknown hashing function. Here I used DJB. - Add a dummy/empty COMDAT function `__JustMyCode_Default`. - Add `/alternatename:__CheckForDebuggerJustMyCode=__JustMyCode_Default` link option via ".drectve" section. This is to prevent failure in case `__CheckForDebuggerJustMyCode` is not provided during linking. Implementation: All the instrumentations are implemented in an IR codegen pass. The pass is placed immediately before CodeGenPrepare pass. This is to not interfere with mid-end optimizations and make the instrumentation target-independent (I'm still working on an ELF port in a separate patch). Reviewed By: hans Differential Revision: https://reviews.llvm.org/D118428	2022-02-10 10:26:30 -08:00
Yaxun (Sam) Liu	1d97cb1f6e	[HIP] Emit amdgpu_code_object_version module flag code object version determines ABI, therefore should not be mixed. This patch emits amdgpu_code_object_version module flag in LLVM IR based on code object version (default 4). The amdgpu_code_object_version value is code object version times 100. LLVM IR with different amdgpu_code_object_version module flag cannot be linked. The -cc1 option -mcode-object-version=none is for ROCm device library use only, which supports multiple ABI. Reviewed by: Artem Belevich Differential Revision: https://reviews.llvm.org/D119026	2022-02-08 21:58:40 -05:00
Bill Wendling	deaf22bc0e	[X86] Implement -fzero-call-used-regs option The "-fzero-call-used-regs" option tells the compiler to zero out certain registers before the function returns. It's also available as a function attribute: zero_call_used_regs. The two upper categories are: - "used": Zero out used registers. - "all": Zero out all registers, whether used or not. The individual options are: - "skip": Don't zero out any registers. This is the default. - "used": Zero out all used registers. - "used-arg": Zero out used registers that are used for arguments. - "used-gpr": Zero out used registers that are GPRs. - "used-gpr-arg": Zero out used GPRs that are used as arguments. - "all": Zero out all registers. - "all-arg": Zero out all registers used for arguments. - "all-gpr": Zero out all GPRs. - "all-gpr-arg": Zero out all GPRs used for arguments. This is used to help mitigate Return-Oriented Programming exploits. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D110869	2022-02-08 17:42:54 -08:00
Arthur Eubanks	f05a63f9a0	[clang] Properly cache member pointer LLVM types When not going through the main Clang->LLVM type cache, we'd accidentally create multiple different opaque types for a member pointer type. This allows us to remove the -verify-type-cache flag now that check-clang passes with it on. We can do the verification in expensive builds. Previously microsoft-abi-member-pointers.cpp was failing with -verify-type-cache. I suspect that there may be more issues when we have multiple member pointer types and we clear the cache, but we can leave that for later. Followup to D118744. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D119215	2022-02-08 13:22:24 -08:00
Dawid Jurczak	5d8d3a11c4	[NFC] Increase initial size of FoldingSets used in ASTContext and CodeGenTypes Among many FoldingSet users most notable seem to be ASTContext and CodeGenTypes. The reasons that we spend not-so-tiny amount of time in FoldingSet calls from there, are following: 1. Default FoldingSet capacity for 2^6 items very often is not enough. For PointerTypes/ElaboratedTypes/ParenTypes it's not unlikely to observe growing it to 256 or 512 items. FunctionProtoTypes can easily exceed 1k items capacity growing up to 4k or even 8k size. 2. FoldingSetBase::GrowBucketCount cost itself is not very bad (pure reallocations are rather cheap thanks to BumpPtrAllocator). What matters is high collision rate when lot of items end up in same bucket slowing down FoldingSetBase::FindNodeOrInsertPos and trashing CPU cache (as items with same hash are organized in intrusive linked list which need to be traversed). This change address both issues by increasing initial size of FoldingSets used in ASTContext and CodeGenTypes. Extracted from: https://reviews.llvm.org/D118385 Differential Revision: https://reviews.llvm.org/D118608	2022-02-08 17:54:04 +01:00
Nikita Popov	18834dca2d	[OpenCL] Mark kernel arguments as ABI aligned Following the discussion on D118229, this marks all pointer-typed kernel arguments as having ABI alignment, per section 6.3.5 of the OpenCL spec: > For arguments to a __kernel function declared to be a pointer to > a data type, the OpenCL compiler can assume that the pointee is > always appropriately aligned as required by the data type. Differential Revision: https://reviews.llvm.org/D118894	2022-02-08 16:12:51 +01:00
Simon Pilgrim	09857a4bd1	[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes: __m256i test_mm256_adds_epi8(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_adds_epi8 // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.}}, <32 x i8> %{{.}}) return _mm256_adds_epi8(a, b); }	2022-02-08 15:00:10 +00:00
Simon Pilgrim	a59faf272e	Revert rG6c174ab2ad0676b295f11f6c3913eff9289fa6b9 "[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat" Missed some legacy builtin tests that need cleaning up first	2022-02-08 14:45:28 +00:00
Simon Pilgrim	6c174ab2ad	[X86] Remove __builtin_ia32_padd/psub saturated intrinsics and use generic __builtin_elementwise_add/sub_sat D117898 added the generic __builtin_elementwise_add_sat and __builtin_elementwise_sub_sat with the same integer behaviour as the SSE/AVX instructions This patch removes the __builtin_ia32_padd/psub saturated intrinsics and just uses the generics - the existing tests see no changes: __m256i test_mm256_adds_epi8(__m256i a, __m256i b) { // CHECK-LABEL: test_mm256_adds_epi8 // CHECK: call <32 x i8> @llvm.sadd.sat.v32i8(<32 x i8> %{{.}}, <32 x i8> %{{.}}) return _mm256_adds_epi8(a, b); }	2022-02-08 14:21:20 +00:00
David Pagan	0a7cc078ac	Enable inoutset dependency-type in depend clause. Done in manner similar to mutexinoutset (see https://reviews.llvm.org/D57576) Runtime support already exists in LLVM OpenMP runtime (see https://reviews.llvm.org/D97085). The value used to identify an inoutset dependency type in the LLVM OpenMP runtime is 8. Some tests updated due to change in dependency type error messages that now include new dependency type. Also updated test/OpenMP/task_codegen.cpp to verify we emit the right code.	2022-02-08 08:35:36 -05:00
Simon Pilgrim	c00db97159	[Clang] Add elementwise saturated add/sub builtins This patch implements `__builtin_elementwise_add_sat` and `__builtin_elementwise_sub_sat` builtins. These map to the add/sub saturated math intrinsics described here: https://llvm.org/docs/LangRef.html#saturation-arithmetic-intrinsics With this in place we should then be able to replace the x86 SSE adds/subs intrinsics with these generic variants - it looks like other targets should be able to use these as well (arm/aarch64/webassembly all have similar examples in cgbuiltin). Differential Revision: https://reviews.llvm.org/D117898	2022-02-08 11:22:01 +00:00
Arthur Eubanks	45084eab5e	[clang] Fix some clang->llvm type cache invalidation issues Take the following as an example struct z { z (p)(); }; z f(); When we attempt to get the LLVM type of f, we recurse into z. z itself has a function pointer with the same type as f. Given the recursion, Clang simply treats z::p as a pointer to an empty struct `{}`. The LLVM type of f is as expected. So we have two different potential LLVM types for a given Clang type. If we store one of those into the cache, when we access the cache with a different context (e.g. we are/aren't recursing on z) we may get an incorrect result. There is some attempt to clear the cache in these cases, but it doesn't seem to handle all cases. This change makes it so we only use the cache when we are not in any sort of function context, i.e. `noRecordsBeingLaidOut() && FunctionsBeingProcessed.empty()`, which are the cases where we may decide to choose a different LLVM type for a given Clang type. LLVM types for builtin types are never recursive so they're always ok. This allows us to clear the type cache less often (as seen with the removal of one of the calls to `TypeCache.clear()`). We still need to clear it when we use a placeholder type then replace it later with the final type and other dependent types need to be recalculated. I've added a check that the cached type matches what we compute. It triggered in this test case without the fix. It's currently not check-clang clean so it's not on by default for something like expensive checks builds. This change uncovered another issue where the LLVM types for an argument and its local temporary don't match. For example in type-cache-3, when expanding z::dc's argument into a temporary alloca, we ConvertType() the type of z::p which is `void ({})`, which doesn't match the alloca GEP type of `{}*`. No noticeable compile time changes: https://llvm-compile-time-tracker.com/compare.php?from=3918dd6b8acf8c5886b9921138312d1c638b2937&to=50bdec9836ed40e38ece0657f3058e730adffc4c&stat=instructions Fixes #53465. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D118744	2022-02-07 18:59:09 -08:00
Arthur Eubanks	2724c153f9	[clang] Cache OpenCL types If we call CGOpenCLRuntime::convertOpenCLSpecificType() multiple times we should get the same type back. Reviewed By: svenvh Differential Revision: https://reviews.llvm.org/D119011	2022-02-07 09:23:04 -08:00
Nikita Popov	c45a99f36b	[MatrixBuilder] Require explicit element type in CreateColumnMajorLoad() This makes the method compatible with opaque pointers.	2022-02-07 16:57:33 +01:00
Nikita Popov	cdc0573f75	[MatrixBuilder] Remove unnecessary IRBuilder template (NFC) IRBuilderBase exists specifically to avoid the need for this.	2022-02-07 16:42:38 +01:00
Yaxun (Sam) Liu	171da443d5	[HIPSPV] Fix literals are mapped to Generic address space This issue is an oversight in D108621. Literals in HIP are emitted as global constant variables with default address space which maps to Generic address space for HIPSPV. In SPIR-V such variables translate to OpVariable instructions with Generic storage class which are not legal. Fix by mapping literals to CrossWorkGroup address space. The literals are not mapped to UniformConstant because the “flat” pointers in HIP may reference them and “flat” pointers are modeled as Generic pointers in SPIR-V. In SPIR-V/OpenCL UniformConstant pointers may not be casted to Generic. Patch by: Henry Linjamäki Reviewed by: Yaxun Liu Differential Revision: https://reviews.llvm.org/D118876	2022-02-05 17:26:52 -05:00
James Y Knight	caa1ebde70	Don't assume that a new cleanup was added to InnermostEHScope. After `fa87fa97fb`, this was no longer guaranteed to be the cleanup just added by this code, if IsEHCleanup got disabled. Instead, use stable_begin(), which _is_ guaranteed to be the cleanup just added. This caused a crash when a object that is callee destroyed (e.g. with the MS ABI) was passed in a call from a noexcept function. Added a test to verify. Fixes: `fa87fa97fb`	2022-02-04 23:39:42 -05:00
Joseph Huber	034adaf5be	[OpenMP] Completely remove old device runtime This patch completely removes the old OpenMP device runtime. Previously, the old runtime had the prefix `libomptarget-new-` and the old runtime was simply called `libomptarget-`. This patch makes the formerly new runtime the only runtime available. The entire project has been deleted, and all references to the `libomptarget-new` runtime has been replaced with `libomptarget-`. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D118934	2022-02-04 15:31:33 -05:00
Shilei Tian	b35be6fe98	[Clang][Sema][OpenMP] Sema support for `atomic compare` This patch adds the Sema support for `atomic compare`. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D116637	2022-02-04 12:30:56 -05:00
Hans Wennborg	853e0aa424	Don't dllexport reference temporaries Even if the reference itself is dllexport, the temporary should not be. In fact, we're already giving it internal linkage, so dllexporting it is not just wasteful, but will fail to link, as in the example below: $ cat /tmp/a.cc void _DllMainCRTStartup() {} const int __declspec(dllexport) &foo = 42; $ clang-cl -fuse-ld=lld /tmp/a.cc /Zl /link /dll /out:a.dll lld-link: error: <root>: undefined symbol: int const &foo::$RT1 Differential revision: https://reviews.llvm.org/D118980	2022-02-04 16:31:51 +01:00
John Brawn	bca998ed3c	[AArch64] Generate fcmps when appropriate for neon intrinsics Differential Revision: https://reviews.llvm.org/D118257	2022-02-04 12:55:38 +00:00
Jan Svoboda	42afaf7f47	[clang][CodeGen] Use memory type representation in `va_arg` Some types (e.g. `_Bool`) have different scalar and memory representations. CodeGen for `va_arg` didn't take this into account, leading to an assertion failures with different types. This patch makes sure we use memory representation for `va_arg`. Reviewed By: ahatanak Differential Revision: https://reviews.llvm.org/D118904	2022-02-04 12:10:57 +01:00
James Y Knight	fa87fa97fb	Skip exception cleanups when the innermost scope is EHTerminateScope. EHTerminateScope is used to implement C++ noexcept semantics. Per C++ [except.terminate], it is implemented-defined whether no, some, or all cleanups are run prior to terminatation. Therefore, the code to run cleanups on the way towards termination is unnecessary, and may be omitted. After this change, we will still run some cleanups: any cleanups in a function called from the noexcept function will continue to run, while those in the noexcept function itself will not. (Commit attempt 2: check InnermostEHScope != stable_end() before accessing it.) Differential Revision: https://reviews.llvm.org/D113620	2022-02-02 17:50:18 -05:00
Rainer Orth	efdd0a29b7	[clang][Sparc] Fix __builtin_extract_return_addr etc. While investigating the failures of `symbolize_pc.cpp` and `symbolize_pc_inline.cpp` on SPARC (both Solaris and Linux), I noticed that `__builtin_extract_return_addr` is a no-op in `clang` on all targets, while `gcc` has non-default implementations for arm, mips, s390, and sparc. This patch provides the SPARC implementation. For background see `SparcISelLowering.cpp` (`SparcTargetLowering::LowerReturn_32`), the SPARC psABI p.3-12, `%i7` and p.3-16/17, and SCD 2.4.1, p.3P-10, `%i7` and p.3P-15. Tested (after enabling the `sanitizer_common` tests on SPARC) on `sparcv9-sun-solaris2.11`. Differential Revision: https://reviews.llvm.org/D91607	2022-02-02 19:20:02 +01:00
Alex Lorenz	116c1bea65	[clang][macho] add clang frontend support for emitting macho files with two build version load commands This patch extends clang frontend to add metadata that can be used to emit macho files with two build version load commands. It utilizes "darwin.target_variant.triple" and "darwin.target_variant.SDK Version" metadata names for that. MachO uses two build version load commands to represent an object file / binary that is targeting both the macOS target, and the Mac Catalyst target. At runtime, a dynamic library that supports both targets can be loaded from either a native macOS or a Mac Catalyst app on a macOS system. We want to add support to this to upstream to LLVM to be able to build compiler-rt for both targets, to finish the complete support for the Mac Catalyst platform, which is right now targetable by upstream clang, but the compiler-rt bits aren't supported because of the lack of this multiple build version support. Differential Revision: https://reviews.llvm.org/D115415	2022-02-02 08:30:39 -08:00
serge-sans-paille	e188aae406	Cleanup header dependencies in LLVMCore Based on the output of include-what-you-use. This is a big chunk of changes. It is very likely to break downstream code unless they took a lot of care in avoiding hidden ehader dependencies, something the LLVM codebase doesn't do that well :-/ I've tried to summarize the biggest change below: - llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h - llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h - llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h - llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h - llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h - llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h - llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h And the usual count of preprocessed lines: $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions \| wc -l before: 6400831 after: 6189948 200k lines less to process is no that bad ;-) Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D118652	2022-02-02 06:54:20 +01:00
Joseph Huber	53d5757ea2	[OpenMP] Add kernel string attribute to kernel function This patch adds a function attribute to the kernel function generated in OpenMP offloading. We already create a `nvvm.annotations` metadata node indicating the kernels present in the program. However, this created some indirection when trying to identify if a specific function was an entry. We add a single function attribute for each function now to simplify this. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D118708	2022-02-01 13:49:31 -05:00
Fangrui Song	7aaf024dac	[BitcodeWriter] Fix cases of some functions `WriteIndexToFile` is used by external projects so I do not touch it.	2022-01-31 16:46:11 -08:00
Fangrui Song	85dfe19b36	[ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils D116542 adds EmbedBufferInModule which introduces a layer violation (https://llvm.org/docs/CodingStandards.html#library-layering). See `2d5f857a1e` for detail. EmbedBufferInModule does not use BitcodeWriter functionality and should be moved LLVMTransformsUtils. While here, change the function case to the prevailing convention. It seems that EmbedBufferInModule just follows the steps of EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR update operations which ideally should be refactored to another library. Reviewed By: jhuber6 Differential Revision: https://reviews.llvm.org/D118666	2022-01-31 16:33:57 -08:00
Itay Bookstein	2a868802a3	[clang][CodeGen][NFC] Remove unused CodeGenModule fields Signed-off-by: Itay Bookstein <ibookstein@gmail.com> Reviewed By: erichkeane Differential Revision: https://reviews.llvm.org/D118619	2022-01-31 23:45:53 +02:00
Joseph Huber	551b177452	[OpenMP] Add a flag for embedding a file into the module This patch adds support for a flag `-fembed-offload-binary` to embed a file as an ELF section in the output by placing it in a global variable. This can be used to bundle offloading files with the host binary so it can be accessed by the linker. The section is named using the `-fembed-offload-section` option. Depends on D116541 Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D116542	2022-01-31 15:56:00 -05:00
tyb0807	51e188d079	[AArch64] Support for memset tagged intrinsic This introduces a new ACLE intrinsic for memset tagged (https://github.com/ARM-software/acle/blob/next-release/main/acle.md#memcpy-family-of-operations-intrinsics---mops). void __builtin_arm_mops_memset_tag(void , int, size_t) A corresponding LLVM intrinsic is introduced: i8* llvm.aarch64.mops.memset.tag(i8*, i8, i64) The types match llvm.memset but the return type is not void. This is part 1/4 of a series of patches split from https://reviews.llvm.org/D117405 to facilitate reviewing. Patch by Tomas Matheson Differential Revision: https://reviews.llvm.org/D117753	2022-01-31 20:49:34 +00:00
Ben Shi	653836251a	[clang][AVR] Set '-fno-use-cxa-atexit' to default AVR is baremetal environment, so the avr-libc does not support '__cxa_atexit()'. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D118445	2022-01-30 02:26:19 +00:00
Weverything	be2147db05	Remove reference type when checking const structs ConstStructBuilder::Finalize in CGExprConstant.ccp assumes that the passed in QualType is a RecordType. In some instances, the type is a reference to a RecordType and the reference needs to be removed first. Differential Revision: https://reviews.llvm.org/D117376	2022-01-28 13:08:58 -08:00
Amilendra Kodithuwakku	1f08b08674	[clang][ARM] Emit warnings when PACBTI-M is used with unsupported architectures Branch protection in M-class is supported by - Armv8.1-M.Main - Armv8-M.Main - Armv7-M Attempting to enable this for other architectures, either by command-line (e.g -mbranch-protection=bti) or by target attribute in source code (e.g. __attribute__((target("branch-protection=..."))) ) will generate a warning. In both cases function attributes related to branch protection will not be emitted. Regardless of the warning, module level attributes related to branch protection will be emitted when it is enabled via the command-line. The following people also contributed to this patch: - Victor Campos Reviewed By: chill Differential Revision: https://reviews.llvm.org/D115501	2022-01-28 09:59:58 +00:00
Joseph Huber	2945f11c60	[OpenMP] Only generate runtime flags with host input This patch changes the code generation of runtime flags to only occur if a host bitcode file was passed in. This is a cheap way to determine if we are compiling the OpenMP device runtime itself or user code. This is needed because the global flags we generate for the device runtime e.g. __omp_rtl_debug_kind were being generated with default values when we compiled the runtime library. This would then invalidate the ones we want to be able to add in when the user defines it. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D118399	2022-01-27 18:43:41 -05:00
Arthur Eubanks	662ef6d177	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in VisitArrayInitLoopExpr With this we can bootstrap an `-O0 -g0` clang with `-mllvm -opaque-pointers`!	2022-01-27 14:44:53 -08:00
Arthur Eubanks	6e8a66bdad	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in EmitCXXMemberDataPointerAddress()	2022-01-27 14:44:53 -08:00
Arthur Eubanks	f17123831e	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in CreateTempAlloca() Specify the Address element type, which is the bitcast destination type. (the whole bitcast won't be needed after opaque pointers)	2022-01-27 14:18:54 -08:00
Arthur Eubanks	63cf2063a2	[NFC][Clang][OpaquePtr] Move away from deprecated Address constructor in EmitNewArrayInitializer() Specify the Address element type, which is the same for all pointers in the array.	2022-01-27 14:00:16 -08:00
Sri Hari Krishna Narayanan	5aa24558cf	OMPIRBuilder for Interop directive Implements the OMPIRBuilder portion for the Interop directive. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D105876	2022-01-27 14:53:18 -05:00
David Green	82973edfb7	[ARM][AArch64] Introduce qrdmlah and qrdmlsh intrinsics Since it's introduction, the qrdmlah has been represented as a qrdmulh and a sadd_sat. This doesn't produce the same result for all input values though. This patch fixes that by introducing a qrdmlah (and qrdmlsh) intrinsic specifically for the vqrdmlah and sqrdmlah instructions. The old test cases will now produce a qrdmulh and sqadd, as expected. Fixes #53120 and #50905 and #51761. Differential Revision: https://reviews.llvm.org/D117592	2022-01-27 19:19:46 +00:00
Dawid Jurczak	b88ca619d3	[NFC][CodeGen] Use llvm::DenseMap for DeferredDecls CodeGenModule::DeferredDecls std::map::operator[] seem to be hot especially while code generating huge compilation units. In such cases using DenseMap instead gives observable compile time improvement. Patch was tested on Linux build with default config acting as benchmark. Build was performed on isolated CPU cores in silent x86-64 Linux environment following: https://llvm.org/docs/Benchmarking.html#linux rules. Compile time statistics diff produced by perf and time before and after change are following: instructions -0.15%, cycles -0.7%, max-rss +0.65%. Using StringMap instead DenseMap doesn't bring any visible gains. Differential Revision: https://reviews.llvm.org/D118169	2022-01-27 10:57:48 +01:00
Ahmed Bougacha	ecb502342c	[ObjC] Emit selector load right before msgSend call. We currently emit the selector load early, but only because we need it to compute the signature (so that we know which msgSend variant to call). We can prepare the signature with a plain undef, and replace it with the materialized selector value if (and only if) needed, later. Concretely, this usually doesn't have an effect, but tests need updating because we reordered the receiver bitcast and the selector load, which is always fine. There is one notable change: with this, when a msgSend needs a receiver null check, the selector is now loaded in the non-null block, instead of before the null check. That should be a mild improvement.	2022-01-26 20:52:54 -08:00
Arthur Eubanks	eee97f1617	[clang] Use proper type to left shift after D117262 Causing warnings like warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits as reported in D117262.	2022-01-26 17:54:37 -08:00
Arthur Eubanks	6a953d931c	[clang] Fix -Wsubobject-linkage after D117262 /home/buildbot/llvm-avr-linux/llvm-avr-linux/llvm/clang/lib/CodeGen/Address.h:76:7: warning: 'clang::CodeGen::Address' has a field 'clang::CodeGen::Address::A' whose type uses the anonymous namespace [-Wsubobject-linkage] https://lab.llvm.org/buildbot/#/builders/112/builds/12047	2022-01-26 11:43:44 -08:00
Arthur Eubanks	b1613f05ae	[NFC] Store Address's alignment into PointerIntPairs This mitigates the extra memory caused by D115725. On 32-bit arches where we only have 2 bits per PointerIntPair we fall back to simply storing alignment separately. Reviewed By: rnk, nikic Differential Revision: https://reviews.llvm.org/D117262	2022-01-26 10:35:28 -08:00
Benjamin Kramer	f15014ff54	Revert "Rename llvm::array_lengthof into llvm::size to match std::size from C++17" This reverts commit `ef82063207`. - It conflicts with the existing llvm::size in STLExtras, which will now never be called. - Calling it without llvm:: breaks C++17 compat	2022-01-26 16:55:53 +01:00
serge-sans-paille	ef82063207	Rename llvm::array_lengthof into llvm::size to match std::size from C++17 As a conquence move llvm::array_lengthof from STLExtras.h to STLForwardCompat.h (which is included by STLExtras.h so no build breakage expected).	2022-01-26 16:17:45 +01:00
JackAKirk	0ad19a8331	[CUDA,NVPTX] Corrected fragment size for tf32 LD B matrix. Signed-off-by: JackAKirk <jack.kirk@codeplay.com> Reviewed By: tra Differential Revision: https://reviews.llvm.org/D118023	2022-01-25 11:29:19 -08:00
Nikita Popov	30d4a7e295	[IRBuilder] Require explicit element type in CreatePtrDiff() For opaque pointer compatibility, we cannot derive the element type from the pointer type.	2022-01-25 12:43:57 +01:00
Nikita Popov	caff8591ef	[OpenMP] Simplify pointer comparison Rather than checking ptrdiff(a, b) != 0, directly check a != b.	2022-01-25 12:38:37 +01:00
Nikita Popov	99adacbcb7	[clang] Remove some getPointerElementType() uses Same cases where the call can be removed in a straightforward way.	2022-01-25 12:09:06 +01:00

... 5 6 7 8 9 ...

15545 Commits