llvm-project

Commit Graph

Author	SHA1	Message	Date
Zequan Wu	387620aa8c	Reland "[LTO][COFF] Use bitcode file names in lto native object file names." This reverts commit `eef5405f74`.	2022-11-22 11:26:18 -08:00
Zequan Wu	eef5405f74	Revert "[LTO][COFF] Use bitcode file names in lto native object file names." This reverts commit `531ed6d5aa`.	2022-11-22 10:55:05 -08:00
Zequan Wu	531ed6d5aa	[LTO][COFF] Use bitcode file names in lto native object file names. Currently the lto native object files have names like main.exe.lto.1.obj. In PDB, those names are used as names for each compiland. Microsoft’s tool SizeBench uses those names to present to users the size of each object files. So, names like main.exe.lto.1.obj is not user friendly. This patch makes the lto native object file names more readable by using the bitcode file names as part of the file names. For example, if the input bitcode file has path like "path/to/foo.obj", its corresponding lto native object file path would be "path/to/main.exe.lto.foo.obj". Since the lto native object file name only bothers PDB, this patch only changes the lld-linker's behavior. Reviewed By: tejohnson, MaskRay, #lld-macho Differential Revision: https://reviews.llvm.org/D137217	2022-11-22 10:19:58 -08:00
Krzysztof Parzyszek	b805853ccb	[Hexagon] Make local array static in getIntrinsicForHexagonNonClangBuiltin It should not be created on every call, the omission of `static` was a bug in the patch that introduced it.	2022-11-22 09:48:01 -08:00
Jan Sjodin	969d787a47	[OpenMP][OMPIRBuilder] Add a configuration class that captures flags that affect codegen This patch introudces the OpenMPIRBuilderConfig class which contains various flags that are needed to lower OMP constructs to LLVM-IR. The purpose is to keep the flags in one place so they do not have to be passed in every time. The flags can be set optionally since some uses cases don't rely on functions that depend on these flags. Reviewed By: jdoerfert, tschuett Differential Revision: https://reviews.llvm.org/D138220	2022-11-22 09:25:04 -05:00
Stefan Gränitz	9a9d636cae	[CGObjC] Open cleanup scope before SaveAndRestore CurrentFuncletPad and push CatchRetScope early Pushing the `CatchRetScope` early causes cleanups for catch parameters to be emitted in the basic block of the catch handler instead of the `catchret.dest` block. This is important because the latter is not part of the catchpad and this caused code truncations due to ARC PreISel intrinsics in WinEHPrepare. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D137939	2022-11-22 12:02:53 +01:00
Kazu Hirata	6ba4b62af8	Return None instead of Optional<T>() (NFC) This patch replaces: return Optional<T>(); with: return None; to make the migration from llvm::Optional to std::optional easier. Specifically, I can deprecate None (in my source tree, that is) to identify all the instances of None that should be replaced with std::nullopt. Note that "return None" far outnumbers "return Optional<T>();". There are more than 2000 instances of "return None" in our source tree. All of the instances in this patch come from functions that return Optional<T> except Archive::findSym and ASTNodeImporter::import, where we return Expected<Optional<T>>. Note that we can construct Expected<Optional<T>> from any parameter convertible to Optional<T>, which None certainly is. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716 Differential Revision: https://reviews.llvm.org/D138464	2022-11-21 19:06:42 -08:00
Thomas Lively	ae96b5bd2d	[WebAssembly] Update relaxed-simd instruction names Including builtin and intrinsic names. These should be the final names for the proposal. https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md Reviewed By: aheejin, maratyszcza Differential Revision: https://reviews.llvm.org/D138249	2022-11-21 12:40:15 -08:00
gonglingqin	c2ec455f18	[LoongArch] Add intrinsics for ibar, break and syscall Diagnostics for intrinsic input parameters have also been added. Differential Revision: https://reviews.llvm.org/D138094	2022-11-21 09:31:26 +08:00
Kazu Hirata	30f9eb1eb8	[clang] Remove unused forward declarations (NFC)	2022-11-20 14:32:17 -08:00
Fangrui Song	d1163784b5	Remove unused llvm/IRPrinter/IRPrintingPasses.h or reorder #include after D137768	2022-11-19 22:09:05 +00:00
Alex Richardson	0745b0c035	Fix incorrect cast in VisitSYCLUniqueStableNameExpr Clang language-level address spaces and LLVM pointer address spaces are not the same thing (even though they will both have a numeric value of zero in many cases). LangAS is a enum class to avoid implicit conversions, but `eba69b59d1` avoided the compiler error by adding a `static_cast<>`. While touching this code, simplify it by using CreatePointerBitCastOrAddrSpaceCast() which is already a no-op if the types match. This changes the code generation for spir64 to place the globals in the sycl_global addreds space, which maps to `addrspace(1)`. Reviewed By: bader Differential Revision: https://reviews.llvm.org/D138284	2022-11-19 11:43:17 +00:00
yronglin	80f444646c	[CodeGen][ARM] Fix ARMABIInfo::EmitVAAarg crash with empty record type variadic arg Fix ARMABIInfo::EmitVAAarg crash with empty record type variadic arg Open issue: https://github.com/llvm/llvm-project/issues/58794 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D138137	2022-11-19 15:14:10 +08:00
Jennifer Yu	9d90cf2fca	[OPENMP5.1] Initial support for message clause.	2022-11-18 17:59:23 -08:00
Xing Xue	fa7477eb87	[Clang][CodeGen][AIX] Map __builtin_frexpl, __builtin_ldexpl, and __builtin_modfl to 'double' version lib calls in 64-bit 'long double' mode Summary: AIX library functions frexpl(), ldexpl(), and modfl() are for 128-bit IBM long double, i.e. __ibm128. Other *l() functions, e.g., acosl(), are for 64-bit long double. The AIX Clang compiler currently maps builtin functions __builtin_frexpl(), __builtin_ldexpl(), and __builtin_modfl() to frexpl(), ldexpl(), and modfl() in 64-bit long double mode which results in seg-faults or incorrect return values. This patch changes to map __builtin_frexpl(), __builtin_ldexpl(), and __builtin_modfl() to double version lib functions frexp(), ldexp() and modf() in 64-bit long double mode. Reviewed by: hubert.reinterpretcast, daltenty Differential Revision: https://reviews.llvm.org/D137986	2022-11-18 11:36:56 -05:00
Alexander Shaposhnikov	f102fe7304	Revert "Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm"" This reverts commit `7f608a2497` and removes the dependency of Object on IRPrinter.	2022-11-18 08:58:31 +00:00
Mikhail Goncharov	7f608a2497	Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm" This reverts commit `34ab474348`. as it has introduced circular dependency lib - analysis	2022-11-18 09:25:45 +01:00
Alexander Shaposhnikov	34ab474348	[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm Enable using -module-summary with -S (similarly to what currently can be achieved with opt <input> -o - \| llvm-dis). This is a recommit of `ef9e62469`. Test plan: ninja check-all Differential revision: https://reviews.llvm.org/D137768	2022-11-18 05:04:07 +00:00
Alexander Shaposhnikov	7059a6c32c	[IR] Split out IR printing passes into IRPrinter This diff splits out (from LLVMCore) IR printing passes into IRPrinter. This structure is similar to what we already have for IRReader and enables us to avoid circular dependencies between LLVMCore and Analysis (this is a preparation for https://reviews.llvm.org/D137768). The legacy interface is left unchanged, once the legacy pass manager is removed (in the future) we will be able to clean it up further. The bazel build configuration has been updated as well. Test plan: 1/ Tested the following cmake configurations: static/dynamic linking * lld/gold * clang/gcc 2/ bazel build --config=generic_clang @llvm-project//... Differential revision: https://reviews.llvm.org/D138081	2022-11-18 01:47:56 +00:00
Jennifer Yu	1e054e6b52	[OPENMP5.1] Initial support for severity clause Differential Revision:https://reviews.llvm.org/D138227	2022-11-17 16:05:02 -08:00
Doru Bercea	98bfd7f976	Fix declare target implementation to support enter.	2022-11-17 17:35:53 -06:00
Fangrui Song	fc91c70593	Revert D135411 "Add generic KCFI operand bundle lowering" This reverts commit `eb2a57ebc7`. llvm/include/llvm/Transforms/Instrumentation/KCFI.h including llvm/CodeGen is a layering violation. We should use an approach where Instrumementation/ doesn't need to include CodeGen/. Sorry for not spotting this in the review.	2022-11-17 22:45:30 +00:00
Sami Tolvanen	eb2a57ebc7	Add generic KCFI operand bundle lowering The KCFI sanitizer emits "kcfi" operand bundles to indirect call instructions, which the LLVM back-end lowers into an architecture-specific type check with a known machine instruction sequence. Currently, KCFI operand bundle lowering is supported only on 64-bit X86 and AArch64 architectures. As a lightweight forward-edge CFI implementation that doesn't require LTO is also useful for non-Linux low-level targets on other machine architectures, add a generic KCFI operand bundle lowering pass that's only used when back-end lowering support is not available and allows -fsanitize=kcfi to be enabled in Clang on all architectures. Reviewed By: nickdesaulniers, MaskRay Differential Revision: https://reviews.llvm.org/D135411	2022-11-17 21:55:00 +00:00
Ben Shi	84ef723573	[clang] Fix wrong ABI of AVRTiny. A scalar which exceeds 4 bytes should be returned via a stack slot, on an AVRTiny device. Reviewed By: aykevl Differential Revision: https://reviews.llvm.org/D138125	2022-11-17 08:38:44 +08:00
Eli Friedman	0fcb26c5b6	[clang] Fix __try/__finally blocks in C++ constructors. We were crashing trying to convert a GlobalDecl from a CXXConstructorDecl. Instead of trying to do that conversion, just pass down the original GlobalDecl. I think we could actually compute the correct constructor/destructor kind from the context, given the way Microsoft mangling works, but it's simpler to just pass through the correct constructor/destructor kind. Differential Revision: https://reviews.llvm.org/D136776	2022-11-16 15:13:33 -08:00
Tom Honermann	3e25ae605e	[Clang] Correct when Itanium ABI guard variables are set for non-block variables with static or thread storage duration. Previously, Itanium ABI guard variables were set after initialization was complete for non-block declared variables with static and thread storage duration. That resulted in initialization of such variables being restarted in cases where the variable was referenced while it was still under construction. Per C++20 [class.cdtor]p2, such references are permitted (though the value obtained by such an access is unspecified). The late initialization resulted in recursive reinitialization loops for cases like this: template<typename T> struct ct { struct mc { mc() { ct<T>::smf(); } void mf() const {} }; thread_local static mc tlsdm; static void smf() { tlsdm.mf(); } }; template<typename T> thread_local typename ct<T>::mc ct<T>::tlsdm; int main() { ct<int>::smf(); } With this change, guard variables are set before initialization is started so as to avoid such reinitialization loops. Fixes https://github.com/llvm/llvm-project/issues/57828 Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D135919	2022-11-16 16:31:35 -05:00
Jennifer Yu	628fdc3f57	[OPENMP]Initial support for at clause Error directive is allowed in both declared and executable contexts. The function ActOnOpenMPAtClause is called in both places during the parsers. Adding a param "bool InExContext" to identify context which is used to emit error massage. Differential Revision: https://reviews.llvm.org/D137851	2022-11-15 14:06:50 -08:00
Michele Scandale	b7d7c448df	Fix `unsafe-fp-math` attribute emission. The conditions for which Clang emits the `unsafe-fp-math` function attribute has been modified as part of `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7`. In the backend code generators `"unsafe-fp-math"="true"` enable floating point contraction for the whole function. The intent of the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` was to prevent backend code generators performing contractions when that is not expected. However the change is inaccurate and incomplete because it allows `unsafe-fp-math` to be set also when only in-statement contraction is allowed. Consider the following example ``` float foo(float a, float b, float c) { float tmp = a * b; return tmp + c; } ``` and compile it with the command line ``` clang -fno-math-errno -funsafe-math-optimizations -ffp-contract=on \ -O2 -mavx512f -S -o - ``` The resulting assembly has a `vfmadd213ss` instruction which corresponds to a fused multiply-add. From the user perspective there shouldn't be any contraction because the multiplication and the addition are not in the same statement. The optimized IR is: ``` define float @test(float noundef %a, float noundef %b, float noundef %c) #0 { %mul = fmul reassoc nsz arcp afn float %b, %a %add = fadd reassoc nsz arcp afn float %mul, %c ret float %add } attributes #0 = { [...] "no-signed-zeros-fp-math"="true" "no-trapping-math"="true" [...] "unsafe-fp-math"="true" } ``` The `"unsafe-fp-math"="true"` function attribute allows the backend code generator to perform `(fadd (fmul a, b), c) -> (fmadd a, b, c)`. In the current IR representation there is no way to determine the statement boundaries from the original source code. Because of this for in-statement only contraction the generated IR doesn't have instructions with the `contract` fast-math flag and `llvm.fmuladd` is being used to represent contractions opportunities that occur within a single statement. Therefore `"unsafe-fp-math"="true"` can only be emitted when contraction across statements is allowed. Moreover the change in `84a9ec2ff1ee97fd7e8ed988f5e7b197aab84a7` doesn't take into account that the floating point math function attributes can be refined during IR code generation of a function to handle the cases where the floating point math options are modified within a compound statement via pragmas (see `CGFPOptionsRAII`). For consistency `unsafe-fp-math` needs to be disabled if the contraction mode for any scope/operation is not `fast`. Similarly for consistency reason the initialization of `UnsafeFPMath` of in `TargetOptions` for the backend code generation should take into account the contraction mode as well. Reviewed By: zahiraam Differential Revision: https://reviews.llvm.org/D136786	2022-11-14 20:40:57 -08:00
Fangrui Song	77bf0df376	Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm" This reverts commit `bf8381a8bc`. There is a layering violation: LLVMAnalysis depends on LLVMCore, so LLVMCore should not include LLVMAnalysis header llvm/Analysis/ModuleSummaryAnalysis.h	2022-11-14 15:51:03 -08:00
Alexander Shaposhnikov	bf8381a8bc	[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm Enable using -module-summary with -S (similarly to what currently can be achieved with opt <input> -o - \| llvm-dis). This is a recommit of `ef9e62469`. Test plan: ninja check-all Differential revision: https://reviews.llvm.org/D137768	2022-11-14 23:24:08 +00:00
Alexander Shaposhnikov	8c15c17e3b	Revert "[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm" This reverts commit `ef9e624694` for further investigation offline. It appears to break the buildbot llvm-clang-x86_64-sie-ubuntu-fast.	2022-11-14 21:31:30 +00:00
Alexander Shaposhnikov	ef9e624694	[opt][clang] Enable using -module-summary/-flto=thin with -S/-emit-llvm Enable using -module-summary with -S (similarly to what currently can be achieved with opt <input> -o - \| llvm-dis). Test plan: ninja check-all Differential revision: https://reviews.llvm.org/D137768	2022-11-14 21:11:07 +00:00
Arthur Eubanks	cbcf123af2	[LegacyPM] Remove cl::opts controlling optimization pass manager passes Move these to the new PM if they're used there. Part of removing the legacy pass manager for optimization pipeline. Reland with UseNewGVN usage in clang removed. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D137915	2022-11-14 09:38:17 -08:00
Akash Banerjee	87f652d31f	Migrate getOrCreateInternalVariable from Clang to OMPIRBuilder. This patch removes getOrCreateInternalVariable from Clang OMP CodeGen and replaces it's uses with OMPBuilder::getOrCreateInternalVariable. Also refactors OMPBuilder::getOrCreateInternalVariable to change type of name from Twine to StringRef Differential Revision: https://reviews.llvm.org/D137720	2022-11-14 17:18:10 +00:00
Matt Arsenault	76db4e3c43	clang: Fix unnecessary truncation of resource limit values	2022-11-11 16:38:51 -08:00
Joshua Batista	a5d14f757b	Add builtin_elementwise_sin and builtin_elementwise_cos Add codegen for llvm cos and sin elementwise builtins The sin and cos elementwise builtins are necessary for HLSL codegen. Tests were added to make sure that the expected errors are encountered when these functions are given inputs of incompatible types. The new builtins are restricted to floating point types only. Reviewed By: craig.topper, fhahn Differential Revision: https://reviews.llvm.org/D135011	2022-11-10 23:30:27 -08:00
gonglingqin	da34aff90d	[Clang][LoongArch] Implement __builtin_loongarch_crc_w_d_w builtin and add diagnostics This patch adds support to prevent __builtin_loongarch_crc_w_d_w from compiling on loongarch32 in the front end and adds diagnostics accordingly. Reference: https://github.com/gcc-mirror/gcc/blob/master/gcc/config/loongarch/larchintrin.h#L175-L184 Depends on D136906 Differential Revision: https://reviews.llvm.org/D137316	2022-11-11 09:16:57 +08:00
gonglingqin	85f08c4197	[Clang][LoongArch] Implement __builtin_loongarch_dbar builtin Differential Revision: https://reviews.llvm.org/D136906	2022-11-10 17:27:44 +08:00
Matt Jacobson	dd9f7963e4	[ObjC] avoid crashing when emitting synthesized getter/setter and ptrdiff_t is smaller than long On targets where ptrdiff_t is smaller than long, clang crashes when emitting synthesized getters/setters that call objc_[gs]etProperty. Explicitly emit a zext/trunc of the ivar offset value (which is defined to long) to ptrdiff_t, which objc_[gs]etProperty takes. Add a test using the AVR target, where ptrdiff_t is smaller than long. Test failed previously and passes now. Differential Revision: https://reviews.llvm.org/D112049	2022-11-10 02:10:30 -05:00
OCHyams	4b6b2b1a42	Reapply: [Assignment Tracking][7/*] Add assignment tracking functionality to clang Reverted in `98fa95492f`. The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/ rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in the previous patch in this set, into the optimisation pipeline from clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the main test for this patch. Note: while clang (with the help of the declare-to-assign pass) can now emit Assignment Tracking metadata, the llvm middle and back ends don't yet understand it. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D132226	2022-11-09 09:28:41 +00:00
OCHyams	98fa95492f	Revert "[Assignment Tracking][7/*] Add assignment tracking functionality to clang" This reverts commit `28f9636edd`. Bot failure: https://lab.llvm.org/buildbot/#/builders/109/builds/50251	2022-11-08 18:43:05 +00:00
OCHyams	28f9636edd	[Assignment Tracking][7/*] Add assignment tracking functionality to clang The Assignment Tracking debug-info feature is outlined in this RFC: https://discourse.llvm.org/t/ rfc-assignment-tracking-a-better-way-of-specifying-variable-locations-in-ir This patch plumbs the AssignmentTrackingPass (AKA declare-to-assign), added in the previous patch in this set, into the optimisation pipeline from clang. clang/test/CodeGen/assignment-tracking/assignment-tracking.cpp is the main test for this patch. Note: while clang (with the help of the declare-to-assign pass) can now emit Assignment Tracking metadata, the llvm middle and back ends don't yet understand it. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D132226	2022-11-08 17:49:08 +00:00
Rageking8	94738a5ac3	Fix duplicate word typos; NFC This revision fixes typos where there are 2 consecutive words which are duplicated. There should be no code changes in this revision (only changes to comments and docs). Do let me know if there are any undesirable changes in this revision. Thanks.	2022-11-08 07:21:23 -05:00
Grace Jennings	86674f66cc	[HLSL] Added HLSL this as a reference This change makes `this` a reference instead of a pointer in HLSL. HLSL does not have the `->` operator, and accesses through `this` are with the `.` syntax. Tests were added and altered to make sure the AST accurately reflects the types. Reviewed By: aaron.ballman Differential Revision: https://reviews.llvm.org/D135721	2022-11-07 13:50:08 -08:00
Matthias Braun	cafe50daf5	Explicitly initialize opaque pointer mode in CodeGenAction Explicitly call `LLVMContext::setOpaquePointers` in `CodeGenAction` before loading any IR files. With this we use the mode specified on the command-line rather than lazily initializing it based on the contents of the IR. This helps when using `-fthinlto-index` which may end up mixing files with typed and opaque pointer types which fails when the first file happened to use typed pointers since we cannot downgrade IR with opaque pointer types to typed pointer types. Differential Revision: https://reviews.llvm.org/D137475	2022-11-07 12:31:28 -08:00
Jennifer Yu	de14befa77	Remove redundant loads. It is caused by regenerate captured var value when processing the has_device_addr, the captured var value has been generated in GenerateOpenMPCapturedVars and passed as Arg in generateInfoForCapture. The fix just use Arg instead regenerated just same as is_device_ptr	2022-11-04 15:22:25 -07:00
Mike Rice	c954cfeb57	Some uses of the preprocessor can result in multiple target regions on the same line. Cases such as those in the associated lit tests, can now be supported. This adds a 'Count' field to TargetRegionEntryInfo to differentiate regions with the same source position. The OffloadEntriesInfoManager routines are updated to maintain a count of regions seen at a location. The registration of regions proceeds that same as before, but now the next available count is always determined and used in the offload entry. Fixes: https://github.com/llvm/llvm-project/issues/52707 Differential Revision: https://reviews.llvm.org/D134816	2022-11-04 12:54:22 -07:00
Nikita Popov	304f1d59ca	[IR] Switch everything to use memory attribute This switches everything to use the memory attribute proposed in https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly attributes are dropped. The readnone, readonly and writeonly attributes are restricted to parameters only. The old attributes are auto-upgraded both in bitcode and IR. The bitcode upgrade is a policy requirement that has to be retained indefinitely. The IR upgrade is mainly there so it's not necessary to update all tests using memory attributes in this patch, which is already large enough. We could drop that part after migrating tests, or retain it longer term, to make it easier to import IR from older LLVM versions. High-level Function/CallBase APIs like doesNotAccessMemory() or setDoesNotAccessMemory() are mapped transparently to the memory attribute. Code that directly manipulates attributes (e.g. via AttributeList) on the other hand needs to switch to working with the memory attribute instead. Differential Revision: https://reviews.llvm.org/D135780	2022-11-04 10:21:38 +01:00
Freddy Ye	a806fc2767	[X86] Support -march=raptorlake, meteorlake Reviewed By: pengfei, skan, MaskRay Differential Revision: https://reviews.llvm.org/D135937	2022-11-04 09:32:17 +08:00
Jan Sjodin	9ea2b150b5	[OpenMP][OMPIRBuilder] Migrate createOffloadEntriesAndInfoMetadata from clang to OpenMPIRBuilder This patch moves the createOffloadEntriesAndInfoMetadata to OpenMPIRBuilder, the createOffloadEntry helper function. The clang specific error handling is invoked using a callback. This code will also be used by flang in the future.	2022-11-03 10:27:44 -04:00
Jennifer Yu	ea64e66f7b	[OPENMP]Initial support for error directive. Differential Revision: https://reviews.llvm.org/D137209	2022-11-02 14:25:28 -07:00
Martin Storsjö	9b3834ef67	[clang] Fix inline builtin functions of an __asm__ renamed function with symbol prefixes If a function is renamed with `__asm__`, the name provided is the exact symbol name, without any extra implicit symbol prefixes. If the target does use symbol prefixes, the IR level symbol gets an `\01` prefix to indicate that it's a literal symbol name to be taken as is. When a builtin function is specialized by providing an inline version of it, that inline function is named `<funcname>.inline`. When the base function has been renamed due to `__asm__`, the inline function ends up named `<asmname>.inline`. Up to this point, things did work as expected before. However, for targets with symbol prefixes, one codepath that produced the combined name `<asmname>.inline` used the mangled `asmname` with `\01` prefix, while others didn't. This patch fixes this. This fixes the combination of asm renamed builtin function, with inline override of the function, on any target with symbol prefixes (such as i386 windows and any Darwin target). Differential Revision: https://reviews.llvm.org/D137073	2022-11-02 22:24:42 +02:00
Akash Banerjee	a3463a9f5c	[OpenMP][OpenMPIRBuilder] Migrate loadOffloadInfoMetadata from clang to OMPIRbuilder This patch moves the implementation of the loadOffloadInfoMetadata to the OMPIRbuilder. Differential Revision: https://reviews.llvm.org/D136872	2022-11-02 18:54:25 +00:00
Krzysztof Parzyszek	13918432cf	[Hexagon] Add builtins and intrinsics for V6_v[add\|sub]carryo	2022-10-31 13:41:31 -07:00
Jan Sjodin	67f8521cd4	[OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info Re-apply of: `3d0e9edd8e` Reverted in: `0cb65b0a58` A function parameter was using the wrong type 'llvm::TargetRegion' instead of 'const llvm:: TargetRegion&', which caused the error in the address sanitizer. The correct type is now used. This patch puts the individual target region information attributes into a struct so that the nested mappings are not needed and passing the information around is simplified. Reviewed By: jdoerfert, mikerice Differential Revision: https://reviews.llvm.org/D136601	2022-10-31 10:49:44 -04:00
Matt Arsenault	0ebd4638af	clang: Improve errors for DiagnosticInfoResourceLimit Print source location info and demangle the name, compared to the default behavior. Several observations: 1. Specially handling this seems to give source locations without enabling debug info, and also gives columns compared to the backend diagnostic. 2. We're duplicating diagnostic effort in DiagnosticInfo and clang. This feels wrong, but clang can demangle and I guess have better debug info available? Should clang really have any of this code? For the purposes of this diagnostic, the important piece is just reading the source location out of the llvm::Function. 3. lld is not duplicating the same effort as clang with LTO, and just directly printing the DiagnosticInfo as-is. e.g. $ clang -fgpu-rdc lld: error: local memory (480000) exceeds limit (65536) in function '_Z12use_huge_ldsIiEvv' lld: error: local memory (960000) exceeds limit (65536) in function '_Z12use_huge_ldsIdEvv' $ clang -fno-gpu-rdc backend-resource-limit-diagnostics.hip:8:17: error: local memory (480000) exceeds limit (65536) in 'void use_huge_lds<int>()' __global__ void use_huge_lds() { ^ backend-resource-limit-diagnostics.hip:8:17: error: local memory (960000) exceeds limit (65536) in 'void use_huge_lds<double>()' 2 errors generated when compiling for gfx90a. 4. Backend errors are not observed with -save-temps and -fno-gpu-rdc or -flto, and the compile incorrectly succeeds. 5. The backend version prints error: <location info>; clang prints <location info>: error: 6. -emit-codegen-only is totally broken for AMDGPU. MC gets a null target streamer. I do not understand why this is a thing. This just creates a horrible edge case. Just work around this by emitting actual code instead of blocking this patch.	2022-10-28 21:42:57 -07:00
Ben Langmuir	e1f9983022	Move getenv for AS_SECURE_LOG_FILE to clang Avoid calling getenv in the MC layer and let the clang driver do it so that it is reflected in the command-line as an -mllvm option. rdar://101558354 Differential Revision: https://reviews.llvm.org/D136888	2022-10-28 16:08:04 -07:00
Eduard Zingerman	524c640090	[clang][DebugInfo] Emit DISubprogram for extern functions with reserved names Callsite `DISubprogram` entries are not generated for: - builtin functions; - external functions with reserved names (e.g. names starting from "__"). This limitation was added by the commit [1] as a workaround for the situation described in [2] that triggered the IR verifier error. The goal of the present commit is to lift this limitation by adjusting the IR verifier logic. The logic behind [1] is to avoid the following situation: - a `DISubprogram` is added for some builtin function; - there is some location where this builtin is also emitted by a transformation (w/o debug location); - the `Verifier::visitCallBase` sees a call to a function with `DISubprogram` but w/o debug location and emits an error. Here is an updated example of such situation taken from [2]: ``` extern "C" int memcmp(void , void , long); struct a { int b; int c; int d; }; struct e { int f[1000]; }; bool foo(e g, e &h) { // DISubprogram for memcmp is created here when [1] is commented out return memcmp(&g, &h, sizeof(e)); } bool bar(a &g, a &h) { // memcmp might be generated here by MergeICmps return g.b == h.b && g.c == h.c && g.d == h.d; } ``` This triggers the verifier error when: - compiled for AArch64: `clang++ -c -g -Oz -target aarch64-unknown-linux-android21 test.cpp`; - [1] check is commented out. Instead of forbidding generation of `DISubprogram` entries as in [1] one can instead adjust the verifier to additionally check if callee has a body. Functions w/o bodies cannot be inlined and thus verifier warning is not necessary. E.g. `llvm::InlineFunction` requires functions for which `GlobalValue::isDeclaration() == false`. [1] `568db780bb` [2] https://bugs.chromium.org/p/chromium/issues/detail?id=1022296 Differential Revision: https://reviews.llvm.org/D136041	2022-10-28 08:07:54 -07:00
Kevin Athey	0cb65b0a58	Revert "[OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info" This reverts commit `3d0e9edd8e`. Breaking HWASAN buildbot: https://lab.llvm.org/buildbot/#/builders/236/builds/786 Shown by targetted builds breaking at this patch: Built at this patch: https://lab.llvm.org/buildbot/#/builders/236/builds/803 Built at prior patch: https://lab.llvm.org/buildbot/#/builders/236/builds/804	2022-10-27 13:57:25 -07:00
Jan Sjodin	3d0e9edd8e	[OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info This patch puts the individual target region information attributes into a struct so that the nested mappings are not needed and passing the information around is simplified. Reviewed By: jdoerfert, mikerice Differential Revision: https://reviews.llvm.org/D136601	2022-10-25 11:15:36 -04:00
David Green	af1bb287b4	[AArch64][ARM] Alter v8.3a complex neon intrinsics to be target-based, not preprocessor based This alters the 8.3 complex intrinsics to be target-gated, as opposed to hidden behind preprocessor macros. This is the last of arm_neon.h, and follows the same formula as before. Differential Revision: https://reviews.llvm.org/D135647	2022-10-25 14:35:11 +01:00
David Green	9c48b7f0e7	[AArch64][ARM] Alter v8.1a neon intrinsics to be target-based, not preprocessor based As a continuation of D132034, this switches the QRDMX v8.1a neon intrinsics over from preprocessor defines to be target-gated. As there is no "rdma" or "qrdmx" target feature, they use the "v8.1a" architecture feature directly. This works well for AArch64, but something needs to be done for Arm at the same time, as they both use the same header and tablegen emitter. This patch opts for adding "v8.1a" and all dependant target features to the Arm TargetParser, similar to what was recently done for AArch64 but through initFeatureMap when the Architecture is parsed. I attempted to make the code similar to the AArch64 backend. Otherwise this is similar to the changes made in D132034. Differential Revision: https://reviews.llvm.org/D135615	2022-10-25 09:02:52 +01:00
Markus Böck	3637dc601c	[clang][CodeGen] Consistently return nullptr Values for void builtins and scalar initalization A common post condition of the various visitor functions in CodeGen is that instructions, that do not return any values, simply return a nullptr Value as a sentinel. This has not been the case however for calls to some builtins returning void, as well as for an initializer expression of the form `void()`. This would then lead to ICEs in CodeGen on code relying on nullptr being returned for void values, which is eg. the case for conditional expressions [0]. This patch fixes that by returning nullptr Values for intrinsics known not to return any values as well as for a scalar initializer returning void. Fixes https://github.com/llvm/llvm-project/issues/53127 [0] `266ec801fb/clang/lib/CodeGen/CGExprScalar.cpp (L4849-L4892)` Differential Revision: https://reviews.llvm.org/D136548	2022-10-24 21:41:13 +02:00
Erich Keane	975740bf8d	"Reapply "GH58368: Correct concept checking in a lambda defined in concept"" This reverts commit `cecc9a92cf`. The problem ended up being how we were handling the lambda-context in code generation: we were assuming any decl context here would be a named-decl, but that isn't the case. Instead, we just replace it with the concept's owning context. Differential Revision: https://reviews.llvm.org/D136451	2022-10-24 12:36:54 -07:00
Erich Keane	cecc9a92cf	Revert "Reapply "GH58368: Correct concept checking in a lambda defined in concept""" This reverts commit `b876f6e2f2`. Still getting build failures on PPC AIX that aren't obvious what is causing them, so reverting while I try to figure this out.	2022-10-24 12:20:23 -07:00
Erich Keane	b876f6e2f2	Reapply "GH58368: Correct concept checking in a lambda defined in concept"" This reverts commit `5293016287`. Now with updating the ASTBitcodes to show that this AST is incompatible from the last.	2022-10-24 11:46:54 -07:00
Erich Keane	5293016287	Revert "GH58368: Correct concept checking in a lambda defined in concept" This reverts commit `b7c922607c`. This seems to cause some problems with some modules related things, which makes me think I should have updated the version-major in ast-bit-codes? Going to revert to confirm this was a problem, then change that and re-try a commit.	2022-10-24 10:21:22 -07:00
Erich Keane	b7c922607c	GH58368: Correct concept checking in a lambda defined in concept As that bug reports, the problem here is that the lambda's 'context-decl' was not set to the concept, and the lambda picked up template arguments from the concept. SO, we failed to get the correct template arguments in SemaTemplateInstantiate. However, a Concept Specialization is NOT a decl, its an expression, so we weren't able to put the concept in the decl tree like we needed. This patch introduces a ConceptSpecializationDecl, which is the smallest type possible to use for this purpose, containing only the template arguments. The net memory impliciation of this is turning a trailing-objects into a pointer to a type with trailing-objects, so it should be minor. As future work, we may consider giving this type more responsibility, or figuring out how to better merge duplicates, but as this is just a template-argument collection at the moment, there isn't much value to it. Differential Revision: https://reviews.llvm.org/D136451	2022-10-24 06:32:18 -07:00
David Green	6f1e430360	[AArch64] Alter v8.5a FRINT neon intrinsics to be target-based, not preprocessor based This switches the v8.5-a FRINT intrinsics over to be target-gated, behind preprocessor defines. This one is pretty simple, being AArch64 only. Differential Revision: https://reviews.llvm.org/D135646	2022-10-24 11:22:06 +01:00
Chris Bieneman	4c7218e770	[HLSL] Remove unused frontend-generated ID As @python3kgae pointed out we're going to want to assign these IDs after optimization so that we can remove unused resrouces. This patch just removes the unused ID value from the frontend metadata, clang code generation, and updates associated test cases. Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D136271	2022-10-21 12:41:09 -05:00
Paulo Matos	39d8597927	[clang] Fix typo in error message	2022-10-21 12:06:28 +02:00
Xiang Li	464926ef44	[HLSL] Disable integer promotion to avoid int16_t being promoted to int for HLSL. short will be promoted to int in UsualUnaryConversions. Disable it for HLSL to keep int16_t as 16bit. Reviewed By: aaron.ballman, rjmccall Differential Revision: https://reviews.llvm.org/D133668	2022-10-20 16:06:25 -07:00
Xiang Li	a7183a158d	[NFC] [DirectX backend] move ResourceClass into llvm. Move ResourceClass into llvm/Frontend/HLSL/HLSLResource.h so it could be shared between clang and DirectX backend. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D136134	2022-10-20 13:26:56 -07:00
Phoebe Wang	62ca79102c	[X86][1/2] Support PREFETCHI instructions For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D136040	2022-10-20 08:46:01 +08:00
Prabhdeep Singh Soni	6149589127	[OMPIRBuilder] Support depend clause for task This patch adds support for the `depend` clause for the `task` construct. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D135695	2022-10-19 13:11:43 -04:00
Phoebe Wang	bc1819389f	[X86][RFC] Using `__bf16` for AVX512_BF16 intrinsics This is an alternative of D120395 and D120411. Previously we use `__bfloat16` as a typedef of `unsigned short`. The name may give user an impression it is a brand new type to represent BF16. So that they may use it in arithmetic operations and we don't have a good way to block it. To solve the problem, we introduced `__bf16` to X86 psABI and landed the support in Clang by D130964. Now we can solve the problem by switching intrinsics to the new type. Reviewed By: LuoYuanke, RKSimon Differential Revision: https://reviews.llvm.org/D132329	2022-10-19 23:47:04 +08:00
Xiang Li	14ae5d2b74	[HLSL] Add SV_DispatchThreadID Support SV_DispatchThreadID attribute. Translate it into dx.thread.id in clang codeGen. Reviewed By: beanz, aaron.ballman Differential Revision: https://reviews.llvm.org/D133983	2022-10-18 16:17:19 -07:00
Joseph Huber	8c1449a84d	[OpenMP] Make kernels have protected visibility This patch changes the kernels generated by OpenMP to have protected visibility. This is unlikely to change anything functionally. However, protected visibility better matches the behaviour of these GPU kernels. We do not expect any pending shared library load to preempt these kernels so we can specify a more restrictive visibility. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D136198	2022-10-18 16:37:28 -05:00
Dominik Adamski	ccd314d320	[OpenMP][OMPIRBuilder] Add generation of SIMD align assumptions to OMPIRBuilder Currently generation of align assumptions for OpenMP simd construct is done outside OMPIRBuilder for C code and it is not supported for Fortran. According to OpenMP 5.0 standard (2.9.3) only pointers and arrays can be aligned for C code. If given aligned variable is pointer, then Clang generates the following set of the LLVM IR isntructions to support simd align clause: ; memory allocation for pointer address: %A.addr = alloca ptr, align 8 ; some LLVM IR code ; Alignment instructions (alignment is equal to 32): %0 = load ptr, ptr %A.addr, align 8 call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 32) ] If given aligned variable is array, then Clang generates the following set of the LLVM IR isntructions to support simd align clause: ; memory allocation for array: %B = alloca [10 x i32], align 16 ; some LLVM IR code ; Alignment instructions (alignment is equal to 32): %arraydecay = getelementptr inbounds [10 x i32], ptr %B, i64 0, i64 0 call void @llvm.assume(i1 true) [ "align"(ptr %arraydecay, i64 32) ] OMPIRBuilder was modified to generate aligned assumptions. It generates only llvm.assume calls. Frontend is responsible for generation of aligned pointer and getting the default alignment value if user does not specify it in aligned clause. Unit and regression tests were added to check if aligned clause was handled correctly. Differential Revision: https://reviews.llvm.org/D133578 Reviewed By: jdoerfert	2022-10-18 02:04:18 -05:00
Xiang Li	3a671c8e91	[NFC] use llvm_unreachable instead of return on switch which all cases are covered.	2022-10-17 17:47:48 -07:00
Xiang Li	0674f2ec96	[NFC] Fix warning on no return after switch.	2022-10-17 15:52:23 -07:00
Artem Belevich	a10eb07d1a	Do not append terminating NUL to the binary string with embedded fatbin. Extra NUL does not impact functionality of the generated code, but it confuses various NVIDIA tools used to examine embedded GPU binaries. Differential Revision: https://reviews.llvm.org/D135832	2022-10-17 15:39:39 -07:00
Xiang Li	13163dd8ab	[HLSL] CodeGen hlsl resource binding. ''register(ID, space)'' like register(t3, space1) will be translated into i32 3, i32 1 as the last 2 operands for resource annotation metadata. NamedMetadata for CBuffers and SRVs are added as "hlsl.srvs" and "hlsl.cbufs". Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D130951	2022-10-17 14:29:19 -07:00
Ellis Hoag	970e1ea01a	[clang] Fix crash with -funique-internal-linkage-names Calling `getFunctionLinkage(CalleeInfo.getCalleeDecl())` will crash when the declaration does not have a body, e.g., `extern void foo();`. Instead, we can use `isExternallyVisible()` to see if the delcaration has internal linkage. I believe using `!isExternallyVisible()` is correct because the clang linkage must be `InternalLinkage` or `UniqueExternalLinkage`, both of which are "internal linkage" in llvm. `9c26f51f5e/clang/include/clang/Basic/Linkage.h (L28-L40)` Fixes https://github.com/llvm/llvm-project/issues/54139 Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D135926	2022-10-17 08:57:23 -07:00
Ting Wang	ee703b5cb1	[clang][PowerPC] PPC64 VAArg fix right-alignment for aggregates fit in register PPC64 ABI pass aggregates smaller than a register into the least significant bits of the register. In the case of variadic functions, they will end up right-aligned in their argument slots in the argument area on big-endian targets. Apply right-alignment for these aggregates. Fixes #55900. Reviewed By: rjmccall Differential Revision: https://reviews.llvm.org/D133338	2022-10-16 22:01:47 -04:00
Kazu Hirata	647e48cf5f	[clang] Use std::clamp (NFC) Note that the constructor of MipsABIInfo guarantees that MinABIStackAlignInBytes <= StackAlignInBytes, so we can use std::clamp safely.	2022-10-16 10:11:29 -07:00
Jan Sjodin	dd3d8ddb5f	[OpenMP][OpenMPIRBuilder] Migrate OffloadEntriesInfoManager from clang to OMPIRbuilder This patch moves the implementation of the OffloadEntriesInfoManager to the OMPIRbuilder. This class will later be used by flang as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D135786	2022-10-16 08:32:40 -04:00
Chris Bieneman	911d2dc230	[NFC] [HLSL] Move common metadata to LLVMFrontend This change pulls some code from the DirectX backend into a new LLVMFrontendHLSL library to share utility data structures between the HLSL code generation in Clang and the backend in LLVM. This is a small refactoring as a first start to get code into the right structure and get the library built and dependencies correct. Fixes #58000 (https://github.com/llvm/llvm-project/issues/58000) Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D135110	2022-10-14 13:40:04 -05:00
Akira Hatanaka	28f7087c91	[CodeGen][ObjC] Call synthesized copy constructor/assignment operator functions in getter/setter functions of non-trivial C struct properties This fixes a bug where the getter/setter functions were doing a trivial copy instead of calling the synthesized functions that copy non-trivial C struct types. This fixes https://github.com/llvm/llvm-project/issues/56680. Differential Revision: https://reviews.llvm.org/D131701	2022-10-14 10:40:24 -07:00
Kazu Hirata	41ac5d258d	[clang] Fix a warning This patch fixes: clang/lib/CodeGen/CGCall.cpp:1867:64: error: '&&' within '\|\|' [-Werror,-Wlogical-op-parentheses]	2022-10-14 08:36:59 -07:00
Zahira Ammarguellat	84a9ec2ff1	Remove redundant option -menable-unsafe-fp-math. There are currently two options that are used to tell the compiler to perform unsafe floating-point optimizations: '-ffast-math' and '-funsafe-math-optimizations'. '-ffast-math' is enabled by default. It automatically enables the driver option '-menable-unsafe-fp-math'. Below is a table illustrating the special operations enabled automatically by '-ffast-math', '-funsafe-math-optimizations' and '-menable-unsafe-fp-math' respectively. Special Operations -ffast-math -funsafe-math-optimizations -menable-unsafe-fp-math MathErrno 0 1 1 FiniteMathOnly 1 0 0 AllowFPReassoc 1 1 1 NoSignedZero 1 1 1 AllowRecip 1 1 1 ApproxFunc 1 1 1 RoundingMath 0 0 0 UnsafeFPMath 1 0 1 FPContract fast on on '-ffast-math' enables '-fno-math-errno', '-ffinite-math-only', '-funsafe-math-optimzations' and sets 'FpContract' to 'fast'. The driver option '-menable-unsafe-fp-math' enables the same special options than '-funsafe-math-optimizations'. This is redundant. We propose to remove the driver option '-menable-unsafe-fp-math' and use instead, the setting of the special operations to set the function attribute 'unsafe-fp-math'. This attribute will be enabled only if those special operations are enabled and if 'FPContract' is either 'fast' or set to the default value. Differential Revision: https://reviews.llvm.org/D135097	2022-10-14 10:55:29 -04:00
Aaron Ballman	19e984ef8f	Properly print unnamed TagDecl objects in diagnostics The diagnostics engine is very smart about being passed a NamedDecl to print as part of a diagnostic; it gets the "right" form of the name, quotes it properly, etc. However, the result of using an unnamed tag declaration was to print '' instead of anything useful. This patch causes us to print the same information we'd have gotten if we had printed the type of the declaration rather than the name of it, as that's the most relevant information we can display. Differential Revision: https://reviews.llvm.org/D134813	2022-10-14 08:18:28 -04:00
Benjamin Kramer	c5d950f469	[HLSL] Simplify code and fix unused variable warnings. NFC.	2022-10-13 09:46:32 +02:00
Xiang Li	ebe9c7f3e2	[HLSL] CodeGen hlsl cbuffer/tbuffer. cbuffer A { float a; float b; } will be translated to a global variable. Something like struct CB_Ty { float a; float b; }; CB_Ty A; And all use of a and b will be replaced with A.a and A.b. Only support none-legacy cbuffer layout now. CodeGen for Resource binding will be in separate patch. In the separate patch, resource binding will map the resource information to the global variable. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D130131	2022-10-12 21:17:38 -07:00
Nikita Popov	01bbe87fbb	[CGStmt] Use helper functions to set memory attributes (NFC)	2022-10-12 16:38:39 +02:00
Michael Wyman	1fbb6d8b34	Fix assert in generated `direct` property getter/setters due to removal of `_cmd` parameter. This fixes a bug from https://reviews.llvm.org/D131424 that removed the implicit `_cmd` parameter as an argument to `objc_direct` method implementations. In many cases the generated getter/setter will call `objc_getProperty` or `objc_setProperty`, both of which require the selector of the getter/setter; since `_cmd` didn't automatically have backing storage, attempting to load the address asserted. For direct property generated getters/setters, this now passes an undefined/uninitialized/poison value as the `_cmd` argument to `objc_getProperty`/`objc_setProperty`. Prior to removing the `_cmd` argument from the ABI of direct methods, it was left uninitialized/undefined; although references within hand-implemented methods would load the selector in the method prologue, generated getters/setters never did and just forwarded the undefined value that was passed as the argument. This change keeps the generated code mostly similar to before, passing an uninitialized/undefined/poison value; for setters, the value argument may be moved to another register. Added a test that triggers the assert prior to the implementation code. Differential Revision: https://reviews.llvm.org/D135091	2022-10-11 21:15:53 -07:00
Eli Friedman	1079662d2f	[clang][codegen] Don't emit atomic loads for threadsafe init if they aren't inline Performing a load before calling __cxa_guard_acquire is supposed to be an optimization, but it isn't much of one if we're just going to emit a call to __atomic_load_1 instead. Instead, just skip the load, and let __cxa_guard_acquire do whatever it wants. (In practice, on such targets, the C++ library is just built with threading turned off, so the result isn't actually threadsafe, but there's not really anything clang can do about that.) The alternative here is that we try to define some ABI for threadsafe init that allows the speculative load without full atomics. Almost any target without full atomics has a load that's s "atomic enough" for this purpose. But it's not clear how we emit an "atomic enough" load in LLVM IR, and there isn't any ABI document we can refer to. Or I guess we could turn off -fthreadsafe-statics by default on Cortex-M0, but that seems like it would be surprising. Fixes https://github.com/llvm/llvm-project/issues/58184 Differential Revision: https://reviews.llvm.org/D135628	2022-10-11 14:00:33 -07:00
David Green	b879f99f0e	[AArch64][ARM] Alter most of arm_neon.h to be target-based, not preprocessor based. Similar to D131064, this alters most of the intrinsics in arm_neon.h to be target based, not preprocessor based. The intrinsics that are changed are the ones with obvious target features (fp16, fp16fml, cryptos, i8mm and bf16). The ones that are not yet altered are the ones without target features like rdma (8.1) and complex (8.3). Those will be switched in a followup patch that allows targeting architecture versions. The existing ArchGuard in arm_neon.td is split into ArchGuard that still adds ifdef defines (for example for intrinsics that require __aarch64__), and TargetGuards for intrinsics dependant on target features. From there the TargetGuards are used in two ways: - For intrinsics emitted as functions, __attribute__((target(TargetGuard))) is added to the definition of the function. Along with the existing always_inline intrinsic, this will give a compile time error if the function is used in a context where the target feature is not available. - For intrinsics emitted as macros, the __builtins are emitted into arm_neon.inc using TARGET_BUILTIN as opposed to BUILTIN, which includes the target feature and gives an error if the builtin is found in a function without the required features, similar to arm_sve.h. The second method requires that the intrinsics be separable from the existing _v intrinsics used in other types. For example __builtin_neon_splat_lane_bf16 is used as opposed to __builtin_neon_splat_lane_v. There are some adjustments to the CGBuiltin to account for intrinsics that can be treated similarly, except for their target features. Differential Revision: https://reviews.llvm.org/D132034	2022-10-11 09:09:16 +01:00
Aaron Ballman	9ced729c2c	Repair a confusing standards reference; NFC There is no 6.9 in C++11, the quote actually lives in [intro.multithread] for that revision. However, the words moved in C++17 to [intro.progress] so I added that information as well.	2022-10-10 14:10:39 -04:00
Anton Bikineev	7b85e76500	[PGO] Consider parent context when weighing branches with likelyhood. Generally, with PGO enabled the C++20 likelyhood attributes shall be dropped assuming the profile has a good coverage. However, currently this is not the case for the following code: if (always_false()) [[likely]] { ... } The patch fixes this and drops the attribute, if the parent context was executed in the profile. The patch still preserves the attribute, if the parent context was not executed, e.g. to support the cases when the profile has insufficient coverage. Differential Revision: https://reviews.llvm.org/D134456	2022-10-08 23:49:27 +02:00

1 2 3 4 5 ...

15669 Commits