llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	78c09f0f24	[PatternMatch][InstCombine] match a vector with constant expression element(s) as a constant expression The InstCombine test is reduced from issue #56601. Without the more liberal match for ConstantExpr, we try to rearrange constants in Negator forever. Alternatively, we could adjust the definition of m_ImmConstant to be more conservative, but that's probably a larger patch, and I don't see any downside to changing m_ConstantExpr. We never capture and modify a ConstantExpr; transforms just want to avoid it. Differential Revision: https://reviews.llvm.org/D130286	2022-07-21 15:23:57 -04:00
Daniel Thornburgh	6605187103	[NFC] Fix compiler warning in MarkupFilter	2022-07-21 12:00:29 -07:00
Daniel Thornburgh	17e4c217b6	[Symbolizer] Implement contextual symbolizer markup elements. This change implements the contextual symbolizer markup elements: reset, module, and mmap. These provide information about the runtime context of the binary necessary to resolve addresses to symbolic values. Summary information is printed to the output about this context. Multiple mmap elements for the same module line are coalesced together. The standard requires that such elements occur on their own lines to allow for this; accordingly, anything after a contextual element on a line is silently discarded. Implementing this cleanly requires that the filter drive the parser; this allows skipped sections to avoid being parsed. This also makes the filter quite a bit easier to use, at the cost of some unused flexibility. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D129519	2022-07-21 11:29:19 -07:00
David Sherwood	f15b6b2907	[AArch64] Add target hook for preferPredicateOverEpilogue This patch adds the AArch64 hook for preferPredicateOverEpilogue, which currently returns true if SVE is enabled and one of the following conditions (non-exhaustive) is met: 1. The "sve-tail-folding" option is set to "all", or 2. The "sve-tail-folding" option is set to "all+noreductions" and the loop does not contain reductions, 3. The "sve-tail-folding" option is set to "all+norecurrences" and the loop has no first-order recurrences. Currently the default option is "disabled", but this will be changed in a later patch. I've added new tests to show the options behave as expected here: Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll Differential Revision: https://reviews.llvm.org/D129560	2022-07-21 17:20:06 +01:00
Joseph Huber	bc33c2fa0c	[Binary] Hard-code the alignment of the offloading binary Summary: We previously used `alignof` to get the necessary alignment of the binary header. However this was different on 32-bit platforms and caused a few tests to fail because of it. This patch just changes this to be a hard-coded constant of 8.	2022-07-21 09:28:26 -04:00
Nikita Popov	1f69503107	[MemoryBuiltins] Add getReallocatedOperand() function (NFC) Replace the value-accepting isReallocLikeFn() overload with a getReallocatedOperand() function, which returns which operand is the one being reallocated. Currently, this is always the first one, but once allockind(realloc) is respected, the reallocated operand will be determined by the allocptr parameter attribute.	2022-07-21 14:54:16 +02:00
Nikita Popov	46e6dd84b7	[MemoryBuiltins] Remove isFreeCall() function (NFC) Remove isFreeCall() in favor of getFreedOperand(). Replace the two remaining uses with a getFreedOperand() != nullptr check, as they only care that something is getting freed. (The usage in DSE is correct as such. The allocator-related checks in CFLGraph look rather questionable in general.)	2022-07-21 14:44:23 +02:00
Alexey Lapshin	8bb4451a65	[Reland][DebugInfo][llvm-dwarfutil] Combine overlapped address ranges. DWARF files may contain overlapping address ranges. f.e. it can happen if the two copies of the function have identical instruction sequences and they end up sharing. That looks incorrect from the point of view of DWARF spec. Current implementation of DWARFLinker does not combine overlapped address ranges. It would be good if such ranges would be handled in some useful way. Thus, this patch allows DWARFLinker to combine overlapped ranges in a single one. Depends on D86539 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D123469	2022-07-21 14:15:39 +03:00
Matt Devereau	e0fbd990c9	[AArch64][SVE] Add ISel pattern to lower DUPLANE128 to LD1RQD Following on from https://reviews.llvm.org/D128902, lower DUPLANE128 to LD1RQD for integer load types from instruction selection. Differential Revision: https://reviews.llvm.org/D130010	2022-07-21 10:56:43 +00:00
Alexey Lapshin	3aad49082c	Revert "[DebugInfo][llvm-dwarfutil] Combine overlapped address ranges." This reverts commit `d2a4d6bf9c`.	2022-07-21 13:40:20 +03:00
Nikita Popov	c81dff3c30	[MemoryBuiltins] Add getFreedOperand() function (NFCI) We currently assume in a number of places that free-like functions free their first argument. This is true for all hardcoded free-like functions, but with the new attribute-based design, the freed argument is supposed to be indicated by the allocptr attribute. To make sure we handle this correctly once allockind(free) is respected, add a getFreedOperand() helper which returns the freed argument, rather than just indicating whether the call frees some argument. This migrates most but not all users of isFreeCall() to the new API. The remaining users are a bit more tricky.	2022-07-21 12:39:35 +02:00
Alexey Lapshin	d2a4d6bf9c	[DebugInfo][llvm-dwarfutil] Combine overlapped address ranges. DWARF files may contain overlapping address ranges. f.e. it can happen if the two copies of the function have identical instruction sequences and they end up sharing. That looks incorrect from the point of view of DWARF spec. Current implementation of DWARFLinker does not combine overlapped address ranges. It would be good if such ranges would be handled in some useful way. Thus, this patch allows DWARFLinker to combine overlapped ranges in a single one. Depends on D86539 Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D123469	2022-07-21 13:15:18 +03:00
Nikita Popov	d144ae6e1b	[MemoryBuiltins] Default to trivial mapper in getAllocSize() (NFC) Default getAllocSize() to use the trivial mapper. Also switch from using std::function to function_ref. Furthermore, update the doc comment to point out a subtle difference between getAllocSize() and getObjectSize(): The latter may also return something for calls that return their argument (via "returned" attribute or special intrinsics like invariant groups).	2022-07-21 11:43:48 +02:00
Nikita Popov	f45ab43332	[MemoryBuiltins] Avoid isAllocationFn() call before checking removable alloc Alloc directly checking whether a given call is a removable allocation, instead of first checking whether it is an allocation first.	2022-07-21 09:39:19 +02:00
Congzhe Cao	05ccde8023	[LoopCacheAnalysis] Fix a type mismatch problem in cost calculation There is a problem in loop cache analysis that the types of SCEV variables `Coeff` and `ElemSize` in function `isConsecutive()` may not match. The mismatch would cause SCEV failures when `Coeff` is multiplied with `ElemSize`. The fix in this patch is to extend the type of both `Coeff` and `ElemSize` to whichever is wider in those two variables. As a clean-up, duplicate calculations of `Stride` in `computeRefCost()` is then removed. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D128877	2022-07-21 01:57:05 -04:00
Anubhab Ghosh	4fcf8434dd	[ORC] Add a new MemoryMapper-based JITLinkMemoryManager implementation. MapperJITLinkMemoryManager supports executor memory management using any implementation of MemoryMapper to do the transfer such as InProcessMapper or SharedMemoryMapper. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D129495	2022-07-20 17:52:37 -07:00
Teresa Johnson	0174f5553e	[MemProf] Basic metadata support and verification Add basic support for the MemProf metadata (!memprof and !callsite) which was initially described in "RFC: IR metadata format for MemProf" (https://discourse.llvm.org/t/rfc-ir-metadata-format-for-memprof/59165). The bulk of the patch is verification support, along with some tests. There are a couple of changes to the format described in the original RFC: Initial measurements suggested that a tree format for the stack ids in the contexts would be more efficient, but subsequent evaluation with large applications showed that in fact the cost of the additional metadata nodes required by this deduplication scheme overwhelmed the benefit from sharing stack id nodes. Therefore, the implementation here and in follow on patches utilizes a simpler scheme of lists of stack id integers in the memprof profile contexts and callsite metadata. The follow on matching patch employs context trimming optimizations to reduce the cost. Secondly, instead of verbosely listing all profiled fields in each profiled context (memory info block or MIB), and deferring the interpretation of the profile data, the profile data is evaluated and converted into string tags specifying the behavior (e.g. "cold") during profile matching. This reduces the verbosity of the profile metadata, and allows additional context trimming optimizations. As a result, the named metadata schema description is also no longer needed. Differential Revision: https://reviews.llvm.org/D128141	2022-07-20 15:30:55 -07:00
Schrodinger ZHU Yifan	304027206c	[ThinLTO] Support aliased GlobalIFunc Fixes https://github.com/llvm/llvm-project/issues/56290: when an ifunc is aliased in LTO, clang will attempt to create an alias summary; however, as ifunc is not included in the module summary, doing so will lead to crash. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D129009	2022-07-20 15:30:38 -07:00
Kazu Hirata	94e03abf91	[IPO] Restore a call to has_value (NFC) This patch restores a call to has_value to make it clear that we are checking the presence of an optional value, not the underlying value. This patch partially reverts `d08f34b592`. Differential Revision: https://reviews.llvm.org/D129453	2022-07-20 09:40:18 -07:00
Roman Rusyaev	394a388d14	[TableGen] Add a location for a class definition that was forward-declared This change improves ctags generation for tablegen files. For the following example ``` class A; class A { int a; } ``` Previously, tags were generated only for a forward declaration of class 'A'. This patch allows generating tags for the forward declarations and further definition of class 'A'. Reviewed By: barannikov88 Original patch by: rusyaev-roman (Roman Rusyaev) Some adjustments by: nhaehnle (Nicolai Hähnle) Differential Revision: https://reviews.llvm.org/D129935	2022-07-20 15:56:17 +02:00
esmeyi	b1847ff068	[XCOFF] write the aux header when the visibility is specified in XCOFF32. The n_type field in the symbol table entry has two interpretations in XCOFF32, and a single interpretation in XCOFF64. The new interpretation is used in XCOFF32 if the value of the o_vstamp field in the auxiliary header is 2. In XCOFF64 and the new XCOFF32 interpretation, the n_type field is used for the symbol type and visibility. The patch writes the aux header with an o_vstamp field value of 2 when the visibility is specified in XCOFF32 to make the new XCOFF32 interpretation used. Reviewed By: DiggerLin, jhenderson Differential Revision: https://reviews.llvm.org/D128148	2022-07-20 07:09:34 -04:00
Chuanqi Xu	645d2dd3a9	Revert "Don't treat readnone call in presplit coroutine as not access memory" This reverts commit `57224ff4a6`. This commit may trigger crashes on some workloads. Revert it for clearness.	2022-07-20 17:00:58 +08:00
Fangrui Song	e931c2e870	[LegacyPM] Remove InstrOrderFileLegacyPass Following recent changes removing non-core features of the legacy PM/optimization pipeline.	2022-07-19 23:58:51 -07:00
Chuanqi Xu	57224ff4a6	Don't treat readnone call in presplit coroutine as not access memory To solve the readnone problems in coroutines. See https://discourse.llvm.org/t/address-thread-identification-problems-with-coroutine/62015 for details. According to the discussion, we decide to fix the problem by inserting isPresplitCoroutine() checks in different passes instead of wrapping/unwrapping readnone attributes in CoroEarly/CoroCleanup passes. In this direction, we might not be able to cover every case at first. Let's take a "find and fix" strategy. Reviewed By: nikic, nhaehnle, jyknight Differential Revision: https://reviews.llvm.org/D127383	2022-07-20 10:37:23 +08:00
Jez Ng	2e2737cdf9	[MC][MachO] Change addrsig format + ensure its size is properly set There were two problems with the previous setup: 1. We weren't setting its size, which caused problems when `__llvm_addrsig` wasn't the last section. In particular, `__debug_line` (if created) is generated and placed after `__llvm_addrsig`, and would result in an invalid object file w/ overlapping sections being emitted. 2. The symbol indices could be invalidated if e.g. `llvm-strip` ran on the object file. See discussion [here][1]. To fix both these issues, we use symbol relocations instead of encoding symbol indices directly in the section contents. The section itself doesn't contain any data. That sidesteps the layout problem in addition to solving the second issue. The corresponding LLD change to read in this new format: {D128938}. It will fix the icf-safe.ll test failure on this diff. [1]: https://discourse.llvm.org/t/problems-with-mach-o-address-significance-table-generation/63392/ Reviewed By: #lld-macho, alx32 Differential Revision: https://reviews.llvm.org/D127637	2022-07-19 21:22:23 -04:00
Lang Hames	94e6d2677b	[ORC] Fix serialization / deserialization of default-constructed StringRef. Avoids accessing the data field on zero-length strings. This is the StringRef counterpart to the ArrayRef<char> fix in `67220c2ad7`. rdar://97285294	2022-07-19 17:22:21 -07:00
Anubhab Ghosh	1b1f1c7786	Re-re-apply `5acd471698`, Add a shared-memory based orc::MemoryMapper... ...with more fixes. The original patch was reverted in `3e9cc543f2` due to bot failures caused by a missing dependence on librt. That issue was fixed in `32d8d23cd0`, but that commit also broke sanitizer bots due to a bug in SimplePackedSerialization: empty ArrayRef<char>s triggered a zero-byte memcpy from a null source. The ArrayRef<char> serialization issue was fixed in `67220c2ad7`, and this patch has also been updated with a new custom SharedMemorySegFinalizeRequest message that should avoid serializing empty ArrayRefs in the first place. https://reviews.llvm.org/D128544	2022-07-19 15:35:33 -07:00
Johannes Doerfert	bf789b1957	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in `6555558a80`.	2022-07-19 16:24:42 -05:00
Yusra Syeda	6fb27bc2e3	[SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention Differential Revision: https://reviews.llvm.org/D127328	2022-07-19 13:55:25 -04:00
Cole Kissane	e939bf67e3	[llvm] add zstd to `llvm::compression` namespace - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` - debian users should install libzstd when using `LLVM_ENABLE_ZSTD=FORCE_ON` from source due to this bug https://bugs.launchpad.net/ubuntu/+source/libzstd/+bug/1941956 Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-19 10:54:36 -07:00
Jon Chesterfield	3a20597776	[amdgpu] Implement lds kernel id intrinsic Implement an intrinsic for use lowering LDS variables to different addresses from different kernels. This will allow kernels that cannot reach an LDS variable to avoid wasting space for it. There are a number of implicit arguments accessed by intrinsic already so this implementation closely follows the existing handling. It is slightly novel in that this SGPR is written by the kernel prologue. It is necessary in the general case to put variables at different addresses such that they can be compactly allocated and thus necessary for an indirect function call to have some means of determining where a given variable was allocated. Claiming an arbitrary SGPR into which an integer can be written by the kernel, in this implementation based on metadata associated with that kernel, which is then passed on to indirect call sites is sufficient to determine the variable address. The intent is to emit a __const array of LDS addresses and index into it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D125060	2022-07-19 17:46:19 +01:00
Alexey Lapshin	4539b44148	[Reland][Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF. This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal): ``` ./llvm-dwarfutil [options] <input file> <output file> --garbage-collection Do garbage collection for debug info(default) -j <value> Alias for --num-threads --no-garbage-collection Don`t do garbage collection for debug info --no-odr-deduplication Don`t do ODR deduplication for debug types --no-odr Alias for --no-odr-deduplication --no-separate-debug-file Create single output file, containing debug tables(default) --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine --odr-deduplication Do ODR deduplication for debug types(default) --odr Alias for --odr-deduplication --separate-debug-file Create two output files: file w/o debug tables and file with debug tables --tombstone [bfd,maxpc,exec,universal] Tombstone value used as a marker of invalid address(default: universal) =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges =exec - Match with address ranges of executable sections =universal - Both: bfd and maxpc ``` Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D86539	2022-07-19 15:11:36 +03:00
Simon Pilgrim	0f6b0461b0	[DAG] SimplifyDemandedBits - relax "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" to match only demanded bits The "xor (X >> ShiftC), XorC --> (not X) >> ShiftC" fold is currently limited to the XOR mask being a shifted all-bits mask, but we can relax this to only need to match under the demanded bits. This helps expose more bit extraction/clearing patterns and fixes the PowerPC testCompares*.ll regressions from D127115 Alive2: https://alive2.llvm.org/ce/z/fl7T7K Differential Revision: https://reviews.llvm.org/D129933	2022-07-19 10:59:07 +01:00
David Spickett	5d14873249	[llvm][AArch64] Add missing FPCR, H and B registers to Codeview mapping Fixes https://github.com/llvm/llvm-project/issues/56484 H registers are 16 bit views of AArch64's Neon registers and B are the 8 bit views. msvc does not support 16 bit float (some mention in DirectX but I couldn't find a way to get to it) so for lack of a better reference I'm using: `85c9b41b33/server/references/dia/include/cvconst.h` (the other microsoft-pdb repo is no longer up to date) Luckily clang does support fp16 so a test is added for that. There is no 8 bit float type so I had to get creative with the test case. We're not testing for correct debug info here just that we can select the B register and not crash in the process. For FPCR it's never going to be passed as an argument so I've not added a test for it. It is included to keep our list looking the same as the reference. Reviewed By: majnemer Differential Revision: https://reviews.llvm.org/D129774	2022-07-19 09:33:13 +00:00
Alexey Lapshin	e717f91c96	Revert "[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF." This reverts commit `e2147c26bd`.	2022-07-19 12:17:47 +03:00
Alexey Lapshin	e2147c26bd	[Debuginfo][llvm-dwarfutil] llvm-dwarfutil dsymutil-like tool for ELF. This patch implements proposal https://lists.llvm.org/pipermail/llvm-dev/2020-August/144579.html llvm-dwarfutil - is a tool that is used for processing debug info(DWARF) located in built binary files to improve debug info quality, reduce debug info size. The patch currently implements smaller set of command-line options(comparing to the proposal): ``` ./llvm-dwarfutil [options] <input file> <output file> --garbage-collection Do garbage collection for debug info(default) -j <value> Alias for --num-threads --no-garbage-collection Don`t do garbage collection for debug info --no-odr-deduplication Don`t do ODR deduplication for debug types --no-odr Alias for --no-odr-deduplication --no-separate-debug-file Create single output file, containing debug tables(default) --num-threads <threads> Number of available threads for multi-threaded execution. Defaults to the number of cores on the current machine --odr-deduplication Do ODR deduplication for debug types(default) --odr Alias for --odr-deduplication --separate-debug-file Create two output files: file w/o debug tables and file with debug tables --tombstone [bfd,maxpc,exec,universal] Tombstone value used as a marker of invalid address(default: universal) =bfd - Zero for all addresses and [1,1] for DWARF v4 (or less) address ranges and exec =maxpc - Minus 1 for all addresses and minus 2 for DWARF v4 (or less) address ranges =exec - Match with address ranges of executable sections =universal - Both: bfd and maxpc ``` Reviewed By: clayborg Differential Revision: https://reviews.llvm.org/D86539	2022-07-19 11:18:36 +03:00
serge-sans-paille	a2ac383b44	[llvm] Fix forward declaration in Support/JSON.h Some methods of json::Array require json::Value to be completely defined, so they can't be defined in-class. Fix that by defining them out of class. Fix #55780	2022-07-19 09:07:29 +02:00
Max Kazantsev	51f837a680	[NFC] Introduce API to detect tokens penetrating LCSSA form Following discussion in PR56243, we need to somehow detect the situation when token values penetrate LCSSA form for transforms that require that it is maintained by all values (for example, to sustain use-def dominance invarians). This patch introduces a parameter to LCSSA checkers to control their ignorance about tokens. Differential Revision: https://reviews.llvm.org/D129983 Reviewed By: efriedma	2022-07-19 13:52:30 +07:00
Shraiysh Vaishay	35fc666877	[OpenMP][IRBuilder] Add support for taskgroup This patch adds support for generating taskgroup construct. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D128203	2022-07-19 10:49:34 +05:30
Lang Hames	67220c2ad7	[ORC] Fix serialization / deserialization of default-constructed ArrayRef<char>. Avoids a zero-length memcpy from a null src, which caused errors on some of the sanitizer bots. Also uses null when deserializing an empty ArrayRef (rather than pointing to a zero length range in the middle of the input buffer).	2022-07-18 20:39:01 -07:00
Matt Arsenault	8d0383eb69	CodeGen: Remove AliasAnalysis from regalloc This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.	2022-07-18 17:23:41 -04:00
Stanislav Mekhanoshin	523a99c0eb	[AMDGPU] Support for gfx940 fp8 smfmac Differential Revision: https://reviews.llvm.org/D129908	2022-07-18 12:12:41 -07:00
Stanislav Mekhanoshin	2695f0a688	[AMDGPU] Support for gfx940 fp8 mfma Differential Revision: https://reviews.llvm.org/D129906	2022-07-18 11:49:56 -07:00
Stanislav Mekhanoshin	9fa5a6b7e8	[AMDGPU] Support for gfx940 fp8 conversions Differential Revision: https://reviews.llvm.org/D129902	2022-07-18 11:48:43 -07:00
zhijian	a6316d6da5	[AIX] support read global symbol of big archive Reviewers: James Henderson, Fangrui Song Differential Revision: https://reviews.llvm.org/D124865	2022-07-18 10:43:30 -04:00
Simon Pilgrim	c2ab5c5514	[DAG] Fix typo in isDesirableToCommuteWithShift description. NFC.	2022-07-18 13:11:23 +01:00
Max Kazantsev	d693fd29f1	[Verifier] Make Verifier recognize undef tokens as correct IR Undef tokens may appear in unreached code as result of RAUW of some optimization, and it should not be considered as bad IR. Patch by Dmitry Bakunevich! Differential Revision: https://reviews.llvm.org/D128904 Reviewed By: mkazantsev	2022-07-18 16:26:06 +07:00
Nikita Popov	11079e8820	[IR] Don't treat callbr as indirect terminator Callbr is no longer an indirect terminator in the sense that is relevant here (that it's successors cannot be updated). The primary effect of this change is that callbr no longer prevents formation of loop simplify form. I decided to drop the isIndirectTerminator() method entirely and replace it with isa<IndirectBrInst>() checks. I assume this method was added to abstract over indirectbr and callbr, but it never really caught on, and there is nothing left to abstract anymore at this point. Differential Revision: https://reviews.llvm.org/D129849	2022-07-18 09:32:08 +02:00
Valentin Clement	048aaab194	[flang][openacc] Use TableGen to generate the clause parser This patch introduce an automatic generation of the clause parser from the TableGen information. New information can be stored directly in the TableGen file: - The different aliases that a clause support. - prefix before a value. - whether a prefix is optional or not. Makes it easier to add new clauses and also avoid some error (`write` clause incorrect until now). This patch is updating only the OpenACC part. A patch with a modification of the OpenMP clause parser will follow. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D106968	2022-07-18 09:26:57 +02:00
Craig Topper	a55ff6aadd	[Support][CodeGen] Fix spelling Divison->Division. NFC	2022-07-17 23:16:29 -07:00
Fangrui Song	0e3447bf8a	[LegacyPM] Remove WholeProgramDevirt Unused after LTO removal from legacy optimization passline.	2022-07-17 23:14:53 -07:00
Fangrui Song	1f90cc589e	[LegacyPM] Remove FunctionImportLegacyPass Unused after ThinLTO was removed from legacy optimization pipeline.	2022-07-17 23:06:46 -07:00
Abinav Puthan Purayil	d96361d714	[AMDGPU] Add the uses_dynamic_stack field to the kernel descriptor and the kernel metadata map This change introduces the dynamic stack boolean field to code-object-v3 and above under the code properties of the kernel descriptor and under the kernel metadata map of NT_AMDGPU_METADATA. This field corresponds to the is_dynamic_callstack field of amd_kernel_code_t. Differential Revision: https://reviews.llvm.org/D128344	2022-07-18 10:07:13 +05:30
Kazu Hirata	3112987d5c	Remove unused forward declarations (NFC)	2022-07-17 15:37:48 -07:00
Fangrui Song	bbaa015e82	[LegacyPM] Remove LowerTypeTestsPass Unused after LTO removal from optimization passline.	2022-07-17 15:06:38 -07:00
Fangrui Song	a6942256ca	[LegacyPM] Remove NameAnonGlobalLegacyPass Unused after LTO removal from optimization passline.	2022-07-17 14:38:29 -07:00
Fangrui Song	d74b88c69d	[LegacyPM] Remove CanonicalizeAliasesLegacyPass Unused after LTO removal from optimization passline.	2022-07-17 14:30:22 -07:00
Fangrui Song	70519a1fba	[LegacyPM] Remove LTO passes from optimization pipeline Following recent changes removing non-core features of the legacy PM/optimization pipeline.	2022-07-17 14:24:36 -07:00
Fangrui Song	f502115561	[LegacyPM] Remove PGO options from PassManagerBuilder They have been dead since legacy PGO/SamplePGO passes were removed.	2022-07-17 14:03:23 -07:00
Fangrui Song	dd5e3f0e27	[LegacyPM] Remove SampleProfileLoaderLegacyPass Following recent changes removing non-core features of the legacy PM/optimization pipeline (e.g. PGO), remove SamplePGO.	2022-07-17 12:09:46 -07:00
Kazu Hirata	c13a09a462	[llvm] Fix header guards (NFC) Identified with llvm-header-guard.	2022-07-17 02:18:55 -07:00
Kazu Hirata	92a1b2afc8	[Analysis] Remove isArithmeticRecurrenceKind The last use was removed on Jul 30, 2021 in commit `9d35594993`.	2022-07-16 13:23:32 -07:00
Fangrui Song	f9d6f37201	[LegacyPM] Remove ControlHeightReductionLegacyPass This pass tries to reduce the number of conditional branches in the hot path based on profile. It's mostly a no-op after legacy PGO passes are moved.	2022-07-16 01:35:56 -07:00
Fangrui Song	3a42c499c2	[LegacyPM] Remove createInstrProfilingLegacyPass Follow the steps of removing non-core instrumentation passes like PGO.	2022-07-16 01:26:40 -07:00
Fangrui Song	685775bbab	[LegacyPM] Remove CGProfileLegacyPass It's mostly a no-op after I removed legacy PGO passes in D123834.	2022-07-16 00:39:56 -07:00
Fangrui Song	df8f5be596	[LegacyPM] Remove ModuleSanitizerCoverageLegacyPass Follow the steps of various other legacy instrumentation passes removed for 15.0.0.	2022-07-15 19:01:20 -07:00
Mitch Phillips	4162aefad1	Revert "Re-apply `5acd471698`, Add a shared-memory based orc::MemoryMapper, with fixes." This reverts commit `32d8d23cd0`. Reason: Broke the UBSan buildbots. See more details on Phabricator: https://reviews.llvm.org/D128544	2022-07-15 17:11:55 -07:00
Rong Xu	5e0443292b	[PGO] Report number of counts being dropped when a hash-mismatch happens This patch reports number of counts being dropped when a hash-mismatch happens. This information will be helpful to the users -- if the dropped counts are large, the user should redo the instrumentation build and recollect the profile. Differential Revision: https://reviews.llvm.org/D129001	2022-07-15 14:53:59 -07:00
Fangrui Song	0d5a62faca	[sanitizer] Add "mainfile" prefix to sanitizer special case list When an issue exists in the main file (caller) instead of an included file (callee), using a `src` pattern applying to the included file may be inappropriate if it's the caller's responsibility. Add `mainfile` prefix to check the main filename. For the example below, the issue may reside in a.c (foo should not be called with a misaligned pointer or foo should switch to an unaligned load), but with `src` we can only apply to the innocent callee a.h. With this patch we can use the more appropriate `mainfile:a.c`. ``` //--- a.h // internal linkage static inline int load(int x) { return x; } //--- a.c, -fsanitize=alignment #include "a.h" int foo(void *x) { return load(x); } ``` See the updated clang/docs/SanitizerSpecialCaseList.rst for a caveat due to C++ vague linkage functions. Reviewed By: #sanitizers, kstoimenov, vitalybuka Differential Revision: https://reviews.llvm.org/D129832	2022-07-15 10:39:26 -07:00
Anubhab Ghosh	32d8d23cd0	Re-apply `5acd471698`, Add a shared-memory based orc::MemoryMapper, with fixes. The original commit was reverted in `3e9cc543f2` due to buildbot failures, which should be fixed by the addition of dependencies on librt. Differential Revision: https://reviews.llvm.org/D128544	2022-07-15 09:45:30 -07:00
David Kreitzer	c720b6fddd	Clarify the behavior of the llvm.vector.insert/extract intrinsics when the index is out of range. Both intrinsics return a poison value. Consequently, mark the intrinsics speculatable. Differential Revision: https://reviews.llvm.org/D129656	2022-07-15 07:56:44 -07:00
Edd Barrett	2e62a26fd7	[stackmaps] Legalise patchpoint arguments. This is similar to D125680, but for llvm.experimental.patchpoint (instead of llvm.experimental.stackmap). Differential review: https://reviews.llvm.org/D129268	2022-07-15 12:01:59 +01:00
Fangrui Song	3c849d0aef	Modernize Optional::{getValueOr,hasValue}	2022-07-15 01:20:39 -07:00
Nikita Popov	2a721374ae	[IR] Don't use blockaddresses as callbr arguments Following some recent discussions, this changes the representation of callbrs in IR. The current blockaddress arguments are replaced with `!` label constraints that refer directly to callbr indirect destinations: ; Before: %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo)) to label %asm.fallthrough [label %foo] ; After: %res = callbr i8* asm "", "=r,r,!i"(i8* %x) to label %asm.fallthrough [label %foo] The benefit of this is that we can easily update the successors of a callbr, without having to worry about also updating blockaddress references. This should allow us to remove some limitations: * Allow unrolling/peeling/rotation of callbr, or any other clone-based optimizations (https://github.com/llvm/llvm-project/issues/41834) * Allow duplicate successors (https://github.com/llvm/llvm-project/issues/45248) This is just the IR representation change though, I will follow up with patches to remove limtations in various transformation passes that are no longer needed. Differential Revision: https://reviews.llvm.org/D129288	2022-07-15 10:18:17 +02:00
Nikita Popov	f75ccadcdd	[LSR] Create SCEVExpander earlier, use member isSafeToExpand() (NFC) This is a followup to D129630, which switches LSR to the member isSafeToExpand() variant, and removes the freestanding function. This is done by creating the SCEVExpander early (already during the analysis phase). Because the SCEVExpander is now available for the whole lifetime of LSRInstance, I've also made it into a member variable, rather than passing it around in even more places. Differential Revision: https://reviews.llvm.org/D129769	2022-07-15 09:41:23 +02:00
Fangrui Song	141c9d7759	[llvm-dwp] Add SHF_COMPRESSED support and remove .zdebug support clang 14 removed -gz=zlib-gnu and ld.lld/llvm-objcopy removed .zdebug support recently. llvm-dwp currently doesn't support SHF_COMPRESSED. Add support and remove .zdebug support. Simplify llvm::object::Decompressor which has no .zdebug user now. While here, add tests for ELF32LE, ELF32BE, and ELF64BE. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D129728	2022-07-14 16:19:32 -07:00
Dawid Jurczak	d71128d97d	[NFC][Metadata] Change MDNode::operands()'s return type from op_range to ArrayRef<MDOperand> This patch is https://reviews.llvm.org/D129468 follow-up and address one of comment coming from that review: https://reviews.llvm.org/D129468#3643295 Differential Revision: https://reviews.llvm.org/D129565	2022-07-14 17:22:32 +02:00
Nikita Popov	9e6e631b38	[LoopPredication] Use isSafeToExpandAt() member function (NFC) As a followup to D129630, this switches a usage of the freestanding function in LoopPredication to use the member variant instead. This was the last use of the freestanding function, so drop it entirely.	2022-07-14 14:49:07 +02:00
Nikita Popov	dcf4b733ef	[SCEVExpander] Make CanonicalMode handing in isSafeToExpand() more robust (PR50506) isSafeToExpand() for addrecs depends on whether the SCEVExpander will be used in CanonicalMode. At least one caller currently gets this wrong, resulting in PR50506. Fix this by a) making the CanonicalMode argument on the freestanding functions required and b) adding member functions on SCEVExpander that automatically take the SCEVExpander mode into account. We can use the latter variant nearly everywhere, and thus make sure that there is no chance of CanonicalMode mismatch. Fixes https://github.com/llvm/llvm-project/issues/50506. Differential Revision: https://reviews.llvm.org/D129630	2022-07-14 14:41:51 +02:00
Namhyung Kim	69b312cde4	[llvm-objdump] Create fake sections for a ELF core file The linux perf tools use /proc/kcore for disassembly kernel functions. Actually it copies the relevant parts to a temp file and then pass it to objdump. But it doesn't have section headers so llvm-objdump cannot handle it. Let's create fake section headers for the program headers. It'd have a single section for each segment to cover the entire range. And for this purpose we can consider only executable code segments. With this change, I can see the following command shows proper outputs. perf annotate --stdio --objdump=/path/to/llvm-objdump Differential Revision: https://reviews.llvm.org/D128705	2022-07-14 13:39:59 +01:00
Cullen Rhodes	3e9cc543f2	Revert "[ORC] Add a shared-memory based orc::MemoryMapper." This reverts commit `5acd471698`. Breaks shared library build with: ld.lld-12: error: undefined symbol: shm_open >>> referenced by ExecutorSharedMemoryMapperService.cpp:68 (/home/culrho01/llvm-project/llvm/lib/ExecutionEngine/Orc/TargetProcess/ExecutorSharedMemoryMapperService.cpp:68) >>> lib/ExecutionEngine/Orc/TargetProcess/CMakeFiles/LLVMOrcTargetProcess.dir/ExecutorSharedMemoryMapperService.cpp.o:(llvm::orc::rt_bootstrap::ExecutorSharedMemoryMapperService::reserve[abi:cxx11](unsigned long)) >>> did you mean: sem_open >>> defined in: /usr/bin/../lib/gcc/aarch64-linux-gnu/9/../../../aarch64-linux-gnu/libpthread.so	2022-07-14 09:52:57 +00:00
Amara Emerson	6e6be5f950	Revert "[llvm] add zstd to llvm::compression namespace" This reverts commit `d449c60076`. Breaks macOS builds with this: llvm/lib/Support/Compression.cpp:24:10: fatal error: 'zstd.h' file not found	2022-07-14 01:23:20 -07:00
Jannik Silvanus	e5c4cde451	[AMDGPU] SIMachineScheduler: Add support for several MachineScheduler features The SI machine scheduler inherits from ScheduleDAGMI. This patch adds support for a few features that are implemented in ScheduleDAGMI (or its base classes) that were missing so far because their support is implemented in overridden functions. * Support cl::opt -view-misched-dags This option allows to open a graphical window of the scheduling DAG. * Support cl::opt -misched-print-dags This option allows to print the scheduling DAG in text form. * After constructing the scheduling DAG, call postprocessDAG() to apply any registered DAG mutations. Note that currently there are no mutations defined in AMDGPUTargetMachine.cpp in case SIScheduler is used. Still add this to avoid surprises in the future in case mutations are added. Differential Revision: https://reviews.llvm.org/D128808	2022-07-14 09:45:31 +02:00
Kazu Hirata	611ffcf4e4	[llvm] Use value instead of getValue (NFC)	2022-07-13 23:11:56 -07:00
Cole Kissane	d449c60076	[llvm] add zstd to llvm::compression namespace - add `FindZSTD.cmake` - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-13 19:58:42 -07:00
Cole Kissane	5ecb161c64	Revert "[llvm] add zstd to `llvm::compression` namespace" This reverts commit `cef07169ec`.	2022-07-13 19:48:29 -07:00
Cole Kissane	cef07169ec	[llvm] add zstd to `llvm::compression` namespace - add `FindZSTD.cmake` - add zstd to `llvm::compression` namespace - add a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - add tests for zstd to `llvm/unittests/Support/CompressionTest.cpp` Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-13 19:06:27 -07:00
Fangrui Song	e690137dde	[Support] Change compression::zlib::{compress,uncompress} to use uint8_t * It's more natural to use uint8_t * (std::byte needs C++17 and llvm has too much uint8_t ) and most callers use uint8_t instead of char *. The functions are recently moved into `llvm::compression::zlib::`, so downstream projects need to make adaption anyway.	2022-07-13 16:26:54 -07:00
Anubhab Ghosh	5acd471698	[ORC] Add a shared-memory based orc::MemoryMapper. This is an implementation of orc::MemoryMapper that maps shared memory pages in both executor and controller process and writes directly to them avoiding transferring content over EPC. All allocations are properly deinitialized automatically on the executor side at shutdown by the ExecutorSharedMemoryMapperService. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D128544	2022-07-13 15:24:28 -07:00
Philip Reames	dde2a7fb6d	[RISCV] Exploit fact that vscale is always power of two to replace urem sequence When doing scalable vectorization, the loop vectorizer uses a urem in the computation of the vector trip count. The RHS of that urem is a (possibly shifted) call to @llvm.vscale. vscale is effectively the number of "blocks" in the vector register. (That is, types such as <vscale x 8 x i8> and <vscale x 1 x i8> both fill one 64 bit block, and vscale is essentially how many of those blocks there are in a single vector register at runtime.) We know from the RISCV V extension specification that VLEN must be a power of two between ELEN and 2^16. Since our block size is 64 bits, the must be a power of two numbers of blocks. (For everything other than VLEN<=32, but that's already broken.) It is worth noting that AArch64 SVE specification explicitly allows non-power-of-two sizes for the vector registers and thus can't claim that vscale is a power of two by this logic. Differential Revision: https://reviews.llvm.org/D129609	2022-07-13 10:54:47 -07:00
Fangrui Song	b28412d539	[llvm-objcopy][ELF] Add --set-section-type The request is mentioned on D129053. I feel that having this functionality is mildly useful (not strong). * Rename .ctors to .init_array and change sh_type to SHT_INIT_ARRAY (GNU objcopy detects the special name but we don't). * Craft tests for a new SHT_LLVM_* extension Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D129337	2022-07-13 10:04:21 -07:00
Mitch Phillips	90e5a8ac47	Remove 'no_sanitize_memtag'. Add 'sanitize_memtag'. For MTE globals, we should have clang emit the attribute for all GV's that it creates, and then use that in the upcoming AArch64 global tagging IR pass. We need a positive attribute for this sanitizer (rather than implicit sanitization of all globals) because it needs to interact with other parts of LLVM, including: 1. Suppressing certain global optimisations (like merging), 2. Emitting extra directives by the ASM writer, and 3. Putting extra information in the symbol table entries. While this does technically make the LLVM IR / bitcode format non-backwards-compatible, nobody should have used this attribute yet, because it's a no-op. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D128950	2022-07-13 08:54:41 -07:00
Nikita Popov	6f9d990a6e	[TargetFolder] Use DL-aware folding for icmp The Fold() call was accidentally dropped in `138fcc5f76`, though it doesn't seem to make a difference in practice (no test changes).	2022-07-13 15:35:13 +02:00
Nikita Popov	6d6983ced9	[IRBuilder] Migrate fneg to fold infrastructure Make use of a single FoldUnOpFMF() API, though in practice FNeg is the only unary operation that exists. This is likely NFC in practice, because users of InstSimplifyFolder don't create fneg.	2022-07-13 15:29:52 +02:00
Max Kazantsev	30e33b4b81	[SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional	2022-07-13 18:54:25 +07:00
Corentin Jabot	d4892a168f	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as its likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-13 10:19:26 +02:00
Kazu Hirata	3361a364e6	[llvm] Use has_value instead of hasValue (NFC)	2022-07-12 22:25:42 -07:00
Nathan James	a565509308	[ADT] Use Empty Base Optimization for Allocators In D94439, BumpPtrAllocator changed its implementation to use an empty base optimization for the underlying allocator. This patch builds on that by extending its functionality to more classes as well as enabling the underlying allocator to be a reference type, something not currently possible as you can't derive from a reference. The main place this sees use is in StringMaps which often use the default MallocAllocator, yet have to pay the size of a pointer for no reason. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D129206	2022-07-12 23:57:04 +01:00
Jonas Devlieghere	a262f4dbd7	Revert "[Clang] Add a warning on invalid UTF-8 in comments." This reverts commit `cc309721d2` because it breaks the following tests on GreenDragon: TestDataFormatterObjCCF.py TestDataFormatterObjCExpr.py TestDataFormatterObjCKVO.py TestDataFormatterObjCNSBundle.py TestDataFormatterObjCNSData.py TestDataFormatterObjCNSError.py TestDataFormatterObjCNSNumber.py TestDataFormatterObjCNSURL.py TestDataFormatterObjCPlain.py TestDataFormatterObjNSException.py https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/45288/	2022-07-12 15:22:29 -07:00
Kai Nacke	4ae254e488	Revert "[GISel] Unify use of getStackGuard" This reverts commit `e60b4fb2b7`.	2022-07-12 17:00:43 -04:00
Kai Nacke	e60b4fb2b7	[GISel] Unify use of getStackGuard Some rework of getStackGuard() based on comments in https://reviews.llvm.org/D129505. - getStackGuard() now creates and returns the destination register, simplifying calls - the pointer type is passed to getStackGuard() to avoid recomputation - removed PtrMemTy in emitSPDescriptorParent(), because this type is only used here when loading the value but not when storing the value Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D129576	2022-07-12 16:46:37 -04:00
Sunho Kim	2a0aa98c8d	[ORC] Remove unused function declaration. (NFC) Differential Revision: https://reviews.llvm.org/D129582	2022-07-13 05:13:31 +09:00
Sunho Kim	db995d72db	[JITLink][COFF] Initial COFF support. Adds initial COFF support in JITLink. This is able to run a hello world c program in x86 windows successfully. Implemented - COFF object loader - Static local symbols - Absolute symbols - External symbols - Weak external symbols - Common symbols - COFF jitlink-check support - All COMDAT selection type execpt largest - Implicit symobl size calculation - Rel32 relocation with PLT stub. - IMAGE_REL_AMD64_ADDR32NB relocation Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D128968	2022-07-13 03:52:43 +09:00
Yuanfang Chen	fcb7d76d65	[coroutine] add nomerge function attribute to `llvm.coro.save` It is illegal to merge two `llvm.coro.save` calls unless their `llvm.coro.suspend` users are also merged. Marks it "nomerge" for the moment. This reverts D129025. Alternative to D129025, which affects other token type users like WinEH. Reviewed By: ChuanqiXu Differential Revision: https://reviews.llvm.org/D129530	2022-07-12 10:39:38 -07:00
Nick Desaulniers	2240d72f15	[X86] initial -mfunction-return=thunk-extern support Adds support for: * `-mfunction-return=<value>` command line flag, and * `__attribute__((function_return("<value>")))` function attribute Where the supported <value>s are: * keep (disable) * thunk-extern (enable) thunk-extern enables clang to change ret instructions into jmps to an external symbol named __x86_return_thunk, implemented as a new MachineFunctionPass named "x86-return-thunks", keyed off the new IR attribute fn_ret_thunk_extern. The symbol __x86_return_thunk is expected to be provided by the runtime the compiled code is linked against and is not defined by the compiler. Enabling this option alone doesn't provide mitigations without corresponding definitions of __x86_return_thunk! This new MachineFunctionPass is very similar to "x86-lvi-ret". The <value>s "thunk" and "thunk-inline" are currently unsupported. It's not clear yet that they are necessary: whether the thunk pattern they would emit is beneficial or used anywhere. Should the <value>s "thunk" and "thunk-inline" become necessary, x86-return-thunks could probably be merged into x86-retpoline-thunks which has pre-existing machinery for emitting thunks (which could be used to implement the <value> "thunk"). Has been found to build+boot with corresponding Linux kernel patches. This helps the Linux kernel mitigate RETBLEED. * CVE-2022-23816 * CVE-2022-28693 * CVE-2022-29901 See also: * "RETBLEED: Arbitrary Speculative Code Execution with Return Instructions." * AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion * TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0 2022-07-12 * Return Stack Buffer Underflow / Return Stack Buffer Underflow / CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702 SystemZ may eventually want to support "thunk-extern" and "thunk"; both options are used by the Linux kernel's CONFIG_EXPOLINE. This functionality has been available in GCC since the 8.1 release, and was backported to the 7.3 release. Many thanks for folks that provided discrete review off list due to the embargoed nature of this hardware vulnerability. Many Bothans died to bring us this information. Link: https://www.youtube.com/watch?v=IF6HbCKQHK8 Link: https://github.com/llvm/llvm-project/issues/54404 Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1 Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60 Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html Reviewed By: aaron.ballman, craig.topper Differential Revision: https://reviews.llvm.org/D129572	2022-07-12 09:17:54 -07:00
Dawid Jurczak	165240fe38	[NFC] Fix compile time regression seen on some benchmarks after `a630ea3003` commit The goal of this change is fixing most of compile time slowdown seen after `a630ea3003` commit on lencod and sqlite3 benchmarks. There are 3 improvements included in this patch: 1. In getNumOperands when possible get value directly from SmallNumOps. 2. Inline getLargePtr by moving its definition to header. 3. In TBAAStructTypeNode::getField get all operands once instead taking operands in loop one after one. Differential Revision: https://reviews.llvm.org/D129468	2022-07-12 15:00:27 +02:00
Corentin Jabot	cc309721d2	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as its likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-12 14:34:30 +02:00
Nikita Popov	00797b88e0	[InlineAsm] Improve error messages for invalid constraint strings InlineAsm constraint string verification can fail for many reasons, but used to always print a generic "invalid type for inline asm constraint string" message -- which is especially confusing if the actual error is unrelated to the type, e.g. a failure to parse the constraint string. Change the verify API to return an Error with a more specific error message, and print that in the IR parser.	2022-07-12 11:41:16 +02:00
Nikita Popov	4bb7b6fae3	[IR] Remove support for float binop constant expressions As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179, this removes support for the floating-point binop constant expressions fadd, fsub, fmul, fdiv and frem. As part of this change, the C APIs LLVMConstFAdd, LLVMConstFSub, LLVMConstFMul, LLVMConstFDiv and LLVMConstFRem are removed. The LLVMBuild APIs should be used instead. Differential Revision: https://reviews.llvm.org/D129478	2022-07-12 09:40:49 +02:00
Kazu Hirata	ec9a0e36d9	[IPO] Remove addLTOOptimizationPasses and addLateLTOOptimizationPasses (NFC) The last uses were removed on Apr 15, 2022 in commit `2e6ac54cf4`. Differential Revision: https://reviews.llvm.org/D129460	2022-07-11 20:15:24 -07:00
Xiang1 Zhang	a45dd3d814	[X86] Support -mstack-protector-guard-symbol Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D129346	2022-07-12 10:17:00 +08:00
Xiang1 Zhang	643786213b	Revert "[X86] Support -mstack-protector-guard-symbol" This reverts commit `efbaad1c4a`. due to miss adding review info.	2022-07-12 10:14:32 +08:00
Xiang1 Zhang	efbaad1c4a	[X86] Support -mstack-protector-guard-symbol	2022-07-12 10:13:48 +08:00
Prabhdeep Singh Soni	ac892c70a4	[OMPIRBuilder] Add support for simdlen clause This patch adds OMPIRBuilder support for the simdlen clause for the simd directive. It uses the simdlen support in OpenMPIRBuilder when it is enabled in Clang. Simdlen is lowered by OpenMPIRBuilder by generating the loop.vectorize.width metadata. Reviewed By: jdoerfert, Meinersbur Differential Revision: https://reviews.llvm.org/D129149	2022-07-11 13:29:06 -04:00
spupyrev	eecd41aa09	Revert "Rebase: [Facebook] [MC] Introduce NeverAlign fragment type" This reverts commit `6d0528636a`.	2022-07-11 09:50:47 -07:00
Rafael Auler	6d0528636a	Rebase: [Facebook] [MC] Introduce NeverAlign fragment type Summary: Introduce NeverAlign fragment type. The intended usage of this fragment is to insert it before a pair of macro-op fusion eligible instructions. NeverAlign fragment ensures that the next fragment (first instruction in the pair) does not end at a given alignment boundary by emitting a minimal size nop if necessary. In effect, it ensures that a pair of macro-fusible instructions is not split by a given alignment boundary, which is a precondition for macro-op fusion in modern Intel Cores (64B = cache line size, see Intel Architecture Optimization Reference Manual, 2.3.2.1 Legacy Decode Pipeline: Macro-Fusion). This patch introduces functionality used by BOLT when emitting code with MacroFusion alignment already in place. The use case is different from BoundaryAlign and instruction bundling: - BoundaryAlign can be extended to perform the desired alignment for the first instruction in the macro-op fusion pair (D101817). However, this approach has higher overhead due to reliance on relaxation as BoundaryAlign requires in the general case - see https://reviews.llvm.org/D97982#2710638. - Instruction bundling: the intent of NeverAlign fragment is to prevent the first instruction in a pair ending at a given alignment boundary, by inserting at most one minimum size nop. It's OK if either instruction crosses the cache line. Padding both instructions using bundles to not cross the alignment boundary would result in excessive padding. There's no straightforward way to request instruction bundling to avoid a given end alignment for the first instruction in the bundle. LLVM: https://reviews.llvm.org/D97982 Manual rebase conflict history: https://phabricator.intern.facebook.com/D30142613 Test Plan: sandcastle Reviewers: #llvm-bolt Subscribers: phabricatorlinter Differential Revision: https://phabricator.intern.facebook.com/D31361547	2022-07-11 09:31:52 -07:00
David Sherwood	03fee6712a	[LoopVectorize] Add option to use active lane mask for loop control flow Currently, for vectorised loops that use the get.active.lane.mask intrinsic we only use the mask for predicated vector operations, such as masked loads and stores, etc. The loop itself is still controlled by comparing the canonical induction variable with the trip count. However, for some targets this is inefficient when it's cheap to use the mask itself to control the loop. This patch adds support for using the active lane mask for control flow by: 1. Generating the active lane mask for the next iteration of the vector loop, rather than the current one. If there are still any remaining iterations then at least the first bit of the mask will be set. 2. Extract the first bit of this mask and use this bit for the conditional branch. I did this by creating a new VPActiveLaneMaskPHIRecipe that sets up the initial PHI values in the vector loop pre-header. I've also made use of the new BranchOnCond VPInstruction for the final instruction in the loop region. Differential Revision: https://reviews.llvm.org/D125301	2022-07-11 13:46:55 +01:00
Abhina Sreeskantharajan	6e2329e33a	[SystemZ][z/OS] Force alignment to fix build failure on z/OS The following commit https://reviews.llvm.org/D125998 added a static_assert which was triggered on z/OS because bitfields are always aligned to 1 regardless of type. ``` error: static_assert failed due to requirement 'alignof(llvm::SmallVector<llvm::MDOperand, 0>) <= alignof(llvm::MDNode::Header)' "LargeStorageVector too strongly aligned" ``` The solution was to force the alignment to be size_t. Reviewed By: wolfgangp Differential Revision: https://reviews.llvm.org/D129369	2022-07-11 08:29:29 -04:00
Kazu Hirata	c13d04e599	[DWARFLinker] Remove unused declaration copyAbbrev (NFC) The corresponding definition was removed on Apr 26, 2021 in commit `233c24330b`.	2022-07-10 22:10:23 -07:00
Kazu Hirata	f2e1d2cec0	[GlobalISel] Remove unused declaration fewerElementsVectorSextInReg (NFC) The corresponding definition was removed on Dec 23, 2021 in commit `29f88b93fd`.	2022-07-10 20:41:02 -07:00
Nicolai Hähnle	ede600377c	ManagedStatic: remove many straightforward uses in llvm (Reapply after revert in `e9ce1a5880` due to Fuchsia test failures. Removed changes in lib/ExecutionEngine/ other than error categories, to be checked in more detail and reapplied separately.) Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 10:29:15 +02:00
Nicolai Hähnle	e9ce1a5880	Revert "ManagedStatic: remove many straightforward uses in llvm" This reverts commit `e6f1f06245`. Reverting due to a failure on the fuchsia-x86_64-linux buildbot.	2022-07-10 09:54:30 +02:00
Nicolai Hähnle	e6f1f06245	ManagedStatic: remove many straightforward uses in llvm Bulk remove many of the more trivial uses of ManagedStatic in the llvm directory, either by defining a new getter function or, in many cases, moving the static variable directly into the only function that uses it. Differential Revision: https://reviews.llvm.org/D129120	2022-07-10 09:15:08 +02:00
Fangrui Song	2c18e817ee	[Support] Delete redundant 'static' from namespace scope 'static constexpr'. NFC	2022-07-09 23:36:01 -07:00
Corentin Jabot	50416e5454	Revert "[Clang] Add a warning on invalid UTF-8 in comments." It is probable thart this change crashes on the powerpc bots. This reverts commit `355532a149`.	2022-07-09 17:18:35 +02:00
Lang Hames	7ac7837080	[JITLink][AArch64] Rename PointerToGOT and fix typo. PointerToGOT lowering was accidentally changed from Delta32 to Delta64 in `db37225803`. This patch moves it back to Delta32 and renames the generic aarch64 edge to Delta32ToGOT to avoid the ambiguity. No test case yet -- I haven't figured out how to write a succinct test case (this typically appears in CIEs in eh-frames).	2022-07-09 08:09:23 -07:00
Corentin Jabot	355532a149	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as its likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-09 11:26:45 +02:00
Leonard Chan	474c873148	Revert "[llvm] cmake config groundwork to have ZSTD in LLVM" This reverts commit `f07caf20b9` which seems to break upstream https://lab.llvm.org/buildbot/#/builders/109/builds/42253.	2022-07-08 13:48:05 -07:00
Cole Kissane	f07caf20b9	[llvm] cmake config groundwork to have ZSTD in LLVM - added `FindZSTD.cmake` - added a CMake option `LLVM_ENABLE_ZSTD` with behavior mirroring that of `LLVM_ENABLE_ZLIB` - likewise added have_zstd to compiler-rt/test/lit.common.cfg.py, clang-tools-extra/clangd/test/lit.cfg.py, and several lit.site.cfg.py.in files mirroring have_zlib behavior Reviewed By: leonardchan, MaskRay Differential Revision: https://reviews.llvm.org/D128465	2022-07-08 11:46:52 -07:00
Joseph Huber	5300263c70	[OpenMP] Add loop tripcount argument to kernel launch and remove push function Previously we added the `push_target_tripcount` function to send the loop tripcount to the device runtime so we knew how to configure the teams / threads for execute the loop for a teams distribute construct. This was implemented as a separate function mostly to avoid changing the interface for backwards compatbility. Now that we've changed it anyway and the new interface can take an arbitrary number of arguments via the struct without changing the ABI, we can move this to the new interface. This will simplify the runtime by removing unnecessary state between calls. Depends on D128550 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D128816	2022-07-08 14:44:16 -04:00
Joseph Huber	1fff116645	[OpenMP] Change OpenMP code generation for target region entries This patch changes the code we generate to enter a target region on the device. This is in-line with the new definition in the runtime that was added previously. Additionally we implement this in the OpenMPIRBuilder so that this code can be shared with Flang in the future. Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D128550	2022-07-08 14:44:11 -04:00
Cole Kissane	96063bfa90	[llvm] Remove unused and redundant crc32 funcction from llvm::compression::zlib namespace * Remove crc32 from zlib compression namespace, people should use the `llvm::crc32` instead. Reviewed By: MaskRay, leonardchan Differential Revision: https://reviews.llvm.org/D128754	2022-07-08 11:24:45 -07:00
Cole Kissane	ea61750c35	[NFC] Refactor llvm::zlib namespace * Refactor compression namespaces across the project, making way for a possible introduction of alternatives to zlib compression. Changes are as follows: * Relocate the `llvm::zlib` namespace to `llvm::compression::zlib`. Reviewed By: MaskRay, leonardchan, phosek Differential Revision: https://reviews.llvm.org/D128953	2022-07-08 11:19:07 -07:00
Nicolai Hähnle	5a731d733c	Fix test: LLVMGetBitcodeModule takes ownership of memory buffer Clarify this behavior in the C interface header file and fix a related bug in a test. Differential Revision: https://reviews.llvm.org/D129113	2022-07-08 20:06:44 +02:00
Matt Arsenault	13ac4c3de9	GlobalISel: Add buildBoolExtInReg helper	2022-07-08 11:55:08 -04:00
Matt Arsenault	1ee6ce9bad	GlobalISel: Allow forming atomic/volatile G_ZEXTLOAD SelectionDAG has a target hook, getExtendForAtomicOps, which it uses in the computeKnownBits implementation for ATOMIC_LOAD. This is pretty ugly (as is having a separate load opcode for atomics), so instead allow making use of atomic zextload. Enable this for AArch64 since the DAG path defaults in to the zext behavior. The tablegen changes are pretty ugly, but partially helps migrate SelectionDAG from using ISD::ATOMIC_LOAD to regular ISD::LOAD with atomic memory operands. For now the DAG emitter will emit matchers for patterns which the DAG will not produce. I'm still a bit confused by the intent of the isLoad/isStore/isAtomic bits. The DAG implementation rejects trying to use any of these in combination. For now I've opted to make the isLoad checks also check isAtomic, although I think having isLoad and isAtomic set on these makes most sense.	2022-07-08 11:55:08 -04:00
Valentin Clement	015834e455	[flang][openacc][NFC] Extract device_type parser to its own Move the device_type parser to a separate parser AccDeviceTypeExprList. Preparatory work for D106968. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D106967	2022-07-08 16:02:04 +02:00
Valentin Clement	36e24da8eb	[flang][openacc][NFC] Make self clause value optional in ACC.td and extract the parser Set the isOptional flag for the self clause. Move the optional and parenthesis part of the parser. Update the rest of the code to deal with the optional value. Preparatory work for D106968. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D106965	2022-07-08 15:45:12 +02:00
Johannes Doerfert	f6e0c05e3d	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit `f17639ea0c` as three AMDGPU tests haven't been updated. Will need to verify the changes are not regressions we should avoid.	2022-07-08 00:53:38 -05:00
Johannes Doerfert	f17639ea0c	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good even if some tests look like they regress. Fixes: https://github.com/llvm/llvm-project/issues/54981 Note: A previous version was flawed and consequently reverted in `6555558a80`.	2022-07-08 00:38:27 -05:00
Abinav Puthan Purayil	c42fe5bd7a	[GlobalISel][SelectionDAG] Implement the HasNoUse builtin predicate This change introduces the HasNoUse builtin predicate in PatFrags that checks for the absence of use of the first result operand. GlobalISelEmitter will allow source PatFrags with this predicate to be matched with destination instructions with empty outs. This predicate is required for selecting the no-return variant of atomic instructions in AMDGPU. Differential Revision: https://reviews.llvm.org/D125212	2022-07-08 09:47:33 +05:30
Joseph Huber	41fba3c107	[Metadata] Add 'exclude' metadata to add the exclude flags on globals This patchs adds a new metadata kind `exclude` which implies that the global variable should be given the necessary flags during code generation to not be included in the final executable. This is done using the ``SHF_EXCLUDE`` flag on ELF for example. This should make it easier to specify this flag on a variable without needing to explicitly check the section name in the target backend. Depends on D129053 D129052 Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D129151	2022-07-07 12:20:40 -04:00
Joseph Huber	1d2ce4da84	[Object] Add ELF section type for offloading objects Currently we use the `.llvm.offloading` section to store device-side objects inside the host, creating a fat binary. The contents of these sections is currently determined by the name of the section while it should ideally be determined by its type. This patch adds the new `SHT_LLVM_OFFLOADING` section type to the ELF section types. Which should make it easier to identify this specific data format. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D129052	2022-07-07 12:20:30 -04:00
Joseph Huber	ed801ad5e5	[Clang] Use metadata to make identifying embedded objects easier Currently we use the `embedBufferInModule` function to store binary strings containing device offloading data inside the host object to create a fatbinary. In the case of LTO, we need to extract this object from the LLVM-IR. This patch adds a metadata node for the embedded objects containing the embedded pointers and the sections they were stored at. This should create a cleaner interface for identifying these values. In the future it may be worthwhile to also encode an `ID` in the metadata corresponding to the object's special section type if relevant. This would allow us to extract the data from an object file and LLVM-IR using the same ID. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D129033	2022-07-07 12:20:25 -04:00
Nicolai Hähnle	fdf7e437bf	llvm-c: Add LLVMDeleteInstruction to fix a test issue Not deleting the loose instruction with metadata associated to it causes an assertion when the LLVMContext is destroyed. This was previously hidden by the fact that llvm-c-test does not call LLVMShutdown. The planned removal of ManagedStatic exposed this issue. Differential Revision: https://reviews.llvm.org/D129114	2022-07-07 14:29:20 +02:00
Sven van Haastregt	1d9086bf05	Fix use of uninitialized member in constructor The constructor does `Saver(Alloc)`, so `Alloc` should be initialized first. Move `Alloc` up in the declaration order. Fixes a -Wuninitialized warning when building with GCC 12.1. Reported-by: Mihail Atanassov <mihail.atanassov@arm.com>	2022-07-07 12:05:24 +01:00
Nikita Popov	4a579abd9f	[GlobalsModRef] Don't override getModRefBehavior() for CallBase BasicAA will already call getModRefBehavior() on the Function of the CallBase if there are no operand bundles. This happens through getBestAAResults(), i.e. it is a recursive call that will query other AA providers, not just the BasicAA implementation. As such, there is no need to reimplement the same functionality in GlobalsModRef, a combination of BasicAA and GlobalsModRef already handles it. This does mean that this no longer works under -disable-basic-aa, but that's a testing only option.	2022-07-07 10:35:44 +02:00
Sander de Smalen	6106a767b7	[AArch64][SME] Update load/store intrinsics to take predicate corresponding to element size. Instead of using <vscale x 16 x i1> for all the loads/stores, we now use the appropriate predicate type according to the element size, e.g. ld1b uses <vscale x 16 x i1> ld1w uses <vscale x 4 x i1> ld1q uses <vscale x 1 x i1> Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D129083	2022-07-07 07:39:27 +00:00
Nico Weber	e9fe20dab3	Revert "[Clang] Add a warning on invalid UTF-8 in comments." This reverts commit `4174f0ca61`. Also revert follow-up "[Clang] Fix invalid utf-8 detection" This reverts commit `bf45e27a67`. The second commit broke tests, see comments on https://reviews.llvm.org/D129223, and it sounds like the first commit isn't valid without the second one. So reverting both for now.	2022-07-06 22:51:52 +02:00
Nico Weber	39ed08f8d4	try to fix build after `babef908cc`	2022-07-06 22:15:09 +02:00
Noah Shutty	babef908cc	[llvm] [Debuginfod] DebuginfodCollection and DebuginfodServer for tracking local debuginfo. This library implements the class `DebuginfodCollection`, which scans a set of directories for binaries, classifying them according to whether they contain debuginfo. This also provides the `DebuginfodServer`, an `HTTPServer` which serves debuginfod's `/debuginfo` and `/executable` endpoints. This is intended as the final new supporting library required for `llvm-debuginfod`. As implemented here, `DebuginfodCollection` only finds ELF binaries and DWARF debuginfo. All other files are ignored. However, the class interface is format-agnostic. Generalizing to support other platforms will require refactoring of LLVM's object parsing libraries to eliminate use of `report_fatal_error` ([[ https://github.com/llvm/llvm-project/blob/main/llvm/lib/Object/WasmObjectFile.cpp#L74 \| e.g. when reading WASM files ]]), so that the debuginfod daemon does not crash when it encounters a malformed file on the disk. The `DebuginfodCollection` is tested by end-to-end tests of the debuginfod server (D114846). Reviewed By: mysterymath Differential Revision: https://reviews.llvm.org/D114845	2022-07-06 20:02:14 +00:00
Corentin Jabot	4174f0ca61	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as its likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-06 21:18:29 +02:00
Noah Shutty	8366e21ef1	[llvm] [Debuginfod] Add HTTP Server to Debuginfod library. This provides a minimal HTTP server interface and an implementation wrapping [[ https://github.com/yhirose/cpp-httplib \| cpp-httplib ]] in the Debuginfod library. If the Curl HTTP client is available (D112753) the server is tested by pinging it with the client. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D114415	2022-07-06 18:56:54 +00:00
Noah Shutty	484b1aa611	[llvm] [Debuginfod] Add cpp-httplib optional dependency. Adds optional dependency on cpp-httplib, a lightweight header-only HTTP server. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D113218	2022-07-06 18:40:56 +00:00
Chris Bieneman	e0b5208650	[NFC] [DirectX] Prefix for intrinsics should be dx `dxil` is an architecture supported by the DirectX backend. These intrinsics will likely be shared with other DirectX architectures like `dxbc`. Using a common prefix `dx` will make it more intuitive. Also the `dx` prefix is already set in the Triple, which causes intrinsics described here to be unmatchable via the ClangBuiltin mechanism.	2022-07-06 13:27:12 -05:00
Corentin Jabot	fb06dd3e8c	Revert "[Clang] Add a warning on invalid UTF-8 in comments." Reverting while I investigate build failures This reverts commit `e3dc56805f`.	2022-07-06 19:45:12 +02:00
Jin Xin Ng	65001f5777	[LTO][ELF] Add selective --save-temps= option Allows specific “temps” to be saved, instead of the current all-or-nothing nature of --save-temps. Multiple of these “temps” can be saved by specifying the argument multiple times. Differential Revision: https://reviews.llvm.org/D127778	2022-07-06 10:06:18 -07:00
Edd Barrett	ed8ef65f3d	[stackmaps] Start legalizing live variable operands Prior to this change, live variable operands passed to `llvm.experimental.stackmap` would be emitted directly to target nodes, meaning that they don't get legalised. The upshot of this is that LLVM may crash when encountering illegally typed target nodes. e.g. https://github.com/llvm/llvm-project/issues/21657 This change introduces a platform independent stackmap DAG node whose operands are legalised as per usual, thus avoiding aforementioned crashes. Note that some kinds of argument are still not handled properly, namely vectors, structs, and large integers, like i128s. These will need to be addressed in follow-up changes. Note also that this does not change the behaviour of `llvm.experimental.patchpoint`. A follow up change will do the same for this intrinsic. Differential review: https://reviews.llvm.org/D125680	2022-07-06 14:01:54 +01:00
Corentin Jabot	e3dc56805f	[Clang] Add a warning on invalid UTF-8 in comments. Introduce an off-by default `-Winvalid-utf8` warning that detects invalid UTF-8 code units sequences in comments. Invalid UTF-8 in other places is already diagnosed, as that cannot appear in identifiers and other grammar constructs. The warning is off by default as its likely to be somewhat disruptive otherwise. This warning allows clang to conform to the yet-to be approved WG21 "P2295R5 Support for UTF-8 as a portable source file encoding" paper. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D128059	2022-07-06 17:59:44 +02:00
Nikola Tesic	b5b6d3a41b	[Debugify] Port verify-debuginfo-preserve to NewPM Debugify in OriginalDebugInfo mode, introduced with D82545, runs only with legacy PassManager. This patch enables this utility for the NewPM. Differential Revision: https://reviews.llvm.org/D115351	2022-07-06 17:07:20 +02:00
Shilei Tian	1023ddaf77	[LLVM] Add the support for fmax and fmin in atomicrmw instruction This patch adds the support for `fmax` and `fmin` operations in `atomicrmw` instruction. For now (at least in this patch), the instruction will be expanded to CAS loop. There are already a couple of targets supporting the feature. I'll create another patch(es) to enable them accordingly. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D127041	2022-07-06 10:57:53 -04:00
Paul Robinson	08e4fe6c61	[X86] Add RDPRU instruction Add support for the RDPRU instruction on Zen2 processors. User-facing features: - Clang option -m[no-]rdpru to enable/disable the feature - Support is implicit for znver2/znver3 processors - Preprocessor symbol __RDPRU__ to indicate support - Header rdpruintrin.h to define intrinsics - "rdpru" mnemonic supported for assembler code Internal features: - Clang builtin __builtin_ia32_rdpru - IR intrinsic @llvm.x86.rdpru Differential Revision: https://reviews.llvm.org/D128934	2022-07-06 07:17:47 -07:00
Sunho Kim	30b6c51f51	[ORC][ORC_RT][AArch64] Implement TLS descriptor in ELFNixPlatform. Implements TLS descriptor relocations in JITLink ELF/AARCH64 backend and support the relevant runtime functions in ELFNixPlatform. Unlike traditional TLS model, TLS descriptor model requires linker to return the "offset" from thread pointer via relocaiton not the actual pointer to thread local variable. There is no public libc api for adding new allocations to TLS block dynamically which thread pointer points to. So, we support this by taking delta from thread base pointer to the actual thread local variable in our allocated section. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D128601	2022-07-06 20:12:22 +09:00
Nikita Popov	f96cb66d19	[ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC) As constant expressions can no longer trap, it only makes sense to call isSafeToSpeculativelyExecute on Instructions, so limit the API to accept only them, rather than general Operators or Values.	2022-07-06 11:12:49 +02:00
Nikita Popov	8ee913d83b	[IR] Remove Constant::canTrap() (NFC) As integer div/rem constant expressions are no longer supported, constants can no longer trap and are always safe to speculate. Remove the Constant::canTrap() method and its usages.	2022-07-06 10:36:47 +02:00
Nikita Popov	11950efe06	[ConstExpr] Remove div/rem constant expressions D128820 stopped creating div/rem constant expressions by default; this patch removes support for them entirely. The getUDiv(), getExactUDiv(), getSDiv(), getExactSDiv(), getURem() and getSRem() on ConstantExpr are removed, and ConstantExpr::get() now only accepts binary operators for which ConstantExpr::isSupportedBinOp() returns true. Uses of these methods may be replaced either by corresponding IRBuilder methods, or ConstantFoldBinaryOpOperands (if a constant result is required). On the C API side, LLVMConstUDiv, LLVMConstExactUDiv, LLVMConstSDiv, LLVMConstExactSDiv, LLVMConstURem and LLVMConstSRem are removed and corresponding LLVMBuild methods should be used. Importantly, this also means that constant expressions can no longer trap! This patch still keeps the canTrap() method to minimize diff -- I plan to drop it in a separate NFC patch. Differential Revision: https://reviews.llvm.org/D129148	2022-07-06 10:11:34 +02:00
Zaara Syeda	dbf6ab5ef9	[LSR] Fix bug for optimizing unused IVs to final values This is a fix for a crash reported for https://reviews.llvm.org/D118808 The fix is to only consider PHINodes which are induction phis. Fixes #55529 Differential Revision: https://reviews.llvm.org/D125990	2022-07-05 12:30:58 -04:00
Jay Foad	4dbc2876cf	[AMDGPU] GFX11 trivial NFC tweaks A few miscellaneous comment, whitespace and indentation tweaks.	2022-07-05 17:20:17 +01:00
Nikita Popov	935570b2ad	[ConstExpr] Don't create div/rem expressions This removes creation of udiv/sdiv/urem/srem constant expressions, in preparation for their removal. I've added a ConstantExpr::isDesirableBinOp() predicate to determine whether an expression should be created for a certain operator. With this patch, div/rem expressions can still be created through explicit IR/bitcode, forbidding them entirely will be the next step. Differential Revision: https://reviews.llvm.org/D128820	2022-07-05 15:54:53 +02:00
Archibald Elliott	1666f09933	[ARM] Add Support for Cortex-M85 This patch adds support for Arm's Cortex-M85 CPU. The Cortex-M85 CPU is an Arm v8.1m Mainline CPU, with optional support for MVE and PACBTI, both of which are enabled by default. Parts have been coauthored by by Mark Murray, Alexandros Lamprineas and David Green. Differential Revision: https://reviews.llvm.org/D128415	2022-07-05 10:43:31 +01:00
David Sherwood	77b13a57a9	[AArch64][SME] Add SME addha/va intrinsics This patch adds new the following SME intrinsics: @llvm.aarch64.sme.addva @llvm.aarch64.sme.addha Differential Revision: https://reviews.llvm.org/D127861	2022-07-05 09:47:17 +01:00
Florian Hahn	644a965c1e	[LV] Vectorize cases with larger number of RT checks, execute only if profitable. This patch replaces the tight hard cut-off for the number of runtime checks with a more accurate cost-driven approach. The new approach allows vectorization with a larger number of runtime checks in general, but only executes the vector loop (and runtime checks) if considered profitable at runtime. Profitable here means that the cost-model indicates that the runtime check cost + vector loop cost < scalar loop cost. To do that, LV computes the minimum trip count for which runtime check cost + vector-loop-cost < scalar loop cost. Note that there is still a hard cut-off to avoid excessive compile-time/code-size increases, but it is much larger than the original limit. The performance impact on standard test-suites like SPEC2006/SPEC2006/MultiSource is mostly neutral, but the new approach can give substantial gains in cases where we failed to vectorize before due to the over-aggressive cut-offs. On AArch64 with -O3, I didn't observe any regressions outside the noise level (<0.4%) and there are the following execution time improvements. Both `IRSmk` and `srad` are relatively short running, but the changes are far above the noise level for them on my benchmark system. ``` CFP2006/447.dealII/447.dealII -1.9% CINT2017rate/525.x264_r/525.x264_r -2.2% ASC_Sequoia/IRSmk/IRSmk -9.2% Rodinia/srad/srad -36.1% ``` `size` regressions on AArch64 with -O3 are ``` MultiSource/Applications/hbd/hbd 90256.00 106768.00 18.3% MultiSourc...ks/ASCI_Purple/SMG2000/smg2000 240676.00 257268.00 6.9% MultiSourc...enchmarks/mafft/pairlocalalign 472603.00 489131.00 3.5% External/S...2017rate/525.x264_r/525.x264_r 613831.00 630343.00 2.7% External/S...NT2006/464.h264ref/464.h264ref 818920.00 835448.00 2.0% External/S...te/538.imagick_r/538.imagick_r 1994730.00 2027754.00 1.7% MultiSourc...nchmarks/tramp3d-v4/tramp3d-v4 1236471.00 1253015.00 1.3% MultiSource/Applications/oggenc/oggenc 2108147.00 2124675.00 0.8% External/S.../CFP2006/447.dealII/447.dealII 4742999.00 4759559.00 0.3% External/S...rate/510.parest_r/510.parest_r 14206377.00 14239433.00 0.2% ``` Reviewed By: lebedev.ri, ebrevnov, dmgreen Differential Revision: https://reviews.llvm.org/D109368	2022-07-04 15:11:39 +01:00
Nikita Popov	7283f48a05	[IR] Remove support for insertvalue constant expression This removes the insertvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. This is very similar to the extractvalue removal from D125795. insertvalue is also not supported in bitcode, so no auto-ugprade is necessary. ConstantExpr::getInsertValue() can be replaced with IRBuilder::CreateInsertValue() or ConstantFoldInsertValueInstruction(), depending on whether a constant result is required (with the latter being fallible). The ConstantExpr::hasIndices() and ConstantExpr::getIndices() methods also go away here, because there are no longer any constant expressions with indices. Differential Revision: https://reviews.llvm.org/D128719	2022-07-04 09:27:22 +02:00
esmeyi	d2a35e4d39	[AIX] Handling the label alignment of a global variable with its multiple aliases. This patch handles the case where a variable has multiple aliases. AIX's assembly directive .set is not usable for the aliasing purpose, and using different labels allows AIX to emulate symbol aliases. If a value is emitted between any two labels, meaning they are not aligned, XCOFF will automatically calculate the offset for them. This patch implements: 1) Emits the label of the alias just before emitting the value of the sub-element that the alias referred to. 2) A set of aliases that refers to the same offset should be aligned. 3) We didn't emit aliasing labels for common and zero-initialized local symbols in PPCAIXAsmPrinter::emitGlobalVariableHelper, but emitted linkage for them in AsmPrinter::emitGlobalAlias, which caused a FAILURE. This patch fixes the bug by blocking emitting linkage for the alias without a label. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D124654	2022-07-03 23:16:16 -04:00
Joseph Huber	228c8f9cc0	[ObjectYAML] Add offloading binary implementations for obj2yaml and yaml2obj This patchs adds the necessary code for inspecting or creating offloading binaries using the standing `obj2yaml` and `yaml2obj` features in LLVM. Depends on D127774 Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D127776	2022-07-01 21:13:18 -04:00
Rong Xu	b764e58865	Remove redundant code. [NFC] isAssumeLikeIntrinsic() is a superset of isLifetimeStartOrEnd().	2022-07-01 10:58:18 -07:00
Xiang Li	43dc319049	[DirectX] add thread/group id DXIL operations. Add DXIL operation for thread/group id operations. ID Name Description 93 ThreadId reads the thread ID 94 GroupId reads the group ID (SV_GroupID) 95 ThreadIdInGroup reads the thread ID within the group (SV_GroupThreadID) 96 FlattenedThreadIdInGroup provides a flattened index for a given thread within a given group (SV_GroupIndex) Also add llvm intrinsic which map to these intrinsics to DXIL operation. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D127990	2022-07-01 10:56:07 -07:00
Martin Sebor	0d68ff87d2	[InstCombine] Transform strrchr to memrchr for constant strings Add an emitter for the memrchr common extension and simplify the strrchr call handler to use it. This enables transforming calls with the empty string to the test C ? S : 0. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D128954	2022-07-01 11:10:00 -06:00
Alexey Lapshin	554aea52d7	[reland][Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef. This review is extracted from D96035. This patch adds possibility to keep not only DwarfStringPoolEntry, but also pointer to it. The DwarfStringPoolEntryRef keeps reference to the string map entry. String map keeps string data and corresponding DwarfStringPoolEntry info. Not all string map entries may be included into the result, and then not all string entries should have DwarfStringPoolEntry info. Currently StringMap keeps DwarfStringPoolEntry for all entries. It leads to extra memory usage. This patch allows to keep DwarfStringPoolEntry info only for entries which really need it. [reland] : make msan happy. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D126883	2022-07-01 20:08:09 +03:00
Fazlay Rabbi	38bcd483dd	[OpenMP] Initial parsing and semantic support for 'parallel masked taskloop simd' construct This patch gives basic parsing and semantic support for "parallel masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.10) Differential Revision: https://reviews.llvm.org/D128946	2022-07-01 08:57:15 -07:00
Andrew Ng	c0ef1ffc9e	[Build][NFC] Fixes for building on Windows with libc++ Differential Revision: https://reviews.llvm.org/D128514	2022-07-01 15:37:57 +01:00
Nikita Popov	21933b2f7f	[IRBuilder] Move CreateNeg() to fold API Remove the CreateNeg() method from IRBuilderFolder and base it on CreateSub(0, V) instead, which will call FoldNoWrapBinaryOp(). May not be NFC if InstSimplifyFolder is used.	2022-07-01 14:54:10 +02:00
Nikita Popov	5c8021777c	[IRBuilder] Move CreateNot() to fold API Drop the IRBuilderFolder method entirely and base this on CreateXor(V, -1) instead, so this will now go through FoldBinOp. May not be NFC if the InstSimplifyBuilder is used.	2022-07-01 14:48:57 +02:00
Chen Zheng	758de0e931	[InstructionSimplify] handle denormal input for fcmp Handle denormal constant input for fcmp instructions based on the denormal handling mode. Reviewed By: spatel, dcandler Differential Revision: https://reviews.llvm.org/D128647	2022-07-01 03:51:28 -04:00
Nikita Popov	9ac386495d	[ConstExpr] Don't create insertvalue expressions In preparation for the removal in D128719, this stops creating insertvalue constant expressions (well, unless they are directly used in LLVM IR). Differential Revision: https://reviews.llvm.org/D128792	2022-07-01 09:23:28 +02:00
Piotr Sobczak	b6ef36a1c4	[AMDGPU] Update WMMA intrinsics with explicit f16 types Update intrinsics to use n x f16 and n x i16 instead of 32-bit types. This may avoid the need for a bitcast and is probably less confusing. Depends on making v16f16 and v16i16 types legal. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D128951	2022-07-01 08:55:25 +02:00
Xiang1 Zhang	72a23cef7e	[ISel] Match all bits when merge undefs for DAG combine Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D128570	2022-07-01 09:09:43 +08:00
Xiang1 Zhang	64f44a90ef	Revert "[ISel] Match all bits when merge undef(s) for DAG combine" This reverts commit `5fe5aa284e`.	2022-07-01 08:59:04 +08:00
Xiang1 Zhang	5fe5aa284e	[ISel] Match all bits when merge undef(s) for DAG combine	2022-07-01 08:58:00 +08:00
Fazlay Rabbi	d64ba896d3	[OpenMP] Initial parsing and sema support for 'parallel masked taskloop' construct This patch gives basic parsing and semantic support for "parallel masked taskloop" construct introduced in OpenMP 5.1 (section 2.16.9) Differential Revision: https://reviews.llvm.org/D128834	2022-06-30 11:44:17 -07:00
Jonas Devlieghere	21f1dca125	[llvm] Fix the modules build Fixes error: missing '#include "llvm/IR/FMF.h"'; 'FastMathFlags' must be defined before it is used in llvm/include/llvm/IR/NoFolder.h.	2022-06-30 08:58:37 -07:00
Piotr Sobczak	4874838a63	[AMDGPU] gfx11 WMMA instruction support gfx11 introduces new WMMA (Wave Matrix Multiply-accumulate) instructions. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D128756	2022-06-30 11:13:45 -04:00
Nikita Popov	f34dcf2763	[IRBuilder] Migrate all binops to folding API Migrate all binops to use FoldXYZ rather than CreateXYZ APIs, which are compatible with InstSimplifyFolder and fallible constant folding. Rather than continuing to add one method for every single operator, add a generic FoldBinOp (plus variants for nowrap, exact and fmf operators), which we would need anyway for CreateBinaryOp. This change is not NFC because IRBuilder with InstSimplifyFolder may perform more folding. However, this patch changes SCEVExpander to not use the folder in InsertBinOp to minimize practical impact and keep this change as close to NFC as possible.	2022-06-30 16:41:17 +02:00
Daniel Bertalan	a3f67f0920	[lld-macho] Initial support for Linker Optimization Hints Linker optimization hints mark a sequence of instructions used for synthesizing an address, like ADRP+ADD. If the referenced symbol ends up close enough, it can be replaced by a faster sequence of instructions like ADR+NOP. This commit adds support for 2 of the 7 defined ARM64 optimization hints: - LOH_ARM64_ADRP_ADD, which transforms a pair of ADRP+ADD into ADR+NOP if the referenced address is within +/- 1 MiB - LOH_ARM64_ADRP_ADRP, which transforms two ADRP instructions into ADR+NOP if they reference the same page These two kinds already cover more than 50% of all LOHs in chromium_framework. Differential Review: https://reviews.llvm.org/D128093	2022-06-30 06:28:42 +02:00
Chuanqi Xu	0b5ead6590	[WebAssembly] Don't set musttail for coroutines when tail-call is not enabled The C++20 Coroutines couldn't be compiled to WebAssembly due to an optimization named symmetric transfer requires the support for musttail calls but WebAssembly doesn't support it yet. This patch tries to fix the problem by adding a supportsTailCalls method to TargetTransformImpl to skip the symmetric transfer when tail-call feature is not supported. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D128794	2022-06-30 11:15:40 +08:00
Vitaly Buka	72cd6b6c83	Revert "[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef." Breaks msan bot, see D126883 This reverts commit `77df3be0de`.	2022-06-29 17:53:42 -07:00
Joseph Huber	f892ddb3be	[OpenMP] Add variant extension that applies to declarations This patch adds a new extension to the `omp begin / end declare variant` support that causes it to apply to function declarations as well. This is explicitly not done in the standard, but can be useful in some situations so we should provide it as an extension. This will allow us to uniquely bind and overload existing definitions with a simple declaration using variants. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D124624	2022-06-29 15:04:26 -04:00
Corentin Jabot	64ab2b1dcc	Improve handling of static assert messages. Instead of dumping the string literal (which quotes it and escape every non-ascii symbol), we can use the content of the string when it is a 8 byte string. Wide, UTF-8/UTF-16/32 strings are still completely escaped, until we clarify how these entities should behave (cf https://wg21.link/p2361). `FormatDiagnostic` is modified to escape non printable characters and invalid UTF-8. This ensures that unicode characters, spaces and new lines are properly rendered in static messages. This make clang more consistent with other implementation and fixes this tweet https://twitter.com/jfbastien/status/1298307325443231744 :) Of note, `PaddingChecker` did print out new lines that were later removed by the diagnostic printing code. To be consistent with its tests, the new lines are removed from the diagnostic. Unicode tables updated to both use the Unicode definitions and the Unicode 14.0 data. U+00AD SOFT HYPHEN is still considered a print character to match existing practices in terminals, in addition of being considered a formatting character as per Unicode. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D108469	2022-06-29 14:57:35 +02:00
Nikita Popov	1271b8f57a	[Bitcode] Restore bitcast expression auto-upgrade Restore the autoupgrade from bitcast to ptrtoint+inttoptr, which was lost as part of D127729. This fixes the backwards compatibility issue noted in: https://reviews.llvm.org/D127729#inline-1236519	2022-06-29 14:35:56 +02:00
Nikita Popov	66a16b2848	[IRBuilder] Migrate div/rem to use fold infrastructure Migrate udiv, sdiv, urem, and srem to use the FoldXYZ rather than the CreateXYZ infrastructure.	2022-06-29 13:17:02 +02:00
Florian Hahn	675080a453	[SCEV] Construct SCEV iteratively. This patch updates SCEV construction to work iteratively instead of recursively in most cases. It resolves stack overflow issues when trying to construct SCEVs for certain inputs, e.g. PR45201. The basic approach is to to use a worklist to queue operands of V which need to be created before V. To do so, the current patch adds a getOperandsToCreate function which collects the operands SCEV construction depends on for a given value. This is a slight duplication with createSCEV. At the moment, SCEVs for phis are still created recursively. Fixes #32078, #42594, #44546, #49293, #49599, #55333, #55511 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D114650	2022-06-29 11:29:31 +01:00
Pavel Samolysov	8958057fb1	[ArgPromotion] Move isDenselyPacked static member (NFC) The `isDenselyPacked` static member of the `ArgumentPromotionPass` class is not used in the class itself anymore. The single known user of the function is in the `AttributorAttributes.cpp` file, so the function has been moved into the file. Differential Revision: https://reviews.llvm.org/D128725	2022-06-29 10:45:10 +03:00
luxufan	cbeca742a4	[llvm-readobj][RISCV] Support dumping PT_RISCV_ATTRIBUTES This patch drops the prefix `PT_RISCV_` when dumping `PT_RISCV_ATTRIBUTES`. GNU readelf dumps it as `RISCV_ATTRIBUT`. Because GNU readelf uses something like `%-14.14s` so only the first 14 bytes are printed. Differential Revision: https://reviews.llvm.org/D128493	2022-06-29 15:13:45 +08:00
esmeyi	ff855f5ec0	Pass code-model through Module IR to [llc]. Currently, the code-model specified in IR can't be captured by [llc]. This patch fixes that. Reviewed By: shchenz, MaskRay Differential Revision: https://reviews.llvm.org/D128623	2022-06-29 02:30:13 -04:00
Chen Zheng	370127b7d5	[XCOFF] change default program code csect alignment to 32 This is the same with commercial XLC on AIX. Reviewed By: Esme Differential Revision: https://reviews.llvm.org/D114419	2022-06-29 04:16:01 +00:00
Luo, Yuanke	5cb0979870	[X86][AMX] Split greedy RA for tile register When we fill the shape to tile configure memory, the shape is gotten from AMX pseudo instruction. However the register for the shape may be split or spilled by greedy RA. That cause we fill the shape to config memory after ldtilecfg is executed, so that the shape configuration would be wrong. This patch is to split the tile register allocation from greedy register allocation, so that after tile registers are allocated the shape registers are still virtual register. The shape register only may be redefined or multi-defined by phi elimination pass, two address pass. That doesn't affect tile register configuration. Differential Revision: https://reviews.llvm.org/D128584	2022-06-29 10:35:43 +08:00
Fazlay Rabbi	73e5d7bdff	[OpenMP] Initial parsing and sema support for 'masked taskloop simd' construct This patch gives basic parsing and semantic support for "masked taskloop simd" construct introduced in OpenMP 5.1 (section 2.16.8) Differential Revision: https://reviews.llvm.org/D128693	2022-06-28 15:27:49 -07:00
Corentin Jabot	a774ba7f60	Revert "Improve handling of static assert messages." This reverts commit `870b6d2183`. This seems to break some libc++ tests, reverting while investigating	2022-06-29 00:03:23 +02:00
Guozhi Wei	ddc9e8861c	[MachineCombiner, AArch64] Add a new pattern A-(B+C) => (A-B)-C to reduce latency Add a new pattern A - (B + C) ==> (A - B) - C to give machine combiner a chance to evaluate which instruction sequence has lower latency. Differential Revision: https://reviews.llvm.org/D124564	2022-06-28 21:42:51 +00:00
Alexey Lapshin	77df3be0de	[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef. This review is extracted from D96035. This patch adds possibility to keep not only DwarfStringPoolEntry, but also pointer to it. The DwarfStringPoolEntryRef keeps reference to the string map entry. String map keeps string data and corresponding DwarfStringPoolEntry info. Not all string map entries may be included into the result, and then not all string entries should have DwarfStringPoolEntry info. Currently StringMap keeps DwarfStringPoolEntry for all entries. It leads to extra memory usage. This patch allows to keep DwarfStringPoolEntry info only for entries which really need it. Reviewed By: JDevlieghere Differential Revision: https://reviews.llvm.org/D126883	2022-06-29 00:12:03 +03:00
Corentin Jabot	870b6d2183	Improve handling of static assert messages. Instead of dumping the string literal (which quotes it and escape every non-ascii symbol), we can use the content of the string when it is a 8 byte string. Wide, UTF-8/UTF-16/32 strings are still completely escaped, until we clarify how these entities should behave (cf https://wg21.link/p2361). `FormatDiagnostic` is modified to escape non printable characters and invalid UTF-8. This ensures that unicode characters, spaces and new lines are properly rendered in static messages. This make clang more consistent with other implementation and fixes this tweet https://twitter.com/jfbastien/status/1298307325443231744 :) Of note, `PaddingChecker` did print out new lines that were later removed by the diagnostic printing code. To be consistent with its tests, the new lines are removed from the diagnostic. Unicode tables updated to both use the Unicode definitions and the Unicode 14.0 data. U+00AD SOFT HYPHEN is still considered a print character to match existing practices in terminals, in addition of being considered a formatting character as per Unicode. Reviewed By: aaron.ballman, #clang-language-wg Differential Revision: https://reviews.llvm.org/D108469	2022-06-28 22:26:00 +02:00
Alexey Lapshin	2b747241a6	[DWARFLinker] mark odr candidates inside the same object file. This patch is extracted from D86539. Current implementation of lookForDIEsToKeep() function skips types duplications basing on the getCanonicalDIEOffset() data: ``` if (AttrSpec.Form != dwarf::DW_FORM_ref_addr && (UseOdr \|\| IsModuleRef) && Info.Ctxt && Info.Ctxt != ReferencedCU->getInfo(Info.ParentIdx).Ctxt && Info.Ctxt->getCanonicalDIEOffset() && isODRAttribute(AttrSpec.Attr)) <<<<< continue; ``` But that field is set after all compile units inside object file are processed: ``` for (auto &CurrentUnit : OptContext.CompileUnits) lookForDIEsToKeep(.., &CurrentUnit, ..); // check CanonicalDIEOffset DIECloner.cloneAllCompileUnits(); // set CanonicalDIEOffset ``` Thus, if the object file contains several compilation units - types would not be deduplicated. The above solution works well for the case when the object file contains only one compilation unit. But if the object file contains several compilation units then types would not be deduplicated between these compilation units. This patch changes the algorithm so that types were deduplicated between compilation units from the same object file. It produces binary incompatible output for the cases when several compilation units are located inside the same object file. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D125469	2022-06-28 19:48:49 +03:00
Rahman Lavaee	0aa6df6575	[Propeller] Encode address offsets of basic blocks relative to the end of the previous basic blocks. This is a resurrection of D106421 with the change that it keeps backward-compatibility. This means decoding the previous version of `LLVM_BB_ADDR_MAP` will work. This is required as the profile mapping tool is not released with LLVM (AutoFDO). As suggested by @jhenderson we rename the original section type value to `SHT_LLVM_BB_ADDR_MAP_V0` and assign a new value to the `SHT_LLVM_BB_ADDR_MAP` section type. The new encoding adds a version byte to each function entry to specify the encoding version for that function. This patch also adds a feature byte to be used with more flexibility in the future. An use-case example for the feature field is encoding multi-section functions more concisely using a different format. Conceptually, the new encoding emits basic block offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, offsets must be aggregated along with basic block sizes to calculate the final offsets of basic blocks relative to the function address. This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 17% total reduction in the size of the bb-address-map section (from about 11MB to 9MB for the clang PGO binary). The extra two bytes (version and feature fields) incur a small 3% size overhead to the `LLVM_BB_ADDR_MAP` section size. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D121346	2022-06-28 07:42:54 -07:00
Nikita Popov	278a47cc92	[IRBuilder] Migrate vector operations to fold infrastructure Migrate extractelement, insertelement and shufflevector to use the FoldXYZ rather than CreateXYZ APIs. This is probably NFC in practice, because the places using InstSimplifyFolder probably aren't using vector operations.	2022-06-28 15:11:15 +02:00
Pavel Samolysov	170c4d21bd	[ArgPromotion] Unify byval promotion with non-byval It makes sense to handle byval promotion in the same way as non-byval but also allowing `store` instructions. However, these should use the same checks as the `load` instructions do, i.e. be part of the `ArgsToPromote` collection. For these instructions, the check for interfering modifications can be disabled, though. The promotion algorithm itself has been modified a lot: all the accesses (i.e. loads and stores) are rewritten to the emitted `alloca` instructions. To optimize these new `alloca`s out, the `PromoteMemToReg` function from `Transforms/Utils/PromoteMemoryToRegister.cpp` file is invoked after promotion. In order to let the `PromoteMemToReg` promote as many `alloca`s as it is possible, there should be no `GEP`s from the `alloca`s. To eliminate the `GEP`s, its own `alloca` is generated for every argument part because a single `alloca` for the whole argument (that significantly simplifies the code of the pass though) unfortunately cannot be used. The idea comes from the following discussion: https://reviews.llvm.org/D124514#3479676 Differential Revision: https://reviews.llvm.org/D125485	2022-06-28 15:19:58 +03:00
David Sherwood	054faac9f9	[AArch64][SME] Add SVE2 psel, uclamp, sclamp and revd IR intrinsics When the SME feature is enabled we also gain access to a few extra SVE2 instructions. This patch adds LLVM IR intrinsics to make use of these new instructions: @llvm.aarch64.sve.psel @llvm.aarch64.sve.revd @llvm.aarch64.sve.sclamp @llvm.aarch64.sve.uclamp Differential Revision: https://reviews.llvm.org/D128332	2022-06-28 10:25:06 +01:00
Sander de Smalen	180cc74de9	[AArch64] Update SME load/store intrinsics to work on opaque pointers. These intrinsics should be able to use opaque pointers, because the load/store type is already encoded in their names and return/operand type. Reviewed By: c-rhodes Differential Revision: https://reviews.llvm.org/D128505	2022-06-28 09:50:11 +01:00
David Sherwood	f916ee0fb1	[AArch64][SME] Add SME outer product intrinsics This patch adds the following intrinsics to support the SME ACLE: * @llvm.aarch64.sme.mopa: Non-widening outer product + accumulate * @llvm.aarch64.sme.mops: Non-widening outer product + subtract * @llvm.aarch64.sme.mopa.wide: Widening outer product + accumulate * @llvm.aarch64.sme.mops.wide: Widening outer product + subtract * @llvm.aarch64.sme.smopa.wide: Widening signed sum of outer product + accumulate * @llvm.aarch64.sme.smops.wide: Widening signed sum of outer product + subtract * @llvm.aarch64.sme.umopa.wide: Widening unsigned sum of outer product + accumulate * @llvm.aarch64.sme.umops.wide: Widening unsigned sum of outer product + subtract * @llvm.aarch64.sme.sumopa.wide: Widening signed by unsigned sum of outer product + accumulate * @llvm.aarch64.sme.sumops.wide: Widening signed by unsigned sum of outer product + subtract * @llvm.aarch64.sme.usmopa.wide: Widening unsigned by signed sum of outer product + accumulate * @llvm.aarch64.sme.usmops.wide: Widening unsigned by signed sum of outer product + subtract Differential Revision: https://reviews.llvm.org/D127956	2022-06-28 09:41:44 +01:00
Nikita Popov	5548e807b5	[IR] Remove support for extractvalue constant expression This removes the extractvalue constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. extractvalue is already not supported in bitcode, so we do not need to worry about bitcode auto-upgrade. Uses of ConstantExpr::getExtractValue() should be replaced with IRBuilder::CreateExtractValue() (if the fact that the result is constant is not important) or ConstantFoldExtractValueInstruction() (if it is). Though for this particular case, it is also possible and usually preferable to use getAggregateElement() instead. The C API function LLVMConstExtractValue() is removed, as the underlying constant expression no longer exists. Instead, LLVMBuildExtractValue() should be used (which will constant fold or create an instruction). Depending on the use-case, LLVMGetAggregateElement() may also be used instead. Differential Revision: https://reviews.llvm.org/D125795	2022-06-28 10:40:17 +02:00
Guillaume Chatelet	3c126d5fe4	[Alignment] Replace commonAlignment with std::min `commonAlignment` is a shortcut to pick the smallest of two `Align` objects. As-is it doesn't bring much value compared to `std::min`. Differential Revision: https://reviews.llvm.org/D128345	2022-06-28 07:15:02 +00:00
wlei	7e86b13c63	[CSSPGO][llvm-profgen] Reimplement SampleContextTracker using context trie This is the followup patch to https://reviews.llvm.org/D125246 for the `SampleContextTracker` part. Before the promotion and merging of the context is based on the SampleContext(the array of frame), this causes a lot of cost to the memory. This patch detaches the tracker from using the array ref instead to use the context trie itself. This can save a lot of memory usage and benefit both the compiler's CS inliner and llvm-profgen's pre-inliner. One structure needs to be specially treated is the `FuncToCtxtProfiles`, this is used to get all the functionSamples for one function to do the merging and promoting. Before it search each functions' context and traverse the trie to get the node of the context. Now we don't have the context inside the profile, instead we directly use an auxiliary map `ProfileToNodeMap` for profile , it initialize to create the FunctionSamples to TrieNode relations and keep updating it during promoting and merging the node. Moreover, I was expecting the results before and after remain the same, but I found that the order of FuncToCtxtProfiles matter and affect the results. This can happen on recursive context case, but the difference should be small. Now we don't have the context, so I just used a vector for the order, the result is still deterministic. Measured on one huge size(12GB) profile from one of our internal service. The profile similarity difference is 99.999%, and the running time is improved by 3X(debug mode) and the memory is reduced from 170GB to 90GB. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D127031	2022-06-27 23:22:21 -07:00
wlei	aa58b7b1e3	[CSSPGO][llvm-profgen] Reimplement computeSummaryAndThreshold using context trie Follow-up patch to https://reviews.llvm.org/D125246, support `computeSummaryAndThreshold` based on context trie. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D127026	2022-06-27 23:22:21 -07:00
wlei	eba5749262	[CSSPGO][llvm-profgen] Reimplement CS profile generator using context trie Our investigation showed ProfileMap's key is the bottleneck of the memory consumption for CS profile generation on some large services. This patch tries to optimize it by storing the CS function samples using the context trie tree structure instead of the context frame array ref. Parts of code in `ContextTrieNode` are reused. Our experiment on one internal service showed that the context key's memory can be reduced from 80GB to 300MB. To be compatible with non-CS profiles, the profile writer still needs to use ProfileMap as input, so rebuild the ProfileMap using the context trie in `postProcessProfiles`. The optimization is not complete yet, next step is to reimplement Pre-inliner or profile trimmer, after that, ProfileMap should be small to be written. Reviewed By: hoy, wenlei Differential Revision: https://reviews.llvm.org/D125246	2022-06-27 23:22:21 -07:00
Wolfgang Pieb	a630ea3003	Reland [Metadata] Add a resize capability to MDNodes and add a push_back interface to MDNode Fixed a bug with double destruction of operands and corrected a test issue. Note that this patch leads to a slight increase in compile time (I measured about .3%) and a slight increase in memory usage. The increased memory usage should be offset once resizing is used to a larger extent. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D125998	2022-06-27 16:23:11 -07:00
Mitch Phillips	dacfa24f75	Delete 'llvm.asan.globals' for global metadata. Now that we have the sanitizer metadata that is actually on the global variable, and now that we use debuginfo in order to do symbolization of globals, we can delete the 'llvm.asan.globals' IR synthesis. This patch deletes the 'location' part of the __asan_global that's embedded in the binary as well, because it's unnecessary. This saves about ~1.7% of the optimised non-debug with-asserts clang binary. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D127911	2022-06-27 14:40:40 -07:00
Yuanfang Chen	6678f8e505	[ubsan] Using metadata instead of prologue data for function sanitizer Information in the function `Prologue Data` is intentionally opaque. When a function with `Prologue Data` is duplicated. The self (global value) references inside `Prologue Data` is still pointing to the original function. This may cause errors like `fatal error: error in backend: Cannot represent a difference across sections`. This patch detaches the information from function `Prologue Data` and attaches it to a function metadata node. This and D116130 fix https://github.com/llvm/llvm-project/issues/49689. Reviewed By: pcc Differential Revision: https://reviews.llvm.org/D115844	2022-06-27 12:09:13 -07:00
Daniel Thornburgh	eb5af0acf0	[Symbolize] Add log markup --filter to llvm-symbolizer. This adds a --filter option to llvm-symbolizer. This takes log-bearing symbolizer markup from stdin and writes a human-readable version to stdout. For now, this only implements the "symbol" markup tag; all others are passed through unaltered. This is a proof-of-concept bit of functionalty; implement the various tags is more-or-less just a matter of hooking up various parts of the Symbolize library to the architecture established here. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D126980	2022-06-27 10:44:15 -07:00
Ritanya B Bharadwaj	8322fe200d	Adding support for target in_reduction Implementing target in_reduction by wrapping target task with host task with in_reduction and if clause. This is in compliance with OpenMP 5.0 section: 2.19.5.6. So, this ``` for (int i=0; i<N; i++) { res = res+i } ``` will become ``` #pragma omp task in_reduction(+:res) if(0) #pragma omp target map(res) for (int i=0; i<N; i++) { res = res+i } ``` Reviewed By: ABataev Differential Revision: https://reviews.llvm.org/D125669	2022-06-27 10:36:46 -05:00
Nikita Popov	f65c88c42f	[GlobalOpt] Fix memset handling in global ctor evaluation (PR55859) The global ctor evaluator currently handles by checking whether the memset memory is already zero, and skips it in that case. However, it only actually checks the first byte of the memory being set. This patch extends the code to check all bytes being set. This is done byte-by-byte to avoid converting undef values to zeros in larger reads. However, the handling is still not completely correct, because there might still be padding bytes (though probably this doesn't matter much in practice, as I'd expect global variable padding to be zero-initialized in practice). Mostly fixes https://github.com/llvm/llvm-project/issues/55859. Differential Revision: https://reviews.llvm.org/D128532	2022-06-27 16:50:49 +02:00
Bradley Smith	a83aa33d1b	[IR] Move vector.insert/vector.extract out of experimental namespace These intrinsics are now fundemental for SVE code generation and have been present for a year and a half, hence move them out of the experimental namespace. Differential Revision: https://reviews.llvm.org/D127976	2022-06-27 10:48:45 +00:00
Nikita Popov	217e85761c	[ArgPromotion] Remove legacy PM support Support for the legacy pass manager in ArgPromotion causes complications in D125485. As the legacy pass manager for middle-end optimizations is unsupported, drop ArgPromotion from the legacy pipeline, rather than introducing additional complexity to deal with it. Differential Revision: https://reviews.llvm.org/D128536	2022-06-27 09:42:17 +02:00
Chuanqi Xu	24e53b01d5	Revert "[Coroutines] Only do symmetric transfer if optimization is on" This reverts commit `7782e080e8`. According to the discussion of WG21, symmetric transfer is a desired feature.	2022-06-27 10:54:56 +08:00
Kazu Hirata	d08f34b592	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-26 18:31:51 -07:00
Kazu Hirata	a81b64a1fb	[llvm] Use Optional::has_value instead of Optional::hasValue (NFC) This patch replaces x.hasValue() with x.has_value() where x is not contextually convertible to bool.	2022-06-26 16:10:42 -07:00
Kazu Hirata	a7938c74f1	[llvm] Don't use Optional::hasValue (NFC) This patch replaces Optional::hasValue with the implicit cast to bool in conditionals only.	2022-06-25 21:42:52 -07:00
Philip Reames	b61235739f	Fix build after `ab736a27` This class is templatized by the concrete subclass - not all subclasses have a data layout field called DL.	2022-06-25 12:10:19 -07:00
Kazu Hirata	3b7c3a654c	Revert "Don't use Optional::hasValue (NFC)" This reverts commit `aa8feeefd3`.	2022-06-25 11:56:50 -07:00
Kazu Hirata	aa8feeefd3	Don't use Optional::hasValue (NFC)	2022-06-25 11:55:57 -07:00
Kazu Hirata	d152e50c15	[llvm] Don't use Optional::{hasValue,getValue} (NFC)	2022-06-25 11:24:23 -07:00
Philip Reames	ab736a2750	[BasicTTI] Account for vector of pointers in getMemoryOpCost By using getPrimitiveSizeInBits, we were getting 0 for every pointer type. This code is trying to account for the cost of truncating a store or extending a load to convert from the source vector element type to the legal vector element type. I'd originally seen this as a crash when trying to scalarize a <vscale x 1 x ptr> type coming from the vectorizer. Here's a minimum reproducer to exercise the code in question. void e(int argv[], int p) { for (int i = 0; i < 1024; i++) argv[i] = p; } This was checked in as the splat_ptr test in `2cf320d`. After bbf3fd, this no longer crashes since we correctly return invalid if the extending load/truncating store isn't legal. Differential Revision: https://reviews.llvm.org/D128228	2022-06-25 11:11:58 -07:00
Guillaume Chatelet	1baf1fc276	[NFC] Remove dead code	2022-06-25 17:18:45 +00:00
Corentin Jabot	c92056d038	[Clang][C++23] P2071 Named universal character escapes Implements [[ https://wg21.link/p2071r1 \| P2071 Named Universal Character Escapes ]] - as an extension in all language mode, the patch not warn in c++23 mode will be done later once this paper is plenary approved (in July). We add * A code generator that transforms `UnicodeData.txt` and `NameAliases.txt` to a space efficient data structure that can be queried in `O(NameLength)` * A set of functions in `Unicode.h` to query that data, including * A function to find an exact match of a given Unicode character name * A function to perform a loose (ignoring case, space, underscore, medial hyphen) matching * A function returning the best matching codepoint for a given string per edit distance * Support of `\N{}` escape sequences in String and character Literals, with loose and typos diagnostics/fixits * Support of `\N{}` as UCN with loose matching diagnostics/fixits. Loose matching is considered an error to match closely the semantics of P2071. The generated data contributes to 280kB of data to the binaries. `UnicodeData.txt` and `NameAliases.txt` are not committed to the repository in this patch, and regenerating the data is a manual process. Reviewed By: tahonermann Differential Revision: https://reviews.llvm.org/D123064	2022-06-25 19:03:33 +02:00
Min-Yih Hsu	87805d6a24	[MCA] Hot fix for -Wmismatched-tags errors on mca::SourceMgr Hot fix for -Wmismatched-tags build errors regarding mca::SourceMgr changes introduced in `97579dcc6d`.	2022-06-24 16:14:18 -07:00
Min-Yih Hsu	b847692ed8	[MCA] Allow mca::Instruction-s to be recycled and reused This patch introduces a new feature that allows InstrBuilder to reuse mca::Instruction recycled from IncrementalSourceMgr. This significantly reduces the memory footprint. Note that we're only recycling instructions that have static InstrDesc and no variadic operands. Differential Revision: https://reviews.llvm.org/D127084	2022-06-24 15:39:51 -07:00
Min-Yih Hsu	97579dcc6d	[MCA] Introducing incremental SourceMgr and resumable pipeline The new resumable mca::Pipeline capability introduced in this patch allows users to save the current state of pipeline and resume from the very checkpoint. It is better (but not require) to use with the new IncrementalSourceMgr, where users can add mca::Instruction incrementally rather than having a fixed number of instructions ahead-of-time. Note that we're using unit tests to test these new features. Because integrating them into the `llvm-mca` tool will make too many churns. Differential Revision: https://reviews.llvm.org/D127083	2022-06-24 15:39:51 -07:00
Mingming Liu	e0d069598b	[Inline] Annotate inline pass name with link phase information for analysis. The annotation is flag gated; flag is turned off by default. Differential Revision: https://reviews.llvm.org/D125495	2022-06-24 10:06:43 -07:00
Fazlay Rabbi	42bb88e2aa	[OpenMP] Initial parsing and sema support for 'masked taskloop' construct This patch gives basic parsing and semantic support for "masked taskloop" construct introduced in OpenMP 5.1 (section 2.16.7) Differential Revision: https://reviews.llvm.org/D128478	2022-06-24 10:00:08 -07:00
Arthur Eubanks	e422c0d3b2	[GlobalOpt] Perform store->dominated load forwarding for stored once globals The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway. The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store. Compile time tracker: https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions Reviewed By: nikic, asbirlea, jdoerfert Differential Revision: https://reviews.llvm.org/D128128	2022-06-24 09:09:26 -07:00
Nikita Popov	871197d0a3	[MemoryBuiltins] Accept any value in getInitialValueOfAllocation() (NFC) Drop the requirement that getInitialValueOfAllocation() must be passed an allocator function, shifting the responsibility for checking that into the function (which it does anyway). The motivation is to avoid some calls to isAllocationFn(), which has somewhat ill-defined semantics (given the number of allocator-related attributes we have floating around...) (For this function, all we eventually need is an allockind of zeroed or uninitialized.) Differential Revision: https://reviews.llvm.org/D127274	2022-06-24 16:08:07 +02:00
Joseph Huber	1dcbe03c32	[Binary] Further improve malformed input handling for the OffloadBinary Summary: This patch adds some new sanity checks to make sure that the sizes of the offsets are within the bounds of the file or what is expected by the binary. This also improves the error handling of the version structure to be built into the binary itself so we can change it easier.	2022-06-24 09:57:44 -04:00
Nabeel Omer	0d41794335	[SLP] Add cost model for `llvm.powi.` intrinsics (REAPPLIED) Patch was reverted in `4c5f10a` due to buildbot failures, now being reapplied with updated AArch64 and RISCV tests. This patch adds handling for the llvm.powi. intrinsics in BasicTTIImplBase::getIntrinsicInstrCost() and improves vectorization. Closes #53887. Differential Revision: https://reviews.llvm.org/D128172	2022-06-24 10:23:19 +00:00
Nikita Popov	54eff7da3c	[AA] Export isEscapeSource() API (NFC) Export API that was previously private to BasicAliasAnalysis and will be used in D127202.	2022-06-24 11:59:15 +02:00
Fangrui Song	44ee3efb93	[CodeGen] Simplify isVirtualRegister. NFC	2022-06-23 23:26:02 -07:00
Douglas Yung	f401dd6f43	Revert "Add support for decoding base64." This reverts commit `8b987ca5e3`. This change breaks several Windows bots - https://lab.llvm.org/buildbot/#/builders/123/builds/11371 - https://lab.llvm.org/buildbot/#/builders/117/builds/7685 - https://lab.llvm.org/buildbot/#/builders/42/builds/6077 - https://lab.llvm.org/buildbot/#/builders/216/builds/6340	2022-06-23 21:47:20 -07:00
Kai Luo	6710b21d46	[PowerPC] Allow llvm.ppc.cfence to accept pointer types In the context of atomic load, integer, pointer and float point types are allowed, thus we should allow llvm.ppc.cfence to accept any type mentioned. Fixes https://github.com/llvm/llvm-project/issues/55983. Reviewed By: shchenz, vchuravy Differential Revision: https://reviews.llvm.org/D127554	2022-06-24 10:55:32 +08:00
Greg Clayton	8b987ca5e3	Add support for decoding base64. An upcoming patch to LLDB will require the ability to decode base64. This patch adds support for decoding base64 and adds tests. Differential Revision: https://reviews.llvm.org/D126254	2022-06-23 16:13:19 -07:00
Derek Schuff	5a082d9c1c	[WebAssembly][Object] Remove requirement that objects must have code sections When parsing name and linking sections, we currently require that the object must have a code section (it seems that this was intended to verify section ordering). However it can be useful for binaries to have their code sections stripped out (e.g. if we just want the debug info). In that case we need the rest of the known sections (so e.g. we know how many functions there are, to verify the name section) but not the actual code. I've removed the restriction completely. I think this is OK because the section-parsing code already checks function and global indices in many places for validity and will return appropriate errors if the relevant sections are missing. Also we can't just replace the requirement of seeing a code section with a requirement that we see a function or global section, because a binary may just not have any functions or globals. But there's only an problem if the name or linking section tries to name a nonexistent function. Part of a fix for https://github.com/emscripten-core/emscripten/issues/13084 Differential Revision: https://reviews.llvm.org/D128094	2022-06-23 13:56:17 -07:00
Jin Xin Ng	22f1273357	[ThinLTO][ELF] Add --thinlto-emit-index-files option Allows ThinLTO indices to be written to disk on-the-fly/as-part-of “normal” linker execution. Previously ThinLTO indices could be written via --thinlto-index-only but that would cause the linker to exit early. For MLGO specifically, this enables saving the ThinLTO index files without having to restart the linker to collect data only available at later stages (i.e. output of --save-temps) of the linker's execution. Note, this option does not currently work with: --thinlto-object-suffix-replace, as this is intended to be used to consume minimized IR bitcode files while --thinlto-emit-index-files is intended to be run together with InProcessThinLTO (which cannot parse minimized IR). --thinlto-prefix-replace support is left unimplemented but can be implemented if needed Differential Revision: https://reviews.llvm.org/D127777	2022-06-23 12:35:42 -07:00
Med Ismail Bennani	148071fbae	[llvm] Update module map to include the `IR/ConstantFold` header This should fix the build failure occuring when enabling modules (LLVM_ENABLE_MODULES=On): https://green.lab.llvm.org/green/job/lldb-cmake/44785/ Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>	2022-06-23 11:52:25 -07:00
Philip Reames	0c1326748f	[BasicTTI] Avoid crash when costing scalable select expansion If the target has chosen to expand a scalable vector type, BasicTTI tries to scalarize and we'd crash. As a minimum, we should return an invalid cost instead. The added test provide coverage for the moment, but given they show a number of gaps in RISCV costing, they're likely not to cover this code path long term.	2022-06-23 09:14:57 -07:00
Baptiste Saleil	79e77a9f39	[AMDGPU] Flush the vmcnt counter in loop preheaders when necessary waitcnt vmcnt instructions are currently generated in loop bodies before using values loaded outside of the loop. In some cases, it is better to flush the vmcnt counter in a loop preheader before entering the loop body. This patch detects these cases and generates waitcnt instructions to flush the counter. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D115747	2022-06-23 10:53:21 -04:00
Nikita Popov	da34966a5a	[llvm-c] Add LLVMGetAggregateElement() function This adds LLVMGetAggregateElement() as a wrapper for Constant::getAggregateElement(), which allows fetching a struct/array/vector element without handling different possible underlying representations. As the changed echo test shows, previously you for example had to treat ConstantArray (use LLVMGetOperand) and ConstantDataArray (use LLVMGetElementAsConstant) separately, not to mention all the other possible representations (like PoisonValue). I've deprecated LLVMGetElementAsConstant() in favor of the new function, which is strictly more powerful (but I could be convinced to drop the deprecation). This is partly motivated by https://reviews.llvm.org/D125795, which drops LLVMConstExtractValue() because the underlying constant expression no longer exists. This function could previously be used as a poor man's getAggregateElement(). Differential Revision: https://reviews.llvm.org/D128417	2022-06-23 14:50:54 +02:00
Nikita Popov	20b5f0c641	[IR] Export ConstantFold.h header (NFC) This is in preparation for https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. As part of that change, we'll want to invoke some of these constant folding APIs explicitly, as it won't happen as part of ConstantExpr::getXYZ() anymore. Ideally, we'd merge these with the DL-aware constant folding APIs and only call those, but this is not easily possible for some current usages (most important IRBuilder, which uses DL-independent constant folding by default, and some major layering changes would be needed to change that). This is basically a reboot of D115035 with different motivation. Differential Revision: https://reviews.llvm.org/D128213	2022-06-23 11:32:14 +02:00
wangpc	634484885c	[TableGen] Add new operator !exists We can cast a string to a record via !cast, but we have no mechanism to check if it is valid and TableGen will raise an error if failed to cast. Besides, we have no semantic null in TableGen (we have `?` but different backends handle uninitialized value differently), so operator like `dyn_cast<>` is hard to implement. In this patch, we add a new operator `!exists<T>(s)` to check whether a record with type `T` and name `s` exists. Self-references are allowed just like `!cast`. By doing these, we can write code like: ``` class dyn_cast_to_record<string name> { R value = !if(!exists<R>(name), !cast<R>(name), default_value); } defvar v = dyn_cast_to_record<"R0">.value; // R0 or default_value. ``` Reviewed By: tra, nhaehnle Differential Revision: https://reviews.llvm.org/D127948	2022-06-23 11:11:47 +08:00
Mingming Liu	bc856eb3fc	[SampleProfile][Inline] Annotate sample profile inline remarks with link phase (prelink/postlink) information. Differential Revision: https://reviews.llvm.org/D126833	2022-06-22 17:00:53 -07:00
Florian Mayer	9320a32bb9	[MTE] [HWASan] Use LoopInfo for reachability queries. The reachability queries default to "reachable" after exploring too many basic blocks. LoopInfo helps it skip over the whole loop. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D127917	2022-06-22 15:28:49 -07:00
Evgenii Stepanov	5011b4ca0e	Revert "[Attributor] Ensure to use the proper liveness AA" Reason: memory leaks This reverts commit `083010312a`.	2022-06-22 13:40:45 -07:00
Guillaume Gomez	d0a4450ecd	Rename GCCBuiltin into ClangBuiltin This patch is needed because developers expect "GCCBuiltin" items to be the GCC intrinsics equivalent and not the Clang internals. Reviewed By: #libc_abi, RKSimon, xbolva00 Differential Revision: https://reviews.llvm.org/D127460	2022-06-22 19:49:20 +01:00
Mingming Liu	67dc8021a1	[Support] Change TrackingStatistic and NoopStatistic to use uint64_t instead of unsigned. Binary size of `clang` is trivial; namely, numerical value doesn't change when measured in MiB, and `.data` section increases from 139Ki to 173 Ki. Differential Revision: https://reviews.llvm.org/D128070	2022-06-22 10:11:40 -07:00
Daniel Thornburgh	8bd078b57c	[Symbolize] Parse multi-line markup elements. This allows registering certain tags as possibly beginning multi-line elements in the symbolizer markup parser. The parser is kept agnostic to how lines are delimited; it reports the entire contents, including line endings, once the end of element marker is reached. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D124798	2022-06-22 10:00:43 -07:00
serge-sans-paille	27fd01d3f8	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fb67d683db` detected a few regressions, fixing them. The impact on preprocessed output is negligible: -4k lines.	2022-06-22 18:50:39 +02:00
Guillaume Chatelet	57ffff6db0	Revert "[NFC] Remove dead code" This reverts commit `8ba2cbff70`.	2022-06-22 14:55:47 +00:00
Guillaume Chatelet	8ba2cbff70	[NFC] Remove dead code	2022-06-22 13:33:58 +00:00
Guillaume Chatelet	9803db8c18	[NFC] Remove dead code	2022-06-22 13:13:01 +00:00
David Sherwood	aa0a413df8	[AArch64][SME] Add some SME PSTATE setting/query intrinsics This patch adds support for: * Querying the PSTATE.SM state with @llvm.aarch64.sme.get.pstatesm * Reading/writing the TPIDR2 register with new @llvm.aarch64.sme.get.tpidr2 and @llvm.aarch64.sme.set.tpidr2 intrinsics. Tests added here: CodeGen/AArch64/sme-get-pstatesm.ll CodeGen/AArch64/sme-read-write-tpidr2.ll Differential Revision: https://reviews.llvm.org/D127957	2022-06-22 10:26:45 +01:00
Pavel Samolysov	f44bf3805a	[DeadArgElim] Reformat the pass in accordance with the code style The code has been reformatted in accordance with the code style. Some function comments were extended to the Doxygen ones and reworded a bit to eliminate the duplication of the function's/class' name in the comment. Differential Revision: https://reviews.llvm.org/D128168	2022-06-22 09:13:00 +03:00
Johannes Doerfert	083010312a	[Attributor] Ensure to use the proper liveness AA When determining liveness via Attributor::isAssumedDead(...) we might end up without a liveness AA or with one pointing into another function. Neither is helpful and we will avoid both from now on. Reapplied after fixing the ASAN error which caused the revert: `db68a25ca9`	2022-06-21 21:28:26 -05:00
Vasileios Porpodas	7a9ad25769	Recommit "[SLP][X86] Improve reordering to consider alternate instruction bundles" This reverts commit `6d6268dcbf`. Review: https://reviews.llvm.org/D125712	2022-06-21 18:35:29 -07:00
Vasileios Porpodas	6d6268dcbf	Revert "[SLP][X86] Improve reordering to consider alternate instruction bundles" This reverts commit `6f88acf410`.	2022-06-21 17:07:21 -07:00
Vasileios Porpodas	6f88acf410	[SLP][X86] Improve reordering to consider alternate instruction bundles During the reordering transformation we should try to avoid reordering bundles like fadd,fsub because this may block them being matched into a single vector instruction in x86. We do this by checking if a TreeEntry is such a pattern and adding it to the list of TreeEntries with orders that need to be considered. Differential Revision: https://reviews.llvm.org/D125712	2022-06-21 16:44:48 -07:00
Anubhab Ghosh	79fbee3cc5	Re-apply "[JITLink][Orc] Add MemoryMapper interface with InProcess implementation" [JITLink][Orc] Add MemoryMapper interface with InProcess implementation MemoryMapper class takes care of cross-process and in-process address space reservation, mapping, transferring content and applying protections. Implementations of this class can support different ways to do this such as using shared memory, transferring memory contents over EPC or just mapping memory in the same process (InProcessMemoryMapper). The original patch landed with commit `6ede652050` It was reverted temporarily in commit `6a4056ab2a` Reviewed By: sgraenitz, lhames Differential Revision: https://reviews.llvm.org/D127491	2022-06-21 23:53:16 +02:00
Simon Pilgrim	8cecb6be56	[DAG] Remove SelectionDAG::GetDemandedBits DemandedElts variant. NFC. We're slowly removing SelectionDAG::GetDemandedBits and replacing it with SimplifyMultipleUseDemandedBits, we no longer have any uses for the vector demanded elt variant.	2022-06-21 21:23:10 +01:00
Daniel Bertalan	77b6efbd82	[ADT] [lld-macho] Check for end iterator deref in filter_iterator_base If ld64.lld was supplied an object file that had a `__debug_abbrev` or `__debug_str` section, but didn't have any compile unit DIEs in `__debug_info`, it would dereference an iterator pointing to the empty array of DIEs. This underlying issue started causing segmentation faults when parsing for `__debug_info` was addded in D128184. That commit was reverted, and this one fixes the invalid dereference to allow relanding it. This commit adds an assertion to `filter_iterator_base`'s dereference operators to catch bugs like this one. Ran check-llvm, check-clang and check-lld. Differential Revision: https://reviews.llvm.org/D128294	2022-06-21 15:47:45 -04:00
Martin Sebor	b19194c032	[InstCombine] handle subobjects of constant aggregates Remove the known limitation of the library function call folders to only work with top-level arrays of characters (as per the TODO comment in the code) and allows them to also fold calls involving subobjects of constant aggregates such as member arrays.	2022-06-21 11:55:14 -06:00
Nabeel Omer	4c5f10aeeb	Revert rGe6ccb57bb3f6b761f2310e97fd6ca99eff42f73e "[SLP] Add cost model for `llvm.powi.*` intrinsics" This reverts commit `e6ccb57bb3`.	2022-06-21 15:05:55 +00:00
Nabeel Omer	e6ccb57bb3	[SLP] Add cost model for `llvm.powi.` intrinsics This patch adds handling for the llvm.powi. intrinsics in BasicTTIImplBase::getIntrinsicInstrCost() and improves vectorization. Closes #53887. Differential Revision: https://reviews.llvm.org/D128172	2022-06-21 14:40:34 +00:00
Jan Svoboda	a44c6453fe	[llvm][vfs] Implement in-memory symlinks This patch implements symlinks for the in-memory VFS. Original author: @erik.pilkington. Depends on D117648 & D117649. Reviewed By: sammccall Differential Revision: https://reviews.llvm.org/D117650	2022-06-21 16:29:54 +02:00
Jan Svoboda	b439a08dfc	[llvm][vfs] NFC: Promote `InMemoryDirIterator` to nested class	2022-06-21 16:29:54 +02:00
Jan Svoboda	9e0398da8d	[llvm][vfs] NFC: Promote `lookupInMemoryNode()` to member function	2022-06-21 16:29:53 +02:00
Jan Svoboda	1ff5330ea3	[llvm][vfs] NFC: Rename `InMemoryFileSystem::addHardLink()` arguments	2022-06-21 16:29:53 +02:00
Florian Hahn	4ea6891f95	[ConstraintElimination] Remove unneeded StackEntry::Condition (NFC). The field was only used for debug printing. Print constraint from the system instead.	2022-06-21 15:57:29 +02:00
Nico Weber	6a4056ab2a	Revert "[JITLink][Orc] Add MemoryMapper interface with InProcess implementation" This reverts commit `6ede652050`. Doesn't build on Windows, see https://reviews.llvm.org/D127491#3598773	2022-06-21 09:56:49 -04:00
Anubhab Ghosh	6ede652050	[JITLink][Orc] Add MemoryMapper interface with InProcess implementation MemoryMapper class takes care of cross-process and in-process address space reservation, mapping, transferring content and applying protections. Implementations of this class can support different ways to do this such as using shared memory, transferring memory contents over EPC or just mapping memory in the same process (InProcessMemoryMapper). Reviewed By: sgraenitz, lhames Differential Revision: https://reviews.llvm.org/D127491	2022-06-21 13:44:17 +02:00
Markus Lavin	3815ae29b5	[machinesink] fix debug invariance issue Do not include debug instructions when comparing block sizes with thresholds. Differential Revision: https://reviews.llvm.org/D127208	2022-06-21 08:13:09 +02:00
Kazu Hirata	7a47ee51a1	[llvm] Don't use Optional::getValue (NFC)	2022-06-20 22:45:45 -07:00
Kazu Hirata	d66cbc565a	Don't use Optional::hasValue (NFC)	2022-06-20 20:26:05 -07:00
Kazu Hirata	0916d96d12	Don't use Optional::hasValue (NFC)	2022-06-20 20:17:57 -07:00
Kazu Hirata	064a08cd95	Don't use Optional::hasValue (NFC)	2022-06-20 20:05:16 -07:00
Philip Reames	bbf3fd4af1	[BasicTTI] Return Invalid for scalable vectors reaching getScalarizationOverhead If we would scalarize a fixed vector, we know we can't do so for a scalable one. However, there's no need to crash, we can instead simply return a invalid cost which will work its way through the computation (since invalid is sticky), and the client should bail out. Sorry for the lack of test here. The particular codepath I saw this reached on was the result of another bug.	2022-06-20 13:19:11 -07:00
Kazu Hirata	5413bf1bac	Don't use Optional::hasValue (NFC)	2022-06-20 11:33:56 -07:00
David Green	c0ecbfa4fd	[AArch64] Known bits for AArch64ISD::DUP An AArch64ISD::DUP is just a splat, where the known bits for each lane are the same as the input. This teaches that to computeKnownBitsForTargetNode. Problems arise for constants though, as a constant BUILD_VECTOR can be lowered to an AArch64ISD::DUP, which SimplifyDemandedBits would then turn back into a constant BUILD_VECTOR leading to an infinite cycle. This has been prevented by adding a isTargetCanonicalConstantNode node to prevent the conversion back into a BUILD_VECTOR. Differential Revision: https://reviews.llvm.org/D128144	2022-06-20 19:11:57 +01:00
Philip Reames	db85345f2d	[BasicTTI] Allow generic handling of scalable vector fshr/fshl This change removes an explicit scalable vector bailout for fshl and fshr. This bailout was added in `60e4698b9a`, when sinking a unconditional bailout for all intrinsics into selected cases. Its not clear if the bailout was originally unneeded, or if our cost model infrastructure has simply matured in the meantime. Either way, the generic code appears to handle scalable vectors without issue. Note that the RISC-V cost model changes here aren't particularly interesting. They do probably better match the current lowering, but the main point is to have coverage of the BasicTTI path and simply show lack of crashing. AArch64 costing was changed to preserve legacy behavior. There will most likely be an upcoming change to use the generic costs there too, but I didn't want to make that change not being particularly familiar with the target. Differential Revision: https://reviews.llvm.org/D127680	2022-06-20 10:38:51 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
David Candler	d3919a8cc5	[ConstantFolding] Respect denormal handling mode attributes when folding instructions Depending on the environment, a floating point instruction should treat denormal inputs as zero, and/or flush a denormal output to zero. Denormals are not currently accounted for when an instruction gets folded to a constant, which can lead to differences in output between a folded and a unfolded instruction when running on the target. The denormal handling mode can be set by the function level attribute denormal-fp-math, which this patch uses to determine whether any denormal inputs to or outputs from folding should be zero, and that the sign is set appropriately. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D116952	2022-06-20 16:41:46 +01:00
Fraser Cormack	398834f45b	Update usage comments in Printable.h. NFC. The example wouldn't compile, and used an invalid case style for a function. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D128176	2022-06-20 16:18:10 +01:00
Guillaume Chatelet	d3cf49e984	[Alignment] Remove alignTo version taking a MaybeAlign	2022-06-20 15:15:53 +00:00
Jan Svoboda	192a3b33f9	[support][ci] Fix modular build on GreenDragon This is to fix the following error on https://green.lab.llvm.org/green/job/clang-stage2-Rthinlto: BranchProbability.h:236:34: error: declaration of 'distance' must be imported from module 'std.iterator.__iterator.distance' before it is required	2022-06-20 16:56:20 +02:00
Guillaume Chatelet	7dbf8cfeb7	[NFC] Implement alignTo with skew in terms of alignTo	2022-06-20 14:10:14 +00:00
David Sherwood	013358632e	[AArch64][SME] Add the zero intrinsic The SME zero instruction takes a mask as an input declaring which 64-bit element tiles should be zeroed. There is a 1:1 mapping between the zero intrinsic and the instruction, however we also want to make the register allocator aware that some tile registers are being written to. We can actually just use the custom inserter for a pseudo instruction to correctly mark all the appropriate registers in the mask as implicitly defined by the operation. Differential Revision: https://reviews.llvm.org/D127843	2022-06-20 14:27:59 +01:00
Guillaume Chatelet	03036061c7	[Alignment] Use 'previous()' method instead of scalar division This is in preparation of integration with D128052. Differential Revision: https://reviews.llvm.org/D128169	2022-06-20 11:01:43 +00:00
Guillaume Chatelet	01cfc8a05a	[NFC][Alignment] Remove dead code	2022-06-20 09:47:18 +00:00
Guillaume Chatelet	f1255186c7	[NFC][Alignment] Remove max functions between Align and MaybeAlign `llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are not used often enough to be required. They also make the code more opaque. Differential Revision: https://reviews.llvm.org/D128121	2022-06-20 08:37:48 +00:00
Guillaume Chatelet	009fe0755e	[Alignment] Remove multiply by MaybeAlign	2022-06-20 08:37:15 +00:00
Chuanqi Xu	7782e080e8	[Coroutines] Only do symmetric transfer if optimization is on Symmetric transfer is not a part of C++ standards. So the vendors is not forced to implement it any way. Given the symmetric transfer nowadays is an optimization. It makes more sense to enable it only if the optimization is enabled. It is also helpful for the compilation speed in O0.	2022-06-20 16:20:36 +08:00
Kazu Hirata	c7987d4948	[ADT] Use value instead of getValue() (NFC) Since Optional<clang::FileEntryRef> uses a custom storage class, this patch adds value to MapEntryOptionalStorage.	2022-06-19 18:34:33 -07:00
Kazu Hirata	813f487228	[ADT] Use has_value (NFC) This patch switches to has_value within Optional. Since Optional<clang::FileEntryRef> uses custom storage class, this patch adds has_entry to MapEntryOptionalStorage.	2022-06-19 18:10:13 -07:00
Nico Weber	7effcbda49	Rename parallelForEachN to just parallelFor Patch created by running: rg -l parallelForEachN \| xargs sed -i '' -c 's/parallelForEachN/parallelFor/' No behavior change. Differential Revision: https://reviews.llvm.org/D128140	2022-06-19 17:49:00 -04:00
Kazu Hirata	5d7e63fb4f	[ADT] Rename value to alt (NFC) This patch renames value to alt so that the parameter won't collide with member function value().	2022-06-19 12:00:03 -07:00
Simon Pilgrim	ba3f2667b6	[DAG] Add MaskedVectorIsZero helper Equivalent to MaskedValueIsZero, except its checking if all of the demanded vectors elements are known to be zero	2022-06-19 17:56:30 +01:00
Kazu Hirata	129b531c9c	[llvm] Use value_or instead of getValueOr (NFC)	2022-06-18 23:07:11 -07:00
Kazu Hirata	3c49576417	[ADT] Add has_value, value, value_or to llvm::Optional This patch adds has_value, value, value_or to llvm::Optional so that llvm::Optional looks more like std::optional. I will keep the existing functions while migrating their callers and then remove them later. Differential Revision: https://reviews.llvm.org/D128131	2022-06-18 21:21:33 -07:00
Kazu Hirata	556bcc7821	[ADT] Rename value to val (NFC) I'd like to introduce functions, such as value, value_or, has_value, etc to make llvm::Optional look more like std::optional. Renaming value to val avoids name conflicts. Differential Revision: https://reviews.llvm.org/D128125	2022-06-18 20:19:18 -07:00
Kazu Hirata	4271a1ff33	[llvm] Call *set::insert without checking membership first (NFC)	2022-06-18 10:17:22 -07:00
Simon Pilgrim	37185ceac9	[Object] Make IsLittleEndian check constexpr to silence static analyzer dead code warnings. The "ELFT::TargetEndianness == support::little" check is known at compile time	2022-06-18 17:35:54 +01:00
Guillaume Chatelet	17e68156f6	[NFC][Alignment] Remove dead code	2022-06-18 15:00:55 +00:00
Kazu Hirata	621f58e716	[Target, CodeGen] Use isImm(), isReg(), etc (NFC)	2022-06-18 07:41:04 -07:00
Simon Pilgrim	3ea1422362	[CodeGen] Add back setOperationAction/setLoadExtAction/setLibcallName single opcode variants The work to add ArrayRef helpers (D122557, D123467 etc.) to the TargetLowering::set* methods resulted in all the single opcode calls to these methods being cast to single element ArrayRef on the fly - resulting in a scary >5x increase in build time (identified with vcperf) on MSVC release builds of most of the TargetLowering/ISelLowering files. This patch adds the back the single opcode variants to various set*Action calls to avoid this issue for now, and updates the ArrayRef helpers to wrap them - I'm still investigating whether the single element ArrayRef build times can be improved.	2022-06-18 13:02:05 +01:00
Chris Bieneman	3adc908b26	[DirectX][MC] Add MC support for DXContainer DXContainer files resemble traditional object files in that they are comprised of parts which resemble sections. Adding DXContainer as an object file format in the MC layer will allow emitting DXContainer objects through the normal object emission pipeline. Differential Revision: https://reviews.llvm.org/D127165	2022-06-17 21:19:32 -05:00
Florian Hahn	e9cced2739	Recommit "[LAA] Initial support for runtime checks with pointer selects." This reverts commit `7aa8a67882`. This version includes fixes to address issues uncovered after the commit landed and discussed at D11448. Those include: * Limit select-traversal to selects inside the loop. * Freeze pointers resulting from looking through selects to avoid branch-on-poison.	2022-06-17 21:06:26 +02:00
Daniel Thornburgh	2040b6df0a	[Symbolize] Parser for log symbolizer markup. This adds a parser for the log symbolizer markup format discussed in https://discourse.llvm.org/t/rfc-log-symbolizer/61282. The parser operates in a line-by-line fashion with minimal memory requirements. This doesn't yet include support for multi-line tags or specific parsing for ANSI X3.64 SGR control sequences, but it can be extended to do so. The latter can also be relatively easily handled by examining the resulting text elements. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D124686	2022-06-17 10:26:24 -07:00
Joe Nash	75378d432f	[AMDGPU] NFC. Change comment format on gfx11 interp and ldsdir intrinsics	2022-06-17 12:28:26 -04:00
Guillaume Chatelet	90f96ec7a5	[NFC][Alignment] Remove assumeAligned from MachineFrameInfo ctor	2022-06-17 15:21:17 +00:00
Joe Nash	20d20156f4	[AMDGPU] gfx11 VINTERP intrinsics and ISel support Depends on D127664 Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127756	2022-06-17 09:16:59 -04:00
Joe Nash	6d5d8b1313	[AMDGPU] gfx11 ldsdir intrinsics and ISel Reviewed By: #amdgpu, rampitec Differential Revision: https://reviews.llvm.org/D127664	2022-06-17 09:03:16 -04:00
lorenzo chelini	84519bc5f7	[LLVM][IR] Fix typo in DerivedTypes.h (NFC)	2022-06-17 12:38:23 +02:00
Jennifer Yu	bb83f8e70b	[OpenMP] Initial parsing and sema for 'parallel masked' construct Differential Revision: https://reviews.llvm.org/D127454	2022-06-16 18:01:15 -07:00
David Blaikie	61fac2c370	Incomplete attempt to pull DWARFTypePrinter into its own file for reuse from lldb	2022-06-16 22:28:28 +00:00
Mitch Phillips	ed5a349b89	Make setSanitizerMetadata byval. This fixes a UaF bug in llvm::GlobalObject::copyAttributesFrom, where a sanitizer metadata object is captured by reference, and passed by reference to llvm::GlobalValue::setSanitizerMetadata. The reference comes from the same map that the new value is going to be inserted to, and the map insertion triggers iterator invalidation - leading to a use-after-free on the dangling reference. This patch fixes that bug by making setSanitizerMetadata's argument byval. This should also systematically prevent the problem from happening in future, as it's a very easy pattern to have. This shouldn't be any performance problem, the SanitizerMetadata struct is a bitfield POD.	2022-06-16 14:47:27 -07:00
Congzhe Cao	4c77d0276b	[Delinearization] Refactoring of fixed-size array delinearization This is a follow-up patch to D122857 where we added delinearization of fixed-size arrays to loop cache analysis, which resulted in some duplicate code, i.e., "tryDelinearizeFixedSize()", in LoopCacheCost.cpp and DependenceAnalysis.cpp. Refactoring is done in this patch. This patch refactors out the main logic of "tryDelinearizeFixedSize()" as "tryDelinearizeFixedSizeImpl()" and moves it to Delinearization.cpp, such that clients can reuse "llvm::tryDelinearizeFixedSizeImpl()" wherever they would like to delinearize fixed-size arrays. Currently it has two users, i.e., DependenceAnalysis.cpp and LoopCacheCost.cpp. Reviewed By: Meinersbur, #loopoptwg Differential Revision: https://reviews.llvm.org/D124745	2022-06-16 16:03:41 -04:00
Joe Nash	2d43de13df	[AMDGPU] gfx11 new dot instruction codegen support Reviewed By: rampitec, #amdgpu Differential Revision: https://reviews.llvm.org/D127904	2022-06-16 14:19:34 -04:00
Jay Foad	36ec1fcaac	[AMDGPU] Add GFX11 llvm.amdgcn.ds.add.gs.reg.rtn / llvm.amdgcn.ds.sub.gs.reg.rtn intrinsics Differential Revision: https://reviews.llvm.org/D127955	2022-06-16 18:23:14 +01:00
Jay Foad	c155a944fb	[AMDGPU] GFX11 CodeGen support for MIMG instructions This includes: - New llvm.amdgcn.image.msaa.load.* intrinsics - NSA changes, because MIMG-NSA is now limited to 3 dwords - Split CD forms of IMAGE_SAMPLE instructions out into separate test files since they are no longer supported in GFX11 Differential Revision: https://reviews.llvm.org/D127837	2022-06-16 18:23:14 +01:00
Jay Foad	445a483b41	[AMDGPU] Add new GFX11 intrinsic llvm.amdgcn.exp.row Differential Revision: https://reviews.llvm.org/D127671	2022-06-16 18:23:14 +01:00
Mircea Trofin	7f24e574d4	[MLInliner] Don't inline call sites in unreachable basic blocks This requires DominatorTree be updated, which we do in the ml inliner case, but not in the default case, and the cost of doing so is noticeable to compile time for the latter[1]. So the patch only affects the ML inliner. [1] https://llvm-compile-time-tracker.com/compare.php?from=9fc0aa45e3312944431ba7e1ca0cec99c613992b&to=7af461b1ce0d9138211ef5f883f35d5b9ddf47be&stat=wall-time Differential Revision: https://reviews.llvm.org/D127899	2022-06-16 09:14:22 -07:00
Corentin Jabot	b62e3a73e1	Replace to_hexString by touhexstr [NFC] LLVM had 2 methods to convert a number to an hexa string, this remove one of them. Differential Revision: https://reviews.llvm.org/D127958	2022-06-16 17:29:50 +02:00
David Sherwood	6f6fa5aa10	[AArch64][SME] Add SME cntsb/h/w/d intrinsics These intrinsics return the number of elements in a streaming vector, for example aarch64.sme.cntsw returns the number of 32-bit elements. When in streaming mode these are equivalent to aarch64.sve.cntb/h/w/d with an input value of 1. I have implemented these intrinsics using the rdsvl instruction and added tests here: CodeGen/AArch64/SME/sme-intrinsics-rdsvl.ll Differential Revision: https://reviews.llvm.org/D127853	2022-06-16 10:50:25 +01:00
Sunho Kim	f3e7e4d786	[JITLink][AArch64][NFC] Suppress unused variable error. Suppress unused variable error when assertion got disabled. Reviewed By: chapuni Differential Revision: https://reviews.llvm.org/D127940	2022-06-16 15:30:04 +09:00
Craig Topper	3aa6ec619f	[ValueTypes] Add types for nxv16bf16 and nxv32bf16. This is needed by our downstream and makes bf16 and f16 have the same set of scalable vector types. Reviewed By: rui.zhang Differential Revision: https://reviews.llvm.org/D127877	2022-06-15 23:00:53 -07:00
Jin Xin Ng	aaff3fb6d5	[mlgo] Fix accounting for SCC splits Previously if the inliner split an SCC such that an empty one remained, the MLInlineAdvisor could potentially lose track of the EdgeCount if a subsequent CGSCC pass modified the calls of a function that was initially in the SCC pre-split. Saving the seen nodes in onPassEntry resolves this. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D127693	2022-06-15 10:53:23 -07:00
Joseph Huber	601ec17d54	[Binary] Add iterator to the OffloadBinary string maps The offload binary contains internally a string map of all the key and value pairs identified in the binary itself. Normally users query these values from the `getString` function, but this makes it difficult to identify which strings are availible. This patch adds a simple const iterator range to the offload binary allowing users to iterate through the strings. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D127774	2022-06-15 12:24:26 -04:00
Guillaume Chatelet	412c788ab0	[NFC][Alignment] Use Align in MCAlignFragment	2022-06-15 12:31:00 +00:00
Benjamin Kramer	8bc0bb9564	Add a conversion from double to bf16 This introduces a new compiler-rt function `__truncdfbf2`.	2022-06-15 12:56:31 +02:00
Benjamin Kramer	fb34d531af	Promote bf16 to f32 when the target doesn't support it This is modeled after the half-precision fp support. Two new nodes are introduced for casting from and to bf16. Since casting from bf16 is a simple operation I opted to always directly lower it to integer arithmetic. The other way round is more complicated if you want to preserve IEEE semantics, so it's handled by a new __truncsfbf2 compiler-rt builtin. This is of course very bare bones, but sufficient to get a semi-softened fadd on x86. Possible future improvements: - Targets with bf16 conversion instructions can now make fp_to_bf16 legal - The software conversion to bf16 can be replaced by a trivial implementation under fast math. Differential Revision: https://reviews.llvm.org/D126953	2022-06-15 12:56:31 +02:00
David Sherwood	5fa2416ea0	[AArch64][SME] Add SME read/write intrinsics that map to the mova instruction This patch adds implementations for the read/write SME ACLE intrinsics: @llvm.aarch64.sme.read.horiz @llvm.aarch64.sme.read.vert @llvm.aarch64.sme.write.horiz @llvm.aarch64.sme.write.vert These all map to the SME mova instruction. Differential Revision: https://reviews.llvm.org/D127414	2022-06-15 10:31:07 +01:00
Austin Kerbow	48ebc1af29	[AMDGPU] Add more expressive sched_barrier controls The sched_barrier builtin allow the scheduler's behavior to be shaped by users when very specific codegen is needed in order to create highly optimized code. This patch adds more granular control over the types of instructions that are allowed to be reordered with respect to one or multiple sched_barriers. A mask is used to specify groups of instructions that should be allowed to be scheduled around a sched_barrier. The details about this mask may be used can be found in llvm/include/llvm/IR/IntrinsicsAMDGPU.td. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D127123	2022-06-14 22:03:05 -07:00
Fangrui Song	94d1692aa1	[MC] Remove unused MCStreamer::SwitchSection switchSection should be used instead.	2022-06-14 21:25:56 -07:00
Mircea Trofin	22a1f998f7	FunctionPropertiesAnalysis: handle callsite BBs that lose edges There could be successors that were reached before but now are only reachable from elsewhere in the CFG. Suppose the following diamond CFG (lines are arrows pointing down): A / \ B C \ / D There's a call site in C that is inlined. Upon doing that, it turns out it expands to: call void @llvm.trap() unreachable D isn't reachable from C anymore, but we did discount it when we set up FunctionPropertiesUpdater, so we need to re-include it here. The patch also updates loop accounting to use LoopInfo rather than traverse BBs. Differential Revision: https://reviews.llvm.org/D127353	2022-06-14 15:19:44 -07:00
Venkata Ramanaiah Nalamothu	340b0ca900	[llvm] Add DW_CC_nocall to function debug metadata when either return values or arguments are removed Adding the `DW_CC_nocall` calling convention to the function debug metadata is needed when either the return values or the arguments of a function are removed as this helps in informing debugger that it may not be safe to call this function or try to interpret the return value. This translates to setting `DW_AT_calling_convention` with `DW_CC_nocall` for appropriate DWARF DIEs. The DWARF5 spec (section 3.3.1.1 Calling Convention Information) says: If the `DW_AT_calling_convention` attribute is not present, or its value is the constant `DW_CC_normal`, then the subroutine may be safely called by obeying the `standard` calling conventions of the target architecture. If the value of the calling convention attribute is the constant `DW_CC_nocall`, the subroutine does not obey standard calling conventions, and it may not be safe for the debugger to call this subroutine. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D127134	2022-06-15 03:30:15 +05:30
Luboš Luňák	4d27c154a5	remove a duplicated include	2022-06-14 18:55:26 +02:00
Jin Xin Ng	9f2b873a7d	[inliner] Add per-SCC-pass InlineAdvisor printing option Adds option to print the contents of the Inline Advisor after each SCC Inliner pass Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D127689	2022-06-14 08:06:52 -07:00
David Sherwood	bd61664167	[AArch64][SME] Add ldr/str (fill/spill) intrinsics This patch adds implementations for the fill/spill SME ACLE intrinsics: @llvm.aarch64.sme.ldr @llvm.aarch64.sme.str Differential Revision: https://reviews.llvm.org/D127317	2022-06-14 13:58:22 +01:00
Guillaume Chatelet	b4cf74dc9e	[NFC] Remove dead code	2022-06-14 10:56:37 +00:00
Guillaume Chatelet	6725d80640	[NFC][Alignment] Use Align in shouldAlignPointerArgs	2022-06-14 10:56:36 +00:00
Rosie Sumpter	2c4e44752d	[AArch64][SME] Add load/store intrinsics This patch adds implementations for the load/store SME ACLE intrinsics: - @llvm.aarch64.sme.ld1* - @llvm.aarch64.sme.st1* Differential Revision: https://reviews.llvm.org/D127210	2022-06-14 11:11:22 +01:00
Chuanqi Xu	735e6c40b5	[Coroutines] Convert coroutine.presplit to enum attr This is required by @nikic in https://reviews.llvm.org/D127383 to decrease the cost to check whether a function is a coroutine and this fixes a FIXME too. Reviewed By: rjmccall, ezhulenev Differential Revision: https://reviews.llvm.org/D127471	2022-06-14 14:23:46 +08:00
Kazu Hirata	a2232da2a5	[CodeGen] Remove addSEHCatchHandler and addSEHCleanupHandler (NFC) The last uses of these functions are removed on Oct 9, 2015 in commit `14e773500e`.	2022-06-13 23:08:49 -07:00
Kazu Hirata	34ff78c5cf	[CodeGen] Remove restrictRef (NFC) The last use was removed on Apr 14, 2017 in commit `4fe9d6c640`.	2022-06-13 23:08:48 -07:00
Sunho Kim	398df667d6	[JITLink][AArch64] Implement MoveWide16 generic edge. Implements MoveWide16 generic edge kind that can be used to patch MOVZ/MOVK (imm16) instructions. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D127584	2022-06-14 13:51:47 +09:00
Sunho Kim	6cc3450a52	[JITLink][AArch64] Lift fixup functions from aarch64.cpp to aarch64.h. (NFC) Lift fixup functions from aarch64.cpp to aarch64.h so that they have better chance of getting inlined. Also, adds some comments documenting the purpose of functions. Reviewed By: sgraenitz Differential Revision: https://reviews.llvm.org/D127559	2022-06-14 13:34:00 +09:00
Sunho Kim	db37225803	[JITLink][AArch64] Unify table managers of ELF and MachO. Unifies GOT/PLT table managers of ELF and MachO on aarch64 architecture. Additionally, it migrates table managers from PerGraphGOTAndPLTStubsBuilder to generic crtp TableManager. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D127558	2022-06-14 13:16:03 +09:00
Fangrui Song	bf0bac43ff	[CodeGen] Initialize ISD after `800d222e53` In the Intrinsic::fptosi_sat branch, ISD was uninitialized when Tys.empty().	2022-06-13 19:52:21 -07:00
Philip Reames	800d222e53	[BasicTTI] Remove unused support for multiple opcodes in getTypeBasedIntrinsicInstrCost [nfc] ISDs only ever contains a single ISD opcode. We can simplify the code under this assumption. The code being removed was added back in 2016 in `0f26b0aeb4` to support FMAXNAN/FMINNAN, but at some point since then the motivating case was rewritten not to use the ISDs mechanism. No reason to keep the false generality around now.	2022-06-13 18:23:39 -07:00
Lang Hames	14b7c108a2	[C-API][ORC] Add C API to suspend lookups during definition generation. Slow definition generators may suspend lookups to temporarily release the session lock, allowing unrelated lookups to proceed. Using this functionality is discouraged: it is best to make definition generation fast, rather than suspending the lookup. As a last resort where this is not possible, suspension may be used.	2022-06-13 17:20:07 -07:00
Kazu Hirata	145cc9db2b	[CodeGen] Remove futureWeight (NFC) The last use was removed on Jun 5, 2022 in commit `5c06f7168f`, which itself was a patch to remove unused code.	2022-06-13 17:10:23 -07:00
Lang Hames	803c770ee0	[C-API][ORC] Add LLVMOrcExecutionSessionLookup -- generic async symbol lookup. An API to wrap ExecutionSession::lookup, this allows C API clients to use async lookup. The immediate motivation for adding this is to simplify upcoming definition-generator unit tests. As we're adding more tests that need to convert between C and C++ flag values this commit adds helper functions to support this. This patch also updates the CAPIDefinitionGenerator to use these new utilities.	2022-06-13 16:37:35 -07:00
Kazu Hirata	5c41b0f429	[Analysis] Remove getUniqueInstruction (NFC) The last use was removed on Apr 7, 2022 in commit `5cefe7d9f5`.	2022-06-13 14:26:20 -07:00
Lang Hames	b425f55693	[C-API][ORC] Fix struct name. This struct was using the wrong prefix (LLVMJIT... vs LLVMOrc...).	2022-06-13 13:53:51 -07:00
Jay Foad	bfcfd53b92	[AMDGPU] Add GFX11 llvm.amdgcn.permlane64 intrinsic Compared to permlane16, permlane64 has no BC input because it has no boundary conditions, no fi input because the instruction acts as if FI were always enabled, and no OLD input because it always writes to every active lane. Also use the new intrinsic in the atomic optimizer pass. Differential Revision: https://reviews.llvm.org/D127662	2022-06-13 21:12:11 +01:00
Guillaume Chatelet	2b89a4dc51	[NFC] Remove dead code	2022-06-13 15:38:27 +00:00
Guillaume Chatelet	8865700f90	[NFC] Remove dead code	2022-06-13 15:38:27 +00:00
Guillaume Chatelet	111b32ecb4	[NFC][Alignment] Use getAlign in Attributor classes	2022-06-13 15:13:05 +00:00
Kazu Hirata	246e83e973	[GlobalISel] Remove buildSequence (NFC) The last use was removed on Jun 27, 2019 in commit `8138996128`.	2022-06-13 06:58:36 -07:00
Jez Ng	d4bcb45db7	[MC][re-land] Omit DWARF unwind info if compact unwind is present where eligible This reverts commit `d941d59783`. Differential Revision: https://reviews.llvm.org/D122258	2022-06-12 17:24:19 -04:00
Jez Ng	d941d59783	Revert "[MC] Omit DWARF unwind info if compact unwind is present where eligible" This reverts commit `ef501bf85d`.	2022-06-12 10:47:08 -04:00
Jez Ng	ef501bf85d	[MC] Omit DWARF unwind info if compact unwind is present where eligible Previously, omitting unnecessary DWARF unwinds was only done in two cases: * For Darwin + aarch64, if no DWARF unwind info is needed for all the functions in a TU, then the `__eh_frame` section would be omitted entirely. If any one function needed DWARF unwind, then MC would emit DWARF unwind entries for all the functions in the TU. * For watchOS, MC would omit DWARF unwind on a per-function basis, as long as compact unwind was available for that function. This diff makes it so that we omit DWARF unwind on a per-function basis for Darwin + aarch64 as well. In addition, we introduce the flag `--emit-dwarf-unwind=` which can toggle between `always`, `no-compact-unwind` (only emit DWARF when CU cannot be emitted for a given function), and the target platform `default`. `no-compact-unwind` is particularly useful for newer x86_64 platforms: we don't want to omit DWARF unwind for x86_64 in general due to possible backwards compat issues, but we should make it possible for people to opt into this behavior if they are only targeting newer platforms. Motivation: I'm working on adding support for `__eh_frame` to LLD, but I'm concerned that we would suffer a perf hit. Processing compact unwind is already expensive, and that's a simpler format than EH frames. Given that MC currently produces one EH frame entry for every compact unwind entry, I don't think processing them will be cheap. I tried to do something clever on LLD's end to drop the unnecessary EH frames at parse time, but this made the code significantly more complex. So I'm looking at fixing this at the MC level instead. Addendum: It turns out that there was a latent bug in the X86 backend when `OmitDwarfIfHaveCompactUnwind` is naively enabled, which is not too surprising given that this combination has not been heretofore used. For functions that have unwind info that cannot be encoded with CU, MC would end up dropping both the compact unwind entry (OK; existing behavior) as well as the DWARF entries (not OK). This diff fixes things so that we emit the DWARF entry, as well as a CU entry with encoding `UNWIND_X86_MODE_DWARF` -- this basically tells the unwinder to look for the DWARF entry. I'm not 100% sure the `UNWIND_X86_MODE_DWARF` CU entry is necessary, this was the simplest fix. ld64 seems to be able to handle both the absence and presence of this CU entry. Ultimately ld64 (and LLD) will synthesize `UNWIND_X86_MODE_DWARF` if it is absent, so there is no impact to the final binary size. Reviewed By: davide, lhames Differential Revision: https://reviews.llvm.org/D122258	2022-06-12 10:03:56 -04:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Mircea Trofin	7e7021ca1a	[mlgo] Update FunctionPropertyCache after invalidating analyses The update depends on LoopInfo, so we need that refreshed first, not after. Differential Revision: https://reviews.llvm.org/D127467	2022-06-10 16:18:14 -07:00
Mitch Phillips	8db981d463	Add sanitizer-specific GlobalValue attributes. Plan is the migrate the global variable metadata for sanitizers, that's currently carried around generally in the 'llvm.asan.globals' section, onto the global variable itself. This patch adds the attribute and plumbs it through the LLVM IR and bitcode formats, but is a no-op other than that so far. Reviewed By: vitalybuka, kstoimenov Differential Revision: https://reviews.llvm.org/D126100	2022-06-10 12:28:18 -07:00
Shraiysh Vaishay	f62baddac0	[OpenMP][IRBuilder] Add final clause to task This patch adds final clause to OpenMP IR Builder. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D126626	2022-06-11 00:02:18 +05:30
Joe Nash	ea3c9a87d3	[AMDGPU] gfx11 add bits to COMPUTE_PGM_RSRC3 Contributors: Konstantin Zhuravlyov <kzhuravl_dev@outlook.com> Patch 21/N for upstreaming of AMDGPU gfx11 architecture Depends on D127143 Reviewed By: rampitec, #amdgpu, kzhuravl Differential Revision: https://reviews.llvm.org/D127241	2022-06-10 13:07:14 -04:00
Guillaume Chatelet	95083fa3b8	[NFC] Remove deadcode	2022-06-10 15:13:42 +00:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the instrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
David Sherwood	8daaea206b	[InstCombine] Use +0.0 instead of -0.0 as the FP identity for some folds In foldSelectIntoOp we sometimes transform a select of a fadd into a fadd of a select, where we select between data and an identity value. For both fadd and fsub the identity is always -0.0, but if the nsz flag is set on the select instruction we can use +0.0 instead. Doing so then triggers other optimisations, such as when folding the select of masked load into a new masked load. Differential Revision: https://reviews.llvm.org/D126774	2022-06-10 12:42:34 +01:00
Nikita Popov	d77f944832	[LoopInfo] Add getOutermostLoop() (NFC) This is a recurring pattern, add an API function for it.	2022-06-10 11:48:21 +02:00
Jay Foad	6c372daa84	[AMDGPU] New GFX11 intrinsic llvm.amdgcn.s.sendmsg.rtn Add new intrinsic and codegen support for the s_sendmsg_rtn_b32 and s_sendmsg_rtn_b64 instructions. Differential Revision: https://reviews.llvm.org/D127315	2022-06-10 08:15:23 +01:00
Peter S. Housel	1aa71f8679	[ORC][ORC_RT] Integrate ORC platforms with LLJIT and lli This change enables integrating orc::LLJIT with the ORCv2 platforms (MachOPlatform and ELFNixPlatform) and the compiler-rt orc runtime. Changes include: - Adding SPS wrapper functions for the orc runtime's dlfcn emulation functions, allowing initialization and deinitialization to be invoked by LLJIT. - Changing the LLJIT code generation default to add UseInitArray so that .init_array constructors are generated for ELF platforms. - Integrating the ORCv2 Platforms into lli, and adding a PlatformSupport implementation to the LLJIT instance used by lli which implements initialization and deinitialization by calling the new wrapper functions in the runtime. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D126492	2022-06-09 22:47:58 -07:00
Philip Reames	b59c2315af	[BasicTTI] Return Invalid cost for more scalable vector scalarization cases Instead of crashing on a cast<FixedVectorType>, we should isntead return Invalid for these cases. This avoids crashes in assert builds, and potential miscompiles in release builds.	2022-06-09 16:10:51 -07:00
Philip Reames	206f10d3f6	Plumb InstructionCost through unroll costing Teach the unroller(s) how to handle an invalid cost. This avoids crashes when the backend can't provide a cost due to either a fundemental limitation or an unimplemented cost model case. Differential Revision: https://reviews.llvm.org/D127305	2022-06-09 15:42:53 -07:00
Philip Reames	f85c5079b8	Pipe potentially invalid InstructionCost through CodeMetrics Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred. On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost. I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change. Differential Revision: https://reviews.llvm.org/D127131	2022-06-09 15:17:24 -07:00
Johannes Doerfert	6555558a80	Revert "[Attributor] Replace AAValueSimplify with AAPotentialValues" This reverts commit `da50dab1ae`. Patch broke AMD GPU OpenMP offload buildbots. https://lab.llvm.org/buildbot/#/builders/193/builds/13246	2022-06-09 17:04:01 +02:00
Simon Moll	746908a038	[NFC] Clang-format PatternMatch.h	2022-06-09 16:51:32 +02:00
Johannes Doerfert	da50dab1ae	[Attributor] Replace AAValueSimplify with AAPotentialValues For the longest time we used `AAValueSimplify` and `genericValueTraversal` to determine "potential values". This was problematic for many reasons: - We recomputed the result a lot as there was no caching for the 9 locations calling `genericValueTraversal`. - We added the idea of "intra" vs. "inter" procedural simplification only as an afterthought. `genericValueTraversal` did offer an option but `AAValueSimplify` did not. Thus, we might end up with "too much" simplification in certain situations and then gave up on it. - Because `genericValueTraversal` was not a real `AA` we ended up with problems like the infinite recursion bug (#54981) as well as code duplication. This patch introduces `AAPotentialValues` and replaces the `AAValueSimplify` uses with it. `genericValueTraversal` is folded into `AAPotentialValues` as are the instruction simplifications performed in `AAValueSimplify` before. We further distinguish "intra" and "inter" procedural simplification now. `AAValueSimplify` was not deleted as we haven't ported the re-materialization of instructions yet. There are other differences over the former handling, e.g., we may not fold trivially foldable instructions right now, e.g., `add i32 1, 1` is not folded to `i32 2` but if an operand would be simplified to `i32 1` we would fold it still. We are also even more aware of function/SCC boundaries in CGSCC passes, which is good. Fixes: https://github.com/llvm/llvm-project/issues/54981	2022-06-09 16:48:53 +02:00
Simon Moll	b8c2781ff6	[NFC] format InstructionSimplify & lowerCaseFunctionNames Clang-format InstructionSimplify and convert all "FunctionName"s to "functionName". This patch does touch a lot of files but gets done with the cleanup of InstructionSimplify in one commit. This is the alternative to the less invasive clang-format only patch: D126783 Reviewed By: spatel, rengolin Differential Revision: https://reviews.llvm.org/D126889	2022-06-09 16:10:08 +02:00
Johannes Doerfert	14899bc43d	[Attributor] Generalize interface from ConstantInt to Constant We can use constant to allow undef and there is no need to force integers in the API anyway. The user can decide if a non integer constant is fine or not.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	7a07b88f37	[Attributor][FIX] Replace call site argument uses, not values We need to be careful replacing values as call site arguments (IRPosition::IRP_CALL_SITE_ARGUMENT) is representing a use and not a value. This patch replaces the interface to take a IR position instead making it harder to misuse accidentally. It does not change our tests right now but a follow up exposed the potential footgun.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	1df6e171c3	[Attributor] Simplify (integer range) state handling We used to be very conservative when integer states were merged. Instead of adding the known range (which is large due to uncertainty) into the assumed range (which is hopefully small), we can also only allow to merge in both at the same time into their respective counterpart. This will ensure we keep the invariant that assumed is part of known.	2022-06-09 12:00:26 +02:00
Johannes Doerfert	481b8f31df	[Attributor][NFC] Introduce helper struct We often use a context associated with a value. For now only one use case has been changed.	2022-06-09 12:00:26 +02:00
Nicolai Hähnle	f971e77fb4	ADT/ArrayRef: Add makeMutableArrayRef overloads Equivalent overloads already exist for makeArrayRef. Differential Revision: https://reviews.llvm.org/D126421	2022-06-09 09:59:50 +02:00
Lang Hames	3fcd3669e3	[ORC] Add an output stream operator for SymbolStringPool. Handy for checking string pool state, e.g. when debugging dangling-pool-entry errors.	2022-06-08 16:49:51 -07:00
Florian Mayer	0593ce5f0b	[MC] Add 'G' to augmentation string for MTE instrumented functions This was agreed on in https://lists.llvm.org/pipermail/llvm-dev/2020-May/141345.html The thread proposed two options * add a character to augmentation string and handle in libuwind * use a separate personality function. It was determined that this is the simpler and better option. This is part of ARM's Aarch64 ABI: https://github.com/ARM-software/abi-aa/blob/main/aadwarf64/aadwarf64.rst#id22 The next step after this is teaching libunwind to untag when this augmentation character is set. Reviewed By: MaskRay, eugenis Differential Revision: https://reviews.llvm.org/D127007	2022-06-08 12:36:32 -07:00
Hongtao Yu	ab34ab2b87	[PseudoProbe] Use callee name as callsite identfier for MCDecodedPseudoProbeInlineTree. The callsite identifier used in pseudo probe encoding and decoding is consisted of a function name and the callsite probe id. For encoding, i.e., `MCPseudoProbeInlineTree`, the function name is callee function name. However for decoding, i.e., `MCDecodedPseudoProbeInlineTree`, the caller function name is used actually. This results in multiple callees that are inlined at the same callsite, likely via indirect call promotion, sharing the same decoded inline frame. While it is not a problem for profile generation, it confuses probe re-encoding in Bolt. In Bolt, we decode pseudo probes first and build `MCDecodedPseudoProbeInlineTree`. The decoded tree is used for final re-encoding. Here comes the problem. Two inlinees from the same callsite share the same decoded inline frame. During re-encoding, the frame name (whatever inlinee comes first) will be used and encoded in the bolted binary. This will cause wrong inline contexts in the profile generated on the bolted binary. The fix is a no-op to pre-bolt profile generation. Some of the bolt tests are not yet upstreamed, thus I'm not adding a bolt test here. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D126434	2022-06-08 10:54:40 -07:00
Thomas Lively	aff679a48c	[WebAssembly] Implement remaining relaxed SIMD instructions Add codegen, intrinsics, and builtins for the i16x8.relaxed_q15mulr_s, i16x8.dot_i8x16_i7x16_s, and i32x4.dot_i8x16_i7x16_add_s instructions. These are the last instructions from the relaxed SIMD proposal[1] that had not been implemented. [1]: https://github.com/WebAssembly/relaxed-simd/blob/main/proposals/relaxed-simd/Overview.md. Differential Revision: https://reviews.llvm.org/D127170	2022-06-08 10:32:10 -07:00
Philip Reames	f0d2a55d3a	Restore isa<Ty>(X) asserts inside cast<Ty>(X) PLEASE DO NOT REVERT without careful consideration, and preferably prior discussion. cast<Ty>(X) is a "checked cast". Its entire purpose is explicitly documented (https://llvm.org/docs/ProgrammersManual.html#the-isa-cast-and-dyn-cast templates) as catching bad casts by asserting that the cast is valid. Unfortunately, in a recent rewrite of our casting infrastructure about three months back, these asserts got dropped. This is discussed in more detail on discourse in https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033. Differential Revision: https://reviews.llvm.org/D127231	2022-06-08 07:32:37 -07:00
Paul Walker	d88354213c	[SelectionDAG] Remove invalid TypeSize conversion from PromoteIntRes_BITCAST. Extend the TypeWidenVector case of PromoteIntRes_BITCAST to work with TypeSize directly rather than silently casting to unsigned. To accomplish this I've extended TypeSize with an interface that essentially allows TypeSize division when both operands have the same number of dimensions. There still exists combinations of scalable vector bitcasts that cause compiler crashes. I call these out by adding "is missing" entries to sve-bitcast. Depends on D126957. Fixes: #55114 Differential Revision: https://reviews.llvm.org/D127126	2022-06-08 10:30:07 +01:00
Nathan James	638b0fb4d6	[ADT][NFC] Early bail out for ComputeEditDistance The minimun bound for number of edits is the size difference between the 2 arrays. If MaxEditDistance is smaller than this, we can bail out early without needing to traverse any of the arrays. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D127070	2022-06-08 08:20:29 +01:00
Wolfgang Pieb	213eb424e8	Revert "[Metadata] Add a resize capability to MDNodes and add a push_back interface to MDNodes" This reverts commit `e3f6eda8c6`. Failure in unittest on https://lab.llvm.org/buildbot*builders/171/builds/15666	2022-06-07 15:48:31 -07:00
Wolfgang Pieb	e3f6eda8c6	[Metadata] Add a resize capability to MDNodes and add a push_back interface to MDNodes A change to the allocation characteristics of MDNodes, introducing the ability to add operands one at a time. This functionality is restricted to MDTuples. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D125998	2022-06-07 14:34:38 -07:00
Sunho Kim	9f29916169	[JITLink][AArch64] Refactor isLoadStoreImm12 check out of getPageOffset12Shift. The separate isLoadStoreImm12 predicate will be used for validating ELF/aarch64 ldst relocation types. Reviewed By: lhames, sgraenitz Differential Revision: https://reviews.llvm.org/D126628	2022-06-07 13:18:12 -07:00
Joseph Huber	f06731e3c3	[Binary] Make the OffloadingImage type own the memory Summary: The OffloadingBinary uses a convenience struct to help manage the memory that will be serialized using the binary format. This currently uses a reference to an existing buffer, but this should own the memory instead so it is easier to work with seeing as its only current use requires saving the buffer anyway.	2022-06-07 15:56:09 -04:00
Philip Reames	781de11f42	Revert "[LLVM][Casting.h] Add trivial self-cast" This reverts commit `0809f63826`. The patch appears not to have included corresponding isa<Ty> support. This was revealed when reintroducing the required isa<Ty> asserts in cast<Ty>. See https://discourse.llvm.org/t/cast-x-is-broken-implications-and-proposal-to-address/63033 for context. Here's the template instantiation error: In file included from /home/preames/llvm-repo/llvm-project/llvm/unittests/Support/Casting.cpp:9: /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h: In instantiation of ‘static bool llvm::isa_impl<To, From, Enabler>::doit(const From&) [with To = llvm::bar; From = llvm::bar; Enabler = void]’: /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:110:36: required from ‘static bool llvm::isa_impl_cl<To, const From>::doit(const From) [with To = llvm::bar; From = llvm::bar]’ /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:137:41: required from ‘static bool llvm::isa_impl_wrap<To, FromTy, FromTy>::doit(const FromTy&) [with To = llvm::bar; FromTy = const llvm::bar]’ /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:129:13: required from ‘static bool llvm::isa_impl_wrap<To, From, SimpleFrom>::doit(const From&) [with To = llvm::bar; From = const llvm::bar const; SimpleFrom = const llvm::bar]’ /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:263:62: required from ‘static bool llvm::CastIsPossible<To, From, Enable>::isPossible(const From&) [with To = llvm::bar; From = const llvm::bar; Enable = void]’ /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:517:38: required from ‘static bool llvm::CastInfo<To, From, typename std::enable_if<(! llvm::is_simple_type<From>::value), void>::type>::isPossible(From&) [with To = llvm::bar; From = llvm::bar* const]’ /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:556:46: required from ‘bool llvm::isa(const From&) [with To = llvm::bar; From = llvm::bar]’ /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:585:3: required from ‘decltype(auto) llvm::cast(From) [with To = llvm::bar; From = llvm::bar]’ /home/preames/llvm-repo/llvm-project/llvm/unittests/Support/Casting.cpp:181:27: required from here /home/preames/llvm-repo/llvm-project/llvm/include/llvm/Support/Casting.h:64:64: error: ‘classof’ is not a member of ‘llvm::bar*’ 64 \| static inline bool doit(const From &Val) { return To::classof(&Val); }	2022-06-07 12:50:40 -07:00
Derek Schuff	2ae385e560	[WebAssembly] Add WASM_SEC_LAST_KNOWN to BinaryFormat section types list [NFC] There are 3 places where we were using WASM_SEC_TAG as the "last" known section type, which requires updating (or leaves a bug) when a new known section type is added. Instead add a "last type" to the enum for this purpose. Differential Revision: https://reviews.llvm.org/D127164	2022-06-07 12:05:23 -07:00
Sunho Kim	b6553f592a	[JITLink][ELF][AArch64] Lift MachO/arm64 edges into aarch64.h, reuse for ELF. This patch moves the aarch64 fixup logic from the MachO/arm64 backend to aarch64.h header so that it can be re-used in the ELF/aarch64 backend. This significantly expands relocation support in the ELF/aarch64 backend. Reviewed By: lhames, sgraenitz Differential Revision: https://reviews.llvm.org/D126286	2022-06-07 12:01:43 -07:00
Reid Kleckner	570e76bb6c	[config] Remove vestigial LLVM_VERSION_INFO This has been superseded by the llvm/Support/VCSRevision.h header. So far as I can tell, nothing in the CMake build sets LLVM_VERSION_INFO. It was always undefined, and the ifdefs using it were dead. However, CMake is very flexible, so it's possible that I missed some ways to set this variable. One could, for example, probably pass -DLLVM_VERSION_INFO=x on the command line and get that through to configure_file, or set the variable in an obscure way (`set(${proj}_VERSION_INFO "x")`). I'm reasonably confident that isn't happening, but I'd like a second opinion. Update the Bazel and gn builds accordingly. Differential Revision: https://reviews.llvm.org/D126977	2022-06-07 11:36:26 -07:00
Reid Kleckner	b1c7889f32	[config] Remove RETSIGTYPE from config.h.cmake, NFC This doesn't need to be configurable. It was hardcoded to void in all LLVM build systems.	2022-06-07 11:35:25 -07:00
Matt Arsenault	cc5a1b3dd9	llvm-reduce: Add cloning of target MachineFunctionInfo MIR support is totally unusable for AMDGPU without this, since the set of reserved registers is set from fields here. Add a clone method to MachineFunctionInfo. This is a subtle variant of the copy constructor that is required if there are any MIR constructs that use pointers. Specifically, at minimum fields that reference MachineBasicBlocks or the MachineFunction need to be adjusted to the values in the new function.	2022-06-07 10:14:48 -04:00
Matt Arsenault	56303223ac	llvm-reduce: Don't assert on functions which don't track liveness Use the query that doesn't assert if TracksLiveness isn't set, which needs to always be available. We also need to start printing liveins regardless of TracksLiveness.	2022-06-07 10:00:25 -04:00
Guillaume Chatelet	0788186182	[Alignment][NFC] Remove usage of MemSDNode::getAlignment I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times. Differential Revision: https://reviews.llvm.org/D126910	2022-06-07 13:52:20 +00:00
Jay Foad	1feed6691a	[APInt] Remove truncOrSelf, zextOrSelf and sextOrSelf Differential Revision: https://reviews.llvm.org/D125559	2022-06-07 10:01:49 +01:00
Fangrui Song	15d82c62dc	[MC] De-capitalize MCStreamer functions Follow-up to `c031378ce0` . The class is mostly consistent now.	2022-06-07 00:31:02 -07:00
luxufan	a7b154aa17	[MC][ARM] Reuse symbol value in constant pool Fix https://github.com/llvm/llvm-project/issues/55816 Before this patch, MCConstantExpr were reused, but MCSymbolExpr were not. To reuse symbol value, this patch added a DenseMap to record the symbol value. Differential Revision: https://reviews.llvm.org/D127113	2022-06-07 13:39:52 +08:00
Chris Bieneman	21c9452305	[DX][ObjYAML] Support for parsing DXIL part This patch adds support for parsing the DXIL part data into the ObjectYAML tooling. The DXIL part has additional headers describing the shader and bitcode data and stores serialized bitcode after the headers. Depends on D124945 Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D126795	2022-06-06 18:46:19 -05:00
Philip Reames	c1fb8bd777	[BasicTTI] Add missing scalable vector handling BasicTTI needs to return an invalid cost for scalable vectors instead of crash. Without this, it is impossible to write tests for missing functionality in a target.	2022-06-06 14:21:41 -07:00
Chris Bieneman	352c395fb6	[ObjectYAML][DX] Add dxcontainer2yaml support This change finishes fleshing out the ObjectYAML tools to support converting DXContainer files into yaml representations. Depends on D124944 Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D124945	2022-06-06 13:23:29 -05:00
Michael Kitzan	b7fcf6632f	[GISel] Add new combines for G_ADD Patch adds new GICombineRules for G_ADD: G_ADD(x, G_SUB(y, x)) -> y G_ADD(G_SUB(y, x), x) -> y Patch additionally adds new combine tests for AArch64 target for these new rules. Reviewed by: paquette Differential Revision: https://reviews.llvm.org/D87936	2022-06-06 11:19:45 -07:00
Nimish Mishra	6a3c4a40f4	[flang][OpenMP] Added parser support for in_reduction clause OpenMP 5.0 adds a new clause `in_reduction` on OpenMP directives. This patch adds parser support for the same. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D124156	2022-06-06 14:55:27 +05:30
Kazu Hirata	7c009d2c31	[PDB] Remove truncate* (NFC) - truncateQuotedNameFront: The last use was removed on Jul 10, 2017 in commit `a9d944fd6f`. - truncateQuotedNameBack: The last use was removed on Mar 26, 2018 in commit `7b84b678a9`. - truncateStringMiddle: The last use was removed on Mar 26, 2018 in commit `7b84b678a9`. - truncateStringBack: The last use is in truncateQuotedNameBack being removed above. - truncateStringFront: The last use is in truncateQuotedNameFront being removed above.	2022-06-05 23:33:51 -07:00
Kazu Hirata	43d4585e64	[GlobalISel] Remove widenWithUnmerge (NFC) The last use was removed on Dec 23, 2021 in commit `29f88b93fd`.	2022-06-05 19:58:18 -07:00
Kazu Hirata	61abcb0b37	[GlobalISel] Remove valueIsSplit (NFC) The last use was removed on Jun 27, 2019 in commit `8138996128`.	2022-06-05 19:51:03 -07:00
Kazu Hirata	3b9707dbc0	[llvm] Convert for_each to range-based for loops (NFC)	2022-06-05 12:07:14 -07:00
Alexey Lapshin	501d5b24db	[Debuginfo][DWARF][NFC] Refactor DwarfStringPoolEntryRef - remove isIndexed(). This patch is extraction from the https://reviews.llvm.org/D126883. It removes DwarfStringPoolEntryRef::isIndexed() and isIndexed bit since they are not used. Differential Revision: https://reviews.llvm.org/D126958	2022-06-05 21:18:31 +03:00
Nathan James	a13b61f7f0	[ADT] Add edit_distance_insensitive to StringRef In some instances its advantageous to calculate edit distances without worrying about casing. Currently to achieve this both strings need to be converted to the same case first, then edit distance can be calculated. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D126159	2022-06-05 12:03:09 +01:00
Kazu Hirata	4969a6924d	Use llvm::less_first (NFC)	2022-06-04 21:23:18 -07:00
Fangrui Song	557efc9a8b	[llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!` error. More were added due to cargo cult. Since the error has been removed, cl::ZeroOrMore is unneeded. Also remove cl::init(false) while touching the lines.	2022-06-03 21:59:05 -07:00
Reid Kleckner	0a832ba5c2	[config] Remove LLVM_DEFAULT_TARGET_TRILE from config.h It is redundant with llvm-config.h, which is always included by config.h. Port D12660 / `d178f4fc89` from config.h to llvm-config.h. Update the gn build accordingly. NFCI	2022-06-03 10:15:46 -07:00
Alvin Wong	bb94611d65	[COFF] Check table ptr more thoroughly and ignore empty sections When loading split debug files for PE/COFF executables (produced with `objcopy --only-keep-debug`), the tables or directories in such files may point to data inside sections that may have been stripped. COFFObjectFile shall detect and gracefully handle this, to allow the object file be loaded without considering these tables or directories. This is required for LLDB to load these files for use as debug symbols. COFFObjectFile shall also check these pointers more carefully to account for cases in which the section contains less raw data than the size given by VirtualSize, to prevent going out of bounds. This commit also changes COFFDump in llvm-objdump to reuse the pointers that are already range-checked in COFFObjectFile. This fixes a crash when trying to dump the TLS directory from a stripped file. Fixes https://github.com/mstorsjo/llvm-mingw/issues/284 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D126898	2022-06-03 18:31:01 +03:00
Serguei Katkov	24e16e4af2	[SSAUpdaterImpl] Do not generate phi node with all the same incoming values If all available vals to basic block are the same - do not build new phi node and just use this value. Reviewed By: sameerds Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D126525	2022-06-03 12:24:33 +07:00
Mingming Liu	8601f269f1	[Inline][Remark][NFC] Optionally provide inline context to inline advisor. This patch has no functional change, and merely a preparation patch for main functional change. The motivating use case is to annotate inline remark pass name with context information (e.g. prelink or postlink, CGSCC or always-inliner), see D125495 for more details. Differential Revision: https://reviews.llvm.org/D126824	2022-06-02 13:14:30 -07:00
Balazs Benics	7d24641f89	[llvm][analyzer][NFC] Introduce SFINAE for specializing FoldingSetTraits Reviewed By: martong Differential Revision: https://reviews.llvm.org/D126803	2022-06-02 19:46:38 +02:00
Liqiang Tao	14e8add939	[llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder This patch introduces the abstract base class InlinePriority to serve as the comparison function for the priority queue. A derived class, such as SizePriority, may choose to cache the priorities for different functions for performance reasons. This design shields the type used for the priority away from classes outside InlinePriority and classes derived from it. In turn, PriorityInlineOrder no longer needs to be a template class. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D126300	2022-06-02 23:40:26 +08:00
Liqiang Tao	5c6ed60c51	Revert "[llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder" This reverts commit `50de7f1e77`.	2022-06-02 23:18:47 +08:00
Liqiang Tao	50de7f1e77	[llvm][ModuleInliner] Refactor InlineSizePriority and PriorityInlineOrder This patch introduces the abstract base class InlinePriority to serve as the comparison function for the priority queue. A derived class, such as SizePriority, may choose to cache the priorities for different functions for performance reasons. This design shields the type used for the priority away from classes outside InlinePriority and classes derived from it. In turn, PriorityInlineOrder no longer needs to be a template class. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D126300	2022-06-02 22:28:33 +08:00
Joseph Huber	6bdf352ed8	[Binary] Remove OffloadBinary from the Objects enumeration Summary: We use the beginning and end of this enumeration to determine what is and isn't an object format. The enumeration for the OffloadBinary was put here by mistake which led to it being mistakenly classified as an Object file.	2022-06-02 09:34:23 -04:00
Paul Walker	1fe4953d89	[SVE] Remove custom lowering of scalable vector MGATHER & MSCATTER operations. Differential Revision: https://reviews.llvm.org/D126255	2022-06-02 11:19:52 +01:00
Snehasish Kumar	962db7de84	[memprof] Update summary output. Update the YAML format print out of the profile to include a summary instead of displaying the headers in the raw file buffer. This allows us to release the raw buffer early saving memory. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D126834	2022-06-02 02:15:42 +00:00
Matthias Braun	850d53a197	LTO: Decide upfront whether to use opaque/non-opaque pointer types LTO code may end up mixing bitcode files from various sources varying in their use of opaque pointer types. The current strategy to decide between opaque / typed pointers upon the first bitcode file loaded does not work here, since we could be loading a non-opaque bitcode file first and would then be unable to load any files with opaque pointer types later. So for LTO this: - Adds an `lto::Config::OpaquePointer` option and enforces an upfront decision between the two modes. - Adds `-opaque-pointers`/`-no-opaque-pointers` options to the gold plugin; disabled by default. - `--opaque-pointers`/`--no-opaque-pointers` options with `-plugin-opt=-opaque-pointers`/`-plugin-opt=-no-opaque-pointers` aliases to lld; disabled by default. - Adds an `-lto-opaque-pointers` option to the `llvm-lto2` tool. - Changes the clang driver to pass `-plugin-opt=-opaque-pointers` to the linker in LTO modes when clang was configured with opaque pointers enabled by default. This fixes https://github.com/llvm/llvm-project/issues/55377 Differential Revision: https://reviews.llvm.org/D125847	2022-06-01 18:05:53 -07:00
Hendrik Greving	a92ed167f2	[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4. Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4. Keeps MVT::i2, MVT::i4 lowering actions as expand, which should be removed once targets set this explicitly. Adjusts 11 lit tests to reflect slightly different behavior during DAG combine. Differential Revision: https://reviews.llvm.org/D125247	2022-06-02 00:49:11 +00:00
Quentin Colombet	1a155ee7de	[RegisterClassInfo] Invalidate cached information if ignoreCSRForAllocationOrder changes Even if CSR list is same between functions, we could have had a different allocation order if ignoreCSRForAllocationOrder is evaluated differently. Hence invalidate cached register class information if ignoreCSRForAllocationOrder changes. Patch by Srividya Karumuri <srividya_karumuri@apple.com> Differential Revision: https://reviews.llvm.org/D126565	2022-06-01 17:15:51 -07:00
Shilei Tian	eb673be5ac	[OMPIRBuilder] Add the support for compare capture This patch adds the support for `compare capture` in `OMPIRBuilder`. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D120007	2022-06-01 19:53:43 -04:00
Joseph Huber	afd2f7e991	[Binary] Promote OffloadBinary to inherit from Binary We use the `OffloadBinary` to create binary images of offloading files and their corresonding metadata. This patch changes this to inherit from the base `Binary` class. This allows us to create and insepect these more generically. This patch includes all the necessary glue to implement this as a new binary format, along with added the magic bytes we use to distinguish the offloading binary to the `file_magic` implementation. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D126812	2022-06-01 18:40:57 -04:00
Chris Bieneman	129c056d62	[ObjectYAML][DX] Support yaml2dxcontainer This patch adds a the first bits of support for a yaml representation of dxcontainer files. Since the YAML representation's primary purpose is testing infrastructure, the yaml representation supports both verbose and a more friendly format by making computable sizes and offsets optional. If provided they are validated to be correct, otherwise they are computed on the fly during emission. As I expand the format I'll be able to make more size fields optional, and I will continue to make the format easier to work with. Depends on D124804 Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D124944	2022-06-01 15:34:00 -05:00
Hendrik Greving	e9d05cc7d8	Revert "[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4." This reverts commit `430ac5c302`. Due to failures in Clang tests. Differential Revision: https://reviews.llvm.org/D125247	2022-06-01 13:27:49 -07:00
Chris Bieneman	9e3919dac4	[Object][DX] Parse DXContainer Parts DXContainer files are structured as parts. This patch adds support for parsing out the file part offsets and file part headers. Reviewed By: kuhar Differential Revision: https://reviews.llvm.org/D124804	2022-06-01 14:55:36 -05:00
Hendrik Greving	430ac5c302	[ValueTypes] Define MVTs for v128i2/v64i4 as well as i2 and i4. Adds MVT::v128i2, MVT::v64i4, and implied MVT::i2, MVT::i4. Keeps MVT::i2, MVT::i4 lowering actions as `expand`, which should be removed once targets set this explicitly. Adjusts 11 lit tests to reflect slightly different behavior during DAG combine. Differential Revision: https://reviews.llvm.org/D125247	2022-06-01 12:48:01 -07:00
Matt Arsenault	0e1c71e4a4	CodeGen: Move getAddressSpaceForPseudoSourceKind into TargetMachine Avoid the dependency on TargetInstrInfo, which depends on the subtarget and therefore the individual function. Currently AMDGPU is constructing PseudoSourceValue instances in MachineFunctionInfo. In order to facilitate copying MachineFunctionInfo, we need to stop allocating these there. Alternatively we could allow targets to subclass PseudoSourceValueManager, and allocate them similarly to MachineFunctionInfo.	2022-06-01 09:45:40 -04:00
Alexander Kornienko	7aa8a67882	Revert "[LAA] Initial support for runtime checks with pointer selects." This reverts commit `5890b30105` as per discussion on the review thread: https://reviews.llvm.org/D114487#3547560.	2022-06-01 15:24:27 +02:00
Florian Hahn	f68c547158	[LAA] Remove unused RuntimeCheckingPtrGroup constructor (NFC). The constructor is not used. Remove it.	2022-06-01 13:30:33 +01:00
Sheng	3fd75ce9c4	[NFC] fix typo	2022-06-01 18:48:03 +08:00
Martin Storsjö	298e9cac92	[MC] [Win64EH] Check that the SEH unwind opcodes match the actual instructions It's a fairly common issue that the generating code incorrectly marks instructions as narrow or wide; check that the instruction lengths add up to the expected value, and error out if it doesn't. This allows catching code generation bugs. Also check that prologs and epilogs are properly terminated, to catch other code generation issues. Differential Revision: https://reviews.llvm.org/D125647	2022-06-01 11:25:49 +03:00
Martin Storsjö	6b75a3523f	[ARM] [MC] Add support for writing ARM WinEH unwind info This includes .seh_* directives for generating it from assembly. It is designed fairly similarly to the ARM64 handling. For .seh_handler directives, such as ".seh_handler __C_specific_handler, @except" (which is supported on x86_64 and aarch64 so far), the "@except" bit doesn't work in ARM assembly, as '@' is used as a comment character (on all current platforms). Allow using '%' instead of '@' for this purpose. This convention is used by GAS in similar contexts already, e.g. [1]: Note on targets where the @ character is the start of a comment (eg ARM) then another character is used instead. For example the ARM port uses the % character. In practice, this unfortunately means that all such .seh_handler directives will need ifdefs for ARM. Contrary to ARM64, on ARM, it's quite common that we can't evaluate e.g. the function length at this point, due to instructions whose length is finalized later. (Also, inline jump tables end with a ".p2align 1".) If unable to to evaluate the function length immediately, emit it as an MCExpr instead. If we'd implement splitting the unwind info for a function (which isn't implemented for ARM64 yet either), we wouldn't know whether we need to split it though. Avoid calling getFrameIndexOffset() on an unset FuncInfo.UnwindHelpFrameIdx, to avoid triggering asserts in the preexisting testcase CodeGen/ARM/Windows/wineh-basic.ll. (Once MSVC exception handling is fully implemented, those changes can be reverted.) [1] https://sourceware.org/binutils/docs/as/Section.html#Section Differential Revision: https://reviews.llvm.org/D125645	2022-06-01 11:25:48 +03:00
Martin Storsjö	e71b07e468	[MC] [Win64EH] Wrap the epilog instructions in a struct. NFC. For ARM SEH, the epilogs will need a little more associated data than just the plain list of opcodes. This is a preparatory refactoring for D125645. Differential Revision: https://reviews.llvm.org/D125879	2022-06-01 11:25:48 +03:00
Mircea Trofin	f46dd19b48	[mlgo] Incrementally update FunctionPropertiesInfo during inlining Re-computing FunctionPropertiesInfo after each inlining may be very time consuming: in certain cases, e.g. large caller with lots of callsites, and when the overall IR doesn't increase (thus not tripping a size bloat threshold). This patch addresses this by incrementally updating FunctionPropertiesInfo. Differential Revision: https://reviews.llvm.org/D125841	2022-05-31 17:27:32 -07:00
Augie Fackler	42861faa8e	attributes: introduce allockind attr for describing allocator fn behavior I chose to encode the allockind information in a string constant because otherwise we would get a bit of an explosion of keywords to deal with the possible permutations of allocation function types. I'm not sure that CodeGen.h is the correct place for this enum, but it seemed to kind of match the UWTableKind enum so I put it in the same place. Constructive suggestions on a better location most certainly encouraged. Differential Revision: https://reviews.llvm.org/D123088	2022-05-31 10:01:17 -04:00
Simon Moll	18c1ee04de	Re-land "[VP] vp intrinsics are not speculatable" with test fix Update the llvmir-intrinsics.mlir test to account for the modified attribute sets. This reverts commit `2e2a8a2d90`.	2022-05-30 14:41:15 +02:00
Mehdi Amini	2e2a8a2d90	Revert "[VP] vp intrinsics are not speculatable" This reverts commit `78a18d2b54`. Break MLIR bot: https://lab.llvm.org/buildbot/#/builders/61/builds/27127	2022-05-30 12:26:16 +00:00
Simon Moll	78a18d2b54	[VP] vp intrinsics are not speculatable VP intrinsics show UB if the %evl parameter is out of bounds - they must not carry the speculatable attribute. The out-of-bounds UB disappears when the %evl parameter is expanded into the mask or expansion replaces the entire VP intrinsic with non-VP code. This patch - Removes the speculatable attribute on all VP intrinsics. - Generalizes the isSafeToSpeculativelyExecute function to let VP expansion know whether the VP intrinsic replacement will be speculatable. VP expansion may only discard %evl where this is the case. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D125296	2022-05-30 12:20:05 +02:00
Yuki Okushi	bc08a16d82	Remove `deplibs` keyword completely D102763 removed the almost support of `deplibs` but it seems `kw_deplibs` was missed. This patch removes it. Differential Revision: https://reviews.llvm.org/D126527	2022-05-29 01:16:44 +09:00
eopXD	6a84579243	[LSR][TTI][PowerPC][SystemZ][X86] Add const-ness to TTI::isLSRCostLess. NFC Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D126350	2022-05-27 15:22:23 -07:00
Derek Schuff	a205f2904d	[WebAssembly] Consolidate sectionTypeToString in BinaryFormat [NFC] Currently there are 2 duplicate implementation, and I want to add a use in a 3rd place. Combine them in lib/BinaryFormat so they can be shared. Also update toString for symbol and reloc types to use StringRef Differential Revision: https://reviews.llvm.org/D126553	2022-05-27 09:26:36 -07:00
Balazs Benics	a73b50ad06	Revert "[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable" This reverts commit `3988bd1398`. Did not build on this bot: https://lab.llvm.org/buildbot#builders/215/builds/6372 /usr/include/c++/9/bits/predefined_ops.h:177:11: error: no match for call to ‘(llvm::less_first) (std::pair<long unsigned int, llvm::bolt::BinaryBasicBlock>&, const std::pair<long unsigned int, std::nullptr_t>&)’ 177 \| { return bool(_M_comp(__it, __val)); }	2022-05-27 11:19:18 +02:00
Balazs Benics	3988bd1398	[llvm][clang][bolt][NFC] Use llvm::less_first() when applicable One could reuse this functor instead of rolling out your own version. There were a couple other cases where the code was similar, but not quite the same, such as it might have an assertion in the lambda or other constructs. Thus, I've not touched any of those, as it might change the behavior in some way. As per https://discourse.llvm.org/t/submitting-simple-nfc-patches/62640/3?u=steakhal Chris Lattner > LLVM intentionally has a “yes, you can apply common sense judgement to > things” policy when it comes to code review. If you are doing mechanical > patches (e.g. adopting less_first) that apply to the entire monorepo, > then you don’t need everyone in the monorepo to sign off on it. Having > some +1 validation from someone is useful, but you don’t need everyone > whose code you touch to weigh in. Differential Revision: https://reviews.llvm.org/D126068	2022-05-27 11:15:23 +02:00
Serge Pavlov	bdd0093f4d	[GlobalISel] Add G_IS_FPCLASS Add a generic opcode to represent `llvm.is_fpclass` intrinsic. Differential Revision: https://reviews.llvm.org/D121454	2022-05-27 13:49:47 +07:00
Piggy NL	842e48bd65	[demangler][RISCV] Fix for long double Summary: The size of long double in RISCV (both RV32 and RV64) is 16 bytes, thus the mangled_size shouble be 32. This patch will fix test case "_ZN5test01hIfEEvRAcvjplstT_Le4001a000000000000000E_c" in test_demangle.pass.cpp, which is expected to be invalid but demangler returned "void test0::h<float>(char (&) [(unsigned int)((sizeof (float)) + (0x0.000000004001ap-16382L))])" in RISCV environment without this patch. Reviewed By: urnathan Differential Revision: https://reviews.llvm.org/D126480	2022-05-27 13:53:20 +08:00
Rahman Lavaee	08cc058518	Reland "[Propeller] Promote functions with propeller profiles to .text.hot." This relands commit `4d8d2580c5`. The major change here is using 'addUsedIfAvailable<BasicBlockSectionsProfileReader>()` to make sure we don't change the pipeline tests. Differential Revision: https://reviews.llvm.org/D126518	2022-05-26 19:53:14 -07:00
Enna1	52992f136b	Add !nosanitize to FixedMetadataKinds This patch adds !nosanitize metadata to FixedMetadataKinds.def, !nosanitize indicates that LLVM should not insert any sanitizer instrumentation. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D126294	2022-05-27 09:46:13 +08:00
Rahman Lavaee	3aa249329f	Revert "[Propeller] Promote functions with propeller profiles to .text.hot." This reverts commit `4d8d2580c5`.	2022-05-26 18:45:40 -07:00
Rahman Lavaee	4d8d2580c5	[Propeller] Promote functions with propeller profiles to .text.hot. Today, text section prefixes (none, .unlikely, .hot, and .unkown) are determined based on PGO profile. However, Propeller may deem a function hot when PGO doesn't. Besides, when `-Wl,-keep-text-section-prefix=true` Propeller cannot enforce a global section ordering as the linker can only reorder sections within each output section (.text, .text.hot, .text.unlikely). This patch promotes all functions with Propeller profiles (functions listed in the basic-block-sections profile) to .text.hot. The feature is hidden behind the flag `--bbsections-guided-section-prefix` which defaults to `true`. The new implementation refactors the parsing of basic block sections profile into a new `BasicBlockSectionsProfileReader` analysis pass. This allows us to use the information earlier in `CodeGenPrepare` in order to set the functions text prefix. `BasicBlockSectionsProfileReader` will be used both by `BasicBlockSections` pass and `CodeGenPrepare`. Differential Revision: https://reviews.llvm.org/D122930	2022-05-26 16:23:21 -07:00
Arthur Eubanks	36096c2b38	[NFC][JumpThreading] Remove InsertFreezeWhenUnfoldingSelect pass parameter All callers pass true. select-unfold-freeze.ll is now a subset of select.ll so delete it. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D126501	2022-05-26 16:13:34 -07:00
Adrian Tong	7c13ae6490	Give option to use isCopyInstr to determine which MI is treated as Copy instruction in MCP. This is then used in AArch64 to remove copy instructions after taildup ran in machine block placement Differential Revision: https://reviews.llvm.org/D125335	2022-05-26 18:43:16 +00:00
Zongwei Lan	ad73ce318e	[Target] use getSubtarget<> instead of static_cast<>(getSubtarget()) Differential Revision: https://reviews.llvm.org/D125391	2022-05-26 11:22:41 -07:00
Bruno Cardoso Lopes	ce54b22657	[Clang][CoverageMapping] Fix switch counter codegen compile time explosion C++ generated code with huge amount of switch cases chokes badly while emitting coverage mapping, in our specific testcase (~72k cases), it won't stop after hours. After this change, the frontend job now finishes in 4.5s and shrinks down `@__covrec_` by 288k when compared to disabling simplification altogether. There's probably no good way to create a testcase for this, but it's easy to reproduce, just add thousands of cases in the below switch, and build with `-fprofile-instr-generate -fcoverage-mapping`. ``` enum type : int { FEATURE_INVALID = 0, FEATURE_A = 1, ... }; const char *to_string(type e) { switch (e) { case type::FEATURE_INVALID: return "FEATURE_INVALID"; case type::FEATURE_A: return "FEATURE_A";} ... } ``` Differential Revision: https://reviews.llvm.org/D126345	2022-05-26 11:05:15 -07:00
Owen Anderson	939a43461b	Revert "Replace the custom linked list in LeaderTableEntry with TinyPtrVector." This reverts commit `1e91149844`. Pending further discussion.	2022-05-26 09:50:36 -07:00
Krzysztof Parzyszek	aee6b8efd0	[ADT] Explicitly delete copy/move constructors and operator= in IntervalMap The default implementations will perform a shallow copy instead of a deep copy, causing some internal data structures to be shared between different objects. Disable these operations so they don't get accidentally used. Differential Revision: https://reviews.llvm.org/D126401	2022-05-26 07:58:18 -07:00
Paul Robinson	634c8ef69a	[PS5] Allow dllimport/dllexport same as PS4	2022-05-26 07:01:30 -07:00
Chen Zheng	d79275238f	[MachineSink] replace MachineLoop with MachineCycle reapply `62a9b36fcf` and fix module build failue: 1: remove MachineCycleInfoWrapperPass in MachinePassRegistry.def MachineCycleInfoWrapperPass is a anylysis pass, should not be there. 2: move the definition for MachineCycleInfoPrinterPass to cpp file. Otherwise, there are module conflicit for MachineCycleInfoWrapperPass in MachinePassRegistry.def and MachineCycleAnalysis.h after `62a9b36fcf`. MachineCycle can handle irreducible loop. Natural loop analysis (MachineLoop) can not return correct loop depth if the loop is irreducible loop. And MachineSink is sensitive to the loop depth, see MachineSinking::isProfitableToSinkTo(). This patch tries to use MachineCycle so that we can handle irreducible loop better. Reviewed By: sameerds, MatzeB Differential Revision: https://reviews.llvm.org/D123995	2022-05-26 06:45:23 -04:00
Ivan Kosarev	ad1d60c3be	[FileCheck] Catch missspelled directives. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D125604	2022-05-26 11:37:19 +01:00
Fangrui Song	9ee15bba47	[MC] Lower case the first letter of EmitCOFF* EmitWin* EmitCV*. NFC	2022-05-26 00:14:08 -07:00
Owen Anderson	1e91149844	Replace the custom linked list in LeaderTableEntry with TinyPtrVector. The purpose of the custom linked list was to optimize for the case of a single-element list. It turns out that TinyPtrVector handles the same basic scenario even better, reducing the size of LeaderTableEntry by 33%, and requiring only log2(N) allocations as the size of the list grows. The only downside is that we have to store the Value's and BasicBlock's in separate vectors, which is slightly awkward in a few cases. Fortunately that ends up being entirely encapsulated inside helper functions. Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D125205	2022-05-25 23:52:44 -07:00
serge-sans-paille	fb67d683db	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `7030654296` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D126417	2022-05-26 08:12:34 +02:00
Snehasish Kumar	ec51971eae	[memprof] Keep and display symbol names in the RawMemProfReader. Extend the Frame struct to hold the symbol name if requested when a RawMemProfReader object is constructed. This change updates the tests and removes the need to pass --debug to obtain the mapping from GUID to symbol names. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D126344	2022-05-25 21:17:44 +00:00
Alexey Bataev	10f41a2147	[SLP]Fix PR55688: Miscompile due to incorrect nuw/nsw handling. Need to use all ReductionOps when propagating flags for the reduction ops, otherwise transformation is not correct. Plus, need to drop nuw/nsw flags. Differential Revision: https://reviews.llvm.org/D126371	2022-05-25 13:59:06 -07:00
Maksim Panchenko	bed9efed71	[MCDisassembler] Disambiguate Size parameter in tryAddingSymbolicOperand() MCSymbolizer::tryAddingSymbolicOperand() overloaded the Size parameter to specify either the instruction size or the operand size depending on the architecture. However, for proper symbolic disassembly on X86, we need to know both sizes, as an instruction can have two operands, and the instruction size cannot be reliably calculated based on the operand offset and its size. Hence, split Size into OpSize and InstSize. For X86, the new interface allows to fix a couple of issues: * Correctly adjust the value of PC-relative operands. * Set operand size to zero when the operand is specified implicitly. Differential Revision: https://reviews.llvm.org/D126101	2022-05-25 13:44:32 -07:00
Christian Sigg	c4bc416418	[LLVM] Add rcp.approx.ftz.f32 intrinsic Split out from https://reviews.llvm.org/D126158. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D126369	2022-05-25 21:05:20 +02:00
Aaron Ballman	69da3b6aea	Revert "[OpenMP] atomic compare fail : Parser & AST support" This reverts commit `232bf8189e`. It broke the sanitize buildbot: https://lab.llvm.org/buildbot/#/builders/5/builds/24074 It also reproduces on Windows debug builds as a crash.	2022-05-25 13:34:34 -04:00
Zequan Wu	a648724921	Reland "[llvm-pdbutil] Add options to only dump symbol record at specified offset and its parents or children with spcified depth." This reverts commit `cfb4e78252`.	2022-05-25 09:57:35 -07:00
Takafumi Arakaki	18e6b8234a	Allow pointer types for atomicrmw xchg This adds support for pointer types for `atomic xchg` and let us write instructions such as `atomicrmw xchg i64** %0, i64* %1 seq_cst`. This is similar to the patch for allowing atomicrmw xchg on floating point types: https://reviews.llvm.org/D52416. Differential Revision: https://reviews.llvm.org/D124728	2022-05-25 16:20:26 +00:00
Anubhab Ghosh	9da89651a8	[llvm-objcopy][ObjectYAML][mips] Add MIPS specific ELF section indexes This fixes https://github.com/llvm/llvm-project/issues/53998 and displays correct information in obj2yaml for SHN_MIPS_* sections according to https://refspecs.linuxfoundation.org/elf/mipsabi.pdf Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D123902	2022-05-25 09:01:12 -07:00
Sunil Kuravinakop	ca27f3e3b2	[Clang][OpenMP] Support for omp nothing Patch to support "#pragma omp nothing" Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D123286	2022-05-24 23:59:31 -05:00
Sunil Kuravinakop	232bf8189e	[OpenMP] atomic compare fail : Parser & AST support This is a support for " #pragma omp atomic compare fail ". It has Parser & AST support for now. Reviewed By: tianshilei1992 Differential Revision: https://reviews.llvm.org/D123235	2022-05-24 23:56:42 -05:00
Chen Zheng	80c4910f3d	Revert "[MachineSink] replace MachineLoop with MachineCycle" This reverts commit `62a9b36fcf`. Cause build failure on lldb incremental buildbot: https://green.lab.llvm.org/green/view/LLDB/job/lldb-cmake/43994/changes	2022-05-24 22:43:37 -04:00
Chen Zheng	62a9b36fcf	[MachineSink] replace MachineLoop with MachineCycle MachineCycle can handle irreducible loop. Natural loop analysis (MachineLoop) can not return correct loop depth if the loop is irreducible loop. And MachineSink is sensitive to the loop depth, see MachineSinking::isProfitableToSinkTo(). This patch tries to use MachineCycle so that we can handle irreducible loop better. Reviewed By: sameerds, MatzeB Differential Revision: https://reviews.llvm.org/D123995	2022-05-24 01:16:19 -04:00
Shraiysh Vaishay	7604c59bd2	[OpenMP][IRBuilder] `omp task` support This patch adds basic support for `omp task` to the OpenMPIRBuilder. The outlined function after code extraction is called from a wrapper function with appropriate arguments. This wrapper function is passed to the runtime calls for task allocation. This approach is different from the Clang approach - clang directly emits the runtime call to the outlined function. The outlining utility (OutlineInfo) simply outlines the code and generates a function call to the outlined function. After the function has been generated by the outlining utility, there is no easy way to alter the function arguments without meddling with the outlining itself. Hence the wrapper function approach is taken. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D71989	2022-05-24 10:22:11 +05:30
Hyoun Kyu Cho	6c12ae8163	Exposes interface to free up caching data structure in DWARFDebugLine and DWARFUnit for memory management This is minimum changes extracted from https://reviews.llvm.org/D78950. The old patch tried to add LRU eviction of caching data structure. Due to multiple layers of interfaces that users could be using, it was not clear where to put the functionality. While we work out on where to put that functionality, it'll be great to add this minimum interface change so that the user could implement their own memory management. More specifically: * Add a clearLineTable method for DWARFDebugLine which erases the given offset from the LineTableMap. * DWARFDebugContext adds the clearLineTableForUnit method that leverages clearLineTable to remove the object corresponding to a given compile unit, for memory management purposes. When it is referred to again, the line table object will be repopulated. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D90006	2022-05-24 03:23:24 +00:00
Wolfgang Pieb	ae9489025f	[NFC][Metadata] Define move constructor and move assignment operator for MDOperand. This is a preparatory patch for the MDNode resize functionality. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D125994	2022-05-23 20:04:45 -07:00
Sam Clegg	74f9841977	[lld][WebAssembly] Allow use of statically allocated TLS region. It turns out we were already allocating static address space for TLS data along with the non-TLS static data, but this space was going unused/ignored. With this change, we include the TLS segment in `__wasm_init_memory` (which does the work of loading the passive segments into memory when a module is first loaded). We also set the `__tls_base` global to point to the start of this segment. This means that the runtime can use this static copy of the TLS data for the first/primary thread if it chooses, rather than doing a runtime allocation prior to calling `__wasm_init_tls`. Practically speaking, this will allow emscripten to avoid dynamic allocation of TLS region on the main thread. Differential Revision: https://reviews.llvm.org/D126107	2022-05-23 17:27:17 -07:00
Jamie Schmeiser	24239e246c	Add new hidden option -print-on-crash that prints out IR that caused opt pipeline to crash A new hidden option -print-on-crash that prints the IR as it was upon entering the last pass when there is a crash. The IR is saved in its print form before each pass is started and a signal handler is registered. If the compilation crashes, the signal handler will print the saved IR to dbgs(). This option can be modified using -print-module-scope to get the IR for the complete module. Note that this option only works with the new pass manager. Reviewed By: yrouban Differential Revision: https://reviews.llvm.org/D86657	2022-05-23 15:38:38 -07:00
Mitch Phillips	cead4eceb0	[symbolizer] Parse DW_TAG_variable DIs to show line info for globals Currently, llvm-symbolizer doesn't like to parse .debug_info in order to show the line info for global variables. addr2line does this. In the future, I'm looking to migrate AddressSanitizer off of internal metadata over to using debuginfo, and this is predicated on being able to get the line info for global variables. This patch adds the requisite support for getting the line info from the .debug_info section for symbolizing global variables. This only happens when you ask for a global variable to be symbolized as data. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D123538	2022-05-23 13:30:22 -07:00
Sanjay Patel	e8c20d995b	[IR] add and use pattern match specialization for sqrt intrinsic; NFC This was included in D126190 originally, but it's independent and a useful change for readability.	2022-05-23 14:16:30 -04:00
Jingu Kang	bb82f74612	Revert "Revert "[AArch64] Set maximum VF with shouldMaximizeVectorBandwidth"" This reverts commit `42ebfa8269`. The commmit from https://reviews.llvm.org/D125918 has fixed the stage 2 build failure. Differential Revision: https://reviews.llvm.org/D118979	2022-05-23 16:15:45 +01:00
Anastasia Stulova	72832efc94	[SPIR-V] Allow setting SPIR-V version via target triple. Currently added versions are from v1.0 to v1.5, other versions can be added as needed. This change also adds documentation about SPIR-V target support in LLVM. Differential Revision: https://reviews.llvm.org/D124776	2022-05-23 14:24:00 +01:00
Peter Waller	ade47bdc31	[LV] Improve register pressure estimate at high VFs Previously, `getRegUsageForType` was implemented using `getTypeLegalizationCost`. `getRegUsageForType` is used by the loop vectorizer to estimate the register pressure caused by using a vector type. However, `getTypeLegalizationCost` currently only appears to understand splitting and not scalarization, so significantly underestimates the register requirements. Instead, use `getNumRegisters`, which understands when scalarization can occur (via computeRegisterProperties). This was discovered while investigating D118979 (Set maximum VF with shouldMaximizeVectorBandwidth), where under fixed-length 512-bit SVE the loop vectorizer previously ends up costing an v128i1 as 2 v64i* registers where it actually occupies 128 i32 registers. I'm sending this patch early for comment, I'm still doing some sanity checking with LNT. I note that getRegisterClassForType appears to return VectorRC even though the type in question (large vNi1 types) end up occupying scalar registers. That might be worth fixing too. Differential Revision: https://reviews.llvm.org/D125918	2022-05-23 07:57:45 +00:00
Sergei Trofimovich	5e9be93566	[Support] Add missing <cstdint> header to Base64.h Without the change llvm build fails on this week's gcc-13 snapshot as: [ 91%] Building CXX object unittests/Support/CMakeFiles/SupportTests.dir/Base64Test.cpp.o In file included from llvm/unittests/Support/Base64Test.cpp:14: llvm/include/llvm/Support/Base64.h: In function 'std::string llvm::encodeBase64(const InputBytes&)': llvm/include/llvm/Support/Base64.h:29:5: error: 'uint32_t' was not declared in this scope 29 \| uint32_t x = ((unsigned char)Bytes[i] << 16) \| \| ^~~~~~~~	2022-05-23 08:48:14 +01:00
Sergei Trofimovich	ff1681ddb3	[Support] Add missing <cstdint> header to Signals.h Without the change llvm build fails on this week's gcc-13 snapshot as: [ 0%] Building CXX object lib/Support/CMakeFiles/LLVMSupport.dir/Signals.cpp.o In file included from llvm/lib/Support/Signals.cpp:14: llvm/include/llvm/Support/Signals.h:119:8: error: variable or field 'CleanupOnSignal' declared void 119 \| void CleanupOnSignal(uintptr_t Context); \| ^~~~~~~~~~~~~~~	2022-05-23 08:48:14 +01:00
NAKAMURA Takumi	cd5f3241c3	ADT::GenericCycleInfo: Hide validateTree() in -Asserts. validateTree() is instantiated with __FILE__. It will be pruned at link time due to -ffunction-sections but be left in object files. Its user is only GenericCycleInfo::compute() with assert(validateTree()); Therefore I think validateTree() may be hidden with NDEBUG. This is a fixup for https://reviews.llvm.org/D112696	2022-05-23 01:15:02 +09:00
Paul Walker	258dac43d6	[SVE] Enable use of 32bit gather/scatter indices for fixed length vectors Differential Revision: https://reviews.llvm.org/D125193	2022-05-22 12:32:30 +01:00
Lang Hames	55e8f721d4	[ORC] Allow FailedToMaterialize errors to outlive ExecutionSessions. Idiomatic llvm::Error usage can result in a FailedToMaterialize error tearing down an ExecutionSession instance. Since the FailedToMaterialize error holds SymbolStringPtrs and JITDylib references this leads to crashes when accessing or logging the error. This patch modifies FailedToMaterialize to retain the SymbolStringPool and JITDylibs involved in the failure so that we can safely report an error message to the client, even if the error tears down the session. The contract for JITDylibs allows the getName method to be used even after the session has been torn down, but no other JITDylib fields should be accessed via the FailedToMaterialize error if the ssesion has been torn down. Logging the error is guaranteed to be safe in all cases.	2022-05-21 13:51:02 -07:00
Lang Hames	f3428dafdc	[ORC] Add a ~ExectionSession destructor to verify that endSession was called. Clients are required to call ExecutionSession::endSession before destroying the ExecutionSession. Failure to do so can lead to memory leaks and other difficult to debug issues. Enforcing this requirement by assertion makes it easy to spot or debug situations where the contract was not followed.	2022-05-21 09:02:01 -07:00
Benjamin Kramer	c312f02594	[STLExtras] Make indexed_accessor_range operator== compatible with C++20 This would be ambigious with itself when C++20 tries to lookup the reversed form. I didn't find a use in LLVM, but MLIR does a lot of comparisons of ranges of different types.	2022-05-21 13:00:30 +02:00
Nikita Popov	6f0ca6fd23	[JumpThreading] Insert freeze when unfolding select JumpThreading may convert selects into branch instructions, in which case the condition needs to be frozen (as branch on poison is immediate undefined behavior, unlike select on poison). The necessary code for this is already in place, this just enables the option. Differential Revision: https://reviews.llvm.org/D125869	2022-05-21 11:24:27 +02:00
Ahmed Bougacha	362b4066f0	[ObjCARC] Drop nullary clang.arc.attachedcall bundles in autoupgrade. In certain use-cases, these can be emitted by old compilers, but the operand is now always required. These are only used for optimizations, so it's safe to drop them if they happen to have the now-invalid format. The semantically-required call is already a separate instruction. Differential Revision: https://reviews.llvm.org/D123811	2022-05-20 15:27:29 -07:00
Shilei Tian	ff60a0a364	[LLVM] Add a check if should cast atomic operations to integer type Currently for atomic load, store, and rmw instructions, as long as the operand is floating-point value, they are casted to integer. Nowadays many targets can actually support part of atomic operations with floating-point operands. For example, NVPTX supports atomic load and store of floating-point values. This patch adds a series interface functions `shouldCastAtomicXXXInIR`, and the default implementations are same as what we currently do. Later for targets can have their specialization. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D125652	2022-05-20 17:23:53 -04:00
Jay Foad	9ece051847	[AMDGPU] Mark s_get_waveid_in_workgroup as not reading memory It is already marked as having side effects, at least in MIR. It does not interact with anything else that is modelled as a memory access either in IR or MachineIR. Differential Revision: https://reviews.llvm.org/D125985	2022-05-19 21:25:46 +01:00
Jay Foad	86b55edab6	[AMDGPU] Mark s_getreg as having side effects instead of reading memory s_getreg does not interact with anything else that is modelled as a memory access either in IR or MachineIR. Differential Revision: https://reviews.llvm.org/D125968	2022-05-19 21:25:46 +01:00
Jennifer Yu	7aa9c39381	[Clang][[OpenMP5.1] Initial parser/sema for default(private) clause This implements the default(private) clause as defined in OMP5.1 Differential Revision: https://reviews.llvm.org/D125912	2022-05-19 12:43:13 -07:00
Keith Smiley	066243057f	[Object] Fix updating darwin archives When creating an archive, llvm-ar looks at the host to determine the archive format to use, on Apple platforms this means it uses the K_DARWIN format. K_DARWIN is _virtually_ equivalent to K_BSD, expect for some very slight differences around padding, timestamps in deterministic mode, and 64 bit formats. When updating an archive using llvm-ar, or llvm-objcopy, Archive would try to determine the kind, but it was not possible to get K_DARWIN in the initialization of the archive, because they're virtually inciting usable from K_BSD, especially since the slight differences only apply in very specific cases. This leads to linker failures when the alignment workaround is not applied to an archive copied with llvm-objcopy. This change teaches Archive to infer the K_DARWIN type in the cases where it's possible and the first object in the archive is a macho object. This avoids using the host triple to determine this to not affect cross compiling. Ideally we would eliminate the separate K_DARWIN type entirely since it's not a truly separate archive type, but then we'd have to force the macho workarounds on the BSD format generally. This might be acceptable but then it would be unclear how to handle this case without forcing the K_DARWIN64 format on all BSD users: ``` if (LastOffset >= Sym64Threshold) { if (Kind == object::Archive::K_DARWIN) Kind = object::Archive::K_DARWIN64; else Kind = object::Archive::K_GNU64; } ``` The logic used to determine if the object is macho is derived from the logic llvm-ar uses. Previous context: - `111cd669e9` - `23a76be5ad` Differential Revision: https://reviews.llvm.org/D124895	2022-05-19 10:56:26 -07:00
Lang Hames	4bb18a89c4	[ORC] Add missing std::moves, pass SymbolLookupSet by value. Avoids some unnecessary SymbolStringPtr copies.	2022-05-19 10:51:20 -07:00
Paul Walker	d640442518	[NFC] Fix a couple of whitespace issues.	2022-05-19 17:27:09 +00:00
Sotiris Apostolakis	ca7c307d18	[SelectOpti][1/5] Setup new select-optimize pass This is the first commit for the cmov-vs-branch optimization pass. The goal is to develop a new profile-guided and target-independent cost/benefit analysis for selecting conditional moves over branches when optimizing for performance. Initially, this new pass is expected to be enabled only for instrumentation-based PGO. RFC: https://discourse.llvm.org/t/rfc-cmov-vs-branch-optimization/6040 Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D120230	2022-05-19 16:31:10 +00:00
Joseph Huber	dbffa4073c	[NVVM] Update intrinsic defintions to include the `nocallback` attribute This patch adds the `nocallback` attribute to the NVVM intrinsics that did not use the `DefaultAttrsIntrinsic` method that includes it already. The `nocallback` attribute states that the intrinsic function cannot enter back into the caller's translation-unit. This allows as to determine that a function calling a `nocallback` function can have the `norecurse` attribute. This should be safe for all the NVVM intrinsics because they do not call other functions within the translation unit. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D125937	2022-05-19 12:30:35 -04:00
Amy Kwan	c35ca3a1c7	[PowerPC] Implement XL compat __fnabs and __fnabss builtins. This patch implements the following floating point negative absolute value builtins that required for compatibility with the XL compiler: ``` double __fnabs(double); float __fnabss(float); ``` These builtins will emit : - fnabs on PWR6 and below, or if VSX is disabled. - xsnabsdp on PWR7 and above, if VSX is enabled. Differential Revision: https://reviews.llvm.org/D125506	2022-05-19 11:28:40 -05:00
Jay Foad	4e432f1b7c	[APInt] Deprecate truncOrSelf, zextOrSelf and sextOrSelf Differential Revision: https://reviews.llvm.org/D125558	2022-05-19 11:23:13 +01:00
Michael Kruse	797fabaab2	[Analysis] Avoid virtual dtor. NFC. Replace virtual destructor by a protected non-virtual one. Additionally also making derived structs as virtual avoids the warning from reappearing. Also see the mailing list discussion: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20220516/1038290.html Reviewed By: dblaikie, YangKeao Differential Revision: https://reviews.llvm.org/D125830	2022-05-18 17:41:17 -05:00
Yusra Syeda	1e14b1a797	[SystemZ][z/OS] Add missing include to llvm/include/llvm/BinaryFormat/GOFF.h Differential Revision: https://reviews.llvm.org/D125921	2022-05-18 15:57:05 -04:00
Michael Kitzan	29bebb0237	[GISel] Add new combines for G_FMINNUM/MAXNUM and G_FMINIMUM/MAXIMUM I noticed https://reviews.llvm.org/D87415 added SDAG combines to fold FMIN/MAX instrs with NaNs. The patch implements the same NaN combines for GISel GMIR FMIN/MAX opcodes: G_FMINNUM(X, NaN) -> X G_FMAXNUM(X, NaN) -> X G_FMINIMUM(X, NaN) -> NaN G_FMAXIMUM(X, NaN) -> NaN The patch adds AArch64 tests for these combines as well. Reviewed by: arsenm Differential revision: https://reviews.llvm.org/D125819	2022-05-18 12:08:53 -07:00
Martin Storsjö	d4257fbbba	[llvm-readobj] Improve printing of Windows ARM packed unwind info Fix a couple minor details in the existing logic for calculating saved registers and stack adjustment. Synthesize the corresponding prologues and epilogues and print them. (This supersedes the previous printout of one single list of stored registers; as there's lots of minor nuance differences in how registers are pushed/popped in various corner cases, it's better to print the full prologue/epilogue instead of trying to condense it into one single list.) Print the raw values of the fields Reg, R, L (LinkRegister) and C (Chaining) instead of only printing the derived values. Differential Revision: https://reviews.llvm.org/D125644	2022-05-18 21:33:08 +03:00
Yusra Syeda	5ac411aea8	[SystemZ][z/OS] Add the PPA1 to SystemZAsmPrinter Differential Revision: https://reviews.llvm.org/D125725	2022-05-18 14:13:17 -04:00
Nikita Popov	e1d47d86d8	[IR] Report whether replaceUsesOfWith() changed something (NFC) With change reporting in transformation passes in mind.	2022-05-18 11:46:28 +02:00
Nikita Popov	e9a1c82d69	[SCEVExpander] Expand umin_seq using freeze %x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y). This patch changes the expansion to umin(%x, freeze %y) instead (https://alive2.llvm.org/ce/z/wujUhp). The motivation for this change are the test cases affected by D124910, where the freeze expansion ultimately produces better optimization results. This is largely because `(%x umin_seq %y) == %x` is a common expansion pattern, which reliably optimizes in freeze representation, but only sometimes with the zero comparison (in particular, if %x == 0 can fold to something else, we generally won't be able to cover reasonable code from this.) Differential Revision: https://reviews.llvm.org/D125372	2022-05-18 09:53:07 +02:00
Stanislav Mekhanoshin	dee3190293	[AMDGPU] Add llvm.amdgcn.global.load.lds intrinsic Differential Revision: https://reviews.llvm.org/D125279	2022-05-17 12:35:27 -07:00
Stanislav Mekhanoshin	791ec1c68e	[AMDGPU] Add intrinsics llvm.amdgcn.{raw\|struct}.buffer.load.lds Differential Revision: https://reviews.llvm.org/D124884	2022-05-17 10:32:13 -07:00
Ruobing Han	e1cf702a02	fix typo error in DivergenceAnalysis.h Fix a typo error in the comment in DivergenceAnalysis.h Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D125808	2022-05-17 17:20:17 +00:00
Michael Kruse	bd93df937a	[Polly] Mark classes as final by default. NFC. This make is obivious that a class was not intended to be derived from. NPM analysis pass can unfortunately not marked as final because they are derived from a llvm::Checker<T> template internally by the NPM. Also normalize the use of classes/structs * NPM passes are structs * Legacy passes are classes * structs that have methods and are not a visitor pattern are classes * structs have public inheritance by default, remove "public" keyword * Use typedef'ed type instead of inline forward declaration	2022-05-17 12:05:39 -05:00
Nikita Popov	5df22e507b	[IRBuilder] Move insertvalue/extractvalue to fold infrastructure Move from the old CreateXYZ() to the new FoldXYZ() mechanism. This change is likely NFC in practice, because I don't think that the places using InstSimplifyFolder use insertvalue/extractvalue.	2022-05-17 16:04:55 +02:00
Jay Foad	77480556c4	[RegAllocGreedy] New hook regClassPriorityTrumpsGlobalness Add a new TargetRegisterInfo hook to allow targets to tweak the priority of live ranges, so that AllocationPriority of the register class will be treated as more important than whether the range is local to a basic block or global. This is determined per-MachineFunction. Differential Revision: https://reviews.llvm.org/D125102	2022-05-17 12:35:21 +01:00
Alexey Lapshin	4d9c083437	[DWARFLinker][NFC] Add None value to the DwarfLinkerAccelTableKind enum. this review is extracted from D86539. 1. Rename AccelTableKind to DwarfLinkerAccelTableKind (to differentiate from AccelTableKind from CodeGen/AsmPrinter/DwarfDebug.h) 2. Add None value to the DwarfLinkerAccelTableKind. 3. added 'None' value for 'accelerator' option of dsymutil. Differential Revision: https://reviews.llvm.org/D125474	2022-05-17 12:32:32 +03:00
Nikita Popov	a694546f7c	[KnownBits] Add operator== Checking whether two KnownBits are the same is somewhat common, mainly in test code. I don't think there is a lot of room for confusion with "determine what the KnownBits for an icmp eq would be", as that has a different result type (this is what the eq() method implements, which returns Optional<bool>). Differential Revision: https://reviews.llvm.org/D125692	2022-05-17 09:38:13 +02:00
luxufan	63c81b23be	[RISCV] Support getHostCpuName for sifive-u74 Reviewed By: kito-cheng Differential Revision: https://reviews.llvm.org/D123978	2022-05-17 14:06:59 +08:00
Yang Keao	7dce9eb6e5	[DomPrinter] Migrate -dot-dom to the new pass manager. In D123677, @YangKeao provided an implementation of `DOTGraphTraits{Viewer,Printer}` in the new pass manager. This commit migrates the `DomPrinter` and `DomViewer` to the new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D124904	2022-05-16 15:07:16 -05:00
Paul Walker	7dd05ba9ed	[SelectionDAG] Remove duplicate "is scaled" information from gather/scatter SDNodes. During early gather/scatter enablement two different approaches were taken to represent scaled indices: * A Scale operand whereby byte_offsets = Index * Scale * An IndexType whereby byte_offsets = Index * sizeof(MemVT.ElementType) Having multiple representations is bad as shown by this patch which fixes instances where the two are out of sync. The dedicated scale operand is more flexible and pervasive so this patch removes the UNSCALED values from IndexType. This means all indices are scaled but the scale can be one, hence unscaled. SDNodes now use the scale operand to answer the "isScaledIndex" question. I toyed with the idea of keeping the UNSCALED enums and helper functions but because they will have no uses and force SDNodes to validate the set of supported values I figured it's best to remove them. We can re-add them if there's a real need. For similar reasons I've kept the IndexType enum when a bool could be used as I think being explicitly looks better. Depends On D123347 Differential Revision: https://reviews.llvm.org/D123381	2022-05-16 20:47:52 +01:00
zhijian	52c615553c	[AIX] fixed llvm-ar can not read empty big archive correctly. Summary: llvm-ar can not read empty big archive correctly. it output error as error: unable to load 'empty.a': truncated or malformed archive (characters in size field in archive member header are not all decimal numbers: '<bigaf>' Reviewers: James Henderson Differential Revision: https://reviews.llvm.org/D124017	2022-05-16 14:29:37 -04:00
Rahman Lavaee	5f7ef65245	[llvm-objdump] Let --symbolize-operands symbolize basic block addresses based on the SHT_LLVM_BB_ADDR_MAP section. `--symbolize-operands` already symbolizes branch targets based on the disassembly. When the object file is created with `-fbasic-block-sections=labels` (ELF-only) it will include a SHT_LLVM_BB_ADDR_MAP section which maps basic blocks to their addresses. In such case `llvm-objdump` can annotate the disassembly based on labels inferred on this section. In contrast to the current labels, SHT_LLVM_BB_ADDR_MAP-based labels are created for every machine basic block including empty blocks and those which are not branched into (fallthrough blocks). The old logic is still executed even when the SHT_LLVM_BB_ADDR_MAP section is present to handle functions which have not been received an entry in this section. Reviewed By: jhenderson, MaskRay Differential Revision: https://reviews.llvm.org/D124560	2022-05-16 10:11:11 -07:00
Adrian Prantl	9c7c8be4a3	Remove stale file from modulemap	2022-05-16 10:05:38 -07:00
Hongtao Yu	acfd0a3456	[llvm-profgen] Update callsite body samples by summing up all call target samples. Current profile generation caculcates callsite body samples and call target samples separately. The former is done based on LBR range samples while the latter is done based on branch samples. Note that there's a subtle difference. LBR ranges is formed from two consecutive branch samples. Therefore the last entry in a LBR record will not be counted towards body samples while there's still a chance for it to be counted towards call targets if it is a function call. I'm making sense of the call body samples by updating it to the aggregation of call targets. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D122609	2022-05-16 09:13:37 -07:00
Ellis Hoag	6e23cd2bf0	[InstrProf][NFC] Save profile bias to function map Add a map from functions to load instructions that compute the profile bias. Previously we assumed that if the first instruction in the function was a load instruction, then it must be computing the bias. This was likely to work out because functions usually start with the `llvm.instrprof.increment` instruction, but optimizations could change this. For example, inlining into a non-profiled function. Reviewed By: phosek Differential Revision: https://reviews.llvm.org/D114319	2022-05-16 08:32:31 -07:00
Sanjay Patel	be7f09f7b2	[IR] create and use helper functions that test the signbit; NFCI	2022-05-16 11:26:23 -04:00
Philip Reames	55e2df7285	[LiveIntervals] Add range accessors for value numbers [nfc]	2022-05-16 08:23:12 -07:00
Florian Hahn	b7315ffc3c	[LAA,LV] Add initial support for pointer-diff memory checks. This patch adds initial support for a pointer diff based runtime check scheme for vectorization. This scheme requires fewer computations and checks than the existing full overlap checking, if it is applicable. The main idea is to only check if source and sink of a dependency are far enough apart so the accesses won't overlap in the vector loop. To do so, it is sufficient to compute the difference and compare it to the `VF * UF * AccessSize`. It is sufficient to check `(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards dependence in the vector loop with the given VF and UF. If Src >=u Sink, there is not dependence preventing vectorization, hence the overflow should not matter and using the ULT should be sufficient. Note that the initial version is restricted in multiple ways: 1. Pointers must only either be read or written, by a single instruction (this allows re-constructing source/sink for dependences with the available information) 2. Source and sink pointers must be add-recs, with matching steps 3. The step must be a constant. 3. abs(step) == AccessSize. Most of those restrictions can be relaxed in the future. See https://github.com/llvm/llvm-project/issues/53590. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D119078	2022-05-16 15:27:22 +01:00
Nikita Popov	8ab819ad90	[ConstantRange] Add toKnownBits() method Add toKnownBits() method to mirror fromKnownBits(). We know the top bits that are constant between min and max. The return value for an empty range is chosen to be conservative.	2022-05-16 16:12:25 +02:00
Liqin.Weng	ff3f4988ed	[CodeGen] Use ArrayRef in TargetLowering functions Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123656	2022-05-16 13:30:58 +00:00
Sheng	aab5bd180a	[ADT] Adopt the new casting infrastructure for PointerUnion Reviewed By: lattner, bzcheeseman Differential Revision: https://reviews.llvm.org/D125609	2022-05-16 18:40:05 +08:00
Abinav Puthan Purayil	485dd0b752	[GlobalISel] Handle constant splat in funnel shift combine This change adds the constant splat versions of m_ICst() (by using getBuildVectorConstantSplat()) and uses it in matchOrShiftToFunnelShift(). The getBuildVectorConstantSplat() name is shortened to getIConstantSplatVal() so that the *SExtVal() version would have a more compact name. Differential Revision: https://reviews.llvm.org/D125516	2022-05-16 16:03:30 +05:30
Nicolas Abram Lujan	436bbce765	[llvm-c] Add functions for enabling and creating opaque pointers This is based on https://reviews.llvm.org/D125168 which adds a wrapper to allow use of opaque pointers from the C API. I added an opaque pointer mode test to echo.ll, and to fix assertions that forbid the use of mixed typed and opaque pointers that were triggering in it I had to also add wrappers for setOpaquePointers() and isOpaquePointer(). I also changed echo.ll to remove a bitcast i32* %x to i8*, because passing it through llvm-as and llvm-dis was generating a %0 = bitcast ptr %x to ptr, but when building that same bitcast in echo.cpp it was getting elided by IRBuilderBase::CreateCast (`08ac661248/llvm/include/llvm/IR/IRBuilder.h (L1998-L1999)`). Differential Revision: https://reviews.llvm.org/D125183	2022-05-16 10:53:46 +02:00
stk	9902a0945d	Add ThreadPriority::Low, and use QoS class Utility on Mac On Apple Silicon Macs, using a Darwin thread priority of PRIO_DARWIN_BG seems to map directly to the QoS class Background. With this priority, the thread is confined to efficiency cores only, which makes background indexing take forever. Introduce a new ThreadPriority "Low" that sits in the middle between Background and Default, and maps to QoS class "Utility" on Mac. Make this new priority the default for indexing. This makes the thread run on all cores, but still lowers priority enough to keep the machine responsive, and not interfere with user-initiated actions. I didn't change the implementations for Windows and Linux; on these systems, both ThreadPriority::Background and ThreadPriority::Low map to the same thread priority. This could be changed as a followup (e.g. by using SCHED_BATCH for Low on Linux). See also https://github.com/clangd/clangd/issues/1119. Reviewed By: sammccall, dgoldman Differential Revision: https://reviews.llvm.org/D124715	2022-05-16 10:01:49 +02:00
bzcheeseman	0809f63826	[LLVM][Casting.h] Add trivial self-cast Casting from a type to itself should always be possible. Make this simple for all users, and add tests to ensure we keep being able to do this. Ref: https://reviews.llvm.org/D125543 Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D125590	2022-05-15 22:22:16 -07:00
Alexey Lapshin	fdae8641ad	[DWARFLinker][NFC] cleanup AddressManager interface. this review is extracted from D86539 1. delete areRelocationsResolved() method. 2. rename hasLiveMemoryLocation() -> isLiveVariable() hasLiveAddressRange() -> isLiveSubprogram(). Differential Revision: https://reviews.llvm.org/D125492	2022-05-15 22:47:04 +03:00
Sheng	c644488a8b	Rename `MCFixedLenDisassembler.h` as `MCDecoderOps.h` The name `MCFixedLenDisassembler.h` is out of date after D120958. Rename it as `MCDecoderOps.h` to reflect the change. Reviewed By: myhsu Differential Revision: https://reviews.llvm.org/D124987	2022-05-15 08:44:58 +08:00
Alex Brachet	a74d9e74e5	[ifs] Add --strip-size flag st_size may not be of importance to the abi if you are not using copy relocations. This is helpful when you want to check the abi of a shared object both when instrumented and not because asan will increase the size of objects to include the redzone. Differential revision: https://reviews.llvm.org/D124792	2022-05-14 18:50:20 +00:00
Alex Brachet	1f61260847	Revert "[ifs] Add --strip-size flag" This reverts commit `b6b0fd6a94`.	2022-05-14 17:33:27 +00:00
Alex Brachet	b6b0fd6a94	[ifs] Add --strip-size flag st_size may not be of importance to the abi if you are not using copy relocations. This is helpful when you want to check the abi of a shared object both when instrumented and not because asan will increase the size of objects to include the redzone. Differential revision: https://reviews.llvm.org/D124792	2022-05-14 17:25:50 +00:00
Xiaodong Liu	7ff7001ba9	[llvm] Fix comment nits in Module class, NFC. There is no member called "GlobalValRefMap" in Module class. It has been changed to "GlobalList". Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D125187	2022-05-14 17:41:37 +08:00
Jay Foad	169ae6db69	[APInt] Allow extending and truncating to the same width Allow zext, sext, trunc, truncUSat and truncSSat to extend or truncate to the same bit width, which is a no-op. Disallowing this forced clients to use workarounds like using zextOrTrunc (even though they never wanted truncation) or zextOrSelf (even though they did not want its strange behaviour of allowing a smaller bit width, which is also treated as a no-op). Differential Revision: https://reviews.llvm.org/D125556	2022-05-14 09:54:24 +01:00
Wolfgang Pieb	2740c1875d	[NFC][Metadata] Refactor allocation, initalization and deletion of MDNodes. This patch is refactoring the allocation, initialization and deletion of MDNodes. It is intended as a preparatory patch for the upcoming addition of dynamic resizability of MDNodes. It is fundamentally NFC, but removes the necessity for suppressing the memory sanitizer for MDNode's operator delete. Reviewers: dexonsmith Differential Revision: https://reviews.llvm.org/D125489	2022-05-13 16:05:29 -07:00
bzcheeseman	c758708018	[LLVM][Casting.h] Add ForwardToPointerCast trait Addresses use cases in Clang/MLIR that need pointer-to-pointer, reference-to-reference, and value-to-value casts from/to the same types. This should reduce boilerplate by allowing the user to simply specify the pointer cast and forward the reference cast directly to the pointer cast. This cast trait DOES NOT implement `castFailed` and `doCastIfPossible` because in the general case doing so could result in a nullptr dereference. Users can use `NullableValueCastFailed` and `DefaultDoCastIfPossible` as desired for those cases where `nullptr` is acceptable. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D125576	2022-05-13 18:48:50 -04:00
bzcheeseman	bc65fc8bb3	[LLVM][Casting.h] Remove CastInfo pointer partial specialization. Since cast_convert_val now has pointer specializations, we don't need the pointer partial specialization for CastInfo. We want to trim these down when possible to avoid future ambiguous partial specialization errors. Reviewed By: rriddle Differential Revision: https://reviews.llvm.org/D125578	2022-05-13 18:31:10 -04:00
Alexander Shaposhnikov	badd088c57	[GlobalOpt] Enable optimization of constructors with different priorities Adjust `optimizeGlobalCtorsList` to handle the case of different priorities. This addresses the issue https://github.com/llvm/llvm-project/issues/55083. Test plan: ninja check-all Differential revision: https://reviews.llvm.org/D125278	2022-05-13 22:19:29 +00:00
Amara Emerson	41fef10449	[GlobalISel] Combine G_SHL, G_ASHR, G_SHL of undef shifts to undef. Differential Revision: https://reviews.llvm.org/D125041	2022-05-13 12:20:34 -07:00
Hongtao Yu	1662cfa4be	[CSSPGO][CSProfileConverter] Remove call target samples when including callee samples into caller. When a flat CS profile is converted to a nested profile, the call target samples for inlined callee contexts are left over in the callsite target map. This could cause indirect call promotion to function improperly. One issue is that the inlined callsites are treated with double amount of samples. The other is the inlined callsites are reconsidered for subsequent PGO ICP. I'm fixing this by excluding call targets from the callsite for inlined targets. While fixing this I found that callsite target sum and the number of body samples for that callsite could be mismatched. {D122609} has an explanation and a fix for that on llvm-profgen side. For now I'm tolerating it in this change. Reviewed By: wenlei Differential Revision: https://reviews.llvm.org/D125266	2022-05-13 09:19:32 -07:00
zhijian	fe3b621f05	[AIX] support write operation of big archive. SUMMARY 1. Enable supporting the write operation of big archive. 2. the first commit come from https://reviews.llvm.org/D104367 3. refactor the first commit and implement writing symbol table. 4. fixed the bugs and add more test cases in the second commit. Reviewers: James Henderson Differential Revision: https://reviews.llvm.org/D123949	2022-05-13 10:40:15 -04:00
Jay Foad	fcbf617dcc	[APInt] Fix documentation of *OrSelf methods Document that truncOrSelf, zextOrSelf and sextOrSelf only enforce an upper or lower bound on the bitwidth of the result.	2022-05-13 15:26:41 +01:00
Jonas Paulsson	eaa78035c6	[SystemZ] Patchset for expanding memcpy/memset using at most two stores. * Set MaxStoresPerMemcpy and MaxStoresPerMemset to 2. * Optimize stores of replicated values in SystemZ::combineSTORE(). This handles the now expanded memory operations and as well some other pre-existing cases. * Reject a big displacement in isLegalAddressingMode() for a vector type. * Return true from shouldConsiderGEPOffsetSplit(). Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D122105	2022-05-13 15:31:09 +02:00
Nikita Popov	ed1cb01baf	[IRBuilder] Add IsInBounds parameter to CreateGEP() We commonly want to create either an inbounds or non-inbounds GEP based on a boolean value, e.g. when preserving inbounds from existing GEPs. Directly accept such a boolean in the API, rather than requiring a ternary between CreateGEP and CreateInBoundsGEP. This change is not entirely NFC, because we now preserve an inbounds flag in a constant expression edge-case in InstCombine.	2022-05-13 14:30:55 +02:00
Nathan Sidwell	562ce15924	[demangler] Avoid special-subst code duplication We need to expand special substitutions in four different ways. This refactors to only have one conversion from enum to string, and derive the other 3 needs off that. The SpecialSubstitution node is derived from the ExpandedSpecialSubstitution. While this may seem unintuitive, it works out quite well, as SpecialSubstitution can then use the former's getBaseName and remove an unneeded 'basic_' prefix, for those substitutions that are instantiations (to known typedef). Similarly all those instantiations use the same set of template arguments (with 'basic_string', getting an additional 'allocator' arg). Expansion tests were added in D123134, and remain unchanged. Reviewed By: MaskRay, dblaikie Differential Revision: https://reviews.llvm.org/D125257	2022-05-13 04:35:29 -07:00
Nikita Popov	0485211dd0	[IRBuilder] Remove redundant createGEP() overloads (NFC) ArrayRef<Value > also accepts a single Value , there's no need to create separate overloads for this.	2022-05-13 12:43:21 +02:00
Zakk Chen	7dfc56c107	[RISCV] Add the passthru operand for RVV unmasked segment load IR intrinsics. The goal is support tail and mask policy in RVV builtins. We focus on IR part first. If the passthru operand is undef, we use tail agnostic, otherwise use tail undisturbed. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125323	2022-05-13 02:16:40 -07:00
Jay Foad	26e1ebd3ea	[GlobalISel] Change ConstantFoldVectorBinop to return vector of APInt Previously it built MIR for the results and returned a Register. This avoids building constants for earlier elements of the vector if later elements will fail to fold, and allows CSEMIRBuilder::buildInstr to avoid unconditionally building a copy from the result. Use a new helper function MachineIRBuilder::buildBuildVectorConstant to build a G_BUILD_VECTOR of G_CONSTANTs. Differential Revision: https://reviews.llvm.org/D117758	2022-05-13 09:33:07 +01:00
Alex Bradbury	cb778e9328	[WebAssembly] Implement ref.is_null MC layer support and codegen Custom type-checking (in WebAssemblyAsmTypeCheck.cpp) is used to workaround the fact that separate variants of the instruction are defined for externref and funcref. Based on an initial patch by Paulo Matos <pmatos@igalia.com>. Differential Revision: https://reviews.llvm.org/D123484	2022-05-13 07:08:10 +01:00
bzcheeseman	0be41ed5bb	[LLVM][Casting.h] Don't create a temporary while casting. C-style casting can create a temporary when compiled by a C++ compiler, which was emitting a warning casting a reference to another reference. We can't use C++-style casting directly because it doesn't always work with incomplete types. In order to support the current use-cases, for references we switch to pointer space to perform the cast. Reviewed By: qiongsiwu1 Differential Revision: https://reviews.llvm.org/D125482	2022-05-12 23:11:02 -04:00
Chen Zheng	0ca2b93cc2	[NFC] add the missing //@} address code review comments for D123995	2022-05-12 22:43:35 -04:00
Florian Hahn	5890b30105	[LAA] Initial support for runtime checks with pointer selects. Scaffolding support for generating runtime checks for multiple SCEV expressions per pointer. The initial version just adds support for looking through a single pointer select. The more sophisticated logic for analyzing forks is in D108699 Reviewed By: huntergr Differential Revision: https://reviews.llvm.org/D114487	2022-05-12 19:33:48 +01:00
Pedro Olsen Ferreira	28a0b94d22	Rename and fix ValueMap::resize to reserve The underlying map type (DenseMap) has had its resize() function renamed to reserve() as part of `c04fc7a60f` (SVN 264026). This is only visible when the member function is called, as it is template type name dependent. Differential Revision: https://reviews.llvm.org/D125387	2022-05-12 13:44:47 +01:00
Dmitry Vassiliev	7b22cf12ef	[Intrinsics] Fix `nvvm_prmt` intrinsic attributes `nvvm_prmt` doesn't seem to be `commutative`. nvvm also sets `IntrSpeculatable` for it. Here is the doc https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#data-movement-and-conversion-instructions-prmt Reviewed By: tra, jchlanda Differential Revision: https://reviews.llvm.org/D125423	2022-05-12 10:46:03 +02:00
bzcheeseman	f156b51aec	[LLVM][Casting.h] Update dyn_cast machinery to provide more control over how the casting is performed. This patch expands the expressive capability of the casting utilities in LLVM by introducing several levels of configurability. By creating modular CastInfo classes we can enable projects like MLIR that need more fine-grained control over how a cast is actually performed to retain that control, while making it easy to express the easy cases (like a checked pointer to pointer cast). The current implementation of Casting.h doesn't make it clear where the entry points for customizing the cast behavior are, so part of the motivation for this patch is adding that documentation. Another part of the motivation is to support using LLVM RTTI with a wider set of use cases, such as nullable value to value casts, or pointer to value casts (as in MLIR). Reviewed By: lattner, rriddle Differential Revision: https://reviews.llvm.org/D123901	2022-05-12 00:15:09 -04:00
Austin Kerbow	2db700215a	[AMDGPU] Add llvm.amdgcn.sched.barrier intrinsic Adds an intrinsic/builtin that can be used to fine tune scheduler behavior. If there is a need to have highly optimized codegen and kernel developers have knowledge of inter-wave runtime behavior which is unknown to the compiler this builtin can be used to tune scheduling. This intrinsic creates a barrier between scheduling regions. The immediate parameter is a mask to determine the types of instructions that should be prevented from crossing the sched_barrier. In this initial patch, there are only two variations. A mask of 0 means that no instructions may be scheduled across the sched_barrier. A mask of 1 means that non-memory, non-side-effect inducing instructions may cross the sched_barrier. Note that this intrinsic is only meant to work with the scheduling passes. Any other transformations that may move code will not be impacted in the ways described above. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D124700	2022-05-11 13:22:51 -07:00
River Riddle	5a9a438a54	[TableGen] Refactor TableGenParseFile to no longer use a callback Now that TableGen no longer relies on global Record state, we can allow for the client to own the RecordKeeper and SourceMgr. Given that TableGen internally still relies on the global llvm::SrcMgr, this method unfortunately still isn't thread-safe. Differential Revision: https://reviews.llvm.org/D125277	2022-05-11 11:55:33 -07:00
River Riddle	2ac3cd20ca	[TableGen] Remove the use of global Record state This commits removes TableGens reliance on managed static global record state by moving the RecordContext into the RecordKeeper. The RecordKeeper is now treated similarly to a (LLVM\|MLIR\|etc)Context object and is passed to static construction functions. This is an important step forward in removing TableGens reliance on global state, and in a followup will allow for users that parse tablegen to parse multiple tablegen files without worrying about Record lifetime. Differential Revision: https://reviews.llvm.org/D125276	2022-05-11 11:55:33 -07:00
Joseph Huber	e7858a9fab	[Cuda] Add initial support for wrapping CUDA images in the new driver. This patch adds the initial support for wrapping CUDA images. This requires changing some of the logic for how we bundle images. We now need to copy the image for all kinds that are active for the architecture. Then we need to run a separate wrapping job if the Kind is Cuda. For cuda wrapping we need to use the `fatbinary` program from the CUDA SDK to bundle all the binaries together. This is then passed to a new function to perfom the actual module code generation that will be implemented in a later patch. Depends on D120273 D123471 Reviewed By: tra Differential Revision: https://reviews.llvm.org/D123810	2022-05-11 07:30:23 -04:00
Nikita Popov	c1bb4a881e	[SCEVExpander] Deduplicate min/max expansion code (NFC)	2022-05-11 12:11:11 +02:00
Fraser Cormack	c1d48b35d8	[SelectionDAG][VP] Rename VP sext/zext/trunc ISD opcodes Rather than VP_SEXT/VP_ZEXT/VP_TRUNC, having VP_SIGN_EXTEND/VP_ZERO_EXTEND/VP_TRUNCATE better matches their non-VP counterparts. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D125298	2022-05-11 10:25:51 +01:00
Arthur Eubanks	7e0802aeb5	[BasicAA] Fix order in which we pass MemoryLocations to alias() D98718 caused the order of Values/MemoryLocations we pass to alias() to be significant due to storing the offset in the PartialAlias case. But some callers weren't audited and were still passing swapped arguments, causing the returned PartialAlias offset to be negative in some cases. For example, the newly added unittests would return -1 instead of 1. Fixes #55343, a miscompile. Reviewed By: asbirlea, nikic Differential Revision: https://reviews.llvm.org/D125328	2022-05-10 12:05:38 -07:00
Matthias Braun	cd19af74c0	Avoid 8 and 16bit switch conditions on x86 This adds a `TargetLoweringBase::getSwitchConditionType` callback to give targets a chance to control the type used in `CodeGenPrepare::optimizeSwitchInst`. Implement callback for X86 to avoid i8 and i16 types where possible as they often incur extra zero-extensions. This is NFC for non-X86 targets. Differential Revision: https://reviews.llvm.org/D124894	2022-05-10 10:00:10 -07:00
Nicolai Hähnle	38bb46523f	GlobalISel: Trivial documentation and comment fixes Differential Revision: https://reviews.llvm.org/D124808	2022-05-10 07:48:56 -05:00
Craig Topper	39e63bd2d8	[IR][CostModel] A scalable vector shuffle can't be an identity or reverse shuffle. Even if the minimum number of elements is 1 and the length doesn't change, we don't know what vscale is so we can't classify it as identity mask. Instead it is a zero element splat. For reverse, we shouldn't classify it as a reverse unless there are at least 2 elements in the mask. This applies to both fixed and scalable vectors. For fixed vectors, a single element would be an identity shuffle. For scalable vector it's a zero elt splat. Reviewed By: sdesmalen, liaolucy Differential Revision: https://reviews.llvm.org/D124655	2022-05-09 21:37:25 -07:00
Mircea Trofin	c35ad9ee4f	[mlgo] Support exposing more features than those supported by models This allows the compiler to support more features than those supported by a model. The only requirement (development mode only) is that the new features must be appended at the end of the list of features requested from the model. The support is transparent to compiler code: for unsupported features, we provide a valid buffer to copy their values; it's just that this buffer is disconnected from the model, so insofar as the model is concerned (AOT or development mode), these features don't exist. The buffers are allocated at setup - meaning, at steady state, there is no extra allocation (maintaining the current invariant). These buffers has 2 roles: one, keep the compiler code simple. Second, allow logging their values in development mode. The latter allows retraining a model supporting the larger feature set starting from traces produced with the old model. For release mode (AOT-ed models), this decouples compiler evolution from model evolution, which we want in scenarios where the toolchain is frequently rebuilt and redeployed: we can first deploy the new features, and continue working with the older model, until a new model is made available, which can then be picked up the next time the compiler is built. Differential Revision: https://reviews.llvm.org/D124565	2022-05-09 18:01:21 -07:00
Michael Kruse	588ffdaf37	[polly] Fix compiler warning. NFC. Fix the warning warning: 'polly::ScopViewer' has virtual functions but non-virtual destructor [-Wnon-virtual-dtor] and for several other classes by inserting virtual destructors.	2022-05-09 14:04:40 -05:00
Michael Kruse	6b3b87376b	[polly] migrate -polly-show to the new pass manager Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123678	2022-05-09 14:04:29 -05:00
Michael Kruse	a6b399ad79	[PassManager] Implement DOTGraphTraitsViewer under NPM Rename the legacy `DOTGraphTraits{Module,}{Viewer,Printer}` to the corresponding `DOTGraphTraits...WrapperPass`, and implement a new `DOTGraphTraitsViewer` with new pass manager. Reviewed By: Meinersbur Differential Revision: https://reviews.llvm.org/D123677	2022-05-09 14:04:28 -05:00
Kazu Hirata	3f64f03289	[CodeGen] Clarify the semantics of ADDCARRY/SUBCARRY This patch clarifies the semantics of ADDCARRY/SUBCARRY, specifically stating that both the incoming and outgoing carries are active high. Differential Revision: https://reviews.llvm.org/D125130	2022-05-09 10:17:00 -07:00
Alexey Bataev	9dc4ced204	[SLP]Try partial store vectorization if supported by target. We can try to vectorize number of stores less than MinVecRegSize / scalar_value_size, if it is allowed by target. Gives an extra opportunity for the vectorization. Fixes PR54985. Differential Revision: https://reviews.llvm.org/D124284	2022-05-09 09:48:15 -07:00
Nathan Sidwell	bc150a07f1	[demangler] No need to space adjacent template closings With the demangler parenthesizing 'a >> b' inside template parameters, because C++11 parsing of >> there, we don't really need to add spaces between adjacent template arg closing '>' chars. In 2022, that just looks odd. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D123134	2022-05-09 06:14:44 -07:00
Nathan Sidwell	e48cd7088b	[demangler] Buffer peeking needs buffer The output buffer has a 'back' member, which returns NUL when you try it with an empty buffer. But there are no use cases that need that additional functionality. This makes the 'back' member behave more like STL containers' back members. (It still returns a value, not a reference.) Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D123201	2022-05-09 04:17:22 -07:00
Philipp Tomsich	91b24b0180	[AArch64] Ampere1 does not support MTE The initial support for the Ampere1 mistakenly signalled support for the MTE feature. However, the core does not include the optional MTE functionality. Update the target parser to not include MTE for Ampere1. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D125191	2022-05-09 11:29:42 +02:00
Sam McCall	d44ffd631c	[Bitstream] Only consider flushing to file on block boundaries The goal of flushing to disk is to keep a reasonable bound on peak memory usage. With a a default threshold of 512MB (and most BitstreamWriters having no backing file at all), checking after every byte whether to flush seems excessive. This change makes clangd's unittests run 5% faster (in opt), so it's not actually free even in the case with no backing file. Likely there are more important workloads where it makes some difference. Differential Revision: https://reviews.llvm.org/D125145	2022-05-07 16:57:03 +02:00
Amaury Séchet	c2c259224b	const char* for LLVMTargetMachineEmitToFile's argument The `LLVMTargetMachineEmitToFile` takes a `char* Filename` right now, but it doesn't modify it. This is annoying to use in the case where you want to pass a const string, because you either have to remove the const, or copy it somewhere else and pass that. Either way, it's not very nice. I added a const and clang formatted it. This shouldn't break any ABI in my opinion. I'm sorry but I didn't know whom to put as reviewer for this, so I chose someone with a lot of commits from the .cpp file. Reviewed By: deadalnix Differential Revision: https://reviews.llvm.org/D124453	2022-05-07 14:40:55 +00:00
Serge Pavlov	eb28da89a6	[InstCombine] Remove side effect of replaced constrained intrinsics If a constrained intrinsic call was replaced by some value, it was not removed in some cases. The dangling instruction resulted in useless instructions executed in runtime. It happened because constrained intrinsics usually have side effect, it is used to model the interaction with floating-point environment. In some cases side effect is actually absent or can be ignored. This change adds specific treatment of constrained intrinsics so that their side effect can be removed if it actually absents. Differential Revision: https://reviews.llvm.org/D118426	2022-05-07 19:04:11 +07:00
Paul Walker	702c4ade22	[ISD::IndexType] Helper functions for common queries. Add helper functions to query the signed and scaled properties of ISD::IndexType along with functions to change them. Remove setIndexType from MaskedGatherSDNode because it only has one usage and typically should only be changed alongside its index operand. Minimise the direct use of the enum values to lay the groundwork for more refactoring. Differential Revision: https://reviews.llvm.org/D123347	2022-05-07 11:23:42 +01:00
Sam McCall	0a83ff83af	[FuzzMutate] Move LLVM module (de)serialization from FuzzerCLI -> IRMutator. NFC These are not directly related to the CLI, and are mostly (always?) used when mutating the modules as part of fuzzing. Motivation: split FuzzerCLI into its own library that does not depend on IR. Subprojects that don't use IR should be be fuzzed without the dependency. Differential Revision: https://reviews.llvm.org/D125080	2022-05-07 12:09:49 +02:00
Peter S. Housel	981523b2e4	[ORC-RT][ORC] Handle dynamic unwind registration for libunwind This changes the ELFNix platform Orc runtime to use, when available, the __unw_add_dynamic_eh_frame_section interface provided by libunwind for registering .eh_frame sections loaded by JITLink. When libunwind is not being used for unwinding, the ELFNix platform detects this and defaults to the __register_frame interface provided by libgcc_s. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D114961	2022-05-06 14:00:29 -07:00
Craig Topper	2ca78d2bdf	[SelectionDAG] Improve asserts in SelectionDAG::getSelect. The VT passed in must match the type of LHS and RHS. Previously we only checked that the vectorness matched.	2022-05-06 09:41:11 -07:00
Marco Elver	9ae87b5973	[Instrumentation] Share InstrumentationIRBuilder between TSan and SanCov Factor our InstrumentationIRBuilder and share it between ThreadSanitizer and SanitizerCoverage. Simplify its usage at the same time (use function of passed Instruction or BasicBlock). This class may be used in other instrumentation passes in future. NFCI. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D125038	2022-05-06 09:15:17 +02:00
Sam McCall	ba0d50ad7e	[Support] Fix UB in BumpPtrAllocator when first allocation is zero. BumpPtrAllocator::Allocate() is marked __attribute__((returns_nonnull)) when the compiler supports it, which makes it UB to return null. When there have been no allocations yet, the current slab is [nullptr, nullptr). A zero-sized allocation fits in this range, and so Allocate(0, 1) returns null. There's no explicit docs whether Allocate(0) is valid. I think we have to assume that it is: - the implementation tries to support it (e.g. >= tests instead of >) - malloc(0) is allowed - requiring each callsite to do a check is bug-prone - I found real LLVM code that makes zero-sized allocations Differential Revision: https://reviews.llvm.org/D125040	2022-05-06 08:57:27 +02:00
Ilia Diachkov	0098f2aebb	[SPIRV] Add SPIR-V specific intrinsics, two passes and tests The patch adds SPIR-V specific intrinsics required to keep information critical to SPIR-V consistency (types, constants, etc.) during translation from IR to MIR. Two related passes (SPIRVEmitIntrinsics and SPIRVPreLegalizer) and several LIT tests (passed with this change) have also been added. It also fixes the issue with opaque pointers in SPIRVGlobalRegistry.cpp and the mismatch of the data layout between the SPIR-V backend and clang (Issue #55122). Differential Revision: https://reviews.llvm.org/D124416 Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com> Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com> Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>	2022-05-06 03:02:00 +03:00
Lang Hames	16dcbb53dc	[ORC] Return ExecutorAddrs rather than JITEvaluatedSymbols from LLJIT::lookup. Clients don't care about linkage, and ExecutorAddr is much more ergonomic.	2022-05-05 13:56:00 -07:00
Lang Hames	98616cfc02	[ORC] Add an ExecutorAddr::toPtr overload for function types. In the common case of converting an ExecutorAddr to a function pointer type, this eliminates the need for the '()' boilerplate to explicitly specify a function pointer. E.g.: auto F = A.toPtr<int()()>(); can now be written as auto F = A.toPtr<int()>();	2022-05-05 12:37:23 -07:00
David Blaikie	0d8cb8b399	DWARFVerifier: Verify CU/TU index overlap issues Discovered in a large object that would need a 64 bit index (but the cu/tu index format doesn't include a 64 bit offset/length mode in DWARF64 - a spec bug) but instead binutils dwp overflowed the offsets causing overlapping regions.	2022-05-05 18:18:53 +00:00
Serge Pavlov	e1554ac63a	Revert "[InstCombine] Remove side effect of replaced constrained intrinsics" This reverts commit `83914ee96f`. The change caused discussion: https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20220502/1034841.html	2022-05-06 01:09:16 +07:00
Brian Tracy	87a55137e2	Fix "the the" typo in documentation and user facing strings There are many more instances of this pattern, but I chose to limit this change to .rst files (docs), anything in libcxx/include, and string literals. These have the highest chance of being seen by end users. Reviewed By: #libc, Mordante, martong, ldionne Differential Revision: https://reviews.llvm.org/D124708	2022-05-05 17:52:08 +02:00
Xing Xue	e5926906eb	[XCOFF][AIX] Use unique section names for LSDA and EH info sections with -ffunction-sections Summary: When -ffunction-sections is on, this patch makes the compiler to generate unique LSDA and EH info sections for functions on AIX by appending the function name to the section name as a suffix. This will allow the AIX linker to garbage-collect unused function. Reviewed by: MaskRay, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D124855	2022-05-05 09:01:36 -04:00
Nikita Popov	9678936f18	[DAGCombine] Fold (X & ~Y) \| Y with truncated not This extends the (X & ~Y) \| Y to X \| Y fold to also work if ~Y is a truncated not (when taking into account the mask X). This is done by exporting the infrastructure added in D124856 and reusing it here. I've retained the old value of AllowUndefs=false, though probably this can be switched to true with extra test coverage. Differential Revision: https://reviews.llvm.org/D124930	2022-05-05 11:10:11 +02:00
Chuanqi Xu	405bf90235	[NFC] [Pipelines] Hoist CoroCleanup as Module Pass This is similar to previous patch https://reviews.llvm.org/D123925. It could also reduce the time we call declaresCoroCleanupIntrinsics. And it is helpful for further changes. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D124362	2022-05-05 15:15:09 +08:00
Serge Pavlov	83914ee96f	[InstCombine] Remove side effect of replaced constrained intrinsics If a constrained intrinsic call was replaced by some value, it was not removed in some cases. The dangling instruction resulted in useless instructions executed in runtime. It happened because constrained intrinsics usually have side effect, it is used to model the interaction with floating-point environment. In some cases it is correct behavior but often the side effect is actually absent or can be ignored. This change adds specific treatment of constrained intrinsics so that their side effect can be removed if it actually absents. Differential Revision: https://reviews.llvm.org/D118426	2022-05-05 12:02:42 +07:00
Junfeng Dong	a0fb387941	[DebugInfo] Give warning instead of error for premature terminator in .debug_aranges section. llvm-profgen gives error message when the input binary contains premature terminator in .debug_aranges section. These zero length items point to some rodata with zero size type in embed Rust Library. Considering Zero-Sized Types are a valid feature in Rust. They are not real error. This change makes the "error:" message into a warning to avoid misleading. Why do we still want a warning on such case? because it doesn't follow dwarf standard. https://bugs.llvm.org/show_bug.cgi?id=46805 contains early discussion. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D124121	2022-05-04 15:21:58 -07:00
serge-sans-paille	7030654296	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fa5a4e1b95` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D124847	2022-05-04 08:32:38 +02:00
Luboš Luňák	8ef5710e63	[ThreadPool] add ability to group tasks into separate groups This is needed for parallelizing of loading modules symbols in LLDB (D122975). Currently LLDB can parallelize indexing symbols when loading a module, but modules are loaded sequentially. If LLDB index cache is enabled, this means that the cache loading is not parallelized, even though it could. However doing that creates a threadpool-within-threadpool situation, so the number of threads would not be properly limited. This change adds ThreadPoolTaskGroup as a simple type that can be used with ThreadPool calls to put tasks into groups that can be independently waited for (even recursively from within a task) but still run in the same thread pool. Differential Revision: https://reviews.llvm.org/D123225	2022-05-04 06:16:55 +02:00
Alex Borcan	afaa56df7a	Implement support for __llvm_addrsig for MachO in llvm-mc The __llvm_addrsig section is a section that the linker needs for safe icf. This was not yet implemented for MachO - this is the implementation. It has been tested with a safe deduplication implementation inside lld. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D123751	2022-05-03 18:19:18 -04:00
Philipp Tomsich	64816e68f4	[AArch64] Support for Ampere1 core Add support for the Ampere Computing Ampere1 core. Ampere1 implements the AArch64 state and is compatible with ARMv8.6-A. Differential Revision: https://reviews.llvm.org/D117112	2022-05-03 15:54:02 +01:00
Nathan Sidwell	ed2d4da732	[demangler] Fold expressions of .* and ->* (Exitingly) a fold expression's operators include .* and ->*, but we failed to demangle them as we categorize those as MemberExprs, not BinaryExprs. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D123305	2022-05-03 06:45:25 -07:00
Simon Tatham	32814df442	[Windows] Fix handling of \" in program name on cmd line. Bugzilla #47579: if you invoke clang on Windows via a pathname in which a quoted section closes just after a backslash, e.g. "C:\Program Files\Whatever\"clang.exe then cmd.exe and CreateProcess will correctly find the binary, because when they parse the program name at the start of the command line, they don't regard the \ before the " as having any kind of escaping effect. This is different from the behaviour of the Windows standard C library when it parses the rest of the command line, which would consider that \" not to close the quoted string. But this confuses windows::GetCommandLineArguments, because the Windows API function GetCommandLineW() will return a command line containing that \" sequence, and cl::TokenizeWindowsCommandLine will tokenize the whole string according to the C library's rules. So it will misidentify where the program name stops and the arguments start. To fix this, I've introduced a new variant function cl::TokenizeWindowsCommandLineFull(), intended to be applied to the string returned from GetCommandLineW(). It parses the first word of the command line according to CreateProcess's rules, considering \ to never be an escaping character; thereafter, it switches over to the C library rules for the rest of the command line. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D122914	2022-05-03 11:57:50 +01:00
Igor Kirillov	4e5e042d9a	[LoopVectorize] Support reductions that store intermediary result Adds ability to vectorize loops containing a store to a loop-invariant address as part of a reduction that isn't converted to SSA form due to lack of aliasing info. Runtime checks are generated to ensure the store does not alias any other accesses in the loop. Ordered fadd reductions are not yet supported. Differential Revision: https://reviews.llvm.org/D110235	2022-05-03 10:12:30 +01:00
Jeremy Morse	1d712c3818	[DebugInfo][InstrRef] Don't generate redundant DBG_PHIs In SelectionDAG, DBG_PHI instructions are created to "read" physreg values and give them an instruction number, when they can't be traced back to a defining instruction. The most common scenario if arguments to a function. Unfortunately, if you have 100 inlined methods, each of which has the same "this" pointer, then the 100 dbg.value instructions become 100 DBG_INSTR_REFs plus 100 DBG_PHIs, where only one DBG_PHI would suffice. This patch adds a vreg cache for MachienFunction::salvageCopySSA, if we've already traced a value back to the start of a block and created a DBG_PHI then it allows us to re-use the DBG_PHI, as well as reducing work. Differential Revision: https://reviews.llvm.org/D124517	2022-05-03 09:56:12 +01:00
Markus Lavin	dd8cf372c5	[NFC] Minimal refactor of TTI to avoid clangsa complaint Differential Revision: https://reviews.llvm.org/D124754	2022-05-03 10:43:48 +02:00
David Green	6f81903e89	[LV][SLP] Mark fptosi_sat as vectorizable This adds fptosi_sat and fptoui_sat to the list of trivially vectorizable functions, mainly so that the loop vectorizer can vectorize the instruction. Marking them as trivially vectorizable also allows them to be SLP vectorized, and Scalarized. The signature of a fptosi_sat requires two type overrides (@llvm.fptosi.sat.v2i32.v2f32), unlike other intrinsics that often only take a single. This patch alters hasVectorInstrinsicOverloadedScalarOpd to isVectorIntrinsicWithOverloadTypeAtArg, so that it can mark the first operand of the intrinsic as a overloaded (but not scalar) operand. Differential Revision: https://reviews.llvm.org/D124358	2022-05-03 09:32:34 +01:00

... 11 12 13 14 15 ...

49278 Commits