llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	d16989607b	[InstCombine] reduce code duplication in visitBranchInst(); NFCI	2022-10-18 11:34:02 -04:00
Florian Hahn	a8e9742bd4	[IndVarSimplify] Clear block and loop dispositions after moving instr. Moving an instruction can invalidate the cached block dispositions of the corresponding SCEV. Invalidate the cached dispositions. Also fixes a copy-paste error in forgetBlockAndLoopDispositions where the start expression S was removed from BlockDispositions in the loop but not the current values. This was also exposed by the new test case. Fixes #58439.	2022-10-18 16:18:14 +01:00
Nikita Popov	d06131fda2	[AST] Pass BatchAA to mergeSetIn() (NFCI)	2022-10-18 16:54:55 +02:00
Krzysztof Parzyszek	9fde8e907b	[Hexagon] Fix MULHS lowering for HVX v60 The carry bit from an intermediate addition was not properly propagated. For example mulhs(7fffffff, 7fffffff) was evaluated as 3ffeffff, while the correct result is 3fffffff.	2022-10-18 07:54:38 -07:00
Alexey Bataev	e79532d28c	[SLP][NFC]Try to fix MSVC buildbots with a workaround, NFC.	2022-10-18 07:50:10 -07:00
uabkaka	da137d041b	[SimplifyLibCalls] Add NoUndef/NonNull/Dereferenceable attributes to iprintf/siprintf When SimplifyLibCalls fail to optimize printf and sprintf it add NoUndef/NonNull/Dereferenceable attributes. This patch add the same attributes if SimplifyLibCalls optimize printf/sprintf into the integer only iprintf/siprintf. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D136140	2022-10-18 16:36:35 +02:00
Alexey Bataev	6a6fc4890d	[SLP][NFC]Formatting of the getEntryCost function, NFC.	2022-10-18 07:18:26 -07:00
Florian Hahn	e302fa89aa	[LoopUnroll] Forget exit values when making changes. When unrolling, the exit values in LCSSA phis will get updated. Invalidate cached SCEV values for those phis in case SCEV looked through a exit phi. Fixes #58340.	2022-10-18 15:12:24 +01:00
Anton Afanasyev	e175f99c49	Revert "[MachineCombiner][RISCV] Enable MachineCombiner for RISCV" This reverts commit `3112cf3b00`. Test breakage: https://lab.llvm.org/buildbot/#/builders/16/builds/36631	2022-10-18 15:57:11 +03:00
Weining Lu	9572406bbc	[LoongArch] Fix codegen of atomicrmw nand Fix invalid RISCV-like MI being emitted for performing the `not` operation: the LoongArch `xori` zero-extends the immediate, hence is not equivalent to RISCV `xori`. The LoongArch `not` is a `nor` with zero. Differential Revision: https://reviews.llvm.org/D136021	2022-10-18 20:39:20 +08:00
Anton Sidorenko	3112cf3b00	[MachineCombiner][RISCV] Enable MachineCombiner for RISCV Initial implementation to match basic FP reassociation patterns. Differential Revision: https://reviews.llvm.org/D135264	2022-10-18 15:31:03 +03:00
Carlos Alberto Enciso	b6625765cf	[llvm-debuginfo-analyzer] Fix linking errors in buildbots. The tool used the 'old' LLVM build information (LLVMBuild.txt), which caused linking errors in: https://lab.llvm.org/buildbot/#/builders/177/builds/10125 https://lab.llvm.org/buildbot/#/builders/196/builds/19699 Update the CMake configuration to support the new LLVM build system that uses only CMakeLists.txt. Reviewed By: jryans Differential Revision: https://reviews.llvm.org/D136159	2022-10-18 13:05:46 +01:00
Nikita Popov	e162a73e41	[CFG] Add const qualifier to isPotentiallyReachableFromMany() (NFC) Accept a const pointer for StopBB. Unfortunately the worklist has to use non-const pointers due to LoopInfo interaction.	2022-10-18 10:06:07 +02:00
Carlos Alberto Enciso	c28a977b87	Recommit [llvm-debuginfo-analyzer] (02/09) - Driver and documentation Originally committed in `fe7a3cedf7` Reverted in `26dd64ba9c` Buildbot failures: https://lab.llvm.org/buildbot#builders/139/builds/29663 - unittest trigger an invalid assertion. https://lab.llvm.org/buildbot#builders/196/builds/19665 - 'has virtual functions but non-virtual destructor' warning as error. Recommitted with fix: - Removed the assertion. - Added virtual destructor.	2022-10-18 08:39:26 +01:00
Dominik Adamski	ccd314d320	[OpenMP][OMPIRBuilder] Add generation of SIMD align assumptions to OMPIRBuilder Currently generation of align assumptions for OpenMP simd construct is done outside OMPIRBuilder for C code and it is not supported for Fortran. According to OpenMP 5.0 standard (2.9.3) only pointers and arrays can be aligned for C code. If given aligned variable is pointer, then Clang generates the following set of the LLVM IR isntructions to support simd align clause: ; memory allocation for pointer address: %A.addr = alloca ptr, align 8 ; some LLVM IR code ; Alignment instructions (alignment is equal to 32): %0 = load ptr, ptr %A.addr, align 8 call void @llvm.assume(i1 true) [ "align"(ptr %0, i64 32) ] If given aligned variable is array, then Clang generates the following set of the LLVM IR isntructions to support simd align clause: ; memory allocation for array: %B = alloca [10 x i32], align 16 ; some LLVM IR code ; Alignment instructions (alignment is equal to 32): %arraydecay = getelementptr inbounds [10 x i32], ptr %B, i64 0, i64 0 call void @llvm.assume(i1 true) [ "align"(ptr %arraydecay, i64 32) ] OMPIRBuilder was modified to generate aligned assumptions. It generates only llvm.assume calls. Frontend is responsible for generation of aligned pointer and getting the default alignment value if user does not specify it in aligned clause. Unit and regression tests were added to check if aligned clause was handled correctly. Differential Revision: https://reviews.llvm.org/D133578 Reviewed By: jdoerfert	2022-10-18 02:04:18 -05:00
Max Kazantsev	f884a4c957	[NFC] Reuse NonTrivialUnswitchCandidate instead of std::pair	2022-10-18 14:00:53 +07:00
LiaoChunyu	7b970290c0	[RISCV] Optimize SELECT_CC when the true value of select is Constant (select (setcc lhs, rhs, CC), constant, falsev) -> (select (setcc lhs, rhs, InverseCC), falsev, constant) This patch removes unnecessary copies Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D129757	2022-10-18 09:24:17 +08:00
Koakuma	d3fcbee10d	[SPARC] Make calls to function with big return values work Implement CanLowerReturn and associated CallingConv changes for SPARC/SPARC64. In particular, for SPARC64 there's new `RetCC_Sparc64_` functions that handles the return case of the calling convention. It uses the same analysis as `CC_Sparc64_` family of funtions, but fails if the return value doesn't fit into the return registers. This makes calls to functions with big return values converted to an sret function as expected, instead of crashing LLVM. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D132465	2022-10-18 00:01:55 +00:00
Arthur Eubanks	308b4bca14	[NFC][SROA] Update comment to use opaque pointers for clarity	2022-10-17 16:37:29 -07:00
Daniel Sanders	021e6e05d3	[instsimplify] Move (extelt (inselt Vec, Value, Index), Index) -> Value from InstCombine As requested in https://reviews.llvm.org/D135625#3858141 Differential Revision: https://reviews.llvm.org/D136099	2022-10-17 15:22:06 -07:00
Xiang Li	13163dd8ab	[HLSL] CodeGen hlsl resource binding. ''register(ID, space)'' like register(t3, space1) will be translated into i32 3, i32 1 as the last 2 operands for resource annotation metadata. NamedMetadata for CBuffers and SRVs are added as "hlsl.srvs" and "hlsl.cbufs". Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D130951	2022-10-17 14:29:19 -07:00
Craig Topper	2b32e4f98b	[RISCV] Add basic support for the sifive-7-series short forward branch optimization. sifive-7-series has macrofusion support to convert a branch over a single instruction into a conditional instruction. This can be an improvement if the branch is hard to predict. This patch adds support for the most basic case, a branch over a move instruction. This is implemented as a pseudo instruction so we can hide the control flow until all code motion passes complete. I've disabled a recent select optimization if this feature is enabled in the subtarget. Related gcc patch for the same optimization https://www.mail-archive.com/gcc-patches@gcc.gnu.org/msg211045.html Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135814	2022-10-17 13:56:22 -07:00
Matthias Braun	6d972ad2d8	ControlHeightReduction: Remove assert check in shouldApply Remove assertion checking for non-empty `ProfileSummaryInfo`. Differential Revision: https://reviews.llvm.org/D133706	2022-10-17 13:10:13 -07:00
Florian Hahn	6db71b8f14	[ConstraintElim] Use helper to allow overflow for coefficients of GEPs If the arithmetic for indices of inbounds GEPs overflows, the result is poison. This means it is also OK for the coefficients to overflow. GEP decomposition is limited to cases where the index size is <= 64 bit, which can be represented by int64_t used for the coefficients in the constraint system.	2022-10-17 20:30:43 +01:00
Han Zhu	d0d48a91f8	[X86] Lower vector interleave into unpck and perm [This Godbolt link](https://godbolt.org/z/s17Kv1s9T) shows different codegen between clang and gcc for a transpose operation. clang result: ``` vmovdqu xmm0, xmmword ptr [rcx + rax] vmovdqu xmm1, xmmword ptr [rcx + rax + 16] vmovdqu xmm2, xmmword ptr [r8 + rax] vmovdqu xmm3, xmmword ptr [r8 + rax + 16] vpunpckhbw xmm4, xmm2, xmm0 vpunpcklbw xmm0, xmm2, xmm0 vpunpcklbw xmm2, xmm3, xmm1 vpunpckhbw xmm1, xmm3, xmm1 vmovdqu xmmword ptr [rdi + 2rax + 48], xmm1 vmovdqu xmmword ptr [rdi + 2rax + 32], xmm2 vmovdqu xmmword ptr [rdi + 2rax], xmm0 vmovdqu xmmword ptr [rdi + 2rax + 16], xmm4 ``` gcc result: ``` vmovdqu ymm3, YMMWORD PTR [rdi+rax] vpunpcklbw ymm1, ymm3, YMMWORD PTR [rsi+rax] vpunpckhbw ymm0, ymm3, YMMWORD PTR [rsi+rax] vperm2i128 ymm2, ymm1, ymm0, 32 vperm2i128 ymm1, ymm1, ymm0, 49 vmovdqu YMMWORD PTR [rcx+rax2], ymm2 vmovdqu YMMWORD PTR [rcx+32+rax2], ymm1 ``` clang's code is roughly 15% slower than gcc's when evaluated on an internal compression benchmark. The loop vectorizer generates the following shufflevector intrinsic: ``` %interleaved.vec = shufflevector <32 x i8> %a, <32 x i8> %b, <64 x i32> <i32 0, i32 32, i32 1, i32 33, i32 2, i32 34, i32 3, i32 35, i32 4, i32 36, i32 5, i32 37, i32 6, i32 38, i32 7, i32 39, i32 8, i32 40, i32 9, i32 41, i32 10, i32 42, i32 11, i32 43, i32 12, i32 44, i32 13, i32 45, i32 14, i32 46, i32 15, i32 47, i32 16, i32 48, i32 17, i32 49, i32 18, i32 50, i32 19, i32 51, i32 20, i32 52, i32 21, i32 53, i32 22, i32 54, i32 23, i32 55, i32 24, i32 56, i32 25, i32 57, i32 26, i32 58, i32 27, i32 59, i32 28, i32 60, i32 29, i32 61, i32 30, i32 62, i32 31, i32 63> ``` which is lowered to SelectionDAG: ``` t2: v32i8,ch = CopyFromReg t0, Register:v32i8 %0 t6: v64i8 = concat_vectors t2, undef:v32i8 t4: v32i8,ch = CopyFromReg t0, Register:v32i8 %1 t7: v64i8 = concat_vectors t4, undef:v32i8 t8: v64i8 = vector_shuffle<0,64,1,65,2,66,3,67,4,68,5,69,6,70,7,71,8,72,9,73,10,74,11,75,12,76,13,77,14,78,15,79,16,80,17,81,18,82,19,83,20,84,21,85,22,86,23,87,24,88,25,89,26,90,27,91,28,92,29,93,30,94,31,95> t6, t7 ``` So far this `vector_shuffle` is good enough for us to pattern-match and transform, but as we go down the SelectionDAG pipeline, it got split into smaller shuffles. During dagcombine1, the shuffle is split by `foldShuffleOfConcatUndefs`. ``` // shuffle (concat X, undef), (concat Y, undef), Mask --> // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1) t2: v32i8,ch = CopyFromReg t0, Register:v32i8 %0 t4: v32i8,ch = CopyFromReg t0, Register:v32i8 %1 t19: v32i8 = vector_shuffle<0,32,1,33,2,34,3,35,4,36,5,37,6,38,7,39,8,40,9,41,10,42,11,43,12,44,13,45,14,46,15,47> t2, t4 t15: ch,glue = CopyToReg t0, Register:v32i8 $ymm0, t19 t20: v32i8 = vector_shuffle<16,48,17,49,18,50,19,51,20,52,21,53,22,54,23,55,24,56,25,57,26,58,27,59,28,60,29,61,30,62,31,63> t2, t4 t17: ch,glue = CopyToReg t15, Register:v32i8 $ymm1, t20, t15:1 ``` With `foldShuffleOfConcatUndefs` commented out, the vector is still split later by the type legalizer, which comes after dagcombine1, because v64i8 is not a legal type in AVX2 (64 * 8 = 512 bits while ymm = 256 bits). There doesn't seem to be a good way to avoid this split. Lowering the `vector_shuffle` into unpck and perm during dagcombine1 is too early. Therefore, although somewhat inconvenient, we decided to go with pattern-matching a pair vector shuffles later in the SelectionDAG pipeline, as part of `lowerV32I8Shuffle`. The code looks at the two operands of the first shuffle it encounters, iterates through the users of the operands, and tries to find two shuffles that are consecutive interleaves. Once the pattern is found, it lowers them into unpcks and perms. It returns the perm for the shuffle that's currently being lowered (have ISel modify the DAG), and replaces the other shuffle in place. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D134477	2022-10-17 11:39:27 -07:00
Sjoerd Meijer	5b9597f59a	Recommit "[LoopFlatten] Enable it by default" The sanitizer bots turned green again after another change went in, i.e. revert `26dd64ba9c`, so I don't think this patch was causing the problems.	2022-10-17 23:27:19 +05:30
Craig Topper	30305d7948	[TargetLowering][RISCV][Sparc] Don't emit zero check in CTTZTableLookup for CTTZ_ZERO_UNDEF. The code incorrectly checked for CTLZ_ZERO_UNDEF instead of CTTZ_ZERO_UNDEF. While I was there I flipped the condition into an early out. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D136010	2022-10-17 10:15:39 -07:00
Kazu Hirata	ef9956f434	[IR] Rename FuncletPadInst::getNumArgOperands to arg_size (NFC) This patch renames FuncletPadInst::getNumArgOperands to arg_size for consistency with CallBase, where getNumArgOperands was removed in favor of arg_size in commit `3e1c787b31` Differential Revision: https://reviews.llvm.org/D136048	2022-10-17 10:15:10 -07:00
Fangrui Song	5d3139aef1	[AArch64] Fix warnings	2022-10-17 16:58:52 +00:00
Sjoerd Meijer	a71c4e4fbb	Revert "[LoopFlatten] Enable it by default" This reverts commit `233659c7ae`. I see some sanitizer build bot failures. Not sure if it is change causing it, but let's see if a revert returns the bots to green...	2022-10-17 22:14:20 +05:30
Mingming Liu	db0286a096	[AArch64]Enhance 'isBitfieldPositioningOp' to find pattern (shl(and(val,mask), N). Before this patch (and D135844) - Given DAG node shl(op, N), isBitfieldPositioningOp uses (optionally shifted [1] ) op as the Src (least significant bits of Src are inserted into DstLSB of Dst node). After this patch - If op is and(val, mask), isBitfieldPositioningOp tries to see through and and find if val is a simpler source than op. It helps in a similar (probably symmetric) way how isSeveralBitsExtractOpFromShr [2] optimizes isBitfieldExtractOpFromShr Existing test cases are improved without regressions. [1] `cbd8464595/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (L2546)` [2] `cbd8464595/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (L2057)` Differential Revision: https://reviews.llvm.org/D135850	2022-10-17 09:01:29 -07:00
Simon Pilgrim	8e77458578	[DAG] visitShiftByConstant - replace constant detection with FoldConstantArithmetic Instead of checking that an operand is constant/opaque before calling getNode() and then checking that the result is a constant, just use FoldConstantArithmetic which will just early-out if the operands are not constant foldable.	2022-10-17 16:19:10 +01:00
Mingming Liu	45cadb4bd3	[AArch64][NFC]Refactor 'isBitfieldPositioningOp' so that DAG nodes with different Opcode are handled with separate helper functions. Using different helper functions for DAG nodes with different Opcode allows specialization. - 'isBitfieldExtractOp' [1] shows how specialization based on Opcode could catch more patterns. - The refactor paves the way (e.g., makes diff clearer) for enhancement in {D135844,D135850,D135852} [1] `cbd8464595/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp (L2163-L2202)` Differential Revision: https://reviews.llvm.org/D135843	2022-10-17 08:08:48 -07:00
Sanjay Patel	8d76fbb5f0	[VectorCombine] fix crashing on match of non-canonical fneg We can't assume that operand 0 is the negated operand because the matcher handles "fsub -0.0, X" (and also +0.0 with FMF). By capturing the extract within the match, we avoid the bug and make the transform more robust (can't assume that this pass will only see canonical IR).	2022-10-17 10:47:48 -04:00
Nicola Lancellotti	43fe14c056	[AArch64] Canonicalize ZERO_EXTEND to VSELECT Differential Revision: https://reviews.llvm.org/D135596	2022-10-17 15:42:46 +01:00
Simon Pilgrim	af5942cc09	Remove trailing whitespace. NFC.	2022-10-17 15:20:26 +01:00
Nikita Popov	779fd39684	Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify Relative to the previous attempt, this is rebased over the InstSimplify fix in `ac74e7a780`, which addresses the miscompile reported in PR58401. ----- foldOpIntoPhi() currently only folds operations into the phi if all but one operands constant-fold. The two exceptions to this are freeze and select, where we allow more general simplification. This patch makes foldOpIntoPhi() generally simplification based and removes all the instruction-specific logic. We just try to simplify the instruction for each operand, and for the (potentially) one non-simplified operand, we move it into the new block with adjusted operands. This fixes https://github.com/llvm/llvm-project/issues/57448, which was my original motivation for the change. Differential Revision: https://reviews.llvm.org/D134954	2022-10-17 16:11:05 +02:00
Nikita Popov	ac74e7a780	[InstSimplify] Only check self-simplify in simplifyInstruction() InstSimplify currently checks whether the instruction simplifies back to itself, and returns undef in that case. Generally, this should only occur in unreachable code. However, this was also done for the simplifyInstructionWithOperands() API. In that case, the instruction only serves as a template that provides the opcode and other non-operand data. In this case, simplifying back to the same "instruction" may be expected. This caused PR58401 in conjunction with D134954. As such, move this check into simplifyInstruction() only. The only other caller of simplifyInstructionWithOperands() also handles the self-simplification case explicitly.	2022-10-17 15:52:38 +02:00
Carlos Alberto Enciso	26dd64ba9c	Revert "[llvm-debuginfo-analyzer] (02/09) - Driver and documentation" This reverts commit `fe7a3cedf7`.	2022-10-17 14:26:48 +01:00
Carlos Alberto Enciso	fe7a3cedf7	[llvm-debuginfo-analyzer] (02/09) - Driver and documentation llvm-debuginfo-analyzer is a command line tool that processes debug info contained in a binary file and produces a debug information format agnostic “Logical View”, which is a high-level semantic representation of the debug info, independent of the low-level format. The code has been divided into the following patches: 1) Interval tree 2) Driver and documentation 3) Logical elements 4) Locations and ranges 5) Select elements 6) Warning and internal options 7) Compare elements 8) ELF Reader 9) CodeView Reader Full details: https://discourse.llvm.org/t/llvm-dev-rfc-llvm-dva-debug-information-visual-analyzer/62570 This patch: Driver and documentation - Command line options. - Full documentation. - String Pool table. Reviewed By: psamolysov, probinson Differential Revision: https://reviews.llvm.org/D125777	2022-10-17 13:46:55 +01:00
Florian Hahn	699396131f	Revert "Reapply [InstCombine] Switch foldOpIntoPhi() to use InstSimplify" This reverts commit `333246b48e`. It looks like this patch causes a mis-compile: https://github.com/llvm/llvm-project/issues/58401 Fixes #58401.	2022-10-17 12:56:28 +01:00
Sjoerd Meijer	233659c7ae	[LoopFlatten] Enable it by default LoopFlatten has been in the code base off by default for years, but this enables it to run by default. Downstream this has been running for years, so it has been exposed to quite some code. Then around the time we switched to the NPM, several fixes went in related to updating the MemorySSA state and we moved it to a loop pass manager, which both helped preventing rerunning certain analysis passes, and thus helped a bit with compile-times. About compile-times, adding a pass isn't free, but this should see only very minor increases. The pass is relatively simple and there shouldn't be anything algorithmically expensive because all it does is looking at inner/outer loops and it checks assumptions on loop increments and indices. If we see increases, I expect this to mainly come from invalidation of analysis info, and perhaps subsequent passes to trigger and do more. Despite its simplicity/restrictions, it triggers in most code-bases, which makes it worth to enable this by default. Differential Revision: https://reviews.llvm.org/D109958	2022-10-17 17:11:39 +05:30
Nathan Sidwell	d3b10150b6	[demangler] Simplify OutputBuffer initialization Every non-testcase use of OutputBuffer contains code to allocate an initial buffer (using either 128 or 1024 as initial guesses). There's now no need to do that, given recent changes to the buffer extension heuristics -- it allocates a 1k(ish) buffer on first need. Just pass in a buffer (if any) to the constructor. Thus the OutputBuffer's ownership of the buffer starts at its own lifetime start. We can reduce the lifetime of this object in several cases. That new constructor takes a 'size_t *' for the size argument, as all uses with a non-null buffer are passing through a malloc'd buffer from their own caller in this manner. The buffer reset member function is never used, and is deleted. Some adjustment to a couple of uses is needed, due to the lazy buffer creation of this patch. a) the Microsoft demangler can demangle empty strings to nothing, which it then memoizes. We need to avoid the UB of passing nullptr to memcpy. b) a unit test checks insertion of no characters into an empty buffer. We need to avoid UB when converting that to std::string. The original buffer initialization code would return a failure code if that first malloc failed. Existing code either ignored that, called std::terminate with a FIXME, or returned an error code. But that's not foolproof anyway, as a subsequent buffer extension failure ends up calling std::terminate. I am working on addressing that unfortunate failure mode in a manner more consistent with the C++ ABI design. Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D122604	2022-10-17 04:23:16 -07:00
Nikita Popov	436fb27186	[BasicAA] Support loop phis in pointsToConstantMemory() When looking for underlying objects, if we encounter one that we have already seen, then we should skip it (as it has already been checked) rather than bail out. In particular, this adds support for the case where we have a loop use of a phi recurrence.	2022-10-17 12:34:55 +02:00
Chuanqi Xu	1cedc51ff5	[Coroutines] Don't merge readnone calls in presplit coroutines Another alternative to fix the thread identification problem in coroutines. We plan to fix this problem by unifying memory effecting attributes. See https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579. But it may be a long-term project. And it is a pity that the coroutines can't resume in different threads for years. So this one is temporary fix. It may cause unnecessary performance regression for coroutines. But correctness are more important. And this one is planned to be reverted after we are able to unify the memory effecting attributes actually. Reviewed By: jdoerfert, rjmccall Differential Revision: https://reviews.llvm.org/D135550	2022-10-17 10:22:43 +08:00
Kazu Hirata	5ea3155565	[llvm] Use llvm::find (NFC)	2022-10-16 16:21:00 -07:00
Florian Hahn	462ab9810d	[ConstraintElim] Fix signed integer overflow for inbounds GEP. For inbounds GEPs, signed overflow yields poison, so it is fine for the coefficients to wrap as well. This fixes an UBSan failure.	2022-10-16 23:25:28 +01:00
Florian Hahn	aec0c1009f	[ConstraintElim] Replace custom GEP index handling by using existing code Instead of duplicating the existing decomposition code for GEP indices just use the existing code by calling the existing decompose function on the index expression and multiply the result's coefficients by the scale of the index. This both reduces code duplication and generalizes the pattern we can handle.	2022-10-16 21:53:11 +01:00
Florian Hahn	a4635ec710	[ConstraintElim] Support `add nsw` for unsigned preds with positive ops. If both operands of an `add nsw` are known positive, it can be treated the same as `add nuw` and added to the unsigned system. https://alive2.llvm.org/ce/z/6gprff	2022-10-16 20:25:14 +01:00
Kazu Hirata	7820a30a1b	[AMDGPU] Use llvm::any_of (NFC)	2022-10-16 09:19:09 -07:00
Sanjay Patel	e5ee0b06d6	[InstCombine] try to determine "exact" for sdiv If the divisor is a power-of-2 or negative-power-of-2 and the dividend is known to have >= trailing zeros than the divisor, the division is exact: https://alive2.llvm.org/ce/z/UGBksM (general proof) https://alive2.llvm.org/ce/z/D4yPS- (examples based on regression tests) This isn't the most direct optimization (we could create ashr in these examples instead of relying on existing folds for exact divides), but it's possible that there's a more general constraint than just a pow2 divisor, so this might be extended in the future. This should solve issue #58348. Differential Revision: https://reviews.llvm.org/D135970	2022-10-16 10:59:56 -04:00
Sanjay Patel	340ae45be0	[InstCombine] use isKnownNonNegative() for readability; NFCI This should be functionally equivalent - both calls are thin wrappers around computeKnownBits(). We'll probably want to use known-bits directly in follow-up patches because that could determine "exact" for example (see issue #58348).	2022-10-16 10:59:56 -04:00
Jan Sjodin	dd3d8ddb5f	[OpenMP][OpenMPIRBuilder] Migrate OffloadEntriesInfoManager from clang to OMPIRbuilder This patch moves the implementation of the OffloadEntriesInfoManager to the OMPIRbuilder. This class will later be used by flang as well. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D135786	2022-10-16 08:32:40 -04:00
Amara Emerson	13792ba417	[AArch64][GlobalISel] When lowering signext i1 parameters, don't zero-extend to s8 first. Fixes https://github.com/llvm/llvm-project/issues/57181	2022-10-15 20:25:43 -07:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Kazu Hirata	1b97645e56	[ADT] Introduce StringRef::{starts,ends}_width{,_insensitive} This patch introduces: StringRef::starts_with StringRef::starts_with_insensitive StringRef::ends_with StringRef::ends_with_insensitive to be more compatible with std::string and std::string_view. I'm planning to deprecate the existing functions in favor of the new ones. Differential Revision: https://reviews.llvm.org/D136030	2022-10-15 15:06:37 -07:00
Kazu Hirata	b2f41e9ac1	[Vectorize] Use std::conditional_t (NFC)	2022-10-15 14:52:25 -07:00
Florian Hahn	7c1b80e35c	[ConstraintElim] Support unsigned decomposition of mul/shl nuw..const Support decomposition for `mul/shl nuw` with constant operand for unsigned queries. Those expressions should not wrap in the unsigned sense and can be added directly to the unsigned system.	2022-10-15 21:28:08 +01:00
Kazu Hirata	f3a76f0581	[Object] Fix a warning This patch fixes: llvm/lib/Object/XCOFFObjectFile.cpp:1001:20: warning: suggest parentheses around ‘&&’ within ‘\|\|’ [-Wparentheses]	2022-10-15 12:43:12 -07:00
Florian Hahn	f12684d36e	[ConstraintElim] Support signed decomposition of `add nsw`. Add support decomposition for `add nsw` for signed queries. `add nsw` won't wrap and can be directly added to the signed system.	2022-10-15 18:34:03 +01:00
wanglei	506e936871	[LoongArch] Fix wrong VariantKind for MO_GOT_PC_{HI/LO} flags Differential Revision: https://reviews.llvm.org/D135946	2022-10-15 17:45:08 +08:00
Alexander Shaposhnikov	25915c6ad2	[objcopy][MachO] Clean up Section ctors, NFC	2022-10-15 00:53:52 +00:00
Kazushi (Jam) Marukawa	0278c9ceb6	[VE] Change the way to lower select Change to use VEISD::CMOV in combineSelect for better optimization. Support VEISD::CMOV in combineTRUNCATE also to optimize trancate. Merge functions to handle condition codes to VE.h. And add basic CMOV patterns to VEInstrInfo.td. Update regression tests also. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D135878	2022-10-15 08:49:36 +09:00
Krzysztof Parzyszek	fb063ea2ea	[Hexagon] Clean up leftover instructions in HvxIdioms Quick and dirty fix, because this is causing one builder to fail.	2022-10-14 16:45:03 -07:00
Krzysztof Parzyszek	6cb2a02a38	[Hexagon] Report if changes were made in HvxIdioms pass This should fix ``` Pass modifies its input and doesn't report it: Hexagon Vector Combine Pass modifies its input and doesn't report it UNREACHABLE executed at [...hecks-debian/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1436! ```	2022-10-14 15:46:33 -07:00
Keith Smiley	c2d209476c	[llvm-objcopy][MachO] Add support for LC_DYLIB_CODE_SIGN_DRS This allows binaries containing the LC_DYLIB_CODE_SIGN_DRS to be objcopy'd and stripped. Differential Revision: https://reviews.llvm.org/D135988	2022-10-14 15:41:19 -07:00
Zequan Wu	82035ec777	Revert "[PGO] Make emitted symbols hidden" This reverts commit `ecac223b0e`. The commit causes instrprof-darwin-dead-strip.c to fail on mac.	2022-10-14 15:23:26 -07:00
Krzysztof Parzyszek	361a27c155	[Hexagon] Recognize idioms for fixed-point vector multiplication Recognize Q.15Q.15 and Q.31Q.31, with and without rounding.	2022-10-14 15:22:25 -07:00
Martin Storsjö	6eb205b257	Reapply [AArch64] Fix aligning the stack after calling __chkstk Whenever a call to __chkstk was made, the frame lowering previously omitted the aligning (as NumBytes was reset to zero before doing alignment). This fixes https://github.com/llvm/llvm-project/issues/56182. The initial version of this produced invalid code for small functions with no local stack allocations, if those functions were marked with the "stackrealign" attribute. If building with -mstack-alignment=16 (which otherwise mostly would be a no-op), this attribute is added on the main function. Differential Revision: https://reviews.llvm.org/D135687	2022-10-15 00:40:13 +03:00
Krzysztof Parzyszek	b465a98316	[Hexagon] Fix isTypeForHVX for vector predicates HexagonSubtarget::isTypeFixHVX would stop breaking the type up when it reached 64 bits in width. HVX vector predicates can be shorter than that, for example <32 x i1> would have a bitwidth of 32, and it's still a valid HVX type.	2022-10-14 14:38:41 -07:00
Krzysztof Parzyszek	705e77abed	[Hexagon] Lower funnel shifts for HVX HVX v62+ has bidirectional shifts, which do not mask the shift amount to the bit width. Instead, the shift amount is sign-extended from the log(BW) bit value, and a negative value causes a shift in the other direction. For the shift amount being -log(BW), this reversed shift will shift all bits out, inserting 0s or sign bits depending on the type and direction.	2022-10-14 14:13:18 -07:00
Florian Hahn	16cf666bb7	[Loop] Move block and loop dispo invalidation to makeLoopInvariant. makeLoopInvariant may recursively move its operands to make them invariant, before moving the passed in instruction. Those recursively moved instructions are currently missed when invalidating block and loop dispositions. To address this, move the invalidation code to Loop::makeLoopInvariant. Fixes #58314. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D135909	2022-10-14 21:58:14 +01:00
serge-sans-paille	232e0a011e	[lto] Do not try to internalize symbols with escaped name Because of LLVM mangling escape sequence (through '\01' prefix), it is possible for a single symbols two have two different IR representations. For instance, consider @symbol and @"\01_symbol". On OSX, because of the system mangling rules, these two IR names point are converted in the same final symbol upon linkage. LTO doesn't model this behavior, which may result in symbols being incorrectly internalized (if all reference use the escaping sequence while the definition doesn't). The proper approach is probably to use the mangled name to compute GUID to avoid the dual representation, but we can also avoid discarding symbols that are bound to two different IR names. This is an approximation, but it's less intrusive on the codebase. Fix #57864 Differential Revision: https://reviews.llvm.org/D135710	2022-10-14 22:34:17 +02:00
Zain Jaffal	0c8dde551c	[ConstraintElimination] Move logic for replacing ssub overflow users (NFC) Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D134044	2022-10-14 21:14:21 +01:00
Filipp Zhinkin	ef774bec63	[AArch64] Support SETCCCARRY lowering Support SETCCCARRY lowering to SBCS instruction. Related issue: https://github.com/llvm/llvm-project/issues/44629 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D135302	2022-10-14 22:29:31 +03:00
Argyrios Kyrtzidis	d877e3fe71	[Transforms/ObjCARC] Fix non-deterministic output of `ObjCARCOptPass` `ProvenanceAnalysis::related()` was assuming that the order of parameters for `relatedCheck()` was not affecting the result but this was not the case when both parameters were `PHINode`s. Due to this assumption `ProvenanceAnalysis::related()` was ordering the parameters based on pointer value which resulted in non-deterministic behavior. To address this change `relatedPHI()` so that it gives the same result independent of the parameter order. rdar://100325456 Differential Revision: https://reviews.llvm.org/D135376	2022-10-14 12:26:58 -07:00
Craig Topper	1fab0ac559	[RISCV] Rename ReadVIALUCV->ReadVICALUV to match WriteVICALUV. NFC	2022-10-14 12:11:55 -07:00
Krzysztof Parzyszek	e8375e3042	[Hexagon] Use IRBuilderBase in function parameters This will allow using builders with different folders.	2022-10-14 12:10:59 -07:00
Krzysztof Parzyszek	7f4ce3f1eb	[Hexagon] Introduce PS_vsplat[ir][bhw] pseudo instructions HVX v60 only has splats that take a 32-bit word as input, while v62+ has splats that take 8- or 16-bit value. This makes writing output patterns that need to use a splat annoying, because the entire output pattern needs to be replicated for various versions of HVX. To avoid this, the patterns will always use the pseudos, and then the pseudos will be handled using a post-ISel hook.	2022-10-14 12:03:13 -07:00
Craig Topper	d3366efd43	[LV] Simplify register usage code and avoid double map lookups. NFC Instead of checking whether a map entry exists to decide if we should initialize it or add to it, we can rely on the map entry being constructed and initialized to 0 before the addition happens. For the std::max case, I've made a reference to the map entry to avoid looking it up twice. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135977	2022-10-14 11:55:48 -07:00
Chris Bieneman	911d2dc230	[NFC] [HLSL] Move common metadata to LLVMFrontend This change pulls some code from the DirectX backend into a new LLVMFrontendHLSL library to share utility data structures between the HLSL code generation in Clang and the backend in LLVM. This is a small refactoring as a first start to get code into the right structure and get the library built and dependencies correct. Fixes #58000 (https://github.com/llvm/llvm-project/issues/58000) Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D135110	2022-10-14 13:40:04 -05:00
Craig Topper	44f0b13494	[RISCV] Correct RISCVTTIImpl::getRegUsageForType for vectors of pointers. getPrimitiveSizeInBits returns 0 for pointers, we need to query the size via DataLayout instead. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135976	2022-10-14 11:34:12 -07:00
Chris Bieneman	e530a1188e	[DX] Add pass to pretty-print DXIL metadata in asm When DXC prints IR output it adds a bunch of IR comments in a header that describe the DXIL metadata in a more human-readable format. This pass will serve that purpose for LLVM by printing out ahead of the IR printer. Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D135802	2022-10-14 13:32:59 -05:00
Caroline Concatto	60e2aad109	[AArch64]Change printVectorList to print SVE vector range This patch has the prefered disassembly changed for SVE vector list. For instance, instead of printing this assembly: ld4d { z1.d, z2.d, z3.d, z4.d }, p0/z, [x0] it will print this: ld4d { z1.d-z4.d }, p0/z, [x0] Differential Revision: https://reviews.llvm.org/D135952	2022-10-14 18:59:56 +01:00
David Green	de6dfbbb30	[ARM] Fix for MVE i128 vector icmp costs. We were hitting an assert as the legalied type needn't be a vector. Fixes #58364	2022-10-14 18:49:25 +01:00
Hassnaa Hamdi	2c72d90ecc	[AArch64-SVE]: Force generating code compatible to streaming mode. Add a compile-time flag for enabling streaming mode. When streaming mode is enabled, lower basic loads and stores of fixed-width vectors; to generate code that is compatible to streaming mode. Differential Revision: https://reviews.llvm.org/D133433	2022-10-14 17:46:56 +00:00
Angelo Matni	ccde601f14	Fix llvm/lib/ObjCopy, llvm/llvm-ifs: c++20 compatibility Cleanup: avoid referring to `std::vector<T>` members when `T` is incomplete. This is [not legal](https://timsong-cpp.github.io/cppwp/n4868/vector#overview-4) according to the C++ standard, and causes build errors in particular in C++20 mode. Fix it by defining the vector's type before using the vector. Reviewed By: saugustine, MaskRay Differential Revision: https://reviews.llvm.org/D135906	2022-10-14 10:28:46 -07:00
chenglin.bi	c1909d7337	[DAGCombiner] Fix crash for the merge stores with different value type The crash case comes from #58350. It have two stores, one store is type f32 and the other is v1f32. When we try to merge these two stores on v1f32, the memVT is vector type so the old code will use ISD::EXTRACT_SUBVECTOR for type f32 also then compiler crash. So this patch insert a build_vector for f32 store to generate v1f32 also when memVT is v1f32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135954	2022-10-15 01:16:35 +08:00
Nicola Lancellotti	ce1a2ccf94	[NFC] Fix typo in DAGCombiner	2022-10-14 17:47:25 +01:00
Dmitry Preobrazhensky	bf96703fb3	[AMDGPU][MC][GFX8+] Correct v_cndmask modifiers Correct v_cndmask_b32 to support abs/neg modifiers in dpp/sdwa/e64 variants. Correct v_cndmask_b16 for proper disassembly of abs/neg modifiers in e64_dpp variants. Differential Revision: https://reviews.llvm.org/D135900	2022-10-14 19:37:27 +03:00
Sander de Smalen	02df03c5b7	[AArch64][SME] Add support for arm_locally_streaming functions. Functions with `aarch64_sme_pstatesm_body` will emit a SMSTART at the start of the function, and a SMSTOP at the end of the function, such that all operations use the right value for vscale. Because the placement of these nodes is critically important (i.e. no vscale-dependent operations should be done before SMSTART has been issued), we require glueing the CopyFromReg to the Entry node such that we can insert the SMSTART as part of that glued chain. More details about the SME attributes and design can be found in D131562. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131582	2022-10-14 13:47:53 +00:00
chenglin.bi	85e41fcaac	[AArch64] Select to CCMN when the CCMP's second operator is negative constant CCMP/CCMN's second operator support const from 0 to 31. When the CCMP's second operator is in the range [-31, -1] we can replace it with CCMN to avoid extra mov. Fix: #57034 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D135939	2022-10-14 21:41:25 +08:00
Florian Hahn	5a68e578ca	[ConstraintElim] Add debug message when decomposition fails.	2022-10-14 11:02:05 +01:00
Martin Storsjö	f309f095e7	Revert "[AArch64] Fix aligning the stack after calling __chkstk" This reverts commit `50e0aced45`. This could accidentally start producing invalid code in some cases (in particular, if compiling with -mstack-alignment=16, which one could expect to be a no-op for a target where the stack always is aligned to 16 bytes anyway).	2022-10-14 11:55:59 +03:00
Benjamin Kramer	08dc847f33	Add missing `override`s after `aad013de41`	2022-10-14 10:38:32 +02:00
Nikita Popov	237b962031	[BasicAA] Account for cycles when checking for same select condition If we have translated across a cycle backedge, the same SSA value for the condition might be referring to two different loop iterations. Use the isValueEqualInPotentialCycles() helper to avoid assuming equality in that case.	2022-10-14 10:37:40 +02:00
Nikita Popov	03f9d0ff22	[TBAA] Model call accessing immutable type as readnone Accesses to constant memory are not observable and should be reported as readnone, not readonly. This is consistent with what we do for normal (non-call) instructions: For those, the TBAA metadata will result in pointsToConstantMemory() returning true, which will then result in a NoModRef result, not a Ref result. Differential Revision: https://reviews.llvm.org/D135864	2022-10-14 10:08:37 +02:00
gonglingqin	e632bb6543	[LoongArch] Add codegen support for atomicrmw umin/umax operation on LA64 Furthermore, use `beqz $rd, .BB` instead of `beq $rd, $zero, .BB`. Differential Revision: https://reviews.llvm.org/D135525	2022-10-14 15:24:43 +08:00
Leon Clark	6370bc2435	Add f16 nearbyint support. Enable lowering of FNEARBYINT for f16 and extend existing tests. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135124	2022-10-14 08:05:24 +01:00
Matt Arsenault	d0750ec475	AtomicExpand: Avoid some operations if the atomic is overaligned Let some of the pointer bithacking fold away if we know the LSB are 0.	2022-10-13 23:31:00 -07:00

1 2 3 4 5 ...

162738 Commits