llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	572dfef1db	[SelectionDAG] Use llvm::any_of to simplify a loop. NFC	2022-05-04 19:09:06 -07:00
Nikita Popov	451bc723ae	[SDAG] Handle truncated not in haveNoCommonBitsSet() Demanded bits analysis may replace a full-width not with a any_extend (not (truncate X)) pattern. This patch looks through this kind of pattern in haveNoCommonBitsSet(). Of course, we can only do this if we only need negated bits in the non-extended part, as the other bits may now be arbitrary. For example, if we have haveNoCommonBitsSet(~X & Y, X) then ~X only needs to actually negate bits set in Y. This is only a partial solution to the problem in that it allows add -> or conversion, but the resulting or doesn't get folded yet. (I guess that will involve exposing getBitwiseNotOperand() as a more general helper and using that in the relevant transform.) Differential Revision: https://reviews.llvm.org/D124856	2022-05-04 15:30:44 +02:00
serge-sans-paille	7030654296	[iwyu] Handle regressions in libLLVM header include Running iwyu-diff on LLVM codebase since `fa5a4e1b95` detected a few regressions, fixing them. Differential Revision: https://reviews.llvm.org/D124847	2022-05-04 08:32:38 +02:00
Simon Pilgrim	faa35fc873	[DAG] Fix issue with rot(rot(x,c1),c2) -> rot(x,c1+c2) fold with unnormalized rotation amounts Don't assume the rotation amounts have been correctly normalized - do it as part of the constant folding. Also, the normalization should be performed with UREM not SREM.	2022-05-03 17:16:26 +01:00
Nikita Popov	2171a896ed	[SDAG] Handle A and B&~A in haveNoCommonBitsSet() This is the DAG variant of D124763. The code already handles the general pattern, but not this degenerate case. This allows folding A + (B&~A) to A | (B&~A) which further folds to A | B. Handling on the SDAG level is needed because in the motivating case the add is actually a getelementptr, which only gets converted into an add on the SDAG level. However, this patch is not quite sufficient to handle the getelementptr case yet, because of an interfering demanded bits simplification. Differential Revision: https://reviews.llvm.org/D124772	2022-05-03 15:47:02 +02:00
Nikita Popov	e0892614b1	[SDAG] Extract commutative helper from haveNoCommonBitsSet() (NFC) To make it easier to add additional patterns, which will generally want to handle commuted top-level operands.	2022-05-03 12:28:35 +02:00
Hsiangkai Wang	eaaa31ff2c	[RISCV][TargetLowering] Special case overflow expansion for (uaddo X, C). Follow-up to D122933. Differential Revision: https://reviews.llvm.org/D124374	2022-05-03 03:51:36 +00:00
Craig Topper	5f057eaa0d	[DAGCombiner] reassociationCanBreakAddressingModePattern should check uses of the outer add. When looking for memory uses, reassociationCanBreakAddressingModePattern should check uses of the outer ADD rather than the inner ADD. We want to know if the two ops we're reassociating are used by a load/store. In practice, the existing check usually works because CodeGenPrepare will make one of the load/stores have an offset of 0 relative to split GEP. That will make the inner add have a memory use. To test this, I've manually split the GEPs so there is no 0 offset store. This issue was recently discussed in the original review D60294. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D124644	2022-05-02 16:38:53 -07:00
Sanjay Patel	747c6a0c73	[SDAG] fix miscompile when casting int->FP->int This is the codegen equivalent of D124692. As shown in https://github.com/llvm/llvm-project/issues/55150 - the existing fold may be wrong when converting to a signed value. This is a quick fix to avoid the miscompile. https://alive2.llvm.org/ce/z/KtaDmd Differential Revision: https://reviews.llvm.org/D124771	2022-05-02 14:57:27 -04:00
Simon Pilgrim	ae8b10e543	[DAG] (style) Break apart if-else chain as they all return	2022-05-01 17:56:59 +01:00
Paul Walker	f10a8f6752	[LegalizeDAG] Fix TypeSize conversion error when expanding SIGN_EXTEND_INREG SIGN_EXTEND_INREG expansion can trigger a TypeSize error because "VT.getSizeInBits() == 1" is used to detect for a boolean without first verifying VT is a scalar.	2022-04-30 19:21:48 +01:00
Craig Topper	6affe87bda	[DAGCombiner] When matching a disguised rotate by constant don't forget to apply LHSMask/RHSMask. We try to match as a disguised rotate by constant of these forms (shl (X | Y), C1) | (srl X, C2) --> (rotl X, C1) | (shl Y, C1) (shl X, C1) | (srl (X | Y), C2) --> (rotl X, C1) | (srl Y, C2) We may have also looked through an AND to find the shift. If we did, we need to apply a mask to the result. I'll add an AArch64 test and pre-commit it and the RISC-V test tomorrow. Fixes PR55201. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D124711	2022-04-30 11:02:30 -07:00
Paul Walker	23c509754d	[DAGCombiner] Stop invalid sign conversion in refineIndexType. When looking through extends of gather/scatter indices it's safe to convert a known positive signed index to unsigned, but unsigned indices must remain unsigned. Depends On D123318 Differential Revision: https://reviews.llvm.org/D123326	2022-04-29 14:20:13 +01:00
Nikita Popov	027c728f29	[SelectionDAGBuilder] Don't create MGATHER/MSCATTER with Scale != ElemSize This is an alternative to D124530. In getUniformBase() only create scales that match the gather/scatter element size. If targets also support other scales, then they can produce those scales in target DAG combines. This is what X86 already does (as long as the resulting scale would be 1, 2, 4 or 8). This essentially restores the pre-opaque-pointer state of things. Fixes https://github.com/llvm/llvm-project/issues/55021. Differential Revision: https://reviews.llvm.org/D124605	2022-04-29 14:57:53 +02:00
Paul Walker	7a0b897e86	[DAGCombiner][SVE] Ensure MGATHER/MSCATTER addressing mode combines preserve index scaling refineUniformBase and selectGatherScatterAddrMode both attempt the transformation: base(0) + index(A+splat(B)) => base(B) + index(A) However, this is only safe when index is not implicitly scaled. Differential Revision: https://reviews.llvm.org/D123222	2022-04-29 12:35:16 +01:00
Serge Pavlov	9fc58f1820	[PowerPC] Support of ppc_fp128 in lowering of llvm.is_fpclass PowerPC supports `ppc_fp128`, which is not an IEEE floating point type. The generic lowering of llvm.is_fpclass could not handle it properly. This change extends the generic lowering code to support `ppc_fp128`. The change was tested on emulator using runtime tests from https://reviews.llvm.org/D112933 and the patch for clang https://reviews.llvm.org/D112932. Differential Revision: https://reviews.llvm.org/D113908	2022-04-29 11:10:47 +07:00
Alexey Bataev	75e1cf4a6a	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning of the incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-28 10:04:41 -07:00
Bjorn Pettersson	3a39bb96ca	[SelectionDAG] Use correct boolean representation in FoldConstantArithmetic The description of SETCC says /// SetCC operator - This evaluates to a true value iff the condition is /// true. If the result value type is not i1 then the high bits conform /// to getBooleanContents. Without this patch, we sign extended the i1 to the used larger type regardless of getBooleanContents. This resulted in miscompiles, as shown in the attached testcase that ended up returning -1 instead of 1 when using -mattr=+v. Fixes https://github.com/llvm/llvm-project/issues/55168 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124618	2022-04-28 18:42:16 +02:00
Alexey Bataev	9861ca0c23	Revert "[COST]Improve cost model for shuffles in SLP." This reverts commit `29a470e380` to fix a crash reported in https://reviews.llvm.org/D100486#3479989.	2022-04-28 08:11:56 -07:00
Alexey Bataev	29a470e380	[COST]Improve cost model for shuffles in SLP. Introduced masks where they are not added and improved target dependent cost models to avoid returning of the incorrect cost results after adding masks. Differential Revision: https://reviews.llvm.org/D100486	2022-04-27 10:56:26 -07:00
Denis Antrushin	4059770af5	[StatepointLowering] Only export STATEPOINT results if used in nonlocal blocks. Currently we always export STATEPOINT results (GC pointers lowered via VRegs) to virtual registers. When processing gc.relocate instructions we have to generate CopyFromRegs node and then export it to VReg again if gc.relocate is used in other basic blocks. This results in generation of extra COPY MIR instruction if statepoint and its gc.relocate are in the same BB, but gc.relocate result is used in other blocks. This patch changes this behavior to export statepoint results only if used in other basic blocks. For local uses StatepointLoweringState.(get|set)Location() API is used to communicate appropriate statepoint result from `LowerStatepoint()` to `visitGCRelocate()`. This is NFC and is purely a compile-time optimization. On big methods it can improve codegen compile time by up to 10%. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D124444	2022-04-27 15:53:24 +03:00
Serge Pavlov	170a903144	Intrinsic for checking floating point class This change introduces a new intrinsic, `llvm.is.fpclass`, which checks if the provided floating-point number belongs to any of the specified value classes. The intrinsic implements the checks made by C standard library functions `isnan`, `isinf`, `isfinite`, `isnormal`, `issubnormal`, `issignaling` and corresponding IEEE-754 operations. The primary motivation for this intrinsic is the support of strict FP mode. In this mode using compare instructions or other FP operations is not possible, because if the value is a signaling NaN, floating-point exception `Invalid` is raised, but the aforementioned functions must never raise exceptions. Currently there are two solutions for this problem, both are implemented partially. One of them is using integer operations to implement the check. It was implemented in https://reviews.llvm.org/D95948 for `isnan`. It solves the problem of exceptions, but offers one solution for all targets, although some can do the check in more efficient way. The other, implemented in https://reviews.llvm.org/D96568, introduced a hook 'clang::TargetCodeGenInfo::testFPKind', which injects a target specific code into IR to implement `isnan` and some other functions. It is convenient for targets that have dedicated instruction to determine FP data class. However using target-specific intrinsic complicates analysis and can prevent some optimizations. A special intrinsic for value class checks allows representing data class tests with enough flexibility. During IR transformations it represents the check in target-independent way and saves it from undesired transformations. In the instruction selector it allows efficient lowering depending on the used target and mode. This implementation is an extended variant of `llvm.isnan` introduced in https://reviews.llvm.org/D104854. It is limited to minimal intrinsic support. Target-specific treatment will be implemented in separate patches. Differential Revision: https://reviews.llvm.org/D112025	2022-04-26 13:09:16 +07:00
Lian Wang	9980148305	[RISCV][SelectionDAG] Support VP_ADD/VP_MUL/VP_SUB mask operations Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D124144	2022-04-26 02:30:22 +00:00
David Green	9727c77d58	[NFC] Rename Instrinsic to Intrinsic	2022-04-25 18:13:23 +01:00
Simon Pilgrim	34e7243464	[DAG] Fold freeze(bitcast(x)) -> bitcast(freeze(x)) This is a very specific fold to fix an upstream poor codegen issue. InstCombine has the much more flexible pushFreezeToPreventPoisonFromPropagating but I don't think we're quite there with DAG/TLI handling for canCreateUndefOrPoison/isGuaranteedNotToBeUndefOrPoison value tracking yet. Fixes #54911 Differential Revision: https://reviews.llvm.org/D124185	2022-04-22 16:39:25 +01:00
Alexey Bataev	2cca53c815	[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer. We can process the long shuffles (working across several actual vector registers) in the best way if we take the actual register representation into account. We can build more correct representation of register shuffles, improve number of recognised buildvector sequences. Also, same function can be used to improve the cost model for the shuffles in future patches. Part of D100486 Differential Revision: https://reviews.llvm.org/D115653	2022-04-20 09:37:16 -07:00
Alexey Bataev	5f7ac15912	Revert "[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer." This reverts commit `2f49163b33` to fix a buildbot failure. Reported in https://lab.llvm.org/buildbot#builders/105/builds/24284	2022-04-20 06:35:55 -07:00
Alexey Bataev	2f49163b33	[DAG]Introduce llvm::processShuffleMasks and use it for shuffles in DAG Type Legalizer. We can process the long shuffles (working across several actual vector registers) in the best way if we take the actual register representation into account. We can build more correct representation of register shuffles, improve number of recognised buildvector sequences. Also, same function can be used to improve the cost model for the shuffles in future patches. Part of D100486 Differential Revision: https://reviews.llvm.org/D115653	2022-04-20 05:32:56 -07:00
Matt Arsenault	8591328e15	Intrinsics: Mark llvm.eh.sjlj.callsite argument as immarg The assert in SelectionDAG implies that it is	2022-04-19 21:04:33 -04:00
chenglin.bi	222adf338a	[Arch64][SelectionDAG] Add target-specific implementation of srem 1. X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first. 2. Add AArch64 faster path for SREM only pow2 case. Fix https://github.com/llvm/llvm-project/issues/54649 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122968	2022-04-19 02:49:42 +08:00
chenglin.bi	acfc025a72	Revert "[Arch64][SelectionDAG] Add target-specific implementation of srem" This reverts commit `9d9eddd3dd`.	2022-04-18 10:35:09 +08:00
chenglin.bi	9d9eddd3dd	[Arch64][SelectionDAG] Add target-specific implementation of srem X%C to the equivalent of X-X/C*C is not always fastest path if there is no SDIV pair exist. So check target have faster for srem only first. Add AArch64 faster path for SREM only pow2 case. Fix https://github.com/llvm/llvm-project/issues/54649 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122968	2022-04-16 12:29:11 +08:00
Craig Topper	c6dc229a6d	[DAGCombiner] Move call to hasOneUse after opcode checks. NFC Checking the opcode is cheap, counting the number of uses is not.	2022-04-15 17:02:16 -07:00
Craig Topper	a7b9d75e7a	[DAGCombiner] Move or/xor/and opcode check in ReduceLoadOpStoreWidth before hasOneUse check. hasOneUse is not cheap on nodes with chain results that might have many uses. By checking the opcode first, we can avoid a costly walk of the use list on nodes we aren't interested in. Found by investigating calls to hasNUsesOfValue from the example provided in D123857.	2022-04-15 16:38:27 -07:00
John Brawn	12c1022679	[AArch64] Lowering and legalization of strict FP16 For strict FP16 to work correctly needs some changes in lowering and legalization: * SelectionDAGLegalize::PromoteNode was missing handling for some strict fp opcodes. * Some of the custom lowering of strict fp operations needed to be adjusted to work with FP16. * Custom lowering needed to be added for round-to-int operations. With this, and the previous patches for the rest of the strict fp isel, we can set IsStrictFPEnabled = true. Differential Revision: https://reviews.llvm.org/D115620	2022-04-14 16:51:22 +01:00
Paul Walker	0c44115e51	[SVE] Add support for non-element-type sized scaling when lowering MGATHER/MSCATTER. The lowering code did not use the scale operand of MGATHER/MSCATTER nodes, but instead assumed scaled indices were always scaled based on the element type of the memory type. This patch adds the missing support by rewriting the nodes as unscaled variants. Differential Revision: https://reviews.llvm.org/D123670	2022-04-14 11:54:46 +01:00
Simon Pilgrim	fef221bf1f	[DAG] Enable SimplifyVBinOp folds on add/sub sat intrinsics	2022-04-13 12:53:23 +01:00
Jonas Paulsson	46f83caebc	[InlineAsm] Add support for address operands ("p"). This patch adds support for inline assembly address operands using the "p" constraint on X86 and SystemZ. This was in fact broken on X86 (see example at https://reviews.llvm.org/D110267, Nov 23). These operands should probably be treated the same as memory operands by CodeGenPrepare, which have been commented with "TODO" there. Review: Xiang Zhang and Ulrich Weigand Differential Revision: https://reviews.llvm.org/D122220	2022-04-13 12:50:21 +02:00
Simon Pilgrim	cfb3ee2185	[DAG] Add non-uniform vector support to (shl (srl x, c1), c2) -> (and (shift x, c3)) Another part of D77804 yak shaving Differential Revision: https://reviews.llvm.org/D123523	2022-04-13 11:37:33 +01:00
Simon Pilgrim	bc32a1dd76	[DAG] Add non-uniform vector support to (shl (sr[la] exact X, C1), C2) folds	2022-04-12 12:57:56 +01:00
Craig Topper	35be4a7af3	[SelectionDAG] Remove unnecessary null check after call to getNode. NFC As far as I know getNode will never return a null SDValue. I'm guessing this was modeled after the FoldConstantArithmetic call earlier. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D123550	2022-04-11 18:03:44 -07:00
Craig Topper	2ce2562876	[RISCV][SelectionDAG] Add a hook to sign extend i32 ConstantInt operands of phis on RV64. Materializing constants on RISCV is simpler if the constant is sign extended from i32. By default i32 constant operands of phis are zero extended. This patch adds a hook to allow RISCV to override this for i32. We have an existing isSExtCheaperThanZExt, but it operates on EVT which we don't have at these places in the code. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D122951	2022-04-11 14:38:39 -07:00
Craig Topper	28cb508195	[TargetLowering][RISCV] Allow truncation when checking if the arguments of a setcc are splats. We're just trying to canonicalize here and won't be using the constant value returned. The attached test changes are because we were previously commuting a seteq X, (splat_vector 0) because we also have (sub 0, X). The 0 is larger than the element type so we don't detect it as a splat without the AllowTruncation flag. By preventing the commute we are able to match it to the vmseq.vx instruction during isel. We only look for constants on the RHS in isel. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D123256	2022-04-11 09:49:36 -07:00
Sanjay Patel	2ed15984b4	[SDAG] try to reduce compare of funnel shift equal 0 fshl (or X, Y), X, C ==/!= 0 --> or (shl Y, C), X ==/!= 0 fshl X, (or X, Y), C ==/!= 0 --> or (srl Y, BW-C), X ==/!= 0 This is similar to an existing setcc-of-rotate fold, but the matching requires more checks for the more general funnel op: https://alive2.llvm.org/ce/z/Ab2jDd We are effectively decomposing the funnel shift into logical shifts, reassociating, and removing a shift. This should get us the final improvements for x86-64 that were originally shown in D111530 ( https://github.com/llvm/llvm-project/issues/49541 ); x86-32 still shows some SHLD/SHRD, so the pattern is not matching there yet. Differential Revision: https://reviews.llvm.org/D122919	2022-04-11 07:44:58 -04:00
Tim Northover	6c85668d28	Tail calls: look through AssertZExt to find register copy. arm64_32 guarantees the high 32 bits of pointer parameters are passed as 0, and this is modelled in the IR by inserting an AssertZExt after the CopyFromReg. The function deciding whether registers that need to be preserved actually are wasn't expecting this so it banned perfectly legitimate tail calls.	2022-04-11 12:24:47 +01:00
Fraser Cormack	18106b99f0	[VP] Explicitly map from VP intrinsic to ISD opcode This patch aims to overcome an issue in these mappings where, when an ISD node was registered with BEGIN_REGISTER_VP_SDNODE but outwidth the scope of a pair of BEGIN_REGISTER_VP_INTRINSIC/END_REGISTER_VP_INTRINSIC macros, the switch cases fell apart. This in particular happened with VP_SETCC, where we'd end up with something along the lines of: case Intrinsic::vp_fcmp: break; case Intrinsic::vp_icmp: break; ResOpc = ISD::VP_SETCC; case Intrinsic::vp_store: ... To remedy this, we introduce a special-purpose mapping macro which can map any number of VP intrinsic opcodes to an ISD opcode. As a result, we no longer need to special-case the mapping from vp.icmp and vp.fcmp to VP_SETCC, as the new helper macro does it for us. Thanks to @craig.topper for noticing this and to @rogfer01 for the idea. Reviewed By: rogfer01 Differential Revision: https://reviews.llvm.org/D123324	2022-04-08 12:30:22 +01:00
Fraser Cormack	8216255c9f	[RISCV][VP] Add basic RVV codegen for vp.fcmp This patch adds the necessary infrastructure to lower vp.fcmp via ISD::VP_SETCC to RVV instructions. Most notably this patch adds cond-code legalization for VP_SETCC, reusing the existing TargetLowering::LegalizeSetCCCondCode by passing in additional SDValue parameters for the Mask and EVL. This method then uses VP operations to legalize the condcode. There is still a general lack of canonicalization on VP_SETCC as opposed to SETCC which results in worse code than is theoretically possible. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D123051	2022-04-07 09:16:07 +01:00
Craig Topper	bdb1ab9804	[LegalizeTypes][VP] Use LoVT/HiVT when splitting VP operations in SplitVecRes_UnaryOp. The VP path was using the split source VTs instead of the split destination VTs. This may not be a problem today because the VP nodes going through this have the same source and dest VTs. It will be a problem when we start using this function for legalizing VP cast operations.	2022-04-06 10:51:49 -07:00
Daniil Kovalev	62a983ebc5	Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if" This reverts commit `83a798d4b0`. As discussed in D120714 with @thakis, the patch added unneeded complexity without noticeable benefits.	2022-04-06 20:32:53 +03:00
Craig Topper	8fc19185e3	[LegalizeTypes] Move SplitVecRes_VECTOR_REVERSE/VECTOR_SPLICE near other SplitVecRes methods. NFC This file is divided into sections for different legalization actions. We should keep similar methods together.	2022-04-06 10:29:32 -07:00
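
Note on faa35fc873 (rot(rot(x,c1),c2) -> rot(x,c1+c2)): the message says the rotation amounts must be normalized during constant folding, and with UREM rather than SREM. The following is a small arithmetic model of that point in Python, not the DAGCombiner code. The 8-bit width and the helper names (rotl, urem, srem, to_signed, fold_rot_rot) are assumptions made for illustration only.

```python
BITS = 8  # assumed width for the illustration


def rotl(x, amt, bits=BITS):
    """Rotate the bits-wide value x left by amt (amt is reduced modulo bits)."""
    mask = (1 << bits) - 1
    x &= mask
    amt %= bits
    return ((x << amt) | (x >> (bits - amt))) & mask


def urem(a, b):
    """Unsigned remainder: both operands are non-negative bit patterns."""
    return a % b


def srem(a, b):
    """Signed remainder that truncates toward zero, as in C."""
    r = abs(a) % abs(b)
    return -r if a < 0 else r


def to_signed(v, bits=BITS):
    """Reinterpret an unsigned bit pattern as a two's-complement value."""
    return v - (1 << bits) if v >> (bits - 1) else v


def fold_rot_rot(c1, c2, bits=BITS):
    """Combined amount for rot(rot(x, c1), c2), normalizing every step with urem."""
    return urem(urem(c1, bits) + urem(c2, bits), bits)


# An amount stored as the bit pattern 0xFF is 255 unsigned but -1 signed.
# urem gives a valid rotation amount; srem gives a negative one.
print(urem(0xFF, BITS))             # unsigned view
print(srem(to_signed(0xFF), BITS))  # signed view
```

With unnormalized inputs such as c1 = 13 and c2 = 255, the folded amount still describes the same rotation as applying the two rotates one after the other.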

12024 Commits
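
Note on 2171a896ed (A and B&~A in haveNoCommonBitsSet()): the message relies on the fact that when two operands share no set bits, an add produces no carries and can be replaced by an or. The following is an exhaustive check of that identity over 8-bit values in Python. It is a sketch of the arithmetic only; the function names and the 8-bit width are assumptions and do not correspond to the SelectionDAG implementation.

```python
MASK = 0xFF  # assumed 8-bit values for the illustration


def have_no_common_bits_set(a, b):
    """True when no bit position is set in both a and b."""
    return (a & b) == 0


def add_equals_or_everywhere():
    """Check A + (B & ~A) == A | (B & ~A) == A | B for every 8-bit A and B."""
    for a in range(MASK + 1):
        for b in range(MASK + 1):
            rhs = b & ~a & MASK  # B with every bit of A cleared
            if not have_no_common_bits_set(a, rhs):
                return False
            if ((a + rhs) & MASK) != (a | rhs):
                return False
            if (a | rhs) != (a | b):
                return False
    return True


print(add_equals_or_everywhere())
```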