llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	d6fe8d37c6	[DAG] Fold concat_vectors(concat_vectors(x,y),concat_vectors(a,b)) -> concat_vectors(x,y,a,b) Follow-up to D107068, attempt to fold nested concat_vectors/undefs, as long as both the vector and inner subvector types are legal. This exposed the same issue in ARM's MVE LowerCONCAT_VECTORS_i1 (raised as PR51365) and AArch64's performConcatVectorsCombine which both assumed concat_vectors only took 2 subvector operands. Differential Revision: https://reviews.llvm.org/D107597	2021-08-16 16:06:54 +01:00
Paul Walker	cd0e196413	[DAGCombiner] Stop visitEXTRACT_SUBVECTOR creating illegal BITCASTs post legalisation. visitEXTRACT_SUBVECTOR can sometimes create illegal BITCASTs when removing "redundant" INSERT_SUBVECTOR operations. This patch adds an extra check to ensure such combines only occur after operation legalisation if any resulting BITCAST is itself legal. Differential Revision: https://reviews.llvm.org/D108086	2021-08-15 18:25:49 +01:00
Luo, Yuanke	53642d5b80	[NFC] Fix the formula for reciprocal calculation. Differential Revision: https://reviews.llvm.org/D107713	2021-08-09 16:03:56 +08:00
Amara Emerson	2b067e3335	Change TargetLowering::canMergeStoresTo() to take a MF instead of DAG. DAG is unnecessary and we need this hook to implement store merging on GlobalISel too.	2021-08-06 12:57:53 -07:00
Craig Topper	f7076cfd3a	[DAGCombiner][RISCV][AMDGPU] Call SimplifyDemandedBits at the end of visitMULHU to enable known bits constant folding. We don't have real demanded bits support for MULHU, but we can still use the known bits based constant folding support at the end of SimplifyDemandedBits to simplify a MULHU. This helps with cases where we know the LHS and RHS have enough leading zeros so that the high multiply result is always 0. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D106471	2021-08-05 08:31:26 -07:00
Simon Pilgrim	2cbf9fd402	[DAG] DAGCombiner::visitVECTOR_SHUFFLE - recognise INSERT_SUBVECTOR patterns IR typically creates INSERT_SUBVECTOR patterns as a widening of the subvector with undefs to pad to the destination size, followed by a shuffle for the actual insertion - SelectionDAGBuilder has to do something similar for shuffles when source/destination vectors are different sizes. This combine attempts to recognize these patterns by looking for a shuffle of a subvector (from a CONCAT_VECTORS) that starts at a modulo of its size into an otherwise identity shuffle of the base vector. This uncovered a couple of target-specific issues as we haven't often created INSERT_SUBVECTOR nodes in generic code - aarch64 could only handle insertions into the bottom of undefs (i.e. a vector widening), and x86-avx512 vXi1 insertion wasn't keeping track of undef elements in the base vector. Fixes PR50053 Differential Revision: https://reviews.llvm.org/D107068	2021-08-05 15:40:48 +01:00
Craig Topper	c23405174a	[DAGCombiner][AMDGPU] Canonicalize constants to the RHS of MULHU/MULHS. This allows special constants like 0 to be recognized. It's also expected by isel patterns if a target had a mulh with immediate instructions. The commuting done by tablegen won't commute patterns with immediates since it expects DAGCombine to have done it. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D107486	2021-08-04 11:39:23 -07:00
Simon Pilgrim	11396641e4	[DAG] Cleanup DAGCombiner::CombineConsecutiveLoads early-outs. NFCI. We had some similar hasOneUse/isNON_EXTLoad early-outs spread out over different parts of the method - we should pull them all together. Noticed while triaging PR45116	2021-08-03 13:47:55 +01:00
Sanjay Patel	fa6b2c9915	[DAGCombiner] don't try to partially reduce add-with-overflow ops This transform was added with D58874, but there were no tests for overflow ops. We need to change this one way or another because it can crash as shown in: https://llvm.org/PR51238 Note that if there are no uses of an overflow op's bool overflow result, we reduce it to a regular math op, so we continue to fold that case either way. If we have uses of both the math and the overflow bool, then we are likely not saving anything by creating an independent sub instruction as seen in the test diffs here. This patch makes the behavior in SDAG consistent with what we do in instcombine AFAICT. Differential Revision: https://reviews.llvm.org/D106983	2021-07-29 08:51:54 -04:00
Juneyoung Lee	4f71f59bf3	[DAGCombiner] Fold SETCC(FREEZE(x),const) to FREEZE(SETCC(x,const)) if SETCC is used by BRCOND This patch adds a peephole optimization `SETCC(FREEZE(x),const)` => `FREEZE(SETCC(x,const))` if the SETCC is only used by BRCOND. Combined with `BRCOND(FREEZE(X)) => BRCOND(X)`, this leads to a nice improvement in the generated assembly when x is a masked loaded value. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D105344	2021-07-28 09:22:15 +09:00
Fraser Cormack	7b33b849bd	[SelectionDAG] Support scalable splats in U(ADD\|SUB)SAT combines This patch builds on top of D106575 in which scalable-vector splats were supported in `ISD::matchBinaryPredicate`. It teaches the DAGCombiner how to perform a variety of the pre-existing saturating add/sub combines on scalable-vector types. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106652	2021-07-27 10:52:34 +01:00
Fraser Cormack	f924a3d474	[SelectionDAG] Support scalable-vector splats in yet more cases This patch extends support for (scalable-vector) splats in the DAGCombiner via the `ISD::matchBinaryPredicate` function, which enable a variety of simple combines of constants. Users of this function may now have to distinguish between `BUILD_VECTOR` and `SPLAT_VECTOR` vector operands. The way of dealing with this in-tree follows the approach added for `ISD::matchUnaryPredicate` implemented in D94501. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D106575	2021-07-26 10:15:08 +01:00
Simon Pilgrim	c261a06b7a	[DAG] Add initial SelectionDAG::isGuaranteedNotToBeUndefOrPoison framework (PR51129) I've set up the basic framework for the isGuaranteedNotToBeUndefOrPoison call and updated DAGCombiner::visitFREEZE to use it; further opcodes can be handled when we have test coverage. I'm not aware of any vector test freeze coverage so the DemandedElts (and the Depth) args are not being used yet - but they are in place. SelectionDAG::isGuaranteedNotToBePoison wrappers have also been added. Differential Revision: https://reviews.llvm.org/D106668	2021-07-24 11:36:35 +01:00
David Truby	1528a4d400	[llvm][sve] Lowering for VLS truncating stores This adds custom lowering for truncating stores when operating on fixed length vectors in SVE. It also includes a DAG combine to fold extends followed by truncating stores into non-truncating stores in order to prevent this pattern appearing once truncating stores are supported. Currently truncating stores are not used in certain cases where the size of the vector is larger than the target vector width. Differential Revision: https://reviews.llvm.org/D104471	2021-07-23 14:04:55 +01:00
Paulo Matos	46667a1003	[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR Reland of `31859f896`. This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D104797	2021-07-22 22:07:24 +02:00
Eli Friedman	0ca46a1757	[SelectionDAG] Fix the representation of ISD::STEP_VECTOR. The existing rule about the operand type is strange. Instead, just say the operand is a TargetConstant with the right width. (Legalization ignores TargetConstants, so it doesn't matter if that width is legal.) Highlights: 1. I had to substantially rewrite the AArch64 isel patterns to expect a TargetConstant. Nothing too exotic, but maybe a little hairy. Maybe worth considering a target-specific node with some dagcombines instead of this complicated nest of isel patterns. 2. Our behavior on RV32 for vectors of i64 has changed slightly. In particular, we correctly preserve the width of the arithmetic through legalization. This changes the DAG a bit. Maybe room for improvement here. 3. I explicitly defined the behavior around overflow. This is necessary to make the DAGCombine transforms legal, and I don't think it causes any practical issues. Differential Revision: https://reviews.llvm.org/D105673	2021-07-21 10:58:40 -07:00
Amy Huang	fd972bb9fd	Revert "[llvm][sve] Lowering for VLS truncating stores" because it causes a seg fault (see https://reviews.llvm.org/D104471). This reverts commit `c305557acd`.	2021-07-19 11:03:33 -07:00
Simon Pilgrim	fd7a54c709	[DAG] DAGCombiner::foldSelectOfBinops - propagate the common flags to the merged binop As discussed on D106058 - we were failing to keep the common flags. This matches the behaviour in InstCombinerImpl::foldSelectOpOp.	2021-07-18 18:38:59 +01:00
Simon Pilgrim	5643be96bc	[DAG] Enable foldSelectOfBinops on select(setcc(),binop(),binop()) calls	2021-07-18 18:38:59 +01:00
Simon Pilgrim	1a6a8443c2	[DAG] Move select(cc, binop(), binop()) folds into DAGCombiner::foldSelectOfBinops. NFCI. I'm going to extend the functionality started in D106058 so move the folds into their own method to reduce the amount of code in DAGCombiner::visitSELECT	2021-07-18 14:54:41 +01:00
Simon Pilgrim	0aece73aba	[DAG] Fold select(cond,binop(x,y),binop(x,z)) -> binop(x,select(cond,y,z)) Similar to the folds performed in InstCombinerImpl::foldSelectOpOp, this attempts to push a select further up to help merge a pair of binops. I'm primarily interested in select(cond,add(x,y),add(x,z)) folds to help expose pointer math (see https://bugs.llvm.org/show_bug.cgi?id=51069 etc.) but I've tried to use the more generic isBinOp(). Differential Revision: https://reviews.llvm.org/D106058	2021-07-15 16:08:30 +01:00
Qiu Chaofan	954a15d639	[SelectionDAG] Check use before combining into USUBSAT Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D105789	2021-07-13 14:50:26 +08:00
David Truby	c305557acd	[llvm][sve] Lowering for VLS truncating stores This adds custom lowering for truncating stores when operating on fixed length vectors in SVE. It also includes a DAG combine to fold extends followed by truncating stores into non-truncating stores in order to prevent this pattern appearing once truncating stores are supported. Currently truncating stores are not used in certain cases where the size of the vector is larger than the target vector width. Differential Revision: https://reviews.llvm.org/D104471	2021-07-12 11:14:17 +01:00
David Green	4ce26deac2	[DAG] Reassociate Add with Or We already have reassociation code for Adds and Ors separately in DAG combiner; this adds it for the combination of the two where Ors act like Adds. It reassociates (add (or x, c), y) -> (add (add x, y), c) where we know that the Or's operands have no common bits set, and the Or has one use. Differential Revision: https://reviews.llvm.org/D104765	2021-07-07 10:21:07 +01:00
David Stuttard	83cb9632a1	[DAGCombiner] Add support for mulhi const folding in DAGCombiner Differential Revision: https://reviews.llvm.org/D103323 Change-Id: I4ffaaa32301795ba8a339567a68e77fe0862b869	2021-07-05 12:01:26 +01:00
Paul Walker	287d39dd5a	[NFC] Fix a few whitespace issues and typos.	2021-07-04 11:49:58 +01:00
Craig Topper	af331e8284	[SelectionDAG] Rename memory VT argument for getMaskedGather/getMaskedScatter from VT to MemVT. Use getMemoryVT() in MGATHER/MSCATTER DAG combines instead of using the passthru or store value VT for this argument.	2021-07-02 17:37:40 -07:00
Roman Lebedev	c2c0d3ea89	Revert "[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR" This reverts commit `4facbf213c`. ``` ****************** FAIL: LLVM :: CodeGen/WebAssembly/funcref-call.ll (44466 of 44468) **************** TEST 'LLVM :: CodeGen/WebAssembly/funcref-call.ll' FAILED ****************** Script: -- : 'RUN: at line 1'; /builddirs/llvm-project/build-Clang12/bin/llc < /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll --mtriple=wasm32-unknown-unknown -asm-verbose=false -mattr=+reference-types \| /builddirs/llvm-project/build-Clang12/bin/FileCheck /repositories/llvm-project/llvm/test/CodeGen/WebAssembly/funcref-call.ll -- Exit Code: 2 Command Output (stderr): -- llc: /repositories/llvm-project/llvm/include/llvm/Support/LowLevelTypeImpl.h:44: static llvm::LLT llvm::LLT::scalar(unsigned int): Assertion `SizeInBits > 0 && "invalid scalar size"' failed. ```	2021-07-02 11:49:51 +03:00
Paulo Matos	4facbf213c	[WebAssembly] Implementation of global.get/set for reftypes in LLVM IR Reland of `31859f896`. This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Differential Revision: https://reviews.llvm.org/D104797	2021-07-02 09:46:28 +02:00
David Green	2887f14639	[ISel] Port AArch64 SABD and UABD to DAGCombine This ports the AArch64 SABD and UABD over to DAG Combine, where they can be used by more backends (notably MVE in a follow-up patch). The matching code has changed very little, just to handle legal operations and types differently. It selects from (ABS (SUB (EXTEND a), (EXTEND b))), producing an abds/abdu which is zexted to the original type. Differential Revision: https://reviews.llvm.org/D91937	2021-06-26 19:34:16 +01:00
David Green	b8c8bb0769	[DAG] Fold neg(splat(neg(x))) -> splat(x) This adds a fold of sub(0, splat(sub(0, x))) -> splat(x). This can come up in the lowering of right shifts under AArch64, where we generate a shift left of a negated number. Differential Revision: https://reviews.llvm.org/D103755	2021-06-25 19:53:29 +01:00
Jinsong Ji	c125af82a5	[DAGCombine] Check reassoc flags in aggressive fsub fusion This is from the discussion in https://reviews.llvm.org/D104247#inline-993387 The contract and reassoc flags shouldn't imply each other. All the aggressive fsub fusions reassociate operations, so we should guard them with a reassoc flag check. Reviewed By: mcberg2017 Differential Revision: https://reviews.llvm.org/D104723	2021-06-23 13:59:40 +00:00
Jinsong Ji	3996311ee1	[DAGCombine] reassoc flag shouldn't enable contract According to IR LangRef, the FMF flag: contract Allow floating-point contraction (e.g. fusing a multiply followed by an addition into a fused multiply-and-add). reassoc Allow reassociation transformations for floating-point instructions. This may dramatically change results in floating-point. My understanding is that these two flags shouldn't imply each other, as we might have a SDNode that can be reassociated with others, but is not contractible. e.g. We may want the following fmul/fadd/fsub to freely reassoc, but don't want fma being generated here. %F = fmul reassoc double %A, %B ; <double> [#uses=1] %G = fmul reassoc double %C, %D ; <double> [#uses=1] %H = fadd reassoc double %F, %G ; <double> [#uses=1] %I = fsub reassoc double %H, %E ; <double> [#uses=1] Before https://reviews.llvm.org/D45710, `reassoc` flag actually did not imply isContratable either. The current implementation also only checks the flag in the fadd node, ignoring the fmul node; this patch updates that as well. Reviewed By: spatel, qiucf Differential Revision: https://reviews.llvm.org/D104247	2021-06-21 21:15:43 +00:00
Saleem Abdulrasool	5b5833b9e0	SelectionDAG: repair the Windows build `6e5628354e` regressed the Windows build as the return type no longer matched in both branches for the return value type deduction. This uses a bit more compiler magic to deal with that.	2021-06-14 08:25:36 -07:00
Roman Lebedev	0f94c3c80d	[NFC][DAGCombine] Extract getFirstIndexOf() lambda back into a function Not all supported compilers like such lambdas, at least one buildbot is unhappy.	2021-06-14 16:25:59 +03:00
Roman Lebedev	6e5628354e	[DAGCombine] reduceBuildVecToShuffle(): sort input vectors by decreasing size The sorting, obviously, must be stable, else we will have random assembly fluctuations. Apparently there was no test coverage that would benefit from that, so I've added one test. The sorting consists of two parts - just sort the input vectors, and recompute the shuffle mask -> input vector mapping. I don't believe we need to do anything else. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D104187	2021-06-14 16:18:37 +03:00
Carl Ritson	cfbb92441f	[SDAG] Fix pow2 assumption when splitting vectors When reducing vector builds to shuffles it is possible that the DAG combiner may try to extract invalid subvectors. This happens as the existing code assumes vectors will be power of 2 sizes, which is already untrue, but becomes more noticeable with v6 and v7 types. Specifically the existing code assumes that half PowerOf2Ceil of a given vector index will fit twice into a given vector. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D103880	2021-06-11 08:58:16 +09:00
David Spickett	64de8763aa	Revert "Implementation of global.get/set for reftypes in LLVM IR" This reverts commit `31859f896c`. Causing SVE and RISCV-V test failures on bots.	2021-06-10 10:11:17 +00:00
Paulo Matos	31859f896c	Implementation of global.get/set for reftypes in LLVM IR This change implements new DAG nodes GLOBAL_GET/GLOBAL_SET, and lowering methods for loads and stores of reference types from IR globals. Once the lowering creates the new nodes, tablegen pattern matches those and converts them to Wasm global.get/set. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D95425	2021-06-10 10:07:45 +02:00
Sanjay Patel	dd763ac791	[SDAG] fix miscompile from merging stores of different sizes As shown in: https://llvm.org/PR50623 ...and the similar tests here, we were not accounting for store merging of different sizes that do not cover the entire range of the wide value to be stored. This is the easy fix: just make sure that all of the original stores are the same size, so when we calculate the wide width, it's a simple N * M check. This still allows all of the motivating optimizations from: D86420 / `54a5dd485c` D87112 / `7a06b166b1` We could enhance this code to track individual bytes and allow merging multiple sizes.	2021-06-09 09:51:39 -04:00
Simon Pilgrim	61a2d6bfe4	[DAG] foldShuffleOfConcatUndefs - ensure shuffles of upper (undef) subvector elements is undef (PR50609) shuffle(concat(x,undef),concat(y,undef)) -> concat(shuffle(x,y),shuffle(x,y)) If the original shuffle references any of the upper (undef) subvector elements, ensure the split shuffle masks use undef instead of an out-of-bounds value. Fixes PR50609	2021-06-08 15:49:41 +01:00
Sanjay Patel	0718ac706d	[SDAG] allow cast folding for vector sext-of-setcc with signed compare This extends `434c8e013a` and `ede3982792` to handle signed predicates by sign-extending the setcc operands. This is not shown directly in https://llvm.org/PR50055 , but the pattern is visible by changing the unsigned convert to signed in the source code.	2021-06-02 15:05:02 -04:00
Sanjay Patel	ede3982792	[SDAG] allow more cast folding for vector sext-of-setcc This is a follow-up to D103280 that eases the use restrictions, so we can handle the motivating case from: https://llvm.org/PR50055 The loop code is adapted from similar use checks in ExtendUsesToFormExtLoad() and SliceUpLoad(). I did not see an easier way to filter out non-chain uses of load values. Differential Revision: https://reviews.llvm.org/D103462	2021-06-02 13:14:49 -04:00
Sanjay Patel	1b14f3951a	[SDAG] add helper function for sext-of-setcc folds; NFC Try to make this easier to read as noted in D103280	2021-06-01 08:07:17 -04:00
Sanjay Patel	63fe4cb082	[SDAG] add check to sext-of-setcc fold to bypass changing a legal op I accidentally pushed a draft of D103280 that was discussed during the review, but it was not supposed to be the final version. Rather than revert and recommit, I'm updating the existing code. This way we have a record of the codegen diff that would result if we decide to remove this predicate in the future.	2021-05-31 08:58:11 -04:00
Sanjay Patel	434c8e013a	[SDAG] try harder to fold casts into vector compare sext (vsetcc X, Y) --> vsetcc (zext X), (zext Y) -- (when the zexts are free and a bunch of other conditions) We have a couple of similar folds to this already for vector selects, but this pattern slips through because it is only a setcc. The tests are based on the motivating case from: https://llvm.org/PR50055 ...but we need extra logic to get that example, so I've left that as a TODO for now. Differential Revision: https://reviews.llvm.org/D103280	2021-05-31 07:14:01 -04:00
Florian Hahn	126f90b252	[DAGCombine] Poison-prove scalarizeExtractedVectorLoad. extractelement is poison if the index is out-of-bounds, so just scalarizing the load may introduce an out-of-bounds load, which is UB. To avoid introducing new UB, we can mask the index so it only contains valid indices. Fixes PR50382. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D103077	2021-05-30 11:40:55 +01:00
Fraser Cormack	b7101e218c	[DAGCombine][RISCV] Don't try to trunc-store combined vector stores DAGCombine's `mergeStoresOfConstantsOrVecElts` optimization is told whether it's to use vector types and also whether it's to issue a truncating store. However, the truncating store code path assumes a scalar integer `ConstantSDNode`, and when using vector types it creates either a `BUILD_VECTOR` or `CONCAT_VECTORS` to store: neither of which is a constant. The `riscv64` target is able to expose a crash here because it switches on both code paths at the same time. The `f32` is stored as `i32` which must be promoted to `i64`, necessitating a truncating store. It also decides later that it prefers a vector store of `v2f32`. While vector truncating stores are legal, this combine is not able to emit them. We also don't have a test case. This patch adds an assert to catch this case more gracefully, and updates one of the function's callers to turn off the use of truncating stores when preferring vectors. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D103173	2021-05-27 14:16:32 +01:00
Fraser Cormack	85e31eddf2	[DAGCombiner] Relax an assertion to an early return The select-of-constants transform was asserting that its constant vector inputs did not implicitly truncate their input without that as an explicit precondition to the function. This patch relaxes that assertion into an early return to skip the optimization. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D102393	2021-05-17 09:15:55 +01:00
Sanjay Patel	9dfd7f9b67	[SDAG] reduce code duplication for extend_vec_inreg combines; NFC These are identical so far, and I was looking at adding a fold for a pattern with scalar_to_vector which would also end up duplicated.	2021-05-14 08:29:57 -04:00


3074 Commits