llvm-project

Commit Graph

Author	SHA1	Message	Date
Sanjay Patel	870591200d	[SDAG] remove duplicate functionality when getting shift type for demanded bits; NFCI This was noted as a potential cleanup in D117508. getShiftAmountTy() has checks for vector, phase, etc. so it should handle anything that the caller was trying to account for.	2022-01-18 12:13:45 -05:00
David Sherwood	f4515ab858	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `197f3c0deb`. Reverting after miscompilation errors discovered with ffmpeg.	2022-01-18 08:40:20 +00:00
Sanjay Patel	ba6485e25f	[SDAG] add demanded bits transform for bswap A possible codegen regression for PowerPC is noted in D117406 because we don't recognize a pattern that demands only 1 byte from a bswap. This fold has existed in IR since close to the beginning of LLVM: https://github.com/llvm/llvm-project/blame/main/llvm/lib/Transforms/InstCombine/InstCombineSimplifyDemanded.cpp#L794 ...so this patch copies that code as much as possible and adapts it for SDAG. The test for PowerPC that would change in D117406 is over-reduced with undefs, so I recreated it for AArch64 and x86 by passing in pointer args and renamed the values to make the logic clearer. Differential Revision: https://reviews.llvm.org/D117508	2022-01-17 18:25:42 -05:00
David Sherwood	197f3c0deb	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-17 11:08:57 +00:00
Nikita Popov	c63a3175c2	[AttrBuilder] Remove ctor accepting AttributeList and Index Use the AttributeSet constructor instead. There's no good reason why AttrBuilder itself should exact the AttributeSet from the AttributeList. Moving this out of the AttrBuilder generally results in cleaner code.	2022-01-15 22:39:31 +01:00
David Sherwood	ba471ba8d2	Revert "[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants" This reverts commit `31009f0b5a`. It seems to be causing SVE VLA buildbot failures and has introduced a genuine regression. Reverting for now.	2022-01-13 15:59:43 +00:00
David Sherwood	31009f0b5a	[CodeGen][AArch64] Ensure isSExtCheaperThanZExt returns true for negative constants When we know the value we're extending is a negative constant then it makes sense to use SIGN_EXTEND because this may improve code quality in some cases, particularly when doing a constant splat of an unpacked vector type. For example, for SVE when splatting the value -1 into all elements of a vector of type <vscale x 2 x i32> the element type will get promoted from i32 -> i64. In this case we want the splat value to sign-extend from (i32 -1) -> (i64 -1), whereas currently it zero-extends from (i32 -1) -> (i64 0xFFFFFFFF). Sign-extending the constant means we can use a single mov immediate instruction. New tests added here: CodeGen/AArch64/sve-vector-splat.ll I believe we see some code quality improvements in these existing tests too: CodeGen/AArch64/dag-numsignbits.ll CodeGen/AArch64/reduce-and.ll CodeGen/AArch64/unfold-masked-merge-vector-variablemask.ll The apparent regressions in CodeGen/AArch64/fast-isel-cmp-vec.ll only occur because the test disables codegen prepare and branch folding. Differential Revision: https://reviews.llvm.org/D114357	2022-01-13 09:43:07 +00:00
Nick Desaulniers	4edb9983cb	[SelectionDAG] treat X constrained labels as i for asm Completely rework how we handle X constrained labels for inline asm. X should really be treated as i. Then existing tests can be moved to use i D115410 and clang can just emit i D115311. (D115410 and D115311 are callbr, but this can be done for label inputs, too). Coincidentally, this simplification solves an ICE uncovered by D87279 based on assumptions made during D69868. This is the third approach considered. See also discussions v1 (D114895) and v2 (D115409). Reported-by: kernel test robot <lkp@intel.com> Fixes: https://github.com/ClangBuiltLinux/linux/issues/1512 Reviewed By: void, jyknight Differential Revision: https://reviews.llvm.org/D115688	2022-01-11 10:29:40 -08:00
Nick Desaulniers	649b11ef8b	git-clang-format HEAD~	2022-01-10 18:34:30 -08:00
Nick Desaulniers	301e911740	[TargetLowering] precommit refactor from D115688 NFC Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>	2022-01-10 18:32:13 -08:00
Serge Guelton	d2cc6c2d0c	Use a sorted array instead of a map to store AttrBuilder string attributes Using and std::map<SmallString, SmallString> for target dependent attributes is inefficient: it makes its constructor slightly heavier, and involves extra allocation for each new string attribute. Storing the attribute key/value as strings implies extra allocation/copy step. Use a sorted vector instead. Given the low number of attributes generally involved, this is cheaper, as showcased by https://llvm-compile-time-tracker.com/compare.php?from=5de322295f4ade692dc4f1823ae4450ad3c48af2&to=05bc480bf641a9e3b466619af43a2d123ee3f71d&stat=instructions Differential Revision: https://reviews.llvm.org/D116599	2022-01-10 14:49:53 +01:00
Nikita Popov	0312fe2901	[CodeGen] Support opaque pointers for inline asm This is the last part of D116531. Fetch the type of the indirect inline asm operand from the elementtype attribute, rather than the pointer element type. Fixes https://github.com/llvm/llvm-project/issues/52928.	2022-01-07 10:57:38 +01:00
Kazu Hirata	2aed08131d	[llvm] Use true/false instead of 1/0 (NFC) Identified with modernize-use-bool-literals.	2022-01-07 00:39:14 -08:00
Simon Pilgrim	882c083889	[DAG] TargetLowering::SimplifySetCC - use APInt::getMinSignedBits() helper. NFC.	2022-01-04 13:48:36 +00:00
Craig Topper	cbcbbd6ac8	[ValueTracking][SelectionDAG] Rename ComputeMinSignedBits->ComputeMaxSignificantBits. NFC This function returns an upper bound on the number of bits needed to represent the signed value. Use "Max" to match similar functions in KnownBits like countMaxActiveBits. Rename APInt::getMinSignedBits->getSignificantBits. Keeping the old name around to keep this patch size down. Will do a bulk rename as follow up. Rename KnownBits::countMaxSignedBits->countMaxSignificantBits. Reviewed By: lebedev.ri, RKSimon, spatel Differential Revision: https://reviews.llvm.org/D116522	2022-01-03 11:33:30 -08:00
Craig Topper	1c6b740d4b	[TargetLowering] Remove workaround for old behavior of getShiftAmountTy. NFC getShiftAmountTy used to directly return the shift amount type from the target which could be too small for large illegal types. For example, X86 always returns i8. The code here detected this and used i32 instead if it won't fit. This behavior was added to getShiftAmountTy in D112469 so we no longer need this workaround.	2021-12-28 14:08:25 -08:00
Simon Pilgrim	71fc4bbdd2	[X86][SSE] Add ISD::ROTR support Fix issue in TargetLowering::expandROT where we only attempt to flip a rotation if the other direction has better support - this matches TargetLowering::expandFunnelShift This allows us to enable ISD::ROTR lowering on SSE targets, which particularly simplifies/improves codegen for splat amount and AVX2 per-element shifts.	2021-12-23 15:07:30 +00:00
Simon Pilgrim	4639461531	[DAG][X86] Add TargetLowering::isSplatValueForTargetNode override Add callback to enable us to test target nodes if they are splat vectors Added some basic X86ISD::VBROADCAST + X86ISD::VBROADCAST_LOAD handling	2021-12-22 16:57:44 +00:00
Simon Pilgrim	52d2f35323	[DAG] Update expandFunnelShift/expandROT to return the expansion directly. NFCI. Don't return a bool to indicate if the expansion was successful, just return the SDValue result directly, like we do for most other basic expansions.	2021-12-07 18:09:43 +00:00
Kazu Hirata	630c847b1b	[llvm] Use range-based for loops (NFC)	2021-12-07 09:17:03 -08:00
David Green	255ad73424	[ARM] Make MVE v2i1 predicates legal MVE can treat v16i1, v8i1, v4i1 and v2i1 as different views onto the same 16bit VPR.P0 register, with v2i1 holding two 8 bit values for the two halves. This was never treated as a legal type in llvm in the past as there are not many 64bit instructions and no 64bit compares. There are a few instructions that could use it though, notably a VSELECT (as it can handle any size using the underlying v16i8 VPSEL), AND/OR/XOR for similar reasons, some gathers/scatter and long multiplies and VCTP64 instructions. This patch goes through and makes v2i1 a legal type, handling all the cases that fall out of that. It also makes VSELECT legal for v2i64 as a side benefit. A lot of the codegen changes as a result - usually in way that is a little better or a little worse, but still expensive. Costs can change a little too in the process, again in a way that expensive things remain expensive. A lot of the tests that changed are mainly to ensure correctness - the code can hopefully be improved in the future where it comes up in practice. The intrinsics currently remain using the v4i1 they previously did to emulate a v2i1. This will be changed in a followup patch but this one was already large enough. Differential Revision: https://reviews.llvm.org/D114449	2021-12-03 14:05:41 +00:00
Simon Pilgrim	6803d08c38	[DAG][PowerPC] Enable initial ISD::BITCAST SimplifyDemandedBits/SimplifyMultipleUseDemandedBits big-endian handling This patch begins extending handling for peeking through bitcast nodes to big-endian targets as well as the existing little-endian case. Differential Revision: https://reviews.llvm.org/D114676	2021-12-02 11:47:53 +00:00
Nikita Popov	bfa91f38a9	[DAG] Restore dropped condition This was dropped in `fcee33bd5a`, presumably accidentally.	2021-11-26 21:18:54 +01:00
Simon Pilgrim	fcee33bd5a	[DAG] Pull out repeated isLittleEndian() calls. NFC.	2021-11-26 18:41:56 +00:00
Simon Pilgrim	2778f9a9f6	[DAG] SimplifyDemandedVectorElts - attempt to handle ADD(x,x) as single use If the ADD node is the only user of the repeated operand, then treat this as single use - allows us to peek through shl(x,1) patterns.	2021-11-26 10:32:10 +00:00
Simon Pilgrim	63b1e58f07	[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl (REAPPLIED) If we only demand bits from one half of a rotation pattern, see if we can simplify to a logical shift. For the ARM/AArch64 rev16/32 patterns, I had to drop a fold to prevent srl(bswap()) -> rotr(bswap) -> srl(bswap) infinite loops. I've replaced this with an isel PatFrag which should do the same task. Reapplied with fix for AArch64 rev patterns to matching the ARM fix. https://alive2.llvm.org/ce/z/iroxki (rol -> shl by amt iff demanded bits has at least as many trailing zeros as the shift amount) https://alive2.llvm.org/ce/z/4ez_U- (ror -> shl by revamt iff demanded bits has at least as many trailing zeros as the reverse shift amount) https://alive2.llvm.org/ce/z/cD7dR- (ror -> lshr by amt iff demanded bits has at least as many leading zeros as the shift amount) https://alive2.llvm.org/ce/z/_XGHtQ (rol -> lshr by revamt iff demanded bits has at least as many leading zeros as the reverse shift amount) Differential Revision: https://reviews.llvm.org/D114354	2021-11-25 11:14:15 +00:00
Zarko Todorovski	95875d246a	[LLVM][NFC]Inclusive language: remove occurances of sanity check/test from llvm Part of work to use more inclusive language in clang/llvm. Rewording some comments and change function and variable names.	2021-11-24 17:29:55 -05:00
Benjamin Kramer	d32787230d	Revert "[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl" This reverts commit `3cf4a2c620`. It makes llc hang on the following test case. ``` target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128" target triple = "aarch64-unknown-linux-gnu" define dso_local void @_PyUnicode_EncodeUTF16() local_unnamed_addr #0 { entry: br label %while.body117.i while.body117.i: ; preds = %cleanup149.i, %entry %out.6269.i = phi i16* [ undef, %cleanup149.i ], [ undef, %entry ] %0 = load i16, i16* undef, align 2 %1 = icmp eq i16 undef, -10240 br i1 %1, label %fail.i, label %cleanup149.i cleanup149.i: ; preds = %while.body117.i %or130.i = call i16 @llvm.bswap.i16(i16 %0) #2 store i16 %or130.i, i16* %out.6269.i, align 2 br label %while.body117.i fail.i: ; preds = %while.body117.i ret void } ; Function Attrs: nofree nosync nounwind readnone speculatable willreturn declare i16 @llvm.bswap.i16(i16) #1 attributes #0 = { "target-features"="+neon,+v8a" } attributes #1 = { nofree nosync nounwind readnone speculatable willreturn } attributes #2 = { mustprogress nofree norecurse nosync nounwind readnone uwtable willreturn "frame-pointer"="non-leaf" "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-cpu"="generic" "target-features"="+neon,+v8a" } ```	2021-11-24 14:42:54 +01:00
Simon Pilgrim	3cf4a2c620	[DAG] SimplifyDemandedBits - simplify rotl/rotr to shl/srl If we only demand bits from one half of a rotation pattern, see if we can simplify to a logical shift. For the ARM rev16 patterns, I had to drop a fold to prevent srl(bswap()) -> rotr(bswap) -> srl(bswap) infinite loops. I've replaced this with an isel PatFrag which should do the same task. https://alive2.llvm.org/ce/z/iroxki (rol -> shl by amt iff demanded bits has at least as many trailing zeros as the shift amount) https://alive2.llvm.org/ce/z/4ez_U- (ror -> shl by revamt iff demanded bits has at least as many trailing zeros as the reverse shift amount) https://alive2.llvm.org/ce/z/cD7dR- (ror -> lshr by amt iff demanded bits has at least as many leading zeros as the shift amount) https://alive2.llvm.org/ce/z/_XGHtQ (rol -> lshr by revamt iff demanded bits has at least as many leading zeros as the reverse shift amount) Differential Revision: https://reviews.llvm.org/D114354	2021-11-24 11:28:35 +00:00
Eric Tang	9fe6b9e802	[TargetLowering][RISCV] Fixed a scalable vector issue when lowering [s\|u]mul.overflow intrinsics Fixed the vector type issue that where we used getVectorNumElements() should be replaced by getVectorElementCount() when lowering these intrinsics. This is similar to D94149 Signed-off-by: Eric Tang <tangxingxin1008@gmail.com> Reviewed By: craig.topper, frasercrmck Differential Revision: https://reviews.llvm.org/D109809	2021-11-18 10:16:08 +00:00
Craig Topper	0274be28d7	[RISCV] Lower vector CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF by converting to FP and extracting the exponent. If we have a large enough floating point type that can exactly represent the integer value, we can convert the value to FP and use the exponent to calculate the leading/trailing zeros. The exponent will contain log2 of the value plus the exponent bias. We can then remove the bias and convert from log2 to leading/trailing zeros. This doesn't work for zero since the exponent of zero is zero so we can only do this for CTLZ_ZERO_UNDEF/CTTZ_ZERO_UNDEF. If we need a value for zero we can use a vmseq and a vmerge to handle it. We need to be careful to make sure the floating point type is legal. If it isn't we'll continue using the integer expansion. We could split the vector and concatenate the results but that needs some additional work and evaluation. Differential Revision: https://reviews.llvm.org/D111904	2021-11-17 10:29:41 -08:00
Simon Pilgrim	5fedbd5b18	[DAG] SimplifyDemandedVectorElts - zero_extend_vector_inreg(and(x,c)) -> and(x,c') If we've only demanded the 0'th element, and it comes from a (one-use) AND, try to convert the zero_extend_vector_inreg into a mask and constant fold it with the AND.	2021-11-17 12:41:48 +00:00
Simon Pilgrim	f059b04f7b	[DAG] Add SelectionDAG::ComputeMinSignedBits helper As suggested on D113371, this adds a wrapper to SelectionDAG::ComputeNumSignBits, similar to the llvm::ComputeMinSignedBits wrapper. I've included some usage, its not exhaustive, just the more obvious cases where the intention is obvious. Differential Revision: https://reviews.llvm.org/D113396	2021-11-08 14:12:45 +00:00
Craig Topper	04c184bba7	[TargetLowering] Simplify the interface of expandABS. NFC Instead of returning a bool to indicate success and a separate SDValue, return the SDValue and have the callers check if it is null. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112331	2021-10-22 10:22:23 -07:00
Craig Topper	996123e5e8	[TargetLowering] Simplify the interface for expandCTPOP/expandCTLZ/expandCTTZ. There is no need to return a bool and have an SDValue output parameter. Just return the SDValue and let the caller check if it is null. I have another patch to add more callers of these so I thought I'd clean up the interface first. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D112267	2021-10-21 15:35:28 -07:00
Craig Topper	fe1f0de003	[RISCV][WebAssembly][TargetLowering] Allow expandCTLZ/expandCTTZ to rely on CTPOP expansion for vectors. Our fallback expansion for CTLZ/CTTZ relies on CTPOP. If CTPOP isn't legal or custom for a vector type we would scalarize the CTLZ/CTTZ. This is different than CTPOP itself which would use a vector expansion. This patch teaches expandCTLZ/CTTZ to rely on the vector CTPOP expansion instead of scalarizing. To do this I had to add additional checks to make sure the operations used by CTPOP expansions are all supported. Some of the operations were already needed for the CTLZ/CTTZ expansion. This is a huge improvement to the RISCV which doesn't have a scalar ctlz or cttz in the base ISA. For WebAssembly, I've added Custom lowering to keep the scalarizing behavior. I've also extended the scalarizing to CTPOP. Differential Revision: https://reviews.llvm.org/D111919	2021-10-20 07:46:41 -07:00
Sander de Smalen	be6c8dc765	[SelectionDAG] Fix getVectorSubVecPointer for scalable subvectors. When inserting a scalable subvector into a scalable vector through the stack, the index to store to needs to be scaled by vscale. Before this patch, that didn't yet happen, so it would generate the wrong offset, thus storing a subvector to the incorrect address and overwriting the wrong lanes. For some insert: nxv8f16 insert_subvector(nxv8f16 %vec, nxv2f16 %subvec, i64 2) The offset was not scaled by vscale: orr x8, x8, #0x4 st1h { z0.h }, p0, [sp] st1h { z1.d }, p1, [x8] ld1h { z0.h }, p0/z, [sp] And is changed to: mov x8, sp st1h { z0.h }, p0, [sp] st1h { z1.d }, p1, [x8, #1, mul vl] ld1h { z0.h }, p0/z, [sp] Differential Revision: https://reviews.llvm.org/D111633	2021-10-20 13:55:24 +01:00
Simon Pilgrim	71e39e3f18	[ADT] Add APInt::isNegatedPowerOf2() helper Inspired by D111968, provide a isNegatedPowerOf2() wrapper instead of obfuscating code with (-Value).isPowerOf2() patterns, which I'm sure are likely avenues for typos..... Differential Revision: https://reviews.llvm.org/D111998	2021-10-19 14:38:21 +01:00
Simon Pilgrim	88487662f7	[Codegen] TargetLowering::getCanonicalIndexType - early out scaled MVT::i8 indices. NFCI. Avoids unused assignment scan-build warning.	2021-10-14 13:08:40 +01:00
Jay Foad	a9bceb2b05	[APInt] Stop using soft-deprecated constructors and methods in llvm. NFC. Stop using APInt constructors and methods that were soft-deprecated in D109483. This fixes all the uses I found in llvm, except for the APInt unit tests which should still test the deprecated methods. Differential Revision: https://reviews.llvm.org/D110807	2021-10-04 08:57:44 +01:00
Christopher Tetreault	3077bc90de	[NFC] Restore magic and magicu to a globally visible location While these functions are only used in one location in upstream, it has been reused in multiple downstreams. Restore this file to a globally visibile location (outside of APInt.h) to eliminate donwstream breakage and enable potential future reuse. Additionally, this patch renames types and cleans up clang-tidy issues.	2021-09-30 17:43:12 -07:00
Simon Pilgrim	9db20822f7	[APInt] Add APIntOps::ScaleBitMask helper APInt is used to describe a bit mask in a variety of value tracking and demanded bits/elts functions. When traversing through dst/src operands, we have a number of places where these masks need to widened/narrowed to translate through bitcasts, reductions etc. to a different type. This patch add a APIntOps::ScaleBitMask common helper, adds unit test coverage, and updates a number of cases to use the the helper instead of their own implementation. This came up on D109065 where we currently have to add yet another implementation of the same code. Differential Revision: https://reviews.llvm.org/D109683	2021-09-13 16:27:12 +01:00
Craig Topper	9af8f1b18e	[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode. Soft deprecrate isNullValue/isAllOnesValue and update in tree callers. This matches the changes to the APInt interface from D109483. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D109535	2021-09-09 13:28:30 -07:00
Chris Lattner	735f46715d	[APInt] Normalize naming on keep constructors / predicate methods. This renames the primary methods for creating a zero value to `getZero` instead of `getNullValue` and renames predicates like `isAllOnesValue` to simply `isAllOnes`. This achieves two things: 1) This starts standardizing predicates across the LLVM codebase, following (in this case) ConstantInt. The word "Value" doesn't convey anything of merit, and is missing in some of the other things. 2) Calling an integer "null" doesn't make any sense. The original sin here is mine and I've regretted it for years. This moves us to calling it "zero" instead, which is correct! APInt is widely used and I don't think anyone is keen to take massive source breakage on anything so core, at least not all in one go. As such, this doesn't actually delete any entrypoints, it "soft deprecates" them with a comment. Included in this patch are changes to a bunch of the codebase, but there are more. We should normalize SelectionDAG and other APIs as well, which would make the API change more mechanical. Differential Revision: https://reviews.llvm.org/D109483	2021-09-09 09:50:24 -07:00
Chris Lattner	9e46dd965a	[APInt.h] Reduce the APInt header file interface a bit. NFC This moves one mid-size function out of line, inlines the trivial tcAnd/tcOr/tcXor/tcComplement methods into their only caller, and moves the magic/umagic functions into SelectionDAG since they are implementation details of its algorithm. This also removes the unit tests for magic, but these are already tested in the divide lowering logic for various targets. This also upgrades some C style comments to C++. Differential Revision: https://reviews.llvm.org/D109476	2021-09-08 18:17:07 -07:00
David Green	d8d24c64fe	[DAG] Fix GT -> GE condition when creating SetCC `79845ed6df` folded some setcc(ashr) conditions to setcc, but got the condition for NE incorrect, using GT where it should be using GE.	2021-09-08 12:41:51 +01:00
David Green	79845ed6df	[DAG] Fold setcc eq with ashr to compare to zero. Pulled out of D109149, this folds set_cc seteq (ashr X, BW-1), -1 -> set_cc setlt X, 0 to prevent some regressions later on when folding select_cc setgt X, -1, C, ~C -> xor (ashr X, BW-1), C Differential Revision: https://reviews.llvm.org/D109214	2021-09-05 14:06:47 +01:00
Roman Lebedev	3f1f08f0ed	Revert @llvm.isnan intrinsic patchset. Please refer to https://lists.llvm.org/pipermail/llvm-dev/2021-September/152440.html (and that whole thread.) TLDR: the original patch had no prior RFC, yet it had some changes that really need a proper RFC discussion. It won't be productive to discuss such an RFC, once it's actually posted, while said patch is already committed, because that introduces bias towards already-committed stuff, and the tree is potentially in broken state meanwhile. While the end result of discussion may lead back to the current design, it may also not lead to the current design. Therefore i take it upon myself to revert the tree back to last known good state. This reverts commit `4c4093e6e3`. This reverts commit `0a2b1ba33a`. This reverts commit `d9873711cb`. This reverts commit `791006fb8c`. This reverts commit `c22b64ef66`. This reverts commit `72ebcd3198`. This reverts commit `5fa6039a5f`. This reverts commit `9efda541bf`. This reverts commit `94d3ff09cf`.	2021-09-02 13:53:56 +03:00
Roman Lebedev	f5753125f0	[Codegen][TLI][X86] SimplifyMultipleUseDemandedBits(): 0'th vec subreg widening is free, try to perform it earlier I believe, the profitability reasoning here is correct "sub"reg is already located within the 0'th subreg of wider reg, so if we have suvector insertion at index 0 into undef, then it's always free do to. After this, D109065 finally avoids the regression in D108382. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D109074	2021-09-02 00:54:05 +03:00
Bjorn Pettersson	789f01283d	[SelectionDAG] Fix miscompile bugs related to smul.fix.sat with scale zero When expanding a SMULFIXSAT ISD node (usually originating from a smul.fix.sat intrinsic) we've applied some optimizations for the special case when the scale is zero. The idea has been that it would be cheaper to use an SMULO instruction (if legal) to perform the multiplication and at the same time detect any overflow. And in case of overflow we could use some SELECT:s to replace the result with the saturated min/max value. The only tricky part is to know if we overflowed on the min or max value, i.e. if the product is positive or negative. Unfortunately the implementation has been incorrect as it has looked at the product returned by the SMULO to determine the sign of the product. In case of overflow that product is truncated and won't give us the correct sign bit. This patch is adding an extra XOR of the multiplication operands, which is used to determine the sign of the non truncated product. This patch fixes PR51677. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D108938	2021-08-30 22:08:26 +02:00

1 2 3 4 5 ...

1186 Commits