llvm-project

Commit Graph

Author	SHA1	Message	Date
Craig Topper	76123d338d	[DAGCombiner] visitSIGN_EXTEND_INREG should fold sext_vector_inreg(undef) to 0 not undef. We need to ensure that the sign bits of the result all match so we can't fold to undef. Similar to PR46585. Reviewed By: lebedev.ri Differential Revision: https://reviews.llvm.org/D83163	2020-07-04 14:35:49 -07:00
Craig Topper	120c5f1057	[DAGCombiner] Don't fold zext_vector_inreg/sext_vector_inreg(undef) to undef. Fold to 0. zext_vector_inreg needs to produces 0s in the extended bits and sext_vector_inreg needs to produce upper bits that are all the same. So we should fold them to a 0 vector instead of undef. Fixes PR46585.	2020-07-04 11:42:53 -07:00
Simon Pilgrim	56a8a5c9fe	[DAG] matchBinOpReduction - match subvector reduction patterns beyond a matched shufflevector reduction Currently matchBinOpReduction only handles shufflevector reduction patterns, but in many cases these only occur in the final stages of a reduction, once we're down to legal vector widths. Before this its likely that we are performing reductions using subvector extractions to repeatedly split the source vector in half and perform the binop on the halves. Assuming we've found a non-partial reduction, this patch continues looking for subvector reductions as far as it can beyond the last shufflevector. Fixes PR37890	2020-07-04 15:28:15 +01:00
Guillaume Chatelet	87e2751cf0	[Alignment][NFC] Use proper getter to retrieve alignment from ConstantInt and ConstantSDNode This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D83082	2020-07-03 08:06:43 +00:00
Sanjay Patel	bc110de78a	[SelectionDAG] don't split branch on logic-of-vector-compares SelectionDAGBuilder converts logic-of-compares into multiple branches based on a boolean TLI setting in isJumpExpensive(). But that probably never considered the pattern of extracted bools from a vector compare - it seems unlikely that we would want to turn vector logic into control-flow. The motivating x86 reduction case is shown in PR44565: https://bugs.llvm.org/show_bug.cgi?id=44565 ...and that test shows the expected improvement from using pmovmsk codegen. For AArch64, I modified the test to include an extra op because the simpler test gets transformed by a codegen invocation of SimplifyCFG. Differential Revision: https://reviews.llvm.org/D82602	2020-07-02 17:05:24 -04:00
Sander de Smalen	143e324e75	[CodeGen][SVE] Don't drop scalable flag in DAGCombiner::visitEXTRACT_SUBVECTOR There was a rogue 'assert' in AArch64ISelLowering for the tuple.get intrinsics, that shouldn't really have been there (I suspect this was a remnant from when we expected the wider vector always to have come from a vector CONCAT). When I tried to create a more minimal reproducer, I found a bug in DAGCombiner where it drops the scalable flag when trying to fold: extract_subv (bitcast X), Index --> bitcast (extract_subv X, Index') This patch fixes both issues. Reviewers: david-arm, efriedma, spatel Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D82910	2020-07-02 10:16:43 +01:00
David Sherwood	c7df35d2b2	[CodeGen] Fix warnings in getCopyToPartsVector Whilst trying to assemble the following test: clang/test/CodeGen/aarch64-sve-intrinsics/acle_sve_set2.c I discovered we were hitting some warnings about possible invalid calls to getVectorNumElements() in getCopyToPartsVector(). I've tried to fix these by using ElementCount types where possible and I've made the assumption that we don't support using a fixed width vector to copy parts of a scalable vector, and vice versa. Looking at how the copy is implemented I think that's the right thing for now. Differential Revision: https://reviews.llvm.org/D82744	2020-07-02 09:08:20 +01:00
Craig Topper	51e92b223b	[X86] Speculatively apply the same fix from `361853c96f` to PromoteIntOp_MGATHER. The UpdateNodeOperands here is also subject to CSE.	2020-07-01 11:57:59 -07:00
Craig Topper	361853c96f	[LegalizeTypes] Properly handle the case when UpdateNodeOperands in PromoteIntOp_MLOAD triggers CSE instead of updating the node in place. The caller can't handle the node having multiple results like a masked load does. So we need to detect the case and do our own result replacement. Fixes PR46532.	2020-07-01 11:48:50 -07:00
David Sherwood	f11305780f	[CodeGen] Fix warnings in DAGCombiner::visitSCALAR_TO_VECTOR In visitSCALAR_TO_VECTOR we try to optimise cases such as: scalar_to_vector (extract_vector_elt %x) into vector shuffles of %x. However, it led to numerous warnings when %x is a scalable vector type, so for now I've changed the code to only perform the combination on fixed length vectors. Although we probably could change the code to work with scalable vectors in certain cases, without a proper profit analysis it doesn't seem worth it at the moment. This change fixes up one of the warnings in: llvm/test/CodeGen/AArch64/sve-merging-stores.ll I've also added a simplified version of the same test to: llvm/test/CodeGen/AArch64/sve-fp.ll which already has checks for no warnings. Differential Revision: https://reviews.llvm.org/D82872	2020-07-01 18:47:13 +01:00
James Y Knight	4b0aa5724f	Change the INLINEASM_BR MachineInstr to be a non-terminating instruction. Before this instruction supported output values, it fit fairly naturally as a terminator. However, being a terminator while also supporting outputs causes some trouble, as the physreg->vreg COPY operations cannot be in the same block. Modeling it as a non-terminator allows it to be handled the same way as invoke is handled already. Most of the changes here were created by auditing all the existing users of MachineBasicBlock::isEHPad() and MachineBasicBlock::hasEHPadSuccessor(), and adding calls to isInlineAsmBrIndirectTarget or mayHaveInlineAsmBr, as appropriate. Reviewed By: nickdesaulniers, void Differential Revision: https://reviews.llvm.org/D79794	2020-07-01 12:51:50 -04:00
Guillaume Chatelet	ef36f5143d	[Alignment] TargetLowering::hasPairedLoad must use Align for RequiredAlignment As per documentation of `hasPairLoad`: "`RequiredAlignment` gives the minimal alignment constraints that must be met to be able to select this paired load." In this sense, `0` is strictly equivalent to `1`. We make this obvious by using `Align` instead of unsigned. There is only one implementor of this interface. Differential Revision: https://reviews.llvm.org/D82958	2020-07-01 14:32:30 +00:00
Guillaume Chatelet	d3085c2501	[Alignment][NFC] Transition and simplify calls to DL::getABITypeAlignment This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82956	2020-07-01 14:31:56 +00:00
Guillaume Chatelet	27bbc8ede1	[Alignment][NFC] Migrate TargetTransformInfo::CreateVariableSizedObject to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82939	2020-07-01 14:31:21 +00:00
David Sherwood	97a7a9abb2	[CodeGen] Fix up warnings in visitEXTRACT_SUBVECTOR It's perfectly valid to do certain DAG combines where we extract subvectors from a concat vector when we have scalable vector types. However, we can do this in a way that avoids generating compiler warnings by replacing calls to getVectorNumElements() with getVectorMinNumElements(). Due to the way subvector extracts are designed to work with scalable vector types this is ok. This eliminates some warnings from existing tests in this file: llvm/test/CodeGen/AArch64/sve-intrinsics-loads.ll Differential Revision: https://reviews.llvm.org/D82655	2020-07-01 15:10:53 +01:00
Guillaume Chatelet	28de229bc6	[Alignment][NFC] Migrate MachineFrameInfo::CreateStackObject to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82894	2020-07-01 07:28:11 +00:00
Guillaume Chatelet	c1cd61e02a	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemcpy to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82849	2020-06-30 13:12:31 +00:00
Guillaume Chatelet	306d7c6929	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemmove to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82850	2020-06-30 12:46:59 +00:00
Guillaume Chatelet	6a6af30d43	[Alignment][NFC] Migrate SelectionDAGTargetInfo::EmitTargetCodeForMemset to Align This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82851	2020-06-30 12:46:26 +00:00
Guillaume Chatelet	5f8bdb3e6a	[Alignment][NFC] TargetLowering::allowsMemoryAccess Second patch of a series to adapt TargetLowering::allowsXXX functions This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Differential Revision: https://reviews.llvm.org/D82785	2020-06-30 08:17:00 +00:00
David Sherwood	c02332a693	[CodeGen] Fix warning in getNode for EXTRACT_SUBVECTOR Fix a warning in getNode() when extracting a subvector from a concat vector. We can simply replace the call to getVectorNumElements with getVectorMinNumElements as this follows the defined behaviour for EXTRACT_SUBVECTOR. Differential Revision: https://reviews.llvm.org/D82746	2020-06-30 08:11:41 +01:00
David Sherwood	46a7f4d6f4	[SVE][CodeGen] Fix bug in DAGCombiner::reduceBuildVecToShuffle When trying to reduce a BUILD_VECTOR to a SHUFFLE_VECTOR it's important that we carefully check the vector types that led to that BUILD_VECTOR. In the test I have attached to this commit there is a case where the results of two SVE faddv instructions are being stored to consecutive memory locations. With my fix, as part of merging those stores we discover that each BUILD_VECTOR element came from an extract of a SVE vector element and therefore bail out. Differential Revision: https://reviews.llvm.org/D82564	2020-06-30 07:28:15 +01:00
Simon Pilgrim	3521ecf1f8	[X86] Add vector support to targetShrinkDemandedConstant for OR/XOR opcodes If a constant is only allsignbits in the demanded/active bits, then sign extend it to an allsignbits bool pattern for OR/XOR ops. This also requires SimplifyDemandedBits XOR handling to be modified to call ShrinkDemandedConstant on any (non-NOT) XOR pattern to account for non-splat cases. Next step towards fixing PR45808 - with this patch we now get a <-1,-1,0,0> v4i64 constant instead of <1,1,0,0>. Differential Revision: https://reviews.llvm.org/D82257	2020-06-29 12:19:05 +01:00
Simon Pilgrim	973685fc78	[TargetLowering] Add DemandedElts arg to ShrinkDemandedConstant Pre-commit for D82257, this adds a DemandedElts arg to ShrinkDemandedConstant/targetShrinkDemandedConstant which will allow future patches to (optionally) add vector support.	2020-06-29 11:46:58 +01:00
Guillaume Chatelet	3500d9ec95	Fix invalid alignment in DAGCombiner::isLegalNarrowLdSt `ShAmt / 8` can be a non power of two, this can lead to an invalid alignment. context: https://reviews.llvm.org/D41350#inline-749165 Differential Revision: https://reviews.llvm.org/D82565	2020-06-29 09:22:15 +00:00
Simon Pilgrim	6bdb3ce452	[DAG] reduceBuildVecExtToExtBuildVec - don't combine if it would break a splat. reduceBuildVecExtToExtBuildVec was breaking a splat(zext(x)) pattern into buildvector(x, 0, x, 0, ..) resulting in much more complex insert+shuffle codegen. We already go to some lengths to avoid this in SimplifyDemandedVectorElts etc. when we encounter splat buildvectors. It should be OK to fold all splat(aext(x)) patterns - we might need to tighten this if we find a case where we mustn't introduce a buildvector(x, undef, x, undef, ..) but I can't find one. Fixes PR46461.	2020-06-27 11:03:57 +01:00
Sanjay Patel	e7f7715eb9	[DAGCombiner] rename variables for readability; NFC PR46406 shows a pattern where we can do better, so try to clean this up before adding more code.	2020-06-26 14:22:11 -04:00
Sjoerd Meijer	243a5329d4	[SelectionDAG] Lower @llvm.get.active.lane.mask to setcc This lowers intrinsic @llvm.get.active.lane.mask to a setcc node, i.e. an icmp ule, and creates vectors for its 2 arguments on which the comparison is performed. Differential Revision: https://reviews.llvm.org/D82292	2020-06-26 07:46:38 +01:00
Eli Friedman	e9d4e34ab8	[AArch64][SVE] Add legalization support for i32/i64 vector srem/urem Implement them on top of sdiv/udiv, similar to what we do for integer types. Potential future work: implementing i8/i16 srem/urem, optimizations for constant divisors, optimizing the mul+sub to mls. Differential Revision: https://reviews.llvm.org/D81511	2020-06-23 16:27:52 -07:00
Kerry McLaughlin	5080503174	[SVE][CodeGen] Legalisation of vsetcc with scalable types Summary: Changes SplitVecOp_VSETCC to use getVectorElementCount() Reviewers: sdesmalen, efriedma, dancgr Reviewed By: efriedma Subscribers: david-arm, tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79167	2020-06-23 11:56:29 +01:00
Simon Pilgrim	bcc0dc3832	[DAG] visitSIGN_EXTEND_INREG - rename EVT variable. NFCI. We had a EVT type variable called EVT, which isn't a good idea....	2020-06-23 10:45:27 +01:00
Paul Walker	499c63288f	[SVE] Code generation for fixed length vector loads & stores. Summary: This patch adds base support for code generating fixed length vector operations targeting a known SVE vector length. To achieve this we lower fixed length vector operations to equivalent scalable vector operations, whereby SVE predication is used to limit the elements processed to those present within the fixed length vector. Specifically this patch implements load and store operations, which get lowered to their masked counterparts thusly: V = load(Addr) => V = extract_fixed_vector(masked_load(make_pred(V.NumElts), Addr)) store(V, (Addr)) => masked_store(insert_fixed_vector(V), make_pred(V.NumElts), Addr)) Reviewers: rengolin, efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80385	2020-06-23 09:39:03 +00:00
Simon Pilgrim	0acd22b8fb	StatepointLowering.cpp - fix implicit CommandLine.h dependency. NFC. StatepointLowering defines a cl::opt but don't include CommandLine.h.	2020-06-23 09:43:39 +01:00
Michael Liao	b1360caa82	[SDAG] Add new AssertAlign ISD node. Summary: - AssertAlign node records the guaranteed alignment on its source node, where these alignments are retrieved from alignment attributes in LLVM IR. These tracked alignments could help DAG combining and lowering generating efficient code. - In this patch, the basic support of AssertAlign node is added. So far, we only generate AssertAlign nodes on return values from intrinsic calls. - Addressing selection in AMDGPU is revised accordingly to capture the new (base + offset) patterns. Reviewers: arsenm, bogner Subscribers: jvesely, wdng, nhaehnle, tpr, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81711	2020-06-23 00:51:11 -04:00
Simon Pilgrim	48d1a2d6d0	[DAG] Add SimplifyMultipleUseDemandedVectorElts helper for SimplifyMultipleUseDemandedBits. NFCI. We have many cases where we call SimplifyMultipleUseDemandedBits and demand specific vector elements, but all the bits from them - this adds a helper wrapper to handle this.	2020-06-22 14:24:39 +01:00
Simon Pilgrim	ecc5d7ee0d	[DAG] SimplifyMultipleUseDemandedBits - drop unnecessary *_EXTEND_VECTOR_INREG cases For little endian targets, if we only need the lowest element and none of the extended bits then we can just use the (bitcasted) source vector directly. We already do this in SimplifyDemandedBits, this adds the SimplifyMultipleUseDemandedBits equivalent.	2020-06-22 12:35:32 +01:00
David Sherwood	7edc7f6edb	[CodeGen] Fix SimplifyDemandedBits for scalable vectors For now I have changed SimplifyDemandedBits and it's various callers to assume we know nothing for scalable vectors and to ignore the demanded bits completely. I have also done something similar for SimplifyDemandedVectorElts. These changes fix up lots of warnings due to calls to EVT::getVectorNumElements() for types with scalable vectors. These functions are all used for optimisations, rather than functional requirements. In future we can revisit this code if there is a need to improve code quality for SVE. Differential Revision: https://reviews.llvm.org/D80537	2020-06-19 07:59:35 +01:00
David Sherwood	9e811b0d93	[CodeGen] Fix ComputeNumSignBits for scalable vectors When trying to calculate the number of sign bits for scalable vectors we should just bail out for now and pretend we know nothing. Differential Revision: https://reviews.llvm.org/D81093	2020-06-19 07:58:42 +01:00
Simon Pilgrim	2474421398	[TargetLowering] SimplifyMultipleUseDemandedBits - drop already extended ISD::SIGN_EXTEND_INREG nodes. If the source of the SIGN_EXTEND_INREG node is already sign extended, use the source directly.	2020-06-18 16:41:08 +01:00
Lucas Prates	a255931c40	[ARM] Supporting lowering of half-precision FP arguments and returns in AArch32's backend Summary: Half-precision floating point arguments and returns are currently promoted to either float or int32 in clang's CodeGen and there's no existing support for the lowering of `half` arguments and returns from IR in AArch32's backend. Such frontend coercions, implemented as coercion through memory in clang, can cause a series of issues in argument lowering, as causing arguments to be stored on the wrong bits on big-endian architectures and incurring in missing overflow detections in the return of certain functions. This patch introduces the handling of half-precision arguments and returns in the backend using the actual "half" type on the IR. Using the "half" type the backend is able to properly enforce the AAPCS' directions for those arguments, making sure they are stored on the proper bits of the registers and performing the necessary floating point convertions. Reviewers: rjmccall, olista01, asl, efriedma, ostannard, SjoerdMeijer Reviewed By: ostannard Subscribers: stuij, hiraditya, dmgreen, llvm-commits, chill, dnsampaio, danielkiss, kristof.beyls, cfe-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D75169	2020-06-18 13:15:13 +01:00
David Sherwood	65912a9768	[CodeGen] Fix warnings in foldCONCAT_VECTORS Instead of asserting the number of elements is the same, we should be comparing the element counts instead. In addition, when looking at concats of extract_subvectors it's fine to use getVectorMinNumElements() for scalable vectors. I discovered these warnings when compiling the structured loads tests in this file: test/CodeGen/AArch64/sve-intrinsics-loads.ll Differential Revision: https://reviews.llvm.org/D81936	2020-06-18 09:29:37 +01:00
Aaron Smith	7e01675ea5	[SelectionDAG] Add MVT::bf16 to getConstantFP() Summary: This was probably overlooked in recent bfloat patches. Needed to handle bf16 constants in SelectionDAG. ConstantFP:bf16<APFloat(0)> Reviewers: stuij Reviewed By: stuij Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81779	2020-06-16 15:10:05 -07:00
Qiu Chaofan	f8ef7c99a0	[DAGCombiner] Require ninf for division estimation Current implementation of division estimation isn't correct for some cases like 1.0/0.0 (result is nan, not expected inf). And this change exposes a potential infinite loop: we use isConstOrConstSplatFP in combineRepeatedFPDivisors to look up if the divisor is some constant. But it doesn't work after legalized on some platforms. This patch restricts the method to act before LegalDAG. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D80542	2020-06-14 22:58:22 +08:00
Amanieu d'Antras	6973125cb7	Fix FastISel dropping srcloc metadata from InlineAsm Summary: Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=46060 I've also added the Extra_IsConvergent flag which was missing from FastISel. Reviewers: echristo Reviewed By: echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80759	2020-06-13 16:52:37 +01:00
Michael Liao	e7b920e6fe	[DAGCombine] Generalize the case (add (or x, c1), c2) -> (add x, (c1 + c2)) Reviewers: arsenm Subscribers: sdardis, wdng, hiraditya, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, ecnelises, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81708	2020-06-12 13:53:08 -04:00
Simon Pilgrim	5509e2cc2e	[DAG] foldAddSubOfSignBit - add support for non-uniform vector constants	2020-06-12 14:58:15 +01:00
David Sherwood	bd97342a0c	[CodeGen] Let computeKnownBits do something sensible for scalable vectors Until we have a real need for computing known bits for scalable vectors I have simply changed the code to bail out for now and pretend we know nothing. I've also fixed up some simple callers of computeKnownBits too. Differential Revision: https://reviews.llvm.org/D80437	2020-06-11 08:17:11 +01:00
Sanjay Patel	702cf93356	[DAGCombiner] allow more folding of fadd + fmul into fma If fmul and fadd are separated by an fma, we can fold them together to save an instruction: fadd (fma A, B, (fmul C, D)), N1 --> fma(A, B, fma(C, D, N1)) The fold implemented here is actually a specialization - we should be able to peek through >1 fma to find this pattern. That's another patch if we want to try that enhancement though. This transform was guarded by the TLI hook enableAggressiveFMAFusion(), so it was done for some in-tree targets like PowerPC, but not AArch64 or x86. The hook is protecting against forming a potentially more expensive computation when fma takes longer to execute than a single fadd. That hook may be needed for other transforms, but in this case, we are replacing fmul+fadd with fma, and the fma should never take longer than the 2 individual instructions. 'contract' FMF is all we need to allow this transform. That flag corresponds to -ffp-contract=fast in Clang, so we are allowed to form fma ops freely across expressions. Differential Revision: https://reviews.llvm.org/D80801	2020-06-09 10:41:27 -04:00
Guillaume Chatelet	800e100588	Revert "[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess" This reverts commit `f21c52667e`.	2020-06-09 10:43:59 +00:00
Simon Wallis	4dba59689d	[ARM] prologue instructions emitted for naked function with >64 byte argument Summary: The naked function attribute is meant to suppress all function prologue/epilogue instructions. On ARM, some are still emitted if an argument greater than 64 bytes in size (the threshold for using the byval attribute in IR) is passed partially in registers. Perform the check for Attribute::Naked and early exit in SelectionDAGISel::LowerArguments(). Checking in ARMFrameLowering::determineCalleeSaves() is too late. A test case is included. Reviewers: llvm-commits, olista01, danielkiss Reviewed By: danielkiss Subscribers: kristof.beyls, hiraditya, danielkiss Tags: #llvm Differential Revision: https://reviews.llvm.org/D80715 Change-Id: Icedecf2a4ad31bc3c35ab0df7489a9d346e1f7cc	2020-06-09 11:33:03 +01:00
Guillaume Chatelet	f21c52667e	[Alignment][NFC] Migrate TargetLowering::allowsMemoryAccess Summary: Note to downstream target maintainers: this might silently change the semantics of your code if you override `TargetLowering::allowsMemoryAccess` without marking it override. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D81379	2020-06-09 10:11:07 +00:00
Guillaume Chatelet	e26ed6bdae	Fix unused variable warning	2020-06-09 08:56:05 +00:00
David Sherwood	cc8872400c	[CodeGen] Ensure callers of CreateStackTemporary use sensible alignments In two instances of CreateStackTemporary we are sometimes promoting alignments beyond the stack alignment. I have introduced a new function called getReducedAlign that will return the alignment for the broken down parts of illegal vector types. For example, on NEON a <32 x i8> type is made up of two <16 x i8> types - in this case the sensible alignment is 16 bytes, not 32. In the legalization code wherever we create stack temporaries I have started using the reduced alignments instead for illegal vector types. I added a test to CodeGen/AArch64/build-one-lane.ll that tries to insert an element into an illegal fixed vector type that involves creating a temporary stack object. Differential Revision: https://reviews.llvm.org/D80370	2020-06-09 08:10:17 +01:00
Christopher Tetreault	caa2fddce7	[SVE] Eliminate calls to default-false VectorType::get() from CodeGen Reviewers: efriedma, c-rhodes, david-arm, spatel, craig.topper, aqjune, paquette, arsenm, gchatelet Reviewed By: spatel, gchatelet Subscribers: wdng, tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80313	2020-06-08 10:26:10 -07:00
Sanjay Patel	302cc8a121	[DAGCombiner] clean-up FMA+FMUL folds; NFC D80801 suggests some readability improvements before mocing this block.	2020-06-06 10:32:54 -04:00
Matt Arsenault	45e1a22a92	GlobalISel: Make known bits/alignment API more consistent Just computing the alignment makes sense without caring about the general known bits, such as for non-integral pointers. Separate the two and start calling into the TargetLowering hooks for frame indexes. Start calling the TargetLowering implementation for FrameIndexes, which improves the AMDGPU matching for stack addressing modes. Also introduce a new hook for returning known alignment of target instructions. For AMDGPU, it would be useful to report the known alignment implied by certain intrinsic calls. Also stop using MaybeAlign.	2020-06-05 14:57:22 -04:00
Sander de Smalen	937cb7a8c7	Reland D80640: [CodeGen][SVE] Calculate correct type legalization for scalable vectors. This reverts commit `9bcef270d7`.	2020-06-05 18:09:31 +01:00
Sander de Smalen	9bcef270d7	Revert "[CodeGen][SVE] Calculate correct type legalization for scalable vectors." Seems to break some buildbots, reverting the patch for now. This reverts commit `164f4b9d26`.	2020-06-05 16:03:52 +01:00
Sander de Smalen	164f4b9d26	[CodeGen][SVE] Calculate correct type legalization for scalable vectors. This patch updates TargetLoweringBase::computeRegisterProperties and TargetLoweringBase::getTypeConversion to support scalable vectors, and make the right calls on how to legalise them. These changes are required to legalise both MVTs and EVTs. Reviewers: efriedma, david-arm, ctetreau Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D80640	2020-06-05 15:20:34 +01:00
Kerry McLaughlin	89fc0166f5	[CodeGen][SVE] Legalisation of extends with scalable types Summary: This patch adds legalisation of extensions where the operand of the extend is a legal scalable type but the result is not. EXTRACT_SUBVECTOR is used to split the result, before being replaced by target-specific [S\|U]UNPK[HI\|LO] operations. For example: ``` zext <vscale x 16 x i8> %a to <vscale x 16 x i16> ``` should emit: ``` uunpklo z2.h, z0.b uunpkhi z1.h, z0.b ``` Reviewers: sdesmalen, efriedma, david-arm Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, huihuiz, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79587	2020-06-05 12:08:42 +01:00
Philip Reames	4c735439fd	[Statepoint] Migrate a few tests to gc-live bundle format and fix assert The assert was missed in `0e7c7705`, migrating the test revealed the problem.	2020-06-04 18:15:58 -07:00
Matt Arsenault	af867b7850	DAG: Change computeKnownBitsForFrameIndex to be usable by GISel This wasn't getting much value from the DAG or depth arguments, since it's only called on the frame index root nodes. FrameIndexes can also only return a scalar value, so it also didn't need DemandedElts.	2020-06-04 10:50:26 -04:00
Sanjay Patel	652b3757c8	[x86] add test/code comment for chain value use (PR46195); NFC	2020-06-04 09:15:17 -04:00
Simon Pilgrim	adf10dcf2e	[DAG] scalarizeBinOpOfSplats - extract from the source of splat vector (PR46189) D79003/rG9fa58d1bf2f8 exposed an issue with scalarizeBinOpOfSplats that we were extracting from the splatted vector result instead of the source, the splat index is only valid for the source vector not the result, which may contain undefs, including at the splat index.	2020-06-04 11:58:59 +01:00
Tim Northover	87e24c3200	Revert "[DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI" This reverts commit `21dadd774f`. In at least PromoteIntBinOps, they wanted to know about users of all values produced by the node not just the integer being promoted. For example not replacing chain users if the operation was a load breaks the ordering of the DAG.	2020-06-04 11:53:14 +01:00
Madhur Amilkanthwar	b3cff3c720	Utility to dump .dot representation of SelectionDAG without firing viewer Summary: This patch adds support for dumping .dot representation of SelectionDAG. It is inspired from the fact that, a developer may want to just dump the graph at a predictable path with a simple name to compare. The exisitng utility (i.e. viewGraph) are overkill for this motive hence this patch adds the requires support while using the core routines from GraphWriter. Example usage: DAG.dumpDotGraph("/tmp/graph.dot", "MyGraph") will create /tmp/graph.dot file when DAG is an object of SelectionDAG class. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D80711	2020-06-04 11:51:48 +05:30
Philip Reames	ab6779bbd8	[Statepoint] Remove last of old ImmutableStatepoint code To do so, I had to sink the old school inline operand handling into GCStatepointInst which is non ideal. This code should be removed shortly and I was able to at least clean it up a bunch.	2020-06-03 20:31:17 -07:00
Philip Reames	91dd2f2536	[Statepoint] Delete more dead code from old wrappers The verify() routine duplicates IR/Verifier.cpp checks, so while not technically dead it doesn't add any value either.	2020-06-03 20:10:30 -07:00
Henry Kao	c57e41c000	[CodeGen][SVE] Replace deprecated calls in getCopyFromPartsVector() Summary: Replaced getVectorNumElements() with getVectorElementCount(). Added operator overloads for class ElementCount. Fixes warning in several AArch64 unit tests. Reviewers: sdesmalen, kmclaughlin, dancgr, efriedma, each, andwar, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80826	2020-06-03 11:20:02 -04:00
Simon Pilgrim	ea80b40669	[DAG] SimplifyDemandedBits - peek through SHL if we only demand sign bits. If we're only demanding the (shifted) sign bits of the shift source value, then we can use the value directly. This handles SimplifyDemandedBits/SimplifyMultipleUseDemandedBits for both ISD::SHL and X86ISD::VSHLI. Differential Revision: https://reviews.llvm.org/D80869	2020-06-03 16:11:54 +01:00
Simon Pilgrim	c438b257f1	[DAG] GetDemandedBits - don't bother asserting for a non-null cast<> result. NFC. cast<> will assert on failure anyhow. This lets us fold the cast<> with the getAPIntValue() that uses it.	2020-06-03 12:43:07 +01:00
Kadir Cetinkaya	af86a10bad	[llvm] Fix unused variable warning	2020-06-02 22:46:24 +02:00
Amy Kwan	a3ada630d8	[DAGCombiner] Combine shifts into multiply-high This patch implements a target independent DAG combine to produce multiply-high instructions from shifts. This DAG combine will combine shifts for any type as long as the MULH on the narrow type is legal. For now, it is enabled on PowerPC as PowerPC is the only target that has an implementation of the isMulhCheaperThanMulShift TLI hook introduced in D78271. Moreover, this DAG combine focuses on catching the pattern: (shift (mul (ext <narrow_type>:$a to <wide_type>), (ext <narrow_type>:$b to <wide_type>)), <narrow_width>) to produce mulhs when we have a sign-extend, and mulhu when we have a zero-extend. The patch performs the following checks: - Operation is a right shift arithmetic (sra) or logical (srl) - Input to the shift is a multiply - Both operands to the shift are sext/zext nodes - The extends into the multiply are both the same - The narrow type is half the width of the wide type - The shift amount is the width of the narrow type - The respective mulh operation is legal Differential Revision: https://reviews.llvm.org/D78272	2020-06-02 15:22:48 -05:00
Denis Antrushin	fa818ded24	[StatepointLowering] Handle UNDEF gc values. Do not spill UNDEF GC values. Instead, replace corresponding gc.relocate intrinsic with an (arbitrary, but recognizable) constant. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D80714	2020-06-02 10:18:33 +03:00
Matt Arsenault	836c7dcf12	DAG: Fix getNode dropping flags if there's a glue output The AMDGPU non-strict fdiv lowering needs to introduce an FP mode switch in some cases, and has custom nodes to provide chain/glue for the intermediate FP operations. We need to propagate nofpexcept here, but getNode was dropping the flags. Adding nofpexcept in the AMDGPU custom lowering is left to a future patch. Also fix a second case where flags were dropped, but in this case it seems it just didn't handle this number of operands. Test will be included in future AMDGPU patch.	2020-06-01 13:48:02 -04:00
Florian Hahn	ec25a71eb7	[ScheduleDAG] Avoid unnecessary recomputation of topological order. In some cases ScheduleDAGRRList has to add new nodes to resolve problems with interfering physical registers. When new nodes are added, it completely re-computes the topological order, which can take a long time, but is unnecessary. We only add nodes one by one, and initially they do not have any predecessors. So we can just insert them at the end of the vector. Later we add predecessors, but the helper function properly updates the topological order much more efficiently. With this change, the compile time for the program below drops from 300s to 30s on my machine. define i11129 @test1() { %L1 = load i11129, i11129* undef %B30 = ashr i11129 %L1, %L1 store i11129 %B30, i11129* undef ret i11129 %L1 } This should be generally beneficial, as we can skip a large amount of work. Theoretically there are some scenarios where we might not safe much, e.g. when we add a dependency between the first and last node. Then we would have to shift all nodes. But we still do not have to spend the time re-computing the initial order. Reviewers: MatzeB, atrick, efriedma, niravd, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D59722	2020-05-31 11:04:35 +01:00
Craig Topper	a4dd45b7d0	[DAGCombiner] Move debug message and statistic update into CommitTargetLoweringOpt. This code was repeated in two callers of CommitTargetLoweringOpt. But CommitTargetLoweringOpt is also called from TargetLowering. We should print a message for those calls to. So sink the repeated code into CommitTargetLoweringOpt to catch those calls.	2020-05-30 19:47:07 -07:00
Simon Pilgrim	63824ad947	[TargetLowering] SimplifyDemandedBits - remove shift amount clamps from getValidShiftAmountConstant calls. NFC. getValidShiftAmountConstant only returns a value if the shift amount is in range, so we don't need to check it again.	2020-05-30 14:04:55 +01:00
Simon Pilgrim	9d0bfcec83	[SelectionDAG] ComputeNumSignBits - use Valid Min/Max shift amount helpers directly. NFCI. We are calling getValidShiftAmountConstant first followed by getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant if that failed. But both are used in the same way in ComputeNumSignBits and the Min/Max variants call getValidShiftAmountConstant internally anyhow.	2020-05-30 14:02:14 +01:00
Simon Pilgrim	81b50a7823	[SelectionDAG] Remove repeated getOperand() call. NFC.	2020-05-30 10:21:36 +01:00
Tobias Bosch	6a4714030e	[DebugInfo][DAG] Don't reuse debug location on COPY if width changes. Summary: This caused incorrect debug information for parameters: Previously, after a COPY of a parameter that changes the width, we would emit a DBG_VALUE that continues to be associated to that parameter, even though it now used a different width. This made the LiveDebugValues pass assume the parameter value got clobbered and it stopped tracking the parameter entry value, leading to incorrect debug information. Fixes https://bugs.llvm.org/show_bug.cgi?id=39715 Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80819	2020-05-29 13:24:33 -07:00
Zequan Wu	80e107ccd0	Add NoMerge MIFlag to avoid MIR branch folding Let the codegen recognized the nomerge attribute and disable branch folding when the attribute is given Differential Revision: https://reviews.llvm.org/D79537	2020-05-29 12:31:06 -07:00
Guozhi Wei	40c08367e4	[DAGCombiner] Add command line options to guard store width reduction optimizations As discussed in the thread http://lists.llvm.org/pipermail/llvm-dev/2020-May/141838.html, some bit field access width can be reduced by ReduceLoadOpStoreWidth, some can't. If two accesses are very close, and the first access width is reduced, the second is not. Then the wide load of second access will be stalled for long time. This patch add command line options to guard ReduceLoadOpStoreWidth and ShrinkLoadReplaceStoreWithStore, so users can use them to disable these store width reduction optimizations. Differential Revision: https://reviews.llvm.org/D80745	2020-05-29 09:41:41 -07:00
David Sherwood	d8a78889f6	[CodeGen] Fix warning in visitShuffleVector Make sure we only ask for the number of elements after we've bailed out for scalable vectors. Differential revision: https://reviews.llvm.org/D80632	2020-05-29 17:09:59 +01:00
Sanjay Patel	21dadd774f	[DAGCombiner] avoid unnecessary indirection from SDNode/SDValue; NFCI	2020-05-29 09:31:52 -04:00
Florian Hahn	d20a3d35e1	[DAGComb] Do not turn insert_elt into shuffle for single elt vectors. Currently combineInsertEltToShuffle turns insert_vector_elt into a vector_shuffle, even if the inserted element is a vector with a single element. In this case, it should be unlikely that the additional shuffle would be more efficient than a insert_vector_elt. Additionally, this fixes a infinite cycle in DAGCombine, where combineInsertEltToShuffle turns a insert_vector_elt into a shuffle, which gets turned back into a insert_vector_elt/extract_vector_elt by a custom AArch64 lowering (in visitVECTOR_SHUFFLE). Such insert_vector_elt and extract_vector_elt combinations can be lowered efficiently using mov on AArch64. There are 2 test changes in arm64-neon-copy.ll: we now use one or two mov instructions instead of a single zip1. The reason that we need a second mov in ins1f2 is that we have to move the result to the result register and is not really related to the DAGCombine fold I think. But in any case, on most uarchs, mov should be cheaper than zip1. On a Cortex-A75 for example, zip1 is twice as expensive as mov (https://developer.arm.com/docs/101398/latest/arm-cortex-a75-software-optimization-guide-v20) Reviewers: spatel, efriedma, dmgreen, RKSimon Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D80710	2020-05-29 13:21:13 +01:00
Paul Walker	92f3d29af0	[SelectionDAG] Update getNode asserts for EXTRACT/INSERT_SUBVECTOR. Summary: The description of EXTACT_SUBVECTOR and INSERT_SUBVECTOR has been changed to accommodate scalable vectors (see ISDOpcodes.h). This patch updates the asserts used to verify these requirements when using SelectionDAG's getNode interface. This patch introduces the MVT function getVectorMinNumElements that can be used against fixed-length and scalable vectors when only the known minimum vector length is required. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D80709	2020-05-29 11:02:18 +00:00
David Sherwood	4265f1d23c	[CodeGen] Fix warnings in getZeroExtendInReg We should be using getVectorElementCount() to assert that two types have the same numbers of elements. I encountered the warnings while compiling this test: CodeGen/AArch64/sve-intrinsics-ld1.ll Differential Revision: https://reviews.llvm.org/D80616	2020-05-29 11:51:07 +01:00
David Sherwood	b147b88c84	[CodeGen] Add support for extracting elements of scalable vectors I have tried to ensure that SelectionDAG and DAGCombiner do sensible things for scalable vectors, and added support for a limited number of simple folds. Codegen support for the vector extract patterns have also been added to the AArch64 backend. New vector extract tests have been added here: CodeGen/AArch64/sve-extract-element.ll and I have also added new folds using inserts and extracts here: CodeGen/AArch64/sve-insert-element.ll Differential Revision: https://reviews.llvm.org/D80208	2020-05-29 07:49:43 +01:00
Eric Christopher	bce702e5f2	unsigned -> Register for readability.	2020-05-28 15:21:55 -07:00
Philip Reames	9d06547794	[Statepoints] Sink routines for grabbing projections to GCStatepointInst [NFC] Mechanical movement, nothing more.	2020-05-28 13:51:59 -07:00
Philip Reames	a0d2fd4a1f	[Statepoint] Sink actual_args and gc_args to GCStatepointInst [NFC] These are the two operand sets which are expected to survive more than another week or so. Instead of bothering to update the deopt and gc-transition operands, we'll just wait until those are removed and delete the code. For those following along, this is likely to be the last (major) change in this sequence for about a week. I want to wait until all of this has been merged downstream to ensure I haven't introduced any bugs (and migrate some downstream code to the new interfaces). Once that's done, we should be able to delete Statepoint/ImmutableStatepoint without too much work.	2020-05-28 13:51:59 -07:00
Philip Reames	4d6cda9bda	[Statepoint] Use iterate_range.empty [NFC]	2020-05-28 13:51:59 -07:00
Philip Reames	58beb76b7b	[Statepoint] Convert a few more isStatepoint calls to idiomatic isa/cast I'd apparently only grepped in the lib directories and missed a few used in the Statepoint header itself. Beyond simple mechanical cleanup, changed the type of one routine to reflect the fact it also returns a statepoint.	2020-05-28 11:35:36 -07:00
Philip Reames	501aa47ab8	[Statepoint] Sink logic about actual callee into GCStatepointInst Sinking logic around actual callee from Statepoint to GCStatepointInst. While doing so, adjust naming to be consistent about refering to "actual" callee and follow precedent on naming from CallBase otherwise. Use the result to simplify one consumer. This is mostly just to ensure the new code is exercised, but is also a helpful cleanup on it's own.	2020-05-28 10:53:39 -07:00
Nikita Popov	e0e5c64460	[SDAG] Don't require LazyBlockFrequencyInfo at optnone While LazyBlockFrequencyInfo itself is lazy, the dominator tree and loop info analyses it requires are not. Drop the dependency on this pass in SelectionDAGIsel at O0. This makes for a ~0.6% O0 compile-time improvement. Differential Revision: https://reviews.llvm.org/D80387	2020-05-28 18:48:33 +02:00
Philip Reames	98a87c65a3	[Statepoint] Reduce scope of usage of ImmutableStatepoint Can't quite fully remove it yet as some more items need sunk the GCStatepointInst class from the wrapper, but we can at least reduce scope.	2020-05-27 18:57:42 -07:00
Philip Reames	87bea912c2	[Statepoint] Replace uses of isX functions with idiomatic isa<X> Now that all of the statepoint related routines have classes with isa support, let's cleanup. I'm leaving the (dead) utitilities in tree for a few days so that I can do the same cleanup downstream without breakage.	2020-05-27 18:32:28 -07:00
Matt Arsenault	dda82986f9	DAG: Fix expansion of DYNAMIC_STACKALLOC for StackGrowsUp targets Can't test this since I can't directly use the default expansion for AMDGPU. It needs to scale the amount by the wave size, rather than use the raw byte size value.	2020-05-27 18:45:40 -04:00
Philip Reames	1af3705c7f	Start migrating away from statepoint's inline length prefixed argument bundles In the current statepoint design, we have four distinct groups of operands to the call: call args, gc transition args, deopt args, and gc args. This format prexisted the support in IR for operand bundles and was in fact one of the inspirations for the extension. However, we never went back and rearchitected statepoints to fully leverage bundles. This change is the first in a small sequence to do so. All this does is extend the SelectionDAG lowering code to allow deopt and gc transition operands to be specified in either inline argument bundles or operand bundles. Differential Revision: https://reviews.llvm.org/D8059	2020-05-27 09:16:10 -07:00
Simon Pilgrim	50db8402fc	ResourcePriorityQueue.h - reduce unnecessary includes to forward declarations. NFC. Move includes to ResourcePriorityQueue.cpp	2020-05-26 19:22:14 +01:00
Matt Arsenault	9786e7552d	Revert "[AMDGPU] NFC target dependent requiresUniformRegister refactored out" This reverts commit `fb38b98338`. This will regress compile time.	2020-05-26 12:58:18 -04:00
alex-t	fb38b98338	[AMDGPU] NFC target dependent requiresUniformRegister refactored out Summary: Target specific method encapsulated into the Target Lowering Info. Reviewers: rampitec, vpykhtin Reviewed By: rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70085	2020-05-26 19:49:20 +03:00
Serge Pavlov	4d20e31f73	[FPEnv] Intrinsic llvm.roundeven This intrinsic implements IEEE-754 operation roundToIntegralTiesToEven, and performs rounding to the nearest integer value, rounding halfway cases to even. The intrinsic represents the missed case of IEEE-754 rounding operations and now llvm provides full support of the rounding operations defined by the standard. Differential Revision: https://reviews.llvm.org/D75670	2020-05-26 19:24:58 +07:00
Sanjay Patel	f368040c14	[DAGCombiner] try to move splat after binop with splat constant binop (splat X), (splat C) --> splat (binop X, C) binop (splat C), (splat X) --> splat (binop C, X) We do this in IR, and there's a similar fold for the case with 2 non-constant operands just above the code diff in this patch. This was discussed in D79718, and the extra shuffle in the test (llvm/test/CodeGen/X86/vector-fshl-128.ll::sink_splatvar) where it was noticed disappears because demanded elements analysis is no longer blocked. The large majority of the test diffs seem to be benign code scheduling changes, but I do see another type of win: moving the splat later allows binop narrowing in some cases. Regressions were avoided on x86 and ARM with the INSERT_VECTOR_ELT restriction. Differential Revision: https://reviews.llvm.org/D79886	2020-05-26 08:12:46 -04:00
Simon Pilgrim	8f48814879	FunctionLoweringInfo.h - move APInt.h dependency to FunctionLoweringInfo.cpp. NFC.	2020-05-25 12:58:35 +01:00
Simon Pilgrim	9fa58d1bf2	[DAG] Add SimplifyDemandedVectorElts binop SimplifyMultipleUseDemandedBits handling For the supported binops (basic arithmetic, logicals + shifts), if we fail to simplify the demanded vector elts, then call SimplifyMultipleUseDemandedBits and try to peek through ops to remove unnecessary dependencies. This helps with PR40502. Differential Revision: https://reviews.llvm.org/D79003	2020-05-25 12:41:22 +01:00
Simon Pilgrim	1603106725	[TargetLowering] Improve expandFunnelShift shift amount masking For the 'inverse shift', we currently always perform a subtraction of the original (masked) shift amount. But for the case where we are handling power-of-2 type widths, we can replace: (sub bw-1, (and amt, bw-1) ) -> (and (xor amt, bw-1), bw-1) -> (and ~amt, bw-1) This allows x86 shifts to fold away the and-mask. Followup to D77301 + D80466. http://volta.cs.utah.edu:8080/z/Nod0Gr Differential Revision: https://reviews.llvm.org/D80489	2020-05-24 11:25:09 +01:00
Amy Kwan	b631f86ac5	[TLI][PowerPC] Introduce TLI query to check if MULH is cheaper than MUL + SHIFT This patch introduces a TargetLowering query, isMulhCheaperThanMulShift. Currently in DAG Combine, it will transform mulhs/mulhu into a wider multiply and a shift if the wide multiply is legal. This TLI function is implemented on 64-bit PowerPC, as it is more desirable to have multiply-high over multiply + shift for words and doublewords. Having multiply-high can also aid in further transformations that can be done. Differential Revision: https://reviews.llvm.org/D78271	2020-05-23 16:47:12 -05:00
Simon Pilgrim	fe0006c882	TargetLowering.h - remove unnecessary TargetMachine.h include. NFC Replace with forward declaration and move dependency down to source files that actually need it. Both TargetLowering.h and TargetMachine.h are 2 of the most expensive headers (top 10) in the ClangBuildAnalyzer report when building llc.	2020-05-23 19:49:38 +01:00
Craig Topper	7392820f98	[Align] Remove operations on MaybeAlign that asserted that it had a defined value. If the caller needs to reponsible for making sure the MaybeAlign has a value, then we should just make the caller convert it to an Align with operator*. I explicitly deleted the relational comparison operators that were being inherited from Optional. It's unclear what the meaning of two MaybeAligns were one is defined and the other isn't should be. So make the caller reponsible for defining the behavior. I left the ==/!= operators from Optional. But now that exposed a weird quirk that ==/!= between Align and MaybeAlign required the MaybeAlign to be defined. But now we use the operator== from Optional that takes an Optional and the Value. Differential Revision: https://reviews.llvm.org/D80455	2020-05-22 21:54:28 -07:00
Simon Pilgrim	b9def827b7	StatepointLowering.h - remove unused includes. NFC.	2020-05-22 10:49:11 +01:00
Marcello Maggioni	dbaed589ab	[SelectionDAG] Add the option of disabling generic combines. Summary: For some targets generic combines don't really do much and they consume a disproportionate amount of time. There's not really a mechanism in SDISel to tactically disable combines, but we can have a switch to disable all of them and let the targets just implement what they specifically need. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79112	2020-05-21 20:11:29 +00:00
Denis Antrushin	dedcefe09d	[Statepoint] Constant fold FP deopt args. We do not have any special handling for constant FP deopt arguments. They are just spilled to stack or generated in register by MOVS instruction. This is inefficient and, when we have too many such constant arguments, may result in register allocation failure. Instead, we can bitcast such constant FP operands to appropriately sized integer and record as constant into statepoint and later, into StackMap. Reviewed By: skatkov Differential Revision: https://reviews.llvm.org/D80318	2020-05-21 11:02:54 +03:00
Craig Topper	ae5ab2f40a	[LegalizeDAG] Modify ExpandLegalINT_TO_FP to swap data for little/big endian instead of the pointers. Will make it easier to pass the pointer info and alignment correctly to the loads/stores. While there also make the i32 stores independent and use a token factor to join before the load.	2020-05-20 22:29:59 -07:00
Craig Topper	17bd86bc9b	[LegalizeVectorTypes] Create correct memoperands in SplitVecRes_INSERT_SUBVECTOR. Previously this code just used a default constructed MachinePointerInfo. But we know the accesses are to a fixed stack object or at least somewhere on the stack. While there fix the alignment passed to the full vector load/stores. I don't think this function is currently exercised in tree so I don't know how to test it. I just noticed it when I removed non-constant index support in this function. Differential Revision: https://reviews.llvm.org/D80058	2020-05-20 15:06:36 -07:00
Arthur Eubanks	8a88755610	Reland [X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer fo the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Reverted due to unexpectedly passing tests, added REQUIRES: asserts for reland. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 11:25:44 -07:00
Arthur Eubanks	b8cbff51d3	Revert "[X86] Codegen for preallocated" This reverts commit `810567dc69`. Some tests are unexpectedly passing	2020-05-20 10:04:55 -07:00
Arthur Eubanks	810567dc69	[X86] Codegen for preallocated See https://reviews.llvm.org/D74651 for the preallocated IR constructs and LangRef changes. In X86TargetLowering::LowerCall(), if a call is preallocated, record each argument's offset from the stack pointer and the total stack adjustment. Associate the call Value with an integer index. Store the info in X86MachineFunctionInfo with the integer index as the key. This adds two new target independent ISDOpcodes and two new target dependent Opcodes corresponding to @llvm.call.preallocated.{setup,arg}. The setup ISelDAG node takes in a chain and outputs a chain and a SrcValue of the preallocated call Value. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to an %esp adjustment, the exact amount determined by looking in X86MachineFunctionInfo with the integer index key. The arg ISelDAG node takes in a chain, a SrcValue of the preallocated call Value, and the arg index int constant. It produces a chain and the pointer fo the arg. It is lowered to a target dependent node with the SrcValue replaced with the integer index key by looking in X86MachineFunctionInfo. In X86TargetLowering::EmitInstrWithCustomInserter() this is lowered to a lea of the stack pointer plus an offset determined by looking in X86MachineFunctionInfo with the integer index key. Force any function containing a preallocated call to use the frame pointer. Does not yet handle a setup without a call, or a conditional call. Does not yet handle musttail. That requires a LangRef change first. Tried to look at all references to inalloca and see if they apply to preallocated. I've made preallocated versions of tests testing inalloca whenever possible and when they make sense (e.g. not alloca related, inalloca edge cases). Aside from the tests added here, I checked that this codegen produces correct code for something like ``` struct A { A(); A(A&&); ~A(); }; void bar() { foo(foo(foo(foo(foo(A(), 4), 5), 6), 7), 8); } ``` by replacing the inalloca version of the .ll file with the appropriate preallocated code. Running the executable produces the same results as using the current inalloca implementation. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77689	2020-05-20 09:20:38 -07:00
QingShan Zhang	2b59e9f1bd	[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression We have the getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression. However, during negating the expression, the cost might change as we are changing the DAG, and then, hit the assertion if we negated the wrong expression as the cost is not trustful anymore. This patch is target to remove the getNegatibleCost to avoid the out of sync with getNegatedExpression, and check the cost during negating the expression. It also reduce the duplicated code between getNegatibleCost and getNegatedExpression. And fix the crash for the test in D76638 Reviewed By: RKSimon, spatel Differential Revision: https://reviews.llvm.org/D77319	2020-05-20 02:12:16 +00:00
Matt Arsenault	3e315697ac	DAG: Use correct pointer size for llvm.ptrmask This was ignoring the address space, and would assert on address spaces with a different size from the default.	2020-05-18 16:46:11 -04:00
Nikita Popov	52e98f620c	[Alignment] Remove unnecessary getValueOrABITypeAlignment calls (NFC) Now that load/store alignment is required, we no longer need most of them. Also switch the getLoadStoreAlignment() helper to return Align instead of MaybeAlign.	2020-05-17 22:19:15 +02:00
Craig Topper	796ae8cf82	[LegalizeDAG] Use MachinePointerInfo::getUnknownStack in place of MachinePointerInfo() in a couple places. NFC We know the pointer somewhere on the stack, we just don't know exactly where since the index may be variable. Differential Revision: https://reviews.llvm.org/D80060	2020-05-16 15:48:16 -07:00
Eli Friedman	4f04db4b54	AllocaInst should store Align instead of MaybeAlign. Along the lines of D77454 and D79968. Unlike loads and stores, the default alignment is getPrefTypeAlign, to match the existing handling in various places, including SelectionDAG and InstCombine. Differential Revision: https://reviews.llvm.org/D80044	2020-05-16 14:53:16 -07:00
Craig Topper	13d44b2a0c	[LegalizeDAG] Use getMemBasePlusOffset to simplify some code. Use other signature of getMemBasePlusOffset in another location. NFCI The code was calculating an offset from a stack pointer SDValue. This is exactly what getMemBasePlusOffset does. I also replaced sizeof(int) with a hardcoded 4. We know the type we're operating on is 4 bytes. But the size of int that the source code is being compiled with isn't guaranteed to be 4 bytes. While here replace another use of getMemBasePlusOffset that was proceeded with a call to getConstant with the other signature that call getConstant internally.	2020-05-16 01:02:08 -07:00
Craig Topper	45c7b3fd91	[LegalizeVectorTypes] Remove non-constnat INSERT_SUBVECTOR handling. NFC Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-15 23:56:13 -07:00
Eli Friedman	11aa3707e3	StoreInst should store Align, not MaybeAlign This is D77454, except for stores. All the infrastructure work was done for loads, so the remaining changes necessary are relatively small. Differential Revision: https://reviews.llvm.org/D79968	2020-05-15 12:26:58 -07:00
David Sherwood	fb1c55b57d	[CodeGen] Fix FoldConstantVectorArithmetic for scalable vectors For now I have changed FoldConstantVectorArithmetic to return early if we encounter a scalable vector, since the subsequent code assumes you can perform lane-wise constant folds. However, in future work we should be able to extend this to look at splats of a constant value and fold those if possible. I have also added the same code to FoldConstantArithmetic, since that deals with vectors too. The warnings I fixed in this patch were being generated by this existing test: CodeGen/AArch64/sve-int-arith.ll Differential Revision: https://reviews.llvm.org/D79421	2020-05-15 14:58:44 +01:00
Simon Pilgrim	9d4b4f344d	DAGCombiner.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC. Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-15 12:41:35 +01:00
David Sherwood	8ce4a8f6df	[CodeGen] Refactor CreateStackTemporary I've created a new variant of CreateStackTemporary that takes TypeSize and Align arguments, and made the older instances of CreateStackTemporary call this new function. This refactoring is in preparation for more patches in this area related to scalable vectors and improving the alignment calculations. Differential Revision: https://reviews.llvm.org/D79933	2020-05-15 07:29:13 +01:00
Eli Friedman	accc6b5545	LoadInst should store Align, not MaybeAlign. The fact that loads and stores can have the alignment missing is a constant source of confusion: code that usually works can break down in rare cases. So fix the LoadInst API so the alignment is never missing. To reduce the number of changes required to make this work, IRBuilder and certain LoadInst constructors will grab the module's datalayout and compute the alignment automatically. This is the same alignment instcombine would eventually apply anyway; we're just doing it earlier. There's a minor risk that the way we're retrieving the datalayout could break out-of-tree code, but I don't think that's likely. This is the last in a series of patches, so most of the necessary changes have already been merged. Differential Revision: https://reviews.llvm.org/D77454	2020-05-14 13:19:21 -07:00
Simon Pilgrim	acb6f1ae09	TargetLowering.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC. Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-14 18:13:58 +01:00
Jay Foad	17941437a2	[TargetLowering] Improve expansion of FSHL/FSHR Use an extra shift-by-1 instead of a compare and select to handle the shift-by-zero case. This sometimes saves one instruction (if the compare couldn't be combined with a previous instruction). It also works better on targets that don't have good select instructions. Note that currently this change doesn't affect most targets because expandFunnelShift is not used because funnel shift intrinsics are lowered early in SelectionDAGBuilder. But there is work afoot to change that; see D77152. Differential Revision: https://reviews.llvm.org/D77301	2020-05-14 16:36:22 +01:00
Simon Pilgrim	80715b7124	SelectionDAG.cpp - remove non-constant EXTRACT_SUBVECTOR/INSERT_SUBVECTOR handling. NFC. Now that D79814 has landed, we can assume that subvector ops use constant, in-range indices.	2020-05-14 13:23:00 +01:00
Eric Christopher	bfa200ebcf	Remove an unused variable.	2020-05-13 15:13:02 -07:00
Eli Friedman	ed428c429e	[SelectionDAG] Require constant index for INSERT/EXTRACT_SUBVECTOR. It sounds like an interesting idea in theory, but nothing is actually taking advantage of it, and specifying/implementing the edge cases is painful. So just forbid it. Differential Revision: https://reviews.llvm.org/D79814	2020-05-13 13:08:59 -07:00
Craig Topper	8c72b0271b	[CodeGen] Use Align in MachineConstantPool.	2020-05-12 10:06:40 -07:00
James Y Knight	e9536795a3	Add comment for SelectionDAGBuilder::SL field.	2020-05-12 10:46:08 -04:00
David Sherwood	42c7a6d52b	[CodeGen] Fix incorrect uses of getVectorNumElements() I have fixed up some places in SelectionDAG::getNode() where we used to assert that the number of vector elements for two types are the same. I have changed such cases to assert that the element counts are the same instead. I've added new tests that exercise the code paths for all the truncations. All the extend operations are covered by this existing test: CodeGen/AArch64/sve-sext-zext.ll For the ISD::SETCC case I fixed this code path is exercised by these existing tests: CodeGen/AArch64/sve-fcmp.ll CodeGen/AArch64/sve-intrinsics-int-compares-with-imm.ll Differential Revision: https://reviews.llvm.org/D79399	2020-05-12 07:50:37 +01:00
Eli Friedman	c9c930ae67	[SelectionDAG] Don't promote the alignment of allocas beyond the stack alignment. allocas in LLVM IR have a specified alignment. When that alignment is specified, the alloca has at least that alignment at runtime. If the specified type of the alloca has a higher preferred alignment, SelectionDAG currently ignores that specified alignment, and increases the alignment. It does this even if it would trigger stack realignment. I don't think this makes sense, so this patch changes that. I was looking into this for SVE in particular: for SVE, overaligning vscale'ed types is extra expensive because it requires realigning the stack multiple times, or using dynamic allocation. (This currently isn't implemented.) I updated the expected assembly for a couple tests; in particular, for arg-copy-elide.ll, the optimization in question does not increase the alignment the way SelectionDAG normally would. For the rest, I just increased the specified alignment on the allocas to match what SelectionDAG was inferring. Differential Revision: https://reviews.llvm.org/D79532	2020-05-11 17:39:00 -07:00
Sam McCall	728cf6d86b	Revert "[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression" This reverts commit `3c44c441db`. Causes infloops on some inputs, see https://reviews.llvm.org/D77319 for repro	2020-05-11 16:44:01 +02:00
QingShan Zhang	3c44c441db	[DAGCombine] Remove the getNegatibleCost to avoid the out of sync with getNegatedExpression We have the getNegatibleCost/getNegatedExpression to evaluate the cost and negate the expression. However, during negating the expression, the cost might change as we are changing the DAG, and then, hit the assertion if we negated the wrong expression as the cost is not trustful anymore. This patch is target to remove the getNegatibleCost to avoid the out of sync with getNegatedExpression, and check the cost during negating the expression. It also reduce the duplicated code between getNegatibleCost and getNegatedExpression. And fix the crash for the test in D76638 Reviewed By: RKSimon, spatel Differential Revision: https://reviews.llvm.org/D77319	2020-05-11 02:41:10 +00:00
Craig Topper	bebdc62c3f	[SelectionDAG] Remove ConstantPoolSDNode::getAlignment. Use getAlign instead. Differential Revision: https://reviews.llvm.org/D79459	2020-05-08 16:04:11 -07:00
Craig Topper	d1119980e5	[SelectionDAG] Use Align/MaybeAlign for ConstantPoolSDNode. This patch stores the alignment for ConstantPoolSDNode as an Align and updates the getConstantPool interface to take a MaybeAlign. Removing getAlignment() will be done as a follow up. Differential Revision: https://reviews.llvm.org/D79436	2020-05-08 16:04:11 -07:00
Simon Pilgrim	70293ba26f	[DAG] SimplifyMultipleUseDemandedBits - remove superfluous bitcasts If the SimplifyMultipleUseDemandedBits calls BITCASTs that peek through back to the original type then we can remove the BITCASTs entirely. Differential Revision: https://reviews.llvm.org/D79572	2020-05-08 19:04:49 +01:00
aartbik	771d30c647	[llvm] [CodeGen] Fixed vector halving bug for masked store Summary: Note that this fix is very similar to what has already been done for the masked load in https://reviews.llvm.org/D78608 Bugs: https://bugs.llvm.org/show_bug.cgi?id=45563 https://bugs.llvm.org/show_bug.cgi?id=45833 Reviewers: craig.topper, nicolasvasilache, mehdi_amini Reviewed By: craig.topper Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D79611	2020-05-07 19:01:40 -07:00
Kerry McLaughlin	a31f4c52bf	[SVE][CodeGen] Fix legalisation for scalable types Summary: This patch handles illegal scalable types when lowering IR operations, addressing several places where the value of isScalableVector() is ignored. For types such as <vscale x 8 x i32>, this means splitting the operations. In this example, we would split it into two operations of type <vscale x 4 x i32> for the low and high halves. In cases such as <vscale x 2 x i32>, the elements in the vector will be promoted. In this case they will be promoted to i64 (with a vector of type <vscale x 2 x i64>) Reviewers: sdesmalen, efriedma, huntergr Reviewed By: efriedma Subscribers: david-arm, tschuett, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78812	2020-05-07 10:01:31 +01:00
Craig Topper	7b9d6673bf	[SelectionDAG] When splitting gather operands in type legalization, set MMO size to UnknownSize I missed this case when I did the same for gather results and scatter operands in `c69a4d6bef`.	2020-05-06 19:57:14 -07:00
LemonBoy	7fa5abd343	[SelectionDAG] Fix assertion failure with big shift amounts Calling getShiftAmountTy with LegalTypes set may return a type that's too narrow to hold the shift amount for integer type it's applied to. Fixes the regression introduced by D79096 Differential Revision: https://reviews.llvm.org/D79405	2020-05-06 11:58:37 -07:00
Sanjay Patel	2f1fe1864d	[DAGCombiner] sink target-supported FP<->int cast op after concat vectors Try to combine N short vector cast ops into 1 wide vector cast op: concat (cast X), (cast Y)... -> cast (concat X, Y...) This is part of solving PR45794: https://bugs.llvm.org/show_bug.cgi?id=45794 As noted in the code comment, this is uglier than I was hoping because the opcode determines whether we pass the source or destination type to isOperationLegalOrCustom(). Also IIUC, there's no way to validate what the other (dest or src) type is. Without the extra legality check on that, there's an ARM regression test in: test/CodeGen/ARM/isel-v8i32-crash.ll ...that will crash trying to lower an unsupported v8f32 to v8i16. Differential Revision: https://reviews.llvm.org/D79360	2020-05-06 10:25:58 -04:00
David Sherwood	cd3a54c55a	[CodeGen] Fix warnings due to SelectionDAG::getSplatSourceVector Summary: I have fixed several places in getSplatSourceVector and isSplatValue to work correctly with scalable vectors. I added new support for the ISD::SPLAT_VECTOR DAG node as one of the obvious cases we can support with scalable vectors. In other places I have tried to do the sensible thing, such as bail out for vector types we don't yet support or don't intend to support. It's not possible to add IR test cases to cover these changes, since they are currently only ever exercised on certain targets, e.g. only X86 targets use the result of getSplatSourceVector. I've assumed that X86 tests already exist to test these code paths for fixed vectors. However, I have added some AArch64 unit tests that test the specific functions I have changed. Differential revision: https://reviews.llvm.org/D79083	2020-05-05 08:45:41 +01:00
Alex Richardson	d1ff003fbb	[SelectionDAGBuilder] Stop setting alignment to one for hidden sret values We allocated a suitably aligned frame index so we know that all the values have ABI alignment. For MIPS this avoids using pair of lwl + lwr instructions instead of a single lw. I found this when compiling CHERI pure capability code where we can't use the lwl/lwr unaligned loads/stores and and were to falling back to a byte load + shift + or sequence. This should save a few instructions for MIPS and possibly other backends that don't have fast unaligned loads/stores. It also improves code generation for CodeGen/X86/pr34653.ll and CodeGen/WebAssembly/offset.ll since they can now use aligned loads. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78999	2020-05-04 14:44:39 +01:00
LemonBoy	6d103ca855	[SelectionDAG] Unify scalarizeVectorLoad and VectorLegalizer::ExpandLoad The two code paths have the same goal, legalizing a load of a non-byte-sized vector by loading the "flattened" representation in memory, slicing off each single element and then building a vector out of those pieces. The technique employed by `ExpandLoad` is slightly more convoluted and produces slightly better codegen on ARM, AMDGPU and x86 but suffers from some bugs (D78480) and is wrong for BE machines. Differential Revision: https://reviews.llvm.org/D79096	2020-05-02 15:18:10 -07:00
Simon Pilgrim	a09a3c6d3e	Revert rG8e05ac0a510c - "[DAGCombine] visitTRUNCATE - remove GetDemandedBits call" Causing buildbot failures	2020-05-02 20:08:33 +01:00
Simon Pilgrim	8e05ac0a51	[DAGCombine] visitTRUNCATE - remove GetDemandedBits call rL368553 added SimplifyMultipleUseDemandedBits handling for ISD::TRUNCATE to SimplifyDemandedBits so we don't need to duplicate this (and it gets rid of another GetDemandedBits call which is slowly being replaced with SimplifyMultipleUseDemandedBits anyhow).	2020-05-02 19:52:17 +01:00
Simon Pilgrim	7cb5a51f38	[DAG] SimplifyDemandedVectorElts - add INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling	2020-05-01 16:20:51 +01:00
Simon Pilgrim	65d32a9892	[DAG] SimplifyDemandedVectorElts - remove INSERT_SUBVECTOR if we don't demand the subvector	2020-05-01 16:20:51 +01:00
Simon Pilgrim	e3c0be596c	[DAG] SimplifyDemandedVectorElts - add EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling	2020-05-01 13:48:07 +01:00
Craig Topper	6a1ad76dab	[X86] Don't return true from isTruncateFree for vectors Also fix some cost tables for vXi1 types to match the costs entries for the types they will be promoted to. Differential Revision: https://reviews.llvm.org/D79045	2020-04-30 16:43:35 -07:00
Simon Pilgrim	96238486ed	[DAGCombine] Move the remaining X86 funnel shift patterns to DAGCombine X86 matches several 'shift+xor' funnel shift patterns: fold (or (srl (srl x1, 1), (xor y, 31)), (shl x0, y)) -> (fshl x0, x1, y) fold (or (shl (shl x0, 1), (xor y, 31)), (srl x1, y)) -> (fshr x0, x1, y) fold (or (shl (add x0, x0), (xor y, 31)), (srl x1, y)) -> (fshr x0, x1, y) These patterns are also what we end up with the proposed expansion changes in D77301. This patch moves these to DAGCombine's generic MatchFunnelPosNeg. All existing X86 test cases still pass, and we just have a small codegen change in pr32282.ll. Reviewed By: @spatel Differential Revision: https://reviews.llvm.org/D78935	2020-04-30 12:57:17 +01:00
Simon Pilgrim	6547a5ceb2	[DAG] Add TODO comment regarding ADD(X,X) -> SHL(X,1) canonicalization As discussed on D78935	2020-04-30 12:57:16 +01:00
David Sherwood	058cd8c5be	[CodeGen] Add support for inserting elements into scalable vectors Summary: This patch tries to ensure that we do something sensible when generating code for the ISD::INSERT_VECTOR_ELT DAG node when operating on scalable vectors. Previously we always returned 'undef' when inserting an element into an out-of-bounds lane index, whereas now we only do this for fixed length vectors. For scalable vectors it is assumed that the backend will do the right thing in the same way that we have to deal with variable lane indices. In this patch I have permitted a few basic combinations for scalable vector types where it makes sense, but in general avoided most cases for now as they currently require the use of BUILD_VECTOR nodes. This patch includes tests for all scalable vector types when inserting into lane 0, but I've only included one or two vector types for other cases such as variable lane inserts. Differential Revision: https://reviews.llvm.org/D78992	2020-04-30 11:14:04 +01:00
QingShan Zhang	b5f89744cc	[DAGCombine] Checking the cost directly to improve the code readability Call getNegatedExpression(Cost) and check the Cost to make the code more clear. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D78347	2020-04-29 01:49:39 +00:00
Craig Topper	e13c141a91	[SelectionDAGBuilder] Use CallBase::isInlineAsm in a couple places. NFC These lines were just changed from using CallBase::getCalledValue to getCallledOperand. Go aheand change them to isInlineAsm.	2020-04-27 23:00:44 -07:00
Craig Topper	a58b62b4a2	[IR] Replace all uses of CallBase::getCalledValue() with getCalledOperand(). This method has been commented as deprecated for a while. Remove it and replace all uses with the equivalent getCalledOperand(). I also made a few cleanups in here. For example, to removes use of getElementType on a pointer when we could just use getFunctionType from the call. Differential Revision: https://reviews.llvm.org/D78882	2020-04-27 22:17:03 -07:00
David Sherwood	096b25a8d8	[CodeGen] Use SPLAT_VECTOR for zeroinitialiser with scalable types Summary: When generating code for the LLVM IR zeroinitialiser operation, if the vector type is scalable we should be using SPLAT_VECTOR instead of BUILD_VECTOR. Differential Revision: https://reviews.llvm.org/D78636	2020-04-27 15:57:59 +01:00
QingShan Zhang	2957fa0cd1	[NFC][DAGCombine] Adding three helper functions and change the getNegatedExpression to negateExpression This is a NFC patch for D77319. The idea is to hide the getNegatibleCost inside the getNegatedExpression() to have it return null if the cost is expensive, and add some helper function for easy to use. And rename the old getNegatedExpression to negateExpression to avoid the semantic conflict. Reviewed By: RKSimon Differential revision: https://reviews.llvm.org/D78291	2020-04-27 04:11:42 +00:00
aartbik	907871d9ad	[llvm] [CodeGen] Fixed vector halving bug for masked load Summary: Given a VL=14 that is enveloped by a proper VL=16, splitting the masked load using the enveloping halving VL=8/8 should yields should eventually yield V=8/5. This fixes various assert failures in getHalfNumVectorElementsVT() and IncrementMemoryAddress(). Note, I suspect similar fixes will be needed for other masked operations, but for now I send out a fix for masked load only. Bugzilla issue 45563 https://bugs.llvm.org/show_bug.cgi?id=45563 Reviewers: craig.topper, mehdi_amini, nicolasvasilache Reviewed By: craig.topper Subscribers: hiraditya, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D78608	2020-04-23 15:12:44 -07:00
Christopher Tetreault	ccd623eae3	[SVE] Remove calls to isScalable from CodeGen Reviewers: efriedma, sdesmalen, stoklund, sunfish Reviewed By: efriedma Subscribers: tschuett, hiraditya, rkruppe, psnobl, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77755	2020-04-23 12:58:52 -07:00
Alex Richardson	bbcfce4bad	Use FrameIndexTy for stack protector Using getValueType() is not correct for architectures extended with CHERI since we need a pointer type and not the value that is loaded. While stack protector is useless when you have CHERI (since CHERI provides much stronger security guarantees), we still have a test to check that we can generate correct code for checks. Merging `b281138a1b` into our tree broke this test. Fix by using TLI.getFrameIndexTy(). Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D77785	2020-04-23 13:12:27 +01:00
Craig Topper	05a11974ae	[CallSite removal] Remove unneeded includes of CallSite.h. NFC	2020-04-22 00:07:13 -07:00
Kang Zhang	a8e15ee04a	[CodeGen] Support freeze expand for ppc_fp128 Summary: The patch D29014 has added the new ISD::FREEZE and can deal with the integer. The patch D76980 has added SoftenFloatRes_FREEZE for float point. But we still lack of expand for ppc_fp128, this will cause assertion for some cases. This patch is to support freeze expand for ppc_fp128. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D78278	2020-04-20 07:27:41 +00:00
Simon Pilgrim	46de0d5fe9	SelectionDAGBuilder.h - remove unused includes + forward declarations. NFC. Replace SelectionDAG.h include with SelectionDAG forward declaration.	2020-04-19 12:38:41 +01:00
Simon Pilgrim	032738d17e	InstrEmitter.h - reduce SelectionDAG.h include to SelectionDAGNodes.h include. Add SDDbgLabel/TargetLowering forward declarations. Add the full SelectionDAG.h include to InstrEmitter.cpp.	2020-04-19 11:52:31 +01:00
Christopher Tetreault	c858debebc	Remove asserting getters from base Type Summary: Remove asserting vector getters from Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: dexonsmith, sdesmalen, efriedma Reviewed By: efriedma Subscribers: cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D77278	2020-04-17 14:03:31 -07:00
Fraser Cormack	c819ef9653	Provide operand indices to adjustSchedDependency This allows targets to know exactly which operands are contributing to the dependency, which is required for targets with per-operand scheduling models. Differential Revision: https://reviews.llvm.org/D77135	2020-04-17 11:08:44 +01:00
Craig Topper	944cc5e0ab	[SelectionDAGBuilder][CGP][X86] Move some of SDB's gather/scatter uniform base handling to CGP. I've always found the "findValue" a little odd and inconsistent with other things in SDB. This simplfifies the code in SDB to just handle a splat constant address or a 2 operand GEP in the same BB. This removes the need for "findValue" since the operands to the GEP are guaranteed to be available. The splat constant handling is new, but was needed to avoid regressions due to constant folding combining GEPs created in CGP. CGP is now responsible for canonicalizing gather/scatters into this form. The pattern I'm using for scalarizing, a scalar GEP followed by a GEP with an all zeroes index, seems to be subject to constant folding that the insertelement+shufflevector was not. Differential Revision: https://reviews.llvm.org/D76947	2020-04-16 17:49:22 -07:00
Eli Friedman	7c10541e56	[SelectionDAG] Fix usage of Align constructing MachineMemOperands. The "Align" passed into getMachineMemOperand etc. is the alignment of the MachinePointerInfo, not the alignment of the memory operation. (getAlign() on a MachineMemOperand automatically reduces the alignment to account for this.) We were passing on wrong (overconservative) alignment in a bunch of places. Fix a bunch of these, mostly in legalization. And while I'm here, switch to the new Align APIs. The test changes are all scheduling changes: the biggest effect of preserving large alignments is that it improves alias analysis, so the scheduler has more freedom. (I was originally just trying to do a minor cleanup in SelectionDAGBuilder, but I accidentally went deeper down the rabbit hole.) Differential Revision: https://reviews.llvm.org/D77687	2020-04-15 13:01:41 -07:00
Benjamin Kramer	d790bd3999	Unbreak the build	2020-04-15 15:54:47 +02:00
Victor Campos	d85b3877dc	[CodeGen][ARM] Error when writing to specific reserved registers in inline asm Summary: No error or warning is emitted when specific reserved registers are written to in inline assembly. Therefore, writes to the program counter or to the frame pointer, for instance, were permitted, which could have led to undesirable behaviour. Example: int foo() { register int a __asm__("r7"); // r7 = frame-pointer in M-class ARM __asm__ __volatile__("mov %0, r1" : "=r"(a) : : ); return a; } In contrast, GCC issues an error in the same scenario. This patch detects writes to specific reserved registers in inline assembly for ARM and emits an error in such case. The detection works for output and input operands. Clobber operands are not handled here: they are already covered at a later point in AsmPrinter::emitInlineAsm(const MachineInstr *MI). The registers covered are: program counter, frame pointer and base pointer. This is ARM only. Therefore the implementation of other targets' counterparts remain open to do. Reviewers: efriedma Reviewed By: efriedma Subscribers: kristof.beyls, hiraditya, danielkiss, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D76848	2020-04-15 14:40:42 +01:00
QingShan Zhang	c9f9c79c5a	[NFC][DAGCombine] Change the value of NegatibleCost to make it align with the semantics This is a minor NFC change to make the code more clear. We have the NegatibleCost that has cheaper, neutral, and expensive. Typically, the smaller one means the less cost. It is inverse for current implementation, which makes following code not easy to read. If (CostX > CostY) negate(X) Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D77993	2020-04-15 02:20:58 +00:00
Craig Topper	3043093822	[CallSite removal][CodeGen] Replace ImmutableCallSite with CallBase in isInTailCallPosition.	2020-04-13 23:04:57 -07:00
Craig Topper	113f37a1f9	[CallSite removal][TargetLowering] Replace ImmutableCallSite with CallBase Differential Revision: https://reviews.llvm.org/D77995	2020-04-13 13:50:15 -07:00
Matt Arsenault	e6605a209c	DAG: Fix wrong legality check for ISD::FMAD Since `1725f28841`, this should check isFMADLegalForFAddFSub rather than the the plain isOperationLegal. This would assert in a subset of cases due to an oddity in how FMAD is selected. We will allow FMA formation pre-legalize, but not FMAD even in cases where it would be valid. The current hook requires passing in the root fadd/fsub. However, in this distributed case, this would be far more complicated to pass in the relevant operand. AMDGPU doesn't get any value from the node, and only needs the type and is the only implementor, so I'm not sure why we have this complexity. Just rename and expand the assert to avoid the more complicated checks spread through the distribution logic.	2020-04-13 10:25:39 -07:00
Craig Topper	dbb272b0a3	[CallSite removal][FastISel] Use CallBase instead of CallSite in fastLowerCall.	2020-04-12 18:02:24 -07:00
Craig Topper	95192f548d	[CallSite removal][TargetLowering] Use CallBase instead of CallSite in TargetLowering::ParseConstraints interface. Differential Revision: https://reviews.llvm.org/D77929	2020-04-12 11:26:25 -07:00
Jonathan Roelofs	41f13f1f64	reland: [DAG] Fix PR45049: LegalizeTypes crash Sometimes LegalizeTypes knows about common subexpressions before SelectionDAG does, leading to accidental SDValue removal before its reference count was truly zero. Differential Revision: https://reviews.llvm.org/D76994 Reviewed-By: bjope Fixes: https://bugs.llvm.org/show_bug.cgi?id=45049 Reverted in `3ce77142a6` because the previous patch broke the expensive-checks bots. The new patch removes the broken check.	2020-04-12 09:52:17 -06:00
Craig Topper	5b42399029	[CallSite removal][FastISel] Remove uses of CallSite. Differential Revision: https://reviews.llvm.org/D77933	2020-04-11 20:52:45 -07:00
Craig Topper	806763efcf	[CallSite removal][SelectionDAGBuilder] Use CallBase instead of ImmutableCallSite in visitPatchpoint. Differential Revision: https://reviews.llvm.org/D77932	2020-04-11 13:07:31 -07:00
Sanjay Patel	1318ddbc14	[VectorUtils] rename scaleShuffleMask to narrowShuffleMaskElts; NFC As proposed in D77881, we'll have the related widening operation, so this name becomes too vague. While here, change the function signature to take an 'int' rather than 'size_t' for the scaling factor, add an assert for overflow of 32-bits, and improve the documentation comments.	2020-04-11 10:05:49 -04:00
Craig Topper	9c1842d8af	Change FastISel::CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI. This is the same as what was done to the CallLoweringInfo in TargetLowering.h in r309159. This is just a step on the way to replacing this with CallBase.	2020-04-10 23:45:36 -07:00
Craig Topper	f49f6cf91e	[CallSite removal][SelectionDAGBuilder] Remove most CallSite usage from visitInlineAsm. I only left it at the interface to ParseConstraints since that needs updates to other callers in different files. I'll do that as a follow up. Differential Revision: https://reviews.llvm.org/D77892	2020-04-10 19:23:33 -07:00
Christopher Tetreault	889f6606ed	Clean up usages of asserting vector getters in Type Summary: Remove usages of asserting vector getters in Type in preparation for the VectorType refactor. The existence of these functions complicates the refactor while adding little value. Reviewers: stoklund, sdesmalen, efriedma Reviewed By: sdesmalen Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D77272	2020-04-10 14:53:43 -07:00
Simon Pilgrim	a88cc20456	ProfileSummaryInfo.h - remove unnecessary includes. NFC Remove a number of includes that aren't necessary (nor are we relying on the remaining includes to provide the declarations), we just needed a llvm::Instruction forward declaration. This exposed a couple of source files that were implicitly replying on the includes for their use of llvm::SmallSet or std::set, requiring local includes to be added there instead.	2020-04-10 16:25:48 +01:00
Serguei Katkov	4275eb1331	Re-land [Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This require runtime support to know how to take a value from register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, danstrushin Reviewed By: dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77797	2020-04-10 10:13:39 +07:00
Serguei Katkov	44f0d7f136	Revert "[Codegen/Statepoint] Allow usage of registers for non gc deopt values." This reverts commit `a0275705bb`. It causes buildbot failures building LLVM with BUILD_SHARED_LIBS due to a linker error.	2020-04-09 18:24:47 +07:00
Serguei Katkov	a0275705bb	[Codegen/Statepoint] Allow usage of registers for non gc deopt values. The change introduces the usage of physical registers for non-gc deopt values. This require runtime support to know how to take a value from register. By default usage is off and can be switched on by option. The change also introduces additional fix-up patch which forces the spilling of caller saved registers (clobbered after the call) and re-writes statepoint to use spill slots instead of caller saved registers. Reviewers: reames, dantrushin Reviewed By: reames, dantrushin Subscribers: mgorny, hiraditya, mgrang, llvm-commits Differential Revision: https://reviews.llvm.org/D77371	2020-04-09 16:57:35 +07:00
Jay Foad	c63aed890e	[KnownBits] Move AND, OR and XOR logic into KnownBits Summary: There are at least three clients for KnownBits calculations: ValueTracking, SelectionDAG and GlobalISel. To reduce duplication the common logic should be moved out of these clients and into KnownBits itself. This patch does this for AND, OR and XOR calculations by implementing and using appropriate operator overloads KnownBits::operator& etc. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74060	2020-04-09 10:10:37 +01:00
Matt Arsenault	586769cce2	DAG: Use Register	2020-04-08 13:44:31 -04:00
Matt Arsenault	dcce3ef1d2	FastISel: Partially use Register Doesn't try to convert the cases that depend on generated code.	2020-04-08 12:10:58 -04:00

... 2 3 4 5 6 ...

10971 Commits