llvm-project

Commit Graph

Author	SHA1	Message	Date
Sjoerd Meijer	dad5f00e3b	[DAGCombine] Combine pattern for REV16 This adds another pattern to the combiner for a case that we were not handling to generate the REV16 instruction for ARM/Thumb2 and a bswap+ror on X86. Differential Revision: https://reviews.llvm.org/D74032	2020-02-17 14:54:17 +00:00
Benjamin Kramer	5fc5c7db38	Strength reduce vectors into arrays. NFCI.	2020-02-17 15:37:35 +01:00
Simon Pilgrim	ce2b5f1569	Fix gcc9.2 -Winit-list-lifetime warning. NFCI. Reported by @lbenes (Luke Benes)	2020-02-15 16:48:51 +00:00
Diogo Sampaio	8bc790f9e6	[AArch64][FPenv] Update chain of int to fp conversion Summary: When using strict fp, it is required to update the chain when performing integer type promotion of a operand to a integer to floating point conversion. Reviewers: craig.topper, john.brawn Reviewed By: craig.topper Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74597	2020-02-15 05:07:34 +00:00
Vedant Kumar	3091049446	Add dbgs() output to help track down missing DW_AT_location bugs, NFC	2020-02-13 14:38:44 -08:00
Fangrui Song	0dce409cee	[AsmPrinter] De-capitalize Emit{Function,BasicBlock]* and Emit{Start,End}OfAsmFile	2020-02-13 13:22:49 -08:00
Simon Pilgrim	32176133fa	Move FIXME to start of comment so visual studio actually tags it. NFC.	2020-02-13 14:28:50 +00:00
Serguei Katkov	a6f38b4697	[Statepoint] Remove redundant clear of call target on register Patchable statepoint is lowered into sequence of nops, so zeroed call target should not be on register. It is better to use getTargetConstant instead of getConstant to select zero constant for call target. Reviewers: reames Reviewed By: reames Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D74465	2020-02-13 10:25:50 +07:00
Jay Foad	32aac25637	[KnownBits] Introduce anyext instead of passing a flag into zext Summary: This was a very odd API, where you had to pass a flag into a zext function to say whether the extended bits really were zero or not. All callers passed in a literal true or false. I think it's much clearer to make the function name reflect the operation being performed on the value we're tracking (rather than on the KnownBits Zero and One fields), so zext means the value is being zero extended and new function anyext means the value is being extended with unknown bits. NFC. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74482	2020-02-12 19:06:53 +00:00
Simon Pilgrim	9eb426c88c	[TargetLowering] Add NegatibleCost enum for isNegatibleForFree return codes The isNegatibleForFree/getNegatedExpression methods currently rely on a raw char value to indicate whether a negation is beneficial or not. This patch replaces the char return value with an NegatibleCost enum to more clearly demonstrate what is implied. It also renames isNegatibleForFree to getNegatibleCost to more accurately reflect whats going on. Differential Revision: https://reviews.llvm.org/D74221	2020-02-12 11:51:42 +00:00
Djordje Todorovic	97ed706a96	Revert "[DebugInfo] Enable the debug entry values feature by default" This reverts commit rG9f6ff07f8a39. Found a test failure on clang-with-thin-lto-ubuntu buildbot.	2020-02-12 11:59:04 +01:00
Djordje Todorovic	9f6ff07f8a	[DebugInfo] Enable the debug entry values feature by default This patch enables the debug entry values feature. - Remove the (CC1) experimental -femit-debug-entry-values option - Enable it for x86, arm and aarch64 targets - Resolve the test failures - Leave the llc experimental option for targets that do not support the CallSiteInfo yet Differential Revision: https://reviews.llvm.org/D73534	2020-02-12 10:25:14 +01:00
Nicolai Hähnle	07a5b849f7	SelectionDAG: Fix bug in ClusterNeighboringLoads Summary: The method attempts to find loads that can be legally clustered by looking for loads consuming the same chain glue token. However, the old code looks at _all_ users of values produced by the chain node -- including uses of the loaded/returned value of volatile loads or atomics. This could lead to circular dependencies which then failed during scheduling. With this change, we filter out users by getResNo, i.e. by which SDValue value they use, to ensure that we only look at users of the chain glue token. This appears to be a rather old bug, which is perhaps surprising. However, the test case is actually quite fragile (i.e., it is hidden by fairly small changes), and the test _must_ use volatile loads for the bug to manifest. Reviewers: arsenm, bogner, craig.topper, foad Subscribers: MatzeB, jvesely, wdng, hiraditya, javed.absar, jfb, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D74253	2020-02-12 09:12:55 +01:00
Craig Topper	0daf9b8e41	[X86][LegalizeTypes] Add SoftPromoteHalf support STRICT_FP_EXTEND and STRICT_FP_ROUND This adds a strict version of FP16_TO_FP and FP_TO_FP16 and uses them to implement soft promotion for the half type. This is enough to provide basic support for __fp16 with strictfp. Add the necessary X86 support to use VCVTPS2PH/VCVTPH2PS when F16C is enabled.	2020-02-11 22:30:04 -08:00
Sebastian Neubauer	7cddd15e56	[SelectionDAG] Optimize build_vector of truncates and shifts Add a simplification to fuse a manual vector extract with shifts and truncate into a bitcast. Unpacking and packing values into vectors is only optimized with extractelement instructions, not when manually unpacked using shifts and truncates. This patch simplifies shifts and truncates into a bitcast if possible. Simplify (build_vec (trunc $1) (trunc (srl $1 width)) (trunc (srl $1 (2 * width))) ...) to (bitcast $1) Differential Revision: https://reviews.llvm.org/D73892	2020-02-10 15:04:07 +01:00
Craig Topper	eeb63944e4	[LegalizeTypes][ARM][AArch64][PowerPC][RISCV][X86] Use BUILD_PAIR to return expanded integer results from ReplaceNodeResults instead of just returning two results. Remove code from LegalizeTypes that allowed this to work. We were already using BUILD_PAIR for this in some places so this standardizes on a single way to do this.	2020-02-08 09:52:31 -08:00
Craig Topper	2af1640f9a	[LegalizeDAG][X86][AMDGPU] Use ANY_EXTEND instead of ZERO_EXTEND when promoting ISD::CTTZ/CTTZ_ZERO_UNDEF. Summary: For CTTZ we place a set bit just past where the non-promoted type stopped so the extended bits won't be used for the count. For CTTZ_ZERO_UNDEF we don't care what happens if no bits are set in the original type and we end up counting into the extended bits. So we can just use ANY_EXTEND for both cases. This matches what is done in type legalization for these operations. We make no effort to force the upper bits to zero. Differential Revision: https://reviews.llvm.org/D74111	2020-02-07 22:25:56 -08:00
Vedant Kumar	0d0ef315cb	[MachineInstr] Add isCandidateForCallSiteEntry predicate Add the isCandidateForCallSiteEntry predicate to MachineInstr to determine whether a DWARF call site entry should be created for an instruction. For now, it's enough to have any call instruction that doesn't belong to a blacklisted set of opcodes. For these opcodes, a call site entry isn't meaningful. Differential Revision: https://reviews.llvm.org/D74159	2020-02-07 10:10:41 -08:00
Guillaume Chatelet	f85d3408e6	[NFC] Introduce an API for MemOp Summary: This patch introduces an API for MemOp in order to simplify and tighten the client code. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73964	2020-02-07 11:32:27 +01:00
Jeremy Morse	6531a78ac4	Revert "[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field" This reverts commit `ed29dbaafa`. I'm backing out D68945, which as the discussion for D73526 shows, doesn't seem to handle the -O0 path through the codegen backend correctly. I'll reland the patch when a fix is worked out, apologies for all the churn. The two parent commits are part of this revert too. Conflicts: llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp llvm/test/DebugInfo/X86/dbg-addr-dse.ll SelectionDAGBuilder conflict is due to a nearby change in `e39e2b4a79` that's technically unrelated. dbg-addr-dse.ll conflicted because `41206b61e3` (legitimately) changes the order of two lines. There are further modifications to dbg-value-func-arg.ll: it landed after the patch being reverted, and I've converted indirection to be represented by the isIndirect field rather than DW_OP_deref.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ece761427f	Revert "[DebugInfo][DAG] Distinguish different kinds of location indirection" This reverts commit `3137fe4d23`. I'm backing out D68945, which this patch is a follow up for. It'll be re-landed when D68945 is fixed. The changes to dbg-value-func-arg.ll occur because our handling of certain kinds of location now mixes up indirection that happens at different points in a DIExpression. While this is a regression, it's a return to the prior behaviour while a better patch is sought.	2020-02-06 14:41:40 +00:00
Jeremy Morse	ed5998d21e	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `2d3174c4df`. The overall solution for this problem is reverting D68945, which wasn't handling the -O0 path through the codegen backend correctly. See: discussion in D73526.	2020-02-06 14:41:39 +00:00
Simon Pilgrim	4592bb7195	visitINSERT_VECTOR_ELT - pull out repeated dyn_cast. NFCI. This always gets called at least once.	2020-02-05 13:30:54 +00:00
Thomas Lively	649aba93a2	Revert "[WebAssembly][InstrEmitter] Foundation for multivalue call lowering" Summary: This reverts commit `3ef169e586`. The purpose of this commit was to allow stack machines to perform instruction selection for instructions with variadic defs. However, MachineInstrs fundamentally cannot support variadic defs right now, so this change does not turn out to be useful. Depends on D73927. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73928	2020-02-04 20:04:59 -08:00
Reid Kleckner	2d89e0a098	[SEH] Remove CATCHPAD SDNode and X86::EH_RESTORE MachineInstr The CATCHPAD node mostly existed to be selected into the EH_RESTORE instruction, which sets the frame back up when 32-bit Windows exceptions return to the parent function. However, creating this MachineInstr early increases the risk that other passes will come along and insert instructions that use the stack before ESP and EBP are restored. That happened in PR44697. Instead of representing these in the instruction stream early, delay it until PEI. Mark the blocks where this needs to happen as EHPads, but not funclet entry blocks. Passes after PEI have to be careful not to hoist instructions that can use stack across frame setup instructions, so this should be relatively reliable. Fixes PR44697 Reviewed By: hans Differential Revision: https://reviews.llvm.org/D73752	2020-02-04 15:13:12 -08:00
Matt Arsenault	a3c814d234	Separately track input and output denormal mode AMDGPU and x86 at least both have separate controls for whether denormal results are flushed on output, and for whether denormals are implicitly treated as 0 as an input. The current DAGCombiner use only really cares about the input treatment of denormals.	2020-02-04 12:59:21 -05:00
Simon Pilgrim	3dd688a9ee	[DAG] OptLevelChanger - fix uninitialized variable analyzer warning (PR44471) Ensure that OptLevelChanger::SavedFastISel is initialized in the constructor. This should be NFC - as the equivalent 'same opt level' early-out is used in the destructor as well, so SavedFastISel is only actually referenced in the general case. Differential Revision: https://reviews.llvm.org/D73875	2020-02-04 10:54:33 +00:00
David Green	362d00e051	[ARM][VecReduce] Force expand vector_reduce_fmin Under MVE, we do not have any lowering for fminimum, which a vector_reduce_fmin without NoNan will be expanded into. As with the other recent patches, force this to expand in the pre-isel pass. Note that Neon lowering would be OK because the scalar fminimum uses the vector VMIN instruction, but is probably better to just rely on the scalar operations, which is what is done here. Also fixes what appears to be the reversal of INF vs -INF in the vector_reduce_fmin widening code.	2020-02-04 09:36:59 +00:00
Guillaume Chatelet	b8144c0536	[NFC] Encapsulate MemOp logic Summary: This patch simply introduces functions instead of directly accessing the fields. This helps introducing additional check logic. A second patch will add simplifying functions. Reviewers: courbet Subscribers: arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73945	2020-02-04 10:36:26 +01:00
Simon Pilgrim	61621f826a	[TargetLowering] SimplifyDemandedBits - add basic KnownBits ZEXTLoad handling We have to be careful in SimplifyDemandedBits with loads in case we attempt to combine back to a constant (which then gets turned into a constant pool load again), but we can at least set the upper KnownBits for a ZEXTLoad to zero.	2020-02-03 16:50:04 +00:00
Guillaume Chatelet	333f2ad8b8	[Alignment][NFC] Use Align for getMemcpy/Memmove/Memset Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, dschuff, jyknight, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73885	2020-02-03 17:13:19 +01:00
Guillaume Chatelet	fc19465965	[Alignment][NFC] Use Align for code creating MemOp Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73874	2020-02-03 14:10:30 +01:00
Guillaume Chatelet	75d9994a51	Fix broken invariant Summary: A Copy with a source that is zeros is the same as a Set of zeros. This fixes the invariant that SrcAlign should always be non-null. Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73791	2020-02-03 11:01:05 +01:00
Craig Topper	943b5561d6	[LegalizeTypes][X86] Add a new strategy for type legalizing f16 type that softens it to i16, but promotes to f32 around arithmetic ops. This is based on this llvm-dev thread http://lists.llvm.org/pipermail/llvm-dev/2019-December/137521.html The current strategy for f16 is to promote type to float every except where the specific width is required like loads, stores, and bitcasts. This results in rounding occurring in odd places instead of immediately after arithmetic operations. This interacts in weird ways with the __fp16 type in clang which is a storage only type where arithmetic is always promoted to float. InstCombine can remove some fpext/fptruncs around such arithmetic and turn it into arithmetic on half. This wouldn't be so bad if SelectionDAG was able to put those fpext/fpround back in when it promotes. It is also not obvious how to handle to make the existing strategy work with STRICT fp. We need to use STRICT versions of the conversions which require chain operands. But if the conversions are created for a bitcast, there is no place to get an appropriate chain from. This patch implements a different strategy where conversions are emitted directly around arithmetic operations. And otherwise its passed around as an i16 including in arguments and return values. This can result in more conversions between arithmetic operations, but is closer to matching the IR the frontend generates for __fp16. And it will allow us to use the chain from constrained arithmetic nodes to link the STRICT_FP_TO_FP16/STRICT_FP16_TO_FP that will need to be added. I've set it up so that each target can opt into the new behavior. Converting all the targets myself was more than I was able to handle. Differential Revision: https://reviews.llvm.org/D73749	2020-02-01 11:21:04 -08:00
Matt Arsenault	792d9b5719	DAG: Check if a value is divergent before requiresUniformRegister This avoids a potentially expensive scan if we already know it doesn't matter.	2020-01-31 15:27:18 -08:00
Simon Pilgrim	8fbc7fd567	[DAG] SimplifyMultipleUseDemandedBits - peek through unused ISD::INSERT_SUBVECTOR subvectors If we don't demand any elements of the inserted subvector then just skip it.	2020-01-31 18:57:22 +00:00
Simon Pilgrim	5702dadf6f	[DAG] Enable ISD::INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::INSERT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-31 18:02:34 +00:00
Guillaume Chatelet	3c89b75f23	[NFC] Introduce a type to model memory operation Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785	2020-01-31 17:29:01 +01:00
Leonard Chan	2d3174c4df	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. This is an attempt to reland with a few fixes for buildbot since I haven't merged from master in a bit. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 17:09:42 -08:00
Leonard Chan	3b23453b6c	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `fff6a1b0f1`. This was breaking a bunch of buildbots.	2020-01-30 16:18:41 -08:00
Leonard Chan	fff6a1b0f1	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 15:58:37 -08:00
Simon Pilgrim	57b0d33224	[DAGCombiner] ISD::AND/OR/XOR - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:53 +00:00
Simon Pilgrim	a967aa2706	[DAGCombiner] ISD::SDIV/UDIV/SREM/UREM - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:52 +00:00
Simon Pilgrim	f7245ef897	[DAGCombiner] ISD::SHL/SRA/SRL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 18:49:42 +00:00
Simon Pilgrim	25b8e96388	[DAGCombiner] ISD::MUL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 17:26:22 +00:00
Simon Pilgrim	4b04e11735	[DAGCombiner] Sub/SUBSAT - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 16:57:13 +00:00
Simon Pilgrim	48bd6a0986	[DAGCombiner] visitIMINMAX - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us instead of the ConstantSDNode variant where we have to do it ourselves.	2020-01-29 16:57:13 +00:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Wang, Pengfei	3239b5034e	[FPEnv] Add pragma FP_CONTRACT support under strict FP. Summary: Support pragma FP_CONTRACT under strict FP. Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3 Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72820	2020-01-28 20:43:43 +08:00
Guillaume Chatelet	879c825cb8	[instrinsics] Add @llvm.memcpy.inline instrinsics Summary: This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++. A follow up CL will add the intrinsics in Clang. Reviewers: courbet, theraven, t.p.northover, jdoerfert, tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71710	2020-01-28 09:42:01 +01:00
Simon Pilgrim	e7e043724e	[DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow. Differential Revision: This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-27 21:17:47 +00:00
Wang, Pengfei	17b8f96d65	[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI. Some functions like fmuladd don't really have a node, we should divide the declaration form those have node to avoid introducing fake nodes. Differential Revision: https://reviews.llvm.org/D72871	2020-01-27 10:38:05 +08:00
Simon Pilgrim	4a5f9d9faf	[TargetLowering] Respect recursive depth in SimplifyDemandedBits call to ComputeNumSignBits	2020-01-26 10:01:56 +00:00
Simon Pilgrim	3daa71ee00	[SelectionDAG] ComputeNumSignBits - add DemandedElts support for MIN/MAX ops	2020-01-25 20:21:14 +00:00
Simon Pilgrim	3f8916b2e8	[SelectionDAG] ComputeNumSignBits - add support for rotate non-uniform vector amounts	2020-01-25 19:15:05 +00:00
Simon Pilgrim	e3c26a9d1b	[SelectionDAG] ComputeNumSignBits - add support for rotate uniform vector amounts	2020-01-25 18:55:47 +00:00
Simon Pilgrim	c8de7c8f50	[TargetLowering] SimplifyDemandedBits - Remove ashr if all our demandedbits already match the sign bit Differential Revision: https://reviews.llvm.org/D73412	2020-01-25 17:36:46 +00:00
@justice_adams (Justice Adams)	daee63f974	[SelectionDag] Updated FoldConstantArithmetic method signature in preparation for merge with FoldConstantVectorArithmetic Updated FoldConstantArithmetic method signature to match that of FoldConstantVectorArithmetic in preparation for merging the two functions together https://bugs.llvm.org/show_bug.cgi?id=36544 This is the first step in combining the various FoldConstantVectorArithmetic and FoldConstantVectorArithmetic functions into one FoldConstantArithmetic function. Differential Revision: https://reviews.llvm.org/D72870	2020-01-24 18:00:58 -05:00
Craig Topper	d3bf06bc81	[DAGCombiner] Add combine for (not (strict_fsetcc)) to create a strict_fsetcc with the opposite condition. Unlike the existing code that I modified here, I only handle the case where the strict_fsetcc has a single use. Not sure exactly how to handle multiples uses. Testing this on X86 is hard because we already have a other combines that get rid of lowered version of the integer setcc that this xor will eventually become. So this combine really just saves a bunch of extra nodes being created. Not sure about other targets. Differential Revision: https://reviews.llvm.org/D71816	2020-01-24 14:15:36 -08:00
Stanislav Mekhanoshin	7a94d4f4ee	Allow combining of extract_subvector to extract element Differential Revision: https://reviews.llvm.org/D73132	2020-01-24 10:50:26 -08:00
Guillaume Chatelet	805c157e8a	[Alignment][NFC] Deprecate Align::None() Summary: This is a follow up on https://reviews.llvm.org/D71473#inline-647262. There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case. Reviewers: xbolva00, courbet, bollu Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73099	2020-01-24 12:53:58 +01:00
Simon Pilgrim	0b45c2264a	[SelectionDAG] rot(x, y) --> x iff ComputeNumSignBits(x) == BitWidth(x) Rotating an 0/-1 value by any amount will always result in the same 0/-1 value	2020-01-24 10:35:57 +00:00
Simon Pilgrim	e25eee4db7	[SelectionDAG] ComputeNumSignBits - add ISD::ADD demanded elts support	2020-01-23 17:48:07 +00:00
Simon Pilgrim	0fec8acdd8	[SelectionDAG] ComputeNumSignBits - add ISD::ADD vector support Add missing handling for (ADD (AND X, 1), -1) uniform vectors	2020-01-23 16:42:12 +00:00
Simon Pilgrim	fc5bbbf328	[SelectionDAG] ComputeNumSignBits - add ISD::SUB demanded elts support	2020-01-23 16:20:48 +00:00
Simon Pilgrim	48d4ba8fb2	[SelectionDAG] Compute Known + Sign Bits - merge INSERT_VECTOR_ELT known/unknown index paths Match the approach in SimplifyDemandedBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits/ComputeNumSignBits call.	2020-01-23 13:31:37 +00:00
Simon Pilgrim	03cae086f4	[SelectionDAG] ComputeKnownBits - merge EXTRACT_VECTOR_ELT known/unknown index paths Match the approach in SimplifyDemandedBits/ComputeNumSignBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits call.	2020-01-23 11:29:16 +00:00
Simon Pilgrim	98da49d979	[SelectionDAG] Compute Known + Sign Bits - merge INSERT_SUBVECTOR known/unknown index paths Match the approach in SimplifyDemandedBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits/ComputeNumSignBits call, additionally we only ever need original demanded elts of the base vector even if the index is unknown.	2020-01-23 11:29:15 +00:00
Stanislav Mekhanoshin	2d0fcf786c	Precommit NFC part of DAGCombiner change. NFC. This is NFC part of DAGCombiner::visitEXTRACT_SUBVECTOR() change in the D73132.	2020-01-22 09:01:22 -08:00
Sander de Smalen	4cf16efe49	[AArch64][SVE] Add patterns for unpredicated load/store to frame-indices. This patch also fixes up a number of cases in DAGCombine and SelectionDAGBuilder where the size of a scalable vector is used in a fixed-width context (thus triggering an assertion failure). Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D71215	2020-01-22 14:32:27 +00:00
Simon Pilgrim	80656fd7ae	[SelectionDAG] getShiftAmountConstant - assert the type is an integer.	2020-01-22 13:52:44 +00:00
Sander de Smalen	67d4c9924c	Add support for (expressing) vscale. In LLVM IR, vscale can be represented with an intrinsic. For some targets, this is equivalent to the constexpr: getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1 This can be used to propagate the value in CodeGenPrepare. In ISel we add a node that can be legalized to one or more instructions to materialize the runtime vector length. This patch also adds SVE CodeGen support for VSCALE, which maps this node to RDVL instructions (for scaled multiples of 16bytes) or CNT[HSD] instructions (scaled multiples of 2, 4, or 8 bytes, respectively). Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner Reviewed by: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68203	2020-01-22 10:09:27 +00:00
Thomas Lively	3ef169e586	[WebAssembly][InstrEmitter] Foundation for multivalue call lowering Summary: WebAssembly is unique among upstream targets in that it does not at any point use physical registers to store values. Instead, it uses virtual registers to model positions in its value stack. This means that some target-independent lowering activities that would use physical registers need to use virtual registers instead for WebAssembly and similar downstream targets. This CL generalizes the existing `usesPhysRegsForPEI` lowering hook to `usesPhysRegsForValues` in preparation for using it in more places. One such place is in InstrEmitter for instructions that have variadic defs. On register machines, it only makes sense for these defs to be physical registers, but for WebAssembly they must be virtual registers like any other values. This CL changes InstrEmitter to check the new target lowering hook to determine whether variadic defs should be physical or virtual registers. These changes are necessary to support a generalized CALL instruction for WebAssembly that is capable of returning an arbitrary number of arguments. Fully implementing that instruction will require additional changes that are described in comments here but left for a follow up commit. Reviewers: aheejin, dschuff, qcolombet Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71484	2020-01-21 11:13:46 -08:00
Simon Pilgrim	f04284cf1d	[TargetLowering] SimplifyDemandedBits ISD::SRA multi-use handling Call SimplifyMultipleUseDemandedBits to peek through extended source args with multiple uses	2020-01-21 15:12:07 +00:00
Simon Pilgrim	47f99d2ca8	[SelectionDAG] GetDemandedBits - remove ANY_EXTEND handling Rely on SimplifyMultipleUseDemandedBits fallback instead.	2020-01-21 14:39:00 +00:00
Simon Pilgrim	651fa669a2	[TargetLowering] SimplifyDemandedBits ANY_EXTEND/ANY_EXTEND_VECTOR_INREG multi-use handling Call SimplifyMultipleUseDemandedBits to peek through extended source args with multiple uses	2020-01-21 14:07:19 +00:00
Simon Pilgrim	5f5f478564	[DAG] Fold extract_vector_elt (scalar_to_vector), K to undef (K != 0) This was unconditionally folding this to the source operand, even if the access was out of bounds. Use undef instead of the extract is not the first element. This helps with some cases where 3-vectors are legalized and avoids processing the 4th component. Original Patch by: arsenm (Matt Arsenault) Differential Revision: https://reviews.llvm.org/D51589	2020-01-21 10:58:30 +00:00
Simon Pilgrim	8d2e6bdbe1	[TargetLowering] SimplifyDemandedBits - Pull out InDemandedMask variable to ISD::SHL. NFCI. Matches ISD::SRA + ISD::SRL variants.	2020-01-21 10:40:18 +00:00
Simon Pilgrim	9c06c10fba	[SelectionDAG] GetDemandedBits - fallback to SimplifyMultipleUseDemandedBits by default. First step towards removing SelectionDAG::GetDemandedBits entirely since it so similar to SimplifyMultipleUseDemandedBits anyhow.	2020-01-20 16:51:52 +00:00
Michael Liao	6d0d86a64d	[DAG] Add helper for creating constant vector index with correct type. NFC.	2020-01-18 01:23:36 -05:00
Simon Pilgrim	1dc2f25790	[SelectionDAG] ComputeKnownBits - assert we're computing the 0'th (difference) result for the SUB/SUBC cases Matches what we already do for the ADD/ADDC/ADDE case.	2020-01-17 13:53:57 +00:00
Simon Pilgrim	f611158350	[SelectionDAG] Better ISD::ANY_EXTEND/ISD::ANY_EXTEND_VECTOR_INREG ComputeKnownBits support Add DemandedElts handling to ISD::ANY_EXTEND and add missing ISD::ANY_EXTEND_VECTOR_INREG handling. Despite the lack of test changes this code IS being used - its just that the ANY_EXTEND ops are legalized later on (typically to ZERO_EXTEND equivalents) so we typically manage to combine later on.	2020-01-17 11:37:58 +00:00
Davide Italiano	30a8865142	[FastISel] Lower `llvm.dbg.value(undef, ...` correctly. Summary: Instead of just dropping them. <rdar://problem/58657146> Reviewers: aprantl, vsk, ab, paquette, echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72877	2020-01-16 16:22:20 -08:00
Craig Topper	61a89e17df	[LegalizeDAG][Mips] Add an assert to protect a uint_to_fp implementation from double rounding. Add a i32->f32 uint_to_fp implementation that avoids this code. The algorithm here only works if the sint_to_fp doesn't do any rounding. Otherwise it can round before the offset fixup is applied. Add an assert to protect this. To avoid breaking the one test in tree that tested this code with a set of types that fail the assert, I've enabled i32->f32 to use the i64->f32 algorithm. This only occurs when f64 isn't a legal type. If f64 is legal then we do i32->f64->f32 instead. Differential Revision: https://reviews.llvm.org/D72794	2020-01-16 11:08:16 -08:00
Matt Arsenault	d0943537e1	GlobalISel: Apply target MMO flags to atomics Unify MMO flag handling with SelectionDAG like with loads and stores.	2020-01-16 13:49:43 -05:00
Matt Arsenault	0d0fce42b0	GlobalISel: Preserve load/store metadata in IRTranslator This was dropping the invariant metadata on dead argument loads, so they weren't deleted. Atomics still need to be fixed the same way. Also, apparently store was never preserving dereferencable which should also be fixed.	2020-01-16 13:49:43 -05:00
Jeremy Morse	c969335abd	Revert "[PHIEliminate] Move dbg values after phi and label" Testing compiler-rt, a new assertion failure occurs when building the GwpAsanTestObjects object. I'm uploading a reproducer to D70597. This reverts commit `75188b01e9`.	2020-01-16 14:01:27 +00:00
Chris Ye	75188b01e9	[PHIEliminate] Move dbg values after phi and label If there are DBG_VALUEs between phi and label (after phi and before label), DBG_VALUE will block PHI lowering after the LABEL. Moving all DBG_VALUEs after Labels in the function ScheduleDAGSDNodes::EmitSchedule to avoid impacting PHI lowering. before: PHI DBG_VALUE LABEL after: (move DBG_VALUE after label) PHI LABEL DBG_VALUE then: (phi lowering after label) LABEL COPY DBG_VALUE Fixes the issue: https://bugs.llvm.org/show_bug.cgi?id=43859 Differential Revision: https://reviews.llvm.org/D70597	2020-01-16 11:58:09 +00:00
Craig Topper	5cf1b01a01	[LegalizeDAG][TargetLowering] Move vXi64/i64->vXf32/f32 uint_to_fp legalizing code from TargetLowering::expandUINT_TO_FP back to LegalizeDAG. This was moved in October 2018, but we don't appear to be using this for vectors on any in tree target. Moving it back simplifies D72794 so we can share the code for i32->f32.	2020-01-15 22:04:50 -08:00
Michael Liao	8d07f8d98c	[DAGCombine] Replace `getIntPtrConstant()` with `getVectorIdxTy()`. - Prefer `getVectorIdxTy()` as the index operand type for `EXTRACT_SUBVECTOR` as targets expect different types by overloading `getVectorIdxTy()`.	2020-01-14 17:03:05 -05:00
Craig Topper	9ee90ea55c	[LegalizeTypes] Remove untested code from ExpandIntOp_UINT_TO_FP This code is untested in tree because the "APFloat::semanticsPrecision(sem) >= SrcVT.getSizeInBits() - 1" check is false for most combinations for int and fp types except maybe i32 and f64. For that you would need i32 to be an illegal type, but f64 to be legal and have custom handling for legalizing the split sint_to_fp. The precision check itself was added in 2010 to fix a double rounding issue in the algorithm that would occur if the sint_to_fp was not able to do the conversion without rounding. Differential Revision: https://reviews.llvm.org/D72728	2020-01-14 13:15:29 -08:00
Ulrich Weigand	81ee484484	[FPEnv] Fix chain handling regression after `04a8696` Code in getRoot made the assumption that every node in PendingLoads must always itself have a dependency on the current DAG root node. After the changes in `04a8696`, it turns out that this assumption no longer holds true, causing wrong codegen in some cases (e.g. stores after constrained FP intrinsics might get deleted). To fix this, we now need to make sure that the TokenFactor created by getRoot always includes the previous root, if there is no implicit dependency already present. The original getControlRoot code already has exactly this check, so this patch simply reuses that code now for getRoot as well. This fixes the regression. NFC if no constrained FP intrinsic is present.	2020-01-14 14:10:57 +01:00
Simon Pilgrim	c05a11108b	[SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SHL handling. As mentioned by @nikic on rGef5debac4302, we can merge the guaranteed bottom zero bits from the shifted value, and then, if a min shift amount is known, zero out the bottom bits as well.	2020-01-14 11:51:41 +00:00
Simon Pilgrim	a43b0065c5	[SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SRL handling. As mentioned by @nikic on rGef5debac4302 (although that was just about SHL), we can merge the guaranteed top zero bits from the shifted value, and then, if a min shift amount is known, zero out the top bits as well. SHL tests / handling will be added in a follow up patch.	2020-01-14 11:41:47 +00:00
Craig Topper	26c7a4ed10	[LegalizeIntegerTypes][X86] Add support for expanding input of STRICT_SINT_TO_FP/STRICT_UINT_TO_FP into a libcall. Needed to support i128->fp128 on 32-bit X86. Add full set of strict sint_to_fp/uint_to_fp conversion tests for fp128.	2020-01-13 13:11:12 -08:00
Daniel Sanders	a0f4600f4f	Rework `be15dfa88f` such that it works with GlobalISel which doesn't use EVT Summary: `be15dfa88f` broke GlobalISel's usage of getSetCCInverse() which currently appears to be limited to our out-of-tree backend. GlobalISel doesn't use EVT's and isn't able to derive them from the information it has as it doesn't distinguish between integer and floating point types (that distinction is made by operations rather than values). Bring back the bool version of getSetCCInverse() in a way that doesn't break the intent of `be15dfa88f` but also allows GlobalISel to continue using it. Reviewers: spatel, bogner, arichardson Reviewed By: arichardson Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72309	2020-01-13 12:19:37 -08:00
Simon Pilgrim	c6fcd5d115	[SelectionDAG] ComputeNumSignBits add getValidMaximumShiftAmountConstant() for ISD::SHL support Allows us to handle non-uniform SHL shifts to determine the minimum number of sign bits remaining (based off the maximum shift amount value)	2020-01-13 18:02:37 +00:00
Andrew Wei	05366870ee	[LegalizeTypes] Add SoftenFloatResult support for STRICT_SINT_TO_FP/STRICT_UINT_TO_FP Some target like arm/riscv with soft-float will have compiling crash when using -fno-unsafe-math-optimization option. This patch will add the missing strict FP support to SoftenFloatRes_XINT_TO_FP. Differential Revision: https://reviews.llvm.org/D72277	2020-01-14 01:01:56 +08:00
Simon Pilgrim	38e2c01221	[SelectionDAG] ComputeNumSignBits add getValidMinimumShiftAmountConstant() ISD::SRA support Allows us to handle more non-uniform SRA sign bits cases	2020-01-13 16:55:02 +00:00
Simon Pilgrim	376bc39c82	[SelectionDAG] ComputeNumSignBits - Use getValidShiftAmountConstant for shift opcodes getValidShiftAmountConstant handles out of bounds shift amounts for us, allowing us to remove the local handling.	2020-01-13 14:12:12 +00:00
Simon Pilgrim	6d1a8fd447	[SelectionDAG] ComputeKnownBits - Add DemandedElts support to getValidShiftAmountConstant/getValidMinimumShiftAmountConstant()	2020-01-13 14:12:12 +00:00
Ulrich Weigand	04a86966fb	[FPEnv] Fix chain handling for fpexcept.strict nodes We need to ensure that fpexcept.strict nodes are not optimized away even if the result is unused. To do that, we need to chain them into the block's terminator nodes, like already done for PendingExcepts. This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without and with fpexcept.strict markers. This allows not only to solve the above problem, but also to relax chains a bit further by no longer flushing all FP nodes before a store or other memory access. (They are still flushed before nodes with other side effects.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72341	2020-01-13 14:38:49 +01:00
Simon Pilgrim	ef5debac43	[SelectionDAG] ComputeKnownBits add getValidMinimumShiftAmountConstant() ISD::SHL support As mentioned on D72573	2020-01-13 12:02:13 +00:00
Simon Pilgrim	8f49204f26	[SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in LSHR/SHL (PR44526) As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value. This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits. Differential Revision: https://reviews.llvm.org/D72573	2020-01-13 11:08:12 +00:00
Craig Topper	efb674ac2f	[LegalizeVectorOps] Parallelize the lo/hi part of STRICT_UINT_TO_FLOAT legalization. The lo and hi computation are independent. Give them the same input chain and TokenFactor the results together.	2020-01-11 17:50:30 -08:00
Craig Topper	ed679804d5	[TargetLowering][X86] Connect the chain from STRICT_FSETCC in TargetLowering::expandFP_TO_UINT and X86TargetLowering::FP_TO_INTHelper.	2020-01-11 17:50:20 -08:00
Craig Topper	ddfcd82bdc	[LegalizeVectorOps] Expand vector MERGE_VALUES immediately. Custom legalization can produce MERGE_VALUES to return multiple results. We can expand them immediately instead of leaving them around for DAG combine to clean up.	2020-01-11 17:50:20 -08:00
Craig Topper	5a9954c02a	[LegalizeVectorOps] Remove some of the simpler Expand methods. Pass Results vector to a couple. NFCI Some of the simplest handlers just call TLI and if that fails, they fall back to unrolling. For those just inline the TLI call and share the unrolling call with the default case of Expand. For ExpandFSUB and ExpandBITREVERSE so that its obvious they don't return results sometimes and want to defer to LegalizeDAG.	2020-01-11 12:14:19 -08:00
Craig Topper	9fe6f36c1a	[LegalizeVectorOps] Only pass SDNode* instead SDValue to all of the Expand* and Promote* methods. All the Expand* and Promote* function assume they are being called with result 0 anyway. Just hardcode result 0 into them.	2020-01-11 11:41:23 -08:00
Craig Topper	bb2553175a	[TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RunttimeLibcalls.def and all associated usages Summary: This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with a ogt/olt/oge/ole checks. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare. This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code. Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn Reviewed By: efriedma Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72536	2020-01-10 19:30:08 -08:00
Craig Topper	71cee21861	[TargetLowering] Use SelectionDAG::getSetCC and remove a repeated call to getSetCCResultType in softenSetCCOperands. NFCI	2020-01-10 13:24:00 -08:00
Craig Topper	b590e0fd81	[TargetLowering][ARM][X86] Change softenSetCCOperands handling of ONE to avoid spurious exceptions for QNANs with strict FP quiet compares ONE is currently softened to OGT \| OLT. But the libcalls for OGT and OLT libcalls will trigger an exception for QNAN. At least for X86 with libgcc. UEQ on the other hand uses UO \| OEQ. The UO and OEQ libcalls will not trigger an exception for QNAN. This patch changes ONE to use the inverse of the UEQ lowering. So we now produce O & UNE. Technically the existing behavior was correct for a signalling ONE, but since I don't know how to generate one of those from clang that seemed like something we can deal with later as we would need to fix other predicates as well. Also removing spurious exceptions seemed better than missing an exception. There are also problems with quiet OGT/OLT/OLE/OGE, but those are harder to fix. Differential Revision: https://reviews.llvm.org/D72477	2020-01-10 11:00:17 -08:00
Craig Topper	f678fc7660	[LegalizeVectorOps] Improve handling of multi-result operations. This system wasn't very well designed for multi-result nodes. As a consequence they weren't consistently registered in the LegalizedNodes map leading to nodes being revisited for different results. I've removed the "Result" variable from the main LegalizeOp method and used a SDNode* instead. The result number from the incoming Op SDValue is only used for deciding which result to return to the caller. When LegalizeOp is called it should always register a legalized result for all of its results. Future calls for any other result should be pulled for the LegalizedNodes map. Legal nodes will now register all of their results in the map instead of just the one we were called for. The Expand and Promote handling to use a vector of results similar to LegalizeDAG. Each of the new results is then re-legalized and logged in the LegalizedNodes map for all of the Results for the node being legalized. None of the handles register their own results now. And none call ReplaceAllUsesOfValueWith now. Custom handling now always passes result number 0 to LowerOperation. This matches what LegalizeDAG does. Since the introduction of STRICT nodes, I've encountered several issues with X86's custom handling being called with an SDValue pointing at the chain and our custom handlers using that to get a VT instead of result 0. This should prevent us from having any more of those issues. On return we will update the LegalizedNodes map for all results so we shouldn't call the custom handler again for each result number. I want to push SDNode* further into the Expand and Promote handlers, but I've left that for a follow to keep this patch size down. I've created a dummy SDValue(Node, 0) to keep the handlers working. Differential Revision: https://reviews.llvm.org/D72224	2020-01-10 10:14:58 -08:00
Ulrich Weigand	f0fd11df7d	[FPEnv] Invert sense of MIFlag::FPExcept flag In D71841 we inverted the sense of the SDNode-level flag to ensure all nodes default to potentially raising FP exceptions unless otherwise specified -- i.e. if we forget to propagate the flag somewhere, the effect is now only lost performance, not incorrect code. However, the related flag at the MI level still defaults to nodes not raising FP exceptions unless otherwise specified. To be fully on the (conservatively) safe side, we should invert that flag as well. This patch does so by replacing MIFlag::FPExcept with MIFlag::NoFPExcept. (Note that this does also introduce an incompatible change in the MIR format.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72466	2020-01-10 15:34:50 +01:00
Peng Guo	cfd8498401	[MIR] Fix cyclic dependency of MIR formatter Summary: Move MIR formatter pointer from TargetMachine to TargetInstrInfo to avoid cyclic dependency between target & codegen. Reviewers: dsanders, bkramer, arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72485	2020-01-10 11:18:12 +01:00
Matt Arsenault	f33f3d98e9	DAG: Don't use unchecked dyn_cast	2020-01-09 17:37:52 -05:00
Matt Arsenault	255cc5a760	CodeGen: Use LLT instead of EVT in getRegisterByName Only PPC seems to be using it, and only checks some simple cases and doesn't distinguish between FP. Just switch to using LLT to simplify use from GlobalISel.	2020-01-09 17:37:52 -05:00
Craig Topper	b705fe5686	[TargetLowering][X86] TeachSimplifyDemandedBits to handle cases where only the sign bit is demanded from a SETCC and can be passed through If we're doing a compare that only tests the sign bit and only the sign bit is demanded, we can just bypass the node. This removes one of the blend dependencies in our v2i64->v2f32 uint_to_fp codegen on pre-sse4.2 targets. Differential Revision: https://reviews.llvm.org/D72356	2020-01-09 10:21:25 -08:00
Sanjay Patel	cb5612e2df	[DAGCombiner] reduce extract subvector of concat If we are extracting a chunk of a vector that's a fraction of an operand of the concatenated vector operand, we can extract directly from one of those original operands. This is another suggestion from PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024#c2 But I'm not sure yet if it will make any difference on those patterns. It seems to help a few existing AVX512 tests though. Differential Revision: https://reviews.llvm.org/D72361	2020-01-09 09:38:12 -05:00
QingShan Zhang	d48ac7d54d	[DAGCombine] Fold the (fma -x, y, -z) to -(fma x, y, z) This is a positive combination as long as the NEG is NOT free, as we are reducing the number of NEG from two to one. Differential Revision: https://reviews.llvm.org/D72312	2020-01-09 04:33:46 +00:00
Daniel Sanders	de3d0ee023	Revert "Revert "[MIR] Target specific MIR formating and parsing"" There was an unguarded dereference of MF in a function that permitted nullptr. Fixed This reverts commit `71d64f72f9`.	2020-01-08 20:03:29 -08:00
Nico Weber	71d64f72f9	Revert "[MIR] Target specific MIR formating and parsing" This reverts commit `3ef05d85be`. It broke check-llvm on many bots, see comments on D69836.	2020-01-08 22:50:49 -05:00
Peng Guo	3ef05d85be	[MIR] Target specific MIR formating and parsing Summary: Added MIRFormatter for target specific MIR formating and parsing with immediate and custom pseudo source values. Target machine can subclass MIRFormatter and implement custom logic for printing and parsing immediate and custom pseudo source values for better readability. * Target specific immediate mnemonic need to start with "." follows by identifier string. When MIR parser sees immediate it will call target specific parsing function. * Custom pseudo source value need to start with custom follows by double-quoted string. MIR parser will pass the quoted string to target specific PSV parsing function. * MIRFormatter have 2 helper functions to facilitate LLVM value printing and parsing for custom PSV if they refers LLVM values. Patch by Peng Guo Reviewers: dsanders, arsenm Reviewed By: dsanders Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69836	2020-01-08 18:48:02 -08:00
Daniel Sanders	5ab6fa7b70	Revert "[MIR] Target specific MIR formating and parsing" Forgot to credit Peng in the commit message. This reverts commit `be841f89d0`.	2020-01-08 18:48:02 -08:00
Peng Guo	be841f89d0	[MIR] Target specific MIR formating and parsing Summary: Added MIRFormatter for target specific MIR formating and parsing with immediate and custom pseudo source values. Target machine can subclass MIRFormatter and implement custom logic for printing and parsing immediate and custom pseudo source values for better readability. * Target specific immediate mnemonic need to start with "." follows by identifier string. When MIR parser sees immediate it will call target specific parsing function. * Custom pseudo source value need to start with custom follows by double-quoted string. MIR parser will pass the quoted string to target specific PSV parsing function. * MIRFormatter have 2 helper functions to facilitate LLVM value printing and parsing for custom PSV if they refers LLVM values. Reviewers: dsanders, arsenm Reviewed By: dsanders Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69836	2020-01-08 18:34:21 -08:00
Simon Pilgrim	108279948d	[SelectionDAG] Use llvm::Optional<APInt> for FoldValue. Use llvm::Optional<APInt> instead of std::pair<APInt, bool> with the bool second being used to report success/failure of fold.	2020-01-08 16:09:24 +00:00
Sanjay Patel	780ba1f22b	[DAGCombiner] clean up extract-of-concat fold; NFC This hopes to improve readability and adds an assert. The functional change noted by the TODO comment is proposed in: D72361	2020-01-08 10:15:33 -05:00
Bevin Hansson	8e2b44f7e0	[Intrinsic] Add fixed point division intrinsics. Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division: llvm.sdiv.fix.* llvm.udiv.fix.* These intrinsics perform scaled division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang. Patch by: ebevhan Reviewers: bjope, leonardchan, efriedma, craig.topper Reviewed By: craig.topper Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70007	2020-01-08 15:17:46 +01:00
Wang, Pengfei	9a621de1ec	[X86] Adding fp128 support for strict fcmp Summary: Adding fp128 support for strict fcmp Reviewers: craig.topper, LiuChen3, andrew.w.kaylor, RKSimon, uweigand Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71897	2020-01-08 12:59:31 +08:00
Bill Wendling	e886e762dd	Revert "Allow output constraints on "asm goto"" This reverts commit `52366088a8`. I accidentally pushed this before supporting changes.	2020-01-07 13:44:08 -08:00
Bill Wendling	52366088a8	Allow output constraints on "asm goto" Summary: Remove the restrictions that preventing "asm goto" from returning non-void values. The values returned by "asm goto" are only valid on the "fallthrough" path. Reviewers: jyknight, nickdesaulniers, hfinkel Reviewed By: jyknight, nickdesaulniers Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69876	2020-01-07 13:40:26 -08:00
Sanjay Patel	58e2e92a57	[DAGCombiner] reduce shuffle of concat of same vector This is possibly a small part towards solving PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 The vectorizer is creating shuffles of concat like this: %63 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %64 = shufflevector <8 x i64> %63, <8 x i64> undef, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> That might be fixable in the vectorizers, but we're not allowed to fold that into a single shuffle in instcombine, so we should have a backend backstop to convert that into the likely simpler form: %64 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3> Differential Revision: https://reviews.llvm.org/D72300	2020-01-07 09:48:59 -05:00
Craig Topper	62f3403bfc	[LegalizeTypes] Add widening support for STRICT_FSETCC/FSETCCS This patch adds widening which really just scalarizes because we don't have a strategy for the extra elements we would need to pad with. Differential Revision: https://reviews.llvm.org/D72193	2020-01-06 13:45:55 -08:00
Simon Pilgrim	6fa6000e3e	[DAG] DAGCombiner::XformToShuffleWithZero - use APInt::extractBits helper. NFCI.	2020-01-06 13:17:02 +00:00
James Henderson	d68904f957	[NFC] Fix trivial typos in comments Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.	2020-01-06 10:50:26 +00:00
Craig Topper	19ace449a3	[TargetLowering] Use SETCC input type to call getBooleanContents instead of the setcc result type. This isn't a functonal change since we also check the bit width is the same and the input type is integer. This guarantees the input and output type are the same. But passing the input type makes the code more readable.	2020-01-05 23:15:49 -08:00
QingShan Zhang	b9780f4f80	[DAGCombine] Don't check the legality of type when combine the SIGN_EXTEND_INREG This is the DAG node for SIGN_EXTEND_INREG : t21: v4i32 = sign_extend_inreg t18, ValueType:ch:v4i16 It has two operands. The first one is the value it want to extend, and the second one is the type to specify how to extend the value. For this example, it means that, it is signed extend the t18(v4i32) from v4i16 to v4i32. That is the semantics of c code: vector int foo(vector int m) { return m << 16 >> 16; } And it could be any vector type that hardware support the operation, though the type 'v4i16' is NOT legal for the target. When we are trying to combine the srl + sra, what we did now is calling the TLI.isOperationLegal(), which will also check the legality of the type. That doesn't make sense. Differential Revision: https://reviews.llvm.org/D70230	2020-01-06 03:00:58 +00:00
Craig Topper	4e37d60f2a	[LegalizeVectorOps][X86] Enable expansion of vector fp_to_uint in LegalizeVectorOps to avoid scalarization. The code here isn't great in all caess. Particularly v4f64->v4i32 on 64-bit AVX targets. But there is some improvement in some configurations. There's definitely some issues with computeNumSignBits with X86ISD::STRICT_FCMP. As well as not being able to propagate sign bits through merge_values nodes that get created during custom legalization.	2020-01-04 19:18:54 -08:00
Craig Topper	16a67d252c	[TargetLowering] In expandFP_TO_UINT, add proper extend or truncate for the condition to feed the DstVT select. Previously, for vectors we created a vselect with a condition that didn't match what the target wanted according to getSetCCResultType. To make up for this, X86 had a special DAG combine to detect if the condition was all sign bits and then insert its own truncate or extend. By adding the extend/truncate here explicitly we can avoid that.	2020-01-04 18:15:20 -08:00
Craig Topper	285d5e6b8b	[LegalizeVectorOps] Split most of ExpandStrictFPOp into a separate UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT. ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT returned SDValue() if it wasn't able to handle and needed to unroll. Then ExpandStrictFPOp would detect his SDValue() and do the unroll. After this change, ExpandUINT_TO_FLOAT will directly call UnrollStrictFPOp and return the unrolled result.	2020-01-04 17:03:50 -08:00
Simon Pilgrim	eb0e1978df	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT (REAPPLIED) This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0. Differential Revision: https://reviews.llvm.org/D65887	2020-01-04 13:15:50 +00:00
Sanjay Patel	ca7fdd41bd	[DAGCombiner] fix miscompile in translating (X & undef) to shuffle See PR42982 for more context: https://bugs.llvm.org/show_bug.cgi?id=42982	2020-01-03 14:58:49 -05:00
Craig Topper	7cdc60c3db	[LegalizeVectorOps] Pass the post-UpdateNodeOperands version of Op to ExpandLoad/ExpandStore UpdateNodeOperands might CSE to another existing node. So we should make sure we're legalizing that node otherwise we might fail to hook up the operands properly. I've moved the result registration up to the caller to avoid having to pass both Result and Op into the functions where it might be confusing which is which. This address 2 other issues pointed out in D71861. Differential Revision: https://reviews.llvm.org/D72021	2020-01-03 11:53:08 -08:00
Reid Kleckner	9c2b72821b	Move tail call disabling code to target independent code When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in `d9699bc7bd` (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118	2020-01-03 11:27:41 -08:00
Roman Lebedev	0727e2b90c	[DAGCombiner][X86][AArch64] Generalize `A-(A&B)`->`A&(~B)` fold (PR44448) The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should/can just be 'A - (A & B)' -> 'A & (~B)' Even if we don't manage to fold `~` into B, we have likely formed `ANDN` node. Also, this way there's less similar-but-duplicate folds. Name: X - (X & Y) -> X & (~Y) %o = and i32 %X, %Y %r = sub i32 %X, %o => %n = xor i32 %Y, -1 %r = and i32 %X, %n https://rise4fun.com/Alive/kOUl See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:47 +03:00
Roman Lebedev	86403c0ff8	[DAGCombiner] `~(add X, -1)` -> `neg X` fold The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should just be 'A - (A & B)' -> 'A & (~B)', but we currently fail to sink that '~' into `(B - 1)`. Name: ~(X - 1) -> (0 - X) %o = add i32 %X, -1 %r = xor i32 %o, -1 => %r = sub i32 0, %X https://rise4fun.com/Alive/rjU	2020-01-03 17:55:46 +03:00
Roman Lebedev	3d492d7503	[DAGCombine][X86][Thumb2/LowOverheadLoops] `A - (A & C)` -> `A & (~C)` fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. Name: PR44448 ptr - (ptr & C) -> ptr & (~C) %bias = and i32 %ptr, C %r = sub i32 %ptr, %bias => %r = and i32 %ptr, ~C See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:45 +03:00
Roman Lebedev	1711be78f7	[NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' fold	2020-01-03 17:55:42 +03:00
Roman Lebedev	8dab0a4a7d	[DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. https://rise4fun.com/Alive/ZVdp Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment) %mask = add i64 %alignment, -1 %bias = and i64 %ptr, %mask %r = sub i64 %ptr, %bias => %highbitmask = sub i64 0, %alignment %r = and i64 %ptr, %highbitmask See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 13:58:36 +03:00
Matt Arsenault	0d9f919b73	DAG: Use TargetConstant for FENCE operands	2020-01-02 17:16:10 -05:00
Fangrui Song	87fb204e8f	[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm	2020-01-02 09:44:23 -08:00
Ulrich Weigand	63336795f0	[FPEnv] Default NoFPExcept SDNodeFlag to false The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization. This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again. In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of orignal nodes none of which can raise FP exceptions, it will preserve this property on all results nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception. To check whether or not an SD node should be considered as raising an FP exception, the following logic applies: - For machine nodes, check the mayRaiseFPException property of the underlying MI instruction - For regular nodes, check isStrictFPOpcode - For target nodes, check a newly introduced isTargetStrictFPOpcode The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes -- this could be fixed by adding one more range.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71841	2020-01-02 16:59:45 +01:00
Mark de Wever	8dc7b982b4	[NFC] Fixes -Wrange-loop-analysis warnings This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857	2020-01-01 20:01:37 +01:00
Matt Arsenault	4d7201e7b9	DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nsz This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.	2019-12-31 22:49:51 -05:00
Craig Topper	4ae3120ed8	[LegalizeVectorOps][AArch64] Stop asking for v4f16 fp_round and fp_extend to be promoted. These operations are needed as building blocks for promoting so they can't be promoted themselves. This appeared to work because the fp_extend query type for operation actions is the result type, not the input type so it never triggered in the legalizer. For fp_round, the vector op legalizer just ended up creating a nop fp_extend that was elided by getNode, followed by a nop fp_round that was also elided by getNode. This was followed by a final fp_round from v4f32 back to vf416 which was CSEd to the original node. Then legalize vector ops just believed that node legalized to itself. LegalizeDAG took another crack at promoting it, but didn't have a handler so just skipped it with a debug message saying it wasn't promoted. This patch just removes the operation actions to avoid this non-sense. Found while trying to refactor LegalizeVectorOps to handle multiple result nodes better.	2019-12-31 15:04:12 -08:00
Craig Topper	787e078f3e	[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.	2019-12-30 19:36:04 -08:00
Fangrui Song	6f9b4c6826	[SelectionDAT] Simplify SelectionDAGBuilder::visitInlineAsm Indirect C_Immediate or C_Other constraints have been excluded. Also simplify an unneeded change to indirect 'X' by D60942.	2019-12-29 20:53:30 -08:00
Fangrui Song	5edb40c022	[SelectionDAG] Disallow indirect "i" constraint This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.	2019-12-29 16:50:42 -08:00
Simon Pilgrim	34769e0783	SimplifyDemandedBits - Remove duplicate getOperand() call. NFC. Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.	2019-12-28 16:42:50 +00:00
Craig Topper	a3f8964813	[TargetLowering] Update comment to reference the correct compiler-rt function the code is based on. NFC	2019-12-27 22:49:04 -08:00
Fangrui Song	044cc919f4	Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removed	2019-12-27 18:09:22 -08:00
Fangrui Song	7a7334663c	Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821 Intrinsic has incorrect argument type! i32 (i32) @llvm.setjmp wipes tear	2019-12-27 00:00:14 -08:00
Craig Topper	53ee806d93	[X86][FPEnv] Promote some float strictfp operations to double on i686-pc-windows-msvc to match what we do for non-strict. The float libcalls are inlined in MSVC's math header where they just cast to double and use the double libcall. Do the same when we emit libcalls.	2019-12-26 20:22:24 -08:00
Kristina Bessonova	cdd25a4c74	[DebugInfo][SelectionDAG] Change order while transferring SDDbgValue to another node SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one to another node, but doesn't change its source order. If the destination node has the order greater than the SDDbgValue, there are two possible issues revealed later: * If debug info is attached to an instruction that is the first definition of a register, this ends up with a def-after-use and the debug info gets 'undef' later. * If MIR has another definition of a register above the debug info, the debug info may represent a source variable incorrectly because it appears (significantly) before an instruction corresponded to this debug info. So, the patch changes the order of an SDDbgValue when it is moved to a node with greater order. Reviewers: dblaikie, jmorse, aprantl Reviewed By: aprantl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71175	2019-12-26 21:01:59 +03:00
Wang, Pengfei	472bded3ed	[X86] Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Summary: Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Reviewers: craig.topper, RKSimon, LiuChen3, uweigand, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71871	2019-12-26 08:15:13 +08:00
Fangrui Song	e0d855b399	[SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptr CurDAG is referenced more than 2000 times and used in many gerated .cpp files. Don't touch it for now.	2019-12-23 22:41:05 -08:00
Fangrui Song	01b98e6fd5	[SelectionDAG] Don't repeatedly add a node to the worklist in ComputeLiveOutVRegInfo. NFC For sqlite3 amalgram, this decreases the number of Worklist.push_back calls (603084) by 10%.	2019-12-23 22:04:14 -08:00
Ulrich Weigand	0d3f782e41	[FPEnv][X86] More strict int <-> FP conversion fixes Fix several several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, incuding [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840	2019-12-23 21:11:45 +01:00
Sanjay Patel	8cefc37be5	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815	2019-12-23 10:11:45 -05:00
Carl Ritson	2791667d2e	[DAGCombiner] Check term use before applying aggressive FSUB optimisations Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656	2019-12-23 09:37:58 +09:00
Valentin Churavy	fb0ccff6e5	[SelectionDAG] Copy FP flags when visiting a binary instruction. Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6. ``` %29 = fmul contract <4 x double> %wide.load, %wide.load16 %30 = fmul contract <4 x double> %wide.load13, %wide.load17 %31 = fmul contract <4 x double> %wide.load14, %wide.load18 %32 = fmul contract <4 x double> %wide.load15, %wide.load19 %33 = fadd fast <4 x double> %vec.phi, %29 %34 = fadd fast <4 x double> %vec.phi10, %30 %35 = fadd fast <4 x double> %vec.phi11, %31 %36 = fadd fast <4 x double> %vec.phi12, %32 ``` Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function attribute, but rather emits more local instruction flags. This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further. Reviewers: spatel, mcberg2017 Reviewed By: spatel Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71495	2019-12-22 14:29:36 -05:00
Craig Topper	e6e23a24be	[LegalizeDAG] Add return to the strict node handling in PromoteLegalINT_TO_FP to prevent an invalid strict fp node from being created by falling into non-strict code path.	2019-12-19 11:39:50 -08:00
Liu, Chen3	2f932b5729	Enable STRICT_FP_TO_SINT/UINT on X86 backend This patch is mainly for custom lowering the vector operation. Differential Revision: https://reviews.llvm.org/D71592	2019-12-19 14:49:13 +08:00
Ulrich Weigand	1946461344	[FPEnv] Strict versions of llvm.minimum/llvm.maximum Add new intrinsics llvm.experimental.constrained.minimum llvm.experimental.constrained.maximum as strict versions of llvm.minimum and llvm.maximum. Includes SystemZ back-end support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71624	2019-12-18 21:35:28 +01:00
Craig Topper	cfe316007f	[SelectionDAGBuilder] Use getConstant instead of getTargetConstant to build the offset for struct types in getUniformBase. getTargetConstant prevents any optimizations from operating on the value and basically says its already been iseled. But since we want the index to be in a register, this isn't true. Prior to this we were generating a vbroadcast with an immediate argument which is illegal and was flagged by the expensive checks bot.	2019-12-18 10:44:28 -08:00
stozer	89d19d60ad	Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel This reverts commit `1f3dd83cc1`, reapplying commit `bb1b0bc4e5`. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.	2019-12-18 16:26:42 +00:00
stozer	1f3dd83cc1	Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel" Reverted due to build failure on windows bots. This reverts commit `bb1b0bc4e5`.	2019-12-18 11:46:10 +00:00
stozer	bb1b0bc4e5	[DebugInfo] Correctly handle salvaged casts and split fragments at ISel Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.	2019-12-18 11:09:18 +00:00
Wang, Pengfei	8cc0b58673	[X86] Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Summary: Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Reviewers: craig.topper, c-rhodes, RKSimon Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71442	2019-12-18 12:24:58 +08:00
Craig Topper	c36773c78e	[FPEnv][LegalizeTypes] Make ScalarizeVecOp_STRICT_FP_ROUND do its own replacements and return SDValue() The caller will assert for nodes with more than 2 results unless we return a null SDValue. I tried to test this by copying an AArch64 test for ScalarizeVecOp_FP_ROUND. While it did hit the assert and this commited fixed that. It also hit a later problem that couldn't be fixed without adding strict FP support to AArch64.	2019-12-17 15:17:43 -08:00
Craig Topper	84d8fa30f9	[FPEnv][LegalizeTypes][LegalizeDAG][AArch64] Few fixes/improvements for legalizing fp<->int conversion nodes. This started with adding a test to support get code coverage on ScalarizeVecOp_UnaryOp_StrictFP by copying an existing AArch64 test and using constrained sitofp/uitofp intrinsics. This found 3 separate issues: -ScalarizeVecOp_UnaryOp_StrictFP needs to do its own replacement because the caller can't handle replacing multiple results. -Missing integer promotion support for sitofp/uitofp -Chain result not always assigned in ExpandLegalINT_TO_FP. Committing them together so I can add the test case.	2019-12-17 14:37:00 -08:00
Sanjay Patel	6a77e36975	[SDAG] adjust isNegatibleForFree calculation to avoid crashing This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch adds a hack to work-around the case where we probably no longer detect that either multiply operand of an FMA isNegatibleForFree which is assumed to be true when we started rewriting the expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-17 13:49:15 -05:00
Sanjay Patel	5b0251da1c	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression()" This reverts commit `36b1232ec5`. Need to adjust commit message - that was a leftover from the earlier version.	2019-12-17 13:47:59 -05:00
Sanjay Patel	36b1232ec5	[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch adds a hack to work-around the case where we probably no longer detect that either multiply operand of an FMA isNegatibleForFree which is assumed to be true when we started rewriting the expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-17 13:46:06 -05:00
Amaury Séchet	ff6567cc77	[DAGCombiner] Add node back in the worklist in topological order in CommitTargetLoweringOpt Summary: Right now, DAGCombiner process the nodes in an iplementation defined order. This tends to be fragile as optimisation may or may not kick in depending on the traversal order. This is part of a larger effort to get the DAGCombiner to process its node in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70921	2019-12-17 18:26:16 +01:00
Kevin P. Neal	b1d8576b0a	This adds constrained intrinsics for the signed and unsigned conversions of integers to floating point. This includes some of Craig Topper's changes for promotion support from D71130. Differential Revision: https://reviews.llvm.org/D69275	2019-12-17 10:06:51 -05:00
Craig Topper	13ce7c1291	[LegalizeTypes] Pre-size the SmallVectors in ScalarizeVecRes_StrictFPOp and SplitVecRes_StrictFPOp so we don't have to call push_back. NFCI This avoids grow checking/handling in each iteration of the loop.	2019-12-16 23:42:13 -08:00
Craig Topper	c738ebc1f5	[LegalizeTypes] Remove ScalarizeVecRes_STRICT_FP_ROUND in favor of just using ScalarizeVecRes_StrictFPOp. NFCI It looks like ScalarizeVecRes_StrictFPOp can handle a variable number of arguments with scalar and vector types so it should be sufficient.	2019-12-16 23:42:13 -08:00
Craig Topper	c4d2bb1ede	[LegalizeTypes] Remove the call to SplitVecRes_UnaryOp from SplitVecRes_StrictFPOp. NFCI It doesn't seem to do anything that SplitVecRes_StrictFPOp can't do. SplitVecRes_StrictFPOp already handles nodes with a variable number of arguments and a mix of scalar and vector arguments.	2019-12-16 23:42:13 -08:00
Craig Topper	4e48513b47	[SelectionDAG] Add the fpexcept flag to the SelectionDAG dumping output so we can better see when its not propagating. We're currently losing this flag in type legalization and probably other places when we expand strict fp nodes. This will make reading logs easier.	2019-12-16 18:05:11 -08:00
Sanjay Patel	2afe864118	[DAG] Add SimplifyDemandedBits support for BSWAP This exposes a shortcoming for AArch64, and that is tracked by PR40881: https://bugs.llvm.org/show_bug.cgi?id=40881 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D58017	2019-12-15 08:52:34 -05:00
Craig Topper	1dc0c8af5e	[LegalizeTypes] Teach BitcastToInt_ATOMIC_SWAP to only create FP16_TO_FP when called from PromoteFloatResult. There's also a call from SoftenFloatResult that should not be promoted. The change test case would fail with the new RUN line prior to this change.	2019-12-14 15:05:32 -08:00
Craig Topper	95ce8f9498	[LegalizeTypes] In PromoteFloatOp_SETCC, don't both querying for transforming the result type. The result type is already legal, is doesnt' need to be transformed.	2019-12-14 15:05:32 -08:00
Alex Richardson	11448eeb72	[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD) Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert the files in the individual backends too. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71207	2019-12-13 21:40:03 +00:00
Alex Richardson	fc83f53a86	[NFC] Implement SelectionDAG::getObjectPtrOffset() using getMemBasePlusOffset() Summary: This change is preparatory work to use this helper functions in more places. In order to make this change, getMemBasePlusOffset() has been extended to also take a SDNodeFlags parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71206	2019-12-13 21:40:03 +00:00
Alex Richardson	ea8888d1af	[NFC] Add a SDValue overload for SelectionDAG::getMemBasePlusOffset() Summary: This change is preparatory work to use this helper functions in more places. Currently the function only allows integer constants offsets, but there are cases where we can use an existing SDValue parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel, craig.topper Reviewed By: spatel, craig.topper Subscribers: craig.topper, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71205	2019-12-13 21:40:03 +00:00
Alex Richardson	d9bb70acd7	[NFC] Change SelectionDAG::getMemBasePlusOffset() to use int64_t Summary: This change is preparatory work to use this helper functions in more places. Currently the function only allows positive offsets, but there are cases where we want to subtract an offset from an existing pointer. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71204	2019-12-13 21:40:03 +00:00
Sanjay Patel	2f0c7fd2db	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc (2nd try) The initial attempt (rG89633320) botched the logic by reversing the source/dest types. Added x86 tests for additional coverage. The vector tests show a potential improvement (fold vector load instead of broadcasting), but that's a known/existing problem. This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-13 14:03:54 -05:00
Nicola Zaghen	97572775d2	Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-13 14:30:21 +00:00
Alex Richardson	be15dfa88f	[NFC] Use EVT instead of bool for getSetCCInverse() Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void )0x12033091e < (void )0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917	2019-12-13 12:22:03 +00:00

... 2 3 4 5 6 ...

10649 Commits