llvm-project

Commit Graph

Author	SHA1	Message	Date
Fangrui Song	44cdae68c3	[CodeGenPrepare] Delete dead !DL check Follow-up for D73754 DL is assigned in CodeGenPrepare::runOnFunction and is guaranteed to be non-null.	2020-02-02 09:49:06 -08:00
Fangrui Song	5a56a25b0b	[CodeGenPrepare] Make TargetPassConfig required The code paths in the absence of TargetMachine, TargetLowering or TargetRegisterInfo are poorly tested. As rL285987 said, requiring TargetPassConfig allows us to delete many (untested) checks littered everywhere. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D73754	2020-02-02 09:28:45 -08:00
Fangrui Song	5932f7b8f2	[PatchableFunction] Use an empty DebugLoc The current FirstMI.getDebugLoc() is actually null in almost all cases. If it isn't, the generated .loc will be considered initial. The .loc will have the prologue_end flag and terminate the prologue prematurely. Also use an overload of BuildMI that will not prepend PATCHABLE_FUNCTION_ENTRY to a MachineInstr bundle.	2020-02-01 14:12:06 -08:00
Craig Topper	943b5561d6	[LegalizeTypes][X86] Add a new strategy for type legalizing f16 type that softens it to i16, but promotes to f32 around arithmetic ops. This is based on this llvm-dev thread http://lists.llvm.org/pipermail/llvm-dev/2019-December/137521.html The current strategy for f16 is to promote type to float everywhere except where the specific width is required like loads, stores, and bitcasts. This results in rounding occurring in odd places instead of immediately after arithmetic operations. This interacts in weird ways with the __fp16 type in clang which is a storage only type where arithmetic is always promoted to float. InstCombine can remove some fpext/fptruncs around such arithmetic and turn it into arithmetic on half. This wouldn't be so bad if SelectionDAG was able to put those fpext/fpround back in when it promotes. It is also not obvious how to make the existing strategy work with STRICT fp. We need to use STRICT versions of the conversions which require chain operands. But if the conversions are created for a bitcast, there is no place to get an appropriate chain from. This patch implements a different strategy where conversions are emitted directly around arithmetic operations. And otherwise it's passed around as an i16 including in arguments and return values. This can result in more conversions between arithmetic operations, but is closer to matching the IR the frontend generates for __fp16. And it will allow us to use the chain from constrained arithmetic nodes to link the STRICT_FP_TO_FP16/STRICT_FP16_TO_FP that will need to be added. I've set it up so that each target can opt into the new behavior. Converting all the targets myself was more than I was able to handle. Differential Revision: https://reviews.llvm.org/D73749	2020-02-01 11:21:04 -08:00
Matt Arsenault	bc101ffd77	GlobalISel: Support widening unmerge results with pointer source	2020-02-01 10:47:03 -05:00
David Blaikie	338beff4dc	DwarfDebug.cpp: Fix some indentation	2020-01-31 16:01:57 -08:00
David Blaikie	b33e5f3c3e	DebugInfo: Split DWARF: Hash non-member function child DIEs Significant missing hashing - as per the comment this was only meant to skip member functions (unspecified, but I think it's legible as member function declarations, not definitions) but was skipping all named subprograms (so only hashed child DIEs for member function definitions - because they didn't have a direct name, but only a name given indirectly in the DW_AT_specification-referenced DIE)	2020-01-31 15:32:03 -08:00
Matt Arsenault	792d9b5719	DAG: Check if a value is divergent before requiresUniformRegister This avoids a potentially expensive scan if we already know it doesn't matter.	2020-01-31 15:27:18 -08:00
Jay Foad	f465b1aff4	[GlobalISel] Tweak lowering of G_SMULO/G_UMULO Summary: Applying this cleanup: - MIRBuilder.buildInstr(TargetOpcode::G_ASHR) - .addDef(Shifted) - .addUse(Res) - .addUse(ShiftAmt); + MIRBuilder.buildAShr(Shifted, Res, ShiftAmt); caused an assertion failure here: llc: /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:404: llvm::MachineInstr *llvm::MachineRegisterInfo::getVRegDef(unsigned int) const: Assertion `(I.atEnd() || std::next(I) == def_instr_end()) && "getVRegDef assumes a single definition or no definition"' failed. #4 0x00000000050a6d96 in llvm::MachineRegisterInfo::getVRegDef (this=0x74606a0, Reg=2147483650) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/MachineRegisterInfo.cpp:403 #5 0x00000000066148f6 in llvm::getConstantVRegValWithLookThrough (VReg=2147483650, MRI=..., LookThroughInstrs=false, HandleFConstant=true) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:244 #6 0x00000000066147da in llvm::getConstantVRegVal (VReg=2147483650, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:210 #7 0x0000000006615367 in llvm::ConstantFoldBinOp (Opcode=101, Op1=2147483650, Op2=2147483656, MRI=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/Utils.cpp:341 #8 0x000000000657eee0 in llvm::CSEMIRBuilder::buildInstr (this=0x7465010, Opc=101, DstOps=..., SrcOps=..., Flag=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/CSEMIRBuilder.cpp:160 #9 0x0000000003645958 in llvm::MachineIRBuilder::buildAShr (this=0x7465010, Dst=..., Src0=..., Src1=..., Flags=...) at /home/jayfoad2/git/llvm-project/llvm/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h:1298 #10 0x00000000065c35b1 in llvm::LegalizerHelper::lower (this=0x7fffffffb5f8, MI=..., TypeIdx=0, Ty=...) at /home/jayfoad2/git/llvm-project/llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp:2020 because at this point there are two instructions defining Res: the original G_SMULO/G_UMULO and the new G_MUL that we built. The fix is to modify the original mul in place, so that there is only ever one definition of Res. Reviewers: arsenm, aditya_nandakumar Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72842	2020-01-31 19:21:01 +00:00
Simon Pilgrim	8fbc7fd567	[DAG] SimplifyMultipleUseDemandedBits - peek through unused ISD::INSERT_SUBVECTOR subvectors If we don't demand any elements of the inserted subvector then just skip it.	2020-01-31 18:57:22 +00:00
Simon Pilgrim	5702dadf6f	[DAG] Enable ISD::INSERT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::INSERT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-31 18:02:34 +00:00
Hiroshi Yamauchi	ac8da31a0f	[PGO][PGSO] Handle MBFIWrapper Some code gen passes use MBFIWrapper to keep track of the frequency of new blocks. This was not taken into account and could lead to incorrect frequencies as MBFI silently returns zero frequency for unknown/new blocks. Add a variant for MBFIWrapper in the PGSO query interface. Depends on D73494.	2020-01-31 09:36:55 -08:00
Jay Foad	2a1b5af299	[GlobalISel] Tidy up unnecessary calls to createGenericVirtualRegister Summary: As a side effect some redundant copies of constant values are removed by CSEMIRBuilder. Reviewers: aemerson, arsenm, dsanders, aditya_nandakumar Subscribers: sdardis, jvesely, wdng, nhaehnle, rovka, hiraditya, jrtc27, atanasyan, volkan, Petar.Avramovic, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73789	2020-01-31 17:07:16 +00:00
Guillaume Chatelet	3c89b75f23	[NFC] Introduce a type to model memory operation Summary: This is a first step before changing the types to llvm::Align and introduce functions to ease client code. Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73785	2020-01-31 17:29:01 +01:00
Quentin Colombet	cfebd77742	[GISel][KnownBits] Fix a bug where we could run out of stack space One of the exit criteria of computeKnownBits is whether we reach the max recursive call depth. Before this patch we would check that the depth is exactly equal to max depth to exit. Depth may get bigger than max depth if it gets passed to a different GISelKnownBits object. This may happen when, say, a generic part uses a GISelKnownBits object with some max depth, but then we hit TL.computeKnownBitsForTargetInstr which creates a new GISelKnownBits object with a different and smaller depth. In that situation, when we hit the max depth check for the first time in the target specific GISelKnownBits object, depth may already be bigger than the current max depth. Hence we would continue to compute the known bits, until we ran through the full depth of the chain of computation or ran out of stack space. For instance, let's say we have GISelKnownBits Info(/*MaxDepth*/ = 10); Info.getKnownBits(Foo) // 9 recursive calls to computeKnownBitsImpl. // Then we hit a target specific instruction. // The target specific GISelKnownBits does this: GISelKnownBits TargetSpecificInfo(/*MaxDepth*/ = 6) TargetSpecificInfo.computeKnownBitsImpl() // <-- next max depth checks would // always return false. This commit does not have any test case, since none of the in-tree targets use computeKnownBitsForTargetInstr.	2020-01-30 19:30:39 -08:00
Leonard Chan	2d3174c4df	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. This is an attempt to reland with a few fixes for buildbot since I haven't merged from master in a bit. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 17:09:42 -08:00
Amara Emerson	84bd851108	[GlobalISel][IRTranslator] When translating vector geps, splat the base pointer if required. We can have geps that have a scalar base pointer, and a vector index value, which means that the base pointer must be splatted into a vector of pointers. This fixes crashes on arm64 GlobalISel with optimizations enabled.	2020-01-30 16:27:27 -08:00
Leonard Chan	3b23453b6c	Revert "[SafeStack][DebugInfo] Insert DW_OP_deref in correct location" This reverts commit `fff6a1b0f1`. This was breaking a bunch of buildbots.	2020-01-30 16:18:41 -08:00
Leonard Chan	fff6a1b0f1	[SafeStack][DebugInfo] Insert DW_OP_deref in correct location This patch addresses the issue found in https://bugs.llvm.org/show_bug.cgi?id=44585 where a DW_OP_deref was placed at the end of a dwarf expression, resulting in corrupt symbols when debugging. Differential Revision: https://reviews.llvm.org/D73526	2020-01-30 15:58:37 -08:00
Matt Arsenault	eb7f74e300	CodeGen: Use Register	2020-01-30 15:01:56 -08:00
Sean Fertile	8b737688c2	[AIX] Minor cleanup in AsmPrinter. [NFC] - Extends the comments related to function descriptors, noting how they are only used on AIX. - Changes the condition used to gate the creation of the current function symbol in AsmPrinter::SetupMachineFunction to reflect being AIX specific. The creation of the symbol is different because of AIXs linkage conventions, not because AIX uses function descriptors. Differential Revision: https://reviews.llvm.org/D73115	2020-01-30 14:15:02 -05:00
Fangrui Song	06b8e32d4f	[AArch64] -fpatchable-function-entry=N,0: place patch label after BTI Summary: For -fpatchable-function-entry=N,0 -mbranch-protection=bti, after `9a24488cb6`, we place the NOP sled after the initial BTI. ``` .Lfunc_begin0: bti c nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lfunc_begin0 ``` This patch adds a label after the initial BTI and changes the __patchable_function_entries entry to reference the label: ``` .Lfunc_begin0: bti c .Lpatch0: nop nop .section __patchable_function_entries,"awo",@progbits,f,unique,0 .p2align 3 .xword .Lpatch0 ``` This placement is compatible with the resolution in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 . A local linkage function whose address is not taken does not need a BTI. Placing the patch label after BTI has the advantage that code does not need to differentiate whether the function has an initial BTI. Reviewers: mrutland, nickdesaulniers, nsz, ostannard Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73680	2020-01-30 11:11:52 -08:00
Matt Arsenault	ea956685a1	GlobalISel: Implement s32->s64 G_FPTOSI lowering Port directly from DAG version. The lowering for G_FPTOUI used to fail on AMDGPU because it uses G_FPTOSI.	2020-01-30 08:47:07 -05:00
Dominik Montada	dc141af755	[GlobalISel] (fix) Use pointer type size for offset constant when lowering stores Commit `9965b12fd1` was supposed to change the offset constant when lowering load/stores, but only introduced this change for loads. This patch adds the same fix for stores.	2020-01-30 08:32:35 -05:00
Simon Pilgrim	57b0d33224	[DAGCombiner] ISD::AND/OR/XOR - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:53 +00:00
Simon Pilgrim	a967aa2706	[DAGCombiner] ISD::SDIV/UDIV/SREM/UREM - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-30 12:02:52 +00:00
Matt Arsenault	c5fffa4da3	GlobalISel: Add observer argument to legalizeIntrinsic This is passed to legalizeCustom, but not intrinsic. Also remove the MRI argument, since you can get that from the MachineIRBuilder. I'm not sure why MachineIRBuilder has a private observer member, and this is passed separately.	2020-01-29 18:33:45 -05:00
Amara Emerson	c12f046eb9	[GlobalISel] Add new combine to convert scalar G_MUL to G_SHL. For pow2 constants we should use G_SHL for pattern matching (and perf) purposes later. Vector support not yet implemented. Differential Revision: https://reviews.llvm.org/D73659	2020-01-29 13:39:00 -08:00
Amara Emerson	0da937bb5c	[GlobalISel][IRTranslator] Follow convention and put constant offset of getelementptr arithmetic on RHS. We were needlessly putting known constant values on the LHS of a G_MUL, which is suboptimal. Differential Revision: https://reviews.llvm.org/D73650	2020-01-29 11:37:19 -08:00
Fangrui Song	8903e61b66	[AsmPrinter][ELF] Define local aliases (.Lfoo$local) for GlobalObjects For `MC_GlobalAddress` operands referencing certain GlobalObjects, we can lower them to STB_LOCAL aliases to avoid costs brought by assembler/linker's conservative decisions about symbol interposition: * An assembler conservatively assumes a global default visibility symbol interposable (ELF semantics). So relocations in object files are needed even if the code generator assumed the definition exact and non-interposable. * The relocations can cause the creation of PLT entries on some targets for -shared links. A linker conservatively assumes a global default visibility symbol interposable (if not otherwise constrained by -Bsymbolic/--dynamic-list/VER_NDX_LOCAL/etc). "certain" refers to GlobalObjects in the intersection of `hasExactDefinition() and !isInterposable()`: `external`, `appending`, `internal`, `private`. Local linkages (`internal` and `private`) cannot be interposed. `appending` is for very few objects LLVM interprets specially. So the set just includes `external`. This patch emits STB_LOCAL aliases (.Lfoo$local) for such GlobalObjects, so that targets can lower MC_GlobalAddress operands to STB_LOCAL aliases if applicable. We may extend the scope and include GlobalAlias in the future. LLVM's existing -fno-semantic-interposition behaviors give us license to do such optimizations: * Various optimizations (ipconstprop, inliner, sccp, sroa, etc) treat normal ExternalLinkage GlobalObjects as non-interposable. * Before D72197, MC resolved a PC-relative VK_None fixup to a non-local symbol at assembly time (no outstanding relocation), if the target is defined in the same section. Put simply, even if IR optimizations failed to optimize and allowed interposition for the function call in `void foo() {} void bar() { foo(); }`, the assembler would disallow it. This patch sets up AsmPrinter infrastructure to make -fno-semantic-interposition more so. With and without the patch, the object file output should be identical: `.Lfoo$local` does not take a symbol table entry. Reviewed By: sfertile Differential Revision: https://reviews.llvm.org/D73228	2020-01-29 10:58:43 -08:00
Simon Pilgrim	f7245ef897	[DAGCombiner] ISD::SHL/SRA/SRL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 18:49:42 +00:00
Adrian Prantl	18dbe1b279	Run clang-format on DwarfExpression (NFC)	2020-01-29 10:23:12 -08:00
Adrian Prantl	816ee8a423	DwarfExpression: Factor out getOrCreateBaseType() (NFC)	2020-01-29 10:23:12 -08:00
Simon Pilgrim	25b8e96388	[DAGCombiner] ISD::MUL - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 17:26:22 +00:00
Simon Pilgrim	4b04e11735	[DAGCombiner] Sub/SUBSAT - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us.	2020-01-29 16:57:13 +00:00
Simon Pilgrim	48bd6a0986	[DAGCombiner] visitIMINMAX - use general SelectionDAG::FoldConstantArithmetic This handles all the constant splat / opaque testing for us instead of the ConstantSDNode variant where we have to do it ourselves.	2020-01-29 16:57:13 +00:00
Matt Arsenault	b63629a58d	GlobalISel: Fix mask computation in lowerInsert This is supposed to be the high bit index, not the width. Use the wrapping form of getBitsSet and avoid the bitflip.	2020-01-29 08:25:36 -08:00
Jay Foad	0d7bd34312	[MachineScheduler] Ignore artificial edges when forming store chains Summary: BaseMemOpClusterMutation::apply forms store chains by looking for control (i.e. non-data) dependencies from one mem op to another. In the test case, clusterNeighboringMemOps successfully clusters the loads, and then adds artificial edges to the loads' successors as described in the comment: // Copy successor edges from SUa to SUb. Interleaving computation // dependent on SUa can prevent load combining due to register reuse. The effect of this is that data dependencies from one load to a store are copied as artificial dependencies from a different load to the same store. Then when BaseMemOpClusterMutation::apply looks at the stores, it finds that some of them have a control dependency on a previous load, which breaks the chains and means that the stores are not all considered part of the same chain and won't all be clustered. The fix is to only consider non-artificial control dependencies when forming chains. Subscribers: MatzeB, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71717	2020-01-29 16:23:01 +00:00
Matt Arsenault	f717483acd	GlobalISel: Assert on invalid bitcast in MIRBuilder The other casts validate, so this should too.	2020-01-29 07:49:39 -08:00
Matt Arsenault	c5c1bb3374	GlobalISel: Lower G_WRITE_REGISTER	2020-01-29 06:48:24 -08:00
David Stenberg	6a2413c435	[ARM64] Debug info for structure argument missing DW_AT_location Summary: Prevent eliminating dbg_val due to COPY. Fixes this https://bugs.llvm.org/show_bug.cgi?id=40709 Patch by: Kamlesh Kumar (kamleshbhalui) Reviewers: aprantl, dblaikie, vsk, dsanders Reviewed By: dsanders Subscribers: dstenb, kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73159	2020-01-29 10:56:23 +01:00
Sam Parker	ac30ea2f87	[RDA][ARM] Move functionality into RDA Add several new helpers to RDA: - hasLocalDefBefore - isRegDefinedAfter - isSafeToDefRegAt And move two bits of logic from ARMLowOverheadLoops into RDA: - isSafeToMove - isSafeToRemove Both of these have some wrappers too to make them more convenient to use. Differential Revision: https://reviews.llvm.org/D73460	2020-01-29 03:27:47 -05:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Michael Spang	a2fb2c0ddc	[GlobalMerge] Preserve symbol visibility when merging globals Symbols created for merged external global variables have default visibility. This can break programs when compiling with -Oz -fvisibility=hidden as symbols that should be hidden will be exported at link time. Differential Revision: https://reviews.llvm.org/D73235	2020-01-28 13:26:18 -08:00
Hiroshi Yamauchi	2c03c899d5	[MBFI] Move BranchFolding::MBFIWrapper to its own files. NFC. Summary: To avoid header file circular dependency issues in passing updated MBFI (in MBFIWrapper) to the interface of profile guided size optimizations. A prep step for (and split off of) D73381. Reviewers: davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73494	2020-01-28 10:58:46 -08:00
Sam Parker	7ad879caa0	[NFC][RDA] typedef SmallPtrSetImpl<MachineInstr*>	2020-01-28 13:15:44 +00:00
Wang, Pengfei	3239b5034e	[FPEnv] Add pragma FP_CONTRACT support under strict FP. Summary: Support pragma FP_CONTRACT under strict FP. Reviewers: craig.topper, andrew.w.kaylor, uweigand, RKSimon, LiuChen3 Subscribers: hiraditya, jdoerfert, cfe-commits, llvm-commits, LuoYuanke Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D72820	2020-01-28 20:43:43 +08:00
Guillaume Chatelet	879c825cb8	[intrinsics] Add @llvm.memcpy.inline intrinsics Summary: This is a follow up on D61634. It adds an LLVM IR intrinsic to allow better implementation of memcpy from C++. A follow up CL will add the intrinsics in Clang. Reviewers: courbet, theraven, t.p.northover, jdoerfert, tejohnson Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71710	2020-01-28 09:42:01 +01:00
Fangrui Song	c7c5da6df3	Reland "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" Reland `7a8b0b1595`, with a fix that checks `!E.value().empty()` to avoid inserting a zero to SlotRemap. Debugged by rnk@ in https://bugs.chromium.org/p/chromium/issues/detail?id=1045650#c33 Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D73510	2020-01-27 15:58:49 -08:00
Jay Foad	cbbbd5b5f6	[GlobalISel] Make use of KnownBits::computeForAddSub Summary: This is mostly NFC. computeForAddSub may give more precise results in some cases, but that doesn't seem to affect any existing GlobalISel tests. Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73431	2020-01-27 22:22:56 +00:00
Simon Pilgrim	e7e043724e	[DAG] Enable ISD::EXTRACT_SUBVECTOR SimplifyMultipleUseDemandedBits handling This allows SimplifyDemandedBits to call SimplifyMultipleUseDemandedBits to create a simpler ISD::EXTRACT_SUBVECTOR, which is particularly useful for cases where we're splitting into subvectors anyhow.	2020-01-27 21:17:47 +00:00
Adrian Prantl	a095d149c2	Fix an assertion failure in DwarfExpression's subregister composition This patch fixes an assertion failure in DwarfExpression that is triggered when a complex fragment has exactly the size of a subregister of the register the DBG_VALUE points to and there is no DWARF encoding for the super-register. I took the opportunity to replace/document some magic values with static constructor functions to make this code less confusing to read. rdar://problem/58489125 Differential Revision: https://reviews.llvm.org/D72938	2020-01-27 12:44:37 -08:00
Vedant Kumar	e08f205f5c	Reland (again): [DWARF] Allow cross-CU references of subprogram definitions This is a revert-of-revert (i.e. this reverts commit `802bec89`, which itself reverted `fa4701e1` and `79daafc9`) with a fix folded in. The problem was that call site tags weren't emitted properly when LTO was enabled along with split-dwarf. This required a minor fix. I've added a reduced test case in test/DebugInfo/X86/fission-call-site.ll. Original commit message: This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. Update #1: Reland with a fix to create a declaration DIE when the declaration is missing from the CU's retainedTypes list. The declaration is left out of the retainedTypes list in two cases: 1) Re-compiling pre-r266445 bitcode (in which declarations weren't added to the retainedTypes list), and 2) Doing LTO function importing (which doesn't update the retainedTypes list). It's possible to handle (1) and (2) by modifying the retainedTypes list (in AutoUpgrade, or in the LTO importing logic resp.), but I don't see an advantage to doing it this way, as it would cause more DWARF to be emitted compared to creating the declaration DIEs lazily. Update #2: Fold in a fix for call site tag emission in the split-dwarf + LTO case. Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a ReleaseLTO-g build of the test suite. rdar://46577651, rdar://57855316, rdar://57840415, rdar://58888440 Differential Revision: https://reviews.llvm.org/D70350	2020-01-27 10:52:34 -08:00
Nico Weber	68051c1224	Revert "[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager()" This reverts commit `7a8b0b1595`. It seems to break exception handling on 32-bit Windows, see https://crbug.com/1045650	2020-01-27 11:22:33 -05:00
Dominik Montada	9965b12fd1	Use pointer type size for offset constant when lowering load/stores	2020-01-27 06:55:32 -08:00
Matt Arsenault	2a160ba5b0	GlobalISel: Reimplement widenScalar for G_UNMERGE_VALUES results Only use shifts if the requested type exactly matches the source type, and create sub-unmerges otherwise.	2020-01-27 06:18:26 -08:00
Matt Arsenault	06d9230fef	GlobalISel: Translate vector GEPs	2020-01-27 05:35:05 -08:00
Igor Kudrin	8f3d47c54a	[DWARF] Do not pass Version to DWARFExpression. NFCI. The Version was used only to determine the size of an operand of DW_OP_call_ref. The size was 4 for all versions apart from 2, but the DW_OP_call_ref operation was introduced only in DWARF3. Thus, the code may be simplified and the use of Version may be eliminated. Differential Revision: https://reviews.llvm.org/D73264	2020-01-27 19:08:46 +07:00
David Stenberg	13d4ef9ac0	Improvements to call site register worklist Summary: This fixes PR44118. For cases where we have a chain like this: R8 = R1 (entry value) R0 = R8 call @foo R0 the code that emits call site entries using entry values would not follow that chain, instead emitting a call site entry with R8 as location rather than R0. Such a case was discovered when originally adding dbgcall-site-orr-moves.mir. This patch fixes that issue. This is done by changing the ForwardedRegWorklist set to a map in which the worklist registers always map to the parameter registers that they describe. Another thing this patch fixes is that worklist registers now can describe more than one parameter register at a time. Such a case occurred in dbgcall-site-interpretation.mir, resulting in a call site entry not being emitted for one of the parameters. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73168	2020-01-27 12:41:42 +01:00
David Stenberg	b46baa82fc	Don't separate imp/expl def handling for call site params Summary: Since D70431 the describeLoadedValue() hook takes a parameter register, meaning that it can now be asked to describe any register. This means that we can drop the difference between explicit and implicit defines that we previously had in collectCallSiteParameters(). I have not found any case for any upstream targets where a parameter register is only implicitly defined, and does not overlap with any explicit defines. I don't know if such a case would even make sense. So as far as I have tested, this patch should be a non-functional change. However, this reduces the complexity of the code a bit, and it will simplify the implementation of an upcoming patch which solves PR44118. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D73167	2020-01-27 11:31:09 +01:00
Petar Avramovic	cbf03aee6d	[MIPS GlobalISel] Select population count (popcount) G_CTPOP is generated from llvm.ctpop.<type> intrinsics, clang generates these intrinsics from __builtin_popcount and __builtin_popcountll. Add lower and narrow scalar for G_CTPOP. Lower G_CTPOP for MIPS32. Differential Revision: https://reviews.llvm.org/D73216	2020-01-27 09:59:50 +01:00
Petar Avramovic	8bc7ba5b9e	[MIPS GlobalISel] Select count trailing zeros The llvm.cttz.<type> intrinsic has an additional i1 argument, is_zero_undef, which tells whether zero as the first argument produces a defined result. G_CTTZ is generated from llvm.cttz.<type> (<type> <src>, i1 false) intrinsics; clang generates these intrinsics from __builtin_ctz and __builtin_ctzll. G_CTTZ_ZERO_UNDEF comes from llvm.cttz.<type> (<type> <src>, i1 true). Clang generates such intrinsics as part of the expansion of __builtin_ffs and __builtin_ffsll. It is also traditionally part of many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar (algorithm uses G_CTTZ_ZERO_UNDEF) for G_CTTZ. Lower G_CTTZ and G_CTTZ_ZERO_UNDEF for MIPS32. Differential Revision: https://reviews.llvm.org/D73215	2020-01-27 09:51:06 +01:00
Petar Avramovic	2b66d32f3f	[MIPS GlobalISel] Select count leading zeros The llvm.ctlz.<type> intrinsic has an additional i1 argument, is_zero_undef, which tells whether zero as the first argument produces a defined result. The MIPS clz instruction returns 32 for zero input. G_CTLZ is generated from llvm.ctlz.<type> (<type> <src>, i1 false) intrinsics; clang generates these intrinsics from __builtin_clz and __builtin_clzll. G_CTLZ_ZERO_UNDEF can also be generated from llvm.ctlz with true as second argument. It is also traditionally part of many algorithms that are now predicated on avoiding zero-value inputs. Add narrow scalar for G_CTLZ (algorithm uses G_CTLZ_ZERO_UNDEF). Lower G_CTLZ_ZERO_UNDEF and select G_CTLZ for MIPS32. Differential Revision: https://reviews.llvm.org/D73214	2020-01-27 09:43:38 +01:00
Fangrui Song	941f20c3bd	[MachineVerifier] Simplify and delete LLVM_VERIFY_MACHINEINSTRS from a comment. NFC The environment variable has been unused since r228079.	2020-01-27 00:31:23 -08:00
Wang, Pengfei	17b8f96d65	[FPEnv] Divide macro INSTRUCTION into INSTRUCTION and DAG_INSTRUCTION, and macro FUNCTION likewise. NFCI. Some functions like fmuladd don't really have a node, so we should separate their declarations from those that have a node to avoid introducing fake nodes. Differential Revision: https://reviews.llvm.org/D72871	2020-01-27 10:38:05 +08:00
Simon Pilgrim	4a5f9d9faf	[TargetLowering] Respect recursive depth in SimplifyDemandedBits call to ComputeNumSignBits	2020-01-26 10:01:56 +00:00
Simon Pilgrim	3daa71ee00	[SelectionDAG] ComputeNumSignBits - add DemandedElts support for MIN/MAX ops	2020-01-25 20:21:14 +00:00
Simon Pilgrim	3f8916b2e8	[SelectionDAG] ComputeNumSignBits - add support for rotate non-uniform vector amounts	2020-01-25 19:15:05 +00:00
Simon Pilgrim	e3c26a9d1b	[SelectionDAG] ComputeNumSignBits - add support for rotate uniform vector amounts	2020-01-25 18:55:47 +00:00
Simon Pilgrim	c8de7c8f50	[TargetLowering] SimplifyDemandedBits - Remove ashr if all our demandedbits already match the sign bit Differential Revision: https://reviews.llvm.org/D73412	2020-01-25 17:36:46 +00:00
Vedant Kumar	802bec8961	Revert "Reland: [DWARF] Allow cross-CU references of subprogram definitions" ... as well as: Revert "[DWARF] Defer creating declaration DIEs until we prepare call site info" This reverts commit `fa4701e197`. This reverts commit `79daafc903`. There have been reports of this assert getting hit: CalleeDIE && "Could not find DIE for call site entry origin	2020-01-24 18:07:54 -08:00
Quentin Colombet	5d87b5d202	[GISelKnownBits] Add support for PHIs Teach the GISelKnownBits analysis how to deal with PHI operations. PHIs are essentially COPYs happening on edges, so we can just reuse the code for COPY. This is NFC COPY-wise, as we leave Depth untouched when calling computeKnownBitsImpl for COPYs, like it was before this patch. Increasing Depth is, however, required for PHIs, as they may loop back to themselves and we would end up in an infinite loop if we did not increase Depth. Differential Revision: https://reviews.llvm.org/D73317	2020-01-24 16:43:52 -08:00
@justice_adams (Justice Adams)	daee63f974	[SelectionDag] Updated FoldConstantArithmetic method signature in preparation for merge with FoldConstantVectorArithmetic Updated FoldConstantArithmetic method signature to match that of FoldConstantVectorArithmetic in preparation for merging the two functions together https://bugs.llvm.org/show_bug.cgi?id=36544 This is the first step in combining the various FoldConstantArithmetic and FoldConstantVectorArithmetic functions into one FoldConstantArithmetic function. Differential Revision: https://reviews.llvm.org/D72870	2020-01-24 18:00:58 -05:00
Craig Topper	d3bf06bc81	[DAGCombiner] Add combine for (not (strict_fsetcc)) to create a strict_fsetcc with the opposite condition. Unlike the existing code that I modified here, I only handle the case where the strict_fsetcc has a single use. Not sure exactly how to handle multiple uses. Testing this on X86 is hard because we already have other combines that get rid of the lowered version of the integer setcc that this xor will eventually become. So this combine really just saves a bunch of extra nodes being created. Not sure about other targets. Differential Revision: https://reviews.llvm.org/D71816	2020-01-24 14:15:36 -08:00
Stanislav Mekhanoshin	be8e38cbd9	Correct NumLoads in clustering The scheduler passes a NumLoads argument to shouldClusterMemOps() that is one less than the actual cluster length, so for 2 instructions it passes just 1. Correct this number. This is NFC for in-tree targets. Differential Revision: https://reviews.llvm.org/D73292	2020-01-24 12:45:28 -08:00
Stanislav Mekhanoshin	7a94d4f4ee	Allow combining of extract_subvector to extract element Differential Revision: https://reviews.llvm.org/D73132	2020-01-24 10:50:26 -08:00
Fangrui Song	50a3ff30e1	[PatchableFunction] Allow empty entry MachineBasicBlock Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D73301	2020-01-24 09:42:48 -08:00
Tom Weaver	f5147765ba	[DebugInfo][LiveDebugValues] Teach Live Debug Values About Meta Instructions Previously, the LiveDebugValues pass would consider meta instructions that 'fiddle' with the liveness of registers as register definitions when transferring register defs. This would mean that, for example, a KILL instruction would cause LiveDebugValues to terminate the range of an earlier DBG_VALUE instruction, resulting in the non-propagation of said DBG_VALUE instructions into later blocks. This patch adds the check and a helpful comment, fixes a test that previously tested for the broken behaviour by coincidence and adds a test specifically for this. reviewers: vsk, dstenb, djtodoro Differential Revision: https://reviews.llvm.org/D73210	2020-01-24 16:29:05 +00:00
Guillaume Chatelet	805c157e8a	[Alignment][NFC] Deprecate Align::None() Summary: This is a follow up on https://reviews.llvm.org/D71473#inline-647262. There's a caveat here that `Align(1)` relies on the compiler understanding of `Log2_64` implementation to produce good code. One could use `Align()` as a replacement but I believe it is less clear that the alignment is one in that case. Reviewers: xbolva00, courbet, bollu Subscribers: arsenm, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, Jim, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73099	2020-01-24 12:53:58 +01:00
Simon Pilgrim	0b45c2264a	[SelectionDAG] rot(x, y) --> x iff ComputeNumSignBits(x) == BitWidth(x) Rotating an 0/-1 value by any amount will always result in the same 0/-1 value	2020-01-24 10:35:57 +00:00
Fangrui Song	22467e2595	Add function attribute "patchable-function-prefix" to support -fpatchable-function-entry=N,M where M>0 Similar to the function attribute `prefix` (prefix data), "patchable-function-prefix" inserts data (M NOPs) before the function entry label. -fpatchable-function-entry=2,1 (1 NOP before entry, 1 NOP after entry) will look like: ``` .type foo,@function .Ltmp0: # @foo nop foo: .Lfunc_begin0: # optional `bti c` (AArch64 Branch Target Identification) or # `endbr64` (Intel Indirect Branch Tracking) nop .section __patchable_function_entries,"awo",@progbits,get,unique,0 .p2align 3 .quad .Ltmp0 ``` -fpatchable-function-entry=N,0 + -mbranch-protection=bti/-fcf-protection=branch has two reasonable placements (https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01185.html): ``` (a) (b) func: func: .Ltmp0: bti c bti c .Ltmp0: nop nop ``` (a) needs no additional code. If the consensus is to go for (b), we will need more code in AArch64BranchTargets.cpp / X86IndirectBranchTracking.cpp . Differential Revision: https://reviews.llvm.org/D73070	2020-01-23 17:02:27 -08:00
Simon Pilgrim	e25eee4db7	[SelectionDAG] ComputeNumSignBits - add ISD::ADD demanded elts support	2020-01-23 17:48:07 +00:00
Sam Parker	05532575e8	[RDA] Skip debug values Skip debug instructions when iterating through a block to find uses. Differential Revision: https://reviews.llvm.org/D73273	2020-01-23 17:04:54 +00:00
Simon Pilgrim	0fec8acdd8	[SelectionDAG] ComputeNumSignBits - add ISD::ADD vector support Add missing handling for (ADD (AND X, 1), -1) uniform vectors	2020-01-23 16:42:12 +00:00
Guillaume Chatelet	59f95222d4	[Alignment][NFC] Use Align with CreateAlignedStore Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73274	2020-01-23 17:34:32 +01:00
Simon Pilgrim	fc5bbbf328	[SelectionDAG] ComputeNumSignBits - add ISD::SUB demanded elts support	2020-01-23 16:20:48 +00:00
Jay Foad	b482e1bfe2	[CodeGen] Make use of MachineInstrBuilder::getReg Reviewers: arsenm Subscribers: wdng, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73262	2020-01-23 13:38:13 +00:00
Sam Parker	0d1468db58	[NFC][RDA] Make the interface const Make all the public query methods const.	2020-01-23 13:32:11 +00:00
Simon Pilgrim	48d4ba8fb2	[SelectionDAG] Compute Known + Sign Bits - merge INSERT_VECTOR_ELT known/unknown index paths Match the approach in SimplifyDemandedBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits/ComputeNumSignBits call.	2020-01-23 13:31:37 +00:00
Guillaume Chatelet	279fa8e006	[Alignement][NFC] Deprecate untyped CreateAlignedLoad Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, kerbowa, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73260	2020-01-23 13:34:32 +01:00
Simon Pilgrim	03cae086f4	[SelectionDAG] ComputeKnownBits - merge EXTRACT_VECTOR_ELT known/unknown index paths Match the approach in SimplifyDemandedBits/ComputeNumSignBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits call.	2020-01-23 11:29:16 +00:00
Simon Pilgrim	98da49d979	[SelectionDAG] Compute Known + Sign Bits - merge INSERT_SUBVECTOR known/unknown index paths Match the approach in SimplifyDemandedBits where we calculate the demanded elts and then have a common path for the ComputeKnownBits/ComputeNumSignBits call, additionally we only ever need original demanded elts of the base vector even if the index is unknown.	2020-01-23 11:29:15 +00:00
Djordje Todorovic	91b0956f38	[NFC][DwarfDebug] Use proper analog GNU attribute for the pc address The low_pc is analog to the DW_AT_call_return_pc, since it describes the return address after the call. The DW_AT_call_pc is the address of the call instruction, and we don't use it at the moment. Differential Revision: https://reviews.llvm.org/D73173	2020-01-23 12:15:35 +01:00
David Tenty	45a4aaea7f	[NFC][XCOFF] Refactor Csect creation into TargetLoweringObjectFile Summary: We create a number of standard types of control sections in multiple places for things like the function descriptors, external references and the TOC anchor among others, so it is possible for their properties to be defined inconsistently in different places. This refactor moves their creation and properties into functions in the TargetLoweringObjectFile class hierarchy, where functions for retrieving various special types of sections typically seem to reside. Note: There is one case in PPCISelLowering which is specific to function entry points which we don't address since we don't have access to the TLOF there. Reviewers: DiggerLin, jasonliu, hubert.reinterpretcast Reviewed By: jasonliu, hubert.reinterpretcast Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72347	2020-01-22 12:09:11 -05:00
Stanislav Mekhanoshin	2d0fcf786c	Precommit NFC part of DAGCombiner change. NFC. This is NFC part of DAGCombiner::visitEXTRACT_SUBVECTOR() change in the D73132.	2020-01-22 09:01:22 -08:00
Hiroshi Yamauchi	ddbc728828	[PGO][PGSO] Update BFI in CodeGenPrepare::optimizeSelectInst. Summary: Without the BFI update, some hot blocks are incorrectly treated as cold code. This fixes a FDO perf regression in the TSVC benchmark from D71288. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D73146	2020-01-22 08:36:54 -08:00
Sander de Smalen	4cf16efe49	[AArch64][SVE] Add patterns for unpredicated load/store to frame-indices. This patch also fixes up a number of cases in DAGCombine and SelectionDAGBuilder where the size of a scalable vector is used in a fixed-width context (thus triggering an assertion failure). Reviewers: efriedma, c-rhodes, rovka, cameron.mcinally Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D71215	2020-01-22 14:32:27 +00:00
Jay Foad	e0f0d0e55c	[MachineScheduler] Allow clustering mem ops with complex addresses The generic BaseMemOpClusterMutation calls into TargetInstrInfo to analyze the address of each load/store instruction, and again to decide whether two instructions should be clustered. Previously this had to represent each address as a single base operand plus a constant byte offset. This patch extends it to support any number of base operands. The old target hook getMemOperandWithOffset is now a convenience function for callers that are only prepared to handle a single base operand. It calls the new more general target hook getMemOperandsWithOffset. The only requirements for the base operands returned by getMemOperandsWithOffset are: - they can be sorted by MemOpInfo::Compare, such that clusterable ops get sorted next to each other, and - shouldClusterMemOps knows what they mean. One simple follow-on is to enable clustering of AMDGPU FLAT instructions with both vaddr and saddr (base register + offset register). I've left a FIXME in the code for this case. Differential Revision: https://reviews.llvm.org/D71655	2020-01-22 14:28:24 +00:00
Simon Pilgrim	80656fd7ae	[SelectionDAG] getShiftAmountConstant - assert the type is an integer.	2020-01-22 13:52:44 +00:00
Sander de Smalen	67d4c9924c	Add support for (expressing) vscale. In LLVM IR, vscale can be represented with an intrinsic. For some targets, this is equivalent to the constexpr: getelementptr <vscale x 1 x i8>, <vscale x 1 x i8>* null, i32 1 This can be used to propagate the value in CodeGenPrepare. In ISel we add a node that can be legalized to one or more instructions to materialize the runtime vector length. This patch also adds SVE CodeGen support for VSCALE, which maps this node to RDVL instructions (for scaled multiples of 16bytes) or CNT[HSD] instructions (scaled multiples of 2, 4, or 8 bytes, respectively). Reviewers: rengolin, cameron.mcinally, hfinkel, sebpop, SjoerdMeijer, efriedma, lattner Reviewed by: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68203	2020-01-22 10:09:27 +00:00
Guillaume Chatelet	0957233320	[Alignment][NFC] Use Align with CreateMaskedStore Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D73106	2020-01-22 11:04:39 +01:00
Amara Emerson	67a8775322	[AArch64] Don't generate gpr CSEL instructions in early-ifcvt if regclasses aren't compatible. In GlobalISel we may in some unfortunate circumstances generate PHIs with operands that are on separate banks. If-conversion doesn't currently check for that case and ends up generating a CSEL on AArch64 with incorrect register operands. Differential Revision: https://reviews.llvm.org/D72961	2020-01-21 16:51:31 -08:00
Quentin Colombet	ff1f3cc1a1	[GISelKnownBits] Make the max depth a parameter of the analysis Allow users of that analysis to define the cut off depth of the analysis instead of hardcoding 6. NFC as the default parameter is 6.	2020-01-21 11:35:31 -08:00
Thomas Lively	3ef169e586	[WebAssembly][InstrEmitter] Foundation for multivalue call lowering Summary: WebAssembly is unique among upstream targets in that it does not at any point use physical registers to store values. Instead, it uses virtual registers to model positions in its value stack. This means that some target-independent lowering activities that would use physical registers need to use virtual registers instead for WebAssembly and similar downstream targets. This CL generalizes the existing `usesPhysRegsForPEI` lowering hook to `usesPhysRegsForValues` in preparation for using it in more places. One such place is in InstrEmitter for instructions that have variadic defs. On register machines, it only makes sense for these defs to be physical registers, but for WebAssembly they must be virtual registers like any other values. This CL changes InstrEmitter to check the new target lowering hook to determine whether variadic defs should be physical or virtual registers. These changes are necessary to support a generalized CALL instruction for WebAssembly that is capable of returning an arbitrary number of arguments. Fully implementing that instruction will require additional changes that are described in comments here but left for a follow up commit. Reviewers: aheejin, dschuff, qcolombet Subscribers: sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71484	2020-01-21 11:13:46 -08:00
Fangrui Song	7a8b0b1595	[StackColoring] Remap PseudoSourceValue frame indices via MachineFunction::getPSVManager() Reviewed By: dantrushin Differential Revision: https://reviews.llvm.org/D73063	2020-01-21 09:46:27 -08:00
Krzysztof Parzyszek	020041d99b	Update spelling of {analyze,insert,remove}Branch in strings and comments These names have been changed from CamelCase to camelCase, but there were many places (comments mostly) that still used the old names. This change is NFC.	2020-01-21 10:15:38 -06:00
Simon Pilgrim	f04284cf1d	[TargetLowering] SimplifyDemandedBits ISD::SRA multi-use handling Call SimplifyMultipleUseDemandedBits to peek through extended source args with multiple uses	2020-01-21 15:12:07 +00:00
Simon Pilgrim	47f99d2ca8	[SelectionDAG] GetDemandedBits - remove ANY_EXTEND handling Rely on SimplifyMultipleUseDemandedBits fallback instead.	2020-01-21 14:39:00 +00:00
Simon Pilgrim	651fa669a2	[TargetLowering] SimplifyDemandedBits ANY_EXTEND/ANY_EXTEND_VECTOR_INREG multi-use handling Call SimplifyMultipleUseDemandedBits to peek through extended source args with multiple uses	2020-01-21 14:07:19 +00:00
Simon Pilgrim	5f5f478564	[DAG] Fold extract_vector_elt (scalar_to_vector), K to undef (K != 0) This was unconditionally folding this to the source operand, even if the access was out of bounds. Use undef instead if the extract is not the first element. This helps with some cases where 3-vectors are legalized and avoids processing the 4th component. Original Patch by: arsenm (Matt Arsenault) Differential Revision: https://reviews.llvm.org/D51589	2020-01-21 10:58:30 +00:00
Simon Pilgrim	8d2e6bdbe1	[TargetLowering] SimplifyDemandedBits - Pull out InDemandedMask variable to ISD::SHL. NFCI. Matches ISD::SRA + ISD::SRL variants.	2020-01-21 10:40:18 +00:00
Fangrui Song	d232c21566	[AsmPrinter] Don't emit __patchable_function_entries entry if "patchable-function-entry"="0" Also improve tests	2020-01-20 16:13:48 -08:00
Simon Pilgrim	9c06c10fba	[SelectionDAG] GetDemandedBits - fallback to SimplifyMultipleUseDemandedBits by default. First step towards removing SelectionDAG::GetDemandedBits entirely since it is so similar to SimplifyMultipleUseDemandedBits anyhow.	2020-01-20 16:51:52 +00:00
Awanish Pandey	84c4c87e04	Recommit "[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions." Summary: This was reverted in `328e0f3dca` due to chromium bot failure. This revision addresses that case. Original commit message: Summary: This patch will provide support for auto return type for the C++ member functions. Before this return type of the member function is deduced and stored in the DIE. This patch includes llvm side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524	2020-01-20 15:13:13 +05:30
Fangrui Song	eaab1bf21e	[StackColoring] Remap FixedStackPseudoSourceValue frame index referenced by MachineMemOperand StackColoring::remapInstructions() remaps MachineOperand frame index (e.g. %stack.1 -> %stack.0) but does not remap FixedStackPseudoSourceValue frame index (e.g. store 4 into %stack.1.ap2.i.i) referenced by MachineMemoryOperand. This can cause an assertion failure when LiveDebugValues references a dead stack object. It is difficult to craft a test case. -g, va_copy and stack-coloring are required. I can only reproduce it on ppc32.	2020-01-19 22:53:45 -08:00
Fangrui Song	886d2c2ca7	[BranchRelaxation] Simplify offset computation and fix a bug in adjustBlockOffsets() If Start!=0, adjustBlockOffsets() may unnecessarily adjust the offset of Start. There is no correctness issue, but it can create more block splits.	2020-01-19 16:02:16 -08:00
Fangrui Song	9a24488cb6	[CodeGen] Move fentry-insert, xray-instrumentation and patchable-function before addPreEmitPass() The intention is to move patchable-function before aarch64-branch-targets (configured in AArch64PassConfig::addPreEmitPass) so that we emit BTI before NOPs (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424). This also allows addPreEmitPass() passes to know the precise instruction sizes if they want. Tried x86-64 Debug/Release builds of ccls with -fxray-instrument -fxray-instruction-threshold=1. No output difference between this commit and the previous commit.	2020-01-19 00:09:46 -08:00
Fangrui Song	9583a3f262	[AsmPrinter] Delete dead takeDeletedSymbsForFunction() The code added in r98579 is dead now.	2020-01-18 17:08:00 -08:00
Michael Liao	6d0d86a64d	[DAG] Add helper for creating constant vector index with correct type. NFC.	2020-01-18 01:23:36 -05:00
David Blaikie	58b10df54f	DebugInfo: Move SectionLabel tracking into CU's addRange This makes the SectionLabel handling more resilient - specifically for future PROPELLER work which will have more CU ranges (rather than just one per function). Ultimately it might be nice to make this more general/resilient to arbitrary labels (rather than relying on the labels being created for CU ranges & then being reused by ranges, loclists, and possibly other addresses). It's possible that other (non-rnglist/loclist) uses of addresses will need the addresses to be in SectionLabels earlier (eg: move the CU.addRange to be done on function begin, rather than function end, so during function emission they are already populated for other use).	2020-01-17 18:12:34 -08:00
Derek Schuff	ff171acf84	[WebAssembly] Track frame registers through VReg and local allocation This change has 2 components: Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It describes how the Dwarf frame base will be encoded. That can be a register (the default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or a DW_OP_WASM_location descriptor. WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs has run. Make WebAssemblyExplicitLocals store the local it allocates for the frame register. Use this local information to implement getDwarfFrameBase. The result is that the DW_AT_frame_base attribute is correctly encoded for each subprogram, and each param and local variable has a correct DW_AT_location that uses DW_OP_fbreg to refer to the frame base. This is a reland of rG3a05c3969c18 with fixes for the expensive-checks and Windows builds Differential Revision: https://reviews.llvm.org/D71681	2020-01-17 17:23:56 -08:00
Matt Arsenault	a4451d88ee	Consolidate internal denormal flushing controls Currently there are 4 different mechanisms for controlling denormal flushing behavior, and about as many equivalent frontend controls. - AMDGPU uses the fp32-denormals and fp64-f16-denormals subtarget features - NVPTX uses the nvptx-f32ftz attribute - ARM directly uses the denormal-fp-math attribute - Other targets indirectly use denormal-fp-math in one DAGCombine - cl-denorms-are-zero has a corresponding denorms-are-zero attribute AMDGPU wants a distinct control for f32 flushing from f16/f64, and as far as I can tell the same is true for NVPTX (based on the attribute name). Work on consolidating these into the denormal-fp-math attribute, and a new type-specific denormal-fp-math-f32 variant. Only ARM seems to support the two different flush modes, so this is overkill for the other use cases. Ideally we would error on the unsupported positive-zero mode on other targets from somewhere. Move the logic for selecting the flush mode into the compiler driver, instead of handling it in cc1. denormal-fp-math/denormal-fp-math-f32 are now both cc1 flags, but denormal-fp-math-f32 is not yet exposed as a user flag. -cl-denorms-are-zero, -fcuda-flush-denormals-to-zero and -fno-cuda-flush-denormals-to-zero will be mapped to -fp-denormal-math-f32=ieee or preserve-sign rather than the old attributes. Stop emitting the denorms-are-zero attribute for the OpenCL flag. It has no in-tree users. The meaning would also be target dependent, such as the AMDGPU choice to treat this as only meaning allow flushing of f32 and not f16 or f64. The naming is also potentially confusing, since DAZ in other contexts refers to instructions implicitly treating input denormals as zero, not necessarily flushing output denormals to zero. This also does not attempt to change the behavior for the current attribute. The LangRef now states that the default is ieee behavior, but this is inaccurate for the current implementation. The clang handling is slightly hacky to avoid touching the existing denormal-fp-math uses. Fixing this will be left for a future patch. AMDGPU is still using the subtarget feature to control the denormal mode, but the new attributes are now emitted. A future change will switch this and remove the subtarget features.	2020-01-17 20:09:53 -05:00
Reid Kleckner	423e3db6a8	Remove unneeded FoldingSet.h include from Attributes.h Avoids 637 extra FoldingSet.h and Allocator.h includes. FoldingSet.h needs Allocator.h, which is relatively expensive.	2020-01-17 16:36:09 -08:00
Evgenii Stepanov	d081962dea	Merge memtag instructions with adjacent stack slots. Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and adds _untied variants of those pseudos that are allowed to take the base address as a FI operand. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286	2020-01-17 15:19:29 -08:00
Ian Levesque	97ba483026	[xray] Allow instrumenting only function entry and/or only function exit Extend -fxray-instrumentation-bundle to split function-entry and function-exit into two separate options, so that it is possible to instrument only function entry or only function exit. For use cases that only care about one or the other this will save significant overhead and code size. Differential Revision: https://reviews.llvm.org/D72890	2020-01-17 13:32:34 -08:00
Ian Levesque	7628e474a5	[xray] Add xray-ignore-loops option XRay allows tuning by minimum function size, but also always instruments functions with loops in them. If the minimum function size is set to a large value, the loop instrumentation ends up causing most functions to be instrumented anyway. This adds a new flag, xray-ignore-loops, to disable the loop detection logic. Differential Revision: https://reviews.llvm.org/D72659	2020-01-17 13:32:17 -08:00
Adrian Prantl	7b30370e5b	Move the sysroot attribute from DIModule to DICompileUnit [this re-applies `c0176916a4` with the correct commit message and phabricator link] This addresses point 1 of PR44213. https://bugs.llvm.org/show_bug.cgi?id=44213 The DW_AT_LLVM_sysroot attribute is used for Clang module debug info, to allow LLDB to import a Clang module from source. Currently it is part of each DW_TAG_module, however, it is the same for all modules in a compile unit. It is more efficient and less ambiguous to store it once in the DW_TAG_compile_unit. This should have no effect on DWARF consumers other than LLDB. Differential Revision: https://reviews.llvm.org/D71732	2020-01-17 12:55:40 -08:00
Adrian Prantl	c17aee67f1	Revert "Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot" This reverts commit `12e479475a`. I accidentally landed this patch with the wrong commit message ...	2020-01-17 12:52:36 -08:00
Adrian Prantl	12e479475a	Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot This is a purely cosmetic change that is NFC in terms of the binary output. It bugs me that I called the attribute DW_AT_LLVM_isysroot since the "i" is an artifact of GCC command line option syntax (-isysroot is in the category of -i options) and doesn't carry any useful information otherwise. This attribute only appears in Clang module debug info. Differential Revision: https://reviews.llvm.org/D71722	2020-01-17 09:36:48 -08:00
Simon Pilgrim	1dc2f25790	[SelectionDAG] ComputeKnownBits - assert we're computing the 0'th (difference) result for the SUB/SUBC cases Matches what we already do for the ADD/ADDC/ADDE case.	2020-01-17 13:53:57 +00:00
Sam Parker	42350cd893	[ARM][MVE] Tail Predicate IsSafeToRemove Introduce a method to walk through use-def chains to decide whether it's possible to remove a given instruction and its users. These instructions are then stored in a set until the end of the transform when they're erased. This is now used to perform checks on the iteration count (LoopDec chain), element count (VCTP chain) and the possibly redundant iteration count. As well as being able to remove chains of instructions, we now also check that the sub feeding the vctp is producing the expected value. Differential Revision: https://reviews.llvm.org/D71837	2020-01-17 13:19:14 +00:00
Simon Pilgrim	f611158350	[SelectionDAG] Better ISD::ANY_EXTEND/ISD::ANY_EXTEND_VECTOR_INREG ComputeKnownBits support Add DemandedElts handling to ISD::ANY_EXTEND and add missing ISD::ANY_EXTEND_VECTOR_INREG handling. Despite the lack of test changes this code IS being used - it's just that the ANY_EXTEND ops are legalized later on (typically to ZERO_EXTEND equivalents) so we typically manage to combine later on.	2020-01-17 11:37:58 +00:00
Davide Italiano	30a8865142	[FastISel] Lower `llvm.dbg.value(undef, ...` correctly. Summary: Instead of just dropping them. <rdar://problem/58657146> Reviewers: aprantl, vsk, ab, paquette, echristo Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72877	2020-01-16 16:22:20 -08:00
Derek Schuff	80906d9d16	Revert "[WebAssembly] Track frame registers through VReg and local allocation" This reverts commit `3a05c3969c`. It breaks under expensive-checks and on Windows	2020-01-16 14:38:00 -08:00
Derek Schuff	3a05c3969c	[WebAssembly] Track frame registers through VReg and local allocation This change has 2 components: Target-independent: add a method getDwarfFrameBase to TargetFrameLowering. It describes how the Dwarf frame base will be encoded. That can be a register (the default), the CFA (which replaces NVPTX-specific logic in DwarfCompileUnit), or a DW_OP_WASM_location descriptor. WebAssembly: Allow WebAssemblyFunctionInfo::getFrameRegister to return the correct virtual register instead of FP32/SP32 after WebAssemblyReplacePhysRegs has run. Make WebAssemblyExplicitLocals store the local it allocates for the frame register. Use this local information to implement getDwarfFrameBase. The result is that the DW_AT_frame_base attribute is correctly encoded for each subprogram, and each param and local variable has a correct DW_AT_location that uses DW_OP_fbreg to refer to the frame base. Differential Revision: https://reviews.llvm.org/D71681	2020-01-16 13:51:17 -08:00
Matt Arsenault	a66d2817ca	GlobalISel: Don't ignore requested ext narrowing type This was assuming the narrow target was the source type. Respect the requested type when these don't match by using intermediate merges. This avoids producing very wide, illegal shift expansions.	2020-01-16 14:29:37 -05:00
Matt Arsenault	be31a7b7ee	GlobalISel: Move extension scalar narrowing to separate function Also rename a few things. Handling a different requested type will require this to become much more complex.	2020-01-16 14:29:37 -05:00
Craig Topper	61a89e17df	[LegalizeDAG][Mips] Add an assert to protect a uint_to_fp implementation from double rounding. Add a i32->f32 uint_to_fp implementation that avoids this code. The algorithm here only works if the sint_to_fp doesn't do any rounding. Otherwise it can round before the offset fixup is applied. Add an assert to protect this. To avoid breaking the one test in tree that tested this code with a set of types that fail the assert, I've enabled i32->f32 to use the i64->f32 algorithm. This only occurs when f64 isn't a legal type. If f64 is legal then we do i32->f64->f32 instead. Differential Revision: https://reviews.llvm.org/D72794	2020-01-16 11:08:16 -08:00
Matt Arsenault	d0943537e1	GlobalISel: Apply target MMO flags to atomics Unify MMO flag handling with SelectionDAG like with loads and stores.	2020-01-16 13:49:43 -05:00
Matt Arsenault	0d0fce42b0	GlobalISel: Preserve load/store metadata in IRTranslator This was dropping the invariant metadata on dead argument loads, so they weren't deleted. Atomics still need to be fixed the same way. Also, apparently store was never preserving dereferenceable which should also be fixed.	2020-01-16 13:49:43 -05:00
Jay Foad	885260d5d8	[GlobalISel] Don't arbitrarily limit a mask to 64 bits Reviewers: arsenm Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72853	2020-01-16 16:13:20 +00:00
Jay Foad	63f73545dd	[GlobalISel] Pass MachineOperands into MachineIRBuilder helper methods Reviewers: arsenm, aditya_nandakumar, aemerson Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72849	2020-01-16 16:04:21 +00:00
Sam Parker	760b175109	[ARM][LowOverheadLoops] Update liveness info Recommitting `e93e0d413f` after reverting due to test failures, which will hopefully now be fixed. Original commit message: After expanding the pseudo instructions, update the liveness info. We do this in a post-order traversal of the loop, including its exit blocks and preheader(s). Differential Revision: https://reviews.llvm.org/D72131	2020-01-16 15:44:25 +00:00
Jay Foad	28bb43bdf8	[GlobalISel] Use more MachineIRBuilder helper methods Reviewers: arsenm, nhaehnle Subscribers: wdng, rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72833	2020-01-16 15:34:51 +00:00
Jeremy Morse	c969335abd	Revert "[PHIEliminate] Move dbg values after phi and label" Testing compiler-rt, a new assertion failure occurs when building the GwpAsanTestObjects object. I'm uploading a reproducer to D70597. This reverts commit `75188b01e9`.	2020-01-16 14:01:27 +00:00
Chris Ye	75188b01e9	[PHIEliminate] Move dbg values after phi and label If there are DBG_VALUEs between phi and label (after phi and before label), DBG_VALUE will block PHI lowering after the LABEL. Moving all DBG_VALUEs after Labels in the function ScheduleDAGSDNodes::EmitSchedule to avoid impacting PHI lowering. before: PHI DBG_VALUE LABEL after: (move DBG_VALUE after label) PHI LABEL DBG_VALUE then: (phi lowering after label) LABEL COPY DBG_VALUE Fixes the issue: https://bugs.llvm.org/show_bug.cgi?id=43859 Differential Revision: https://reviews.llvm.org/D70597	2020-01-16 11:58:09 +00:00
Craig Topper	5cf1b01a01	[LegalizeDAG][TargetLowering] Move vXi64/i64->vXf32/f32 uint_to_fp legalizing code from TargetLowering::expandUINT_TO_FP back to LegalizeDAG. This was moved in October 2018, but we don't appear to be using this for vectors on any in tree target. Moving it back simplifies D72794 so we can share the code for i32->f32.	2020-01-15 22:04:50 -08:00
Stanislav Mekhanoshin	8b417dd3d6	Process BUNDLE in tail duplication When tail duplication estimates the size of a tail it uses the instruction count. Account for the number of instructions in a bundle too. Differential Revision: https://reviews.llvm.org/D72783	2020-01-15 15:46:57 -08:00
Matt Arsenault	25e9938a45	GlobalISel: Handle more cases of G_SEXT narrowing This now develops the same problem G_ZEXT/G_ANYEXT have where the requested type is assumed to be the source type. This will be fixed separately by creating intermediate merges.	2020-01-15 18:33:15 -05:00
Vedant Kumar	43464509fc	DWARF: Simplify the way the return PC is attached to call site tags, NFC This cleanup was suggested by Djordje in D72489.	2020-01-15 14:16:21 -08:00
Jinsong Ji	c65ac2ba78	[MachineScheduler][NFC] Don't swap when we can't cluster https://reviews.llvm.org/D72706 tried to reduce reordering due to mem op clustering. This patch avoids doing the swap when we can't cluster. Reviewed By: rampitec Differential Revision: https://reviews.llvm.org/D72800	2020-01-15 21:55:31 +00:00
Mircea Trofin	5466597fee	[NFC] Refactor InlineResult for readability Summary: InlineResult is used both in APIs assessing whether a call site is inlinable (e.g. llvm::isInlineViable) as well as in the function inlining utility (llvm::InlineFunction). It means slightly different things (can/should inlining happen, vs did it happen), and the implicit casting may introduce ambiguity (casting from 'false' in InlineFunction will default a message about high costs, which is incorrect here). The change renames the type to a more generic name, and disables implicit constructors. Reviewers: eraman, davidxl Reviewed By: davidxl Subscribers: kerbowa, arsenm, jvesely, nhaehnle, eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72744	2020-01-15 13:34:20 -08:00
Vedant Kumar	f0120556c7	[DWARF] Emit DW_AT_call_return_pc as an address This reverts D53469, which changed llvm's DWARF emission to emit DW_AT_call_return_pc as a function-local offset. Such an encoding is not compatible with post-link block re-ordering tools and isn't standards-compliant. In addition to reverting back to the original DW_AT_call_return_pc encoding, teach lldb how to fix up DW_AT_call_return_pc when the address comes from an object file pointed-to by a debug map. While doing this I noticed that lldb's support for tail calls that cross a DSO/object file boundary wasn't covered, so I added tests for that. This latter case exercises the newly added return PC fixup. The dsymutil changes in this patch were originally included in D49887: the associated test should be sufficient to test DW_AT_call_return_pc encoding purely on the llvm side. Differential Revision: https://reviews.llvm.org/D72489	2020-01-15 13:02:23 -08:00
Matt Arsenault	936483fb7d	GlobalISel: Implement lower for G_BITCAST Bitcast only really applies between scalars and vectors. Implement as an unmerge and remerge. The test needs to tolerate failure since one of the unmerges currently fails to legalize.	2020-01-15 08:58:58 -05:00
Matt Arsenault	91715617ad	GlobalISel: Fix narrowScalar for G_ANYEXT results This is nearly the same as G_ZEXT.	2020-01-15 08:58:57 -05:00
Simon Pilgrim	0b64400e0b	RegisterClassInfo::computePSetLimit - assert that we actually find a register. Fixes "pointer is null" clang static analyzer warning.	2020-01-15 12:18:12 +00:00
David Green	b891490ceb	[Scheduler] Adjust interface of CreateTargetMIHazardRecognizer to use ScheduleDAGMI. NFC All the callers of this function will be ScheduleDAGMI from the MachineScheduler. This allows us to use the extra info available in ScheduleDAGMI without resorting to awkward casts.	2020-01-15 07:21:44 +00:00
Michael Liao	01a4b83154	[codegen,amdgpu] Enhance MIR DIE and re-arrange it for AMDGPU. Summary: - `dead-mi-elimination` assumes MIR in the SSA form and cannot be arranged after phi elimination or DeSSA. It's enhanced to handle the dead register definition by skipping use check on it. Once a register def is `dead`, all its uses, if any, should be `undef`. - Re-arrange the DIE in RA phase for AMDGPU by placing it directly after `detect-dead-lanes`. - Many relevant tests are refined due to different register assignment. Reviewers: rampitec, qcolombet, sunfish Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72709	2020-01-14 19:26:15 -05:00
Michael Liao	8d07f8d98c	[DAGCombine] Replace `getIntPtrConstant()` with `getVectorIdxTy()`. - Prefer `getVectorIdxTy()` as the index operand type for `EXTRACT_SUBVECTOR` as targets expect different types by overloading `getVectorIdxTy()`.	2020-01-14 17:03:05 -05:00
Craig Topper	9ee90ea55c	[LegalizeTypes] Remove untested code from ExpandIntOp_UINT_TO_FP This code is untested in tree because the "APFloat::semanticsPrecision(sem) >= SrcVT.getSizeInBits() - 1" check is false for most combinations for int and fp types except maybe i32 and f64. For that you would need i32 to be an illegal type, but f64 to be legal and have custom handling for legalizing the split sint_to_fp. The precision check itself was added in 2010 to fix a double rounding issue in the algorithm that would occur if the sint_to_fp was not able to do the conversion without rounding. Differential Revision: https://reviews.llvm.org/D72728	2020-01-14 13:15:29 -08:00
Jay Foad	b777e551f0	[MachineScheduler] Reduce reordering due to mem op clustering Summary: Mem op clustering adds a weak edge in the DAG between two loads or stores that should be clustered, but the direction of this edge is pretty arbitrary (it depends on the sort order of MemOpInfo, which represents the operands of a load or store). This often means that two loads or stores will get reordered even if they would naturally have been scheduled together anyway, which leads to test case churn and goes against the scheduler's "do no harm" philosophy. The fix makes sure that the direction of the edge always matches the original code order of the instructions. Reviewers: atrick, MatzeB, arsenm, rampitec, t.p.northover Subscribers: jvesely, wdng, nhaehnle, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72706	2020-01-14 19:19:02 +00:00
diggerlin	eb23cc136b	[AIX][XCOFF] Supporting the ReadOnlyWithRel SectionKind SUMMARY: In this patch we put a global variable in a Csect whose SectionKind is "ReadOnlyWithRel" into the Data Section. Reviewers: hubert.reinterpretcast,jasonliu,Xiangling_L Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D72461	2020-01-14 13:21:49 -05:00
Ulrich Weigand	81ee484484	[FPEnv] Fix chain handling regression after `04a8696` Code in getRoot made the assumption that every node in PendingLoads must always itself have a dependency on the current DAG root node. After the changes in `04a8696`, it turns out that this assumption no longer holds true, causing wrong codegen in some cases (e.g. stores after constrained FP intrinsics might get deleted). To fix this, we now need to make sure that the TokenFactor created by getRoot always includes the previous root, if there is no implicit dependency already present. The original getControlRoot code already has exactly this check, so this patch simply reuses that code now for getRoot as well. This fixes the regression. NFC if no constrained FP intrinsic is present.	2020-01-14 14:10:57 +01:00
Benjamin Kramer	df186507e1	Make helper functions static or move them into anonymous namespaces. NFC.	2020-01-14 14:06:37 +01:00
Simon Pilgrim	31aed2e0da	Fix "MIParser::getIRValue(unsigned int)’ defined but not used" warning. NFCI.	2020-01-14 11:58:54 +00:00
Simon Pilgrim	c05a11108b	[SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SHL handling. As mentioned by @nikic on rGef5debac4302, we can merge the guaranteed bottom zero bits from the shifted value, and then, if a min shift amount is known, zero out the bottom bits as well.	2020-01-14 11:51:41 +00:00
Simon Pilgrim	a43b0065c5	[SelectionDAG] ComputeKnownBits - merge getValidMinimumShiftAmountConstant() and generic ISD::SRL handling. As mentioned by @nikic on rGef5debac4302 (although that was just about SHL), we can merge the guaranteed top zero bits from the shifted value, and then, if a min shift amount is known, zero out the top bits as well. SHL tests / handling will be added in a follow up patch.	2020-01-14 11:41:47 +00:00
Eli Friedman	e68e4cbcc5	[GlobalISel] Change representation of shuffle masks in MachineOperand. We're planning to remove the shufflemask operand from ShuffleVectorInst (D72467); fix GlobalISel so it doesn't depend on that Constant. The change to prelegalizercombiner-shuffle-vector.mir happens because the input contains a literal "-1" in the mask (so the parser/verifier weren't really handling it properly). We now treat it as equivalent to "undef" in all contexts. Differential Revision: https://reviews.llvm.org/D72663	2020-01-13 16:55:41 -08:00
Amy Huang	328e0f3dca	Revert "[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions." This reverts commit `c958639098`, which causes a crash. See https://reviews.llvm.org/D70524 for details.	2020-01-13 13:58:14 -08:00
Craig Topper	26c7a4ed10	[LegalizeIntegerTypes][X86] Add support for expanding input of STRICT_SINT_TO_FP/STRICT_UINT_TO_FP into a libcall. Needed to support i128->fp128 on 32-bit X86. Add full set of strict sint_to_fp/uint_to_fp conversion tests for fp128.	2020-01-13 13:11:12 -08:00
Daniel Sanders	a0f4600f4f	Rework `be15dfa88f` such that it works with GlobalISel which doesn't use EVT Summary: `be15dfa88f` broke GlobalISel's usage of getSetCCInverse() which currently appears to be limited to our out-of-tree backend. GlobalISel doesn't use EVT's and isn't able to derive them from the information it has as it doesn't distinguish between integer and floating point types (that distinction is made by operations rather than values). Bring back the bool version of getSetCCInverse() in a way that doesn't break the intent of `be15dfa88f` but also allows GlobalISel to continue using it. Reviewers: spatel, bogner, arichardson Reviewed By: arichardson Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72309	2020-01-13 12:19:37 -08:00
Puyan Lotfi	484a7472f1	[llvm][MIRVRegNamerUtils] Adding hashing on FrameIndex MachineOperands. This patch makes it so that cases where multiple instructions that differ only in their FrameIndex MachineOperand values no longer collide. For instance: %1:_(p0) = G_FRAME_INDEX %stack.0 %2:_(p0) = G_FRAME_INDEX %stack.1 Prior to this patch these instructions would collide together. Differential Revision: https://reviews.llvm.org/D71583	2020-01-13 13:39:54 -05:00
Simon Pilgrim	c6fcd5d115	[SelectionDAG] ComputeNumSignBits add getValidMaximumShiftAmountConstant() for ISD::SHL support Allows us to handle non-uniform SHL shifts to determine the minimum number of sign bits remaining (based off the maximum shift amount value)	2020-01-13 18:02:37 +00:00
Andrew Wei	05366870ee	[LegalizeTypes] Add SoftenFloatResult support for STRICT_SINT_TO_FP/STRICT_UINT_TO_FP Some soft-float targets such as arm/riscv crash during compilation when the -fno-unsafe-math-optimization option is used. This patch adds the missing strict FP support to SoftenFloatRes_XINT_TO_FP. Differential Revision: https://reviews.llvm.org/D72277	2020-01-14 01:01:56 +08:00
Simon Pilgrim	38e2c01221	[SelectionDAG] ComputeNumSignBits add getValidMinimumShiftAmountConstant() ISD::SRA support Allows us to handle more non-uniform SRA sign bits cases	2020-01-13 16:55:02 +00:00
David Green	90555d9253	[Scheduler] Remove superfluous casts. NFC	2020-01-13 16:34:13 +00:00
Simon Pilgrim	376bc39c82	[SelectionDAG] ComputeNumSignBits - Use getValidShiftAmountConstant for shift opcodes getValidShiftAmountConstant handles out of bounds shift amounts for us, allowing us to remove the local handling.	2020-01-13 14:12:12 +00:00
Simon Pilgrim	6d1a8fd447	[SelectionDAG] ComputeKnownBits - Add DemandedElts support to getValidShiftAmountConstant/getValidMinimumShiftAmountConstant()	2020-01-13 14:12:12 +00:00
Ulrich Weigand	04a86966fb	[FPEnv] Fix chain handling for fpexcept.strict nodes We need to ensure that fpexcept.strict nodes are not optimized away even if the result is unused. To do that, we need to chain them into the block's terminator nodes, like already done for PendingExcepts. This patch adds two new lists of pending chains, PendingConstrainedFP and PendingConstrainedFPStrict to hold constrained FP intrinsic nodes without and with fpexcept.strict markers. This allows not only to solve the above problem, but also to relax chains a bit further by no longer flushing all FP nodes before a store or other memory access. (They are still flushed before nodes with other side effects.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72341	2020-01-13 14:38:49 +01:00
Simon Pilgrim	ef5debac43	[SelectionDAG] ComputeKnownBits add getValidMinimumShiftAmountConstant() ISD::SHL support As mentioned on D72573	2020-01-13 12:02:13 +00:00
Simon Pilgrim	8f49204f26	[SelectionDAG] ComputeKnownBits - minimum leading/trailing zero bits in LSHR/SHL (PR44526) As detailed in https://blog.regehr.org/archives/1709 we don't make use of the known leading/trailing zeros for shifted values in cases where we don't know the shift amount value. This patch adds support to SelectionDAG::ComputeKnownBits to use KnownBits::countMinTrailingZeros and countMinLeadingZeros to set the minimum guaranteed leading/trailing known zero bits. Differential Revision: https://reviews.llvm.org/D72573	2020-01-13 11:08:12 +00:00
Awanish Pandey	c958639098	[DWARF5][DebugInfo]: Added support for DebugInfo generation for auto return type for C++ member functions. Summary: This patch will provide support for auto return type for the C++ member functions. Before this return type of the member function is deduced and stored in the DIE. This patch includes llvm side implementation of this feature. Patch by: Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: dblaikie, aprantl, shafik, alok, SouraVX, jini.susan.george Reviewed by: dblaikie Differential Revision: https://reviews.llvm.org/D70524	2020-01-13 12:26:13 +05:30
Fangrui Song	7fa5290d5b	__patchable_function_entries: don't use linkage field 'unique' with -no-integrated-as .section name, "flags"G, @type, GroupName[, linkage] As of binutils 2.33, linkage cannot be 'unique'. For integrated assembler, we use both 'o' flag and 'unique' linkage to support --gc-sections and COMDAT with lld. https://sourceware.org/ml/binutils/2019-11/msg00266.html	2020-01-12 12:53:44 -08:00
Qiu Chaofan	f33fd43a7c	[NFC] Refactor memory ops cluster method Current implementation of BaseMemOpsClusterMutation is a little bit obscure. This patch directly uses a map from store chain ID to set of memory instrs to make it simpler, so that future improvements are easier to read, update and review. Reviewed By: evandro Differential Revision: https://reviews.llvm.org/D72070	2020-01-12 13:10:04 +08:00
Craig Topper	efb674ac2f	[LegalizeVectorOps] Parallelize the lo/hi part of STRICT_UINT_TO_FLOAT legalization. The lo and hi computation are independent. Give them the same input chain and TokenFactor the results together.	2020-01-11 17:50:30 -08:00
Craig Topper	ed679804d5	[TargetLowering][X86] Connect the chain from STRICT_FSETCC in TargetLowering::expandFP_TO_UINT and X86TargetLowering::FP_TO_INTHelper.	2020-01-11 17:50:20 -08:00
Craig Topper	ddfcd82bdc	[LegalizeVectorOps] Expand vector MERGE_VALUES immediately. Custom legalization can produce MERGE_VALUES to return multiple results. We can expand them immediately instead of leaving them around for DAG combine to clean up.	2020-01-11 17:50:20 -08:00
Craig Topper	5a9954c02a	[LegalizeVectorOps] Remove some of the simpler Expand methods. Pass Results vector to a couple. NFCI Some of the simplest handlers just call TLI and if that fails, they fall back to unrolling. For those just inline the TLI call and share the unrolling call with the default case of Expand. For ExpandFSUB and ExpandBITREVERSE, pass the Results vector so that it's obvious they don't return results sometimes and want to defer to LegalizeDAG.	2020-01-11 12:14:19 -08:00
Craig Topper	9fe6f36c1a	[LegalizeVectorOps] Only pass SDNode* instead of SDValue to all of the Expand* and Promote* methods. All the Expand* and Promote* functions assume they are being called with result 0 anyway. Just hardcode result 0 into them.	2020-01-11 11:41:23 -08:00
Simon Pilgrim	a8ed86b5c7	moveOperands - assert Src/Dst MachineOperands are non-null. Fixes static-analyzer warnings.	2020-01-11 14:37:19 +00:00
Craig Topper	bb2553175a	[TargetLowering][ARM][Mips][WebAssembly] Remove the ordered FP compare from RuntimeLibcalls.def and all associated usages Summary: This always just used the same libcall as unordered, but the comparison predicate was different. This change appears to have been made when targets were given the ability to override the predicates. Before that they were hardcoded into the type legalizer. At that time we never inverted predicates and we handled ugt/ult/uge/ule compares by emitting an unordered check ORed with an ogt/olt/oge/ole check. So only ordered needed an inverted predicate. Later ugt/ult/uge/ule were optimized to only call a single libcall and invert the compare. This patch removes the ordered entries and just uses the inverting logic that is now present. This removes some odd things in both the Mips and WebAssembly code. Reviewers: efriedma, ABataev, uweigand, cameron.mcinally, kpn Reviewed By: efriedma Subscribers: dschuff, sdardis, sbc100, arichardson, jgravelle-google, kristof.beyls, hiraditya, aheejin, sunfish, atanasyan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72536	2020-01-10 19:30:08 -08:00
Stanislav Mekhanoshin	987bf8b6c1	Let targets adjust operand latency of bundles This reverts the AMDGPU DAG mutation implemented in D72487 and gives a more general way of adjusting BUNDLE operand latency. It also replaces FixBundleLatencyMutation with adjustSchedDependency callback in the AMDGPU, fixing not only successor latencies but predecessors' as well. Differential Revision: https://reviews.llvm.org/D72535	2020-01-10 14:56:53 -08:00
Craig Topper	71cee21861	[TargetLowering] Use SelectionDAG::getSetCC and remove a repeated call to getSetCCResultType in softenSetCCOperands. NFCI	2020-01-10 13:24:00 -08:00
Craig Topper	b590e0fd81	[TargetLowering][ARM][X86] Change softenSetCCOperands handling of ONE to avoid spurious exceptions for QNANs with strict FP quiet compares ONE is currently softened to OGT | OLT. But the libcalls for OGT and OLT will trigger an exception for QNAN. At least for X86 with libgcc. UEQ on the other hand uses UO | OEQ. The UO and OEQ libcalls will not trigger an exception for QNAN. This patch changes ONE to use the inverse of the UEQ lowering. So we now produce O & UNE. Technically the existing behavior was correct for a signalling ONE, but since I don't know how to generate one of those from clang that seemed like something we can deal with later as we would need to fix other predicates as well. Also removing spurious exceptions seemed better than missing an exception. There are also problems with quiet OGT/OLT/OLE/OGE, but those are harder to fix. Differential Revision: https://reviews.llvm.org/D72477	2020-01-10 11:00:17 -08:00
Craig Topper	f678fc7660	[LegalizeVectorOps] Improve handling of multi-result operations. This system wasn't very well designed for multi-result nodes. As a consequence they weren't consistently registered in the LegalizedNodes map leading to nodes being revisited for different results. I've removed the "Result" variable from the main LegalizeOp method and used a SDNode* instead. The result number from the incoming Op SDValue is only used for deciding which result to return to the caller. When LegalizeOp is called it should always register a legalized result for all of its results. Future calls for any other result should be pulled from the LegalizedNodes map. Legal nodes will now register all of their results in the map instead of just the one we were called for. The Expand and Promote handling now uses a vector of results similar to LegalizeDAG. Each of the new results is then re-legalized and logged in the LegalizedNodes map for all of the Results for the node being legalized. None of the handlers register their own results now. And none call ReplaceAllUsesOfValueWith now. Custom handling now always passes result number 0 to LowerOperation. This matches what LegalizeDAG does. Since the introduction of STRICT nodes, I've encountered several issues with X86's custom handling being called with an SDValue pointing at the chain and our custom handlers using that to get a VT instead of result 0. This should prevent us from having any more of those issues. On return we will update the LegalizedNodes map for all results so we shouldn't call the custom handler again for each result number. I want to push SDNode* further into the Expand and Promote handlers, but I've left that for a follow-up to keep this patch size down. I've created a dummy SDValue(Node, 0) to keep the handlers working. Differential Revision: https://reviews.llvm.org/D72224	2020-01-10 10:14:58 -08:00
Fangrui Song	4d1e23e3b3	[AArch64] Add function attribute "patchable-function-entry" to add NOPs at function entry The Linux kernel uses -fpatchable-function-entry to implement DYNAMIC_FTRACE_WITH_REGS for arm64 and parisc. GCC 8 implemented -fpatchable-function-entry, which can be seen as a generalized form of -mnop-mcount. The N,M form (function entry points before the Mth NOP) is currently only used by parisc. This patch adds N,0 support to AArch64 codegen. N is represented as the function attribute "patchable-function-entry". We will use a different function attribute for M, if we decide to implement it. The patch reuses the existing patchable-function pass, and TargetOpcode::PATCHABLE_FUNCTION_ENTER which is currently used by XRay. When the integrated assembler is used, __patchable_function_entries will be created for each text section with the SHF_LINK_ORDER flag to prevent --gc-sections (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93197) and COMDAT (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93195) issues. Retrospectively, __patchable_function_entries should use a PC-relative relocation type to avoid the SHF_WRITE flag and dynamic relocations. "patchable-function-entry"'s interaction with Branch Target Identification is still unclear (see https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424 for GCC discussions). Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D72215	2020-01-10 09:55:51 -08:00
Ulrich Weigand	f0fd11df7d	[FPEnv] Invert sense of MIFlag::FPExcept flag In D71841 we inverted the sense of the SDNode-level flag to ensure all nodes default to potentially raising FP exceptions unless otherwise specified -- i.e. if we forget to propagate the flag somewhere, the effect is now only lost performance, not incorrect code. However, the related flag at the MI level still defaults to nodes not raising FP exceptions unless otherwise specified. To be fully on the (conservatively) safe side, we should invert that flag as well. This patch does so by replacing MIFlag::FPExcept with MIFlag::NoFPExcept. (Note that this does also introduce an incompatible change in the MIR format.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D72466	2020-01-10 15:34:50 +01:00
Simon Pilgrim	b2cd273416	Fix Wdocumentation warning. NFCI.	2020-01-10 10:32:37 +00:00
Peng Guo	cfd8498401	[MIR] Fix cyclic dependency of MIR formatter Summary: Move MIR formatter pointer from TargetMachine to TargetInstrInfo to avoid cyclic dependency between target & codegen. Reviewers: dsanders, bkramer, arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D72485	2020-01-10 11:18:12 +01:00
Matt Arsenault	0ea3c7291f	GlobalISel: Handle llvm.read_register Compared to the attempt in `bdcc6d3d26`, this uses intermediate generic instructions.	2020-01-09 17:37:52 -05:00
Matt Arsenault	f33f3d98e9	DAG: Don't use unchecked dyn_cast	2020-01-09 17:37:52 -05:00
Matt Arsenault	ac53a5f1dc	GlobalISel: Fix else after return	2020-01-09 17:37:52 -05:00
Matt Arsenault	255cc5a760	CodeGen: Use LLT instead of EVT in getRegisterByName Only PPC seems to be using it, and only checks some simple cases and doesn't distinguish between FP. Just switch to using LLT to simplify use from GlobalISel.	2020-01-09 17:37:52 -05:00
Matt Arsenault	595ac8c46e	GlobalISel: Move getLLTForMVT/getMVTForLLT As an intermediate step, some TLI functions can be converted to using LLT instead of MVT. Move this somewhere out of GlobalISel so DAG functions can use these.	2020-01-09 16:32:51 -05:00
Matt Arsenault	fba1fbb9c7	GlobalISel: Don't assert on MoreElements creating vectors If the original type was a scalar, it should be valid to add elements to turn it into a vector. Tests included with following legalization change.	2020-01-09 16:29:44 -05:00
Craig Topper	b705fe5686	[TargetLowering][X86] Teach SimplifyDemandedBits to handle cases where only the sign bit is demanded from a SETCC and can be passed through. If we're doing a compare that only tests the sign bit and only the sign bit is demanded, we can just bypass the node. This removes one of the blend dependencies in our v2i64->v2f32 uint_to_fp codegen on pre-sse4.2 targets. Differential Revision: https://reviews.llvm.org/D72356	2020-01-09 10:21:25 -08:00
Sanjay Patel	cb5612e2df	[DAGCombiner] reduce extract subvector of concat If we are extracting a chunk of a vector that's a fraction of an operand of the concatenated vector operand, we can extract directly from one of those original operands. This is another suggestion from PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024#c2 But I'm not sure yet if it will make any difference on those patterns. It seems to help a few existing AVX512 tests though. Differential Revision: https://reviews.llvm.org/D72361	2020-01-09 09:38:12 -05:00
Sam Parker	1cba261239	Revert "[ARM][LowOverheadLoops] Update liveness info" This reverts commit `e93e0d413f`. There are some ordering problems on some of the buildbots which need investigating.	2020-01-09 09:22:06 +00:00
Sam Parker	e93e0d413f	[ARM][LowOverheadLoops] Update liveness info After expanding the pseudo instructions, update the liveness info. We do this in a post-order traversal of the loop, including its exit blocks and preheader(s). Differential Revision: https://reviews.llvm.org/D72131	2020-01-09 08:33:47 +00:00
QingShan Zhang	d48ac7d54d	[DAGCombine] Fold the (fma -x, y, -z) to -(fma x, y, z) This is a positive combination as long as the NEG is NOT free, as we are reducing the number of NEG from two to one. Differential Revision: https://reviews.llvm.org/D72312	2020-01-09 04:33:46 +00:00
Daniel Sanders	de3d0ee023	Revert "Revert "[MIR] Target specific MIR formating and parsing"" There was an unguarded dereference of MF in a function that permitted nullptr. Fixed. This reverts commit `71d64f72f9`.	2020-01-08 20:03:29 -08:00
Nico Weber	71d64f72f9	Revert "[MIR] Target specific MIR formating and parsing" This reverts commit `3ef05d85be`. It broke check-llvm on many bots, see comments on D69836.	2020-01-08 22:50:49 -05:00
Peng Guo	3ef05d85be	[MIR] Target specific MIR formating and parsing Summary: Added MIRFormatter for target-specific MIR formatting and parsing with immediate and custom pseudo source values. A target machine can subclass MIRFormatter and implement custom logic for printing and parsing immediate and custom pseudo source values for better readability. * A target-specific immediate mnemonic needs to start with "." followed by an identifier string. When the MIR parser sees an immediate it will call the target-specific parsing function. * A custom pseudo source value needs to start with custom followed by a double-quoted string. The MIR parser will pass the quoted string to the target-specific PSV parsing function. * MIRFormatter has 2 helper functions to facilitate LLVM value printing and parsing for custom PSVs if they refer to LLVM values. Patch by Peng Guo Reviewers: dsanders, arsenm Reviewed By: dsanders Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69836	2020-01-08 18:48:02 -08:00
Daniel Sanders	5ab6fa7b70	Revert "[MIR] Target specific MIR formating and parsing" Forgot to credit Peng in the commit message. This reverts commit `be841f89d0`.	2020-01-08 18:48:02 -08:00
Peng Guo	be841f89d0	[MIR] Target specific MIR formating and parsing Summary: Added MIRFormatter for target specific MIR formatting and parsing with immediate and custom pseudo source values. A target machine can subclass MIRFormatter and implement custom logic for printing and parsing immediate and custom pseudo source values for better readability. * Target specific immediate mnemonics need to start with "." followed by an identifier string. When the MIR parser sees an immediate, it calls the target specific parsing function. * Custom pseudo source values need to start with custom followed by a double-quoted string. The MIR parser passes the quoted string to the target specific PSV parsing function. * MIRFormatter has 2 helper functions to facilitate LLVM value printing and parsing for custom PSVs if they refer to LLVM values. Reviewers: dsanders, arsenm Reviewed By: dsanders Subscribers: wdng, jvesely, nhaehnle, hiraditya, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69836	2020-01-08 18:34:21 -08:00
Jonas Paulsson	659efa21f1	Recommit "[MachineVerifier] Improve verification of live-in lists." MachineVerifier::visitMachineFunctionAfter() is extended to check the live-through case for live-in lists. This is only done for registers without aliases and that are neither allocatable nor reserved, such as the SystemZ::CC register. The MachineVerifier earlier only caught the case of a live-in use without an entry in the live-in list (as "using an undefined physical register"). A comment in LivePhysRegs.h has been added stating a guarantee that addLiveOuts() can be trusted for a full register both before and after register allocation. Review: Quentin Colombet Differential Revision: https://reviews.llvm.org/D68267	2020-01-08 16:58:54 -08:00
Evgenii Stepanov	58deb20dd2	Revert "Merge memtag instructions with adjacent stack slots." * Bad machine code: Tied use must be a register * - function: stg_alloca17 - basic block: %bb.0 entry (0x20076710580) - instruction: early-clobber %0:gpr64common, early-clobber %1:gpr64sp = STGloop 272, %stack.0.a :: (store 272 into %ir.a, align 16) - operand 3: %stack.0.a http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/21481/steps/test-check-all/logs/stdio This reverts commit `b675a7628c`.	2020-01-08 14:36:12 -08:00
Evgenii Stepanov	b675a7628c	Merge memtag instructions with adjacent stack slots. Summary: Detect a run of memory tagging instructions for adjacent stack frame slots, and replace them with a shorter instruction sequence * replace STG + STG with ST2G * replace STGloop + STGloop with STGloop This code needs to run when stack slot offsets are already known, but before FrameIndex operands in STG instructions are eliminated; that's the reason for the new hook in PrologueEpilogue. This change modifies STGloop and STZGloop pseudos to take the size as an immediate integer operand, and base address as a FI operand when possible. This is needed to simplify recognizing an STGloop instruction as operating on a stack slot post-regalloc. This improves memtag code size by ~0.25%, and it looks like an additional ~0.1% is possible by rearranging the stack frame such that consecutive STG instructions reference adjacent slots (patch pending). Reviewers: pcc, ostannard Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70286	2020-01-08 11:02:03 -08:00
Simon Pilgrim	108279948d	[SelectionDAG] Use llvm::Optional<APInt> for FoldValue. Use llvm::Optional<APInt> instead of std::pair<APInt, bool> with the bool second being used to report success/failure of fold.	2020-01-08 16:09:24 +00:00
Sanjay Patel	780ba1f22b	[DAGCombiner] clean up extract-of-concat fold; NFC This hopes to improve readability and adds an assert. The functional change noted by the TODO comment is proposed in: D72361	2020-01-08 10:15:33 -05:00
Bevin Hansson	8e2b44f7e0	[Intrinsic] Add fixed point division intrinsics. Summary: This patch adds intrinsics and ISelDAG nodes for signed and unsigned fixed-point division: llvm.sdiv.fix.* llvm.udiv.fix.* These intrinsics perform scaled division on two integers or vectors of integers. They are required for the implementation of the Embedded-C fixed-point arithmetic in Clang. Patch by: ebevhan Reviewers: bjope, leonardchan, efriedma, craig.topper Reviewed By: craig.topper Subscribers: Ka-Ka, ilya, hiraditya, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70007	2020-01-08 15:17:46 +01:00
Qiu Chaofan	b2c2fe7219	[NFC] Move InPQueue into arguments of releaseNode This patch moves `InPQueue` into function arguments instead of template arguments of `releaseNode`, which is a cleaner approach. Differential Revision: https://reviews.llvm.org/D72125	2020-01-08 22:15:32 +08:00
Alexey Lapshin	1cf11a4c67	[Dsymutil][Debuginfo][NFC] Reland: Refactor dsymutil to separate DWARF optimizing part. #2 . Summary: This patch relands D71271. The problem with D71271 is that it has a cyclic dependency: CodeGen->AsmPrinter->DebugInfoDWARF->CodeGen. To avoid the cyclic dependency, this patch puts the implementation for DWARFOptimizer into a separate library: lib/DWARFLinker. Thus the difference between this patch and D71271 is that DWARFOptimizer is renamed to DWARFLinker and its files are put into lib/DWARFLinker. Reviewers: JDevlieghere, friss, dblaikie, aprantl Reviewed By: JDevlieghere Subscribers: thegameg, merge_guards_bot, probinson, mgorny, hiraditya, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71839	2020-01-08 14:15:31 +03:00
Wang, Pengfei	9a621de1ec	[X86] Adding fp128 support for strict fcmp Summary: Adding fp128 support for strict fcmp Reviewers: craig.topper, LiuChen3, andrew.w.kaylor, RKSimon, uweigand Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71897	2020-01-08 12:59:31 +08:00
Amara Emerson	b6598bcf4b	[AArch64][GlobalISel] Fold a chain of two G_PTR_ADDs of constant offsets. E.g. %addr1 = G_PTR_ADD %base, G_CONSTANT 20 %addr2 = G_PTR_ADD %addr1, G_CONSTANT 8 --> %addr2 = G_PTR_ADD %base, G_CONSTANT 28 Differential Revision: https://reviews.llvm.org/D72351	2020-01-07 14:12:42 -08:00
Bill Wendling	e886e762dd	Revert "Allow output constraints on "asm goto"" This reverts commit `52366088a8`. I accidentally pushed this before supporting changes.	2020-01-07 13:44:08 -08:00
Bill Wendling	52366088a8	Allow output constraints on "asm goto" Summary: Remove the restrictions that prevent "asm goto" from returning non-void values. The values returned by "asm goto" are only valid on the "fallthrough" path. Reviewers: jyknight, nickdesaulniers, hfinkel Reviewed By: jyknight, nickdesaulniers Subscribers: rsmith, hiraditya, llvm-commits, cfe-commits, craig.topper, rnk Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D69876	2020-01-07 13:40:26 -08:00
Jessica Paquette	acd2580824	[MachineOutliner][AArch64] Save + restore LR in noreturn functions Conservatively always save + restore LR in noreturn functions. These functions do not end in a RET, and so they aren't guaranteed to have an instruction which uses LR in any way. So, as a result, you can end up in unfortunate situations where you can't backtrace out of these functions in a debugger. Remove the old noreturn test, and add a new one which is more descriptive. Remove the restriction that we can't outline from noreturn functions as well since we now do the right thing.	2020-01-07 11:27:25 -08:00
diggerlin	a3832f33d9	[AIX][XCOFF]Implement mergeable const SUMMARY: In this patch, we map mergeable const objects to the read-only section in the same manner as const objects that are not mergeable. Reviewers: hubert.reinterpretcast,jasonliu Subscribers: wuzish, nemanjai, hiraditya Differential Revision: https://reviews.llvm.org/D71551	2020-01-07 11:20:51 -05:00
Sam Parker	3c7f740f28	[TypePromotion] Use SetVectors instead of PtrSets Remove the chance of non-deterministic insertion of zexts of the sources by using a SetVector instead of SmallPtrSet. Do the same for sinks for consistency and to negate the small issue from possibly happening. The SafeWrap instructions are now also stored in a SmallVector. The IRPromoter members of these structures have been changed to references. Differential Revision: https://reviews.llvm.org/D72322	2020-01-07 14:51:54 +00:00
Sanjay Patel	58e2e92a57	[DAGCombiner] reduce shuffle of concat of same vector This is possibly a small part towards solving PR42024: https://bugs.llvm.org/show_bug.cgi?id=42024 The vectorizer is creating shuffles of concat like this: %63 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 1, i32 2, i32 3, i32 0, i32 1, i32 2, i32 3> %64 = shufflevector <8 x i64> %63, <8 x i64> undef, <8 x i32> <i32 0, i32 4, i32 1, i32 5, i32 2, i32 6, i32 3, i32 7> That might be fixable in the vectorizers, but we're not allowed to fold that into a single shuffle in instcombine, so we should have a backend backstop to convert that into the likely simpler form: %64 = shufflevector <4 x i64> %x, <4 x i64> undef, <8 x i32> <i32 0, i32 0, i32 1, i32 1, i32 2, i32 2, i32 3, i32 3> Differential Revision: https://reviews.llvm.org/D72300	2020-01-07 09:48:59 -05:00
Sjoerd Meijer	e34801c8e6	[ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS This is a recommit of D71330, but with a few things fixed and changed: 1) ReachingDefAnalysis: this was not running with optnone as it was checking skipFunction(), which other analysis passes don't do. I guess this is a copy-paste from a codegen pass. 2) VPTBlockPass: here I've added skipFunction(), because like most/all optimisations, we don't want to run this with optnone. This fixes the issues with the initial/previous commit: the VPTBlockPass was running with optnone, but ReachingDefAnalysis wasn't, and so VPTBlockPass was crashing querying ReachingDefAnalysis. I've added test case mve-vpt-block-optnone.mir to check that we don't run VPTBlock with optnone. Differential Revision: https://reviews.llvm.org/D71470	2020-01-07 13:54:47 +00:00
Matt Arsenault	f3de8ab5cc	GlobalISel: Implement lower for G_INTRINSIC_ROUND Mostly copied from AMDGPU lowering implementation, except used G_SITOFP instead of directly creating a select on -1.0, 0.0.	2020-01-06 18:26:42 -05:00
Bill Wendling	83d690a149	Don't rely on 'l'(ell) modifiers to indicate a label reference Summary: It's not necessary to use an 'l'(ell) modifier when referencing a label. Treat block addresses and MBB references as if the modifier is used anyway. This prevents us from generating references to fictitious labels. Reviewers: jyknight, nickdesaulniers, hfinkel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71849	2020-01-06 14:44:03 -08:00
Matt Arsenault	1060b9e23b	GlobalISel: Correct result type for G_FCMP in lowerFPTOUI Using the final result type doesn't make any sense. Use the natural default boolean type for the select condition.	2020-01-06 17:21:51 -05:00
Matt Arsenault	0b093f0212	GlobalISel: Start adding computeNumSignBits to GISelKnownBits	2020-01-06 17:21:51 -05:00
Matt Arsenault	5518a02a83	llc/MIR: Fix setFunctionAttributes for MIR functions A random set of attributes are implemented by llc/opt forcing the string attributes on the IR functions before processing anything. This would not happen for MIR functions, which have not yet been created at this point. Use a callback in the MIR parser, purely to avoid dealing with the ugliness that the command line flags are in a .inc file, and would require allowing access to these flags from multiple places (either from the MIR parser directly, or a new utility pass to implement these flags). It would probably be better to clean up the flag handling into a separate library. This is in preparation for treating more command line flags with a corresponding function attribute in a more uniform way. The fast math flags in particular have a messy system where the command line flag sets the behavior from a function attribute if present, and otherwise the command line flag. This means if any other pass tries to inspect the function attributes directly, it will be inconsistent with the intended behavior. This is also inconsistent with the current behavior of -mcpu and -mattr, which overwrites any pre-existing function attributes. I would like to move this to consistently have the command line flags not overwrite any pre-existing attributes, and to always ensure the command line flags are consistent with the function attributes.	2020-01-06 17:21:51 -05:00
Craig Topper	62f3403bfc	[LegalizeTypes] Add widening support for STRICT_FSETCC/FSETCCS This patch adds widening which really just scalarizes because we don't have a strategy for the extra elements we would need to pad with. Differential Revision: https://reviews.llvm.org/D72193	2020-01-06 13:45:55 -08:00
Simon Pilgrim	ea5abf1453	Fix "use of uninitialized variable" static analyzer warning. NFCI.	2020-01-06 16:36:56 +00:00
Simon Pilgrim	6fa6000e3e	[DAG] DAGCombiner::XformToShuffleWithZero - use APInt::extractBits helper. NFCI.	2020-01-06 13:17:02 +00:00
James Henderson	d68904f957	[NFC] Fix trivial typos in comments Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D72143 Patch by Kazuaki Ishizaki.	2020-01-06 10:50:26 +00:00
Craig Topper	19ace449a3	[TargetLowering] Use SETCC input type to call getBooleanContents instead of the setcc result type. This isn't a functional change since we also check the bit width is the same and the input type is integer. This guarantees the input and output type are the same. But passing the input type makes the code more readable.	2020-01-05 23:15:49 -08:00
QingShan Zhang	b9780f4f80	[DAGCombine] Don't check the legality of type when combine the SIGN_EXTEND_INREG This is the DAG node for SIGN_EXTEND_INREG : t21: v4i32 = sign_extend_inreg t18, ValueType:ch:v4i16 It has two operands. The first one is the value to extend, and the second one is the type that specifies how to extend the value. In this example, it sign-extends t18 (v4i32) from v4i16 to v4i32. That is the semantics of the C code: vector int foo(vector int m) { return m << 16 >> 16; } It could be any vector type for which the hardware supports the operation, even though the type 'v4i16' is NOT legal for the target. When we try to combine the srl + sra, what we do now is call TLI.isOperationLegal(), which also checks the legality of the type. That doesn't make sense. Differential Revision: https://reviews.llvm.org/D70230	2020-01-06 03:00:58 +00:00
Craig Topper	4e37d60f2a	[LegalizeVectorOps][X86] Enable expansion of vector fp_to_uint in LegalizeVectorOps to avoid scalarization. The code here isn't great in all cases. Particularly v4f64->v4i32 on 64-bit AVX targets. But there is some improvement in some configurations. There are definitely some issues with computeNumSignBits with X86ISD::STRICT_FCMP. As well as not being able to propagate sign bits through merge_values nodes that get created during custom legalization.	2020-01-04 19:18:54 -08:00
Craig Topper	16a67d252c	[TargetLowering] In expandFP_TO_UINT, add proper extend or truncate for the condition to feed the DstVT select. Previously, for vectors we created a vselect with a condition that didn't match what the target wanted according to getSetCCResultType. To make up for this, X86 had a special DAG combine to detect if the condition was all sign bits and then insert its own truncate or extend. By adding the extend/truncate here explicitly we can avoid that.	2020-01-04 18:15:20 -08:00
Craig Topper	285d5e6b8b	[LegalizeVectorOps] Split most of ExpandStrictFPOp into a separate UnrollStrictFPOp method. Call that method from ExpandUINT_TO_FLOAT. ExpandStrictFPOp calls ExpandUINT_TO_FLOAT. Previously, ExpandUINT_TO_FLOAT returned SDValue() if it wasn't able to handle the node and needed to unroll. Then ExpandStrictFPOp would detect this SDValue() and do the unroll. After this change, ExpandUINT_TO_FLOAT will directly call UnrollStrictFPOp and return the unrolled result.	2020-01-04 17:03:50 -08:00
Matt Arsenault	d12f2a2998	GlobalISel: Scalarize all division operations This only handled G_SDIV, but they all are trivially scalarizable. Also define placeholder AMDGPU division legalizer rules.	2020-01-04 13:47:10 -05:00
Florian Hahn	b8a3c34eee	Revert "[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC)." This reverts commit `51ef53f3bd`, as it breaks some bots.	2020-01-04 18:44:38 +00:00
Florian Hahn	51ef53f3bd	[SCEV] Move ScalarEvolutionExpander.cpp to Transforms/Utils (NFC). SCEVExpander modifies the underlying function so it is more suitable in Transforms/Utils, rather than Analysis. This allows using other transform utils in SCEVExpander. Reviewers: sanjoy.google, efriedma, reames Reviewed By: sanjoy.google Differential Revision: https://reviews.llvm.org/D71537	2020-01-04 18:29:35 +00:00
Matt Arsenault	1f950ced50	GlobalISel: Define G_READCYCLECOUNTER	2020-01-04 13:10:19 -05:00
Simon Pilgrim	eb0e1978df	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT (REAPPLIED) This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen are to be refactored in the near-term and isn't considered a major issue. Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0. Differential Revision: https://reviews.llvm.org/D65887	2020-01-04 13:15:50 +00:00
Matt Arsenault	21309eafde	GlobalISel: Add type argument to getRegBankFromRegClass AMDGPU can't unambiguously go back from the selected instruction register class to the register bank without knowing if this was used in a boolean context.	2020-01-03 16:25:10 -05:00
Sanjay Patel	ca7fdd41bd	[DAGCombiner] fix miscompile in translating (X & undef) to shuffle See PR42982 for more context: https://bugs.llvm.org/show_bug.cgi?id=42982	2020-01-03 14:58:49 -05:00
Craig Topper	7cdc60c3db	[LegalizeVectorOps] Pass the post-UpdateNodeOperands version of Op to ExpandLoad/ExpandStore UpdateNodeOperands might CSE to another existing node. So we should make sure we're legalizing that node otherwise we might fail to hook up the operands properly. I've moved the result registration up to the caller to avoid having to pass both Result and Op into the functions where it might be confusing which is which. This addresses 2 other issues pointed out in D71861. Differential Revision: https://reviews.llvm.org/D72021	2020-01-03 11:53:08 -08:00
Reid Kleckner	9c2b72821b	Move tail call disabling code to target independent code When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in `d9699bc7bd` (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118	2020-01-03 11:27:41 -08:00
Roman Lebedev	0727e2b90c	[DAGCombiner][X86][AArch64] Generalize `A-(A&B)`->`A&(~B)` fold (PR44448) The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should/can just be 'A - (A & B)' -> 'A & (~B)' Even if we don't manage to fold `~` into B, we have likely formed `ANDN` node. Also, this way there's less similar-but-duplicate folds. Name: X - (X & Y) -> X & (~Y) %o = and i32 %X, %Y %r = sub i32 %X, %o => %n = xor i32 %Y, -1 %r = and i32 %X, %n https://rise4fun.com/Alive/kOUl See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:47 +03:00
Roman Lebedev	86403c0ff8	[DAGCombiner] `~(add X, -1)` -> `neg X` fold The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should just be 'A - (A & B)' -> 'A & (~B)', but we currently fail to sink that '~' into `(B - 1)`. Name: ~(X - 1) -> (0 - X) %o = add i32 %X, -1 %r = xor i32 %o, -1 => %r = sub i32 0, %X https://rise4fun.com/Alive/rjU	2020-01-03 17:55:46 +03:00
Roman Lebedev	3d492d7503	[DAGCombine][X86][Thumb2/LowOverheadLoops] `A - (A & C)` -> `A & (~C)` fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. Name: PR44448 ptr - (ptr & C) -> ptr & (~C) %bias = and i32 %ptr, C %r = sub i32 %ptr, %bias => %r = and i32 %ptr, ~C See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:45 +03:00
Roman Lebedev	1711be78f7	[NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' fold	2020-01-03 17:55:42 +03:00
Jay Foad	8382f87145	Fix typo "psuedo" in comments	2020-01-03 14:05:58 +00:00
Roman Lebedev	8dab0a4a7d	[DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but i'm not sure it is fully considered canonical yet, not everything is fully aware of it likely. https://rise4fun.com/Alive/ZVdp Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment) %mask = add i64 %alignment, -1 %bias = and i64 %ptr, %mask %r = sub i64 %ptr, %bias => %highbitmask = sub i64 0, %alignment %r = and i64 %ptr, %highbitmask See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 13:58:36 +03:00
QingShan Zhang	2133d3c558	[DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG for vector type as 'expand' instead of 'legal' For now, we didn't set the default operation action for SIGN_EXTEND_INREG for vector type, which is 0 by default, that is legal. However, most target didn't have native instructions to support this opcode. It should be set as expand by default, as what we did for ANY_EXTEND_VECTOR_INREG. Differential Revision: https://reviews.llvm.org/D70000	2020-01-03 03:26:41 +00:00
Matt Arsenault	0d9f919b73	DAG: Use TargetConstant for FENCE operands	2020-01-02 17:16:10 -05:00
Fangrui Song	87fb204e8f	[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm	2020-01-02 09:44:23 -08:00
Ulrich Weigand	63336795f0	[FPEnv] Default NoFPExcept SDNodeFlag to false The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization. This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again. In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of original nodes none of which can raise FP exceptions, it will preserve this property on all result nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception. To check whether or not an SD node should be considered as raising an FP exception, the following logic applies: - For machine nodes, check the mayRaiseFPException property of the underlying MI instruction - For regular nodes, check isStrictFPOpcode - For target nodes, check a newly introduced isTargetStrictFPOpcode The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there is a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes -- this could be fixed by adding one more range.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71841	2020-01-02 16:59:45 +01:00
Qiu Chaofan	bdf4224f9c	[NFC] Add explicit instantiation to releaseNode Resolve a build failure about undefined symbols introduced by `f9f78cf`. Differential Revision: https://reviews.llvm.org/D72069	2020-01-02 21:16:22 +08:00
Craig Topper	dac98a2205	[RegisterClassInfo] Use SmallVector::assign instead of resize to make sure we erase previous contents from all entries of the vector. resize only writes to elements that get added. Any elements that already existed maintain their previous value. In this case we're trying to erase cached information so we should use assign which will write to every element. Found while trying to add new tests to an existing X86 test and noticed register allocation changing in other functions.	2020-01-01 18:53:12 -08:00
Lorenzo Casalino	f9f78cf6ac	[MachineScheduler] improve reuse of 'releaseNode'method The 'SchedBoundary::releaseNode' is merely invoked for releasing the Top/Bottom root nodes. However, 'SchedBoundary::releasePending' uses its same logic to check if the Pending queue has any releasable SUnit. It is possible to slightly modify the body of the two, allowing re-use of the former ('releaseNode') in the latter. Patch by Lorenzo Casalino <lorenzo.casalino93@gmail.com> Reviewers: MatzeB, fhahn, atrick Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65506	2020-01-01 20:22:32 +00:00
Mark de Wever	8dc7b982b4	[NFC] Fixes -Wrange-loop-analysis warnings This avoids new warnings due to D68912 adds -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857	2020-01-01 20:01:37 +01:00
Matt Arsenault	4d7201e7b9	DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nsz This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.	2019-12-31 22:49:51 -05:00
Craig Topper	4ae3120ed8	[LegalizeVectorOps][AArch64] Stop asking for v4f16 fp_round and fp_extend to be promoted. These operations are needed as building blocks for promoting so they can't be promoted themselves. This appeared to work because the fp_extend query type for operation actions is the result type, not the input type so it never triggered in the legalizer. For fp_round, the vector op legalizer just ended up creating a nop fp_extend that was elided by getNode, followed by a nop fp_round that was also elided by getNode. This was followed by a final fp_round from v4f32 back to v4f16 which was CSEd to the original node. Then legalize vector ops just believed that node legalized to itself. LegalizeDAG took another crack at promoting it, but didn't have a handler so just skipped it with a debug message saying it wasn't promoted. This patch just removes the operation actions to avoid this nonsense. Found while trying to refactor LegalizeVectorOps to handle multiple result nodes better.	2019-12-31 15:04:12 -08:00
Sam Parker	b409f73e1f	[ARM][TypePromotion] Re-enable by default Re-enable the pass after it was reverted and the bug fixed.	2019-12-31 11:31:06 +00:00
Craig Topper	787e078f3e	[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.	2019-12-30 19:36:04 -08:00
Fangrui Song	03b9f0a5e1	Ignore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor of "frame-pointer" D56351 (included in LLVM 8.0.0) introduced "frame-pointer". All tests which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf" have been migrated to use "frame-pointer". Implement UpgradeFramePointerAttributes to upgrade the two obsoleted function attributes for bitcode. Their semantics are ignored. Differential Revision: https://reviews.llvm.org/D71863	2019-12-30 09:46:19 -08:00
Petar Avramovic	98f72a5107	[MIPS GlobalISel] Select bitreverse. Recommit G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang generates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Recommit notes: Introduce temporary variables in order to make sure instructions get inserted into MachineFunction in same order regardless of compiler used to build llvm. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 18:06:29 +01:00
Matt Arsenault	9fd31fdbd3	GlobalISel: moreElementsVector for FP min/max	2019-12-30 10:39:53 -05:00
Dmitri Gribenko	32cc14100e	Revert "[MIPS GlobalISel] Select bitreverse" This reverts commit `dbc136e0fe`. It broke buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066	2019-12-30 14:29:47 +01:00
Petar Avramovic	dbc136e0fe	[MIPS GlobalISel] Select bitreverse G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang generates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 11:26:45 +01:00
Petar Avramovic	94a24e7a40	[MIPS GlobalISel] Select bswap G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang generates these intrinsics from __builtin_bswap32 and __builtin_bswap64. Add lower and narrowscalar for G_BSWAP. Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later. Differential Revision: https://reviews.llvm.org/D71362	2019-12-30 11:13:22 +01:00
Kai Luo	cd2a73a9f0	[MCP] Add stats for backward copy propagation. NFC.	2019-12-30 16:48:28 +08:00
Fangrui Song	6f9b4c6826	[SelectionDAT] Simplify SelectionDAGBuilder::visitInlineAsm Indirect C_Immediate or C_Other constraints have been excluded. Also simplify an unneeded change to indirect 'X' by D60942.	2019-12-29 20:53:30 -08:00
Fangrui Song	5edb40c022	[SelectionDAG] Disallow indirect "i" constraint This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.	2019-12-29 16:50:42 -08:00
Simon Pilgrim	34769e0783	SimplifyDemandedBits - Remove duplicate getOperand() call. NFC. Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.	2019-12-28 16:42:50 +00:00
Craig Topper	a3f8964813	[TargetLowering] Update comment to reference the correct compiler-rt function the code is based on. NFC	2019-12-27 22:49:04 -08:00
Fangrui Song	044cc919f4	Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removed	2019-12-27 18:09:22 -08:00
Matt Arsenault	3213ce966b	TailDuplication: Clear NoPHIs property The early tail duplicator pass introduces new ones, so a MIR test that infers no phis since there were none on the input would fail the verifier after running.	2019-12-27 14:06:31 -05:00
Fangrui Song	7a7334663c	Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821 Intrinsic has incorrect argument type! i32 (i32) @llvm.setjmp wipes tear	2019-12-27 00:00:14 -08:00
Craig Topper	53ee806d93	[X86][FPEnv] Promote some float strictfp operations to double on i686-pc-windows-msvc to match what we do for non-strict. The float libcalls are inlined in MSVC's math header where they just cast to double and use the double libcall. Do the same when we emit libcalls.	2019-12-26 20:22:24 -08:00
Kristina Bessonova	cdd25a4c74	[DebugInfo][SelectionDAG] Change order while transferring SDDbgValue to another node SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one to another node, but doesn't change its source order. If the destination node has the order greater than the SDDbgValue, there are two possible issues revealed later: * If debug info is attached to an instruction that is the first definition of a register, this ends up with a def-after-use and the debug info gets 'undef' later. * If MIR has another definition of a register above the debug info, the debug info may represent a source variable incorrectly because it appears (significantly) before an instruction corresponded to this debug info. So, the patch changes the order of an SDDbgValue when it is moved to a node with greater order. Reviewers: dblaikie, jmorse, aprantl Reviewed By: aprantl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71175	2019-12-26 21:01:59 +03:00
Wang, Pengfei	472bded3ed	[X86] Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Summary: Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Reviewers: craig.topper, RKSimon, LiuChen3, uweigand, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71871	2019-12-26 08:15:13 +08:00
Matt Arsenault	0d47399167	GlobalISel: Update syntax in debug printing Physical register names now start with $, not %	2019-12-24 10:37:36 -05:00
Matt Arsenault	9b61641564	GlobalISel: Fix naming variables "brank" instead of "bank"	2019-12-24 10:36:54 -05:00
Sam Parker	42dba633a3	[TypePromotion] Make TypeSize a class member Having TypeSize as a static class variable was causing problems with multi-threading. Several static functions have now been converted into methods of TypePromotion and a few other members of TypePromotion and IRPromoter have been added or removed. Differential Revision: https://reviews.llvm.org/D71832	2019-12-24 05:04:35 -05:00
David Blaikie	fccac1ec16	DebugInfo: Correct the form of DW_AT_macro_info in .dwo files (sec_offset, rather than data4)	2019-12-24 01:23:21 -08:00
David Blaikie	83c7a424d9	DebugInfo: Add {} to address -Wdangling-else warning.	2019-12-24 01:14:15 -08:00
Sourabh Singh Tomar	0a72515d33	[DebugInfo] Fix v4 macinfo for dwo files. Dwo files must contain the DW_AT_macro_info attribute when macro information is emitted. Adjusted the test case for the same.	2019-12-24 12:50:34 +05:30
Fangrui Song	e0d855b399	[SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptr CurDAG is referenced more than 2000 times and used in many generated .cpp files. Don't touch it for now.	2019-12-23 22:41:05 -08:00
Fangrui Song	01b98e6fd5	[SelectionDAG] Don't repeatedly add a node to the worklist in ComputeLiveOutVRegInfo. NFC For the sqlite3 amalgamation, this decreases the number of Worklist.push_back calls (603084) by 10%.	2019-12-23 22:04:14 -08:00
Ulrich Weigand	0d3f782e41	[FPEnv][X86] More strict int <-> FP conversion fixes Fix several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, including [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840	2019-12-23 21:11:45 +01:00
Sanjay Patel	8cefc37be5	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815	2019-12-23 10:11:45 -05:00
Martin Storsjö	5a751e747d	[AArch64] [Windows] Use COFF stubs for calls to extern_weak functions As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in `6bf108d77a`. Differential Revision: https://reviews.llvm.org/D71721	2019-12-23 12:13:49 +02:00
Carl Ritson	2791667d2e	[DAGCombiner] Check term use before applying aggressive FSUB optimisations Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656	2019-12-23 09:37:58 +09:00
Valentin Churavy	fb0ccff6e5	[SelectionDAG] Copy FP flags when visiting a binary instruction. Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6. ``` %29 = fmul contract <4 x double> %wide.load, %wide.load16 %30 = fmul contract <4 x double> %wide.load13, %wide.load17 %31 = fmul contract <4 x double> %wide.load14, %wide.load18 %32 = fmul contract <4 x double> %wide.load15, %wide.load19 %33 = fadd fast <4 x double> %vec.phi, %29 %34 = fadd fast <4 x double> %vec.phi10, %30 %35 = fadd fast <4 x double> %vec.phi11, %31 %36 = fadd fast <4 x double> %vec.phi12, %32 ``` Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function attribute, but rather emits more local instruction flags. This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further. Reviewers: spatel, mcberg2017 Reviewed By: spatel Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71495	2019-12-22 14:29:36 -05:00
Reid Kleckner	b2c1ba5b1f	Revert "[ARM][TypePromotion] Enable by default" This reverts commit `ee7579409b`. It causes crashes during ThinLTO. I suspect the issue is related to races on the global TypeSize variable, which is 80 at the time of the crash.	2019-12-22 11:27:11 -08:00
Eric Astor	dc5b614fa9	[ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677	2019-12-22 09:16:34 -05:00
David Blaikie	d0bfb3c583	DebugInfo: Remove out of date comment	2019-12-21 23:13:26 -08:00
Jessica Paquette	d5750770eb	[NFC][MachineOutliner] Rewrite setSuffixIndices to be iterative Having this function be recursive could use up way too much stack space. Rewrite it as an iterative traversal in the tree instead to prevent this. Fixes PR44344.	2019-12-20 16:12:37 -08:00
Vedant Kumar	fa4701e197	[DWARF] Defer creating declaration DIEs until we prepare call site info It isn't necessary to create DIEs for all of the declaration subprograms in a CU's retainedTypes list. We can defer creating these subprograms until we need to prepare a call site tag that refers to one. This cleanup was mentioned in passing in D70350.	2019-12-20 15:26:31 -08:00
Vedant Kumar	79daafc903	Reland: [DWARF] Allow cross-CU references of subprogram definitions This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. Update: Reland with a fix to create a declaration DIE when the declaration is missing from the CU's retainedTypes list. The declaration is left out of the retainedTypes list in two cases: 1) Re-compiling pre-r266445 bitcode (in which declarations weren't added to the retainedTypes list), and 2) Doing LTO function importing (which doesn't update the retainedTypes list). It's possible to handle (1) and (2) by modifying the retainedTypes list (in AutoUpgrade, or in the LTO importing logic resp.), but I don't see an advantage to doing it this way, as it would cause more DWARF to be emitted compared to creating the declaration DIEs lazily. Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a ReleaseLTO-g build of the test suite. rdar://46577651, rdar://57855316, rdar://57840415 Differential Revision: https://reviews.llvm.org/D70350	2019-12-20 15:26:31 -08:00
Yury Delendik	adf7a0a558	[WebAssembly] Use TargetIndex operands in DbgValue to track WebAssembly operands locations Extends DWARF expression language to express locals/globals locations. (via target-index operands atm) (possible variants are: non-virtual registers or address spaces) The WebAssemblyExplicitLocals can replace virtual registers with target-index operands at the time when WebAssembly backend introduces {get,set,tee}_local instead of corresponding virtual registers. Reviewed By: aprantl, dschuff Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D52634	2019-12-20 14:39:05 -08:00
Adrian Prantl	44b4b833ad	Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot This is a purely cosmetic change that is NFC in terms of the binary output. It bugs me that I called the attribute DW_AT_LLVM_isysroot since the "i" is an artifact of GCC command line option syntax (-isysroot is in the category of -i options) and doesn't carry any useful information otherwise. This attribute only appears in Clang module debug info. Differential Revision: https://reviews.llvm.org/D71722	2019-12-20 13:11:17 -08:00
Tom Weaver	453dc4d7ec	[OPT-DBG] Teach DbgEntityHistoryCalculator about meta-instructions. The calculator was considering instructions such as KILLs as clobbers of a physical address. This is wrong as meta instructions such as KILLs produce no output in the final program and thus don't clobber or change any physical location's value. As a result they're safe to ignore whilst calculating location list ranges. reviewers: aprantl, vsk diff revision: https://reviews.llvm.org/D70497 fixes: https://bugs.llvm.org/show_bug.cgi?id=38753	2019-12-20 14:03:34 +00:00
Sam Parker	acbc9aed72	[ARM][MVE] Fixes for tail predication. 1) Fix an issue with the incorrect value being used for the number of elements being passed to [d|w]lstp. We were trying to check that the value was available at LoopStart, but this doesn't consider that the last instruction in the block could also define the register. Two helpers have been added to RDA for this. 2) Insert some code to now try to move the element count def or the insertion point so that we can perform more tail predication. 3) Related to (1), the same off-by-one could prevent us from generating a low-overhead loop when a mov lr could have been the last instruction in the block. 4) Fix up some instruction attributes so that not all the low-overhead loop instructions are labelled as branches and terminators - as this is not true for dls/dlstp. Differential Revision: https://reviews.llvm.org/D71609	2019-12-20 09:34:18 +00:00
Philip Reames	8277c91cf3	[StackMaps] Be explicit about label formation [NFC] (try 2) Recommit after making the same API change in non-x86 targets. This has been built for all targets, and tested for affected ones. Why the difference? Because my disk filled up when I tried make check for all. For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.	2019-12-19 14:05:30 -08:00
Eric Christopher	add710eb23	Temporarily Revert "[StackMaps] Be explicit about label formation [NFC]" as it broke the aarch64 build. This reverts commit `bc7595d934`.	2019-12-19 12:52:40 -08:00
Philip Reames	bc7595d934	[StackMaps] Be explicit about label formation [NFC] For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.	2019-12-19 12:38:44 -08:00
Philip Reames	cf6aafa47c	[FaultMaps] Make label formation a bit more explicit [NFC] This is in advance of assembler padding directives support where we'll need to bundle the label w/the corresponding faulting instruction to avoid padding being inserted between.	2019-12-19 12:38:44 -08:00
Craig Topper	e6e23a24be	[LegalizeDAG] Add return to the strict node handling in PromoteLegalINT_TO_FP to prevent an invalid strict fp node from being created by falling into non-strict code path.	2019-12-19 11:39:50 -08:00
Jay Foad	c5c935ab66	Make more use of MachineInstr::mayLoadOrStore.	2019-12-19 11:51:52 +00:00
Liu, Chen3	2f932b5729	Enable STRICT_FP_TO_SINT/UINT on X86 backend This patch is mainly for custom lowering the vector operation. Differential Revision: https://reviews.llvm.org/D71592	2019-12-19 14:49:13 +08:00
David Blaikie	aaa5a5e7ff	DebugInfo: Include DW_AT_base_addr even in gmlt with no inline functions Since the address pool doesn't get populated in this case (due to the lack of inlining, no child DIEs are added to the CU - so no addresses are needed for the DIEs themselves) until the range list is emitted - at the time the attributes are added to the CU, the address pool is empty. So check whether the address pool will be used for the range lists & add an addr_base if that's the case.	2019-12-18 17:14:28 -08:00
David Blaikie	64fa76ef55	Reapply "NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::List" Move these data structures closer together so their emission code can eventually share more of its implementation. Was an egregious bug (completely untested, evidently) where I hadn't inverted a DWARFv5 test as needed, so it was doing the exact opposite of what was required & thus tried to emit a DWARFv5 range list header in DWARFv4. Reapply `8e04896288` which was reverted in `a8154e5e0c`.	2019-12-18 16:28:19 -08:00
Ulrich Weigand	1946461344	[FPEnv] Strict versions of llvm.minimum/llvm.maximum Add new intrinsics llvm.experimental.constrained.minimum llvm.experimental.constrained.maximum as strict versions of llvm.minimum and llvm.maximum. Includes SystemZ back-end support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71624	2019-12-18 21:35:28 +01:00
Craig Topper	cfe316007f	[SelectionDAGBuilder] Use getConstant instead of getTargetConstant to build the offset for struct types in getUniformBase. getTargetConstant prevents any optimizations from operating on the value and basically says its already been iseled. But since we want the index to be in a register, this isn't true. Prior to this we were generating a vbroadcast with an immediate argument which is illegal and was flagged by the expensive checks bot.	2019-12-18 10:44:28 -08:00
stozer	89d19d60ad	Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel This reverts commit `1f3dd83cc1`, reapplying commit `bb1b0bc4e5`. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.	2019-12-18 16:26:42 +00:00
Daniel Sanders	c3cb089a87	[gicombiner] Import tryCombineIndexedLoadStore() Summary: Now that arbitrary data is supported, import tryCombineIndexedLoadStore() Depends on D69147 Reviewers: bogner, volkan Reviewed By: volkan Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69151	2019-12-18 14:41:38 +00:00
stozer	1f3dd83cc1	Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel" Reverted due to build failure on windows bots. This reverts commit `bb1b0bc4e5`.	2019-12-18 11:46:10 +00:00
stozer	bb1b0bc4e5	[DebugInfo] Correctly handle salvaged casts and split fragments at ISel Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.	2019-12-18 11:09:18 +00:00
Jay Foad	97ca7c2cc9	[AArch64] Enable clustering memory accesses to fixed stack objects Summary: r347747 added support for clustering mem ops with FI base operands including support for fixed stack objects in shouldClusterFI, but apparently this was never tested. This patch fixes shouldClusterFI to work with scaled as well as unscaled load/store instructions, and fixes the ordering of memory ops in MemOpInfo::operator< to ensure that memory addresses always increase, regardless of which direction the stack grows. Subscribers: MatzeB, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71334	2019-12-18 09:46:11 +00:00
Anna Welker	7cd1cfdd6b	[NFC][TTI] Add Alignment for isLegalMasked[Gather/Scatter] Add an extra parameter so alignment can be taken under consideration in gather/scatter legalization. Differential Revision: https://reviews.llvm.org/D71610	2019-12-18 09:14:39 +00:00
Wang, Pengfei	8cc0b58673	[X86] Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Summary: Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Reviewers: craig.topper, c-rhodes, RKSimon Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71442	2019-12-18 12:24:58 +08:00
Craig Topper	c36773c78e	[FPEnv][LegalizeTypes] Make ScalarizeVecOp_STRICT_FP_ROUND do its own replacements and return SDValue() The caller will assert for nodes with more than 2 results unless we return a null SDValue. I tried to test this by copying an AArch64 test for ScalarizeVecOp_FP_ROUND. While it did hit the assert and this commit fixed that, it also hit a later problem that couldn't be fixed without adding strict FP support to AArch64.	2019-12-17 15:17:43 -08:00
Craig Topper	84d8fa30f9	[FPEnv][LegalizeTypes][LegalizeDAG][AArch64] Few fixes/improvements for legalizing fp<->int conversion nodes. This started with adding a test to get code coverage on ScalarizeVecOp_UnaryOp_StrictFP by copying an existing AArch64 test and using constrained sitofp/uitofp intrinsics. This found 3 separate issues: -ScalarizeVecOp_UnaryOp_StrictFP needs to do its own replacement because the caller can't handle replacing multiple results. -Missing integer promotion support for sitofp/uitofp -Chain result not always assigned in ExpandLegalINT_TO_FP. Committing them together so I can add the test case.	2019-12-17 14:37:00 -08:00
Sourabh Singh Tomar	399273e5eb	Recommit "[DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission." This was reverted in `caa4120906`, since it was causing an assertion failure on Windows bots. This revision is revised to fix that. Original commit message - [DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission. Reviewers: dblaikie, aprantl, jini.susan.george Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71008	2019-12-18 02:12:59 +05:30
Sanjay Patel	6a77e36975	[SDAG] adjust isNegatibleForFree calculation to avoid crashing This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch adds a hack to work-around the case where we probably no longer detect that either multiply operand of an FMA isNegatibleForFree which is assumed to be true when we started rewriting the expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-17 13:49:15 -05:00
Sanjay Patel	5b0251da1c	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression()" This reverts commit `36b1232ec5`. Need to adjust commit message - that was a leftover from the earlier version.	2019-12-17 13:47:59 -05:00
Sanjay Patel	36b1232ec5	[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch adds a hack to work-around the case where we probably no longer detect that either multiply operand of an FMA isNegatibleForFree which is assumed to be true when we started rewriting the expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-17 13:46:06 -05:00
Amaury Séchet	ff6567cc77	[DAGCombiner] Add node back in the worklist in topological order in CommitTargetLoweringOpt Summary: Right now, DAGCombiner processes the nodes in an implementation-defined order. This tends to be fragile as optimisation may or may not kick in depending on the traversal order. This is part of a larger effort to get the DAGCombiner to process its nodes in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70921	2019-12-17 18:26:16 +01:00
Mitch Phillips	2423774cc2	Revert "Honor -fuse-init-array when os is not specified on x86" This reverts commit `aa5ee8f244`. This change broke the sanitizer buildbots. See comments at the patchset (https://reviews.llvm.org/D71360) for more information.	2019-12-17 07:36:59 -08:00
Kevin P. Neal	b1d8576b0a	This adds constrained intrinsics for the signed and unsigned conversions of integers to floating point. This includes some of Craig Topper's changes for promotion support from D71130. Differential Revision: https://reviews.llvm.org/D69275	2019-12-17 10:06:51 -05:00
alex-t	e7f585ed61	PostRA Machine Sink should take care of COPY defining register that is a sub-register by another COPY source operand Differential Revision: https://reviews.llvm.org/D71132	2019-12-17 15:20:43 +03:00
Guillaume Chatelet	531c1161b9	Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" Summary: This is a resubmit of D71473. This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients. Functions will be deprecated one by one and as in tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: aaron.ballman, courbet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71547	2019-12-17 10:07:46 +01:00
Raphael Isemann	ccfab8e459	[ObjC][DWARF] Emit DW_AT_APPLE_objc_direct for methods marked as __attribute__((objc_direct)) Summary: With DWARF5 it is no longer possible to distinguish normal methods and methods with `__attribute__((objc_direct))` by just looking at the debug information as they are both now children of the DW_TAG_structure_type that defines them (before only the `__attribute__((objc_direct))` methods were children). This means that in LLDB we are no longer able to create a correct Clang AST of a module by just looking at the debug information. Instead we would need to call the Objective-C runtime to see which of the methods have a `__attribute__((objc_direct))` and then add the attribute to our own Clang AST depending on what the runtime returns. This would mean that we either let the module AST be dependent on the Objective-C runtime (which doesn't seem right) or we retroactively add the missing attribute to the imported AST in our expressions. A third option is to annotate methods with `__attribute__((objc_direct))` as `DW_AT_APPLE_objc_direct` which is what this patch implements. This way LLDB doesn't have to call the runtime for any `__attribute__((objc_direct))` method and the AST in our module will already be correct when we create it. Reviewers: aprantl, SouraVX Reviewed By: aprantl Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71201	2019-12-17 09:40:36 +01:00
Craig Topper	13ce7c1291	[LegalizeTypes] Pre-size the SmallVectors in ScalarizeVecRes_StrictFPOp and SplitVecRes_StrictFPOp so we don't have to call push_back. NFCI This avoids grow checking/handling in each iteration of the loop.	2019-12-16 23:42:13 -08:00
Craig Topper	c738ebc1f5	[LegalizeTypes] Remove ScalarizeVecRes_STRICT_FP_ROUND in favor of just using ScalarizeVecRes_StrictFPOp. NFCI It looks like ScalarizeVecRes_StrictFPOp can handle a variable number of arguments with scalar and vector types so it should be sufficient.	2019-12-16 23:42:13 -08:00
Craig Topper	c4d2bb1ede	[LegalizeTypes] Remove the call to SplitVecRes_UnaryOp from SplitVecRes_StrictFPOp. NFCI It doesn't seem to do anything that SplitVecRes_StrictFPOp can't do. SplitVecRes_StrictFPOp already handles nodes with a variable number of arguments and a mix of scalar and vector arguments.	2019-12-16 23:42:13 -08:00
Craig Topper	4e48513b47	[SelectionDAG] Add the fpexcept flag to the SelectionDAG dumping output so we can better see when its not propagating. We're currently losing this flag in type legalization and probably other places when we expand strict fp nodes. This will make reading logs easier.	2019-12-16 18:05:11 -08:00
Puyan Lotfi	204dfabfe6	[NFC][llvm][MIRVRegNamerUtils] Moving some switch cases and altering comments.	2019-12-16 18:50:26 -05:00
Puyan Lotfi	f63b64c0c3	[llvm][MIRVRegNamerUtils] Adding hashing on CImm / FPImm MachineOperands. This patch makes it so that cases where multiple instructions that differ only in their ConstantInt or ConstantFP MachineOperand values no longer collide. For instance: %0:_(s1) = G_CONSTANT i1 true %1:_(s1) = G_CONSTANT i1 false %2:_(s32) = G_FCONSTANT float 1.0 %3:_(s32) = G_FCONSTANT float 0.0 Prior to this patch the first two instructions would collide together. Also, the last two G_FCONSTANT instructions would also collide. Now they will no longer collide. Differential Revision: https://reviews.llvm.org/D71558	2019-12-16 18:25:04 -05:00
Kamlesh Kumar	aa5ee8f244	Honor -fuse-init-array when os is not specified on x86 Currently -fuse-init-array option is not effective when target triple does not specify os, on x86,x86_64. i.e. // -fuse-init-array is not honored. $ clang -target i386 -fuse-init-array test.c -S // -fuse-init-array is honored. $ clang -target i386-linux -fuse-init-array test.c -S This patch fixes first case. And does cleanup. Reviewers: rnk, craig.topper, fhahn, echristo Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71360	2019-12-16 15:21:23 -08:00
Guillaume Chatelet	4658da10e4	Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" This reverts commit `181ab91efc`.	2019-12-16 15:19:49 +01:00
Guillaume Chatelet	181ab91efc	[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove Summary: This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients. Functions will be deprecated one by one and as in tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71473	2019-12-16 13:35:55 +01:00
Valentin Churavy	5c29e8c65f	[CodegenPrepare] Guard against degenerate branches Summary: Guard against a potential crash observed in https://github.com/JuliaLang/julia/issues/32994#issuecomment-524249628 If two branches are collapsed we can encounter a degenerate conditional branch `TBB==FBB`. The subsequent code assumes that they differ, so we exit out early. Reviewers: ributzka, spatel Subscribers: loladiro, dexonsmith, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66657	2019-12-16 04:23:32 -05:00
Sanjay Patel	2afe864118	[DAG] Add SimplifyDemandedBits support for BSWAP This exposes a shortcoming for AArch64, and that is tracked by PR40881: https://bugs.llvm.org/show_bug.cgi?id=40881 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D58017	2019-12-15 08:52:34 -05:00
Craig Topper	1dc0c8af5e	[LegalizeTypes] Teach BitcastToInt_ATOMIC_SWAP to only create FP16_TO_FP when called from PromoteFloatResult. There's also a call from SoftenFloatResult that should not be promoted. The changed test case would fail with the new RUN line prior to this change.	2019-12-14 15:05:32 -08:00
Craig Topper	95ce8f9498	[LegalizeTypes] In PromoteFloatOp_SETCC, don't bother querying for transforming the result type. The result type is already legal, it doesn't need to be transformed.	2019-12-14 15:05:32 -08:00
Puyan Lotfi	816985c120	[NFC][llvm][MIRVRegNamerUtils] Refactoring GetHashableMO into switch-statement. This refactors the if-statements handling the hashing of various MachineOperand types into a switch-statement. The purpose is to cover all the bases for all MachineOperand types while being very deliberate about which MachineOperand types we are not handling and why (better added comments). This patch is an NFC redo of https://reviews.llvm.org/D71396. Many of the changes present in D71396 will come in smaller follow-up patches that will add support for hashing the MachineOperand types that aren't covered, piecemeal, with tests for each new case.	2019-12-14 02:31:07 -05:00
Roman Tereshin	8731799fc6	[Legalizer] Making artifact combining order-independent Legalization algorithm is complicated by two facts: 1) While regular instructions should be possible to legalize in an isolated, per-instruction, context-free manner, legalization artifacts can only be eliminated in pairs, which could be deeply, and ultimately arbitrarily, nested: { [ () ] }, where each parenthesis kind depicts an artifact kind, like extend, unmerge, etc. Such a structure can only be fully eliminated by simple local combines if they are attempted in a particular order (inside out), or alternatively by repeated scans each eliminating only one innermost pair, resulting in O(n^2) complexity. 2) Some artifacts might in fact be regular instructions that could (and sometimes should) be legalized by the target-specific rules. This means that failure to eliminate all artifacts on the first iteration is not a failure: they need to be tried as instructions, which may produce more artifacts, including the ones that are in fact regular instructions, resulting in a non-constant number of iterations required to finish the process. I trust the recently introduced termination condition (no new artifacts were created during as-a-regular-instruction-retrial of artifacts not eliminated on the previous iteration) to be efficient in providing termination, but only performing the legalization in full if and only if at each step such chains of artifacts are successfully eliminated in full as well. That is currently not guaranteed, as the artifact combines are applied only once and in an arbitrary order that has to do with the order of creation or insertion of artifacts into their worklist, which is no particular order. In this patch I make a small change to the artifact combiner, making it re-insert into the worklist the immediate (modulo looking through copies) artifact users of each vreg that changes its definition due to an artifact combine. 
Here the first scan through the artifacts worklist, while not being done in any guaranteed order, only needs to find the innermost pair(s) of artifacts that could be immediately combined out. After that the process follows def-use chains, making them shorter at each step, thus combining everything that can be combined in O(n) time. Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders Reviewed By: aditya_nandakumar, paquette Tags: #llvm Differential Revision: https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	18bf9670aa	[Legalizer] Refactoring out legalizeMachineFunction and introducing new unittests/CodeGen/GlobalISel/LegalizerTest.cpp relying on it to unit test the entire legalizer algorithm (including the top-level main loop). See also https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	8207c81597	[Legalizer] More detailed debugging printing in main loop	2019-12-13 15:45:18 -08:00
Alex Richardson	11448eeb72	[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD) Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert the files in the individual backends too. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR) to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the number of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71207	2019-12-13 21:40:03 +00:00
Alex Richardson	fc83f53a86	[NFC] Implement SelectionDAG::getObjectPtrOffset() using getMemBasePlusOffset() Summary: This change is preparatory work to use this helper function in more places. In order to make this change, getMemBasePlusOffset() has been extended to also take a SDNodeFlags parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR) to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the number of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71206	2019-12-13 21:40:03 +00:00
Alex Richardson	ea8888d1af	[NFC] Add a SDValue overload for SelectionDAG::getMemBasePlusOffset() Summary: This change is preparatory work to use this helper function in more places. Currently the function only allows integer constant offsets, but there are cases where we can use an existing SDValue parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR) to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the number of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel, craig.topper Reviewed By: spatel, craig.topper Subscribers: craig.topper, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71205	2019-12-13 21:40:03 +00:00
Alex Richardson	d9bb70acd7	[NFC] Change SelectionDAG::getMemBasePlusOffset() to use int64_t Summary: This change is preparatory work to use this helper function in more places. Currently the function only allows positive offsets, but there are cases where we want to subtract an offset from an existing pointer. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR) to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the number of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71204	2019-12-13 21:40:03 +00:00
Sanjay Patel	2f0c7fd2db	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc (2nd try) The initial attempt (rG89633320) botched the logic by reversing the source/dest types. Added x86 tests for additional coverage. The vector tests show a potential improvement (fold vector load instead of broadcasting), but that's a known/existing problem. This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-13 14:03:54 -05:00
Nicola Zaghen	97572775d2	Reland [DataLayout] Fix occurrences where size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit test was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-13 14:30:21 +00:00
Alex Richardson	be15dfa88f	[NFC] Use EVT instead of bool for getSetCCInverse() Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void *)0x12033091e < (void *)0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917	2019-12-13 12:22:03 +00:00
Kerry McLaughlin	4194ca8e5a	Recommit "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores" Updated pred_load patterns added to AArch64SVEInstrInfo.td by this patch to use reg + imm non-temporal loads to fix previous test failures. Original commit message: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above.	2019-12-13 10:08:20 +00:00
David Stenberg	5c7cc6f83d	[LiveDebugValues] Omit entry values for DBG_VALUEs with pre-existing expressions Summary: This is a quickfix for PR44275. An assertion that checks that the DIExpression is valid failed due to attempting to create an entry value for an indirect parameter. This started appearing after D69028, as the indirect parameter started being represented using a DW_OP_deref, rather than with the DBG_VALUE's second operand, meaning that the isIndirectDebugValue() check in LiveDebugValues did not exclude such parameters. A DIExpression that has an entry value operation can currently not have any other operation, leading to the failed isValid() check. This patch simply makes us stop considering emitting entry values for such parameters. To support such cases I think we at least need to make the following changes: * In DIExpression::isValid(): Remove the limitation that a DW_OP_LLVM_entry_value operation can be the only operation in a DIExpression. * In LiveDebugValues::emitEntryValues(): Create an entry value of size 1, so that it only wraps the register operand, and not the whole pre-existing expression (the DW_OP_deref). * In LiveDebugValues::removeEntryValue(): Check that the new debug value has the same debug expression as the original, rather than checking that the debug expression is empty. * In DwarfExpression::addMachineRegExpression(): Modify the logic so that a DW_OP_reg* expression is emitted for the entry value. That is how GCC emits entry values for indirect parameters. That will currently not happen due to the DW_OP_deref causing the !HasComplexExpression check to fail. The LocationKind needs to be changed also, rather than always emitting a DW_OP_stack_value for entry values. There are probably more things I have missed, but that could hopefully be a good starting point for emitting such entry values. 
Reviewers: djtodoro, aprantl, jmorse, vsk Reviewed By: aprantl, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D71416	2019-12-13 10:49:46 +01:00
Craig Topper	5c80a4f454	[LegalizeTypes] Remove unnecessary if before calling ReplaceValueWith on the chain in SoftenFloatRes_LOAD. I believe this is a leftover from when fp128 was softened to fp128 on X86-64. In that case type legalization must have been able to create a load that was the same as N which would make this replacement fail or assert. Since we no longer do that, this check should be unneeded.	2019-12-13 00:14:41 -08:00
Eric Christopher	a8154e5e0c	Temporarily revert "NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::List" as it was causing bot and build failures. This reverts commit `8e04896288`.	2019-12-12 17:55:41 -08:00
David Blaikie	8e04896288	NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::List Move these data structures closer together so their emission code can eventually share more of its implementation.	2019-12-12 16:53:59 -08:00
David Blaikie	20e06a28da	NFC: DebugInfo: Refactor debug_loc/loclist emission into a common function (except for v4 loclists, which are sufficiently different to not fit well in this generic implementation) In subsequent patches I intend to refactor the DebugLoc and ranges data structures to be more similar so I can common more of the implementation here.	2019-12-12 16:39:12 -08:00
Evgenii Stepanov	dabd2622a8	hwasan: add tag_offset DWARF attribute to optimized debug info Summary: Support alloca-referencing dbg.value in hwasan instrumentation. Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in loclist format. Reviewers: pcc Subscribers: srhines, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70753	2019-12-12 16:18:54 -08:00
Sanjay Patel	9432937190	Revert "[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc" This reverts commit `8963332c33`. There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.	2019-12-12 16:24:40 -05:00
Sanjay Patel	8963332c33	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-12 15:44:13 -05:00
Sanjay Patel	b39009bf1d	[DAGCombiner] improve readability This is not quite NFC because I changed the SDLoc to use the more standard 'N' (the starting node for the fold). This transform is a special-case of a more general fold that we do in IR, but it seems like the general fold is needed here too to avoid a potential regression seen in D58017. https://rise4fun.com/Alive/3jZm	2019-12-12 13:16:50 -05:00
stozer	e39e2b4a79	[DebugInfo] Prevent invalid fragments at ISel from dropping debug info During SelectionDAG, if a value which is associated with a DBG_VALUE needs to be split across multiple registers, the DBG_VALUE will be split into a set of fragment expressions to recreate the original value. If one or more of these fragments cannot be created, they would previously be silently dropped, causing the old debug value to live past its expiry date. This patch fixes this issue by keeping invalid fragments while setting their value as Undef. Differential revision: https://reviews.llvm.org/D70248	2019-12-12 12:28:39 +00:00
Nicola Zaghen	f798eb21ec	Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same." This reverts commit `5f6208778f`. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.	2019-12-12 10:29:54 +00:00
Nicola Zaghen	5f6208778f	[DataLayout] Fix occurrences where size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit test was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-12 10:07:01 +00:00
Puyan Lotfi	756db63af9	[NFC][llvm][MIRVRegNamerUtils] Moving methods around. Making some private. Making all externally unused methods private in MIRVRegNamerUtils.h. Moving or deleting a couple other methods around.	2019-12-12 03:32:53 -05:00
Puyan Lotfi	f5b7a46837	[llvm][MIRVRegNamerUtils] Adding hashing on memoperands. No more hash collisions for memoperands. Now the MIRCanonicalization pass shouldn't hit hash collisions when dealing with nearly identical memory accessing instructions when their memoperands are in fact different. Differential Revision: https://reviews.llvm.org/D71328	2019-12-11 22:11:49 -05:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Vedant Kumar	56232f950d	Revert "[DWARF] Allow cross-CU references of subprogram definitions" This reverts commit `30038da15b`. It causes the stage2 thinLTO bot to fail with: Assertion failed: (CU.getDIE(CalleeSP) && "Expected declaration subprogram DIE for callee") rdar://57840415	2019-12-11 15:55:48 -08:00
Sanjay Patel	cdf5cfea8e	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression()" This reverts commit `d1f0bdf2d2`. The patch can cause infinite loops in DAGCombiner.	2019-12-11 16:56:58 -05:00
Craig Topper	4b452952fe	[LegalizeTypes] In SoftenFloatRes_FP_EXTEND, move the check for input already being promoted above the check for fp16 converting to something other than fp32. The fp16 to larger than fp32 case inserts an extend that needs to be re-legalized if fp16 is promoted. But if we check for fp16 promotion first, then we can avoid emitting the fp_extend altogether.	2019-12-11 12:48:08 -08:00
Sanjay Patel	d1f0bdf2d2	[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch skips the use check during the rewrite phase. So we determine that some expression isNegatibleForFree (identically to without this patch), but during the rewrite, don't rely on use counts to decide how to create the optimal expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-11 13:30:39 -05:00
Kerry McLaughlin	c0a3ab3655	Revert "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores" This reverts commit `3f5bf35f86` as it was causing build failures in llvm-clang-x86_64-expensive-checks: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/392 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1045	2019-12-11 13:58:39 +00:00
Kerry McLaughlin	3f5bf35f86	[AArch64][SVE] Implement intrinsics for non-temporal loads & stores Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000	2019-12-11 11:13:51 +00:00
Sjoerd Meijer	d97cf1f889	[ARM][LowOverheadLoops] Remove dead loop update instructions. After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007	2019-12-11 10:20:19 +00:00
Sam Parker	ee7579409b	[ARM][TypePromotion] Enable by default Enable the TypePromotion pass by default (again). This patch was originally committed in `393dacacf7`. This patch was reverted in `a38396939c`. Differential Revision: https://reviews.llvm.org/D70998	2019-12-11 10:00:16 +00:00
shkzhang	1408e7e175	[PowerPC] [CodeGen] Use MachineBranchProbabilityInfo in EarlyIfPredicator to avoid the potential bug Summary: In the function `EarlyIfPredicator::shouldConvertIf()`, we call `TII->isProfitableToIfCvt()` with `BranchProbability::getUnknown()`. This may cause an assertion error in hooks that use `BranchProbability` in `isProfitableToIfCvt()`, for example `SystemZ`. `SystemZ` uses `Probability < BranchProbability(1, 8)` in the function `SystemZInstrInfo::isProfitableToIfCvt()`; if we call this function with `BranchProbability::getUnknown()`, it will cause an assertion error. This patch fixes the potential bug. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D71273	2019-12-11 04:46:00 -05:00
Florian Hahn	11f311875f	[LiveRegUnits] Add phys_regs_and_masks iterator range (NFC). This iterator range just includes physical registers and register masks, which are interesting when dealing with register liveness. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70562	2019-12-11 09:34:42 +00:00
Craig Topper	d4345636e6	[LegalizeTypes] Remove manual worklist management from SoftenFloatRes_FP_EXTEND. I think this is no longer needed. The system should take care of legalizing any new nodes that are added. I think this might have been needed prior to r371709 or r307053.	2019-12-10 22:33:31 -08:00
Nico Weber	caa4120906	Revert "[DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission." This reverts commit `307f60a1a3`. DebugInfo/X86/debug-macinfo-split-dwarf.ll fails on Windows: Command Output (stdout): -- $ ":" "RUN: at line 1" $ "c:\src\llvm-project\out\gn\bin\llc.exe" "-mtriple=x86_64-pc-windows-gnu" "-O0" "-split-dwarf-file=foo.dwo" "-filetype=obj" Assertion failed: Section && "Cannot switch to a null section!", file ../../llvm/lib/MC/MCStreamer.cpp, line 1103 Stack dump: 0. Program arguments: c:\src\llvm-project\out\gn\bin\llc.exe -mtriple=x86_64-pc-windows-gnu -O0 -split-dwarf-file=foo.dwo -filetype=obj	2019-12-10 21:32:30 -05:00
Puyan Lotfi	f364686f34	[llvm][MIRVRegNamerUtil] Adding hashing against MachineInstr flags. Now, flags will result in differing hashes for a given MI. In effect, if you have two instructions with everything identical except for their flags then you should get two different hashes and fewer collisions. Differential Revision: https://reviews.llvm.org/D70479	2019-12-10 20:16:14 -05:00
Wang, Pengfei	21bc8631fe	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86 Summary: This is a follow-up of D69281; it enables X86 backend support for the FP comparison. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582	2019-12-11 08:23:09 +08:00
David Blaikie	4ffd3f44e3	DebugInfo: Clarify some more reasons v4 loc.dwo can't share much implementation with loclists.dwo	2019-12-10 14:11:03 -08:00
Vedant Kumar	30038da15b	[DWARF] Allow cross-CU references of subprogram definitions This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. rdar://46577651 Differential Revision: https://reviews.llvm.org/D70350	2019-12-10 14:00:57 -08:00
Sourabh Singh Tomar	307f60a1a3	[DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission. Reviewers: dblaikie, aprantl, jini.susan.george Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71008	2019-12-11 02:19:27 +05:30
Sourabh Singh Tomar	fb4d8fe1a8	Recommit "[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified." Reviewers: dblaikie, aprantl, probinson Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71185	2019-12-11 01:24:50 +05:30
Sourabh Singh Tomar	d82b6ba21b	Revert "[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified." This reverts commit `6ef01588f4`. Missing Differential revision.	2019-12-11 01:20:40 +05:30
Sourabh Singh Tomar	6ef01588f4	[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified.	2019-12-11 01:18:02 +05:30
Hans Wennborg	49da20ddb4	Revert `30e8f80fd5` "[DebugInfo] Don't create multiple DBG_VALUEs when sinking" This caused non-determinism in the compiler, see command on the Phabricator code review. > This patch addresses a performance problem reported in PR43855, and > present in the reapplication in 001574938e5. It turns out that > MachineSink will (often) move instructions to the first block that > post-dominates the current block, and then try to sink further. This > means if we have a lot of conditionals, we can needlessly create large > numbers of DBG_VALUEs, one in each block the sunk instruction passes > through. > > To fix this, rather than immediately sinking DBG_VALUEs, record them in > a pass structure. When sinking is complete and instructions won't be > sunk any further, new DBG_VALUEs are added, avoiding lots of > intermediate DBG_VALUE $noregs being created. > > Differential revision: https://reviews.llvm.org/D70676	2019-12-10 19:20:11 +01:00
Sam Parker	933de40729	[TypePromotion] Query target register width TargetLoweringInfo may report that an integer should be promoted, but it may provide a size that isn't natively supported by the target register file... So check this before trying to perform a promotion. This is to fix some chromium issues: https://bugs.chromium.org/p/chromium/issues/detail?id=1031978 https://bugs.chromium.org/p/chromium/issues/detail?id=1031979 Differential Revision: https://reviews.llvm.org/D71200	2019-12-10 13:23:00 +00:00
Kiran Chandramohan	965ed1e974	[AArch64] Fix issues with large arrays on stack Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496	2019-12-10 11:44:41 +00:00
Mikael Holmen	4763267eee	[LegalizeTypes] Bugfixes for big-endian targets when handling BITCASTs Summary: This fixes PR44135. The special case when we promote a bitcast from a vector to an int needs special handling when we are on a big-endian target. Prior to this fix, for the added vec_to_int we see the following in the SelectionDAG printouts Type-legalized selection DAG: %bb.1 'foo:bb.1' SelectionDAG has 9 nodes: t0: ch = EntryToken t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0 t17: v4i32 = bitcast t2 t23: i32 = extract_vector_elt t17, Constant:i32<3> t8: ch,glue = CopyToReg t0, Register:i32 $r0, t23 t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1 and I think here the extract_vector_elt is wrong and extracts the value from the wrong index. The program should return the 32 bits made up of the elements at index 4 and 5 in the vec6 array, but with t23: i32 = extract_vector_elt t17, Constant:i32<3> as far as I can tell, we will extract values that originally didn't even exist in the vec6 vector. If we would instead extract the element at index 2 we would get the wanted values. With this fix we insert a right shift after the bitcast in DAGTypeLegalizer::PromoteIntRes_BITCAST which then gives us Type-legalized selection DAG: %bb.1 'vec_to_int:bb.1' SelectionDAG has 9 nodes: t0: ch = EntryToken t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0 t23: v4i32 = bitcast t2 t27: i32 = extract_vector_elt t23, Constant:i32<2> t8: ch,glue = CopyToReg t0, Register:i32 $r0, t27 t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1 So now we get t27: i32 = extract_vector_elt t23, Constant:i32<2> which is what we want. Similarly, the new int_to_vec testcase exposes a bug where we cast the other direction. Then we instead need to add a left shift before the bitcast on big-endian targets for the bits in the input integer to end up at the expected place in the vector.
Reviewers: bogner, spatel, craig.topper, t.p.northover, dmgreen, efriedma, SjoerdMeijer, samparker Reviewed By: efriedma Subscribers: eli.friedman, bjope, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70942	2019-12-10 11:22:35 +01:00
Puyan Lotfi	479e3b85e2	[NFCi][llvm][MIRVRegNamerUtils] Making some code cleanup and stylistic changes. Making some changes to MIRVRegNamerUtils.cpp to use some more modern c++ features as well as some changes to generally make the code more concise and more understandable. I make this an NFCi because in one case I drop the whole "if (!MO->isDef()) MO->setIsKill(false);" thing that was added in the original implementation, generally because I don't think this is really semantically sound. I also changed up the implementation of VRegRenamer::createVirtualRegisterWithLowerName somewhat because I am now lower-casing the name unconditionally because I confirmed that that was in fact aditya_nandakumar@apple.com's intent. In all other cases, behavior should not be changed. Differential Revision: https://reviews.llvm.org/D71182	2019-12-09 23:35:27 -05:00
Fangrui Song	9574757dba	[MC] Delete MCCodePadder D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106	2019-12-09 19:21:31 -08:00
QingShan Zhang	05b0c76aa7	[NFC][MacroFusion] Adding the assertion if someone wants to fuse more than 2 instructions As discussed in https://reviews.llvm.org/D69998, we fail to create some dependency edges when more than 2 instructions are chained. Adding an assertion here in case someone wants to chain more than 2 instructions. Differential Revision: https://reviews.llvm.org/D71180	2019-12-10 03:10:21 +00:00
Hiroshi Yamauchi	d9ae493937	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149	2019-12-09 12:42:59 -08:00
Thomas Raoux	caabb713ea	[ModuloSchedule] Fix data types in ModuloScheduleExpander::isLoopCarried The cycle values in modulo scheduling results can be negative. The result of ModuloSchedule::getCycle() must be received as an int type. Patch by Masaki Arai! Differential Revision: https://reviews.llvm.org/D71122	2019-12-09 07:37:00 -08:00
Jeremy Morse	00e238896c	[DebugInfo] Nerf placeDbgValues, with prejudice CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source program's order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453	2019-12-09 12:52:10 +00:00
David Stenberg	6965f835b4	[DebugInfo] Make describeLoadedValue() reg aware Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. 
Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431	2019-12-09 10:47:49 +01:00
David Stenberg	f3696533f2	Revert "[DebugInfo] Make describeLoadedValue() reg aware" This reverts commit `3cd93a4efc`. I'll recommit with a well-formatted arcanist commit message.	2019-12-09 10:45:13 +01:00
David Stenberg	3cd93a4efc	[DebugInfo] Make describeLoadedValue() reg aware Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.	2019-12-09 10:44:17 +01:00
Hans Wennborg	a38396939c	Revert `393dacacf7` "[ARM] Enable TypePromotion by default" This caused "Too many bits for uint64_t" asserts when building Chromium. See https://crbug.com/1031978#c2 for a reproducer. I'll follow up on the llvm-commits thread with a creduced version. > ARMCodeGenPrepare has already been generalized and renamed to > TypePromotion. We've had it enabled and tested downstream for a > while, so enable it by default. > > Differential Revision: https://reviews.llvm.org/D70998	2019-12-09 09:39:31 +01:00
rollrat	9fdb7ac503	[NFC][LivePhysRegs] Fix incorrect comment Reviewers: #llvm, tellenbach Reviewed By: tellenbach Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71051 Patch by: rollrat <rollrat.cse@gmail.com>	2019-12-08 21:07:28 +01:00
Ulrich Weigand	9db13b5a7d	[FPEnv] Constrained FCmp intrinsics This adds support for constrained floating-point comparison intrinsics. Specifically, we add: declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN). The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates). These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes. The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instructions to fully implement strict semantics (scalar and vector). Differential Revision: https://reviews.llvm.org/D69281	2019-12-07 11:28:39 +01:00
Craig Topper	28b573d249	[TargetLowering] Fix another potential FPE in expandFP_TO_UINT D53794 introduced code to perform the FP_TO_UINT expansion via FP_TO_SINT in a way that would never expose floating-point exceptions in the intermediate steps. Unfortunately, I just noticed there is still a way this can happen. As discussed in D53794, the compiler now generates this sequence: // Sel = Src < 0x8000000000000000 // Val = select Sel, Src, Src - 0x8000000000000000 // Ofs = select Sel, 0, 0x8000000000000000 // Result = fp_to_sint(Val) ^ Ofs The problem is with the Src - 0x8000000000000000 expression. As I mentioned in the original review, that expression can never overflow or underflow if the original value is in range for FP_TO_UINT. But I missed that we can get an Inexact exception in the case where Src is a very small positive value. (In this case the result of the sub is ignored, but that doesn't help.) Instead, I'd suggest to use the following sequence: // Sel = Src < 0x8000000000000000 // FltOfs = select Sel, 0, 0x8000000000000000 // IntOfs = select Sel, 0, 0x8000000000000000 // Result = fp_to_sint(Val - FltOfs) ^ IntOfs In the case where the value is already in range of FP_TO_SINT, we now simply compute Val - 0, which now definitely cannot trap (unless Val is a NaN in which case we'd want to trap anyway). In the case where the value is not in range of FP_TO_SINT, but still in range of FP_TO_UINT, the sub can never be inexact, as Val is between 2^(n-1) and (2^n)-1, i.e. always has the 2^(n-1) bit set, and the sub is always simply clearing that bit. There is a slight complication in the case where Val is a constant, so we know at compile time whether Sel is true or false. In that scenario, the old code would automatically optimize the sub away, while this no longer happens with the new code. Instead, I've added extra code to check for this case and then just fall back to FP_TO_SINT directly. (This seems to catch even slightly more cases.) 
Original version of the patch by Ulrich Weigand. X86 changes added by Craig Topper Differential Revision: https://reviews.llvm.org/D67105	2019-12-06 14:11:04 -08:00
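The revised sequence from the commit message above can be sketched in plain C++ for the 64-bit case. This is only an illustration under stated assumptions, not the SelectionDAG code from the patch: `fpToUintViaSint` is a hypothetical name, the source type is assumed to be `double`, and the input is assumed to be in range for an unsigned 64-bit conversion.

```cpp
#include <cstdint>

// Hypothetical helper (not an LLVM API): converts a double that is in range
// for uint64_t by going through a signed conversion, following the
// Sel / FltOfs / IntOfs sequence quoted in the commit message.
uint64_t fpToUintViaSint(double Src) {
  const double TwoPow63 = 9223372036854775808.0;      // 0x8000000000000000 as a double
  const uint64_t SignBit = 0x8000000000000000ULL;

  bool Sel = Src < TwoPow63;                           // already in signed range?
  double FltOfs = Sel ? 0.0 : TwoPow63;                // offset subtracted before converting
  uint64_t IntOfs = Sel ? 0 : SignBit;                 // offset restored after converting

  // When Sel is true this computes Src - 0, which cannot raise Inexact.
  // Otherwise the subtraction only clears the 2^63 bit and is exact.
  int64_t Signed = static_cast<int64_t>(Src - FltOfs);
  return static_cast<uint64_t>(Signed) ^ IntOfs;
}
```

Compared with the older sequence, the subtraction is never performed with a non-zero offset on a small positive input, which is the case the commit message identifies as raising Inexact.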
Hiroshi Yamauchi	2eb30fafa5	Revert "[PGO][PGSO] Instrument the code gen / target passes." This reverts commit `9a0b5e1407`. This seems to break buildbots.	2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi	9a0b5e1407	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072	2019-12-06 10:43:39 -08:00
Guozhi Wei	72942459d0	[MBP] Avoid tail duplication if it can't bring benefit Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases that duplication doesn't increase fallthrough, or it brings too much size overhead. To overcome these two issues in function canTailDuplicateUnplacedPreds I add two checks: (1) make sure there is at least one duplication in current work set; (2) the number of duplications should not exceed the number of successors. The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain. Differential Revision: https://reviews.llvm.org/D64376	2019-12-06 09:53:53 -08:00
John Brawn	984f1bb3e7	[LegalizeTypes] Add missing case for STRICT_FP_ROUND softening This fixes a test failure in test/CodeGen/ARM/fp-intrinsics.ll.	2019-12-06 15:54:27 +00:00
Jeremy Morse	c93a9b15ce	[DebugInfo][CGP] Update dbg.values when sinking address computations One of CodeGenPrepare's optimizations is to duplicate address calculations into basic blocks, so that as much information as possible can be folded into memory addressing operands. This is great -- but the dbg.value variable location intrinsics are not updated in the same way. This can lead to dbg.values referring to address computations in other blocks that will never be encoded into the DAG, while duplicate address computations are performed locally that could be used by the dbg.value. Some of these (such as non-constant-offset GEPs) can't be salvaged past. Fix this by, whenever we duplicate an address computation into a block, looking for dbg.value users of the original memory address in the same block, and redirecting those to the local computation. Differential Revision: https://reviews.llvm.org/D58403	2019-12-06 11:27:19 +00:00
Ulrich Weigand	daee549b17	[FPEnv][SelectionDAG] Relax chain requirements This patch implements the following changes: 1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats each constrained intrinsic like a global barrier (e.g. a function call) and fully serializes all pending chains. This is actually not required; it is allowed for constrained intrinsics to be reordered w.r.t one another or (nonvolatile) memory accesses. The MI-level scheduler already allows for that flexibility, so it makes sense to allow it at the DAG level as well. This patch therefore changes the way chains for constrained intrinsics are created, and handles them basically like load operations are handled. This has the effect that constrained intrinsics are no longer serialized against one another or (nonvolatile) loads. They are still serialized against stores, but that seems hard to change with the current DAG chain setup, and it also doesn't seem to be a big problem preventing DAG 2) The OPC_CheckFoldableChainNode check requires that each of the intermediate nodes in a multi-node pattern match only has a single use. This check tends to fail if those intermediate nodes are strict operations as those have a chain output that typically indeed has another use. However, we don't really need to consider chains here at all, since they will all be rewritten anyway by UpdateChains later. Other parts of the matcher therefore already ignore chains, but this hasOneUse check doesn't. This patch replaces hasOneUse by a custom test that verifies there is no more than one use of any non-chain output value. In theory, this change could affect code unrelated to strict FP nodes, but at least on SystemZ I could not find any single instance of that happening 3) The SystemZ back-end currently does not allow matching multiply-and- extend operations (32x32 -> 64bit or 64x64 -> 128bit FP multiply) for strict FP operations.
This was not possible in the past due to the problems described under 1) and 2) above. With those issues fixed, it is now possible to fully support those instructions in strict mode as well, and this patch does so. Differential Revision: https://reviews.llvm.org/D70913	2019-12-06 11:02:11 +01:00
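The replacement for `hasOneUse` described in point 2) above can be illustrated with a toy model. This is a sketch under assumptions, not the matcher code from the patch: `ToyUse` and `hasAtMostOneNonChainUse` are hypothetical names, and the reading that non-chain uses are counted across all of a node's outputs is an interpretation of the commit message.

```cpp
#include <vector>

// Hypothetical stand-in for one use of a node's output value.
struct ToyUse {
  bool IsChain; // true if the used output is the chain result
};

// Returns true if at most one use refers to a non-chain output.
// Chain uses are ignored entirely, on the reasoning from the commit message
// that chains are rewritten later anyway.
bool hasAtMostOneNonChainUse(const std::vector<ToyUse> &Uses) {
  unsigned NonChainUses = 0;
  for (const ToyUse &U : Uses) {
    if (U.IsChain)
      continue;
    if (++NonChainUses > 1)
      return false;
  }
  return true;
}
```

Under this model, a strict FP node whose value has one user and whose chain has another still counts as foldable, whereas a plain use count would reject it.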
Alexey Lapshin	9e8c799e2b	[Dsymutil][NFC] Move NonRelocatableStringpool into common CodeGen folder. That refactoring moves NonRelocatableStringpool into common CodeGen folder. So that NonRelocatableStringpool could be used not only inside dsymutil. Differential Revision: https://reviews.llvm.org/D71068	2019-12-06 10:02:27 +03:00
David Blaikie	560ab1f8d3	DebugInfo: Pull out a common expression. This is for the case where -gmlt -gsplit-dwarf -fsplit-dwarf-inlining are used together in some but not all units during LTO (or, in the reduced case, even without LTO) - ensuring that no split dwarf is used (because split-dwarf-inlining puts the same data in the .o file, so there's no need to duplicate it into the .dwo file)	2019-12-05 19:51:30 -08:00
Quentin Colombet	2ec71ea7c7	[RegisterCoalescer] Fix the creation of subranges when rematerialization is used * Context * During register coalescing, we use rematerialization when coalescing is not possible. That means we may rematerialize a super register when only a smaller register is actually used. E.g., 0B v1 = ldimm 0xFF 1B v2 = COPY v1.low8bits 2B = v2 => 0B v1 = ldimm 0xFF 1B v2 = ldimm 0xFF 2B = v2.low8bits Where xB are the slot indexes. Here v2 grew from a 8-bit register to a 16-bit register. When that happens and subregister liveness is enabled, we create subranges for the newly created value. E.g., before remat, the live range of v2 looked like: main range: [1r, 2r) (Reads v2 is defined at index 1 slot register and used before the slot register of index 2) After remat, it should look like: main range: [1r, 2r) low 8 bits: [1r, 2r) high 8 bits: [1r, 1d) <-- dead def I.e., the unused lanes of v2 should be marked as dead definition. * The Problem * Prior to this patch, the live-ranges from the previous example would have the full live-range for all subranges: main range: [1r, 2r) low 8 bits: [1r, 2r) high 8 bits: [1r, 2r) <-- too long * The Fix * Technically, the code that this patch changes is not wrong: When we create the subranges for the newly rematerialized value, we create only one subrange for the whole bit mask. In other words, at this point v2 live-range looks like this: main range: [1r, 2r) low & high: [1r, 2r) Then, it gets wrong when we call LiveInterval::refineSubRanges on low 8 bits: main range: [1r, 2r) low 8 bits: [1r, 2r) high 8 bits: [1r, 2r) <-- too long Ideally, we would like LiveInterval::refineSubRanges to be able to do the right thing and mark the dead lanes as such. However, this is not possible, because by the time we update / refine the live ranges, the IR hasn't been updated yet, therefore we actually don't have enough information to do the right thing.
Another option to fix the problem would have been to call LiveIntervals::shrinkToUses after the IR is updated. This is not desirable as this may have a noticeable impact on compile time. Instead, what this patch does is when we create the subranges for the rematerialized value, we explicitly create one subrange for the lanes that were used before rematerialization and one for the lanes that were not used. The used one inherits the live range of the main range and the unused one is just created empty. The existing rematerialization code then detects that the unused ones are not live and it correctly sets dead def intervals for them. https://llvm.org/PR41372	2019-12-05 16:32:30 -08:00
David Blaikie	decee04e63	DebugInfo: Fix LTO+DWARFv5 loclists The loclists_table_base was being overwritten for each CU even though only one loclists contribution is made so everything but the last CU would have a label that was never defined and fail to assemble.	2019-12-05 12:47:54 -08:00
Volkan Keles	bfa3d260b8	[GlobalISel] Localizer: Allow targets not to run the pass conditionally Summary: Previously, it was not possible to skip running the localizer pass conditionally. This patch adds an input function to the pass which decides if the pass should run on the given MachineFunction or not. No test case as no upstream target needs this functionality. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71038	2019-12-05 11:09:50 -08:00
Jeremy Morse	30e8f80fd5	[DebugInfo] Don't create multiple DBG_VALUEs when sinking This patch addresses a performance problem reported in PR43855, and present in the reapplication in 001574938e5. It turns out that MachineSink will (often) move instructions to the first block that post-dominates the current block, and then try to sink further. This means if we have a lot of conditionals, we can needlessly create large numbers of DBG_VALUEs, one in each block the sunk instruction passes through. To fix this, rather than immediately sinking DBG_VALUEs, record them in a pass structure. When sinking is complete and instructions won't be sunk any further, new DBG_VALUEs are added, avoiding lots of intermediate DBG_VALUE $noregs being created. Differential revision: https://reviews.llvm.org/D70676	2019-12-05 15:52:20 +00:00
Jeremy Morse	e4cdd62631	[DebugInfo] Don't reorder DBG_VALUEs when sunk Fix part of PR43855, resolving a problem that comes from the reapplication in 001574938e5. If we have two DBG_VALUE insts in a block that specify the location of the same variable, for example: %0 = someinst DBG_VALUE %0, !123, !DIExpression() %1 = anotherinst DBG_VALUE %1, !123, !DIExpression() if %0 were to sink, the corresponding DBG_VALUE would sink too, past the next DBG_VALUE, effectively re-ordering assignments. To fix this, I've added a SeenDbgVars set recording what variable locations have been seen in a block already (working bottom up), and now flag DBG_VALUEs that would pass a later DBG_VALUE for the same variable. NB, this only works for repeated DBG_VALUEs in the same basic block, the general case involving control flow is much harder, which I've written up in PR44117. Differential revision: https://reviews.llvm.org/D70672	2019-12-05 15:52:20 +00:00
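The bottom-up `SeenDbgVars` idea from the commit message above can be sketched on a toy instruction list. This is a simplified illustration, not the MachineSink implementation: `ToyInst` and `flagDbgValuesWithLaterAssignment` are hypothetical names, and variables are identified by plain strings instead of debug-info metadata.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical toy instruction: either a DBG_VALUE for variable `Var`
// or an ordinary instruction (IsDbgValue == false, Var unused).
struct ToyInst {
  bool IsDbgValue;
  std::string Var;
};

// Walks one block bottom-up. An entry of the result is true when the
// instruction is a DBG_VALUE and a later DBG_VALUE for the same variable
// exists further down the block, so sinking it would reorder assignments.
std::vector<bool>
flagDbgValuesWithLaterAssignment(const std::vector<ToyInst> &Block) {
  std::vector<bool> Flagged(Block.size(), false);
  std::set<std::string> SeenDbgVars;
  for (std::size_t I = Block.size(); I-- > 0;) {
    const ToyInst &MI = Block[I];
    if (!MI.IsDbgValue)
      continue;
    // insert() reports whether the variable was new; if it was not,
    // a later DBG_VALUE for it has already been visited.
    Flagged[I] = !SeenDbgVars.insert(MI.Var).second;
  }
  return Flagged;
}
```

As the commit message notes, this only covers repeated DBG_VALUEs within a single basic block; the toy model likewise looks at one block at a time.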
Jeremy Morse	fca4100196	[DebugInfo] Re-apply two patches to MachineSink These were: * D58386 / `f5e1b718a6` / reverted in `d382a8a768` * D58238 / `ee50590e16` / reverted in `a8db456b53` Of which the latter has a performance regression tracked in PR43855, fixed by D70672 / D70676, which will be committed atomically with this reapplication. Contains a minor difference to account for a change in the IsCopyInstr signature.	2019-12-05 15:52:20 +00:00
Sam Parker	393dacacf7	[ARM] Enable TypePromotion by default ARMCodeGenPrepare has already been generalized and renamed to TypePromotion. We've had it enabled and tested downstream for a while, so enable it by default. Differential Revision: https://reviews.llvm.org/D70998	2019-12-05 14:21:11 +00:00
Djordje Todorovic	52b231ee84	[LiveDebugValues] Silence the unused var warning; NFC	2019-12-05 12:32:14 +01:00
David Stenberg	54682d871d	[DebugInfo] Handle call site values for instructions before call bundle Summary: If a call is bundled then the code that looks for instructions that produce parameter values would break when reaching the call's bundle header, due to the `isCall(/AnyInBundle/)` invocation returning true. It is not enough to simply ignore bundle headers in the `isCall()` invocation, as the bundle header may have defines of parameter registers due to the call, meaning that such registers would incorrectly be removed from the worklist. Therefore, do not look at bundle headers at all. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: aprantl, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D71024	2019-12-05 11:50:41 +01:00
Djordje Todorovic	4b4ede440a	Reland "[LiveDebugValues] Introduce entry values of unmodified params" Relanding this after resolving the cause of the test failure.	2019-12-05 11:10:49 +01:00
Florian Hahn	76a5c8421e	[MCRegInfo] Add forward sub and super register iterators. (NFC) This patch adds forward iterators mc_difflist_iterator, mc_subreg_iterator and mc_superreg_iterator, based on the existing DiffListIterator. Those are used to provide iterator ranges over sub- and super-register from TRI, which are slightly more convenient than the existing MCSubRegIterator/MCSuperRegIterator. Unfortunately, it duplicates a bit of functionality, but the new iterators are a bit more convenient (and can be used with various existing iterator utilities) and should probably replace the old iterators in the future. This patch updates some existing users. Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70565	2019-12-05 09:29:26 +00:00
Florian Hahn	1b81964586	[MIBundle] Turn MachineOperandIteratorBase into a forward iterator. This patch turns MachineOperandIteratorBase into a regular forward iterator, which can be used with iterator_range. It also adds mi_bundle_ops and const_mi_bundle_ops that return iterator ranges over all operands in a bundle and updates a use of the old iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D70561	2019-12-05 09:06:22 +00:00
Kai Luo	b200c5180e	Reland [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation. Fix assertion error ``` bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed. ``` by checking if the register is 0 before invoking `isRenamable`.	2019-12-05 14:32:11 +08:00
Kai Luo	3882edbe19	Revert "[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation" This reverts commit `75b3a1c318`, since it breaks bootstrap build.	2019-12-05 12:48:37 +08:00
Kai Luo	75b3a1c318	[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation Summary: This patch mainly performs the following transformation ``` $R0 = OP ... ... // No read/clobber of $R0 and $R1 $R1 = COPY $R0 // $R0 is killed ``` Replacing $R0 with $R1 and removing the COPY, we have ``` $R1 = OP ... ``` This transformation can also expose more opportunities for existing copy elimination in MCP. Differential Revision: https://reviews.llvm.org/D67794	2019-12-05 10:59:07 +08:00
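The backward propagation described in the commit message above can be modelled on a toy instruction list. This is a sketch under assumptions and not the MachineCopyPropagation code: `ToyInst`, `touches`, and `backwardPropagateCopies` are hypothetical names, registers are plain strings, and the "killed" property is supplied as a flag instead of being derived from liveness.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy instruction.
struct ToyInst {
  std::string Def;               // register written ("" if none)
  std::vector<std::string> Uses; // registers read
  bool IsCopy;                   // true for: Def = COPY Uses[0]
  bool KillsSrc;                 // for copies: the source is dead afterwards
};

// True if MI reads or writes Reg.
static bool touches(const ToyInst &MI, const std::string &Reg) {
  return MI.Def == Reg ||
         std::find(MI.Uses.begin(), MI.Uses.end(), Reg) != MI.Uses.end();
}

// For each `Dst = COPY Src` that kills Src, scan upwards for the instruction
// defining Src. If nothing in between reads or clobbers Src or Dst, make that
// instruction define Dst directly and erase the COPY.
void backwardPropagateCopies(std::vector<ToyInst> &Block) {
  for (std::size_t I = 0; I < Block.size(); ++I) {
    if (!Block[I].IsCopy || !Block[I].KillsSrc || Block[I].Uses.size() != 1)
      continue;
    const std::string Src = Block[I].Uses[0];
    const std::string Dst = Block[I].Def;
    for (std::size_t J = I; J-- > 0;) {
      if (Block[J].Def == Src) {
        Block[J].Def = Dst;               // $R0 = OP ...  becomes  $R1 = OP ...
        Block.erase(Block.begin() + I);   // drop the now-redundant COPY
        --I;                              // revisit the slot that shifted down
        break;
      }
      if (touches(Block[J], Src) || touches(Block[J], Dst))
        break;                            // intervening read/clobber: give up
    }
  }
}
```

The scan stops at the first intervening instruction that reads or writes either register, which corresponds to the "No read/clobber of $R0 and $R1" condition in the commit message.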
Amara Emerson	28f5ad5801	[GlobalISel] Fix compiler crash lowering G_LOAD in AArch64. Patch by Daniel Rodríguez Troitiño. Differential Revision: https://reviews.llvm.org/D70794	2019-12-04 17:04:54 -08:00
Puyan Lotfi	fdc6f4b97b	[llvm] Fixing MIRVRegNamerUtils to properly handle 2+ MachineBasicBlocks. An interplay of code from D70210, along with code from the Value-Numbering-esque hash-based namer from D70210, as well as some crusty code from the original MIR-Canon code led to multiple causes of failure when canonicalizing or renaming vregs for MIR with multiple basic blocks. This patch fixes those issues while deleting some no-longer-needed code and adding a nice diamond test case to boot. Differential Revision: https://reviews.llvm.org/D70478	2019-12-04 18:36:08 -05:00
Alexey Lapshin	789e257ce0	[DWARF5][Debuginfo] Compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) do not match. This patch fixes the incompatible compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) error. cat split-dwarf.cpp int main() { int a = 1; return 0; } clang++ -O -g -gsplit-dwarf -gdwarf-5 split-dwarf.cpp; llvm-dwarfdump --verify ./a.out \| grep skeleton error: Compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) do not match. The fix is to change DW_TAG_compile_unit into DW_TAG_skeleton_unit when a skeleton file is generated. Differential Revision: https://reviews.llvm.org/D70880	2019-12-05 00:53:47 +03:00
Amy Huang	9e978bb01c	Add support for lowering 32-bit/64-bit pointers Summary: This follows a previous patch that changes the X86 datalayout to represent mixed size pointers (32-bit sext, 32-bit zext, and 64-bit) with address spaces (https://reviews.llvm.org/D64931) This patch implements the address space cast lowering to the corresponding sign extension, zero extension, or truncate instructions. Related to https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69639	2019-12-04 11:39:03 -08:00
Vedant Kumar	f208b70fbc	Revert "[Coverage] Revise format to reduce binary size" This reverts commit `e18531595b`. On Windows, there is an error: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54963/steps/stage%201%20check/logs/stdio error: C:\b\slave\sanitizer-windows\build\stage1\projects\compiler-rt\test\profile\Profile-x86_64\Output\instrprof-merging.cpp.tmp.v1.o: Failed to load coverage: Malformed coverage data	2019-12-04 10:35:14 -08:00
Vedant Kumar	e18531595b	[Coverage] Revise format to reduce binary size Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2019-12-04 10:10:55 -08:00
Cullen Rhodes	17e537bc58	[NFC] Use default case in EVT::getEVTString Summary: The default case handles the majority of MVTs so most of the individual cases can be removed. Also added a case for floating point types. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D70955	2019-12-04 11:06:49 +00:00
Ulrich Weigand	c3d05c1b52	[SelectionDAG] Expand nnan FMINNUM/FMAXNUM to select sequence InstCombine may synthesize FMINNUM/FMAXNUM nodes from fcmp+select sequences (where the fcmp is marked nnan). Currently, if the target does not otherwise handle these nodes, they'll get expanded to libcalls to fmin/fmax. However, these functions may reside in libm, which may introduce a library dependency that was not originally present in the source code, potentially resulting in link failures. To fix this problem, add code to TargetLowering::expandFMINNUM_FMAXNUM to expand FMINNUM/FMAXNUM to a compare+select sequence instead of the libcall. This is done only if the node is marked as "nnan"; in this case, the expansion to compare+select is always correct. This also suffices to catch all cases where FMINNUM/FMAXNUM was synthesized as above. Differential Revision: https://reviews.llvm.org/D70965	2019-12-04 10:32:35 +01:00
QingShan Zhang	d84b320dfd	[MacroFusion] Limit the max fused number to 2 to reduce the dependency This is the example: int foo(int a, int b, int c, int d) { return a + b + c + d; } And this is the Dependency Graph: +------+ +------+ +------+ +------+ \| A \| \| B \| \| C \| \| D \| +--+--++ +---+--+ +--+---+ +--+---+ ^ ^ ^ ^ ^ ^ \| \| \| \| \| \| \| \| \| \|New1 +--------------+ \| \| \| \| \| \| \| \| \| +--+---+ \| \|New2 \| +-------+ ADD1 \| \| \| \| +--+---+ \| \| \| Fuse ^ \| \| +-------------+ \| +------------+ \| \| \| Fuse +--+---+ +----------->+ ADD2 \| \| +------+ +--+---+ \| ADD3 \| +------+ We would also need to create an artificial edge from ADD1 to A if https://reviews.llvm.org/D69998 lands. That would force Node A to be scheduled before ADD1 and ADD2. But in fact, it is OK to schedule Node A in between ADD3 and ADD2, as ADD3 and ADD2 are NOT a fusion pair because ADD2 has been matched to ADD1. We are creating these unnecessary dependency edges that override the heuristics. Differential Revision: https://reviews.llvm.org/D70066	2019-12-04 05:05:35 +00:00
Craig Topper	f586fd44e4	[FPEnv] [PowerPC] Lowering ppc_fp128 StrictFP Nodes to libcalls This is an alternative to D64662 that shares more code between strict and non-strict nodes. It's modeled after the implementation that I did for softening. Differential Revision: https://reviews.llvm.org/D70867	2019-12-03 14:11:21 -08:00
Aditya Nandakumar	6da7dbb806	[GlobalISel]: Allow targets to override how to widen constants during legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true, which would end up becoming -1, which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now sign-extend for byte-sized types and zero-extend otherwise (by default). Targets can of course override this if necessary.	2019-12-03 10:41:10 -08:00
Roman Lebedev	9a20c79ddc	[NFC][KnownBits] Add getMinValue() / getMaxValue() methods As can be seen from the accompanying cleanup, it is not unheard of to write `~Known.Zero` meaning "what maximal value can this KnownBits produce". But I think `~Known.Zero` isn't that self-explanatory, as compared to a method with a name. Note that not all `~Known.Zero` places were cleaned up, only those where this arguably improves things.	2019-12-03 20:04:51 +03:00
Amaury Séchet	b4980f7781	[SelectionDAG] Reorder ViewXXXDAGs declarations to match execution order. NFC	2019-12-03 16:26:12 +01:00
stozer	269a9afe25	[DebugInfo] Make DebugVariable class available in DebugInfoMetadata The DebugVariable class is a class declared in LiveDebugValues.cpp which is used to uniquely identify a single variable, using its source variable, inline location, and fragment info to do so. This patch moves this class into DebugInfoMetadata.h, making it available in a much broader scope.	2019-12-03 15:10:56 +00:00
Sourabh Singh Tomar	8dd17a13b0	[NFCI][DebugInfo] Corrected a comment.	2019-12-03 19:45:37 +05:30
Djordje Todorovic	409350deea	Revert "[LiveDebugValues] Introduce entry values of unmodified params" This reverts commit rG4cfceb910692 due to LLDB test failing.	2019-12-03 13:13:27 +01:00
Sam Parker	bc76dadb3c	[CodeGen] Move ARMCodegenPrepare to TypePromotion Convert ARMCodeGenPrepare into a generic type promotion pass by: - Removing the insertion of arm specific intrinsics to handle narrow types as we weren't using this. - Removing ARMSubtarget references. - Now query a generic TLI object to know which types should be promoted and what they should be promoted to. - Move all codegen tests into Transforms folder and testing using opt and not llc, which is how they should have been written in the first place... The pass searches up from icmp operands in an attempt to safely promote types so we can avoid generating unnecessary unsigned extends during DAG ISel. Differential Revision: https://reviews.llvm.org/D69556	2019-12-03 11:12:52 +00:00
Jonas Paulsson	f8c0cfc24e	ImplicitNullChecks: Don't add a dead definition of DepMI as live-in This is one of the fixes needed to reapply D68267 which improves verification of live-in lists. Review: craig.topper https://reviews.llvm.org/D70434	2019-12-03 11:02:53 +01:00
Djordje Todorovic	4cfceb9106	[LiveDebugValues] Introduce entry values of unmodified params The idea is to remove front-end analysis for the parameter's value modification and leave it to the value tracking system. In some cases the front-end marks a parameter as modified even when the line of code that modifies the parameter gets optimized out, which implies that this approach will cover even more entry values. In addition, extending the support for modified parameters will be easier with this approach. Since the goal is to recognize if a parameter’s value has changed, the idea at a very high level is: If we encounter a DBG_VALUE other than the entry value one describing the same variable (parameter), we can assume that the variable’s value has changed and we should not track its entry value any more. That would be the ideal scenario, but due to various LLVM optimizations, a variable’s value could be just moved around from one register to another (and there will be additional DBG_VALUEs describing the same variable), so we have to recognize such situations (otherwise, we will lose a lot of entry values) and salvage the debug entry value. Differential Revision: https://reviews.llvm.org/D68209	2019-12-03 11:01:45 +01:00
Jonas Paulsson	4fd8f11901	[MachineVerifier] Improve checks of target instructions operands. While working with a patch for instruction selection, the splitting of a large immediate ended up being treated incorrectly by the backend. Where a register operand should have been created, it instead became an immediate. To my surprise the machine verifier failed to report this, which at the time would have been helpful. This patch improves the verifier so that it will report this type of error. This patch XFAILs CodeGen/SPARC/fp128.ll, which has been reported at https://bugs.llvm.org/show_bug.cgi?id=44091 Review: thegameg, arsenm, fhahn https://reviews.llvm.org/D63973	2019-12-03 10:20:52 +01:00
Craig Topper	039664db87	[LegalizeDAG] Return true from ExpandNode for some nodes that don't have expand support. These nodes have a FIXME that they only get here because a Custom handler returned SDValue() instead of the original Op. Even though we aren't expanding them, we should return true here to prevent ConvertNodeToLibcall from also trying to process them until the FIXME has been addressed. I'm hoping to add checking to ConvertNodeToLibcall to make sure we don't give it nodes it doesn't have support for.	2019-12-02 23:39:20 -08:00
Craig Topper	f92000187e	[LegalizeDAG] When expanding vector SRA/SRL/SHL add the new BUILD_VECTOR to the Results vector instead of just calling ReplaceNode The code that processes the Results vector also calls ReplaceNode and makes ExpandNode return true. If we don't add it to the Results node, we end up returning false from ExpandNode. This causes ConvertNodeToLibcall to be called next. But ConvertNodeToLibcall doesn't do anything for shifts so they just pass through unmodified. Except for printing a debug message. Ultimately, I'd like to add more checks to ExpandNode and ConvertNodeToLibcall to make sure we don't have nodes marked as Expand that don't have any Expand or libcall handling.	2019-12-02 23:07:39 -08:00
Sourabh Singh Tomar	f1e3988aa6	Recommit "[DWARF5]Addition of alignment attribute in typedef DIE." This revision is revised to update Go-bindings and Release Notes. The original commit message follows. This patch adds support for the DW_AT_alignment[DWARF5] attribute, to be emitted with a typedef DIE when explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111	2019-12-03 09:51:43 +05:30
Sourabh Singh Tomar	3f3d0f4f4b	[DebugInfo] Support for debug_macinfo.dwo section in llvm and llvm-dwarfdump. This patch adds support for debug_macinfo.dwo section[pre-standardized] to llvm and llvm-dwarfdump. Reviewers: probinson, dblaikie, aprantl, jini.susan.george, alok Differential Revision: https://reviews.llvm.org/D70705 Tags: #debug-info #llvm	2019-12-03 08:54:12 +05:30
Hiroshi Yamauchi	8cdfdfeee6	[PGO][PGSO] Add an optional query type parameter to shouldOptimizeForSize. Summary: In case of a need to distinguish different query sites for gradual commit or debugging of PGSO. NFC. Reviewers: davidxl Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70510	2019-12-02 13:54:13 -08:00
Florian Hahn	5154b0253d	[MIBundles] Move analyzePhysReg out of MIBundleOperands iterator (NFC). analyzePhysReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D70559	2019-12-02 20:47:08 +00:00
Volkan Keles	3d02fa6da7	[GlobalISel] CombinerHelper: Fix a bug in matchCombineCopy Summary: When combining COPY instructions, we were replacing the destination registers with the source register without checking register constraints. This patch adds a simple logic to check if the constraints match before replacing registers. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic Reviewed By: aditya_nandakumar Subscribers: rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70616	2019-12-02 12:05:09 -08:00
Florian Hahn	5d0625664b	[MIBundles] Move analyzeVirtReg out of MIBundleOperands iterator (NFC). analyzeVirtReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70558	2019-12-02 19:50:33 +00:00
Amaury Séchet	c594d14d40	[DAGCombine] Factor oplist operations. NFC	2019-12-02 19:12:03 +01:00
Amaury Séchet	d8d5106225	[SelectionDAG] Reduce assumptions made about levels. NFC	2019-12-02 17:43:13 +01:00
Hans Wennborg	cee62e6fcf	Fix a typo.	2019-11-30 13:23:49 +01:00
Craig Topper	2f3e8cb313	[LegalizeTypes] Add strict FP support to SoftenFloatRes_FP_ROUND. Fix mistake in SoftenFloatRes_FP_EXTEND. These will be needed for ARM fp-instrinsics.ll which is currently XFAILed. One of the getOperand calls in SoftenFloatRes_FP_EXTEND was not taking strict FP into account. It only affected the call to setTypeListBeforeSoften which only has an effect on some targets.	2019-11-28 15:32:09 -08:00
Craig Topper	68ddf434c0	[LegalizeTypes] In SoftenFloatRes_FNEG, always generate integer arithmetic, never fall back to using fsub. We would previously fallback if the type wasn't f32/f64/f128. But I don't think any of the other floating point types ever go through the softening code anyway. So this code is dead.	2019-11-28 15:30:34 -08:00
Craig Topper	2485fa7739	[LegalizeTypes] Use SoftenFloatRes_Unary in SoftenFloatRes_FCBRT to reduce code. We don't have a STRICT_CBRT ISD opcode, but we can still use SoftenFloatRes_Unary to simplify some code.	2019-11-28 15:30:34 -08:00
Amaury Séchet	ca818f4550	[DAGCombiner] Peek through vector concats when trying to combine shuffles. Summary: This combine showed up as needed when exploring the regression when processing the DAG in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68195	2019-11-28 23:57:29 +01:00
Craig Topper	735f4793f1	[LegalizeTypes] Remove dead code related to softening f16 which we no longer do. f16 is promoted to f32 if it is not legal on the target. Found while reviewing what else needed to be done for strict FP in the softening code.	2019-11-27 22:10:30 -08:00
Craig Topper	ed521fef03	[LegalTypes][X86] Add SoftenFloatOperand support for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT.	2019-11-27 21:16:13 -08:00
Craig Topper	1727c4f1a2	[LegalizeTypes][X86] Add ExpandIntegerResult support for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT.	2019-11-27 18:41:45 -08:00
Craig Topper	9283681e16	[CriticalAntiDepBreaker] Teach the regmask clobber check to check if any subregister is preserved before considering the super register clobbered X86 has some calling conventions where bits 127:0 of a vector register are callee saved, but the upper bits aren't. Previously we could detect that the full ymm register was clobbered when the xmm portion was really preserved. This patch checks the subregisters to make sure they aren't preserved. Fixes PR44140 Differential Revision: https://reviews.llvm.org/D70699	2019-11-27 11:20:58 -08:00
Craig Topper	ebfff46c8d	[LegalizeTypes][FPEnv][X86] Add initial support for softening strict fp nodes This is based on what's required for softening fp128 operations on 32-bit X86 assuming f32/f64/f80 are legal. So there could be some things missing. Differential Revision: https://reviews.llvm.org/D70654	2019-11-27 10:50:10 -08:00
Craig Topper	350565dbc0	[LegalizeTypes] Add SoftenFloatOp_Unary to reduce some duplication for softening LRINT/LLRINT/LROUND/LLROUND Summary: This will be enhanced in a follow up to add strict fp support Reviewers: efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70751	2019-11-26 17:37:51 -08:00
Craig Topper	9b08366f57	[LegalizeTypes] Add SoftenFloatRes_Unary and SoftenFloatRes_Binary functions to factor repeated patterns out of many of the SoftenFloatRes_* functions This has been factored out of D70654 which will add strict FP support to these functions. By making the helpers we avoid repeating even more code. Differential Revision: https://reviews.llvm.org/D70736	2019-11-26 12:52:17 -08:00
Craig Topper	ee3b375b4c	[LegalizeDAG] Use getOperationAction instead of getStrictFPOperationAction for STRICT_LRINT/LROUND/LLRINT/LLROUND.	2019-11-26 11:57:45 -08:00
Fangrui Song	fe955e6c70	TargetPassConfig: const char * -> const char [] The latter has better codegen in non-optimized builds, which do not run ipsccp.	2019-11-26 11:25:00 -08:00
David Green	b5315ae8ff	[Codegen][ARM] Add addressing modes from masked loads and stores MVE has a basic symmetry between its normal load/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do. To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores. This involves: - Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase. - Extending the IndexedModeActions from 8bits to 16bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads. - The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way. - The ARM backend is then adjusted to make use of these indexed masked loads/stores. - The X86 backend is adjusted, hopefully with no functional changes. Differential Revision: https://reviews.llvm.org/D70176	2019-11-26 16:21:01 +00:00
Luís Marques	6fd4c42fa8	[LegalizeTypes][RISCV] Soften FCOPYSIGN operand Summary: Adds support for softening FCOPYSIGN operands. Adds RISC-V tests that exercise the new softening code. Reviewers: asb, lenary, efriedma Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D70679	2019-11-26 15:22:55 +00:00
Sam Parker	28166816b0	[ARM][ReachingDefs] Remove dead code in loloops. Add some more helper functions to ReachingDefs to query the uses of a given MachineInstr and also to query whether two MachineInstrs use the same def of a register. For Arm, while tail-predicating, these helpers are used in the low-overhead loops to remove the dead code that calculates the number of loop iterations. Differential Revision: https://reviews.llvm.org/D70240	2019-11-26 10:27:46 +00:00
Sam Parker	cced971fd3	[ARM][ReachingDefs] RDA in LoLoops Add several new methods to ReachingDefAnalysis: - getReachingMIDef, instead of returning an integer, returns the MachineInstr that produces the def. - getInstFromId, returns the MachineInstr to which the given integer corresponds. - hasSameReachingDef, returns whether two MachineInstrs use the same def of a register. - isRegUsedAfter, returns whether a register is used after a given MachineInstr. These methods have been used in ARMLowOverhead to replace searching for uses/defs. Differential Revision: https://reviews.llvm.org/D70009	2019-11-26 10:13:46 +00:00
Craig Topper	3dc7c5f7d8	[LegalizeTypes] Remove code to create ISD::FP_TO_FP16 from SoftenFloatRes_FTRUNC. There seems to have been a misunderstanding of what ISD::FTRUNC represents. ISD::FTRUNC is equivalent to llvm.trunc which takes a floating point value, truncates it without changing the size of the value and returns it. Despite its similar name, it's different from the fptrunc instruction in IR which changes a floating point value to a smaller floating point value. fptrunc is represented by ISD::FP_ROUND in SelectionDAG. Since the ISD::FP_TO_FP16 node takes a floating point value and converts it to f16, it's more similar to ISD::FP_ROUND. In fact there is identical code to what is being removed here in SoftenFloatRes_FP_ROUND. I assume this bug was never encountered because it would require f16 to be legalized by softening rather than the default of promoting.	2019-11-25 18:18:40 -08:00
Sanjay Patel	214683f3b2	[DAGCombiner] avoid crash on out-of-bounds insert index (PR44139) We already have this simplification at node-creation-time, but the test from: https://bugs.llvm.org/show_bug.cgi?id=44139 ...shows that we can combine our way to an assert/crash too.	2019-11-25 16:24:06 -05:00
Craig Topper	d6ec6e4bf6	[TargetLowering] Merge ExpandChainLibCall with makeLibCall I need to be able to drop an operand for STRICT_FP_ROUND handling on X86. Merging these functions gives me the ArrayRef interface that passes the return type, operands, and debugloc instead of the Node. Differential Revision: https://reviews.llvm.org/D70503	2019-11-25 10:52:49 -08:00
Jeremy Morse	d9c9a4e48d	[DebugInfo] Avoid register coalescing unsoundly changing DBG_VALUE locations This is a re-land of D56151 / r364515 with a completely new implementation. Once MIR code leaves SSA form and the liveness of a vreg is considered, DBG_VALUE insts are able to refer to non-live vregs, because their debug-uses do not contribute to liveness. This non-liveness becomes problematic for optimizations like register coalescing, as they can't ``see'' the debug uses in the liveness analyses. As a result registers get coalesced regardless of debug uses, and that can lead to invalid variable locations containing unexpected values. In the added test case, the first vreg operand of ADD32rr is merged with various copies of the vreg (great for performance), but a DBG_VALUE of the unmodified operand is blindly updated to the modified operand. This changes what value the variable will appear to have in a debugger. Fix this by changing any DBG_VALUE whose operand will be resurrected by register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no location. This is an overapproximation as some coalesced locations are safe (others are not) -- an extra domination analysis would be required to work out which, and it would be better if we just don't generate non-live DBG_VALUEs. Differential Revision: https://reviews.llvm.org/D64630	2019-11-25 13:47:06 +00:00
Thomas Raoux	e0297a8bee	[ModuloSchedule] Fix a bug in experimental expander Fix two problems that popped up after my last patch. One is that the stitching of prologue/epilogue can be wrong when reading a value from a previous stage. Also changed how we duplicate phi instructions to avoid generating extra phi that we delete later. Differential Revision: https://reviews.llvm.org/D70213	2019-11-23 16:01:47 -08:00
Sourabh Singh Tomar	0e02977b6e	Recommit "[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump." The original commit message follows. This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump. Also Fixes PR43622, PR43623. Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george Differential Revision: https://reviews.llvm.org/D69462	2019-11-23 20:10:23 +05:30
Sourabh Singh Tomar	02cb4b2fd6	Revert "[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump." This reverts commit `81b0a3284a`. Will re-apply, with updated Differential Revision, for automatic closure of Phabricator review.	2019-11-23 19:46:07 +05:30
Sourabh Singh Tomar	81b0a3284a	[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump. This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump. Also Fixes PR43622, PR43623. Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george https://reviews.llvm.org/D69462	2019-11-23 10:25:11 +05:30
Clement Courbet	cb15ba84fe	Reland "[DAGCombiner] Allow zextended load combines." Check that the generated type is simple.	2019-11-22 14:47:18 +01:00
Roman Lebedev	96cf5c8d47	[Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` (PR35479) Summary: The current lowering is: ``` Name: (X % C1) == C2 -> X * C3 <= C4 \|\| false Pre: (C2 == 0 \|\| C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n3, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/2xC https://rise4fun.com/Alive/jpb5 However, we can support non-tautological cases `C1 u> C2` too. Said handling consists of two parts: * `C2 u<= (-1 %u C1)`. It just works. We only have to change `(X % C1) == C2` into `((X - C2) % C1) == 0` ``` Name: (X % C1) == C2 -> (X - C2) * C3 <= C4 iff C2 u<= (-1 %u C1) Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u<= (-1 %u C1) %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = (-1 /u C1) %n0 = sub i8 %x, C2 %n1 = mul i8 %n0, C3 %n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right %n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n4 = or i8 %n2, %n3 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n4, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/m4P https://rise4fun.com/Alive/SKrx * `C2 u> (-1 %u C1)`. 
We also have to change `(X % C1) == C2` into `((X - C2) % C1) == 0`, and we have to decrement C4: ``` Name: (X % C1) == C2 -> (X - C2) * C3 <= C4 iff C2 u> (-1 %u C1) Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u> (-1 %u C1) %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = (-1 /u C1)-1 %n0 = sub i8 %x, C2 %n1 = mul i8 %n0, C3 %n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right %n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n4 = or i8 %n2, %n3 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n4, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/d40 https://rise4fun.com/Alive/8cF I believe this concludes `x u% C1 ==/!= C2` lowering. In fact, clang may now be better in this regard than gcc: as can be seen from the `@t32_6_4` test, we do lower `x % 6 == 4` via this pattern, while gcc does not: https://godbolt.org/z/XNU2z9 And all the general alive proofs say this is legal. And manual checking agrees: https://rise4fun.com/Alive/WA2 Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=35479 \| PR35479 ]]. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: nick, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70053	2019-11-22 15:22:42 +03:00
Roman Lebedev	3f46022e33	[Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` with tautological C1 u<= C2 (PR35479) Summary: This is a preparatory cleanup before I add more of this fold to deal with comparisons with non-zero. In essence, the current lowering is: ``` Name: (X % C1) == 0 -> X * C3 <= C4 Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, 0 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %r = icmp ule i8 %n3, %C4 ``` https://rise4fun.com/Alive/oqd It kinda just works, really no weird edge-cases. But it isn't all that great for when comparing with non-zero. In particular, given `(X % C1) == C2`, there will be problems in the always-false tautological case where `C2 u>= C1`: https://rise4fun.com/Alive/pH3 That case is tautological, always-false: ``` Name: (X % Y) u>= Y %o0 = urem i8 %x, %y %r = icmp uge i8 %o0, %y => %r = false ``` https://rise4fun.com/Alive/ofu While we can't/shouldn't get such tautological case normally, we do deal with non-splat vectors, so unless we want to give up in this case, we need to fixup/short-circuit such lanes. There are two lowering variants: 1. 
We can blend between whatever computed result and the correct tautological result ``` Name: (X % C1) == C2 -> X * C3 <= C4 \|\| false Pre: (C2 == 0 \|\| C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %res = icmp ule i8 %n3, %C4 %r = select i1 %is_tautologically_false, i1 0, i1 %res ``` https://rise4fun.com/Alive/PjT5 https://rise4fun.com/Alive/1KV 2. We can invert the comparison result ``` Name: (X % C1) == C2 -> X * C3 <= C4 \|\| false Pre: (C2 == 0 \|\| C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n3, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/2xC https://rise4fun.com/Alive/jpb5 3. We can expand into `and`/`or`: https://rise4fun.com/Alive/WGn https://rise4fun.com/Alive/lcb5 Blend-one is likely better since we avoid having to load the replacement from constant pool. `xor` is second best since it's still pretty general. I'm not adding `and`/`or` variants. 
Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: nick, hiraditya, xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70051	2019-11-22 15:16:03 +03:00
Clement Courbet	88e205525c	Revert "[DAGCombiner] Allow zextended load combines." Breaks some bots.	2019-11-22 09:01:08 +01:00
Clement Courbet	036790f988	[DAGCombiner] Allow zextended load combines. Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base) Reviewers: apilipenko, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70487	2019-11-22 08:40:19 +01:00
Pengfei Wang	22a0edd070	[FPEnv] Add an option to disable strict float node mutating to a normal float node This patch adds an option 'disable-strictnode-mutation' to prevent a strict node from mutating to a normal node, so we can make sure that the patch which sets strict-node as legal works correctly. Patch by Chen Liu(LiuChen3) Differential Revision: https://reviews.llvm.org/D70226	2019-11-21 18:07:11 -08:00
Craig Topper	7696b99258	[LegalizeDAG][X86] Add support for turning STRICT_FADD/SUB/MUL/DIV into libcalls. Use it for fp128 on x86-64. This requires a minor hack for f32/f64 strict fadd/fsub to avoid turning those into libcalls.	2019-11-21 16:19:25 -08:00
Hiroshi Yamauchi	52e377497d	[PGO][PGSO] DAG.shouldOptForSize part. Summary: (Split of off D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095	2019-11-21 14:16:00 -08:00
Tom Stellard	ab411801b8	[cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraries: 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON. 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined in lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future. 
I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179	2019-11-21 10:48:08 -08:00
Bjorn Pettersson	898de30291	[BranchFolding] Fix PR43964 about branch folder not being debug invariant Summary: The fix in BranchFolder related to non debug invariant problems done in commit `ec32dff0b0` actually introduced some new problems with debug invariance. Before that patch ComputeCommonTailLength would move iterators back, past debug instructions, in order to make ProfitableToMerge make consistent answers "when one block differs from the other only by whether debugging pseudos are present at the beginning". But the changes in `ec32dff0b0` undid that by moving the iterators forward again. This patch refactors ComputeCommonTailLength. The function was really complex, considering that the SkipTopCFIAndReturn part always moved the iterators forward to the first "real" instruction in the found tail after `ec32dff0b0`. The patch also restores the logic to "back past possible debugging pseudos at beginning of block" to make sure ProfitableToMerge gives consistent answers independent of DBG_VALUE instructions before the tail. That is now done by ProfitableToMerge instead of being hidden as a side-effect in ComputeCommonTailLength. Reviewers: probinson, yechunliang, jmorse Reviewed By: jmorse Subscribers: Orlando, mehdi_amini, dexonsmith, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70091	2019-11-21 18:13:32 +01:00
Clement Courbet	252567377c	[DAGCombine][NFC] Use ArrayRef and correctly size SmallVectors. In preparation for D70487.	2019-11-21 08:53:37 +01:00
Adrian Prantl	5da385fb56	Fix an offset underflow bug in DwarfExpression when describing small values with subregisters DwarfExpression::addMachineReg() knows how to build a larger register that isn't expressible in DWARF by combining multiple subregisters. However, if the entire value fits into just one subregister, it would still emit the other subregisters, leading to all sorts of inconsistencies down the line. This patch fixes that by moving an already existing(!) check whether the subregister's offset is before the end of the value to the right place. rdar://problem/57294211 Differential Revision: https://reviews.llvm.org/D70508	2019-11-20 17:07:54 -08:00
Craig Topper	c9e8e808cf	[SelectionDAG][X86] Mutate strictFP nodes to non-strict in DoInstructionSelection when the node is marked Expand rather than when it is not Legal. This allows operations that are marked Custom, but have some type combinations that are legal to get past this code. Add custom mutation code to X86's Select function for the nodes that don't have isel patterns yet.	2019-11-20 10:36:02 -08:00
Xiangling Liao	750e855641	A fix of the bug introduced by previous lowering in asm patch. Differential Revision: https://reviews.llvm.org/D70243	2019-11-20 11:29:10 -05:00
Xing Xue	5665fc91fe	[AIX][XCOFF] Add support for generating assembly code for one-byte mergeable strings This patch adds support for generating assembly code for one-byte mergeable strings. Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later. Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L Reviewed by: daltenty Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70310	2019-11-20 11:26:49 -05:00
Xiangling Liao	ca33727abe	[AIX] Lowering jump table, constant pool and block address in asm This patch lowers jump table, constant pool and block address in assembly. 1. On AIX, jump table index is always relative; 2. Put CPI and JTI into ReadOnlySection until we support unique data sections; 3. Create the temp symbol for block address symbol; 4. Update MIR testcases and add related assembly part; Differential Revision: https://reviews.llvm.org/D70243	2019-11-20 10:27:15 -05:00
David Zarzycki	257acbf6ae	[SelectionDAG] Combine U{ADD,SUB}O diamonds into {ADD,SUB}CARRY Summary: Convert (uaddo (uaddo x, y), carryIn) into addcarry x, y, carryIn if-and-only-if the carry flags of the first two uaddo are merged via OR or XOR. Work remaining: match ADD, etc. Reviewers: craig.topper, RKSimon, spatel, niravd, jonpa, uweigand, deadalnix, nikic, lebedev.ri, dmgreen, chfast Reviewed By: lebedev.ri Subscribers: chfast, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70079	2019-11-20 16:25:42 +02:00
Djordje Todorovic	979592a6f7	[DebugInfo] Remove the DIFlagArgumentNotModified debug info flag Due to changes in D68206, we remove the DIFlagArgumentNotModified and its usage. Differential Revision: https://reviews.llvm.org/D68207	2019-11-20 13:18:40 +01:00
Serge Pavlov	ea8678d1c7	Move floating point related entities to namespace level This is a recommit of commit `e6584b2b7b`, which was reverted in `30e7ee3c4b` together with `af57dbf12e`. Original message is below. Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. It was, however, inconvenient, as it required including constrained intrinsics definitions even if they were not needed. Also, using a long scope prefix reduced readability. This change moves these definitions to the namespace llvm::fp. No functional changes. Differential Revision: https://reviews.llvm.org/D69552	2019-11-20 19:05:46 +07:00
Serge Pavlov	0c50c0b055	[FEnv] File with properties of constrained intrinsics Summary In several places we need to enumerate all constrained intrinsics or IR nodes that should be represented by them. It is easy to miss some of the cases. To make working with these intrinsics more convenient and robust, this change introduces file containing definitions of all constrained intrinsics and some of their properties. This file can be included to generate constrained intrinsics processing code. Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69887	2019-11-20 13:30:07 +07:00
Craig Topper	c4b41e8d1d	[LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promoted Differential Revision: https://reviews.llvm.org/D70220	2019-11-19 16:14:37 -08:00
Vedant Kumar	ba71ca3720	[DebugInfo] Describe size of spilled values in call site params A call site parameter description of a memory operand needs to unambiguously convey the size of the operand to prevent incorrect entry value evaluation. Thanks to David Stenberg for pointing this issue out!	2019-11-19 12:03:52 -08:00
Matt Arsenault	7fe9435dc8	Work on cleaning up denormal mode handling Cleanup handling of the denormal-fp-math attribute. Consolidate places checking the allowed names in one place. This is in preparation for introducing FP type specific variants of the denormal-fp-mode attribute. AMDGPU will switch to using this in place of the current hacky use of subtarget features for the denormal mode. Introduce a new header for dealing with FP modes. The constrained intrinsic classes define related enums that should also be moved into this header for uses in other contexts. The verifier could use a check to make sure the denorm-fp-mode attribute is sane, but there currently isn't one. Currently, DAGCombiner incorrectly assumes non-IEEE behavior by default in the one current user. Clang must be taught to start emitting this attribute by default to avoid regressions when this is switched to assume ieee behavior if the attribute isn't present.	2019-11-19 22:01:14 +05:30
Matt Arsenault	b696b9dba7	DAG: Add function context to isFMAFasterThanFMulAndFAdd AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.	2019-11-19 19:25:26 +05:30
Thomas Preud'homme	a89ca4ae17	Fix PR44001: assert failure in getFunctionLocalOffsetAfterInsn Summary: Assert in getFunctionLocalOffsetAfterInsn() fails when processing a call MachineInstr inside a bundle and compiling with debug info. This is because labels are added by DwarfDebug::beginInstruction() which is called for each top-level MI by EmitFunctionBody()'s for-loop iteration but constructCallSiteEntryDIEs() which calls getFunctionLocalOffsetAfterInsn() iterates over all MIs. This commit modifies constructCallSiteEntryDIEs() to get the associated bundle MI for call MIs inside a bundle and use that when calling getFunctionLocalOffsetAfterInsn() and getLabelAfterInsn(). It also skips loop iterations for bundle MIs since the loop statements are concerned with debug info for each physical instruction, and bundles represent a group of instructions. It also fixes the comment about PCAddr since the code is getting the return address and not the call address. Reviewers: dstenb, vsk, aprantl, djtodoro, dblaikie, NikolaPrica Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70293	2019-11-19 11:23:11 +00:00
Craig Topper	dc02eb1909	[SelectionDAG] Merge the two identical ExpandChainLibCall methods from LegalizeTypes and LegalizeDAG to one version in TargetLowering. Reviewers: RKSimon, efriedma, spatel Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70354	2019-11-18 20:22:33 -08:00
Craig Topper	6e20d70a69	[LegalizeDAG] Convert strict fp nodes to libcalls without losing the chain. Previously we mutated the node and then converted it to a libcall. But this loses the chain information. This patch keeps the chain, but unfortunately breaks tail call optimization as the functions involved in deciding if a node is in tail call position can't handle the chain. But correct ordering seems more important to be right. Somehow the SystemZ tests improved. I looked at one of them and it seemed that we're handling the split vector elements in a different order and that made the copies work better. Differential Revision: https://reviews.llvm.org/D70334	2019-11-18 11:24:08 -08:00
Eric Christopher	30e7ee3c4b	Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=" and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread. This reverts commits `af57dbf12e` and `e6584b2b7b`	2019-11-18 10:46:48 -08:00
Sam McCall	d27a16eb39	Revert "[DWARF5]Addition of alignment attribute in typedef DIE." This reverts commit `423f541c1a`, which breaks llvm-c ABI.	2019-11-18 15:53:22 +01:00
Graham Hunter	3f08ad611a	[SVE][CodeGen] Scalable vector MVT size queries * Implements scalable size queries for MVTs, split out from D53137. * Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types. * Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems. * CodeGenDAGPatterns will treat scalable and non-scalable vector types as different. Reviewers: greened, cameron.mcinally, sdesmalen, rovka Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D66871	2019-11-18 12:30:59 +00:00
Craig Topper	bfbbf0aba8	[LegalizeTypes] Remove SoftenFloat handling from ExpandIntRes_LLROUND_LLRINT and remove assert from the strict fp path. These were both recently added. While the call to GetSoftenedFloat is a little more optimal, we don't do it in the expand for FP_TO_SINT/UINT so there's no real reason to do it here. This avoids a FIXME for strict fp.	2019-11-17 23:48:31 -08:00
Craig Topper	5a56d2aa33	[LegalizeTypes] Remove unnecessary conversion from EVT to MVT to MVT::SimpleValueType just to assign back to EVT. NFC	2019-11-17 23:48:31 -08:00
Craig Topper	af435286e5	[LegalizeTypes][X86] Add support for expanding the result type of STRICT_LLROUND and STRICT_LLRINT. This doesn't handle softening the input type, but we don't handle softening any of the strict nodes yet. Skipping that made it easy to reuse an existing function for creating a libcall from a node with a chain.	2019-11-17 20:03:05 -08:00
Craig Topper	1b0efe2b17	[LegalizeTypes] When expanding the integer result of LLROUND/LLRINT, also call GetSoftenedFloat if the floating point input needs to be softened. Before this we were emitting a bitcast to integer from the lowering code that itself will need to be legalized. By calling GetSoftenedFloat we get the integer conversion in one step without needing to relegalize a bitcast.	2019-11-17 13:31:30 -08:00
Craig Topper	9b515b6dd9	[LegalizeTypes] Remove PromoteFloat support from ExpandIntRes_LLROUND_LLRINT. This code isn't exercised, and was in the wrong place. If we need this, we would need to promote the type before figuring out which libcall to use. I'm choosing to remove it rather than fixing since we don't support PromoteFloat for LRINT/LROUND/LLRINT/LLROUND when the result type is legal so I don't see much reason to support it for the case where the result type isn't legal.	2019-11-17 13:31:30 -08:00
Craig Topper	d4ba11ae32	[LegalizeTypes] Merge ExpandIntRes_LLROUND and ExpandIntRes_LLRINT into one function that handles both. NFC These two functions were the same except for which libcall gets emitted. Just merge them into one. This is prep work for some other work including strict fp support.	2019-11-17 13:31:30 -08:00
Sourabh Singh Tomar	423f541c1a	[DWARF5]Addition of alignment attribute in typedef DIE. This patch adds support for the DW_AT_alignment[DWARF5] attribute, to be emitted with a typedef DIE when explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111	2019-11-16 21:56:53 +05:30
David Blaikie	77cfcd7509	DebugInfo: Use loclistx for DWARFv5 location lists to reduce the number of relocations This only implements the non-dwo part, but loclistx is necessary to use location lists in DWARFv5, so it's a precursor to that work - and generally reduces relocations (only using one reloc, then indexes/relative offsets for all location list references) in non-split DWARF.	2019-11-15 18:51:13 -08:00
Quentin Colombet	98ceac4981	[GISel][CombinerHelper] Use uses() instead of operands() when traversing use operands. NFC	2019-11-15 13:54:33 -08:00
Quentin Colombet	304abde077	[GISel][CombinerHelper] Add support for scalar type for the result of shuffle vector 1-element vectors in LLVM IR get lowered into scalars in GISel. As a result, shuffle vector may also produce a scalar. This patch teaches the shuffle combiner how to deal with scalars when they are in the destination type of a shuffle vector. For now, we just support the easy case where this can be lowered to a plain copy. For other cases, we leave the shuffle vector as is. This type of IR is seen in O0 pipelines. E.g., as produced with SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c. rdar://problem/57198904	2019-11-15 13:54:33 -08:00
Aditya Nandakumar	7276868556	[MirNamer][Canonicalizer]: Perform instruction semantic based renaming https://reviews.llvm.org/D70210 Previously: Due to sensitivity of the algorithm with gaps, and extra instructions, when diffing, often we see naming being off by a few. Makes the diff unreadable even for tests with 7 and 8 instructions respectively. Naming can change depending on candidates (and order of picking candidates). Suddenly if there's one extra instruction somewhere, the entire subtree would be named completely differently. No consistent naming of similar instructions which occur in different functions. If we try to do something like count the frequency distribution of various differences across suite, then the above sensitivity issues are going to result in poor results. Instead: Name instruction based on semantics of the instruction (hash of the opcode and operands). Essentially for a given instruction that occurs in any module/function it'll be named similarly (ie semantic). This has some nice properties Can easily look at many instructions and just check the hash and if they're named similarly, then it's the same instruction. Makes it very easy to spot the same instruction both multiple times, as well as across many functions (useful for frequency distribution). Independent of traversal/candidates/depth of graph. No need to keep track of last index/gaps/skip count etc. No off by few issues with diffs. I've tried the old vs new implementation in files ranging from 30 to 700 instructions. In both cases with the old algorithm, diffs are a sea of red, where as for the semantic version, in both cases, the diffs line up beautifully. Simplified implementation of the main loop (simple iteration) , no keep track of what's visited and not. Handle collision just by incrementing a counter. Roughly bb[N]_hash_[CollisionCount]. 
Additionally with the new implementation, we can probably avoid doing the hoisting of instructions to various places, as they'll likely be named the same resulting in differences only based on collision (ie regardless of whether the instruction is hoisted or not/close to use or not, it'll be named the same hash which should result in use of the instruction be identical with the only change being the collision count) which is very easy to spot visually.	2019-11-15 08:38:54 -08:00
diggerlin	3dfa975fb3	Add read-only data assembly writing for aix SUMMARY: The patch will emit read-only variable assembly code for aix. Reviewers: daltenty,Xiangling_Liao Subscribers: rupprecht, seiyai,hiraditya Differential Revision: https://reviews.llvm.org/D70182	2019-11-15 11:30:19 -05:00
Serge Pavlov	e6584b2b7b	Move floating point related entities to namespace level Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. It was, however, inconvenient, as it required including constrained intrinsics definitions even if they were not needed. Also, using a long scope prefix reduced readability. This change moves these definitions to the namespace llvm::fp. No functional changes. Differential Revision: https://reviews.llvm.org/D69552	2019-11-15 19:56:33 +07:00
Jay Foad	c953e061b4	[CodeGen] Increase the size of a SmallVector The SmallVector reserve() call in MachineInstrExpressionTrait::getHashValue accounted for over 3% of all calls to malloc() when I compiled a bunch of graphics shaders for the AMDGPU target. Its initial size was only enough for machine instructions with up to 7 operands, but for AMDGPU 8 and 10 operands are very common. Here's a histogram of number of operands for each call to getHashValue, gathered from the same collection of shaders (operands: calls): 1: 13503, 2: 254273, 3: 135781, 4: 422508, 5: 614997, 6: 194953, 7: 287248, 8: 1517255, 9: 31218, 10: 1191269, 11: 70731, 12: 24, 13: 77, 15: 84, 17: 4692, 27: 16, 33: 705, 49: 6. Typical instructions with 8 and 10 operands are floating point arithmetic and multiply-accumulate instructions like: %83:vgpr_32 = V_MUL_F32_e64 0, killed %82:vgpr_32, 0, killed %81:vgpr_32, 0, 0, implicit $exec %330:vgpr_32 = V_MAC_F32_e64 0, killed %327:vgpr_32, 0, killed %329:sgpr_32, 0, %328:vgpr_32(tied-def 0), 0, 0, implicit $exec Differential Revision: https://reviews.llvm.org/D70301	2019-11-15 11:32:11 +00:00
Matt Arsenault	bc276c6379	GlobalISel: Lower s1 source G_SITOFP/G_UITOFP	2019-11-15 13:37:20 +05:30
Reid Kleckner	4c1a1d3cf9	Add missing includes needed to prune LLVMContext.h include, NFC These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280	2019-11-14 15:23:15 -08:00
Vedant Kumar	1ee84e5ab2	[DebugInfo] Allow spill slots in call site parameter descriptions Allow call site parameter descriptions to reference spill slots. Spill slots are not visible to high-level LLVM IR, so they can safely be referenced during entry value evaluation (as they cannot be clobbered by some other function). This gives a 5% increase in the number of call site parameter DIEs in an LTO x86_64 build of the xnu kernel. This reverts commit `eb4c98ca3d` ( [DebugInfo] Exclude memory location values as parameter entry values), effectively reintroducing the portion of D60716 which dealt with memory locations (authored by Djordje, Nikola, Ananth, and Ivan). This partially addresses llvm.org/PR43343. However, not all memory operands forwarded to callees live in spill slots. In the xnu build, it may be possible to use an escape analysis to increase the number of call site parameters by another 15% (more details in PR43343). Differential Revision: https://reviews.llvm.org/D70254	2019-11-14 12:48:51 -08:00
Daniel Sanders	b2839c442e	[globalisel][irtanslator] The IRTranslator should preserve TBAA information	2019-11-14 12:11:27 -08:00
Sumanth Gundapaneni	7c7e368a7f	[Pipeliner] Fix an assertion caused by iterator invalidation.	2019-11-14 13:08:06 -06:00
Craig Topper	17bb2d7c80	[ExpandReductions] Don't push all intrinsics to the worklist. Just push reductions. We were previously pushing all intrinsics used in a function to the worklist. This is wasteful for memory in a function with a lot of intrinsics. We also ask TTI if we should expand every intrinsic, but we only have expansion support for the reduction intrinsics. This just wastes time for the non-reduction intrinsics. This patch only pushes reduction intrinsics into the worklist and skips other intrinsics. Differential Revision: https://reviews.llvm.org/D69470	2019-11-14 10:26:53 -08:00
Reid Kleckner	5fe3f00ae2	Replace wrongly deleted header banner, fix formatting I reviewed the diff hunks of `05da2fe521` that don't contain '#include' lines, and found two unintended changes. I deleted a header banner inadvertently while inserting a header, and changed the indentation of a constructor in an odd way. Add back the banner, and reformat the constructor.	2019-11-14 10:21:42 -08:00
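
The `(X % C1) == 0 -> X * C3 <= C4` lowering quoted in commit 3f46022e33 above can be sanity-checked by brute force at 8 bits. What follows is a minimal Python sketch of that arithmetic plus the "blend" variant (variant 1), assuming only the formulas written in the commit message; the helper names (`ctz`, `rotr`, `urem_eq_zero`, `urem_eq_blend`) are hypothetical and are not LLVM APIs.

```python
BITS = 8
MASK = (1 << BITS) - 1


def ctz(n):
    """Count trailing zero bits of a non-zero integer."""
    return (n & -n).bit_length() - 1


def rotr(v, k):
    """Rotate the BITS-wide value v right by k bits."""
    k %= BITS
    return ((v >> k) | (v << (BITS - k))) & MASK


def urem_eq_zero(x, c1):
    """Evaluate (x % c1) == 0 using a multiply, a rotate and a compare."""
    k = ctz(c1)
    # C3: multiplicative inverse of the odd part of C1 modulo 2**BITS
    # (three-argument pow with a negative exponent needs Python 3.8+).
    c3 = pow(c1 >> k, -1, 1 << BITS)
    c4 = MASK // c1  # corresponds to `-1 /u C1` in the Alive script
    return rotr((x * c3) & MASK, k) <= c4


def urem_eq_blend(x, c1, c2):
    """Variant 1: select the always-false answer when C1 u<= C2."""
    assert c2 == 0 or c1 <= c2  # precondition taken from the commit message
    is_tautologically_false = c1 <= c2
    return False if is_tautologically_false else urem_eq_zero(x, c1)
```

For example, `urem_eq_zero(12, 6)` and `12 % 6 == 0` agree, and an exhaustive loop over all 8-bit `x` and non-zero `c1` is cheap enough to run as a check.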
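
Commit 257acbf6ae above combines `(uaddo (uaddo x, y), carryIn)` into `addcarry x, y, carryIn` when the two carry flags are merged via OR or XOR. A small Python model of why that is sound at 8 bits, as a sketch only: `uaddo`, `addcarry` and `uaddo_diamond` here are hypothetical stand-ins for the DAG nodes, not LLVM code. The two merges are interchangeable because the carries can never both be set: if `x + y` wraps, the partial sum is at most 254, so adding a 0/1 carry-in cannot wrap again.

```python
import operator

BITS = 8
MASK = (1 << BITS) - 1


def uaddo(a, b):
    """Unsigned add with overflow flag: returns (wrapped sum, carry out)."""
    total = a + b
    return total & MASK, total > MASK


def addcarry(x, y, carry_in):
    """Three-input add: returns (wrapped sum, carry out)."""
    total = x + y + carry_in
    return total & MASK, total > MASK


def uaddo_diamond(x, y, carry_in, merge):
    """The two-uaddo pattern, with the carries combined by `merge`."""
    partial, carry1 = uaddo(x, y)
    result, carry2 = uaddo(partial, carry_in)
    return result, merge(carry1, carry2)
```

Passing `operator.or_` or `operator.xor` as `merge` gives the same result as `addcarry` for every 8-bit input in this model.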


28401 Commits