llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	eb0e1978df	[TargetLowering] SimplifyDemandedBits - call SimplifyMultipleUseDemandedBits for ISD::EXTRACT_VECTOR_ELT (REAPPLIED) This patch attempts to peek through vectors based on the demanded bits/elt of a particular ISD::EXTRACT_VECTOR_ELT node, allowing us to avoid dependencies on ops that have no impact on the extract. In particular this helps remove some unnecessary scalar->vector->scalar patterns. The wasm shift patterns are annoying - @tlively has indicated that the wasm vector shift codegen is to be refactored in the near term, so this isn't considered a major issue. Reapplied after reversion at rL368660 due to PR42982 which was fixed at rGca7fdd41bda0. Differential Revision: https://reviews.llvm.org/D65887	2020-01-04 13:15:50 +00:00
Matt Arsenault	21309eafde	GlobalISel: Add type argument to getRegBankFromRegClass AMDGPU can't unambiguously go back from the selected instruction register class to the register bank without knowing if this was used in a boolean context.	2020-01-03 16:25:10 -05:00
Sanjay Patel	ca7fdd41bd	[DAGCombiner] fix miscompile in translating (X & undef) to shuffle See PR42982 for more context: https://bugs.llvm.org/show_bug.cgi?id=42982	2020-01-03 14:58:49 -05:00
Craig Topper	7cdc60c3db	[LegalizeVectorOps] Pass the post-UpdateNodeOperands version of Op to ExpandLoad/ExpandStore UpdateNodeOperands might CSE to another existing node. So we should make sure we're legalizing that node otherwise we might fail to hook up the operands properly. I've moved the result registration up to the caller to avoid having to pass both Result and Op into the functions where it might be confusing which is which. This addresses 2 other issues pointed out in D71861. Differential Revision: https://reviews.llvm.org/D72021	2020-01-03 11:53:08 -08:00
Reid Kleckner	9c2b72821b	Move tail call disabling code to target independent code When the "disable-tail-calls" attribute was added, checks were added for it in various backends. Now this code has proliferated, and it is something the target is responsible for checking. Move that responsibility back to the ISels (fast, global, and SD). There's no major functionality change, except for targets that never implemented this check. This LLVM attribute was originally added in `d9699bc7bd` (2015). Reviewers: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D72118	2020-01-03 11:27:41 -08:00
Roman Lebedev	0727e2b90c	[DAGCombiner][X86][AArch64] Generalize `A-(A&B)`->`A&(~B)` fold (PR44448) The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should/can just be 'A - (A & B)' -> 'A & (~B)' Even if we don't manage to fold `~` into B, we have likely formed `ANDN` node. Also, this way there's less similar-but-duplicate folds. Name: X - (X & Y) -> X & (~Y) %o = and i32 %X, %Y %r = sub i32 %X, %o => %n = xor i32 %Y, -1 %r = and i32 %X, %n https://rise4fun.com/Alive/kOUl See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:47 +03:00
Roman Lebedev	86403c0ff8	[DAGCombiner] `~(add X, -1)` -> `neg X` fold The fold 'A - (A & (B - 1))' -> 'A & (0 - B)' added in `8dab0a4a7d` is too specific. It should just be 'A - (A & B)' -> 'A & (~B)', but we currently fail to sink that '~' into `(B - 1)`. Name: ~(X - 1) -> (0 - X) %o = add i32 %X, -1 %r = xor i32 %o, -1 => %r = sub i32 0, %X https://rise4fun.com/Alive/rjU	2020-01-03 17:55:46 +03:00
Roman Lebedev	3d492d7503	[DAGCombine][X86][Thumb2/LowOverheadLoops] `A - (A & C)` -> `A & (~C)` fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but I'm not sure it is fully considered canonical yet, and likely not everything is fully aware of it. Name: PR44448 ptr - (ptr & C) -> ptr & (~C) %bias = and i32 %ptr, C %r = sub i32 %ptr, %bias => %r = and i32 %ptr, ~C See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 17:55:45 +03:00
Roman Lebedev	1711be78f7	[NFC][DAGCombine] Clarify comment for 'A - (A & (B - 1))' fold	2020-01-03 17:55:42 +03:00
Jay Foad	8382f87145	Fix typo "psuedo" in comments	2020-01-03 14:05:58 +00:00
Roman Lebedev	8dab0a4a7d	[DAGCombine][X86][AArch64] 'A - (A & (B - 1))' -> 'A & (0 - B)' fold (PR44448) While we do manage to fold integer-typed IR in middle-end, we can't do that for the main motivational case of pointers. There is @llvm.ptrmask() intrinsic which may or may not be helpful, but I'm not sure it is fully considered canonical yet, and likely not everything is fully aware of it. https://rise4fun.com/Alive/ZVdp Name: ptr - (ptr & (alignment-1)) -> ptr & (0 - alignment) %mask = add i64 %alignment, -1 %bias = and i64 %ptr, %mask %r = sub i64 %ptr, %bias => %highbitmask = sub i64 0, %alignment %r = and i64 %ptr, %highbitmask See https://bugs.llvm.org/show_bug.cgi?id=44448 https://reviews.llvm.org/D71499	2020-01-03 13:58:36 +03:00
QingShan Zhang	2133d3c558	[DAGCombine] Initialize the default operation action for SIGN_EXTEND_INREG for vector type as 'expand' instead of 'legal' Until now, the default operation action for SIGN_EXTEND_INREG on vector types was never set, so it stayed at 0, which means legal. However, most targets don't have native instructions to support this opcode. It should be set to expand by default, as we did for ANY_EXTEND_VECTOR_INREG. Differential Revision: https://reviews.llvm.org/D70000	2020-01-03 03:26:41 +00:00
Matt Arsenault	0d9f919b73	DAG: Use TargetConstant for FENCE operands	2020-01-02 17:16:10 -05:00
Fangrui Song	87fb204e8f	[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm	2020-01-02 09:44:23 -08:00
Ulrich Weigand	63336795f0	[FPEnv] Default NoFPExcept SDNodeFlag to false The NoFPExcept bit in SDNodeFlags currently defaults to true, unlike all other such flags. This is a problem, because it implies that all code that transforms SDNodes without copying flags can introduce a correctness bug, not just a missed optimization. This patch changes the default to false. This makes it necessary to move setting the (No)FPExcept flag for constrained intrinsics from the visitConstrainedIntrinsic routine to the generic visit routine at the place where the other flags are set, or else the intersectFlagsWith call would erase the NoFPExcept flag again. In order to avoid making non-strict FP code worse, whenever SelectionDAGISel::SelectCodeCommon matches on a set of original nodes none of which can raise FP exceptions, it will preserve this property on all result nodes generated, by setting the NoFPExcept flag on those result nodes that would otherwise be considered as raising an FP exception. To check whether or not an SD node should be considered as raising an FP exception, the following logic applies: - For machine nodes, check the mayRaiseFPException property of the underlying MI instruction - For regular nodes, check isStrictFPOpcode - For target nodes, check a newly introduced isTargetStrictFPOpcode The latter is implemented by reserving a range of target opcodes, similarly to how memory opcodes are identified. (Note that there is a bit of a quirk in identifying target nodes that are both memory nodes and strict FP nodes. To simplify the logic, right now all target memory nodes are automatically also considered strict FP nodes -- this could be fixed by adding one more range.) Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71841	2020-01-02 16:59:45 +01:00
Qiu Chaofan	bdf4224f9c	[NFC] Add explicit instantiation to releaseNode Resolve a build failure about undefined symbols introduced by `f9f78cf`. Differential Revision: https://reviews.llvm.org/D72069	2020-01-02 21:16:22 +08:00
Craig Topper	dac98a2205	[RegisterClassInfo] Use SmallVector::assign instead of resize to make sure we erase previous contents from all entries of the vector. resize only writes to elements that get added. Any elements that already existed maintain their previous value. In this case we're trying to erase cached information so we should use assign which will write to every element. Found while trying to add new tests to an existing X86 test and noticed register allocation changing in other functions.	2020-01-01 18:53:12 -08:00
Lorenzo Casalino	f9f78cf6ac	[MachineScheduler] improve reuse of 'releaseNode' method The 'SchedBoundary::releaseNode' is merely invoked for releasing the Top/Bottom root nodes. However, 'SchedBoundary::releasePending' uses the same logic to check if the Pending queue has any releasable SUnit. It is possible to slightly modify the body of the two, allowing re-use of the former ('releaseNode') in the latter. Patch by Lorenzo Casalino <lorenzo.casalino93@gmail.com> Reviewers: MatzeB, fhahn, atrick Reviewed By: fhahn Differential Revision: https://reviews.llvm.org/D65506	2020-01-01 20:22:32 +00:00
Mark de Wever	8dc7b982b4	[NFC] Fixes -Wrange-loop-analysis warnings This avoids new warnings due to D68912 adding -Wrange-loop-analysis to -Wall. Differential Revision: https://reviews.llvm.org/D71857	2020-01-01 20:01:37 +01:00
Matt Arsenault	4d7201e7b9	DAG: Stop trying to fold FP -(x-y) -> y-x in getNode with nsz This was increasing the number of instructions when fsub was legalized on AMDGPU with no signed zeros enabled. This fold should be guarded by hasOneUse, and I don't think getNode should be doing that. The same fold is already done as a regular combine through isNegatibleForFree. This does require duplicating, even though isNegatibleForFree does this combine already (and properly checks hasOneUse) to avoid one PPC regression. In the regression, the outer fneg has nsz but the fsub operand does not. isNegatibleForFree only sees the operand, and doesn't see it's used from a nsz context. A nsz parameter needs to be added and threaded through isNegatibleForFree to avoid this.	2019-12-31 22:49:51 -05:00
Craig Topper	4ae3120ed8	[LegalizeVectorOps][AArch64] Stop asking for v4f16 fp_round and fp_extend to be promoted. These operations are needed as building blocks for promoting so they can't be promoted themselves. This appeared to work because the fp_extend query type for operation actions is the result type, not the input type so it never triggered in the legalizer. For fp_round, the vector op legalizer just ended up creating a nop fp_extend that was elided by getNode, followed by a nop fp_round that was also elided by getNode. This was followed by a final fp_round from v4f32 back to v4f16 which was CSEd to the original node. Then legalize vector ops just believed that node legalized to itself. LegalizeDAG took another crack at promoting it, but didn't have a handler so just skipped it with a debug message saying it wasn't promoted. This patch just removes the operation actions to avoid this nonsense. Found while trying to refactor LegalizeVectorOps to handle multiple result nodes better.	2019-12-31 15:04:12 -08:00
Sam Parker	b409f73e1f	[ARM][TypePromotion] Re-enable by default Re-enable the pass after it was reverted and the bug fixed.	2019-12-31 11:31:06 +00:00
Craig Topper	787e078f3e	[TargetLowering][AMDGPU] Make scalarizeVectorLoad return a pair of SDValues instead of creating a MERGE_VALUES node. NFCI This allows us to clean up some places that were peeking through the MERGE_VALUES node after the call. By returning the SDValues directly, we can clean that up. Unfortunately, there are several call sites in AMDGPU that wanted the MERGE_VALUES and now need to create their own.	2019-12-30 19:36:04 -08:00
Fangrui Song	03b9f0a5e1	Ignore "no-frame-pointer-elim" and "no-frame-pointer-elim-non-leaf" in favor of "frame-pointer" D56351 (included in LLVM 8.0.0) introduced "frame-pointer". All tests which use "no-frame-pointer-elim" or "no-frame-pointer-elim-non-leaf" have been migrated to use "frame-pointer". Implement UpgradeFramePointerAttributes to upgrade the two obsoleted function attributes for bitcode. Their semantics are ignored. Differential Revision: https://reviews.llvm.org/D71863	2019-12-30 09:46:19 -08:00
Petar Avramovic	98f72a5107	[MIPS GlobalISel] Select bitreverse. Recommit G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang generates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Recommit notes: Introduce temporary variables in order to make sure instructions get inserted into MachineFunction in same order regardless of compiler used to build llvm. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 18:06:29 +01:00
Matt Arsenault	9fd31fdbd3	GlobalISel: moreElementsVector for FP min/max	2019-12-30 10:39:53 -05:00
Dmitri Gribenko	32cc14100e	Revert "[MIPS GlobalISel] Select bitreverse" This reverts commit `dbc136e0fe`. It broke buildbots: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/21066	2019-12-30 14:29:47 +01:00
Petar Avramovic	dbc136e0fe	[MIPS GlobalISel] Select bitreverse G_BITREVERSE is generated from llvm.bitreverse.<type> intrinsics, clang generates these intrinsics from __builtin_bitreverse32 and __builtin_bitreverse64. Add lower and narrowscalar for G_BITREVERSE. Lower G_BITREVERSE on MIPS32. Differential Revision: https://reviews.llvm.org/D71363	2019-12-30 11:26:45 +01:00
Petar Avramovic	94a24e7a40	[MIPS GlobalISel] Select bswap G_BSWAP is generated from llvm.bswap.<type> intrinsics, clang generates these intrinsics from __builtin_bswap32 and __builtin_bswap64. Add lower and narrowscalar for G_BSWAP. Lower G_BSWAP on MIPS32, select G_BSWAP on MIPS32 revision 2 and later. Differential Revision: https://reviews.llvm.org/D71362	2019-12-30 11:13:22 +01:00
Kai Luo	cd2a73a9f0	[MCP] Add stats for backward copy propagation. NFC.	2019-12-30 16:48:28 +08:00
Fangrui Song	6f9b4c6826	[SelectionDAG] Simplify SelectionDAGBuilder::visitInlineAsm Indirect C_Immediate or C_Other constraints have been excluded. Also simplify an unneeded change to indirect 'X' by D60942.	2019-12-29 20:53:30 -08:00
Fangrui Song	5edb40c022	[SelectionDAG] Disallow indirect "i" constraint This allows us to delete InlineAsm::Constraint_i workarounds in SelectionDAGISel::SelectInlineAsmMemoryOperand overrides and TargetLowering::getInlineAsmMemConstraint overrides. They were introduced to X86 in r237517 to prevent crashes for constraints like "=*imr". They were later copied to other targets.	2019-12-29 16:50:42 -08:00
Simon Pilgrim	34769e0783	SimplifyDemandedBits - Remove duplicate getOperand() call. NFC. Pulled out from D56387 - cleanup variable names, move shift amount legalization inside if() of its only user and remove duplicate getOperand() call.	2019-12-28 16:42:50 +00:00
Craig Topper	a3f8964813	[TargetLowering] Update comment to reference the correct compiler-rt function the code is based on. NFC	2019-12-27 22:49:04 -08:00
Fangrui Song	044cc919f4	Delete setjmp_undefined_for_msvc workaround after llvm.setjmp was removed	2019-12-27 18:09:22 -08:00
Matt Arsenault	3213ce966b	TailDuplication: Clear NoPHIs property The early tail duplicator pass introduces new ones, so a MIR test that infers no phis since there were none on the input would fail the verifier after running.	2019-12-27 14:06:31 -05:00
Fangrui Song	7a7334663c	Delete llvm.{sig,}{setjmp,longjmp} remnant after r136821 Intrinsic has incorrect argument type! i32 (i32) @llvm.setjmp wipes tear	2019-12-27 00:00:14 -08:00
Craig Topper	53ee806d93	[X86][FPEnv] Promote some float strictfp operations to double on i686-pc-windows-msvc to match what we do for non-strict. The float libcalls are inlined in MSVC's math header where they just cast to double and use the double libcall. Do the same when we emit libcalls.	2019-12-26 20:22:24 -08:00
Kristina Bessonova	cdd25a4c74	[DebugInfo][SelectionDAG] Change order while transferring SDDbgValue to another node SelectionDAG::transferDbgValues() can 'reattach' SDDbgValue from one node to another, but doesn't change its source order. If the destination node has an order greater than the SDDbgValue, there are two possible issues revealed later: * If debug info is attached to an instruction that is the first definition of a register, this ends up with a def-after-use and the debug info gets 'undef' later. * If MIR has another definition of a register above the debug info, the debug info may represent a source variable incorrectly because it appears (significantly) before the instruction corresponding to this debug info. So, the patch changes the order of an SDDbgValue when it is moved to a node with greater order. Reviewers: dblaikie, jmorse, aprantl Reviewed By: aprantl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71175	2019-12-26 21:01:59 +03:00
Wang, Pengfei	472bded3ed	[X86] Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Summary: Enable STRICT_SINT_TO_FP/STRICT_UINT_TO_FP on X86 backend Reviewers: craig.topper, RKSimon, LiuChen3, uweigand, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71871	2019-12-26 08:15:13 +08:00
Matt Arsenault	0d47399167	GlobalISel: Update syntax in debug printing Physical register names now start with $, not %	2019-12-24 10:37:36 -05:00
Matt Arsenault	9b61641564	GlobalISel: Fix naming variables "brank" instead of "bank"	2019-12-24 10:36:54 -05:00
Sam Parker	42dba633a3	[TypePromotion] Make TypeSize a class member Having TypeSize as a static class variable was causing problems with multi-threading. Several static functions have now been converted into methods of TypePromotion and a few other members of TypePromotion and IRPromoter have been added or removed. Differential Revision: https://reviews.llvm.org/D71832	2019-12-24 05:04:35 -05:00
David Blaikie	fccac1ec16	DebugInfo: Correct the form of DW_AT_macro_info in .dwo files (sec_offset, rather than data4)	2019-12-24 01:23:21 -08:00
David Blaikie	83c7a424d9	DebugInfo: Add {} to address -Wdangling-else warning.	2019-12-24 01:14:15 -08:00
Sourabh Singh Tomar	0a72515d33	[DebugInfo] Fix v4 macinfo for dwo files. Dwo files must contain the DW_AT_macro_info attribute when macro information is emitted. Adjusted the test case for the same.	2019-12-24 12:50:34 +05:30
Fangrui Song	e0d855b399	[SelectionDAG] Change SelectionDAGISel::{funcInfo,SDB} to use unique_ptr CurDAG is referenced more than 2000 times and used in many generated .cpp files. Don't touch it for now.	2019-12-23 22:41:05 -08:00
Fangrui Song	01b98e6fd5	[SelectionDAG] Don't repeatedly add a node to the worklist in ComputeLiveOutVRegInfo. NFC For sqlite3 amalgamation, this decreases the number of Worklist.push_back calls (603084) by 10%.	2019-12-23 22:04:14 -08:00
Ulrich Weigand	0d3f782e41	[FPEnv][X86] More strict int <-> FP conversion fixes Fix several additional problems with the int <-> FP conversion logic both in common code and in the X86 target. In particular: - The STRICT_FP_TO_UINT expansion emits a floating-point compare. This compare can raise exceptions and therefore needs to be a strict compare. I've made it signaling (even though quiet would also be correct) as signaling is the more usual default for an LT. This code exists both in common code and in the X86 target. - The STRICT_UINT_TO_FP expansion algorithm was incorrect for strict mode: it emitted two STRICT_SINT_TO_FP nodes and then used a select to choose one of the results. This can cause spurious exceptions by the STRICT_SINT_TO_FP that ends up not chosen. I've fixed the algorithm to use only a single STRICT_SINT_TO_FP instead. - The !isStrictFPEnabled logic in DoInstructionSelection would sometimes do the wrong thing because it calls getOperationAction using the result VT. But for some opcodes, including [SU]INT_TO_FP, getOperationAction needs to be called using the operand VT. - Remove some (obsolete) code in X86DAGToDAGISel::Select that would mutate STRICT_FP_TO_[SU]INT to non-strict versions unnecessarily. Reviewed by: craig.topper Differential Revision: https://reviews.llvm.org/D71840	2019-12-23 21:11:45 +01:00
Sanjay Patel	8cefc37be5	[DAGCombine] visitEXTRACT_SUBVECTOR - 'little to big' extract_subvector(bitcast()) support This moves the X86 specific transform from rL364407 into DAGCombiner to generically handle 'little to big' cases (for example: extract_subvector(v2i64 bitcast(v16i8))). This allows us to remove both the x86 implementation and the aarch64 bitcast(extract_subvector(bitcast())) combine. Earlier patches that dealt with regressions initially exposed by this patch: rG5e5e99c041e4 rG0b38af89e2c0 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D63815	2019-12-23 10:11:45 -05:00
Martin Storsjö	5a751e747d	[AArch64] [Windows] Use COFF stubs for calls to extern_weak functions As the extern_weak target might be missing, resolving to the absolute address zero, we can't use the normal direct PC-relative branch instructions (as that would result in relocations out of range). Improve the classifyGlobalFunctionReference method to set MO_DLLIMPORT/MO_COFFSTUB, and simplify the existing code in AArch64TargetLowering::LowerCall to use the return value from classifyGlobalFunctionReference for these cases. Add code in both AArch64FastISel and GlobalISel/IRTranslator to bail out for function calls to extern weak functions on windows, to let SelectionDAG handle them. This matches what was done for X86 in `6bf108d77a`. Differential Revision: https://reviews.llvm.org/D71721	2019-12-23 12:13:49 +02:00
Carl Ritson	2791667d2e	[DAGCombiner] Check term use before applying aggressive FSUB optimisations Summary: Without this check unnecessary FMA instructions are generated when the FSUB terms are reused. This also has the side-effect that the same value is computed to different levels of precision, which can create undesirable effects if the results are used together in subsequent computation. Reviewers: arsenm, nhaehnle, foad, tpr, dstuttard, spatel Reviewed By: arsenm Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71656	2019-12-23 09:37:58 +09:00
Valentin Churavy	fb0ccff6e5	[SelectionDAG] Copy FP flags when visiting a binary instruction. Summary: We noticed in Julia that the sequence below no longer turned into a sequence of FMA instructions in LLVM 7+, but it did in LLVM 6. ``` %29 = fmul contract <4 x double> %wide.load, %wide.load16 %30 = fmul contract <4 x double> %wide.load13, %wide.load17 %31 = fmul contract <4 x double> %wide.load14, %wide.load18 %32 = fmul contract <4 x double> %wide.load15, %wide.load19 %33 = fadd fast <4 x double> %vec.phi, %29 %34 = fadd fast <4 x double> %vec.phi10, %30 %35 = fadd fast <4 x double> %vec.phi11, %31 %36 = fadd fast <4 x double> %vec.phi12, %32 ``` Unlike Clang, Julia doesn't set the `unsafe-fp-math=true` function attribute, but rather emits more local instruction flags. This partially undoes https://reviews.llvm.org/D46854 and if required I can try to minimize the test further. Reviewers: spatel, mcberg2017 Reviewed By: spatel Subscribers: chriselrod, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71495	2019-12-22 14:29:36 -05:00
Reid Kleckner	b2c1ba5b1f	Revert "[ARM][TypePromotion] Enable by default" This reverts commit `ee7579409b`. It causes crashes during ThinLTO. I suspect the issue is related to races on the global TypeSize variable, which is 80 at the time of the crash.	2019-12-22 11:27:11 -08:00
Eric Astor	dc5b614fa9	[ms] [X86] Use "P" modifier on operands to call instructions in inline X86 assembly. Summary: This is documented as the appropriate template modifier for call operands. Fixes PR44272, and adds a regression test. Also adds support for operand modifiers in Intel-style inline assembly. Reviewers: rnk Reviewed By: rnk Subscribers: merge_guards_bot, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D71677	2019-12-22 09:16:34 -05:00
David Blaikie	d0bfb3c583	DebugInfo: Remove out of date comment	2019-12-21 23:13:26 -08:00
Jessica Paquette	d5750770eb	[NFC][MachineOutliner] Rewrite setSuffixIndices to be iterative Having this function be recursive could use up way too much stack space. Rewrite it as an iterative traversal in the tree instead to prevent this. Fixes PR44344.	2019-12-20 16:12:37 -08:00
Vedant Kumar	fa4701e197	[DWARF] Defer creating declaration DIEs until we prepare call site info It isn't necessary to create DIEs for all of the declaration subprograms in a CU's retainedTypes list. We can defer creating these subprograms until we need to prepare a call site tag that refers to one. This cleanup was mentioned in passing in D70350.	2019-12-20 15:26:31 -08:00
Vedant Kumar	79daafc903	Reland: [DWARF] Allow cross-CU references of subprogram definitions This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. Update: Reland with a fix to create a declaration DIE when the declaration is missing from the CU's retainedTypes list. The declaration is left out of the retainedTypes list in two cases: 1) Re-compiling pre-r266445 bitcode (in which declarations weren't added to the retainedTypes list), and 2) Doing LTO function importing (which doesn't update the retainedTypes list). It's possible to handle (1) and (2) by modifying the retainedTypes list (in AutoUpgrade, or in the LTO importing logic resp.), but I don't see an advantage to doing it this way, as it would cause more DWARF to be emitted compared to creating the declaration DIEs lazily. Tested with a stage2 ThinLTO+RelWithDebInfo build of clang, and with a ReleaseLTO-g build of the test suite. rdar://46577651, rdar://57855316, rdar://57840415 Differential Revision: https://reviews.llvm.org/D70350	2019-12-20 15:26:31 -08:00
Yury Delendik	adf7a0a558	[WebAssembly] Use TargetIndex operands in DbgValue to track WebAssembly operands locations Extends DWARF expression language to express locals/globals locations. (via target-index operands atm) (possible variants are: non-virtual registers or address spaces) The WebAssemblyExplicitLocals can replace virtual registers with the target-index operand type at the time when WebAssembly backend introduces {get,set,tee}_local instead of corresponding virtual registers. Reviewed By: aprantl, dschuff Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D52634	2019-12-20 14:39:05 -08:00
Adrian Prantl	44b4b833ad	Rename DW_AT_LLVM_isysroot to DW_AT_LLVM_sysroot This is a purely cosmetic change that is NFC in terms of the binary output. It bugs me that I called the attribute DW_AT_LLVM_isysroot since the "i" is an artifact of GCC command line option syntax (-isysroot is in the category of -i options) and doesn't carry any useful information otherwise. This attribute only appears in Clang module debug info. Differential Revision: https://reviews.llvm.org/D71722	2019-12-20 13:11:17 -08:00
Tom Weaver	453dc4d7ec	[OPT-DBG] Teach DbgEntityHistoryCalculator about meta-instructions. The calculator was considering instructions such as KILLs as clobbers of a physical address. This is wrong as meta instructions such as KILLs produce no output in the final program and thus don't clobber or change any physical location's value. As a result they're safe to ignore whilst calculating location list ranges. reviewers: aprantl, vsk diff revision: https://reviews.llvm.org/D70497 fixes: https://bugs.llvm.org/show_bug.cgi?id=38753	2019-12-20 14:03:34 +00:00
Sam Parker	acbc9aed72	[ARM][MVE] Fixes for tail predication. 1) Fix an issue with the incorrect value being used for the number of elements being passed to [d\|w]lstp. We were trying to check that the value was available at LoopStart, but this doesn't consider that the last instruction in the block could also define the register. Two helpers have been added to RDA for this. 2) Insert some code to now try to move the element count def or the insertion point so that we can perform more tail predication. 3) Related to (1), the same off-by-one could prevent us from generating a low-overhead loop when a mov lr could have been the last instruction in the block. 4) Fix up some instruction attributes so that not all the low-overhead loop instructions are labelled as branches and terminators - as this is not true for dls/dlstp. Differential Revision: https://reviews.llvm.org/D71609	2019-12-20 09:34:18 +00:00
Philip Reames	8277c91cf3	[StackMaps] Be explicit about label formation [NFC] (try 2) Recommit after making the same API change in non-x86 targets. This has been built for all targets, and tested for affected ones. Why the difference? Because my disk filled up when I tried make check for all. For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.	2019-12-19 14:05:30 -08:00
Eric Christopher	add710eb23	Temporarily Revert "[StackMaps] Be explicit about label formation [NFC]" as it broke the aarch64 build. This reverts commit `bc7595d934`.	2019-12-19 12:52:40 -08:00
Philip Reames	bc7595d934	[StackMaps] Be explicit about label formation [NFC] For auto-padding assembler support, we'll need to bundle the label with the instructions (nops or call sequences) so that they don't get separated. This just rearranges the code to make the upcoming change more obvious.	2019-12-19 12:38:44 -08:00
Philip Reames	cf6aafa47c	[FaultMaps] Make label formation a bit more explicit [NFC] This is in advance of assembler padding directives support where we'll need to bundle the label w/the corresponding faulting instruction to avoid padding being inserted between.	2019-12-19 12:38:44 -08:00
Craig Topper	e6e23a24be	[LegalizeDAG] Add return to the strict node handling in PromoteLegalINT_TO_FP to prevent an invalid strict fp node from being created by falling into non-strict code path.	2019-12-19 11:39:50 -08:00
Jay Foad	c5c935ab66	Make more use of MachineInstr::mayLoadOrStore.	2019-12-19 11:51:52 +00:00
Liu, Chen3	2f932b5729	Enable STRICT_FP_TO_SINT/UINT on X86 backend This patch is mainly for custom lowering the vector operation. Differential Revision: https://reviews.llvm.org/D71592	2019-12-19 14:49:13 +08:00
David Blaikie	aaa5a5e7ff	DebugInfo: Include DW_AT_base_addr even in gmlt with no inline functions Since the address pool doesn't get populated in this case (due to the lack of inlining, no child DIEs are added to the CU - so no addresses are needed for the DIEs themselves) until the range list is emitted - at the time the attributes are added to the CU, the address pool is empty. So check whether the address pool will be used for the range lists & add an addr_base if that's the case.	2019-12-18 17:14:28 -08:00
David Blaikie	64fa76ef55	Reapply "NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::List" Move these data structures closer together so their emission code can eventually share more of its implementation. Was an egregious bug (completely untested, evidently) where I hadn't inverted a DWARFv5 test as needed, so it was doing the exact opposite of what was required & thus tried to emit a DWARFv5 range list header in DWARFv4. Reapply `8e04896288` which was reverted in `a8154e5e0c`.	2019-12-18 16:28:19 -08:00
Ulrich Weigand	1946461344	[FPEnv] Strict versions of llvm.minimum/llvm.maximum Add new intrinsics llvm.experimental.constrained.minimum llvm.experimental.constrained.maximum as strict versions of llvm.minimum and llvm.maximum. Includes SystemZ back-end support. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D71624	2019-12-18 21:35:28 +01:00
Craig Topper	cfe316007f	[SelectionDAGBuilder] Use getConstant instead of getTargetConstant to build the offset for struct types in getUniformBase. getTargetConstant prevents any optimizations from operating on the value and basically says its already been iseled. But since we want the index to be in a register, this isn't true. Prior to this we were generating a vbroadcast with an immediate argument which is illegal and was flagged by the expensive checks bot.	2019-12-18 10:44:28 -08:00
stozer	89d19d60ad	Reapply: [DebugInfo] Correctly handle salvaged casts and split fragments at ISel This reverts commit `1f3dd83cc1`, reapplying commit `bb1b0bc4e5`. The original commit failed on some builds seemingly due to the use of a bracketed constructor with an std::array, i.e. `std::array<> arr({...})`.	2019-12-18 16:26:42 +00:00
Daniel Sanders	c3cb089a87	[gicombiner] Import tryCombineIndexedLoadStore() Summary: Now that arbitrary data is supported, import tryCombineIndexedLoadStore() Depends on D69147 Reviewers: bogner, volkan Reviewed By: volkan Subscribers: hiraditya, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69151	2019-12-18 14:41:38 +00:00
stozer	1f3dd83cc1	Revert "[DebugInfo] Correctly handle salvaged casts and split fragments at ISel" Reverted due to build failure on windows bots. This reverts commit `bb1b0bc4e5`.	2019-12-18 11:46:10 +00:00
stozer	bb1b0bc4e5	[DebugInfo] Correctly handle salvaged casts and split fragments at ISel Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. This patch enables the salvaging of casts by using the DW_OP_LLVM_convert operator for SExt and Trunc instructions. There is another issue which is exposed by this fix, in which fragment DIExpressions (which are preserved more readily by this patch) for values that must be split across registers in ISel trigger an assertion, as the 'split' fragments extend beyond the bounds of the fragment DIExpression causing an error. This patch also fixes this issue by checking the fragment status of DIExpressions which are to be split, and dropping fragments that are invalid.	2019-12-18 11:09:18 +00:00
Jay Foad	97ca7c2cc9	[AArch64] Enable clustering memory accesses to fixed stack objects Summary: r347747 added support for clustering mem ops with FI base operands including support for fixed stack objects in shouldClusterFI, but apparently this was never tested. This patch fixes shouldClusterFI to work with scaled as well as unscaled load/store instructions, and fixes the ordering of memory ops in MemOpInfo::operator< to ensure that memory addresses always increase, regardless of which direction the stack grows. Subscribers: MatzeB, kristof.beyls, hiraditya, javed.absar, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71334	2019-12-18 09:46:11 +00:00
Anna Welker	7cd1cfdd6b	[NFC][TTI] Add Alignment for isLegalMasked[Gather/Scatter] Add an extra parameter so alignment can be taken under consideration in gather/scatter legalization. Differential Revision: https://reviews.llvm.org/D71610	2019-12-18 09:14:39 +00:00
Wang, Pengfei	8cc0b58673	[X86] Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Summary: Add calculation for elements in structures in getting uniform base for the Gather/Scatter intrinsic. Reviewers: craig.topper, c-rhodes, RKSimon Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke Tags: #llvm Differential Revision: https://reviews.llvm.org/D71442	2019-12-18 12:24:58 +08:00
Craig Topper	c36773c78e	[FPEnv][LegalizeTypes] Make ScalarizeVecOp_STRICT_FP_ROUND do its own replacements and return SDValue() The caller will assert for nodes with more than 2 results unless we return a null SDValue. I tried to test this by copying an AArch64 test for ScalarizeVecOp_FP_ROUND. While it did hit the assert and this commit fixed that, it also hit a later problem that couldn't be fixed without adding strict FP support to AArch64.	2019-12-17 15:17:43 -08:00
Craig Topper	84d8fa30f9	[FPEnv][LegalizeTypes][LegalizeDAG][AArch64] Few fixes/improvements for legalizing fp<->int conversion nodes. This started with adding a test to support get code coverage on ScalarizeVecOp_UnaryOp_StrictFP by copying an existing AArch64 test and using constrained sitofp/uitofp intrinsics. This found 3 separate issues: -ScalarizeVecOp_UnaryOp_StrictFP needs to do its own replacement because the caller can't handle replacing multiple results. -Missing integer promotion support for sitofp/uitofp -Chain result not always assigned in ExpandLegalINT_TO_FP. Committing them together so I can add the test case.	2019-12-17 14:37:00 -08:00
Sourabh Singh Tomar	399273e5eb	Recommit "[DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission." This was reverted in `caa4120906`, since it was causing an assertion failure on Windows bots. This revision is revised to fix that. Original commit message - [DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission. Reviewers: dblaikie, aprantl, jini.susan.george Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71008	2019-12-18 02:12:59 +05:30
Sanjay Patel	6a77e36975	[SDAG] adjust isNegatibleForFree calculation to avoid crashing This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch adds a hack to work-around the case where we probably no longer detect that either multiply operand of an FMA isNegatibleForFree which is assumed to be true when we started rewriting the expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-17 13:49:15 -05:00
Sanjay Patel	5b0251da1c	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression()" This reverts commit `36b1232ec5`. Need to adjust commit message - that was a leftover from the earlier version.	2019-12-17 13:47:59 -05:00
Sanjay Patel	36b1232ec5	[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch adds a hack to work-around the case where we probably no longer detect that either multiply operand of an FMA isNegatibleForFree which is assumed to be true when we started rewriting the expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-17 13:46:06 -05:00
Amaury Séchet	ff6567cc77	[DAGCombiner] Add node back in the worklist in topological order in CommitTargetLoweringOpt Summary: Right now, DAGCombiner processes the nodes in an implementation-defined order. This tends to be fragile as optimisation may or may not kick in depending on the traversal order. This is part of a larger effort to get the DAGCombiner to process its nodes in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70921	2019-12-17 18:26:16 +01:00
Mitch Phillips	2423774cc2	Revert "Honor -fuse-init-array when os is not specified on x86" This reverts commit `aa5ee8f244`. This change broke the sanitizer buildbots. See comments at the patchset (https://reviews.llvm.org/D71360) for more information.	2019-12-17 07:36:59 -08:00
Kevin P. Neal	b1d8576b0a	This adds constrained intrinsics for the signed and unsigned conversions of integers to floating point. This includes some of Craig Topper's changes for promotion support from D71130. Differential Revision: https://reviews.llvm.org/D69275	2019-12-17 10:06:51 -05:00
alex-t	e7f585ed61	PostRA Machine Sink should take care of COPY defining register that is a sub-register by another COPY source operand Differential Revision: https://reviews.llvm.org/D71132	2019-12-17 15:20:43 +03:00
Guillaume Chatelet	531c1161b9	Resubmit "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" Summary: This is a resubmit of D71473. This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients. Functions will be deprecated one by one and as in tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: aaron.ballman, courbet Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71547	2019-12-17 10:07:46 +01:00
Raphael Isemann	ccfab8e459	[ObjC][DWARF] Emit DW_AT_APPLE_objc_direct for methods marked as __attribute__((objc_direct)) Summary: With DWARF5 it is no longer possible to distinguish normal methods and methods with `__attribute__((objc_direct))` by just looking at the debug information as they are both now children of the DW_TAG_structure_type that defines them (before only the `__attribute__((objc_direct))` methods were children). This means that in LLDB we are no longer able to create a correct Clang AST of a module by just looking at the debug information. Instead we would need to call the Objective-C runtime to see which of the methods have a `__attribute__((objc_direct))` and then add the attribute to our own Clang AST depending on what the runtime returns. This would mean that we either let the module AST be dependent on the Objective-C runtime (which doesn't seem right) or we retroactively add the missing attribute to the imported AST in our expressions. A third option is to annotate methods with `__attribute__((objc_direct))` as `DW_AT_APPLE_objc_direct` which is what this patch implements. This way LLDB doesn't have to call the runtime for any `__attribute__((objc_direct))` method and the AST in our module will already be correct when we create it. Reviewers: aprantl, SouraVX Reviewed By: aprantl Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D71201	2019-12-17 09:40:36 +01:00
Craig Topper	13ce7c1291	[LegalizeTypes] Pre-size the SmallVectors in ScalarizeVecRes_StrictFPOp and SplitVecRes_StrictFPOp so we don't have to call push_back. NFCI This avoids grow checking/handling in each iteration of the loop.	2019-12-16 23:42:13 -08:00
Craig Topper	c738ebc1f5	[LegalizeTypes] Remove ScalarizeVecRes_STRICT_FP_ROUND in favor of just using ScalarizeVecRes_StrictFPOp. NFCI It looks like ScalarizeVecRes_StrictFPOp can handle a variable number of arguments with scalar and vector types so it should be sufficient.	2019-12-16 23:42:13 -08:00
Craig Topper	c4d2bb1ede	[LegalizeTypes] Remove the call to SplitVecRes_UnaryOp from SplitVecRes_StrictFPOp. NFCI It doesn't seem to do anything that SplitVecRes_StrictFPOp can't do. SplitVecRes_StrictFPOp already handles nodes with a variable number of arguments and a mix of scalar and vector arguments.	2019-12-16 23:42:13 -08:00
Craig Topper	4e48513b47	[SelectionDAG] Add the fpexcept flag to the SelectionDAG dumping output so we can better see when its not propagating. We're currently losing this flag in type legalization and probably other places when we expand strict fp nodes. This will make reading logs easier.	2019-12-16 18:05:11 -08:00
Puyan Lotfi	204dfabfe6	[NFC][llvm][MIRVRegNamerUtils] Moving some switch cases and altering comments.	2019-12-16 18:50:26 -05:00
Puyan Lotfi	f63b64c0c3	[llvm][MIRVRegNamerUtils] Adding hashing on CImm / FPImm MachineOperands. This patch makes it so that cases where multiple instructions that differ only in their ConstantInt or ConstantFP MachineOperand values no longer collide. For instance: %0:_(s1) = G_CONSTANT i1 true %1:_(s1) = G_CONSTANT i1 false %2:_(s32) = G_FCONSTANT float 1.0 %3:_(s32) = G_FCONSTANT float 0.0 Prior to this patch the first two instructions would collide together. Also, the last two G_FCONSTANT instructions would also collide. Now they will no longer collide. Differential Revision: https://reviews.llvm.org/D71558	2019-12-16 18:25:04 -05:00
Kamlesh Kumar	aa5ee8f244	Honor -fuse-init-array when os is not specified on x86 Currently -fuse-init-array option is not effective when target triple does not specify os, on x86,x86_64. i.e. // -fuse-init-array is not honored. $ clang -target i386 -fuse-init-array test.c -S // -fuse-init-array is honored. $ clang -target i386-linux -fuse-init-array test.c -S This patch fixes first case. And does cleanup. Reviewers: rnk, craig.topper, fhahn, echristo Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D71360	2019-12-16 15:21:23 -08:00
Guillaume Chatelet	4658da10e4	Revert "[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove" This reverts commit `181ab91efc`.	2019-12-16 15:19:49 +01:00
Guillaume Chatelet	181ab91efc	[Alignment][NFC] Deprecate CreateMemCpy/CreateMemMove Summary: This patch introduces a set of functions to enable deprecation of IRBuilder functions without breaking out of tree clients. Functions will be deprecated one by one and as in tree code is cleaned up. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71473	2019-12-16 13:35:55 +01:00
Valentin Churavy	5c29e8c65f	[CodegenPrepare] Guard against degenerate branches Summary: Guard against a potential crash observed in https://github.com/JuliaLang/julia/issues/32994#issuecomment-524249628 If two branches are collapsed we can encounter a degenerate conditional branch `TBB==FBB`. The subsequent code assumes that they differ, so we exit out early. Reviewers: ributzka, spatel Subscribers: loladiro, dexonsmith, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66657	2019-12-16 04:23:32 -05:00
Sanjay Patel	2afe864118	[DAG] Add SimplifyDemandedBits support for BSWAP This exposes a shortcoming for AArch64, and that is tracked by PR40881: https://bugs.llvm.org/show_bug.cgi?id=40881 Patch by: @RKSimon (Simon Pilgrim) Differential Revision: https://reviews.llvm.org/D58017	2019-12-15 08:52:34 -05:00
Craig Topper	1dc0c8af5e	[LegalizeTypes] Teach BitcastToInt_ATOMIC_SWAP to only create FP16_TO_FP when called from PromoteFloatResult. There's also a call from SoftenFloatResult that should not be promoted. The change test case would fail with the new RUN line prior to this change.	2019-12-14 15:05:32 -08:00
Craig Topper	95ce8f9498	[LegalizeTypes] In PromoteFloatOp_SETCC, don't bother querying for transforming the result type. The result type is already legal; it doesn't need to be transformed.	2019-12-14 15:05:32 -08:00
Puyan Lotfi	816985c120	[NFC][llvm][MIRVRegNamerUtils] Refactoring GetHashableMO into switch-statement. This refactors the if-statements handling the hashing of various MachineOperand types into a switch-statement. The purpose is to cover all the basis for all MachineOperand types while being very deliberate about which MachineOperand types we are not handling and why (better added comments). This patch is a NFC redo of https://reviews.llvm.org/D71396. Much of the changes present in D71396 will come in smaller follow-up patches that will add support for hashing the MachineOperand types that aren't covered piece-meal with tests for each new case.	2019-12-14 02:31:07 -05:00
Roman Tereshin	8731799fc6	[Legalizer] Making artifact combining order-independent Legalization algorithm is complicated by two facts: 1) While regular instructions should be possible to legalize in an isolated, per-instruction, context-free manner, legalization artifacts can only be eliminated in pairs, which could be deeply, and ultimately arbitrarily nested: { [ () ] }, where each parenthesis kind depicts an artifact kind, like extend, unmerge, etc. Such structure can only be fully eliminated by simple local combines if they are attempted in a particular order (inside out), or alternatively by repeated scans each eliminating only one innermost pair, resulting in O(n^2) complexity. 2) Some artifacts might in fact be regular instructions that could (and sometimes should) be legalized by the target-specific rules. Which means failure to eliminate all artifacts on the first iteration is not a failure, they need to be tried as instructions, which may produce more artifacts, including the ones that are in fact regular instructions, resulting in a non-constant number of iterations required to finish the process. I trust the recently introduced termination condition (no new artifacts were created during as-a-regular-instruction-retrial of artifacts not eliminated on the previous iteration) to be efficient in providing termination, but only performing the legalization in full if and only if at each step such chains of artifacts are successfully eliminated in full as well. Which is currently not guaranteed, as the artifact combines are applied only once and in an arbitrary order that has to do with the order of creation or insertion of artifacts into their worklist, which is in no particular order. In this patch I make a small change to the artifact combiner, making it re-insert into the worklist immediate (modulo a look-through copies) artifact users of each vreg that changes its definition due to an artifact combine. Here the first scan through the artifacts worklist, while not being done in any guaranteed order, only needs to find the innermost pair(s) of artifacts that could be immediately combined out. After that the process follows def-use chains, making them shorter at each step, thus combining everything that can be combined in O(n) time. Reviewers: volkan, aditya_nandakumar, qcolombet, paquette, aemerson, dsanders Reviewed By: aditya_nandakumar, paquette Tags: #llvm Differential Revision: https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	18bf9670aa	[Legalizer] Refactoring out legalizeMachineFunction and introducing new unittests/CodeGen/GlobalISel/LegalizerTest.cpp relying on it to unit test the entire legalizer algorithm (including the top-level main loop). See also https://reviews.llvm.org/D71448	2019-12-13 15:45:18 -08:00
Roman Tereshin	8207c81597	[Legalizer] More detailed debugging printing in main loop	2019-12-13 15:45:18 -08:00
Alex Richardson	11448eeb72	[NFC] Use SelectionDAG::getMemBasePlusOffset() instead of getNode(ISD::ADD) Summary: To find potential opportunities to use getMemBasePlusOffset() I looked at all ISD::ADD uses found with the regex getNode\(ISD::ADD,.+,.+Ptr in lib/CodeGen/SelectionDAG. If this patch is accepted I will convert the files in the individual backends too. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71207	2019-12-13 21:40:03 +00:00
Alex Richardson	fc83f53a86	[NFC] Implement SelectionDAG::getObjectPtrOffset() using getMemBasePlusOffset() Summary: This change is preparatory work to use this helper functions in more places. In order to make this change, getMemBasePlusOffset() has been extended to also take a SDNodeFlags parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71206	2019-12-13 21:40:03 +00:00
Alex Richardson	ea8888d1af	[NFC] Add a SDValue overload for SelectionDAG::getMemBasePlusOffset() Summary: This change is preparatory work to use this helper functions in more places. Currently the function only allows integer constants offsets, but there are cases where we can use an existing SDValue parameter. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel, craig.topper Reviewed By: spatel, craig.topper Subscribers: craig.topper, merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71205	2019-12-13 21:40:03 +00:00
Alex Richardson	d9bb70acd7	[NFC] Change SelectionDAG::getMemBasePlusOffset() to use int64_t Summary: This change is preparatory work to use this helper functions in more places. Currently the function only allows positive offsets, but there are cases where we want to subtract an offset from an existing pointer. The motivation for this change is our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). We use a separate register type to store pointers (128-bit capabilities, which are effectively unforgeable and monotonic fat pointers). These capabilities permit a reduced set of operations and therefore use a separate ValueType (iFATPTR). to represent pointers implemented as capabilities. Therefore, we need to avoid using ISD::ADD for our patterns that operate on pointers and need to use a function that chooses ISD::ADD or a new ISD::PTRADD opcode depending on the value type. We originally added a new DAG.getPointerAdd() function, but after this patch series we can modify the implementation of getMemBasePlusOffset() instead. Avoiding direct uses of ISD::ADD for pointer types will significantly reduce the amount of assertion/instruction selection failures for us in future upstream merges. Reviewers: spatel Reviewed By: spatel Subscribers: merge_guards_bot, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71204	2019-12-13 21:40:03 +00:00
Sanjay Patel	2f0c7fd2db	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc (2nd try) The initial attempt (rG89633320) botched the logic by reversing the source/dest types. Added x86 tests for additional coverage. The vector tests show a potential improvement (fold vector load instead of broadcasting), but that's a known/existing problem. This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-13 14:03:54 -05:00
Nicola Zaghen	97572775d2	Reland [DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit tests was incorrect, so this remained undiscovered. This fixes the buildbot failures. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-13 14:30:21 +00:00
Alex Richardson	be15dfa88f	[NFC] Use EVT instead of bool for getSetCCInverse() Summary: The use of a boolean isInteger flag (generally initialized using VT.isInteger()) caused errors in our out-of-tree CHERI backend (https://github.com/CTSRD-CHERI/llvm-project). In our backend, pointers use a separate ValueType (iFATPTR) and therefore .isInteger() returns false. This meant that getSetCCInverse() was using the floating-point variant and generated incorrect code for us: `(void )0x12033091e < (void )0xffffffffffffffff` would return false. Committing this change will significantly reduce our merge conflicts for each upstream merge. Reviewers: spatel, bogner Reviewed By: bogner Subscribers: wuzish, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70917	2019-12-13 12:22:03 +00:00
Kerry McLaughlin	4194ca8e5a	Recommit "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores" Updated pred_load patterns added to AArch64SVEInstrInfo.td by this patch to use reg + imm non-temporal loads to fix previous test failures. Original commit message: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above.	2019-12-13 10:08:20 +00:00
David Stenberg	5c7cc6f83d	[LiveDebugValues] Omit entry values for DBG_VALUEs with pre-existing expressions Summary: This is a quickfix for PR44275. An assertion that checks that the DIExpression is valid failed due to attempting to create an entry value for an indirect parameter. This started appearing after D69028, as the indirect parameter started being represented using a DW_OP_deref, rather than with the DBG_VALUE's second operand, meaning that the isIndirectDebugValue() check in LiveDebugValues did not exclude such parameters. A DIExpression that has an entry value operation can currently not have any other operation, leading to the failed isValid() check. This patch simply makes us stop considering emitting entry values for such parameters. To support such cases I think we at least need to do the following changes: * In DIExpression::isValid(): Remove the limitation that a DW_OP_LLVM_entry_value operation can be the only operation in a DIExpression. * In LiveDebugValues::emitEntryValues(): Create an entry value of size 1, so that it only wraps the register operand, and not the whole pre-existing expression (the DW_OP_deref). * In LiveDebugValues::removeEntryValue(): Check that the new debug value has the same debug expression as the original, rather than checking that the debug expression is empty. * In DwarfExpression::addMachineRegExpression(): Modify the logic so that a DW_OP_reg* expression is emitted for the entry value. That is how GCC emits entry values for indirect parameters. That will currently not happen due to the DW_OP_deref causing the !HasComplexExpression to fail. The LocationKind needs to be changed also, rather than always emitting a DW_OP_stack_value for entry values. There are probably more things I have missed, but that could hopefully be a good starting point for emitting such entry values. Reviewers: djtodoro, aprantl, jmorse, vsk Reviewed By: aprantl, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D71416	2019-12-13 10:49:46 +01:00
Craig Topper	5c80a4f454	[LegalizeTypes] Remove unnecessary if before calling ReplaceValueWith on the chain in SoftenFloatRes_LOAD. I believe this is a leftover from when fp128 was softened to fp128 on X86-64. In that case type legalization must have been able to create a load that was the same as N which would make this replacement fail or assert. Since we no longer do that, this check should be unneeded.	2019-12-13 00:14:41 -08:00
Eric Christopher	a8154e5e0c	Temporarily revert "NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::List" as it was causing bot and build failures. This reverts commit `8e04896288`.	2019-12-12 17:55:41 -08:00
David Blaikie	8e04896288	NFC: DebugInfo: Refactor RangeSpanList to be a struct, like DebugLocStream::List Move these data structures closer together so their emission code can eventually share more of its implementation.	2019-12-12 16:53:59 -08:00
David Blaikie	20e06a28da	NFC: DebugInfo: Refactor debug_loc/loclist emission into a common function (except for v4 loclists, which are sufficiently different to not fit well in this generic implementation) In subsequent patches I intend to refactor the DebugLoc and ranges data structures to be more similar so I can common more of the implementation here.	2019-12-12 16:39:12 -08:00
Evgenii Stepanov	dabd2622a8	hwasan: add tag_offset DWARF attribute to optimized debug info Summary: Support alloca-referencing dbg.value in hwasan instrumentation. Update AsmPrinter to emit DW_AT_LLVM_tag_offset when location is in loclist format. Reviewers: pcc Subscribers: srhines, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70753	2019-12-12 16:18:54 -08:00
Sanjay Patel	9432937190	Revert "[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc" This reverts commit `8963332c33`. There was a logic bug typo in this code, but it wasn't visible in the asm for the tests.	2019-12-12 16:24:40 -05:00
Sanjay Patel	8963332c33	[DAGCombiner] fold shift-trunc-shift to shift-mask-trunc This fold is done in IR by instcombine, and we have a special form of it already here in DAGCombiner, but we want the more general transform too: https://rise4fun.com/Alive/3jZm Name: general Pre: (C1 + zext(C2) < 64) %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %a = and i64 %s2, zext((1 << (16 - C2)) - 1) %r = trunc %a to i16 Name: special Pre: C1 == 48 %s = lshr i64 %x, C1 %t = trunc i64 %s to i16 %r = lshr i16 %t, C2 => %s2 = lshr i64 %x, C1 + zext(C2) %r = trunc %s2 to i16 ...because D58017 exposes a regression without this fold.	2019-12-12 15:44:13 -05:00
Sanjay Patel	b39009bf1d	[DAGCombiner] improve readability This is not quite NFC because I changed the SDLoc to use the more standard 'N' (the starting node for the fold). This transform is a special-case of a more general fold that we do in IR, but it seems like the general fold is needed here too to avoid a potential regression seen in D58017. https://rise4fun.com/Alive/3jZm	2019-12-12 13:16:50 -05:00
stozer	e39e2b4a79	[DebugInfo] Prevent invalid fragments at ISel from dropping debug info During SelectionDAG, if a value which is associated with a DBG_VALUE needs to be split across multiple registers, the DBG_VALUE will be split into a set of fragment expressions to recreate the original value. If one or more of these fragments cannot be created, they would previously be silently dropped, causing the old debug value to live past its expiry date. This patch fixes this issue by keeping invalid fragments while setting their value as Undef. Differential revision: https://reviews.llvm.org/D70248	2019-12-12 12:28:39 +00:00
Nicola Zaghen	f798eb21ec	Temporarily Revert "[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same." This reverts commit `5f6208778f`. This caused failures in Transforms/PhaseOrdering/scev-custom-dl.ll const: Assertion `getBitWidth() == CR.getBitWidth() && "ConstantRange types don't agree!"' failed.	2019-12-12 10:29:54 +00:00
Nicola Zaghen	5f6208778f	[DataLayout] Fix occurrences that size and range of pointers are assumed to be the same. GEP index size can be specified in the DataLayout, introduced in D42123. However, there were still places in which getIndexSizeInBits was used interchangeably with getPointerSizeInBits. This notably caused issues with Instcombine's visitPtrToInt; but the unit test was incorrect, so this remained undiscovered. Differential Revision: https://reviews.llvm.org/D68328 Patch by Joseph Faulls!	2019-12-12 10:07:01 +00:00
Puyan Lotfi	756db63af9	[NFC][llvm][MIRVRegNamerUtils] Moving methods around. Making some private. Making all externally unused methods private in MIRVRegNamerUtils.h. Moving or deleting a couple other methods around.	2019-12-12 03:32:53 -05:00
Puyan Lotfi	f5b7a46837	[llvm][MIRVRegNamerUtils] Adding hashing on memoperands. No more hash collisions for memoperands. Now the MIRCanonicalization pass shouldn't hit hash collisions when dealing with nearly identical memory accessing instructions when their memoperands are in fact different. Differential Revision: https://reviews.llvm.org/D71328	2019-12-11 22:11:49 -05:00
Reid Kleckner	5d986953c8	[IR] Split out target specific intrinsic enums into separate headers This has two main effects: - Optimizes debug info size by saving 221.86 MB of obj file size in a Windows optimized+debug build of 'all'. This is 3.03% of 7,332.7MB of object file size. - Incremental step towards decoupling target intrinsics. The enums are still compact, so adding and removing a single target-specific intrinsic will trigger a rebuild of all of LLVM. Assigning distinct target id spaces is potential future work. Part of PR34259 Reviewers: efriedma, echristo, MaskRay Reviewed By: echristo, MaskRay Differential Revision: https://reviews.llvm.org/D71320	2019-12-11 18:02:14 -08:00
Vedant Kumar	56232f950d	Revert "[DWARF] Allow cross-CU references of subprogram definitions" This reverts commit `30038da15b`. It causes the stage2 thinLTO bot to fail with: Assertion failed: (CU.getDIE(CalleeSP) && "Expected declaration subprogram DIE for callee") rdar://57840415	2019-12-11 15:55:48 -08:00
Sanjay Patel	cdf5cfea8e	Revert "[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression()" This reverts commit `d1f0bdf2d2`. The patch can cause infinite loops in DAGCombiner.	2019-12-11 16:56:58 -05:00
Craig Topper	4b452952fe	[LegalizeTypes] In SoftenFloatRes_FP_EXTEND, move the check for input already being promoted above the check for fp16 converting to something other than fp32. The fp16 to larger than fp32 inserts an extend that needs to be re-legalized if fp16 is promoted. But if we check for fp16 promotion first, then we can avoid emitting the fp_extend altogether.	2019-12-11 12:48:08 -08:00
Sanjay Patel	d1f0bdf2d2	[SDAG] remove use restriction in isNegatibleForFree() when called from getNegatedExpression() This is an alternate fix for the bug discussed in D70595. This also includes minimal tests for other in-tree targets to show the problem more generally. We check the number of uses as a predicate for whether some value is free to negate, but that use count can change as we rewrite the expression in getNegatedExpression(). So something that was marked free to negate during the cost evaluation phase becomes not free to negate during the rewrite phase (or the inverse - something that was not free becomes free). This can lead to a crash/assert because we expect that everything in an expression that is negatible to be handled in the corresponding code within getNegatedExpression(). This patch skips the use check during the rewrite phase. So we determine that some expression isNegatibleForFree (identically to without this patch), but during the rewrite, don't rely on use counts to decide how to create the optimal expression. Differential Revision: https://reviews.llvm.org/D70975	2019-12-11 13:30:39 -05:00
Kerry McLaughlin	c0a3ab3655	Revert "[AArch64][SVE] Implement intrinsics for non-temporal loads & stores" This reverts commit `3f5bf35f86` as it was causing build failures in llvm-clang-x86_64-expensive-checks: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-debian/builds/392 http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-ubuntu/builds/1045	2019-12-11 13:58:39 +00:00
Kerry McLaughlin	3f5bf35f86	[AArch64][SVE] Implement intrinsics for non-temporal loads & stores Summary: Adds the following intrinsics: - llvm.aarch64.sve.ldnt1 - llvm.aarch64.sve.stnt1 This patch creates masked loads and stores with the MONonTemporal flag set when used with the intrinsics above. Reviewers: sdesmalen, paulwalker-arm, dancgr, mgudim, efriedma, rengolin Reviewed By: efriedma Subscribers: tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71000	2019-12-11 11:13:51 +00:00
Sjoerd Meijer	d97cf1f889	[ARM][LowOverheadLoops] Remove dead loop update instructions. After creating a low-overhead loop, the loop update instruction was still lingering around hurting performance. This removes dead loop update instructions, which in our case are mostly SUBS instructions. To support this, some helper functions were added to MachineLoopUtils and ReachingDefAnalysis to analyse live-ins of loop exit blocks and find uses before a particular loop instruction, respectively. This is a first version that removes a SUBS instruction when there are no other uses inside and outside the loop block, but there are some more interesting cases in test/CodeGen/Thumb2/LowOverheadLoops/mve-tail-data-types.ll which shows that there is room for improvement. For example, we can't handle this case yet: .. dlstp.32 lr, r2 .LBB0_1: mov r3, r2 subs r2, #4 vldrh.u32 q2, [r1], #8 vmov q1, q0 vmla.u32 q0, q2, r0 letp lr, .LBB0_1 @ %bb.2: vctp.32 r3 .. which is a lot more tricky because r2 is not only used by the subs, but also by the mov to r3, which is used outside the low-overhead loop by the vctp instruction, and that requires a bit of a different approach, and I will follow up on this. Differential Revision: https://reviews.llvm.org/D71007	2019-12-11 10:20:19 +00:00
Sam Parker	ee7579409b	[ARM][TypePromotion] Enable by default Enable the TypePromotion pass by default (again). This patch was originally committed in `393dacacf7`. This patch was reverted in `a38396939c`. Differential Revision: https://reviews.llvm.org/D70998	2019-12-11 10:00:16 +00:00
shkzhang	1408e7e175	[PowerPC] [CodeGen] Use MachineBranchProbabilityInfo in EarlyIfPredicator to avoid the potential bug Summary: In the function `EarlyIfPredicator::shouldConvertIf()`, we call `TII->isProfitableToIfCvt()` with `BranchProbability::getUnknown()`, which may cause an assertion error for hooks that use `BranchProbability` in `isProfitableToIfCvt()`, for example `SystemZ`. `SystemZ` uses `Probability < BranchProbability(1, 8))` in the function `SystemZInstrInfo::isProfitableToIfCvt()`, so calling this function with `BranchProbability::getUnknown()` will cause an assertion error. This patch fixes the potential bug. Reviewed By: ThomasRaoux Differential Revision: https://reviews.llvm.org/D71273	2019-12-11 04:46:00 -05:00
Florian Hahn	11f311875f	[LiveRegUnits] Add phys_regs_and_masks iterator range (NFC). This iterator range just includes physical registers and register masks, which are interesting when dealing with register liveness. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D70562	2019-12-11 09:34:42 +00:00
Craig Topper	d4345636e6	[LegalizeTypes] Remove manual worklist management from SoftenFloatRes_FP_EXTEND. I think this is no longer needed. The system should take care of legalizing any new nodes that are added. I think this might have been needed prior to r371709 or r307053.	2019-12-10 22:33:31 -08:00
Nico Weber	caa4120906	Revert "[DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission." This reverts commit `307f60a1a3`. DebugInfo/X86/debug-macinfo-split-dwarf.ll fails on Windows: Command Output (stdout): -- $ ":" "RUN: at line 1" $ "c:\src\llvm-project\out\gn\bin\llc.exe" "-mtriple=x86_64-pc-windows-gnu" "-O0" "-split-dwarf-file=foo.dwo" "-filetype=obj" Assertion failed: Section && "Cannot switch to a null section!", file ../../llvm/lib/MC/MCStreamer.cpp, line 1103 Stack dump: 0. Program arguments: c:\src\llvm-project\out\gn\bin\llc.exe -mtriple=x86_64-pc-windows-gnu -O0 -split-dwarf-file=foo.dwo -filetype=obj	2019-12-10 21:32:30 -05:00
Puyan Lotfi	f364686f34	[llvm][MIRVRegNamerUtil] Adding hashing against MachineInstr flags. Now, flags will result in differing hashes for a given MI. In effect, if you have two instructions with everything identical except for their flags then you should get two different hashes and fewer collisions. Differential Revision: https://reviews.llvm.org/D70479	2019-12-10 20:16:14 -05:00
Wang, Pengfei	21bc8631fe	[FPEnv][X86] Constrained FCmp intrinsics enabling on X86 Summary: This is a follow up of D69281, it enables the X86 backend support for the FP comparison. Reviewers: uweigand, kpn, craig.topper, RKSimon, cameron.mcinally, andrew.w.kaylor Subscribers: hiraditya, llvm-commits, annita.zhang, LuoYuanke, LiuChen3 Tags: #llvm Differential Revision: https://reviews.llvm.org/D70582	2019-12-11 08:23:09 +08:00
David Blaikie	4ffd3f44e3	DebugInfo: Clarify some more reasons v4 loc.dwo can't share much implementation with loclists.dwo	2019-12-10 14:11:03 -08:00
Vedant Kumar	30038da15b	[DWARF] Allow cross-CU references of subprogram definitions This allows a call site tag in CU A to reference a callee DIE in CU B without resorting to creating an incomplete duplicate DIE for the callee inside of CU A. We already allow cross-CU references of subprogram declarations, so it doesn't seem like definitions ought to be special. This improves entry value evaluation and tail call frame synthesis in the LTO setting. During LTO, it's common for cross-module inlining to produce a call in some CU A where the callee resides in a different CU, and there is no declaration subprogram for the callee anywhere. In this case llvm would (unnecessarily, I think) emit an empty DW_TAG_subprogram in order to fill in the call site tag. That empty 'definition' defeats entry value evaluation etc., because the debugger can't figure out what it means. As a follow-up, maybe we could add a DWARF verifier check that a DW_TAG_subprogram at least has a DW_AT_name attribute. rdar://46577651 Differential Revision: https://reviews.llvm.org/D70350	2019-12-10 14:00:57 -08:00
Sourabh Singh Tomar	307f60a1a3	[DebugInfo] Refactored macro related generation, added a test case for macinfo.dwo emission. Reviewers: dblaikie, aprantl, jini.susan.george Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71008	2019-12-11 02:19:27 +05:30
Sourabh Singh Tomar	fb4d8fe1a8	Recommit "[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified." Reviewers: dblaikie, aprantl, probinson Tags: #debug-info #llvm Differential Revision: https://reviews.llvm.org/D71185	2019-12-11 01:24:50 +05:30
Sourabh Singh Tomar	d82b6ba21b	Revert "[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified." This reverts commit `6ef01588f4`. Missing Differential revision.	2019-12-11 01:20:40 +05:30
Sourabh Singh Tomar	6ef01588f4	[DWARF5] Start emitting DW_AT_dwo_name when -gdwarf-5 is specified.	2019-12-11 01:18:02 +05:30
Hans Wennborg	49da20ddb4	Revert `30e8f80fd5` "[DebugInfo] Don't create multiple DBG_VALUEs when sinking" This caused non-determinism in the compiler, see command on the Phabricator code review. > This patch addresses a performance problem reported in PR43855, and > present in the reapplication in 001574938e5. It turns out that > MachineSink will (often) move instructions to the first block that > post-dominates the current block, and then try to sink further. This > means if we have a lot of conditionals, we can needlessly create large > numbers of DBG_VALUEs, one in each block the sunk instruction passes > through. > > To fix this, rather than immediately sinking DBG_VALUEs, record them in > a pass structure. When sinking is complete and instructions won't be > sunk any further, new DBG_VALUEs are added, avoiding lots of > intermediate DBG_VALUE $noregs being created. > > Differential revision: https://reviews.llvm.org/D70676	2019-12-10 19:20:11 +01:00
Sam Parker	933de40729	[TypePromotion] Query target register width TargetLoweringInfo may report that an integer should be promoted, but it may provide a size that isn't natively supported by the target register file... So check this before trying to perform a promotion. This is to fix some chromium issues: https://bugs.chromium.org/p/chromium/issues/detail?id=1031978 https://bugs.chromium.org/p/chromium/issues/detail?id=1031979 Differential Revision: https://reviews.llvm.org/D71200	2019-12-10 13:23:00 +00:00
Kiran Chandramohan	965ed1e974	[AArch64] Fix issues with large arrays on stack Summary: This patch fixes a few issues when large arrays are allocated on the stack. Currently, clang has inconsistent behaviour, for debug builds there is an assertion failure when the array size on stack is around 2GB but there is no assertion when the stack is around 8GB. For release builds there is no assertion, the compilation succeeds but generates incorrect code. The incorrect code generated is due to using int/unsigned int instead of their 64-bit counterparts. This patch, 1) Removes the assertion in frame legality check. 2) Converts int/unsigned int in some places to the 64-bit variants. This helps in generating correct code and removes the inconsistent behaviour. 3) Adds a test which runs without optimisations. Reviewers: sdesmalen, efriedma, fhahn, aemerson Reviewed By: efriedma Subscribers: eli.friedman, fpetrogalli, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70496	2019-12-10 11:44:41 +00:00
Mikael Holmen	4763267eee	[LegalizeTypes] Bugfixes for big-endian targets when handling BITCASTs Summary: This fixes PR44135. The special case when we promote a bitcast from a vector to an int needs special handling when we are on a big-endian target. Prior to this fix, for the added vec_to_int we see the following in the SelectionDAG printouts Type-legalized selection DAG: %bb.1 'foo:bb.1' SelectionDAG has 9 nodes: t0: ch = EntryToken t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0 t17: v4i32 = bitcast t2 t23: i32 = extract_vector_elt t17, Constant:i32<3> t8: ch,glue = CopyToReg t0, Register:i32 $r0, t23 t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1 and I think here the extract_vector_elt is wrong and extracts the value from the wrong index. The program should return the 32 bits made up of the elements at index 4 and 5 in the vec6 array, but with t23: i32 = extract_vector_elt t17, Constant:i32<3> as far as I can tell, we will extract values that originally didn't even exist in the vec6 vector. If we would instead extract the element at index 2 we would get the wanted values. With this fix we insert a right shift after the bitcast in DAGTypeLegalizer::PromoteIntRes_BITCAST which then gives us Type-legalized selection DAG: %bb.1 'vec_to_int:bb.1' SelectionDAG has 9 nodes: t0: ch = EntryToken t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %0 t23: v4i32 = bitcast t2 t27: i32 = extract_vector_elt t23, Constant:i32<2> t8: ch,glue = CopyToReg t0, Register:i32 $r0, t27 t9: ch = ARMISD::RET_FLAG t8, Register:i32 $r0, t8:1 So now we get t27: i32 = extract_vector_elt t23, Constant:i32<2> which is what we want. Similarly, the new int_to_vec testcase exposes a bug where we cast the other direction. Then we instead need to add a left shift before the bitcast on big-endian targets for the bits in the input integer to end up at the expected place in the vector.
Reviewers: bogner, spatel, craig.topper, t.p.northover, dmgreen, efriedma, SjoerdMeijer, samparker Reviewed By: efriedma Subscribers: eli.friedman, bjope, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70942	2019-12-10 11:22:35 +01:00
Puyan Lotfi	479e3b85e2	[NFCi][llvm][MIRVRegNamerUtils] Making some code cleanup and stylistic changes. Making some changes to MIRVRegNamerUtils.cpp to use some more modern c++ features as well as some changes to generally make the code more concise and more understandable. I make this an NFCi because in one case I drop the whole "if (!MO->isDef()) MO->setIsKill(false);" thing that was added in the original implementation, generally because I don't think this is really semantically sound. I also changed up the implementation of VRegRenamer::createVirtualRegisterWithLowerName somewhat because I am now lower-casing the name unconditionally because I confirmed that that was in fact aditya_nandakumar@apple.com's intent. In all other cases, behavior should not be changed. Differential Revision: https://reviews.llvm.org/D71182	2019-12-09 23:35:27 -05:00
Fangrui Song	9574757dba	[MC] Delete MCCodePadder D34393 added MCCodePadder as an infrastructure for padding code with NOP instructions. It lacked tests and was not being worked on since then. Intel has now worked on an assembler patch to mitigate performance loss after applying microcode update for the Jump Conditional Code Erratum. https://www.intel.com/content/www/us/en/support/articles/000055650/processors.html This new patch shares similarity with MCCodePadder, but has a concrete use case in mind and is being actively developed. The infrastructure it introduces can potentially be used for general performance improvement via alignment. Delete the unused MCCodePadder so that people can develop the new feature from a clean state. Reviewed By: jyknight, skan Differential Revision: https://reviews.llvm.org/D71106	2019-12-09 19:21:31 -08:00
QingShan Zhang	05b0c76aa7	[NFC][MacroFusion] Adding the assertion if someone wants to fuse more than 2 instructions As discussed in https://reviews.llvm.org/D69998, we fail to create some dependency edges when more than 2 instructions are chained. Add an assertion here in case someone wants to chain more than 2 instructions. Differential Revision: https://reviews.llvm.org/D71180	2019-12-10 03:10:21 +00:00
Hiroshi Yamauchi	d9ae493937	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). A second try after reverted D71072. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71149	2019-12-09 12:42:59 -08:00
Thomas Raoux	caabb713ea	[ModuloSchedule] Fix data types in ModuloScheduleExpander::isLoopCarried The cycle values in modulo scheduling results can be negative. The result of ModuloSchedule::getCycle() must be received as an int type. Patch by Masaki Arai! Differential Revision: https://reviews.llvm.org/D71122	2019-12-09 07:37:00 -08:00
Jeremy Morse	00e238896c	[DebugInfo] Nerf placeDbgValues, with prejudice CodeGenPrepare::placeDebugValues moves variable location intrinsics to be immediately after the Value they refer to. This makes tracking of locations very easy; but it changes the order in which assignments appear to the debugger, from the source program's order to the order in which the optimised program computes values. This then leads to PR43986 and PR38754, where variable locations that were in a conditional block are made unconditional, which is highly misleading. This patch adjusts placeDbgValues to only re-order variable location intrinsics if they use a Value before it is defined, significantly reducing the damage that it does. This is still not 100% safe, but the rest of CodeGenPrepare needs polishing to correctly update debug info when optimisations are performed to fully fix this. This will probably break downstream debuginfo tests -- if the instruction-stream position of variable location changes isn't the focus of the test, an easy fix should be to manually apply placeDbgValues' behaviour to the failing tests, moving dbg.value intrinsics next to SSA variable definitions thus: %foo = inst1 %bar = ... %baz = ... void call @llvm.dbg.value(metadata i32 %foo, ... to %foo = inst1 void call @llvm.dbg.value(metadata i32 %foo, ... %bar = ... %baz = ... This should return your test to exercising whatever it was testing before. Differential Revision: https://reviews.llvm.org/D58453	2019-12-09 12:52:10 +00:00
David Stenberg	6965f835b4	[DebugInfo] Make describeLoadedValue() reg aware Summary: Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes situations in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers. 
Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: ormris, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D70431	2019-12-09 10:47:49 +01:00
David Stenberg	f3696533f2	Revert "[DebugInfo] Make describeLoadedValue() reg aware" This reverts commit `3cd93a4efc`. I'll recommit with a well-formatted arcanist commit message.	2019-12-09 10:45:13 +01:00
David Stenberg	3cd93a4efc	[DebugInfo] Make describeLoadedValue() reg aware Currently the describeLoadedValue() hook is assumed to describe the value of the instruction's first explicit define. The hook will not be called for instructions with more than one explicit define. This commit adds a register parameter to the describeLoadedValue() hook, and invokes the hook for all registers in the worklist. This will allow us to for example describe instructions which produce more than two parameters' values; e.g. Hexagon's various combine instructions. This also fixes a case in our downstream target where we may pass smaller parameters in the high part of a register. If such a parameter's value is produced by a larger copy instruction, we can't describe the call site value using the super-register, and we instead need to know which sub-register that should be used. This also allows us to handle cases like this: $ebx = [...] $rdi = MOVSX64rr32 $ebx $esi = MOV32rr $edi CALL64pcrel32 @call The hook will first be invoked for the MOV32rr instruction, which will say that @call's second parameter (passed in $esi) is described by $edi. As $edi is not preserved it will be added to the worklist. When we get to the MOVSX64rr32 instruction, we need to describe two values; the sign-extended value of $ebx -> $rdi for the first parameter, and $ebx -> $edi for the second parameter, which is now possible. This commit modifies the dbgcall-site-lea-interpretation.mir test case. In the test case, the values of some 32-bit parameters were produced with LEA64r. Perhaps we can in general cases handle such by emitting expressions that AND out the lower 32-bits, but I have not been able to land in a case where a LEA64r is used for a 32-bit parameter instead of LEA64_32 from C code. I have not found a case where it would be useful to describe parameters using implicit defines, so in this patch the hook is still only invoked for explicit defines of forwarding registers.	2019-12-09 10:44:17 +01:00
Hans Wennborg	a38396939c	Revert `393dacacf7` "[ARM] Enable TypePromotion by default" This caused "Too many bits for uint64_t" asserts when building Chromium. See https://crbug.com/1031978#c2 for a reproducer. I'll follow up on the llvm-commits thread with a creduced version. > ARMCodeGenPrepare has already been generalized and renamed to > TypePromotion. We've had it enabled and tested downstream for a > while, so enable it by default. > > Differential Revision: https://reviews.llvm.org/D70998	2019-12-09 09:39:31 +01:00
rollrat	9fdb7ac503	[NFC][LivePhysRegs] Fix incorrect comment Reviewers: #llvm, tellenbach Reviewed By: tellenbach Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71051 Patch by: rollrat <rollrat.cse@gmail.com>	2019-12-08 21:07:28 +01:00
Ulrich Weigand	9db13b5a7d	[FPEnv] Constrained FCmp intrinsics This adds support for constrained floating-point comparison intrinsics. Specifically, we add: declare <ty2> @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) declare <ty2> @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>, metadata <condition code>, metadata <exception behavior>) The first variant implements an IEEE "quiet" comparison (i.e. we only get an invalid FP exception if either argument is a SNaN), while the second variant implements an IEEE "signaling" comparison (i.e. we get an invalid FP exception if either argument is any NaN). The condition code is implemented as a metadata string. The same set of predicates as for the fcmp instruction is supported (except for the "true" and "false" predicates). These new intrinsics are mapped by SelectionDAG codegen onto two new ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again representing quiet vs. signaling comparison operations. Otherwise those nodes look like SETCC nodes, with an additional chain argument and result as usual for strict FP nodes. The patch includes support for the common legalization operations for those nodes. The patch also includes full SystemZ back-end support for the new ISD nodes, mapping them to all available SystemZ instruction to fully implement strict semantics (scalar and vector). Differential Revision: https://reviews.llvm.org/D69281	2019-12-07 11:28:39 +01:00
Craig Topper	28b573d249	[TargetLowering] Fix another potential FPE in expandFP_TO_UINT D53794 introduced code to perform the FP_TO_UINT expansion via FP_TO_SINT in a way that would never expose floating-point exceptions in the intermediate steps. Unfortunately, I just noticed there is still a way this can happen. As discussed in D53794, the compiler now generates this sequence: // Sel = Src < 0x8000000000000000 // Val = select Sel, Src, Src - 0x8000000000000000 // Ofs = select Sel, 0, 0x8000000000000000 // Result = fp_to_sint(Val) ^ Ofs The problem is with the Src - 0x8000000000000000 expression. As I mentioned in the original review, that expression can never overflow or underflow if the original value is in range for FP_TO_UINT. But I missed that we can get an Inexact exception in the case where Src is a very small positive value. (In this case the result of the sub is ignored, but that doesn't help.) Instead, I'd suggest to use the following sequence: // Sel = Src < 0x8000000000000000 // FltOfs = select Sel, 0, 0x8000000000000000 // IntOfs = select Sel, 0, 0x8000000000000000 // Result = fp_to_sint(Val - FltOfs) ^ IntOfs In the case where the value is already in range of FP_TO_SINT, we now simply compute Val - 0, which now definitely cannot trap (unless Val is a NaN in which case we'd want to trap anyway). In the case where the value is not in range of FP_TO_SINT, but still in range of FP_TO_UINT, the sub can never be inexact, as Val is between 2^(n-1) and (2^n)-1, i.e. always has the 2^(n-1) bit set, and the sub is always simply clearing that bit. There is a slight complication in the case where Val is a constant, so we know at compile time whether Sel is true or false. In that scenario, the old code would automatically optimize the sub away, while this no longer happens with the new code. Instead, I've added extra code to check for this case and then just fall back to FP_TO_SINT directly. (This seems to catch even slightly more cases.) Original version of the patch by Ulrich Weigand. X86 changes added by Craig Topper Differential Revision: https://reviews.llvm.org/D67105	2019-12-06 14:11:04 -08:00
Hiroshi Yamauchi	2eb30fafa5	Revert "[PGO][PGSO] Instrument the code gen / target passes." This reverts commit `9a0b5e1407`. This seems to break buildbots.	2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi	9a0b5e1407	[PGO][PGSO] Instrument the code gen / target passes. Summary: Split off of D67120. Add the profile guided size optimization instrumentation / queries in the code gen or target passes. This doesn't enable the size optimizations in those passes yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass queries). Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71072	2019-12-06 10:43:39 -08:00
Guozhi Wei	72942459d0	[MBP] Avoid tail duplication if it can't bring benefit Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases where duplication doesn't increase fallthrough, or it brings too much size overhead. To overcome these two issues, in function canTailDuplicateUnplacedPreds I add two checks: (1) make sure there is at least one duplication in the current work set; (2) the number of duplications should not exceed the number of successors. The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain. Differential Revision: https://reviews.llvm.org/D64376	2019-12-06 09:53:53 -08:00
John Brawn	984f1bb3e7	[LegalizeTypes] Add missing case for STRICT_FP_ROUND softening This fixes a test failure in test/CodeGen/ARM/fp-intrinsics.ll.	2019-12-06 15:54:27 +00:00
Jeremy Morse	c93a9b15ce	[DebugInfo][CGP] Update dbg.values when sinking address computations One of CodeGenPrepare's optimizations is to duplicate address calculations into basic blocks, so that as much information as possible can be folded into memory addressing operands. This is great -- but the dbg.value variable location intrinsics are not updated in the same way. This can lead to dbg.values referring to address computations in other blocks that will never be encoded into the DAG, while duplicate address computations are performed locally that could be used by the dbg.value. Some of these (such as non-constant-offset GEPs) can't be salvaged past. Fix this by, whenever we duplicate an address computation into a block, looking for dbg.value users of the original memory address in the same block, and redirecting those to the local computation. Differential Revision: https://reviews.llvm.org/D58403	2019-12-06 11:27:19 +00:00
Ulrich Weigand	daee549b17	[FPEnv][SelectionDAG] Relax chain requirements This patch implements the following changes: 1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats each constrained intrinsic like a global barrier (e.g. a function call) and fully serializes all pending chains. This is actually not required; it is allowed for constrained intrinsics to be reordered w.r.t one another or (nonvolatile) memory accesses. The MI-level scheduler already allows for that flexibility, so it makes sense to allow it at the DAG level as well. This patch therefore changes the way chains for constrained intrinsics are created, and handles them basically like load operations are handled. This has the effect that constrained intrinsics are no longer serialized against one another or (nonvolatile) loads. They are still serialized against stores, but that seems hard to change with the current DAG chain setup, and it also doesn't seem to be a big problem preventing DAG 2) The OPC_CheckFoldableChainNode check requires that each of the intermediate nodes in a multi-node pattern match only has a single use. This check tends to fail if those intermediate nodes are strict operations as those have a chain output that typically indeed has another use. However, we don't really need to consider chains here at all, since they will all be rewritten anyway by UpdateChains later. Other parts of the matcher therefore already ignore chains, but this hasOneUse check doesn't. This patch replaces hasOneUse by a custom test that verifies there is no more than one use of any non-chain output value. In theory, this change could affect code unrelated to strict FP nodes, but at least on SystemZ I could not find any single instance of that happening 3) The SystemZ back-end currently does not allow matching multiply-and- extend operations (32x32 -> 64bit or 64x64 -> 128bit FP multiply) for strict FP operations. This was not possible in the past due to the problems described under 1) and 2) above. With those issues fixed, it is now possible to fully support those instructions in strict mode as well, and this patch does so. Differential Revision: https://reviews.llvm.org/D70913	2019-12-06 11:02:11 +01:00
Alexey Lapshin	9e8c799e2b	[Dsymutil][NFC] Move NonRelocatableStringpool into common CodeGen folder. That refactoring moves NonRelocatableStringpool into common CodeGen folder. So that NonRelocatableStringpool could be used not only inside dsymutil. Differential Revision: https://reviews.llvm.org/D71068	2019-12-06 10:02:27 +03:00
David Blaikie	560ab1f8d3	DebugInfo: Pull out a common expression. This is for the case where -gmlt -gsplit-dwarf -fsplit-dwarf-inlining are used together in some but not all units during LTO (or, in the reduced case, even without LTO) - ensuring that no split dwarf is used (because split-dwarf-inlining puts the same data in the .o file, so there's no need to duplicate it into the .dwo file)	2019-12-05 19:51:30 -08:00
Quentin Colombet	2ec71ea7c7	[RegisterCoalescer] Fix the creation of subranges when rematerialization is used * Context * During register coalescing, we use rematerialization when coalescing is not possible. That means we may rematerialize a super register when only a smaller register is actually used. E.g., 0B v1 = ldimm 0xFF 1B v2 = COPY v1.low8bits 2B = v2 => 0B v1 = ldimm 0xFF 1B v2 = ldimm 0xFF 2B = v2.low8bits Where xB are the slot indexes. Here v2 grew from an 8-bit register to a 16-bit register. When that happens and subregister liveness is enabled, we create subranges for the newly created value. E.g., before remat, the live range of v2 looked like: main range: [1r, 2r) (Reads v2 is defined at index 1 slot register and used before the slot register of index 2) After remat, it should look like: main range: [1r, 2r) low 8 bits: [1r, 2r) high 8 bits: [1r, 1d) <-- dead def I.e., the unused lanes of v2 should be marked as dead definitions. * The Problem * Prior to this patch, the live-ranges from the previous example would have the full live-range for all subranges: main range: [1r, 2r) low 8 bits: [1r, 2r) high 8 bits: [1r, 2r) <-- too long * The Fix * Technically, the code that this patch changes is not wrong: When we create the subranges for the newly rematerialized value, we create only one subrange for the whole bit mask. In other words, at this point v2 live-range looks like this: main range: [1r, 2r) low & high: [1r, 2r) Then, it gets wrong when we call LiveInterval::refineSubRanges on low 8 bits: main range: [1r, 2r) low 8 bits: [1r, 2r) high 8 bits: [1r, 2r) <-- too long Ideally, we would like LiveInterval::refineSubRanges to be able to do the right thing and mark the dead lanes as such. However, this is not possible, because by the time we update / refine the live ranges, the IR hasn't been updated yet, therefore we actually don't have enough information to do the right thing. Another option to fix the problem would have been to call LiveIntervals::shrinkToUses after the IR is updated. This is not desirable as this may have a noticeable impact on compile time. Instead, what this patch does is when we create the subranges for the rematerialized value, we explicitly create one subrange for the lanes that were used before rematerialization and one for the lanes that were not used. The used one inherits the live range of the main range and the unused one is just created empty. The existing rematerialization code then detects that the unused ones are not live and it correctly sets dead def intervals for them. https://llvm.org/PR41372	2019-12-05 16:32:30 -08:00
David Blaikie	decee04e63	DebugInfo: Fix LTO+DWARFv5 loclists The loclists_table_base was being overwritten for each CU even though only one loclists contribution is made so everything but the last CU would have a label that was never defined and fail to assemble.	2019-12-05 12:47:54 -08:00
Volkan Keles	bfa3d260b8	[GlobalISel] Localizer: Allow targets not to run the pass conditionally Summary: Previously, it was not possible to skip running the localizer pass conditionally. This patch adds an input function to the pass which decides if the pass should run on the given MachineFunction or not. No test case as there is no upstream target needs this functionality. Reviewers: qcolombet Reviewed By: qcolombet Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D71038	2019-12-05 11:09:50 -08:00
Jeremy Morse	30e8f80fd5	[DebugInfo] Don't create multiple DBG_VALUEs when sinking This patch addresses a performance problem reported in PR43855, and present in the reapplication in 001574938e5. It turns out that MachineSink will (often) move instructions to the first block that post-dominates the current block, and then try to sink further. This means if we have a lot of conditionals, we can needlessly create large numbers of DBG_VALUEs, one in each block the sunk instruction passes through. To fix this, rather than immediately sinking DBG_VALUEs, record them in a pass structure. When sinking is complete and instructions won't be sunk any further, new DBG_VALUEs are added, avoiding lots of intermediate DBG_VALUE $noregs being created. Differential revision: https://reviews.llvm.org/D70676	2019-12-05 15:52:20 +00:00
Jeremy Morse	e4cdd62631	[DebugInfo] Don't reorder DBG_VALUEs when sunk Fix part of PR43855, resolving a problem that comes from the reapplication in 001574938e5. If we have two DBG_VALUE insts in a block that specify the location of the same variable, for example: %0 = someinst DBG_VALUE %0, !123, !DIExpression() %1 = anotherinst DBG_VALUE %1, !123, !DIExpression() if %0 were to sink, the corresponding DBG_VALUE would sink too, past the next DBG_VALUE, effectively re-ordering assignments. To fix this, I've added a SeenDbgVars set recording what variable locations have been seen in a block already (working bottom up), and now flag DBG_VALUEs that would pass a later DBG_VALUE for the same variable. NB, this only works for repeated DBG_VALUEs in the same basic block, the general case involving control flow is much harder, which I've written up in PR44117. Differential revision: https://reviews.llvm.org/D70672	2019-12-05 15:52:20 +00:00
Jeremy Morse	fca4100196	[DebugInfo] Re-apply two patches to MachineSink These were: * D58386 / `f5e1b718a6` / reverted in `d382a8a768` * D58238 / `ee50590e16` / reverted in `a8db456b53` Of which the latter has a performance regression tracked in PR43855, fixed by D70672 / D70676, which will be committed atomically with this reapplication. Contains a minor difference to account for a change in the IsCopyInstr signature.	2019-12-05 15:52:20 +00:00
Sam Parker	393dacacf7	[ARM] Enable TypePromotion by default ARMCodeGenPrepare has already been generalized and renamed to TypePromotion. We've had it enabled and tested downstream for a while, so enable it by default. Differential Revision: https://reviews.llvm.org/D70998	2019-12-05 14:21:11 +00:00
Djordje Todorovic	52b231ee84	[LiveDebugValues] Silence the unused var warning; NFC	2019-12-05 12:32:14 +01:00
David Stenberg	54682d871d	[DebugInfo] Handle call site values for instructions before call bundle Summary: If a call is bundled then the code that looks for instructions that produce parameter values would break when reaching the call's bundle header, due to the `isCall(/*AnyInBundle*/)` invocation returning true. It is not enough to simply ignore bundle headers in the `isCall()` invocation, as the bundle header may have defines of parameter registers due to the call, meaning that such registers would incorrectly be removed from the worklist. Therefore, do not look at bundle headers at all. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: aprantl, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D71024	2019-12-05 11:50:41 +01:00
Djordje Todorovic	4b4ede440a	Reland "[LiveDebugValues] Introduce entry values of unmodified params" Relanding this after resolving the cause of the test failure.	2019-12-05 11:10:49 +01:00
Florian Hahn	76a5c8421e	[MCRegInfo] Add forward sub and super register iterators. (NFC) This patch adds forward iterators mc_difflist_iterator, mc_subreg_iterator and mc_superreg_iterator, based on the existing DiffListIterator. Those are used to provide iterator ranges over sub- and super-register from TRI, which are slightly more convenient than the existing MCSubRegIterator/MCSuperRegIterator. Unfortunately, it duplicates a bit of functionality, but the new iterators are a bit more convenient (and can be used with various existing iterator utilities) and should probably replace the old iterators in the future. This patch updates some existing users. Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70565	2019-12-05 09:29:26 +00:00
Florian Hahn	1b81964586	[MIBundle] Turn MachineOperandIteratorBase into a forward iterator. This patch turns MachineOperandIteratorBase into a regular forward iterator, which can be used with iterator_range. It also adds mi_bundle_ops and const_mi_bundle_ops that return iterator ranges over all operands in a bundle and updates a use of the old iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D70561	2019-12-05 09:06:22 +00:00
Kai Luo	b200c5180e	Reland [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation. Fix assertion error ``` bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed. ``` by checking if the register is 0 before invoking `isRenamable`.	2019-12-05 14:32:11 +08:00
Kai Luo	3882edbe19	Revert "[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation" This reverts commit `75b3a1c318`, since it breaks bootstrap build.	2019-12-05 12:48:37 +08:00
Kai Luo	75b3a1c318	[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation Summary: This patch mainly does the following transformation ``` $R0 = OP ... ... // No read/clobber of $R0 and $R1 $R1 = COPY $R0 // $R0 is killed ``` Replacing $R0 with $R1 and removing the COPY, we have ``` $R1 = OP ... ``` This transformation can also expose more opportunities for existing copy elimination in MCP. Differential Revision: https://reviews.llvm.org/D67794	2019-12-05 10:59:07 +08:00
Amara Emerson	28f5ad5801	[GlobalISel] Fix compiler crash lowering G_LOAD in AArch64. Patch by Daniel Rodríguez Troitiño. Differential Revision: https://reviews.llvm.org/D70794	2019-12-04 17:04:54 -08:00
Puyan Lotfi	fdc6f4b97b	[llvm] Fixing MIRVRegNamerUtils to properly handle 2+ MachineBasicBlocks. An interplay of code from D70210, along with code from the Value-Numbering-esque hash-based namer from D70210, as well as some crusty code from the original MIR-Canon code led to multiple causes of failure when canonicalizing or renaming vregs for MIR with multiple basic blocks. This patch fixes those issues while deleting some no longer needed code and adding a nice diamond test case to boot. Differential Revision: https://reviews.llvm.org/D70478	2019-12-04 18:36:08 -05:00
Alexey Lapshin	789e257ce0	[DWARF5][Debuginfo] Compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) do not match. That patch fixes incompatible compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) error. cat split-dwarf.cpp int main() { int a = 1; return 0; } clang++ -O -g -gsplit-dwarf -gdwarf-5 split-dwarf.cpp; llvm-dwarfdump --verify ./a.out | grep skeleton error: Compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) do not match. The fix is to change DW_TAG_compile_unit into DW_TAG_skeleton_unit when skeleton file is generated. Differential Revision: https://reviews.llvm.org/D70880	2019-12-05 00:53:47 +03:00
Amy Huang	9e978bb01c	Add support for lowering 32-bit/64-bit pointers Summary: This follows a previous patch that changes the X86 datalayout to represent mixed size pointers (32-bit sext, 32-bit zext, and 64-bit) with address spaces (https://reviews.llvm.org/D64931) This patch implements the address space cast lowering to the corresponding sign extension, zero extension, or truncate instructions. Related to https://bugs.llvm.org/show_bug.cgi?id=42359 Reviewers: rnk, craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69639	2019-12-04 11:39:03 -08:00
Vedant Kumar	f208b70fbc	Revert "[Coverage] Revise format to reduce binary size" This reverts commit `e18531595b`. On Windows, there is an error: http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54963/steps/stage%201%20check/logs/stdio error: C:\b\slave\sanitizer-windows\build\stage1\projects\compiler-rt\test\profile\Profile-x86_64\Output\instrprof-merging.cpp.tmp.v1.o: Failed to load coverage: Malformed coverage data	2019-12-04 10:35:14 -08:00
Vedant Kumar	e18531595b	[Coverage] Revise format to reduce binary size Revise the coverage mapping format to reduce binary size by: 1. Naming function records and marking them `linkonce_odr`, and 2. Compressing filenames. This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB) and speeds up end-to-end single-threaded report generation by 10%. For reference the compressed name data in llc is 81MB (__llvm_prf_names). Rationale for changes to the format: - With the current format, most coverage function records are discarded. E.g., more than 97% of the records in llc are duplicate placeholders for functions visible-but-not-used in TUs. Placeholders are used to show under-covered functions, but duplicate placeholders waste space. - We reached general consensus about giving (1) a try at the 2017 code coverage BoF [1]. The thinking was that using `linkonce_odr` to merge duplicates is simpler than alternatives like teaching build systems about a coverage-aware database/module/etc on the side. - Revising the format is expensive due to the backwards compatibility requirement, so we might as well compress filenames while we're at it. This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB). See CoverageMappingFormat.rst for the details on what exactly has changed. Fixes PR34533 [2], hopefully. [1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html [2] https://bugs.llvm.org/show_bug.cgi?id=34533 Differential Revision: https://reviews.llvm.org/D69471	2019-12-04 10:10:55 -08:00
Cullen Rhodes	17e537bc58	[NFC] Use default case in EVT::getEVTString Summary: The default case handles the majority of MVTs so most of the individual cases can be removed. Also added a case for floating point types. Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D70955	2019-12-04 11:06:49 +00:00
Ulrich Weigand	c3d05c1b52	[SelectionDAG] Expand nnan FMINNUM/FMAXNUM to select sequence InstCombine may synthesize FMINNUM/FMAXNUM nodes from fcmp+select sequences (where the fcmp is marked nnan). Currently, if the target does not otherwise handle these nodes, they'll get expanded to libcalls to fmin/fmax. However, these functions may reside in libm, which may introduce a library dependency that was not originally present in the source code, potentially resulting in link failures. To fix this problem, add code to TargetLowering::expandFMINNUM_FMAXNUM to expand FMINNUM/FMAXNUM to a compare+select sequence instead of the libcall. This is done only if the node is marked as "nnan"; in this case, the expansion to compare+select is always correct. This also suffices to catch all cases where FMINNUM/FMAXNUM was synthesized as above. Differential Revision: https://reviews.llvm.org/D70965	2019-12-04 10:32:35 +01:00
QingShan Zhang	d84b320dfd	[MacroFusion] Limit the max fused number as 2 to reduce the dependency This is the example: int foo(int a, int b, int c, int d) { return a + b + c + d; } And this is the Dependency Graph: [ASCII diagram not recoverable: nodes A, B, C, D, ADD1, ADD2, ADD3, with edges labeled Fuse, New1, and New2] We also need to create an artificial edge from ADD1 to A if https://reviews.llvm.org/D69998 is landed. That will force Node A to be scheduled before ADD1 and ADD2. But in fact, it is ok to schedule Node A in between ADD3 and ADD2, as ADD3 and ADD2 are NOT a fusion pair because ADD2 has been matched to ADD1. We are creating these unnecessary dependency edges that override the heuristics. Differential Revision: https://reviews.llvm.org/D70066	2019-12-04 05:05:35 +00:00
Craig Topper	f586fd44e4	[FPEnv] [PowerPC] Lowering ppc_fp128 StrictFP Nodes to libcalls This is an alternative to D64662 that shares more code between strict and non-strict nodes. It's modeled after the implementation that I did for softening. Differential Revision: https://reviews.llvm.org/D70867	2019-12-03 14:11:21 -08:00
Aditya Nandakumar	6da7dbb806	[GlobalISel]: Allow targets to override how to widen constants during legalization https://reviews.llvm.org/D70922 This adds a hook to allow targets to define exactly what extension operation should be performed for widening constants. This handles cases like widening i1 true which would end up becoming -1 which affects code quality during combines. Additionally, in order to stay consistent with how DAG is promoting constants, we now signextend for byte sized types and zero extend otherwise (by default). Targets can of course override this if necessary.	2019-12-03 10:41:10 -08:00
Roman Lebedev	9a20c79ddc	[NFC][KnownBits] Add getMinValue() / getMaxValue() methods As can be seen from the accompanying cleanup, it is not unheard of to write `~Known.Zero` meaning "what maximal value can this KnownBits produce". But I think `~Known.Zero` isn't that self-explanatory, as compared to a method with a name. Note that not all `~Known.Zero` places were cleaned up, only those where this arguably improves things.	2019-12-03 20:04:51 +03:00
Amaury Séchet	b4980f7781	[SelectionDAG] Reorder ViewXXXDAGs declarations to match execution order. NFC	2019-12-03 16:26:12 +01:00
stozer	269a9afe25	[DebugInfo] Make DebugVariable class available in DebugInfoMetadata The DebugVariable class is a class declared in LiveDebugValues.cpp which is used to uniquely identify a single variable, using its source variable, inline location, and fragment info to do so. This patch moves this class into DebugInfoMetadata.h, making it available in a much broader scope.	2019-12-03 15:10:56 +00:00
Sourabh Singh Tomar	8dd17a13b0	[NFCI][DebugInfo] Corrected a comment.	2019-12-03 19:45:37 +05:30
Djordje Todorovic	409350deea	Revert "[LiveDebugValues] Introduce entry values of unmodified params" This reverts commit rG4cfceb910692 due to LLDB test failing.	2019-12-03 13:13:27 +01:00
Sam Parker	bc76dadb3c	[CodeGen] Move ARMCodegenPrepare to TypePromotion Convert ARMCodeGenPrepare into a generic type promotion pass by: - Removing the insertion of arm specific intrinsics to handle narrow types as we weren't using this. - Removing ARMSubtarget references. - Now query a generic TLI object to know which types should be promoted and what they should be promoted to. - Move all codegen tests into Transforms folder and testing using opt and not llc, which is how they should have been written in the first place... The pass searches up from icmp operands in an attempt to safely promote types so we can avoid generating unnecessary unsigned extends during DAG ISel. Differential Revision: https://reviews.llvm.org/D69556	2019-12-03 11:12:52 +00:00
Jonas Paulsson	f8c0cfc24e	ImplicitNullChecks: Don't add a dead definition of DepMI as live-in This is one of the fixes needed to reapply D68267 which improves verification of live-in lists. Review: craig.topper https://reviews.llvm.org/D70434	2019-12-03 11:02:53 +01:00
Djordje Todorovic	4cfceb9106	[LiveDebugValues] Introduce entry values of unmodified params The idea is to remove front-end analysis for the parameter's value modification and leave it to the value tracking system. Front-end in some cases marks a parameter as modified even if the line of code that modifies the parameter gets optimized out, which implies that this approach will cover even more entry values. In addition, extending the support for modified parameters will be easier with this approach. Since the goal is to recognize if a parameter’s value has changed, the idea at a very high level is: If we encounter a DBG_VALUE other than the entry value one describing the same variable (parameter), we can assume that the variable’s value has changed and we should not track its entry value any more. That would be the ideal scenario, but due to various LLVM optimizations, a variable’s value could be just moved around from one register to another (and there will be additional DBG_VALUEs describing the same variable), so we have to recognize such situations (otherwise, we will lose a lot of entry values) and salvage the debug entry value. Differential Revision: https://reviews.llvm.org/D68209	2019-12-03 11:01:45 +01:00
Jonas Paulsson	4fd8f11901	[MachineVerifier] Improve checks of target instructions operands. While working with a patch for instruction selection, the splitting of a large immediate ended up being treated incorrectly by the backend. Where a register operand should have been created, it instead became an immediate. To my surprise the machine verifier failed to report this, which at the time would have been helpful. This patch improves the verifier so that it will report this type of error. This patch XFAILs CodeGen/SPARC/fp128.ll, which has been reported at https://bugs.llvm.org/show_bug.cgi?id=44091 Review: thegameg, arsenm, fhahn https://reviews.llvm.org/D63973	2019-12-03 10:20:52 +01:00
Craig Topper	039664db87	[LegalizeDAG] Return true from ExpandNode for some nodes that don't have expand support. These nodes have a FIXME that they only get here because a Custom handler returned SDValue() instead of the original Op. Even though we aren't expanding them, we should return true here to prevent ConvertNodeToLibcall from also trying to process them until the FIXME has been addressed. I'm hoping to add checking to ConvertNodeToLibcall to make sure we don't give it nodes it doesn't have support for.	2019-12-02 23:39:20 -08:00
Craig Topper	f92000187e	[LegalizeDAG] When expanding vector SRA/SRL/SHL add the new BUILD_VECTOR to the Results vector instead of just calling ReplaceNode The code that processes the Results vector also calls ReplaceNode and makes ExpandNode return true. If we don't add it to the Results node, we end up returning false from ExpandNode. This causes ConvertNodeToLibcall to be called next. But ConvertNodeToLibcall doesn't do anything for shifts so they just pass through unmodified. Except for printing a debug message. Ultimately, I'd like to add more checks to ExpandNode and ConvertNodeToLibcall to make sure we don't have nodes marked as Expand that don't have any Expand or libcall handling.	2019-12-02 23:07:39 -08:00
Sourabh Singh Tomar	f1e3988aa6	Recommit "[DWARF5]Addition of alignment attribute in typedef DIE." This revision is revised to update Go-bindings and Release Notes. The original commit message follows. This patch adds support for the DW_AT_alignment [DWARF5] attribute, to be emitted with a typedef DIE when explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111	2019-12-03 09:51:43 +05:30
Sourabh Singh Tomar	3f3d0f4f4b	[DebugInfo] Support for debug_macinfo.dwo section in llvm and llvm-dwarfdump. This patch adds support for debug_macinfo.dwo section[pre-standardized] to llvm and llvm-dwarfdump. Reviewers: probinson, dblaikie, aprantl, jini.susan.george, alok Differential Revision: https://reviews.llvm.org/D70705 Tags: #debug-info #llvm	2019-12-03 08:54:12 +05:30
Hiroshi Yamauchi	8cdfdfeee6	[PGO][PGSO] Add an optional query type parameter to shouldOptimizeForSize. Summary: In case of a need to distinguish different query sites for gradual commit or debugging of PGSO. NFC. Reviewers: davidxl Subscribers: hiraditya, zzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70510	2019-12-02 13:54:13 -08:00
Florian Hahn	5154b0253d	[MIBundles] Move analyzePhysReg out of MIBundleOperands iterator (NFC). analyzePhysReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D70559	2019-12-02 20:47:08 +00:00
Volkan Keles	3d02fa6da7	[GlobalISel] CombinerHelper: Fix a bug in matchCombineCopy Summary: When combining COPY instructions, we were replacing the destination registers with the source register without checking register constraints. This patch adds a simple logic to check if the constraints match before replacing registers. Reviewers: qcolombet, aditya_nandakumar, aemerson, paquette, dsanders, Petar.Avramovic Reviewed By: aditya_nandakumar Subscribers: rovka, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70616	2019-12-02 12:05:09 -08:00
Florian Hahn	5d0625664b	[MIBundles] Move analyzeVirtReg out of MIBundleOperands iterator (NFC). analyzeVirtReg does not really fit into the iterator and moving it makes it easier to change the base iterator. Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm, qcolombet Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D70558	2019-12-02 19:50:33 +00:00
Amaury Séchet	c594d14d40	[DAGCombine] Factor oplist operations. NFC	2019-12-02 19:12:03 +01:00
Amaury Séchet	d8d5106225	[SelectionDAG] Reduce assumptions made about levels. NFC	2019-12-02 17:43:13 +01:00
Hans Wennborg	cee62e6fcf	Fix a typo.	2019-11-30 13:23:49 +01:00
Craig Topper	2f3e8cb313	[LegalizeTypes] Add strict FP support to SoftenFloatRes_FP_ROUND. Fix mistake in SoftenFloatRes_FP_EXTEND. These will be needed for ARM fp-intrinsics.ll which is currently XFAILed. One of the getOperand calls in SoftenFloatRes_FP_EXTEND was not taking strict FP into account. It only affected the call to setTypeListBeforeSoften which only has an effect on some targets.	2019-11-28 15:32:09 -08:00
Craig Topper	68ddf434c0	[LegalizeTypes] In SoftenFloatRes_FNEG, always generate integer arithmetic, never fall back to using fsub. We would previously fall back if the type wasn't f32/f64/f128. But I don't think any of the other floating point types ever go through the softening code anyway. So this code is dead.	2019-11-28 15:30:34 -08:00
Craig Topper	2485fa7739	[LegalizeTypes] Use SoftenFloatRes_Unary in SoftenFloatRes_FCBRT to reduce code. We don't have a STRICT_CBRT ISD opcode, but we can still use SoftenFloatRes_Unary to simplify some code.	2019-11-28 15:30:34 -08:00
Amaury Séchet	ca818f4550	[DAGCombiner] Peek through vector concats when trying to combine shuffles. Summary: This combine showed up as needed when exploring the regression when processing the DAG in topological order. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68195	2019-11-28 23:57:29 +01:00
Craig Topper	735f4793f1	[LegalizeTypes] Remove dead code related to softening f16 which we no longer do. f16 is promoted to f32 if it is not legal on the target. Found while reviewing what else needed to be done for strict FP in the softening code.	2019-11-27 22:10:30 -08:00
Craig Topper	ed521fef03	[LegalTypes][X86] Add SoftenFloatOperand support for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT.	2019-11-27 21:16:13 -08:00
Craig Topper	1727c4f1a2	[LegalizeTypes][X86] Add ExpandIntegerResult support for STRICT_FP_TO_SINT/STRICT_FP_TO_UINT.	2019-11-27 18:41:45 -08:00
Craig Topper	9283681e16	[CriticalAntiDepBreaker] Teach the regmask clobber check to check if any subregister is preserved before considering the super register clobbered X86 has some calling conventions where bits 127:0 of a vector register are callee saved, but the upper bits aren't. Previously we could detect that the full ymm register was clobbered when the xmm portion was really preserved. This patch checks the subregisters to make sure they aren't preserved. Fixes PR44140 Differential Revision: https://reviews.llvm.org/D70699	2019-11-27 11:20:58 -08:00
Craig Topper	ebfff46c8d	[LegalizeTypes][FPEnv][X86] Add initial support for softening strict fp nodes This is based on what's required for softening fp128 operations on 32-bit X86 assuming f32/f64/f80 are legal. So there could be some things missing. Differential Revision: https://reviews.llvm.org/D70654	2019-11-27 10:50:10 -08:00
Craig Topper	350565dbc0	[LegalizeTypes] Add SoftenFloatOp_Unary to reduce some duplication for softening LRINT/LLRINT/LROUND/LLROUND Summary: This will be enhanced in a follow up to add strict fp support Reviewers: efriedma Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70751	2019-11-26 17:37:51 -08:00
Craig Topper	9b08366f57	[LegalizeTypes] Add SoftenFloatRes_Unary and SoftenFloatRes_Binary functions to factor repeated patterns out of many of the SoftenFloatRes_* functions This has been factored out of D70654 which will add strict FP support to these functions. By making the helpers we avoid repeating even more code. Differential Revision: https://reviews.llvm.org/D70736	2019-11-26 12:52:17 -08:00
Craig Topper	ee3b375b4c	[LegalizeDAG] Use getOperationAction instead of getStrictFPOperationAction for STRICT_LRINT/LROUND/LLRINT/LLROUND.	2019-11-26 11:57:45 -08:00
Fangrui Song	fe955e6c70	TargetPassConfig: const char * -> const char [] The latter has better codegen in non-optimized builds, which do not run ipsccp.	2019-11-26 11:25:00 -08:00
David Green	b5315ae8ff	[Codegen][ARM] Add addressing modes from masked loads and stores MVE has a basic symmetry between its normal load/store operations and the masked variants. This means that masked loads and stores can use pre-inc and post-inc addressing modes, just like the standard loads and stores already do. To enable that, this patch adds all the relevant infrastructure for treating masked loads/stores addressing modes in the same way as normal loads/stores. This involves: - Adding an AddressingMode to MaskedLoadStoreSDNode, along with an extra Offset operand that is added after the PtrBase. - Extending the IndexedModeActions from 8 bits to 16 bits to store the legality of masked operations as well as normal ones. This array is fairly small, so doubling the size still won't make it very large. Offset masked loads can then be controlled with setIndexedMaskedLoadAction, similar to standard loads. - The same methods that combine to indexed loads, such as CombineToPostIndexedLoadStore, are adjusted to handle masked loads in the same way. - The ARM backend is then adjusted to make use of these indexed masked loads/stores. - The X86 backend is adjusted, hopefully with no functional changes. Differential Revision: https://reviews.llvm.org/D70176	2019-11-26 16:21:01 +00:00
Luís Marques	6fd4c42fa8	[LegalizeTypes][RISCV] Soften FCOPYSIGN operand Summary: Adds support for softening FCOPYSIGN operands. Adds RISC-V tests that exercise the new softening code. Reviewers: asb, lenary, efriedma Reviewed By: efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D70679	2019-11-26 15:22:55 +00:00
Sam Parker	28166816b0	[ARM][ReachingDefs] Remove dead code in loloops. Add some more helper functions to ReachingDefs to query the uses of a given MachineInstr and also to query whether two MachineInstrs use the same def of a register. For Arm, while tail-predicating, these helpers are used in the low-overhead loops to remove the dead code that calculates the number of loop iterations. Differential Revision: https://reviews.llvm.org/D70240	2019-11-26 10:27:46 +00:00
Sam Parker	cced971fd3	[ARM][ReachingDefs] RDA in LoLoops Add several new methods to ReachingDefAnalysis: - getReachingMIDef, instead of returning an integer, return the MachineInstr that produces the def. - getInstFromId, return the MachineInstr to which the given integer corresponds. - hasSameReachingDef, return whether two MachineInstr use the same def of a register. - isRegUsedAfter, return whether a register is used after a given MachineInstr. These methods have been used in ARMLowOverhead to replace searching for uses/defs. Differential Revision: https://reviews.llvm.org/D70009	2019-11-26 10:13:46 +00:00
Craig Topper	3dc7c5f7d8	[LegalizeTypes] Remove code to create ISD::FP_TO_FP16 from SoftenFloatRes_FTRUNC. There seems to have been a misunderstanding of what ISD::FTRUNC represents. ISD::FTRUNC is equivalent to llvm.trunc which takes a floating point value, truncates it without changing the size of the value and returns it. Despite its similar name, it's different from the fptrunc instruction in IR which changes a floating point value to a smaller floating point value. fptrunc is represented by ISD::FP_ROUND in SelectionDAG. Since the ISD::FP_TO_FP16 node takes a floating point value and converts it to f16, it's more similar to ISD::FP_ROUND. In fact there is identical code to what is being removed here in SoftenFloatRes_FP_ROUND. I assume this bug was never encountered because it would require f16 to be legalized by softening rather than the default of promoting.	2019-11-25 18:18:40 -08:00
Sanjay Patel	214683f3b2	[DAGCombiner] avoid crash on out-of-bounds insert index (PR44139) We already have this simplification at node-creation-time, but the test from: https://bugs.llvm.org/show_bug.cgi?id=44139 ...shows that we can combine our way to an assert/crash too.	2019-11-25 16:24:06 -05:00
Craig Topper	d6ec6e4bf6	[TargetLowering] Merge ExpandChainLibCall with makeLibCall I need to be able to drop an operand for STRICT_FP_ROUND handling on X86. Merging these functions gives me the ArrayRef interface that passes the return type, operands, and debugloc instead of the Node. Differential Revision: https://reviews.llvm.org/D70503	2019-11-25 10:52:49 -08:00
Jeremy Morse	d9c9a4e48d	[DebugInfo] Avoid register coalescing unsoundly changing DBG_VALUE locations This is a re-land of D56151 / r364515 with a completely new implementation. Once MIR code leaves SSA form and the liveness of a vreg is considered, DBG_VALUE insts are able to refer to non-live vregs, because their debug-uses do not contribute to liveness. This non-liveness becomes problematic for optimizations like register coalescing, as they can't "see" the debug uses in the liveness analyses. As a result registers get coalesced regardless of debug uses, and that can lead to invalid variable locations containing unexpected values. In the added test case, the first vreg operand of ADD32rr is merged with various copies of the vreg (great for performance), but a DBG_VALUE of the unmodified operand is blindly updated to the modified operand. This changes what value the variable will appear to have in a debugger. Fix this by changing any DBG_VALUE whose operand will be resurrected by register coalescing to be a $noreg DBG_VALUE, i.e. give the variable no location. This is an overapproximation as some coalesced locations are safe (others are not) -- an extra domination analysis would be required to work out which, and it would be better if we just don't generate non-live DBG_VALUEs. Differential Revision: https://reviews.llvm.org/D64630	2019-11-25 13:47:06 +00:00
Thomas Raoux	e0297a8bee	[ModuloSchedule] Fix a bug in experimental expander Fix two problems that popped up after my last patch. One is that the stitching of prologue/epilogue can be wrong when reading a value from a previous stage. Also changed how we duplicate phi instructions to avoid generating extra phi that we delete later. Differential Revision: https://reviews.llvm.org/D70213	2019-11-23 16:01:47 -08:00
Sourabh Singh Tomar	0e02977b6e	Recommit "[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump." The original commit message follows. This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump. Also Fixes PR43622, PR43623. Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george Differential Revision: https://reviews.llvm.org/D69462	2019-11-23 20:10:23 +05:30
Sourabh Singh Tomar	02cb4b2fd6	Revert "[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump." This reverts commit `81b0a3284a`. Will re-apply with an updated Differential Revision, for automatic closure of the Phabricator review.	2019-11-23 19:46:07 +05:30
Sourabh Singh Tomar	81b0a3284a	[DWARF] Support for loclist.dwo section in llvm and llvm-dwarfdump. This patch adds support for debug_loclists.dwo section in llvm and llvm-dwarfdump. Also Fixes PR43622, PR43623. Reviewers: dblaikie, probinson, labath, aprantl, jini.susan.george https://reviews.llvm.org/D69462	2019-11-23 10:25:11 +05:30
Clement Courbet	cb15ba84fe	Reland "[DAGCombiner] Allow zextended load combines." Check that the generated type is simple.	2019-11-22 14:47:18 +01:00
Roman Lebedev	96cf5c8d47	[Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` (PR35479) Summary: The current lowering is: ``` Name: (X % C1) == C2 -> X * C3 <= C4 \|\| false Pre: (C2 == 0 \|\| C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n3, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/2xC https://rise4fun.com/Alive/jpb5 However, we can support non-tautological cases `C1 u> C2` too. Said handling consists of two parts: * `C2 u<= (-1 %u C1)`. It just works. We only have to change `(X % C1) == C2` into `((X - C2) % C1) == 0` ``` Name: (X % C1) == C2 -> (X - C2) * C3 <= C4 iff C2 u<= (-1 %u C1) Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u<= (-1 %u C1) %zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = (-1 /u C1) %n0 = sub i8 %x, C2 %n1 = mul i8 %n0, C3 %n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right %n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n4 = or i8 %n2, %n3 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n4, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/m4P https://rise4fun.com/Alive/SKrx * `C2 u> (-1 %u C1)`. 
We also have to change `(X % C1) == C2` into `((X - C2) % C1) == 0`, and we have to decrement C4: ``` Name: (X % C1) == C2 -> (X - C2) * C3 <= C4 iff C2 u> (-1 %u C1) Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 && C2 u> (-1 %u C1) %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = (-1 /u C1)-1 %n0 = sub i8 %x, C2 %n1 = mul i8 %n0, C3 %n2 = lshr i8 %n1, countTrailingZeros(C1) ; rotate right %n3 = shl i8 %n1, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n4 = or i8 %n2, %n3 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n4, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/d40 https://rise4fun.com/Alive/8cF I believe this concludes `x u% C1 ==/!= C2` lowering. In fact, clang may now be better in this regard than gcc: as can be seen from the `@t32_6_4` test, we do lower `x % 6 == 4` via this pattern, while gcc does not: https://godbolt.org/z/XNU2z9 And all the general alive proofs say this is legal. And manual checking agrees: https://rise4fun.com/Alive/WA2 Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=35479 \| PR35479 ]]. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: nick, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70053	2019-11-22 15:22:42 +03:00
Roman Lebedev	3f46022e33	[Codegen] TargetLowering::prepareUREMEqFold(): `x u% C1 ==/!= C2` with tautological C1 u<= C2 (PR35479) Summary: This is a preparatory cleanup before I add more of this fold to deal with comparisons with non-zero. In essence, the current lowering is: ``` Name: (X % C1) == 0 -> X * C3 <= C4 Pre: (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 available in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, 0 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %r = icmp ule i8 %n3, %C4 ``` https://rise4fun.com/Alive/oqd It kinda just works, really no weird edge-cases. But it isn't all that great for when comparing with non-zero. In particular, given `(X % C1) == C2`, there will be problems in the always-false tautological case where `C2 u>= C1`: https://rise4fun.com/Alive/pH3 That case is tautological, always-false: ``` Name: (X % Y) u>= Y %o0 = urem i8 %x, %y %r = icmp uge i8 %o0, %y => %r = false ``` https://rise4fun.com/Alive/ofu While we can't/shouldn't get such tautological case normally, we do deal with non-splat vectors, so unless we want to give up in this case, we need to fixup/short-circuit such lanes. There are two lowering variants: 1. 
We can blend between whatever computed result and the correct tautological result ``` Name: (X % C1) == C2 -> X * C3 <= C4 \|\| false Pre: (C2 == 0 \|\| C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %res = icmp ule i8 %n3, %C4 %r = select i1 %is_tautologically_false, i1 0, i1 %res ``` https://rise4fun.com/Alive/PjT5 https://rise4fun.com/Alive/1KV 2. We can invert the comparison result ``` Name: (X % C1) == C2 -> X * C3 <= C4 \|\| false Pre: (C2 == 0 \|\| C1 u<= C2) && (C1 u>> countTrailingZeros(C1)) * C3 == 1 %zz = and i8 C3, 0 ; trick alive into making C3 avaliable in precondition %o0 = urem i8 %x, C1 %r = icmp eq i8 %o0, C2 => %zz = and i8 C3, 0 ; and silence it from complaining about said reg %C4 = -1 /u C1 %n0 = mul i8 %x, C3 %n1 = lshr i8 %n0, countTrailingZeros(C1) ; rotate right %n2 = shl i8 %n0, ((8-countTrailingZeros(C1)) %u 8) ; rotate right %n3 = or i8 %n1, %n2 ; rotate right %is_tautologically_false = icmp ule i8 C1, C2 %C4_fixed = select i1 %is_tautologically_false, i8 -1, i8 %C4 %res = icmp ule i8 %n3, %C4_fixed %r = xor i1 %res, %is_tautologically_false ``` https://rise4fun.com/Alive/2xC https://rise4fun.com/Alive/jpb5 3. We can expand into `and`/`or`: https://rise4fun.com/Alive/WGn https://rise4fun.com/Alive/lcb5 Blend-one is likely better since we avoid having to load the replacement from constant pool. `xor` is second best since it's still pretty general. I'm not adding `and`/`or` variants. 
Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: nick, hiraditya, xbolva00, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70051	2019-11-22 15:16:03 +03:00
Clement Courbet	88e205525c	Revert "[DAGCombiner] Allow zextended load combines." Breaks some bots.	2019-11-22 09:01:08 +01:00
Clement Courbet	036790f988	[DAGCombiner] Allow zextended load combines. Summary: or(zext(load8(base)), zext(load8(base+1)) -> zext(load16 base) Reviewers: apilipenko, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70487	2019-11-22 08:40:19 +01:00
Pengfei Wang	22a0edd070	[FPEnv] Add an option to disable strict float node mutating to a normal float node This patch adds an option 'disable-strictnode-mutation' to prevent a strict node from mutating to a normal node. So we can make sure that the patch which sets strict-node as legal works correctly. Patch by Chen Liu (LiuChen3) Differential Revision: https://reviews.llvm.org/D70226	2019-11-21 18:07:11 -08:00
Craig Topper	7696b99258	[LegalizeDAG][X86] Add support for turning STRICT_FADD/SUB/MUL/DIV into libcalls. Use it for fp128 on x86-64. This requires a minor hack for f32/f64 strict fadd/fsub to avoid turning those into libcalls.	2019-11-21 16:19:25 -08:00
Hiroshi Yamauchi	52e377497d	[PGO][PGSO] DAG.shouldOptForSize part. Summary: (Split off of D67120) SelectionDAG::shouldOptForSize changes for profile guided size optimization. Reviewers: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70095	2019-11-21 14:16:00 -08:00
Tom Stellard	ab411801b8	[cmake] Explicitly mark libraries defined in lib/ as "Component Libraries" Summary: Most libraries are defined in the lib/ directory but there are also a few libraries defined in tools/ e.g. libLLVM, libLTO. I'm defining "Component Libraries" as libraries defined in lib/ that may be included in libLLVM.so. Explicitly marking the libraries in lib/ as component libraries allows us to remove some fragile checks that attempt to differentiate between lib/ libraries and tools/ libraries: 1. In tools/llvm-shlib, because llvm_map_components_to_libnames(LIB_NAMES "all") returned a list of all libraries defined in the whole project, there was custom code needed to filter out libraries defined in tools/, none of which should be included in libLLVM.so. This code assumed that any library defined as static was from lib/ and everything else should be excluded. With this change, llvm_map_components_to_libnames(LIB_NAMES, "all") only returns libraries that have been added to the LLVM_COMPONENT_LIBS global cmake property, so this custom filtering logic can be removed. Doing this also fixes the build with BUILD_SHARED_LIBS=ON and LLVM_BUILD_LLVM_DYLIB=ON. 2. There was some code in llvm_add_library that assumed that libraries defined in lib/ would not have LLVM_LINK_COMPONENTS or ARG_LINK_COMPONENTS set. This is only true because libraries defined in lib/ use LLVMBuild.txt and don't set these values. This code has been fixed now to check if the library has been explicitly marked as a component library, which should now make it easier to remove LLVMBuild at some point in the future. 
I have tested this patch on Windows, MacOS and Linux with release builds and the following combinations of CMake options: - "" (No options) - -DLLVM_BUILD_LLVM_DYLIB=ON - -DLLVM_LINK_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_BUILD_LLVM_DYLIB=ON - -DBUILD_SHARED_LIBS=ON -DLLVM_LINK_LLVM_DYLIB=ON Reviewers: beanz, smeenai, compnerd, phosek Reviewed By: beanz Subscribers: wuzish, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, mgorny, mehdi_amini, sbc100, jgravelle-google, hiraditya, aheejin, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, steven_wu, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, dang, Jim, lenary, s.egerton, pzheng, sameer.abuasal, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70179	2019-11-21 10:48:08 -08:00
Bjorn Pettersson	898de30291	[BranchFolding] Fix PR43964 about branch folder not being debug invariant Summary: The fix in BranchFolder related to non debug invariant problems done in commit `ec32dff0b0` actually introduced some new problems with debug invariance. Before that patch ComputeCommonTailLength would move iterators back, past debug instructions, in order to make ProfitableToMerge make consistent answers "when one block differs from the other only by whether debugging pseudos are present at the beginning". But the changes in `ec32dff0b0` undid that by moving the iterators forward again. This patch refactors ComputeCommonTailLength. The function was really complex, considering that the SkipTopCFIAndReturn part always moved the iterators forward to the first "real" instruction in the found tail after `ec32dff0b0`. The patch also restores the logic to "back past possible debugging pseudos at beginning of block" to make sure ProfitableToMerge gives consistent answers independent of DBG_VALUE instructions before the tail. That is now done by ProfitableToMerge instead of being hidden as a side-effect in ComputeCommonTailLength. Reviewers: probinson, yechunliang, jmorse Reviewed By: jmorse Subscribers: Orlando, mehdi_amini, dexonsmith, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70091	2019-11-21 18:13:32 +01:00
Clement Courbet	252567377c	[DAGCombine][NFC] Use ArrayRef and correctly size SmallVectors. In preparation for D70487.	2019-11-21 08:53:37 +01:00
Adrian Prantl	5da385fb56	Fix an offset underflow bug in DwarfExpression when describing small values with subregisters DwarfExpression::addMachineReg() knows how to build a larger register that isn't expressible in DWARF by combining multiple subregisters. However, if the entire value fits into just one subregister, it would still emit the other subregisters, leading to all sorts of inconsistencies down the line. This patch fixes that by moving an already existing(!) check whether the subregister's offset is before the end of the value to the right place. rdar://problem/57294211 Differential Revision: https://reviews.llvm.org/D70508	2019-11-20 17:07:54 -08:00
Craig Topper	c9e8e808cf	[SelectionDAG][X86] Mutate strictFP nodes to non-strict in DoInstructionSelection when the node is marked Expand rather than when it is not Legal. This allows operations that are marked Custom, but have some type combinations that are legal to get past this code. Add custom mutation code to X86's Select function for the nodes that don't have isel patterns yet.	2019-11-20 10:36:02 -08:00
Xiangling Liao	750e855641	A fix of the bug introduced by previous lowering in asm patch. Differential Revision: https://reviews.llvm.org/D70243	2019-11-20 11:29:10 -05:00
Xing Xue	5665fc91fe	[AIX][XCOFF] Add support for generating assembly code for one-byte mergeable strings This patch adds support for generating assembly code for one-byte mergeable strings. Generating assembly code for multi-byte mergeable strings and the `XCOFF` object code for mergeable strings will be supported later. Reviewers: hubert.reinterpretcast, jasonliu, daltenty, sfertile, DiggerLin, Xiangling_L Reviewed by: daltenty Subscribers: wuzish, nemanjai, hiraditya, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70310	2019-11-20 11:26:49 -05:00
Xiangling Liao	ca33727abe	[AIX] Lowering jump table, constant pool and block address in asm This patch lowers jump tables, constant pools and block addresses in assembly. 1. On AIX, the jump table index is always relative; 2. Put CPI and JTI into ReadOnlySection until we support unique data sections; 3. Create the temp symbol for the block address symbol; 4. Update MIR testcases and add the related assembly part. Differential Revision: https://reviews.llvm.org/D70243	2019-11-20 10:27:15 -05:00
David Zarzycki	257acbf6ae	[SelectionDAG] Combine U{ADD,SUB}O diamonds into {ADD,SUB}CARRY Summary: Convert (uaddo (uaddo x, y), carryIn) into addcarry x, y, carryIn if-and-only-if the carry flags of the first two uaddo are merged via OR or XOR. Work remaining: match ADD, etc. Reviewers: craig.topper, RKSimon, spatel, niravd, jonpa, uweigand, deadalnix, nikic, lebedev.ri, dmgreen, chfast Reviewed By: lebedev.ri Subscribers: chfast, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70079	2019-11-20 16:25:42 +02:00
Djordje Todorovic	979592a6f7	[DebugInfo] Remove the DIFlagArgumentNotModified debug info flag Due to changes in D68206, we remove the DIFlagArgumentNotModified and its usage. Differential Revision: https://reviews.llvm.org/D68207	2019-11-20 13:18:40 +01:00
Serge Pavlov	ea8678d1c7	Move floating point related entities to namespace level This is a recommit of commit `e6584b2b7b`, which was reverted in `30e7ee3c4b` together with `af57dbf12e`. Original message is below. Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. This was inconvenient, however, as it required including the constrained intrinsics definitions even when they were not needed. Also, using a long scope prefix reduced readability. This change moves these definitions to the namespace llvm::fp. No functional changes. Differential Revision: https://reviews.llvm.org/D69552	2019-11-20 19:05:46 +07:00
Serge Pavlov	0c50c0b055	[FEnv] File with properties of constrained intrinsics Summary: In several places we need to enumerate all constrained intrinsics or IR nodes that should be represented by them. It is easy to miss some of the cases. To make working with these intrinsics more convenient and robust, this change introduces a file containing definitions of all constrained intrinsics and some of their properties. This file can be included to generate constrained intrinsics processing code. Reviewers: kpn, andrew.w.kaylor, cameron.mcinally, uweigand Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69887	2019-11-20 13:30:07 +07:00
Craig Topper	c4b41e8d1d	[LegalizeDAG][X86] Enable STRICT_FP_TO_SINT/UINT to be promoted Differential Revision: https://reviews.llvm.org/D70220	2019-11-19 16:14:37 -08:00
Vedant Kumar	ba71ca3720	[DebugInfo] Describe size of spilled values in call site params A call site parameter description of a memory operand needs to unambiguously convey the size of the operand to prevent incorrect entry value evaluation. Thanks to David Stenberg for pointing this issue out!	2019-11-19 12:03:52 -08:00
Matt Arsenault	7fe9435dc8	Work on cleaning up denormal mode handling Cleanup handling of the denormal-fp-math attribute. Consolidate places checking the allowed names in one place. This is in preparation for introducing FP type specific variants of the denormal-fp-mode attribute. AMDGPU will switch to using this in place of the current hacky use of subtarget features for the denormal mode. Introduce a new header for dealing with FP modes. The constrained intrinsic classes define related enums that should also be moved into this header for uses in other contexts. The verifier could use a check to make sure the denorm-fp-mode attribute is sane, but there currently isn't one. Currently, DAGCombiner incorrectly assumes non-IEEE behavior by default in the one current user. Clang must be taught to start emitting this attribute by default to avoid regressions when this is switched to assume ieee behavior if the attribute isn't present.	2019-11-19 22:01:14 +05:30
Matt Arsenault	b696b9dba7	DAG: Add function context to isFMAFasterThanFMulAndFAdd AMDGPU needs to know the FP mode for the function to answer this correctly when this is removed from the subtarget. AArch64 had to make this more complicated by using this from an IR hook, so add an IR typed overload.	2019-11-19 19:25:26 +05:30
Thomas Preud'homme	a89ca4ae17	Fix PR44001: assert failure in getFunctionLocalOffsetAfterInsn Summary: Assert in getFunctionLocalOffsetAfterInsn() fails when processing a call MachineInstr inside a bundle and compiling with debug info. This is because labels are added by DwarfDebug::beginInstruction() which is called for each top-level MI by EmitFunctionBody()'s for-loop iteration but constructCallSiteEntryDIEs() which calls getFunctionLocalOffsetAfterInsn() iterates over all MIs. This commit modifies constructCallSiteEntryDIEs() to get the associated bundle MI for call MIs inside a bundle and use that when calling getFunctionLocalOffsetAfterInsn() and getLabelAfterInsn(). It also skips loop iterations for bundle MIs since the loop statements are concerned with debug info for each physical instruction and bundles represent a group of instructions. It also fixes the comment about PCAddr since the code is getting the return address and not the call address. Reviewers: dstenb, vsk, aprantl, djtodoro, dblaikie, NikolaPrica Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70293	2019-11-19 11:23:11 +00:00
Craig Topper	dc02eb1909	[SelectionDAG] Merge the two identical ExpandChainLibCall methods from LegalizeTypes and LegalizeDAG to one version in TargetLowering. Reviewers: RKSimon, efriedma, spatel Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70354	2019-11-18 20:22:33 -08:00
Craig Topper	6e20d70a69	[LegalizeDAG] Convert strict fp nodes to libcalls without losing the chain. Previously we mutated the node and then converted it to a libcall. But this loses the chain information. This patch keeps the chain, but unfortunately breaks tail call optimization as the functions involved in deciding if a node is in tail call position can't handle the chain. But getting the ordering right seems more important. Somehow the SystemZ tests improved. I looked at one of them and it seemed that we're handling the split vector elements in a different order and that made the copies work better. Differential Revision: https://reviews.llvm.org/D70334	2019-11-18 11:24:08 -08:00
Eric Christopher	30e7ee3c4b	Temporarily Revert "Add support for options -frounding-math, ftrapping-math, -ffp-model=, and -ffp-exception-behavior=" and a follow-up NFC rearrangement as it's causing a crash on valid. Testcase is on the original review thread. This reverts commits `af57dbf12e` and `e6584b2b7b`	2019-11-18 10:46:48 -08:00
Sam McCall	d27a16eb39	Revert "[DWARF5]Addition of alignment atrribute in typedef DIE." This reverts commit `423f541c1a`, which breaks llvm-c ABI.	2019-11-18 15:53:22 +01:00
Graham Hunter	3f08ad611a	[SVE][CodeGen] Scalable vector MVT size queries * Implements scalable size queries for MVTs, split out from D53137. * Contains a fix for FindMemType to avoid using scalable vector type to contain non-scalable types. * Explicit casts for several places where implicit integer sign changes or promotion from 32 to 64 bits caused problems. * CodeGenDAGPatterns will treat scalable and non-scalable vector types as different. Reviewers: greened, cameron.mcinally, sdesmalen, rovka Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D66871	2019-11-18 12:30:59 +00:00
Craig Topper	bfbbf0aba8	[LegalizeTypes] Remove SoftenFloat handling from ExpandIntRes_LLROUND_LLRINT and remove assert from the strict fp path. These were both recently added. While the call to GetSoftenedFloat is a little more optimal, we don't do it in the expand for FP_TO_SINT/UINT so there's no real reason to do it here. This avoids a FIXME for strict fp.	2019-11-17 23:48:31 -08:00
Craig Topper	5a56d2aa33	[LegalizeTypes] Remove unnecessary conversion from EVT to MVT to MVT::SimpleValueType just to assign back to EVT. NFC	2019-11-17 23:48:31 -08:00
Craig Topper	af435286e5	[LegalizeTypes][X86] Add support for expanding the result type of STRICT_LLROUND and STRICT_LLRINT. This doesn't handle softening the input type, but we don't handle softening any of the strict nodes yet. Skipping that made it easy to reuse an existing function for creating a libcall from a node with a chain.	2019-11-17 20:03:05 -08:00
Craig Topper	1b0efe2b17	[LegalizeTypes] When expanding the integer result of LLROUND/LLRINT, also call GetSoftenedFloat if the floating point input needs to be softened. Before this we were emitting a bitcast to integer from the lowering code that itself will need to be legalized. By calling GetSoftenedFloat we get the integer conversion in one step without needing to relegalize a bitcast.	2019-11-17 13:31:30 -08:00
Craig Topper	9b515b6dd9	[LegalizeTypes] Remove PromoteFloat support from ExpandIntRes_LLROUND_LLRINT. This code isn't exercised, and was in the wrong place. If we need this, we would need to promote the type before figuring out which libcall to use. I'm choosing to remove it rather than fixing since we don't support PromoteFloat for LRINT/LROUND/LLRINT/LLROUND when the result type is legal so I don't see much reason to support it for the case where the result type isn't legal.	2019-11-17 13:31:30 -08:00
Craig Topper	d4ba11ae32	[LegalizeTypes] Merge ExpandIntRes_LLROUND and ExpandIntRes_LLRINT into one function that handles both. NFC These two functions were the same except for which libcall gets emitted. Just merge them into one. This is prep work for some other work including strict fp support.	2019-11-17 13:31:30 -08:00
Sourabh Singh Tomar	423f541c1a	[DWARF5]Addition of alignment atrribute in typedef DIE. This patch adds support for the DW_AT_alignment[DWARF5] attribute, to be emitted with a typedef DIE when explicit alignment is specified. Patch by Awanish Pandey <Awanish.Pandey@amd.com> Reviewers: aprantl, dblaikie, jini.susan.george, SouraVX, alok, deadalinx Differential Revision: https://reviews.llvm.org/D70111	2019-11-16 21:56:53 +05:30
David Blaikie	77cfcd7509	DebugInfo: Use loclistx for DWARFv5 location lists to reduce the number of relocations This only implements the non-dwo part, but loclistx is necessary to use location lists in DWARFv5, so it's a precursor to that work - and generally reduces relocations (only using one reloc, then indexes/relative offsets for all location list references) in non-split DWARF.	2019-11-15 18:51:13 -08:00
Quentin Colombet	98ceac4981	[GISel][CombinerHelper] Use uses() instead of operands() when traversing use operands. NFC	2019-11-15 13:54:33 -08:00
Quentin Colombet	304abde077	[GISel][CombinerHelper] Add support for scalar type for the result of shuffle vector LLVM IR of 1-element vectors gets lowered into scalars in GISel. As a result, shuffle vector may also produce a scalar. This patch teaches the shuffle combiner how to deal with scalars when they are in the destination type of a shuffle vector. For now, we just support the easy case where this can be lowered to a plain copy. For other cases, we leave the shuffle vector as is. This type of IR is seen in O0 pipelines. E.g., as produced with SingleSource/UnitTests/Vector/AArch64/aarch64_neon_intrinsics.c. rdar://problem/57198904	2019-11-15 13:54:33 -08:00
Aditya Nandakumar	7276868556	[MirNamer][Canonicalizer]: Perform instruction semantic based renaming https://reviews.llvm.org/D70210 Previously: Due to sensitivity of the algorithm with gaps, and extra instructions, when diffing, often we see naming being off by a few. Makes the diff unreadable even for tests with 7 and 8 instructions respectively. Naming can change depending on candidates (and order of picking candidates). Suddenly if there's one extra instruction somewhere, the entire subtree would be named completely differently. No consistent naming of similar instructions which occur in different functions. If we try to do something like count the frequency distribution of various differences across suite, then the above sensitivity issues are going to result in poor results. Instead: Name instruction based on semantics of the instruction (hash of the opcode and operands). Essentially for a given instruction that occurs in any module/function it'll be named similarly (ie semantic). This has some nice properties Can easily look at many instructions and just check the hash and if they're named similarly, then it's the same instruction. Makes it very easy to spot the same instruction both multiple times, as well as across many functions (useful for frequency distribution). Independent of traversal/candidates/depth of graph. No need to keep track of last index/gaps/skip count etc. No off by few issues with diffs. I've tried the old vs new implementation in files ranging from 30 to 700 instructions. In both cases with the old algorithm, diffs are a sea of red, where as for the semantic version, in both cases, the diffs line up beautifully. Simplified implementation of the main loop (simple iteration) , no keep track of what's visited and not. Handle collision just by incrementing a counter. Roughly bb[N]_hash_[CollisionCount]. 
Additionally with the new implementation, we can probably avoid doing the hoisting of instructions to various places, as they'll likely be named the same resulting in differences only based on collision (ie regardless of whether the instruction is hoisted or not/close to use or not, it'll be named the same hash which should result in use of the instruction be identical with the only change being the collision count) which is very easy to spot visually.	2019-11-15 08:38:54 -08:00
diggerlin	3dfa975fb3	Add read-only data assembly writing for aix SUMMARY: The patch will emit read-only variable assembly code for aix. Reviewers: daltenty,Xiangling_Liao Subscribers: rupprecht, seiyai,hiraditya Differential Revision: https://reviews.llvm.org/D70182	2019-11-15 11:30:19 -05:00
Serge Pavlov	e6584b2b7b	Move floating point related entities to namespace level Enumerations that describe rounding mode and exception behavior were defined inside ConstrainedFPIntrinsic. It makes sense to use the same definitions to represent the same properties in other cases, not only in constrained intrinsics. However, this was inconvenient, as it required including the constrained intrinsics definitions even if they were not needed. The long scope prefix also reduced readability. This change moves these definitions to the namespace llvm::fp. No functional changes. Differential Revision: https://reviews.llvm.org/D69552	2019-11-15 19:56:33 +07:00
Jay Foad	c953e061b4	[CodeGen] Increase the size of a SmallVector The SmallVector reserve() call in MachineInstrExpressionTrait::getHashValue accounted for over 3% of all calls to malloc() when I compiled a bunch of graphics shaders for the AMDGPU target. Its initial size was only enough for machine instructions with up to 7 operands, but for AMDGPU 8 and 10 operands are very common. Here's a histogram of number of operands for each call to getHashValue, gathered from the same collection of shaders: 1 13503 2 254273 3 135781 4 422508 5 614997 6 194953 7 287248 8 1517255 9 31218 10 1191269 11 70731 12 24 13 77 15 84 17 4692 27 16 33 705 49 6 Typical instructions with 8 and 10 operands are floating point arithmetic and multiply-accumulate instructions like: %83:vgpr_32 = V_MUL_F32_e64 0, killed %82:vgpr_32, 0, killed %81:vgpr_32, 0, 0, implicit $exec %330:vgpr_32 = V_MAC_F32_e64 0, killed %327:vgpr_32, 0, killed %329:sgpr_32, 0, %328:vgpr_32(tied-def 0), 0, 0, implicit $exec Differential Revision: https://reviews.llvm.org/D70301	2019-11-15 11:32:11 +00:00
Matt Arsenault	bc276c6379	GlobalISel: Lower s1 source G_SITOFP/G_UITOFP	2019-11-15 13:37:20 +05:30
Reid Kleckner	4c1a1d3cf9	Add missing includes needed to prune LLVMContext.h include, NFC These are a pre-requisite to removing #include "llvm/Support/Options.h" from LLVMContext.h: https://reviews.llvm.org/D70280	2019-11-14 15:23:15 -08:00
Vedant Kumar	1ee84e5ab2	[DebugInfo] Allow spill slots in call site parameter descriptions Allow call site parameter descriptions to reference spill slots. Spill slots are not visible to high-level LLVM IR, so they can safely be referenced during entry value evaluation (as they cannot be clobbered by some other function). This gives a 5% increase in the number of call site parameter DIEs in an LTO x86_64 build of the xnu kernel. This reverts commit `eb4c98ca3d` ( [DebugInfo] Exclude memory location values as parameter entry values), effectively reintroducing the portion of D60716 which dealt with memory locations (authored by Djordje, Nikola, Ananth, and Ivan). This partially addresses llvm.org/PR43343. However, not all memory operands forwarded to callees live in spill slots. In the xnu build, it may be possible to use an escape analysis to increase the number of call site parameters by another 15% (more details in PR43343). Differential Revision: https://reviews.llvm.org/D70254	2019-11-14 12:48:51 -08:00
Daniel Sanders	b2839c442e	[globalisel][irtranslator] The IRTranslator should preserve TBAA information	2019-11-14 12:11:27 -08:00
Sumanth Gundapaneni	7c7e368a7f	[Pipeliner] Fix an assertion caused by iterator invalidation.	2019-11-14 13:08:06 -06:00
Craig Topper	17bb2d7c80	[ExpandReductions] Don't push all intrinsics to the worklist. Just push reductions. We were previously pushing all intrinsics used in a function to the worklist. This is wasteful for memory in a function with a lot of intrinsics. We also ask TTI if we should expand every intrinsic, but we only have expansion support for the reduction intrinsics. This just wastes time for the non-reduction intrinsics. This patch only pushes reduction intrinsics into the worklist and skips other intrinsics. Differential Revision: https://reviews.llvm.org/D69470	2019-11-14 10:26:53 -08:00
Reid Kleckner	5fe3f00ae2	Replace wrongly deleted header banner, fix formatting I reviewed the diff hunks of `05da2fe521` that don't contain '#include' lines, and found two unintended changes. I deleted a header banner inadvertently while inserting a header, and changed the indentation of a constructor in an odd way. Add back the banner, and reformat the constructor.	2019-11-14 10:21:42 -08:00
Paweł Bylica	1c247dd028	[DAGCombiner] Drop redundant DAG method param. NFC	2019-11-14 14:02:53 +01:00
Paweł Bylica	9b89bda517	[DAGCombiner] Use TLI field already available. NFC	2019-11-14 14:02:52 +01:00
Reid Kleckner	1dfede3122	Move CodeGenFileType enum to Support/CodeGen.h Avoids the need to include TargetMachine.h from various places just for an enum. Various other enums live here, such as the optimization level, TLS model, etc. Data suggests that this change probably doesn't matter, but it seems nice to have anyway.	2019-11-13 16:39:34 -08:00
Reid Kleckner	05da2fe521	Sink all InitializePasses.h includes This file lists every pass in LLVM, and is included by Pass.h, which is very popular. Every time we add, remove, or rename a pass in LLVM, it caused lots of recompilation. I found this fact by looking at this table, which is sorted by the number of times a file was changed over the last 100,000 git commits multiplied by the number of object files that depend on it in the current checkout: recompiles touches affected_files header 342380 95 3604 llvm/include/llvm/ADT/STLExtras.h 314730 234 1345 llvm/include/llvm/InitializePasses.h 307036 118 2602 llvm/include/llvm/ADT/APInt.h 213049 59 3611 llvm/include/llvm/Support/MathExtras.h 170422 47 3626 llvm/include/llvm/Support/Compiler.h 162225 45 3605 llvm/include/llvm/ADT/Optional.h 158319 63 2513 llvm/include/llvm/ADT/Triple.h 140322 39 3598 llvm/include/llvm/ADT/StringRef.h 137647 59 2333 llvm/include/llvm/Support/Error.h 131619 73 1803 llvm/include/llvm/Support/FileSystem.h Before this change, touching InitializePasses.h would cause 1345 files to recompile. After this change, touching it only causes 550 compiles in an incremental rebuild. Reviewers: bkramer, asbirlea, bollu, jdoerfert Differential Revision: https://reviews.llvm.org/D70211	2019-11-13 16:34:37 -08:00
Reid Kleckner	364d1785a6	Sink MachineFunction private method out of line This method is private and only called from this file and doesn't need to be inline. Saves a TargetMachine.h include in MachineFunction.h, a popular header. The include was introduced in `98603a8153` despite the forward decl of LLVMTargetMachine.	2019-11-13 15:36:58 -08:00
Craig Topper	84e83b54bd	[TargetLowering] Increase the storage size of NumRegistersForVT to allow the type break down for v256i1 and other types to be stored correctly v256i1 on X86 without avx512 breaks down to 256 i8 values when passed between basic blocks. But the NumRegistersForVT was sized at a byte for each VT. This results in 256 being stored as 0. This patch enlarges the type to 16 bits and adds an assert to ensure that no information is lost when the entry is stored. Differential Revision: https://reviews.llvm.org/D70138	2019-11-13 12:09:35 -08:00
Quentin Colombet	de94cda81b	[LiveInterval] Allow updating subranges with slightly out-dated IR During register coalescing, we update the live-intervals on-the-fly. To do that we are in this strange mode where the live-intervals can be slightly out-of-sync (more precisely they are forward looking) compared to what the IR actually represents. This happens because the register coalescer only updates the IR when it is done with updating the live-intervals and it has to do it this way because updating the IR on-the-fly would actually clobber some information on how the live-ranges that are being updated look like. This is problematic for updates that rely on the IR to accurately represents the state of the live-ranges. Right now, we have only one of those: stripValuesNotDefiningMask. To reconcile this need of out-of-sync IR, this patch introduces a new argument to LiveInterval::refineSubRanges that allows the code doing the live range updates to reason about how the code should look like after the coalescer will have rewritten the registers. Essentially this captures how a subregister index with be offseted to match its position in a new register class. E.g., let say we want to merge: V1.sub1:<2 x s32> = COPY V2.sub3:<4 x s32> We do that by choosing a class where sub1:<2 x s32> and sub3:<4 x s32> overlap, i.e., by choosing a class where we can find "offset + 1 == 3". Put differently we align V2's sub3 with V1's sub1: V2: sub0 sub1 sub2 sub3 V1: <offset> sub0 sub1 This offset will look like a composed subregidx in the the class: V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> => V1.(composed sub2 with sub1):<4 x s32> = COPY V2.sub3:<4 x s32> Now if we didn't rewrite the uses and def of V1, all the checks for V1 need to account for this offset to match what the live intervals intend to capture. Prior to this patch, we would fail to recognize the uses and def of V1 and would end up with machine verifier errors: No live segment at def. 
This could lead to a miscompile, as we would drop some live-ranges and thus miss some interferences. For this problem to trigger, we need to reach stripValuesNotDefiningMask while having a mismatch between the IR and the live-ranges (i.e., we have to apply a subreg offset to the IR.) This requires the following three conditions: 1. An update of overlapping subreg lanes: e.g., dsub0 == <ssub0, ssub1> 2. An update with Tuple registers with a possibility to coalesce the subreg index: e.g., v1.dsub_1 == v2.dsub_3 3. Subreg liveness enabled. looking at the IR to decide what is alive and what is not, i.e., calling stripValuesNotDefiningMask. coalescer maintains for the live-ranges information. None of the targets that currently use subreg liveness (i.e., the targets that fulfill #3, Hexagon, AMDGPU, PowerPC, and SystemZ IIRC) expose #1 and #2, so this patch also artificially enables subreg liveness for ARM, so that a nice test case can be attached.	2019-11-13 11:17:56 -08:00
David Stenberg	7417cc149b	Fix typo in DwarfDebug [NFC]	2019-11-13 18:06:16 +01:00
David Stenberg	5e646ff530	[DebugInfo] Avoid creating entry values for clobbered registers Summary: Entry values are considered for parameters that have register-described DBG_VALUEs in the entry block (along with other conditions). If a parameter's value has been propagated from the caller to the callee, then the parameter's DBG_VALUE in the entry block may be described using a register defined by some instruction, and entry values should not be emitted for the parameter, which can currently occur. One such case was seen in the attached test case, in which the second parameter, which is described by a redefinition of the first parameter's register, would incorrectly get an entry value using the first parameter's register. This commit intends to solve such cases by keeping track of register defines, and ignoring DBG_VALUEs in the entry block that are described by such registers. In a RelWithDebInfo build of clang-8, the average size of the set was 27, and in a RelWithDebInfo+ASan build it was 30. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: djtodoro, vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D69889	2019-11-13 11:10:47 +01:00
David Stenberg	4fec44cd61	[DebugInfo] Add helper for finding entry value candidates [NFC] Summary: The conditions that are used to determine if entry values should be emitted for a parameter are quite many, and will grow slightly in a follow-up commit, so move those to a helper function, as was suggested in the code review for D69889. Reviewers: djtodoro, NikolaPrica Reviewed By: djtodoro Subscribers: probinson, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69955	2019-11-13 11:10:47 +01:00
Sander de Smalen	9a1c243aa5	[AArch64][SVE] Allocate locals that are scalable vectors. This patch adds a target interface to set the StackID for a given type, which allows scalable vectors (e.g. `<vscale x 16 x i8>`) to be assigned a 'sve-vec' StackID, so it is allocated in the SVE area of the stack frame. Reviewers: ostannard, efriedma, rengolin, cameron.mcinally Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D70080	2019-11-13 09:45:24 +00:00
joanlluch	d384ad6b63	[TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (4) Summary: Replaces ``` unsigned getShiftAmountThreshold(EVT VT) ``` by ``` bool shouldAvoidTransformToShift(EVT VT, unsigned amount) ``` thus giving more flexibility for targets to decide whether particular shift amounts must be considered expensive or not. Updates the MSP430 target with a custom implementation. This continues D69116, D69120, D69326 and updates them, so all of them must be committed before this. Existing tests apply, a few more have been added. Reviewers: asl, spatel Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D70042	2019-11-13 09:23:08 +01:00
Tim Renouf	07ebd74154	MCP: Fixed bug with dest overlapping copy source In MachineCopyPropagation, when propagating the source of a copy into the operand of a later instruction, bail if a destination overlaps (partly defines) the copy source. If the instruction where the substitution is happening is also a copy, allowing the propagation confuses the tracking mechanism. Differential Revision: https://reviews.llvm.org/D69953 Change-Id: Ic570754f878f2d91a4a50a9bdcf96fbaa240726d	2019-11-12 08:18:11 +00:00
aqjune	e87d71668e	[IR] Redefine Freeze instruction Summary: This patch redefines freeze instruction from being UnaryOperator to a subclass of UnaryInstruction. ConstantExpr freeze is removed, as discussed in the previous review. FreezeOperator is not added because there's no ConstantExpr freeze. `freeze i8* null` test is added to `test/Bindings/llvm-c/freeze.ll` as well, because the null pointer-related bug in `tools/llvm-c/echo.cpp` is now fixed. InstVisitor has visitFreeze now because freeze is not unaryop anymore. Reviewers: whitequark, deadalnix, craig.topper, jdoerfert, lebedev.ri Reviewed By: craig.topper, lebedev.ri Subscribers: regehr, nlopes, mehdi_amini, hiraditya, steven_wu, dexonsmith, jfb, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69932	2019-11-12 10:49:00 +09:00
Sean Fertile	e5e2e0a66b	[PowerPC][XCOFF] Add support for zero initialized global values. For XCOFF, globals mapped into the .bss section are linked as COMMON definitions. This behaviour is incorrect for zero initialized data, so emit those to the .data section instead. Differential Revision: https://reviews.llvm.org/D69528	2019-11-11 18:52:10 -05:00
Victor Huang	edab7dd426	Disable hoisting MI to hotter basic blocks The current Hoist() function of the machine LICM pass does not check the frequencies of the source and destination basic blocks that an instruction is hoisted from/to. There is a chance that an instruction is hoisted from a cold to a hot basic block. In this patch, we add options to disable machine instruction hoisting if the destination block is hotter. Differential Revision: https://reviews.llvm.org/D63676	2019-11-11 21:32:56 +00:00
Thomas Raoux	e0f1d9d872	[ModuloSchedule] Fix modulo expansion for data loop carried dependencies. The new experimental expansion has a problem when a value has a data dependency with an instruction from a previous stage. This is due to the way we peel out the kernel. To fix that I'm changing the way we peel out the kernel. We now peel the kernel NumberStage - 1 times. The code would be correct at this point if we didn't have to handle cases where the loop iteration is smaller than the number of stages. To handle this case we move instructions between different epilogues based on their stage and remap the PHI instructions correctly. Differential Revision: https://reviews.llvm.org/D69538	2019-11-11 12:09:27 -08:00
Thomas Raoux	03da6e8c00	[ModuloSchedule] Do target loop analysis before peeling. Simple change to call target hook analyzeLoopForPipelining before changing the loop. After peeling, analyzing the loop may be more complicated for targets that don't have a loop instruction. This doesn't affect Hexagon and PPC as they have hardware loop instructions. Differential Revision: https://reviews.llvm.org/D69912	2019-11-11 09:35:39 -08:00
Yi-Hong Lyu	6bbfafd037	[CGP] Make ICMP_EQ use CR result of ICMP_S(L\|G)T dominators For example: long long test(long long a, long long b) { if (a << b > 0) return b; if (a << b < 0) return a; return a*b; } Produces: sld. 5, 3, 4 ble 0, .LBB0_2 mr 3, 4 blr .LBB0_2: # %if.end cmpldi 5, 0 li 5, 1 isel 4, 4, 5, 2 mulld 3, 4, 3 blr But the compare (cmpldi 5, 0) is redundant and can be removed (CR0 already contains the result of that comparison). The root cause of this is that LLVM converts signed comparisons into equality comparison based on dominance. Equality comparisons are unsigned by default, so we get either a record-form or cmp (without the l for logical) feeding a cmpl. That is the situation we want to avoid here. Differential Revision: https://reviews.llvm.org/D60506	2019-11-11 17:28:50 +00:00
Francis Visoiu Mistrih	a9a3781df8	[ObjC] Override TailCallKind when lowering objc intrinsics The tail-call-kind-ness is known by the ObjCARC analysis and can be enforced while lowering the intrinsics to calls. This allows us to get the requested tail calls at -O0 without trying to preserve the attributes throughout passes that change code even at -O0, like the Always Inliner, where the ObjCOpt pass doesn't run. Differential Revision: https://reviews.llvm.org/D69980	2019-11-11 08:30:06 -08:00
joanlluch	e0012c5d6a	[TargetLowering][DAGCombine][MSP430] Shift Amount Threshold in DAGCombine (3) Summary: Additional filtering of undesired shifts for targets that do not support them efficiently. Related with D69116 and D69120 Applies the TLI.getShiftAmountThreshold hook to prevent undesired generation of shifts for the following IR code: ``` define i16 @testShiftBits(i16 %a) { entry: %and = and i16 %a, -64 %cmp = icmp eq i16 %and, 64 %conv = zext i1 %cmp to i16 ret i16 %conv } define i16 @testShiftBits_11(i16 %a) { entry: %cmp = icmp ugt i16 %a, 63 %conv = zext i1 %cmp to i16 ret i16 %conv } define i16 @testShiftBits_12(i16 %a) { entry: %cmp = icmp ult i16 %a, 64 %conv = zext i1 %cmp to i16 ret i16 %conv } ``` The attached diff file shows the piece code in TargetLowering that is responsible for the generation of shifts in relation to the IR above. Before applying this patch, shifts will be generated to replace non-legal icmp immediates. However, shifts may be undesired if they are even more expensive for the target. For all my previous patches in this series (cited above) I added test cases for the MSP430 target. However, in this case, the target is not suitable for showing improvements related with this patch, because the MSP430 does not implement "isLegalICmpImmediate". The default implementation returns always true, therefore the patched code in TargetLowering is never reached for that target. Targets implementing both "isLegalICmpImmediate" and "getShiftAmountThreshold" will benefit from this. The differential effect of this patch can only be shown for the MSP430 by temporarily implementing "isLegalICmpImmediate" to return false for large immediates. 
This is simulated with the implementation of a command line flag that was incorporated in D69975. This patch belongs to an initiative to "relax" the generation of shifts by LLVM for targets requiring it. Reviewers: spatel, lebedev.ri, asl Reviewed By: spatel Subscribers: lenary, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69326	2019-11-11 10:18:25 +01:00
Simon Pilgrim	6976a0e826	RegisterCoalescer - remove duplicate variable to fix Wshadow warning. NFCI.	2019-11-09 20:10:12 +00:00
Simon Pilgrim	f092e80939	RegisterCoalescer - fix uninitialized variables. NFCI.	2019-11-09 20:10:11 +00:00
Simon Pilgrim	7f8488eeb4	Fix operator precedence warning. NFC.	2019-11-09 17:03:21 +00:00
David Blaikie	db797bfb2b	DebugInfo: Remove redundant conditionals/checks from macro info emission These checks fall out naturally from the current implementation without needing to be explicitly considered anymore.	2019-11-08 15:31:15 -08:00
David Blaikie	736273c7fe	DebugInfo: Do not create a debug_macinfo section if no CUs have associated macros Patch based on Sourabh Singh's D69839 patch.	2019-11-08 15:30:11 -08:00
David Blaikie	3951245c38	NVPTX: Don't insert an extra empty line at the end of the last section. This was arbitrarily appearing in only the last section emitted - which made tests more sensitive than they needed to be (removing the last section - like the macinfo section change that's coming after this) would, surprisingly, move the blank line to the previous section.	2019-11-08 15:16:04 -08:00
David Blaikie	39c308f6b8	DebugInfo: Use separate macinfo contributions for each CU The macinfo support was broken for LTO situations, by terminating macinfo lists only once - multiple macinfo contributions were correctly labeled, but they all continued/flowed into later contributions until only one terminator appeared at the end of the section. Correctly terminate each contribution & fix the parsing to handle this situation too. The parsing fix is also necessary for dumping linked binaries - the previous code would stop at the end of the first contribution - missing all later contributions in a linked binary. It'd be nice to improve the dumping to print the offsets of each contribution so it'd be easier to know which CU AT_macro_info refers to which macinfo contribution.	2019-11-08 13:27:00 -08:00
Eli Friedman	5df3a87224	[AArch64][X86] Don't assume __powidf2 is available on Windows. We had some code for this for 32-bit ARM, but this doesn't really need to be in target-specific code; generalize it. (I think this started showing up recently because we added an optimization that converts pow to powi.) Differential Revision: https://reviews.llvm.org/D69013	2019-11-08 12:43:21 -08:00
Djordje Todorovic	8d2ccd1ac3	Reland: [TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-11-08 13:00:39 +01:00
Hans Wennborg	ff3b513495	Revert `d91ed80` "[codeview] Reference types in type parent scopes" This triggered asserts in the Chromium build, see https://crbug.com/1022729 for details and reproducer. > Without this change, when a nested tag type of any kind (enum, class, > struct, union) is used as a variable type, it is emitted without > emitting the parent type. In CodeView, parent types point to their inner > types, and inner types do not point back to their parents. We already > walk over all of the parent scopes to build the fully qualified name. > This change simply requests their type indices as we go along to ensure > they are all emitted. > > Fixes PR43905 > > Reviewers: akhuang, amccarth > > Differential Revision: https://reviews.llvm.org/D69924	2019-11-08 11:30:33 +01:00
Sanne Wouda	f649f24d38	[RAGreedy] Enable -consider-local-interval-cost for AArch64 Summary: The greedy register allocator occasionally decides to insert a large number of unnecessary copies, see below for an example. The -consider-local-interval-cost option (which X86 already enables by default) fixes this. We enable this option for AArch64 only after receiving feedback that this change is not beneficial for PowerPC. We evaluated the impact of this change on compile time, code size and performance benchmarks. This option has a small impact on compile time, measured on CTMark. A 0.1% geomean regression on -O1 and -O2, and 0.2% geomean for -O3, with at most 0.5% on individual benchmarks. The effect on both code size and performance on AArch64 for the LLVM test suite is nil on the geomean with individual outliers (ignoring short exec_times) between: best worst size..text -3.3% +0.0% exec_time -5.8% +2.3% On SPEC CPU® 2017 (compiled for AArch64) there is a minor reduction (-0.2% at most) in code size on some benchmarks, with a tiny movement (-0.01%) on the geomean. Neither intrate nor fprate show any change in performance. This patch makes the following changes. - For the AArch64 target, enableAdvancedRASplitCost() now returns true. - Ensures that -consider-local-interval-cost=false can disable the new behaviour if necessary. This matrix multiply example: $ cat test.c long A[8][8]; long B[8][8]; long C[8][8]; void run_test() { for (int k = 0; k < 8; k++) { for (int i = 0; i < 8; i++) { for (int j = 0; j < 8; j++) { C[i][j] += A[i][k] * B[k][j]; } } } } results in the following generated code on AArch64: $ clang --target=aarch64-arm-none-eabi -O3 -S test.c -o - [...] 
// %for.cond1.preheader // =>This Inner Loop Header: Depth=1 add x14, x11, x9 str q0, [sp, #16] // 16-byte Folded Spill ldr q0, [x14] mov v2.16b, v15.16b mov v15.16b, v14.16b mov v14.16b, v13.16b mov v13.16b, v12.16b mov v12.16b, v11.16b mov v11.16b, v10.16b mov v10.16b, v9.16b mov v9.16b, v8.16b mov v8.16b, v31.16b mov v31.16b, v30.16b mov v30.16b, v29.16b mov v29.16b, v28.16b mov v28.16b, v27.16b mov v27.16b, v26.16b mov v26.16b, v25.16b mov v25.16b, v24.16b mov v24.16b, v23.16b mov v23.16b, v22.16b mov v22.16b, v21.16b mov v21.16b, v20.16b mov v20.16b, v19.16b mov v19.16b, v18.16b mov v18.16b, v17.16b mov v17.16b, v16.16b mov v16.16b, v7.16b mov v7.16b, v6.16b mov v6.16b, v5.16b mov v5.16b, v4.16b mov v4.16b, v3.16b mov v3.16b, v1.16b mov x12, v0.d[1] fmov x15, d0 ldp q1, q0, [x14, #16] ldur x1, [x10, #-256] ldur x2, [x10, #-192] add x9, x9, #64 // =64 mov x13, v1.d[1] fmov x16, d1 ldr q1, [x14, #48] mul x3, x15, x1 mov x14, v0.d[1] fmov x17, d0 mov x18, v1.d[1] fmov x0, d1 mov v1.16b, v3.16b mov v3.16b, v4.16b mov v4.16b, v5.16b mov v5.16b, v6.16b mov v6.16b, v7.16b mov v7.16b, v16.16b mov v16.16b, v17.16b mov v17.16b, v18.16b mov v18.16b, v19.16b mov v19.16b, v20.16b mov v20.16b, v21.16b mov v21.16b, v22.16b mov v22.16b, v23.16b mov v23.16b, v24.16b mov v24.16b, v25.16b mov v25.16b, v26.16b mov v26.16b, v27.16b mov v27.16b, v28.16b mov v28.16b, v29.16b mov v29.16b, v30.16b mov v30.16b, v31.16b mov v31.16b, v8.16b mov v8.16b, v9.16b mov v9.16b, v10.16b mov v10.16b, v11.16b mov v11.16b, v12.16b mov v12.16b, v13.16b mov v13.16b, v14.16b mov v14.16b, v15.16b mov v15.16b, v2.16b ldr q2, [sp] // 16-byte Folded Reload fmov d0, x3 mul x3, x12, x1 [...] With -consider-local-interval-cost the same section of code results in the following: $ clang --target=aarch64-arm-none-eabi -mllvm -consider-local-interval-cost -O3 -S test.c -o - [...] 
.LBB0_1: // %for.cond1.preheader // =>This Inner Loop Header: Depth=1 add x14, x11, x9 ldp q0, q1, [x14] ldur x1, [x10, #-256] ldur x2, [x10, #-192] add x9, x9, #64 // =64 mov x12, v0.d[1] fmov x15, d0 mov x13, v1.d[1] fmov x16, d1 ldp q0, q1, [x14, #32] mul x3, x15, x1 cmp x9, #512 // =512 mov x14, v0.d[1] fmov x17, d0 fmov d0, x3 mul x3, x12, x1 [...] Reviewers: SjoerdMeijer, samparker, dmgreen, qcolombet Reviewed By: dmgreen Subscribers: ZhangKang, jsji, wuzish, ppc-slack, lkail, steven.zhang, MatzeB, qcolombet, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69437	2019-11-08 10:20:28 +00:00
Galina Kistanova	ad3c9d46fe	Revert "[MachineVerifier] Improve verification of live-in lists." This reverts commit `b7b170c` to give the author more time to address failing tests on the expensive checks buildbots.	2019-11-07 14:02:13 -08:00
Reid Kleckner	d91ed80e97	[codeview] Reference types in type parent scopes Without this change, when a nested tag type of any kind (enum, class, struct, union) is used as a variable type, it is emitted without emitting the parent type. In CodeView, parent types point to their inner types, and inner types do not point back to their parents. We already walk over all of the parent scopes to build the fully qualified name. This change simply requests their type indices as we go along to ensure they are all emitted. Fixes PR43905 Reviewers: akhuang, amccarth Differential Revision: https://reviews.llvm.org/D69924	2019-11-07 13:58:01 -08:00
Simon Pilgrim	77cfe83f7d	PostRAScheduler - fix uninitialized variable warning. NFCI.	2019-11-07 16:56:16 +00:00
Sanjay Patel	777d1d1d98	[SDAG] reduce code duplication; NFC	2019-11-07 10:28:45 -05:00
Sanjay Patel	2fdd58c506	[SDAG] reduce code duplication; NFC	2019-11-07 10:15:17 -05:00
Dávid Bolvanský	62ad212825	[Analysis] Attribute deref/deref_or_null should not prevent tail call optimization	2019-11-06 23:08:07 +01:00
Philip Reames	db036ee0a4	[X86/Atomics] Correct a few transforms for new atomic lowering This is a partial fix for the issues described in commit message of `027aa27` (the revert of G24609). Unfortunately, I can't provide test coverage for it on its own as the only (known) wrong example is still wrong, but due to a separate issue. These fixes are cases where when performing unrelated DAG combines, we were dropping the atomicity flags entirely.	2019-11-05 13:20:08 -08:00
Amy Huang	a078c77d72	[MIR] Add MIR parsing for heap alloc site instruction markers Summary: This patch adds MIR parsing and printing for heap alloc markers, which were added in D69136. They are printed as an operand similar to pre-/post-instr symbols, with a heap-alloc-marker token and a metadata node. Reviewers: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69864	2019-11-05 12:57:45 -08:00
Daniel Sanders	e74c5b9661	[globalisel] Rename G_GEP to G_PTR_ADD Summary: G_GEP is rather poorly named. It's a simple pointer+scalar addition and doesn't support any of the complexities of getelementptr. I therefore propose that we rename it. There's a G_PTR_MASK so let's follow that convention and go with G_PTR_ADD Reviewers: volkan, aditya_nandakumar, bogner, rovka, arsenm Subscribers: sdardis, jvesely, wdng, nhaehnle, hiraditya, jrtc27, atanasyan, arphaman, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69734	2019-11-05 10:31:17 -08:00
Simon Pilgrim	117e6dd6cc	Remove redundant assignment. NFCI. Fixes cppcheck warning.	2019-11-05 17:08:08 +00:00
Simon Pilgrim	76166a1ac7	Use iterator prefix increment. NFCI.	2019-11-05 17:08:08 +00:00
Simon Pilgrim	7ad2583613	[MachineOutliner] Reduce scope of variable and stop duplicate getMF() calls. NFCI.	2019-11-05 17:08:08 +00:00
jmolloy	39525a6723	[DFAPacketizer] Allow up to 64 functional units Summary: To drive the automaton we used a uint64_t as an action type. This contained the transition's resource requirements as a conjunction: (a OR b) AND (b OR c) We encoded this conjunction as a sequence of four 16-bit bitmasks. This limited the number of addressable functional units to 16, which is quite low and has bitten many people in the past. Instead, the DFAEmitter now generates a lookup table from InstrItinerary class (index of the ItinData inside the ProcItineraries) to an internal action index which is essentially a dense embedding of the conjunctive form. Because we never materialize the conjunctive form, we no longer have the 16 FU restriction. In this patch we limit to 64 functional units due to using a uint64_t bitmask in the DFAEmitter. Now that we've decoupled these representations we can increase this in future. Reviewers: ThomasRaoux, kparzysz, majnemer Reviewed By: ThomasRaoux Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69110	2019-11-05 15:41:42 +00:00
Simon Pilgrim	c7f127d93f	[MachineOutliner] Fix uninitialized variable warnings. NFCI.	2019-11-05 15:15:14 +00:00
Dávid Bolvanský	9f294fc497	[AtomicExpandPass] Silence static analyzer warnings about operator priority. NFCI.	2019-11-05 13:55:46 +01:00
David Green	f01b9aa89e	[MachineScheduler] Enable AA in PostRA Machine scheduler This adds AA to Post-RA Machine Scheduling, allowing the pass more freedom when handling memory operations. My understanding is that this was just never done, not that it is inherently incorrect to do so. The older PostRA List scheduler already makes use of AA, it's just that the MI PostRA Scheduler was never taught to use it. Differential Revision: https://reviews.llvm.org/D69814	2019-11-05 11:58:50 +00:00
Thomas Preud'homme	646896a442	Fix PR40644: miscompile indexed FP constant store Summary: Functions replaceStoreOfFPConstant() and OptimizeFloatStore() both replace store of float by a store of an integer unconditionally. However this generates wrong code when the store that is replaced is an indexed or truncating store. This commit solves this issue by adding an early return in these functions when the store being considered is not a normal store. Bug was only observed on out of tree targets, hence the lack of testcase in this commit. Reviewers: efriedma Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68420	2019-11-05 11:07:52 +00:00
David Green	7d9af03ff7	[Scheduling][ARM] Consistently enable PostRA Machine scheduling In the ARM backend, for historical reasons we have only some targets using Machine Scheduling. The rest use the old list scheduler as they are using itineraries and the list scheduler seems to produce better code (and not crash running out of registers on v6m codes). So whether to use the MIScheduler or not is checked at runtime from the subtarget features. This is fine, except for post-ra scheduling. Whether to use the old post-ra list scheduler or the post-ra machine scheduler is decided as the pass manager is set up, in ARM's case from a newly constructed subtarget. Under some situations, like LTO, this won't include the correct cpu so can pick the wrong option. This can have a surprising effect on performance. To fix that, this patch overrides targetSchedulesPostRAScheduling and addPreSched2 in the ARM backend, adding _both_ post-ra schedulers and picking at runtime which to execute. To pick between the two I've had to add an enablePostRAMachineScheduler() method that normally returns enableMachineScheduler() && enablePostRAScheduler(), which can be overridden to enable just one of PostRAMachineScheduler vs PostRAScheduler. Thanks to David Penry for identifying this problem. Differential Revision: https://reviews.llvm.org/D69775	2019-11-05 10:44:55 +00:00
Sjoerd Meijer	92164cf25d	Recommit "[HardwareLoops] Optimisation remarks" With a few things fixed: - initialisation of the optimisation remark pass (this was causing the buildbot failures on PPC), - a test case. Differential Revision: https://reviews.llvm.org/D69660	2019-11-05 09:06:22 +00:00
aqjune	58acbce3de	[IR] Add Freeze instruction Summary: - Define Instruction::Freeze, let it be UnaryOperator - Add support for freeze to LLLexer/LLParser/BitcodeReader/BitcodeWriter The format is `%x = freeze <ty> %v` - Add support for freeze instruction to llvm-c interface. - Add m_Freeze in PatternMatch. - Erase freeze when lowering IR to SelDag. Reviewers: deadalnix, hfinkel, efriedma, lebedev.ri, nlopes, jdoerfert, regehr, filcab, delcypher, whitequark Reviewed By: lebedev.ri, jdoerfert Subscribers: jfb, kristof.beyls, hiraditya, lebedev.ri, steven_wu, dexonsmith, xbolva00, delcypher, spatel, regehr, trentxintong, vsk, filcab, nlopes, mehdi_amini, deadalnix, llvm-commits Differential Revision: https://reviews.llvm.org/D29011	2019-11-05 15:54:56 +09:00
Sanjay Patel	113181e9bd	[DAGCombine][MSP430] use shift amount threshold in DAGCombine (2/2) Continuation of: D69116 Contributes to a fix for PR43559: https://bugs.llvm.org/show_bug.cgi?id=43559 See also D69099 and D69116 Use the TLI hook in DAGCombine.cpp to guard against creating shift nodes that are not optimal for a target. Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69120	2019-11-04 13:41:41 -05:00
Ulrich Weigand	664f84e246	[FPEnv][SelectionDAG] Refactor strict FP node construction Small refactoring in visitConstrainedFPIntrinsic that should make it easier to create DAG nodes requiring extra arguments. That is the case currently only for STRICT_FP_ROUND, but may be the case for additional nodes (in particular compares) in the future. Extracted from the patch for D69281. NFC.	2019-11-04 17:45:54 +01:00
Jonas Paulsson	b7b170c9b4	[MachineVerifier] Improve verification of live-in lists. MachineVerifier::visitMachineFunctionAfter() is extended to check the live-through case for live-in lists. This is only done for registers without aliases and that are neither allocatable nor reserved, such as the SystemZ::CC register. The MachineVerifier earlier only caught the case of a live-in use without an entry in the live-in list (as "using an undefined physical register"). A comment in LivePhysRegs.h has been added stating a guarantee that addLiveOuts() can be trusted for a full register both before and after register allocation. Review: Quentin Colombet https://reviews.llvm.org/D68267	2019-11-04 16:22:00 +01:00
Dávid Bolvanský	a18a8db0d4	[SelectionDAG] Fixed null check after dereferencing warning. NFCI.	2019-11-03 19:34:03 +01:00
shkzhang	4e9778e346	[CodeGen] [ExpandReduction] Fix the bug for ExpandReduction() when vector size isn't power of 2 Summary: For the test case below, we get an assertion failure on all targets except AArch64 and ARM: declare i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a) define i8 @test_v3i8(<3 x i8> %a) nounwind { %b = call i8 @llvm.experimental.vector.reduce.and.i8.v3i8(<3 x i8> %a) ret i8 %b } The function getShuffleReduction() requires the vector size to be a power of 2. This patch fixes the error that occurs when the number of elements is not a power of 2 for the llvm.experimental.vector.reduce.* functions. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D68625	2019-11-02 23:59:12 -04:00
Simon Pilgrim	095d2a4ced	FastISel - fix uninitialized variable warnings in constructor. NFCI.	2019-11-02 18:03:22 +00:00
Simon Pilgrim	97725707f4	Fix uninitialized variable warning. NFCI.	2019-11-02 14:42:38 +00:00
David Blaikie	89b7f16204	DebugInfo: Streamline debug_ranges/rnglists/rnglists.dwo emission code More code reuse, better basis for modelling debug_loc/loclists/loclists.dwo emission support.	2019-11-01 14:56:43 -07:00
Craig Topper	96bb076621	[TargetLowering] Move the setBooleanContents check on (xor (setcc), (setcc)) == / != 1 -> (setcc) != / == (setcc) to the right place We need to be checking the value types for the inner setccs not the outer setcc. We need to ensure those setccs produce a 0/1 value or that the xor is on the i1 type. I think at the time this code was originally written, getBooleanContents didn't take any arguments so this was probably correct. But now we can have a different boolean contents for integer and floating point. Not sure why the other combines below the xor were also checking the boolean contents. None of them involve any setccs other than the outer one and they only produce a new setcc. Differential Revision: https://reviews.llvm.org/D69480	2019-11-01 14:43:17 -07:00
Craig Topper	42d77461f3	[MachineBasicBlock] Skip over debug instructions in computeRegisterLiveness before checking for begin If there are debug instructions before the stopping point, we need to skip over them before checking for begin in order to avoid having the debug instructions affect behavior. Fixes PR43758. Differential Revision: https://reviews.llvm.org/D69606	2019-11-01 14:43:17 -07:00
Reid Kleckner	f5d935c167	[WinCFG] Handle constant casts carefully in .gfids emission Summary: The general Function::hasAddressTaken has two issues that make it inappropriate for our purposes: 1. it is sensitive to dead constant users (PR43858 / crbug.com/1019970), leading to different codegen when debug info is enabled 2. it considers direct calls via a function cast to be address escapes The first is fixable, but the second is not, because IPO clients rely on this behavior. They assume this function means that all call sites are analyzable for IPO purposes. So, implement our own analysis, which gets closer to finding functions that may be indirect call targets. Reviewers: ajpaverd, efriedma, hans Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69676	2019-11-01 13:32:03 -07:00
Bjorn Pettersson	56c22931bd	[LDV][RAGreedy] Inform LiveDebugVariables about new VRegs added by InlineSpiller Summary: Make sure RAGreedy informs LiveDebugVariables about new VRegs that are introduced at spill by InlineSpiller. Consider this example LDV: !"var" [48r;128r):0 Loc0=%2 48B %2 = ... ... 128B %7 = ADD %2, ... If %2 is spilled the InlineSpiller will insert spill/reload instructions and introduces some new vregs. So we get 48B %4 = ... 56B spill %4 ... 120B reload %5 128B %3 = ADD %5, ... In the past we did not inform LDV about this, and when reintroducing DBG_VALUE instruction LDV still got information that "var" had the location of the spilled register %2 for the interval [48r;128r). The result was bad, since we mapped "var" to the spill slot even before the spill happened: %4 = ... DBG_VALUE %spill.0, !"var" spill %4 to %spill.0 ... reload %5 %3 = ADD %5, ... This patch will inform LDV about the interval split introduced due to spilling. So the location map in LDV will become !"var" [48r;56r):1 [56r;120r):0 [120r;128r):2 Loc0=%2 Loc1=%4 Loc2=%5 And when inserting DBG_VALUE instructions we get %4 = ... DBG_VALUE %4, !"var" spill %4 to %spill.0 DBG_VALUE %spill.0, !"var" ... reload %5 DBG_VALUE %5, !"var" %3 = ADD %5, ... Fixes: https://bugs.llvm.org/show_bug.cgi?id=38899 Reviewers: jmorse, vsk, aprantl Reviewed By: jmorse Subscribers: dstenb, wuzish, MatzeB, qcolombet, nemanjai, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69584	2019-11-01 16:25:32 +01:00
Matt Arsenault	6221767055	DAG: Add DAG argument to isFPExtFoldable For AMDGPU this is dependent on the FP mode, which should eventually not be a property of the subtarget.	2019-10-31 22:32:45 -07:00
Fangrui Song	44d0c3d947	[PGO][PGSO] Fix -DBUILD_SHARED_LIBS=on builds after D69580/llvmorg-10-init-8797-g0d987e411ac Move TargetLoweringBase::isSuitableForJumpTable from llvm/CodeGen/TargetLowering.h to .cpp, to avoid the undefined reference from all LLVM${Target}ISelLowering.cpp. Another fix is to add a dependency on TransformUtils to all lib/Target/$Target/LLVMBuild.txt, but that is too disruptive.	2019-10-31 14:02:29 -07:00
Hiroshi Yamauchi	0d987e411a	[PGO][PGSO] TargetLowering/TargetTransformationInfo/SwitchLoweringUtils part. Summary: (Split off of D67120) TargetLowering/TargetTransformationInfo/SwitchLoweringUtils changes for profile guided size optimization. Reviewers: davidxl Subscribers: eraman, hiraditya, haicheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69580	2019-10-31 13:22:56 -07:00
Simon Pilgrim	3842b94c4e	Revert rG57ee0435bd47f23f3939f402914c231b4f65ca5e - [TII] Use optional destination and source pair as a return value; NFC This is breaking MSVC builds: http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/20375	2019-10-31 18:00:29 +00:00
Sanne Wouda	f2cb9c0eab	Fix missing memcpy, memmove and memset tail calls Summary: If a wrapper around one of the mem* stdlib functions bitcasts the returned pointer value before returning it (e.g. to a wchar_t*), LLVM does not emit a tail call. Add a check for this scenario so that we emit a tail call. Reviewers: wmi, mkuper, ramred01, dmgreen Reviewed By: wmi, dmgreen Subscribers: hiraditya, sanwou01, javed.absar, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59078	2019-10-31 16:13:29 +00:00
Matt Arsenault	1725f28841	DAG: Add new control for ISD::FMAD formation For AMDGPU this depends on whether denormals are enabled in the default FP mode for the function. Currently this is treated as a subtarget feature, so FMAD is selectively legal based on that. I want to move this out of the subtarget features so this can be controlled with a denormal mode attribute. Additionally, this will allow folding based on a future ftz fast math flag.	2019-10-31 07:51:38 -07:00
Djordje Todorovic	57ee0435bd	[TII] Use optional destination and source pair as a return value; NFC Refactor usage of isCopyInstrImpl, isCopyInstr and isAddImmediate methods to return optional machine operand pair of destination and source registers. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D69622	2019-10-31 15:34:49 +01:00
Jeremy Morse	a8db456b53	Revert "[DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions" This reverts commit `ee50590e16`. PR43855 reports a performance regression from this commit, which I'll look into.	2019-10-31 12:39:06 +00:00
Jeremy Morse	d382a8a768	Revert "[DebugInfo] MachineSink: find more DBG_VALUEs to sink" This reverts commit `f5e1b718a6`. PR43855 reports a performance regression with commit `ee50590e`. This commit depends on the faulty one, so has to come out too.	2019-10-31 12:39:06 +00:00
David Candler	92aa0c2dbc	[cfi] Add flag to always generate .debug_frame This adds a flag to LLVM and clang to always generate a .debug_frame section, even if other debug information is not being generated. In situations where .eh_frame would normally be emitted, both .debug_frame and .eh_frame will be used. Differential Revision: https://reviews.llvm.org/D67216	2019-10-31 09:48:30 +00:00
Quentin Colombet	f0eeb3c7a7	[GISel][CombinerHelper] Combine shuffle_vector scalar to build_vector Teach the combiner helper how to replace shuffle_vector of scalars into build_vector. I am not particularly happy about having to add this combine, but we currently get those from <1 x iN> from the IR. Bonus: This fixes an assert in the shuffle_vector combines since before this patch, we were expecting vector types.	2019-10-30 18:20:37 -07:00
Matt Arsenault	0202fa3a47	RegAllocFast: Use Register	2019-10-30 14:40:21 -07:00
Jeremy Morse	3137fe4d23	[DebugInfo][DAG] Distinguish different kinds of location indirection From SelectionDAG's point of view, debug variable locations specified with dbg.declare and dbg.addr are indirect -- they specify the address of something. But calling conventions might mean that a Value is placed on the stack somewhere, and this too is indirection. Previously this was mixed up in the "IsIndirect" field of DBG_VALUE insts; this patch separates them by encoding the indirection in a DIExpression. If we have a dbg.declare or dbg.addr, then the expression produces an address that then becomes a DWARF memory location. We can represent this by putting a DW_OP_deref on the _end_ of the expression. If a Value has been placed on the stack, then we need to put a DW_OP_deref on the _start_ of the expression, to load the Value from the stack and have the rest of the expression operate on it. Differential Revision: https://reviews.llvm.org/D69028	2019-10-30 18:41:44 +00:00
Kevin P. Neal	72bc291f94	[NFC] Move this set of STRICT_* cases to be next to the non-strict cases. Requested by Cameron McInally in D69275.	2019-10-30 13:32:27 -04:00
David Tellenbach	fbe7f5e972	[NFC][MachineOutliner] Fix typo in comment	2019-10-30 16:28:11 +00:00
Jay Foad	86549c7528	[SelectionDAG] Add support for FP_ROUND in WidenVectorOperand. Summary: This is used on AMDGPU for rounding from v3f64 (which is illegal) to v3f32 (which is legal). Subscribers: jvesely, nhaehnle, tpr, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69339	2019-10-30 15:18:21 +00:00
Krzysztof Parzyszek	43144ffa91	LiveIntervals: Split live intervals on multiple dead defs This is a follow-up to D67448. Split live intervals with multiple dead defs during the initial execution of the live interval analysis, but do it outside of the function createAndComputeVirtRegInterval. Differential Revision: https://reviews.llvm.org/D68666	2019-10-30 08:50:46 -05:00
Djordje Todorovic	532815dd5c	[ARM][AArch64][DebugInfo] Improve call site instruction interpretation Extend the describeLoadedValue() with support for target specific ARM and AArch64 instructions interpretation. The patch provides specialization for ADD and SUB operations that include a register and an immediate/offset operand. Some of the instructions can operate with global string addresses or constant pool indexes but such cases are omitted since we currently lack flexible support for processing such operands at DWARF production stage. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67556	2019-10-30 13:58:14 +01:00
David Zarzycki	f68925d450	[X86] Make memcmp vector lowering handle arbitrary expansions Teach combineVectorSizedSetCCEquality() to handle arbitrary memcmp expansions but do not change any default policy for now. This also fixes a bug in the memcmp expansion itself when large displacements are needed. https://reviews.llvm.org/D69507	2019-10-30 09:12:57 +02:00
Adrian Prantl	f919be3365	[DWARF5] Added support for deleted C++ special member functions. This patch adds support for deleted C++ special member functions in clang and llvm. Also added Defaulted member encodings for future support for defaulted member functions. Patch by Sourabh Singh Tomar! Differential Revision: https://reviews.llvm.org/D69215	2019-10-29 13:44:06 -07:00
Philip Reames	2460989eab	[SelectionDAG] Enable lowering unordered atomics loads w/LoadSDNode (and stores w/StoreSDNode) by default Enable the new SelectionDAG representation for unordered loads and stores introduced in r371441 by default. As a reminder, the new lowering changes the representation of an unordered atomic load from an AtomicSDNode - which is essentially a black box which gets passed through without combines messing with it - to a LoadSDNode w/an atomic marker on the MMO. The latter parallels the way we handle volatiles, and I've audited the code to ensure that every location which checks one checks the other. This has been fairly heavily fuzzed, and I examined diffs in a reasonably large corpus of assembly by hand, so I'm reasonably sure this is correct for the common case. Late in the review for this, it was discovered that I hadn't correctly handled cases which could be legalized into CAS operations. This points out that there's a strong bias in the IR of the frontend I'm working with towards only legal atomics. If there are problems with this patch, the most likely area will be legalization. Differential Revision: https://reviews.llvm.org/D69219	2019-10-29 12:46:24 -07:00
Sander de Smalen	d6a7da80aa	Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) llvm/test/DebugInfo/MIR/X86/live-debug-values-reg-copy.mir failed with EXPENSIVE_CHECKS enabled, causing the patch to be reverted in rG2c496bb5309c972d59b11f05aee4782ddc087e71. This patch relands the patch with a proper fix to the live-debug-values-reg-copy.mir tests, by ensuring the MIR encodes the callee-saves correctly so that the CalleeSaved info is taken from MIR directly, rather than letting it be recalculated by the PEI pass. I've done this by running `llc -stop-before=prologepilog` on the LLVM IR as captured in the test files, adding the extra MOV instructions that were manually added in the original test file, then running `llc -run-pass=prologepilog` and finally re-added the comments for the MOV instructions.	2019-10-29 16:13:07 +00:00
Greg Bedwell	1ba72a81ca	Fix some spelling mistakes in comments. NFC	2019-10-29 12:41:24 +00:00
Greg Bedwell	b1c4b4d5cb	Fix a spelling mistake in a comment. NFC	2019-10-29 12:19:52 +00:00
Andrea Di Biagio	67720e7bf7	Revert "[NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap" This reverts commit `8af5ada093`. As Bjorn pointed out in D68816, the iteration over `UserVals` may not be safe. Reverting on behalf of Orlando.	2019-10-29 12:13:23 +00:00
Simon Pilgrim	ec82eb2d02	Fix unused variable warning. NFCI.	2019-10-29 12:12:28 +00:00
Simon Pilgrim	2c496bb530	Revert rG70f5aecedef9a6e347e425eb5b843bf797b95319 - "Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2)" This fails on EXPENSIVE_CHECKS builds	2019-10-29 11:54:58 +00:00
Jeremy Morse	ec32dff0b0	[BranchFolding] skip debug instr to avoid code change Use the existing helper function in BranchFolding, "countsAsInstruction", to skip over non-instructions. Otherwise debug instructions can be identified as the last real instruction in a block, leading to different codegen decisions when debug is enabled as demonstrated by the test case. Patch by: yechunliang (Chris Ye)! Differential Revision: https://reviews.llvm.org/D66467	2019-10-29 11:45:38 +00:00
Amy Huang	742043047c	Recommit "Add a heap alloc site marker field to the ExtraInfo in MachineInstrs" Summary: Fixes some things from original commit at https://reviews.llvm.org/D69136. The main change is that the heap alloc marker is always stored as ExtraInfo in the machine instruction instead of in the PointerSumType because it cannot hold more than 4 pointer types. Add instruction marker to MachineInstr ExtraInfo. This does almost the same thing as Pre/PostInstrSymbols, except that it doesn't create a label until printing instructions. This allows for labels to be put around instructions that are deleted/duplicated somewhere. Use this marker to track heap alloc site call instructions. Reviewers: rnk Subscribers: MatzeB, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69536	2019-10-28 16:59:32 -07:00
Puyan Lotfi	6b7615ae9a	[MachineOutliner][NFC] clang-formating the MachineOutliner.	2019-10-28 17:58:27 -04:00
Hiroshi Yamauchi	75f72f6b73	[PGO][PGSO] SizeOpts changes. Summary: (Split off of D67120) SizeOpts/MachineSizeOpts changes for profile guided size optimization. (A second try after previously committed as r375254 and reverted as r375375.) Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69409	2019-10-28 12:57:26 -07:00
Francis Visoiu Mistrih	c7557dd692	[Remarks] Remove references to ELF support There is no ELF support at the moment. Remove all the references to the `.remarks` section.	2019-10-28 12:50:46 -07:00
Francis Visoiu Mistrih	209d5a12c5	[Remarks] Emit the remarks section by default for certain formats Emit a remarks section by default for the following formats: * bitstream * yaml-strtab while still providing -remarks-section=<bool> to override the defaults.	2019-10-28 12:50:46 -07:00
Puyan Lotfi	a51fc8ddf8	[MachineOuliner][NFC] Refactoring code to make outline rerunning a cleaner diff. I want to add the ability to rerun the outliner in certain cases, and I thought this could be an NFC change that could make a subsequent change that allows for rerunning the outliner a cleaner diff. Differential Revision: https://reviews.llvm.org/D69482	2019-10-28 15:13:45 -04:00
Nico Weber	e59f7488c7	Convert files added in `d157a9bc8b` to unix line endings. Ran: git show --diff-filter=A --stat `d157a9bc8b` \| grep '\|' \| \ awk '{ print $1 }' \| xargs dos2unix	2019-10-28 14:39:45 -04:00
Sander de Smalen	70f5aecede	Reland [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) Fixed up test/DebugInfo/MIR/Mips/live-debug-values-reg-copy.mir that broke r375425.	2019-10-28 18:05:19 +00:00
Andrew Paverd	d157a9bc8b	Add Windows Control Flow Guard checks (/guard:cf). Summary: A new function pass (Transforms/CFGuard/CFGuard.cpp) inserts CFGuard checks on indirect function calls, using either the check mechanism (X86, ARM, AArch64) or the dispatch mechanism (X86-64). The check mechanism requires a new calling convention for the supported targets. The dispatch mechanism adds the target as an operand bundle, which is processed by SelectionDAG. Another pass (CodeGen/CFGuardLongjmp.cpp) identifies and emits valid longjmp targets, as required by /guard:cf. This feature is enabled using the `cfguard` CC1 option. Reviewers: thakis, rnk, theraven, pcc Subscribers: ychen, hans, metalcanine, dmajor, tomrittervg, alex, mehdi_amini, mgorny, javed.absar, kristof.beyls, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D65761	2019-10-28 15:19:39 +00:00
Jeremy Morse	f5e1b718a6	[DebugInfo] MachineSink: find more DBG_VALUEs to sink In the Pre-RA machine sinker, previously we were relying on all DBG_VALUEs being immediately after the instruction that defined their operands. This isn't a valid assumption, as a variable location change doesn't necessarily correspond to where the value is computed. In this patch, we collect DBG_VALUEs that might need sinking as we walk through a block, and sink all of them if their defining instruction is sunk. This patch adds some copy propagation too, so that if we sink a copy inst, the now non-dominated paths can use the copy source for the variable location. Differential Revision: https://reviews.llvm.org/D58386	2019-10-28 14:32:50 +00:00
Sanjay Patel	1ebd4a2e3a	[DAGCombiner] widen any_ext of popcount based on target support This enhances D69127 (rGe6c145e0548e3b3de6eab27e44e1504387cf6b53) to handle the looser "any_extend" cast in addition to zext. This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688: https://bugs.llvm.org/show_bug.cgi?id=43688	2019-10-28 10:07:12 -04:00
Jeremy Morse	ee50590e16	[DebugInfo] MachineSink: Insert undef DBG_VALUEs when sinking instructions When we sink DBG_VALUEs between blocks, we simply move the DBG_VALUE instruction to below the sunk instruction. However, we should also mark the variable as being undef at the original location, to terminate any earlier variable location. This patch does that -- plus, if the instruction being sunk is a copy, it attempts to propagate the copy through the DBG_VALUE, replacing the destination with the source. Differential Revision: https://reviews.llvm.org/D58238	2019-10-28 12:17:56 +00:00
David Green	ba2c625531	[Codegen][ARM] Add float softening for cbrt We would previously have no soft-float softening for cbrt, so could hit a crash failing to select. This fills in what appears to be missing. Differential Revision: https://reviews.llvm.org/D69345	2019-10-28 11:08:55 +00:00
Kerry McLaughlin	da720a38b9	[AArch64][SVE] Implement masked load intrinsics Summary: Adds support for codegen of masked loads, with non-extending, zero-extending and sign-extending variants. Reviewers: huntergr, rovka, greened, dmgreen Reviewed By: dmgreen Subscribers: dmgreen, samparker, tschuett, kristof.beyls, hiraditya, rkruppe, psnobl, cfe-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68877	2019-10-28 10:06:14 +00:00
Sanjay Patel	85a2146c15	[SDAG] fold insert_vector_elt with undef index Similar to: rG4c47617627fb This makes the DAG behavior consistent with IR's insertelement. https://bugs.llvm.org/show_bug.cgi?id=42689 I've tried to maintain test intent for AArch64 and WebAssembly by replacing undef index operands with something else.	2019-10-27 15:28:43 -04:00
Craig Topper	f067dd839e	[LegalizeTypes] When promoting BITREVERSE/BSWAP don't take the shift amount into account when determining the shift amount VT. If the target's preferred shift amount VT can't hold any shift amount for the promoted VT, we should use i32. The specific shift amount shouldn't matter. The type will be adjusted later when the shift itself is type legalized. This avoids an assert in getNode. Fixes PR43820.	2019-10-27 12:20:35 -07:00
Craig Topper	73f255b83a	[TargetLowering] Add getBooleanContents contents check to "SETCC (SETCC), [0|1], [EQ|NE] -> SETCC" combine. This combine is only valid if the inner setcc produces a 0/1 result or the inner type is MVT::i1. I haven't seen this cause any issues, just happened to notice it while reviewing combines in this function. While there also fix another call to use the value type from the SDValue for the operand instead of calling SDNode::getValueType(0). Though it's likely the use is result 0, it's not guaranteed.	2019-10-27 10:07:15 -07:00
Sanjay Patel	4c47617627	[SDAG] fold extract_vector_elt with undef index This makes the DAG behavior consistent with IR's extractelement after: rGb32e4664a715 https://bugs.llvm.org/show_bug.cgi?id=42689 I've tried to maintain test intent for WebAssembly. The AMDGPU test is trying to test for crashing or other bad behavior, but I'm not sure if that's possible after this change.	2019-10-25 19:27:26 -04:00
Matt Arsenault	1a276d1e8c	GlobalISel: Implement widenScalar for G_INSERT_VECTOR_ELT	2019-10-25 13:55:07 -07:00
Guillaume Chatelet	e8a0a0904b	[Alignment][NFC] Convert AllocaInst to MaybeAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69301	2019-10-25 22:41:34 +02:00
Amy Huang	64c1f6602a	Revert "Add an instruction marker field to the ExtraInfo in MachineInstrs." Reverting commit `b85b4e5a6f` due to some buildbot failures/ out of memory errors.	2019-10-25 12:41:34 -07:00
Sanjay Patel	e6c145e054	[DAGCombiner] widen zext of popcount based on target support zext (ctpop X) --> ctpop (zext X) This is a prerequisite step for canonicalizing in the other direction (narrow the popcount) in IR - PR43688: https://bugs.llvm.org/show_bug.cgi?id=43688 I'm not sure if any other targets are affected, but I found a missing fold for PPC, so added tests based on that. The reason we widen all the way to 64-bit in these tests is because the initial DAG looks something like this: t5: i8 = ctpop t4 t6: i32 = zero_extend t5 <-- created based on IR, but unused node? t7: i64 = zero_extend t5 Differential Revision: https://reviews.llvm.org/D69127	2019-10-25 14:10:51 -04:00
Austin Kerbow	c35b358b74	AMDGPU/GlobalISel: Legalize FDIV16 Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, rovka, dstuttard, tpr, t-tye, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69347	2019-10-25 11:07:17 -07:00
Amy Huang	b85b4e5a6f	Add an instruction marker field to the ExtraInfo in MachineInstrs. Summary: Add instruction marker to MachineInstr ExtraInfo. This does almost the same thing as Pre/PostInstrSymbols, except that it doesn't create a label until printing instructions. This allows for labels to be put around instructions that are deleted/duplicated somewhere. Also undo the workaround in r375137. Reviewers: rnk Subscribers: MatzeB, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69136	2019-10-25 09:21:10 -07:00
Itay Bookstein	59a51d84b3	[CodeGen][SelectionDAG] Fix tiny bug in ExpandIntRes_UADDSUBO Summary: Ternary expression checks for ISD::ADD instead of ISD::UADDO inside DAGTypeLegalizer::ExpandIntRes_UADDSUBO. This means the ternary expression will evaluate to ISD::SUBCARRY for both ISD::UADDO and ISD::USUBO nodes. Targets are likely to implement both, so impact will be very limited in practice. Reviewers: bogner, lebedev.ri Reviewed By: lebedev.ri Subscribers: lebedev.ri, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68123	2019-10-25 18:10:51 +03:00
David Stenberg	2a3dc6b74f	Fix a variable typo in LiveDebugValues [NFC]	2019-10-25 11:21:43 +02:00
Djordje Todorovic	8c99a549de	[LiveDebugValues] Small code clean up; NFC	2019-10-25 09:39:42 +02:00
Simon Pilgrim	a18818207a	Fix cppcheck shadow variable warning. NFCI.	2019-10-24 22:14:36 +01:00
Hans Wennborg	684ebc605e	Revert `4334892e7b` "[DAGCombine][ARM] x ==/!= c -> (x - c) ==/!= 0 iff '-c' can be folded into the x node." This broke various Windows builds, see comments on the Phabricator review. This also reverts the follow-up `20bf0cf`. > Summary: > This fold, helps recover from the rest of the D62266 ARM regressions. > https://rise4fun.com/Alive/TvpC > > Note that while the fold is quite flexible, i've restricted it > to the single interesting pattern at the moment. > > Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix > > Reviewed By: deadalnix > > Subscribers: javed.absar, kristof.beyls, llvm-commits > > Tags: #llvm > > Differential Revision: https://reviews.llvm.org/D62450	2019-10-23 19:52:02 +02:00
Mirko Brkusanin	4b63ca1379	[Mips] Use appropriate private label prefix based on Mips ABI MipsMCAsmInfo was using '$' prefix for Mips32 and '.L' for Mips64 regardless of -target-abi option. By passing MCTargetOptions to MCAsmInfo we can find out Mips ABI and pick appropriate prefix. Tags: #llvm, #clang, #lldb Differential Revision: https://reviews.llvm.org/D66795	2019-10-23 12:24:35 +02:00
David Stenberg	74a72e6848	[DebugInfo] Stop describing imms in TargetInstrInfo's describeLoadedValue() impl Summary: The default implementation of the describeLoadedValue() hook uses the MoveImm property to determine if an instruction moves an immediate. If an instruction has that property the function returns the second operand, assuming that that is the immediate value the instruction moves. As far as I can tell, the MoveImm property does not imply that the second operand is the immediate value, nor that any other operand necessarily holds the immediate value; it just means that the instruction moves some immediate value. One example where the second operand is not the immediate is SystemZ's LZER instruction, which moves a zero immediate implicitly: $f0S = LZER. That case triggered an out-of-bound assertion when getting the operand. I have added a test case for that instruction. Another example is ARM's MVN instruction, which holds the logical bitwise NOT'd value of the immediate that is moved. For the following reproducer: extern void foo(int); int main() { foo(-11); } an incorrect call site value would be emitted: $ clang --target=arm foo.c -O1 -g -Xclang -femit-debug-entry-values \ -c -o - | ./build/bin/llvm-dwarfdump - | \ grep -A2 call_site_parameter 0x00000058: DW_TAG_GNU_call_site_parameter DW_AT_location (DW_OP_reg0 R0) DW_AT_GNU_call_site_value (DW_OP_lit10) Another example is the A2_combineii instruction on Hexagon which moves two immediates to a super-register: $d0 = A2_combineii 20, 10. Perhaps these are rare exceptions, and most MoveImm instructions hold the immediate in the second operand, but in my opinion the default implementation of the hook should only describe values that it can, by some contract, guarantee are safe to describe, rather than leaving it up to the targets to override the exceptions, as that can silently result in incorrect call site values. 
This patch adds X86's relevant move immediate instructions to the target's hook implementation, so this commit should be a NFC for that target. We need to do the same for ARM and AArch64. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: kristof.beyls, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D69109	2019-10-23 11:41:29 +02:00
Roman Lebedev	20bf0cf2f0	[TargetLowering] optimizeSetCCToComparisonWithZero(): add extra sanity checks (PR43769) We should do the fold only if both constants are plain, non-opaque constants, at least that is the DAG.FoldConstantArithmetic() requirement. And if the constant we are comparing with is zero - we shouldn't be trying to do this fold in the first place. Fixes https://bugs.llvm.org/show_bug.cgi?id=43769	2019-10-23 12:01:40 +03:00
Roman Lebedev	4334892e7b	[DAGCombine][ARM] x ==/!= c -> (x - c) ==/!= 0 iff '-c' can be folded into the x node. Summary: This fold, helps recover from the rest of the D62266 ARM regressions. https://rise4fun.com/Alive/TvpC Note that while the fold is quite flexible, i've restricted it to the single interesting pattern at the moment. Reviewers: efriedma, craig.topper, spatel, RKSimon, deadalnix Reviewed By: deadalnix Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62450	2019-10-22 22:56:35 +03:00
Petar Avramovic	95290827d7	[MIParser] Set RegClassOrRegBank during instruction parsing MachineRegisterInfo::createGenericVirtualRegister sets RegClassOrRegBank to static_cast<RegisterBank *>(nullptr). MIParser on the other hand doesn't. When we attempt to constrain Register Class on such VReg, additional COPY is generated. This way we avoid COPY instructions showing in test that have MIR input while they are not present with llvm-ir input that was used to create given MIR for a -run-pass test. Differential Revision: https://reviews.llvm.org/D68946 llvm-svn: 375502	2019-10-22 14:25:37 +00:00
Quentin Colombet	6f0ae81512	[GISel][CombinerHelper] Add a combine turning shuffle_vector into concat_vectors Teach the CombinerHelper how to turn shuffle_vectors, that concatenate vectors, into concat_vectors and add this combine to the AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69149 llvm-svn: 375452	2019-10-21 20:39:58 +00:00
Sander de Smalen	8f2dac471a	Reverted r375425 as it broke some buildbots. llvm-svn: 375444	2019-10-21 19:11:40 +00:00
Sander de Smalen	814548ec8e	[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize (Take 2) Commit message from D66935: This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. This patch fixes the lldb unit tests in `functionalities/thread/concurrent_events/*` Changes after D66935: Ensures AArch64FunctionInfo::getCalleeSavedStackSize does not return the uninitialized CalleeSavedStackSize when running `llc` on a specific pass where the MIR code has already been expected to have gone through PEI. Instead, getCalleeSavedStackSize (when passed the MachineFrameInfo) will try to recalculate the CalleeSavedStackSize from the CalleeSavedInfo. In debug mode, the compiler will assert the recalculated size equals the cached size as calculated through a call to determineCalleeSaves. 
This fixes two tests: test/DebugInfo/AArch64/asan-stack-vars.mir test/DebugInfo/AArch64/compiler-gen-bbs-livedebugvalues.mir that otherwise fail when compiled using msan. Reviewed By: omjavaid, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D68783 llvm-svn: 375425	2019-10-21 17:12:56 +00:00
Guillaume Chatelet	301b4128ac	[Alignment][NFC] Finish transition for `Loads` Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, asbirlea, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69253 llvm-svn: 375419	2019-10-21 15:10:26 +00:00
Guillaume Chatelet	5df90cd71c	[Alignment][NFC] TargetCallingConv::setByValAlign Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69248 llvm-svn: 375410	2019-10-21 12:05:33 +00:00
Guillaume Chatelet	bac5f6bd21	[Alignment][NFC] TargetCallingConv::setOrigAlign and TargetLowering::getABIAlignmentForCallingConv Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69243 llvm-svn: 375407	2019-10-21 11:01:55 +00:00
Vladimir Vereschaka	92c96c7bc0	Reverted r375254 as it has broken some build bots for a long time. llvm-svn: 375375	2019-10-20 20:39:33 +00:00
Sanjay Patel	a298964d22	[TargetLowering][DAGCombine][MSP430] add/use hook for Shift Amount Threshold (1/2) Provides a TLI hook to allow targets to relax the emission of shifts, thus enabling codegen improvements on targets with no multiple shift instructions and cheap selects or branches. Contributes to a Fix for PR43559: https://bugs.llvm.org/show_bug.cgi?id=43559 Patch by: @joanlluch (Joan LLuch) Differential Revision: https://reviews.llvm.org/D69116 llvm-svn: 375347	2019-10-19 16:57:02 +00:00
Reid Kleckner	7bbe711fb1	Avoid including CodeView/SymbolRecord.h from MCStreamer.h Move the types needed out so they can be forward declared instead. llvm-svn: 375325	2019-10-19 01:44:09 +00:00
Reid Kleckner	904cd3e06b	Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each Now X86ISelLowering doesn't depend on many IR analyses. llvm-svn: 375320	2019-10-19 01:31:09 +00:00
Reid Kleckner	0ad6c191de	Prune Analysis includes from SelectionDAG.h Only forward declarations are needed here. Follow-on to r375311. llvm-svn: 375319	2019-10-19 01:07:48 +00:00
Reid Kleckner	1d7b41361f	Prune two MachineInstr.h includes, fix up deps MachineInstr.h included AliasAnalysis.h, which includes a world of IR constructs mostly unneeded in CodeGen. Prune it. Same for DebugInfoMetadata.h. Noticed with -ftime-trace. llvm-svn: 375311	2019-10-19 00:22:07 +00:00
Matt Arsenault	d4274f0174	LiveIntervals: Fix handleMoveUp with subreg def moving across a def If a subregister def was moved across another subregister def and another use, the main range was not correctly updated. The end point of the moved interval ended too early and missed the use from the other lanes in the subreg def. llvm-svn: 375300	2019-10-18 23:24:25 +00:00
Hiroshi Yamauchi	7e1637451d	[PGO][PGSO] SizeOpts changes. Summary: (Split off of D67120) SizeOpts/MachineSizeOpts changes for profile guided size optimization. Reviewers: davidxl Subscribers: mgorny, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69070 llvm-svn: 375254	2019-10-18 16:46:01 +00:00
Graham Hunter	84da2596f9	[AArch64][SVE] Add SPLAT_VECTOR ISD Node Adds a new ISD node to replicate a scalar value across all elements of a vector. This is needed for scalable vectors, since BUILD_VECTOR cannot be used. Fixes up default type legalization for scalable vectors after the new MVT type ranges were introduced. At present I only use this node for scalable vectors. A DAGCombine has been added to transform a BUILD_VECTOR into a SPLAT_VECTOR if all elements are the same, but only if the default operation action of Expand has been overridden by the target. I've only added result promotion legalization for scalable vector i8/i16/i32/i64 types in AArch64 for now. Reviewers: t.p.northover, javed.absar, greened, cameron.mcinally, jmolloy Reviewed By: jmolloy Differential Revision: https://reviews.llvm.org/D47775 llvm-svn: 375222	2019-10-18 11:48:35 +00:00
David Green	e6f313b380	[Codegen] Alter the default promotion for saturating adds and subs The default promotion for the add_sat/sub_sat nodes currently does: ANY_EXTEND iN to iM SHL by M-N [US][ADD|SUB]SAT L/ASHR by M-N If the promoted add_sat or sub_sat node is not legal, this can produce code that effectively does a lot of shifting (and requiring large constants to be materialised) just to use the overflow flag. It is simpler to just do the saturation manually, using the higher bitwidth addition and a min/max against the saturating bounds. That is what this patch attempts to do. Differential Revision: https://reviews.llvm.org/D68926 llvm-svn: 375211	2019-10-18 09:47:48 +00:00
David Blaikie	2941cda5be	DebugInfo: Move loclist base address from DwarfFile to DebugLocStream There's no need to have more than one of these (there can be two DwarfFiles - one for the .o, one for the .dwo - but only one loc/loclist section (either in the .o or the .dwo) & certainly one per DebugLocStream, which is currently singular in DwarfDebug) llvm-svn: 375183	2019-10-17 23:02:19 +00:00
David Blaikie	3d737b642a	DebugInfo: Remove unused parameter (from DwarfDebug.cpp:emitListsTableHeaderStart) llvm-svn: 375180	2019-10-17 22:11:40 +00:00
Reid Kleckner	fc69ad0988	[codeview] Workaround for PR43479, don't re-emit instr labels Summary: In the long run we should come up with another mechanism for marking call instructions as heap allocation sites, and remove this workaround. For now, we've had two bug reports about this, so let's apply this workaround. SLH (the other client of instruction labels) probably has the same bug, but the solution there is more likely to be to mark the call instruction as not duplicatable, which doesn't work for debug info. Reviewers: akhuang Subscribers: aprantl, hiraditya, aganea, chandlerc, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D69068 llvm-svn: 375137	2019-10-17 17:28:31 +00:00
James Molloy	12092a9691	[DFAPacketizer] Use DFAEmitter. NFC. Summary: This is a NFC change that removes the NFA->DFA construction and emission logic from DFAPacketizerEmitter and instead uses the generic DFAEmitter logic. This allows DFAPacketizer to use the Automaton class from Support and remove a bunch of logic there too. After this patch, DFAPacketizer is mostly logic for grepping Itineraries and collecting functional units, with no state machine logic. This will allow us to modernize by removing the 16-functional-unit limit and supporting non-itinerary functional units. This is all for followup patches. Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68992 llvm-svn: 375086	2019-10-17 08:34:29 +00:00
Sam Parker	39af8a3a3b	[DAGCombine][ARM] Enable extending masked loads Add generic DAG combine for extending masked loads. Allow us to generate sext/zext masked loads which can access v4i8, v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively. Differential Revision: https://reviews.llvm.org/D68337 llvm-svn: 375085	2019-10-17 07:55:55 +00:00
Marcello Maggioni	6fc9563dba	Move LiveRangeCalc header to publicly available position. NFC Differential Revision: https://reviews.llvm.org/D69078 llvm-svn: 375075	2019-10-17 03:12:51 +00:00
Daniel Sanders	149a020425	Fix unused variable in r375066 llvm-svn: 375070	2019-10-17 01:21:40 +00:00
Daniel Sanders	329e748c8c	[gicombiner] Add the run-time rule disable option Summary: Each generated helper can be configured to generate an option that disables rules in that helper. This can be used to bisect rulesets. The disable bits are stored in a SparseVector as this is very cheap for the common case where nothing is disabled. It gets more expensive the more rules are disabled but you're generally doing that for debug purposes where performance is less of a concern. Depends on D68426 Reviewers: volkan, bogner Reviewed By: volkan Subscribers: hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68438 llvm-svn: 375067	2019-10-17 00:37:04 +00:00
Quentin Colombet	c319afc903	[GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector Teach the combiner helper how to flatten concat_vectors of build_vectors into a build_vector. Add this combine as part of AArch64 pre-legalizer combiner. Differential Revision: https://reviews.llvm.org/D69071 llvm-svn: 375066	2019-10-17 00:34:32 +00:00
Daniel Sanders	ec5208fd65	[gicombiner] Hoist pure C++ combine into the tablegen definition Summary: This is just moving the existing C++ code around and will be NFC w.r.t AArch64. Renamed 'CombineBr' to something more descriptive ('ElideByByInvertingCond') at the same time. The remaining combines in AArch64PreLegalizeCombiner require features that aren't implemented at this point and will be hoisted as they are added. Depends on D68424 Reviewers: bogner, volkan Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68426 llvm-svn: 375057	2019-10-16 23:53:35 +00:00
Matt Arsenault	34ed76e180	GlobalISel: Implement lower for G_SADDO/G_SSUBO Port directly from SelectionDAG, minus the path using ISD::SADDSAT/ISD::SSUBSAT. llvm-svn: 375042	2019-10-16 20:46:32 +00:00
Simon Pilgrim	e2163f96ab	CombinerHelper - silence dead assignment warnings. NFCI. Copy the NewAlignment value to Alignment first and then use that to update the stack frame object alignments. llvm-svn: 375019	2019-10-16 17:21:50 +00:00
Sjoerd Meijer	5a13188966	Revert "[HardwareLoops] Optimisation remarks" while I investigate the PPC build bot failures. This reverts commit `ad76375156`. llvm-svn: 374992	2019-10-16 10:55:06 +00:00
Sjoerd Meijer	ad76375156	[HardwareLoops] Optimisation remarks This adds the initial plumbing to support optimisation remarks in the IR hardware-loop pass. I have left a todo in a comment where we can improve the reporting, and will iterate on that now that we have this initial support in. Differential Revision: https://reviews.llvm.org/D68579 llvm-svn: 374980	2019-10-16 09:09:55 +00:00
Orlando Cazalet-Hyams	8af5ada093	[NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap In LiveDebugVariables.cpp: Prior to this patch, UserValues were grouped into linked list chains. Each chain was the union of two sets: { A: Matching Source variable } or { B: Matching virtual register }. A ptr to the heads (or 'leaders') of each of these chains were kept in a map with the { Source variable } used as the key (set A predicate) and another with { Virtual register } as key (set B predicate). There was a search through the chains in the function getUserValue looking for UserValues with matching { Source variable, Complex expression, Inlined-at location }. Essentially searching for a subset of A through two interleaved linked lists of set A and B. Importantly, by design, the subset will only contain one or zero elements here. That is to say a UserValue can be uniquely identified by the tuple { Source variable, Complex expression, Inlined-at location } if it exists. This patch removes the linked list and instead uses a DenseMap to map the tuple { Source variable, Complex expression, Inlined-at location } to UserValue ptrs so that the getUserValue search predicate is this map key. The virtual register map now maps a vreg to a SmallVector<UserVal *> so that set B is still available for quick searches. Reviewers: aprantl, probinson, vsk, dblaikie Reviewed By: aprantl Subscribers: russell.gallop, gbedwell, bjope, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D68816 llvm-svn: 374979	2019-10-16 08:36:00 +00:00
Craig Topper	8995daafa0	[LegalizeTypes] Don't use PromoteTargetBoolean in WidenVecOp_SETCC. Similar to r374970, but I don't have a test for this. PromoteTargetBoolean is intended to be used for legalizing an operand that needs to be promoted. It picks its type based on the return from getSetccResultType and is intended to be used when we have freedom to pick the new type. But the return type we need for WidenVecOp_SETCC is completely determined by the type of the input node. llvm-svn: 374972	2019-10-16 03:29:24 +00:00
Craig Topper	7b49e8ac35	[LegalizeTypes] Don't call PromoteTargetBoolean from SplitVecOp_VSETCC. PromoteTargetBoolean calls getSetccResultType to get the return type. But we were passing it the setcc result type rather than the setcc input type. This causes an issue on X86 with avx512vl where the setcc result type for vXf16 vectors is vXi16 while the result type for vXi16 vectors is vXi1. There's really no guarantee that getSetccResultType is the type we need here. So now we just grab the extend type from getExtendForContent and extend to the original result VT of the node we're splitting. llvm-svn: 374970	2019-10-16 02:50:04 +00:00
Dmitry Mikulin	f14642f2f1	Added support for "#pragma clang section relro=<name>" Differential Revision: https://reviews.llvm.org/D68806 llvm-svn: 374934	2019-10-15 18:31:10 +00:00
David Zarzycki	59390efef2	[X86] Make memcmp() use PTEST if possible and also enable AVX1 llvm-svn: 374922	2019-10-15 17:40:12 +00:00
Sanjay Patel	d545c9056e	[DAGCombiner] fold select-of-constants based on sign-bit test Examples: i32 X > -1 ? C1 : -1 --> (X >>s 31) | C1 i8 X < 0 ? C1 : 0 --> (X >>s 7) & C1 This is a small generalization of a fold requested in PR43650: https://bugs.llvm.org/show_bug.cgi?id=43650 The sign-bit of the condition operand can be used as a mask for the true operand: https://rise4fun.com/Alive/paT Note that we already handle some of the patterns (isNegative + scalar) because there's an over-specialized, yet over-reaching fold for that in foldSelectCCToShiftAnd(). It doesn't use any TLI hooks, so I can't easily rip out that code even though we're duplicating part of it here. This fold is guarded by TLI.convertSelectOfConstantsToMath(), so it should not cause problems for targets that prefer select over shift. Also worth noting: I thought we could generalize this further to include the case where the true operand of the select is not constant, but Alive says that may allow poison to pass through where it does not in the original select form of the code. Differential Revision: https://reviews.llvm.org/D68949 llvm-svn: 374902	2019-10-15 15:23:57 +00:00
Benjamin Kramer	ce00cd6ae8	[AsmPrinter] Fix unused variable warning in Release builds. NFC. llvm-svn: 374894	2019-10-15 14:23:11 +00:00
David Stenberg	1ae2d9a2bd	[DebugInfo] Add a DW_OP_LLVM_entry_value operation Summary: Internally in LLVM's metadata we use DW_OP_entry_value operations with the same semantics as DWARF; that is, its operand specifies the number of bytes that the entry value covers. At the time of emitting entry values we don't know the emitted size of the DWARF expression that the entry value will cover. Currently the size is hardcoded to 1 in DIExpression, and other values causes the verifier to fail. As the size is 1, that effectively means that we can only have valid entry values for registers that can be encoded in one byte, which are the registers with DWARF numbers 0 to 31 (as they can be encoded as single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump will print an operation "correctly", even if the byte size is less than that, which may make it seem that we emit correct DWARF for registers with DWARF numbers > 31. If you instead use readelf for such cases, it will interpret the number of specified bytes as a DWARF expression. This seems like a limitation in llvm-dwarfdump. As suggested in D66746, a way forward would be to add an internal variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand instead specifies the number of operations that the entry value covers, and we then translate that into the byte size at the time of emission. In this patch that internal operation is added. This patch keeps the limitation that a entry value can only be applied to simple register locations, but it will fix the issue with the size operand being incorrect for DWARF numbers > 31. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: aprantl Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67492 llvm-svn: 374881	2019-10-15 11:31:21 +00:00
Guillaume Chatelet	0e62011df8	[Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned) Summary: This is patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68944 llvm-svn: 374880	2019-10-15 11:24:36 +00:00
David Stenberg	284827f32b	[DebugInfo] Add interface for pre-calculating the size of emitted DWARF Summary: DWARF's DW_OP_entry_value operation has two operands; the first is a ULEB128 operand that specifies the size of the second operand, which is a DWARF block. This means that we need to be able to pre-calculate and emit the size of DWARF expressions before emitting them. There is currently no interface for doing this in DwarfExpression, so this patch introduces that. When implementing this I initially thought about running through DwarfExpression's emission two times; first with a temporary buffer to emit the expression, in order to being able to calculate the size of that emitted data. However, DwarfExpression is a quite complex state machine, so I decided against that, as it seemed like the two runs could get out of sync, resulting in incorrect size operands. Therefore I have implemented this in a way that we only have to run DwarfExpression once. The idea is to emit DWARF to a temporary buffer, for which it is possible to query the size. The data in the temporary buffer can then be emitted to DwarfExpression's main output. In the case of DIEDwarfExpression, a temporary DIE is used. The values are all allocated using the same BumpPtrAllocator as for all other DIEs, and the values are then transferred to the real value list. In the case of DebugLocDwarfExpression, the temporary buffer is implemented using a BufferByteStreamer which emits to a buffer in the DwarfExpression object. Reviewers: aprantl, vsk, NikolaPrica, djtodoro Reviewed By: aprantl Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67768 llvm-svn: 374879	2019-10-15 11:14:35 +00:00
Jeremy Morse	ed29dbaafa	[DebugInfo] Remove some users of DBG_VALUEs IsIndirect field This patch kills off a significant user of the "IsIndirect" field of DBG_VALUE machine insts. Brought up in PR41675, IsIndirect is technically redundant as it can be expressed by the DIExpression of a DBG_VALUE inst, and it isn't helpful to have two ways of expressing things. Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref to the inst's DIExpression. There should now be no appearances of IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter, which is ensured by an assertion in LDVImpl::handleDebugValue. This means we also get to delete the IsIndirect handling in LiveDebugVariables. Tests can be upgraded by for example swapping the following IsIndirect=True DBG_VALUE: DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo) With one where the indirection is in the DIExpression, by _appending_ a deref: DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref) Which both mean the same thing. Most of the test changes in this patch are updates of that form; also some changes in how the textual assembly printer handles these insts. Differential Revision: https://reviews.llvm.org/D68945 llvm-svn: 374877	2019-10-15 10:46:24 +00:00
David Stenberg	d46ac44ecd	Change Comments SmallVector to std::vector in DebugLocStream [NFC] This changes the 32-element SmallVector to a std::vector. When building a RelWithDebInfo clang-8 binary, the average size of the vector was ~10000, so it does not seem very beneficial or practical to use a small vector for that. The DWARFBytes SmallVector grows in the same way as Comments, so perhaps that also should be changed to a purely dynamically allocated structure, but that requires some more code changes, so I let that remain as a SmallVector for now. llvm-svn: 374871	2019-10-15 09:21:09 +00:00
Joerg Sonnenberger	9681ea9560	Reapply r374743 with a fix for the ocaml binding Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374784	2019-10-14 16:15:14 +00:00
David Stenberg	8535bed795	[DebugInfo] Fix truncation of call site immediates Summary: This addresses a bug in collectCallSiteParameters() where call site immediates would be truncated from int64_t to unsigned. This fixes PR43525. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: aprantl Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D68869 llvm-svn: 374770	2019-10-14 12:49:58 +00:00
Dmitri Gribenko	1a21f98ac3	Revert "Add a pass to lower is.constant and objectsize intrinsics" This reverts commit r374743. It broke the build with Ocaml enabled: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218 llvm-svn: 374768	2019-10-14 12:22:48 +00:00
Sam Parker	527a35e155	[NFC][TTI] Add Alignment for isLegalMasked[Load/Store] Add an extra parameter so the backend can take the alignment into consideration. Differential Revision: https://reviews.llvm.org/D68400 llvm-svn: 374763	2019-10-14 10:00:21 +00:00
Joerg Sonnenberger	e4300c392d	Add a pass to lower is.constant and objectsize intrinsics This pass lowers is.constant and objectsize intrinsics not simplified by earlier constant folding, i.e. if the object given is not constant or if not using the optimized pass chain. The result is recursively simplified and constant conditionals are pruned, so that dead blocks are removed even for -O0. This allows inline asm blocks with operand constraints to work all the time. The new pass replaces the existing lowering in the codegen-prepare pass and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert on the intrinsics. Differential Revision: https://reviews.llvm.org/D65280 llvm-svn: 374743	2019-10-13 23:00:15 +00:00
Simon Pilgrim	944a051ebb	IRTranslator - silence static analyzer null dereference warnings. NFCI. The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst. llvm-svn: 374716	2019-10-13 11:29:35 +00:00
David Blaikie	de9aa37bf0	DebugInfo: Reduce the scope of some variables related to debug_ranges emission Minor tidy up/NFC llvm-svn: 374613	2019-10-11 23:51:24 +00:00
David Blaikie	289c45cc62	DebugInfo: Use base address selection entries for debug_loc Unify the range and loc emission (for both DWARFv4 and DWARFv5 style lists) and take advantage of that unification to use strategic base addresses for loclists. Differential Revision: https://reviews.llvm.org/D68620 llvm-svn: 374600	2019-10-11 21:52:41 +00:00
David Green	7c30af8e65	Revert 374373: [Codegen] Alter the default promotion for saturating adds and subs This commit is not extending the promoted integers as it should. Reverting whilst I look into the details. llvm-svn: 374592	2019-10-11 20:33:03 +00:00
Quentin Colombet	9c36ec5941	[GISel][CallLowering] Enable vector support in argument lowering The existing code is actually already enough to handle the splitting of vector arguments but we were lacking a test case. This commit adds a test case for vector argument lowering involving splitting and enables the related support in call lowering. llvm-svn: 374589	2019-10-11 20:22:57 +00:00
Quentin Colombet	7720f11498	[MachineIRBuilder] Fix an assertion failure with buildMerge Teach buildMerge how to deal with scalar to vector kind of requests. Prior to this patch, buildMerge would issue either a G_MERGE_VALUES when all the vregs are scalars or a G_CONCAT_VECTORS when the destination vreg is a vector. G_CONCAT_VECTORS was actually not the proper instruction when the source vregs were scalars and the compiler would assert that the sources must be vectors. Instead we want to issue a G_BUILD_VECTOR when we are in this situation. This patch fixes that. llvm-svn: 374588	2019-10-11 20:22:47 +00:00
Sanjay Patel	3b581ac80f	[DAGCombiner] fold vselect-of-constants to shift The diffs suggest that we are missing some more basic analysis/transforms, but this keeps the vector path in sync with the scalar (rL374397). This is again a preliminary step for introducing the reverse transform in IR as proposed in D63382. llvm-svn: 374555	2019-10-11 14:17:56 +00:00
Marcello Maggioni	a064edf55e	[GISel] Simplifying return from else in function. NFC Forgot to integrate this little change in previous commit llvm-svn: 374463	2019-10-10 21:51:30 +00:00
Marcello Maggioni	0112123eea	[GISel] Allow getConstantVRegVal() to return G_FCONSTANT values. In GISel we have both G_CONSTANT and G_FCONSTANT, but because in GISel we don't really have a concept of Float vs Int value the only difference between the two is where the data originates from. What both G_CONSTANT and G_FCONSTANT return is just a bag of bits with the constant representation in it. By making getConstantVRegVal() return G_FCONSTANTs bit representation as well we allow ConstantFold and other things to operate with G_FCONSTANT. Adding tests that show ConstantFolding to work on mixed G_CONSTANT and G_FCONSTANT sources. Differential Revision: https://reviews.llvm.org/D68739 llvm-svn: 374458	2019-10-10 21:46:26 +00:00
Sanjay Patel	7b904ce724	[DAGCombiner] fold select-of-constants to shift This reverses the scalar canonicalization proposed in D63382. Pre: isPowerOf2(C1) %r = select i1 %cond, i32 C1, i32 0 => %z = zext i1 %cond to i32 %r = shl i32 %z, log2(C1) https://rise4fun.com/Alive/Z50 x86 already tries to fold this pattern, but it isn't done uniformly, so we still see a diff. AArch64 probably should enable the TLI hook to benefit too, but that's a follow-on. llvm-svn: 374397	2019-10-10 17:52:02 +00:00
David Green	94d379095a	[Codegen] Alter the default promotion for saturating adds and subs The default promotion for the add_sat/sub_sat nodes currently does: 1. ANY_EXTEND iN to iM 2. SHL by M-N 3. [US][ADD\|SUB]SAT 4. L/ASHR by M-N If the promoted add_sat or sub_sat node is not legal, this can produce code that effectively does a lot of shifting (and requiring large constants to be materialised) just to use the overflow flag. It is simpler to just do the saturation manually, using the higher bitwidth addition and a min/max against the saturating bounds. That is what this patch attempts to do. Differential Revision: https://reviews.llvm.org/D68643 llvm-svn: 374373	2019-10-10 16:04:49 +00:00
Sanjay Patel	7f0e7c0b1c	[DAGCombiner] reduce code duplication; NFC llvm-svn: 374370	2019-10-10 15:38:29 +00:00
Guillaume Chatelet	ff054b9e32	[Alignment][NFC] Use llvm::Align in GISelKnownBits Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68786 llvm-svn: 374369	2019-10-10 15:38:22 +00:00
Amaury Sechet	aaf0507896	[DAGCombine] Match more patterns for half word bswap Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68250 llvm-svn: 374340	2019-10-10 13:20:10 +00:00
Oliver Stannard	4f454b2275	[IfCvt][ARM] Optimise diamond if-conversion for code size Currently, the heuristics the if-conversion pass uses for diamond if-conversion are based on execution time, with no consideration for code size. This adds a new set of heuristics to be used when optimising for code size. This is mostly target-independent, because the if-conversion pass can see the code size of the instructions which it is removing. For thumb, there are a few passes (insertion of IT instructions, selection of narrow branches, and selection of CBZ instructions) which are run after if conversion and affect these heuristics, so I've added target hooks to better predict the code-size effect of a proposed if-conversion. Differential revision: https://reviews.llvm.org/D67350 llvm-svn: 374301	2019-10-10 09:58:28 +00:00
Reid Kleckner	9d8f0b3519	[codeview] Try to avoid emitting .cv_loc with line zero Summary: Visual Studio doesn't like it while stepping. It kicks you out of the source view of the file being stepped through and tries to fall back to the disassembly view. Fixes PR43530 The fix is incomplete, because it's possible to have a basic block with no source locations at all. In this case, we don't emit a .cv_loc, but that will result in wrong stepping behavior in the debugger if the layout predecessor of the location-less BB has an unrelated source location. We could try harder to find a valid location that dominates or post-dominates the current BB, but in general it's a dataflow problem, and one still might not exist. I left a FIXME about this. As an alternative, we might want to consider having the middle-end check if it's emitting codeview and get it to stop using line zero. Reviewers: akhuang Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68747 llvm-svn: 374267	2019-10-10 01:06:01 +00:00
Philip Reames	931120846e	Conservatively add volatility and atomic checks in a few places As background, starting in D66309, I'm working on supporting unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86. As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces of code look suspicious, though I can't figure out how to exercise them. This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions. Differential Revision: https://reviews.llvm.org/D68419 llvm-svn: 374261	2019-10-09 23:43:33 +00:00
Matt Arsenault	3cd3959fe2	GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR. llvm-svn: 374252	2019-10-09 22:44:43 +00:00
Evandro Menezes	e60415a0db	[Support] Add mathematical constants Add own version of the mathematical constants from the upcoming C++20 `std::numbers`. Differential revision: https://reviews.llvm.org/D68257 llvm-svn: 374207	2019-10-09 19:58:01 +00:00
Simon Pilgrim	e746380f6a	CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI. The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us. llvm-svn: 374085	2019-10-08 17:00:01 +00:00
Nikola Prica	98603a8153	[DebugInfo][If-Converter] Update call site info during the optimization During the If-Converter optimization pay attention when copying or deleting call instructions in order to keep call site information in valid state. Reviewers: aprantl, vsk, efriedma Reviewed By: vsk, efriedma Differential Revision: https://reviews.llvm.org/D66955 llvm-svn: 374068	2019-10-08 15:43:12 +00:00
Graham Hunter	b302561b76	[SVE][IR] Scalable Vector size queries and IR instruction support * Adds a TypeSize struct to represent the known minimum size of a type along with a flag to indicate that the runtime size is an integer multiple of that size * Converts existing size query functions from Type.h and DataLayout.h to return a TypeSize result * Adds convenience methods (including a transparent conversion operator to uint64_t) so that most existing code 'just works' as if the return values were still scalars. * Uses the new size queries along with ElementCount to ensure that all supported instructions used with scalable vectors can be constructed in IR. Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen Reviewed By: rovka, sdesmalen Differential Revision: https://reviews.llvm.org/D53137 llvm-svn: 374042	2019-10-08 12:53:54 +00:00
Nicolai Haehnle	7febdb7f27	MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block Summary: When getValueInMiddleOfBlock happens to be called for a basic block that has no incoming value at all, an IMPLICIT_DEF is inserted in that block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at the top of its basic block or it will likely not reach the use that the caller intends to insert. Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204 Reviewers: arsenm, rampitec Subscribers: jvesely, wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68183 llvm-svn: 374040	2019-10-08 12:46:20 +00:00
Reid Kleckner	f9b67b810e	[X86] Add new calling convention that guarantees tail call optimization When the target option GuaranteedTailCallOpt is specified, calls with the fastcc calling convention will be transformed into tail calls if they are in tail position. This diff adds a new calling convention, tailcc, currently supported only on X86, which behaves the same way as fastcc, except that the GuaranteedTailCallOpt flag does not need to be enabled in order to enable tail call optimization. Patch by Dwight Guth <dwight.guth@runtimeverification.com>! Reviewed By: lebedev.ri, paquette, rnk Differential Revision: https://reviews.llvm.org/D67855 llvm-svn: 373976	2019-10-07 22:28:58 +00:00
Matt Arsenault	4bcdcad91b	GlobalISel: Partially implement lower for G_INSERT llvm-svn: 373946	2019-10-07 19:13:27 +00:00
Matt Arsenault	27269054d2	GlobalISel: Add target pre-isel instructions Allows targets to introduce regbankselectable pseudo-instructions. Currently the closest feature to this is an intrinsic. However this requires creating a public intrinsic declaration. This litters the public intrinsic namespace with operations we don't necessarily want to expose to IR producers, and would rather leave as private to the backend. Use a new instruction bit. A previous attempt tried to keep using enum value ranges, but it turned into a mess. llvm-svn: 373937	2019-10-07 18:43:29 +00:00
Jordan Rose	fdaa742174	Second attempt to add iterator_range::empty() Doing this makes MSVC complain that `empty(someRange)` could refer to either C++17's std::empty or LLVM's llvm::empty, which previously we avoided via SFINAE because std::empty is defined in terms of an empty member rather than begin and end. So, switch callers over to the new method as it is added. https://reviews.llvm.org/D68439 llvm-svn: 373935	2019-10-07 18:14:24 +00:00
Kevin P. Neal	1c3d19c82d	[FPEnv] Add constrained intrinsics for lrint and lround Earlier in the year intrinsics for lrint, llrint, lround and llround were added to llvm. The constrained versions are now implemented here. Reviewed by: andrew.w.kaylor, craig.topper, cameron.mcinally Approved by: craig.topper Differential Revision: https://reviews.llvm.org/D64746 llvm-svn: 373900	2019-10-07 13:20:00 +00:00
Simon Pilgrim	b4ba3cbda0	[X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217) If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element. This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted. Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper. Fixes PR43217 Differential Revision: https://reviews.llvm.org/D68544 llvm-svn: 373871	2019-10-06 21:11:45 +00:00
Craig Topper	842dde6be4	[LegalizeTypes][X86] When splitting a vselect for type legalization, don't split a setcc condition if the setcc input is legal and vXi1 conditions are supported Summary: The VSELECT splitting code tries to split a setcc input as well. But on avx512 where mask registers are well supported it should be better to just split the mask and use a single compare. Reviewers: RKSimon, spatel, efriedma Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68359 llvm-svn: 373863	2019-10-06 18:43:03 +00:00
Sanjay Patel	f643fabb52	Revert [DAGCombine] Match more patterns for half word bswap This reverts r373850 (git commit `25ba49824d`) This patch appears to cause multiple codegen regression test failures - http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/10680 llvm-svn: 373853	2019-10-06 15:27:34 +00:00
Amaury Sechet	25ba49824d	[DAGCombine] Match more patterns for half word bswap Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68250 llvm-svn: 373850	2019-10-06 14:14:55 +00:00
Matt Arsenault	a5b9c75674	GlobalISel: Partially implement lower for G_EXTRACT Turn into shift and truncate. Doesn't yet handle pointers. llvm-svn: 373838	2019-10-06 01:37:35 +00:00
Craig Topper	2decdf42b9	[FastISel] Copy the inline assembly dialect to the INLINEASM instruction. Fixes PR43575. llvm-svn: 373836	2019-10-05 23:21:17 +00:00
Simon Pilgrim	f609c0a303	BranchFolding - IsBetterFallthrough - assert non-null pointers. NFCI. Silences static analyzer null dereference warnings. llvm-svn: 373823	2019-10-05 13:20:30 +00:00
Philip Reames	d5a4dad206	Fix a nasty miscompile in experimental unordered atomic lowering This is an omission in rL371441. Loads which happened to be unordered weren't being added to the PendingLoad set, and thus weren't being ordered w/respect to side effects which followed before the end of the block. Included test case is how I spotted this. We had an atomic load being folded into a using instruction after a fence that load was supposed to be ordered with. I'm sure it showed up a bunch of other ways as well. Spotted via manual inspecting of assembly differences in a corpus w/and w/o the new experimental mode. Finding this with testing would have been "unpleasant". llvm-svn: 373814	2019-10-05 00:32:10 +00:00
Reid Kleckner	67cfa79c01	Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks This reverts r371177 (git commit `f879c68755`) It caused PR43566 by removing empty, address-taken MachineBasicBlocks. Such blocks may have references from blockaddress or other operands, and need more consideration to be removed. See the PR for a test case to use when relanding. llvm-svn: 373805	2019-10-04 22:24:21 +00:00
Jessica Paquette	784892c964	[MachineOutliner] Disable outlining from noreturn functions Outlining from noreturn functions doesn't do the correct thing right now. The outliner should respect that the caller is marked noreturn. In the event that we have a noreturn function, and the outlined code is in tail position, the outliner will not see that the outlined function should be tail called. As a result, you end up with a regular call containing a return. Fixing this requires that we check that all candidates live inside noreturn functions. So, for the sake of correctness, don't outline from noreturn functions right now. Add machine-outliner-noreturn.mir to test this. llvm-svn: 373791	2019-10-04 21:24:12 +00:00
Eli Friedman	23ae13d51f	[ScheduleDAG] When a node is cloned, add an edge between the nodes. InstrEmitter's virtual register handling assumes that clones are emitted after the cloned node. Make sure this assumption actually holds. Fixes a "Node emitted out of order - early" assertion on the testcase. This is probably a very rare case to actually hit in practice; even without the explicit edge, the scheduler will usually end up scheduling the nodes in the expected order due to other constraints. Differential Revision: https://reviews.llvm.org/D68068 llvm-svn: 373782	2019-10-04 19:51:40 +00:00
James Molloy	9baac83a2e	[ModuloSchedule] Do not remap terminators This is a trivial point fix. Terminator instructions aren't scheduled, so we shouldn't expect to be able to remap them. This doesn't affect Hexagon and PPC because their terminators are always hardware loop backbranches that have no register operands. llvm-svn: 373762	2019-10-04 17:15:25 +00:00
Simon Pilgrim	84f5cd75b3	Fix MSVC "not all control paths return a value" warning. NFCI. llvm-svn: 373741	2019-10-04 12:45:27 +00:00
Jeremy Morse	61800a75b7	[DebugInfo] LiveDebugValues: move DBG_VALUE creation into VarLoc class Rather than having a mixture of location-state shared between DBG_VALUEs and VarLoc objects in LiveDebugValues, this patch makes VarLoc the master record of variable locations. The refactoring means that the transfer of locations from one place to another is always performed by an operation on an existing VarLoc, that produces another transferred VarLoc. DBG_VALUEs are only created at the end of LiveDebugValues, once all locations are known. As a plus, there is now only one method where DBG_VALUEs can be created. The test case added covers a circumstance that is now impossible to express in LiveDebugValues: if an already-indirect DBG_VALUE is spilt, previously it would have been restored-from-spill as a direct DBG_VALUE. We now don't lose this information along the way, as VarLocs always refer back to the "original" non-transfer DBG_VALUE, and we can always work out whether a location was "originally" indirect. Differential Revision: https://reviews.llvm.org/D67398 llvm-svn: 373727	2019-10-04 10:53:47 +00:00
Jeremy Morse	0ca48de26c	[DebugInfo] LiveDebugValues: defer DBG_VALUE creation during analysis When transferring variable locations from one place to another, LiveDebugValues immediately creates a DBG_VALUE representing that transfer. This causes trouble if the variable location should subsequently be invalidated by a loop back-edge, such as in the added test case: the transfer DBG_VALUE from a now-invalid location is used as proof that the variable location is correct. This is effectively a self-fulfilling prophecy. To avoid this, defer the insertion of transfer DBG_VALUEs until after analysis has completed. Some of those transfers are still sketchy, but we don't propagate them into other blocks now. Differential Revision: https://reviews.llvm.org/D67393 llvm-svn: 373720	2019-10-04 09:38:05 +00:00
Sanjay Patel	288079aafd	[DAGCombiner] add operation legality checks before creating shift ops (PR43542) As discussed on llvm-dev and: https://bugs.llvm.org/show_bug.cgi?id=43542 ...we have transforms that assume shift operations are legal and transforms to use them are profitable, but that may not hold for simple targets. In this case, the MSP430 target custom lowers shifts by repeating (many) simpler/fixed ops. That can be avoided by keeping this code as setcc/select. Differential Revision: https://reviews.llvm.org/D68397 llvm-svn: 373666	2019-10-03 21:34:04 +00:00
David Blaikie	2ac586c58f	DebugInfo: Generalize rnglist emission as a precursor to reusing it for loclist emission llvm-svn: 373663	2019-10-03 20:56:23 +00:00
James Molloy	9972c992eb	[ModuloSchedule] removeBranch() before creating the trip count condition The Hexagon code assumes there's no existing terminator when inserting its trip count condition check. This causes swp-stages5.ll to break. The generated code looks good to me, it is likely a permutation. I have disabled the new codegen path to keep everything green and will investigate along with the other 3-4 tests that have different codegen. Fixes expensive-checks build. llvm-svn: 373629	2019-10-03 17:10:32 +00:00
Guillaume Chatelet	d400d45150	[Alignment][NFC] Remove StoreInst::setAlignment(unsigned) Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, bollu, jdoerfert Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68268 llvm-svn: 373595	2019-10-03 13:17:21 +00:00
David Blaikie	11e0bcf8a2	DebugInfo: Rename DebugLocStream::Entry::Begin/EndSym to just Begin/End Brings this struct in line with the RangeSpan class so they might eventually be used by common template code for generating range/loc lists with less duplicate code. llvm-svn: 373540	2019-10-02 22:58:02 +00:00
Craig Topper	2772b970e3	[LegalizeTypes] Check for already split condition before calling SplitVecRes_SETCC in SplitRes_SELECT. No point in manually splitting the SETCC if it was already done. llvm-svn: 373535	2019-10-02 22:34:49 +00:00
David Blaikie	b677cb8dc7	DebugInfo: Simplify RangeSpan to be a plain struct This is an effort to make RangeSpan and DebugLocStream::Entry more similar to share code for their emission (to reuse the more complicated code for using (& choosing when to use) base address selection entries, etc). It didn't seem like this struct was worth the complexity of encapsulation - when the members could be initialized by the ctor to any value (no validation) and the type is assignable (so there's no mutability or other constraint being implemented by its interface). llvm-svn: 373533	2019-10-02 22:27:24 +00:00
Simon Pilgrim	49c2390877	[CodeGen] Remove unused MachineMemOperand::print wrappers (PR41772) As noted on PR41772, the static analyzer reports that the MachineMemOperand::print partial wrappers set a number of args to null pointers that were then dereferenced in the actual implementation. It turns out that these wrappers are not being used at all (hence why we're not seeing any crashes), so I'd like to propose we just get rid of them. Differential Revision: https://reviews.llvm.org/D68208 llvm-svn: 373484	2019-10-02 16:20:28 +00:00
Hans Wennborg	9330005a54	Reapply r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)" This was reverted in r373454 due to breaking the expensive-checks bot. This version addresses that by omitting the addSuccessorWithProb() call when omitting the range check. > Switch lowering: omit range check for bit tests when default is unreachable (PR43129) > > This is modeled after the same functionality for jump tables, which was > added in r357067. > > Differential revision: https://reviews.llvm.org/D68131 llvm-svn: 373477	2019-10-02 14:35:06 +00:00
Simon Pilgrim	369d16a1c6	AsmPrinter - emitGlobalConstantFP - silence static analyzer null dereference warning. NFCI. All the calls to emitGlobalConstantFP should provide a nonnull Type for the float. llvm-svn: 373464	2019-10-02 13:08:46 +00:00
James Molloy	9026518e73	[ModuloSchedule] Peel out prologs and epilogs, generate actual code Summary: This extends the PeelingModuloScheduleExpander to generate prolog and epilog code, and correctly stitch uses through the prolog, kernel, epilog DAG. The key concept in this patch is to ensure that all transforms are local; only a function of a block and its immediate predecessor and successor. By defining the problem in this way we can inductively rewrite the entire DAG using only local knowledge that is easy to reason about. For example, we assume that all prologs and epilogs are near-perfect clones of the steady-state kernel. This means that if a block has an instruction that is predicated out, we can redirect all users of that instruction to that equivalent instruction in our immediate predecessor. As all blocks are clones, every instruction must have an equivalent in every other block. Similarly we can make the assumption by construction that if a value defined in a block is used outside that block, the only possible user is its immediate successors. We maintain this even for values that are used outside the loop by creating a limited form of LCSSA. This code isn't small, but it isn't complex. Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet; I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns that we don't produce. Those still need a bit more investigation. In the meantime we (Google) are happy with the code produced by this on our downstream SMS implementation, and believe it generates correct code. Subscribers: mgorny, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68205 llvm-svn: 373462	2019-10-02 12:46:44 +00:00
Hans Wennborg	372aece777	Revert r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)" This broke http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19967 > Switch lowering: omit range check for bit tests when default is unreachable (PR43129) > > This is modeled after the same functionality for jump tables, which was > added in r357067. > > Differential revision: https://reviews.llvm.org/D68131 llvm-svn: 373454	2019-10-02 12:08:44 +00:00
Simon Pilgrim	c9129cea27	WinException::emitExceptHandlerTable - silence static analyzer dyn_cast<Function> null dereference warning. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<Function> directly and if not assert will fire for us. llvm-svn: 373449	2019-10-02 11:48:32 +00:00
Hans Wennborg	cbefc36fcc	Switch lowering: omit range check for bit tests when default is unreachable (PR43129) This is modeled after the same functionality for jump tables, which was added in r357067. Differential revision: https://reviews.llvm.org/D68131 llvm-svn: 373431	2019-10-02 08:32:15 +00:00
David Blaikie	bfc68885d9	DebugInfo: Update support for detecting C++ language variants in debug info emission llvm-svn: 373420	2019-10-02 01:39:48 +00:00
Jakub Kuderski	856c1cd852	[Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in MachineLICM llvm-svn: 373378	2019-10-01 18:27:44 +00:00
Jakub Kuderski	5be08ee902	[Dominators][CodeGen] Fix MachineDominatorTree preservation in PHIElimination Summary: PHIElimination modifies CFG and marks MachineDominatorTree as preserved. Therefore, if the CFG changes, it should also update the MDT, when available. This patch teaches PHIElimination to recalculate MDT when necessary. This fixes the `tailmerging_in_mbp.ll` test failure discovered after switching to generic DomTree verification algorithm in MachineDominators in D67976. Reviewers: arsenm, hliao, alex-t, rampitec, vpykhtin, grosser Reviewed By: rampitec Subscribers: MatzeB, wdng, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68154 llvm-svn: 373377	2019-10-01 18:27:17 +00:00
Jakub Kuderski	925c285f43	Reapply [Dominators][CodeGen] Clean up MachineDominators This reverts r373117 (git commit `159ef37735`) Phabricator review: https://reviews.llvm.org/D67976. llvm-svn: 373376	2019-10-01 18:27:14 +00:00
Jay Foad	e536800022	[AMDGPU] Add VerifyScheduling support. Summary: This is cut and pasted from the corresponding GenericScheduler functions. Reviewers: arsenm, atrick, tstellar, vpykhtin Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68264 llvm-svn: 373346	2019-10-01 15:45:47 +00:00
Simon Pilgrim	3c912c4abe	[DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863) This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks. The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it to work with target specific combines and opcodes (e.g. X86's FMA variants). Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations have to duplicate some checks (recursion depth etc.). Partial reversion of rL372756 - I've identified the infinite loop issue inside the X86 override but haven't fixed it yet so I've only (re)committed the common TargetLowering refactoring part of the patch. Differential Revision: https://reviews.llvm.org/D67557 llvm-svn: 373343	2019-10-01 15:32:04 +00:00
Jakub Kuderski	56b52a207f	[Dominators][CodeGen] Add MachinePostDominatorTree verification Summary: This patch implements Machine PostDominator Tree verification and ensures that the verification doesn't fail the in-tree tests. MPDT verification can be enabled using `verify-machine-dom-info` -- the same flag used by Machine Dominator Tree verification. Flipping the flag revealed that MachineSink falsely claimed to preserve CFG and MDT/MPDT. This patch fixes that. Reviewers: arsenm, hliao, rampitec, vpykhtin, grosser Reviewed By: hliao Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68235 llvm-svn: 373341	2019-10-01 15:23:27 +00:00
Dmitri Gribenko	827a7fab78	Revert "GlobalISel: Handle llvm.read_register" This reverts commit r373294. It broke Clang's CodeGen/arm64-microsoft-status-reg.cpp: http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483 llvm-svn: 373310	2019-10-01 08:24:01 +00:00
Matt Arsenault	bdcc6d3d26	GlobalISel: Handle llvm.read_register SelectionDAG has a bunch of machinery to defer this to selection time for some reason. Just directly emit a copy during IRTranslator. The x86 usage does somewhat questionably check hasFP, which could depend on the whole function being at minimum translated. This does lose the convergent bit if the callsite had it, which may be a problem. We also lose that in general for intrinsics, which may also be a problem. llvm-svn: 373294	2019-10-01 02:07:16 +00:00
Matt Arsenault	f24ac13aaa	TLI: Remove DAG argument from getRegisterByName Replace with the MachineFunction. X86 is the only user, and only uses it for the function. This removes one obstacle from using this in GlobalISel. The other is the more tolerable EVT argument. The X86 use of the function seems questionable to me. It checks hasFP, before frame lowering. llvm-svn: 373292	2019-10-01 01:44:39 +00:00
Matt Arsenault	ed85b0cee6	GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU. llvm-svn: 373287	2019-10-01 01:06:48 +00:00
David Blaikie	38456776b3	DebugInfo: Simplify section label caching/usage llvm-svn: 373273	2019-09-30 23:19:10 +00:00
Amaury Sechet	e6f98c0073	[DAGCombiner] Clang format MatchRotate. NFC llvm-svn: 373269	2019-09-30 21:41:52 +00:00
Daniel Sanders	cbe13a1461	[globalisel][knownbits] Allow targets to call GISelKnownBits::computeKnownBitsImpl() Summary: It seems we missed that the target hook can't query the known-bits for the inputs to a target instruction. Fix that oversight Reviewers: aditya_nandakumar Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67380 llvm-svn: 373264	2019-09-30 20:55:53 +00:00
Amaury Sechet	496c0564f1	[DAGCombiner] Update MatchRotate so that it returns an SDValue. NFC llvm-svn: 373260	2019-09-30 20:47:23 +00:00
Yuanfang Chen	cc382cf727	[NewPM] Port MachineModuleInfo to the new pass manager. Existing clients are converted to use MachineModuleInfoWrapperPass. The new interface is for defining a new pass manager API in CodeGen. Reviewers: fedor.sergeev, philip.pfaffe, chandlerc, arsenm Reviewed By: arsenm, fedor.sergeev Differential Revision: https://reviews.llvm.org/D64183 llvm-svn: 373240	2019-09-30 17:54:50 +00:00
Jessica Paquette	b1c1095fdc	[AArch64][GlobalISel] Support lowering variadic musttail calls This adds support for lowering variadic musttail calls. To do this, we have to... - Detect a musttail call in a variadic function before attempting to lower the call's formal arguments. This is done in the IRTranslator. - Compute forwarded registers in `lowerFormalArguments`, and add copies for those registers. - Restore the forwarded registers in `lowerTailCall`. Because there doesn't seem to be any nice way to wrap these up into the outgoing argument handler, the restore code in `lowerTailCall` is done separately. Also, irritatingly, you have to make sure that the registers don't overlap with any passed parameters. Otherwise, the scheduler doesn't know what to do with the extra copies and asserts. Add call-translator-variadic-musttail.ll to test this. This is pretty much the same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to base this off of, but the idea is the same. Differential Revision: https://reviews.llvm.org/D68043 llvm-svn: 373226	2019-09-30 16:49:13 +00:00
Paul Robinson	ed1f3f36ae	[SSP] [3/3] cmpxchg and addrspacecast instructions can now trigger stack protectors. Fixes PR42238. Add test coverage for llvm.memset, as proxy for all llvm.mem* intrinsics. There are two issues here: (1) they could be lowered to a libc call, which could be intercepted, and do Bad Stuff; (2) with a non-constant size, they could overwrite the current stack frame. The test was mostly written by Matt Arsenault in r363169, which was later reverted; I tweaked what he had and added the llvm.memset part. Differential Revision: https://reviews.llvm.org/D67845 llvm-svn: 373220	2019-09-30 15:11:23 +00:00
Paul Robinson	527815f5b0	[SSP] [2/3] Refactor an if/dyn_cast chain to switch on opcode. NFC Differential Revision: https://reviews.llvm.org/D67844 llvm-svn: 373219	2019-09-30 15:08:38 +00:00
Paul Robinson	14945186c2	[SSP] [1/3] Revert "StackProtector: Use PointerMayBeCaptured" "Captured" and "relevant to Stack Protector" are not the same thing. This reverts commit `f29366b1f5`. aka r363169. Differential Revision: https://reviews.llvm.org/D67842 llvm-svn: 373216	2019-09-30 15:01:35 +00:00
Tamas Berghammer	421a186fb4	Support MemoryLocation::UnknownSize in TargetLowering::IntrinsicInfo Summary: Previously IntrinsicInfo::size was an unsigned, which can't represent the 64-bit value used by MemoryLocation::UnknownSize. Reviewers: jmolloy Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D68219 llvm-svn: 373214	2019-09-30 14:44:24 +00:00
Guillaume Chatelet	ab11b9188d	[Alignment][NFC] Remove AllocaInst::setAlignment(unsigned) Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68141 llvm-svn: 373207	2019-09-30 13:34:44 +00:00
Guillaume Chatelet	17380227e8	[Alignment][NFC] Remove LoadInst::setAlignment(unsigned) Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, jdoerfert Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D68142 llvm-svn: 373195	2019-09-30 09:37:05 +00:00
Hans Wennborg	dc7dbb1a88	NFC changes to SelectionDAGBuilder::visitBitTestHeader(), preparing for PR43129 llvm-svn: 373191	2019-09-30 08:47:53 +00:00
Roger Ferrer Ibanez	5a2a14db0b	[TargetLowering] Simplify expansion of S{ADD,SUB}O ISD::SADDO uses the suggested sequence described in §2.4 of the RISC-V Spec v2.2. ISD::SSUBO uses the dual approach but checking for (non-zero) positive. Differential Revision: https://reviews.llvm.org/D47927 llvm-svn: 373187	2019-09-30 07:58:50 +00:00
Amara Emerson	509a4947c9	Add an operand to memory intrinsics to denote the "tail" marker. We need to propagate this information from the IR in order to be able to safely do tail call optimizations on the intrinsics during legalization. Assuming it's safe to do tail call opt without checking for the marker isn't safe because the mem libcall may use allocas from the caller. This adds an extra immediate operand to the end of the intrinsics and fixes the legalizer to handle it. Differential Revision: https://reviews.llvm.org/D68151 llvm-svn: 373140	2019-09-28 05:33:21 +00:00
Jakub Kuderski	159ef37735	Revert [Dominators][CodeGen] Clean up MachineDominators This reverts r373101 (git commit `72c57ec3e6`) llvm-svn: 373117	2019-09-27 19:33:39 +00:00
Jakub Kuderski	72c57ec3e6	[Dominators][CodeGen] Clean up MachineDominators Summary: This is a cleanup patch for MachineDominatorTree. It would be an NFC, except for replacing custom DomTree verification with the generic one. Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao Reviewed By: arsenm Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67976 llvm-svn: 373101	2019-09-27 17:25:39 +00:00
Djordje Todorovic	eb4c98ca3d	[DebugInfo] Exclude memory location values as parameter entry values Abandon describing of loaded values due to safety concerns. Loaded values are described as derefed memory location at caller point. At callee we can unintentionally change that memory location, which would lead to a different entry value being printed before and after the memory location clobbering. This problem is described in llvm.org/PR43343. Patch by Nikola Prica Differential Revision: https://reviews.llvm.org/D67717 llvm-svn: 373089	2019-09-27 13:52:43 +00:00
Jesper Antonsson	39b81f1cbc	[CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix. Summary: An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst() in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was: https://bugs.llvm.org/show_bug.cgi?id=41052 ... and the previous fix was D59358. Reviewers: aprantl, skatkov Reviewed By: skatkov Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67838 llvm-svn: 373084	2019-09-27 13:01:37 +00:00
Guillaume Chatelet	18f805a7ea	[Alignment][NFC] Remove unneeded llvm:: scoping on Align types llvm-svn: 373081	2019-09-27 12:54:21 +00:00
Hans Wennborg	3740ae3b8a	Revert r372893 "[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets" This caused severe compile-time regressions, see PR43455. > Modern processors predict the targets of an indirect branch regardless of > the size of any jump table used to glean its target address. Moreover, > branch predictors typically use resources limited by the number of actual > targets that occur at run time. > > This patch changes the semantics of the option `-max-jump-table-size` to limit > the number of different targets instead of the number of entries in a jump > table. Thus, it is now renamed to `-max-jump-table-targets`. > > Before, when `-max-jump-table-size` was specified, it could happen that > cluster jump tables could have targets used repeatedly, but each one was > counted and typically resulted in tables with the same number of entries. > With this patch, when specifying `-max-jump-table-targets`, tables may have > different lengths, since the number of unique targets is counted towards the > limit, but the number of unique targets in tables is the same, but for the > last one containing the balance of targets. > > Differential revision: https://reviews.llvm.org/D60295 llvm-svn: 373060	2019-09-27 09:54:26 +00:00
Changpeng Fang	f5524f0451	Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint Reviewers: arsenm Differential Revision: https://reviews.llvm.org/D58360 llvm-svn: 373024	2019-09-26 22:53:44 +00:00
Xiangling Liao	3b808fb330	[AIX]Emit function descriptor csect in assembly This patch emits the function descriptor csect for functions with definitions under both 32-bit/64-bit mode on AIX. Differential Revision: https://reviews.llvm.org/D66724 llvm-svn: 373009	2019-09-26 19:38:32 +00:00
Mikael Holmen	957e090ac9	[IfConversion] Disallow TBB == FBB for valid triangles Summary: Previously the case EBB \| \_ \| \| \| TBB \| / FBB was treated as a valid triangle also when TBB and FBB were the same basic block. This could then lead to an invalid CFG when we removed the edge from EBB to TBB, since that meant we would also remove the edge from EBB to FBB. Since TBB == FBB is quite a degenerate case of a triangle, we now don't treat it as a valid triangle anymore, and thus we will avoid the trouble with updating the CFG. Reviewers: efriedma, dmgreen, kparzysz Reviewed By: efriedma Subscribers: bjope, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67832 llvm-svn: 372943	2019-09-26 06:35:55 +00:00
Thomas Raoux	3c8c667235	[TargetLowering] Make allowsMemoryAccess method virtual. Rename old function to explicitly show that it cares only about alignment. The new allowsMemoryAccess calls the function related to alignment by default and can be overridden by target to inform whether the memory access is legal or not. Differential Revision: https://reviews.llvm.org/D67121 llvm-svn: 372935	2019-09-26 00:16:01 +00:00
Jessica Paquette	8535a8672e	[AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call lowering When checking for tail call eligibility, we should use the correct CCAssignFn for each argument, rather than just checking if the caller/callee is varargs or not. This is important for tail call lowering with varargs. If we don't check it, then basically any varargs callee with parameters cannot be tail called on Darwin, for one thing. If the parameters are all guaranteed to be in registers, this should be entirely safe. On top of that, not checking for this could potentially make it so that we have the wrong stack offsets when checking for tail call eligibility. Also refactor some of the stuff for CCAssignFnForCall and pull it out into a helper function. Update call-translator-tail-call.ll to show that we can now correctly tail call on Darwin. Also add two extra tail call checks. The first verifies that we still respect the caller's stack size, and the second verifies that we still don't tail call when a varargs function has a memory argument. Differential Revision: https://reviews.llvm.org/D67939 llvm-svn: 372897	2019-09-25 16:45:35 +00:00
Evandro Menezes	3bd8ba156b	[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets Modern processors predict the targets of an indirect branch regardless of the size of any jump table used to glean its target address. Moreover, branch predictors typically use resources limited by the number of actual targets that occur at run time. This patch changes the semantics of the option `-max-jump-table-size` to limit the number of different targets instead of the number of entries in a jump table. Thus, it is now renamed to `-max-jump-table-targets`. Before, when `-max-jump-table-size` was specified, it could happen that cluster jump tables could have targets used repeatedly, but each one was counted and typically resulted in tables with the same number of entries. With this patch, when specifying `-max-jump-table-targets`, tables may have different lengths, since the number of unique targets is counted towards the limit, but the number of unique targets in tables is the same, but for the last one containing the balance of targets. Differential revision: https://reviews.llvm.org/D60295 llvm-svn: 372893	2019-09-25 16:10:20 +00:00
Sanjay Patel	831a7e7068	[DAGCombiner] add one-use restriction to vector transform with cheap extract We might be able to do better on the example in the test, but in general, we should not scalarize a splatted vector binop if there are other uses of the binop. Otherwise, we can end up with code as we had - a scalar op that is redundant with a vector op. llvm-svn: 372886	2019-09-25 15:08:33 +00:00
Simon Pilgrim	5f2d8b2618	[TargetInstrInfo] Let findCommutedOpIndices take const MachineInstr& Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future. Committed on behalf of @hvdijk (Harald van Dijk) Differential Revision: https://reviews.llvm.org/D66138 llvm-svn: 372882	2019-09-25 14:55:57 +00:00
Jakub Kuderski	269bd15c68	[Dominators][AMDGPU] Don't use virtual exit node in findNearestCommonDominator. Cleanup MachinePostDominators. Summary: This patch fixes a bug that originated from passing a virtual exit block (nullptr) to `MachinePostDominatorTree::findNearestCommonDominator` and resulted in assertion failures inside its callee. It also applies a small cleanup to the class. The patch introduces a new function in PDT that given a list of `MachineBasicBlock`s finds their NCD. The new overload of `findNearestCommonDominator` handles virtual root correctly. Note that similar handling of virtual root nodes is not necessary in (forward) `DominatorTree`s, as right now they don't use virtual roots. Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao Reviewed By: hliao Subscribers: hliao, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits Tags: #amdgpu, #llvm Differential Revision: https://reviews.llvm.org/D67974 llvm-svn: 372874	2019-09-25 14:04:36 +00:00
Sanjay Patel	2cec4b58f5	Revert [IR] allow fast-math-flags on phi of FP values This reverts r372866 (git commit `dec03223a9`) llvm-svn: 372868	2019-09-25 13:29:09 +00:00
Sanjay Patel	dec03223a9	[IR] allow fast-math-flags on phi of FP values The changes here are based on the corresponding diffs for allowing FMF on 'select': D61917 As discussed there, we want to have fast-math-flags be a property of an FP value because the alternative (having them on things like fcmp) leads to logical inconsistency such as: https://bugs.llvm.org/show_bug.cgi?id=38086 The earlier patch for select made almost no practical difference because most unoptimized conditional code begins life as a phi (based on what I see in clang). Similarly, I don't expect this patch to do much on its own either because SimplifyCFG promptly drops the flags when converting to select on a minimal example like: https://bugs.llvm.org/show_bug.cgi?id=39535 But once we have this plumbing in place, we should be able to wire up the FMF propagation and start solving cases like that. The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a regression in a LoopVectorize test. We are intersecting the FMF of any FPMathOperator there, so if a phi is not properly annotated, new math instructions may not be either. Once we fix the propagation in SimplifyCFG, it may be safe to remove that hack. Differential Revision: https://reviews.llvm.org/D67564 llvm-svn: 372866	2019-09-25 13:14:12 +00:00
Simon Pilgrim	20f4afc5a7	[DAG] Pull out minimum shift value calc into a helper function. NFCI. llvm-svn: 372856	2019-09-25 12:28:56 +00:00
Simon Pilgrim	be9beef5da	AggressiveAntiDepBreaker - silence static analyzer null dereference warning. NFCI. Assert that we've found the critical path. llvm-svn: 372759	2019-09-24 13:57:51 +00:00
Ilya Biryukov	60e5e0b667	Revert r372333: [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863) Reason: this caused severe compile time regressions in JAX. See email thread of original revision on llvm-commits for details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190923/697042.html llvm-svn: 372756	2019-09-24 13:48:02 +00:00
Simon Pilgrim	9942c07745	[ModuloSchedule] KernelRewriter::rewrite - silence static analyzer dyn_cast<> null dereference warning. NFCI. Assert that we've found the start of the MI schedule list. llvm-svn: 372723	2019-09-24 10:58:42 +00:00
Simon Pilgrim	c81f8e4ce1	lowerObjCCall - silence static analyzer dyn_cast<CallInst> null dereference warnings. NFCI. The static analyzer is warning about a potential null dereference, but we should be able to use cast<CallInst> directly and if not assert will fire for us. llvm-svn: 372720	2019-09-24 10:46:30 +00:00
Pavel Labath	aaff1a631a	MCRegisterInfo: Merge getLLVMRegNum and getLLVMRegNumFromEH Summary: The functions differ in two ways: - getLLVMRegNum could return both "eh" and "other" dwarf register numbers, while getLLVMRegNumFromEH only returned the "eh" number. - getLLVMRegNum asserted if the register was not found, while the second function returned -1. The second distinction was pretty important, but it was very hard to infer that from the function name. Additionally, for the use case of dumping dwarf expressions, we needed a function which can work with both kinds of number, but does not assert. This patch solves both of these issues by merging the two functions into one, returning an Optional<unsigned> value. While the same thing could be achieved by adding an "IsEH" argument to the (renamed) getLLVMRegNumFromEH function, it seemed better to avoid the confusion of two functions and put the choice of asserting into the hands of the caller -- if he checks the Optional value, he can safely process "untrusted" input, and if he blindly dereferences the Optional, he gets the assertion. I've updated all call sites to the new API, choosing between the two options according to the function they were calling originally, except that I've updated the usage in DWARFExpression.cpp to use the "safe" method instead, and added a test case which would have previously triggered an assertion failure when processing (incorrect?) dwarf expressions. Reviewers: dsanders, arsenm, JDevlieghere Subscribers: wdng, aprantl, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67154 llvm-svn: 372710	2019-09-24 09:31:02 +00:00
Amara Emerson	adec1209e6	[GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not unsigned. We were miscompiling switch value comparisons with the wrong signedness, which shows up when we have things like switch case values with i1 types, which end up being legalized incorrectly. Fixes PR43383 llvm-svn: 372675	2019-09-24 00:09:23 +00:00
Sanjay Patel	7414151929	[BreakFalseDeps] ignore function with minsize attribute This came up in the x86-specific: https://bugs.llvm.org/show_bug.cgi?id=43239 ...but it is a general problem for the BreakFalseDeps pass. Dependencies may be broken by adding some other instruction, so that should be avoided if the overall goal is to minimize size. Differential Revision: https://reviews.llvm.org/D67363 llvm-svn: 372628	2019-09-23 17:01:01 +00:00
Guillaume Chatelet	1ae7905fc8	[Alignment][NFC] DataLayout migration to llvm::Align Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: jholewinski, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67914 llvm-svn: 372596	2019-09-23 12:41:36 +00:00
Simon Pilgrim	f6f6c6ca3b	Localizer - fix "variable used but never read" analyzer warning. NFCI. Simplify the code by separating the modification of the Changed variable from returning it. llvm-svn: 372583	2019-09-23 11:38:10 +00:00
Simon Pilgrim	0d6684d7e5	TargetInstrInfo::getStackSlotRange - fix "variable used but never read" analyzer warning. NFCI. We don't need to divide the BitSize local variable at all. llvm-svn: 372582	2019-09-23 11:36:24 +00:00
Simon Pilgrim	0b184b8526	CriticalAntiDepBreaker - Assert that we've found the bottom of the critical path. NFCI. Silences static analyzer null dereference warnings. llvm-svn: 372577	2019-09-23 10:42:47 +00:00
Craig Topper	a533e87792	[X86][SelectionDAGBuilder] Move the hack for handling MMX shift by i32 intrinsics into the X86 backend. These intrinsics should be shift by immediate, but gcc allows any i32 scalar and clang needs to match that. So we try to detect the non-constant case and move the data from an integer register to an MMX register. Previously this was done by creating a v2i32 build_vector and bitcast in SelectionDAGBuilder. This had to be done early since v2i32 isn't a legal type. The bitcast+build_vector would be DAG combined to X86ISD::MMX_MOVW2D which isel will turn into a GPR->MMX MOVD. This commit just moves the whole thing to lowering and emits the X86ISD::MMX_MOVW2D directly to avoid the illegal type. The test changes just seem to be due to nodes being linearized in a different order. llvm-svn: 372535	2019-09-23 01:05:33 +00:00
Simon Pilgrim	c8a9ae4ce2	[SelectionDAG] computeKnownBits/ComputeNumSignBits - cleanup demanded/unknown paths. NFCI. Merge the calls, just adjust the demandedelts if we have a valid extract_subvector constant index, else demand all elts. llvm-svn: 372521	2019-09-22 18:47:12 +00:00
James Molloy	8a74eca398	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount Recommit: fix asan errors. The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader.
virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372463	2019-09-21 08:19:41 +00:00
Amara Emerson	7ac1039957	[GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time. We currently always set the HasCalls on MFI during translation and legalization if we're handling a call or legalizing to a libcall. However, if that call is later optimized to a tail call then we don't need the flag. The flag being set to true causes frame lowering to always save and restore FP/LR, which adds unnecessary code. This change does the same thing as SelectionDAG and ports over some code that scans instructions after selection, using TargetInstrInfo to determine if target opcodes are known calls. Code size geomean improvements on CTMark: -O0 : 0.1% -Os : 0.3% Differential Revision: https://reviews.llvm.org/D67868 llvm-svn: 372443	2019-09-20 23:52:07 +00:00
Mitch Phillips	72a3d8597d	Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount" This commit broke the ASan buildbot. See comments in rL372376 for more information. This reverts commit `15e27b0b6d`. llvm-svn: 372425	2019-09-20 20:25:16 +00:00
Craig Topper	1b7b4b467f	[SelectionDAG][Mips][Sparc] Don't allow SimplifyDemandedBits to constant fold TargetConstant nodes to a Constant. Summary: After the switch in SimplifyDemandedBits, it tries to create a constant when possible. If the original node is a TargetConstant the default in the switch will call computeKnownBits on the TargetConstant which will succeed. This results in the TargetConstant becoming a Constant. But TargetConstant exists to avoid being changed. I've fixed the two cases that relied on this in tree by explicitly making the nodes constant instead of target constant. The Sparc case is an old bug. The Mips case was recently introduced now that ImmArg on intrinsics gets turned into a TargetConstant when the SelectionDAG is created. I've removed the ImmArg since it lowers to generic code. Reviewers: arsenm, RKSimon, spatel Subscribers: jyknight, sdardis, wdng, arichardson, hiraditya, fedor.sergeev, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67802 llvm-svn: 372409	2019-09-20 16:49:51 +00:00
Krzysztof Parzyszek	2b5d7e93dd	[MVT] Add v256i1 to MachineValueType This type can show up when lowering some HVX vector code on Hexagon. llvm-svn: 372403	2019-09-20 15:19:20 +00:00
David Stenberg	b71d8d465a	Add a missing space in a MIR parser error message llvm-svn: 372398	2019-09-20 14:41:41 +00:00
David Tellenbach	2a47c77e72	[FastISel] Fix insertion of unconditional branches during FastISel The insertion of an unconditional branch during FastISel can differ depending on building with or without debug information. This happens because FastISel::fastEmitBranch emits an unconditional branch depending on the size of the current basic block without distinguishing between debug and non-debug instructions. This patch fixes this issue by ignoring debug instructions when getting the size of the basic block. Reviewers: aprantl Reviewed By: aprantl Subscribers: ormris, aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67703 llvm-svn: 372389	2019-09-20 13:22:59 +00:00
James Molloy	15e27b0b6d	[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount The way MachinePipeliner uses these target hooks is stateful - we reduce trip count by one per call to reduceLoopCount. It's a little overfit for hardware loops, where we don't have to worry about stitching a loop induction variable across prologs and epilogs (the induction variable is implicit). This patch introduces a new API: /// Analyze loop L, which must be a single-basic-block loop, and if the /// conditions can be understood enough produce a PipelinerLoopInfo object. virtual std::unique_ptr<PipelinerLoopInfo> analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const; The return value is expected to be an implementation of the abstract class: /// Object returned by analyzeLoopForPipelining. Allows software pipelining /// implementations to query attributes of the loop being pipelined. class PipelinerLoopInfo { public: virtual ~PipelinerLoopInfo(); /// Return true if the given instruction should not be pipelined and should /// be ignored. An example could be a loop comparison, or induction variable /// update with no users being pipelined. virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0; /// Create a condition to determine if the trip count of the loop is greater /// than TC. /// /// If the trip count is statically known to be greater than TC, return /// true. If the trip count is statically known to be not greater than TC, /// return false. Otherwise return nullopt and fill out Cond with the test /// condition. virtual Optional<bool> createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB, SmallVectorImpl<MachineOperand> &Cond) = 0; /// Modify the loop such that the trip count is /// OriginalTC + TripCountAdjust. virtual void adjustTripCount(int TripCountAdjust) = 0; /// Called when the loop's preheader has been modified to NewPreheader. 
virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0; /// Called when the loop is being removed. virtual void disposed() = 0; }; The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while allowing the target to hold its own state across all calls. This API, in particular the disjunction of creating a trip count check condition and adjusting the loop, improves the code quality in ModuloSchedule.cpp. llvm-svn: 372376	2019-09-20 08:57:46 +00:00
Matt Arsenault	dd74f4839b	MachineScheduler: Fix missing dependency with multiple subreg defs If an instruction had multiple subregister defs, and one of them was undef, this would improperly conclude all other lanes are killed. There could still be other defs of those read-undef lanes in other operands. This would improperly remove register uses from CurrentVRegUses, so the visitation of later operands would not find the necessary register dependency. This would also mean this would fail or not depending on how different subregister def operands were ordered. On an undef subregister def, scan the instruction for other subregister defs and avoid killing those. This possibly should be deferring removing anything from CurrentVRegUses until the entire instruction has been processed instead. llvm-svn: 372362	2019-09-20 00:09:15 +00:00
Matt Arsenault	3ecab8e455	Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This reverts r372314, reapplying r372285 and the commits which depend on it (r372286-r372293, and r372296-r372297) This was missing one switch to getTargetConstant in an untested case. llvm-svn: 372338	2019-09-19 16:26:14 +00:00
Simon Pilgrim	af6043557d	[DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863) This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation. The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it to work with target specific combines and opcodes (e.g. X86's FMA variants). Unlike SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations have to duplicate some checks (recursion depth etc.). I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negation propagation). We can build on this in future patches. Differential Revision: https://reviews.llvm.org/D67557 llvm-svn: 372333	2019-09-19 15:02:47 +00:00
Amaury Sechet	9e94ef42ba	[DAGCombiner] Add node to the worklist in topological order in scalarizeExtractedVectorLoad Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66661 llvm-svn: 372327	2019-09-19 14:22:11 +00:00
Simon Pilgrim	c65dd89804	[DAG] Add SelectionDAG::MaxRecursionDepth constant As commented on D67557 we have a lot of uses of depth checks all using magic numbers. This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly. Differential Revision: https://reviews.llvm.org/D67711 llvm-svn: 372315	2019-09-19 12:58:43 +00:00
Hans Wennborg	13bdae8541	Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics" This broke the Chromium build, causing it to fail with e.g. fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15> See llvm-commits thread of r372285 for details. This also reverts r372286, r372287, r372288, r372289, r372290, r372291, r372292, r372293, r372296, and r372297, which seemed to depend on the main commit. > Encode them directly as an imm argument to G_INTRINSIC. > > Since intrinsics can now define what parameters are required to be > immediates, avoid using registers for them. Intrinsics could > potentially want a constant that isn't a legal register type. Also, > since G_CONSTANT is subject to CSE and legalization, transforms could > potentially obscure the value (and create extra work for the > selector). The register bank of a G_CONSTANT is also meaningful, so > this could throw off future folding and legalization logic for AMDGPU. > > This will be much more convenient to work with than needing to call > getConstantVRegVal and checking if it may have failed for every > constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with > immarg operands, many of which need inspection during lowering. Having > to find the value in a register is going to add a lot of boilerplate > and waste compile time. > > SelectionDAG has always provided TargetConstant for constants which > should not be legalized or materialized in a register. The distinction > between Constant and TargetConstant was somewhat fuzzy, and there was > no automatic way to force usage of TargetConstant for certain > intrinsic parameters. They were both ultimately ConstantSDNode, and it > was inconsistently used. It was quite easy to mis-select an > instruction requiring an immediate. For SelectionDAG, start emitting > TargetConstant for these arguments, and using timm to match them. 
> > Most of the work here is to clean up target handling of constants. Some > targets process intrinsics through intermediate custom nodes, which > need to preserve TargetConstant usage to match the intrinsic > expectation. Pattern inputs now need to distinguish whether a constant > is merely compatible with an operand or whether it is mandatory. > > The GlobalISelEmitter needs to treat timm as a special case of a leaf > node, similar to MachineBasicBlock operands. This should also enable > handling of patterns for some G_ instructions with immediates, like > G_FENCE or G_EXTRACT. > > This does include a workaround for a crash in GlobalISelEmitter when > ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372314	2019-09-19 12:33:07 +00:00
Matt Arsenault	c189f023ac	MachineScheduler: Fix assert from not checking subregs The assert would fail if there was a dead def of a subregister if there was a previous use of a different subregister. llvm-svn: 372287	2019-09-19 02:14:12 +00:00
Matt Arsenault	d8399d12cd	GlobalISel: Don't materialize immarg arguments to intrinsics Encode them directly as an imm argument to G_INTRINSIC. Since intrinsics can now define what parameters are required to be immediates, avoid using registers for them. Intrinsics could potentially want a constant that isn't a legal register type. Also, since G_CONSTANT is subject to CSE and legalization, transforms could potentially obscure the value (and create extra work for the selector). The register bank of a G_CONSTANT is also meaningful, so this could throw off future folding and legalization logic for AMDGPU. This will be much more convenient to work with than needing to call getConstantVRegVal and checking if it may have failed for every constant intrinsic parameter. AMDGPU has quite a lot of intrinsics with immarg operands, many of which need inspection during lowering. Having to find the value in a register is going to add a lot of boilerplate and waste compile time. SelectionDAG has always provided TargetConstant for constants which should not be legalized or materialized in a register. The distinction between Constant and TargetConstant was somewhat fuzzy, and there was no automatic way to force usage of TargetConstant for certain intrinsic parameters. They were both ultimately ConstantSDNode, and it was inconsistently used. It was quite easy to mis-select an instruction requiring an immediate. For SelectionDAG, start emitting TargetConstant for these arguments, and using timm to match them. Most of the work here is to clean up target handling of constants. Some targets process intrinsics through intermediate custom nodes, which need to preserve TargetConstant usage to match the intrinsic expectation. Pattern inputs now need to distinguish whether a constant is merely compatible with an operand or whether it is mandatory. The GlobalISelEmitter needs to treat timm as a special case of a leaf node, similar to MachineBasicBlock operands. 
This should also enable handling of patterns for some G_ instructions with immediates, like G_FENCE or G_EXTRACT. This does include a workaround for a crash in GlobalISelEmitter when ARM tries to use "imm" in an output with a "timm" pattern source. llvm-svn: 372285	2019-09-19 01:33:14 +00:00
Adrian Prantl	0779dffbd4	Remove the obsolete BlockByRefStruct flag from LLVM IR DIFlagBlockByRefStruct is an unused DIFlag that originally was used by clang to express (Objective-)C block captures in debug info. For the last year Clang has been emitting complex DIExpressions to describe block captures instead, which makes all the code supporting this flag redundant. This patch removes the flag and all supporting "dead" code, so we can reuse the bit for something else in the future. Since this only affects debug info generated by Clang with the block extension this mostly affects Apple platforms and I don't have any bitcode compatibility concerns for removing this. The Verifier will reject debug info that uses the bit and thus degrade gracefully when LTO'ing older bitcode with a newer compiler. rdar://problem/44304813 Differential Revision: https://reviews.llvm.org/D67453 llvm-svn: 372272	2019-09-18 22:38:56 +00:00
Roman Lebedev	c00f318224	[DAGCombine][ARM][X86] (sub Carry, X) -> (addcarry (sub 0, X), 0, Carry) fold Summary: `DAGCombiner::visitADDLikeCommutative()` already has a sibling fold: `(add X, Carry) -> (addcarry X, 0, Carry)` This fold, as suggested by @efriedma, helps recover from //some// of the regressions of D62266 Reviewers: efriedma, deadalnix Subscribers: javed.absar, kristof.beyls, llvm-commits, efriedma Tags: #llvm Differential Revision: https://reviews.llvm.org/D62392 llvm-svn: 372259	2019-09-18 20:48:27 +00:00
Guillaume Chatelet	97a18dc704	[Alignment][NFC] Align(1) to Align::None() conversions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67715 llvm-svn: 372234	2019-09-18 16:19:40 +00:00
Guillaume Chatelet	d4c4671aa7	[Alignment][NFC] Remove LogAlignment functions Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67620 llvm-svn: 372231	2019-09-18 15:49:49 +00:00
Guillaume Chatelet	35b4b403b4	[Alignment][NFC] Use Align::None instead of 1 Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67704 llvm-svn: 372230	2019-09-18 15:40:20 +00:00
Krasimir Georgiev	2f1bba7fd0	Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize" Summary: This reverts commit r372204. This change causes build bot failures under msan: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio: ``` FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579) ****************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED **************** Script: -- : 'RUN: at line 1'; /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir \| /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - \| /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir -- Exit Code: 2 Command Output (stderr): -- ==62894==WARNING: MemorySanitizer: use-of-uninitialized-value #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10 #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536 #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21 #4 0x4647cb3 in transferSpillOrRestoreInst 
/b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11 #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079 #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361 #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18 #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13 #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27 #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16 #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27 #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863 #13 0x905f21 in compileModule(char, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8 #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22 #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369) MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction 
const&, int, bool, unsigned int&, bool, bool) const Exiting error: -: The file was not recognized as a valid object file FileCheck error: '-' is empty. FileCheck command line: /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir ``` Reviewers: bkramer Reviewed By: bkramer Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67710 llvm-svn: 372228	2019-09-18 14:42:09 +00:00
Sander de Smalen	dc2a7f5b39	[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize This patch fixes a bug exposed by D65653 where a subsequent invocation of `determineCalleeSaves` ends up with a different size for the callee save area, leading to different frame-offsets in debug information. In the invocation by PEI, `determineCalleeSaves` tries to determine whether it needs to spill an extra callee-saved register to get an emergency spill slot. To do this, it calls 'estimateStackSize' and manually adds the size of the callee-saves to this. PEI then allocates the spill objects for the callee saves and the remaining frame layout is calculated accordingly. A second invocation in LiveDebugValues causes estimateStackSize to return the size of the stack frame including the callee-saves. Given that the size of the callee-saves is added to this, these callee-saves are counted twice, which leads `determineCalleeSaves` to believe the stack has become big enough to require spilling an extra callee-save as emergency spillslot. It then updates CalleeSavedStackSize with a larger value. Since CalleeSavedStackSize is used in the calculation of the frame offset in getFrameIndexReference, this leads to incorrect offsets for variables/locals when this information is recalculated after PEI. Reviewers: omjavaid, eli.friedman, thegameg, efriedma Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D66935 llvm-svn: 372204	2019-09-18 09:02:44 +00:00
Francis Visoiu Mistrih	ba2e752c52	[Remarks] Allow the RemarkStreamer to be used directly with a stream The filename in the RemarkStreamer should be optional to allow clients to stream remarks to memory or to existing streams. This introduces a new overload of `setupOptimizationRemarks`, and avoids enforcing the presence of a filename at different places. llvm-svn: 372195	2019-09-18 01:04:45 +00:00
Craig Topper	b5ffbd0b14	[SimplifyDemandedBits] Use APInt::intersects instead of ANDing and comparing to 0 separately. NFC llvm-svn: 372158	2019-09-17 18:19:02 +00:00
Benjamin Kramer	df4b9a3f4f	Hide implementation details in namespaces. llvm-svn: 372113	2019-09-17 12:56:29 +00:00
Graham Hunter	1a9195d817	[SVE][MVT] Fixed-length vector MVT ranges * Reordered MVT simple types to group scalable vector types together. * New range functions in MachineValueType.h to only iterate over the fixed-length int/fp vector types. * Stopped backends which don't support scalable vector types from iterating over scalable types. Reviewers: sdesmalen, greened Reviewed By: greened Differential Revision: https://reviews.llvm.org/D66339 llvm-svn: 372099	2019-09-17 10:19:23 +00:00
Alexander Timofeev	6524a7a2b9	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed Differential Revision: https://reviews.llvm.org/D67101 Reviewers: rampitec, vpykhtin llvm-svn: 372086	2019-09-17 09:08:58 +00:00
Amara Emerson	9d64721ca5	[GlobalISel] Partially revert r371901. r371901 was overeager and widenScalarDst() and the like in the legalizer attempt to increment the insert point given in order to add new instructions after the currently legalizing inst. In cases where the insertion point is not exactly the current instruction, then callers need to de-compensate for the behaviour by decrementing the insertion iterator before calling them. It's not a nice state of affairs, for now just undo the problematic parts of the change. llvm-svn: 372050	2019-09-16 23:46:03 +00:00
Simon Pilgrim	a8a4953fdf	[GlobalISel] findGISelOptimalMemOpLowering - remove dead initialization. NFCI. Fixes static analyzer warning that "Value stored to 'NewTySize' during its initialization is never read". llvm-svn: 371937	2019-09-15 16:56:06 +00:00
Simon Pilgrim	2b4ace3f29	InterleavedLoadCombine - merge isa<> and dyn_cast<> duplicates. NFCI. Silence static analyzer null dereference warning of *dyn_cast<BinaryOperator> by merging with the isa<BinaryOperator> above. llvm-svn: 371935	2019-09-15 16:20:12 +00:00
Simon Pilgrim	b743e94cdc	[TargetLowering] SimplifyDemandedBits - add EXTRACT_SUBVECTOR support. Call SimplifyDemandedBits on the source vector. llvm-svn: 371923	2019-09-14 16:38:26 +00:00
Mingjie Xing	4b191770f4	[ScheduleDAGMILive] Fix typo in comment. Differential Revision: https://reviews.llvm.org/D67478 llvm-svn: 371916	2019-09-14 03:27:38 +00:00
Amara Emerson	02bcc86b08	[GlobalISel] Fix insertion point of new instructions to be after PHIs. For some reason we sometimes insert new instructions one instruction before the first non-PHI when legalizing. This can result in having non-PHI instructions before PHIs, which mean that PHI elimination doesn't catch them. Differential Revision: https://reviews.llvm.org/D67570 llvm-svn: 371901	2019-09-13 21:49:24 +00:00
Jessica Paquette	727328ab63	[AArch64][GlobalISel] Tail call memory intrinsics Because memory intrinsics are handled differently than other calls, we need to check them for tail call eligibility in the legalizer. This allows us to still inline them when it's beneficial to do so, but also tail call when possible. This adds simple tail calling support for when the intrinsic is followed by a return. It ports the attribute checks from `TargetLowering::isInTailCallPosition` into a similarly-named function in LegalizerHelper.cpp. The target-specific `isUsedByReturnOnly` hook is not ported here. Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call memory intrinsics. Update legalize-memcpy-et-al.mir to have a case where we don't tail call. Differential Revision: https://reviews.llvm.org/D67566 llvm-svn: 371893	2019-09-13 20:25:58 +00:00
Alexander Timofeev	9ff70132bf	Revert for: [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. llvm-svn: 371873	2019-09-13 17:37:30 +00:00
Craig Topper	4d1df2aa23	[TargetRegisterInfo] Remove SVT argument from getCommonSubClass. This was added to support fp128 on x86-64, but appears to be unneeded now. This may be because the FR128 register class added back then was merged with the VR128 register class later. llvm-svn: 371815	2019-09-13 05:24:37 +00:00
Tim Shen	a31c521f5e	Temporarily revert r371640 "LiveIntervals: Split live intervals on multiple dead defs". It reveals a miscompile on Hexagon. See PR43302 for details. llvm-svn: 371802	2019-09-13 01:34:25 +00:00
Matt Arsenault	4d33918034	AMDGPU/GlobalISel: Legalize G_FMAD Unlike SelectionDAG, treat this as a normally legalizable operation. In SelectionDAG this is supposed to only ever be formed if it's legal, but I've found that to be restricting. For AMDGPU this is contextually legal depending on whether denormal flushing is allowed in the use function. Technically we currently treat the denormal mode as a subtarget feature, so custom lowering could be avoided. However I consider this to be a defect, and this should be contextually dependent on the controllable rounding mode of the parent function. llvm-svn: 371800	2019-09-13 00:44:35 +00:00
Matt Arsenault	b85c8c4bbd	LiveIntervals: Remove assertion This testcase is invalid, and caught by the verifier. For the verifier to catch it, the live interval computation needs to complete. Remove the assert so the verifier catches this, which is less confusing. In this testcase there is an undefined use of a subregister, and lanes which aren't used or defined. An equivalent testcase with the super-register shrunk to have no untouched lanes already hit this verifier error. llvm-svn: 371792	2019-09-12 23:46:51 +00:00
Philip Reames	079e210463	[SDAG] Update generic code to conservatively check for isAtomic in addition to isVolatile This is the first sweep of generic code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time. That will come later. See D66309 for context. Differential Revision: https://reviews.llvm.org/D66318 llvm-svn: 371786	2019-09-12 22:49:17 +00:00
Jessica Paquette	a42070a6aa	[AArch64][GlobalISel] Support sibling calls with outgoing arguments This adds support for lowering sibling calls with outgoing arguments. e.g ``` define void @foo(i32 %a) ``` Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`. The only thing that is missing is a full port of `TargetLowering::parametersInCSRMatch`. So, if we're using swiftself, we'll never tail call. - Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used for both outgoing and incoming arguments - Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for stack arguments. - Teach `lowerFormalArguments` to set the bytes in the caller's stack argument area. This is used later to check if the tail call's parameters will fit on the caller's stack. - Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on the callee's outgoing arguments. For testing: - Update call-translator-tail-call to verify that we can now tail call with outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the size of the caller's stack - Remove GISel-specific check lines from speculation-hardening.ll, since GISel now tail calls like the other selectors - Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that test now - Add a GISel test line to tailcall_misched_graph.ll since we tail call there now. Add specific check lines for GISel, since the debug output from the machine-scheduler differs with GlobalISel. The dependency still holds, but the output comes out in a different order. Differential Revision: https://reviews.llvm.org/D67471 llvm-svn: 371780	2019-09-12 22:10:36 +00:00
Craig Topper	efe6724b9f	[DAGCombiner][X86] Pass the CmpOpVT to reduceSelectOfFPConstantLoads so X86 can exclude fp128 compares. The X86 decision assumes the compare will produce a result in an XMM register, but that can't happen for an fp128 compare since those go to a libcall that returns an i32. Pass the VT so X86 can check the type. llvm-svn: 371775	2019-09-12 21:30:18 +00:00
Craig Topper	344c398e2a	[SelectionDAGBuilder] Simplify loop in visitSelect back to how it was before r255558. This code was changed to accommodate fp128 being softened to itself during type legalization on x86-64. This was done in order to create libcalls while having fp128 as a legal type. We're now doing the libcall creation during LegalizeDAG and the type legalization changes to enable the old behavior have been removed. So this change to SelectionDAGBuilder is no longer needed. llvm-svn: 371771	2019-09-12 21:00:32 +00:00
David Green	a6e944b173	[CGP] Ensure sinking multiple instructions does not invalidate dominance checks In MVE, as of rL371218, we are attempting to sink chains of instructions such as: %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0 %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer In certain situations though, we can end up breaking the dominance relations of instructions. This happens when we sink the instruction into a loop, but cannot remove the originals. The Use is updated, which might in fact be a Use from the second instruction to the first. This attempts to fix that by reversing the order of instructions that are sunk, and ensuring that we update the uses on new instructions if they have already been sunk, not the old ones. Differential Revision: https://reviews.llvm.org/D67366 llvm-svn: 371743	2019-09-12 16:00:07 +00:00
Guillaume Chatelet	af11cc7eb5	[Alignment] Move OffsetToAlignment to Alignment.h Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D67499 llvm-svn: 371742	2019-09-12 15:20:36 +00:00
Simon Pilgrim	da59a6bf7d	[DAGCombine] visitFDIV - Use isCheaperToUseNegatedFPOps helper for (fdiv (fneg X), (fneg Y)) -> (fdiv X, Y). NFCI. Minor cleanup to use equivalent helper code. llvm-svn: 371724	2019-09-12 11:03:09 +00:00
Tim Northover	f1c2892912	AArch64: support arm64_32, an ILP32 slice for watchOS. This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM. FastISel is mostly disabled for now since it would generate incorrect code for ILP32. llvm-svn: 371722	2019-09-12 10:22:23 +00:00
Tim Northover	98534843fb	CodeGenPrep: add separate hook to say when GEPs should be used for sinking. NFCI. Up to now, we've decided whether to sink address calculations using GEPs or normal arithmetic based on the useAA hook, but there are other reasons GEPs might be preferred. So this patch splits the two questions, with a default implementation falling back to useAA. llvm-svn: 371721	2019-09-12 10:21:00 +00:00
Qiu Chaofan	b7fb5d0f6f	[DAGCombiner] Improve division estimation of floating points. Current implementation of estimating divisions loses precision since it estimates reciprocal first and does multiplication. This patch is to re-order arithmetic operations in the last iteration in DAGCombiner to improve the accuracy. Reviewed By: Sanjay Patel, Jinsong Ji Differential Revision: https://reviews.llvm.org/D66050 llvm-svn: 371713	2019-09-12 07:51:24 +00:00
Craig Topper	b8dd075275	[LegalizeTypes] Remove code for softening a float type to itself. This was previously used to turn fp128 operations into libcalls on X86. This is now done through op legalization after r371672. This restores much of this code to before r254653. llvm-svn: 371709	2019-09-12 05:55:14 +00:00
Amara Emerson	55d86f04c7	[AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack. First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually make a difference for us in CallLowering but fix that anyway to silence the assert. The bigger issue was that after fixing the assert we were generating invalid MIR because the merging/unmerging of values split across multiple registers wasn't also implemented for memory locs. This happens when we run out of registers and have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but for now just fall back. llvm-svn: 371693	2019-09-11 23:53:23 +00:00
Vedant Kumar	0b91333d59	[DWARF] Emit call site parameter info when tuning for lldb Emit debug entry values using standard DWARF5 opcodes when the debugger tuning is set to lldb. Differential Revision: https://reviews.llvm.org/D67410 llvm-svn: 371666	2019-09-11 21:23:39 +00:00
Matt Arsenault	81196a595c	LiveIntervals: Split live intervals on multiple dead defs If there are multiple dead defs of the same virtual register, these are required to be split into multiple virtual registers with separate live intervals to avoid a verifier error. llvm-svn: 371640	2019-09-11 17:59:21 +00:00
Guillaume Chatelet	97264366fb	[Alignment][NFC] use llvm::Align for AsmPrinter::EmitAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: dschuff, sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67443 llvm-svn: 371616	2019-09-11 13:37:35 +00:00
Guillaume Chatelet	48904e9452	[Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing Summary: This catches malformed mir files which specify alignment as log2 instead of pow2. See https://reviews.llvm.org/D65945 for reference. This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67433 llvm-svn: 371608	2019-09-11 11:16:48 +00:00
Jessica Paquette	469d42fcf6	[GlobalISel] When a tail call is emitted in a block, stop translating it This fixes a crash in tail call translation caused by assume and lifetime_end intrinsics. It's possible to have instructions other than a return after a tail call which will still have `Analysis::isInTailCallPosition` return true. (Namely, lifetime_end and assume intrinsics.) If we emit a tail call, we should stop translating instructions in the block. Otherwise, we can end up emitting an extra return, or dead instructions in general. This makes the verifier unhappy, and is generally unfortunate for codegen. This also removes the code from AArch64CallLowering that checks if we have a tail call when lowering a return. This is covered by the new code now. Also update call-translator-tail-call.ll to show that we now properly tail call in the presence of lifetime_end and assume. Differential Revision: https://reviews.llvm.org/D67415 llvm-svn: 371572	2019-09-10 23:34:45 +00:00
Jessica Paquette	2af5b193d5	[AArch64][GlobalISel] Support sibling calls with mismatched calling conventions Add support for sibcalling calls whose calling convention differs from the caller's. - Port over `CCState::resultsCompatible` from CallingConvLower.cpp into CallLowering. This is used to verify that the way the caller and callee CC handle incoming arguments matches up. - Add `CallLowering::analyzeCallResult`. This is basically a port of `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`. - Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks that the calling conventions are compatible, and that the caller and callee preserve the same registers. For testing: - Update call-translator-tail-call.ll to show that we can now handle this. - Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call when the regmasks don't line up. Differential Revision: https://reviews.llvm.org/D67361 llvm-svn: 371570	2019-09-10 23:25:12 +00:00
Sanjay Patel	df6a958dcb	[BreakFalseDeps] fix typos/grammar in documentation comment; NFC llvm-svn: 371516	2019-09-10 13:00:31 +00:00
Guillaume Chatelet	3729b17cff	[Alignment][NFC] Use llvm::Align for TargetLowering::getPrefLoopAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Reviewed By: courbet Subscribers: wuzish, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67386 llvm-svn: 371511	2019-09-10 12:00:43 +00:00
Alexander Timofeev	c2d292f839	[AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Reviewers: rampitec, vpykhtin Differential Revision: https://reviews.llvm.org/D67101 llvm-svn: 371508	2019-09-10 10:58:57 +00:00
Dmitri Gribenko	2bf8d77453	Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."" This reverts commit r371502, it broke tests (clang/test/CodeGenCXX/auto-var-init.cpp). llvm-svn: 371507	2019-09-10 10:39:09 +00:00
Clement Courbet	612c260ec3	Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline." With a fix for sanitizer breakage (see explanation in D60318). llvm-svn: 371502	2019-09-10 09:18:00 +00:00
Guillaume Chatelet	b6722af068	[Alignment] Use Align for TargetLowering::MinStackArgumentAlignment Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67288 llvm-svn: 371498	2019-09-10 09:01:18 +00:00
Craig Topper	e8b432fa0e	[LegalizeTypes] Teach SoftenFloatOp_SELECT_CC to handle operand 2 or 3 being softened. This can only happen on X86 when fp128 is a legal type, but we go through softening to generate libcalls. This causes fp128 to be softened to fp128 instead of an integer type. This can be removed if D67128 lands. llvm-svn: 371493	2019-09-10 07:56:02 +00:00
Aditya Nandakumar	5112b71126	[GlobalISel]: Fix a bug where we could dereference None getConstantVRegVal returns None when dealing with constants > 64 bits. Don't assume we always have a value in GISelKnownBits. llvm-svn: 371465	2019-09-09 22:51:41 +00:00
Philip Reames	20aafa3156	Introduce infrastructure for an incremental port of SelectionDAG atomic load/store handling This is the first patch in a large sequence. The eventual goal is to have unordered atomic loads and stores - and possibly ordered atomics as well - handled through the normal ISEL codepaths for loads and stores. Today, they're handled w/instances of AtomicSDNodes. The result of which is that all transforms need to be duplicated to work for unordered atomics. The benefit of the current design is that it's harder to introduce a silent miscompile by adding a transform which forgets about atomicity. See the thread on llvm-dev titled "FYI: proposed changes to atomic load/store in SelectionDAG" for further context. Note that this patch is NFC unless the experimental flag is set. The basic strategy I plan on taking is: (1) introduce infrastructure and a flag for testing (this patch); (2) audit uses of isVolatile, and apply isAtomic conservatively*; (3) piecemeal conservative* update generic code and x86 backend code in individual reviews w/tests for cases which didn't check volatile, but can be found with inspection; (4) flip the flag at the end (with minimal diffs); (5) work through todo list identified in (2) and (3) exposing performance ops. (*) The "conservative" bit here is aimed at minimizing the number of diffs involved in (4). Ideally, there'd be none. In practice, getting it down to something reviewable by a human is the actual goal. Note that there are (currently) no paths which produce LoadSDNode or StoreSDNode with atomic MMOs, so we don't need to worry about preserving any behaviour there. We've taken a very similar strategy twice before with success - once at IR level, and once at the MI level (post ISEL). Differential Revision: https://reviews.llvm.org/D66309 llvm-svn: 371441	2019-09-09 19:23:22 +00:00
Eli Friedman	79f0d3a6e5	[IfConversion] Correctly handle cases where analyzeBranch fails. If analyzeBranch fails, on some targets, the out parameters point to some blocks in the function. But we can't use that information, so make sure to clear it out. (In some places in IfConversion, we assume that any block with a TrueBB is analyzable.) The change to the testcase makes it trigger a bug on builds without this fix: IfConvertDiamond tries to perform a followup "merge" operation, which isn't legal, and we somehow end up with a branch to a deleted MBB. I'm not sure how this doesn't crash the compiler. Differential Revision: https://reviews.llvm.org/D67306 llvm-svn: 371434	2019-09-09 18:29:27 +00:00
Craig Topper	5ebd0a6e88	[SelectionDAG] Remove ISD::FP_ROUND_INREG I don't think anything in tree creates this node. So all of this code appears to be dead. Code coverage agrees http://lab.llvm.org:8080/coverage/coverage-reports/llvm/coverage/Users/buildslave/jenkins/workspace/clang-stage2-coverage-R/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp.html Differential Revision: https://reviews.llvm.org/D67312 llvm-svn: 371431	2019-09-09 17:54:44 +00:00
Dmitri Gribenko	d9c4060bd5	Revert "[MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp" This reverts commit 371359. I'm suspecting a miscompile, I posted a reproducer to https://reviews.llvm.org/D65267. llvm-svn: 371421	2019-09-09 16:46:45 +00:00
James Molloy	b6c7fce67a	[DFAPacketizer] Reapply: Track resources for packetized instructions Reapply with fix to reduce resources required by the compiler - use unsigned[2] instead of std::pair. This causes clang and gcc to compile the generated file multiple times faster, and hopefully will reduce the resource requirements on Visual Studio also. This fix is a little ugly but it's clearly the same issue the previous author of DFAPacketizer faced (the previous tables use unsigned[2] rather uglily too). This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 llvm-svn: 371399	2019-09-09 13:17:55 +00:00
Simon Pilgrim	462e3d8050	Revert rL371198 from llvm/trunk: [DFAPacketizer] Track resources for packetized instructions This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 ........ Reverted as this is causing "compiler out of heap space" errors on MSVC 2017/19 NDEBUG builds llvm-svn: 371393	2019-09-09 12:33:22 +00:00
Tim Northover	06d93e0a25	GlobalISel: fix unused warnings in release builds. llvm-svn: 371385	2019-09-09 10:36:58 +00:00
Tim Northover	36147adc0b	GlobalISel: add combiner to form indexed loads. Loosely based on DAGCombiner version, but this part is slightly simpler in GlobalISel because all address calculation is performed by G_GEP. That makes the inc/dec distinction moot so there's just pre/post to think about. No targets can handle it yet so testing is via a special flag that overrides target hooks. llvm-svn: 371384	2019-09-09 10:04:23 +00:00
Kai Luo	9115c477bb	[MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp Summary: After tailduplication, we have redundant copies. We can remove these copies in machine-cp if it's safe to, i.e. ``` $reg0 = OP ... ... <<< No read or clobber of $reg0 and $reg1 $reg1 = COPY $reg0 <<< $reg0 is killed ... <RET> ``` will be transformed to ``` $reg1 = OP ... ... <RET> ``` Differential Revision: https://reviews.llvm.org/D65267 llvm-svn: 371359	2019-09-09 02:32:42 +00:00
Craig Topper	dac34f52d3	[DAGCombiner][X86][ARM] Teach visitMULO to fold multiplies with 0 to 0 and no carry. I modified the ARM test to use two inputs instead of 0 so the test hopefully still tests what was intended. llvm-svn: 371344	2019-09-08 19:24:39 +00:00
David Stenberg	5a583665f4	[DebugInfo][X86] Describe call site values for zero-valued imms Summary: Add zero-materializing XORs to X86's describeLoadedValue() hook in order to produce call site values. I have had to change the defs logic in collectCallSiteParameters() a bit to be able to describe the XORs. The XORs implicitly define $eflags, which would cause them to never be considered, due to a guard condition that I->getNumDefs() is one. I have changed that condition so that we now only consider instructions where a forwarded register overlaps with the instruction's single explicit define. We still need to collect the implicit defines of other forwarded registers to remove them from the work list. I'm not sure how to move towards supporting instructions with multiple explicit defines, cases where forwarded register are implicitly defined, and/or cases where an instruction produces values for multiple forwarded registers. Perhaps the describeLoadedValue() hook should take a register argument, and we then leave it up to the hook to describe the loaded value in that register? I have not yet encountered a situation where that would be necessary though. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: ychen, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67225 llvm-svn: 371333	2019-09-08 14:22:06 +00:00
David Stenberg	8b70139e95	[NFC] Make the describeLoadedValue() hook return machine operand objects Summary: This changes the ParamLoadedValue pair which the describeLoadedValue() hook returns so that MachineOperand objects are returned instead of pointers. When describing call site values we may need to describe operands which are not part of the instruction. One such example is zero-materializing XORs on x86, which I have implemented support for in a child revision. Instead of having to return a pointer to an operand stored somewhere outside the instruction, start returning objects directly instead, as that simplifies the code. The MachineOperand class only holds POD members, and on x86-64 it is 32 bytes large. That combined with copy elision means that the overhead of returning a machine operand object from the hook does not become very large. I benchmarked this on an 8-thread i7-8650U machine with 32 GB RAM. The benchmark consisted of building a clang 8.0 binary configured with: -DCMAKE_BUILD_TYPE=RelWithDebInfo \ -DLLVM_TARGETS_TO_BUILD=X86 \ -DLLVM_USE_SANITIZER=Address \ -DCMAKE_CXX_FLAGS="-Xclang -femit-debug-entry-values -stdlib=libc++" The average wall clock time increased by 4 seconds, from 62:05 to 62:09, which is a 0.1% increase. Reviewers: aprantl, vsk, djtodoro, NikolaPrica Reviewed By: vsk Subscribers: hiraditya, ychen, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67261 llvm-svn: 371332	2019-09-08 14:05:10 +00:00
Bjorn Pettersson	d065c81164	[CodeGen] Handle SMULFIXSAT with scale zero in TargetLowering::expandFixedPointMul Summary: Normally TargetLowering::expandFixedPointMul would handle SMULFIXSAT with scale zero by using an SMULO to compute the product and determine if saturation is needed (if overflow happened). But if SMULO isn't custom/legal it falls through and uses the same technique, using MULHS/SMUL_LOHI, as used for non-zero scales. Problem was that when checking for overflow (handling saturation) when not using MULO we did not expect to find a zero scale. So we ended up in an assertion when doing APInt::getLowBitsSet(VTSize, Scale - 1) This patch fixes the problem by adding a new special case for how saturation is computed when scale is zero. Reviewers: RKSimon, bevinh, leonardchan, spatel Reviewed By: RKSimon Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67071 llvm-svn: 371309	2019-09-07 12:16:23 +00:00
Bjorn Pettersson	5e331e4ce8	[Intrinsic] Add the llvm.umul.fix.sat intrinsic Summary: Add an intrinsic that takes 2 unsigned integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Patch by: leonardchan, bjope Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel Reviewed By: leonardchan Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D57836 llvm-svn: 371308	2019-09-07 12:16:14 +00:00
Bjorn Pettersson	2b698a13a1	[DwarfExpression] Disallow some rewrites to avoid undefined behavior Summary: The value operand in DW_OP_plus_uconst/DW_OP_constu value can be large (it uses uint64_t as representation internally in LLVM). This means that in the uint64_t to int conversions, previously done by DwarfExpression::addMachineRegExpression, could lose information. Also, the negation done in "-Offset" was undefined behavior in case Offset was exactly INT_MIN. To avoid the above problems, we now avoid transformation like [Reg, DW_OP_plus_uconst, Offset] --> [DW_OP_breg, Offset] and [Reg, DW_OP_constu, Offset, DW_OP_plus] --> [DW_OP_breg, Offset] when Offset > INT_MAX. And we avoid to transform [Reg, DW_OP_constu, Offset, DW_OP_minus] --> [DW_OP_breg,-Offset] when Offset > INT_MAX+1. The patch also adjusts DwarfCompileUnit::constructVariableDIEImpl to make sure that "DW_OP_constu, Offset, DW_OP_minus" is used instead of "DW_OP_plus_uconst, Offset" when creating DIExpressions with negative frame index offsets. Notice that this might just be the tip of the iceberg. There are lots of fishy handling related to these constants. I think both DIExpression::appendOffset and DIExpression::extractIfOffset may trigger undefined behavior for certain values. Reviewers: sdesmalen, rnk, JDevlieghere Reviewed By: JDevlieghere Subscribers: jholewinski, aprantl, hiraditya, ychen, uabelho, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D67263 llvm-svn: 371304	2019-09-07 11:40:10 +00:00
Teresa Johnson	9c27b59cec	Change TargetLibraryInfo analysis passes to always require Function Summary: This is the first change to enable the TLI to be built per-function so that -fno-builtin* handling can be migrated to use function attributes. See discussion on D61634 for background. This is an enabler for fixing handling of these options for LTO, for example. This change should not affect behavior, as the provided function is not yet used to build a specifically per-function TLI, but rather enables that migration. Most of the changes were very mechanical, e.g. passing a Function to the legacy analysis pass's getTLI interface, or in Module level cases, adding a callback. This is similar to the way the per-function TTI analysis works. There was one place where we were looking for builtins but not in the context of a specific function. See FindCXAAtExit in lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround could provide the wrong behavior in some corner cases. Suggestions welcome. Reviewers: chandlerc, hfinkel Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66428 llvm-svn: 371284	2019-09-07 03:09:36 +00:00
Matt Arsenault	cf10372119	GlobalISel: Add G_FMAD instruction llvm-svn: 371254	2019-09-06 20:49:10 +00:00
James Molloy	db2fa06722	[DFAPacketizer] Track resources for packetized instructions This patch allows the DFAPacketizer to be queried after a packet is formed to work out which resources were allocated to the packetized instructions. This is particularly important for targets that do their own bundle packing - it's not sufficient to know simply that instructions can share a packet; which slots are used is also required for encoding. This extends the emitter to emit a side-table containing resource usage diffs for each state transition. The packetizer maintains a set of all possible resource states in its current state. After packetization is complete, all remaining resource states are possible packetization strategies. The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default (most uses of the packetizer like MachinePipeliner don't care and don't need the extra maintained state). Differential Revision: https://reviews.llvm.org/D66936 llvm-svn: 371198	2019-09-06 12:20:08 +00:00
Jeremy Morse	5d9cd3b4ca	[DebugInfo] LiveDebugValues: explicitly terminate overwritten stack locations If a stack spill location is overwritten by another spill instruction, any variable locations pointing at that slot should be terminated. We cannot rely on spills always being restored to registers or variable locations being moved by a DBG_VALUE: the register allocator is entitled to spill a value and then forget about it when it goes out of liveness. To address this, scan for memory writes to spill locations, even those we don't consider to be normal "spills". isSpillInstruction and isLocationSpill distinguish the two now. After identifying spill overwrites, terminate the open range, and insert a $noreg DBG_VALUE for that variable. Differential Revision: https://reviews.llvm.org/D66941 llvm-svn: 371193	2019-09-06 10:08:22 +00:00
Kang Zhang	f879c68755	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: Fix a bug where the jump table was not updated, and recommit. The `block-placement` pass creates some patterns for unconditional branches on which we can do the simple early return. But the `early-ret` pass runs before `block-placement`, and we don't want to run it again. This patch does the simple early return to optimize the blocks at the end of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 371177	2019-09-06 08:16:18 +00:00
Puyan Lotfi	dc97ca9f25	[MIR] MIRNamer pass for improving MIR test authoring experience. This patch reuses the MIR vreg renamer from the MIRCanonicalizerPass to cleanup names of vregs in a MIR file for MIR test authors. I found it useful when writing a regression test for a globalisel failure I encountered recently and thought it might be useful for other folks as well. Differential Revision: https://reviews.llvm.org/D67209 llvm-svn: 371121	2019-09-05 20:44:33 +00:00
Daniel Sanders	f803237926	[globalisel][knownbits] Account for missing type constraints Now that we look through copies, it's possible to visit registers that have a register class constraint but not a type constraint. Avoid looking through copies when this occurs as the SrcReg won't be able to determine its bit width or any known bits. Along the same lines, if the initial query is on a register that doesn't have a type constraint then the result is a default-constructed KnownBits, that is, a 1-bit fully-unknown value. llvm-svn: 371116	2019-09-05 20:26:02 +00:00
Jessica Paquette	20e8667098	Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls" Recommit basic sibling call lowering (https://reviews.llvm.org/D67189) The issue was that if you have a return type other than void, call lowering will emit COPYs to get the return value after the call. Disallow sibling calls other than ones that return void for now. Also proactively disable swifterror tail calls for now, since there's a similar issue with COPYs there. Update call-translator-tail-call.ll to include test cases for each of these things. llvm-svn: 371114	2019-09-05 20:18:34 +00:00
Eli Friedman	cae1e47f6e	[IfConversion] Fix diamond conversion with unanalyzable branches. The code was incorrectly counting the number of identical instructions, and therefore tried to predicate an instruction which should not have been predicated. This could have various effects: a compiler crash, an assembler failure, a miscompile, or just generating an extra, unnecessary instruction. Instead of depending on TargetInstrInfo::removeBranch, which only works on analyzable branches, just remove all branch instructions. Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and https://bugs.llvm.org/show_bug.cgi?id=41121 . Differential Revision: https://reviews.llvm.org/D67203 llvm-svn: 371111	2019-09-05 20:02:38 +00:00
Guillaume Chatelet	f9f31ce6a9	[Alignment][NFC] Change internal representation of TargetLowering.h Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67226 llvm-svn: 371082	2019-09-05 15:44:33 +00:00
Simon Pilgrim	071287c5a9	Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls This adds support for basic sibling call lowering in AArch64. The intent here is to only handle tail calls which do not change the ABI (hence, sibling calls.) At this point, it is very restricted. It does not handle - Vararg calls. - Calls with outgoing arguments. - Calls whose calling conventions differ from the caller's calling convention. - Tail/sibling calls with BTI enabled. This patch adds - `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent to the same function in AArch64ISelLowering.cpp (albeit with the restrictions above.) - `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in AArch64ISelLowering.cpp. - `getCallOpcode`, which is exactly what it sounds like. Tail/sibling calls are lowered by checking if they pass target-independent tail call positioning checks, and checking if they satisfy `isEligibleForTailCallOptimization`. If they do, then a tail call instruction is emitted instead of a normal call. If we have a sibling call (which is always the case in this patch), then we do not emit any stack adjustment operations. When we go to lower a return, we check if we've already emitted a tail call. If so, then we skip the return lowering. For testing, this patch - Adds call-translator-tail-call.ll to test which tail calls we currently lower, which ones we don't, and which ones we shouldn't. - Updates branch-target-enforcement-indirect-calls.ll to show that we fall back as expected. Differential Revision: https://reviews.llvm.org/D67189 ........ This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll llvm-svn: 371051	2019-09-05 10:38:39 +00:00
Guillaume Chatelet	aff45e4b23	[LLVM][Alignment] Make functions using log of alignment explicit Summary: This patch renames functions that take or return alignment as log2; this will help with the transition to llvm::Align. The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment. A few renames uncovered dubious assignments: - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were dealing with log2(align). This patch fixes it and updates the documentation. - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation. - `MachineFunction` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation. Reviewers: lattner, thegameg, courbet Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65945 llvm-svn: 371045	2019-09-05 10:00:22 +00:00
Jessica Paquette	b78324fc40	[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls This adds support for basic sibling call lowering in AArch64. The intent here is to only handle tail calls which do not change the ABI (hence, sibling calls.) At this point, it is very restricted. It does not handle - Vararg calls. - Calls with outgoing arguments. - Calls whose calling conventions differ from the caller's calling convention. - Tail/sibling calls with BTI enabled. This patch adds - `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent to the same function in AArch64ISelLowering.cpp (albeit with the restrictions above.) - `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in AArch64ISelLowering.cpp. - `getCallOpcode`, which is exactly what it sounds like. Tail/sibling calls are lowered by checking if they pass target-independent tail call positioning checks, and checking if they satisfy `isEligibleForTailCallOptimization`. If they do, then a tail call instruction is emitted instead of a normal call. If we have a sibling call (which is always the case in this patch), then we do not emit any stack adjustment operations. When we go to lower a return, we check if we've already emitted a tail call. If so, then we skip the return lowering. For testing, this patch - Adds call-translator-tail-call.ll to test which tail calls we currently lower, which ones we don't, and which ones we shouldn't. - Updates branch-target-enforcement-indirect-calls.ll to show that we fall back as expected. Differential Revision: https://reviews.llvm.org/D67189 llvm-svn: 370996	2019-09-04 22:54:52 +00:00
Puyan Lotfi	028061d4eb	[mir-canon][NFC] Move MIR vreg renaming code to separate file for better reuse. Moving MIRCanonicalizerPass vreg renaming code to MIRVRegNamerUtils so that it can be reused in another pass (i.e. planning to write a standalone mir-namer pass). I'm going to write a mir-namer pass so that next time someone has to author a test in MIR, they can use it to clean up the naming and make it more readable by having the numbered vregs swapped out with named vregs. Differential Revision: https://reviews.llvm.org/D67114 llvm-svn: 370985	2019-09-04 21:29:10 +00:00
Matt Arsenault	5ff310e298	GlobalISel: Add basic legalization for G_BITREVERSE llvm-svn: 370979	2019-09-04 20:46:15 +00:00
Daniel Sanders	b276a9a51e	[globalisel] Support trivial COPY in GISelKnownBits Summary: Allow GISelKnownBits to look through the trivial case of TargetOpcode::COPY Reviewers: aditya_nandakumar Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67131 llvm-svn: 370955	2019-09-04 18:59:43 +00:00
Philip Reames	3a49ca331f	Update CodeGen to use hasMetadata as appropriate [NFC] My initial grepping for rL370933 missed a directory worth of cases. llvm-svn: 370942	2019-09-04 17:46:55 +00:00
Matt Arsenault	70becc20fa	GlobalISel: Add G_BITREVERSE This is the first failing pattern for AMDGPU and is trivial to handle. llvm-svn: 370927	2019-09-04 17:06:53 +00:00
James Molloy	11f0f7f583	[ModuloSchedule] Fix no-asserts build Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893. llvm-svn: 370894	2019-09-04 12:57:23 +00:00
James Molloy	fef9f59055	[ModuloSchedule] Introduce PeelingModuloScheduleExpander This is the beginnings of a reimplementation of ModuloScheduleExpander. It works by generating a single-block correct pipelined kernel and then peeling out the prolog and epilogs. This patch implements kernel generation as well as a validator that will confirm the number of phis added is the same as the ModuloScheduleExpander. Prolog and epilog peeling will come in a different patch. Differential Revision: https://reviews.llvm.org/D67081 llvm-svn: 370893	2019-09-04 12:54:24 +00:00
Jeremy Morse	337a7cb55e	[DebugInfo] LiveDebugValues: locations with different exprs should not be merged When comparing variable locations, LiveDebugValues currently considers only the machine location, ignoring any DIExpression applied to it. This is a problem because that DIExpression can do pretty much anything to the machine location, for example dereferencing it. This patch adds DIExpressions to that comparison; now variables based on the same register/memory-location but with different expressions will compare differently, and be dropped if we attempt to merge them between blocks. This reduces variable coverage-range a little, but only because we were producing broken locations. Differential Revision: https://reviews.llvm.org/D66942 llvm-svn: 370877	2019-09-04 11:09:05 +00:00
Jeremy Morse	c8c5f2a84e	[LiveDebugValues][NFC] Silence an unused variable warning On release builds, 'MI' isn't used by anything (it's already inserted into a block by BuildMI), while on non-release builds it's used by a LLVM_DEBUG statement. Mark as explicitly used to avoid the warning. llvm-svn: 370870	2019-09-04 10:18:03 +00:00
Amara Emerson	5d5150f0b4	[GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type combination. Similar to the issue with G_ZEXT that was fixed earlier, this is a quick fix to fall back if the source type is not exactly half of the dest type. Fixes the clang-cmake-aarch64-lld bot build. llvm-svn: 370847	2019-09-04 07:58:45 +00:00
Amara Emerson	2a2c25ba48	[AArch64][GlobalISel] Legalize 128 bit divisions to libcalls. Now that we have the infrastructure to support s128 types as parameters we can expand these to libcalls. Differential Revision: https://reviews.llvm.org/D66185 llvm-svn: 370823	2019-09-03 21:42:32 +00:00
Amara Emerson	fbaf425b79	[GlobalISel][CallLowering] Add support for splitting types according to calling conventions. On AArch64, s128 types have to be split into s64 GPRs when passed as arguments. This change adds the generic support in call lowering for dealing with multiple registers, for incoming and outgoing args. Support for splitting for return types not yet implemented. Differential Revision: https://reviews.llvm.org/D66180 llvm-svn: 370822	2019-09-03 21:42:28 +00:00
Bjorn Pettersson	b0eb394417	[CodeGen] Use FSHR in DAGTypeLegalizer::ExpandIntRes_MULFIX Summary: Simplify the right shift of the intermediate result (given in four parts) by using funnel shift. There are some impact on lit tests, but that seems to be related to register allocation differences due to how FSHR is expanded on X86 (giving a slightly different operand order for the OR operations compared to the old code). Reviewers: leonardchan, RKSimon, spatel, lebedev.ri Reviewed By: RKSimon Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, pzheng, bevinh, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D67036 llvm-svn: 370813	2019-09-03 19:35:07 +00:00
James Molloy	935499579c	[MachinePipeliner] Add a way to unit-test the schedule emitter Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself. One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway. This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at. We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with: llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir And run the emission in isolation with: llc < test.mir -run-pass=modulo-schedule-test llvm-svn: 370705	2019-09-03 08:20:31 +00:00
Craig Topper	9c74c77404	[LegalizeDAG] Pass DAG to two calls to SDNode::dump in debug prints so that they will print target specific nodes correctly. The dump methods can only print target node names correctly if they can get access to the TLI object. llvm-svn: 370694	2019-09-03 02:51:14 +00:00
Robert Lougher	13190c4225	[TargetLowering][PS4] Add sincos(f) lib functions when target is PS4 PS4 supports sincosf and sincos. Adding the library functions enables the sin(f)+cos(f) -> sincos(f) optimization. Differential Revision: https://reviews.llvm.org/D67009 llvm-svn: 370675	2019-09-02 16:53:32 +00:00
Sanjay Patel	4e54cf3e0e	[DAGCombiner] try to form test+set out of shift+mask patterns The motivating bugs are: https://bugs.llvm.org/show_bug.cgi?id=41340 https://bugs.llvm.org/show_bug.cgi?id=42697 As discussed there, we could view this as a failure of IR canonicalization, but then we would need to implement a backend fixup with target overrides to get this right in all cases. Instead, we can just view this as a codegen opportunity. It's not even clear for x86 exactly when we should favor test+set; some CPUs have better theoretical throughput for the ALU ops than bt/test. This patch is made more complicated than I expected because there's an early DAGCombine for 'and' that can change types of the intermediate ops via trunc+anyext. Differential Revision: https://reviews.llvm.org/D66687 llvm-svn: 370668	2019-09-02 14:52:09 +00:00
Jeremy Morse	22493f66f1	[DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union. The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details. Un x-fail a test that was suffering from this from a previous patch. Differential Revision: https://reviews.llvm.org/D66895 llvm-svn: 370648	2019-09-02 12:28:36 +00:00
Amara Emerson	453ef4e376	[AArch64][GlobalISel] Fix zext narrowScalar to use the right type when creating the merges. Fixes PR43171. llvm-svn: 370627	2019-09-02 08:18:55 +00:00
Sanjay Patel	c882208367	[DAGCombiner] improve throughput of shift+logic+shift The motivating case for this is a long way from here: https://bugs.llvm.org/show_bug.cgi?id=43146 ...but I think this is where we have to start. We need to canonicalize/optimize sequences of shift and logic to ease pattern matching for things like bswap and improve perf in general. But without the artificial limit of '!LegalTypes' (early combining), there are a lot of test diffs, and not all are good. In the minimal tests added for this proposal, x86 should have better throughput in all cases. AArch64 is neutral for scalar tests because it can fold shifts into bitwise logic ops. There are 3 shift opcodes and 3 logic opcodes for a total of 9 possible patterns: https://rise4fun.com/Alive/VlI https://rise4fun.com/Alive/n1m https://rise4fun.com/Alive/1Vn Differential Revision: https://reviews.llvm.org/D67021 llvm-svn: 370617	2019-09-01 18:38:15 +00:00
Shiva Chen	adfdcb9c26	[TargetLowering] Fix Bugzilla ID 43183 to avoid soften comparison broken with constant inputs Summary: This fixes Bugzilla ID 43183, which was triggered by the following commit: [RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall llvm-svn: 370604	2019-09-01 04:52:54 +00:00
Sanjay Patel	9e57b49392	[DAGCombiner] clean up code in visitShiftByConstant() This is not quite NFC because the SDLoc propagation is changed, but there are no regression test diffs from that. llvm-svn: 370587	2019-08-31 15:08:58 +00:00
Amaury Sechet	82825ab882	[DAGCombiner] Match (add X, X) as (shl X, 1) when detecting rotate. Summary: The combiner transforms (shl X, 1) into (add X, X). Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66882 llvm-svn: 370578	2019-08-31 11:40:02 +00:00
James Molloy	e62c509cd4	[DAGCombiner] Don't create illegal narrow stores Narrowing stores when the target doesn't support the narrow version forces the target to expand into a load-modify-store sequence, which is highly suboptimal. The information narrowing throws away (legality of the inverse transform) is hard to re-analyze. If the target doesn't support a store of the narrow type, don't narrow even in pre-legalize mode. No test as this is DAGCombiner and depends on target bits. llvm-svn: 370576	2019-08-31 10:46:16 +00:00
Bjorn Pettersson	e27c74abb6	[CodeGen] Refactor DAGTypeLegalizer::ExpandIntRes_MULFIX. NFC Restructured the code a little bit in preparation for adding UMULFIXSAT. I think it will be easier to understand the code if not interleaving the codegen for signed/unsigned/saturated cases that much. llvm-svn: 370569	2019-08-31 09:28:50 +00:00
James Molloy	790a779f06	[MachinePipeliner] Separate schedule emission, NFC This is the first stage in refactoring the pipeliner and making it more accessible for backends to override and control. This separates the logic and state required to emit a schedule from the logic that computes and validates a schedule. This will enable (a) new schedule emitters and (b) new modulo scheduling implementations to coexist. NFC. Differential Revision: https://reviews.llvm.org/D67006 llvm-svn: 370500	2019-08-30 18:49:50 +00:00
Simon Pilgrim	3be7081aa1	[DAGCombine] ReduceLoadWidth - remove duplicate SDLoc. NFCI. SDLoc(N0) and SDLoc(cast<LoadSDNode>(N0)) should be equivalent. llvm-svn: 370498	2019-08-30 18:19:02 +00:00
Simon Pilgrim	2d1e0899e9	[TargetLowering] SimplifyDemandedBits ADD/SUB/MUL - correctly inherit SDNodeFlags from the original node. Just disable NSW/NUW flags. This matches what we're already doing for the other situations for these nodes, it was just missed for the demanded constant case. Noticed by inspection - confirmed in offline discussion with @spatel. I've checked we have test coverage in the x86 extract-bits.ll and extract-lowbits.ll tests llvm-svn: 370497	2019-08-30 17:58:55 +00:00
Matt Arsenault	466ec2d552	GlobalISel: Fix missing pass dependency llvm-svn: 370496	2019-08-30 17:41:58 +00:00
Craig Topper	30ddd2ab6c	[ValueTypes] Add v16f16 and v32f16 to EVT::getEVTString and Tablegen's getEnumName Missed these when I added the enum entries llvm-svn: 370494	2019-08-30 17:34:29 +00:00
Simon Pilgrim	ab8cb1a3c5	[DAGCombine] visitVSELECT - remove equivalent getValueType() call. NFCI. llvm-svn: 370489	2019-08-30 17:21:20 +00:00
Simon Pilgrim	c2fed1dc8a	[DAGCombine] visitVSELECT - remove duplicate getOperand calls. NFCI. llvm-svn: 370478	2019-08-30 15:17:37 +00:00
Simon Pilgrim	3367669668	[DAGCombine] visitVSELECT - use getShiftAmountTy for shift amounts. llvm-svn: 370471	2019-08-30 13:30:37 +00:00
Simon Pilgrim	8e1989e79a	[DAGCombine] visitMULHS - use getScalarValueSizeInBits() to make safe for vector types. This is hidden behind a (scalar-only) isOneConstant(N1) check at the moment, but once we get around to adding vector support we need to ensure we're dealing with the scalar bitwidth, not the total. llvm-svn: 370468	2019-08-30 12:22:06 +00:00
Bjorn Pettersson	227145924a	[CodeGen] Introduce MachineBasicBlock::replacePhiUsesWith helper and use it. NFC Summary: Found a couple of places in the code where all the PHI nodes of a MBB is updated, replacing references to one MBB by reference to another MBB instead. This patch simply refactors the code to use a common helper (MachineBasicBlock::replacePhiUsesWith) for such PHI node updates. Reviewers: t.p.northover, arsenm, uabelho Subscribers: wdng, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66750 llvm-svn: 370463	2019-08-30 11:23:10 +00:00
Simon Pilgrim	7cbf823f93	[DAGCombine] visitMULHS/visitMULHU - isBuildVectorAllZeros doesn't mean node is all zeros Return a proper zero vector, just in case some elements are undef. Noticed by inspection after dealing with a similar issue in PR43159. llvm-svn: 370460	2019-08-30 10:42:14 +00:00
David Stenberg	b35d4699d0	[LiveDebugValues] Insert entry values after bundles Summary: Change LiveDebugValues so that it inserts entry values after the bundle which contains the clobbering instruction. Previously it would insert the debug value after the bundle head using insertAfter(), breaking the bundle. Reviewers: djtodoro, NikolaPrica, aprantl, vsk Reviewed By: vsk Subscribers: hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D66888 llvm-svn: 370448	2019-08-30 09:06:50 +00:00
Petar Avramovic	6412b56513	[MIPS GlobalISel] Lower fptoui Add lower for G_FPTOUI. Algorithm is similar to the SDAG version in TargetLowering::expandFP_TO_UINT. Lower G_FPTOUI for MIPS32. Differential Revision: https://reviews.llvm.org/D66929 llvm-svn: 370431	2019-08-30 05:44:02 +00:00
Dan Gohman	8cfeeaf9de	[CodeGen] Fix lowering for returning the result of an extractvalue When the number of return values exceeds the number of registers available, SelectionDAGBuilder::visitRet transforms a function's return to use a pointer to a buffer to hold return values. When the returned value is an operator such as extractvalue, the value may have a non-zero result number. Add that number to the indexing when obtaining the values to store. This fixes https://bugs.llvm.org/show_bug.cgi?id=43132. Differential Revision: https://reviews.llvm.org/D66978 llvm-svn: 370430	2019-08-30 04:33:22 +00:00
Jordan Rupprecht	f9f81289e6	Revert [MBP] Disable aggressive loop rotate in plain mode This reverts r369664 (git commit `51f48295cb`) It causes many benchmark regressions, internally and in llvm's benchmark suite. llvm-svn: 370398	2019-08-29 19:03:58 +00:00
Matt Arsenault	093ebf9275	GlobalISel: Don't compute known bits for non-integral GEP llvm-svn: 370392	2019-08-29 17:55:05 +00:00
Matt Arsenault	b2b9a23758	GlobalISel: Add maskedValueIsZero and signBitIsZero to known bits I dropped the DemandedElts since it seems to be missing from some of the new interfaces, but not others. llvm-svn: 370389	2019-08-29 17:24:36 +00:00
Matt Arsenault	caff0a88dd	GlobalISel: Add known bits to InstructionSelector AMDGPU uses this for some addressing mode selection patterns. The analysis run itself doesn't do anything so it seems easier to just always require this than adding a way to opt in. llvm-svn: 370388	2019-08-29 17:24:32 +00:00
Simon Pilgrim	ea67741899	[DAGCombine] Fix shadow variable warnings. NFCI. llvm-svn: 370365	2019-08-29 14:34:07 +00:00
Jeremy Morse	ca0e4b3689	[DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations The missing line added by this patch ensures that only spilt variable locations are candidates for being restored from the stack. Otherwise, register or constant-value information can be interpreted as a spill location, through a union. The added regression test replicates a scenario where this occurs: the stack load from [rsp] causes the register-location DBG_VALUE to be "restored" to rsi, when it should be left alone. See PR43058 for details. Un x-fail a test that was suffering from this from a previous patch. Differential Revision: https://reviews.llvm.org/D66895 llvm-svn: 370334	2019-08-29 11:20:54 +00:00
Simon Pilgrim	6c2fc64edc	Fix signed/unsigned comparison warning. NFCI. llvm-svn: 370333	2019-08-29 11:18:53 +00:00
Simon Pilgrim	27f43e6b1a	Fix shadow variable warning. NFCI. llvm-svn: 370332	2019-08-29 11:16:32 +00:00
Jeremy Morse	313d2ce999	[DebugInfo] LiveDebugValues should always revisit backedges if it skips them The "join" method in LiveDebugValues does not attempt to join unseen predecessor blocks if their out-locations aren't yet initialized, instead the block should be re-visited later to see if any locations have changed validity. However, because the set of blocks were all being "process"'d once before "join" saw them, that logic in "join" was actually ignoring legitimate out-locations on the first pass through. This meant that some invalidated locations were not removed from the head of loops, allowing illegal locations to persist. Fix this by removing the run of "process" before the main join/process loop in ExtendRanges. Now the unseen predecessors that "join" skips truly are uninitialized, and we come back to the block at a later time to re-run "join", see the @baz function added. This also fixes another fault where stack/register transfers in the entry block (or any other before-any-loop-block) had their transfers initially ignored, and were then never revisited. The MIR test added tests for this behaviour. XFail a test that exposes another bug; a fix for this is coming in D66895. Differential Revision: https://reviews.llvm.org/D66663 llvm-svn: 370328	2019-08-29 10:53:29 +00:00
Amaury Sechet	8365e42010	[DAGCombiner] (insert_vector_elt (vector_shuffle X, Y), (extract_vector_elt X, N), IdxC) -> (vector_shuffle X, Y) Summary: This is beneficial when the shuffle is only used once and end up being generated in a few places when some node is combined into a shuffle. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66718 llvm-svn: 370326	2019-08-29 10:35:51 +00:00
Simon Pilgrim	dfb2a19ac2	LegalizeSetCCCondCode - Reduce scope of NeedSwap to fix cppcheck warning. NFCI. No need for this to be defined outside the only switch case its used in. llvm-svn: 370320	2019-08-29 10:11:34 +00:00
Craig Topper	1aadf6f39f	[X86] Make inline assembly 'x' and 'v' constraints work for f128. Including a type legalizer fix to make bitcast operand promotion work correctly when getSoftenedFloat returns f128 instead of i128. Fixes PR43157 llvm-svn: 370293	2019-08-29 05:13:56 +00:00
Shiva Chen	b39876d8cd	[RISCV] Avoid generating AssertZext for LP64 ABI when lowering floating LibCall The patch fixed the issue that RV64 didn't clear the upper bits when return complex floating value with lp64 ABI. float _Complex complex_add(float _Complex a, float _Complex b) { return a + b; } RealResult = zero_extend(RealA + RealB) ImageResult = ImageA + ImageB Return (RealResult \| (ImageResult << 32)) The patch introduces shouldExtendTypeInLibCall target hook to suppress the AssertZext generation when lowering floating LibCall. Thanks to Eli's comments from the Bugzilla https://bugs.llvm.org/show_bug.cgi?id=42820 Differential Revision: https://reviews.llvm.org/D65497 llvm-svn: 370275	2019-08-28 23:40:37 +00:00
Kevin P. Neal	ddf13c00ed	[FPEnv] Add fptosi and fptoui constrained intrinsics. This implements constrained floating point intrinsics for FP to signed and unsigned integers. Quoting from D32319: The purpose of the constrained intrinsics is to force the optimizer to respect the restrictions that will be necessary to support things like the STDC FENV_ACCESS ON pragma without interfering with optimizations when these restrictions are not needed. Reviewed by: Andrew Kaylor, Craig Topper, Hal Finkel, Cameron McInally, Roman Lebedev, Kit Barton Approved by: Craig Topper Differential Revision: http://reviews.llvm.org/D63782 llvm-svn: 370228	2019-08-28 16:33:36 +00:00
Jessica Paquette	af0bd41e06	[AArch64][GlobalISel] Fall back when translating musttail calls These are currently translated as normal functions calls in AArch64. Until we have proper tail call lowering, we shouldn't translate these. Differential Revision: https://reviews.llvm.org/D66842 llvm-svn: 370225	2019-08-28 16:19:01 +00:00
Ryan Taylor	3b1459ed7c	[AMDGPU] Adjust number of SGPRs available in Calling Convention This reduces the number of SGPRs due to some concerns about running out of SGPRs if you make all the SGPRs that aren't reserved available for the calling convention. Change-Id: Idb4ca4dc72f5b6808cb524ff7270915a8de5b4c1 llvm-svn: 370215	2019-08-28 15:00:45 +00:00
Simon Pilgrim	14e07d7f4b	[DAGCombine] Fix cppcheck shadow variable warning. NFCI. We already have an outer Ops variable. llvm-svn: 370197	2019-08-28 12:48:41 +00:00
Amaury Sechet	4f4387dd12	[TargetLowering] Add buildLegalVectorShuffle facility to help build legal shuffles Summary: There are at least 2 ways to express the same shuffle. Various pieces of code explicitly check for both options, but other places do not when they would benefit from doing it. This patch refactors the codebase to use buildLegalVectorShuffle in order to make that behavior more consistent. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66804 llvm-svn: 370190	2019-08-28 12:00:06 +00:00
Simon Pilgrim	c5b38e2869	[DAGCombine] Remove LoadedSlice::Cost default 'ForCodeSize' constructor arguments. NFCI. These were always being passed in and it allowed me to add the explicit tag to stop a cppcheck warning about 1 argument constructors. llvm-svn: 370189	2019-08-28 11:50:36 +00:00
Amara Emerson	e20b91c265	[GlobalISel] Replace hard coded dynamic alloca handling with G_DYN_STACKALLOC. This change moves the actual stack pointer manipulation into the legalizer, available to targets via lower(). The codegen is slightly different because we're using explicit masks instead of G_PTRMASK, and using G_SUB rather than adding a negative amount via G_GEP. Differential Revision: https://reviews.llvm.org/D66678 llvm-svn: 370104	2019-08-27 19:54:27 +00:00
Matt Arsenault	2910184936	DAG: computeNumSignBits for MUL Copied directly from the IR version. Most of the testcases I've added for this are somewhat problematic because they really end up testing the yet to be implemented version for MUL_I24/MUL_U24. llvm-svn: 370099	2019-08-27 19:05:33 +00:00
Sanjay Patel	b516f1afdd	[DAGCombiner] cancel fnegs from multiplied operands of FMA (-X) * (-Y) + Z --> X * Y + Z This is a missing optimization that shows up as a potential regression in D66050, so we should solve it first. We appear to be partly missing this fold in IR as well. We do handle the simpler case already: (-X) * (-Y) --> X * Y And it might be beneficial to make the constraint less conservative (eg, if both operands are cheap, but not necessarily cheaper), but that causes infinite looping for the existing fmul transform. Differential Revision: https://reviews.llvm.org/D66755 llvm-svn: 370071	2019-08-27 15:17:46 +00:00
Jinsong Ji	7f536bcf22	Revert "[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks" This reverts commit `b3d258fc44`. @skatkov is reporting crash in D63972#1646303 Contacted @ZhangKang, and revert the commit on behalf of him. llvm-svn: 370069	2019-08-27 14:59:08 +00:00
Petar Avramovic	a393238422	[GlobalISel] Factor narrowScalar for G_ASHR and G_LSHR. NFC Main difference is in the way Hi for Long shift (HiL) is made. G_LSHR fills HiL with zeros, while G_ASHR fills HiL with sign bit value. Differential Revision: https://reviews.llvm.org/D66589 llvm-svn: 370064	2019-08-27 14:33:05 +00:00
Petar Avramovic	d568ed40e0	[GlobalISel] Fix narrowScalar for shifts to match algorithm from SDAG Fix typos. Use Hi and Lo prefixes for Or instead of LHS and RHS to match names of surrounding variables. Differential Revision: https://reviews.llvm.org/D66587 llvm-svn: 370062	2019-08-27 14:22:32 +00:00
Amaury Sechet	f28dee2cff	[DAGCombiner] Add node to the worklist in topological order in parallelizeChainedStores Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66659 llvm-svn: 370056	2019-08-27 13:27:57 +00:00
Amaury Sechet	a1e5ef3fd4	[DAGCombiner] Add node to the worklist in topological order after relegalization. Summary: As per title. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66702 llvm-svn: 370040	2019-08-27 11:06:09 +00:00
Craig Topper	243ede9970	[SelectionDAGBuilder] Hide existence of ConstantDataVector vector from visitGetElementPtr. ConstantDataVector is a specialized version of ConstantVector that stores data in a packed array of bits instead of as individual pointers to other Constants. But we really shouldn't expose that if we can avoid it. And we should handle regular ConstantVector equally well. This removes a dyn_cast to ConstantDataVector and just calls getSplatValue directly on a Constant* if the type is a vector. llvm-svn: 370018	2019-08-27 06:39:50 +00:00
Craig Topper	4a3f62f9fd	[SelectionDAGBuilder] Fix typo in comment. NFC llvm-svn: 370017	2019-08-27 06:38:51 +00:00
Richard Trieu	58e67b8aa3	Revert r369927 - [DAGCombiner] Remove a bunch of redundant AddToWorklist calls. This change causes instrumented builds of Clang to have a fatal error in the backend. https://reviews.llvm.org/D66537 has the details. llvm-svn: 370006	2019-08-27 02:04:11 +00:00
Shafik Yaghmour	90e00bd8f3	Debug Info: Support for DW_AT_export_symbols for anonymous structs This implements the DWARF 5 feature described in: http://dwarfstd.org/ShowIssue.php?issue=141212.1 To support recognizing anonymous structs: struct A { struct { // Anonymous struct int y; }; } a This patch adds support for the new flag in constructTypeDIE(...) and test to verify this change. Differential Revision: https://reviews.llvm.org/D66605 llvm-svn: 369969	2019-08-26 20:59:44 +00:00
Vedant Kumar	58a0714885	[DWARF] Rename getDwarf5OrGNUCallSite{Attr,Tag}, NFC llvm-svn: 369967	2019-08-26 20:53:34 +00:00
Vedant Kumar	533dd0214c	[DWARF] Pick the DWARF5 OP_entry_value opcode on Darwin Use the GNU extension for OP_entry_value consistently (i.e. whenever GNU extensions are used for TAG_call_site). llvm-svn: 369966	2019-08-26 20:53:12 +00:00
Craig Topper	846429de74	[DAGCombiner][X86] Teach SimplifyVBinOp to fold VBinOp (concat X, undef/constant), (concat Y, undef/constant) -> concat (VBinOp X, Y), VecC This improves the combine I included in D66504 to handle constants in the upper operands of the concat. If we can constant fold them away we can pull the concat after the bin op. This helps with chains of madd reductions on X86 from loop unrolling. The loop madd reduction pattern creates pmaddwd with half the width of the add that follows it using zeroes to fill the upper bits. If we have two of these added together we can pull the zeroes through the accumulating add and then shrink it. Differential Revision: https://reviews.llvm.org/D66680 llvm-svn: 369937	2019-08-26 17:59:11 +00:00
Amaury Sechet	b7075e40f3	[DAGCombiner] Remove a bunch of redundant AddToWorklist calls. Summary: This comes as a first step toward processing the DAG nodes in topological order. Doing so ensures that arguments of a node are combined before the node itself is combined, which exposes more opportunities for optimization and/or reduces the number of patterns a node has to match. DAGCombiner adding nodes to the worklist in various places causes the nodes to be in a different order from what is expected. In addition, this is redundant because these nodes end up being added to the worklist anyway due to the machinery at line 1621. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66537 llvm-svn: 369927	2019-08-26 17:02:12 +00:00
Craig Topper	b8b90ac1c5	[X86][DAGCombiner] Teach narrowShuffle to use concat_vectors instead of inserting into undef Summary: Concat_vectors is more canonical during early DAG combine. For example, its what's used by SelectionDAGBuilder when converting IR shuffles into SelectionDAG shuffles when element counts between inputs and mask don't match. We also have combines in DAGCombiner than can pull concat_vectors through a shuffle. See partitionShuffleOfConcats. So it seems like concat_vectors is a better operation to use here. I had to teach DAGCombiner's SimplifyVBinOp to also handle concat_vectors with undef. I haven't checked yet if we can remove the INSERT_SUBVECTOR version in there or not. I didn't want to mess with the other caller of getShuffleHalfVectors that's used during shuffle lowering where insert_subvector probably is what we want to produce so I've enabled this via a boolean passed to the function. Reviewers: spatel, RKSimon Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66504 llvm-svn: 369872	2019-08-25 17:59:49 +00:00
Xing Xue	ef039a3ccd	[PowerPC][AIX] Adds support for writing the .data section in assembly files Summary: Adds support for generating the .data section in assembly files for global variables with a non-zero initialization. The support for writing the .data section in XCOFF object files will be added in a follow-on patch. Any relocations are not included in this patch. Reviewers: hubert.reinterpretcast, sfertile, jasonliu, daltenty, Xiangling_L Reviewed by: hubert.reinterpretcast Subscribers: nemanjai, hiraditya, kbarton, MaskRay, jsji, wuzish, shchenz, DiggerLin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66154 llvm-svn: 369869	2019-08-25 15:17:25 +00:00
Nikita Popov	aa71c977ba	[SDAG] Fold umul_lohi with 0 or 1 multiplicand These can turn up during multiplication legalization. In principle these should also apply to smul_lohi, but I wasn't able to figure out how to produce those with the necessary operands. Differential Revision: https://reviews.llvm.org/D66380 llvm-svn: 369864	2019-08-25 08:04:22 +00:00
Nilanjana Basu	7da6f432d8	Removing block comments from CodeView records in assembly files & related code cleanup llvm-svn: 369860	2019-08-25 01:09:11 +00:00
Amara Emerson	3f6dd0c588	[GlobalISel] Introduce a G_DYN_STACKALLOC opcode to represent dynamic allocas. This just adds the opcode and verifier, it will be used to replace existing dynamic alloca handling in a subsequent patch. Differential Revision: https://reviews.llvm.org/D66677 llvm-svn: 369833	2019-08-24 02:25:56 +00:00
Guillaume Chatelet	b7be5b9095	[LLVM][NFC] remove unused fields Summary: Here is the commit introducing the fields https://github.com/llvm/llvm-project/commit/cf6749e4c091 It dates back from 2006 and was used by AArch64 backend. There is no more reference to these fields in the whole codebase so I think it's fine. Reviewers: courbet Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66683 llvm-svn: 369810	2019-08-23 20:49:06 +00:00
Volkan Keles	277631e3b8	[GlobalISel] Legalizer: Retry combining illegal artifacts as long as there are new artifacts Summary: Currently, Legalizer aborts if it’s unable to legalize artifacts. However, it’s possible to combine them after processing the rest of the instructions because the legalization is likely to generate more artifacts that allow ArtifactCombiner to combine them away. Instead, move illegal artifacts to another list called RetryList and wait until all of the instructions in InstList are legalized. After that, check if there are any new artifacts and try to combine them again if that’s the case. If not, abort. The idea is similar to D59339, but the approach is a bit different. This patch fixes the issue described above, but the legalizer still may be unable to handle some cases depending on when to legalize artifacts. So, in the long run, we probably need a different legalization strategy that handles this dependency in a better way. Reviewers: dsanders, aditya_nandakumar, qcolombet, arsenm, aemerson, paquette Reviewed By: dsanders Subscribers: jvesely, wdng, nhaehnle, rovka, javed.absar, hiraditya, Petar.Avramovic, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65894 llvm-svn: 369805	2019-08-23 20:30:35 +00:00
Benjamin Kramer	dc5f805d31	Do a sweep of symbol internalization. NFC. llvm-svn: 369803	2019-08-23 19:59:23 +00:00
Matt Arsenault	2fd1afe8ef	RegScavenger: Use Register llvm-svn: 369794	2019-08-23 18:25:34 +00:00
Craig Topper	e7211bb567	[SelectionDAG][X86] Enable iX SimplifyDemandedBits to vXi1 SimplifyDemandedVectorElts simplification. Add a hack to X86 to avoid a regression Patch showing the effect of enabling bool vector oversimplification. Non-VLX builds can simplify a kshift shuffle, but VLX builds simplify: insert_subvector v8i zeroinitializer, v2i --> insert_subvector v8i undef, v2i Preventing the removal of the AND to clear the upper bits of result Differential Revision: https://reviews.llvm.org/D53022 llvm-svn: 369780	2019-08-23 17:14:58 +00:00
Jeremy Morse	0ae5498146	[DebugInfo] Remove invalidated locations during LiveDebugValues LiveDebugValues gives variable locations to blocks, but it should also take away. There are various circumstances where a variable location is known until a loop backedge with a different location is detected. In those circumstances, where there's no agreement on the variable location, it should be undef / removed, otherwise we end up picking a location that's valid on some loop iterations but not others. However, LiveDebugValues doesn't currently do this, see the new testcase attached. Without this patch, the location of !3 is assumed to be %bar through the loop. Once it's added to the In-Locations list, it's never removed, even though the later dbg.value(0... of !3 makes the location un-knowable. This patch checks during block-location-joining to see whether any previously-present locations have been removed in a predecessor. If they have, the live-ins have changed, and the block needs reprocessing. Similarly, in transferTerminator, assign rather than |= the Out-Locations after processing a block, as we may have deleted some previously valid locations. This will mean that LiveDebugValues performs more propagation -- but that's necessary for it being correct. Differential Revision: https://reviews.llvm.org/D66599 llvm-svn: 369778	2019-08-23 16:33:42 +00:00
Simon Pilgrim	04906ef1f2	[DAGCombine] GetNegatedExpression - add FMA\FMAD support If the accumulator and either of the multiply operands are negatable then we can negate the entire expression. Differential Revision: https://reviews.llvm.org/D63141 llvm-svn: 369746	2019-08-23 10:49:46 +00:00
Peter Collingbourne	2452d7030b	IR. Change strip* family of functions to not look through aliases. I noticed another instance of the issue where references to aliases were being replaced with aliasees, this time in InstCombine. In the instance that I saw it turned out to be only a QoI issue (a symbol ended up being missing from the symbol table due to the last reference to the alias being removed, preventing HWASAN from symbolizing a global reference), but it could easily have manifested as incorrect behaviour. Since this is the third such issue encountered (previously: D65118, D65314) it seems to be time to address this common error/QoI issue once and for all and make the strip* family of functions not look through aliases. Includes a test for the specific issue that I saw, but no doubt there are other similar bugs fixed here. As with D65118 this has been tested to make sure that the optimization isn't load bearing. I built Clang, Chromium for Linux, Android and Windows as well as the test-suite and there were no size regressions. Differential Revision: https://reviews.llvm.org/D66606 llvm-svn: 369697	2019-08-22 19:56:14 +00:00
Matt Arsenault	fba82858f2	GlobalISel: Don't create G_UADDE with constant false carry in The x86 tests are now broken (in particular add-scalar.ll now hits the DAG fallback) due to not handling G_UADDO. The DAG x86 backend has a custom lowering for this, so that will need to be implemented. llvm-svn: 369673	2019-08-22 17:29:17 +00:00
Francis Visoiu Mistrih	5b5ee61b5f	[MachO][TLOF] Use hasLocalLinkage to determine if indirect symbol is local Local symbols in the indirect symbol table contain the value `INDIRECT_SYMBOL_LOCAL` and the corresponding __pointers entry must contain the address of the target. In r349060, I added support for local symbols in the indirect symbol table, which was checking if the symbol `isDefined` && `!isExternal` to determine if the symbol is local or not. It turns out that `isDefined` will return false if the user of the symbol comes before its definition, and we'll again generate .long 0 which will be the symbol at the address 0x0. Instead of doing that, use GlobalValue::hasLocalLinkage() to check if the symbol is local. Differential Revision: https://reviews.llvm.org/D66563 llvm-svn: 369671	2019-08-22 16:59:00 +00:00
Guozhi Wei	51f48295cb	[MBP] Disable aggressive loop rotate in plain mode Patch https://reviews.llvm.org/D43256 introduced more aggressive loop layout optimization which depends on profile information. If profile information is not available, the statically estimated profile information(generated by BranchProbabilityInfo.cpp) is used. If user program doesn't behave as BranchProbabilityInfo.cpp expected, the layout may be worse. To be conservative this patch restores the original layout algorithm in plain mode. But user can still try the aggressive layout optimization with -force-precise-rotation-cost=true. Differential Revision: https://reviews.llvm.org/D65673 llvm-svn: 369664	2019-08-22 16:21:32 +00:00
Amaury Sechet	95cf66de7c	[DAGCombiner] Remove explicit call to AddToWorklist in sqrt and reciprocal computations Summary: These nodes end up being processed regardless due to DAGCombiner ensuring arguments are processed. This changes the order in which nodes are processed, which fixes an issue on PowerPC. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri, mcberg2017, stefanp, hfinkel Subscribers: nemanjai, MaskRay, jsji, steven.zhang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66548 llvm-svn: 369662	2019-08-22 15:35:45 +00:00
Jinsong Ji	545e993b8b	[SlotIndexes] Add print-slotindexes to disable printing slotindexes Summary: When we print the IR with --print-after/before-*, SlotIndexes will be printed whenever available (We haven't freed it). This introduces some noise when we try to compare the IR among different optimizations. eg: -print-before=machine-cp will print SlotIndexes for 1st machine-cp pass, but NOT for 2nd machine-cp; -print-after=machine-cp will NOT print SlotIndexes for both machine-cp passes. So SlotIndexes in 1st pass introduce noise when diffing these IRs. This patch introduces an option to hide indexes. Reviewers: stoklund, thegameg, qcolombet Reviewed By: thegameg Subscribers: hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66500 llvm-svn: 369650	2019-08-22 13:44:47 +00:00
Shiva Chen	72a41e7b0d	[TargetLowering] Remove optional arguments passing to makeLibCall The patch introduces MakeLibCallOptions struct as suggested by @efriedma on D65497. The struct contains argument flags which will be passed to the makeLibCall function. The patch should not have any functionality changes. Differential Revision: https://reviews.llvm.org/D65795 llvm-svn: 369622	2019-08-22 04:59:43 +00:00
Fangrui Song	246750c2a9	[COFF] Fix section name for constants larger than 64 bits on Windows APIntToHexString returns wrong value ("0000000000000000ffffffffffffffff") for integer larger than 64 bits, and thus TargetLoweringObjectFileCOFF::getSectionForConstant returns same section name for all numbers larger than 64 bits. This patch tries to fix it. Differential Revision: https://reviews.llvm.org/D66458 Patch by Senran Zhang llvm-svn: 369610	2019-08-22 01:48:34 +00:00
Craig Topper	3f59bfd5be	[MVT] Add v16f16 and v32f16 vectors. I might look at improving PR43065 which will require being able to mark a 256 and 512 bit vector of f16 as Legal. Differential Revision: https://reviews.llvm.org/D66515 llvm-svn: 369565	2019-08-21 19:14:48 +00:00
Amaury Sechet	c0f190a048	[DAGCombiner] Remove mostly redundant calls to AddToWorklist Summary: These calls change the order in which some nodes are processed and so have an effect on codegen. The change in fixup-bw-copy.ll is due to (and (load anyext)) gets transformed into (load zext) while previously the and was removed by SimplifyDemandedBits, so the (load anyext) remained. Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66543 llvm-svn: 369561	2019-08-21 18:51:08 +00:00
Matt Arsenault	954a012b4c	GlobalISel: Implement moreElementsVector for G_UNMERGE_VALUES sources This is necessary for handling <3 x s16> on AMDGPU, assuming this should be handled as 2 separate legalization actions. The alternative would be for fewerElementsVector to handle 3->2. llvm-svn: 369547	2019-08-21 16:59:10 +00:00
Nilanjana Basu	ac3851c434	Improving CodeView debug info type record's inline comments llvm-svn: 369533	2019-08-21 15:19:58 +00:00
Alexander Timofeev	78347c979e	[AMDGPU] Prevent VGPR copies from moving across the EXEC mask definitions Differential Revision: https://reviews.llvm.org/D63731 Reviewers: qcolombet, rampitec llvm-svn: 369532	2019-08-21 15:15:04 +00:00
Guillaume Chatelet	1c18a9cb9e	[LLVM][Alignment] Introduce Alignment In MachineFrameInfo Summary: This patch is part of a series to introduce an Alignment type. See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html See this patch for the introduction of the type: https://reviews.llvm.org/D64790 Reviewers: jfb Subscribers: hiraditya, dexonsmith, llvm-commits, courbet Tags: #llvm Differential Revision: https://reviews.llvm.org/D65800 llvm-svn: 369531	2019-08-21 14:29:30 +00:00
Amaury Sechet	045f33aec9	[DAGCombiner] Various nits. NFC llvm-svn: 369520	2019-08-21 12:01:37 +00:00
Petar Avramovic	5b4c5c2c54	[MIPS GlobalISel] NarrowScalar G_TRUNC Add NarrowScalar for G_TRUNC when NarrowTy is half the size of source. NarrowScalar G_TRUNC to s32 for MIPS32. Differential Revision: https://reviews.llvm.org/D66202 llvm-svn: 369509	2019-08-21 09:26:39 +00:00
Jeremy Morse	67443c3c6e	[DebugInfo] Avoid dropping location info across block boundaries LiveDebugValues propagates variable locations between blocks by creating new DBG_VALUE insts in the successors, then interpreting them when it passes back through the block at a later time. However, this flushes out any extra information about the location that LiveDebugValues holds: for example, connections between variable locations such as discussed in D65368. And as reported in PR42772 this causes us to lose track of the fact that a spill-location is actually a spill, not a register location. This patch fixes that by deferring the creation of propagated DBG_VALUEs until after propagation has completed: instead location propagation occurs only by sharing location ID numbers between blocks. Differential Revision: https://reviews.llvm.org/D66412 llvm-svn: 369508	2019-08-21 09:22:31 +00:00
Amara Emerson	56606a4db3	[AArch64][GlobalISel] Add support for narrowScalar of G_ZEXT We do this by merging the source with the high bits set to 0. Differential Revision: https://reviews.llvm.org/D66181 llvm-svn: 369480	2019-08-21 00:12:37 +00:00
Craig Topper	ba375263e8	[DAGCombiner][X86] Teach visitCONCAT_VECTORS to combine (concat_vectors (concat_vectors X, Y), undef)) -> (concat_vectors X, Y, undef, undef) I also had to add a new combine to X86's combineExtractSubvector to prevent a regression. This helps our vXi1 code see the full concat operation and allows it to optimize undef to a zero if there is already a zero in the concat. This helped us use a movzx instead of an AND in some of the tests. In those tests, one concat comes from SelectionDAGBuilder and the second comes from type legalization of v4i1->i4 bitcasts which uses an additional concat. Though these changes weren't my original motivation. I'm looking at making X86ISelLowering's narrowShuffle emit a concat_vectors instead of an insert_subvector since concat_vectors is more canonical during early DAG combine. This patch helps prevent a regression from my experiments with that. Differential Revision: https://reviews.llvm.org/D66456 llvm-svn: 369459	2019-08-20 22:12:50 +00:00
Sean Fertile	1e46d4cec5	Adds support for writing the .bss section for XCOFF object files. Adds Wrapper classes for MCSymbol and MCSection into the XCOFF target object writer. Also adds a class to represent the top-level sections, which we materialize in the ObjectWriter. executePostLayoutBinding will map all csects into the appropriate container depending on its storage mapping class, and map all symbols into their containing csect. Once all symbols have been processed we - Assign addresses and symbol table indices. - Calculate section sizes. - Build the section header table. - Assign the sections raw-pointer value for non-virtual sections. Since the .bss section is virtual, writing the header table is enough to add support. Writing of a section's raw data, or of any relocations is not included in this patch. Testing is done by dumping the section header table, but it needs to be extended to include dumping the symbol table once readobj support for dumping auxiliary entries lands. Differential Revision: https://reviews.llvm.org/D65159 llvm-svn: 369454	2019-08-20 22:03:18 +00:00
Aditya Nandakumar	08bd080872	[GlobalISel] Handle multiple registers in dbg.value intrinsic https://reviews.llvm.org/D66077 The value passed into dbg.value may relate to multiple registers, each of which need a DBG_VALUE. This fix calls MIRBuilder.buildDirectDbgValue for each register. Without this, IR passed in from flang-compiler/flang may fail an assertion in getOrCreateVReg. Patch by : peterwaller-arm. llvm-svn: 369403	2019-08-20 16:28:37 +00:00
Thomas Raoux	be699bf389	[CodeGen] Add a pass to do block predication on SSA machine IR. For targets requiring aggressive scheduling and/or software pipelining we need to apply predication before preRA scheduling. This adds a pass re-using the early if-cvt infrastructure but generating predicated instructions instead of speculatively executing instructions. It allows doing if conversion on blocks containing instructions with side-effects. The pass re-uses the target hook from postRA if-conversion to let the target decide on the heuristic to apply. Differential Revision: https://reviews.llvm.org/D66190 llvm-svn: 369395	2019-08-20 15:54:59 +00:00
Karl-Johan Karlsson	40da6be2bd	[AsmPrinter] Remove const qualifier from EmitBasicBlockStart. Overriders may want to modify state in it. AMDGPU wants to, but has to make its members mutable in order to do so. Besides, EmitBasicBlockEnd is not const, so why should Start be? Patch by Bevin Hansson. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D66341 llvm-svn: 369325	2019-08-20 05:13:57 +00:00
Vyacheslav Zakharin	f7229ac7d8	Fixed placement of llvm.global_dtors on Windows. Differential revision: https://reviews.llvm.org/D66373 llvm-svn: 369299	2019-08-19 21:07:03 +00:00
Craig Topper	93c2787193	[CGP] Remove ModifiedDT from the makeBitReverse loop I don't think anything in this loop modifies the control flow and we don't restart any iteration after setting the flag. This code was added in http://reviews.llvm.org/D16893 but looking at the test case added there the code that caused the dominator tree to change was merging blocks with their predecessor not the bitreverse optimization. Differential Revision: https://reviews.llvm.org/D66366 llvm-svn: 369283	2019-08-19 18:02:24 +00:00
Roman Lebedev	edfaee0811	[TargetLowering] x s% C == 0 fold: vector divisor with INT_MIN handling Summary: The general fold is only valid for positive divisors. Which effectively means, it is invalid for `INT_MIN` divisors, and we currently bailout if we see them. But that is too strict, we can just fix-up the results. For that, let's do a second computation 'in parallel': ``` Name: srem -> and Pre: isPowerOf2(C) %o = srem i8 %X, C %r = icmp eq %o, 0 => %n = and i8 %X, C-1 %r = icmp eq %n, 0 ``` https://rise4fun.com/Alive/Sup And then just blend results: if the divisor was `INT_MIN`, pick the value we got via bit-test, else pick the value from general fold. There's interesting observation - `ISD::ROTR` is set to `LegalizeAction::Expand` before AVX512, so we should not treat `INT_MIN` divisor as even; and as it can be seen while `@test_srem_odd_even_one` improves on all run-lines, `@test_srem_odd_even_INT_MIN` only improves for AVX512. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D66300 llvm-svn: 369268	2019-08-19 15:01:42 +00:00
Jinsong Ji	0776da5236	[PeepholeOptimizer] Don't assume bitcast def always has input Summary: If we have a MI marked with bitcast bits, but without input operands, PeepholeOptimizer might crash with assert. eg: If we apply the changes in PPCInstrVSX.td as in this patch: [(set v4i32:$XT, (bitconvert (v16i8 immAllOnesV)))]>; We will get assert in PeepholeOptimizer. ``` llvm-lit llvm-project/llvm/test/CodeGen/PowerPC/build-vector-tests.ll -v llvm-project/llvm/include/llvm/CodeGen/MachineInstr.h:417: const llvm::MachineOperand &llvm::MachineInstr::getOperand(unsigned int) const: Assertion `i < getNumOperands() && "getOperand() out of range!"' failed. ``` The fix is to abort if we found out of bound access. Reviewers: qcolombet, MatzeB, hfinkel, arsenm Reviewed By: qcolombet Subscribers: wdng, arsenm, steven.zhang, wuzish, nemanjai, hiraditya, kbarton, MaskRay, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D65542 llvm-svn: 369261	2019-08-19 14:19:04 +00:00
David Stenberg	88df53e6ea	[DebugInfo] Allow bundled calls in the MIR's call site info Summary: Extend the MIR parser and writer so that the call site information can refer to calls that are bundled. Reviewers: aprantl, asowda, NikolaPrica, djtodoro, ivanbaev, vsk Reviewed By: aprantl Subscribers: arsenm, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D66145 llvm-svn: 369256	2019-08-19 12:41:22 +00:00
Jeremy Morse	176bbd5cde	[DebugInfo] Make postra sinking of DBG_VALUEs subregister-safe Currently the machine instruction sinker identifies DBG_VALUE insts that also need to sink by comparing register numbers. Unfortunately this isn't safe, because (after register allocation) a DBG_VALUE may read a register that aliases what's being sunk. To fix this, identify the DBG_VALUEs that need to sink by recording & examining their register units. Register units gives us the following guarantee: "Two registers overlap if and only if they have a common register unit" [MCRegisterInfo.h] Thus we can always identify aliasing DBG_VALUEs if the set of register units read by the DBG_VALUE, and the register units of the instruction being sunk, intersect. (MachineSink already uses classes like "LiveRegUnits" for determining sinking validity anyway). The test added checks for super and subregister DBG_VALUE reads of a sunk copy being sunk as well. Differential Revision: https://reviews.llvm.org/D58191 llvm-svn: 369247	2019-08-19 09:53:07 +00:00
Craig Topper	74168ded03	[TargetLowering] Teach computeRegisterProperties to only widen v3i16/v3f16 vectors to the next power of 2 type if that's legal. These were recently made simple types. This restores their behavior back to something like their EVT legalization. We might be able to fix the code in type legalization where the assert was failing, but I didn't investigate too much as I had already looked at the computeRegisterProperties code during the review for v3i16/v3f16. Most of the test changes restore the X86 codegen back to what it looked like before the recent change. The test case in vec_setcc.ll is a reduced version of the reproducer from the fuzzer. Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=16490 llvm-svn: 369205	2019-08-18 06:28:06 +00:00
Craig Topper	f43106e341	[SelectionDAG] Add a node creation debug message to getMachineNode. llvm-svn: 369204	2019-08-18 06:28:00 +00:00
Kang Zhang	b3d258fc44	[CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks Summary: Fix a bug of predecessors. In `block-placement` pass, it will create some patterns for unconditional we can do the simple early return. But the `early-ret` pass is before `block-placement`, we don't want to run it again. This patch is to do the simple early return to optimize the blocks at the last of `block-placement`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D63972 llvm-svn: 369191	2019-08-17 14:37:05 +00:00
Sanjay Patel	acceedb15f	[CodeGenPrepare] Fix use-after-free If OptimizeExtractBits() encountered a shift instruction with no users at all, it would erase the instruction, but still return false. This previously didn’t matter because its caller would always return after processing the instruction, but https://reviews.llvm.org/D63233 changed the function’s caller to fall through if it returned false, which would then cause a use-after-free detectable by ASAN. This change makes OptimizeExtractBits return true if it removes a shift instruction with no users, terminating processing of the instruction. Patch by: @brentdax (Brent Royal-Gordon) Differential Revision: https://reviews.llvm.org/D66330 llvm-svn: 369168	2019-08-16 23:10:34 +00:00

28401 Commits