llvm-project

Commit Graph

Author	SHA1	Message	Date
Cullen Rhodes	75dfbdc2da	[AArch64][SVE2] Asm: support Floating Point Widening Multiply-Add Summary: Patch adds support for the indexed and unpredicated vectors forms of the FMLALB, FMLALT, FMLSLB and FMLSLT instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62386 llvm-svn: 361935	2019-05-29 08:53:06 +00:00
Cullen Rhodes	4f58ad4e72	[AArch64][SVE2] Asm: support SVE2 Floating Point Pairwise Group Summary: Patch adds support for the following instructions: SVE2 floating-point pairwise operations: * FADDP, FMAXNMP, FMINNMP, FMAXP, FMINP The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62383 llvm-svn: 361933	2019-05-29 08:40:33 +00:00
Richard Trieu	c77aff7e17	Inline a variable into debug section to fix unused variable warning. llvm-svn: 361927	2019-05-29 04:09:32 +00:00
Richard Trieu	e8698ead9d	Inline value into debug statement to avoid unused variable warning. llvm-svn: 361924	2019-05-29 03:43:01 +00:00
Peter Collingbourne	31fda09b2d	Add IR support, ELF section and user documentation for partitioning feature. The partitioning feature was proposed here: http://lists.llvm.org/pipermail/llvm-dev/2019-February/130583.html This is mostly just documentation. The feature itself will be contributed in subsequent patches. Differential Revision: https://reviews.llvm.org/D60242 llvm-svn: 361923	2019-05-29 03:29:01 +00:00
Peter Collingbourne	10c548cdfa	IR: Give the TypeAllocator a more generic name and start using it for section names as well. NFCI. This prepares us to start using it for partition names. llvm-svn: 361922	2019-05-29 03:28:51 +00:00
Jinsong Ji	f6cb3bcb4c	Support resource tracking with InstrSchedModel The current design uses a DFA to do resource tracking in SMS; the DFA only supports InstrItins and also has scaling limitations. This patch extends SMS to allow a Subtarget to use ProcResource in InstrSchedModel instead. Differential Revision: https://reviews.llvm.org/D62163 llvm-svn: 361919	2019-05-29 03:02:59 +00:00
Pengfei Wang	72e3f9662b	Revert "[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to" This reverts commit c1b3716614bc0a107e6f41a7d3d503baefad8a5b. llvm-svn: 361918	2019-05-29 02:49:59 +00:00
Pengfei Wang	818c652643	[X86] Use 'llvm_unreachable' instead of nullptr in unreachable code to avoid static check fail RegClassOrBank is an object of RegClassOrRegBank, which is defined as using llvm::RegClassOrRegBank = typedef PointerUnion<const TargetRegisterClass *, const RegisterBank *> so control flow can not get here. Use "llvm_unreachable" here to avoid "null pointer" confusion. Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D62006 Signed-off-by: pengfei <pengfei.wang@intel.com> llvm-svn: 361912	2019-05-29 02:20:37 +00:00
Fangrui Song	656afe370d	[X86] Fix x86-64 call foo@tlsdesc(%rax) and support R_386_TLSGOTDESC R_386_TLS_DESC_CALL D18885 emitted 5 bytes for call foo@tlsdesc(%rax). It should use the 2-byte form instead and let R_X86_64_TLSDESC_CALL apply to the beginning of the call instruction. The 2-byte form was deliberately chosen to make ->LE and ->IE relaxation work: 0: 48 8d 05 00 00 00 00 lea 0x0(%rip),%rax # 7 <.text+0x7> 3: R_X86_64_GOTPC32_TLSDESC a-0x4 7: ff 10 callq *(%rax) 7: R_X86_64_TLSDESC_CALL a => 0: 48 c7 c0 fc ff ff ff mov $0xfffffffffffffffc,%rax 7: 66 90 xchg %ax,%ax Also change the symbol type to STT_TLS when VK_TLSCALL or VK_TLSDESC is seen. Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D62512 llvm-svn: 361910	2019-05-29 02:02:59 +00:00
Thomas Lively	26d711be6e	[WebAssembly] Add signatures for RINT builtins Reviewers: azakai, dschuff Subscribers: sbc100, jgravelle-google, hiraditya, aheejin, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62564 llvm-svn: 361904	2019-05-29 01:06:00 +00:00
Quentin Colombet	a6f57ad2c9	[RegUsageInfoCollector] Don't mark as saved registers that don't have subregister lanes To determine the list of clobbered registers, the RegUsageInfoCollector pass uses the list of callee saved registers provided by the target and then augments it with the list of registers which have all their subregisters saved. It then computes the difference between all the registers and the saved registers to come up with what is clobbered (plus it checks that the register is defined within that function). The patch fixes a bug where, when a register does not have any subregister lanes, checking whether any of its subregisters are not saved would find none, and we would conclude that the register is saved as well. That's obviously wrong. The code was actually kind of checking for something like that with the CoveredBySubRegs bit. What this bit says is that a register is completely covered by its subregisters. We required that this bit was set, to check that a register was saved by its subregister lanes, since without this bit, we would potentially fail to check some part of the register. However, this bit is used de facto on registers that don't have any subregisters (e.g., on ARM) and the code was not prepared for that. This patch fixes this by checking that a register has subregisters before declaring it saved when none of its lanes are modified. llvm-svn: 361901	2019-05-28 23:43:12 +00:00
Lang Hames	eb5ee3004f	[ORC] Track JIT symbol states more explicitly. Prior to this patch, JITDylibs inferred symbol states (whether a symbol was newly added, materializing, resolved, or ready to run) via a combination of (1) bits in the JITSymbolFlags member, and (2) the state of some internal JITDylib data structures. This patch explicitly tracks symbol states by adding a new SymbolState member to the symbol table entries, and removing the 'Lazy' and 'Materializing' bits from JITSymbolFlags. This is a first step towards adding additional states representing initialization phases (e.g. eh-frame registration, registration with the language runtime, and static initialization). llvm-svn: 361899	2019-05-28 23:35:44 +00:00
Jessica Paquette	b73ea75b38	[AArch64][GlobalISel] Select FCMPSri/FCMPDri when comparing against 0.0 Add support for selecting FCMPSri and FCMPDri when comparing against 0.0, and factor out opcode selection for G_FCMP into its own function. Add a test to show that we don't do this with other immediates. Differential Revision: https://reviews.llvm.org/D62539 llvm-svn: 361888	2019-05-28 22:52:49 +00:00
Heejin Ahn	5514658591	[WebAssembly] Support for atomic fences Summary: This adds support for translation of LLVM IR fence instruction. We convert a singlethread fence to a pseudo compiler barrier which becomes 0 instructions in final binary, and a thread fence to an idempotent atomicrmw instruction to a memory address. Reviewers: dschuff, jfb, sunfish, tlively Subscribers: sbc100, jgravelle-google, llvm-commits Differential Revision: https://reviews.llvm.org/D50277 llvm-svn: 361884	2019-05-28 22:09:12 +00:00
Rong Xu	e88173abc0	[PGO] Handle cases of failing to split critical edges Fix PR41279 where critical edges to EHPad are not split. The fix is to not instrument those critical edges. We used to be able to know the size of counters right after MST is computed. With this, we have to pre-collect the instrument BBs to know the size, and then instrument them. Differential Revision: https://reviews.llvm.org/D62439 llvm-svn: 361882	2019-05-28 21:45:56 +00:00
Nikita Popov	5b32f60ec3	Revert "[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst" This reverts commit `53f2f32865`. As reported on D62126, this causes assertion failures if the switch has incorrect branch_weights metadata, which may happen as a result of other transforms not handling it correctly yet. llvm-svn: 361881	2019-05-28 21:28:24 +00:00
Konstantin Zhuravlyov	fe23ed2c68	AMDGPU: Temporarily drop s_mul_hi_i/u32 patterns They introduce performance regressions in several applications. This has already been submitted downstream. llvm-svn: 361879	2019-05-28 21:18:34 +00:00
Adhemerval Zanella	34d8daae53	[AArch64] Handle ISD::LRINT and ISD::LLRINT This patch optimizes ISD::LRINT and ISD::LLRINT to frintx plus fcvtzs. It currently only handles the scalar version. Reviewed By: SjoerdMeijer, mstorsjo Differential Revision: https://reviews.llvm.org/D62018 llvm-svn: 361877	2019-05-28 21:04:29 +00:00
Adhemerval Zanella	6d7bf5e8df	[CodeGen] Add lrint/llrint builtins This patch adds ISD::LRINT and ISD::LLRINT along with new intrinsics. The changes are straightforward, as for other floating-point rounding functions, with just some adjustments required to handle the return value being an integer. The idea is to optimize lrint/llrint generation for AArch64 in a subsequent patch. The current semantics just route it to the libm symbol. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D62017 llvm-svn: 361875	2019-05-28 20:47:44 +00:00
Roman Lebedev	dfc34f0211	[DAGCombine] (x - C) - y -> (x - y) - C fold. Try 2 Summary: Again only vectors affected. Frustrating. Let me take a look into that.. https://rise4fun.com/Alive/AAq This is a recommit, originally committed in rL361856, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62294 llvm-svn: 361874	2019-05-28 20:40:10 +00:00
Roman Lebedev	d485c6bc9f	[DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold. Try 2 Summary: This prevents regressions in next patch, and somewhat recovers from the regression to AMDGPU test in D62223. It is indeed not great that we leave vector decrement, don't transform it into vector add all-ones.. https://rise4fun.com/Alive/ZRl This is a recommit, originally committed in rL361855, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel, arsenm Reviewed By: RKSimon, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62263 llvm-svn: 361873	2019-05-28 20:40:03 +00:00
Roman Lebedev	96c9986199	[DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 2 Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh This is a recommit, originally committed in rL361853, but reverted to investigate test-suite compile-time hangs. Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 361872	2019-05-28 20:39:55 +00:00
Roman Lebedev	2feb7e56e2	[DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 2 Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 \| PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) This is a recommit, originally committed in rL361852, but reverted to investigate test-suite compile-time hangs. Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 361871	2019-05-28 20:39:39 +00:00
Michael Liao	5fc1dfa784	[AMDGPU] Correct the handling of inlineasm output registers. Summary: - There's a regression due to the cross-block RC assignment. Use the proper way to derive the output register RC in inline asm. Reviewers: rampitec, alex-t Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, dstuttard, tpr, t-tye, eraman, hiraditya, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D62537 llvm-svn: 361868	2019-05-28 19:37:09 +00:00
Roman Lebedev	272d70c366	Revert DAGCombine "hoist binop with const" folds Appear to introduce test-suite compile-time hang. http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/22825 This reverts r361852,r361853,r361854,r361855,r361856 llvm-svn: 361865	2019-05-28 19:04:21 +00:00
Nikita Popov	c51cdacab9	[InstCombine] Clean up saturating math overflow optimizations; NFC Reduce duplication and make it easier to handle signed always-overflows conditions in the future. llvm-svn: 361863	2019-05-28 18:59:21 +00:00
Nikita Popov	332c100562	[ValueTracking][ConstantRange] Distinguish low/high always overflow In order to fold an always overflowing signed saturating add/sub, we need to know in which direction the always overflow occurs. This patch splits up AlwaysOverflows into AlwaysOverflowsLow and AlwaysOverflowsHigh to pass through this information (but it is not used yet). Differential Revision: https://reviews.llvm.org/D62463 llvm-svn: 361858	2019-05-28 18:08:31 +00:00
Nikita Popov	2fb0a820df	[IR] Add SaturatingInst and BinaryOpIntrinsic classes Based on the suggestion in D62447, this adds a SaturatingInst class that represents the saturating add/sub family of intrinsics. It exposes the same interface as WithOverflowInst, for this reason I have also added a common base class BinaryOpIntrinsic that holds the actual implementation code and will be useful in some places handling both overflowing and saturating math. Differential Revision: https://reviews.llvm.org/D62466 llvm-svn: 361857	2019-05-28 18:08:06 +00:00
Roman Lebedev	7669665432	[DAGCombine] (x - C) - y -> (x - y) - C fold Summary: Again only vectors affected. Frustrating. Let me take a look into that.. https://rise4fun.com/Alive/AAq Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62294 llvm-svn: 361856	2019-05-28 17:54:21 +00:00
Roman Lebedev	8c9b3e4e4a	[DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold Summary: This prevents regressions in next patch, and somewhat recovers from the regression to AMDGPU test in D62223. It is indeed not great that we leave vector decrement, don't transform it into vector add all-ones.. https://rise4fun.com/Alive/ZRl Reviewers: RKSimon, craig.topper, spatel, arsenm Reviewed By: RKSimon, arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62263 llvm-svn: 361855	2019-05-28 17:54:13 +00:00
Roman Lebedev	6a24c9b9ab	[DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold Summary: Only vector tests are being affected here, since subtraction by scalar constant is rewritten as addition by negated constant. No surprising test changes. https://rise4fun.com/Alive/pbT Reviewers: RKSimon, craig.topper, spatel Reviewed By: RKSimon Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62257 llvm-svn: 361854	2019-05-28 17:54:04 +00:00
Roman Lebedev	1499f65ac1	[DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold Summary: Direct sibling of D62223 patch. While i don't have a direct motivational pattern for this, it would seem to make sense to handle both patterns (or none), for symmetry? The aarch64 changes look neutral; sparc and systemz look like improvement (one less instruction each); x86 changes - 32bit case improves, 64bit case shows that LEA no longer gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea` https://rise4fun.com/Alive/ffh Reviewers: RKSimon, craig.topper, spatel, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62252 llvm-svn: 361853	2019-05-28 17:53:54 +00:00
Roman Lebedev	19f51ec04a	[DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold Summary: The main motivation is shown by all these `neg` instructions that are now created. In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test. AArch64 test changes all look good (`neg` created), or neutral. X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created). I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill is now hoisted into preheader (which should still be good?), 2 4-byte reloads become 1 8-byte reload, and are elsewhere, but i'm not sure how that affects that loop. I'm unable to interpret AMDGPU change, looks neutral-ish? This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 \| PR41952 ]]. https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later) Reviewers: craig.topper, RKSimon, spatel, arsenm Reviewed By: RKSimon Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62223 llvm-svn: 361852	2019-05-28 17:53:43 +00:00
Sanjay Patel	f7980e727f	Revert "[x86] split 256-bit store of concatenated vectors" This reverts commit `d5a8637072`. Most likely suspect for this bot failure: http://lab.llvm.org:8011/builders/clang-cmake-x86_64-avx2-linux/builds/9684 llvm-svn: 361850	2019-05-28 17:37:58 +00:00
Matt Arsenault	24e80b8d04	AMDGPU: Don't enable all lanes with non-CSR VGPR spills If the only VGPRs used for SGPR spilling were not CSRs, this was enabling all lanes and immediately restoring exec. This is the usual situation in leaf functions. llvm-svn: 361848	2019-05-28 16:46:02 +00:00
Michael Liao	7166843f1e	[AMDGPU] Fix the mis-handling of `vreg_1` copied from scalar register. Summary: - Don't treat the use of a scalar register as `vreg_1` as a VGPR usage. Otherwise, that promotes the scalar register into a vector one, which breaks the assumption that the scalar register holds the lane mask. - The issue is triggered in a complicated case, where the uses of that (lane mask) scalar register are legalized before its definition, e.g., due to a mismatch between block placement and topological order, or due to a loop. In that case, the legalization of PHI introduces the use of that scalar register as `vreg_1`. Reviewers: rampitec, nhaehnle, arsenm, alex-t Subscribers: kzhuravl, jvesely, wdng, dstuttard, tpr, t-tye, hiraditya, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D62492 llvm-svn: 361847	2019-05-28 16:29:39 +00:00
Simon Tatham	760df47b77	[ARM] Replace fp-only-sp and d16 with fp64 and d32. Those two subtarget features were awkward because their semantics are reversed: each one indicates the _lack_ of support for something in the architecture, rather than the presence. As a consequence, you don't get the behavior you want if you combine two sets of feature bits. Each SubtargetFeature for an FP architecture version now comes in four versions, one for each combination of those options. So you can still say (for example) '+vfp2' in a feature string and it will mean what it's always meant, but there's a new string '+vfp2d16sp' meaning the version without those extra options. A lot of this change is just mechanically replacing positive checks for the old features with negative checks for the new ones. But one more interesting change is that I've rearranged getFPUFeatures() so that the main FPU feature is appended to the output list before rather than after the features derived from the Restriction field, so that -fp64 and -d32 can override defaults added by the main feature. Reviewers: dmgreen, samparker, SjoerdMeijer Subscribers: srhines, javed.absar, eraman, kristof.beyls, hiraditya, zzheng, Petar.Avramovic, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60691 llvm-svn: 361845	2019-05-28 16:13:20 +00:00
Fangrui Song	448a79d123	[AArch64] Delete unused VariantKind in AArch64MCExpr llvm-svn: 361844	2019-05-28 16:11:56 +00:00
David Greene	561fcc0d63	[X86-64] Fix 256-bit SET0 lowering for non-VLX targets If we don't have VLX then 256-bit SET0 should be lowered to VPXOR with ZMM registers. This restores functionality accidentally removed by r309926. Differential Revision: https://reviews.llvm.org/D62415 llvm-svn: 361843	2019-05-28 15:37:01 +00:00
Nico Weber	a2ca6e7803	llvm-undname: Support demangling char8_t Ports clang's mangling support added in r354633 to llvm-undname. llvm-svn: 361839	2019-05-28 15:30:04 +00:00
Nico Weber	88ab281b4d	llvm-undname: Add support for local static thread guards llvm-svn: 361835	2019-05-28 14:54:49 +00:00
Jason Liu	9212206d25	[XCOFF] Implement parsing symbol table for xcoffobjfile and output as yaml format Summary: This patch implements parsing of the symbol table for xcoffobjfile and outputs it in yaml format. Parsing auxiliary entries of a symbol will be in a separate patch. The XCOFF object file (aix_xcoff.o) used in the test comes from -bash-4.2$ cat test.c extern int i; extern int TestforXcoff; int main() { i++; TestforXcoff--; } Patch by DiggerLin Reviewers: sfertile, hubert.reinterpretcast, MaskRay, daltenty Differential Revision: https://reviews.llvm.org/D61532 llvm-svn: 361832	2019-05-28 14:37:59 +00:00
Sanjay Patel	d5a8637072	[x86] split 256-bit store of concatenated vectors This shows up as a side issue to the main problem for the AVX target example from PR37428: https://bugs.llvm.org/show_bug.cgi?id=37428 - https://godbolt.org/z/7tpRa3 But as we can see in the pile of existing test diffs, it's actually a widespread problem that affects any AVX or later target. Apart from a couple of oddballs, I think these are all improvements for the reasons stated in the code comment: we do not want to enable YMM unnecessarily (avoid vzeroupper and frequency throttling) and some cores split 256-bit stores anyway. We could say that MergeConsecutiveStores() is going overboard on some of these examples, but that won't solve the problem completely. But that is the reason I'm proposing this as a lowering rather than a combine: we will infinite loop fighting the merge code if we try this earlier. Differential Revision: https://reviews.llvm.org/D62498 llvm-svn: 361822	2019-05-28 13:54:17 +00:00
Simon Pilgrim	9cd9624fb6	[DAG] LegalizeVectorTypes - reduce scope of local variables. NFCI. Move the element index/count variables into the block where they are actually used - appeases cppcheck and helps avoid shadow variable warnings. llvm-svn: 361821	2019-05-28 13:46:26 +00:00
David Stenberg	5d0e6b6755	Stop undef fragments from closing non-overlapping fragments Summary: When DwarfDebug::buildLocationList() encountered an undef debug value, it would truncate all open values, regardless if they were overlapping or not. This patch fixes so that it only does that for overlapping fragments. This change unearthed a bug that I had introduced in D57511, which I have fixed in this patch. The code in DebugHandlerBase that changes labels for parameter debug values could break DwarfDebug's assumption that the labels for the entries in the debug value history are monotonically increasing. Before this patch, that bug could result in location list entries whose ending address was lower than the beginning address, and with the changes for undef debug values that this patch introduces it could trigger an assertion, due to attempting to emit location list entries with empty ranges. A reproducer for the bug is added in param-reg-const-mix.mir. Reviewers: aprantl, jmorse, probinson Reviewed By: aprantl Subscribers: javed.absar, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D62379 llvm-svn: 361820	2019-05-28 13:23:25 +00:00
Matt Arsenault	d3ed418ad3	MIR: Fix printer crashing on dead CSR frame indexes llvm-svn: 361819	2019-05-28 13:08:31 +00:00
Sanjay Patel	6bf4ca9d2e	[x86] fix 256-bit vector store splitting to honor 'volatile' Forking this out of the discussion in D62498 (and assuming that will be committed later, so adding the helper function here). The LangRef says: "the backend should never split or merge target-legal volatile load/store instructions." Differential Revision: https://reviews.llvm.org/D62506 llvm-svn: 361815	2019-05-28 12:58:07 +00:00
Benjamin Kramer	57e267a2e9	[X86] Custom lower CONCAT_VECTORS of v2i1 The generic legalizer cannot handle this. Add an assert instead of silently miscompiling vectors with elements smaller than 8 bits. llvm-svn: 361814	2019-05-28 12:52:57 +00:00
Graham Hunter	19e91253c0	[NFC] Test commit, delete trailing whitespace llvm-svn: 361813	2019-05-28 12:36:39 +00:00
Hans Wennborg	d936e40575	Re-commit r357452 (take 2): "SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" This was reverted in r360086 as it was suspected of causing mysterious test failures internally. However, it was never concluded that this patch was the root cause. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 361811	2019-05-28 12:19:38 +00:00
Yevgeny Rouban	53f2f32865	[CorrelatedValuePropagation] Fix prof branch_weights metadata handling for SwitchInst This patch fixes the CorrelatedValuePropagation pass to keep prof branch_weights metadata of SwitchInst consistent. It makes use of SwitchInstProfUpdateWrapper. New tests are added. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D62126 llvm-svn: 361808	2019-05-28 11:33:50 +00:00
Simon Pilgrim	4b48aa0e30	[X86] X86CmovConverterPass::collectCmovCandidates - fix uninitialized variable warnings. NFCI. llvm-svn: 361804	2019-05-28 10:53:23 +00:00
Cullen Rhodes	f57bd6bd23	[AArch64][SVE2] Asm: support SVE2 Floating Point Convert Group Summary: Patch adds support for the following instructions: SVE2 floating-point convert precision: * FCVTXNT, FCVTNT, FCVTLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62382 llvm-svn: 361801	2019-05-28 09:36:52 +00:00
Cullen Rhodes	8e91dd7934	[AArch64][SVE2] Asm: support SVE2 Crypto Extensions Group Summary: Patch adds support for the following instructions: SVE2 crypto constructive binary operations: * SM4EKEY, RAX1 SVE2 crypto destructive binary operations: * AESE, AESD, SM4E SVE2 crypto unary operations: * AESMC, AESIMC AESE, AESD, AESMC and AESIMC are enabled with +sve2-aes. SM4E and SM4EKEY are enabled with +sve2-sm4. RAX1 is enabled with +sve2-sha3. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62307 llvm-svn: 361797	2019-05-28 09:13:17 +00:00
Cullen Rhodes	c4ed601bd9	[AArch64][SVE2] Asm: support SVE2 Histogram Computation Groups Summary: Patch adds support for the following instructions: SVE2 histogram generation (segment): * HISTSEG SVE2 histogram generation (vector): * HISTCNT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62306 llvm-svn: 361796	2019-05-28 08:51:59 +00:00
Cullen Rhodes	7d9cac5bba	[AArch64][SVE2] Asm: support SVE2 Misc Group Summary: Patch adds support for the following instructions: SVE2 bitwise exclusive-or interleaved: * EORBT, EORTB SVE2 bitwise permute: * BEXT, BDEP, BGRP SVE2 bitwise shift left long: * SSHLLB, SSHLLT, USHLLB, USHLLT SVE2 integer add/subtract interleaved long: * SADDLBT, SSUBLBT, SSUBLTB BDEP, BEXT and BGRP are enabled with SVE2 feature +bitperm, all other instructions in this group are enabled with +sve2. Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62304 llvm-svn: 361795	2019-05-28 08:42:22 +00:00
Craig Topper	ab53c5e5ab	[InlineCost] Fix a couple comments. NFC Replace "unary operator" with "unary instruction" in visitUnaryInstruction since we now have a UnaryOperator class which might need its own visit function. Fix a copy/paste in visitCastInst that appears to have been copied from visitPtrToInt. llvm-svn: 361794	2019-05-28 07:25:27 +00:00
Craig Topper	50d502826b	[CostModel] Add really basic support for being able to query the cost of the FNeg instruction. Summary: This reuses the getArithmeticInstrCost, but passes dummy values of the second operand flags. The X86 costs are wrong and can be improved in a follow up. I just wanted to stop it from reporting an unknown cost first. Reviewers: RKSimon, spatel, andrew.w.kaylor, cameron.mcinally Reviewed By: spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62444 llvm-svn: 361788	2019-05-28 04:09:18 +00:00
Nico Weber	f83c39e53f	llvm-undname: Remove unreachable statement llvm-svn: 361786	2019-05-28 01:20:36 +00:00
Nico Weber	82dc06c340	llvm-undname: Extract demangleMD5Name() method; no behavior change llvm-svn: 361783	2019-05-27 23:10:42 +00:00
Lang Hames	23343c5d90	[RuntimeDyld][ARM] Fix an incorrect assertion condition. Fixes https://llvm.org/PR42036 llvm-svn: 361782	2019-05-27 21:34:31 +00:00
Matt Arsenault	ca84c4be4b	RegAllocFast: Set MayLiveAcrossBlocks when allocating uses Setting mayLiveOut based only on use instructions after allocating the def block did not work if the use block was allocated before the def block, since the virtual register uses were already removed. Fixes bug 41973. llvm-svn: 361781	2019-05-27 20:37:31 +00:00
Sanjay Patel	2f99d009c1	[SelectionDAG] fold concat of extract subvectors This is derived from the related fold for build vectors. We also have a version of this in DAGCombiner. The benefit of having this fold at node creation time is (1) efficiency and (2) preventing infinite looping from creating patterns that should not exist in the first place. Currently, the inf-loop could happen with MergeConsecutiveStores() because it naively creates concat of extracts when forming a wider vector store. That could fight with target-specific store narrowing. llvm-svn: 361780	2019-05-27 20:26:21 +00:00
Sanjay Patel	e13ae3e4d8	[SelectionDAG] fix formatting and redundant comments; NFC There's a possible missing fold here for extracting from the same source vector. It's similar to a check that we use to squash a build vector with all extracted elements from the same source vector. llvm-svn: 361778	2019-05-27 18:26:43 +00:00
Michael Liao	9c70c574b4	[SelectionDAG] Enhance the simplification of `copyto` from `implicit-def`. Summary: - The current implementation simplifies the case where the source of `copyto` is `implicit-def`ed. However, it only works when that `implicit-def` is single-used since it detects that from `implicit-def` and cannot determine which destination vreg should be used if there are multiple uses. - This patch changes that detection to happen when `copyto` is being emitted. If that `copyto`'s source is defined from `implicit-def`, it simplifies it. Hence, it works even when that `implicit-def` has multiple uses. - Apart from simplifying the internal IR, it won't improve the quality of code generation. However, it helps to detect `implicit-def` in a straightforward manner in some passes, such as `si-i1-copies`. A test case is added. Reviewers: sunfish, nhaehnle Subscribers: jvesely, hiraditya, asbirlea, llvm-commits, yaxunl Tags: #llvm Differential Revision: https://reviews.llvm.org/D62342 llvm-svn: 361777	2019-05-27 18:26:29 +00:00
Alexander Timofeev	f4040a0dd8	[AMDGPU] Fix for the address sanitizer failure. Fixing typo llvm-svn: 361776	2019-05-27 18:17:21 +00:00
Dmitri Gribenko	5379f1a6c5	Include what you use in AArch64AsmBackend.cpp AArch64AsmBackend.cpp was not using any APIs from AArch64.h, and was only including it for transitive dependencies. Doing so is problematic from include-what-you-use perspective, but it is also a layering issue (it creates a dependency cycle between the primary AArch64 target library and the MCTargetDesc library). llvm-svn: 361774	2019-05-27 17:03:57 +00:00
Simon Pilgrim	ebb053b139	[SelectionDAG] GetDemandedBits - add demanded elements wrapper implementation The DemandedElts variable is pretty much inert at the moment - the original GetDemandedBits implementation calls it with an 'all ones' DemandedElts value so the function is active and behaves exactly as it used to. llvm-svn: 361773	2019-05-27 16:39:25 +00:00
Simon Pilgrim	d99f9373d3	[LLParser] Fix uninitialized flag variable warnings. NFCI. Fixes a large number of warnings in the scan-build report on llvm builds. llvm-svn: 361772	2019-05-27 16:33:15 +00:00
Alexander Timofeev	4a7c4069ae	[AMDGPU] Fix for the address sanitizer failure caused by the following commit: 1a8b2ea611cf4ca7cb09562e0238cfefa27c05b5 Divergence driven ISel. Assign register class for cross block values according to the divergence. llvm-svn: 361770	2019-05-27 15:03:29 +00:00
Dmitry Preobrazhensky	b79af7930c	[AMDGPU][MC] Enabled constant expressions as operands of s_waitcnt See bug 40820: https://bugs.llvm.org/show_bug.cgi?id=40820 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61017 llvm-svn: 361763	2019-05-27 14:08:43 +00:00
Xing Xue	3860aad6e7	[MustExecute] Improve MustExecute to correctly handle loop nest Summary: for.outer: br for.inner for.inner: LI <loop invariant load instruction> for.inner.latch: br for.inner, for.outer.latch for.outer.latch: br for.outer, for.outer.exit LI is a loop invariant load instruction that post-dominates for.outer, so LI should be able to move out of the loop nest. However, there is a bug in allLoopPathsLeadToBlock(). Current algorithm of allLoopPathsLeadToBlock() 1. get all the transitive predecessors of the basic block LI belongs to (for.inner) ==> for.outer, for.inner.latch 2. if any successors of any of the predecessors are not for.inner or for.inner's predecessors, then return false 3. return true Although for.inner.latch is for.inner's predecessor, for.inner dominates for.inner.latch, which means if for.inner.latch is ever executed, for.inner should be as well. It should not return false for cases like this. Author: Whitney (committed by xingxue) Reviewers: kbarton, jdoerfert, Meinersbur, hfinkel, fhahn Reviewed By: jdoerfert Subscribers: hiraditya, jsji, llvm-commits, etiotto, bmahjour Tags: #LLVM Differential Revision: https://reviews.llvm.org/D62418 llvm-svn: 361762	2019-05-27 13:57:28 +00:00
Nikola Prica	441ad62531	Test commit (NFC) Add blank line. llvm-svn: 361761	2019-05-27 13:51:30 +00:00
Diana Picus	68b20c589c	[ARM GlobalISel] Cleanup CallLowering a bit We never actually use the Offsets produced by ComputeValueVTs, so remove them until we need them. llvm-svn: 361755	2019-05-27 10:30:33 +00:00
David L. Jones	0ff41b8a5a	Revert r361356: "[MIR] Add simple PRE pass to MachineCSE" This is problematic on buildbots, as discussed here: https://reviews.llvm.org/rL361356 It seems like the plan already was to revert, but that hasn't happened yet. llvm-svn: 361746	2019-05-27 06:00:00 +00:00
Nico Weber	cfe08bc7d6	llvm-undname: Make demangling of MD5 names more robust Demangler::parse() for MD5 names would: 1. Put all remaining text into the MD5 name sight unseen 2. Not modify MangledName This meant that if the demangler recursively called parse() (e.g. in demangleLocallyScopedNamePiece()), every recursive call that started on an MD5 name would add all remaining bytes to the output buffer but only advance the input by a byte. For valid inputs, MD5 types are never (well, see comments for 2 exceptions) nested, but for invalid input this could cause memory use quadratic in the input size. llvm-svn: 361744	2019-05-27 00:48:59 +00:00
Florian Hahn	11b2f4fe50	[LoopInterchange] Fix handling of LCSSA nodes defined in headers and latches. The code to preserve LCSSA PHIs currently only properly supports reduction PHIs and PHIs for values defined outside the latches. This patch improves the LCSSA PHI handling to cover PHIs for values defined in the latches. Fixes PR41725. Reviewers: efriedma, mcrosier, davide, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61576 llvm-svn: 361743	2019-05-26 23:38:25 +00:00
Yonghong Song	e698958ad8	[BPF] generate R_BPF_NONE relocation for BTF DataSec variables The variables in BTF DataSec type encode in-section offset. R_BPF_NONE should be generated instead of R_BPF_64_32. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D62460 llvm-svn: 361742	2019-05-26 21:26:06 +00:00
Alexander Timofeev	ba447bae74	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 This commit was reverted because of the build failure. The reason was a malformed patch. Build failure fixed. llvm-svn: 361741	2019-05-26 20:33:26 +00:00
Andrea Di Biagio	c2493ce4a4	[MCA][Scheduler] Improved critical memory dependency computation. This fixes a problem where back-pressure increases caused by register dependencies were not correctly notified if execution was also delayed by memory dependencies. llvm-svn: 361740	2019-05-26 19:50:31 +00:00
Simon Pilgrim	06e02856ab	[SelectionDAG] GetDemandedBits - cleanup to more closely match SimplifyDemandedBits. NFCI. Prep work before adding demanded elts support. llvm-svn: 361739	2019-05-26 18:58:14 +00:00
Simon Pilgrim	2916b9e28c	[SelectionDAG] MaskedValueIsZero - add demanded elements implementation Will be used in an upcoming patch but I've updated the original implementation to call this to ensure test coverage. llvm-svn: 361738	2019-05-26 18:43:44 +00:00
Andrea Di Biagio	a549dd2560	[MCA] Refactor the logic that computes the critical memory dependency info. NFCI CriticalRegDep has been renamed CriticalDependency, and it is now used by class Instruction to store information about the critical register dependency and the critical memory dependency. No functional change intended. llvm-svn: 361737	2019-05-26 18:41:35 +00:00
Shawn Landden	343578759e	[SimplifyCFG] back out all SwitchInst commits They caused the sanitizer builds to fail. My suspicion is the change to countLeadingZeros(). llvm-svn: 361736	2019-05-26 18:15:51 +00:00
Simon Pilgrim	a044410f37	[X86][SSE] Add shuffle combining support for ISD::ANY_EXTEND_VECTOR_INREG Reuses what we already have in place for ISD::ZERO_EXTEND_VECTOR_INREG just with a different sentinel llvm-svn: 361734	2019-05-26 16:00:35 +00:00
Simon Pilgrim	e434368a67	Revert rL361731 : [LLParser] Fix uninitialized variable warnings. NFCI. These 3 variables cause quite a few warnings in the scan-build report on llvm. ........ Revert accidental commit. llvm-svn: 361732	2019-05-26 15:08:45 +00:00
Simon Pilgrim	aabe7781a5	[LLParser] Fix uninitialized variable warnings. NFCI. These 3 variables cause quite a few warnings in the scan-build report on llvm. llvm-svn: 361731	2019-05-26 15:05:12 +00:00
Sanjay Patel	9317963920	[InstCombine] prevent crashing with invalid extractelement index This was found/reduced from a fuzzer report: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14956 llvm-svn: 361729	2019-05-26 14:03:50 +00:00
Shawn Landden	fa91ab85d9	[SimplifyCFG] ReduceSwitchRange: Improve on the case where the SubThreshold doesn't trigger llvm-svn: 361728	2019-05-26 13:55:52 +00:00
Shawn Landden	30111c786f	[SimplifyCFG] Run ReduceSwitchRange unconditionally, generalize Rather than gating on "isSwitchDense" (resulting in necessarily sparse lookup tables even when they were generated), always run this quite cheap transform. This transform is useful not just for generating tables. LowerSwitch also wants this: read LowerSwitch.cpp:257. Be careful to not generate worse code, by introducing a SubThreshold heuristic. Instead of just sorting by signed, generalize the finding of the best base. And now that it is run unconditionally, do not replicate its functionality in SwitchToLookupTable (which could use a Sub when having a hole is smaller, hence the SubThreshold heuristic located in a single place). This simplifies SwitchToLookupTable, and fixes some ugly corner cases due to the use of signed numbers, such as a table containing i16 32768 and 32769, of which 32769 would be interpreted as -32768, and now the code thinks the table is size 65536. (We still use unconditional subtraction when building a single-register mask, but I think this whole block should go when the more general sparse map is added, which doesn't leave empty holes in the table.) And the reason test4 and test5 did not trigger was documented wrong: it was because they were not considered sufficiently "dense". Also, fix generation of invalid LLVM-IR: shl by bit-width. llvm-svn: 361727	2019-05-26 13:55:14 +00:00
Shawn Landden	444eaaf1cc	[SimplifyCFG] NFC, remove GCD that was only used for powers of two and replace with an equivalent countTrailingZeros. GCD is much more expensive than this, with repeated division. This depends on D60823 llvm-svn: 361726	2019-05-26 13:54:04 +00:00
Shawn Landden	b7cc093db2	[Support] make countLeadingZeros() and countTrailingZeros() return unsigned This matches countLeadingOnes() and countTrailingOnes(), and APInt's countLeadingZeros() and countTrailingZeros(). (as well as __builtin_clzll()) llvm-svn: 361724	2019-05-26 13:49:58 +00:00
Nikita Popov	d0f13e618f	[ValueTracking] Base computeOverflowForUnsignedMul() on ConstantRange code; NFCI The implementation in ValueTracking and ConstantRange are equally powerful, reuse the one in ConstantRange, which will make this easier to extend. llvm-svn: 361723	2019-05-26 13:22:01 +00:00
Nikita Popov	39f2bebf41	[InstCombine] Refactor OptimizeOverflowCheck; NFCI Extract method to compute overflow based on binop and signedness, and then make the result handling code generic. This extends the always-overflow handling to signed muls, but has currently no effect, as we don't compute always overflow for them (thus NFC). llvm-svn: 361721	2019-05-26 11:43:37 +00:00
Nikita Popov	352f598795	[InstCombine] Remove OverflowCheckFlavor; NFC Instead pass binary op and signedness. The extra enum only makes things more complicated in this case. llvm-svn: 361720	2019-05-26 11:43:31 +00:00
David Green	0dbafe191e	[ARM] Select fp16 fma This adds a pattern for fma, similar to the float and double patterns. Differential Revision: https://reviews.llvm.org/D62330 llvm-svn: 361719	2019-05-26 11:34:30 +00:00
David Green	21542cd6f4	[ARM] Select a number of fp16 rounding functions This adds patterns for fp16 round and ceil etc. Same as the float and double patterns. Differential Revision: https://reviews.llvm.org/D62326 llvm-svn: 361718	2019-05-26 11:13:00 +00:00
David Green	c9f4b7d201	[ARM] Promote various fp16 math intrinsics Promote a number of fp16 math intrinsics to float, so that the relevant float math routines can be used. Copysign is expanded so as to be handled in-place. Differential Revision: https://reviews.llvm.org/D62325 llvm-svn: 361717	2019-05-26 10:59:21 +00:00
Simon Pilgrim	58a8541dcc	[X86][AVX] combineBitcastvxi1 - peek through bitops to determine size of original vector We were only testing for direct SETCC results - this allows us to peek through AND/OR/XOR combinations of the comparison results as well. There's a missing SEXT(PACKSS) fold that I need to investigate for v8i1 cases before I can enable it there as well. llvm-svn: 361716	2019-05-26 10:54:23 +00:00
David Green	2881325b17	[ARM] Select fp16 fabs This adds a pattern for the fabs intrinsic, the same as float and double. Differential Revision: https://reviews.llvm.org/D62324 llvm-svn: 361715	2019-05-26 10:51:58 +00:00
David Green	aeade651f3	[ARM] Select fp16 fsqrt This adds a pattern for the sqrt intrinsic, the same as float and double. Differential Revision: https://reviews.llvm.org/D62322 llvm-svn: 361714	2019-05-26 10:42:24 +00:00
David Green	caf8a11b65	[ARM] Promote fp16 frem Promote fp16 frem operations on ARM to floats so they call fmodf. Differential Revision: https://reviews.llvm.org/D62321 llvm-svn: 361713	2019-05-26 10:30:22 +00:00
David Bolvansky	0290a77aa8	[SimplifyCFG] Added condition assumption for unreachable blocks Summary: PR41688 Reviewers: spatel, efriedma, craig.topper, hfinkel, reames Reviewed By: hfinkel Subscribers: javed.absar, dmgreen, fhahn, hfinkel, reames, nikic, lebedev.ri, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61409 llvm-svn: 361707	2019-05-25 22:34:27 +00:00
Simon Pilgrim	40fa52b174	[X86] lowerBuildVectorToBitOp - support build_vector(shift()) -> shift(build_vector(),C) Commonly occurs in sign-extension cases llvm-svn: 361706	2019-05-25 18:02:17 +00:00
Robert Widmann	b0fd12b689	[LLVM-C] Add Accessor for Mach-O Universal Binary Slices Summary: Allow for retrieving an object file corresponding to an architecture-specific slice in a Mach-O universal binary file. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60378 llvm-svn: 361705	2019-05-25 16:47:27 +00:00
Nikita Popov	d87eceda0e	[X86] Combine fminnum/fmaxnum with non-nan operand to fmin/fmax If we have a known non-nan operand, place it in the second operand of fmin/fmax that is returned if either operand is nan. Differential Revision: https://reviews.llvm.org/D62448 llvm-svn: 361704	2019-05-25 16:44:29 +00:00
Nikita Popov	6bb5041e94	[LVI][CVP] Add support for saturating add/sub Adds support for the uadd.sat family of intrinsics in LVI, based on ConstantRange methods from D60946. Differential Revision: https://reviews.llvm.org/D62447 llvm-svn: 361703	2019-05-25 16:44:14 +00:00
Nikita Popov	8b1fa07639	[CVP] Remove unnecessary checks for empty GNWR; NFC The guaranteed no-wrap region is never empty, it always contains at least zero, so these optimizations don't ever apply. To make this more obviously true, replace the conservative return in makeGNWR with an assertion. llvm-svn: 361698	2019-05-25 14:11:55 +00:00
Sanjay Patel	91131b6500	[SelectionDAG] soften assertion when legalizing narrow vector FP ops The test based on PR42010: https://bugs.llvm.org/show_bug.cgi?id=42010 ...may show an inaccuracy for PPC's target defs, but we should not be so aggressive with an assert here. There's no telling what out-of-tree targets look like. llvm-svn: 361696	2019-05-25 13:48:07 +00:00
Nikita Popov	024b18aca7	[LVI][CVP] Calculate with.overflow result range In LVI, calculate the range of extractvalue(op.with.overflow(%x, %y), 0) as the range of op(%x, %y). This is mainly useful in conjunction with D60650: If the result of the operation is extracted in a branch guarded against overflow, then the value of %x will be appropriately constrained and the result range of the operation will be calculated taking that into account. Differential Revision: https://reviews.llvm.org/D60656 llvm-svn: 361693	2019-05-25 09:53:45 +00:00
Nikita Popov	17367b0d89	[LVI] Extract helper for binary range calculations; NFC llvm-svn: 361692	2019-05-25 09:53:37 +00:00
Craig Topper	46e5052b8e	[X86FixupLEAs] Turn optIncDec into a generic two address LEA optimizer. Support LEA64_32r properly. INC/DEC is really a special case of a more generic issue. We should also turn leas into add reg/reg or add reg/imm regardless of the slow lea flags. This also supports LEA64_32 which has 64 bit input registers and 32 bit output registers. So we need to convert the 64 bit inputs to their 32 bit equivalents to check if they are equal to base reg. One thing to note, the original code preserved the kill flags by adding operands to the new instruction instead of using addReg. But I think tied operands aren't supposed to have the kill flag set. I dropped the kill flags, but I could probably try to preserve it in the add reg/reg case if we think it's important. Not sure which operand it's supposed to go on for the LEA64_32r instruction due to the super reg implicit uses. Though I'm also not sure those are needed since they were probably just created by an INSERT_SUBREG from a 32-bit input. Differential Revision: https://reviews.llvm.org/D61472 llvm-svn: 361691	2019-05-25 06:17:47 +00:00
Craig Topper	4b08fcdeb1	[X86] Add zero idioms to the haswell, broadwell, and skylake schedule models. Add 256-bit fp xor to sandybridge zero idioms This copies the Sandy Bridge zero idiom support to later CPUs. Adding the AVX2 and AVX512F/VL instructions as appropriate. Differential Revision: https://reviews.llvm.org/D62360 llvm-svn: 361690	2019-05-25 04:47:49 +00:00
Peter Collingbourne	3b93737446	Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence." Broke sanitizer bots: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio llvm-svn: 361688	2019-05-25 01:52:38 +00:00
David Blaikie	a17564c2f1	llvm-dwarfdump: Don't error on mixed units using/not using str_offsets This led to errors when dumping binaries with v4 and v5 units linked together (but could've also errored on v5 units that did/didn't use str_offsets). Also improves error handling and messages around invalid str_offsets contributions. llvm-svn: 361683	2019-05-25 00:07:22 +00:00
Jessica Paquette	97d668d70f	[GlobalISel][AArch64] Make FP constraint checks consider possible use/def banks In a few places in getInstrMapping, we check if use/def instructions for the instruction we're mapping have floating point constraints. We can improve this check and reduce the number of copies in GISel-compiled code if we make a couple observations: - For a def instruction, it only matters if the def instruction must always output a value stored on a FPR - For a use instruction, it only matters if the use instruction must always only take in values stored in FPRs This adds two new functions: - onlyUsesFP - onlyDefinesFP Then we can use those when we're checking the uses/defs instead. Without this patch, the load, unmerge, store, and select in the added test would have unnecessary copies. Differential Revision: https://reviews.llvm.org/D62426 llvm-svn: 361679	2019-05-24 23:08:45 +00:00
Jessica Paquette	bede937b16	[GlobalISel][AArch64] NFC: Factor out HasFPConstraints into a proper function Factor it out into a function, and replace places where we had the same check with the new function. Differential Revision: https://reviews.llvm.org/D62421 llvm-svn: 361677	2019-05-24 22:12:21 +00:00
Jonas Devlieghere	0da8160df3	[dwarfdump] Add flag to limit the number of parents DIEs This adds `-parent-recurse-depth` which limits the number of parent DIEs being dumped. Differential revision: https://reviews.llvm.org/D62359 llvm-svn: 361671	2019-05-24 21:11:28 +00:00
Jason Liu	8e1d921bb3	Implement call lowering without parameters on AIX Summary: This patch implements call lowering for calls without parameters on AIX as initial support. Reviewers: sfertile, hubert.reinterpretcast, aheejin, efriedma Differential Revision: https://reviews.llvm.org/D61948 llvm-svn: 361669	2019-05-24 20:54:35 +00:00
Jessica Paquette	56503865ed	[GlobalISel][AArch64] Improve register bank mappings for G_SELECT The fcsel and csel instructions differ in only the register banks they work on. So, they're entirely interchangeable otherwise. With this in mind, this does two things: - Teach AArch64RegisterBankInfo to consider the inputs to G_SELECT as well as the outputs. - Teach it to choose the best register bank mapping based off the constraints of the inputs and outputs. The "best" in this case means the one that requires the smallest number of copies to properly emit a fcsel/csel. For example, if the inputs are all already going to be on FPRs, we should emit a fcsel, even if the output is a GPR. This costs one copy to produce the result, but saves us from copying the inputs into GPRs. Also update the regbank-select.mir to check that we end up with the right select instruction. Differential Revision: https://reviews.llvm.org/D62267 llvm-svn: 361665	2019-05-24 19:35:25 +00:00
Nick Desaulniers	33bc64202b	[AArch64] check for INLINEASM_BR along w/ INLINEASM Summary: It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Reviewers: t.p.northover, peter.smith Reviewed By: peter.smith Subscribers: craig.topper, javed.absar, kristof.beyls, hiraditya, llvm-commits, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62402 llvm-svn: 361661	2019-05-24 19:00:13 +00:00
Nick Desaulniers	9f7bd71cf5	[ARM] additionally check for ARM::INLINEASM_BR w/ ARM::INLINEASM Summary: We were observing failures for arm32 allyesconfigs of the Linux kernel with the asm goto Clang patch, where ldr's were being generated to offsets too far away to encode in imm12. It looks like since INLINEASM_BR was created off of INLINEASM, a few checks for INLINEASM needed to be updated to check for either case. pr/41999 Link: https://github.com/ClangBuiltLinux/linux/issues/490 Reviewers: peter.smith, kristof.beyls, ostannard, rengolin, t.p.northover Reviewed By: peter.smith Subscribers: jyu2, javed.absar, hiraditya, llvm-commits, nathanchance, craig.topper, kees, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62400 llvm-svn: 361659	2019-05-24 18:58:21 +00:00
Matt Arsenault	3d59e388ca	AMDGPU: Activate all lanes when spilling CSR VGPR for SGPR spills If some lanes weren't active on entry to the function, this could clobber their VGPR values. llvm-svn: 361655	2019-05-24 18:18:51 +00:00
Matt Arsenault	0ff901fba0	AMDGPU: Boost inline threshold with addrspacecasted alloca arguments This was skipping GetUnderlyingObject for nonprivate addresses, but an alloca could also be found through an addrspacecast if it's flat. llvm-svn: 361649	2019-05-24 16:52:35 +00:00
Alexander Timofeev	dffedea014	[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence. Details: To make instruction selection really divergence driven it is necessary to assign the correct register classes to the cross block values beforehand. For the divergent targets same value type requires different register classes dependent on the value divergence. Reviewers: rampitec, nhaehnle Differential Revision: https://reviews.llvm.org/D59990 llvm-svn: 361644	2019-05-24 15:32:18 +00:00
Stefan Pintilie	522307fa40	[PowerPC] Remove CRBits Copy Of Unset/set CBit For the situation, where we generate the following code: crxor 8, 8, 8 < Some instructions> .LBB0_1: < Some instructions> cror 1, 8, 8 cror (COPY of CRbit) depends on the result of the crxor instruction. CR8 is known to be zero as crxor is equivalent to CRUNSET. We can simply use crxor 1, 1, 1 instead to zero out CR1, which does not have any dependency on any previous instruction. This patch will optimize it to: < Some instructions> .LBB0_1: < Some instructions> cror 1, 1, 1 Patch By: Victor Huang (NeHuang) Differential Revision: https://reviews.llvm.org/D62044 llvm-svn: 361632	2019-05-24 12:05:37 +00:00
Cullen Rhodes	b3e58df80c	[AArch64][SVE2] Asm: support SVE2 String Processing Group Summary: Patch adds support for the SVE2 character match instructions MATCH and NMATCH. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62206 llvm-svn: 361627	2019-05-24 10:32:01 +00:00
Cullen Rhodes	adb1d74bf9	[AArch64][SVE2] Asm: support SVE2 Narrowing Group Summary: Patch adds support for the following instructions: SVE2 bitwise shift right narrow: * SQSHRUNB, SQSHRUNT, SQRSHRUNB, SQRSHRUNT, SHRNB, SHRNT, RSHRNB, RSHRNT, SQSHRNB, SQSHRNT, SQRSHRNB, SQRSHRNT, UQSHRNB, UQSHRNT, UQRSHRNB, UQRSHRNT SVE2 integer add/subtract narrow high part: * ADDHNB, ADDHNT, RADDHNB, RADDHNT, SUBHNB, SUBHNT, RSUBHNB, RSUBHNT SVE2 saturating extract narrow: * SQXTNB, SQXTNT, UQXTNB, UQXTNT, SQXTUNB, SQXTUNT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62205 llvm-svn: 361624	2019-05-24 10:22:30 +00:00
Cullen Rhodes	5f04f00282	[AArch64][SVE2] Asm: support SVE2 Accumulate Group Summary: Patch adds support for the following instructions: SVE2 bitwise shift and insert: * SRI, SLI SVE2 bitwise shift right and accumulate: * SSRA, USRA, SRSRA, URSRA SVE2 complex integer add: * CADD, SQCADD SVE2 integer absolute difference and accumulate: * SABA, UABA SVE2 integer absolute difference and accumulate long: * SABALB, SABALT, UABALB, UABALT SVE2 integer add/subtract long with carry: * ADCLB, ADCLT, SBCLB, SBCLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62204 llvm-svn: 361622	2019-05-24 10:10:34 +00:00
Simon Pilgrim	95b8d9bbf8	[SelectionDAG] computeKnownBits - support constant pool values from target This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets. computeKnownBits then uses this function to improve codegen, notably vector code after legalization. A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit. This required a couple of fixes: * SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR). * Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets. Differential Revision: https://reviews.llvm.org/D61887 llvm-svn: 361620	2019-05-24 10:03:11 +00:00
Cullen Rhodes	980f760515	[AArch64][SVE2] Asm: add PMULLB/PMULLT instructions Summary: This patch adds support for the polynomial multiplication instructions PMULLB/PMULLT. The 64-bit source and 128-bit destination element variants are enabled with crypto extensions (+sve2-aes), similar to the NEON PMULL2 instruction. All other variants are enabled with +sve2. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62145 llvm-svn: 361619	2019-05-24 09:56:23 +00:00
Cullen Rhodes	8bcea9daaa	[AArch64][SVE2] Asm: add integer add/sub long/wide instructions Summary: Patch adds support for the following instructions: SVE2 integer add/subtract long: * SADDLB, SADDLT, UADDLB, UADDLT, SSUBLB, SSUBLT, USUBLB, USUBLT, SABDLB, SABDLT, UABDLB, UABDLT SVE2 integer add/subtract wide: * SADDWB, SADDWT, UADDWB, UADDWT, SSUBWB, SSUBWT, USUBWB, USUBWT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62142 llvm-svn: 361615	2019-05-24 09:28:27 +00:00
Bjorn Pettersson	b4771425f5	Use the DataLayout::typeSizeEqualsStoreSize helper. NFC Just a minor refactoring to use the new helper method DataLayout::typeSizeEqualsStoreSize(). This is done when checking if getTypeSizeInBits is equal/non-equal to getTypeStoreSizeInBits. llvm-svn: 361613	2019-05-24 09:20:20 +00:00
Cullen Rhodes	968cb0e049	[AArch64][SVE2] Asm: add various bitwise shift instructions Summary: This patch adds support for the SVE2 saturating/rounding bitwise shift left (predicated) group of instructions: * SRSHL, URSHL, SRSHLR, URSHLR, SQSHL, UQSHL, SQRSHL, UQRSHL, SQSHLR, UQSHLR, SQRSHLR, UQRSHLR Immediate forms of the SQSHL and UQSHL instructions are also added to the existing SVE bitwise shift by immediate (predicated) group, as well as three new instructions SRSHR/URSHR/SQSHLU. The new instructions in this group are encoded similarly and are implemented using the same TableGen class with a minimal change (1 bit in encoding). The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62140 llvm-svn: 361612	2019-05-24 09:17:23 +00:00
Cullen Rhodes	6bca64fe5e	[AArch64][SVE2] Asm: add saturating add/sub instructions Summary: Patch adds support for the following instructions: * SQADD, UQADD, SUQADD, USQADD * SQSUB, UQSUB, SQSUBR, UQSUBR The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62130 llvm-svn: 361611	2019-05-24 09:06:37 +00:00
Neil Henning	119c31ad93	StructurizeCFG: Relax uniformity checks. This change relaxes the checks for hasOnlyUniformBranches such that our region is uniform if: 1. All conditional branches that are direct children are uniform. 2. And either: a. All sub-regions are uniform. b. There is at most one conditional branch among the direct children. Differential Revision: https://reviews.llvm.org/D62198 llvm-svn: 361610	2019-05-24 08:59:17 +00:00
Cullen Rhodes	d9bb7b69ab	[AArch64][SVE2] Asm: fix overlapping bit Summary: Bit 20 in sve2_int_arith_pred TableGen class was overlapping. The encodings are not affected as bit 20 is defined by the opc bits and this was overwriting the earlier error of setting bit 20 to 0. Raised by Momchil: https://reviews.llvm.org/D62130 Reviewed By: chill Differential Revision: https://reviews.llvm.org/D62292 llvm-svn: 361609	2019-05-24 08:45:37 +00:00
Tim Northover	3b2157aeed	GlobalISel: support swifterror attribute on AArch64. swifterror marks an argument as a register pretending to be a pointer, so we need a guaranteed mem2reg-like analysis of its uses. Fortunately most of the infrastructure can be reused from the DAG world. llvm-svn: 361608	2019-05-24 08:40:13 +00:00
Tim Northover	3d7a057b0d	CodeGen: factor out swifterror value tracking. llvm-svn: 361607	2019-05-24 08:39:43 +00:00
Simon Atanasyan	c1b482f2a5	[mips] Always check that `shift and add` optimization is efficient. The D45316 introduced the `shouldTransformMulToShiftsAddsSubs` function to check that breaking down constant multiplications into a series of shifts, adds, and subs is efficient. Unfortunately, this function does not check maximum number of steps on all paths of the algorithm. This patch fixes this bug. Fix for PR41929. Differential Revision: https://reviews.llvm.org/D62166 llvm-svn: 361606	2019-05-24 08:39:40 +00:00
Bjorn Pettersson	d63a2bb35f	[DSE] Bugfix to avoid PartialStoreMerging involving non byte-sized stores Summary: The DeadStoreElimination pass now skips doing PartialStoreMerging when stores overlap according to OW_PartialEarlierWithFullLater and at least one of the stores is having a store size that is different from the size of the type being stored. This solves problems seen in https://bugs.llvm.org/show_bug.cgi?id=41949 for which we in the past could end up with mis-compiles or assertions. The content and location of the padding bits is not formally described (or undefined) in the LangRef at the moment. So the solution is chosen based on that we cannot assume anything about the padding bits when having a store that clobbers more memory than indicated by the type of the value that is stored (such as storing an i6 using an 8-bit store instruction). Fixes: https://bugs.llvm.org/show_bug.cgi?id=41949 Reviewers: spatel, efriedma, fhahn Reviewed By: efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62250 llvm-svn: 361605	2019-05-24 08:32:02 +00:00
Sjoerd Meijer	937af54666	[ARM] ARMExpandPseudoInsts: add debug messages This pass wasn't printing any messages at all, which I find really inconvenient while debugging/tracing things. It now dumps the before and after of expanded instructions. It doesn't do this yet for all instructions, but this is a good start I guess. Differential Revision: https://reviews.llvm.org/D62297 llvm-svn: 361604	2019-05-24 08:25:02 +00:00
QingShan Zhang	449bfdd1b0	[Power9] Add a specific heuristic to schedule the addi before the load When scheduling a load and an addi, if no other heuristic took effect, we try to schedule the addi before the load to hide the latency and avoid the true dependency added by RA. This only takes effect for Power9. Differential Revision: https://reviews.llvm.org/D61930 llvm-svn: 361600	2019-05-24 05:30:09 +00:00
Yevgeny Rouban	c652b3455e	[NFC] SwitchInst: Introduce wrapper for prof branch_weights handling This patch introduces a wrapper class that re-implements several mutator methods of SwitchInst to handle changes of prof branch_weights metadata along with remove/add switch case methods. Subsequent patches will use this wrapper to implement prof branch_weights metadata handling for SwitchInst. Reviewers: davidx, eraman, reames, chandlerc Reviewed By: davidx Differential Revision: https://reviews.llvm.org/D62122 llvm-svn: 361596	2019-05-24 04:34:23 +00:00
David Blaikie	fc302c2b7f	dwarfdump: Deterministically... determine whether parsing a DWARF32 or DWARF64 str_offsets header Rather than trying one and then the other - use the kind of the CU to select which kind of header to parse. llvm-svn: 361589	2019-05-24 01:41:58 +00:00
Reid Kleckner	b7a78c7dff	[AArch64] Preserve X8 for thunks ending in variadic musttail calls Summary: On Windows, X8 may be used to pass in the address of an aggregate that is returned indirectly. Therefore, it should be forwarded to variadic musttail calls and preserved in thunks. Fixes PR41997 Reviewers: mgrang, efriedma Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62344 llvm-svn: 361585	2019-05-24 01:27:20 +00:00
Serge Pavlov	ed595e8627	[AArch64] Add nvcast patterns for v2f32 -> v1f64 Summary: Constant stores of f32 values can create such NvCast nodes. Reviewers: t.p.northover Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62285 llvm-svn: 361584	2019-05-24 01:20:34 +00:00
David Blaikie	79872a88a0	dwarfdump: Add a bit more DWARF64 support This test case was incorrect because it mixed DWARF32 and DWARF64 for a single unit (DWARF32 unit referencing a DWARF64 str_offsets section). So fix enough of the unit parsing for DWARF64 and make the test valid. (not sure if anyone needs DWARF64 support though - support in libDebugInfoDWARF has been added piecemeal and LLVM doesn't produce it at all) llvm-svn: 361582	2019-05-24 01:05:52 +00:00
Eli Friedman	052f87ae36	Revert r361460 It regresses https://bugs.llvm.org/show_bug.cgi?id=38309 (represented by the testcase test/Transforms/GlobalOpt/globalsra-multigep.ll). llvm-svn: 361581	2019-05-24 01:03:51 +00:00
Thomas Lively	55229f6b10	[WebAssembly] Expand more SIMD float ops Summary: These were previously causing ISel failures. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62354 llvm-svn: 361577	2019-05-24 00:15:04 +00:00
Sanjay Patel	8869a98e82	[InstSimplify] fold insertelement-of-extractelement This was partly handled in InstCombine (only the constant index case), so delete that and zap it more generally in InstSimplify. llvm-svn: 361576	2019-05-24 00:13:58 +00:00
Sanjay Patel	093c922205	[InstCombine] remove redundant fold for extractelement; NFC The out-of-bounds index pattern is handled by InstSimplify, so the extractelement should be eliminated next time it is visited. llvm-svn: 361570	2019-05-23 23:33:38 +00:00
Sanjay Patel	4d4df6f144	[InstCombine] remove redundant fold for insertelement; NFC The out-of-bounds index pattern is handled by InstSimplify. llvm-svn: 361569	2019-05-23 23:33:34 +00:00
Alina Sbirlea	d82ddfa7c3	[NewPassManager] Add tuning option: ForgetAllSCEVInLoopUnroll [NFC]. Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61612 llvm-svn: 361560	2019-05-23 21:52:59 +00:00
Sanjay Patel	e60cb7d1be	[InstSimplify] insertelement V, undef, ? --> V This was part of InstCombine, but it's better placed in InstSimplify. InstCombine also had an unreachable but weaker fold for insertelement with undef index, so that is deleted. llvm-svn: 361559	2019-05-23 21:49:47 +00:00
Kit Barton	987fdfd9a7	Revert [LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch. This reverts r361517 (git commit `2049e4dd8f`) llvm-svn: 361553	2019-05-23 20:53:05 +00:00
Sanjay Patel	7d6c0bce50	[DAGCombiner] make folds of binops safe for opcodes that produce >1 value This is no-functional-change-intended currently because the definition of isBinOp() only includes opcodes that produce 1 value. But if we share that implementation with isCommutativeBinOp() as proposed in D62191, then we need to make sure that the callers bail out for opcodes that they are not prepared to handle correctly. llvm-svn: 361547	2019-05-23 20:17:25 +00:00
Matt Arsenault	5c714cbdd8	AMDGPU: Correct maximum possible private allocation size We were assuming a much larger possible per-wave visible stack allocation than is possible: `faa3ae5138/src/core/runtime/amd_gpu_agent.cpp (L70)` Based on this, we can assume the high 15 bits of a frame index or sret are 0. The frame index value is the per-lane offset, so the maximum frame index value is MAX_WAVE_SCRATCH / wavesize. Remove the corresponding subtarget feature and option that made this configurable. llvm-svn: 361541	2019-05-23 19:38:14 +00:00
Alina Sbirlea	e4b27869c6	[NewPassManager] Add tuning option: LoopUnrolling [NFC]. Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: jlebar, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61618 llvm-svn: 361540	2019-05-23 19:35:40 +00:00
Alina Sbirlea	63729b0c49	[SLPVectorizer] Set flag to previous default. Summary: The refactoring in r360276 moved the `RunSLPVectorization` flag and added the default explicitly. The default should have been `false`, as before. The new pass manager used to have SLPVectorization on by default, now it's off in opt, and needs D61617 checked in to enable it in clang. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61955 llvm-svn: 361537	2019-05-23 19:07:41 +00:00
Sanjay Patel	3249be1e03	[InstCombine] be more careful when transforming a shuffle mask This is reduced from a fuzzer test: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=14890 Usually, demanded elements should be able to simplify shuffle mask elements that are pointing to undef elements of its source operands, but that doesn't happen in the test case. llvm-svn: 361533	2019-05-23 18:46:03 +00:00
Robert Lougher	170dfeb2ff	Resubmit r360436 "[X86] Avoid SFB - Fix inconsistent codegen with/without debug info" Fixes https://bugs.llvm.org/show_bug.cgi?id=40969 The functions findPotentiallyBlockedCopies and buildCopy are currently not accounting for the presence of debug instructions. In the former this results in the optimization not being triggered, and in the latter results in inconsistent codegen. This patch enables the optimization to be performed in a debug build and ensures the codegen is consistent with non-debug builds. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61680 llvm-svn: 361527	2019-05-23 18:15:12 +00:00
Thomas Lively	e18b5c6237	[WebAssembly] Implement ReplaceNodeResults to fix a SIMD crash Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61037 llvm-svn: 361526	2019-05-23 18:09:26 +00:00
Matt Arsenault	0f3ba44b57	AMDGPU/GlobalISel: Legality for integer min/max llvm-svn: 361519	2019-05-23 17:58:48 +00:00
Kit Barton	2049e4dd8f	[LOOPINFO] Extend Loop object to add utilities to get the loop bounds, step, induction variable, and guard branch. Summary: This PR extends the loop object with more utilities to get loop bounds, step, induction variable, and guard branch. There already exists passes which try to obtain the loop induction variable in their own pass, e.g. loop interchange. It would be useful to have a common area to get these information. Moreover, loop fusion (https://reviews.llvm.org/D55851) is planning to use getGuard() to extend the kind of loops it is able to fuse, e.g. rotated loop with non-constant upper bound, which would have a loop guard. /// Example: /// for (int i = lb; i < ub; i+=step) /// <loop body> /// --- pseudo LLVMIR --- /// beforeloop: /// guardcmp = (lb < ub) /// if (guardcmp) goto preheader; else goto afterloop /// preheader: /// loop: /// i1 = phi[{lb, preheader}, {i2, latch}] /// <loop body> /// i2 = i1 + step /// latch: /// cmp = (i2 < ub) /// if (cmp) goto loop /// exit: /// afterloop: /// /// getBounds /// getInitialIVValue --> lb /// getStepInst --> i2 = i1 + step /// getStepValue --> step /// getFinalIVValue --> ub /// getCanonicalPredicate --> '<' /// getDirection --> Increasing /// getGuard --> if (guardcmp) goto loop; else goto afterloop /// getInductionVariable --> i1 /// getAuxiliaryInductionVariable --> {i1} /// isCanonical --> false Committed on behalf of @Whitney (Whitney Tsang). Reviewers: kbarton, hfinkel, dmgreen, Meinersbur, jdoerfert, syzaara, fhahn Reviewed By: kbarton Subscribers: tvvikram, bmahjour, etiotto, fhahn, jsji, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60565 llvm-svn: 361517	2019-05-23 17:56:35 +00:00
Thomas Lively	eafe8ef6f2	[WebAssembly] Add multivalue and tail-call target features Summary: These features will both be implemented soon, so I thought I would save time by adding the boilerplate for both of them at the same time. Reviewers: aheejin Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D62047 llvm-svn: 361516	2019-05-23 17:26:47 +00:00
Thomas Preud'homme	7b7683d7a6	[FileCheck] Remove llvm:: prefix Summary: Remove all llvm:: prefixes in FileCheck library header and implementation except for calls to make_unique and make_shared since both files already use the llvm namespace. Reviewers: jhenderson, jdenny, probinson, arichardson Subscribers: hiraditya, arichardson, probinson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62323 llvm-svn: 361515	2019-05-23 17:19:36 +00:00
Saleem Abdulrasool	7bbefb13ee	Transforms: lower fadd and fsub atomicrmw instructions `fadd` and `fsub` have recently (r351850) been added as `atomicrmw` operations. This diff adds lowering cases for them to the LowerAtomic transform. Patch by Josh Berdine! llvm-svn: 361512	2019-05-23 17:03:43 +00:00
Andrea Di Biagio	27b3b5d952	[MCA] Add the ability to compute critical register dependency of an instruction. This patch adds the methods `getCriticalRegDep()` and `computeCriticalRegDep()` to class InstructionBase. The goal is to allow users to obtain information about the critical register dependency that most affects the latency of an instruction. These methods are currently unused. However, the long term plan is to use them in order to allow the computation of a critical-path as part of the bottleneck analysis. So, this is yet another step towards fixing PR37494. llvm-svn: 361509	2019-05-23 16:32:19 +00:00
Shoaib Meenai	87226a7202	[AsmPrinter] Treat a narrowing PtrToInt like Trunc When printing assembly for PtrToInt, AsmPrinter::lowerConstant incorrectly assumed that if PtrToInt was not converting to an int with exactly the same number of bits, it must be widening to a larger int. But this isn't necessarily true; PtrToInt can also shrink the size, which is useful when you want to produce a known 32-bit pointer on a 64-bit platform (on x86_64 ELF this yields a R_X86_64_32 relocation). The old behavior of falling through to the widening case for a narrowing PtrToInt yields bogus assembly code like this, which fails to assemble because the no-op bit and it accidentally creates is not a valid relocation: ``` .long a&-1 ``` The fix is to treat a narrowing PtrToInt exactly the same as it already treats Trunc: just emit the expression and let the assembler deal with truncating it in the appropriate way. Patch by Mat Hostetter <mjh@fb.com>. Differential Revision: https://reviews.llvm.org/D61325 llvm-svn: 361508	2019-05-23 16:29:09 +00:00
Lewis Revill	74927554e2	[RISCV] Support assembling TLS LA pseudo instructions This patch adds the pseudo instructions la.tls.ie and la.tls.gd, used in the initial-exec and global-dynamic TLS models respectively when addressing a global. The pseudo instructions are expanded in the assembly parser. llvm-svn: 361499	2019-05-23 14:46:27 +00:00
Petar Jovanovic	aa28b6d198	[LiveDebugValues] Rename 'DMI' into 'DebugInstr' (NFC) This will improve code readability. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D62295 llvm-svn: 361497	2019-05-23 13:49:06 +00:00
Andrea Di Biagio	dd0d9e01ee	[MCA] Introduce class LSUnitBase and let LSUnit derive from it. Class LSUnitBase provides an abstract interface for all the concrete LS units in llvm-mca. Methods exposed by the public abstract LSUnitBase interface are: - Status isAvailable(const InstRef&); - void dispatch(const InstRef &); - const InstRef &isReady(const InstRef &); LSUnitBase standardises the API, but not the data structures internally used by LS units. This allows for more flexibility. Previously, only method `isReady()` was declared virtual by class LSUnit. Also, derived classes had to inherit all the internal data members of LSUnit. No functional change intended. llvm-svn: 361496	2019-05-23 13:42:47 +00:00
Clement Courbet	43882b16a3	[MergeICmps] Make the pass compatible with the new pass manager. Reviewers: gchatelet, spatel Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62287 llvm-svn: 361490	2019-05-23 12:35:26 +00:00
Andrea Di Biagio	28afd8dc71	[MCA] Make the bool conversion operator in class InstRef explicit. NFCI This patch makes the bool conversion operator in InstRef explicit. It also adds an operator< to help compare InstRef objects in sets. llvm-svn: 361482	2019-05-23 10:50:01 +00:00
Petar Jovanovic	ff47d83e78	[DwarfExpression] Refactor dwarf expression (NFC) Refactor location description kind in order to be easier for extensions (needed for D60866). In addition, cut off some bits from the other class fields. Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D62002 llvm-svn: 361480	2019-05-23 10:37:13 +00:00
Sam Parker	617cdc5a6d	[ARM][CGP] Clear SafeWrap before each search The previous patch added a member set to store instructions that we could allow to wrap. But this wasn't cleared between searches meaning that they could get promoted, incorrectly, during the promotion of a separate valid chain. Differential Revision: https://reviews.llvm.org/D62254 llvm-svn: 361462	2019-05-23 07:46:39 +00:00
Christian Bruel	4a7da98bd9	[GlobalOpt] recognize dead struct fields and propagate values Summary: Allow struct fields SRA and dead stores. This works by treating field accesses from getElementPtr as a possible pointer root that can be cleaned up. We check that the variable can be SRA by recursively checking the sub expressions with the new isSafeSubSROAGEP function. Basically, this allows the array in the following C code to be optimized out: struct Expr { int a[2]; int b; }; static struct Expr e; int foo (int i) { e.b = 2; e.a[i] = 1; return e.b; } Reviewers: greened, bkramer, nicholas, jmolloy Reviewed By: jmolloy Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61911 llvm-svn: 361460	2019-05-23 05:53:10 +00:00
Thomas Lively	1a3cbe720c	[WebAssembly] Implement __builtin_return_address for emscripten Summary: In this patch, `ISD::RETURNADDR` is lowered on the emscripten target to the new Emscripten runtime function `emscripten_return_address`, which implements the functionality. Patch by Guanzhong Chen Reviewers: tlively, aheejin Reviewed By: tlively Subscribers: dschuff, sbc100, jgravelle-google, hiraditya, sunfish, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62210 llvm-svn: 361454	2019-05-23 01:24:01 +00:00
Fangrui Song	86c9ca48c3	[X86] Support -fno-plt __tls_get_addr calls In general dynamic/local dynamic TLS models, with -fno-plt, * x86: emit `calll ___tls_get_addr@GOT(%ebx)` instead of `calll ___tls_get_addr@PLT` Note, on x86, if we can get rid of %ebx as the PIC register, it may be better to use a register not preserved across function calls. x86_64: emit `callq *__tls_get_addr@GOTPCREL(%rip)` instead of `callq __tls_get_addr@PLT` Reorganize the code by separating 32-bit and 64-bit. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D62106 llvm-svn: 361453	2019-05-23 01:05:13 +00:00
Thomas Preud'homme	f3b9bb3d69	[FileCheck] Introduce substitution subclasses Summary: With now a clear distinction between string and numeric substitutions, this patch introduces separate classes to represent them with a parent class implementing the common interface. Diagnostics in printSubstitutions() are also adapted to not require knowing which substitution is being looked at since it does not hinder clarity and makes the implementation simpler. Reviewers: jhenderson, jdenny, probinson, arichardson Subscribers: llvm-commits, probinson, arichardson, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D62241 llvm-svn: 361446	2019-05-23 00:10:29 +00:00
Thomas Preud'homme	1a944d27b2	FileCheck: Improve FileCheck variable terminology Summary: Terminology introduced by [[#]] blocks is confusing and does not integrate well with existing terminology. First, variables referred to by [[]] blocks are called "pattern variables" while the text a CHECK directive needs to match is called a "CHECK pattern". This is inconsistent with variables in [[#]] blocks since [[#]] blocks are also found in CHECK pattern yet those variables are called "numeric variable". Second, the replacing of both [[]] and [[#]] blocks by the value of the variable or expression they contain is represented by a FileCheckPatternSubstitution class. The naming refers to being a substitution in a CHECK pattern but could be wrongly understood as being a substitution of a pattern variable. Third and lastly, comments use "numeric expression" to refer both to the [[#]] blocks as well as to the numeric expressions these blocks contain which get evaluated at match time. This patch solves these confusions by - calling variables in [[]] and [[#]] blocks string and numeric variables respectively; - referring to [[]] and [[#]] as substitution blocks, with the former being a string substitution block and the latter a numeric substitution block; - calling the replacement of [[]] and [[#]] blocks by the value of a variable or expression they contain a substitution (as opposed to a definition when these blocks are used to define a variable), with the former being a string substitution and the latter a numeric substitution; - renaming the FileCheckPatternSubstitution as a FileCheckSubstitution class with FileCheckStringSubstitution and FileCheckNumericSubstitution subclasses; - restricting the use of "numeric expression" to refer to the expression that is evaluated in a numeric substitution. While numeric substitution blocks only support numeric substitutions of numeric expressions at the moment there are plans to augment numeric substitution blocks to support numeric definitions as well as both a numeric definition and numeric substitution in the same numeric substitution block. Reviewers: jhenderson, jdenny, probinson, arichardson Subscribers: hiraditya, arichardson, probinson, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62146 llvm-svn: 361445	2019-05-23 00:10:14 +00:00
Matt Arsenault	b79a25b124	TableGen: Handle nontrivial foreach range bounds This allows using anything that isn't a literal integer as the bounds for a foreach. Some of the diagnostics aren't perfect, but nobody ever accused tablegen of having good errors. For example, the existing wording suggests a bitrange is valid, but as far as I can tell this has never worked. Fixes bug 41958. llvm-svn: 361434	2019-05-22 21:28:20 +00:00
Craig Topper	93f38e1f1a	[X86] Explicitly disable VEXTRACT instruction matching for an immediate of 0. Remove a bunch of isel patterns that become unnecessary. We effectively had a second set of isel patterns that tried to use a regular store instruction and an extract_subreg instruction. Or a masked move and an extract_subreg. These patterns were intended to override the matching of VEXTRACT instructions by taking advantage of the priority of the explicit immediate 0 for the index. This patch instead just disables the immediate 0 matching in the VEXTRACT patterns. This way each of the component pieces of the larger patterns will match by themselves. This found a bug of sorts where we didn't use 128-bit store for 512->128 extract on KNL. It's unclear what the right thing here should be. Using the vextract avoids constraining the register allocator to use xmm0-15. But it always results in a longer encoding if the register allocator ends up choosing xmm0-15 anyway. llvm-svn: 361431	2019-05-22 21:00:18 +00:00
Galina Kistanova	ed49f6d8e6	Reverted r361134 because of a failing test left unattended for a long time. http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/17792/steps/test-check-all/logs/stdio Failing Tests (1): LLVM :: CodeGen/AMDGPU/regbank-reassign.mir llvm-svn: 361430	2019-05-22 20:42:56 +00:00
Craig Topper	9816d55776	[X86][InstCombine] Remove InstCombine code that turns X86 round intrinsics into llvm.ceil/floor. Remove some isel patterns that existed because that was happening. We were turning roundss/sd/ps/pd intrinsics with immediates of 1 or 2 into llvm.floor/ceil. The llvm.ceil/floor intrinsics are supposed to correspond to the libm functions. For the libm functions we need to disable the precision exception so the llvm.floor/ceil functions should always map to encodings 0x9 and 0xA. We had a mix of isel patterns where some used 0x9 and 0xA and others used 0x1 and 0x2. We need to be consistent and always use 0x9 and 0xA. Since we have no way in isel of knowing where the llvm.ceil/floor came from, we can't map X86 specific intrinsics with encodings 1 or 2 to it. We could map 0x9 and 0xA to llvm.ceil/floor instead, but I'd really like to see a use case and optimization advantage first. I've left the backend test cases to show the blend we now emit without the extra isel patterns. But I've removed the InstCombine tests completely. llvm-svn: 361425	2019-05-22 20:04:55 +00:00
Craig Topper	2f1895e03d	[X86] Add more icelake model numbers to getHostCPUName. Using model numbers found in Table 2-1 of the May 2019 version of the Intel Software Developer's Manual Volume 4. llvm-svn: 361422	2019-05-22 19:51:35 +00:00
Alexey Lapshin	53726588f6	[DebugInfo][AArch64] Recognise target specific instruction as mov instr This fix is for the problem from https://bugs.llvm.org/show_bug.cgi?id=38714. Specifically, Simple Register Coalescing creates the following conversion: undef %0.sub_32:gpr64 = ORRWrs $wzr, %3:gpr32common, 0, debug-location !24; It copies a 32-bit value from gpr32 into gpr64, but the Live DEBUG_VALUE analysis is not able to create a debug location record for that instruction. As a result, the debug info for the argc variable is incorrect. The fix is to write a custom isCopyInstrImpl() that recognizes the ORRWrs instr. llvm-svn: 361417	2019-05-22 18:48:58 +00:00
Hiroshi Yamauchi	dfeb797455	[PGO][CHR] Speed up following long use-def chains. Summary: Avoid visiting an instruction more than once by using a map. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62262 llvm-svn: 361416	2019-05-22 18:37:34 +00:00
Xing Xue	4246b75295	Disable EHFrameSupport in JITLink/RuntimeDyld on AIX Summary: EH Frames aren't supported on AIX with the system compiler, but the definition of HAVE_EHTABLE_SUPPORT misses this which causes linking problems on AIX. This patch updates the definition of HAVE_EHTABLE_SUPPORT in both JITLink and RuntimeDyld. Author: daltenty Reviewers: sfertile, xingxue, hubert.reinterpretcase Reviewed By: xingxue Subscribers: hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62203 llvm-svn: 361410	2019-05-22 17:41:27 +00:00
Matt Arsenault	418e23e33c	AMDGPU: Move disassembler support check to constructor Don't check for unsupported targets for every instruction. llvm-svn: 361406	2019-05-22 16:28:48 +00:00
Matt Arsenault	ca64ef2043	MC: Allow getMaxInstLength to depend on the subtarget Keep it optional in cases this is ever needed in some global context. Currently it's only used for getting an upper bound inline asm code size. For AMDGPU, gfx10 increases the maximum instruction size to 20-bytes. This avoids penalizing older subtargets when estimating code size, and making some annoying branch relaxation test adjustments. llvm-svn: 361405	2019-05-22 16:28:41 +00:00
Kees Cook	c2187c20a4	[TargetLowering] Extend bool args to inline-asm according to getBooleanType Summary: This extends Krzysztof Parzyszek's X86-specific solution (https://reviews.llvm.org/D60208) to the generic code pointed out by James Y Knight. Reviewers: kparzysz, craig.topper, nickdesaulniers Subscribers: efriedma, sdardis, nemanjai, javed.absar, eraman, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, srhines, void, nickdesaulniers, jyknight Tags: #llvm Differential Revision: https://reviews.llvm.org/D60224 llvm-svn: 361404	2019-05-22 16:16:15 +00:00
Kees Cook	a7a687e500	[TargetLowering] Add blank line (test commit) llvm-svn: 361403	2019-05-22 16:02:13 +00:00
Nico Weber	09fb2029e5	llvm-undname: Fix an assert-on-invalid, found by oss-fuzz If a template parameter refers to a pointer to member, but the mangling of that was a string literal instead of a real symbol, llvm-undname used to crash instead of rejecting the input. llvm-svn: 361402	2019-05-22 15:53:23 +00:00
Sanjay Patel	5a4f7cf2ff	[IR] allow fast-math-flags on select of FP values This is a minimal start to correcting a problem most directly discussed in PR38086: https://bugs.llvm.org/show_bug.cgi?id=38086 We have been hacking around a limitation for FP select patterns by using the fast-math-flags on the condition of the select rather than the select itself. This patch just allows FMF to appear with the 'select' opcode. No changes are needed to "FPMathOperator" because it already includes select-of-FP because that definition is based on the (return) value type. Once we have this ability, we can start correcting and adding IR transforms to use the FMF on a 'select' instruction. The instcombine and vectorizer test diffs only show that the IRBuilder change is behaving as expected by applying an FMF guard value to 'select'. For reference: rL241901 - allowed FMF with fcmp rL255555 - allowed FMF with FP calls Differential Revision: https://reviews.llvm.org/D61917 llvm-svn: 361401	2019-05-22 15:50:46 +00:00
Simon Pilgrim	3c05cad03e	LoopVectorizationCostModel::selectInterleaveCount - assert we have a non-zero loop cost. NFCI. The input LoopCost value can be zero, but if so it should be recalculated with the current VF. After that it should always be non-zero. llvm-svn: 361387	2019-05-22 14:18:17 +00:00
Dmitry Preobrazhensky	7773fc478d	[AMDGPU][MC] Corrected parsing of op_sel* and neg_* modifiers See bug 41361: https://bugs.llvm.org/show_bug.cgi?id=41361 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61012 llvm-svn: 361386	2019-05-22 13:59:01 +00:00
Simon Pilgrim	9b40dd6318	[Hexagon] assert getRegisterBitWidth returns non-zero value. NFCI. Fixes scan-build warning. llvm-svn: 361375	2019-05-22 12:25:46 +00:00
Simon Pilgrim	cfe6fe06ab	[VirtualFileSystem] Fix uninitialized variable warning. NFCI. llvm-svn: 361371	2019-05-22 11:20:52 +00:00
Sjoerd Meijer	aa4f1ffca4	[TargetMachine] error message unsupported code model When the tiny code model is requested for a target machine that does not support this, we get an error message (which is nice) but also this diagnostic and request to submit a bug report: fatal error: error in backend: Target does not support the tiny CodeModel [Inferior 2 (process 31509) exited with code 0106] clang-9: error: clang frontend command failed with exit code 70 (use -v to see invocation) (gdb) clang version 9.0.0 (http://llvm.org/git/clang.git 29994b0c63a40f9c97c664170244a7bba5ecc15e) (http://llvm.org/git/llvm.git 95606fdf91c2d63a931e865f4b78b2e9828ddc74) Target: arm-arm-none-eabi Thread model: posix clang-9: note: diagnostic msg: PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script. clang-9: note: diagnostic msg: ******************** PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT: Preprocessed source(s) and associated run script(s) are located at: clang-9: note: diagnostic msg: /tmp/tiny-dfe1a2.c clang-9: note: diagnostic msg: /tmp/tiny-dfe1a2.sh clang-9: note: diagnostic msg: But this is not a bug, this is a feature. :-) Not only is this not a bug, this is also pretty confusing. With this patch, only the fatal error is printed, not the diagnostic: fatal error: error in backend: Target does not support the tiny CodeModel Differential Revision: https://reviews.llvm.org/D62236 llvm-svn: 361370	2019-05-22 10:40:26 +00:00
Martin Storsjo	de6038b265	[llvm-dlltool] Respect NONAME keyword This adds proper handling of the NONAME-keyword, which makes llvm-dlltool generate an import using the ordinal instead of the name. Patch by Jannik Vogel, test added by Stefan Schmidt. Differential Revision: https://reviews.llvm.org/D62175 llvm-svn: 361367	2019-05-22 09:49:54 +00:00
Clement Courbet	f8f93ba90d	Re-land r361257 "[MergeICmps][NFC] Make BCEAtom move-only." llvm-svn: 361366	2019-05-22 09:45:40 +00:00
Anton Afanasyev	df00c6a54f	[MIR] Add simple PRE pass to MachineCSE This is the second part of the commit fixing PR38917 (hoisting partially redundant machine instruction). Most of PRE (partial redundancy elimination) and CSE work is done on LLVM IR, but some redundancy arises during DAG legalization. Machine CSE is not enough to deal with it. This simple PRE implementation works a little bit intricately: it runs before CSE, looking for partial redundancy and transforming it into full redundancy, anticipating that the next CSE step will eliminate this created redundancy. If CSE doesn't eliminate this, then the created instruction will remain dead and be eliminated later by the Remove Dead Machine Instructions pass. The third part of the commit is supposed to refactor MachineCSE, to make it more clear and to merge MachinePRE with MachineCSE, so one need not rely on a further Remove Dead pass to clear instrs not eliminated by CSE. First step: https://reviews.llvm.org/D54839 Fixes llvm.org/PR38917 llvm-svn: 361356	2019-05-22 07:41:34 +00:00
Fangrui Song	1c61471ab1	[PPC64] Parse -elfv1 -elfv2 when specified on target triple Summary: For big-endian powerpc64, the default ABI is ELFv1. OpenPower ABI ELFv2 is supported when -mabi=elfv2 is specified. FreeBSD support for PowerPC64 ELFv2 ABI with LLVM is in progress[1]. This patch adds an alternative way to specify ELFv2 ABI on target triple [2]. The following results are expected: ELFv1 when using: -target powerpc64-unknown-freebsd12.0 -target powerpc64-unknown-freebsd12.0 -mabi=elfv1 -target powerpc64-unknown-freebsd12.0-elfv1 ELFv2 when using: -target powerpc64-unknown-freebsd12.0 -mabi=elfv2 -target powerpc64-unknown-freebsd12.0-elfv2 [1] https://wiki.freebsd.org/powerpc/llvm-elfv2 [2] https://clang.llvm.org/docs/CrossCompilation.html Patch by Alfredo Dal'Ava Júnior! Differential Revision: https://reviews.llvm.org/D61950 llvm-svn: 361355	2019-05-22 07:29:59 +00:00
Sjoerd Meijer	eec021658b	[AArch64] Subtarget crypto extension defaults The Armv8.2-A crypto extensions all defaulted to true, but should default to false, like all the other extensions. Differential Revision: https://reviews.llvm.org/D62180 llvm-svn: 361354	2019-05-22 07:10:27 +00:00
Nikita Popov	15df05152d	[X86] Don't compare i128 through vector if construction not cheap (PR41971) Fix for https://bugs.llvm.org/show_bug.cgi?id=41971. Make the combineVectorSizedSetCCEquality() transform more conservative by checking that the bitcast to the vector type will be cheap/free for both operands. I'm considering it cheap if it's a constant, a load or already a vector. I've dropped the explicit check for f128 because it should fall out naturally (in the cases where it'd be detrimental). Differential Revision: https://reviews.llvm.org/D62220 llvm-svn: 361352	2019-05-22 06:47:06 +00:00
Chen Zheng	b727b0483c	[PowerPC] use meaningful name for displacement form aligned with x-form - NFC llvm-svn: 361347	2019-05-22 03:17:39 +00:00
Chen Zheng	9970665f60	[PowerPC] [ISEL] select x-form instruction for unaligned offset Differential Revision: https://reviews.llvm.org/D62173 llvm-svn: 361346	2019-05-22 02:57:31 +00:00
Pengfei Wang	6a0d432e9e	[X86] [CET] Deal with return-twice function such as vfork, setjmp when CET-IBT enabled Return-twice functions will indirectly jump after the caller's position. So when CET-IBT is enabled, we should make sure there are endbr* instructions following the calls to these return-twice functions, as GCC does. Patch by Xiang Zhang (xiangzhangllvm) Differential Revision: https://reviews.llvm.org/D61881 llvm-svn: 361342	2019-05-22 00:50:21 +00:00
Sanjay Patel	6a554188aa	[InstCombine] fold shuffles of insert_subvectors This should be a valid exception to the general rule of not creating new shuffle masks in IR... because we already do it. :) Also, DAG combining/legalization will undo this by widening the shuffle back out if needed. Explanation for how we already do this: SLP or vector source can create chains of insert/extract as shown in 1 of the examples from PR16739: https://godbolt.org/z/NlK7rA https://bugs.llvm.org/show_bug.cgi?id=16739 And we expect instcombine or DAGCombine to clean that up by creating relatively simple shuffles. Differential Revision: https://reviews.llvm.org/D62024 llvm-svn: 361338	2019-05-22 00:32:25 +00:00
Matt Arsenault	2cba91b8db	AMDGPU: Assume calls read exec llvm-svn: 361333	2019-05-21 23:23:16 +00:00
Matt Arsenault	dd1ffa00a5	AMDGPU: Assume call pseudos are convergent There should probably be nonconvergent versions, but my guess is it doesn't matter in practice. llvm-svn: 361331	2019-05-21 23:23:10 +00:00
Matt Arsenault	60ba03e210	AMDGPU: Fix not marking new gfx10 SGPRs as CSRs llvm-svn: 361330	2019-05-21 23:23:05 +00:00
Dan Gohman	a49496fb2a	[WebAssembly] Add the signature for the new llround builtin function r360889 added new llround builtin functions. This patch adds their signatures for the WebAssembly backend. It also adds wasm32 support to utils/update_llc_test_checks.py, since that's the script other targets are using for their testcases for this feature. Differential Revision: https://reviews.llvm.org/D62207 llvm-svn: 361327	2019-05-21 23:06:34 +00:00
Stanislav Mekhanoshin	44d17ca02e	Fix register coalescer failure to prune value Register coalescer fails for the test in the patch with the assertion in JoinVals::ConflictResolution `DefMI != nullptr'. It attempts to join live intervals for two adjacent instructions and erase the copy: %2:vreg_256 = COPY %1 %3:vreg_256 = COPY killed %1 The LI needs to be adjusted to kill subrange for the erased instruction and extend the subrange of the original def. That was done for the main interval only but not for the subrange. As a result subrange had a VNI pointing to the erased slot resulting in the above failure. Differential Revision: https://reviews.llvm.org/D62162 llvm-svn: 361293	2019-05-21 19:32:41 +00:00
Leonard Chan	0bada7ce6c	[Intrinsic] Signed Fixed Point Saturation Multiplication Intrinsic Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. The result is saturated and clamped between the largest and smallest representable values of the first 2 operands. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D55720 llvm-svn: 361289	2019-05-21 19:17:19 +00:00
Craig Topper	ed6df47bae	[X86] Remove an unneeded ZERO_EXTEND creation from LowerINTRINSIC_W_CHAIN. NFC We were trying to ZERO_EXTEND from an i8 X86ISD::SETCC to i8 again. llvm-svn: 361288	2019-05-21 19:03:45 +00:00
Sanjay Patel	10f6b39899	[SelectionDAG] fold insert subvector of undef into undef DAGCombiner simplifies this more liberally as: // If inserting an UNDEF, just return the original vector. if (N1.isUndef()) return N0; So there's no way to make this visible in output AFAIK, but doing this at node creation time should be slightly more efficient. llvm-svn: 361287	2019-05-21 18:53:53 +00:00
Sanjay Patel	51dc59d090	[SelectionDAG] remove redundant code; NFCI getNode() squashes concatenation of undefs via FoldCONCAT_VECTORS(): // Concat of UNDEFs is UNDEF. if (llvm::all_of(Ops, [](SDValue Op) { return Op.isUndef(); })) return DAG.getUNDEF(VT); llvm-svn: 361284	2019-05-21 18:28:22 +00:00
Clement Courbet	122c6e6f36	[MergeICmps] Make sorting strongly stable on the rhs. Summary: Because the sort order was not strongly stable on the RHS, whether the chain could merge would depend on the order of the blocks in the Phi. EXPENSIVE_CHECKS would shuffle the blocks before sorting, resulting in non-deterministic merging. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, RKSimon Tags: #llvm Differential Revision: https://reviews.llvm.org/D62193 llvm-svn: 361281	2019-05-21 17:58:42 +00:00
Simon Pilgrim	4b82e50315	[X86][SSE] computeKnownBitsForTargetNode - add X86ISD::ANDNP support Fixes PACKSS-PSHUFB shuffle regressions mentioned on D61692 llvm-svn: 361270	2019-05-21 15:20:24 +00:00
Sanjay Patel	78c3f58122	[DAGCombiner] prevent unsafe reassociation of FP ops There are no FP callers of DAGCombiner::reassociateOps() currently, but we can add a fast-math check to make sure this API is not being misused. This was noted as a potential risk (and that risk might increase) with: D62191 llvm-svn: 361268	2019-05-21 14:47:38 +00:00
Clement Courbet	8361a10493	Revert r361257 "[MergeICmps][NFC] Make BCEAtom move-only." Broke some bots. llvm-svn: 361263	2019-05-21 14:24:46 +00:00
Clement Courbet	8fa970c2d8	[MergeICmps][NFC] Make BCEAtom move-only. And handle for self-move. This is required so that llvm::sort can work with EXPENSIVE_CHECKS, as it will do a random shuffle of the input which can result in self-moves. llvm-svn: 361257	2019-05-21 13:34:12 +00:00
Florian Hahn	f9b28e53c7	[ScheduleDAGInstrs] Compute topological ordering on demand. In most cases, the topological ordering does not get changed in ScheduleDAGInstrs. We can compute the ordering on demand, similar to D60125. This drastically cuts down the number of times we need to compute the topological ordering, e.g. for SPEC2006, SPEC2k and MultiSource, we get the following stats for -O3 -flto on X86 (showing the top reductions, with small absolute values filtered). The smallest reduction is -50%. Slightly positive impact on compile-time (-0.1 % geomean speedup for test-suite + SPEC & co, with -O1 on X86) Tests: 243 Metric: pre-RA-sched.NumTopoInits Program base patch diff test-suite...ngs-C/fixoutput/fixoutput.test 115.00 3.00 -97.4% test-suite...ks/Prolangs-C/cdecl/cdecl.test 957.00 26.00 -97.3% test-suite...math/automotive-basicmath.test 107.00 3.00 -97.2% test-suite...rolangs-C++/deriv2/deriv2.test 144.00 6.00 -95.8% test-suite...lowfish/security-blowfish.test 410.00 18.00 -95.6% test-suite...frame_layout/frame_layout.test 441.00 23.00 -94.8% test-suite...rolangs-C++/employ/employ.test 159.00 11.00 -93.1% test-suite...s/Ptrdist/anagram/anagram.test 157.00 11.00 -93.0% test-suite...s-C/unix-smail/unix-smail.test 829.00 59.00 -92.9% test-suite...chmarks/Olden/power/power.test 154.00 11.00 -92.9% test-suite...T95/147.vortex/147.vortex.test 19876.00 1434.00 -92.8% test-suite...000/255.vortex/255.vortex.test 19881.00 1435.00 -92.8% test-suite...ce/Applications/Burg/burg.test 2203.00 168.00 -92.4% test-suite...urce/Applications/hbd/hbd.test 1067.00 85.00 -92.0% test-suite...ternal/HMMER/hmmcalibrate.test 3145.00 251.00 -92.0% test-suite.../Applications/spiff/spiff.test 1037.00 84.00 -91.9% test-suite...SPEC/CINT95/130.li/130.li.test 5913.00 487.00 -91.8% test-suite.../CINT95/134.perl/134.perl.test 12532.00 1041.00 -91.7% test-suite...ce/Benchmarks/Olden/bh/bh.test 220.00 19.00 -91.4% test-suite :: External/Nurbs/nurbs.test 2304.00 206.00 -91.1% 
test-suite...arks/VersaBench/dbms/dbms.test 773.00 75.00 -90.3% test-suite...ce/Applications/siod/siod.test 9043.00 878.00 -90.3% test-suite...pplications/treecc/treecc.test 4510.00 438.00 -90.3% test-suite...T2006/456.hmmer/456.hmmer.test 7093.00 697.00 -90.2% test-suite...s-C/Pathfinder/PathFinder.test 882.00 87.00 -90.1% test-suite.../CINT2000/176.gcc/176.gcc.test 64978.00 6721.00 -89.7% test-suite...cations/hexxagon/hexxagon.test 657.00 69.00 -89.5% test-suite...fice-ispell/office-ispell.test 2712.00 285.00 -89.5% test-suite.../CINT2006/403.gcc/403.gcc.test 139613.00 14992.00 -89.3% test-suite...lications/ClamAV/clamscan.test 25880.00 2785.00 -89.2% Reviewers: MatzeB, atrick, efriedma, niravd Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60839 llvm-svn: 361253	2019-05-21 13:04:53 +00:00
Paul Robinson	9c56326934	[DebugInfo] Handle '# line "file"' correctly for asm source. This provides the correct file path for the original source, rather than the preprocessed source. Part of the fix for PR41839. Differential Revision: https://reviews.llvm.org/D62074 llvm-svn: 361248	2019-05-21 11:59:03 +00:00
Bob Haarman	032f87bbb3	Revert r360902 "Resubmit: [Salvage] Change salvage debug info ..." This reverts commit r360902. It caused an assertion failure in lib/IR/DebugInfoMetadata.cpp: Assertion `(OffsetInBits + SizeInBits <= FragmentSizeInBits) && "new fragment outside of original fragment"' failed. PR41931. llvm-svn: 361246	2019-05-21 11:53:41 +00:00
Paul Robinson	116e8d4876	[DebugInfo] Handle -main-file-name correctly for asm source. This option provides only the base filename, not a full relative path. Part of the fix for PR41839. Differential Revision: https://reviews.llvm.org/D62071 llvm-svn: 361245	2019-05-21 11:52:27 +00:00
Clement Courbet	a95d95d392	[MergeICmps] Preserve the dominator tree. Summary: In preparation for D60318 . Reviewers: gchatelet, efriedma Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62068 llvm-svn: 361239	2019-05-21 11:02:23 +00:00
Fangrui Song	cd36a2857e	[PPC64] Update LocalEntry from assigned symbols On PowerPC64 ELFv2 ABI, functions may have 2 entry points: global and local. The local entry point location of a function is stored in the st_other field of the symbol, as an offset relative to the global entry point. In order to make symbol assignments (e.g. .equ/.set) work properly with this, PPCTargetELFStreamer already copies the local entry bits from the source symbol to the destination one, on emitAssignment(). The problem is that this copy is performed only at the assignment location, where the source symbol may not yet have processed the .localentry directive that sets the local entry. This may cause the destination symbol to end up with wrong local entry information. Other symbol info is not affected by this because, in this case, the destination symbol value is actually a symbol reference. This change keeps track of these assignments, and updates all needed st_other fields when finish() is called. Patch by Leandro Lupori! Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D56586 llvm-svn: 361237	2019-05-21 10:41:25 +00:00
Florian Hahn	4a8835c655	[AArch64] Skip mask checks for masks with an odd number of elements. Some checks in isShuffleMaskLegal expect an even number of elements, e.g. isTRN_v_undef_Mask or isUZP_v_undef_Mask, otherwise they access invalid elements and crash. This patch adds checks to the impacted functions. Fixes PR41951 Reviewers: t.p.northover, dmgreen, samparker Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D60690 llvm-svn: 361235	2019-05-21 10:05:26 +00:00
Cullen Rhodes	7f47b75d18	[AArch64][SVE2] Asm: add integer unary instructions (predicated) Summary: Patch adds support for the following instructions: * URECPE, URSQRTE, SQABS, SQNEG The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62129 llvm-svn: 361230	2019-05-21 09:06:51 +00:00
Cullen Rhodes	e798e8d9d2	[AArch64][SVE2] Asm: add integer pairwise arithmetic instructions Summary: Patch adds support for the following instructions: ADDP, SMAXP, UMAXP, SMINP, UMINP The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62128 llvm-svn: 361229	2019-05-21 08:59:00 +00:00
Sam Parker	3141bbd52d	[ARM][CGP] Skip nuw in PrepareConstants PrepareConstants step converts add/sub with 'negative' immediates to sub/add with a 'positive' imm to make promotion more simple. nuw already states that the add shouldn't cause an unsigned wrap, so it shouldn't need any tweaking. Plus, we also don't allow a sub with a 'negative' immediate to be safe wrap, so this functionality has been removed. The PrepareConstants step now just handles the add instructions that we've determined would be safe if they wrap around zero. Differential Revision: https://reviews.llvm.org/D62057 llvm-svn: 361227	2019-05-21 07:56:47 +00:00
Dylan McKay	e967308da4	Add TargetLoweringInfo hook for explicitly setting the ABI calling convention endianness Summary: The endianness used in the calling convention does not always match the endianness of the target on all architectures, namely AVR. When an argument is too large to be legalised by the architecture and is split for the ABI, a new hook TargetLoweringInfo::shouldSplitFunctionArgumentsAsLittleEndian is queried to find the endianness that function arguments must be laid out in. This approach was recommended by Eli Friedman. Originally reported in https://github.com/avr-rust/rust/issues/129. Patch by Carl Peto. Reviewers: bogner, t.p.northover, RKSimon, niravd, efriedma Reviewed By: efriedma Subscribers: JDevlieghere, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62003 llvm-svn: 361222	2019-05-21 06:38:02 +00:00
Chen Zheng	c4c407a0eb	[PowerPC] use more meaningful name - NFC llvm-svn: 361218	2019-05-21 03:54:42 +00:00
Lang Hames	f088e195cc	[ORC] Assert that JITDylibs have unique names. Patch by Praveen Velliengiri. Thanks Praveen! Differential Revision: https://reviews.llvm.org/D62139 llvm-svn: 361215	2019-05-21 03:23:08 +00:00
Nick Desaulniers	28e351af2a	[ORC] fix use-after-move. NFC Summary: scan-build flagged a potential use-after-move in debug builds. It's not safe that a moved from value contains anything but garbage. Manually DRY up these repeated expressions. Reviewers: lhames Reviewed By: lhames Subscribers: hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62112 llvm-svn: 361203	2019-05-20 22:17:43 +00:00
Matt Arsenault	6dd08e335f	AMDGPU: Force skip branches over calls Unfortunately the way SIInsertSkips works is backwards, and is required for correctness. r338235 added handling of some special cases where skipping is mandatory to avoid side effects if no lanes are active. It conservatively handled asm correctly, but the same logic needs to apply to calls. Usually the call sequence code is larger than the skip threshold, although the way the count is computed is really broken, so I'm not sure if anything was likely to really hit this. llvm-svn: 361202	2019-05-20 22:04:42 +00:00
Lang Hames	0dcf69eb82	[ORC] Remove some unreachable code. Fixes http://llvm.org/PR41662. llvm-svn: 361199	2019-05-20 21:30:33 +00:00
Cameron McInally	8bec58d5f7	[NFC][InstCombine] Add FIXME for one-use check on constant negation transforms. llvm-svn: 361197	2019-05-20 21:00:42 +00:00
Lang Hames	93d2bdda6b	[Support] Renamed member 'Size' to 'AllocatedSize' in MemoryBlock and OwningMemoryBlock. Rename member 'Size' to 'AllocatedSize' in order to provide a hint that the allocated size may be different than the requested size. Comments are added to clarify this point. Updated the InMemoryBuffer in FileOutputBuffer.cpp to track the requested buffer size. Patch by Machiel van Hooren. Thanks Machiel! https://reviews.llvm.org/D61599 llvm-svn: 361195	2019-05-20 20:53:05 +00:00
Martin Storsjo	4ed18e5ef5	[AArch64] Handle lowering lround on windows, where long is 32 bit Differential Revision: https://reviews.llvm.org/D62108 llvm-svn: 361192	2019-05-20 19:53:28 +00:00
Cameron McInally	2557ca296a	[InstCombine] Add visitFNeg(...) visitor for unary Fneg Also, break out a helper function, namely foldFNegIntoConstant(...), which performs transforms common between visitFNeg(...) and visitFSub(...). Differential Revision: https://reviews.llvm.org/D61693 llvm-svn: 361188	2019-05-20 19:10:30 +00:00
Sanjay Patel	63fa690617	[InstSimplify] update stale comment; NFC Missed this diff with rL361118. llvm-svn: 361180	2019-05-20 17:52:18 +00:00
Craig Topper	97d4f7c194	[SelectionDAGBuilder] Flush PendingExports before creating INLINEASM_BR node for asm goto. Since INLINEASM_BR is a terminator we need to flush the pending exports before emitting it. If we don't do this, a TokenFactor can be inserted between it and the BR instruction emitted to finish the callbr lowering. It looks like nodes are glued to the INLINEASM_BR so I had to make sure we emit the TokenFactor before that. Differential Revision: https://reviews.llvm.org/D59981 llvm-svn: 361177	2019-05-20 17:08:02 +00:00
Nick Desaulniers	bf940622c8	[DWARF] hoist nullptr checks. NFC Summary: This was flagged in https://www.viva64.com/en/b/0629/ under "Snippet No. 15" (see under #13). It looks like PVS studio flags nullptr checks where the ptr is used in between creation and checking against nullptr. Reviewers: JDevlieghere, probinson Reviewed By: JDevlieghere Subscribers: RKSimon, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D62118 llvm-svn: 361176	2019-05-20 16:58:59 +00:00
Craig Topper	cac6b76a76	[X86] Add icelake-client and tremont model numbers to getHostCPUName. llvm-svn: 361174	2019-05-20 16:58:23 +00:00
Nick Desaulniers	639b29b1b5	[INLINER] allow inlining of blockaddresses if sole uses are callbrs Summary: It was supposed that Ref LazyCallGraph::Edge's were being inserted by inlining, but that doesn't seem to be the case. Instead, it seems that there was no test for a blockaddress Constant in an instruction that referenced the function that contained the instruction. Ex: ``` define void @f() { %1 = alloca i8, align 8 2: store i8 blockaddress(@f, %2), i8** %1, align 8 ret void } ``` When iterating blockaddresses, do not add the function they refer to back to the worklist if the blockaddress is referring to the contained function (as opposed to an external function). Because blockaddress has slightly different semantics than GNU C's address of labels, there are 3 cases that can occur with blockaddress, where only 1 can happen in GNU C due to C's scoping rules: * blockaddress is within the function it refers to (possible in GNU C). * blockaddress is within a different function than the one it refers to (not possible in GNU C). * blockaddress is used to declare a global (not possible in GNU C). The second case is tested in: ``` $ ./llvm/build/unittests/Analysis/AnalysisTests \ --gtest_filter=LazyCallGraphTest.HandleBlockAddress ``` This patch adjusts the iteration of blockaddresses in LazyCallGraph::visitReferences to not revisit the blockaddress's function in the first case. The Linux kernel contains code that's not semantically valid at -O0; specifically code passed to asm goto. It requires that asm goto be inline-able. This patch conservatively does not attempt to handle the more general case of inlining blockaddresses that have non-callbr users (pr/39560). 
https://bugs.llvm.org/show_bug.cgi?id=39560 https://bugs.llvm.org/show_bug.cgi?id=40722 https://github.com/ClangBuiltLinux/linux/issues/6 https://reviews.llvm.org/rL212077 Reviewers: jyknight, eli.friedman, chandlerc Reviewed By: chandlerc Subscribers: george.burgess.iv, nathanchance, mgorny, craig.topper, mengxu.gatech, void, mehdi_amini, E5ten, chandlerc, efriedma, eraman, hiraditya, haicheng, pirama, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D58260 llvm-svn: 361173	2019-05-20 16:48:09 +00:00
Bjorn Pettersson	eee0f2330d	[AMDGPU] Fix std::array initializers to avoid warnings with older tool chains. NFC A std::array is implemented as a template with an array inside a struct. Older versions of clang, like 3.6, require an extra set of curly braces around std::array initializations to avoid warnings. The C++ language was changed regarding this by CWG 1270. So more modern tool chains do not complain even when one level of braces is left out. llvm-svn: 361171	2019-05-20 16:41:08 +00:00
Craig Topper	af7a188453	[Intrinsics] Merge lround.i32 and lround.i64 into a single intrinsic with overloaded result type. Make result type for llvm.llround overloaded instead of fixing to i64 We shouldn't really make assumptions about possible sizes for long and long long. And longer term we should probably support vectorizing these intrinsics. By making the result types not fixed we can support vectors as well. Differential Revision: https://reviews.llvm.org/D62026 llvm-svn: 361169	2019-05-20 16:27:09 +00:00
Craig Topper	203bfdd0f0	[DAGCombiner] Refactor code in visitShiftByConstant slightly to make it more readable. NFC This changes the isShift variable to include the constant operand check that was previously in the if statement. While there fix an 80 column violation and an unnecessary use of getNode. Also fix variable name capitalization. llvm-svn: 361168	2019-05-20 16:26:55 +00:00
Matt Arsenault	5239298b0d	R600: Fix unconditional return in loop llvm-svn: 361167	2019-05-20 16:22:11 +00:00
Nikita Popov	9060b6df97	[SDAG] Vector op legalization for overflow ops Fixes issue reported by aemerson on D57348. Vector op legalization support is added for uaddo, usubo, saddo and ssubo (umulo and smulo were already supported). As usual, by extracting TargetLowering methods and calling them from vector op legalization. Vector op legalization doesn't really deal with multiple result nodes, so I'm explicitly performing a recursive legalization call on the result value that is not being legalized. There are some existing test changes because expansion happens earlier, so we don't get a DAG combiner run in between anymore. Differential Revision: https://reviews.llvm.org/D61692 llvm-svn: 361166	2019-05-20 16:09:22 +00:00
Matt Arsenault	7c8ec18964	RegAlloc: Fix verifier error with undef identity copies The code did not match the example in the comment, and was checking the undef flag on the copy dest instead of source. The existing tests were only hitting the > 2 operands case. llvm-svn: 361156	2019-05-20 14:09:36 +00:00
Cullen Rhodes	523789fa6b	[AArch64][SVE2] Asm: add SADALP and UADALP instructions Summary: This patch adds support for the integer pairwise add and accumulate long instructions SADALP/UADALP. These instructions are predicated. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D62001 llvm-svn: 361154	2019-05-20 13:50:15 +00:00
Cameron McInally	2d2a46db8e	[InstSimplify] Teach fsub -0.0, (fneg X) ==> X about unary fneg Differential Revision: https://reviews.llvm.org/D62077 llvm-svn: 361151	2019-05-20 13:13:35 +00:00
Orlando Cazalet-Hyams	ed67bf8d2f	Resubmit "[DebugInfo] Update loop metadata for inlined loops" This reverts commit `95805bc425`. I've squashed the test fix into this commit. [DebugInfo] Update loop metadata for inlined loops Currently, when a loop is cloned while inlining function (A) into function (B) the loop metadata is copied and then not modified at all. The loop metadata can encode the loop's start and end DILocations. Therefore, the new inlined loop in function (B) may have loop metadata which shows start and end locations residing in function (A). This patch ensures loop metadata is updated while inlining so that the start and end DILocations are given the "inlinedAt" operand. I've also added a regression test for this. This fix is required for D60831 because that patch uses loop metadata to determine the DILocation for the branches of new loop preheaders. Reviewers: aprantl, dblaikie, anemet Reviewed By: aprantl Subscribers: eraman, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D61933 llvm-svn: 361149	2019-05-20 13:02:30 +00:00
Orlando Cazalet-Hyams	95805bc425	Revert "[DebugInfo] Update loop metadata for inlined loops" This reverts commit `6e8f1a80cd`. Reverting patch while investigating build bot failure. llvm-svn: 361143	2019-05-20 11:24:39 +00:00
Guillaume Chatelet	e386a01e84	[NFC] Refactor visitIntrinsicCall so it doesn't return a const char* Summary: API simplification Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61306 llvm-svn: 361140	2019-05-20 11:01:30 +00:00
Petar Jovanovic	e85bbf564d	[DebugInfoMetadata] Refactor DIExpression::prepend constants (NFC) Refactor DIExpression::With* into a flag enum in order to be less error-prone to use (as discussed on D60866). Patch by Djordje Todorovic. Differential Revision: https://reviews.llvm.org/D61943 llvm-svn: 361137	2019-05-20 10:35:57 +00:00
Cullen Rhodes	96c5929926	[AArch64][SVE2] Asm: add int halving add/sub (predicated) instructions Summary: This patch adds support for the predicated integer halving add/sub instructions: * SHADD, UHADD, SRHADD, URHADD * SHSUB, UHSUB, SHSUBR, UHSUBR The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D62000 llvm-svn: 361136	2019-05-20 10:35:23 +00:00
Cullen Rhodes	0fc6347b35	[AArch64][SVE2] Asm: add saturating multiply-add interleaved long instructions Summary: Patch adds support for SQDMLALBT and SQDMLSLBT instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61998 llvm-svn: 361135	2019-05-20 10:29:48 +00:00
Fangrui Song	68774edcd6	Use llvm::sort. NFC llvm-svn: 361134	2019-05-20 10:18:35 +00:00
Sander de Smalen	f83cccf917	Match types of accumulator and result for llvm.experimental.vector.reduce.fadd/fmul The scalar start/accumulator value of the fadd- and fmul reduction should match the result type of the reduction, as well as the vector element-type of the input vector. Although this was not explicitly specified in the LangRef, it was taken for granted in code implementing the reductions. The patch also fixes the LangRef by adding this constraint. Reviewed By: aemerson, nikic Differential Revision: https://reviews.llvm.org/D60260 llvm-svn: 361133	2019-05-20 09:54:06 +00:00
Orlando Cazalet-Hyams	6e8f1a80cd	[DebugInfo] Update loop metadata for inlined loops Summary: Currently, when a loop is cloned while inlining function (A) into function (B) the loop metadata is copied and then not modified at all. The loop metadata can encode the loop's start and end DILocations. Therefore, the new inlined loop in function (B) may have loop metadata which shows start and end locations residing in function (A). This patch ensures loop metadata is updated while inlining so that the start and end DILocations are given the "inlinedAt" operand. I've also added a regression test for this. This fix is required for D60831 because that patch uses loop metadata to determine the DILocation for the branches of new loop preheaders. Reviewers: aprantl, dblaikie, anemet Reviewed By: aprantl Subscribers: eraman, hiraditya, llvm-commits Tags: #debug-info, #llvm Differential Revision: https://reviews.llvm.org/D61933 llvm-svn: 361132	2019-05-20 09:40:44 +00:00
Guillaume Chatelet	a760e69840	Revert "[NFC] Refactor visitIntrinsicCall so it doesn't return a const char*" This reverts commit 706d3cd6388cc3446aab282f3af879862b10cbed. llvm-svn: 361130	2019-05-20 09:00:12 +00:00
Guillaume Chatelet	fa8c152576	[NFC] Refactor visitIntrinsicCall so it doesn't return a const char* Summary: API simplification Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61306 llvm-svn: 361129	2019-05-20 08:52:10 +00:00
Carl Ritson	34e95ce259	[AMDGPU] gfx1010 Avoid SMEM WAR hazard for some s_waitcnt values Summary: Avoid introducing hazard mitigation when lgkmcnt is reduced to 0. Clarify code comments to explain assumptions made for this hazard mitigation. Expand and correct test cases to cover variants of s_waitcnt. Reviewers: nhaehnle, rampitec Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D62058 llvm-svn: 361124	2019-05-20 07:20:12 +00:00
Sanjay Patel	9ef99b4b11	[InstSimplify] fold fcmp (maxnum, X, C1), C2 This is the sibling transform for rL360899 (D61691): maxnum(X, GreaterC) == C --> false maxnum(X, GreaterC) <= C --> false maxnum(X, GreaterC) < C --> false maxnum(X, GreaterC) >= C --> true maxnum(X, GreaterC) > C --> true maxnum(X, GreaterC) != C --> true llvm-svn: 361118	2019-05-19 14:26:39 +00:00
Dinar Temirbulatov	2ff72f6654	[SLP] Refactoring of EdgeInfo and UserTreeIdx in buildTree_rec(). This is a follow-up refactoring patch after the introduction of usable TreeEntry pointers in D61706. The EdgeInfo struct can now use a TreeEntry pointer instead of an index in VectorizableTree. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61795 llvm-svn: 361110	2019-05-19 01:30:41 +00:00
Craig Topper	3164b50af7	[X86] Remove combineShift function. Just dispatch directly to the handler for each flavor from the main switch. NFC llvm-svn: 361108	2019-05-19 01:01:46 +00:00
Matt Arsenault	b04f3258dd	GVN: Handle addrspacecast llvm-svn: 361103	2019-05-18 14:36:06 +00:00
Simon Pilgrim	2b45a70fd6	MemCmpExpansion::getCompareLoadPairs - assert we find a comparison diff. NFCI. Fix scan-build uninitialized warning and assert the final diff isn't null. llvm-svn: 361095	2019-05-18 11:31:48 +00:00
Matt Arsenault	2f29220d6d	AMDGPU/GlobalISel: Implement s64->s64 [SU]ITOFP llvm-svn: 361082	2019-05-17 23:05:18 +00:00
Matt Arsenault	02b5ca8cd1	GlobalISel: Implement lower for S64->S32 [SU]ITOFP This is ported from the custom AMDGPU DAG implementation. I think this is a better default expansion than what the DAG currently uses, at least if the target has CTLZ. This implements the signed version in terms of the unsigned conversion, which is implemented with bit operations. SelectionDAG has several other implementations that should eventually be ported depending on what instructions are legal. llvm-svn: 361081	2019-05-17 23:05:13 +00:00
Sam Clegg	13717bd54b	[WebAssembly] Remove expected failure of builtin-location.C test This seems to have been fixed by https://reviews.llvm.org/D61956 Yay Differential Revision: https://reviews.llvm.org/D62075 llvm-svn: 361071	2019-05-17 19:55:17 +00:00
Matt Arsenault	f3cedf4823	GlobalISel: Define integer min/max instructions Doesn't attempt to emit them for anything yet, but some legalizations I want to port use them. llvm-svn: 361061	2019-05-17 18:36:31 +00:00
Sanjay Patel	926e47751b	[InstCombine] move bitcast after insertelement-with-bitcasted-operands llvm-svn: 361058	2019-05-17 18:06:12 +00:00
Simon Pilgrim	065431c82b	[X86][SSE] Fold movmsk(not(x)) -> not(movmsk) Helps to improve folding of comparisons with movmsk results. llvm-svn: 361056	2019-05-17 17:56:25 +00:00
Simon Pilgrim	2c2f8e74b9	[X86][SSE] Match all-of bool scalar reductions into a bitcast/movmsk + cmp. Same as what we do for vector reductions in combineHorizontalPredicateResult, use movmsk+cmp for scalar (and(extract(x,0),extract(x,1)) reduction patterns. llvm-svn: 361052	2019-05-17 17:25:55 +00:00
Cameron McInally	067e946859	[InstSimplify] Add unary fneg to `fsub 0.0, (fneg X) ==> X` transform Differential Revision: https://reviews.llvm.org/D62013 llvm-svn: 361047	2019-05-17 16:47:00 +00:00
Dmitry Preobrazhensky	198611b0ff	[AMDGPU][MC] Corrected parsing of NAME:VALUE modifiers See bug 41298: https://bugs.llvm.org/show_bug.cgi?id=41298 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D61009 llvm-svn: 361045	2019-05-17 16:04:17 +00:00
Roman Lebedev	64c756b991	[DAGCombiner] visitShiftByConstant(): drop bogus signbit check Summary: That check claims that the transform is illegal otherwise. That isn't true: 1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter https://rise4fun.com/Alive/K4A 2. For `ISD::AND`, there is no restriction on constants: https://rise4fun.com/Alive/Wy3 3. For `ISD::OR`, there is no restriction on constants: https://rise4fun.com/Alive/GOH 4. For `ISD::XOR`, there is no restriction on constants: https://rise4fun.com/Alive/ml6 So, why is it there then? This changes the testcase that was touched by @spatel in rL347478, but I'm not sure that test tests anything in particular? Reviewers: RKSimon, spatel, craig.topper, jojo, rengolin Reviewed By: spatel Subscribers: javed.absar, llvm-commits, spatel Tags: #llvm Differential Revision: https://reviews.llvm.org/D61918 llvm-svn: 361044	2019-05-17 15:52:58 +00:00
Roman Lebedev	3275060fe8	[InstCombine] canShiftBinOpWithConstantRHS(): drop bogus signbit check Summary: In D61918 I was looking at dropping it in DAGCombiner `visitShiftByConstant()`, but as @craig.topper pointed out, it was copied from here. That check claims that the transform is illegal otherwise. That isn't true: 1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter https://rise4fun.com/Alive/K4A 2. For `ISD::AND`, there is no restriction on constants: https://rise4fun.com/Alive/Wy3 3. For `ISD::OR`, there is no restriction on constants: https://rise4fun.com/Alive/GOH 4. For `ISD::XOR`, there is no restriction on constants: https://rise4fun.com/Alive/ml6 So, why is it there then? As far as I can tell, it dates all the way back to original check-in rL7793. I think we should just drop it. Reviewers: spatel, craig.topper, efriedma, majnemer Reviewed By: spatel Subscribers: llvm-commits, craig.topper Tags: #llvm Differential Revision: https://reviews.llvm.org/D61938 llvm-svn: 361043	2019-05-17 15:52:49 +00:00
Dmitry Preobrazhensky	5ae3113969	[AMDGPU][MC] Enabled labels with s_call_b64 and s_cbranch_i_fork See https://bugs.llvm.org/show_bug.cgi?id=41888 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D62016 llvm-svn: 361040	2019-05-17 14:57:04 +00:00
Simon Pilgrim	279314e81b	[X86][AVX] Remove LowerCTTZ's AVX1 custom vector handling. We can now rely on generic expansion to handle this. llvm-svn: 361038	2019-05-17 14:37:19 +00:00
Simon Pilgrim	62c7032c18	[X86][AVX] isNOT - add extract_subvector(xor X, -1) -> extract_subvector(X) fold. Prep work for the removal of the remaining x86 CTTZ vector lowering. llvm-svn: 361035	2019-05-17 14:04:56 +00:00
Dmitry Preobrazhensky	43fcc79837	[AMDGPU][MC] Enabled expressions for most operands which accept integer values See bug 40873: https://bugs.llvm.org/show_bug.cgi?id=40873 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60768 llvm-svn: 361031	2019-05-17 13:17:48 +00:00
Matt Arsenault	1a02d30c87	AMDGPU: Fix unused variable warnings in release builds llvm-svn: 361030	2019-05-17 12:59:27 +00:00
Matt Arsenault	a510b570c2	AMDGPU/GlobalISel: Legalize G_FCEIL llvm-svn: 361028	2019-05-17 12:20:05 +00:00
Matt Arsenault	6aebcd5499	AMDGPU/GlobalISel: Legalize G_INTRINSIC_TRUNC llvm-svn: 361027	2019-05-17 12:20:01 +00:00
Matt Arsenault	6aafc5e19d	AMDGPU/GlobalISel: Legalize G_FRINT llvm-svn: 361026	2019-05-17 12:19:57 +00:00
Matt Arsenault	1448f5689e	AMDGPU/GlobalISel: Legalize G_FCOPYSIGN llvm-svn: 361025	2019-05-17 12:19:52 +00:00
Clement Courbet	90900fbc9f	[MergeICmps][NFC] Add more debug. llvm-svn: 361024	2019-05-17 12:07:51 +00:00
Matt Arsenault	568f193847	AMDGPU/GlobalISel: RegBankSelect for llvm.amdgcn.s.buffer.load llvm-svn: 361023	2019-05-17 12:02:34 +00:00
Matt Arsenault	a3b5a386fa	AMDGPU/GlobalISel: Use subreg index instead of extra unmerge This saves instructions and extra steps, but I'm not sure about introducing subregister indexes at this point. llvm-svn: 361022	2019-05-17 12:02:31 +00:00
Matt Arsenault	b3dc73634c	AMDGPU/GlobalISel: Use waterfall loop for buffer_load This adds support for more complex waterfall loops that need to handle operands > 32-bits, and multiple operands. llvm-svn: 361021	2019-05-17 12:02:27 +00:00
Simon Pilgrim	a6d3bd486b	[X86] Pull out IsNOT helper. NFCI. Return the input value for the NOT pattern: (xor X, -1) -> X llvm-svn: 361012	2019-05-17 10:37:08 +00:00
Clement Courbet	632dfdda16	Re-land r360859: "[MergeICmps] Simplify the code." With a fix for PR41917: The predecessor list was changing under our feet. - for (BasicBlock Pred : predecessors(EntryBlock_)) { + while (!pred_empty(EntryBlock_)) { + BasicBlock const Pred = *pred_begin(EntryBlock_); llvm-svn: 361009	2019-05-17 09:43:45 +00:00
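The fix quoted in the commit above replaces a range-based loop with a drain loop, because handling a predecessor removes it from the very list being walked. The following is only a Python analogy, not the LLVM code: `detach_all`, `detach_all_buggy`, and the plain list standing in for the predecessor list are hypothetical names chosen for illustration.

```python
def detach_all(preds):
    """Drain the list: always take the current first element.

    `preds.remove(pred)` stands in for retargeting a predecessor's branch,
    which (by assumption here) removes it from the predecessor list.
    """
    visited = []
    while preds:                # analogous to `while (!pred_empty(...))`
        pred = preds[0]         # analogous to `*pred_begin(...)`
        preds.remove(pred)
        visited.append(pred)
    return visited


def detach_all_buggy(preds):
    """Iterate and mutate at once: every other element is skipped."""
    visited = []
    for pred in preds:          # the list shrinks under the iterator
        preds.remove(pred)
        visited.append(pred)
    return visited
```

In Python the buggy form silently skips elements; in C++ the equivalent mistake invalidates the iterator, which is presumably what "changing under our feet" refers to.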
Rhys Perry	c4bc61bad7	[AMDGPU] detect WaW hazards when moving/merging load/store instructions Summary: In order to combine memory operations efficiently, the load/store optimizer might move some instructions around. It's usually safe to move instructions down past the merged instruction because the pass checks if memory operations can be re-ordered. Though, the current logic doesn't handle Write-after-Write hazards. This fixes a reflection issue with Monster Hunter World and DXVK. v2: - rebased on top of master - clean up the test case - handle WaW hazards correctly Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=40130 Original patch by Samuel Pitoiset. Reviewers: tpr, arsenm, nhaehnle Reviewed By: nhaehnle Subscribers: ronlieb, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D61313 llvm-svn: 361008	2019-05-17 09:32:23 +00:00
Cullen Rhodes	7f605c3550	[AArch64][SVE2] Asm: add saturating multiply-add long instructions Summary: Patch adds support for indexed and unpredicated vectors forms of the following instructions: * SQDMLALB, SQDMLALT, SQDMLSLB, SQDMLSLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D61997 llvm-svn: 361005	2019-05-17 09:29:43 +00:00
Cullen Rhodes	334130a199	[AArch64][SVE2] Asm: add integer multiply-add long instructions Summary: Patch adds support for indexed and unpredicated vectors forms of the following instructions: * SMLALB, SMLALT, UMLALB, UMLALT, SMLSLB, SMLSLT, UMLSLB, UMLSLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61951 llvm-svn: 361003	2019-05-17 09:19:41 +00:00
Cullen Rhodes	0d47f00821	[AArch64][SVE2] Asm: add integer multiply long instructions Summary: Patch adds support for indexed and unpredicated vectors forms of the following instructions: * SMULLB, SMULLT, UMULLB, UMULLT, SQDMULLB, SQDMULLT The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61936 llvm-svn: 361002	2019-05-17 09:04:44 +00:00
Craig Topper	ae1597d360	[X86] Add FeatureFastScalarShiftMasks and FeatureFastVectorShiftMasks to the ignore list for inlining compatibility. These are tuning flags and won't cause any codegen issue if we inline a function with a different value. llvm-svn: 360992	2019-05-17 06:40:21 +00:00
Fangrui Song	ad7199f3e6	[PowerPC] Support .reloc , R_PPC{,64}_NONE, This can be used to create references among sections. When --gc-sections is used, the referenced section will be retained if the origin section is retained. llvm-svn: 360990	2019-05-17 06:04:11 +00:00
Fangrui Song	ec6dc3089e	[GlobalISel] Fix -Wsign-compare on 32-bit -DLLVM_ENABLE_ASSERTIONS=on builds llvm-svn: 360989	2019-05-17 05:53:39 +00:00
Fangrui Song	e18a6ad0b8	[MC][PowerPC] Clean up PPCAsmBackend Replace the member variable Target with Triple Use Triple instead of TheTarget.getName() to dispatch on 32-bit/64-bit. Delete redundant parameters llvm-svn: 360986	2019-05-17 05:44:26 +00:00
Ben Dunbobbin	1d16515fb4	[ELF] Implement Dependent Libraries Feature This patch implements a limited form of autolinking primarily designed to allow either the --dependent-library compiler option, or "comment lib" pragmas (https://docs.microsoft.com/en-us/cpp/preprocessor/comment-c-cpp?view=vs-2017) in C/C++ e.g. #pragma comment(lib, "foo"), to cause an ELF linker to automatically add the specified library to the link when processing the input file generated by the compiler. Currently this extension is unique to LLVM and LLD. However, care has been taken to design this feature so that it could be supported by other ELF linkers. The design goals were to provide: - A simple linking model for developers to reason about. - The ability to override autolinking from the linker command line. - Source code compatibility, where possible, with "comment lib" pragmas in other environments (MSVC in particular). Dependent library support is implemented differently for ELF platforms than on the other platforms. Primarily this difference is that on ELF we pass the dependent library specifiers directly to the linker without manipulating them. This is in contrast to other platforms where they are mapped to a specific linker option by the compiler. This difference is a result of the greater variety of ELF linkers and the fact that ELF linkers tend to handle libraries in a more complicated fashion than on other platforms. This forces us to defer handling the specifiers to the linker. In order to achieve a level of source code compatibility with other platforms we have restricted this feature to work with libraries that meet the following "reasonable" requirements: 1. There are no competing defined symbols in a given set of libraries, or if they exist, the program owner doesn't care which is linked to their program. 2. There may be circular dependencies between libraries.
The binary representation is a mergeable string section (SHF_MERGE, SHF_STRINGS), called .deplibs, with custom type SHT_LLVM_DEPENDENT_LIBRARIES (0x6fff4c04). The compiler forms this section by concatenating the arguments of the "comment lib" pragmas and --dependent-library options in the order they are encountered. Partial (-r, -Ur) links are handled by concatenating .deplibs sections with the normal mergeable string section rules. As an example, #pragma comment(lib, "foo") would result in: .section ".deplibs","MS",@llvm_dependent_libraries,1 .asciz "foo" For LTO, equivalent information to the contents of the .deplibs section can be retrieved by LLD for bitcode input files. LLD processes the dependent library specifiers in the following way: 1. Dependent libraries which are found from the specifiers in .deplibs sections of relocatable object files are added when the linker decides to include that file (which could itself be in a library) in the link. Dependent libraries behave as if they were appended to the command line after all other options. As a consequence the set of dependent libraries is searched last to resolve symbols. 2. It is an error if a file cannot be found for a given specifier. 3. Any command line options in effect at the end of the command line parsing apply to the dependent libraries, e.g. --whole-archive. 4. The linker tries to add a library or relocatable object file from each of the strings in a .deplibs section by: first, handling the string as if it was specified on the command line; second, by looking for the string in each of the library search paths in turn; third, by looking for a lib<string>.a or lib<string>.so (depending on the current mode of the linker) in each of the library search paths. 5. A new command line option --no-dependent-libraries tells LLD to ignore the dependent libraries. Rationale for the above points: 1. Adding the dependent libraries last makes the process simple to understand from a developer's perspective.
All linkers are able to implement this scheme. 2. Error-ing for libraries that are not found seems like better behavior than failing the link during symbol resolution. 3. It seems useful for the user to be able to apply command line options which will affect all of the dependent libraries. There is a potential problem of surprise for developers, who might not realize that these options would apply to these "invisible" input files; however, despite the potential for surprise, this is easy for developers to reason about and gives developers the control that they may require. 4. This algorithm takes into account all of the different ways that ELF linkers find input files. The different search methods are tried by the linker in most obvious to least obvious order. 5. I considered adding finer grained control over which dependent libraries were ignored (e.g. MSVC has /nodefaultlib:<library>); however, I concluded that this is not necessary: if finer control is required developers can fall back to using the command line directly. RFC thread: http://lists.llvm.org/pipermail/llvm-dev/2019-March/131004.html. Differential Revision: https://reviews.llvm.org/D60274 llvm-svn: 360984	2019-05-17 03:44:15 +00:00
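The three-step lookup in point 4 of the commit above can be sketched as follows. This is an illustrative model, not LLD's implementation: `resolve_deplib`, the `existing` set that stands in for the filesystem, and the `static` flag that picks between `.a` and `.so` are all assumptions made for the example.

```python
import posixpath


def resolve_deplib(spec, search_paths, existing, static=False):
    """Return the first path matching `spec`, following the order in point 4.

    `existing` is a set of paths assumed to exist (a stand-in for the
    filesystem); `static` is a hypothetical model of the linker mode.
    """
    # First: treat the string as if it were given on the command line.
    if spec in existing:
        return spec
    # Second: look for the string itself in each library search path in turn.
    for directory in search_paths:
        candidate = posixpath.join(directory, spec)
        if candidate in existing:
            return candidate
    # Third: look for lib<string>.a or lib<string>.so in each search path.
    libname = f"lib{spec}.a" if static else f"lib{spec}.so"
    for directory in search_paths:
        candidate = posixpath.join(directory, libname)
        if candidate in existing:
            return candidate
    # Point 2: a specifier that cannot be found is an error.
    raise FileNotFoundError(f"unable to find library from dependent library specifier: {spec}")
```

Raising as soon as the lookup fails mirrors the rationale given for point 2: the failure is reported at the specifier rather than later during symbol resolution.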
Fangrui Song	2463239777	[X86] Support .reloc , R_{386,X86_64}_NONE, This can be used to create references among sections. When --gc-sections is used, the referenced section will be retained if the origin section is retained. See R_MIPS_NONE (D13659), R_ARM_NONE (D61992), R_AARCH64_NONE (D61973) for similar changes. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D62014 llvm-svn: 360983	2019-05-17 03:25:39 +00:00
Fangrui Song	aa6102ad8e	[AArch64] Support .reloc , R_AARCH64_NONE, Summary: This can be used to create references among sections. When --gc-sections is used, the referenced section will be retained if the origin section is retained. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D61973 llvm-svn: 360981	2019-05-17 03:05:07 +00:00
Fangrui Song	43ca0e9eb8	[ARM] Support .reloc , R_ARM_NONE, R_ARM_NONE can be used to create references among sections. When --gc-sections is used, the referenced section will be retained if the origin section is retained. Add a generic MCFixupKind FK_NONE as this kind of no-op relocation is ubiquitous on ELF and COFF, and probably available on many other binary formats. See D62014. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D61992 llvm-svn: 360980	2019-05-17 02:51:54 +00:00
Philip Reames	a74d654374	[LFTR] Strengthen assertions in genLoopLimit [NFCI] llvm-svn: 360978	2019-05-17 02:18:03 +00:00
Philip Reames	45e7690796	[IndVars] Don't reimplement Loop::isLoopInvariant [NFC] Using dominance vs a set membership check is indistinguishable from a compile time perspective, and the two queries return equivalent results. Simplify code by using the existing function. llvm-svn: 360976	2019-05-17 02:09:03 +00:00
Philip Reames	8e169cd266	[LFTR] Factor out a helper function for readability purpose [NFC] llvm-svn: 360972	2019-05-17 01:39:58 +00:00
Philip Reames	9283f1847c	Clarify comments on helpers used by LFTR [NFC] I'm slowly wrapping my head around this code, and am making comment improvements where I can. llvm-svn: 360968	2019-05-17 01:12:02 +00:00
Jonas Paulsson	9427961c89	[SystemZ] Bugfix in SystemZTargetLowering::combineIntDIVREM() Make sure to not unroll a vector division/remainder (with a constant splat divisor) after type legalization, since the scalar type may then be illegal. Review: Ulrich Weigand https://reviews.llvm.org/D62036 llvm-svn: 360965	2019-05-17 00:50:35 +00:00
Nico Weber	d764e7c660	Revert r360859: "Reland r360771 "[MergeICmps] Simplify the code."" It caused PR41917. llvm-svn: 360963	2019-05-17 00:43:53 +00:00
David L. Jones	4a5e01faa4	[X86][AsmParser] Add mnemonics missed in r360954. These are valid Jcc, but aren't based on the EFLAGS condition codes (Intel 64 and IA-32 Architectures Software Developer's Manual Vol. 1, Appendix B). These are covered in clang/test, but not llvm/test. llvm-svn: 360960	2019-05-17 00:19:20 +00:00
Evgeniy Stepanov	7f281b2c06	HWASan exception support. Summary: Adds a call to __hwasan_handle_vfork(SP) at each landingpad entry. Reusing __hwasan_handle_vfork instead of introducing a new runtime call in order to be ABI-compatible with old runtime library. Reviewers: pcc Subscribers: kubamracek, hiraditya, #sanitizers, llvm-commits Tags: #sanitizers, #llvm Differential Revision: https://reviews.llvm.org/D61968 llvm-svn: 360959	2019-05-16 23:54:41 +00:00
David L. Jones	add7ed2281	[X86][AsmParser] Ignore "short" even harder in Intel syntax ASM. In Intel syntax, it's not uncommon to see a "short" modifier on Jcc conditional jumps, which indicates the offset should be a "short jump" (8-bit immediate offset from EIP, -128 to +127). This patch expands to all recognized Jcc condition codes, and removes the inline restriction. Clang already ignores "jmp short" in inline assembly. However, only "jmp" and a couple of Jcc are actually checked, and only inline (i.e., not when using the integrated assembler for asm sources). A quick search through asm-containing libraries at hand shows a pretty broad range of Jcc conditions spelled with "short." GAS ignores the "short" modifier, and instead uses an encoding based on the given immediate. MS inline seems to do the same, and I suspect MASM does, too. NASM will yield an error if presented with an out-of-range immediate value. Example of GCC 9.1 and MSVC v19.20, "jmp short" with offsets that do and do not fit within 8 bits: https://gcc.godbolt.org/z/aFZmjY Differential Revision: https://reviews.llvm.org/D61990 llvm-svn: 360954	2019-05-16 23:27:07 +00:00
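The commit above contrasts two policies for a "short" jump whose offset may not fit in 8 bits: pick the encoding from the immediate (as described for GAS) or reject the out-of-range value (as described for NASM). A minimal sketch of both, using hypothetical function names and the labels "rel8"/"rel32" purely for illustration:

```python
def fits_rel8(offset: int) -> bool:
    """True if the offset fits a signed 8-bit immediate (-128 to +127)."""
    return -128 <= offset <= 127


def encoding_ignoring_short(offset: int) -> str:
    """Ignore the 'short' modifier and choose the encoding from the offset."""
    return "rel8" if fits_rel8(offset) else "rel32"


def encoding_enforcing_short(offset: int) -> str:
    """Honor 'short' strictly: an out-of-range offset is an error."""
    if not fits_rel8(offset):
        raise ValueError(f"short jump offset {offset} is out of range")
    return "rel8"
```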
David L. Jones	11305984d0	[X86][AsmParser] Rename "ConditionCode" variable to "ConditionPredicate". This better matches the verbiage in Intel documentation, and should help avoid confusion between these two different kinds of values, both of which are parsed from mnemonics. llvm-svn: 360953	2019-05-16 23:27:05 +00:00
Reid Kleckner	08c15df29f	[X86] Deduplicate symbol lowering logic, NFC Summary: This refactors four pieces of code that create SDNodes for references to symbols: - normal global address lowering (LEA, MOV, etc) - callee global address lowering (CALL) - external symbol address lowering (LEA, MOV, etc) - external symbol address lowering (CALL) Each of these pieces of code need to: - classify the reference - lower the symbol - emit a RIP wrapper if needed - emit a load if needed - add offsets if needed I think handling them all in one place will make the code easier to maintain in the future. Reviewers: craig.topper, RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61690 llvm-svn: 360952	2019-05-16 23:15:26 +00:00
Amy Huang	c2029068bc	Emit global variables as S_CONSTANT records for codeview debug info. Summary: This emits S_CONSTANT records for global variables. Currently this emits records for the global variables already being tracked in the LLVM IR metadata, which are just constant global variables; we'll also want S_CONSTANTs for static data members and enums. Related to https://bugs.llvm.org/show_bug.cgi?id=41615 Reviewers: rnk Subscribers: aprantl, hiraditya, llvm-commits, thakis Tags: #llvm Differential Revision: https://reviews.llvm.org/D61926 llvm-svn: 360948	2019-05-16 22:28:52 +00:00
Tim Renouf	e3cbdaf1b5	[CodeGen] Fixed de-optimization of legalize subvector extract The recent introduction of v3i32 etc as an MVT, and its use in AMDGPU 3-dword memory instructions, caused a de-optimization problem for code with such a load that then bitcasts via vector of i8, because v12i8 is not an MVT so it legalizes the bitcast by widening it. This commit adds the ability to widen a bitcast using extract_subvector on the result, so the value does not need to go via memory. Differential Revision: https://reviews.llvm.org/D60457 Change-Id: Ie4abb7760547e54a2445961992eafc78e80d4b64 llvm-svn: 360942	2019-05-16 21:49:06 +00:00
Lang Hames	c97b50e224	[ORC] Change handling for SymbolStringPtr tombstones and empty keys. SymbolStringPtr used to use nullptr as its empty value and (since it performed ref-count operations on any non-nullptr) a pointer to a special pool-entry instance as its tombstone. This commit changes the scheme to use two invalid pointer values as the empty and tombstone values, and broadens the ref-count guard to prevent ref-counting operations from being performed on these pointers. This should improve the performance of SymbolStringPtrs used in DenseMaps/DenseSets, as ref counting operations will no longer be performed on the tombstone. llvm-svn: 360925	2019-05-16 18:29:34 +00:00
Joerg Sonnenberger	ec6ee797ec	Fix typos in comment. llvm-svn: 360921	2019-05-16 18:01:57 +00:00
Craig Topper	f09b9d419f	[X86] Use 0x9 instead of 0x1 as the immediate in some masked floor pattern. Similarly change 0x2 to 0xA for ceil. This suppresses exceptions which is what we should be doing for ceil and floor. We already use the correct immediate in patterns without masking. llvm-svn: 360915	2019-05-16 16:53:50 +00:00
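The immediates in the commit above differ only in one bit. As a sketch, assuming the usual imm8 layout of the SSE4.1/AVX rounding instructions (low two bits select the rounding direction, bit 3 suppresses the precision exception), the values compose as shown below; the constant and function names are hypothetical.

```python
ROUND_TOWARD_NEG_INF = 0x1   # floor (assumed: imm8 bits 1:0 = 01)
ROUND_TOWARD_POS_INF = 0x2   # ceil  (assumed: imm8 bits 1:0 = 10)
SUPPRESS_EXCEPTIONS = 0x8    # assumed: imm8 bit 3 suppresses the precision exception


def rounding_imm(direction: int, suppress_exceptions: bool = True) -> int:
    """Combine a rounding direction with the exception-suppression bit."""
    return direction | (SUPPRESS_EXCEPTIONS if suppress_exceptions else 0)


floor_imm = rounding_imm(ROUND_TOWARD_NEG_INF)   # 0x9, the new floor immediate
ceil_imm = rounding_imm(ROUND_TOWARD_POS_INF)    # 0xA, the new ceil immediate
```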
Don Hinton	8249a8889d	[CommandLine] Don't allow duplicate categories. Summary: This is a fix to D61574, r360179, that allowed duplicate OptionCategory's. This change adds a check to make sure a category can only be added once even if the user passes it twice. Reviewed By: MaskRay Tags: #llvm Differential Revision: https://reviews.llvm.org/D61972 llvm-svn: 360913	2019-05-16 16:25:13 +00:00
Pavel Labath	2d29e16c30	Minidump: Add support for the MemoryList stream Summary: the stream format is exactly the same as for ThreadList and ModuleList streams, only the entry types are slightly different, so the changes in this patch are just straight-forward applications of established patterns. Reviewers: amccarth, jhenderson, clayborg Subscribers: markmentovai, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61885 llvm-svn: 360908	2019-05-16 15:17:30 +00:00
Matt Arsenault	99e6f4d11a	AMDGPU: Introduce TokenFactor for ABI register copies in call sequence The call was missing chain dependencies on the pre-call copies. I don't think this was causing any real issues however. llvm-svn: 360906	2019-05-16 15:10:27 +00:00
Matt Arsenault	df24c92c0f	AMDGPU: Assume xnack is enabled by default This is the conservatively correct default. It is always safe to assume xnack is enabled, but not the converse. Introduce a feature to blacklist targets where xnack can never be meaningfully enabled. I'm not sure the set of targets this is applied to is 100% correct. llvm-svn: 360903	2019-05-16 14:48:34 +00:00
Stephen Tozer	6f59b4b6d9	Resubmit: [Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645 Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is now possible, and so can be used to fix the attached bug as well as any cases where SExt instruction results are lost in the debugging metadata. This patch introduces this fix by expanding the salvage debug info method to cover these cases using the new operator. Differential revision: https://reviews.llvm.org/D61184 llvm-svn: 360902	2019-05-16 14:41:01 +00:00
Sanjay Patel	152f81fae8	[InstSimplify] fold fcmp (minnum, X, C1), C2 minnum(X, LesserC) == C --> false minnum(X, LesserC) >= C --> false minnum(X, LesserC) > C --> false minnum(X, LesserC) != C --> true minnum(X, LesserC) <= C --> true minnum(X, LesserC) < C --> true maxnum siblings will follow if there are no problems here. We should be able to perform some other combines when the constants are equal or greater-than too, but that would go in instcombine. We might also generalize this by creating an FP ConstantRange (similar to what we do for integers). Differential Revision: https://reviews.llvm.org/D61691 llvm-svn: 360899	2019-05-16 14:03:10 +00:00
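The fold table in the commit above follows from one observation: when the constant inside minnum is less than the compared constant, the minnum result is at most that lesser constant and is never NaN. A small sketch that models this outside LLVM is given below; `minnum`, `fold_fcmp_minnum`, and the use of plain Python comparisons instead of LLVM's ordered/unordered fcmp predicates are simplifying assumptions.

```python
import math
import operator

OPS = {"==": operator.eq, "!=": operator.ne, "<": operator.lt,
       "<=": operator.le, ">": operator.gt, ">=": operator.ge}

# Results from the commit message for minnum(X, LesserC) <pred> C.
FOLDS = {"==": False, ">=": False, ">": False,
         "!=": True, "<=": True, "<": True}


def minnum(x: float, c: float) -> float:
    """minNum-style minimum: if one operand is NaN, return the other."""
    if math.isnan(x):
        return c
    if math.isnan(c):
        return x
    return min(x, c)


def fold_fcmp_minnum(pred: str, lesser_c: float, c: float):
    """Return the folded result, or None when the precondition fails."""
    if math.isnan(lesser_c) or math.isnan(c) or not lesser_c < c:
        return None
    return FOLDS[pred]
```

Comparing `fold_fcmp_minnum(pred, 1.0, 2.0)` with `OPS[pred](minnum(x, 1.0), 2.0)` for any `x`, including NaN and the infinities, gives the same answer for each of the six predicates.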
Xing Xue	2dee094a08	Fixes for builds that require strict X/Open and POSIX compatibility Summary: - Use alternative to MAP_ANONYMOUS for allocating mapped memory if it isn't available - Use strtok_r instead of strsep as part of getting program path - Don't try to find the width of a terminal using "struct winsize" and TIOCGWINSZ on POSIX builds. These aren't defined under POSIX (even though some platforms make them available when they shouldn't), so just check if we are doing an X/Open or POSIX compliant build first. Author: daltenty Reviewers: hubert.reinterpretcast, xingxue, andusy Reviewed By: hubert.reinterpretcast Subscribers: MaskRay, jsji, hiraditya, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61326 llvm-svn: 360898	2019-05-16 14:02:13 +00:00
Adhemerval Zanella	2d28db6b9f	[AArch64] Handle ISD::LROUND and ISD::LLROUND This patch optimizes ISD::LROUND and ISD::LLROUND to fcvtas instruction. It currently only handles the scalar version. llvm-svn: 360894	2019-05-16 13:30:18 +00:00
Fangrui Song	e183340c29	Recommit [Object] Change object::SectionRef::getContents() to return Expected<StringRef> r360876 didn't fix 2 call sites in clang. Expected<ArrayRef<uint8_t>> may be better but use Expected<StringRef> for now. Follow-up of D61781. llvm-svn: 360892	2019-05-16 13:24:04 +00:00
Adhemerval Zanella	73643b5041	[CodeGen] Add lround/llround builtins This patch adds ISD::LROUND and ISD::LLROUND along with new intrinsics. The changes are as straightforward as for other floating-point rounding functions, with just some adjustments required to handle the return value being an integer. The idea is to optimize lround/llround generation for AArch64 in a subsequent patch. The current semantics simply route the call to the libm symbol. llvm-svn: 360889	2019-05-16 13:15:27 +00:00
Matt Arsenault	828b685ebe	RegAllocFast: Improve hinting heuristic Trace through multiple COPYs when looking for a physreg source. Add hinting for vregs that will be copied into physregs (we only hinted for vregs getting copied to a physreg previously). Give a hinted register a bonus when deciding which value to spill. This is part of my rewrite regallocfast series. In fact this one doesn't even have an effect unless you also flip the allocation to happen from back to front of a basic block. Nonetheless it helps to split this up to ease review of D52010 Patch by Matthias Braun llvm-svn: 360887	2019-05-16 12:50:39 +00:00
Matt Arsenault	27ac8408f6	GlobalISel: Add DstOp version of buildIntrinsic llvm-svn: 360879	2019-05-16 12:22:56 +00:00
Hans Wennborg	4da9ff9fcf	Revert r360876 "[Object] Change object::SectionRef::getContents() to return Expected<StringRef>" It broke the Clang build, see llvm-commits thread. > Expected<ArrayRef<uint8_t>> may be better but use Expected<StringRef> for now. > > Follow-up of D61781. llvm-svn: 360878	2019-05-16 12:08:34 +00:00
Matt Arsenault	a8f88c388f	AMDGPU/GlobalISel: Correct regbank for 1-bit and/or/xor Bool values should use the scc/vcc regbank since r350611. llvm-svn: 360877	2019-05-16 12:06:41 +00:00
Fangrui Song	a076ec54be	[Object] Change object::SectionRef::getContents() to return Expected<StringRef> Expected<ArrayRef<uint8_t>> may be better but use Expected<StringRef> for now. Follow-up of D61781. llvm-svn: 360876	2019-05-16 11:33:48 +00:00
Cullen Rhodes	472c6ef8b0	[AArch64][SVE2] Asm: implement CMLA/SQRDCMLAH instructions Summary: This patch adds support for the indexed and unpredicated vectors forms of the CMLA and SQRDCMLAH instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer Differential Revision: https://reviews.llvm.org/D61906 llvm-svn: 360871	2019-05-16 09:42:22 +00:00
Cullen Rhodes	07eba98dd7	[AArch64][SVE2] Asm: implement CDOT instruction Summary: The complex DOT instructions perform a dot-product on quadtuplets from two source vectors and the resulting wide real or wide imaginary is accumulated into the destination register. The instructions come in two forms: Vector form, e.g. cdot z0.s, z1.b, z2.b, #90 - complex dot product on four 8-bit quad-tuplets, accumulating results in 32-bit elements. The complex numbers in the second source vector are rotated by 90 degrees. cdot z0.d, z1.h, z2.h, #180 - complex dot product on four 16-bit quad-tuplets, accumulating results in 64-bit elements. The complex numbers in the second source vector are rotated by 180 degrees. Indexed form, e.g. cdot z0.s, z1.b, z2.b[3], #0 - complex dot product on four 8-bit quad-tuplets, with specified quadtuplet from second source vector, accumulating results in 32-bit elements. cdot z0.d, z1.h, z2.h[1], #0 - complex dot product on four 16-bit quad-tuplets, with specified quadtuplet from second source vector, accumulating results in 64-bit elements. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer, rovka Differential Revision: https://reviews.llvm.org/D61903 llvm-svn: 360870	2019-05-16 09:33:44 +00:00
Cullen Rhodes	064f6ab556	[AArch64][SVE2] Asm: add unpredicated integer multiply instructions Summary: Add support for the following instructions: * MUL (indexed and unpredicated vectors forms) * SQDMULH (indexed and unpredicated vectors forms) * SQRDMULH (indexed and unpredicated vectors forms) * SMULH (unpredicated, predicated form added in SVE) * UMULH (unpredicated, predicated form added in SVE) * PMUL (unpredicated) The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: SjoerdMeijer, rovka Differential Revision: https://reviews.llvm.org/D61902 llvm-svn: 360867	2019-05-16 09:07:26 +00:00
Fangrui Song	3e92df3e39	Add Triple::isPPC64() llvm-svn: 360864	2019-05-16 08:31:22 +00:00
Clement Courbet	c4fdd717ef	Reland r360771 "[MergeICmps] Simplify the code." This revision does not seem to be the culprit. llvm-svn: 360859	2019-05-16 06:18:02 +00:00
Igor Kudrin	1ff8b7bdf1	[IRMover] Improve diagnostic messages for conflicting metadata This does for error messages what rL344011 did for warnings. With llvm::lto::LTO, the error might appear when LTO::run() is executed. In that case, the calling code cannot know which module causes the error and, subsequently, cannot hint the user. Differential Revision: https://reviews.llvm.org/D61880 llvm-svn: 360857	2019-05-16 05:23:13 +00:00
Matt Arsenault	11be78bc7a	GlobalISel: Add buildFConstant for APFloat llvm-svn: 360853	2019-05-16 04:09:06 +00:00
Matt Arsenault	012ecbbbba	GlobalISel: Fix indentation llvm-svn: 360851	2019-05-16 04:08:46 +00:00
Matt Arsenault	55146d3139	GlobalISel: Add G_FCOPYSIGN llvm-svn: 360850	2019-05-16 04:08:39 +00:00
Lang Hames	c2fb896522	[JITLink][MachO] Use getSymbol64TableEntry for 64-bit MachO files. Fixes a think-o. No test case: The nlist and nlist64 data structures happen to line up for this field, so there's no way to construct a failing test case. llvm-svn: 360830	2019-05-16 00:21:07 +00:00
Craig Topper	e43bdf144c	[X86] Delay creating index register negations during address matching until after we know for sure the match will succeed If we're trying to match an LEA, its possible the LEA match will be deemed unprofitable. In which case the negation we created in matchAddress would be left dangling in the SelectionDAG. This could artificially increase use counts for other nodes in the DAG. Though I don't have an example of that. But it just seems like bad form to have dangling nodes in isel. Differential Revision: https://reviews.llvm.org/D61047 llvm-svn: 360823	2019-05-15 21:59:53 +00:00
Reid Kleckner	4882490349	[codeview] Fix SDNode representation of annotation labels Before this change, they were erroneously constructed with the EH_LABEL SDNode opcode, which caused other passes to interact with them in incorrect ways. See the FIXME about fastisel that this addresses in the existing test case. Fixes PR41890 llvm-svn: 360818	2019-05-15 21:46:05 +00:00
Simon Atanasyan	0b0cc23fb6	[mips] Use range-based `for` loops. NFC llvm-svn: 360817	2019-05-15 21:26:25 +00:00
Mandeep Singh Grang	814435fe87	[AArch64] only indicate CFI on Windows if we emitted CFI Summary: Otherwise, we emit directives for CFI without any actual CFI opcodes to go with them, which causes tools to malfunction. The technique is similar to what the x86 backend already does. Fixes https://bugs.llvm.org/show_bug.cgi?id=40876 Patch by: froydnj (Nathan Froyd) Reviewers: mstorsjo, eli.friedman, rnk, mgrang, ssijaric Reviewed By: rnk Subscribers: javed.absar, kristof.beyls, llvm-commits, dmajor Tags: #llvm Differential Revision: https://reviews.llvm.org/D61960 llvm-svn: 360816	2019-05-15 21:23:41 +00:00
Craig Topper	439228727a	[X86] Strengthen type constraints on some specialized X86 ISD opcodes that don't have any flexibility. NFC These particular instructions only operate on 128-bit vectors and have no wider equivalents. And the element size is always known. One could argue that MOVSS/MOVSD could be merged, but that's probably disruptive to code in X86ISelLowering and probably low value. llvm-svn: 360815	2019-05-15 21:16:28 +00:00
Reid Kleckner	7c438c5b07	[codeview] Finish support for reading and writing S_ANNOTATION records Implement dumping via llvm-pdbutil and llvm-readobj. llvm-svn: 360813	2019-05-15 20:53:39 +00:00
Pete Couperus	1ca049959f	Uncomment LLVM_FALLTHROUGH. llvm-svn: 360798	2019-05-15 19:46:17 +00:00
Taewook Oh	9d020de3e8	[PredicateInfo] Do not process unreachable operands. Summary: We should exclude unreachable operands from processing as their DFS visitation order is undefined. When `renameUses` function sorts `OpsToRename` (https://fburl.com/d2wubn60), the comparator assumes that the parent block of the operand has a corresponding dominator tree node. This is not the case for unreachable operands and crashes the compiler. Reviewers: dberlin, mgrang, davide Subscribers: efriedma, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61154 llvm-svn: 360796	2019-05-15 19:35:38 +00:00
Nicolai Haehnle	f672b6170c	[MachineOperand] Add a ChangeToGA method Summary: Analogous to the other ChangeToXXX methods. See the next patch for a use case. Change-Id: I6548d614706834fb9109ab3c8fe915e9c6ece2a7 Reviewers: arsenm, kzhuravl Subscribers: wdng, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61651 llvm-svn: 360789	2019-05-15 17:48:10 +00:00
Nicolai Haehnle	664ceeda68	RegAlloc: try to fail more gracefully when out of registers Summary: The emitError path allows the program to continue, unlike report_fatal_error. This is friendlier to use cases where LLVM is embedded in a larger program, because the caller may be able to deal with the error somewhat gracefully. Change the number of requested NOP bytes in the AArch64 and PowerPC test cases to avoid triggering an unrelated assertion. The compilation still fails, as verified by the test. Change-Id: Iafb9ca341002a597b82e59ddc7a1f13c78758e3d Reviewers: arsenm, MatzeB Subscribers: qcolombet, nemanjai, wdng, javed.absar, kristof.beyls, kbarton, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61489 llvm-svn: 360786	2019-05-15 17:29:58 +00:00
Hiroshi Yamauchi	7dfd087a9a	[JumpThreading] A bug fix for stale loop info after unfold select Summary: The return value of a TryToUnfoldSelect call was not checked, which led to an incorrectly preserved loop info and some crash. The original crash was reported on https://reviews.llvm.org/D59514. Reviewers: davidxl, amehsan Reviewed By: davidxl Subscribers: fhahn, brzycki, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61920 llvm-svn: 360780	2019-05-15 15:15:16 +00:00
Ryan Taylor	29257eb76c	[AMDGPU] Increases available SGPR for Calling Convention Summary: SGPR in CC can be either hw initialized or set by other chained shaders and so this increases the SGPR count available to CC to 105. Change-Id: I3dfadc750fe4a3e2bd07117a2899fd13f3e2fef3 Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61261 llvm-svn: 360778	2019-05-15 14:43:55 +00:00
Cameron McInally	0c82d9b5a2	Teach InstSimplify -X + X --> 0.0 about unary FNeg Differential Revision: https://reviews.llvm.org/D61916 llvm-svn: 360777	2019-05-15 14:31:33 +00:00
Clement Courbet	eaf4413d2d	Revert r360771 "[MergeICmps] Simplify the code." Breaks a bunch of buildbots. llvm-svn: 360776	2019-05-15 14:21:59 +00:00
Clement Courbet	0d071be474	[MergeICmps] Fix r360771. Twine references a StringRef by reference, not value... llvm-svn: 360775	2019-05-15 14:00:45 +00:00
Stephen Tozer	0d02f2ff4f	Revert "[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed" This reverts r360772 due to build issues. Reverted commit: `17dd4d7403`. llvm-svn: 360773	2019-05-15 13:41:44 +00:00
Stephen Tozer	17dd4d7403	[Salvage] Change salvage debug info implementation to use DW_OP_LLVM_convert where needed Fixes issue: https://bugs.llvm.org/show_bug.cgi?id=40645 Previously, LLVM had no functional way of performing casts inside of a DIExpression(), which made salvaging cast instructions other than Noop casts impossible. With the recent addition of DW_OP_LLVM_convert this salvaging is now possible, and so can be used to fix the attached bug as well as any cases where SExt instruction results are lost in the debugging metadata. This patch introduces this fix by expanding the salvage debug info method to cover these cases using the new operator. Differential revision: https://reviews.llvm.org/D61184 llvm-svn: 360772	2019-05-15 13:15:48 +00:00
Clement Courbet	157ae639fa	[MergeICmps] Simplify the code. Instead of patching the original blocks, we now generate new blocks and delete the old blocks. This results in simpler code with a less twisted control flow (see the change in `entry-block-shuffled.ll`). This will make https://reviews.llvm.org/D60318 simpler by making it more obvious where control flow created and deleted. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits, spatel Tags: #llvm Differential Revision: https://reviews.llvm.org/D61736 llvm-svn: 360771	2019-05-15 13:04:24 +00:00
Simon Pilgrim	2dd6a0c0c3	Revert rL360675 : [APFloat] APFloat::Storage::Storage - fix use after move This was mentioned both in https://www.viva64.com/en/b/0629/ and by scan-build checks ........ There's concerns this may just introduce a use-after-free instead..... llvm-svn: 360770	2019-05-15 13:03:10 +00:00
David Green	0582b22f10	[ARM] Don't use the Machine Scheduler for cortex-m at minsize The new cortex-m schedule in rL360768 helps performance, but can increase the amount of high-registers used. This, on average, ends up increasing the codesize by a fair amount (because less instructions are converted from T2 to T1). On cortex-m at -Oz, where we are quite size-paranoid, it is better to use the existing DAG scheduler with the RegPressure scheduling preference (at least until the issues around T2 vs T1 instructions can be improved). I have also made sure that the Sched::RegPressure dag scheduler is always chosen for MinSize. The test shows one case where we increase the number of registers used. Differential Revision: https://reviews.llvm.org/D61882 llvm-svn: 360769	2019-05-15 12:58:02 +00:00
David Green	d2d0f46cd2	[ARM] Cortex-M4 schedule This patch adds a simple Cortex-M4 schedule, renaming the existing M3 schedule to M4 and filling in the latencies as-per the Cortex-M4 TRM: https://developer.arm.com/docs/ddi0439/latest Most of these are 1, with the important exception being loads taking 2 cycles. A few others are also higher, but I don't believe they make a large difference. I've repurposed the M3 schedule as the latencies are mostly the same between the two cores, with the M4 having more FP and DSP instructions. We also turn on MISched and UseAA for the cores that now use this. It also adds some schedule Write's to various instruction to make things simpler. Differential Revision: https://reviews.llvm.org/D54142 llvm-svn: 360768	2019-05-15 12:41:58 +00:00
Simon Atanasyan	4c68c5ae71	[mips] LLVM and GAS now use same instructions for CFA Definition. NFCI LLVM previously used `DW_CFA_def_cfa` instruction in .eh_frame to set the register and offset for current CFA rule. We change it to `DW_CFA_def_cfa_register` which is the same one used by GAS that only changes the register but keeping the old offset. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D61899 llvm-svn: 360765	2019-05-15 12:05:27 +00:00
Florian Hahn	9e778e6c73	[LV] Move getScalarizationOverhead and vector call cost computations to CM. (NFC) This reduces the number of parameters we need to pass in and they seem a natural fit in LoopVectorizationCostModel. Also simplifies things for D59995. As a follow up refactoring, we could expose only a shouldUseVectorIntrinsic() helper in LoopVectorizationCostModel, instead of calling getVectorCallCost/getVectorIntrinsicCost in InnerLoopVectorizer/VPRecipeBuilder. Reviewers: Ayal, hsaito, dcaballe, rengolin Reviewed By: rengolin Differential Revision: https://reviews.llvm.org/D61638 llvm-svn: 360758	2019-05-15 10:05:49 +00:00
Clement Courbet	d9d0665d1c	[DAGCombiner][NFC] Add a comment. As suggested in D61846. llvm-svn: 360755	2019-05-15 08:21:18 +00:00
Craig Topper	384d46c0d5	[X86] Use OR32mi8Locked instead of LOCK_OR32mi8 in emitLockedStackOp. They encode the same way, but OR32mi8Locked sets hasUnmodeledSideEffects set which should be stronger than the mayLoad/mayStore on LOCK_OR32mi8. I think this makes sense since we are using it as a fence. This also seems to hide the operation from the speculative load hardening pass so I've reverted r360511. llvm-svn: 360747	2019-05-15 04:15:46 +00:00
Fangrui Song	f4dfd63c74	[IR] Disallow llvm.global_ctors and llvm.global_dtors of the 2-field form in textual format The 3-field form was introduced by D3499 in 2014 and the legacy 2-field form was planned to be removed in LLVM 4.0 For the textual format, this patch migrates the existing 2-field form to use the 3-field form and deletes the compatibility code. test/Verifier/global-ctors-2.ll checks we have a friendly error message. For bitcode, lib/IR/AutoUpgrade UpgradeGlobalVariables will upgrade the 2-field form (add i8* null as the third field). Reviewed By: rnk, dexonsmith Differential Revision: https://reviews.llvm.org/D61547 llvm-svn: 360742	2019-05-15 02:35:32 +00:00
Philip Reames	658cad1287	[NFC] Reuse a helper function to eliminate duplicate code llvm-svn: 360740	2019-05-15 01:39:07 +00:00
Richard Trieu	5f7d4ab5f9	[XCore] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360738	2019-05-15 01:28:30 +00:00
Richard Trieu	0116385452	[X86] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360736	2019-05-15 01:17:58 +00:00
Richard Trieu	c6c421379d	[WebAssembly] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360735	2019-05-15 01:03:00 +00:00
Richard Trieu	1e6f98b89d	[SystemZ] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360734	2019-05-15 00:46:18 +00:00
Richard Trieu	cf82d4a483	[Sparc] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360733	2019-05-15 00:35:37 +00:00
Richard Trieu	51fc56d603	[RISCV] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360732	2019-05-15 00:24:15 +00:00
Richard Trieu	ee6ced196d	[PowerPC] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360731	2019-05-15 00:09:58 +00:00
Richard Trieu	e8f83befd5	[NVPTX] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360729	2019-05-14 23:56:18 +00:00
Richard Trieu	a57ce32eff	[MSP430] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360728	2019-05-14 23:45:18 +00:00
Richard Trieu	313b78150c	[Mips] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360727	2019-05-14 23:34:37 +00:00
Richard Trieu	2e50dc78c5	[Lanai] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360726	2019-05-14 23:17:18 +00:00
Richard Trieu	7ef172998b	[Hexagon] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360724	2019-05-14 23:04:55 +00:00
Richard Trieu	a68ee931e6	[BPF] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360722	2019-05-14 22:54:06 +00:00
Richard Trieu	e982b42003	[AVR] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360721	2019-05-14 22:41:58 +00:00
Philip Reames	445f942fc4	Use an offset from TOS for idempotent rmw locked op lowering This was the portion split off D58632 so that it could follow the redzone API cleanup. Note that I changed the offset preferred from -8 to -64. The difference should be very minor, but I thought it might help address one concern which had been previously raised. Differential Revision: https://reviews.llvm.org/D61862 llvm-svn: 360719	2019-05-14 22:32:42 +00:00
Richard Trieu	f3011b9b10	[ARM] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360718	2019-05-14 22:29:50 +00:00
Richard Trieu	7f9a008a2d	[ARC] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360716	2019-05-14 22:06:04 +00:00
Richard Trieu	8ce2ee9d56	[AMDGPU] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360713	2019-05-14 21:54:37 +00:00
Richard Trieu	b26592e04d	[AArch64] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360709	2019-05-14 21:33:53 +00:00
Leonard Chan	0cdd3b1d81	[NewPM] Port HWASan and Kernel HWASan Port hardware assisted address sanitizer to new PM following the same guidelines as msan and tsan. Changes: - Separate HWAddressSanitizer into a pass class and a sanitizer class. - Create new PM wrapper pass for the sanitizer class. - Use the getOrInsert pattern for some module level initialization declarations. - Also enable kernel-hwasan in new PM - Update llvm tests and add clang test. Differential Revision: https://reviews.llvm.org/D61709 llvm-svn: 360707	2019-05-14 21:17:21 +00:00
Florian Hahn	53c9d585b5	[LICM] Allow AliasSetMap to contain top-level loops. When an outer loop gets deleted by a different pass, before LICM visits it, we cannot clean up its sub-loops in AliasSetMap, because at the point we receive the deleteAnalysisLoop callback for the outer loop, the loop object is already invalid and we cannot access its sub-loops any longer. Reviewers: asbirlea, sanjoy, chandlerc Reviewed By: asbirlea Differential Revision: https://reviews.llvm.org/D61904 llvm-svn: 360704	2019-05-14 19:41:36 +00:00
Dmitry Preobrazhensky	ee51d851ea	[AMDGPU][GFX8][GFX9] Corrected predicate of v_*_co_u32 aliases Reviewers: rampitec, arsenm Differential Revision: https://reviews.llvm.org/D61905 llvm-svn: 360702	2019-05-14 19:16:24 +00:00
Nikita Popov	48c4e4fa80	[LVI][CVP] Add support for abs/nabs select pattern flavor Based on ConstantRange support added in D61084, we can now handle abs and nabs select pattern flavors in LVI. Differential Revision: https://reviews.llvm.org/D61794 llvm-svn: 360700	2019-05-14 18:53:47 +00:00
Alina Sbirlea	80c6e79602	[MemorySSA] LoopSimplify preserves MemorySSA only when flag is flipped. LoopSimplify can preserve MemorySSA after r360270. But the MemorySSA analysis is retrieved and preserved only when the EnableMSSALoopDependency is set to true. Use the same conditional to mark the pass as preserved, otherwise subsequent passes will get an invalid analysis. Resolves PR41853. llvm-svn: 360697	2019-05-14 18:07:18 +00:00
Philip Reames	75ad8c5d63	Fix a release mode warning introduced in r360694 llvm-svn: 360696	2019-05-14 17:50:06 +00:00
Philip Reames	bd8d309111	[IndVars] Extend reasoning about loop invariant exits to non-header blocks Noticed while glancing through the code for other reasons. The extension is trivial enough, decided to just do it. llvm-svn: 360694	2019-05-14 17:20:10 +00:00
Cameron McInally	7c5c0c9fe5	Support FNeg in SpeculativeExecution pass Differential Revision: https://reviews.llvm.org/D61910 llvm-svn: 360692	2019-05-14 16:51:18 +00:00
Stanislav Mekhanoshin	05791d90c9	[AMDGPU] Fixed handling of immediate i1 literals This bug was exposed by the rL360395. Differential Revision: https://reviews.llvm.org/D61812 llvm-svn: 360689	2019-05-14 16:18:00 +00:00
Tim Renouf	33cb8f5b54	[AMDGPU] Fixed +DumpCode The +DumpCode attribute is a horrible hack in AMDGPU to embed the disassembly of the generated code into the elf file. It is used by LLPC to implement an extension that allows the application to read back the disassembly of the code. Longer term, we should re-implement that by using the LLVM disassembler from the Vulkan driver. Recent LLVM changes broke +DumpCode. With -filetype=asm it crashed, and with -filetype=obj I think it did not include any instructions, only the labels. Fixed with this commit: now it has no effect with -filetype=asm, and works as intended with -filetype=obj. Differential Revision: https://reviews.llvm.org/D60682 Change-Id: I6436d86fe2ea220d74a643a85e64753747c9366b llvm-svn: 360688	2019-05-14 16:17:14 +00:00
Simon Pilgrim	c2d9cfd925	[X86] Disable shouldFoldConstantShiftPairToMask for scalar shifts on AMD targets (PR40758) D61068 handled vector shifts, this patch does the same for scalars where there are similar number of pipes for shifts as bit ops - this is true almost entirely for AMD targets where the scalar ALUs are well balanced. This combine avoids AND immediate mask which usually means we reduce encoding size. Some tests show use of (slow, scaled) LEA instead of SHL in some cases, but thats due to particular shift immediates - shift+mask generate these just as easily. Differential Revision: https://reviews.llvm.org/D61830 llvm-svn: 360684	2019-05-14 15:21:28 +00:00
Cullen Rhodes	3b917019a5	[AArch64][SVE2] Asm: add SQRDMLAH/SQRDMLSH instructions Summary: This patch adds support for the indexed and unpredicated vectors forms of the SQRDMLAH and SQRDMLSH instructions. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61515 llvm-svn: 360683	2019-05-14 15:10:16 +00:00
Cullen Rhodes	e029da46e6	[AArch64][SVE2] Asm: add integer multiply-add/subtract (indexed) instructions Summary: This patch adds support for the following instructions: MLA mul-add, writing addend (Zda = Zda + Zn * Zm[idx]) MLS mul-sub, writing addend (Zda = Zda + -Zn * Zm[idx]) Predicated forms of these instructions were added in SVE. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewed By: rovka Differential Revision: https://reviews.llvm.org/D61514 llvm-svn: 360682	2019-05-14 15:01:00 +00:00
Fangrui Song	2f6ef2fc92	DWARF v5: emit DW_AT_addr_base if DW_AT_low_pc references .debug_addr The condition !AddrPool.empty() is tested before attachRangesOrLowHighPC(), which may add an entry to AddrPool. We emit DW_AT_low_pc (DW_FORM_addrx) but may incorrectly omit DW_AT_addr_base for LineTablesOnly. This can be easily reproduced: clang -gdwarf-5 -gmlt -c a.cc Fix this by moving !AddrPool.empty() below. This was discovered while investigating an lld crash (fixed by D61889) on such object files: ld.lld --gdb-index a.o Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D61891 llvm-svn: 360678	2019-05-14 14:37:26 +00:00
Lei Huang	22561972af	[PowerPC] Custom lower known CR bit spills For known CRBit spills, CRSET/CRUNSET, it is more efficient to load and spill the known value instead of extracting the bit. eg. This sequence is currently used to spill a CRUNSET: crclr 4*cr5+lt mfocrf r3,4 rlwinm r3,r3,20,0,0 stw r3,132(r1) This patch custom lower it to: li r3,0 stw r3,132(r1) Differential Revision: https://reviews.llvm.org/D61754 llvm-svn: 360677	2019-05-14 14:27:06 +00:00
Simon Pilgrim	9fd3be294c	[APFloat] APFloat::Storage::Storage - fix use after move This was mentioned both in https://www.viva64.com/en/b/0629/ and by scan-build checks llvm-svn: 360675	2019-05-14 14:13:30 +00:00
Kit Barton	37b7922daa	Save the induction binary operator in IVDescriptors for non FP induction variables. Summary: Currently InductionBinOps are only saved for FP induction variables, the PR extends it with non FP induction variable, so user of IVDescriptors can query the InductionBinOps for integer induction variables. The changes in hasUnsafeAlgebra() and getUnsafeAlgebraInst() are required for the existing LIT test cases to pass. As described in the comment of the two functions, one of the requirement to return true is it is a FP induction variable. The checks was not needed because InductionBinOp was not set on non FP cases before. https://reviews.llvm.org/D60565 depends on the patch. Committed on behalf of @Whitney (Whitney Tsang). Reviewers: jdoerfert, kbarton, fhahn, hfinkel, dmgreen, Meinersbur Reviewed By: jdoerfert Subscribers: mgorny, hiraditya, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61329 llvm-svn: 360671	2019-05-14 13:26:36 +00:00
Tim Northover	717b62a146	TableGen: support #ifndef in addition to #ifdef. TableGen has a limited preprocessor, which only really supports easier. llvm-svn: 360670	2019-05-14 13:04:25 +00:00
Thomas Preud'homme	7b4ecdd3c2	Reinstate "FileCheck [5/12]: Introduce regular numeric variables" This reinstates r360578 (git `e47362c1ec`), reverted in r360653 (git `004393681c`), with a fix for the list added in FileCheck.rst to build without error. Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60385 llvm-svn: 360665	2019-05-14 11:58:30 +00:00
Simon Pilgrim	2747ee2c83	[X86] X86TargetLowering::LowerINTRINSIC_WO_CHAIN - ensure rounding control is initialized. NFCI. Fixes scan-build warnings llvm-svn: 360664	2019-05-14 11:30:39 +00:00
Tim Northover	ff6875acd9	AArch64: support binutils-like things on arm64_32. This adds support for the arm64_32 watchOS ABI to LLVM's low level tools, teaching them about the specific MachO choices and constants needed to disassemble things. llvm-svn: 360663	2019-05-14 11:25:44 +00:00
Tim Northover	ed9117f88d	GlobalOpt: do not promote globals used atomically to constants. Some atomic loads are implemented as cmpxchg (particularly if large or floating), and that usually requires write access to the memory involved or it will segfault. We can still propagate the constant value to users we understand though. llvm-svn: 360662	2019-05-14 11:03:13 +00:00
Simon Pilgrim	15842132d5	[MemorySanitizer] getMMXVectorTy - assert valid element size. NFCI. Fixes scan-build warnings llvm-svn: 360658	2019-05-14 10:29:18 +00:00
Diana Picus	a568222ddd	[IRTranslator] Don't hardcode GEP index type When breaking up loads and stores of aggregates, the IRTranslator uses LLT::scalar(64) for the index type of the G_GEP instructions that compute the addresses. This is unnecessarily large for 32-bit targets. Use the int ptr type provided by the DataLayout instead. Note that we're already doing the right thing when translating getelementptr instructions from the IR. This is just an oversight when generating new ones while translating loads/stores. Both x86 and AArch64 already have tests confirming that the old behaviour is preserved for 64-bit targets. Differential Revision: https://reviews.llvm.org/D61852 llvm-svn: 360656	2019-05-14 09:25:17 +00:00
Thomas Preud'homme	004393681c	Revert "FileCheck [5/12]: Introduce regular numeric variables" This reverts r360578 (git `e47362c1ec`) to solve the sphinx build failure on http://lab.llvm.org:8011/builders/llvm-sphinx-docs buildbot. llvm-svn: 360653	2019-05-14 08:43:11 +00:00
Philip Reames	3098e44daa	[X86] Prefer locked stack op over mfence for seq_cst 64-bit stores on 32-bit targets This is a follow on to D58632, with the same logic. Given a memory operation which needs ordering, but doesn't need to modify any particular address, prefer to use a locked stack op over an mfence. Differential Revision: https://reviews.llvm.org/D61863 llvm-svn: 360649	2019-05-14 04:43:37 +00:00
Fangrui Song	e1cb2c0f40	[Object] Change ObjectFile::getSectionContents to return Expected<ArrayRef<uint8_t>> Change std::error_code getSectionContents(DataRefImpl, StringRef &) const; to Expected<ArrayRef<uint8_t>> getSectionContents(DataRefImpl) const; Many object formats use ArrayRef<uint8_t> as the underlying type, which is generally better than StringRef to represent binary data, so change the type to decrease the number of type conversions. Reviewed By: ruiu, sbc100 Differential Revision: https://reviews.llvm.org/D61781 llvm-svn: 360648	2019-05-14 04:22:51 +00:00
Sanjay Patel	99d6420a82	[SDAG] fix unused variable warning and unneeded indirection; NFC llvm-svn: 360640	2019-05-14 00:57:31 +00:00
Sanjay Patel	3a13d970aa	[SDAG, x86] allow targets to override test for binop opcodes This follows the pattern of the existing isCommutativeBinOp(). x86 shows improvements from vector narrowing for the min/max opcodes. llvm-svn: 360639	2019-05-14 00:39:40 +00:00
Gor Nishanov	d64455cd43	[coroutines] Fix spills of static array allocas Summary: CoroFrame was not considering static array allocas, and was only ever reserving a single element in the coroutine frame. This meant that stores to the non-zero'th element would corrupt later frame data. Store static array allocas as field arrays in the coroutine frame. Added test. Committed by Gor Nishanov on behalf of ben-clayton Reviewers: GorNishanov, modocache Reviewed By: GorNishanov Subscribers: Orlando, capn, EricWF, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61372 llvm-svn: 360636	2019-05-13 23:58:24 +00:00
Craig Topper	e2966473dd	[X86] Use ISD::MERGE_VALUES to return from lowerAtomicArith instead of calling ReplaceAllUsesOfValueWith and returning SDValue(). Returning SDValue() makes the caller think that nothing happened and it will end up executing the Expand path. This generates extra nodes that will need to be pruned as dead code. Returning an ISD::MERGE_VALUES will tell the caller that we'd like to make a change and it will take care of replacing uses. This will prevent falling into the Expand path. llvm-svn: 360627	2019-05-13 22:17:13 +00:00
Craig Topper	5f999c2bea	[X86] Various type corrections to the code that creates LOCK_OR32mi8/OR32mi8Locked to the stack for idempotent atomic rmw and atomic fence. These are updates to match how isel table would emit a LOCK_OR32mi8 node. -Use i32 for the immediate zero even though only 8 bits are encoded. -Use i16 for segment register. -Use LOCK_OR32mi8 for idempotent atomic operations in 32-bit mode to match 64-bit mode. I'm not sure why OR32mi8Locked and LOCK_OR32mi8 both exist. The only difference seems to be that OR32mi8Locked is marked as UnmodeledSideEffects=1. -Emit an extra i32 result for the flags output. I don't know if the types here really matter just noticed it was inconsistent with normal behavior. llvm-svn: 360619	2019-05-13 21:01:24 +00:00
Lang Hames	56baade10d	[JITLink][MachO] Honor the no-dead-strip flag on nlist entries. llvm-svn: 360618	2019-05-13 20:52:30 +00:00
Nikita Popov	323dc634b9	[WebAssembly] Don't assume that zext/sext result is i32/i64 in fast isel (PR41841) Usually this will abort fast-isel at the instruction using the non-legal result, but if the only use is in a different basic block, we'll incorrectly assume that the zext/sext is to i32 (rather than i128 in this case). Differential Revision: https://reviews.llvm.org/D61823 llvm-svn: 360616	2019-05-13 19:40:18 +00:00
Stanislav Mekhanoshin	79b2828b3f	[AMDGPU] Reorder includes per coding standard. NFC. llvm-svn: 360609	2019-05-13 18:05:10 +00:00
Stanislav Mekhanoshin	21088639ae	[AMDGPU] Remove now unused V2FP16_ONE constant def. NFC. llvm-svn: 360608	2019-05-13 17:52:57 +00:00
Robert Lougher	91a9d4ef4b	Revert [X86] Avoid SFB - Fix inconsistent codegen with/without debug info Revert r360436 as it is causing clang-x64-windows-msvc buildbot to fail. llvm-svn: 360606	2019-05-13 17:36:46 +00:00
Sanjay Patel	760f61ab36	[InstCombine] try harder to form rotate (funnel shift) (PR20750) We have a similar match for patterns ending in a truncate. This should be ok for all targets because the default expansion would still likely be better from replacing 2 'and' ops with 1. Attempt to show the logic equivalence in Alive (which doesn't currently have funnel-shift in its vocabulary AFAICT): %shamt = zext i8 %i to i32 %m = and i32 %shamt, 31 %neg = sub i32 0, %shamt %and4 = and i32 %neg, 31 %shl = shl i32 %v, %m %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl => %a = and i8 %i, 31 %shamt2 = zext i8 %a to i32 %neg2 = sub i32 0, %shamt2 %and4 = and i32 %neg2, 31 %shl = shl i32 %v, %shamt2 %shr = lshr i32 %v, %and4 %or = or i32 %shr, %shl https://rise4fun.com/Alive/V9r llvm-svn: 360605	2019-05-13 17:28:19 +00:00
Nick Desaulniers	c33f754e74	[TargetLowering] Handle multi depth GEPs w/ inline asm constraints Summary: X86TargetLowering::LowerAsmOperandForConstraint had better support than TargetLowering::LowerAsmOperandForConstraint for arbitrary depth getelementpointers for "i", "n", and "s" extended inline assembly constraints. Hoist its support from the derived class into the base class. Link: https://github.com/ClangBuiltLinux/linux/issues/469 Reviewers: echristo, t.p.northover Reviewed By: t.p.northover Subscribers: t.p.northover, E5ten, kees, jyknight, nemanjai, javed.absar, eraman, hiraditya, jsji, llvm-commits, void, craig.topper, nathanchance, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D61560 llvm-svn: 360604	2019-05-13 17:27:44 +00:00
Simon Pilgrim	73aee29095	[X86][SSE] LowerBuildVectorv4x32 - don't insert MOVQ for undef elts Fixes the regression noted in D61782 where a VZEXT_MOVL was being inserted because we weren't discriminating between 'zeroable' and 'all undef' for the upper elts. Differential Revision: https://reviews.llvm.org/D61782 llvm-svn: 360596	2019-05-13 16:10:11 +00:00
Simon Pilgrim	cf5a8eb7cd	[X86][SSE] Relax use limits for lowerAddSubToHorizontalOp (PR32433) Now that we can use HADD/SUB for scalar additions from any pair of extracted elements (D61263), we can relax the one use limit as we will be able to merge multiple uses into using the same HADD/SUB op. This exposes a couple of missed opportunities in LowerBuildVectorv4x32 which will be committed separately. Differential Revision: https://reviews.llvm.org/D61782 llvm-svn: 360594	2019-05-13 16:02:45 +00:00
Simon Pilgrim	d3cedee3c6	[TargetLowering] Add SimplifyDemandedBits support for ZERO_EXTEND_VECTOR_INREG More work for PR39709. llvm-svn: 360592	2019-05-13 15:51:26 +00:00
Amara Emerson	e5248e6b41	Revert "[LSR] Tweak setup cost depth threshold to 10." Changing the threshold might not be the best long term approach. Revert for now. llvm-svn: 360589	2019-05-13 15:37:18 +00:00
Simon Pilgrim	d9aa928603	[X86] Add SimplifyDemandedBits support for PEXTRB/PEXTRW (PR39709) Test case will be included in a followup - it's being used but it's tricky to show a case that isn't caught at a later stage anyway. llvm-svn: 360588	2019-05-13 15:31:27 +00:00
Sanjay Patel	05dafb1c97	[DAGCombiner] narrow vector binop with inserts/extract We catch most of these patterns (on x86 at least) by matching a concat vectors opcode early in combining, but the pattern may emerge later using insert subvector instead. The AVX1 diffs for add/sub overflow show another missed narrowing pattern. That one may be falling through the cracks because of combine ordering and multiple uses. llvm-svn: 360585	2019-05-13 14:31:14 +00:00
Kevin P. Neal	5987749e33	Add constrained fptrunc and fpext intrinsics. The new fptrunc and fpext intrinsics are constrained versions of the regular fptrunc and fpext instructions. Reviewed by: Andrew Kaylor, Craig Topper, Cameron McInally, Conner Abbot Approved by: Craig Topper Differential Revision: https://reviews.llvm.org/D55897 llvm-svn: 360581	2019-05-13 13:23:30 +00:00
Simon Pilgrim	d845bc3d0c	TargetLowering::SimplifyDemandedBits - early-out for UNDEF ops. NFCI. llvm-svn: 360579	2019-05-13 12:44:03 +00:00
Thomas Preud'homme	e47362c1ec	FileCheck [5/12]: Introduce regular numeric variables Summary: This patch is part of a patch series to add support for FileCheck numeric expressions. This specific patch introduces regular numeric variables which can be set on the command-line. This commit introduces regular numeric variables that can be set on the command-line with the -D option to a numeric value. They can then be used in CHECK patterns in numeric expressions with the same shape as @LINE numeric expressions, i.e. VAR, VAR+offset or VAR-offset where offset is an integer literal. The commit also enables strict whitespace in the verbose.txt testcase to check the position of the location diagnostics. It fixes one of the existing CHECK in the process which was not accurately testing a location diagnostic (i.e. the diagnostic was correct, not the CHECK). Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60385 llvm-svn: 360578	2019-05-13 12:39:08 +00:00
Eugene Leviant	053c6fc2b8	[ThinLTO] Don't internalize weak writeable variables Variables with linkonce_odr and weak_odr linkage shouldn't be internalized if they're not readonly. Otherwise we may end up with multiple copies of such variable, so reads and writes will become inconsistent Differential revision: https://reviews.llvm.org/D61255 llvm-svn: 360577	2019-05-13 11:53:05 +00:00
Cullen Rhodes	6dcef8fc0c	[AArch64][SVE2] Add SVE2 target features to backend and TargetParser Summary: This patch adds the following features defined by Arm SVE2 architecture extension: sve2, sve2-aes, sve2-sm4, sve2-sha3, bitperm For existing CPUs these features are declared as unsupported to prevent scheduler errors. The specification can be found here: https://developer.arm.com/docs/ddi0602/latest Reviewers: SjoerdMeijer, sdesmalen, ostannard, rovka Reviewed By: SjoerdMeijer, rovka Subscribers: rovka, javed.absar, tschuett, kristof.beyls, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61513 llvm-svn: 360573	2019-05-13 10:10:24 +00:00
Ulrich Weigand	8e42f6ddc8	[SystemZ] Model floating-point control register This adds the FPC (floating-point control register) as a reserved physical register and models its use by SystemZ instructions. Note that only the current rounding modes and the IEEE exception masks are modeled. Changes of the FPC due to exceptions (in particular the IEEE exception flags and the DXC) are not modeled. At this point, this patch is mostly NFC, but it will prevent scheduling of floating-point instructions across SPFC/LFPC etc. llvm-svn: 360570	2019-05-13 09:47:26 +00:00
Sam Parker	a33e311a3b	[ARM][ParallelDSP] Relax alias checks When deciding the safety of generating smlad, we checked for any writes within the block that may alias with any of the loads that need to be widened. This is overly conservative because it only matters when there's a potential aliasing write to a location accessed by a pair of loads. Now we check for aliasing writes only once, during setup. If two loads are found to have an aliasing write between them, we don't add these loads to LoadPairs. This means that later during the transform, we can safely widen a pair without worrying about aliasing. However, to maintain correctness, we also need to change the way that wide loads are inserted because the order is now important. The MatchSMLAD method has also been changed, absorbing MatchReductions and AddMACCandidate to hopefully improve readability. Differential Revision: https://reviews.llvm.org/D6102 llvm-svn: 360567	2019-05-13 09:23:32 +00:00
Clement Courbet	9afc4764dd	[DAGCombiner] Fix invalid alias analysis. Summary: When we know for sure whether two addresses do or do not alias, we should immediately return from DAGCombiner::isAlias(). I think this comes from a bad copy/paste. Sorry for not catching that during the code review. Fixes PR41855. Reviewers: niravd, gchatelet, EricWF Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61846 llvm-svn: 360566	2019-05-13 09:07:37 +00:00
Fangrui Song	f3be557159	[WebAssembly] Add dependency on WebAssemblyDesc to fix BUILD_SHARED_LIBS=on builds after rL360550 This fixes the link error ld.lld: error: undefined symbol: llvm::WebAssembly::anyTypeToString(unsigned int) >>> referenced by WebAssemblyDisassembler.cpp llvm-svn: 360558	2019-05-13 05:51:39 +00:00
Yonghong Song	98fe9c9869	[BPF] emit BTF sections only if debuginfo available Currently, without -g, BTF sections may still be emitted with data sections, e.g., for linux kernel bpf selftest test_tcp_check_syncookie_kern.c issue discovered by Martin as shown below. -bash-4.4$ bpftool btf dump file test_tcp_check_syncookie_kern.o [1] VAR 'results' type_id=0, linkage=global-alloc [2] VAR '_license' type_id=0, linkage=global-alloc [3] DATASEC 'license' size=0 vlen=1 type_id=2 offset=0 size=4 [4] DATASEC 'maps' size=0 vlen=1 type_id=1 offset=0 size=28 Let's disable BTF generation if no debuginfo, which is the original design. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D61826 llvm-svn: 360556	2019-05-13 05:00:23 +00:00
Lang Hames	4513929094	[JITLink] Track section alignment and make sure it is respected during layout. Previously we had only honored alignments on individual atoms, but tools/runtimes may assume that the section alignment is respected too. llvm-svn: 360555	2019-05-13 04:51:31 +00:00
Craig Topper	61e556d2bd	Recommit r358887 "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling" I've included a new fix in X86RegisterInfo to prevent PR41619 without reintroducing r359392. We might be able to improve that in the base class implementation of shouldRewriteCopySrc somehow. But this hopefully enables forward progress on SimplifyDemandedBits improvements for now. Original commit message: This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly. The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGComb but it caused a lot of noise on other targets - some improvements, some regressions. The X86 changes are all definite wins. llvm-svn: 360552	2019-05-13 04:03:35 +00:00
David L. Jones	a263aa25e1	[WebAssembly] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360550	2019-05-13 03:32:41 +00:00
Lang Hames	23085ec36d	[JITLink] Add a test for zero-filled content. Also updates RuntimeDyldChecker and llvm-rtdyld to support zero-fill tests by returning a content address of zero (but no error) for zero-fill atoms, and treating loads from zero as returning zero. llvm-svn: 360547	2019-05-12 22:26:33 +00:00
Simon Pilgrim	a7fc763082	[X86][AVX] Split VZEXT_MOVL ymm/zmm if the upper elements are not demanded. Removes unnecessary vzeroupper noted in D61806 llvm-svn: 360543	2019-05-12 15:16:29 +00:00
Sanjay Patel	a09e686821	[DAGCombiner] try to move bitcast after extract_subvector I noticed that we were failing to narrow an x86 ymm math op in a case similar to the 'madd' test diff. That is because a bitcast is sitting between the math and the extract subvector and thwarting our pattern matching for narrowing: t56: v8i32 = add t59, t58 t68: v4i64 = bitcast t56 t73: v2i64 = extract_subvector t68, Constant:i64<2> t96: v4i32 = bitcast t73 There are a few wins and neutral diffs in the other tests. Differential Revision: https://reviews.llvm.org/D61806 llvm-svn: 360541	2019-05-12 14:43:20 +00:00
Simon Pilgrim	fda6bffd3b	[X86][SSE] SimplifyDemandedBits - call PEXTRB/PEXTRW SimplifyDemandedVectorElts as well. See if we can simplify the demanded vector elts from the extraction before trying to simplify the demanded bits. This helps us with target shuffles and hops in particular. llvm-svn: 360535	2019-05-11 21:35:50 +00:00
Simon Pilgrim	605a840747	[DAG] Add SimplifyDemandedBits support for BITREVERSE Pulled out of D58017 while I continue to investigate the BSWAP regression on PPC llvm-svn: 360534	2019-05-11 20:56:05 +00:00
Don Hinton	0303e8a3fd	[CommandLine] Add long option flag for cl::ParseCommandLineOptions . Part 5 of 5 Summary: If passed, the long option flag makes the CommandLine parser mimic the behavior of GNU getopt_long. Short options are a single character prefixed by a single dash, and long options are multiple characters prefixed by a double dash. This patch was motivated by the discussion in the following thread: http://lists.llvm.org/pipermail/llvm-dev/2019-April/131786.html Reviewed By: MaskRay Tags: #llvm Differential Revision: https://reviews.llvm.org/D61294 llvm-svn: 360532	2019-05-11 20:27:01 +00:00
Simon Pilgrim	6b10fde69b	[CostModel][X86] Add min/max reduction costs for all SSE targets The original costs stopped at SSE42, I've added conservative estimates for everything down to SSE1/SSE2 and moved some of the SSE42 costs to SSE41 (really only the addition of PCMPGT makes any difference). I've also added missing vXi8 costs (we use PHMINPOSUW for i8/i16 for scarily quick results) and 256-bit vector costs for AVX1. llvm-svn: 360528	2019-05-11 17:12:52 +00:00
Simon Pilgrim	e4c5b6d9bd	[X86][SSE] Add SimplifyDemandedVectorElts HADD/HSUB handling. Still missing PHADDW/PHSUBW tests because PEXTRW doesn't call SimplifyDemandedVectorElts llvm-svn: 360526	2019-05-11 16:07:12 +00:00
Simon Pilgrim	5e0f92acad	FixupLEAPass::fixupIncDec - non-LEA opcodes should not happen here. NFCI. Matches what we do in other functions and fixes scan-build warning about uninitialized NewOpcode variable. llvm-svn: 360525	2019-05-11 16:02:34 +00:00
Craig Topper	c9d7484aa3	[X86] Add CMOV_FR32X/CMOV_FR64X pseudo instructions. Use them in fast isel to fix a machine verifier error after adding test cases. Fast isel picks the FR32X/FR64X register classes when lowering pseudo select, but it didn't have the right opcode to go with it. llvm-svn: 360524	2019-05-11 16:00:28 +00:00
Craig Topper	74a436596d	[X86] Sink some fast isel code into the only if that uses it. NFC llvm-svn: 360523	2019-05-11 16:00:19 +00:00
Craig Topper	26f2b13a65	[X86] Use TLI.getRegClassFor to simplify some more fast isel code. NFCI llvm-svn: 360522	2019-05-11 16:00:13 +00:00
Simon Pilgrim	e7c51137aa	HexagonConstEvaluator::evaluateHexExt - check incoming opcodes. NFCI. Only certain extension opcodes are supported - fixes scan build warning. llvm-svn: 360520	2019-05-11 15:24:34 +00:00
Simon Pilgrim	46d96c02b5	Fix uninitialized variable analyzer warning. NFCI. llvm-svn: 360516	2019-05-11 11:08:24 +00:00
Simon Pilgrim	aeed0a30c0	SelectionDAGISel::CodeGenAndEmitDAG - remove unused variable. NFCI. llvm-svn: 360514	2019-05-11 11:00:37 +00:00
Craig Topper	682cc09675	[X86] Use getRegClassFor to simplify some code in fast isel. NFCI No need to select the register class based on type and features. It should already be setup by X86ISelLowering. llvm-svn: 360513	2019-05-11 05:18:58 +00:00
Craig Topper	31f7adb94f	[X86] Don't emit MOVNTDQA loads from fast-isel without SSE4.1. We were checking for SSE4.1 for FP types, but not integer 128-bit types. Fixes PR41837. llvm-svn: 360512	2019-05-11 04:19:33 +00:00
Craig Topper	bdef12df8d	[X86] Add a test case for idempotent atomic operations with speculative load hardening. Fix an additional issue found by the test. This test covers the fix from r360475 as well. llvm-svn: 360511	2019-05-11 04:00:27 +00:00
Richard Trieu	d0124bd762	[SystemZ] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360510	2019-05-11 03:36:16 +00:00
Richard Trieu	03fe9d82c4	[Sparc] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360506	2019-05-11 02:59:02 +00:00
Richard Trieu	00ecf67045	[RISCV] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure llvm-svn: 360505	2019-05-11 02:43:58 +00:00
Richard Trieu	4bdb136b0f	[PowerPC] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360502	2019-05-11 02:33:18 +00:00
Richard Trieu	4b620fcf0f	[NVPTX] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360500	2019-05-11 02:09:13 +00:00
Richard Trieu	61fb6700a5	[MSP430] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360498	2019-05-11 01:58:52 +00:00
Richard Trieu	fa29bee9d0	[Mips] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360497	2019-05-11 01:38:56 +00:00
Richard Trieu	4c3890ddbf	[Lanai] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360496	2019-05-11 01:25:58 +00:00
Richard Trieu	48803aa65c	[BPF] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360494	2019-05-11 01:13:21 +00:00
Richard Trieu	bf9e67b5b9	[AVR] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360493	2019-05-11 01:03:03 +00:00
Richard Trieu	5e3ee4b84e	[ARM] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360490	2019-05-11 00:34:07 +00:00
Richard Trieu	dcf1ea08e5	[ARC] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360488	2019-05-11 00:13:01 +00:00
Richard Trieu	c0bd7bd481	[AMDGPU] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360487	2019-05-11 00:03:35 +00:00
Richard Trieu	7ba0605511	[AArch64] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360486	2019-05-10 23:50:01 +00:00
Richard Trieu	f48ef2f2ba	[XCore] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360485	2019-05-10 23:36:49 +00:00
Richard Trieu	b28b8b7724	[X86] Move InstPrinter files to MCTargetDesc. NFC For some targets, there is a circular dependency between InstPrinter and MCTargetDesc. Merging them together will fix this. For the other targets, the merging is to maintain consistency so all targets will have the same structure. llvm-svn: 360484	2019-05-10 23:24:38 +00:00
Jordan Rupprecht	16c7fbd112	Revert [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor This reverts r360171 (git commit `a9d6c32eaf`). A repro showing the asan/msan failures is forthcoming. llvm-svn: 360481	2019-05-10 23:20:02 +00:00
Philip Reames	849ef823df	Factor out redzone ABI checks [NFCI] As requested in D58632, cleanup our red zone detection logic in the X86 backend. The existing X86MachineFunctionInfo flag is used to track whether we use the redzone (via a particular optimization?), but there's no common way to check whether the function has a red zone. I'd appreciate careful review of the uses being updated. I think they are NFC, but a careful eye from someone else would be appreciated. Differential Revision: https://reviews.llvm.org/D61799 llvm-svn: 360479	2019-05-10 22:55:42 +00:00
Lang Hames	b3d6073b3c	[ORC] Make a narrowing-cast explicit to silence a compiler warning. llvm-svn: 360478	2019-05-10 22:51:03 +00:00
Lang Hames	b0cecfc907	[JITLink][MachO] Mark atoms in sections 'no-dead-strip' set live by default. If a MachO section has the no-dead-strip attribute set then its atoms should be preserved, regardless of whether they're public or referenced elsewhere in the object. llvm-svn: 360477	2019-05-10 22:24:37 +00:00
Craig Topper	df10cc6068	[X86] Disable speculative load hardening for operations with an explicit RSP base. After D58632, we can create idempotent atomic operations to the top of stack. This confused speculative load hardening because it thinks accesses should have virtual register base except for the cases it already excluded. This commit adds a new exclusion for this case. I'll try to reduce a test case for this, but this fix was verified to work by the reporter. This should avoid needing to revert D58632. llvm-svn: 360475	2019-05-10 22:03:33 +00:00
Reid Kleckner	7eb6b5ffc3	[COFF] Fix .bss section size bug in obj2yaml / yaml2obj We need to serialize SizeOfRawData through even when there is no data, as in a .bss section. Fixes PR41836 llvm-svn: 360473	2019-05-10 21:53:44 +00:00
Craig Topper	114f763f37	[LegalizeVectorOps] Remove calls to LegalizeOp on the return value from ExpandLoad/ExpandStore. We already updated the LegalizedNodes map at the end of the Expand call. This would have marked the new node as being mapped to itself. So the LegalizeOp call will find that and immediately return. llvm-svn: 360472	2019-05-10 21:42:27 +00:00
Mircea Trofin	ff3bed0e61	Skip over prefetches Summary: Skip over prefetches when assigning debug info to instructions with memory operands. This way, the debug info is stable after instrumenting a binary with prefetches, allowing for iterative profiling and instrumentation. Reviewers: davidxl Reviewed By: davidxl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61789 llvm-svn: 360471	2019-05-10 21:27:55 +00:00
Nikita Popov	9f7537bd48	[SDAG] Recursively legalize both vector mulo results Split out from D61692 per RKSimon's suggestion. Vector op legalization will automatically recursively legalize the returned SDValue, but we need to take care of the other results ourselves. Otherwise it will end up getting legalized only during op legalization, by which point it might be too late (though I'm not aware of any specific cases right now). There are codegen differences because expansion occurs earlier now and we don't get a DAGCombiner run in between. Differential Revision: https://reviews.llvm.org/D61744 llvm-svn: 360470	2019-05-10 20:42:48 +00:00
Teresa Johnson	37b80122bd	[ThinLTO] Auto-hide prevailing linkonce_odr only when all copies eligible Summary: We hit undefined references building with ThinLTO when one source file contained explicit instantiations of a template method (weak_odr) but there were also implicit instantiations in another file (linkonce_odr), and the latter was the prevailing copy. In this case the symbol was marked hidden when the prevailing linkonce_odr copy was promoted to weak_odr. It led to unsats when the resulting shared library was linked with other code that contained a reference (expecting to be resolved due to the explicit instantiation). Add a CanAutoHide flag to the GV summary to allow the thin link to identify when all copies are eligible for auto-hiding (because they were all originally linkonce_odr global unnamed addr), and only do the auto-hide in that case. Most of the changes here are due to plumbing the new flag through the bitcode and llvm assembly, and resulting test changes. I augmented the existing auto-hide test to check for this situation. Reviewers: pcc Subscribers: mehdi_amini, inglorion, eraman, dexonsmith, arphaman, dang, llvm-commits, steven_wu, wmi Tags: #llvm Differential Revision: https://reviews.llvm.org/D59709 llvm-svn: 360466	2019-05-10 20:08:24 +00:00
Sanjay Patel	b37ddeafc0	[DAGCombiner] reduce code duplication; NFC llvm-svn: 360462	2019-05-10 20:02:30 +00:00
Cameron McInally	e75412ab47	Add InstCombine::visitFNeg(...) Differential Revision: https://reviews.llvm.org/D61784 llvm-svn: 360461	2019-05-10 20:01:04 +00:00
David Blaikie	7598b71488	DebugInfo: Only move types out of type units if they're named or type united Follow up to r359122, after a bug was reported in it - the original change too aggressively tried to move related types out of type units, which included unnamed types (like array types) which can't reasonably be declared-but-not-defined. A step beyond that is that some types in type units can be anonymous, if they are types with a name for linkage purposes (eg: "typedef struct { } x;"). So ensure those don't get turned into plain declarations (without signatures) because, lacking names, they can't be resolved to the definition. [Also include a fix for llvm-dwarfdump/libDebugInfoDWARF to pretty print types in type units] llvm-svn: 360458	2019-05-10 19:15:29 +00:00
Simon Pilgrim	6c3ae79e9b	[SLP] Refactor VectorizableTree to use unique_ptr. This patch fixes the TreeEntry dangling pointer issue caused by reallocations of VectorizableTree. Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D61706 llvm-svn: 360456	2019-05-10 18:55:17 +00:00
Amara Emerson	b6af291772	[LSR] Tweak setup cost depth threshold to 10. The original change introduced a depth limit of 7 which caused a 22% regression in the Swift MapReduceLazyCollection & Ackermann benchmarks. This new threshold still ensures that the original test case doesn't hang. rdar://50359639 llvm-svn: 360444	2019-05-10 17:29:35 +00:00
Fangrui Song	9529c563eb	[MC][ELF] Copy top 3 bits of st_other to .symver aliases On PowerPC64 ELFv2 ABI, the top 3 bits of st_other encode the local entry offset. A versioned symbol alias created by .symver should copy the bits from the source symbol. This partly fixes PR41048. A full fix needs tracking of .set assignments and updating st_other fields when finish() is called, see D56586. Patch by Alfredo Dal'Ava Júnior Differential Revision: https://reviews.llvm.org/D59436 llvm-svn: 360442	2019-05-10 17:09:25 +00:00
Momchil Velikov	c396f09ce9	Adjust MachineScheduler to use ProcResource counts This fix allows the scheduler to take into account the number of instances of each ProcResource specified. Previously a declaration in a scheduler of ProcResource<1> would be treated identically to a declaration of ProcResource<2>. Now the hazard recognizer would report a hazard only after all of the resource instances are busy. Patch by Jackson Woodruff and Momchil Velikov. Differential Revision: https://reviews.llvm.org/D51160 llvm-svn: 360441	2019-05-10 16:54:32 +00:00
Robert Lougher	986b6b86bb	[X86] Avoid SFB - Fix inconsistent codegen with/without debug info Fixes https://bugs.llvm.org/show_bug.cgi?id=40969 The functions findPotentiallyBlockedCopies and buildCopy are currently not accounting for the presence of debug instructions. In the former this results in the optimization not being triggered, and in the latter results in inconsistent codegen. This patch enables the optimization to be performed in a debug build and ensures the codegen is consistent with non-debug builds. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61680 llvm-svn: 360436	2019-05-10 15:55:06 +00:00
Simon Pilgrim	a0b1518a4a	[X86][SSE] Add getHopForBuildVector vector splitting If we only use the lower xmm of a ymm hop, then extract the xmm's (for free), perform the xmm hop and then insert back into a ymm (for free). Fixes some of the regressions noted in D61782 llvm-svn: 360435	2019-05-10 15:46:04 +00:00
Michael Liao	b284414a1b	[InferAddressSpaces] Enhance the handling of constexpr. Summary: - Constant expressions may not be added in strict postorder as the forward instruction scan order. Thus, for a constant expression (CE0), if its operand (CE1) is used in a previous instruction, they are not in postorder. However, different from `cloneInstructionWithNewAddressSpace`, `cloneConstantExprWithNewAddressSpace` doesn't bookkeep uninferred instructions for later resolving. That results in failure of inferring constant address. - This patch adds the support to infer constant expression operand recursively, since there won't be loop, if that operand is another constant expression. Reviewers: arsenm Subscribers: jholewinski, jvesely, wdng, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61760 llvm-svn: 360431	2019-05-10 14:57:42 +00:00
Lei Huang	1ac6e9636c	[PowerPC] custom lower `v2f64 fpext v2f32` Reduces scalarization overhead via custom lowering of v2f64 fpext v2f32. eg. For the following IR %0 = load <2 x float>, <2 x float>* %Ptr, align 8 %1 = fpext <2 x float> %0 to <2 x double> ret <2 x double> %1 Pre custom lowering: ld r3, 0(r3) mtvsrd f0, r3 xxswapd vs34, vs0 xscvspdpn f0, vs0 xxsldwi vs1, vs34, vs34, 3 xscvspdpn f1, vs1 xxmrghd vs34, vs0, vs1 After custom lowering: lfd f0, 0(r3) xxmrghw vs0, vs0, vs0 xvcvspdp vs34, vs0 Differential Revision: https://reviews.llvm.org/D57857 llvm-svn: 360429	2019-05-10 14:04:06 +00:00
Tim Northover	6c1e3f9493	SelectionDAG: accommodate atomic floating stores. We were applying a pointer truncation to floating types, which crashed LLVM. That is Not A Good Thing(TM). llvm-svn: 360421	2019-05-10 11:23:04 +00:00
Fangrui Song	93b6aa0751	[Object] Move ELF specific ObjectFile::getBuildAttributes to ELFObjectFileBase Change the return type from std::error_code to Error and make the function protected. llvm-svn: 360416	2019-05-10 10:19:08 +00:00
Jeremy Morse	a2b780b731	[DebugInfo] Use zero linenos for debug intrinsics when promoting dbg.declare In certain circumstances, optimizations pick line numbers from debug intrinsic instructions as the new location for altered instructions. This is problematic because the line number of a debugging intrinsic is meaningless (it doesn't produce any machine instruction), only the scope information is valid. The result can be the line number of a variable declaration "leaking" into real code from debugging intrinsics, making the line table un-necessarily jumpy, and potentially different with / without variable locations. Fix this by using zero line numbers when promoting dbg.declare intrinsics into dbg.values: this is safe for debug intrinsics as their line numbers are meaningless, and reduces the scope for damage / misleading stepping when optimizations pick locations from the wrong place. Differential Revision: https://reviews.llvm.org/D59272 llvm-svn: 360415	2019-05-10 10:03:41 +00:00
Fangrui Song	e357ca8231	[Object] Change SymbolicFile::printSymbolName to use Error llvm-svn: 360414	2019-05-10 09:59:04 +00:00
Sam Clegg	ea38ac5ba3	[WebAssembly] Don't assume that strongly defined symbols are DSO-local The current PIC model for WebAssembly is more like ELF in that it allows symbol interposition. This means that more functions end up being addressed via the GOT and fewer directly added to the wasm table. One effect is a reduction in the number of wasm table entries similar to the previous attempt in https://reviews.llvm.org/D61539 which was reverted. Differential Revision: https://reviews.llvm.org/D61772 llvm-svn: 360402	2019-05-10 01:52:08 +00:00
Sam Clegg	2147365484	[WebAssembly] Remove friend18.C from list of known gcc torture test failures. NFC. Differential Revision: https://reviews.llvm.org/D61775 llvm-svn: 360401	2019-05-10 01:45:34 +00:00
Mircea Trofin	5c31c05fbd	[llvm] X86DiscriminateMemOps: insert debug info when missing Reviewers: davidxl Reviewed By: davidxl Subscribers: aprantl, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61735 llvm-svn: 360396	2019-05-10 00:12:51 +00:00
Stanislav Mekhanoshin	64196850f0	[AMDGPU] Pattern for v_xor3_b32 This also allows three op patterns to use increased constant bus limit of GFX10. Differential Revision: https://reviews.llvm.org/D61763 llvm-svn: 360395	2019-05-10 00:09:01 +00:00
Philip Reames	bd588dfd59	[X86] Improve lowering of idempotent RMW operations The current lowering uses an mfence. mfences are substantially higher latency than the locked operations originally requested, but we do want to avoid contention on the original cache line. As such, use a locked instruction on a cache line assumed to be thread local. Differential Revision: https://reviews.llvm.org/D58632 llvm-svn: 360393	2019-05-09 23:23:42 +00:00
Lang Hames	112967833e	[JITLink] Fixed a signedness bug when processing X86_64_RELOC_SUBTRACTOR. Subtractor relocation addends are signed, so we need to read them via signed int pointers. Accidentally treating 32-bit addends as unsigned leads to out-of-range errors when we try to add very large (>INT32_MAX) bogus addends. llvm-svn: 360392	2019-05-09 23:17:41 +00:00
Philip Reames	76ea748d2d	Compile time tweak for libcall lookup If we have a large module which is mostly intrinsics, we hammer the lib call lookup path from CodeGenPrepare. Adding a fastpath reduces compile time by 15% for one such example. The problem is really more general than intrinsics - a module with lots of non-intrinsics non-libcall calls has the same problem - but we might as well avoid an easy case quickly. llvm-svn: 360391	2019-05-09 23:13:09 +00:00
Lang Hames	5e332f1992	[ORC] Simplify logic for updating edges when should-discard atoms are pruned. llvm-svn: 360384	2019-05-09 22:03:58 +00:00
Lang Hames	dd61274f77	[JITLink] Improve/fix some JITLink debugging output. Adds full edge details (rather than just edge targets) when out-of-range errors are generated. Also fixes a bug where debugging output accessed an invalidated DenseMap iterator by moving the debugging output above the invalidation point. llvm-svn: 360383	2019-05-09 22:03:57 +00:00
Lang Hames	5fa4e9d990	[ORC] Fix a formatting bug. llvm-svn: 360382	2019-05-09 22:03:53 +00:00
Bill Wendling	6ee7f31484	Add ".dword" directive Summary: The ".dword" directive is a synonym for ".xword" and is used by klibc, a minimalistic libc subset for initramfs. Reviewers: t.p.northover, nickdesaulniers Reviewed By: nickdesaulniers Subscribers: nickdesaulniers, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61719 llvm-svn: 360381	2019-05-09 21:57:44 +00:00
David Blaikie	12faa0d44b	DebugInfo/DWARF: Minor expression simplification llvm-svn: 360377	2019-05-09 21:23:40 +00:00
Cameron McInally	156eb28289	[CodeGen] Add comment about FSUB <-> FNEG xforms Differential Revision: https://reviews.llvm.org/D61741 llvm-svn: 360366	2019-05-09 19:28:52 +00:00
Stanislav Mekhanoshin	a76da34b1d	[AMDGPU] gfx1010 v_interp_* instructions Differential Revision: https://reviews.llvm.org/D61703 llvm-svn: 360364	2019-05-09 18:38:55 +00:00
Simon Pilgrim	93bfa5af48	[X86][SSE] Fold add(shuffle(),shuffle()) to hadd on 'slow' targets (PR39920) As reported on PR39920, "slow horizontal ops" targets tend to internally expand to 2shuffle+add/sub - so if we can reduce 2shuffle+add/sub to a hadd/sub then we should do it - similar port usage but reduced instruction count. This works out in most cases, although the "PR22377" regression in vector-shuffle-combining.ll is annoying - going from 2shuffle+add+shuffle to hadd+2shuffle - I've opened PR41813 to cover this. Differential Revision: https://reviews.llvm.org/D61308 llvm-svn: 360360	2019-05-09 17:45:01 +00:00
Florian Hahn	be10bc71f9	[DAGCombiner] Limit number of nodes explored as store candidates. To find the candidates to merge stores we iterate over all nodes in a chain for each store, which leads to quadratic compile times for large basic blocks with a large number of stores. Reviewers: niravd, spatel, craig.topper Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D61511 llvm-svn: 360357	2019-05-09 17:05:52 +00:00
Stanislav Mekhanoshin	4d4c9e0757	[AMDGPU] gfx1010 changes for PAL metadata Differential Revision: https://reviews.llvm.org/D61704 llvm-svn: 360353	2019-05-09 16:34:13 +00:00
Pavel Labath	dcdb3c6650	MinidumpYAML: add support for the ThreadList stream Summary: The implementation is a pretty straightforward extension of the pattern used for (de)serializing the ModuleList stream. Since there are other streams which use the same format (MemoryList and MemoryList64, at least), I tried to generalize the code a bit so that adding future streams of this type can be done with less code. Reviewers: amccarth, jhenderson, clayborg Subscribers: markmentovai, lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61423 llvm-svn: 360350	2019-05-09 15:13:53 +00:00
David Stuttard	411488b11e	[CodeGenPrepare] Limit recursion depth for collectBitParts Summary: Seeing some issues for windows debug pathological cases with collectBitParts recursion (1525 levels of recursion!) Setting the limit to 64 as this should be sufficient - passes all lit cases Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61728 Change-Id: I7f44cdc6c1badf1c2ccbf1b0c4b6afe27ecb39a1 llvm-svn: 360347	2019-05-09 15:02:10 +00:00
Roman Lebedev	9db0e72570	[X86] AMD Piledriver (BdVer2): major cleanup (mainly inverse throughput) I've started this cleanup several times now, but got sidetracked elsewhere, e.g. by llvm-exegesis problems. Not this time, finally! This is mainly cleaning up the inverse throughput values, and a few latencies/uops, based on the llvm-exegesis measured values. Though this is not complete by any means, there's certainly more cleanup to be done. The performance numbers (I've only checked by RawSpeed benchmark) aren't really surprising - overall this slightly (< -1%) improves perf. llvm-svn: 360341	2019-05-09 13:54:51 +00:00
Sam Parker	d7b650cc72	[ARM][CGP] Guard against signext args and sitofp Add an Argument that has the SExtAttr attached, as well as SIToFP instructions, as values that generate sign bits. SIToFP doesn't strictly do this and could be treated as a sink to be sign-extended. Differential Revision: https://reviews.llvm.org/D61381 llvm-svn: 360331	2019-05-09 11:56:16 +00:00
Simon Pilgrim	38ef296265	[CodeGenPrepare] Ensure we get a non-null result from getTrueOrFalseValue. NFCI. llvm-svn: 360328	2019-05-09 10:51:26 +00:00
Sven van Haastregt	ad9c7e0789	Fix LLVM_USE_PERF build after getPageSize change Commit r360221 ("[Support] Add error handling to sys::Process::getPageSize().", 2019-05-08) seems to have missed these uses of getPageSize(). Update them to getPageSizeEstimate(). llvm-svn: 360322	2019-05-09 10:10:44 +00:00
Diana Picus	3531453371	[ARM GlobalISel] Map DBG_VALUE for types != s32 ...and make sure we fail elegantly for unsupported values. s64 goes into DPR, anything <= 32 into GPR. llvm-svn: 360321	2019-05-09 09:49:36 +00:00
Hans Wennborg	b1b09e5b55	X86WinAllocaExpander: Drop code looking through register copies (PR41786) This code was never covered by tests, in PR41786 it was pointed out that the deletion part doesn't work, and in a full Chrome build I was never able to hit the code path that looks through copies. It seems the situation it's supposed to handle doesn't actually come up in practice. Delete it to simplify the code. Differential revision: https://reviews.llvm.org/D61671 llvm-svn: 360320	2019-05-09 09:22:56 +00:00
Markus Lavin	92d5db524e	Make sub-registers index names case sensitive in the MIRParser Prior to this change sub-register index names are assumed to be lower case (but they are printed with original casing). This means that if a target has some upper case characters in its sub-register names then mir-export directly followed by mir-import is not possible. This also means that sub-register indices currently are (and will continue to be) slightly inconsistent with register names which are printed and assumed to be lower case. As the current textual representation of mir has a few inconsistencies in this area it is a bit arbitrary how to address the matter. This change is towards the direction that we feel is most correct (i.e. case sensitivity). Differential Revision: https://reviews.llvm.org/D61499 llvm-svn: 360318	2019-05-09 08:29:04 +00:00
Pengfei Wang	c05aad0532	Bugfix for nullptr check by klocwork Klocwork static check: Pointer from call to function `DebugLoc::operator DILocation *() const` may be NULL and will be dereferenced in function `printExtendedName` Patch by Shengchen Kan (skan) Differential Revision: https://reviews.llvm.org/D61715 llvm-svn: 360317	2019-05-09 08:09:21 +00:00
Bjorn Pettersson	8d19e94f13	[CodeGen] Use "DL.getPointerSizeInBits" instead of "8 * DL.getPointerSize". NFC llvm-svn: 360315	2019-05-09 08:07:36 +00:00
Petr Hosek	366cda03a8	[NewPM] Setup Passes for KASan and KMSan While ASan and MSan passes were already ported to new PM, the kernel variants weren't set up in the pipeline which makes the KASan and KMSan tests in Clang fail. Differential Revision: https://reviews.llvm.org/D61664 llvm-svn: 360313	2019-05-09 06:09:35 +00:00
Leonard Chan	95b7abdcc5	[SelectionDAG] Expand ADD/SUBCARRY This patch allows for expansion of ADDCARRY and SUBCARRY when the target does not support it. Differential Revision: https://reviews.llvm.org/D61411 llvm-svn: 360303	2019-05-09 01:17:48 +00:00
Eric Christopher	c93f56d39e	Temporarily Revert "[DebugInfo] Terminate more location-list ranges at the end of blocks" as it was causing significant compile time regressions. This reverts commit r359426 while we come up with testcases and additional ideas. llvm-svn: 360301	2019-05-08 23:54:03 +00:00
Sanjay Patel	902b3ecdad	[SelectionDAG] fold 'fneg undef' to undef This is extracted from the original draft of D61419 with some additional tests. We don't currently get this in IR (it's conservatively turned into a NaN), but presumably that'll get updated as we add real IR support for 'fneg' rather than 'fsub -0.0, x'. The x86-32 run shows the following, and I haven't looked further to see why, but that seems to be independent: Legalizing: t1: f32 = undef Trying to expand node Creating fp constant: t4: f32 = ConstantFP<0.000000e+00> Differential Revision: https://reviews.llvm.org/D61516 llvm-svn: 360296	2019-05-08 22:19:52 +00:00
Matt Arsenault	462403a5c8	AMDGPU: Mark scheduler classes as final llvm-svn: 360294	2019-05-08 22:10:04 +00:00
Matt Arsenault	01434f9377	AMDGPU: Select VOP3 form of add The VOP3 form should always be the preferred selection, to be shrunk later. This should only be an optimization issue, but this partially works around a problem from clobbering VCC when SIFixSGPRCopies rewrites an SCC defining operation directly to VCC. 3 of the testcases are regressions from failing to fold the immediate in cases it should. These can be avoided by improving the VCC liveness handling in SIFoldOperands. Simply increasing the threshold to computeRegisterLiveness works, although this is common enough that VCC liveness should probably be tracked throughout the pass. The hack of leaving behind an implicit_def instruction to avoid breaking iterator wastes instruction count, which inhibits finding the VCC def in long chains of adds. Doing this however exposes different, worse looking regressions from poor scheduling behavior. This could probably be avoided by forcing the shrink of the addc here, but the scheduler should probably be fixed. The r600 add test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. llvm-svn: 360293	2019-05-08 22:09:57 +00:00
Thomas Preud'homme	4a8ef1128b	[FileCheck] Fix code style of method comments Summary: Fix various issues in code style of method comments: 1) Move all heading comments to all non-static methods near their declaration in the FileCheck.h header file. 2) Harmonize the action verb in doxygen comments for methods to always be in third person 3) Use \returns instead of free text "return" and "returns". 4) Document a couple more parameters while at it. Reviewers: jhenderson, probinson, arichardson Subscribers: javed.absar, kristof.beyls, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61445 llvm-svn: 360288	2019-05-08 21:47:31 +00:00
Stanislav Mekhanoshin	1dbf721315	[AMDGPU] gfx1010 exp modifications Differential Revision: https://reviews.llvm.org/D61701 llvm-svn: 360287	2019-05-08 21:23:37 +00:00
Craig Topper	51a17df45d	[InstCombine] When turning sext into zext due to known bits, return the new ZExt instead of calling replaceinstuseswith The worklist loop that we're returning back to should be able to do the replacement itself. This is how we normally do replacements. My main motivation was that I observed that we weren't preserving the name of the result when we do this transform. The replacement code in the worklist loop will call takeName as part of the replacement. Differential Revision: https://reviews.llvm.org/D61695 llvm-svn: 360284	2019-05-08 20:59:21 +00:00
Changpeng Fang	73b7272e7a	AMDGPU: Fix a mis-placed bracket Differential Revision: https://reviews.llvm.org/D61430 llvm-svn: 360283	2019-05-08 19:46:04 +00:00
Warren Ristow	d27b0c6247	[SCEV] Suppress hoisting insertion point of binops when unsafe InsertBinop tries to move insertion-points out of loops for expressions that are loop-invariant. This patch adds a new parameter, IsSafeToHoist, to guard that hoisting. This allows callers to suppress that hoisting for unsafe situations, such as divisions that may have a zero denominator. This fixes PR38697. Differential Revision: https://reviews.llvm.org/D55232 llvm-svn: 360280	2019-05-08 18:50:07 +00:00
Quentin Colombet	157427245a	[RegAllocFast] Scan physical reg definitions before assigning virtual reg definitions When assigning the definitions of an instruction we were updating the available registers while walking the definitions. Some of those definitions may be from physical registers and thus, they are not available for other definitions to take, but by the time we see that we may have already assigned these registers to another virtual register. Fix that by walking through all the definitions and mark as unavailable the physical register definitions, then do the virtual register assignments. PR41790 llvm-svn: 360278	2019-05-08 18:30:26 +00:00
Alina Sbirlea	458c7339e1	[NewPassManager] Add tuning option: SLPVectorization [NFC]. Summary: Mirror tuning option from old pass manager in new pass manager. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61616 llvm-svn: 360276	2019-05-08 17:58:35 +00:00
Craig Topper	493aec3ef5	[FastISel][X86] Support FNeg instruction in target independent fast isel handling This patch adds support for calling selectFNeg for FNeg instructions in addition to the fsub idiom Differential Revision: https://reviews.llvm.org/D61624 llvm-svn: 360273	2019-05-08 17:27:08 +00:00
Alina Sbirlea	f31eba6494	[MemorySSA] Teach LoopSimplify to preserve MemorySSA. Summary: Preserve MemorySSA in LoopSimplify, in the old pass manager, if the analysis is available. Do not preserve it in the new pass manager. Update tests. Subscribers: nemanjai, jlebar, javed.absar, Prazek, kbarton, zzheng, jsji, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60833 llvm-svn: 360270	2019-05-08 17:05:36 +00:00
Simon Pilgrim	e461e9a77d	[AArch64] Remove scan-build "Value stored during its initialization is never read" warnings. NFCI. llvm-svn: 360268	2019-05-08 16:29:39 +00:00
Simon Pilgrim	12521b2d43	[AArch64] Fix scan-build null/uninitialized pointer warnings. NFCI. llvm-svn: 360267	2019-05-08 16:27:24 +00:00
Simon Pilgrim	e3eec06dde	[AMDGPU] Reapplied BFE canonicalization from D60462 This was committed in rL358887 but reverted in rL360066 due to an x86 regression, really it should have been pre-committed instead of being part of the SimplifyDemandedBits bitcast patch. llvm-svn: 360263	2019-05-08 15:49:10 +00:00
David Greene	6c433713e9	[Reassociation] Place moved instructions after landing pads Reassociation's NegateValue moved instructions to the beginning of blocks (after PHIs) without checking for exception handling pads. It's possible for reassociation to move something into an exception handling block so we need to make sure we don't move things too early in the block. This change advances the insertion point past any exception handling pads. If the block we want to move into contains a catchswitch, we cannot move into it. In that case just create a new neg as if we had not found an existing neg to move. Differential Revision: https://reviews.llvm.org/D61089 llvm-svn: 360262	2019-05-08 15:44:24 +00:00
Nikita Popov	9fd02a71a3	Revert "[ValueTracking] Improve isKnowNonZero for Ints" This reverts commit `3b137a4956`. As reported in https://reviews.llvm.org/D60846, this is causing miscompiles. llvm-svn: 360260	2019-05-08 14:50:01 +00:00
Simon Pilgrim	2788ad3ee2	[LegalizeDAG] Assert non-power-of-2 load/store op splits are in range. NFCI. Fixes static analyzer undefined/out-of-range shift warnings. llvm-svn: 360245	2019-05-08 11:22:10 +00:00
Simon Pilgrim	ec58090491	[Hexagon] Fix cppcheck reduce variable scope warnings. NFCI. Also fixes a static analyzer "Value stored to 'S2' during its initialization is never read" warning. llvm-svn: 360244	2019-05-08 11:02:46 +00:00
Tim Northover	18adcf331b	ARM: disallow SP as Rn for Thumb2 TST & TEQ instructions Using SP in this position is unpredictable in ARMv7. CMP and CMN are not affected, and of course v8 relaxes this requirement, but that's handled elsewhere. llvm-svn: 360242	2019-05-08 10:59:08 +00:00
Simon Pilgrim	cced3ecc35	[VPlan] Fix "value never used" static analyzer warning. NFCI. llvm-svn: 360241	2019-05-08 10:52:26 +00:00
Simon Pilgrim	02937dad69	R600InstrInfo.cpp - Add getTransSwizzle assert for the swizzle op index. NFCI. Fixes static analyzer undefined value warning. llvm-svn: 360239	2019-05-08 10:39:56 +00:00
Andrea Di Biagio	69b8b17945	[MCA] Remove dead assignment. NFC llvm-svn: 360237	2019-05-08 10:28:56 +00:00
Simon Pilgrim	be9ade93d1	[SIMode] Fix typo in Status constructor As noted in https://www.viva64.com/en/b/0629/ (Snippet No. 36) and the scan-build CI reports (https://llvm.org/reports/scan-build/report-SIModeRegister.cpp-Status-1-1.html#EndPath), rL348754 introduced a typo in the Status constructor due to argument variable names shadowing the member variable names. Differential Revision: https://reviews.llvm.org/D61595 llvm-svn: 360236	2019-05-08 10:24:22 +00:00
Simon Pilgrim	2a09a6cfe2	[DebugInfo] Fix use-after-move warning. NFCI. Don't rely on DWARFAbbreviationDeclarationSet::extract cleaning the struct up for reuse - the analyzers don't like it. llvm-svn: 360235	2019-05-08 10:09:57 +00:00
Simon Pilgrim	97a0c54179	Fix cppcheck operator precedence warning. NFCI. llvm-svn: 360234	2019-05-08 10:07:34 +00:00
Florian Hahn	3c696b3e7c	[SCCP] Fix crash when trying to constant-fold terminators multiple times. If we fold a branch/switch to an unconditional branch to another dead block we replace the branch with unreachable, to avoid attempting to fold the unconditional branch. Reviewers: davide, efriedma, mssimpso, jdoerfert Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61300 llvm-svn: 360232	2019-05-08 09:09:54 +00:00
QingShan Zhang	0e71a6e755	[CodeGenPrepare] Don't split the store if it is volatile We shouldn't split the store when it is volatile. Differential Revision: https://reviews.llvm.org/D61169 llvm-svn: 360228	2019-05-08 07:32:12 +00:00
QingShan Zhang	e065af6a42	[NFC] Add a static function to do the endian check Add a new function to do the endian check, as I will commit another patch later, which will also need the endian check. Differential Revision: https://reviews.llvm.org/D61236 llvm-svn: 360226	2019-05-08 07:21:37 +00:00
Mircea Trofin	0a753938db	[llvm] Avoid div by 0 when updating profile weights. Reviewers: davidxl Reviewed By: davidxl Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61661 llvm-svn: 360223	2019-05-08 03:57:25 +00:00
Dan Robertson	3b137a4956	[ValueTracking] Improve isKnowNonZero for Ints Improve isKnownNonZero for integers in order to improve cttz optimizations. Differential Revision: https://reviews.llvm.org/D60846 llvm-svn: 360222	2019-05-08 02:25:08 +00:00
Lang Hames	e4b4ab6d26	[Support] Add error handling to sys::Process::getPageSize(). This patch changes the return type of sys::Process::getPageSize to Expected<unsigned> to account for the fact that the underlying syscalls used to obtain the page size may fail (see below). For clients who use the page size as an optimization only this patch adds a new method, getPageSizeEstimate, which calls through to getPageSize but discards any error returned and substitutes a "reasonable" page size estimate instead. All existing LLVM clients are updated to call getPageSizeEstimate rather than getPageSize. On Unix, sys::Process::getPageSize is implemented in terms of getpagesize or sysconf, depending on which macros are set. The sysconf call is documented to return -1 on failure. On Darwin getpagesize is implemented in terms of sysconf and may also fail (though the manpage documentation does not mention this). These failures have been observed in practice when highly restrictive sandbox permissions have been applied. Without this patch, the result is that getPageSize returns -1, which wreaks havoc on any subsequent code that was assuming a sane page size value. <rdar://problem/41654857> Reviewers: dblaikie, echristo Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59107 llvm-svn: 360221	2019-05-08 02:11:07 +00:00
Reid Kleckner	6bf108d77a	[COFF] Use COFF stubs for extern_weak functions Summary: A COFF stub indirects the reference to a symbol through memory. A .refptr.$sym global variable pointer is created to refer to $sym. Typically mingw uses these for external global variable declarations, but we can use them for weak function declarations as well. Updates the dso_local classification to add a special case for extern_weak symbols on COFF in both clang and LLVM. Fixes PR37598 Reviewers: smeenai, mstorsjo Subscribers: hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D61615 llvm-svn: 360207	2019-05-07 23:06:21 +00:00
Sanjay Patel	e088d03b9c	[ValueTracking] add logic for known-never-nan with minnum/maxnum From the LangRef: "Returns NaN only if both operands are NaN." llvm-svn: 360206	2019-05-07 22:58:31 +00:00
Lang Hames	0d8ae1e343	Reapply r360194 "[JITLink] Add support for MachO .alt_entry atoms." with fixes. This patch modifies MachOAtomGraphBuilder to use setLayoutNext rather than addEdge, and fixes a bug in the section layout algorithm that could result in atoms appearing more than once in the section ordering (which resulted in those atoms being assigned invalid addresses during layout). llvm-svn: 360205	2019-05-07 22:56:40 +00:00
Lang Hames	1a10101e21	Revert r360194 "[JITLink] Add support for MachO .alt_entry atoms." The testcase is asserting on some bots - reverting while I investigate. llvm-svn: 360200	2019-05-07 22:19:29 +00:00
Austin Kerbow	8a3d3a9af6	[AMDGPU] Check MI bundles for hazards Summary: GCNHazardRecognizer fails to identify hazards that are in and around bundles. This patch allows the hazard recognizer to consider bundled instructions in both scheduler and hazard recognizer mode. We ignore “bundledness” for the purpose of detecting hazards and examine the instructions individually. Reviewers: arsenm, msearles, rampitec Reviewed By: rampitec Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61564 llvm-svn: 360199	2019-05-07 22:12:15 +00:00
Austin Kerbow	6e6480e216	[CodeGen] Rename DEBUG_TYPE for default hazard recognizer. Summary: The DEBUG_TYPE of the default hazard recognizer should be updated to match the DEBUG_TYPE of the machine-scheduler pass. Reviewers: rampitec Reviewed By: rampitec Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61359 llvm-svn: 360198	2019-05-07 22:09:04 +00:00
Lang Hames	2b09b25e48	[JITLink] Add support for MachO .alt_entry atoms. The MachO .alt_entry directive is applied to a symbol to indicate that it is locked (in terms of address layout and liveness) to its predecessor atom. I.e. it is an alternate entry point, at a fixed offset, for the previous atom. This patch updates MachOAtomGraphBuilder to check for the .alt_entry flag on symbols and add a corresponding LayoutNext edge to the atom-graph. It also updates MachOAtomGraphBuilder_x86_64 to generalize handling of the X86_64_RELOC_SUBTRACTOR relocation: previously either the minuend or subtrahend of the subtraction had to be the same as the atom being fixed up, now it is only necessary for the minuend or subtrahend to be locked (via any chain of alt_entry directives) to the atom being fixed up. llvm-svn: 360194	2019-05-07 21:35:14 +00:00
Kostya Serebryany	b9c5768302	revert r360162 as it breaks most of the buildbots llvm-svn: 360190	2019-05-07 20:57:11 +00:00
Nikita Popov	f610110f1a	[ConstantRange] Simplify makeGNWR implementation; NFC Compute results in more direct ways, avoid subset intersect operations. Extract the core code for computing mul nowrap ranges into separate static functions, so they can be reused. llvm-svn: 360189	2019-05-07 20:34:46 +00:00
Robert Lougher	8681ef8f41	[InstCombine] Add new combine to add folding (X | C1) + C2 --> (X | C1) ^ C1 iff (C1 == -C2) I verified the correctness using Alive: https://rise4fun.com/Alive/YNV This transform enables the following transform that already exists in instcombine: (X | Y) ^ Y --> X & ~Y As a result, the full expected transform is: (X | C1) + C2 --> X & ~C1 iff (C1 == -C2) There already exists the transform in the sub case: (X | Y) - Y --> X & ~Y However this does not trigger in the case where Y is constant due to an earlier transform: X - (-C) --> X + C With this new add fold, both the add and sub constant cases are handled. Patch by Chris Dawson. Differential Revision: https://reviews.llvm.org/D61517 llvm-svn: 360185	2019-05-07 19:36:41 +00:00
Eric Christopher	4727221734	Make sure that the DAG combiner doesn't merge stores that we explicitly asked not to be greater than preferred vector width for the vectorizer. Test for both 128 and 256 with a skylake architecture. llvm-svn: 360183	2019-05-07 19:25:34 +00:00
Sanjay Patel	6a281a7545	[InstCombine] allow sinking fneg operands through an FP min/max Fundamentally/generally, we should not have to rely on bailouts/crippling of folds. In this particular case, I think we always recognize the inverted predicate min/max pattern, so there should not be any loss of optimization. Codegen looks better because we are eliminating an fneg. llvm-svn: 360180	2019-05-07 18:58:07 +00:00
Don Hinton	102ec0977d	[CommandLine] Allow Options to specify multiple OptionCategory's. Summary: It's not uncommon for separate components to share common Options, e.g., it's common for related Passes to share Options in addition to the Pass specific ones. With this change, components can use OptionCategory's to simplify help output even if some of the options are shared. Reviewed By: MaskRay Tags: #llvm Differential Revision: https://reviews.llvm.org/D61574 llvm-svn: 360179	2019-05-07 18:57:01 +00:00
Adrian Prantl	e6e8db5e9b	Debug Info: Support address space attributes on rvalue references. DWARF5, 2.12 20ff says that Any debugging information entry representing a pointer or reference type [may have a DW_AT_address_class attribute]. The existing code (https://reviews.llvm.org/D29670) seems to take a quite literal interpretation of that wording. I don't see a reason why an rvalue reference isn't a reference type in the spirit of that paragraph. This patch allows rvalue references to also have address spaces. rdar://problem/50511483 Differential Revision: https://reviews.llvm.org/D61625 llvm-svn: 360176	2019-05-07 17:42:38 +00:00
Adrian Prantl	ccdefb24ad	Guard __builtin_available() with __has_builtin to support older host compilers. llvm-svn: 360174	2019-05-07 17:10:27 +00:00
Florian Hahn	a9d6c32eaf	[DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor When simplifying TokenFactors, we potentially iterate over all operands of a large number of TokenFactors. This causes quadratic compile times in some cases and the large token factors cause additional scalability problems elsewhere. This patch adds some limits to the number of nodes explored for the cases mentioned above. Reviewers: niravd, spatel, craig.topper Reviewed By: niravd Differential Revision: https://reviews.llvm.org/D61397 llvm-svn: 360171	2019-05-07 16:47:27 +00:00
Simon Pilgrim	3044ac058b	Avoid use-after-move warnings by using swap instead. NFCI. Swap should be as quick in these cases, and leaves the original variables in a known (empty) state. llvm-svn: 360164	2019-05-07 15:45:00 +00:00
Orlando Cazalet-Hyams	78a6062c24	[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion Summary: Bug: https://bugs.llvm.org/show_bug.cgi?id=39024 The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here: A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins. B) Instructions in the middle block have different line numbers which give the impression of another iteration. In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks. Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel Reviewed By: hfinkel Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits Tags: #llvm, #debug-info Differential Revision: https://reviews.llvm.org/D60831 llvm-svn: 360162	2019-05-07 15:37:38 +00:00
Keno Fischer	a1a4adf4b9	[SCEV] Add explicit representations of umin/smin Summary: Currently we express umin as `~umax(~x, ~y)`. However, this becomes a problem for operands in non-integral pointer spaces, because `~x` is not something we can compute for `x` non-integral. However, since comparisons are generally still allowed, we are actually able to express `umin(x, y)` directly as long as we don't try to express it as a umax. Support this by adding an explicit umin/smin representation to SCEV. We do this by factoring the existing getUMax/getSMax functions into a new function that does all four. The previous two functions were largely identical. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D50167 llvm-svn: 360159	2019-05-07 15:28:47 +00:00
Simon Pilgrim	debb2b2a1e	Fix local shadow variable warning. NFCI. llvm-svn: 360157	2019-05-07 14:56:34 +00:00
Nemanja Ivanovic	b4f028f0f3	[PowerPC] Use the two-constant NR algorithm for refining estimates The single-constant algorithm produces infinities on a lot of denormal values. The precision of the two-constant algorithm is actually sufficient across the range of denormals. We will switch to that algorithm for now to avoid the infinities on denormals. In the future, we will re-evaluate the algorithm to find the optimal one for PowerPC. Differential revision: https://reviews.llvm.org/D60037 llvm-svn: 360144	2019-05-07 13:48:03 +00:00
George Rimar	0974688a42	[yaml2obj] - Allow setting st_name explicitly for Symbol. In some cases it is useful to explicitly set symbol's st_name value. For example, I am using it in a patch for LLD to remove the broken binary from a test case and replace it with a YAML test. Differential revision: https://reviews.llvm.org/D61180 llvm-svn: 360137	2019-05-07 12:10:51 +00:00
Diana Picus	0a47fb8884	[ARM GlobalISel] Widen G_SELECT operands ...except for the condition operand. llvm-svn: 360135	2019-05-07 11:39:30 +00:00
Simon Pilgrim	b0f51266b8	[X86][AVX] Fold concat(packus(),packus()) -> packus(concat(),concat()) (PR34773) Basic "revectorization" combine, we can probably do more opcodes here but it can be a tricky cost-benefit depending on where the subvectors came from - but this case helps shuffle combining. llvm-svn: 360134	2019-05-07 11:17:39 +00:00
Simon Pilgrim	a80abeea88	Fixed "Value stored to 'Opc' is never read" warning. NFCI. llvm-svn: 360133	2019-05-07 11:09:16 +00:00
Simon Pilgrim	3c975a0ab5	[X86] Reduce scope of variables where possible. NFCI. Fixes cppcheck warnings. llvm-svn: 360131	2019-05-07 10:50:11 +00:00
Diana Picus	d6d3808fa4	[ARM GlobalISel] Widen G_INTTOPTR/G_PTRTOINT We actually have a couple of G_PTRTOINT to s8 when building clang, so we should do something about them. llvm-svn: 360130	2019-05-07 10:48:01 +00:00
Simon Pilgrim	c5ac14eef8	Fix uninitialized variable warning. NFCI. This also fixes a scan-build "array subscript is undefined" warning. llvm-svn: 360128	2019-05-07 10:30:22 +00:00
Diana Picus	d18bac5d19	[ARM GlobalISel] Widen G_GEP index operand llvm-svn: 360127	2019-05-07 10:11:57 +00:00
Orlando Cazalet-Hyams	0d05177337	Test commit access llvm-svn: 360125	2019-05-07 09:30:55 +00:00
Nicolai Haehnle	79ea85c6af	AMDGPU: Verify that SOP2/SOPC instructions have at most one immediate operand Summary: No test case because I don't know of a way to trigger this, but I accidentally caused this to fail while working on a different change. Change-Id: I8015aa447fe27163cc4e4902205a203bd44bf7e3 Reviewers: arsenm, rampitec Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61490 llvm-svn: 360123	2019-05-07 09:19:09 +00:00
Craig Topper	c6d445f9c1	[FastISel][X86] If selectFNeg fails, fall back to SelectionDAG not treating it as an fsub. Summary: If fneg lowering for fsub -0.0, x fails we currently fall back to treating it as an fsub. This has different behavior for nans than the xor with sign bit trick we normally try to do. On X86, the xor trick for double fails fast-isel in 32-bit mode with sse2 due to 64 bit integer types not being available. With -O2 we would always use an xorpd for this case. If we use subsd, this creates an observable behavior difference between -O0 and -O2. So fall back to SelectionDAG if we can't fast-isel it, that way SelectionDAG will use the xorpd. I believe this patch is restoring the behavior prior to r345295 from last October. This was missed then because our fast isel case in 32-bit mode aborted fast-isel earlier for another reason. But I've added new tests to cover that. Reviewers: andrew.w.kaylor, cameron.mcinally, spatel, efriedma Reviewed By: cameron.mcinally Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61622 llvm-svn: 360111	2019-05-07 04:25:24 +00:00
Sam Clegg	5f8c2edef3	[WebAssembly] Add more test coverage for relocations against section symbols The only known user of this relocation type and symbol type is the debug info sections, but we were not testing the `--relocatable` output path. This change adds a minimal test case to cover relocations against section symbols, including `--relocatable` output. Differential Revision: https://reviews.llvm.org/D61623 llvm-svn: 360110	2019-05-07 03:53:16 +00:00
Fangrui Song	da82ce99b7	[DebugInfo] Delete TypedDINodeRef TypedDINodeRef<T> is a redundant wrapper of Metadata * that is actually a T . Accordingly, change DI{Node,Scope,Type}Ref uses to DI{Node,Scope,Type} or their const variants. This allows us to delete many resolve() calls that clutter the code. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D61369 llvm-svn: 360108	2019-05-07 02:06:37 +00:00
Fangrui Song	a400ca3f3d	[SanitizerCoverage] Use different module ctor names for trace-pc-guard and inline-8bit-counters Fixes the main issue in PR41693 When both modes are used, two functions are created: `sancov.module_ctor`, `sancov.module_ctor.$LastUnique`, where $LastUnique is the current LastUnique counter that may be different in another module. `sancov.module_ctor.$LastUnique` belongs to the comdat group of the same name (due to the non-null third field of the ctor in llvm.global_ctors). COMDAT group section [ 9] `.group' [sancov.module_ctor] contains 6 sections: [Index] Name [ 10] .text.sancov.module_ctor [ 11] .rela.text.sancov.module_ctor [ 12] .text.sancov.module_ctor.6 [ 13] .rela.text.sancov.module_ctor.6 [ 23] .init_array.2 [ 24] .rela.init_array.2 # 2 problems: # 1) If sancov.module_ctor in this module is discarded, this group # has a relocation to a discarded section. ld.bfd and gold will # error. (Another issue: it is silently accepted by lld) # 2) The comdat group has an unstable name that may be different in # another translation unit. Even if the linker allows the dangling relocation # (with --noinhibit-exec), there will be many undesired .init_array entries COMDAT group section [ 25] `.group' [sancov.module_ctor.6] contains 2 sections: [Index] Name [ 26] .init_array.2 [ 27] .rela.init_array.2 By using different module ctor names, the associated comdat group names will also be different and thus stable across modules. Reviewed By: morehouse, phosek Differential Revision: https://reviews.llvm.org/D61510 llvm-svn: 360107	2019-05-07 01:39:37 +00:00
Craig Topper	a75630302d	[X86] Use extended vector register classes in getRegForInlineAsmConstraint to support x/y/zmm16-31 when the type is mismatched. The FR32/FR64/VR128/VR256 register classes don't contain the upper 16 registers. For most cases we use the default implementation which will find any register class that contains the register in question if the VT is legal for the register class. But if the VT is i32 or i64, we won't find a matching register class and will instead end up in the code modified in this patch. If the requested register is x/y/zmm16-31 we weren't returning a register class that contains those registers and will hit an assertion in the caller. To fix this, I've changed to use the extended register class instead. I don't believe we need a subtarget check to see if avx512 is enabled. The default implementation just picks whatever register class it finds first. I checked and we currently pick FR32X for XMM0 with an f32 type using the default implementation regardless of whether avx512 is enabled. So I assume it is ok to do the same for i32. Differential Revision: https://reviews.llvm.org/D61457 llvm-svn: 360102	2019-05-06 23:57:42 +00:00
Amy Huang	987b969bab	Fix bug in getCompleteTypeIndex in codeview debug info Summary: When there are multiple instances of a forward decl record type, only the first one is emitted with a type index, because the type is added to a map with a null type index. Avoid this by reordering so that forward decl types aren't added to the map. Reviewers: rnk Subscribers: aprantl, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61460 llvm-svn: 360101	2019-05-06 23:37:03 +00:00
Eli Friedman	2ea088173d	[ARM] Glue register copies to tail calls. This generally follows what other targets do. I don't completely understand why the special case for tail calls existed in the first place; even when the code was committed in r105413, call lowering didn't work in the way described in the comments. Stack protector lowering breaks if the register copies are not glued to a tail call: we have to insert the stack protector check before the tail call, and we choose the location based on the assumption that all physical register dependencies of a tail call are adjacent to the tail call. (See FindSplitPointForStackProtector.) This is sort of fragile, but I don't see any reason to break that assumption. I'm guessing nobody has seen this before just because it's hard to convince the scheduler to actually schedule the code in a way that breaks; even without the glue, the only computation that could actually be scheduled after the register copies is the computation of the call address, and the scheduler usually prefers to schedule that before the copies anyway. Fixes https://bugs.llvm.org/show_bug.cgi?id=41417 Differential Revision: https://reviews.llvm.org/D60427 llvm-svn: 360099	2019-05-06 23:21:59 +00:00
Craig Topper	39f1a97417	[FastISel] Pass the fneg input operand to hasTrivialKill in FastISel::selectFNeg. We're trying to calculate the kill flag for OpReg which is the input so we need to pass the input here. llvm-svn: 360097	2019-05-06 23:09:09 +00:00
Stanislav Mekhanoshin	491746a584	[AMDGPU] gfx1010 verifier changes Differential Revision: https://reviews.llvm.org/D61521 llvm-svn: 360095	2019-05-06 22:49:45 +00:00
Stanislav Mekhanoshin	971cb8b633	[AMDGPU] gfx1010: prefer V_MUL_LO_U32 over V_MUL_LO_I32 GFX10 deprecates v_mul_lo_i32 instruction, so choose u32 form for all targets. Differential Revision: https://reviews.llvm.org/D61525 llvm-svn: 360094	2019-05-06 22:27:05 +00:00
Philip Reames	2f53d79bff	Fix pr33010, a 2 year old crashing regression The problem was that we were creating a CMOV64rr <TargetFrameIndex>, <TargetFrameIndex>. The entire point of a TFI is that address code is not generated, so there's no way to legalize/lower this. Instead, simply prevent its creation. Arguably, we shouldn't be using TargetFrameIndices in StatepointLowering at all, but that's a much deeper change. llvm-svn: 360090	2019-05-06 22:09:31 +00:00
Stanislav Mekhanoshin	1bc001dec4	[AMDGPU] gfx1010 memory legalizer Differential Revision: https://reviews.llvm.org/D61535 llvm-svn: 360087	2019-05-06 21:57:02 +00:00
Jordan Rupprecht	8f14e7cacf	Revert "Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259)" This reverts r357452 (git commit `21eb771dcb`). This was causing strange optimization-related test failures on an internal test. Will follow up with more details offline. llvm-svn: 360086	2019-05-06 21:55:05 +00:00
Craig Topper	d10a200ceb	[X86] Remove the suffix on vcvt[u]si2ss/sd register variants in assembly printing. We require d/q suffixes on the memory form of these instructions to disambiguate the memory size. We don't require it on the register forms, but need to support parsing both with and without it. Previously we always printed the d/q suffix on the register forms, but it's redundant and inconsistent with gcc and objdump. After this patch we should support the d/q for parsing, but not print it when it's unneeded. llvm-svn: 360085	2019-05-06 21:39:51 +00:00
Martin Storsjo	899f3cd581	[AArch64] Default to SEH exception handling on MinGW The SEH implementation is pretty mature at this point. Differential Revision: https://reviews.llvm.org/D61590 llvm-svn: 360080	2019-05-06 21:18:15 +00:00
Sanjay Patel	a6019d5164	[InstCombine] sink FP negation of operands through select We don't always get this: Cond ? -X : -Y --> -(Cond ? X : Y) ...even with the legacy IR form of fneg in the case with extra uses, and we miss matching with the newer 'fneg' instruction because we are expecting binops through the rest of the path. Differential Revision: https://reviews.llvm.org/D61604 llvm-svn: 360075	2019-05-06 20:34:05 +00:00
Simon Pilgrim	364ef5db2b	Pull out repeated CI->getCalledFunction() calls. NFCI. llvm-svn: 360070	2019-05-06 19:51:54 +00:00
Craig Topper	ad56843dd7	[SelectionDAG][X86] Support inline assembly returning an mmx register into a type with fewer than 64 bits. It's possible to use the 'y' mmx constraint with a type narrower than 64-bits. This patch supports this by bitcasting the mmx type to 64-bits and then truncating to the desired type. There are probably other missing type combinations we need to support, but this is the case we have a bug report for. Fixes PR41748. Differential Revision: https://reviews.llvm.org/D61582 llvm-svn: 360069	2019-05-06 19:50:14 +00:00
Amara Emerson	3d1128cc9e	[GlobalISel] Handle <1 x T> vector return types properly. After support for dealing with types that need to be extended in some way was added in r358032 we didn't correctly handle <1 x T> return types. These types don't have a GISel direct representation, instead we just see them as scalars. When we need to pad them into <2 x T> types however we need to use a G_BUILD_VECTOR instead of trying to do a G_CONCAT_VECTOR. This fixes PR41738. llvm-svn: 360068	2019-05-06 19:41:01 +00:00
Craig Topper	55a71b575c	Revert r359392 and r358887 Reverts "[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead" Reverts "[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling" Eric Christopher and Jorge Gorbe Moya reported some issues with these patches to me off list. Removing the CodeGenOnly instructions has changed how fneg is handled during fast-isel with sse/sse2. We're now emitting fsub -0.0, x instead of moving to the integer domain (in a GPR), xoring the sign bit, and then moving back to xmm. This is because the fast isel table no longer contains an entry for (f32/f64 bitcast (i32/i64)) so the target independent fneg code fails. The use of fsub changes the behavior of nan with respect to -O2 codegen which will always use a pxor. NOTE: We still have a difference with double with -m32 since the move to GPR doesn't work there. I'll file a separate PR for that and add test cases. Since removing the CodeGenOnly instructions was fixing PR41619, I'm reverting r358887 which exposed that PR. Though I wouldn't be surprised if that bug can still be hit independent of that. This should hopefully get Google back to green. I'll work with Simon and other X86 folks to figure out how to move forward again. llvm-svn: 360066	2019-05-06 19:29:24 +00:00
Sanjay Patel	a64bd09ec4	[InstCombine] reduce code duplication; NFC llvm-svn: 360059	2019-05-06 17:39:18 +00:00
Nikita Popov	d5a403fb80	[ConstantRange] Add srem() support Add support for srem() to ConstantRange so we can use it in LVI. For srem the sign of the result matches the sign of the LHS. For the RHS only the absolute value is important. Apart from that the logic is like urem. Just like for urem this is only an approximate implementation. The tests check a few specific cases and run an exhaustive test for conservative correctness (but not exactness). Differential Revision: https://reviews.llvm.org/D61207 llvm-svn: 360055	2019-05-06 16:59:37 +00:00
Nikita Popov	cfe786a195	[SDAG][AArch64] Boolean and/or reduce to umax/min reduce (PR41635) This addresses one half of https://bugs.llvm.org/show_bug.cgi?id=41635 by combining a VECREDUCE_AND/OR into VECREDUCE_UMIN/UMAX (if latter is legal but former is not) for zero-or-all-ones boolean reductions (which are detected based on sign bits). Differential Revision: https://reviews.llvm.org/D61398 llvm-svn: 360054	2019-05-06 16:17:17 +00:00
Cameron McInally	c3167696bc	Add FNeg support to InstructionSimplify Differential Revision: https://reviews.llvm.org/D61573 llvm-svn: 360053	2019-05-06 16:05:10 +00:00
Sanjay Patel	62f457b137	[InstCombine] reduce code duplication; NFCI llvm-svn: 360051	2019-05-06 15:35:02 +00:00
Guillaume Chatelet	edd69fca3e	Modernize repmovsb implementation of x86 memcpy and allow runtime sizes. Summary: This is a prerequisite to RFC http://lists.llvm.org/pipermail/llvm-dev/2019-April/131973.html Reviewers: courbet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61593 Fix typo. Turn this patch into an NFC. Addressing comments llvm-svn: 360050	2019-05-06 15:10:19 +00:00
Simon Pilgrim	2a0ef0530b	[X86] Fix uninitialized members in constructor warnings. NFCI. Initialize all member variables in X86ATTInstPrinter and X86DAGToDAGISel constructors to fix cppcheck warning. llvm-svn: 360047	2019-05-06 14:48:02 +00:00
Alexandre Ganea	799d96ec39	Fix compilation warnings when compiling with GCC 7.3 Differential Revision: https://reviews.llvm.org/D61046 llvm-svn: 360044	2019-05-06 13:41:54 +00:00
Nemanja Ivanovic	70afe4f7e1	[PowerPC] Fix erroneous condition for converting uint-to-fp vector conversion A condition for exiting the legalization of v4i32 conversion to v2f64 through extract/convert/build erroneously checks for the extract having type i32. This is not adequate as smaller extracts are actually legalized to i32 as well. Furthermore, an early exit is missing which means that we only check that both extracts are from the same vector if that check fails. As a result, both cases in the included test case fail - the first gets a select error and the second generates incorrect code. The culprit commit is r274535. llvm-svn: 360043	2019-05-06 13:35:49 +00:00
Simon Pilgrim	d672d0e246	X86DAGToDAGISel::tryVPTESTM - fix uninitialized variable warning. NFCI. findBroadcastedOp should always initialize the value if it returns true but static-analyzer isn't great at recognising this. llvm-svn: 360037	2019-05-06 11:52:16 +00:00
Simon Pilgrim	97fbc2abfe	[LoadStoreVectorizer] vectorizeStoreChain - ensure we find a store type. Properly initialize store type to null then ensure we find a real store type in the chain. Fixes scan-build null dereference warning and makes the code clearer. llvm-svn: 360031	2019-05-06 10:25:11 +00:00
Simon Pilgrim	04dad8f66d	[X86] X86InstrInfo::findThreeSrcCommutedOpIndices - fix unread variable warning. scan-build was reporting that CommutableOpIdx1 never used its original initialized value - move it down to where it's first used to make the real initialization more obvious (and match the comment that's there). llvm-svn: 360028	2019-05-06 10:15:34 +00:00
Simon Pilgrim	07d91cd98a	[X86] lowerVectorShuffle - use any_of to detect out of bounds shuffle indices. NFCI. Fixes cppcheck local shadow warning as well. llvm-svn: 360027	2019-05-06 10:11:24 +00:00
Clement Courbet	9e1f2a7fe7	[SimplifyLibCalls] Simplify bcmp too. Summary: Fixes PR40699. Reviewers: gchatelet Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61585 llvm-svn: 360021	2019-05-06 09:15:22 +00:00
Luo, Yuanke	beec41c656	Enable AVX512_BF16 instructions, which are supported for BFLOAT16 in Cooper Lake Summary: 1. Enable infrastructure of AVX512_BF16, which is supported for BFLOAT16 in Cooper Lake; 2. Enable VCVTNE2PS2BF16, VCVTNEPS2BF16 and VDPBF16PS instructions, which are Vector Neural Network Instructions supporting BFLOAT16 inputs and conversion instructions from IEEE single precision. VCVTNE2PS2BF16: Convert Two Packed Single Data to One Packed BF16 Data. VCVTNEPS2BF16: Convert Packed Single Data to Packed BF16 Data. VDPBF16PS: Dot Product of BF16 Pairs Accumulated into Packed Single Precision. For more details about BF16 isa, please refer to the latest ISE document: https://software.intel.com/en-us/download/intel-architecture-instruction-set-extensions-programming-reference Author: LiuTianle Reviewers: craig.topper, smaslov, LuoYuanke, wxiao3, annita.zhang, RKSimon, spatel Reviewed By: craig.topper Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60550 llvm-svn: 360017	2019-05-06 08:22:37 +00:00
Fangrui Song	7e55672b22	DWARF v5: fix directory index in the line table Summary: Prior to DWARF v5, a directory index of 0 represents DW_AT_comp_dir. In DWARF v5, the index starts with 0 and Entry.DirIdx is the index into Prologue.IncludeDirectories. Reviewed By: labath Differential Revision: https://reviews.llvm.org/D61253 llvm-svn: 360015	2019-05-06 08:03:46 +00:00
Markus Lavin	a778074165	[DebugInfo] GlobalOpt DW_OP_deref_size instead of DW_OP_deref. Optimization pass lib/Transforms/IPO/GlobalOpt.cpp needs to insert DW_OP_deref_size instead of DW_OP_deref to be compatible with big-endian targets for same reasons as in D59687. Differential Revision: https://reviews.llvm.org/D60611 llvm-svn: 360013	2019-05-06 07:20:56 +00:00
Craig Topper	f723490e76	[SelectionDAG] Replace llvm_unreachable at the end of getCopyFromParts with a report_fatal_error. Based on PR41748, not all cases are handled in this function. llvm_unreachable is treated as an optimization hint that can prune code paths in a release build. This causes weird behavior when PR41748 is encountered on a release build. It appears to generate an fp_round instruction from the floating point code. Making this a report_fatal_error prevents incorrect optimization of the code and will instead generate a message to file a bug report. llvm-svn: 360008	2019-05-06 04:01:49 +00:00
Simon Pilgrim	8462cc3c74	[X86] Pull out repeated Subtarget feature tests. NFCI. Avoids a scan-build "uninitialized value" warning in X86FastISel::X86SelectFPExtOrFPTrunc llvm-svn: 360001	2019-05-05 20:45:20 +00:00
Simon Pilgrim	addc90e4e8	[TTI][X86] Make getAddressComputationCost cost value const. NFCI. llvm-svn: 359999	2019-05-05 20:03:51 +00:00
Roman Lebedev	02569408ef	[NFC] BasicBlock: generalize replaceSuccessorsPhiUsesWith(), take Old bb Thus it does not assume that the old basic block is the basic block for which we are looking at successors. Not reviewed, but seems rather trivial, in line with the rest of previous few patches. llvm-svn: 359997	2019-05-05 18:59:45 +00:00
Roman Lebedev	1a1b922177	[NFC] BasicBlock: refactor changePhiUses() out of replacePhiUsesWith(), use it Summary: It is a common thing to loop over every `PHINode` in some `BasicBlock` and change old `BasicBlock` incoming block to a new `BasicBlock` incoming block. `replaceSuccessorsPhiUsesWith()` already had code to do that, it just wasn't a function. So outline it into a new function, and use it. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61013 llvm-svn: 359996	2019-05-05 18:59:39 +00:00
Roman Lebedev	e3b1d82b53	[NFC] PHINode: introduce replaceIncomingBlockWith() function, use it Summary: There is `PHINode::getBasicBlockIndex()`, `PHINode::setIncomingBlock()` and `PHINode::getNumOperands()`, but no function to replace every specified `BasicBlock` predecessor with some other specified `BasicBlock`. Clearly, there are a lot of places that could use that functionality. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61011 llvm-svn: 359995	2019-05-05 18:59:30 +00:00
Roman Lebedev	7ad5d14f3a	[NFC] Instruction: introduce replaceSuccessorWith() function, use it Summary: There is `Instruction::getNumSuccessors()`, `Instruction::getSuccessor()` and `Instruction::setSuccessor()`, but no function to replace every specified `BasicBlock` successor with some other specified `BasicBlock`. I've found one place where it should clearly be used. Reviewers: chandlerc, craig.topper, spatel, danielcdh Reviewed By: craig.topper Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61010 llvm-svn: 359994	2019-05-05 18:59:22 +00:00
Roman Lebedev	e5be660e25	[NFC][Utils] deleteDeadLoop(): add an assert that exit block has some non-PHI instruction Summary: If `deleteDeadLoop()` is called on such a loop, that has "bad" exit block, one that e.g. has no terminator instruction, the `DIBuilder::insertDbgValueIntrinsic()` will be told to insert the Dbg Value Intrinsic after `nullptr` (since there is no first non-PHI instruction), which will cause it to not insert those instructions into any basic block. The instructions will be parent-less, and IR verifier will complain. It is rather obvious to track down the root cause when that happens, so let's just assert it never happens. Reviewers: sanjoy, davide, vsk Reviewed By: vsk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61008 llvm-svn: 359993	2019-05-05 18:59:12 +00:00
Simon Pilgrim	5170c0e5fe	Move getOpcode() call into if statement. NFCI. Avoids a cppcheck "Local variable name shadows outer variable" warning. llvm-svn: 359991	2019-05-05 18:34:38 +00:00
Simon Pilgrim	afb0e664e6	[SLPVectorizer] Prefer pre-increments. NFCI. llvm-svn: 359989	2019-05-05 17:53:09 +00:00
Craig Topper	922e252a70	[LLParser] Remove unused variable after r359987. NFC llvm-svn: 359988	2019-05-05 17:46:17 +00:00
Craig Topper	f6e07c472d	[LLParser] Remove unnecessary error check making sure NUW/NSW flags aren't set on a non-integer operation. Summary: This check appears to be a leftover from when add/sub/mul could be either integer or fp. The NSW/NUW flags are only set for add/sub/mul/shl earlier. And we check that those operations only have integer types just below this. So it seems unnecessary to explicitly error for NUW/NSW being used on an add/sub/mul that has the wrong type and would later error for that. Reviewers: spatel, dblaikie, jyknight, arsenm Reviewed By: spatel Subscribers: wdng, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D61562 llvm-svn: 359987	2019-05-05 17:19:23 +00:00
Craig Topper	8279695d66	[LLParser] Simplify type checking in ParseArithmetic and ParseUnaryOp. Summary: These methods previously took a 0, 1, or 2 to indicate what types were allowed, but the 0 encoding which meant both fp and integer types has been unused for years. It's a leftover from when add/sub/mul used to be shared between int and fp. Simplify it by changing it to just a bool to distinguish int and fp. Reviewers: spatel, dblaikie, jyknight, arsenm Reviewed By: spatel Subscribers: wdng, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61561 llvm-svn: 359986	2019-05-05 17:19:19 +00:00
Craig Topper	41c999bcf5	[Constants] Simplify type checking switch in ConstantExpr::get. Summary: Remove duplicate checks that both operands have the same type. This is checked before the switch. Use 'integer' or 'floating-point' instead of 'arithmetic' type. I think this might be a leftover from the days when floating point and integer operations shared the same opcodes. Reviewers: spatel, RKSimon, dblaikie Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61558 llvm-svn: 359985	2019-05-05 17:19:16 +00:00
Andrea Di Biagio	0460a3629b	[MCA] Notify event listeners when instructions transition to the Pending state. NFCI llvm-svn: 359983	2019-05-05 16:07:27 +00:00
Cameron McInally	1d0c845d9d	Add FNeg IR constant folding support llvm-svn: 359982	2019-05-05 16:07:09 +00:00
Simon Pilgrim	70ee2def90	[X86] Make X86RegisterInfo(const Triple &TT) constructor explicit. Fixes cppcheck warning. llvm-svn: 359981	2019-05-05 12:51:47 +00:00
Simon Pilgrim	cbcd9b1b92	[X86] Fix some cppcheck "Local variable name shadows outer variable" warnings. NFCI. llvm-svn: 359976	2019-05-05 12:00:14 +00:00
Simon Pilgrim	5b05f20a3a	[SLPVectorizer] Make getSpillCost() const. NFCI. Ideally getTreeCost() should be const as well but non-const Type creation would need to be addressed first. llvm-svn: 359975	2019-05-05 10:37:38 +00:00
Simon Pilgrim	0f89b76b84	[SelectionDAG] Use any_of/all_of where possible. NFCI. llvm-svn: 359974	2019-05-05 10:30:04 +00:00
Simon Pilgrim	7a2e855a0f	Move Value *RHSCIOp def into the scope where it's actually used. NFCI. llvm-svn: 359973	2019-05-05 10:27:45 +00:00
Sanjay Patel	5ab41a7a05	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block (2nd try) This is a subset of the original commit from rL359879 which was reverted because it could crash when using the 'RemovedInstructions' structure that enables delayed deletion of dead instructions. The motivating compile-time win does not require that change though. We should get most of that win from this change alone. Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359969	2019-05-04 12:46:32 +00:00
Stanislav Mekhanoshin	5ddd564e19	[AMDGPU] Fixed asan error after D61536 llvm-svn: 359963	2019-05-04 06:40:20 +00:00
Stanislav Mekhanoshin	51d1415a16	[AMDGPU] gfx1010 hazard recognizer Differential Revision: https://reviews.llvm.org/D61536 llvm-svn: 359961	2019-05-04 04:30:57 +00:00
Stanislav Mekhanoshin	28a1936f6d	[AMDGPU] gfx1010: use fmac instructions Differential Revision: https://reviews.llvm.org/D61527 llvm-svn: 359959	2019-05-04 04:20:37 +00:00
Lang Hames	ce8255f3e2	[JITLink] Add two useful Section operations: find by name, get address range. These operations were already used in eh-frame registration, and are likely to be used in other runtime registrations, so this commit moves them into a header where they can be re-used. llvm-svn: 359950	2019-05-04 00:23:09 +00:00
Jessica Paquette	910630c1e4	[AArch64][GlobalISel] Use fcsel instead of csel for G_SELECT on FPRs This saves us some unnecessary copies. If the inputs to a G_SELECT are floating point, we should use fcsel rather than csel. Changes here are... - Teach selectCopy about s1-to-s1 copies across register banks. - AArch64RegisterBankInfo about G_SELECT in general. - Teach the instruction selector about the FCSEL instructions. Also add two tests: - select-select.mir to show that we get the expected FCSEL - regbank-select.mir (unfortunately named) to show the register banks on G_SELECT are properly preserved And update fast-isel-select.ll to show that we do the same thing as other instruction selectors in these cases. llvm-svn: 359940	2019-05-03 22:37:46 +00:00
Stanislav Mekhanoshin	d9dcf392c7	[AMDGPU] gfx1010 wait count insertion Differential Revision: https://reviews.llvm.org/D61534 llvm-svn: 359938	2019-05-03 21:53:53 +00:00
Stanislav Mekhanoshin	41bbe101a2	[AMDGPU] gfx1010 s_code_end generation Also add some missing metadata in the streamer. Differential Revision: https://reviews.llvm.org/D61531 llvm-svn: 359937	2019-05-03 21:26:39 +00:00
Stanislav Mekhanoshin	93f15c922f	[AMDGPU] gfx1010 loop alignment Differential Revision: https://reviews.llvm.org/D61529 llvm-svn: 359935	2019-05-03 21:17:29 +00:00
Mandeep Singh Grang	5dc8aeb26d	[COFF, ARM64] Fix ABI implementation of struct returns Summary: Refer the ABI doc at: https://docs.microsoft.com/en-us/cpp/build/arm64-windows-abi-conventions?view=vs-2019#return-values Related clang patch: D60349 Reviewers: rnk, efriedma, TomTan, ssijaric Reviewed By: rnk, efriedma Subscribers: mstorsjo, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60348 llvm-svn: 359934	2019-05-03 21:12:36 +00:00
Matt Arsenault	b6c599afd3	Reapply r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block" This reverts commit r359912. This should pass now, since the clang test was made less fragile in r359918. llvm-svn: 359919	2019-05-03 19:06:57 +00:00
Don Hinton	f6eac2dd3b	[CommandLine] Enable Grouping for short options by default. Part 4 of 5 Summary: This change enables `cl::Grouping` for short options -- options with names of a single character. This is consistent with GNU getopt behavior. Reviewers: rnk, MaskRay Reviewed By: MaskRay Subscribers: thopre, cfe-commits, MaskRay, rupprecht, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D61270 llvm-svn: 359917	2019-05-03 18:56:25 +00:00
Simon Pilgrim	5d3b100750	[DAGCombine] Remove repeated variables. NFCI. llvm-svn: 359915	2019-05-03 18:20:28 +00:00
Nico Weber	bb852a9672	Revert r359906, "RegAllocFast: Add heuristic to detect values not live-out of a block" Makes clang/test/Misc/backend-stack-frame-diagnostics-fallback.cpp fail. llvm-svn: 359912	2019-05-03 18:08:03 +00:00
Simon Pilgrim	308b5ec1ff	[TargetLowering] SimplifySetCC - remove repeated variable. NFCI. Also reduce scope of Temp variable. llvm-svn: 359911	2019-05-03 18:02:33 +00:00
Don Hinton	c242be40a1	[CommandLine] Change help output to prefix long options with `--` instead of `-`. NFC . Part 3 of 5 Summary: By default, `parseCommandLineOptions()` will accept either a `-` or `--` prefix for long options -- options with names longer than a single character. While this change does not affect behavior, it will be helpful with a subsequent change that requires long options use the `--` prefix. Reviewers: rnk, thopre Reviewed By: thopre Subscribers: thopre, cfe-commits, hiraditya, llvm-commits Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D61269 llvm-svn: 359909	2019-05-03 17:47:29 +00:00
Evgeniy Stepanov	46ec57e576	Revert "[CodeGenPrepare] limit overflow intrinsic matching to a single basic block" This reverts commit r359879, which introduced a compiler crash. llvm-svn: 359908	2019-05-03 17:31:49 +00:00
Matt Arsenault	daf2d653fa	RegAllocFast: Add heuristic to detect values not live-out of a block Add an improved/new heuristic to catch more cases when values are not live out of a basic block. Patch by Matthias Braun llvm-svn: 359906	2019-05-03 17:03:24 +00:00
Brian Cain	3428c9daef	[hexagon] change AsmParser assertion to error For immediates that can't be evaluated in assembler-mapped instructions, we should return 'invalid operand' instead of assert. llvm-svn: 359905	2019-05-03 16:50:38 +00:00
Craig Topper	a8f3840c62	[X86] Allow assembly parser to accept x/y/z suffixes on non-memory vfpclassps/pd and on memory forms in intel syntax The x/y/z suffix is needed to disambiguate the memory form in at&t syntax since no xmm/ymm/zmm register is mentioned. But we should also allow it for the register and broadcast forms where it's not needed for consistency. This matches gas. The printing code will still only use the suffix for the memory form where it is needed. llvm-svn: 359903	2019-05-03 16:15:15 +00:00
Simon Pilgrim	b323d5ec7c	[X86] LowerToHorizontalOp - Tidyup calls to getHopForBuildVector. NFCI. Merge the if() tests for the various HADD/SUB + Subtarget tests llvm-svn: 359901	2019-05-03 15:56:06 +00:00
Simon Pilgrim	d857f64c31	[SelectionDAG] CreateTopologicalOrder - don't use iterator We shouldn't use an iterator to loop across a std::vector when the same loop is adding elements to that std::vector Found by cppcheck llvm-svn: 359900	2019-05-03 15:50:37 +00:00
Matt Arsenault	657ef48a88	AMDGPU: Select VOP3 form of sub The VOP3 form should always be the preferred selection form to be shrunk later. The r600 sub test needs to be split out because it asserts on the arguments in the new test during the calling convention lowering. llvm-svn: 359899	2019-05-03 15:37:07 +00:00
Matt Arsenault	cfd0ca38b0	AMDGPU: Support shrinking add with FI in SIFoldOperands Avoids test regression in a future patch llvm-svn: 359898	2019-05-03 15:21:53 +00:00
Matt Arsenault	344d68d3c9	AMDGPU: Remove redundant patterns for shifts llvm-svn: 359895	2019-05-03 15:08:36 +00:00
Matt Arsenault	ada33314a2	AMDGPU: Remove redundant patterns for sub There were 2 patterns for sub, one selecting to sub and one to subrev. Only one of these will succeed, so remove the reversed one. llvm-svn: 359894	2019-05-03 15:08:35 +00:00
Matt Arsenault	0446fbe45e	AMDGPU: Replace shrunk instruction with dummy implicit_def This was broken if the original operand was killed. The kill flag would appear on both instructions, and fail the verifier. Keep the kill flag, but remove the operands from the old instruction. This has an added benefit of really reducing the use count for future folds. Ideally the pass would be structured more like what PeepholeOptimizer does to avoid this hack to avoid breaking instruction iterators. llvm-svn: 359891	2019-05-03 14:40:10 +00:00
Simon Pilgrim	bc876df3a5	[TargetLowering] ShrinkDemandedConstant - reduce scope of TLO.DAG variable. NFCI. Only ever used in one block llvm-svn: 359890	2019-05-03 14:38:24 +00:00
Simon Pilgrim	bfdd0f75a8	[X86] Remove repeated variables. NFCI. llvm-svn: 359889	2019-05-03 14:37:00 +00:00
Simon Pilgrim	aa49be4926	Avoid cppcheck operator precedence warnings. NFCI. Prefer ((X & Y) ? A : B) to (X & Y ? A : B) llvm-svn: 359884	2019-05-03 13:50:38 +00:00
Matt Arsenault	2c8936fd26	AMDGPU: Fix incorrect commute with sub when folding immediates When a fold of an immediate into a sub/subrev required shrinking the instruction, the wrong VOP2 opcode was used. This was using the VOP2 equivalent of the original instruction, not the commuted instruction with the inverted opcode. llvm-svn: 359883	2019-05-03 13:42:56 +00:00
Sanjay Patel	8ff072e48e	[CodeGenPrepare] limit overflow intrinsic matching to a single basic block Using/updating a dominator tree to match math overflow patterns may be very expensive in compile-time (because of the way CGP uses a DT), so just handle the single-block case. Also, we were restarting the iterator loops when doing the overflow intrinsic transforms by marking the dominator tree for update. That was done to prevent iterating over a removed instruction. But we can postpone the deletion using the existing "RemovedInsts" structure, and that means we don't need to update the DT. See post-commit thread for rL354298 for more details: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190422/646276.html Differential Revision: https://reviews.llvm.org/D61075 llvm-svn: 359879	2019-05-03 13:09:18 +00:00
Sean Fertile	fd75ee9154	[Object][XCOFF] Add an XCOFF dumper for llvm-readobj. Patch adds support for dumping of file headers with llvm-readobj. XCOFF object files are added to test dumping a well formed file, and dumping both negative timestamps and negative symbol counts, both of which are allowed in the XCOFF definition. Differential Revision: https://reviews.llvm.org/D60878 llvm-svn: 359878	2019-05-03 12:57:07 +00:00
Simon Pilgrim	e798e3a346	[TargetLowering] expandUnalignedStore - cleanup EVT variables. NFCI. Avoid duplicated EVTs and rename Store/Load VTs to avoid -Wshadow warnings. llvm-svn: 359877	2019-05-03 12:55:25 +00:00
Anton Afanasyev	6d08b8dbae	Revert "[MIR] Add simple PRE pass to MachineCSE" This reverts commit `9c20156de3`. It breaks stage 2 of clang-ppc64be-linux-multistage. llvm-svn: 359875	2019-05-03 12:36:22 +00:00
Simon Pilgrim	42d2b604b5	[SelectionDAG] Use INT_MIN as (1 << 31) is UB for signed integers. NFCI. llvm-svn: 359873	2019-05-03 11:32:00 +00:00
Simon Pilgrim	bfd00a6440	[SelectionDAG] computeKnownBits - remove some duplicate/shadow variables. NFCI. llvm-svn: 359872	2019-05-03 11:11:03 +00:00
Simon Pilgrim	a359ef192b	[X86] LowerMULH - remove unused Lo/Hi vector indices. NFCI. Leftover from before we had the extract128BitVector helpers. llvm-svn: 359871	2019-05-03 10:32:07 +00:00
Anton Afanasyev	9c20156de3	[MIR] Add simple PRE pass to MachineCSE This is the second part of the commit fixing PR38917 (hoisting partially redundant machine instruction). Most of PRE (partial redundancy elimination) and CSE work is done on LLVM IR, but some redundancy arises during DAG legalization. Machine CSE is not enough to deal with it. This simple PRE implementation works a little bit intricately: it runs before CSE, looking for partial redundancy and transforming it to full redundancy, anticipating that the next CSE step will eliminate this created redundancy. If CSE doesn't eliminate this, then the created instruction will remain dead and be eliminated later by the Remove Dead Machine Instructions pass. The third part of the commit is supposed to refactor MachineCSE, to make it more clear and to merge MachinePRE with MachineCSE, so one need not rely on a further Remove Dead pass to clear instrs not eliminated by CSE. First step: https://reviews.llvm.org/D54839 Fixes llvm.org/PR38917 Reviewers: RKSimon Subscribers: hfinkel, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D56772 llvm-svn: 359870	2019-05-03 10:30:59 +00:00
Simon Pilgrim	88f9117168	Reduce variable scope to just the if() block it's actually used in. NFCI. llvm-svn: 359869	2019-05-03 10:13:41 +00:00
Craig Topper	d724360695	[X86] Add more one checks to masked compare patterns that were missed in r358358. This covers the patterns we use for widening 128/256 comparisons to 512-bit when AVX512VL isn't supported. llvm-svn: 359863	2019-05-03 07:14:05 +00:00
Quentin Colombet	c9256cc6ba	[IRTranslator] Use the alloc size instead of the store size when translating allocas We used to incorrectly use the store size instead of the alloc size when creating the stack slot for allocas. On aarch64 this can be demonstrated by allocating weirdly sized types. For instance, in the added test case, we use an alloca for i19. We used to allocate a slot of size 24-bit (19 rounded up to the next byte), whereas we really want to use a full 32-bit slot for this type. llvm-svn: 359856	2019-05-03 01:23:56 +00:00
Eli Friedman	7238353848	[AArch64][MC] Reject "add x0, x1, w2, lsl #1" etc. Looks like just a minor oversight in the parsing code. Fixes https://bugs.llvm.org/show_bug.cgi?id=41504. Differential Revision: https://reviews.llvm.org/D60840 llvm-svn: 359855	2019-05-03 00:59:52 +00:00
Eric Christopher	86e2f169bb	Tidy up a comment, fix a typo, remove a comment that's obsolete. llvm-svn: 359852	2019-05-03 00:15:23 +00:00
Eli Friedman	0b61d220c9	[AArch64][Windows] Compute function length correctly in unwind tables. The primary fix here is to WinException.cpp: we need to exclude jump tables when computing the length of a function, or else we fail to correctly compute the length. (We can only compute the number of bytes consumed by certain assembler directives after the entire file is parsed. ".p2align" is one of those directives, and is used by jump table generation.) The secondary fix, to MCWin64EH, is to make sure we don't silently miscompile if we hit a similar situation in the future. It's possible we could extend ARM64EmitUnwindInfo so it allows function bodies that contain assembler directives, but that's a lot more complicated; see the FIXME in MCWin64EH.cpp. Fixes https://bugs.llvm.org/show_bug.cgi?id=41581 . Differential Revision: https://reviews.llvm.org/D61095 llvm-svn: 359849	2019-05-03 00:10:45 +00:00
Alina Sbirlea	0363c3b8bb	[MemorySSA] Check that block is reachable when adding phis. Summary: Originally the insertDef method was only used when building MemorySSA, and was limiting the number of Phi nodes that it created. Now it's used for updates as well, and it can create additional Phis needed for correctness. Make sure no Phis are created in unreachable blocks (condition met during MSSA build), otherwise the renamePass will find a null DTNode. Resolves PR41640. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61410 llvm-svn: 359845	2019-05-02 23:41:58 +00:00
Alina Sbirlea	151ab4844a	[MemorySSA] Refactor removing multiple trivial phis [NFC]. Summary: Create a method to clean up multiple potentially trivial phis, since we will need this often. Reviewers: george.burgess.iv Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61471 llvm-svn: 359842	2019-05-02 23:12:49 +00:00
Craig Topper	bf29238e1a	[X86] Remove LEA16r references from X86FixupLEAs. NFCI As far as I know, we never emit LEA16r llvm-svn: 359840	2019-05-02 22:46:23 +00:00
Craig Topper	e1e38d4248	[X86] Correct the register class for specific mask register constraints in getRegForInlineAsmConstraint when the VT is a scalar type The default implementation in the base class for TargetLowering::getRegForInlineAsmConstraint doesn't work for mask registers when the VT is a scalar integer type, since the only legal mask types are vXi1. So we end up just getting the first register class that contains the register. Currently this appears to be VK1, but it's really dependent on the order tablegen outputs the register classes. Some code in the caller ends up looking up the type for this register class and finds v1i1, then generates a copyfromreg from the physical k-register with the v1i1 type. Then it generates an any_extend from v1i1 to the scalar VT which isn't legal. This bad any_extend sticks around until isel where it selects a MOVZX32rr8 with a v1i1 input or maybe an i8 input. Not sure but eventually we pick up a copy from VK1 to GR8 in MachineIR which isn't supported. This leads to a failure in physical register copying. This patch uses the scalar type to find a VK class of the right size. In the attached test case this will be VK16. This causes a bitcast from vk16 to i16 to be generated instead of an any_extend. This will be properly iseled to a VK16 to GR32 copy and a GR32->GR16 extract_subreg. Fixes PR41678 Differential Revision: https://reviews.llvm.org/D61453 llvm-svn: 359837	2019-05-02 22:26:40 +00:00
Craig Topper	e8a1cde886	[SelectionDAG] Add asserts to verify the vectorness of input and output types of TRUNCATE/ZERO_EXTEND/ANY_EXTEND/SIGN_EXTEND agree As a result of the underlying cause of PR41678 we created an ANY_EXTEND node with a scalar result type and v1i1 input type. Ideally we would have asserted for this instead of letting it go through to instruction selection and generate bad machine IR Differential Revision: https://reviews.llvm.org/D61463 llvm-svn: 359836	2019-05-02 22:26:26 +00:00
Evandro Menezes	111df108e6	[AArch64] Update for Exynos Fix the forwarding of multiplication results for Exynos M4. llvm-svn: 359834	2019-05-02 22:01:39 +00:00
Craig Topper	47d8865a38	[X86] Remove string literal from an if. NFC This if used to be an assert that got refactored into an if, but left the string literal behind. Fixes PR41718 llvm-svn: 359833	2019-05-02 21:57:18 +00:00
Nico Weber	81862f82ee	lld-link: Add /force:multipleres extension to make dupe resource diag non-fatal As a side benefit, lld-link now reports more than one duplicate resource entry before exiting with an error even if the new flag is not passed. llvm-svn: 359829	2019-05-02 21:21:55 +00:00
Sanjay Patel	1972826178	[DAGCombiner] try repeated fdiv divisor transform before building estimate (2nd try) The original patch was committed at rL359398 and reverted at rL359695 because of infinite looping. This includes a fix to check for a vector splat of "1.0" to avoid the infinite loop. Original commit message: This was originally part of D61028, but it's an independent diff. If we try the repeated divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the full-precision division is only 3 cycle throughput, so that's probably the better perf default option and avoids problems from x86's inaccurate estimates. The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but those patterns are potentially made faster by converting the vector ops (including ymm ops) to scalar math. Differential Revision: https://reviews.llvm.org/D61149 llvm-svn: 359793	2019-05-02 15:02:08 +00:00
Sanjay Patel	284472be6d	[SelectionDAG] remove constant folding limitations based on FP exceptions We don't have FP exception limits in the IR constant folder for the binops (apart from strict ops), so it does not make sense to have them here in the DAG either. Nothing else in the backend tries to preserve exceptions (again outside of strict ops), so I don't see how this could have ever worked for real code that cares about FP exceptions. There are still cases (examples: unary opcodes in SDAG, FMA in IR) where we are trying (at least partially) to preserve exceptions without even asking if the target supports FP exceptions. Those should be corrected in subsequent patches. Real support for FP exceptions requires several changes to handle the constrained/strict FP ops. Differential Revision: https://reviews.llvm.org/D61331 llvm-svn: 359791	2019-05-02 14:47:59 +00:00
Simon Pilgrim	df8daf0ef4	[X86][SSE] lowerAddSubToHorizontalOp - enable ymm extraction+fold Limiting scalar hadd/hsub generation to the lowest xmm looks to be unnecessary - we will be extracting one upper xmm whatever, and we can remove a shuffle by using the hop which is in line with what shouldUseHorizontalOp expects to happen anyway. Testing on btver2 (the main target for fast-hops) shows this is beneficial even for float ops where we have a 'shuffle' to extract the float result: https://godbolt.org/z/0R-U-K Differential Revision: https://reviews.llvm.org/D61426 llvm-svn: 359786	2019-05-02 14:00:55 +00:00
Simon Pilgrim	9fa56f7829	[X86][SSE] Move shouldUseHorizontalOp inside isHorizontalBinOp. NFCI. Matches what we do for lowerAddSubToHorizontalOp and will make it easier to peek through subvectors to help fix PR39921 llvm-svn: 359782	2019-05-02 12:18:24 +00:00
Fangrui Song	8be28cdc52	[Object] Change getSectionName() to return Expected<StringRef> Summary: It currently receives an output parameter and returns std::error_code. Expected<StringRef> fits for this purpose perfectly. Differential Revision: https://reviews.llvm.org/D61421 llvm-svn: 359774	2019-05-02 10:32:03 +00:00
Diana Picus	1136ea2d44	[ARM GlobalISel] Fixup r359768 Get rid of local variable used only in assertion. llvm-svn: 359772	2019-05-02 10:08:29 +00:00
Diana Picus	06a61ccc42	[ARM GlobalISel] Select extensions to < 32 bits Select G_SEXT and G_ZEXT with destination types smaller than 32 bits in the exact same way as 32 bits. This overwrites the higher bits, but that should be ok since all legal users of types smaller than 32 bits ignore those bits anyway. llvm-svn: 359768	2019-05-02 09:28:00 +00:00
Diana Picus	53bcf6f2e7	[ARM GlobalISel] Legalize extensions to < 32 bits Make it legal to extend from e.g. s1 to s8 or s16. llvm-svn: 359766	2019-05-02 09:21:46 +00:00
Kang Zhang	1a0d6d6899	[NFC][PowerPC] Return early if the element type is not byte-sized in combineBVOfConsecutiveLoads Summary: Based on Eli Friedman's comments in https://reviews.llvm.org/D60811 , we'd better return early if the element type is not byte-sized in `combineBVOfConsecutiveLoads`. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D61076 llvm-svn: 359764	2019-05-02 08:15:13 +00:00
Pavel Labath	cfc4519ef3	Object/Minidump: Add support for the ThreadList stream Summary: The stream contains the list of threads belonging to the process described by the minidump. Its structure is the same as the ModuleList stream, and in fact, I have generalized the ModuleList reading code to handle this stream too. Reviewers: amccarth, jhenderson, clayborg Subscribers: llvm-commits, lldb-commits, markmentovai, zturner Tags: #llvm Differential Revision: https://reviews.llvm.org/D61064 llvm-svn: 359762	2019-05-02 07:45:42 +00:00
Fangrui Song	7d0e8cb1e2	[Support] Don't check MAP_ANONYMOUS, just use MAP_ANON Though being marked "deprecated" by the Linux man-pages project (MAP_ANON is a synonym of MAP_ANONYMOUS), it is the most widely available macro - many systems that don't provide MAP_ANONYMOUS have MAP_ANON. MAP_ANON is also used here and there in compiler-rt. llvm-svn: 359758	2019-05-02 05:58:09 +00:00
Stanislav Mekhanoshin	64399da8b8	[AMDGPU] gfx1010 lost VOP2 forms of some add/sub Add legalization of V_ADD_I32, V_SUB_I32, V_SUBREV_I32. Differential Revision: llvm-svn: 359757	2019-05-02 04:26:35 +00:00
Stanislav Mekhanoshin	5cf8167735	[AMDGPU] gfx1010 allows VOP3 to have a literal Differential Revision: https://reviews.llvm.org/D61413 llvm-svn: 359756	2019-05-02 04:01:39 +00:00
Stanislav Mekhanoshin	f2baae0abb	[AMDGPU] gfx1010 constant bus limit Constant bus limit has increased to 2 with GFX10. Differential Revision: https://reviews.llvm.org/D61404 llvm-svn: 359754	2019-05-02 03:47:23 +00:00
Craig Topper	b929a0062e	[X86] Remove the redundant suffix in vfpclassp[d,s]'s broadcasting variant The broadcasting variant for instruction vfpclassp[d,s] shouldn't use suffix q/l. So remove them from the template. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D61295 llvm-svn: 359753	2019-05-02 03:25:50 +00:00
Nico Weber	413517ecfe	lld-link: Make "duplicate resource" error message a bit more concise Reduces the error message from: lld-link: error: failed to parse .res file: duplicate resource: type STRINGTABLE (ID 6)/name ID 3/language 1033, in test1.res and in test2.res To: lld-link: error: duplicate resource: type STRINGTABLE (ID 6)/name ID 3/language 1033, in test1.res and in test2.res Make sure every error message emitted by cvtres contains the name of at least one ".res" file, so that removing the "failed to parse .res file" string doesn't lose information. Differential Revision: https://reviews.llvm.org/D61388 llvm-svn: 359749	2019-05-02 01:52:24 +00:00
Bob Haarman	a78ab77b6b	remove inalloca parameters in globalopt and simplify argpromotion Summary: Inalloca parameters require special handling in some optimizations. This change causes globalopt to strip the inalloca attribute from function parameters when it is safe to do so, removes the special handling for inallocas from argpromotion, and replaces it with a simple check that causes argpromotion to skip functions that receive inallocas (for when the pass is invoked on code that didn't run through globalopt first). This also avoids a case where argpromotion would incorrectly try to pass an inalloca in a register. Fixes PR41658. Reviewers: rnk, efriedma Reviewed By: rnk Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61286 llvm-svn: 359743	2019-05-02 00:37:36 +00:00
Thomas Preud'homme	288ed91e99	FileCheck [4/12]: Introduce @LINE numeric expressions Summary: This patch is part of a patch series to add support for FileCheck numeric expressions. This specific patch introduces the @LINE numeric expressions. This commit introduces a new syntax to express a relation a numeric value in the input text must have with the line number of a given CHECK pattern: [[#<@LINE numeric expression>]]. Further commits build on that to express relations between several numeric values in the input text. To help with naming, regular variables are renamed into pattern variables and old @LINE expression syntax is referred to as legacy numeric expression. Compared to existing @LINE expressions, this new syntax allows arbitrary spacing between the components of the expression. It offers otherwise the same functionality but the commit serves to introduce some of the data structure needed to support more general numeric expressions. Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60384 llvm-svn: 359741	2019-05-02 00:04:38 +00:00
Hiroshi Yamauchi	1620104034	[PGO][CHR] A bug fix. Summary: Fix a transformation bug where two scopes share a common instruction to hoist. Reviewers: davidxl Reviewed By: davidxl Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61405 llvm-svn: 359736	2019-05-01 22:49:52 +00:00
Lang Hames	42a3b4ff0e	[ORC] Pass object buffer ownership back in NotifyEmitted. Clients who want to regain ownership of object buffers after they have been linked may now use the NotifyEmitted callback for this purpose. Note: Currently NotifyEmitted is only called if linking succeeds. If linking fails the buffer is always discarded. llvm-svn: 359735	2019-05-01 22:40:23 +00:00
Jessica Paquette	a3843fe6f4	[GlobalISel][AArch64] Use fmov for G_FCONSTANT when possible This adds support for using fmov rather than a standard mov to materialize G_FCONSTANT when it's safe to do so. Update arm64-fast-isel-materialize.ll and select-constant.mir to show that the selection is correct. llvm-svn: 359734	2019-05-01 22:39:43 +00:00
Simon Pilgrim	9f04d97cd7	[X86][SSE] Fold scalar horizontal add/sub for non-0/1 element extractions We already perform horizontal add/sub if we extract from elements 0 and 1, this patch extends it to non-0/1 element extraction indices (as long as they are from the lowest 128-bit vector). Differential Revision: https://reviews.llvm.org/D61263 llvm-svn: 359707	2019-05-01 17:13:35 +00:00
Stanislav Mekhanoshin	3b7925f035	[AMDGPU] gfx1010 GCNRegBankReassign pass Reassign registers to reduce register bank conflicts. Differential Revision: https://reviews.llvm.org/D61344 llvm-svn: 359704	2019-05-01 16:49:31 +00:00
Nico Weber	c991daa532	Option spell checking: Penalize delimiter flags if input has no argument If the user passes a flag like `-version` to a program, it's more likely they mean `--version` than `-version:`, since there's no parameter passed. Hence, give delimited arguments a penalty of 1 if the user input doesn't contain the delimiter or no data after it. The motivation is that with this, lld-link can suggest "--version" instead of "-version:" for "-version" and "-nodefaultlib" instead of "-nodefaultlib:" for "-nodefaultlibs". Differential Revision: https://reviews.llvm.org/D61382 llvm-svn: 359701	2019-05-01 16:45:15 +00:00
Stanislav Mekhanoshin	c29d491596	[AMDGPU] gfx1010 GCNNSAReassign pass Convert NSA into non-NSA images. Differential Revision: https://reviews.llvm.org/D61341 llvm-svn: 359700	2019-05-01 16:40:49 +00:00
Stanislav Mekhanoshin	692560dc98	[AMDGPU] gfx1010 MIMG implementation Differential Revision: https://reviews.llvm.org/D61339 llvm-svn: 359698	2019-05-01 16:32:58 +00:00
Teresa Johnson	b3203ec078	[ThinLTO] Fix unreachable code when parsing summary entries. Summary: Early returns were causing some code to be skipped. This was missed since the summary entries are typically at the end of the llvm assembly file. Fixes PR41663. Reviewers: RKSimon, wristow Subscribers: mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61355 llvm-svn: 359697	2019-05-01 16:26:59 +00:00
Stanislav Mekhanoshin	a224f68a10	[AMDGPU] gfx1010 DS implementation Differential Revision: https://reviews.llvm.org/D61332 llvm-svn: 359696	2019-05-01 16:11:11 +00:00
Sanjay Patel	64d5751254	Revert "[DAGCombiner] try repeated fdiv divisor transform before building estimate" This reverts commit `fb9a5307a9` (rL359398) because it can cause an infinite loop due to opposing combines. llvm-svn: 359695	2019-05-01 16:06:21 +00:00
Simon Pilgrim	f5bdff7747	Fix 80 column violation. NFCI. llvm-svn: 359694	2019-05-01 16:01:49 +00:00
Keno Fischer	a3e4b3bd33	[SCEV] Use isKnownViaNonRecursiveReasoning for smax simplification Summary: Commit rL331949: [SCEV] Do not use induction in isKnownPredicate for simplification umax changed the codepath for umax from isKnownPredicate to isKnownViaNonRecursiveReasoning to avoid compile time blow up (and as I found out also stack overflows). However, there is an exact copy of the code for umax that was lacking this change. In D50167 I want to unify these codepaths, but to avoid that being a behavior change for the smax case, pull this independent bit out of it. Reviewed By: sanjoy Differential Revision: https://reviews.llvm.org/D61166 llvm-svn: 359693	2019-05-01 15:58:24 +00:00
Simon Pilgrim	6711b9699a	[X86][SSE] Add demanded elts support X86ISD::PMULDQ\PMULUDQ Add to SimplifyDemandedVectorEltsForTargetNode and SimplifyDemandedBitsForTargetNode llvm-svn: 359686	2019-05-01 14:50:50 +00:00
Nico Weber	f68e0f79c7	Fix OptTable::findNearest() adding delimiter for free Prior to this, OptTable::findNearest() thought that the input `--foo` had an editing distance of 0 from an existing flag `--foo=`, which made it suggest flags with delimiters more often than flags without one. After this, it correctly assigns this case an editing distance of 1. Differential Revision: https://reviews.llvm.org/D61373 llvm-svn: 359685	2019-05-01 14:46:17 +00:00
Keno Fischer	d8f856d265	[LoopInfo] Faster implementation of setLoopID. NFC. Summary: This change was part of D46460. However, in the meantime rL341926 fixed the correctness issue here. What remained was the performance issue in setLoopID where it would iterate through all blocks in the loop and their successors, rather than just the predecessor of the header (the latter presumably being much faster). We already have the `getLoopLatches` to compute precisely these basic blocks in an efficient manner, so just use it (as the original commit did for `getLoopID`). Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D61215 llvm-svn: 359684	2019-05-01 14:39:11 +00:00
Simon Pilgrim	3d6899e369	[X86][SSE] Add SSE vector shift support to SimplifyDemandedVectorEltsForTargetNode vector splitting llvm-svn: 359680	2019-05-01 13:51:09 +00:00
Nico Weber	4e701ab177	Wrap to 80 columns, no behavior change llvm-svn: 359679	2019-05-01 13:04:44 +00:00
Simon Pilgrim	ba372c6e62	[X86][SSE] Split 512-bit -> 128-bit vector directly in SimplifyDemandedVectorEltsForTargetNode llvm-svn: 359678	2019-05-01 12:48:42 +00:00
Simon Pilgrim	951a6b4579	[X86][SSE] Add 512-bit vector support to SimplifyDemandedVectorEltsForTargetNode vector splitting llvm-svn: 359677	2019-05-01 12:37:41 +00:00
Tim Northover	ee2474df9f	DAG: allow DAG pointer size different from memory representation. In preparation for supporting ILP32 on AArch64, this modifies the SelectionDAG builder code so that pointers are allowed to have a larger type when "live" in the DAG compared to memory. Pointers get zero-extended whenever they are loaded, and truncated prior to stores. In addition, a few not quite so obvious locations need updating: * A GEP that has not been marked inbounds needs to enforce the IR-documented 2s-complement wrapping at the memory pointer size. Inbounds GEPs are undefined if they overflow the address space, so no additional operations are needed. * Signed comparisons would give incorrect results if performed on the zero-extended values. This shouldn't affect CodeGen for now, but will become active when the AArch64 ILP32 support is committed. llvm-svn: 359676	2019-05-01 12:37:30 +00:00
Simon Pilgrim	37c2419cc7	[X86][SSE] Add X86ISD::PACKSS\PACKUS to SimplifyDemandedVectorEltsForTargetNode vector splitting llvm-svn: 359673	2019-05-01 11:29:36 +00:00
Simon Pilgrim	3353cee06c	[X86][SSE] Add X86ISD::UNPCKL\UNPCK to SimplifyDemandedVectorEltsForTargetNode vector splitting llvm-svn: 359670	2019-05-01 11:08:03 +00:00
Simon Pilgrim	f7b978a71b	[X86][SSE] Move extract_subvector(pshufb) fold to SimplifyDemandedVectorEltsForTargetNode This lets us hit more cases than combineExtractSubvector and allows us to reuse more code. llvm-svn: 359669	2019-05-01 10:58:38 +00:00
Simon Pilgrim	a7d107a3e0	[X86] SimplifyDemandedVectorEltsForTargetNode - pull out vector halving code. NFCI. Pull out the HADD/HSUB code to halve vector widths if the upper half isn't used - prep work to adding support for other opcodes. llvm-svn: 359667	2019-05-01 10:38:10 +00:00
Simon Pilgrim	99eefe94b5	[X86][SSE] Extract i1 elements from vXi1 bool vectors This is an alternative to D59669 which more aggressively extracts i1 elements from vXi1 bool vectors using a MOVMSK. Differential Revision: https://reviews.llvm.org/D61189 llvm-svn: 359666	2019-05-01 10:02:22 +00:00
Craig Topper	dd66acef96	[X86FixupLEAs] Hoist the calls to isLEA out of the 3 separate functions and put it in the basic block instruction loop. NFC No need to check it 3 different times. Just do it once at the top of the loop. llvm-svn: 359658	2019-05-01 06:53:03 +00:00
David L. Jones	fccb505f0f	Revert "[llvm] r359313 - [PowerPC] Update P9 vector costs for insert/extract element" This causes segfaults during optimized builds. More details, including a reproducer, are on the llvm-commits thread for r359313. llvm-svn: 359648	2019-05-01 05:01:03 +00:00
Lang Hames	3181b87cb6	[JITLink] Make sure we explicitly deallocate memory on failure. JITLinkGeneric phases 2 and 3 (focused on applying fixups and finalizing memory, respectively) may fail for various reasons. If this happens, we need to explicitly de-allocate the memory allocated in phase 1 (explicitly, because deallocation may also fail and so is implemented as a method returning error). No testcase yet: I am still trying to decide on the right way to test totally platform agnostic code like this. llvm-svn: 359643	2019-05-01 02:43:52 +00:00
Sam Clegg	6898781d87	[WebAssembly] Update expectations for gcc torture tests This is needed to make the wasm waterfall green again after we land the update to WASI: https://github.com/WebAssembly/waterfall/pull/492 Differential Revision: https://reviews.llvm.org/D61351 llvm-svn: 359634	2019-04-30 23:10:28 +00:00
Philip Reames	84e54eb471	[InstCombine] Limit a vector demanded elts rule which was producing invalid IR. The demanded elts rules introduced for GEPs in https://reviews.llvm.org/rL356293 replaced vector constants with undefs (by design). It turns out that the LangRef disallows such cases when indexing structs. The right fix is probably to relax the langref requirement, and update other passes to expect the result, but for the moment, limit the transform to avoid compiler crashes. This should fix https://bugs.llvm.org/show_bug.cgi?id=41624. llvm-svn: 359633	2019-04-30 23:09:26 +00:00
Alina Sbirlea	b468320313	[MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359627	2019-04-30 22:43:55 +00:00
Lang Hames	4637e15844	[ORC] Move SimpleCompiler/ConcurrentIRCompiler definitions into a .cpp file. SimpleCompiler is no longer templated, so there's no reason for this code to be in a header any more. llvm-svn: 359626	2019-04-30 22:42:01 +00:00
Alina Sbirlea	ba48a2c5e8	[AliasAnalysis/NewPassManager] Invalidate AAManager less often. Summary: This is a redo of D60914. The objective is to not invalidate AAManager, which is stateless, unless there is an explicit invalidate in one of the AAResults. To achieve this, this patch adds an API to PAC, to check precisely this: is this analysis not invalidated explicitly == is this analysis not abandoned == is this analysis stateless, so preserved without explicitly being marked as preserved by everyone Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61284 llvm-svn: 359622	2019-04-30 22:15:47 +00:00
Stanislav Mekhanoshin	a6322941ff	[AMDGPU] gfx1010 VMEM and SMEM implementation Differential Revision: https://reviews.llvm.org/D61330 llvm-svn: 359621	2019-04-30 22:08:23 +00:00
Eric Christopher	6435102c03	Fix a few -Werror warnings: - Remove a variable only used in an assert - Fix pessimizing move warning around copy elision llvm-svn: 359617	2019-04-30 21:44:21 +00:00
Alina Sbirlea	4e1ac95cf5	[PassManagerBuilder] Add option for interleaved loops, for loop vectorize. Summary: Match NewPassManager behavior: add option for interleaved loops in the old pass manager, and use that instead of the flag used to disable loop unroll. No changes in the defaults. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61030 llvm-svn: 359615	2019-04-30 21:29:20 +00:00
Lang Hames	d407b4b980	[JITLink] Add debugging output to print resolved external atoms. llvm-svn: 359614	2019-04-30 21:28:07 +00:00
Lang Hames	88816bdd2f	[ORC][JITLink] Name in-memory compiled objects after their source modules. In-memory compiled object buffer identifiers will now be derived from the identifiers of their source IR modules. This makes it easier to connect in-memory objects with their source modules in debugging output. llvm-svn: 359613	2019-04-30 21:27:56 +00:00
Rong Xu	998b97f6f1	[llvm-profdata] Add overlap command to compute similarity b/w two profile files Add overlap functionality to llvm-profdata tool to compute the similarity between two profile files. Differential Revision: https://reviews.llvm.org/D60977 llvm-svn: 359612	2019-04-30 21:19:12 +00:00
Fedor Sergeev	eeae45dc77	[NFC][InlineCost] cleanup - comments, overflow handling. Reviewed By: apilipenko Tags: #llvm Differential Revision: https://reviews.llvm.org/D60751 llvm-svn: 359609	2019-04-30 20:44:53 +00:00
Simon Pilgrim	07ab4e7db8	[X86][SSE] Fold extract_subvector(extend(x)) -> extend_vector_inreg(x) This adds any extend support - folding to zero_extend_vector_inreg (PMOVZX) for legality Minor improvement for PR39709 llvm-svn: 359608	2019-04-30 20:31:07 +00:00
Nico Weber	e7fa09e4ae	Fix stack-use-after-free after r359580 `Candidate` was a StringRef referring to a temporary string. Instead, create a local variable for the string and use a StringRef referring to that. llvm-svn: 359604	2019-04-30 19:43:35 +00:00
Dan Gohman	3b5b9d0e72	[WebAssembly] Support EXPLICIT_NAME symbols in llvm-readobj Teach llvm-readobj about WASM_SYMBOL_EXPLICIT_NAME. Differential Revision: https://reviews.llvm.org/D61323 Reviewer: sbc100 llvm-svn: 359602	2019-04-30 19:30:24 +00:00
Dan Gohman	3a7532e645	[WebAssembly] Support f16 libcalls Add support for f16 libcalls in WebAssembly. This entails adding signatures for the remaining F16 libcalls, and renaming gnu_f2h_ieee/gnu_h2f_ieee to truncsfhf2/extendhfsf2 for consistency between f32 and f64/f128 (compiler-rt already supports this). Differential Revision: https://reviews.llvm.org/D61287 Reviewer: dschuff llvm-svn: 359600	2019-04-30 19:17:59 +00:00
Craig Topper	cad318014e	[X86] Remove if that's always true It's been like this since it was added in a refactor of this code. Fixes PR41659 llvm-svn: 359597	2019-04-30 19:02:15 +00:00
Evandro Menezes	ea349f3ef5	[SimplifyLibCalls] Clean up code (NFC) Fix pointer check after dereferencing (PR41665). llvm-svn: 359595	2019-04-30 18:35:38 +00:00
Craig Topper	3958719dda	[X86] If PreprocessISelDAG reorders a load before a call, make sure we remove dead nodes from the graph The reordering can leave at least a dead TokenFactor in the graph. This causes the linearize scheduler to fail with something like the assert seen in PR22614. This is only one of many ways we can break the linearize scheduler today so I can't say for sure that any of the other failures in that bug were caused by this issue. This takes the heavy hammer approach of just running RemoveDeadNodes unconditionally at the end of the PreprocessISelDAG. If this turns out to be a compile time hit, we can try to refine it. Differential Revision: https://reviews.llvm.org/D61164 llvm-svn: 359582	2019-04-30 17:56:47 +00:00
Craig Topper	965d1306ae	[X86] Initial cleanups on the FixupLEAs pass. Separate Atom LEA creation from other LEA optimizations. This removes some of the class variables. Merge basic block processing into runOnMachineFunction to keep the flags local. Pass MachineBasicBlock around instead of an iterator. We can get the iterator in the few places that need it. Allows a range-based outer for loop. Separate the Atom optimization from the rest of the optimizations. This allows fixupIncDec to create INC/DEC and still allow Atom to turn it back into LEA when profitable by its heuristics. I'd like to improve fixupIncDec to turn LEAs into ADD any time the base or index register is equal to the destination register. This is profitable regardless of the various slow flags. But again we would want Atom to be able to undo that. Differential Revision: https://reviews.llvm.org/D60993 llvm-svn: 359581	2019-04-30 17:56:28 +00:00
Nico Weber	98ca8da55e	Re-reland "[Option] Fix PR37006 prefix choice in findNearest" This was first reviewed in https://reviews.llvm.org/D46776 and landed in r332299, but got reverted because it broke the PS4 bots. https://reviews.llvm.org/D50410 fixed this, and then this change was re-reviewed at https://reviews.llvm.org/D50515 and relanded in r341329. It got reverted due to causing MSan issues. However, nobody wrote down the error message and the bot link is dead, so I'm relanding this to capture the MSan error. I'll then either fix it, or copy it somewhere and revert if fixing looks difficult. llvm-svn: 359580	2019-04-30 17:46:00 +00:00
Sanjay Patel	0387bf5269	[SelectionDAG] remove div-by-zero constant folding restriction We don't have this restriction in IR, so it should not be here either simply out of consistency. Code that wants to handle FP exceptions is expected to use the 'strict' variants of these nodes. We don't get the frem case because frem by 0.0 produces NaN (invalid), and that's the remaining check here (so the removed check for frem was dead code AFAIK). This is the only place in SDAG that uses "HasFPExceptions", so I think we should remove that entirely as a follow-up patch. llvm-svn: 359566	2019-04-30 14:37:15 +00:00
Simon Pilgrim	123e04b8a8	[TableGen] Fix null pointer dereferencing in token parser. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359559	2019-04-30 13:09:55 +00:00
Simon Pilgrim	f5e8f222d6	Revert rL359519 : [MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 ........ This was causing windows build bot failures llvm-svn: 359555	2019-04-30 12:34:21 +00:00
Sjoerd Meijer	ea31ddb36f	[ARM] Implement TTI::getMemcpyCost This implements TargetTransformInfo method getMemcpyCost, which estimates the number of instructions to which a memcpy instruction expands. Differential Revision: https://reviews.llvm.org/D59787 llvm-svn: 359547	2019-04-30 10:28:50 +00:00
Simon Pilgrim	22641cc194	Fix for bug 41512: lower INSERT_VECTOR_ELT(ZeroVec, 0, Elt) to SCALAR_TO_VECTOR(Elt) for all SSE flavors Current LLVM uses pxor+pinsrb on SSE4+ for INSERT_VECTOR_ELT(ZeroVec, 0, Elt) instead of the much simpler movd. INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is an idiomatic construct which is used e.g. for _mm_cvtsi32_si128(Elt) and for lowest element initialization in _mm_set_epi32. So such inefficient lowering leads to significant performance degradations in certain cases when switching from SSSE3 to SSE4. https://bugs.llvm.org/show_bug.cgi?id=41512 Here INSERT_VECTOR_ELT(ZeroVec, 0, Elt) is simply converted to SCALAR_TO_VECTOR(Elt) when applicable since the latter is a closer match to the desired behavior and is always efficiently lowered to movd and the like. Committed on behalf of @Serge_Preis (Serge Preis) Differential Revision: https://reviews.llvm.org/D60852 llvm-svn: 359545	2019-04-30 10:18:25 +00:00
Sjoerd Meijer	0ed4619679	[TargetLowering] findOptimalMemOpLowering. NFCI. This was a local static function in SelectionDAG, which I've promoted to TargetLowering so that I can reuse it to estimate the cost of a memory operation in D59787. Differential Revision: https://reviews.llvm.org/D59766 llvm-svn: 359543	2019-04-30 10:09:15 +00:00
Diana Picus	59a4c0481a	[ARM GlobalISel] Widen small shift operands The legalizer was already widening the shift amount. Add tests for that behaviour, and also support widening the shifted value. llvm-svn: 359542	2019-04-30 09:24:43 +00:00
Fangrui Song	7bce25cd7d	[AsmPrinter] Make AsmPrinter::HandlerInfo::Handler a unique_ptr Handlers.clear() in AsmPrinter::doFinalization() will destroy these handlers. A unique_ptr makes the ownership clearer. llvm-svn: 359541	2019-04-30 09:14:02 +00:00
Diana Picus	1e88ac213b	[ARM GlobalISel] Be more careful about bailing out Bail out on function arguments/returns with types aggregating an unsupported type. This fixes cases where we would happily and incorrectly lower functions taking e.g. [1 x i64] parameters, when we don't even support plain i64 yet. llvm-svn: 359540	2019-04-30 09:05:25 +00:00
Sjoerd Meijer	180f1ae57c	[TargetLowering] Change getOptimalMemOpType to take a function attribute list The MachineFunction wasn't used in getOptimalMemOpType, but more importantly, this allows reuse of findOptimalMemOpLowering that is calling getOptimalMemOpType. This is the groundwork for the changes in D59766 and D59787, that allows implementation of TTI::getMemcpyCost. Differential Revision: https://reviews.llvm.org/D59785 llvm-svn: 359537	2019-04-30 08:38:12 +00:00
Alexander Potapenko	06d00afa61	MSan: handle llvm.lifetime.start intrinsic Summary: When a variable goes into scope several times within a single function or when two variables from different scopes share a stack slot it may be incorrect to poison such scoped locals at the beginning of the function. In the former case it may lead to false negatives (see https://github.com/google/sanitizers/issues/590), in the latter - to incorrect reports (because only one origin remains on the stack). If Clang emits lifetime intrinsics for such scoped variables we insert code poisoning them after each call to llvm.lifetime.start(). If for a certain intrinsic we fail to find a corresponding alloca, we fall back to poisoning allocas for the whole function, as it's now impossible to tell which alloca was missed. The new instrumentation may slow down hot loops containing local variables with lifetime intrinsics, so we allow disabling it with -mllvm -msan-handle-lifetime-intrinsics=false. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60617 llvm-svn: 359536	2019-04-30 08:35:14 +00:00
Markus Lavin	a475da36eb	[DebugInfo] DW_OP_deref_size in PrologEpilogInserter. The PrologEpilogInserter needs to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it. The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address). This re-commit fixes issues reported in the first one. Namely deref was inserted under wrong conditions and additionally the deref_size argument was incorrectly encoded. Differential Revision: https://reviews.llvm.org/D59687 llvm-svn: 359535	2019-04-30 07:58:57 +00:00
Zi Xuan Wu	49d60fdc2e	[DAGCombiner] Do not generate ISD::ADDE node if adde is not legal for the target when combining an ISD::TRUNC node Do not combine (trunc adde(X, Y, Carry)) into (adde trunc(X), trunc(Y), Carry) if adde is not legal for the target, even at the type-legalize phase, because adde is special and will not be legalized at the later operation-legalize phase. This fixes: PR40922 https://bugs.llvm.org/show_bug.cgi?id=40922 Differential Revision: https://reviews.llvm.org//D60854 llvm-svn: 359532	2019-04-30 03:01:14 +00:00
Lang Hames	b12867230c	[ORC] Allow JITDylib definition generators to return Errors. Background: A definition generator can be attached to a JITDylib to generate new definitions in response to queries. For example: a generator that forwards calls to dlsym can map symbols from a dynamic library into the JIT process on demand. If definition generation fails then the generator should be able to return an error. This allows the JIT API to distinguish between the case where a generator does not provide a definition, and the case where it was not able to determine whether it provided a definition due to an error. The immediate motivation for this is cross-process symbol lookups: If the remote-lookup generator is attached to a JITDylib early in the search list, and if a generator failure is misinterpreted as "no definition in this JITDylib" then lookup may continue and bind to a different definition in a later JITDylib, which is a bug. llvm-svn: 359521	2019-04-30 00:03:26 +00:00
Alina Sbirlea	9a1edd14a2	[MemorySSA] Invalidate MemorySSA if AA or DT are invalidated. Summary: MemorySSA keeps internal pointers of AA and DT. If these get invalidated, so should MemorySSA. Reviewers: george.burgess.iv, chandlerc Subscribers: jlebar, Prazek, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61043 llvm-svn: 359519	2019-04-29 23:53:04 +00:00
Nico Weber	e577be4ed1	[PDB] Fix hash function used to write /src/headerblock lld-link used to write PDB files that DIA couldn't recover natvis files from if: - The global strings table was > 64kiB - There were at least 3 natvis files The cause was that the hash function for the /src/headerblock stream was incorrect: It needs to be truncated to 16 bit. If the global strings table was <= 64kiB, truncating to 16 bit is a no-op, so this wasn't needed for small programs. If there are only 1 or 2 natvis files, then the growth strategy in HashTable::grow() would mean the hash table would have 2 buckets (for 1 natvis file) or 4 buckets (for 2 natvis files), and since the hash function is used modulo number of buckets, and since 2 and 4 divide 0x10000, the missing `% 0x10000` is a no-op there too. For 3 natvis files, the hash table grows to 6 buckets, which has a factor that's not common with 0x10000 and the difference starts to matter. Fixes PR41626. Differential Revision: https://reviews.llvm.org/D61277 llvm-svn: 359515	2019-04-29 23:09:35 +00:00
Lang Hames	eb14dc7585	[ORC] Replace the LLJIT/LLLazyJIT Create methods with Builder utilities. LLJITBuilder and LLLazyJITBuilder construct LLJIT and LLLazyJIT instances respectively. Over time these will allow more configurable options to be added while remaining easy to use in the default case, which for default in-process JITing is now: auto J = ExitOnErr(LLJITBuilder().create()); llvm-svn: 359511	2019-04-29 22:37:27 +00:00
Dan Gohman	8d6e80f959	[WebAssembly] Make an assertion message prettier. NFC. This is a follow-up to https://reviews.llvm.org/D59521. llvm-svn: 359509	2019-04-29 22:37:08 +00:00
Steven Wu	6c9f6fd11b	[ThinLTO] Adding architecture name into saved object filename Summary: For ThinLTOCodegenerator, it has an option to save the object file outputs into a directory which is essential for debug info. Tools like lldb and dsymutil will look for these object files for debug info. On Darwin platform, you can link fat binaries with one single clang driver invocation like: $ clang -arch x86_64 -arch i386 -Wl,-object_path_lto,$TMPDIR ... Unfortunately, the output object files for one architecture are going to overwrite the previous ones and one architecture slice will end up with no debug info. One example for this is to turn on ThinLTO for sanitizer dylibs in compiler-rt project. To fix the issue, add the name for the architecture into the name of the output object file. rdar://problem/35482935 Reviewers: tejohnson, bd1976llvm, dexonsmith, JDevlieghere Reviewed By: dexonsmith Subscribers: mehdi_amini, aprantl, inglorion, eraman, hiraditya, jkorous, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60924 llvm-svn: 359508	2019-04-29 21:39:54 +00:00
Dan Gohman	8306cb5702	[WebAssembly] Define the signature for __stack_chk_fail The WebAssembly backend needs to know the signatures of all runtime libcall functions. This adds the signature for __stack_chk_fail which was previously missing. Also, make the error message for a missing libcall include the name of the function. Differential Revision: https://reviews.llvm.org/D59521 Reviewed By: sbc100 llvm-svn: 359505	2019-04-29 21:09:44 +00:00
Roland Froese	728e139700	[PowerPC] Try harder to avoid load/move-to VSR for partial vector loads Change the PPCISelLowering.cpp function that decides to avoid update form in favor of partial vector loads to know about newer load types and to not be confused by the chain operand. Differential Revision: https://reviews.llvm.org/D60102 llvm-svn: 359504	2019-04-29 21:08:35 +00:00
Jessica Paquette	7f6fe7c02c	[GlobalISel][AArch64] Select llvm.aarch64.crypto.sha1h This was falling back and gives us a reason to create a selectIntrinsic function which we would need eventually anyway. Update arm64-crypto.ll to show that we correctly select it. Also factor out the code for finding an intrinsic ID. llvm-svn: 359501	2019-04-29 20:58:17 +00:00
Martin Storsjo	c0d138d147	[X86] Run CFIInstrInserter on Windows if Dwarf is used This is necessary since SVN r330706, as tail merging can include CFI instructions since then. This fixes PR40322 and PR40012. Differential Revision: https://reviews.llvm.org/D61252 llvm-svn: 359496	2019-04-29 20:25:51 +00:00
Simon Pilgrim	028485d7b9	[X86][SSE] isHorizontalBinOp - add support for target shuffles Add target shuffle decoding to isHorizontalBinOp as well as ISD::VECTOR_SHUFFLE support. This does mean we can go through bitcasts so we need to bitcast the extracted args to ensure they are the correct type Fixes PR39936 and should help with PR39920/PR39921 Differential Revision: https://reviews.llvm.org/D61245 llvm-svn: 359491	2019-04-29 19:52:59 +00:00
Simon Pilgrim	9b17b80a0e	computePolynomialFromPointer - add missing early-out return for non-pointer types. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359486	2019-04-29 19:25:16 +00:00
Sanjay Patel	a706b9a90e	[InstCombine] reduce code duplication; NFC Follow-up to: rL359482 Avoid this potential problem throughout by giving the type a name and verifying the assumption that both operands are the same type. llvm-svn: 359485	2019-04-29 19:23:44 +00:00
Simon Pilgrim	4559739f7c	Remove duplicate line. NFCI. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359483	2019-04-29 18:58:32 +00:00
Simon Pilgrim	e3c8776172	[InstCombine] visitFCmpInst - appease copy+paste pattern warning. NFCI. PVS Studio's copy+paste recognizer was seeing this as a typo, technically Op0/Op1 in a fcmp should always be the same type, but we might as well avoid the issue. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359482	2019-04-29 18:52:19 +00:00
Daniel Sanders	8f079844d0	[globalisel] Improve Legalizer debug output * LegalizeAction should be printed by name rather than number * Newly created instructions are incomplete at the point the observer first sees them. They are therefore recorded in a small vector and printed just before the legalizer moves on to another instruction. By this point, the instruction must be complete. llvm-svn: 359481	2019-04-29 18:45:59 +00:00
Don Hinton	89e583b843	[CommandLine] Don't allow unlimited dashes for options. Part 1 of 5 Summary: Prior to this patch, the CommandLine parser would strip an unlimited number of dashes from options. This patch limits it to two. Reviewers: rnk Reviewed By: rnk Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61229 llvm-svn: 359480	2019-04-29 18:34:18 +00:00
Simon Pilgrim	0a5c2b2449	[X86] scaleShuffleMask - avoid potential signed overflow warning. Use size_t assignment to prevent a bad explicit type conversion warning. Given the typical size of shuffle masks this was never going to happen, but this at least stops the warning. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359479	2019-04-29 18:32:06 +00:00
Simon Pilgrim	1c4c641ebc	[TextAPI] Fix Symbol::dump which was failing to append the SymbolKind string. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359478	2019-04-29 18:25:04 +00:00
Bjorn Pettersson	820994572c	[DAG] Refactor DAGCombiner::ReassociateOps Summary: Extract the logic for doing reassociations from DAGCombiner::reassociateOps into a helper function DAGCombiner::reassociateOpsCommutative, and use that helper to trigger reassociation on the original operand order, or the commuted operand order. Codegen is not identical since the operand order will be different when doing the reassociations for the commuted case. That causes some unfortunate churn in some test cases. Apart from that this should be NFC. Reviewers: spatel, craig.topper, tstellar Reviewed By: spatel Subscribers: dmgreen, dschuff, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, hiraditya, aheejin, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61199 llvm-svn: 359476	2019-04-29 17:50:10 +00:00
Thomas Preud'homme	15cb1f1501	FileCheck [3/12]: Stricter parsing of @LINE expressions Summary: This patch is part of a patch series to add support for FileCheck numeric expressions. This specific patch gives earlier and better diagnostics for the @LINE expressions. Rather than detect parsing errors at matching time, this commit adds enhanced parsing to detect issues with @LINE expressions at parse time and diagnose them more accurately. Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60383 llvm-svn: 359475	2019-04-29 17:46:26 +00:00
Simon Pilgrim	19cde62008	Avoid "checking a pointer after dereferencing" warning. NFCI. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359473	2019-04-29 17:38:18 +00:00
Simon Pilgrim	6f349d8c39	Move if() to newline to stop ambiguity over whether it should be else if. NFCI. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359472	2019-04-29 17:34:26 +00:00
Simon Pilgrim	2755b73ba0	Fix operator precedence warning. NFCI. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359469	2019-04-29 17:04:14 +00:00
Simon Pilgrim	864cf8e274	Remove superfluous break from switch statement. NFCI. Reported in https://www.viva64.com/en/b/0629/ llvm-svn: 359467	2019-04-29 16:45:35 +00:00
Quentin Colombet	31ce274207	[BlockExtractor] Expose a constructor for the group extraction NFC Differential Revision: https://reviews.llvm.org/D60971 llvm-svn: 359463	2019-04-29 16:14:02 +00:00
Quentin Colombet	ae2cbb3400	[BlockExtractor] Change the basic block separator from ',' to ';' This change aims at making the file format be compatible with the way LLVM handles command line options. Differential Revision: https://reviews.llvm.org/D60970 llvm-svn: 359462	2019-04-29 16:14:00 +00:00
Simon Pilgrim	cbf3501e56	[X86] Remove duplicate string comparison Fix typo introduced in rL332824 where we simplified the exact string matches for "avx512.mask.permvar.sf.256" and "avx512.mask.permvar.si.256" to a string startswith test for "avx512.mask.permvar." llvm-svn: 359460	2019-04-29 16:03:35 +00:00
Cullen Rhodes	2c0d5043a7	[AArch64][SVE] Asm: add aliases for unpredicated bitwise logical instructions This patch adds aliases for element sizes .B/.H/.S to the AND/ORR/EOR/BIC bitwise logical instructions. The assembler now accepts these instructions with all element sizes up to 64-bit (.D). The preferred disassembly is .D. llvm-svn: 359457	2019-04-29 15:27:27 +00:00
Thomas Preud'homme	5a33047022	FileCheck [2/12]: Stricter parsing of -D option Summary: This patch is part of a patch series to add support for FileCheck numeric expressions. This specific patch gives earlier and better diagnostics for the -D option. Prior to this change, parsing of -D option was very loose: it assumed that there is an equal sign (which to be fair is now checked by the FileCheck executable) and that the part on the left of the equal sign was a valid variable name. This commit adds logic to ensure that this is the case and gives diagnostic when it is not, making it clear that the issue came from a command-line option error. This is achieved by sharing the variable parsing code into a new function ParseVariable. Copyright: - Linaro (changes up to diff 183612 of revision D55940) - GraphCore (changes in later versions of revision D55940 and in new revision created off D55940) Reviewers: jhenderson, chandlerc, jdenny, probinson, grimar, arichardson, rnk Subscribers: hiraditya, llvm-commits, probinson, dblaikie, grimar, arichardson, tra, rnk, kristina, hfinkel, rogfer01, JonChesterfield Tags: #llvm Differential Revision: https://reviews.llvm.org/D60382 llvm-svn: 359447	2019-04-29 13:32:36 +00:00
Yevgeny Rouban	0822bfc6de	[LoopSimplifyCFG] Suppress expensive DomTree verification This patch makes verification level lower for builds with inexpensive checks. Differential Revision: https://reviews.llvm.org/D61055 llvm-svn: 359446	2019-04-29 13:29:55 +00:00
Diogo N. Sampaio	d95abb170b	[ARM] Add bitcast/extract_subvec. of fp16 vectors Summary: This patch adds some basic operations for fp16 vectors, such as bitcast from fp16 to i16, required to perform extract_subvector (also added here) and extract_element. Reviewers: SjoerdMeijer, DavidSpickett, t.p.northover, ostannard Reviewed By: ostannard Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60618 llvm-svn: 359433	2019-04-29 10:28:07 +00:00
Diogo N. Sampaio	2078eb745d	[ARM] Add v4f16 and v8f16 types to the CallingConv Summary: The Procedure Call Standard for the Arm Architecture states that float16x4_t and float16x8_t behave just as uint16x4_t and uint16x8_t for argument passing. This patch adds the fp16 vectors to the ARMCallingConv.td file. Reviewers: miyuki, ostannard Reviewed By: ostannard Subscribers: ostannard, javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60720 llvm-svn: 359431	2019-04-29 10:10:37 +00:00
David Chisnall	714a4425de	Try to use /proc on FreeBSD for getExecutablePath Currently, clang's libTooling passes this function a fake argv0, which means that no libTooling tools can find the standard headers on FreeBSD. With this change, these will now work on any FreeBSD systems that have procfs mounted. This isn't the right fix for the libTooling issue, but it does bring the FreeBSD implementation of getExecutablePath closer to the Linux and macOS implementations. llvm-svn: 359427	2019-04-29 09:24:51 +00:00
Jeremy Morse	055aee1d8a	[DebugInfo] Terminate more location-list ranges at the end of blocks This patch fixes PR40795, where constant-valued variable locations can "leak" into blocks placed at higher addresses. The root of this is that DbgEntityHistoryCalculator terminates all register variable locations at the end of each block, but not constant-value variable locations. Fixing this requires constant-valued DBG_VALUE instructions to be broadcast into all blocks where the variable location remains valid, as documented in the LiveDebugValues section of SourceLevelDebugging.rst, and correct termination in DbgEntityHistoryCalculator. Differential Revision: https://reviews.llvm.org/D59431 llvm-svn: 359426	2019-04-29 09:13:16 +00:00
Fangrui Song	97b8cd54ad	[DWARF] Fix dump of local/foreign TU lists in .debug_names Differential Revision: https://reviews.llvm.org/D61241 llvm-svn: 359425	2019-04-29 08:55:10 +00:00
Fangrui Song	cc1fec31d9	[DWARF] Delete a redundant check in getFileNameByIndex() llvm-svn: 359422	2019-04-29 08:15:13 +00:00
Craig Topper	9202d5f8f1	[X86] Remove some intel syntax aliases on (v)cvtpd2(u)dq, (v)cvtpd2ps, (v)cvt(u)qq2ps. Add 'x' and 'y' suffix aliases to masked version of the same in att syntax. The 128/256 bit version of these instructions require an 'x' or 'y' suffix to disambiguate the memory form in att syntax. We were allowing the same suffix in intel syntax, but it appears gas does not do that. gas does allow the 'x' and 'y' suffix on register and broadcast forms even though it's not needed. We were allowing it on unmasked register form, but not on masked versions or on masked or unmasked broadcast form. While there, fix some test coverage holes so they can be extended with the 'x' and 'y' suffix tests. llvm-svn: 359418	2019-04-29 06:13:41 +00:00
Nico Weber	cf6267cecb	llvm-cvtres: Attempt to make llvm-cvtres/duplicate.test work on big-endian systems llvm-svn: 359414	2019-04-29 00:51:41 +00:00
Simon Pilgrim	d5cc753b6d	[X86][SSE] combineExtractVectorElt - add early-out to return zero/undef for out-of-range extraction indices. llvm-svn: 359406	2019-04-28 19:12:58 +00:00
Nikita Popov	7a94795b2b	[ConstantRange] Add makeExactNoWrapRegion() I got confused on the terminology, and the change in D60598 was not correct. I was thinking of "exact" in terms of the result being non-approximate. However, the relevant distinction here is whether the result is * Largest range such that: Forall Y in Other: Forall X in Result: X BinOp Y does not wrap. (makeGuaranteedNoWrapRegion) * Smallest range such that: Forall Y in Other: Forall X not in Result: X BinOp Y wraps. (A hypothetical makeAllowedNoWrapRegion) * Both. (makeExactNoWrapRegion) I'm adding a separate makeExactNoWrapRegion method accepting a single APInt (same as makeExactICmpRegion) and using it in the places where the guarantee is relevant. Differential Revision: https://reviews.llvm.org/D60960 llvm-svn: 359402	2019-04-28 15:40:56 +00:00
Simon Pilgrim	22d1476bfa	[X86][AVX] Combine non-lane crossing binary shuffles using X86ISD::VPERMV3 Some of the combines might be further improved if we lower more shuffles with X86ISD::VPERMV3 directly, instead of waiting to combine the results. llvm-svn: 359400	2019-04-28 14:31:01 +00:00
Sanjay Patel	fb9a5307a9	[DAGCombiner] try repeated fdiv divisor transform before building estimate This was originally part of D61028, but it's an independent diff. If we try the repeated divisor reciprocal transform before producing an estimate sequence, then we have an opportunity to use scalar fdiv. On x86, the trade-off is 1 divss vs. 5 vector FP ops in the default estimate sequence. On recent chips (Skylake, Ryzen), the full-precision division is only 3 cycle throughput, so that's probably the better perf default option and avoids problems from x86's inaccurate estimates. The last 2 tests show that users still have the option to override the defaults by using the function attributes for reciprocal estimates, but those patterns are potentially made faster by converting the vector ops (including ymm ops) to scalar math. Differential Revision: https://reviews.llvm.org/D61149 llvm-svn: 359398	2019-04-28 12:23:43 +00:00
Simon Pilgrim	93ad48210c	[X86][SSE] Optimize llvm.experimental.vector.reduce.xor.vXi1 parity reduction (PR38840) An xor reduction of a bool vector can be optimized to a parity check of the MOVMSK/BITCAST'd integer - if the population count is odd return 1, else return 0. Differential Revision: https://reviews.llvm.org/D61230 llvm-svn: 359396	2019-04-28 10:46:17 +00:00
Craig Topper	bd35a30940	[X86] Remove (V)MOV64toSDrr/m and (V)MOVDI2SSrr/m. Use 128-bit result MOVD/MOVQ and COPY_TO_REGCLASS instead Summary: The register form of these instructions are CodeGenOnly instructions that cover GR32->FR32 and GR64->FR64 bitcasts. There is a similar set of instructions for the opposite bitcast. Due to the patterns using bitcasts these instructions get marked as "bitcast" machine instructions as well. The peephole pass is able to look through these as well as other copies to try to avoid register bank copies. Because FR32/FR64/VR128 are all coalescable to each other we can end up in a situation where a GR32->FR32->VR128->FR64->GR64 sequence can be reduced to GR32->GR64 which the copyPhysReg code can't handle. To prevent this, this patch removes one set of the 'bitcast' instructions. So now we can only go GR32->VR128->FR32 or GR64->VR128->FR64. The instruction that converts from GR32/GR64->VR128 has no special significance to the peephole pass and won't be looked through. I guess the other option would be to add support to copyPhysReg to just promote the GR32->GR64 to a GR64->GR64 copy. The upper bits were basically undefined anyway. But removing the CodeGenOnly instruction in favor of one that won't be optimized seemed safer. I deleted the peephole test because it couldn't be made to work with the bitcast instructions removed. The load version of the instructions were unnecessary as the pattern that selects them contains a bitcasted load which should never happen. Fixes PR41619. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61223 llvm-svn: 359392	2019-04-28 06:25:33 +00:00
Simon Pilgrim	03c4e2663c	Revert rL359389: [X86][SSE] Add support for <64 x i1> bool reduction Minor generalization of the existing <32 x i1> pre-AVX2 split code. ........ Causing irregular buildbot failures. llvm-svn: 359391	2019-04-27 20:44:08 +00:00
Simon Pilgrim	4118be3af6	[X86][SSE] Add support for <64 x i1> bool reduction Minor generalization of the existing <32 x i1> pre-AVX2 split code. llvm-svn: 359389	2019-04-27 20:04:44 +00:00
Simon Pilgrim	2a2d422400	[X86][AVX512] Improve vector bool reductions As predicate masks are legal on AVX512 targets, we avoid MOVMSK in these cases, but we can just bitcast the bool vector to the integer equivalent directly - avoiding expansion of the reduction to a shuffle pattern. llvm-svn: 359386	2019-04-27 17:32:46 +00:00
Fangrui Song	795c00b21f	[DJB] Fix variable case after D61178 llvm-svn: 359381	2019-04-27 15:33:22 +00:00
Simon Pilgrim	acc1e6d1c6	[X86][AVX] Merge mask select with shuffles across extract_subvector (PR40332) Fixes PR40332 in the limited case where we're selecting between a target shuffle and a zero vector. We can extend this in the future to handle more opcodes and non-zero selections. llvm-svn: 359378	2019-04-27 13:35:32 +00:00
Andrea Di Biagio	d77dc9ada2	[MCA] Add field `IsEliminated` to class Instruction. NFCI llvm-svn: 359377	2019-04-27 11:59:11 +00:00
Craig Topper	063b471ff7	[X86] Use MOVQ for i64 atomic_stores when SSE2 is enabled Summary: If we have SSE2 we can use a MOVQ to store 64-bits and avoid falling back to a cmpxchg8b loop. If it's a seq_cst store we need to insert an mfence after the store. Reviewers: spatel, RKSimon, reames, jfb, efriedma Reviewed By: RKSimon Subscribers: hiraditya, dexonsmith, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60546 llvm-svn: 359368	2019-04-27 03:38:15 +00:00
Mark Searles	76c5b62988	Revert "AMDGPU: Split block for si_end_cf" This reverts commit 7a6ef3004655dd86d722199c471ae78c28e31bb4. We discovered some internal test failures, so reverting for now. Differential Revision: https://reviews.llvm.org/D61213 llvm-svn: 359363	2019-04-27 00:51:18 +00:00
Stanislav Mekhanoshin	4f331cb1f3	[AMDGPU] gfx1010 VOPC implementation Differential Revision: https://reviews.llvm.org/D61208 llvm-svn: 359358	2019-04-26 23:16:16 +00:00
Lang Hames	a9fdf375b3	[ORC] Add a 'plugin' interface to ObjectLinkingLayer for events/configuration. ObjectLinkingLayer::Plugin provides event notifications when objects are loaded, emitted, and removed. It also provides a modifyPassConfig callback that allows plugins to modify the JITLink pass configuration. This patch moves eh-frame registration into its own plugin, and teaches llvm-jitlink to only add that plugin when performing execution runs on non-Windows platforms. This should allow us to re-enable the test case that was removed in r359198. llvm-svn: 359357	2019-04-26 22:58:39 +00:00
Jessica Paquette	76f64b665b	[GlobalISel][AArch64] Use getConstantVRegValWithLookThrough for extracts getConstantVRegValWithLookThrough does the same thing as the getConstantValueForReg function, and has more visibility across GISel. Plus, it supports looking through G_TRUNC, G_SEXT, and G_ZEXT. So, we get better code reuse and more functionality for free by using it. Add some test cases to select-extract-vector-elt.mir to show that we can now look through those instructions. llvm-svn: 359351	2019-04-26 21:53:13 +00:00
Nick Desaulniers	7ab164c4a4	[AsmPrinter] refactor to support %c w/ GlobalAddress' Summary: Targets like ARM, MSP430, PPC, and SystemZ have complex behavior when printing the address of a MachineOperand::MO_GlobalAddress. Move that handling into a new overridden method in each base class. A virtual method was added to the base class for handling the generic case. Refactors a few subclasses to support the target independent %a, %c, and %n. The patch also contains small cleanups for AVRAsmPrinter and SystemZAsmPrinter. It seems that NVPTXTargetLowering is possibly missing some logic to transform GlobalAddressSDNodes for TargetLowering::LowerAsmOperandForConstraint to handle with "i" extended inline assembly asm constraints. Fixes: - https://bugs.llvm.org/show_bug.cgi?id=41402 - https://github.com/ClangBuiltLinux/linux/issues/449 Reviewers: echristo, void Reviewed By: void Subscribers: void, craig.topper, jholewinski, dschuff, jyknight, dylanmckay, sdardis, nemanjai, javed.absar, sbc100, jgravelle-google, eraman, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, jrtc27, atanasyan, jsji, llvm-commits, kees, tpimh, nathanchance, peter.smith, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60887 llvm-svn: 359337	2019-04-26 18:45:04 +00:00
Simon Pilgrim	27e01e675c	[X86][AVX] Fold extract_subvector(broadcast(x)) -> broadcast(x) iff x has one use llvm-svn: 359332	2019-04-26 18:02:14 +00:00
Jessica Paquette	67ab9eb193	[AArch64][GlobalISel] Select G_BSWAP for vectors of s32 and s64 There are instructions for these, so mark them as legal. Select the correct instruction in AArch64InstructionSelector.cpp. Update select-bswap.mir and arm64-rev.ll to reflect the changes. llvm-svn: 359331	2019-04-26 18:00:01 +00:00
Stanislav Mekhanoshin	61beff020e	[AMDGPU] gfx1010 VOP3 and VOP3P implementation Differential Revision: https://reviews.llvm.org/D61202 llvm-svn: 359328	2019-04-26 17:56:03 +00:00
Simon Pilgrim	ef54b1dddf	[DAGCombine] Cleanup visitEXTRACT_SUBVECTOR. NFCI. Use ArrayRef::slice, reduce some rather awkward long lines for legibility and run clang-format. llvm-svn: 359326	2019-04-26 17:49:02 +00:00
Nikita Popov	c0fa4ec01d	[ConstantRange] Add abs() support Add support for abs() to ConstantRange. This will allow to handle SPF_ABS select flavor in LVI and will also come in handy as a primitive for the srem implementation. The implementation is slightly tricky, because a) abs of signed min is signed min and b) sign-wrapped ranges may have an abs() that is smaller than a full range, so we need to explicitly handle them. Differential Revision: https://reviews.llvm.org/D61084 llvm-svn: 359321	2019-04-26 16:50:31 +00:00
Craig Topper	354247c08d	[X86] Sink NoRegister creation for unused Base/Index registers into getAddressOperands. NFCI llvm-svn: 359318	2019-04-26 16:39:38 +00:00
Craig Topper	ad662cf4c1	[X86] Segment registers should have i16 type not i32. Probably doesn't really matter, but was inconsistent with the rest of the code. llvm-svn: 359317	2019-04-26 16:39:35 +00:00
Stanislav Mekhanoshin	8f3da70eed	[AMDGPU] gfx1010 VOP2 changes Differential Revision: https://reviews.llvm.org/D61156 llvm-svn: 359316	2019-04-26 16:37:51 +00:00
Roland Froese	4b17772b9e	[PowerPC] Update P9 vector costs for insert/extract element The PPC vector cost model values for insert/extract element reflect older processors that lacked vector insert/extract and move-to/move-from VSR instructions. Update getVectorInstrCost to give appropriate values for when the newer instructions are present. Differential Revision: https://reviews.llvm.org/D60160 llvm-svn: 359313	2019-04-26 16:14:17 +00:00
Fangrui Song	3153764c88	s/Dwarf 5/DWARF v5/ NFC llvm-svn: 359307	2019-04-26 13:41:19 +00:00
Simon Pilgrim	c3a34c3e07	Fix Wparentheses warning. NFCI. llvm-svn: 359299	2019-04-26 12:23:42 +00:00
Simon Pilgrim	bb230c5e79	[X86][SSE] Pull out OR(EXTRACTELT(X,0),OR(EXTRACTELT(X,1),...)) matching code from LowerVectorAllZeroTest Create a matchBitOpReduction helper that checks for the pattern with any opcode. First step towards reusing this code to recognize other scalar reduction patterns. llvm-svn: 359296	2019-04-26 11:45:54 +00:00
Fangrui Song	50dcd8bf90	caseFoldingDjbHash: simplify and make the US-ASCII fast path faster The slow path (with at least one non US-ASCII) will be slower but that doesn't matter. Differential Revision: https://reviews.llvm.org/D61178 llvm-svn: 359294	2019-04-26 10:56:10 +00:00
Simon Pilgrim	5d6ef94c36	[X86][SSE] Disable shouldFoldConstantShiftPairToMask for btver1/btver2 targets (PR40758) As detailed on PR40758, Bobcat/Jaguar can perform vector immediate shifts on the same pipes as vector ANDs with the same latency - so it doesn't make sense to replace a shl+lshr with a shift+and pair as it requires an additional mask (with the extra constant pool, loading and register pressure costs). Differential Revision: https://reviews.llvm.org/D61068 llvm-svn: 359293	2019-04-26 10:49:13 +00:00
Simon Pilgrim	5e161df9f8	[X86][AVX] Combine shuffles extracted from a common vector A small step towards combining shuffles across vector sizes - this recognizes when a shuffle's operands are all extracted from the same larger source and tries to combine to an unary shuffle of that source instead. Fixes one of the test cases from PR34380. Differential Revision: https://reviews.llvm.org/D60512 llvm-svn: 359292	2019-04-26 09:56:14 +00:00
Sven van Haastregt	66f612601d	[InferAddressSpaces] Add AS parameter to the pass factory This enables the pass to be used in the absence of TargetTransformInfo. When the argument isn't passed, the factory defaults to UninitializedAddressSpace and the flat address space is obtained from the TargetTransformInfo as before this change. Existing users won't have to change. Patch by Kevin Petit. Differential Revision: https://reviews.llvm.org/D60602 llvm-svn: 359290	2019-04-26 09:21:25 +00:00
Hans Wennborg	5d5ee4aff7	Fix alignment in AArch64InstructionSelector::emitConstantPoolEntry() The code was using the alignment of a pointer to the value, not the alignment of the constant itself. Maybe we got away with it so far because the pointer alignment is fairly high, but we did end up under-aligning <16 x i8> vectors, which was caught in the Chromium build after lld stopped over-aligning the .rodata.cst16 section in r356428. (See crbug.com/953815) Differential revision: https://reviews.llvm.org/D61124 llvm-svn: 359287	2019-04-26 08:31:00 +00:00
Marcello Maggioni	c596584f67	[GlobalISel] Fix inserting copies in the right position for reg definitions When constrainRegClass is called if the constraining happens on a use the COPY needs to be inserted before the instruction that contains the MachineOperand, but if we are constraining a definition it actually needs to be added after the instruction. In addition, the COPY needs to have its operands flipped (in the use case we are copying from the old unconstrained register to the new constrained register, while in the definition case we are copying from the new constrained register that the instruction defines to the old unconstrained register). llvm-svn: 359282	2019-04-26 07:21:56 +00:00
Justin Bogner	df5d2b3846	[GlobalOpt] Swap the expensive check for cold calls with the cheap TTI check isValidCandidateForColdCC is much more expensive than TTI.useColdCCForColdCall, which by default just returns false. Avoid doing this work if we're not going to look at the answer anyway. This change is NFC, but I see significant compile time improvements on some code with pathologically many functions. llvm-svn: 359253	2019-04-26 00:12:50 +00:00
Lang Hames	4f71049a39	[ORC] Remove symbols from dependency lists when failing materialization. When failing materialization of a symbol X, remove X from the dependants list of any of X's dependencies. This ensures that when X's dependencies are emitted (or fail themselves) they do not try to access the no-longer-existing MaterializationInfo for X. llvm-svn: 359252	2019-04-25 23:31:33 +00:00
Artem Belevich	5fe85a003f	[CUDA] Implemented _[bi]mma* builtins. These builtins provide access to the new integer and sub-integer variants of MMA (matrix multiply-accumulate) instructions provided by CUDA-10.x on sm_75 (AKA Turing) GPUs. Also added a feature for PTX 6.4. While Clang/LLVM does not generate any PTX instructions that need it, we still need to pass it through to ptxas in order to be able to compile code that uses the new 'mma' instruction as inline assembly (e.g used by NVIDIA's CUTLASS library https://github.com/NVIDIA/cutlass/blob/master/cutlass/arch/mma.h#L101) Differential Revision: https://reviews.llvm.org/D60279 llvm-svn: 359248	2019-04-25 22:28:09 +00:00
Artem Belevich	16737538f4	PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32. All of the new instructions are still handled mostly by tablegen. I've slightly refactored the code to drive intrinsic/instruction generation from a master list of supported variants, so all irregularities have to be implemented in one place only. The test generation script wmma.py has been refactored in a similar way. Differential Revision: https://reviews.llvm.org/D60015 llvm-svn: 359247	2019-04-25 22:27:57 +00:00
Artem Belevich	8d825b38ed	[NVPTX] generate correct MMA instruction mnemonics with PTX63+. PTX 6.3 requires using ".aligned" in the MMA instruction names. In order to generate correct name, now we pass current PTX version to each instruction as an extra constant operand and InstPrinter adjusts its output accordingly. Differential Revision: https://reviews.llvm.org/D59393 llvm-svn: 359246	2019-04-25 22:27:46 +00:00
Artem Belevich	7ecd82ce19	[NVPTX] Refactor generation of MMA intrinsics and instructions. NFC. Generalized constructions of 'fragments' of MMA operations to provide common primitives for construction of the ops. This will make it easier to add new variants of the instructions that operate on integer types. Use nested foreach loops which makes it possible to better control naming of the intrinsics. This patch does not affect LLVM's output, so there are no test changes. Differential Revision: https://reviews.llvm.org/D59389 llvm-svn: 359245	2019-04-25 22:27:35 +00:00
Sean Fertile	a93a33cb87	[Object][XCOFF] Add initial support for section header table. Adds a representation of the section header table to XCOFFObjectFile, and implements enough to dump the section headers with llvm-objdump. Differential Revision: https://reviews.llvm.org/D60784 llvm-svn: 359244	2019-04-25 21:36:04 +00:00
Stanislav Mekhanoshin	917c477a07	[AMDGPU] gfx1010 - fix ubsan failure Revert DecoderNamespace in one place for now. It will need more changes to properly work. llvm-svn: 359239	2019-04-25 20:39:06 +00:00
David Blaikie	0c4dbf9ecd	Assigning to a local object in a return statement prevents copy elision. NFC. I added a diagnostic along the lines of `-Wpessimizing-move` to detect `return x = y` suppressing copy elision, but I don't know if the diagnostic is really worth it. Anyway, here are the places where my diagnostic reported that copy elision would have been possible if not for the assignment. P1155R1 in the post-San-Diego WG21 (C++ committee) mailing discusses whether WG21 should fix this pitfall by just changing the core language to permit copy elision in cases like these. (Kona update: The bulk of P1155 is proceeding to CWG review, but specifically not the parts that explored the notion of permitting copy-elision in these specific cases.) Reviewed By: dblaikie Author: Arthur O'Dwyer Differential Revision: https://reviews.llvm.org/D54885 llvm-svn: 359236	2019-04-25 20:09:00 +00:00
Jessica Paquette	f54258c888	[GlobalISel][AArch64] Make G_EXTRACT_VECTOR_ELT legal for v8s16s This case was missing before, so we couldn't legalize it. Add it to AArch64LegalizerInfo.cpp and update select-extract-vector-elt.mir. llvm-svn: 359231	2019-04-25 20:00:57 +00:00
Akira Hatanaka	8edf8f317b	[ObjC][ARC] Let ARC optimizer bail out if the number of pointer states it keeps track of becomes too large ARC optimizer does a top-down and a bottom-up traversal of the whole function to pair up retain and release instructions and remove them. This can be expensive if the number of instructions in the function and pointer states it tracks are large since it has to look at each pointer state and determine whether the instruction being visited can potentially use the pointer. This patch adds a command line option that sets a limit to the number of pointers it tracks. rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D61100 llvm-svn: 359226	2019-04-25 19:42:55 +00:00
Stanislav Mekhanoshin	2c97ff07bf	[AMDGPU] gfx1010 VOP1 instructions Differential Revision: https://reviews.llvm.org/D61099 llvm-svn: 359225	2019-04-25 19:01:51 +00:00
Stanislav Mekhanoshin	956b0be72e	[AMDGPU] gfx1010 utility functions Differential Revision: https://reviews.llvm.org/D61094 llvm-svn: 359224	2019-04-25 18:53:41 +00:00
Jessica Paquette	8184b6e7f6	[GlobalISel][AArch64] Add generic legalization rule for extends This adds a legalization rule for G_ZEXT, G_ANYEXT, and G_SEXT which allows extends whenever the types will fit in registers (or the source is an s1). Update tests. Add GISel checks throughout all of arm64-vabs.ll, where we now select a good portion of the code. Add GISel checks to arm64-subvector-extend.ll, which has a good number of vector extends in it. Differential Revision: https://reviews.llvm.org/D60889 llvm-svn: 359222	2019-04-25 18:42:00 +00:00
Craig Topper	f9c30eddd0	[SelectionDAG][X86] Use stack load/store in PromoteIntRes_BITCAST when the input needs to be split and the output type is a vector. We had special case handling here, but it uses a scalar any_extend for the promotion then bitcasts to the final type. This won't split up the input data into multiple promoted elements like we need. This patch falls back to doing the conversion through memory. Fixes PR41594 which I believe was reflected in the bitcast-vector-bool.ll changes. The changes to vector-half-conversions.ll are fixing a previously unknown miscompile from this issue. Differential Revision: https://reviews.llvm.org/D61114 llvm-svn: 359219	2019-04-25 18:19:59 +00:00
Robert Lougher	d469133f95	[Evaluator] Walk initial elements when handling load through bitcast When evaluating a store through a bitcast, the evaluator tries to move the bitcast from the pointer onto the stored value. If the cast is invalid, it tries to "introspect" the type to get a valid cast by obtaining a pointer to the initial element (if the type is nested, this may require walking several initial elements). In some situations it is possible to get a bitcast on a load (e.g. with unions, where the bitcast may not be the same type as the store). However, equivalent logic to the store to introspect the type is missing. This patch adds this logic. Note, when developing the patch I was unhappy with adding similar logic directly to the load case as it could get out of step. Instead, I have abstracted the "introspection" into a helper function, with the specifics being handled by a passed-in lambda function. Differential Revision: https://reviews.llvm.org/D60793 llvm-svn: 359205	2019-04-25 17:00:01 +00:00
Jessica Paquette	ba55767f51	[GlobalISel][AArch64] Legalize G_FNEARBYINT Add legalizer support for G_FNEARBYINT. It's the same as G_FCEIL etc. Since the importer allows us to automatically select this after legalization, also add tests for selection etc. Also update arm64-vfloatintrinsics.ll. llvm-svn: 359204	2019-04-25 16:44:40 +00:00
Jessica Paquette	bd7ac30b15	[GlobalISel] Add IRTranslator support for G_FNEARBYINT Translate llvm.nearbyint into G_FNEARBYINT as a simple intrinsic. Update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60922 llvm-svn: 359203	2019-04-25 16:39:28 +00:00
Simon Pilgrim	48a3b54572	[InstCombine][X86] Tweak generic expansion of PACKSS/PACKUS to shuffle then truncate. NFCI. This has no effect on constant folding but will be useful when we expand non-saturating PACKSS/PACKUS intrinsics. llvm-svn: 359191	2019-04-25 13:51:57 +00:00
Sam McCall	a7edcfb533	[Support] Add JSON streaming output API, faster where the heavy value types aren't needed. Summary: There's still a little bit of constant factor that could be trimmed (e.g. more overloads to avoid round-tripping primitives through json::Value). But this solves the memory scaling problem, and greatly improves the performance constant factor, and the API should leave room for optimization if needed. Adapt TimeProfiler to use it, eliminating almost all the performance regression from r358476. Performance test on my machine: perf stat -r 5 ~/llvmbuild-opt/bin/clang++ -w -S -ftime-trace -mllvm -time-trace-granularity=0 spirit.cpp Handcrafted JSON (HEAD=r358532 with r358476 reverted): 2480ms json::Value (HEAD): 2757ms (+11%) After this patch: 2520 ms (+1.6%) Reviewers: anton-afanasyev, lebedev.ri Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60804 llvm-svn: 359186	2019-04-25 12:51:42 +00:00
Fangrui Song	f6a6290908	Parallel: only allow the first TaskGroup to run tasks in parallel Summary: Concurrent (e.g. nested) llvm::parallel::for_each() may lead to deadlocks. See PR35788 (fixed by rLLD322041) and PR41508 (fixed by D60757). When parallel_for_each() is about to return, in ~Latch() called by ~TaskGroup(), a thread (in the default executor) may block in Latch::sync() waiting for Count to become zero. If all threads in the default executor are blocked, it is a deadlock. To fix this, force serial execution if the current TaskGroup is not the first one. For a nested llvm::parallel::for_each(), this parallelizes the outermost loop and serializes inner loops. Differential Revision: https://reviews.llvm.org/D61115 llvm-svn: 359182	2019-04-25 11:33:30 +00:00
Florian Hahn	1038137f14	[ConstantRange] [a, b) udiv a full range is [0, umax(b)). Reviewers: nikic, spatel, efriedma Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D60536 llvm-svn: 359180	2019-04-25 10:12:43 +00:00
Ilya Biryukov	6fae38ec91	[Testing] Move clangd::Annotations to llvm testing support Summary: Annotations allow writing nice-looking unit test code when one needs access to locations from the source code, e.g. running code completion at particular offsets in a file. See comments in Annotations.cpp for more details on the API. Also got rid of a duplicate annotations parsing code in clang's code complete tests. Reviewers: gribozavr, sammccall Reviewed By: gribozavr Subscribers: mgorny, hiraditya, ioeric, MaskRay, jkorous, arphaman, kadircet, jdoerfert, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D59814 llvm-svn: 359179	2019-04-25 10:08:31 +00:00
George Rimar	45d042ed96	[yaml2obj] - Don't crash on invalid inputs. yaml2obj might crash on invalid input when unable to parse the YAML. Recently a crash of a very similar nature was fixed for empty files. This patch revisits the fix and does it in yaml::Input instead. It seems to be a more correct way to handle such a situation. With that, the crash for invalid inputs is also fixed now. Differential revision: https://reviews.llvm.org/D61059 llvm-svn: 359178	2019-04-25 09:59:55 +00:00
Simon Pilgrim	4b7d3c4831	Fix include order. NFCI. llvm-svn: 359177	2019-04-25 09:49:37 +00:00
Simon Pilgrim	0a7d1b3ce1	[X86][SSE] combineBitcastvxi1 - add support for bitcasting to non-scalar integers Truncate the movmsk scalar integer result to the equivalent scalar integer width as before but then bitcast to the requested type. We still have the issue identified in PR41594 but D61114 should handle this. llvm-svn: 359176	2019-04-25 09:34:36 +00:00
Simon Atanasyan	a0291110da	[MIPS] Use custom bitcast lowering to avoid excessive instructions On Mips32r2 bitcast can be expanded to two sw instructions and an ldc1 when using bitcast i64 to double or an sdc1 and two lw instructions when using bitcast double to i64. By introducing custom lowering that uses mtc1/mthc1 we can avoid excessive instructions. Patch by Mirko Brkusanin. Differential Revision: https://reviews.llvm.org/D61069 llvm-svn: 359171	2019-04-25 07:47:28 +00:00
Craig Topper	013503c78d	[X86] Remove part of an if condition that should always be true. The IndexReg will always be non-null at this point. Earlier in the function, if IndexReg was null we set it to CurDAG->getRegister(0, VT) which made it non-null. llvm-svn: 359170	2019-04-25 06:08:02 +00:00
Alina Sbirlea	733c8c40c8	Enable LoopVectorization by default. Summary: When refactoring vectorization flags, vectorization was disabled by default in the new pass manager. This patch re-enables it for both managers, and changes the assumptions opt makes, based on the new defaults. Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization. Reviewers: chandlerc, jgorbe Subscribers: jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61091 llvm-svn: 359167	2019-04-25 04:49:48 +00:00
Philip Reames	88cd69b56f	Consolidate existing utilities for interpreting vector predicate masks [NFC] llvm-svn: 359163	2019-04-25 02:30:17 +00:00
Kit Barton	8e64f0a649	Fix unused variable warning in LoopFusion pass. Do not wrap the contents of printFusionCandidates in the LLVM_DEBUG macro. This fixes an unused variable warning generated when compiling without asserts but with -DENABLE_LLVM_DUMP. Differential Revision: https://reviews.llvm.org/D61035 llvm-svn: 359161	2019-04-25 02:10:02 +00:00
Philip Reames	7c8647b26f	[InstCombine] Be consistent w/handling of masked intrinsics style wise [NFC] llvm-svn: 359160	2019-04-25 01:18:56 +00:00
Austin Kerbow	83e52142d1	Fix spelling error. NFC Summary: Test commit. Reviewers: msearles, jkorous Reviewed By: jkorous Subscribers: dexonsmith, arsenm, jvesely, nhaehnle, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61093 llvm-svn: 359154	2019-04-24 23:32:21 +00:00
Nico Weber	23cb79ff93	llvm-cvtres: Make new dupe resource error a bit friendlier For well-known type IDs, include the name of the type. To not duplicate the ID->name map, make llvm-readobj call this new function as well. It has slightly different output, so this also requires updating a few tests. Differential Revision: https://reviews.llvm.org/D61086 llvm-svn: 359153	2019-04-24 23:26:30 +00:00
JF Bastien	fb742da34c	posix_spawn should retry upon EINTR Summary: We've seen cases of bots failing with: clang: error: unable to execute command: posix_spawn failed: Interrupted system call Add a small retry loop to posix_spawn in case this happens. Don't retry too much in case there's some systemic problem going on, but retry a few times. <rdar://problem/50181448> Reviewers: Bigcheese, arphaman Subscribers: jkorous, dexonsmith, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61096 llvm-svn: 359152	2019-04-24 23:24:53 +00:00
Amy Huang	68c9199493	Recommitting r358783 and r358786 "[MS] Emit S_HEAPALLOCSITE debug info" with fixes for buildbot error (undefined assembler label). Summary: This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug info in codeview. Currently only changes FastISel, so emitting labels still needs to be implemented in SelectionDAG. Reviewers: rnk Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D61083 llvm-svn: 359149	2019-04-24 23:02:48 +00:00
Sanjay Patel	6f41bf948b	[DAGCombiner] scale repeated FP divisor by splat factor If we have a vector FP division with a splatted divisor, use the existing transform that converts 'x/y' into 'x * (1.0/y)' to allow more conversions. This can then potentially be converted into a scalar FP division by existing combines (rL358984) as seen in the tests here. That can be a potentially big perf difference if scalar fdiv has better timing (including avoiding possible frequency throttling for vector ops). Differential Revision: https://reviews.llvm.org/D61028 llvm-svn: 359147	2019-04-24 22:28:58 +00:00
Joerg Sonnenberger	8372b467f1	[PowerPC] Allow using initial-exec TLS with PIC Using initial-exec TLS variables is a reasonable performance optimisation for system libraries. Use the correct PIC mechanism to get hold of the GOT to avoid text relocations. Differential Revision: https://reviews.llvm.org/D61026 llvm-svn: 359146	2019-04-24 22:12:22 +00:00
Sean Fertile	526633deea	Add period at end of comment. llvm-svn: 359144	2019-04-24 21:51:30 +00:00
Craig Topper	6932abee2c	[X86] Attempt to fix use-after-poison from r359121. llvm-svn: 359143	2019-04-24 21:48:24 +00:00
Stanislav Mekhanoshin	9d287358a8	[AMDGPU] gfx1010 SOP instructions Differential Revision: https://reviews.llvm.org/D61080 llvm-svn: 359139	2019-04-24 20:44:34 +00:00
Alexey Bataev	ef3c1884ec	[SLP] Fix crash after r358519, by V. Porpodas. Summary: The code did not check if operand was undef before casting it to Instruction. Reviewers: RKSimon, ABataev, dtemirbulatov Reviewed By: ABataev Subscribers: uabelho Tags: #llvm Differential Revision: https://reviews.llvm.org/D61024 llvm-svn: 359136	2019-04-24 20:21:32 +00:00
Reid Kleckner	c06a470fc8	Try once more to ensure constant initialization of ManagedStatics First, use the old style of linker initialization for MSVC 2019 in addition to 2017. MSVC 2019 emits a dynamic initializer for ManagedStatic when compiled in debug mode, and according to zturner, also sometimes in release mode. I wasn't able to reproduce that, but it seems best to stick with the old code that works. When clang is using the MSVC STL, we have to give ManagedStatic a constexpr constructor that fully zero initializes all fields, otherwise it emits a dynamic initializer. The MSVC STL implementation of std::atomic has a non-trivial (but constexpr) default constructor that zero initializes the atomic value. Because one of the fields has a non-trivial constructor, ManagedStatic ends up with a non-trivial ctor. The ctor is not constexpr, so clang ends up emitting a dynamic initializer, even though it simply does zero initialization. To make it constexpr, we must initialize all fields of the ManagedStatic. However, while the constructor that takes a pointer is marked constexpr, clang says it does not evaluate to a constant because it contains a cast from a pointer to an integer. I filed this as: https://developercommunity.visualstudio.com/content/problem/545566/stdatomic-value-constructor-is-not-actually-conste.html Once we do that, we can add back the LLVM_REQUIRE_CONSTANT_INITIALIZATION marker, and so far as I'm aware it compiles successfully on all supported targets. llvm-svn: 359135	2019-04-24 20:13:23 +00:00
Xinliang David Li	499c80b890	Add optional arg to profile count getters to filter synthetic profile count. Differential Revision: http://reviews.llvm.org/D61025 llvm-svn: 359131	2019-04-24 19:51:16 +00:00
Craig Topper	af194e9380	[X86] Prevent folding a load into an AND if that AND is really a ZEXT_INREG that should use movzx. This can save a 32-bit immediate move. We would shrink the load and fold it if it was non-volatile, but that's trickier to check for. llvm-svn: 359129	2019-04-24 19:28:38 +00:00
Adrian Prantl	c90ff5e123	Revert using fcopyfile(3) to implement sys::fs::copy_file(Twine, int) on macOS It turns out that I misread the man page and fcopyfile(3) does not actually support COPYFILE_CLONE for files. <rdar://problem/50148757> llvm-svn: 359127	2019-04-24 19:08:43 +00:00
David Blaikie	832c7d9f36	DebugInfo: Emit only declarations (not whole definitions) of non-unit user defined types into type units While this doesn't come up in reasonable cases currently (the only user defined types not in type units are ones without linkage - which makes for near-ODR violations, because it'd be a type with linkage referencing a type without linkage - such a type can't be validly defined in more than one TU, so arguably it shouldn't be in a type unit to begin with - but it's a convenient way to demonstrate an issue that will become more prevalent with homed modular debug info type definitions - which also don't need to be in type units but more legitimately so). Precursor to the Clang change to de-type-unit (by omitting the 'identifier') types homed due to strong linkage vtables. (making that change without this one would lead to major type duplication in type units) llvm-svn: 359122	2019-04-24 18:09:44 +00:00
Craig Topper	882ca6d484	[X86] Remove dead nodes left after ReplaceAllUsesWith calls during address matching ReplaceAllUsesWith doesn't remove the node that was replaced. So it's left around in the graph messing up use counts on other nodes. One thing to note is that this isn't valid if the node being deleted is the root node of an LEA match that gets rejected. In that case the node needs to stay alive because the isel table walking code would still have a reference to it that it's going to try to match next. I don't think that's the case here though because the nodes being deleted here should be "and", "srl", and "zero_extend" none of which can be the root node of an LEA match. Differential Revision: https://reviews.llvm.org/D61048 llvm-svn: 359121	2019-04-24 18:02:07 +00:00
Stanislav Mekhanoshin	33d806a517	[AMDGPU] gfx1010 sgpr register changes Differential Revision: https://reviews.llvm.org/D61045 llvm-svn: 359117	2019-04-24 17:28:30 +00:00
Robert Widmann	09c5b883cb	[LLVM-C] Deprecate the LLVMValueRef-returning metadata creation functions Summary: There is still some value in using these functions while the remaining LLVMValueRef-based accessors are still around, but LLVMMDNodeInContext in particular has some wonky semantics that make it worth replacing outright. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60524 llvm-svn: 359114	2019-04-24 17:05:08 +00:00
Stanislav Mekhanoshin	cee607e414	[AMDGPU] Add gfx1010 target definitions Differential Revision: https://reviews.llvm.org/D61041 llvm-svn: 359113	2019-04-24 17:03:15 +00:00
Simon Pilgrim	55f14dac74	[InstCombine][X86] Use generic expansion of PACKSS/PACKUS for constant folding. NFCI. This patch rewrites the existing PACKSS/PACKUS constant folding code to expand as a generic expansion. This is a first NFCI step toward expanding PACKSS/PACKUS intrinsics which are acting as non-saturating truncations (although technically the expansion could be used in all cases - but we'll probably want to be conservative). llvm-svn: 359111	2019-04-24 16:53:17 +00:00
Nico Weber	8d05eb8556	llvm-undname: Fix assert-on->4GiB-string-literal, found by oss-fuzz llvm-svn: 359109	2019-04-24 16:09:38 +00:00
Lang Hames	b1ba4d8a8a	[JITLink] Refer to FDE's CIE (not the most recent CIE) when parsing eh-frame. Frame Descriptor Entries (FDEs) have a pointer back to a Common Information Entry (CIE) that describes how the rest of the FDE should be parsed. JITLink had been assuming that FDEs always referred to the most recent CIE encountered, but the spec allows them to point back to any previously encountered CIE. This patch fixes JITLink to look up the correct CIE for the FDE. The testcase is a MachO binary with an FDE that refers to a CIE that is not the one immediately preceding it (the layout can be viewed with 'dwarfdump --eh-frame <testcase>'). This test case had to be a binary as llvm-mc now sorts FDEs (as of r356216) to ensure FDEs do point to the most recent CIE. llvm-svn: 359105	2019-04-24 15:15:55 +00:00
Dmitry Preobrazhensky	47621d7c89	[AMDGPU][MC] Parser cleanup and refactoring Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60767 llvm-svn: 359096	2019-04-24 14:06:15 +00:00
Sanjay Patel	b1b3368907	[x86] make sure horizontal op and broadcast types match to simplify (PR41414) If the types don't match, we can't just remove the shuffle. There may be some other opportunity for optimization here, but this should prevent the crashing seen in: https://bugs.llvm.org/show_bug.cgi?id=41414 llvm-svn: 359095	2019-04-24 14:05:08 +00:00
whitequark	50392a3b1b	[LLVM-C] Use dyn_cast instead of unwrap in LLVMGetDebugLoc functions Summary: The `unwrap<Type>` calls can assert with: ``` Assertion failed: (isa<X>(Val) && "cast<Ty>() argument of incompatible type!"), function cast ``` so replace them with `dyn_cast`. Reviewers: whitequark, abdulras, hiraditya, compnerd Reviewed By: whitequark Subscribers: llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60473 llvm-svn: 359093	2019-04-24 13:30:03 +00:00
Simon Pilgrim	d30745b2a0	[X86] Add shouldFoldConstantShiftPairToMask override placeholder. NFCI. Prep work toward fixing PR40758 llvm-svn: 359088	2019-04-24 12:34:08 +00:00
Nico Weber	ccf096463a	Let llvm-cvtres (and lld-link) report duplicate resources If two .res files contain the same resource, cvtres.exe (and hence link.exe) reject the input with this message: CVTRES : fatal error CVT1100: duplicate resource. type:STRING, name:101, language:0x0409 LINK : fatal error LNK1123: failure during conversion to COFF: file invalid or corrupt llvm-cvtres (and lld-link) used to silently pick one of the duplicate resources instead. This patch makes them report an error as well. We slightly improve on cvtres by printing the name of two .res files containing duplicate entries as well. Differential Revision: https://reviews.llvm.org/D61049 llvm-svn: 359083	2019-04-24 11:42:59 +00:00
Bjorn Pettersson	71e8c6f20f	Add "const" in GetUnderlyingObjects. NFC Summary: Both the input Value pointer and the returned Value pointers in GetUnderlyingObjects are now declared as const. It turned out that all current (in-tree) uses of GetUnderlyingObjects were trivial to update, being satisfied with having those Value pointers declared as const. Actually, in the past several of the users had to use const_cast, just because of ValueTracking not providing a version of GetUnderlyingObjects with "const" Value pointers. With this patch we get rid of those const casts. Reviewers: hfinkel, materi, jkorous Reviewed By: jkorous Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D61038 llvm-svn: 359072	2019-04-24 06:55:50 +00:00
Craig Topper	1e413ffa7b	[Mips][CodeGen] Remove MachineFunction::setSubtarget. Change Mips to just copy the subtarget from the MachineFunction instead of recalculating it. Summary: The MachineFunction should have been created with the correct subtarget. As long as there is no way to change it, MipsTargetMachine can just capture it directly from the MachineFunction without calling getSubtargetImpl again. While there, const correct the Subtarget pointer to avoid a const_cast. I believe the Mips16Subtarget and NoMips16Subtarget members are never used, but I'll leave their removal for a separate patch. Reviewers: echristo, atanasyan Reviewed By: atanasyan Subscribers: sdardis, arichardson, hiraditya, jrtc27, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60936 llvm-svn: 359071	2019-04-24 06:48:31 +00:00
Fangrui Song	b5f3984541	[CommandLine] Provide parser<unsigned long> instantiation to allow cl::opt<uint64_t> on LP64 platforms Summary: And migrate opt<unsigned long long> to opt<uint64_t> Fixes PR19665 Differential Revision: https://reviews.llvm.org/D60933 llvm-svn: 359068	2019-04-24 02:40:20 +00:00
Alina Sbirlea	b341efce31	Revert [AliasAnalysis] AAResults preserves AAManager. Triggers use-after-free. llvm-svn: 359055	2019-04-24 00:28:29 +00:00
Francis Visoiu Mistrih	7fee2b89fd	[Remarks] Add string deduplication using a string table * Add support for uniquing strings in the remark streamer and emitting the string table in the remarks section. * Add parsing support for the string table in the RemarkParser. From this remark: ``` --- !Missed Pass: inline Name: NoDefinition DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c', Line: 7, Column: 3 } Function: printArgsNoRet Args: - Callee: printf - String: ' will not be inlined into ' - Caller: printArgsNoRet DebugLoc: { File: 'test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c', Line: 6, Column: 0 } - String: ' because its definition is unavailable' ... ``` to: ``` --- !Missed Pass: 0 Name: 1 DebugLoc: { File: 3, Line: 7, Column: 3 } Function: 2 Args: - Callee: 4 - String: 5 - Caller: 2 DebugLoc: { File: 3, Line: 6, Column: 0 } - String: 6 ... ``` And the string table in the .remarks/__remarks section containing: ``` inline\0NoDefinition\0printArgsNoRet\0 test-suite/SingleSource/UnitTests/2002-04-17-PrintfChar.c\0printf\0 will not be inlined into \0 because its definition is unavailable\0 ``` This is mostly supposed to be used for testing purposes, but it gives us a 2x reduction in the remark size, and is an incremental change for the updates to the remarks file format. Differential Revision: https://reviews.llvm.org/D60227 llvm-svn: 359050	2019-04-24 00:06:24 +00:00
Josh Stone	27924c3a3c	[Lint] Permit aliasing noalias readonly arguments Summary: If two arguments are both readonly, then they have no memory dependency that would violate noalias, even if they do actually overlap. Reviewers: hfinkel, efriedma Reviewed By: efriedma Subscribers: efriedma, hiraditya, llvm-commits, tstellar Tags: #llvm Differential Revision: https://reviews.llvm.org/D60239 llvm-svn: 359047	2019-04-23 23:43:47 +00:00
Jessica Paquette	4fe7574d5d	[AArch64][GlobalISel] Select G_INTRINSIC_ROUND Add selection support for G_INTRINSIC_ROUND, add a selection test, and add check lines to arm64-vfloatintrinsics.ll and f16-instructions.ll. llvm-svn: 359046	2019-04-23 23:03:03 +00:00
Jessica Paquette	9766bf1854	[AArch64][GlobalISel] Mark G_INTRINSIC_ROUND as a pre-isel floating point opcode Add G_INTRINSIC_ROUND to isPreISelGenericFloatingPointOpcode to ensure that its input and output are assigned the correct register bank. Add a regbankselect test to verify that we get what we expect here. llvm-svn: 359044	2019-04-23 22:47:00 +00:00
Dmitry Mikulin	312b5f86b7	The error message for mismatched value sites is very cryptic. Make it more readable for an average user. Differential Revision: https://reviews.llvm.org/D60896 llvm-svn: 359043	2019-04-23 22:26:55 +00:00
Francis Visoiu Mistrih	1646851b87	[CGP] Look through bitcasts when duplicating returns for tail calls The simple case of: ``` int callee(); void caller(void *a) { if (a == NULL) return callee(); return a; } ``` would generate a regular call instead of a tail call because we don't look through the bitcast of the call to `callee` when duplicating the return blocks. Differential Revision: https://reviews.llvm.org/D60837 llvm-svn: 359041	2019-04-23 21:57:46 +00:00
Heejin Ahn	b9f282d384	[WebAssembly] Emit br_table for most switch instructions Summary: Always convert switches to br_tables unless there is only one case, which is equivalent to a simple branch. This reduces code size for wasm, and we defer possible jump table optimizations to the VM. Addresses PR41502. Reviewers: kripken, sunfish Subscribers: dschuff, sbc100, jgravelle-google, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60966 llvm-svn: 359038	2019-04-23 21:30:30 +00:00
Amy Huang	fc79ab9857	Revert "[MS] Emit S_HEAPALLOCSITE debug info" because of ToTWin64(db) buildbot failure. This reverts commit `d07d6d6177` and `c774f687b6`. llvm-svn: 359034	2019-04-23 21:12:58 +00:00
Jessica Paquette	3cc6d1f542	[AArch64][GlobalISel] Legalize G_INTRINSIC_ROUND Add it to the same rule as G_FCEIL etc. Add a legalizer test, and add a missing switch case to AArch64LegalizerInfo.cpp. llvm-svn: 359033	2019-04-23 21:11:57 +00:00
Alina Sbirlea	4fd1f266b1	[MemorySSA] LCSSA preserves MemorySSA. Summary: Enabling MemorySSA in the old pass manager leads to MemorySSA being run twice due to the fact that LCSSA and LoopSimplify do not preserve MemorySSA. This is the first step to address that: target LCSSA. LCSSA does not make any changes that invalidate MemorySSA, so it preserves it by design. It must preserve AA as well, for this to hold. After this patch, MemorySSA is still run twice in the old pass manager. Step two follows: target LoopSimplify. Subscribers: mehdi_amini, jlebar, Prazek, llvm-commits, george.burgess.iv, chandlerc Tags: #llvm Differential Revision: https://reviews.llvm.org/D60832 llvm-svn: 359032	2019-04-23 20:59:44 +00:00
Jessica Paquette	991cb39242	[AArch64][GlobalISel] Actually select G_INTRINSIC_TRUNC Apparently FileCheck wasn't actually matching the fallback check lines in arm64-vfloatintrinsics.ll properly. So, there were selection fallbacks for G_INTRINSIC_TRUNC there. Actually hook it up into AArch64InstructionSelector.cpp and write a proper selection test. I guess I'll figure out the FileCheck magic to make the fallback checks work properly in arm64-vfloatintrinsics.ll. llvm-svn: 359030	2019-04-23 20:46:19 +00:00
Akira Hatanaka	5c3117b0a9	[ObjC][ARC] Check the basic block size before calling DominatorTree::dominate. ARC contract pass has an optimization that replaces the uses of the argument of an ObjC runtime function call with the call result. For example: ; Before optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %1, i8** @g0, align 8 ; After optimization %1 = tail call i8* @foo1() %2 = tail call i8* @llvm.objc.retainAutoreleasedReturnValue(i8* %1) store i8* %2, i8** @g0, align 8 // %1 is replaced with %2 Before replacing the argument use, DominatorTree::dominate is called to determine whether the user instruction is dominated by the ObjC runtime function call instruction. The call to DominatorTree::dominate can be expensive if the two instructions belong to the same basic block and the size of the basic block is large. This patch checks the basic block size and just bails out if the size exceeds the limit set by command line option "arc-contract-max-bb-size". rdar://problem/49477063 Differential Revision: https://reviews.llvm.org/D60900 llvm-svn: 359027	2019-04-23 19:49:03 +00:00
David Blaikie	2f51176223	Reapply: "DebugInfo: Emit only one kind of accelerated access/name table"" Originally committed in r358931 Reverted in r358997 Seems this change made Apple accelerator tables miss names (because names started respecting the CU NameTableKind GNU & assuming that shouldn't produce accelerated names too), which is never correct (apple accelerator tables don't have separators or CU lists - if present, they must describe all names in all CUs). Original Description: Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames. Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}. nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO. nameTableKind: GNU always gives gnu_pubnames llvm-svn: 359026	2019-04-23 19:00:45 +00:00
Teresa Johnson	867bc3951b	[ThinLTO] Pass down opt level to LTO backend and handle -O0 LTO in new PM Summary: The opt level was not being passed down to the ThinLTO backend when invoked via clang (for distributed ThinLTO). This exposed an issue where the new PM was asserting if the Thin or regular LTO backend pipelines were invoked with -O0 (not a new issue, could be provoked by invoking in-process *LTO backends via linker using new PM and -O0). Fix this similar to the old PM where -O0 only does the necessary lowering of type metadata (WPD and LowerTypeTest passes) and then quits, rather than asserting. Reviewers: xur Subscribers: mehdi_amini, inglorion, eraman, hiraditya, steven_wu, dexonsmith, cfe-commits, llvm-commits, pcc Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D61022 llvm-svn: 359025	2019-04-23 18:56:19 +00:00
Nico Weber	6967da8ffa	llvm-cvtres: Split addChild(ID) into two functions Before, there was an IsData parameter. Now, there are two different functions for data nodes and ID nodes. No behavior change, needed for a follow-up change to make two data nodes (but not two ID nodes) with the same ID an error. For consistency, rename another addChild() overload to addNameChild(). llvm-svn: 359024	2019-04-23 18:46:53 +00:00
Jessica Paquette	ede0b2e695	[AArch64][GlobalISel] Teach regbankselect about G_INTRINSIC_TRUNC Add it to isPreISelGenericFloatingPointOpcode, and add a regbankselect test. Update arm64-vfloatintrinsics.ll now that we can select it. llvm-svn: 359022	2019-04-23 18:20:47 +00:00
Jessica Paquette	56342642a0	[AArch64][GlobalISel] Legalize G_INTRINSIC_TRUNC Same patch as G_FCEIL etc. Add the missing switch case in widenScalar, add G_INTRINSIC_TRUNC to the correct rule in AArch64LegalizerInfo.cpp, and add a test. llvm-svn: 359021	2019-04-23 18:20:44 +00:00
Nikita Popov	f945429fed	[ConstantRange] Add urem support Add urem support to ConstantRange, so we can handle it in LVI. This is an approximate implementation that tries to capture the most useful conditions: If the LHS is always strictly smaller than the RHS, then the urem is a no-op and the result is the same as the LHS range. Otherwise the lower bound is zero and the upper bound is min(LHSMax, RHSMax - 1). Differential Revision: https://reviews.llvm.org/D60952 llvm-svn: 359019	2019-04-23 18:00:17 +00:00
Stanislav Mekhanoshin	c464dddccb	[AMDGPU] Fixed addReg() in SIOptimizeExecMaskingPreRA.cpp The second argument is flags, not subreg. Differential Revision: https://reviews.llvm.org/D61031 llvm-svn: 359017	2019-04-23 17:59:26 +00:00
Jessica Paquette	df5ce782ad	[AArch64][GlobalISel] Legalize G_FMA for more vector types Same as G_FCEIL, G_FABS, etc. Just move it into that rule. Add a legalizer test for G_FMA, which we didn't have before and update arm64-vfloatintrinsics.ll. llvm-svn: 359015	2019-04-23 17:37:56 +00:00
Alina Sbirlea	a809e8e5e7	[AliasAnalysis] AAResults preserves AAManager. Summary: AAResults should not invalidate AAManager. Update tests. Reviewers: chandlerc Subscribers: mehdi_amini, jlebar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60914 llvm-svn: 359014	2019-04-23 17:21:18 +00:00
Jessica Paquette	e50e6d2563	[AArch64][GlobalISel] Add G_FMA to isPreISelGenericFloatingPointOpcode Noticed an unnecessary fallback in arm64-vmul caused by this. Also add a regbankselect test for G_FMA. llvm-svn: 359013	2019-04-23 17:17:06 +00:00
Nico Weber	e8f21b1a6b	llvm-undname: Support demangling the spaceship operator Also add a test for demangling the co_await operator. llvm-svn: 359007	2019-04-23 16:20:27 +00:00
Philip Reames	2ce017026a	[InstCombine] Convert a masked.load of a dereferenceable address to an unconditional load If we have a masked.load from a location we know to be dereferenceable, we can simply issue a speculative unconditional load against that address. The key advantage is that it produces IR which is well understood by the optimizer. The select (cnd, load, passthrough) form produced should be pattern matchable back to hardware predication if profitable. Differential Revision: https://reviews.llvm.org/D59703 llvm-svn: 359000	2019-04-23 15:25:14 +00:00
Sanjay Patel	12a561fa1b	[x86] use psubus for more vsetcc lowering (PR39859) Circling back to a leftover bit from PR39859: https://bugs.llvm.org/show_bug.cgi?id=39859#c1 ...we have this counter-intuitive (based on the test diffs) opportunity to use 'psubus'. This appears to be the better perf option for both Haswell and Jaguar based on llvm-mca. We already do this transform for the SETULT predicate, so this makes the code more symmetrical too. If we have pminub/pminuw, we prefer those, so this should not affect anything but pre-SSE4.1 subtargets. $ cat before.s movdqa -16(%rip), %xmm2 ## xmm2 = [32768,32768,32768,32768,32768,32768,32768,32768] pxor %xmm0, %xmm2 pcmpgtw -32(%rip), %xmm2 ## xmm2 = [255,255,255,255,255,255,255,255] pand %xmm2, %xmm0 pandn %xmm1, %xmm2 por %xmm2, %xmm0 $ cat after.s movdqa -16(%rip), %xmm2 ## xmm2 = [256,256,256,256,256,256,256,256] psubusw %xmm0, %xmm2 pxor %xmm3, %xmm3 pcmpeqw %xmm2, %xmm3 pand %xmm3, %xmm0 pandn %xmm1, %xmm3 por %xmm3, %xmm0 $ llvm-mca before.s -mcpu=haswell Iterations: 100 Instructions: 600 Total Cycles: 909 Total uOps: 700 Dispatch Width: 4 uOps Per Cycle: 0.77 IPC: 0.66 Block RThroughput: 1.8 $ llvm-mca after.s -mcpu=haswell Iterations: 100 Instructions: 700 Total Cycles: 409 Total uOps: 700 Dispatch Width: 4 uOps Per Cycle: 1.71 IPC: 1.71 Block RThroughput: 1.8 Differential Revision: https://reviews.llvm.org/D60838 llvm-svn: 358999	2019-04-23 15:20:17 +00:00
Joerg Sonnenberger	6e7cc49d5c	[SPARC] Use the correct register set for the "r" asm constraint. 64bit mode must use 64bit registers, otherwise assumptions about the top half of the registers are made. Problem found by Takeshi Nakayama in NetBSD. llvm-svn: 358998	2019-04-23 15:15:33 +00:00
David Blaikie	a2470a4653	Revert "DebugInfo: Emit only one kind of accelerated access/name table" Regresses some apple_names situations - still investigating. This reverts commit r358931. llvm-svn: 358997	2019-04-23 15:03:24 +00:00
Fangrui Song	efd94c56ba	Use llvm::stable_sort While touching the code, simplify if feasible. llvm-svn: 358996	2019-04-23 14:51:27 +00:00
Lewis Revill	df3cb477a3	[RISCV] Support assembling %tls_{ie,gd}_pcrel_hi modifiers This patch adds support for parsing and assembling the %tls_ie_pcrel_hi and %tls_gd_pcrel_hi modifiers. Differential Revision: https://reviews.llvm.org/D55342 llvm-svn: 358994	2019-04-23 14:46:13 +00:00
Scott Linder	3eed961973	[AMDGPU] Fix hidden argument metadata duplication for V3 Essentially complete a proper rebase of the V3 metadata change over https://reviews.llvm.org/D49096. Minimize the diff between the V2 and V3 variants of the relevant lit tests, and clean up some trailing whitespace. llvm-svn: 358992	2019-04-23 14:31:17 +00:00
Simon Pilgrim	0e4992ce27	[X86] Pull out collectConcatOps helper. NFCI. Create collectConcatOps helper that returns all the subvector ops for CONCAT_VECTORS or a INSERT_SUBVECTOR series. llvm-svn: 358989	2019-04-23 14:07:49 +00:00
Tim Northover	6af366be8a	ARM: disallow add/sub to sp unless Rn is also sp. The manual says that Thumb2 add/sub instructions are only allowed to modify sp if the first source is also sp. This is slightly different from the usual rGPR restriction since it's context-sensitive, so implement it in C++. llvm-svn: 358987	2019-04-23 13:50:13 +00:00
Sanjay Patel	06ff5eae5b	[DAGCombiner] generalize binop-of-splats scalarization If we only match build vectors, we can miss some patterns that use shuffles as seen in the affected tests. Note that the underlying calls within getSplatSourceVector() have the potential for compile-time explosion because of exponential recursion looking through binop opcodes, but currently the list of supported opcodes is very limited. Both of those problems should be addressed in follow-up patches. llvm-svn: 358984	2019-04-23 13:16:41 +00:00
Nicolai Haehnle	7edae4c403	AMDGPU: Fix LCSSA phi lowering in SILowerI1Copies Summary: When an LCSSA phi survives through instruction selection, the pass ends up removing that phi entirely because it is dominated by the logic that does the lanemask merging. This then used to trigger an assertion when processing a dependent phi instruction. Change-Id: Id4949719f8298062fe476a25718acccc109113b6 Reviewers: llvm-commits Subscribers: kzhuravl, jvesely, wdng, yaxunl, t-tye, tpr, dstuttard, rtaylor, arsenm Tags: #llvm Differential Revision: https://reviews.llvm.org/D60999 llvm-svn: 358983	2019-04-23 13:12:52 +00:00
Fedor Sergeev	652168a99b	[CallSite removal] move InlineCost to CallBase usage Converting InlineCost interface and its internals into CallBase usage. Inliners themselves are still not converted. Reviewed By: reames Tags: #llvm Differential Revision: https://reviews.llvm.org/D60636 llvm-svn: 358982	2019-04-23 12:43:27 +00:00
David Green	c519d3c403	[ARM] Update check for CBZ in Ifcvt The check for creating CBZ in constant island pass recently obtained the ability to search backwards to find a Cmp instruction. The code in IfCvt should mirror this to allow more conversions to the smaller form. The common code has been pulled out into a separate function to be shared between the two places. Differential Revision: https://reviews.llvm.org/D60090 llvm-svn: 358977	2019-04-23 12:11:26 +00:00
David Green	2f9eed6265	[ARM] Don't replicate instructions in Ifcvt at minsize Ifcvt can replicate instructions as it converts them to be predicated. This stops that from happening on thumb2 targets at minsize where an extra IT instruction is likely needed. Differential Revision: https://reviews.llvm.org/D60089 llvm-svn: 358974	2019-04-23 11:46:58 +00:00
Simon Pilgrim	ddd225d1a9	Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI. llvm-svn: 358970	2019-04-23 11:16:16 +00:00
Simon Pilgrim	e7a68fd93e	Fix MSVC "32-bit shift implicitly converted to 64 bits" warning. NFCI. llvm-svn: 358969	2019-04-23 11:11:34 +00:00
Bjorn Pettersson	f97b29be88	[DAGCombiner] Combine OR as ADD when no common bits are set Summary: The DAGCombiner is rewriting (canonicalizing) an ISD::ADD with no common bits set in the operands as an ISD::OR node. This could sometimes result in "missing out" on some combines that normally are performed for ADD. To be more specific this could happen if we already have rewritten an ADD into OR, and later (after legalizations or combines) we expose patterns that could have been optimized if we had seen the OR as an ADD (e.g. reassociations based on ADD). To make the DAG combiner less sensitive to if ADD or OR is used for these "no common bits set" ADD/OR operations we now apply most of the ADD combines also to an OR operation, when value tracking indicates that the operands have no common bits set. Reviewers: spatel, RKSimon, craig.topper, kparzysz Reviewed By: spatel Subscribers: arsenm, rampitec, lebedev.ri, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59758 llvm-svn: 358965	2019-04-23 10:01:08 +00:00
Javed Absar	1cdc3dbc58	[AArch64] Add support for MTE intrinsics This patch provides intrinsics support for Memory Tagging Extension (MTE), which was introduced with the Armv8.5-a architecture. The intrinsics are described in detail in the latest ACLE Q1 2019 documentation: https://developer.arm.com/docs/101028/latest Reviewed by: David Spickett Differential Revision: https://reviews.llvm.org/D60486 llvm-svn: 358963	2019-04-23 09:39:58 +00:00
Diogo N. Sampaio	2619f399f9	[ARM][FIX] Add missing f16.lane.vldN/vstN lowering Summary: Add missing D and Q lane VLDSTLane lowering for fp16 elements. Reviewers: efriedma, kosarev, SjoerdMeijer, ostannard Reviewed By: efriedma Subscribers: javed.absar, kristof.beyls, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60874 llvm-svn: 358962	2019-04-23 09:36:39 +00:00
George Rimar	b9ed9cb5d7	[llvm-mc] - Properly set the address align field of the compressed sections. About the compressed sections, the spec says: (https://docs.oracle.com/cd/E37838_01/html/E36783/section_compression.html) sh_addralign fields of the section header for a compressed section reflect the requirements of the compressed section. Currently, llvm-mc always puts uncompressed section alignment to sh_addralign. It is not correct. zlib styled section contains an Elfxx_Chdr header, so we should either use 4 or 8 values depending on the target (Uncompressed section alignment is stored in ch_addralign field of the compression header). GNU assembler version 2.31.1 also has this issue, but in 2.32.51 it was already fixed. This is how it was found during debugging of the https://bugs.llvm.org/show_bug.cgi?id=40482 actually. Differential revision: https://reviews.llvm.org/D60965 llvm-svn: 358960	2019-04-23 09:16:53 +00:00
David Green	63a2aa715a	[LSR] Limit the recursion for setup cost In some circumstances we can end up with setup costs that are very complex to compute, even though the scevs are not very complex to create. This can also lead to setupcosts that are calculated to be exactly -1, which LSR treats as an invalid cost. This patch puts a limit on the recursion depth for setup cost to prevent them taking too long. Thanks to @reames for the report and test case. Differential Revision: https://reviews.llvm.org/D60944 llvm-svn: 358958	2019-04-23 08:52:21 +00:00
Sam Clegg	9da81421b8	[WebAssembly] Bail out of fastisel earlier when computing PIC addresses This change partially reverts https://reviews.llvm.org/D54647 in favor of bailing out during computeAddress instead. This catches the condition earlier and handles more cases. Differential Revision: https://reviews.llvm.org/D60986 llvm-svn: 358948	2019-04-23 03:43:26 +00:00
Chandler Carruth	bbddf21f90	Revert "Use const DebugLoc&" This reverts r358910 (git commit `2b74466530`) While this patch seems trivial and safe and correct, it is not. The copies are actually load bearing copies. You can observe this with MSan or other ways of checking for use-after-destroy, but otherwise this may result in ... difficult to debug inexplicable behavior. I suspect the issue is that the debug location is used after the original reference to it is removed. The metadata backing it gets destroyed as its last reference goes away, and then we reference it later through these const references. llvm-svn: 358940	2019-04-23 01:42:07 +00:00
David Blaikie	68602ab2f3	DebugInfo: Emit only one kind of accelerated access/name table Currently to opt in to debug_names in DWARFv5, the IR must contain 'nameTableKind: Default' which also enables debug_pubnames. Instead, only allow one of {debug_names, apple_names, debug_pubnames, debug_gnu_pubnames}. nameTableKind: Default gives debug_names in DWARFv5 and greater, debug_pubnames in v4 and earlier - and apple_names when tuning for lldb on MachO. nameTableKind: GNU always gives gnu_pubnames llvm-svn: 358931	2019-04-22 22:45:11 +00:00
Sanjay Patel	bf8aacb715	[SelectionDAG] move splat util functions up from x86 lowering This was supposed to be NFC, but the change in SDLoc definitions causes instruction scheduling changes. There's nothing x86-specific in this code, and it can likely be used from DAGCombiner's simplifyVBinOp(). llvm-svn: 358930	2019-04-22 22:43:36 +00:00
Michael Liao	389d5a3474	[AMDGPU] Fix an issue in `op_sel_hi` skipping. Summary: - Only apply packed literal `op_sel_hi` skipping on operands requiring packed literals. Even if an instruction is `packed`, it may have an operand requiring a non-packed literal, such as `v_dot2_f32_f16`. Reviewers: rampitec, arsenm, kzhuravl Subscribers: jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60978 llvm-svn: 358922	2019-04-22 22:05:49 +00:00
Philip Reames	d748689c7f	[InstCombine] Eliminate stores to constant memory If we have a store to a piece of memory which is known constant, then we know the store must be storing back the same value. As a result, the store (or memset, or memmove) must either be down a dead path, or a noop. In either case, it is valid to simply remove the store. The motivating case for this involves a memmove to a buffer which is constant down a path which is dynamically dead. Note that I'm choosing to implement the less aggressive of two possible semantics here. We could simply say that the store is undefined, and prune the path. Consensus in the review was that the more aggressive form might be a good follow on change at a later date. Differential Revision: https://reviews.llvm.org/D60659 llvm-svn: 358919	2019-04-22 20:28:19 +00:00
Philip Reames	d8d9b7b20e	[InstSimplify] Move masked.gather w/no active lanes handling to InstSimplify from InstCombine In the process, use the existing masked.load combine which is slightly stronger, and handles a mix of zero and undef elements in the mask. llvm-svn: 358913	2019-04-22 19:30:01 +00:00
Matt Arsenault	2b74466530	Use const DebugLoc& llvm-svn: 358910	2019-04-22 19:14:27 +00:00
Matt Arsenault	f84ce75cd1	AMDGPU: Skip debug instructions in assert These are inserted after branch relaxation, and for some reason it's decided to put them in the long branch expansion block. It's probably not great to rely on the source block address, so this should probably be switched to being PC relative instead of relying on the block address llvm-svn: 358909	2019-04-22 19:14:26 +00:00
Justin Bogner	e90d5c8db0	[IPSCCP] Add missing `AssumptionCacheTracker` dependency Back in August, r340525 introduced a dependency on the assumption cache tracker in the ipsccp pass, but that commit missed a call to INITIALIZE_PASS_DEPENDENCY, which leaves the assumption cache improperly registered if SCCP is the only thing that pulls it in. llvm-svn: 358903	2019-04-22 17:38:29 +00:00
Philip Reames	37104d7189	[LPM/BPI] Preserve BPI through trivial loop pass pipeline (e.g. LCSSA, LoopSimplify) Currently, we do not expose BPI to loop passes at all. In the old pass manager, we appear to have been ignoring the fact that LCSSA and/or LoopSimplify didn't preserve BPI, and making it available to the following loop passes anyways. In the new one, it's invalidated before running any loop pass if either LCSSA or LoopSimplify actually make changes. If they don't make changes, then BPI is valid and available. So, we go ahead and teach LCSSA and LoopSimplify how to preserve BPI for consistency between old and new pass managers. This patch avoids an invalidation between the two requires in the following trivial pass pipeline: opt -passes="requires<branch-prob>,loop(no-op-loop),requires<branch-prob>" (when the input file is one which requires either LCSSA or LoopSimplify to canonicalize the loops) Differential Revision: https://reviews.llvm.org/D60790 llvm-svn: 358901	2019-04-22 17:13:43 +00:00
Wei Mi	01f8d556aa	[PGO/SamplePGO][NFC] Move the function updateProfWeight from Instruction to CallInst. The issue was raised here: https://reviews.llvm.org/D60903#1472783 The function Instruction::updateProfWeight is only used for CallInst in profile update. From the current interface, it is very easy to think that the function can also be used for branch instruction. However, Branch instruction doesn't need the scaling the function provides for branch_weights and VP (value profile), in addition, scaling may introduce inaccuracy for branch probability. The patch moves the function updateProfWeight from Instruction class to CallInst to remove the confusion. The patch also changes the scaling of branch_weights from a loop to a block because we know that ProfileData for branch_weights of CallInst will only have two operands at most. Differential Revision: https://reviews.llvm.org/D60911 llvm-svn: 358900	2019-04-22 17:04:51 +00:00
Matt Arsenault	2b6f76f05f	AMDGPU/GlobalISel: Fix non-power-of-2 G_EXTRACT sources llvm-svn: 358894	2019-04-22 15:22:46 +00:00
Matt Arsenault	8f624abc1d	GlobalISel: Legalize scalar G_EXTRACT sources llvm-svn: 358892	2019-04-22 15:10:42 +00:00
Nico Weber	f5c7f3ad33	llvm-undname: Fix an assert-on-invalid, found by oss-fuzz llvm-svn: 358891	2019-04-22 15:05:18 +00:00
Matt Arsenault	70346d127b	AMDGPU: Fix not checking for copy when looking at copy src Effectively reverts r356956. The check for isFullCopy was excessive, but there still needs to be a check that this is a copy. llvm-svn: 358890	2019-04-22 14:54:39 +00:00
Dmitry Preobrazhensky	e2707f5aac	[AMDGPU][MC] Corrected parsing of SP3 'neg' modifier See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60624 llvm-svn: 358888	2019-04-22 14:35:47 +00:00
Simon Pilgrim	6276ce0142	[TargetLowering][AMDGPU][X86] Improve SimplifyDemandedBits bitcast handling This patch adds support for BigBitWidth -> SmallBitWidth bitcasts, splitting the DemandedBits/Elts accordingly. The AMDGPU backend needed an extra (srl (and x, c1 << c2), c2) -> (and (srl(x, c2), c1) combine to encourage BFE creation, I investigated putting this in DAGCombine but it caused a lot of noise on other targets - some improvements, some regressions. The X86 changes are all definite wins. Differential Revision: https://reviews.llvm.org/D60462 llvm-svn: 358887	2019-04-22 14:04:35 +00:00
Sanjay Patel	9bc6c77220	[DAGCombiner] make variable name less ambiguous; NFC llvm-svn: 358886	2019-04-22 13:42:50 +00:00
Sanjay Patel	d6989daae9	[DAGCombiner] prepare shuffle-of-splat to handle more patterns; NFC llvm-svn: 358884	2019-04-22 13:36:07 +00:00
Robert Widmann	ff8febcb6d	[LLVM-C] Add accessors to the default floating-point metadata node Summary: Add a getter and setter pair for floating-point accuracy metadata. Reviewers: whitequark, deadalnix Reviewed By: whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60527 llvm-svn: 358883	2019-04-22 13:13:22 +00:00
Serguei Katkov	40a3b96196	[NewPM] Add Option handling for SimpleLoopUnswitch This patch enables passing options to SimpleLoopUnswitch via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60676 llvm-svn: 358880	2019-04-22 10:35:07 +00:00
Nikita Popov	5aacc7a573	Revert "[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC" This reverts commit 7bf4d7c07f2fac862ef34c82ad0fef6513452445. After thinking about this more, this isn't right, the range is not exact in the same sense as makeExactICmpRegion(). This needs a separate function. llvm-svn: 358876	2019-04-22 09:01:38 +00:00
Nikita Popov	5299e25f50	[ConstantRange] Rename make{Guaranteed -> Exact}NoWrapRegion() NFC Following D60632 makeGuaranteedNoWrapRegion() always returns an exact nowrap region. Rename the function accordingly. This is in line with the naming of makeExactICmpRegion(). llvm-svn: 358875	2019-04-22 08:36:05 +00:00
Craig Topper	5c43ab337f	[X86] Reject 512-bit types in getRegForInlineAsmConstraint when AVX512 is not enabled. Same for 256 bit and AVX. llvm-svn: 358872	2019-04-22 06:12:02 +00:00
Lang Hames	1233c15be5	[JITLink] Remove a lot of redundant 'JITLink_' prefixes. NFC. llvm-svn: 358869	2019-04-22 03:03:09 +00:00
Lang Hames	d3dac47aa2	[JITLink] Fix section start address calculation in eh-frame recorder. Section atoms are not sorted, so we need to scan the whole section to find the start address. No test case: Found by inspection, and any reproduction would depend on pointer ordering. llvm-svn: 358865	2019-04-22 01:35:16 +00:00
Nico Weber	ce67a41741	llvm-undname: Fix hex escapes in wchar_t, char16_t, char32_t strings llvm-undname used to put '\x' in front of every pair of nibbles, but u"\xD7\xFF" produces a string with 6 bytes: \xD7 \0 \xFF \0 (and \0\0). Correct for a single character (plus terminating \0) is u\xD7FF instead. Now, wchar_t, char16_t, and char32_t strings roundtrip from source to clang-cl (and cl.exe) and then llvm-undname. (...at least as long as it's not a string like L"\xD7FF" L"foo" which gets demangled as L"\xD7FFfoo", where the compiler then considers the "f" as part of the hex escape. That seems ok.) Also add a comment saying that the "almost-valid" char32_t string I added in my last commit is actually produced by compilers. llvm-svn: 358857	2019-04-21 17:19:27 +00:00
Nico Weber	8fc9902bbb	llvm-undname: Fix stack overflow on almost-valid If an unsigned with all 4 bytes non-0 was passed to outputHex(), there were two off-by-ones in it: - Both MaxPos and Pos left space for the final \0, which left the buffer one byte too small. Set MaxPos to 16 instead of 15 to fix. - The `assert(Pos >= 0);` was after a `Pos--`, move it up one line. Since valid Unicode codepoints are <= 0x10ffff, this could never really happen in practice. Found by oss-fuzz. llvm-svn: 358856	2019-04-21 16:58:25 +00:00
Nikita Popov	198ab60136	[ConstantRange] Add saturating add/sub methods Add support for uadd_sat and friends to ConstantRange, so we can handle uadd.sat and friends in LVI. The implementation is forwarding to the corresponding APInt methods with appropriate bounds. One thing worth pointing out here is that the handling of wrapping ranges is not maximally accurate. A simple example is that adding 0 to a wrapped range will return a full range, rather than the original wrapped range. The tests also only check that the non-wrapping envelope is correct and minimal. Differential Revision: https://reviews.llvm.org/D60946 llvm-svn: 358855	2019-04-21 15:23:05 +00:00
Nikita Popov	dbc3fbafe7	[ConstantRange] Add getNonEmpty() constructor ConstantRanges have an annoying special case: If upper and lower are the same, it can be either an empty or a full set. When constructing constant ranges nearly always a full set is intended, but this still requires an explicit check in many places. This revision adds a getNonEmpty() constructor that disambiguates this case: If upper and lower are the same, a full set is created. Differential Revision: https://reviews.llvm.org/D60947 llvm-svn: 358854	2019-04-21 15:22:54 +00:00
Nico Weber	aa162682ca	llvm-undname: Fix stack overflow on invalid found by oss-fuzz llvm-svn: 358852	2019-04-21 14:25:07 +00:00
David Green	0d741507f7	[ARM] Rewrite isLegalT2AddressImmediate This does two main things, firstly adding some at least basic addressing modes for i64 types, and secondly treats floats and doubles sensibly when there is no fpu. The floating point change can help codesize in some cases, especially with D60294. Most backends seem to not consider the exact VT in isLegalAddressingMode, instead switching on type size. That is now what this does when the target does not have an fpu (as the float data will be loaded using LDR's). i64's currently use the address range of an LDRD (even though they may be legalised and loaded with an LDR). This is at least better than marking them all as illegal addressing modes. I have not attempted to do much with vectors yet. That will need changing once MVE is added. Differential Revision: https://reviews.llvm.org/D60677 llvm-svn: 358845	2019-04-21 09:54:29 +00:00
Craig Topper	df02beb416	[X86] Add the rounding control operand to the printing for some scalar FMA instructions. llvm-svn: 358844	2019-04-21 07:12:56 +00:00
Fangrui Song	a0f9c4f72c	[CachePruning] Simplify comparator llvm-svn: 358843	2019-04-21 06:17:40 +00:00
Craig Topper	63db7e347b	[X86] Don't form masked vfpclass instruction from and+vfpclass unless the fpclass only has a single use. llvm-svn: 358841	2019-04-21 05:18:04 +00:00
Lang Hames	a97032e947	[JITLink] Remove an overly strict error check in JITLink's eh-frame parser. The error check required FDEs to refer to the most recent CIE, but the eh-frame spec allows them to refer to any previously seen CIE. This patch removes the offending check. llvm-svn: 358840	2019-04-21 04:48:32 +00:00
Lang Hames	0191531a76	[JITLink] Factor basic common GOT and stub creation code into its own class. llvm-svn: 358838	2019-04-21 03:14:42 +00:00
Nico Weber	8eeaf5178d	llvm-undname: Improve string literal demangling with embedded \0 chars - Don't assert when a string looks like a u32 string to the heuristic but doesn't have a length that's 0 mod 4. Instead, classify those as u16 with embedded \0 chars. Found by oss-fuzz. - Print embedded nul bytes as \0 instead of \x00. llvm-svn: 358835	2019-04-20 23:59:06 +00:00
Nico Weber	f2654b638d	ftime-trace: Trace the name of the currently active pass as well. Differential Revision: https://reviews.llvm.org/D60782 llvm-svn: 358834	2019-04-20 23:22:45 +00:00
Lang Hames	65e1ddd713	[JITLink] Add yet more detail to MachO/x86-64 unsupported relocation errors. Knowing the address/symbolnum field values makes it easier to identify the unsupported relocation, and provides enough information for the full bit pattern of the relocation to be reconstructed. llvm-svn: 358833	2019-04-20 22:59:43 +00:00
Lang Hames	5004abcd86	[JITLink][ORC] Add JITLink to the list of dependencies for ORC. The new ObjectLinkingLayer in ORC depends on JITLink. This should fix the build error at http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/9621 llvm-svn: 358832	2019-04-20 22:15:57 +00:00
Lang Hames	7f77a231fa	[JITLink] Fix a bad formatv format string. llvm-svn: 358831	2019-04-20 22:06:12 +00:00
Amara Emerson	4286652556	Revert r358800. Breaks Obsequi from the test suite. The last attempt fixed gcc and consumer-typeset, but Obsequi seems to fail with a different issue. llvm-svn: 358829	2019-04-20 21:25:00 +00:00
Lang Hames	daed9b10f1	[JITLink] Add BinaryFormat to JITLink's dependencies. Hopefully this will fix the missing dependence on llvm::identify_magic that is showing up on some PPC bots. E.g. http://lab.llvm.org:8011/builders/clang-ppc64le-linux-multistage/builds/9617 llvm-svn: 358827	2019-04-20 19:48:45 +00:00
Lang Hames	c283fc5ebb	[JITLink] Add more detail to MachO/x86-64 "unsupported relocation" errors. The extra information here will be helpful in diagnosing errors, like the ones currently occurring on the PPC big-endian bots. :) llvm-svn: 358826	2019-04-20 18:50:13 +00:00
Lang Hames	dfc3a4f6ff	[JITLink] Silence some MSVC implicit cast warnings. llvm-svn: 358824	2019-04-20 18:30:16 +00:00
Lang Hames	b39109585a	[JITLink] Use memset instead of bzero. llvm-svn: 358822	2019-04-20 17:49:58 +00:00
Lang Hames	68b0b8c192	[JITLink] Fix a missing header and bad prototype. llvm-svn: 358819	2019-04-20 17:29:57 +00:00
Lang Hames	11c8dfa583	Initial implementation of JITLink - A replacement for RuntimeDyld. Summary: JITLink is a jit-linker that performs the same high-level task as RuntimeDyld: it parses relocatable object files and makes their contents runnable in a target process. JITLink aims to improve on RuntimeDyld in several ways: (1) A clear design intended to maximize code-sharing while minimizing coupling. RuntimeDyld has been developed in an ad-hoc fashion for a number of years and this has led to intermingling of code for multiple architectures (e.g. in RuntimeDyldELF::processRelocationRef) in a way that makes the code more difficult to read, reason about, extend. JITLink is designed to isolate format and architecture specific code, while still sharing generic code. (2) Support for native code models. RuntimeDyld required the use of large code models (where calls to external functions are made indirectly via registers) for many platforms due to its restrictive model for stub generation (one "stub" per symbol). JITLink allows arbitrary mutation of the atom graph, allowing both GOT and PLT atoms to be added naturally. (3) Native support for asynchronous linking. JITLink uses asynchronous calls for symbol resolution and finalization: these callbacks are passed a continuation function that they must call to complete the linker's work. This allows for cleaner interoperation with the new concurrent ORC JIT APIs, while still being easily implementable in synchronous style if asynchrony is not needed. To maximise sharing, the design has a hierarchy of common code: (1) Generic atom-graph data structure and algorithms (e.g. dead stripping and memory allocation) that are intended to be shared by all architectures. (2) Shared per-format code that utilizes (1), e.g. Generic MachO to atom-graph parsing. (3) Architecture specific code that uses (1) and (2). E.g. JITLinkerMachO_x86_64, which adds x86-64 specific relocation support to (2) to build and patch up the atom graph. To support asynchronous symbol resolution and finalization, the callbacks for these operations take continuations as arguments: using JITLinkAsyncLookupContinuation = std::function<void(Expected<AsyncLookupResult> LR)>; using JITLinkAsyncLookupFunction = std::function<void(const DenseSet<StringRef> &Symbols, JITLinkAsyncLookupContinuation LookupContinuation)>; using FinalizeContinuation = std::function<void(Error)>; virtual void finalizeAsync(FinalizeContinuation OnFinalize); In addition to its headline features, JITLink also makes other improvements: - Dead stripping support: symbols that are not used (e.g. redundant ODR definitions) are discarded, and take up no memory in the target process (In contrast, RuntimeDyld supported pointer equality for weak definitions, but the redundant definitions stayed resident in memory). - Improved exception handling support. JITLink provides a much more extensive eh-frame parser than RuntimeDyld, and is able to correctly fix up many eh-frame sections that RuntimeDyld currently (silently) fails on. - More extensive validation and error handling throughout. This initial patch supports linking MachO/x86-64 only. Work on support for other architectures and formats will happen in-tree. Differential Revision: https://reviews.llvm.org/D58704 llvm-svn: 358818	2019-04-20 17:10:34 +00:00
Craig Topper	3980d1ca6b	[X86] Disable argument copy elision for arguments passed via pointers Summary: If you pass two 1024 bit vectors in IR with AVX2 on Windows 64, both vectors will be split in four 256 bit pieces. The four pieces of the first argument will be passed indirectly using 4 gprs. The second argument will get passed via pointers in memory. The PartOffsets stored for the second argument are all in terms of its original 1024 bit size. So the PartOffsets for each piece are 32 bytes apart. So if we consider it for copy elision we'll only load an 8 byte pointer, but we'll move the address 32 bytes. The stack object size we create for the first part is probably wrong too. This issue was encountered by ISPC. I'm working on getting a reduced test case, but wanted to go ahead and get feedback on the fix. Reviewers: rnk Reviewed By: rnk Subscribers: dbabokin, llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D60801 llvm-svn: 358817	2019-04-20 15:26:44 +00:00
Luqman Aden	2993661cc0	[CorrelatedValuePropagation] Mark subs that we know not to wrap with nuw/nsw. Summary: Teach CorrelatedValuePropagation to also handle sub instructions in addition to add. Relatively simple since makeGuaranteedNoWrapRegion already understood sub instructions. Only subtle change is which range is passed as "Other" to that function, since sub isn't commutative. Note that CorrelatedValuePropagation::processAddSub is still hidden behind a default-off flag as IndVarSimplify hasn't yet been fixed to strip the added nsw/nuw flags and causes a miscompile. (PR31181) Reviewers: sanjoy, apilipenko, nikic Reviewed By: nikic Subscribers: hiraditya, jfb, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60036 llvm-svn: 358816	2019-04-20 13:14:18 +00:00
Fangrui Song	d3b2682351	[ExecutionDomainFix] Optimize a binary search insertion llvm-svn: 358815	2019-04-20 13:00:50 +00:00
Fangrui Song	dd0e833555	[llvm-symbolizer] Fix section index at the end of a section This is very minor issue. The returned section index is only used by DWARFDebugLine as an llvm::upper_bound input and the use case shouldn't cause any behavioral change. llvm-svn: 358814	2019-04-20 13:00:09 +00:00
Nikita Popov	b75c8fc6fb	[X86] Fix stack probing on x32 (PR41477) Fix for https://bugs.llvm.org/show_bug.cgi?id=41477. On the x32 ABI with stack probing a dynamic alloca will result in a WIN_ALLOCA_32 with a 32-bit size. The current implementation tries to copy it into RAX, resulting in a physreg copy error. Fix this by copying to EAX instead. Also fix incorrect opcodes or registers used in subs. llvm-svn: 358807	2019-04-20 07:25:46 +00:00
Craig Topper	4d4b5d952e	[X86] Don't turn (and (shl X, C1), C2) into (shl (and X, (C1 >> C2), C2) if the original AND can be represented by MOVZX. The MOVZX doesn't require an immediate to be encoded at all. Though it does use a 2 byte opcode so it's the same size as a 1 byte immediate. But it has a separate source and dest register so can help avoid copies. llvm-svn: 358805	2019-04-20 04:38:53 +00:00
Craig Topper	8b8264828c	[X86] Turn (and (anyextend (shl X, C1), C2)) into (shl (and (anyextend X), (C1 >> C2), C2) if the AND could match a movzx. There's one slight regression in here because we don't check that the immediate already allowed movzx before the shift. I'll fix that next. llvm-svn: 358804	2019-04-20 04:38:49 +00:00
Sam Clegg	fe8aabf9d9	[WebAssembly] Object: Improve error messages on invalid section Also add a test. Differential Revision: https://reviews.llvm.org/D60836 llvm-svn: 358801	2019-04-20 00:11:46 +00:00
Amara Emerson	eac69e9377	Revert "Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores"" We were shifting the wrong component of a split load when trying to combine them back into a single value. llvm-svn: 358800	2019-04-19 23:54:44 +00:00
Jessica Paquette	d5c69e0836	[GlobalISel][AArch64] Legalize + select G_FRINT Exactly the same as G_FCEIL, G_FABS, etc. Add tests for the fp16/nofp16 behaviour, update arm64-vfloatintrinsics, etc. Differential Revision: https://reviews.llvm.org/D60895 llvm-svn: 358799	2019-04-19 23:41:52 +00:00
Sam Clegg	a27252794e	[WebAssembly] FastISel: Don't fallback to SelectionDAG after BuildMI in selectCall My understanding is that once BuildMI has been called we can't fallback to SelectionDAG. This change moves the fallback for when getRegForValue() fails for that target of an indirect call. This was failing in -fPIC mode when the callee is GlobalValue. Add a test case that tickles this. Differential Revision: https://reviews.llvm.org/D60908 llvm-svn: 358793	2019-04-19 22:43:32 +00:00
Vedant Kumar	282b26ec4d	[GVN+LICM] Use line 0 locations for better crash attribution This is a follow-up to r291037+r291258, which used null debug locations to prevent jumpy line tables. Using line 0 locations achieves the same effect, but works better for crash attribution because it preserves the right inline scope. Differential Revision: https://reviews.llvm.org/D60913 llvm-svn: 358791	2019-04-19 22:36:40 +00:00
Eric Christopher	dfebd84eb3	Remove the EnableEarlyCSEMemSSA set of options from the legacy and new pass managers. They were default to true and not being used. Differential Revision: https://reviews.llvm.org/D60747 llvm-svn: 358789	2019-04-19 22:18:53 +00:00
Eli Friedman	1810339bc3	[AArch64] Fix checks for AArch64MCExpr::VK_SABS flag. VK_SABS is part of the SymLoc bitfield in the variant kind which should be compared for equality, not by checking the VK_SABS bit. As far as I know, the existing code happened to produce the correct results in all cases, so this is just a cleanup. Patch by Stephen Crane. Differential Revision: https://reviews.llvm.org/D60596 llvm-svn: 358788	2019-04-19 21:58:10 +00:00
Jessica Paquette	ad69af3e95	[GlobalISel] Add IRTranslator support for G_FRINT Add it as a simple intrinsic, update arm64-irtranslator.ll. Differential Revision: https://reviews.llvm.org/D60893 llvm-svn: 358787	2019-04-19 21:46:12 +00:00
Amy Huang	d07d6d6177	Attempt to fix buildbot failure in commit 1bb57bac959ac163fd7d8a76d734ca3e0ecee6ab. llvm-svn: 358786	2019-04-19 21:44:30 +00:00
Amy Huang	c774f687b6	[MS] Emit S_HEAPALLOCSITE debug info Summary: This emits labels around heapallocsite calls and S_HEAPALLOCSITE debug info in codeview. Currently only changes FastISel, so emitting labels still needs to be implemented in SelectionDAG. Reviewers: hans, rnk Subscribers: aprantl, hiraditya, cfe-commits, llvm-commits Tags: #clang, #llvm Differential Revision: https://reviews.llvm.org/D60800 llvm-svn: 358783	2019-04-19 21:09:11 +00:00
Alina Sbirlea	43709f7233	[LICM & MemorySSA] Make limit flags pass tuning options. Summary: Make the flags in LICM + MemorySSA tuning options in the old and new pass managers. Subscribers: mehdi_amini, jlebar, Prazek, george.burgess.iv, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60490 llvm-svn: 358772	2019-04-19 17:46:50 +00:00
Amara Emerson	36c5baef49	Revert "[GlobalISel] Add legalization support for non-power-2 loads and stores" This introduces some runtime failures which I'll need to investigate further. llvm-svn: 358771	2019-04-19 17:42:13 +00:00
Jessica Paquette	dfd87f6fa1	[GlobalISel][AArch64] Legalize vector G_FPOW This instruction is legalized in the same way as G_FSIN, G_FCOS, G_FLOG10, etc. Update legalize-pow.mir and arm64-vfloatintrinsics.ll to reflect the change. Differential Revision: https://reviews.llvm.org/D60218 llvm-svn: 358764	2019-04-19 16:28:08 +00:00
Alina Sbirlea	0499a2f961	[NewPassManager] Adding pass tuning options: loop vectorize. Summary: Trying to add the plumbing necessary to add tuning options to the new pass manager. Testing with the flags for loop vectorize. Reviewers: chandlerc Subscribers: sanjoy, mehdi_amini, jlebar, steven_wu, dexonsmith, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59723 llvm-svn: 358763	2019-04-19 16:11:59 +00:00
Sanjay Patel	e197c617a6	[SelectionDAG] soften splat mask assert/unreachable (PR41535) These are general queries, so they should not die when given a degenerate input like an all undef mask. Callers should be able to deal with an op that will eventually be simplified away. llvm-svn: 358761	2019-04-19 15:31:11 +00:00
Nico Weber	e145a540cc	llvm-undname: Attempt to fix leak-on-invalid found by oss-fuzz llvm-svn: 358760	2019-04-19 14:13:11 +00:00
Florian Hahn	b340497f76	[LTO] Add plumbing to save stats during LTO on Darwin. Gold and ld on Linux already support saving stats, but the infrastructure is missing on Darwin. Unfortunately it seems like the configuration from lib/LTO/LTO.cpp is not used. This patch adds a new LTOStatsFile option and adds plumbing in Clang to use it on Darwin, similar to the way remarks are handled. Currently the handling of LTO flags seems quite spread out, with a bunch of duplication. But I am not sure if there is an easy way to improve that? Reviewers: anemet, tejohnson, thegameg, steven_wu Reviewed By: steven_wu Differential Revision: https://reviews.llvm.org/D60516 llvm-svn: 358753	2019-04-19 12:36:41 +00:00
Bjorn Pettersson	238c9d6308	[CodeGen] Add "const" to MachineInstr::mayAlias Summary: The basic idea here is to make it possible to use MachineInstr::mayAlias also when the MachineInstr is const (or the "Other" MachineInstr is const). The addition of const in MachineInstr::mayAlias then rippled down to the need for adding const in several other places, such as TargetTransformInfo::getMemOperandWithOffset. Reviewers: hfinkel Reviewed By: hfinkel Subscribers: hfinkel, MatzeB, arsenm, jvesely, nhaehnle, hiraditya, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60856 llvm-svn: 358744	2019-04-19 09:08:38 +00:00
James Molloy	9ad4cb3de4	[PATCH] [MachineScheduler] Check pending instructions when an instruction is scheduled Pending instructions that may have been blocked from being available by the HazardRecognizer may no longer be blocked when an instruction is scheduled; pending instructions should be re-checked in this case. This is primarily aimed at VLIW targets with large parallelism and esoteric constraints. No testcase as no in-tree targets have this behavior. Differential revision: https://reviews.llvm.org/D60861 llvm-svn: 358743	2019-04-19 09:00:55 +00:00
Fangrui Song	7137b54a03	[MergeFunc] Delete unused FunctionNode::release() llvm-svn: 358742	2019-04-19 08:03:20 +00:00
Fangrui Song	884f557bb2	[MergeFunc] removeUsers: call remove() only on direct users removeUsers uses a work list to collect indirect users and call remove() on those functions. However it has a bug (`if (!Visited.insert(UU).second)`). Actually, we don't have to collect indirect users. After the merge of F and G, G's callers will be considered (added to Deferred). If G's callers can be merged, G's callers' callers will be considered. Update the test unnamed-addr-reprocessing.ll to make it clear we can still merge indirect callers. llvm-svn: 358741	2019-04-19 07:57:51 +00:00
Piotr Sobczak	72e2960e52	[AMDGPU] Ignore non-SUnits edges Summary: Ignore edges to non-SUnits (e.g. ExitSU) when checking for low latency instructions. When calling the function isLowLatencyInstruction(), an ExitSU could be on the list of successors, not necessarily a regular SU. In other places in the code there is a check "Succ->NodeNum >= DAGSize" to prevent further processing of ExitSU as "Succ->getInstr()" is NULL in such a case. Also, 8 out of 9 cases of "SUnit *Succ = SuccDep.getSUnit()" have the guard, so it is clearly an omission here. Change-Id: Ica86f0327c7b2e6bcb56958e804ea6c71084663b Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: MatzeB, arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60864 llvm-svn: 358740	2019-04-19 06:19:14 +00:00
Chandler Carruth	ce3f75df1f	[CallSite removal] Move the legacy PM, call graph, and some inliner code to `CallBase`. This patch focuses on the legacy PM, call graph, and some of inliner and legacy passes interacting with those APIs from `CallSite` to the new `CallBase` class. No interesting changes. Differential Revision: https://reviews.llvm.org/D60412 llvm-svn: 358739	2019-04-19 05:59:42 +00:00
Fangrui Song	82216048e6	[MergeFunc] Use less_first() as the comparator of Schwartzian transform llvm-svn: 358738	2019-04-19 05:49:29 +00:00
Craig Topper	bb769a2946	[X86] Turn (and (shl X, C1), C2) into (shl (and X, (C2 >> C1)), C1) if the AND could match a movzx. Could get further improvements by recognizing (i64 and (anyext (i32 shl))). llvm-svn: 358737	2019-04-19 05:48:13 +00:00
Craig Topper	f73caae956	[X86] Make sure we copy the HandleSDNode back to N before executing the default code after the switch in matchAddressRecursively Summary: There are two places where we create a HandleSDNode in address matching in order to handle the case where N is changed by CSE. But if we end up not matching, we fall back to code at the bottom of the switch that really would like N to point to something that wasn't CSEd away. So we should make sure we copy the handle back to N on any paths that can reach that code. This appears to be the true reason we needed to check DELETED_NODE in the negation matching. In pr32329.ll we had two subtracts back to back. We recursed through the first subtract, and onto the second subtract. The second subtract called matchAddressRecursively on its LHS which caused that subtract to CSE. We ultimately failed the match and ended up in the default code. But N was pointing at the old node that had been deleted, but the default code didn't know that and took it as the base register. Then we unwound back to the first subtract and tried to access this bogus base reg requiring the check for deleted node. With this patch we now use the CSE result as the base reg instead. matchAdd has been broken since sometime in 2015 when it was pulled out of the switch into a helper function. The assignment to N at the end was still there, but N was passed by value and not by reference so the update didn't go anywhere. Reviewers: niravd, spatel, RKSimon, bkramer Reviewed By: niravd Subscribers: llvm-commits, hiraditya Tags: #llvm Differential Revision: https://reviews.llvm.org/D60843 llvm-svn: 358735	2019-04-19 04:52:21 +00:00
Fangrui Song	9a331bba2a	[DWARF] Use hasFileAtIndex to properly verify DWARF 5 after rL358732 llvm-svn: 358734	2019-04-19 03:34:28 +00:00
Ali Tamur	783d84bb39	[llvm] Prevent duplicate files in debug line header in dwarf 5: another attempt Another attempt to land the changes in debug line header to prevent duplicate files in Dwarf 5. I rolled back my previous commit because of a mistake in generating the object file in a test. Meanwhile, I addressed some offline comments and changed the implementation; the largest difference is that MCDwarfLineTableHeader does not keep DwarfVersion but gets it as a parameter. I also merged the patch to fix two lld tests that will start to fail into this patch. Original Commit: https://reviews.llvm.org/D59515 Original Message: Motivation: In previous dwarf versions, file name indexes started from 1, and the primary source file was not explicit. Dwarf 5 standard (6.2.4) prescribes the primary source file to be explicitly given an entry with an index number 0. The current implementation honors the specification by just duplicating the main source file, once with index number 0, and later maybe with another index number. While this is compliant with the letter of the standard, the duplication causes problems for consumers of this information such as lldb. (Some files are duplicated, where only some of them have a line table although all refer to the same file) With this change, dwarf 5 debug line section files always start from 0, and the zeroth entry is not duplicated whenever possible. This requires different handling of dwarf 4 and dwarf 5 during generation (e.g. when a function returns an index zero for a file name, it signals an error in dwarf 4, but not in dwarf 5) However, I think the minor complication is worth it, because it enables all consumers (lldb, gdb, dwarfdump, objdump, and so on) to treat all files in the file name list homogeneously. llvm-svn: 358732	2019-04-19 02:26:56 +00:00
Fangrui Song	acc7641bcb	[APInt] Optimize umul_ov Change two costly udiv() calls to lshr(1)*RHS + left-shift + plus On one 64-bit umul_ov benchmark, I measured an obvious improvement: 12.8129s -> 3.6257s Note, there may be some value to special case 64-bit (the most common case) with __builtin_umulll_overflow(). Differential Revision: https://reviews.llvm.org/D60669 llvm-svn: 358730	2019-04-19 02:06:06 +00:00
Saleem Abdulrasool	b96d9b3419	MergeFunc: preserve COMDAT information when creating a thunk We would previously drop the COMDAT on the thunk we generated when replacing a function body with the forwarding thunk. This would result in a function that may have been multiply emitted and multiply merged to be emitted with the same name without the COMDAT. This is a hard error with PE/COFF where the COMDAT is used for the deduplication of Value Witness functions for Swift. llvm-svn: 358728	2019-04-19 01:48:36 +00:00
Alina Sbirlea	da0f71af7d	[LoopUnroll] Move list of params into a struct [NFCI]. Summary: Cleanup suggested in review of r358304. Reviewers: sanjoy, efriedma Subscribers: jlebar, zzheng, dmgreen, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60638 llvm-svn: 358723	2019-04-18 23:43:49 +00:00
Adrian Prantl	fac7875704	Implement sys::fs::copy_file using the macOS copyfile(3) API to support APFS clones. This patch adds a Darwin-specific implementation of llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to support APFS copy-on-write clones, which should be faster and much more space efficient. https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/ToolsandAPIs/ToolsandAPIs.html Differential Revision: https://reviews.llvm.org/D60802 This reapplies 358628 with an additional bugfix handling the case where the destination file already exists. (Caught by the clang testsuite). llvm-svn: 358716	2019-04-18 21:22:50 +00:00
Jessica Paquette	0aa9b453c4	[GlobalISel][AArch64] Legalize/select G_(S/Z/ANY)_EXT for v8s8s This adds legalization for G_SEXT, G_ZEXT, and G_ANYEXT for v8s8s. We were falling back on G_ZEXT in arm64-vabs.ll before, preventing us from selecting the @llvm.aarch64.neon.sabd.v8i8 intrinsic. This adds legalizer support for those 3, which gives us selection via the importer. Update the relevant tests (legalize-ext.mir, select-int-ext.mir) and add a GISel line to arm64-vabs.ll. Differential Revision: https://reviews.llvm.org/D60881 llvm-svn: 358715	2019-04-18 21:15:48 +00:00
Jessica Paquette	3b5119c684	[GlobalISel][AArch64] Legalize v8s8 loads Add legalizer support for loads of v8s8 and update legalize-load-store.mir. Differential Revision: https://reviews.llvm.org/D60877 llvm-svn: 358714	2019-04-18 21:13:58 +00:00
Nico Weber	a0ac65c98f	llvm-undname: Fix two more asserts-on-invalid, found by oss-fuzz llvm-svn: 358708	2019-04-18 19:52:32 +00:00
Nico Weber	502cf4bd19	llvm-undname: Fix two asserts-on-invalid llvm-svn: 358707	2019-04-18 19:30:21 +00:00
Philip Reames	137995d8da	[GuardWidening] Wire up a NPM version of the LoopGuardWidening pass llvm-svn: 358704	2019-04-18 19:17:14 +00:00
Michael Berg	d573aa0156	[NFC] FMF propagation for GlobalIsel llvm-svn: 358702	2019-04-18 18:48:57 +00:00
Quentin Colombet	ea3364bf85	[BlockExtractor] Extend the file format to support the grouping of basic blocks Prior to this patch, each basic block listed in the extract-blocks-file would be extracted to a different function. This patch adds support for a comma-separated list of basic blocks to form a group. When the region formed by a group is not extractable, e.g., not single entry, all the blocks of that group are left untouched. Let us see this new format in action (comments are not part of the file format): ;; funcName bbName[,bbName...] foo bb1 ;; Extract bb1 in its own function foo bb2,bb3 ;; Extract bb2,bb3 in their own function bar bb1,bb4 ;; Extract bb1,bb4 in their own function bar bb2 ;; Extract bb2 in its own function Assuming all regions are extractable, this will create one function and thus one call per region. Differential Revision: https://reviews.llvm.org/D60746 llvm-svn: 358701	2019-04-18 18:28:30 +00:00
Simon Pilgrim	4171a91e92	[X86] combineVectorTruncationWithPACKUS - remove split/concatenation of mask combineVectorTruncationWithPACKUS is currently splitting the upper bit masking into 128-bit subregs and then concatenating them back together. This was originally done to avoid regressions that caused existing subregs to be concatenated to the larger type just for the AND masking before being extracted again. This was fixed by @spatel (notably rL303997 and rL347356). This also lets SimplifyDemandedBits do some further improvements before it hits the recursive depth limit. My only annoyance with this is that we were broadcasting some xmm masks but we seem to have lost them by moving to ymm - but that's a known issue as the logic in lowerBuildVectorAsBroadcast isn't great. Differential Revision: https://reviews.llvm.org/D60375#inline-539623 llvm-svn: 358692	2019-04-18 17:23:09 +00:00
Philip Reames	adf288c5d9	[LoopPred] Fix a blatantly obvious bug in r358684 The bug is that I didn't check whether the operand of the invariant_loads were themselves invariant. I don't know how this got missed in the patch and review. I even had an unreduced test case locally, and I remember handling this case, but I must have lost it in one of the rebases. Oops. llvm-svn: 358688	2019-04-18 17:01:19 +00:00
Philip Reames	92a7177e6b	[LoopPredication] Allow predication of loop invariant computations (within the loop) The purpose of this patch is to eliminate a pass ordering dependence between LoopPredication and LICM. To understand the purpose, consider the following snippet of code inside some loop 'L' with IV 'i' A = _a.length; guard (i < A) a = _a[i] B = _b.length; guard (i < B); b = _b[i]; ... Z = _z.length; guard (i < Z) z = _z[i] accum += a + b + ... + z; Today, we need LICM to hoist the length loads, LoopPredication to make the guards loop invariant, and TrivialUnswitch to eliminate the loop invariant guard to establish must execute for the next length load. Today, if we can't prove speculation safety, we'd have to iterate these three passes 26 times to reduce this example down to the minimal form. Using the fact that the array lengths are known to be invariant, we can short circuit this iteration. By forming the loop invariant form of all the guards at once, we remove the need for LoopPredication from the iterative cycle. At the moment, we'd still have to iterate LICM and TrivialUnswitch; we'll leave that part for later. As a secondary benefit, this allows LoopPred to expose peeling opportunities in a much more obvious manner. See the udiv test changes as an example. If the udiv was not hoistable (i.e. we couldn't prove speculation safety) this would be an example where peeling becomes obviously profitable whereas it wasn't before. A couple of subtleties in the implementation: - SCEV's isSafeToExpand guarantees speculation safety (i.e. lets us expand at a new point). It is not a precondition for expansion if we know the SCEV corresponds to a Value which dominates the requested expansion point. - SCEV's isLoopInvariant returns true for expressions which compute the same value across all iterations executed, regardless of where the original Value is located. (i.e. it can be in the loop) This implies we have a speculation burden to prove before expanding them outside loops. - invariant_loads and AA->pointsToConstantMemory are two cases that SCEV currently does not handle, but meet the SCEV definition of invariance. I plan to sink this part into SCEV once this has baked for a bit. Differential Revision: https://reviews.llvm.org/D60093 llvm-svn: 358684	2019-04-18 16:33:17 +00:00
Nicolai Haehnle	523f90a2ba	[SDA] Bug fix: Use IPD outside the loop as divergence bound Summary: The immediate post dominator of the loop header may be part of the divergent loop. Since this /was/ the divergence propagation bound the SDA would not detect joins of divergent paths outside the loop. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: mmasten, arsenm, jvesely, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59042 llvm-svn: 358681	2019-04-18 16:17:35 +00:00
Philip Reames	b2c9fc02d5	Fix a bug in SCEV's isSafeToExpand around speculation safety isSafeToExpand was making a common, but dangerously wrong, mistake in assuming that if any instruction within a basic block executes, that all instructions within that block must execute. This can be trivially shown to be false by considering the following small example: bb: add x, y <-- InsertionPoint call @throws() udiv x, y <-- SCEV* S br ... It's clearly not legal to expand S above the throwing call, but the previous logic would do so since S dominates (but not properlyDominates) the block containing the InsertionPoint. Since iterating instructions w/in a block is expensive, this change special cases two cases: 1) S is an operand of InsertionPoint, and 2) InsertionPoint is the terminator of its block. These two together are enough to keep all current optimizations triggering while fixing the latent correctness issue. As best I can tell, this is a silent bug in current ToT. Given that, there are no tests with this change. It was noticed in an upcoming optimization change (D60093), and was reviewed as part of that. That change will include the test which caused me to notice the issue. I'm submitting this separately so that anyone bisecting a problem gets a clear explanation. llvm-svn: 358680	2019-04-18 16:10:21 +00:00
Benjamin Kramer	7085795284	MinidumpYAML: Fix ambiguity between std::make_unique and llvm::make_unique llvm-svn: 358673	2019-04-18 15:06:03 +00:00
Pavel Labath	7429d86f36	MinidumpYAML: Add support for ModuleList stream Summary: This patch adds support for yaml (de)serialization of the minidump ModuleList stream. It's a fairly straightforward application of the existing patterns to the ModuleList structures defined in previous patches. One thing which may be interesting to call out explicitly is the addition of "new" allocation functions to the helper BlobAllocator class. The reason for this was that there was an emerging pattern of a need to allocate space for entities which do not have a suitable lifetime for use with the existing allocation functions. A typical example of that was the "size" of various lists, which is only available as a temporary returned by the .size() method of some container. For these cases, one can use the new set of allocation functions, which will take a temporary object, and store it in an allocator-managed buffer until it is written to disk. Reviewers: amccarth, jhenderson, clayborg, zturner Subscribers: lldb-commits, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60405 llvm-svn: 358672	2019-04-18 14:57:31 +00:00
Simon Pilgrim	8f87e53462	[X86][SSE] Lower ICMP EQ(AND(X,C),C) -> SRA(SHL(X,LOG2(C)),BW-1) iff C is power-of-2. This replaces the MOVMSK combine introduced at D52121/rL342326 (movmsk (setne (and X, (1 << C)), 0)) -> (movmsk (X << C)) with the more general icmp lowering so it can pick up more cases through bitcasts - notably vXi8 cases which use vXi16 shifts+masks, this patch can remove the mask and use pcmpgtb(0,x) for the sra. Differential Revision: https://reviews.llvm.org/D60625 llvm-svn: 358651	2019-04-18 09:58:59 +00:00
Serguei Katkov	ca6c03a22f	[NewPM] Add Option handling for LoopVectorize This patch enables passing options to LoopVectorizePass via the passes pipeline. Reviewers: chandlerc, fedor.sergeev, leonardchan, philip.pfaffe Reviewed By: fedor.sergeev Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D60681 llvm-svn: 358647	2019-04-18 08:46:11 +00:00
Kang Zhang	009a21d2fd	[PowerPC] Fix wrong ElemSize when calling isConsecutiveLS() Summary: This issue is from Bugzilla: https://bugs.llvm.org/show_bug.cgi?id=41177 When the two operands for BUILD_VECTOR are the same, we get an assertion failure. llvm::SDValue combineBVOfConsecutiveLoads(llvm::SDNode*, llvm::SelectionDAG&): Assertion `!(InputsAreConsecutiveLoads && InputsAreReverseConsecutive) && "The loads cannot be both consecutive and reverse consecutive."' failed. This error is caused by the wrong ElemSize when calling isConsecutiveLS(). We should use `getScalarType().getStoreSize();` to get the ElemSize instead of `getScalarSizeInBits() / 8`. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D60811 llvm-svn: 358644	2019-04-18 07:24:15 +00:00
Eric Christopher	eff3b6fe7f	Elaborate why we have an option on by default for enabling chr. llvm-svn: 358641	2019-04-18 06:17:40 +00:00
Tim Renouf	7c55c8d8c3	[AMDGPU] Avoid DAG combining assert with fneg(fadd(A,0)) fneg combining attempts to turn it into fadd(fneg(A), fneg(0)), but creating the new fadd folds to just fneg(A). When A has multiple uses, this confuses it and you get an assert. Fixed. Differential Revision: https://reviews.llvm.org/D60633 Change-Id: I0ddc9b7286abe78edc0cd8d734fdeb05ff09821c llvm-svn: 358640	2019-04-18 05:27:01 +00:00
Ali Tamur	6263365b08	Fix a typo in comments. [NFC] llvm-svn: 358639	2019-04-18 02:39:37 +00:00
Aditya Nandakumar	9266337656	[GISel]:IRTranslator: Prefer a buildInstr form that allows CSE of cast instructions https://reviews.llvm.org/D60844 Use the style of buildInstr that allows CSEing. llvm-svn: 358637	2019-04-18 02:19:29 +00:00
Richard Trieu	7b6192025e	Fix bad compare function over FusionCandidate. Reverse the checking of the dominance order so that when a self compare happens, it returns false. This makes the compare function have strict weak ordering. llvm-svn: 358636	2019-04-18 01:39:45 +00:00
Adrian Prantl	00d97ea202	Revert Implement sys::fs::copy_file using the macOS copyfile(3) API to support APFS clones. This reverts r358628 (git commit `91a06bee78`) while investigating a crash reproducer bot failure. llvm-svn: 358634	2019-04-18 01:21:10 +00:00
Adrian Prantl	91a06bee78	Implement sys::fs::copy_file using the macOS copyfile(3) API to support APFS clones. This patch adds a Darwin-specific implementation of llvm::sys::fs::copy_file() that uses the macOS copyfile(3) API to support APFS copy-on-write clones, which should be faster and much more space efficient. https://developer.apple.com/library/archive/documentation/FileManagement/Conceptual/APFS_Guide/ToolsandAPIs/ToolsandAPIs.html Differential Revision: https://reviews.llvm.org/D60802 llvm-svn: 358628	2019-04-18 00:01:05 +00:00
Akira Hatanaka	0b19f5aef9	Fix formatting. NFC llvm-svn: 358623	2019-04-17 23:14:39 +00:00
Sanjay Patel	fb363a778f	[x86] try to widen 'shl' as part of LEA formation The test file has pairs of tests that are logically equivalent: https://rise4fun.com/Alive/2zQ %t4 = and i8 %t1, 8 %t5 = zext i8 %t4 to i16 %sh = shl i16 %t5, 2 %t6 = add i16 %sh, %t0 => %t4 = and i8 %t1, 8 %sh2 = shl i8 %t4, 2 %z5 = zext i8 %sh2 to i16 %t6 = add i16 %z5, %t0 ...so if we can fold the shift op into LEA in the 1st pattern, then we should be able to do the same in the 2nd pattern (unnecessary 'movzbl' is a separate bug I think). We don't want to do this any sooner though because that would conflict with generic transforms that try to narrow the width of the shift. Differential Revision: https://reviews.llvm.org/D60789 llvm-svn: 358622	2019-04-17 22:38:51 +00:00
Denis Bakhvalov	cfd25a4b0e	Test commit by Denis Bakhvalov Change-Id: I4d85123a157d957434902fb14ba50926b2d56212 llvm-svn: 358619	2019-04-17 22:27:30 +00:00
Nick Desaulniers	9609ce2f33	[AsmPrinter] hoist %a output template to base class for ARM+Aarch64 Summary: X86 is quite complicated; so I intend to leave it as is. ARM+Aarch64 do basically the same thing (Aarch64 did not correctly handle immediates, ARM has a test llvm/test/CodeGen/ARM/2009-04-06-AsmModifier.ll that uses %a with an immediate) for a flag that should be target independent anyways. Reviewers: echristo, peter.smith Reviewed By: echristo Subscribers: javed.absar, eraman, kristof.beyls, hiraditya, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60841 llvm-svn: 358618	2019-04-17 22:21:10 +00:00
Amara Emerson	d51adf0568	Add a getSizeInBits() accessor to MachineMemOperand. NFC. Cleans up a bunch of places where we do getSize() * 8. Differential Revision: https://reviews.llvm.org/D60799 llvm-svn: 358617	2019-04-17 22:21:05 +00:00
Amara Emerson	daf6e66ac5	[GlobalISel] Add legalization support for non-power-2 loads and stores Legalize things like i24 load/store by splitting them into smaller power of 2 operations. This matches how SelectionDAG handles these operations. Differential Revision: https://reviews.llvm.org/D59971 llvm-svn: 358613	2019-04-17 21:30:07 +00:00
Kit Barton	3cdf87940f	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Differential Revision: https://reviews.llvm.org/D55851 llvm-svn: 358607	2019-04-17 18:53:27 +00:00
Nick Desaulniers	a2077bab40	[AsmPrinter] defer %c to base class for ARM, PPC, and Hexagon. NFC Summary: None of these derived classes do anything that the base class cannot. If we remove these case statements, then the base class can handle them just fine. Reviewers: peter.smith, echristo Reviewed By: echristo Subscribers: nemanjai, javed.absar, eraman, kristof.beyls, hiraditya, kbarton, jsji, llvm-commits, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60803 llvm-svn: 358603	2019-04-17 18:22:48 +00:00
Steven Wu	05a358cdcd	[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols Summary: Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool. ThinLTOCodeGenerator currently does not preserve llvm.used symbols and it can internalize them. In order to pass the necessary information to the legacy ThinLTOCodeGenerator, the input to the code generator is rewritten to be based on lto::InputFile. Now ThinLTO using the legacy LTO API will require data layout in Module. "internalize" thinlto action in llvm-lto is updated to run both "promote" and "internalize" with the same configuration as ThinLTOCodeGenerator. The old "promote" + "internalize" option does not produce the same output as ThinLTOCodeGenerator. This fixes: PR41236 rdar://problem/49293439 Reviewers: tejohnson, pcc, kromanova, dexonsmith Reviewed By: tejohnson Subscribers: ormris, bd1976llvm, mehdi_amini, inglorion, eraman, hiraditya, jkorous, dexonsmith, arphaman, dang, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60421 llvm-svn: 358601	2019-04-17 17:38:09 +00:00
Philip Reames	88679717ce	[InstCombine] Factor out unreachable inst idiom creation [NFC] In InstCombine, we use an idiom of "store i1 true, i1 undef" to indicate we've found a path which we've proven unreachable. We can't actually insert the unreachable instruction since that would require changing the CFG. We leave that to simplifycfg later. This just factors out that idiom creation so we don't duplicate the same mostly undocumented idiom creation in multiple places. llvm-svn: 358600	2019-04-17 17:37:58 +00:00
Nikita Popov	2039581002	[LVI][CVP] Constrain values in with.overflow branches If a branch is conditional on extractvalue(op.with.overflow(%x, C), 1) then we can constrain the value of %x inside the branch based on makeGuaranteedNoWrapRegion(). We do this by extending the edge-value handling in LVI. This allows CVP to then fold comparisons against %x, as illustrated in the tests. Differential Revision: https://reviews.llvm.org/D60650 llvm-svn: 358597	2019-04-17 16:57:42 +00:00
Dmitry Preobrazhensky	394d0a1637	[AMDGPU][MC] Corrected handling of "-" before expressions See bug 41156: https://bugs.llvm.org/show_bug.cgi?id=41156 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60622 llvm-svn: 358596	2019-04-17 16:56:34 +00:00
Rhys Perry	c2814e12e7	AMDGPU: Force skip over SMRD, VMEM and s_waitcnt instructions Summary: This fixes a large Dawn of War 3 performance regression with RADV from Mesa 19.0 to master which was caused by creating less code in some branches. Reviewers: arsen, nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60824 llvm-svn: 358592	2019-04-17 16:31:52 +00:00
Florian Hahn	893aea58ea	[LoopUnroll] Allow unrolling if the unrolled size does not exceed loop size. Summary: In the following cases, unrolling can be beneficial, even when optimizing for code size: 1) very low trip counts 2) potential to constant fold most instructions after fully unrolling. We can unroll in those cases, by setting the unrolling threshold to the loop size. This might highlight some cost modeling issues and fixing them will have a positive impact in general. Reviewers: vsk, efriedma, dmgreen, paquette Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D60265 llvm-svn: 358586	2019-04-17 15:57:43 +00:00
Simon Pilgrim	e7fe6dd5ed	[DAGCombine] Add SimplifyDemandedBits helper that handles demanded elts mask as well The other SimplifyDemandedBits helpers become wrappers to this new demanded elts variant. llvm-svn: 358585	2019-04-17 15:45:44 +00:00
Lang Hames	c1106c9b11	[Support] Add LEB128 support to BinaryStreamReader/Writer. Summary: This patch adds support for ULEB128 and SLEB128 encoding and decoding to BinaryStreamWriter and BinaryStreamReader respectively. Support for ULEB128/SLEB128 will be used for eh-frame parsing in the JITLink library currently under development (see https://reviews.llvm.org/D58704). Reviewers: zturner, dblaikie Subscribers: kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60810 llvm-svn: 358584	2019-04-17 15:38:27 +00:00
Florian Hahn	258a425c69	[ScheduleDAGRRList] Recompute topological ordering on demand. Currently there is a single point in ScheduleDAGRRList, where we actually query the topological order (besides init code). Currently we are recomputing the order after adding a node (which does not have predecessors) and then we add predecessors edge-by-edge. We can avoid adding edges one-by-one after we added a new node. In that case, we can just rebuild the order from scratch after adding the edges to the DAG and avoid all the updates to the ordering. Also, we can delay updating the DAG until we query the DAG, if we keep a list of added edges. Depending on the number of updates, we can either apply them when needed or recompute the order from scratch. This brings the geomean compile time of CTMark with -O1 down 0.3% on X86, with no regressions. Reviewers: MatzeB, atrick, efriedma, niravd, paquette Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D60125 llvm-svn: 358583	2019-04-17 15:05:29 +00:00
Dmitry Preobrazhensky	20d52e3aa2	[AMDGPU][MC] Corrected parsing of registers See bug 41280: https://bugs.llvm.org/show_bug.cgi?id=41280 Reviewers: artem.tamazov, arsenm Differential Revision: https://reviews.llvm.org/D60621 llvm-svn: 358581	2019-04-17 14:44:01 +00:00
Tim Renouf	59e8bd3093	[AMDGPU] Flag new raw/struct atomic ops as source of divergence Differential Revision: https://reviews.llvm.org/D60731 Change-Id: I821d93dec8b9cdd247b8172d92fb5e15340a9e7d llvm-svn: 358579	2019-04-17 14:04:31 +00:00
Robert Widmann	d909a5ed8d	[LLVM-C] Add DIFile Field Accessors Summary: Add accessors for the file, directory, source file name (curiously, an `Optional` value?), of a DIFile. This is intended to replace the LLVMValueRef-based accessors used in D52239 Reviewers: whitequark, jberdine, deadalnix Reviewed By: whitequark, jberdine Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60489 llvm-svn: 358577	2019-04-17 13:29:14 +00:00
Simon Pilgrim	9daacec816	[CostModel][X86] Add bool anyof/allof reduction costs On pre-AVX512 targets we can use MOVMSK to extract reduced boolean results. This is properly optimized, annoyingly AVX512 isn't and produces code that is almost as bad as the (unchanged) costs suggest...... Differential Revision: https://reviews.llvm.org/D60403 llvm-svn: 358574	2019-04-17 10:58:19 +00:00
Fangrui Song	a364d599ab	[DWARF] llvm::Error -> Error. NFC The unqualified name is more common and is used in the file as well. llvm-svn: 358567	2019-04-17 09:11:08 +00:00
Fangrui Song	c82e92bca8	Change some llvm::{lower,upper}_bound to llvm::bsearch. NFC llvm-svn: 358564	2019-04-17 07:58:05 +00:00
Roman Lebedev	0080645846	[CVP] processOverflowIntrinsic(): don't crash if constant-folding happened As reported by Mikael Holmén in post-commit review in https://reviews.llvm.org/D60791#1469765 llvm-svn: 358559	2019-04-17 06:35:07 +00:00
Fangrui Song	df44ff1b78	[DWARF] Pass ReferenceToDIEOffsets elements by reference llvm-svn: 358558	2019-04-17 06:33:52 +00:00
Craig Topper	6bf0802738	[X86] In CopyToFromAsymmetricReg, use VR128 instead of FR32 instructions for GR32<->XMM register copies. We have two versions of some instructions, VR128 versions and FR32 versions that are marked as CodeGenOnly. This change switches to using the VR128 versions for these copies. It's after register allocation so the class size no longer matters. This matches how GR64 works. llvm-svn: 358555	2019-04-17 06:09:11 +00:00
Eric Christopher	e29874eaa0	Revert "Add basic loop fusion pass." Per request. This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358553	2019-04-17 04:55:24 +00:00
Eric Christopher	cee313d288	Revert "Temporarily Revert "Add basic loop fusion pass."" The reversion apparently deleted the test/Transforms directory. Will be re-reverting again. llvm-svn: 358552	2019-04-17 04:52:47 +00:00
Eric Christopher	0ebbf72a63	Remove the run-slp-after-loop-vectorization option. It's been on by default for 4 years and cleans up the pass hierarchy. llvm-svn: 358548	2019-04-17 02:26:27 +00:00
Eric Christopher	a863435128	Temporarily Revert "Add basic loop fusion pass." As it's causing some bot failures (and per request from kbarton). This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda. llvm-svn: 358546	2019-04-17 02:12:23 +00:00
Kit Barton	ab70da0728	Add basic loop fusion pass. This patch adds a basic loop fusion pass. It will fuse loops that conform to the following 4 conditions: 1. Adjacent (no code between them) 2. Control flow equivalent (if one loop executes, the other loop executes) 3. Identical bounds (both loops iterate the same number of iterations) 4. No negative distance dependencies between the loop bodies. The pass does not make any changes to the IR to create opportunities for fusion. Instead, it checks if the necessary conditions are met and if so it fuses two loops together. The pass has not been added to the pass pipeline yet, and thus is not enabled by default. It can be run stand alone using the -loop-fusion option. Phabricator: https://reviews.llvm.org/D55851 llvm-svn: 358543	2019-04-17 01:37:00 +00:00
Robert Widmann	d6eb4bb801	[LLVM-C] Add Accessors For Global Variable Metadata Properties Summary: Metadata for a global variable is really a (GlobalVariable, Expression) tuple. Allow access to these, then allow retrieving the file, scope, and line for a DIVariable, whether global or local. This should be the last of the accessors required for uniform access to location and file information metadata. Reviewers: jberdine, whitequark, deadalnix Reviewed By: jberdine, whitequark Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60725 llvm-svn: 358532	2019-04-16 21:39:48 +00:00
Ali Tamur	e8de5cd602	Fix a typo in comments. [NFC] llvm-svn: 358531	2019-04-16 21:37:43 +00:00
Nick Desaulniers	3271ca01fe	[NVPTXAsmPrinter] clean up dead code. NFC Summary: The printOperand function takes a default parameter, for which there are zero call sites that explicitly pass such a parameter. As such, there is no case to support. This means that the method printVecModifiedImmediate is purely dead code, and can be removed. The eventual goal for some of these AsmPrinter refactorings is to have printOperand be a virtual method, making it easier to print operands from the base class for more generic Asm printing. It will help if all printOperand methods have the same function signature (i.e. no Modifier argument when not needed). Reviewers: echristo, tra Reviewed By: echristo Subscribers: jholewinski, hiraditya, llvm-commits, craig.topper, srhines Tags: #llvm Differential Revision: https://reviews.llvm.org/D60727 llvm-svn: 358527	2019-04-16 21:04:34 +00:00
Simon Pilgrim	e5573f4f4e	[TargetLowering] Rename preferShiftsToClearExtremeBits and shouldFoldShiftPairToMask (PR41359) As discussed on PR41359, this patch renames the pair of shift-mask target feature functions to make their purposes more obvious. shouldFoldShiftPairToMask -> shouldFoldConstantShiftPairToMask preferShiftsToClearExtremeBits -> shouldFoldMaskToVariableShiftPair llvm-svn: 358526	2019-04-16 20:57:28 +00:00
Sanjay Patel	e08783e2f5	[EarlyCSE] detect equivalence of selects with inverse conditions and commuted operands (PR41101) This is 1 of the problems discussed in the post-commit thread for: rL355741 / http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190311/635516.html and filed as: https://bugs.llvm.org/show_bug.cgi?id=41101 Instcombine tries to canonicalize some of these cases (and there's room for improvement there independently of this patch), but it can't always do that because of extra uses. So we need to recognize these commuted operand patterns here in EarlyCSE. This is similar to how we detect commuted compares and commuted min/max/abs. Differential Revision: https://reviews.llvm.org/D60723 llvm-svn: 358523	2019-04-16 20:41:20 +00:00
Anton Afanasyev	3a00b020aa	Time profiler: optimize json output time Summary: Use llvm::json::Array.reserve() to optimize json output time. Here is motivation: https://reviews.llvm.org/D60609#1468941. In short: for the json array with ~32K entries, pushing back each entry takes ~4% of whole time compared to the method of preliminary memory reservation: (3995-3845)/3995 = 3.75%. Reviewers: lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60792 llvm-svn: 358522	2019-04-16 20:36:56 +00:00
Nikita Popov	52b24ee932	[CVP] Simplify umulo and smulo that cannot overflow If a umul.with.overflow or smul.with.overflow operation cannot overflow, simplify it to a simple mul nuw / mul nsw. After the refactoring in D60668 this is just a matter of removing an explicit check against multiplications. Differential Revision: https://reviews.llvm.org/D60791 llvm-svn: 358521	2019-04-16 20:31:41 +00:00
Simon Pilgrim	82ffa88a04	[SLP] Refactoring of the operand reordering code. This is a refactoring patch which should have all the functionality of the current code. Its goal is twofold: i. Cleanup and simplify the reordering code, and ii. Generalize reordering so that it will work for an arbitrary number of operands, not just 2. This is the second patch in a series of patches that will enable operand reordering across chains of operations. An example of this was presented in EuroLLVM'18 https://www.youtube.com/watch?v=gIEn34LvyNo . Committed on behalf of @vporpo (Vasileios Porpodas) Differential Revision: https://reviews.llvm.org/D59973 llvm-svn: 358519	2019-04-16 19:27:00 +00:00
Simon Pilgrim	d769bb1e58	[X86][AVX] X86ISD::PERMV/PERMV3 node types can never fold index ops Improves codegen demonstrated by D60512 - instructions represented by X86ISD::PERMV/PERMV3 can never memory fold the operand used for their index register. This patch updates the 'isUseOfShuffle' helper into the more capable 'isFoldableUseOfShuffle' that recognises that the op is used for a X86ISD::PERMV/PERMV3 index mask and can't be folded - allowing us to use broadcast/subvector-broadcast ops to reduce the size of the mask constant pool data. Differential Revision: https://reviews.llvm.org/D60562 llvm-svn: 358516	2019-04-16 19:18:53 +00:00
Nikita Popov	5ecd6a48b9	[InstCombine] Prune fshl/fshr with masked operands If a constant shift amount is used, then only some of the LHS/RHS operand bits are demanded and we may be able to simplify based on that. InstCombineSimplifyDemanded already had the necessary support for that, we just weren't calling it with fshl/fshr as root. In particular, this allows us to relax some masked funnel shifts into simple shifts, as shown in the tests. Patch by Shawn Landden. Differential Revision: https://reviews.llvm.org/D60660 llvm-svn: 358515	2019-04-16 19:05:49 +00:00
Nikita Popov	79dffc67b5	[IR] Add WithOverflowInst class This adds a WithOverflowInst class with a few helper methods to get the underlying binop, signedness and nowrap type and makes use of it where sensible. There will be two more uses in D60650/D60656. The refactorings are all NFC, though I left some TODOs where things could be improved. In particular we have two places where add/sub are handled but mul isn't. Differential Revision: https://reviews.llvm.org/D60668 llvm-svn: 358512	2019-04-16 18:55:16 +00:00
Krzysztof Parzyszek	ef6823ec8d	[Hexagon] Remove indeterministic traversal order Patch by Sergei Larin. llvm-svn: 358505	2019-04-16 16:05:07 +00:00
Luis Marques	eda370d4c8	[DAGCombiner] Add missing flag to addressing mode check The checks in `canFoldInAddressingMode` tested for addressing modes that have a base register but didn't set the `HasBaseReg` flag to true (it's false by default). This patch fixes that. Although the omission of the flag was technically incorrect it had no known observable impact, so no tests were changed by this patch. Differential Revision: https://reviews.llvm.org/D60314 llvm-svn: 358502	2019-04-16 15:09:18 +00:00
Luis Marques	20d2424016	[RISCV] Custom lower SHL_PARTS, SRA_PARTS, SRL_PARTS When not optimizing for minimum size (-Oz) we custom lower wide shifts (SHL_PARTS, SRA_PARTS, SRL_PARTS) instead of expanding to a libcall. Differential Revision: https://reviews.llvm.org/D59477 llvm-svn: 358498	2019-04-16 14:38:32 +00:00
Kadir Cetinkaya	8fdc5abffe	[llvm][Support] Provide interface to set thread priorities Summary: We have a multi-platform thread priority setting function (the last piece landed with D58683). I wanted to make this available to the whole llvm community; there seem to be other users of such functionality with portability fixmes: lib/Support/CrashRecoveryContext.cpp tools/clang/tools/libclang/CIndex.cpp Reviewers: gribozavr, ioeric Subscribers: krytarowski, jfb, kristina, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59130 llvm-svn: 358494	2019-04-16 14:32:43 +00:00
Nico Weber	930994ce14	llvm-undname: Consistently use "return nullptr" in functions returning pointers llvm-svn: 358492	2019-04-16 14:24:42 +00:00
Nico Weber	c035c243da	llvm-undname: Fix nullptr deref on invalid structor names in template args Similar to r358421: A StructorIndentifierNode has a Class field which is read when printing it, but if the StructorIndentifierNode appears in a template argument then demangleFullyQualifiedSymbolName() which sets Class isn't called. Since StructorIndentifierNodes are always leaf names, we can just reject them as well. Found by oss-fuzz. llvm-svn: 358491	2019-04-16 14:10:34 +00:00
Hans Wennborg	21eb771dcb	Re-commit r357452: SimplifyCFG SinkCommonCodeFromPredecessors: Also sink function calls without used results (PR41259) The original commit caused false positives from AddressSanitizer's use-after-scope checks, which have now been fixed in r358478. > The code was previously checking that candidates for sinking had exactly > one use or were a store instruction (which can't have uses). This meant > we could sink call instructions only if they had a use. > > That limitation seemed a bit arbitrary, so this patch changes it to > "instruction has zero or one use" which seems more natural and removes > the need to special-case stores. > > Differential revision: https://reviews.llvm.org/D59936 llvm-svn: 358483	2019-04-16 12:13:25 +00:00
Hans Wennborg	6ae05777b8	Asan use-after-scope: don't poison allocas if there were untraced lifetime intrinsics in the function (PR41481) If there are any intrinsics that cannot be traced back to an alloca, we might have missed the start of a variable's scope, leading to false error reports if the variable is poisoned at function entry. Instead, if there are some intrinsics that can't be traced, fail safe and don't poison the variables in that function. Differential revision: https://reviews.llvm.org/D60686 llvm-svn: 358478	2019-04-16 07:54:20 +00:00
Anton Afanasyev	6547d51458	Use native llvm JSON library for time profiler output Summary: Replace plain JSON text output with output produced through the llvm JSON library wrapper. Reviewers: takuto.ikuta, lebedev.ri Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60609 llvm-svn: 358476	2019-04-16 06:35:07 +00:00

124284 Commits