llvm-project

Author	SHA1	Message	Date
Clement Courbet	36a3480385	Re-land r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads." Update PPC IR following GEP->bitcast to bitcast->GEP->bitcast change. llvm-svn: 349747	2018-12-20 13:01:04 +00:00
Ulrich Weigand	f43b510015	[SystemZ] Make better use of VLDEB We already have special code (DAG combine support for FP_ROUND) to recognize cases where we can use a vector version of VLEDB to perform two floating-point truncates in parallel, but equivalent support for VLDEB (vector floating-point extends) has been missing so far. This patch adds corresponding DAG combine support for FP_EXTEND. llvm-svn: 349746	2018-12-20 12:59:05 +00:00
George Rimar	6367d7a6d1	[yaml2obj/obj2yaml] - Support dumping/parsing ABI version. These tools were assuming ABI version is 0, that is not always true. Patch teaches them to work with that field. Differential revision: https://reviews.llvm.org/D55884 llvm-svn: 349737	2018-12-20 10:43:49 +00:00
Piotr Sobczak	deaacc17fe	[InstCombine][AMDGPU] Handle more buffer intrinsics Summary: Include the following intrinsics in the InstCombine simplification: * amdgcn_raw_buffer_load * amdgcn_raw_buffer_load_format * amdgcn_struct_buffer_load * amdgcn_struct_buffer_load_format Change-Id: I14deceff74bcb21179baf6aa6e94bf39e7d63d5d Reviewers: arsenm Reviewed By: arsenm Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55882 llvm-svn: 349735	2018-12-20 10:08:18 +00:00
Alexander Potapenko	0e3b85a730	[MSan] Don't emit __msan_instrument_asm_load() calls LLVM treats void* pointers passed to assembly routines as pointers to sized types. We used to emit calls to __msan_instrument_asm_load() for every such void*, which sometimes led to false positives. A less error-prone (and truly "conservative") approach is to unpoison only assembly output arguments. llvm-svn: 349734	2018-12-20 10:05:00 +00:00
Clement Courbet	e22cf4d7cb	Revert r349731 "[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads." Forgot to update PowerPC tests for the GEP->bitcast change. llvm-svn: 349733	2018-12-20 09:58:33 +00:00
Clement Courbet	d4bd3eb85d	[NFC] Fix trailing semicolon after function. lib/Analysis/VectorUtils.cpp:482:2: warning: extra ‘;’ [-Wpedantic] llvm-svn: 349732	2018-12-20 09:20:07 +00:00
Clement Courbet	1bb6e1b0f2	[CodeGen][ExpandMemcmp] Add an option for allowing overlapping loads. Summary: This allows expanding {7,11,13,14,15,21,22,23,25,26,27,28,29,30,31}-byte memcmp in just two loads on X86. These were previously calling memcmp. Reviewers: spatel, gchatelet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55263 llvm-svn: 349731	2018-12-20 09:13:47 +00:00
Eugene Leviant	2d98eb1b2e	[HWASAN] Add support for memory intrinsics Differential revision: https://reviews.llvm.org/D55117 llvm-svn: 349728	2018-12-20 09:04:33 +00:00
Kang Zhang	ca8db48974	[PowerPC] Implement the isSelectSupported() target hook Summary: PowerPC has scalar selects (isel) and vector mask selects (xxsel). But PowerPC does not have vector CR selects, PowerPC does not support scalar condition selects on vectors. In addition to implementing this hook, isSelectSupported() should return false when the SelectSupportKind is ScalarCondVectorVal, so that predictable selects are converted into branch sequences. Reviewed By: steven.zhang, hfinkel Differential Revision: https://reviews.llvm.org/D55754 llvm-svn: 349727	2018-12-20 06:19:59 +00:00
Craig Topper	bd788ce5db	[DAGCombiner] Fix a place that was creating a SIGN_EXTEND with an extra operand. llvm-svn: 349726	2018-12-20 05:28:06 +00:00
Michael Kruse	978ba61536	Introduce llvm.loop.parallel_accesses and llvm.access.group metadata. The current llvm.mem.parallel_loop_access metadata has a problem in that it uses LoopIDs. LoopID unfortunately is not a loop identifier. It is neither unique (there's even a regression test assigning the same LoopID to multiple loops; this can also happen if passes such as LoopVersioning make copies of entire loops) nor persistent (every time a property is removed/added from a LoopID's MDNode, it will also receive a new LoopID; this happens e.g. when calling Loop::setLoopAlreadyUnrolled()). Since most loop transformation passes change the loop attributes (even if it is just to mark that a loop should not be processed again, as llvm.loop.isvectorized does for the versioned and unversioned loop), the parallel access information is lost for any subsequent pass. This patch unlinks LoopIDs and parallel accesses. llvm.mem.parallel_loop_access metadata on instructions is replaced by llvm.access.group metadata. llvm.access.group points to a distinct MDNode with no operands (avoiding the problem of ever needing to add/remove operands), called an "access group". Alternatively, it can point to a list of access groups. The LoopID then has an attribute llvm.loop.parallel_accesses with all the access groups that are parallel (no dependencies carried by this loop). This intentionally avoids any kind of "ID". Loops that are cloned or have their attributes modified retain the llvm.loop.parallel_accesses attribute. Access instructions that are cloned point to the same access group. It is not necessary for each access to have its own "ID" MDNode, but those memory access instructions with the same behavior can be grouped together. The behavior of llvm.mem.parallel_loop_access is not changed by this patch, but should be considered deprecated. Differential Revision: https://reviews.llvm.org/D52116 llvm-svn: 349725	2018-12-20 04:58:07 +00:00
Thomas Lively	feb18fe927	[WebAssembly] Emit a splat for v128 IMPLICIT_DEF Summary: This is a code size savings and is also important to get runnable code while engines do not support v128.const. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55910 llvm-svn: 349724	2018-12-20 04:20:32 +00:00
Amara Emerson	321bfb210a	Fix build errors introduced by r349712 on aarch64 bots. llvm-svn: 349723	2018-12-20 03:27:42 +00:00
Thomas Lively	8dbf29af95	[WebAssembly] Gate unimplemented SIMD ops on flag Summary: Gates v128.const, f32x4.sqrt, f32x4.div, i8x16.extract_lane_u, and i16x8.extract_lane_u on the --wasm-enable-unimplemented-simd flag, since these ops are not implemented yet in V8. Reviewers: aheejin, dschuff Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55904 llvm-svn: 349720	2018-12-20 02:10:22 +00:00
Matt Arsenault	4339883710	AMDGPU: Make i1/i64/v2i32 and/or/xor legal The 64-bit types do depend on the register bank, but that's another issue to deal with later. llvm-svn: 349716	2018-12-20 01:35:49 +00:00
Matt Arsenault	8cc98bee8a	AMDGPU/GlobalISel: Fix ValueMapping tables for i1 This was incorrectly selecting SGPR for any i1 values, e.g. G_TRUNC to i1 from a VGPR was still an SGPR. llvm-svn: 349715	2018-12-20 01:33:43 +00:00
Craig Topper	9ca2f5605e	[X86] Disable custom widening of signed/unsigned add/sub saturation intrinsics under -x86-experimental-vector-widening-legalization. Generic legalization should take care of this. llvm-svn: 349714	2018-12-20 01:32:06 +00:00
Amara Emerson	8cb186ce17	[AArch64][GlobalISel] Implement selection of G_MERGE of two s32s into s64. This code pattern is an unfortunate side effect of the way some types get split at call lowering. Ideally we'd either not generate it at all or combine it away in the legalizer artifact combiner. Until then, add selection support anyway which is a significant proportion of our current fallbacks on CTMark. rdar://46491420 llvm-svn: 349712	2018-12-20 01:11:04 +00:00
Matt Arsenault	dff33c38e1	AMDGPU/GlobalISel: RegBankSelect for fp conversions llvm-svn: 349709	2018-12-20 00:37:02 +00:00
Matt Arsenault	36d4092173	AMDGPU/GlobalISel: Legality/regbankselect for atomicrmw/atomic_cmpxchg llvm-svn: 349708	2018-12-20 00:33:49 +00:00
Vitaly Buka	07a55f27dc	[asan] Undo special treatment of linkonce_odr and weak_odr Summary: On non-Windows these are already removed by ShouldInstrumentGlobal. On Windows we will wait until we get actual issues with that. Reviewers: pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55899 llvm-svn: 349707	2018-12-20 00:30:27 +00:00
Vitaly Buka	d414e1bbb5	[asan] Prevent folding of globals with redzones Summary: ICF prevented by removing unnamed_addr and local_unnamed_addr for all sanitized globals. Also in general unnamed_addr is not valid here as address now is important for ODR violation detector and redzone poisoning. Before the patch ICF on globals caused: 1. false ODR reports when we register global on the same address more than once 2. globals buffer overflow if we fold variables of smaller type inside of large type. Then the smaller one will poison redzone which overlaps with the larger one. Reviewers: eugenis, pcc Subscribers: hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55857 llvm-svn: 349706	2018-12-20 00:30:18 +00:00
Matt Davis	87b2268c0c	[DwarfExpression] Fix a typo in a doxygen comment. NFC. llvm-svn: 349703	2018-12-20 00:01:57 +00:00
Craig Topper	217b3b20d8	[X86] Remove TLI variable from ReplaceNodeResults. NFC We're already in X86TargetLowering which is a derived class of TargetLowering. We can just call methods directly. llvm-svn: 349695	2018-12-19 23:13:03 +00:00
Rhys Perry	3931ad38b9	AMDGPU: Add patterns for v4i16/v4f16 -> v4i16/v4f16 bitcasts Reviewers: arsenm, tstellar Reviewed By: arsenm Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55058 llvm-svn: 349694	2018-12-19 22:53:33 +00:00
Eli Friedman	a69084ffa8	[CodeGenPrepare] Fix bad IR created by large offset GEP splitting. Creating the IR builder, then modifying the CFG, leads to an IRBuilder where the BB and insertion point are inconsistent, so new instructions have the wrong parent. Modified an existing test because the test wasn't covering anything useful (the "invoke" was not actually an invoke by the time we hit the code in question). Differential Revision: https://reviews.llvm.org/D55729 llvm-svn: 349693	2018-12-19 22:52:04 +00:00
Rhys Perry	972273d1d3	Fix test commit Seems that was actually an eight-space tab... llvm-svn: 349690	2018-12-19 22:33:42 +00:00
Rhys Perry	111bf831de	Test commit Replace tab with 4 spaces. llvm-svn: 349689	2018-12-19 22:26:51 +00:00
Evandro Menezes	374ccf6768	[AArch64] Improve Exynos predicates Expand the predicate `ExynosResetPred` to include all forms of immediate moves. llvm-svn: 349686	2018-12-19 22:24:36 +00:00
Evandro Menezes	ff827d737a	[AArch64] Use canonical copy idiom Use only the canonical form of the alias for register transfers in the `IsCopyIdiomPred` predicate. llvm-svn: 349685	2018-12-19 22:24:31 +00:00
Nikita Popov	3817ee7908	Revert "[BDCE][DemandedBits] Detect dead uses of undead instructions" This reverts commit r349674. It causes a failure in test-suite enc-3des.execution_time. llvm-svn: 349684	2018-12-19 22:09:02 +00:00
Reid Kleckner	ed3ef41711	[llvm-ar] Simplify string table get-or-insert pattern with .insert, NFC llvm-svn: 349681	2018-12-19 20:54:06 +00:00
Nikita Popov	649e125451	[BDCE][DemandedBits] Detect dead uses of undead instructions This (mostly) fixes https://bugs.llvm.org/show_bug.cgi?id=39771. BDCE currently detects instructions that don't have any demanded bits and replaces their uses with zero. However, if an instruction has multiple uses, then some of the uses may be dead (have no demanded bits) even though the instruction itself is still live. This patch extends DemandedBits/BDCE to detect such uses and replace them with zero. While this will not immediately render any instructions dead, it may lead to simplifications (in the motivating case, by converting a rotate into a simple shift), break dependencies, etc. The implementation tries to strike a balance between analysis power and complexity/memory usage. Originally I wanted to track demanded bits on a per-use level, but ultimately we're only really interested in whether a use is entirely dead or not. I'm using an extra set to track which uses are dead. However, as initially all uses are dead, I'm not storing uses whose user is also dead. This case is checked separately instead. The test case has a couple of cases that are not simplified yet. In particular, we're only looking at uses of instructions right now. I think it would make sense to also extend this to arguments. Furthermore DemandedBits doesn't yet know some of the tricks that InstCombine does for the demanded bits or bitwise or/and/xor in combination with known bits information. Differential Revision: https://reviews.llvm.org/D55563 llvm-svn: 349674	2018-12-19 19:56:21 +00:00
Craig Topper	d16da2b479	[X86] Remove a bunch of 'else' after returns in reduceVMULWidth. NFC This reduces indentation and makes it obvious this function always returns something. llvm-svn: 349671	2018-12-19 19:39:34 +00:00
David Blaikie	ac69af7ad6	llvm-dwarfdump: Improve/fix pretty printing of array dimensions This is to address post-commit feedback from Paul Robinson on r348954. The original commit misinterprets count and upper bound as the same thing (I thought I saw GCC producing an upper bound the same as Clang's count, but GCC correctly produces an upper bound that's one less than the count (in C, that is, where arrays are zero indexed)). I want to preserve the C-like output for the common case, so in the absence of a lower bound the count (or one greater than the upper bound) is rendered between []. In the trickier cases, where a lower bound is specified, a half-open range is used (e.g., lower bound 1, count 2 would be "[1, 3)") and any unknown parts use a '?' (e.g., "[1, ?)" or "[?, 7)" or "[?, ? + 3)"). Reviewers: aprantl, probinson, JDevlieghere Differential Revision: https://reviews.llvm.org/D55721 llvm-svn: 349670	2018-12-19 19:34:24 +00:00
Matthew Voss	62fcfc5adb	[ThinLTO] Remove dllimport attribute from locally defined symbols Summary: The LTO/ThinLTO driver currently creates invalid bitcode by setting symbols marked dllimport as dso_local. The compiler often has access to the definition (often dllexport) and the declaration (often dllimport) of an object at link-time, leading to a conflicting declaration. This patch resolves the inconsistency by removing the dllimport attribute. Reviewers: tejohnson, pcc, rnk, echristo Reviewed By: rnk Subscribers: dmikulin, wristow, mehdi_amini, inglorion, eraman, steven_wu, dexonsmith, dang, llvm-commits Differential Revision: https://reviews.llvm.org/D55627 llvm-svn: 349667	2018-12-19 19:07:45 +00:00
Jessica Paquette	3560e93dc1	[GlobalISel][AArch64] Add support for @llvm.ceil This adds a G_FCEIL generic instruction and uses it in AArch64. This adds selection for floating point ceil where it has a supported, dedicated instruction. Other cases aren't handled here. It updates the relevant gisel tests and adds a select-ceil test. It also adds a check to arm64-vcvt.ll which ensures that we don't fall back when we run into one of the relevant cases. llvm-svn: 349664	2018-12-19 19:01:36 +00:00
Craig Topper	84a00bd98a	[X86] Don't match TESTrr from (cmp (and X, Y), 0) during isel. Defer to post processing The (cmp (and X, Y) 0) pattern is greedy and ends up forming a TESTrr and consuming the and when it might be better to use one of the BMI/TBM instructions like BLSR or BLSI. This patch removes the pattern from isel and adds a post processing check to combine TESTrr+ANDrr into just a TESTrr. With this patch we are able to select the BMI/TBM instructions, but we'll also emit a TESTrr when the result is compared to 0. In many cases the peephole pass will be able to use optimizeCompareInstr to remove the TEST, but it's probably not perfect. Differential Revision: https://reviews.llvm.org/D55870 llvm-svn: 349661	2018-12-19 18:49:13 +00:00
Craig Topper	291470347a	[X86] Fix assert fails in pass X86AvoidSFBPass Fixes https://bugs.llvm.org/show_bug.cgi?id=38743 The function removeRedundantBlockingStores is supposed to remove any blocking stores contained in each other in lockingStoresDispSizeMap. But it currently looks only at the previous one, which will miss some cases that result in an assert. This patch refines the function to check all previous layouts until it finds the uncontained one, so all redundant stores will be removed. Patch by Pengfei Wang Differential Revision: https://reviews.llvm.org/D55642 llvm-svn: 349660	2018-12-19 18:45:57 +00:00
Evandro Menezes	5d409b2278	[AArch64] Improve the Exynos M3 pipeline model llvm-svn: 349652	2018-12-19 17:37:51 +00:00
Anton Afanasyev	ce28791e20	Test commit Fix typos. llvm-svn: 349644	2018-12-19 17:18:40 +00:00
Sanjay Patel	798c5982a0	[ValueTracking] remove unused parameters from helper functions; NFC llvm-svn: 349641	2018-12-19 16:49:18 +00:00
Yonghong Song	7b410ac352	[BPF] Generate BTF DebugInfo under BPF target This patch implements BTF (BPF Type Format). The BTF is the debug info format for BPF, introduced in the below linux patch: `69b693f0ae (diff-06fb1c8825f653d7e539058b72c83332)` and further extended several times, e.g., https://www.spinics.net/lists/netdev/msg534640.html https://www.spinics.net/lists/netdev/msg538464.html https://www.spinics.net/lists/netdev/msg540246.html The main advantage of implementing in LLVM is: . better integration/deployment as no extra tools are needed. . bpf JIT based compilation (like bcc, bpftrace, etc.) can get BTF without much extra effort. . BTF line_info needs selective source codes, which can be easily retrieved when inside the compiler. This patch implemented BTF generation by registering a BPF specific DebugHandler in BPFAsmPrinter. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55752 llvm-svn: 349640	2018-12-19 16:40:25 +00:00
Peter Wu	f0ad811b54	[Object] Deduplicate long archive member names Summary: Import libraries as created by llvm-dlltool always use the same archive member name for every object file (namely, the DLL library name). Ensure that long names are not repeatedly stored in the string table. Reviewed By: ruiu Differential Revision: https://reviews.llvm.org/D55860 llvm-svn: 349637	2018-12-19 16:15:05 +00:00
Simon Pilgrim	7bfbf3caa4	[X86][SSE] Auto upgrade PADDUS/PSUBUS intrinsics to UADD_SAT/USUB_SAT generic intrinsics (llvm) Now that we use the generic ISD opcodes, we can use the generic intrinsics directly as well. This fixes the poor fast-isel codegen by not expanding to an easily broken IR code sequence. I'm intending to deal with the signed saturation equivalents as well. Clang counterpart: https://reviews.llvm.org/D55879 Differential Revision: https://reviews.llvm.org/D55855 llvm-svn: 349630	2018-12-19 14:43:36 +00:00
Simon Pilgrim	2ae3a91656	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 2 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. I've updated the (or (and X, c1), c2) -> (and (or X, c2), c1\|c2) fold to demonstrate its use, which I believe is safe for undef cases. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349629	2018-12-19 14:09:38 +00:00
Simon Pilgrim	47ff0431e9	[SelectionDAG] Optional handling of UNDEF elements in matchBinaryPredicate (part 1 of 2) Now that SimplifyDemandedBits/SimplifyDemandedVectorElts is simplifying vector elements, we're seeing more constant BUILD_VECTOR containing undefs. This patch provides opt-in support for UNDEF elements in matchBinaryPredicate, passing NULL instead of the result ConstantSDNode* argument. Differential Revision: https://reviews.llvm.org/D55822 llvm-svn: 349628	2018-12-19 14:09:09 +00:00
Simon Pilgrim	6c95bea072	[TargetLowering] Fix propagation of undefs in zero extension ops (PR40091) As described on PR40091, we have several places where zext (and zext_vector_inreg) fold an undef input into an undef output. For zero extensions this is incorrect as the output should guarantee to at least have the new upper bits set to zero. SimplifyDemandedVectorElts is the worst offender (and it's the most likely to cause new undefs to appear) but DAGCombiner's tryToFoldExtendOfConstant has a similar issue. Thanks to @dmgreen for catching this. Differential Revision: https://reviews.llvm.org/D55883 llvm-svn: 349625	2018-12-19 13:37:59 +00:00
Nico Weber	f7cf1a1a73	Let TableGen write output only if it changed, instead of doing so in cmake, attempt 2 This relands r330742: """ Let TableGen write output only if it changed, instead of doing so in cmake. Removes one subprocess and one temp file from the build for each tablegen invocation. No intended behavior change. """ In particular, if you see rebuilds after this change that you didn't see before this change, that's unintended and it's fine to revert this change again (but let me know). r330742 got reverted because some people reported that llvm-tblgen ran on every build after it. This could happen if the depfile output got deleted without deleting the main .inc output. To fix, make TableGen always write the depfile, but keep writing the main .inc output only if it has changed. This matches what we did in cmake before. Differential Revision: https://reviews.llvm.org/D55842 llvm-svn: 349624	2018-12-19 13:35:53 +00:00
Nicolai Haehnle	8d5e974076	AMDGPU: Use an ABS32_LO relocation for SCRATCH_RSRC_DWORD1 Summary: Using HI here makes no logical sense, since the dword is only 32 bits to begin with. Current Mesa master does not look at the relocation type at all, so this change is fine. Future Mesa will rely on this, however. Change-Id: I91085707834c4ac0370926602b93c94b90e44cb1 Reviewers: arsenm, rampitec, mareko Subscribers: kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D55369 llvm-svn: 349620	2018-12-19 11:55:03 +00:00
Simon Pilgrim	2072b5afbe	[SelectionDAG] Optional handling of UNDEF elements in matchUnaryPredicate Now that SimplifyDemandedBits/SimplifyDemandedVectorElts are simplifying vector elements, we're seeing more constant BUILD_VECTOR containing UNDEFs. This patch provides opt-in handling of UNDEF elements in matchUnaryPredicate, passing NULL instead of the ConstantSDNode* argument. I've updated SelectionDAG::simplifyShift to demonstrate its use. Differential Revision: https://reviews.llvm.org/D55819 llvm-svn: 349616	2018-12-19 10:41:06 +00:00
Carl Ritson	c521ac3a44	AMDGPU/InsertWaitcnts: Update VGPR/SGPR bounds when brackets are merged Summary: Fix an issue where VGPR/SGPR bounds are not properly extended when brackets are merged. This manifests as missing waitcnt insertions when multiple brackets are forwarded to a successor block and the first forward has lower VGPR/SGPR bounds. Irreducible loop test has been extended based on a CTS failure detected for GFX9. Reviewers: nhaehnle Reviewed By: nhaehnle Subscribers: arsenm, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, t-tye, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D55602 llvm-svn: 349611	2018-12-19 10:17:49 +00:00
Diana Picus	6c35a1e5af	[ARM GlobalISel] Support G_CONSTANT for Thumb2 All we have to do is mark it as legal. This allows us to select a lot of new patterns handled by TableGen. This patch adds tests for them and splits up the existing test file for binary operators into 2 files, one for arithmetic ops and one for logical ones. llvm-svn: 349610	2018-12-19 09:55:10 +00:00
Matt Arsenault	b110e2277c	AMDGPU/GlobalISel: Regbankselect for fsub llvm-svn: 349608	2018-12-19 09:07:58 +00:00
Martin Storsjo	e84a0b5a9e	[llvm-objcopy] Initial COFF support This is an initial implementation of no-op passthrough copying of COFF with objcopy. Differential Revision: https://reviews.llvm.org/D54939 llvm-svn: 349605	2018-12-19 07:24:38 +00:00
Kewen Lin	a6247e7cf4	[PowerPC]Exploit P9 vabsdu for unsigned vselect patterns For type v4i32/v8i16/v16i8, do following transforms: (vselect (setcc a, b, setugt), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setuge), (sub a, b), (sub b, a)) -> (vabsd a, b) (vselect (setcc a, b, setult), (sub b, a), (sub a, b)) -> (vabsd a, b) (vselect (setcc a, b, setule), (sub b, a), (sub a, b)) -> (vabsd a, b) Differential Revision: https://reviews.llvm.org/D55812 llvm-svn: 349599	2018-12-19 03:04:07 +00:00
Evandro Menezes	f03c45d582	[AArch64] Simplify the Exynos M3 pipeline model llvm-svn: 349569	2018-12-18 23:19:57 +00:00
Evandro Menezes	4e39fa4474	[AArch64] Fix instructions order (NFC) llvm-svn: 349568	2018-12-18 23:19:55 +00:00
Yonghong Song	61b189e06f	[DebugInfo] Move several private headers to include directory This patch moved the following files in lib/CodeGen/AsmPrinter/ AsmPrinterHandler.h DbgEntityHistoryCalculator.h DebugHandlerBase.h to include/llvm/CodeGen directory. Such a change will enable Target to extend DebugHandlerBase and emit Target specific debug info sections. Signed-off-by: Yonghong Song <yhs@fb.com> Differential Revision: https://reviews.llvm.org/D55755 llvm-svn: 349564	2018-12-18 23:10:17 +00:00
Pete Cooper	a3e0be109c	Preserve the linkage for objc* intrinsics as clang will set them to weak_external in some cases Clang uses weak linkage for objc runtime functions when they are not available on the platform. The intrinsic has this linkage so we just need to pass that on to the runtime call. llvm-svn: 349559	2018-12-18 22:42:08 +00:00
Pete Cooper	d0ffdf8782	Add nonlazybind to objc_retain/objc_release when converting from intrinsics. For performance reasons, clang set nonlazybind on these functions. Now that we are using intrinsics instead of runtime calls, we should set this attribute when creating the runtime functions. llvm-svn: 349558	2018-12-18 22:31:34 +00:00
Florian Hahn	485f2826ba	[LAA] Introduce enum for vectorization safety status (NFC). This patch adds a VectorizationSafetyStatus enum, which will be extended in a follow up patch to distinguish between 'safe with runtime checks' and 'known unsafe' dependences. Reviewers: anemet, anna, Ayal, hsaito Reviewed By: Ayal Differential Revision: https://reviews.llvm.org/D54892 llvm-svn: 349556	2018-12-18 22:25:11 +00:00
Vitaly Buka	4e4920694c	[asan] Restore ODR-violation detection on vtables Summary: unnamed_addr is still useful for detecting of ODR violations on vtables Still unnamed_addr with lld and --icf=safe or --icf=all can trigger false reports which can be avoided with --icf=none or by using private aliases with -fsanitize-address-use-odr-indicator Reviewers: eugenis Reviewed By: eugenis Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55799 llvm-svn: 349555	2018-12-18 22:23:30 +00:00
Pete Cooper	f86db5ce9e	Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering instead of SDAG. SelectionDAG currently changes these intrinsics to function calls, but that won't work for other ISel's. Also we want to eventually support nonlazybind and weak linkage coming from the front-end which we can't do in SelectionDAG. llvm-svn: 349552	2018-12-18 22:20:03 +00:00
Martin Storsjo	df20c666d6	[AArch64] Avoid crashing on .seh directives in assembly Differential Revision: https://reviews.llvm.org/D55670 llvm-svn: 349549	2018-12-18 22:10:17 +00:00
Kuba Mracek	3760fc9f3d	[asan] In llvm.asan.globals, allow entries to be non-GlobalVariable and skip over them Looks like there are valid reasons why we need to allow bitcasts in llvm.asan.globals, see discussion at https://github.com/apple/swift-llvm/pull/133. Let's look through bitcasts when iterating over entries in the llvm.asan.globals list. Differential Revision: https://reviews.llvm.org/D55794 llvm-svn: 349544	2018-12-18 21:20:17 +00:00
Evandro Menezes	3753c25b8c	[llvm-mca] Dump mask in hex Dump the resources masks as hexadecimal. llvm-svn: 349536	2018-12-18 20:45:50 +00:00
Pete Cooper	be4f571107	Change the objc ARC optimizer to use the new objc.* intrinsics We're moving ARC optimisation and ARC emission in clang away from runtime methods and towards intrinsics. This is the part which actually uses the intrinsics in the ARC optimizer when both analyzing the existing calls and emitting new ones. Differential Revision: https://reviews.llvm.org/D55348 Reviewers: ahatanak llvm-svn: 349534	2018-12-18 20:32:49 +00:00
Craig Topper	18a9d545e1	[X86] Add BSR to isUseDefConvertible. We already had BSF here as part of __builtin_ffs improvements and I was just wondering yesterday whether we should have BSR there. This addresses one issue from PR40090. llvm-svn: 349531	2018-12-18 20:03:54 +00:00
Nikita Popov	20853a7807	[InstCombine] Simplify cttz/ctlz + icmp eq/ne into mask check Checking whether a number has a certain number of trailing / leading zeros means checking whether it is of the form XXXX1000 / 0001XXXX, which can be done with an and+icmp. Related to https://bugs.llvm.org/show_bug.cgi?id=28668. As a next step, this can be extended to non-equality predicates. Differential Revision: https://reviews.llvm.org/D55745 llvm-svn: 349530	2018-12-18 19:59:50 +00:00
Farhana Aleen	59ee2c5362	[AMDGPU] Removed the unnecessary operand size-check-assert from processBaseWithConstOffset(). Summary: 32bit operand sizes are guaranteed by the opcode check AMDGPU::V_ADD_I32_e64 and AMDGPU::V_ADDC_U32_e64. Therefore, we don't need any additional operand size-check-assert. Author: FarhanaAleen llvm-svn: 349529	2018-12-18 19:58:39 +00:00
David Blaikie	693f617763	DebugInfo: Fix missing local imported entities after r349207 Post commit review/bug reported by Pavel Labath - thanks! llvm-svn: 349528	2018-12-18 19:40:22 +00:00
Florian Hahn	5c014037b3	[SCCP] Get rid of redundant call for getPredicateInfoFor (NFC). We can use the result fetched a few lines above. llvm-svn: 349527	2018-12-18 19:37:07 +00:00
Craig Topper	8434ef7d1e	[X86] Don't use SplitOpsAndApply to create ISD::UADDSAT/ISD::USUBSAT nodes. Let type legalization and op legalization deal with it. Now that we've switched to target independent nodes we can rely on generic infrastructure to do the legalization for us. llvm-svn: 349526	2018-12-18 19:29:08 +00:00
Sanjay Patel	e51d5bdb3c	[InstCombine] refactor isCheapToScalarize(); NFC As the FIXME indicates, this has the potential to go overboard. So I'm not sure if it's even worth keeping this vs. iteratively doing simple matches, but we might as well clean it up. llvm-svn: 349523	2018-12-18 19:07:38 +00:00
Nikita Popov	f6058ff140	[X86] Use SADDSAT/SSUBSAT instead of ADDS/SUBS Migrate the X86 backend from X86ISD opcodes ADDS and SUBS to generic ISD opcodes SADDSAT and SSUBSAT. This also improves codegen for @llvm.sadd.sat() and @llvm.ssub.sat() intrinsics. This is a followup to D55787 and part of PR40056. Differential Revision: https://reviews.llvm.org/D55833 llvm-svn: 349520	2018-12-18 18:28:22 +00:00
Craig Topper	20a6db5a84	[X86] Create PSUBUS from (add (umax X, C), -C) InstCombine seems to canonicalize our PSUB pattern into a max with the constant and an add with an inverse of the constant. This patch recognizes this pattern and turns it into PSUBUS. Future work could improve undef element handling. Fixes some of PR40053 Differential Revision: https://reviews.llvm.org/D55780 llvm-svn: 349519	2018-12-18 18:26:25 +00:00
Alexandre Ganea	b536bf5299	Buildfix for r345516 (Clang compilation failing). llvm-svn: 349518	2018-12-18 18:23:36 +00:00
Alexandre Ganea	b67d91e090	[llvm-symbolizer] Omit stderr output when symbolizing a crash Differential revision: https://reviews.llvm.org/D55723 llvm-svn: 349516	2018-12-18 18:13:13 +00:00
Michael Berg	c6a5245cf7	Add FMF management to common fp intrinsics in GlobalIsel Summary: This is the initial code change to facilitate managing FMF flags from Instructions to MI wrt Intrinsics in Global Isel. Eventually the GlobalObserver interface will be added as well, where FMF additions can be tracked for the builder and CSE. Reviewers: aditya_nandakumar, bogner Reviewed By: bogner Subscribers: rovka, kristof.beyls, javed.absar Differential Revision: https://reviews.llvm.org/D55668 llvm-svn: 349514	2018-12-18 17:54:52 +00:00
Michael Kruse	d4eb13c880	[LoopVectorize] Rename pass options. NFC. Rename: NoUnrolling to InterleaveOnlyWhenForced and AlwaysVectorize to !VectorizeOnlyWhenForced Contrary to what the name 'AlwaysVectorize' suggests, it does not unconditionally vectorize all loops, but applies a cost model to determine whether vectorization is profitable to all loops. Hence, passing false will disable the cost model, except when a loop is marked with llvm.loop.vectorize.enable. The 'OnlyWhenForced' suffix (suggested by @hfinkel in D55716) better matches this behavior. Similarly, 'NoUnrolling' disables the profitability cost model for interleaving (a term to distinguish it from unrolling by the LoopUnrollPass); rename it for consistency. Differential Revision: https://reviews.llvm.org/D55785 llvm-svn: 349513	2018-12-18 17:46:09 +00:00
Simon Pilgrim	1411917431	[X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for constant rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349510	2018-12-18 17:31:11 +00:00
Michael Kruse	3284775b70	[LoopUnroll] Honor '#pragma unroll' even with -fno-unroll-loops. When using clang with `-fno-unroll-loops` (implicitly added with `-O1`), the LoopUnrollPass is not added to the (legacy) pass pipeline. This also means that it will not process any loop metadata such as llvm.loop.unroll.enable (which is generated by #pragma unroll), and WarnMissedTransformationsPass emits a warning that a forced transformation has not been applied (see https://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610833.html). Such explicit transformations should take precedence over disabling heuristics. This patch unconditionally adds LoopUnrollPass to the optimizing pipeline (that is, it is still not added with `-O0`), but passes a flag indicating whether automatic unrolling is dis-/enabled. This is the same approach as LoopVectorize uses. The new pass manager's pipeline builder has no option to disable unrolling, hence the problem does not apply. Differential Revision: https://reviews.llvm.org/D55716 llvm-svn: 349509	2018-12-18 17:16:05 +00:00
Simon Pilgrim	e9effe9744	[X86][SSE] Don't use 'sign bit select' vXi8 ROTL lowering for splat rotation amounts Noticed by @spatel on D55747 - we get much better codegen if we use the regular shift expansion. llvm-svn: 349500	2018-12-18 16:02:23 +00:00
Petar Avramovic	0a5e4eb776	[MIPS GlobalISel] Select G_SDIV, G_UDIV, G_SREM and G_UREM Add support for s64 libcalls for G_SDIV, G_UDIV, G_SREM and G_UREM and use integer type of correct size when creating arguments for CLI.lowerCall. Select G_SDIV, G_UDIV, G_SREM and G_UREM for types s8, s16, s32 and s64 on MIPS32. Differential Revision: https://reviews.llvm.org/D55651 llvm-svn: 349499	2018-12-18 15:59:51 +00:00
Nikita Popov	665ab08178	[X86] Use UADDSAT/USUBSAT instead of ADDUS/SUBUS Replace the X86ISD opcodes ADDUS and SUBUS with generic ISD opcodes UADDSAT and USUBSAT. As a side-effect, this also makes codegen for the @llvm.uadd.sat and @llvm.usub.sat intrinsics reasonable. This only replaces use in the X86 backend, and does not move any of the ADDUS/SUBUS X86 specific combines into generic codegen. Differential Revision: https://reviews.llvm.org/D55787 llvm-svn: 349481	2018-12-18 13:23:03 +00:00
Nikita Popov	a7d2a235bb	[SelectionDAG][X86] Fix [US](ADD|SUB)SAT vector legalization, add tests Integer result promotion needs to use the scalar size, and we need support for result widening. This is in preparation for D55787. llvm-svn: 349480	2018-12-18 13:22:53 +00:00
Petar Avramovic	150fd430f6	[MIPS GlobalISel] ClampScalar G_AND G_OR and G_XOR Add narrowScalar for G_AND and G_XOR. Legalize G_AND G_OR and G_XOR for types other than s32 with clampScalar on MIPS32. Differential Revision: https://reviews.llvm.org/D55362 llvm-svn: 349475	2018-12-18 11:36:14 +00:00
Luke Cheeseman	f57d7d8237	[AArch64] - Return address signing dwarf support - Reapply changes initially introduced in r343089 - The architecture info is no longer loaded whenever a DWARFContext is created - The runtimes libraries (sanitizers) make use of the dwarf context classes but do not initialise the target info - The architecture of the object can be obtained without loading the target info - Adding a method to the dwarf context to get this information and multiplex the string printing later on Differential Revision: https://reviews.llvm.org/D55774 llvm-svn: 349472	2018-12-18 10:37:42 +00:00
Dylan McKay	f920da009e	[IPO][AVR] Create new Functions in the default address space specified in the data layout This modifies the IPO pass so that it respects any explicit function address space specified in the data layout. In targets with nonzero program address spaces, all functions should, by default, be placed into the default program address space. This is required for Harvard architectures like AVR. Without this, the functions will be marked as residing in data space, and thus not be callable. This has no effect on any in-tree official backends, as none use an explicit program address space in their data layouts. Patch by Tim Neumann. llvm-svn: 349469	2018-12-18 09:52:52 +00:00
Matt Arsenault	c94e26c71d	AMDGPU: Legalize/regbankselect frame_index llvm-svn: 349468	2018-12-18 09:46:13 +00:00
Matt Arsenault	c0ea221068	AMDGPU: Legalize/regbankselect fma llvm-svn: 349467	2018-12-18 09:39:56 +00:00
Simon Pilgrim	af6fbbf18b	[TargetLowering] Fallback from SimplifyDemandedVectorElts to SimplifyDemandedBits For opcodes not covered by SimplifyDemandedVectorElts, SimplifyDemandedBits might be able to help now that it supports demanded elts as well. llvm-svn: 349466	2018-12-18 09:33:25 +00:00
Tim Northover	856628f707	SROA: preserve alignment tags on loads and stores. When splitting up an alloca's uses we were dropping any explicit alignment tags, which means they default to the ABI-required default alignment and this can cause miscompiles if the real value was smaller. Also refactor the TBAA metadata into a parent class since it's shared by both children anyway. llvm-svn: 349465	2018-12-18 09:29:39 +00:00
Matt Arsenault	1ac38ba73f	GlobalISel: Improve crash on invalid mapping If NumBreakDowns is 0, BreakDown is null. This trades a null dereference for an assert somewhere else. llvm-svn: 349464	2018-12-18 09:27:29 +00:00
Matt Arsenault	e01e7c81f2	AMDGPU/GlobalISel: Legalize/regbankselect fneg/fabs/fsub llvm-svn: 349463	2018-12-18 09:19:03 +00:00
Simon Pilgrim	8488a44c34	[X86][SSE] Move VSRAI sign extend in reg fold into SimplifyDemandedBits (VSRAI (VSHLI X, C1), C1) --> X iff NumSignBits(X) > C1 This works better as part of SimplifyDemandedBits than part of the general combine. llvm-svn: 349462	2018-12-18 09:11:34 +00:00
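A small model of the fold's side condition, assuming 8-bit lanes held in Python integers. `num_sign_bits` here is a hypothetical stand-in for the sign-bit count the commit message refers to, not the SelectionDAG routine itself.

```python
BITS = 8
MASK = (1 << BITS) - 1


def to_signed(v):
    """Reinterpret the low BITS bits of v as a two's-complement value."""
    v &= MASK
    return v - (1 << BITS) if v & (1 << (BITS - 1)) else v


def num_sign_bits(v):
    """Count the leading bits that equal the sign bit (always at least 1)."""
    v &= MASK
    sign = v >> (BITS - 1)
    count = 0
    for bit in range(BITS - 1, -1, -1):
        if (v >> bit) & 1 != sign:
            break
        count += 1
    return count


def shl(v, c):
    """Logical shift left, as VSHLI does per lane."""
    return (v << c) & MASK


def sra(v, c):
    """Arithmetic shift right, as VSRAI does per lane."""
    return (to_signed(v) >> c) & MASK


def fold_is_valid(v, c):
    """True when (sra (shl v, c), c) reproduces v."""
    return sra(shl(v, c), c) == (v & MASK)
```

For example, `0x40` has only one sign bit, so shifting by 1 and back yields `0xC0` rather than `0x40`, which is the case the `NumSignBits(X) > C1` condition excludes.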
Simon Pilgrim	26c630f416	[X86][SSE] Replace (VSRLI (VSRAI X, Y), 31) -> (VSRLI X, 31) fold. This fold was incredibly specific - replace with a SimplifyDemandedBits fold to remove a VSRAI if only the original sign bit is demanded (it's guaranteed to stay the same). Test change is merely a rescheduling. llvm-svn: 349459	2018-12-18 08:55:47 +00:00
Kristof Beyls	e66bc1f756	Introduce control flow speculation tracking pass for AArch64 The pass implements tracking of control flow miss-speculation into a "taint" register. That taint register can then be used to mask off registers with sensitive data when executing under miss-speculation, a.k.a. "transient execution". This pass is aimed at mitigating against SpectreV1-style vulnerabilities. At the moment, it implements the tracking of miss-speculation of control flow into a taint register, but doesn't implement a mechanism yet to then use that taint register to mask off vulnerable data in registers (something for a follow-on improvement). Possible strategies to mask out vulnerable data that can be implemented on top of this are: - speculative load hardening to automatically mask off data loaded in registers. - using intrinsics to mask off data in registers as indicated by the programmer (see https://lwn.net/Articles/759423/). For AArch64, the following implementation choices are made. Some of these are different than the implementation choices made in the similar pass implemented in X86SpeculativeLoadHardening.cpp, as the instruction set characteristics result in different trade-offs. - The speculation hardening is done after register allocation. With a relative abundance of registers, one register is reserved (X16) to be the taint register. X16 is expected to not clash with other register reservation mechanisms with very high probability because: . The AArch64 ABI doesn't guarantee X16 to be retained across any call. . The only way to request X16 to be used as a programmer is through inline assembly. In the rare case a function explicitly demands to use X16/W16, this pass falls back to hardening against speculation by inserting a DSB SYS/ISB barrier pair which will prevent control flow speculation. - It is easy to insert mask operations at this late stage as we have mask operations available that don't set flags. 
- The taint variable contains all-ones when no miss-speculation is detected, and contains all-zeros when miss-speculation is detected. Therefore, when masking, an AND instruction (which only changes the register to be masked, no other side effects) can easily be inserted anywhere that's needed. - The tracking of miss-speculation is done by using a data-flow conditional select instruction (CSEL) to evaluate the flags that were also used to make conditional branch direction decisions. Speculation of the CSEL instruction can be limited with a CSDB instruction - so the combination of CSEL + a later CSDB gives the guarantee that the flags as used in the CSEL aren't speculated. When conditional branch direction gets miss-speculated, the semantics of the inserted CSEL instruction is such that the taint register will contain all zero bits. One key requirement for this to work is that the conditional branch is followed by an execution of the CSEL instruction, where the CSEL instruction needs to use the same flags status as the conditional branch. This means that the conditional branches must not be implemented as one of the AArch64 conditional branches that do not use the flags as input (CB(N)Z and TB(N)Z). This is implemented by ensuring in the instruction selectors to not produce these instructions when speculation hardening is enabled. This pass will assert if it does encounter such an instruction. - On function call boundaries, the miss-speculation state is transferred from the taint register X16 to be encoded in the SP register as value 0. Future extensions/improvements could be: - Implement this functionality using full speculation barriers, akin to the x86-slh-lfence option. This may be more useful for the intrinsics-based approach than for the SLH approach to masking. Note that this pass already inserts the full speculation barriers if the function for some niche reason makes use of X16/W16. 
- no indirect branch misprediction gets protected/instrumented; but this could be done for some indirect branches, such as switch jump tables. Differential Revision: https://reviews.llvm.org/D54896 llvm-svn: 349456	2018-12-18 08:50:02 +00:00
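The taint-tracking scheme described in this commit message can be modeled in a few lines. This is only an illustrative sketch with 64-bit values as Python integers: `csel`, `update_taint`, and `mask_value` are hypothetical names, and real speculation cannot be reproduced in ordinary code, so "miss-speculated" is simply a boolean here.

```python
ALL_ONES = (1 << 64) - 1  # taint value while no miss-speculation has been detected


def csel(condition_holds, if_true, if_false):
    """Data-flow conditional select, standing in for the AArch64 CSEL instruction."""
    return if_true if condition_holds else if_false


def update_taint(taint, condition_holds_on_this_edge):
    """After a conditional branch, re-check the same flags on the edge taken.

    If the condition that should hold on this edge does not actually hold,
    the branch was miss-speculated and the taint becomes all-zeros.
    """
    return csel(condition_holds_on_this_edge, taint, 0)


def mask_value(value, taint):
    """AND with the taint: unchanged when all-ones, zeroed when all-zeros."""
    return value & taint
```

Because the taint is either all-ones or all-zeros, a single AND is enough to leave a value intact on the correct path and clear it under miss-speculation, and once cleared the taint stays cleared through later `update_taint` calls.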
Martin Storsjo	8f0cb9c3a8	[AArch64] [MinGW] Allow enabling SEH exceptions The default still is dwarf, but SEH exceptions can now be enabled optionally for the MinGW target. Differential Revision: https://reviews.llvm.org/D55748 llvm-svn: 349451	2018-12-18 08:32:37 +00:00
Kewen Lin	44ace92596	[PowerPC] Exploit power9 new instruction setb Check the expected patterns feeding to SELECT_CC like: (select_cc lhs, rhs, 1, (sext (setcc [lr]hs, [lr]hs, cc2)), cc1) (select_cc lhs, rhs, -1, (zext (setcc [lr]hs, [lr]hs, cc2)), cc1) (select_cc lhs, rhs, 0, (select_cc [lr]hs, [lr]hs, 1, -1, cc2), seteq) (select_cc lhs, rhs, 0, (select_cc [lr]hs, [lr]hs, -1, 1, cc2), seteq) Further transform the sequence to comparison + setb if one matches. Differential Revision: https://reviews.llvm.org/D53275 llvm-svn: 349445	2018-12-18 07:53:26 +00:00
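As a rough illustration of why such patterns reduce to one comparison plus a -1/0/1 result, the sketch below evaluates one instance of the first pattern. The choice of `cc1 = setgt` and `cc2 = setlt` is an assumption made for the example, and the helper names are hypothetical.

```python
import operator


def select_cc(lhs, rhs, true_val, false_val, cc):
    """Pick true_val when cc(lhs, rhs) holds, otherwise false_val."""
    return true_val if cc(lhs, rhs) else false_val


def sext_i1(flag):
    """Sign-extend a one-bit comparison result: True becomes -1, False becomes 0."""
    return -1 if flag else 0


def first_pattern(lhs, rhs):
    """(select_cc lhs, rhs, 1, (sext (setcc lhs, rhs, setlt)), setgt)."""
    return select_cc(lhs, rhs, 1, sext_i1(operator.lt(lhs, rhs)), operator.gt)


def three_way(lhs, rhs):
    """Reference three-way comparison producing -1, 0, or 1."""
    return (lhs > rhs) - (lhs < rhs)
```

Under these assumptions the nested select and the three-way comparison agree on every pair of inputs, which is the shape of result the commit maps to comparison + setb.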
Craig Topper	1ff7356f96	[X86] Const correct some helper functions X86InstrInfo.cpp. NFC llvm-svn: 349440	2018-12-18 04:58:05 +00:00
Artur Pilipenko	2a0146e0fd	[CaptureTracking] Pass MaxUsesToExplore from wrappers to the actual implementation This is a follow up for rL347910. In the original patch I somehow forgot to pass the limit from wrappers to the function which actually does the job. llvm-svn: 349438	2018-12-18 03:32:33 +00:00
Kewen Lin	3dac1252da	[PowerPC] Improve vec_abs on P9 Improve the current vec_abs support on P9, generate ISD::ABS node for vector types, combine ABS node to VABSD node for some special cases to make use of P9 VABSD* insns, do custom lowering to vsub(vneg later)+vmax if it has no combination opportunity. Differential Revision: https://reviews.llvm.org/D54783 llvm-svn: 349437	2018-12-18 03:16:43 +00:00
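The fallback lowering mentioned above (subtract from zero, then take the maximum) can be sketched per lane as follows. This models 32-bit signed lanes as Python integers; `wrap32` and `vec_abs` are hypothetical helpers, not the PowerPC lowering code.

```python
def wrap32(v):
    """Wrap an integer to a signed 32-bit lane value."""
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def vec_abs(lanes):
    """abs per lane computed as max(x, 0 - x), with a wrapping subtract."""
    return [max(x, wrap32(0 - x)) for x in lanes]
```

Note that with wrapping arithmetic the most negative lane value maps to itself, the same as a two's-complement abs.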
Eli Friedman	f457470286	[Support] Fix GNU/kFreeBSD build Patch by James Clarke. Differential Revision: https://reviews.llvm.org/D55296 llvm-svn: 349434	2018-12-18 01:38:20 +00:00
Reid Kleckner	4ab50b858e	[codeview] Update comment on aligning symbol records llvm-svn: 349433	2018-12-18 01:36:06 +00:00
Reid Kleckner	53ce05960e	[codeview] Align symbol records to save 441MB during linking clang.pdb In PDBs, symbol records must be aligned to four bytes. However, in the object file, symbol records may not be aligned. MSVC does not pad out symbol records to make sure they are aligned. That means the linker has to do extra work to insert the padding. Currently, LLD calculates the required space with alignment, and copies each record one at a time while padding them out to the correct size. It has a fast path that avoids this copy when the records are already aligned. This change fixes a bug in that codepath so that the copy is actually saved, and tweaks LLVM's symbol record emission to align symbol records. Here's how things compare when doing a plain clang Release+PDB build: - objs are 0.65% bigger (negligible) - link is 3.3% faster (negligible) - saves allocating 441MB - new LLD high water mark is ~1.05GB llvm-svn: 349431	2018-12-18 01:14:05 +00:00
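The padding arithmetic behind four-byte alignment is small enough to show directly. This is a generic sketch, not LLD's or LLVM's code; `align_to` and `padding_needed` are hypothetical names.

```python
def align_to(size, alignment=4):
    """Round size up to the next multiple of alignment."""
    return (size + alignment - 1) // alignment * alignment


def padding_needed(size, alignment=4):
    """Bytes of padding required to bring size up to the alignment."""
    return align_to(size, alignment) - size
```

A record that is already a multiple of four bytes needs no padding, which corresponds to the fast path in the linker that can skip the per-record copy.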
David Blaikie	c4e08feb00	Recommit r348806: DebugInfo: Use symbol difference for CU length to simplify assembly reading/editing Mucking about simplifying a test case ( https://reviews.llvm.org/D55261 ) I stumbled across something I've hit before - that LLVM's (GCC's does too, FWIW) assembly output includes a hardcoded length for a DWARF unit in its header. Instead we could emit a label difference - making the assembly easier to read/edit (though potentially at a slight (I haven't tried to observe it) performance cost of delaying/sinking the length computation into the MC layer). Fix: Predicated all the changes (including creating the labels, even if they aren't used/needed) behind the NVPTX useSectionsAsReferences, avoiding emitting labels in NVPTX where ptxas can't parse them. Reviewers: JDevlieghere, probinson, ABataev Differential Revision: https://reviews.llvm.org/D55281 llvm-svn: 349430	2018-12-18 01:06:09 +00:00
Joel E. Denny	e2afb61499	[FileCheck] Annotate input dump (final tweaks) Apply final suggestions from probinson for this patch series plus a few more tweaks: * Improve various docs, for MatchType in particular. * Rename some members of MatchType. The main problem was that the term "final match" became a misnomer when CHECK-COUNT-<N> was created. * Split InputStartLine, etc. declarations into multiple lines. Differential Revision: https://reviews.llvm.org/D55738 Reviewed By: probinson llvm-svn: 349425	2018-12-18 00:03:51 +00:00
Joel E. Denny	96f0e84ccf	[FileCheck] Annotate input dump (7/7) This patch implements annotations for diagnostics reporting CHECK-NOT failed matches. These diagnostics are enabled by -vv. As for diagnostics reporting failed matches for other directives, these annotations mark the search ranges using `X~~`. The difference here is that failed matches for CHECK-NOT are successes not errors, so they are green not red when colors are enabled. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - ^~~ marks good match (reported if -v) - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - CHECK-DAG overlapping match (discarded, reported if -vv) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - CHECK-NOT not found (success, reported if -vv) - CHECK-DAG not found after discarded matches (error) - ? marks fuzzy match when no match is found - colors success, error, fuzzy match, discarded match, unmatched input If you are not seeing color above or in input dumps, try: -color $ FileCheck -vv -dump-input=always check5 < input5 \|& sed -n '/^<<<</,$p' <<<<<< 1: abcdef check:1 ^~~ not:2 X~~ 2: ghijkl not:2 ~~~ check:3 ^~~ 3: mnopqr not:4 X~~~~~ 4: stuvwx not:4 ~~~~~~ 5: eof:4 ^ >>>>>> $ cat check5 CHECK: abc CHECK-NOT: foobar CHECK: jkl CHECK-NOT: foobar $ cat input5 abcdef ghijkl mnopqr stuvwx ``` Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53899 llvm-svn: 349424	2018-12-18 00:03:36 +00:00
Joel E. Denny	f7c1c4d8a4	[FileCheck] Annotate input dump (6/7) This patch implements input annotations for diagnostics reporting CHECK-DAG discarded matches. These diagnostics are enabled by -vv. These annotations mark discarded match ranges using `!~~` because they are bad matches even though they are not errors. CHECK-DAG discarded matches create another case where there can be multiple match results for the same directive. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - ^~~ marks good match (reported if -v) - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - CHECK-DAG overlapping match (discarded, reported if -vv) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - CHECK-DAG not found after discarded matches (error) - ? marks fuzzy match when no match is found - colors success, error, fuzzy match, discarded match, unmatched input If you are not seeing color above or in input dumps, try: -color $ FileCheck -vv -dump-input=always check4 < input4 \|& sed -n '/^<<<</,$p' <<<<<< 1: abcdef dag:1 ^~~~ dag:2'0 !~~~ discard: overlaps earlier match 2: cdefgh dag:2'1 ^~~~ check:3 X~ error: no match found >>>>>> $ cat check4 CHECK-DAG: abcd CHECK-DAG: cdef CHECK: efgh $ cat input4 abcdef cdefgh ``` This shows that the line 3 CHECK fails to match even though its pattern appears in the input because its search range starts after the line 2 CHECK-DAG's match range. 
The trouble might be that the line 2 CHECK-DAG's match range is later than expected because its first match range overlaps with the line 1 CHECK-DAG match range and thus is discarded. Because `!~~` for CHECK-DAG does not indicate an error, it is not colored red. Instead, when colors are enabled, it is colored cyan, which suggests a match that went cold. Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53898 llvm-svn: 349423	2018-12-18 00:03:19 +00:00
Joel E. Denny	7df86967b4	[FileCheck] Annotate input dump (5/7) This patch implements input annotations for diagnostics enabled by -v, which report good matches for directives. These annotations mark match ranges using `^~~`. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - ^~~ marks good match (reported if -v) - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - ? marks fuzzy match when no match is found - colors success, error, fuzzy match, unmatched input If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check3 < input3 \|& sed -n '/^<<<</,$p' <<<<<< 1: abc foobar def check:1 ^~~ not:2 !~~~~~ error: no match expected check:3 ^~~ >>>>>> $ cat check3 CHECK: abc CHECK-NOT: foobar CHECK: def $ cat input3 abc foobar def ``` -vv enables these annotations for FileCheck's implicit EOF patterns as well. For an example where EOF patterns become relevant, see patch 7 in this series. If colors are enabled, `^~~` is green to suggest success. -v plus color enables highlighting of input text that has no final match for any expected pattern. The highlight uses a cyan background to suggest a cold section. This highlighting can make it easier to spot text that was intended to be matched but that failed to be matched in a long series of good matches. CHECK-COUNT-<num> good matches are another case where there can be multiple match results for the same directive. 
Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53897 llvm-svn: 349422	2018-12-18 00:03:03 +00:00
Joel E. Denny	0e7e3fa0e9	[FileCheck] Annotate input dump (4/7) This patch implements input annotations for diagnostics that report unexpected matches for CHECK-NOT. Like wrong-line matches for CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY, these annotations mark match ranges using red `!~~` to indicate bad matches that are errors. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - CHECK-NOT found (error) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - ? marks fuzzy match when no match is found - colors error, fuzzy match If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check3 < input3 \|& sed -n '/^<<<</,$p' <<<<<< 1: abc foobar def not:2 !~~~~~ error: no match expected >>>>>> $ cat check3 CHECK: abc CHECK-NOT: foobar CHECK: def $ cat input3 abc foobar def ``` Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53896 llvm-svn: 349421	2018-12-18 00:02:47 +00:00
Joel E. Denny	cadfcef493	[FileCheck] Annotate input dump (3/7) This patch implements input annotations for diagnostics that report wrong-line matches for the directives CHECK-NEXT, CHECK-SAME, and CHECK-EMPTY. Instead of the usual `^~~`, which is used by later patches for good matches, these annotations use `!~~` to mark the bad match ranges so that this category of errors is visually distinct. Because such matches are errors, these annotations are red when colors are enabled. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - !~~ marks bad match, such as: - CHECK-NEXT on same line as previous match (error) - X~~ marks search range when no match is found, such as: - CHECK-NEXT not found (error) - ? marks fuzzy match when no match is found - colors error, fuzzy match If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check2 < input2 |& sed -n '/^<<<</,$p' <<<<<< 1: foo bar next:2 !~~ error: match on wrong line >>>>>> $ cat check2 CHECK: foo CHECK-NEXT: bar $ cat input2 foo bar ``` Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53894 llvm-svn: 349420	2018-12-18 00:02:22 +00:00
Joel E. Denny	2c007c807d	[FileCheck] Annotate input dump (2/7) This patch implements input annotations for diagnostics that suggest fuzzy matches for directives for which no matches were found. Instead of using the usual `^~~`, which is used by later patches for good matches, these annotations use `?` so that fuzzy matches are visually distinct. No tildes are included as these diagnostics (independently of this patch) currently identify only the start of the match. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the only match result for a pattern of type T from line L of the check file - T:L'N labels the Nth match result for a pattern of type T from line L of the check file - X~~ marks search range when no match is found - ? marks fuzzy match when no match is found - colors error, fuzzy match If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check1 < input1 \|& sed -n '/^<<<</,$p' <<<<<< 1: ; abc def 2: ; ghI jkl next:3'0 X~~~~~~~~ error: no match found next:3'1 ? possible intended match >>>>>> $ cat check1 CHECK: abc CHECK-SAME: def CHECK-NEXT: ghi CHECK-SAME: jkl $ cat input1 ; abc def ; ghI jkl ``` This patch introduces the concept of multiple "match results" per directive. In the above example, the first match result for the CHECK-NEXT directive is the failed match, for which the annotation shows the search range. The second match result is the fuzzy match. Later patches will introduce other cases of multiple match results per directive. When colors are enabled, `?` is colored magenta. That is, it doesn't indicate the actual error, which a red `X~~` marker indicates, but its color suggests it's closely related. 
Reviewed By: george.karpenkov, probinson Differential Revision: https://reviews.llvm.org/D53893 llvm-svn: 349419	2018-12-18 00:02:04 +00:00
Joel E. Denny	3c5d267eb7	[FileCheck] Annotate input dump (1/7) Extend FileCheck to dump its input annotated with FileCheck's diagnostics: errors, good matches if -v, and additional information if -vv. The goal is to make it easier to visualize FileCheck's matching behavior when debugging. Each patch in this series implements input annotations for a particular category of FileCheck diagnostics. While the first few patches alone are somewhat useful, the annotations become much more useful as later patches implement annotations for -v and -vv diagnostics, which show the matching behavior leading up to the error. This first patch implements boilerplate plus input annotations for error diagnostics reporting that no matches were found for a directive. These annotations mark the search ranges of the failed directives. Instead of using the usual `^~~`, which is used by later patches for good matches, these annotations use `X~~` so that this category of errors is visually distinct. For example: ``` $ FileCheck -dump-input=help The following description was requested by -dump-input=help to explain the input annotations printed by -dump-input=always and -dump-input=fail: - L: labels line number L of the input file - T:L labels the match result for a pattern of type T from line L of the check file - X~~ marks search range when no match is found - colors error If you are not seeing color above or in input dumps, try: -color $ FileCheck -v -dump-input=always check1 < input1 \|& sed -n '/^Input file/,$p' Input file: <stdin> Check file: check1 -dump-input=help describes the format of the following dump. 
Full input was: <<<<<< 1: ; abc def 2: ; ghI jkl next:3 X~~~~~~~~ error: no match found >>>>>> $ cat check1 CHECK: abc CHECK-SAME: def CHECK-NEXT: ghi CHECK-SAME: jkl $ cat input1 ; abc def ; ghI jkl ``` Some additional details related to the boilerplate: * Enabling: The annotated input dump is enabled by `-dump-input`, which can also be set via the `FILECHECK_OPTS` environment variable. Accepted values are `help`, `always`, `fail`, or `never`. As shown above, `help` describes the format of the dump. `always` is helpful when you want to investigate a successful FileCheck run, perhaps for an unexpected pass. `-dump-input-on-failure` and `FILECHECK_DUMP_INPUT_ON_FAILURE` remain as a deprecated alias for `-dump-input=fail`. * Diagnostics: The usual diagnostics are not suppressed in this mode and are printed first. For brevity in the example above, I've omitted them using a sed command. Sometimes they're perfectly sufficient, and then they make debugging quicker than if you were forced to hunt through a dump of long input looking for the error. If you think they'll get in the way sometimes, keep in mind that it's pretty easy to grep for the start of the input dump, which is `<<<`. * Colored Annotations: The annotated input is colored if colors are enabled (enabling colors can be forced using -color). For example, errors are red. However, as in the above example, colors are not vital to reading the annotations. I don't know how to test color in the output, so any hints here would be appreciated. Reviewed By: george.karpenkov, zturner, probinson Differential Revision: https://reviews.llvm.org/D52999 llvm-svn: 349418	2018-12-18 00:01:39 +00:00
Peter Collingbourne	d3a3e4b46d	hwasan: Move ctor into a comdat. Differential Revision: https://reviews.llvm.org/D55733 llvm-svn: 349413	2018-12-17 22:56:34 +00:00
Simon Pilgrim	7e2975a44c	[X86][SSE] Improve immediate vector shift known bits handling. Convert VSRAI to VSRLI if the sign bit is known zero and improve KnownBits output for all shift instructions. Fixes the poor codegen comments in D55768. llvm-svn: 349407	2018-12-17 22:09:47 +00:00
Wouter van Oortmerssen	d3c544aa6e	[WebAssembly] Fix assembler parsing of br_table. Summary: We use `variable_ops` in the tablegen defs to denote the list of branch targets in `br_table`, but unlike other uses of `variable_ops` (e.g. call) these branch targets need to actually be encoded in the instruction. The existing tables for `variable_ops` cause no operands to be accepted by the assembly matcher. Following the example of ARM: `2cc0a7da87/lib/Target/ARM/ARMInstrInfo.td (L550-L555)` we introduce a new operand type to capture this list, and we use the same {} syntax as ARM as well to differentiate them from regular integer operands. Also removed definition and use of TSFlags in tablegen defs, since `br_table` now has a non-variable_ops immediate operand, so the previous logic of only the variable_ops arguments being labels didn't make sense anymore. Reviewers: dschuff, aheejin, sunfish Subscribers: javed.absar, sbc100, jgravelle-google, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55401 llvm-svn: 349405	2018-12-17 22:04:44 +00:00
Craig Topper	8c9d772991	[X86] Add T1MSKC and TZMSK to isDefConvertible used by optimizeCompareInstr. These seem to have been missed when the other TBM instructions were added. llvm-svn: 349404	2018-12-17 21:50:06 +00:00
Reid Kleckner	94ee0728e5	[codeview] Flush labels before S_DEFRANGE* fragments This was a pre-existing bug that could be triggered with assembly like this: .p2align 2 .LtmpN: .cv_def_range "..." I noticed this when attempting to change clang to emit aligned symbol records. llvm-svn: 349403	2018-12-17 21:49:35 +00:00
Simon Pilgrim	6b5e0b7b2b	[X86][SSE] Split SimplifyDemandedBitsForTargetNode X86ISD::VSRLI/VSRAI handling. First step towards adding more capable combines to fix comments in D55768. llvm-svn: 349400	2018-12-17 21:36:17 +00:00
Sanjay Patel	200885e654	[AggressiveInstCombine] convert rotate with guard branch into funnel shift (PR34924) Now, that we have funnel shift intrinsics, it should be safe to convert this form of rotate to it. In the worst case (a target that doesn't have rotate instructions), we will expand this into a branch-less sequence of ALU ops (neg/and/and/lshr/shl/or) in the backend, so it's still very likely to be a perf improvement over the original code. The motivating source code pattern for this is shown in: https://bugs.llvm.org/show_bug.cgi?id=34924 Background: I looked at several different options before deciding where to try this - instcombine, simplifycfg, CGP - because it doesn't fit cleanly anywhere AFAIK. The backend (CGP, SDAG, GlobalIsel?) is too late for what we're trying to accomplish. We want to have the IR converted before we reach things like vectorization because the reduced code can make a loop much simpler to transform. Technically, this could be included in instcombine, but it's a large pattern match that includes control-flow, so it just felt wrong to stuff into there (although I have a draft of that patch). Similarly, this could be part of simplifycfg, but all of this pattern matching is a stretch. So we're left with our relatively new dumping ground for homeless transforms: aggressive-instcombine. This only runs at -O3, but that seems like a reasonable limitation given that source code has many options to avoid this pattern (including the recently added clang intrinsics for rotates). I'm including a PhaseOrdering test because we require the teamwork of 3 passes (aggressive-instcombine, instcombine, simplifycfg) to get this into the minimal IR form that we want. That test shows a bug with the new pass manager that's independent of this change (but it will be masked if we canonicalize harder to funnel shift intrinsics in instcombine). Differential Revision: https://reviews.llvm.org/D55604 llvm-svn: 349396	2018-12-17 21:14:51 +00:00
Krzysztof Parzyszek	5852aa44ae	[SDAG] Clarify the origin of chain in REG_SEQUENCE in comment, NFC llvm-svn: 349391	2018-12-17 20:30:20 +00:00
Craig Topper	15b7246935	[SelectionDAG] Fix noop detection for vectors in AssertZext/AssertSext in getNode The assertion type is always supposed to be a scalar type. So if the result VT of the assertion is a vector, we need to get the scalar VT before we can compare them. Similarly for the assert above it. I don't have a test case because I don't know of any place we violate this today. A coworker found this while trying to use r347287 on the 6.0 branch without also having r336868 llvm-svn: 349390	2018-12-17 20:29:13 +00:00
Sanjay Patel	1a6e9ec434	[InstCombine] don't widen an arbitrary sequence of vector ops (PR40032) The problem is shown specifically for a case with vector multiply here: https://bugs.llvm.org/show_bug.cgi?id=40032 ...and this might mask the original backend bug for ARM shown in: https://bugs.llvm.org/show_bug.cgi?id=39967 As the test diffs here show, we weren't (and probably still aren't) doing these kinds of transforms in a principled way. We are producing as many or more wide instructions than we started with in some cases, so we still need to restrict/correct other transforms from overstepping. If there are perf regressions from this change, we can either carve out exceptions to the general IR rules, or improve the backend to do these transforms when we know the transform is profitable. That's probably similar to a change like D55448. Differential Revision: https://reviews.llvm.org/D55744 llvm-svn: 349389	2018-12-17 20:27:43 +00:00
Craig Topper	728cbc0378	Convert (CMP (srl/shl X, C), 0) to (CMP (and X, C'), 0) when only the zero flag is used. This allows a TEST to be used and can be combined with any AND that may already exist as an input to the shift. This was already done in EmitTest, but was easily tricked by multiple uses because the setcc might be used by multiple instructions. Once the SETCC and users are legalized then we can look for the shift to be used by a single CMP, but the CMP itself can have multiple users. This appears to fix the case in PR39968. llvm-svn: 349385	2018-12-17 20:02:16 +00:00
JF Bastien	1811217e4d	NFC: remove unused variable D55768 removed its use. llvm-svn: 349377	2018-12-17 19:03:24 +00:00
Simon Pilgrim	9274f17a5e	[TargetLowering] Add DemandedElts mask to SimplifyDemandedBits (PR40000) This is an initial patch to add the necessary support for a DemandedElts argument to SimplifyDemandedBits, more closely matching computeKnownBits and to help improve vector codegen. I've added only a small amount of the changes necessary to get at least one test to update - a lot more can be done but I'd like to add these methodically with proper test coverage, at the same time the hope is to slowly move some/all of SimplifyDemandedVectorElts into SimplifyDemandedBits as well. Differential Revision: https://reviews.llvm.org/D55768 llvm-svn: 349374	2018-12-17 18:43:43 +00:00
Nikita Popov	221f3fc750	[InstSimplify] Simplify saturating add/sub + icmp If a saturating add/sub has one constant operand, then we can determine the possible range of outputs it can produce, and simplify an icmp comparison based on that. The implementation is based on a similar existing mechanism for simplifying binary operator + icmps. Differential Revision: https://reviews.llvm.org/D55735 llvm-svn: 349369	2018-12-17 17:45:18 +00:00
Tim Northover	256a16d031	FastIsel: take care to update iterators when removing instructions. We keep a few iterators into the basic block we're selecting while performing FastISel. Usually this is fine, but occasionally code wants to remove already-emitted instructions. When this happens we have to be careful to update those iterators so they're not pointing at dangling memory. llvm-svn: 349365	2018-12-17 17:25:53 +00:00
Zachary Turner	1b9a938b9a	Add missing include file. llvm-svn: 349363	2018-12-17 16:42:26 +00:00
Zachary Turner	bb3d7e565f	[PDB] Add some helper functions for working with scopes. llvm-svn: 349361	2018-12-17 16:15:36 +00:00
Zachary Turner	b472512a77	[MS Demangler] Add a helper function to print a Node as a string. llvm-svn: 349359	2018-12-17 16:14:50 +00:00
Petar Avramovic	f9c9bc09ab	[MIPS GlobalISel] Remove switch statement (fix r349346 for MSVC) Temporarily remove switch statement without any case labels in function legalizeCustom in order to fix r349346 for MSVC. llvm-svn: 349356	2018-12-17 15:12:53 +00:00
Tim Northover	ae3b66b7b0	ARM: use acquire/release instruction variants when available. These features (fairly) recently got split out into their own feature, so we should make CodeGen use them when available. The main change here is that the check used to be based on the triple, but now it's based on CPU features. llvm-svn: 349355	2018-12-17 15:05:32 +00:00
Andrea Di Biagio	4c73711069	[MCA] Add support for BeginGroup/EndGroup. llvm-svn: 349354	2018-12-17 14:27:33 +00:00
Eric Liu	6c933a2bed	Revert "DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses)" This reverts commit r349333. It caused internal test to fail. I have sent more information to the author. llvm-svn: 349353	2018-12-17 14:14:40 +00:00
Andrea Di Biagio	4506067593	[MCA] Don't assume that createMCInstrAnalysis() always returns a valid pointer. Class InstrBuilder wrongly assumed that llvm targets were always able to return a non-null pointer when createMCInstrAnalysis() was called on them. This was causing crashes when simulating executions for targets that don't provide an MCInstrAnalysis object. This patch fixes the issue by making MCInstrAnalysis optional. llvm-svn: 349352	2018-12-17 14:00:37 +00:00
Petar Avramovic	b8276f2280	[MIPS GlobalISel] Lower G_UADDE and narrowScalar G_ADD Lower G_UADDE and legalize G_ADD using narrowScalar on MIPS32. Differential Revision: https://reviews.llvm.org/D54580 llvm-svn: 349346	2018-12-17 12:31:07 +00:00
Alexandros Lamprineas	490ae11717	[AArch64] Re-run load/store optimizer after aggressive tail duplication The Load/Store Optimizer runs before Machine Block Placement. At O3 the Tail Duplication Threshold is set to 4 instructions and this can create new opportunities for the Load/Store Optimizer. It seems worthwhile to run it once again. llvm-svn: 349338	2018-12-17 10:45:43 +00:00
David Blaikie	884deed1b3	DebugInfo: Assume an absence of ranges or high_pc on a CU means the CU is empty (devoid of code addresses) GCC emitted these unconditionally on/before 4.4/March 2012 Clang emitted these unconditionally on/before 3.5/March 2014 This improves performance when parsing CUs (especially those using split DWARF) that contain no code ranges (such as the mini CUs that may be created by ThinLTO importing - though generally they should be/are avoided, especially for Split DWARF because it produces a lot of very small CUs, which don't scale well in a bunch of other ways too (including size)). llvm-svn: 349333	2018-12-17 08:27:19 +00:00
Clement Courbet	cc5e6a72de	[llvm-mca] Move llvm-mca library to llvm/lib/MCA. Summary: See PR38731. Reviewers: andreadb Subscribers: mgorny, javed.absar, tschuett, gbedwell, andreadb, RKSimon, llvm-commits Differential Revision: https://reviews.llvm.org/D55557 llvm-svn: 349332	2018-12-17 08:08:31 +00:00
Craig Topper	fa4907d671	[X86] Fix bad operand lookup for cmov introduced in r349315 The CC is operand 2 not operand 3. llvm-svn: 349330	2018-12-17 06:40:35 +00:00
Davide Italiano	e41e1d015f	[EarlyCSE] If DI can't be salvaged, mark it as unavailable. Fixes PR39874. llvm-svn: 349323	2018-12-17 01:42:39 +00:00
Simon Pilgrim	d0c9e43b1c	[X86] Pull out constant splat rotation detection. We had 3 different approaches - consistently use getTargetConstantBitsFromNode and allow undef elts. llvm-svn: 349319	2018-12-16 19:46:04 +00:00
Craig Topper	10f8892837	[X86] Remove truncation handling from EmitTest. Replace it with a DAG combine. I'd like to try to move a lot of the flag matching out of EmitTest and push it to isel or isel preprocessing. This is a step towards that. The test-shrink-bug.ll change is an improvement because we are no longer interfering with test shrink handling in isel. The pr34137.ll change is a regression, but the IR came from -O0 and was not reduced by InstCombine. So it contains a lot of redundancies like duplicate loads that made it combine poorly. llvm-svn: 349315	2018-12-16 18:35:55 +00:00
Sanjay Patel	13ac2f15b0	[x86] increment/decrement constant vector with min/max in vsetcc lowering (PR39859) This is part of fixing PR39859: https://bugs.llvm.org/show_bug.cgi?id=39859 We have a crippled vector ISA, so we have to invert a typical fold and create min/max here. As discussed in the bug report, we can probably do better by using saturating subtract when it's available, but we should have this improvement for the min/max patterns regardless. Alive proofs: https://rise4fun.com/Alive/zsf https://rise4fun.com/Alive/Qrl Differential Revision: https://reviews.llvm.org/D55515 llvm-svn: 349304	2018-12-16 15:05:48 +00:00
Sanjay Patel	f24900b934	[DAGCombiner] allow hoisting vector bitwise logic ahead of truncates The transform performs a bitwise logic op in a wider type followed by truncate when both inputs are truncated from the same source type: logic_op (truncate x), (truncate y) --> truncate (logic_op x, y) There are a bunch of other checks that should prevent doing this when it might be harmful. We already do this transform for scalars in this spot. The vector limitation was shared with a check for the case when the operands are extended. I'm not sure if that limit is needed either, but that would be a separate patch. Differential Revision: https://reviews.llvm.org/D55448 llvm-svn: 349303	2018-12-16 14:57:04 +00:00
Simon Pilgrim	0ef977b83d	[SelectionDAG] Add FSHL/FSHR support to computeKnownBits Also exposes an issue in DAGCombiner::visitFunnelShift where we were assuming the shift amount had the result type (after legalization it'll have the targets shift amount type). llvm-svn: 349298	2018-12-16 13:33:37 +00:00
Simon Pilgrim	52c982406e	[X86] Begin cleaning up combineOr -> SHLD/SHRD. NFCI. In preparation for converting to funnel shifts. llvm-svn: 349286	2018-12-15 21:11:49 +00:00
Simon Pilgrim	ef7b5949e5	[X86] Lower to SHLD/SHRD on slow machines for optsize Use consistent rules for when to lower to SHLD/SHRD for slow machines - fixes a weird issue where funnel shift gets expanded but then X86ISelLowering's combineOr sees the optsize and combines to SHLD/SHRD, but now with the modulo amount guard...... llvm-svn: 349285	2018-12-15 19:43:44 +00:00
Kamil Rytarowski	21e270a479	Add NetBSD support in needsRuntimeRegistrationOfSectionRange. Use linker script magic to get data/cnts/name start/end. llvm-svn: 349277	2018-12-15 16:51:35 +00:00
Kamil Rytarowski	15ae738bc8	Register kASan shadow offset for NetBSD/amd64 The NetBSD x86_64 kernel uses the 0xdfff900000000000 shadow offset. llvm-svn: 349276	2018-12-15 16:32:41 +00:00
Dinar Temirbulatov	8c8724dd0d	[CodeGen] Enhance machine PHIs optimization Summary: Make the machine PHI optimization work for a single-value register taken from several different copies. This is the first step to fix PR38917. This change gets rid of redundant PHIs (see opt_phis2.mir test) to make the subsequent optimizations (like CSE) possible and simpler. For instance, before this patch the code like this: %b = COPY %z ... %a = PHI %bb1, %a; %bb2, %b could be optimized to: %a = %b but the code like this: %c = COPY %z ... %b = COPY %z ... %a = PHI %bb1, %a; %bb2, %b; %bb3, %c would remain unchanged. With this patch the latter case will be optimized: %a = %z. Committed on behalf of: Anton Afanasyev anton.a.afanasyev@gmail.com Reviewers: RKSimon, MatzeB Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54839 llvm-svn: 349271	2018-12-15 14:37:01 +00:00
Simon Pilgrim	9831d4058c	Fix -Wunused-variable warning. NFCI. llvm-svn: 349265	2018-12-15 12:25:22 +00:00
Simon Pilgrim	1e1fd9c761	[TargetLowering] Add ISD::OR + ISD::XOR handling to SimplifyDemandedVectorElts Differential Revision: https://reviews.llvm.org/D55600 llvm-svn: 349264	2018-12-15 11:36:36 +00:00
Florian Hahn	abe32c9125	[SILoadStoreOptimizer] Use std::abs to avoid truncation. Using regular abs() causes the following warning error: absolute value function 'abs' given an argument of type 'int64_t' (aka 'long') but has parameter of type 'int' which may cause truncation of value [-Werror,-Wabsolute-value] (uint32_t)abs(Dist) > MaxDist) { ^ lib/Target/AMDGPU/SILoadStoreOptimizer.cpp:1369:19: note: use function 'std::abs' instead which causes a bot to fail: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/18284/steps/bootstrap%20clang/logs/stdio llvm-svn: 349224	2018-12-15 01:32:58 +00:00
Craig Topper	1fc257d97f	[X86] Rename hasNoSignedComparisonUses to hasNoSignFlagUses. Add the instructions that only modify the O flag to the waiver list. The only caller of this turns CMP with 0 into TEST. CMP with 0 and TEST both set OF to 0 so we should have no issues with instructions that only use OF. Though I don't think there's any reason we would read just OF after a compare with 0 anyway. So this probably isn't an observable change. llvm-svn: 349223	2018-12-15 01:07:19 +00:00
Craig Topper	5c304eac41	[X86] Make hasNoCarryFlagUses/hasNoSignedComparisonUses take an SDValue that indicates which result is the flag result. NFCI hasNoCarryFlagUses hardcoded that the flag result is 1 and used that to filter which uses were of interest. hasNoSignedComparisonUses just assumes the only result is flags and checks whether any user of the node is a CopyToReg instruction. After this patch we now do a result number check in both and rely on the caller to provide the result number. This shouldn't change behavior it was just an odd difference between the two functions that I noticed. llvm-svn: 349222	2018-12-15 01:07:16 +00:00
Heejin Ahn	feef720bb8	[WebAssembly] Check if the section order is correct Summary: This patch checks if the section order is correct when reading a wasm object file in `WasmObjectFile` and converting YAML to wasm object in yaml2wasm. (It is not possible to check when reading YAML because it is handled exclusively by the YAML reader.) This checks the ordering of all known sections (core sections + known custom sections). This also adds a section ID for the DataCount section, which is scheduled to be added in the near future. Reviewers: sbc100 Subscribers: dschuff, mgorny, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D54924 llvm-svn: 349221	2018-12-15 00:58:12 +00:00
Florian Hahn	c214bc2b8d	[NewGVN] Update use counts for SSA copies when replacing them by their operands. The current code relies on LeaderUseCount to determine if we can remove an SSA copy, but in that case the LeaderUseCount does not refer to the SSA copy. If an SSA copy is a dominating leader, we use the operand as dominating leader instead. This means we removed a user of an SSA copy and we should decrement its use count, so we can remove the SSA copy once it becomes dead. Fixes PR38804. Reviewers: efriedma, davide Reviewed By: davide Differential Revision: https://reviews.llvm.org/D51595 llvm-svn: 349217	2018-12-15 00:32:38 +00:00
Vedant Kumar	9d1827331f	[Util] Refer to [s\|z]exts of args when converting dbg.declares (fix PR35400) When converting dbg.declares, if the described value is a [s\|z]ext, refer to the ext directly instead of referring to its operand. This fixes a narrowing bug (the debugger got the sign of a variable wrong, see llvm.org/PR35400). The main reason to refer to the ext's operand was that an optimization may remove the ext itself, leading to a dropped variable. Now that InstCombine has been taught to use replaceAllDbgUsesWith (r336451), this is less of a concern. Other passes can/should adopt this API as needed to fix dropped variable bugs. Differential Revision: https://reviews.llvm.org/D51813 llvm-svn: 349214	2018-12-15 00:03:33 +00:00
Artem Belevich	6d74bd638a	[NVPTX] Lower instructions that expand into libcalls. The change is an effort to split and refactor abandoned D34708 into smaller parts. Here the behaviour of unsupported instructions is changed to match the behaviour of explicit intrinsics calls. Currently LLVM crashes with: > Assertion getInstruction() && "Not a call or invoke instruction!" failed. With this patch LLVM produces a more sensible error message: > Cannot select: ... i32 = ExternalSymbol'__foobar' Author: Denys Zariaiev <denys.zariaiev@gmail.com> Differential Revision: https://reviews.llvm.org/D55145 llvm-svn: 349213	2018-12-14 23:53:06 +00:00
David Blaikie	560ff35592	DebugInfo: Avoid using split DWARF when the split unit would be empty. In ThinLTO many split CUs may be effectively empty because of the lack of support for cross-unit references in split DWARF. Using a split unit in those cases is just a waste/overhead - and turned out to be one contributor to a significant symbolizer performance issue when global variable debug info was being imported (see r348416 for the primary fix) due to symbolizers seeing CUs with no ranges, assuming there might still be addresses covered and walking into the split CU to see if there are any ranges (when that split CU was in a DWP file, that meant loading the DWP and its index, the index was extra large because of all these fractured/empty CUs... and so was very expensive to load). (the 3rd fix which will follow, is to assume that a CU with no ranges is empty rather than merely missing its CU level range data - and to not walk into its DIEs (split or otherwise) in search of address information that is generally not present) llvm-svn: 349207	2018-12-14 22:44:46 +00:00
Reid Kleckner	5bf71d1127	[codeview] Add begin/endSymbolRecord helpers, NFC Previously beginning a symbol record was excessively verbose. Now it's a bit simpler. This follows the same pattern as begin/endCVSubsection. llvm-svn: 349205	2018-12-14 22:40:28 +00:00
David Blaikie	61c127c1ad	DebugInfo: Move addAddrBase from DwarfUnit to DwarfCompileUnit Only CUs need an address table reference. llvm-svn: 349203	2018-12-14 22:34:03 +00:00
Krzysztof Parzyszek	26d994f56e	[Hexagon] Add patterns for shifts of v2i16 This fixes https://llvm.org/PR39983. llvm-svn: 349202	2018-12-14 22:33:48 +00:00
Volkan Keles	574d737e06	[GlobalISel] LegalizerHelper: Implement fewerElementsVector for G_LOAD/G_STORE Reviewers: aemerson, dsanders, bogner, paquette, aditya_nandakumar Reviewed By: dsanders Subscribers: rovka, kristof.beyls, javed.absar, tschuett, llvm-commits Differential Revision: https://reviews.llvm.org/D53728 llvm-svn: 349200	2018-12-14 22:11:20 +00:00
Krzysztof Parzyszek	c0fc0a9775	[Hexagon] Use IMPLICIT_DEF to any-extend 32-bit values to 64 bits llvm-svn: 349199	2018-12-14 22:05:44 +00:00
Farhana Aleen	ce095c564a	[AMDGPU] Promote constant offset to the immediate by finding a new base with 13bit constant offset from the nearby instructions. Summary: Promote constant offset to immediate by recomputing the relative 13bit offset from nearby instructions. E.g. s_movk_i32 s0, 0x1800 v_add_co_u32_e32 v0, vcc, s0, v2 v_addc_co_u32_e32 v1, vcc, 0, v6, vcc s_movk_i32 s0, 0x1000 v_add_co_u32_e32 v5, vcc, s0, v2 v_addc_co_u32_e32 v6, vcc, 0, v6, vcc global_load_dwordx2 v[5:6], v[5:6], off global_load_dwordx2 v[0:1], v[0:1], off => s_movk_i32 s0, 0x1000 v_add_co_u32_e32 v5, vcc, s0, v2 v_addc_co_u32_e32 v6, vcc, 0, v6, vcc global_load_dwordx2 v[5:6], v[5:6], off global_load_dwordx2 v[0:1], v[5:6], off offset:2048 Author: FarhanaAleen Reviewed By: arsenm, rampitec Subscribers: llvm-commits, AMDGPU Differential Revision: https://reviews.llvm.org/D55539 llvm-svn: 349196	2018-12-14 21:13:14 +00:00
Krzysztof Parzyszek	6b01d35497	[SDAG] Ignore chain operand in REG_SEQUENCE when emitting instructions llvm-svn: 349186	2018-12-14 20:14:12 +00:00
Evandro Menezes	ea9d90083f	[AArch64] Simplify the scheduling predicates (NFC) The instruction encodings make it unnecessary to distinguish extended W-form from X-form instructions. llvm-svn: 349185	2018-12-14 20:04:58 +00:00
Michael Kruse	ea9ef34558	[TransformWarning] Do not warn about missed transformations in optnone functions. Optimization transformations are intentionally disabled by the 'optnone' function attribute. Therefore do not warn if transformation metadata is still present. Using the legacy pass manager structure, the `skipFunction` method takes care of the optnone attribute (already called before this patch). For the new pass manager, there is no equivalent, so we check for the 'optnone' attribute manually. Differential Revision: https://reviews.llvm.org/D55690 llvm-svn: 349184	2018-12-14 19:45:43 +00:00
Michael Kruse	5948b7f30f	[Transforms] Preserve metadata when converting invoke to call. The `changeToCall` function did not preserve the invoke's metadata. Currently, there is probably no metadata that depends on being applied on a CallInst or InvokeInst. Therefore we can replace the instruction's metadata. This fixes http://llvm.org/PR39994 Suggested-by: Moritz Kreutzer <moritz.kreutzer@siemens.com> Differential Revision: https://reviews.llvm.org/D55666 llvm-svn: 349170	2018-12-14 18:15:11 +00:00
Zachary Turner	8fb9a71dde	[MS Demangler] Fail gracefully on invalid pointer types. Once we detect a 'P', we know a pointer type is upcoming, so we make some assumptions about the output that follows. If those assumptions didn't hold, we would assert. Instead, we should fail gracefully and propagate the error up. llvm-svn: 349169	2018-12-14 18:10:13 +00:00
Daniel Sanders	629db5d8e5	[globalisel][combiner] Make the CombinerChangeObserver a MachineFunction::Delegate Summary: This allows us to register it with the MachineFunction delegate and be notified automatically about erasure and creation of instructions. However, we still need explicit notification for modifications such as those caused by setReg() or replaceRegWith(). There is a catch with this though. The notification for creation is delivered before any operands can be added. While appropriate for scheduling combiner work, this is unfortunate for debug output since an opcode by itself doesn't provide sufficient information on what happened. As a result, the work list remembers the instructions (when debug output is requested) and emits a more complete dump later. Another nit is that the MachineFunction::Delegate provides const pointers which is inconvenient since we want to use it to schedule future modification. To resolve this GISelWorkList now has an optional pointer to the MachineFunction which describes the scope of the work it is permitted to schedule. If a given MachineInstr* is in this function then it is permitted to schedule work to be performed on the MachineInstr's. An alternative to this would be to remove the const from the MachineFunction::Delegate interface, however delegates are not permitted to modify the MachineInstr's they receive. In addition to this, the observer has three interface changes. * erasedInstr() is now erasingInstr() to indicate it is about to be erased but still exists at the moment. * changingInstr() and changedInstr() have been added to report changes before and after they are made. This allows us to trace the changes in the debug output. * As a convenience changingAllUsesOfReg() and finishedChangingAllUsesOfReg() will report changingInstr() and changedInstr() for each use of a given register. This is primarily useful for changes caused by MachineRegisterInfo::replaceRegWith(). With this in place, both combine rules have been updated to report their changes to the observer. Finally, make some cosmetic changes to the debug output and make Combiner and CombinerHelp Reviewers: aditya_nandakumar, bogner, volkan, rtereshin, javed.absar Reviewed By: aditya_nandakumar Subscribers: mgorny, rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D52947 llvm-svn: 349167	2018-12-14 17:50:14 +00:00
Zachary Turner	2cd3286ed2	Fix a crash in llvm-undname with invalid types. llvm-svn: 349165	2018-12-14 17:43:56 +00:00
Ehsan Amiri	de1742c284	NFC. Adding an empty line to test the updated commit credentials. llvm-svn: 349158	2018-12-14 16:19:02 +00:00
Scott Linder	de6beb02a5	Implement -frecord-command-line (-frecord-gcc-switches) Implement options in clang to enable recording the driver command-line in an ELF section. Implement a new special named metadata, llvm.commandline, to support frontends embedding their command-line options in IR/ASM/ELF. This differs from the GCC implementation in some key ways: * In GCC there is only one command-line possible per compilation-unit, in LLVM it mirrors llvm.ident and multiple are allowed. * In GCC individual options are separated by NULL bytes, in LLVM entire command-lines are separated by NULL bytes. The advantage of the GCC approach is to clearly delineate options in the face of embedded spaces. The advantage of the LLVM approach is to support merging multiple command-lines unambiguously, while handling embedded spaces with escaping. Differential Revision: https://reviews.llvm.org/D54487 Clang Differential Revision: https://reviews.llvm.org/D54489 llvm-svn: 349155	2018-12-14 15:38:15 +00:00
John Brawn	1d0d86ae40	[RegAllocGreedy] IMPLICIT_DEF values shouldn't prefer registers It costs nothing to spill an IMPLICIT_DEF value (the only spill code that's generated is a KILL of the value), so when creating split constraints if the live-out value is IMPLICIT_DEF the exit constraint should be DontCare instead of PrefReg. Differential Revision: https://reviews.llvm.org/D55652 llvm-svn: 349151	2018-12-14 14:07:57 +00:00
Diana Picus	02c8343c75	[ARM GlobalISel] Thumb2: casts between int and ptr Mark as legal and add tests. Nothing special to do. llvm-svn: 349147	2018-12-14 13:45:38 +00:00
Diana Picus	813af0d283	[ARM GlobalISel] Minor refactoring. NFCI Refactor the ARMInstructionSelector to cache some opcodes in the constructor instead of checking all the time if we're in ARM or Thumb mode. llvm-svn: 349143	2018-12-14 12:37:24 +00:00
Diana Picus	14dc3b2959	[ARM GlobalISel] Allow simple binary ops in Thumb2 Mark G_ADD, G_SUB, G_MUL, G_AND, G_OR and G_XOR as legal for both ARM and Thumb2. Extract the legalizer tests for these opcodes into another file. Add tests for the instruction selector. llvm-svn: 349142	2018-12-14 11:58:14 +00:00
Craig Topper	257ce3871e	[DAGCombiner][X86] Prevent visitSIGN_EXTEND from returning N when (sext (setcc)) already has the target desired type for the setcc Summary: If the setcc already has the target desired type we can reach the getSetCC/getSExtOrTrunc after the MatchingVecType check with the exact same types as the nodes we started with. This causes VsetCC to be CSEd to N0 and the getSExtOrTrunc will CSE to N. When we return N, the caller will think that meant we called CombineTo and did our own worklist management. But that's not what happened. This prevents target hooks from being called for the node. To fix this, I've now returned SDValue if the setcc is already the desired type. But to avoid some regressions in X86 I've had to disable one of the target combines that wasn't being reached before in the case of a (sext (setcc)). If we get vector widening legalization enabled that entire function will be deleted anyway so hopefully this is only for the short term. Reviewers: RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55459 llvm-svn: 349137	2018-12-14 08:28:24 +00:00
Fangrui Song	d2ed5be815	[Object] Rename getRelrRelocationType to getRelativeRelocationType Summary: The two utility functions were added in D47919 to support SHT_RELR. However, these are just relative relocation types and need not be named Relr. Reviewers: phosek, dberris Reviewed By: dberris Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D55691 llvm-svn: 349133	2018-12-14 07:46:58 +00:00
Petr Hosek	0c02306dbd	[llvm-xray] Use correct variable name This fixes the compiler error introduced in r349129. llvm-svn: 349130	2018-12-14 06:06:19 +00:00
Petr Hosek	27e2f2014a	[llvm-xray] Store offset pointers in temporaries DataExtractor::getU64 modifies the OffsetPtr, which we also pass to RelocateOrElse, and this breaks on Windows. This addresses the issue introduced in r349120. Differential Revision: https://reviews.llvm.org/D55689 llvm-svn: 349129	2018-12-14 05:56:20 +00:00
Petr Hosek	493a082483	[llvm-xray] Support for PIE When the instrumented binary is linked as PIE, we need to apply the relative relocations to sleds. This is handled by the dynamic linker at runtime, but when processing the file we have to do it ourselves. Differential Revision: https://reviews.llvm.org/D55542 llvm-svn: 349120	2018-12-14 01:37:56 +00:00
Alex Lorenz	afa75d7843	[macho] save the SDK version stored in module metadata into the version min and build version load commands in the object file This commit introduces a new metadata node called "SDK Version". It will be set by the frontend to mark the platform SDK (macOS/iOS/etc) version which was used during that particular compilation. This node is used when machine code is emitted, by either saving the SDK version into the appropriate macho load command (version min/build version), or by emitting the assembly for these load commands with the SDK version specified as well. The assembly for both load commands is extended by allowing it to contain the sdk_version X, Y [, Z] trailing directive to represent the SDK version respectively. rdar://45774000 Differential Revision: https://reviews.llvm.org/D55612 llvm-svn: 349119	2018-12-14 01:14:10 +00:00
Sanjay Patel	093ab45d4c	[DAGCombiner] clean up visitEXTRACT_VECTOR_ELT This isn't quite NFC, but I don't know how to expose any outward diffs from these changes. Mostly, this was confusing because it used 'VT' to refer to the operand type rather than the usual type of the input node. There's also a large block at the end that is dedicated solely to matching loads, but that wasn't obvious. This could probably be split up into separate functions to make it easier to see. It's still not clear to me when we make certain transforms because the legality and constant conditions are intertwined in a way that might be improved. llvm-svn: 349095	2018-12-14 00:09:08 +00:00
Craig Topper	178abc59ac	[X86] Demote EmitTest to a helper function of EmitCmp. Route all callers except EmitCmp through EmitCmp. This requires the two callers to manifest a 0 to make EmitCmp call EmitTest. I'm looking into changing how we combine TEST and flag setting instructions to not be part of lowering. And instead be part of DAG combine or isel. Which will mean EmitTest will probably become gutted and maybe disappear entirely. llvm-svn: 349094	2018-12-13 23:55:30 +00:00
Evgeniy Stepanov	eb238ecf0f	Revert "[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6)" Breaks sanitizer-android buildbot. This reverts commit af8443a984c3b491c9ca2996b8d126ea31e5ecbe. llvm-svn: 349092	2018-12-13 23:47:50 +00:00
Evandro Menezes	6fe51ac973	[AArch64] Fix Exynos predicates (NFC) Fix the logic in the definition of the `ExynosShiftExPred` as a more specific version of `ExynosShiftPred`. But, since `ExynosShiftExPred` is not used yet, this change has NFC. llvm-svn: 349091	2018-12-13 23:19:46 +00:00
Wei Mi	66c6c5abea	[SampleFDO] handle ProfileSampleAccurate when initializing function entry count ProfileSampleAccurate is used to indicate the profile has exact match to the code to be optimized. Previously ProfileSampleAccurate was handled in ProfileSummaryInfo::isColdCallSite and ProfileSummaryInfo::isColdBlock. A better solution is to initialize function entry count to 0 when ProfileSampleAccurate is true, so we don't have to handle ProfileSampleAccurate in multiple places. Differential Revision: https://reviews.llvm.org/D55660 llvm-svn: 349088	2018-12-13 21:51:42 +00:00
Aakanksha Patil	bc568766b2	Revert r348971: [AMDGPU] Support for "uniform-work-group-size" attribute This patch breaks RADV (and probably RadeonSI as well) llvm-svn: 349084	2018-12-13 21:23:12 +00:00
Matt Arsenault	934e534c47	AMDGPU/GlobalISel: Legalize/regbankselect block_addr llvm-svn: 349081	2018-12-13 20:34:15 +00:00
Nikita Popov	dc73a6edde	Reapply "[MemCpyOpt] memset->memcpy forwarding with undef tail" Currently memcpyopt optimizes cases like memset(a, byte, N); memcpy(b, a, M); to memset(a, byte, N); memset(b, byte, M); if M <= N. Often this allows further simplifications down the line, which drop the first memset entirely. This patch extends this optimization for the case where M > N, but we know that the bytes a[N..M] are undef due to alloca/lifetime.start. This situation arises relatively often for Rust code, because Rust does not initialize trailing structure padding and loves to insert redundant memcpys. This also fixes https://bugs.llvm.org/show_bug.cgi?id=39844. The previous version of this patch did not perform dependency checking properly: While the dependency is checked at the position of the memset, the used size must be that of the memcpy. Previously the size of the memset was used, which missed modification in the region MemSetSize..CopySize, resulting in miscompiles. The added tests cover variations of this issue. Differential Revision: https://reviews.llvm.org/D55120 llvm-svn: 349078	2018-12-13 20:04:27 +00:00
Easwaran Raman	5a7056fa03	[ThinLTO] Compute synthetic function entry count Summary: This patch computes the synthetic function entry count on the whole program callgraph (based on module summary) and writes the entry counts to the summary. After function importing, this count gets attached to the IR as metadata. Since it adds a new field to the summary, this bumps up the version. Reviewers: tejohnson Subscribers: mehdi_amini, inglorion, llvm-commits Differential Revision: https://reviews.llvm.org/D43521 llvm-svn: 349076	2018-12-13 19:54:27 +00:00
Mircea Trofin	41c729e78e	[llvm] Address base discriminator overflow in X86DiscriminateMemOps Summary: Macros are expanded on a single line. In case of large expansions, with sufficiently many instructions with memory operands (and when -fdebug-info-for-profiling is requested), we may be unable to generate new base discriminator values - new values overflow (base discriminators may not be larger than 2^12). This CL warns instead of asserting in such a case. A subsequent CL will add APIs to check for overflow before creating new debug info. See https://bugs.llvm.org/show_bug.cgi?id=39890 Reviewers: davidxl, wmi, gbedwell Reviewed By: davidxl Subscribers: aprantl, llvm-commits Differential Revision: https://reviews.llvm.org/D55643 llvm-svn: 349075	2018-12-13 19:40:59 +00:00
Jordan Rupprecht	4888c4aba5	[llvm-size][libobject] Add explicit "inTextSegment" methods similar to "isText" section methods to calculate size correctly. Summary: llvm-size uses "isText()" etc. which seem to indicate whether the section contains code-like things, not whether or not it will actually go in the text segment when in a fully linked executable. The unit test added (elf-sizes.test) shows some types of sections that cause discrepancies versus the GNU size tool. llvm-size is not correctly reporting sizes of things mapping to text/data segments, at least for ELF files. This fixes pr38723. Reviewers: echristo, Bigcheese, MaskRay Reviewed By: MaskRay Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D54369 llvm-svn: 349074	2018-12-13 19:40:12 +00:00
Davide Italiano	9737096bb1	[LoopUtils] Use i32 instead of `void`. The actual type of the first argument of the @dbg intrinsic doesn't really matter as we're setting it to `undef`, but the bitcode reader is picky about `void` types. llvm-svn: 349069	2018-12-13 18:37:23 +00:00
Francis Visoiu Mistrih	91e69d8a92	[MachO][TLOF] Add support for local symbols in the indirect symbol table On 32-bit archs, before, we would assume that an indirect symbol will never have local linkage. This can lead to miscompiles where the symbol's value would be 0 and the linker would use that value, because the indirect symbol table would contain the value `INDIRECT_SYMBOL_LOCAL` for that specific symbol. Differential Revision: https://reviews.llvm.org/D55573 llvm-svn: 349060	2018-12-13 17:23:30 +00:00
Sanjay Patel	791ae69afe	[DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract; 2nd try This is a retry of rL349051 (reverted at rL349056). I changed the check for dead-ness from number of uses to an opcode test for DELETED_NODE based on existing similar code. Differential Revision: https://reviews.llvm.org/D55655 llvm-svn: 349058	2018-12-13 17:05:01 +00:00
Simon Pilgrim	b5aaa673c6	[X86][SSE] Add SSE vector imm/var shift support to SimplifyDemandedVectorEltsForTargetNode llvm-svn: 349057	2018-12-13 16:39:29 +00:00
Sanjay Patel	c56f5728ee	revert rL349051: [DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract This causes an address sanitizer bot failure: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/27187/steps/check-llvm%20asan/logs/stdio llvm-svn: 349056	2018-12-13 16:32:44 +00:00
Simon Pilgrim	b0b2f1503a	[X86][SSE] Fix all remaining modulo vector rotation amounts (PR38243) There's still a couple of minor SimplifyDemandedElts regressions in some of the shift amount splats that will be fixed in future patches. llvm-svn: 349052	2018-12-13 15:50:31 +00:00
Sanjay Patel	a7b115b392	[DAGCombiner] after simplifying demanded elements of vector operand of extract, revisit the extract Differential Revision: https://reviews.llvm.org/D55655 llvm-svn: 349051	2018-12-13 15:44:26 +00:00
Daniel Cederman	77611426e1	[Sparc] Add membar assembler tags Summary: The Sparc V9 membar instruction can enforce different types of memory orderings depending on the value in its immediate field. In the architectural manual the type is selected by combining different assembler tags into a mask. This patch adds support for these tags. Reviewers: jyknight, venkatra, brad Reviewed By: jyknight Subscribers: fedor.sergeev, jrtc27, jfb, llvm-commits Differential Revision: https://reviews.llvm.org/D53491 llvm-svn: 349048	2018-12-13 15:29:12 +00:00
Simon Pilgrim	ba91ff4a86	[X86][SSE] Fix modulo rotation amounts for v8i16/v16i16/v4i32 (PR38243) llvm-svn: 349047	2018-12-13 15:23:09 +00:00
Daniel Cederman	b5d284408e	[Sparc] Use float register for integer constrained with "f" in inline asm Summary: Constraining an integer value to a floating point register using "f" causes an llvm_unreachable to trigger. This patch allows i32 integers to be placed in a single precision float register and i64 integers to be placed in a double precision float register. This matches the behavior of GCC. For other types the llvm_unreachable is removed to instead trigger an error message that points out the offending line. Reviewers: jyknight, venkatra Reviewed By: jyknight Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits Differential Revision: https://reviews.llvm.org/D51614 llvm-svn: 349045	2018-12-13 15:13:29 +00:00
Jinsong Ji	c7b43b94ce	[PowerPC][NFC] Sorting out Pseudo related classes to avoid confusion There are several Pseudo in PowerPC backend. eg: * ISel Pseudo-instructions , which has let usesCustomInserter=1 in td ExpandISelPseudos -> EmitInstrWithCustomInserter will deal with them. * Post-RA pseudo instruction, which has let isPseudo = 1 in td, or Standard pseudo (SUBREG_TO_REG,COPY etc.) ExpandPostRAPseudos -> expandPostRAPseudo will expand them * Multi-instruction pseudo operations will expand them PPCAsmPrinter::EmitInstruction * Pseudo instruction in CodeEmitter, which has encoding of 0. Currently, in td files, especially PPCInstrVSX.td, we did not distinguish Post-RA pseudo instruction and Pseudo instruction in CodeEmitter very clearly. This patch is to * Rename Pseudo<> class to PPCEmitTimePseudo, which means encoding of 0 in CodeEmitter * Introduce new class PPCPostRAExpPseudo <> for previous PostRA Pseudo * Introduce new class PPCCustomInserterPseudo <> for previous Isel Pseudo Differential Revision: https://reviews.llvm.org/D55143 llvm-svn: 349044	2018-12-13 15:12:57 +00:00
Daniel Sanders	b51480ff3e	[mir] Fix uninitialized variable in r349035 noticed by clang-atom-d525-fedora-rel and 3 other bots llvm-svn: 349043	2018-12-13 15:05:27 +00:00
Simon Pilgrim	7c84f7ae3a	[X86][SSE] Merge the vXi16/vXi32 vector rotation expansion cases. NFCI. Merged the repeated code into a single if(). llvm-svn: 349040	2018-12-13 14:51:28 +00:00
Jonas Paulsson	e79b1b986d	[SystemZ] Pass copy-hinted regs first from getRegAllocationHints(). When computing register allocation hints for a GRX32Bit register, make sure that any of the hinted registers that are also copy hints are returned first in the list. Review: Ulrich Weigand. llvm-svn: 349037	2018-12-13 14:37:05 +00:00
Daniel Sanders	9f3cf55e63	[mir] Serialize DILocation inline when not possible to use a metadata reference Summary: Sometimes MIR-level passes create DILocations that were not present in the LLVM-IR. For example, it may merge two DILocations together to produce a DILocation that points to line 0. Previously, the address of these DILocations were printed which prevented the MIR from being read back into LLVM. With this patch, DILocations will use metadata references where possible and fall back on serializing them inline like so: MOV32mr %stack.0.x.addr, 1, _, 0, _, %0, debug-location !DILocation(line: 1, scope: !15) Reviewers: aprantl, vsk, arphaman Reviewed By: aprantl Subscribers: probinson, llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D55243 llvm-svn: 349035	2018-12-13 14:25:27 +00:00
Simon Pilgrim	320fd7383f	[X86][BWI] Don't custom lower vXi8 rotations. We always expand to shifts anyhow - test changes are just different scheduling only. llvm-svn: 349034	2018-12-13 13:44:33 +00:00
Chen Zheng	9c6fa536e0	[PowerPC] intrinsic llvm.eh.sjlj.setjmp should not have flag isBarrier. Differential Revision: https://reviews.llvm.org/D55499 llvm-svn: 349029	2018-12-13 12:25:20 +00:00
Simon Pilgrim	ab973a45b9	[DAGCombine] Moved X86 rotate_amount % bitwidth == 0 early out to DAGCombiner Remove common code from custom lowering (code is still safe if somehow a zero value gets used). llvm-svn: 349028	2018-12-13 12:23:32 +00:00
Diana Picus	99cd644b6c	[ARM GlobalISel] Support exts and truncs for Thumb2 Mark G_SEXT, G_ZEXT and G_ANYEXT to 32 bits as legal and add support for them in the instruction selector. This uses handwritten code again because the patterns that are generated with TableGen are tuned for what the DAG combiner would produce and not for simple sext/zext nodes. Luckily, we only need to update the opcodes to use the Thumb2 variants, everything else can be reused from ARM. llvm-svn: 349026	2018-12-13 12:06:54 +00:00
Simon Pilgrim	77fc551d1a	[TargetLowering] Add ISD::ROTL/ROTR vector expansion Move existing rotation expansion code into TargetLowering and set it up for vectors as well. Ideally this would share more of the funnel shift expansion, but we handle the shift amount modulo quite differently at the moment. Begun removing x86 vector rotate custom lowering to use the expansion. llvm-svn: 349025	2018-12-13 11:20:48 +00:00
Alex Bradbury	919f5fb8ca	[RISCV] Add support for the various RISC-V FMA instruction variants Adds support for the various RISC-V FMA instructions (fmadd, fmsub, fnmsub, fnmadd). The criteria for choosing whether a fused add or subtract is used, as well as whether the product is negated or not, is whether some of the arguments to the llvm.fma.* intrinsic are negated or not. In the tests, extraneous fadd instructions were added to avoid the negation being performed using a xor trick, which prevented the proper FMA forms from being selected and thus tested. The FMA instruction patterns might seem incorrect (e.g., fnmadd: -rs1 * rs2 - rs3), but they should be correct. The misleading names were inherited from MIPS, where the negation happens after computing the sum. The llvm.fmuladd.* intrinsics still do not generate RISC-V FMA instructions, as that depends on TargetLowering::isFMAFasterthanFMulAndFAdd. Some comments in the test files about what type of instructions are there tested were updated, to better reflect the current content of those test files. Differential Revision: https://reviews.llvm.org/D54205 Patch by Luís Marques. llvm-svn: 349023	2018-12-13 10:49:05 +00:00
Arnaud A. de Grandmaison	dfe861087d	[AArch64] Catch some more CMN opportunities. Fixes https://bugs.llvm.org/show_bug.cgi?id=33486 llvm-svn: 349022	2018-12-13 10:31:32 +00:00
Clement Courbet	76f4ae1092	[CodeGen] Allow memcpy/memset to generate small overlapping stores. Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 349016	2018-12-13 09:56:19 +00:00
Vitaly Buka	a257639a69	[asan] Don't check ODR violations for particular types of globals Summary: private and internal: should not trigger ODR at all. unnamed_addr: current ODR checking approach fails and reports false violations if a linker merges such globals. linkonce_odr, weak_odr: could cause similar problems and they are already not instrumented for ELF. Reviewers: eugenis, kcc Subscribers: kubamracek, hiraditya, llvm-commits Differential Revision: https://reviews.llvm.org/D55621 llvm-svn: 349015	2018-12-13 09:47:39 +00:00
Matt Arsenault	577b9fc543	AMDGPU/GlobalISel: Legalize f64 fadd/fmul llvm-svn: 349014	2018-12-13 08:27:48 +00:00
Matt Arsenault	f38f483bef	AMDGPU/GlobalISel: RegBankSelect some simple operations llvm-svn: 349012	2018-12-13 08:23:51 +00:00
Craig Topper	a048d58de7	[X86] Remove assert leftover from when i1 was a legal type. Add more accurate assert. NFC llvm-svn: 349007	2018-12-13 06:14:25 +00:00
Stanislav Mekhanoshin	d933c2ced7	[AMDGPU] Fix build failure, second attempt Some compilers complain that variable is captured and some complain when it is not. Switch to [&]. llvm-svn: 349006	2018-12-13 05:52:11 +00:00
Stanislav Mekhanoshin	5225746e03	[AMDGPU] Fix build failure Fixed error 'lambda capture 'CondReg' is not required to be captured for this use'. llvm-svn: 349005	2018-12-13 05:21:25 +00:00
Stanislav Mekhanoshin	6071e1aa58	[AMDGPU] Simplify negated condition Optimize sequence: %sel = V_CNDMASK_B32_e64 0, 1, %cc %cmp = V_CMP_NE_U32 1, %1 $vcc = S_AND_B64 $exec, %cmp S_CBRANCH_VCC[N]Z => $vcc = S_ANDN2_B64 $exec, %cc S_CBRANCH_VCC[N]Z It is the negation pattern inserted by DAGCombiner::visitBRCOND() in the rebuildSetCC(). Differential Revision: https://reviews.llvm.org/D55402 llvm-svn: 349003	2018-12-13 03:17:40 +00:00
David L. Jones	54c01ad6a9	Revert r348645 - "[MemCpyOpt] memset->memcpy forwarding with undef tail" This revision caused truncated memsets for structs with padding. See: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20181210/610520.html llvm-svn: 349002	2018-12-13 03:15:11 +00:00
Davide Italiano	8ee59ca653	[LoopUtils] Prefer a set over a map. NFCI. llvm-svn: 348999	2018-12-13 01:11:52 +00:00
Shoaib Meenai	96929fdd42	[Support] Fix FileNameLength passed to SetFileInformationByHandle The rename_internal function used for Windows has a minor bug where the filename length is passed as a character count instead of a byte count. Windows internally ignores this field, but other tools that hook NT api's may use the documented behavior: MSDN documentation specifying the size should be in bytes: https://docs.microsoft.com/en-us/windows/desktop/api/winbase/ns-winbase-_file_rename_info Patch by Ben Hillis. Differential Revision: https://reviews.llvm.org/D55624 llvm-svn: 348995	2018-12-13 00:08:25 +00:00
Daniel Sanders	d001e0e0f4	[globalisel] Add GISelChangeObserver::changingInstr() Summary: In addition to knowing that an instruction is changed. It's also useful to know when it's about to change. For example, it might print the instruction so you can track the changes in a debug log, it might remove it from some queue while it's being worked on, or it might want to change several instructions as a single transaction and act on all the changes at once. Added changingInstr() to all existing uses of changedInstr() Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55623 llvm-svn: 348992	2018-12-12 23:48:13 +00:00
Sam Clegg	03801256d8	[WebAssembly] Update dylink section parsing This updates the format of the dylink section in accordance with recent "spec" change: https://github.com/WebAssembly/tool-conventions/pull/77 Differential Revision: https://reviews.llvm.org/D55609 llvm-svn: 348989	2018-12-12 23:40:58 +00:00
Davide Italiano	744c3c327f	[LoopDeletion] Update debug values after loop deletion. When loops are deleted, we don't keep track of variables modified inside the loops, so the DI will contain the wrong value for these. e.g. int b() { int i; for (i = 0; i < 2; i++) ; patatino(); return a; -> 6 patatino(); 7 return a; 8 } 9 int main() { b(); } (lldb) frame var i (int) i = 0 We mark instead these values as unavailable inserting a @llvm.dbg.value(undef to make sure we don't end up printing an incorrect value in the debugger. We could consider doing something fancier, for, e.g. constants, in the future. PR39868. rdar://problem/46418795) Differential Revision: https://reviews.llvm.org/D55299 llvm-svn: 348988	2018-12-12 23:32:35 +00:00
Nikita Popov	36e03ac6ee	[InstCombine] Fix negative GEP offset evaluation for 32-bit pointers This fixes https://bugs.llvm.org/show_bug.cgi?id=39908. The evaluateGEPOffsetExpression() function simplifies GEP offsets for use in comparisons against zero, basically by converting X*Scale+Offset==0 to X+Offset/Scale==0 if Scale divides Offset. However, before this is done, Offset is masked down to the pointer size. This results in incorrect results for negative Offsets, because we basically end up dividing the 32-bit offset zero extended to 64 bits (rather than sign extended). Fix this by explicitly sign extending the truncated value. Differential Revision: https://reviews.llvm.org/D55449 llvm-svn: 348987	2018-12-12 23:19:03 +00:00
Ryan Prichard	e028c818f5	[hwasan] Android: Switch from TLS_SLOT_TSAN(8) to TLS_SLOT_SANITIZER(6) Summary: The change is needed to support ELF TLS in Android. See D55581 for the same change in compiler-rt. Reviewers: srhines, eugenis Reviewed By: eugenis Subscribers: srhines, llvm-commits Differential Revision: https://reviews.llvm.org/D55592 llvm-svn: 348983	2018-12-12 22:45:06 +00:00
Daniel Sanders	91dfdd5734	[globalisel] Rename GISelChangeObserver's erasedInstr() to erasingInstr() and related nits. NFC Summary: There's little of interest that can be done to an already-erased instruction. You can't inspect it, write it to a debug log, etc. It ought to be notification that we're about to erase it. Rename the function to clarify the timing of the event and reflect current usage. Also fixed one case where we were trying to print an erased instruction. Reviewers: aditya_nandakumar Reviewed By: aditya_nandakumar Subscribers: rovka, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55611 llvm-svn: 348976	2018-12-12 21:32:01 +00:00
Craig Topper	d1c61861dd	[X86] Don't emit MULX by default with BMI2 MULX has somewhat improved register allocation constraints compared to the legacy MUL instruction. Both output registers are encoded instead of fixed to EAX/EDX, but EDX is used as input. It also doesn't touch flags. Unfortunately, the encoding is longer. Preferring it whenever BMI2 is enabled is probably not optimal. Choosing it should somehow be a function of register allocation constraints like converting adds to three address. gcc and icc definitely don't pick MULX by default. Not sure what if any rules they have for using it. Differential Revision: https://reviews.llvm.org/D55565 llvm-svn: 348975	2018-12-12 21:21:31 +00:00
Aakanksha Patil	729309cc89	[AMDGPU] Support for "uniform-work-group-size" attribute Updated the annotate-kernel-features pass to support the propagation of uniform-work-group attribute from the kernel to the called functions. Once this pass is run, all kernels, even the ones which initially did not have the attribute, will be able to indicate whether or not they have uniform work group size depending on the value of the attribute. Differential Revision: https://reviews.llvm.org/D50200 llvm-svn: 348971	2018-12-12 20:49:17 +00:00
David Blaikie	023674a9e4	DebugInfo/DWARF: Pretty print subroutine types Doesn't handle varargs and other fun things, but it's a start. (also doesn't print these strictly as valid C++ when it's a pointer to function, it'll print as "void(int)*" instead of "void (*)(int)") llvm-svn: 348965	2018-12-12 19:53:03 +00:00
Scott Linder	f5b36e56fb	[AMDGPU] Emit MessagePack HSA Metadata for v3 code object Continue to present HSA metadata as YAML in ASM and when output by tools (e.g. llvm-readobj), but encode it in Messagepack in the code object. Differential Revision: https://reviews.llvm.org/D48179 llvm-svn: 348963	2018-12-12 19:39:27 +00:00
David Blaikie	3f8f004daf	DebugInfo/DWARF: Improve dumping of pointers to members ('int foo::*' rather than 'int*') llvm-svn: 348962	2018-12-12 19:34:02 +00:00
David Blaikie	815cffaad8	DebugInfo/DWARF: Refactor type dumping to dump types, rather than DIEs that reference types This lays the foundation for dumping types not referenced by DW_AT_type attributes (in the near-term, that'll be DW_AT_containing_type for a DW_TAG_ptr_to_member_type - in the future, potentially dumping the pretty printed name next to the DW_TAG for the type, rather than only when the type is referenced from elsewhere) llvm-svn: 348961	2018-12-12 19:33:08 +00:00
David Blaikie	92b5493a14	DebugInfo/DWARF: Refactor getAttributeValueAsReferencedDie to accept a DWARFFormValue Save searching for the attribute again when you already have the DWARFFormValue at hand. llvm-svn: 348960	2018-12-12 19:23:55 +00:00
Craig Topper	4937adf75f	[X86] Emit SBB instead of SETCC_CARRY from LowerSELECT. Break false dependency on the SBB input. I'm hoping we can just replace SETCC_CARRY with SBB. This is another step towards that. I've explicitly used zero as the input to the setcc to avoid a false dependency that we've had with the SETCC_CARRY. I changed one of the patterns that used NEG to instead use an explicit compare with 0 on the LHS. We needed the zero anyway to avoid the false dependency. The negate would clobber its input register. By using a CMP we can avoid that which could be useful. Differential Revision: https://reviews.llvm.org/D55414 llvm-svn: 348959	2018-12-12 19:20:21 +00:00
Florian Hahn	81a22d32f7	[ConstantFold] Use getMinSignedBits for APInt in isIndexInRangeOfArrayType. Indices for getelementptr can be signed so we should use getMinSignedBits instead of getActiveBits here. The function later calls getSExtValue to get the int64_t value, which also checks getMinSignedBits. This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=11647. Reviewers: mssimpso, efriedma, davide Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55536 llvm-svn: 348957	2018-12-12 18:55:14 +00:00
David Blaikie	73066d60f1	llvm-dwarfdump: Dump array dimensions in stringified type names llvm-svn: 348954	2018-12-12 18:46:25 +00:00
Simon Pilgrim	eb508f8ccb	[SelectionDAG] Add a generic isSplatValue function This patch introduces a generic function to determine whether a given vector type is known to be a splat value for the specified demanded elements, recursing up the DAG looking for BUILD_VECTOR or VECTOR_SHUFFLE splat patterns. It also keeps track of the elements that are known to be UNDEF - it returns true if all the demanded elements are UNDEF (as this may be useful under some circumstances), so this needs to be handled by the caller. A wrapper variant is also provided that doesn't take the DemandedElts or UndefElts arguments for cases where we just want to know if the SDValue is a splat or not (with/without UNDEFS). I had hoped to completely remove the X86 local version of this function, but I'm seeing some regressions in shift/rotate codegen that will take a little longer to fix and I hope to get this in sooner so I can continue work on PR38243 which needs more capable splat detection. Differential Revision: https://reviews.llvm.org/D55426 llvm-svn: 348953	2018-12-12 18:32:29 +00:00
Artem Belevich	f802b9324a	[NVPTX] do not rely on cached subtarget info. If a module has function references, but no functions themselves, we may end up never calling runOnMachineFunction and therefore would never initialize nvptxSubtarget field which would eventually cause a crash. Instead of relying on nvptxSubtarget being initialized by one of the methods, retrieve subtarget info directly. Differential Revision: https://reviews.llvm.org/D55580 llvm-svn: 348952	2018-12-12 18:31:04 +00:00
Sanjay Patel	44eaa492b8	[x86] allow 8-bit adds to be promoted by convertToThreeAddress() to form LEA This extends the code that handles 16-bit add promotion to form LEA to also allow 8-bit adds. That allows us to combine add ops with register moves and save some instructions. This is another step towards allowing add truncation in generic DAGCombiner (see D54640). Differential Revision: https://reviews.llvm.org/D55494 llvm-svn: 348946	2018-12-12 17:58:27 +00:00
Michael Kruse	7244852557	[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes. When multiple loop transformations are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, #pragma clang loop unroll_and_jam(enable) #pragma clang loop distribute(enable) is the same as #pragma clang loop distribute(enable) #pragma clang loop unroll_and_jam(enable) and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also, the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used. This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance, !0 = !{!0, !1, !2} !1 = !{!"llvm.loop.unroll_and_jam.enable"} !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3} !3 = !{!"llvm.loop.distribute.enable"} defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop. Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account. For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed.
It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patch introduces a WarnMissedTransformations pass, to warn about orphaned transformations. Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated. To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied. With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling). Reviewed By: hfinkel, dmgreen Differential Revision: https://reviews.llvm.org/D49281 Differential Revision: https://reviews.llvm.org/D55288 llvm-svn: 348944	2018-12-12 17:32:52 +00:00
Wei Mi	7da5a08e1a	[SampleFDO] Extend profile-sample-accurate option to cover isFunctionColdInCallGraph For SampleFDO, when a callsite doesn't appear in the profile, it will not be marked as cold callsite unless the option -profile-sample-accurate is specified. But profile-sample-accurate doesn't cover function isFunctionColdInCallGraph which is used to decide whether a function should be put into text.unlikely section, so even if the user knows the profile is accurate and specifies profile-sample-accurate, those functions not appearing in the sample profile are still not put into text.unlikely section right now. The patch fixes that. Differential Revision: https://reviews.llvm.org/D55567 llvm-svn: 348940	2018-12-12 17:09:27 +00:00
Neil Henning	76504a4c5e	[AMDGPU] Extend the SI Load/Store optimizer to combine more things. I've extended the load/store optimizer to be able to produce dwordx3 loads and stores. This change allows many more load/stores to be combined, and results in much more optimal code for our hardware. Differential Revision: https://reviews.llvm.org/D54042 llvm-svn: 348937	2018-12-12 16:15:21 +00:00
Simon Atanasyan	fa020082e4	[mips] Enable using of integrated assembler in all cases. llvm-svn: 348934	2018-12-12 15:32:03 +00:00
Simon Pilgrim	f6c898e12f	[TargetLowering] Add ISD::AND handling to SimplifyDemandedVectorElts If either of the operand elements are zero then we know the result element is going to be zero (even if the other element is undef). Differential Revision: https://reviews.llvm.org/D55558 llvm-svn: 348926	2018-12-12 13:43:07 +00:00
Piotr Sobczak	3732b4ce25	[AMDGPU] Set metadata access for explicit section Summary: This patch provides a means to set Metadata section kind for a global variable, if its explicit section name is prefixed with ".AMDGPU.metadata." This could be useful to make the global variable go to an ELF section without any section flags set. Reviewers: dstuttard, tpr, kzhuravl, nhaehnle, t-tye Reviewed By: dstuttard, kzhuravl Subscribers: llvm-commits, arsenm, jvesely, wdng, yaxunl, t-tye Differential Revision: https://reviews.llvm.org/D55267 llvm-svn: 348922	2018-12-12 11:20:04 +00:00
Diana Picus	59720b422a	[ARM GlobalISel] Select load/store for Thumb2 Unfortunately we can't use TableGen for this because it doesn't yet support predicates on the source pattern root. Therefore, add a bit of handwritten code to the instruction selector to handle the most basic cases. Also mark them as legal and extract their legalizer test cases to a new test file. llvm-svn: 348920	2018-12-12 10:32:15 +00:00
Jonas Paulsson	896775c2d3	[SystemZ] Minor cleanup of SchedModels Some fixes of a few InstRWs for z13 and z14. Review: Ulrich Weigand llvm-svn: 348917	2018-12-12 08:26:24 +00:00
Mikael Holmen	c06b01cb22	Fix compiler warning about unused variable [NFC] llvm-svn: 348913	2018-12-12 06:33:45 +00:00
Leonard Chan	118e53fd63	[Intrinsic] Signed Fixed Point Multiplication Intrinsic Add an intrinsic that takes 2 signed integers with the scale of them provided as the third argument and performs fixed point multiplication on them. This is a part of implementing fixed point arithmetic in clang where some of the more complex operations will be implemented as intrinsics. Differential Revision: https://reviews.llvm.org/D54719 llvm-svn: 348912	2018-12-12 06:29:14 +00:00
Craig Topper	1fe466689b	[X86] Combine vpmovdw+vpacksswb into vpmovdb. This is similar to the combine we already have for vpmovdw+vpackuswb. llvm-svn: 348910	2018-12-12 05:56:01 +00:00
Florian Hahn	cc419ad7df	[ConstantInt] Check active bits before calling getZExtValue. Without this check, we hit an assertion in getZExtValue, if the constant value does not fit into a uint64_t. As getZExtValue returns a uint64_t, should we update getAggregateElement to take a uint64_t as well? This fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=6109. Reviewers: efriedma, craig.topper, spatel Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D55547 llvm-svn: 348906	2018-12-12 02:22:12 +00:00
Nathan Lanza	893083ae5e	Implement IMAGE_REL_AMD64_SECREL for RuntimeDyldCOFFX86_64 lldb on Windows uses the ExecutionEngine for expression evaluation and hits the llvm_unreachable due to this relocation. Thus, implement the relocation and add a test to verify its function. llvm-svn: 348904	2018-12-12 00:04:06 +00:00
Reid Kleckner	9571c806c5	[codeview] Look through typedefs in getCompleteTypeIndex Summary: Any time a symbol record, whether it's S_UDT, S_LOCAL, or S_[GL]DATA32, references a record type, it should use the complete type index, even if there's a typedef in the way. Fixes the compiler part of PR39853. Reviewers: zturner, aganea Subscribers: hiraditya, arphaman, llvm-commits Differential Revision: https://reviews.llvm.org/D55236 llvm-svn: 348902	2018-12-11 23:07:39 +00:00
Craig Topper	502865bddb	[GISel] Add parentheses to an assert because gcc is mean. llvm-svn: 348900	2018-12-11 22:07:06 +00:00
Jordan Rupprecht	e833cd46eb	Revert "debuginfo: Use symbol difference for CU length to simplify assembly reading/editing" Temporarily reverts commit r348806 due to strange asm compilation issues in certain modes (combination of asan+cuda+other things). Will provide repro soon. llvm-svn: 348898	2018-12-11 21:26:52 +00:00
Gor Nishanov	20d833d5e3	[coroutines] Improve suspend point simplification Summary: Enable suspend point simplification for cases where: * coro.save and coro.suspend are in different basic blocks * where there are intervening intrinsics Reviewers: modocache, tks2103, lewissbaker Reviewed By: modocache Subscribers: EricWF, llvm-commits Differential Revision: https://reviews.llvm.org/D55160 llvm-svn: 348897	2018-12-11 21:23:09 +00:00
Wolfgang Pieb	ac874c48ca	[Debuginfo] Prevent CodeGenPrepare from dropping debuginfo references. This fixes PR39845. CodeGenPrepare employs a transactional model when performing optimizations, i.e. it changes the IR to attempt an optimization and rolls back the change when it finds the change inadequate. It is during the rollback that references to locals were dropped from debug value intrinsics. This patch reinstates debuginfo references during rollbacks. Reviewers: aprantl, vsk Differential Revision: https://reviews.llvm.org/D55396 llvm-svn: 348896	2018-12-11 21:13:53 +00:00
Nikita Popov	79c994d976	[ConstantFolding] Handle leading zero-size elements in load folding Struct types may have leading zero-size elements like [0 x i32], in which case the "real" element at offset 0 will not necessarily coincide with the 0th element of the aggregate. ConstantFoldLoadThroughBitcast() wants to drill down the element at offset 0, but currently always picks the 0th aggregate element to do so. This patch changes the code to find the first non-zero-size element instead, for the struct case. The motivation behind this change is https://github.com/rust-lang/rust/issues/48627. Rust is fond of emitting [0 x iN] separators between struct elements to enforce alignment, which prevents constant folding in this particular case. The additional tests with [4294967295 x [0 x i32]] check that we don't end up unnecessarily looping over a large number of zero-size elements of a zero-size array. Differential Revision: https://reviews.llvm.org/D55169 llvm-svn: 348895	2018-12-11 20:29:16 +00:00
Aditya Nandakumar	853a667812	[GISel]: Add MachineIRBuilder support for passing in Flags while building https://reviews.llvm.org/D55516 Add the ability to pass in flags to buildInstr calls. Currently no validation is performed but that can be easily performed based on the opcode (if necessary). Reviewed by: paquette. llvm-svn: 348893	2018-12-11 20:04:40 +00:00
Fedor Sergeev	a1d95c3fc4	[NewPM] fixing asserts on deleted loop in -print-after-all IR-printing AfterPass instrumentation might be called on a loop that has just been invalidated. We should skip printing it to avoid spurious asserts. Reviewed By: chandlerc, philip.pfaffe Differential Revision: https://reviews.llvm.org/D54740 llvm-svn: 348887	2018-12-11 19:05:35 +00:00
Mandeep Singh Grang	802dc40f41	[COFF, ARM64] Emit COFF function header Summary: Emit COFF header when printing out the function. This is important as the header contains two important pieces of information: the storage class for the symbol and the symbol type information. This bit of information is required for the linker to correctly identify the type of symbol that it is dealing with. This patch mimics X86 and ARM COFF behavior for function header emission. Reviewers: rnk, mstorsjo, compnerd, TomTan, ssijaric Reviewed By: mstorsjo Subscribers: dmajor, javed.absar, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D55535 llvm-svn: 348875	2018-12-11 18:36:14 +00:00
Vedant Kumar	b3a7cae045	[HotColdSplitting] Disable outlining landingpad instructions (PR39917) It's currently not safe to outline landingpad instructions (see llvm.org/PR39917). Like @llvm.eh.typeid.for, the order and content of previous landingpad instructions in a function alters the lowering of subsequent landingpads by renumbering type info ID's. Outlining a landingpad therefore breaks exception handling & unwinding. llvm-svn: 348870	2018-12-11 18:05:31 +00:00
Sanjay Patel	2aa2dc76c2	[InstCombine] try to convert x86 movmsk intrinsic to generic IR (PR39927) call iM movmsk(sext <N x i1> X) --> zext (bitcast <N x i1> X to iN) to iM This has the potential to create less-than-8-bit scalar types as shown in some of the test diffs, but it looks like the backend knows how to deal with that in these patterns. This is the simple part of the fix suggested in: https://bugs.llvm.org/show_bug.cgi?id=39927 Differential Revision: https://reviews.llvm.org/D55529 llvm-svn: 348862	2018-12-11 16:38:03 +00:00
Craig Topper	b51283bfd7	Fix not correct imm operand assertion for SUB32ri in X86CondBrFolding::analyzeCompare Summary: X86CondBrFolding::analyzeCompare can encounter a SUB32ri instruction that uses a global address as its operand, as below: %733:gr32 = SUB32ri %62:gr32(tied-def 0), @img2buf_normal, implicit-def $eflags JNE_1 %bb.41, implicit $eflags So the assertion "assert(MI.getOperand(ValueIndex).isImm() && "Expecting Imm operand")" is not correct. Change the assert to an if, so that X86CondBrFolding::analyzeCompare returns false because no compare is found in this case. Patch by Jianping Chen Reviewers: smaslov, LuoYuanke, liutianle, Jianping Reviewed By: Jianping Subscribers: lebedev.ri, llvm-commits Differential Revision: https://reviews.llvm.org/D54250 llvm-svn: 348853	2018-12-11 15:32:14 +00:00
Sanjay Patel	05e36982dd	[x86] clean up code for converting 16-bit ops to LEA; NFC As discussed in D55494, we want to extend this to handle 8-bit ops too, but that could be extended further to enable this on 32-bit systems too. llvm-svn: 348851	2018-12-11 15:29:40 +00:00
Sanjay Patel	9765ba5f86	[x86] remove dead code for 16-bit LEA formation; NFC As discussed in: D55494 ...this code has been disabled/dead for a long time (the code references Athlon and Pentium 4), and there's almost no chance that it will be used given the last decade of uarch evolution. Also, in SDAG we promote 16-bit ops to 32-bit, so there's almost no way to test this code any more. llvm-svn: 348845	2018-12-11 14:05:03 +00:00
Clement Courbet	8b6434bbb9	Revert r348843 "[CodeGen] Allow memcpy/memset to generate small overlapping stores." Breaks ARM/memcpy-inline.ll llvm-svn: 348844	2018-12-11 13:38:43 +00:00
Clement Courbet	93b3445770	[CodeGen] Allow memcpy/memset to generate small overlapping stores. Summary: All targets either just return false here or properly model `Fast`, so I don't think there is any reason to prevent CodeGen from doing the right thing here. Subscribers: nemanjai, javed.absar, eraman, jsji, llvm-commits Differential Revision: https://reviews.llvm.org/D55365 llvm-svn: 348843	2018-12-11 13:15:56 +00:00
Simon Pilgrim	f6371f5f23	[TargetLowering] Add ISD::EXTRACT_VECTOR_ELT support to SimplifyDemandedBits Let SimplifyDemandedBits attempt to simplify all elements of a vector extraction. Part of PR39689. llvm-svn: 348839	2018-12-11 11:08:40 +00:00
David Stenberg	2474ce5862	[DeadArgElim] Fixes for dbg.values using dead arg/return values Summary: When eliminating a dead argument or return value in a function with local linkage, all uses, including in dbg.value intrinsics, would be replaced with null constants. This would mean that, for example for an integer argument, the debug info would incorrectly express that the value is 0. Instead, replace all uses with undef to indicate that the argument/return value is optimized out. Also, make sure that metadata uses of return values are rewritten even if there are no non-metadata uses of the value. As a bit of historical curiosity, the code that emitted null constants was introduced in the initial check-in of the pass in 2003, before 'undef' values even existed in LLVM. This fixes PR23260. Reviewers: dblaikie, aprantl, vsk, djtodoro Reviewed By: aprantl Subscribers: llvm-commits Tags: #debug-info Differential Revision: https://reviews.llvm.org/D55513 llvm-svn: 348837	2018-12-11 10:33:38 +00:00
Martell Malone	0b3ddec7ed	[PPC][NFC] store operands are dst not src Differential Revision: https://reviews.llvm.org/D55502 llvm-svn: 348826	2018-12-11 03:14:56 +00:00
Heejin Ahn	be5e5874f6	[WebAssembly] Add '.eventtype' directive support Summary: This patch supports `.eventtype` directive printing and parsing in the same syntax with `.functype`. Reviewers: aardappel, sbc100 Subscribers: dschuff, sbc100, jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55353 llvm-svn: 348818	2018-12-11 01:11:04 +00:00
Armando Montanez	6d6ff2e0d7	[TextAPI][elfabi] Make SoName optional This change makes DT_SONAME treated as an optional trait for ELF TextAPI stubs. This change accounts for the fact that shared objects aren't guaranteed to have a DT_SONAME entry. Tests have been updated to check for correct behavior of an optional soname. Differential Revision: https://reviews.llvm.org/D55533 llvm-svn: 348817	2018-12-11 01:00:16 +00:00
Heejin Ahn	21d45a2c98	[WebAssembly] TargetStreamer cleanup (NFC) Summary: - Unify mixed argument names (`Symbol` and `Sym`) to `Sym` - Changed `MCSymbolWasm` argument of `emit*` functions to `const MCSymbolWasm`. It seems not very intuitive that emit function in the streamer modifies symbol contents. - Moved empty function bodies to the header - clang-format Reviewers: aardappel, dschuff, sbc100 Subscribers: jgravelle-google, sunfish, llvm-commits Differential Revision: https://reviews.llvm.org/D55347 llvm-svn: 348816	2018-12-11 00:53:59 +00:00
Aditya Nandakumar	cef44a2342	[GISel]: Refactor MachineIRBuilder to allow passing additional parameters to build Instrs https://reviews.llvm.org/D55294 Previously MachineIRBuilder::buildInstr used to accept variadic arguments for sources (which were either unsigned or MachineInstrBuilder). While this worked well in common cases, it doesn't allow us to build instructions that have multiple destinations. Additionally, passing in other optional parameters at the end (such as flags) is not trivially possible. Also a trivial call such as B.buildInstr(Opc, Reg1, Reg2, Reg3) can be interpreted differently based on the opcode (2 defs + 1 src for unmerge vs 1 def + 2 srcs). This patch refactors the buildInstr to buildInstr(Opc, ArrayRef<DstOps>, ArrayRef<SrcOps>) where DstOps and SrcOps are typed unions that know how to add themselves to MachineInstrBuilder. After this patch, most invocations would look like B.buildInstr(Opc, {s32, DstReg}, {SrcRegs..., SrcMIBs..}); Now all the other calls (such as buildAdd, buildSub etc) forward to buildInstr. It also makes it possible to build instructions with multiple defs. Additionally in a subsequent patch, we should make it possible to add flags directly while building instructions. Additionally, the main buildInstr method is now virtual, so other builders only have to override buildInstr (say, for constant folding/CSEing), which is straightforward. Also attached here (https://reviews.llvm.org/F7675680) is a clang-tidy patch that should upgrade the API calls if necessary. llvm-svn: 348815	2018-12-11 00:48:50 +00:00
David Blaikie	dbe67c4f19	debuginfo: Use symbol difference for CU length to simplify assembly reading/editing Mucking about simplifying a test case ( https://reviews.llvm.org/D55261 ) I stumbled across something I've hit before - that LLVM's (GCC's does too, FWIW) assembly output includes a hardcoded length for a DWARF unit in its header. Instead we could emit a label difference - making the assembly easier to read/edit (though potentially at a slight (I haven't tried to observe it) performance cost of delaying/sinking the length computation into the MC layer). Reviewers: JDevlieghere, probinson, ABataev Differential Revision: https://reviews.llvm.org/D55281 llvm-svn: 348806	2018-12-10 22:44:48 +00:00
Davide Italiano	8ec7709f58	[Local] Promote a utility that could be used elsewhere. NFCI. llvm-svn: 348804	2018-12-10 22:17:04 +00:00
Krzysztof Parzyszek	9f003f9262	[Hexagon] Couple of fixes in optimize addressing mode - Check if an operand is an immediate before calling getImm. Some operands that take constant values can actually have global symbols or other constant expressions. - When a load-constant instruction can be folded into users, make sure to only delete it when all users have been successfully converted. llvm-svn: 348802	2018-12-10 21:56:04 +00:00
Matt Arsenault	9ccde61f81	InstCombine: Scalarize single use icmp/fcmp llvm-svn: 348801	2018-12-10 21:50:54 +00:00
Krzysztof Parzyszek	c1b2d5905a	Revert "[Hexagon] Check if operand is an immediate before getImm" This reverts r348787. The patch wasn't quite correct. llvm-svn: 348792	2018-12-10 19:30:08 +00:00
JF Bastien	69f6098e89	APFloat: allow 64-bit of payload Summary: The APFloat and Constant APIs taking an APInt allow arbitrary payloads, and that's great. There's a convenience API which takes an unsigned, and that's silly because it then directly creates a 64-bit APInt. Just change it to 64-bits directly. At the same time, add ConstantFP NaN getters which match the APFloat ones (with getQNaN / getSNaN and APInt parameters). Improve the APFloat testing to set more payload bits. Reviewers: scanon, rjmccall Subscribers: jkorous, dexonsmith, kristina, llvm-commits Differential Revision: https://reviews.llvm.org/D55460 llvm-svn: 348791	2018-12-10 19:27:38 +00:00
Amara Emerson	5ec146046c	[GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes. This patch restricts the capability of G_MERGE_VALUES, and uses the new G_BUILD_VECTOR and G_CONCAT_VECTORS opcodes instead in the appropriate places. This patch also includes AArch64 support for selecting G_BUILD_VECTOR of <4 x s32> and <2 x s64> vectors. Differential Revisions: https://reviews.llvm.org/D53629 llvm-svn: 348788	2018-12-10 18:44:58 +00:00
Krzysztof Parzyszek	c6e9380a56	[Hexagon] Check if operand is an immediate before getImm llvm-svn: 348787	2018-12-10 18:39:47 +00:00
Krzysztof Parzyszek	914f2d1c46	[Hexagon] Add patterns for any_extend from i1 and short vectors of i1 llvm-svn: 348785	2018-12-10 18:36:06 +00:00
Simon Pilgrim	fc2c9af99c	[TargetLowering] Add UNDEF folding to SimplifyDemandedVectorElts If all the demanded elements of the SimplifyDemandedVectorElts are known to be UNDEF, we can simplify to an ISD::UNDEF node. Zero constant folding will be handled in a future patch - it's a little trickier as we often have bitcasted zero values. Differential Revision: https://reviews.llvm.org/D55511 llvm-svn: 348784	2018-12-10 18:29:46 +00:00
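
As a rough illustration of the signed fixed-point multiplication described in the r348912 entry above (two signed integers plus their shared scale as a third argument), the arithmetic can be sketched in Python. This is not LLVM code: the names `smul_fix`, `to_fixed` and `to_float` are hypothetical helpers, and both the rounding direction (toward negative infinity) and the overflow check are assumptions of this sketch that the commit message does not specify.

```python
def to_fixed(x: float, scale: int) -> int:
    """Encode a real number as an integer with `scale` fractional bits."""
    return round(x * (1 << scale))


def to_float(v: int, scale: int) -> float:
    """Decode an integer with `scale` fractional bits back to a float."""
    return v / (1 << scale)


def smul_fix(a: int, b: int, scale: int, bits: int = 32) -> int:
    """Multiply two signed fixed-point values that share the same scale.

    Hypothetical sketch: the exact product carries 2 * scale fractional
    bits, so it is shifted right by `scale` to restore the input format.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not (lo <= a <= hi and lo <= b <= hi):
        raise ValueError("operands must fit in a signed integer of width `bits`")
    if not 0 <= scale < bits:
        raise ValueError("scale must be in the range [0, bits)")

    product = a * b            # Python integers are unbounded, so this is exact
    result = product >> scale  # arithmetic shift: rounds toward negative infinity

    # Assumption of this sketch: report overflow instead of wrapping.
    if not (lo <= result <= hi):
        raise OverflowError("result does not fit in the destination width")
    return result


if __name__ == "__main__":
    scale = 8
    x = to_fixed(1.5, scale)
    y = to_fixed(-2.0, scale)
    print(to_float(smul_fix(x, y, scale), scale))
```

The intermediate product is kept at full precision before shifting, which is the reason a fixed-point multiply cannot simply be expressed as a plain multiply of the same width followed by a shift.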

119417 Commits