llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	ccb810fb54	GlobalISel: Verify memory size for load/store llvm-svn: 352578	2019-01-30 01:10:42 +00:00
Sanjay Patel	a61d586f74	[DAGCombiner] fold extract_subvector of extract_subvector This is the sibling fold for insert-of-insert that was added with D56604. Now that we have x86 shuffle narrowing (D57156), this change shows improvements for lots of AVX512 reduction code (not sure that we would ever expect extract-of-extract otherwise). There's a small regression in some of the partial-permute tests (extracting followed by splat). That is tracked by PR40500: https://bugs.llvm.org/show_bug.cgi?id=40500 Differential Revision: https://reviews.llvm.org/D57336 llvm-svn: 352528	2019-01-29 19:13:39 +00:00
Sanjay Patel	cd6b240303	[x86] add tests for vector bool math; NFC llvm-svn: 352520	2019-01-29 17:00:47 +00:00
Andrea Di Biagio	815cdbff29	[X86][Btver2] Improved latency/throughput model for scalar int-to-float conversions. Account for bypass delays when computing the latency of scalar int-to-float conversions. On Jaguar we need to account for an extra 6cy latency (see AMD fam16h SOG). This patch also fixes the number of micro-opcodes for the register-memory variants of scalar int-to-float conversions. Differential Revision: https://reviews.llvm.org/D57148 llvm-svn: 352518	2019-01-29 16:47:27 +00:00
Ayonam Ray	a1f6973ade	Reversing the checkin for version 352484 as tests are failing. llvm-svn: 352504	2019-01-29 15:00:50 +00:00
Ayonam Ray	4272af9b3e	[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default During the lowering of a switch that would result in the generation of a jump table, a range check is performed before indexing into the jump table, and a conditional branch to the default block is inserted for switch values outside the jump table range. If the default block is unreachable, this conditional branch can be omitted. This patch implements omitting this conditional branch for unreachable defaults. Review ID: D52002 Reviewers: Hans Wennborg, Eli Friedman, Roman Lebedev llvm-svn: 352484	2019-01-29 12:01:32 +00:00
Simon Pilgrim	4293ad8ab2	[X86] Add PR40483 test case llvm-svn: 352480	2019-01-29 10:58:42 +00:00
Simon Pilgrim	06a342b2d6	[X86] Fix linux32 pic tests to use correct relocation model (PR39684) Differential Revision: https://reviews.llvm.org/D57301 llvm-svn: 352476	2019-01-29 10:41:48 +00:00
Simon Pilgrim	0b7fce6d72	[X86] Regenerate abi-isel.ll test Adds note requested in D57301 and fixes some missing GOTPCREL addressmath checks llvm-svn: 352474	2019-01-29 10:39:02 +00:00
Craig Topper	390ac61b93	Recommit r352255 "[SelectionDAG][X86] Don't use SEXTLOAD for promoting masked loads in the type legalizer" This did not cause the buildbot failure it was previously reverted for. Original commit message: I'm not sure why we were using SEXTLOAD. EXTLOAD seems more appropriate since we don't care about the upper bits. This patch changes this and then modifies the X86 post legalization combine to emit an extending shuffle instead of a sign_extend_vector_inreg. Could maybe use an any_extend_vector_inre On AVX512 targets I think we might be able to use a masked vpmovzx and not have to expand this at all. llvm-svn: 352433	2019-01-28 21:38:47 +00:00
Nikita Popov	8e1a464e6a	[CodeGen][X86] Expand UADDSAT to NOT+UMIN+ADD Followup to D56636, this time handling the UADDSAT case by expanding uadd.sat(a, b) to umin(a, ~b) + b. Differential Revision: https://reviews.llvm.org/D56869 llvm-svn: 352409	2019-01-28 19:19:09 +00:00
Simon Pilgrim	2c17512456	[X86][AVX] Remove lowerShuffleByMerging128BitLanes 2-lane restriction First step towards adding support for 64-bit unary "sublane" handling (a bit like lowerShuffleAsRepeatedMaskAndLanePermute). This allows us to add lowerV64I8Shuffle handling. llvm-svn: 352389	2019-01-28 17:02:35 +00:00
Sanjay Patel	94cca60b82	[x86] allow more shuffle splitting to avoid vpermps (PR40434) This is tricky to make optimal: sometimes we're better off using a single wider op, but other times it makes more sense to combine narrow ops to achieve the same result. This solves the case from: https://bugs.llvm.org/show_bug.cgi?id=40434 There's potentially a similar change for vectors with 64-bit elements, but it needs adjustments similar to rL352333 to avoid creating infinite loops. llvm-svn: 352380	2019-01-28 15:51:34 +00:00
Craig Topper	453150bc18	[X86] Add new variadic avx512 compress/expand intrinsics that use vXi1 types for the mask argument. Remove and autoupgrade the old intrinsics llvm-svn: 352343	2019-01-28 07:03:03 +00:00
Craig Topper	b23d5ccafc	[X86] Add vbmi2 compressstore and expandload tests that aren't fast-isel tests. These got removed when we autoupgraded to target independent intrinsics, but we didn't have coverage anywhere else. The avx512f/avx512vl versions do have coverage. Also move some tests back from the upgrade file that aren't really upgraded. llvm-svn: 352342	2019-01-28 05:42:39 +00:00
Amara Emerson	0bfa2faccc	[AArch64][GlobalISel] Add some missing vector support for FP arithmetic ops. Moved the fneg lowering legalization test from AArch64 to X86, as we want to specify that it's already legal. llvm-svn: 352338	2019-01-28 02:28:22 +00:00
Sanjay Patel	ebe6b43aec	[x86] add restriction for lowering to vpermps This transform was added with rL351346, and we had an escape for shufps, but we also want one for unpckps vs. vpermps because vpermps doesn't take an immediate shuffle index operand. llvm-svn: 352333	2019-01-27 21:53:33 +00:00
Sanjay Patel	9ceaf2932a	[x86] add tests for extract/extract/unpack; NFC llvm-svn: 352331	2019-01-27 21:34:51 +00:00
Simon Pilgrim	670a6971f8	[X86][SSE] Add UNDEF handling to combineSelect ISD::USUBSAT matching (PR40083) llvm-svn: 352330	2019-01-27 21:01:23 +00:00
Simon Pilgrim	e5cf884018	[X86][SSE] Add UNDEF test case for combineSelect ISD::USUBSAT matching (PR40083) llvm-svn: 352329	2019-01-27 20:52:34 +00:00
Simon Pilgrim	f10b6623cc	[X86][SSE] Permit UNDEFs in combineAddToSUBUS matching (PR40083) llvm-svn: 352328	2019-01-27 20:36:37 +00:00
Sanjay Patel	6c865deedd	[x86] add more tests for lowerShuffleWithUndefHalf; NFC Some other transform is creating the opposite form and causing an infinite loop if we try to split some of these. llvm-svn: 352327	2019-01-27 20:17:02 +00:00
Simon Pilgrim	976b093ecb	[X86][SSE] Add PSUBUS undef element test case (PR40083) llvm-svn: 352326	2019-01-27 20:09:30 +00:00
Simon Pilgrim	c9d32e20d5	[X86] Add test cases for PR36721 (unnecessary andl for %cl when shifting) llvm-svn: 352321	2019-01-27 18:31:33 +00:00
Roman Lebedev	d35424a2b3	[X86][NFC] Replace "<%s" with "< %s" in run-lines. While I have no intention of actually committing regeneration of the check lines in these test files with update_llc_test_checks, lack of that whitespace breaks that util, which is mildly inconvenient. llvm-svn: 352318	2019-01-27 15:36:35 +00:00
Simon Pilgrim	f6d7cfef39	[X86] Add CGP tests for PR40486 llvm-svn: 352316	2019-01-27 14:04:45 +00:00
Simon Pilgrim	c09a4db3b7	[X86] Regenerate reverse branch test to explicitly show branching and condition codes. llvm-svn: 352314	2019-01-27 12:39:38 +00:00
Simon Pilgrim	7b980ad368	[X86] Regenerate test to explicitly show branching and condition codes. llvm-svn: 352313	2019-01-27 12:38:09 +00:00
Gabor Buella	a0f743b77a	[X86] Add some missing blsr patterns The add+and sequence followed by a branch can happen e.g. when looping over the set bits of an integer: ``` while (x != 0) { func(x & ~x); x &= x - 1; } ``` Reviewed By: ctopper Differential Revision: https://reviews.llvm.org/D57296 llvm-svn: 352306	2019-01-27 06:15:39 +00:00
Gabor Buella	23b04798ad	[NFC][X86] Add a few more blsr test cases llvm-svn: 352305	2019-01-27 06:05:40 +00:00
Craig Topper	e65d4c5525	[X86] Add a pattern for (i64 (and (anyext def32:), 0x00000000FFFFFFFF)) to produce SUBREG_TO_REG def32 here means the producing instruction zeroed bits 63:32. We already do this for zext, but it looks like we can get an and+anyext sometimes. Spotted in the diffs from D33587. llvm-svn: 352303	2019-01-27 03:37:05 +00:00
Simon Pilgrim	37a8e65a60	[X86] combineCarryThroughADD - add support for X86::COND_A commutations (PR24545) As discussed on PR24545, we should try to commute X86::COND_A 'icmp ugt' cases to X86::COND_B 'icmp ult' to more optimally bind the carry flag output to a SBB instruction. Differential Revision: https://reviews.llvm.org/D57281 llvm-svn: 352289	2019-01-26 20:23:04 +00:00
Simon Pilgrim	b7a15acd38	[X86] Fold X86ISD::SBB(ISD::SUB(X,Y),0) -> X86ISD::SBB(X,Y) (PR25858) We often generate X86ISD::SBB(X, 0) for carry flag arithmetic. I had tried to create test cases for the ADC equivalent (which often uses the same pattern) but haven't managed to find anything yet. Differential Revision: https://reviews.llvm.org/D57169 llvm-svn: 352288	2019-01-26 20:13:44 +00:00
Amaury Sechet	be03018384	Generate test results for combine-fcopysign.ll using update_llc_test_checks.py . NFC llvm-svn: 352285	2019-01-26 18:13:53 +00:00
Simon Pilgrim	6162fba57c	[X86][SSE] Generalized unsigned compares to support nonsplat constant vectors (PR39859) llvm-svn: 352283	2019-01-26 16:40:03 +00:00
Simon Pilgrim	7d6c58e843	[X86] Add nonsplat increment/decrement constant vector with min/max test (PR39859) llvm-svn: 352281	2019-01-26 16:27:48 +00:00
Simon Pilgrim	0199838883	[X86] Add test case from PR34292 llvm-svn: 352274	2019-01-26 13:56:53 +00:00
Simon Pilgrim	3cdf3f681d	[X86] Add 'less_than_ideal' followup test case from PR24545 llvm-svn: 352272	2019-01-26 12:51:52 +00:00
Craig Topper	21cdcd7b2b	[X86] Autoupgrade some of the intrinsics used by stack folding tests that have been previously removed. llvm-svn: 352271	2019-01-26 06:27:04 +00:00
Craig Topper	3b5e01b386	[X86] Remove and autoupgrade vpconflict intrinsics that take a mask and passthru argument. We have unmasked versions as of r352172 llvm-svn: 352270	2019-01-26 06:27:01 +00:00
Craig Topper	58e6b37e62	Revert r352255 "[SelectionDAG][X86] Don't use SEXTLOAD for promoting masked loads in the type legalizer" This might be breaking an lldb windows buildbot. llvm-svn: 352268	2019-01-26 02:44:58 +00:00
Craig Topper	6c9c7d0796	[X86] Remove GCCBuiltins from 512-bit cvt(u)qqtops, cvt(u)qqtopd, and cvt(u)dqtops intrinsics. Add new variadic uitofp/sitofp with rounding mode intrinsics. Summary: See clang patch D56998 for a full description. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56999 llvm-svn: 352266	2019-01-26 02:41:54 +00:00
Craig Topper	7a8e74775c	[X86] Add DAG combine to merge vzext_movl with the various fp<->int conversion operations that only write the lower 64-bits of an xmm register and zero the rest. Summary: We have isel patterns for this, but we're missing some load patterns and all broadcast patterns. A DAG combine seems like a better fit for this. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D56971 llvm-svn: 352260	2019-01-26 01:17:09 +00:00
Craig Topper	b1d3457c03	[SelectionDAG][X86] Don't use SEXTLOAD for promoting masked loads in the type legalizer Summary: I'm not sure why we were using SEXTLOAD. EXTLOAD seems more appropriate since we don't care about the upper bits. This patch changes this and then modifies the X86 post legalization combine to emit an extending shuffle instead of a sign_extend_vector_inreg. Could maybe use an any_extend_vector_inreg, but I just did what we already do in LowerLoad. I think we can actually get rid of this code entirely if we switch to -x86-experimental-vector-widening-legalization. On AVX512 targets I think we might be able to use a masked vpmovzx and not have to expand this at all. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D57186 llvm-svn: 352255	2019-01-26 00:26:37 +00:00
Mircea Trofin	519f42d914	[llvm] Opt-in flag for X86DiscriminateMemOps Summary: Currently, if an instruction with a memory operand has no debug information, X86DiscriminateMemOps will generate one based on the first line of the enclosing function, or the last seen debug info. This may cause confusion in certain debugging scenarios. The long term approach would be to use the line number '0' in such cases, however, that brings in challenges: the base discriminator value range is limited (4096 values). For the short term, adding an opt-in flag for this feature. See bug 40319 (https://bugs.llvm.org/show_bug.cgi?id=40319) Reviewers: dblaikie, jmorse, gbedwell Reviewed By: dblaikie Subscribers: aprantl, eraman, hiraditya Differential Revision: https://reviews.llvm.org/D57257 llvm-svn: 352246	2019-01-25 21:49:54 +00:00
Guozhi Wei	81f3fd4bf8	[MBP] Don't move bottom block before header if it can't reduce taken branches If bottom block BB has only one successor OldTop, in most cases it is profitable to move it before OldTop, except in the following case: -->OldTop<- \| . \| \| . \| \| . \| ---Pred \| \| \| BB----- Moving BB before OldTop can't reduce the number of taken branches here, so this patch detects this case and prevents the move. Differential Revision: https://reviews.llvm.org/D57067 llvm-svn: 352236	2019-01-25 19:45:13 +00:00
Craig Topper	4cf28bad5b	[X86] Combine masked store and truncate into masked truncating stores. We also need to combine to masked truncating with saturation stores, but I'm leaving that for a future patch. This does regress some tests that used truncate with saturation followed by a masked store. Those now use a truncating store and use min/max to saturate. Differential Revision: https://reviews.llvm.org/D57218 llvm-svn: 352230	2019-01-25 18:37:36 +00:00
Simon Pilgrim	f56298f4b9	[X86] Simplify X86ISD::ADD/SUB if we don't use the result flag Simplify to the generic ISD::ADD/SUB if we don't make use of the result flag. This mainly helps with ADDCARRY/SUBBORROW intrinsics which get expanded to X86ISD::ADD/SUB but could be simplified further. Noticed in some of the test cases in PR31754 Differential Revision: https://reviews.llvm.org/D57234 llvm-svn: 352210	2019-01-25 15:58:28 +00:00
Sanjay Patel	21aa6ddc14	[x86] narrow a shuffle that doesn't use or set any high elements This isn't the final fix for our reduction/horizontal codegen, but it takes care of a lot of the problems. After we narrow the shuffle, existing combines for insert/extract and binops kick in, and we end up with cheaper 128-bit ops. The avg and mul reduction tests show an existing shuffle lowering hole for AVX2/AVX512. I think in its most minimal form this is: https://bugs.llvm.org/show_bug.cgi?id=40434 ...but we might need multiple fixes to get it right. Differential Revision: https://reviews.llvm.org/D57156 llvm-svn: 352209	2019-01-25 15:37:42 +00:00
Simon Pilgrim	d41ccddda9	[X86] Add addcarry/subborrow combine tests Show failure to simplify cases with zero op/flags llvm-svn: 352196	2019-01-25 12:26:27 +00:00
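
Commit 8e1a464e6a above expands uadd.sat(a, b) to umin(a, ~b) + b. The following is a minimal sketch of why that identity holds, written for 8-bit values so it can be checked exhaustively. The function names are hypothetical and do not come from LLVM; this only illustrates the arithmetic, not the SelectionDAG code.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: saturating unsigned add computed in a wider type. */
uint8_t uaddsat_ref(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > UINT8_MAX ? (uint8_t)UINT8_MAX : (uint8_t)sum;
}

/* Expansion from the commit message: umin(a, ~b) + b. */
uint8_t uaddsat_expanded(uint8_t a, uint8_t b) {
    uint8_t not_b = (uint8_t)~b;               /* 255 - b: headroom above b */
    uint8_t clamped = a < not_b ? a : not_b;   /* umin(a, ~b) */
    return (uint8_t)(clamped + b);             /* cannot wrap: clamped <= 255 - b */
}

/* Assert that both forms agree for every pair of 8-bit inputs. */
void check_all_u8(void) {
    for (unsigned a = 0; a <= UINT8_MAX; ++a)
        for (unsigned b = 0; b <= UINT8_MAX; ++b)
            assert(uaddsat_ref((uint8_t)a, (uint8_t)b) ==
                   uaddsat_expanded((uint8_t)a, (uint8_t)b));
}
```

Clamping a to ~b before the add means the sum can never exceed the maximum value, so no separate overflow check or select is needed after the addition.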

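Commit a0f743b77a above mentions looping over the set bits of an integer with x &= x - 1, the operation the x86 BLSR instruction performs. A small sketch of such a loop follows. The function name and the choice to record each isolated lowest bit are assumptions made for illustration, since the commit message only shows a generic func call. Whether a compiler actually selects BLSR depends on the target (BMI1 must be available) and the optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* Visit each set bit of x from lowest to highest.
   out[] receives the isolated bits; returns how many were written. */
int collect_set_bits(uint32_t x, uint32_t out[32]) {
    int n = 0;
    while (x != 0) {
        out[n++] = x & (~x + 1u); /* isolate lowest set bit (same as x & -x) */
        x &= x - 1;               /* clear lowest set bit: the BLSR pattern */
    }
    assert(n <= 32);              /* a 32-bit value has at most 32 set bits */
    return n;
}
```

The loop runs once per set bit rather than once per bit position, which is why this form is common in bitset iteration.
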

13252 Commits