llvm-project

Commit Graph

Author	SHA1	Message	Date
Justin Lebar	ccbd8f5a02	Revert "[attrs] Handle convergent CallSites." This reverts r261544, which was causing a test failure in Transforms/FunctionAttrs/readattrs.ll. llvm-svn: 261549	2016-02-22 18:24:43 +00:00
Justin Lebar	7bf9187abb	[attrs] Handle convergent CallSites. Summary: Previously we had a notion of convergent functions but not of convergent calls. This is insufficient to correctly analyze calls where the target is unknown, e.g. indirect calls. Now a call is convergent if it targets a known-convergent function, or if it's explicitly marked as convergent. As usual, we can remove convergent where we can prove that no convergent operations are performed in the call. Reviewers: chandlerc, jingyue Subscribers: hfinkel, jhen, tra, llvm-commits Differential Revision: http://reviews.llvm.org/D17317 llvm-svn: 261544	2016-02-22 17:51:35 +00:00
Sanjay Patel	2440130437	fix inaccurate comment; NFC llvm-svn: 261484	2016-02-21 17:33:31 +00:00
Sanjay Patel	368ac5dbf7	[InstCombine] add getNegativeIsTrueBoolVec() helper function; NFC Originally part of: http://reviews.llvm.org/D17485 We need this when simplifying masked memory ops too. llvm-svn: 261483	2016-02-21 17:29:33 +00:00
Simon Pilgrim	471efd244a	[InstCombine] SSE/SSE2 (u)comiss/(u)comisd comparison intrinsics only use the lowest vector element llvm-svn: 261460	2016-02-20 23:17:35 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Artur Pilipenko	44e7c51b05	Don't propagate dereferenceable attribute through gc.relocate in InstCombine Reviewed By: reames Differential Revision: http://reviews.llvm.org/D16143 llvm-svn: 260509	2016-02-11 11:22:46 +00:00
Philip Reames	ea4d8e8ce9	[InstCombine][GC] Handle gc.relocations of vector type We introduced gc.relocates of vector-of-pointer types a couple of weeks back. Somehow, I missed updating the InstCombine rule to account for this. If we hit this code path with a vector-of-pointers gc.relocate, we'd crash on a cast<PointerType>. I also took the chance to do a bit of code style cleanup. llvm-svn: 260279	2016-02-09 21:09:22 +00:00
Sanjay Patel	4b198802b3	function names start with a lowercase letter; NFC llvm-svn: 259425	2016-02-01 22:23:39 +00:00
Sanjay Patel	103ab7d571	[InstCombine] simplify masked scatter/gather intrinsics with zero masks A masked scatter with a zero mask means there's no store. A masked gather with a zero mask means the passthru arg is returned. This is a continuation of: http://reviews.llvm.org/rL259369 http://reviews.llvm.org/rL259392 llvm-svn: 259421	2016-02-01 22:10:26 +00:00
Sanjay Patel	04f792bdc9	[InstCombine] simplify masked store intrinsics with all ones or zeros masks A masked store with a zero mask means there's no store. A masked store with an allOnes mask means it's a normal vector store. This is a continuation of: http://reviews.llvm.org/rL259369 llvm-svn: 259392	2016-02-01 19:39:52 +00:00
Sanjay Patel	b695c5557c	[InstCombine] simplify masked load intrinsics with all ones or zeros masks A masked load with a zero mask means there's no load. A masked load with an allOnes mask means it's a normal vector load. Differential Revision: http://reviews.llvm.org/D16691 llvm-svn: 259369	2016-02-01 17:00:10 +00:00
Sanjay Patel	0069f56e33	add helper function for minnum/maxnum ; NFC llvm-svn: 259326	2016-01-31 16:35:23 +00:00
Sanjay Patel	6038d3e5c6	function names start with a lower case letter ; NFC llvm-svn: 259264	2016-01-29 23:27:03 +00:00
Sanjay Patel	f9f5d3cc45	fix formatting; NFC llvm-svn: 259262	2016-01-29 23:14:58 +00:00
Sanjay Patel	03c03f57ee	less indenting; NFCI llvm-svn: 259002	2016-01-28 00:03:16 +00:00
Matt Arsenault	bef34e21c7	AMDGPU: Rename intrinsics to use amdgcn prefix The intrinsic target prefix should match the target name as it appears in the triple. This is not yet complete, but gets most of the important ones. llvm.AMDGPU.* intrinsics used by mesa and libclc are still handled for compatibility for now. llvm-svn: 258557	2016-01-22 21:30:34 +00:00
Sanjay Patel	cd4377c74d	don't repeat function names in comments; NFC llvm-svn: 258360	2016-01-20 22:24:38 +00:00
Sanjay Patel	1c600c6e83	80-cols; NFC llvm-svn: 258323	2016-01-20 16:41:43 +00:00
Sanjay Patel	142c49bc42	remove outdated comment; NFC llvm-svn: 258147	2016-01-19 17:29:22 +00:00
Manuel Jacob	5f6eaac611	GlobalValue: use getValueType() instead of getType()->getPointerElementType(). Reviewers: mjacob Subscribers: jholewinski, arsenm, dsanders, dblaikie Patch by Eduard Burtescu. Differential Revision: http://reviews.llvm.org/D16260 llvm-svn: 257999	2016-01-16 20:30:46 +00:00
Manuel Jacob	83eefa6d20	[Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC. Summary: This commit renames GCRelocateOperands to GCRelocateInst and makes it an intrinsic wrapper, similar to e.g. MemCpyInst. Also, all users of GCRelocateOperands were changed to use the new intrinsic wrapper instead. Reviewers: sanjoy, reames Subscribers: reames, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D15762 llvm-svn: 256811	2016-01-05 04:03:00 +00:00
Sanjay Patel	af674fbfd9	getParent() ^ 3 == getModule() ; NFCI llvm-svn: 255511	2015-12-14 17:24:23 +00:00
Akira Hatanaka	237916b537	[AttributeSet] Overload AttributeSet::addAttribute to reduce compile time. The new overloaded function is used when an attribute is added to a large number of slots of an AttributeSet (for example, to function parameters). This is much faster than calling AttributeSet::addAttribute once per slot, because AttributeSet::getImpl (which calls FoldingSet::FindNodeOrInsertPos) is called only once per function instead of once per slot. With this commit, clang compiles a file which used to take over 22 minutes in just 13 seconds. rdar://problem/23581000 Differential Revision: http://reviews.llvm.org/D15085 llvm-svn: 254491	2015-12-02 06:58:49 +00:00
Sanjoy Das	c521c7bea5	[OperandBundles] Extract duplicated code into a helper function, NFC llvm-svn: 254047	2015-11-25 00:42:24 +00:00
Sanjoy Das	7629346193	[InstCombine] Don't drop operand bundles Reviewers: majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D14857 llvm-svn: 254046	2015-11-25 00:42:19 +00:00
Pete Cooper	67cf9a723b	Revert "Change memcpy/memset/memmove to have dest and source alignments." This reverts commit r253511. This likely broke the bots in http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202 http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787 llvm-svn: 253543	2015-11-19 05:56:52 +00:00
Pete Cooper	72bc23ef02	Change memcpy/memset/memmove to have dest and source alignments. Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html These intrinsics currently have an explicit alignment argument which is required to be a constant integer. It represents the alignment of the source and dest, and so must be the minimum of those. This change allows source and dest to each have their own alignments by using the alignment attribute on their arguments. The alignment argument itself is removed. There are a few places in the code for which the code needs to be checked by an expert as to whether using only src/dest alignment is safe. For those places, they currently take the minimum of src/dest alignments which matches the current behaviour. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false) will now read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false) For out of tree owners, I was able to strip alignment from calls using sed by replacing: (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\) with: $1i1 false) and similarly for memmove and memcpy. I then added back in alignment to test cases which needed it. A similar commit will be made to clang which actually has many differences in alignment as now IRBuilder can generate different source/dest alignments on calls. In IRBuilder itself, a new argument was added. Instead of calling: CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false) you now call CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false) There is a temporary class (IntegerAlignment) which takes the source alignment and rejects implicit conversion from bool. This is to prevent isVolatile here from passing its default parameter to the source alignment. Note, changes in future can now be made to codegen. I didn't change anything here, but this change should enable better memcpy code sequences. Reviewed by Hal Finkel. llvm-svn: 253511	2015-11-18 22:17:24 +00:00
James Molloy	2d09c00b91	[InstCombine] Add trivial folding (bitreverse (bitreverse x)) -> x There are plenty more instcombines we could probably do with bitreverse, but this seems like a very obvious and trivial starting point and was brought up by Hal in his review. llvm-svn: 252879	2015-11-12 12:39:41 +00:00
Simon Pilgrim	216b1bf5ed	[InstCombine] SSE4A constant folding and conversion to shuffles. This patch improves support for combining the SSE4A EXTRQ(I) and INSERTQ(I) intrinsics: 1 - Converts INSERTQ/EXTRQ calls to INSERTQI/EXTRQI if the 'bit index' and 'length' operands are constant 2 - Converts INSERTQI/EXTRQI calls to shufflevector if the bit index/length are both byte aligned (we can already lower shuffles to INSERTQI/EXTRQI if its useful) 3 - Constant folding support 4 - Add zeroinitializer handling Differential Revision: http://reviews.llvm.org/D13348 llvm-svn: 250609	2015-10-17 11:40:05 +00:00
Duncan P. N. Exon Smith	9f8aaf21ba	InstCombine: Remove ilist iterator implicit conversions, NFC Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended. llvm-svn: 250183	2015-10-13 16:59:33 +00:00
Simon Pilgrim	3c2b30f8ba	[InstCombine][SSE4A] Remove broken INSERTQI range combining optimization As discussed in D13348 - the INSERTQI range combining code is wrong in that it confuses the insertion bit index with an extraction bit index. The remaining legal combines are very unlikely (especially once we've converted to shuffles in D13348) so I'm removing the optimization. llvm-svn: 250160	2015-10-13 14:48:54 +00:00
Simon Pilgrim	1d1c56e2df	[InstCombine][X86][XOP] Combine XOP integer vector comparisons to native IR We now have lowering support for XOP PCOM/PCOMU instructions. llvm-svn: 249977	2015-10-11 14:38:34 +00:00
Arnaud A. de Grandmaison	849f3bf8c9	[InstCombine] Remove trivially empty lifetime start/end ranges. Summary: Some passes may open up opportunities for optimizations, leaving empty lifetime start/end ranges. For example, with the following code: void foo(char *, char *); void bar(int Size, bool flag) { for (int i = 0; i < Size; ++i) { char text[1]; char buff[1]; if (flag) foo(text, buff); // BBFoo } } the loop unswitch pass will create 2 versions of the loop, one with flag==true, and the other one with flag==false, but always leaving the BBFoo basic block, with lifetime ranges covering the scope of the for loop. Simplify CFG will then remove BBFoo in the case where flag==false, but will leave the lifetime markers. This patch teaches InstCombine to remove trivially empty lifetime marker ranges, that is ranges ending right after they were started (ignoring debug info or other lifetime markers in the range). This fixes PR24598: excessive compile time after r234581. Reviewers: reames, chandlerc Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13305 llvm-svn: 249018	2015-10-01 14:54:31 +00:00
Andrea Di Biagio	0594e2a1e9	[InstCombine] Teach how to convert SSSE3/AVX2 byte shuffles to builtin shuffles if the shuffle mask is constant. This patch teaches InstCombiner how to convert a SSSE3/AVX2 byte shuffle to a builtin shuffle if the mask is constant. Converting byte shuffle intrinsic calls to builtin shuffles can help finding more opportunities for combining shuffles later on in selection dag. We may end up with byte shuffles with constant masks as the result of inlining. Differential Revision: http://reviews.llvm.org/D13252 llvm-svn: 248913	2015-09-30 16:44:39 +00:00
Simon Pilgrim	9cb018b6b6	[X86][SSE] Replace 128-bit SSE41 PMOVSX intrinsics with native IR This patch removes the x86.sse41.pmovsx* intrinsics, provides a suitable upgrade path and updates relevant tests to sign extend a subvector instead. LLVM counterpart to D12835 Differential Revision: http://reviews.llvm.org/D13002 llvm-svn: 248368	2015-09-23 08:48:33 +00:00
Simon Pilgrim	996725eb17	[InstCombine] Use SimplifyDemandedVectorEltsLow helper function. NFCI. Use the SimplifyDemandedVectorEltsLow helper function introduced in D12680. llvm-svn: 248089	2015-09-19 11:41:53 +00:00
Simon Pilgrim	61116ddc7b	[InstCombine] Added vector demanded bits support for SSE4A EXTRQ/INSERTQ instructions The SSE4A instructions EXTRQ/INSERTQ only use the lower 64-bits (or less) for many of their input vector operands and all of them have undefined upper 64-bits results. Differential Revision: http://reviews.llvm.org/D12680 llvm-svn: 247934	2015-09-17 20:32:45 +00:00
Chen Li	0d043b52eb	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not covered under llvm/test, but fixes should be pretty obvious. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247587	2015-09-14 18:10:43 +00:00
Simon Pilgrim	48ffca0f47	Fixed unused variable warning. llvm-svn: 247505	2015-09-12 14:00:17 +00:00
Simon Pilgrim	20c607b110	[InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion): <4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion. Added constant folding support. Differential Revision: http://reviews.llvm.org/D12731 llvm-svn: 247504	2015-09-12 13:39:53 +00:00
Mehdi Amini	2bd08527ff	Revert "[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite" This reverts commit r247356. Breaks test/Transforms/InstCombine/pr8547.ll with: Wrong types for attribute: byval inalloca nest noalias nocapture nonnull readnone readonly sret dereferenceable(1) dereferenceable_or_null(1) %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i64 0, i64 0), i32 nonnull %conv2) #0 LLVM ERROR: Broken function found, compilation aborted! From: Mehdi Amini <mehdi.amini@apple.com> llvm-svn: 247371	2015-09-11 01:33:48 +00:00
Chen Li	a29c612ddd	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D12779 llvm-svn: 247356	2015-09-10 23:04:49 +00:00
Chen Li	32a51416e5	[InstCombineCalls] Use isKnownNonNullAt() to check nullness of gc.relocate return value Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of gc.relocate return value. In this way it can handle cases where the relocated value does not have nonnull attribute but has a dominating null check from the CFG. Reviewers: reames Subscribers: llvm-commits, sanjoy Differential Revision: http://reviews.llvm.org/D12772 llvm-svn: 247353	2015-09-10 22:35:41 +00:00
Simon Pilgrim	becd5e8abd	[InstCombine] SSE/AVX vector shifts demanded shift amount bits Most SSE/AVX (non-constant) vector shift instructions only use the lower 64-bits of the 128-bit shift amount vector operand, this patch calls SimplifyDemandedVectorElts to optimize for this. I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code, it means that SimplifyX86immshift now (re)decodes the type of shift. Differential Revision: http://reviews.llvm.org/D11938 llvm-svn: 244872	2015-08-13 07:39:03 +00:00
Simon Pilgrim	93f59f53ca	unused variable warning fix. llvm-svn: 244725	2015-08-12 08:23:36 +00:00
Simon Pilgrim	8c049d5c03	[InstCombine] Move SSE/AVX vector blend folding to instcombiner As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely). InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask). I also moved all the relevant combine tests into InstCombine/blend_x86.ll Differential Revision: http://reviews.llvm.org/D11934 llvm-svn: 244723	2015-08-12 08:08:56 +00:00
Simon Pilgrim	a3a72b41de	[InstCombine] Move SSE2/AVX2 arithmetic vector shift folding to instcombiner As discussed in D11760, this patch moves the (V)PSRA(WD) arithmetic shift-by-constant folding to InstCombine to match the logical shift implementations. Differential Revision: http://reviews.llvm.org/D11886 llvm-svn: 244495	2015-08-10 20:21:15 +00:00
Simon Pilgrim	3815c16bf8	[InstCombine] Fix SSE2/AVX2 vector logical shift by constant This patch fixes the sse2/avx2 vector shift by constant instcombine call to correctly deal with the fact that the shift amount is formed from the entire lower 64-bit and not just the lowest element as it currently assumes. e.g. %1 = tail call <4 x i32> @llvm.x86.sse2.psrl.d(<4 x i32> %v, <4 x i32> <i32 15, i32 15, i32 15, i32 15>) In this case, (V)PSRLD doesn't perform a lshr by 15 but in fact attempts to shift by 64424509455 ((15 << 32) | 15) - giving a zero result. In addition, this review also recognizes shift-by-zero from a ConstantAggregateZero type (PR23821). Differential Revision: http://reviews.llvm.org/D11760 llvm-svn: 244341	2015-08-07 18:22:50 +00:00
Simon Pilgrim	18617d193f	Fixed line endings. llvm-svn: 244021	2015-08-05 08:18:00 +00:00
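
Note on 3815c16bf8 (llvm-svn: 244341): the shift-amount arithmetic in that message can be checked with a small model. This is only a sketch in Python, not LLVM code. The helper names psrl_d_shift_amount and psrl_d are hypothetical, and the model assumes the count is taken from the low 64 bits of the count vector and that any count above 31 zeroes every 32-bit lane.

```python
def psrl_d_shift_amount(count_vec):
    """Build the 64-bit shift count from the two lowest i32 lanes (hypothetical helper)."""
    lo = count_vec[0] & 0xFFFFFFFF
    hi = count_vec[1] & 0xFFFFFFFF
    return (hi << 32) | lo


def psrl_d(values, count_vec):
    """Model a logical right shift of each 32-bit lane by the 64-bit count."""
    amount = psrl_d_shift_amount(count_vec)
    if amount >= 32:
        # Assumed behaviour: a count larger than the lane width yields all zeros.
        return [0] * len(values)
    return [(v & 0xFFFFFFFF) >> amount for v in values]


if __name__ == "__main__":
    splat15 = [15, 15, 15, 15]
    print(psrl_d_shift_amount(splat15))         # count formed from lanes 0 and 1
    print(psrl_d([0x80000000] * 4, splat15))    # every lane shifted out
    print(psrl_d([0x80000000] * 4, [15, 0, 0, 0]))  # an actual shift by 15
```

With a splat of 15 the count is (15 << 32) | 15, so the model returns zeros. A shift by 15 needs the second lane of the count vector to be zero.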

317 Commits