llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	10de2775bd	AMDGPU: Remove nan tests in class if src is nnan llvm-svn: 340850	2018-08-28 18:10:02 +00:00
Matt Arsenault	9a389fbd79	AMDGPU: Stop producing icmp/fcmp intrinsics with invalid types llvm-svn: 339815	2018-08-15 21:14:25 +00:00
Matt Arsenault	d35f46caf1	AMDGPU: Turn class x, p_zero|n_zero into fcmp oeq x, 0 The library does use this for some reason. llvm-svn: 339461	2018-08-10 18:58:49 +00:00
Matt Arsenault	24ce89b717	Fix asserts in AMDGCN fmed3 folding by handling more cases of NaN Better NaN handling for AMDGCN fmed3. All operands are checked for NaN now. The checks were moved before the canonicalization to provide a better mapping from fclamp. Changed the behaviour of fmed3(x,y,NaN) to return max(x,y) instead of min(x,y) in light of this. Updated tests as a result and added some new cases to cover the fix. Patch by Alan Baker llvm-svn: 336375	2018-07-05 17:05:36 +00:00
Nicolai Haehnle	b29ee70122	InstCombine/AMDGPU: Add dimension-aware image intrinsics to SimplifyDemanded Summary: Use the expanded features of the TableGen generic tables to avoid manually adding the combinatorially exploded set of intrinsics. The getAMDGPUImageDimIntrinsic lookup function is early-out, i.e. non-AMDGPU intrinsics will never look at the underlying table. Use a generic approach for getting the new intrinsic overload to keep the code simple, and make the image dmask handling more generic: - handle non-sampler image loads - handle the case where the set of demanded elements is not a prefix There is some overlap between this code and an optimization that happens in the backend during code generation. They currently complement each other: - only the codegen optimization can generate vec3 loads - only the InstCombine optimization can handle D16 The InstCombine optimization also likely covers more cases since the codegen optimization is fairly ad-hoc. Ideally, we'll remove the optimization in codegen once the infrastructure for vec3 is in place (which will probably take a long time). Modify the test cases to use dimension-aware intrinsics. This makes it easier to see that the test coverage for the new intrinsics is equivalent, and the old style intrinsics will be removed in a follow-up commit anyway. Change-Id: I4b91ea661413d13004956fe4ef7d13d41b8ce3ad Reviewers: arsenm, rampitec, majnemer Subscribers: kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D48165 llvm-svn: 335230	2018-06-21 13:37:31 +00:00
Roman Lebedev	84c11aed10	[InstCombine] Recommit: Fold (x << y) >> y -> x & (-1 >> y) Summary: We already do it for splat constants, but not just values. Also, undef cases are mostly non-functional. The original commit was reverted because it broke tests for amdgpu backend, which I didn't check. Now, the backend was updated to recognize these new patterns, so we are good. https://bugs.llvm.org/show_bug.cgi?id=37603 https://rise4fun.com/Alive/cplX Reviewers: spatel, craig.topper, mareko, bogner, rampitec, nhaehnle, arsenm Reviewed By: spatel, rampitec, nhaehnle Subscribers: wdng, nhaehnle, llvm-commits Differential Revision: https://reviews.llvm.org/D47980 llvm-svn: 334818	2018-06-15 09:56:52 +00:00
Stanislav Mekhanoshin	0e132dca53	[AMDGPU] Optimize old value of v_mov_b32_dpp We can eliminate old value if bound_ctrl = 1 and row_mask = bank_mask = 0xf. This is an alternative implementation working with the intrinsic in InstCombine. Original review for past-ISel optimization: D46570. Differential Revision: https://reviews.llvm.org/D46596 llvm-svn: 332956	2018-05-22 08:04:33 +00:00
Marek Olsak	13e4741275	AMDGPU: Add intrinsics llvm.amdgcn.cvt.{pknorm.i16, pknorm.u16, pk.i16, pk.u16} Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D41663 llvm-svn: 323908	2018-01-31 20:18:04 +00:00
Marek Olsak	ce76ea0394	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 llvm-svn: 316427	2017-10-24 10:27:13 +00:00
Marek Olsak	2114fc3bcb	AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 llvm-svn: 316426	2017-10-24 10:26:59 +00:00
Justin Bogner	be42c4aec8	Update three tests I missed in r302979 and r302990 llvm-svn: 303319	2017-05-18 00:58:06 +00:00
Justin Bogner	3c6fbad388	InstCombine: Move tests that use target intrinsics into subdirectories Tests with target intrinsics are inherently target specific, so it doesn't actually make sense to run them if we've excluded their target. llvm-svn: 302979	2017-05-13 05:39:46 +00:00

12 Commits
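
The fold named in commit 84c11aed10, (x << y) >> y -> x & (-1 >> y), can be checked exhaustively at a small bit width. The sketch below is only an illustration of the identity, not LLVM's implementation: it assumes fixed-width unsigned values, a logical (zero-filling) right shift, and shift amounts smaller than the width. The function names and the 8-bit width are hypothetical choices made for this example.

```python
def shl_then_lshr(x, y, bits):
    """Compute (x << y) >> y on a fixed-width unsigned value."""
    mask = (1 << bits) - 1           # all-ones value, i.e. -1 at this width
    return ((x << y) & mask) >> y    # truncate after the left shift, then shift back


def masked(x, y, bits):
    """Compute x & (-1 >> y), the folded form, at the same width."""
    mask = (1 << bits) - 1
    return x & (mask >> y)           # logical shift of all-ones keeps the low (bits - y) bits


def fold_holds(bits=8):
    """Return True if both forms agree for every x and every in-range shift y."""
    for x in range(1 << bits):
        for y in range(bits):        # y >= bits is out of range and not considered here
            if shl_then_lshr(x, y, bits) != masked(x, y, bits):
                return False
    return True


if __name__ == "__main__":
    print(fold_holds(8))
```

Both forms clear the top y bits of x and leave the rest untouched, which is why the shift pair can be replaced by a single mask.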