llvm-project

Commit Graph

Author	SHA1	Message	Date
Matt Arsenault	63f953795e	AMDGPU: Fold fneg into fma or fmad Patch mostly by Fiona Glaser llvm-svn: 291733	2017-01-12 00:32:16 +00:00
Matt Arsenault	4103a81d6d	AMDGPU: Fold fneg into fmul Patch mostly by Fiona Glaser llvm-svn: 291732	2017-01-12 00:23:20 +00:00
Matt Arsenault	2529fba989	AMDGPU: Fold fneg into fadd Patch mostly by Fiona Glaser llvm-svn: 291731	2017-01-12 00:09:34 +00:00
Matt Arsenault	2a04ff97ad	AMDGPU: Pull fneg/fabs out of a select Allows better source modifier usage. llvm-svn: 291729	2017-01-11 23:57:38 +00:00
Matt Arsenault	24a1273ae1	AMDGPU: Fix shrinking of addc/subb. To shrink to VOP2 the input carry must also be VCC. llvm-svn: 291720	2017-01-11 22:58:12 +00:00
Matt Arsenault	682eb4396a	AMDGPU: Fix sext_inreg for i1 in i16 This produces worse code when i16 is legal, mostly due to combines getting confused by conversions inserted for uniform 16-bit operations. llvm-svn: 291717	2017-01-11 22:35:22 +00:00
Matt Arsenault	28bd4cbeaf	AMDGPU: Fix breaking VOP3 v_add_i32s This was shrinking the instruction even though the carry output register was a virtual register, not known VCC. llvm-svn: 291716	2017-01-11 22:35:17 +00:00
Matt Arsenault	69e3001b84	AMDGPU: Fix folding immediates into mac src2 Whether it is legal or not needs to check for the instruction it will be replaced with. llvm-svn: 291711	2017-01-11 22:00:02 +00:00
Sam Kolton	9772eb3907	[AMDGPU] Assembler: SDWA/DPP should not accept scalar registers and immediate operands Reviewers: artem.tamazov, nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28157 llvm-svn: 291668	2017-01-11 11:46:30 +00:00
Mohammed Agabaria	2c96c43388	[X86] updating TTI costs for arithmetic instructions on X86\SLM arch. updated instructions: pmulld, pmullw, pmulhw, mulsd, mulps, mulpd, divss, divps, divsd, divpd, addpd and subpd. special optimization case which replaces pmulld with pmullw\pmulhw\pshuf seq. In case if the real operands bitwidth <= 16. Differential Revision: https://reviews.llvm.org/D28104 llvm-svn: 291657	2017-01-11 08:23:37 +00:00
Jan Vesely	0d6cb1caaf	AMDGPU/EG,CM: Add fp16 conversion instructions Differential Revision: https://reviews.llvm.org/D28164 llvm-svn: 291622	2017-01-11 00:12:39 +00:00
Matt Arsenault	51818c14b3	AMDGPU: Constant fold when immediate is materialized In future commits these patterns will appear after moveToVALU changes. llvm-svn: 291615	2017-01-10 23:32:04 +00:00
Matt Arsenault	8871683d60	AMDGPU: Add tests for HasMultipleConditionRegisters This was enabled without many specific tests or the comment. llvm-svn: 291586	2017-01-10 19:08:15 +00:00
Matt Arsenault	6dca542b4a	AMDGPU: Add Assert[SZ]Ext during argument load creation For i16 zeroext arguments when i16 was a legal type, the known bits information from the truncate was lost. Insert a zeroext so the known bits optimizations work with the 32-bit loads. Fixes code quality regressions vs. SI in min.ll test. llvm-svn: 291461	2017-01-09 18:52:39 +00:00
Matt Arsenault	5f45e7890a	Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector") llvm-svn: 291460	2017-01-09 18:44:11 +00:00
Jan Vesely	06200bd7bc	AMDGPU/R600: Don't use REGISTER_{LOAD,STORE} ISD nodes This will make transition to SCRATCH_MEMORY easier Differential Revision: https://reviews.llvm.org/D24746 llvm-svn: 291279	2017-01-06 21:00:46 +00:00
Konstantin Zhuravlyov	31dbb0391d	[AMDGPU] Remove extra semicolon. NFC llvm-svn: 291246	2017-01-06 17:23:21 +00:00
Konstantin Zhuravlyov	67a6d5401a	[AMDGPU] Do not emit .AMDGPU.config section for amdhsa Differential Revision: https://reviews.llvm.org/D27732 llvm-svn: 291245	2017-01-06 17:02:10 +00:00
Evgeniy Stepanov	e8e11eb726	Revert "Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector")" Summary: This reverts commit r291144. It breaks build bots. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-autoconf/builds/3270, http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fuzzer/builds/2058 lib/Target/AMDGPU/AsmParser/AMDGPUAsmParser.cpp:1638:12: error: could not convert ‘(const unsigned int)(& Variants)’ from ‘const unsigned int’ to ‘llvm::ArrayRef<unsigned int>’ return Variants; Reviewers: eugenis, tstellarAMD Patch by Alex Shlyapnikov. Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D28372 llvm-svn: 291168	2017-01-05 19:51:13 +00:00
Matt Arsenault	ec63f62c58	Reapply r291025 ("AMDGPU: Remove unneccessary intermediate vector") Arrays are supposed to be static const llvm-svn: 291144	2017-01-05 17:36:11 +00:00
Richard Smith	d4d575b955	Revert r291025 ("AMDGPU: Remove unneccessary intermediate vector") This caused buildbot failures due to returning ArrayRefs referencing local (temporary) objects. llvm-svn: 291067	2017-01-05 03:13:10 +00:00
Matt Arsenault	6796d7ea8b	AMDGPU: Remove unneccessary intermediate vector llvm-svn: 291025	2017-01-04 22:54:10 +00:00
Jan Vesely	d48445d513	AMDGPU/SI: Implement sendmsghalt intrinsic v2: expose using amdgcn prefix Differential Revision: https://reviews.llvm.org/D23511 llvm-svn: 290977	2017-01-04 18:06:55 +00:00
Artem Tamazov	25478d821b	[AMDGPU][mc] Enable absolute expressions in .hsa_code_object_isa directive Among other stuff, this allows to use predefined .option.machine_version_major /minor/stepping symbols in the directive. Relevant test expanded at once (also file renamed for clarity). Differential Revision: https://reviews.llvm.org/D28140 llvm-svn: 290710	2016-12-29 15:41:52 +00:00
Artem Tamazov	a01cce8887	[AMDGPU][llvm-mc] Predefined symbols to access register counts (.kernel.{v|s}gpr_count) The feature allows for conditional assembly, filling the entries of .amd_kernel_code_t etc. Symbols are defined with value 0 at the beginning of each kernel scope. After each register usage, the respective symbol is set to: value = max( value, ( register index + 1 ) ) Thus, at the end of scope the value represents a count of used registers. Kernel scopes begin at .amdgpu_hsa_kernel directive, end at the next .amdgpu_hsa_kernel (or EOF, whichever comes first). There is also dummy scope that lies from the beginning of source file til the first .amdgpu_hsa_kernel. Test added. Differential Revision: https://reviews.llvm.org/D27859 llvm-svn: 290608	2016-12-27 16:00:11 +00:00
Sam Kolton	e66365e07d	[AMDGPU] Assembler: support SDWA and DPP for VOP2b instructions Reviewers: nhaustov, artem.tamazov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D28051 llvm-svn: 290599	2016-12-27 10:06:42 +00:00
Jan Vesely	206a510e54	AMDGPU: split ret/noret patterns for global atomics Differential Revision: https://reviews.llvm.org/D27989 llvm-svn: 290435	2016-12-23 15:34:51 +00:00
Chandler Carruth	ee08676102	Enable '-Wstring-conversion' and fix some bad asserts that it helped find. Notable is the assert in NewGVN which had no effect because of the bug. llvm-svn: 290400	2016-12-23 01:38:06 +00:00
Matt Arsenault	0b26e47345	AMDGPU: Invert cmp + select with constant Canonicalize a select with a constant to the false side. This enables more instruction shrinking opportunities since an inline immediate can be used for the false side of v_cndmask_b32_e32. This seems to usually be better but causes some code size regressions in some tests. llvm-svn: 290372	2016-12-22 21:40:08 +00:00
Matt Arsenault	941632839f	AMDGPU: Use i16 for i16 shift amount llvm-svn: 290351	2016-12-22 16:36:25 +00:00
Matt Arsenault	3c97e2030a	AMDGPU: Fix missing 16-bit cmpx instructions llvm-svn: 290349	2016-12-22 16:27:14 +00:00
Matt Arsenault	18f56be3d2	AMDGPU: Use i16 comparison instructions llvm-svn: 290348	2016-12-22 16:27:11 +00:00
Matt Arsenault	fef7beb6a6	AMDGPU: Fixed '!NodePtr->isKnownSentinel()' assert Caused by dereferencing end iterator when trying to const cast the iterator. Patch by Martin Sherburn llvm-svn: 290347	2016-12-22 16:06:32 +00:00
Sam Kolton	a568e3dde7	[AMDGPU] Add pseudo SDWA instructions Summary: This is needed for later SDWA support in CodeGen. Reviewers: vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27412 llvm-svn: 290338	2016-12-22 12:57:41 +00:00
Sam Kolton	a6792a39c4	[AMDGPU] Disassembler: fix for disaasembling v_mac_f32/16_dpp/sdwa Summary: Real instruction should copy constraints from real instruction. This allows auto-generated disassembler to correctly process tied operands. Reviewers: nhaustov, vpykhtin, tstellarAMD Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye Differential Revision: https://reviews.llvm.org/D27847 llvm-svn: 290336	2016-12-22 11:30:48 +00:00
Matt Arsenault	3de76b9dc8	AMDGPU: Fix missing commute table entries for cmpx No tests because these aren't currently used anywhere. llvm-svn: 290316	2016-12-22 04:39:41 +00:00
Matt Arsenault	e7d8ed32f9	AMDGPU: Swap order of operands in fadd/fsub combine FMA is canonicalized to constant in the middle operand. Do the same so fmad matches and avoid an extra combine step. llvm-svn: 290313	2016-12-22 04:03:40 +00:00
Matt Arsenault	46e6b7adef	AMDGPU: Check fast math flags in fadd/fsub combines llvm-svn: 290312	2016-12-22 04:03:35 +00:00
Matt Arsenault	770ec8680a	AMDGPU: Form more FMAs if fusion is allowed Extend the existing fadd/fsub->fmad combines to produce FMA if allowed. llvm-svn: 290311	2016-12-22 03:55:35 +00:00
Matt Arsenault	d8b73d5304	AMDGPU: Move combines into separate functions llvm-svn: 290309	2016-12-22 03:44:42 +00:00
Matt Arsenault	ef82ad94ea	AMDGPU: Enable some f32 fadd/fsub combines for f16 llvm-svn: 290308	2016-12-22 03:40:39 +00:00
Matt Arsenault	9e22bc2cd3	AMDGPU: Implement isFMAFasterThanFMulAndFAdd for f16 llvm-svn: 290307	2016-12-22 03:21:48 +00:00
Matt Arsenault	cdff21b14e	AMDGPU: Allow rcp and rsq usage with f16 llvm-svn: 290302	2016-12-22 03:05:44 +00:00
Matt Arsenault	4052a576c0	AMDGPU: Custom lower f16 fdiv llvm-svn: 290301	2016-12-22 03:05:41 +00:00
Matt Arsenault	ce84130f85	AMDGPU: Implement f16 fcanonicalize llvm-svn: 290300	2016-12-22 03:05:37 +00:00
Matt Arsenault	4e55c1ec11	AMDGPU: Update isFPImmLegal for f16 I don't think this matters because ConstantFP is legal. llvm-svn: 290299	2016-12-22 03:05:30 +00:00
Tom Stellard	d8ea85aced	AMDGPU/SI: Fix file header llvm-svn: 290265	2016-12-21 19:06:24 +00:00
Davide Italiano	c96272c47c	[AMDGPU] Garbage collect dead code. NFCI. llvm-svn: 290249	2016-12-21 10:19:00 +00:00
Matt Arsenault	9e91014282	AMDGPU: Allow 16-bit types in inline asm constraints llvm-svn: 290193	2016-12-20 19:06:12 +00:00
Matt Arsenault	4c1e9ec008	AMDGPU: Don't add same instruction multiple times to worklist When the instruction is processed the first time, it may be deleted resulting in crashes. While the new test adds the same user to the worklist twice, this particular case doesn't crash but I'm not sure why. llvm-svn: 290191	2016-12-20 18:55:06 +00:00
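
Note on a01cce8887 (.kernel.{v|s}gpr_count): the update rule quoted in that commit message, value = max(value, register index + 1), reset to 0 at each kernel scope, can be sketched as below. This is only an illustration of the rule as the message states it, not the assembler's actual code. The event tuples, the `count_registers` function, and the `"<dummy>"` scope label are hypothetical names invented for this sketch.

```python
def count_registers(events):
    """Simulate per-scope register counting as described in the commit message.

    `events` is a hypothetical stream of (kind, arg) tuples:
      ("kernel", name) -> a new kernel scope begins (like .amdgpu_hsa_kernel)
      ("vgpr", index)  -> a vector register with that index is used
      ("sgpr", index)  -> a scalar register with that index is used
    """
    # The dummy scope covers everything before the first kernel directive.
    scope = "<dummy>"
    counts = {scope: {"vgpr": 0, "sgpr": 0}}

    for kind, arg in events:
        if kind == "kernel":
            # A new scope starts: both symbols are defined with value 0.
            scope = arg
            counts[scope] = {"vgpr": 0, "sgpr": 0}
        else:
            # value = max(value, register index + 1)
            counts[scope][kind] = max(counts[scope][kind], arg + 1)

    return counts


example_events = [
    ("sgpr", 3),          # used in the dummy scope
    ("kernel", "k1"),
    ("vgpr", 0),
    ("vgpr", 7),
    ("sgpr", 1),
    ("kernel", "k2"),     # ends k1's scope, starts k2's
    ("vgpr", 2),
]
example_counts = count_registers(example_events)
```

Because the symbol tracks the highest index seen plus one, using only v7 in a scope still yields a count of 8, which matches the message's "count of used registers" in the sense of the register range the kernel needs.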

1419 Commits