llvm-project

Commit Graph

Author	SHA1	Message	Date
Artyom Skrobov	40a4f40679	[ARM] Recommit the glueless lowering of addc/adde in Thumb1, including the amended (no UB anymore) fix for adding/subtracting -2147483648. This reverts r298328 "[ARM] Revert r297443 and r297820." and partially reverts r297842 "Revert "[Thumb1] Fix the bug when adding/subtracting -2147483648"" llvm-svn: 298417	2017-03-21 18:39:41 +00:00
Krzysztof Parzyszek	d033d1fd82	Recommit r298282 with fixes for memory allocation/deallocation [Hexagon] Recognize polynomial-modulo loop idiom again Regain the ability to recognize loops calculating polynomial modulo operation. This ability has been lost due to some changes in the preceding optimizations. Add code to preprocess the IR to a form that the pattern matching code can recognize. llvm-svn: 298400	2017-03-21 17:09:27 +00:00
Marek Olsak	5c7a61d221	AMDGPU: Buffer descriptor changes for GFX9 Reviewers: arsenm Subscribers: qcolombet, kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31158 llvm-svn: 298397	2017-03-21 17:00:39 +00:00
Marek Olsak	e22fdb9cac	AMDGPU: Always use VGPR indexing on GFX9 Reviewers: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, dstuttard, tpr Differential Revision: https://reviews.llvm.org/D31157 llvm-svn: 298396	2017-03-21 17:00:32 +00:00
Krzysztof Parzyszek	5e7f06f354	[Hexagon] Add -march=hexagon to a testcase llvm-svn: 298395	2017-03-21 16:59:40 +00:00
Matt Arsenault	f8fb605a68	AMDGPU: Fix asserting on 0 dmask for image intrinsics Fold these to undef during lowering so users get eliminated. llvm-svn: 298387	2017-03-21 16:32:17 +00:00
Matt Arsenault	964a848514	AMDGPU: Convert image intrinsic uses in tests llvm-svn: 298386	2017-03-21 16:24:12 +00:00
Matt Arsenault	dce313c3cf	DAG: Fold bitcast/extract_vector_elt of undef to undef Fixes not eliminating store when intrinsic is lowered to undef. llvm-svn: 298385	2017-03-21 16:20:16 +00:00
Simon Pilgrim	5e39cbaee5	Fix shufpd test name. llvm-svn: 298381	2017-03-21 15:12:53 +00:00
Sanjay Patel	79379cae15	[x86] use PMOVMSK for vector-sized equality comparisons We could do better by splitting any oversized type into whatever vector size the target supports, but I left that for future work if it ever comes up. The motivating case is memcmp() calls on 16-byte structs, so I think we can wire that up with a TLI hook that feeds into this. Differential Revision: https://reviews.llvm.org/D31156 llvm-svn: 298376	2017-03-21 13:50:33 +00:00
Simon Pilgrim	8bda035121	[X86][AVX] Tests showing missing SHUFPD + ZERO lowering This lowers to SHUFPD if the input is zeroinitializer but not with a demanded elts optimized build vector. llvm-svn: 298370	2017-03-21 13:30:40 +00:00
Valery Pykhtin	fd4c410f4d	[AMDGPU] Iterative scheduling infrastructure + minimal registry scheduler Differential revision: https://reviews.llvm.org/D31046 llvm-svn: 298368	2017-03-21 13:15:46 +00:00
Volkan Keles	044e003203	[GlobalISel] Fix shufflevector tests clang-lld-x86_64-2stage fails because of the order of the instructions. `CHECK-DAG` directives should fix the problem. llvm-svn: 298367	2017-03-21 13:12:59 +00:00
Sam Kolton	f60ad58dad	[ADMGPU] SDWA peephole optimization pass. Summary: First iteration of SDWA peephole. This pass tries to combine several instruction into one SDWA instruction. E.g. it converts: ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1 V_ADD_I32_e32 %vreg2, %vreg0, %vreg3 V_LSHLREV_B32_e32 %vreg4, 16, %vreg2 ''' Into: ''' V_ADD_I32_sdwa %vreg4, %vreg1, %vreg3 dst_sel:WORD_1 dst_unused:UNUSED_PAD src0_sel:WORD_1 src1_sel:DWORD ''' Pass structure: 1. Iterate over machine instruction in basic block and try to apply "SDWA patterns" to each of them. SDWA patterns match machine instruction into either source or destination SDWA operand. E.g. ''' V_LSHRREV_B32_e32 %vreg0, 16, %vreg1''' is matched to source SDWA operand '''%vreg1 src_sel:WORD_1'''. 2. Iterate over found SDWA operands and find instruction that could be potentially coverted into SDWA. E.g. for source SDWA operand potential instruction are all instruction in this basic block that uses '''%vreg0''' 3. Iterate over all potential instructions and check if they can be converted into SDWA. 4. Convert instructions to SDWA. This review contains basic implementation of SDWA peephole pass. This pass requires additional testing fot both correctness and performance (no performance testing done). There are several ways this pass can be improved: 1. Make this pass work on whole function not only basic block. As I can see this can be done right now without changes to pass. 2. Introduce more SDWA patterns 3. Introduce mnemonics to limit when SDWA patterns should apply Reviewers: vpykhtin, alex-t, arsenm, rampitec Subscribers: wdng, nhaehnle, mgorny Differential Revision: https://reviews.llvm.org/D30038 llvm-svn: 298365	2017-03-21 12:51:34 +00:00
Andrea Di Biagio	7937be7dd3	[DebugInfo][X86] Teach Optimize LEAs pass to handle debug values This patch fixes an issue in the Optimize LEAs pass where redundant LEAs were not removed because they were being used by debug values. The debug values are now ignored when determining whether LEAs are redundant. For now the debug values for the redundant LEAs are marked as undefined, effectively lost. The intention is for a follow up patch which will attempt to preserve the debug values where possible. Patch by Andrew Ng. Differential Revision: https://reviews.llvm.org/D30835 llvm-svn: 298360	2017-03-21 11:36:21 +00:00
Jonas Paulsson	54c7680e1f	[DAGTypeLegalizer] Handle widening truncate to vector of i1. Previously, PromoteIntRes_TRUNCATE() did not handle the case where the operand needs widening, which resulted in llvm_unreachable(). This patch adds the needed handling, along with a test case. Review: Eli Friedman, Simon Pilgrim. https://reviews.llvm.org/D31077 llvm-svn: 298357	2017-03-21 10:24:14 +00:00
Volkan Keles	75bdc7690e	[GlobalISel] Translate shufflevector Reviewers: qcolombet, aditya_nandakumar, t.p.northover, javed.absar, ab, dsanders Reviewed By: javed.absar Subscribers: dberris, rovka, llvm-commits, kristof.beyls Differential Revision: https://reviews.llvm.org/D30962 llvm-svn: 298347	2017-03-21 08:44:13 +00:00
Jonas Paulsson	bd65421f08	[SystemZ] Don't drop MO flags in foldMemoryOperandImpl() The def operand of the new LG/LD should have the old def operands flags and subreg index. New test: test/CodeGen/SystemZ/fold-memory-op-impl.ll Review: Ulrich Weigand llvm-svn: 298341	2017-03-21 05:49:40 +00:00
Vitaly Buka	c12716e742	Revert "[Hexagon] Recognize polynomial-modulo loop idiom again" Fix memory leaks on check-llvm tests detected by Asan. This reverts commit r298282. llvm-svn: 298329	2017-03-21 00:59:51 +00:00
Eli Friedman	76732acc23	[ARM] Revert r297443 and r297820. The glueless lowering of addc/adde in Thumb1 has known serious miscompiles (see https://reviews.llvm.org/D31081), and r297820 causes an infinite loop for certain constructs. It's not clear when they will be fixed, so let's just take them out of the tree for now. (I resolved a small conflict with r297453.) llvm-svn: 298328	2017-03-21 00:26:39 +00:00
Vadzim Dambrouski	ba789cbd3d	[ARM] Fix PR32130: Handle promotion of zero sized constants. The special case of zero sized values was previously not handled correctly. This patch handles this by not promoting if the size is zero. Patch by Tim Neumann. Differential Revision: https://reviews.llvm.org/D31116 llvm-svn: 298320	2017-03-20 22:59:57 +00:00
Sanjay Patel	f238902f52	[x86] add tests for setcc of i128/i256; NFC llvm-svn: 298317	2017-03-20 22:15:40 +00:00
Tim Northover	4340d64f91	GlobalISel: add implicit defs & uses when mutating an instruction. Otherwise a scheduler might do bad things to the code we produce. llvm-svn: 298311	2017-03-20 21:58:23 +00:00
David L. Jones	d61548471c	[X86] Clean up test/CodeGen/X86/2006-03-01-InstrSchedBug.ll Summary: - Migrated from grep to FileCheck. - Re-indented, removed boilerplate comments. - Added 'entry' label at beginning of basic block. Patch by Jorge Gorbe! Reviewed By: RKSimon Subscribers: RKSimon, jgorbe, llvm-commits Differential Revision: https://reviews.llvm.org/D30317 llvm-svn: 298298	2017-03-20 20:10:30 +00:00
Nirav Dave	f5f0864ac2	Add test case for merging of chained stores of mismatched type. llvm-svn: 298293	2017-03-20 19:48:22 +00:00
Krzysztof Parzyszek	8490251de3	[Hexagon] Recognize polynomial-modulo loop idiom again Regain the ability to recognize loops calculating polynomial modulo operation. This ability has been lost due to some changes in the preceding optimizations. Add code to preprocess the IR to a form that the pattern matching code can recognize. llvm-svn: 298282	2017-03-20 18:12:58 +00:00
Konstantin Zhuravlyov	2534bc07f4	[AMDGPU] Run always inliner early in opt Differential Revision: https://reviews.llvm.org/D31141 llvm-svn: 298281	2017-03-20 18:06:45 +00:00
Reid Kleckner	8819c73878	[WinEH] Adjust decision to emit SEH moves for leaf functions Move the check for "MF->hasWinCFI()" up into the calculation of the shouldEmitMoves boolean, rather than putting it in the early returning if. This ensures that endFunction doesn't try to emit .seh_* directives for leaf functions. llvm-svn: 298276	2017-03-20 17:45:59 +00:00
Tim Northover	89268b183f	GlobalISel: allow quad-precision values to be dumped. Otherwise the fallback path fails with an assertion on AAPCS AArch64 targets, when "long double" is encountered. llvm-svn: 298273	2017-03-20 16:52:08 +00:00
Diana Picus	d79253a9f7	[GlobalISel] Use the correct calling conv for calls This commit adds a parameter that lets us pass in the calling convention of the call to CallLowering::lowerCall. This allows us to handle situations where the calling convetion of the callee is different from that of the caller. Differential Revision: https://reviews.llvm.org/D31039 llvm-svn: 298254	2017-03-20 14:40:18 +00:00
Konstantin Zhuravlyov	8a67eb144f	Revert "[AMDGPU] Run always inliner early in opt" This reverts commit r297958, it breaks device-libs build. llvm-svn: 298239	2017-03-20 09:26:08 +00:00
Craig Topper	5992c8d1dc	[AVX-512] Handle kor/kand/kandn/kxor/kxnor/knot intrinsics at lowering time instead of isel Summary: Currently we handle these intrinsics at isel with special patterns. But as they just map to normal logic operations, we should just handle them at lowering. This will expose them to DAG combine optimizations. Right now the kor-sequence test generates a bunch of regclass copies between GR16 and VK16 that the peephole optimizer and/or register coallescing are removing to keep everything in the mask domain. By handling the logic op intrinsics earlier, these copies become bitcasts in the DAG and get removed by DAG combine which seems more robust. This should help enable my plan to stop copying between K registers and GR8/GR16. The peephole optimizer can't remove a chain of copies between K and GR32 with insert_subreg/extract_subreg present in the chain so the kor-sequence test break. But this patch should dodge the problem entirely. Reviewers: zvi, delena, RKSimon, igorb Reviewed By: igorb Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D31056 llvm-svn: 298228	2017-03-19 17:11:09 +00:00
Simon Pilgrim	8424df7dea	Fix constant folding of fp2int to large integers We make the assumption in most of our constant folding code that a fp2int will target an integer of 128-bits or less, calling the APFloat::convertToInteger with only uint64_t[2] of raw bits for the result. Fuzz testing (PR24662) showed that we don't handle other cases at all, resulting in stack overflows and all sorts of crashes. This patch uses the APSInt version of APFloat::convertToInteger instead to better handle such cases. Differential Revision: https://reviews.llvm.org/D31074 llvm-svn: 298226	2017-03-19 16:50:25 +00:00
Ahmed Bougacha	931904d777	[GlobalISel] Don't select trivially dead instructions. Folding instructions when selecting can cause them to become dead. Don't select these dead instructions (if they don't have other side effects, and don't define physical registers). Preserve existing tests by adding COPYs. In some tests, the G_CONSTANT vregs never get constrained to a class: the only use of the vreg was folded into another instruction, so the G_CONSTANT, now dead, never gets selected. llvm-svn: 298224	2017-03-19 16:13:00 +00:00
Ahmed Bougacha	48bcd22ce8	[GlobalISel][AArch64] Add DBG_VALUE select test. NFC. llvm-svn: 298223	2017-03-19 16:12:53 +00:00
Ahmed Bougacha	dcd416a4b9	[GlobalISel][AArch64] Split out cast select tests. NFC. And remove some redundant bitcast tests. Also split the test functions themselves: it makes it obvious to see what's tested where and what isn't, it makes the tests much easier to read and manually update, and, most importantly, it makes them almost trivial to update using tooling. Yes, it's obnoxiously verbose, but said tooling helps upgrade to better MIR syntax whenever available. llvm-svn: 298222	2017-03-19 16:12:51 +00:00
Oren Ben Simhon	75537b6566	[MIR] Test assumes x64 windows calling convention upon printing/parsing MIR output/input. llvm-svn: 298212	2017-03-19 13:23:20 +00:00
Benjamin Kramer	6520f83ba4	[MIR] Add triple to test that assumes it runs on windows. llvm-svn: 298211	2017-03-19 13:04:35 +00:00
Oren Ben Simhon	9ce0ec5dbc	CalleeSavedRegister was removed from MIR and is recalculated upon MIR parsing. llvm-svn: 298210	2017-03-19 11:18:09 +00:00
Oren Ben Simhon	a96fdbf233	Moving the test to x86 because other architectures do not suport regcall calling convention. llvm-svn: 298209	2017-03-19 08:53:42 +00:00
Oren Ben Simhon	0ef61ec32a	[MIR] Support Customed Register Mask and CSRs The MIR printer dumps a string that describe the register mask of a function. A static predefined list of register masks matches a static list of strings. However when the register mask is not from the static predefined list, there is no descriptor string and the printer fails. This patch adds support to custom register mask printing and dumping. Also the list of callee saved registers (describing the registers that must be preserved for the caller) might be dynamic. As such this data needs to be dumped and parsed back to the Machine Register Info. Differential Revision: https://reviews.llvm.org/D30971 llvm-svn: 298207	2017-03-19 08:14:18 +00:00
Nirav Dave	ac6081cb67	Make library calls sensitive to regparm module flag (Fixes PR3997). Reviewers: mkuper, rnk Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin Differential Revision: https://reviews.llvm.org/D27050 llvm-svn: 298179	2017-03-18 00:44:07 +00:00
Stanislav Mekhanoshin	8e45acfc38	[AMDGPU] Add address space based alias analysis pass This is direct port of HSAILAliasAnalysis pass, just cleaned for style and renamed. Differential Revision: https://reviews.llvm.org/D31103 llvm-svn: 298172	2017-03-17 23:56:58 +00:00
Sanjay Patel	0429b1a431	[x86] regenerate checks; NFC llvm-svn: 298166	2017-03-17 23:04:18 +00:00
Sanjay Patel	77e6ebe748	[x86] regenerate checks; NFC llvm-svn: 298164	2017-03-17 22:47:21 +00:00
Jessica Paquette	ea8cc09be0	[Outliner] Add outliner for AArch64 This commit adds the necessary target hooks for outlining in AArch64. It also refactors the switch statement used in `getMemOpBaseRegImmOfsWidth` into a more general function, `getMemOpInfo`. This allows the outliner to share that code without copying and pasting it. The AArch64 outliner can be run using -mllvm -enable-machine-outliner, as with the X86-64 outliner. The test for this pass verifies that the outliner does, in fact outline functions, fixes up the stack accesses properly, and can correctly generate a tail call. In the future, this test should be replaced with a MIR test, so that we can properly test immediate offset overflows in fixed-up instructions. llvm-svn: 298162	2017-03-17 22:26:55 +00:00
Evgeniy Stepanov	51c962f72e	Add !associated metadata. This is an ELF-specific thing that adds SHF_LINK_ORDER to the global's section pointing to the metadata argument's section. The effect of that is a reverse dependency between sections for the linker GC. !associated does not change the behavior of global-dce. The global may also need to be added to llvm.compiler.used. Since SHF_LINK_ORDER is per-section, !associated effectively enables fdata-sections for the affected globals, the same as comdats do. Differential Revision: https://reviews.llvm.org/D29104 llvm-svn: 298157	2017-03-17 22:17:24 +00:00
Eli Friedman	46ddab3810	[SelectionDAG] Remove redundant stores more aggressively. Handle TokenFactors more aggressively in SDValue::reachesChainWithoutSideEffects. This isn't really a very effective change anymore because of other changes to chain handling, but it's a cheap check, and the expanded comments are still useful. It might be possible to loosen the hasOneUse() requirement with a deeper analysis, but a naive implementation of that check would be expensive. Differential Revision: https://reviews.llvm.org/D29845 llvm-svn: 298156	2017-03-17 22:15:50 +00:00
Matt Arsenault	e70d5dcf3e	AMDGPU: Fix handling of constant phi input loop conditions If the loop condition was an i1 phi with a constantexpr input, this would add a loop intrinsic fed by a phi dependent on a call to if.break in the same block. Insert the call in the loop header. llvm-svn: 298121	2017-03-17 20:52:21 +00:00
Sanjay Patel	455703a0c6	[x86] clean up setcc with negated operand transform and add missing test; NFCI llvm-svn: 298118	2017-03-17 20:29:40 +00:00

19665 Commits