llvm-project

Commit Graph

Author	SHA1	Message	Date
George Burgess IV	8a457ddfd8	Fix include case. NFC. llvm-svn: 276465	2016-07-22 20:15:19 +00:00
Tim Northover	33b07d6725	GlobalISel: implement legalization pass, with just one transformation. This adds the actual MachineLegalizeHelper to do the work and a trivial pass wrapper that legalizes all instructions in a MachineFunction. Currently the only transformation supported is splitting up a vector G_ADD into one acting on smaller vectors. llvm-svn: 276461	2016-07-22 20:03:43 +00:00
Krzysztof Parzyszek	3c89bb09d5	[Hexagon] Make HexagonCodeGen depend on Scalar Hexagon backend uses LoopDataPrefetch pass that is defined in Scalar. llvm-svn: 276441	2016-07-22 17:23:46 +00:00
Matt Arsenault	3c07c813c0	AMDGPU: Fix groupstaticsize for large LDS The size can exceed s_movk_i32's limit, and we don't want to use it this early since it inhibits optimizations. This should probably be merged to the release branch. llvm-svn: 276438	2016-07-22 17:01:33 +00:00
Matt Arsenault	8d718dcfda	AMDGPU: Add HSA dispatch id intrinsic llvm-svn: 276437	2016-07-22 17:01:30 +00:00
Matt Arsenault	f9245b75c0	AMDGPU: Delete more dead code Remove dead code from r600 intrinsic removal. Remove unset members, rename StackSize to be less ambiguous. llvm-svn: 276436	2016-07-22 17:01:25 +00:00
Matt Arsenault	7fb961f3e6	AMDGPU: Fix i1 fp_to_int R600's i1 fp_to_uint selected but was incorrect according to what instcombine constant folds to. llvm-svn: 276435	2016-07-22 17:01:21 +00:00
Matt Arsenault	d40ded6681	AMDGPU: Don't reinvent transferSuccessorsAndUpdatePHIs llvm-svn: 276434	2016-07-22 17:01:15 +00:00
Krzysztof Parzyszek	047149f745	[RDF] Make the graph construction/use less expensive - FuncNode::findBlock traverses the function every time. Avoid using it, and keep a cache of block addresses in DataFlowGraph instead. - The operator[] in the map of definition stacks was very slow. Replace the map with unordered_map. llvm-svn: 276429	2016-07-22 16:09:47 +00:00
Krzysztof Parzyszek	d3d0a4bda3	[Hexagon] Use loop data prefetch on Hexagon llvm-svn: 276422	2016-07-22 14:22:43 +00:00
Simon Pilgrim	ea0d4f9962	[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 (reapplied) As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Reapplied with fix for PR28657 - removed intrinsic definitions (clang companion patch to be be submitted shortly). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276416	2016-07-22 13:58:44 +00:00
Benjamin Kramer	5ba0e20315	Revert "[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128" It caused PR28657. This reverts commit r276281. llvm-svn: 276405	2016-07-22 11:03:10 +00:00
Sjoerd Meijer	5c0ef83386	This refactoring of ARM machine block size computation creates two utility functions so that the size computation is available not only in ConstantIslands but in other passes as well. Differential Revision: https://reviews.llvm.org/D22640 llvm-svn: 276399	2016-07-22 08:39:12 +00:00
Hrvoje Varga	2db00ce4b6	[mips][microMIPS] Implement SLT, SLTI, SLTIU, SLTU microMIPS32r6 instructions Differential Revision: https://reviews.llvm.org/D19906 llvm-svn: 276397	2016-07-22 07:18:33 +00:00
Craig Topper	52e2e8381b	[AVX512] Add ExeDomain to vector extend and truncate instructions. llvm-svn: 276394	2016-07-22 05:46:44 +00:00
Craig Topper	f4151bea72	[AVX512] Add initial support for the Execution Domain fixing pass to change some EVEX instructions. llvm-svn: 276393	2016-07-22 05:00:52 +00:00
Craig Topper	5ec33a9411	[AVX512] Fix the ExeDomain for some packed fp instructions. llvm-svn: 276392	2016-07-22 05:00:42 +00:00
Craig Topper	0b90756b0a	[AVX512] Add load folding for some AVX512VL logic and arithmetic instructions. llvm-svn: 276391	2016-07-22 05:00:39 +00:00
Craig Topper	ab13b33ded	[AVX512] Update X86InstrInfo::foldMemoryOperandCustom to handle the EVEX encoded instructions too. llvm-svn: 276390	2016-07-22 05:00:35 +00:00
David Majnemer	1182dd8ed9	[AArch64] Cleanup sign extend in genAlternativeCodeSequence Use the machinery in MathExtras instead of rolling it by hand. This fixes PR28624. llvm-svn: 276366	2016-07-21 23:46:56 +00:00
Douglas Katzman	26cfb6a254	[Sparc]: Fix bug in LowerSTORE due to r275592 llvm-svn: 276362	2016-07-21 23:28:54 +00:00
Michael Kuperstein	c523333bbf	[X86] Do not use AND8ri8 in AVX512 pattern This variant is (as documented in the TD) for disassembler use only, and should not be used in patterns - it is longer, and is broken on 64-bit. llvm-svn: 276347	2016-07-21 22:24:08 +00:00
Akira Hatanaka	b8d2873d93	[AArch64][Inline-Asm] Return the 32-bit floating point register class when constraint "w" is used on a 32-bit operand. This enables compiling the following code, which used to error out in the backend: void foo1(int a) { asm volatile ("sqxtn h0, %s0\n" : : "w"(a):); } Fixes PR28633. llvm-svn: 276344	2016-07-21 21:39:05 +00:00
Konstantin Zhuravlyov	3c0d8d22fe	[AMDGPU] Emit read-only data to .rodata for hsa Differential Revision: https://reviews.llvm.org/D22538 llvm-svn: 276298	2016-07-21 15:59:23 +00:00
Konstantin Zhuravlyov	155626238b	AMDGPU/SI: Add support for R_AMDGPU_ABS32 Differential Revision: https://reviews.llvm.org/D21646 llvm-svn: 276294	2016-07-21 15:29:19 +00:00
Geoff Berry	4ff2e36d32	[AArch64] Load/store opt: Don't count transient instructions towards search limits. Summary: This change also changes findMatchingInsn and findMatchingUpdateInsnForward to take DBG_VALUE opcodes into account when tracking register defs and uses, which could potentially inhibit these optimizations in the presence of debug information. Reviewers: mcrosier Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D22582 llvm-svn: 276293	2016-07-21 15:20:25 +00:00
Simon Pilgrim	88e0940d3b	[X86][SSE] Allow folding of store/zext with PEXTRW of 0'th element Under normal circumstances we prefer the higher performance MOVD to extract the 0'th element of a v8i16 vector instead of PEXTRW. But as detailed on PR27265, this prevents the SSE41 implementation of PEXTRW from folding the store of the 0'th element. Additionally it prevents us from making use of the fact that the (SSE2) reg-reg version of PEXTRW implicitly zero-extends the i16 element to the i32/i64 destination register. This patch only preferentially lowers to MOVD if we will not be zero-extending the extracted i16, nor prevent a store from being folded (on SSSE41). Fix for PR27265. Differential Revision: https://reviews.llvm.org/D22509 llvm-svn: 276289	2016-07-21 14:54:17 +00:00
Simon Pilgrim	b11bdd95f6	[X86][SSE] Pull out duplicate EXTRW lowering code. NFCI. As requested on D22509, I've pulled out the v8i16 extraction lowering as the SSE41 and pre-SSE41 implementations are effectively the same. llvm-svn: 276285	2016-07-21 14:30:17 +00:00
Simon Pilgrim	c8e20b1150	[X86][AVX] Added support for lowering to VBROADCASTF128/VBROADCASTI128 As reported on PR26235, we don't currently make use of the VBROADCASTF128/VBROADCASTI128 instructions (or the AVX512 equivalents) to load+splat a 128-bit vector to both lanes of a 256-bit vector. This patch enables lowering from subvector insertion/concatenation patterns and auto-upgrades the llvm.x86.avx.vbroadcastf128.pd.256 / llvm.x86.avx.vbroadcastf128.ps.256 intrinsics to match. We could possibly investigate using VBROADCASTF128/VBROADCASTI128 to load repeated constants as well (similar to how we already do for scalar broadcasts). Differential Revision: https://reviews.llvm.org/D22460 llvm-svn: 276281	2016-07-21 14:10:54 +00:00
Sam Kolton	6c4efcadfe	[AMDGPU] Some code cleaning in SIRegisterInfo.td Reviewers: tstellarAMD, vpykhtin Subscribers: arsenm, kzhuravl Differential Revision: https://reviews.llvm.org/D22620 llvm-svn: 276274	2016-07-21 13:29:57 +00:00
Matt Arsenault	f0ba86a4d5	AMDGPU: Fix phis from blocks split due to register indexing llvm-svn: 276257	2016-07-21 09:40:57 +00:00
Matthias Braun	ca8210a952	X86InstrInfo: No need for liveness analysis in classifyLEAReg() classifyLEAReg() deals with switching operands from 32bit to 64bit in order to use a LEA64_32 instruction (for three address code goodness). It currently performs a liveness analysis to determine the kill/undef flag for the newly added operand. This should not be necessary: - If the previous operand had a kill flag, then the 32bit part of the register gets killed, this will kill the super register as well. - If the previous operand had an undef flag then we didn't care what value we read, just use the same flag on the new operand. (No matter what an operand with an undef flag won't affect liveness) This makes the code independent of the presence of kill flags because it avoids a call to MachineBasicBlock::computeRegisterLiveness(). Differential Revision: http://reviews.llvm.org/D22283 llvm-svn: 276222	2016-07-21 00:33:38 +00:00
Justin Lebar	cd564c6b46	[NVPTX] Enable the load-store vectorizer on nvptx. Reviewers: tra Subscribers: jholewinski, arsenm, asbirlea Differential Revision: https://reviews.llvm.org/D22592 llvm-svn: 276196	2016-07-20 22:11:36 +00:00
Geoff Berry	24c81e8d7c	[AArch64] Register AArch64LoadStoreOptimizer so it can be run by llc -run-pass. NFCI. llvm-svn: 276193	2016-07-20 21:45:58 +00:00
Artem Belevich	7e9c9a6582	[NVPTX] Renamed NVPTXLowerKernelArgs -> NVPTXLowerArgs. NFC. After r276153 the pass applies to both kernels and regular functions. Differential Revision: https://reviews.llvm.org/D22583 llvm-svn: 276189	2016-07-20 21:44:07 +00:00
Ahmed Bougacha	a0cdd79070	[AArch64][FastISel] Select -O0 legal cmpxchg. At -O0, cmpxchg survives AtomicExpand: it's mostly straightforward to select it in fast-isel, and let the pseudo be expanded later. extractvalues on the result are the tricky part: the generic logic only works for legal types (and it would be painful to make it support illegal types), so we can only support i32/i64 cmpxchg. llvm-svn: 276183	2016-07-20 21:12:32 +00:00
Ahmed Bougacha	b0674d1143	[AArch64][FastISel] Select atomic stores into STLR. llvm-svn: 276182	2016-07-20 21:12:27 +00:00
Tim Northover	62ae568bbb	GlobalISel: implement low-level type with just size & vector lanes. This should be all the low-level instruction selection needs to determine how to implement an operation, with the remaining context taken from the opcode (e.g. G_ADD vs G_FADD) or other flags not based on type (e.g. fast-math). llvm-svn: 276158	2016-07-20 19:09:30 +00:00
Artem Belevich	74158b5061	[NVPTX] deal with all aggregate return types. Fixes a crash in llvm_unreachable when a function has array return type. Differential Revision: https://reviews.llvm.org/D22524 llvm-svn: 276154	2016-07-20 18:39:52 +00:00
Artem Belevich	b2e76a5e7a	[NVPTX] Improve lowering of byval args of device functions. Avoid unnecessary spills of byval arguments of device functions to local space on SASS level and subsequent pointer conversion to generic address space that follows. Instead, make a local copy in IR, provide a way to access arguments directly, and let LLVM optimize the copy away when possible. Differential Review: https://reviews.llvm.org/D21421 llvm-svn: 276153	2016-07-20 18:39:47 +00:00
Yaxun Liu	4b1d9f7f18	AMDGPU: Fix bug causing crash due to invalid opencl version metadata. Differential Revision: https://reviews.llvm.org/D22526 llvm-svn: 276119	2016-07-20 14:38:06 +00:00
Simon Pilgrim	1b4f511aaa	[X86][SSE] Add cost model values for CTPOP of vectors This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 llvm-svn: 276104	2016-07-20 10:41:28 +00:00
Diana Picus	f345d40ae2	[ARM] Skip inline asm memory operands in DAGToDAGISel Retry r275776 (no changes, we suspect the issue was with another commit). The current logic for handling inline asm operands in DAGToDAGISel interprets the operands by looking for constants, which should represent the flags describing the kind of operand we're dealing with (immediate, memory, register def etc). The operands representing actual data are skipped only if they are non-const, with the exception of immediate operands which are skipped explicitly when a flag describing an immediate is found. The oversight is that memory operands may be const too (e.g. for device drivers reading a fixed address), so we should explicitly skip the operand following a flag describing a memory operand. If we don't, we risk interpreting that constant as a flag, which is definitely not intended. Fixes PR26038 Differential Revision: https://reviews.llvm.org/D22103 llvm-svn: 276101	2016-07-20 09:48:24 +00:00
Craig Topper	09b99c3a75	[AVX512] Add a missing NoVLX to give priority to the AVX512 version of the pattern. llvm-svn: 276088	2016-07-20 05:05:50 +00:00
Craig Topper	7a95428f7c	[X86] Use 'HasAVX1Only' to properly give priority to the AVX2 version without relying on file ordering. llvm-svn: 276087	2016-07-20 05:05:48 +00:00
Craig Topper	cde436a663	[X86] Create some multiclasses to reduce the repeated patterns for VEXTRACT(F/I)128/VINSERT(I/F)128. NFC llvm-svn: 276086	2016-07-20 05:05:46 +00:00
Craig Topper	55913ead3d	[X86] Create some wrapper multiclasses to create AVX and SSE shift instructions with less repeated code. NFC llvm-svn: 276085	2016-07-20 05:05:44 +00:00
David Majnemer	5d26127752	Revert "Disable this-return argument forwarding on ARM/AArch64" Inference of the 'returned' attribute was fixed in r276008, lets try turning the backend support back on. This reverts commit r275677. llvm-svn: 276081	2016-07-20 04:13:01 +00:00
Matt Arsenault	a1fe17c9ad	AMDGPU: Change fdiv lowering based on !fpmath metadata If 2.5 ulp is acceptable, denormals are not required, and isn't a reciprocal which will already be handled, replace with a faster fdiv. Simplify the lowering tests by using per function subtarget features. llvm-svn: 276051	2016-07-19 23:16:53 +00:00
Evandro Menezes	238fa76574	[AArch64] Properly validate the reciprocal estimation. Add check for legal data types when expanding into a Newton series. Differential Revision: https://reviews.llvm.org/D22267 llvm-svn: 276041	2016-07-19 22:31:11 +00:00

1 2 3 4 5 ...

38514 Commits