llvm-project

Commit Graph

Author	SHA1	Message	Date
Simon Pilgrim	ecb0433599	[X86][SSE] Fixed issue with commutation of 'faux unary' target shuffles (PR26667) Fixed a bug introduced by D16683 when a binary shuffle is simplified to a unary shuffle (with undef/zero sentinel mask indices) - if this resulted in only the second input being used combineX86ShuffleChain failed to take this into account and still referenced the first input. llvm-svn: 261434	2016-02-20 14:39:45 +00:00
Simon Pilgrim	ccf2cce67c	[X86][SSE] Move all undef/zero cases before target shuffle combining. First small step towards fixing PR26667 - we need to ensure that combineX86ShuffleChain only gets called with a valid shuffle input node (a similar issue was found in D17041). llvm-svn: 261433	2016-02-20 12:57:32 +00:00
Andrey Turetskiy	9994b8894a	[X86] Enable the LEA optimization pass by default. Differential Revision: http://reviews.llvm.org/D16877 llvm-svn: 261429	2016-02-20 11:11:55 +00:00
Andrey Turetskiy	0babd26626	[X86] PR26575: Fix LEA optimization pass (Part 2). Handle address displacement operands of a type other than Immediate or Global in LEAs and load/stores. Ref: https://llvm.org/bugs/show_bug.cgi?id=26575 Differential Revision: http://reviews.llvm.org/D17374 llvm-svn: 261428	2016-02-20 10:58:28 +00:00
David Majnemer	862c5ba302	Move some code from doInitialization to runOnFunction This has no observable behavior change, it just makes the state insertion pass look a little more like normal passes. llvm-svn: 261420	2016-02-20 07:34:21 +00:00
Craig Topper	2bf0c0394d	[X86] Add some missing reversed forms of XOP instructions. llvm-svn: 261417	2016-02-20 06:20:17 +00:00
Davide Italiano	228978c0dc	[X86ISelLowering] Fix TLSADDR lowering when shrink-wrapping is enabled. TLSADDR nodes are lowered into actual calls inside MC. In order to prevent shrink-wrapping from pushing prologue/epilogue past them (which results in TLS variables being accessed before the stack frame is set up), we put markers, so that the stack gets adjusted properly. Thanks to Quentin Colombet for guidance/help on how to fix this problem! llvm-svn: 261387	2016-02-20 00:44:47 +00:00
Tom Stellard	467b5b9024	AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 llvm-svn: 261385	2016-02-20 00:37:25 +00:00
Davide Italiano	a8f1f2efaf	[X86ISelLowering] Provide a more informative assert message. I stumbled upon this while debugging a lowering bug. llvm-svn: 261371	2016-02-19 22:18:49 +00:00
Davide Italiano	4cfe2a9e38	[X86ISelLowering] Merge two conditions inside a single if. llvm-svn: 261370	2016-02-19 22:01:07 +00:00
Hans Wennborg	7c3077ca52	Revert r253557 "Alternative to long nops for X86 CPUs, by Andrey Turetsky" Turns out the new nop sequences aren't actually nops on x86_64 (PR26554). llvm-svn: 261365	2016-02-19 21:26:31 +00:00
Dimitry Andric	db417b6d40	Fix incorrect selection of AVX512 sqrt when OptForSize is on Summary: When optimizing for size, sqrt calls can be incorrectly selected as AVX512 VSQRT instructions. This is because X86InstrAVX512.td has a `Requires<[OptForSize]>` in its `avx512_sqrt_scalar` multiclass definition. Even if the target does not support AVX512, the class can apparently still be chosen, leading to an incorrect selection of `vsqrtss`. In PR26625, this led to an assertion: Reg >= X86::FP0 && Reg <= X86::FP6 && "Expected FP register!", because the `vsqrtss` instruction requires an XMM register, which is not available on i686 CPUs. Reviewers: grosbach, resistor, joker.eph Subscribers: spatel, emaste, llvm-commits Differential Revision: http://reviews.llvm.org/D17414 llvm-svn: 261360	2016-02-19 20:14:11 +00:00
Dan Gohman	87e368b7db	[WebAssembly] Add another optimization idea to README.txt. llvm-svn: 261354	2016-02-19 19:22:44 +00:00
Geoff Berry	7e4ba3dc02	[AArch64][ShrinkWrap] Fix bug in prolog clobbering live reg when shrink wrapping. Summary: See bug https://llvm.org/bugs/show_bug.cgi?id=26642 Reviewers: qcolombet, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D17350 llvm-svn: 261349	2016-02-19 18:27:32 +00:00
Tom Stellard	2d26fe7aa6	AMDGPU/SI: Fix s_waitcnt insertion for flat instructions Summary: This was broken in r260694 which swapped the address and data operands for flat store instructions. The code in SIInsertWaits assumes that the data operand always comes before the address operand, so we need to add a special case for flat. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17366 llvm-svn: 261330	2016-02-19 15:33:13 +00:00
Ulrich Weigand	cfa1d2b49d	[SystemZ] Fix ABI for i128 argument and return types According to the SystemZ ABI, 128-bit integer types should be passed and returned via implicit reference. However, this is not currently implemented at the LLVM IR level for the i128 type. This does not matter when compiling C/C++ code, since clang will implement the implicit reference itself. However, it turns out that when calling libgcc helper routines operating on 128-bit integers, LLVM will use i128 argument and return value types; the resulting code is not compatible with the ABI used in libgcc, leading to crashes (see PR26559). This should be simple to fix, except that i128 currently is not even a legal type for the SystemZ back end. Therefore, common code will already split arguments and return values into multiple parts. The bulk of this patch therefore consists of detecting such parts, and correctly handling passing via implicit reference of a value split into multiple parts. If at some time in the future, i128 becomes a legal type, this code can be removed again. This fixes PR26559. llvm-svn: 261325	2016-02-19 14:10:21 +00:00
Craig Topper	5eeb41c173	[X86] Remove unused entries from the disassembler type enum. llvm-svn: 261311	2016-02-19 06:57:40 +00:00
Junmo Park	1108ab059c	Minor code cleanups. NFC. llvm-svn: 261294	2016-02-19 01:46:04 +00:00
Sanjay Patel	0adbea4b5c	[x86] fix initialization of PredictableSelectIsExpensive This is effectively NFC because Atom is the only in-order x86 subtarget currently, but the predicate would have become wrong if any other in-order CPU came along. See related discussion in: http://reviews.llvm.org/D16836 llvm-svn: 261275	2016-02-18 23:08:48 +00:00
Richard Trieu	7a08381403	Remove uses of builtin comma operator. Cleanup for upcoming Clang warning -Wcomma. No functionality change intended. llvm-svn: 261270	2016-02-18 22:09:30 +00:00
Adam Nemet	9d9cb274ea	[PPCLoopDataPrefetch] Move pass to Transforms/Scalar/LoopDataPrefetch. NFC This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). Obviously the pass is still only used from PPC at this point. Subsequent patches will start driving this from ARM64 as well. Due to the previous patch most lines should show up as moved lines. llvm-svn: 261265	2016-02-18 21:38:19 +00:00
Adam Nemet	7cf9b1bf05	[PPCLoopDataPrefetch] Remove PPC from some of the names. NFC This is done only to make the next patch, which moves the pass out of PPC to Transforms, easier to read. After this most lines should show up as moved lines in that patch. This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). llvm-svn: 261264	2016-02-18 21:37:12 +00:00
David Majnemer	a822c880a9	[WinEH] Hoist state stores from successors If we know that all of our successors want to be in the exact same state, it makes sense to hoist the state transition into their common predecessor. Differential Revision: http://reviews.llvm.org/D17391 llvm-svn: 261262	2016-02-18 21:13:35 +00:00
Davide Italiano	440a676136	[X86ISelLowering] Use isPowerof2 instead of rewriting it. NFC. llvm-svn: 261255	2016-02-18 20:43:15 +00:00
Matthew Simpson	921ad01a1d	[AArch64] Reduce vector insert/extract cost for Kryo Differential Revision: http://reviews.llvm.org/D17379 llvm-svn: 261237	2016-02-18 18:35:45 +00:00
Hans Wennborg	23cdc643b9	Revert to extend i8/i16 return values on Darwin (PR26665) In r260133, LLVM was changed to no longer extend i8/i16 return values, as it's not required by the ABI. However, code was found in the wild that relies on the old behaviour on Darwin, so this commit reverts back to that old behaviour for Darwin. On other platforms, it's less likely that code would be depending on the old behaviour, as GCC and MSVC haven't been extending such return values. llvm-svn: 261235	2016-02-18 18:17:05 +00:00
Chad Rosier	c00ab4f27d	[Hexagon] Remove redundant check. llvm-svn: 261232	2016-02-18 17:49:57 +00:00
Nicolai Haehnle	f2c64db55a	AMDGPU/SI: add llvm.amdgcn.image.load/store[.mip] intrinsics Summary: These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has a legacy name and pretends not to read memory. Differential Revision: http://reviews.llvm.org/D17276 llvm-svn: 261224	2016-02-18 16:44:18 +00:00
Krzysztof Parzyszek	754bad884d	[Hexagon] Fix compilation error with GCC 6 Compiling Hexagon target with GCC 6 produces "error: should have been declared inside" due to GCC PR c++/69657 which was merged. Properly wrapping operator<<() definitions within the namespace llvm fixes the issue. Author: domagoj.stolfa Differential Revision: http://reviews.llvm.org/D17281 llvm-svn: 261220	2016-02-18 16:10:27 +00:00
Krzysztof Parzyszek	7a737d1abb	[Hexagon] Implement TLS support Patch by Anand Kodnani. llvm-svn: 261218	2016-02-18 15:42:57 +00:00
Zlatko Buljan	f034021443	[mips][microMIPS] Implement TLBINV and TLBINVF instructions Differential Revision: http://reviews.llvm.org/D16849 llvm-svn: 261211	2016-02-18 14:10:52 +00:00
Krzysztof Parzyszek	6895b2ceb2	[Hexagon] Add support for __builtin_prefetch llvm-svn: 261210	2016-02-18 13:58:38 +00:00
Krzysztof Parzyszek	39686cf98e	[Hexagon] Update the callee-saved register set for EH-aware functions llvm-svn: 261208	2016-02-18 13:41:05 +00:00
Simon Pilgrim	05e48b95eb	[X86][SSE] Improve PSHUFB shuffle mask decoding. In cases where the PSHUFB shuffle mask is shared it might not be bitcasted to a vXi8 byte vector. This patch adds support for decoding these wider shuffle masks from the ConstantPool. The test case in question makes use of this to recognise that the shuffle mask is a unary UNPCKL pattern and simplifies accordingly. llvm-svn: 261201	2016-02-18 10:17:40 +00:00
Nikolay Haustov	10813a4efa	Test commit access. llvm-svn: 261199	2016-02-18 10:02:12 +00:00
Michael Zuckerman	724dc3b20c	[AVX512][PRORQ][PRORD] Change imm8 to int Differential Revision: http://reviews.llvm.org/D17024 llvm-svn: 261198	2016-02-18 09:52:12 +00:00
Dan Gohman	d85ab7fc10	[WebAssembly] Don't use setRequiresStructuredCFG(true). While we still do want reducible control flow, the RequiresStructuredCFG flag imposes more strict structure constraints than WebAssembly wants. Unsetting this flag enables critical edge splitting and tail merging. Also, disable TailDuplication explicitly, as it doesn't support virtual registers, and was previously only disabled by the RequiresStructuredCFG flag. llvm-svn: 261190	2016-02-18 06:32:53 +00:00
Tom Stellard	e1818af8c5	[AMDGPU] Disassembler: Added basic disassembler for AMDGPU target Changes: - Added disassembler project - Fixed all decoding conflicts in .td files - Added DecoderMethod=“NONE” option to Target.td that allows disabling decoder generation for an instruction. - Created decoding functions for VS_32 and VReg_32 register classes. - Added stubs for decoding all register classes. - Added several tests for disassembler Disassembler only supports: - VI subtarget - VOP1 instruction encoding - 32-bit register operands and inline constants [Valery] One point that requires attention is how decoder conflicts were resolved: - Groups of target instructions were separated by using different DecoderNamespace (SICI, VI, CI) using an approach similar to AssemblerPredicate. - There were conflicts in IMAGE_<> instructions caused by two different reasons: 1. dmask wasn’t specified for the output (fixed) 2. There are image instructions that differ only by the number of the address components but have the same encoding by the HW spec. The actual number of address components is determined by the HW at runtime using image resource descriptor starting from the VGPR encoded in an IMAGE instruction. This means that we should choose only one instruction from the conflicting group to be the rule for the decoder. I didn’t find a way to disable decoder generation for an arbitrary instruction and therefore made a one-line fix to the tablegen generator that would suppress decoder generation when DecoderMethod is set to “NONE”. This is a change that should be reviewed and submitted first. Otherwise I would need to specify a different DecoderNamespace for every instruction in the conflicting group. I haven’t checked yet if DecoderMethod=“NONE” is not used in other targets. 3. IMAGE_GATHER decoder generation is for now disabled and to be done later. [/Valery] Patch By: Sam Kolton Differential Revision: http://reviews.llvm.org/D16723 llvm-svn: 261185	2016-02-18 03:42:32 +00:00
Derek Schuff	71434ff642	[WebAssembly] Disable register stackification and coloring when not optimizing These passes are optimizations, and should be disabled when not optimizing. Also create an MCCodeGenInfo so the opt level is correctly plumbed to the backend pass manager. Also remove the command line flag for disabling register coloring; running llc with -O0 should now be useful for debugging, so it's not necessary. Differential Revision: http://reviews.llvm.org/D17327 llvm-svn: 261176	2016-02-17 23:20:43 +00:00
Tim Northover	7687bcee4a	AArch64: always clear kill flags up to last eliminated copy After r261154, we were only clearing flags if the known-zero register was originally live-in to the basic block, but we have to do it even if not when more than one COPY has been eliminated, otherwise the user of the first COPY may still have <kill> marked. E.g. BB#N: %X0 = COPY %XZR STRXui %X0<kill>, <fi#0> %X0 = COPY %XZR STRXui %X0<kill>, <fi#1> We can eliminate both copies, X0 is not live-in, but we must clear the kill on the first store. Unfortunately, I've been unable to come up with a non-fragile test for this. I've only seen it in the wild with regalloc-created spills, and attempts to reproduce that in a reasonable way run afoul of COPY coalescing. Even volatile asm clobbers were moved around. Should fix the aarch64 bot though. llvm-svn: 261175	2016-02-17 23:07:04 +00:00
Amaury Sechet	22d2878399	Move LLVMCreateTargetData and LLVMDisposeTargetData together. NFC llvm-svn: 261172	2016-02-17 22:41:09 +00:00
Tim Northover	3f2285615a	AArch64: improve redundant copy elimination. Mostly, this fixes the bug that if the CBZ guaranteed Xn but Wn was used, we didn't sort out the use-def chain properly. I've also made it check more than just the last instruction for a compatible CBZ (so it can cope without fallthroughs). I'd have liked to do that separately, but it helps with writing the test. Finally, I removed some custom loops in favour of MachineInstr helpers and refactored the control flow to flatten it and avoid possibly quadratic iterations in blocks with many copies. NFC for these, just a general tidy-up. llvm-svn: 261154	2016-02-17 21:16:53 +00:00
Colin LeMahieu	5e552d141f	[Hexagon] Replacing reference/dereference with reference cast. llvm-svn: 261133	2016-02-17 18:50:21 +00:00
Nico Weber	32ac273a91	Remove superfluous semicolon. llvm-svn: 261128	2016-02-17 18:48:08 +00:00
David Majnemer	7e5937b775	[WinEH] Optimize WinEH state stores 32-bit x86 Windows targets use a linked-list of nodes allocated on the stack, referenced via thread-local storage. The personality routine interprets one of the fields in the node as a 'state number' which indicates where the personality routine should transfer control. State transitions are possible only before call-sites which may throw exceptions. Our previous scheme had us update the state number before all call-sites which may throw. Instead, we can try to minimize the number of times we need to store by reasoning about the nearest store which dominates the current call-site. If the last store agrees with the current call-site, then we know that the state-update is redundant and can be elided. This is largely straightforward: an RPO walk of the blocks allows us to correctly forward propagate the information when the function is a DAG. Currently, loops are not handled optimally and may trigger superfluous state stores. Differential Revision: http://reviews.llvm.org/D16763 llvm-svn: 261122	2016-02-17 18:37:11 +00:00
Colin LeMahieu	3d3ff650d6	[Hexagon] Loop instructions don't need special processing. Extension and fitting is performed by generic code and the comment is incorrect, loops don't have a separate extended opcode. llvm-svn: 261118	2016-02-17 18:14:05 +00:00
Justin Lebar	f9b5add6ad	[NVPTX] Annotate convergent intrinsics as convergent. Summary: Previously the machine instructions for bar.sync &co. were not marked as convergent. This resulted in some MI passes (such as TailDuplication, fixed in an upcoming patch) doing unsafe things to these instructions. Reviewers: jingyue Subscribers: llvm-commits, tra, jholewinski, hfinkel Differential Revision: http://reviews.llvm.org/D17318 llvm-svn: 261115	2016-02-17 17:46:54 +00:00
Justin Lebar	d596ec93ce	[NVPTX] Annotate call machine instructions as calls. Summary: Otherwise we'll try to do unsafe optimizations on these MIs, such as sinking loads below calls. (I suspect that this is not the only bug in the NVPTX instruction tablegen files; I need to comb through them.) Reviewers: jholewinski, tra Subscribers: jingyue, jhen, llvm-commits Differential Revision: http://reviews.llvm.org/D17315 llvm-svn: 261113	2016-02-17 17:46:50 +00:00
Krzysztof Parzyszek	de697d4d40	[Hexagon] Fold object construction into map::insert llvm-svn: 261096	2016-02-17 15:02:07 +00:00
Igor Breger	ac02f1bb62	AVX512: Fix LowerMSCATTER() return value. Bug description: The bug was discovered when a test was compiled with -O0. When the scatter result is the DAG root, VectorLegalizer failed (assert) because LowerMSCATTER() returned the kmask as its result. Change LowerMSCATTER() to return the chain, as the original node does. Differential Revision: http://reviews.llvm.org/D17331 llvm-svn: 261090	2016-02-17 14:04:33 +00:00

36245 Commits
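
The "[WinEH] Optimize WinEH state stores" entry (7e5937b775) describes eliding a state store when the last known store already agrees with the call-site, using an RPO walk to forward-propagate the state. The idea can be sketched roughly as below. This is a simplified toy model and not the LLVM implementation: the CFG is a plain dictionary, and the names `reverse_post_order`, `plan_state_stores`, `succs`, `calls`, and `initial_state` are hypothetical. It assumes the entry block has no predecessors, and it treats any block whose predecessors disagree (or were not yet visited, as with a loop back edge) as having an unknown state, which forces a store.

```python
def reverse_post_order(succs, entry):
    """Return the blocks reachable from `entry` in reverse post-order."""
    seen, order = set(), []

    def visit(block):
        seen.add(block)
        for nxt in succs.get(block, []):
            if nxt not in seen:
                visit(nxt)
        order.append(block)

    visit(entry)
    return order[::-1]


def plan_state_stores(succs, calls, entry, initial_state=-1):
    """Decide which call-sites need a state store (toy model).

    succs: block name -> list of successor block names.
    calls: block name -> list of state numbers, one per throwing call-site.
    Returns a list of (block, call_index, state) tuples for the stores kept.
    """
    # Build the predecessor map from the successor map.
    preds = {}
    for block, nexts in succs.items():
        for nxt in nexts:
            preds.setdefault(nxt, []).append(block)

    out_state = {}  # block -> state known at block exit (None = unknown)
    stores = []
    for block in reverse_post_order(succs, entry):
        if block == entry:
            state = initial_state
        else:
            # Predecessors not processed yet (back edges) count as unknown.
            incoming = {out_state.get(p) for p in preds.get(block, [])}
            state = incoming.pop() if len(incoming) == 1 else None
        for index, needed in enumerate(calls.get(block, [])):
            if state != needed:
                # The last known store disagrees (or is unknown): keep a store.
                stores.append((block, index, needed))
                state = needed
            # Otherwise the store is redundant and is elided.
        out_state[block] = state
    return stores


# Example: a diamond-shaped CFG with five throwing call-sites in total.
example_succs = {"entry": ["a", "b"], "a": ["merge"], "b": ["merge"]}
example_calls = {"entry": [0], "a": [0, 1], "b": [1], "merge": [1]}
example_stores = plan_state_stores(example_succs, example_calls, "entry")
```

In the diamond example, storing before every call-site would take five stores, while this sketch keeps three: one in `entry`, one in `b`, and one before the second call in `a`. The call in `merge` needs no store because both predecessors leave the same state behind.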