llvm-project

Commit Graph

Author	SHA1	Message	Date
Chad Rosier	bba881ef3d	[AArch64] Allocate the modified and used regs only once per function. llvm-svn: 259510	2016-02-02 15:02:30 +00:00
JF Bastien	926b189a81	WebAssembly: update expected GCC torture test failures The 3 programs used __attribute__((mode(?))) on enum, which clang r259497 fixed. llvm-svn: 259508	2016-02-02 14:27:34 +00:00
Oliver Stannard	7e7d983a87	Refactor backend diagnostics for unsupported features Re-commit of r258951 after fixing layering violation. The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. llvm-svn: 259498	2016-02-02 13:52:43 +00:00
Simon Pilgrim	96fe4ef5f7	[X86][AVX512] Add support for AVX512 VMOVQ (load) shuffle decoding llvm-svn: 259496	2016-02-02 13:32:56 +00:00
JF Bastien	dc1255f02f	WebAssembly: add option to disable register coloring Having this hidden option makes it easier to debug other issues. llvm-svn: 259482	2016-02-02 09:30:01 +00:00
Sjoerd Meijer	ffe19f5245	Removed FeatureVFPOnlySP from the Cortex-R7 processor model description and changed the regression test accordingly. The default configuration of a Cortex-R7 is to implement the VFPv3-D16 architecture and the feature line as it was is too restrictive. llvm-svn: 259480	2016-02-02 09:28:20 +00:00
Sanjoy Das	881de4d12a	[X86] Fix a bug in getMemOpBaseRegImmOfs Fix a crash in `getMemOpBaseRegImmOfs` that happens if the base of `MemOp` is a frame index memory operand. The fix is to have `getMemOpBaseRegImmOfs` bail out in such cases. We can possibly be more clever here, if needed. llvm-svn: 259456	2016-02-02 02:32:43 +00:00
Ahmed Bougacha	68a8efa374	[X86][FastISel] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. FastISel counterpart to r259448. llvm-svn: 259449	2016-02-02 01:44:03 +00:00
Ahmed Bougacha	55c6682ae2	[X86] Don't force Nearest-Even rounding for VCVTPS2PH, use MXCSR. Officially, we don't acknowledge non-default configurations of MXCSR, as getting there would require usage of the FENV_ACCESS pragma (at least insofar as rounding mode is concerned). We don't support the pragma, so we can assume that the default rounding mode - round to nearest, ties to even - is always used. However, it's inconsistent with the rest of the instruction set, where MXCSR is always effective (unless otherwise specified). Also, it's an unnecessary obstacle to the few brave souls that use fenv.h with LLVM. Avoid the hard-coded rounding mode for fp_to_f16; use MXCSR instead. llvm-svn: 259448	2016-02-02 01:32:50 +00:00
Sanjay Patel	c54600dbb1	fix typos; NFC llvm-svn: 259438	2016-02-01 23:53:35 +00:00
Simon Pilgrim	5be17b6e3e	[X86][AVX512] Add support for AVX512 VMOVD (load) shuffle decoding llvm-svn: 259430	2016-02-01 23:04:05 +00:00
Simon Pilgrim	f5c23ad3d7	[X86][AVX512] Add support for AVX512 VMOVSD/VMOVSS shuffle decoding llvm-svn: 259427	2016-02-01 22:26:28 +00:00
Simon Pilgrim	025a3d857a	[X86][AVX512] Add support for AVX512 VINSERTPS shuffle decoding llvm-svn: 259420	2016-02-01 22:05:50 +00:00
Matthias Braun	3f88eabe93	SmallSet/SmallPtrSet: Refuse huge Small numbers These sets do linear searching in small mode; It is not a good idea to use huge numbers as the small value here, save people from themselves by adding a static_assert. Differential Revision: http://reviews.llvm.org/D16706 llvm-svn: 259419	2016-02-01 22:05:16 +00:00
Chad Rosier	dbdb1d6eaf	Move comments a bit closer to associated code. NFC. llvm-svn: 259411	2016-02-01 21:38:31 +00:00
Chad Rosier	064261da16	Remove extra semicolon. NFC. llvm-svn: 259402	2016-02-01 20:54:36 +00:00
Balaram Makam	92431703d7	AArch64: Implement missed conditional compare sequences. Summary: This is an extension to the existing implementation of r242436 which restricts to only select inputs. This version fixes missed opportunities in pr26084 by attempting to lower conditional compare sequences of and/or trees with setcc leafs. This will additionally handle the case when a tree with select input is not a conjunction-disjunction tree but some of the sub trees are conjunction-disjunction trees. Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry Differential Revision: http://reviews.llvm.org/D16291 llvm-svn: 259387	2016-02-01 19:13:07 +00:00
Geoff Berry	29d4a695f4	[AArch64] Simplify prolog/epilog callee save/restore. NFC. Summary: Factor out common code for callee-save register pair calculation. This is intended to simplify follow-on changes that reduce the number of registers saved/restored. Depends on D16732 Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16734 llvm-svn: 259384	2016-02-01 19:07:06 +00:00
Ulrich Weigand	4a4d4ab7a4	[SystemZ] Fix wrong-code generation for certain always-false conditions We've found another bug in the code generation logic conditions for a certain class of always-false conditions, those of the form if ((a & 1) < 0) These only reach the back end when compiling without optimization. The bug was introduced by the choice of using TEST UNDER MASK to implement a check for if ((a & MASK) < VAL) as if ((a & MASK) == 0) where VAL is less than the lowest bit of MASK. This is correct in all cases except for VAL == 0, in which case the original condition is always false, but the replacement isn't. Fixed by excluding that particular case. llvm-svn: 259381	2016-02-01 18:31:19 +00:00
Colin LeMahieu	6fdfa3dc32	[NFC] Referencing manual for reason why subregbit is checked llvm-svn: 259380	2016-02-01 18:15:39 +00:00
Geoff Berry	04bf91a8c1	[AArch64] Simplify callee-save register save/restore. NFC. Summary: Simplify callee-save register save/restore code generation by remembering the size of the callee-save area when it is computed so we don't have to scan the prologue/epilogue instructions again later to reconstruct it. This is intended to simplify follow-on changes that reduce the number of registers saved/restored. Reviewers: mcrosier, jmolloy, t.p.northover Subscribers: aemerson, rengolin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D16732 llvm-svn: 259365	2016-02-01 16:29:19 +00:00
Asaf Badouh	5a3a0231f4	[X86][AVX512VBMI] add encoding and intrinsics for Multishift Differential Revision: http://reviews.llvm.org/D16399 llvm-svn: 259363	2016-02-01 15:48:21 +00:00
Daniel Sanders	f8bb23e509	[mips] Range check uimm16 and fix several bugs this revealed. Summary: The bugs were: * teq and similar take 4-bit unsigned immediates on microMIPS. * teqi and similar have side-effects like teq do. * shll_s.w and shra_r.w take 5-bit unsigned immediates. * The various DSP ext* instructions take a 5-bit immediate. * repl.qh takes an 8-bit unsigned immediate. * repl.ph takes a 10-bit unsigned immediate. * rddsp/wrdsp take a 10-bit unsigned immediate. * teqi and similar take signed 16-bit immediates (10-bit for microMIPS). * Out-of-range immediate macros for or/xor take a simm32/simm64 depending on architecture. I'll fix the simm64 case properly when I reach simm32. lui is a bit more lenient than GAS and accepts signed immediates in addition to unsigned. This is because MipsMCExpr can produce signed values when constant folding and it currently lacks a way of knowing it should fold to an unsigned value. Reviewers: vkalintiris Subscribers: dsanders, llvm-commits Differential Revision: http://reviews.llvm.org/D15446 llvm-svn: 259360	2016-02-01 15:13:31 +00:00
JF Bastien	a5b8ea0d66	WebAssembly NFC: simplify control flow This should now be easier to read. llvm-svn: 259349	2016-02-01 10:46:16 +00:00
Igor Breger	56b039ea17	AVX512: fix mask handling for gather/scatter/prefetch intrinsics. Differential Revision: http://reviews.llvm.org/D16755 llvm-svn: 259346	2016-02-01 09:57:15 +00:00
Simon Pilgrim	1358d86659	[X86][SSE] Find source of the inserted element of INSERTPS Minor patch to trace back through target shuffles to the source of the inserted element in a (V)INSERTPS shuffle. Differential Revision: http://reviews.llvm.org/D16652 llvm-svn: 259343	2016-02-01 08:59:30 +00:00
Igor Breger	6cc9115cec	AVX512 : Fix SETCCE lowering for KNL 32 bit. Differential Revision: http://reviews.llvm.org/D16752 llvm-svn: 259342	2016-02-01 07:56:09 +00:00
David Majnemer	efb41741f2	[X86] Cleanup the WinEHState pass Remove unnecessary includes and class state. No functional change intended. llvm-svn: 259340	2016-02-01 04:28:59 +00:00
Craig Topper	3ef74f5956	Replace usages of llvm::utostr_32 with just llvm::utostr. While this is less efficient, it's unclear the few places that were using the _32 version were doing so for efficiency. llvm-svn: 259330	2016-01-31 20:00:24 +00:00
JF Bastien	578c8cde53	WebAssembly: more failures are gone llvm-svn: 259321	2016-01-31 08:19:40 +00:00
JF Bastien	ac9e8664a4	WebAssembly: update expected failures r259305 fixed a few assertions around FrameIndex, and I forgot to update these failures despite having run the torture tests. llvm-svn: 259320	2016-01-31 08:05:05 +00:00
Derek Schuff	c97ba939d1	[WebAssembly] Fix uses of FrameIndex as store values Previously the code assumed all uses of FI on loads and stores were as addresses. This checks whether the use is the address or a value and handles the latter case as it does for non-memory instructions. llvm-svn: 259306	2016-01-30 21:43:08 +00:00
JF Bastien	fbc89d21dd	WebAssembly: don't optimize frameindex store The previous code was incorrect (can't getReg a frameindex). We could instead optimize it to reduce tree height, but I'm not sure that's worthwhile yet because we then try to eliminate the frameindex. This patch also fixes frame index elimination for operations which may load or store: it used to assume the base was operand 2 and immediate offset operand 1. That's not true for stores, where they're 4 and 3. llvm-svn: 259305	2016-01-30 14:11:26 +00:00
JF Bastien	3ca3ea690f	WebAssembly NFC: fix build warning WebAssemblyFrameLowering.cpp:158:44: warning: enumeral and non-enumeral type in conditional expression [enabled by default] llvm-svn: 259303	2016-01-30 11:19:26 +00:00
Matt Arsenault	e013246462	AMDGPU: Fix emitting invalid workitem intrinsics for HSA The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer. Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer. Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument. Start emitting errors for the intrinsics not handled with HSA. llvm-svn: 259297	2016-01-30 05:19:45 +00:00
Matt Arsenault	d0799df707	AMDGPU: Stop checking intrinsics not used by HSA for dispatch-ptr Only the dispatch.ptr intrinsic is supposed to be used now to get the workgroup size, and the read.local.size intrinsics do not work correctly. llvm-svn: 259296	2016-01-30 05:10:59 +00:00
Dan Gohman	ed0f113885	[WebAssembly] Refine block placement to insert blocks between trees. Refine the test for whether an instruction is in an expression tree so that it detects when one tree ends and another begins, so we can place a block at that point, rather than continuing to find the first instruction not in a tree at all. llvm-svn: 259294	2016-01-30 05:01:06 +00:00
Matt Arsenault	43976df0da	AMDGPU: Add new amdgcn workitem intrinsics These use the correct prefix and follow the HSA naming convention rather than the config register option names. llvm-svn: 259293	2016-01-30 04:25:19 +00:00
Matthias Braun	b30f2f5141	Avoid overly large SmallPtrSet/SmallSet These sets perform linear searching in small mode so it is never a good idea to use SmallSize/N bigger than 32. llvm-svn: 259283	2016-01-30 01:24:31 +00:00
Justin Lebar	ead59f4765	[CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. Summary: Previously we'd just silently skip these. Reviewers: tra, jholewinski Subscribers: llvm-commits, jhen, echristo Differential Revision: http://reviews.llvm.org/D16739 llvm-svn: 259279	2016-01-30 01:07:38 +00:00
Yaron Keren	eb2a25467e	Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment. clang part in r259232, this is the LLVM part of the patch. llvm-svn: 259240	2016-01-29 20:50:44 +00:00
Tim Northover	c4093c3ced	ARM: don't mangle DAG constant if it has more than one use The basic optimisation was to convert (mul $LHS, $complex_constant) into roughly "(shl (mul $LHS, $simple_constant), $simple_amt)" when it was expected to be cheaper. The original logic checks that the mul only has one use (since we're mangling $complex_constant), but when used in even more complex addressing modes there may be an outer addition that can pick up the wrong value too. I think the ARM addressing-mode problem is actually unreachable at the moment, but that depends on complex assessments of the profitability of pre-increment addressing modes so I've put a real check in there instead of an assertion. llvm-svn: 259228	2016-01-29 19:18:46 +00:00
Derek Schuff	d91a12ec11	[WebAssembly] Update test expectations llvm-svn: 259223	2016-01-29 18:54:38 +00:00
Derek Schuff	6ea637af35	[WebAssembly] Support frame pointer Add support for frame pointer use in prolog/epilog. Supports dynamic allocas but not yet over-aligned locals. Target-independent CG generates SP updates, but we still need to write back the SP value to memory when necessary. llvm-svn: 259220	2016-01-29 18:37:49 +00:00
Zoran Jovanovic	d474ef3a3b	[mips] Absolute value macro expansion Author: obucina Reviewers: dsanders Differential Revision: http://reviews.llvm.org/D16323 llvm-svn: 259202	2016-01-29 16:18:34 +00:00
Alexandros Lamprineas	8c26e7c647	[ARM] Emit trap instruction using .inst directive The trap instruction is emitted as a data-in-text rather than an instruction. This patch uses the .inst directive for emitting trap. Differential Revision: http://reviews.llvm.org/D16684 llvm-svn: 259182	2016-01-29 10:23:32 +00:00
Matt Arsenault	295875efda	AMDGPU: Remove 24-bit intrinsics The known bit matching code seems to work reasonably well, so these shouldn't really be needed. llvm-svn: 259180	2016-01-29 10:05:16 +00:00
Eric Christopher	7d9b9b2d7d	Refactor common code for PPC fast isel load immediate selection. llvm-svn: 259178	2016-01-29 07:20:30 +00:00
Eric Christopher	5a2429e239	Since LI/LIS sign extend the constant passed into the instruction we should check that the sign extended constant fits into 16-bits if we want a zero extended value, otherwise go ahead and put it together piecemeal. Fixes PR26356. llvm-svn: 259177	2016-01-29 07:20:01 +00:00
Eric Christopher	80ba58a15c	Fix up conditional formatting. llvm-svn: 259176	2016-01-29 07:19:49 +00:00
David Majnemer	f2bb710da5	[WinEH] Don't perform state stores in cleanups Our cleanups do not support true lexical nesting of funclets which obviates the need to perform state stores. This fixes PR26361. llvm-svn: 259161	2016-01-29 05:33:15 +00:00
Ahmed Bougacha	53010a0d5b	[AArch64] Fix i64 nontemporal high-half extraction. Since we only have pair - not single - nontemporal store instructions, we have to extract the high part into a separate register to be able to use them. When the initial nontemporal codegen support was added, I wrote the extract using the nonsensical UBFX [0,32[. Use the correct LSR form instead. llvm-svn: 259134	2016-01-29 01:08:41 +00:00
Matt Arsenault	5b39b34ca5	AMDGPU: Match fmed3 patterns with legacy fmin/fmax llvm-svn: 259090	2016-01-28 20:53:48 +00:00
Matt Arsenault	f639c32739	AMDGPU: Match some med3 patterns llvm-svn: 259089	2016-01-28 20:53:42 +00:00
Matt Arsenault	7293f9895e	AMDGPU: Set DX10Clamp bit llvm-svn: 259088	2016-01-28 20:53:35 +00:00
Tom Stellard	3d2c852958	AMDGPU: waitcnt operand fixes Summary: Allow lgkmcnt up to 0xF (hardware allows that). Fix mask for ExpCnt in AMDGPUInstPrinter. Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16314 Patch by: Nikolay Haustov llvm-svn: 259059	2016-01-28 17:13:44 +00:00
Mitch Bodart	e5cadbbcdd	[X86] Test commit, fixed typos in comments. NFC. llvm-svn: 259057	2016-01-28 16:40:51 +00:00
Tom Stellard	2ff726272a	AMDGPU: Move subtarget specific code out of AMDGPUInstrInfo.cpp Summary: Also delete all the stub functions that are identical to the implementations in TargetInstrInfo.cpp. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16609 llvm-svn: 259054	2016-01-28 16:04:37 +00:00
Chad Rosier	3ada75f7e8	[AArch64] Set MMOs on pre- and post-index instructions. Without the MMOs the MI scheduler is unable to reason about the dependencies of these instructions. llvm-svn: 259052	2016-01-28 15:38:24 +00:00
Simon Pilgrim	de16172d9d	[x86] Merge multiple calls to DAG.getTargetLoweringInfo(). NFC. llvm-svn: 259050	2016-01-28 15:29:11 +00:00
Oliver Stannard	02fa1c80c4	Revert r259035, it introduces a cyclic library dependency llvm-svn: 259045	2016-01-28 13:19:47 +00:00
Igor Breger	fca0a34398	AVX512: Fix truncate v32i8 to v32i1 lowering implementation. Enable truncate 128/256bit packed byte/word with AVX512BW but without AVX512VL, use 512bit instructions. Differential Revision: http://reviews.llvm.org/D16531 llvm-svn: 259044	2016-01-28 13:19:25 +00:00
Benjamin Kramer	16e0f147a9	Unbreak the wasm backend again after r259035. llvm-svn: 259040	2016-01-28 11:26:34 +00:00
Zoran Jovanovic	838eabcd46	[mips][microMIPS] Disable FastISel for microMIPS Author: milena.vujosevic.janicic Reviewers: dsanders FastIsel is not supported for microMIPS, thus it needs to be disabled. Test micromips-zero-mat-uses.ll is deleted since the tested sequence of instructions is not generated for microMIPS without FastISel. Differential Revision: http://reviews.llvm.org/D15892 llvm-svn: 259039	2016-01-28 11:08:03 +00:00
Oliver Stannard	b4b092ea1b	Add backend diagnostic printer for unsupported features Re-commit of r258951 after fixing layering violation. The related LLVM patch adds a backend diagnostic type for reporting unsupported features, this adds a printer for them to clang. In the case where debug location information is not available, I've changed the printer to report the location as the first line of the function, rather than the closing brace, as the latter does not give the user any information. This also affects optimisation remarks. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 259035	2016-01-28 10:07:27 +00:00
Simon Pilgrim	d3b78430d1	[X86][SSE] Move setTargetShuffleZeroElements closer to getTargetShuffleMask. NFCI. Keep target shuffle mask helper functions closer together. llvm-svn: 259034	2016-01-28 09:45:01 +00:00
Asaf Badouh	42852d99e7	[X86][AVX512] small fix in ptestm intrinsics move ptestm{q|d} intrinsics from patterns form (in td file) to the intrinsics table Differential Revision: http://reviews.llvm.org/D16633 llvm-svn: 259029	2016-01-28 08:33:22 +00:00
JF Bastien	1e02c70ba3	WebAssembly: fix build r259016 didn't also revert r258957 which broke the WebAssembly build. llvm-svn: 259020	2016-01-28 05:05:17 +00:00
NAKAMURA Takumi	628a7a0aef	Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features" It introduced a layering violation in LLVMIR. clang r258950 "Add backend diagnostic printer for unsupported features" llvm r258951 "Refactor backend diagnostics for unsupported features" llvm-svn: 259016	2016-01-28 04:41:32 +00:00
Dan Gohman	fbfe5ec4a4	[WebAssembly] Don't stackify a register def past a get_local use in the same tree. llvm-svn: 259013	2016-01-28 03:59:09 +00:00
Dan Gohman	adf28177eb	[WebAssembly] Enhanced register stackification This patch revamps the RegStackifier pass with a new tree traversal mechanism, enabling three major new features: - Stackification of values with multiple uses, using the result value of set_local - More aggressive stackification of instructions with side effects - Reordering operands in commutative instructions to enable more stackification. llvm-svn: 259009	2016-01-28 01:22:44 +00:00
Adam Nemet	dadfbb52f7	[TTI] Add getPrefetchDistance from PPCLoopDataPrefetch, NFC This patch is part of the work to make PPCLoopDataPrefetch target-independent (http://thread.gmane.org/gmane.comp.compilers.llvm.devel/92758). As it was discussed in the above thread, getPrefetchDistance is currently using instruction count which may change in the future. llvm-svn: 258995	2016-01-27 22:21:25 +00:00
Derek Schuff	4dd6778660	[WebAssembly] Implement byval arguments Summary: Just does the simple allocation of a stack object and passes a pointer to the callee. Differential Revision: http://reviews.llvm.org/D16610 llvm-svn: 258989	2016-01-27 21:17:39 +00:00
Tim Northover	042a6c1fe1	ARMv7k: base ABI decision on v7k Arch rather than watchos OS. Various bits we want to use the new ABI actually compile with "-arch armv7k -miphoneos-version-min=9.0". Not ideal, but also not ridiculous given how slices work. llvm-svn: 258975	2016-01-27 19:32:29 +00:00
Benjamin Kramer	391be792f2	One more batch of self-containing headers. llvm-svn: 258974	2016-01-27 19:29:56 +00:00
Benjamin Kramer	b32a5042bd	Don't put classes in headers into anonymous namespaces. You want ODR violations? That's how you get ODR violations. llvm-svn: 258973	2016-01-27 19:29:42 +00:00
Benjamin Kramer	c8be5be968	Unbreak wasm build after r258951. llvm-svn: 258957	2016-01-27 18:03:40 +00:00
Benjamin Kramer	45275a4d3c	Make more headers self-contained. A lot of this comes from the new complete type requirement of DenseMap. llvm-svn: 258956	2016-01-27 18:03:37 +00:00
Oliver Stannard	1e67a9f196	Refactor backend diagnostics for unsupported features The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. The implementation of DiagnosticInfoUnsupported::print must be in lib/Codegen rather than in the existing file in lib/IR/ to avoid introducing a dependency from IR to CodeGen. Differential Revision: http://reviews.llvm.org/D16590 llvm-svn: 258951	2016-01-27 17:30:33 +00:00
Benjamin Kramer	f9172fd4ac	Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/ It's a SelectionDAG thing, not a Target thing. llvm-svn: 258939	2016-01-27 16:32:26 +00:00
Benjamin Kramer	820f7548a1	Make some headers self-contained, remove unused includes that violate layering. llvm-svn: 258937	2016-01-27 16:05:37 +00:00
Tom Stellard	6e3b14de62	AMDGPU/SI: Fix commuting of 32-bit VOPC instructions Summary: We didn't have entries in the commuting table for the 32-bit instructions. I don't think we hit this problem now, but we will once uniform branching is enabled. Tests will come in a later commit. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16600 llvm-svn: 258936	2016-01-27 15:53:52 +00:00
Benjamin Kramer	d477e9e378	Revert "Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed." and "Add a missing test case for r258847." This reverts commit r258847, r258848. Causes miscompilations and backend errors. llvm-svn: 258927	2016-01-27 12:44:12 +00:00
Marek Olsak	e86f252209	AMDGPU/SI: Stoney has only 16 LDS banks Summary: This is a candidate for stable, along with all patches that add the "stoney" processor. Reviewers: tstellarAMD Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16485 llvm-svn: 258922	2016-01-27 11:19:45 +00:00
Benjamin Kramer	b3e8a6d2b8	Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs. llvm-svn: 258917	2016-01-27 10:01:28 +00:00
Igor Breger	b1bd47ca1a	AVX512: Fix vpmovzxbw predicate for AVX1/2 instructions. Differential Revision: http://reviews.llvm.org/D16595 llvm-svn: 258915	2016-01-27 08:57:46 +00:00
Igor Breger	d6c187b038	AVX512: Add store mask patterns. Differential Revision: http://reviews.llvm.org/D16596 llvm-svn: 258914	2016-01-27 08:43:25 +00:00
Matt Arsenault	b22828f2fb	AMDGPU: Fix default device handling When no device name is specified, default to kaveri for HSA since SI is not supported and it would fail. Default to "tahiti" instead of "SI" since these are effectively the same, and tahiti is an actual device. Move default device handling to the TargetMachine rather than the AMDGPUSubtarget. The module ISA version is computed from the device name provided with the target machine, so the attributes printed by the AsmPrinter were inconsistent with those computed in the subtarget. Also remove DevName field from subtarget since it's redundant with getCPU() in the superclass. llvm-svn: 258901	2016-01-27 02:17:49 +00:00
Reid Kleckner	5b4637141e	[llvm-tblgen] Avoid StringMatcher for GCC and MS builtin names This brings the compile time of Function.cpp from ~40s down to ~4s for me locally. It also shaves off about 400KB of object file size in a release+asserts build. I also realized that the AMDGPU backend does not have any GCC builtin names to match, so the extra lookup was a no-op. I removed it to silence a zero-length string table array warning. There should be no functional change here. This change really ends the story of PR11951. llvm-svn: 258897	2016-01-27 01:43:12 +00:00
Reid Kleckner	1c93b4cd7b	[llvm-tblgen] Stop emitting the intrinsic name matching code The AMDGPU backend was the last user of the old StringMatcher recognition code. Move it over to the new lookupLLVMIntrinsicName function, which is now improved to handle all of the interesting edge cases exposed by AMDGPU intrinsic names. llvm-svn: 258875	2016-01-26 23:01:21 +00:00
Derek Schuff	90d9e8d370	[WebAssembly] Omit no-op adds for non-mem uses of FrameIndex Differential Revision: http://reviews.llvm.org/D16554 llvm-svn: 258872	2016-01-26 22:47:43 +00:00
Sanjay Patel	06fe9183b0	[x86] make the subtarget member a const reference, not a pointer ; NFCI It's passed in as a reference; it's not optional; it's not a pointer. llvm-svn: 258867	2016-01-26 22:08:58 +00:00
Simon Pilgrim	00adc1e105	[X86] Add support for zeroed shuffle elements to getShuffleScalarElt Enable handling of SM_SentinelZero shuffle elements to getShuffleScalarElt. Improves VZEXT_LOAD matches in EltsFromConsecutiveLoads. llvm-svn: 258865	2016-01-26 21:39:25 +00:00
Chris Bieneman	e49730d4ba	Remove autoconf support Summary: This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html "I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened." - Obi Wan Kenobi Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16471 llvm-svn: 258861	2016-01-26 21:29:08 +00:00
Derek Schuff	e7305cc4b3	[WebAssembly] Remove check for FrameIndex operands in WebAssemblyPeephole This pass runs after FrameIndex elimination, so it should never see FI operands. NFC llvm-svn: 258860	2016-01-26 21:08:27 +00:00
Sanjay Patel	3e1701da29	[x86] add materializeVectorConstant() helper function; NFC LowerBUILD_VECTOR is still over 300 lines long, but it's a start... llvm-svn: 258858	2016-01-26 21:05:00 +00:00
JF Bastien	43436716aa	WebAssembly NFC: update error message I forgot to update this one in my previous patch. llvm-svn: 258853	2016-01-26 20:24:51 +00:00
JF Bastien	1a6c7608b1	WebAssembly: don't optimize memcpy/memmove/memset to frame index r258781 optimized memcpy/memmove/memset so the intrinsic call can return its first argument, but missed the frame index case. Teach it to ignore that case so C code doesn't assert out in these cases. llvm-svn: 258851	2016-01-26 20:22:42 +00:00
Cong Hou	551a57f797	Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed. Currently, AnalyzeBranch() fails on non-equality comparisons between floating points on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this function can modify the branch by reversing the conditional jump and removing the unconditional jump if there is a proper fall-through. However, in the case of non-equality comparison between floating points, this can turn the branch "unanalyzable". Consider the following case: jne .BB1; jp .BB1; jmp .BB2; .BB1: ...; .BB2: ... AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be removed: jne .BB1; jnp .BB2; .BB1: ...; .BB2: ... However, AnalyzeBranch() cannot analyze this branch anymore as there are two conditional jumps with different targets. This may disable some optimizations like block-placement: in this case the fall-through behavior is enforced even if the fall-through block is very cold, which is suboptimal. Actually this optimization is also done in the block-placement pass, which means we can remove this optimization from AnalyzeBranch(). However, currently X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there are no defined negation conditions for them. In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them. Here only the second conditional jump is reversed. This is valid as we only need them to do this "unconditional jump removal" optimization. Differential Revision: http://reviews.llvm.org/D11393 llvm-svn: 258847	2016-01-26 20:08:01 +00:00
Sanjay Patel	70fa79fdf2	[x86] simplify getOnesVector() ; NFCI Let DAG.getConstant() handle the splatting; there's no need to repeat that logic here. llvm-svn: 258833	2016-01-26 18:49:36 +00:00


36032 Commits