llvm-project

Commit Graph

Author	SHA1	Message	Date
Igor Breger	defab3c1ef	AVX512: vpextrb/w/d/q and vpinsrb/w/d/q implementation. These instructions don't have intrinsics. Added tests for lowering and encoding. Differential Revision: http://reviews.llvm.org/D12317 llvm-svn: 249688	2015-10-08 12:55:01 +00:00
Michael Kuperstein	04e79329d0	[X86] Fix wrong treatment of multi-lane blends in BUILD_VECTORtoBlendMask() This fixes two separate bugs: 1) The mask for the high lane was not set correctly. That fixes PR24532. 2) The transformation should bail out if it believes it involves more than 2 lanes, as it does not currently do anything sensible in this case. Differential Revision: http://reviews.llvm.org/D13505 llvm-svn: 249669	2015-10-08 08:13:02 +00:00
Michael Kuperstein	2b3c16ca17	Do not assert on first non-prologue instruction being a CFI directive. llvm-svn: 249668	2015-10-08 07:48:49 +00:00
Jonas Paulsson	5d3fbd3733	[SystemZ] SystemZElimCompare pass improved. Compare elimination extended to recognize load-and-test instructions used for comparison and eliminate them the same way as with compare instructions. Test case fp-cmp-05.ll updated to expect optimized results now also for z13. The order of the instruction shortening and compare elimination passes has been changed so that opcodes do not have to be handled in both passes. Reviewed by Ulrich Weigand. llvm-svn: 249666	2015-10-08 07:40:23 +00:00
Jonas Paulsson	7c5ce10a07	[SystemZ] Use load-and-test for fp compare with 0 if vector support is present. Since the LTxBRCompare instructions can't be used with vector registers, a normal load-and-test instruction (with a modelled def operand) is used instead. Reviewed by Ulrich Weigand. llvm-svn: 249664	2015-10-08 07:40:16 +00:00
Reid Kleckner	94fe836afa	[WinEH] Add missing test case for llvm.eh.exceptioncode llvm-svn: 249638	2015-10-07 23:55:06 +00:00
Reid Kleckner	97797419e6	[WinEH] Fix 32-bit funclet epilogues in the presence of dynamic allocas In particular, passing non-trivially copyable objects by value on win32 uses a dynamic alloca (inalloca). We would clobber ESP in the epilogue and end up returning to outer space. llvm-svn: 249637	2015-10-07 23:55:01 +00:00
David Majnemer	6af5f82c20	[WinEH] Refer to filter funclets using their symbol-table symbol The relocation for the filter funclet will be against a symbol table entry for a function instead of the section, making it easier to understand what is going on. llvm-svn: 249621	2015-10-07 21:34:00 +00:00
Reid Kleckner	70bf6bb5e6	[WinEH] Undo the effect of r249578 for 32-bit The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll test case, and immediately noticed that 32-bit catching stopped working. While I'm at it, move the frame index to frame offset WinEH table logic out of PEI. PEI shouldn't have to know about WinEHFuncInfo. I realized we can calculate frame index offsets just fine from the table printer. llvm-svn: 249618	2015-10-07 21:13:15 +00:00
David Majnemer	c289c9ff55	[WinEH] Remove unreachable blocks before preparation We remove unreachable blocks because it is pointless to consider them for coloring. However, we still had stale pointers to these blocks in some data structures after we removed them from the function. Instead, remove the unreachable blocks before attempting to do anything with the function. This fixes PR25099. llvm-svn: 249617	2015-10-07 21:08:25 +00:00
Joseph Tremoulet	39234fc67e	[WinEH] Set NoModuleLevelChanges in clone flags Summary: This is necessary to keep the cloner from making bogus copies of debug metadata attached to the IR it is cloning. Also, avoid running RemapInstruction over all instructions in the common case that no cloning was performed. Reviewers: rnk, andrew.w.kaylor, majnemer Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D13514 llvm-svn: 249591	2015-10-07 19:29:56 +00:00
Kevin B. Smith	99e8c0fffb	[X86]Update test to use FileCheck. Updates this test to use FileCheck and a single llc invocation rather than 3 llc invocations and grep. llvm-svn: 249583	2015-10-07 18:21:41 +00:00
Chad Rosier	7c6ac2b8f9	[AArch64] Fold a floating-point divide by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249579	2015-10-07 17:51:37 +00:00
Reid Kleckner	33bd2d99d8	[WinEH] Fix two minor issues in __CxxFrameHandler3 tables There was an off-by-one bug in ip2state tables which manifested when one call immediately preceded the try-range of the next. The return address of the previous call would appear to be within the try range of the next scope, resulting in extra destructors or catches running. We also computed the wrong offset for catch parameter stack objects. The offset should be from RSP, not from RBP. llvm-svn: 249578	2015-10-07 17:49:32 +00:00
Chad Rosier	fa30c9b436	[AArch64] Fold a floating-point multiply by power of two into fp conversion. Part of http://reviews.llvm.org/D13442 llvm-svn: 249576	2015-10-07 17:39:18 +00:00
Chad Rosier	169865ffda	[ARM] Promote helper function to SelectionDAG. I'll be using the function in a similar combine for AArch64. The helper was also improved to handle undef values. Part of http://reviews.llvm.org/D13442 llvm-svn: 249572	2015-10-07 17:28:58 +00:00
Oliver Stannard	d3d114ba54	[ARM] Use correct half-precision functions in EABI mode The ARM RTABI defines the half- to single-precision float conversion functions with an __aeabi prefix, but libgcc only has them with a __gnu prefix. Therefore we need to emit the __aeabi version when compiling with an eabi or eabihf triple, and the __gnu version with a gnueabi or gnueabihf triple. llvm-svn: 249565	2015-10-07 16:58:49 +00:00
Chad Rosier	17436bf64e	[ARM] Prevent PerformVDIVCombine from combining a vcvt/vdiv with 8 lanes. This would result in a crash since the vcvt used does not support v8i32 types. llvm-svn: 249560	2015-10-07 16:15:40 +00:00
Jeroen Ketema	aebca09543	[ARM][AArch64] Only lower to interleaved load/store if the target has NEON Without an additional check for NEON, the compiler crashes during legalization of NEON ldN/stN. Differential Revision: http://reviews.llvm.org/D13508 llvm-svn: 249550	2015-10-07 14:53:29 +00:00
Michael Kuperstein	259f1508f0	[X86] Emit .cfi_escape GNU_ARGS_SIZE when adjusting the stack before calls When outgoing function arguments are passed using push instructions, and EH is enabled, we may need to indicate to the stack unwinder that the stack pointer was adjusted before the call. This should fix the exception handling issues in PR24792. Differential Revision: http://reviews.llvm.org/D13132 llvm-svn: 249522	2015-10-07 07:01:31 +00:00
Eric Christopher	ab2802c58f	Update test to use FileCheck and clean up run lines to match the expected behavior. llvm-svn: 249498	2015-10-07 01:21:49 +00:00
Matt Arsenault	284192730a	AMDGPU: Use explicit register size indirect pseudos This stops using an unknown reg class operand. Currently build_vector selection has a broken-looking check where it tries to use a VGPR reg class and an SGPR one if it sees an SGPR use. When the source operand has an explicit VGPR class, illegal copies will be inserted that SIFixSGPRCopies will normally take care of later, which will allow removing the weird check of build_vector users. Without this, when removed, v_movrels_b32 would still be emitted even though all of the values were only stored in SGPRs. llvm-svn: 249494	2015-10-07 00:42:51 +00:00
Reid Kleckner	72ba70418f	[SEH] Add llvm.eh.exceptioncode intrinsic This will support the Clang __exception_code intrinsic. llvm-svn: 249492	2015-10-07 00:27:33 +00:00
David Majnemer	7735a6d07a	[WinEH] Create a separate MBB for funclet prologues Our current emission strategy is to emit the funclet prologue in the CatchPad's normal destination. This is problematic because intra-funclet control flow to the normal destination is not erroneous and results in us reevaluating the prologue if said control flow is taken. Instead, use the CatchPad's location for the funclet prologue. This correctly models our desire to have unwind edges evaluate the prologue but edges to the normal destination result in typical control flow. Differential Revision: http://reviews.llvm.org/D13424 llvm-svn: 249483	2015-10-06 23:31:59 +00:00
Tom Stellard	0fbf899c0f	AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments() Summary: We currently ignore the calling convention, so there is no real reason to assert on the calling convention of functions. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D13367 llvm-svn: 249468	2015-10-06 21:16:34 +00:00
Chad Rosier	cb14dd0265	[ARM] Simplify tests and make checks more rigid. NFC. llvm-svn: 249432	2015-10-06 17:54:12 +00:00
Krzysztof Parzyszek	fb33824efd	[Hexagon] Add an early if-conversion pass llvm-svn: 249423	2015-10-06 15:49:14 +00:00
Daniel Sanders	1b3341724c	[mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend Summary: This fixes 7 tests during fast LLVM test-suite run: * MultiSource/Benchmarks/McCat/18-imp/imp * MultiSource/Applications/oggenc/oggenc * MultiSource/Benchmarks/MallocBench/gs/gs * MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan * MultiSource/Benchmarks/VersaBench/beamformer/beamformer * MultiSource/Benchmarks/MiBench/consumer-lame/consumer-lame * MultiSource/Benchmarks/Bullet/bullet Error message was in the form of: fatal error: error in backend: Cannot select: 0x95c3288: f32 = fsqrt 0x95c0190 [ORD=9] [ID=18] 0x95c0190: f32 = fadd 0x95bef30, 0x95c4d00 [ORD=8] [ID=17] 0x95bef30: f32 = fmul 0x95c4988, 0x95c4988 [ORD=5] [ID=16] ... There was a problem with selecting the sqrt instruction in the LLVM backend. To fix the issue, changes are made in the TableGen definition for the sqrt instruction in MipsInstrFPU.td and a new test file sqrt.ll is added to the LLVM regression tests. Patch by Zlatko Buljan Reviewers: zoran.jovanovic, hvarga, dsanders Subscribers: llvm-commits, petarj Differential Revision: http://reviews.llvm.org/D13235 llvm-svn: 249416	2015-10-06 15:17:25 +00:00
Daniel Sanders	add9057fa7	Revert r249123 - [mips][microMIPS] Fix an issue with selecting sqrt instruction in LLVM backend The author was not credited and most of the commit message is missing. Will re-commit with this fixed. llvm-svn: 249415	2015-10-06 15:13:16 +00:00
Craig Topper	2c4068f409	[TwoAddressInstructionPass] When looking for a 3 addr conversion after commuting, make sure regB has been updated to take into account the commute. llvm-svn: 249378	2015-10-06 05:39:59 +00:00
Alexei Starovoitov	4e01a38da0	[bpf] Avoid extra pointer arithmetic for stack access For a program like the one below: struct key_t { int pid; char name[16]; }; extern void test1(char *); int test() { struct key_t key = {}; test1(key.name); return 0; } For key.name, llc/bpf may generate the code below: R1 = R10 // R10 is the frame pointer R1 += -24 // framepointer adjustment R1 |= 4 // R1 is then used as the first parameter of test1 The OR operation is not recognized by the in-kernel verifier. This patch introduces an intermediate FI_ri instruction and generates the following code that can be properly verified: R1 = R10 R1 += -20 Patch by Yonghong Song <yhs@plumgrid.com> llvm-svn: 249371	2015-10-06 04:00:53 +00:00
Craig Topper	79dd1bf094	[X86] Teach constant hoisting that ANDs with 64-bit immediates in the range 0x80000000-0xffffffff can be handled cheaply and don't need to be hoisted. Most importantly, this keeps constant hoisting from preventing instruction selection's ability to turn an AND with 0xffffffff into a move into a 32-bit subregister. llvm-svn: 249370	2015-10-06 02:50:24 +00:00
Dan Gohman	e51c058ecc	[WebAssembly] Switch to a more traditional assembly syntax This new syntax is built around putting each instruction on its own line in a "mnemonic op, op, op" like syntax. It also uses conventional data section directives like ".byte" and so on rather than requiring everything to be in hierarchical S-expression format. This is a more natural syntax for a ".s" file format from the perspective of LLVM MC and related tools, while remaining easy to translate into other forms as needed. llvm-svn: 249364	2015-10-06 00:27:55 +00:00
Scott Douglass	953f908173	[ARM] Modify codegen for memcpy intrinsic to prefer LDM/STM. We were previously codegen'ing memcpy as regular load/store operations and hoping that the register allocator would allocate registers in ascending order so that we could apply an LDM/STM combine after register allocation. According to the commit that first introduced this code (r37179), we planned to teach the register allocator to allocate the registers in ascending order. This never got implemented, and up to now we've been stuck with very poor codegen. A much simpler approach for achieving better codegen is to create MEMCPY pseudo instructions, attach scratch virtual registers to them and then, post register allocation, expand the MEMCPYs into LDM/STM pairs using the scratch registers. The register allocator will have picked arbitrary registers which we sort when expanding the MEMCPY. This approach also avoids the need to repeatedly calculate offsets which ultimately ought to be eliminated pre-RA in order to decrease register pressure. Fixes PR9199 and PR23768. [This is based on Peter Collingbourne's r238473 which was reverted.] Differential Revision: http://reviews.llvm.org/D13239 Change-Id: I727543c2e94136e0f80b8e22d5642d7b9ee5b458 Author: Peter Collingbourne <peter@pcc.me.uk> llvm-svn: 249322	2015-10-05 14:49:54 +00:00
Simon Pilgrim	bb01c6fda2	[X86][SSE4A] Added shuffle decode tests for 'special case' SSE4A EXTRQI/INSERTQI ops. llvm-svn: 249263	2015-10-04 10:12:53 +00:00
Igor Breger	78741a1b1e	AVX512: Implemented encoding and intrinsics for VPERMILPS/PD instructions. Added tests for intrinsics and encoding. Differential Revision: http://reviews.llvm.org/D12690 llvm-svn: 249261	2015-10-04 07:20:41 +00:00
David Majnemer	161935520d	[WinEH] Permit branch folding in the face of funclets Track which basic blocks belong to which funclets. Permit branch folding to fire but only if it can prove that doing so will not cause code in one funclet to be reused in another. llvm-svn: 249257	2015-10-04 02:22:52 +00:00
Simon Pilgrim	dde63374c5	[DAGCombiner] Generalize FADD constant combines to work with vectors Updated the FADD combines to work with vectors as well as scalars. Differential Revision: http://reviews.llvm.org/D13416 llvm-svn: 249251	2015-10-03 22:06:06 +00:00
Sanjay Patel	004ea240ad	add test cases that demonstrate bad behavior These are based on PR25016 and likely caused by a bug in MachineCombiner's definition of improvesCriticalPathLen(). llvm-svn: 249249	2015-10-03 20:52:55 +00:00
Simon Pilgrim	93ea954e6d	[X86][SSE] Add FADD combine tests. llvm-svn: 249240	2015-10-03 18:17:43 +00:00
Dan Gohman	dc51b96b7f	[WebAssembly] Implement the remaining conversion operations. This is a temporary assembly syntax that will likely evolve along with broader upcoming syntax changes. llvm-svn: 249225	2015-10-03 02:10:28 +00:00
Dan Gohman	6a050f30de	[WebAssembly] Rename setlocal to set_local to match the spec. llvm-svn: 249218	2015-10-03 00:01:53 +00:00
Dan Gohman	eb440092c9	[WebAssembly] Update this test for the new loop scheme. llvm-svn: 249217	2015-10-02 23:54:03 +00:00
Dan Gohman	e3e4a5ff52	[WebAssembly] Fix CFG stackification of nested loops. llvm-svn: 249187	2015-10-02 21:11:36 +00:00
Dan Gohman	9cc692b06e	[WebAssembly] Support calls marked as "tail", fastcc, and coldcc. llvm-svn: 249184	2015-10-02 20:54:23 +00:00
Richard Trieu	e0129e474d	Call the correct overload. Call the correct overload so a string literal does not get converted to a bool. Also fix the test case to match the names given. llvm-svn: 249183	2015-10-02 20:52:14 +00:00
Dan Gohman	baba8c648b	[WebAssembly] Add a resize_memory intrinsic. llvm-svn: 249178	2015-10-02 20:10:26 +00:00
Dan Gohman	72f1692a2c	[WebAssembly] Add a memory_size intrinsic. llvm-svn: 249171	2015-10-02 19:21:15 +00:00
Tim Northover	956b008db6	ARM: correctly align constant pool value on Thumb1 targets. Since we're using tLDRpci to access it, the constant pool's address must be 0 (mod 4). llvm-svn: 249163	2015-10-02 18:07:13 +00:00
Andrea Di Biagio	77f62652c1	Reapply r249121 : "[FastISel][x86] Teach how to select SSE2/AVX bitcasts between 128/256-bit vector types." This patch teaches FastISel the following two things: 1) On SSE2, no instructions are needed for bitcasts between 128-bit vector types; 2) On AVX, no instructions are needed for bitcasts between 256-bit vector types. Example: %1 = bitcast <4 x i32> %V to <2 x i64> Before (-fast-isel -fast-isel-abort=1): FastIsel miss: %1 = bitcast <4 x i32> %V to <2 x i64> Now we don't fall back to SelectionDAG and we correctly fold that computation propagating the register associated to %V. Originally reviewed here: http://reviews.llvm.org/D13347 llvm-svn: 249147	2015-10-02 16:08:05 +00:00


13875 Commits