llvm-project

Commit Graph

Author	SHA1	Message	Date
Jakob Stoklund Olesen	c3bcb02154	Eliminate copies of undefined values during coalescing. These copies would coalesce easily, but the resulting value would be defined by a deleted instruction. Now we also remove the undefined value number from the destination register. This fixes PR10503. llvm-svn: 136174	2011-07-26 23:00:24 +00:00
Benjamin Kramer	a79c1e0589	Update test. llvm-svn: 136170	2011-07-26 22:45:39 +00:00
Benjamin Kramer	124ac2b997	Add a neat little two's complement hack for x86. On x86 we can't encode an immediate LHS of a sub directly. If the RHS comes from a XOR with a constant we can fold the negation into the xor and add one to the immediate of the sub. Then we can turn the sub into an add, which can be commuted and encoded efficiently. This code is generated for __builtin_clz and friends. llvm-svn: 136167	2011-07-26 22:42:13 +00:00
Bruno Cardoso Lopes	f8fe47bd2b	Recognize unpckh* masks and match 256-bit versions. The new versions are different from the previous 128-bit because they work in lanes. Update a few comments and add testcases llvm-svn: 136157	2011-07-26 22:03:40 +00:00
Eli Friedman	93dc04d5ca	Prevent x86-specific DAGCombine from creating nodes with illegal type (which could not be selected). Fixes a minor isel issue that was breaking the testcase from r136130. llvm-svn: 136148	2011-07-26 21:02:58 +00:00
Jim Grosbach	73a8393a47	FileCheck'ize test. llvm-svn: 136135	2011-07-26 20:49:44 +00:00
Eli Friedman	747430417b	XFAIL this test while I investigate it; it's failing for an unexpected reason. llvm-svn: 136131	2011-07-26 20:41:03 +00:00
Eli Friedman	06b8b571b2	Add obvious missing case to switch. PR10497. llvm-svn: 136130	2011-07-26 20:38:49 +00:00
Bruno Cardoso Lopes	d600a0f878	Add 256-bit isel for movsldup/movshdup llvm-svn: 136051	2011-07-26 02:39:32 +00:00
Bruno Cardoso Lopes	9212bf275d	Codegen allonesvector better while using AVX: vpcmpeqd + vinsertf128 This also fixes PR10452 llvm-svn: 136004	2011-07-25 23:05:32 +00:00
Bruno Cardoso Lopes	123dff0f58	- Handle special scalar_to_vector case: splats. Use a native 128-bit shuffle before inserting into a 256-bit vector. - Add AVX versions of movd/movq instructions - Introduce a few COPY patterns to match insert_subvector instructions. This turns a trivial insert_subvector instruction into a register copy, coalescing the xmm into a ymm and avoiding emitting one more instruction. llvm-svn: 136002	2011-07-25 23:05:25 +00:00
Eli Friedman	442d1b199f	Attempt to fix test failure reported on llvm-commits. llvm-svn: 135995	2011-07-25 22:28:51 +00:00
Eli Friedman	cbd3ba91b7	Make sure this DAGCombine actually returns an UNDEF of the correct type; PR10476. llvm-svn: 135993	2011-07-25 22:25:42 +00:00
Eli Friedman	ea8c66fea5	Get rid of an incorrect optimization for shuffles with PALIGNR and simplify isPALIGNRMask. Addresses PR10466, although the crash from that PR only triggers in cases where DAGCombine misses optimizing a shuffle. llvm-svn: 135980	2011-07-25 21:36:45 +00:00
Jakob Stoklund Olesen	56a56eb80e	Correctly handle <undef> tied uses when rewriting after a split. This fixes PR10463. A two-address instruction with an <undef> use operand was incorrectly rewritten so the def and use no longer used the same register, violating the tie constraint. Fix this by always rewriting <undef> operands with the register a def operand would use. llvm-svn: 135885	2011-07-24 20:23:50 +00:00
Bruno Cardoso Lopes	7a2075511b	Fix test check! llvm-svn: 135802	2011-07-22 20:55:28 +00:00
Bruno Cardoso Lopes	a89039998d	Fix PR10422 by adding the necessary AVX UCOMISD memory versions to load folding logic llvm-svn: 135801	2011-07-22 20:53:20 +00:00
Rafael Espindola	77242dd537	Turn shuffles into unpacks for VT == MVT::v2i64 and MVT::v2f64 too. Patch by Jeff Muizelaar. llvm-svn: 135789	2011-07-22 18:56:05 +00:00
Bruno Cardoso Lopes	612e56174b	- Inspected an AVX code block added by someone in early Feb. This was never used and was actually very wrong, fix it and make it simpler. Also remove the ConcatVectors function, which is unused now. - Fix an introduction of useless nodes in r126664 and r126264. The VUNPCKL* should never be introduced because we don't want duplicate nodes for 128 AVX and non-AVX modes; the actual instruction difference only exists during isel, but not for target specific DAG nodes. We only introduce V* target nodes when there is no 128-bit version already there. - Fix a fragile test and make it more useful. llvm-svn: 135729	2011-07-22 00:15:07 +00:00
Bruno Cardoso Lopes	14a95bda04	Although we already support this, add testcases for consistency llvm-svn: 135728	2011-07-22 00:15:03 +00:00
Bruno Cardoso Lopes	91eff5140f	Add a DAGCombine for transforming 128->256 casts into a simple vxorps + vinsertf128 pair of instructions llvm-svn: 135727	2011-07-22 00:15:00 +00:00
Bruno Cardoso Lopes	178fb40612	- Register v16i16 as valid VR256 register class - Add more bitcasts for v16i16 - Since 135661 and 135662 already added the splat logic, just add one more splat test for v16i16 llvm-svn: 135663	2011-07-21 02:24:08 +00:00
Bruno Cardoso Lopes	b878caa5e2	Add support for 256-bit versions of VPERMIL instruction. This is a new instruction introduced in AVX, which can operate on 128 and 256-bit vectors. It considers a 256-bit vector as two independent 128-bit lanes. It can permute any 32- or 64-bit elements inside a lane, and restricts the second lane to have the same permutation as the first one. With the improved splat support introduced earlier today, adding codegen for this instruction enables more efficient 256-bit code: Instead of: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vextractf128 $1, %ymm0, %xmm1 shufps $1, %xmm1, %xmm1 movss %xmm1, 28(%rsp) movss %xmm1, 24(%rsp) movss %xmm1, 20(%rsp) movss %xmm1, 16(%rsp) vextractf128 $0, %ymm0, %xmm0 shufps $1, %xmm0, %xmm0 movss %xmm0, 12(%rsp) movss %xmm0, 8(%rsp) movss %xmm0, 4(%rsp) movss %xmm0, (%rsp) vmovaps (%rsp), %ymm0 We get: vextractf128 $0, %ymm0, %xmm0 punpcklbw %xmm0, %xmm0 punpckhbw %xmm0, %xmm0 vinsertf128 $0, %xmm0, %ymm0, %ymm1 vinsertf128 $1, %xmm0, %ymm1, %ymm0 vpermilps $85, %ymm0, %ymm0 llvm-svn: 135662	2011-07-21 01:55:47 +00:00
Devang Patel	bcd50a10d5	While emitting a constant value, look through the derived type and use the underlying basic type to determine the size and signedness of the constant value. llvm-svn: 135627	2011-07-20 21:57:04 +00:00
Eli Friedman	6ed783228d	PR10421: Fix a straightforward bug in the widening logic for CONCAT_VECTORS. llvm-svn: 135595	2011-07-20 18:14:33 +00:00
Evan Cheng	76792992d6	Add MCObjectFileInfo and sink the MCSections initialization code from TargetLoweringObjectFileImpl down to MCObjectFileInfo. TargetAsmInfo is down to one last method. It's almost gone! llvm-svn: 135569	2011-07-20 05:58:47 +00:00
Eric Christopher	60648578ba	New pointer rotate test. llvm-svn: 135562	2011-07-20 03:09:11 +00:00
Akira Hatanaka	a4c09bce9b	Lower memory barriers to sync instructions. llvm-svn: 135537	2011-07-19 23:30:50 +00:00
Evan Cheng	ccf243d56b	Fix an obvious typo that's preventing x86 (32-bit) from using .literal16. llvm-svn: 135535	2011-07-19 23:14:32 +00:00
Akira Hatanaka	f3b29992d5	Use the correct opcodes: SLLV/SRLV or AND must be used instead of SLL/SRL or ANDi, when the instruction does not have any immediate operands. llvm-svn: 135520	2011-07-19 20:34:00 +00:00
Akira Hatanaka	e450358a21	Remove redundant instructions. - In EmitAtomicBinaryPartword, mask incr in loopMBB only if atomic.swap is the instruction being expanded, instead of masking it in thisMBB. - Remove redundant Or in EmitAtomicCmpSwap. llvm-svn: 135495	2011-07-19 18:14:26 +00:00
Richard Osborne	f1b800998a	Add intrinsics for the zext / sext instructions. llvm-svn: 135476	2011-07-19 13:28:50 +00:00
Richard Osborne	252c43ee88	Add intrinsics for the testct, testwct instructions. llvm-svn: 135475	2011-07-19 13:00:40 +00:00
Richard Osborne	707f0beae1	Add intrinsics for the peek and endin instructions. llvm-svn: 135474	2011-07-19 12:50:25 +00:00
Evan Cheng	2129f59637	Introduce MCCodeGenInfo, which keeps information that can affect codegen (including compilation, assembly). Move relocation model Reloc::Model from TargetMachine to MCCodeGenInfo so it's accessible even without TargetMachine. llvm-svn: 135468	2011-07-19 06:37:02 +00:00
Devang Patel	9ab3cac694	Revert r135423. llvm-svn: 135454	2011-07-19 00:28:24 +00:00
Eli Friedman	4d5532a085	FileCheck-ize a couple tests. llvm-svn: 135427	2011-07-18 21:23:42 +00:00
Devang Patel	4dc76f2438	During bottom-up fast-isel, instructions emitted to materialize registers are at the top of the basic block and do not have a debug location. This may mislead the debugger when entering the basic block; sometimes the debugger provides a semi-useful view of the current location by picking up the previous known location as the current one. Assign a sensible location to the first instruction in a basic block, if it does not have a location derived from the source file, so that the debugger can provide a meaningful experience to developers in edge cases. [take 2] llvm-svn: 135423	2011-07-18 20:55:23 +00:00
Akira Hatanaka	338879a7f4	Do not treat atomic.load.sub differently than other atomic binary intrinsics. llvm-svn: 135418	2011-07-18 19:58:59 +00:00
Akira Hatanaka	27292638bd	Set mayLoad or mayStore flags for SC and LL in order to prevent LICM from moving them out of the loop. Previously, stores and loads to a stack frame object were inserted to accomplish this. Remove the code that was needed to do this. Patch by Sasa Stankovic. llvm-svn: 135415	2011-07-18 18:52:12 +00:00
Jakob Stoklund Olesen	c45d38e14a	Fix a crash when building 177.mesa for armv6. When splitting a live range immediately before an LDR_POST instruction that redefines the address register, make sure to use the correct value number in leaveIntvBefore. We need the value number entering the instruction. <rdar://problem/9793765> llvm-svn: 135413	2011-07-18 18:47:13 +00:00
Bruno Cardoso Lopes	4208cace5f	Add AVX 128-bit sqrt versions llvm-svn: 135404	2011-07-18 17:51:40 +00:00
Nick Lewycky	d8921f939c	Delete empty unused file. llvm-svn: 135379	2011-07-18 05:54:06 +00:00
Bruno Cardoso Lopes	4480040191	Add AVX 128-bit patterns for sint_to_fp llvm-svn: 135332	2011-07-16 00:50:20 +00:00
Bruno Cardoso Lopes	8df9cfc279	Fix a couple of things: 1) Make non-legal 256-bit loads be promoted to v4i64. This lets us canonicalize the loads and handle things the same way we used to handle them for 128-bit registers. Despite what one of the removed comments explained, the load promotion would not mess with VPERM; it's only a matter of doing the appropriate bitcasts when these instructions come to be introduced. Also make LOAD v8i32 legal. 2) Doing 1) exposed two bugs: - v4i64 was being promoted to itself for several opcodes (introduced in r124447 by David Greene), causing endless recursion and the stack to explode. - there was no support for allOnes BUILD_VECTORs and ANDNP would fail to match because it was generating early target constant pools during lowering. 3) The testcases are already checked in; doing 1) exposed the bugs in the current testcases. 4) Tidy up code to be clearer and more explicit about AVX. llvm-svn: 135313	2011-07-15 22:24:33 +00:00
Owen Anderson	454e1c7abb	Remove VMOVDneon and VMOVQ, which are just aliases for VORR. This continues to simplify the path towards an auto-generated disassembler. llvm-svn: 135290	2011-07-15 18:46:47 +00:00
Eric Christopher	92464be28c	Check register class matching instead of width of type matching when determining validity of matching constraint. Allow i1 types access to the GR8 reg class for x86. Fixes PR10352 and rdar://9777108 llvm-svn: 135180	2011-07-14 20:13:52 +00:00
Bruno Cardoso Lopes	6778597deb	Add 256-bit load/store recognition and matching in several places. llvm-svn: 135171	2011-07-14 18:50:58 +00:00
Eric Christopher	0c666b4664	Add a testcase for r135123. Part of rdar://9761830 llvm-svn: 135133	2011-07-14 06:23:09 +00:00
Benjamin Kramer	15cd5a3f12	Don't emit a bit test if there is only one case the test can yield false. A simple SETNE is sufficient. llvm-svn: 135126	2011-07-14 01:38:42 +00:00
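
The two's complement rewrite described in commit 124ac2b997 above appears to rest on the identity C - (X ^ K) == (X ^ ~K) + (C + 1) in wraparound arithmetic, since -Y == ~Y + 1 and ~(X ^ K) == X ^ ~K. The following is a minimal Python sketch that checks this identity for 32-bit values. It is not the LLVM implementation; the function names and the sample constants are hypothetical and chosen only for illustration.

```python
MASK32 = 0xFFFFFFFF  # emulate 32-bit wraparound arithmetic


def sub_imm_xor(c, x, k):
    """Original shape: immediate on the LHS of a sub, RHS is x XOR constant."""
    return (c - (x ^ k)) & MASK32


def add_imm_xor(c, x, k):
    """Rewritten shape: negation folded into the XOR constant, immediate bumped by one.

    -(x ^ k) == ~(x ^ k) + 1 == (x ^ ~k) + 1, so the sub becomes an add.
    """
    folded_k = ~k & MASK32
    return ((x ^ folded_k) + c + 1) & MASK32


def identity_holds(samples):
    """Return True if both shapes agree on every (c, x, k) triple in samples."""
    return all(sub_imm_xor(c, x, k) == add_imm_xor(c, x, k) for c, x, k in samples)
```

For instance, with c = 32, x = 4 and k = 31, both forms evaluate to 5. Because the rewritten form is an add, its operands can be commuted so the immediate ends up on the right-hand side, which is what the commit message says makes it encodable on x86.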


4802 Commits