llvm-project

Commit Graph

Author	SHA1	Message	Date
Bill Wendling	3d7b0b8ac7	Rename the 'Attributes' class to 'Attribute'. It's going to represent a single attribute in the future. llvm-svn: 170502	2012-12-19 07:18:57 +00:00
Patrik Hagglund	5e6c361bc0	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper functions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. This is the second attempt. In the first attempt (r169837), a few getSimpleVT() were hoisted too far, detected by bootstrap failures. llvm-svn: 170104	2012-12-13 06:34:11 +00:00
Evan Cheng	962711ee71	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch splits the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The latter is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. llvm-svn: 169959	2012-12-12 02:34:41 +00:00
Evan Cheng	c3d1aca657	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloaded term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. llvm-svn: 169954	2012-12-12 01:32:07 +00:00
Evan Cheng	04e5518783	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944	2012-12-12 00:42:09 +00:00
Evan Cheng	eb54240dc2	Replace TargetLowering::isIntImmLegal() with ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined term for something like integer immediate materialization. It is always possible to materialize an integer immediate. Whether to use it for memcpy expansion is more a "cost" concern. llvm-svn: 169929	2012-12-11 23:26:14 +00:00
Patrik Hagglund	e98b7a0389	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	8d2e7cf561	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 169845	2012-12-11 09:57:18 +00:00
Patrik Hagglund	3708e548f8	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper functions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. llvm-svn: 169837	2012-12-11 09:10:33 +00:00
Evan Cheng	c2bd620fac	Stylistic tweak. llvm-svn: 169811	2012-12-11 02:31:57 +00:00
Evan Cheng	79e2ca90bc	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On x86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicate they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 llvm-svn: 169791	2012-12-10 23:21:26 +00:00
Evan Cheng	9ec512d768	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 llvm-svn: 169536	2012-12-06 19:13:27 +00:00
Evan Cheng	5213139f48	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the following before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extends the high bits. With this change it's eliminated. rdar://12771555 llvm-svn: 169459	2012-12-06 01:28:01 +00:00
Matt Beaumont-Gay	50f61b662f	Appease GCC's -Wparentheses. (TIL that Clang's -Wparentheses ignores 'x || y && "foo"' on purpose. Neat.) llvm-svn: 169337	2012-12-04 23:54:02 +00:00
Evan Cheng	b4eae1361c	ARM custom lower ctpop for vector types. Patch by Pete Couperus. llvm-svn: 169325	2012-12-04 22:41:50 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Sebastian Pop	a204f72237	Codegen failure for vmull with small vectors Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that triggered the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extended to a <4 x i16>. llvm-svn: 169024	2012-11-30 19:08:04 +00:00
Silviu Baranga	93aefa5f2c	Added atomic 64 min/max/umin/umax intrinsics support in the ARM backend. llvm-svn: 168886	2012-11-29 14:41:25 +00:00
Benjamin Kramer	b1996da782	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. llvm-svn: 168809	2012-11-28 20:55:10 +00:00
Eli Friedman	30834940ec	Mark FP_EXTEND from v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. llvm-svn: 168240	2012-11-17 01:52:46 +00:00
Weiming Zhao	8f56f88661	Remove hard coded registers in ARM ldrexd and strexd instructions This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. llvm-svn: 168207	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	7d94f3bd7f	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON This fixes PR14359 llvm-svn: 168200	2012-11-16 21:15:20 +00:00
Eli Friedman	e6385e61b5	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. llvm-svn: 168107	2012-11-15 22:44:27 +00:00
Craig Topper	323f614cd1	Revert changing FNEG of v4f32 to Expand. It's legal. llvm-svn: 168030	2012-11-15 08:09:46 +00:00
Craig Topper	bb7060584c	Make FNEG and FABS of v4f32 Expand. llvm-svn: 168029	2012-11-15 08:06:12 +00:00
Craig Topper	61d045781a	Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics. llvm-svn: 168025	2012-11-15 06:51:10 +00:00
Evan Cheng	21b0348199	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 llvm-svn: 167657	2012-11-10 02:09:05 +00:00
Chad Rosier	66bb178eef	Revert r167620; this can be implemented using an existing CL option. llvm-svn: 167622	2012-11-09 18:25:27 +00:00
Chad Rosier	332fc75b2c	Add support for -mstrict-align compiler option for ARM targets. rdar://12340498 llvm-svn: 167620	2012-11-09 17:29:38 +00:00
Chad Rosier	1ec8e404fc	Mark the Int_eh_sjlj_dispatchsetup pseudo instruction as clobbering all registers. Previously, the registers were being marked as implicitly defined, but not killed. In some cases this would cause the register scavenger to spill a dead register. Also, use an empty register mask to simplify the logic and to reduce the memory footprint. rdar://12592448 llvm-svn: 167499	2012-11-06 23:05:24 +00:00
Quentin Colombet	8e1fe84c3c	Vext Lowering was missing opportunities llvm-svn: 167318	2012-11-02 21:32:17 +00:00
Quentin Colombet	5799e9f66c	Change ForceSizeOpt attribute into MinSize attribute llvm-svn: 167020	2012-10-30 16:32:52 +00:00
Quentin Colombet	3ee56a3bf5	[code size][ARM] Emit regular call instructions instead of the move, branch sequence llvm-svn: 166854	2012-10-27 01:10:17 +00:00
Stepan Dyatkovskiy	dab8043048	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. llvm-svn: 166273	2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy	e59a920b0c	Issue: The stack is formed improperly for long structures passed as byval arguments in EABI mode. The AAPCS reference contains the following statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have a structure with doubles (9 double fields) and 3 unused core registers (r1, r2, r3), the caller should use the r2 and r3 registers only. Currently the r1, r2, r3 set is used, which is invalid. The callee VA routine should also use the r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up the SP address with ADD+BFC operations. Fix: The main fix is in ARMTargetLowering::HandleByVal. If we detect AAPCS mode and 8-byte alignment, we waste the odd registers. P.S.: I also improved the LDRB_POST_IMM regression test, since the ldrb instruction will no longer be generated by the current regression test after this patch. llvm-svn: 166018	2012-10-16 07:16:47 +00:00
Silviu Baranga	b14097000b	Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was because the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929	2012-10-15 09:41:32 +00:00
Manman Ren	7e48b252e7	ARM: tail-call inside a function where part of a byval argument is on caller's local frame causes problem. For example: void f(StructToPass s) { g(&s, sizeof(s)); } will cause problem with tail-call since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted since f's local frame is popped out before the tail-call. The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach, if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform tail-call. rdar://12442472 llvm-svn: 165853	2012-10-12 23:39:43 +00:00
Jim Grosbach	30af442a84	ARM: Mark VSELECT as 'expand'. The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using the Expand is completely correct. http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Patch by Peter Couperus <peter.couperus@st.com>. llvm-svn: 165845	2012-10-12 22:59:21 +00:00
Stepan Dyatkovskiy	283baa0027	Fix for LDRB instruction: SDNode for LDRB_POST_IMM is invalid: the number of registers added to the SDNode is fewer than described in .td. 7 ops are needed, but an SDNode with only 6 is created. In more detail: In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, the offset operand is defined as am2offset_imm. am2offset_imm is a complex parameter type, and it actually consists of a dummy register and the imm itself. As I understand it, the trick with the dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy register was not added to the SDNode, and this caused a crash in the Peephole Optimizer pass. The problem is fixed by setting up an additional dummy reg when emitting the LDRB_POST_IMM instruction. llvm-svn: 165617	2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy	f13dbb8e24	Issue description: ScheduleDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So a load of byval parameters from the stack may be inserted before the corresponding store, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referencing the first "byval" parameter. The commit also adds two new fields to the InputArg structure: the Function's argument index and the InputArg's part offset in bytes relative to the start position of the Function's argument. E.g.: If a function's argument is 128 bits wide and it was split into 32-bit regs, then we get 4 InputArg structs with the same arg index, but different offset values. llvm-svn: 165616	2012-10-10 11:37:36 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Bob Wilson	e8a549cd92	Add LLVM support for Swift. llvm-svn: 164899	2012-09-29 21:43:49 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abbreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
James Molloy	9e98ef1c59	Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM. llvm-svn: 164685	2012-09-26 09:48:32 +00:00
Evan Cheng	90ae8f8442	Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte aligned address. Based on patch by David Peixotto. Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment hints. rdar://12090772, rdar://12238782 llvm-svn: 164089	2012-09-18 01:42:45 +00:00
Silviu Baranga	b47bb94f93	This patch introduces A15 as a target in LLVM. llvm-svn: 163803	2012-09-13 15:05:10 +00:00
Craig Topper	3e41a5bb31	Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458	2012-09-08 04:58:43 +00:00
Jakob Stoklund Olesen	e45e22b20f	Custom DAGCombine for and/or/xor are for all ARMs. The 'select' transformations apply to all ARM architectures and don't require hasV6T2Ops. llvm-svn: 163396	2012-09-07 17:34:15 +00:00
James Molloy	9d30dc2432	Fix self-host; ensure signedness is consistent. llvm-svn: 163306	2012-09-06 10:32:08 +00:00
James Molloy	49bdbce8e1	Improve codegen for BUILD_VECTORs on ARM. If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base. llvm-svn: 163304	2012-09-06 09:55:02 +00:00
Arnold Schwaighofer	f00fb1c581	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136	2012-09-04 14:37:49 +00:00
Jakob Stoklund Olesen	d3bda3c5b9	Fix a couple of typos in EmitAtomic. Thumb2 instructions are mostly constrained to rGPR, not tGPR which is for Thumb1. rdar://problem/12203728 llvm-svn: 162968	2012-08-31 02:08:34 +00:00
Jakob Stoklund Olesen	710093e360	Use a SmallPtrSet to dedup successors in EmitSjLjDispatchBlock. The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating duplicate successor list entries. llvm-svn: 162222	2012-08-20 20:52:03 +00:00
Jakob Stoklund Olesen	e1014e7b98	Remove the CAND/COR/CXOR custom ISD nodes and their select code. These nodes are no longer needed because the peephole pass can fold CMOV+AND into ANDCC etc. llvm-svn: 162179	2012-08-18 21:49:50 +00:00
Jakob Stoklund Olesen	dded061f85	Also combine zext/sext into selects for ARM. This turns common i1 patterns into predicated instructions: (add (zext cc), x) -> (select cc (add x, 1), x) (add (sext cc), x) -> (select cc (add x, -1), x) For a function like: unsigned f(unsigned s, int x) { return s + (x>0); } We now produce: cmp r1, #0 it gt addgt.w r0, r0, #1 Instead of: movs r2, #0 cmp r1, #0 it gt movgt r2, #1 add r0, r2 llvm-svn: 162177	2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen	aab43dbfbb	Also pass logical ops to combineSelectAndUse. Add these transformations to the existing add/sub ones: (and (select cc, -1, c), x) -> (select cc, x, (and, x, c)) (or (select cc, 0, c), x) -> (select cc, x, (or, x, c)) (xor (select cc, 0, c), x) -> (select cc, x, (xor, x, c)) The selects can then be transformed to a single predicated instruction by peephole. This transformation will make it possible to eliminate the ISD::CAND, COR, and CXOR custom DAG nodes. llvm-svn: 162176	2012-08-18 21:25:16 +00:00
Jakob Stoklund Olesen	c1dee482c8	Add comment, clean up code. No functional change. llvm-svn: 162107	2012-08-17 16:59:09 +00:00
Jakob Stoklund Olesen	c19bf0282d	Handle ARM MOVCC optimization in PeepholeOptimizer. Use the target independent select analysis hooks. llvm-svn: 162060	2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen	6cb96120f1	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Evan Cheng	eec6bc6270	Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029 llvm-svn: 161962	2012-08-15 17:44:53 +00:00
Nadav Rotem	3a94c545cf	Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user. rdar://11876519 llvm-svn: 161775	2012-08-13 18:52:44 +00:00
Arnold Schwaighofer	b73da9453c	Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736	2012-08-12 05:11:56 +00:00
Craig Topper	4fa625fda7	Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed. llvm-svn: 161735	2012-08-12 03:16:37 +00:00
Arnold Schwaighofer	81b2eec1ab	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Bob Wilson	3e6fa462f3	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Eric Christopher	b3322364e4	Add support for the ARM GHC calling convention, this patch was in 3.0, but somehow managed to be dropped later. Patch by Karel Gardas. llvm-svn: 161226	2012-08-03 00:05:53 +00:00
Jim Grosbach	6df755cc4e	ARM: Don't assume an SDNode is a constant. Before accessing a node as a ConstantSDNode, make sure it actually is one. No testcase of non-trivial size. rdar://11948669 llvm-svn: 160735	2012-07-25 17:02:47 +00:00
Andrew Trick	a22cdb713b	Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings. Based on Evan's suggestion without a committable test. llvm-svn: 160441	2012-07-18 18:34:27 +00:00
Andrew Trick	bc325168c3	whitespace llvm-svn: 160440	2012-07-18 18:34:24 +00:00
Manman Ren	6e1fd46fdf	ARM: use NEON loads and stores if possible when handling struct byval. This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684	2012-06-18 22:23:48 +00:00
Manman Ren	e0763c7472	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Bill Wendling	4b79647a6e	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Manman Ren	e873552091	ARM: properly handle alignment for struct byval. Factor out the expansion code into a function. This change is to be enabled in clang. rdar://9877866 llvm-svn: 157830	2012-06-01 19:33:18 +00:00
Manman Ren	9f9111651e	ARM: support struct byval in llvm We handle struct byval by inserting a pseudo op, which will be expanded to a loop at ExpandISelPseudos. A separate patch for clang will be submitted to enable struct byval. rdar://9877866 llvm-svn: 157793	2012-06-01 02:44:42 +00:00
Justin Holewinski	aa58397b3c	Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall to pass around a struct instead of a large set of individual values. This cleans up the interface and allows more information to be added to the struct for future targets without requiring changes to each and every target. NV_CONTRIB llvm-svn: 157479	2012-05-25 16:35:28 +00:00
Jakob Stoklund Olesen	691ae3388f	Use the right register class for LDRrs. llvm-svn: 157152	2012-05-20 06:38:47 +00:00
Benjamin Kramer	e31f31e5c0	Add a new target hook "predictableSelectIsExpensive". This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted. Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM. I'm not entirely happy with the name of this flag, suggestions welcome ;) llvm-svn: 156233	2012-05-05 12:49:14 +00:00
Matt Beaumont-Gay	e82ab6baa7	Pacify GCC's -Wreturn-type llvm-svn: 156189	2012-05-04 18:34:27 +00:00
Hans Wennborg	aea412008e	Make ARM and Mips use TargetMachine::getTLSModel() This moves the logic for selecting a TLS model to a single place, instead of the previous three (ARM, Mips, and X86 which already uses this function). llvm-svn: 156162	2012-05-04 09:40:39 +00:00
Bob Wilson	9245c93656	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. llvm-svn: 155824	2012-04-30 16:53:34 +00:00
Craig Topper	c7242e054d	Convert more uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent. llvm-svn: 155188	2012-04-20 07:30:17 +00:00
Evan Cheng	d0007f3c83	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Evan Cheng	f8bad08001	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Chad Rosier	e0e38f61a5	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather than a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Chad Rosier	99cbde9e82	Update comments and remove unnecessary isVolatile() check. llvm-svn: 154336	2012-04-09 19:38:15 +00:00
Jim Grosbach	0c509fa6bf	Tidy up. 80 columns. llvm-svn: 154226	2012-04-06 23:43:50 +00:00
Chandler Carruth	8a102c21e3	There is no portable std::abs overload for int64_t, use the llvm::abs64 which exists for this purpose. llvm-svn: 154199	2012-04-06 20:10:52 +00:00
Jakob Stoklund Olesen	967b86a0a2	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Evan Cheng	a40d40602c	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Lang Hames	591cdaf2ee	Try using vmov.i32 to materialize FP32 constants that can't be materialized by vmov.f32. llvm-svn: 153696	2012-03-29 21:56:11 +00:00
Craig Topper	f6e7e12f75	Remove unnecessary llvm:: qualifications llvm-svn: 153500	2012-03-27 07:21:54 +00:00
Craig Topper	5fa0caafc0	Prune includes and replace uses of ARMRegisterInfo.h with ARMBaseRegisterInfo.h llvm-svn: 153422	2012-03-26 00:45:15 +00:00
Craig Topper	07720d8dcd	Replace uses of ARMBaseInstrInfo and ARMTargetMachine with the Base versions. llvm-svn: 153421	2012-03-25 23:49:58 +00:00
Anton Korobeynikov	3edd854d64	Perform mul combine when multiplying with negative constants. Patch by Weiming Zhao! This fixes PR12212 llvm-svn: 153049	2012-03-19 19:19:50 +00:00
Craig Topper	188ed9d56e	Reorder includes to match coding standards. Fix an issue or two exposed by that. llvm-svn: 152978	2012-03-17 07:33:42 +00:00
Lang Hames	c35ee8b54a	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
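
The select combines described in the entries for llvm-svn 162176 and 162177 above rest on a handful of bitwise identities. The following is a minimal sketch that checks those identities on 32-bit values. It is not LLVM code: the helper names (`select`, `zext`, `sext`, `check_identities`, `verify_samples`) are hypothetical and exist only for this illustration, and the 32-bit width is an assumption.

```python
MASK = 0xFFFFFFFF  # assumption: model values as 32-bit registers


def select(cc, a, b):
    """Model of a select node: a if cc is true, else b."""
    return a if cc else b


def zext(cc):
    """Zero-extend an i1 condition: true -> 1, false -> 0."""
    return 1 if cc else 0


def sext(cc):
    """Sign-extend an i1 condition: true -> all ones (-1), false -> 0."""
    return MASK if cc else 0


def check_identities(cc, x, c):
    """Return True if all five rewrites from the commit messages hold."""
    results = [
        # (add (zext cc), x) -> (select cc (add x, 1), x)
        ((zext(cc) + x) & MASK) == select(cc, (x + 1) & MASK, x),
        # (add (sext cc), x) -> (select cc (add x, -1), x)
        ((sext(cc) + x) & MASK) == select(cc, (x - 1) & MASK, x),
        # (and (select cc, -1, c), x) -> (select cc, x, (and x, c))
        (select(cc, MASK, c) & x) == select(cc, x, x & c),
        # (or (select cc, 0, c), x) -> (select cc, x, (or x, c))
        (select(cc, 0, c) | x) == select(cc, x, x | c),
        # (xor (select cc, 0, c), x) -> (select cc, x, (xor x, c))
        (select(cc, 0, c) ^ x) == select(cc, x, x ^ c),
    ]
    return all(results)


def verify_samples():
    """Check the identities on a few boundary and arbitrary 32-bit values."""
    samples = [0, 1, 2, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFF, 0x12345678]
    return all(
        check_identities(cc, x, c)
        for cc in (False, True)
        for x in samples
        for c in samples
    )


if __name__ == "__main__":
    print(verify_samples())
```

Each identity leaves `x` unchanged on one side of the select, which is why the select can afterwards be folded into a single predicated instruction, as the commit messages describe.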


934 Commits