llvm-project

Commit Graph

Author	SHA1	Message	Date
Patrik Hagglund	5e6c361bc0	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. This is the second attempt. In the first attempt (r169837), a few getSimpleVT() were hoisted too far, detected by bootstrap failures. llvm-svn: 170104	2012-12-13 06:34:11 +00:00
Evan Cheng	962711ee71	Sorry about the churn. One more change to getOptimalMemOpType() hook. Did I mention the inline memcpy / memset expansion code is a mess? This patch split the ZeroOrLdSrc argument into two: IsMemset and ZeroMemset. The first indicates whether it is expanding a memset or a memcpy / memmove. The later is whether the memset is a memset of zero. It's totally possible (likely even) that targets may want to do different things for memcpy and memset of zero. llvm-svn: 169959	2012-12-12 02:34:41 +00:00
Evan Cheng	c3d1aca657	- Rename isLegalMemOpType to isSafeMemOpType. "Legal" is a very overloade term. Also added more comments to explain why it is generally ok to return true. - Rename getOptimalMemOpType argument IsZeroVal to ZeroOrLdSrc. It's meant to be true for loaded source (memcpy) or zero constants (memset). The poor name choice is probably some kind of legacy issue. llvm-svn: 169954	2012-12-12 01:32:07 +00:00
Evan Cheng	04e5518783	Avoid using lossy load / stores for memcpy / memset expansion. e.g. f64 load / store on non-SSE2 x86 targets. llvm-svn: 169944	2012-12-12 00:42:09 +00:00
Evan Cheng	eb54240dc2	Replace TargetLowering::isIntImmLegal() with ScalarTargetTransformInfo::getIntImmCost() instead. "Legal" is a poorly defined term for something like integer immediate materialization. It is always possible to materialize an integer immediate. Whether to use it for memcpy expansion is more a "cost" conceern. llvm-svn: 169929	2012-12-11 23:26:14 +00:00
Patrik Hagglund	e98b7a0389	Revert EVT->MVT changes, r169836-169851, due to buildbot failures. llvm-svn: 169854	2012-12-11 11:14:33 +00:00
Patrik Hagglund	8d2e7cf561	Change TargetLowering::findRepresentativeClass to take an MVT, instead of EVT. llvm-svn: 169845	2012-12-11 09:57:18 +00:00
Patrik Hagglund	3708e548f8	Change TargetLowering::getRegClassFor to take an MVT, instead of EVT. Accordingly, add helper funtions getSimpleValueType (in parallel to getValueType) in SDValue, SDNode, and TargetLowering. This is the first, in a series of patches. llvm-svn: 169837	2012-12-11 09:10:33 +00:00
Evan Cheng	c2bd620fac	Stylistic tweak. llvm-svn: 169811	2012-12-11 02:31:57 +00:00
Evan Cheng	79e2ca90bc	Some enhancements for memcpy / memset inline expansion. 1. Teach it to use overlapping unaligned load / store to copy / set the trailing bytes. e.g. On 86, use two pairs of movups / movaps for 17 - 31 byte copies. 2. Use f64 for memcpy / memset on targets where i64 is not legal but f64 is. e.g. x86 and ARM. 3. When memcpy from a constant string, do not replace the load with a constant if it's not possible to materialize an integer immediate with a single instruction (required a new target hook: TLI.isIntImmLegal()). 4. Use unaligned load / stores more aggressively if target hooks indicates they are "fast". 5. Update ARM target hooks to use unaligned load / stores. e.g. vld1.8 / vst1.8. Also increase the threshold to something reasonable (8 for memset, 4 pairs for memcpy). This significantly improves Dhrystone, up to 50% on ARM iOS devices. rdar://12760078 llvm-svn: 169791	2012-12-10 23:21:26 +00:00
Evan Cheng	9ec512d768	Replace r169459 with something safer. Rather than having computeMaskedBits to understand target implementation of any_extend / extload, just generate zero_extend in place of any_extend for liveouts when the target knows the zero_extend will be implicit (e.g. ARM ldrb / ldrh) or folded (e.g. x86 movz). rdar://12771555 llvm-svn: 169536	2012-12-06 19:13:27 +00:00
Evan Cheng	5213139f48	Let targets provide hooks that compute known zero and ones for any_extend and extload's. If they are implemented as zero-extend, or implicitly zero-extend, then this can enable more demanded bits optimizations. e.g. define void @foo(i16* %ptr, i32 %a) nounwind { entry: %tmp1 = icmp ult i32 %a, 100 br i1 %tmp1, label %bb1, label %bb2 bb1: %tmp2 = load i16* %ptr, align 2 br label %bb2 bb2: %tmp3 = phi i16 [ 0, %entry ], [ %tmp2, %bb1 ] %cmp = icmp ult i16 %tmp3, 24 br i1 %cmp, label %bb3, label %exit bb3: call void @bar() nounwind br label %exit exit: ret void } This compiles to the followings before: push {lr} mov r2, #0 cmp r1, #99 bhi LBB0_2 @ BB#1: @ %bb1 ldrh r2, [r0] LBB0_2: @ %bb2 uxth r0, r2 cmp r0, #23 bhi LBB0_4 @ BB#3: @ %bb3 bl _bar LBB0_4: @ %exit pop {lr} bx lr The uxth is not needed since ldrh implicitly zero-extend the high bits. With this change it's eliminated. rdar://12771555 llvm-svn: 169459	2012-12-06 01:28:01 +00:00
Matt Beaumont-Gay	50f61b662f	Appease GCC's -Wparentheses. (TIL that Clang's -Wparentheses ignores 'x \|\| y && "foo"' on purpose. Neat.) llvm-svn: 169337	2012-12-04 23:54:02 +00:00
Evan Cheng	b4eae1361c	ARM custom lower ctpop for vector types. Patch by Pete Couperus. llvm-svn: 169325	2012-12-04 22:41:50 +00:00
Chandler Carruth	ed0881b2a6	Use the new script to sort the includes of every file under lib. Sooooo many of these had incorrect or strange main module includes. I have manually inspected all of these, and fixed the main module include to be the nearest plausible thing I could find. If you own or care about any of these source files, I encourage you to take some time and check that these edits were sensible. I can't have broken anything (I strictly added headers, and reordered them, never removed), but they may not be the headers you'd really like to identify as containing the API being implemented. Many forward declarations and missing includes were added to a header files to allow them to parse cleanly when included first. The main module rule does in fact have its merits. =] llvm-svn: 169131	2012-12-03 16:50:05 +00:00
Sebastian Pop	a204f72237	Codegen failure for vmull with small vectors Codegen was failing with an assertion because of unexpected vector operands when legalizing the selection DAG for a MUL instruction. The asserting code was legalizing multiplies for vectors of size 128 bits. It uses a custom lowering to try and detect cases where it can use a VMULL instruction instead of a VMOVL + VMUL. The code was looking for input operands to the MUL that had been sign or zero extended. If it found the extended operands it would drop the sign/zero extension and use the original vector size as input to a VMULL instruction. The code assumed that the original input vector was 64 bits so that after dropping the extension it would fit directly into a D register and could be used as an operand of a VMULL instruction. The input code that trigger the failure used a vector of <4 x i8> that was sign extended to <4 x i32>. It was not safe to drop the sign extension in this case because the original vector is only 32 bits wide. The fix is to insert a sign extension for the vector to reach the required 64 bit size. In this particular example, the vector would need to be sign extented to a <4 x i16>. llvm-svn: 169024	2012-11-30 19:08:04 +00:00
Silviu Baranga	93aefa5f2c	Added atomic 64 min/max/umin/umax instrinsics support in the ARM backend. llvm-svn: 168886	2012-11-29 14:41:25 +00:00
Benjamin Kramer	b1996da782	ARM: Implement CanLowerReturn so large vectors get expanded into sret. Fixes 14337. llvm-svn: 168809	2012-11-28 20:55:10 +00:00
Eli Friedman	30834940ec	Mark FP_EXTEND form v2f32 to v2f64 as "expand" for ARM NEON. Patch by Pete Couperus. llvm-svn: 168240	2012-11-17 01:52:46 +00:00
Weiming Zhao	8f56f88661	Remove hard coded registers in ARM ldrexd and strexd instructions This patch replaces the hard coded GPR pair [R0, R1] of Intrinsic:arm_ldrexd and [R2, R3] of Intrinsic:arm_strexd with even/odd GPRPair reg class. Similar to the lowering of atomic_64 operation. llvm-svn: 168207	2012-11-16 21:55:34 +00:00
Anton Korobeynikov	7d94f3bd7f	Make sure FABS on v2f32 and v4f32 is legal on ARM NEON This fixes PR14359 llvm-svn: 168200	2012-11-16 21:15:20 +00:00
Eli Friedman	e6385e61b5	Mark FP_ROUND for converting NEON v2f64 to v2f32 as expand. Add a missing case to vector legalization so this actually works. Patch by Pete Couperus. Fixes PR12540. llvm-svn: 168107	2012-11-15 22:44:27 +00:00
Craig Topper	323f614cd1	Revert changing FNEG of v4f32 to Expand. It's legal. llvm-svn: 168030	2012-11-15 08:09:46 +00:00
Craig Topper	bb7060584c	Make FNEG and FABS of v4f32 Expand. llvm-svn: 168029	2012-11-15 08:06:12 +00:00
Craig Topper	61d045781a	Add llvm.ceil, llvm.trunc, llvm.rint, llvm.nearbyint intrinsics. llvm-svn: 168025	2012-11-15 06:51:10 +00:00
Evan Cheng	21b0348199	Disable the Thumb no-return call optimization: mov lr, pc b.w _foo The "mov" instruction doesn't set bit zero to one, it's putting incorrect value in lr. It messes up backtraces. rdar://12663632 llvm-svn: 167657	2012-11-10 02:09:05 +00:00
Chad Rosier	66bb178eef	Revert r167620; this can be implemented using an existing CL option. llvm-svn: 167622	2012-11-09 18:25:27 +00:00
Chad Rosier	332fc75b2c	Add support for -mstrict-align compiler option for ARM targets. rdar://12340498 llvm-svn: 167620	2012-11-09 17:29:38 +00:00
Chad Rosier	1ec8e404fc	Mark the Int_eh_sjlj_dispatchsetup pseudo instruction as clobbering all registers. Previously, the register we being marked as implicitly defined, but not killed. In some cases this would cause the register scavenger to spill a dead register. Also, use an empty register mask to simplify the logic and to reduce the memory footprint. rdar://12592448 llvm-svn: 167499	2012-11-06 23:05:24 +00:00
Quentin Colombet	8e1fe84c3c	Vext Lowering was missing opportunities llvm-svn: 167318	2012-11-02 21:32:17 +00:00
Quentin Colombet	5799e9f66c	Change ForceSizeOpt attribute into MinSize attribute llvm-svn: 167020	2012-10-30 16:32:52 +00:00
Quentin Colombet	3ee56a3bf5	[code size][ARM] Emit regular call instructions instead of the move, branch sequence llvm-svn: 166854	2012-10-27 01:10:17 +00:00
Stepan Dyatkovskiy	dab8043048	ARM: Removed extra stack frame object for fixed byval arguments, VarArgsStyleRegisters invocation was reworked due to some improper usage in past. PR14099 also demonstrates it. llvm-svn: 166273	2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy	e59a920b0c	Issue: Stack is formed improperly for long structures passed as byval arguments for EABI mode. If we took AAPCS reference, we can found the next statements: A: "If the argument requires double-word alignment (8-byte), the NCRN (Next Core Register Number) is rounded up to the next even register number." (5.5 Parameter Passing, Stage C, C.3). B: "The alignment of an aggregate shall be the alignment of its most-aligned component." (4.3 Composite Types, 4.3.1 Aggregates). So if we have structure with doubles (9 double fields) and 3 Core unused registers (r1, r2, r3): caller should use r2 and r3 registers only. Currently r1,r2,r3 set is used, but it is invalid. Callee VA routine should also use r2 and r3 regs only. All is ok here. This behaviour is guessed by rounding up SP address with ADD+BFC operations. Fix: Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and 8 byte alignment, we waste odd registers then. P.S.: I also improved LDRB_POST_IMM regression test. Since ldrb instruction will not generated by current regression test after this patch. llvm-svn: 166018	2012-10-16 07:16:47 +00:00
Silviu Baranga	b14097000b	Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE. llvm-svn: 165929	2012-10-15 09:41:32 +00:00
Manman Ren	7e48b252e7	ARM: tail-call inside a function where part of a byval argument is on caller's local frame causes problem. For example: void f(StructToPass s) { g(&s, sizeof(s)); } will cause problem with tail-call since part of s is passed via registers and saved in f's local frame. When g tries to access s, part of s may be corrupted since f's local frame is popped out before the tail-call. The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for the caller. This is a conservative approach, if we can prove the address of s or part of s is not taken and passed to g, it should be okay to perform tail-call. rdar://12442472 llvm-svn: 165853	2012-10-12 23:39:43 +00:00
Jim Grosbach	30af442a84	ARM: Mark VSELECT as 'expand'. The backend already pattern matches to form VBSL when it can. We may want to teach it to use the vbsl intrinsics at some point to prevent machine licm from mucking with this, but using the Expand is completely correct. http://llvm.org/bugs/show_bug.cgi?id=13831 http://llvm.org/bugs/show_bug.cgi?id=13961 Patch by Peter Couperus <peter.couperus@st.com>. llvm-svn: 165845	2012-10-12 22:59:21 +00:00
Stepan Dyatkovskiy	283baa0027	Fix for LDRB instruction: SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer that described in .td. 7 ops is needed, but SDNode with only 6 is created. In more details: In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset operand is defined as am2offset_imm. am2offset_imm is complex parameter type, and actually it consists from dummy register and imm itself. As I understood trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy register was not added to SDNode, and it cause crash in Peephole Optimizer pass. The problem fixed by setting up additional dummy reg when emitting LDRB_POST_IMM instruction. llvm-svn: 165617	2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy	f13dbb8e24	Issue description: SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack objects and byval parameters. So loading byval parameters from stack may be inserted before it will be stored, since these operations are treated as independent. Fix: Currently ARMTargetLowering::LowerFormalArguments saves byval registers with FixedStack MachinePointerInfo. To fix the problem we need to store byval registers with MachinePointerInfo referenced to first the "byval" parameter. Also commit adds two new fields to the InputArg structure: Function's argument index and InputArg's part offset in bytes relative to the start position of Function's argument. E.g.: If function's argument is 128 bit width and it was splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index, but different offset values. llvm-svn: 165616	2012-10-10 11:37:36 +00:00
Bill Wendling	c9b22d735a	Create enums for the different attributes. We use the enums to query whether an Attributes object has that attribute. The opaque layer is responsible for knowing where that specific attribute is stored. llvm-svn: 165488	2012-10-09 07:45:08 +00:00
Micah Villmow	cdfe20b97f	Move TargetData to DataLayout. llvm-svn: 165402	2012-10-08 16:38:25 +00:00
Bob Wilson	e8a549cd92	Add LLVM support for Swift. llvm-svn: 164899	2012-09-29 21:43:49 +00:00
Sylvestre Ledru	91ce36c986	Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767 llvm-svn: 164768	2012-09-27 10:14:43 +00:00
Sylvestre Ledru	721cffd53a	Fix a typo 'iff' => 'if' llvm-svn: 164767	2012-09-27 09:59:43 +00:00
Bill Wendling	863bab689a	Remove the `hasFnAttr' method from Function. The hasFnAttr method has been replaced by querying the Attributes explicitly. No intended functionality change. llvm-svn: 164725	2012-09-26 21:48:26 +00:00
James Molloy	9e98ef1c59	Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM. llvm-svn: 164685	2012-09-26 09:48:32 +00:00
Evan Cheng	90ae8f8442	Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte aligned address. Based on patch by David Peixotto. Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment hints. rdar://12090772, rdar://12238782 llvm-svn: 164089	2012-09-18 01:42:45 +00:00
Silviu Baranga	b47bb94f93	This patch introduces A15 as a target in LLVM. llvm-svn: 163803	2012-09-13 15:05:10 +00:00
Craig Topper	3e41a5bb31	Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct. llvm-svn: 163458	2012-09-08 04:58:43 +00:00
Jakob Stoklund Olesen	e45e22b20f	Custom DAGCombine for and/or/xor are for all ARMs. The 'select' transformations apply to all ARM architectures and don't require hasV6T2Ops. llvm-svn: 163396	2012-09-07 17:34:15 +00:00
James Molloy	9d30dc2432	Fix self-host; ensure signedness is consistent. llvm-svn: 163306	2012-09-06 10:32:08 +00:00
James Molloy	49bdbce8e1	Improve codegen for BUILD_VECTORs on ARM. If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base. llvm-svn: 163304	2012-09-06 09:55:02 +00:00
Arnold Schwaighofer	f00fb1c581	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 163136	2012-09-04 14:37:49 +00:00
Jakob Stoklund Olesen	d3bda3c5b9	Fix a couple of typos in EmitAtomic. Thumb2 instructions are mostly constrained to rGPR, not tGPR which is for Thumb1. rdar://problem/12203728 llvm-svn: 162968	2012-08-31 02:08:34 +00:00
Jakob Stoklund Olesen	710093e360	Use a SmallPtrSet to dedup successors in EmitSjLjDispatchBlock. The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating duplicate successor list entries. llvm-svn: 162222	2012-08-20 20:52:03 +00:00
Jakob Stoklund Olesen	e1014e7b98	Remove the CAND/COR/CXOR custom ISD nodes and their select code. These nodes are no longer needed because the peephole pass can fold CMOV+AND into ANDCC etc. llvm-svn: 162179	2012-08-18 21:49:50 +00:00
Jakob Stoklund Olesen	dded061f85	Also combine zext/sext into selects for ARM. This turns common i1 patterns into predicated instructions: (add (zext cc), x) -> (select cc (add x, 1), x) (add (sext cc), x) -> (select cc (add x, -1), x) For a function like: unsigned f(unsigned s, int x) { return s + (x>0); } We now produce: cmp r1, #0 it gt addgt.w r0, r0, #1 Instead of: movs r2, #0 cmp r1, #0 it gt movgt r2, #1 add r0, r2 llvm-svn: 162177	2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen	aab43dbfbb	Also pass logical ops to combineSelectAndUse. Add these transformations to the existing add/sub ones: (and (select cc, -1, c), x) -> (select cc, x, (and, x, c)) (or (select cc, 0, c), x) -> (select cc, x, (or, x, c)) (xor (select cc, 0, c), x) -> (select cc, x, (xor, x, c)) The selects can then be transformed to a single predicated instruction by peephole. This transformation will make it possible to eliminate the ISD::CAND, COR, and CXOR custom DAG nodes. llvm-svn: 162176	2012-08-18 21:25:16 +00:00
Jakob Stoklund Olesen	c1dee482c8	Add comment, clean up code. No functional change. llvm-svn: 162107	2012-08-17 16:59:09 +00:00
Jakob Stoklund Olesen	c19bf0282d	Handle ARM MOVCC optimization in PeepholeOptimizer. Use the target independent select analysis hooks. llvm-svn: 162060	2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen	6cb96120f1	Fold predicable instructions into MOVCC / t2MOVCC. The ARM select instructions are just predicated moves. If the select is the only use of an operand, the instruction defining the operand can be predicated instead, saving one instruction and decreasing register pressure. This implementation can turn AND/ORR/EOR instructions into their corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to predicate any instruction, but we don't yet support predicated instructions in SSA form. llvm-svn: 161994	2012-08-15 22:16:39 +00:00
Evan Cheng	eec6bc6270	Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029 llvm-svn: 161962	2012-08-15 17:44:53 +00:00
Nadav Rotem	3a94c545cf	Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user. rdar://11876519 llvm-svn: 161775	2012-08-13 18:52:44 +00:00
Arnold Schwaighofer	b73da9453c	Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM architecture It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7 thumb O3. llvm-svn: 161736	2012-08-12 05:11:56 +00:00
Craig Topper	4fa625fda7	Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed. llvm-svn: 161735	2012-08-12 03:16:37 +00:00
Arnold Schwaighofer	81b2eec1ab	Patch to implement UMLAL/SMLAL instructions for the ARM architecture This patch corrects the definition of umlal/smlal instructions and adds support for matching them to the ARM dag combiner. Bug 12213 Patch by Yin Ma! llvm-svn: 161581	2012-08-09 15:25:52 +00:00
Bob Wilson	3e6fa462f3	Fall back to selection DAG isel for calls to builtin functions. Fast isel doesn't currently have support for translating builtin function calls to target instructions. For embedded environments where the library functions are not available, this is a matter of correctness and not just optimization. Most of this patch is just arranging to make the TargetLibraryInfo available in fast isel. <rdar://problem/12008746> llvm-svn: 161232	2012-08-03 04:06:28 +00:00
Eric Christopher	b3322364e4	Add support for the ARM GHC calling convention, this patch was in 3.0, but somehow managed to be dropped later. Patch by Karel Gardas. llvm-svn: 161226	2012-08-03 00:05:53 +00:00
Jim Grosbach	6df755cc4e	ARM: Don't assume an SDNode is a constant. Before accessing a node as a ConstandSDNode, make sure it actually is one. No testcase of non-trivial size. rdar://11948669 llvm-svn: 160735	2012-07-25 17:02:47 +00:00
Andrew Trick	a22cdb713b	Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings. Based on Evan's suggestion without a commitable test. llvm-svn: 160441	2012-07-18 18:34:27 +00:00
Andrew Trick	bc325168c3	whitespace llvm-svn: 160440	2012-07-18 18:34:24 +00:00
Manman Ren	6e1fd46fdf	ARM: use NOEN loads and stores if possible when handling struct byval. This change is to be enabled in clang. rdar://9877866 llvm-svn: 158684	2012-06-18 22:23:48 +00:00
Manman Ren	e0763c7472	ARM: optimization for sub+abs. This patch will optimize abs(x-y) FROM sub, movs, rsbmi TO subs, rsbmi For abs, we will use cmp instead of movs. This is necessary because we already have an existing peephole pass which optimizes away cmp following sub. rdar: 11633193 llvm-svn: 158551	2012-06-15 21:32:12 +00:00
Bill Wendling	4b79647a6e	Re-enable the CMN instruction. We turned off the CMN instruction because it had semantics which we weren't getting correct. If we are comparing with an immediate, then it's okay to use the CMN instruction. <rdar://problem/7569620> llvm-svn: 158302	2012-06-11 08:07:26 +00:00
Manman Ren	e873552091	ARM: properly handle alignment for struct byval. Factor out the expansion code into a function. This change is to be enabled in clang. rdar://9877866 llvm-svn: 157830	2012-06-01 19:33:18 +00:00
Manman Ren	9f9111651e	ARM: support struct byval in llvm We handle struct byval by inserting a pseudo op, which will be expanded to a loop at ExpandISelPseudos. A separate patch for clang will be submitted to enable struct byval. rdar://9877866 llvm-svn: 157793	2012-06-01 02:44:42 +00:00
Justin Holewinski	aa58397b3c	Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall to pass around a struct instead of a large set of individual values. This cleans up the interface and allows more information to be added to the struct for future targets without requiring changes to each and every target. NV_CONTRIB llvm-svn: 157479	2012-05-25 16:35:28 +00:00
Jakob Stoklund Olesen	691ae3388f	Use the right register class for LDRrs. llvm-svn: 157152	2012-05-20 06:38:47 +00:00
Benjamin Kramer	e31f31e5c0	Add a new target hook "predictableSelectIsExpensive". This will be used to determine whether it's profitable to turn a select into a branch when the branch is likely to be predicted. Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM. I'm not entirely happy with the name of this flag, suggestions welcome ;) llvm-svn: 156233	2012-05-05 12:49:14 +00:00
Matt Beaumont-Gay	e82ab6baa7	Pacify GCC's -Wreturn-type llvm-svn: 156189	2012-05-04 18:34:27 +00:00
Hans Wennborg	aea412008e	Make ARM and Mips use TargetMachine::getTLSModel() This moves the logic for selecting a TLS model to a single place, instead of the previous three (ARM, Mips, and X86 which already uses this function). llvm-svn: 156162	2012-05-04 09:40:39 +00:00
Bob Wilson	9245c93656	Don't introduce illegal types when creating vmull operations. <rdar://11324364> ARM BUILD_VECTORs created after type legalization cannot use i8 or i16 operands, since those types are not legal. Instead use i32 operands, which will be implicitly truncated by the BUILD_VECTOR to match the element type. llvm-svn: 155824	2012-04-30 16:53:34 +00:00
Craig Topper	c7242e054d	Convert more uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent. llvm-svn: 155188	2012-04-20 07:30:17 +00:00
Evan Cheng	d0007f3c83	Handle llvm.fma.* intrinsics. rdar://10914096 llvm-svn: 154439	2012-04-10 21:40:28 +00:00
Evan Cheng	f8bad08001	Fix a long standing tail call optimization bug. When a libcall is emitted legalizer always use the DAG entry node. This is wrong when the libcall is emitted as a tail call since it effectively folds the return node. If the return node's input chain is not the entry (i.e. call, load, or store) use that as the tail call input chain. PR12419 rdar://9770785 rdar://11195178 llvm-svn: 154370	2012-04-10 01:51:00 +00:00
Chad Rosier	e0e38f61a5	When performing a truncating store, it's possible to rearrange the data in-register, such that we can use a single vector store rather then a series of scalar stores. For func_4_8 the generated code vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vmov.u16 r0, d16[3] strb r0, [r2, #3] vmov.u16 r0, d16[2] strb r0, [r2, #2] vmov.u16 r0, d16[1] strb r0, [r2, #1] vmov.u16 r0, d16[0] strb r0, [r2] bx lr becomes vldr d16, LCPI0_0 vmov d17, r0, r1 vadd.i16 d16, d17, d16 vuzp.8 d16, d17 vst1.32 {d16[0]}, [r2, :32] bx lr I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll, but I couldn't think of a way to judiciously apply this combine. This ldrh r0, [r0, #4] strh r0, [r1] becomes vldr d16, [r0] vmov.u16 r0, d16[2] vmov.32 d16[0], r0 vuzp.16 d16, d17 vst1.32 {d16[0]}, [r1, :32] PR11158 rdar://10703339 llvm-svn: 154340	2012-04-09 20:32:02 +00:00
Chad Rosier	99cbde9e82	Update comments and remove unnecessary isVolatile() check. llvm-svn: 154336	2012-04-09 19:38:15 +00:00
Jim Grosbach	0c509fa6bf	Tidy up. 80 columns. llvm-svn: 154226	2012-04-06 23:43:50 +00:00
Chandler Carruth	8a102c21e3	There is no portable std::abs overload for int64_t, use the llvm::abs64 which exists for this purpose. llvm-svn: 154199	2012-04-06 20:10:52 +00:00
Jakob Stoklund Olesen	967b86a0a2	Allow negative immediates in ARM and Thumb2 compares. ARM and Thumb2 mode can use cmn instructions to compare against negative immediates. Thumb1 mode can't. llvm-svn: 154183	2012-04-06 17:45:04 +00:00
Rafael Espindola	ba0a6cabb8	Always compute all the bits in ComputeMaskedBits. This allows us to keep passing reduced masks to SimplifyDemandedBits, but know about all the bits if SimplifyDemandedBits fails. This allows instcombine to simplify cases like the one in the included testcase. llvm-svn: 154011	2012-04-04 12:51:34 +00:00
Evan Cheng	a40d40602c	ARM target should allow codegenprep to duplicate ret instructions to enable tailcall opt. rdar://11140249 llvm-svn: 153717	2012-03-30 01:24:39 +00:00
Lang Hames	591cdaf2ee	Try using vmov.i32 to materialize FP32 constants that can't be materialized by vmov.f32. llvm-svn: 153696	2012-03-29 21:56:11 +00:00
Craig Topper	f6e7e12f75	Remove unnecessary llvm:: qualifications llvm-svn: 153500	2012-03-27 07:21:54 +00:00
Craig Topper	5fa0caafc0	Prune includes and replace uses of ARMRegisterInfo.h with ARMBaeRegisterInfo.h llvm-svn: 153422	2012-03-26 00:45:15 +00:00
Craig Topper	07720d8dcd	Replace uses of ARMBaseInstrInfo and ARMTargetMachine with the Base versions. llvm-svn: 153421	2012-03-25 23:49:58 +00:00
Anton Korobeynikov	3edd854d64	Perform mul combine when multiplying wiht negative constants. Patch by Weiming Zhao! This fixes PR12212 llvm-svn: 153049	2012-03-19 19:19:50 +00:00
Craig Topper	188ed9d56e	Reorder includes to match coding standards. Fix an issue or two exposed by that. llvm-svn: 152978	2012-03-17 07:33:42 +00:00
Lang Hames	c35ee8b54a	Use vmov.f32 to materialize f32 consts on ARM. This relaxes constraints on register allocation by allowing all 32 D-registers to be used. Patch by Cameron Zwarich. llvm-svn: 152824	2012-03-15 18:49:02 +00:00
Craig Topper	bef78fc2ee	Convert more static tables of registers used by calling convention to uint16_t to reduce space. llvm-svn: 152538	2012-03-11 07:57:25 +00:00
Craig Topper	420525ce3b	Use uint16_t to store registers in callee saved register tables to reduce size of static data. llvm-svn: 151996	2012-03-04 03:33:22 +00:00
Evan Cheng	d12af5dc69	Neuter the optimization I implemented with r107852 and r108258 which turn some floating point equality comparisons into integer ones with -ffast-math. The issue is the optimization causes +0.0 != -0.0. Now the optimization is only done when one side is known to be 0.0. The other side's sign bit is masked off for the comparison. rdar://10964603 llvm-svn: 151861	2012-03-01 23:27:13 +00:00
Evan Cheng	65f9d19c4f	Re-commit r151623 with fix. Only issue special no-return calls if it's a direct call. llvm-svn: 151645	2012-02-28 18:51:51 +00:00
Daniel Dunbar	ee7b899343	Revert r151623 "Some ARM implementaions, e.g. A-series, does return stack prediction. ...", it is breaking the Clang build during the Compiler-RT part. llvm-svn: 151630	2012-02-28 15:36:07 +00:00
Evan Cheng	87c7b09d8d	Some ARM implementaions, e.g. A-series, does return stack prediction. That is, the processor keeps a return addresses stack (RAS) which stores the address and the instruction execution state of the instruction after a function-call type branch instruction. Calling a "noreturn" function with normal call instructions (e.g. bl) can corrupt RAS and causes 100% return misprediction so LLVM should use a unconditional branch instead. i.e. mov lr, pc b _foo The "mov lr, pc" is issued in order to get proper backtrace. rdar://8979299 llvm-svn: 151623	2012-02-28 06:42:03 +00:00
Jakob Stoklund Olesen	fa7a53746c	Switch ARM target to register masks. I'll let the buildbots determine the compile time improvements from this change, but 464.h264ref has 5% faster codegen at -O2. This patch does cause some assembly changes. Branch folding can make different decisions about calls with dead return values. CriticalAntiDepBreaker may choose different registers because its liveness tracking is affected. MachineCopyPropagation may sometimes leave a dead copy behind. llvm-svn: 151331	2012-02-24 01:19:29 +00:00
Dan Gohman	d4a77c4682	When emitting a cmp with 0 for a lowered select, mask out the high bits of the value carying the boolean condition, as their contents are undefined. This fixes rdar://10887484. llvm-svn: 151310	2012-02-24 00:09:36 +00:00
Evan Cheng	f258a15bdf	Canonicalize (srl (bswap x), 16) to (rotr (bswap x), 16) if the high 16 bits of x are zero. This optimizes rev + lsr 16 to rev16. rdar://10750814 llvm-svn: 151230	2012-02-23 02:58:19 +00:00
Evan Cheng	e87681cf34	Optimize a couple of common patterns involving conditional moves where the false value is zero. Instead of a cmov + op, issue an conditional op instead. e.g. cmp r9, r4 mov r4, #0 moveq r4, #1 orr lr, lr, r4 should be: cmp r9, r4 orreq lr, lr, #1 That is, optimize (or x, (cmov 0, y, cond)) to (or.cond x, y). Similarly extend this to xor as well as (and x, (cmov -1, y, cond)) => (and.cond x, y). It's possible to extend this to ADD and SUB but I don't think they are common. rdar://8659097 llvm-svn: 151224	2012-02-23 01:19:06 +00:00
Craig Topper	760b134ffa	Make all pointers to TargetRegisterClass const since they are all pointers to static data that should not be modified. llvm-svn: 151134	2012-02-22 05:59:10 +00:00
Evan Cheng	0460ae8d80	Proper support for a bastardized darwin-eabi hybird ABI. llvm-svn: 151083	2012-02-21 20:46:00 +00:00
James Molloy	547d4c0662	Improve generated code for extending loads and some trunc stores on ARM. Teach TargetSelectionDAG about lengthening loads for vector types and set v4i8 as legal. Allow FP_TO_UINT for v4i16 from v4i32. llvm-svn: 150956	2012-02-20 09:24:05 +00:00
Bill Wendling	05d6f2ff1e	Don't reserve the R0 and R1 registers here. We don't use these registers, and marking them as "live-in" into a BB ruins some invariants that the back-end tries to maintain. llvm-svn: 150437	2012-02-13 23:47:16 +00:00
Jason W Kim	c7f4841769	Make valgrind happy. llvm-svn: 150251	2012-02-10 16:07:59 +00:00
Craig Topper	e55c556a24	Convert assert(0) to llvm_unreachable llvm-svn: 149961	2012-02-07 02:50:20 +00:00
Anton Korobeynikov	d0c46550fe	Cleanups for EABI standard functions llvm-svn: 149195	2012-01-29 09:11:50 +00:00
Anton Korobeynikov	1b42e64280	Use base AAPCS for varargs functions even for AAPCS-VFP CC llvm-svn: 149194	2012-01-29 09:06:09 +00:00
David Blaikie	46a9f016c5	More dead code removal (using -Wunreachable-code) llvm-svn: 148578	2012-01-20 21:51:11 +00:00
David Blaikie	5d8e42755c	Refactor variables unused under non-assert builds (& remove two entirely unused variables). llvm-svn: 148230	2012-01-16 05:17:39 +00:00
Benjamin Kramer	339ced4e34	Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen. llvm-svn: 148218	2012-01-15 13:16:05 +00:00
Jakob Stoklund Olesen	083dbdca7f	Match SelectionDAG logic for enabling movt. Darwin doesn't do static, and ELF targets only support static. llvm-svn: 147740	2012-01-07 20:49:15 +00:00
Benjamin Kramer	6898db6269	Remove VectorExtras. This unused helper was written for a type of API that is discouraged now. llvm-svn: 147738	2012-01-07 19:42:13 +00:00
Bob Wilson	1a74de9504	Add variants of the dispatchsetup pseudo for Thumb and !VFP. <rdar://10620138> My change r146949 added register clobbers to the eh_sjlj_dispatchsetup pseudo instruction, but on Thumb1 some of those registers cannot be used. This caused massive failures on the testsuite when compiling for Thumb1. While fixing that, I noticed that the eh_sjlj_setjmp instruction has a "nofp" variant, and I realized that dispatchsetup needs the same thing, so I have added that as well. llvm-svn: 147204	2011-12-22 23:39:48 +00:00
Eli Friedman	c9bf1b1bff	Make check a bit more strict so we don't call ARM_AM::getFP32Imm with a value that isn't a 32-bit value. (This is just to be safe; I don't think this actually causes any issues in practice.) llvm-svn: 146700	2011-12-15 22:56:53 +00:00
Chandler Carruth	637cc6a8aa	Initial CodeGen support for CTTZ/CTLZ where a zero input produces an undefined result. This adds new ISD nodes for the new semantics, selecting them when the LLVM intrinsic indicates that the undef behavior is desired. The new nodes expand trivially to the old nodes, so targets don't actually need to do anything to support these new nodes besides indicating that they should be expanded. I've done this for all the operand types that I could figure out for all the targets. Owners of various targets, please review and let me know if any of these are incorrect. Note that the expand behavior is conservatively correct, and exactly matches LLVM's current behavior with these operations. Ideally this patch will not change behavior in any way. For example the regtest suite finds the exact same instruction sequences coming out of the code generator. That's why there are no new tests here -- all of this is being exercised by the existing test suite. Thanks to Duncan Sands for reviewing the various bits of this patch and helping me get the wrinkles ironed out with expanding for each target. Also thanks to Chris for clarifying through all the discussions that this is indeed the approach he was looking for. That said, there are likely still rough spots. Further review much appreciated. llvm-svn: 146466	2011-12-13 01:56:10 +00:00
Stepan Dyatkovskiy	4683740967	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Third attempt: simplified checks in test for armv7-apple-darwin11. llvm-svn: 146341	2011-12-11 14:35:48 +00:00
Chad Rosier	6641294e3b	Revert r146322 to appease buildbots. Original commit message: Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146328	2011-12-10 19:55:03 +00:00
Stepan Dyatkovskiy	df0b779e9f	Fixed bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). Second attempt. llvm-svn: 146322	2011-12-10 08:42:24 +00:00
Eli Friedman	4e36a934dc	Splats can contain undef's; make sure to handle them correctly. PR11526. llvm-svn: 146299	2011-12-09 23:54:42 +00:00
Daniel Dunbar	c09e4593b2	Revert r146143, "Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2).", it is failing tests. llvm-svn: 146157	2011-12-08 17:32:18 +00:00
Stepan Dyatkovskiy	a4bcf27dae	Fix bug 9905: Failure in code selection for llvm intrinsics sqrt/exp (fix for FSQRT, FSIN, FCOS, FPOWI, FPOW, FLOG, FLOG2, FLOG10, FEXP, FEXP2). llvm-svn: 146143	2011-12-08 07:55:03 +00:00
Evan Cheng	7f8e563a69	Add bundle aware API for querying instruction properties and switch the code generator to it. For non-bundle instructions, these behave exactly the same as the MC layer API. For properties like mayLoad / mayStore, look into the bundle and if any of the bundled instructions has the property it would return true. For properties like isPredicable, only return true if all of the bundled instructions have the property. For properties like canFoldAsLoad, isCompare, conservatively return false for bundles. llvm-svn: 146026	2011-12-07 07:15:52 +00:00
Nick Lewycky	50f02cb21b	Move global variables in TargetMachine into new TargetOptions class. As an API change, now you need a TargetOptions object to create a TargetMachine. Clang patch to follow. One small functionality change in PTX. PTX had commented out the machine verifier parts in their copy of printAndVerify. That now calls the version in LLVMTargetMachine. Users of PTX who need verification disabled should rely on not passing the command-line flag to enable it. llvm-svn: 145714	2011-12-02 22:16:29 +00:00
Benjamin Kramer	7ba71be392	Move code into anonymous namespaces. llvm-svn: 145154	2011-11-26 23:01:57 +00:00
Bob Wilson	f6d1728d8f	Fix ARM SjLj-EH dispatch setup code. <rdar://problem/10444602> The EmitBasePointerRecalculation function has 2 problems, one minor and one fatal. The minor problem is that it inserts the code at the setjmp instead of in the dispatch block. The fatal problem is that at the point where this code runs, we don't know whether there will be a base pointer, so the entire function is a no-op. The base pointer recalculation needs to be handled as it was before, by inserting a pseudo instruction that gets expanded late. Most of the support for the old approach is still here, but it no longer has any connection to the eh_sjlj_dispatchsetup intrinsic. Clean up the parts related to the intrinsic and just generate the pseudo instruction directly. llvm-svn: 144781	2011-11-16 07:11:57 +00:00
Jay Foad	0745e645e0	Remove some unnecessary includes of PseudoSourceValue.h. llvm-svn: 144631	2011-11-15 07:24:32 +00:00
Evan Cheng	7ca4b6eb5c	Add vmov.f32 to materialize f32 immediate splats which cannot be handled by integer variants. rdar://10437054 llvm-svn: 144608	2011-11-15 02:12:34 +00:00
Eli Friedman	c4a001478c	Make sure to expand SIGN_EXTEND_INREG for NEON vectors. PR11319, round 3. llvm-svn: 144361	2011-11-11 03:16:38 +00:00
Eli Friedman	2d4055b683	Make sure we correctly unroll conversions between v2f64 and v2i32 on ARM. llvm-svn: 144241	2011-11-09 23:36:02 +00:00
Lang Hames	b85fcd07df	Lower mem-ops to unaligned i32/i16 load/stores on ARM where supported. Add support for trimming constants to GetDemandedBits. This fixes some funky constant generation that occurs when stores are expanded for targets that don't support unaligned stores natively. llvm-svn: 144102	2011-11-08 18:56:23 +00:00
Pete Cooper	82cd9e81fc	Added invariant field to the DAG.getLoad method and changed all calls. When this field is true it means that the load is from constant (runt-time or compile-time) and so can be hoisted from loops or moved around other memory accesses llvm-svn: 144100	2011-11-08 18:42:53 +00:00
Eli Friedman	6f84fed675	Make sure to mark vector extload's as expand on ARM. Fixes PR11319. llvm-svn: 144057	2011-11-08 01:43:53 +00:00
Dan Gohman	198b7ffc11	Reapply r143206, with fixes. Disallow physical register lifetimes across calls, and only check for nested dependences on the special call-sequence-resource register. llvm-svn: 143660	2011-11-03 21:49:52 +00:00
Lang Hames	1f4603d498	Fixed parameter name. llvm-svn: 143594	2011-11-02 23:37:04 +00:00
Lang Hames	9929c423a1	Try to lower memset/memcpy/memmove to vector instructions on ARM where the alignment permits. llvm-svn: 143582	2011-11-02 22:52:45 +00:00
Dan Gohman	9b9c970148	Revert r143206, as there are still some failing tests. llvm-svn: 143262	2011-10-29 00:41:52 +00:00
Dan Gohman	73057ad24f	Reapply r143177 and r143179 (reverting r143188), with scheduler fixes: Use a separate register, instead of SP, as the calling-convention resource, to avoid spurious conflicts with actual uses of SP. Also, fix unscheduling of calling sequences, which can be triggered by pseudo-two-address dependencies. llvm-svn: 143206	2011-10-28 17:55:38 +00:00
Duncan Sands	225a7037d6	Speculatively disable Dan's commits 143177 and 143179 to see if it fixes the dragonegg self-host (it looks like gcc is miscompiled). Original commit messages: Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. Delete #if 0 code accidentally left in. llvm-svn: 143188	2011-10-28 09:55:57 +00:00
Dan Gohman	4db3f7dd83	Eliminate LegalizeOps' LegalizedNodes map and have it just call RAUW on every node as it legalizes them. This makes it easier to use hasOneUse() heuristics, since unneeded nodes can be removed from the DAG earlier. Make LegalizeOps visit the DAG in an operands-last order. It previously used operands-first, because LegalizeTypes has to go operands-first, and LegalizeTypes used to be part of LegalizeOps, but they're now split. The operands-last order is more natural for several legalization tasks. For example, it allows lowering code for nodes with floating-point or vector constants to see those constants directly instead of seeing the lowered form (often constant-pool loads). This makes some things somewhat more complicated today, though it ought to allow things to be simpler in the future. It also fixes some bugs exposed by Legalizing using RAUW aggressively. Remove the part of LegalizeOps that attempted to patch up invalid chain operands on libcalls generated by LegalizeTypes, since it doesn't work with the new LegalizeOps traversal order. Instead, define what LegalizeTypes is doing to be correct, and transfer the responsibility of keeping calls from having overlapping calling sequences into the scheduler. Teach the scheduler to model callseq_begin/end pairs as having a physical register definition/use to prevent calls from having overlapping calling sequences. This is also somewhat complicated, though there are ways it might be simplified in the future. This addresses rdar://9816668, rdar://10043614, rdar://8434668, and others. Please direct high-level questions about this patch to management. llvm-svn: 143177	2011-10-28 01:29:32 +00:00
Lang Hames	c47e283430	Make sure short memsets on ARM lower to stores, even when optimizing for size. llvm-svn: 143055	2011-10-26 20:56:52 +00:00
James Molloy	dd9137aa56	Revert r142530 at least temporarily while a discussion is had on llvm-commits regarding exactly how much optsize should optimize for size over performance. llvm-svn: 143023	2011-10-26 08:53:19 +00:00
Bill Wendling	1414bc5a14	Use a worklist to prevent the iterator from becoming invalidated because of the 'removeSuccessor' call. Noticed in a Release+Asserts+Check buildbot. llvm-svn: 143018	2011-10-26 07:16:18 +00:00
Evan Cheng	043c9d3f7a	Revert part of r142530. The patch potentially hurts performance especially on Darwin platforms where -Os means optimize for size without hurting performance. llvm-svn: 143002	2011-10-26 01:17:44 +00:00
Eli Friedman	a5e244c08d	Don't crash on variable insertelement on ARM. PR10258. llvm-svn: 142871	2011-10-24 23:08:52 +00:00
Dan Gohman	4ed1afa51d	Change this overloaded use of Sched::Latency to be an overloaded use of Sched::ILP instead, as Sched::Latency is going away. llvm-svn: 142813	2011-10-24 17:55:11 +00:00
Bill Wendling	94e6643fce	The different flavors of ARM have different valid subsets of registers. Check that the set of callee-saved registers is correct for the specific platform. <rdar://problem/10313708> & ctor_dtor_count & ctor_dtor_count-2 llvm-svn: 142706	2011-10-22 00:29:28 +00:00
Bill Wendling	cf7bdf4438	Add missing operand. <rdar://problem/10313323> llvm-svn: 142615	2011-10-20 20:37:11 +00:00
James Molloy	2d768fd379	Use literal pool loads instead of MOVW/MOVT for materializing global addresses when optimizing for size. On spec/gcc, this caused a codesize improvement of ~1.9% for ARM mode and ~4.9% for Thumb(2) mode. This is codesize including literal pools. The pools themselves doubled in size for ARM mode and quintupled for Thumb mode, leaving suggestion that there is still perhaps redundancy in LLVM's use of constant pools that could be decreased by sharing entries. Fixes PR11087. llvm-svn: 142530	2011-10-19 14:11:07 +00:00
Bill Wendling	2977a15ab1	Make sure we emit the 'movw' and 'movt' only if it's supported. Otherwise, use a constant pool. llvm-svn: 142485	2011-10-19 09:24:02 +00:00
Bill Wendling	7c1634556d	Remove some dead code. llvm-svn: 142484	2011-10-19 09:04:11 +00:00
Bill Wendling	94f60018e0	Emit the MOVT instruction only if the # LPads is > 64K. llvm-svn: 142460	2011-10-18 23:19:55 +00:00
Bill Wendling	64e6bfc16c	For Thumb mode, we need to use a constant pool if the value is too large to be used with the CMP instruction. llvm-svn: 142458	2011-10-18 23:11:05 +00:00
Bill Wendling	4969dcdef9	Use the integer compare when the value is small enough. Use the "move into a register and then compare against that" method when it's too large. We have to move the value into the register in the "movw, movt" pair of instructions. llvm-svn: 142440	2011-10-18 22:52:20 +00:00
Bill Wendling	85833f71c6	Use the integer compare when the value is small enough. Use the "move into a register and then compare against that" method when it's too large. We have to move the value into the register in the "movw, movt" pair of instructions. llvm-svn: 142437	2011-10-18 22:49:07 +00:00
Bill Wendling	973c817cde	The value we're comparing against may be too large for the ARM CMP instruction. Move the value into a register and then use that for the CMP. <rdar://problem/10305266> llvm-svn: 142431	2011-10-18 22:11:18 +00:00
Bill Wendling	b2a703d352	The immediate may be too large for the CMP instruction. Move it into a register and use that in the CMP. <rdar://problem/10305266> llvm-svn: 142429	2011-10-18 21:55:58 +00:00
Andrew Trick	88b2450adc	Use ARM/t2PseudoInst class from ARM/Thumb2 special adds/subs patterns. Clean up the patterns, fix comments, and avoid confusing both tools and coders. Note that the special adds/subs SelectionDAG nodes no longer have the dummy cc_out operand. llvm-svn: 142397	2011-10-18 19:18:52 +00:00
Bob Wilson	93b0f7b319	Use isIntN and isUIntN to check for valid signed/unsigned numbers. llvm-svn: 142395	2011-10-18 18:46:49 +00:00
Andrew Trick	3f07c429b5	whitespace llvm-svn: 142394	2011-10-18 18:40:53 +00:00
Bill Wendling	617075fcf6	A landing pad could have more than one predecessor. In that case, we want that predecessor to remove the jump to it as well. Delay clearing the 'landing pad' flag until after the jumps have been removed. (There is an implicit assumption in several modules that an MBB which jumps to a landing pad has only two successors.) <rdar://problem/10304224> llvm-svn: 142390	2011-10-18 18:30:49 +00:00
Bob Wilson	9258b76d8d	Fix incorrect check for sign-extended constant BUILD_VECTOR. <rdar://problem/10298332> llvm-svn: 142371	2011-10-18 17:34:51 +00:00
Duncan Sands	d278d35b13	Fix a bunch of unused variable warnings when doing a release build with gcc-4.6. llvm-svn: 142350	2011-10-18 12:44:00 +00:00
Bill Wendling	510fbcd440	Don't renumber the blocks here. This could cause problems later on if another pass renumbers the blocks again. llvm-svn: 142258	2011-10-17 21:32:56 +00:00
Bill Wendling	f7f223f69e	Add a call to EmitSjLjDispatchBlock. Once the intrinsics are marked as having a custom inserter, it will call this method to emit the dispatch table into the machine function. llvm-svn: 142245	2011-10-17 20:37:20 +00:00
Bill Wendling	26d2780d07	Add comment explaining that the order of processing doesn't matter here. llvm-svn: 142176	2011-10-17 05:25:09 +00:00
Nadav Rotem	097106b77a	ARM cannot select a pattern for trunc-store v4i8; /ARM/vrev.ll fails when promoting elements. llvm-svn: 142080	2011-10-15 20:03:12 +00:00
Bill Wendling	9c1019c6c7	Mark registers as DEAD because they're really just clobbers. llvm-svn: 142027	2011-10-15 00:27:44 +00:00
Eli Friedman	74d1da5a05	Add missing correctness check to ARMTargetLowering::ReconstructShuffle. Fixes PR11129. llvm-svn: 142022	2011-10-14 23:58:49 +00:00
Bill Wendling	9e0cd1ee17	Make sure that the register is in the register class before adding it as a machine op. llvm-svn: 142021	2011-10-14 23:55:44 +00:00
Bill Wendling	6f3f9a391e	Mark the invoke call instruction as implicitly defining the callee-saved registers. The callee-saved registers cannot be live across an invoke call because the control flow may continue along the exceptional edge. When this happens, all of the callee-saved registers are no longer valid. llvm-svn: 142018	2011-10-14 23:34:37 +00:00
Eli Friedman	aa6ec39056	Simplify and avoid undefined shift. Based on patch by Ahmed Charles. llvm-svn: 141903	2011-10-13 22:40:23 +00:00
Bill Wendling	a7d697e4a6	Reapply r141365 now that PR11107 is fixed. llvm-svn: 141591	2011-10-10 22:59:55 +00:00
Bill Wendling	47aac51043	Revert r141365. It was causing MultiSource/Benchmarks/MiBench/consumer-lame to hang, and possibly SPEC/CINT2006/464_h264ref. llvm-svn: 141560	2011-10-10 18:27:30 +00:00
Bill Wendling	883ec97115	Take all of the invoke basic blocks and make the dispatch basic block their new successor. Remove the old landing pad from their successor list, because it's now the successor of the dispatch block. Now that the landing pad blocks are no longer the destination of invokes, we can mark them as normal basic blocks instead of landing pads. This more closely resembles what the CFG is actually doing. llvm-svn: 141436	2011-10-07 23:18:02 +00:00
Bill Wendling	f9f5e455d4	Take the code that was emitted for the llvm.eh.dispatch.setup intrinsic and emit it with the new SjLj emitter stuff. This way there's no need to emit that kind-of-hacky intrinsic. llvm-svn: 141419	2011-10-07 22:08:37 +00:00
Bill Wendling	7ecfbd90ef	Thread the chain through the eh.sjlj.setjmp intrinsic, like it's documented to do. This will be useful later on with the new SJLJ stuff. llvm-svn: 141416	2011-10-07 21:25:38 +00:00
Bob Wilson	8decdc472f	Reenable tail calls for iOS 5.0 and later. llvm-svn: 141370	2011-10-07 17:17:49 +00:00
Bob Wilson	bc1589945d	Reenable use of divmod compiler_rt functions for iOS 5.0 and later. llvm-svn: 141368	2011-10-07 16:59:21 +00:00
Anton Korobeynikov	318d6bae80	Peephole optimization for ABS on ARM. Patch by Ana Pazos! llvm-svn: 141365	2011-10-07 16:15:08 +00:00
Bill Wendling	8d50ea0f82	Use the correct vreg here. llvm-svn: 141342	2011-10-06 23:41:14 +00:00
Bill Wendling	b3d4678877	Generate the dispatch code for a 'thumb' function. This is very similar to the others. They take the call site value. Determine if it's a proper value. And then jumps to the correct call site via a jump table. llvm-svn: 141341	2011-10-06 23:37:36 +00:00
Bill Wendling	5626c66a89	Generate the dispatch table for ARM mode. llvm-svn: 141327	2011-10-06 22:53:00 +00:00
Bill Wendling	030b58e5c9	Refactor some of the code that sets up the entry block for SjLj EH. No functionality change. llvm-svn: 141323	2011-10-06 22:18:16 +00:00
Bill Wendling	31d973cde6	Use a thumb ORR instead of thumb2 ORR when in thumb-only mode. (Picky! Picky!) Place the immediate to OR into a register so that it works. llvm-svn: 141319	2011-10-06 21:51:21 +00:00
Bill Wendling	362c1b01cc	* Set the low bit of the return address when we are in thumb mode. * Some code cleanup. llvm-svn: 141317	2011-10-06 21:29:56 +00:00
Bill Wendling	6134655f08	Add the MBBs before inserting the instructions. Doing it afterwards could lead to an infinite loop because of the def-use chains. Also use a frame load instead of store for the LD instruction. llvm-svn: 141263	2011-10-06 00:53:33 +00:00
Bill Wendling	f793e7ed5c	Get the proper call site numbers for the landing pads. Also remove a magic number (18) for the proper addressing mode. llvm-svn: 141245	2011-10-05 23:28:57 +00:00
Bill Wendling	324be98a3c	Look at the number of entries in the jump table and jump to a 'trap' block if the value exceeds that number. llvm-svn: 141143	2011-10-05 00:39:32 +00:00
Bill Wendling	202803e39c	Checkpoint for SJLJ EH code. This is a first pass at generating the jump table for the sjlj dispatch. It currently generates something plausible, but hasn't been tested thoroughly. llvm-svn: 141140	2011-10-05 00:02:33 +00:00
Bill Wendling	1eab54f8ba	Use the PC label ID rather than '1'. Add support for thumb-2, because I heard that some people use it. llvm-svn: 141042	2011-10-03 22:44:15 +00:00
Bill Wendling	374ee194f2	Check-pointing the new SjLj EH lowering. This code will replace the version in ARMAsmPrinter.cpp. It creates a new machine basic block, which is the dispatch for the return from a longjmp call. It then shoves the address of that machine basic block into the correct place in the function context so that the EH runtime will jump to it directly instead of having to go through a compare-and-jump-to-the-dispatch bit. This should be more efficient in the common case. llvm-svn: 141031	2011-10-03 21:25:38 +00:00
Bill Wendling	c214cb055d	Use the new ARMConstantPoolSymbol class to handle external symbols. llvm-svn: 140939	2011-10-01 08:58:29 +00:00
Bill Wendling	7753d66468	Switch over to using ARMConstantPoolConstant for global variables, functions, and block addresses. llvm-svn: 140936	2011-10-01 08:00:54 +00:00
Jim Grosbach	efc761a1eb	ARM fix encoding of VMOV.f32 and VMOV.f64 immediates. Encode the immediate into its 8-bit form as part of isel rather than later, which simplifies things for mapping the encoding bits, allows the removal of the custom disassembler decoding hook, makes the operand printer trivial, and prepares things more cleanly for handling these in the asm parser. rdar://10211428 llvm-svn: 140834	2011-09-30 00:50:06 +00:00
Evan Cheng	8156376aa9	Tighten a ARM dag combine condition to avoid an identity transformation, which ends up introducing a cycle in the DAG. rdar://10196296 llvm-svn: 140733	2011-09-28 23:16:31 +00:00
David Meyer	b1fbf9ff26	PR11004: Inline memcpy to avoid generating nested call sequence. Un-XFAIL 2011-06-09-TailCallByVal and 2010-11-04-BigByval llvm-svn: 140516	2011-09-26 06:13:20 +00:00
Andrew Trick	924123acb3	Lower ARM adds/subs to add/sub after adding optional CPSR operand. This is still a hack until we can teach tblgen to generate the optional CPSR operand rather than an implicit CPSR def. But the strangeness is now limited to the selection DAG. ADD/SUB MI's no longer have implicit CPSR defs, nor do we allow flag setting variants of these opcodes in machine code. There are several corner cases to consider, and getting one wrong would previously lead to nasty miscompilation. It's not the first time I've debugged one, so this time I added enough verification to ensure it won't happen again. llvm-svn: 140228	2011-09-21 02:20:46 +00:00
Andrew Trick	8586e62d91	ARM isel bug fix for adds/subs operands. Modified ARMISelLowering::AdjustInstrPostInstrSelection to handle the full gamut of CPSR defs/uses including instructins whose "optional" cc_out operand is not really optional. This allowed removal of the hasPostISelHook to simplify the .td files and make the implementation more robust. Fixes rdar://10137436: sqlite3 miscompile llvm-svn: 140134	2011-09-20 03:17:40 +00:00
Andrew Trick	53df4b6dfa	whitespace llvm-svn: 140133	2011-09-20 03:06:13 +00:00
Jim Grosbach	9c0b86a76d	Thumb2 assembly parsing and encoding for STR. More addressing mode encoding bits. Handle pre increment for STR/STRB/STRH and STR(register). llvm-svn: 139949	2011-09-16 21:55:56 +00:00
Eli Friedman	10f9ce2b7d	Minor cleanup. llvm-svn: 139869	2011-09-15 22:26:18 +00:00
Eli Friedman	ba912e06c2	Use a more efficient lowering for Unordered/Monotonic atomic load/store on Thumb1. llvm-svn: 139865	2011-09-15 22:18:49 +00:00
Jim Grosbach	e7e2aca322	Tidy up a few 80 column violations. llvm-svn: 139636	2011-09-13 20:30:37 +00:00
Owen Anderson	29cfe6c368	Thumb unconditional branches are allowed in IT blocks, and therefore should have a predicate operand, unlike conditional branches. llvm-svn: 139415	2011-09-09 21:48:23 +00:00
Jim Grosbach	a05627ebaf	Thumb2 assembly parsing and encoding for LDREX/LDREXB/LDREXD/LDREXH. llvm-svn: 139381	2011-09-09 18:37:27 +00:00
Duncan Sands	f2641e1bc1	Add codegen support for vector select (in the IR this means a select with a vector condition); such selects become VSELECT codegen nodes. This patch also removes VSETCC codegen nodes, unifying them with SETCC nodes (codegen was actually often using SETCC for vector SETCC already). This ensures that various DAG combiner optimizations kick in for vector comparisons. Passes dragonegg bootstrap with no testsuite regressions (nightly testsuite as well as "make check-all"). Patch mostly by Nadav Rotem. llvm-svn: 139159	2011-09-06 19:07:46 +00:00
Evan Cheng	0b758ed6ba	Fix fall outs from my recent change on how carry bit is modeled during isel. Now the 'S' instructions, e.g. ADDS, treat S bit as optional operand as well. Also fix isel hook to correctly set the optional operand. rdar://10073745 llvm-svn: 139157	2011-09-06 18:52:20 +00:00
Eli Friedman	d7776ed030	Null-initialize to shut up -Wuninitialized warnings. llvm-svn: 138974	2011-09-01 22:27:41 +00:00
Eli Friedman	1ccecbb9d3	64-bit atomic cmpxchg for ARM. llvm-svn: 138868	2011-08-31 17:52:22 +00:00
Eli Friedman	c3f9c4a852	Some 64-bit atomic operations on ARM. 64-bit cmpxchg coming next. llvm-svn: 138845	2011-08-31 00:31:29 +00:00
Evan Cheng	e6fba77971	Follow up to r138791. Add a instruction flag: hasPostISelHook which tells the pre-RA scheduler to call a target hook to adjust the instruction. For ARM, this is used to adjust instructions which may be setting the 's' flag. ADC, SBC, RSB, and RSC instructions have implicit def of CPSR (required since it now uses CPSR physical register dependency rather than "glue"). If the carry flag is used, then the target hook will fill in the optional operand with CPSR. Otherwise, the hook will remove the CPSR implicit def from the MachineInstr. llvm-svn: 138810	2011-08-30 19:09:48 +00:00
Evan Cheng	e891654a58	Change ARM / Thumb2 addc / adde and subc / sube modeling to use physical register dependency (rather than glue them together). This is general goodness as it gives scheduler more freedom. However it is motivated by a nasty bug in isel. When a i64 sub is expanded to subc + sube. libcall #1 \ \ subc \ / \ \ / \ \ / libcall #2 sube If the libcalls are not serialized (i.e. both have chains which are dag entry), legalizer can serialize them in arbitrary orders. If it's unlucky, it can force libcall #2 before libcall #1 in the above case. subc \| libcall #2 \| libcall #1 \| sube However since subc and sube are "glued" together, this ends up being a cycle when the scheduler combine subc and sube as a single scheduling unit. The right solution is to fix LegalizeType too chains the libcalls together. However, LegalizeType is not processing nodes in order so that's harder than it should be. For now, the move to physical register dependency will do. rdar://10019576 llvm-svn: 138791	2011-08-30 01:34:54 +00:00
Eli Friedman	7dfa791f4f	Expand ATOMIC_LOAD and ATOMIC_STORE for architectures I don't know well enough to fix properly. llvm-svn: 138751	2011-08-29 18:23:02 +00:00
Benjamin Kramer	61a1ff543c	Silence GCC warnings and make an array const. llvm-svn: 138706	2011-08-27 17:36:14 +00:00
Eli Friedman	452aae6202	Atomic load/store on ARM/Thumb. I don't really like the patterns, but I'm having trouble coming up with a better way to handle them. I plan on making other targets use the same legalization ARM-without-memory-barriers is using... it's not especially efficient, but if anyone cares, it's not that hard to fix for a given target if there's some better lowering. llvm-svn: 138621	2011-08-26 02:59:24 +00:00
Jim Grosbach	f402f694e2	ARM expansion of pre-indexed store pseudos should maintain memoperands. Partial fix for rdar://9945172. llvm-svn: 137513	2011-08-12 21:02:34 +00:00
Duncan Sands	a41634e307	Silence a bunch (but not all) "variable written but not read" warnings when building with assertions disabled. llvm-svn: 137460	2011-08-12 14:54:45 +00:00
Jim Grosbach	d886f8cd8d	ARM STRH assembly parsing and encoding. llvm-svn: 137353	2011-08-11 21:17:22 +00:00
Jim Grosbach	5e80abbb5d	ARM fix typo in pre-indexed store lowering. rdar://9915869 llvm-svn: 137148	2011-08-09 21:22:41 +00:00
Jim Grosbach	f0c95cadc7	ARM refactor indexed store instructions. Refactor STR[B] pre and post indexed instructions to use addressing modes for memory operands, which is necessary for assembly parsing and is more consistent with the rest of the memory instruction definitions. Make some incremental progress on refactoring away the mega-operand addrmode2 along the way, which is nice. llvm-svn: 136978	2011-08-05 20:35:44 +00:00
Eli Friedman	30a49e93e3	New approach to r136737: insert the necessary fences for atomic ops in platform-independent code, since a bunch of platforms (ARM, Mips, PPC, Alpha are the relevant targets here) need to do essentially the same thing. I think this completes the basic CodeGen for atomicrmw and cmpxchg. llvm-svn: 136813	2011-08-03 21:06:02 +00:00
Eli Friedman	5c863aeefd	ARM backend support for atomicrmw and cmpxchg with non-monotonic ordering. Not especially pretty, but seems to work well enough. If this looks okay, I'll put together similar patches for Mips, PPC, and Alpha. llvm-svn: 136737	2011-08-02 22:44:16 +00:00
Eric Christopher	aa5030066f	Add support for the 'Q' constraint. Fixes rdar://9866494 llvm-svn: 136523	2011-07-29 21:18:58 +00:00
Eli Friedman	26a484852e	Code generation for 'fence' instruction. llvm-svn: 136283	2011-07-27 22:21:52 +00:00
Jim Grosbach	8b31ef50c0	ARM extend instructions simplification. Refactor the SXTB, SXTH, SXTB16, UXTB, UXTH, and UXTB16 instructions to not have an 'r' and an 'r_rot' version, but just a single version with a rotate that can be zero. Use plain Pat<>'s for the ISel of the non-rotated version. llvm-svn: 136225	2011-07-27 16:47:19 +00:00
Owen Anderson	b595ed0085	Split up the ARM so_reg ComplexPattern into so_reg_reg and so_reg_imm, allowing us to distinguish the encodings that use shifted registers from those that use shifted immediates. This is necessary to allow the fixed-length decoder to distinguish things like BICS vs LDRH. llvm-svn: 135693	2011-07-21 18:54:16 +00:00
Evan Cheng	a20cde31e7	Sink ARMMCExpr and ARMAddressingModes into MC layer. First step to separate ARM MC code from target. llvm-svn: 135636	2011-07-20 23:34:39 +00:00
Chris Lattner	229907cd11	land David Blaikie's patch to de-constify Type, with a few tweaks. llvm-svn: 135375	2011-07-18 04:54:35 +00:00
Evan Cheng	f863e3fb73	Improve codegen for select's: if (x != 0) x = 1 if (x == 1) x = 1 Previous codegen looks like this: mov r1, r0 cmp r1, #1 mov r0, #0 moveq r0, #1 The naive lowering select between two different values. It should recognize the test is equality test so it's more a conditional move rather than a select: cmp r0, #1 movne r0, #0 rdar://9758317 llvm-svn: 135017	2011-07-13 00:42:17 +00:00
Cameron Zwarich	f03fa189ca	Add an intrinsic and codegen support for fused multiply-accumulate. The intent is to use this for architectures that have a native FMA instruction. llvm-svn: 134742	2011-07-08 21:39:21 +00:00
Jim Grosbach	3840c90f73	Add more info to FIXME. llvm-svn: 134729	2011-07-08 20:18:11 +00:00
Jim Grosbach	cf1464d943	ARMv7M vs. ARMv7E-M support. The DSP instructions in the Thumb2 instruction set are an optional extension in the Cortex-M* archtitecture. When present, the implementation is considered an "ARMv7E-M implementation," and when not, an "ARMv7-M implementation." Add a subtarget feature hook for the v7e-m instructions and hook it up. The cortex-m3 cpu is an example of a v7m implementation, while the cortex-m4 is a v7e-m implementation. rdar://9572992 llvm-svn: 134261	2011-07-01 21:12:19 +00:00
Eric Christopher	29f1db85dd	Add support for the 'j' immediate constraint. This is conditionalized on supporting the instruction that the constraint is for 'movw'. Part of rdar://9119939 llvm-svn: 134222	2011-07-01 01:00:07 +00:00
Eric Christopher	c011d31543	Add support for the ARM 't' register constraint. And another testcase for the 'x' register constraint. Part of rdar://9119939 llvm-svn: 134220	2011-07-01 00:30:46 +00:00
Eric Christopher	f09b0f1043	We'll return a null RC by default if we can't match. Part of rdar://9119939 llvm-svn: 134217	2011-07-01 00:19:27 +00:00
Eric Christopher	f1c74595aa	Add support for the 'x' constraint. Part of rdar://9307836 and rdar://9119939 llvm-svn: 134215	2011-07-01 00:14:47 +00:00
Eric Christopher	1f054f27af	Capitalize the unsigned part of the initializer. llvm-svn: 134211	2011-06-30 23:59:16 +00:00
Eric Christopher	cf2007ca78	Rename Pair to RCPair lacking any better naming ideas. llvm-svn: 134210	2011-06-30 23:50:52 +00:00
Eric Christopher	f45daac30f	Add support for the 'h' constraint. Part of rdar://9119939 llvm-svn: 134203	2011-06-30 23:23:01 +00:00
Eric Christopher	c486b47b15	Add a convenience typedef for std::pair<unsigned, const TargetRegisterClass*>. No functional change. Part of rdar://9119939 llvm-svn: 134198	2011-06-30 22:17:01 +00:00
Eric Christopher	1b8b9419ba	Remove getRegClassForInlineAsmConstraint from the ARM port. Part of rdar://9643582 llvm-svn: 134095	2011-06-29 21:10:36 +00:00
Evan Cheng	6cc775f905	- Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and sink them into MC layer. - Added MCInstrInfo, which captures the tablegen generated static data. Chang TargetInstrInfo so it's based off MCInstrInfo. llvm-svn: 134021	2011-06-28 19:10:37 +00:00
Chad Rosier	6b610b387d	Remove warning: 'c0' may be used uninitialized in this function. llvm-svn: 134014	2011-06-28 17:26:57 +00:00
Chad Rosier	fa8d89327f	The Neon VCVT (between floating-point and fixed-point, Advanced SIMD) instructions can be used to match combinations of multiply/divide and VCVT (between floating-point and integer, Advanced SIMD). Basically the VCVT immediate operand that specifies the number of fraction bits corresponds to a floating-point multiply or divide by the corresponding power of 2. For example, VCVT (floating-point to fixed-point, Advanced SIMD) can replace a combination of VMUL and VCVT (floating-point to integer) as follows: Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>): vmul.f32 d16, d17, d16 vcvt.s32.f32 d16, d16 becomes: vcvt.s32.f32 d16, d16, #3 Similarly, VCVT (fixed-point to floating-point, Advanced SIMD) can replace a combinations of VCVT (integer to floating-point) and VDIV as follows: Example (assume d17 = <float 8.000000e+00, float 8.000000e+00>): vcvt.f32.s32 d16, d16 vdiv.f32 d16, d17, d16 becomes: vcvt.f32.s32 d16, d16, #3 llvm-svn: 133813	2011-06-24 19:23:04 +00:00
Eric Christopher	e256cd0565	Handle the memory-ness of all U+ ARM constraints. Noticed on inspection. llvm-svn: 133553	2011-06-21 22:10:57 +00:00
Benjamin Kramer	25e17b0f89	Remove unused but set variables. llvm-svn: 133347	2011-06-18 11:09:41 +00:00
Bruno Cardoso Lopes	d66ab9ead1	Mark ldrexd/strexd w/ volatile memory by default llvm-svn: 133175	2011-06-16 18:11:32 +00:00
Chad Rosier	2730162bee	Revision r128665 added an optimization to make use of NEON multiplier accumulator forwarding. Specifically (from SVN log entry): Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: vadd d3, d0, d1 vmul d3, d3, d2 => vmul d3, d0, d2 vmla d3, d1, d2 Make sure it catches cases where operand 1 is add/fadd/sub/fsub, which was intended in the original revision. llvm-svn: 133127	2011-06-16 01:21:54 +00:00
Bob Wilson	4b12a11f30	A minor simplification: no functional change. llvm-svn: 133047	2011-06-15 06:04:34 +00:00
Evan Cheng	6d02d9044b	PerformBFICombine - (bfi A, (and B, Mask1), Mask2) -> (bfi A, B, Mask2) iff the bits being cleared by the AND are not demanded by the BFI. The previous BFI dag combine rule was actually incorrect (or used to be correct until BFI representation changed). rdar://9609030 llvm-svn: 133034	2011-06-15 01:12:31 +00:00
Tanya Lattner	e9e6705cf9	Add an optimization that looks for a specific pair-wise add pattern and generates a vpaddl instruction instead of scalarizing the add. Includes a test case. llvm-svn: 133027	2011-06-14 23:48:48 +00:00
Bruno Cardoso Lopes	dc9ff3a4b1	Add one more argument to the prefetch intrinsic to indicate whether it's a data or instruction cache access. Update the targets to match it and also teach autoupgrade. llvm-svn: 132976	2011-06-14 04:58:37 +00:00
Cameron Zwarich	890197859b	Provide an ARMCCState subclass of CCState so that ARM clients will always set CallOrPrologue correctly and eliminate the existing setter. llvm-svn: 132856	2011-06-10 20:59:24 +00:00
Cameron Zwarich	361548d4b4	A CCState was being created without setting whether it is in the Call or Prologue state, causing an assertion failure downstream. This fixes <rdar://problem/9562908>. This really seems like it should always be set at CCState creation time, so mistakes like this can never happen. I'll take a look at doing that. llvm-svn: 132811	2011-06-09 22:30:07 +00:00
Eric Christopher	0713a9d8fc	Add a parameter to CCState so that it can access the MachineFunction. No functional change. Part of PR6965 llvm-svn: 132763	2011-06-08 23:55:35 +00:00
Eric Christopher	354b2a25f3	Make the Uv constraint a memory operand. This doesn't solve the addressing mode problem mentioned in r132559. Backend part of rdar://9037836 and part of rdar://9119939 llvm-svn: 132561	2011-06-03 17:24:37 +00:00
Eric Christopher	de9399bf76	Have LowerOperandForConstraint handle multiple character constraints. Part of rdar://9119939 llvm-svn: 132510	2011-06-02 23:16:42 +00:00
John McCall	7d84ece09b	On Darwin ARM, set the UNWIND_RESUME libcall to _Unwind_SjLj_Resume. This is important for the correct lowering of unwind instructions (which doesn't matter at all) and llvm.eh.resume calls (which does). Take 2, now with more basic competence. llvm-svn: 132295	2011-05-29 19:50:32 +00:00
John McCall	e64371b932	I didn't mean to commit these residues of a personal project. llvm-svn: 132293	2011-05-29 19:41:56 +00:00
John McCall	085d891d80	On Darwin ARM, set the UNWIND_RESUME libcall to _Unwind_SjLj_Resume. This is important for the correct lowering of unwind instructions (which doesn't matter at all) and llvm.eh.resume calls (which does). llvm-svn: 132291	2011-05-29 19:39:04 +00:00
Bruno Cardoso Lopes	325110f30d	Add support for ARM ldrexd/strexd intrinsics. They both use i32 register pairs to load/store i64 values. Since there's no current support to explicitly declare such restrictions, implement it by using specific hardcoded register pairs during isel. llvm-svn: 132248	2011-05-28 04:07:29 +00:00
Cameron Zwarich	1d553a2cc4	Fix the remaining atomic intrinsics to use the right register classes on Thumb2, and add some basic tests for them. llvm-svn: 132235	2011-05-27 23:54:00 +00:00
Evan Cheng	518bcd0ef4	Don't use movw / movt for iOS static codegen for now to workaround some tools issues. rdar://9514789 llvm-svn: 132211	2011-05-27 20:11:27 +00:00
Renato Golin	4cd5187f5b	RTABI chapter 4.3.4 specifies __eabi_mem* calls. Specifically, __eabi_memset accepts parameters (ptr, size, value) in a different order than GNU's memset (ptr, value, size), therefore the special lowering in AAPCS mode. Implementation by Evzen Muller. llvm-svn: 131868	2011-05-22 21:41:23 +00:00
Evan Cheng	4fcd8250ae	Revert accidental commit. llvm-svn: 131739	2011-05-20 17:38:48 +00:00
Evan Cheng	e8d2e9eb35	Revert r131664 and fix it in instcombine instead. rdar://9467055 llvm-svn: 131708	2011-05-20 00:54:37 +00:00
Mon P Wang	6d9e1c7c2e	Fixed sdiv and udiv for <4 x i16>. The test from r125402 still applies for this change. llvm-svn: 131630	2011-05-19 04:15:07 +00:00
Tanya Lattner	1d11720ae4	Handle perfect shuffle case that generates a vrev for vectors of floats. Add test case. llvm-svn: 131582	2011-05-18 21:44:54 +00:00
Evan Cheng	522fbfea3b	Revise r131553. Just use the type of the input node and forgo the bitcast. rdar://9449159. llvm-svn: 131555	2011-05-18 18:59:17 +00:00
Evan Cheng	80632c91b0	Fix an ARMTargetLowering::LowerSELECT bug: legalized result must have same type as input. Sorry test cases only trigger when dag combine is disabled. rdar://9449178 llvm-svn: 131553	2011-05-18 18:47:27 +00:00
Tanya Lattner	48b182c3a4	In r131488 I misunderstood how VREV works. It splits the vector in half and splits each half. Therefore, the real problem was that we were using a VREV64 for a 4xi16, when we should have been using a VREV32. Updated test case and reverted change to the PerfectShuffle Table. llvm-svn: 131529	2011-05-18 06:42:21 +00:00
Cameron Zwarich	f9839e4257	Fix typo. llvm-svn: 131519	2011-05-18 02:29:50 +00:00
Cameron Zwarich	d7c55fe2ef	Fix more of PR8825 by correctly using rGPR registers when lowering atomic compare-and-swap intrinsics. llvm-svn: 131518	2011-05-18 02:20:07 +00:00
Bill Wendling	50117f8186	Give the 'eh.sjlj.dispatchsetup' intrinsic call the value coming from the setjmp intrinsic call. This prevents it from being reordered so that it appears before the setjmp intrinsic (thus making it completely useless). <rdar://problem/9409683> llvm-svn: 131174	2011-05-11 01:11:55 +00:00
Eli Friedman	2518f8376d	Make the logic for determining function alignment more explicit. No functionality change. llvm-svn: 131012	2011-05-06 20:34:06 +00:00
Bob Wilson	09585e1c5d	Temporarily disable use of divmod compiler-rt functions for iOS. llvm-svn: 130766	2011-05-03 17:33:22 +00:00
Dan Gohman	6136e94897	Add an unfolded offset field to LSR's Formula record. This is used to model constants which can be added to base registers via add-immediate instructions which don't require an additional register to materialize the immediate. llvm-svn: 130743	2011-05-03 00:46:49 +00:00
Eric Christopher	e02e07c3a6	80-col. llvm-svn: 130558	2011-04-29 23:12:01 +00:00
Jim Grosbach	d4b733e4d8	ARM and Thumb2 support for atomic MIN/MAX/UMIN/UMAX loads. rdar://9326019 llvm-svn: 130234	2011-04-26 19:44:18 +00:00
Andrew Trick	0ed5778a1e	Thumb2 and ARM add/subtract with carry fixes. Fixes Thumb2 ADCS and SBCS lowering: <rdar://problem/9275821>. t2ADCS/t2SBCS are now pseudo instructions, consistent with ARM, so the assembly printer correctly prints the 's' suffix. Fixes Thumb2 adde -> SBC matching to check for live/dead carry flags. Fixes the internal ARM machine opcode mnemonic for ADCS/SBCS. Fixes ARM SBC lowering to check for live carry (potential bug). llvm-svn: 130048	2011-04-23 03:55:32 +00:00
Evan Cheng	5f1ba4cd2d	Remove -use-divmod-libcall. Let targets opt in when they are available. llvm-svn: 129884	2011-04-20 22:20:12 +00:00
Stuart Hastings	7850af6ea0	Excise unintended hunk in 129858. <rdar://problem/7662569> llvm-svn: 129862	2011-04-20 18:09:26 +00:00
Stuart Hastings	45fe3c38c5	ARM byval support. Will be enabled by another patch to the FE. <rdar://problem/7662569> llvm-svn: 129858	2011-04-20 16:47:52 +00:00
Eric Christopher	c721b0db6d	Remove some duplicate op action entries and reorganize. llvm-svn: 129781	2011-04-19 18:49:19 +00:00
Chris Lattner	0ab5e2cded	Fix a ton of comment typos found by codespell. Patch by Luis Felipe Strano Moraes! llvm-svn: 129558	2011-04-15 05:18:47 +00:00
Evan Cheng	12bb05b75b	Fix another fcopysign lowering bug. If src is f64 and destination is f32, don't forget to right shift the source by 32 first. rdar://9287902 llvm-svn: 129556	2011-04-15 01:31:00 +00:00
Cameron Zwarich	415b5e8341	Fix a typo in an ARM-specific DAG combine. This fixes <rdar://problem/9278274>. llvm-svn: 129468	2011-04-13 21:01:19 +00:00
Cameron Zwarich	fbcd69b96a	Split a store of a VMOVDRR into two integer stores to avoid mixing NEON and ARM stores of arguments in the same cache line. This fixes the second half of <rdar://problem/8674845>. llvm-svn: 129345	2011-04-12 02:24:17 +00:00
Evan Cheng	74d92c1924	Change -arm-trap-func= into a non-arm specific option. Now Intrinsic::trap is lowered into a call to the specified trap function at sdisel time. llvm-svn: 129152	2011-04-08 21:37:21 +00:00
Evan Cheng	9a3f2772f0	Add option to emit @llvm.trap as a function call instead of a trap instruction. rdar://9249183. llvm-svn: 129107	2011-04-07 20:31:12 +00:00
Tanya Lattner	266792a55a	Prevent ARM DAG Combiner from doing an AND or OR combine on an illegal vector type (vectors of size 3). Also included test cases. llvm-svn: 129074	2011-04-07 15:24:20 +00:00
Evan Cheng	a7c7b54dde	Change -arm-divmod-libcall to a target neutral option. llvm-svn: 129045	2011-04-07 00:58:44 +00:00
Owen Anderson	867846b1f0	Reapply r128946 (pseudoization of various instructions), and fix the extra imp-def of CPSR it was adding. llvm-svn: 128965	2011-04-05 23:55:28 +00:00
Owen Anderson	61e7a935bd	Revert r128946 while I figure out why it broke the buildbots. llvm-svn: 128951	2011-04-05 23:03:06 +00:00
Owen Anderson	3501655ad9	Give RSBS and RSCS the pseudo treatment. llvm-svn: 128946	2011-04-05 22:42:54 +00:00
Owen Anderson	77aa266de8	Fix bugs in the pseuo-ization of ADCS/SBCS pointed out by Jim, as well as doing the expansion earlier (using a custom inserter) to allow for the chance of predicating these instructions. llvm-svn: 128940	2011-04-05 21:48:57 +00:00
Bill Wendling	dd4dcd549b	Revamp the SjLj "dispatch setup" intrinsic. It needed to be moved closer to the setjmp statement, because the code directly after the setjmp needs to know about values that are on the stack. Also, the 'bitcast' of the function context was causing a dead load. This wouldn't be too horrible, except that at -O0 it wasn't optimized out, and because it wasn't using the correct base pointer (if there is a VLA), it would try to access a value from a garbage address. <rdar://problem/9130540> llvm-svn: 128873	2011-04-05 01:37:43 +00:00
Cameron Zwarich	6fe5c29430	Do some peephole optimizations to remove pointless VMOVs from Neon to integer registers that arise from argument shuffling with the soft float ABI. These instructions are particularly slow on Cortex A8. This fixes one half of <rdar://problem/8674845>. llvm-svn: 128759	2011-04-02 02:40:43 +00:00
Evan Cheng	bd76679700	Issue libcalls __udivmodi4 / __divmodi4 for div / rem pairs. rdar://8911343 llvm-svn: 128696	2011-04-01 00:42:02 +00:00
Evan Cheng	38bf5adcea	Distribute (A + B) * C to (A * C) + (B * C) to make use of NEON multiplier accumulator forwarding: vadd d3, d0, d1 vmul d3, d3, d2 => vmul d3, d0, d2 vmla d3, d1, d2 llvm-svn: 128665	2011-03-31 19:38:48 +00:00
Evan Cheng	ee9d45dd55	Don't try to create zero-sized stack objects. llvm-svn: 128586	2011-03-30 23:44:13 +00:00
Cameron Zwarich	53dd03d537	Add a ARM-specific SD node for VBSL so that forms with a constant first operand can be recognized. This fixes <rdar://problem/9183078>. llvm-svn: 128584	2011-03-30 23:01:21 +00:00
Evan Cheng	18381b4257	Add intrinsics @llvm.arm.neon.vmulls and @llvm.arm.neon.vmullu.* back. Frontends was lowering them to sext / uxt + mul instructions. Unfortunately the optimization passes may hoist the extensions out of the loop and separate them. When that happens, the long multiplication instructions can be broken into several scalar instructions, causing significant performance issue. Note the vmla and vmls intrinsics are not added back. Frontend will codegen them as intrinsics vmull* + add / sub. Also note the isel optimizations for catching mul + sext / zext are not changed either. First part of rdar://8832507, rdar://9203134 llvm-svn: 128502	2011-03-29 23:06:19 +00:00
Cameron Zwarich	143f9aea2b	Add Neon SINT_TO_FP and UINT_TO_FP lowering from v4i16 to v4f32. Fixes <rdar://problem/8875309> and <rdar://problem/9057191>. llvm-svn: 128492	2011-03-29 21:41:55 +00:00
Evan Cheng	e2086e740f	Optimizing (zext A + zext B) * C, to (VMULL A, C) + (VMULL B, C) during isel lowering to fold the zero-extend's and take advantage of no-stall back to back vmul + vmla: vmull q0, d4, d6 vmlal q0, d5, d6 is faster than vaddl q0, d4, d5 vmovl q1, d6 vmul q0, q0, q1 This allows us to vmull + vmlal for: f = vmull_u8( vget_high_u8(s), c); f = vmlal_u8(f, vget_low_u8(s), c); rdar://9197392 llvm-svn: 128444	2011-03-29 01:56:09 +00:00
Eric Christopher	d553096688	Fix the bfi handling for or (and a mask) (and b mask). We need the two masks to match inversely for the code as is to work. For the example given we actually want: bfi r0, r2, #1, #1 not #0, however, given the way the pattern is written it's not possible at the moment. Fixes rdar://9177502 llvm-svn: 128320	2011-03-26 01:21:03 +00:00
Evan Cheng	0663f23bd8	Re-apply r127953 with fixes: eliminate empty return block if it has no predecessors; update dominator tree if cfg is modified. llvm-svn: 127981	2011-03-21 01:19:09 +00:00
Daniel Dunbar	327cd36f74	Revert r127953, "SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR", it broke a lot of things. llvm-svn: 127954	2011-03-19 21:47:14 +00:00
Evan Cheng	824a711305	SimplifyCFG has stopped duplicating returns into predecessors to canonicalize IR to have single return block (at least getting there) for optimizations. This is general goodness but it would prevent some tailcall optimizations. One specific case is code like this: int f1(void); int f2(void); int f3(void); int f4(void); int f5(void); int f6(void); int foo(int x) { switch(x) { case 1: return f1(); case 2: return f2(); case 3: return f3(); case 4: return f4(); case 5: return f5(); case 6: return f6(); } } => LBB0_2: ## %sw.bb callq _f1 popq %rbp ret LBB0_3: ## %sw.bb1 callq _f2 popq %rbp ret LBB0_4: ## %sw.bb3 callq _f3 popq %rbp ret This patch teaches codegenprep to duplicate returns when the return value is a phi and where the phi operands are produced by tail calls followed by an unconditional branch: sw.bb7: ; preds = %entry %call8 = tail call i32 @f5() nounwind br label %return sw.bb9: ; preds = %entry %call10 = tail call i32 @f6() nounwind br label %return return: %retval.0 = phi i32 [ %call10, %sw.bb9 ], [ %call8, %sw.bb7 ], ... [ 0, %entry ] ret i32 %retval.0 This allows codegen to generate better code like this: LBB0_2: ## %sw.bb jmp _f1 ## TAILCALL LBB0_3: ## %sw.bb1 jmp _f2 ## TAILCALL LBB0_4: ## %sw.bb3 jmp _f3 ## TAILCALL rdar://9147433 llvm-svn: 127953	2011-03-19 17:17:39 +00:00
Bill Wendling	865f8b592a	The VTBL (and VTBX) instructions are rather permissive concerning the masks they accept. If a value in the mask is out of range, it uses the value 0, for VTBL, or leaves the value unchanged, for VTBX. llvm-svn: 127700	2011-03-15 21:15:20 +00:00
Bill Wendling	ebecb33307	Some minor cleanups based on feedback. llvm-svn: 127694	2011-03-15 20:47:26 +00:00
Bill Wendling	e1fd78f2bc	Generate a VTBL instruction instead of a series of loads and stores when we can. As Nate pointed out, VTBL isn't super performant, but it has to be better than this: _shuf: @ BB#0: @ %entry push {r4, r7, lr} add r7, sp, #4 sub sp, #12 mov r4, sp bic r4, r4, #7 mov sp, r4 mov r2, sp vmov d16, r0, r1 orr r0, r2, #6 orr r3, r2, #7 vst1.8 {d16[0]}, [r3] vst1.8 {d16[5]}, [r0] subs r4, r7, #4 orr r0, r2, #5 vst1.8 {d16[4]}, [r0] orr r0, r2, #4 vst1.8 {d16[4]}, [r0] orr r0, r2, #3 vst1.8 {d16[0]}, [r0] orr r0, r2, #2 vst1.8 {d16[2]}, [r0] orr r0, r2, #1 vst1.8 {d16[1]}, [r0] vst1.8 {d16[3]}, [r2] vldr.64 d16, [sp] vmov r0, r1, d16 mov sp, r4 pop {r4, r7, pc} The "illegal" testcase in vext.ll is no longer illegal. <rdar://problem/9078775> llvm-svn: 127630	2011-03-14 23:02:38 +00:00
Evan Cheng	383ecd873b	Indentation. llvm-svn: 127595	2011-03-14 18:02:30 +00:00
Bob Wilson	45acbd03db	Fix a compiler crash where a Glue value had multiple uses. Radar 9049552. llvm-svn: 127198	2011-03-08 01:17:20 +00:00
Bob Wilson	70bd363517	Fix comment typos. llvm-svn: 127197	2011-03-08 01:17:16 +00:00
Cameron Zwarich	df61694417	Move getRegPressureLimit() from TargetLoweringInfo to TargetRegisterInfo. llvm-svn: 127175	2011-03-07 21:56:36 +00:00
Bob Wilson	00d09428fe	Remove unused conditional negate operations. llvm-svn: 127090	2011-03-05 16:54:31 +00:00
Evan Cheng	6e3d443646	Fix a typo which cause dag combine crash. rdar://9059537. llvm-svn: 126661	2011-02-28 18:45:27 +00:00
Stuart Hastings	67c5c3e939	Support for byval parameters on ARM. Will be enabled by a forthcoming patch to the front-end. Radar 7662569. llvm-svn: 126655	2011-02-28 17:17:53 +00:00
Evan Cheng	d6b641e5bc	More fcopysign correctness and performance fix. The previous codegen for the slow path (when values are in VFP / NEON registers) was incorrect if the source is NaN. The new codegen uses NEON vbsl instruction to copy the sign bit. e.g. vmov.i32 d1, #0x80000000 vbsl d1, d2, d0 If NEON is not available, it uses integer instructions to copy the sign bit. rdar://9034702 llvm-svn: 126295	2011-02-23 02:24:55 +00:00
Devang Patel	f3292b2196	Revert r124611 - "Keep track of incoming argument's location while emitting LiveIns." In other words, do not keep track of argument's location. The debugger (gdb) is not prepared to see line table entries for arguments. For the debugger, "second" line table entry marks beginning of function body. This requires some coordination with debugger to get this working. - The debugger needs to be aware of prolog_end attribute attached with line table entries. - The compiler needs to accurately mark prolog_end in line table entries (at -O0 and at -O1+) llvm-svn: 126155	2011-02-21 23:21:26 +00:00
Nate Begeman	fa62d50481	Implement sdiv & udiv for <4 x i16> and <8 x i8> NEON vector types. This avoids moving each element to the integer register file and calling __divsi3 etc. on it. llvm-svn: 125402	2011-02-11 20:53:29 +00:00
Evan Cheng	2da1c95993	Fix buggy fcopysign lowering. This define float @foo(float %x, float %y) nounwind readnone { entry: %0 = tail call float @copysignf(float %x, float %y) nounwind readnone ret float %0 } Was compiled to: vmov s0, r1 bic r0, r0, #-2147483648 vmov s1, r0 vcmpe.f32 s0, #0 vmrs apsr_nzcv, fpscr it lt vneglt.f32 s1, s1 vmov r0, s1 bx lr This fails to copy the sign of -0.0f because it's lost during the float to int conversion. Also, it's sub-optimal when the inputs are in GPR registers. Now it uses integer and + or operations when it's profitable. And it's correct! lsrs r1, r1, #31 bfi r0, r1, #31, #1 bx lr rdar://8984306 llvm-svn: 125357	2011-02-11 02:28:55 +00:00
Evan Cheng	e1a4ac9b5b	Fix an obvious typo which caused an isel assertion. rdar://8964854. llvm-svn: 125023	2011-02-07 18:50:47 +00:00
Bob Wilson	06fce87c4a	Add codegen support for using post-increment NEON load/store instructions. The vld1-lane, vld1-dup and vst1-lane instructions do not yet support using post-increment versions, but all the rest of the NEON load/store instructions should be handled now. llvm-svn: 125014	2011-02-07 17:43:21 +00:00
Evan Cheng	d42641c6b5	Given a pair of floating point load and store, if there are no other uses of the load, then it may be legal to transform the load and store to integer load and store of the same width. This is done if the target specified the transformation as profitable. e.g. On arm, this can transform: vldr.32 s0, [] vstr.32 s0, [] to ldr r12, [] str r12, [] rdar://8944252 llvm-svn: 124708	2011-02-02 01:06:55 +00:00
Devang Patel	56cc5fdf09	Keep track of incoming argument's location while emitting LiveIns. llvm-svn: 124611	2011-01-31 21:38:14 +00:00
Anton Korobeynikov	f3a62314f3	Provide correct registers for EH stuff on ARM llvm-svn: 124151	2011-01-24 22:38:45 +00:00
Evan Cheng	2f2435d026	Last round of fixes for movw + movt global address codegen. 1. Fixed ARM pc adjustment. 2. Fixed dynamic-no-pic codegen 3. CSE of pc-relative load of global addresses. It's now enabled by default for Darwin. llvm-svn: 123991	2011-01-21 18:55:51 +00:00
Evan Cheng	b8b0ad80a8	Sorry, several patches in one. TargetInstrInfo: Change produceSameValue() to take MachineRegisterInfo as an optional argument. When in SSA form, targets can use it to make more aggressive equality analysis. Machine LICM: 1. Eliminate isLoadFromConstantMemory, use MI.isInvariantLoad instead. 2. Fix a bug which prevent CSE of instructions which are not re-materializable. 3. Use improved form of produceSameValue. ARM: 1. Teach ARM produceSameValue to look pass some PIC labels. 2. Look for operands from different loads of different constant pool entries which have same values. 3. Re-implement PIC GA materialization using movw + movt. Combine the pair with a "add pc" or "ldr [pc]" to form pseudo instructions. This makes it possible to re-materialize the instruction, allow machine LICM to hoist the set of instructions out of the loop and make it possible to CSE them. It's a bit hacky, but it significantly improve code quality. 4. Some minor bug fixes as well. With the fixes, using movw + movt to materialize GAs significantly outperform the load from constantpool method. 186.crafty and 255.vortex improved > 20%, 254.gap and 176.gcc ~10%. llvm-svn: 123905	2011-01-20 08:34:58 +00:00
Andrew Trick	43f2563114	For ARM subtargets with useNEONForSinglePrecisionFP, double count uses of the floating point types less than 64-bits. It's somewhat of a temporary hack but forces more accurate modeling of register pressure and results in fewer spills. llvm-svn: 123811	2011-01-19 02:35:27 +00:00
Andrew Trick	5eb0a30216	whitespace llvm-svn: 123810	2011-01-19 02:26:13 +00:00
Evan Cheng	68aec147b7	Don't forget to emit the load from indirect symbol when using movw + movt to materialize GA indirect symbols. llvm-svn: 123809	2011-01-19 02:16:49 +00:00
Evan Cheng	dfce83c8f5	Materialize GA addresses with movw + movt pairs for Darwin in PIC mode. e.g. movw r0, :lower16:(L_foo$non_lazy_ptr-(LPC0_0+4)) movt r0, :upper16:(L_foo$non_lazy_ptr-(LPC0_0+4)) LPC0_0: add r0, pc, r0 It's not yet enabled by default as some tests are failing. I suspect bugs in down stream tools. llvm-svn: 123619	2011-01-17 08:03:18 +00:00
Eric Christopher	2af9551ebf	Fix 80-cols. llvm-svn: 123494	2011-01-14 23:50:53 +00:00
Anton Korobeynikov	2f93128109	Rename TargetFrameInfo into TargetFrameLowering. Also, put couple of FIXMEs and fixes here and there. llvm-svn: 123170	2011-01-10 12:39:04 +00:00
Jakob Stoklund Olesen	2fb5b31578	Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic. These functions not longer assert when passed 0, but simply return false instead. No functional change intended. llvm-svn: 123155	2011-01-10 02:58:51 +00:00
Evan Cheng	078b0b095e	Recognize inline asm 'rev /bin/bash, ' as a bswap intrinsic call. llvm-svn: 123048	2011-01-08 01:24:27 +00:00
Bob Wilson	3fa9c064c2	Add an explanatory message for an assertion. llvm-svn: 123042	2011-01-07 23:40:46 +00:00
Matt Beaumont-Gay	5cc7a1fcad	Eliminate variable only used in debug builds. llvm-svn: 123040	2011-01-07 22:34:58 +00:00
Bob Wilson	6f2b8966ca	Lower some BUILD_VECTORS using VEXT+shuffle. Patch by Tim Northover. llvm-svn: 123035	2011-01-07 21:37:30 +00:00
Bob Wilson	8265d56638	Add ARM patterns to match EXTRACT_SUBVECTOR nodes. Also fix an off-by-one in SelectionDAGBuilder that was preventing shuffle vectors from being translated to EXTRACT_SUBVECTOR. Patch by Tim Northover. The test changes are needed to keep those spill-q tests from testing aligned spills and restores. If the only aligned stack objects are spill slots, we no longer realign the stack frame. Prior to this patch, an EXTRACT_SUBVECTOR was legalized by loading from the stack, which created an aligned frame index. Now, however, there is nothing except the spill slot in the stack frame, so I added an aligned alloca. llvm-svn: 122995	2011-01-07 04:59:04 +00:00
Evan Cheng	3ae2b79aa3	Re-implement r122936 with proper target hooks. Now getMaxStoresPerMemcpy etc. takes an option OptSize. If OptSize is true, it would return the inline limit for functions with attribute OptSize. llvm-svn: 122952	2011-01-06 06:52:41 +00:00
Bob Wilson	36be00ceb3	Radar 8803471: Fix expansion of ARM BCCi64 pseudo instructions. If the basic block containing the BCCi64 (or BCCZi64) instruction ends with an unconditional branch, that branch needs to be deleted before appending the expansion of the BCCi64 to the end of the block. llvm-svn: 122521	2010-12-23 22:45:49 +00:00
Bob Wilson	1a20c2aedd	Add ARM-specific DAG combining to cast i64 vector element load/stores to f64. Type legalization splits up i64 values into pairs of i32 values, which leads to poor quality code when inserting or extracting i64 vector elements. If the vector element is loaded or stored, it can be treated as an f64 value and loaded or stored directly from a VPR register. Use the pre-legalization DAG combiner to cast those vector elements to f64 types so that the type legalizer won't mess them up. Radar 8755338. llvm-svn: 122319	2010-12-21 06:43:19 +00:00
Chris Lattner	3e5fbd74ed	rename MVT::Flag to MVT::Glue. "Flag" is a terrible name for something that just glues two nodes together, even if it is sometimes used for flags. llvm-svn: 122310	2010-12-21 02:38:05 +00:00
Bob Wilson	f268d0303b	Add some missing entries in ARMTargetLowering::getTargetNodeName. llvm-svn: 122111	2010-12-18 00:04:26 +00:00
Eric Christopher	347f4c32e8	Don't handle -arm-long-calls in fast isel for now. llvm-svn: 121919	2010-12-15 23:47:29 +00:00
Evan Cheng	c177813755	bfi A, (and B, C1), C2) -> bfi A, B, C2 iff C1 & C2 == C1. rdar://8458663 llvm-svn: 121746	2010-12-14 03:22:07 +00:00
Evan Cheng	2e51bb4ff0	Generalize BFI isel lowering a bit. llvm-svn: 121714	2010-12-13 20:32:54 +00:00
Evan Cheng	3434575704	(or (and (shl A, #shamt), mask), B) => ARMbfi B, A, ~mask where lsb(mask) == #shamt. rdar://8752056 llvm-svn: 121606	2010-12-11 04:11:38 +00:00
Jay Foad	583abbc4df	PR5207: Change APInt methods trunc(), sext(), zext(), sextOrTrunc() and zextOrTrunc(), and APSInt methods extend(), extOrTrunc() and new method trunc(), to be const and to return a new value instead of modifying the object in place. llvm-svn: 121120	2010-12-07 08:25:19 +00:00
Evan Cheng	419ea286ee	Fix and re-enable tail call optimization of expanded libcalls. llvm-svn: 120622	2010-12-01 22:59:46 +00:00
Evan Cheng	d4b0873c06	Enable sibling call optimization of libcalls which are expanded during legalization time. Since at legalization time there is no mapping from SDNode back to the corresponding LLVM instruction and the return SDNode is target specific, this requires a target hook to check for eligibility. Only x86 and ARM support this form of sibcall optimization right now. rdar://8707777 llvm-svn: 120501	2010-11-30 23:55:39 +00:00
Bob Wilson	2d790df105	Add support for NEON VLD2-dup instructions. llvm-svn: 120236	2010-11-28 06:51:26 +00:00
Bob Wilson	62a6f7eda6	Add entry in getTargetNodeName() for ARMISD::VBICIMM. llvm-svn: 120233	2010-11-28 06:51:11 +00:00
Bob Wilson	d7d2cf7842	Recognize sign/zero-extended constant BUILD_VECTORs for VMULL operations. We need to check if the individual vector elements are sign/zero-extended values. For now this only handles constants values. Radar 8687140. llvm-svn: 120034	2010-11-23 19:38:38 +00:00
Wesley Peck	527da1b6e2	Renaming ISD::BIT_CONVERT to ISD::BITCAST to better reflect the LLVM IR concept. llvm-svn: 119990	2010-11-23 03:31:01 +00:00
Evan Cheng	2debc86138	These instructions are thumb2 only. llvm-svn: 119793	2010-11-19 06:28:11 +00:00
Tanya Lattner	cd68095650	Fix bug in DAGCombiner for ARM that was trying to do a ShiftCombine on illegal types (vector should be split first). Added test case. llvm-svn: 119749	2010-11-18 22:06:46 +00:00
Anton Korobeynikov	0eecf5d201	Move hasFP() and few related hooks to TargetFrameInfo. llvm-svn: 119740	2010-11-18 21:19:35 +00:00
Bob Wilson	7d47133ff7	Split up ARM LowerShift function. This function was being called from two different places for completely unrelated reasons. During type legalization, it was called to expand 64-bit shift operations. During operation legalization, it was called to handle Neon vector shifts. The vector shift code was not written to check for illegal types, since it was assumed to be only called after type legalization. Fixed this by splitting off the 64-bit shift expansion into a separate function. I don't have a particular testcase for this; I just noticed it by inspection. llvm-svn: 119738	2010-11-18 21:16:28 +00:00
Nate Begeman	ca52411955	Fix an issue where we tried to turn a v2f32 build_vector into a v4i32 build vector with 2 elts llvm-svn: 118720	2010-11-10 21:35:41 +00:00
Bob Wilson	193722ebc8	Do not use MEMBARRIER_MCR for any Thumb code. It is only supported for ARM code. Normally Thumb2 code would use DMB instead, but depending on how the compiler is invoked (e.g., -mattr=-db) that might be disabled. This prevents a "cannot select MEMBARRIER_MCR" error in that situation. Radar 8644195 llvm-svn: 118642	2010-11-09 22:50:44 +00:00
Jim Grosbach	a942ad4222	Change the ARMConstantPoolValue modifier string to an enumeration. This will help in MC'izing the references that use them. llvm-svn: 118633	2010-11-09 21:36:17 +00:00
Owen Anderson	c7baee31ad	Add support for ARM's specialized vector-compare-against-zero instructions. llvm-svn: 118453	2010-11-08 23:21:22 +00:00
Owen Anderson	a4076924d1	Disallow the certain NEON modified-immediate forms when generating vorr or vbic. llvm-svn: 118300	2010-11-05 21:57:54 +00:00
Owen Anderson	30c4892ea5	Add codegen and encoding support for the immediate form of vbic. llvm-svn: 118291	2010-11-05 19:27:46 +00:00
Evan Cheng	21acf9fb38	Fix @llvm.prefetch isel. Selecting between pld / pldw using the first immediate rw. There is currently no intrinsic that matches to pli. llvm-svn: 118237	2010-11-04 05:19:35 +00:00
Owen Anderson	bc9b31c493	Covert VORRIMM to be produced via early target-specific DAG combining, rather than legalization. This is both the conceptually correct place for it, as well as allowing it to be more aggressive. llvm-svn: 118204	2010-11-03 23:15:26 +00:00
Owen Anderson	0747307049	Add support for code generation of the one register with immediate form of vorr. We could be more aggressive about making this work for a larger range of constants, but this seems like a good start. llvm-svn: 118201	2010-11-03 22:44:51 +00:00
Bob Wilson	ceb49296ef	Check for extractelement with a variable operand for the element number. For NEON we had been assuming this was always an immediate constant. llvm-svn: 118175	2010-11-03 16:24:50 +00:00
Duncan Sands	1462777017	Simplify uses of MVT and EVT. An MVT can be compared directly with a SimpleValueType, while an EVT supports equality and inequality comparisons with SimpleValueType. llvm-svn: 118169	2010-11-03 12:17:33 +00:00
Evan Cheng	8740ee3637	Fix preload instruction isel. Only v7 supports pli, and only v7 with mp extension supports pldw. Add subtarget attribute to denote mp extension support and legalize illegal ones to nothing. llvm-svn: 118160	2010-11-03 06:34:55 +00:00
Evan Cheng	6f36042557	Add support to match @llvm.prefetch to pld / pldw / pli. rdar://8601536. llvm-svn: 118152	2010-11-03 05:14:24 +00:00
Bob Wilson	44be217af1	NEON does not support truncating vector stores. Radar 8598391. llvm-svn: 117940	2010-11-01 18:31:39 +00:00
Bob Wilson	7ed597149b	Overhaul memory barriers in the ARM backend. Radar 8601999. There were a number of issues to fix up here: * The "device" argument of the llvm.memory.barrier intrinsic should be used to distinguish the "Full System" domain from the "Inner Shareable" domain. It has nothing to do with using DMB vs. DSB instructions. * The compiler should never need to emit DSB instructions. Remove the ARMISD::SYNCBARRIER node and also remove the instruction patterns for DSB. * Merge the separate DMB/DSB instructions for options only used for the disassembler with the default DMB/DSB instructions. Add the default "full system" option ARM_MB::SY to the ARM_MB::MemBOpt enum. * Add a separate ARMISD::MEMBARRIER_MCR node for subtargets that implement a data memory barrier using the MCR instruction. * Fix up encodings for these instructions (except MCR). I also updated the tests and added a few new ones to check for DMB options that were not currently being exercised. llvm-svn: 117756	2010-10-30 00:54:37 +00:00
Evan Cheng	0c4c5ca6e1	- Don't schedule nodes with only MVT::Flag and MVT::Other values for latency. - Compute CopyToReg use operand latency correctly. llvm-svn: 117674	2010-10-29 18:07:31 +00:00
John Thompson	e8360b7182	Inline asm multiple alternative constraints development phase 2 - improved basic logic, added initial platform support. llvm-svn: 117667	2010-10-29 17:29:13 +00:00
Bob Wilson	6c55007edb	Fix compiler warnings about signed/unsigned comparisons. llvm-svn: 117511	2010-10-27 23:49:00 +00:00
Bob Wilson	c7334a146e	SelectionDAG shuffle nodes do not allow operands with different numbers of elements than the result vector type. So, when an instruction like: %8 = shufflevector <2 x float> %4, <2 x float> %7, <4 x i32> <i32 1, i32 0, i32 3, i32 2> is translated to a DAG, each operand is changed to a concat_vectors node that appends 2 undef elements. That is: shuffle [a,b], [c,d] is changed to: shuffle [a,b,u,u], [c,d,u,u] That's probably the right thing for x86 but for NEON, we'd much rather have: shuffle [a,b,c,d], undef Teach the DAG combiner how to do that transformation for ARM. Radar 8597007. llvm-svn: 117482	2010-10-27 20:38:28 +00:00
Evan Cheng	817bbac4f7	Enable ARM fastcc. llvm-svn: 117194	2010-10-23 02:19:37 +00:00
Evan Cheng	08dd8c8295	Add fastcc cc: pass and return VFP / NEON values in registers. Controlled by -arm-fastcc for now. llvm-svn: 117119	2010-10-22 18:23:05 +00:00
Dale Johannesen	ff37675c72	Fix crash introduced in 116852. 8573915. llvm-svn: 116955	2010-10-20 22:03:37 +00:00
Jim Grosbach	bbdc5d2ef9	Add a pre-dispatch SjLj EH hook on the unwind edge for targets to do any setup they require. Use this for ARM/Darwin to rematerialize the base pointer from the frame pointer when required. rdar://8564268 llvm-svn: 116879	2010-10-19 23:27:08 +00:00
Dale Johannesen	710a2d9d46	Enable using vdup for vector constants which are splat of integers by default, and remove the controlling flag, now that LICM will hoist such vdup's. 8003375. llvm-svn: 116852	2010-10-19 20:00:17 +00:00
Jim Grosbach	2d00b1b2e5	Don't mark argument value stores as immutable, as otherwise the post-RA scheduler may reorder loads from them before the stores and other such badness. PR8347. Patch by David Meyer llvm-svn: 116602	2010-10-15 18:34:47 +00:00
Bob Wilson	3b1db392fc	Remove unused ARMISD::AND selection DAG node. llvm-svn: 116566	2010-10-15 04:34:40 +00:00
Anton Korobeynikov	81bdc93bbb	User proper libcall names & condcodes while compiling for ARM EABI. Patch by Evzen Muller! llvm-svn: 114991	2010-09-28 21:39:26 +00:00
Bob Wilson	3dc97324c1	Add a command line option "-arm-strict-align" to disallow unaligned memory accesses for ARM targets that would otherwise allow it. Radar 8465431. llvm-svn: 114941	2010-09-28 04:09:35 +00:00
Evan Cheng	dbcc4b4d4d	Enable code placement optimization pass for ARM. llvm-svn: 114746	2010-09-24 19:07:23 +00:00
Jim Grosbach	85dcd3d0f4	Add support for ELF PLT references for ARM MC asm printing. Adding a new VariantKind to the MCSymbolExpr seems like overkill, but I'm not sure there's a more straightforward way to get the printing difference captured. (i.e., x86 uses @PLT, ARM uses (PLT)). llvm-svn: 114613	2010-09-22 23:27:36 +00:00
Bob Wilson	463a05342a	Change VDUPLANE DAG combiner to just return the result instead of calling CombineTo to avoid putting the result on the worklist. I don't think it makes much difference for now, but it might help someday as we add more DAG combine optimizations. llvm-svn: 114595	2010-09-22 22:27:30 +00:00
Bob Wilson	2280674fa9	Combine both VMOVDRR(VMOVRRD) and VMOVRRD(VMOVDRR), instead of just doing one of those. Refactor to share code for handling BUILD_VECTOR(VMOVRRD). I don't have a testcase that exercises this, but it seems like an obvious good thing to do. llvm-svn: 114589	2010-09-22 22:09:21 +00:00
Owen Anderson	61158f98ab	Enable target-specific mul-lowering on ARM, even at -Os. Remove a test that this makes irrelevant, but add a new test for the new, improved functionality. llvm-svn: 114494	2010-09-21 22:51:46 +00:00
Chris Lattner	886250c8f0	convert a couple more places to use the new getStore() llvm-svn: 114463	2010-09-21 18:51:21 +00:00
Bob Wilson	5549d496dd	Define the TargetLowering::getTgtMemIntrinsic hook for ARM so that NEON load and store intrinsics are represented with MemIntrinsicSDNodes. llvm-svn: 114454	2010-09-21 17:56:22 +00:00
Chris Lattner	7727d05dbb	convert the targets off the non-MachinePointerInfo of getLoad. llvm-svn: 114410	2010-09-21 06:44:06 +00:00
Chris Lattner	2510de2bea	reimplement memcpy/memmove/memset lowering to use MachinePointerInfo instead of srcvalue/offset pairs. This corrects SV info for mem operations whose size is > 32-bits. llvm-svn: 114401	2010-09-21 05:40:29 +00:00
Bob Wilson	cb6db98897	Add target-specific DAG combiner for BUILD_VECTOR and VMOVRRD. An i64 value should be in GPRs when it's going to be used as a scalar, and we use VMOVRRD to make that happen, but if the value is converted back to a vector we need to fold to a simple bit_convert. Radar 8407927. llvm-svn: 114233	2010-09-17 22:59:05 +00:00
Eric Christopher	1c06917f15	Split out some of the calling convention bits so that they can be used for fast-isel. llvm-svn: 113652	2010-09-10 22:42:06 +00:00
Evan Cheng	bf4070756f	Teach if-converter to be more careful with predicating instructions that would take multiple cycles to decode. For the current if-converter clients (actually only ARM), the instructions that are predicated on false are not nops. They would still take machine cycles to decode. Micro-coded instructions such as LDM / STM can potentially take multiple cycles to decode. If-converter should take treat them as non-micro-coded simple instructions. llvm-svn: 113570	2010-09-10 01:29:16 +00:00
Jim Grosbach	535d3b4e09	remove trailing whitespace llvm-svn: 113338	2010-09-08 03:54:02 +00:00
Bob Wilson	f65c9ef720	Replace NEON vabdl, vaba, and vabal intrinsics with combinations of the vabd intrinsic and add and/or zext operations. In the case of vaba, this also avoids the need for a DAG combine pattern to combine vabd with add. Update tests. Auto-upgrade the old intrinsics. llvm-svn: 112941	2010-09-03 01:35:08 +00:00
Bob Wilson	38ab35a911	Remove NEON vmull, vmlal, and vmlsl intrinsics, replacing them with multiply, add, and subtract operations with zero-extended or sign-extended vectors. Update tests. Add auto-upgrade support for the old intrinsics. llvm-svn: 112773	2010-09-01 23:50:19 +00:00
Bill Wendling	0a65116cce	Create an ARMISD::AND node. This node is exactly like the "ARM::AND" node, but it sets the CPSR register. llvm-svn: 112393	2010-08-29 03:02:11 +00:00
Daniel Dunbar	a54a1b0edf	ARM/Thumb2: Fix a misselect in getARMCmp, when attempting to adjust a signed comparison that would overflow. - The other under/overflow cases can't actually happen because the immediates which would trigger them are legal (so we don't enter this code), but adjusted the style to make it clear the transform is always valid. llvm-svn: 112053	2010-08-25 16:58:05 +00:00
Bob Wilson	9a511c07e4	Replace the arm.neon.vmovls and vmovlu intrinsics with vector sign-extend and zero-extend operations. llvm-svn: 111614	2010-08-20 04:54:02 +00:00
Bob Wilson	fb7eaff759	Expand ZERO_EXTEND operations for NEON vector types. Testcase from Nick Lewycky. llvm-svn: 111341	2010-08-18 01:45:52 +00:00
Bob Wilson	411dfad981	Allow more cases of undef shuffle indices and add tests for them. llvm-svn: 111226	2010-08-17 05:54:34 +00:00
Bob Wilson	c350e7a509	Ignore undef shuffle indices when checking for a VTRN shuffle. Radar 8290937. llvm-svn: 111208	2010-08-16 23:37:17 +00:00
Bob Wilson	3c9ed76ba5	Temporarily disable tail calls on ARM to work around some linker problems. llvm-svn: 111050	2010-08-13 22:43:33 +00:00
Jim Grosbach	4d5dc3e7e5	cortex m4 has floating point support, but only single precision. llvm-svn: 110810	2010-08-11 15:44:15 +00:00
Bill Wendling	6a98131468	Consider this code snippet: float t1(int argc) { return (argc == 1123) ? 1.234f : 2.38213f; } We would generate truly awful code on ARM (those with a weak stomach should look away): _t1: movw r1, #1123 movs r2, #1 movs r3, #0 cmp r0, r1 mov.w r0, #0 it eq moveq r0, r2 movs r1, #4 cmp r0, #0 it ne movne r3, r1 adr r0, #LCPI1_0 ldr r0, [r0, r3] bx lr The problem was that legalization was creating a cascade of SELECT_CC nodes, for for the comparison of "argc == 1123" which was fed into a SELECT node for the ?: statement which was itself converted to a SELECT_CC node. This is because the ARM back-end doesn't have custom lowering for SELECT nodes, so it used the default "Expand". I added a fairly simple "LowerSELECT" to the ARM back-end. It takes care of this testcase, but can obviously be expanded to include more cases. Now we generate this, which looks optimal to me: _t1: movw r1, #1123 movs r2, #0 cmp r0, r1 adr r0, #LCPI0_0 it eq moveq r2, #4 ldr r0, [r0, r2] bx lr .align 2 LCPI0_0: .long 1075344593 @ float 2.382130e+00 .long 1067316150 @ float 1.234000e+00 llvm-svn: 110799	2010-08-11 08:43:16 +00:00
Evan Cheng	6e809de90c	- Add subtarget feature -mattr=+db which determine whether an ARM cpu has the memory and synchronization barrier dmb and dsb instructions. - Change instruction names to something more sensible (matching name of actual instructions). - Added tests for memory barrier codegen. llvm-svn: 110785	2010-08-11 06:22:01 +00:00
Evan Cheng	fa16acae44	Delete some unused instructions. llvm-svn: 110710	2010-08-10 19:36:22 +00:00
Evan Cheng	3f251fb26e	Re-apply r110655 with fixes. Epilogue must restore sp from fp if the function stack frame has a var-sized object. Also added a test case to check for the added benefit of this patch: it's optimizing away the unnecessary restore of sp from fp for some non-leaf functions. llvm-svn: 110707	2010-08-10 19:30:19 +00:00
Daniel Dunbar	0dd47bfca3	Revert r110655, "Fix ARM hasFP() semantics. It should return true whenever FP register is", it breaks a couple test-suite tests. llvm-svn: 110701	2010-08-10 18:32:02 +00:00
Evan Cheng	8d5d1c1331	Fix ARM hasFP() semantics. It should return true whenever FP register is reserved, not available for general allocation. This eliminates all the extra checks for Darwin. This change also fixes the use of FP to access frame indices in leaf functions and cleaned up some confusing code in epilogue emission. llvm-svn: 110655	2010-08-10 06:26:49 +00:00
Dale Johannesen	21f13209f8	Remove switch for disabling ARM tail calls. They seem to be working correctly. No functional change. llvm-svn: 110226	2010-08-04 18:07:17 +00:00
Bob Wilson	79daf7e0ae	Combine NEON VABD (absolute difference) intrinsics with ADDs to make VABA (absolute difference with accumulate) intrinsics. Radar 8228576. llvm-svn: 110170	2010-08-04 00:12:08 +00:00
Nate Begeman	b69b182191	Add support for getting & setting the FPSCR application register on ARM when VFP is enabled. Add support for using the FPSCR in conjunction with the vcvtr instruction, for controlling fp to int rounding. Add support for the FLT_ROUNDS_ node now that the FPSCR is exposed. llvm-svn: 110152	2010-08-03 21:31:55 +00:00
Bob Wilson	728eb292eb	Refactor ARM-specific DAG combining in preparation for adding some more transformations. llvm-svn: 109800	2010-07-29 20:34:14 +00:00
Dale Johannesen	2bff50546c	Implement vector constants which are splat of integers with mov + vdup. 8003375. This is currently disabled by default because LICM will not hoist a VDUP, so it pessimizes the code if the construct occurs inside a loop (8248029). llvm-svn: 109799	2010-07-29 20:10:08 +00:00
Anton Korobeynikov	19edda0323	Hook in GlobalMerge pass llvm-svn: 109359	2010-07-24 21:52:08 +00:00
Jim Grosbach	0acbcb1a60	Use the appropriate register class for an i32 when adding ARM::LR to the function live in set. This will give us tGPR for Thumb1 and GPR otherwise, so the copy will be spillable. rdar://8224931 llvm-svn: 109293	2010-07-23 23:50:35 +00:00
Evan Cheng	df907f4594	- Allow target to specify when is register pressure "too high". In most cases, it's too late to start backing off aggressive latency scheduling when most of the registers are in use so the threshold should be a bit tighter. - Correctly handle live out's and extract_subreg etc. - Enable register pressure aware scheduling by default for hybrid scheduler. For ARM, this is almost always a win on # of instructions. It's runtime neutral for most of the tests. But for some kernels with high register pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by 54 and sped up by 20%. llvm-svn: 109279	2010-07-23 22:39:59 +00:00
Chandler Carruth	a1d7516cb7	Mark an assert-only variable as used. llvm-svn: 109091	2010-07-22 08:02:25 +00:00
Evan Cheng	285903853f	More register pressure aware scheduling work. llvm-svn: 109064	2010-07-21 23:53:58 +00:00
Eric Christopher	84bdfd80df	Baby steps towards ARM fast-isel. llvm-svn: 109047	2010-07-21 22:26:11 +00:00
Rafael Espindola	4277e14dc4	Fix calling convention on ARM if vfp2+ is enabled. llvm-svn: 109009	2010-07-21 11:38:30 +00:00
Evan Cheng	a77f3d3b37	Teach bottom up pre-ra scheduler to track register pressure. Work in progress. llvm-svn: 108991	2010-07-21 06:09:07 +00:00
Jim Grosbach	9c7708cc1b	Removed un-used code. llvm-svn: 108841	2010-07-20 14:51:32 +00:00
Evan Cheng	10f99a3490	ARM has to provide its own TargetLowering::findRepresentativeClass because its scalar floating point registers alias its vector registers. llvm-svn: 108761	2010-07-19 22:15:08 +00:00
Jim Grosbach	8d3ba7349c	Since ARM emits inline jump tables as part of the ConstantIsland pass, it should set the jump table encloding the EK_Inline. This prevents a second, unused, copy of the table from being emitted after the function body. PR6581. llvm-svn: 108730	2010-07-19 17:20:38 +00:00
Jim Grosbach	d9ad52adff	revert so I can get the right PR# in the log message. llvm-svn: 108727	2010-07-19 17:19:40 +00:00
Jim Grosbach	c685756cfb	Since ARM emits inline jump tables as part of the ConstantIsland pass, it should set the jump table encloding the EK_Inline. This prevents a second, unused, copy of the table from being emitted after the function body. PR7499. llvm-svn: 108722	2010-07-19 17:18:28 +00:00
Jim Grosbach	b97e2bbe32	Add combiner patterns to more effectively utilize the BFI (bitfield insert) instruction for non-constant operands. This includes the case referenced in the README.txt regarding a bitfield copy. llvm-svn: 108608	2010-07-17 03:30:54 +00:00
Jim Grosbach	6e3b5fa91c	add BFI to getTargetNodeName() llvm-svn: 108603	2010-07-17 01:50:57 +00:00
Jim Grosbach	adc81f8ee8	Fix logic think-o llvm-svn: 108601	2010-07-17 01:22:19 +00:00
Jim Grosbach	11013eda5a	Add basic support to code-gen the ARM/Thumb2 bit-field insert (BFI) instruction and a combine pattern to use it for setting a bit-field to a constant value. More to come for non-constant stores. llvm-svn: 108570	2010-07-16 23:05:05 +00:00
Evan Cheng	55f0c6b9fc	Split -enable-finite-only-fp-math to two options: -enable-no-nans-fp-math and -enable-no-infs-fp-math. All of the current codegen fp math optimizations only care whether the fp arithmetics arguments and results can never be NaN. llvm-svn: 108465	2010-07-15 22:07:12 +00:00
Bob Wilson	bad47f62f6	Add support for NEON VMVN immediate instructions. llvm-svn: 108324	2010-07-14 06:31:50 +00:00
Bob Wilson	103a0dcfe1	Add an ARM-specific DAG combining to avoid redundant VDUPLANE nodes. Radar 7373643. llvm-svn: 108303	2010-07-14 01:22:12 +00:00
Bob Wilson	a3f1901531	Use a target-specific VMOVIMM DAG node instead of BUILD_VECTOR to represent NEON VMOV-immediate instructions. This simplifies some things. llvm-svn: 108275	2010-07-13 21:16:48 +00:00
Evan Cheng	0cc4ad983d	Extend the r107852 optimization which turns some fp compare to code sequence using only i32 operations. It now optimize some f64 compares when fp compare is exceptionally slow (e.g. cortex-a8). It also catches comparison against 0.0. llvm-svn: 108258	2010-07-13 19:27:42 +00:00
Bob Wilson	c1c6f4796e	Move NEON "modified immediate" encode/decode into ARMAddressingModes.h to avoid replicated code. llvm-svn: 108227	2010-07-13 04:44:34 +00:00
Bob Wilson	8a2bdc8231	Remove some code that doesn't appear to do anything. All the ARM call instructions already have implicit defs of LR. The comment suggests that this is intended to fix something like pr6111, but it doesn't really do that either. llvm-svn: 108186	2010-07-12 20:22:45 +00:00
Rafael Espindola	a76eccf815	Fix va_arg for doubles. With this patch VAARG nodes always contain the correct alignment information, which simplifies ExpandRes_VAARG a bit. The patch introduces a new alignment information to TargetLoweringInfo. This is needed since the two natural candidates cannot be used: * The 's' in target data: If this is set to the minimal alignment of any argument, getCallFrameTypeAlignment would return 4 for doubles on ARM for example. * The getTransientStackAlignment method. It is possible for an architecture to have argument less aligned than what we maintain the stack pointer. llvm-svn: 108072	2010-07-11 04:01:49 +00:00
Evan Cheng	0f54854a1d	Check for FiniteOnlyFPMath as well. llvm-svn: 107904	2010-07-08 20:12:24 +00:00
Evan Cheng	be1f7a931e	r107852 is only safe with -enable-unsafe-fp-math to account for +0.0 == -0.0. llvm-svn: 107856	2010-07-08 06:01:49 +00:00
Evan Cheng	25f9364cbd	Optimize some vfp comparisons to integer ones. This patch implements the simplest case when the following conditions are met: 1. The arguments are f32. 2. The arguments are loads and they have no uses other than the comparison. 3. The comparison code is EQ or NE. e.g. vldr.32 s0, [r1] vldr.32 s1, [r0] vcmpe.f32 s1, s0 vmrs apsr_nzcv, fpscr beq LBB0_2 => ldr r1, [r1] ldr r0, [r0] cmp r0, r1 beq LBB0_2 More complicated cases will be implemented in subsequent patches. llvm-svn: 107852	2010-07-08 02:08:50 +00:00
Dale Johannesen	e2289285ae	Changes to ARM tail calls, mostly cosmetic. Add explicit testcases for tail calls within the same module. Duplicate some code to humor those who think .w doesn't apply on ARM. Leave this disabled on Thumb1, and add some comments explaining why it's hard and won't gain much. llvm-svn: 107851	2010-07-08 01:18:23 +00:00
Dan Gohman	fe7532a308	Split the SDValue out of OutputArg so that SelectionDAG-independent code can do calling-convention queries. This obviates OutputArgReg. llvm-svn: 107786	2010-07-07 15:54:55 +00:00
Jim Grosbach	3198483851	Mark eh.sjlj.set/longjmp custom lowerings as Darwin-only since that's where they've been tested to work. llvm-svn: 107742	2010-07-07 00:07:57 +00:00
Jim Grosbach	dc0a0659be	By default, the eh.sjlj.setjmp/longjmp intrinsics should just do nothing rather than assuming a target will custom lower them. Targets which do so should exlicitly mark them as having custom lowerings. PR7454. llvm-svn: 107734	2010-07-06 23:44:52 +00:00
Devang Patel	a3ca21b228	Propagate debug loc. llvm-svn: 107710	2010-07-06 22:08:15 +00:00
Dan Gohman	3439629239	Reapply r107655 with fixes; insert the pseudo instruction into the block before calling the expansion hook. And don't put EFLAGS in a mbb's live-in list twice. llvm-svn: 107691	2010-07-06 20:24:04 +00:00
Dan Gohman	f4f04107ef	Revert r107655. llvm-svn: 107668	2010-07-06 15:49:48 +00:00
Dan Gohman	12205645a6	Fix a bunch of custom-inserter functions to handle the case where the pseudo instruction is not at the end of the block. llvm-svn: 107655	2010-07-06 15:18:19 +00:00
Evan Cheng	0664a67fe1	Remove isSS argument from CreateFixedObject. Fixed objects cannot be spill slots so it's always false. llvm-svn: 107550	2010-07-03 00:40:23 +00:00
Bob Wilson	8a99b730a9	ARM function alignments were off by a power of two. svn 83242 changed getFunctionAlignment and the corresponding use of that value in the ARM asm printer, but now we're using the standard asm printer. The result of this was that function alignments were dropped completely for Thumb functions. Radar 8143571. llvm-svn: 107435	2010-07-01 22:26:26 +00:00
Duncan Sands	6d28e73acc	Remove initialized but otherwise unused variables. llvm-svn: 107127	2010-06-29 11:22:26 +00:00
Eli Friedman	8cfa7713e9	Followup to r106770: actually generate SXTB and SXTH for sign-extensions. llvm-svn: 106940	2010-06-26 04:36:50 +00:00
Evan Cheng	b71233f34d	It's now possible to run code placement pass for ARM. llvm-svn: 106935	2010-06-26 01:52:05 +00:00
Evan Cheng	02b184de5b	Change if-conversion block size limit checks to add some flexibility. llvm-svn: 106901	2010-06-25 22:42:03 +00:00
Dale Johannesen	ce97d55ad9	The hasMemory argument is irrelevant to how the argument for an "i" constraint should get lowered; PR 6309. While this argument was passed around a lot, this is the only place it was used, so it goes away from a lot of other places. llvm-svn: 106893	2010-06-25 21:55:36 +00:00
Bob Wilson	eadbf9732f	Reduce indentation. llvm-svn: 106819	2010-06-25 04:12:31 +00:00
Dale Johannesen	d24c66b4a3	Do not do tail calls to external symbols. If the branch turns out to be ARM-to-Thumb or vice versa the linker cannot resolve this. 8120438. If this optimization is going to be useful we probably need a compiler flag "assume callees are same architecture" or something like that. llvm-svn: 106662	2010-06-23 18:52:34 +00:00
Jim Grosbach	a8ea498171	When using libcall expansions for the atomic intrinsics, the explicit MEMBARRIER fences aren't necessary for ARM. Tell the combiner to fold them away. llvm-svn: 106631	2010-06-23 16:08:49 +00:00
Bob Wilson	72df24037e	sign_extend_inreg needs to be expanded for pre-v6 Thumb as well as ARM. Radar 8104310. llvm-svn: 106484	2010-06-21 21:27:34 +00:00
Bob Wilson	0ae08935f6	Fix error message to match function name. llvm-svn: 106381	2010-06-19 05:32:09 +00:00
Evan Cheng	f3c01f3ef6	Disable sibcall optimization for Thumb1 for now since Thumb1RegisterInfo::emitEpilogue is not expecting them. llvm-svn: 106368	2010-06-19 01:01:32 +00:00
Jim Grosbach	a57c2885cf	back-end libcall handling for ATOMIC_SWAP (__sync_lock_test_and_set) llvm-svn: 106342	2010-06-18 23:03:10 +00:00
Jim Grosbach	6860bb7796	Enable Expand handling of atomics for subtargets that can't do them inline. llvm-svn: 106336	2010-06-18 22:35:32 +00:00
Dale Johannesen	c1570dda5c	Enable tail calls on ARM by default, with some basic tests. This has been well tested on Darwin but not elsewhere. It should work provided the linker correctly resolves B.W <label in other function> which it has not seen before, at least from llvm-based compilers. I'm leaving the arm-tail-calls switch in until I see if there's any problems because of that; it might need to be disabled for some environments. llvm-svn: 106299	2010-06-18 19:00:18 +00:00
Dale Johannesen	3ac52b3e43	Last round of changes for ARM tail calls. Not turning them on yet. llvm-svn: 106295	2010-06-18 18:13:11 +00:00
Jakob Stoklund Olesen	b9f91667e1	Treat the ARM inline asm {cc} constraint as a physreg (%CPSR), just like X86 does for {flags}. If we create virtual registers of the CCR class, RegAllocFast may try to spill them, and we can't do that. llvm-svn: 106289	2010-06-18 16:49:33 +00:00
Jim Grosbach	5712c77c89	Thumb1 and any pre-v6 ARM target should use the libcall expansion of ISD::MEMBARRIER. v7 and v7 ARM mode continue to use the custom lowering. llvm-svn: 106204	2010-06-17 02:02:03 +00:00
Jim Grosbach	6e758c97fd	simplify code a bit and add a more explanatory assert for cases that previously would result in 'cannot yet select' errors. llvm-svn: 106199	2010-06-17 01:37:00 +00:00
Jim Grosbach	e3864cc15e	format and 80-column cleanup llvm-svn: 106173	2010-06-16 23:45:49 +00:00
Bob Wilson	01ac8f9fc0	Remove the hidden "neon-reg-sequence" option. The reg sequences are working now, so there's no need to disable them. llvm-svn: 106155	2010-06-16 21:34:01 +00:00
Evan Cheng	f128bdcb55	Make post-ra scheduling, anti-dep breaking, and register scavenger (conservatively) aware of predicated instructions. This enables ARM to move if-conversion before post-ra scheduler. llvm-svn: 106091	2010-06-16 07:35:02 +00:00
Dale Johannesen	44f9dfc9cf	Next round of tail call changes. Register used in a tail call must not be callee-saved; following x86, add a new regclass to represent this. Also fixes a couple of bugs. Still disabled by default; Thumb doesn't work yet. llvm-svn: 106053	2010-06-15 22:08:33 +00:00
Bob Wilson	f3f7a770b7	Add basic support for NEON modified immediates besides VMOV. llvm-svn: 106030	2010-06-15 19:05:35 +00:00
Bob Wilson	5b2b504038	Rename functions referring to VMOV immediates to refer to NEON "modified immediate" operands. These functions have so far only been used for VMOV but they also apply to other NEON instructions with modified immediate operands. No functional changes. llvm-svn: 105969	2010-06-14 22:19:57 +00:00
Bob Wilson	f07d33d8f1	Add a missing bitcast. This code used to only handle conversions between i64 and f64 types, but now it also handle Neon vector types, so the f64 result of VMOVDRR may need to be converted to a Neon type. Radar 8084742. llvm-svn: 105845	2010-06-11 22:45:25 +00:00
Bob Wilson	6eae520de9	Add instruction encoding for the Neon VMOV immediate instruction. This changes the machine instruction representation of the immediate value to be encoded into an integer with similar fields as the actual VMOV instruction. This makes things easier for the disassembler, since it can just stuff the bits into the immediate operand, but harder for the asm printer since it has to decode the value to be printed. Testcase for the encoding will follow later when MC has more support for ARM. llvm-svn: 105836	2010-06-11 21:34:50 +00:00
Bob Wilson	846bd7992c	Further changes for Neon vector shuffles: - change isShuffleMaskLegal to show that all shuffles with 32-bit and 64-bit elements are legal - the Neon shuffle instructions do not support 64-bit elements, but we were not checking for that before lowering shuffles to use them - remove some 64-bit element vduplane patterns that are no longer needed llvm-svn: 105586	2010-06-07 23:53:38 +00:00
Dale Johannesen	81ef35b3ca	Improvements to tail call code. No functional effect unless using -arm-tail-calls. llvm-svn: 105515	2010-06-05 00:51:39 +00:00
Dale Johannesen	d1b9311afa	More thoroughly disable tails calls by default. 8060143, although this doesn't fix the real problem with tail call. llvm-svn: 105472	2010-06-04 18:04:24 +00:00
Bob Wilson	d8a9a04739	For NEON vectors with 32- or 64-bit elements, select BUILD_VECTORs and VECTOR_SHUFFLEs to REG_SEQUENCE instructions. The standard ISD::BUILD_VECTOR node corresponds closely to REG_SEQUENCE but I couldn't use it here because its operands do not get legalized. That is pretty awful, but I guess it makes sense for other targets. Instead, I have added an ARM-specific version of BUILD_VECTOR that will have its operands properly legalized. This fixes the rest of Radar 7872877. llvm-svn: 105439	2010-06-04 00:04:02 +00:00
Dale Johannesen	d679ff7330	Early implementation of tail call for ARM. A temporary flag -arm-tail-calls defaults to off, so there is no functional change by default. Intrepid users may try this; simple cases work but there are bugs. llvm-svn: 105413	2010-06-03 21:09:53 +00:00
Jim Grosbach	84511e1526	Clean up 80 column violations. No functional change. llvm-svn: 105350	2010-06-02 21:53:11 +00:00
Evan Cheng	bf91499f1a	Schedule high latency instructions for latency reduction even if they are not vfp / NEON instructions. llvm-svn: 105060	2010-05-28 23:25:23 +00:00
Jim Grosbach	faa3abbe39	Update the saved stack pointer in the sjlj function context following either an alloca() or an llvm.stackrestore(). rdar://8031573 llvm-svn: 104900	2010-05-27 23:49:24 +00:00
Jim Grosbach	c9f532dddc	back out 104862/104869. Can reuse stacksave after all. Very cool. llvm-svn: 104897	2010-05-27 23:11:57 +00:00
Jim Grosbach	5cde219fb1	add ISD::STACKADDR to get the current stack pointer. Will be used by sjlj EH to update the jmpbuf in the presence of VLAs. llvm-svn: 104862	2010-05-27 18:23:48 +00:00
Jim Grosbach	c98892fdaa	Adjust eh.sjlj.setjmp to properly have a chain and to have an opcode entry in ISD::. No functional change. llvm-svn: 104734	2010-05-26 20:22:18 +00:00
Bob Wilson	26fdebcae9	Clean up indentation. llvm-svn: 104580	2010-05-25 03:36:52 +00:00
Evan Cheng	755d45be43	LR is in GPR, not tGPR even in Thumb1 mode. llvm-svn: 104518	2010-05-24 18:00:18 +00:00
Bob Wilson	49f40e8c32	VDUP doesn't support vectors with 64-bit elements. llvm-svn: 104455	2010-05-23 05:42:31 +00:00
Evan Cheng	168ced94d8	Implement @llvm.returnaddress. rdar://8015977. llvm-svn: 104421	2010-05-22 01:47:14 +00:00
Jim Grosbach	bd9485db63	Implement eh.sjlj.longjmp for ARM. Clean up the intrinsic a bit. Followups: docs patch for the builtin and eh.sjlj.setjmp cleanup to match longjmp. llvm-svn: 104419	2010-05-22 01:06:18 +00:00
Bob Wilson	91fdf68516	Recognize more BUILD_VECTORs and VECTOR_SHUFFLEs that can be implemented by copying VFP subregs. This exposed a bunch of dead code in the *spill-q.ll tests, so I tweaked those tests to keep that code from being optimized away. Radar 7872877. llvm-svn: 104415	2010-05-22 00:23:12 +00:00
Evan Cheng	34c260458a	Change ARM scheduling default to list-hybrid if the target supports floating point instructions (and is not using soft float). llvm-svn: 104307	2010-05-21 00:43:17 +00:00
Evan Cheng	4401f8873c	Allow targets more controls on what nodes are scheduled by reg pressure, what for latency in hybrid mode. llvm-svn: 104293	2010-05-20 23:26:43 +00:00
Bob Wilson	5954994bba	Handle Neon v2f64 and v2i64 vector shuffles as register copies. This fixes the remaining issue with pr7167. llvm-svn: 104257	2010-05-20 18:39:53 +00:00
Evan Cheng	738e920edf	Code refactoring: pull SchedPreference enum from TargetLowering.h to TargetMachine.h and put it in its own namespace. llvm-svn: 104147	2010-05-19 20:19:50 +00:00
Evan Cheng	f19384d54a	Sink dag combine's post index load / store code that swap base ptr and index into the target hook. Only the target knows whether the swap is safe. In Thumb2 mode, the offset must be an immediate. rdar://7998649 llvm-svn: 104060	2010-05-18 21:31:17 +00:00
Anton Korobeynikov	4c719c4515	Generalize the ARM DAG combiner of mul with constants to all power-of-two cases. llvm-svn: 103901	2010-05-16 08:54:20 +00:00
Anton Korobeynikov	1bf28a128b	Some cheap DAG combine goodness for multiplication with a particular constant. This can be extended later on to handle more "complex" constants. llvm-svn: 103881	2010-05-15 18:16:59 +00:00
Evan Cheng	3d214cdfaf	v4i64 and v8i64 are only synthesizable when NEON is available. llvm-svn: 103855	2010-05-15 02:20:21 +00:00
Evan Cheng	4cad68eb34	Allow TargetLowering::getRegClassFor() to be called on illegal types. Also allow target to override it in order to map register classes to illegal but synthesizable types. e.g. v4i64, v8i64 for ARM / NEON. llvm-svn: 103854	2010-05-15 02:18:07 +00:00
Evan Cheng	cd67c21407	Added a QQQQ register file to model 4-consecutive Q registers. llvm-svn: 103760	2010-05-14 02:13:41 +00:00
Dan Gohman	bb919dfb6b	Implement a bunch more TargetSelectionDAGInfo infrastructure. Move EmitTargetCodeForMemcpy, EmitTargetCodeForMemset, and EmitTargetCodeForMemmove out of TargetLowering and into SelectionDAGInfo to exercise this. llvm-svn: 103481	2010-05-11 17:31:57 +00:00
Evan Cheng	2fa5a7e7e4	Select @llvm.trap to the special B with 1111 condition (i.e. trap) instruction. llvm-svn: 103459	2010-05-11 07:26:32 +00:00
Evan Cheng	c2ae5f546f	Model vld2 / vst2 with reg_sequence. llvm-svn: 103411	2010-05-10 17:34:18 +00:00
Jim Grosbach	2a41cad900	Clean up the conditional for handling of sign_extend_inreg based on whether the extract instructions are available. rdar://7956878 llvm-svn: 103277	2010-05-07 18:34:55 +00:00
Jim Grosbach	151cd8f159	Cleanup of ARMv7M support. Move hardware divide and Thumb2 extract/pack instructions to subtarget features and update tests to reflect. PR5717. llvm-svn: 103136	2010-05-05 23:44:43 +00:00
Jim Grosbach	92d999001c	Add initial support for ARMv7M subtarget and cortex-m3 cpu. Patch by Jordy <snhjordy@gmail.com>. Followup patches will add some tests and adjust to use Subtarget features for the instructions. llvm-svn: 103119	2010-05-05 20:44:35 +00:00
Evan Cheng	d85631e700	Model CONCAT_VECTORS of two 64-bit values as a REG_SEQUENCE. llvm-svn: 103104	2010-05-05 18:28:36 +00:00
Dan Gohman	25c1653700	Get rid of the EdgeMapping map. Instead, just check for BasicBlock changes before doing phi lowering for switches. llvm-svn: 102809	2010-05-01 00:01:06 +00:00
Dan Gohman	21cea8ac2e	Use const qualifiers with TargetLowering. This eliminates several const_casts, and it reinforces the design of the Target classes being immutable. SelectionDAGISel::IsLegalToFold is now a static member function, because PIC16 uses it in an unconventional way. There is more room for API cleanup here. And PIC16's AsmPrinter no longer uses TargetLowering. llvm-svn: 101635	2010-04-17 15:26:15 +00:00
Dan Gohman	31ae586c74	Move per-function state out of TargetLowering subclasses and into MachineFunctionInfo subclasses. llvm-svn: 101634	2010-04-17 14:41:14 +00:00
Bob Wilson	59b70eacad	Revise my previous change to ExpandBIT_CONVERT. I hadn't realized that this may be called when either the source or destination type is i64, and my change also hadn't fixed the most obvious problem -- assuming that i64 will only be bitconverted to f64, ignoring the various vector types. Radar 7873160. llvm-svn: 101615	2010-04-17 05:30:19 +00:00
Evan Cheng	f7f97b4bbd	Use default lowering of DYNAMIC_STACKALLOC. As far as I can tell, ARM isle is doing the right thing and codegen looks correct for both Thumb and Thumb2. llvm-svn: 101410	2010-04-15 22:20:34 +00:00
Anders Carlsson	47bccf7f28	Fix build. llvm-svn: 101335	2010-04-15 03:11:28 +00:00
Dan Gohman	bcaf681cde	Add const qualifiers to CodeGen's use of LLVM IR constructs. llvm-svn: 101334	2010-04-15 01:51:59 +00:00
Jim Grosbach	32bb362655	Add -arm-long-calls option to force calls to be indirect. This makes the kernel linker happier when dealing with kexts. Radar 7805069 llvm-svn: 101303	2010-04-14 22:28:31 +00:00
Bob Wilson	c05b887c84	Don't custom lower bit converts to ARM VMOVDRRD or VMOVDRR when the operand does not have a legal type. The legalizer does not know how to handle those nodes. Radar 7854640. llvm-svn: 101282	2010-04-14 20:45:23 +00:00
Bob Wilson	699bdf7adf	Handle a v2f64 formal parameter that is split between registers and memory such that the entire second half is in memory. Radar 7855014. llvm-svn: 101181	2010-04-13 22:03:22 +00:00
Bob Wilson	5202269dc4	Expand SELECT and SELECT_CC for NEON vector types. Radar 7770501. llvm-svn: 100568	2010-04-06 22:02:24 +00:00
Mon P Wang	c576ee9040	Reapply address space patch after fixing an issue in MemCopyOptimizer. Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) llvm-svn: 100304	2010-04-04 03:10:48 +00:00
Mon P Wang	999c1b927b	Revert r100191 since it breaks objc in clang llvm-svn: 100199	2010-04-02 18:43:02 +00:00
Mon P Wang	a972ab8564	Reapply address space patch after fixing an issue in MemCopyOptimizer. Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) llvm-svn: 100191	2010-04-02 18:04:15 +00:00
Bob Wilson	6f7fd28824	Revert Mon Ping's change 99928, since it broke all the llvm-gcc buildbots. llvm-svn: 99948	2010-03-30 22:27:04 +00:00
Mon P Wang	7460571381	Added support for address spaces and added a isVolatile field to memcpy, memmove, and memset, e.g., llvm.memcpy.i32(i8, i8, i32, i32) -> llvm.memcpy.p0i8.p0i8.i32(i8, i8, i32, i32, i1) A update of langref will occur in a subsequent checkin. llvm-svn: 99928	2010-03-30 20:55:56 +00:00
Jim Grosbach	07607382d8	tweak the arm if conversion heuristic llvm-svn: 99402	2010-03-24 16:15:14 +00:00
Jim Grosbach	e0874fa02f	try being more permissive for if-conversion on ARM V7. see what the nightly test run permformance numbers say as to whether it helps. llvm-svn: 99355	2010-03-24 00:03:13 +00:00

... 9 10 11 12 13 ...

1383 Commits