Commit Graph

854 Commits

Author SHA1 Message Date
Quentin Colombet 8e1fe84c3c Vext Lowering was missing opportunities
llvm-svn: 167318
2012-11-02 21:32:17 +00:00
Quentin Colombet 5799e9f66c Change ForceSizeOpt attribute into MinSize attribute
llvm-svn: 167020
2012-10-30 16:32:52 +00:00
Quentin Colombet 3ee56a3bf5 [code size][ARM] Emit regular call instructions instead of the move, branch sequence
llvm-svn: 166854
2012-10-27 01:10:17 +00:00
Stepan Dyatkovskiy dab8043048 ARM:
Removed extra stack frame object for fixed byval arguments,
VarArgsStyleRegisters invocation was reworked due to some improper usage in
past. PR14099 also demonstrates it.

llvm-svn: 166273
2012-10-19 08:23:06 +00:00
Stepan Dyatkovskiy e59a920b0c Issue:
Stack is formed improperly for long structures passed as byval arguments for
EABI mode.

If we took AAPCS reference, we can found the next statements:

A: "If the argument requires double-word alignment (8-byte), the NCRN (Next
Core Register Number) is rounded up to the next even register number." (5.5
Parameter Passing, Stage C, C.3).

B: "The alignment of an aggregate shall be the alignment of its most-aligned
component." (4.3 Composite Types, 4.3.1 Aggregates).

So if we have structure with doubles (9 double fields) and 3 Core unused
registers (r1, r2, r3): caller should use r2 and r3 registers only.
Currently r1,r2,r3 set is used, but it is invalid.

Callee VA routine should also use r2 and r3 regs only. All is ok here. This
behaviour is guessed by rounding up SP address with ADD+BFC operations.

Fix:
Main fix is in ARMTargetLowering::HandleByVal. If we detected AAPCS mode and
8 byte alignment, we waste odd registers then.

P.S.:
I also improved LDRB_POST_IMM regression test. Since ldrb instruction will
not generated by current regression test after this patch. 

llvm-svn: 166018
2012-10-16 07:16:47 +00:00
Silviu Baranga b14097000b Fixed PR13938: the ARM backend was crashing because it couldn't select a VDUPLANE node with the vector input size different from the output size. This was bacause the BUILD_VECTOR lowering code didn't check that the size of the input vector was correct for using VDUPLANE.
llvm-svn: 165929
2012-10-15 09:41:32 +00:00
Manman Ren 7e48b252e7 ARM: tail-call inside a function where part of a byval argument is on caller's
local frame causes problem.

For example:
void f(StructToPass s) {
  g(&s, sizeof(s));
}
will cause problem with tail-call since part of s is passed via registers and
saved in f's local frame. When g tries to access s, part of s may be corrupted
since f's local frame is popped out before the tail-call.

The current fix is to disable tail-call if getVarArgsRegSaveSize is not 0 for
the caller. This is a conservative approach, if we can prove the address of
s or part of s is not taken and passed to g, it should be okay to perform
tail-call.

rdar://12442472

llvm-svn: 165853
2012-10-12 23:39:43 +00:00
Jim Grosbach 30af442a84 ARM: Mark VSELECT as 'expand'.
The backend already pattern matches to form VBSL when it can. We may want to
teach it to use the vbsl intrinsics at some point to prevent machine licm from
mucking with this, but using the Expand is completely correct.

http://llvm.org/bugs/show_bug.cgi?id=13831
http://llvm.org/bugs/show_bug.cgi?id=13961

Patch by Peter Couperus <peter.couperus@st.com>.

llvm-svn: 165845
2012-10-12 22:59:21 +00:00
Stepan Dyatkovskiy 283baa0027 Fix for LDRB instruction:
SDNode for LDRB_POST_IMM is invalid: number of registers added to SDNode fewer
that described in .td.

7 ops is needed, but SDNode with only 6 is created.

In more details:
In ARMInstrInfo.td, in multiclass AI2_ldridx, in definition _POST_IMM, offset
operand is defined as am2offset_imm. am2offset_imm is complex parameter type,
and actually it consists from dummy register and imm itself. As I understood
trick with dummy reg was made for AsmParser. In ARMISelLowering.cpp, this dummy
register was not added to SDNode, and it cause crash in Peephole Optimizer pass.

The problem fixed by setting up additional dummy reg when emitting
LDRB_POST_IMM instruction.

llvm-svn: 165617
2012-10-10 11:43:40 +00:00
Stepan Dyatkovskiy f13dbb8e24 Issue description:
SchedulerDAGInstrs::buildSchedGraph ignores dependencies between FixedStack
objects and byval parameters. So loading byval parameters from stack may be
inserted *before* it will be stored, since these operations are treated as
independent.

Fix:
Currently ARMTargetLowering::LowerFormalArguments saves byval registers with
FixedStack MachinePointerInfo. To fix the problem we need to store byval
registers with MachinePointerInfo referenced to first the "byval" parameter.

Also commit adds two new fields to the InputArg structure: Function's argument
index and InputArg's part offset in bytes relative to the start position of
Function's argument. E.g.: If function's argument is 128 bit width and it was
splitted onto 32 bit regs, then we got 4 InputArg structs with same arg index,
but different offset values. 

llvm-svn: 165616
2012-10-10 11:37:36 +00:00
Bill Wendling c9b22d735a Create enums for the different attributes.
We use the enums to query whether an Attributes object has that attribute. The
opaque layer is responsible for knowing where that specific attribute is stored.

llvm-svn: 165488
2012-10-09 07:45:08 +00:00
Micah Villmow cdfe20b97f Move TargetData to DataLayout.
llvm-svn: 165402
2012-10-08 16:38:25 +00:00
Bob Wilson e8a549cd92 Add LLVM support for Swift.
llvm-svn: 164899
2012-09-29 21:43:49 +00:00
Sylvestre Ledru 91ce36c986 Revert 'Fix a typo 'iff' => 'if''. iff is an abreviation of if and only if. See: http://en.wikipedia.org/wiki/If_and_only_if Commit 164767
llvm-svn: 164768
2012-09-27 10:14:43 +00:00
Sylvestre Ledru 721cffd53a Fix a typo 'iff' => 'if'
llvm-svn: 164767
2012-09-27 09:59:43 +00:00
Bill Wendling 863bab689a Remove the `hasFnAttr' method from Function.
The hasFnAttr method has been replaced by querying the Attributes explicitly. No
intended functionality change.

llvm-svn: 164725
2012-09-26 21:48:26 +00:00
James Molloy 9e98ef1c59 Fix ordering of operands on lowering of atomicrmw min/max nodes on ARM.
llvm-svn: 164685
2012-09-26 09:48:32 +00:00
Evan Cheng 90ae8f8442 Use vld1 / vst2 for unaligned v2f64 load / store. e.g. Use vld1.16 for 2-byte
aligned address. Based on patch by David Peixotto.

Also use vld1.64 / vst1.64 with 128-bit alignment to take advantage of alignment
hints. rdar://12090772, rdar://12238782

llvm-svn: 164089
2012-09-18 01:42:45 +00:00
Silviu Baranga b47bb94f93 This patch introduces A15 as a target in LLVM.
llvm-svn: 163803
2012-09-13 15:05:10 +00:00
Craig Topper 3e41a5bb31 Set operation action for FFLOOR to Expand for all vector types for X86. Set FFLOOR of v4f32 to Expand for ARM. v2f64 was already correct.
llvm-svn: 163458
2012-09-08 04:58:43 +00:00
Jakob Stoklund Olesen e45e22b20f Custom DAGCombine for and/or/xor are for all ARMs.
The 'select' transformations apply to all ARM architectures and don't
require hasV6T2Ops.

llvm-svn: 163396
2012-09-07 17:34:15 +00:00
James Molloy 9d30dc2432 Fix self-host; ensure signedness is consistent.
llvm-svn: 163306
2012-09-06 10:32:08 +00:00
James Molloy 49bdbce8e1 Improve codegen for BUILD_VECTORs on ARM.
If we have a BUILD_VECTOR that is mostly a constant splat, it is often better to splat that constant then insertelement the non-constant lanes instead of insertelementing every lane from an undef base.

llvm-svn: 163304
2012-09-06 09:55:02 +00:00
Arnold Schwaighofer f00fb1c581 Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 163136
2012-09-04 14:37:49 +00:00
Jakob Stoklund Olesen d3bda3c5b9 Fix a couple of typos in EmitAtomic.
Thumb2 instructions are mostly constrained to rGPR, not tGPR which is
for Thumb1.

rdar://problem/12203728

llvm-svn: 162968
2012-08-31 02:08:34 +00:00
Jakob Stoklund Olesen 710093e360 Use a SmallPtrSet to dedup successors in EmitSjLjDispatchBlock.
The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating
duplicate successor list entries.

llvm-svn: 162222
2012-08-20 20:52:03 +00:00
Jakob Stoklund Olesen e1014e7b98 Remove the CAND/COR/CXOR custom ISD nodes and their select code.
These nodes are no longer needed because the peephole pass can fold
CMOV+AND into ANDCC etc.

llvm-svn: 162179
2012-08-18 21:49:50 +00:00
Jakob Stoklund Olesen dded061f85 Also combine zext/sext into selects for ARM.
This turns common i1 patterns into predicated instructions:

  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)

For a function like:

  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }

We now produce:

  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1

Instead of:

  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2

llvm-svn: 162177
2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen aab43dbfbb Also pass logical ops to combineSelectAndUse.
Add these transformations to the existing add/sub ones:

  (and (select cc, -1, c), x) -> (select cc, x, (and, x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or, x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor, x, c))

The selects can then be transformed to a single predicated instruction
by peephole.

This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.

llvm-svn: 162176
2012-08-18 21:25:16 +00:00
Jakob Stoklund Olesen c1dee482c8 Add comment, clean up code. No functional change.
llvm-svn: 162107
2012-08-17 16:59:09 +00:00
Jakob Stoklund Olesen c19bf0282d Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

llvm-svn: 162060
2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen 6cb96120f1 Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
2012-08-15 22:16:39 +00:00
Evan Cheng eec6bc6270 Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
llvm-svn: 161962
2012-08-15 17:44:53 +00:00
Nadav Rotem 3a94c545cf Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519

llvm-svn: 161775
2012-08-13 18:52:44 +00:00
Arnold Schwaighofer b73da9453c Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.

llvm-svn: 161736
2012-08-12 05:11:56 +00:00
Craig Topper 4fa625fda7 Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed.
llvm-svn: 161735
2012-08-12 03:16:37 +00:00
Arnold Schwaighofer 81b2eec1ab Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 161581
2012-08-09 15:25:52 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Eric Christopher b3322364e4 Add support for the ARM GHC calling convention, this patch was in 3.0,
but somehow managed to be dropped later.

Patch by Karel Gardas.

llvm-svn: 161226
2012-08-03 00:05:53 +00:00
Jim Grosbach 6df755cc4e ARM: Don't assume an SDNode is a constant.
Before accessing a node as a ConstandSDNode, make sure it actually is one.
No testcase of non-trivial size.

rdar://11948669

llvm-svn: 160735
2012-07-25 17:02:47 +00:00
Andrew Trick a22cdb713b Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings.
Based on Evan's suggestion without a commitable test.

llvm-svn: 160441
2012-07-18 18:34:27 +00:00
Andrew Trick bc325168c3 whitespace
llvm-svn: 160440
2012-07-18 18:34:24 +00:00
Manman Ren 6e1fd46fdf ARM: use NOEN loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 158684
2012-06-18 22:23:48 +00:00
Manman Ren e0763c7472 ARM: optimization for sub+abs.
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi

For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.

rdar: 11633193
llvm-svn: 158551
2012-06-15 21:32:12 +00:00
Bill Wendling 4b79647a6e Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>

llvm-svn: 158302
2012-06-11 08:07:26 +00:00
Manman Ren e873552091 ARM: properly handle alignment for struct byval.
Factor out the expansion code into a function.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 157830
2012-06-01 19:33:18 +00:00
Manman Ren 9f9111651e ARM: support struct byval in llvm
We handle struct byval by inserting a pseudo op, which will be expanded to a
loop at ExpandISelPseudos.
A separate patch for clang will be submitted to enable struct byval.

rdar://9877866

llvm-svn: 157793
2012-06-01 02:44:42 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Jakob Stoklund Olesen 691ae3388f Use the right register class for LDRrs.
llvm-svn: 157152
2012-05-20 06:38:47 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00