Commit Graph

6354 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen 74e6f9fc65 Add a missing def flag.
*** Bad machine code: Explicit definition marked as use ***
- function:    test_cos
- basic block: BB#0 L.entry (0x7ff2a2024fd0)
- instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def>
- operand 0:   %D11

llvm-svn: 162247
2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen 710093e360 Use a SmallPtrSet to dedup successors in EmitSjLjDispatchBlock.
The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating
duplicate successor list entries.

llvm-svn: 162222
2012-08-20 20:52:03 +00:00
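
The deduplication described in 710093e360 can be sketched in standalone C++. This is a minimal illustration, not the code from EmitSjLjDispatchBlock: `Block` and `uniqueSuccessors` are hypothetical names, and `std::unordered_set` stands in for LLVM's SmallPtrSet.

```cpp
#include <cassert>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a machine basic block; only identity matters here.
struct Block { int id; };

// Return the successors in first-seen order with duplicates dropped.
// std::unordered_set plays the role that SmallPtrSet plays in the commit.
std::vector<Block*> uniqueSuccessors(const std::vector<Block*>& successors) {
    std::unordered_set<Block*> seen;
    std::vector<Block*> result;
    for (Block* bb : successors) {
        assert(bb != nullptr && "successor must be a real block");
        // insert() reports whether the pointer was newly added to the set.
        if (seen.insert(bb).second)
            result.push_back(bb);
    }
    return result;
}
```

Passing a list such as {A, B, A, B, A} yields {A, B}, so each landing pad would be added to the successor list once.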
Jakob Stoklund Olesen e1014e7b98 Remove the CAND/COR/CXOR custom ISD nodes and their select code.
These nodes are no longer needed because the peephole pass can fold
CMOV+AND into ANDCC etc.

llvm-svn: 162179
2012-08-18 21:49:50 +00:00
Craig Topper fd1c925946 Remove virtual from many methods. These methods replace methods in the base class, but the base class methods aren't virtual so it just increased call overhead.
llvm-svn: 162178
2012-08-18 21:38:45 +00:00
Jakob Stoklund Olesen dded061f85 Also combine zext/sext into selects for ARM.
This turns common i1 patterns into predicated instructions:

  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)

For a function like:

  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }

We now produce:

  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1

Instead of:

  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2

llvm-svn: 162177
2012-08-18 21:25:22 +00:00
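
The two rewrites in dded061f85 rest on a simple value identity, which can be checked outside the compiler. The sketch below models i32 values as 32-bit unsigned integers; all function names are hypothetical and nothing here is taken from the ARM backend.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

using u32 = std::uint32_t;

// Original forms: the i1 condition is extended to 32 bits and then added.
u32 addZext(bool cc, u32 x) { return x + (cc ? 1u : 0u); }
u32 addSext(bool cc, u32 x) { return x + (cc ? 0xFFFFFFFFu : 0u); }  // sext(true) is -1

// Rewritten forms: select between the updated value and the untouched one.
u32 selectAddOne(bool cc, u32 x) { return cc ? x + 1u : x; }
u32 selectAddMinusOne(bool cc, u32 x) { return cc ? x + 0xFFFFFFFFu : x; }

// True when both rewrites give the same result as the original forms for x.
bool rewriteAgrees(u32 x) {
    for (bool cc : {false, true}) {
        if (addZext(cc, x) != selectAddOne(cc, x)) return false;
        if (addSext(cc, x) != selectAddMinusOne(cc, x)) return false;
    }
    return true;
}
```

Because the select form keeps x unchanged on the false path, a backend can emit one predicated add instead of materializing the 0/1 value first.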
Jakob Stoklund Olesen aab43dbfbb Also pass logical ops to combineSelectAndUse.
Add these transformations to the existing add/sub ones:

  (and (select cc, -1, c), x) -> (select cc, x, (and x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor x, c))

The selects can then be transformed to a single predicated instruction
by peephole.

This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.

llvm-svn: 162176
2012-08-18 21:25:16 +00:00
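
The three logical-op rewrites in aab43dbfbb work because the constant on the true side of each select is the identity element of the operation (-1 for and, 0 for or and xor). A small value-level check, with hypothetical helper names and 32-bit unsigned integers standing in for i32:

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// Value-level model of (select cc, t, f).
u32 sel(bool cc, u32 t, u32 f) { return cc ? t : f; }

// True when each left-hand pattern from the commit message equals its
// right-hand replacement for the given inputs.
bool logicalCombinesAgree(bool cc, u32 c, u32 x) {
    const u32 allOnes = 0xFFFFFFFFu;  // -1 as a 32-bit pattern
    bool andOk = (sel(cc, allOnes, c) & x) == sel(cc, x, x & c);
    bool orOk  = (sel(cc, 0u, c) | x) == sel(cc, x, x | c);
    bool xorOk = (sel(cc, 0u, c) ^ x) == sel(cc, x, x ^ c);
    return andOk && orOk && xorOk;
}
```

On the true path every rewritten select simply yields x, which is what allows the peephole pass to predicate the logical instruction.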
Anton Korobeynikov 1e28826abe fp16-to-fp32 conversion instructions are available in Thumb mode as well.
Make sure the generic pattern is used.

llvm-svn: 162170
2012-08-18 13:08:43 +00:00
Jakob Stoklund Olesen 7b1a2e8f02 Avoid folding ADD instructions with FI operands.
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

llvm-svn: 162130
2012-08-17 20:55:34 +00:00
Jakob Stoklund Olesen c1dee482c8 Add comment, clean up code. No functional change.
llvm-svn: 162107
2012-08-17 16:59:09 +00:00
Tim Northover f66181530f Implement NEON domain switching for scalar <-> S-register vmovs on ARM
llvm-svn: 162094
2012-08-17 11:32:52 +00:00
Craig Topper f6add7e667 Remove unnecessary include of ARMGenInstrInfo.inc.
llvm-svn: 162086
2012-08-17 06:21:09 +00:00
Jakob Stoklund Olesen 0ea1fce6b4 Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

llvm-svn: 162061
2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen c19bf0282d Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

llvm-svn: 162060
2012-08-16 23:14:20 +00:00
Jush Lu 26088cb30e [arm-fast-isel] Add support for fastcc.
Without fastcc support, the caller falls through to CallingConv::C for
fastcc while the callee still uses fastcc. This calling-convention
mismatch is a problem that fastcc support fixes.

llvm-svn: 162013
2012-08-16 05:15:53 +00:00
Jakob Stoklund Olesen 6cb96120f1 Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
2012-08-15 22:16:39 +00:00
Evan Cheng eec6bc6270 Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
llvm-svn: 161962
2012-08-15 17:44:53 +00:00
Jakob Stoklund Olesen 2ec0c41e01 Add missing Rfalse operand to the predicated pseudo-instructions.
When predicating this instruction:

  Rd = ADD Rn, Rm

We need an extra operand to represent the value given to Rd when the
predicate is false:

  Rd = ADDCC Rfalse, Rn, Rm, pred

The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.

Previously, Rd and Rn were tied, but that is not required.

Compare to MOVCC:

  Rd = MOVCC Rfalse, Rtrue, pred

llvm-svn: 161955
2012-08-15 16:17:24 +00:00
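
The role of the Rfalse operand in 2ec0c41e01 is easiest to see as a value-level model. The functions below are a hypothetical sketch of what the pseudo-instructions compute, using plain integers for registers; they are not the TableGen definitions.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// Rd = ADDCC Rfalse, Rn, Rm, pred
// When the predicate is false, Rd must still receive a value: Rfalse.
u32 addcc(u32 rfalse, u32 rn, u32 rm, bool pred) {
    return pred ? rn + rm : rfalse;
}

// Rd = MOVCC Rfalse, Rtrue, pred
u32 movcc(u32 rfalse, u32 rtrue, bool pred) {
    return pred ? rtrue : rfalse;
}
```

In this model `movcc(rfalse, rn + rm, pred)` equals `addcc(rfalse, rn, rm, pred)` for every input, which is the identity that folding an ADD into a MOVCC relies on.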
Anton Korobeynikov c6d945b11a The names of VFP variants of half-to-float conversion instructions were
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.

llvm-svn: 161907
2012-08-14 23:36:01 +00:00
Eric Christopher 5f61a7498b This needs braces. Spotted by Bill.
llvm-svn: 161906
2012-08-14 23:32:15 +00:00
Jim Grosbach ecaef49f59 Switch the fixed-length disassembler to be table-driven.
Refactor the TableGen'erated fixed length disassembler to use a
table-driven state machine rather than a massive set of nested
switch() statements.

As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:

Time to compile at -O2 (averaged w/ hot caches):
  Previous: 35.5s
  New:       8.9s

TEXT size:
  Previous: 447,251
  New:      297,661

Builds in 25% of the time previously required and generates code 66% of
the size.

Execution time of the disassembler is only slightly slower (7% disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.

llvm-svn: 161888
2012-08-14 19:06:05 +00:00
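
To illustrate the idea behind ecaef49f59, here is a toy table-driven decoder. The table layout, opcode names and the two-instruction encoding are invented for this sketch and do not match the format TableGen emits; the point is only that one small interpreter loop walks data instead of compiling a large nest of switch() statements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical table opcodes.
enum Op : std::uint8_t { OP_CHECK_FIELD, OP_DECODE, OP_FAIL };

// One table entry. For OP_CHECK_FIELD: compare bits [start, start+len) of the
// instruction word with `value`; on mismatch skip `skip` extra entries.
// For OP_DECODE: `value` is the id of the decoded instruction.
struct Entry {
    Op op;
    std::uint8_t start;
    std::uint8_t len;
    std::uint32_t value;
    std::uint8_t skip;
};

std::uint32_t extractBits(std::uint32_t insn, unsigned start, unsigned len) {
    assert(len > 0 && len < 32 && start + len <= 32);
    return (insn >> start) & ((1u << len) - 1u);
}

// Walk the table; return the decoded instruction id, or -1 if nothing matches.
int decode(const std::vector<Entry>& table, std::uint32_t insn) {
    std::size_t pc = 0;
    while (pc < table.size()) {
        const Entry& e = table[pc];
        switch (e.op) {
        case OP_CHECK_FIELD:
            if (extractBits(insn, e.start, e.len) == e.value)
                pc += 1;            // match: fall into the next entry
            else
                pc += 1u + e.skip;  // mismatch: jump past the guarded entries
            break;
        case OP_DECODE:
            return static_cast<int>(e.value);
        case OP_FAIL:
            return -1;
        }
    }
    return -1;
}

// Made-up encoding keyed on the top four bits: 0xA -> id 1, 0xB -> id 2.
const std::vector<Entry> kToyTable = {
    {OP_CHECK_FIELD, 28, 4, 0xA, 1},
    {OP_DECODE, 0, 0, 1, 0},
    {OP_CHECK_FIELD, 28, 4, 0xB, 1},
    {OP_DECODE, 0, 0, 2, 0},
    {OP_FAIL, 0, 0, 0, 0},
};
```

Adding an instruction to such a decoder means adding table rows rather than code, which is consistent with the smaller TEXT size and shorter compile time reported in the commit message.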
Jakob Stoklund Olesen 702bcc3bcf Remove the TII::scheduleTwoAddrSource() hook.
It never does anything when running 'make check', and it gets in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794
2012-08-13 21:52:57 +00:00
Manman Ren d6c8270eaa ARM: enable struct byval for AAPCS-VFP.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 161789
2012-08-13 21:22:50 +00:00
Nadav Rotem 3a94c545cf Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519

llvm-svn: 161775
2012-08-13 18:52:44 +00:00
Eric Christopher 7d8b53c1f8 Add support for the %H output modifier.
Patch by Weiming Zhao.

llvm-svn: 161768
2012-08-13 18:18:52 +00:00
Tim Northover 5aaa7fde94 Use correct loads for vector types during extending-load operations.
Previously, we used VLD1.32 in all cases, however there are both 16 and 64-bit
accesses being selected, so we need to use an appropriate width load in those
cases.

llvm-svn: 161748
2012-08-13 09:06:31 +00:00
Arnold Schwaighofer b73da9453c Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.

llvm-svn: 161736
2012-08-12 05:11:56 +00:00
Craig Topper 4fa625fda7 Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed.
llvm-svn: 161735
2012-08-12 03:16:37 +00:00
Manman Ren e201e27eb1 ARM: enable struct byval for AAPCS.
This change is to be enabled in clang.

rdar://9877866
PR://13350

llvm-svn: 161693
2012-08-10 20:39:38 +00:00
Eric Christopher 6ac277ce91 Remove getARMRegisterNumbering and replace with calls into
the register info for getEncodingValue. This builds on the
small patch of yesterday to set HWEncoding in the register
file.

One (deprecated) use was turned into a hard number to avoid
needing register info in the old JIT.

llvm-svn: 161628
2012-08-09 22:10:21 +00:00
Arnold Schwaighofer 81b2eec1ab Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 161581
2012-08-09 15:25:52 +00:00
Eric Christopher 245f9b5552 This field isn't used anymore, use it with HWEncoding instead.
llvm-svn: 161564
2012-08-09 01:39:32 +00:00
Andrew Trick 352abc19a5 Added MispredictPenalty to SchedMachineModel.
This replaces an existing subtarget hook on ARM and allows standard
CodeGen passes to potentially use the property.

llvm-svn: 161471
2012-08-08 02:44:16 +00:00
Andrew Trick 207c569cf3 whitespace
llvm-svn: 161469
2012-08-08 02:44:08 +00:00
Anton Korobeynikov ef731edf53 Skip impdef regs during eabi save/restore list emission to workaround PR11902
llvm-svn: 161301
2012-08-04 13:25:58 +00:00
Anton Korobeynikov 3a4fdfeceb Recognize vst1.64 / vld1.64 with 3 and 4 regs as load from / store to stack stuff
(this corresponds to spilling/reloading regs in DTriple / DQuad reg classes).
No testcase, found by inspection.

llvm-svn: 161300
2012-08-04 13:22:14 +00:00
Anton Korobeynikov 218aaf6d04 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377

llvm-svn: 161299
2012-08-04 13:16:12 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Jush Lu 4705da9020 [arm-fast-isel] Add support for shl, lshr, and ashr.
llvm-svn: 161230
2012-08-03 02:37:48 +00:00
Eric Christopher b3322364e4 Add support for the ARM GHC calling convention, this patch was in 3.0,
but somehow managed to be dropped later.

Patch by Karel Gardas.

llvm-svn: 161226
2012-08-03 00:05:53 +00:00
Jim Grosbach 5d6d015969 ARM: Tidy up. Remove unused template parameters.
llvm-svn: 161222
2012-08-02 22:08:27 +00:00
Jim Grosbach b79c33ef55 ARM: More InstAlias refactors to use #NAME#.
llvm-svn: 161220
2012-08-02 21:59:52 +00:00
Jim Grosbach 6d27ad62a8 ARM: Refactor instaliases using TableGen support for #NAME#.
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.

llvm-svn: 161218
2012-08-02 21:50:41 +00:00
Jiangning Liu fa18005a4c Support fpv4 for ARM Cortex-M4.
llvm-svn: 161163
2012-08-02 08:35:55 +00:00
Jiangning Liu 6a43bf7d74 Fix #13035, a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue.
llvm-svn: 161162
2012-08-02 08:29:50 +00:00
Jiangning Liu 288e1af8c8 Fix #13138, a bug around ARM instruction DSB encoding and decoding issue.
llvm-svn: 161161
2012-08-02 08:21:27 +00:00
Jiangning Liu 10dd40e42d Fix #13241, a bug around shift immediate operand for ARM instruction ADR.
llvm-svn: 161159
2012-08-02 08:13:13 +00:00
Jim Grosbach 8724d0fd99 ARM: Remove redundant instalias.
llvm-svn: 161134
2012-08-01 20:33:05 +00:00
Jim Grosbach 96e8a8dc6d Clean up formatting.
llvm-svn: 161133
2012-08-01 20:33:02 +00:00
Jim Grosbach b437a8c5d5 Tidy up.
llvm-svn: 161132
2012-08-01 20:33:00 +00:00
Kevin Enderby 5c490f1b8f Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp
where the other_half of the movt and movw relocation entries needs to get set
and only with the 16 bits of the other half.

rdar://10038370

llvm-svn: 160978
2012-07-30 18:46:15 +00:00
Jim Grosbach 6df755cc4e ARM: Don't assume an SDNode is a constant.
Before accessing a node as a ConstantSDNode, make sure it actually is one.
No testcase of non-trivial size.

rdar://11948669

llvm-svn: 160735
2012-07-25 17:02:47 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Jush Lu e67e07b901 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 160500
2012-07-19 09:49:00 +00:00
Andrew Trick a22cdb713b Fix ARMTargetLowering::isLegalAddImmediate to consider thumb encodings.
Based on Evan's suggestion without a committable test.

llvm-svn: 160441
2012-07-18 18:34:27 +00:00
Andrew Trick bc325168c3 whitespace
llvm-svn: 160440
2012-07-18 18:34:24 +00:00
Joel Jones b84f7bea09 More replacing of target-dependent intrinsics with target-independent
intrinsics.  The second instruction(s) to be handled are the vector versions 
of count set bits (ctpop).

The changes here are to clang so that it generates a target independent 
vector ctpop when it sees an ARM dependent vector bits set count.  The changes 
in llvm are to match the target independent vector ctpop and in 
VMCore/AutoUpgrade.cpp to update any existing bc files containing ARM 
dependent vector pop counts with target-independent ctpops.  There are also 
changes to an existing test case in llvm for ARM vector count instructions and 
to a test for the bitcode upgrade.

<rdar://problem/11892519>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160410
2012-07-18 00:02:16 +00:00
Joel Jones 43cb87839c This is one of the first steps at moving to replace target-dependent
intrinsics with target-independent intrinsics.  The first instruction(s) to be 
handled are the vector versions of count leading zeros (ctlz).

The changes here are to clang so that it generates a target independent 
vector ctlz when it sees an ARM dependent vector ctlz.  The changes in llvm 
are to match the target independent vector ctlz and in VMCore/AutoUpgrade.cpp 
to update any existing bc files containing ARM dependent vector ctlzs with 
target-independent ctlzs.  There are also changes to an existing test case in 
llvm for ARM vector count instructions and a new test for the bitcode upgrade.

<rdar://problem/11831778>

There is deliberately no test for the change to clang, as so far as I know, no
consensus has been reached regarding how to test neon instructions in clang;
q.v. <rdar://problem/8762292>

llvm-svn: 160200
2012-07-13 23:25:25 +00:00
Jakob Stoklund Olesen 6a81d30269 Remove variable_ops from ARM call instructions.
Function argument registers are added to the call SDNode, but
InstrEmitter now knows how to make those operands implicit, and the call
instruction doesn't have to be variadic.

Explicit register operands should only be those that are encoded in the
instruction, implicit register operands are for extra dependencies like
call argument and return values.

llvm-svn: 160188
2012-07-13 20:27:00 +00:00
Manman Ren 88a0d3313b ARM: fix typo in comments
llvm-svn: 160093
2012-07-11 23:47:00 +00:00
Manman Ren 34cb93e192 ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.

llvm-svn: 160090
2012-07-11 22:51:44 +00:00
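
The safety rule stated in 34cb93e192 can be paraphrased as a scan over the instructions that follow the cmp. The sketch below is a simplified, hypothetical model: `Instr` and `cmpFlagsStayLocal` are invented names, and the separate check of which condition codes the readers of CPSR use is deliberately left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical summary of how one instruction touches the flags register.
struct Instr {
    bool readsCPSR;
    bool definesCPSR;
    bool killsCPSR;
};

// `after` holds the instructions that follow the cmp in its basic block.
// Returns true when the flags produced by the cmp cannot be observed past
// the scanned range: either CPSR is killed or re-defined inside the block,
// or the end of the block is reached and CPSR is not live-out.
bool cmpFlagsStayLocal(const std::vector<Instr>& after, bool cpsrLiveOut) {
    for (const Instr& mi : after) {
        // A kill or a redefinition ends the live range of the cmp's flags.
        if (mi.killsCPSR || mi.definesCPSR)
            return true;
    }
    // Reached the end of the block: only safe if no successor needs CPSR.
    return !cpsrLiveOut;
}
```

With a single reader that does not kill CPSR and CPSR live-out, the function returns false, which corresponds to the "do not optimize away cmp" case in the commit message.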
Richard Barton 1dc44dcedd Fix instruction description of VMOV (between two ARM core registers and two single-precision registers) (and do it properly this time!)
llvm-svn: 159989
2012-07-10 12:51:09 +00:00
Jim Grosbach 16b43dbbfe ARM: Allow more flexible patterns in NEON formats.
Some NEON instructions want to match against normal SDNodes for some
operand types and Intrinsics for others. For example, CTLZ. To enable this,
switch from explicitly requiring Intrinsic on the class templates to using
SDPatternOperator instead.

llvm-svn: 159974
2012-07-10 00:51:13 +00:00
Chad Rosier aeed158f75 Revert r159938 (and r159945) to appease the buildbots.
llvm-svn: 159960
2012-07-09 20:43:34 +00:00
Richard Barton 984d0ba6b6 Some formatting to keep Clang happy
llvm-svn: 159948
2012-07-09 18:30:56 +00:00
Richard Barton 5beef2d242 Oops - correct broken disassembly for VMOV
llvm-svn: 159945
2012-07-09 18:20:02 +00:00
Richard Barton c9e1c94fae Fix instruction description of VMOV (between two ARM core registers and two single-precision registers)
llvm-svn: 159938
2012-07-09 16:41:33 +00:00
Richard Barton 35aceb86fe Prevent ARM assembler from losing a right shift by #32 applied to a register
llvm-svn: 159937
2012-07-09 16:31:14 +00:00
Richard Barton d56603722e Spelling!
llvm-svn: 159936
2012-07-09 16:14:28 +00:00
Richard Barton a39625ecc6 Teach the assembler to use the narrow thumb encodings of various three-register dp instructions where permissible.
llvm-svn: 159935
2012-07-09 16:12:24 +00:00
Andrew Trick 87255e340e I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraries for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex constraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Chad Rosier 73b02825d0 Fix the naming of ensureAlignment. Per the coding standard function names
should be camel case, and start with a lower case letter.

llvm-svn: 159877
2012-07-06 23:13:38 +00:00
Jim Grosbach 09487775d3 ARM: Add test cleanup entry to the README.
llvm-svn: 159864
2012-07-06 21:52:04 +00:00
NAKAMURA Takumi 0246724cd6 Revert r159804, "[arm-fast-isel] Add support for vararg function calls."
It broke LLVM :: CodeGen/Thumb2/large-call.ll on several hosts.

llvm-svn: 159817
2012-07-06 11:12:44 +00:00
Jush Lu 5e6e6264f4 [arm-fast-isel] Add support for vararg function calls.
llvm-svn: 159804
2012-07-06 03:02:37 +00:00
Bob Wilson b9b693650a Consistently use AnalysisID types in TargetPassConfig.
This makes it possible to just use a zero value to represent "no pass", so
the phony NoPassID global variable is no longer needed.

llvm-svn: 159568
2012-07-02 19:48:37 +00:00
Bob Wilson bbd38dd9c0 Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

llvm-svn: 159567
2012-07-02 19:48:31 +00:00
Andrew Trick 21cca97d95 Revert accidental checkin.
My last checkin was apparently not the branch I intended. It was missing one change (added by chandlerc), and contained a spurious change.

llvm-svn: 159548
2012-07-02 19:12:29 +00:00
Andrew Trick f161e391f8 Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."
Reapplies r159406 with minor cleanup. The regressions appear to have been spurious.

llvm-svn: 159541
2012-07-02 18:10:42 +00:00
Bob Wilson 2297221028 Do not attempt to use ROR for Thumb1.
Patch by Matt Fischer!

llvm-svn: 159538
2012-07-02 17:22:47 +00:00
Manman Ren b1b3db6802 ARM: Clean up optimizeCompare in peephole, no functional change.
Use getUniqueVRegDef.
Replace a loop with existing interfaces: modifiesRegister and readsRegister.
Factor out code into inline functions and simplify the code.

llvm-svn: 159470
2012-06-29 22:06:19 +00:00
Manman Ren 6fa76dc0e0 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Andrew Trick 51a8cf77b8 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Andrew Trick 1f50152b2d Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reason is that the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Richard Barton 57b7d16e34 Teach assembler to handle capitalised operation values for DSB instructions
llvm-svn: 159259
2012-06-27 09:48:23 +00:00
Richard Barton 4b7558ef9a Prevent ARM Assembler crashing on unrecognised assembly format for DSB instruction
llvm-svn: 159257
2012-06-27 09:36:19 +00:00
Evan Cheng a75127871c Add a missing check to avoid dereferencing null. No sensible test case possible. Sorry. rdar://11745134
llvm-svn: 159236
2012-06-26 22:54:59 +00:00
Jack Carter 5e69cffed5 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributor: Jack Carter
llvm-svn: 159203
2012-06-26 13:49:27 +00:00
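
The delegation pattern described in 5e69cffed5 (a target handles the modifiers it knows and defers the rest to a generic routine) can be sketched as follows. The function names and the target-specific 'q' modifier are hypothetical; only the meaning of 'c' (print the constant without punctuation) and 'n' (print the negated constant) follows the usual inline asm convention.

```cpp
#include <cassert>
#include <string>

// Generic routine: handles only 'c' and 'n'; returns false for anything else.
bool printGenericModifier(char mod, long imm, std::string& out) {
    switch (mod) {
    case 'c':
        out += std::to_string(imm);   // bare constant, no punctuation
        return true;
    case 'n':
        out += std::to_string(-imm);  // negated constant
        return true;
    default:
        return false;
    }
}

// Hypothetical target printer: 'q' is a made-up target-specific modifier.
// Anything the target does not recognize is passed to the generic routine.
bool printTargetModifier(char mod, long imm, std::string& out) {
    if (mod == 'q') {
        out += "#" + std::to_string(imm);
        return true;
    }
    return printGenericModifier(mod, imm, out);
}
```

A modifier that neither layer recognizes makes the call return false, leaving the decision about how to report the error to the caller.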
Manman Ren 606953fbe7 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965
llvm-svn: 159166
2012-06-25 21:49:38 +00:00
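
The example in 606953fbe7 ("sub a, b; cmp b, a; movhs" becoming "sub a, b; movls") depends on mapping each condition code to the one that tests the same relation with the operands swapped. The sketch below models the NZCV flags of a 32-bit compare and checks that mapping; the names are hypothetical and the code is not taken from the peephole pass.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

enum Cond { EQ, NE, HS, LO, HI, LS, GE, LT, GT, LE };

struct Flags { bool n, z, c, v; };

// Flags as set by `cmp x, y` (or a flag-setting x - y) on 32-bit registers.
Flags compare(u32 x, u32 y) {
    u32 r = x - y;
    Flags f;
    f.n = (r >> 31) != 0;
    f.z = r == 0;
    f.c = x >= y;                            // carry set means no borrow
    f.v = (((x ^ y) & (x ^ r)) >> 31) != 0;  // signed overflow of x - y
    return f;
}

bool holds(Cond cc, Flags f) {
    switch (cc) {
    case EQ: return f.z;
    case NE: return !f.z;
    case HS: return f.c;
    case LO: return !f.c;
    case HI: return f.c && !f.z;
    case LS: return !f.c || f.z;
    case GE: return f.n == f.v;
    case LT: return f.n != f.v;
    case GT: return !f.z && f.n == f.v;
    case LE: return f.z || f.n != f.v;
    }
    return false;
}

// Condition that tests the same relation once the compare operands are swapped.
Cond swapOperands(Cond cc) {
    switch (cc) {
    case EQ: return EQ;
    case NE: return NE;
    case HS: return LS;
    case LS: return HS;
    case HI: return LO;
    case LO: return HI;
    case GE: return LE;
    case LE: return GE;
    case GT: return LT;
    case LT: return GT;
    }
    return cc;
}

// True when `cmp b, a; <cc>` and `cmp a, b; <swapped cc>` agree for all ten codes.
bool swapIsSound(u32 a, u32 b) {
    for (int i = EQ; i <= LE; ++i) {
        Cond cc = static_cast<Cond>(i);
        if (holds(cc, compare(b, a)) != holds(swapOperands(cc), compare(a, b)))
            return false;
    }
    return true;
}
```

Since the sub already computes a - b, its flags match those of `cmp a, b`, so a user of `cmp b, a` with HS can read them with LS instead.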
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Evan Cheng 68c2f9a9a7 (sub X, imm) gets canonicalized to (add X, -imm)
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1

rdar://11726136

llvm-svn: 159057
2012-06-23 00:29:06 +00:00
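
The choice described in 68c2f9a9a7 can be sketched as a classification of the negated constant. This is a simplified illustration that assumes ARM-mode "modified immediate" encoding (an 8-bit value rotated right by an even amount) and ignores other ways to materialize constants; `Strategy` and `lowerAddOfNegative` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

using u32 = std::uint32_t;

// True if v is an 8-bit value rotated right by an even amount.
bool isARMModifiedImm(u32 v) {
    for (unsigned rot = 0; rot < 32; rot += 2) {
        // Rotating left by `rot` undoes a rotate right by `rot`.
        u32 undone = rot == 0 ? v : ((v << rot) | (v >> (32 - rot)));
        if (undone <= 0xFFu)
            return true;
    }
    return false;
}

enum Strategy { SubImmediate, MovwThenSub, MovwMovtThenSub };

// `addend` is the negative constant in (add X, addend).
Strategy lowerAddOfNegative(std::int32_t addend) {
    assert(addend < 0 && "sketch only covers the canonicalized sub case");
    u32 magnitude = 0u - static_cast<u32>(addend);
    if (isARMModifiedImm(magnitude))
        return SubImmediate;     // e.g. sub r0, r0, #123
    if (magnitude <= 0xFFFFu)
        return MovwThenSub;      // e.g. movw r1, #65535 ; sub r0, r0, r1
    return MovwMovtThenSub;      // constant needs both halves
}
```

With these assumptions, -123 is classified as SubImmediate and -65535 as MovwThenSub, matching the two examples in the commit message.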
Jim Grosbach 087affe2f3 ARM: Add a better diagnostic for some out of range immediates.
As an example of how the custom DiagnosticType can be used to provide
better operand-mismatch diagnostics, add a custom diagnostic for
the imm0_15 operand class used for several system instructions.
Update the tests to expect the improved diagnostic.

rdar://8987109

llvm-svn: 159051
2012-06-22 23:56:48 +00:00
Andrew Trick 9c302673b2 Use "NoItineraries" for processors with no itineraries.
This makes it explicit when ScoreboardHazardRecognizer will be used.
"GenericItineraries" would only make sense if it contained real
itinerary values and still required ScoreboardHazardRecognizer.

llvm-svn: 158963
2012-06-22 03:58:51 +00:00
Andrew Trick 77d0b88999 ARM scheduling fix: don't guess at implicit operand latency.
This is a minor drive-by fix with no robust way to unit test.
As an example see neon-div.ll:
SU(16):   %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill>
 val SU(1): Latency=2 Reg=%Q8
...should be latency=1

llvm-svn: 158960
2012-06-22 02:50:33 +00:00
Andrew Trick 3ccb1b8cf9 ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

llvm-svn: 158959
2012-06-22 02:50:31 +00:00
Lang Hames b8650f106a Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (E.g. FMAs). The
behavior of this option is intended to match the behaviour specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't affect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
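
The three modes in b8650f106a amount to a small decision function. The sketch below follows the wording of the commit message literally; `Candidate` and `mayFuse` are hypothetical names, and the enum is declared locally rather than taken from the LLVM headers.

```cpp
#include <cassert>

enum class FusionMode { Fast, Standard, Strict };

// Hypothetical description of one fusion opportunity.
struct Candidate {
    bool profitable;           // fusing would be a win on this target
    bool blessed;              // e.g. comes from the fmuladd intrinsic
    bool provablyExact;        // excess precision cannot change the result
    bool explicitlyRequested;  // e.g. an llvm.fma.* call
};

bool mayFuse(FusionMode mode, const Candidate& c) {
    // Explicit requests are honored regardless of the option.
    if (c.explicitlyRequested)
        return true;
    switch (mode) {
    case FusionMode::Fast:     return c.profitable;
    case FusionMode::Standard: return c.blessed;
    case FusionMode::Strict:   return c.provablyExact;
    }
    return false;
}
```

Under this reading, a profitable but unblessed candidate is fused only in Fast mode, while an explicit llvm.fma call is fused in all three.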
Lang Hames 90b2a4cbad Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
llvm-svn: 158902
2012-06-21 06:10:00 +00:00
Lang Hames 39fb1d08dc Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
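
Conditions (a) to (c) from 39fb1d08dc combine into a single predicate. The sketch below is a hypothetical restatement with invented names; in LLVM the second input would come from the target's isFMAFasterThanMulAndAdd(VT) override.

```cpp
#include <cassert>

// Hypothetical mirror of the two options named in the commit message.
struct Options {
    bool allowExcessFPPrecision;  // -enable-excess-fp-precision
    bool unsafeFPMath;            // -enable-unsafe-fp-math
};

bool shouldFormFMA(const Options& opts, bool fmaFasterThanMulAndAdd,
                   unsigned fmulUsers) {
    bool optionSet = opts.allowExcessFPPrecision || opts.unsafeFPMath;  // (a)
    bool targetWantsFMA = fmaFasterThanMulAndAdd;                       // (b)
    bool singleUser = fmulUsers == 1;                                   // (c)
    return optionSet && targetWantsFMA && singleUser;
}
```

Condition (c) matters because an FMUL with a second user would still have to be computed separately, so fusing it would add work rather than remove it.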
Jan Wen Voung 7f5d79f864 Have ARM ELF use correct reloc for "b" instr.
The condition code didn't actually matter for arm "b" instructions,
unlike "bl".  It should just use the R_ARM_JUMP24 reloc.

llvm-svn: 158722
2012-06-19 16:03:02 +00:00
Rafael Espindola ca3e0ee8b3 Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00