Commit Graph

351 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen 8b9dce5c18 Don't attempt to use flags from predicated instructions.
The ARM backend can eliminate cmp instructions by reusing flags from a
nearby sub instruction with similar arguments.

Don't do that if the sub is predicated - the flags are not written
unconditionally.

<rdar://problem/12263428>

llvm-svn: 163535
2012-09-10 19:17:25 +00:00
Jakob Stoklund Olesen f831059f60 Use predication instead of pseudo-opcodes when folding into MOVCC.
Now that it is possible to dynamically tie MachineInstr operands,
predicated instructions are possible in SSA form:

  %vreg3<def> = SUBri %vreg1, -2147483647, pred:14, pred:%noreg, %opt:%noreg
  %vreg4<def,tied1> = MOVCCr %vreg3<tied0>, %vreg1, %pred:12, pred:%CPSR

Becomes a predicated SUBri with a tied imp-use:

  SUBri %vreg1, -2147483647, pred:13, pred:%CPSR, opt:%noreg, %vreg1<imp-use,tied0>

This means that any instruction that is safe to move can be folded into
a MOVCC, and the *CC pseudo-instructions are no longer needed.

The test case changes reflect that Thumb2SizeReduce recognizes the
predicated instructions. It didn't understand the pseudos.

llvm-svn: 163274
2012-09-05 23:58:02 +00:00
Tim Northover c8d867d42d Strip old MachineInstrs *after* we know we can put them back.
Previous patch accidentally decided it couldn't convert a VFP to a
NEON instruction after it had already destroyed the old one. Not a
good move.

llvm-svn: 163230
2012-09-05 18:37:53 +00:00
Tim Northover 726d32cdfa Limit domain conversion to cases where it won't break dep chains.
NEON domain conversion was too heavy-handed with its widened
registers, which could have stripped existing instructions of their
dependency, leaving them vulnerable to scheduling errors.

llvm-svn: 163070
2012-09-01 18:07:29 +00:00
Tim Northover ca9f384ff8 Add support for moving pure S-register to NEON pipeline if desired
llvm-svn: 162898
2012-08-30 10:17:45 +00:00
Tim Northover 771f160758 Refactor setExecutionDomain to be clearer about what it's doing and more robust.
llvm-svn: 162844
2012-08-29 16:36:07 +00:00
Andrew Trick b57e225742 Cleanup sloppy code. Jakob's review.
llvm-svn: 162825
2012-08-29 04:41:37 +00:00
Andrew Trick bd0073ddd7 Fix ARM vector copies of overlapping register tuples.
I have tested the fix, but have not been successfull in generating
a robust unit test. This can only be exposed through particular
register assignments.

llvm-svn: 162821
2012-08-29 01:58:55 +00:00
Andrew Trick 4cc6949a2b cleanup
llvm-svn: 162820
2012-08-29 01:58:52 +00:00
Jakob Stoklund Olesen b3de7b1790 Revert r162713: "Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM."
This wasn't the right way to enforce ordering of atomics.

We are already setting the isVolatile bit on memory operands of atomic
operations which is good enough to enforce the correct ordering.

llvm-svn: 162732
2012-08-28 03:11:27 +00:00
Jakob Stoklund Olesen b24cb8c541 Add ATOMIC_LDR* pseudo-instructions to model atomic_load on ARM.
It is not safe to use normal LDR instructions because they may be
reordered by the scheduler. The ATOMIC_LDR pseudos have a mayStore flag
that prevents reordering.

Atomic loads are also prevented from participating in rematerialization
and load folding.

llvm-svn: 162713
2012-08-27 23:58:52 +00:00
Bill Wendling 988a47d7e5 Make sure we add the predicate after all of the registers are added.
<rdar://problem/12183003>

llvm-svn: 162703
2012-08-27 22:12:44 +00:00
Jakob Stoklund Olesen 74e6f9fc65 Add a missing def flag.
*** Bad machine code: Explicit definition marked as use ***
- function:    test_cos
- basic block: BB#0 L.entry (0x7ff2a2024fd0)
- instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def>
- operand 0:   %D11

llvm-svn: 162247
2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen 7b1a2e8f02 Avoid folding ADD instructions with FI operands.
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

llvm-svn: 162130
2012-08-17 20:55:34 +00:00
Tim Northover f66181530f Implement NEON domain switching for scalar <-> S-register vmovs on ARM
llvm-svn: 162094
2012-08-17 11:32:52 +00:00
Jakob Stoklund Olesen 0ea1fce6b4 Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

llvm-svn: 162061
2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen c19bf0282d Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

llvm-svn: 162060
2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen 6cb96120f1 Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
2012-08-15 22:16:39 +00:00
Anton Korobeynikov 3a4fdfeceb Recognize vst1.64 / vld1.64 with 3 and 4 regs as load from / store to stack stuff
(this corresponds by spilling/reloading regs in DTriple / DQuad reg classes).
No testcase, found by inspection.

llvm-svn: 161300
2012-08-04 13:22:14 +00:00
Anton Korobeynikov 218aaf6d04 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377

llvm-svn: 161299
2012-08-04 13:16:12 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Manman Ren 88a0d3313b ARM: fix typo in comments
llvm-svn: 160093
2012-07-11 23:47:00 +00:00
Manman Ren 34cb93e192 ARM: Fix optimizeCompare to correctly check safe condition.
It is safe if CPSR is killed or re-defined.
When we are done with the basic block, check whether CPSR is live-out.
Do not optimize away cmp if CPSR is live-out.

llvm-svn: 160090
2012-07-11 22:51:44 +00:00
Andrew Trick 21cca97d95 Revert accidental checkin.
My last checkin was apparently not the branch I intended. It was missing one change (added by chandlerc), and contained a spurious change.

llvm-svn: 159548
2012-07-02 19:12:29 +00:00
Andrew Trick f161e391f8 Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."
Reapplies r159406 with minor cleanup. The regressions appear to have been spurious.

llvm-svn: 159541
2012-07-02 18:10:42 +00:00
Manman Ren b1b3db6802 ARM: Clean up optimizeCompare in peephole, no functional change.
Use getUniqueVRegDef.
Replace a loop with existing interfaces: modifiesRegister and readsRegister.
Factor out code into inline functions and simplify the code.

llvm-svn: 159470
2012-06-29 22:06:19 +00:00
Manman Ren 6fa76dc0e0 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Andrew Trick 51a8cf77b8 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Andrew Trick 1f50152b2d Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Evan Cheng a75127871c Add a missing check to avoid dereference null. No sensible test case possible. Sorry. rdar://11745134
llvm-svn: 159236
2012-06-26 22:54:59 +00:00
Manman Ren 606953fbe7 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965
llvm-svn: 159166
2012-06-25 21:49:38 +00:00
Andrew Trick 77d0b88999 ARM scheduling fix: don't guess at implicit operand latency.
This is a minor drive-by fix with no robust way to unit test.
As an example see neon-div.ll:
SU(16):   %Q8<def> = VMOVLsv4i32 %D17, pred:14, pred:%noreg, %Q8<imp-use,kill>
 val SU(1): Latency=2 Reg=%Q8
...should be latency=1

llvm-svn: 158960
2012-06-22 02:50:33 +00:00
Andrew Trick 3ccb1b8cf9 ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

llvm-svn: 158959
2012-06-22 02:50:31 +00:00
Andrew Trick a5d24ca453 Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency.
llvm-svn: 158164
2012-06-07 19:42:04 +00:00
Andrew Trick 5b1cadf9f7 ARM getOperandLatency rewrite.
Match expectations of the new latency API. Cleanup and make the logic consistent.

llvm-svn: 158163
2012-06-07 19:42:00 +00:00
Andrew Trick 3564bdfa61 ARM getOperandLatency should return -1 for unknown, consistent with API
llvm-svn: 158162
2012-06-07 19:41:58 +00:00
Andrew Trick fb1a74c2b2 Fix ARM getInstrLatency logic to work with the current API.
llvm-svn: 158161
2012-06-07 19:41:55 +00:00
Andrew Trick 4544606c71 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Craig Topper 2fbd130a79 Mark a static table as const. Shrink opcode size in static tables to uint16_t. Simplify loop iterating over one of those tables. No functional change intended.
llvm-svn: 157367
2012-05-24 03:59:11 +00:00
David Blaikie 81a84bd841 Fix use of uninitialized variable.
Found by GCC's maybe-uninitialized.

llvm-svn: 156780
2012-05-14 21:48:19 +00:00
Manman Ren 0d5ec28ccc Add space before an open parenthesis in control flow statements.
llvm-svn: 156620
2012-05-11 15:36:46 +00:00
Manman Ren dc8ad0058f ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156599
2012-05-11 01:30:47 +00:00
Manman Ren b555b382bd Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.

llvm-svn: 156556
2012-05-10 18:49:43 +00:00
Manman Ren c860887b2d ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156550
2012-05-10 16:48:21 +00:00
Jakob Stoklund Olesen 0a5b72f0e4 Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

llvm-svn: 154033
2012-04-04 18:23:42 +00:00
Jakob Stoklund Olesen caa6bd273f Handle register copies for the new ARM register classes.
ARM recently gained DPair, DTriple, and DQuad register classes.
Update copyPhysReg() to handle copies in these register classes.

No test case, it is difficult to make the register allocator emit the
odd copies reliably. The missing DPair copy caused a failure on
partialsums in the nightly test suite.

<rdar://problem/11147997>

llvm-svn: 153686
2012-03-29 21:10:40 +00:00
Jakob Stoklund Olesen 9e512120b7 Spill DPair registers, not just QPR.
The arm_neon intrinsics can create virtual registers from the DPair
register class which allows both even-odd and odd-even D-register pairs.

This fixes PR12389.

llvm-svn: 153603
2012-03-28 21:20:32 +00:00
Evan Cheng a2b48d985b ARM has a peephole optimization which looks for a def / use pair. The def
produces a 32-bit immediate which is consumed by the use. It tries to 
fold the immediate by breaking it into two parts and fold them into the
immmediate fields of two uses. e.g
       movw    r2, #40885
       movt    r3, #46540
       add     r0, r0, r3
=>
       add.w   r0, r0, #3019898880
       add.w   r0, r0, #30146560
;
However, this transformation is incorrect if the user produces a flag. e.g.
       movw    r2, #40885
       movt    r3, #46540
       adds    r0, r0, r3
=>
       add.w   r0, r0, #3019898880
       adds.w  r0, r0, #30146560
Note the adds.w may not set the carry flag even if the original sequence
would.

rdar://11116189

llvm-svn: 153484
2012-03-26 23:31:00 +00:00
Craig Topper 5fa0caafc0 Prune includes and replace uses of ARMRegisterInfo.h with ARMBaeRegisterInfo.h
llvm-svn: 153422
2012-03-26 00:45:15 +00:00
Jim Grosbach 13a292cc74 ARM refactor more NEON VLD/VST instructions to use composite physregs
Register pair VLD1/VLD2 all-lanes instructions. Kill off more of the
pseudos as a result.

llvm-svn: 152150
2012-03-06 22:01:44 +00:00