Commit Graph

13402 Commits

Author SHA1 Message Date
Jakob Stoklund Olesen 25c4195ecc Cache basic block boundaries for faster RegMaskSlots access.
Provide API to get a list of register mask slots and bits in a basic
block.

llvm-svn: 150219
2012-02-10 01:26:29 +00:00
Jakob Stoklund Olesen aa06de2447 Optimize LiveIntervals::intervalIsInOneMBB().
No looping and binary searches necessary.

Return a pointer to the containing block instead of just a bool.

llvm-svn: 150218
2012-02-10 01:23:55 +00:00
Benjamin Kramer baa41d4175 Cache iterators. Some of these are expensive to create.
llvm-svn: 150214
2012-02-10 00:28:31 +00:00
Jakob Stoklund Olesen 4a6a0eec52 Add register mask support to RAGreedy.
This only adds the interference checks required for correctness.
We still need to take advantage of register masks for the
interference driven live range splitting.

llvm-svn: 150191
2012-02-09 18:25:05 +00:00
Lang Hames edeea175ad Preserve physreg kills in MachineBasicBlock::SplitCriticalEdge.
Failure to preserve kills was causing LiveIntervals to miss some EFLAGS live
ranges. Unfortunately I've been unable to reduce a good test case yet.

llvm-svn: 150152
2012-02-09 05:59:36 +00:00
Lang Hames 2176edf1c4 Fix kill flags when moving instructions using LiveIntervals::moveInstr(...).
llvm-svn: 150150
2012-02-09 04:45:38 +00:00
Lang Hames 95d6edeba8 Remove assertion. Not all use operands are reads.
llvm-svn: 150149
2012-02-09 04:39:48 +00:00
Andrew Trick f542675ae3 Improve TargetPassConfig. No intended functionality.
Split CodeGen into stages.
Distinguish between optimization and correctness.

llvm-svn: 150122
2012-02-09 00:40:55 +00:00
Andrew Trick c24e09b226 comment
llvm-svn: 150121
2012-02-09 00:40:52 +00:00
Jakob Stoklund Olesen 938b4d26f1 Erase dead copies that are clobbered by a call.
This does make a difference, at least when using RABasic.

llvm-svn: 150118
2012-02-09 00:19:08 +00:00
Jakob Stoklund Olesen 5d33291e8e Never delete instructions that define reserved registers.
I think this was already the intention, but DeadMachineInstructionElim
was accidentally tracking the liveness of reserved registers. Now,
instructions with reserved defs are never deleted.

This prevents the call stack adjustment instructions from getting
deleted when enabling register masks.

llvm-svn: 150116
2012-02-09 00:15:39 +00:00
Jakob Stoklund Olesen 8610a59de1 Handle register masks in MachineCopyPropagation.
For simplicity, treat calls with register masks as basic block
boundaries.  This means we can't copy propagate callee-saved registers
across calls, but I don't think that is a big deal.

llvm-svn: 150108
2012-02-08 22:37:35 +00:00
Andrew Trick 1fa5bcbe2a Codegen pass definition cleanup. No functionality.
Moving toward a uniform style of pass definition to allow easier target configuration.
Globally declare Pass ID.
Globally declare pass initializer.
Use INITIALIZE_PASS consistently.
Add a call to the initializer from CodeGen.cpp.
Remove redundant "createPass" functions and "getPassName" methods.

While cleaning up declarations, cleaned up comments (sorry for large diff).

llvm-svn: 150100
2012-02-08 21:23:13 +00:00
Andrew Trick c40815de62 Move pass configuration out of pass constructors: MachineLICM.
llvm-svn: 150099
2012-02-08 21:23:03 +00:00
Andrew Trick 5209c739cd whitespace
llvm-svn: 150098
2012-02-08 21:23:00 +00:00
Andrew Trick 3ed444a16a Move pass configuration out of pass constructors: StackSlotColoring.
llvm-svn: 150097
2012-02-08 21:22:57 +00:00
Andrew Trick df7e3769b5 Move pass configuration out of pass constructors: PostRAScheduler.
llvm-svn: 150096
2012-02-08 21:22:53 +00:00
Andrew Trick 58648e4e98 Move pass configuration out of pass constructors: BranchFolderPass
llvm-svn: 150095
2012-02-08 21:22:48 +00:00
Andrew Trick 9e761997d8 whitespace
llvm-svn: 150094
2012-02-08 21:22:43 +00:00
Andrew Trick dd37d52f95 Added TargetPassConfig::setOpt
llvm-svn: 150093
2012-02-08 21:22:39 +00:00
Andrew Trick 3a61b7862b Added Pass::createPass(ID) to handle pass configuration by ID
llvm-svn: 150092
2012-02-08 21:22:34 +00:00
Andrew Trick c044917a8a Move pass configuration out of pass constructors: TailDuplicate::PreRegAlloc
llvm-svn: 150091
2012-02-08 21:22:30 +00:00
Jakob Stoklund Olesen 0c1eea2922 Add Register mask support to RABasic.
When a virtual register is live across a call, limit the search space to
call-preserved registers.

llvm-svn: 150081
2012-02-08 18:54:35 +00:00
Jakob Stoklund Olesen 3ff74d8e62 Keep track of register masks in LiveIntervalAnalysis.
Build an ordered vector of register mask operands (i.e., calls) when
computing live intervals. Provide a checkRegMaskInterference() function
that computes a bit mask of usable registers for a live range.

This is a quick way of determining whether a live range crosses any calls,
and restricting it to the callee-saved registers if it does.
Previously, we had to discover call clobbers for each candidate register
independently.

llvm-svn: 150077
2012-02-08 17:33:45 +00:00
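
A minimal sketch of the idea above (a hypothetical helper, not the actual LiveIntervals API): keep the slot indices of all register mask operands (calls) in program order; a live range then crosses a call exactly when a binary search finds a mask slot inside one of its segments.

  #include <algorithm>
  #include <vector>

  struct Segment { unsigned Start, End; };        // half-open [Start, End)

  // MaskSlots must be sorted in increasing slot-index order.
  static bool crossesAnyCall(const std::vector<unsigned> &MaskSlots,
                             const std::vector<Segment> &LiveRange) {
    for (unsigned i = 0, e = LiveRange.size(); i != e; ++i) {
      const Segment &S = LiveRange[i];
      // First register mask slot at or after the segment start.
      std::vector<unsigned>::const_iterator I =
          std::lower_bound(MaskSlots.begin(), MaskSlots.end(), S.Start);
      if (I != MaskSlots.end() && *I < S.End)
        return true;                              // a call lies inside this segment
    }
    return false;
  }
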
Andrew Trick 3bc0e0c651 Added MachineInstr::isBundled() to check if an instruction is part of a bundle.
llvm-svn: 150044
2012-02-08 02:17:25 +00:00
Andrew Trick e57583ab19 misched: bug in debug output.
llvm-svn: 150043
2012-02-08 02:17:21 +00:00
Andrew Trick de9f8979c4 stale comment
llvm-svn: 150041
2012-02-08 02:17:16 +00:00
Devang Patel d925d1a8d7 Remove tabs.
llvm-svn: 150012
2012-02-07 23:33:58 +00:00
Andrew Trick 8e7b34c7fc Expose TargetPassConfig to PEI Pass
llvm-svn: 149927
2012-02-06 22:51:18 +00:00
Andrew Trick 34914910f5 Add TargetPassConfig to the PassManager for use inside passes
llvm-svn: 149926
2012-02-06 22:51:15 +00:00
Jakob Stoklund Olesen 537444ca37 Don't explicitly renumber slot indices.
We have automatic local renumbering now.

llvm-svn: 149920
2012-02-06 22:37:56 +00:00
Jakob Stoklund Olesen c805369fdc Make sure a reserved register has a live interval before merging.
llvm-svn: 149910
2012-02-06 21:52:18 +00:00
Bill Wendling 0aef16afd5 [unwind removal] Remove all of the code for the dead 'unwind' instruction. There
were no 'unwind' instructions being generated before this, so this is in effect
a no-op.

llvm-svn: 149906
2012-02-06 21:44:22 +00:00
Bill Wendling d5d95b0b51 [unwind removal] We no longer have 'unwind' instructions being generated, so
remove the code that handles them.

llvm-svn: 149901
2012-02-06 21:16:41 +00:00
Devang Patel 4488217f73 DebugInfo: Provide a new hook to encode relationship between a property and an ivar.
llvm-svn: 149874
2012-02-06 17:49:43 +00:00
Craig Topper accd351e56 Move some llvm_unreachable's from r149849 out of switch statements to satisfy -Wcovered-switch-default
llvm-svn: 149860
2012-02-06 08:17:43 +00:00
Duncan Sands ae22c60f90 Persuade GCC that there is nothing worth warning about here (there isn't).
llvm-svn: 149834
2012-02-05 14:20:11 +00:00
Nadav Rotem 4f4546b73a Add additional documentation to the extract-and-trunc dagcombine optimization.
llvm-svn: 149823
2012-02-05 11:39:23 +00:00
Craig Topper ee4dab5f1f Convert assert(0) to llvm_unreachable
llvm-svn: 149816
2012-02-05 08:31:47 +00:00
Chris Lattner cf9e8f6968 reapply the patches reverted in r149470 that reenable ConstantDataArray,
but with a critical fix to the SelectionDAG code that optimizes copies
from strings into immediate stores: the previous code was stopping reading
string data at the first nul.  Address this by adding a new argument to
llvm::getConstantStringInfo, preserving the behavior before the patch.

llvm-svn: 149800
2012-02-05 02:29:43 +00:00
Jakob Stoklund Olesen abb26bae4e Drop the REDEF_BY_EC VNInfo flag.
A live range that has an early clobber tied redef now looks like a
normal tied redef, except the early clobber def uses the early clobber
slot.

This is enough to handle any strange interference problems.

llvm-svn: 149769
2012-02-04 05:51:25 +00:00
Jakob Stoklund Olesen e386578121 Correctly terminate a physreg redefined by an early clobber.
I don't have a test that fails because of this, but a test case like
CodeGen/X86/2009-12-01-EarlyClobberBug.ll exposes the problem.  EAX is
redefined by a tied early clobber operand on inline asm, and the live
range should look like this:

  %EAX,inf = [48r,64e:0)[64e,80r:1)  0@48r 1@64e

Previously, the two values got merged:

  %EAX,inf = [48r,80r:0)  0@48r

With this bug fixed, the REDEF_BY_EC VNInfo flag is no longer needed.

llvm-svn: 149768
2012-02-04 05:41:20 +00:00
Nick Lewycky 6024f9b5be Fix a leak!
Andy, in a previous commit you made this into an ImmutablePass so that you could
add it to the PassManager, then in the next commit you left it a Pass but
removed the code that added it to the PM. If you do add it to the PM then the PM
should take care of deleting it, but it's also true that nothing in codegen
needs this object to exist after it's done its work here. It's not clear to me
which design you want; this should likely either cease to be a Pass or be added
to the PM where other parts of CodeGen will request it.

llvm-svn: 149765
2012-02-04 05:26:17 +00:00
Jakob Stoklund Olesen ad6b22eb16 Don't store COPY pointers in VNInfo.
If a value is defined by a COPY, that instruction can easily and cheaply
be found by getInstructionFromIndex(VNI->def).

This reduces the size of VNInfo from 24 to 16 bytes, and improves
llc compile time by 3%.

llvm-svn: 149763
2012-02-04 05:20:49 +00:00
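
A minimal sketch of the lookup described above (the surrounding helper and the 2012-era headers are assumptions): recover the defining copy on demand from the definition's slot index instead of storing a pointer in VNInfo.

  #include "llvm/CodeGen/LiveIntervalAnalysis.h"
  #include "llvm/CodeGen/MachineInstr.h"
  using namespace llvm;

  static MachineInstr *getDefiningCopy(LiveIntervals &LIS, const VNInfo *VNI) {
    if (!VNI || VNI->isUnused() || VNI->isPHIDef())
      return 0;                                   // no single defining instruction
    MachineInstr *MI = LIS.getInstructionFromIndex(VNI->def);
    return (MI && MI->isCopy()) ? MI : 0;         // only interested in COPYs
  }
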
Andrew Trick f8ea108c05 TargetPassConfig: confine the MC configuration to TargetMachine.
Passes prior to instruction selection are now split into separate configurable stages.
Header dependencies are simplified.
The bulk of this diff is simply removal of the silly DisableVerify flags.

Sorry for the target header churn. Attempting to stabilize them.

llvm-svn: 149754
2012-02-04 02:56:59 +00:00
Andrew Trick de401d3c29 Move TargetPassConfig implementation into Passes.cpp
llvm-svn: 149753
2012-02-04 02:56:48 +00:00
Andrew Trick b755133686 Make TargetPassConfig an ImmutablePass so CodeGenPasses can query options
llvm-svn: 149752
2012-02-04 02:56:45 +00:00
Devang Patel 403e819731 Emit new property tag.
llvm-svn: 149737
2012-02-04 01:30:32 +00:00
Chad Rosier 6d68c7cf79 [fast-isel] HandlePHINodesInSuccessorBlocks() can promote i8 and i16 types too.
llvm-svn: 149730
2012-02-04 00:39:19 +00:00
Jakob Stoklund Olesen 22e490d908 Trim headers.
llvm-svn: 149722
2012-02-03 23:51:15 +00:00
Jakob Stoklund Olesen f798a0a0e6 Delete some dead code.
llvm-svn: 149717
2012-02-03 21:32:06 +00:00
Jakob Stoklund Olesen 56fe2ed51e Handle register mask operands in setPhysRegsDeadExcept().
Calls that use register mask operands don't have implicit defs for
returned values.  The register mask operand handles the call clobber,
but it always behaves like a set of dead defs.

Add live implicit defs for any implicitly defined physregs that are
actually used.

llvm-svn: 149715
2012-02-03 21:23:14 +00:00
Jakob Stoklund Olesen 4290be4386 ArrayRef'ize MI::setPhysRegsDeadExcept().
llvm-svn: 149709
2012-02-03 20:43:39 +00:00
Jakob Stoklund Olesen f650732cab Handle all live physreg defs in the same place.
SelectionDAG has 4 different ways of passing physreg defs to users.
Collect all of the uses at the same time, and pass all of them to
MI->setPhysRegsDeadExcept() to mark the remaining defs dead.

The setPhysRegsDeadExcept() function will soon add the required
implicit-defs to instructions with register mask operands.

llvm-svn: 149708
2012-02-03 20:43:35 +00:00
Andrew Trick 99d316098e Initialize all common codegen passes before configuration so we can use their PassIDs.
llvm-svn: 149705
2012-02-03 20:14:47 +00:00
Nadav Rotem 5399f4d6bf The type-legalizer often scalarizes code. One of the common patterns is extract-and-truncate.
In this patch we optimize this pattern and convert the sequence into an extract op of a narrow type.
This allows the BUILD_VECTOR dag optimizations to construct efficient shuffle operations in many cases.

llvm-svn: 149692
2012-02-03 13:18:25 +00:00
Andrew Trick ccb673659a Added TargetPassConfig. The first little step toward configuring codegen passes.
Allows command line overrides to be centralized in LLVMTargetMachine.cpp.
LLVMTargetMachine can intercept common passes and give precedence to command line overrides.
Allows adding "internal" target configuration options without touching TargetOptions.
Encapsulates the PassManager.
Provides a good point to initialize all CodeGen passes so that Pass ID's can be used in APIs.
Allows modifying the target configuration hooks without rebuilding the world.

llvm-svn: 149672
2012-02-03 05:12:41 +00:00
Andrew Trick 808a7a6ce6 whitespace
llvm-svn: 149671
2012-02-03 05:12:30 +00:00
Akira Hatanaka f0b08445f6 Add a new MachineJumpTableInfo entry type, EK_GPRel64BlockAddress, which is
needed to emit a 64-bit gp-relative relocation entry. Make changes necessary
for emitting jump tables which have entries with directive .gpdword. This patch
does not implement the parts needed for direct object emission or JIT.

llvm-svn: 149668
2012-02-03 04:33:00 +00:00
Jakob Stoklund Olesen 5e1ac45b93 Require non-NULL register masks.
It doesn't seem worthwhile to give meaning to a NULL register mask
pointer. It complicates all the code using register mask operands.

llvm-svn: 149646
2012-02-02 23:52:57 +00:00
Lang Hames a808dc45cf Re-apply the coalescer fix from r149147. Commit r149597 should have fixed the llvm-gcc and clang self-host issues.
llvm-svn: 149598
2012-02-02 08:01:53 +00:00
Lang Hames 4d04f753bd Break as soon as the MustMapCurValNos flag is set - no need to reiterate.
llvm-svn: 149596
2012-02-02 06:55:45 +00:00
Lang Hames 3a20bc3652 PR11868. The previous loop in LiveIntervals::join would sometimes fall over if
more than two adjacent ranges needed to be merged. The new version should be
able to handle an arbitrary sequence of adjacent ranges.

llvm-svn: 149588
2012-02-02 05:37:34 +00:00
Andrew Trick 3441597f84 fix cmake
llvm-svn: 149553
2012-02-01 22:28:29 +00:00
Andrew Trick d06df96a7c VLIW specific scheduler framework that utilizes deterministic finite automaton (DFA).
This new scheduler plugs into the existing selection DAG scheduling framework. It is a top-down critical path scheduler that tracks register pressure and uses a DFA for pipeline modeling.

Patch by Sergei Larin!

llvm-svn: 149547
2012-02-01 22:13:57 +00:00
Stepan Dyatkovskiy 513aaa5691 SwitchInst refactoring.
The purpose of the refactoring is to hide operand roles from the SwitchInst user (the programmer). If you want to play with operands directly, you will probably need lower-level methods than the SwitchInst ones (TerminatorInst, or maybe User). After this patch we can reorganize SwitchInst operands and successors as we want.

What was done:

1. Changed the semantics of the index inside the getCaseValue method:
getCaseValue(0) means "get the first case", not the condition. Use getCondition() if you want to resolve the condition. I propose not mixing SwitchInst case indexing with low-level indexing (TerminatorInst successor indexing, User operand indexing), since it may be dangerous.
2. For the same reason, findCaseValue(ConstantInt*) now returns the actual case number: 0 means the first case, not the default. If there is no case with the given value, ErrorIndex is returned.
3. Added the getCaseSuccessor method. I propose avoiding TerminatorInst::getSuccessor if you want to resolve a case's successor BB. Use getCaseSuccessor instead, since the internal SwitchInst organization of operands/successors is hidden and may change at any moment.
4. Added resolveSuccessorIndex and resolveCaseIndex. The main purpose of these methods is to see how case successors are really mapped in TerminatorInst.
4.1 "resolveSuccessorIndex" was created for when you need to drop down from SwitchInst to TerminatorInst. It returns the TerminatorInst successor index for a given case successor.
4.2 "resolveCaseIndex" converts a low-level successor index to the case index that corresponds to the given successor.

Note: There are also related compatibility fix patches for dragonegg, klee, llvm-gcc-4.0, llvm-gcc-4.2, safecode, and clang.
llvm-svn: 149481
2012-02-01 07:49:51 +00:00
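
A minimal usage sketch of the case-indexed interface described above (assuming the accessors named in the message, and that getNumCases() counts only the explicit cases):

  #include "llvm/Instructions.h"   // 2012-era header layout
  using namespace llvm;

  // Find the block the switch transfers control to for a given constant,
  // without touching the raw operand/successor lists.
  static BasicBlock *findCaseTarget(SwitchInst *SI, ConstantInt *Val) {
    for (unsigned i = 0, e = SI->getNumCases(); i != e; ++i)
      if (SI->getCaseValue(i) == Val)             // index 0 is the first real case
        return SI->getCaseSuccessor(i);
    return SI->getDefaultDest();                  // no matching case
  }
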
Argyrios Kyrtzidis 17c981a45b Revert Chris' commits up to r149348 that started causing VMCoreTests unit test to fail.
These are:

r149348
r149351
r149352
r149354
r149356
r149357
r149361
r149362
r149364
r149365

llvm-svn: 149470
2012-02-01 04:51:17 +00:00
Andrew Trick 25c7b83a4b Obvious unnecessary loop removal. Follow through from previous checkin.
llvm-svn: 149398
2012-01-31 18:54:19 +00:00
Chris Lattner 8ea967d050 with recent changes, ConstantArray is never a "string". Remove the associated
methods and constant fold the clients to false.

llvm-svn: 149362
2012-01-31 06:05:00 +00:00
Andrew Trick 2b3c187489 RAFast: Generalize the logic for return operands.
This removes an implicit assumption about the form of MI coming into regalloc. In particular, it should be independent of ProcessImplicitDefs, which will eventually become a standard part of coming out of SSA--unless we can simply eliminate IMPLICIT_DEF completely. Current unit tests expose this once I remove incidental pass ordering restrictions.

This is not a final fix. Just a temporary workaround until I figure out the right way.

llvm-svn: 149360
2012-01-31 05:55:32 +00:00
Chris Lattner 997348e9fe remove the last vestiges of llvm::GetConstantStringInfo, in CodeGen.
llvm-svn: 149356
2012-01-31 05:09:17 +00:00
Chris Lattner 983005f51b rework this logic to not depend on the last argument to GetConstantStringInfo,
which is going away.

llvm-svn: 149348
2012-01-31 04:39:22 +00:00
Chris Lattner 0d3785e165 don't emit a 1-byte object as a .fill. This is silly and causes
CodeGen/X86/global-sections.ll to fail with CDArray

llvm-svn: 149343
2012-01-31 03:39:24 +00:00
Bill Wendling 8d9d1a0022 Remove the now-dead llvm.eh.exception and llvm.eh.selector intrinsics.
llvm-svn: 149331
2012-01-31 01:58:48 +00:00
Bill Wendling a4237652d2 Remove the eh.exception and eh.selector intrinsics. Also remove a hack to copy
over the catch information. The catch information is now tacked to the invoke
instruction.

llvm-svn: 149326
2012-01-31 01:46:13 +00:00
Eli Friedman 18a4c31525 Use the correct ShiftAmtTy for creating shifts after legalization. PR11881. Not committing a testcase because I think it will be too fragile.
llvm-svn: 149315
2012-01-31 01:08:03 +00:00
Chandler Carruth 2c469ff14a Chris's constant data sequence refactoring actually enabled vectors of
all-one bits to be printed more cleverly in the AsmPrinter.
Unfortunately, the byte value for all one bits is the same with
-fsigned-char as the error return of '-1'. Force this to be the unsigned
byte value when returning it to avoid this problem, and update the test
case for the shiny new behavior.

Yay for building LLVM and Clang with -funsigned-char.

Chris, please review, and let me know if there is any reason to not
desire this change. It seems good on the surface, and certainly intended
based on the code written.

llvm-svn: 149299
2012-01-30 23:47:44 +00:00
Matt Beaumont-Gay 9cc6d524ea Here's a new one: GCC was complaining about an only-used-in-asserts
*function*. Wrap the function in #ifndef NDEBUG.

llvm-svn: 149259
2012-01-30 19:26:20 +00:00
Chris Lattner 829400bb22 when verbose asm is on, print integers in ConstantDataSequentials just
like normal integers.

llvm-svn: 149223
2012-01-30 05:55:11 +00:00
Chris Lattner 51ebe14a36 don't lose tail padding on ConstantDataAggregate vec3's.
llvm-svn: 149222
2012-01-30 05:49:43 +00:00
Jakob Stoklund Olesen 198236edae Fix some scavenger performance issues.
- Don't call malloc+free in the very hot forward().
- Don't call isTiedToDefOperand().
- Don't create BitVector temporaries.
- Merge DeadRegs into KillRegs.
- Eliminate the early clobber checks, they were irrelevant to scavenging.
- Remove unnecessary code from -Asserts builds.

This speeds up ARM PEI by 3.4x and overall llc -O0 codegen time by 11%.

llvm-svn: 149189
2012-01-29 01:29:28 +00:00
Jakob Stoklund Olesen 8b1853ddef Avoid creating BitVector temporaries.
llvm-svn: 149188
2012-01-29 01:29:25 +00:00
Bill Wendling 8c09040854 Reapply r149159 with a fix to add to a PHI node with a non-null parent.
llvm-svn: 149164
2012-01-28 01:17:56 +00:00
Lang Hames d78ec25e65 Remove code that adds live ranges for dead defs. It seems to be breaking things.
llvm-svn: 149163
2012-01-28 01:17:01 +00:00
Bill Wendling fa5ad212f8 Revert r149159 until I can fix tests.
llvm-svn: 149162
2012-01-28 01:10:01 +00:00
Bill Wendling b4544544a6 Don't always create a separate block for the call to _Unwind_Resume.
Sometimes there is only one 'resume' instruction per function. In those
situations, we don't need a separate block for the call to _Unwind_Resume. In
fact, it adds a lot of overhead to code-gen if we do that -- especially at -O0.
If we have a single 'resume' instruction, just generate the call within that
block.
<rdar://problem/10694814>

llvm-svn: 149159
2012-01-28 00:47:18 +00:00
Lang Hames a6958c6c4e Silence warning about parens for && within ||
llvm-svn: 149152
2012-01-27 23:52:25 +00:00
Lang Hames ad33d5ace7 Add a "moveInstr" method to LiveIntervals. This can be used to move instructions
around within a basic block while maintaining live-intervals.

Updated ScheduleTopDownLive in MachineScheduler.cpp to use the moveInstr API
when reordering MIs.

llvm-svn: 149147
2012-01-27 22:36:19 +00:00
Lang Hames 6f108d9229 Backing out ill-considered 'refactor'.
llvm-svn: 149146
2012-01-27 21:43:32 +00:00
Lang Hames 376a2d3e99 Move some duplicate loops in the coalescer into their own function.
llvm-svn: 149144
2012-01-27 19:58:14 +00:00
Lang Hames 9e18b6672a Physreg dead defs should be handled too.
llvm-svn: 149118
2012-01-27 03:20:42 +00:00
Chris Lattner 0256be96f2 continue making the world safe for ConstantDataVector. At this point,
we should (theoretically) optimize and codegen ConstantDataVector as well
as ConstantVector.

llvm-svn: 149116
2012-01-27 03:08:05 +00:00
Bill Wendling 7717e9f4ae Place the GEP instructions nearer to the instructions which use them.
GEP instructions are there for the compiler and shouldn't really output much
code (if any at all). When a GEP is stored in the entry block, Fast ISel (for
one) will not know that it could fold it into further uses. For instance, inside
of the EH handling code. This results in a lot of unnecessary spills and loads
which bloat code and slow down pretty much everything.
<rdar://problem/10694814>

llvm-svn: 149114
2012-01-27 02:02:24 +00:00
Chris Lattner e32fad43b1 make sure the file's matching header is #include'd first.
llvm-svn: 149113
2012-01-27 01:47:28 +00:00
Chris Lattner 7ba81830ff Rewrite CanShareConstantPoolEntry to be implemented in terms of the
mid-level constant folding APIs instead of doing its own analysis.
This makes it more general (e.g. can now share a <2 x i64> with a
<4 x i32>) and avoid duplicating a bunch of logic.

llvm-svn: 149111
2012-01-27 01:46:00 +00:00
Lang Hames 286fed5c8c Rewrite instruction operands in AdjustCopiesBackFrom. Fixes PR11861.
llvm-svn: 149097
2012-01-27 00:05:42 +00:00
Chris Lattner 61a1d6cb81 progress making the world safe for ConstantDataVector. While
we're at it, allow PatternMatch's "neg" pattern to match integer
vector negations, and enhance ComputeNumSignBits to handle
shl of vectors.

llvm-svn: 149082
2012-01-26 21:37:55 +00:00
Chris Lattner 88fce10928 tidy up forward declarations.
llvm-svn: 149078
2012-01-26 20:44:57 +00:00
Chad Rosier 9b61cf3167 Update comment for r149070.
llvm-svn: 149075
2012-01-26 20:19:05 +00:00
Chad Rosier 1a1531d65e Replace the use of isPredicable() with isPredicated() in
MachineBasicBlock::canFallThrough().  We're interested in the state of the
instruction (i.e., is this a barrier or not?), not if the instruction is
predicable or not.
rdar://10501092

llvm-svn: 149070
2012-01-26 18:24:25 +00:00
Jakob Stoklund Olesen 8c139a5125 Clear kill flags before propagating a copy.
The live range of the source register may be extended when a redundant
copy is eliminated. Make sure any kill flags between the two copies are
cleared.

This fixes PR11765.

llvm-svn: 149069
2012-01-26 17:52:15 +00:00
James Molloy 6685c08e5f Add support for the R_ARM_TARGET1 relocation, which should be given to relocations applied to all C++ constructors and destructors.
This enables the linker to match concrete relocation types (absolute or relative) with whatever library or C++ support code is being linked against.

llvm-svn: 149057
2012-01-26 09:25:43 +00:00
Chris Lattner cf12970bd0 eliminate the Constant::getVectorElements method. There are now better (and
more robust) ways to do what it was doing.  Also, add static methods
for decoding a ShuffleVector mask.

llvm-svn: 149028
2012-01-26 02:51:13 +00:00
Jakob Stoklund Olesen 4864a81aa3 Improve sub-register def handling in ProcessImplicitDefs.
This boils down to using MachineOperand::readsReg() more.

This fixes PR11829 where a use ended up after the first def when
lowering REG_SEQUENCE instructions involving IMPLICIT_DEFs.

llvm-svn: 148996
2012-01-25 23:36:27 +00:00
Anton Korobeynikov 7722a2d4e3 Properly emit ctors / dtors with priorities into desired sections
and let the linker handle the rest.

This finally fixes PR5329

llvm-svn: 148990
2012-01-25 22:24:19 +00:00
Lang Hames f1508b78f9 Don't add live ranges for aliases of physregs that are live-in to the
function. They don't appear to be used, and are inconsistent with the handling of
other physreg intervals (i.e. intervals that are not live-in) where ranges are
not inserted for aliases.

llvm-svn: 148986
2012-01-25 22:11:06 +00:00
Lang Hames 19feb5f241 Always break upon finding a vreg operand (in Release as well as +Asserts). Remove assertion which can no longer trigger.
llvm-svn: 148984
2012-01-25 21:53:23 +00:00
Chris Lattner 47a86bdbe2 use ConstantVector::getSplat in a few places.
llvm-svn: 148929
2012-01-25 06:02:56 +00:00
Chris Lattner 9be59599b3 Use the right method to get the # elements in a CDS.
llvm-svn: 148897
2012-01-25 01:27:20 +00:00
Jakob Stoklund Olesen 1b8e437ab6 Set correct <def,undef> flags when lowering REG_SEQUENCE.
A REG_SEQUENCE instruction is lowered into a sequence of partial defs:

  %vreg7:ssub_0<def,undef> = COPY %vreg20:ssub_0
  %vreg7:ssub_1<def> = COPY %vreg2
  %vreg7:ssub_2<def> = COPY %vreg2
  %vreg7:ssub_3<def> = COPY %vreg2

The first def needs an <undef> flag to indicate it is the beginning of
the live range, while the other defs are read-modify-write.  Previously,
we depended on LiveIntervalAnalysis to notice and fix the missing
<def,undef>, but that solution was never robust, it was causing problems
with ProcessImplicitDefs and the lowering of chained REG_SEQUENCE
instructions.

This fixes PR11841.

llvm-svn: 148879
2012-01-24 23:28:42 +00:00
Jakob Stoklund Olesen 66ef9ad33f Use the standard MachineFunction::print() after SlotIndexes.
llvm-svn: 148878
2012-01-24 23:28:38 +00:00
Jakob Stoklund Olesen 5b9deabae8 Fix old doxygen comment.
llvm-svn: 148825
2012-01-24 18:09:18 +00:00
Chris Lattner 00245f420a add more support for ConstantDataSequential
llvm-svn: 148802
2012-01-24 13:41:11 +00:00
Evgeniy Stepanov 33a7e2f2a1 An option to selectively enable part of ARM EHABI support.
This change adds a new option --arm-enable-ehabi-descriptors that
enables emitting unwinding descriptors. This provides a mode with a
working backtrace() without the (currently broken) exception support.

llvm-svn: 148800
2012-01-24 13:05:33 +00:00
Benjamin Kramer 8aefffca4c Bit pack DIE structures better.
16 bits are sufficient to store attributes, tags and forms.

llvm-svn: 148799
2012-01-24 12:08:28 +00:00
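
A minimal sketch of the packing (hypothetical struct names, not the actual DIE classes): standard DWARF attribute, form, and tag codes all fit in 16 bits, so narrowing the fields shrinks each entry.

  #include <stdint.h>

  struct PackedAbbrevData {
    uint16_t Attribute;    // DW_AT_*
    uint16_t Form;         // DW_FORM_*
  };

  struct PackedAbbrev {
    uint16_t Tag;          // DW_TAG_*
    uint16_t ChildrenFlag; // DW_CHILDREN_yes / DW_CHILDREN_no
  };
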
Eric Christopher 3e8ccc2000 Remove generation of DW_AT_sibling. Nothing as far as I can tell uses it.
Saves about 1.5% on debug info size.

rdar://10278198

llvm-svn: 148794
2012-01-24 09:43:28 +00:00
Chris Lattner 5d4497bf4a Add AsmPrinter (aka MCLowering) support for ConstantDataSequential,
and clean up some other misc stuff.  Unlike ConstantArray, we will
prefer to emit .fill directives for "String" arrays that all have
the same value, since they are denser than emitting a .ascii

llvm-svn: 148793
2012-01-24 09:31:43 +00:00
Jakob Stoklund Olesen c46534a0cd Preserve <def,undef> flags in CoalesceExtSubRegs.
This won't have an effect until EliminateRegSequences() starts setting
the undef flags.

llvm-svn: 148779
2012-01-24 04:44:01 +00:00
Chandler Carruth ed975232bc Revert r148686 (and r148694, a fix to it) due to a serious layering
violation -- MC cannot depend on CodeGen.

Specifically, the MCTargetDesc component of each target is actually
a subcomponent of the MC library. As such, it cannot depend on the
target-independent code generator, because MC itself cannot depend on
the target-independent code generator. This change moved a flag from the
ARM MCTargetDesc file ARMMCAsmInfo.cpp to the CodeGen layer in
ARMException.cpp, leaving behind an 'extern' to refer back to it. That
layering order isn't viable given the constraints outlined above.
Commandline flags are designed to be static specifically to avoid these
types of bugs.

Fixing this is likely going to require some non-trivial refactoring.

llvm-svn: 148759
2012-01-24 00:30:17 +00:00
Bill Wendling 11eeeff24f Remove extraneous ';'s.
llvm-svn: 148740
2012-01-23 22:55:02 +00:00
Lang Hames 2f6377cafe copyImplicitOps is redundant here - the loop above already copies these ops.
llvm-svn: 148725
2012-01-23 21:15:01 +00:00
Jakob Stoklund Olesen 20948fab69 Fix PR11829. PostRA LICM was too aggressive.
This fixes a typo in r148589.

llvm-svn: 148724
2012-01-23 21:01:15 +00:00
Jakob Stoklund Olesen 9082353e3b Simplify debug output.
llvm-svn: 148723
2012-01-23 21:01:11 +00:00
Evgeniy Stepanov 482cdc4ebd An option to selectively enable parts of ARM EHABI support.
This change adds a new value to the --arm-enable-ehabi option that
disables emitting unwinding descriptors. This mode gives a working
backtrace() without the (currently broken) exception support.

llvm-svn: 148686
2012-01-23 07:57:39 +00:00
Anton Korobeynikov 0251f20ad1 Add an option to disable buggy copy propagation pass
llvm-svn: 148662
2012-01-22 14:08:34 +00:00
Evan Cheng 64a2beca52 Fix an obvious typo.
llvm-svn: 148622
2012-01-21 03:31:03 +00:00
Jakob Stoklund Olesen 8e3bb315d8 Handle register masks in LiveVariables.
A register mask operand kills any live physreg that isn't preserved.
Unlike an implicit-def operand, the clobbered physregs are never live
afterwards.

This means LiveVariables has to track a much smaller number of live
physregs, and it should spend much less time in addRegisterDead().

llvm-svn: 148609
2012-01-21 00:58:53 +00:00
Jakob Stoklund Olesen 52ee45d64a Delete an unused member variable.
llvm-svn: 148594
2012-01-20 22:48:59 +00:00
Jakob Stoklund Olesen 6b17ef58a1 Support register masks in MachineLICM.
Only PostRA LICM is affected.

llvm-svn: 148589
2012-01-20 22:27:12 +00:00
Jakob Stoklund Olesen 58614f2f5a Handle register masks in DeadMachineInstructionElim.
Don't track live physregs that are clobbered by a register mask operand.

llvm-svn: 148588
2012-01-20 22:27:09 +00:00
David Blaikie 46a9f016c5 More dead code removal (using -Wunreachable-code)
llvm-svn: 148578
2012-01-20 21:51:11 +00:00
Kostya Serebryany a5054ad2f3 Extend Attributes to 64 bits
Problem: LLVM needs more function attributes than currently available (32 bits).
One such proposed attribute is "address_safety", which shows that a function is being checked for address safety (by AddressSanitizer, SAFECode, etc).

Solution:
- extend the Attributes from 32 bits to 64-bits
- wrap the object into a class so that unsigned is never erroneously used instead
- change "unsigned" to "Attributes" throughout the code, including one place in clang.
- the class has no "operator uint64 ()", but it has "uint64_t Raw() " to support packing/unpacking.
- the class has "safe operator bool()" to support the common idiom:  if (Attributes attr = getAttrs()) useAttrs(attr);
- The CTOR from uint64_t is marked explicit, so I had to add a few explicit CTOR calls
- Add the new attribute "address_safety". Doing it in the same commit to check that attributes beyond first 32 bits actually work.
- Some of the functions from the Attribute namespace are worth moving inside the class, but I'd prefer to have it as a separate commit.

Tested:
"make check" on Linux (32-bit and 64-bit) and Mac (10.6)
built and ran SPEC CPU 2006 on Linux with clang -O2.


This change will break clang build in lib/CodeGen/CGCall.cpp.
The following patch will fix it.

llvm-svn: 148553
2012-01-20 17:56:17 +00:00
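
A minimal sketch of the wrapper idiom described in this message (not the exact class that landed): widen the storage to 64 bits and hide the raw integer so a plain unsigned can no longer be mixed in by accident.

  #include <stdint.h>

  class Attributes {
    uint64_t Bits;
  public:
    Attributes() : Bits(0) {}
    explicit Attributes(uint64_t Val) : Bits(Val) {}  // explicit: no silent conversions
    uint64_t Raw() const { return Bits; }             // only for packing/unpacking
    bool any() const { return Bits != 0; }            // stand-in for the "safe bool" test
  };

The intended use is the idiom quoted above, "if (Attributes attr = getAttrs()) useAttrs(attr);"; the sketch uses a named predicate instead of the safe-bool machinery for brevity.
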
Bill Wendling 9249261858 When lowering the 'resume' instruction, look to see if we can eliminate the
'insertvalue' instructions that recreate the structure returned by the
'landingpad' instruction. Because the 'insertvalue' instruction isn't supported
by FastISel, this can save a bit of time during -O0 compilation.

llvm-svn: 148520
2012-01-20 00:53:28 +00:00
Evan Cheng c2679b2958 More bundle related API additions.
llvm-svn: 148465
2012-01-19 07:47:03 +00:00
Evan Cheng d42aba53e6 Rewriter should definitely rewrite instructions inside bundles.
llvm-svn: 148464
2012-01-19 07:46:36 +00:00
Evan Cheng 6ca2272183 Enhance finalizeBundle to return end of bundle iterator because it makes sense.
llvm-svn: 148462
2012-01-19 06:13:10 +00:00
Evan Cheng 2879467d4e - Slight change to finalizeBundle() interface. LastMI is now exclusive (pointing
to the instruction right after the last instruction in the bundle).
- Add a finalizeBundle() variant that doesn't specify LastMI. Instead, the code
  will find the last instruction in the bundle by following the 'InsideBundle'
  marker. This is useful in case bundles are formed early (i.e. during MI
  scheduling) but finalized later (i.e. after register allocator has finished
  rewriting virtual registers with physical registers).

llvm-svn: 148444
2012-01-19 00:46:06 +00:00
Evan Cheng 1eb2bb2295 Rename Finalizebundle to finalizeBundle to conform to coding guideline.
llvm-svn: 148440
2012-01-19 00:06:10 +00:00
Jakob Stoklund Olesen 9349351d72 Add a RegisterMaskSDNode class.
This SelectionDAG node will be attached to call nodes by LowerCall(),
and eventually becomes a MO_RegisterMask MachineOperand on the
MachineInstr representing the call instruction.

LowerCall() will attach a register mask that depends on the calling
convention.

llvm-svn: 148436
2012-01-18 23:52:12 +00:00
Lang Hames 1997de0100 Fixed macro condition.
llvm-svn: 148408
2012-01-18 19:48:31 +00:00
Nadav Rotem 3b8f0cc9fa Fix a bug in the type-legalization of vector integers. When we bitcast one vector type to another, we must not bitcast the result if one type is widened while the other is promoted.
llvm-svn: 148383
2012-01-18 08:33:18 +00:00
Pete Cooper c52eeed310 Fix ISD::REG_SEQUENCE to accept physical registers and change TwoAddressInstructionPass to insert copies for any physical reg operands of the REG_SEQUENCE
llvm-svn: 148377
2012-01-18 04:16:16 +00:00
Nadav Rotem fb6ddee0e9 Transform: (EXTRACT_VECTOR_ELT( VECTOR_SHUFFLE )) -> EXTRACT_VECTOR_ELT.
llvm-svn: 148337
2012-01-17 21:44:01 +00:00
Craig Topper 02cb0fb136 Teach DAG combiner to turn a BUILD_VECTOR of UNDEFs into an UNDEF of vector type.
llvm-svn: 148297
2012-01-17 09:09:48 +00:00
Andrew Trick 7ccdc5c192 misched: Initial interface and implementation for ScheduleTopDownLive and ShuffleInstructions.
llvm-svn: 148291
2012-01-17 06:55:07 +00:00
Andrew Trick e1c034fefe Renamed MachineScheduler to ScheduleTopDownLive.
Responding to code review.

llvm-svn: 148290
2012-01-17 06:55:03 +00:00
Andrew Trick 8093eac51d Moving options declarations around.
More short term hackery until we have a way to configure passes that work on LiveIntervals.

llvm-svn: 148289
2012-01-17 06:54:59 +00:00
Rafael Espindola cbda0e255d Add 148175 back. I am unable to reproduce any non-determinism in a dragonegg
or clang bootstrap.

I will keep an eye on the bots.

Original message:
Only emit the Leh_func_endN symbol when needed.

llvm-svn: 148283
2012-01-17 04:19:20 +00:00
Pete Cooper e3d305a206 Changed flag operand of ISD::FP_ROUND to TargetConstant as it should not get checked for legalisation
llvm-svn: 148275
2012-01-17 01:54:07 +00:00
Lang Hames 818e1ffd74 Fix typo in comment.
llvm-svn: 148268
2012-01-17 00:39:29 +00:00
David Blaikie 486df738c3 Removing unused default switch cases in switches over enums that already account for all enumeration values explicitly.
(This time I believe I've checked all the -Wreturn-type warnings from GCC & added the couple of llvm_unreachables necessary to silence them. If I've missed any, I'll happily fix them as soon as I know about them)

llvm-svn: 148262
2012-01-16 23:24:27 +00:00
Hal Finkel 8606e3c7e3 AggressiveAntiDepBreaker needs to skip debug values because a debug value does not have a corresponding SUnit
llvm-svn: 148260
2012-01-16 22:53:41 +00:00
Jakob Stoklund Olesen 86ae07f049 Extract method for detecting constant unallocatable physregs.
It is safe to move uses of such registers.

llvm-svn: 148259
2012-01-16 22:34:08 +00:00
Jakob Stoklund Olesen 6de6d3e4ec Give better scavenger errors by invoking the verifier.
llvm-svn: 148251
2012-01-16 20:38:31 +00:00
Jakob Stoklund Olesen 374ed322f2 Add a new kind of MachineOperand: MO_RegisterMask.
Register masks will be used as a compact representation of large clobber
lists.  Currently, an x86 call instruction has some 40 operands
representing call-clobbered registers.  That's more than 1kB of useless
operands per call site.

A register mask operand references a bit mask of call-preserved
registers, everything else is clobbered.  The bit mask will typically
come from TargetRegisterInfo::getCallPreservedMask().

By abandoning ImplicitDefs for call-clobbered registers, it also becomes
possible to share call instruction descriptions between calling
conventions, and we can get rid of the WINCALL* instructions.

This patch introduces the new operand kind.  Future patches will add
RegMask support to target-independent passes before finally the fixed
clobber lists can be removed from call instruction descriptions.

llvm-svn: 148250
2012-01-16 19:22:00 +00:00
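
A minimal sketch of how such a mask can be queried, assuming the common encoding where a set bit means the physreg is preserved across the call and everything else is clobbered:

  #include <stdint.h>

  // RegMask points at ceil(NumRegs / 32) words, one bit per register number.
  static bool isPreservedByCall(const uint32_t *RegMask, unsigned PhysReg) {
    return (RegMask[PhysReg / 32] & (1u << (PhysReg % 32))) != 0;
  }

A pass walking a call's operands can then skip any physreg the attached mask reports as preserved, instead of scanning dozens of implicit operands.
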
David Blaikie 5d8e42755c Refactor variables unused under non-assert builds (& remove two entirely unused variables).
llvm-svn: 148230
2012-01-16 05:17:39 +00:00
Pete Cooper e85b95d754 Changed intrinsic ID operand to a target constant as it's not used in any arithmetic, so it should not be checked in legalisation
llvm-svn: 148228
2012-01-16 04:08:12 +00:00
Nadav Rotem 57935243bd [AVX] Optimize x86 VSELECT instructions using SimplifyDemandedBits.
We know that the blend instructions only use the MSB, so if the mask is
sign-extended then we can convert it into a SHL instruction. This is a
common pattern because the type-legalizer sign-extends the i1 type which
is used by the LLVM-IR for the condition.

Added a new optimization in SimplifyDemandedBits for SIGN_EXTEND_INREG -> SHL.

llvm-svn: 148225
2012-01-15 19:27:55 +00:00
Benjamin Kramer 339ced4e34 Return an ArrayRef from ShuffleVectorSDNode::getMask and push it through CodeGen.
llvm-svn: 148218
2012-01-15 13:16:05 +00:00
Benjamin Kramer 5a377e28da DAGCombiner: Deduplicate code.
llvm-svn: 148217
2012-01-15 11:50:43 +00:00
Craig Topper 201c1a3505 Truncate of undef is just undef of smaller size.
llvm-svn: 148205
2012-01-15 01:05:11 +00:00
Duncan Sands 90212bde1f Speculatively revert commit 148175 (rafael), to see if this fixes
non-determinism in the 32 bit dragonegg buildbot.  Original commit
message:
Only emit the Leh_func_endN symbol when needed.

llvm-svn: 148191
2012-01-14 17:16:48 +00:00
Rafael Espindola dfde7631fa Only emit the Leh_func_endN symbol when needed.
llvm-svn: 148175
2012-01-14 02:36:51 +00:00
Andrew Trick 59ac4fb706 misched: Initial code for building an MI level scheduling DAG
llvm-svn: 148174
2012-01-14 02:17:18 +00:00
Andrew Trick dbee9d8900 Move physreg dependency generation into aptly named addPhysRegDeps.
llvm-svn: 148173
2012-01-14 02:17:15 +00:00
Andrew Trick 1d028a364d misched: Added ScheduleDAGInstrs::IsPostRA
llvm-svn: 148172
2012-01-14 02:17:12 +00:00
Andrew Trick 7e120f4e66 misched: Invoke the DAG builder on each sequence of schedulable instructions.
llvm-svn: 148171
2012-01-14 02:17:09 +00:00
Andrew Trick 6344087e17 Move things around to make the file navigable, even though it will probably be split up later.
llvm-svn: 148170
2012-01-14 02:17:06 +00:00
Evan Cheng 6bb95253eb After r147827 and r147902, it's now possible for unallocatable registers to be
live across BBs before register allocation. This miscompiled 197.parser
when a cmp + b were optimized to a cbnz instruction even though the CPSR def
is live-in to a successor.
        cbnz    r6, LBB89_12
...
LBB89_12:
        ble     LBB89_1

The fix consists of two parts. 1) Teach LiveVariables that some unallocatable
registers might be liveouts so don't mark their last use as kill if they are.
2) ARM constantpool island pass shouldn't form cbz / cbnz if the conditional
branch does not kill CPSR.

rdar://10676853

llvm-svn: 148168
2012-01-14 01:53:46 +00:00
Rafael Espindola a693128778 Remove previous commit while I debug the bot failures.
llvm-svn: 148156
2012-01-13 23:28:50 +00:00
Rafael Espindola cef42c30a7 Remove label that is not used anymore.
llvm-svn: 148150
2012-01-13 22:41:58 +00:00
Andrew Trick f35c84032d Remove pointless mode line in .cpp file.
llvm-svn: 148143
2012-01-13 22:04:16 +00:00
Andrew Trick e77e84e4b7 Added the MachineSchedulerPass skeleton.
llvm-svn: 148105
2012-01-13 06:30:30 +00:00
Andrew Trick 4d4fef238a wrong filename
llvm-svn: 148103
2012-01-13 06:30:22 +00:00
Andrew Trick b1be1aa8f8 80-col violation
llvm-svn: 148102
2012-01-13 06:30:19 +00:00
Evan Cheng fa8326334b DAGCombine's logic for forming pre- and post-indexed loads / stores was being
overly conservative. It was concerned about cases where it would prohibit
folding simple [r, c] addressing modes. e.g.
  ldr r0, [r2]
  ldr r1, [r2, #4]
=>
  ldr r0, [r2], #4
  ldr r1, [r2]
Change the logic to look for such cases which allows it to form indexed memory
ops more aggressively.

rdar://10674430

llvm-svn: 148086
2012-01-13 01:37:24 +00:00
Bill Wendling 49c4dfb534 Revert accidental commit.
llvm-svn: 148065
2012-01-12 23:06:28 +00:00
Bill Wendling ee5eaebc58 Fix the code that was WRONG.
The registers are placed into the saved registers list in the reverse order,
which is why the original loop was written to loop backwards.

llvm-svn: 148064
2012-01-12 23:05:03 +00:00
Pete Cooper 99415fea87 Added FPOW, FEXP, FLOG to PromoteNode so that custom actions can be set to Promote for those operations.
Sorry, no test case yet

llvm-svn: 148050
2012-01-12 21:46:18 +00:00
Evan Cheng 5c03a6b8f5 When hoisting common code, watch out for uses which are marked "kill". If the
killed registers are needed below the insertion point, then unset the kill
marker.

Sorry I'm not able to find a reduced test case.

rdar://10660944

llvm-svn: 148043
2012-01-12 20:31:24 +00:00
Evan Cheng 09cc429cb1 Allow targets to select source order pre-RA scheduler.
llvm-svn: 148033
2012-01-12 18:27:52 +00:00
Jakob Stoklund Olesen 994fed689f Make SplitAnalysis::UseSlots private.
llvm-svn: 148031
2012-01-12 17:53:44 +00:00
Jakob Stoklund Olesen 20f19eb9ab Make data structures private.
llvm-svn: 147979
2012-01-11 23:19:08 +00:00
Jakob Stoklund Olesen 73edbf1682 Sink spillInterferences into RABasic.
This helper method is too simplistic for RAGreedy.

llvm-svn: 147976
2012-01-11 22:52:14 +00:00
Jakob Stoklund Olesen 06ec420347 Cleanup.
llvm-svn: 147975
2012-01-11 22:52:11 +00:00
Jakob Stoklund Olesen a818d804a1 Move RegAllocBase into its own cpp file separate from RABasic.
No functional change.

llvm-svn: 147972
2012-01-11 22:28:30 +00:00
Nadav Rotem b5ce6ee835 On AVX, we can load v8i32 at a time. The bug happens when two uneven loads are used.
When we load the v12i32 type, the GenWidenVectorLoads method generates two loads: v8i32 and v4i32 
and attempts to use CONCAT_VECTORS to join them. In this fix I concat undef values to widen 
the smaller value. The test "widen_load-2.ll" also exposes this bug on AVX.

llvm-svn: 147964
2012-01-11 20:19:17 +00:00
Chandler Carruth 55b2cdee26 Teach the X86 instruction selection to do some heroic transforms to
detect a pattern which can be implemented with a small 'shl' embedded in
the addressing mode scale. This happens in real code as follows:

  unsigned x = my_accelerator_table[input >> 11];

Here we have some lookup table that we look into using the high bits of
'input'. Each entity in the table is 4-bytes, which means this
implicitly gets turned into (once lowered out of a GEP):

  *(unsigned*)((char*)my_accelerator_table + ((input >> 11) << 2));

The shift right followed by a shift left is canonicalized to a smaller
shift right and masking off the low bits. That hides the shift right
which x86 has an addressing mode designed to support. We now detect
masks of this form, and produce the longer shift right followed by the
proper addressing mode. In addition to saving a (rather large)
instruction, this also reduces stalls in Intel chips on benchmarks I've
measured.

In order for all of this to work, one part of the DAG needs to be
canonicalized *still further* than it currently is. This involves
removing pointless 'trunc' nodes between a zextload and a zext. Without
that, we end up generating spurious masks and hiding the pattern.

llvm-svn: 147936
2012-01-11 08:41:08 +00:00
Jakob Stoklund Olesen 8b1d023a4a Detect when a value is undefined on an edge to a landing pad.
Consider this code:

int h() {
  int x;
  try {
    x = f();
    g();
  } catch (...) {
    return x+1;
  }
  return x;
}

The variable x is undefined on the first edge to the landing pad, but it
has the f() return value on the second edge to the landing pad.

SplitAnalysis::getLastSplitPoint() would assume that the return value
from f() was live into the landing pad when f() throws, which is of
course impossible.

Detect these cases, and treat them as if the landing pad wasn't there.
This allows spill code to be inserted after the function call to f().

<rdar://problem/10664933>

llvm-svn: 147912
2012-01-11 02:07:05 +00:00
Jakob Stoklund Olesen 67aec12409 Exclusively use SplitAnalysis::getLastSplitPoint().
Delete the alternative implementation in LiveIntervalAnalysis.

These functions computed the same thing, but SplitAnalysis caches the
result.

llvm-svn: 147911
2012-01-11 02:07:00 +00:00
Evan Cheng d9725a38d6 Avoid CSE of instructions which define physical registers across MBBs unless
the physical registers are not allocatable.

llvm-svn: 147902
2012-01-11 00:38:11 +00:00
Evan Cheng da46832e42 80 col violation.
llvm-svn: 147884
2012-01-10 22:27:32 +00:00
Chandler Carruth f3e8502cc1 Add 'llvm_unreachable' to pacify GCC's understanding of the constraints
of several newly un-defaulted switches. This also helps optimizers
(including LLVM's) recognize that every case is covered, and we should
assume as much.

llvm-svn: 147861
2012-01-10 18:08:01 +00:00
David Blaikie edbb58c577 Remove unnecessary default cases in switches that cover all enum values.
llvm-svn: 147855
2012-01-10 16:47:17 +00:00
Nadav Rotem 61bdf79035 Fix a bug in the legalization of shuffle vectors. When we emulate shuffles using BUILD_VECTORS we may be using a BV of different type. Make sure to cast it back.
llvm-svn: 147851
2012-01-10 14:28:46 +00:00
Evan Cheng 0be4144a68 Allow machine-cse to look across MBB boundary when cse'ing instructions that
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.

rdar://10660865

llvm-svn: 147827
2012-01-10 02:02:58 +00:00
Rafael Espindola 5cb98f1062 Remove the logging streamer.
llvm-svn: 147820
2012-01-10 00:40:39 +00:00
Evan Cheng 520730ff23 Avoid erasing copies from a reserved register unless the definition can be
safely proven not to have been clobbered. No small test case possible.

llvm-svn: 147751
2012-01-08 19:52:28 +00:00
Craig Topper 0515cd41e4 Replace some uses of hasNUsesOfValue(0, X) with !hasAnyUseOfValue(X)
llvm-svn: 147733
2012-01-07 18:31:09 +00:00
Craig Topper 43a1bd6ac7 Add some DAG combines for SUBC/SUBE. If nothing uses the carry/borrow out of subc, turn it into a sub. Turn (subc x, x) into 0 with no borrow. Turn (subc x, 0) into x with no borrow. Turn (subc -1, x) into (xor x, -1) with no borrow. Turn sube with no borrow in into subc.
llvm-svn: 147728
2012-01-07 09:06:39 +00:00
Jakob Stoklund Olesen 434fb37bb4 Optimize reserved register coalescing.
Reserved registers don't have proper live ranges, their LiveInterval
simply has a snippet of liveness for each def.  Virtual registers with a
single value that is a copy of a reserved register (typically %esp) can
be coalesced with the reserved register if the live range doesn't
overlap any reserved register defs.

When coalescing with a reserved register, don't modify the reserved
register live range.  Just leave it as a bunch of dead defs.  This
eliminates quadratic coalescer behavior in i386 functions with many
function calls.

PR11699

llvm-svn: 147726
2012-01-07 07:39:50 +00:00
Jakob Stoklund Olesen a8879087b5 Use the 'regalloc' debug tag for most register allocator tracing.
llvm-svn: 147725
2012-01-07 07:39:47 +00:00
Evan Cheng 6cc8d49885 Revert part of r147716. Looks like x87 instructions' kill markers are all messed
up, so the branch folding pass can't use the scavenger. :-(  This doesn't break
anything currently. It just means targets which do not carefully update kill
markers cannot run the post-ra scheduler (not new, it has always been the case).

We should fix this at some point since it's really hacky.

llvm-svn: 147719
2012-01-07 03:35:48 +00:00
Evan Cheng 00b1a3cd7e Added a late machine instruction copy propagation pass. This catches
opportunities that only present themselves after late optimizations
such as tail duplication, e.g.:
## BB#1:
        movl    %eax, %ecx
        movl    %ecx, %eax
        ret

The register allocator also leaves some of them around (due to false
dep between copies from phi-elimination, etc.)

This required some changes in codegen passes. The post-ra scheduler and the
pseudo-instruction expansion passes have been moved after branch folding
and tail merging. They used to run before branch folding because it did
not always update block live-ins. That's fixed now. The pass reordering makes
sense independently, since we want to properly schedule instructions after
branch folding / tail duplication.

rdar://10428165
rdar://10640363

llvm-svn: 147716
2012-01-07 03:02:36 +00:00
Andrew Trick ff4e2b7d23 Missing raw_ostream.h breaks MSVC build.
llvm-svn: 147703
2012-01-07 00:54:28 +00:00
Chad Rosier 73a3fab480 Add comment.
llvm-svn: 147696
2012-01-06 23:45:47 +00:00
Eric Christopher 8ea8e4fc76 Add a comment and ensure that anyone else looking at this code doesn't start
to bleed from the eyes.

llvm-svn: 147695
2012-01-06 23:03:37 +00:00
Eric Christopher 090fcc1a10 Use const vector references instead of a vector copy. Spotted by Devang.
llvm-svn: 147694
2012-01-06 23:03:34 +00:00
Eric Christopher 5a28a6ee2f Use -> instead of (*iter).
llvm-svn: 147693
2012-01-06 23:03:27 +00:00
Andrew Trick 85460d0d32 Tracing to help investigate issues with SjLj spill code.
llvm-svn: 147682
2012-01-06 21:16:27 +00:00
Eric Christopher 667a074be0 Fix a leak I noticed while reviewing the accelerator table changes. Passes
lldb testsuite.

rdar://10652330

llvm-svn: 147673
2012-01-06 19:35:04 +00:00
Eric Christopher 21bde87bf3 As part of the ongoing work in finalizing the accelerator tables, extend
the debug type accelerator tables to contain the tag and a flag
stating whether or not a compound type is a complete type.

rdar://10652330

llvm-svn: 147651
2012-01-06 04:35:23 +00:00
Benjamin Kramer 69eab4e0af Kill ObjectCodeEmitter and BinaryObject, they were unused and superseded by MC.
llvm-svn: 147618
2012-01-05 22:31:37 +00:00
Rafael Espindola afcf571ef9 Remove the old ELF writer.
llvm-svn: 147615
2012-01-05 22:07:43 +00:00
Chandler Carruth eab5029964 Remove an unused variable.
llvm-svn: 147605
2012-01-05 11:25:47 +00:00
Chandler Carruth e041a30bb9 Prevent a DAGCombine from firing where there are two uses of
a combined-away node and the result of the combine isn't substantially
smaller than the input, it's just canonicalized. This is the first part
of a significant (7%) performance gain for Snappy's hot decompression
loop.

llvm-svn: 147604
2012-01-05 11:05:55 +00:00
Andrew Trick 100af0adf7 Minor postra scheduler cleanup. It could result in more precise antidependence latency on ARM in exceedingly rare cases.
llvm-svn: 147594
2012-01-05 02:52:11 +00:00
Jakob Stoklund Olesen d19d3cab09 Freeze reserved registers before starting register allocation.
The register allocators don't currently support adding reserved
registers while they are running.  Extend the MRI API to keep track of
the set of reserved registers when register allocation started.

Target hooks like hasFP() and needsStackRealignment() can look at this
set to avoid reserving more registers during register allocation.

llvm-svn: 147577
2012-01-05 00:26:49 +00:00
Craig Topper f726e15f44 Allow vector shuffle normalizing to use concat vector even if the sources are commuted in the shuffle mask.
llvm-svn: 147527
2012-01-04 09:23:09 +00:00
Craig Topper 279c77b677 Implement VECTOR_SHUFFLE canonicalizations during DAG combine.
llvm-svn: 147525
2012-01-04 08:07:43 +00:00
Chris Lattner 6b77a07f75 Turn a few more inline asm errors into "emitErrors" instead of fatal errors.
Before we'd get:

$ clang t.c 
fatal error: error in backend: Invalid operand for inline asm constraint 'i'!

Now we get:

$ clang t.c
t.c:16:5: error: invalid operand for inline asm constraint 'i'!
    "movq         (%4), %%mm0\n"
    ^

Which at least gets us the inline asm that is the problem.

llvm-svn: 147502
2012-01-03 23:51:01 +00:00
Jakob Stoklund Olesen 4043d92872 Assert when reserved registers have been assigned.
This can only happen if the set of reserved registers changes during
register allocation.

<rdar://problem/10625436>

llvm-svn: 147486
2012-01-03 22:34:31 +00:00
Nadav Rotem 1e7dda13c8 Fix incorrect widening of the bitcast sdnode in case the incoming operand is integer-promoted.
llvm-svn: 147484
2012-01-03 22:12:28 +00:00
Owen Anderson fcc041eabf Remove the restriction that target intrinsics can only involve legal types. Targets can perfectly well support intrinsics on illegal types, as long as they are prepared to perform custom expansion during type legalization. For example, a target where i64 is illegal might still support the i64 intrinsic operation using pairs of i32's. ARM already does some expansions like this for non-intrinsic operations.
llvm-svn: 147472
2012-01-03 20:09:02 +00:00
Lang Hames c405ac4429 Clarified assert text.
llvm-svn: 147471
2012-01-03 20:05:57 +00:00
Nick Lewycky bc26b2d162 Fix typo in ruler. No functionality change.
llvm-svn: 147454
2012-01-03 18:22:43 +00:00
Elena Demikhovsky 8ec21a2801 Fixed a bug in SelectionDAG.cpp.
The failure was seen on win32, where the i64 type is illegal.
It happens during the conversion of VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147445
2012-01-03 11:59:04 +00:00
Rafael Espindola d3df940169 Revert 147399. It broke CodeGen/ARM/vext.ll.
llvm-svn: 147400
2012-01-01 17:36:23 +00:00
Elena Demikhovsky 67f80c3432 Fixed a bug in SelectionDAG.cpp.
The failure was seen on win32, where the i64 type is illegal.
It happens during the conversion of VECTOR_SHUFFLE to BUILD_VECTOR.

The failure message is:
llc: SelectionDAG.cpp:784: void VerifyNodeCommon(llvm::SDNode*): Assertion `(I->getValueType() == EltVT || (EltVT.isInteger() && I->getValueType().isInteger() && EltVT.bitsLE(I->getValueType()))) && "Wrong operand type!"' failed.

I added a special test that checks vector shuffle on win32.

llvm-svn: 147399
2012-01-01 16:22:47 +00:00
Nadav Rotem 3c3dd6e588 PR11662.
Promotion of the mask operand needs to be done using PromoteTargetBoolean, and not padded with garbage.

llvm-svn: 147309
2011-12-28 13:08:20 +00:00
Eli Friedman e96286cdf2 Make sure DAGCombiner doesn't introduce multiple loads from the same memory location. PR10747, part 2.
llvm-svn: 147283
2011-12-26 22:49:32 +00:00
Nadav Rotem c1faeac410 Fix a typo in the widening of vectors in PromoteIntRes. Patch by Shemer Anat.
llvm-svn: 147272
2011-12-25 20:01:38 +00:00
Dylan Noblesmith 9e5b178ecc drop unneeded config.h includes
llvm-svn: 147197
2011-12-22 23:04:07 +00:00
Pete Cooper 1c3b1efa58 Hoisted some loop invariant smallvector lookups out of a MachineLICM loop
llvm-svn: 147127
2011-12-22 02:13:25 +00:00
Pete Cooper 1eed5b51e8 Changed MachineLICM to use a worklist like MachineCSE instead of recursion.
Fixes <rdar://problem/10584116>

llvm-svn: 147125
2011-12-22 02:05:40 +00:00
Jakub Staszak 9061616f9e Revert patch from 147090. There is no point in making code less readable if we
don't get any serious benefit from it.

llvm-svn: 147101
2011-12-21 23:02:08 +00:00
Jakub Staszak df5133455f - Change a few operator[] calls to lookup(), which is cheaper.
- Add some constantness.

llvm-svn: 147090
2011-12-21 20:18:54 +00:00
Lang Hames e49fbd0755 Oops - LiveIntervalUnion.cpp does use std::find. Moving the STL header include to LiveIntervalUnion.cpp.
llvm-svn: 147089
2011-12-21 20:16:11 +00:00
Lang Hames 93176d72e7 Remove disused STL header include.
llvm-svn: 147088
2011-12-21 20:12:54 +00:00
Jakob Stoklund Olesen 3588a43e3a Move common code into an MRI function.
llvm-svn: 147071
2011-12-21 19:50:05 +00:00
Lang Hames 6cee53d06e Fix assert condition.
llvm-svn: 146987
2011-12-20 20:23:40 +00:00
Jakub Staszak 96f8c551e3 Add some constantness to BranchProbabilityInfo and BlockFrequencyInfo.
llvm-svn: 146986
2011-12-20 20:03:10 +00:00
Chandler Carruth e805b16e3d Fix up the CMake build for the new files added in r146960; they're
likely to stay either way that discussion ends up resolving itself.

llvm-svn: 146966
2011-12-20 08:42:11 +00:00
David Blaikie a379b18173 Unweaken vtables as per http://llvm.org/docs/CodingStandards.html#ll_virtual_anch
llvm-svn: 146960
2011-12-20 02:50:00 +00:00
Dan Gohman 94580ab375 Add basic generic CodeGen support for half.
llvm-svn: 146927
2011-12-20 00:02:33 +00:00
Evan Cheng 4266a79351 Add a if-conversion optimization that allows 'true' side of a diamond to be
unpredicated. That is, turn
 subeq  r0, r1, #1
 addne  r0, r1, #1                                                                                                                                                                                                     
into
 sub    r0, r1, #1
 addne  r0, r1, #1

For targets where conditional instructions are always executed, this may be
beneficial. It may remove pseudo anti-dependencies on out-of-order execution
CPUs, e.g.
 op    r1, ...
 str   r1, [r10]        ; end-of-life of r1 as div result
 cmp   r0, #65
 movne r1, #44  ; raw dependency on previous r1
 moveq r1, #12

If movne is unpredicated, then
 op    r1, ...
 str   r1, [r10]
 cmp   r0, #65
 mov   r1, #44  ; r1 written unconditionally
 moveq r1, #12

Both mov and moveq are no longer dependent on the first instruction. This gives
the out-of-order execution engine more freedom to reorder them.

This has passed the entire LLVM test suite, but it has not been enabled for any ARM
variant pending more performance evaluation.

rdar://8951196

llvm-svn: 146914
2011-12-19 22:01:30 +00:00
Eli Friedman 5bb6826fdc Attempt to fix PR11607 by shuffling around which class defines which methods.
llvm-svn: 146897
2011-12-19 20:06:03 +00:00
Jakob Stoklund Olesen 8f9c6c4ad0 Handle sub-register operands in recomputeRegClass().
Now that getMatchingSuperRegClass() returns accurate results, it can be
used to compute constraints imposed by instructions using a sub-register
of a virtual register.

This means we can recompute the register class of any virtual register
by combining the constraints from all its uses.

llvm-svn: 146874
2011-12-19 16:53:37 +00:00
Joerg Sonnenberger d6cb7649d8 Allow inlining of functions with returns_twice calls, if they have the
attribute themselves.

llvm-svn: 146851
2011-12-18 20:35:43 +00:00
Rafael Espindola d3df3d3527 Add back the MC bits of 126425. Original patch by Nathan Jeffords. I added the
asm parsing and testcase.

llvm-svn: 146801
2011-12-17 01:14:52 +00:00
Eric Christopher da011dd0e3 Resolve part of a fixme and add a new one.
llvm-svn: 146784
2011-12-16 23:42:42 +00:00
Eric Christopher 03faed3eac Add a fixme here.
llvm-svn: 146783
2011-12-16 23:42:38 +00:00
Eric Christopher 365d083585 Extraneous whitespace and 80-col.
llvm-svn: 146780
2011-12-16 23:42:31 +00:00
Nick Lewycky c9e935c7e2 Move parts of lib/Target that use CodeGen into lib/CodeGen.
llvm-svn: 146702
2011-12-15 22:58:58 +00:00
Devang Patel 7bbc1e56f5 Update DebugLoc while merging nodes at -O0.
Patch by Kyriakos Georgiou!

llvm-svn: 146670
2011-12-15 18:21:18 +00:00
Eli Friedman 2ec824966d Don't try to form FGETSIGN after legalization; it is possible in some cases, but the existing code can't do it correctly. PR11570.
llvm-svn: 146630
2011-12-15 02:07:20 +00:00
Owen Anderson e7f329fa7a Enable synthesis of FLOG2 and FEXP2 SelectionDAG nodes from libm calls. These are already marked as illegal by default.
llvm-svn: 146623
2011-12-15 00:54:12 +00:00
Dan Gohman 75d7d5e988 Move Instruction::isSafeToSpeculativelyExecute out of VMCore and
into Analysis as a standalone function, since there's no need for
it to be in VMCore. Also, update it to use isKnownNonZero and
other goodies available in Analysis, making it more precise,
enabling more aggressive optimization.

llvm-svn: 146610
2011-12-14 23:49:11 +00:00
Devang Patel c268688643 Do not sink an instruction if it is not profitable.
On ARM, the peephole optimization for ABS creates a trivial CFG triangle which tempts machine sinking to sink instructions into what is really straight-line code. Sometimes this sinking may alter the register allocator's input such that the use and def of a register are divided by a branch in between, which may result in extra spills. Machine sinking now avoids sinking if the final sink destination is a post-dominator.

Radar 10266272.

llvm-svn: 146604
2011-12-14 23:20:38 +00:00
Bill Wendling b108aaebbe Reapply r146481 with a fix to create the Builder value in the correct place and
with the correct iterator.
<rdar://problem/10530851>

llvm-svn: 146600
2011-12-14 22:45:33 +00:00
Evan Cheng da103bf9ec Model ARM predicated write as read-mod-write. e.g.
r0 = mov #0
r0 = moveq #1

Then the second instruction has an implicit data dependency on the first
instruction. Sadly I have yet to come up with a small test case that
demonstrates the post-ra scheduler taking advantage of this.

llvm-svn: 146583
2011-12-14 20:00:08 +00:00
NAKAMURA Takumi 4c5ab7bb38 llvm/lib/CodeGen: Fix cmake build since r146542.
llvm-svn: 146550
2011-12-14 03:50:53 +00:00
Eli Friedman 6512cd4366 Add missing cases to SDNode::getOperationName(). Patch by Micah Villmow.
llvm-svn: 146548
2011-12-14 02:28:54 +00:00
Evan Cheng 87975df580 Allow target to specify register output dependency. Still default to one.
llvm-svn: 146547
2011-12-14 02:28:53 +00:00
Bill Wendling 2be88f1301 Revert r146481 to review possible miscompilations.
llvm-svn: 146546
2011-12-14 02:18:26 +00:00
Evan Cheng 7fae11b231 - Add MachineInstrBundle.h and MachineInstrBundle.cpp. This includes a function
to finalize MI bundles (i.e., add the BUNDLE instruction and compute the register def
  and use lists of the BUNDLE instruction) and a pass to unpack bundles.
- Teach more of the MachineBasicBlock and MachineInstr methods to be bundle aware.
- Switch Thumb2 IT block to MI bundles and delete the hazard recognizer hack to
  prevent IT blocks from being broken apart.

llvm-svn: 146542
2011-12-14 02:11:42 +00:00
Nick Lewycky cfde1a26b4 DW_AT_virtuality is also defined to be constant, not flag.
llvm-svn: 146534
2011-12-14 00:56:07 +00:00
Chad Rosier b941674aa4 [fast-isel] Remove SelectInsertValue() as fast-isel wasn't designed to handle
instructions that define aggregate types.

llvm-svn: 146492
2011-12-13 17:45:06 +00:00
Bill Wendling 2f1d93ffe0 Avoid using the 'insertvalue' instruction here.
Fast ISel isn't able to handle 'insertvalue' and it causes a large slowdown
during -O0 compilation. We don't necessarily need to generate an aggregate of
the values here if they're just going to be extracted directly afterwards.
<rdar://problem/10530851>

llvm-svn: 146481
2011-12-13 09:22:43 +00:00
Nick Lewycky cb91849fc7 DW_AT_accessibility is "constant" class, not form class, so it may not use
DW_FORM_flag. Use DW_FORM_data1 for one byte.

llvm-svn: 146475
2011-12-13 05:09:11 +00:00
Chandler Carruth 637cc6a8aa Initial CodeGen support for CTTZ/CTLZ where a zero input produces an
undefined result. This adds new ISD nodes for the new semantics,
selecting them when the LLVM intrinsic indicates that the undef behavior
is desired. The new nodes expand trivially to the old nodes, so targets
don't actually need to do anything to support these new nodes besides
indicating that they should be expanded. I've done this for all the
operand types that I could figure out for all the targets. Owners of
various targets, please review and let me know if any of these are
incorrect.

Note that the expand behavior is *conservatively correct*, and exactly
matches LLVM's current behavior with these operations. Ideally this
patch will not change behavior in any way. For example the regtest suite
finds the exact same instruction sequences coming out of the code
generator. That's why there are no new tests here -- all of this is
being exercised by the existing test suite.

Thanks to Duncan Sands for reviewing the various bits of this patch and
helping me get the wrinkles ironed out with expanding for each target.
Also thanks to Chris for clarifying through all the discussions that
this is indeed the approach he was looking for. That said, there are
likely still rough spots. Further review much appreciated.

llvm-svn: 146466
2011-12-13 01:56:10 +00:00
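As a rough illustration of why the conservative expansion described above is trivial, here is a small standalone C++ sketch, not the actual ISD lowering: a count-trailing-zeros that is fully defined at zero also satisfies the zero-undef variant, so the new node can simply fall back to the old one.

  #include <cstdint>

  // Fully defined cttz: returns 32 for an input of 0.
  unsigned cttz32(uint32_t x) {
    if (x == 0) return 32;
    unsigned n = 0;
    while ((x & 1) == 0) { x >>= 1; ++n; }
    return n;
  }

  // The "zero input is undefined" flavor may assume x != 0, so expanding it
  // to the defined version is always a correct (conservative) choice.
  unsigned cttz32_zero_undef(uint32_t x) {
    return cttz32(x);
  }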
Chad Rosier 2f8347e0b6 [fast-isel] Guard "exhaustive" fast-isel output with -fast-isel-verbose2.
llvm-svn: 146453
2011-12-13 00:05:11 +00:00
Daniel Dunbar 8889bb08b8 LLVMBuild: Introduce a common section which currently has a list of the
subdirectories to traverse into.
 - Originally I wanted to avoid this and just autoscan, but this has one key
   flaw in that new subdirectories can not automatically trigger a rerun of the
   llvm-build tool. This is particularly a pain when switching back and forth
   between trees where one has added a subdirectory, as the dependencies will
   tend to be wrong. This also eliminates the FIXME implicitly.

llvm-svn: 146436
2011-12-12 22:45:54 +00:00
Pete Cooper 76e4bc4e26 Fixed register allocator splitting a live range on a spilling variable.
If we create new intervals for a variable that is being spilled, then those new intervals are not guaranteed to also spill.  This means that anything reading from the original spilling value might not get the correct value if spills were missed.

Fixes <rdar://problem/10546864>

llvm-svn: 146428
2011-12-12 22:16:27 +00:00
Daniel Dunbar 27a7489a03 LLVMBuild: Remove trailing newline, which irked me.
llvm-svn: 146409
2011-12-12 19:48:00 +00:00
Chad Rosier 3168cabef1 [fast-isel] SelectInsertValue seems to be causing miscompiles for ARM. Disable while I investigate.
llvm-svn: 146331
2011-12-10 21:27:40 +00:00
Chad Rosier f70174b869 Typo.
llvm-svn: 146327
2011-12-10 19:48:51 +00:00
Chad Rosier dd998ff4df [fast-isel] Add support for selecting insertvalue.
rdar://10530851

llvm-svn: 146276
2011-12-09 20:09:54 +00:00
Evan Cheng feb9f27de1 Move isUnpredicatedTerminator() default implementation to TargetInstrInfoImpl to break Target's dependency on CodeGen.
llvm-svn: 146247
2011-12-09 06:41:08 +00:00
Devang Patel 706574a994 Fix comment.
llvm-svn: 146226
2011-12-09 01:25:04 +00:00
Devang Patel 2f9a0e1b86 Update stale comment.
llvm-svn: 146220
2011-12-09 01:18:48 +00:00
Eli Friedman 053a724483 Fix a couple of logic bugs in TargetLowering::SimplifyDemandedBits. PR11514.
llvm-svn: 146219
2011-12-09 01:16:26 +00:00
Devang Patel 202cf2f6fc Revert r146184. I am seeing a performance regression caused by this patch in one test case.
llvm-svn: 146205
2011-12-08 23:52:00 +00:00
Owen Anderson bb15fec2b8 Enhance both TargetLibraryInfo and SelectionDAGBuilder so that the latter can use the former to prevent the formation of libm SDNodes when -fno-builtin is passed.
llvm-svn: 146193
2011-12-08 22:15:21 +00:00
Devang Patel b94c9a47e9 Refactor. No intentional functionality change.
llvm-svn: 146187
2011-12-08 21:48:01 +00:00
Chad Rosier 0464869922 Add rather verbose stats for fast-isel failures.
llvm-svn: 146186
2011-12-08 21:37:10 +00:00
Devang Patel 1a3c1697f9 Filter "sink to" candidate blocks sooner. This avoids unnecessary computation to determine whether the block dominates all uses or not.
llvm-svn: 146184
2011-12-08 21:33:23 +00:00
Owen Anderson 0b9b9da6c8 Teach SelectionDAG to match more calls to libm functions onto existing SDNodes. Mark these nodes as illegal by default, unless the target declares otherwise.
llvm-svn: 146171
2011-12-08 19:32:14 +00:00
Evan Cheng cdf89fdeaf Make MachineInstr instruction property queries more flexible. This change allows
clients to decide whether to look inside bundled instructions and whether
the query should return true if any / all bundled instructions have the
queried property.

llvm-svn: 146168
2011-12-08 19:23:10 +00:00
Nadav Rotem 26edb291ac Fix a bug in the integer-promotion of bitcast operations on vector types.
We must not issue a bitcast operation for integer-promotion of vector types, because the
location of the values in the vector may be different.

llvm-svn: 146150
2011-12-08 13:10:01 +00:00
Pete Cooper a48e753103 Reverting r145899 as it breaks clang self-hosting
llvm-svn: 146136
2011-12-08 03:24:10 +00:00
Eli Friedman d5c173fad0 Make sure we correctly set LiveRegGens when a call is unscheduled. <rdar://problem/10460321>. No testcase because this is very sensitive to scheduling.
llvm-svn: 146087
2011-12-07 22:24:28 +00:00
Eli Friedman 0bdc083e21 Fix an assertion in the scheduler. PR11386. No testcase included because it's rather delicate.
llvm-svn: 146083
2011-12-07 22:06:02 +00:00
Nick Lewycky d63851eb93 These global variables aren't thread-safe; STATISTIC is. Andy Trick tells me
that he isn't using these any more, so just delete them.

llvm-svn: 146076
2011-12-07 21:35:59 +00:00
Jakub Staszak 190c712f73 Remove unneeded semicolon.
Skip two lookups in BlockChain.

llvm-svn: 146053
2011-12-07 19:46:10 +00:00
Evan Cheng 7f8e563a69 Add bundle aware API for querying instruction properties and switch the code
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.

For properties like mayLoad / mayStore, look into the bundle, and if any of the
bundled instructions has the property, return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.

llvm-svn: 146026
2011-12-07 07:15:52 +00:00
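The any/all split described above can be sketched with standard algorithms. This is an illustrative stand-in, not the MachineInstr API; the BundledInstr type and its fields are made up for the example:

  #include <algorithm>
  #include <vector>

  struct BundledInstr { bool MayLoad; bool Predicable; };

  // mayLoad-style query: true if *any* instruction in the bundle loads.
  bool bundleMayLoad(const std::vector<BundledInstr> &Bundle) {
    return std::any_of(Bundle.begin(), Bundle.end(),
                       [](const BundledInstr &MI) { return MI.MayLoad; });
  }

  // isPredicable-style query: true only if *all* bundled instructions are predicable.
  bool bundleIsPredicable(const std::vector<BundledInstr> &Bundle) {
    return std::all_of(Bundle.begin(), Bundle.end(),
                       [](const BundledInstr &MI) { return MI.Predicable; });
  }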
Eli Friedman f9081a8afe Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target.
llvm-svn: 146015
2011-12-07 03:55:52 +00:00
Jakob Stoklund Olesen 6ad6848522 Add missing check.
llvm-svn: 146004
2011-12-07 01:08:22 +00:00
Eli Friedman ed8b3e38ec Support vector bitcasts in the AsmPrinter. PR11495.
llvm-svn: 146001
2011-12-07 00:50:54 +00:00
Jakob Stoklund Olesen b0d91abec0 Add MachineOperand IsInternalRead flag.
This flag is used when bundling machine instructions.  It indicates
whether the operand reads a value defined inside or outside its bundle.

llvm-svn: 145997
2011-12-07 00:22:07 +00:00
Eli Friedman 0e58cba286 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
llvm-svn: 145996
2011-12-07 00:11:56 +00:00
Jakub Staszak c007ab8551 Remove unneeded type.
llvm-svn: 145995
2011-12-07 00:08:00 +00:00
Jakub Staszak d4d2b05eba - Remove unneeded #includes.
- Remove unused types/fields.
- Add some constantness.

llvm-svn: 145993
2011-12-06 23:59:33 +00:00
Evan Cheng 2a81dd4a3c First chunk of MachineInstr bundle support.
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs

llvm-svn: 145975
2011-12-06 22:12:01 +00:00
Jakob Stoklund Olesen 2a2b37ea4a Pretty-print basic block alignment.
llvm-svn: 145965
2011-12-06 21:08:39 +00:00
Sebastian Pop ac35a4d0f7 use space star instead of star space
llvm-svn: 145944
2011-12-06 17:34:16 +00:00
Sebastian Pop 9aa6137d97 add missing periods at the end of sentences
llvm-svn: 145943
2011-12-06 17:34:11 +00:00
Evan Cheng c1610bede1 Fix some minor misuse of MachineBasicBlock iterator.
llvm-svn: 145903
2011-12-06 02:49:06 +00:00
Pete Cooper d2971264c6 Removed isWinToJoinCrossClass from the register coalescer.
The new register allocator is much more able to split back up ranges too constrained by register classes.

Fixes <rdar://problem/10466609>

llvm-svn: 145899
2011-12-06 02:06:50 +00:00
Lang Hames 52f24d7a32 Kill off the LoopSplitter. It's not being used or maintained.
llvm-svn: 145897
2011-12-06 01:57:59 +00:00
Lang Hames b13b6a04d0 Update PBQP's analysis usage to reflect the requirements of the inline spiller.
llvm-svn: 145893
2011-12-06 01:45:57 +00:00
Jakob Stoklund Olesen 10e1252269 Use logarithmic units for basic block alignment.
This was actually a bit of a mess. TLI.setPrefLoopAlignment was clearly
documented as taking log2(bytes) units, but the x86 target would still
set a preferred loop alignment of '16'.

CodePlacementOpt passed this number on to the basic block, and
AsmPrinter interpreted it as bytes.

Now both MachineFunction and MachineBasicBlock use logarithmic
alignments.

Obviously, MachineConstantPool still measures alignments in bytes, so we
can emulate the thrill of using as.

llvm-svn: 145889
2011-12-06 01:26:19 +00:00
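For readers unfamiliar with the units involved, the conversion between the two conventions is a single shift. A tiny sketch, illustrative only and not the MachineBasicBlock interface:

  // A stored alignment of N in log2(bytes) units means 2^N bytes.
  unsigned alignmentInBytes(unsigned LogAlign) {
    return 1u << LogAlign;          // e.g. LogAlign = 4  ->  16 bytes
  }

  // Going the other way, a 16-byte preference should be stored as 4,
  // which is where the old x86 value of '16' went wrong.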
Nadav Rotem 3924cb0267 Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Eric Christopher 8dda5d0f06 Add inline subprogram names to the name lookup table since they may
not get there any other way.

llvm-svn: 145789
2011-12-04 06:02:38 +00:00
Anton Korobeynikov 965e0c6de2 Emit the ctors in the proper order on ARM/EABI.
Maybe some targets should use this as well.

Patch by Evgeniy Stepanov!

llvm-svn: 145781
2011-12-03 23:49:37 +00:00
Benjamin Kramer 71ba18c1e0 Simplify code. No functionality change.
-3% on ARMDisassembler.cpp.

llvm-svn: 145773
2011-12-03 16:18:22 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Hal Finkel 4201820275 make sure ScheduleDAGInstrs::EmitSchedule does not crash when the first instruction in Sequence is a Noop
llvm-svn: 145677
2011-12-02 04:58:07 +00:00
Dylan Noblesmith c19f0b7357 CodeGen: fix CMake build
Missing file from r145629.

llvm-svn: 145634
2011-12-01 21:49:23 +00:00
Anshuman Dasgupta 08ebdc1e71 Add a deterministic finite automaton based packetizer for VLIW architectures
llvm-svn: 145629
2011-12-01 21:10:21 +00:00
Chad Rosier 46addb9e07 If fast-isel fails, remove dead instructions generated during the failed
attempt.  

llvm-svn: 145425
2011-11-29 19:40:47 +00:00
Daniel Dunbar 539d0a8a09 build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Bill Wendling e4cc332729 On MachO, the pointer to the personality function should always be in the
non_lazy_symbol_pointers section (__IMPORT,__pointers). Ignore the 'hidden' part
since that will place it in the wrong section.
<rdar://problem/10443720>

llvm-svn: 145356
2011-11-29 01:43:20 +00:00
Eli Friedman e7ab1a2f0f Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Evan Cheng 4a5b2040e2 Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively return zero when the GV does not specify an alignment and is not
initialized. Previously it returned the ABI alignment for the type of the GV. However, if
the type is a "packed" type, then the under-specified alignment is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
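As a concrete illustration of why the ABI alignment of the type cannot be trusted here, consider a packed struct; this is a hedged sketch using a compiler-specific attribute (GCC/Clang), not code from the commit:

  #include <cstddef>
  #include <cstdint>

  struct __attribute__((packed)) Packed {
    char Tag;
    uint32_t Value;   // sits at offset 1, below the natural 4-byte alignment
  };

  static_assert(offsetof(Packed, Value) == 1,
                "the field is deliberately under-aligned");

  // Loads of 'Value' must be emitted with alignment 1, even though the ABI
  // alignment of uint32_t is 4 -- the situation the commit describes.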
Evan Cheng a4b6404cf0 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Chad Rosier 61e8d1026f 80-column.
llvm-svn: 145267
2011-11-28 19:59:09 +00:00
Bill Wendling 5ebc95ff4c Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.
llvm-svn: 145263
2011-11-28 19:23:13 +00:00
Chandler Carruth 4f56720754 Prevent rotating the blocks of a loop (and thus getting a backedge to be
fallthrough) in cases where we might fail to rotate an exit to an outer
loop onto the end of the loop chain.

Having *some* rotation, but not performing this rotation, is the primary
fix of the performance regression with -enable-block-placement for
Olden/em3d (a whopping 30% regression). Still working on reducing the
test case that actually exercises this and the new rotation strategy out
of this code, but I want to check if this regresses other test cases
first as that may indicate it isn't the correct fix.

llvm-svn: 145195
2011-11-27 20:18:00 +00:00
Chandler Carruth 03adbd46ca Take two on rotating the block ordering of loops. My previous attempt
was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.

The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.

This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.

llvm-svn: 145183
2011-11-27 13:34:33 +00:00
Chandler Carruth 9e46684154 Fix an impressive type-o / spell-o Duncan noticed.
llvm-svn: 145181
2011-11-27 10:32:16 +00:00
Chandler Carruth a054580993 Rework a bit of the implementation of loop block rotation to not rely so
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.

The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.

llvm-svn: 145179
2011-11-27 09:22:53 +00:00
Chandler Carruth 9ffb97e631 Introduce a loop block rotation optimization to the new block placement
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.

This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.

llvm-svn: 145158
2011-11-27 00:38:03 +00:00
Benjamin Kramer 7ba71be392 Move code into anonymous namespaces.
llvm-svn: 145154
2011-11-26 23:01:57 +00:00
Chandler Carruth 7adee1a01a Fix a silly use-after-free issue. A much earlier version of this code
needed lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.

Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.

llvm-svn: 145120
2011-11-24 11:23:15 +00:00
Chandler Carruth d394bafd2d When adding blocks to the list of those which no longer have any CFG
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.

I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.

llvm-svn: 145119
2011-11-24 08:46:04 +00:00
Chandler Carruth 99fe42fbd9 Relax an invariant that block placement was trying to assert a bit
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resilient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.

This was found during a bootstrap with block placement turned on.

llvm-svn: 145100
2011-11-23 10:35:36 +00:00
Chandler Carruth 8c68f1f3c8 Handle the case of a no-return invoke correctly. It actually still has
successors, they just are all landing pad successors. We handle this the
same way as no successors. Comments attached for the next person to wade
through here and another lovely test case courtesy of Benjamin Kramer's
bugpoint reduction.

llvm-svn: 145098
2011-11-23 08:23:54 +00:00
Bob Wilson ebb44646c4 Enable stack protectors for all arrays, not just char arrays. rdar://5875909
Patch by Bill Wendling.

llvm-svn: 145097
2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen 02845410f9 Fix PR11422.
This was a bug in keeping track of the available domains when merging
domain values.

The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.

Also add an assertion to catch future attempts at emitting AVX2
instructions.

llvm-svn: 145096
2011-11-23 04:03:08 +00:00
Chandler Carruth 4a87aa0c31 Fix a crash in block placement due to an inner loop that happened to be
reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

llvm-svn: 145094
2011-11-23 03:03:21 +00:00
Chandler Carruth ee54feb6f6 Fix a devilish miscompile exposed by block placement. The
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad successor, and assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

llvm-svn: 145062
2011-11-22 13:13:16 +00:00
Chandler Carruth e2530dc889 Fix an obvious omission in the SelectionDAGBuilder where we were
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060
2011-11-22 11:37:46 +00:00
Rafael Espindola 2021f38281 If a register is both an early clobber and part of a tied use, handle the use
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
2011-11-22 06:27:18 +00:00
Chandler Carruth 18dfac385b The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010
2011-11-20 11:22:06 +00:00
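A minimal sketch of the "locally hot and globally hot" test described above, with invented names and a placeholder threshold rather than the heuristic actually used by the pass:

  #include <cstdint>
  #include <vector>

  // An edge is worth merging only if it dominates the source's outgoing
  // weight *and* the destination's incoming weight.
  bool isHotEdge(uint64_t EdgeWeight,
                 const std::vector<uint64_t> &SrcSuccWeights,
                 const std::vector<uint64_t> &DstPredWeights) {
    uint64_t SuccSum = 0, PredSum = 0;
    for (uint64_t W : SrcSuccWeights) SuccSum += W;
    for (uint64_t W : DstPredWeights) PredSum += W;
    // Placeholder rule: the edge must carry more than half of both sums.
    return SuccSum != 0 && PredSum != 0 &&
           2 * EdgeWeight > SuccSum && 2 * EdgeWeight > PredSum;
  }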
Chandler Carruth f3dc9eff16 Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can cause major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994
2011-11-19 10:26:02 +00:00
Devang Patel 107e8ec30d DISubrange supports unsigned lower/upper array bounds, so let's not fake it in the end while emitting DWARF. If a FE needs to encode signed lower/upper array bounds then we need to extend DISubrange or add DISignedSubrange.
llvm-svn: 144937
2011-11-17 23:43:15 +00:00
Chad Rosier f83ab704e4 When fast iseling a GEP, accumulate the offset rather than emitting a series of
ADDs.  MaxOffs is used as a threshold to limit the size of the offset. Tradeoffs
being: (1) If we can't materialize the large constant then we'll cause fast-isel
to bail. (2) Too large of an offset can't be directly encoded in the ADD
resulting in a MOV+ADD.  Generally not a bad thing because otherwise we would
have had ADD+ADD, but on Thumb this turns into a MOVS+MOVT+ADD. Working on a fix
for that. (3) Conversely, with too low of a threshold we'll miss opportunities to
coalesce ADDs.
rdar://10412592

llvm-svn: 144886
2011-11-17 07:15:58 +00:00
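The accumulate-then-materialize idea can be sketched independently of fast-isel. The names and the threshold handling here are invented for illustration, not the backend's actual code:

  #include <cstdint>
  #include <vector>

  // Fold a chain of constant GEP offsets into a single value instead of
  // emitting one ADD per index, bailing out when the folded offset grows
  // past what a single ADD can encode.
  bool foldGEPOffsets(const std::vector<int64_t> &FieldOffsets,
                      int64_t MaxOffs, int64_t &FoldedOffset) {
    FoldedOffset = 0;
    for (int64_t FO : FieldOffsets) {
      FoldedOffset += FO;            // accumulate instead of emitting an ADD
      if (FoldedOffset > MaxOffs)
        return false;                // caller materializes the constant or bails
    }
    return true;                     // emit one ADD of the folded constant
  }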
Eli Friedman ff1eaa7578 Make sure to replace the chain properly when DAGCombining a LOAD+EXTRACT_VECTOR_ELT into a single LOAD. Fixes PR10747/PR11393.
llvm-svn: 144863
2011-11-16 23:50:22 +00:00
Chad Rosier ff40b1e164 Add fast-isel stats to determine who's doing all the work, the
target-independent selector or the target-specific selector.

llvm-svn: 144833
2011-11-16 21:05:28 +00:00
Chad Rosier cfd0d10e72 Fix the stats collection for fast-isel. The failed count was only accounting
for a single miss and not all predecessor instructions that get selected by
the selection DAG instruction selector.  This is still not exact (e.g., it
overstates misses when folded/dead instructions are present), but it is a step in
the right direction.

llvm-svn: 144832
2011-11-16 21:02:08 +00:00
Evan Cheng 822ddde50d Disable expensive two-address optimizations at -O0. rdar://10453055
llvm-svn: 144806
2011-11-16 18:44:48 +00:00
Evan Cheng 624eb2af6f Disable the assertion again. Looks like fastisel is still generating bad kill markers.
llvm-svn: 144804
2011-11-16 18:32:14 +00:00
Evan Cheng ecb2908bf9 Sink codegen optimization level into MCCodeGenInfo along side relocation model
and code model. This eliminates the need to pass OptLevel flag all over the
place and makes it possible for any codegen pass to use this information.

llvm-svn: 144788
2011-11-16 08:38:26 +00:00
Bob Wilson cca9aa58ca Record landing pads with a SmallSetVector to avoid multiple entries.
There may be many invokes that share one landing pad, and the previous code
would record the landing pad once for each invoke.  Besides the wasted
effort, a pair of volatile loads gets inserted every time the landing pad is
processed.  The rest of the code can get optimized away when a landing pad
is processed repeatedly, but the volatile loads remain, resulting in code like:

LBB35_18:
Ltmp483:
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r2, [r7, #-72]
        ldr     r2, [r7, #-68]
        ldr     r4, [r7, #-72]
        ldr     r2, [r7, #-68]

llvm-svn: 144787
2011-11-16 07:57:21 +00:00
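The fix above boils down to insert-if-absent semantics. A generic stand-in for the set-backed vector, not the llvm::SmallSetVector class itself, looks like this:

  #include <unordered_set>
  #include <vector>

  // Keeps insertion order like a vector, but ignores duplicates like a set,
  // so a landing pad shared by many invokes is recorded only once.
  template <typename T>
  struct OrderedSet {
    std::vector<T> Order;
    std::unordered_set<T> Seen;

    bool insert(const T &V) {
      if (!Seen.insert(V).second)
        return false;               // already recorded, skip the duplicate work
      Order.push_back(V);
      return true;
    }
  };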
Bob Wilson 643e63c40c Update the SP in the SjLj jmpbuf whenever it changes. <rdar://problem/10444602>
This same basic code was in the older version of the SjLj exception handling,
but it was removed in the recent revisions to that code.  It needs to be there.

llvm-svn: 144782
2011-11-16 07:12:00 +00:00
Evan Cheng 4ac36c8e26 Revert r144568 now that r144730 has fixed the fast-isel kill marker bug.
llvm-svn: 144776
2011-11-16 04:55:01 +00:00
Evan Cheng b8c55a5339 If the 2addr instruction has other kills, don't move it below any other uses since we don't want to extend other live ranges.
llvm-svn: 144772
2011-11-16 03:47:42 +00:00
Evan Cheng 59f8156ea0 RescheduleKillAboveMI() must backtrack to before the rescheduled DBG_VALUE instructions. rdar://10451185
llvm-svn: 144771
2011-11-16 03:33:08 +00:00
Evan Cheng 9ddd69a8bc Process all uses first before defs to accurately capture register liveness. rdar://10449480
llvm-svn: 144770
2011-11-16 03:05:12 +00:00
Eli Friedman 87f92512c3 CONCAT_VECTORS can have more than two operands. PR11389.
llvm-svn: 144768
2011-11-16 02:52:39 +00:00
Eli Friedman d257a464d1 Add a couple asserts so it will be easier to debug if we accidentally pass indexed loads/stores to the legalizer.
llvm-svn: 144767
2011-11-16 02:43:15 +00:00
Owen Anderson ca2f78a95b Rename MVT::untyped to MVT::Untyped to match similar nomenclature.
llvm-svn: 144747
2011-11-16 01:02:57 +00:00
Eric Christopher 0abbd0ef5a Stabilize the output of the dwarf accelerator tables. Fixes a comparison
failure during bootstrap with it turned on.

llvm-svn: 144731
2011-11-15 23:37:17 +00:00
Chad Rosier 291ce47db7 GEPs with all zero indices are trivially coalesced by fast-isel. For example,
%arrayidx135 = getelementptr inbounds [4 x [4 x [4 x [4 x i32]]]]* %M0, i32 0, i64 0
%arrayidx136 = getelementptr inbounds [4 x [4 x [4 x i32]]]* %arrayidx135, i32 0, i64 %idxprom134

Prior to this commit, the GEP instruction that defines %arrayidx136 thought that 
%arrayidx135 was a trivial kill.  The GEP that defines %arrayidx135 doesn't 
generate any code and thus %M0 gets folded into the second GEP.  Thus, we need
to look through GEPs with all zero indices.
rdar://10443319

llvm-svn: 144730
2011-11-15 23:34:05 +00:00
Pete Cooper 7c7ba1baa1 Added custom lowering for the load->dec->store sequence in x86 when the EFLAGS register is used
by later instructions.

Only done for DEC64m right now.

Fixes <rdar://problem/6172640>

llvm-svn: 144705
2011-11-15 21:57:53 +00:00
Devang Patel 43bde96a4c Insert modified DBG_VALUE into LiveDbgValueMap.
llvm-svn: 144696
2011-11-15 21:03:58 +00:00
Rafael Espindola f11e7f1305 We currently use a callback to handle an IL pass deleting a BB that still
has a reference to it. Unfortunately, that doesn't work for codegen passes
since we don't get notified of MBB's being deleted (the original BB stays).

Use that fact to our advantage and after printing a function, check if
any of the IL BBs corresponds to a symbol that was not printed. This fixes
pr11202.

llvm-svn: 144674
2011-11-15 19:08:46 +00:00
Benjamin Kramer 1f97a5a671 Remove all remaining uses of Value::getNameStr().
llvm-svn: 144648
2011-11-15 16:27:03 +00:00
Benjamin Kramer 4c93d15f09 Twinify GraphWriter a little bit.
llvm-svn: 144647
2011-11-15 16:26:38 +00:00
Jakob Stoklund Olesen e14ef7e6f8 Check all overlaps when looking for used registers.
A function using any RC alias is enough to enable the ExeDepsFix pass.

llvm-svn: 144636
2011-11-15 08:20:43 +00:00
Jay Foad ab9ebd3521 Make use of MachinePointerInfo::getFixedStack.
llvm-svn: 144635
2011-11-15 07:51:13 +00:00
Jay Foad 70679df664 Remove some unnecessary includes of PseudoSourceValue.h.
llvm-svn: 144634
2011-11-15 07:50:46 +00:00
Evan Cheng 7098c4e5f4 Set SeenStore to true to prevent loads from being moved; also eliminates a non-deterministic behavior.
llvm-svn: 144628
2011-11-15 06:26:51 +00:00
Chandler Carruth 9b548a7fcf Rather than trying to use the loop block sequence *or* the function
block sequence when recovering from unanalyzable control flow
constructs, *always* use the function sequence. I'm not sure why I ever
went down the path of trying to use the loop sequence, it is
fundamentally not the correct sequence to use. We're trying to preserve
the incoming layout in the cases of unreasonable control flow, and that
is only encoded at the function level. We already have a filter to
select *exactly* the sub-set of blocks within the function that we're
trying to form into a chain.

The resulting code layout is also significantly better because of this.
In several places we were ending up with completely unreasonable control
flow constructs due to the ordering chosen by the loop structure for its
internal storage. This change removes a completely wasteful vector of
basic blocks, saving memory allocation in the common case even though it
costs us CPU in the fairly rare case of unnatural loops. Finally, it
fixes the latest crasher reduced out of GCC's single source. Thanks
again to Benjamin Kramer for the reduction, my bugpoint skills failed at
it.

llvm-svn: 144627
2011-11-15 06:26:43 +00:00
Jakob Stoklund Olesen f8ad336bc4 Break false dependencies before partial register updates.
Two new TargetInstrInfo hooks lets the target tell ExecutionDepsFix
about instructions with partial register updates causing false unwanted
dependencies.

The ExecutionDepsFix pass will break the false dependencies if the
updated register was written in the previous N instructions.

The small loop added to sse-domains.ll runs twice as fast with
dependency-breaking instructions inserted.

llvm-svn: 144602
2011-11-15 01:15:30 +00:00
Jakob Stoklund Olesen 543bef6ead Track register ages more accurately.
Keep track of the last instruction to define each register individually
instead of per DomainValue.  This lets us track more accurately when a
register was last written.

Also track register ages across basic blocks.  When entering a new
basic block, use the least stale predecessor def as a worst case
estimate for register age.

The register age is used to arbitrate between conflicting domains. The
most recently defined register wins.

llvm-svn: 144601
2011-11-15 01:15:25 +00:00
Evan Cheng f2fc508d4d Avoid dereferencing off the beginning of lists.
llvm-svn: 144569
2011-11-14 21:11:15 +00:00
Evan Cheng 28ffb7e444 At -O0, multiple uses of a virtual register in the same BB are being marked
"kill". This looks like a bug upstream. Since that's going to take some time
to understand, loosen the assertion and disable the optimization when
multiple kills are seen.

llvm-svn: 144568
2011-11-14 21:02:09 +00:00
Evan Cheng 30f44ad785 Teach two-address pass to re-schedule two-address instructions (or the kill
instructions of the two-address operands) in order to avoid inserting copies.
This fixes the few regressions introduced when the two-address hack was
disabled (without regressing the improvements).
rdar://10422688

llvm-svn: 144559
2011-11-14 19:48:55 +00:00
Jakob Stoklund Olesen 7e6004a3c1 Fix early-clobber handling in shrinkToUses.
I broke this in r144515, it affected most ARM testers.

<rdar://problem/10441389>

llvm-svn: 144547
2011-11-14 18:45:38 +00:00
Chandler Carruth fd9b4d9813 It helps to deallocate memory as well as allocate it. =] This actually
cleans up all the chains allocated during the processing of each
function so that for very large inputs we don't just grow memory usage
without bound.

llvm-svn: 144533
2011-11-14 10:57:23 +00:00
Chandler Carruth 0a31d149ea Remove an over-eager assert that was firing on one of the ARM regression
tests when I forcibly enabled block placement.

It is apparently possible for an unanalyzable block to fall through to
a non-loop block. I don't actually believe this is correct; I believe
that 'canFallThrough' is returning true needlessly for the code
construct, and I've left a bit of a FIXME on the verification code to
try to track down why this is coming up.

Anyways, removing the assert doesn't degrade the correctness of the algorithm.

llvm-svn: 144532
2011-11-14 10:55:53 +00:00
Chandler Carruth 0af6a0bb69 Begin chipping away at one of the biggest quadratic-ish behaviors in
this pass. We're leaving already merged blocks on the worklist, and
scanning them again and again only to determine each time through that
indeed they aren't viable. We can instead remove them once we're going
to have to scan the worklist. This is the easy way to implement removing
them. If this remains on the profile (as I somewhat suspect it will), we
can get a lot more clever here, as the worklist's order is essentially
irrelevant. We can use swapping and fold the two loops to reduce
overhead even when there are many blocks on the worklist but only a few
of them are removed.

llvm-svn: 144531
2011-11-14 09:46:33 +00:00
Chandler Carruth 84cd44c750 Under the hood, MBPI is doing a linear scan of every successor every
time it is queried to compute the probability of a single successor.
This makes computing the probability of every successor of a block in
sequence... really really slow. ;] This switches to a linear walk of the
successors rather than a quadratic one. One of several quadratic
behaviors slowing this pass down.

I'm not really thrilled with moving the sum code into the public
interface of MBPI, but I don't (at the moment) have ideas for a better
interface. My direction I'm thinking in for a better interface is to
have MBPI actually retain much more state and make *all* of these
queries cheap. That's a lot of work, and would require invasive changes.
Until then, this seems like the least bad (ie, least quadratic)
solution. Suggestions welcome.

llvm-svn: 144530
2011-11-14 09:12:57 +00:00
Chandler Carruth a9e71faa0f Reuse the logic in getEdgeProbability within getHotSucc in order to
correctly handle blocks whose successor weights sum to more than
UINT32_MAX. This is slightly less efficient, but the entire thing is
already linear on the number of successors. Calling it within any hot
routine is a mistake, and indeed no one is calling it. It also
simplifies the code.

llvm-svn: 144527
2011-11-14 08:55:59 +00:00
Chandler Carruth ed5aa547bc Fix an overflow bug in MachineBranchProbabilityInfo. This pass relied on
the sum of the edge weights not overflowing uint32, and crashed when
they did. This is generally safe as BranchProbabilityInfo tries to
provide this guarantee. However, the CFG can get modified during codegen
in a way that grows the *sum* of the edge weights. This doesn't seem
unreasonable (imagine just adding more blocks all with the default
weight of 16), but it is hard to come up with a case that actually
triggers 32-bit overflow. Fortunately, the single-source GCC build is
good at this. The solution isn't very pretty, but it's no worse than the
previous code. We're already summing all of the edge weights on each
query, we can sum them, check for an overflow, compute a scale, and sum
them again.

I've included a *greatly* reduced test case out of the GCC source that
triggers it. It's a pretty lame test, as it clearly is just barely
triggering the overflow. I'd like to have something that is much more
definitive, but I don't understand the fundamental pattern that triggers
an explosion in the edge weight sums.

The buggy code is duplicated within this file. I'll collapse them into
a single implementation in a subsequent commit.

llvm-svn: 144526
2011-11-14 08:50:16 +00:00
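The overflow handling described above amounts to summing in a wider type and rescaling when the 32-bit range is exceeded. A small sketch with invented names, not the MachineBranchProbabilityInfo code:

  #include <cstdint>
  #include <vector>

  // Sum 32-bit edge weights without overflowing, scaling them down when the
  // total would not fit in uint32_t.
  uint32_t scaledWeightSum(const std::vector<uint32_t> &Weights) {
    uint64_t Sum = 0;
    for (uint32_t W : Weights)
      Sum += W;                                // 64-bit accumulation cannot overflow here
    if (Sum <= UINT32_MAX)
      return (uint32_t)Sum;
    uint64_t Scale = Sum / UINT32_MAX + 1;     // divide every weight by this factor
    uint64_t Scaled = 0;
    for (uint32_t W : Weights)
      Scaled += W / Scale;                     // second pass over the rescaled weights
    return (uint32_t)Scaled;
  }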
Jakob Stoklund Olesen d7bcf43dc2 Use getVNInfoBefore() when it makes sense.
llvm-svn: 144517
2011-11-14 01:39:36 +00:00
Chandler Carruth 1071cfa4ae Teach machine block placement to cope with unnatural loops. These don't
get loop info structures associated with them, and so we need some way
to make forward progress selecting and placing basic blocks. The
technique used here is pretty brutal -- it just scans the list of blocks
looking for the first unplaced candidate. It keeps placing blocks like
this until the CFG becomes tractable.

The cost is somewhat unfortunate: it requires allocating a vector of all
basic block pointers eagerly. I have some ideas about how to simplify
and optimize this, but I'm trying to get the logic correct first.

Thanks to Benjamin Kramer for the reduced test case out of GCC. Sadly
there are other bugs that GCC is tickling that I'm reducing and working
on now.

llvm-svn: 144516
2011-11-14 00:00:35 +00:00
Jakob Stoklund Olesen 697979028f Use kill slots instead of the previous slot in shrinkToUses.
It's more natural to use the actual end points.

llvm-svn: 144515
2011-11-13 23:53:25 +00:00
Chandler Carruth c4a2cb34bb Clean up some 80-column violations and poor formatting. These snuck by
when I was reading through the code for style.

llvm-svn: 144513
2011-11-13 22:50:09 +00:00
Jakob Stoklund Olesen d8f2405e73 Terminate all dead defs at the dead slot instead of the 'next' slot.
This makes no difference for normal defs, but early clobber dead defs
now look like:

  [Slot_EarlyClobber; Slot_Dead)

instead of:

  [Slot_EarlyClobber; Slot_Register).

Live ranges for normal dead defs look like:

  [Slot_Register; Slot_Dead)

as before.

llvm-svn: 144512
2011-11-13 22:42:13 +00:00
Jakob Stoklund Olesen ce7cc08f3a Simplify early clobber slots a bit.
llvm-svn: 144507
2011-11-13 22:05:42 +00:00
Chandler Carruth 8e1d906734 Enhance the assertion mechanisms in place to make it easier to catch
when we fail to place all the blocks of a loop. Currently this is
happening for unnatural loops, and this logic helps more immediately
point to the problem.

llvm-svn: 144504
2011-11-13 21:39:51 +00:00
Jakob Stoklund Olesen 90b5e565b6 Rename SlotIndexes to match how they are used.
The old naming scheme (load/use/def/store) can be traced back to an old
linear scan article, but the names don't match how slots are actually
used.

The load and store slots are not needed after the deferred spill code
insertion framework was deleted.

The use and def slots don't make any sense because we are using
half-open intervals as is customary in C code, but the names suggest
closed intervals.  In reality, these slots were used to distinguish
early-clobber defs from normal defs.

The new naming scheme also has 4 slots, but the names match how the
slots are really used.  This is a purely mechanical renaming, but some
of the code makes a lot more sense now.

llvm-svn: 144503
2011-11-13 20:45:27 +00:00
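Since the commit leans on the distinction between half-open and closed intervals, here is a tiny sketch of the half-open convention; it is purely illustrative and not the SlotIndexes types:

  // Half-open interval [Start, End): Start is inside, End is not.
  struct LiveSegment {
    unsigned Start, End;

    bool contains(unsigned P) const { return Start <= P && P < End; }

    // Two half-open segments overlap iff each one starts before the other ends.
    bool overlaps(const LiveSegment &O) const {
      return Start < O.End && O.Start < End;
    }
  };

  // With this convention [0,4) and [4,8) touch but do not overlap, which is
  // why slot names that suggest closed intervals were misleading.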
Chandler Carruth 0bb42c0f86 Teach MBP to force-merge layout successors for blocks with unanalyzable
branches that also may involve fallthrough. In the case of blocks with
no fallthrough, we can still re-order the blocks profitably. For example
instruction decoding will in some cases continue past an indirect jump,
making laying out its most likely successor there profitable.

Note, no test case. I don't know how to write a test case that exercises
this logic, but it matches the described desired semantics in
discussions with Jakob and others. If anyone has a nice example of IR
that will trigger this, that would be lovely.

Also note, there are still assertion failures in real world code with
this. I'm digging into those next, now that I know this isn't the cause.

llvm-svn: 144499
2011-11-13 12:17:28 +00:00
Chandler Carruth f9213fe721 Hoist another gross nested loop into a helper method.
llvm-svn: 144498
2011-11-13 11:42:26 +00:00
Chandler Carruth eb4ec3aea5 Add a missing doxygen comment for a helper method.
llvm-svn: 144497
2011-11-13 11:34:55 +00:00
Chandler Carruth b336172f90 Hoist a nested loop into its own method.
llvm-svn: 144496
2011-11-13 11:34:53 +00:00
Chandler Carruth 8d15078927 Rewrite #3 of machine block placement. This is based somewhat on the
second algorithm, but only loosely. It is more heavily based on the last
discussion I had with Andy. It continues to walk from the inner-most
loop outward, but there is a key difference. With this algorithm we
ensure that as we visit each loop, the entire loop is merged into
a single chain. At the end, the entire function is treated as a "loop",
and merged into a single chain. This chain forms the desired sequence of
blocks within the function. Switching to a single algorithm removes my
biggest problem with the previous approaches -- they had different
behavior depending on which system triggered the layout. Now there is
exactly one algorithm and one basis for the decision making.

The other key difference is how the chain is formed. This is based
heavily on the idea Andy mentioned of keeping a worklist of blocks that
are viable layout successors based on the CFG. Having this set allows us
to consistently select the best layout successor for each block. It is
expensive though.

The code here remains very rough. There is a lot that needs to be done
to clean up the code, and to make the runtime cost of this pass much
lower. Very much WIP, but this was a giant chunk of code and I'd rather
folks see it sooner than later. Everything remains behind a flag of
course.

I've added a couple of tests to exercise the issues that this iteration
was motivated by: loop structure preservation. I've also fixed one test
that was exhibiting the broken behavior of the previous version.

llvm-svn: 144495
2011-11-13 11:20:44 +00:00