Commit Graph

158 Commits

Author SHA1 Message Date
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Sanjay Patel 3d07ec973f rangify; NFCI
llvm-svn: 256891
2016-01-06 00:45:42 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Tom Stellard f01af29f01 MachineCSE: Add a target query for the LookAheadLimit heurisitic
This is used to determine whether or not to CSE physical register
defs.

Differential Revision: http://reviews.llvm.org/D9472

llvm-svn: 236923
2015-05-09 00:56:07 +00:00
Benjamin Kramer 799003bf8c Re-sort includes with sort-includes.py and insert raw_ostream.h where it's used.
llvm-svn: 232998
2015-03-23 19:32:43 +00:00
Matthias Braun 26e7ea6267 MachineCSE: Clear dead-def flag on CSE.
In case CSE reuses a previoulsy unused register the dead-def flag has to
be cleared on the def operand, as exposed by the arm64-cse.ll test.

This fixes PR22439 and the corresponding rdar://19694987

Differential Revision: http://reviews.llvm.org/D7395

llvm-svn: 228178
2015-02-04 19:35:16 +00:00
Ahmed Bougacha 54b7d334c7 [MachineCSE] Clear kill-flag on registers imp-def'd by the CSE'd instruction.
Go through implicit defs of CSMI and MI, and clear the kill flags on
their uses in all the instructions between CSMI and MI.
We might have made some of the kill flags redundant, consider:
  subs  ... %NZCV<imp-def>        <- CSMI
  csinc ... %NZCV<imp-use,kill>   <- this kill flag isn't valid anymore
  subs  ... %NZCV<imp-def>        <- MI, to be eliminated
  csinc ... %NZCV<imp-use,kill>
Since we eliminated MI, and reused a register imp-def'd by CSMI
(here %NZCV), that register, if it was killed before MI, should have
that kill flag removed, because it's lifetime was extended.

Also, add an exhaustive testcase for the motivating example.

Reviewed by: Juergen Ributzka <juergen@apple.com>

llvm-svn: 223133
2014-12-02 18:09:51 +00:00
Jiangning Liu dd6e12d71c In Machine CSE pass, the source register of a COPY machine instruction can
be propagated to all its users, and this propagation could increase the 
probability of finding common subexpressions. If the COPY has only one user,
the COPY itself can be removed.

llvm-svn: 215344
2014-08-11 05:17:19 +00:00
Eric Christopher fc6de428c8 Have MachineFunction cache a pointer to the subtarget to make lookups
shorter/easier and have the DAG use that to do the same lookup. This
can be used in the future for TargetMachine based caching lookups from
the MachineFunction easily.

Update the MIPS subtarget switching machinery to update this pointer
at the same time it runs.

llvm-svn: 214838
2014-08-05 02:39:49 +00:00
Eric Christopher d913448b38 Remove the TargetMachine forwards for TargetSubtargetInfo based
information and update all callers. No functional change.

llvm-svn: 214781
2014-08-04 21:25:23 +00:00
Jiangning Liu c3053129b9 Add TargetInstrInfo interface isAsCheapAsAMove.
llvm-svn: 214158
2014-07-29 01:55:19 +00:00
Chandler Carruth 1b9dde087e [Modules] Remove potential ODR violations by sinking the DEBUG_TYPE
define below all header includes in the lib/CodeGen/... tree. While the
current modules implementation doesn't check for this kind of ODR
violation yet, it is likely to grow support for it in the future. It
also removes one layer of macro pollution across all the included
headers.

Other sub-trees will follow.

llvm-svn: 206837
2014-04-22 02:02:50 +00:00
Paul Robinson 7c99ec5b99 Disable each MachineFunctionPass for 'optnone' functions, unless that
pass normally runs at optimization level None, or is part of the
register allocation pipeline.

llvm-svn: 205228
2014-03-31 17:43:35 +00:00
Owen Anderson b36376efcb Switch a number of loops in lib/CodeGen over to range-based for-loops, now that
the MachineRegisterInfo iterators are compatible with it.

llvm-svn: 204075
2014-03-17 19:36:09 +00:00
Owen Anderson 16c6bf49b7 Phase 2 of the great MachineRegisterInfo cleanup. This time, we're changing
operator* on the by-operand iterators to return a MachineOperand& rather than
a MachineInstr&.  At this point they almost behave like normal iterators!

Again, this requires making some existing loops more verbose, but should pave
the way for the big range-based for-loop cleanups in the future.

llvm-svn: 203865
2014-03-13 23:12:04 +00:00
Craig Topper 4584cd54e3 [C++11] Add 'override' keyword to virtual methods that override their base class.
llvm-svn: 203220
2014-03-07 09:26:03 +00:00
Rafael Espindola b1f25f1b93 Replace PROLOG_LABEL with a new CFI_INSTRUCTION.
The old system was fairly convoluted:
* A temporary label was created.
* A single PROLOG_LABEL was created with it.
* A few MCCFIInstructions were created with the same label.

The semantics were that the cfi instructions were mapped to the PROLOG_LABEL
via the temporary label. The output position was that of the PROLOG_LABEL.
The temporary label itself was used only for doing the mapping.

The new CFI_INSTRUCTION has a 1:1 mapping to MCCFIInstructions and points to
one by holding an index into the CFI instructions of this function.

I did consider removing MMI.getFrameInstructions completelly and having
CFI_INSTRUCTION own a MCCFIInstruction, but MCCFIInstructions have non
trivial constructors and destructors and are somewhat big, so the this setup
is probably better.

The net result is that we don't create temporary labels that are never used.

llvm-svn: 203204
2014-03-07 06:08:31 +00:00
Benjamin Kramer b6d0bd48bd [C++11] Replace llvm::next and llvm::prior with std::next and std::prev.
Remove the old functions.

llvm-svn: 202636
2014-03-02 12:27:27 +00:00
Andrew Trick e4083f9e85 Disabled subregister copy coalescing during MachineCSE.
This effectively backs out r197465 but leaves some of the general
fixes in place. Not all targets are ready to handle this feature. To
enable it, some infrastructure work is needed to better handle
register class constraints.

llvm-svn: 197514
2013-12-17 19:29:36 +00:00
Andrew Trick e339828b90 Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies.
Without this, MachineCSE is powerless to handle redundant operations with truncated source operands.

This required fixing the 2-addr pass to handle tied subregisters. It isn't clear what combinations of subregisters can legally be tied, but the simple case of truncated source operands is now safely handled:

     %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1
     %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2
     %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def>

Test case: cse-add-with-overflow.ll.

This exposed an existing bug in
PPCInstrInfo::commuteInstruction. Thanks to Rafael for the test case:
PowerPC/crash.ll.

llvm-svn: 197465
2013-12-17 04:50:45 +00:00
Rafael Espindola f152836788 Revert "Allow MachineCSE to coalesce trivial subregister copies the same way that it coalesces normal copies."
This reverts commit r197414.

It broke the ppc64 bootstrap. I will post a testcase in a sec.

llvm-svn: 197424
2013-12-16 20:57:09 +00:00
Andrew Trick 88bd8629b2 Allow MachineCSE to coalesce trivial subregister copies the same way
that it coalesces normal copies.

Without this, MachineCSE is powerless to handle redundant operations
with truncated source operands.

This required fixing the 2-addr pass to handle tied subregisters. It
isn't clear what combinations of subregisters can legally be tied, but
the simple case of truncated source operands is now safely handled:

     %vreg11<def> = COPY %vreg1:sub_32bit; GR32:%vreg11 GR64:%vreg1
     %vreg12<def> = COPY %vreg2:sub_32bit; GR32:%vreg12 GR64:%vreg2
     %vreg13<def,tied1> = ADD32rr %vreg11<tied0>, %vreg12<kill>, %EFLAGS<imp-def>

llvm-svn: 197414
2013-12-16 19:36:21 +00:00
Andrew Trick cccd82f21f whitespace
llvm-svn: 197413
2013-12-16 19:36:18 +00:00
Craig Topper b94011fd28 Use SmallVectorImpl& instead of SmallVector to avoid repeating small vector size.
llvm-svn: 186274
2013-07-14 04:42:23 +00:00
Chandler Carruth ed0881b2a6 Use the new script to sort the includes of every file under lib.
Sooooo many of these had incorrect or strange main module includes.
I have manually inspected all of these, and fixed the main module
include to be the nearest plausible thing I could find. If you own or
care about any of these source files, I encourage you to take some time
and check that these edits were sensible. I can't have broken anything
(I strictly added headers, and reordered them, never removed), but they
may not be the headers you'd really like to identify as containing the
API being implemented.

Many forward declarations and missing includes were added to a header
files to allow them to parse cleanly when included first. The main
module rule does in fact have its merits. =]

llvm-svn: 169131
2012-12-03 16:50:05 +00:00
Manman Ren f89406ac78 CSE: allow PerformTrivialCoalescing to check copies across basic block
boundaries.

Given the following case:
BB0
  %vreg1<def> = SUBrr %vreg0, %vreg7
  %vreg2<def> = COPY %vreg7
BB1
  %vreg10<def> = SUBrr %vreg0, %vreg2
We should be able to CSE between SUBrr in BB0 and SUBrr in BB1.

rdar://12462006

llvm-svn: 168717
2012-11-27 18:58:41 +00:00
Jakub Staszak f18753b8d0 Don't use iterator after being erased.
llvm-svn: 168622
2012-11-26 22:14:19 +00:00
Ulrich Weigand 3946877f88 Do not consider a machine instruction that uses and defines the same
physical register as candidate for common subexpression elimination
in MachineCSE.

This fixes a bug on PowerPC in MultiSource/Applications/oggenc/oggenc
caused by MachineCSE invalidly merging two separate DYNALLOC insns.

llvm-svn: 167855
2012-11-13 18:40:58 +00:00
Jakob Stoklund Olesen 244beb42ce Remove unused BitVectors from getAllocatableSet().
llvm-svn: 165999
2012-10-16 00:05:06 +00:00
Jakob Stoklund Olesen c30a9af2d7 Switch most getReservedRegs() clients to the MRI equivalent.
Using the cached bit vector in MRI avoids comstantly allocating and
recomputing the reserved register bit vector.

llvm-svn: 165983
2012-10-15 21:57:41 +00:00
Benjamin Kramer 59c8b411e0 MachineCSE: Hoist isConstantPhysReg out of the loop, it checks for overlaps already.
llvm-svn: 161729
2012-08-11 20:42:59 +00:00
Benjamin Kramer ef6494f24d PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely.
This is common e.g. when doing rip-relative addressing on x86_64.

llvm-svn: 161728
2012-08-11 19:05:13 +00:00
Manman Ren 1be131ba27 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Manman Ren cb36b8c2e6 MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Bill Wendling d163405df8 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Nick Lewycky 765c699370 Remove ParentMap. You can just ask the domnode for its parent. No functionality
change.

Move the "Not profitable, avoid CSE!" debug message next to where we fail the
check for profitability and use a different message for avoiding CSE due to
being in different register classes.

llvm-svn: 159729
2012-07-05 06:19:21 +00:00
Jakob Stoklund Olesen 92a0083944 Switch some getAliasSet clients to MCRegAliasIterator.
MCRegAliasIterator can optionally visit the register itself, allowing
for simpler code.

llvm-svn: 157837
2012-06-01 20:36:54 +00:00
Craig Topper 1d32658877 Use uint16_t to store register overlaps to reduce static data.
llvm-svn: 152001
2012-03-04 10:43:23 +00:00
Jakob Stoklund Olesen 4c5ad2b812 Handle regmasks in MachineCSE.
Don't attempt to extend physreg live ranges across calls.

<rdar://problem/10942095>

llvm-svn: 151610
2012-02-28 02:08:50 +00:00
Lang Hames 5bade3dc6e Re-enable 150652 and 150654 - Make FPSCR non-reserved, and make MachineCSE bail on reserved registers. This *should* be safe as of r150786.
llvm-svn: 150769
2012-02-17 00:27:16 +00:00
Lang Hames 55a2a96153 Oop - r150653 + r150654 broke one of my test cases. Backing out for now...
llvm-svn: 150655
2012-02-16 02:32:10 +00:00
Lang Hames 2055493b97 MachineCSE shouldn't extend the live ranges of reserved or allocatable registers.
llvm-svn: 150653
2012-02-16 02:19:35 +00:00
Andrew Trick 1fa5bcbe2a Codegen pass definition cleanup. No functionality.
Moving toward a uniform style of pass definition to allow easier target configuration.
Globally declare Pass ID.
Globally declare pass initializer.
Use INITIALIZE_PASS consistently.
Add a call to the initializer from CodeGen.cpp.
Remove redundant "createPass" functions and "getPassName" methods.

While cleaning up declarations, cleaned up comments (sorry for large diff).

llvm-svn: 150100
2012-02-08 21:23:13 +00:00
Andrew Trick 9e761997d8 whitespace
llvm-svn: 150094
2012-02-08 21:22:43 +00:00
Duncan Sands ae22c60f90 Persuade GCC that there is nothing worth warning about here (there isn't).
llvm-svn: 149834
2012-02-05 14:20:11 +00:00
Evan Cheng d9725a38d6 Avoid CSE of instructions which define physical registers across MBBs unless
the physical registers are not allocatable.

llvm-svn: 147902
2012-01-11 00:38:11 +00:00
Evan Cheng 0be4144a68 Allow machine-cse to look across MBB boundary when cse'ing instructions that
define physical registers. It's currently very restrictive, only catching
cases where the CE is in an immediate (and only) predecessor. But it catches
a surprising large number of cases.

rdar://10660865

llvm-svn: 147827
2012-01-10 02:02:58 +00:00
Evan Cheng 7f8e563a69 Add bundle aware API for querying instruction properties and switch the code
generator to it. For non-bundle instructions, these behave exactly the same
as the MC layer API.

For properties like mayLoad / mayStore, look into the bundle and if any of the
bundled instructions has the property it would return true.
For properties like isPredicable, only return true if *all* of the bundled
instructions have the property.
For properties like canFoldAsLoad, isCompare, conservatively return false for
bundles.

llvm-svn: 146026
2011-12-07 07:15:52 +00:00
Bill Wendling 3e5409df77 We need to verify that the machine instruction we're using as a replacement for
our current machine instruction defines a register with the same register class
as what's being replaced. This showed up in the SPEC 403.gcc benchmark, where it
would ICE because a tail call was expecting one register class but was given
another. (The machine instruction verifier catches this situation.)
<rdar://problem/10270968>

llvm-svn: 141830
2011-10-12 23:03:40 +00:00
Evan Cheng 6cc775f905 - Rename TargetInstrDesc, TargetOperandInfo to MCInstrDesc and MCOperandInfo and
sink them into MC layer.
- Added MCInstrInfo, which captures the tablegen generated static data. Chang
TargetInstrInfo so it's based off MCInstrInfo.

llvm-svn: 134021
2011-06-28 19:10:37 +00:00
Eli Friedman 5401962643 Re-revert r130877; it's apparently causing a regression on 197.parser,
possibly related to cbnz formation.

llvm-svn: 130977
2011-05-06 05:23:07 +00:00
Eli Friedman 2311bdfa7b Minor correction to r130877; fixes PR9846 and hopefully the buildbot failures.
llvm-svn: 130925
2011-05-05 16:18:11 +00:00
Eli Friedman 0fe4608af2 Re-commit r130862 with a minor change to avoid an iterator running off the edge in some cases.
Original message:

Teach MachineCSE how to do simple cross-block CSE involving physregs.  This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .

llvm-svn: 130877
2011-05-04 22:10:36 +00:00
Eli Friedman 3bd79ba856 Back out r130862; it appears to be breaking bootstrap.
llvm-svn: 130867
2011-05-04 20:48:42 +00:00
Eli Friedman a16fc2fec0 Teach MachineCSE how to do simple cross-block CSE involving physregs. This allows, for example, eliminating duplicate cmpl's on x86. Part of rdar://problem/8259436 .
llvm-svn: 130862
2011-05-04 19:54:24 +00:00
Evan Cheng fe917efc8b Fix a couple of places where changes are made but not tracked.
llvm-svn: 129287
2011-04-11 18:47:20 +00:00
Chris Lattner 6c8b8dd522 fit in 80 cols and use MBB::isSuccessor instead of a hand
rolled std::find.

llvm-svn: 123164
2011-01-10 07:51:31 +00:00
Jakob Stoklund Olesen 2fb5b31578 Simplify a bunch of isVirtualRegister() and isPhysicalRegister() logic.
These functions not longer assert when passed 0, but simply return false instead.

No functional change intended.

llvm-svn: 123155
2011-01-10 02:58:51 +00:00
Evan Cheng 6eb516dbea Do not model all INLINEASM instructions as having unmodelled side effects.
Instead encode llvm IR level property "HasSideEffects" in an operand (shared
with IsAlignStack). Added MachineInstrs::hasUnmodeledSideEffects() to check
the operand when the instruction is an INLINEASM.

This allows memory instructions to be moved around INLINEASM instructions.

llvm-svn: 123044
2011-01-07 23:50:32 +00:00
Cameron Zwarich 18f164f7c9 Use a RecyclingAllocator to allocate values for MachineCSE's ScopedHashTable for
a 28% speedup of MachineCSE time on 403.gcc.

llvm-svn: 122735
2011-01-03 04:07:46 +00:00
Evan Cheng b7ff5a0f20 Teach machine cse to commute instructions.
llvm-svn: 121903
2010-12-15 22:16:21 +00:00
Evan Cheng 2b3f25e031 Teach machine cse to eliminate instructions with multiple physreg uses and defs. rdar://8610857.
llvm-svn: 117745
2010-10-29 23:36:03 +00:00
Owen Anderson 6c18d1aac0 Get rid of static constructors for pass registration. Instead, every pass exposes an initializeMyPassFunction(), which
must be called in the pass's constructor.  This function uses static dependency declarations to recursively initialize
the pass's dependencies.

Clients that only create passes through the createFooPass() APIs will require no changes.  Clients that want to use the
CommandLine options for passes will need to manually call the appropriate initialization functions in PassInitialization.h
before parsing commandline arguments.

I have tested this with all standard configurations of clang and llvm-gcc on Darwin.  It is possible that there are problems
with the static dependencies that will only be visible with non-standard options.  If you encounter any crash in pass
registration/creation, please send the testcase to me directly.

llvm-svn: 116820
2010-10-19 17:21:58 +00:00
Owen Anderson 8ac477ffb5 Begin adding static dependence information to passes, which will allow us to
perform initialization without static constructors AND without explicit initialization
by the client.  For the moment, passes are required to initialize both their
(potential) dependencies and any passes they preserve.  I hope to be able to relax
the latter requirement in the future.

llvm-svn: 116334
2010-10-12 19:48:12 +00:00
Owen Anderson df7a4f2515 Now with fewer extraneous semicolons!
llvm-svn: 115996
2010-10-07 22:25:06 +00:00
Jakob Stoklund Olesen 18842783cc Add MachineRegisterInfo::constrainRegClass and use it in MachineCSE.
This function is intended to be used when inserting a machine instruction that
trivially restricts the legal registers, like LEA requiring a GR32_NOSP
argument.

llvm-svn: 115875
2010-10-06 23:54:39 +00:00
Evan Cheng b08377e0db Machine CSE was forgetting to clear some data structures.
llvm-svn: 114222
2010-09-17 21:59:42 +00:00
Evan Cheng 0dcd3362bd Fix a potential bug that can cause miscomparison with and without debug info.
llvm-svn: 114220
2010-09-17 21:56:26 +00:00
Evan Cheng e0db9d01d9 Machine CSE preserves CFG. Pass manager was freeing machineloopinfo after machine cse before.
llvm-svn: 111281
2010-08-17 20:57:42 +00:00
Owen Anderson a7aed18624 Reapply r110396, with fixes to appease the Linux buildbot gods.
llvm-svn: 110460
2010-08-06 18:33:48 +00:00
Owen Anderson bda59bd247 Revert r110396 to fix buildbots.
llvm-svn: 110410
2010-08-06 00:23:35 +00:00
Owen Anderson 755aceb5d0 Don't use PassInfo* as a type identifier for passes. Instead, use the address of the static
ID member as the sole unique type identifier.  Clean up APIs related to this change.

llvm-svn: 110396
2010-08-05 23:42:04 +00:00
Owen Anderson a57b97e7e7 Fix batch of converting RegisterPass<> to INTIALIZE_PASS().
llvm-svn: 109045
2010-07-21 22:09:45 +00:00
Jakob Stoklund Olesen 37c42a3d02 Remove many calls to TII::isMoveInstr. Targets should be producing COPY anyway.
TII::isMoveInstr is going tobe completely removed.

llvm-svn: 108507
2010-07-16 04:45:42 +00:00
Jakob Stoklund Olesen 00264624a9 Convert EXTRACT_SUBREG to COPY when emitting machine instrs.
EXTRACT_SUBREG no longer appears as a machine instruction. Use COPY instead.

Add isCopy() checks in many places using isMoveInstr() and isExtractSubreg().
The isMoveInstr hook will be removed later.

llvm-svn: 107879
2010-07-08 16:40:22 +00:00
Jakob Stoklund Olesen 4c82a9e7d0 Detect and handle COPY in many places.
This code is transitional, it will soon be possible to eliminate
isExtractSubreg, isInsertSubreg, and isMoveInstr in most places.

llvm-svn: 107547
2010-07-03 00:04:37 +00:00
Evan Cheng a03e6f85fe Re-apply 105308 with fix.
llvm-svn: 105502
2010-06-04 23:28:13 +00:00
Bob Wilson 30093b5d8b Revert 105308.
llvm-svn: 105399
2010-06-03 18:28:31 +00:00
Evan Cheng a2da22734f Enable machine cse of instructions which define physical registers.
llvm-svn: 105308
2010-06-02 01:08:27 +00:00
Eric Christopher 53ff992dde Make this LookAheadLimit, not the uninitialized LookAheadLeft.
Evan please verify!

llvm-svn: 104408
2010-05-21 23:40:03 +00:00
Evan Cheng 2c8bdead9e Allow machine cse to cse instructions which define physical registers. Controlled by option -machine-cse-phys-defs.
llvm-svn: 104385
2010-05-21 21:22:19 +00:00
Dan Gohman 7767d2747b Add a utility function for conservatively clearing kill flags, and make
use of it in MachineCSE.

llvm-svn: 103726
2010-05-13 19:24:00 +00:00
Evan Cheng 4b2ef56ad2 Rewrite machine cse to avoid recursion.
llvm-svn: 101964
2010-04-21 00:21:07 +00:00
Evan Cheng 4019d571d9 Typo.
llvm-svn: 101914
2010-04-20 17:27:38 +00:00
Evan Cheng 604bc162da After trivial coalescing, the MI being visited may have become a copy. Avoid adding it to CSE hash table since copies aren't being considered for CSE and they may be deleted.
rdar://7819990

llvm-svn: 100170
2010-04-02 02:21:24 +00:00
Evan Cheng cf7be39e24 dbg_value may end a block.
llvm-svn: 99378
2010-03-24 01:50:28 +00:00
Evan Cheng c7d721aa03 Code clean up.
llvm-svn: 99319
2010-03-23 20:33:48 +00:00
Dale Johannesen 197bd3eee9 Fix debug_value handling.
llvm-svn: 98224
2010-03-11 02:10:24 +00:00
Evan Cheng 4c5f7a7f5e Add a couple more heuristics to neuter machine cse some more.
1. Be careful with cse "cheap" expressions. e.g. constant materialization. Only cse them when the common expression is local or in a direct predecessor. We don't want cse of cheap instruction causing other expressions to be spilled.
2. Watch out for the case where the expression doesn't itself uses a virtual register. e.g. lea of frame object. If the common expression itself is used by copies (common for passing addresses to function calls), don't perform the cse. Since these expressions do not use a register, it creates a live range but doesn't close any, we want to be very careful with increasing register pressure.

Note these are heuristics so machine cse doesn't make register allocator unhappy. Once we have proper live range splitting and re-materialization support in place, these should be evaluated again.

Now machine cse is almost always a win on llvm nightly tests on x86 and x86_64.

llvm-svn: 98121
2010-03-10 02:12:03 +00:00
Evan Cheng 51063739a4 Allow more cross-rc coalescing.
llvm-svn: 98048
2010-03-09 06:38:17 +00:00
Jakob Stoklund Olesen 7c699f92cd Don't do illegal cross-class coalescing.
llvm-svn: 98044
2010-03-09 03:56:06 +00:00
Evan Cheng 19e44b4510 - Make the machine cse dumb coalescer (as opposed to the more awesome simple
coalescer) handle sub-register classes.
- Add heuristics to avoid non-profitable cse. Given the current lack of live
  range splitting, avoid cse when an expression has PHI use and the would be
  new use is in a BB where the expression wasn't already being used.

llvm-svn: 98043
2010-03-09 03:21:12 +00:00
Evan Cheng c9e8621268 Don't waste time trying to CSE labels, phis, inline asm. Definitely avoid cse implicit-def for obvious performance reason.
llvm-svn: 98009
2010-03-08 23:49:12 +00:00
Evan Cheng 6ec41ee33c Restrict machine cse to really trivial coalescing. Leave the heavy lifting to a real coalescer.
llvm-svn: 98007
2010-03-08 23:28:08 +00:00
Evan Cheng 0f5f54784a Don't update physical register def.
llvm-svn: 97861
2010-03-06 01:14:19 +00:00
Evan Cheng 1abd1a9f4b Avoid cse load instructions unless they are known to be invariant loads.
llvm-svn: 97747
2010-03-04 21:18:08 +00:00
Evan Cheng 36f8aabb2c Look ahead a bit to determine if a physical register def that is not marked dead is really alive. This is necessary to catch a lot of common cse opportunities for targets like x86.
llvm-svn: 97706
2010-03-04 01:33:55 +00:00