Commit Graph

14248 Commits

Author SHA1 Message Date
Nadav Rotem 7c277da364 Add a new optimization pass: Stack Coloring, which merges disjoint static allocations (allocas). Allocas are known to be
disjoint if they are marked by disjoint lifetime markers (@llvm.lifetime.XXX intrinsics).

llvm-svn: 163299
2012-09-06 09:17:37 +00:00
Chad Rosier f24ae7b084 [ms-inline asm] Use the asm dialect from the MI to set the parser dialect.
llvm-svn: 163273
2012-09-05 23:57:37 +00:00
Chad Rosier e53314f7e3 Cleanup a few magic numbers.
llvm-svn: 163263
2012-09-05 22:40:13 +00:00
Roman Divacky ad06cee239 Stop casting away const qualifier needlessly.
llvm-svn: 163258
2012-09-05 22:26:57 +00:00
Chad Rosier cbd2a1983f [ms-inline asm] We only need one bit to represent the AsmDialect in the
MachineInstr.

llvm-svn: 163257
2012-09-05 22:17:43 +00:00
Roman Divacky 9338344acb Constify this properly. Found by gcc48 -Wcast-qual.
llvm-svn: 163256
2012-09-05 22:15:49 +00:00
Roman Divacky 665260222f Constify SDNodeIterator and stop its only non-const user from being cast-stripped
of its constness. Found by gcc48 -Wcast-qual.

llvm-svn: 163254
2012-09-05 22:03:34 +00:00
Chad Rosier 994f4040f5 [ms-inline asm] Propagate the asm dialect into the MachineInstr representation.
llvm-svn: 163243
2012-09-05 21:00:58 +00:00
Roman Divacky 09c8a3dde5 Remove unused typedefs gcc4.8 warns about.
llvm-svn: 163225
2012-09-05 17:55:46 +00:00
Silviu Baranga 3f40d87207 Fixed the DAG combiner to better handle the folding of AND nodes for vector types. The previous code was making the assumption that the length of the bitmask returned by isConstantSplat was equal to the size of the vector type. Now we first make sure that the splat value has at least the length of the vector lane type, then we only use as many fields as we have available in the splat value.
llvm-svn: 163203
2012-09-05 08:57:21 +00:00
Logan Chien 1b170de77a Reorder the comments of EmitExceptionTable.
llvm-svn: 163194
2012-09-05 06:28:26 +00:00
Craig Topper 2db2353b21 Convert vextracti128/vextractf128 intrinsics to extract_subvector at DAG build time. Similar was previously done for vinserti128/vinsertf128. Add patterns for folding these extract_subvectors with stores.
llvm-svn: 163192
2012-09-05 05:48:09 +00:00
Jakob Stoklund Olesen ade363e86c Search the whole instruction for tied operands.
Implicit uses can be dynamically tied to defs. This will soon be used
for predicated instructions on ARM.

llvm-svn: 163177
2012-09-04 22:59:30 +00:00
Jakob Stoklund Olesen d92e2bc2e9 Typo.
llvm-svn: 163154
2012-09-04 18:44:43 +00:00
Jakob Stoklund Olesen 9fceda741d Actually use the MachineOperand field for isRegTiedToDefOperand().
The MachineOperand::TiedTo field was maintained, but not used.

This patch enables it in isRegTiedToDefOperand() and
isRegTiedToUseOperand(), which are the actual functions used by the
register allocator.

llvm-svn: 163153
2012-09-04 18:43:25 +00:00
Jakob Stoklund Olesen c7579cdded Move tie checks into MachineVerifier::visitMachineOperand.
llvm-svn: 163152
2012-09-04 18:38:28 +00:00
Jakob Stoklund Olesen 0a09da83b6 Allow tied uses and defs in different orders.
After much agonizing, use a full 4 bits of precious MachineOperand space
to encode this. This uses existing padding, and doesn't grow
MachineOperand beyond its current 32 bytes.

This allows tied defs among the first 15 operands on a normal
instruction, just like the current MCInstrDesc constraint encoding.
Inline assembly needs to be able to tie more than the first 15 operands,
and gets special treatment.

Tied uses can appear beyond 15 operands, as long as they are tied to a
def that's in range.

llvm-svn: 163151
2012-09-04 18:36:28 +00:00
Preston Gurd cdf540d5d6 Generic Bypass Slow Div
- CodeGenPrepare pass for identifying div/rem ops
- Backend specifies the type mapping using addBypassSlowDivType
- Enabled only for Intel Atom with O2 32-bit -> 8-bit
- Replace IDIV with instructions which test its value and use DIVB if the value
is positive and less than 256.
- In the case when both the quotient and remainder of a divide are used, a DIV
and a REM instruction will be present in the IR. In the non-Atom case
they are both lowered to IDIVs and CSE removes the redundant IDIV instruction,
using the quotient and remainder from the first IDIV. However,
due to this optimization, CSE is not able to eliminate redundant
IDIV instructions because they are located in different basic blocks.
This is overcome by calculating both the quotient (DIV) and remainder (REM)
in each basic block that is inserted by the optimization and reusing the result
values when a subsequent DIV or REM instruction uses the same operands.
- Test cases check for the presence of the optimization when calculating
either the quotient, remainder, or both.

Patch by Tyler Nowicki!

llvm-svn: 163150
2012-09-04 18:22:17 +00:00
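
A minimal C++ sketch of the bypass idea described in the commit above (the function and names are illustrative assumptions, not the actual CodeGenPrepare output): both operands of a 32-bit divide are tested against 256, and when they fit in 8 bits the cheap divide produces quotient and remainder together, mirroring the DIVB path.

  // Illustrative sketch only; not LLVM code.
  #include <cstdint>

  // Returns A/B and stores A%B in *Rem, taking the 8-bit path when both
  // operands fit in 8 bits (the case the commit routes to DIVB on Atom).
  uint32_t bypassDiv(uint32_t A, uint32_t B, uint32_t *Rem) {
    if ((A | B) < 256) {                      // both values are less than 256
      uint8_t Q = uint8_t(A) / uint8_t(B);    // cheap 8-bit divide (DIVB)
      *Rem = uint8_t(A) % uint8_t(B);         // remainder from the same divide
      return Q;
    }
    *Rem = A % B;                             // full-width slow IDIV path
    return A / B;
  }

For example, bypassDiv(40, 7, &R) takes the 8-bit path, while bypassDiv(300, 7, &R) falls back to the full-width divide.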
Benjamin Kramer 8d9890ab69 IRBuilderify the SjLjEHPrepare pass.
No functionality change.

llvm-svn: 163115
2012-09-03 12:27:43 +00:00
Lang Hames 90152701eb When updating live range endpoints, make sure to preserve the early clobber bit.
Fixes PR13719.

llvm-svn: 163107
2012-09-03 06:31:45 +00:00
Nadav Rotem 10f6b8802b Fix a typo.
llvm-svn: 163094
2012-09-02 12:21:50 +00:00
Nadav Rotem 500d691d4a Generate better select code by allowing the target to use scalar select, and not sign-extend.
llvm-svn: 163086
2012-09-02 08:20:07 +00:00
Pete Cooper 2455e9c4a5 Only legalise a VSELECT into bitwise operations if the vector mask bool is all zeros or all ones. A vector bool whose true value is just 1 isn't suitable for masking with.
No test case, unfortunately, as I couldn't find a target which fit all
the conditions needed to hit this code.

llvm-svn: 163075
2012-09-01 22:27:48 +00:00
Pete Cooper 2117ac40c9 Revert "Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060"
This reverts commit 5dd9e214fb92847e947f9edab170f9b4e52b908f.

Thanks to Duncan for explaining how this should have been done.

Conflicts:

	test/CodeGen/X86/vec_select.ll

llvm-svn: 163064
2012-09-01 17:37:55 +00:00
Logan Chien 64f361e0e1 Fix typo.
llvm-svn: 163059
2012-09-01 12:11:41 +00:00
Owen Anderson 90e0eaffa8 Teach DAG combine a number of tricks to simplify FMA expressions in fast-math mode.
llvm-svn: 163051
2012-09-01 06:04:27 +00:00
Michael Liao ec385012ae Fix typo
llvm-svn: 163049
2012-09-01 04:09:16 +00:00
Jakob Stoklund Olesen 5c8eda0ebc Add MachineInstr::tieOperands, remove setIsTied().
Manage tied operands entirely internally to MachineInstr. This makes it
possible to change the representation of tied operands, as I will do
shortly.

The constraint that tied uses and defs must be in the same order was too
restrictive.

llvm-svn: 163021
2012-08-31 20:50:53 +00:00
Craig Topper a8227cb76a Use CloneMachineInstr to make a new MI in commuteInstruction to make the code tolerant of instructions with more than two input operands.
llvm-svn: 163000
2012-08-31 16:30:05 +00:00
Jakob Stoklund Olesen 96f87069c4 Don't enforce ordered inline asm operands.
I was too optimistic; inline asm can have tied operands that don't
follow the def order.

Fixes PR13742.

llvm-svn: 162998
2012-08-31 15:34:59 +00:00
Pete Cooper e969340fea Take account of boolean vector contents when promoting a build vector from i1 to some other type. rdar://problem/12210060
llvm-svn: 162960
2012-08-30 23:58:52 +00:00
Owen Anderson cc61f87cf7 Teach the DAG combiner to turn chains of FADDs (x+x+x+x+...) into FMULs by constants. This is only enabled in unsafe FP math mode, since it does not preserve rounding effects for all such constants.
llvm-svn: 162956
2012-08-30 23:35:16 +00:00
Nadav Rotem ea973bda26 Currently, targets that do not support selects with scalar conditions and vector operands scalarize the code. ARM is such a target
because it does not support CMOV of vectors. To implement this efficiently, we broadcast the condition bit and use a sequence of NAND-OR
to select between the two operands. This is the same sequence we use for targets that don't have vector BLENDs (like SSE2).

rdar://12201387

llvm-svn: 162926
2012-08-30 19:17:29 +00:00
Jakob Stoklund Olesen 0eecbbeb5b Don't use MCInstrDesc flags for implicit operands.
When a MachineInstr is constructed, its implicit operands are added
first, then the explicit operands are inserted before the implicits.

MCInstrDesc has operand flags like early clobber and operand ties that
apply to the explicit operands.

Don't look at those flags when the implicit operands are first added in
the explicit operands' positions.

llvm-svn: 162910
2012-08-30 14:39:06 +00:00
Craig Topper 2da13f9ef8 Add FMA to switch statement in VectorLegalizer::LegalizeOp so that it can be expanded when it isn't legal.
llvm-svn: 162894
2012-08-30 07:34:22 +00:00
Craig Topper c8f5d77e75 Add support for FMA to WidenVectorResult.
llvm-svn: 162893
2012-08-30 07:13:41 +00:00
Jakob Stoklund Olesen ffba07b927 Verify the order of tied operands in inline asm.
When there are multiple tied use-def pairs on an inline asm instruction,
the tied uses must appear in the same order as the defs.

It is possible to write an LLVM IR inline asm instruction that breaks
this constraint, but there is no reason for a front end to emit the
operands out of order.

The GNU inline asm syntax specifies tied operands as a single read/write
constraint "+r", so out-of-order operands are not possible.

llvm-svn: 162878
2012-08-29 23:52:52 +00:00
Jakob Stoklund Olesen b2bef482fd Set the isTied flags when building INLINEASM MachineInstrs.
For normal instructions, isTied() is set automatically by addOperand(),
based on MCInstrDesc, but inline asm has tied operands outside the
descriptor.

llvm-svn: 162869
2012-08-29 22:02:00 +00:00
Jakob Stoklund Olesen cea3e77433 Rename hasVolatileMemoryRef() to hasOrderedMemoryRef().
Ordered memory operations are more constrained than volatile loads and
stores because they must be ordered with respect to all other memory
operations.

llvm-svn: 162861
2012-08-29 21:19:21 +00:00
Jakob Stoklund Olesen 813a109fa5 Don't move normal loads across volatile/atomic loads.
It is technically allowed to move a normal load across a volatile load,
but probably not a good idea.

It is not allowed to move a load across an atomic load with
Ordering > Monotonic, and we model those with MOVolatile as well.

I recently removed the mayStore flag from atomic load instructions, so
they don't need a pseudo-opcode. This patch makes up for the difference.

llvm-svn: 162857
2012-08-29 20:48:45 +00:00
Jakob Stoklund Olesen 7a837b9a76 Verify the consistency of inline asm operands.
The operands on an INLINEASM machine instruction are divided into groups
headed by immediate flag operands. Verify this structure.

Extract verifyTiedOperands(), and only call it for non-inlineasm
instructions.

llvm-svn: 162849
2012-08-29 18:11:05 +00:00
Eric Christopher 2a4e616df6 Clean this up slightly; it doesn't really fall through.
llvm-svn: 162848
2012-08-29 17:59:32 +00:00
Jakob Stoklund Olesen dbbff7899d Verify the tied operand flags.
When running with -verify-machineinstrs, check that tied operands come
in matching use/def pairs, and that they are consistent with MCInstrDesc
when it applies.

llvm-svn: 162816
2012-08-29 00:38:03 +00:00
Jakob Stoklund Olesen 2b16664522 Maintain a valid isTied bit as operands are added and removed.
The isTied bit is set automatically when a tied use is added and
MCInstrDesc indicates a tied operand. The tie is broken when one of the
tied operands is removed.

llvm-svn: 162814
2012-08-29 00:37:58 +00:00
Jakob Stoklund Olesen e56c60c5eb Add a MachineOperand::isTied() flag.
While in SSA form, a MachineInstr can have pairs of tied defs and uses.
The tied operands are used to represent read-modify-write operands that
must be assigned the same physical register.

Previously, tied operand pairs were computed from fixed MCInstrDesc
fields, or by using black magic on inline assembly instructions.

The isTied flag makes it possible to add tied operands to any
instruction while getting rid of (some of) the inlineasm magic.

Tied operands on normal instructions are needed to represent predicated
individual instructions in SSA form. An extra <tied,imp-use> operand is
required to represent the output value when the instruction predicate is
false.

Adding a predicate to:

  %vreg0<def> = ADD %vreg1, %vreg2

will look like:

  %vreg0<tied,def> = ADD %vreg1, %vreg2, pred:3, %vreg7<tied,imp-use>

The virtual register %vreg7 is the value given to %vreg0 when the
predicate is false. It will be assigned the same physreg as %vreg0.

This commit adds the isTied flag and sets it based on MCInstrDesc when
building an instruction. The flag is not used for anything yet.

llvm-svn: 162774
2012-08-28 18:34:41 +00:00
Jakob Stoklund Olesen dba99d0dfa Don't allow TargetFlags on MO_Register MachineOperands.
Register operands are manipulated by a lot of target-independent code,
and it is not always possible to preserve target flags. That means it is
not safe to use target flags on register operands.

None of the targets in the tree are using register operand target flags.
External targets should be using immediate operands to annotate
instructions with operand modifiers.

llvm-svn: 162770
2012-08-28 18:05:48 +00:00
Jakob Stoklund Olesen 87cb471e52 Remove extra MayLoad/MayStore flags from atomic_load/store.
These extra flags are not required to properly order the atomic
load/store instructions. SelectionDAGBuilder chains atomics as if they
were volatile, and SelectionDAG::getAtomic() sets the isVolatile bit on
the memory operands of all atomic operations.

The volatile bit is enough to order atomic loads and stores during and
after SelectionDAG.

This means we set mayLoad on atomic_load, mayStore on atomic_store, and
mayLoad+mayStore on the remaining atomic read-modify-write operations.

llvm-svn: 162733
2012-08-28 03:11:32 +00:00
Akira Hatanaka adb14f56c7 Fix bug 13532.
In SelectionDAGLegalize::ExpandLegalINT_TO_FP, expand INT_TO_FP nodes without
using any f64 operations if f64 is not a legal type.

Patch by Stefan Kristiansson. 

llvm-svn: 162728
2012-08-28 02:12:42 +00:00
Richard Smith 228e6d4cf3 Fix integer undefined behavior due to signed left shift overflow in LLVM.
Reviewed offline by chandlerc.

llvm-svn: 162623
2012-08-24 23:29:28 +00:00
Jakob Stoklund Olesen 10cdd09318 Avoid including explicit uses when counting SDNode imp-uses.
It is legal to have a register node as an explicit operand; it shouldn't
be counted as an implicit use.

llvm-svn: 162591
2012-08-24 20:52:42 +00:00
Manman Ren cf10446ffa BranchProb: modify the definition of an edge in BranchProbabilityInfo to handle
the case of multiple edges from one block to another.

A simple example is a switch statement with multiple values to the same
destination. The definition of an edge is modified from a pair of blocks to
a pair of PredBlock and an index into the successors.

Also set the weight correctly when building SelectionDAG from LLVM IR,
especially when converting a Switch.
IntegersSubsetMapping is updated to calculate the weight for each cluster.

llvm-svn: 162572
2012-08-24 18:14:27 +00:00
Eric Christopher bb69a27dbc Use DW_FORM_flag_present to save space in debug information if we're
not in darwin gdb compat mode.

Fixes rdar://10975088

llvm-svn: 162526
2012-08-24 01:14:27 +00:00
Eric Christopher acb7115bde Remove the DW_AT_MIPS_linkage_name attribute when we don't need it in the
output (we're emitting a specification already and the information
isn't changing) and we're not in old gdb compat mode.

Saves 1% on the debug information for a build of llvm.

Fixes rdar://11043421

llvm-svn: 162493
2012-08-23 22:52:55 +00:00
Eric Christopher 20b76a77c3 Turn these two options into a trinary state so that they can be
turned on and off separately from the platform if you're on darwin.

llvm-svn: 162487
2012-08-23 22:36:40 +00:00
Eric Christopher 4977f214d7 Add a flag to DwarfDebug to allow it to communicate whether or not
we're using the darwin old gdb compat mode for emitting dwarf.

llvm-svn: 162486
2012-08-23 22:36:36 +00:00
Eric Christopher a876b8243e Typo.
llvm-svn: 162438
2012-08-23 07:32:06 +00:00
Eric Christopher 3a47c3e3cd Only emit the __debug_inlined section if we're trying to be compatible
with older gdbs on darwin.

rdar://10975874

llvm-svn: 162436
2012-08-23 07:32:02 +00:00
Eric Christopher 7782618271 Emit pubtypes only when going for darwin gdb compatibility.
rdar://10393214

llvm-svn: 162434
2012-08-23 07:10:56 +00:00
Eric Christopher 978fbff11b Add an option for darwin gdb compatibility.
llvm-svn: 162432
2012-08-23 07:10:46 +00:00
Andrew Trick ae53561b0c Simplify the computeOperandLatency API.
The logic for recomputing latency based on a ScheduleDAG edge was
shady. This bypasses the problem by requiring the client to provide
operand indices. This ensures consistent use of the machine model's
API.

llvm-svn: 162420
2012-08-23 00:39:43 +00:00
David Blaikie c8c2920a3f Tidy up a few more uses of MF.getFunction()->getName().
Based on CR feedback from r162301 and Craig Topper's refactoring in r162347,
here are a few other places that could use the same API (& in one instance drop
a Function.h dependency).

llvm-svn: 162367
2012-08-22 17:18:53 +00:00
Benjamin Kramer f29db275b2 Reduce duplicated hash map lookups.
llvm-svn: 162362
2012-08-22 15:37:57 +00:00
Stepan Dyatkovskiy 99120e04be Rejected 162195. As Duncan commented, bitcasting to the proper type is the wrong approach. We need to insert some valid TRUNCATE node here.
llvm-svn: 162354
2012-08-22 09:33:55 +00:00
Craig Topper a538d831e6 Add a getName function to MachineFunction. Use it in places that previously did getFunction()->getName(). Remove includes of Function.h that are no longer needed.
llvm-svn: 162347
2012-08-22 06:07:19 +00:00
Richard Smith 3fb2047f82 Initialize SelectionDAGBuilder's Context in 'init', not in its constructor. The
SelectionDAG's 'init' has not been called when the SelectionDAGBuilder is
constructed (in SelectionDAGISel's constructor), so this was previously always
initialized with 0.

llvm-svn: 162333
2012-08-22 00:42:39 +00:00
David Blaikie 9c7226b456 Remove unnecessary cast that was also unnecessarily casting away constness.
Even looking at the revision history I couldn't quite piece together why this
cast was ever written in the first place, but I assume it was because of some
change in the inheritance: perhaps this function was reimplemented in a
derived type & this caller was meant to get the base version (& it wasn't
virtual)?

llvm-svn: 162301
2012-08-21 18:54:23 +00:00
Chad Rosier d269bd8c24 Add support for the --param ssp-buffer-size= driver option.
PR9673

llvm-svn: 162284
2012-08-21 16:15:24 +00:00
Jakob Stoklund Olesen 6bae2a57d5 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

This patch was originally committed as r161460, but reverted again
because of assertion failures. Now that duplicate Machine CFG edges have
been eliminated, this works properly.

llvm-svn: 162233
2012-08-20 22:01:38 +00:00
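
For illustration, a self-contained C++ sketch of the quadratic pattern this commit removes (hypothetical names, not the MachineBranchProbabilityInfo API): re-searching the successor list for a weight whose position is already known makes the sum O(n^2), while walking the weight list directly is O(n).

  // Hypothetical sketch of the performance issue; not the actual LLVM code.
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Quadratic: for each successor, search the list again to find its weight.
  uint32_t sumWeightsQuadratic(const std::vector<int> &Succs,
                               const std::vector<uint32_t> &Weights) {
    uint32_t Sum = 0;
    for (int S : Succs)
      for (std::size_t I = 0; I != Succs.size(); ++I)
        if (Succs[I] == S) { Sum += Weights[I]; break; }  // linear re-search
    return Sum;
  }

  // Linear: the position is already known, so walk the weights once.
  uint32_t sumWeightsLinear(const std::vector<uint32_t> &Weights) {
    uint32_t Sum = 0;
    for (uint32_t W : Weights)
      Sum += W;
    return Sum;
  }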
Jakob Stoklund Olesen 7d33c5739f Don't add CFG edges for redundant conditional branches.
IR that hasn't been through SimplifyCFG can look like this:

  br i1 %b, label %r, label %r

Make sure we don't create duplicate Machine CFG edges in this case.

Fix the machine code verifier to accept conditional branches with a
single CFG edge.

llvm-svn: 162230
2012-08-20 21:39:52 +00:00
Jakob Stoklund Olesen 1d0262677b Add a verification pass after ExpandISelPseudos.
This pass often has weird CFG hacks and hand-written MI building code
that can go wrong in many ways.

llvm-svn: 162224
2012-08-20 20:52:08 +00:00
Jakob Stoklund Olesen de31b52c3e Add CFG checks to MachineVerifier.
Verify that the predecessor and successor lists are consistent and free
of duplicates.

llvm-svn: 162223
2012-08-20 20:52:06 +00:00
Stepan Dyatkovskiy 6a638ec521 Fixed DAGCombiner bug (found and localized by James Malloy):
The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.

However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:

Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...

It will attempt to:

Node0:    v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...

Where this is actually invalid because the element width is completely
different. This causes an assertion failure at the DAG legalization stage.

Fix:
If the output element type of the BUILD_VECTOR differs from the input element type,
make the concat_vectors based on the input element type and then bitcast it to the output vector type. So the case described above will be transformed to:
Node0:    v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1

llvm-svn: 162195
2012-08-20 07:57:06 +00:00
Eli Friedman 79a6b30d8a Make atomic load and store of pointers work. Tighten verification of atomic operations
so other unexpected operations don't slip through.  Based on patch by Logan Chien.
PR11786/PR13186.

llvm-svn: 162146
2012-08-17 23:24:29 +00:00
Bill Wendling bfb9b7598d Implement stack protectors for structures with character arrays in them.
<rdar://problem/10545247>

llvm-svn: 162131
2012-08-17 20:59:56 +00:00
Bill Wendling 34bc34ecae Change the `linker_private_weak_def_auto' linkage to `linkonce_odr_auto_hide' to
make it more consistent with its intended semantics.

The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with an `l' (ell) prefix
among other things.

The intended semantic is more like the `linkonce_odr' linkage type.

Change the name of the linkage type to `linkonce_odr_auto_hide', thereby
changing the semantics so that it produces the correct output for the linker.

Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>

llvm-svn: 162114
2012-08-17 18:33:14 +00:00
Benjamin Kramer ca7ca4f6c6 TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
llvm-svn: 162101
2012-08-17 15:54:21 +00:00
Jakob Stoklund Olesen 714f595c98 Use standard pattern for iterate+erase.
Increment the MBB iterator at the top of the loop to properly handle the
current (and previous) instructions getting erased.

This fixes PR13625.

llvm-svn: 162099
2012-08-17 14:38:59 +00:00
Jakob Stoklund Olesen 2382d320b3 Add an MCID::Select flag and TII hooks for optimizing selects.
Select instructions pick one of two virtual registers based on a
condition, like x86 cmov. On targets like ARM that support predication,
selects can sometimes be eliminated by predicating the instruction
defining one of the operands.

Teach PeepholeOptimizer to recognize select instructions, and ask the
target to optimize them.

llvm-svn: 162059
2012-08-16 23:11:47 +00:00
Richard Smith 8f3447c032 Fix undefined behavior: don't perform array indexing through a potentially null
pointer.

llvm-svn: 161919
2012-08-15 01:39:31 +00:00
Richard Smith 0ff8f0eaf9 Fix undefined behavior: binding null pointer to reference. No functionality change.
llvm-svn: 161853
2012-08-14 05:31:26 +00:00
Eric Christopher 160522c25a Grammar.
llvm-svn: 161851
2012-08-14 05:13:29 +00:00
Owen Anderson a40319b7f1 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
2012-08-13 23:32:49 +00:00
Jakob Stoklund Olesen 396b595b92 Transfer weights in transferSuccessorsAndUpdatePHIs().
llvm-svn: 161805
2012-08-13 23:13:25 +00:00
Jakob Stoklund Olesen 1dc107a84e Print out MachineBasicBlock successor weights when available.
llvm-svn: 161804
2012-08-13 23:13:23 +00:00
Jakob Stoklund Olesen 702bcc3bcf Remove the TII::scheduleTwoAddrSource() hook.
It never does anything when running 'make check', and it gets in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794
2012-08-13 21:52:57 +00:00
Bill Wendling 49aeb5cc5d Whitespace cleanup.
llvm-svn: 161788
2012-08-13 21:20:43 +00:00
Jakob Stoklund Olesen d0af1d9657 Count triangles and diamonds in early if-conversion.
llvm-svn: 161783
2012-08-13 21:03:27 +00:00
Jakob Stoklund Olesen 62a097d134 Delete dead typedef.
llvm-svn: 161782
2012-08-13 21:03:25 +00:00
Jakob Stoklund Olesen 83a927d84a Handle extra Tail predecessors in if-conversion.
It is still possible to if-convert if the tail block has extra
predecessors, but the tail phis must be rewritten instead of being
removed.

llvm-svn: 161781
2012-08-13 20:49:04 +00:00
Benjamin Kramer 59c8b411e0 MachineCSE: Hoist isConstantPhysReg out of the loop; it checks for overlaps already.
llvm-svn: 161729
2012-08-11 20:42:59 +00:00
Benjamin Kramer ef6494f24d PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely.
This is common e.g. when doing rip-relative addressing on x86_64.

llvm-svn: 161728
2012-08-11 19:05:13 +00:00
Jakob Stoklund Olesen bc55bfde03 Add a proper if-conversion cost model.
Detect when there is not enough available ILP, so if-conversion can't
speculate instructions for free.

Compute the lengthening of the critical path when inserting a select
instruction that depends on the condition as well as both sides of the
if.

Reject conversions that would stretch the critical path by more than
half a mispredict penalty.

llvm-svn: 161713
2012-08-10 22:27:31 +00:00
Jakob Stoklund Olesen a0042acd3b Give MachineTraceMetrics its own debug tag.
llvm-svn: 161712
2012-08-10 22:27:29 +00:00
Jakob Stoklund Olesen 3484420927 Add more trace query functions.
Trace::getResourceLength() computes the number of cycles required to
execute the trace when ignoring data dependencies. The number can be
compared to the critical path to estimate the trace ILP.

Trace::getPHIDepth() computes the data dependency depth of a PHI in a
trace successor that isn't necessarily part of the trace.

llvm-svn: 161711
2012-08-10 22:27:27 +00:00
Jakob Stoklund Olesen 0a99062cf6 Add getTPred() and getFPred() functions.
They identify the PHI predecessors in both diamonds and triangles.

llvm-svn: 161689
2012-08-10 20:19:17 +00:00
Jakob Stoklund Olesen 0954d4199a Include loop-carried dependencies when computing instr heights.
When a trace ends with a back-edge, include PHIs in the loop header in
the height computations. This makes the critical path through a loop
more accurate by including the latencies of the last instructions in the
loop.

llvm-svn: 161688
2012-08-10 20:11:38 +00:00
Jakob Stoklund Olesen 8c28ac9ec9 Update edge weights correctly in replaceSuccessor().
When replacing Old with New, it can happen that New is already a
successor. Add the old and new edge weights instead of creating a
duplicate edge.

llvm-svn: 161653
2012-08-10 03:23:27 +00:00
Jakob Stoklund Olesen d9b66506a3 Reapply r161633-161634 "Partition use lists so defs always come before uses."
No changes to these patches; MRI needed to be notified when changing
uses into defs and vice versa.

llvm-svn: 161644
2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen ae7b9711b1 Also update MRI use lists when changing a use to a def and vice versa.
This was the cause of the buildbot failures.

llvm-svn: 161643
2012-08-10 00:21:26 +00:00
Jakob Stoklund Olesen acd27c9279 Revert r161633-161634 "Partition use lists so defs always come before uses."
These commits broke a number of buildbots.

llvm-svn: 161640
2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen df01e00710 Partition use lists so defs always come before uses.
This makes it possible to speed up def_iterator by stopping at the first
use. This makes def_empty() and getUniqueVRegDef() much faster when
there are many uses.

In a +Asserts build, LiveVariables is 100x faster in one case because
getVRegDef() has an assertion that would scan to the end of a
def_iterator chain.

Spill weight calculation is significantly faster (300x in one case)
because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP)
which calls def_empty(%RIP).

llvm-svn: 161634
2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen 7d7051ca3c Don't use pointer-pointers for the register use lists.
Use a more conventional doubly linked list where the Prev pointers form
a cycle. This means it is no longer necessary to adjust the Prev
pointers when reallocating the VRegInfo array.

The test changes are required because the register allocation hint is
using the use-list order to break ties.

llvm-svn: 161633
2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen c4102d4902 Move use list management into MachineRegisterInfo.
Register MachineOperands are kept in linked lists accessible via MRI's
reg_iterator interfaces. The linked list management was handled partly
by MachineOperand methods, partly by MRI methods.

Move all of the list management into MRI, delete
MO::AddRegOperandToRegInfo() and MO::RemoveRegOperandFromRegInfo().

Be more explicit about handling the cases where an MRI pointer isn't
available.

llvm-svn: 161632
2012-08-09 22:49:37 +00:00
Jakob Stoklund Olesen 420798ca4f Fix a future TwoAddressInstructionPass crash.
No test case, the crash only happens when the default use list order is
changed.

llvm-svn: 161627
2012-08-09 22:08:26 +00:00
Nadav Rotem e0f84d31c8 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 

llvm-svn: 161565
2012-08-09 01:56:44 +00:00
Jakob Stoklund Olesen f71bc7b267 Don't use getNextOperandForReg() in RAFast.
That particular optimization was probably premature anyway.

llvm-svn: 161541
2012-08-08 23:44:01 +00:00
Jakob Stoklund Olesen bf1ac4bdc3 Deal with irreducible control flow when building traces.
We filter out MachineLoop back-edges during the trace-building PO
traversals, but it is possible to have CFG cycles that aren't natural
loops, and MachineLoopInfo doesn't include such cycles.

Use a standard visited set to detect such CFG cycles, and completely
ignore them when picking traces.

llvm-svn: 161532
2012-08-08 22:12:01 +00:00
Jakob Stoklund Olesen fa8a26f9df Heed -stress-early-ifcvt.
llvm-svn: 161513
2012-08-08 18:24:23 +00:00
Jakob Stoklund Olesen e71b6c6b20 Get the MispredictPenalty from MCSchedModel.
Thanks, Andy!

llvm-svn: 161507
2012-08-08 18:19:58 +00:00
Andrew Trick db9b1b5e66 Minor cleanup of defaultDefLatency API
llvm-svn: 161470
2012-08-08 02:44:11 +00:00
Jakob Stoklund Olesen 0556be983d Revert "Fix a quadratic algorithm in MachineBranchProbabilityInfo."
It caused an assertion failure when compiling consumer-typeset.

llvm-svn: 161463
2012-08-08 01:10:31 +00:00
Manman Ren 1be131ba27 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and the like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen c0b61ff9c7 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

llvm-svn: 161460
2012-08-08 00:20:37 +00:00
Jakob Stoklund Olesen fbf45dc2bd Skip tied operand pairs that already have the same register.
llvm-svn: 161454
2012-08-07 22:47:06 +00:00
Jakob Stoklund Olesen 505715d816 Add SelectionDAG::getTargetIndex.
This adds support for TargetIndex operands during isel. The meaning of
these (index, offset, flags) operands is entirely defined by the target.

llvm-svn: 161453
2012-08-07 22:37:05 +00:00
Bill Wendling 61396b81a4 For non-Darwin platforms, we want to generate stack protectors only for
character arrays. This is in line with what GCC does.
<rdar://problem/10529227>

llvm-svn: 161446
2012-08-07 20:59:05 +00:00
Jakob Stoklund Olesen 84689b0d5a Add a new kind of MachineOperand: MO_TargetIndex.
A target index operand looks a lot like a constant pool reference, but
it is completely target-defined. It contains the 8-bit TargetFlags, a
32-bit index, and a 64-bit offset. It is preserved by all code generator
passes.

TargetIndex operands can be used to carry target-specific information in
cases where immediate operands won't suffice.

llvm-svn: 161441
2012-08-07 18:56:39 +00:00
Jakob Stoklund Olesen 296448b293 Fix a couple of typos.
llvm-svn: 161437
2012-08-07 18:32:57 +00:00
Jakob Stoklund Olesen 75d9d5159e Add trace accessor methods, implement primitive if-conversion heuristic.
Compare the critical paths of the two traces through an if-conversion
candidate. If the difference is larger than the branch prediction
penalty, reject the if-conversion; it would never pay.

llvm-svn: 161433
2012-08-07 18:02:19 +00:00
Chandler Carruth 881d0a7966 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Manman Ren cb36b8c2e6 MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Jakob Stoklund Olesen a9d0b850b3 Delete a dead variable.
TwoAddressInstructionPass doesn't remat any more.

llvm-svn: 161285
2012-08-04 00:04:03 +00:00
Jakob Stoklund Olesen a0c72ecf79 TwoAddressInstructionPass refactoring: Extract another method.
llvm-svn: 161284
2012-08-03 23:57:58 +00:00
Bob Wilson 874886cd66 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Jakob Stoklund Olesen 1162a1548b TwoAddressInstructionPass refactoring: Extract a method.
No functional change intended, except replacing a DenseMap with a
SmallDenseMap which should behave identically.

llvm-svn: 161281
2012-08-03 23:25:45 +00:00
Jakob Stoklund Olesen 24bc514c0c Begin adding support for updating LiveIntervals in TwoAddressInstructionPass.
This is far from complete, and only changes behavior when the
-early-live-intervals flag is passed to llc.

llvm-svn: 161273
2012-08-03 22:58:34 +00:00
Jakob Stoklund Olesen 1c46589290 Add an experimental -early-live-intervals option.
This option runs LiveIntervals before TwoAddressInstructionPass which
will eventually learn to exploit and update the analysis.

Eventually, LiveIntervals will run before PHIElimination, and we can get
rid of LiveVariables.

llvm-svn: 161270
2012-08-03 22:12:54 +00:00
Jakob Stoklund Olesen 918999db95 Delete merged physreg copies in joinReservedPhysReg().
Previously, the identity copy would survive through register allocation
before it was removed by the rewriter.

llvm-svn: 161269
2012-08-03 22:12:51 +00:00
Bob Wilson 871701c606 Try to reduce the compile time impact of r161232.
The previous change caused fast isel to not attempt handling any calls to
builtin functions.  That included things like "printf" and caused some
noticeable regressions in compile time.  I wanted to avoid having fast isel
keep a separate list of functions that had to be kept in sync with what the
code in SelectionDAGBuilder.cpp was handling.  I've resolved that here by
moving the list into TargetLibraryInfo.  This is somewhat redundant in
SelectionDAGBuilder but it will ensure that we keep things consistent.

llvm-svn: 161263
2012-08-03 21:26:24 +00:00
Bob Wilson fa59485b94 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Jakob Stoklund Olesen daae19f785 Completely eliminate VNInfo flags.
The 'unused' state of a value number can be represented as an invalid
def SlotIndex. This also exposed code that shouldn't have been looking
at unused value VNInfos.

llvm-svn: 161258
2012-08-03 20:59:32 +00:00
Jakob Stoklund Olesen 21809385a6 Fix a couple of loops that were processing unused value numbers.
Unused VNInfos should be left alone. Their def SlotIndex doesn't point
to anything.

llvm-svn: 161257
2012-08-03 20:59:29 +00:00
Matt Beaumont-Gay aaba08d503 Silence unused variable warning in -asserts build
llvm-svn: 161256
2012-08-03 20:54:11 +00:00
Jakob Stoklund Olesen 9f565e19c5 Eliminate the VNInfo::hasPHIKill() flag.
The only real user of the flag was removeCopyByCommutingDef(), and it
has been switched to LiveIntervals::hasPHIKill().

All the code changed by this patch was only concerned with computing and
propagating the flag.

llvm-svn: 161255
2012-08-03 20:19:44 +00:00
Jakob Stoklund Olesen 06d6a5363b Make the hasPHIKills flag a computed property.
The VNInfo::HAS_PHI_KILL is only half supported. We precompute it in
LiveIntervalAnalysis, but it isn't properly updated by live range
splitting and functions like shrinkToUses().

It is only used in one place: RegisterCoalescer::removeCopyByCommutingDef().

This patch changes that function to use a new LiveIntervals::hasPHIKill()
function that computes the flag for a given value number.

llvm-svn: 161254
2012-08-03 20:10:24 +00:00
Jakob Stoklund Olesen 19c4596629 Delete dead function.
llvm-svn: 161242
2012-08-03 15:21:21 +00:00
Jakob Stoklund Olesen 47ac20d4d6 Don't delete dead code in TwoAddressInstructionPass.
This functionality was added before we started running
DeadMachineInstructionElim on all targets. It serves no purpose now.

llvm-svn: 161241
2012-08-03 15:11:57 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Manman Ren ba8122cc25 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Jakob Stoklund Olesen 5d30630e22 Compute the critical path length through a trace.
Whenever both instruction depths and instruction heights are known in a
block, it is possible to compute the length of the critical path as
max(depth+height) over the instructions in the block.

The stored live-in lists make it possible to accurately compute the
length of a critical path that bypasses the current (small) block.

llvm-svn: 161197
2012-08-02 18:45:54 +00:00
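
As a small illustration of the formula in the commit above (a sketch with made-up types, not the MachineTraceMetrics interface): once every instruction's depth and height in a block are known, the critical path length is the maximum of depth + height over those instructions.

  // Sketch only: critical path = max(depth + height) over the block.
  #include <algorithm>
  #include <utility>
  #include <vector>

  unsigned criticalPathLength(
      const std::vector<std::pair<unsigned, unsigned>> &DepthHeight) {
    unsigned Len = 0;
    for (const auto &DH : DepthHeight)        // DH = {depth, height} in cycles
      Len = std::max(Len, DH.first + DH.second);
    return Len;
  }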
Jakob Stoklund Olesen 637c467528 Verify regunit intervals along with virtreg intervals.
Don't cause regunit intervals to be computed just to verify them. Only
check the already cached intervals.

llvm-svn: 161183
2012-08-02 16:36:50 +00:00
Jakob Stoklund Olesen 374071dde2 Avoid creating dangling physreg live ranges during DCE.
LiveRangeEdit::eliminateDeadDefs() can delete a dead instruction that
reads unreserved physregs. This would leave the corresponding regunit
live interval dangling because we don't have shrinkToUses() for physical
registers.

Fix this problem by turning the instruction into a KILL instead of
deleting it. This happens in a landing pad in
test/CodeGen/X86/2012-05-19-CoalescerCrash.ll:

  %vreg27<def,dead> = COPY %EDX<kill>; GR32:%vreg27

becomes:

  KILL %EDX<kill>

An upcoming fix to the machine verifier will catch problems like this by
verifying regunit live intervals.

This fixes PR13498. I am not including the test case from the PR since
we already have one exposing the problem once the verifier is fixed.

llvm-svn: 161182
2012-08-02 16:36:47 +00:00
Jakob Stoklund Olesen bde5dc5e46 Add report() functions that take a LiveInterval argument.
llvm-svn: 161178
2012-08-02 14:31:49 +00:00
Manman Ren 5759d01230 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Jakob Stoklund Olesen e736b97eff Extract some methods from verifyLiveIntervals.
No functional change.

llvm-svn: 161149
2012-08-02 00:20:20 +00:00
Jakob Stoklund Olesen a766b4746d Also verify RegUnit intervals at uses.
llvm-svn: 161147
2012-08-01 23:52:40 +00:00
Jakob Stoklund Olesen 2db6b65330 Compute instruction heights through a trace.
The height of an instruction is the minimum number of cycles from when the
instruction is issued to the end of the trace. Heights are computed for
all instructions in and below the trace center block.

The method for computing heights is different from the depth
computation. As we visit instructions in the trace bottom-up, heights of
used instructions are pushed upwards. This way, we avoid scanning long
use lists, looking for uses in the current trace.

At each basic block boundary, a list of live-in registers and their
minimum heights is saved in the trace block info. These live-in lists
are used when restarting depth computations on a trace that
converges with an already computed trace. They will also be used to
accurately compute the critical path length.

llvm-svn: 161138
2012-08-01 22:36:00 +00:00
Eric Christopher b1b9451337 Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing
failures in the debug testsuite and possibly PR13486.

llvm-svn: 161121
2012-08-01 18:19:01 +00:00
Jakob Stoklund Olesen 5e19d35e9a Add DataDep constructors. Explicitly check SSA form.
llvm-svn: 161115
2012-08-01 16:02:59 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Manman Ren f288d2f120 MachineSink: Sort the successors before trying to find SuccToSinkTo.
Use stable_sort instead of sort. Follow-up to r161062.

rdar://11980766

llvm-svn: 161075
2012-07-31 20:45:38 +00:00
Jakob Stoklund Olesen 059e647c6d Compute instruction depths through the current trace.
Assuming infinite issue width, compute the earliest each instruction in
the trace can issue, when considering the latency of data dependencies.
The issue cycle is recorded as a 'depth' from the beginning of the trace.

This is half the computation required to find the length of the critical
path through the trace. Heights are next.

llvm-svn: 161074
2012-07-31 20:44:38 +00:00
Jakob Stoklund Olesen 1dfb101835 Rename CT -> MTM. MachineTraceMetrics is abbreviated MTM.
llvm-svn: 161072
2012-07-31 20:25:13 +00:00
Manman Ren 8c549b586c MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766

llvm-svn: 161062
2012-07-31 18:10:39 +00:00
Micah Villmow b67d7a3a33 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00
Micah Villmow 6b12f596ef Don't generate ordered or unordered comparison operations if it is not legal to do so.
llvm-svn: 161053
2012-07-31 16:48:03 +00:00
Jakob Stoklund Olesen 0c807dfae2 Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

llvm-svn: 161023
2012-07-31 02:47:24 +00:00
Manman Ren 2b6a0dfd4c Reverse the order of the two branches at the end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar: 11393714

llvm-svn: 161018
2012-07-31 01:11:07 +00:00
Andrew Trick 79795897b3 Use the latest MachineRegisterInfo APIs. No functionality.
llvm-svn: 161010
2012-07-30 23:48:17 +00:00
Andrew Trick 535a23c38b Inline MachineRegisterInfo::hasOneUse
llvm-svn: 161007
2012-07-30 23:48:12 +00:00
Jakob Stoklund Olesen 68c2cd059e Avoid looking at stale data in verifyAnalysis().
llvm-svn: 161004
2012-07-30 23:15:12 +00:00
Jakob Stoklund Olesen c14cf57ba9 Allow traces to enter nested loops.
This lets traces include the final iteration of a nested loop above the
center block, and the first iteration of a nested loop below the center
block.

We still don't allow traces to contain backedges, and traces are
truncated where they would leave a loop, as seen from the center block.

llvm-svn: 161003
2012-07-30 23:15:10 +00:00
Jakob Stoklund Olesen 984cfe8322 Clarify invalidation strategy in comment.
llvm-svn: 160997
2012-07-30 21:16:22 +00:00
Jakob Stoklund Olesen f308c128ea Assert that all trace candidate blocks have been visited by the PO.
When computing a trace, all the candidates for pred/succ must have been
visited. Filter out back-edges first, though. The PO traversal ignores
them.

Thanks to Andy for spotting this in review.

llvm-svn: 160995
2012-07-30 21:10:27 +00:00
Jakob Stoklund Olesen a12a7d5f74 Hook into PassManager's analysis verification.
By overriding Pass::verifyAnalysis(), the pass contents will be verified
by the pass manager.

llvm-svn: 160994
2012-07-30 20:57:50 +00:00
Pete Cooper 91244268d7 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd.
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Jakob Stoklund Olesen 7361846f32 Add MachineInstr::isTransient().
This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.

Transient instructions are very unlikely to produce any code in the
final output. Either because they get eliminated by RegisterCoalescing,
or because they are pseudo-instructions like labels and debug values.

llvm-svn: 160977
2012-07-30 18:34:14 +00:00
Jakob Stoklund Olesen 3df6c46fdd Add MachineTraceMetrics::verify().
This function verifies the consistency of cached data in the
MachineTraceMetrics analysis.

llvm-svn: 160976
2012-07-30 18:34:11 +00:00
Jakob Stoklund Olesen eb488fe165 Verify that the CFG hasn't changed during invalidate().
The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.

llvm-svn: 160969
2012-07-30 17:36:49 +00:00
Jakob Stoklund Olesen fee94ca15b Add MachineBasicBlock::isPredecessor().
A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it is useful for verification code.

llvm-svn: 160968
2012-07-30 17:36:47 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Andrew Trick 940534371b Reenable a basic SSA DAG builder optimization.
Jakob fixed ProcessImplicitDefs in r159149.

llvm-svn: 160910
2012-07-28 01:48:15 +00:00
Jakob Stoklund Olesen 0563369755 Add more debug output to MachineTraceMetrics.
llvm-svn: 160905
2012-07-27 23:58:38 +00:00
Jakob Stoklund Olesen 1152202cc2 Keep track of the head and tail of the trace through each block.
This makes it possible to quickly detect blocks that are outside the
trace.

llvm-svn: 160904
2012-07-27 23:58:36 +00:00
Eric Christopher 86ca9f9e11 Add a DW_AT_high_pc for CUs that are a single address range. Update
all tests accordingly.

Fixes PR13351.

Patch by shinichiro hamaji!

llvm-svn: 160899
2012-07-27 22:00:05 +00:00
Jakob Stoklund Olesen 7dfe7abdee Also compute register mask lists under -new-live-intervals.
llvm-svn: 160898
2012-07-27 21:56:39 +00:00
Jakob Stoklund Olesen 97e14e02f1 Eliminate the IS_PHI_DEF flag and VNInfo::setIsPHIDef().
A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, a separate flag is not
necessary.

llvm-svn: 160893
2012-07-27 21:11:14 +00:00
Jakob Stoklund Olesen 4021a7bf25 Add a -new-live-intervals experimental option.
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.

llvm-svn: 160892
2012-07-27 20:58:46 +00:00
Jakob Stoklund Olesen bc65e8f94e Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Jakob Stoklund Olesen 35400b1dda Use an otherwise unused variable.
llvm-svn: 160798
2012-07-26 19:42:56 +00:00
Jakob Stoklund Olesen f9029fef2a Start scaffolding for a MachineTraceMetrics analysis pass.
This is still a work in progress.

Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.

The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:

- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.

These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.

Initially, this will be used by the early if-conversion pass.

llvm-svn: 160796
2012-07-26 18:38:11 +00:00
Dan Gohman 0b3d782933 Add a floor intrinsic.
llvm-svn: 160791
2012-07-26 17:43:27 +00:00
Manman Ren cc1dc6dc11 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.

llvm-svn: 160749
2012-07-25 18:28:13 +00:00
Jakob Stoklund Olesen cef9a618b1 Preserve 2-addr constraints in ConnectedVNInfoEqClasses.
When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:

  %vreg69<def> = ADD8ri %vreg68<undef>, 1

The use and def must get the same virtual register.

Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:

  %vreg69<def> = ADD8ri %vreg69<undef>, 1

This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.

llvm-svn: 160739
2012-07-25 17:15:15 +00:00
Jakob Stoklund Olesen c6fd3deee6 Verify two-address constraints more carefully.
Include <undef> operands and virtual registers after leaving SSA form.

llvm-svn: 160734
2012-07-25 16:49:11 +00:00
Craig Topper 17300940ae Change llvm_unreachable in SplitVectorOperand to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type.
llvm-svn: 160661
2012-07-24 04:11:21 +00:00
Sylvestre Ledru 35521e2310 Fix a typo (the the => the)
llvm-svn: 160621
2012-07-23 08:51:15 +00:00
Nadav Rotem 9056076cab Fixed DAGCombine optimizations which generate select_cc for targets
that do not support it (X86 does not lower select_cc).

PR: 13428

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160619
2012-07-23 07:59:50 +00:00
Craig Topper 2694c05e86 Tidy up. Fix indentation and remove trailing whitespace.
llvm-svn: 160617
2012-07-23 05:38:07 +00:00
Craig Topper b49546a3b3 Change llvm_unreachable in SplitVectorResult to report_fatal_error. Keeps release builds from crashing if code uses an intrinsic with an illegal type. For instance 256-bit AVX intrinsics without having AVX enabled.
llvm-svn: 160616
2012-07-23 04:34:49 +00:00
Benjamin Kramer 5be8f60126 Remove unused private member variables uncovered by the recent changes to clang's -Wunused-private-field.
llvm-svn: 160583
2012-07-20 22:05:57 +00:00
Jakob Stoklund Olesen e2cfd0d45a Avoid folding loads that are unsafe to move.
LiveRangeEdit::foldAsLoad() can eliminate a register by folding a load
into its only use. Only do that when the load is safe to move, and it
won't extend any live ranges.

This fixes PR13414.

llvm-svn: 160575
2012-07-20 21:29:31 +00:00
Jakob Stoklund Olesen f62c07f147 Split loop exiting edges more aggressively.
PHIElimination splits critical edges when it predicts it can resolve
interference and eliminate copies. It doesn't split the edge if the
interference wouldn't be resolved anyway because the phi-use register is
live in the critical edge anyway.

Teach PHIElimination to split loop exiting edges with interference, even
if it wouldn't resolve the interference. This removes the necessary
copies from the loop, which is still an improvement from injecting the
copies into the loop.

The test case demonstrates the improvement. Before:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  movl  %esi, %eax
  je  LBB0_1

After:

LBB0_1:
  cmpb  $0, (%rdx)
  leaq  1(%rdx), %rdx
  je  LBB0_1

  movl  %esi, %eax

llvm-svn: 160571
2012-07-20 20:49:53 +00:00
Pete Cooper dcf94db677 Fix crash in machine verifier when trying to print the def of a register which has no def
llvm-svn: 160531
2012-07-19 23:40:38 +00:00
Benjamin Kramer f364a63c3e Replace some explicit compare loops with std::equal.
No functionality change.

llvm-svn: 160501
2012-07-19 10:46:05 +00:00
Galina Kistanova aaf9735951 Fixed a few warnings.
llvm-svn: 160493
2012-07-19 04:50:12 +00:00
Bill Wendling d163405df8 Remove tabs.
llvm-svn: 160475
2012-07-19 00:04:14 +00:00
Chandler Carruth 985454e0ac Fix a somewhat nasty crasher in PR13378. This crashes inside of
LiveIntervals due to the two-addr pass generating bogus MI code.

The crux of the issue was a loop nesting problem. The intent of the code
which attempts to transform instructions before converting them to
two-addr form is to defer and reprocess any transformed instructions as
the second processing is likely to have more opportunities to coalesce
copies, etc. Unfortunately, there was one section of processing that was
not deferred -- the INSERT_SUBREG rewriting. Due to quirks of how this
rewriting proceeded, not only did it occur early, it removed the bits of
information needed for the deferred processing to correctly generate the
necessary two address form (specifically inserting a copy), but didn't
trigger any immediate assertions and produced what appeared to be
already valid two-address from code. Thus, the assertion only fired much
later in the pipeline.

The fix is to hoist the transformation logic up a layer to where it can
more firmly defer all further processing, and to teach the normal
processing to handle an edge case previously handled as part of the
transformation logic. This edge case (already matched tied register
operands) needs to *not* defer any steps.

As has been brought up repeatedly in the process: wow does this code
need refactoring. I *may* squeeze in some time to at least bring sanity
to this loop... but wow... =]

Thanks to Jakob for helpful hints on the way here, and the review.

llvm-svn: 160443
2012-07-18 18:58:22 +00:00
Nuno Lopes 2151497dca ignore 'invoke @llvm.donothing', but still keep the edge to the continuation BB
llvm-svn: 160411
2012-07-18 00:07:17 +00:00
Evan Cheng e6a3b03ee0 Back out r160101 and instead implement a dag combine to recover from the instcombine transformation.
llvm-svn: 160387
2012-07-17 18:54:11 +00:00
Jakob Stoklund Olesen 0ef031186c Add some trace output to TwoAddressInstructionPass.
llvm-svn: 160380
2012-07-17 17:57:23 +00:00
Benjamin Kramer 7c1598caaa Remove unused variable.
llvm-svn: 160372
2012-07-17 17:00:11 +00:00
Nadav Rotem 277a40bc0a Fix a crash in the legalization of large vectors.
When truncating a result of a vector that is split we need
to use the result of the split vector, and not re-split the dead node.

llvm-svn: 160357
2012-07-17 09:07:37 +00:00
Evan Cheng 780f9b5f92 Implement r160312 as a target-independent dag combine.
llvm-svn: 160354
2012-07-17 08:31:11 +00:00
Evan Cheng 47d7be9578 Make sure the constant bitwidth is <= 64 bits before calling getSExtValue().
llvm-svn: 160350
2012-07-17 07:47:50 +00:00
Evan Cheng f579beca6d This is another case where instcombine's demanded-bits optimization created
large immediates. Add dag combine logic to recover in case the large
immediate doesn't fit in the cmp immediate operand field.

int foo(unsigned long l) {
  return (l>> 47) == 1;
}

we produce

  %shr.mask = and i64 %l, -140737488355328
  %cmp = icmp eq i64 %shr.mask, 140737488355328
  %conv = zext i1 %cmp to i32
  ret i32 %conv

which codegens to

movq    $0xffff800000000000,%rax
andq    %rdi,%rax
movq    $0x0000800000000000,%rcx
cmpq    %rcx,%rax
sete    %al
movzbl    %al,%eax
ret

TargetLowering::SimplifySetCC would transform
(X & -256) == 256 -> (X >> 8) == 1
if the immediate fails the isLegalICmpImmediate() test. For x86,
that means immediates that do not fit in a signed 32-bit immediate.

Based on a patch by Eli Friedman.

PR10328
rdar://9758774
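
A standalone C++ sketch (not part of the patch) of why the two predicates are interchangeable, which is what lets the combine pick whichever form has a legal cmp immediate:

  #include <cassert>
  #include <cstdint>

  // Mask-and-compare form: both immediates need 64 bits on x86.
  static bool viaMask(uint64_t l) {
    return (l & 0xFFFF800000000000ULL) == 0x0000800000000000ULL;
  }

  // Shift-and-compare form: the compare immediate is just 1.
  static bool viaShift(uint64_t l) {
    return (l >> 47) == 1;
  }

  int main() {
    const uint64_t Tests[] = {0, ~0ULL, 0x0000800000000000ULL, 0x0000FFFFFFFFFFFFULL};
    for (uint64_t l : Tests)
      assert(viaMask(l) == viaShift(l));
  }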

llvm-svn: 160346
2012-07-17 06:53:39 +00:00
Nadav Rotem 60f7904db7 Minor cleanup and docs.
llvm-svn: 160311
2012-07-16 18:56:39 +00:00
Nadav Rotem 839a06e9d7 Make ComputeDemandedBits return a deterministic result when computing an AssertZext value.
In the added testcase the constant 55 was behind an AssertZext of type i1, and ComputeDemandedBits
reported that some of the bits were both known to be one and known to be zero.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160305
2012-07-16 18:34:53 +00:00
Nadav Rotem 3050e07108 Fix a bug in the scalarization of BUILD_VECTOR. BUILD_VECTOR elements may be wider than the output element type. Make sure to trunc them if needed.
Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160235
2012-07-15 20:39:08 +00:00
Nadav Rotem a62368c965 Refactor the code that checks that all operands of a node are UNDEFs.
Add a micro-optimization to getNode of CONCAT_VECTORS when both operands are undefs.
Can't find a testcase for this because VECTOR_SHUFFLE already handles undef operands, but Duncan suggested that we add this.

Together with Michael Kuperstein <michael.m.kuperstein@intel.com>

llvm-svn: 160229
2012-07-15 08:38:23 +00:00
Chandler Carruth db5536f09d Reapply r160194, switching to use LV information for finding local kills.
The notable fix is to look at any dependencies attached to the kill
instruction (or other instructions between MI and the kill) where the
dependencies are specific to the register in question.

The old code implicitly handled this by rejecting the transform if *any*
other uses were found within the block, but after the start point. The
new code directly finds the kill, and has to re-use the existing
dependency scan to check for non-kill uses.

This was caught by self-host, but I found the bug via inspection and use
of absurd assert scaffolding to compute the kills in two ways and
compare them. So I have no useful testcase for this other than
"bootstrap". I'd work harder to reduce a test case if this particular
code were likely to live for a long time.

Thanks to Benjamin Kramer for reviewing the fix itself.

llvm-svn: 160228
2012-07-15 03:29:46 +00:00
Nadav Rotem 018921002e Add a dagcombine optimization to convert concat_vectors of undefs into a single undef.
The unoptimized concat_vectors node prevented the canonicalization of the vector_shuffle node.

llvm-svn: 160221
2012-07-14 21:30:27 +00:00
Jakob Stoklund Olesen 8f324a2cc8 Account for early-clobber reload instructions.
No test case, there are no in-tree targets that require this.

llvm-svn: 160219
2012-07-14 18:45:35 +00:00
Jakob Stoklund Olesen 3d604ab933 Be more verbose when detecting dominance problems.
Catch uses of undefined physregs that haven't been added to basic block
live-in lists. Run the verifier to pinpoint the problem.

Also run the verifier when a virtual register use is not jointly
dominated by defs.

llvm-svn: 160207
2012-07-13 23:39:05 +00:00
Chandler Carruth 9c97cd5672 Revert r160194, which switched to use LV information for finding local
kills.

This is causing miscompiles that I'm working on tracking down.

llvm-svn: 160196
2012-07-13 22:23:32 +00:00
Chandler Carruth 58c470dc68 Use the LiveVariables information to efficiently get local kills. This
removes the largest scaling problem in the test cases from PR13225 when
ASan is switched to insert basic blocks in the natural CFG order.

It may also solve some scaling problems for more normal code with large
numbers of basic blocks and variables.

llvm-svn: 160194
2012-07-13 21:18:38 +00:00
Jim Grosbach 1af8c8060c Provide function name in 'Cannot select' fatal error.
When dumping the DAG for a fatal 'Cannot select' back-end error, also
provide the name of the function the construct is in. Useful when dealing
with large testcases, as the next step is to llvm-extract the function
in question to get a small(er) testcase.

llvm-svn: 160152
2012-07-13 00:29:09 +00:00
Eric Christopher bf57091f8b The end of the prologue should be marked with is_stmt.
Fixes PR13303.

Patch by Paul Robinson!

llvm-svn: 160148
2012-07-12 23:30:25 +00:00
Duncan Sands 671cc2575d The result type of EXTRACT_VECTOR_ELT doesn't have to match the element type of
the input vector; it can be bigger (this is helpful for PowerPC, where <2 x i16>
is a legal vector type but i16 isn't a legal type, IIRC).  However this wasn't
being taken into account by ExpandRes_EXTRACT_VECTOR_ELT, causing PR13220.
Lightly tweaked version of a patch by Michael Liao.

llvm-svn: 160116
2012-07-12 09:01:35 +00:00
Evan Cheng b17122859b InstrEmitter::EmitSubregNode() optimize extract_subreg in this case:
r1025 = s/zext r1024, 4
r1026 = extract_subreg r1025, 4

to a copy:
r1026 = copy r1024

This is correct. However, it uses TII->isCoalescableExtInstr(), which can return
true for instructions that essentially do a sext_in_reg, so this can end up
with an illegal copy where the source and destination register classes do not
match. Add a check to avoid it. Sorry, no test case possible at this time.

rdar://11849816

llvm-svn: 160059
2012-07-11 18:55:07 +00:00
Nadav Rotem 2a148668b6 Rename many of the Tmp1, Tmp2, Tmp3 variables to names such as Chain, Value, Ptr, etc.
No functionality change.

llvm-svn: 160042
2012-07-11 11:02:16 +00:00
Benjamin Kramer 9488100d46 Remove unused variable.
llvm-svn: 160040
2012-07-11 09:39:04 +00:00
Nadav Rotem de6fd282ef Refactor the DAG Legalizer by extracting the legalization of
Load and Store nodes into their own functions.
No functional change.

llvm-svn: 160037
2012-07-11 08:52:09 +00:00
Owen Anderson b8844d6744 Only apply the SETCC+SITOFP -> SELECTCC optimization when the SETCC returns an MVT::i1, i.e. before type legalization.
This is a speculative fix for a problem on Mips reported by Akira Hatanaka.

llvm-svn: 160036
2012-07-11 06:38:55 +00:00
Jakob Stoklund Olesen bc90a4ea82 Require and preserve LoopInfo for early if-conversion.
It will surely be needed by heuristics.

llvm-svn: 160027
2012-07-10 22:39:56 +00:00
Chandler Carruth 2207f76cd4 Teach the LiveInterval::join function to use the fast merge algorithm,
generalizing its implementation sufficiently to support this value
number scenario as well.

This cuts out another significant performance hit in large functions
(over 10k basic blocks, etc), especially those with "natural" CFG
structures.

llvm-svn: 160026
2012-07-10 22:25:21 +00:00
Jakob Stoklund Olesen 02638392c1 Run early if-conversion in domtree post-order.
This ordering allows nested if-conversion without using a work list, and
it makes it possible to update the dominator tree on the fly as well.

Any erased basic blocks will always be dominated by the current
post-order position, so the domtree can be pruned without invalidating
the iterator.

llvm-svn: 160025
2012-07-10 22:18:23 +00:00
Chandler Carruth 77d940011d Fix a bug where I didn't test for an empty range before inspecting the
back of it.

I don't have anything even remotely close to a test case for this. It
only broke two build bots, both of them doing bootstrap builds, one of
them a dragonegg bootstrap. It doesn't break for me when I bootstrap
either. It doesn't reproduce every time or on many machines during the
bootstrap. Many thanks to Duncan Sands who got the exact command (and
stage of the bootstrap) which failed on the dragonegg bootstrap and
managed to get it to trigger under valgrind with debug symbols. The fix
was then found by inspection.

llvm-svn: 159993
2012-07-10 15:41:33 +00:00
Nadav Rotem d908ddc186 Improve the loading of load-anyext vectors by allowing the codegen to load
multiple scalars and insert them into a vector. Next, we shuffle the elements
into the correct places, as before.
Also fix a small dagcombine bug in SimplifyBinOpWithSameOpcodeHands, when the
migration of bitcasts happened too late in the SelectionDAG process.

llvm-svn: 159991
2012-07-10 13:25:08 +00:00
Chandler Carruth e18614dd17 Add an efficient merge operation to LiveInterval and use it to avoid
quadratic behavior when performing pathological merges. Fixes the core
element of PR12652.

There is only one user of addRangeFrom left: join. I'm hoping to
refactor further in a future patch and have join use this merge
operation as well.
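
The rough idea, as a self-contained sketch (simplified ranges, not the LiveInterval data structure itself): walk both sorted range lists once and coalesce on the fly, instead of inserting ranges one at a time into the destination.

  #include <algorithm>
  #include <cassert>
  #include <utility>
  #include <vector>

  using Range = std::pair<int, int>; // [start, end), kept sorted by start

  // Single linear pass over both lists; inserting each range individually
  // (addRangeFrom-style) is what leads to quadratic behavior.
  static std::vector<Range> mergeSorted(const std::vector<Range> &A,
                                        const std::vector<Range> &B) {
    std::vector<Range> Out;
    size_t I = 0, J = 0;
    while (I < A.size() || J < B.size()) {
      bool TakeA = J == B.size() || (I < A.size() && A[I].first < B[J].first);
      Range R = TakeA ? A[I++] : B[J++];
      if (!Out.empty() && Out.back().second >= R.first)
        Out.back().second = std::max(Out.back().second, R.second);
      else
        Out.push_back(R);
    }
    return Out;
  }

  int main() {
    std::vector<Range> A{{0, 4}, {10, 12}}, B{{3, 6}, {20, 25}};
    assert((mergeSorted(A, B) == std::vector<Range>{{0, 6}, {10, 12}, {20, 25}}));
  }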

llvm-svn: 159982
2012-07-10 05:16:17 +00:00
Chandler Carruth ac766b9b42 Teach LiveIntervals how to verify themselves and start using it in some
of the tricky merge routines. This adds a layer of testing that was
necessary when implementing more efficient (and complex) merge logic for
this datastructure.

No functionality changed here.

llvm-svn: 159981
2012-07-10 05:06:03 +00:00
Andrew Trick c50f06487c indentation
llvm-svn: 159958
2012-07-09 20:43:01 +00:00
Owen Anderson d4b841f8f9 Teach the DAG combiner to turn sitofp/uitofp from i1 into a conditional move, since there are only two possible values.
Previously, this would become an integer extension operation, followed by a real integer->float conversion.
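
In source-level terms (a hedged C++ analogy, not the DAG combine itself), an i1 input means only two results are possible, so the conversion collapses to a select between two constants:

  #include <cassert>

  // Old lowering: extend the boolean to an integer, then do a real int->fp conversion.
  static float viaExtend(bool B) { return static_cast<float>(static_cast<unsigned>(B)); }

  // New lowering: conditionally move one of the two possible constants.
  // (For a *signed* i1 the constants would be 0.0 and -1.0 instead.)
  static float viaSelect(bool B) { return B ? 1.0f : 0.0f; }

  int main() {
    assert(viaExtend(false) == viaSelect(false));
    assert(viaExtend(true) == viaSelect(true));
  }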

llvm-svn: 159957
2012-07-09 20:31:12 +00:00
Andrew Trick 87255e340e I'm introducing a new machine model to simultaneously allow simple
subtarget CPU descriptions and support new features of
MachineScheduler.

MachineModel has three categories of data:
1) Basic properties for coarse grained instruction cost model.
2) Scheduler Read/Write resources for simple per-opcode and operand cost model (TBD).
3) Instruction itineraries for detailed per-cycle reservation tables.

These will all live side-by-side. Any subtarget can use any
combination of them. Instruction itineraries will not change in the
near term. In the long run, I expect them to only be relevant for
in-order VLIW machines that have complex constraints and require a
precise scheduling/bundling model. Once itineraries are only actively
used by VLIW-ish targets, they could be replaced by something more
appropriate for those targets.

This tablegen backend rewrite sets things up for introducing
MachineModel type #2: per opcode/operand cost model.

llvm-svn: 159891
2012-07-07 04:00:00 +00:00
Chad Rosier 879c34f45a Whitespace.
llvm-svn: 159839
2012-07-06 17:44:22 +00:00
Chad Rosier 88d53eae56 [fast-isel] Tell fast-isel to do nothing with the new donothing intrinsic.
llvm-svn: 159837
2012-07-06 17:33:39 +00:00
Alexey Samsonov 39602781f6 Fix PR13202 and a regtest.
DwarfDebug class could generate the same (inlined) DIVariable twice:
1) when trying to find abstract debug variable for a concrete inlined instance.
2) when explicitly collecting info for variables that were optimized out.

This change makes sure that this duplication won't happen and makes
Clang pass "gdb.opt/inline-locals" test from gdb testsuite.

Reviewed by Eric Christopher.

llvm-svn: 159811
2012-07-06 08:45:08 +00:00
Jakob Stoklund Olesen 3f1bb93cab Add some comments suggested in code review.
llvm-svn: 159800
2012-07-06 02:31:22 +00:00
Chandler Carruth 1088676476 Optimize extendIntervalEndTo a tiny bit by saving one call through the
vector erase. No functionality changed.

llvm-svn: 159746
2012-07-05 12:40:45 +00:00
Chandler Carruth 264854f9a0 Finish fixing the MachineOperand hashing, providing a nice modern
hash_value overload for MachineOperands. This addresses a FIXME
sufficient for me to remove it, and cleans up the code nicely too.

The important changes to the hashing logic:
- TargetFlags are now included in all of the hashes. These were completely
  missed.
- Register operands have their subregisters and whether they are a def
  included in the hash.
- We now actually hash all of the operand types. Previously, many
  operand types were simply *dropped on the floor*. For example:
  - Floating point immediates
  - Large integer immediates (>64-bit)
  - External globals!
  - Register masks
  - Metadata operands
- It removes the offset from the block-address hash; I'm a bit
  suspicious of this, but isIdenticalTo doesn't consider the offset for
  block addresses.

Any patterns involving these entities could have triggered extreme
slowdowns in MachineCSE or PHIElimination. Let me know if there are PRs
you think might be closed now... I'm looking myself, but I may miss
them.
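
A simplified, hypothetical stand-in (not the actual hash_value overload) showing the underlying requirement: every field that isIdenticalTo() compares has to feed the hash, or distinct operands collide and the CSE hash tables degenerate.

  #include <cstddef>
  #include <cstdio>
  #include <functional>

  // Hypothetical, pared-down register operand.
  struct RegOperand {
    unsigned Reg;
    unsigned SubReg;
    bool IsDef;
    unsigned TargetFlags;
  };

  // Simple hash combiner (illustrative; LLVM uses its own hashing utilities).
  static std::size_t combine(std::size_t Seed, std::size_t V) {
    return Seed ^ (V + 0x9e3779b97f4a7c15ULL + (Seed << 6) + (Seed >> 2));
  }

  static std::size_t hashValue(const RegOperand &MO) {
    std::size_t H = std::hash<unsigned>()(MO.Reg);
    H = combine(H, MO.SubReg);      // previously-missing fields like these
    H = combine(H, MO.IsDef);       // must be folded in so that operands
    H = combine(H, MO.TargetFlags); // distinct to isIdenticalTo() hash apart
    return H;
  }

  int main() {
    RegOperand A{5, 0, true, 0}, B{5, 3, true, 0};
    std::printf("different subregs hash apart: %d\n", hashValue(A) != hashValue(B));
  }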

llvm-svn: 159743
2012-07-05 11:06:22 +00:00
Duncan Sands 71dacd09fe All cases are covered, no need for a default. This deals with the
corresponding clang warning.
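
For illustration only (a generic example, not the code this commit touches): when a switch covers every enumerator, dropping the default silences clang's warning about an unreachable default, and the compiler can then flag any enumerator added later without a case.

  enum Kind { Scalar, Vector };

  static int numElements(Kind K) {
    switch (K) {
    case Scalar: return 1;
    case Vector: return 4;
    }
    // No default: all cases are covered. A new Kind without a matching case
    // now triggers a compiler warning instead of silently hitting a default.
    return 0; // unreachable, but keeps "control reaches end" warnings quiet
  }

  int main() { return numElements(Scalar) - 1; }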

llvm-svn: 159742
2012-07-05 10:14:33 +00:00
Chandler Carruth 1d5d23106e The hash function for MI expressions, used by MachineCSE, is really
broken. This patch fixes the superficial problems which lead to the
intractably slow compile times reported in PR13225.

The specific issue is that we were failing to include the *offset* of
a global variable in the hash code. Oops. This would in turn cause all
MIs which were only distinguishable due to operating on different
offsets of a global variable to produce identical hash functions. In
some of the test cases attached to the PR I saw hash table activity
where there were O(1000) probes-per-lookup *on average*. A very few
entries were responsible for most of these probes.

There is still quite a bit more to do here. The ad-hoc layering of data
in MachineOperands makes them *extremely* brittle to hash correctly.
We're missing quite a few other cases, the only ones I've fixed here are
the specific MO types which were allowed through the assert() in
getOffset().

llvm-svn: 159741
2012-07-05 10:03:57 +00:00
Duncan Sands 0552a2cad2 Use the right kind of booleans: we were emitting 0/1 booleans, instead of 0/-1
booleans.  Patch by James Benton.
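
A small standalone sketch (illustrative only) of why vector compares want all-ones rather than 1 for "true": the lane result can then be used directly as a bit mask, for example in a bitwise select.

  #include <cassert>
  #include <cstdint>

  // One lane of a vector select done with bitwise operations.
  static int32_t bitwiseSelect(int32_t Mask, int32_t A, int32_t B) {
    return (Mask & A) | (~Mask & B);
  }

  int main() {
    const int32_t True = -1; // all bits set: selects A in every bit position
    const int32_t False = 0; // no bits set: selects B in every bit position
    assert(bitwiseSelect(True, 7, 9) == 7);
    assert(bitwiseSelect(False, 7, 9) == 9);
    assert(bitwiseSelect(1, 7, 9) != 7); // a 0/1 boolean only masks the low bit
  }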

llvm-svn: 159739
2012-07-05 09:32:46 +00:00
Nick Lewycky 765c699370 Remove ParentMap. You can just ask the domnode for its parent. No functionality
change.

Move the "Not profitable, avoid CSE!" debug message next to where we fail the
check for profitability and use a different message for avoiding CSE due to
being in different register classes.

llvm-svn: 159729
2012-07-05 06:19:21 +00:00
Jakob Stoklund Olesen c300ef0e50 Allow trailing physreg RegisterSDNode operands on non-variadic instructions.
Also allow trailing register mask operands on both non-variadic
MachineSDNodes and MachineInstrs.

The extra physreg RegisterSDNode operands are added to the MI as
<imp-use> operands. This makes it possible to have non-variadic call
instructions.

Call and return instructions really are non-variadic, the argument
registers should only be used implicitly - they are not part of the
encoding.

llvm-svn: 159727
2012-07-04 23:53:23 +00:00
Jakob Stoklund Olesen adb50a7a09 Print SlotIndexes when available for -print-machineinstrs.
llvm-svn: 159726
2012-07-04 23:53:19 +00:00
Jakob Stoklund Olesen 2d827d628e Allow multiple terminators to read virtual registers.
Find the kill as the last terminator to read SrcReg.

Patch by Philipp Brüschweiler!

llvm-svn: 159722
2012-07-04 19:52:05 +00:00
Jakob Stoklund Olesen 29506f5e6d Make sure -print-machineinstrs applies to the first pass as well.
llvm-svn: 159720
2012-07-04 19:28:27 +00:00
Stepan Dyatkovskiy 7ff588f986 Reverted r156659 due to probable performance regressions; DenseMap should be used here:
IntegersSubsetMapping
  - Replaced the type of the Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement
    if possible.

llvm-svn: 159703
2012-07-04 05:53:05 +00:00
Eric Christopher ef9d710ea6 Reduce some code duplication.
llvm-svn: 159701
2012-07-04 02:02:18 +00:00
Matt Beaumont-Gay 11d08b2e22 Fix some ascii art in a comment to not have trailing backslashes (inspiration
from IfConversion.cc), and fix some spelling and grammar in the surrounding
prose.

llvm-svn: 159699
2012-07-04 01:09:45 +00:00
Jakob Stoklund Olesen f8a63a1507 Add an experimental early if-conversion pass, off by default.
This pass performs if-conversion on SSA form machine code by
speculatively executing both sides of the branch and using a cmov
instruction to select the result. This can help lower the number of
branch mispredictions on architectures like x86 that don't have
predicable instructions.

The current implementation is very aggressive, and causes regressions on
most tests. It needs good heuristics that have yet to be implemented.
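
In source-level terms (an illustrative C++ sketch, not the pass itself), the transformation trades a possibly mispredicted branch for unconditional evaluation of both sides plus a conditional move:

  #include <cassert>

  // Branchy form: only one side is evaluated, but the branch can mispredict.
  static int branchy(bool C, int A, int B) {
    if (C)
      return A + 1;
    return B - 1;
  }

  // If-converted form: both sides are computed speculatively and a
  // cmov-like select picks the result. This is only profitable when both
  // sides are cheap and side-effect free, which is what the missing
  // heuristics need to decide.
  static int ifConverted(bool C, int A, int B) {
    int T = A + 1;
    int F = B - 1;
    return C ? T : F; // typically lowered to a conditional move on x86
  }

  int main() {
    assert(branchy(true, 3, 5) == ifConverted(true, 3, 5));
    assert(branchy(false, 3, 5) == ifConverted(false, 3, 5));
  }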

llvm-svn: 159694
2012-07-04 00:09:54 +00:00
Stepan Dyatkovskiy 8b0c97e0dd Part of r159527. Split into a series of patches and relanded with PR13256 fixed:
IntegersSubsetMapping
  - Replaced the type of the Items field from std::list with std::map. In the near future I'll test it with DenseMap and do the corresponding replacement
    if possible.

llvm-svn: 159659
2012-07-03 13:46:45 +00:00
Eric Christopher b65acc61a5 Revert "IntRange:" as it appears to be breaking self hosting.
This reverts commit b2833d9dcba88c6f0520cad760619200adc0442c.

llvm-svn: 159618
2012-07-02 23:22:21 +00:00
Chandler Carruth 34263a0c95 All glory to address sanitizer. ;]
It appears to have caught a use-after-free introduced by r159567
and/or friends, which call 'addPass' from many more places. The bug in
'addPass' doesn't appear to be new, and was spotted by inspection when
ASan shone a bright light of a stacktrace at these functions.

Hopefully this will fix the ASan failure -- I have no test case other
than running an ASan-built clang over the test suite.

llvm-svn: 159614
2012-07-02 22:56:41 +00:00
Evan Cheng 39e90029a2 Target option DisableJumpTables is a gross hack. Move it to TargetLowering instead.
llvm-svn: 159611
2012-07-02 22:39:56 +00:00
Andrew Trick 2f26b34806 misched: allow NULL InstrItineraries.
llvm-svn: 159599
2012-07-02 21:55:12 +00:00
Eric Christopher dd8638fb3e Turn an assert into an error to make it a bit more friendly.
Part of rdar://6880388 and rdar://11766377

llvm-svn: 159590
2012-07-02 21:16:43 +00:00
Bob Wilson cac3b90633 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Bob Wilson a3f9fa710a Move assertion with TargetPassConfig's Initialized flag.
llvm-svn: 159569
2012-07-02 19:48:39 +00:00
Bob Wilson b9b693650a Consistently use AnalysisID types in TargetPassConfig.
This makes it possible to just use a zero value to represent "no pass", so
the phony NoPassID global variable is no longer needed.

llvm-svn: 159568
2012-07-02 19:48:37 +00:00
Bob Wilson bbd38dd9c0 Add all codegen passes to the PassManager via TargetPassConfig.
This is a preliminary step toward having TargetPassConfig be able to
start and stop the compilation at specified passes for unit testing
and debugging.  No functionality change.

llvm-svn: 159567
2012-07-02 19:48:31 +00:00
Manman Ren 72098b2c91 Added assertion in getVRegDef of MachineRegisterInfo to make sure the virtual
register does not have multiple definitions. Modified TwoAddressInstructionPass
to use getUniqueVRegDef instead of getVRegDef.

llvm-svn: 159545
2012-07-02 18:55:36 +00:00
Andrew Trick f161e391f8 Reapply "Make NumMicroOps a variable in the subtarget's instruction itinerary."
Reapplies r159406 with minor cleanup. The regressions appear to have been spurious.

llvm-svn: 159541
2012-07-02 18:10:42 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ranges compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Rafael Espindola a77d31d7fd Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Rafael Espindola efab16d43b Handle implicit_defs in the register coalescer. I am still trying to produce
a reduced testcase, but this fixes pr13209.

llvm-svn: 159479
2012-06-30 01:45:55 +00:00
Manman Ren 6fa76dc0e0 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Jakob Stoklund Olesen da9ea1d6bc Check for extra kill flags on live-out virtual registers.
This would previously get reported as the misleading "Virtual register
def doesn't dominate all uses."

llvm-svn: 159460
2012-06-29 21:00:00 +00:00
Manman Ren c146589aa4 Add getUniqueVRegDef to MachineRegisterInfo.
This comes in handy during peephole optimization.

llvm-svn: 159453
2012-06-29 19:16:05 +00:00
Alexey Samsonov 6e7e6b646b Cleanup in DwarfDebug - fix a typo and remove two unused functions
llvm-svn: 159433
2012-06-29 16:04:14 +00:00
Chandler Carruth aafe0918bc Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.

I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.

I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.

Thanks to Bill and Eric for giving the green light for this bit of cleanup.

llvm-svn: 159421
2012-06-29 12:38:19 +00:00
Bill Wendling f799efdedc The DIBuilder class is just a wrapper around debug info creation
(a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore
instead.

llvm-svn: 159414
2012-06-29 08:32:07 +00:00
Andrew Trick 51a8cf77b8 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Andrew Trick 8c9e6728b3 misched: avoid scheduling instructions that can't be dispatched.
llvm-svn: 159408
2012-06-29 03:23:24 +00:00
Andrew Trick ce27bb999d misched: count micro-ops toward the issue limit.
llvm-svn: 159407
2012-06-29 03:23:22 +00:00
Andrew Trick 1f50152b2d Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Jim Grosbach e0c10d8b86 'Promote' vector [su]int_to_fp should widen elements.
Teach vector legalization how to honor Promote for int to float
conversions. The code checking whether to promote the operation knew
to look at the operand, but the actual promotion code didn't. This
fixes that. The operand is promoted up via [zs]ext.

rdar://11762659
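
A scalar C++ analogy (illustrative, not the legalizer code): widening the integer element before the conversion has to respect signedness, i.e. sign extension for the signed conversion and zero extension for the unsigned one.

  #include <cassert>
  #include <cstdint>

  int main() {
    int16_t S = -3;
    uint16_t U = 0xFFFD; // same bit pattern as S

    // sint_to_fp: widen with sign extension, then convert.
    float FS = static_cast<float>(static_cast<int32_t>(S));  // -3.0f
    // uint_to_fp: widen with zero extension, then convert.
    float FU = static_cast<float>(static_cast<uint32_t>(U)); // 65533.0f

    assert(FS == -3.0f);
    assert(FU == 65533.0f);
  }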

llvm-svn: 159378
2012-06-28 21:03:44 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Jakob Stoklund Olesen 59a0d3243b Allow targets to inject passes before the virtual register rewriter.
Such passes can be used to tweak the register assignments in a
target-dependent way, for example to avoid write-after-write
dependencies.

llvm-svn: 159209
2012-06-26 17:09:29 +00:00
Chandler Carruth 9139f44d23 Update a bunch of stale comments that dated from when this followed the
very first (and worst) placement algorithm. These should now more
accurately reflect the reality of the pass.

llvm-svn: 159185
2012-06-26 05:16:37 +00:00
Andrew Trick fb2ba3e1cb Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Evan Cheng 4c6f917d34 Make sure the type is not extended or untyped before creating a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
Jakob Stoklund Olesen a57fc12ec9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instructions in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen eb49566447 Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Jakob Stoklund Olesen 70ed924e18 Teach PHIElimination to handle <undef> operands.
When a PHI use is <undef>, don't emit a copy in the predecessor block,
but insert an IMPLICIT_DEF instruction instead. This ensures that
virtual register uses are always jointly dominated by defs, even if some
of them are IMPLICIT_DEF.

llvm-svn: 159121
2012-06-25 03:36:12 +00:00
Jakob Stoklund Olesen 6b556f824d Handle <undef> operands in TwoAddressInstructionPass.
When the source register to a 2-addr instruction is undefined, there is
no need to attempt any transformations - simply replace the source
register with the destination register.

This also comes up when lowering IMPLICIT_DEF instructions - make sure
the <undef> flag is moved to the new partial register def operand:

  %vreg8<def> = INSERT_SUBREG %vreg9<undef>, %vreg0<kill>, sub_16bit
rewrite undef:
  %vreg8<def> = INSERT_SUBREG %vreg8<undef>, %vreg0<kill>, sub_16bit
convert to:
  %vreg8:sub_16bit<def,read-undef> = COPY %vreg0<kill>

llvm-svn: 159120
2012-06-25 03:27:12 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Pete Cooper fe212e762f DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Jakob Stoklund Olesen 502e4c6ac4 Teach LiveVariables to handle <undef> operands.
It's simple: Don't treat <undef> operands as uses, and don't assume a
virtual register has a defining instruction unless a real use has been
seen.

llvm-svn: 159061
2012-06-23 02:23:00 +00:00
Jakob Stoklund Olesen a127fc780a Remove ProcessImplicitDefs.h which was unused.
The ProcessImplicitDefs class can be local to its implementation file.

llvm-svn: 159041
2012-06-22 22:27:36 +00:00
Jakob Stoklund Olesen b033dede17 Also verify the def index for early clobbers.
llvm-svn: 159039
2012-06-22 22:23:58 +00:00
Jakob Stoklund Olesen 4fa84ba8b9 Delete a boring statistic.
llvm-svn: 159030
2012-06-22 20:40:15 +00:00
Jakob Stoklund Olesen c61edda0ab Store live intervals in an IndexedMap.
It is both smaller and faster than DenseMap.
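
As a rough illustration (a hypothetical container, not LLVM's actual IndexedMap API): when the keys are small dense integers such as register numbers, a plain vector indexed by key needs no hashing and no per-entry bucket overhead.

  #include <cassert>
  #include <vector>

  // Minimal vector-backed map keyed by a dense unsigned index.
  template <typename T> class DenseIndexedMap {
    std::vector<T> Storage;

  public:
    T &operator[](unsigned Key) {
      if (Key >= Storage.size())
        Storage.resize(Key + 1); // grow on demand; keys are expected to be dense
      return Storage[Key];
    }
  };

  int main() {
    DenseIndexedMap<int> IntervalsByVReg;
    IntervalsByVReg[0] = 7;
    IntervalsByVReg[3] = 9;
    assert(IntervalsByVReg[0] == 7 && IntervalsByVReg[3] == 9);
  }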

llvm-svn: 159029
2012-06-22 20:37:52 +00:00
Hal Finkel 8db5547252 Revert r158679 - use case is unclear (and it increases the memory footprint).
Original commit message:
    Allow up to 64 functional units per processor itinerary.

    This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
    This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 159027
2012-06-22 20:27:13 +00:00
Jakob Stoklund Olesen 48828bb402 Fix a crash in --debug code.
Don't try to print out the live range of a physreg.

llvm-svn: 159021
2012-06-22 19:51:41 +00:00
Jakob Stoklund Olesen 48a1647c93 Don't depend on live ranges being present.
DBG_VALUE instructions could be referring to non-existing virtual
registers.

llvm-svn: 159020
2012-06-22 18:51:35 +00:00