Commit Graph

56134 Commits

Author SHA1 Message Date
Richard Smith 24f09cc82d Reduce alignment of SmallVector<T> to the required amount, rather than forcing 16-byte alignment. This fixes misaligned SmallVector accesses via ExtractValueInst's SmallVector data member.
llvm-svn: 162331
2012-08-22 00:11:07 +00:00
Chad Rosier 7fb0cd26f7 Add a few functions to TargetLibraryInfo as part of PR13574.
Patch by Weiming Zhao <weimingz@codeaurora.org>.

llvm-svn: 162329
2012-08-21 23:28:56 +00:00
Richard Smith 976f8605e2 MaximumSpanningTree::EdgeWeightCompare: Make this comparator actually be a
strict weak ordering, and don't pass possibly-null pointers to dyn_cast.

llvm-svn: 162314
2012-08-21 21:03:40 +00:00
Richard Smith 13473857a7 Fix unaligned memory accesses when performing relocations in X86 JIT. There's
no cost to using memcpy here: the fixed code is optimized by LLVM to perfect
machine code.

llvm-svn: 162311
2012-08-21 20:48:36 +00:00
Richard Smith ad9c8e839e Don't bind a reference to a dereferenced null pointer (for return value of WeakVH::operator*).
llvm-svn: 162309
2012-08-21 20:35:14 +00:00
Chad Rosier 3d4bc62a5c [ms-inline asm] Do not report a Parser error when matching inline assembly.
llvm-svn: 162306
2012-08-21 19:36:59 +00:00
David Blaikie 9c7226b456 Remove unnecessary cast that was also unnecessarily casting away constness.
Even looking at the revision history I couldn't quite piece together why this
cast was ever written in the first place, but I assume it was because of some
change in the inheritance, perhaps this function was reimplemented in a
derived type & this caller was meant to get the base version (& it wasn't
virtual)?

llvm-svn: 162301
2012-08-21 18:54:23 +00:00
Rafael Espindola 2c06448360 Fix macros arguments with an underscore, dot or dollar in them. This is based
on a patch by Andy/PaX. I added the support for dot and dollar.

llvm-svn: 162298
2012-08-21 18:29:30 +00:00
Chad Rosier 79e766c38e [ms-inline asm] Expose the ErrorInfo from the MatchInstructionImpl. In general,
this is the index of the operand that failed to match.

Note: This may cause a buildbot failure due to an API mismatch in clang.  Should
recover with my next commit to clang.

llvm-svn: 162295
2012-08-21 18:14:59 +00:00
Rafael Espindola af6da83a2c Make the wording in of the "expected identifier" error in the .macro directive
consistent with the other "expected identifier" errors.
Extracted from the Andy/PaX patch. I added the test.

llvm-svn: 162291
2012-08-21 17:12:05 +00:00
Chad Rosier d269bd8c24 Add support for the --param ssp-buffer-size= driver option.
PR9673

llvm-svn: 162284
2012-08-21 16:15:24 +00:00
Rafael Espindola d2dc2a7af3 Use typedefs. Fix indentation. Extracted from the Andy/PaX patch.
llvm-svn: 162283
2012-08-21 16:06:48 +00:00
Rafael Espindola 5535863043 Remove unused variable. Extracted from the Andy/PaX patch.
llvm-svn: 162282
2012-08-21 16:01:14 +00:00
Rafael Espindola 3e5eb4263a Fix typo. Extracted from the Andy/PaX patch.
llvm-svn: 162281
2012-08-21 15:55:04 +00:00
Jim Grosbach bea6753f4f MCJIT: Tidy up the constructor.
The MCJIT doesn't need or want a TargetJITInfo. That's vestigal from the old
JIT, so just remove it.

rdar://12119347

llvm-svn: 162280
2012-08-21 15:42:49 +00:00
Chandler Carruth c908ca1766 Port the global copy optimization from the SROA pass to InstCombine.
This optimization is really just replacing allocas wholesale with
globals, there is no scalarization.

The underlying motivation for this patch is to simplify the SROA pass
and focus it on splitting and promoting allocas.

llvm-svn: 162271
2012-08-21 08:39:44 +00:00
Craig Topper bab0c76674 Fix up indentation and remove a couple else's after returns.
llvm-svn: 162270
2012-08-21 08:29:51 +00:00
Kostya Serebryany f4be019fba [asan] add code to detect global initialization fiasco in C/C++. The sub-pass is off by default for now. Patch by Reid Watson. Note: this patch changes the interface between LLVM and compiler-rt parts of asan. The corresponding patch to compiler-rt will follow.
llvm-svn: 162268
2012-08-21 08:24:25 +00:00
Craig Topper bfcfdeb563 Use uint16_t for tables of opcodes.
llvm-svn: 162267
2012-08-21 08:23:21 +00:00
Craig Topper a0cabf19f8 Fix up indentation. No functional change.
llvm-svn: 162264
2012-08-21 08:17:07 +00:00
Craig Topper 4bc3e5a1bf Add a couple llvm_unreachables. Add a message to several others.
llvm-svn: 162263
2012-08-21 08:16:16 +00:00
Craig Topper 653e759046 Replace a break with llvm_unreachable in the default case of a nested switch. Condense code a bit. No functional change.
llvm-svn: 162261
2012-08-21 07:32:16 +00:00
Craig Topper 384fae2f0d Cleanup the scalar FMA3 definitions. Add patterns to fold loads with scalar forms.
llvm-svn: 162260
2012-08-21 07:11:11 +00:00
Craig Topper 4f3879dfa7 Merge FMA3 instructions with and without patterns into single classes using null_frag.
llvm-svn: 162257
2012-08-21 05:56:45 +00:00
Michael Liao 6e12d12830 revise debug output to avoid dangling pointer
llvm-svn: 162256
2012-08-21 05:55:22 +00:00
Jakob Stoklund Olesen 74e6f9fc65 Add a missing def flag.
*** Bad machine code: Explicit definition marked as use ***
- function:    test_cos
- basic block: BB#0 L.entry (0x7ff2a2024fd0)
- instruction: VSETLNi32 %D11, %D11<undef>, %R0, 0, pred:14, pred:%noreg, %Q5<imp-use,kill>, %Q5<imp-def>
- operand 0:   %D11

llvm-svn: 162247
2012-08-21 00:34:53 +00:00
Jakob Stoklund Olesen 6bae2a57d5 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

This patch was originally committed as r161460, but reverted again
because of assertion failures. Now that duplicate Machine CFG edges have
been eliminated, this works properly.

llvm-svn: 162233
2012-08-20 22:01:38 +00:00
Jakob Stoklund Olesen 7d33c5739f Don't add CFG edges for redundant conditional branches.
IR that hasn't been through SimplifyCFG can look like this:

  br i1 %b, label %r, label %r

Make sure we don't create duplicate Machine CFG edges in this case.

Fix the machine code verifier to accept conditional branches with a
single CFG edge.

llvm-svn: 162230
2012-08-20 21:39:52 +00:00
Jakob Stoklund Olesen 1d0262677b Add a verification pass after ExpandISelPseudos.
This pass often has weird CFG hacks and hand-written MI building code
that can go wrong in many ways.

llvm-svn: 162224
2012-08-20 20:52:08 +00:00
Jakob Stoklund Olesen de31b52c3e Add CFG checks to MachineVerifier.
Verify that the predecessor and successor lists are consistent and free
of duplicates.

llvm-svn: 162223
2012-08-20 20:52:06 +00:00
Jakob Stoklund Olesen 710093e360 Use a SmallPtrSet to dedup successors in EmitSjLjDispatchBlock.
The test case ARM/2011-05-04-MultipleLandingPadSuccs.ll was creating
duplicate successor list entries.

llvm-svn: 162222
2012-08-20 20:52:03 +00:00
Sebastian Pop 1a0bef6d4b fix HexagonSubtarget parsing of -mv flag
llvm-svn: 162217
2012-08-20 19:56:47 +00:00
Michael Liao 10ff96ce8c fix a case where all operands of BUILD_VECTOR are undefined
llvm-svn: 162214
2012-08-20 17:59:18 +00:00
Akira Hatanaka 11dfbe196f Fix coding style violations in 162135 and 162136.
Patch by Petar Jovanovic.

llvm-svn: 162213
2012-08-20 17:53:24 +00:00
Benjamin Kramer 1b07ab51e4 DataExtractor: Fix integer truncation issues in LEB128 extraction.
llvm-svn: 162201
2012-08-20 10:52:11 +00:00
Stepan Dyatkovskiy 6a638ec521 Fixed DAGCombiner bug (found and localized by James Malloy):
The DAGCombiner tries to optimise a BUILD_VECTOR by checking if it
consists purely of get_vector_elts from one or two source vectors. If
so, it either makes a concat_vectors node or a shufflevector node.

However, it doesn't check the element type width of the underlying
vector, so if you have this sequence:

Node0: v4i16 = ...
Node1: i32 = extract_vector_elt Node0
Node2: i32 = extract_vector_elt Node0
Node3: v16i8 = BUILD_VECTOR Node1, Node2, ...

It will attempt to:

Node0:    v4i16 = ...
NewNode1: v16i8 = concat_vectors Node0, ...

Where this is actually invalid because the element width is completely
different. This causes an assertion failure on DAG legalization stage.

Fix:
If output item type of BUILD_VECTOR differs from input item type.
Make concat_vectors based on input element type and then bitcast it to the output vector type. So the case described above will transformed to:
Node0:    v4i16 = ...
NewNode1: v8i16 = concat_vectors Node0, ...
NewNode2: v16i8 = bitcast NewNode1

llvm-svn: 162195
2012-08-20 07:57:06 +00:00
Craig Topper b58eec4eaf Remove FMA3 intrinsic instructions in favor of patterns.
llvm-svn: 162194
2012-08-20 06:21:25 +00:00
Craig Topper 37eca54912 Use correct intrinsic for 256-bit VFMSUBADDPS.
llvm-svn: 162193
2012-08-20 06:03:04 +00:00
Craig Topper 5122e9f194 Remove trailing white space and tab characters. No functional change.
llvm-svn: 162192
2012-08-19 23:37:46 +00:00
Nadav Rotem 178250ad87 When unsafe math is used, we can use commutative FMAX and FMIN. In some cases
this allows for better code generation.

Added a new DAGCombine transformation to convert FMAX and FMIN to FMANC and
FMINC, which are commutative.

For example:

  movaps  %xmm0, %xmm1
  movsd LC(%rip), %xmm0
  minsd %xmm1, %xmm0

becomes:

  minsd LC(%rip), %xmm0

llvm-svn: 162187
2012-08-19 13:06:16 +00:00
Benjamin Kramer fd4fe7061c Fabs folding is implemented.
llvm-svn: 162186
2012-08-19 09:51:44 +00:00
Benjamin Kramer 9d03242fcf InstCombine: Fix a crasher when encountering a function pointer.
llvm-svn: 162180
2012-08-18 22:04:34 +00:00
Jakob Stoklund Olesen e1014e7b98 Remove the CAND/COR/CXOR custom ISD nodes and their select code.
These nodes are no longer needed because the peephole pass can fold
CMOV+AND into ANDCC etc.

llvm-svn: 162179
2012-08-18 21:49:50 +00:00
Craig Topper fd1c925946 Remove virtual from many methods. These methods replace methods in the base class, but the base class methods aren't virtual so it just increased call overhead.
llvm-svn: 162178
2012-08-18 21:38:45 +00:00
Jakob Stoklund Olesen dded061f85 Also combine zext/sext into selects for ARM.
This turns common i1 patterns into predicated instructions:

  (add (zext cc), x) -> (select cc (add x, 1), x)
  (add (sext cc), x) -> (select cc (add x, -1), x)

For a function like:

  unsigned f(unsigned s, int x) {
    return s + (x>0);
  }

We now produce:

  cmp r1, #0
  it  gt
  addgt.w r0, r0, #1

Instead of:

  movs  r2, #0
  cmp r1, #0
  it  gt
  movgt r2, #1
  add r0, r2

llvm-svn: 162177
2012-08-18 21:25:22 +00:00
Jakob Stoklund Olesen aab43dbfbb Also pass logical ops to combineSelectAndUse.
Add these transformations to the existing add/sub ones:

  (and (select cc, -1, c), x) -> (select cc, x, (and, x, c))
  (or  (select cc, 0, c), x)  -> (select cc, x, (or, x, c))
  (xor (select cc, 0, c), x)  -> (select cc, x, (xor, x, c))

The selects can then be transformed to a single predicated instruction
by peephole.

This transformation will make it possible to eliminate the ISD::CAND,
COR, and CXOR custom DAG nodes.

llvm-svn: 162176
2012-08-18 21:25:16 +00:00
Benjamin Kramer 9282aef86d Remove overly conservative hasOneUse check, this always expands into a single IR instruction.
llvm-svn: 162175
2012-08-18 20:24:19 +00:00
Benjamin Kramer 8c2a733c55 InstCombine: Add a couple of fabs identities for comparing with 0.0.
llvm-svn: 162174
2012-08-18 20:06:47 +00:00
Benjamin Kramer 000132454c SimplifyLibcalls: Add fabs and trunc to the list of libcalls that are safe to shrink from double to float.
llvm-svn: 162173
2012-08-18 19:27:32 +00:00
Nadav Rotem a136939fa9 Reapply r162160 with a fix: Optimize Arith->Trunc->SETCC sequence to allow better compare/branch code.
llvm-svn: 162172
2012-08-18 17:53:03 +00:00
Anton Korobeynikov 1e28826abe fp16-to-fp32 conversion instructions are available in Thumb mode as well.
Make sure the generic pattern is used.

llvm-svn: 162170
2012-08-18 13:08:43 +00:00
Craig Topper 0128f9bad7 Refactor code a bit to reduce number of calls in the final compiled code. No functional change intended.
llvm-svn: 162166
2012-08-18 06:39:34 +00:00
Craig Topper 2bd9c7bd2d Reorder initialization list to silence -Wreorder
llvm-svn: 162165
2012-08-18 06:20:54 +00:00
Nadav Rotem c324af609e Revert r162160 because it made a few buildbots fail.
llvm-svn: 162164
2012-08-18 05:02:36 +00:00
Nadav Rotem 2cb14a5c4b The X86 backend has a number of optimizations for SETCC nodes which use
arithmetic instructions. However, when small data types are used, a truncate
node appears between the SETCC node and the arithmetic operation. This patch
adds support for this pattern.

Before:
  xorl  %esi, %edi
  testb %dil, %dil
  setne %al
  ret

After:
  xorb  %dil, %sil
  setne %al
  ret

rdar://12081007

llvm-svn: 162160
2012-08-18 02:43:28 +00:00
Eli Friedman 79a6b30d8a Make atomic load and store of pointers work. Tighten verification of atomic operations
so other unexpected operations don't slip through.  Based on patch by Logan Chien.
PR11786/PR13186.

llvm-svn: 162146
2012-08-17 23:24:29 +00:00
Richard Smith 257c5f2088 Fix undefined behavior (binding a reference to a dereferenced null pointer) if
SSAUpdater was created and destroyed without being initialized.

llvm-svn: 162137
2012-08-17 21:42:44 +00:00
Akira Hatanaka fb21e84248 Add MipsELFWriterInfo.{h,cpp}.
llvm-svn: 162136
2012-08-17 21:38:47 +00:00
Akira Hatanaka 111174be7b Correct MCJIT functionality for MIPS32 architecture.
No new tests are added.
All tests in ExecutionEngine/MCJIT that have been failing pass after this patch
is applied (when "make check" is done on a mips board). 

Patch by Petar Jovanovic.

llvm-svn: 162135
2012-08-17 21:28:04 +00:00
Bill Wendling bfb9b7598d Implement stack protectors for structures with character arrays in them.
<rdar://problem/10545247>

llvm-svn: 162131
2012-08-17 20:59:56 +00:00
Jakob Stoklund Olesen 7b1a2e8f02 Avoid folding ADD instructions with FI operands.
PEI can't handle the pseudo-instructions. This can be removed when the
pseudo-instructions are replaced by normal predicated instructions.

Fixes PR13628.

llvm-svn: 162130
2012-08-17 20:55:34 +00:00
Akira Hatanaka 7605630c48 Add stub methods for mips assembly matcher.
Patch by Vladimir Medic.

llvm-svn: 162124
2012-08-17 20:16:42 +00:00
Benjamin Kramer 34764fe2e4 MemoryBuiltins: Properly guard ObjectSizeOffsetVisitor against cycles in the IR.
The previous fix only checked for simple cycles, use a set to catch longer
cycles too.

Drop the broken check from the ObjectSizeOffsetEvaluator. The BoundsChecking
pass doesn't have to deal with invalid IR like InstCombine does.

llvm-svn: 162120
2012-08-17 19:26:41 +00:00
Bill Wendling 34bc34ecae Change the `linker_private_weak_def_auto' linkage to `linkonce_odr_auto_hide' to
make it more consistent with its intended semantics.

The `linker_private_weak_def_auto' linkage type was meant to automatically hide
globals which never had their addresses taken. It has nothing to do with the
`linker_private' linkage type, which outputs the symbols with a `l' (ell) prefix
among other things.

The intended semantic is more like the `linkonce_odr' linkage type.

Change the name of the linkage type to `linkonce_odr_auto_hide'. And therefore
changing the semantics so that it produces the correct output for the linker.

Note: The old linkage name `linker_private_weak_def_auto' will still parse but
is not a synonym for `linkonce_odr_auto_hide'. This should be removed in 4.0.
<rdar://problem/11754934>

llvm-svn: 162114
2012-08-17 18:33:14 +00:00
Rafael Espindola 9a16735e22 Assert that dominates is not given a multiple edge. Finding out if we have
multiple edges between two blocks is linear. If the caller is iterating all
edges leaving a BB that would be a square time algorithm. It is more efficient
to have the callers handle that case.

Currently the only callers are:
* GVN: already avoids the multiple edge case.
* Verifier: could only hit this assert when looking at an invalid invoke. Since
it already rejects the invoke, just avoid computing the dominance for it.

llvm-svn: 162113
2012-08-17 18:21:28 +00:00
Jakob Stoklund Olesen c1dee482c8 Add comment, clean up code. No functional change.
llvm-svn: 162107
2012-08-17 16:59:09 +00:00
Benjamin Kramer ca7ca4f6c6 TargetLowering: Use the large shift amount during legalize types. The legalizer may call us with an overly large type.
llvm-svn: 162101
2012-08-17 15:54:21 +00:00
Jakob Stoklund Olesen 714f595c98 Use standard pattern for iterate+erase.
Increment the MBB iterator at the top of the loop to properly handle the
current (and previous) instructions getting erased.

This fixes PR13625.

llvm-svn: 162099
2012-08-17 14:38:59 +00:00
Benjamin Kramer 4901f0d2a2 Guard MemoryBuiltins against self-looping GEPs, which can occur in unreachable code due to constant propagation.
Fixes PR13621.

llvm-svn: 162098
2012-08-17 14:16:37 +00:00
Tim Northover f66181530f Implement NEON domain switching for scalar <-> S-register vmovs on ARM
llvm-svn: 162094
2012-08-17 11:32:52 +00:00
Craig Topper 31625574db Use nested switch to select arguments to reduce calls to EmitPCMP.
llvm-svn: 162089
2012-08-17 07:15:56 +00:00
Craig Topper 602e1abe0d Make ReplaceATOMIC_BINARY_64 a static function. Use a nested switch to reduce to only a single call to it thus allowing it to be inlined by the compiler.
llvm-svn: 162088
2012-08-17 06:55:11 +00:00
Craig Topper f6add7e667 Remove unnecessary include of ARMGenInstrInfo.inc.
llvm-svn: 162086
2012-08-17 06:21:09 +00:00
Jakob Stoklund Olesen 0ea1fce6b4 Add ADD and SUB to the predicable ARM instructions.
It is not my plan to duplicate the entire ARM instruction set with
predicated versions. We need a way of representing predicated
instructions in SSA form without requiring a separate opcode.

Then the pseudo-instructions can go away.

llvm-svn: 162061
2012-08-16 23:21:55 +00:00
Jakob Stoklund Olesen c19bf0282d Handle ARM MOVCC optimization in PeepholeOptimizer.
Use the target independent select analysis hooks.

llvm-svn: 162060
2012-08-16 23:14:20 +00:00
Jakob Stoklund Olesen 2382d320b3 Add an MCID::Select flag and TII hooks for optimizing selects.
Select instructions pick one of two virtual registers based on a
condition, like x86 cmov. On targets like ARM that support predication,
selects can sometimes be eliminated by predicating the instruction
defining one of the operands.

Teach PeepholeOptimizer to recognize select instructions, and ask the
target to optimize them.

llvm-svn: 162059
2012-08-16 23:11:47 +00:00
Roman Divacky 2039a987c4 Revert r162034, r162035 and r162037.
llvm-svn: 162039
2012-08-16 19:07:59 +00:00
Roman Divacky 9d38fc8ddc Define and handle additional fixup kinds. By Adhemerval Zanella.
llvm-svn: 162037
2012-08-16 18:37:52 +00:00
Roman Divacky 1faf5b07c6 Fix typo and grammar. By Adhemerval Zanella.
llvm-svn: 162032
2012-08-16 18:19:29 +00:00
Rafael Espindola cc80cdebb9 Teach GVN to reason about edges dominating uses. This allows it to handle cases
where some fact lake a=b dominates a use in a phi, but doesn't dominate the
basic block itself.

This feature could also be implemented by splitting critical edges, but at least
with the current algorithm reasoning about the dominance directly is faster.

The time for running "opt -O2" in the testcase in pr10584 is 1.003 times slower
and on gcc as a single file it is 1.0007 times faster.

llvm-svn: 162023
2012-08-16 15:09:43 +00:00
Jush Lu 26088cb30e [arm-fast-isel] Add support for fastcc.
Without fastcc support, the caller just falls through to CallingConv::C
for fastcc, but callee still uses fastcc, this inconsistency of calling
convention is a problem, and fastcc support can fix it.

llvm-svn: 162013
2012-08-16 05:15:53 +00:00
Anitha Boyapati af3e98347f Patch to enable FMA on bdver2 target. Make XOP feature enable FMA4 as well.
llvm-svn: 162012
2012-08-16 04:04:02 +00:00
Anitha Boyapati 426feb61b9 (no commit message)
llvm-svn: 162010
2012-08-16 03:50:04 +00:00
Akira Hatanaka 89d50b3957 Add Android ABI to Mips backend to handle functions returning vectors of four
floats.

llvm-svn: 162008
2012-08-16 03:48:05 +00:00
Jakob Stoklund Olesen 6cb96120f1 Fold predicable instructions into MOVCC / t2MOVCC.
The ARM select instructions are just predicated moves. If the select is
the only use of an operand, the instruction defining the operand can be
predicated instead, saving one instruction and decreasing register
pressure.

This implementation can turn AND/ORR/EOR instructions into their
corresponding ANDCC/ORRCC/EORCC variants. Ideally, we should be able to
predicate any instruction, but we don't yet support predicated
instructions in SSA form.

llvm-svn: 161994
2012-08-15 22:16:39 +00:00
Bill Wendling 4d5150d978 Remove dead flag.
llvm-svn: 161990
2012-08-15 21:18:10 +00:00
Sean Callanan 219fc0f386 Fixed a problem in the JIT memory allocator where
allocations of executable memory would not be padded
to account for the size of the allocation header.
This resulted in undersized allocations, meaning that
when the allocation was written to later the next
allocation's header would be corrupted.

llvm-svn: 161984
2012-08-15 20:53:52 +00:00
Michael J. Spencer 1d2d12deb1 Properly test the LLVM_USE_RVALUE_REFERENCES macro.
llvm-svn: 161978
2012-08-15 19:16:27 +00:00
Michael J. Spencer ef2284fbad [PathV2] Add mapped_file_region. Implementation for Windows and POSIX.
llvm-svn: 161976
2012-08-15 19:05:47 +00:00
Owen Anderson 352dfff447 Fix another roundToIntegral bug where very large values could become infinity. Problem and solution identified by Steve Canon.
llvm-svn: 161969
2012-08-15 18:28:45 +00:00
Evan Cheng eec6bc6270 Use vld1/vst1 to load/store f64 if alignment is < 4 and the target allows unaligned access. rdar://12091029
llvm-svn: 161962
2012-08-15 17:44:53 +00:00
Owen Anderson be7e297b6d Fix typo in comment.
llvm-svn: 161956
2012-08-15 16:42:53 +00:00
Jakob Stoklund Olesen 2ec0c41e01 Add missing Rfalse operand to the predicated pseudo-instructions.
When predicating this instruction:

  Rd = ADD Rn, Rm

We need an extra operand to represent the value given to Rd when the
predicate is false:

  Rd = ADDCC Rfalse, Rn, Rm, pred

The Rd and Rfalse operands are different registers while in SSA form.
Rfalse is tied to Rd to make sure they get the same register during
register allocation.

Previously, Rd and Rn were tied, but that is not required.

Compare to MOVCC:

  Rd = MOVCC Rfalse, Rtrue, pred

llvm-svn: 161955
2012-08-15 16:17:24 +00:00
Bill Wendling e1c54262f4 Set the branch probability of branching to the 'normal' destination of an invoke
instruction to something absurdly high, while setting the probability of
branching to the 'unwind' destination to the bare minimum. This should set cause
the normal destination's invoke blocks to be moved closer to the invoke.

PR13612

llvm-svn: 161944
2012-08-15 12:22:35 +00:00
Kostya Serebryany 1e575ab8b2 [asan] implement --asan-always-slow-path, which is a part of the improvement to handle unaligned partially OOB accesses. See http://code.google.com/p/address-sanitizer/issues/detail?id=100
llvm-svn: 161937
2012-08-15 08:58:58 +00:00
Owen Anderson 1ff74b0d2d Fix a problem with APFloat::roundToIntegral where it would return incorrect results for negative inputs to trunc. Add unit tests to verify this behavior.
llvm-svn: 161929
2012-08-15 05:39:46 +00:00
Michael Liao 69e172a6f0 fix infinite loop in instcombine with more than 4GB memcpy
- memcpy size is wrongly truncated into 32-bit and treat 8GB memcpy is
  0-sized memcpy
- as 0-sized memcpy/memset is already removed before SimplifyMemTransfer
  and SimplifyMemSet in visitCallInst, replace 0 checking with
  assertions.
- replace getZExtValue() with getLimitedValue() according to
  Eli Friedman

llvm-svn: 161923
2012-08-15 03:49:59 +00:00
Nick Lewycky 58564d5aa6 Fix a typo that led to a failure to correctly verify bitcast instructions.
Patch by Stephen Hines!

llvm-svn: 161921
2012-08-15 02:37:07 +00:00
Richard Smith 8f3447c032 Fix undefined behavior: don't perform array indexing through a potentially null
pointer.

llvm-svn: 161919
2012-08-15 01:39:31 +00:00
Anton Korobeynikov c6d945b11a The names of VFP variants of half-to-float conversion instructions were
reversed. This leads to wrong codegen for float-to-half conversion
intrinsics which are used to support storage-only fp16 type.
NEON variants of same instructions are fine.

llvm-svn: 161907
2012-08-14 23:36:01 +00:00
Eric Christopher 5f61a7498b This needs braces. Spotted by Bill.
llvm-svn: 161906
2012-08-14 23:32:15 +00:00
Michael Liao 06f6fe875a minor fix of X86ISD::VSEXT_MOVL dump
llvm-svn: 161902
2012-08-14 22:53:17 +00:00
Michael Liao 34107b9177 fix PR11334
- FP_EXTEND only support extending from vectors with matching elements.
  This results in the scalarization of extending to v2f64 from v2f32,
  which will be legalized to v4f32 not matching with v2f64.
- add X86-specific VFPEXT supproting extending from v4f32 to v2f64.
- add BUILD_VECTOR lowering helper to recover back the original
  extending from v4f32 to v2f64.
- test case is enhanced to include different vector width.

llvm-svn: 161894
2012-08-14 21:24:47 +00:00
Jim Grosbach ecaef49f59 Switch the fixed-length disassembler to be table-driven.
Refactor the TableGen'erated fixed length disassemblmer to use a
table-driven state machine rather than a massive set of nested
switch() statements.

As a result, the ARM Disassembler (ARMDisassembler.cpp) builds much more
quickly and generates a smaller end result. For a Release+Asserts build on
a 16GB 3.4GHz i7 iMac w/ SSD:

Time to compile at -O2 (averaged w/ hot caches):
  Previous: 35.5s
  New:       8.9s

TEXT size:
  Previous: 447,251
  New:      297,661

Builds in 25% of the time previously required and generates code 66% of
the size.

Execution time of the disassembler is only slightly slower (7% disassembling
10 million ARM instructions, 19.6s vs 21.0s). The new implementation has
not yet been tuned, however, so the performance should almost certainly
be recoverable should it become a concern.

llvm-svn: 161888
2012-08-14 19:06:05 +00:00
Owen Anderson 0b35722533 Fix the construction of the magic constant for roundToIntegral to be 64-bit safe. Fixes c-torture/execute/990826-0.c
llvm-svn: 161885
2012-08-14 18:51:15 +00:00
Kostya Serebryany fda7a138f7 [asan] insert crash basic blocks inline as opposed to inserting them at the end of the function. This doesn't seem to fix or break anything, but is considered to be more friendly to downstream passes
llvm-svn: 161870
2012-08-14 14:04:51 +00:00
Craig Topper 925a281b00 Factor duplicate calls to getUNDEF in several functions.
llvm-svn: 161860
2012-08-14 08:18:43 +00:00
Craig Topper d0d4b11f66 Re-factor intrinsic lowering to combine common parts of similar intrinsics. Reduces compiled code size a little bit.
llvm-svn: 161859
2012-08-14 07:43:25 +00:00
Craig Topper 2a40418a99 Change greater than to greater than or equal so that an identical sized store to the same offset is treated as completing overwriting.
llvm-svn: 161857
2012-08-14 07:32:05 +00:00
Richard Smith 0ff8f0eaf9 Fix undefined behavior: binding null pointer to reference. No functionality change.
llvm-svn: 161853
2012-08-14 05:31:26 +00:00
Nadav Rotem 70409991bc During the CodeGenPrepare we often lower intrinsics (such as objsize)
and allow some optimizations to turn conditional branches into unconditional.
This commit adds a simple control-flow optimization which merges two consecutive
basic blocks which are connected by a single edge. This allows the codegen to
operate on larger basic blocks.

rdar://11973998

llvm-svn: 161852
2012-08-14 05:19:07 +00:00
Eric Christopher 160522c25a Grammar.
llvm-svn: 161851
2012-08-14 05:13:29 +00:00
Eric Christopher 97f6ea9f34 Typo.
llvm-svn: 161826
2012-08-14 01:09:10 +00:00
Owen Anderson a40319b7f1 Add a roundToIntegral method to APFloat, which can be parameterized over various rounding modes. Use this to implement SelectionDAG constant folding of FFLOOR, FCEIL, and FTRUNC.
llvm-svn: 161807
2012-08-13 23:32:49 +00:00
Jakob Stoklund Olesen 396b595b92 Transfer weights in transferSuccessorsAndUpdatePHIs().
llvm-svn: 161805
2012-08-13 23:13:25 +00:00
Jakob Stoklund Olesen 1dc107a84e Print out MachineBasicBlock successor weights when available.
llvm-svn: 161804
2012-08-13 23:13:23 +00:00
Nadav Rotem 8d80452076 LICM uses AliasSet information to hoist and sink instructions. However, other passes, such as LoopRotate
may invalidate its AliasSet because SSAUpdater does not update the AliasSet properly.
This patch teaches SSAUpdater to notify AliasSet that it made changes.
The testcase in PR12901 is too big to be useful and I could not reduce it to a normal size. 

rdar://11872059 PR12901

llvm-svn: 161803
2012-08-13 23:06:54 +00:00
Nadav Rotem 5d4e205874 MemoryDependenceAnalysis attempts to find the first memory dependency for function calls.
Currently, if GetLocation reports that it did not find a valid pointer (this is the case for volatile load/stores),
we ignore the result. This patch adds code to handle the cases where we did not obtain a valid pointer.

rdar://11872864  PR12899

llvm-svn: 161802
2012-08-13 23:03:43 +00:00
Jakob Stoklund Olesen 702bcc3bcf Remove the TII::scheduleTwoAddrSource() hook.
It never does anything when running 'make check', and it get's in the
way of updating live intervals in 2-addr.

The hook was originally added to help form IT blocks in Thumb2 code
before register allocation, but the pass ordering has changed since
then, and we run if-conversion after register allocation now.

When the MI scheduler is enabled, there will be no less than two
schedulers between 2-addr and Thumb2ITBlockPass, so this hook is
unlikely to help anything.

llvm-svn: 161794
2012-08-13 21:52:57 +00:00
Manman Ren d6c8270eaa ARM: enable struct byval for AAPCS-VFP.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 161789
2012-08-13 21:22:50 +00:00
Bill Wendling 49aeb5cc5d Whitespace cleanup.
llvm-svn: 161788
2012-08-13 21:20:43 +00:00
Jakob Stoklund Olesen d0af1d9657 Count triangles and diamonds in early if-conversion.
llvm-svn: 161783
2012-08-13 21:03:27 +00:00
Jakob Stoklund Olesen 62a097d134 Delete dead typedef.
llvm-svn: 161782
2012-08-13 21:03:25 +00:00
Jakob Stoklund Olesen 83a927d84a Handle extra Tail predecessors in if-conversion.
It is still possible to if-convert if the tail block has extra
predecessors, but the tail phis must be rewritten instead of being
removed.

llvm-svn: 161781
2012-08-13 20:49:04 +00:00
Arnold Schwaighofer 0bb7f23cfc [Hexagon] Don't mark callee saved registers as clobbered by a tail call
This was causing unnecessary spills/restores of callee saved registers.

Fixes PR13572.

Patch by Pranav Bhandarkar!

llvm-svn: 161778
2012-08-13 19:54:01 +00:00
Nadav Rotem 3a94c545cf Do not optimize (or (and X,Y), Z) into BFI and other sequences if the AND ISDNode has more than one user.
rdar://11876519

llvm-svn: 161775
2012-08-13 18:52:44 +00:00
Manman Ren 959acb106b X86: move Int_CVTSD2SSrr, Int_CVTSI2SSrr, Int_CVTSI2SDrr, Int_CVTSS2SDrr from
OpTbl1 to OpTbl2 since they have 3 operands and the last operand can be changed
to a memory operand.

PR13576

llvm-svn: 161769
2012-08-13 18:29:41 +00:00
Eric Christopher 7d8b53c1f8 Add support for the %H output modifier.
Patch by Weiming Zhao.

llvm-svn: 161768
2012-08-13 18:18:52 +00:00
Manman Ren e90e94f117 X86: when auto-detecting the subtarget features, make sure use IsIntel to detect
Nehalem, Westmere and Sandy Bridge. AMD also has processor family 6.

llvm-svn: 161763
2012-08-13 17:26:46 +00:00
Kostya Serebryany 0f7a80d0c3 [asan] remove the code for --asan-merge-callbacks as it appears to be a bad idea. (partly related to Bug 13225)
llvm-svn: 161757
2012-08-13 14:08:46 +00:00
Tim Northover 5aaa7fde94 Use correct loads for vector types during extending-load operations.
Previously, we used VLD1.32 in all cases, however there are both 16 and 64-bit
accesses being selected, so we need to use an appropriate width load in those
cases.

llvm-svn: 161748
2012-08-13 09:06:31 +00:00
Craig Topper 4e5eb72735 Tidy up VSETCC lowering code a bit more by adding an llvm_unreachable and putting an a couple if conditions in a better order.
llvm-svn: 161746
2012-08-13 03:42:38 +00:00
Craig Topper 5145a0d967 Refactor code a bit to share commonalities. No functional change intended.
llvm-svn: 161745
2012-08-13 02:34:03 +00:00
Craig Topper ff6e4d1928 Fix an unused variable warning from r161742.
llvm-svn: 161743
2012-08-13 01:26:45 +00:00
Craig Topper a7aaa62d54 Remove the LowerMMXCONCAT_VECTORS function. It could never execute because there are no legal 64-bit vector types that could be used as inputs to a 128-bit concat_vectors. Remove a target specific SDNode and its patterns that become unused as a result.
llvm-svn: 161742
2012-08-13 01:23:55 +00:00
Nick Lewycky 333449cd65 When emitting the PC range in an FDE, use the same data encoding for both ends
of the range. Fixes PR13581!

llvm-svn: 161739
2012-08-12 08:09:45 +00:00
Craig Topper 3d2b271362 Remove call to setOperationAction for SETCC of v4f32. SETCC returns an integer type not an FP type.
llvm-svn: 161738
2012-08-12 05:31:32 +00:00
Craig Topper 498228d089 Remove unnecessary call to setOperationAction for SETCC of v2i64 under SSE42. It was already called for the same under SSE2.
llvm-svn: 161737
2012-08-12 05:15:16 +00:00
Arnold Schwaighofer b73da9453c Revert 161581: Patch to implement UMLAL/SMLAL instructions for the ARM
architecture

It broke MultiSource/Applications/JM/ldecod/ldecod on armv7 thumb O0 g and armv7
thumb O3.

llvm-svn: 161736
2012-08-12 05:11:56 +00:00
Craig Topper 4fa625fda7 Change addTypeForNeon to use MVT instead of EVT so all the calls to getSimpleVT can be removed.
llvm-svn: 161735
2012-08-12 03:16:37 +00:00
Craig Topper 10a8bf3b8c Make replace many calls to getSizeInBits() with is128BitVector/is256BitVector
llvm-svn: 161734
2012-08-12 02:23:29 +00:00
Craig Topper 03d2787275 Use MVT.isXBitVector instead of EVT.isXBitVector when setting up operation actions. Compiles to smaller code.
llvm-svn: 161733
2012-08-12 00:34:56 +00:00
Michael Liao e7e828fd64 fix PR13577, an issue introduced by r161687
- FCMOV only supports a subset of X86 conditions. Skip boolean
  simplification if X86 condition is not valid for FCMOV.
- add a minimal test case for PR13577.

llvm-svn: 161732
2012-08-11 23:47:06 +00:00
Craig Topper b5bcf58ba1 Move setOperationAction for CONCAT_VECTORS for 256-bit vectors into loop since all 256-bit types are supported.
llvm-svn: 161730
2012-08-11 22:34:26 +00:00
Benjamin Kramer 59c8b411e0 MachineCSE: Hoist isConstantPhysReg out of the loop, it checks for overlaps already.
llvm-svn: 161729
2012-08-11 20:42:59 +00:00
Benjamin Kramer ef6494f24d PR13578: Teach MachineCSE that instructions that use a constant register can be CSE'd safely.
This is common e.g. when doing rip-relative addressing on x86_64.

llvm-svn: 161728
2012-08-11 19:05:13 +00:00
Craig Topper 490c45c06c Tidy up indentation. No functional change.
llvm-svn: 161727
2012-08-11 17:53:00 +00:00
Craig Topper 55406d9f78 Fix a cast that was casting away 'const' unnecessarily
llvm-svn: 161726
2012-08-11 17:46:16 +00:00
Craig Topper 22cb0c572b Add a couple default: llvm_unreachable() to some switch statements. Fix a bad message in an existing llvm_unreachable.
llvm-svn: 161725
2012-08-11 17:44:14 +00:00
Manman Ren 1acb6707cd X86: when we are auto-detecting the subtarget features, make sure we turn on
FeatureFastUAMem for Nehalem, Westmere and Sandy Bridge.

FeatureFastUAMem is already on if we pass in nehalem or westmere as a command
argument.

rdar: 7252306
llvm-svn: 161717
2012-08-10 23:43:32 +00:00
Jakob Stoklund Olesen bc55bfde03 Add a proper if-conversion cost model.
Detect when there is not enough available ILP, so if-conversion can't
speculate instructions for free.

Compute the lengthening of the critical path when inserting a select
instruction that depends on the condition as well as both sides of the
if.

Reject conversions that would stretch the critical path by more than
half a mispredict penalty.

llvm-svn: 161713
2012-08-10 22:27:31 +00:00
Jakob Stoklund Olesen a0042acd3b Give MachineTraceMetrics its own debug tag.
llvm-svn: 161712
2012-08-10 22:27:29 +00:00
Jakob Stoklund Olesen 3484420927 Add more trace query functions.
Trace::getResourceLength() computes the number of cycles required to
execute the trace when ignoring data dependencies. The number can be
compared to the critical path to estimate the trace ILP.

Trace::getPHIDepth() computes the data dependency depth of a PHI in a
trace successor that isn't necessarily part of the trace.

llvm-svn: 161711
2012-08-10 22:27:27 +00:00
Eli Friedman 4c923b3b3f The normal edge of an invoke is not allowed to branch to a block with a
landingpad.  Enforce it in the verifier, and fix the regression tests to match.

llvm-svn: 161697
2012-08-10 20:55:20 +00:00
Manman Ren e201e27eb1 ARM: enable struct byval for AAPCS.
This change is to be enabled in clang.

rdar://9877866
PR://13350

llvm-svn: 161693
2012-08-10 20:39:38 +00:00
Jakob Stoklund Olesen 0a99062cf6 Add getTPred() and getFPred() functions.
They identify the PHI predecessors in both diamonds and triangles.

llvm-svn: 161689
2012-08-10 20:19:17 +00:00
Jakob Stoklund Olesen 0954d4199a Include loop-carried dependencies when computing instr heights.
When a trace ends with a back-edge, include PHIs in the loop header in
the height computations. This makes the critical path through a loop
more accurate by including the latencies of the last instructions in the
loop.

llvm-svn: 161688
2012-08-10 20:11:38 +00:00
Michael Liao 5248e9913f add X86-specific DAG optimization to simplify boolean test
- if a boolean test (X86ISD::CMP or X86ISD:SUB) checks a boolean value
  generated from X86ISD::SETCC, try to simplify the boolean value
  generation and checking by reusing the original EFLAGS with proper
  condition code
- add hooks to X86 specific SETCC/BRCOND/CMOV, the major 3 places
  consuming EFLAGS

part of patches fixing PR12312

llvm-svn: 161687
2012-08-10 19:58:13 +00:00
Rafael Espindola 64e7b5703e Constify some basic blocks, no functionality change.
llvm-svn: 161668
2012-08-10 15:55:25 +00:00
Michael Liao ea7d906b0f remove tailing whitespaces and test commit
llvm-svn: 161664
2012-08-10 14:39:24 +00:00
Rafael Espindola 1187077f81 Move BasicBlockEdge to the cpp file. No functionality change.
llvm-svn: 161663
2012-08-10 14:05:55 +00:00
Joerg Sonnenberger c0697304c9 stdcxx's cstdio doesn't include stdio.h, but the code using PathV2.inc
includes both. Deal with feof and ferror potentially being macros.

llvm-svn: 161658
2012-08-10 10:56:09 +00:00
Joerg Sonnenberger aa2f801ca3 Add some missing includes for the build against stdcxx.
llvm-svn: 161657
2012-08-10 10:53:56 +00:00
Pete Cooper 0deca6be79 Fix crash when when do lto on Bullet. Dynamic GEPs in SROA were incorrectly being applied to all accesses to an alloca, not just the ones which read from the GEP. Thanks to Evan for reducing the test. rdar://11861001
llvm-svn: 161654
2012-08-10 03:26:36 +00:00
Jakob Stoklund Olesen 8c28ac9ec9 Update edge weights correctly in replaceSuccessor().
When replacing Old with New, it can happen that New is already a
successor. Add the old and new edge weights instead of creating a
duplicate edge.

llvm-svn: 161653
2012-08-10 03:23:27 +00:00
Rafael Espindola 740a6bc8a0 Remove references to compression in llvm-ar. It has been a long time since we
switched from a bytecode+bzip2 to the current bitcode.

llvm-svn: 161651
2012-08-10 01:57:52 +00:00
Jakob Stoklund Olesen d9b66506a3 Reapply r161633-161634 "Partition use lists so defs always come before uses.""
No changes to these patches, MRI needed to be notified when changing
uses into defs and vice versa.

llvm-svn: 161644
2012-08-10 00:21:30 +00:00
Jakob Stoklund Olesen ae7b9711b1 Also update MRI use lists when changing a use to a def and vice versa.
This was the cause of the buildbot failures.

llvm-svn: 161643
2012-08-10 00:21:26 +00:00
Chad Rosier 09f74b5517 [ms-inline asm] Add a new Inline Asm Non-Standard Dialect attribute.
This new attribute is intended to be used by the backend to determine how
the inline asm string should be parsed/printed. This patch adds the 
ia_nsdialect attribute and also adds a test case to ensure the IR is
correctly parsed, but there is no functional change at this time.

The standard dialect is assumed to be AT&T.  Therefore, this attribute
should only be added to MS-style inline assembly statements, which use
the Intel dialect.  If we ever support more dialects we'll need to
add additional state to the attribute.

llvm-svn: 161641
2012-08-10 00:00:22 +00:00
Jakob Stoklund Olesen acd27c9279 Revert r161633-161634 "Partition use lists so defs always come before uses."
These commits broke a number of buildbots.

llvm-svn: 161640
2012-08-09 23:31:36 +00:00
Jakob Stoklund Olesen df01e00710 Partition use lists so defs always come before uses.
This makes it possible to speed up def_iterator by stopping at the first
use. This makes def_empty() and getUniqueVRegDef() much faster when
there are many uses.

In a +Asserts build, LiveVariables is 100x faster in one case because
getVRegDef() has an assertion that would scan to the end of a
def_iterator chain.

Spill weight calculation is significantly faster (300x in one case)
because isTriviallyReMaterializable() calls MRI->isConstantPhysReg(%RIP)
which calls def_empty(%RIP).

llvm-svn: 161634
2012-08-09 22:49:46 +00:00
Jakob Stoklund Olesen 7d7051ca3c Don't use pointer-pointers for the register use lists.
Use a more conventional doubly linked list where the Prev pointers form
a cycle. This means it is no longer necessary to adjust the Prev
pointers when reallocating the VRegInfo array.

The test changes are required because the register allocation hint is
using the use-list order to break ties.

llvm-svn: 161633
2012-08-09 22:49:42 +00:00
Jakob Stoklund Olesen c4102d4902 Move use list management into MachineRegisterInfo.
Register MachineOperands are kept in linked lists accessible via MRI's
reg_iterator interfaces. The linked list management was handled partly
by MachineOperand methods, partly by MRI methods.

Move all of the list management into MRI, delete
MO::AddRegOperandToRegInfo() and MO::RemoveRegOperandFromRegInfo().

Be more explicit about handling the cases where an MRI pointer isn't
available.

llvm-svn: 161632
2012-08-09 22:49:37 +00:00
Eric Christopher 6ac277ce91 Remove getARMRegisterNumbering and replace with calls into
the register info for getEncodingValue. This builds on the
small patch of yesterday to set HWEncoding in the register
file.

One (deprecated) use was turned into a hard number to avoid
needing register info in the old JIT.

llvm-svn: 161628
2012-08-09 22:10:21 +00:00
Jakob Stoklund Olesen 420798ca4f Fix a future TwoAddressInstructionPass crash.
No test case, the crash only happens when the default use list order is
changed.

llvm-svn: 161627
2012-08-09 22:08:26 +00:00
Jakob Stoklund Olesen 4238a89db8 Don't modify MO while use_iterator is still pointing to it.
llvm-svn: 161626
2012-08-09 22:08:24 +00:00
Chad Rosier 9cb988f3aa [ms-inline asm] Extend the MC AsmParser API to match MCInsts (but not emit).
This new API will be used by clang to parse ms-style inline asms.

One goal of this project is to use this style of inline asm for targets other
then x86.  Therefore, this API needs to be implemented for non-x86 targets at
some point in the future.

llvm-svn: 161624
2012-08-09 22:04:55 +00:00
Jack Carter 120a30a732 Another 32 to 64 bit sign extension bug.
The fields in the td definition were switched.

llvm-svn: 161607
2012-08-09 19:43:18 +00:00
Arnold Schwaighofer 81b2eec1ab Patch to implement UMLAL/SMLAL instructions for the ARM architecture
This patch corrects the definition of umlal/smlal instructions and adds support
for matching them to the ARM dag combiner.

Bug 12213

Patch by Yin Ma!

llvm-svn: 161581
2012-08-09 15:25:52 +00:00
Nadav Rotem e0f84d31c8 Fix the legalization of ExtLoad on ARM. ExpandUnalignedLoad did not properly
handle the cases where the memory value type was illegal. 
PR 13111. 

llvm-svn: 161565
2012-08-09 01:56:44 +00:00
Eric Christopher 245f9b5552 This field isn't used anymore, use it with HWEncoding instead.
llvm-svn: 161564
2012-08-09 01:39:32 +00:00
Jim Grosbach bf387df302 Move [SU]LEB128 encoding to a utility header.
These functions are very generic. There's no reason for them to
be tied to MCObjectWriter.

llvm-svn: 161545
2012-08-08 23:56:06 +00:00
Jakob Stoklund Olesen 978c1280a5 Don't use getNextOperandForReg().
This way of using getNextOperandForReg() was unlikely to work as
intended. We don't give any guarantees about the order of operands in
the use-def chains, so looking only at operands following a given
operand in the chain doesn't make sense.

llvm-svn: 161542
2012-08-08 23:44:04 +00:00
Jakob Stoklund Olesen f71bc7b267 Don't use getNextOperandForReg() in RAFast.
That particular optimization was probably premature anyway.

llvm-svn: 161541
2012-08-08 23:44:01 +00:00
Jakob Stoklund Olesen bf1ac4bdc3 Deal with irreducible control flow when building traces.
We filter out MachineLoop back-edges during the trace-building PO
traversals, but it is possible to have CFG cycles that aren't natural
loops, and MachineLoopInfo doesn't include such cycles.

Use a standard visited set to detect such CFG cycles, and completely
ignore them when picking traces.

llvm-svn: 161532
2012-08-08 22:12:01 +00:00
Jakob Stoklund Olesen fa8a26f9df Heed -stress-early-ifcvt.
llvm-svn: 161513
2012-08-08 18:24:23 +00:00
Jakob Stoklund Olesen e71b6c6b20 Get the MispredictPenalty from MCSchedModel.
Thanks, Andy!

llvm-svn: 161507
2012-08-08 18:19:58 +00:00
Rafael Espindola cb7eadfe7e Typedefs and indentation fixes from the Andy Zhang/PAX macro argument patch.
Committing it first as it makes the "real" patch a lot easier to read.

llvm-svn: 161491
2012-08-08 14:51:03 +00:00
Anton Korobeynikov cc8c539300 Fix for .pdata and .xdata section attributes on COFF.
Patch by kai@redstar.de !

llvm-svn: 161487
2012-08-08 12:46:46 +00:00
Bill Wendling ce4fe41d21 Add `.pushsection', `.popsection', and `.previous' directives to Darwin ASM.
There are situations where inline ASM may want to change the section -- for
instance, to create a variable in the .data section. However, it cannot do this
without (potentially) restoring to the wrong section. E.g.:

  asm volatile (".section __DATA, __data\n\t"
                ".globl _fnord\n\t"
                "_fnord: .quad 1f\n\t"
                ".text\n\t"
                "1:" :::);

This may be wrong if this is inlined into a function that has a "section"
attribute. The user should use `.pushsection' and `.popsection' here instead.

The addition of `.previous' is added for completeness.
<rdar://problem/12048387>

llvm-svn: 161477
2012-08-08 06:30:30 +00:00
Andrew Trick 352abc19a5 Added MispredictPenalty to SchedMachineModel.
This replaces an existing subtarget hook on ARM and allows standard
CodeGen passes to potentially use the property.

llvm-svn: 161471
2012-08-08 02:44:16 +00:00
Andrew Trick db9b1b5e66 Minor cleanup of defaultDefLatency API
llvm-svn: 161470
2012-08-08 02:44:11 +00:00
Andrew Trick 207c569cf3 whitespace
llvm-svn: 161469
2012-08-08 02:44:08 +00:00
Eli Friedman 08ec0a8122 isAllocLikeFn is allowed to return true for functions which read memory; make
sure we account for that correctly in DeadStoreElimination.  Fixes a regression
from r158919.  PR13547.

llvm-svn: 161468
2012-08-08 02:17:32 +00:00
Jakob Stoklund Olesen 0556be983d Revert "Fix a quadratic algorithm in MachineBranchProbabilityInfo."
It caused an assertion failure when compiling consumer-typeset.

llvm-svn: 161463
2012-08-08 01:10:31 +00:00
Manman Ren 1be131ba27 X86: enable CSE between CMP and SUB
We perform the following:
1> Use SUB instead of CMP for i8,i16,i32 and i64 in ISel lowering.
2> Modify MachineCSE to correctly handle implicit defs.
3> Convert SUB back to CMP if possible at peephole.

Removed pattern matching of (a>b) ? (a-b):0 and like, since they are handled
by peephole now.

rdar://11873276

llvm-svn: 161462
2012-08-08 00:51:41 +00:00
Jakob Stoklund Olesen 3b9a442841 Don't scan physreg use-def chains looking for a PIC base.
We can't rematerialize a PIC base after register allocation anyway, and
scanning physreg use-def chains is very expensive in a function with
many calls.

<rdar://problem/12047515>

llvm-svn: 161461
2012-08-08 00:40:47 +00:00
Jakob Stoklund Olesen c0b61ff9c7 Fix a quadratic algorithm in MachineBranchProbabilityInfo.
The getSumForBlock function was quadratic in the number of successors
because getSuccWeight would perform a linear search for an already known
iterator.

llvm-svn: 161460
2012-08-08 00:20:37 +00:00
Dan Gohman b948736002 Avoid recomputing the unique exit blocks and their insert points when doing
multiple scalar promotions on a single loop. This also has the effect of
preserving the order of stores sunk out of loops, which is aesthetically
pleasing, and it happens to fix the testcase in PR13542, though it doesn't
fix the underlying problem.

llvm-svn: 161459
2012-08-08 00:00:26 +00:00
Jakob Stoklund Olesen fbf45dc2bd Skip tied operand pairs that already have the same register.
llvm-svn: 161454
2012-08-07 22:47:06 +00:00
Jakob Stoklund Olesen 505715d816 Add SelectionDAG::getTargetIndex.
This adds support for TargetIndex operands during isel. The meaning of
these (index, offset, flags) operands is entirely defined by the target.

llvm-svn: 161453
2012-08-07 22:37:05 +00:00
Bob Wilson 61f3ad5759 Fix a serious typo in InstCombine's optimization of comparisons.
An unsigned value converted to floating-point will always be greater than
a negative constant.  Unfortunately InstCombine reversed the check so that
unsigned values were being optimized to always be greater than all positive
floating-point constants.  <rdar://problem/12029145>

llvm-svn: 161452
2012-08-07 22:35:16 +00:00
Evan Cheng fbdd25c135 X86 cmp lowering is looking past truncate on the condition node. It should only
do so when the high bits are known zero. This caused a subtle miscompilation.

rdar://12027825 

llvm-svn: 161451
2012-08-07 22:21:00 +00:00
Bill Wendling 61396b81a4 For non-Darwin platforms, we want to generate stack protectors only for
character arrays. This is in line with what GCC does.
<rdar://problem/10529227>

llvm-svn: 161446
2012-08-07 20:59:05 +00:00
Jakob Stoklund Olesen 84689b0d5a Add a new kind of MachineOperand: MO_TargetIndex.
A target index operand looks a lot like a constant pool reference, but
it is completely target-defined. It contains the 8-bit TargetFlags, a
32-bit index, and a 64-bit offset. It is preserved by all code generator
passes.

TargetIndex operands can be used to carry target-specific information in
cases where immediate operands won't suffice.

llvm-svn: 161441
2012-08-07 18:56:39 +00:00
Andrew Kaylor 1a568c3a4f Enable lazy compilation in MCJIT
llvm-svn: 161438
2012-08-07 18:33:00 +00:00
Jakob Stoklund Olesen 296448b293 Fix a couple of typos.
llvm-svn: 161437
2012-08-07 18:32:57 +00:00
Jakob Stoklund Olesen 75d9d5159e Add trace accessor methods, implement primitive if-conversion heuristic.
Compare the critical paths of the two traces through an if-conversion
candidate. If the difference is larger than the branch brediction
penalty, reject the if-conversion. If would never pay.

llvm-svn: 161433
2012-08-07 18:02:19 +00:00
Rafael Espindola 59564079e9 The dominance computation already has logic for computing if an edge dominates
a use or a BB, but it is inline in the handling of the invoke instruction.

This patch refactors it so that it can be used in other cases. For example, in

define i32 @f(i32 %x) {
bb0:
  %cmp = icmp eq i32 %x, 0
  br i1 %cmp, label %bb2, label %bb1
bb1:
  br label %bb2
bb2:
  %cond = phi i32 [ %x, %bb0 ], [ 0, %bb1 ]
  %foo = add i32 %cond, %x
  ret i32 %foo
}

GVN should be able to replace %x with 0 in any use that is dominated by the
true edge out of bb0. In the above example the only such use is the one in
the phi.

llvm-svn: 161429
2012-08-07 17:30:46 +00:00
Hal Finkel 895a5f5d12 Add a comment about mftb vs. mfspr on PPC.
Thanks to Alex Rosenberg for the suggestion.

llvm-svn: 161428
2012-08-07 17:04:20 +00:00
Alexey Samsonov 947228c4f7 Fix the representation of debug line table in DebugInfo LLVM library,
and "instruction address -> file/line" lookup.

Instead of plain collection of rows, debug line table for compilation unit is now
treated as the number of row ranges, describing sequences (series of contiguous machine
instructions). The sequences are not always listed in the order of increasing
address, so previously used std::lower_bound() sometimes produced wrong results.
Now the instruction address lookup consists of two stages: finding the correct
sequence, and searching for address in range of rows for this sequence.

llvm-svn: 161414
2012-08-07 11:46:57 +00:00
Benjamin Kramer c99d0e9186 PR13095: Give an inline cost bonus to functions using byval arguments.
We give a bonus for every argument because the argument setup is not needed
anymore when the function is inlined. With this patch we interpret byval
arguments as a compact representation of many arguments. The byval argument
setup is implemented in the backend as an inline memcpy, so to model the
cost as accurately as possible we take the number of pointer-sized elements
in the byval argument and give a bonus of 2 instructions for every one of
those. The bonus is capped at 8 elements, which is the number of stores
at which the x86 backend switches from an expanded inline memcpy to a real
memcpy. It would be better to use the real memcpy threshold from the backend,
but it's not available via TargetData.

This change brings the performance of c-ray in line with gcc 4.7. The included
test case tries to reproduce the c-ray problem to catch regressions for this
benchmark early, its performance is dominated by the inline decision of a
specific call.

This only has a small impact on most code, more on x86 and arm than on x86_64
due to the way the ABI works. When building LLVM for x86 it gives a small
inline cost boost to virtually any function using StringRef or STL allocators,
but only a 0.01% increase in overall binary size. The size of gcc compiled by
clang actually shrunk by a couple bytes with this patch applied, but not
significantly.

llvm-svn: 161413
2012-08-07 11:13:19 +00:00
Chandler Carruth 2f6cf4884c Fix PR13412, a nasty miscompile due to the interleaved
instsimplify+inline strategy.

The crux of the problem is that instsimplify was reasonably relying on
an invariant that is true within any single function, but is no longer
true mid-inline the way we use it. This invariant is that an argument
pointer != a local (alloca) pointer.

The fix is really light weight though, and allows instsimplify to be
resiliant to these situations: when checking the relation ships to
function arguments, ensure that the argumets come from the same
function. If they come from different functions, then none of these
assumptions hold. All credit to Benjamin Kramer for coming up with this
clever solution to the problem.

llvm-svn: 161410
2012-08-07 10:59:59 +00:00
Chandler Carruth 881d0a7966 Add a much more conservative strategy for aligning branch targets.
Previously, MBP essentially aligned every branch target it could. This
bloats code quite a bit, especially non-looping code which has no real
reason to prefer aligned branch targets so heavily.

As Andy said in review, it's still a bit odd to do this without a real
cost model, but this at least has much more plausible heuristics.

Fixes PR13265.

llvm-svn: 161409
2012-08-07 09:45:24 +00:00
Manman Ren cb36b8c2e6 MachineCSE: Update the heuristics for isProfitableToCSE.
If the result of a common subexpression is used at all uses of the candidate
expression, CSE should not increase the live range of the common subexpression.

rdar://11393714 and rdar://11819721

llvm-svn: 161396
2012-08-07 06:16:46 +00:00
Bill Wendling 0acd0c0a69 Revert r161371. Removing the 'const' before Type is a "good thing".
--- Reverse-merging r161371 into '.':
U    include/llvm/Target/TargetData.h
U    lib/Target/TargetData.cpp

llvm-svn: 161394
2012-08-07 05:51:59 +00:00
Jack Carter f4946cfbb9 The define for 64 bit sign extension neglected to
initialize fields of the class that it used.

The result was nonsense code.

Before:
0000000000000000 <foo>:
   0:    00441100     0x441100
   4:    03e00008     jr    ra
   8:    00000000     nop

After:
0000000000000000 <foo>:
   0:    00041000     sll    v0,a0,0x0
   4:    03e00008     jr    ra
   8:    00000000     nop 

llvm-svn: 161377
2012-08-07 00:35:22 +00:00
Bill Wendling 654cd4aaee Constify the Type parameter to some methods (which are const anyway).
llvm-svn: 161371
2012-08-07 00:26:35 +00:00
Andrew Trick e0c83b1f3b Allow x86 subtargets to use the GenericModel defined in X86Schedule.td.
This allows codegen passes to query properties like
InstrItins->SchedModel->IssueWidth. It also ensure's that
computeOperandLatency returns the X86 defaults for loads and "high
latency ops". This should have no significant impact on existing
schedulers because X86 defaults happen to be the same as global
defaults.

llvm-svn: 161370
2012-08-07 00:25:30 +00:00
Jack Carter 4c58381c3a Mips relocation R_MIPS_64 relocates a 64 bit double word.
I hit this in a very large program (spirit.cpp), but 
have not figured out how to make a small make check
test for it.

llvm-svn: 161366
2012-08-07 00:01:14 +00:00
Jack Carter 612c66314c The Mips64InstrInfo.td definitions DynAlloc64 LEA_ADDiu64
were using a class defined for 32 bit instructions and 
thus the instruction was for addiu instead of daddiu.

This was corrected by adding the instruction opcode as a 
field in the  base class to be filled in by the defs.

llvm-svn: 161359
2012-08-06 23:29:06 +00:00
Jack Carter 84491abb20 Mips relocations R_MIPS_HIGHER and R_MIPS_HIGHEST.
These 2 relocations gain access to the 
highest and the second highest 16 bits
of a 64 bit object.

R_MIPS_HIGHER %higher(A+S)
The %higher(x) function is [ (((long long) x + 0x80008000LL) >> 32) & 0xffff ]. 

R_MIPS_HIGHEST %highest(A+S)
The %highest(x) function is [ (((long long) x + 0x800080008000LL) >> 48) & 0xffff ]. 

llvm-svn: 161348
2012-08-06 21:26:03 +00:00
Hal Finkel 33e529d56b MFTB on PPC64 should really be encoded using MFSPR.
The MFTB instruction itself is being phased out, and its functionality
is provided by MFSPR. According to the ISA docs, using MFSPR works on all known
chips except for the 601 (which did not have a timebase register anyway)
and the POWER3.

Thanks to Adhemerval Zanella for pointing this out!

llvm-svn: 161346
2012-08-06 21:21:44 +00:00
Eric Christopher 22738d00a3 Add support for the OpenBSD for Bitrig.
Patch by David Hill.

llvm-svn: 161344
2012-08-06 20:52:18 +00:00
Roman Divacky 7d6e08560b Remove empty overrides of processFunctionBeforeFrameFinalized().
llvm-svn: 161328
2012-08-06 18:14:18 +00:00
Craig Topper ab47fe4e16 Implement proper handling for pcmpistri/pcmpestri intrinsics. Requires custom handling in DAGISelToDAG due to limitations in TableGen's implicit def handling. Fixes PR11305.
llvm-svn: 161318
2012-08-06 06:22:36 +00:00
Craig Topper 6d0408d3a5 Remove custom inserter for MWAIT. It doesn't do anything that couldn't be represented in a pattern.
llvm-svn: 161306
2012-08-05 00:36:57 +00:00
Craig Topper 43ee9fae92 Use a COPY node instead of an explicit MOVA opcode in the custom insterter for pcmpestrm/pcmpistrm. Allows the register allocator to handle it better and prevent wasted identity moves.
llvm-svn: 161305
2012-08-05 00:17:48 +00:00
Hal Finkel 70381a7b18 Add readcyclecounter lowering on PPC64.
On PPC64, this can be done with a simple TableGen pattern.
To enable this, I've added the (otherwise missing) readcyclecounter
SDNode definition to TargetSelectionDAG.td.

llvm-svn: 161302
2012-08-04 14:10:46 +00:00
Anton Korobeynikov ef731edf53 Skip impdef regs during eabi save/restore list emission to workaround PR11902
llvm-svn: 161301
2012-08-04 13:25:58 +00:00
Anton Korobeynikov 3a4fdfeceb Recognize vst1.64 / vld1.64 with 3 and 4 regs as load from / store to stack stuff
(this corresponds by spilling/reloading regs in DTriple / DQuad reg classes).
No testcase, found by inspection.

llvm-svn: 161300
2012-08-04 13:22:14 +00:00
Anton Korobeynikov 218aaf6d04 Add stack spill / reload instructions for DTriple and DQuad register classes, which
were missed for no reason. This fixes PR13377

llvm-svn: 161299
2012-08-04 13:16:12 +00:00
Benjamin Kramer 3849fcbe0e Postpone the deletion of the old name in StructType::setName to allow using a slice of the old name.
Fixes PR13522. Add a rudimentary unit test to exercise the behavior.

llvm-svn: 161296
2012-08-04 09:47:02 +00:00
Jakob Stoklund Olesen a9d0b850b3 Delete a dead variable.
TwoAddressInstructionPass doesn't remat any more.

llvm-svn: 161285
2012-08-04 00:04:03 +00:00
Jakob Stoklund Olesen a0c72ecf79 TwoAddressInstructionPass refactoring: Extract another method.
llvm-svn: 161284
2012-08-03 23:57:58 +00:00
Bob Wilson 874886cd66 Refactor and check "onlyReadsMemory" before optimizing builtins.
This patch is mostly just refactoring a bunch of copy-and-pasted code, but
it also adds a check that the call instructions are readnone or readonly.
That check was already present for sin, cos, sqrt, log2, and exp2 calls, but
it was missing for the rest of the builtins being handled in this code.

llvm-svn: 161282
2012-08-03 23:29:17 +00:00
Jakob Stoklund Olesen 1162a1548b TwoAddressInstructionPass refactoring: Extract a method.
No functional change intended, except replacing a DenseMap with a
SmallDenseMap which should behave identically.

llvm-svn: 161281
2012-08-03 23:25:45 +00:00
Jakob Stoklund Olesen 24bc514c0c Begin adding support for updating LiveIntervals in TwoAddressInstructionPass.
This is far from complete, and only changes behavior when the
-early-live-intervals flag is passed to llc.

llvm-svn: 161273
2012-08-03 22:58:34 +00:00
Akira Hatanaka 22bec282e9 1. Redo mips16 instructions to avoid multiple opcodes for same instruction.
Change these to patterns.
2. Add another 16 instructions.

Patch by Reed Kotler.

llvm-svn: 161272
2012-08-03 22:57:02 +00:00
Jakob Stoklund Olesen 1c46589290 Add an experimental -early-live-intervals option.
This option runs LiveIntervals before TwoAddressInstructionPass which
will eventually learn to exploit and update the analysis.

Eventually, LiveIntervals will run before PHIElimination, and we can get
rid of LiveVariables.

llvm-svn: 161270
2012-08-03 22:12:54 +00:00
Jakob Stoklund Olesen 918999db95 Delete merged physreg copies in joinReservedPhysReg().
Previously, the identity copy would survive through register allocation
before it was removed by the rewriter.

llvm-svn: 161269
2012-08-03 22:12:51 +00:00
Bob Wilson 871701c606 Try to reduce the compile time impact of r161232.
The previous change caused fast isel to not attempt handling any calls to
builtin functions.  That included things like "printf" and caused some
noticable regressions in compile time.  I wanted to avoid having fast isel
keep a separate list of functions that had to be kept in sync with what the
code in SelectionDAGBuilder.cpp was handling.  I've resolved that here by
moving the list into TargetLibraryInfo.  This is somewhat redundant in
SelectionDAGBuilder but it will ensure that we keep things consistent.

llvm-svn: 161263
2012-08-03 21:26:24 +00:00
Bob Wilson fa59485b94 Fix memcmp code-gen to honor -fno-builtin.
I noticed that SelectionDAGBuilder::visitCall was missing a check for memcmp
in TargetLibraryInfo, so that it would use custom code for memcmp calls even
with -fno-builtin.  I also had to add a new -disable-simplify-libcalls option
to llc so that I could write a test for this.

llvm-svn: 161262
2012-08-03 21:26:18 +00:00
Jakob Stoklund Olesen daae19f785 Completely eliminate VNInfo flags.
The 'unused' state of a value number can be represented as an invalid
def SlotIndex. This also exposed code that shouldn't have been looking
at unused value VNInfos.

llvm-svn: 161258
2012-08-03 20:59:32 +00:00
Jakob Stoklund Olesen 21809385a6 Fix a couple of loops that were processing unused value numbers.
Unused VNInfos should be left alone. Their def SlotIndex doesn't point
to anything.

llvm-svn: 161257
2012-08-03 20:59:29 +00:00
Matt Beaumont-Gay aaba08d503 Silence unused variable warning in -asserts build
llvm-svn: 161256
2012-08-03 20:54:11 +00:00
Jakob Stoklund Olesen 9f565e19c5 Eliminate the VNInfo::hasPHIKill() flag.
The only real user of the flag was removeCopyByCommutingDef(), and it
has been switched to LiveIntervals::hasPHIKill().

All the code changed by this patch was only concerned with computing and
propagating the flag.

llvm-svn: 161255
2012-08-03 20:19:44 +00:00
Jakob Stoklund Olesen 06d6a5363b Make the hasPHIKills flag a computed property.
The VNInfo::HAS_PHI_KILL is only half supported. We precompute it in
LiveIntervalAnalysis, but it isn't properly updated by live range
splitting and functions like shrinkToUses().

It is only used in one place: RegisterCoalescer::removeCopyByCommutingDef().

This patch changes that function to use a new LiveIntervals::hasPHIKill()
function that computes the flag for a given value number.

llvm-svn: 161254
2012-08-03 20:10:24 +00:00
Jakob Stoklund Olesen 19c4596629 Delete dead function.
llvm-svn: 161242
2012-08-03 15:21:21 +00:00
Jakob Stoklund Olesen 47ac20d4d6 Don't delete dead code in TwoAddressInstructionPass.
This functionality was added before we started running
DeadMachineInstructionElim on all targets. It serves no purpose now.

llvm-svn: 161241
2012-08-03 15:11:57 +00:00
Gabor Greif a1529b6ca4 allow 'make CPPFLAGS=<something>' work again
this makes this hack a bit more bearable
for poor souls who need to pass custom
preprocessor flags to the build process

llvm-svn: 161240
2012-08-03 13:31:24 +00:00
Bob Wilson 3e6fa462f3 Fall back to selection DAG isel for calls to builtin functions.
Fast isel doesn't currently have support for translating builtin function
calls to target instructions.  For embedded environments where the library
functions are not available, this is a matter of correctness and not
just optimization.  Most of this patch is just arranging to make the
TargetLibraryInfo available in fast isel.  <rdar://problem/12008746>

llvm-svn: 161232
2012-08-03 04:06:28 +00:00
Bob Wilson c740e3f0d1 Add new getLibFunc method to TargetLibraryInfo.
This just provides a way to look up a LibFunc::Func enum value for a
function name.  Alphabetize the enums and function names so we can use a
binary search.

llvm-svn: 161231
2012-08-03 04:06:22 +00:00
Jush Lu 4705da9020 [arm-fast-isel] Add support for shl, lshr, and ashr.
llvm-svn: 161230
2012-08-03 02:37:48 +00:00
Bill Wendling 8555a37c04 Move the "findUsedStructTypes" functionality outside of the Module class.
The "findUsedStructTypes" method is very expensive to run. It needs to be
optimized so that LTO can run faster. Splitting this method out of the Module
class will help this occur. For instance, it can keep a list of seen objects so
that it doesn't process them over and over again.

llvm-svn: 161228
2012-08-03 00:30:35 +00:00
Eric Christopher b3322364e4 Add support for the ARM GHC calling convention, this patch was in 3.0,
but somehow managed to be dropped later.

Patch by Karel Gardas.

llvm-svn: 161226
2012-08-03 00:05:53 +00:00
Jim Grosbach 5d6d015969 ARM: Tidy up. Remove unused template parameters.
llvm-svn: 161222
2012-08-02 22:08:27 +00:00
Jim Grosbach b79c33ef55 ARM: More InstAlias refactors to use #NAME#.
llvm-svn: 161220
2012-08-02 21:59:52 +00:00
Jim Grosbach 6d27ad62a8 ARM: Refactor instaliases using TableGen support for #NAME#.
Now that TableGen supports references to NAME w/o it being explicitly
referenced in the definition's own name, use that to simplify
assembly InstAlias definitions in multiclasses.

llvm-svn: 161218
2012-08-02 21:50:41 +00:00
Manman Ren ba8122cc25 X86 Peephole: fold loads to the source register operand if possible.
Add more comments and use early returns to reduce nesting in isLoadFoldable.
Also disable folding for V_SET0 to avoid introducing a const pool entry and
a const pool load.

rdar://10554090 and rdar://11873276

llvm-svn: 161207
2012-08-02 19:37:32 +00:00
Jim Grosbach bc5b61c74d TableGen: Allow use of #NAME# outside of 'def' names.
Previously, def NAME values were only populated, and references to NAME
resolved, when NAME was referenced in the 'def' entry of the multiclass
sub-entry. e.g.,
multiclass foo<...> {
  def prefix_#NAME : ...
}

It's useful, however, to be able to reference NAME even when the default
def name is used. For example, when a multiclass has 'def : Pat<...>'
or 'def : InstAlias<...>' entries which refer to earlier instruction
definitions in the same multiclass. e.g.,
multiclass myMulti<RegisterClass rc> {
  def _r : myI<(outs rc:$d), (ins rc:$r), "r $d, $r", []>;

  def : InstAlias<\"wilma $r\", (!cast<Instruction>(NAME#\"_r\") rc:$r, rc:$r)>;
}

llvm-svn: 161198
2012-08-02 18:46:42 +00:00
Jakob Stoklund Olesen 5d30630e22 Compute the critical path length through a trace.
Whenever both instruction depths and instruction heights are known in a
block, it is possible to compute the length of the critical path as
max(depth+height) over the instructions in the block.

The stored live-in lists make it possible to accurately compute the
length of a critical path that bypasses the current (small) block.

llvm-svn: 161197
2012-08-02 18:45:54 +00:00
Akira Hatanaka fab8929459 Move the code that creates instances of MipsInstrInfo and MipsFrameLowering out
of MipsTargetMachine.cpp.

llvm-svn: 161191
2012-08-02 18:21:47 +00:00
Akira Hatanaka fffad897f2 Set transient stack alignment in constructor of MipsFrameLowering and re-enable
test o32_cc_vararg.ll.

llvm-svn: 161189
2012-08-02 18:15:13 +00:00
Jakob Stoklund Olesen 637c467528 Verify regunit intervals along with virtreg intervals.
Don't cause regunit intervals to be computed just to verify them. Only
check the already cached intervals.

llvm-svn: 161183
2012-08-02 16:36:50 +00:00
Jakob Stoklund Olesen 374071dde2 Avoid creating dangling physreg live ranges during DCE.
LiveRangeEdit::eliminateDeadDefs() can delete a dead instruction that
reads unreserved physregs. This would leave the corresponding regunit
live interval dangling because we don't have shrinkToUses() for physical
registers.

Fix this problem by turning the instruction into a KILL instead of
deleting it. This happens in a landing pad in
test/CodeGen/X86/2012-05-19-CoalescerCrash.ll:

  %vreg27<def,dead> = COPY %EDX<kill>; GR32:%vreg27

becomes:

  KILL %EDX<kill>

An upcoming fix to the machine verifier will catch problems like this by
verifying regunit live intervals.

This fixes PR13498. I am not including the test case from the PR since
we already have one exposing the problem once the verifier is fixed.

llvm-svn: 161182
2012-08-02 16:36:47 +00:00
Jakob Stoklund Olesen bde5dc5e46 Add report() functions that take a LiveInterval argument.
llvm-svn: 161178
2012-08-02 14:31:49 +00:00
Hongbin Zheng bb1d209210 Implement the block_iterator of Region based on df_iterator.
llvm-svn: 161177
2012-08-02 14:20:02 +00:00
Nuno Lopes 79a4424f8e JIT::runFunction(): add a fast path for functions with a single argument that is a pointer.
llvm-svn: 161171
2012-08-02 12:09:32 +00:00
Jiangning Liu fa18005a4c Support fpv4 for ARM Cortex-M4.
llvm-svn: 161163
2012-08-02 08:35:55 +00:00
Jiangning Liu 6a43bf7d74 Fix #13035, a bug around Thumb instruction LDRD/STRD with negative #0 offset index issue.
llvm-svn: 161162
2012-08-02 08:29:50 +00:00
Jiangning Liu 288e1af8c8 Fix #13138, a bug around ARM instruction DSB encoding and decoding issue.
llvm-svn: 161161
2012-08-02 08:21:27 +00:00
Jiangning Liu 10dd40e42d Fix #13241, a bug around shift immediate operand for ARM instruction ADR.
llvm-svn: 161159
2012-08-02 08:13:13 +00:00
Manman Ren 5759d01230 X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

This patch is a rework of r160919 and was tested on clang self-host on my local
machine.

rdar://10554090 and rdar://11873276

llvm-svn: 161152
2012-08-02 00:56:42 +00:00
Jakob Stoklund Olesen e736b97eff Extract some methods from verifyLiveIntervals.
No functional change.

llvm-svn: 161149
2012-08-02 00:20:20 +00:00
Jakob Stoklund Olesen a766b4746d Also verify RegUnit intervals at uses.
llvm-svn: 161147
2012-08-01 23:52:40 +00:00
Manman Ren 4059145396 X86: mark GATHER instructios as mayLoad
llvm-svn: 161143
2012-08-01 23:28:59 +00:00
Jakob Stoklund Olesen 2db6b65330 Compute instruction heights through a trace.
The height on an instruction is the minimum number of cycles from the
instruction is issued to the end of the trace. Heights are computed for
all instructions in and below the trace center block.

The method for computing heights is different from the depth
computation. As we visit instructions in the trace bottom-up, heights of
used instructions are pushed upwards. This way, we avoid scanning long
use lists, looking for uses in the current trace.

At each basic block boundary, a list of live-in registers and their
minimum heights is saved in the trace block info. These live-in lists
are used when restarting depth computations on a trace that
converges with an already computed trace. They will also be used to
accurately compute the critical path length.

llvm-svn: 161138
2012-08-01 22:36:00 +00:00
Jim Grosbach 8724d0fd99 ARM: Remove redundant instalias.
llvm-svn: 161134
2012-08-01 20:33:05 +00:00
Jim Grosbach 96e8a8dc6d Clean up formatting.
llvm-svn: 161133
2012-08-01 20:33:02 +00:00
Jim Grosbach b437a8c5d5 Tidy up.
llvm-svn: 161132
2012-08-01 20:33:00 +00:00
Chad Rosier 24c19d20c0 Whitespace.
llvm-svn: 161122
2012-08-01 18:39:17 +00:00
Eric Christopher b1b9451337 Temporarily revert c23b933d5f8be9b51a1d22e717c0311f65f87dcd. It's causing
failures in the debug testsuite and possibly PR13486.

llvm-svn: 161121
2012-08-01 18:19:01 +00:00
Nuno Lopes a9a8c62714 remove tabs from my previous commit.
Sorry, not used to this editor anymore.. XCode please come back; you're forgiven :)

llvm-svn: 161120
2012-08-01 17:13:28 +00:00
Nuno Lopes e7220312c2 (hopefuly) fix the remaining cases where null wasnt expected (PR13497).
I'll commit a test to the clang tree.

llvm-svn: 161118
2012-08-01 16:58:51 +00:00
Jakob Stoklund Olesen 5e19d35e9a Add DataDep constructors. Explicitly check SSA form.
llvm-svn: 161115
2012-08-01 16:02:59 +00:00
Elena Demikhovsky 3cb3b0045c Added FMA functionality to X86 target.
llvm-svn: 161110
2012-08-01 12:06:00 +00:00
Nick Lewycky fb78083b1c Stay rational; don't assert trying to take the square root of a negative value.
If it's negative, the loop is already proven to be infinite. Fixes PR13489!

llvm-svn: 161107
2012-08-01 09:14:36 +00:00
Craig Topper b8aec08819 Add more indirection to the disassembler tables to reduce amount of space used to store the operand types and encodings. Store only the unique combinations in a separate table and store indices in the instruction table. Saves about 32K of static data.
llvm-svn: 161101
2012-08-01 07:39:18 +00:00
Nick Kledzik 5fce8c4ffe Initial commit of new FileOutputBuffer support class.
Since the llvm::sys::fs::map_file_pages() support function it relies on
is not yet implemented on Windows, the unit tests for FileOutputBuffer 
are currently conditionalized to run only on unix.

llvm-svn: 161099
2012-08-01 02:29:50 +00:00
Akira Hatanaka 0820f0ca8e Implement MipsJITInfo::replaceMachineCodeForFunction.
No new test case is added.
This patch makes test JITTest.FunctionIsRecompiledAndRelinked pass on mips
platform.

Patch by Petar Jovanovic.

llvm-svn: 161098
2012-08-01 02:29:24 +00:00
Akira Hatanaka 4a240b0f9d Remove unused variable.
llvm-svn: 161095
2012-08-01 00:37:53 +00:00
Akira Hatanaka 88d76cfd7a Implement MipsSERegisterInfo::eliminateCallFramePseudoInstr. The function emits
instructions that decrement and increment the stack pointer before and after a
call when the function does not have a reserved call frame.

llvm-svn: 161093
2012-07-31 23:52:55 +00:00
Akira Hatanaka cb37e13fa7 Add definitions of two subclasses of MipsRegisterInfo, Mips16RegisterInfo and
MipsSERegisterInfo.

llvm-svn: 161092
2012-07-31 23:41:32 +00:00
Akira Hatanaka d1c43cee24 Add definitions of two subclasses of MipsFrameLowering, Mips16FrameLowering and
MipsSEFrameLowering.

Implement MipsSEFrameLowering::hasReservedCallFrame. Call frames will not be
reserved if there is a call with a large call frame or there are variable sized
objects on the stack.

llvm-svn: 161090
2012-07-31 22:50:19 +00:00
Akira Hatanaka 2c64adf672 Add Mips16InstrInfo.cpp and MipsSEInstrInfo.cpp to CMakeLists.txt.
llvm-svn: 161083
2012-07-31 22:11:05 +00:00
Akira Hatanaka b7fa3c9db0 Add definitions of two subclasses of MipsInstrInfo, MipsInstrInfo (for mips16),
and MipsSEInstrInfo (for mips32/64).

llvm-svn: 161081
2012-07-31 21:49:49 +00:00
Akira Hatanaka 30651805c4 Delete mips64 target machine classes. mips target machines can be used in place
of them.

llvm-svn: 161080
2012-07-31 21:39:17 +00:00
Akira Hatanaka 02de0e4425 Let PEI::calculateFrameObjectOffsets compute the final stack size rather than
computing it in MipsFrameLowering::emitPrologue.

llvm-svn: 161078
2012-07-31 21:28:49 +00:00
Akira Hatanaka 33a25af5a8 Expand DYNAMIC_STACKALLOC nodes rather than doing custom-lowering.
The frame object which points to the dynamically allocated area will not be
needed after changes are made to cease reserving call frames.

llvm-svn: 161076
2012-07-31 20:54:48 +00:00
Manman Ren f288d2f120 MachineSink: Sort the successors before trying to find SuccToSinkTo.
Use stable_sort instead of sort. Follow-up to r161062.

rdar://11980766

llvm-svn: 161075
2012-07-31 20:45:38 +00:00
Jakob Stoklund Olesen 059e647c6d Compute instruction depths through the current trace.
Assuming infinite issue width, compute the earliest each instruction in
the trace can issue, when considering the latency of data dependencies.
The issue cycle is record as a 'depth' from the beginning of the trace.

This is half the computation required to find the length of the critical
path through the trace. Heights are next.

llvm-svn: 161074
2012-07-31 20:44:38 +00:00
Jakob Stoklund Olesen 1dfb101835 Rename CT -> MTM. MachineTraceMetrics is abbreviated MTM.
llvm-svn: 161072
2012-07-31 20:25:13 +00:00
Akira Hatanaka a66d676b20 Define ADJCALLSTACKDOWN/UP nodes. These nodes are emitted regardless of whether
or not it is in mips16 mode. Define MipsPseudo (mode-independant pseudo) and
PseudoSE (mips32/64 pseudo) classes.

llvm-svn: 161071
2012-07-31 19:13:07 +00:00
Akira Hatanaka 3a810eda91 Change name of class MipsInst to InstSE to distinguish it from mips16's
instruction class. SE stands for standard encoding.

llvm-svn: 161069
2012-07-31 18:55:01 +00:00
Akira Hatanaka beda2241a4 When store nodes or memcpy nodes are created to copy the function call
arguments to the stack in MipsISelLowering::LowerCall, use stack pointer and
integer offset operands rather than frame object operands.

llvm-svn: 161068
2012-07-31 18:46:41 +00:00
Chad Rosier 710be7df71 [x86 frame lowering] In 32-bit mode, use ESI as the base pointer.
Previously, we were using EBX, but PIC requires the GOT to be in EBX before 
function calls via PLT GOT pointer.

llvm-svn: 161066
2012-07-31 18:29:21 +00:00
Akira Hatanaka 4ce7c4060d Fix type of LUXC1 and SUXC1. These instructions were incorrectly defined as
single-precision load and store.

Also avoid selecting LUXC1 and SUXC1 instructions during isel. It is incorrect
to map unaligned floating point load/store nodes to these instructions.

llvm-svn: 161063
2012-07-31 18:16:49 +00:00
Manman Ren 8c549b586c MachineSink: Sort the successors before trying to find SuccToSinkTo.
One motivating example is to sink an instruction from a basic block which has
two successors: one outside the loop, the other inside the loop. We should try
to sink the instruction outside the loop.

rdar://11980766

llvm-svn: 161062
2012-07-31 18:10:39 +00:00
Micah Villmow b67d7a3a33 Conform to LLVM coding style.
llvm-svn: 161061
2012-07-31 18:07:43 +00:00
Micah Villmow 6b12f596ef Don't generate ordered or unordered comparison operations if it is not legal to do so.
llvm-svn: 161053
2012-07-31 16:48:03 +00:00
Craig Topper c2efce404e Make INSTRUCTION_SPECIFIER_FIELDS match X86DisassemblerCommon.h. Also remove trailing whitespace.
llvm-svn: 161029
2012-07-31 05:18:26 +00:00
Craig Topper fb39f97d4c Tidy up trailing whitespace
llvm-svn: 161027
2012-07-31 04:58:05 +00:00
Craig Topper 5f33d90214 Tidy up trailing whitespace
llvm-svn: 161026
2012-07-31 04:38:27 +00:00
Jakob Stoklund Olesen 0c807dfae2 Clear kill flags in removeCopyByCommutingDef().
We are extending live ranges, so kill flags are not accurate. They
aren't needed until they are recomputed after RA anyway.

<rdar://problem/11950722>

llvm-svn: 161023
2012-07-31 02:47:24 +00:00
Manman Ren 2b6a0dfd4c Reverse order of the two branches at end of a basic block if it is profitable.
We branch to the successor with higher edge weight first.
Convert from
     je    LBB4_8  --> to outer loop
     jmp   LBB4_14 --> to inner loop
to
     jne   LBB4_14
     jmp   LBB4_8

PR12750
rdar: 11393714

llvm-svn: 161018
2012-07-31 01:11:07 +00:00
Andrew Trick 79795897b3 Use the latest MachineRegisterInfo APIs. No functionality.
llvm-svn: 161010
2012-07-30 23:48:17 +00:00
Andrew Trick 535a23c38b Inline MachineRegisterInfo::hasOneUse
llvm-svn: 161007
2012-07-30 23:48:12 +00:00
Jakob Stoklund Olesen 68c2cd059e Avoid looking at stale data in verifyAnalysis().
llvm-svn: 161004
2012-07-30 23:15:12 +00:00
Jakob Stoklund Olesen c14cf57ba9 Allow traces to enter nested loops.
This lets traces include the final iteration of a nested loop above the
center block, and the first iteration of a nested loop below the center
block.

We still don't allow traces to contain backedges, and traces are
truncated where they would leave a loop, as seen from the center block.

llvm-svn: 161003
2012-07-30 23:15:10 +00:00
Jim Grosbach 20666162e9 Keep empty assembly macro argument values in the middle of the list.
Empty macro arguments at the end of the list should be as-if not specified at
all, but those in the middle of the list need to be kept so as not to screw
up the positional numbering. E.g.:
.macro foo
foo_-bash___:
  nop
.endm

foo 1, 2, 3, 4
foo 1, , 3, 4

Should create two labels, "foo_1_2_3_4" and "foo_1__3_4".

rdar://11948769

llvm-svn: 161002
2012-07-30 22:44:17 +00:00
Jakob Stoklund Olesen 984cfe8322 Clarify invalidation strategy in comment.
llvm-svn: 160997
2012-07-30 21:16:22 +00:00
Jakob Stoklund Olesen f308c128ea Assert that all trace candidate blocks have been visited by the PO.
When computing a trace, all the candidates for pred/succ must have been
visited. Filter out back-edges first, though. The PO traversal ignores
them.

Thanks to Andy for spotting this in review.

llvm-svn: 160995
2012-07-30 21:10:27 +00:00
Jakob Stoklund Olesen a12a7d5f74 Hook into PassManager's analysis verification.
By overriding Pass::verifyAnalysis(), the pass contents will be verified
by the pass manager.

llvm-svn: 160994
2012-07-30 20:57:50 +00:00
Pete Cooper 91244268d7 Consider address spaces for hashing and CSEing DAG nodes. Otherwise two loads from different x86 segments but the same address would get CSEd
llvm-svn: 160987
2012-07-30 20:23:19 +00:00
Kevin Enderby 5c490f1b8f Fix a bug in ARMMachObjectWriter::RecordRelocation() in ARMMachObjectWriter.cpp
where the other_half of the movt and movw relocation entries needs to get set
and only with the 16 bits of the other half.

rdar://10038370

llvm-svn: 160978
2012-07-30 18:46:15 +00:00
Jakob Stoklund Olesen 7361846f32 Add MachineInstr::isTransient().
This is a cleaned up version of the isFree() function in
MachineTraceMetrics.cpp.

Transient instructions are very unlikely to produce any code in the
final output. Either because they get eliminated by RegisterCoalescing,
or because they are pseudo-instructions like labels and debug values.

llvm-svn: 160977
2012-07-30 18:34:14 +00:00
Jakob Stoklund Olesen 3df6c46fdd Add MachineTraceMetrics::verify().
This function verifies the consistency of cached data in the
MachineTraceMetrics analysis.

llvm-svn: 160976
2012-07-30 18:34:11 +00:00
Jakob Stoklund Olesen eb488fe165 Verify that the CFG hasn't changed during invalidate().
The MachineTraceMetrics analysis must be invalidated before modifying
the CFG. This will catch some of the violations of that rule.

llvm-svn: 160969
2012-07-30 17:36:49 +00:00
Jakob Stoklund Olesen fee94ca15b Add MachineBasicBlock::isPredecessor().
A->isPredecessor(B) is the same as B->isSuccessor(A), but it can
tolerate a B that is null or dangling. This shouldn't happen normally,
but it it useful for verification code.

llvm-svn: 160968
2012-07-30 17:36:47 +00:00
Nadav Rotem 77f1b9c477 When constant folding GEP expressions, keep the address space information of pointers.
Together with Ran Chachick <ran.chachick@intel.com>

llvm-svn: 160954
2012-07-30 07:25:20 +00:00
Craig Topper efd97044a3 Mark MOVZX16/MOVSX16 as neverHasSideEffects/mayLoad
llvm-svn: 160953
2012-07-30 07:14:07 +00:00
Craig Topper c6b7ef61f4 Mark MOVZX32_NOREX as isCodeGenOnly and neverHasSideEffects. The isCodeGenOnly change allows special detection of _NOREX instructions to be removed from tablegen disassembler code.
llvm-svn: 160951
2012-07-30 06:48:11 +00:00
Craig Topper 14eac5dda8 Give VCVTTPD2DQ priority over CVTTPD2DQ.
llvm-svn: 160942
2012-07-30 02:20:32 +00:00
Craig Topper f881d385da Fix patterns for CVTTPS2DQ to specify SSE2 instead of SSE1.
llvm-svn: 160941
2012-07-30 02:14:02 +00:00
Craig Topper 415b3586d0 Fix up patterns for VCVTSS2SD. Specifically give it priority over SSE form. Add an OptForSpeed to explicitly pair up with an OptForSize that was already on another pattern.
llvm-svn: 160939
2012-07-30 01:38:57 +00:00
Craig Topper 28402efcb6 Fix load types on intrinsic forms of SS2SD and SD2SS AVX/SSE convert instruction patterns.
llvm-svn: 160938
2012-07-29 23:26:34 +00:00
Craig Topper b6767f3acd Move more SSE/AVX convert instruction patterns into their definitions.
llvm-svn: 160937
2012-07-29 22:30:06 +00:00
Manman Ren f87dd7c01b Revert r160920 and r160919 due to dragonegg and clang selfhost failure
llvm-svn: 160927
2012-07-29 02:44:09 +00:00
Craig Topper fc93281c07 Fold patterns for some of the SSE/AVX convert instructions into their instruction definitions.
llvm-svn: 160922
2012-07-28 18:59:19 +00:00
Craig Topper 024797b9a2 Mark some of the SSE/AVX convert instructions as mayLoad/neverHasSideEffects.
llvm-svn: 160921
2012-07-28 18:36:39 +00:00
Manman Ren 0fa3ab88ba X86 Peephole: fold loads to the source register operand if possible.
Machine CSE and other optimizations can remove instructions so folding
is possible at peephole while not possible at ISel.

rdar://10554090 and rdar://11873276

llvm-svn: 160919
2012-07-28 16:48:01 +00:00
Craig Topper 44f9b5343d Make CVTSS2SI instruction definition consistent with CVTSD2SI.
llvm-svn: 160914
2012-07-28 08:28:23 +00:00
Craig Topper 1c1aef07b8 Fix up memory load types for SSE scalar convert intrinsic patterns.
llvm-svn: 160913
2012-07-28 07:59:59 +00:00
Manman Ren 32367c063b X86 Peephole: fix PR13475 in optimizeCompare.
It is possible that an instruction can use and update EFLAGS.
When checking the safety, we should check the usage of EFLAGS first before
declaring it is safe to optimize due to the update.

llvm-svn: 160912
2012-07-28 03:15:46 +00:00
Andrew Trick 940534371b Reenable a basic SSA DAG builder optimization.
Jakob fixed ProcessImplicifDefs in r159149.

llvm-svn: 160910
2012-07-28 01:48:15 +00:00
Jakob Stoklund Olesen 0563369755 Add more debug output to MachineTraceMetrics.
llvm-svn: 160905
2012-07-27 23:58:38 +00:00
Jakob Stoklund Olesen 1152202cc2 Keep track of the head and tail of the trace through each block.
This makes it possible to quickly detect blocks that are outside the
trace.

llvm-svn: 160904
2012-07-27 23:58:36 +00:00
Eric Christopher 86ca9f9e11 Add a DW_AT_high_pc for CUs that are a single address range. Update
all tests accordingly.

Fixes PR13351.

Patch by shinichiro hamaji!

llvm-svn: 160899
2012-07-27 22:00:05 +00:00
Jakob Stoklund Olesen 7dfe7abdee Also compute register mask lists under -new-live-intervals.
llvm-svn: 160898
2012-07-27 21:56:39 +00:00
Chad Rosier bd9f2ba4d6 Typos.
llvm-svn: 160897
2012-07-27 21:41:59 +00:00
Evan Cheng 249716e8ae Teach CodeGenPrep to look past bitcast when it's duplicating return instruction
into predecessor blocks to enable tail call optimization.

rdar://11958338

llvm-svn: 160894
2012-07-27 21:21:26 +00:00
Jakob Stoklund Olesen 97e14e02f1 Eliminate the IS_PHI_DEF flag and VNInfo::setIsPHIDef().
A value number is a PHI def if and only if it begins at a block
boundary. This can be derived from the def slot, a separate flag is not
necessary.

llvm-svn: 160893
2012-07-27 21:11:14 +00:00
Jakob Stoklund Olesen 4021a7bf25 Add a -new-live-intervals experimental option.
This option replaces the existing live interval computation with one
based on LiveRangeCalc.cpp. The new algorithm does not depend on
LiveVariables, and it can be run at any time, before or after leaving
SSA form.

llvm-svn: 160892
2012-07-27 20:58:46 +00:00
Andrew Kaylor 8e87a75be7 Fixing problems with X86_64_32 relocations and making the assertions more readable.
llvm-svn: 160889
2012-07-27 20:30:12 +00:00
Jakob Stoklund Olesen bc65e8f94e Add <imp-def> of super-register when lowering SUBREG_TO_REG.
Patch by Tyler Nowicki!

llvm-svn: 160888
2012-07-27 20:19:49 +00:00
Andrew Kaylor 782d5c434f Test commit, clean up comment
llvm-svn: 160880
2012-07-27 18:39:47 +00:00
Nuno Lopes 85591f899d fix PR13390: do not loop forever with self-referencing self instructions
llvm-svn: 160876
2012-07-27 18:21:15 +00:00
Nuno Lopes 20c7eb3549 fix infinite loop in instcombine in the presence of a (malformed) self-referencing select inst.
This can happen as long as the instruction is not reachable. Instcombine does generate these unreachable malformed selects when doing RAUW

llvm-svn: 160874
2012-07-27 18:03:57 +00:00
Andrew Kaylor 5c01090c49 Test commit, clean up comment
llvm-svn: 160873
2012-07-27 17:52:42 +00:00
Jakob Stoklund Olesen 0c06121e4e Give MCRegisterInfo an implementation file.
Move some functions from MCRegisterInfo.h that don't need to be inline.

This shrinks llc by 8K.

llvm-svn: 160865
2012-07-27 16:25:20 +00:00
Akira Hatanaka 97ba7696f8 Pass the correct call frame size to callseq_start node. This is needed to
replace uses of function getMaxCallFrameSize defined in MipsFunctionInfo with
the one MachineFrameInfo has.

llvm-svn: 160841
2012-07-26 23:27:01 +00:00
Pete Cooper abc13af9c6 Simplify demanded bits of select sources where the condition is a constant vector
llvm-svn: 160835
2012-07-26 23:10:24 +00:00
Jakob Stoklund Olesen 7cd08536c2 Remove the X86 sub_ss and sub_sd sub-register indexes completely.
llvm-svn: 160833
2012-07-26 23:07:20 +00:00
Jakob Stoklund Olesen 77cd55b4ee Remove the last mentions of sub_ss and sub_sd from patterns.
I'll remove these two sub-register indexes shortly.

llvm-svn: 160831
2012-07-26 23:03:08 +00:00
Jakob Stoklund Olesen b96d0b4e08 Eliminate sub_ss, sub_sd from broadcast patterns.
The (COPY_TO_REGCLASS GR32:$src, VR128) pattern looks odd, but
copyPhysReg does the right thing with it. (The old pattern would
eventually produce the same cross-class copy).

llvm-svn: 160830
2012-07-26 22:59:06 +00:00
Pete Cooper e807e45bff Teach SimplifyDemandedBits how to look through fpext and fptrunc to simplify their operand
llvm-svn: 160823
2012-07-26 22:37:04 +00:00
Jakob Stoklund Olesen 206b825f5c Eliminate more sub_ss / sub_sd patterns.
This gets rid of some more INSERT_SUBREG - IMPLICIT_DEF patterns,
simplifying the emitted code a bit.

llvm-svn: 160820
2012-07-26 22:30:18 +00:00
Jakob Stoklund Olesen 75d17b0577 Eliminate some SUBREG_TO_REG patterns with sub_ss and sub_sd.
The SUBREG_TO_REG instruction has magic semantics asserting that the
source value was defined by an instruction that cleared the high half of
the register. Those semantics are never actually exploited for xmm
registers.

llvm-svn: 160818
2012-07-26 22:03:21 +00:00
Jakob Stoklund Olesen ceee4a9d0c Eliminate a batch of uses of sub_ss and sub_sd in the X86 target.
These idempotent sub-register indices don't do anything --- They simply
map XMM registers to themselves.  They no longer affect register classes
either since the SubRegClasses field has been removed from Target.td.

This patch replaces XMM->XMM EXTRACT_SUBREG and INSERT_SUBREG patterns
with COPY_TO_REGCLASS patterns which simply become COPY instructions.

The number of IMPLICIT_DEF instructions before register allocation is
reduced, and that is the cause of the test case changes.

llvm-svn: 160816
2012-07-26 21:40:42 +00:00
Micah Villmow 7b473d9f72 Add support for v16i32/v16i64 into the code generator. This is required for backends that use i32/i64 vectors for the getSetCCResultType function.
llvm-svn: 160814
2012-07-26 21:22:00 +00:00
Chad Rosier 7c427c40cb Make comments in Debug.cpp and Debug.h consistent. Rename SetCurrentDebugType;
Function names should be camel case, and start with a lower case letter.  No
functional change intended.

llvm-svn: 160813
2012-07-26 20:38:52 +00:00
Jakob Stoklund Olesen 35400b1dda Use an otherwise unused variable.
llvm-svn: 160798
2012-07-26 19:42:56 +00:00
Jakob Stoklund Olesen f9029fef2a Start scaffolding for a MachineTraceMetrics analysis pass.
This is still a work in progress.

Out-of-order CPUs usually execute instructions from multiple basic
blocks simultaneously, so it is necessary to look at longer traces when
estimating the performance effects of code transformations.

The MachineTraceMetrics analysis will pick a typical trace through a
given basic block and provide performance metrics for the trace. Metrics
will include:

- Instruction count through the trace.
- Issue count per functional unit.
- Critical path length, and per-instruction 'slack'.

These metrics can be used to determine the performance limiting factor
when executing the trace, and how it will be affected by a code
transformation.

Initially, this will be used by the early if-conversion pass.

llvm-svn: 160796
2012-07-26 18:38:11 +00:00
Dan Gohman 0b3d782933 Add a floor intrinsic.
llvm-svn: 160791
2012-07-26 17:43:27 +00:00
Nuno Lopes 5940c4a15f do null checks for a few more Emit*() functions.
Thanks Eli for noticing.

llvm-svn: 160787
2012-07-26 17:10:46 +00:00
Duncan Sands 5651452076 Stop reassociate from looking through expressions of arbitrary complexity. This
is a temporary measure until my fix for PR13021 is ready.

llvm-svn: 160778
2012-07-26 09:26:40 +00:00
Craig Topper c7690ac7ac Make l/q suffixes on AVX forms of scalar convert instructions consistent with their non-AVX forms.
llvm-svn: 160775
2012-07-26 07:48:28 +00:00
Akira Hatanaka 64626fc20f Fix call setup for PIC.
Patch by Reed Kotler.

llvm-svn: 160774
2012-07-26 02:24:43 +00:00
Nick Lewycky 7d0f110cb3 It's not safe to blindly remove invoke instructions. This happens when we
encounter an invoke of an allocation function. This should fix the dragonegg
bootstrap. Testcase to follow, later.

llvm-svn: 160757
2012-07-25 21:19:40 +00:00
Chad Rosier 13198f8f9f You cannot call removeModule on a JIT with no modules. Patch by Verena
Beckham <verena@codeplay.com>.  Reviewed by Jim Grosbach.

llvm-svn: 160753
2012-07-25 19:06:29 +00:00
Nuno Lopes f0626f2205 revert r160742: it's breaking CMake build
original commit msg:
MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings

llvm-svn: 160751
2012-07-25 18:49:28 +00:00
Manman Ren cc1dc6dc11 Disable rematerialization in TwoAddressInstructionPass.
It is redundant; RegisterCoalescer will do the remat if it can't eliminate
the copy. Collected instruction counts before and after this. A few extra
instructions are generated due to spilling but it is normal to see these kinds
of changes with almost any small codegen change, according to Jakob.

This also fixed rdar://11830760 where xor is expected instead of movi0.

llvm-svn: 160749
2012-07-25 18:28:13 +00:00
David Blaikie 70fdf72a48 Don't add null characters to the end of the APFloat string buffer.
Report/patch inspiration by Olaf Krzikalla.

llvm-svn: 160744
2012-07-25 18:04:24 +00:00
Nuno Lopes f0441e04bd MemoryBuiltins: add support to determine the size of strdup'ed non-constant strings
llvm-svn: 160742
2012-07-25 17:29:22 +00:00
Nuno Lopes 7ba5b98720 add EmitStrNLen()
llvm-svn: 160741
2012-07-25 17:18:59 +00:00
Jakob Stoklund Olesen cef9a618b1 Preserve 2-addr constraints in ConnectedVNInfoEqClasses.
When a live range splits into multiple connected components, we would
arbitrarily assign <undef> uses to component 0. This is wrong when the
use is tied to a def that gets assigned to a different component:

  %vreg69<def> = ADD8ri %vreg68<undef>, 1

The use and def must get the same virtual register.

Fix this by assigning <undef> uses to the same component as the value
defined by the instruction, if any:

  %vreg69<def> = ADD8ri %vreg69<undef>, 1

This fixes PR13402. The PR has a test case which I am not including
because it is unlikely to keep exposing this behavior in the future.

llvm-svn: 160739
2012-07-25 17:15:15 +00:00
Jim Grosbach 6df755cc4e ARM: Don't assume an SDNode is a constant.
Before accessing a node as a ConstandSDNode, make sure it actually is one.
No testcase of non-trivial size.

rdar://11948669

llvm-svn: 160735
2012-07-25 17:02:47 +00:00
Jakob Stoklund Olesen c6fd3deee6 Verify two-address constraints more carefully.
Include <undef> operands and virtual registers after leaving SSA form.

llvm-svn: 160734
2012-07-25 16:49:11 +00:00
Nuno Lopes 89702e94b5 make all Emit*() functions consult the TargetLibraryInfo information before creating a call to a library function.
Update all clients to pass the TLI information around.
Previous draft reviewed by Eli.

llvm-svn: 160733
2012-07-25 16:46:31 +00:00
Rafael Espindola 73173c55c2 Fix typos. Thanks to Matt Beaumont-Gay for noticing it.
llvm-svn: 160731
2012-07-25 15:42:45 +00:00
Rafael Espindola 11c38b9657 When a return struct pointer is passed in registers, the called has nothing
to pop.

llvm-svn: 160725
2012-07-25 13:41:10 +00:00
Rafael Espindola 2caee7f4d2 Factor a long list of conditions into a predicate function. No functionality
change.

llvm-svn: 160724
2012-07-25 13:35:45 +00:00
Duncan Sands 0b875a0c29 When folding a load from a global constant, if the load started in the middle
of an array element (rather than at the beginning of the element) and extended
into the next element, then the load from the second element was being handled
wrong due to incorrect updating of the notion of which byte to load next.  This
fixes PR13442.  Thanks to Chris Smowton for reporting the problem, analyzing it
and providing a fix.

llvm-svn: 160711
2012-07-25 09:14:54 +00:00
Akira Hatanaka 5a69c235ae Eliminate the stack slot used to save the global base register.
The long branch pass (fixed in r160601) no longer uses the global base register
to compute addresses of branch destinations, so it is not necessary to reserve
a slot on the stack.

llvm-svn: 160703
2012-07-25 03:16:47 +00:00
Kevin Enderby 216ac31971 Fix a bug in the x86 disassembler's symbolic disassembly support for Jcc-Jump
if Condition Is Met instuctions that was not correctly determining the target
instruction.

So for a jne rel32 instruction:

% cat x.s
.byte 0x0f, 0x85, 0x09, 0x00, 0x00, 0x00
% as x.s

it was incorrectly deterining the target:

% otool -q -tv a.out 
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xd

and with the fix it gets this correct as:

% otool -q -tv a.out
a.out:
(__TEXT,__text) section
0000000000000000	jne	0xf

rdar://11505997

llvm-svn: 160694
2012-07-24 21:40:01 +00:00
Nick Lewycky 38be931223 Don't delete one more instruction than we're allowed to. This should fix the
Darwin bootstrap. Testcase exists but isn't fully reduced, I expect to commit
the testcase this evening.

llvm-svn: 160693
2012-07-24 21:33:00 +00:00
Nuno Lopes 342cf787ef add a few more functions to TargetLibraryInfo:
fputc, memchr, memcmp, putchar, puts, strchr, strncmp

llvm-svn: 160690
2012-07-24 21:00:36 +00:00
David Chisnall 5b8c1680de ELF does not imply GNU/Linux. Do not assume GNU conventions just because we
are targeting an ELF platform.  Only fold gs-relative (and fs-relative) loads
if it is actually sensible to do so for the target platform.

This fixes PR13438.

llvm-svn: 160687
2012-07-24 20:04:16 +00:00
Nuno Lopes 20f5a7aeb7 TargetLibraryInfo: add strn?cat, strn?cpy, and strn?len
llvm-svn: 160678
2012-07-24 17:25:06 +00:00