Commit Graph

55843 Commits

Author SHA1 Message Date
Benjamin Kramer 83aa94711b Emit TableGen's header comment with C-style comments, so it can be used from C89 code.
Should silence warnings when compiling the X86 disassembler.

llvm-svn: 158723
2012-06-19 17:04:16 +00:00
Jan Wen Voung 7f5d79f864 Have ARM ELF use correct reloc for "b" instr.
The condition code didn't actually matter for arm "b" instructions,
unlike "bl".  It should just use the R_ARM_JUMP24 reloc.

llvm-svn: 158722
2012-06-19 16:03:02 +00:00
Hal Finkel d465810f7c Mark most PPC register classes to avoid write-after-write.
For processors with the G5-like instruction-grouping scheme, this helps avoid
early group termination due to a write-after-write dependency within the group.
It should also help on pipelined embedded cores.

On POWER7, over the test suite, this gives an average 0.5% speedup. The largest
speedups are:

SingleSource/Benchmarks/Stanford/Quicksort - 33%
MultiSource/Applications/d/make_dparser - 21%
MultiSource/Benchmarks/FreeBench/analyzer/analyzer - 12%
MultiSource/Benchmarks/MiBench/telecomm-FFT/telecomm-fft - 12%

Largest slowdowns:

SingleSource/Benchmarks/Stanford/Bubblesort - 23%
MultiSource/Benchmarks/Prolangs-C++/city/city - 21%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 16%
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/mpeg2decode - 13%

llvm-svn: 158719
2012-06-19 13:57:17 +00:00
Michael J. Spencer 96ebd91d6c [Support/PathV2] Fix out of bounds access in identify_magic when the file is empty.
llvm-svn: 158704
2012-06-19 05:29:57 +00:00
Akira Hatanaka 9f96bb8619 Make MipsLongBranch::runOnMachineFunction return true.
llvm-svn: 158702
2012-06-19 03:45:29 +00:00
Akira Hatanaka 9846239bbc Use MachineBasicBlock::instr_iterator instead of MachineBasicBlock::iterator in
MipsCodeEmitter.cpp.

llvm-svn: 158701
2012-06-19 03:39:45 +00:00
Hal Finkel 1cc27e44a4 Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

llvm-svn: 158698
2012-06-19 02:34:32 +00:00
Rafael Espindola ca3e0ee8b3 Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Nuno Lopes f9abcb7ba9 revert r158660, since Chris has some issues with this patch (namely using code to reprent information only used by the compiler)
Original commit msg:
add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158688
2012-06-18 23:34:26 +00:00
Manman Ren 6e1fd46fdf ARM: use NOEN loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 158684
2012-06-18 22:23:48 +00:00
Hal Finkel 8eac009633 Allow up to 64 functional units per processor itinerary.
This patch changes the type used to hold the FU bitset from unsigned to uint64_t.
This will be needed for some upcoming PowerPC itineraries.

llvm-svn: 158679
2012-06-18 21:08:18 +00:00
Marshall Clow d3e2a76ca4 Added accessors for getting coff_relocation info
llvm-svn: 158675
2012-06-18 19:47:16 +00:00
Jim Grosbach cb540f5cff ARM: Define generic HINT instruction.
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFI, WFI, SEV and YIELD to be
assembly aliases of that.

rdar://11600518

llvm-svn: 158674
2012-06-18 19:45:50 +00:00
Nuno Lopes b7c941bad9 add the 'alloc' metadata node to represent the size of offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158660
2012-06-18 16:04:04 +00:00
Joel Jones 3237ce737e This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.  The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.

<rdar://problem/11481151>

llvm-svn: 158659
2012-06-18 14:51:32 +00:00
Chandler Carruth 2cc11fd8c7 Temporarily revert r158087.
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.

Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.

llvm-svn: 158654
2012-06-18 07:03:12 +00:00
Pete Cooper 33ee6c9bf1 Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts
llvm-svn: 158623
2012-06-17 03:58:26 +00:00
Benjamin Kramer d0b767f849 Disable the right instance of TheJIT, this one is only used in asserts.
llvm-svn: 158610
2012-06-16 21:55:52 +00:00
Benjamin Kramer b9f84bb0ce Guard private fields that are unused in Release builds with #ifndef NDEBUG.
llvm-svn: 158608
2012-06-16 21:48:13 +00:00
Hal Finkel 6261c2dc28 Cleanup trip-count finding for PPC CTR loops (and some bug fixes).
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.

In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).

llvm-svn: 158607
2012-06-16 20:34:07 +00:00
Hal Finkel fa103d3fc7 Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions.
The present implementation handles only TBAA and FP metadata, discarding everything else.
For debug metadata, the current behavior is maintained (the debug metadata associated with
one of the instructions will be kept, discarding that attached to the other).

This should address PR 13040.

llvm-svn: 158606
2012-06-16 20:34:06 +00:00
Hal Finkel 16ddd4b66b Move the Metadata merging methods from GVN and make them public in MDNode.
There are other passes, BBVectorize specifically, that also need some of
this functionality.

llvm-svn: 158605
2012-06-16 20:33:37 +00:00
Rafael Espindola f70bea93e2 Implement irpc. Extracted from a patch by the PaX team. I just added the test.
llvm-svn: 158604
2012-06-16 18:03:25 +00:00
Kay Tiong Khoo 390edb0d91 *no need to pollute Intel syntax with bonus mnemonics; operand size is explicitly specified
llvm-svn: 158603
2012-06-16 17:19:49 +00:00
NAKAMURA Takumi e2d4a09305 Mips/AsmParser/CMakeLists.txt: Fix dependency.
llvm-svn: 158602
2012-06-16 15:33:52 +00:00
Evan Cheng 773b2cd63c It's not deterministic to iterate over SmallPtrSet. Replace it with SmallSetVector. Patch by Daniel Reynaud. rdar://11671029
llvm-svn: 158594
2012-06-16 04:28:11 +00:00
Pete Cooper 818e9f4a26 Fix crash from r158529 on Bullet.
Dynamic GEPs created by SROA needed to insert extra "i32 0"
operands to index through structs and arrays to get to the
vector being indexed.

llvm-svn: 158590
2012-06-16 01:43:26 +00:00
Chandler Carruth 52de271da1 Don't call 'FilesToRemove[0]' when the vector is empty, even to compute
the address of it. Found by a checking STL implementation used on
a dragonegg builder. Sorry about this one. =/

llvm-svn: 158582
2012-06-16 00:44:07 +00:00
Chandler Carruth e6196eba0d Harden the Unix signals code to be more async signal safe.
This is likely only the tip of the ice berg, but this particular bug
caused any double-free on a glibc system to turn into a deadlock! It is
not generally safe to either allocate or release heap memory from within
the signal handler. The 'pop_back()' in RemoveFilesToRemove was deleting
memory and causing the deadlock. What's worse, eraseFromDisk in PathV1
has lots of allocation and deallocation paths. We even passed 'true' in
a place that would have caused the *signal handler* to try to run the
'system' system call and shell out to 'rm -rf'. That was never going to
work...

This patch switches the file removal to use a vector of strings so that
the exact text needed for the 'unlink' system call can be stored there.
It switches the loop to be a boring indexed loop, and directly calls
unlink without looking at the error. It also works quite hard to ensure
that calling 'c_str()' is safe, by ensuring that the non-signal-handling
code path that manipulates the vector always leaves it in a state where
every element has already had 'c_str()' called at least once.

I dunno exactly how overkill this is, but it fixes the
deadlock-on-double free issue, and seems likely to prevent any other
issues from sneaking up.

Sorry for not having a test case, but I *really* don't know how to test
signal handling code easily....

llvm-svn: 158580
2012-06-16 00:09:41 +00:00
Jakob Stoklund Olesen 38a6fbf933 Remove final verification in RABasic.
We now have a proper machine code verifier pass between register
allocation and rewriting.

llvm-svn: 158577
2012-06-15 23:48:48 +00:00
Jakob Stoklund Olesen 45c1f9976c Print out register number in InlineSpiller.
llvm-svn: 158575
2012-06-15 23:47:09 +00:00
Jakob Stoklund Olesen 13dffcb766 Accept null PhysReg arguments to checkRegMaskInterference.
Calling checkRegMaskInterference(VirtReg) checks if VirtReg crosses any
regmask operands, regardless of the registers they clobber.

llvm-svn: 158563
2012-06-15 22:24:22 +00:00
Kevin Enderby 6c7279ec2e Fix the encoding of the armv7m (MClass) for MSR registers other than aspr,
iaspr, espr and xpsr which also needed to have 0b10 in their mask encoding bits.

llvm-svn: 158560
2012-06-15 22:14:44 +00:00
Manman Ren e0763c7472 ARM: optimization for sub+abs.
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi

For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.

rdar: 11633193
llvm-svn: 158551
2012-06-15 21:32:12 +00:00
Kay Tiong Khoo 3d8fc90f96 *fixed to separate mnemonic from operands with tab
llvm-svn: 158543
2012-06-15 21:04:21 +00:00
Andrew Trick 8370c7c38f LSR: fix expansion of scaled reg in non-address type formulae.
For non-address users, Base and Scaled registers are not specially
associated to fit an address mode, so SCEVExpander should apply normal
expansion rules. Otherwise we may sink computation into inner loops
that have already been optimized.

llvm-svn: 158537
2012-06-15 20:07:29 +00:00
Andrew Trick aca8fb3c45 LSR fix: "Special" users are just like "Basic" users but allow -1 scale.
llvm-svn: 158536
2012-06-15 20:07:26 +00:00
Bill Wendling 4fd966347a Remove assignments which aren't used afterwards.
llvm-svn: 158535
2012-06-15 19:30:42 +00:00
Pete Cooper e24d6a19e3 Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed
llvm-svn: 158529
2012-06-15 18:07:29 +00:00
Rafael Espindola 1821c6c3b0 Some optimizations done by globalopt are safe only for internal linkage, not
linkonce linkage. For example, it is not valid to add unnamed_addr.

This also fixes a crash in g++.dg/opt/static5.C.

llvm-svn: 158528
2012-06-15 18:00:24 +00:00
Jakob Stoklund Olesen a15a224db0 Preserve <undef> flags in ARMExpandPseudo.
This probably mostly shows up in bugpoint-generated code.

llvm-svn: 158527
2012-06-15 17:46:54 +00:00
Jakob Stoklund Olesen 5767ad727c Use regunit liveness in RegisterCoalescer when it is available.
We only do very limited physreg coalescing now, but we still merge
virtual registers into reserved registers.

llvm-svn: 158526
2012-06-15 17:36:48 +00:00
Rafael Espindola 768b41c17a Factor macro argument parsing into helper methods and add support for .irp.
Patch extracted from a larger one by the PaX team. I added the testcases
and tightened error handling a bit.

llvm-svn: 158523
2012-06-15 14:02:34 +00:00
Duncan Sands 7838603ffc Fix issues (infinite loop and/or crash) with self-referential instructions, for
example degenerate phi nodes and binops that use themselves in unreachable code.
Thanks to Charles Davis for the testcase that uncovered this can of worms.

llvm-svn: 158508
2012-06-15 08:37:50 +00:00
Craig Topper 11913052d6 Move AVX version of convert instructions that write to GPRs to the Op1 table.
llvm-svn: 158497
2012-06-15 07:02:58 +00:00
Marshall Clow bfb85e676c Had a closing brace inside an #ifdef -- oops!
llvm-svn: 158485
2012-06-15 01:15:47 +00:00
Marshall Clow 71757ef3ed Adding acessors to COFFObjectFile so that clients can get at the (non-generic) bits
llvm-svn: 158484
2012-06-15 01:08:25 +00:00
Pete Cooper 1d1fa72837 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158479
2012-06-14 23:53:53 +00:00
Rafael Espindola def1b09be2 Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and
globaldce. Globaldce was already removing linkonce globals, but globalopt was
not.

llvm-svn: 158476
2012-06-14 22:48:13 +00:00
Pete Cooper 8bbce768d8 Move X86::VCVTTSD2SIrr from the 2 operand to 1 operand MemRegOp table.
Can someone with more knowledge of this please look at other entries
to see if others need moved.

llvm-svn: 158474
2012-06-14 22:12:58 +00:00
Akira Hatanaka 5fd22485a3 Fix coding style violations. Remove white spaces and tabs.
llvm-svn: 158471
2012-06-14 21:10:56 +00:00
Akira Hatanaka d8ab16b86f 1. introduce MipsPat in place of Pat in order to exclude those from
being used by Mips16 or Micro Mips
2. clean up a few lines too long encountered

Patch by Reed Kotler.

llvm-svn: 158470
2012-06-14 21:03:23 +00:00
Akira Hatanaka 1b420ac4c8 Make machine verifier check the first instruction of the last bundle instead of
the last instruction of a basic block.

llvm-svn: 158468
2012-06-14 20:51:13 +00:00
Lang Hames a33db65bd9 Make comment slightly more helpful.
llvm-svn: 158467
2012-06-14 20:37:15 +00:00
Pete Cooper 5d19452f3f Revert r158454: Allow SROA to look at a vector type... Its breaking the vectorise buildbot
This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c.

llvm-svn: 158462
2012-06-14 18:32:52 +00:00
Andrew Trick 45877fa011 misched: disable SSA check pending PR13112.
llvm-svn: 158461
2012-06-14 17:48:49 +00:00
Pete Cooper a7e6d58a87 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158454
2012-06-14 16:38:13 +00:00
NAKAMURA Takumi 27bdc671ed MipsLongBranch.cpp: Tweak llvm::next() to appease msvc.
llvm-svn: 158446
2012-06-14 12:29:48 +00:00
Richard Barton b0ec375b96 Replace assertion failure for badly formatted CPS instrution with error message.
llvm-svn: 158445
2012-06-14 10:48:04 +00:00
Jush Lu ac96b764ea Cleanup whitespace.
llvm-svn: 158443
2012-06-14 06:08:19 +00:00
Manman Ren c2bc2d106b InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq, it should be converted to uno.

llvm-svn: 158441
2012-06-14 05:57:42 +00:00
Akira Hatanaka d74b1c1a48 Fix Mips/CMakeLists.txt.
llvm-svn: 158437
2012-06-14 01:23:55 +00:00
Akira Hatanaka a215929d5f Add file MipsLongBranch.cpp.
llvm-svn: 158436
2012-06-14 01:22:24 +00:00
Akira Hatanaka a1b142f97c Remove code in MipsAsmPrinter and MipsMCInstLower.
llvm-svn: 158434
2012-06-14 01:20:12 +00:00
Akira Hatanaka eb36522a4d Add long branch expansion pass for MIPS.
llvm-svn: 158433
2012-06-14 01:19:35 +00:00
Akira Hatanaka 64f8df28ed Add AT to the list of registers clobbered by branches so that it is available
as a scratch register when they are expanded to long branches.

llvm-svn: 158432
2012-06-14 01:17:59 +00:00
Akira Hatanaka 194a8773ea In MipsRegisterInfo::eliminateFrameIndex, call Mips::loadImmediate
to load an immediate that does not fit into 16-bit. 

llvm-svn: 158431
2012-06-14 01:17:36 +00:00
Akira Hatanaka 2372c8bb5f In MipsFrameLowering::emitPrologue and emitEpilogue, call Mips::loadImmediate
to load an immediate that does not fit into 16-bit. Also, take into
consideration the global base register slot on the stack when computing the
stack size. 

llvm-svn: 158430
2012-06-14 01:17:13 +00:00
Akira Hatanaka acd1a7dc68 Define function MipsInstrInfo::GetInstSizeInBytes, which will be called to
compute the size of basic blocks in a function. Also, define a function which
emits a series of instructions to load an immediate.

llvm-svn: 158429
2012-06-14 01:16:45 +00:00
Akira Hatanaka 0c76448471 In MipsISelDAGToDAG.cpp, store the global base register to a stack frame object.
Long-branches need access to the global base register to get the destination
address.

llvm-svn: 158428
2012-06-14 01:16:15 +00:00
Akira Hatanaka 51c70c62cf Add methods to MipsFunctionInfo for initializing and accessing the stack frame
object for the global base register.

This is the first of a series of patches which implements long branch expansion
for MIPS.

llvm-svn: 158427
2012-06-14 01:15:36 +00:00
Akira Hatanaka 5ac78681c1 Bundle jump/branch instructions with the instructions in the delay slot in
delay slot filler pass of MIPS, per suggestion of Jakob Stoklund Olesen.

This change, along with the fix in r158154, enables machine verification
to be run after delay slot filling.

llvm-svn: 158426
2012-06-13 23:25:52 +00:00
Akira Hatanaka df5205ef3d Implement a DAGCombine in MipsISelLowering.cpp which transforms the following
pattern:

(add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt))

"tjt" is a TargetJumpTable node. 

llvm-svn: 158419
2012-06-13 20:33:18 +00:00
Akira Hatanaka 1daf8c2a16 Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp.
llvm-svn: 158414
2012-06-13 19:33:32 +00:00
Akira Hatanaka 9586618c58 Simplify CreateLoadLR and CreateStoreLR in MipsISelLowering.cpp.
llvm-svn: 158413
2012-06-13 19:06:08 +00:00
Akira Hatanaka f0273603f5 Implement fastcc calling convention for MIPS.
llvm-svn: 158410
2012-06-13 18:06:00 +00:00
Richard Osborne ab7d788eb5 Fix pattern for MKMSK instruction.
llvm-svn: 158409
2012-06-13 17:59:12 +00:00
Pete Cooper e2fe809772 Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access"
This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f.

llvm-svn: 158408
2012-06-13 17:55:22 +00:00
Pete Cooper e1d4e8b563 Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access
llvm-svn: 158407
2012-06-13 17:30:34 +00:00
Argyrios Kyrtzidis 444fd42634 Fix building ThreadLocal.cpp with --disable-threads.
llvm-svn: 158405
2012-06-13 16:30:06 +00:00
Kay Tiong Khoo f294921e24 *typo: Cyles changed to Cycles
llvm-svn: 158404
2012-06-13 15:53:04 +00:00
Duncan Sands 409d8ae165 It is possible for several constants which aren't individually absorbing to
combine to the absorbing element.  Thanks to nbjoerg on IRC for pointing this 
out.

llvm-svn: 158399
2012-06-13 12:15:56 +00:00
Duncan Sands 318a89ddac When linearizing a multiplication, return at once if we see a factor of zero,
since then the entire expression must equal zero (similarly for other operations
with an absorbing element).  With this in place a bunch of reassociate code for
handling constants is dead since it is all taken care of when linearizing.  No
intended functionality change.

llvm-svn: 158398
2012-06-13 09:42:13 +00:00
Craig Topper 71dc02d659 Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them.
llvm-svn: 158396
2012-06-13 07:18:53 +00:00
Hal Finkel 9898614854 Add another missing 64-bit itinerary definition for the PPC A2 core.
llvm-svn: 158393
2012-06-13 05:55:09 +00:00
Manman Ren d33f4efbfd SimplifyCFG: fold unconditional branch to its predecessor if profitable.
This patch extends FoldBranchToCommonDest to fold unconditional branches.
For unconditional branches, we fold them if it is easy to update the phi nodes 
in the common successors.

rdar://10554090

llvm-svn: 158392
2012-06-13 05:43:29 +00:00
Jakob Stoklund Olesen 1c66b87f7d Eliminate struct TableGenBackend.
TableGen backends are simply written as functions now.

Patch by Sean Silva!

llvm-svn: 158389
2012-06-13 05:15:49 +00:00
Akira Hatanaka 21371766d1 Clean up trailing blanks in Mips16InstrFormats.td
Patch by Reed Kotler.

llvm-svn: 158382
2012-06-13 02:42:47 +00:00
Akira Hatanaka 5fa541231b disable use of directive .set nomicromips
until this directive is pushed in gas to open source fsf

Patch by Reed Kotler.

llvm-svn: 158381
2012-06-13 02:41:14 +00:00
Andrew Trick 344fb64fa3 sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi".

llvm-svn: 158380
2012-06-13 02:39:03 +00:00
Andrew Trick 5b90645abb sched: Avoid trivially redundant DAG edges. Take the one with higher latency.
llvm-svn: 158379
2012-06-13 02:39:00 +00:00
Akira Hatanaka 3fe00f29ad 1. fix places where immed is used in place of imm to be consistent with
non mips16
2. fix some comments to change OPcode->EXTEND for extended instructions

Patch by Reed Kotler.

llvm-svn: 158378
2012-06-13 02:37:54 +00:00
Hal Finkel 79c39da135 Add some missing 64-bit itinerary definitions for the PPC A2 core.
llvm-svn: 158373
2012-06-12 20:32:29 +00:00
Duncan Sands 72aea01b6e Use DenseMap as SmallMap workaround rather than std::map, at Chandler's request.
llvm-svn: 158371
2012-06-12 20:26:43 +00:00
Duncan Sands 67cd591989 Use std::map rather than SmallMap because SmallMap assumes that the value has
POD type, causing memory corruption when mapping to APInts with bitwidth > 64.
Merge another crash testcase into crash.ll while there.

llvm-svn: 158369
2012-06-12 20:16:51 +00:00
Chad Rosier c6916f88a8 [arm-fast-isel] Add support for -arm-long-calls.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 158368
2012-06-12 19:25:13 +00:00
Hal Finkel 8c33dde666 Split out the PPC instruction class IntSimple from IntGeneral.
On the POWER7, adds and logical operations can also be handled
in the load/store pipelines. We'll call these IntSimple.

llvm-svn: 158366
2012-06-12 19:01:24 +00:00
Hal Finkel f1cc96ab50 Fixes for PPC host detection and features.
POWER4 is a 64-bit CPU (better matched to the 970).
The g3 is really the 750 (no altivec), the g4+ is the 74xx (not the 750).

Patch by Andreas Tobler.

llvm-svn: 158363
2012-06-12 16:39:23 +00:00
Duncan Sands d7aeefebd6 Now that Reassociate's LinearizeExprTree can look through arbitrary expression
topologies, it is quite possible for a leaf node to have huge multiplicity, for
example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x
raised to a vast power (the multiplicity, or weight, of x).  This patch fixes
the computation of weights by correctly computing them no matter how big they
are, rather than just overflowing and getting a wrong value.  It turns out that
the weight for a value never needs more bits to represent than the value itself,
so it is enough to represent weights as APInts of the same bitwidth and do the
right overflow-avoiding dance steps when computing weights.  As a side-effect it
reduces the number of multiplies needed in some cases of large powers.  While
there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree
static, pushing the rank computation out into users.  This is progress towards
fixing PR13021.

llvm-svn: 158358
2012-06-12 14:33:56 +00:00
Hal Finkel 59b0ee8a56 Reapply r158337, this time properly protect Darwin/PPC host CPU use with __ppc__.
Original commit message:
Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName().

Both the new Linux functionality and the old Darwin functions have been moved.
This change also allows this information to be queried directly by clang and
other frontends (clang, for example, will now have real -mcpu=native support).

llvm-svn: 158349
2012-06-12 03:03:13 +00:00
Argyrios Kyrtzidis c6dc4d75fd Satisfy C++ aliasing rules, per suggestion by Chandler.
llvm-svn: 158346
2012-06-12 01:06:16 +00:00
Jakob Stoklund Olesen f8f128606c Revert r158337 "Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName()."
This commit broke most of the PowerPC unit tests when running on
Intel/Apple.

llvm-svn: 158345
2012-06-12 00:58:40 +00:00
Argyrios Kyrtzidis 8d19c86c9a For llvm::sys::ThreadLocalImpl instead of malloc'ing the platform-specific
thread local data, embed them in the class using a uint64_t and make sure
we get compiler errors if there's a platform where this is not big enough.

This makes ThreadLocal more safe for using it in conjunction with CrashRecoveryContext.

Related to crash in rdar://11434201.

llvm-svn: 158342
2012-06-12 00:21:31 +00:00
Andrew Trick 3e465fb225 misched: When querying RegisterPressureTracker, always save current and max pressure.
llvm-svn: 158340
2012-06-11 23:42:23 +00:00
Andrew Trick d054bd833a misched: regpressure getMaxPressureDelta, revert accidental checkin.
llvm-svn: 158339
2012-06-11 23:42:20 +00:00
Hal Finkel 23c699e497 Move PPC host-CPU detection logic from PPCSubtarget into sys::getHostCPUName().
Both the new Linux functionality and the old Darwin functions have been moved.
This change also allows this information to be queried directly by clang and
other frontends (clang, for example, will now have real -mcpu=native support).

llvm-svn: 158337
2012-06-11 23:14:31 +00:00
Hal Finkel bddc916f2b Enable MFOCRF generation on the PPC A2 core.
llvm-svn: 158324
2012-06-11 19:57:04 +00:00
Hal Finkel bfd3d08d18 Rename the PPC target feature gpul to mfocrf.
The PPC target feature gpul (IsGigaProcessor) was only used for one thing:
To enable the generation of the MFOCRF instruction. Furthermore, this
instruction is available on other PPC cores outside of the G5 line. This
feature now corresponds to the HasMFOCRF flag.

No functionality change.

llvm-svn: 158323
2012-06-11 19:57:01 +00:00
Hal Finkel 25d4c568d3 Add A2 to the list of PPC CPUs recognized by Linux host CPU-type detection.
llvm-svn: 158322
2012-06-11 19:56:57 +00:00
Hal Finkel 2c09058f19 Emit the two-operand form of the PPC mfcr instruction as mfocrf.
This is necessary on Linux and supported on Darwin, see PR2604.

llvm-svn: 158315
2012-06-11 15:43:15 +00:00
Hal Finkel ba671c0ea7 Add local CPU detection for Linux PPC.
This functionality mirrors that available on PPC/Darwin.

llvm-svn: 158314
2012-06-11 15:43:13 +00:00
Hal Finkel f2b9c38d6f Add POWER6 and POWER7 CPU types to the PPC backend.
No functional change; these will be used by upcoming scheduler enhancements.

llvm-svn: 158313
2012-06-11 15:43:08 +00:00
Jakob Stoklund Olesen e6aed139f0 Write llvm-tblgen backends as functions instead of sub-classes.
The TableGenBackend base class doesn't do much, and will be removed
completely soon.

Patch by Sean Silva!

llvm-svn: 158311
2012-06-11 15:37:55 +00:00
Bill Wendling 4b79647a6e Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>

llvm-svn: 158302
2012-06-11 08:07:26 +00:00
Benjamin Kramer 2150145ae4 InstCombine: factor code better.
No functionality change.

llvm-svn: 158301
2012-06-11 08:01:25 +00:00
Benjamin Kramer 8b8a76974f InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.

stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us

where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.

llvm-svn: 158297
2012-06-10 20:35:00 +00:00
Hal Finkel 4e9f1a859f Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

llvm-svn: 158296
2012-06-10 19:32:29 +00:00
Nadav Rotem 17ee58a792 Add AutoUpgrade support for the SSE4 ptest intrinsics.
Patch by Michael Kuperstein.

llvm-svn: 158295
2012-06-10 18:42:51 +00:00
Hal Finkel a8100281ae Use critical anti-dep. breaking on all PPC targets, but also add other register classes.
Using 'all' instead of 'critical' would be better because it would make it easier to
satisfy the bundling constraints, but, as noted in the FIXME, that is currently not
possible with the crs.

This yields an average 1% speedup over the entire test suite (on Power 7). Largest speedups:
SingleSource/Benchmarks/Shootout-C++/moments - 40%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
SingleSource/Benchmarks/BenchmarkGame/nsieve-bits - 26%
SingleSource/Benchmarks/McGill/misr - 23%
MultiSource/Applications/JM/ldecod/ldecod - 22%

Largest slowdowns:
SingleSource/Benchmarks/Shootout-C++/matrix - -29%
SingleSource/Benchmarks/Shootout-C++/ary3 - -22%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -18%
SingleSource/Benchmarks/Shootout-C++/ary - -17%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - -15%

llvm-svn: 158294
2012-06-10 11:15:36 +00:00
Craig Topper 7afe343be5 Add intrinsics for immediate form of XOP vprot instructions. Use i128mem instead of f128mem for integer XOP instructions.
llvm-svn: 158291
2012-06-10 07:31:56 +00:00
Hal Finkel 2edfbddcf0 Improve ext/trunc patterns on PPC64.
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.

Thanks to Jakob and Owen for their suggestions in this matter.

llvm-svn: 158283
2012-06-09 22:10:19 +00:00
Craig Topper a54893c662 Use XOP vpcom intrinsics in patterns instead of a target specific SDNode type. Remove the custom lowering code that selected the SDNode type.
llvm-svn: 158279
2012-06-09 17:02:24 +00:00
Craig Topper 3352ba55b9 Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
llvm-svn: 158278
2012-06-09 16:46:13 +00:00
Aaron Ballman 36a978cca2 Disabling a spurious deprecation warning about using PathV1 from within the PathV1 implementation file.
llvm-svn: 158274
2012-06-09 13:59:29 +00:00
Aaron Ballman 503bbff367 Fixing a typo in the comments.
llvm-svn: 158273
2012-06-09 13:46:36 +00:00
Benjamin Kramer 0748008df5 Allocate the contents of DwarfDebug's StringMaps in a single big BumpPtrAllocator.
llvm-svn: 158265
2012-06-09 10:34:15 +00:00
Duncan Sands 556eab8878 Silence a gcc-4.6 warning: GCC fails to understand that secondReg and cmpOp2 are
correlated, and thinks that cmpOp2 may be used uninitialized.

llvm-svn: 158263
2012-06-09 10:04:03 +00:00
Hal Finkel eb50c2d4a4 Enable tail merging on PPC.
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).

Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%

Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%

This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.

llvm-svn: 158259
2012-06-09 03:14:50 +00:00
Andrew Trick fc8ce08be3 Register pressure: added getPressureAfterInstr.
llvm-svn: 158256
2012-06-09 02:16:58 +00:00
Jakob Stoklund Olesen c26fbbfba5 Sketch a LiveRegMatrix analysis pass.
The LiveRegMatrix represents the live range of assigned virtual
registers in a Live interval union per register unit. This is not
fundamentally different from the interference tracking in RegAllocBase
that both RABasic and RAGreedy use.

The important differences are:

- LiveRegMatrix tracks interference per register unit instead of per
  physical register. This makes interference checks cheaper and
  assignments slightly more expensive. For example, the ARM D7 reigster
  has 24 aliases, so we would check 24 physregs before assigning to one.
  With unit-based interference, we check 2 units before assigning to 2
  units.

- LiveRegMatrix caches regmask interference checks. That is currently
  duplicated functionality in RABasic and RAGreedy.

- LiveRegMatrix is a pass which makes it possible to insert
  target-dependent passes between register allocation and rewriting.
  Such passes could tweak the register assignments with interference
  checking support from LiveRegMatrix.

Eventually, RABasic and RAGreedy will be switched to LiveRegMatrix.

llvm-svn: 158255
2012-06-09 02:13:10 +00:00
Jack Carter 2db37e8226 Test commit
llvm-svn: 158250
2012-06-09 00:27:55 +00:00
Jakob Stoklund Olesen be336295cd Also compute MBB live-in lists in the new rewriter pass.
This deduplicates some code from the optimizing register allocators, and
it means that it is now possible to change the register allocators'
solutions simply by editing the VirtRegMap between the register
allocator pass and the rewriter.

llvm-svn: 158249
2012-06-09 00:14:47 +00:00
Dmitri Gribenko dbeafa773a Convert comments to proper Doxygen comments.
llvm-svn: 158248
2012-06-09 00:01:45 +00:00
Jakob Stoklund Olesen 1224312f5b Reintroduce VirtRegRewriter.
OK, not really. We don't want to reintroduce the old rewriter hacks.

This patch extracts virtual register rewriting as a separate pass that
runs after the register allocator. This is possible now that
CodeGen/Passes.cpp can configure the full optimizing register allocator
pipeline.

The rewriter pass uses register assignments in VirtRegMap to rewrite
virtual registers to physical registers, and it inserts kill flags based
on live intervals.

These finalization steps are the same for the optimizing register
allocators: RABasic, RAGreedy, and PBQP.

llvm-svn: 158244
2012-06-08 23:44:45 +00:00
Nuno Lopes 2710f1b049 canonicalize:
-%a + 42
into
42 - %a

previously we were emitting:
-(%a + 42)

This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next

llvm-svn: 158237
2012-06-08 22:30:05 +00:00
Evan Cheng c5adccab1a Start implementing pre-ra if-converter: using speculation and selects to eliminate branches.
llvm-svn: 158234
2012-06-08 21:53:50 +00:00
Andrew Trick 423fa6faee TargetInstrInfo hooks implemented in codegen should be declared pure virtual.
llvm-svn: 158233
2012-06-08 21:52:38 +00:00
Duncan Sands 3293f460e7 Reapply commit 158073 with a fix (the testcase was already committed). The
problem was that by moving instructions around inside the function, the pass
could accidentally move the iterator being used to advance over the function
too.  Fix this by only processing the instruction equal to the iterator, and
leaving processing of instructions that might not be equal to the iterator
to later (later = after traversing the basic block; it could also wait until
after traversing the entire function, but this might make the sets quite big).
Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158226
2012-06-08 20:15:33 +00:00
Hal Finkel 41e6fd1df9 Remove the TODO statement in the PPC README re: CTR loops
As Chris points out, this can now be removed!

TODO: check if the associated section on viterbi's inner loop can also be removed.
llvm-svn: 158224
2012-06-08 20:02:09 +00:00
Hal Finkel c6b5debb40 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

llvm-svn: 158223
2012-06-08 19:19:53 +00:00
Hal Finkel 3d32ad3a7f Mark the PPC CTRRC and CTRRC8 register classes as non-allocatable.
Marking these classes as non-alocatable allows CTR loop generation to
work correctly with the block placement passes, etc. These register
classes are currently used only by some unused TCRETURN patterns.
In future cleanup, these will be removed.

Thanks again to Jakob for suggesting this fix to the CTR loop problem!

llvm-svn: 158221
2012-06-08 19:02:08 +00:00
Manman Ren 6bc2d27073 Enable optimization for integer ABS on X86 if Subtarget has CMOV.
llvm-svn: 158220
2012-06-08 18:58:26 +00:00
Chad Rosier 3d464d8068 Fix a crash in APInt::lshr when shiftAmt > BitWidth.
Patch by James Benton <jbenton@vmware.com>.

llvm-svn: 158213
2012-06-08 18:04:52 +00:00
Andrew Trick 596af1b02e Fix Target->Codegen dependence.
Bulk move of TargetInstrInfo implementation into
TargetInstrInfoImpl. This is dirty because the code isn't part of
TargetInstrInfoImpl class, nor should it be, because the methods are
not target hooks. However, it's the current mechanism for keeping
libTarget useful outside the backend. You'll get a not-so-nice link
error if you invoke a TargetInstrInfo method that depends on CodeGen.

The TargetInstrInfoImpl class should probably be removed since it
doesn't really solve this problem.

To really fix this, we probably need separate interfaces for the
CodeGen/nonCodeGen sides of TargetInstrInfo.

llvm-svn: 158212
2012-06-08 17:23:27 +00:00
Nuno Lopes 4b68c1da54 BoundsChecking: add support for ConstantPointerNull. fixes a bunch of instrumentation failures in loops with reallocs
llvm-svn: 158210
2012-06-08 16:31:42 +00:00
Hal Finkel 821e00121c Disable the PPC CTR-Loops pass by default.
The pass itself works well, but the something in the Machine* infrastructure
does not understand terminators which define registers. Without the ability
to use the block-placement pass, etc. this causes performance regressions (and
so is turned off by default). Turning off the analysis turns off the problems
with the Machine* infrastructure.

llvm-svn: 158206
2012-06-08 15:38:25 +00:00
Hal Finkel 8b01503ee5 Fix a bug in the new PPC CTR-Loops pass.
The code which tests for an induction operation cannot assume that any
ADDI instruction will have a register operand because the operand could
also be a frame index; for example:
    %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16

llvm-svn: 158205
2012-06-08 15:38:23 +00:00
Hal Finkel 96c2d4d945 Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code.
This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon
pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are
no longer otherwise used. Also, invalid preheader DebugLoc is not used.

llvm-svn: 158204
2012-06-08 15:38:21 +00:00
Duncan Sands 9a5cf92250 Revert commit 158073 while waiting for a fix. The issue is that reassociate
can move instructions within the instruction list.  If the instruction just
happens to be the one the basic block iterator is pointing to, and it is
moved to a different basic block, then we get into an infinite loop due to
the iterator running off the end of the basic block (for some reason this
doesn't fire any assertions).  Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158199
2012-06-08 13:37:30 +00:00
Manman Ren 2cdc8afccf X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.

rdar://10695237

llvm-svn: 158175
2012-06-07 22:39:10 +00:00
Nadav Rotem bbd40f67d8 Do not optimize the used bits of the x86 vselect condition operand, when the condition operand is a vector of 1-bit predicates.
This may happen on MIC devices.

llvm-svn: 158168
2012-06-07 20:53:48 +00:00
Nadav Rotem 4e50efead6 Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type.
llvm-svn: 158166
2012-06-07 20:28:57 +00:00
Andrew Trick a5d24ca453 Continue factoring computeOperandLatency. Use it for ARM hasHighOperandLatency.
llvm-svn: 158164
2012-06-07 19:42:04 +00:00
Andrew Trick 5b1cadf9f7 ARM getOperandLatency rewrite.
Match expectations of the new latency API. Cleanup and make the logic consistent.

llvm-svn: 158163
2012-06-07 19:42:00 +00:00
Andrew Trick 3564bdfa61 ARM getOperandLatency should return -1 for unknown, consistent with API
llvm-svn: 158162
2012-06-07 19:41:58 +00:00
Andrew Trick fb1a74c2b2 Fix ARM getInstrLatency logic to work with the current API.
llvm-svn: 158161
2012-06-07 19:41:55 +00:00
Manman Ren 746e4859d0 PR13046: we can't replace usage of SUB with CMP in the lowering phase.
It will cause assertion failure later on.

llvm-svn: 158160
2012-06-07 19:27:33 +00:00
Rafael Espindola 55d1145bd5 Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

llvm-svn: 158158
2012-06-07 18:39:19 +00:00
Pete Cooper cd72016cab Move terminator machine verification to check MachineBasicBlock::instr_iterator instead of MBB::iterator
llvm-svn: 158154
2012-06-07 17:41:39 +00:00
Manman Ren ae02c5a93e X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 158126
2012-06-07 00:42:47 +00:00
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Jakob Stoklund Olesen 00e7dffefb Properly verify liveness with bundled machine instructions.
Bundles should be treated as one atomic transaction when checking
liveness. That is how the register allocator (and VLIW targets) treats
bundles.

llvm-svn: 158116
2012-06-06 22:34:30 +00:00
Benjamin Kramer 3f87e3b707 Add accessors for all private members of DisasmContext.
LLVM should be -Wunused-private-field clean now.

llvm-svn: 158103
2012-06-06 20:45:10 +00:00
Andrew Trick 05ff4667eb Move RegisterClassInfo.h.
Allow targets to access this API. It's required for RegisterPressure.

llvm-svn: 158102
2012-06-06 20:29:31 +00:00
Andrew Trick 88517f608c Move RegisterPressure.h.
Make it a general utility for use by Targets.

llvm-svn: 158097
2012-06-06 19:47:35 +00:00
Benjamin Kramer 009b1c1cf1 Round 2 of dead private variable removal.
LLVM is now -Wunused-private-field clean except for
- lib/MC/MCDisassembler/Disassembler.h. Not sure why it keeps all those unaccessible fields.
- gtest.

llvm-svn: 158096
2012-06-06 19:47:08 +00:00
Benjamin Kramer 628a39faa3 Remove unused private fields found by clang's new -Wunused-private-field.
There are some that I didn't remove this round because they looked like
obvious stubs. There are dead variables in gtest too, they should be
fixed upstream.

llvm-svn: 158090
2012-06-06 18:25:08 +00:00
Chad Rosier 5d6f01ad77 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434

llvm-svn: 158087
2012-06-06 17:37:40 +00:00
Chad Rosier faa3894628 Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't
matter.
rdar://11579835

llvm-svn: 158084
2012-06-06 17:22:40 +00:00
Jakob Stoklund Olesen f435b1867d Remove dead debug option -disable-rematerialization.
Remat has been stable for years, and it isn't done by
LiveIntervalAnalysis any longer. (See LiveRangeEdit).

llvm-svn: 158079
2012-06-06 16:22:41 +00:00
Duncan Sands 763da45e9e Grab-bag of reassociate tweaks. Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158073
2012-06-06 14:53:10 +00:00
Benjamin Kramer 3de5d40f4d Stop leaking RegScavengers from TailDuplication.
llvm-svn: 158069
2012-06-06 13:53:41 +00:00
Richard Barton f1ef87ddbb Correct decoder for T1 conditional B encoding
llvm-svn: 158055
2012-06-06 09:12:53 +00:00
Craig Topper bf2409e8aa Mark several instructions SSE2 instead of SSE3 as they should be.
llvm-svn: 158049
2012-06-06 06:45:27 +00:00
Jakob Stoklund Olesen c141ba584e Move LiveUnionArray into LiveIntervalUnion.h
It is useful outside RegAllocBase.

llvm-svn: 158041
2012-06-05 23:57:30 +00:00
Jakob Stoklund Olesen 46d229c573 Don't print register names in LiveIntervalUnion::print().
Soon we'll be making LiveIntervalUnions for register units as well.

This was the only place using the RepReg member, so just remove it.

llvm-svn: 158038
2012-06-05 23:07:19 +00:00
Matt Beaumont-Gay 7ba769bedd Suppress -Wunused-variable in -Asserts build
llvm-svn: 158037
2012-06-05 23:00:03 +00:00
Jakob Stoklund Olesen f3f7d6f6e2 Simplify LiveInterval::print().
Don't print out the register number and spill weight, making the TRI
argument unnecessary.

This allows callers to interpret the reg field. It can currently be a
virtual register, a physical register, a spill slot, or a register unit.

llvm-svn: 158031
2012-06-05 22:51:54 +00:00
Jakob Stoklund Olesen 12e03dae44 Add experimental support for register unit liveness.
Instead of computing a live interval per physreg, LiveIntervals can
compute live intervals per register unit. This makes impossible the
confusing situation where aliasing registers could have overlapping live
intervals. It should also make fixed interferernce checking cheaper
since registers have fewer register units than aliases.

Live intervals for regunits are computed on demand, using MRI use-def
chains and the new LiveRangeCalc class. Only regunits live in to ABI
blocks are precomputed during LiveIntervals::runOnMachineFunction().

The regunit liveness computations don't depend on LiveVariables.

llvm-svn: 158029
2012-06-05 22:02:15 +00:00
Jakob Stoklund Olesen 989b3b1516 Implement LiveRangeCalc::extendToUses() and createDeadDefs().
These LiveRangeCalc methods are to be used when computing a live range
from scratch.

llvm-svn: 158027
2012-06-05 21:54:09 +00:00
Andrew Trick 4b037005d2 MachineInstr::eraseFromParent fix for removing bundled instrs.
Patch by Ivan Llopard.

llvm-svn: 158025
2012-06-05 21:44:23 +00:00
Andrew Trick 4544606c71 misched: API for minimum vs. expected latency.
Minimum latency determines per-cycle scheduling groups.
Expected latency determines critical path and cost.

llvm-svn: 158021
2012-06-05 21:11:27 +00:00
Lang Hames a59100cc08 Add a new intrinsic: llvm.fmuladd. This intrinsic represents a multiply-add
expression (a * b + c) that can be implemented as a fused multiply-add (fma)
if the target determines that this will be more efficient. This intrinsic
will be used to implement FP_CONTRACT support and an aggressive FMA formation
mode.

If your target has a fast FMA instruction you should override the
isFMAFasterThanMulAndAdd method in TargetLowering to return true.

llvm-svn: 158014
2012-06-05 19:07:46 +00:00
Yuan Lin 572a3a2cce Fix header file include order in NVPTX backend NV_CONTRIB
llvm-svn: 158013
2012-06-05 19:06:13 +00:00
Andrew Trick a6fb910fad LoopUnroll: always check for NULL LoopPassManager
llvm-svn: 158007
2012-06-05 17:51:05 +00:00
Roman Divacky c856653fb3 PPC32 uses R2 as the TLS register. Fix the copy and paste.
llvm-svn: 158004
2012-06-05 17:14:17 +00:00
Andrew Trick 39a99140c7 X86 itinerary properties.
llvm-svn: 157981
2012-06-05 03:44:46 +00:00
Andrew Trick b2680c718f ARM itinerary properties.
llvm-svn: 157980
2012-06-05 03:44:43 +00:00
Andrew Trick 73d7736b17 misched: Added MultiIssueItineraries.
This allows a subtarget to explicitly specify the issue width and
other properties without providing pipeline stage details for every
instruction.

llvm-svn: 157979
2012-06-05 03:44:40 +00:00
Andrew Trick a88d46e818 sdsched: Use the right heuristics when -mcpu is not provided and we have no itinerary.
Use ILP heuristics for long latency instrs if no scoreboard exists.

llvm-svn: 157978
2012-06-05 03:44:34 +00:00
Andrew Trick ed7c96d7d9 misched: Allow disabling scoreboard hazard checking for subtargets with a
valid itinerary but no pipeline stages.

An itinerary can contain useful scheduling information without specifying pipeline stages for each instruction.

llvm-svn: 157977
2012-06-05 03:44:32 +00:00
Andrew Trick 515f131786 whitespace
llvm-svn: 157976
2012-06-05 03:44:29 +00:00
Andrew Trick d36adece50 misched: comments from code review.
llvm-svn: 157975
2012-06-05 03:44:26 +00:00
Jakob Stoklund Olesen 345528944c Remove the last remat-related code from LiveIntervalAnalysis.
Rematerialization is handled by LiveRangeEdit now.

llvm-svn: 157974
2012-06-05 01:06:15 +00:00
Jakob Stoklund Olesen 9e27e2621a Stop using LiveIntervals::isReMaterializable().
It is an old function that does a lot more than required by
CalcSpillWeights, which was the only remaining caller.

The isRematerializable() function never actually sets the isLoad
argument, so don't try to compute that.

llvm-svn: 157973
2012-06-05 01:06:12 +00:00
Joel Jones 7f2ac7a2c8 Revert commit r157966
llvm-svn: 157972
2012-06-05 00:47:21 +00:00
Joel Jones d08534f82e This change handles a another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>

llvm-svn: 157966
2012-06-04 23:38:57 +00:00
Jakob Stoklund Olesen 188d830405 Delete dead code.
llvm-svn: 157963
2012-06-04 23:01:41 +00:00
Rafael Espindola 47d988c54c When gvn decides to replace an instruction with another, we have to patch the
replacement to make it at least as generic as the instruction being replaced.
This includes:
* dropping nsw/nuw flags
* getting the least restrictive tbaa and fpmath metadata
* merging ranges

Fixes PR12979.

llvm-svn: 157958
2012-06-04 22:44:21 +00:00
Jakob Stoklund Olesen 11fb248aa6 Switch LiveIntervals member variable to LLVM naming standards.
No functional change.

llvm-svn: 157957
2012-06-04 22:39:14 +00:00
Jakob Stoklund Olesen 5ef0e0b262 Pass context pointers to LiveRangeCalc::reset().
Remove the same pointers from all the other LiveRangeCalc functions,
simplifying the interface.

llvm-svn: 157941
2012-06-04 18:21:16 +00:00
Akira Hatanaka 6734685f21 Fix a bug in MipsTargetLowering::LowerLOAD. A shift-right-logical node is
inserted after the shift-left-logical node.

llvm-svn: 157937
2012-06-04 17:46:29 +00:00
Roman Divacky e3f15c98d1 Implement local-exec TLS on PowerPC.
llvm-svn: 157935
2012-06-04 17:36:38 +00:00
Hans Wennborg 245917b536 MIPS TLS: use the model selected by TargetMachine::getTLSModel().
This was mostly done already in r156162, but I missed one place.

llvm-svn: 157929
2012-06-04 14:02:08 +00:00
Nadav Rotem b7bb72e4f3 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Hans Wennborg 09610f3e09 Better comments for TLS-related X86 MachineOperand flags.
llvm-svn: 157920
2012-06-04 09:55:36 +00:00
Craig Topper c6ac4cefcc Add intrinsic forms for FMA instructions to opcode folding tables.
llvm-svn: 157917
2012-06-04 07:46:16 +00:00
Craig Topper 3cb143016d Add VFMADDSUB and VFMSUBADD FMA instructions to folding tables. Also add 213 forms of scalar FMA instructions.
llvm-svn: 157914
2012-06-04 07:08:21 +00:00
Hal Finkel 1de9bf01e4 Fix a copy-and-paste duplication error in the PPC 440 and A2 schedules (no functionality change).
llvm-svn: 157912
2012-06-04 02:39:52 +00:00
Hal Finkel 595817eebe Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

llvm-svn: 157911
2012-06-04 02:21:00 +00:00
Rafael Espindola 34b9c511c1 Represent .rept as an anonymous macro. This removes the need for the ActiveRept
vector. No functionality change.
Extracted from a patch by the PaX Team.

llvm-svn: 157909
2012-06-03 23:57:14 +00:00
Rafael Espindola dd17c237a8 Add a typedef to simplify the code a bit. Not functionality change.
Part of a patch by the PaX Team.

llvm-svn: 157908
2012-06-03 22:41:23 +00:00
Craig Topper 79dbb0c6e4 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Craig Topper 2c5ccd8af7 Simplify the fma4 renaming code.
llvm-svn: 157902
2012-06-03 16:48:52 +00:00
Craig Topper 720c7bde5c Autoupgrade support the rename of x86.fma4 intrinsics to x86.fma from r157898.
llvm-svn: 157899
2012-06-03 08:07:25 +00:00
Craig Topper fd53b80219 Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Manman Ren 5097e4f38a Revert r157831
llvm-svn: 157896
2012-06-03 03:14:24 +00:00
Craig Topper 29eafea292 Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
llvm-svn: 157895
2012-06-03 01:40:43 +00:00
Craig Topper badd755a0e Add neverHasSideEffects and mayLoad to FMA3 instructions.
llvm-svn: 157894
2012-06-03 00:30:49 +00:00
Benjamin Kramer 172f80849f Use access(2) instead of stat(2) to check if a file exists.
Apart from being slightly cheaper, this fixes a real bug that hits 32 bit
linux systems. When passing a file larger than 2G to be linked (which isn't
that uncommon with large projects such as WebKit), clang's driver checks
if the file exists but the file size doesn't fit in an off_t and stat(2)
fails with EOVERFLOW. Clang then says that the file doesn't exist instead
of passing it to the linker.

llvm-svn: 157891
2012-06-02 16:28:09 +00:00
Benjamin Kramer bde9176663 Fix typos found by http://github.com/lyda/misspell-check
llvm-svn: 157885
2012-06-02 10:20:22 +00:00
Stepan Dyatkovskiy 0e46d8a08c PR1255: case ranges.
IntRange converted from struct to class. So main change everywhere is replacement of ".Low/High" with ".getLow/getHigh()"

llvm-svn: 157884
2012-06-02 09:42:43 +00:00
Stepan Dyatkovskiy 9549f5894b PR1255: case ranges.
IntegersSubsetGeneric, IntegersSubsetMapping: added IntTy template parameter, that allows use either APInt or IntItem. This change allows to write unittest for these classes.

llvm-svn: 157880
2012-06-02 07:26:00 +00:00
Akira Hatanaka 6f3b2a670f Fix a bug in the code which custom-lowers truncating stores in LegalizeDAG.
Check that the SDValue TargetLowering::LowerOperation returns is not null
before replacing the original node with the returned node.

llvm-svn: 157873
2012-06-02 01:10:34 +00:00
Chris Lattner 58268c23ac remove an unused variable.
llvm-svn: 157872
2012-06-02 01:03:42 +00:00
Akira Hatanaka 23327b30ef Remove code which is no longer needed in MipsAsmPrinter and MipsMCInstLower.
llvm-svn: 157867
2012-06-02 00:05:11 +00:00
Akira Hatanaka 019e592f75 Set operation actions for load/store nodes in the Mips backend.
llvm-svn: 157866
2012-06-02 00:04:42 +00:00
Akira Hatanaka f11571d90d Add definitions of 32/64-bit unaligned load/store instructions for Mips.
llvm-svn: 157865
2012-06-02 00:04:19 +00:00
Akira Hatanaka 8f1db778a4 Define functions MipsTargetLowering::LowerLOAD and LowerSTORE which
custom-lower unaligned load and store nodes.

llvm-svn: 157864
2012-06-02 00:03:49 +00:00
Akira Hatanaka b9ebf8d644 Define Mips specific unaligned load/store nodes.
llvm-svn: 157863
2012-06-02 00:03:12 +00:00
Akira Hatanaka 4e76bf8282 Expand unaligned i16 loads/stores for the Mips backend.
This is the first of a series of patches which make changes to the backend to
emit unaligned load/store instructions (lwl,lwr,swl,swr) during instruction
selection.

llvm-svn: 157862
2012-06-02 00:02:45 +00:00
Akira Hatanaka 56bf023a6d In MipsMCInstLower::LowerSymbolOperand, get offset from symbol if
the MachineOperand type has a valid offset. 

llvm-svn: 157861
2012-06-02 00:02:11 +00:00
Jakob Stoklund Olesen 54038d796c Switch all register list clients to the new MC*Iterator interface.
No functional change intended.

Sorry for the churn. The iterator classes are supposed to help avoid
giant commits like this one in the future. The TableGen-produced
register lists are getting quite large, and it may be necessary to
change the table representation.

This makes it possible to do so without changing all clients (again).

llvm-svn: 157854
2012-06-01 23:28:30 +00:00
Bill Wendling e85f34969e Register the gcov "writeout" at init time. Don't list this as a d'tor. Instead,
inject some code in that will run via the "__mod_init_func" method that
registers the gcov "writeout" function to execute at exit time.

The problem is that the "__mod_term_func" method of specifying d'tors is
deprecated on Darwin. And it can lead to some ambiguities when dealing with
multiple libraries.
<rdar://problem/11110106>

llvm-svn: 157852
2012-06-01 23:14:32 +00:00
Jakob Stoklund Olesen ca487d2183 Remove physreg support from adjustCopiesBackFrom and removeCopyByCommutingDef.
After physreg coalescing was disabled, these functions can't do anything
useful with physregs anyway.

llvm-svn: 157849
2012-06-01 22:38:19 +00:00
Jakob Stoklund Olesen 9b09cf0c11 Simplify some more getAliasSet callers.
MCRegAliasIterator can include Reg itself in the list.

llvm-svn: 157848
2012-06-01 22:38:17 +00:00
Rafael Espindola 103c2cfbbd Use dominates(Instruction, Use) in the verifier.
This removes a bit of context from the verifier erros, but reduces code
duplication in a fairly critical part of LLVM and makes dominates easier to test.

llvm-svn: 157845
2012-06-01 21:56:26 +00:00
Chad Rosier f319324082 [arm-fast-isel] Fix handling of the frameaddress intrinsic. If depth is 0
then DestReg is undefined.

llvm-svn: 157840
2012-06-01 21:12:31 +00:00
Jakob Stoklund Olesen 92a0083944 Switch some getAliasSet clients to MCRegAliasIterator.
MCRegAliasIterator can optionally visit the register itself, allowing
for simpler code.

llvm-svn: 157837
2012-06-01 20:36:54 +00:00
Manman Ren 879ca9d47d X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.

llvm-svn: 157831
2012-06-01 19:49:33 +00:00
Manman Ren e873552091 ARM: properly handle alignment for struct byval.
Factor out the expansion code into a function.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 157830
2012-06-01 19:33:18 +00:00
Nuno Lopes adf1c859dd BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache
add regression tests for this problem.

Can already compile & run: PHP, PCRE, and ICU  (i.e., all the software I tried)

llvm-svn: 157822
2012-06-01 17:43:31 +00:00
Hans Wennborg 789acfb63d Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

llvm-svn: 157818
2012-06-01 16:27:21 +00:00
Stepan Dyatkovskiy 66305749f1 PR1255: case ranges.
IntegersSubset devided into IntegersSubsetGeneric and into IntegersSubset itself. The first has no references to ConstantInt and works with IntItem only.
IntegersSubsetMapping also made generic. Here added second template parameter "IntegersSubsetTy" that allows to use on of two IntegersSubset types described below.

llvm-svn: 157815
2012-06-01 16:17:57 +00:00
Chris Lattner cc84e6d2b5 quick fix for PR13006, will check in testcase later.
llvm-svn: 157813
2012-06-01 15:02:52 +00:00
Stepan Dyatkovskiy bd7303b7f7 PR1255: case ranges.
IntItem cleanup. IntItemBase, IntItemConstantIntImp and IntItem merged into IntItem. All arithmetic operators was propogated from APInt. Also added comparison operators <,>,<=,>=. Currently you will find set of macros that propogates operators from APInt to IntItem in the beginning of IntegerSubset. Note that THESE MACROS WILL REMOVED after all passes will case-ranges compatible. Also note that these macros much smaller pain that something like this:
if (V->getValue().ugt(AnotherV->getValue()) { ... }

These changes made IntItem full featured integer object. It allows to make IntegerSubset class generic (move out all ConstantInt references inside and add unit-tests) in next commits.

llvm-svn: 157810
2012-06-01 10:06:14 +00:00
Craig Topper 1d4d62d76c Enable automatic detection of FMA3 support to allow intrinsics to be used.
llvm-svn: 157805
2012-06-01 06:10:14 +00:00
Craig Topper 00649d5111 Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enablle intrinsic use at least though.
llvm-svn: 157804
2012-06-01 06:07:48 +00:00
Craig Topper 2e127b5274 Add VFNSUB* instructions to folding table.
llvm-svn: 157802
2012-06-01 05:48:39 +00:00
Craig Topper 9eadcfdf2a Remove a trailing space and fix a comment.
llvm-svn: 157801
2012-06-01 05:34:01 +00:00
Chris Lattner 466076b95f enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.

llvm-svn: 157800
2012-06-01 05:29:15 +00:00
Craig Topper df09da8355 Tidy up. Remove trailing spaces and fix the worst of the 80 column violations.
llvm-svn: 157799
2012-06-01 05:24:29 +00:00
Chris Lattner 182fe3eef1 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).

llvm-svn: 157798
2012-06-01 05:16:33 +00:00
Chris Lattner 4f3615de97 rearrange some logic, no functionality change.
llvm-svn: 157796
2012-06-01 05:01:15 +00:00
Manman Ren 9f9111651e ARM: support struct byval in llvm
We handle struct byval by inserting a pseudo op, which will be expanded to a
loop at ExpandISelPseudos.
A separate patch for clang will be submitted to enable struct byval.

rdar://9877866

llvm-svn: 157793
2012-06-01 02:44:42 +00:00
Michael J. Spencer 5c502811f1 Fix 80 columns.
llvm-svn: 157788
2012-06-01 00:58:41 +00:00
Eric Christopher 1cf3338bb4 Add support for enum forward declarations.
Part of rdar://11570854

llvm-svn: 157786
2012-06-01 00:22:32 +00:00
Chad Rosier 526772de29 Put the shiny new MCSubRegIterator to work.
llvm-svn: 157783
2012-06-01 00:02:08 +00:00
Nuno Lopes 288e86ff6b add -bounds-checking-multiple-traps option to make one trap BB per check
disabled by default for now; we can discusse the default value (& name) later

llvm-svn: 157777
2012-05-31 22:58:48 +00:00
Nuno Lopes 7d00061d87 revamp BoundsChecking considerably:
- compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic)
 - use APInt throughout to handle wrap-arounds
 - add support for PHI instrumentation
 - add a cache (required for recursive PHIs anyway)
 - remove hoisting support for now, since it was wrong in a few cases

sorry for the churn here.. tests will follow soon.

llvm-svn: 157775
2012-05-31 22:45:40 +00:00
Jakob Stoklund Olesen 4f203ea34b Add support for return value promotion in X86 calling conventions.
Patch by Yiannis Tsiouris!

llvm-svn: 157757
2012-05-31 17:28:20 +00:00
Manman Ren 9bccb64e56 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Jakob Stoklund Olesen fa9d7db17b Add a PrintRegUnit helper similar to PrintReg.
Reg-units are named after their root registers, and most units have a
single root, so they simply print as 'AL', 'XMM0', etc. The rare dual
root reg-units print as FPSCR~FPSCR_NZCV, FP0~ST7, ...

The printing piggybacks on the existing register name tables, so no
extra const data space is required.

llvm-svn: 157754
2012-05-31 17:18:29 +00:00
Joel Jones 585bc82489 Fix typos
llvm-svn: 157752
2012-05-31 17:11:25 +00:00
Rafael Espindola e3c5f3e5b1 Fix typos noticed by Benjamin Kramer.
Also make the checks stronger and test that we reject ranges that overlap
a previous wrapped range.

llvm-svn: 157749
2012-05-31 16:04:26 +00:00
Benjamin Kramer a0396e4583 X86: Rename the CLMUL target feature to PCLMUL.
It was renamed in gcc/gas a while ago and causes all kinds of
confusion because it was named differently in llvm and clang.

llvm-svn: 157745
2012-05-31 14:34:17 +00:00
Rafael Espindola 97d7787788 Require intervals in the range metadata to be in a canonical form: They must
be non contiguous, non overlapping and sorted by the lower end.

While this is technically a backward incompatibility, every frontent currently
produces range metadata with a single interval and we don't have any pass
that merges intervals yet, so no existing bitcode files should be rejected by
this.

llvm-svn: 157741
2012-05-31 13:45:46 +00:00
Elena Demikhovsky 602f3a26d6 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for GodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Duncan Sands 339bb61e32 Enhance the sinking code to handle diamond patterns. Patch by
Carlo Alberto Ferraris.

llvm-svn: 157736
2012-05-31 08:09:49 +00:00
Craig Topper c1ac05dad5 Add intrinsic for pclmulqdq instruction.
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Akira Hatanaka bff8e31d3c Cleanup and factoring of mips16 tablegen classes. Make register classes
CPU16RegsRegClass and CPURARegRegClass available. Add definition of mips16
jalr instruction.

Patch by Reed Kotler.

llvm-svn: 157730
2012-05-31 02:59:44 +00:00
Eric Christopher 368461cad0 Fix typo in assembly directive. Noticed by inspection.
llvm-svn: 157726
2012-05-31 00:53:18 +00:00
Jakob Stoklund Olesen 5541f6026e Avoid depending on list orders and register numbering.
This code is covered by test/CodeGen/ARM/arm-modifier.ll.

llvm-svn: 157720
2012-05-30 23:00:43 +00:00
Jakob Stoklund Olesen 0b97dbcf1a Extract some pointer hacking to a function.
Switch to MCSuperRegIterator while we're there.

llvm-svn: 157717
2012-05-30 22:40:03 +00:00
Jakob Stoklund Olesen 05e2245fc6 Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

llvm-svn: 157715
2012-05-30 21:46:58 +00:00
Eric Christopher f481ab3877 Add support for the mips inline asm 'm' output modifier.
Patch by Jack Carter.

llvm-svn: 157709
2012-05-30 19:05:19 +00:00
Owen Anderson 0eda3e1de6 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
llvm-svn: 157708
2012-05-30 18:54:50 +00:00
Owen Anderson c7aaf523e1 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
llvm-svn: 157707
2012-05-30 18:50:39 +00:00
Chad Rosier fba46a64aa Remove extra space.
llvm-svn: 157706
2012-05-30 18:47:55 +00:00
Benjamin Kramer 406a2db1f6 Make sure that we're dealing with a binary SCEVExpr when simplifying.
llvm-svn: 157704
2012-05-30 18:42:43 +00:00
Jakob Stoklund Olesen ad8103dc7b Fix some uses of getSubRegisters() to use getSubReg() instead.
It is better to address sub-registers directly by name instead of
relying on their position in the sub-register list.

llvm-svn: 157703
2012-05-30 18:40:49 +00:00
Jakob Stoklund Olesen 3a48c06456 Remove some redundant tests.
An empty list is not represented as a null pointer. Let TRI do its own
shortcuts.

llvm-svn: 157702
2012-05-30 18:38:56 +00:00
Benjamin Kramer 50b26ebb2b Teach SCEV's icmp simplification logic that a-b == 0 is equivalent to a == b.
This also required making recursive simplifications until
nothing changes or a hard limit (currently 3) is hit.

With the simplification in place indvars can canonicalize
loops of the form
for (unsigned i = 0; i < a-b; ++i)
into
for (unsigned i = 0; i != a-b; ++i)
which used to fail because SCEV created a weird umax expr
for the backedge taken count.

llvm-svn: 157701
2012-05-30 18:32:23 +00:00
Chris Lattner 1622a99e58 it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more.
llvm-svn: 157699
2012-05-30 18:08:02 +00:00
Chris Lattner 04d722a68d Extend the (abi-irrelevant) return convention to be able to return more than two values in
integer registers.  This is already supported by the fastcc convention, but it doesn't
hurt to support it in the standard conventions as well.

In cases where we can cheat at the calling convention, this allows us to avoid returning
things through memory in more cases.

llvm-svn: 157698
2012-05-30 17:50:14 +00:00
Chad Rosier 820d248c4d [arm-fast-isel] Add support for the llvm.frameaddress() intrinsic.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 157696
2012-05-30 17:23:22 +00:00
Benjamin Kramer f1e0b6cdf7 Port support for SSE4a extrq/insertq to the old jit code emitter.
llvm-svn: 157685
2012-05-30 09:13:55 +00:00
Kostya Serebryany 9024160439 [asan] instrument cmpxchg and atomicrmw
llvm-svn: 157683
2012-05-30 09:04:06 +00:00
Andrew Trick a3f9043196 SCEV: Handle a corner case reducing AddRecExpr * AddRecExpr
If integer overflow causes one of the terms to reach zero, that can
force the entire expression to zero.

Fixes PR12929: cast<Ty>() argument of incompatible type

llvm-svn: 157673
2012-05-30 03:35:20 +00:00
Andrew Trick 946f76bf33 Reformat the loop that does AddRecExpr * AddRecExpr reduction.
No functionality.

llvm-svn: 157672
2012-05-30 03:35:17 +00:00
Evan Cheng bc2453dd3d Teach taildup to update livein set. rdar://11538365
llvm-svn: 157663
2012-05-30 00:42:39 +00:00
Evan Cheng 50954fb3e1 If-converter models predicated defs as read + write. The read should be marked as 'undef' since it may not already be live. This appeases -verify-machineinstrs.
llvm-svn: 157662
2012-05-30 00:42:02 +00:00
Bob Wilson 33e5188c27 Add an insertPass API to TargetPassConfig. <rdar://problem/11498613>
Besides adding the new insertPass function, this patch uses it to
enhance the existing -print-machineinstrs so that the MachineInstrs
after a specific pass can be printed.

Patch by Bin Zeng!

llvm-svn: 157655
2012-05-30 00:17:12 +00:00
Nuno Lopes 8bd45f8ecd bounds checking:
- hoist checks out of loops where SCEV is smart enough
 - add additional statistics to measure how much we loose for not supporting interprocedural and pointers loaded from memory

llvm-svn: 157649
2012-05-29 22:32:51 +00:00
Evan Cheng 76f6e2671a Optional def can be either a def or a use (of reg0).
llvm-svn: 157640
2012-05-29 19:40:44 +00:00
Benjamin Kramer ef479ea854 Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Lang Hames e256f71937 Clear the entering, exiting and internal ranges of a bundle before collecting
ranges for the instruction about to be bundled. This fixes a bug in an external
project where an assertion was triggered due to spurious 'multiple defs' within
the bundle.

Patch by Ivan Llopard. Thanks Ivan!

llvm-svn: 157632
2012-05-29 18:19:54 +00:00
Nicolas Geoffray 312b28ce9d Update CPPBackend to new API for AttrListPtr::get.
llvm-svn: 157624
2012-05-29 15:07:18 +00:00
Stepan Dyatkovskiy 58107dd547 ConstantRangesSet renamed to IntegersSubset. CRSBuilder renamed to IntegersSubsetMapping.
llvm-svn: 157612
2012-05-29 12:26:47 +00:00
Peter Collingbourne 913869be45 Add llvm.fabs intrinsic.
llvm-svn: 157594
2012-05-28 21:48:37 +00:00
Benjamin Kramer 9d5849f51d Fix suspicous hasOneUse() check, found by PVS Studio (PR12357).
llvm-svn: 157592
2012-05-28 20:52:48 +00:00
Benjamin Kramer b8743a9150 InstCombine: Fix infinite loop when encountering switch on trivial icmp.
The test case feeds the following into InstCombine's visitSelect:
%tobool8 = icmp ne i32 0, 0
%phitmp = select i1 %tobool8, i32 3, i32 0
Then instcombine replaces the right side of the switch with 0, doesn't notice
that nothing changes and tries again indefinitely.

This fixes PR12897.

llvm-svn: 157587
2012-05-28 19:18:16 +00:00
David Blaikie 1ab17ed57e Remove unused variable.
llvm-svn: 157586
2012-05-28 18:23:36 +00:00
Meador Inge e17b69a373 PR12696: Attribute bits above 1<<30 are not encoded in bitcode
Attribute bits above 1<<30 are now encoded correctly.  Additionally,
the encoding/decoding functionality has been hoisted to helper functions
in Attributes.h in an effort to help the encoding/decoding to stay in
sync with the Attribute bitcode definitions.

llvm-svn: 157581
2012-05-28 15:45:43 +00:00
Benjamin Kramer 9704ed0304 Random BitcodeReader cleanups.
llvm-svn: 157577
2012-05-28 14:10:31 +00:00
Stepan Dyatkovskiy e3e19cbb13 PR1255: Case Ranges
Implemented IntItem - the wrapper around APInt. Why not to use APInt item directly right now?
1. It will very difficult to implement case ranges as series of small patches. We got several large and heavy patches. Each patch will about 90-120 kb. If you replace ConstantInt with APInt in SwitchInst you will need to changes at the same time all Readers,Writers and absolutely all passes that uses SwitchInst.
2. We can implement APInt pool inside and save memory space. E.g. we use several switches that works with 256 bit items (switch on signatures, or strings). We can avoid value duplicates in this case.
3. IntItem can be easyly easily replaced with APInt.
4. Currenly we can interpret IntItem both as ConstantInt and as APInt. It allows to provide SwitchInst methods that works with ConstantInt for non-updated passes.

Why I need it right now? Currently I need to update SimplifyCFG pass (EqualityComparisons). I need to work with APInts directly a lot, so peaces of code
ConstantInt *V = ...;
if (V->getValue().ugt(AnotherV->getValue()) {
  ...
}
will look awful. Much more better this way:
IntItem V = ConstantIntVal->getValue();
if (AnotherV < V) {
}

Of course any reviews are welcome.

P.S.: I'm also going to rename ConstantRangesSet to IntegersSubset, and CRSBuilder to IntegersSubsetMapping (allows to map individual subsets of integers to the BasicBlocks).
Since in future these classes will founded on APInt, it will possible to use them in more generic ways.

llvm-svn: 157576
2012-05-28 12:39:09 +00:00
Bill Wendling 1560517ec3 Implement the indirect counter increment code in a better way. Instead of
replicating the code for every place it's needed, we instead generate a function
that does that for us. This function is local to the executable, so there
shouldn't be any writing violations.

llvm-svn: 157564
2012-05-28 06:10:56 +00:00
Chris Lattner 3cb6f83ebb switch AttrListPtr::get to take an ArrayRef, simplifying a lot of clients.
llvm-svn: 157556
2012-05-28 01:47:44 +00:00
Chris Lattner 5be972d8a2 simplify code.
llvm-svn: 157555
2012-05-28 01:37:08 +00:00
Benjamin Kramer 152f106e5f PR12967: Don't crash when trying to fold a shift that's larger than the type's size.
llvm-svn: 157548
2012-05-27 22:03:32 +00:00
Chris Lattner 144b619684 Reimplement the intrinsic verifier to use the same table as Intrinsic::getDefinition,
making it stronger and more sane.

Delete the code from tblgen that produced the old code.

Besides being a path forward in intrinsic sanity, this also eliminates a bunch of
machine generated code that was compiled into Function.o

llvm-svn: 157545
2012-05-27 19:37:05 +00:00
Peter Collingbourne 4d358b55fa Have getOrCreateSubprogramDIE store the DIE for a subprogram
definition in the map before calling itself to retrieve the
DIE for the declaration.  Without this change, if this causes
getOrCreateSubprogramDIE to be recursively called on the definition,
it will create multiple DIEs for that definition.  Fixes PR12831.

llvm-svn: 157541
2012-05-27 18:36:44 +00:00
Chris Lattner f39c278384 move some code around so that Verifier.cpp can get access to the intrinsic info table.
llvm-svn: 157540
2012-05-27 18:28:35 +00:00
Chris Lattner c464416107 enhance the intrinsic info table to encode what *kind* of Any argument
it is (at the cost of 45 bytes of extra table space) so that the verifier can
start using it.

llvm-svn: 157536
2012-05-27 16:39:08 +00:00
NAKAMURA Takumi b030e8a5c8 Path::GetTemporaryDirectory(): Add an assertion if TempDirectory is alive, to check when someone would remove the tempdir.
llvm-svn: 157529
2012-05-27 13:02:04 +00:00
Benjamin Kramer abb3fa69b4 Missed parens.
llvm-svn: 157527
2012-05-27 10:56:55 +00:00
Benjamin Kramer 4b8f8e75e6 r157525 didn't work, just disable iterator checking.
This is obviosly right but I don't see how to do this with proper vector
iterators without building a horrible mess of workarounds.

llvm-svn: 157526
2012-05-27 10:24:52 +00:00
Benjamin Kramer 48ff2751c1 SDAGBuilder: Avoid iterator invalidation harder.
vector.begin()-1 is invalid too.

llvm-svn: 157525
2012-05-27 09:44:52 +00:00
Benjamin Kramer 5aad872f8c SDAGBuilder: Don't create an invalid iterator when there is only one switch case.
Found by libstdc++'s debug mode.

llvm-svn: 157522
2012-05-26 21:19:12 +00:00
Benjamin Kramer f2beccf6b4 SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

llvm-svn: 157521
2012-05-26 20:01:32 +00:00
Duncan Sands 3c05cd3ea8 Since commit 157467, if reassociate isn't actually going to change an expression
then it doesn't alter the instructions composing it, however it would continue
to move the instructions to just before the expression root.  Ensure it doesn't
move them either, so now it really does nothing if there is nothing to do.  That
commit also ensured that nsw etc flags weren't cleared if the expression was not
being changed.  Tweak this a bit so that it doesn't clear flags on the initial
part of a computation either if that part didn't change but later bits did.

llvm-svn: 157518
2012-05-26 16:42:52 +00:00
Benjamin Kramer 58abf4f193 SimplifyCFG: Turn the ad-hoc std::pair that represents switch cases into an explicit struct.
llvm-svn: 157516
2012-05-26 14:29:37 +00:00
Benjamin Kramer 65e75666ff Add support for branch weight metadata to MDBuilder and use it in various places.
llvm-svn: 157515
2012-05-26 13:59:43 +00:00
Benjamin Kramer 484f4247aa ScoreboardHazardRecognizer: Remove dead conditional in debug code.
Negative cycles are filtered out earlier.

llvm-svn: 157514
2012-05-26 11:37:37 +00:00
Duncan Sands c94ac6fdf6 Move this debug statement earlier so it is easy to see the order in
which operands come flying out of the linearization stage.

llvm-svn: 157512
2012-05-26 07:47:48 +00:00
Bill Wendling 8ed0749a34 The llvm_gcda_increment_indirect_counter function writes to the arguments that
are passed in. However, those arguments may be in a write-protected area, as far
as the runtime library is concerned. For instance, the data could be placed into
a 'linkedit' section, which isn't writable. Emit the code from
llvm_gcda_increment_indirect_counter directly into the function instead.

Note: The code for this is ugly, and can lead to bloat. We should look into
simplifying this code instead of having all of these branches.

<rdar://problem/11181370>

llvm-svn: 157505
2012-05-25 23:55:00 +00:00
Akira Hatanaka 5cec9007bb Fix predicate HasStandardEncoding in MipsInstrInfo.td per suggestion of
Benjamin Kramer.

llvm-svn: 157504
2012-05-25 22:15:15 +00:00
Nuno Lopes e9b0bdf804 bounds checking: add support for byval arguments
llvm-svn: 157498
2012-05-25 21:15:17 +00:00
Akira Hatanaka 03968fac4f Delete MipsExpandPseudo.cpp.
llvm-svn: 157496
2012-05-25 20:54:48 +00:00
Akira Hatanaka d0ac2c93d3 Move the code in MipsExpandPseudo to MipsInstrInfo::expandPostRAPseudo.
Delete MipsExpandPseudo.

llvm-svn: 157495
2012-05-25 20:52:52 +00:00
Akira Hatanaka f4554485cb Remove the code that expands MIPS' .cpload directive.
llvm-svn: 157494
2012-05-25 20:46:52 +00:00
Akira Hatanaka 5de59266cd Remove the code that emits MIPS' .cprestore directive.
llvm-svn: 157493
2012-05-25 20:42:55 +00:00
Akira Hatanaka 4d9b017ef2 Remove pseudo instructions that are no longer used.
llvm-svn: 157492
2012-05-25 20:37:40 +00:00
Nuno Lopes a6da3ff896 boundschecking:
add support for select
add experimental support for alloc_size metadata

llvm-svn: 157481
2012-05-25 16:54:04 +00:00
Justin Holewinski aa58397b3c Change interface for TargetLowering::LowerCallTo and TargetLowering::LowerCall
to pass around a struct instead of a large set of individual values.  This
cleans up the interface and allows more information to be added to the struct
for future targets without requiring changes to each and every target.

NV_CONTRIB

llvm-svn: 157479
2012-05-25 16:35:28 +00:00
Duncan Sands bddfb2f96b Make the reassociation pass more powerful so that it can handle expressions
with arbitrary topologies (previously it would give up when hitting a diamond
in the use graph for example).  The testcase from PR12764 is now reduced from
a pile of additions to the optimal 1617*%x0+208.  In doing this I changed the
previous strategy of dropping all uses for expression leaves to one of dropping
all but one use.  This works out more neatly (but required a bunch of tweaks)
and is also safer: some recently fixed bugs during recursive linearization were
because the linearization code thinks it completely owns a node if it has no uses
outside the expression it is linearizing.  But if the node was also in another
expression that had been linearized (and thus all uses of the node from that
expression dropped) then the conclusion that it is completely owned by the
expression currently being linearized is wrong.  Keeping one use from within each
linearized expression avoids this kind of mistake.

llvm-svn: 157467
2012-05-25 12:03:02 +00:00
Andrew Trick 4e7f6a7702 misched: trace formatting
llvm-svn: 157455
2012-05-25 02:02:39 +00:00
Jakob Stoklund Olesen 49ea89ee2d Compress MCRegisterInfo register name tables.
Store (debugging) register names as offsets into a string table instead
of as char pointers.

llvm-svn: 157449
2012-05-25 00:21:41 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Kaelyn Uhrain 85d8f0cba8 Silence unused variable warnings from when assertions are disabled.
llvm-svn: 157438
2012-05-24 23:37:49 +00:00
Andrew Trick a306a8a844 misched: Use the same scheduling heuristics with -misched-topdown/bottomup.
(except the part about choosing direction)

llvm-svn: 157437
2012-05-24 23:11:17 +00:00
Jakob Stoklund Olesen ff7fd4543f Shrink.
llvm-svn: 157433
2012-05-24 22:17:44 +00:00
Jakob Stoklund Olesen 36a5c8e550 Add support for range expressions in TableGen foreach loops.
Like this:

  foreach i = 0-127 in ...

Use braces for composite ranges:

  foreach i = {0-3,9-7} in ...

llvm-svn: 157432
2012-05-24 22:17:39 +00:00
Jakob Stoklund Olesen 74fd80e8fc Don't put TGParser scratch results in the output.
Only fully expanded Records should go into RecordKeeper.

llvm-svn: 157431
2012-05-24 22:17:36 +00:00
Jakob Stoklund Olesen 8a120b10bd Simplify TGParser::ProcessForEachDefs.
Use static type checking.

llvm-svn: 157430
2012-05-24 22:17:33 +00:00
Andrew Trick 79d3eecbb4 misched: Trace regpressure.
llvm-svn: 157429
2012-05-24 22:11:14 +00:00
Andrew Trick a8ad5f7c7b misched: Give each ReadyQ a unique ID
llvm-svn: 157428
2012-05-24 22:11:12 +00:00
Andrew Trick 61f1a278b8 misched: Added ScoreboardHazardRecognizer.
The Hazard checker implements in-order contraints, or interlocked
resources. Ready instructions with hazards do not enter the available
queue and are not visible to other heuristics.

The major code change is the addition of SchedBoundary to encapsulate
the state at the top or bottom of the schedule, including both a
pending and available queue.

The scheduler now counts cycles in sync with the hazard checker. These
are minimum cycle counts based on known hazards.

Targets with no itinerary (x86_64) currently remain at cycle 0. To fix
this, we need to provide some maximum issue width for all targets. We
also need to add the concept of expected latency vs. minimum latency.

llvm-svn: 157427
2012-05-24 22:11:09 +00:00
Andrew Trick ca47335461 misched: Release bottom roots in reverse order.
llvm-svn: 157426
2012-05-24 22:11:05 +00:00
Andrew Trick dd375dd34a misched: rename ReadyQ class
llvm-svn: 157425
2012-05-24 22:11:03 +00:00
Andrew Trick f378617773 misched: copy comments so compareRPDelta is readable by itself.
llvm-svn: 157424
2012-05-24 22:11:01 +00:00
Andrew Trick d5326aea81 regpressure: Added RegisterPressure::dump
llvm-svn: 157423
2012-05-24 22:10:59 +00:00
Andrew Trick b2c172e20a regpressure: physreg livein/out fix
llvm-svn: 157422
2012-05-24 22:10:57 +00:00
Justin Holewinski 907f7606f2 Remove the PTX back-end and all of its artifacts (triple, etc.)
This back-end was deprecated in favor of the NVPTX back-end.

NV_CONTRIB

llvm-svn: 157417
2012-05-24 21:38:21 +00:00
Akira Hatanaka a649cc75b3 Turn on mips16 pseudo op when compiling for mips16.
Expand test case for this.

Patch by Reed Kotler.

llvm-svn: 157410
2012-05-24 18:37:43 +00:00
Akira Hatanaka df98a7a34d Enable Mips16 compiler to compile a null program.
First code from the Mips16 compiler. Includes trivial test program.

Patch by Reed Kotler.

llvm-svn: 157408
2012-05-24 18:32:33 +00:00
David Blaikie 4fd12a6abc Silence Clang's -Wlogical-op-parentheses warning.
I'm not sure it's really worth expressing this as a range rather than 3 specific equalities, but it doesn't seem fundamentally wrong either.

llvm-svn: 157398
2012-05-24 17:11:00 +00:00
Tobias Grosser 6b31d170a4 Add half support to LLVM (for OpenCL)
Submitted by: Anton Lokhmotov  <Anton.Lokhmotov@arm.com>

Approved by: o Anton Korobeynikov
             o Micah Villmow
             o David Neto

llvm-svn: 157393
2012-05-24 15:59:06 +00:00
Stepan Dyatkovskiy 183d18aa5a PR1255 related changes (case ranges):
LowerSwitch::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.
test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage.

llvm-svn: 157384
2012-05-24 09:33:20 +00:00
Patrik Hägglund 26dcb95dfc Fix -Wcovered-switch-default warning.
llvm-svn: 157381
2012-05-24 07:51:46 +00:00
Craig Topper bdf39a46a3 Convert assert(0) to llvm_unreachable.
llvm-svn: 157380
2012-05-24 07:02:50 +00:00
Craig Topper 9520719b9b Mark some static arrays as const.
llvm-svn: 157377
2012-05-24 06:35:32 +00:00
Craig Topper 273b0d7be5 Use uint16_t to store registers in static tables. Matches other tables.
llvm-svn: 157375
2012-05-24 06:09:56 +00:00
Craig Topper be064d0136 Use uint16_t to store register number in static tables to match other tables.
llvm-svn: 157374
2012-05-24 05:55:47 +00:00
Craig Topper 01736f866a Make some opcode tables static and const. Allows code to avoid making copies to pass the tables around.
llvm-svn: 157373
2012-05-24 05:17:00 +00:00
Craig Topper e4260f911b Mark a couple arrays as static and const. Use array_lengthof instead of sizeof/sizeof.
llvm-svn: 157369
2012-05-24 04:22:05 +00:00
Craig Topper 42b96d1b74 Mark a static array as const.
llvm-svn: 157368
2012-05-24 04:11:15 +00:00
Craig Topper 2fbd130a79 Mark a static table as const. Shrink opcode size in static tables to uint16_t. Simplify loop iterating over one of those tables. No functional change intended.
llvm-svn: 157367
2012-05-24 03:59:11 +00:00
Chad Rosier 20b79dc40e Tidy up naming for consistency and other cleanup. No functional change intended.
llvm-svn: 157358
2012-05-23 23:45:10 +00:00
Jakob Stoklund Olesen 0ce90494e6 Add a last resort tryInstructionSplit() to RAGreedy.
Live ranges with a constrained register class may benefit from splitting
around individual uses. It allows the remaining live range to use a
larger register class where it may allocate. This is like spilling to a
different register class.

This is only attempted on constrained register classes.

<rdar://problem/11438902>

llvm-svn: 157354
2012-05-23 22:37:27 +00:00
Bill Wendling e351e8c52d Forgot to reverse conditional.
llvm-svn: 157349
2012-05-23 22:12:50 +00:00
Bill Wendling 041793c452 Reduce indentation by early detection of 'continue'. No functionality change.
llvm-svn: 157348
2012-05-23 22:09:50 +00:00
Jakob Stoklund Olesen 5b8f476037 Correctly deal with identity copies in RegisterCoalescer.
Now that the coalescer keeps live intervals and machine code in sync at
all times, it needs to deal with identity copies differently.

When merging two virtual registers, all identity copies are removed
right away. This means that other identity copies must come from
somewhere else, and they are going to have a value number.

Deal with such copies by merging the value numbers before erasing the
copy instruction. Otherwise, we leave dangling value numbers in the live
interval.

This fixes PR12927.

llvm-svn: 157340
2012-05-23 20:21:06 +00:00
Chad Rosier 223faf719c [arm-fast-isel] Add support for non-global callee.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 157336
2012-05-23 18:38:57 +00:00
Nuno Lopes 10287d839f BoundsChecking: add a couple of simple tests and fix a bug in branch emition
llvm-svn: 157329
2012-05-23 16:24:52 +00:00
Nuno Lopes 561dae0947 revert r156383: removal of TYPE_CODE_FUNCTION_OLD
Apparently LLVM only stopped emitting this after LLVM 3.0

llvm-svn: 157325
2012-05-23 15:19:39 +00:00
Patrik Hägglund 8a1e316c15 Fix the inliner so that the optsize function attribute don't alter the
inline threshold if the global inline threshold is lower (as for -Oz).

Reviewed by Chandler Carruth and Bill Wendling.

llvm-svn: 157323
2012-05-23 13:42:57 +00:00
Patrik Hägglund ca210d8432 Fixed typo in r156905.
llvm-svn: 157320
2012-05-23 12:34:56 +00:00
Patrik Hägglund 94537c2a06 Small fix for the debug output from PBQP (PR12822).
llvm-svn: 157319
2012-05-23 12:12:58 +00:00
Evgeniy Stepanov 617232f32b Use zero-based shadow by default on Android.
llvm-svn: 157317
2012-05-23 11:52:12 +00:00
Stepan Dyatkovskiy 7a50155227 PR1255(case ranges) related changes in Local Transformations.
llvm-svn: 157315
2012-05-23 08:18:26 +00:00
Craig Topper a4fd6d655a Tidy up spacing.
llvm-svn: 157313
2012-05-23 05:44:51 +00:00
Chris Lattner 4f18aa8f04 small refinement to r157218 to save a tiny amount of table size in the common
case.

llvm-svn: 157312
2012-05-23 05:19:18 +00:00
Craig Topper 9fc5c814fa Fix indentation of wrapped line for readability. No functional change.
llvm-svn: 157309
2012-05-23 03:59:53 +00:00
Eric Christopher c49643586b Add support for C++11 enum classes in llvm.
Part of rdar://11496790

llvm-svn: 157303
2012-05-23 00:09:20 +00:00
Nuno Lopes 59e9df773a address some of John Criswell's comments
teach computeAllocSize about realloc, reallocf, and valloc

llvm-svn: 157298
2012-05-22 22:02:19 +00:00
NAKAMURA Takumi 70c1aa0bb5 ARMDisassembler.cpp: Fix utf8 char in comments.
llvm-svn: 157292
2012-05-22 21:47:02 +00:00
Eric Christopher d42b92f5c3 Untabify and 80-col.
llvm-svn: 157274
2012-05-22 18:45:24 +00:00
Eric Christopher 775cbd2b47 Formatting consistency.
llvm-svn: 157273
2012-05-22 18:45:18 +00:00
Nuno Lopes eee43e1bc7 hopefully fix the CMake build. sorry for breakage
llvm-svn: 157264
2012-05-22 17:40:46 +00:00
Andrew Trick a7a3de1bcf LSR fix: add a missing phi check during IV hoisting.
Fixes PR12898: SCEVExpander crash.

llvm-svn: 157263
2012-05-22 17:39:59 +00:00
Nuno Lopes a2f6cecb6d add a new pass to instrument loads and stores for run-time bounds checking
move EmitGEPOffset from InstCombine to Transforms/Utils/Local.h

(a draft of this) patch reviewed by Andrew, thanks.

llvm-svn: 157261
2012-05-22 17:19:09 +00:00
Nuno Lopes ad40c0a425 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen 924279ca0e Only erase virtregs with no uses left.
Also make sure registers aren't erased twice if the dead def mentions
the register twice.

This fixes PR12911.

llvm-svn: 157254
2012-05-22 14:52:12 +00:00
Duncan Sands 4df5e96d3a Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the
leader table.  That's because it wasn't expecting instructions to turn up as
leader for a value number that is not its own, but equality propagation could
create this situation.  One solution is to have the leader table use a WeakVH
but this slows down GVN by about 5%.  Instead just have equality propagation not
add instructions to the leader table, only constants and arguments.  In theory
this might cause GVN to run more (each time it changes something it runs again)
but it doesn't seem to occur enough to cause a slow down.

llvm-svn: 157251
2012-05-22 14:17:53 +00:00
Craig Topper 53b4b73be9 Fix constant used for pshufb mask when lowering v16i8 shuffles. Bug introduced in r157043. Fixes PR12908.
llvm-svn: 157236
2012-05-22 06:09:38 +00:00
Akira Hatanaka cdf4fd8267 This patch adds a predicate to existing mips32 and mips64 so that those
instruction encodings can be excluded during mips16 processing.

This revision fixes the issue raised by Jim Grosbach.

bool hasStandardEncoding() const { return !inMips16Mode(); }

When micromips is added it will be

bool StandardEncoding() const { return !inMips16Mode()&&  !inMicroMipsMode(); }

No additional testing is needed other than to assure that there is no regression
from this patch.

Patch by Reed Kotler.

llvm-svn: 157234
2012-05-22 03:10:09 +00:00
Jim Grosbach 2597f83889 ARM: .end_data_region mismatch in Thumb2.
32-bit offset jump tables just use real branch instructions and so aren't
marked as data regions. We were still emitting the .end_data_region
marker though, which assert()ed.

rdar://11499158

llvm-svn: 157221
2012-05-21 23:34:42 +00:00
Pete Cooper 243efd7ac3 Added address space qualifier to intrinsic PointerType arguments.
llvm-svn: 157218
2012-05-21 23:21:28 +00:00
Owen Anderson f2118ea826 Fix use of an unitialized value in the LegalizeOps expansion for ISD::SUB. No in-tree targets exercise this path.
Patch by Micah Villmow.

llvm-svn: 157215
2012-05-21 22:39:20 +00:00
Jim Grosbach 19a7bcedb1 Thumb2: RSB source register should be rGRP not GPRnopc.
t2RSB defined the operand correctly, but tRSBS didn't.

llvm-svn: 157200
2012-05-21 17:57:17 +00:00
Dan Gohman 9c97eea0fd Mark an unreachable region of code with llvm_unreachable.
llvm-svn: 157197
2012-05-21 17:41:28 +00:00
Chad Rosier 5d1f5d2be3 Typo.
llvm-svn: 157195
2012-05-21 17:13:41 +00:00
Owen Anderson 778056b7e6 Make it so that the MArch, MCPU, MAttrs passed to EngineBuilder are actually used.
Patch by Jose Fonseca.

llvm-svn: 157191
2012-05-21 16:57:17 +00:00
Stepan Dyatkovskiy e89dafd876 PR1255 (case ranges: work with ConstantRangesSet instead of ConstantInt) related changes for Execution and Verifier.
llvm-svn: 157183
2012-05-21 10:44:40 +00:00
Craig Topper e88f2fd4f7 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
llvm-svn: 157175
2012-05-21 06:40:16 +00:00
Jakob Stoklund Olesen 29268b50f2 Give a small negative bias to giant edge bundles.
This helps compile time when the greedy register allocator splits live
ranges in giant functions. Without the bias, we would try to grow
regions through the giant edge bundles, usually to find out that the
region became too big and expensive.

If a live range has many uses in blocks near the giant bundle, the small
negative bias doesn't make a big difference, and we still consider
regions including the giant edge bundle.

Giant edge bundles are usually connected to landing pads or indirect
branches.

llvm-svn: 157174
2012-05-21 03:11:23 +00:00
Jakob Stoklund Olesen a7c3d2f902 Clear kill flags on the fly when joining intervals.
With physreg joining out of the way, it is easy to recognize the
instructions that need their kill flags cleared while testing for
interference.

This allows us to skip the final scan of all instructions for an 11%
speedup of the coalescer pass.

llvm-svn: 157169
2012-05-20 21:41:05 +00:00
Jakob Stoklund Olesen 38dcd598f9 Make the global base reg GR32_NOSP.
It can sometimes be used in addressing modes that don't support %ESP.

llvm-svn: 157165
2012-05-20 18:43:00 +00:00
Jakob Stoklund Olesen 2f06a6579c Constrain regclasses in PeepholeOptimizer.
It can be necessary to restrict to a sub-class before accessing
sub-registers.

llvm-svn: 157164
2012-05-20 18:42:55 +00:00
Jakob Stoklund Olesen 00f07dec0c Constrain register classes in TailDup.
When rewriting operands, make sure the new registers have a compatible
register class.

llvm-svn: 157163
2012-05-20 18:42:51 +00:00
Peter Collingbourne 8eb05fd093 When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

llvm-svn: 157162
2012-05-20 18:36:15 +00:00
Benjamin Kramer a40555826e Emit memcmp directly from the StringMatcherEmitter.
There should be no difference in the resulting binary, given a sufficiently
smart compiler. However we already had compiler timeouts on the generated
code in Intrinsics.gen, this hopefully makes the lives of slow buildbots a
little easier.

llvm-svn: 157161
2012-05-20 18:10:42 +00:00
Benjamin Kramer 76004e69a6 Plug a leak when using MCJIT.
Found by valgrind.

llvm-svn: 157160
2012-05-20 17:24:08 +00:00
Hal Finkel 601f555eee Add a missing PPC 64-bit stwu pattern.
This seems to fix the remaining compile-time failures on PPC64 when
compiling with -enable-ppc-preinc.

llvm-svn: 157159
2012-05-20 17:11:24 +00:00
Benjamin Kramer a7c2c41c3c Use TargetMachine's register info instead of creating a new one and leaking it.
llvm-svn: 157155
2012-05-20 11:24:27 +00:00
Jakob Stoklund Olesen 691ae3388f Use the right register class for LDRrs.
llvm-svn: 157152
2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen 4fd0e4f415 Transfer memory operands to the right instruction.
They need to go on the PICLDR as the verifier points out.

llvm-svn: 157151
2012-05-20 06:38:42 +00:00
Jakob Stoklund Olesen 1f1c6add10 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

llvm-svn: 157150
2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen a103a516c6 Properly constrain register classes in 2-addr.
X86 has 2-addr instructions with different constraints on the tied def
and use operands. One is GR32, one is GR32_NOSP.

llvm-svn: 157149
2012-05-20 06:38:32 +00:00
Jakob Stoklund Olesen b8f950650b Missed a push_back in r157147.
llvm-svn: 157148
2012-05-20 05:28:53 +00:00
Jakob Stoklund Olesen d0a38a8daa Avoid deleting extra copies when RegistersDefinedFromSameValue is true.
This function adds copies to be erased to DupCopies, avoid also adding
them to DeadCopies.

llvm-svn: 157147
2012-05-20 04:52:48 +00:00
Jakob Stoklund Olesen 64d82b74dd Fix build bots.
Avoid looking at the operands of a potentially erased instruction.

llvm-svn: 157146
2012-05-20 03:57:12 +00:00
Jakob Stoklund Olesen 02d83e3b8b LiveRangeQuery simplifies shrinkToUses().
llvm-svn: 157145
2012-05-20 02:54:52 +00:00
Jakob Stoklund Olesen abc8c3d3ce Use LiveRangeQuery in ScheduleDAGInstrs.
llvm-svn: 157144
2012-05-20 02:44:38 +00:00
Jakob Stoklund Olesen 58165b92e6 Eliminate some uses of struct LiveRange.
That struct ought to be a LiveInterval implementation detail.

llvm-svn: 157143
2012-05-20 02:44:36 +00:00
Jakob Stoklund Olesen 2aeead4bf6 Use LiveRangeQuery instead of getLiveRangeContaining().
llvm-svn: 157142
2012-05-20 02:44:33 +00:00
Peter Collingbourne 9a03c73297 Do not pass an invalid domtree to SimplifyInstruction from
LoopUnswitch.  Fixes PR12887.

llvm-svn: 157140
2012-05-20 01:32:09 +00:00
Jakob Stoklund Olesen 4e1e43a355 Simplify overlap check.
llvm-svn: 157137
2012-05-19 23:59:27 +00:00
Jakob Stoklund Olesen a34a69ce0c Fix 12892.
Dead code elimination during coalescing could cause a virtual register
to be split into connected components. The following rewriting would be
confused about the already joined copies present in the code, but
without a corresponding value number in the live range.

Erase all joined copies instantly when joining intervals such that the
MI and LiveInterval representations are always in sync.

llvm-svn: 157135
2012-05-19 23:34:59 +00:00
Peter Collingbourne 97b1076435 Do not eliminate allocas whose alignment exceeds that of the
copied-in constant, as a subsequent user may rely on over alignment.
Fixes PR12885.

llvm-svn: 157134
2012-05-19 22:52:10 +00:00
Hal Finkel 66b0c93553 Add a FIXME about access to negative stack-pointer offsets on PPC32.
The current code will generate a prologue which starts with something like:
        mflr 0
        stw 31, -4(1)
        stw 0, 4(1)
        stwu 1, -16(1)

But under the PPC32 SVR4 ABI, access to negative offsets from R1 is not allowed.

This was pointed out by Peter Bergner.

llvm-svn: 157133
2012-05-19 21:52:55 +00:00
Jakob Stoklund Olesen e59d0c3252 Remove the late DCE in RegisterCoalescer.
Dead code and joined copies are now eliminated on the fly, and there is
no need for a post pass.

This makes the coalescer work like other modern register allocator
passes: Code is changed on the fly, there is no pending list of changes
to be committed.

llvm-svn: 157132
2012-05-19 21:02:31 +00:00
Jakob Stoklund Olesen 25ced18407 Erase joined copies immediately.
The late dead code elimination is no longer necessary.

The test changes are cause by a register hint that can be either %rdi or
%rax. The choice depends on the use list order, which this patch changes.

llvm-svn: 157131
2012-05-19 20:54:07 +00:00
Jakob Stoklund Olesen 1b707c8817 Fix an ancient bug in removeCopyByCommutingDef().
Before rewriting uses of one value in A to register B, check that there
are no tied uses. That would require multiple A values to be rewritten.

This bug can't bite in the current version of the code for a fairly
subtle reason: A tied use would have caused 2-addr to insert a copy
before the use. If the copy has been coalesced, it will be found by the
same loop changed by this patch, and the optimization is aborted.

This was exposed by 400.perlbench and lua after applying a patch that
deletes joined copies aggressively.

llvm-svn: 157130
2012-05-19 20:54:03 +00:00
Nadav Rotem c93e91da27 On Haswell, perfer storing YMM registers using a single instruction.
llvm-svn: 157129
2012-05-19 20:30:08 +00:00
Nadav Rotem 900c7cb7ce Add support for additional in-reg vbroadcast patterns
llvm-svn: 157127
2012-05-19 19:57:37 +00:00
Jakob Stoklund Olesen d05148ba89 Collect inflatable virtual registers on the fly.
There is no reason to defer the collection of virtual registers whose
register class may be replaced with a larger class.

llvm-svn: 157125
2012-05-19 19:25:00 +00:00
Benjamin Kramer 1ed0fa452c Move CallbackVHs dtor inline, it can be devirtualized in many cases. Move the other virtual methods out of line as they are only called from within Value.cpp anyway.
llvm-svn: 157123
2012-05-19 19:15:25 +00:00
Craig Topper 1964b6d39d Tidy up some spacing and inconsistent use of pre/post increment. No functional change intended.
llvm-svn: 157122
2012-05-19 19:14:18 +00:00
Stepan Dyatkovskiy 79a0d80d51 Ordinary PR1255 patch: DifferenceEngine and CPPBackend adopted to the new SwitchInst methods.
llvm-svn: 157112
2012-05-19 13:14:30 +00:00
Craig Topper 6166178573 Copy some AVX support from MCJIT to JIT. Maybe will fix PR12748.
llvm-svn: 157109
2012-05-19 08:28:17 +00:00
Jakob Stoklund Olesen 900f58441d Eliminate dead code after remat.
This will remove the original def once it has no more uses.

llvm-svn: 157104
2012-05-19 05:25:59 +00:00
Jakob Stoklund Olesen dcffc626c0 Don't remat during updateRegDefsUses().
Remaining virtreg->physreg copies were rematerialized during
updateRegDefsUses(), but we already do the same thing in joinCopy() when
visiting the physreg copy instruction.

Eliminate the preserveSrcInt argument to reMaterializeTrivialDef(). It
is now always true.

llvm-svn: 157103
2012-05-19 05:25:56 +00:00
Jakob Stoklund Olesen 06dc721203 Immediately erase trivially useless copies.
There is no need for these instructions to stick around since they are
known to be not dead.

llvm-svn: 157102
2012-05-19 05:25:53 +00:00
Jakob Stoklund Olesen 82d77e8145 Run proper recursive dead code elimination during coalescing.
Dead copies cause problems because they are trivial to coalesce, but
removing them gived the live range a dangling end point. This patch
enables full dead code elimination which trims live ranges to their uses
so end points don't dangle.

DCE may erase multiple instructions. Put the pointers in an ErasedInstrs
set so we never risk visiting erased instructions in the work list.

There isn't supposed to be any dead copies entering RegisterCoalescer,
but they do slip by as evidenced by test/CodeGen/X86/coalescer-dce.ll.

llvm-svn: 157101
2012-05-19 05:25:50 +00:00
Jakob Stoklund Olesen e5bbe37950 Allow LiveRangeEdit to be created with a NULL parent.
The dead code elimination with callbacks is still useful.

llvm-svn: 157100
2012-05-19 05:25:46 +00:00
Eric Christopher b5cf66cda2 Actually support DW_TAG_rvalue_reference_type that we were trying
to generate out of the front end.

rdar://11479676

llvm-svn: 157094
2012-05-19 01:36:37 +00:00
Eric Christopher bc5d24999c Add support for the 'd' mips inline asm output modifier.
Patch by Jack Carter.

llvm-svn: 157093
2012-05-19 00:51:56 +00:00
Andrew Trick 7fa4e0fea6 SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond.
getUDivExpr attempts to simplify by checking for overflow.
isLoopEntryGuardedByCond then evaluates the loop predicate which
may lead to the same getUDivExpr causing endless recursion.

Fixes PR12868: clang 3.2 segmentation fault.

llvm-svn: 157092
2012-05-19 00:48:25 +00:00
Dan Gohman 14862c3141 Fix replacing all the users of objc weak runtime routines
when deleting them. rdar://11434915.

llvm-svn: 157080
2012-05-18 22:17:29 +00:00
Jakob Stoklund Olesen 3834dae65d Modernize naming convention for class members.
No functional change.

llvm-svn: 157079
2012-05-18 22:10:15 +00:00
Jakob Stoklund Olesen b686a2cebd Move all work list processing to copyCoalesceWorkList().
This will make it possible to filter out erased instructions later.

llvm-svn: 157073
2012-05-18 21:09:40 +00:00
Nuno Lopes ac59380dfd allow LazyValueInfo::getEdgeValue() to reason about multiple edges from the same switch instruction by doing union of ranges (which may still be conservative, but it's more aggressive than before)
llvm-svn: 157071
2012-05-18 21:02:10 +00:00
Jim Grosbach 4b63d2ae1d Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

llvm-svn: 157062
2012-05-18 19:12:01 +00:00
Eric Christopher e2b36ce24a Remove duplicate code that we could just fallthrough to.
llvm-svn: 157060
2012-05-18 18:24:15 +00:00
Jakob Stoklund Olesen b954b91ada Simplify RegisterCoalescer::copyCoalesceInMBB().
It is no longer necessary to separate VirtCopies, PhysCopies, and
ImpDefCopies. Implicitly defined copies are extremely rare after we
added the ProcessImplicitDefs pass, and physical register copies are not
joined any longer.

llvm-svn: 157059
2012-05-18 18:21:48 +00:00
Eric Christopher 9ca26cfb5f Add support for the mips 'x' inline asm modifier.
Patch by Jack Carter.

llvm-svn: 157057
2012-05-18 17:39:35 +00:00
Jakob Stoklund Olesen d78d7b05ae Remove support for PhysReg joining.
This has been disabled for a while, and it is not a feature we want to
support. Copies between physical and virtual registers are eliminated by
good hinting support in the register allocator. Joining virtual and
physical registers is really a form of register allocation, and the
coalescer is not properly equipped to do that. In particular, it cannot
backtrack coalescing decisions, and sometimes that would cause it to
create programs that were impossible to register allocate, by exhausting
a small register class.

It was also very difficult to keep track of the live ranges of aliasing
registers when extending the live range of a physreg. By disabling
physreg joining, we can let fixed physreg live ranges remain constant
throughout the register allocator super-pass.

One type of physreg joining remains: A virtual register that has a
single value which is a copy of a reserved register can be merged into
the reserved physreg. This always lowers register pressure, and since we
don't compute live ranges for reserved registers, there are no problems
with aliases.

llvm-svn: 157055
2012-05-18 17:18:58 +00:00
Stepan Dyatkovskiy b638ee0ed3 Recommited reworked r156804:
SelectionDAGBuilder::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.

llvm-svn: 157046
2012-05-18 08:32:28 +00:00
Craig Topper 0cf4038c59 Simplify code a bit. No functional change intended.
llvm-svn: 157044
2012-05-18 07:07:36 +00:00
Craig Topper 92db928ee9 Simplify handling of v16i8 shuffles and fix a missed optimization.
llvm-svn: 157043
2012-05-18 06:42:06 +00:00
Evan Cheng 22d405f57b Teach two-address pass to update the "source" map so it doesn't perform a
non-profitable commute using outdated info. The test case would still fail
because of poor pre-RA schedule. That will be fixed by MI scheduler.

rdar://11472010

llvm-svn: 157038
2012-05-18 01:33:51 +00:00
Eric Christopher 5d5338fb81 Clarify comment.
llvm-svn: 157033
2012-05-18 00:16:22 +00:00
Nuno Lopes 63afc08ca7 fix corner case in ConstantRange::intersectWith().
this fixes the missed optimization I was seeing in the CorrelatedValuePropagation pass

llvm-svn: 157032
2012-05-18 00:14:36 +00:00
Nuno Lopes 097e37da0e minor simplification in the call to ConstantRange constructor
llvm-svn: 157024
2012-05-17 23:04:08 +00:00
Andrew Trick 6a50baa26e comments
llvm-svn: 157020
2012-05-17 22:37:09 +00:00
Kevin Enderby f1b225d0e0 Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing
the 0b10 mask encoding bits.  Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM as deprecated it use.  Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions.  Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025

llvm-svn: 157019
2012-05-17 22:18:01 +00:00
Bill Wendling e065dc8d8d Remove extraneous ';'.
llvm-svn: 157011
2012-05-17 20:27:58 +00:00
Andrew Trick 276a3e8c46 misched: trace ReadyQ.
llvm-svn: 157007
2012-05-17 18:35:13 +00:00
Andrew Trick 2202577d80 misched: Added 3-level regpressure back-off.
Introduce the basic strategy for register pressure scheduling.

1) Respect target limits at all times.

2) Indentify critical register classes (pressure sets).
   Track pressure within the scheduled region.
   Avoid increasing scheduled pressure for critical registers.

3) Avoid exceeding the max pressure of the region prior to scheduling.

Added logic for picking between the top and bottom ready Q's based on
regpressure heuristics.

Status: functional but needs to be asjusted to achieve good results.
llvm-svn: 157006
2012-05-17 18:35:10 +00:00
Andrew Trick 47a1feaea0 comment
llvm-svn: 157005
2012-05-17 18:35:07 +00:00
Andrew Trick 1c646ac68b regpressure: Fix getMaxUpwardPressureDelta.
llvm-svn: 157004
2012-05-17 18:35:05 +00:00
Andrew Trick 463b2f1f04 misched: fix liveness iterators
llvm-svn: 157003
2012-05-17 18:35:03 +00:00
Andrew Trick 7d90035b0b whitespace
llvm-svn: 157002
2012-05-17 18:35:00 +00:00
Jakob Stoklund Olesen c3553ffc70 Never clear <undef> flags on already joined copies.
RegisterCoalescer set <undef> flags on all operands of copy instructions
that are scheduled to be removed. This is so they won't affect
shrinkToUses() by introducing false register reads.

Make sure those <undef> flags are never cleared, or shrinkToUses() could
cause live intervals to end at instructions about to be deleted.

This would be a lot simpler if RegisterCoalescer could just erase joined
copies immediately instead of keeping all the to-be-deleted instructions
around.

This fixes PR12862. Unfortunately, bugpoint can't create a sane test
case for this. Like many other coalescer problems, this failure depends
of a very fragile series of events.

<rdar://problem/11474428>

llvm-svn: 157001
2012-05-17 18:32:42 +00:00
Jakob Stoklund Olesen 14a8745990 Fix a verifier bug.
Make sure useless (def-only) intervals also get verified.

llvm-svn: 157000
2012-05-17 18:32:40 +00:00
Bill Wendling 27489fe014 Relax the requirement that the exception object must be an instruction. During
bugpoint-ing, it may turn into something else.

llvm-svn: 156998
2012-05-17 17:59:51 +00:00
Chris Lattner a3b0f52a72 enhance the intrinsic info stuff to emit encodings that don't fit in 32-bits into a
separate side table, using the handy SequenceToOffsetTable class.  This encodes all
these weird things into another 256 bytes, allowing all intrinsics to be encoded this way.

llvm-svn: 156995
2012-05-17 15:55:41 +00:00
Tim Northover af501a29d3 Remove incorrect pattern for ARM SMML instruction.
Patch by Meador Inge.

llvm-svn: 156989
2012-05-17 13:12:13 +00:00
Manuel Klimek 0fc33af2a7 Fix compile error.
llvm-svn: 156986
2012-05-17 09:32:05 +00:00
Stepan Dyatkovskiy 96d0c925e9 SelectionDAGBuilder: CaseBlock, CaseRanges and CaseCmp changed representation of Low and High from signed to unsigned. Since unsigned ints usually simpler, faster and allows to reduce some extra signed bit checks needed before <,>,<=,>= comparisons.
llvm-svn: 156985
2012-05-17 08:56:30 +00:00
Chris Lattner a57c797c58 Genericize the intrinsics descriptor decoding a bit to make room
for future expansion, no functionality change yet though.

llvm-svn: 156979
2012-05-17 05:13:57 +00:00
Chris Lattner 3e34a7b93d finish encoding all of the interesting details of intrinsics. Now intrinsics
are only rejected because they can't be encoded into a 32-bit unit, not because
they contain an unencodable feature.

llvm-svn: 156978
2012-05-17 05:03:24 +00:00
Chris Lattner 827b253c63 strengthen the intrinsic descriptor stuff to be able to handle sin, cos and other
intrinsics that use passed-in arguments.

llvm-svn: 156977
2012-05-17 04:30:58 +00:00
Akira Hatanaka 0faaebf27c This patch adds the register class for MIPS16 as well as the ability for
llc to recognize MIPS16 as a MIPS ASE extension. -mips16 will mean the
mips16 ASE for mips32 by default.

As part of fixing of adding this we discovered some small changes that
need to be made to MipsInstrInfo::storeRegToStackSLot and
MipsInstrInfo::loadRegFromStackSlot. We were using some "==" equality tests
where in fact we should have been using Mips::<regclas>.hasSubClassEQ instead,
per suggestion of Jakob Stoklund Olesen.

Patch by Reed Kotler.

llvm-svn: 156958
2012-05-16 22:19:56 +00:00
Jakob Stoklund Olesen ab4828390c Set sub-register <undef> flags more accurately.
When widening an existing <def,reads-undef> operand to a super-register,
it may be necessary to clear the <undef> flag because the wider register
is now read-modify-write through the instruction.

Conversely, it may be necessary to add an <undef> flag when the
coalescer turns a full-register def into a sub-register def, but the
larger register wasn't live before the instruction.

This happens in test/CodeGen/ARM/coalesce-subregs.ll, but the test
is too small for the <undef> flags to affect the generated code.

llvm-svn: 156951
2012-05-16 21:22:35 +00:00
Danil Malyshev 8c17fbd6c1 Added LLIMCJITMemoryManager to the lli. This manager will be used for MCJIT instead of DefaultJIMMemoryManager.
It's more flexible for MCJIT tasks, in addition it's provides a invalidation instruction cache for code sections which will be used before JIT code will be executed.

llvm-svn: 156933
2012-05-16 18:50:11 +00:00
Benjamin Kramer 7faf84f125 Hexagon: Remove unused command line option.
llvm-svn: 156917
2012-05-16 15:03:55 +00:00
Duncan Sands 49080cd9a1 Fix a thinko in DisintegrateMERGE_VALUES. Patch by Xiaoyi Guo.
llvm-svn: 156909
2012-05-16 07:57:18 +00:00
Chris Lattner 7f0e7bae25 Significantly reduce the compiled size of Functions.cpp by turning a big blob of tblgen
generated code (for Intrinsic::getType) into a table.  This handles common cases right now,
but I plan to extend it to handle all cases and merge in type verification logic as well
in follow-on patches.

llvm-svn: 156905
2012-05-16 06:34:44 +00:00
Evan Cheng 58a95f0c8a Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
llvm-svn: 156896
2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen 984997b3a0 Enable sub-sub-register copy coalescing.
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:

  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]

We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.

llvm-svn: 156878
2012-05-15 23:31:35 +00:00
Jakob Stoklund Olesen a1626369b6 Teach RegisterCoalescer to handle symmetric sub-register copies.
It is possible to coalesce two overlapping registers to a common
super-register that it larger than both of the original registers.

The important difference is that it may be necessary to rewrite DstReg
operands as well as SrcReg operands because the sub-register index has
changed.

This behavior is still disabled by CoalescerPair.

llvm-svn: 156869
2012-05-15 22:26:28 +00:00
Jakob Stoklund Olesen 385970f290 Handle NewReg==OldReg in renameRegister().
This can happen when widening a virtual register to a super-register
class.

llvm-svn: 156867
2012-05-15 22:20:27 +00:00
Jakob Stoklund Olesen 1c6a2223d4 We never call adjustCopiesBackFrom() for partial copies.
There is no need to look at an always null SrcIdx.

llvm-svn: 156866
2012-05-15 22:18:49 +00:00
Nuno Lopes c2a170e26e reuse the result of some expensive computations in getSignExtendExpr() and getZeroExtendExpr()
this gives a speedup of > 80 in a debug build in the test case of PR12825 (php_sha512_crypt_r)

llvm-svn: 156849
2012-05-15 20:20:14 +00:00
Jakob Stoklund Olesen 71673b4faf Extend the CoalescerPair interface to handle symmetric sub-register copies.
Now both SrcReg and DstReg can be sub-registers of the final coalesced
register.

CoalescerPair::setRegisters still rejects such copies because
RegisterCoalescer doesn't yet handle them.

llvm-svn: 156848
2012-05-15 20:09:43 +00:00
Andrew Trick da01ba37e0 Add -enable-aa-sched-mi, off by default, for AliasAnalysis inside MachineScheduler.
This feature avoids creating edges in the scheduler's dependence graph
for non-aliasing memory operations according to whichever alias
analysis is available. It has been fully tested in Hexagon. Before
making this default, it needs to be extended to handle multiple
MachineMemOperands, compile time needs more evaluation, and
benchmarking on X86 and ARM is needed.

Patch by Sergei Larin!

llvm-svn: 156842
2012-05-15 18:59:41 +00:00
Jim Grosbach c3b0427921 Allow MCCodeEmitter access to the target MCRegisterInfo.
Add the MCRegisterInfo to the factories and constructors.

Patch by Tom Stellard <Tom.Stellard@amd.com>.

llvm-svn: 156828
2012-05-15 17:35:52 +00:00
Nuno Lopes ab5c924006 minor simplification to code: Ty is already a SCEV type; don't need to run getEffectiveSCEVType() twice
llvm-svn: 156823
2012-05-15 15:44:38 +00:00
David Majnemer a9330fe553 Teach SimplifyLibCalls about stpcpy.
llvm-svn: 156815
2012-05-15 11:46:21 +00:00
Stepan Dyatkovskiy e01e9863c5 Rejected r156804 due to buildbots failures.
llvm-svn: 156808
2012-05-15 06:50:18 +00:00
Stepan Dyatkovskiy d450d3fa12 SelectionDAGBuilder::Clusterify : main functinality was replaced with CRSBuilder::optimize, so big part of Clusterify's code was reduced.
llvm-svn: 156804
2012-05-15 05:09:41 +00:00
Akira Hatanaka cf434ee4c1 Temporarily disable anti-dependence breaking for Mips until bug 12829 is
resolved.

llvm-svn: 156801
2012-05-15 03:14:52 +00:00
Bill Wendling 8b5c0e4af2 Remove extraneous ';'.
llvm-svn: 156791
2012-05-15 00:41:56 +00:00
Akira Hatanaka 4773e67e0b Add a command line option to skip the delay slot filler pass entirely for Mips.
The purpose of this option is to silence error messages issued by machine
verifier passes and enable them to run to the end. If this option is not
provided, -verify-machineinstrs complains when it discovers there is a
non-terminator instruction (an instruction that is in a delay slot) after the
first terminator in a basic block.

llvm-svn: 156790
2012-05-14 23:59:17 +00:00
Michael J. Spencer c10948d02b [Support/YAMLParser] Use rtrim on plain scalars.
llvm-svn: 156787
2012-05-14 22:43:34 +00:00
David Blaikie 81a84bd841 Fix use of uninitialized variable.
Found by GCC's maybe-uninitialized.

llvm-svn: 156780
2012-05-14 21:48:19 +00:00
Jakob Stoklund Olesen a13fd12872 Don't access MO reference after invalidating operand list.
This should unbreak llvm-x86_64-linux.

llvm-svn: 156778
2012-05-14 21:30:58 +00:00
Jakob Stoklund Olesen dc2e0cd44a Fix PR12821.
RAFast must add an <imp-def> operand when it is rewriting a sub-register
def that isn't a read-modify-write.

llvm-svn: 156777
2012-05-14 21:10:25 +00:00
Chad Rosier a968caf8e0 Move the capture analysis from MemoryDependencyAnalysis to a more general place
so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686

llvm-svn: 156776
2012-05-14 20:35:04 +00:00
Brendon Cahoon f6b687e5d1 Revert 156634 upon request until code improvement changes are made.
llvm-svn: 156775
2012-05-14 19:35:42 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Stepan Dyatkovskiy 3dea421826 SwitchInst cosmetics: renamed "Hash" method to "hash"
llvm-svn: 156757
2012-05-14 08:26:31 +00:00
Bill Wendling ea857e1b9f Use ArrayRef instead of an explicit vector type.
llvm-svn: 156755
2012-05-14 07:53:40 +00:00
Benjamin Kramer 0b03cbd416 Hexagon: Initialize TBB to 0.
Found by valgrind.

llvm-svn: 156744
2012-05-13 15:13:22 +00:00
Benjamin Kramer c7eda3ee9c Fix spacing after if.
llvm-svn: 156716
2012-05-12 16:52:21 +00:00
Rafael Espindola 47b7dac220 Add support for the .rept directive. Patch by Vladmir Sorokin. I added support
for nesting.

llvm-svn: 156714
2012-05-12 16:31:10 +00:00
Benjamin Kramer 6bee7f750d ELF: Add support for the asm .version directive.
llvm-svn: 156712
2012-05-12 14:30:47 +00:00
Benjamin Kramer 95d31bcba5 AsmParser: Add support for the .purgem directive.
Based on a patch by Team PaX.

llvm-svn: 156709
2012-05-12 11:21:46 +00:00
Benjamin Kramer 38de62f883 AsmParser: Give a nice error message for .code16gcc, which is currently unsupported.
Patch by Team PaX!

llvm-svn: 156708
2012-05-12 11:19:04 +00:00
Benjamin Kramer 66b8d4d28f AsmParser: ignore the .extern directive.
llvm-svn: 156707
2012-05-12 11:18:59 +00:00
Benjamin Kramer e297b9f506 AsmParser: Add support for .ifc and .ifnc directives.
Based on a patch from PaX Team.

llvm-svn: 156706
2012-05-12 11:18:51 +00:00
Benjamin Kramer 62c18b0881 AsmParser: Add support for .ifb and .ifnb directives.
Based on a patch from PaX Team.

llvm-svn: 156705
2012-05-12 11:18:42 +00:00
Stepan Dyatkovskiy 0beab5e1cd Recommited r156374 with critical fixes in BitcodeReader/Writer:
Ordinary patch for PR1255.
Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will internally representated as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object.
Old methods of SwitchInst are also works well, but marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported.

llvm-svn: 156704
2012-05-12 10:48:17 +00:00
Jay Foad ca0c499609 Teach Function::hasAddressTaken that BlockAddress doesn't really take
the address of a function.

llvm-svn: 156703
2012-05-12 08:30:16 +00:00
Sirish Pande 8bb9745a5e Make sure new value jump is enabled for Hexagon V5 as well.
llvm-svn: 156700
2012-05-12 05:54:15 +00:00
Sirish Pande 4bd20c50eb Support for Hexagon feature, New Value Jump.
llvm-svn: 156698
2012-05-12 05:10:30 +00:00
Akira Hatanaka a6c3fd8317 Remove MipsEmitGPRestore.cpp.
llvm-svn: 156696
2012-05-12 03:24:03 +00:00
Akira Hatanaka 3ecc5273c1 Delete all functions that are no longer needed in MipsFunctionInfo, including
the ones that get or set the frame index for the $gp save slot. 

Remove the piece of code in MipsFunctionInfo::getGlobalBaseReg() which returns
GP. This function should always return a virtual register.

llvm-svn: 156695
2012-05-12 03:22:13 +00:00
Akira Hatanaka 2e31e036b6 Stop reserving register $gp. Do not call isGPFI to check whether a frame object
is the $gp save slot.

llvm-svn: 156694
2012-05-12 03:21:18 +00:00
Akira Hatanaka 0fb87feb39 Do not add the pass which restores $gp after every function call.
llvm-svn: 156693
2012-05-12 03:19:51 +00:00
Akira Hatanaka f542ebd958 Make the following changes in MipsISelLowering.cpp:
- Stop creating stack frame objects needed for saving $gp.
- Insert a node that copies the global pointer register to register $gp
  before the call node. This will ensure $gp is valid at the entry of the
  called function.

llvm-svn: 156692
2012-05-12 03:19:04 +00:00
Akira Hatanaka c980f8453a Make the following changes in MipsFrameLowering.cpp:
- Stop emitting instructions needed to initialize the global pointer register.
- Stop emitting .cprestore directive.
- Do not take into account the $gp save slot when computing stack size.

llvm-svn: 156691
2012-05-12 03:18:00 +00:00
Akira Hatanaka 8f3573034b Make the following changes in MipsAsmPrinter.cpp:
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
  initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.

llvm-svn: 156689
2012-05-12 00:48:43 +00:00
Chad Rosier 10702d5f22 Hoist simpler checks above llvm::PointerMayBeCaptured. No functional change intended.
llvm-svn: 156687
2012-05-12 00:43:40 +00:00
Jakob Stoklund Olesen 165473247f Don't look for empty live ranges in the unions.
Empty live ranges represent undef and still get allocated, but they
won't appear in LiveIntervalUnions.

Patch by Patrik Hägglund!

llvm-svn: 156685
2012-05-12 00:33:28 +00:00
Akira Hatanaka d918f77ba3 Insert instructions to the entry basic block which initializes the global
pointer register. 


This is the first of the series of patches which clean up the way global pointer
register is used. The patches will make the following improvements:

- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
  allocator decide which register to assign to it or whether spill/reloads are
  needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
  for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.

llvm-svn: 156671
2012-05-12 00:17:17 +00:00
Akira Hatanaka 0661b81bca Do not replace operands of pseudo instructions with register $zero.
llvm-svn: 156663
2012-05-11 23:22:18 +00:00
Chad Rosier a33015d4e0 Revert 156658.
llvm-svn: 156662
2012-05-11 23:21:01 +00:00
Chad Rosier e40f5d3ee0 [fast-isel] Fast-isel doesn't use the expect intrinsic.
llvm-svn: 156658
2012-05-11 23:10:58 +00:00
Michael J. Spencer 93303819ac [Support/StringRef] Add find_last_not_of and {r,l,}trim.
llvm-svn: 156652
2012-05-11 22:08:50 +00:00
Chad Rosier aa9cb9df59 [fast-isel] Add support for selecting @llvm.trap().
llvm-svn: 156646
2012-05-11 21:33:49 +00:00
Brendon Cahoon 5edcf8822d Updated instruction table due to addded intrinsics.
llvm-svn: 156644
2012-05-11 21:10:16 +00:00
Sirish Pande 95d0117bb3 Remove warnings from HexagonVLIWPacketizer.
llvm-svn: 156636
2012-05-11 20:00:34 +00:00
Brendon Cahoon 31f8723ef3 Hexagon constant extender support.
Patch by Jyotsna Verma.

llvm-svn: 156634
2012-05-11 19:56:59 +00:00
Chad Rosier 06e34d9220 Typo.
llvm-svn: 156633
2012-05-11 19:43:29 +00:00
Chad Rosier 3268692aa8 [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
llvm-svn: 156632
2012-05-11 19:40:25 +00:00
Sirish Pande 83ccb6ce08 Hexagon V5 intrinsics support.
llvm-svn: 156631
2012-05-11 19:39:13 +00:00
Chad Rosier 90f9afe659 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796

llvm-svn: 156628
2012-05-11 18:51:55 +00:00
Nuno Lopes e2cfd3ce95 objectsize: add a few more tests and fix a bug
llvm-svn: 156625
2012-05-11 18:25:29 +00:00
Chad Rosier 519b12f927 [fast-isel] Rather then assert (or segfault in a non-asserts build), fall back
to selection DAG isel if we're unable to handle a non-double multi-reg retval.
rdar://11430407
PR12796

llvm-svn: 156622
2012-05-11 17:41:06 +00:00
Chad Rosier 466d3d8faa The return type is an unsigned, not a bool.
llvm-svn: 156621
2012-05-11 16:41:38 +00:00
Manman Ren 0d5ec28ccc Add space before an open parenthesis in control flow statements.
llvm-svn: 156620
2012-05-11 15:36:46 +00:00
Preston Gurd 09de6ae399 Added X86 Atom latencies to X86InstrMMX.td.
llvm-svn: 156615
2012-05-11 14:27:12 +00:00
Hans Wennborg f9d0e44b82 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

llvm-svn: 156611
2012-05-11 10:11:01 +00:00
Silviu Baranga ddc67a7655 Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions.
llvm-svn: 156609
2012-05-11 09:28:27 +00:00
Silviu Baranga 5a719f9b9a Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handeling the #-0 immediate.
llvm-svn: 156608
2012-05-11 09:10:54 +00:00
Akira Hatanaka e37614438f Fix a misleading comment.
llvm-svn: 156603
2012-05-11 01:45:15 +00:00
Jim Grosbach dc1e36e9f5 Tidy up. Trailing whitespace.
llvm-svn: 156602
2012-05-11 01:41:30 +00:00
Eli Friedman e0a64d83fc Fix a minor logic mistake transforming compares in instcombine. PR12514.
llvm-svn: 156600
2012-05-11 01:32:59 +00:00
Manman Ren dc8ad0058f ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156599
2012-05-11 01:30:47 +00:00
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00
Eric Christopher b6148ed72c Allow unique_file to take a mode for file permissions, but default
to user only read/write.

Part of rdar://11325849

llvm-svn: 156591
2012-05-11 00:07:44 +00:00
Chad Rosier 8244b1dc7e Fix intendation.
llvm-svn: 156589
2012-05-10 23:38:07 +00:00
Nuno Lopes f573030391 objectsize: add support for GEPs with non-constant indexes
add an additional parameter to InstCombiner::EmitGEPOffset() to force it to *not* emit operations with NUW flag

llvm-svn: 156585
2012-05-10 23:17:35 +00:00
Preston Gurd 4fe10a5d9a Added X86 Atom latencies for instructions in X86InstrInfo.td.
llvm-svn: 156579
2012-05-10 21:58:35 +00:00
Eric Christopher ed51b9ec0b Add support for the 'X' inline asm operand modifier.
Patch by Jack Carter.

llvm-svn: 156577
2012-05-10 21:48:22 +00:00
Andrew Trick c5d7008f27 misched: Print machineinstrs with -debug-only=misched
llvm-svn: 156576
2012-05-10 21:06:21 +00:00
Andrew Trick 419eae2db7 misched: tracing register pressure heuristics.
llvm-svn: 156575
2012-05-10 21:06:19 +00:00
Andrew Trick 7ee9de51f2 misched: Add register pressure backoff to ConvergingScheduler.
Prioritize the instruction that comes closest to keeping pressure
under the target's limit. Then prioritize instructions that avoid
increasing the max pressure in the scheduled region. The max pressure
heuristic is a tad aggressive. Later I'll fix it to consider the
unscheduled pressure as well.

WIP: This is mostly functional but untested and not likely to do much good yet.
llvm-svn: 156574
2012-05-10 21:06:16 +00:00
Andrew Trick 795c1120a6 misched: Release only unscheduled nodes into ReadyQ.
llvm-svn: 156573
2012-05-10 21:06:14 +00:00
Andrew Trick 95dafd8b31 misched: Added ReadyQ container wrapper for Top and Bottom Queues.
llvm-svn: 156572
2012-05-10 21:06:12 +00:00
Andrew Trick 4add42f439 misched: Introducing Top and Bottom register pressure trackers during scheduling.
llvm-svn: 156571
2012-05-10 21:06:10 +00:00
Sirish Pande fc8118bf41 Hexagon V5 Support - V5 td file.
llvm-svn: 156569
2012-05-10 20:24:28 +00:00
Sirish Pande 69295b8963 Hexagon V5 FP Support.
llvm-svn: 156568
2012-05-10 20:20:25 +00:00
Andrew Trick 75812f815c RegPressure: API for speculatively checking instruction pressure.
Added getMaxExcessUpward/DownwardPressure. They somewhat abuse the
tracker by speculatively handling an instruction out of order. But it
is convenient for now. In the future, we will cache each instruction's
pressure contribution to make this efficient.

llvm-svn: 156561
2012-05-10 19:11:52 +00:00
Andrew Trick 1df762abf4 RegPressure: fix array index iteration style.
llvm-svn: 156560
2012-05-10 19:11:49 +00:00
Dan Gohman ed7c24e2d9 Teach DeadStoreElimination to eliminate exit-block stores with phi addresses.
llvm-svn: 156558
2012-05-10 18:57:38 +00:00
Manman Ren b555b382bd Revert: 156550 "ARM: peephole optimization to remove cmp instruction"
This commit broke an external linux bot and gave a compile-time warning.

llvm-svn: 156556
2012-05-10 18:49:43 +00:00
Dan Gohman 0291246ce7 Rewrite ScalarEvolution::hasOperand to use an explicit worklist instead
of recursion, to avoid excessive stack usage on deep expressions.

llvm-svn: 156554
2012-05-10 17:21:30 +00:00
Nuno Lopes 300d629924 teach DSE and isInstructionTriviallyDead() about calloc
llvm-svn: 156553
2012-05-10 17:14:00 +00:00
Manman Ren c860887b2d ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156550
2012-05-10 16:48:21 +00:00
Joel Jones 3d90a9ae65 Fix a problem with incomplete equality testing of PHINodes in
Instruction::IsIdenticalToWhenDefined.

This manifested itself when inlining two calls to the same function.  The 
inlined function had a switch statement that returned one of a set of 
global variables.  Without this modification, the two phi instructions that 
chose values from the branches of the switch instruction inlined from the 
callee were considered equivalent and jump-threading replaced a load for the 
first switch value with a phi selecting from the second switch, thereby 
producing incorrect code.

This patch has been tested with "make check-all", "lnt runteste nt", and 
llvm self-hosted, and on the original program that had this problem, 
wireshark.

<rdar://problem/11025519>

llvm-svn: 156548
2012-05-10 15:59:41 +00:00
Nadav Rotem 1a65397017 Fix merge-typo and cleanup
llvm-svn: 156541
2012-05-10 12:50:02 +00:00
Nadav Rotem 15946e50c1 AVX2: Add an additional broadcast idiom.
llvm-svn: 156540
2012-05-10 12:39:13 +00:00
Nadav Rotem b86a3fb8d0 Generate AVX/AVX2 shuffles even when there is a memory op somewhere else in the program.
Starting r155461 we are able to select patterns for vbroadcast even when the load op is used by other users.

Fix PR11900.

llvm-svn: 156539
2012-05-10 12:22:05 +00:00
Jim Grosbach 1ace8b6a76 ExecutionEngine: Check for NULL ErrorStr before using it.
Patch by Yury Mikhaylov <yury.mikhaylov@gmail.com>.

llvm-svn: 156523
2012-05-10 00:31:50 +00:00
Dan Gohman f8b19d09ba Fix the objc_storeStrong recognizer to stop before walking off the
end of a basic block if there's no store.

llvm-svn: 156520
2012-05-09 23:08:33 +00:00
Nuno Lopes 7100f463b0 objectsize:
refactor code a bit to enable future changes to support run-time information
add support to compute allocation sizes at run-time if penalty > 1 (e.g., malloc(x), calloc(x, y), and VLAs)

llvm-svn: 156515
2012-05-09 21:30:57 +00:00
Roman Divacky e07cc042f6 Mark .opd @progbits, thus avoiding a warning from asm.
llvm-svn: 156494
2012-05-09 18:24:23 +00:00
Chad Rosier 9d7b1cee39 Set the default iOS version to 3.0.
llvm-svn: 156492
2012-05-09 18:23:00 +00:00
Bob Wilson 8d4e2fab63 Use the cpuid 64 bit flag to pick the default CPU name for an unknown model.
For the Family 6 switch in sys::getHostCPUName, an unrecognized model was
reported as "i686".  That's a really bad default since it means that new
CPUs will be treated as if they can only use 32-bit code.  This just looks
at the cpuid extended feature flag for 64 bit support, and if that is set,
it uses a default x86-64 cpu.  Similar logic is already used for the Family
15 code.  <rdar://problem/11314502>

llvm-svn: 156486
2012-05-09 17:47:03 +00:00
Chad Rosier 2778cbc880 Don't return true on a function with a void return type.
llvm-svn: 156484
2012-05-09 17:38:47 +00:00
Chad Rosier d84eaac673 Add Triple::getiOSVersion.
This new function provides a way to get the iOS version number from ios triples.
Part of rdar://11409204

llvm-svn: 156483
2012-05-09 17:23:48 +00:00
Hans Wennborg b7ef2fe8ae Introduce llvm-c function LLVMPrintModuleToFile.
This lets you save the textual representation of the LLVM IR to a file.
Before this patch it could only be printed to STDERR from llvm-c.

Patch by Carlo Kok!

llvm-svn: 156479
2012-05-09 16:54:17 +00:00
Nuno Lopes 01547b3ad2 change the objectsize intrinsic signature: add a 3rd parameter to denote the maximum runtime performance penalty that the user is willing to accept.
This commit only adds the parameter. Code taking advantage of it will follow.

llvm-svn: 156473
2012-05-09 15:52:43 +00:00
Bill Wendling a3aeb980d2 Supply a C interface to the "LinkModules" method.
Patch by Andrew Wilkins!

llvm-svn: 156469
2012-05-09 08:55:40 +00:00
Craig Topper 28540adfcf Remove unused variable to get rid of warning.
llvm-svn: 156466
2012-05-09 07:08:58 +00:00
Akira Hatanaka ca41d13bbd Add another peephole pattern for conditional moves.
llvm-svn: 156460
2012-05-09 02:29:29 +00:00
Jakob Stoklund Olesen 7e21d617ef Use ptr_rc_tailcall instead of GR32_TC.
The getPointerRegClass() hook will return GR32_TC, or whatever is
appropriate for the current function.

Patch by Yiannis Tsiouris!

llvm-svn: 156459
2012-05-09 01:50:09 +00:00
Akira Hatanaka 05b9dad1e6 Make register FP allocatable if the compiled function does not have dynamic
allocas.

llvm-svn: 156458
2012-05-09 01:38:13 +00:00
Akira Hatanaka 0a8ab718cb Expand 64-bit shifts if target ABI is O32.
llvm-svn: 156457
2012-05-09 00:55:21 +00:00
Richard Trieu edf46e6b6e Remove unused variable to silence compiler warning.
llvm-svn: 156456
2012-05-09 00:30:21 +00:00
Dan Gohman 41375a3545 Miscellaneous accumulated cleanups.
llvm-svn: 156445
2012-05-08 23:39:44 +00:00
Kevin Enderby fe3d005ca5 Fix it so llvm-objdump -arch does accept x86 and x86-64 as valid arch names.
PR12731.  Patch by Meador Inge!

llvm-svn: 156444
2012-05-08 23:38:45 +00:00
Dan Gohman 61708d37d6 Fix objc_storeStrong pattern matching to catch a potential use of the
old value after the store but before it is released.
This fixes rdar:/11116986.

llvm-svn: 156442
2012-05-08 23:34:08 +00:00
Jakob Stoklund Olesen 10191fd44f Use a shared function for a common operation.
llvm-svn: 156441
2012-05-08 23:27:30 +00:00
Eric Christopher 8d2a77de63 Fix thinko in conditional.
Part of rdar://11352000 and should bring the buildbots back.

llvm-svn: 156421
2012-05-08 21:24:39 +00:00
Jim Grosbach 92f6adc8be DAGCombiner should not change the type of an extract_vector index.
When a combine twiddles an extract_vector, care should be take to preserve
the type of the index operand. No luck extracting a reasonable testcase,
unfortunately.

rdar://11391009

llvm-svn: 156419
2012-05-08 20:56:07 +00:00
Eric Christopher d666bb0dd8 Remove excess semi-colons to quiet warnings.
llvm-svn: 156416
2012-05-08 20:45:04 +00:00
Daniel Dunbar 5f1c956eb0 [Support] Fix sys::GetRandomNumber() to always use a high quality seed.
llvm-svn: 156414
2012-05-08 20:38:00 +00:00
Sirish Pande 1c9f7dbc10 Update load/store instruction patterns in Hexagon V4.
llvm-svn: 156411
2012-05-08 19:50:20 +00:00
Akira Hatanaka fd82286e62 Formatting fixes.
Patch by Jack Carter.

llvm-svn: 156409
2012-05-08 19:14:42 +00:00
Akira Hatanaka c515bfb9e7 Define mips16 instruction formats.
Patch by Reed Kotler.

llvm-svn: 156408
2012-05-08 19:08:58 +00:00
Eric Christopher 4d25052a9a Handle OpDeref in case it comes in as a register operand.
Part of rdar://11352000

llvm-svn: 156405
2012-05-08 18:56:00 +00:00
Nuno Lopes 24ac479a7d remove autoupgrade code for old function attributes format.
I still left another fixme regarding alignment, because I'm unsure how to remove that code without breaking things

llvm-svn: 156387
2012-05-08 17:07:35 +00:00
Nuno Lopes f7596c91af remove TYPE_CODE_FUNCTION_OLD type code. it is no longer in use and it was marked for removal in 3.0
llvm-svn: 156383
2012-05-08 16:16:20 +00:00
Jakob Stoklund Olesen 276ae14023 s/CSR_Ghc/CSR_NoRegs/
Share the CalleeSavedRegs defs between all calling conventions having no
callee-saved registers.

Patch by Yiannis Tsiouris!

llvm-svn: 156382
2012-05-08 15:07:29 +00:00
NAKAMURA Takumi 3b7f995b75 Windows/PathV2.inc: Retry rename() for (maximum) 2 seconds.
Files might be opend by system scanners (eg. file indexer, virus scanner, &c).

llvm-svn: 156380
2012-05-08 14:31:46 +00:00
Duncan Sands 3bbb1d50df Calling ReassociateExpression recursively is extremely dangerous since it will
replace the operands of expressions with only one use with undef and generate
a new expression for the original without using RAUW to update the original.
Thus any copies of the original expression held in a vector may end up
referring to some bogus value - and using a ValueHandle won't help since there
is no RAUW.  There is already a mechanism for getting the effect of recursion
non-recursively: adding the value to be recursed on to RedoInsts.  But it wasn't
being used systematically.  Have various places where recursion had snuck in at
some point use the RedoInsts mechanism instead.  Fixes PR12169.

llvm-svn: 156379
2012-05-08 12:16:05 +00:00
Stepan Dyatkovskiy 5eafce5c88 Rejected r156374: Ordinary PR1255 patch. Due to clang-x86_64-debian-fnt buildbot failure.
llvm-svn: 156377
2012-05-08 08:33:21 +00:00
Craig Topper 7daf897678 Remove 256-bit AVX non-temporal store intrinsics. Similar was previously done for 128-bit.
llvm-svn: 156375
2012-05-08 06:58:15 +00:00
Stepan Dyatkovskiy b6a4640163 Ordinary patch for PR1255.
Added new case-ranges orientated methods for adding/removing cases in SwitchInst. After this patch cases will internally representated as ConstantArray-s instead of ConstantInt, externally cases wrapped within the ConstantRangesSet object.
Old methods of SwitchInst are also works well, but marked as deprecated. So on this stage we have no side effects except that I added support for case ranges in BitcodeReader/Writer, of course test for Bitcode is also added. Old "switch" format is also supported.

llvm-svn: 156374
2012-05-08 06:36:08 +00:00
Andrew Trick d29cd732d4 Allow NULL LoopPassManager argument in UnrollLoop. PR12734.
llvm-svn: 156358
2012-05-08 02:52:09 +00:00
Jakob Stoklund Olesen 952b4c11fe Extract methods for joining physregs.
No functional change.

llvm-svn: 156345
2012-05-08 00:08:35 +00:00
Jakob Stoklund Olesen 9e8ae6c37f Naming convention and whitespace. No functional change.
llvm-svn: 156342
2012-05-07 23:46:16 +00:00
Jakob Stoklund Olesen 98595b5a61 Coalesce subreg-subreg copies.
At least some of them:

  %vreg1:sub_16bit = COPY %vreg2:sub_16bit; GR64:%vreg1, GR32: %vreg2

Previously, we couldn't figure out that the above copy could be
eliminated by coalescing %vreg2 with %vreg1:sub_32bit.

The new getCommonSuperRegClass() hook makes it possible.

This is not very useful yet since the unmodified part of the destination
register usually interferes with the source register. The coalescer
needs to understand sub-register interference checking first.

llvm-svn: 156334
2012-05-07 22:57:55 +00:00
Jakob Stoklund Olesen 3c52f0281f Add an MF argument to TRI::getPointerRegClass() and TII::getRegClass().
The getPointerRegClass() hook can return register classes that depend on
the calling convention of the current function (ptr_rc_tailcall).

So far, we have been able to infer the calling convention from the
subtarget alone, but as we add support for multiple calling conventions
per target, that no longer works.

Patch by Yiannis Tsiouris!

llvm-svn: 156328
2012-05-07 22:10:26 +00:00
Jakob Stoklund Olesen c4b3a7a1d7 Fix bug in TRI::getCommonSuperRegClass().
Test cases for this code are coming. It is not used for anything yet.

llvm-svn: 156327
2012-05-07 21:59:31 +00:00
Owen Anderson ab63d84252 Teach DAG combine to fold x-x to 0.0 when unsafe FP math is enabled.
llvm-svn: 156324
2012-05-07 20:51:25 +00:00
Owen Anderson f4f80e1f39 Teach reassociate to commute FMul's and FAdd's in order to canonicalize the order of their operands across instructions. This allows for greater CSE opportunities.
llvm-svn: 156323
2012-05-07 20:47:23 +00:00
Preston Gurd e65f4e66ac Make IntelJITEvents and OProfileJIT as optional libraries and add
optional library support to the llvm-build tool:
 - Add new command line parameter to llvm-build: “--enable-optional-libraries”
 - Add handing of new llvm-build library type “OptionalLibrary”
 - Update Cmake and automake build systems to pass correct flags to llvm-build
   based on configuration

Patch by Dan Malea!

llvm-svn: 156319
2012-05-07 19:38:40 +00:00
Jakob Stoklund Olesen 65a6dafc8d Add TRI::getCommonSuperRegClass().
This function is a generalization of getMatchingSuperRegClass() to the
symmetric case where both sides are using a sub-register index. It will
find a super-register class and sub-register indexes that make this
diagram commute:

                                   PreA
                       SuperRC  ---------->  RCA

                          |                   |
                          |                   |
                     PreB |                   | SubA
                          |                   |
                          |                   |
                          V                   V

                         RCB    ----------> SubRC
                                   SubB

This can be used to coalesce copies like:

  %vreg1:sub16 = COPY %vreg2:sub16; GR64:%vreg1, GR32: %vreg2

llvm-svn: 156317
2012-05-07 19:14:58 +00:00
Chad Rosier d8287fec17 Fix a regression from r147481. This combine should only happen if there is a
single use.
rdar://11360370

llvm-svn: 156316
2012-05-07 18:47:44 +00:00
Matt Beaumont-Gay a1b3b007f3 Don't assume size_t is unsigned long long.
Fixes a -Woverflow warning from gcc when building for 32-bit platforms.

llvm-svn: 156313
2012-05-07 18:12:42 +00:00
Manman Ren ef4e0479ec X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

In order to generate negl, I added patterns in Target/X86/X86InstrCompiler.td:
def : Pat<(X86sub_flag 0, GR32:$src), (NEG32r GR32:$src)>;

rdar: 10961709
llvm-svn: 156312
2012-05-07 18:06:23 +00:00
Eric Christopher 0d8c15d20f Add support for the 'x' constraint.
Patch by Jack Carter.

llvm-svn: 156295
2012-05-07 06:25:19 +00:00
Eric Christopher 9c492e6ebf Add support for the 'l' constraint.
Patch by Jack Carter.

llvm-svn: 156294
2012-05-07 06:25:15 +00:00
Eric Christopher e3c494de82 Add support for the 'c' constraint.
Patch by Jack Carter.

llvm-svn: 156293
2012-05-07 06:25:10 +00:00
Eric Christopher c18ae4a3b1 Add support for the 'P' constraint.
Patch by Jack Carter.

llvm-svn: 156292
2012-05-07 06:25:02 +00:00
Craig Topper dbb98b4917 Fix some issues in the f16c instructions.
llvm-svn: 156287
2012-05-07 06:00:15 +00:00
Eric Christopher 470578a91b Add support for the 'O' constraint.
Patch by Jack Carter.

llvm-svn: 156285
2012-05-07 05:46:48 +00:00
Eric Christopher e07aa430b8 Add support for the 'N' inline asm constraint.
Patch by Jack Carter.

llvm-svn: 156284
2012-05-07 05:46:43 +00:00
Eric Christopher 1109b3406d Add support for the 'L' inline asm constraint.
Patch by Jack Carter.

llvm-svn: 156283
2012-05-07 05:46:37 +00:00
Eric Christopher 3ff88a05b7 Add support for the inline asm constraint 'K'.
llvm-svn: 156282
2012-05-07 05:46:29 +00:00
Craig Topper d4e1894ec1 Add SSE4A MOVNTSS/MOVNTSD instructions.
llvm-svn: 156281
2012-05-07 05:36:19 +00:00
Eric Christopher 7201e1b4b9 Support the 'J' constraint.
Patch by Jack Carter.

llvm-svn: 156280
2012-05-07 03:13:42 +00:00
Eric Christopher 1d6c89eea1 Add support for the 'I' inline asm constraint. Also add tests
from the previous 2 patches.

Patch by Jack Carter.

llvm-svn: 156279
2012-05-07 03:13:32 +00:00
Eric Christopher 58daf04681 Allow 64 bit integer values in gpu registers if arch and abi are 64 bit.
Patch by Jack Carter.

llvm-svn: 156278
2012-05-07 03:13:22 +00:00
Eric Christopher cfcd77b0bc When using inline asm constraints representing
non-floating point general registers allow 8 and 16-bit
elements.

Patch by Jack Carter.

llvm-svn: 156277
2012-05-07 03:13:16 +00:00
Craig Topper 00a1e6d48b Use MVT instead of EVT as the argument to all the shuffle decode functions. Simplify some of the decode functions.
llvm-svn: 156268
2012-05-06 19:46:21 +00:00
Craig Topper 804be3b546 Add VPERMQ/VPERMPD to the list of target specific shuffles that can be looked through for DAG combine purposes.
llvm-svn: 156266
2012-05-06 18:54:26 +00:00
Craig Topper 54bdb350e2 Add shuffle decode support for VPERMQ/VPERMPD.
llvm-svn: 156265
2012-05-06 18:44:02 +00:00
Chris Lattner 854f366a1f make SourceMgr tolerate empty SMLoc()'s better.
llvm-svn: 156260
2012-05-06 16:20:49 +00:00
Benjamin Kramer 3d38c17b59 Switch the select to branch transformation on by default.
The primitive conservative heuristic seems to give a slight overall
improvement while not regressing stuff. Make it available to wider
testing. If you notice any speed regressions (or significant code
size regressions) let me know!

llvm-svn: 156258
2012-05-06 14:25:16 +00:00
Jakub Staszak cfc46f82ff Remove trailing spaces.
llvm-svn: 156257
2012-05-06 13:52:31 +00:00
NAKAMURA Takumi 7bec74112d Unix/Process.inc: Give more useful random seed to srand. Workaround for PR12743.
llvm-svn: 156252
2012-05-06 08:24:24 +00:00
NAKAMURA Takumi 54acb28882 Support/Process: Move llvm::sys::Process::GetRandomNumber() from Process.cpp to Unix/Process.inc.
FIXME: GetRandomNumber() is not implemented in Win32.
llvm-svn: 156251
2012-05-06 08:24:18 +00:00
Chris Lattner 9322ba824c reapply my patch, with a fix for an off-by-one error. Turned out to be a lot
of work for a drive-by fix :)

llvm-svn: 156246
2012-05-05 22:17:32 +00:00
Chris Lattner 64f65d33df revert my patches, which are causing problems.
llvm-svn: 156245
2012-05-05 22:11:04 +00:00
Chris Lattner cd60bc491e refactor some code to expose column numbers more and make diagnostic printing slightly more efficient.
llvm-svn: 156243
2012-05-05 21:39:51 +00:00
Jim Grosbach 7ce129268e Nuke a few dead remnants of the CBE.
llvm-svn: 156241
2012-05-05 17:45:12 +00:00
Daniel Dunbar d5f82d92f3 [Support] Add missing include.
llvm-svn: 156240
2012-05-05 16:49:11 +00:00
Daniel Dunbar 58ed0c6c09 [Support] Fix up comments.
llvm-svn: 156239
2012-05-05 16:39:22 +00:00
Daniel Dunbar 3f0fa19bc4 [Support] Rewrite sys::fs::unique_file to not be stupid with /dev/urandom.
- Just use sys::Process::GetRandomNumber instead of having two poor
   implementations.
 - This is ~70 times (!) faster on my OS X machine.

llvm-svn: 156238
2012-05-05 16:36:24 +00:00
Daniel Dunbar b57ddd4e29 [Support] Add sys::Process::GetRandomNumber().
- Primitive API, but we rarely have need for random numbers.

llvm-svn: 156237
2012-05-05 16:36:20 +00:00
Benjamin Kramer 047d7ca0b1 CodeGenPrepare: Add a transform to turn selects into branches in some cases.
This came up when a change in block placement formed a cmov and slowed down a
hot loop by 50%:

	ucomisd	(%rdi), %xmm0
	cmovbel	%edx, %esi

cmov is a really bad choice in this context because it doesn't get branch
prediction. If we emit it as a branch, an out-of-order CPU can do a better job
(if the branch is predicted right) and avoid waiting for the slow load+compare
instruction to finish. Of course it won't help if the branch is unpredictable,
but those are really rare in practice.

This patch uses a dumb conservative heuristic, it turns all cmovs that have one
use and a direct memory operand into branches. cmovs usually save some code
size, so we disable the transform in -Os mode. In-Order architectures are
unlikely to benefit as well, those are included in the
"predictableSelectIsExpensive" flag.

It would be better to reuse branch probability info here, but BPI doesn't
support select instructions currently. It would make sense to use the same
heuristics as the if-converter pass, which does the opposite direction of this
transform.


Test suite shows a small improvement here and there on corei7-level machines,
but the actual results depend a lot on the used microarchitecture. The
transformation is currently disabled by default and available by passing the
-enable-cgp-select2branch flag to the code generator.

Thanks to Chandler for the initial test case to him and Evan Cheng for providing
me with comments and test-suite numbers that were more stable than mine :)

llvm-svn: 156234
2012-05-05 12:49:22 +00:00
Benjamin Kramer e31f31e5c0 Add a new target hook "predictableSelectIsExpensive".
This will be used to determine whether it's profitable to turn a select into a
branch when the branch is likely to be predicted.

Currently enabled for everything but Atom on X86 and Cortex-A9 devices on ARM.

I'm not entirely happy with the name of this flag, suggestions welcome ;)

llvm-svn: 156233
2012-05-05 12:49:14 +00:00
Benjamin Kramer a25a61b9e8 NVPTX: Initialize the UseF32FTZ flag.
llvm-svn: 156232
2012-05-05 11:22:02 +00:00
Stepan Dyatkovskiy cb2a1a34e2 Small fix in InstCombineCasts.cpp. Restored "alloca + bitcast" reducing for case when alloca's size is calculated within the "add/sub/... nsw".
Also added fix to 2011-06-13-nsw-alloca.ll test.

llvm-svn: 156231
2012-05-05 07:09:40 +00:00
Eric Christopher de9e92ed9b Typo.
llvm-svn: 156226
2012-05-05 01:16:06 +00:00
Jakob Stoklund Olesen e326ed33a8 Make sure findRepresentativeClass picks the widest super-register.
We want the representative register class to contain the largest
super-registers available. This makes the function less sensitive to the
register class numbering.

llvm-svn: 156220
2012-05-04 22:53:28 +00:00
Jakob Stoklund Olesen e89496fe63 Remove extra comma in debug output.
llvm-svn: 156219
2012-05-04 22:53:26 +00:00
David Blaikie 891d0a3d20 Fix warnings in release build.
This fixes a couple of Clang warnings in release builds of LLVM:

* Missing return in ISelLowering
* Unused variable in NVPTXutil.cpp

llvm-svn: 156216
2012-05-04 22:34:16 +00:00
Kevin Enderby cabbae653e Tweak to the fix in r156212, as with the change in removing the shift the
SignExtend32<22>(Val<<1) also needs to change to SignExtend32<21>(Val) .

llvm-svn: 156213
2012-05-04 22:09:52 +00:00
Kevin Enderby 8ce1ada1be Fix a bug in the ARM disassembler for wide branch conditional instructions
where the symbolic operand's displacement was incorrectly shifted left by 1.
rdar://11387046

llvm-svn: 156212
2012-05-04 22:02:27 +00:00
Chandler Carruth cd3464ee22 Fix a Clang warning in the new NVPTX backend:
In file included from ../lib/Target/NVPTX/VectorElementize.cpp:53:
../lib/Target/NVPTX/NVPTX.h:44:3: warning: default label in switch which covers all enumeration values [-Wcovered-switch-default]
  default: assert(0 && "Unknown condition code");
  ^
1 warning generated.

The prevailing pattern in LLVM is to not use a default label, and instead to
use llvm_unreachable to denote that the switch in fact covers all return paths
from the function.

llvm-svn: 156209
2012-05-04 21:35:49 +00:00
Chandler Carruth 6781821c01 Teach the code extractor how to extract a sequence of blocks from
RegionInfo's RegionNode. This mirrors the logic for automating the
extraction from a Loop.

llvm-svn: 156208
2012-05-04 21:33:30 +00:00
Chandler Carruth 8880325a92 Rename the Region::block_iterator to Region::block_node_iterator, and
add a new Region::block_iterator which actually iterates over the basic
blocks of the region.

The old iterator, now call 'block_node_iterator' iterates over
RegionNodes which contain a single basic block. This works well with the
GraphTraits-based iterator design, however most users actually want an
iterator over the BasicBlocks inside these RegionNodes. Now the
'block_iterator' is a wrapper which exposes exactly this interface.
Internally it uses the block_node_iterator to walk all nodes which are
single basic blocks, but transparently unwraps the basic block to make
user code simpler.

While this patch is a bit of a wash, most of the updates are to internal
users, not external users of the RegionInfo. I have an accompanying
patch to Polly that is a strict simplification of every user of this
interface, and I'm working on a pass that also wants the same simplified
interface.

This patch alone should have no functional impact.

llvm-svn: 156202
2012-05-04 20:55:23 +00:00
Justin Holewinski ae556d3ef7 This patch adds a new NVPTX back-end to LLVM which supports code generation for NVIDIA PTX 3.0. This back-end will (eventually) replace the current PTX back-end, while maintaining compatibility with it.
The new target machines are:

nvptx (old ptx32) => 32-bit PTX
nvptx64 (old ptx64) => 64-bit PTX

The sources are based on the internal NVIDIA NVPTX back-end, and
contain more functionality than the current PTX back-end currently
provides.

NV_CONTRIB

llvm-svn: 156196
2012-05-04 20:18:50 +00:00
Sebastian Pop 2420e8b7d5 Added missing CMN case in Thumb2SizeReduction pass so that LLVM emits 16-bits encoding of CMN instructions.
llvm-svn: 156195
2012-05-04 19:53:56 +00:00
Preston Gurd d6c440cd4c Adds Intel Atom scheduling latencies to X86InstrSystem.td.
llvm-svn: 156194
2012-05-04 19:26:37 +00:00
Matt Beaumont-Gay e82ab6baa7 Pacify GCC's -Wreturn-type
llvm-svn: 156189
2012-05-04 18:34:27 +00:00
Chandler Carruth 14316fcf7d Factor the computation of input and output sets into a public interface
of the CodeExtractor utility. This allows speculatively computing input
and output sets to measure the likely size impact of the code
extraction.

These sets cannot be reused sadly -- we mutate the function prior to
forming the final sets used by the actual extraction.

The interface has been revamped slightly to make it easier to use
correctly by making the interface const and sinking the computation of
the number of exit blocks into the full extraction function and away
from the rest of this logic which just computed two output parameters.

llvm-svn: 156168
2012-05-04 11:20:27 +00:00
Chandler Carruth 44e13911bc Rather than trying to gracefully handle input sequences with repeated
blocks, assert that this doesn't happen. We don't want to bother trying
to support this call pattern as it isn't necessary.

llvm-svn: 156167
2012-05-04 11:17:06 +00:00
Chandler Carruth 0a570552d1 Fix a goof with my previous commit by completely returning when we
detect an in-eligible block rather than just breaking out of the loop.

llvm-svn: 156166
2012-05-04 11:14:19 +00:00
Chandler Carruth 2f5d0191f7 Hoist a safety assert from the extraction method into the construction
of the extractor itself.

llvm-svn: 156164
2012-05-04 10:26:45 +00:00
Chandler Carruth 0fde00150d Move the CodeExtractor utility to a dedicated header file / source file,
and expose it as a utility class rather than as free function wrappers.

The simple free-function interface works well for the bugpoint-specific
pass's uses of code extraction, but in an upcoming patch for more
advanced code extraction, they simply don't expose a rich enough
interface. I need to expose various stages of the process of doing the
code extraction and query information to decide whether or not to
actually complete the extraction or give up.

Rather than build up a new predicate model and pass that into these
functions, just take the class that was actually implementing the
functions and lift it up into a proper interface that can be used to
perform code extraction. The interface is cleaned up and re-documented
to work better in a header. It also is now setup to accept the blocks to
be extracted in the constructor rather than in a method.

In passing this essentially reverts my previous commit here exposing
a block-level query for eligibility of extraction. That is no longer
necessary with the more rich interface as clients can query the
extraction object for eligibility directly. This will reduce the number
of walks of the input basic block sequence by quite a bit which is
useful if this enters the normal optimization pipeline.

llvm-svn: 156163
2012-05-04 10:18:49 +00:00
Hans Wennborg aea412008e Make ARM and Mips use TargetMachine::getTLSModel()
This moves the logic for selecting a TLS model to a single place,
instead of the previous three (ARM, Mips, and X86 which already
uses this function).

llvm-svn: 156162
2012-05-04 09:40:39 +00:00
Craig Topper bdd2e34b1f Fix some loops to match coding standards. No functional change intended.
llvm-svn: 156159
2012-05-04 06:39:13 +00:00
Craig Topper d4d3237bb8 Fix up some spacing. No functional change.
llvm-svn: 156158
2012-05-04 06:18:33 +00:00
Craig Topper e2ae413746 Simplify broadcast lowering code. No functional change intended.
llvm-svn: 156157
2012-05-04 05:49:51 +00:00
Craig Topper 42f2182366 Allow v16i16 and v32i8 shuffles to be rewritten as narrower shuffles.
llvm-svn: 156156
2012-05-04 04:44:49 +00:00
Bill Wendling fa0ebcd1b0 Add 'landingpad' instructions to the list of instructions to ignore.
Also combine the code in the 'assert' statement.

llvm-svn: 156155
2012-05-04 04:22:32 +00:00
Craig Topper 59063c0a3d Simplify shuffle narrowing code a bit. No functional change intended.
llvm-svn: 156154
2012-05-04 04:08:44 +00:00
Jakob Stoklund Olesen 796e5272ab Remove the SubRegClasses field from RegisterClass descriptions.
This information in now computed by TableGen.

llvm-svn: 156152
2012-05-04 03:30:34 +00:00
Jakob Stoklund Olesen 75fbe90839 Use SuperRegClassIterator for findRepresentativeClass().
The masks returned by SuperRegClassIterator are computed automatically
by TableGen. This is better than depending on the manually specified
SuperRegClasses.

llvm-svn: 156147
2012-05-04 02:19:22 +00:00
Jakob Stoklund Olesen 34a8f13e5f Initialize SparcInstrInfo before SparcTargetLowering.
The TargetLowering construction needs to use a valid TargetRegisterInfo
instance.

llvm-svn: 156146
2012-05-04 02:16:39 +00:00
Jakob Stoklund Olesen 57c7050675 Add a SuperRegClassIterator class.
This iterator class provides a more abstract interface to the (Idx,
Mask) lists of super-registers for a register class. The layout of the
tables shouldn't be exposed to clients.

llvm-svn: 156144
2012-05-04 01:48:29 +00:00
Chandler Carruth da7513a834 A pile of long over-due refactorings here. There are some very, *very*
minor behavior changes with this, but nothing I have seen evidence of in
the wild or expect to be meaningful. The real goal is unifying our logic
and simplifying the interfaces. A summary of the changes follows:

- Make 'callIsSmall' actually accept a callsite so it can handle
  intrinsics, and simplify callers appropriately.
- Nuke a completely bogus declaration of 'callIsSmall' that was still
  lurking in InlineCost.h... No idea how this got missed.
- Teach the 'isInstructionFree' about the various more intelligent
  'free' heuristics that got added to the inline cost analysis during
  review and testing. This mostly surrounds int->ptr and ptr->int casts.
- Switch most of the interesting parts of the inline cost analysis that
  were essentially computing 'is this instruction free?' to use the code
  metrics routine instead. This way we won't keep duplicating logic.

All of this is motivated by the desire to allow other passes to compute
a roughly equivalent 'cost' metric for a particular basic block as the
inline cost analysis. Sadly, re-using the same analysis for both is
really messy because only the actual inline cost analysis is ever going
to go to the contortions required for simplification, SROA analysis,
etc.

llvm-svn: 156140
2012-05-04 00:58:03 +00:00
Jakob Stoklund Olesen 2f460ae3b4 Use a shared implementation of getMatchingSuperRegClass().
TargetRegisterClass now gives access to the necessary tables.

llvm-svn: 156122
2012-05-03 22:49:04 +00:00
Kevin Enderby 914223010c Fix issues with the ARM bl and blx thumb instructions and the J1 and J2 bits
for the assembler and disassembler.  Which were not being set/read correctly
for offsets greater than 22 bits in some cases.

Changes to lib/Target/ARM/ARMAsmBackend.cpp from Gideon Myles!

llvm-svn: 156118
2012-05-03 22:41:56 +00:00
Chandler Carruth a46e62424b Factor the logic for testing whether a basic block is viable for code
extraction into a public interface. Also clean it up and apply it more
consistently such that we check for landing pads *anywhere* in the
extracted code, not just in single-block extraction.

This will be used to guide decisions in passes that are planning to
eventually perform a round of code extraction.

llvm-svn: 156114
2012-05-03 22:26:53 +00:00
Nuno Lopes d4cf35d775 remove calls to calloc if the allocated memory is not used (it was already being done for malloc)
fix a few typos found by Chad in my previous commit

llvm-svn: 156110
2012-05-03 22:08:19 +00:00
Sirish Pande f8e5e3c072 Support for target dependent Hexagon VLIW packetizer.
This patch creates and optimizes packets as per Hexagon ISA rules.

llvm-svn: 156109
2012-05-03 21:52:53 +00:00
Nuno Lopes d2b71e7fa9 add support for calloc to objectsize lowering
llvm-svn: 156102
2012-05-03 21:19:58 +00:00
Silviu Baranga 9560af848c Fixed disassembler for vstm/vldm ARM VFP instructions.
llvm-svn: 156077
2012-05-03 16:38:40 +00:00
Sirish Pande c92c31674e Extensions of Hexagon V4 instructions.
This adds new instructions for Hexagon V4 architecture.

llvm-svn: 156071
2012-05-03 16:18:50 +00:00
Nuno Lopes 22f6f3b055 replace 'break's with 'return 0' in visitCallInst code for objectsize, since there is no need to fallback to visitCallSite.
This gives a 0.9% in a test case

llvm-svn: 156069
2012-05-03 16:06:07 +00:00
Craig Topper 242183834a Use 'unsigned' instead of 'int' in a few places dealing with counts of vector elements.
llvm-svn: 156060
2012-05-03 07:26:59 +00:00
Craig Topper 315a5cc789 Fix 256-bit vpshuflw and vpshufhw immediate encoding to handle undefs in the lower half correctly. Missed in r155982.
llvm-svn: 156059
2012-05-03 07:12:59 +00:00
Evan Cheng b64e7b778b Fix two-address pass's aggressive instruction commuting heuristics. It's meant
to catch cases like:
 %reg1024<def> = MOV r1
 %reg1025<def> = MOV r0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

By commuting ADD, it let coalescer eliminate all of the copies. However, there
was a bug in the heuristics where it ended up commuting the ADD in:

 %reg1024<def> = MOV r0
 %reg1025<def> = MOV 0
 %reg1026<def> = ADD %reg1024, %reg1025
 r0            = MOV %reg1026

That did no benefit but rather ensure the last MOV would not be coalesced.

rdar://11355268

llvm-svn: 156048
2012-05-03 01:45:13 +00:00
Andrew Trick 32aea358e1 Added TargetRegisterInfo::getAllocatableClass.
The ensures that virtual registers always belong to an allocatable class.
If your target attempts to create a vreg for an operand that has no
allocatable register subclass, you will crash quickly.

This ensures that targets define register classes as intended.

llvm-svn: 156046
2012-05-03 01:14:37 +00:00
Bill Wendling c94d86c4ad Whitespace cleanup.
llvm-svn: 156034
2012-05-02 23:43:23 +00:00
Owen Anderson 41b0665b5b Teach DAGCombine the same multiply-by-1.0 folding trick when doing FMAs, just like it now knows for FMULs.
llvm-svn: 156029
2012-05-02 22:17:40 +00:00
Preston Gurd 926afd7401 For Intel Atom, use ILP scheduling always, instead of ILP for 64 bit
and Hybrid for 32 bit, since benchmarks show ILP scheduling is better
most of the time.

llvm-svn: 156028
2012-05-02 22:02:02 +00:00
Preston Gurd c0b976c42a Change the Intel Atom detection code to recognize
Lincroft and Medfield.

llvm-svn: 156025
2012-05-02 21:38:46 +00:00
Owen Anderson b5f167c660 Teach DAG combine that multiplication by 1.0 can always be constant folded.
llvm-svn: 156023
2012-05-02 21:32:35 +00:00
Jim Grosbach 28b0b7279e ARM: Add missing two-operand VBIC aliases.
llvm-svn: 156019
2012-05-02 21:11:56 +00:00
Douglas Gregor 12c1cd33f4 Move llvm-tblgen's StringMatcher into the TableGen library so it can
be used by clang-tblgen.

llvm-svn: 156000
2012-05-02 17:32:48 +00:00
Preston Gurd fa3f6cb830 This patch continues the work of adding instruction latencies for X86 Atom,
by providing the latencies for the instructions in X86InstrFPStack.td.

llvm-svn: 155996
2012-05-02 16:03:35 +00:00
Manman Ren f02efc8731 Revert r155853
The commit is intended to fix rdar://10961709.
But it is the root cause of PR12720.
Revert it for now.

llvm-svn: 155992
2012-05-02 15:24:32 +00:00
Kostya Serebryany ae7188d9b9 [tsan] typo and style (thanks to Nick Lewycky)
llvm-svn: 155986
2012-05-02 13:12:19 +00:00
Bill Wendling 274ba89d77 The value held in the vector may be RAUW'ed by some of the canonicalization
methods. Use a weak value handle to keep up with this.
PR12245

llvm-svn: 155984
2012-05-02 09:59:45 +00:00
Richard Barton 0fc56890ba Disallow YIELD and other allocated nop hints in pre-ARMv6 architectures.
llvm-svn: 155983
2012-05-02 09:43:18 +00:00
Craig Topper c73bc39c22 Add support for selecting AVX2 vpshuflw and vpshufhw. Add decoding support for AsmPrinter.
llvm-svn: 155982
2012-05-02 08:03:44 +00:00
Eli Friedman 4a80e94b86 Fix the implementation of MachOObjectFile::isSectionZeroInit so it follows the MachO spec.
llvm-svn: 155976
2012-05-02 02:31:28 +00:00
Jim Grosbach edcb868fe3 Tidy up. Naming conventions.
llvm-svn: 155960
2012-05-01 23:21:41 +00:00
Jakub Staszak 6126401c83 Remove unneeded break.
llvm-svn: 155959
2012-05-01 23:08:16 +00:00
Jakub Staszak cd2353402d Use dyn_cast instead of checking opcode and cast.
llvm-svn: 155957
2012-05-01 23:06:00 +00:00
Jakub Staszak 339380286b Remove trailing spaces.
llvm-svn: 155956
2012-05-01 23:04:38 +00:00
Bill Wendling b6b50c6638 Strip the pointer casts off of allocas so that the selection DAG can find them.
PR10799

llvm-svn: 155954
2012-05-01 22:50:45 +00:00
Sirish Pande 94212168fc Target independent Hexagon Packetizer fix.
llvm-svn: 155947
2012-05-01 21:28:30 +00:00
Jim Grosbach 1d20efb837 ARM: Add a few missing add->sub aliases w/ 'w' suffix.
Aliases for adding a negative immediate when using an explicit 'w'
suffix. E.g.,
        adds.w r2, #-16
        adds.w r2, r2, #-16
        addw r2, #-16
        addw r2, #-16
        addw r2, r2, #-16

rdar://11330769

llvm-svn: 155946
2012-05-01 21:17:34 +00:00
Jim Grosbach 70bed4faaf ARM: allow vanilla expressions for movw/movt.
Expressions for movw/movt don't always have an :upper16: or :lower16:
on them and that's ok. When they don't, it's just a plain [0-65536]
immediate result, effectively the same as a :lower16: variant kind.

rdar://10550147

llvm-svn: 155941
2012-05-01 20:43:21 +00:00
Preston Gurd 5ae5278ca1 This patch marks the X86 floating point stack registers ST0-ST7 as reserved
in order to avoid assertion failures in the register scavenger. The assertion
failures were “Bad machine code: Using an undefined physical register” and
“Bad machine code: MBB exits via unconditional fall-through but its successor
differs from its CFG successor!”.

llvm-svn: 155930
2012-05-01 19:50:22 +00:00
Jim Grosbach 758e0cc94a MC: Unknown assembler directives are now hard errors.
Previously, an unsupported/unknown assembler directive issued a warning.
That's generally unsafe, and inconsistent with the behaviour of pretty
much every system assembler. Now that the MC assemblers are mature
enough to be the default on multiple targets, it's reasonable to
issue errors for these.

For target or platform directives that need to stay warnings, we
should add explicit handlers for them in, e.g., ELFAsmParser.cpp,
DarwinAsmParser.cpp, et. al., and issue the warning there.

rdar://9246275

llvm-svn: 155926
2012-05-01 18:38:27 +00:00
Jim Grosbach a0c53f147a MC: Remove errant EatToEndOfStatement() in asm parser.
The caller is already responsible for eating any additional input on the
line. Putting an additional EatToEndOfStatement() in ParseStatement()
causes an entire extra statement to be consumed when treating warnings
as errors. For example, test/MC/macros.s will assert() because the
.endmacro directive is missed as a result.

rdar://11355843

llvm-svn: 155925
2012-05-01 18:38:24 +00:00
Manman Ren 425a55c1ce X86: optimization for max-like struct
This patch will optimize the following cases on X86
(a > b) ? (a-b) : 0
(a >= b) ? (a-b) : 0
(b < a) ? (a-b) : 0
(b <= a) ? (a-b) : 0

FROM
movl    %edi, %ecx
subl    %esi, %ecx
cmpl    %edi, %esi
movl    $0, %eax
cmovll  %ecx, %eax
TO
xorl    %eax, %eax
subl    %esi, %edi
cmovll  %eax, %edi
movl    %edi, %eax

rdar: 10734411
llvm-svn: 155919
2012-05-01 17:16:15 +00:00
Alexey Samsonov c4b3ad8195 X86: Use StackRegister instead of FrameRegister in getFrameIndexReference (to generate debug info for local variables) if stack needs realignment
llvm-svn: 155917
2012-05-01 15:16:06 +00:00
Benjamin Kramer cb3e98cf44 Move MipsDisassembler classes into an anonymous namespace.
llvm-svn: 155915
2012-05-01 14:34:24 +00:00
Benjamin Kramer 512c1dce8f Value-initialize global to avoid global construction.
llvm-svn: 155909
2012-05-01 10:48:02 +00:00
Eli Bendersky 667b879e73 RuntimeDyld cleanup:
- Improved parameter names for clarity
- Added comments
- emitCommonSymbols should return void because its return value is not being
  used anywhere
- Attempt to reduce the usage of the RelocationValueRef type. Restricts it 
  for a single goal and may serve as a step for eventual removal.

llvm-svn: 155908
2012-05-01 10:41:12 +00:00
Benjamin Kramer 84b857e4e6 YAMLParser: get rid of global ctors & dtors.
llvm-svn: 155907
2012-05-01 10:19:59 +00:00
Bill Wendling b12f16e75f Change the PassManager from a reference to a pointer.
The TargetPassManager's default constructor wants to initialize the PassManager
to 'null'. But it's illegal to bind a null reference to a null l-value. Make the
ivar a pointer instead.
PR12468

llvm-svn: 155902
2012-05-01 08:27:43 +00:00
Craig Topper 05eb6e096a Allow BMI, AES, F16C, POPCNT, FMA3, and CLMUL to be detected on AMD processors.
llvm-svn: 155899
2012-05-01 07:10:32 +00:00
Eli Bendersky fc079081b7 RuntimeDyld code cleanup:
- There's no point having a different type for the local and global symbol
  tables.
- Renamed SymbolTable to GlobalSymbolTable to clarify the intention
- Improved const correctness where relevant

llvm-svn: 155898
2012-05-01 06:58:59 +00:00
Craig Topper bae0e9ea1d Make XOP and FMA4 require SSE4A to match GCC behavior. Use this to simplify Bulldozer feature list.
llvm-svn: 155897
2012-05-01 06:54:48 +00:00
Craig Topper d32ebcc36b Attempt to handle MRMInitReg in emitVEXOpcodePrefix. Hopefully fixes PR12711.
llvm-svn: 155896
2012-05-01 06:34:01 +00:00
Craig Topper 43518cc55f Make XOP imply AVX as its needed to legalize the registers types.
llvm-svn: 155891
2012-05-01 05:41:41 +00:00
Craig Topper c0cef32b83 Remove HasSSE2 from AES and CLMUL predicates. It's now implied by the HasAES and HasCLMUL predicates.
llvm-svn: 155890
2012-05-01 05:35:02 +00:00
Craig Topper 29dd148a71 Make CLMUL and AES imply SSE2 since its needed to legalize the type.
llvm-svn: 155888
2012-05-01 05:28:32 +00:00
Craig Topper 0eacda5f69 Enable AVX and FMA4 for AMD Bulldozer processors.
llvm-svn: 155885
2012-05-01 05:18:13 +00:00
Nick Lewycky 78ee67e814 An instruction in a loop is not guaranteed to be executed just because the loop
has no exit blocks. Fixes PR12706!

llvm-svn: 155884
2012-05-01 04:03:01 +00:00
Lang Hames 3a90fabd85 Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. Fixes
<rdar://problem/11291436>.

This is a second attempt at a fix for this, the first was r155468. Thanks
to Chandler, Bob and others for the feedback that helped me improve this.

llvm-svn: 155866
2012-05-01 00:20:38 +00:00
Jakub Staszak cec09b2594 Add some constantness. No functionality change.
llvm-svn: 155859
2012-04-30 23:41:30 +00:00
Manman Ren 4f4d5c8fc8 X86: optimization for -(x != 0)
This patch will optimize -(x != 0) on X86
FROM 
cmpl	$0x01,%edi
sbbl	%eax,%eax
notl	%eax
TO
negl %edi
sbbl %eax %eax

llvm-svn: 155853
2012-04-30 22:51:25 +00:00
Jim Grosbach e78031a9f3 ARM: Diagnostics for out of range fixups.
Replace some assert() calls w/ actual diagnostics. In a perfect world,
there'd be range checks on these values long before things ever reached
this code. For now, though, issuing a better-late-than-never diagnostic
is still a big improvement over assert().

rdar://11347287

llvm-svn: 155851
2012-04-30 22:30:43 +00:00
Jakob Stoklund Olesen 8503ba984f Fix address calculation error from r155744.
This was exposed by SingleSource/UnitTests/Vector/constpool.c.

The computed size of a basic block isn't always a multiple of its known
alignment, and that can introduce extra alignment padding after the
block.

<rdar://problem/11347135>

llvm-svn: 155845
2012-04-30 20:19:00 +00:00
Chad Rosier d427d51c2b Tidy up. No functional change intended.
llvm-svn: 155832
2012-04-30 17:47:15 +00:00
Derek Schuff b051adf263 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

(this time, actually commit what was reviewed!)

llvm-svn: 155825
2012-04-30 16:57:15 +00:00
Bob Wilson 9245c93656 Don't introduce illegal types when creating vmull operations. <rdar://11324364>
ARM BUILD_VECTORs created after type legalization cannot use i8 or i16
operands, since those types are not legal.  Instead use i32 operands, which
will be implicitly truncated by the BUILD_VECTOR to match the element type.

llvm-svn: 155824
2012-04-30 16:53:34 +00:00
Eli Bendersky b92e1cf636 It doesn't make sense to move symbol relocations to section relocations when
relocations are resolved.  It's much more reasonable to do this decision when
relocations are just being added - we have all the information at that point.

Also a bit of renaming and extra comments to clarify extensions.

llvm-svn: 155819
2012-04-30 12:15:58 +00:00
Duncan Sands 34c4869cf6 Just mark the sign bit as known zero, rather than any other irrelevant bits
known zero in the LHS.  Fixes PR12541.

llvm-svn: 155818
2012-04-30 11:56:58 +00:00
Bill Wendling bf4b9afbeb Second attempt at PR12573:
Allow the "SplitCriticalEdge" function to split the edge to a landing pad. If
the pass is *sure* that it thinks it knows what it's doing, then it may go ahead
and specify that the landing pad can have its critical edge split. The loop
unswitch pass is one of these passes. It will split the critical edges of all
edges coming from a loop to a landing pad not within the loop. Doing so will
retain important loop analysis information, such as loop simplify.

llvm-svn: 155817
2012-04-30 10:44:54 +00:00
Bill Wendling 325e6cd9cb Use an ArrayRef instead of explicit vector type.
llvm-svn: 155816
2012-04-30 10:25:51 +00:00
Eli Bendersky 32d5488f64 Code cleanup in RuntimeDyld:
- Add comments
- Change field names to be more reasonable
- Fix indentation and naming to conform to coding conventions
- Remove unnecessary includes / replace them by forward declatations

llvm-svn: 155815
2012-04-30 10:06:27 +00:00
Bill Wendling 712d85a8c0 Remove hack from r154987. The problem persists even with it, so it's not even a good hack.
llvm-svn: 155813
2012-04-30 09:23:48 +00:00
Craig Topper 55b3990837 No need to normalize index before calling Extract128BitVector
llvm-svn: 155811
2012-04-30 05:17:10 +00:00
Pete Cooper f76b5fe5ab Copied all the VEX prefix encoding code from X86MCCodeEmitter to the x86 JIT emitter. Needs some major refactoring as these two code emitters are almost identical
llvm-svn: 155810
2012-04-30 03:56:44 +00:00
Rafael Espindola dd48931461 Make sure HoistInsertPosition finds a position that is dominated by all
inputs.

llvm-svn: 155809
2012-04-30 03:53:06 +00:00
Jakub Staszak da03f3ba64 Remove unneeded casts. No functionality change.
llvm-svn: 155800
2012-04-29 20:52:53 +00:00
Craig Topper 3b94fa63d6 Simplify code a bit. No functional change intended.
llvm-svn: 155798
2012-04-29 20:22:05 +00:00
Kalle Raiskila 4c5f83ea19 Update the documentation of CellSPU, in case it gets removed in 3.1.
llvm-svn: 155797
2012-04-29 20:00:55 +00:00
Benjamin Kramer db25381a54 RegisterPressure: ArrayRefize some functions for better readability. No functionality change.
llvm-svn: 155795
2012-04-29 18:52:56 +00:00
Eli Bendersky 0e2ac5bcdb Fix some formatting, grammar and style issues and add a couple of missing comments.
llvm-svn: 155793
2012-04-29 12:40:47 +00:00
Jakob Stoklund Olesen 6053899aa0 Don't update spill weights when joining intervals.
We don't compute spill weights until after coalescing anyway.

llvm-svn: 155766
2012-04-28 19:19:11 +00:00
Jakob Stoklund Olesen 4fe0e1908e Spring cleaning - Delete dead code.
llvm-svn: 155765
2012-04-28 19:19:07 +00:00
Jakob Stoklund Olesen ae7521d1e4 Fix a problem with blocks that need to be split twice.
The code could search past the end of the basic block when there was
already a constant pool entry after the block.

Test case with giant basic block in SingleSource/UnitTests/Vector/constpool.c

llvm-svn: 155753
2012-04-28 06:21:38 +00:00
Andrew Trick 833f04962a Reapply 155668: Fix the SD scheduler to avoid gluing the same node twice.
This time, also fix the caller of AddGlue to properly handle
incomplete chains. AddGlue had failure modes, but shamefully hid them
from its caller. It's luck ran out.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155749
2012-04-28 01:03:23 +00:00
Jim Grosbach c6f32b3295 ARM: Thumb add(sp plus register) asm constraints.
Make sure when parsing the Thumb1 sp+register ADD instruction that
the source and destination operands match. In thumb2, just use the
wide encoding if they don't. In Thumb1, issue a diagnostic.

rdar://11219154

llvm-svn: 155748
2012-04-27 23:51:36 +00:00
Jim Grosbach 9d8f6f3d9d ARM: Tweak tADDrSP definition for consistent operand order.
Make the operand order of the instruction match that of the asm syntax.

llvm-svn: 155747
2012-04-27 23:51:33 +00:00
Derek Schuff a99b168145 Revert r155745
llvm-svn: 155746
2012-04-27 23:37:41 +00:00
Derek Schuff bbf8b83e90 Fix fastcc structure return with fast-isel on x86-32
On x86-32, structure return via sret lets the callee pop the hidden
pointer argument off the stack, which the caller then re-pushes.
However if the calling convention is fastcc, then a register is used
instead, and the caller should not adjust the stack. This is
implemented with a check of IsTailCallConvention
X86TargetLowering::LowerCall but is now checked properly in
X86FastISel::DoSelectCall.

llvm-svn: 155745
2012-04-27 23:27:17 +00:00
Jakob Stoklund Olesen 5f0d1b462c Track worst case alignment padding more accurately.
Previously, ARMConstantIslandPass would conservatively compute the
address of an aligned basic block as:

  RoundUpToAlignment(Offset + UnknownPadding)

This worked fine for the layout algorithm itself, but it could fool the
verify() function because it accounts for alignment padding twice: Once
when adding the worst case UnknownPadding, and again by rounding up the
fictional block offset. This meant that when optimizeThumb2Instructions
would shrink an instruction, the conservative distance estimate could
grow. That shouldn't be possible since the woorst case alignment padding
wss already included.

This patch drops the use of RoundUpToAlignment, and depends only on
worst case padding to compute conservative block offsets. This has the
weird effect that the computed offset for an aligned block may not be
aligned.

The important difference is that shrinking an instruction can never
cause the estimated distance between two instructions to grow. The
estimated distance is always larger than the real distance that only the
assembler knows.

<rdar://problem/11339352>

llvm-svn: 155744
2012-04-27 22:58:38 +00:00
Andrew Trick 7a773ec053 Temporarily revert r155668: Fix the SD scheduler to avoid gluing.
This definitely caused regression with ARM -mno-thumb.

llvm-svn: 155743
2012-04-27 22:55:59 +00:00
Craig Topper 0fa6c7e593 Use 'unsigned' instead of 'int' in several places when retrieving number of vector elements.
llvm-svn: 155742
2012-04-27 22:54:43 +00:00
Chad Rosier 32c2178ef3 Add x86-specific DAG combine to simplify:
x == -y --> x+y == 0
 x != -y --> x+y != 0

On x86, the generated code goes from
   negl    %esi
   cmpl    %esi, %edi
   je    .LBB0_2
to
   addl    %esi, %edi
   je    .L4

This case is correctly handled for ARM with "cmn".

Patch by Manman Ren.
rdar://11245199
PR12545

llvm-svn: 155739
2012-04-27 22:33:25 +00:00
Michael J. Spencer 6033113e35 [Support/YAMLParser] Fix ASan found bugs.
llvm-svn: 155735
2012-04-27 21:12:20 +00:00
Craig Topper 42cd8d2c00 Tidy up spacing.
llvm-svn: 155733
2012-04-27 21:05:09 +00:00
Hal Finkel 27c3246169 Don't vectorize target-specific types (ppc_fp128, x86_fp80, etc.).
Target specific types should not be vectorized. As a practical matter,
these types are already register matched (at least in the x86 case),
and codegen does not always work correctly (at least in the ppc case,
and this is not worth fixing because ppc_fp128 is currently broken and
will probably go away soon).

llvm-svn: 155729
2012-04-27 19:34:00 +00:00
David Blaikie 84e4b39995 Change recurse depth limit to uint32 to fix warning.
llvm-svn: 155727
2012-04-27 19:30:32 +00:00
Dan Gohman dae3349ac2 Miscellaneous accumulated cleanups.
llvm-svn: 155725
2012-04-27 18:56:31 +00:00
Lang Hames ea001225c1 Fix the order of the operands in the llvm.fma intrinsic patterns for ARM,
<rdar://problem/11325085>.

llvm-svn: 155724
2012-04-27 18:51:24 +00:00
Mon P Wang 6120cfb8cd Add an early bailout to IsValueFullyAvailableInBlock from deeply nested blocks.
The limit is set to an arbitrary 1000 recursion depth to avoid stack overflow
issues. <rdar://problem/11286839>.

llvm-svn: 155722
2012-04-27 18:09:28 +00:00
Dan Gohman 1ccecdb2fd Reapply r155682, making constant folding more consistent, with a fix to work
properly with how the code handles all-undef PHI nodes.

llvm-svn: 155721
2012-04-27 17:50:22 +00:00
Richard Barton 82f95ea2ad Fix ARM assembly parsing for upper case condition codes on IT instructions.
llvm-svn: 155720
2012-04-27 17:34:01 +00:00
Benjamin Kramer 913da4b261 X86: Don't emit conditional floating point moves on when targeting pre-pentiumpro architectures.
* Model FPSW (the FPU status word) as a register.
* Add ISel patterns for the FUCOM*, FNSTSW and SAHF instructions.
* During Legalize/Lowering, build a node sequence to transfer the comparison
result from FPSW into EFLAGS. If you're wondering about the right-shift: That's
an implicit sub-register extraction (%ax -> %ah) which is handled later on by
the instruction selector.

Fixes PR6679. Patch by Christoph Erhardt!

llvm-svn: 155704
2012-04-27 12:07:43 +00:00
Kostya Serebryany 5a464f03d3 [asan] small optimization: do not emit "x+0" instructions
llvm-svn: 155701
2012-04-27 10:04:53 +00:00
Richard Barton f435b09eaf Refactor IT handling not to store the bottom bit of the condition code in the mask operand in the MCInst.
llvm-svn: 155700
2012-04-27 08:42:59 +00:00
NAKAMURA Takumi 6008dfdb70 Revert r155682, "Use ConstantExpr::getExtractElement when constant-folding vectors"
It broke stage2 build. stage1/clang sometimes crashed.

llvm-svn: 155699
2012-04-27 07:59:20 +00:00
Kostya Serebryany a1259778b4 [tsan] Atomic support for ThreadSanitizer, patch by Dmitry Vyukov
llvm-svn: 155698
2012-04-27 07:31:53 +00:00
Evan Cheng 1ec87ee096 Implement a bastardized ABI.
llvm-svn: 155686
2012-04-27 02:11:10 +00:00
Evan Cheng f52003de56 - thumbv6 shouldn't imply +thumb2. Cortex-M0 doesn't suppport 32-bit Thumb2
instructions.
- However, it does support dmb, dsb, isb, mrs, and msr.
rdar://11331541

llvm-svn: 155685
2012-04-27 01:27:19 +00:00
Dan Gohman 90f3798f26 Use ConstantExpr::getExtractElement when constant-folding vectors
instead of getAggregateElement. This has the advantage of being
more consistent and allowing higher-level constant folding to
procede even if an inner extract element cannot be folded.

Make ConstantFoldInstruction call ConstantFoldConstantExpression
on the instruction's operands, making it more consistent with 
ConstantFoldConstantExpression itself. This makes sure that
ConstantExprs get TargetData-aware folding before being handed
off as operands for further folding.

This causes more expressions to be folded, but due to a known
shortcoming in constant folding, this currently has the side effect
of stripping a few more nuw and inbounds flags in the non-targetdata
side of constant-fold-gep.ll. This is mostly harmless.

This fixes rdar://11324230.

llvm-svn: 155682
2012-04-27 00:54:36 +00:00
Jakob Stoklund Olesen c90abc8956 Break up getProfitableChainIncrement().
The required checks are moved to ChainInstruction() itself and the
policy decisions are moved to IVChain::isProfitableInc().

Also cache the ExprBase in IVChain to avoid frequent recomputations.

No functional change intended.

llvm-svn: 155676
2012-04-26 23:33:11 +00:00
Jakob Stoklund Olesen a0337d7bd9 Turn IVChain into a struct.
No functional change intended.

llvm-svn: 155675
2012-04-26 23:33:09 +00:00
Chad Rosier 7813dcee30 Add instcombine patterns for the following transformations:
(x & y) | (x ^ y) -> x | y 
 (x & y) + (x ^ y) -> x | y 

Patch by Manman Ren.
rdar://10770603

llvm-svn: 155674
2012-04-26 23:29:14 +00:00
Andrew Trick 03fa574af5 Fix the SD scheduler to avoid gluing the same node twice.
DAGCombine strangeness may result in multiple loads from the same
offset. They both may try to glue themselves to another load. We could
insist that the redundant loads glue themselves to each other, but the
beter fix is to bail out from bad gluing at the time we detect it.

Fixes rdar://11314175: BuildSchedUnits assert.

llvm-svn: 155668
2012-04-26 21:48:25 +00:00
Jim Grosbach 3d6c629e26 ARM: Thumb ldr(literal) base address alignment is 32-bits.
The base address for the PC-relative load is Align(PC,4), so it's the
address of the word containing the 16-bit instruction, not the address
of the instruction itself. Ugh.

rdar://11314619

llvm-svn: 155659
2012-04-26 20:48:12 +00:00
Preston Gurd 81290f4be5 Trivial change to set UseLeaForSP flag in addition to toggling
the FeatureLeaForSP feature bit when llvm auto detects Intel Atom.

Patch by Andy Zhang

llvm-svn: 155655
2012-04-26 19:52:27 +00:00
Michael J. Spencer a6c2c29152 [Support/YAML] Properly fix unitialized variable warning by inserting a
'REPLACEMENT CHARACTER' (U+FFFD) when getAsInteger fails.

llvm-svn: 155653
2012-04-26 19:27:11 +00:00
Tim Northover 3de97b7a86 Use VLD1 in NEON extenting-load patterns instead of VLDR.
On some cores it's a bad idea for performance to mix VFP and NEON instructions
and since these patterns are NEON anyway, the NEON load should be used.

llvm-svn: 155630
2012-04-26 08:46:29 +00:00
Tim Northover 6699a60b0e Test commit.
llvm-svn: 155626
2012-04-26 08:24:07 +00:00
Craig Topper 08ccfbe57b Enable detection of AVX and AVX2 support through CPUID. Add AVX/AVX2 to corei7-avx, core-avx-i, and core-avx2 cpu names.
llvm-svn: 155618
2012-04-26 06:40:15 +00:00
Chandler Carruth 739ef80fd7 Teach the reassociate pass to fold chains of multiplies with repeated
elements to minimize the number of multiplies required to compute the
final result. This uses a heuristic to attempt to form near-optimal
binary exponentiation-style multiply chains. While there are some cases
it misses, it seems to at least a decent job on a very diverse range of
inputs.

Initial benchmarks show no interesting regressions, and an 8%
improvement on SPASS. Let me know if any other interesting results (in
either direction) crop up!

Credit to Richard Smith for the core algorithm, and helping code the
patch itself.

llvm-svn: 155616
2012-04-26 05:30:30 +00:00
Evan Cheng 9f7ad310b5 If triple is armv7 / thumbv7 and a CPU is specified, do not automatically assume
the feature set of v7a. This comes about if the user specifies something like
-arch armv7 -mcpu=cortex-m3. We shouldn't be generating instructions such as
uxtab in this case.

rdar://11318438

llvm-svn: 155601
2012-04-26 01:13:36 +00:00
Bill Wendling 0156f44a68 Don't forget to reset 'first operand' flag when we're setting the MDNodeOperand value.
llvm-svn: 155599
2012-04-26 00:38:42 +00:00
Jakob Stoklund Olesen 293673d788 Print IV chain numbers while collecting them.
llvm-svn: 155567
2012-04-25 18:01:32 +00:00
Jakob Stoklund Olesen 01f201f484 Remove more dead code.
llvm-svn: 155566
2012-04-25 18:01:30 +00:00
Richard Barton ba5b0cc82e Unify internal representation of ARM instructions with a register right-shifted by #32. These are stored as shifts by #0 in the MCInst and correctly marshalled when transforming from or to assembly representation.
llvm-svn: 155565
2012-04-25 18:00:18 +00:00
Jakob Stoklund Olesen 983dd43b15 Remove the -disable-cross-class-join option.
Cross-class joins have been normal and fully supported for a while now.
With TableGen generating the getMatchingSuperRegClass() hook, they are
unlikely to cause problems again.

llvm-svn: 155552
2012-04-25 16:17:50 +00:00
Jakob Stoklund Olesen d11cf9677f Cross-class joining is winning.
Remove the heuristic for disabling cross-class joins. The greedy
register allocator can handle the narrow register classes, and when it
splits a live range, it can pick a larger register class.

Benchmarks were unaffected by this change.

<rdar://problem/11302212>

llvm-svn: 155551
2012-04-25 16:17:47 +00:00
Craig Topper 3ec7c2aa84 Add ifdef around getSubtargetFeatureName in tablegen output file so that only targets that want the function get it. This prevents other targets from getting an unused function warning.
llvm-svn: 155538
2012-04-25 06:56:34 +00:00
Craig Topper 5ff6dc34b9 Use vector_shuffles instead of target specific unpack nodes for AVX ZERO_EXTEND/ANY_EXTEND combine. These will be converted to target specific nodes during lowering. This is more consistent with other code.
llvm-svn: 155537
2012-04-25 06:39:39 +00:00
Lang Hames 2fd0c69125 Reverting r155468. Chris and Chandler have convinced me that it's dangerous and
in poor taste.

Talking through some alternate solutions with Chandler.

llvm-svn: 155530
2012-04-25 02:16:54 +00:00
Akira Hatanaka 2020e27d6d Do not use $gp as a dedicated global register if the target ABI is not O32.
llvm-svn: 155522
2012-04-25 01:24:52 +00:00
Dan Gohman 62079b43cc Simplify the known retain count tracking; use a boolean state instead
of a precise count. Also, move RRInfo's Partial field into PtrState,
now that it won't increase the size.

llvm-svn: 155513
2012-04-25 00:50:46 +00:00
Dan Gohman c24c66f21c Build custom predecessor and successor lists for each basic block.
These lists exclude invoke unwind edges and loop backedges which
are being ignored. This makes it easier to ignore them
consistently.

llvm-svn: 155500
2012-04-24 22:53:18 +00:00
Jim Grosbach 5117ef7453 ARM: improved assembler diagnostics for missing CPU features.
When an instruction match is found, but the subtarget features it
requires are not available (missing floating point unit, or thumb vs arm
mode, for example), issue a diagnostic that identifies what the feature
mismatch is.

rdar://11257547

llvm-svn: 155499
2012-04-24 22:40:08 +00:00
Andrew Trick 4d4b5469ab Fix a naughty header include that breaks "installed" builds.
llvm-svn: 155486
2012-04-24 20:36:19 +00:00
Nadav Rotem 450d69a5ee ConstantFoldSelectInstruction swapped the operands of the select.
Fix 12592. Patch by Matt Pharr.

llvm-svn: 155480
2012-04-24 20:18:49 +00:00
Evan Cheng 2d14d8aca1 MachineBasicBlock::SplitCriticalEdge() should follow LLVM IR variant and refuse to break edge to EH landing pad. rdar://11300144
llvm-svn: 155470
2012-04-24 19:06:55 +00:00
Lang Hames 84531c2b5f Add support for llvm.arm.neon.vmull* intrinsics to InstCombine. This fixes
<rdar://problem/11291436>.

llvm-svn: 155468
2012-04-24 18:58:36 +00:00
Chandler Carruth aacb8a5809 Fix a crash on valid (if UB) bitcode that is produced for some global
constants in C++11 mode. I have no idea why it required such particular
circumstances to get here, the code seems clearly to rely upon unchecked
assumptions.

Specifically, when we decide to form an index into a struct type, we may
have gone through (at least one) zero-length array indexing round, which
would have left the offset un-adjusted, and thus not necessarily valid
for use when indexing the struct type.

This is just an canonicalization step, so the correct thing is to refuse
to canonicalize nonsensical GEPs of this form. Implemented, and test
case added.

Fixes PR12642. Pair debugged and coded with Richard Smith. =] I credit
him with most of the debugging, and preventing me from writing the wrong
code.

llvm-svn: 155466
2012-04-24 18:42:47 +00:00
Jim Grosbach 1e75fc1fe1 ARM: Nuke remnant bogus code.
r154362 was supposed to delete this bit, but obviously didn't.

rdar://11305594

llvm-svn: 155465
2012-04-24 18:39:47 +00:00
Nadav Rotem 810734b7f4 AVX: Add additional vbroadcast replacement sequences for integers.
Remove the v2f64 patterns because it does not match any vbroadcast
instruction.

llvm-svn: 155461
2012-04-24 18:09:59 +00:00
Andrew Trick 26bdff9b82 cmake: new file
llvm-svn: 155460
2012-04-24 18:06:49 +00:00
Andrew Trick 9e9a9f1465 misched: DAG builder must special case earlyclobber
llvm-svn: 155459
2012-04-24 18:04:41 +00:00
Andrew Trick c3ea00565f misched: try (not too hard) to place debug values where they belong
llvm-svn: 155458
2012-04-24 18:04:37 +00:00
Andrew Trick cc45a28320 misched: ignore debug values during scheduling
llvm-svn: 155457
2012-04-24 18:04:34 +00:00
Andrew Trick 88639928bd misched: DAG builder support for tracking register pressure within the current scheduling region.
The DAG builder is a convenient place to do it. Hopefully this is more
efficient than a separate traversal over the same region.

llvm-svn: 155456
2012-04-24 17:56:43 +00:00
Andrew Trick 3cd53a1a52 RegisterPressure: A utility for computing register pressure within a
MachineInstr sequence.

This uses the new target interface for tracking register pressure
using pressure sets to model overlapping register classes and
subregisters.

RegisterPressure results can be tracked incrementally or stored at
region boundaries. Global register pressure can be deduced from local
RegisterPressure results if desired.

This is an early, somewhat untested implementation. I'm working on
testing it within the context of a register pressure reducing
MachineScheduler.

llvm-svn: 155454
2012-04-24 17:53:35 +00:00
Nadav Rotem 7b7b99c74a AVX2: The BLENDPW instruction selects between vectors of v16i16 using an i8
immediate. We can't use it here because the shuffle code does not check that
the lower part of the word is identical to the upper part.

llvm-svn: 155440
2012-04-24 11:27:53 +00:00
Richard Barton e9600009e9 Refactor Thumb ITState handling in ARM Disassembler to more efficiently use its vector
llvm-svn: 155439
2012-04-24 11:13:20 +00:00
Nadav Rotem aa3ff8da00 AVX: We lower VECTOR_SHUFFLE and BUILD_VECTOR nodes into vbroadcast instructions
using the pattern (vbroadcast (i32load src)). In some cases, after we generate
this pattern new users are added to the load node, which prevent the selection
of the blend pattern. This commit provides fallback patterns which perform
in-vector broadcast (using in-vector vbroadcast in AVX2 and pshufd on AVX1).

llvm-svn: 155437
2012-04-24 11:07:03 +00:00
Bill Wendling f1b14b719f Look for the 'Is Simulated' module flag. This indicates that the program is compiled to run on a simulator.
llvm-svn: 155435
2012-04-24 11:03:50 +00:00
Craig Topper 0b65c40821 Remove dangling spaces. Fix some other formatting.
llvm-svn: 155429
2012-04-24 06:36:35 +00:00
Craig Topper 6f2a535de2 Simplify code a bit and make it compile better. Remove unused parameters.
llvm-svn: 155428
2012-04-24 06:02:29 +00:00
Evan Cheng 7fd160700f Add a missing cpu subtype.
llvm-svn: 155402
2012-04-23 22:41:39 +00:00
Jim Grosbach 671ad2a572 Tidy up. 80 columns, whitespace, et. al.
llvm-svn: 155399
2012-04-23 22:04:10 +00:00
Nadav Rotem 3f8acfc3c4 Optimize the vector UINT_TO_FP, SINT_TO_FP and FP_TO_SINT operations where the integer type is i8 (commonly used in graphics).
llvm-svn: 155397
2012-04-23 21:53:37 +00:00
Preston Gurd 9a0914753a This patch fixes a problem which arose when using the Post-RA scheduler
on X86 Atom. Some of our tests failed because the tail merging part of
the BranchFolding pass was creating new basic blocks which did not
contain live-in information. When the anti-dependency code in the Post-RA
scheduler ran, it would sometimes rename the register containing
the function return value because the fact that the return value was
live-in to the subsequent block had been lost. To fix this, it is necessary
to run the RegisterScavenging code in the BranchFolding pass.

This patch makes sure that the register scavenging code is invoked
in the X86 subtarget only when post-RA scheduling is being done.
Post RA scheduling in the X86 subtarget is only done for Atom.

This patch adds a new function to the TargetRegisterClass to control
whether or not live-ins should be preserved during branch folding.
This is necessary in order for the anti-dependency optimizations done
during the PostRASchedulerList pass to work properly when doing
Post-RA scheduling for the X86 in general and for the Intel Atom in particular.

The patch adds and invokes the new function trackLivenessAfterRegAlloc()
instead of using the existing requiresRegisterScavenging().
It changes BranchFolding.cpp to call trackLivenessAfterRegAlloc() instead of
requiresRegisterScavenging(). It changes the all the targets that
implemented requiresRegisterScavenging() to also implement
trackLivenessAfterRegAlloc().  

It adds an assertion in the Post RA scheduler to make sure that post RA
liveness information is available when it is needed.

It changes the X86 break-anti-dependencies test to use –mcpu=atom, in order
to avoid running into the added assertion.

Finally, this patch restores the use of anti-dependency checking
(which was turned off temporarily for the 3.1 release) for
Intel Atom in the Post RA scheduler.

Patch by Andy Zhang!

Thanks to Jakob and Anton for their reviews.

llvm-svn: 155395
2012-04-23 21:39:35 +00:00
Jim Grosbach 41e94d79be ARM: VSLI two-operand assmebly aliases are tblgen'erated.
llvm-svn: 155393
2012-04-23 21:22:04 +00:00
Jim Grosbach 3dada484c3 ARM: tblgen'erate VSRA/VRSRA/VSRI assembly two-operand aliases.
llvm-svn: 155392
2012-04-23 21:00:49 +00:00
Jim Grosbach e5012fbad3 ARM: vqdmulh two-operand aliases are tblgen'erated now.
llvm-svn: 155387
2012-04-23 20:37:20 +00:00
Michael J. Spencer 04b795bc1d [Support/Unix] Unconditionally include time.h.
When building LLVM on Linux with libc++ with CMake TIME_WITH_SYS_TIME is
undefined, and HAVE_SYS_TIME_H is defined. This ends up including
sys/time.h but not time.h. Unix/TimeValue.inc requires time.h for asctime_r
and localtime. libstdc++ seems to include time.h anyway, but libc++ does
not.

Fix this by always including time.h

llvm-svn: 155382
2012-04-23 19:00:27 +00:00
Eric Christopher 27deb265f9 Allow forward declarations to take a context. This helps the debugger
find forward declarations in the context that the actual definition
will occur.

rdar://11291658

llvm-svn: 155380
2012-04-23 19:00:11 +00:00
Chandler Carruth af0f8bf595 Temporarily revert r155364 until the upstream review can complete, per
the stated developer policy.

llvm-svn: 155373
2012-04-23 18:28:57 +00:00
Chandler Carruth 3c3bb55a85 Revert r155365, r155366, and r155367. All three of these have regression
test suite failures. The failures occur at each stage, and only get
worse, so I'm reverting all of them.

Please resubmit these patches, one at a time, after verifying that the
regression test suite passes. Never submit a patch without running the
regression test suite.

llvm-svn: 155372
2012-04-23 18:25:57 +00:00
Sirish Pande a3f8ba2439 Hexagon V5 (floating point) support.
llvm-svn: 155367
2012-04-23 17:49:40 +00:00
Sirish Pande 2c7bf00fba Support for Hexagon architectural feature, new value jump.
llvm-svn: 155366
2012-04-23 17:49:28 +00:00
Sirish Pande 6cd2251598 Support for Hexagon VLIW Packetizer.
llvm-svn: 155365
2012-04-23 17:49:20 +00:00
Sirish Pande 995c8dbfd2 Hexagon Packetizer's target independent fix.
llvm-svn: 155364
2012-04-23 17:49:09 +00:00
Jakob Stoklund Olesen 43bcb970e5 Reapply r155136 after fixing PR12599.
Original commit message:

Defer some shl transforms to DAGCombine.

The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155362
2012-04-23 17:39:52 +00:00
Sylvestre Ledru 3099f4bda8 Conflict with st_dev/st_ino identifiers under Debian GNU/Hurd
The problem is that the struct file_status on UNIX systems has two
members called st_dev and st_ino; those are also members of the
struct stat, and they are reserved identifiers which can also be
provided as #define (and this is the case for st_dev on Hurd).
The solution (attached) is to rename them, for example adding a
"fs_" prefix (= file status) to them.

Patch by Pino Toscano

llvm-svn: 155354
2012-04-23 16:37:23 +00:00
Alexander Potapenko 056e27ea49 Fix issue 67 by checking that the interface functions weren't redefined in the compiled source file.
llvm-svn: 155346
2012-04-23 10:47:31 +00:00
Kostya Serebryany 5a4b7a232c [tsan] use llvm/ADT/Statistic.h for tsan stats
llvm-svn: 155341
2012-04-23 08:44:59 +00:00
Craig Topper 153bb34a3c Use MVT instead of EVT through all of LowerVECTOR_SHUFFLEtoBlend and not just the switch. Saves a little bit of binary size.
llvm-svn: 155339
2012-04-23 07:36:33 +00:00
Craig Topper 0a2c809d09 Make getZeroVector and getOnesVector more alike as far as how they detect 128-bit versus 256-bit vectors. Be explicit about both sizes and use llvm_unreachable. Similar changes to getLegalSplat.
llvm-svn: 155337
2012-04-23 07:24:41 +00:00
Craig Topper 2bbe8bcf4e Tidy up by removing some 'else' after 'return'
llvm-svn: 155336
2012-04-23 06:57:04 +00:00
Craig Topper 5c51eeecfc Tidy up spacing in LowerVECTOR_SHUFFLEtoBlend. Remove code that checks if shuffle operand has a different type than the the shuffle result since it can never happen.
llvm-svn: 155333
2012-04-23 06:38:28 +00:00
Craig Topper a52f0d09b6 Add a couple llvm_unreachables.
llvm-svn: 155332
2012-04-23 03:42:40 +00:00
Craig Topper 984dc015ae Remove some tab characers.
llvm-svn: 155331
2012-04-23 03:28:34 +00:00
Craig Topper ea428fd79c Remove some 'else' after 'return'. No functional change.
llvm-svn: 155330
2012-04-23 03:26:18 +00:00
Chris Lattner 5e14666149 Don't die with an assertion if the Result bitwidth is already correct. This
fixes an assert reading "1239123123123123" when the result is already 64-bit.

llvm-svn: 155329
2012-04-23 00:27:54 +00:00
Bill Wendling e32c23a5e0 Cleanup whitespace.
llvm-svn: 155328
2012-04-23 00:23:33 +00:00
Bill Wendling 3d0ec2bedb Limit the number of times we recurse through this algorithm. All of the
intructions are processed. So there's no need to look at them if they're used as
operands of other instructions.

llvm-svn: 155327
2012-04-23 00:22:55 +00:00
Craig Topper bf7d5666f0 Make Extract128BitVector and Insert128BitVector take an unsigned instead of an ConstantNode SDValue. getConstant was almost always called just before only to have the functions take it apart and build a new ConstantSDNode.
llvm-svn: 155325
2012-04-22 20:55:18 +00:00
Craig Topper 2d474d6d92 Convert getNode(UNDEF) to getUNDEF.
llvm-svn: 155321
2012-04-22 19:29:34 +00:00
Craig Topper 860ed0d20a Make calls to getVectorShuffle more consistent. Use shuffle VT for calls to getUNDEF instead of requerying. Use &Mask[0] instead of Mask.data().
llvm-svn: 155320
2012-04-22 19:17:57 +00:00
Craig Topper 43397c0900 Tidy up. 80 columns and argument alignment.
llvm-svn: 155319
2012-04-22 18:51:37 +00:00
Craig Topper ad56a744f1 Simplify code by converting multiple places that were manually concatenating 128-bit vectors to use either CONCAT_VECTORS or a helper function. CONCAT_VECTORS will itself be lowered to the same pattern as before. The helper function is needed for concats of BUILD_VECTORs since getNode(CONCAT_VECTORS) will just return a large BUILD_VECTOR and we may be trying to lower large BUILD_VECTORS when this occurs.
llvm-svn: 155318
2012-04-22 18:15:59 +00:00
Benjamin Kramer 8877d68db7 ARM: Initialize the HasRAS bit.
Found by valgrind.

llvm-svn: 155313
2012-04-22 11:52:41 +00:00
Elena Demikhovsky 8d7e56c409 ZERO_EXTEND/SIGN_EXTEND/TRUNCATE optimization for AVX2
llvm-svn: 155309
2012-04-22 09:39:03 +00:00
Bill Wendling f9774c3253 Remove some potential warnings about variables used uninitialized.
llvm-svn: 155307
2012-04-22 07:23:04 +00:00
Bill Wendling 32854e2727 Add a flag to the struct type finder to collect only those types which have
names. This saves collecting types we normally don't care about.

llvm-svn: 155300
2012-04-21 23:59:16 +00:00
Chris Lattner 0a1bafed7b No need for "else if" after a return. Autosense "0o123" as octal in
StringRef::getAsInteger

llvm-svn: 155298
2012-04-21 22:03:05 +00:00
Nadav Rotem 31caa27bf5 Teach getVectorTypeBreakdown about promotion of vectors in addition to widening of vectors.
llvm-svn: 155296
2012-04-21 20:08:32 +00:00
Craig Topper 6eadae8e60 Make some fixed arrays const. Use array_lengthof in a couple places instead of a hardcoded number.
llvm-svn: 155294
2012-04-21 18:58:38 +00:00
Craig Topper 2568bf3089 Tidy up. 80 columns and some other spacing issues.
llvm-svn: 155291
2012-04-21 18:13:35 +00:00
NAKAMURA Takumi e30303fa86 llvm/lib/Target: [PR12611] Add "llvm/Support/raw_ostream.h" for Debug build on MSVC.
Thanks to Andy Gibbs, to report the issue.

llvm-svn: 155287
2012-04-21 15:31:45 +00:00
NAKAMURA Takumi 54eed760da HexagonISelLowering.cpp: Reorder #includes.
llvm-svn: 155286
2012-04-21 15:31:36 +00:00
Nuno Lopes e568efbc4d move Signals to .rodata
llvm-svn: 155283
2012-04-21 14:45:37 +00:00
NAKAMURA Takumi df3d5ea990 HexagonInstPrinter.cpp: Suppress -Wunused-variable warnings with -Asserts.
llvm-svn: 155281
2012-04-21 11:24:55 +00:00
Benjamin Kramer 0aa0d3d633 YAMLParser: silence warning about tautological comparison on unsigned-char platforms.
No functionality change.

llvm-svn: 155280
2012-04-21 10:51:42 +00:00
Jim Grosbach c931d451cd ARM: tblgen'erate more NEON two-operand aliases.
VMUL and VEXT.

llvm-svn: 155258
2012-04-20 23:46:33 +00:00
Jakob Stoklund Olesen d114da6004 Fix PR12599.
The X86 target is editing the selection DAG while isel is selecting
nodes following a topological ordering. When the DAG hacking triggers
CSE, nodes can be deleted and bad things happen.

llvm-svn: 155257
2012-04-20 23:36:09 +00:00
Jim Grosbach b4e849b924 ARM: tblgen'erate more NEON two-operand aliases.
llvm-svn: 155254
2012-04-20 23:30:14 +00:00
Bill Wendling 1bf41faca2 Revert r155241, which is causing some breakage.
llvm-svn: 155253
2012-04-20 23:11:38 +00:00
Jakob Stoklund Olesen e3a891cf08 Make ISelPosition a local variable.
Now that multiple DAGUpdateListeners can be active at the same time,
ISelPosition can become a local variable in DoInstructionSelection.

We simply register an ISelUpdater with CurDAG while ISelPosition exists.

llvm-svn: 155249
2012-04-20 22:08:50 +00:00
Jakob Stoklund Olesen beb9469d5c Register DAGUpdateListeners with SelectionDAG.
Instead of passing listener pointers to RAUW, let SelectionDAG itself
keep a linked list of interested listeners.

This makes it possible to have multiple listeners active at once, like
RAUWUpdateListener was already doing. It also makes it possible to
register listeners up the call stack without controlling all RAUW calls
below.

DAGUpdateListener uses an RAII pattern to add itself to the SelectionDAG
list of active listeners.

llvm-svn: 155248
2012-04-20 22:08:46 +00:00
Bill Wendling c5fae47a63 If we discover all of the named structs in a module, then don't bother to
process any more Values.

llvm-svn: 155241
2012-04-20 21:56:24 +00:00
Jakob Stoklund Olesen 7111a630d5 Print <def,read-undef> to avoid confusion.
The <undef> flag on a def operand only applies to partial register
redefinitions. Only print the flag when relevant, and print it as
<def,read-undef> to make it clearer what it means.

llvm-svn: 155239
2012-04-20 21:45:33 +00:00
Andrew Trick 51ee936101 New and improved comment.
llvm-svn: 155229
2012-04-20 20:24:33 +00:00
Andrew Trick 1eb4a0da55 SparseSet: Add support for key-derived indexes and arbitrary key types.
This nicely handles the most common case of virtual register sets, but
also handles anticipated cases where we will map pointers to IDs.

The goal is not to develop a completely generic SparseSet
template. Instead we want to handle the expected uses within llvm
without any template antics in the client code. I'm adding a bit of
template nastiness here, and some assumption about expected usage in
order to make the client code very clean.

The expected common uses cases I'm designing for:
- integer keys that need to be reindexed, and may map to additional
  data
- densely numbered objects where we want pointer keys because no
  number->object map exists.

llvm-svn: 155227
2012-04-20 20:05:28 +00:00
Andrew Trick 7405c6d57a misched: initialize BB
llvm-svn: 155226
2012-04-20 20:05:21 +00:00
Jim Grosbach 2937df45a8 ARM: Update NEON assembly two-operand aliases.
Use the new TwoOperandAliasConstraint to handle lots of the two-operand aliases
for NEON instructions. There's still more to go, but this is a good chunk of
them.

llvm-svn: 155210
2012-04-20 18:12:54 +00:00
Gabor Greif c8a9abe9df effectively back out my last change (r155190)
llvm-svn: 155195
2012-04-20 11:41:38 +00:00
Gabor Greif 9eccbe9c82 fix obviously bogus (IMO) operand index of the load in asserts
(load only has one operand) and smuggle in some whitespace changes too

NB: I am obviously testing the water here, and believe that the unguarded
    cast is still wrong, but why is the getZExtValue of the load's operand
    tested against zero here? Any review is appreciated.
llvm-svn: 155190
2012-04-20 08:58:49 +00:00
Craig Topper c7242e054d Convert more uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155188
2012-04-20 07:30:17 +00:00
Craig Topper abadc660e0 Convert some uses of XXXRegisterClass to &XXXRegClass. No functional change since they are equivalent.
llvm-svn: 155186
2012-04-20 06:31:50 +00:00
Jakob Stoklund Olesen 205ee3b389 Revert r155136 "Defer some shl transforms to DAGCombine."
While the patch was perfect and defect free, it exposed a really nasty
bug in X86 SelectionDAG that caused an llc crash when compiling lencod.

I'll put the patch back in after fixing the SelectionDAG problem.

llvm-svn: 155181
2012-04-20 00:38:45 +00:00
Jim Grosbach 9cc324d31a ARM some VFP tblgen'erated two-operand aliases.
llvm-svn: 155178
2012-04-20 00:15:00 +00:00
Jim Grosbach 6b46134862 ARM let TableGen handle a few two-operand aliases.
No need for these explicit aliases anymore. Nuke 'em.

llvm-svn: 155173
2012-04-19 23:59:26 +00:00
Bill Wendling 9f97595201 Put this expensive check below the less expensive ones.
llvm-svn: 155166
2012-04-19 23:31:07 +00:00
Dan Gohman 26aa827461 Avoid a bug in the path count computation, preventing an infinite
loop repeatedlt making the same change. This is for rdar://11256239.

llvm-svn: 155160
2012-04-19 21:50:46 +00:00
Jakob Stoklund Olesen 6b6c81e6b2 Defer some shl transforms to DAGCombine.
The shl instruction is used to represent multiplication by a constant
power of two as well as bitwise left shifts. Some InstCombine
transformations would turn an shl instruction into a bit mask operation,
making it difficult for later analysis passes to recognize the
constsnt multiplication.

Disable those shl transformations, deferring them to DAGCombine time.
An 'shl X, C' instruction is now treated mostly the same was as 'mul X, C'.

These transformations are deferred:

  (X >>? C) << C   --> X & (-1 << C)  (When X >> C has multiple uses)
  (X >>? C1) << C2 --> X << (C2-C1) & (-1 << C2)   (When C2 > C1)
  (X >>? C1) << C2 --> X >>? (C1-C2) & (-1 << C2)  (When C1 > C2)

The corresponding exact transformations are preserved, just like
div-exact + mul:

  (X >>?,exact C) << C   --> X
  (X >>?,exact C1) << C2 --> X << (C2-C1)
  (X >>?,exact C1) << C2 --> X >>?,exact (C1-C2)

The disabled transformations could also prevent the instruction selector
from recognizing rotate patterns in hash functions and cryptographic
primitives. I have a test case for that, but it is too fragile.

llvm-svn: 155136
2012-04-19 16:46:26 +00:00
Gabor Greif 180c4445cf zap tabs
llvm-svn: 155128
2012-04-19 15:16:31 +00:00
Andrew Trick a11810ad60 Allow targets to select the default scheduler by name.
llvm-svn: 155090
2012-04-19 01:34:10 +00:00
Kevin Enderby ec4bd31206 Fixed the llvm-mv X86 disassembler so the 'C' API gets jumps properly
symbolicated.  These have and operand type of TYPE_RELv which was not handled
as isBranch in translateImmediate() in X86Disassembler.cpp.  rdar://11268426 

llvm-svn: 155074
2012-04-18 23:12:11 +00:00
Dan Gohman 22fbe8d709 Don't crash on code where the user put __attribute__((constructor)) on
a function with arguments. This fixes rdar://11265785.

llvm-svn: 155073
2012-04-18 22:24:33 +00:00
Chandler Carruth b415bf98f0 This reverts a long string of commits to the Hexagon backend. These
commits have had several major issues pointed out in review, and those
issues are not being addressed in a timely fashion. Furthermore, this
was all committed leading up to the v3.1 branch, and we don't need piles
of code with outstanding issues in the branch.

It is possible that not all of these commits were necessary to revert to
get us back to a green state, but I'm going to let the Hexagon
maintainer sort that out. They can recommit, in order, after addressing
the feedback.

Reverted commits, with some notes:

Primary commit r154616: HexagonPacketizer
  - There are lots of review comments here. This is the primary reason
    for reverting. In particular, it introduced large amount of warnings
    due to a bad construct in tablegen.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154622: CMake fixes
    - r154660: Fix numerous build warnings in release builds.
  - Please don't resubmit this until the three commits above are
    included, and the issues in review addressed.

Primary commit r154695: Pass to replace transfer/copy ...
  - Reverted to minimize merge conflicts. I'm not aware of specific
    issues with this patch.

Primary commit r154703: New Value Jump.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154703: Remove iostream usage
    - r154758: Fix CMake builds
    - r154759: Fix build warnings in release builds
  - Please incorporate these fixes and and review feedback before
    resubmitting.

Primary commit r154829: Hexagon V5 (floating point) support.
  - Primarily reverted due to merge conflicts.
  - Follow-up commits that should be folded back into this when
    reposting:
    - r154841: Remove unused variable (fixing build warnings)

There are also accompanying Clang commits that will be reverted for
consistency.

llvm-svn: 155047
2012-04-18 21:31:19 +00:00
Pete Cooper 8998657c64 LiveIntervalUpdate validators weren't recorded after the calls to std::for_each. Turns out std::for_each doesn't update the variable passed in for the functor but instead copy constructs a new one.
llvm-svn: 155041
2012-04-18 20:29:17 +00:00
Benjamin Kramer bb73d19794 SourceMgr: Colorize diagnostics.
Same color scheme as clang uses. The colors are only enabled if the output is a tty.

llvm-svn: 155035
2012-04-18 19:04:15 +00:00
Akira Hatanaka fc1d00bbd6 Mark instruction classes ArithLogicR, ArithLogicI and LoadUpper as isRematerializable.
llvm-svn: 155031
2012-04-18 18:52:10 +00:00
Akira Hatanaka 4167bb9346 Delete blank line.
llvm-svn: 155030
2012-04-18 18:47:17 +00:00
Jim Grosbach f30541e920 Fix copy/paste-o.
llvm-svn: 155016
2012-04-18 18:09:53 +00:00
Jim Grosbach cc6cbf05a3 TableGen add warning diagnostic helper functions.
llvm-svn: 155012
2012-04-18 17:46:31 +00:00
Silviu Baranga ca45af9a75 Added support for disassembling unpredictable swp/swpb ARM instructions.
llvm-svn: 155004
2012-04-18 14:18:57 +00:00
Silviu Baranga d5c6a63a50 Fix the bahavior of the disassembler when decoding unpredictable mrs instructions on ARM. Now the diasassembler emmits warnings instead of errors.
llvm-svn: 155002
2012-04-18 14:09:07 +00:00
Silviu Baranga 41f1fcd80e Added support for unpredictable mcrr/mcrr2/mrrc/mrrc2 ARM instruction in the disassembler. Since the upredicability conditions are complex, C++ code was added to handle them.
llvm-svn: 155001
2012-04-18 13:12:50 +00:00
Silviu Baranga a2944116dc Fixed decoding for the ARM cdp2 instruction. The restriction on the coprocessor number was removed for this instruction.
llvm-svn: 155000
2012-04-18 13:02:55 +00:00
Silviu Baranga 9da1918c84 Add suport for unpredicatble cases of the cmp, tst, teq and cmnz ARM instructions in the disassembler.
llvm-svn: 154999
2012-04-18 12:48:43 +00:00
Benjamin Kramer 94988cb34c SmallPtrSet: Reuse DenseMapInfo's pointer hash function instead of inventing a bad one ourselves.
DenseMap's hash function uses slightly more entropy and reduces hash collisions
significantly.  I also experimented with Hashing.h, but it didn't gave a lot of
improvement while being much more expensive to compute.

llvm-svn: 154996
2012-04-18 10:37:32 +00:00
Bill Wendling 4d4d025751 Use a heavy hammer to fix PR12573.
If the loop contains invoke instructions, whose unwind edge escapes the loop,
then don't try to unswitch the loop. Doing so may cause the unwind edge to be
split, which not only is non-trivial but doesn't preserve loop simplify
information.

Fixes PR12573

llvm-svn: 154987
2012-04-18 06:00:09 +00:00
Craig Topper d3c9e404ba Remove AVX vpermil intrinsics. I removed their uses from clang headers and builtins a while back.
llvm-svn: 154985
2012-04-18 05:24:00 +00:00
Andrew Trick 19f80c1e7e loop-reduce: Add an early bailout to catch extremely large loops.
This introduces a threshold of 200 IV Users, which is very
conservative but should be sufficient to avoid serious compile time
sink or stack overflow. The llvm test-suite with LTO never exceeds 190
users per loop.

The bug doesn't relate to a specific type of loop. Checking in an
arbitrary giant loop as a unit test would be silly.

Fixes rdar://11262507.

llvm-svn: 154983
2012-04-18 04:00:10 +00:00
Seth Cantrell 3d013bccfa fix error check in assert
llvm-svn: 154971
2012-04-18 00:40:23 +00:00
David Blaikie 1b5d8d46fb C++ has newlines at the end of files (including include files).
llvm-svn: 154962
2012-04-17 23:46:51 +00:00
Joe Groff a81bcbb9bb fix pr12559: mark unavailable win32 math libcalls
also fix SimplifyLibCalls to use TLI rather than compile-time conditionals to enable optimizations on floor, ceil, round, rint, and nearbyint

llvm-svn: 154960
2012-04-17 23:05:54 +00:00
Joel Jones 828531f798 Fixes a problem in instruction selection with testing whether or not the
transformation:

(X op C1) ^ C2 --> (X op C1) & ~C2 iff (C1&C2) == C2

should be done.  

This change has been tested:
 Using a debug+asserts build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt
   using this clang to build a release version of clang
 Using the release+asserts clang-with-clang build:
   on the specific test case that brought this bug to light
   make check-all
   lnt nt

Checking in because Evan wants it checked in.  Test case forthcoming after
scrubbing.

llvm-svn: 154955
2012-04-17 22:23:10 +00:00
Chad Rosier 41675546eb Typo.
llvm-svn: 154953
2012-04-17 21:48:36 +00:00
Danil Malyshev bf7d3289fa Fix incorrect call of resolveRelocation() for ARM ELF stub relocations.
llvm-svn: 154948
2012-04-17 20:10:16 +00:00
Seth Cantrell 75dbcb8bdd platform support for counting column widths and checking isprint
llvm-svn: 154944
2012-04-17 20:03:03 +00:00
Akira Hatanaka 236e14017f Delete latter half of CMakeLists.txt.
llvm-svn: 154936
2012-04-17 18:18:09 +00:00
Akira Hatanaka 71928e681b Add disassembler to MIPS.
Patch by Vladimir Medic. 

llvm-svn: 154935
2012-04-17 18:03:21 +00:00
Manuel Klimek 1f8918f69d Goodbye, JSONParser...
llvm-svn: 154930
2012-04-17 17:21:17 +00:00
Jay Foad 08a0598cd4 Remove unused CCIfSubtarget.
llvm-svn: 154921
2012-04-17 11:29:05 +00:00
James Molloy a9bcf20d22 Fix bad EXTRACT_SUBREG in instruction selection for extending-loads on NEON.
llvm-svn: 154915
2012-04-17 08:18:00 +00:00
Benjamin Kramer e364d195e9 Revert "SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW."
This isn't right either, reverting for now.

llvm-svn: 154910
2012-04-17 06:33:57 +00:00
Craig Topper 354103d8ca Don't decode vperm2i128 or vperm2f128 into a shuffle if bit 3 or 7 of the immediate is set.
llvm-svn: 154907
2012-04-17 05:54:54 +00:00
Lang Hames aef9178301 SlotIndexes used to store the index list in a crufty custom linked-list. I can't
for the life of me remember why I wrote it this way, but I can't see any good
reason for it now. This patch replaces the custom linked list with an ilist.

This change should preserve the existing numberings exactly, so no generated code
should change (if it does, file a bug!).

llvm-svn: 154904
2012-04-17 04:15:51 +00:00
Kevin Enderby 29ae538647 Fix ARM disassembly of VLD2 (single 2-element structure to all lanes)
instructions with writebacks. And add test a case for all opcodes handed by
DecodeVLD2DupInstruction() in ARMDisassembler.cpp .

llvm-svn: 154884
2012-04-17 00:49:27 +00:00
Eric Christopher 7df0240e52 Typo.
llvm-svn: 154879
2012-04-16 23:54:31 +00:00
Eric Christopher a8caa739de Make comment here more clear.
llvm-svn: 154878
2012-04-16 23:54:23 +00:00
Jim Grosbach 2bf5f73977 ARM two-operand forms for vhadd and vhsub instructions.
rdar://11252521

llvm-svn: 154875
2012-04-16 23:00:25 +00:00
Preston Gurd 5333e2e5ce Temporarily turn off anti-dependency checking
during Post RA scheduling in X86,
until the X86 target is changed to properly set up
post RA liveness.

llvm-svn: 154874
2012-04-16 22:52:28 +00:00
Preston Gurd e185ecb28b Add files which were not included by commit 154868.
llvm-svn: 154872
2012-04-16 22:26:48 +00:00
Preston Gurd cc31af9328 Implement GDB integration for source level debugging of code JITed using
the MCJIT execution engine.

The GDB JIT debugging integration support works by registering a loaded
object image with a pre-defined function that GDB will monitor if GDB
is attached. GDB integration support is implemented for ELF only at this
time. This integration requires GDB version 7.0 or newer.

Patch by Andy Kaylor!

 

llvm-svn: 154868
2012-04-16 22:12:58 +00:00
Chandler Carruth 1f5580b6f3 Fix updateTerminator to be resiliant to degenerate terminators where
both fallthrough and a conditional branch target the same successor.
Gracefully delete the conditional branch and introduce any unconditional
branch needed to reach the actual successor. This fixes memory
corruption in 2009-06-15-RegScavengerAssert.ll and possibly other tests.

Also, while I'm here fix a latent bug I spotted by inspection. I never
applied the same fundamental fix to this fallthrough successor finding
logic that I did to the logic used when there are no conditional
branches. As a consequence it would have selected landing pads had they
be aligned in just the right way here. I don't have a test case as
I spotted this by inspection, and the previous time I found this
required have of TableGen's source code to produce it. =/ I hate backend
bugs. ;]

Thanks to Jim Grosbach for helping me reason through this and reviewing
the fix.

llvm-svn: 154867
2012-04-16 22:03:00 +00:00
Jim Grosbach 1e1d68f1b9 MC assembly parser handling for trailing comma in macro instantiation.
A trailing comma means no argument at all (i.e., as if the comma were not
present), not an empty argument to the invokee.

rdar://11252521

llvm-svn: 154863
2012-04-16 21:18:49 +00:00
Jim Grosbach 003607f474 ARM handle :lower16: and :upper16: after a '#' prefix.
rdar://11252521

llvm-svn: 154862
2012-04-16 21:18:46 +00:00
Duncan Sands 9af6298293 Remove support for the special 'fast' value for fpmath accuracy for the moment.
llvm-svn: 154850
2012-04-16 19:39:33 +00:00
Richard Smith 12da79b859 Fix incorrect atomics codegen introduced in r154705, and extend test to catch it.
llvm-svn: 154845
2012-04-16 18:43:53 +00:00
David Blaikie e67cdc07a5 Remove unused variable
llvm-svn: 154841
2012-04-16 18:10:13 +00:00
Jim Grosbach 6068d0014a ARM assembly two-operand forms for VRSHL.
rdar://11252521

llvm-svn: 154840
2012-04-16 18:03:16 +00:00
Akira Hatanaka 3e9d81f47c Do not add offset in applyFixup. This has already been accounted for in Value.
llvm-svn: 154838
2012-04-16 18:00:19 +00:00
Jim Grosbach cd1c000a9f ARM two-operand aliases for VRHADD instructions.
rdar://11252521

llvm-svn: 154832
2012-04-16 17:14:11 +00:00
Sirish Pande 96e8ee17e0 Hexagon V5 (Floating Point) Support.
llvm-svn: 154829
2012-04-16 17:05:06 +00:00
Duncan Sands 05f4df8d72 Make it possible to indicate relaxed floating point requirements at the IR level
through the use of 'fpmath' metadata.  Currently this only provides a 'fpaccuracy'
value, which may be a number in ULPs or the keyword 'fast', however the intent is
that this will be extended with additional information about NaN's, infinities
etc later.  No optimizations have been hooked up to this so far.

llvm-svn: 154822
2012-04-16 16:28:59 +00:00
Chandler Carruth 4190b507c5 Flip the new block-placement pass to be on by default.
This is mostly to test the waters. I'd like to get results from FNT
build bots and other bots running on non-x86 platforms.

This feature has been pretty heavily tested over the last few months by
me, and it fixes several of the execution time regressions caused by the
inlining work by preventing inlining decisions from radically impacting
block layout.

I've seen very large improvements in yacr2 and ackermann benchmarks,
along with the expected noise across all of the benchmark suite whenever
code layout changes. I've analyzed all of the regressions and fixed
them, or found them to be impossible to fix. See my email to llvmdev for
more details.

I'd like for this to be in 3.1 as it complements the inliner changes,
but if any failures are showing up or anyone has concerns, it is just
a flag flip and so can be easily turned off.

I'm switching it on tonight to try and get at least one run through
various folks' performance suites in case SPEC or something else has
serious issues with it. I'll watch bots and revert if anything shows up.

llvm-svn: 154816
2012-04-16 13:49:17 +00:00
Chandler Carruth 8c0b41d656 Add a somewhat hacky heuristic to do something different from whole-loop
rotation. When there is a loop backedge which is an unconditional
branch, we will end up with a branch somewhere no matter what. Try
placing this backedge in a fallthrough position above the loop header as
that will definitely remove at least one branch from the loop iteration,
where whole loop rotation may not.

I haven't seen any benchmarks where this is important but loop-blocks.ll
tests for it, and so this will be covered when I flip the default.

llvm-svn: 154812
2012-04-16 13:33:36 +00:00
Hal Finkel 52ba49f399 Fix style violation in BBVectorize (pointed out by Bill Wendling)
llvm-svn: 154810
2012-04-16 12:39:17 +00:00
Chandler Carruth 8c74c7b1c6 Tweak the loop rotation logic to check whether the loop is naturally
laid out in a form with a fallthrough into the header and a fallthrough
out of the bottom. In that case, leave the loop alone because any
rotation will introduce unnecessary branches. If either side looks like
it will require an explicit branch, then the rotation won't add any, do
it to ensure the branch occurs outside of the loop (if possible) and
maximize the benefit of the fallthrough in the bottom.

llvm-svn: 154806
2012-04-16 09:31:23 +00:00
Benjamin Kramer 13d16f3bf3 Reapply 'Add reverseColor to raw_ostream'.
To be used in printing unprintable source in clang diagnostics.
Patch by Seth Cantrell, with a minor fix for mingw by me.

llvm-svn: 154805
2012-04-16 08:56:50 +00:00
Argyrios Kyrtzidis 64104f16d4 Revert r154800 which breaks windows builders.
llvm-svn: 154802
2012-04-16 07:59:39 +00:00
Craig Topper 4badeb3f0d Replace vpermd/vpermps intrinic patterns with custom lowering to target specific nodes.
llvm-svn: 154801
2012-04-16 07:13:00 +00:00
Argyrios Kyrtzidis d17db2e0ee Add reverseColor to raw_ostream.
To be used in printing unprintable source in clang diagnostics.
Patch by Seth Cantrell!

llvm-svn: 154800
2012-04-16 07:07:38 +00:00
Craig Topper 26d7a94981 Change type profile for vpermv back to using operand type for the mask argument to match intrinsic behavior. Add a bitcast to the lowering code to convert mask from v8i32 to v8f32 for vpermps.
llvm-svn: 154798
2012-04-16 06:43:40 +00:00
Craig Topper c0075aa7ff Flip the arguments when converting vpermd/vpermps intrinsics into instructions. The intrinsic has the mask as the last operand, but the instruction has it as the second.
llvm-svn: 154797
2012-04-16 06:26:15 +00:00
Bill Wendling 82b90a3804 Add a Fixme.
llvm-svn: 154793
2012-04-16 04:23:52 +00:00
Hal Finkel 8ee309d9b7 Simplify checking for pointer types in BBVectorize (this change was suggested by Duncan).
llvm-svn: 154787
2012-04-16 03:49:42 +00:00
Hal Finkel e0cf6397fd Remove dead SD nodes after the combining pass. Fixes PR12201.
llvm-svn: 154786
2012-04-16 03:33:22 +00:00
Chandler Carruth ccc7e42b1f Rewrite how machine block placement handles loop rotation.
This is a complex change that resulted from a great deal of
experimentation with several different benchmarks. The one which proved
the most useful is included as a test case, but I don't know that it
captures all of the relevant changes, as I didn't have specific
regression tests for each, they were more the result of reasoning about
what the old algorithm would possibly do wrong. I'm also failing at the
moment to craft more targeted regression tests for these changes, if
anyone has ideas, it would be welcome.

The first big thing broken with the old algorithm is the idea that we
can take a basic block which has a loop-exiting successor and a looping
successor and use the looping successor as the layout top in order to
get that particular block to be the bottom of the loop after layout.
This happens to work in many cases, but not in all.

The second big thing broken was that we didn't try to select the exit
which fell into the nearest enclosing loop (to which we exit at all). As
a consequence, even if the rotation worked perfectly, it would result in
one of two bad layouts. Either the bottom of the loop would get
fallthrough, skipping across a nearer enclosing loop and thereby making
it discontiguous, or it would be forced to take an explicit jump over
the nearest enclosing loop to earch its successor. The point of the
rotation is to get fallthrough, so we need it to fallthrough to the
nearest loop it can.

The fix to the first issue is to actually layout the loop from the loop
header, and then rotate the loop such that the correct exiting edge can
be a fallthrough edge. This is actually much easier than I anticipated
because we can handle all the hard parts of finding a viable rotation
before we do the layout. We just store that, and then rotate after
layout is finished. No inner loops get split across the post-rotation
backedge because we check for them when selecting the rotation.

That fix exposed a latent problem with our exitting block selection --
we should allow the backedge to point into the middle of some inner-loop
chain as there is no real penalty to it, the whole point is that it
*won't* be a fallthrough edge. This may have blocked the rotation at all
in some cases, I have no idea and no test case as I've never seen it in
practice, it was just noticed by inspection.

Finally, all of these fixes, and studying the loops they produce,
highlighted another problem: in rotating loops like this, we sometimes
fail to align the destination of these backwards jumping edges. Fix this
by actually walking the backwards edges rather than relying on loopinfo.

This fixes regressions on heapsort if block placement is enabled as well
as lots of other cases where the previous logic would introduce an
abundance of unnecessary branches into the execution.

llvm-svn: 154783
2012-04-16 01:12:56 +00:00
Craig Topper b86fa404d3 Merge vpermps/vpermd and vpermpd/vpermq SD nodes.
llvm-svn: 154782
2012-04-16 00:41:45 +00:00
Craig Topper b04fe34030 Fix SDTypeProfile for vpermps. The mask operand should be v8i32.
llvm-svn: 154781
2012-04-16 00:12:20 +00:00
Craig Topper 1f8c9eb925 Spacing fixes and 80 column fixes. Use 0 instead of 0x80 for undef indices in vpermps/vpermd. Hardware only looks at lower 3-bits.
llvm-svn: 154780
2012-04-15 23:48:57 +00:00
Craig Topper bfc9a5f7d3 Remove AVX2 vpermq and vpermpd intrinsics. These can now be handled with normal shuffle vectors.
llvm-svn: 154778
2012-04-15 22:43:31 +00:00
Nadav Rotem 42bcd04ee3 Fix PR12529. The Vxx family of instructions are only supported by AVX.
Use non-vex instructions for SSE4.

llvm-svn: 154770
2012-04-15 19:36:44 +00:00
Benjamin Kramer 673824b4a1 Wire up support for diagnostic ranges in the ARMAsmParser.
As an example, attach range info to the "invalid instruction" message:

$ clang -arch arm -c asm.c
asm.c:2:11: error: invalid instruction
  __asm__("foo r0");
          ^
<inline asm>:1:2: note: instantiated into assembly here
        foo r0
        ^~~

llvm-svn: 154765
2012-04-15 17:04:27 +00:00
Nadav Rotem 02ef0c3524 When emulating vselect using OR/AND/XOR make sure to bitcast the result back to the original type.
llvm-svn: 154764
2012-04-15 15:08:09 +00:00
Elena Demikhovsky 779a72b49e Added VPERM optimization for AVX2 shuffles
llvm-svn: 154761
2012-04-15 11:18:59 +00:00
NAKAMURA Takumi 67de410135 HexagonCopyToCombine.cpp: Silence two warnings, -Wunused-variable, with -Asserts.
llvm-svn: 154759
2012-04-15 05:33:43 +00:00
NAKAMURA Takumi 355eebf4cf Target/Hexagon: Tweak to fix msvc build.
llvm-svn: 154758
2012-04-15 05:09:09 +00:00
Duncan Sands 34bd91a49f Rename "fpaccuracy" metadata to the more generic "fpmath". That's because I'm
thinking of generalizing it to be able to specify other freedoms beyond accuracy
(such as that NaN's don't have to be respected).  I'd like the 3.1 release (the
first one with this metadata) to have the more generic name already rather than
having to auto-upgrade it in 3.2.

llvm-svn: 154744
2012-04-14 12:36:06 +00:00
Hal Finkel 83c9796033 Fix an error in BBVectorize important for vectorizing pointer types.
When vectorizing pointer types it is important to realize that potential
pairs cannot be connected via the address pointer argument of a load or store.
This is because even after vectorization, the address is still a scalar because
the address of the higher half of the pair is implicit from the address of the
lower half (it need not be, and should not be, explicitly computed).

llvm-svn: 154735
2012-04-14 07:32:50 +00:00
Hal Finkel f589519a67 Enhance BBVectorize to more-properly handle pointer values and vectorize GEPs.
llvm-svn: 154734
2012-04-14 07:32:43 +00:00