Commit Graph

16835 Commits

Author SHA1 Message Date
Jack Carter 5c1a01a625 mips32 long long register inline asm constraint support.
The negative test inlineasm-cnstrnt-bad-r-1.ll is NOT supposed to fail, so it was removed.

llvm-svn: 159610
2012-07-02 22:39:45 +00:00
Chandler Carruth a7f1f35eb8 Extend the workaround from r159593 to cover a few explicit alias
targets.

llvm-svn: 159597
2012-07-02 21:45:22 +00:00
Chandler Carruth aec961811b Revert r159588, and apply a more principled fix. Place the fix for this
in the abstraction for lit test suites so that the various other layers
of abstraction pick up the same behavioral fix, and so that we still get
a complete list of dependencies for the 'check-all' target.

This should fix the follow-on issues of the same nature with various
other build targets, including Clang targets. Sorry for the churn, and
again thanks to Matt for testing and breaking this more thoroughly.

llvm-svn: 159593
2012-07-02 21:31:03 +00:00
Chandler Carruth 6e80d5934d Work around a really frustrating apparent CMake bug.
No functionality changed here, except that the CMake installed by
default on Ubuntu Lucid should actually work with the makefile
generators now.

Thanks to Matt for the report and head-desking required to figure out
why it was failing.

llvm-svn: 159588
2012-07-02 21:14:06 +00:00
Jack Carter 06de0fb083 Pass the correct ELFOSABI enumeration to the MipsELFObjectWriter constructor
Contributor: Sasa Stankovic
llvm-svn: 159574
2012-07-02 20:04:43 +00:00
Bob Wilson cac3b90633 Extend TargetPassConfig to allow running only a subset of the normal passes.
This is still a work in progress but I believe it is currently good enough
to fix PR13122 "Need unit test driver for codegen IR passes".  For example,
you can run llc with -stop-after=loop-reduce to have it dump out the IR after
running LSR.  Serializing machine-level IR is not yet supported but we have
some patches in progress for that.

The plan is to serialize the IR to a YAML file, containing separate sections
for the LLVM IR, machine-level IR, and whatever other info is needed.  Chad
suggested that we stash the stop-after pass in the YAML file and use that
instead of the start-after option to figure out where to restart the
compilation.  I think that's a great idea, but since it's not implemented yet
I put the -start-after option into this patch for testing purposes.

llvm-svn: 159570
2012-07-02 19:48:45 +00:00
Chandler Carruth ff123d5c63 Fix the remaining TCL-style quotes found in the testsuite. This is
another mechanical change accomplished through the power of terrible Perl
scripts.

I have manually switched some "s to 's to make escaping simpler.

While I started this to fix tests that aren't run in all configurations,
the massive number of tests is due to a really frustrating fragility of
our testing infrastructure: things like 'grep -v', 'not grep', and
'expected failures' can mask broken tests all too easily.

Essentially, I'm deeply disturbed that I can change the testsuite so
radically without causing any change in results for most platforms. =/

llvm-svn: 159547
2012-07-02 19:09:46 +00:00
Duncan Sands e8ce94fcd7 GlobalOpt forgot to handle bitcast when analyzing globals. Found by inspection.
llvm-svn: 159546
2012-07-02 18:55:39 +00:00
Chandler Carruth 5da53436d5 Convert the uses of '|&' to use '2>&1 |' instead, which works on old
versions of Bash. In addition, I can back out the change to the lit
built-in shell test runner to support this.

This should fix the majority of fallout on Darwin, but I suspect there
will be a few straggling issues.

llvm-svn: 159544
2012-07-02 18:37:59 +00:00
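The substitution described in r159544 can be sketched as a plain text rewrite. This is only an illustration: the helper name and the sample RUN line are hypothetical, it is not the script used for the commit, and it naively assumes that '|&' never appears inside a quoted string.

```python
def replace_pipe_ampersand(run_line: str) -> str:
    """Rewrite the shorthand 'cmd |& next' as 'cmd 2>&1 | next'."""
    # Both forms send stdout and stderr of the left command into the pipe;
    # only the second form is understood by old versions of Bash.
    return run_line.replace(" |& ", " 2>&1 | ")


if __name__ == "__main__":
    # Hypothetical RUN line, made up for this example.
    before = "; RUN: not llvm-as < %s |& grep error"
    print(replace_pipe_ampersand(before))
```

Lines without '|&' pass through unchanged, so running such a helper over a whole test tree would only touch the affected RUN lines.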
Bob Wilson 2297221028 Do not attempt to use ROR for Thumb1.
Patch by Matt Fischer!

llvm-svn: 159538
2012-07-02 17:22:47 +00:00
Nuno Lopes d0bcfe4d9d fix the regression I introduced in r159385 (it's necessary to update PHI nodes in unwind BB)
llvm-svn: 159534
2012-07-02 16:14:47 +00:00
Chandler Carruth 665c76bc52 The built-in shell test runner for some reason doesn't like the quoting
and multi-line nature of this test. I don't really feel like debugging
this kind of edge-case, so just put it on one line and use single
quotes. With this, every test *really* passes with the built-in shell
test runner.

llvm-svn: 159530
2012-07-02 13:35:01 +00:00
Chandler Carruth 872ac7cfad Fix the TCL-style quoting in one random test that somehow slipped
through my perl nets.

With this, the test suite passes even if I force it to run with the
built-in shell test logic, except for a test which REQUIREs shell.

llvm-svn: 159529
2012-07-02 13:29:47 +00:00
Stepan Dyatkovskiy 8b9ecca42d IntRange:
- Changed isSingleNumber method behaviour. Now this flag is calculated on demand.
IntegersSubsetMapping
  - Optimized diff operation.
  - Replaced type of Items field from std::list with std::map.
  - Added new methods:
    bool isOverlapped(self &RHS)
    void add(self& RHS, SuccessorClass *S)
    void detachCase(self& NewMapping, SuccessorClass *Succ)
    void removeCase(SuccessorClass *Succ)
    SuccessorClass *findSuccessor(const IntTy& Val)
    const IntTy* getCaseSingleNumber(SuccessorClass *Succ)
IntegersSubsetTest
  - DiffTest: Added checks for successors.
SimplifyCFG
  Updated SwitchInst usage (now it is case-ranges compatible) for
    - SimplifyEqualityComparisonWithOnlyPredecessor
    - FoldValueComparisonIntoPredecessors

llvm-svn: 159527
2012-07-02 13:02:18 +00:00
Chandler Carruth a5a29f970e Convert all tests using TCL-style quoting to use shell-style quoting.
This was done through the aid of a terrible Perl creation. I will not
paste any of the horrors here. Suffice to say, it required multiple
staged rounds of replacements, state carried between, and a few
nested-construct-parsing hacks that I'm not proud of. It happens, by
luck, to be able to deal with all the TCL-quoting patterns in evidence
in the LLVM test suite.

If anyone is maintaining large out-of-tree test trees, feel free to poke
me and I'll send you the steps I used to convert things, as well as
answer any painful questions etc. IRC works best for this type of thing
I find.

Once converted, switch the LLVM lit config to use ShTests the same as
Clang. In addition to being able to delete large amounts of Python code
from 'lit', this will also simplify the entire test suite and some of
lit's architecture.

Finally, the test suite runs 33% faster on Linux now. ;]
For my 16-hardware-thread (2x 4-core xeon e5520): 36s -> 24s

llvm-svn: 159525
2012-07-02 12:47:22 +00:00
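The "33% faster" figure in r159525 refers to the reduction in wall-clock time. A small worked calculation from the two timings quoted in the message (nothing else is assumed) shows how that relates to throughput:

```python
before_s, after_s = 36.0, 24.0  # wall-clock seconds quoted in the commit message

# Fraction of the old runtime that was removed.
time_saved = (before_s - after_s) / before_s
# Relative increase in tests completed per second.
throughput_gain = before_s / after_s - 1.0

print(f"runtime reduced by {time_saved:.0%}")
print(f"throughput increased by {throughput_gain:.0%}")
```

The same pair of numbers is a one-third reduction in runtime and a one-half increase in throughput, so it matters which of the two a "faster" claim is quoting.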
Chandler Carruth 0a4a261365 Make tests which first provide a negative assertion via 'not', then
a pipeline, and then a positive assertion via grep, use two RUN lines
instead.

Supporting these complex ideas of 'success' and 'failure' across
multiple stages of a pipeline is brittle in the shell world, and would
block switching to ShTest format; it only worked due to contrivances
introduced by the TclTest format.

Writing this as two separate RUN lines seems clearer in any event.

This is another step toward completely removing TclTests from lit.

llvm-svn: 159524
2012-07-02 12:23:19 +00:00
Chandler Carruth ae00a80869 Rewrite three tests that had truly egregious abuses of 'grep' in them to
use FileCheck.

Aside from removing a dependence on TCL-style quoting, this also makes
the tests ... significantly more robust. =] It would be really, *really*
great if the maintainer(s) of the CellSPU backend went through and
systematically rewrote these tests to use FileCheck. There are a lot
more with abuses nearly this bad.

Another step along the path to a TclTest-free testsuite.

llvm-svn: 159523
2012-07-02 12:20:14 +00:00
Chandler Carruth 8bdfe1ec92 Switch a bunch of Linker tests from using elaborate echo productions to
just provide and reference separate input files from an Inputs
subdirectory. This pattern works very well in the Clang tree and is
easier to understand in my opinion. It also has fewer limitations and
will remove one particularly annoying use of TCL-style {} quoting from
the testsuite.

Also teach the LLVM lit configuration to avoid recursing into 'Inputs'
subdirectories. This wasn't required for the previous 'Inputs'
subdirectories used due to fortuitous suffix patterns.

This is the first step to completely removing support for TCL-style tests.

llvm-svn: 159520
2012-07-02 10:18:06 +00:00
Alexey Samsonov f4462fa3ca This patch extends libLLVMDebugInfo, which contains a minimalistic DWARF parser:
1) DIContext is now able to return the function name for a given instruction address (besides file/line info).
2) llvm-dwarfdump accepts the flag --functions, which prints the function name (if an address is specified by the --address flag).
3) A test case that checks the basic functionality of llvm-dwarfdump was added.

llvm-svn: 159512
2012-07-02 05:54:45 +00:00
Rafael Espindola a77d31d7fd Now that RegistersDefinedFromSameValue handles one instruction being an
implicit_def, the other instruction can be anything, including instructions
that define multiple values. Be careful about that and don't assume what operand
0 is.
Fixes pr13249.

llvm-svn: 159509
2012-07-01 17:08:01 +00:00
Elena Demikhovsky 9af899fa88 Optimization of shuffle node that can fit to the register form of VBROADCAST instruction on AVX2.
llvm-svn: 159504
2012-07-01 06:12:26 +00:00
Chandler Carruth 69ce6652b8 Hoist LLVM's lit testsuite infrastructure into a module so that it can be
re-used. Also, build in direct support for accumulating a set of lit
parameters, arguments, and testsuites to run as part of a 'check-all'
rule. This sinks 'check-all' from a Clang-specific construct to
a generic construct of the project.

llvm-svn: 159482
2012-06-30 10:14:14 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Duncan Sands 369c6d270b Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to
the optimizers producing a multiply expression with more multiplications than
the original (!).

llvm-svn: 159426
2012-06-29 13:25:06 +00:00
Rafael Espindola efdfb1e6b2 In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

llvm-svn: 159409
2012-06-29 04:22:35 +00:00
Manman Ren 98a5bf24a9 X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Chandler Carruth 0cb6c4bdcc Remove a completely unnecessary mkdir from the CMake build.
Clang has been getting along fine without this for quite some time.

llvm-svn: 159400
2012-06-29 00:45:57 +00:00
Nick Lewycky 474112d82c If the step value is a constant zero, the loop isn't going to terminate. Fixes
the assert reported in PR13228!

llvm-svn: 159393
2012-06-28 23:44:57 +00:00
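Why a zero step needs special handling (r159393) can be seen in a toy trip-count computation. This is a sketch under assumptions: the function below is hypothetical, models a loop of the form 'for (i = start; i != end; i += step)' with unbounded integers, and is not the code touched by the commit.

```python
from typing import Optional


def trip_count(start: int, end: int, step: int) -> Optional[int]:
    """Iterations of 'for (i = start; i != end; i += step)', or None if it never exits."""
    if step == 0:
        # i never changes, so the loop exits only if the test fails immediately.
        # Without this check the division below would be a division by zero.
        return 0 if start == end else None
    distance = end - start
    if distance % step != 0 or distance // step < 0:
        # i steps over 'end' or moves away from it (integer wraparound is ignored here).
        return None
    return distance // step
```

For example, trip_count(0, 10, 2) is 5, while trip_count(0, 10, 0) reports that the loop does not terminate instead of failing on the division.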
Nuno Lopes 2f49284f12 make the verifier accept @llvm.donothing as the only intrinsic that can be invoked
While at it, merge 2 tests and FileCheckize them

llvm-svn: 159388
2012-06-28 22:57:00 +00:00
Nuno Lopes b97a4e8bc2 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes 9ac4661afa make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Nuno Lopes 8650fb8e0e make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases)
llvm-svn: 159353
2012-06-28 16:13:37 +00:00
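The reasoning in r159353 can be illustrated with plain sets. This is a simplified model, not LazyValueInfo itself: the function name is hypothetical, and explicit sets stand in for whatever range representation the analysis really uses.

```python
def default_case_values(known_values: set, case_values: set) -> set:
    """Values the switch operand can still have once control reaches the default branch."""
    # Every explicit case value was matched (and branched elsewhere) before the
    # default is taken, so none of them can be the operand's value there.
    return known_values - case_values


if __name__ == "__main__":
    operand_range = set(range(0, 8))   # operand known to be in [0, 8)
    cases = {1, 3}                     # 'case 1:' and 'case 3:'
    print(sorted(default_case_values(operand_range, cases)))
```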
Chandler Carruth 3511dd30c8 Move the setup for variables that are expanded in the lit.site.cfg into
a dedicated helper function. This will enable re-using the same logic
for Clang's lit setup, etc.

llvm-svn: 159333
2012-06-28 06:36:24 +00:00
Hal Finkel f2dcb9a9c4 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00
Jack Carter 6c0bc0b378 The Mips specific inline asm operand modifier 'z' has the
following description in the gnu sources:

    Print $0 if operand is zero otherwise print the op normally.

llvm-svn: 159324
2012-06-28 01:33:40 +00:00
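The quoted description of the 'z' modifier (r159324) amounts to a one-line rule. The sketch below assumes an immediate operand and uses a hypothetical helper name; it is not the Mips AsmPrinter code.

```python
def print_operand_z(value: int) -> str:
    """Model of the 'z' modifier: print $0 for zero, otherwise print the operand normally."""
    # $0 is the Mips register that always reads as zero.
    return "$0" if value == 0 else str(value)
```

With this rule, print_operand_z(0) gives "$0" and any other immediate is printed as its usual decimal text.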
Nuno Lopes e6e049020b make LVI::getEdgeValue() always intersect the constraints of the edge with the range of the block. Previously it was only performing the intersection for a few cases, thus losing precision
llvm-svn: 159320
2012-06-28 01:16:18 +00:00
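The precision gain described in r159320 comes from always intersecting two pieces of information. A minimal sketch with inclusive integer intervals follows; the tuple representation and the function name are assumptions made for illustration, not the LVI data structures.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # inclusive (low, high)


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Values allowed by both intervals, or None if they have nothing in common."""
    low, high = max(a[0], b[0]), min(a[1], b[1])
    return (low, high) if low <= high else None


if __name__ == "__main__":
    block_range = (0, 100)       # what is known about the value in the block
    edge_constraint = (50, 200)  # what taking the edge implies
    print(intersect(block_range, edge_constraint))
```

Using either input alone would give a wider answer than the intersection, which is the loss of precision the commit message refers to.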
Chandler Carruth bf2b400f3b Remove 'site.exp' building from both CMake and configure+make.
This is another vestige of the DejaGNU roots. There were FIXMEs in the
lit setup to add a 'lit.site.cfg', which has been around for quite some
time now, so I've properly switched the handling of the 4 things
actually used in site.exp to go through lit.site.cfg now. No more
parsing of the .exp file, one fewer configure-style generated file,
etc., etc.

llvm-svn: 159313
2012-06-28 00:16:51 +00:00
Chandler Carruth fd3a5e33d5 Remove the last vestiges of the '-lit' and '-dg' test runner split by
removing '-lit' qualifiers from make rules. I've left a legacy
'check-local-lit' rule in case build scripts have this encoded
somewhere.

llvm-svn: 159311
2012-06-28 00:03:15 +00:00
Chandler Carruth 256d3a9eaa Rip out legacy DejaGNU support from our Makefiles. This hasn't been the
default in forever, and hasn't even worked since most of the .exp files
were removed.

llvm-svn: 159307
2012-06-27 23:48:39 +00:00
Chandler Carruth b5c1a2b87c LLVM-GCC is dead. Really. I promise. ;]
More importantly, these files don't even have the variable that these
lines purport to substitute.

llvm-svn: 159304
2012-06-27 23:34:25 +00:00
Jack Carter ef40238a0e This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159302
2012-06-27 23:13:42 +00:00
Jack Carter b9f9de93df This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159301
2012-06-27 22:48:25 +00:00
Matt Beaumont-Gay a58862310c Revert r159136 due to PR13124.
Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159272
2012-06-27 17:10:33 +00:00
Duncan Sands 514db117bd Some reassociate optimizations create new instructions, which they insert just
before the expression root.  Any existing operators that are changed to use one
of them need to be moved between it and the expression root, and recursively
for the operators using that one.  When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops.  Fix
this, resolving PR12963.

llvm-svn: 159265
2012-06-27 14:19:00 +00:00
Richard Barton 57b7d16e34 Teach assembler to handle capitalised operation values for DSB instructions
llvm-svn: 159259
2012-06-27 09:48:23 +00:00
Chandler Carruth aa324c9078 Clean up the 'check' CMake build rule a bit, notably renaming it to
'check-llvm'.

Don't worry! 'check' still works! =] To rationalize the names of targets
used to run tests, the vague plan is the following:

make check-llvm  # run LLVM reg/unit tests  (currently 'check')
make check-clang # run Clang reg/unit tests (currently 'clang-test')
make check-rt    # run CompilerRT reg/unit tests
make check-asan  # run ASan reg/unit tests (subset of -rt)
make check-tsan  # run TSan reg/unit tests (subset of -rt)
make check-all   # run as much of the above as is available

The last one respects what projects are checked out and built for
a given tree. Personally, I would like to eventually make 'check' be an
alias for 'check-all'. For now however, it is an alias for 'check-llvm',
and thus no behavior has changed.

While this patch and my plan only really apply to CMake, I think it
might be good to similarly rationalize the naming scheme for the Make
builds.

llvm-svn: 159258
2012-06-27 09:44:16 +00:00
Akira Hatanaka ad31cd9a01 Test case for r159240.
llvm-svn: 159242
2012-06-27 00:40:34 +00:00
Evan Cheng 319be53a1f Remove an instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
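The transform removed in r159230 was arithmetically correct, and the objection in the message is only to the code it led to. A quick model with Python integers (wraparound ignored; the function names are made up for this sketch) confirms that the two forms agree:

```python
def sub_zext(c: int, b: bool) -> int:
    """C - zext(b): the boolean is zero-extended to 0 or 1 before subtracting."""
    return c - int(b)


def as_select(c: int, b: bool) -> int:
    """The removed rewrite: b ? C - 1 : C."""
    return c - 1 if b else c


def blah(x: bool, y: bool) -> int:
    """Python model of the C example from the commit message: (x - y) + 1."""
    return (int(x) - int(y)) + 1


def forms_agree() -> bool:
    """Check the two forms on a small grid of constants and both boolean values."""
    return all(
        sub_zext(c, b) == as_select(c, b)
        for c in range(-4, 5)
        for b in (False, True)
    )
```

Because the forms are equivalent, dropping the rewrite changes only which instruction sequence the backend ends up selecting, not the result.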
Rafael Espindola e0eaa043eb Fix llc's -print-before=pass and -print-after=pass.
llvm-svn: 159227
2012-06-26 21:33:36 +00:00
Manman Ren a09820414a X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.

llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Jack Carter 5e69cffed5 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributor: Jack Carter
llvm-svn: 159203
2012-06-26 13:49:27 +00:00
Duncan Sands 8bc764aeca Replacing zero-sized alloca's with a null pointer is too aggressive, instead
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite.  What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size".  It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly.  Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.

llvm-svn: 159202
2012-06-26 13:39:21 +00:00
Elena Demikhovsky 26088d2e24 Shuffle optimization for AVX/AVX2.
This patch optimizes frequently used shuffle patterns, giving the following instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
      vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
      vextractf128    $1, %ymm1, %xmm1
      vextractf128    $1, %ymm0, %xmm0
      vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
      vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
      vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]

llvm-svn: 159188
2012-06-26 08:04:10 +00:00
Craig Topper 94bf0f3855 Remove some duplicate instructions that exist only to give different mnemonics for the assembler. Use InstAlias instead.
llvm-svn: 159184
2012-06-26 04:12:49 +00:00
Andrew Trick fb2ba3e1cb Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Eli Friedman bbcd09cc00 Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196.
llvm-svn: 159176
2012-06-25 23:42:33 +00:00
Nuno Lopes 31b54a5379 revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias
llvm-svn: 159175
2012-06-25 23:26:10 +00:00
Nuno Lopes 75eaa72de9 do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway..
llvm-svn: 159173
2012-06-25 22:55:50 +00:00
Manman Ren 606953fbe7 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965
llvm-svn: 159166
2012-06-25 21:49:38 +00:00
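The "movhs" to "movls" example in r159166 relies on the fact that swapping the operands of a comparison swaps the condition. The sketch below checks that exhaustively on 4-bit values. It treats each condition as a comparison on the operands rather than modelling the processor flags, and the SWAPPED table is a reading of the commit message, not code taken from the patch.

```python
BITS = 4  # small width so that every operand pair can be checked exhaustively


def signed(value: int) -> int:
    """Reinterpret an unsigned BITS-wide value as two's complement."""
    return value - (1 << BITS) if value >= (1 << (BITS - 1)) else value


# Each condition as a predicate on the operands of "cmp lhs, rhs".
CONDITIONS = {
    "EQ": lambda l, r: l == r,
    "NE": lambda l, r: l != r,
    "HS": lambda l, r: l >= r,                  # unsigned higher or same
    "LO": lambda l, r: l < r,                   # unsigned lower
    "HI": lambda l, r: l > r,                   # unsigned higher
    "LS": lambda l, r: l <= r,                  # unsigned lower or same
    "GE": lambda l, r: signed(l) >= signed(r),
    "LT": lambda l, r: signed(l) < signed(r),
    "GT": lambda l, r: signed(l) > signed(r),
    "LE": lambda l, r: signed(l) <= signed(r),
}

# Condition to use once the operands of the comparison are exchanged.
SWAPPED = {
    "EQ": "EQ", "NE": "NE",
    "HS": "LS", "LS": "HS", "HI": "LO", "LO": "HI",
    "GE": "LE", "LE": "GE", "GT": "LT", "LT": "GT",
}


def swap_is_sound() -> bool:
    """'cond' on (b, a) must agree with SWAPPED[cond] on (a, b) for all operands."""
    values = range(1 << BITS)
    return all(
        CONDITIONS[cond](b, a) == CONDITIONS[swapped](a, b)
        for cond, swapped in SWAPPED.items()
        for a in values
        for b in values
    )
```

In the message's example the removed "cmp b, a" tested HS on (b, a), and the surviving "sub a, b" compares (a, b), so the table maps HS to LS.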
Dan Gohman 5f725cd196 Fix the objc_autoreleasedReturnValue optimization code to locate
the call correctly even in the case where it is an invoke. This
fixes rdar://11714057.

llvm-svn: 159157
2012-06-25 19:47:37 +00:00
Jakob Stoklund Olesen a57fc12ec9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen eb49566447 Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Nuno Lopes 07594cba7c improve optimization of invoke instructions:
- simplifycfg:  invoke undef/null -> unreachable
- instcombine:  invoke new  -> invoke expect(0, 0)  (an arbitrary NOOP intrinsic;  only done if the allocated memory is unused, of course)
- verifier:  allow invoke of intrinsics  (to make the previous step work)

llvm-svn: 159146
2012-06-25 17:11:47 +00:00
Meador Inge fc2fb711e8 PR13013: ELF Type identification fails for MSB type ELF files.
Fix 'sys::IdentifyFileType' to work with big and little endian byte orderings
when reading the ELF object file type.

Initial patch by Stefan Hepp.

llvm-svn: 159138
2012-06-25 14:48:43 +00:00
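The byte-order issue behind PR13013 can be shown with a few lines of header parsing. This is an independent sketch based on the ELF header layout, not the 'sys::IdentifyFileType' implementation; the function names and the returned strings are made up for illustration.

```python
import struct

ELF_MAGIC = b"\x7fELF"
EI_DATA = 5                        # index of the data-encoding byte in e_ident
ELFDATA2LSB, ELFDATA2MSB = 1, 2    # little-endian and big-endian encodings
E_TYPE_OFFSET = 16                 # e_type follows the 16-byte e_ident array
ET_NAMES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}


def elf_file_type(header: bytes) -> str:
    """Classify an ELF file from its first 18 bytes, honouring the byte order."""
    if len(header) < 18 or header[:4] != ELF_MAGIC:
        return "not ELF"
    if header[EI_DATA] == ELFDATA2LSB:
        fmt = "<H"
    elif header[EI_DATA] == ELFDATA2MSB:
        fmt = ">H"
    else:
        return "unknown"
    # Reading e_type with the wrong byte order would turn 2 into 512.
    (e_type,) = struct.unpack_from(fmt, header, E_TYPE_OFFSET)
    return ET_NAMES.get(e_type, "unknown")


def make_header(data_encoding: int, e_type: int) -> bytes:
    """Build a minimal 18-byte header for demonstration (not a complete ELF file)."""
    fmt = "<H" if data_encoding == ELFDATA2LSB else ">H"
    e_ident = ELF_MAGIC + bytes([1, data_encoding, 1]) + bytes(9)
    return e_ident + struct.pack(fmt, e_type)
```

A reader that always assumes one byte order classifies files of the other order incorrectly, which matches the symptom named in the PR title.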
Rafael Espindola 540c3d23df If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159136
2012-06-25 14:30:31 +00:00
Jakob Stoklund Olesen 2e22e6a361 %RCX is not a function live-out in eh.return functions.
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

llvm-svn: 159116
2012-06-24 15:53:01 +00:00
Hal Finkel 3099ce9489 Allow controlling vectorization of boolean values separately from other integer types.
These are used as the result of comparisons, and often handled differently from larger integer types.

llvm-svn: 159111
2012-06-24 13:28:01 +00:00
Nick Lewycky 0a045bbe4e Remove dyn_cast + dereference pattern by replacing it with a cast and changing
the safety check to look for the same type we're going to actually cast to.
Fixes PR13180!

llvm-svn: 159110
2012-06-24 10:15:42 +00:00
Nick Lewycky bfb07fb562 Remove a dangling reference to a deleted instruction. Fixes PR13185!
llvm-svn: 159096
2012-06-24 01:44:08 +00:00
Pete Cooper fe212e762f DAG legalisation can now handle illegal fma vector types by scalarisation
llvm-svn: 159092
2012-06-24 00:05:44 +00:00
Hal Finkel 4b06b1a0ee Allow BBVectorize to fuse compare instructions.
llvm-svn: 159088
2012-06-23 21:52:50 +00:00
Marshall Clow 78ade1dd08 Add relocation types for Hexagon processor; patch by Sidney Manning <sidneym@codeaurora.org>
llvm-svn: 159081
2012-06-23 14:46:18 +00:00
Hans Wennborg cbe34b4cc9 Extend the IL for selecting TLS models (PR9788)
This allows the user/front-end to specify a model that is better
than what LLVM would choose by default. For example, a variable
might be declared as

  @x = thread_local(initialexec) global i32 42

if it will not be used in a shared library that is dlopen'ed.

If the specified model isn't supported by the target, or if LLVM can
make a better choice, a different model may be used.

llvm-svn: 159077
2012-06-23 11:37:03 +00:00
Rafael Espindola a3088f09b3 Handle aliases to tls variables in all architectures, not just x86.
llvm-svn: 159058
2012-06-23 00:30:03 +00:00
Evan Cheng 68c2f9a9a7 (sub X, imm) gets canonicalized to (add X, -imm)
There are patterns to handle immediates when they fit in the immediate field.
e.g. %sub = add i32 %x, -123
=>   sub r0, r0, #123
Add patterns to catch immediates that do not fit but should be materialized
with a single movw instruction rather than movw + movt pair.
e.g. %sub = add i32 %x, -65535
=>   movw r1, #65535
     sub r0, r0, r1

rdar://11726136

llvm-svn: 159057
2012-06-23 00:29:06 +00:00
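The three cases implied by r159057 can be sketched as a small classifier. It assumes the ARM-mode immediate encoding (an 8-bit value rotated right by an even amount) and a 16-bit movw immediate; the function names and returned labels are hypothetical, and the real selection patterns may consider more options than shown here.

```python
def is_arm_modified_imm(value: int) -> bool:
    """True if value is an 8-bit constant rotated right by an even amount."""
    value &= 0xFFFFFFFF
    for rot in range(0, 32, 2):
        # Rotating left by 'rot' undoes a rotate-right of the same amount.
        rotated = ((value << rot) | (value >> (32 - rot))) & 0xFFFFFFFF
        if rotated < 0x100:
            return True
    return False


def lower_add_negative(imm: int) -> str:
    """Pick a strategy for 'add x, imm' with negative imm, i.e. a canonicalized sub."""
    magnitude = -imm
    if is_arm_modified_imm(magnitude):
        return "sub with immediate"
    if 0 <= magnitude <= 0xFFFF:
        return "movw + sub"        # the case the commit adds patterns for
    return "movw/movt + sub"
```

The two examples in the message land in the first two cases: 123 fits the immediate field, while 65535 does not but still fits a single movw.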
Jim Grosbach 087affe2f3 ARM: Add a better diagnostic for some out of range immediates.
As an example of how the custom DiagnosticType can be used to provide
better operand-mismatch diagnostics, add a custom diagnostic for
the imm0_15 operand class used for several system instructions.
Update the tests to expect the improved diagnostic.

rdar://8987109

llvm-svn: 159051
2012-06-22 23:56:48 +00:00
Hal Finkel 460e94d842 Add support for the PPC isel instruction.
The isel (integer select) instruction is supported on the 440 and A2
embedded cores and on the POWER7.

llvm-svn: 159045
2012-06-22 23:10:08 +00:00
Chad Rosier 1ce3805b23 FileCheckize tests.
llvm-svn: 159044
2012-06-22 23:04:02 +00:00
Lang Hames c98ebda325 Rename fp-op fusion option (yet again) for compatibility with GCC option.
llvm-svn: 159042
2012-06-22 22:31:00 +00:00
Evan Cheng f5bd6c6510 EmitZerofill should take a 64-bit size or else it's chopping off large zero-filled global. rdar://11729134
llvm-svn: 159023
2012-06-22 20:14:46 +00:00
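The "chopping off" mentioned in r159023 is ordinary truncation to a narrower integer. A small illustration, assuming the narrower type was 32 bits wide (the message only says the size should be 64-bit):

```python
def truncate_to_u32(size: int) -> int:
    """What survives when a size is passed through a 32-bit unsigned parameter."""
    return size & 0xFFFFFFFF


GIB = 2 ** 30
five_gib = 5 * GIB
# 5 GiB is 0x1_4000_0000; dropping the bits above bit 31 leaves 0x4000_0000.
print(truncate_to_u32(five_gib) // GIB, "GiB survive the truncation")
```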
Jakob Stoklund Olesen c5c4e96f3e Revert remaining part of r93200: "Disable folding sext(trunc(x)) -> x"
This fixes PR5997.

These transforms were disabled because codegen couldn't deal with other
uses of trunc(x). This is now handled by the peephole pass.

This causes no regressions on x86-64.

llvm-svn: 159003
2012-06-22 16:36:43 +00:00
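The commit message for r159003 does not spell out when sext(trunc(x)) may be folded to x. As a purely arithmetic illustration, not a description of the actual combine, the round trip returns x exactly when x already fits in the narrow signed type. The helper names and the 4-bit width are choices made for this sketch.

```python
NARROW = 4  # bit width of the truncated type in this model


def trunc(x: int, bits: int) -> int:
    """Keep only the low 'bits' bits of x."""
    return x & ((1 << bits) - 1)


def sext(x: int, from_bits: int) -> int:
    """Sign-extend an unsigned 'from_bits'-wide value to a Python integer."""
    sign = 1 << (from_bits - 1)
    return (x ^ sign) - sign


def round_trip(x: int) -> int:
    """sext(trunc(x)) for the NARROW-bit type."""
    return sext(trunc(x, NARROW), NARROW)


# Values that survive the round trip, checked over the 8-bit signed range.
unchanged = [x for x in range(-128, 128) if round_trip(x) == x]
print(unchanged[0], unchanged[-1])
```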
NAKAMURA Takumi c384b95939 test/CodeGen/Generic/asm-large-immediate.ll: Mark it as XFAIL: powerpc, possibly due to r158939.
llvm-svn: 158994
2012-06-22 13:41:00 +00:00
Jakob Stoklund Olesen 321d41a871 Functions calling __builtin_eh_return must have a frame pointer.
The code in X86TargetLowering::LowerEH_RETURN() assumes that a frame
pointer exists, but the frame pointer was forced by the presence of
llvm.eh.unwind.init which isn't guaranteed.

If llvm.eh.unwind.init is actually required in functions calling
eh.return (is it?), we should diagnose that instead of emitting bad
machine code.

This should fix the dragonegg-x86_64-linux-gcc-4.6-test bot.

llvm-svn: 158961
2012-06-22 03:04:27 +00:00
Andrew Trick 3ccb1b8cf9 ARM scheduling fix: compute predicated implicit use properly.
Minor drive by fix to cleanup latency computation. Calling
getOperandLatency with a deliberately incorrect operand index does not
give you the latency you want.

llvm-svn: 158959
2012-06-22 02:50:31 +00:00
Nick Lewycky 33da33676f Emit relocations for DW_AT_location entries on systems which need it. This is
a recommit of r127757. Fixes PR9493. Patch by Paul Robinson!

llvm-svn: 158957
2012-06-22 01:25:12 +00:00
Lang Hames b8650f106a Rename -allow-excess-fp-precision flag to -fuse-fp-ops, and switch from a
boolean flag to an enum: { Fast, Standard, Strict } (default = Standard).

This option controls the creation by optimizations of fused FP ops that store
intermediate results in higher precision than IEEE allows (e.g. FMAs). The
behavior of this option is intended to match the behavior specified by a
soon-to-be-introduced frontend flag: '-ffuse-fp-ops'.

Fast mode - allows formation of fused FP ops whenever they're profitable.

Standard mode - allow fusion only for 'blessed' FP ops. At present the only
blessed op is the fmuladd intrinsic. In the future more blessed ops may be
added.

Strict mode - allow fusion only if/when it can be proven that the excess
precision won't affect the result.

Note: This option only controls formation of fused ops by the optimizers.  Fused
operations that are explicitly requested (e.g. FMA via the llvm.fma.* intrinsic)
will always be honored, regardless of the value of this option.
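
As an illustration of why fusion is gated behind this option, the following
Python sketch (not part of the patch) contrasts a separately rounded multiply
and add with a single-rounding fused operation. The helper names are
hypothetical, and the fused result is emulated with exact rational arithmetic
rather than a hardware FMA.

```python
from fractions import Fraction


def mul_add_unfused(a, b, c):
    # Two roundings: a*b is rounded to a double, then the sum is rounded again.
    return a * b + c


def mul_add_fused(a, b, c):
    # One rounding: compute a*b+c exactly as a rational number, then round
    # once when converting back to a double. This emulates an FMA.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


# Inputs chosen so that the exact product 1 - 2**-60 rounds to 1.0 as a double.
a = 1.0 + 2.0 ** -30
b = 1.0 - 2.0 ** -30
c = -1.0

unfused = mul_add_unfused(a, b, c)
fused = mul_add_fused(a, b, c)
```

Here the two results differ: the intermediate product loses its low bits in the
unfused form, while the fused form keeps them. This is the kind of visible
change in results that Strict mode is meant to rule out.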

Internally TargetOptions::AllowExcessFPPrecision has been replaced by
TargetOptions::AllowFPOpFusion.

llvm-svn: 158956
2012-06-22 01:09:09 +00:00
Nuno Lopes 771e7bd4ba instcombine: disable optimization of 'invoke null/undef'. I'll move this functionality to SimplifyCFG (since we cannot make changes to the CFG here).
Fixes the crashes with the attached test case

llvm-svn: 158951
2012-06-21 23:52:14 +00:00
Evan Cheng 32c7cc8ec9 Look past zext to strength reduce a udiv. Patch by David Majnemer. rdar://11721329
llvm-svn: 158946
2012-06-21 22:52:49 +00:00
Jack Carter c457f62033 The inline asm operand modifier 'n' is supposed
to be generic across architectures. It has the
following description in the gnu sources:

    Negate the immediate constant

Several architectures such as x86 have local implementations
of operand modifier 'n' which go beyond the above description
slightly. This won't affect them.

Affected files:

    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'n' to the switch cases.

    test/CodeGen/Generic/asm-large-immediate.ll
        Generic compiled test (x86 for me)

    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one

Contributor: Jack Carter
llvm-svn: 158939
2012-06-21 21:37:54 +00:00
Nuno Lopes dc6085e52d Add support for invoke to the MemoryBuiltin analysis.
Update comments accordingly.

Make instcombine remove useless invokes to C++'s 'new' allocation function (test attached).

llvm-svn: 158937
2012-06-21 21:25:05 +00:00
Akira Hatanaka 765c312314 1. fix null program output after some other changes
2. re-enable null.ll test
3. fix some minor style violations

Patch by Reed Kotler.

llvm-svn: 158935
2012-06-21 20:39:10 +00:00
Akira Hatanaka fcf52c8304 Add Mips to the list of target architectures for the MCJIT tests.
Patch by Reed Kotler.

llvm-svn: 158933
2012-06-21 20:23:32 +00:00
Hal Finkel a86b0f20dd Treat TargetGlobalAddress as a constant for the purpose of matching pre-inc stores on PPC.
Thanks to Tobias von Koch for pointing out this problem.

llvm-svn: 158932
2012-06-21 20:10:48 +00:00
Jack Carter b2fd5f66b4 The inline asm operand modifier 'c' is supposed
to be generic across architectures. It has the
following description in the gnu sources:

    Substitute immediate value without immediate syntax

Several architectures such as x86 have local implementations
of operand modifier 'c' which go beyond the above description
slightly. To make use of the generic modifiers without overriding
local implementation one can make a call to the base class method
for AsmPrinter::PrintAsmOperand() in the locally derived method's 
"default" case in the switch statement. That way if it is already
defined locally the generic version will never get called.

This change was needed because test/CodeGen/Generic/asm-large-immediate.ll
failed on a native Mips board. The test was assuming a generic
implementation was in place.

Affected files:

    lib/Target/Mips/MipsAsmPrinter.cpp:
        Changed the default case to call the base method.
    lib/CodeGen/AsmPrinter/AsmPrinterInlineAsm.cpp
        Added 'c' to the switch cases.
    test/CodeGen/Mips/asm-large-immediate.ll
        Mips compiled version of the generic one
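
The fallback pattern described above can be sketched in Python as follows.
This is only a model of the control flow, not the LLVM code: the class names
and the target-specific 'z' modifier are hypothetical, and operands are
reduced to plain integers.

```python
class GenericPrinter:
    """Stands in for the base AsmPrinter with the generic modifiers."""

    def print_asm_operand(self, modifier, imm):
        if modifier == "c":
            # Substitute the immediate value without immediate syntax.
            return str(imm)
        return None  # Unknown modifier: report failure to the caller.


class TargetPrinter(GenericPrinter):
    """Stands in for a target printer with its own local modifiers."""

    def print_asm_operand(self, modifier, imm):
        if modifier == "z":
            # Hypothetical target-specific modifier, handled locally.
            return "$0" if imm == 0 else str(imm)
        # "default" case: defer to the generic implementation, so a local
        # definition always wins and the base is only a fallback.
        return super().print_asm_operand(modifier, imm)


printer = TargetPrinter()
local_result = printer.print_asm_operand("z", 0)
generic_result = printer.print_asm_operand("c", 42)
unknown_result = printer.print_asm_operand("q", 42)
```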

Contributor: Jack Carter
llvm-svn: 158925
2012-06-21 17:14:46 +00:00
Nuno Lopes a6aa3d3b5f hopefully fix the buildbots: some tests have wrong definitions of malloc and were crashing this code on 64-bit machines
llvm-svn: 158923
2012-06-21 16:47:58 +00:00
Nuno Lopes 0e967e0186 port the BoundsChecking patch to the new MemoryBuiltin API (i.e., remove most of the code from here).
Remove the alloc_size.ll test until we settle on a metadata format that makes everyone happy.

llvm-svn: 158920
2012-06-21 15:59:53 +00:00
Nuno Lopes 55fff83422 refactor the MemoryBuiltin analysis:
- provide a more extensive set of functions to detect library allocation functions (e.g., malloc, calloc, strdup, etc.)
- provide an API to compute the size and offset of an object pointed by

Move a few clients (GVN, AA, instcombine, ...) to the new API.
This implementation is a lot more aggressive than each of the custom implementations being replaced.

Patch reviewed by Nick Lewycky and Chandler Carruth, thanks.

llvm-svn: 158919
2012-06-21 15:45:28 +00:00
NAKAMURA Takumi 613663cfe2 Revert r158209, "test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911."
It passes according to ppc changes.

llvm-svn: 158917
2012-06-21 13:43:06 +00:00
Lang Hames 90b2a4cbad Add a missing llvm.fma -> VFNMS pattern to the ARM backend.
llvm-svn: 158902
2012-06-21 06:10:00 +00:00
Evan Cheng 8c2ad81238 Emit a single _udivmodsi4 libcall instead of two separate _udivsi3 and
_umodsi3 libcalls if they have the same arguments. This optimization
was apparently broken if one of the nodes was replaced in place.
rdar://11714607

llvm-svn: 158900
2012-06-21 05:56:05 +00:00
Jakob Stoklund Olesen 51c63e64e3 Remove the -live-regunits command line option.
Register allocators depend on it being permanently enabled now.

llvm-svn: 158873
2012-06-20 23:31:34 +00:00
Akira Hatanaka 87505f46ac Revert r158846.
llvm-svn: 158855
2012-06-20 21:19:39 +00:00
Akira Hatanaka da448fe0b1 In MipsDisassembler.cpp, instead of defining register class tables, use the ones
that are generated by TableGen and are already available in
MipsGenRegisterInfo.inc. Suggested by Jakob Stoklund Olesen.

Also, fix bug in function DecodeAFGR64RegisterClass.

Patch by Vladimir Medic. 

llvm-svn: 158846
2012-06-20 20:39:23 +00:00
Jakob Stoklund Olesen 833308d785 Only update regunit live ranges that have been precomputed.
Regunit live ranges are computed on demand, so when mi-sched calls
handleMove, some regunits may not have live ranges yet.

That makes updating them easier: Just skip the non-existing ranges. They
will be computed correctly from the rescheduled machine code when they
are needed.

llvm-svn: 158831
2012-06-20 18:00:57 +00:00
Hal Finkel ca542beffe Add support for generating reg+reg (indexed) pre-inc loads on PPC.
llvm-svn: 158823
2012-06-20 15:43:03 +00:00
Craig Topper b9e8e18949 Don't insert 128-bit UNDEF into 256-bit vectors. Just keep the 256-bit vector. Original patch by Elena Demikhovsky. Tweaked by me to allow possibility of covering more cases.
llvm-svn: 158792
2012-06-20 05:39:26 +00:00
Lang Hames 39fb1d08dc Add DAG-combines for aggressive FMA formation.
This patch adds DAG combines to form FMAs from pairs of FADD + FMUL or
FSUB + FMUL. The combines are performed when:
(a) Either
      AllowExcessFPPrecision option (-enable-excess-fp-precision for llc)
        OR
      UnsafeFPMath option (-enable-unsafe-fp-math)
    are set, and
(b) TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) is true for the type of
    the FADD/FSUB, and
(c) The FMUL only has one user (the FADD/FSUB).

If your target has fast FMA instructions you can make use of these combines by
overriding TargetLoweringInfo::isFMAFasterThanMulAndAdd(VT) to return true for
types supported by your FMA instruction, and adding patterns to match ISD::FMA
to your FMA instructions.

llvm-svn: 158757
2012-06-19 22:51:23 +00:00
Jakob Stoklund Olesen 77a0cfb19a Add a triple.
The test was failing on Linux because of asm syntax differences.

llvm-svn: 158748
2012-06-19 21:46:25 +00:00
Jakob Stoklund Olesen 0f855e4263 Implement PPCInstrInfo::isCoalescableExtInstr().
The PPC::EXTSW instruction preserves the low 32 bits of its input, just
like some of the x86 instructions. Use it to reduce register pressure
when the low 32 bits have multiple uses.

This requires a small change to PeepholeOptimizer since EXTSW takes a
64-bit input register.

This is related to PR5997.

llvm-svn: 158743
2012-06-19 21:14:34 +00:00
Jan Wen Voung 7f5d79f864 Have ARM ELF use correct reloc for "b" instr.
The condition code didn't actually matter for arm "b" instructions,
unlike "bl".  It should just use the R_ARM_JUMP24 reloc.

llvm-svn: 158722
2012-06-19 16:03:02 +00:00
Hal Finkel 1cc27e44a4 Add support for generating reg+reg preinc stores on PPC.
PPC will now generate STWUX and friends.

llvm-svn: 158698
2012-06-19 02:34:32 +00:00
Rafael Espindola 31567515ed really add a triple :-(
llvm-svn: 158696
2012-06-19 02:17:35 +00:00
Rafael Espindola f2ae4075c8 Add a triple to the test.
llvm-svn: 158695
2012-06-19 01:42:34 +00:00
Rafael Espindola ca3e0ee8b3 Move the support for using .init_array from ARM to the generic
TargetLoweringObjectFileELF. Use this to support it on X86. Unlike ARM,
on X86 it is not easy to find out if .init_array should be used or not, so
the decision is made via TargetOptions and defaults to off.

Add a command line option to llc that enables it.

llvm-svn: 158692
2012-06-19 00:48:28 +00:00
Nuno Lopes f9abcb7ba9 revert r158660, since Chris has some issues with this patch (namely using code to represent information only used by the compiler)
Original commit msg:
add the 'alloc' metadata node to represent the size and offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158688
2012-06-18 23:34:26 +00:00
Manman Ren 6e1fd46fdf ARM: use NEON loads and stores if possible when handling struct byval.
This change is to be enabled in clang.

rdar://9877866

llvm-svn: 158684
2012-06-18 22:23:48 +00:00
Jim Grosbach cb540f5cff ARM: Define generic HINT instruction.
The NOP, WFE, WFI, SEV and YIELD instructions are all hints w/
a different immediate value in bits [7,0]. Define a generic HINT
instruction and refactor NOP, WFE, WFI, SEV and YIELD to be
assembly aliases of that.

rdar://11600518

llvm-svn: 158674
2012-06-18 19:45:50 +00:00
Nuno Lopes b7c941bad9 add the 'alloc' metadata node to represent the size and offset of buffers pointed to by pointers.
This metadata can be attached to any instruction returning a pointer

llvm-svn: 158660
2012-06-18 16:04:04 +00:00
Joel Jones 3237ce737e This change handles another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.  The 8 bit case doesn't
need to be handled, as the 8 bit constants are encoded directly, thereby
not needing a separate load instruction to form the constant into a register.

<rdar://problem/11481151>

llvm-svn: 158659
2012-06-18 14:51:32 +00:00
Chandler Carruth a1da0bf5ef Add a regression test for the bug exposed by r158087, which has been
temporarily reverted.

This test is annoyingly overspecified, but I don't know of another way
to thoroughly test the saving and restoring of the registers. While this
will have to be adjusted even with the issue fixed in order to re-apply
r158087, those adjustments should very clearly indicate that it is still
correct (%esp getting restored prior to pops), whereas without it, this
case can easily slip under the radar.

Still, any suggestions for improvements are very welcome.

All credit to Matt Beaumont-Gay for reducing this out of an insane
Address Sanitizer crash to a reasonably small seg-faulting C program
when built with -mstackrealign. I just reduced it to IR, which was much
simpler. =]

llvm-svn: 158656
2012-06-18 09:15:04 +00:00
Chandler Carruth 2cc11fd8c7 Temporarily revert r158087.
This patch causes problems when both dynamic stack realignment and
dynamic allocas combine in the same function. With this patch, we no
longer build the epilog correctly, and silently restore registers from
the wrong position in the stack.

Thanks to Matt for tracking this down, and getting at least an initial
test case to Chad. I'm going to try to check a variation of that test
case in so we can easily track the fixes required.

llvm-svn: 158654
2012-06-18 07:03:12 +00:00
Pete Cooper 33ee6c9bf1 Now that SROA can form alloca's for dynamic vector accesses, further improve it to be able to replace operations on these vector alloca's with insert/extract element insts
llvm-svn: 158623
2012-06-17 03:58:26 +00:00
Hal Finkel 6261c2dc28 Cleanup trip-count finding for PPC CTR loops (and some bug fixes).
This cleans up the method used to find trip counts in order to form CTR loops on PPC.
This refactoring allows the pass to find loops which have a constant trip count but also
happen to end with a comparison to zero. This also adds explicit FIXMEs to mark two different
classes of loops that are currently ignored.

In addition, we now search through all potential induction operations instead of just the first.
Also, we check the predicate code on the conditional branch and abort the transformation if the
code is not EQ or NE, and we then make sure that the branch to be transformed matches the
condition register defined by the comparison (multiple possible comparisons will be considered).

llvm-svn: 158607
2012-06-16 20:34:07 +00:00
Hal Finkel fa103d3fc7 Teach BBVectorize to combine, when possible, or discard metadata when fusing instructions.
The present implementation handles only TBAA and FP metadata, discarding everything else.
For debug metadata, the current behavior is maintained (the debug metadata associated with
one of the instructions will be kept, discarding that attached to the other).

This should address PR 13040.

llvm-svn: 158606
2012-06-16 20:34:06 +00:00
Rafael Espindola f70bea93e2 Implement irpc. Extracted from a patch by the PaX team. I just added the test.
llvm-svn: 158604
2012-06-16 18:03:25 +00:00
Pete Cooper 818e9f4a26 Fix crash from r158529 on Bullet.
Dynamic GEPs created by SROA needed to insert extra "i32 0"
operands to index through structs and arrays to get to the
vector being indexed.

llvm-svn: 158590
2012-06-16 01:43:26 +00:00
Andrew Trick e67a30c77f Unit test for LSR kind=Special fix: r158536.
llvm-svn: 158570
2012-06-15 22:46:31 +00:00
Kevin Enderby 6c7279ec2e Fix the encoding of the armv7m (MClass) for MSR registers other than apsr,
iapsr, eapsr and xpsr which also needed to have 0b10 in their mask encoding bits.

llvm-svn: 158560
2012-06-15 22:14:44 +00:00
Manman Ren e0763c7472 ARM: optimization for sub+abs.
This patch will optimize abs(x-y)
FROM
sub, movs, rsbmi
TO
subs, rsbmi

For abs, we will use cmp instead of movs. This is necessary because we already
have an existing peephole pass which optimizes away cmp following sub.

rdar: 11633193
llvm-svn: 158551
2012-06-15 21:32:12 +00:00
Pete Cooper e24d6a19e3 Allow SROA to split up an array of vectors into multiple vectors, even when the vectors are dynamically indexed
llvm-svn: 158529
2012-06-15 18:07:29 +00:00
Rafael Espindola 1821c6c3b0 Some optimizations done by globalopt are safe only for internal linkage, not
linkonce linkage. For example, it is not valid to add unnamed_addr.

This also fixes a crash in g++.dg/opt/static5.C.

llvm-svn: 158528
2012-06-15 18:00:24 +00:00
Jakob Stoklund Olesen a15a224db0 Preserve <undef> flags in ARMExpandPseudo.
This probably mostly shows up in bugpoint-generated code.

llvm-svn: 158527
2012-06-15 17:46:54 +00:00
Rafael Espindola 768b41c17a Factor macro argument parsing into helper methods and add support for .irp.
Patch extracted from a larger one by the PaX team. I added the testcases
and tightened error handling a bit.

llvm-svn: 158523
2012-06-15 14:02:34 +00:00
Duncan Sands 7838603ffc Fix issues (infinite loop and/or crash) with self-referential instructions, for
example degenerate phi nodes and binops that use themselves in unreachable code.
Thanks to Charles Davis for the testcase that uncovered this can of worms.

llvm-svn: 158508
2012-06-15 08:37:50 +00:00
Pete Cooper 1d1fa72837 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158479
2012-06-14 23:53:53 +00:00
Rafael Espindola def1b09be2 Implement the isSafeToDiscardIfUnused predicate and use it in globalopt and
globaldce. Globaldce was already removing linkonce globals, but globalopt was
not.

llvm-svn: 158476
2012-06-14 22:48:13 +00:00
Akira Hatanaka d8ab16b86f 1. introduce MipsPat in place of Pat in order to exclude those from
being used by Mips16 or Micro Mips
2. clean up a few lines too long encountered

Patch by Reed Kotler.

llvm-svn: 158470
2012-06-14 21:03:23 +00:00
Akira Hatanaka 1b420ac4c8 Make machine verifier check the first instruction of the last bundle instead of
the last instruction of a basic block.

llvm-svn: 158468
2012-06-14 20:51:13 +00:00
Pete Cooper 5d19452f3f Revert r158454: Allow SROA to look at a vector type... It's breaking the vectorise buildbot
This reverts commit 12c1f86ffa731e2952c80d2cc577000c96b8962c.

llvm-svn: 158462
2012-06-14 18:32:52 +00:00
Pete Cooper a7e6d58a87 Recommit r158407: Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access. Now with additional fix and test for indexing into a vector inside a struct
llvm-svn: 158454
2012-06-14 16:38:13 +00:00
Richard Barton b0ec375b96 Replace assertion failure for badly formatted CPS instruction with error message.
llvm-svn: 158445
2012-06-14 10:48:04 +00:00
Manman Ren 2764301a77 Revert: test/CodeGen/ARM/iabs.ll in r158441
Sorry that I accidentally checked in this file with my previous commit.

llvm-svn: 158442
2012-06-14 06:04:02 +00:00
Manman Ren c2bc2d106b InstCombine: fix a bug when combining (fcmp cc0 x, y) && (fcmp cc1 x, y).
uno && ueq was converted to ueq; it should be converted to uno.
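
A small Python model of the two predicates shows why uno is the right result.
It is only an illustration of the semantics, with hypothetical helper names;
it is not the InstCombine code. It assumes uno means "either operand is NaN"
and ueq means "unordered or equal".

```python
import math


def uno(x, y):
    # Unordered: true when either operand is NaN.
    return math.isnan(x) or math.isnan(y)


def ueq(x, y):
    # Unordered or equal.
    return uno(x, y) or x == y


samples = [0.0, 1.0, -1.0, float("inf"), float("nan")]

# (uno && ueq) agrees with uno on every pair of samples.
and_matches_uno = all(
    (uno(x, y) and ueq(x, y)) == uno(x, y) for x in samples for y in samples
)

# It does not agree with ueq: for equal ordered operands ueq is true, but
# the conjunction is false.
and_matches_ueq = all(
    (uno(x, y) and ueq(x, y)) == ueq(x, y) for x in samples for y in samples
)
```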

llvm-svn: 158441
2012-06-14 05:57:42 +00:00
Akira Hatanaka c6496e2cb6 Test case for MIPS long branch pass.
llvm-svn: 158438
2012-06-14 02:12:21 +00:00
Akira Hatanaka 843aca9328 Fix test cases.
llvm-svn: 158435
2012-06-14 01:21:00 +00:00
Akira Hatanaka df5205ef3d Implement a DAGCombine in MipsISelLowering.cpp which transforms the following
pattern:

(add v0, (add v1, abs_lo(tjt))) => (add (add v0, v1), abs_lo(tjt))

"tjt" is a TargetJumpTable node. 

llvm-svn: 158419
2012-06-13 20:33:18 +00:00
Akira Hatanaka 1daf8c2a16 Set a higher value for maxStoresPerMemcpy in MipsISelLowering.cpp.
llvm-svn: 158414
2012-06-13 19:33:32 +00:00
Akira Hatanaka f0273603f5 Implement fastcc calling convention for MIPS.
llvm-svn: 158410
2012-06-13 18:06:00 +00:00
Richard Osborne ab7d788eb5 Fix pattern for MKMSK instruction.
llvm-svn: 158409
2012-06-13 17:59:12 +00:00
Pete Cooper e2fe809772 Revert "Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access"
This reverts commit 51786e0aaec76b973205066bd44f7f427b21969f.

llvm-svn: 158408
2012-06-13 17:55:22 +00:00
Pete Cooper e1d4e8b563 Allow SROA to look at a vector type and see if the offset is out of range to be replaced with a scalar access
llvm-svn: 158407
2012-06-13 17:30:34 +00:00
Duncan Sands 409d8ae165 It is possible for several constants which aren't individually absorbing to
combine to the absorbing element.  Thanks to nbjoerg on IRC for pointing this 
out.
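
As a hedged illustration (the patch itself gives no example), take
multiplication on 8-bit integers, where the absorbing element is 0. Neither
constant below is 0, yet their product wraps to 0. The helper name and the
choice of bit width are assumptions made for this sketch.

```python
BITS = 8
MASK = (1 << BITS) - 1


def mul_i8(a, b):
    # Multiplication modulo 2**8, as for an i8 value.
    return (a * b) & MASK


c1 = 16
c2 = 16
combined = mul_i8(c1, c2)  # 256 wraps around to 0

# Once the constants combine to 0, the whole product is 0 for any x.
products = [mul_i8(x, combined) for x in range(256)]
```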

llvm-svn: 158399
2012-06-13 12:15:56 +00:00
Craig Topper 71dc02d659 Fix intrinsics for XOP frczss/sd instructions. These instructions only take one source register and zero the upper bits of the destination rather than preserving them.
llvm-svn: 158396
2012-06-13 07:18:53 +00:00
Manman Ren d33f4efbfd SimplifyCFG: fold unconditional branch to its predecessor if profitable.
This patch extends FoldBranchToCommonDest to fold unconditional branches.
For unconditional branches, we fold them if it is easy to update the phi nodes 
in the common successors.

rdar://10554090

llvm-svn: 158392
2012-06-13 05:43:29 +00:00
Akira Hatanaka 5fa541231b disable use of directive .set nomicromips
until this directive is pushed in gas to open source fsf

Patch by Reed Kotler.

llvm-svn: 158381
2012-06-13 02:41:14 +00:00
Andrew Trick 344fb64fa3 sched: fix latency of memory dependence chain edges for consistency.
For store->load dependencies that may alias, we should always use
TrueMemOrderLatency, which may eventually become a subtarget hook. In
effect, we should guarantee at least TrueMemOrderLatency on at least
one DAG path from a store to a may-alias load.

This should fix the standard mode as well as -enable-aa-sched-mi.

llvm-svn: 158380
2012-06-13 02:39:03 +00:00
Duncan Sands 67cd591989 Use std::map rather than SmallMap because SmallMap assumes that the value has
POD type, causing memory corruption when mapping to APInts with bitwidth > 64.
Merge another crash testcase into crash.ll while there.

llvm-svn: 158369
2012-06-12 20:16:51 +00:00
Chad Rosier c6916f88a8 [arm-fast-isel] Add support for -arm-long-calls.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 158368
2012-06-12 19:25:13 +00:00
Duncan Sands d7aeefebd6 Now that Reassociate's LinearizeExprTree can look through arbitrary expression
topologies, it is quite possible for a leaf node to have huge multiplicity, for
example: x0 = x*x, x1 = x0*x0, x2 = x1*x1, ... rapidly gives a value which is x
raised to a vast power (the multiplicity, or weight, of x).  This patch fixes
the computation of weights by correctly computing them no matter how big they
are, rather than just overflowing and getting a wrong value.  It turns out that
the weight for a value never needs more bits to represent than the value itself,
so it is enough to represent weights as APInts of the same bitwidth and do the
right overflow-avoiding dance steps when computing weights.  As a side-effect it
reduces the number of multiplies needed in some cases of large powers.  While
there, in view of external uses (eg by the vectorizer) I made LinearizeExprTree
static, pushing the rank computation out into users.  This is progress towards
fixing PR13021.
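
To make the growth concrete, here is a rough Python sketch (not the
Reassociate code; the helper names are hypothetical) that counts the weight of
x after a chain of squarings and shows a fixed-width counter wrapping.

```python
def leaf_weight_after_squarings(k):
    # x0 = x*x has weight 2; each further squaring x_i = x_{i-1} * x_{i-1}
    # adds the operand's weight to itself, so the weight doubles every step.
    weight = 1
    for _ in range(k):
        weight += weight
    return weight


def wrap_to_bits(value, bits):
    # What a naive fixed-width unsigned counter would hold.
    return value & ((1 << bits) - 1)


small = leaf_weight_after_squarings(3)    # x^8
huge = leaf_weight_after_squarings(64)    # x^(2**64)
naive_64bit = wrap_to_bits(huge, 64)      # wraps to a wrong value
```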

llvm-svn: 158358
2012-06-12 14:33:56 +00:00
Jakob Stoklund Olesen e782fa649f Fix test that depends on register allocation.
The test is really checking the prolog/epilog load/store multiple
formation.

llvm-svn: 158328
2012-06-11 21:14:28 +00:00
Jakob Stoklund Olesen 4e28777465 Fix test case to work on ARM.
Patch by James Benton!

llvm-svn: 158316
2012-06-11 16:01:14 +00:00
Bill Wendling 4b79647a6e Re-enable the CMN instruction.
We turned off the CMN instruction because it had semantics which we weren't
getting correct. If we are comparing with an immediate, then it's okay to use
the CMN instruction.
<rdar://problem/7569620>

llvm-svn: 158302
2012-06-11 08:07:26 +00:00
Benjamin Kramer 8b8a76974f InstCombine: Turn (zext A) == (B & (1<<X)-1) into A == (trunc B), narrowing the compare.
This saves a cast, and zext is more expensive on platforms with subreg support
than trunc is. This occurs in the BSD implementation of memchr(3), see PR12750.
On the synthetic benchmark from that bug stupid_memchr and bsd_memchr have the
same performance now when not inlining either function.

stupid_memchr: 323.0us
bsd_memchr: 321.0us
memchr: 479.0us

where memchr is the llvm-gcc compiled bsd_memchr from osx lion's libc. When
inlining is enabled bsd_memchr still regresses down to llvm-gcc memchr time,
I haven't fully understood the issue yet, something is grossly mangling the
loop after inlining.
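
The equivalence behind the transform can be checked with a small Python model.
It assumes A is an 8-bit value, B a 32-bit value and X == 8, so the mask
(1<<X)-1 covers exactly the bits of A; the helper names are hypothetical and
this is not the InstCombine code.

```python
A_BITS = 8


def zext_to_i32(a):
    # Zero extension keeps the value; the high bits become zero.
    return a & 0xFF


def trunc_to_i8(b):
    # Truncation keeps only the low 8 bits.
    return b & 0xFF


def wide_compare(a, b):
    # (zext A) == (B & (1<<X)-1), compared as 32-bit values.
    return zext_to_i32(a) == (b & ((1 << A_BITS) - 1))


def narrow_compare(a, b):
    # A == (trunc B), compared as 8-bit values.
    return a == trunc_to_i8(b)


b_samples = [0, 1, 0x7F, 0xFF, 0x100, 0x1234, 0xFFFFFF00, 0xFFFFFFFF]
same_result = all(
    wide_compare(a, b) == narrow_compare(a, b)
    for a in range(256)
    for b in b_samples
)
```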

llvm-svn: 158297
2012-06-10 20:35:00 +00:00
Hal Finkel 4e9f1a859f Enable ILP scheduling for all nodes by default on PPC.
Over the entire test-suite, this has an insignificantly negative average
performance impact, but reduces some of the worst slowdowns from the
anti-dep. change (r158294).

Largest speedups:
SingleSource/Benchmarks/Stanford/Quicksort - 28%
SingleSource/Benchmarks/Stanford/Towers - 24%
SingleSource/Benchmarks/Shootout-C++/matrix - 23%
MultiSource/Benchmarks/SciMark2-C/scimark2 - 19%
MultiSource/Benchmarks/MiBench/automotive-bitcount/automotive-bitcount - 15%
(matrix and automotive-bitcount were both in the top-5 slowdown list from the
anti-dep. change)

Largest slowdowns:
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 28%
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 26%
MultiSource/Benchmarks/MiBench/automotive-susan/automotive-susan - 21%
SingleSource/Benchmarks/CoyoteBench/lpbench - 20%
MultiSource/Applications/d/make_dparser - 16%

llvm-svn: 158296
2012-06-10 19:32:29 +00:00
Nadav Rotem 17ee58a792 Add AutoUpgrade support for the SSE4 ptest intrinsics.
Patch by Michael Kuperstein.

llvm-svn: 158295
2012-06-10 18:42:51 +00:00
Hal Finkel 2edfbddcf0 Improve ext/trunc patterns on PPC64.
The PPC64 backend had patterns for i32 <-> i64 extensions and truncations that
would leave self-moves in the final assembly. Replacing those patterns with ones
based on the SUBREG builtins yields better-looking code.

Thanks to Jakob and Owen for their suggestions in this matter.

llvm-svn: 158283
2012-06-09 22:10:19 +00:00
Craig Topper 3352ba55b9 Replace XOP vpcom intrinsics with fewer intrinsics that take the immediate as an argument.
llvm-svn: 158278
2012-06-09 16:46:13 +00:00
Hal Finkel eb50c2d4a4 Enable tail merging on PPC.
Tail merging had been disabled on PPC because it would disturb bundling decisions
made during pre-RA scheduling on the 970 cores. Now, however, all bundling decisions
are made during post-RA scheduling, and tail merging is generally beneficial (the
average test-suite speedup is insignificantly positive).

Largest test-suite speedups:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - 30%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - 23%
SingleSource/Benchmarks/Shootout-C++/ary - 21%
SingleSource/Benchmarks/Stanford/Queens - 17%

Largest slowdowns:
MultiSource/Benchmarks/MiBench/security-sha/security-sha - 24%
MultiSource/Benchmarks/McCat/03-testtrie/testtrie - 22%
MultiSource/Applications/JM/ldecod/ldecod - 14%
MultiSource/Benchmarks/mediabench/g721/g721encode/encode - 9%

This is improved by using full (instead of just critical) anti-dependency breaking,
but doing so still causes miscompiles and so cannot yet be enabled by default.

llvm-svn: 158259
2012-06-09 03:14:50 +00:00
Jakob Stoklund Olesen 33a1b416ac Don't run RAFast in the optimizing regalloc pipeline.
The fast register allocator is not supposed to work in the optimizing
pipeline. It doesn't make sense to compute live intervals, run full copy
coalescing, and then run RAFast.

Fast register allocation in the optimizing pipeline is better done by
RABasic.

llvm-svn: 158242
2012-06-08 23:15:12 +00:00
Nuno Lopes 2710f1b049 canonicalize:
-%a + 42
into
42 - %a

previously we were emitting:
-(%a + 42)

This fixes the infinite loop in PR12338. The generated code is still not perfect, though.
Will work on that next

llvm-svn: 158237
2012-06-08 22:30:05 +00:00
Hal Finkel c6b5debb40 Enable PPC CTR loop formation by default.
Thanks to Jakob's help, this now causes no new test suite failures!

Over the entire test suite, this gives an average 1% speedup. The largest speedups are:
SingleSource/Benchmarks/Misc/pi - 108%
SingleSource/Benchmarks/CoyoteBench/lpbench - 54%
MultiSource/Benchmarks/Prolangs-C/unix-smail/unix-smail - 50%
SingleSource/Benchmarks/Shootout/ary3 - 32%
SingleSource/Benchmarks/Shootout-C++/matrix - 30%

The largest slowdowns are:
MultiSource/Benchmarks/mediabench/gsm/toast/toast - -30%
MultiSource/Benchmarks/Prolangs-C/bison/mybison - -25%
MultiSource/Benchmarks/BitBench/uuencode/uuencode - -22%
MultiSource/Applications/d/make_dparser - -14%
SingleSource/Benchmarks/Shootout-C++/ary - -13%

In light of these slowdowns, additional profiling work is obviously needed!

llvm-svn: 158223
2012-06-08 19:19:53 +00:00
Manman Ren bf86b295bb Test case for r158160
llvm-svn: 158218
2012-06-08 18:42:37 +00:00
Chad Rosier 3d464d8068 Fix a crash in APInt::lshr when shiftAmt > BitWidth.
Patch by James Benton <jbenton@vmware.com>.

llvm-svn: 158213
2012-06-08 18:04:52 +00:00
NAKAMURA Takumi 5412cef77d test/CodeGen/Generic/APIntLoadStore.ll: Mark as XFAIL:ppc since r157911.
llvm-svn: 158209
2012-06-08 16:28:06 +00:00
Hal Finkel 821e00121c Disable the PPC CTR-Loops pass by default.
The pass itself works well, but something in the Machine* infrastructure
does not understand terminators which define registers. Without the ability
to use the block-placement pass, etc. this causes performance regressions (and
so is turned off by default). Turning off the analysis turns off the problems
with the Machine* infrastructure.

llvm-svn: 158206
2012-06-08 15:38:25 +00:00
Hal Finkel 8b01503ee5 Fix a bug in the new PPC CTR-Loops pass.
The code which tests for an induction operation cannot assume that any
ADDI instruction will have a register operand because the operand could
also be a frame index; for example:
    %vreg16<def> = ADDI8 <fi#0>, 0; G8RC:%vreg16

llvm-svn: 158205
2012-06-08 15:38:23 +00:00
Hal Finkel 96c2d4d945 Add the PPCCTRLoops pass: a PPC machine-code-level optimization pass to form CTR-based loop branching code.
This pass is derived from the Hexagon HardwareLoops pass. The only significant enhancement over the Hexagon
pass is that PPCCTRLoops will also attempt to delete the replaced add and compare operations if they are
no longer otherwise used. Also, invalid preheader DebugLoc is not used.

llvm-svn: 158204
2012-06-08 15:38:21 +00:00
Duncan Sands 9a5cf92250 Revert commit 158073 while waiting for a fix. The issue is that reassociate
can move instructions within the instruction list.  If the instruction just
happens to be the one the basic block iterator is pointing to, and it is
moved to a different basic block, then we get into an infinite loop due to
the iterator running off the end of the basic block (for some reason this
doesn't fire any assertions).  Original commit message:

Grab-bag of reassociate tweaks.  Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158199
2012-06-08 13:37:30 +00:00
Manman Ren 2cdc8afccf X86: optimize generated code for integer ABS
This patch will generate the following for integer ABS:
      movl    %edi, %eax
      negl    %eax
      cmovll  %edi, %eax
INSTEAD OF
      movl    %edi, %ecx
      sarl    $31, %ecx
      leal    (%rdi,%rcx), %eax
      xorl    %ecx, %eax

There exists a target-independent DAG combine for integer ABS, which converts
integer ABS to sar+add+xor. For X86, we match this pattern back to neg+cmov. 
This is implemented in PerformXorCombine.
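
The two sequences compute the same value, which the following Python model of
32-bit arithmetic illustrates. It is a sketch with hypothetical helper names,
not the backend code, and it simplifies cmovl to "keep x when the negated
value is negative" instead of modelling the flags.

```python
def to_signed32(v):
    # Wrap an arbitrary Python integer to a signed 32-bit value.
    v &= 0xFFFFFFFF
    return v - (1 << 32) if v & 0x80000000 else v


def abs_sar_add_xor(x):
    # Target-independent form: s = x >> 31 (0 or -1), then (x + s) ^ s.
    s = to_signed32(x) >> 31
    return to_signed32((x + s) ^ s)


def abs_neg_cmov(x):
    # X86 form: negate, then keep the original if the negation is negative.
    neg = to_signed32(-x)
    return x if neg < 0 else neg


INT_MIN = -(1 << 31)
INT_MAX = (1 << 31) - 1
samples = [0, 1, -1, 5, -5, 12345, -12345, INT_MAX, INT_MIN]
forms_agree = all(abs_sar_add_xor(x) == abs_neg_cmov(x) for x in samples)
```

Both forms leave INT_MIN unchanged, since its negation is not representable
in 32 bits.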

rdar://10695237

llvm-svn: 158175
2012-06-07 22:39:10 +00:00
Nadav Rotem 4e50efead6 Fix a bug in FoldSelectOpOp. Bitcast ops may change the number of vector elements, which may disagree with the select condition type.
llvm-svn: 158166
2012-06-07 20:28:57 +00:00
Rafael Espindola 55d1145bd5 Use a base register instead of an index register with the local dynamic model.
Fixes pr13048.

llvm-svn: 158158
2012-06-07 18:39:19 +00:00
Meador Inge 161d5bb6f7 Adding a missing -S to the opt invocation.
llvm-svn: 158128
2012-06-07 01:02:13 +00:00
Manman Ren ae02c5a93e X86: replace SUB with CMP if possible
This patch will optimize the following
    movq    %rdi, %rax
    subq    %rsi, %rax
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax
to
    cmpq    %rsi, %rdi
    cmovsq  %rsi, %rdi
    movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 158126
2012-06-07 00:42:47 +00:00
Bill Wendling 3e8cf2be79 Spell optimization name correctly.
llvm-svn: 158123
2012-06-06 23:53:23 +00:00
Manman Ren 9c9641812c Revert r157755.
The commit is intended to fix rdar://11540023.
It is implemented as part of peephole optimization. We can actually implement
this in the SelectionDAG lowering phase.

llvm-svn: 158122
2012-06-06 23:53:03 +00:00
Bill Wendling 618547a9c6 Another testcase for r156548.
<rdar://problem/10889741>

llvm-svn: 158121
2012-06-06 23:36:22 +00:00
Chad Rosier 5d6f01ad77 Add support for dynamic stack realignment in the presence of dynamic allocas on
X86.
rdar://11496434

llvm-svn: 158087
2012-06-06 17:37:40 +00:00
Chad Rosier faa3894628 Fix combine of uno && ord -> false so that the ordering of the fcmps doesn't
matter.
rdar://11579835

llvm-svn: 158084
2012-06-06 17:22:40 +00:00
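The "uno && ord -> false" fold in r158084 above rests on the two predicates being complementary, so their conjunction is false in either operand order. A minimal C++ sketch of that reasoning, with helper names invented for illustration and std::isunordered standing in for the fcmp predicates:

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// fcmp uno: true if either operand is NaN.
inline bool fcmp_uno(double a, double b) { return std::isunordered(a, b); }

// fcmp ord: true if neither operand is NaN.
inline bool fcmp_ord(double a, double b) { return !std::isunordered(a, b); }

// The conjunction in both orders; each is false for every input.
inline bool uno_and_ord(double a, double b) { return fcmp_uno(a, b) && fcmp_ord(a, b); }
inline bool ord_and_uno(double a, double b) { return fcmp_ord(a, b) && fcmp_uno(a, b); }

// Spot-check a few ordered and unordered operand pairs.
inline void check_uno_ord() {
  const double nan = std::numeric_limits<double>::quiet_NaN();
  assert(!uno_and_ord(1.0, 2.0) && !ord_and_uno(1.0, 2.0));
  assert(!uno_and_ord(nan, 2.0) && !ord_and_uno(nan, 2.0));
}
```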
Duncan Sands 763da45e9e Grab-bag of reassociate tweaks. Unify handling of dead instructions and
instructions to reoptimize.  Exploit this to more systematically eliminate
dead instructions (this isn't very useful in practice but is convenient for
analysing some testcase I am working on).  No need for WeakVH any more: use
an AssertingVH instead.

llvm-svn: 158073
2012-06-06 14:53:10 +00:00
Richard Barton f1ef87ddbb Correct decoder for T1 conditional B encoding
llvm-svn: 158055
2012-06-06 09:12:53 +00:00
Chad Rosier 280e5df2ac Remove extraneous CHECK-NOTs from previous commit and add a new test case.
llvm-svn: 158045
2012-06-06 02:12:17 +00:00
Chad Rosier 1de1b54e72 FileCheckize this test.
llvm-svn: 158044
2012-06-06 01:38:32 +00:00
Joel Jones 7f2ac7a2c8 Revert commit r157966
llvm-svn: 157972
2012-06-05 00:47:21 +00:00
Joel Jones d08534f82e This change handles another case for generating the bic instruction
when a compile time constant is known.  This occurs when implicitly zero 
extending function arguments from 16 bits to 32 bits.

<rdar://problem/11481151>

llvm-svn: 157966
2012-06-04 23:38:57 +00:00
Rafael Espindola 47d988c54c When gvn decides to replace an instruction with another, we have to patch the
replacement to make it at least as generic as the instruction being replaced.
This includes:
* dropping nsw/nuw flags
* getting the least restrictive tbaa and fpmath metadata
* merging ranges

Fixes PR12979.

llvm-svn: 157958
2012-06-04 22:44:21 +00:00
Akira Hatanaka 3ee0405231 Add a test case for mips64 unaligned load/store instructions.
llvm-svn: 157939
2012-06-04 17:57:06 +00:00
Akira Hatanaka b964932e70 Rename test/CodeGen/Mips/load-shift-left-right.ll.
llvm-svn: 157938
2012-06-04 17:50:36 +00:00
Roman Divacky e3f15c98d1 Implement local-exec TLS on PowerPC.
llvm-svn: 157935
2012-06-04 17:36:38 +00:00
Nadav Rotem b7bb72e4f3 Remove the "-promote-elements" flag. This flag is now enabled by default.
llvm-svn: 157925
2012-06-04 11:27:21 +00:00
Hal Finkel 595817eebe Enable generating PPC pre-increment (r+imm) instructions by default.
It seems that this no longer causes test suite failures on PPC64 (after r157159),
and often gives a performance benefit, so it can be enabled by default.

llvm-svn: 157911
2012-06-04 02:21:00 +00:00
Craig Topper 79dbb0c6e4 Rename FMA3 feature flag to just FMA to match gcc so it can be added to clang.
llvm-svn: 157903
2012-06-03 18:58:46 +00:00
Craig Topper fd53b80219 Rename fma4 intrinsics to just fma since they are now used for both FMA4 and FMA3. Autoupgrade support coming in a separate commit.
llvm-svn: 157898
2012-06-03 07:26:46 +00:00
Manman Ren 5097e4f38a Revert r157831
llvm-svn: 157896
2012-06-03 03:14:24 +00:00
Craig Topper 29eafea292 Use sse_load_f32/64 for scalar FMA3 intrinsic patterns instead of 128-bit loads to match instruction behavior.
llvm-svn: 157895
2012-06-03 01:40:43 +00:00
Manman Ren be10421c17 ARM: add testing case for struct byval
rdar://9877866

llvm-svn: 157876
2012-06-02 05:37:44 +00:00
Akira Hatanaka 27512b167b Add another test case which tests Mips' unaligned load/store instructions.
llvm-svn: 157874
2012-06-02 01:13:10 +00:00
Akira Hatanaka 63c0e2c58c Fix test cases in test/CodeGen/Mips.
llvm-svn: 157868
2012-06-02 00:05:45 +00:00
Rafael Espindola 103c2cfbbd Use dominates(Instruction, Use) in the verifier.
This removes a bit of context from the verifier errors, but reduces code
duplication in a fairly critical part of LLVM and makes dominates easier to test.

llvm-svn: 157845
2012-06-01 21:56:26 +00:00
Manman Ren 879ca9d47d X86: peephole optimization to remove cmp instruction
This patch will optimize the following:
  sub r1, r3
  cmp r3, r1 or cmp r1, r3
  bge L1
TO
  sub r1, r3
  bge L1 or ble L1

If the branch instruction can use flag from "sub", then we can eliminate
the "cmp" instruction.

llvm-svn: 157831
2012-06-01 19:49:33 +00:00
Rafael Espindola 2c3f63cbda Add some tests checking that the verifier rejects cases where a definition
doesn't dominate a use.

llvm-svn: 157829
2012-06-01 19:24:57 +00:00
Chris Lattner b1359894f3 testcase for PR13006, thanks to Duncan for filing it.
llvm-svn: 157824
2012-06-01 18:19:46 +00:00
Nuno Lopes adf1c859dd BoundsChecking: fix a bug when the handling of recursive PHIs failed and could leave dangling references in the cache
add regression tests for this problem.

Can already compile & run: PHP, PCRE, and ICU  (i.e., all the software I tried)

llvm-svn: 157822
2012-06-01 17:43:31 +00:00
Hans Wennborg 789acfb63d Implement the local-dynamic TLS model for x86 (PR3985)
This implements codegen support for accesses to thread-local variables
using the local-dynamic model, and adds a clean-up pass so that the base
address for the TLS block can be re-used between local-dynamic access on
an execution path.

llvm-svn: 157818
2012-06-01 16:27:21 +00:00
Craig Topper 00649d5111 Remove fadd(fmul) patterns for FMA3. This needs to be implemented by paying attention to FP_CONTRACT and matching @llvm.fma which is not available yet. This will allow us to enable intrinsic use at least though.
llvm-svn: 157804
2012-06-01 06:07:48 +00:00
Chris Lattner 466076b95f enhance the logic for looking through tailcalls to look through transparent casts
in multiple-return value scenarios, like what happens on X86-64 when returning
small structs.

llvm-svn: 157800
2012-06-01 05:29:15 +00:00
Chris Lattner 182fe3eef1 enhance getNoopInput to know about vector<->vector bitcasts of legal
types, as well as int<->ptr casts.  This allows us to tailcall functions
with some trivial casts between the call and return (i.e. because the
return types disagree).

llvm-svn: 157798
2012-06-01 05:16:33 +00:00
Chris Lattner 22afea7689 add some simple 64-bit tail call tests.
llvm-svn: 157797
2012-06-01 05:03:31 +00:00
Chris Lattner 21b1e6bbdc merge some tests.
llvm-svn: 157795
2012-06-01 05:00:54 +00:00
Chris Lattner d82ae12d8c rename test
llvm-svn: 157794
2012-06-01 04:58:50 +00:00
Eric Christopher 1cf3338bb4 Add support for enum forward declarations.
Part of rdar://11570854

llvm-svn: 157786
2012-06-01 00:22:32 +00:00
Nuno Lopes 288e86ff6b add -bounds-checking-multiple-traps option to make one trap BB per check
disabled by default for now; we can discuss the default value (& name) later

llvm-svn: 157777
2012-05-31 22:58:48 +00:00
Nuno Lopes 7d00061d87 revamp BoundsChecking considerably:
 - compute size & offset at the same time. The side-effects of this are that we now support negative GEPs. It's now approaching a phase that it can be reused by other passes (e.g., lowering of the objectsize intrinsic)
 - use APInt throughout to handle wrap-arounds
 - add support for PHI instrumentation
 - add a cache (required for recursive PHIs anyway)
 - remove hoisting support for now, since it was wrong in a few cases

sorry for the churn here.. tests will follow soon.

llvm-svn: 157775
2012-05-31 22:45:40 +00:00
Owen Anderson ff458f89aa Make this testcase independent of register allocation.
llvm-svn: 157761
2012-05-31 18:07:02 +00:00
Manman Ren 9bccb64e56 X86: replace SUB with CMP if possible
This patch will optimize the following
        movq    %rdi, %rax
        subq    %rsi, %rax
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax
to
        cmpq    %rsi, %rdi
        cmovsq  %rsi, %rdi
        movq    %rdi, %rax

Perform this optimization if the actual result of SUB is not used.

rdar: 11540023
llvm-svn: 157755
2012-05-31 17:20:29 +00:00
Rafael Espindola e3c5f3e5b1 Fix typos noticed by Benjamin Kramer.
Also make the checks stronger and test that we reject ranges that overlap
a previous wrapped range.

llvm-svn: 157749
2012-05-31 16:04:26 +00:00
Rafael Espindola 97d7787788 Require intervals in the range metadata to be in a canonical form: They must
be non contiguous, non overlapping and sorted by the lower end.

While this is technically a backward incompatibility, every frontend currently
produces range metadata with a single interval and we don't have any pass
that merges intervals yet, so no existing bitcode files should be rejected by
this.

llvm-svn: 157741
2012-05-31 13:45:46 +00:00
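The canonical form required by r157741 above (non contiguous, non overlapping, sorted by the lower end) can be written down as a small check. This is a simplified sketch, not the verifier's code: it assumes each interval is a half-open [lo, hi) pair of plain integers and treats wrapped intervals as out of scope. The names Interval and is_canonical are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Assumed representation: half-open interval [first, second).
using Interval = std::pair<std::int64_t, std::int64_t>;

// Returns true if the intervals are sorted by lower end and no two of them
// overlap or touch. Wrapped or empty intervals are rejected in this sketch.
inline bool is_canonical(const std::vector<Interval>& ranges) {
  for (std::size_t i = 0; i < ranges.size(); ++i) {
    if (ranges[i].first >= ranges[i].second)
      return false;  // empty or wrapped: not handled here
    // A lower end that is <= the previous upper end means the pair is
    // misordered, overlapping (lo < prev hi), or contiguous (lo == prev hi).
    if (i > 0 && ranges[i].first <= ranges[i - 1].second)
      return false;
  }
  return true;
}
```

A single strict comparison against the previous upper end covers all three conditions at once, which is why no separate sort check is needed.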
Elena Demikhovsky 602f3a26d6 Added FMA3 Intel instructions.
I disabled FMA3 autodetection, since the result may differ from expected for some benchmarks.
I added tests for CodeGen and intrinsics.
I did not change llvm.fma.f32/64 - it may be done later.

llvm-svn: 157737
2012-05-31 09:20:20 +00:00
Duncan Sands 339bb61e32 Enhance the sinking code to handle diamond patterns. Patch by
Carlo Alberto Ferraris.

llvm-svn: 157736
2012-05-31 08:09:49 +00:00
Craig Topper c1ac05dad5 Add intrinsic for pclmulqdq instruction.
llvm-svn: 157731
2012-05-31 04:37:40 +00:00
Akira Hatanaka c13ed945aa Add lit.local.cfg to run the tests in test/MC/Disassembler/Mips.
llvm-svn: 157725
2012-05-31 00:49:56 +00:00
Jakob Stoklund Olesen 05e2245fc6 Prioritize smaller register classes for urgent evictions.
It helps compile exotic inline asm. In the test case, normal GR32
virtual registers use up eax-edx so the final GR32_ABCD live range has
no registers left. Since all the live ranges were tiny, we had no way of
prioritizing the smaller register class.

This patch allows tiny unspillable live ranges to be evicted by tiny
unspillable live ranges from a smaller register class.

<rdar://problem/11542429>

llvm-svn: 157715
2012-05-30 21:46:58 +00:00
Eric Christopher f481ab3877 Add support for the mips inline asm 'm' output modifier.
Patch by Jack Carter.

llvm-svn: 157709
2012-05-30 19:05:19 +00:00
Owen Anderson 0eda3e1de6 Switch the canonical FMA term operand order to match both the comment I wrote and the usual LLVM convention.
llvm-svn: 157708
2012-05-30 18:54:50 +00:00
Owen Anderson c7aaf523e1 Teach DAGCombine to canonicalize the position of a constant in the term operands of an FMA node.
llvm-svn: 157707
2012-05-30 18:50:39 +00:00
Benjamin Kramer 50b26ebb2b Teach SCEV's icmp simplification logic that a-b == 0 is equivalent to a == b.
This also required making recursive simplifications until
nothing changes or a hard limit (currently 3) is hit.

With the simplification in place indvars can canonicalize
loops of the form
for (unsigned i = 0; i < a-b; ++i)
into
for (unsigned i = 0; i != a-b; ++i)
which used to fail because SCEV created a weird umax expr
for the backedge taken count.

llvm-svn: 157701
2012-05-30 18:32:23 +00:00
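The loop canonicalization described in r157701 above is valid because, starting from zero and stepping by one, an unsigned counter first fails "i < a-b" at exactly the point where it first equals a-b. A small C++ sketch that counts iterations of both loop forms (function names are made up for illustration):

```cpp
#include <cassert>
#include <cstdint>

// Iteration count of the original loop form: i < a-b.
inline std::uint32_t trips_less_than(std::uint32_t a, std::uint32_t b) {
  std::uint32_t n = 0;
  for (std::uint32_t i = 0; i < a - b; ++i) ++n;
  return n;
}

// Iteration count of the canonicalized loop form: i != a-b.
inline std::uint32_t trips_not_equal(std::uint32_t a, std::uint32_t b) {
  std::uint32_t n = 0;
  for (std::uint32_t i = 0; i != a - b; ++i) ++n;
  return n;
}

// Spot-check small inputs; both forms run exactly a-b times.
inline void check_trip_counts() {
  assert(trips_less_than(10, 3) == trips_not_equal(10, 3));
  assert(trips_less_than(5, 5) == trips_not_equal(5, 5));
}
```

When b exceeds a the subtraction wraps, and both loops still agree; they simply run for the (very large) wrapped value of a-b.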
Chris Lattner 1622a99e58 it's pointed out that R11 can be used for magic things, and doing things just for 64-bit registers is silly. Just optimize 3 more.
llvm-svn: 157699
2012-05-30 18:08:02 +00:00
Chris Lattner 04d722a68d Extend the (abi-irrelevant) return convention to be able to return more than two values in
integer registers.  This is already supported by the fastcc convention, but it doesn't
hurt to support it in the standard conventions as well.

In cases where we can cheat at the calling convention, this allows us to avoid returning
things through memory in more cases.

llvm-svn: 157698
2012-05-30 17:50:14 +00:00
Chad Rosier 820d248c4d [arm-fast-isel] Add support for the llvm.frameaddress() intrinsic.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 157696
2012-05-30 17:23:22 +00:00
Kostya Serebryany 9024160439 [asan] instrument cmpxchg and atomicrmw
llvm-svn: 157683
2012-05-30 09:04:06 +00:00
Andrew Trick a3f9043196 SCEV: Handle a corner case reducing AddRecExpr * AddRecExpr
If integer overflow causes one of the terms to reach zero, that can
force the entire expression to zero.

Fixes PR12929: cast<Ty>() argument of incompatible type

llvm-svn: 157673
2012-05-30 03:35:20 +00:00
Evan Cheng bc2453dd3d Teach taildup to update livein set. rdar://11538365
llvm-svn: 157663
2012-05-30 00:42:39 +00:00
Bob Wilson 33e5188c27 Add an insertPass API to TargetPassConfig. <rdar://problem/11498613>
Besides adding the new insertPass function, this patch uses it to
enhance the existing -print-machineinstrs so that the MachineInstrs
after a specific pass can be printed.

Patch by Bin Zeng!

llvm-svn: 157655
2012-05-30 00:17:12 +00:00
Benjamin Kramer ef479ea854 Add intrinsics, code gen, assembler and disassembler support for the SSE4a extrq and insertq instructions.
This required light surgery on the assembler and disassembler
because the instructions use an uncommon encoding. They are
the only two instructions in x86 that use register operands
and two immediates.

llvm-svn: 157634
2012-05-29 19:05:25 +00:00
Peter Collingbourne 913869be45 Add llvm.fabs intrinsic.
llvm-svn: 157594
2012-05-28 21:48:37 +00:00
Benjamin Kramer b8743a9150 InstCombine: Fix infinite loop when encountering switch on trivial icmp.
The test case feeds the following into InstCombine's visitSelect:
%tobool8 = icmp ne i32 0, 0
%phitmp = select i1 %tobool8, i32 3, i32 0
Then instcombine replaces the right side of the switch with 0, doesn't notice
that nothing changes and tries again indefinitely.

This fixes PR12897.

llvm-svn: 157587
2012-05-28 19:18:16 +00:00
Meador Inge e17b69a373 PR12696: Attribute bits above 1<<30 are not encoded in bitcode
Attribute bits above 1<<30 are now encoded correctly.  Additionally,
the encoding/decoding functionality has been hoisted to helper functions
in Attributes.h in an effort to help the encoding/decoding to stay in
sync with the Attribute bitcode definitions.

llvm-svn: 157581
2012-05-28 15:45:43 +00:00
Chris Lattner ff9e08baf9 rdar://11542750 - llvm.trap should be marked no return.
llvm-svn: 157551
2012-05-27 23:20:41 +00:00
Benjamin Kramer 152f106e5f PR12967: Don't crash when trying to fold a shift that's larger than the type's size.
llvm-svn: 157548
2012-05-27 22:03:32 +00:00
Chris Lattner f7f59b15aa These tests used intrinsics with the wrong prototype. They weren't caught because
the old verifier just checked that something "was a pointer", but not that the pointee
was correct.

llvm-svn: 157544
2012-05-27 19:35:41 +00:00
Chris Lattner 4cca620c18 remove two (useless) tests that use incorrect intrinsic prototypes, detected by the new intrinsic verifier.
llvm-svn: 157543
2012-05-27 19:31:00 +00:00
Peter Collingbourne 4d358b55fa Have getOrCreateSubprogramDIE store the DIE for a subprogram
definition in the map before calling itself to retrieve the
DIE for the declaration.  Without this change, if this causes
getOrCreateSubprogramDIE to be recursively called on the definition,
it will create multiple DIEs for that definition.  Fixes PR12831.

llvm-svn: 157541
2012-05-27 18:36:44 +00:00
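The fix in r157541 above follows a general pattern: when a get-or-create routine can re-enter itself for the same key, the new object has to be registered in the map before the recursive call. The sketch below illustrates that pattern only; Die, DieTable and the "related" table are hypothetical stand-ins, not the DwarfDebug data structures.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>
#include <vector>

struct Die { int id; };

class DieTable {
public:
  // related[id] lists the entities that must be resolved while building id
  // (a hypothetical stand-in for "the declaration of this definition").
  explicit DieTable(std::map<int, std::vector<int>> related)
      : related_(std::move(related)) {}

  Die* getOrCreate(int id) {
    auto it = dies_.find(id);
    if (it != dies_.end()) return it->second.get();
    // Register the new entry first, so that a recursive request for the
    // same id finds it instead of creating a second object.
    Die* die = (dies_[id] = std::make_unique<Die>(Die{id})).get();
    ++created_;
    for (int other : related_[id]) getOrCreate(other);
    return die;
  }

  int created() const { return created_; }

private:
  std::map<int, std::vector<int>> related_;
  std::map<int, std::unique_ptr<Die>> dies_;
  int created_ = 0;
};
```

With entities 1 and 2 referring to each other, a call for 1 creates exactly two objects. If the map insertion were moved after the loop, the cycle would never hit the early return.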
Benjamin Kramer f2beccf6b4 SelectionDAGBuilder: When emitting small compare chains for switches order them by using edge weights.
SimplifyCFG tends to form a lot of 2-3 case switches when merging branches. Move
the most likely condition to the front so it is checked first and the others can
be skipped. This is currently not as effective as it could be because SimplifyCFG
destroys profiling metadata when merging branches and switches. Merging branch
weight metadata is tricky though.

This code touches at most 3 cases so I didn't use a proper sorting algorithm.

llvm-svn: 157521
2012-05-26 20:01:32 +00:00
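Because the code in r157521 above touches at most three cases, ordering them by weight needs only a fixed sequence of compare-and-swap steps rather than a general sort. A sketch of that idea, assuming a hypothetical Case record with a value and an edge weight (the commit's actual data structures are not shown here):

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical switch case: the compared value and its edge weight.
struct Case {
  int value;
  std::uint32_t weight;
};

// Orders three cases by descending weight using three compare-and-swap
// steps, so the most likely case is checked first.
inline void order_by_weight(std::array<Case, 3>& c) {
  if (c[0].weight < c[1].weight) std::swap(c[0], c[1]);
  if (c[1].weight < c[2].weight) std::swap(c[1], c[2]);
  if (c[0].weight < c[1].weight) std::swap(c[0], c[1]);
}

// Spot-check with ascending weights, the worst case for this network.
inline void check_order_by_weight() {
  std::array<Case, 3> c = {{{1, 1}, {2, 2}, {3, 3}}};
  order_by_weight(c);
  assert(c[0].value == 3 && c[1].value == 2 && c[2].value == 1);
}
```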
Duncan Sands 3c05cd3ea8 Since commit 157467, if reassociate isn't actually going to change an expression
then it doesn't alter the instructions composing it, however it would continue
to move the instructions to just before the expression root.  Ensure it doesn't
move them either, so now it really does nothing if there is nothing to do.  That
commit also ensured that nsw etc flags weren't cleared if the expression was not
being changed.  Tweak this a bit so that it doesn't clear flags on the initial
part of a computation either if that part didn't change but later bits did.

llvm-svn: 157518
2012-05-26 16:42:52 +00:00
Nuno Lopes e9b0bdf804 bounds checking: add support for byval arguments
llvm-svn: 157498
2012-05-25 21:15:17 +00:00
Justin Holewinski c98041d4d9 [NVPTX] Add a new test case for the newly-enabled call handling
NV_CONTRIB

llvm-svn: 157485
2012-05-25 17:20:38 +00:00
Nuno Lopes a6da3ff896 boundschecking:
add support for select
add experimental support for alloc_size metadata

llvm-svn: 157481
2012-05-25 16:54:04 +00:00
NAKAMURA Takumi 3eca973bf8 test/CodeGen/X86/bigstructret.ll: Suppress one test. It is msvc-incompatible. (compatible to mingw32 and netbsd, though)
llvm-svn: 157474
2012-05-25 15:40:54 +00:00
NAKAMURA Takumi 501dbd06ae test/CodeGen/X86/bigstructret.ll: Relax stack offsets for hosts of stack-align=8, eg. win32 and netbsd.
llvm-svn: 157471
2012-05-25 15:12:21 +00:00
Duncan Sands bddfb2f96b Make the reassociation pass more powerful so that it can handle expressions
with arbitrary topologies (previously it would give up when hitting a diamond
in the use graph for example).  The testcase from PR12764 is now reduced from
a pile of additions to the optimal 1617*%x0+208.  In doing this I changed the
previous strategy of dropping all uses for expression leaves to one of dropping
all but one use.  This works out more neatly (but required a bunch of tweaks)
and is also safer: some recently fixed bugs during recursive linearization were
because the linearization code thinks it completely owns a node if it has no uses
outside the expression it is linearizing.  But if the node was also in another
expression that had been linearized (and thus all uses of the node from that
expression dropped) then the conclusion that it is completely owned by the
expression currently being linearized is wrong.  Keeping one use from within each
linearized expression avoids this kind of mistake.

llvm-svn: 157467
2012-05-25 12:03:02 +00:00
Eli Friedman 315a0c79f3 Simplify code for calling a function where CanLowerReturn fails, fixing a small bug in the process.
llvm-svn: 157446
2012-05-25 00:09:29 +00:00
Jakob Stoklund Olesen 36a5c8e550 Add support for range expressions in TableGen foreach loops.
Like this:

  foreach i = 0-127 in ...

Use braces for composite ranges:

  foreach i = {0-3,9-7} in ...

llvm-svn: 157432
2012-05-24 22:17:39 +00:00
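To make the range syntax in r157432 above concrete, the sketch below expands a list of range pieces into the values a loop would visit. It is an illustration rather than TableGen's parser: pieces are given as already-parsed inclusive (first, last) pairs, and it assumes that a piece such as 9-7, whose first bound exceeds its last, counts downward. The name expand_ranges is made up.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Expands inclusive (first, last) pieces in order. A piece with
// first > last is assumed to count down, e.g. (9, 7) gives 9, 8, 7.
inline std::vector<int> expand_ranges(
    const std::vector<std::pair<int, int>>& pieces) {
  std::vector<int> out;
  for (const auto& p : pieces) {
    if (p.first <= p.second) {
      for (int i = p.first; i <= p.second; ++i) out.push_back(i);
    } else {
      for (int i = p.first; i >= p.second; --i) out.push_back(i);
    }
  }
  return out;
}
```

Under that assumption, the composite range {0-3,9-7} expands to 0, 1, 2, 3, 9, 8, 7.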
Jakob Stoklund Olesen 74fd80e8fc Don't put TGParser scratch results in the output.
Only fully expanded Records should go into RecordKeeper.

llvm-svn: 157431
2012-05-24 22:17:36 +00:00
David Blaikie c575c80c3b Fix for CHECK-NOT misspelling.
Patch by Nicklas Bo Jensen.

llvm-svn: 157421
2012-05-24 22:08:29 +00:00
Justin Holewinski 907f7606f2 Remove the PTX back-end and all of its artifacts (triple, etc.)
This back-end was deprecated in favor of the NVPTX back-end.

NV_CONTRIB

llvm-svn: 157417
2012-05-24 21:38:21 +00:00
Owen Anderson 921082b883 Teach tblgen's set theory "sequence" operator to support an optional stride operand.
llvm-svn: 157416
2012-05-24 21:37:08 +00:00
Akira Hatanaka a649cc75b3 Turn on mips16 pseudo op when compiling for mips16.
Expand test case for this.

Patch by Reed Kotler.

llvm-svn: 157410
2012-05-24 18:37:43 +00:00
Akira Hatanaka df98a7a34d Enable Mips16 compiler to compile a null program.
First code from the Mips16 compiler. Includes trivial test program.

Patch by Reed Kotler.

llvm-svn: 157408
2012-05-24 18:32:33 +00:00
Tobias Grosser 6b31d170a4 Add half support to LLVM (for OpenCL)
Submitted by: Anton Lokhmotov  <Anton.Lokhmotov@arm.com>

Approved by: o Anton Korobeynikov
             o Micah Villmow
             o David Neto

llvm-svn: 157393
2012-05-24 15:59:06 +00:00
Stepan Dyatkovskiy 183d18aa5a PR1255 related changes (case ranges):
LowerSwitch::Clusterify : main functionality was replaced with CRSBuilder::optimize, so a large part of Clusterify's code was removed.
test/Transform/LowerSwitch/feature.ll - this test was refactored: grep + count was replaced with FileCheck usage.

llvm-svn: 157384
2012-05-24 09:33:20 +00:00
Jakob Stoklund Olesen 41ebcda8f4 Add a test case for global live range splitting.
llvm-svn: 157357
2012-05-23 23:42:23 +00:00
Jakob Stoklund Olesen 0ce90494e6 Add a last resort tryInstructionSplit() to RAGreedy.
Live ranges with a constrained register class may benefit from splitting
around individual uses. It allows the remaining live range to use a
larger register class where it may allocate. This is like spilling to a
different register class.

This is only attempted on constrained register classes.

<rdar://problem/11438902>

llvm-svn: 157354
2012-05-23 22:37:27 +00:00
Kaelyn Uhrain 4dbe0cd6dc Fix typo in flag to opt, and also a CHECK-NEXT that doesn't follow a
CHECK. The latter error was hidden by the former, and the test harness
used by e.g. "make check" silently ignored that opt was printing an
error message about an unknown flag instead of running on the test file.

llvm-svn: 157341
2012-05-23 20:21:36 +00:00
Jakob Stoklund Olesen 5b8f476037 Correctly deal with identity copies in RegisterCoalescer.
Now that the coalescer keeps live intervals and machine code in sync at
all times, it needs to deal with identity copies differently.

When merging two virtual registers, all identity copies are removed
right away. This means that other identity copies must come from
somewhere else, and they are going to have a value number.

Deal with such copies by merging the value numbers before erasing the
copy instruction. Otherwise, we leave dangling value numbers in the live
interval.

This fixes PR12927.

llvm-svn: 157340
2012-05-23 20:21:06 +00:00
Chad Rosier 223faf719c [arm-fast-isel] Add support for non-global callee.
Patch by Jush Lu <jush.msn@gmail.com>.

llvm-svn: 157336
2012-05-23 18:38:57 +00:00
Nuno Lopes 10287d839f BoundsChecking: add a couple of simple tests and fix a bug in branch emission
llvm-svn: 157329
2012-05-23 16:24:52 +00:00
Patrik Hägglund 8a1e316c15 Fix the inliner so that the optsize function attribute doesn't alter the
inline threshold if the global inline threshold is lower (as for -Oz).

Reviewed by Chandler Carruth and Bill Wendling.

llvm-svn: 157323
2012-05-23 13:42:57 +00:00
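The rule stated in r157323 above amounts to letting optsize lower the inline threshold but never raise it. A minimal sketch of that rule; the function name, its parameters and any concrete threshold values passed to it are hypothetical and are not taken from the inliner's source:

```cpp
#include <algorithm>
#include <cassert>

// Returns the threshold to use for one function. The optsize attribute may
// only lower the threshold: if the global threshold is already smaller
// (as with -Oz), it is left alone.
inline int effective_inline_threshold(int global_threshold,
                                      int optsize_threshold,
                                      bool has_optsize) {
  if (!has_optsize) return global_threshold;
  return std::min(global_threshold, optsize_threshold);
}

// Spot-check the three situations with made-up numbers.
inline void check_inline_threshold() {
  assert(effective_inline_threshold(200, 50, true) == 50);    // optsize lowers it
  assert(effective_inline_threshold(20, 50, true) == 20);     // lower global wins
  assert(effective_inline_threshold(200, 50, false) == 200);  // no attribute
}
```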
Eric Christopher c49643586b Add support for C++11 enum classes in llvm.
Part of rdar://11496790

llvm-svn: 157303
2012-05-23 00:09:20 +00:00
Andrew Trick a7a3de1bcf LSR fix: add a missing phi check during IV hoisting.
Fixes PR12898: SCEVExpander crash.

llvm-svn: 157263
2012-05-22 17:39:59 +00:00
Nuno Lopes ad40c0a425 revert my previous patches that introduced an additional parameter to the objectsize intrinsic.
After a lot of discussion, we realized it's not the best option for run-time bounds checking

llvm-svn: 157255
2012-05-22 15:25:31 +00:00
Jakob Stoklund Olesen 924279ca0e Only erase virtregs with no uses left.
Also make sure registers aren't erased twice if the dead def mentions
the register twice.

This fixes PR12911.

llvm-svn: 157254
2012-05-22 14:52:12 +00:00
Duncan Sands 4df5e96d3a Fix PR12858, a crash due to GVN's PRE not fully removing an instruction from the
leader table.  That's because it wasn't expecting instructions to turn up as
leader for a value number that is not its own, but equality propagation could
create this situation.  One solution is to have the leader table use a WeakVH
but this slows down GVN by about 5%.  Instead just have equality propagation not
add instructions to the leader table, only constants and arguments.  In theory
this might cause GVN to run more (each time it changes something it runs again)
but it doesn't seem to occur enough to cause a slow down.

llvm-svn: 157251
2012-05-22 14:17:53 +00:00
Jim Grosbach da04fa0d02 FileCheck'ize test, and add a bit to test for r157221.
llvm-svn: 157222
2012-05-21 23:50:00 +00:00
Craig Topper e88f2fd4f7 Allow 256-bit shuffles to still be split even if only half of the shuffle comes from two 128-bit pieces.
llvm-svn: 157175
2012-05-21 06:40:16 +00:00
Peter Collingbourne 8eb05fd093 When legalising shifts, do not pre-build a list of operands which
may be RAUW'd by the recursive call to LegalizeOps; instead, retrieve
the other operands when calling UpdateNodeOperands.  Fixes PR12889.

llvm-svn: 157162
2012-05-20 18:36:15 +00:00
Hal Finkel 601f555eee Add a missing PPC 64-bit stwu pattern.
This seems to fix the remaining compile-time failures on PPC64 when
compiling with -enable-ppc-preinc.

llvm-svn: 157159
2012-05-20 17:11:24 +00:00
Jakob Stoklund Olesen 691ae3388f Use the right register class for LDRrs.
llvm-svn: 157152
2012-05-20 06:38:47 +00:00
Jakob Stoklund Olesen 4fd0e4f415 Transfer memory operands to the right instruction.
They need to go on the PICLDR as the verifier points out.

llvm-svn: 157151
2012-05-20 06:38:42 +00:00
Jakob Stoklund Olesen 1f1c6add10 Properly constrain register classes for sub-registers.
Not all GR64 registers have sub_8bit sub-registers.

llvm-svn: 157150
2012-05-20 06:38:37 +00:00
Jakob Stoklund Olesen a103a516c6 Properly constrain register classes in 2-addr.
X86 has 2-addr instructions with different constraints on the tied def
and use operands. One is GR32, one is GR32_NOSP.

llvm-svn: 157149
2012-05-20 06:38:32 +00:00
Peter Collingbourne 9a03c73297 Do not pass an invalid domtree to SimplifyInstruction from
LoopUnswitch.  Fixes PR12887.

llvm-svn: 157140
2012-05-20 01:32:09 +00:00
Jakob Stoklund Olesen a34a69ce0c Fix 12892.
Dead code elimination during coalescing could cause a virtual register
to be split into connected components. The following rewriting would be
confused about the already joined copies present in the code, but
without a corresponding value number in the live range.

Erase all joined copies instantly when joining intervals such that the
MI and LiveInterval representations are always in sync.

llvm-svn: 157135
2012-05-19 23:34:59 +00:00
Peter Collingbourne 97b1076435 Do not eliminate allocas whose alignment exceeds that of the
copied-in constant, as a subsequent user may rely on over alignment.
Fixes PR12885.

llvm-svn: 157134
2012-05-19 22:52:10 +00:00
Jakob Stoklund Olesen 25ced18407 Erase joined copies immediately.
The late dead code elimination is no longer necessary.

The test changes are caused by a register hint that can be either %rdi or
%rax. The choice depends on the use list order, which this patch changes.

llvm-svn: 157131
2012-05-19 20:54:07 +00:00
Nadav Rotem c93e91da27 On Haswell, prefer storing YMM registers using a single instruction.
llvm-svn: 157129
2012-05-19 20:30:08 +00:00
Nadav Rotem 900c7cb7ce Add support for additional in-reg vbroadcast patterns
llvm-svn: 157127
2012-05-19 19:57:37 +00:00
Eric Christopher b5cf66cda2 Actually support DW_TAG_rvalue_reference_type that we were trying
to generate out of the front end.

rdar://11479676

llvm-svn: 157094
2012-05-19 01:36:37 +00:00
Eric Christopher bc5d24999c Add support for the 'd' mips inline asm output modifier.
Patch by Jack Carter.

llvm-svn: 157093
2012-05-19 00:51:56 +00:00
Andrew Trick 7fa4e0fea6 SCEV: Add MarkPendingLoopPredicates to avoid recursive isImpliedCond.
getUDivExpr attempts to simplify by checking for overflow.
isLoopEntryGuardedByCond then evaluates the loop predicate which
may lead to the same getUDivExpr causing endless recursion.

Fixes PR12868: clang 3.2 segmentation fault.

llvm-svn: 157092
2012-05-19 00:48:25 +00:00
Dan Gohman 14862c3141 Fix replacing all the users of objc weak runtime routines
when deleting them. rdar://11434915.

llvm-svn: 157080
2012-05-18 22:17:29 +00:00
Nuno Lopes ac59380dfd allow LazyValueInfo::getEdgeValue() to reason about multiple edges from the same switch instruction by doing union of ranges (which may still be conservative, but it's more aggressive than before)
llvm-svn: 157071
2012-05-18 21:02:10 +00:00
Jim Grosbach 4b63d2ae1d Refactor data-in-code annotations.
Use a dedicated MachO load command to annotate data-in-code regions.
This is the same format the linker produces for final executable images,
allowing consistency of representation and use of introspection tools
for both object and executable files.

Data-in-code regions are annotated via ".data_region"/".end_data_region"
directive pairs, with an optional region type.

data_region_directive := ".data_region" { region_type }
region_type := "jt8" | "jt16" | "jt32" | "jta32"
end_data_region_directive := ".end_data_region"

The previous handling of ARM-style "$d.*" labels was broken and has
been removed. Specifically, it didn't handle ARM vs. Thumb mode when
marking the end of the section.

rdar://11459456

llvm-svn: 157062
2012-05-18 19:12:01 +00:00
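The directive grammar in r157062 above allows an optional region type drawn from four fixed names. A sketch of a validator for that one production, with the function name made up for illustration; it is not the assembler's parsing code, and an empty string stands for an omitted type:

```cpp
#include <array>
#include <cassert>
#include <string>

// Accepts the region_type alternatives listed in the grammar.
// An empty string models the case where the optional type is omitted.
inline bool is_valid_region_type(const std::string& type) {
  static const std::array<const char*, 4> kTypes = {
      {"jt8", "jt16", "jt32", "jta32"}};
  if (type.empty()) return true;
  for (const char* t : kTypes) {
    if (type == t) return true;
  }
  return false;
}

// Spot-check one accepted and one rejected name.
inline void check_region_types() {
  assert(is_valid_region_type("jt16"));
  assert(!is_valid_region_type("jt64"));
}
```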
Nuno Lopes b63d6cdf79 add test case for bugfix in r157032
llvm-svn: 157058
2012-05-18 17:44:58 +00:00
Eric Christopher 9ca26cfb5f Add support for the mips 'x' inline asm modifier.
Patch by Jack Carter.

llvm-svn: 157057
2012-05-18 17:39:35 +00:00
Joel Jones f1c120e9ef FileCheck-ify, apropos of nothing
llvm-svn: 157051
2012-05-18 16:24:01 +00:00
Craig Topper 92db928ee9 Simplify handling of v16i8 shuffles and fix a missed optimization.
llvm-svn: 157043
2012-05-18 06:42:06 +00:00
Evan Cheng 22d405f57b Teach two-address pass to update the "source" map so it doesn't perform a
non-profitable commute using outdated info. The test case would still fail
because of poor pre-RA schedule. That will be fixed by MI scheduler.

rdar://11472010

llvm-svn: 157038
2012-05-18 01:33:51 +00:00
Danil Malyshev cd492b0a98 Temporarily disabled the MCJIT tests for Darwin, because RuntimeDyldMachO has problems with relocations for 32-bit x86.
llvm-svn: 157035
2012-05-18 00:30:58 +00:00
Kevin Enderby badd100c26 Fixed a bug in llvm-objdump when disassembling using -macho option for a binary
containing no symbols.  Fixed the crash and fixed it not disassembling anything.

llvm-svn: 157031
2012-05-18 00:13:56 +00:00
Jakob Stoklund Olesen 874e401382 Remove a test that was only testing for physreg joining.
This is the same as the other tests: Clever tricks are required to make
the arguments and return value line up in a single-instruction function.
It rarely happens in real life.

We have plenty other examples of this behavior.

llvm-svn: 157030
2012-05-18 00:07:14 +00:00
Jakob Stoklund Olesen 589c6eb95c Remove -join-physregs from the test suite.
This option has been disabled for a while, and it is going away so I can
clean up the coalescer code.

The tests that required physreg joining to be enabled were almost all of
the form "tiny function with interference between arguments and return
value". Such functions are usually inlined in the real world.

The problem exposed by phys_subreg_coalesce-3.ll is real, but fairly
rare.

llvm-svn: 157027
2012-05-17 23:44:19 +00:00
Kevin Enderby f1b225d0e0 Fix the encoding of the armv7m (MClass) for MSR APSR writes which was missing
the 0b10 mask encoding bits.  Make MSR APSR writes without a _<bits> qualifier
an alias for MSR APSR_nzcvq even though ARM has deprecated its use.  Also add
support for suffixes (_nzcvq, _g, _nzcvqg) for APSR versions.  Some FIXMEs in
the code for better error checking when versions shouldn't be used.
rdar://11457025

llvm-svn: 157019
2012-05-17 22:18:01 +00:00
Danil Malyshev 7c5db45350 - Added ExecutionEngine/MCJIT tests
- Added HOST_ARCH to Makefile.config.in
HOST_ARCH will be used by the MCJIT test filter, because MCJIT currently supports only the x86 and ARM architectures.

llvm-svn: 157015
2012-05-17 21:07:47 +00:00
Tim Northover af501a29d3 Remove incorrect pattern for ARM SMML instruction.
Patch by Meador Inge.

llvm-svn: 156989
2012-05-17 13:12:13 +00:00
Chandler Carruth d8c08c2111 Teach the 'opt' tool about '-Os' and '-Oz', corresponding to the Clang
options, to enable easier testing of the innards of LLVM that are
enabled by such optimization strategies.

Note that this doesn't provide the (much needed) function attribute
support for -Oz (as opposed to -Os), but still seems like a positive
step to better test the logic that Clang currently relies on.

Patch by Patrik Hägglund.

llvm-svn: 156913
2012-05-16 08:32:49 +00:00
Evan Cheng 58a95f0c8a Avoid creating a cycle when folding load / op with flag / store. PR11451474. rdar://11451474
llvm-svn: 156896
2012-05-16 01:54:27 +00:00
Jakob Stoklund Olesen 984997b3a0 Enable sub-sub-register copy coalescing.
It is now possible to coalesce weird skewed sub-register copies by
picking a super-register class larger than both original registers. The
included test case produces code like this:

  vld2.32 {d16, d17, d18, d19}, [r0]!
  vst2.32 {d18, d19, d20, d21}, [r0]

We still perform interference checking as if it were a normal full copy
join, so this is still quite conservative. In particular, the f1 and f2
functions in the included test case still have remaining copies because
of false interference.

llvm-svn: 156878
2012-05-15 23:31:35 +00:00
Kevin Enderby a414bcc0e3 Add a test case for r156840, a fix to llvm-objdump when disassembling using
-macho to disassemble the last symbol to the end of the section.

llvm-svn: 156850
2012-05-15 20:20:50 +00:00
Sirish Pande 91856a1f15 Enable all Hexagon tests.
llvm-svn: 156824
2012-05-15 16:13:12 +00:00
David Majnemer a9330fe553 Teach SimplifyLibCalls about stpcpy.
llvm-svn: 156815
2012-05-15 11:46:21 +00:00
Jakob Stoklund Olesen dc2e0cd44a Fix PR12821.
RAFast must add an <imp-def> operand when it is rewriting a sub-register
def that isn't a read-modify-write.

llvm-svn: 156777
2012-05-14 21:10:25 +00:00
Chad Rosier a968caf8e0 Move the capture analysis from MemoryDependencyAnalysis to a more general place
so that it can be reused in MemCpyOptimizer.  This analysis is needed to remove
an unnecessary memcpy when returning a struct into a local variable.
rdar://11341081
PR12686

llvm-svn: 156776
2012-05-14 20:35:04 +00:00
Brendon Cahoon f6b687e5d1 Revert 156634 upon request until code improvement changes are made.
llvm-svn: 156775
2012-05-14 19:35:42 +00:00
Dan Gohman 164fe18cfe Rename @llvm.debugger to @llvm.debugtrap.
llvm-svn: 156774
2012-05-14 18:58:10 +00:00
Rafael Espindola 47b7dac220 Add support for the .rept directive. Patch by Vladmir Sorokin. I added support
for nesting.

llvm-svn: 156714
2012-05-12 16:31:10 +00:00
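
The nesting behavior mentioned in the .rept commit above can be illustrated with a toy expander. This is a minimal sketch, not the LLVM AsmParser code: the function name `expand_rept` is hypothetical, and it assumes one directive per line with `.endr` closing each `.rept` block, as in GNU as.

```python
def expand_rept(lines):
    """Expand .rept/.endr blocks, including nested ones, into a flat list.

    Toy model only: assumes one statement per line and that each
    '.rept <count>' is closed by a matching '.endr'.
    """
    out = []
    i = 0
    while i < len(lines):
        parts = lines[i].split()
        if parts and parts[0] == ".rept":
            count = int(parts[1], 0)
            # Scan forward for the matching .endr, tracking nesting depth.
            depth = 1
            j = i + 1
            while j < len(lines):
                head = lines[j].split()[:1]
                if head == [".rept"]:
                    depth += 1
                elif head == [".endr"]:
                    depth -= 1
                    if depth == 0:
                        break
                j += 1
            if depth != 0:
                raise ValueError("unmatched .rept")
            # Expand inner blocks first, then repeat the expanded body.
            body = expand_rept(lines[i + 1:j])
            out.extend(body * count)
            i = j + 1
        elif parts and parts[0] == ".endr":
            raise ValueError("unmatched .endr")
        else:
            out.append(lines[i])
            i += 1
    return out


if __name__ == "__main__":
    source = [".rept 2", "nop", ".rept 3", ".long 0", ".endr", ".endr", "ret"]
    for line in expand_rept(source):
        print(line)
```

Expanding the inner block before repeating the outer one keeps the recursion simple; a real assembler would work on its token stream rather than on whole lines.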
Benjamin Kramer 6bee7f750d ELF: Add support for the asm .version directive.
llvm-svn: 156712
2012-05-12 14:30:47 +00:00
Benjamin Kramer 95d31bcba5 AsmParser: Add support for the .purgem directive.
Based on a patch by PaX Team.

llvm-svn: 156709
2012-05-12 11:21:46 +00:00
Benjamin Kramer 66b8d4d28f AsmParser: ignore the .extern directive.
llvm-svn: 156707
2012-05-12 11:18:59 +00:00
Benjamin Kramer e297b9f506 AsmParser: Add support for .ifc and .ifnc directives.
Based on a patch from PaX Team.

llvm-svn: 156706
2012-05-12 11:18:51 +00:00
Benjamin Kramer 62c18b0881 AsmParser: Add support for .ifb and .ifnb directives.
Based on a patch from PaX Team.

llvm-svn: 156705
2012-05-12 11:18:42 +00:00
Stepan Dyatkovskiy 0beab5e1cd Recommitted r156374 with critical fixes in BitcodeReader/Writer:
Ordinary patch for PR1255.
Added new case-range-oriented methods for adding/removing cases in SwitchInst. After this patch, cases are represented internally as ConstantArrays instead of ConstantInts; externally, cases are wrapped in the ConstantRangesSet object.
The old SwitchInst methods still work, but are marked as deprecated. So at this stage there are no side effects, except that I added support for case ranges in BitcodeReader/Writer; a Bitcode test is also added. The old "switch" format is still supported.

llvm-svn: 156704
2012-05-12 10:48:17 +00:00
Jay Foad ca0c499609 Teach Function::hasAddressTaken that BlockAddress doesn't really take
the address of a function.

llvm-svn: 156703
2012-05-12 08:30:16 +00:00
Sirish Pande 4bd20c50eb Support for Hexagon feature, New Value Jump.
llvm-svn: 156698
2012-05-12 05:10:30 +00:00
Akira Hatanaka 763ab85690 Fix test cases.
llvm-svn: 156697
2012-05-12 03:25:16 +00:00
Akira Hatanaka 8f3573034b Make the following changes in MipsAsmPrinter.cpp:
- Remove code which lowers pseudo SETGP01.
- Fix LowerSETGP01. The first two of the three instructions that are emitted to
  initialize the global pointer register now use register $2.
- Stop emitting .cpload directive.

llvm-svn: 156689
2012-05-12 00:48:43 +00:00
Akira Hatanaka d918f77ba3 Insert instructions into the entry basic block which initialize the global
pointer register.

This is the first of a series of patches which clean up the way the global
pointer register is used. The patches will make the following improvements:

- Make $gp an allocatable temporary register rather than reserving it.
- Use a virtual register as the global pointer register and let the register
  allocator decide which register to assign to it or whether spill/reloads are
  needed.
- Make sure $gp is valid at the entry of a called function, which is necessary
  for functions using lazy binding.
- Remove the need for emitting .cprestore and .cpload directives.

llvm-svn: 156671
2012-05-12 00:17:17 +00:00
Akira Hatanaka 0661b81bca Do not replace operands of pseudo instructions with register $zero.
llvm-svn: 156663
2012-05-11 23:22:18 +00:00
Akira Hatanaka 5d60c36f37 Use regular expression to match register names.
llvm-svn: 156656
2012-05-11 23:00:40 +00:00
Chad Rosier aa9cb9df59 [fast-isel] Add support for selecting @llvm.trap().
llvm-svn: 156646
2012-05-11 21:33:49 +00:00
Brendon Cahoon 31f8723ef3 Hexagon constant extender support.
Patch by Jyotsna Verma.

llvm-svn: 156634
2012-05-11 19:56:59 +00:00
Chad Rosier 3268692aa8 [fast-isel] Remove -disable-arm-fast-isel option. -fast-isel=0 suffices. Minor cleanup.
llvm-svn: 156632
2012-05-11 19:40:25 +00:00
Chad Rosier 90f9afe659 [fast-isel] Cleaner fix for when we're unable to handle a non-double multi-reg
retval.  Hoists check before emitting the call to avoid unnecessary work.
rdar://11430407
PR12796

llvm-svn: 156628
2012-05-11 18:51:55 +00:00
Nuno Lopes e2cfd3ce95 objectsize: add a few more tests and fix a bug
llvm-svn: 156625
2012-05-11 18:25:29 +00:00
Hans Wennborg addad7388d Fix test/CodeGen/X86/tls-pie.ll.
llvm-svn: 156612
2012-05-11 10:19:54 +00:00
Hans Wennborg f9d0e44b82 Implement initial-exec TLS model for 32-bit PIC x86
This fixes a TODO from 2007 :) Previously, LLVM would emit the wrong
code here (see the update to test/CodeGen/X86/tls-pie.ll).

llvm-svn: 156611
2012-05-11 10:11:01 +00:00
Silviu Baranga ddc67a7655 Added the missing bit definition for the 4th bit of the STR (post reg) instruction. It is now set to 0. The patch also sets the unpredictable mask for SEL and SXTB-type instructions.
llvm-svn: 156609
2012-05-11 09:28:27 +00:00
Silviu Baranga 5a719f9b9a Fixed the LLVM ARM v7 assembler and instruction printer for 8-bit immediate offset addressing. The assembler and instruction printer were not properly handling the #-0 immediate.
llvm-svn: 156608
2012-05-11 09:10:54 +00:00
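
Why #-0 needs special handling, as in the commit above: ARM offset addressing stores the offset as a magnitude plus a separate add/subtract bit (the U bit), so `#-0` and `#0` encode differently even though both have the value zero. The sketch below is a toy model under that assumption, not the LLVM implementation; the helper names `split_signed_offset` and `format_signed_offset` are hypothetical.

```python
def split_signed_offset(text):
    """Split an immediate such as '#4', '#-4' or '#-0' into (add_bit, magnitude).

    add_bit is 1 for addition and 0 for subtraction. The sign is read from
    the text itself, because int('-0') == 0 would lose it.
    """
    if not text.startswith("#"):
        raise ValueError("expected '#' prefix")
    body = text[1:].strip()
    negative = body.startswith("-")
    if body.startswith(("+", "-")):
        body = body[1:]
    magnitude = int(body, 0)
    if not 0 <= magnitude <= 0xFF:
        raise ValueError("offset does not fit in 8 bits")
    return (0 if negative else 1, magnitude)


def format_signed_offset(add_bit, magnitude):
    """Print the offset so that a cleared add bit with magnitude 0 gives '#-0'."""
    sign = "" if add_bit else "-"
    return "#{}{}".format(sign, magnitude)


if __name__ == "__main__":
    for operand in ("#4", "#-4", "#0", "#-0"):
        add_bit, magnitude = split_signed_offset(operand)
        print(operand, add_bit, magnitude, format_signed_offset(add_bit, magnitude))
```

Keeping the sign separate from the magnitude on both the parsing and printing sides is what lets `#-0` round-trip; converting to a plain signed integer at any point would collapse it to `#0`.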
Eli Friedman e0a64d83fc Fix a minor logic mistake transforming compares in instcombine. PR12514.
llvm-svn: 156600
2012-05-11 01:32:59 +00:00
Manman Ren dc8ad0058f ARM: peephole optimization to remove cmp instruction
This patch will optimize the following cases:
  sub r1, r3 | sub r1, imm
  cmp r3, r1 or cmp r1, r3 | cmp r1, imm
  bge L1

TO
  subs r1, r3
  bge  L1 or ble L1

If the branch instruction can use flag from "sub", then we can replace
"sub" with "subs" and eliminate the "cmp" instruction.

rdar: 10734411
llvm-svn: 156599
2012-05-11 01:30:47 +00:00
Dan Gohman dfab443ae8 Define a new intrinsic, @llvm.debugger. It will be similar to __builtin_trap(),
but it generates int3 on x86 instead of ud2.

llvm-svn: 156593
2012-05-11 00:19:32 +00:00