Commit Graph

33297 Commits

Author SHA1 Message Date
Jun Bum Lim 80ec0d3f5a [AArch64]Merge narrow zero stores to a wider store
This change merges adjacent zero stores into a wider single store.
For example :
  strh wzr, [x0]
  strh wzr, [x0, #2]
becomes
  str wzr, [x0]

This will fix PR25410.

llvm-svn: 253711
2015-11-20 21:14:07 +00:00
Eric Christopher c180836722 Weak non-function symbols were being accessed directly, which is
incorrect, as the chosen representative of the weak symbol may not live
with the code in question. Always indirect the access through the TOC
instead.

Patch by Kyle Butt!

llvm-svn: 253708
2015-11-20 20:51:31 +00:00
Bill Seurer aea3d38d81 Fix test case label check
Several (but not all) of the labels that are checked for in this test case
are checked as strings instead of labels.  This can cause an apparent test
case failure if they are tested in an appropriately named directory.

For example, one of them that fails:

define zeroext i32 @test2(i32 %A.u, i32 %B.u)  {
; A8: test2
; A8: uxtab  r0, r0, r1


Output that causes it to fail:

. . .
	.file	"/home/seurer/llvm/llvm-test2/test/CodeGen/Thumb2/thumb2-uxt_rot.ll"
. . .
	.globl	test2
	.align	1
	.type	test2,%function
	.code	16                      @ @test2
	.thumb_func
test2:
	.fnstart


The "A8: test2" matches on the directory name instead of the label.

llvm-svn: 253702
2015-11-20 20:24:49 +00:00
Artyom Skrobov 91f339ab3f Handle ARMv6-J as an alias, instead of fake architecture
Summary:
This follows D14577 to treat ARMv6-J as an alias for ARMv6,
instead of an architecture in its own right.

The functional change is that the default CPU when targeting ARMv6-J
changes from arm1136j-s to arm1136jf-s, which is currently used as
the default CPU for ARMv6; both are, in fact, ARMv6-J CPUs.

The J-bit (Jazelle support) is irrelevant to LLVM, and it doesn't
affect code generation, attributes, optimizations, or anything else,
apart from selecting the default CPU.

Reviewers: rengolin, logan, compnerd

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14755

llvm-svn: 253675
2015-11-20 16:46:09 +00:00
Diego Novillo df544a098a SamplePGO - Add line offset and discriminator information to sample reports.
While debugging some sampling coverage problems, I found this useful:
When applying samples from a profile, it helps to also know what line
offset and discriminator the sample belongs to. This makes it easy to
correlate against the input profile.

llvm-svn: 253670
2015-11-20 15:39:42 +00:00
Owen Anderson 630077ef55 Fix a pair of issues that caused an infinite loop in reassociate.
Terrifyingly, one of them is a mishandling of floating point vectors
in Constant::isZero().  How exactly this issue survived this long
is beyond me.

llvm-svn: 253655
2015-11-20 08:16:13 +00:00
Hrvoje Varga b65518c15c [mips][microMIPS] Implement MUL[_S].PH, MULEQ_S.W.PHL, MULEQ_S.W.PHR, MULEU_S.PH.QBL, MULEU_S.PH.QBR, MULQ_RS.PH, MULQ_RS.W, MULQ_S.PH and MULQ_S.W instructions
Differential Revision: http://reviews.llvm.org/D14280

llvm-svn: 253651
2015-11-20 07:14:52 +00:00
Dan Gohman bb7ce8e408 [WebAssembly] Rename SWITCH to TABLESWITCH to match the current wording in the spec.
llvm-svn: 253642
2015-11-20 03:02:49 +00:00
Peter Collingbourne c85f4ced4d ScalarEvolution: do not set nuw when creating exprs of form <expr> + <all-ones>.
The nuw constraint will not be satisfied unless <expr> == 0.

This bug has been around since r102234 (in 2010!), but was uncovered by
r251052, which introduced more aggressive optimization of nuw scev expressions.

Differential Revision: http://reviews.llvm.org/D14850

llvm-svn: 253627
2015-11-20 01:26:13 +00:00
Tobias Edler von Koch 49c9a6e802 [LTO] Add options to llvm-lto to select output format and dump merged module
This introduces two new options:
- "llvm-lto -save-merged-module -o outfile" dumps the LTO Module to
  outfile.merged.bc prior to CodeGen and after LTO optimizations have been run.
- "llvm-lto -filetype=asm -o outfile" makes llvm-lto emit assembly instead of
  object code in outfile.

Both are intended for use in lit tests.

llvm-svn: 253624
2015-11-20 00:13:05 +00:00
Reid Kleckner cc2f6c35a3 [WinEH] Disable most forms of demotion
Now that the register allocator knows about the barriers on funclet
entry and exit, testing has shown that this is unnecessary.

We still demote PHIs on unsplittable blocks due to the differences
between the IR CFG and the Machine CFG.

llvm-svn: 253619
2015-11-19 23:23:33 +00:00
Simon Pilgrim a9912617c8 [X86][SSE4A] Fix issue with EXTRQI shuffles not starting at the correct start index.
Found during stress testing.

llvm-svn: 253611
2015-11-19 22:13:56 +00:00
Sanjay Patel c4aa50414b [InstCombine] add tests to show missing trunc optimizations
llvm-svn: 253609
2015-11-19 22:11:52 +00:00
Sanjay Patel f1c2370c48 [InstCombine] add tests to show missing bitcast optimizations
llvm-svn: 253602
2015-11-19 21:32:25 +00:00
Dehao Chen 23e2278e27 Reimplement discriminator assignment algorithm.
Summary: The new algorithm is more efficient (O(n), n is number of basic blocks). And it is guaranteed to cover all cases of multiple BB mapped to same line.

Reviewers: dblaikie, davidxl, dnovillo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14738

llvm-svn: 253594
2015-11-19 19:53:05 +00:00
James Molloy 1d695a09dd [GlobalOpt] Localize some globals that have non-instruction users
We currently bail out of global localization if the global has non-instruction users. However, often these can be simple bitcasts or constant-GEPs, which we can easily turn into instructions before localizing. Be a bit more aggressive.

llvm-svn: 253584
2015-11-19 18:04:33 +00:00
Jun Bum Lim 4c35ccac91 [AArch64]Extend merging narrow loads into a wider load
This change extends r251438 to handle more narrow load promotions
including byte type, unscaled, and signed. For example, this change will
convert :
  ldursh w1, [x0, #-2]
  ldurh  w2, [x0, #-4]
into
  ldur  w2, [x0, #-4]
  asr   w1, w2, #16
  and   w2, w2, #0xffff

llvm-svn: 253577
2015-11-19 17:21:41 +00:00
Sanjay Patel ae3680cbcd this new test file was accidentally left out of r253573
llvm-svn: 253574
2015-11-19 16:39:00 +00:00
Sanjay Patel 4699b8ab6a [CGP] despeculate expensive cttz/ctlz intrinsics
This is another step towards allowing SimplifyCFG to speculate harder, but then have 
CGP clean things up if the target doesn't like it.

Previous patches in this series:
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297

D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special
handling because of weirdness in the intrinsic definition for handling a zero input 
(that definition can probably be blamed on x86).

For example, if we have the usual speculated-by-select expensive op pattern like this:

  %tobool = icmp eq i64 %A, 0
  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)   ; is_zero_undef == true
  %cond = select i1 %tobool, i64 64, i64 %0
  ret i64 %cond

There's an instcombine that will turn it into:

  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false)   ; is_zero_undef == false

This CGP patch is looking for that case and despeculating it back into:

  entry:
    %tobool = icmp eq i64 %A, 0
    br i1 %tobool, label %cond.end, label %cond.true

  cond.true:
    %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)    ; is_zero_undef == true
    br label %cond.end

  cond.end:
    %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
    ret i64 %cond

This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), 
but if we increase speculation in SimplifyCFG (the next step in this patch series), then
we should avoid those kinds of cases in the first place.

The need for this patch was originally mentioned here:
http://reviews.llvm.org/D7506
with follow-up here:
http://reviews.llvm.org/D7554

Differential Revision: http://reviews.llvm.org/D14630

llvm-svn: 253573
2015-11-19 16:37:10 +00:00
Hans Wennborg dcc2500452 X86: More efficient legalization of wide integer compares
In particular, this makes the code for 64-bit compares on 32-bit targets
much more efficient.

Example:

  define i32 @test_slt(i64 %a, i64 %b) {
  entry:
    %cmp = icmp slt i64 %a, %b
    br i1 %cmp, label %bb1, label %bb2
  bb1:
    ret i32 1
  bb2:
    ret i32 2
  }

Before this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          setae   %al
          cmpl    16(%esp), %ecx
          setge   %cl
          je      .LBB2_2
          movb    %cl, %al
  .LBB2_2:
          testb   %al, %al
          jne     .LBB2_4
          movl    $1, %eax
          retl
  .LBB2_4:
          movl    $2, %eax
          retl

After this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          sbbl    16(%esp), %ecx
          jge     .LBB1_2
          movl    $1, %eax
          retl
  .LBB1_2:
          movl    $2, %eax
          retl

Differential Revision: http://reviews.llvm.org/D14496

llvm-svn: 253572
2015-11-19 16:35:08 +00:00
Diego Novillo ef548d2918 SamplePGO - Sort samples by source location when emitting as text.
When dumping function samples or writing them out as text format, it
helps if the samples are emitted sorted by source location. The sorting
of the maps is a bit slow, so we only do it on demand.

llvm-svn: 253568
2015-11-19 15:33:08 +00:00
Zoran Jovanovic 307f80eab1 [mips] Add tests for ROL and ROR macros expansion
Author: obucina
llvm-svn: 253567
2015-11-19 15:04:31 +00:00
Elena Demikhovsky 7c2c9fd243 AVX-512: Fixed COPY_TO_REGCLASS for mask registers
Copying one mask register to another under BW should be done with kmovq instruction, otherwise we can loose some bits.
Copying 8 bits under DQ may be done with kmovb.

Differential Revision: http://reviews.llvm.org/D14812

llvm-svn: 253563
2015-11-19 13:13:00 +00:00
Artyom Skrobov 444d544e9d Removing specific target from the generic test
llvm-svn: 253562
2015-11-19 12:24:47 +00:00
Simon Pilgrim 846b64e17a [X86][AVX] Fix lowering of X86ISD::VZEXT_MOVL for 128-bit -> 256-bit extension
The lowering patterns for X86ISD::VZEXT_MOVL for 128-bit to 256-bit vectors were just copying the lower xmm instead of actually masking off the first scalar using a blend.

Fix for PR25320.

Differential Revision: http://reviews.llvm.org/D14151

llvm-svn: 253561
2015-11-19 12:18:37 +00:00
Alexey Bataev b7b82bf33e Alternative to long nops for X86 CPUs, by Andrey Turetsky
Make X86AsmBackend generate smarter nops instead of a bunch of 0x90 for code alignment for CPUs which don't support long nop instructions.
Differential Revision: http://reviews.llvm.org/D14178

llvm-svn: 253557
2015-11-19 11:44:35 +00:00
James Molloy 0ecdbe7d6b [FunctionAttrs] Provide a mechanism for adding function attributes from the command line
This provides a way to force a function to have certain attributes from the command line. This can be useful when debugging or doing workload exploration, where manually editing IR is tedious or not possible (due to build systems etc).

The syntax is -force-attribute=function_name:attribute_name

All function attributes are parsed except alignstack as it requires an argument.

llvm-svn: 253550
2015-11-19 08:49:57 +00:00
Igor Breger 1f78296869 AVX512: Implemented encoding, intrinsics and DAG lowering for VMOVDDUP instructions.
Differential Revision: http://reviews.llvm.org/D14702

llvm-svn: 253548
2015-11-19 08:26:56 +00:00
Igor Breger 4424aaa28e AVX512: Implemented encoding for the vmovss.s and vmovsd.s instructions.
Differential Revision: http://reviews.llvm.org/D14771

llvm-svn: 253547
2015-11-19 07:58:33 +00:00
Igor Breger 81b79de54c AVX512: Implemented encoding for the follow instructions.
vmovapd.s, vmovaps.s, vmovdqa32.s, vmovdqa64.s, vmovdqu16.s, vmovdqu32.s, vmovdqu64.s, vmovdqu8.s, vmovupd.s, vmovups.s

Differential Revision: http://reviews.llvm.org/D14768

llvm-svn: 253546
2015-11-19 07:43:43 +00:00
Elena Demikhovsky 1ca72e1846 Pointers in Masked Load, Store, Gather, Scatter intrinsics
The masked intrinsics support all integer and floating point data types. I added the pointer type to this list.
Added tests for CodeGen and for Loop Vectorizer.
Updated the Language Reference.

Differential Revision: http://reviews.llvm.org/D14150

llvm-svn: 253544
2015-11-19 07:17:16 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Weiming Zhao b69babd01e Fix bug 25440: GVN assertion after coercing loads
Optimizations like LoadPRE in GVN will insert new instructions.
If the insertion point is in a already processed BB, they should
get a value number explicitly. If the insertion point is after
current instruction, then just leave it. However, current GVN framework
has no support for it.
In this patch, we just bail out if a VN can't be found.

Dfferential Revision: http://reviews.llvm.org/D14670

A    test/Transforms/GVN/pr25440.ll
M    lib/Transforms/Scalar/GVN.cpp

llvm-svn: 253536
2015-11-19 02:45:18 +00:00
Quentin Colombet 46d5c71135 [X86] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14156

rdar://problem/21118279

llvm-svn: 253528
2015-11-19 00:38:00 +00:00
Reid Kleckner 441d207e66 Disable Go bindings test with MSan, it has tons of linker errors
llvm-svn: 253525
2015-11-19 00:05:20 +00:00
Davide Italiano c5cedd195a [SimplifyLibCalls] New trick: pow(x, 0.5) -> sqrt(x) under -ffast-math.
Differential Revision:	http://reviews.llvm.org/D14466

llvm-svn: 253521
2015-11-18 23:21:32 +00:00
Quentin Colombet f6645cce91 [AArch64] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14360

rdar://problem/20820748

llvm-svn: 253520
2015-11-18 23:12:20 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Simon Pilgrim c1a46b729b [DAGCombiner] Vector constant folding for comparisons
This patch adds support for vector constant folding of integer/float comparisons.

This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCASE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations)

Differential Revision: http://reviews.llvm.org/D14683

llvm-svn: 253504
2015-11-18 21:17:19 +00:00
Tim Northover 747ae9a7de ARM: make sure backend is consistent about exception handling method.
It turns out we decide whether to use SjLj exceptions or some alternative in
two separate places in the backend, and they disagreed with each other. This
led to inconsistent code and is generally a terrible idea.

So make them consistent and add an assert that they *do* match (unfortunately
MCAsmInfo isn't available in opt, so it can't be used to initialise the CodeGen
version directly).

llvm-svn: 253502
2015-11-18 21:10:39 +00:00
Mike Aizatsky c7810baaa6 Disable gvn non-local speculative loads under asan.
Summary: Fix for https://llvm.org/bugs/show_bug.cgi?id=25550

Differential Revision: http://reviews.llvm.org/D14763

llvm-svn: 253498
2015-11-18 20:43:00 +00:00
Betul Buyukkurt 6fac1741c9 [PGO] Value profiling support
This change introduces an instrumentation intrinsic instruction for
value profiling purposes, the lowering of the instrumentation intrinsic
and raw reader updates. The raw profile data files for llvm-profdata
testing are updated.

llvm-svn: 253484
2015-11-18 18:14:55 +00:00
Artyom Skrobov 763c4ab32d Removing specific target from the generic test
llvm-svn: 253479
2015-11-18 17:50:47 +00:00
Dan Gohman 3d4a20662a [WebAssembly] Make bogus inline asm strings in tests be comments.
These tests aren't testing that the result is valid syntax; they're testing
that the compiler emits the inline asm operands correctly.

llvm-svn: 253469
2015-11-18 16:28:58 +00:00
Dan Gohman 4ba4816b97 [WebAssembly] Enable register coloring and register stackifying.
This also takes the push/pop syntax another step forward, introducing stack
slot numbers to make it easier to see how expressions are connected. For
example, the value pushed in $push7 is popped in $pop7.

And, this begins an experiment with making get_local and set_local implicit
when an operation directly uses or defines a register. This greatly reduces
clutter. If this experiment succeeds, it may make sense to do this for
const instructions as well.

And, this introduces more special code for ARGUMENTS; hopefully this code
will soon be obviated by proper support for live-in virtual registers.

llvm-svn: 253465
2015-11-18 16:12:01 +00:00
Manuel Klimek 272d3f17fc Fix bug where WinCOFFObjectWriter would assume starting from an empty output.
Starting on an input stream that is not at offset 0 would trigger the
assert in WinCOFFObjectWriter.cpp:1065:

  assert(getStream().tell() <= (*i)->Header.PointerToRawData &&
               "Section::PointerToRawData is insane!");

llvm-svn: 253464
2015-11-18 15:24:17 +00:00
Jonas Paulsson af722f8287 [SelectionDAGBuilder] Make sure DemoteReg ends up in right reg-class.
The virtual register containing the address for returned value on
stack should in the DAG be represented with a CopyFromReg node and not
a Register node. Otherwise, InstrEmitter will not make sure that it
ends up in the right register class for the target instruction.

SystemZ needs this, becuause the reg class for address registers is a
subset of the general 64 bit register class.

test/SystemZ/CodeGen/args-07.ll and args-04.ll updated to run with
-verify-machineinstrs.

Reviewed by Hal Finkel.

llvm-svn: 253461
2015-11-18 14:59:00 +00:00
Igor Laevsky 7310c68e85 Revert "Revert "Strip metadata when speculatively hoisting instructions (r252604)"
Failing clang test is now fixed by the r253458.

llvm-svn: 253459
2015-11-18 14:50:18 +00:00
James Molloy ea3bb626d4 [LTO] Appease buildbots take 3
This time I've found a linux box and checked it there. This test now passes.

Because I'd introduced an undefined reference in @bar, gold now returns an error. This doesn't matter for the test itself, because it also emits the remarks the test is checking for. But it does cause LIT to notice a nonzero return code which it faults on.

llvm-svn: 253454
2015-11-18 12:08:24 +00:00
James Molloy ec9698a8c8 [LTO] Buildbot appeasing take 2
Let's try again. This time using the right function signature. It's a real pity I can't run this on a darwin machine...

llvm-svn: 253453
2015-11-18 11:37:32 +00:00
James Molloy 784adffce1 [LTO] Fix up test/tools/gold/X86/remarks.ll
It needs the same fixes as in test/LTO/X86/remarks.ll, but this test appears not to get run on my system (but does on the buildbot). Strange.

llvm-svn: 253452
2015-11-18 11:32:14 +00:00
James Molloy 9ad4f22538 [LTO] Add an early run of functionattrs
Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt.

llvm-svn: 253451
2015-11-18 11:24:42 +00:00
Asaf Badouh 0d957b8b09 [X86][AVX512CD] add mask broadcast intrinsics
Differential Revision: http://reviews.llvm.org/D14573

llvm-svn: 253450
2015-11-18 09:42:45 +00:00
Simon Pilgrim e896f9f8c3 [X86][AVX] Added 256-bit shuffle splat tests.
llvm-svn: 253449
2015-11-18 09:39:38 +00:00
Igor Breger 5574730454 AVX512: Implemented encoding for vpextrw.s instruction.
Differential Revision: http://reviews.llvm.org/D14766

llvm-svn: 253447
2015-11-18 08:46:16 +00:00
Hrvoje Varga 78409019d9 [mips][microMIPS] Implement DPS.W.PH, DPSQ_S.W.PH, DPSQ_SA.L.W, DPSQX_S.W.PH, DPSQX_SA.W.PH, DPSU.H.QBL, DPSU.H.QBR and DPSX.W.PH instructions
Differential Revision: http://reviews.llvm.org/D14058

llvm-svn: 253443
2015-11-18 07:41:35 +00:00
Sanjoy Das 2d16145acf Teach the inliner to track deoptimization state
Summary:
This change teaches LLVM's inliner to track and suitably adjust
deoptimization state (tracked via deoptimization operand bundles) as it
inlines through call sites.  The operation is described in more detail
in the LangRef changes.

Reviewers: reames, majnemer, chandlerc, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14552

llvm-svn: 253438
2015-11-18 06:23:38 +00:00
Rafael Espindola 449711cb36 Stop producing .data.rel sections.
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.

It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.

There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
  concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
  linker since it knowns where the dynamic relocations are.

llvm-svn: 253436
2015-11-18 06:02:15 +00:00
David Majnemer 8447ff6357 Add a test for r253323
Forgot to do this simultaneously with committing the fix.

llvm-svn: 253430
2015-11-18 02:50:39 +00:00
David Majnemer fbb1c3a70b [llvm-objdump] Use the COFF export table for additional symbols
Most linked executables do not have a symbol table in COFF.
However, it is pretty typical to have some export entries.  Use those
entries to inform the disassembler about potential function definitions
and call targets.

llvm-svn: 253429
2015-11-18 02:49:19 +00:00
Cong Hou 41cf1a5dfb Improving edge probabilities computation when choosing the best successor in machine block placement.
When looking for the best successor from the outer loop for a block
belonging to an inner loop, the edge probability computation can be
improved so that edges in the inner loop are ignored. For example,
suppose we are building chains for the non-loop part of the following
code, and looking for B1's best successor. Assume the true body is very
hot, then B3 should be the best candidate. However, because of the
existence of the back edge from B1 to B0, the probability from B1 to B3
can be very small, preventing B3 to be its successor. In this patch, when
computing the probability of the edge from B1 to B3, the weight on the
back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.

if (...)
  do {
    B0;
    ... // some branches
    B1;
  } while(...);
else
  B2;
B3;


Differential revision: http://reviews.llvm.org/D10825

llvm-svn: 253414
2015-11-18 00:52:52 +00:00
Quentin Colombet 8cb95b8e51 [ARM] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14357

rdar://problem/21942589

llvm-svn: 253411
2015-11-18 00:40:54 +00:00
Simon Pilgrim 2da4178737 [X86][AVX512] Added AVX512 SHUFP*/VPERMILP* shuffle decode comments.
llvm-svn: 253396
2015-11-17 23:29:49 +00:00
David Blaikie 35c2eebfe4 dwarfdump: support indexed string dumping in dwp based on the STR_OFFSETS component of the index
llvm-svn: 253392
2015-11-17 22:39:23 +00:00
Simon Pilgrim 8483df6e24 [X86][AVX512] Added support for AVX512 UNPCK shuffle decode comments.
llvm-svn: 253391
2015-11-17 22:35:45 +00:00
Nathan Slingerland e6e30d5e88 [llvm-profdata] Improve error messaging when merging mismatched profile data
Summary:
This change tries to make the root cause of instrumented profile data merge failures clearer.

Previous:

$ llvm-profdata merge test_0.profraw test_1.profraw -o test_merged.profdata
test_1.profraw: foo: Function count mismatch
test_1.profraw: bar: Function count mismatch
test_1.profraw: baz: Function count mismatch
...

Changed:

$ llvm-profdata merge test_0.profraw test_1.profraw -o test_merged.profdata
test_1.profraw: foo: Function basic block count change detected (counter mismatch)
Make sure that all profile data to be merged is generated from the same binary.
test_1.profraw: bar: Function basic block count change detected (counter mismatch)
test_1.profraw: baz: Function basic block count change detected (counter mismatch)
...

Reviewers: dnovillo, davidxl, bogner

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14739

llvm-svn: 253384
2015-11-17 22:08:53 +00:00
Simon Pilgrim 6095410e09 [X86][SSE] Share AVX1/AVX2 shuffle tests with AVX512 where possible
llvm-svn: 253379
2015-11-17 21:19:45 +00:00
Reid Kleckner c20276d0b2 [WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.

The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.

The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic.  This
intrinsic generates no code and simply marks the alloca in use.

Reviewers: JCTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14668

llvm-svn: 253378
2015-11-17 21:10:25 +00:00
David Blaikie c4e2bed738 dwarfdump: Reference the appropriate line table segment when dumping dwp files
Also improves .dwo type unit dumping which didn't handle this either.

llvm-svn: 253377
2015-11-17 21:08:05 +00:00
Andrew Kaylor de642cef2c [EH] Keep filter clauses for types that have been caught.
The instruction combiner previously removed types from filter clauses in Landing Pad instructions if the type had previously been seen in a catch clause.  This is incorrect and prevents unexpected exception handlers from rethrowing the caught type.

Differential Revision: http://reviews.llvm.org/D14669

llvm-svn: 253370
2015-11-17 20:13:04 +00:00
Elena Demikhovsky 3ec9e15ad4 Vector of pointers in function attributes calculation
While setting function attributes we check all instructions that may access memory. For a call instruction we check all arguments. The special check is required for pointers.
I added vector-of-pointers to the call arguments types that should be checked.

Differential Revision: http://reviews.llvm.org/D14693

llvm-svn: 253363
2015-11-17 19:30:51 +00:00
Mike Aizatsky 3ac8a5abbf enabling sancov tests on linux x86_64 only
Differential Revision: http://reviews.llvm.org/D14728

llvm-svn: 253354
2015-11-17 18:25:21 +00:00
Charlie Turner 7968b981bf [ARM] Don't pessimize i32 vselect.
The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.

I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.

From my benchmarks, I saw these improvements in A57 (T32)
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%

I also measured A57 A32, A53 T32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change

Differential Revision: http://reviews.llvm.org/D14743

llvm-svn: 253349
2015-11-17 17:25:15 +00:00
Ahmed Bougacha 88ddeae8bd [AArch64] Promote f16 SELECT_CC CC operands when op is legal.
SELECT_CC has the nasty property of having operands with unrelated
types. So if you do something like:

  f32 = select_cc f16, f16, f32, f32, cc

You'd only look for the action for <select_cc, f32>, but never f16.
If the types are all legal, but the op isn't (as for f16 on AArch64,
or for f128 on x86_64/AArch64?), then you get into trouble.
For f128, we have softenSetCCOperands to handle this case.

Similarly, for f16, we can directly promote the CC operands.

llvm-svn: 253344
2015-11-17 16:45:40 +00:00
Dan Gohman 6c1b0b9a88 Update DebugInfo tests for the change in DEBUG_VALUE output in r253338.
llvm-svn: 253340
2015-11-17 16:15:11 +00:00
Pat Gavlin c8ea157811 Lower statepoints with multi-def targets.
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with mutli-def targets.

llvm-svn: 253339
2015-11-17 16:04:21 +00:00
Dan Gohman 7aa4abac24 Use TargetRegisterInfo for printing MachineOperand register comments
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.

This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.

Differential Revision: http://reviews.llvm.org/D14709

llvm-svn: 253338
2015-11-17 16:01:28 +00:00
Charlie Turner b4613c6973 [ARM] Match VABDL from log2 shuffles.
Differential Revision: http://reviews.llvm.org/D14664

llvm-svn: 253334
2015-11-17 13:21:35 +00:00
Zlatko Buljan 72a7f9c1f5 [mips][microMIPS] Implement EXTP, EXTPDP, EXTPDPV, EXTPV, EXTR[_RS].W, EXTR_S.H, EXTRV[_RS].W and EXTRV_S.H instructions
Differential Revision: http://reviews.llvm.org/D14174

llvm-svn: 253332
2015-11-17 12:54:15 +00:00
Zlatko Buljan 246b21f66a [mips][microMIPS] Implement SUBQ[_S].PH, SUBQ_S.W, SUBQH[_R].PH, SUBQH[_R].W, SUBU[_S].PH, SUBU[_S].QB and SUBUH[_R].QB instructions
Differential Revision: http://reviews.llvm.org/D14114

llvm-svn: 253329
2015-11-17 10:11:22 +00:00
Oliver Stannard 9be59af3ab [Assembler] Make fatal assembler errors non-fatal
Currently, if the assembler encounters an error after parsing (such as an
out-of-range fixup), it reports this as a fatal error, and so stops after the
first error. However, for most of these there is an obvious way to recover
after emitting the error, such as emitting the fixup with a value of zero. This
means that we can report on all of the errors in a file, not just the first
one. MCContext::reportError records the fact that an error was encountered, so
we won't actually emit an object file with the incorrect contents.

Differential Revision: http://reviews.llvm.org/D14717

llvm-svn: 253328
2015-11-17 10:00:43 +00:00
Oliver Stannard 07b43d39a8 [Assembler] Allow non-fatal errors after parsing
This adds reportError to MCContext, which can be used as an alternative to
reportFatalError when the assembler wants to try to continue processing the
rest of the file after the error is reported, so that all of the errors ina
file can be reported. It records the fact that an error was encountered, so we
can avoid emitting an object file if any errors occurred.

This patch doesn't add any uses of this function (a later patch will convert
most uses of reportFatalError to use it), but there is a small functional
change: we use the SourceManager to print the error message, even if we have a
null SMLoc. This means that we get a SourceManager-style message, with the file
and line information shown as <unknown>, rather than the "LLVM ERROR" style
used by report_fatal_error.

llvm-svn: 253327
2015-11-17 09:58:07 +00:00
Zlatko Buljan 3e0588d033 [mips][microMIPS] Implement PRECEQ.W.PHL, PRECEQ.W.PHR, PRECEQU.PH.QBL, PRECEQU.PH.QBLA, PRECEQU.PH.QBR, PRECEQU.PH.QBRA, PRECEU.PH.QBL, PRECEU.PH.QBLA, PRECEU.PH.QBR and PRECEU.PH.QBRA instructions
Differential Revision: http://reviews.llvm.org/D14279

llvm-svn: 253326
2015-11-17 09:43:29 +00:00
Igor Breger a8c9ec85ce AVX512 : regenerate the test file against trunk.
Differential Revision: http://reviews.llvm.org/D14742

llvm-svn: 253321
2015-11-17 08:03:43 +00:00
Zlatko Buljan d1dea944b1 Added microMIPSDSPr1 assembler and disassembler tests to existing microMIPSDSPr2 test files.
llvm-svn: 253320
2015-11-17 07:58:27 +00:00
Rafael Espindola 65e4902156 Drop prelink support.
The way prelink used to work was

* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.

There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
  implementation is pretty broken. It talks about relocations that are
  "resolved by the static linker". If they are resolved, there are none
  left for the prelinker. What one needs to track is if an expression
  will require only dynamic relocations that point to the same DSO.

At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)

This patch removes support for it. That is, it stops printing the
".local" sections.

llvm-svn: 253280
2015-11-17 00:51:23 +00:00
David Blaikie 82641be467 dwarfdump: Use the index to find the right abbrev offset in DWP files
llvm-svn: 253277
2015-11-17 00:39:55 +00:00
Derek Schuff 71e8169ea8 [WebAssembly] Fix printing of global operands
This was regressed in r252656 which wasn't quite NFC. Instead of using a
custom instruction as before, use a pattern to select CONST_I32 for the
global addrs.

Differential Revision: http://reviews.llvm.org/D14587

llvm-svn: 253276
2015-11-17 00:20:44 +00:00
Philip Reames b6e8fe3dac [PRE] Preserve !invariant.load metadata
Spoted via inspection.  Test case included.

llvm-svn: 253275
2015-11-17 00:15:09 +00:00
Derek Schuff 46e3316888 [WebAssembly] Fix function return type printing
Summary:
Previously return type information for a function was derived from
return dag nodes. But this didn't work for dags with != return node. So
instead compute it directly from the LLVM function as is done for imports.

Differential Revision: http://reviews.llvm.org/D14593

llvm-svn: 253251
2015-11-16 21:12:41 +00:00
Derek Schuff 4ed4778419 [WebAssembly] Reverse the order of operands for br_if
Summary: This is to match the new version in the spec

Reviewers: sunfish

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D14519

llvm-svn: 253249
2015-11-16 21:04:51 +00:00
Kit Barton 9c432ae111 Find available scratch register to use in function prologue and epilogue as part of shrink wrapping.
Phabricator: http://reviews.llvm.org/D13955
llvm-svn: 253247
2015-11-16 20:22:15 +00:00
Reid Kleckner c397b26790 [WinEH] Don't let UnwindHelp alias the return address
On top of that, don't bother allocating and initializing UnwindHelp if
we don't have any funclets. Currently we always use RBP as our frame
pointer when funclets are present, so this change makes it impossible to
come here without any fixed stack objects.

Fixes PR25533.

llvm-svn: 253245
2015-11-16 18:47:25 +00:00
Owen Anderson 2de9f545aa Add intermediate subtract instructions to reassociation worklist.
We sometimes create intermediate subtract instructions during
reassociation.  Adding these to the worklist to revisit exposes many
additional reassociation opportunities.

Patch by Aditya Nandakumar.

llvm-svn: 253240
2015-11-16 18:07:30 +00:00
David Majnemer 7378e7a333 [LoopStrengthReduce] Don't increment iterator past the end of the BB
We tried to move the insertion point beyond instructions like landingpad
and cleanuppad.
However, we *also* tried to move past catchpad.  This is problematic
because catchpad is also a terminator.

This fixes PR25541.

llvm-svn: 253238
2015-11-16 17:37:58 +00:00
Vasileios Kalintiris 88faf6d697 [mips] Disable code generation through FastISel for MIPS32R6.
Reviewers: dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14708

llvm-svn: 253225
2015-11-16 17:05:01 +00:00
Oliver Stannard 9327a7575b [ARM,AArch64] Store source location of asm constant pool entries
Storing the source location of the expression that created a constant pool
entry allows us to emit better error messages if we later discover that the
expression cannot be represented by a relocation.

Differential Revision: http://reviews.llvm.org/D14646

llvm-svn: 253220
2015-11-16 16:25:47 +00:00
Oliver Stannard 09be060606 [ARM,AArch64] Store source location for values in assembly files
The MCValue class can store a SMLoc to allow better error messages to be
emitted if an error is detected after parsing. The ARM and AArch64 assembly
parsers were not setting this, so error messages did not have source
information.

Differential Revision: http://reviews.llvm.org/D14645

llvm-svn: 253219
2015-11-16 16:22:47 +00:00
Daniel Sanders 6b6679276c [mips][ias] Remove spurious ';' from inline assembly test.
IAS will not emit it. NFC at the moment but will prevent a test failure once
IAS is enabled.

llvm-svn: 253210
2015-11-16 14:19:32 +00:00
Daniel Sanders 7d0662cdac [mips][ias] Accept $31 or $ra in hf16call32.ll. IAS prints the latter.
NFC at the moment, but it will prevent a test failure once IAS is enabled.

llvm-svn: 253209
2015-11-16 14:16:45 +00:00
Daniel Sanders 00a4aacecc [mips][ias] Allow whitespace after commas in inlineasm*.ll tests.
IAS always prints whitespace after a comma. NFC at the moment but this will
prevent failures when IAS is enabled.

llvm-svn: 253208
2015-11-16 14:14:59 +00:00
Artyom Skrobov f187a65f99 Handle ARMv6KZ naming
Summary:
* ARMv6KZ is the "canonical" name, given in the ARMARM
* ARMv6Z is an "official abbreviation" for it, mentioned in the ARMARM
* ARMv6ZK is a popular misspelling, which we should support as an alias.

The patch corrects the handling of the names.

Functional changes:
* ARMv6Z no longer treated as an architecture in its own right
* ARMv6ZK renamed to ARMv6KZ, accepting ARMv6ZK as an alias
* arm1176jz-s and arm1176jzf-s recognized as ARMv6ZK, instead of ARMv6K
* default ARMv6K CPU changed to arm1176j-s

Reviewers: rengolin, logan, compnerd

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14568

llvm-svn: 253206
2015-11-16 14:05:32 +00:00
James Molloy 2018091e87 Properly check if a CMPZ node is in fact comparing against zero
This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless.

Caught by Oliver Stannard using csmith; regression test added.

llvm-svn: 253195
2015-11-16 10:49:25 +00:00
Pavel Labath 978060ce2f Don't generate discriminators for calls to debug intrinsics
Summary:
This fails a check in Verifier.cpp, which checks for location matches between the declared
variable and the !dbg attachments.

Reviewers: dnovillo, dblaikie, danielcdh

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14657

llvm-svn: 253194
2015-11-16 10:40:38 +00:00
Oliver Stannard db9081bf89 [AArch64] ldr= pseudo-instruction silently ignored if register invalid
The AArch64 assembler was silently ignoring instructions like this:
  ldr foo, =bar

AArch64AsmParser::parseOperand was returning true as the parse failed, but was
not calling AArch64AsmParser::Error to report this to the user, so the
instruction was ignored without printing an error message.

Differential Revision: http://reviews.llvm.org/D14651

llvm-svn: 253193
2015-11-16 10:25:19 +00:00
Keno Fischer 6c543c501d Fix r253186 test case
Referencing a DILocation whose scope is a different subprogram causes
an assertion failure.

llvm-svn: 253187
2015-11-16 08:25:14 +00:00
Keno Fischer b011c63d19 [DIBuilder] Make createReferenceType take size and align
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifer check that the sizes agree.

Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14275

llvm-svn: 253186
2015-11-16 07:57:32 +00:00
Igor Breger 24cab0fa06 AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions.
Differential Revision: http://reviews.llvm.org/D14322

llvm-svn: 253185
2015-11-16 07:22:00 +00:00
Keno Fischer 86c95b5642 [Sink] Don't move landingpads
Summary: Moving landingpads into successor basic blocks makes the
verifier sad. Teach Sink that much like PHI nodes and terminator
instructions, landingpads (and cleanuppads, etc.) may not be moved
between basic blocks.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14475

llvm-svn: 253182
2015-11-16 04:47:58 +00:00
James Molloy 9c7d4d8855 [GlobalOpt] Demote globals to locals more aggressively
Global to local demotion can speed up programs that use globals a lot. It is particularly useful with LTO, when the entire call graph is known and most functions have been internalized.

For a global to be demoted, it must only be accessed by one function and that function:
  1. Must never recurse directly or indirectly, else the GV would be clobbered.
  2. Must never rely on the value in GV at the start of the function (apart from the initializer).

GlobalOpt can already do this, but it is hamstrung and only ever tries to demote globals inside "main", because C++ gives extra guarantees about how main is called - once and only once.

In LTO mode, we can often prove the first property (if the function is internal by this point, we know enough about the callgraph to determine if it could possibly recurse). FunctionAttrs now infers the "norecurse" attribute for this reason.

The second property can be proven for a subset of functions by proving that all loads from GV are dominated by a store to GV. This is conservative in the name of compile time - this only requires a DominatorTree which is fairly cheap in the grand scheme of things. We could do more fancy stuff with MemoryDependenceAnalysis too to catch more cases but this appears to catch most of the useful ones in my testing.

llvm-svn: 253168
2015-11-15 14:21:37 +00:00
Igor Breger 3ff8ef9eb7 Revert r253160.
It broke layering violation. Reproducible with BUILD_SHARED_LIBS=ON.

llvm-svn: 253163
2015-11-15 12:19:11 +00:00
Elena Demikhovsky 121d49b640 Fixed GEP visitor in the InstCombine pass.
The current implementation of GEP visitor in InstCombine fails with assertion on Vector GEP with mix of scalar and vector types, like this:

getelementptr double, double* %a, <8 x i32> %i
(It fails to create a "sext" from <8 x i32> to <8 x i64>)

I fixed it and added some tests.

Differential Revision: http://reviews.llvm.org/D14485

llvm-svn: 253162
2015-11-15 08:19:35 +00:00
Igor Breger aa40ddd3ba AVX512: Implemented encoding and intrinsics for VMOVSHDUP/VMOVSLDUP instructions.
Differential Revision: http://reviews.llvm.org/D14322

llvm-svn: 253160
2015-11-15 07:23:13 +00:00
Dan Gohman 19601fbc8a [WebAssembly] Make indentation consistent with the other testcases. NFC.
llvm-svn: 253149
2015-11-14 23:17:07 +00:00
Dan Gohman 8ad045c1d1 [WebAssembly] Support signext, zeroext, and several other function attributes.
llvm-svn: 253148
2015-11-14 23:15:41 +00:00
Dan Gohman c17e140b39 [WebAssembly] Change int_wasm_memory_size from IntrNoMem to IntrReadMem.
llvm-svn: 253147
2015-11-14 23:02:31 +00:00
Simon Pilgrim 0de179b23b [X86][SSE] Fixed arch/triple and regenerated results.
Tidyup before diffs from new patch.

llvm-svn: 253144
2015-11-14 20:42:01 +00:00
Simon Pilgrim 96d34d34b0 [X86][SSE] Added extra vector truncation tests
Baseline comparison to D14588

llvm-svn: 253132
2015-11-14 15:23:59 +00:00
Michael Zolotukhin 8ef44f93ca Don't recompute LCSSA after loop-unrolling when possible.
Summary:
Currently we always recompute LCSSA for outer loops after unrolling an
inner loop. That leads to compile time problem when we have big loop
nests, and we can solve it by avoiding unnecessary work. For instance,
if w eonly do partial unrolling, we don't break LCSSA, so we don't need
to rebuild it. Also, if all exits from the inner loop are inside the
enclosing loop, then complete unrolling won't break LCSSA either.

I replaced unconditional LCSSA recomputation with conditional recomputation +
unconditional assert and added several tests, which were failing when I
experimented with it.

Soon I plan to follow up with a similar patch for recalculation of dominators
tree.

Reviewers: hfinkel, dexonsmith, bogner, joker.eph, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14526

llvm-svn: 253126
2015-11-14 05:51:41 +00:00
Quentin Colombet 2cdcfd23cd [ShrinkWrapping] Disable the optimization for functions with sanitize like
attribute.

Even if the target supports shrink-wrapping, the prologue and epilogue
must not move because a crash can happen anywhere and sanitizers need
to be able to unwind from the PC of the crash.

llvm-svn: 253116
2015-11-14 01:55:17 +00:00
Chad Rosier cc299b627d [LIR] Add support for creating memcpys from loops with a negative stride.
This allows us to transform the below loop into a memcpy.

void test(unsigned *__restrict__ a, unsigned *__restrict__ b) {
  for (int i = 2047; i >= 0; --i) {
    a[i] = b[i];
  }
}

This is the memcpy version of r251518, which added support for memset with
negative strided loops.

llvm-svn: 253091
2015-11-13 21:51:02 +00:00
Reid Kleckner 75b4be9a11 [WinEH] Fix ESP management with 32-bit __CxxFrameHandler3
The C++ EH personality automatically restores ESP from the C++ EH
registration node after a catchret. I mistakenly thought it was like
SEH, which does not restore ESP.

It makes sense for C++ EH to differ from SEH here because SEH does not
use funclets for catches, and does not allow catching inside of finally.
C++ EH may need to unwind through multiple catch funclets and eventually
catchret to some outer funclet. Therefore, the runtime has to keep track
of which ESP to use with catchret, rather than having the compiler
reload it manually.

llvm-svn: 253084
2015-11-13 21:27:00 +00:00
Evgeniy Stepanov 447bbdb171 [safestack] Rewrite isAllocaSafe using SCEV.
Use ScalarEvolution to calculate memory access bounds.
Handle function calls based on readnone/nocapture attributes.
Handle memory intrinsics with constant size.

This change improves both recall and precision of IsAllocaSafe.
See the new tests (ex. BitCastWide) for the kind of code that was wrongly
classified as safe.

SCEV efficiency seems to be limited by the fact the SafeStack runs late
(in CodeGenPrepare), and many loops are unrolled or otherwise not in LCSSA.

llvm-svn: 253083
2015-11-13 21:21:42 +00:00
Diego Novillo 8e415a821f SamplePGO - Add dump routines for LineLocation, SampleRecord and FunctionSamples
llvm-svn: 253071
2015-11-13 20:24:28 +00:00
Cong Hou ef4074bac2 [X86][SSE] Combine UNPCKL with vector_shuffle into UNPCKH to save one instruction for sext from v16i8 to v16i16 and v8i16 to v8i32.
This patch is enabling combining UNPCKL with vector_shuffle that moves the upper
half of a vector into the lower half, into a UNPCKH instruction. For example:

t2: v16i8 = vector_shuffle<8,9,10,11,12,13,14,15,u,u,u,u,u,u,u,u> t1, undef:v16i8
t3: v16i8 = X86ISD::UNPCKL undef:v16i8, t2

will be combined to:

t3: v16i8 = X86ISD::UNPCKH undef:v16i8, t1


Differential revision: http://reviews.llvm.org/D14399

llvm-svn: 253067
2015-11-13 19:47:43 +00:00
David Blaikie 8e8dd57e0b dwarfdump: Add support for dumping the table contents of DWP indexes
This is a recommit of 252842 which was reverted in 252859. The issue was
using %s format specifier for a StringRef - used Format's
left_justify(StringRef, int) instead.

It'd be nice to have __attribute__((format(..))) on llvm::format, but
apparently it's only implemented for c-style variadics, not C++ variadic
templates. Perhaps we could fix that & conditionalize the attribute on
such...

llvm-svn: 253065
2015-11-13 19:18:49 +00:00
Reid Kleckner 82a6d4bf5c Add missing triple to WinEH test case
llvm-svn: 253062
2015-11-13 19:11:12 +00:00
Reid Kleckner 94b57065c6 [WinEH] Make UnwindHelp a fixed stack object allocated after XMM CSRs
Now the offset of UnwindHelp in our EH tables and the offset that we
store to in the prologue agree.

llvm-svn: 253059
2015-11-13 19:06:01 +00:00
Tom Stellard f9f5f12ce7 ELFYAML: Add support for parsing AMDGPU section attribute flags
Reviewers: silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14444

llvm-svn: 253052
2015-11-13 17:06:29 +00:00
James Molloy b564098c62 [ARM] Replace ARMISD::RBIT with ISD::BITREVERSE
ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering.

llvm-svn: 253047
2015-11-13 16:05:22 +00:00
Zlatko Buljan 32fb5c40d2 [mips][microMIPS] Implement SHRA[_R].PH, SHRAV[_R].PH, SHRAV[_R].QB, SHRAV_R.W, SHRA_R.W, SHRL.PH, SHRL.QB, SHRLV.PH and SHRLV.QB instructions
Differential Revision: http://reviews.llvm.org/D14010

llvm-svn: 253041
2015-11-13 13:14:25 +00:00
Daniel Sanders dd0eb2bbdd [mips][ias] Explicitly disable IAS on asm-large-immediate.ll.
NFC at the moment but it will prevent a failure when IAS is enabled by default.

llvm-svn: 253039
2015-11-13 13:02:31 +00:00
Daniel Sanders 05d81d8286 [mips][ias] Replace invalid assembly insn in test since IAS parses inline assembly.
This is NFC at the moment but will prevent this test from failing when
IAS is the default.

llvm-svn: 253033
2015-11-13 11:44:00 +00:00
James Molloy 67ca6edbb1 [AArch64] Check the expansion of BITREVERSE in regression test
Something I missed from Hal's review, rightly pointed out by Ben Kramer - we should make sure the expansion is properly checked as it can be easy for bugs to creep in.

I've checked the scalar i8 expansion here and the vector i8 expansion in a previous commit.

llvm-svn: 253024
2015-11-13 10:05:31 +00:00
James Molloy bb1dbf530a [SDAG] Fix expansion of BITREVERSE
Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash.

What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size.

Use an APInt for the shift and VT.getScalarSizeInBits().

llvm-svn: 253023
2015-11-13 10:02:36 +00:00
NAKAMURA Takumi 7706fe58d2 llvm/test/tools/llvm-profdata/text-format-errors.test: Use prepared version of the input file, instead of using echo.
...and s/\C9/\xC9/

llvm-svn: 253014
2015-11-13 06:06:58 +00:00
Nathan Slingerland 4f82366759 [llvm-profdata] Add check for text profile formats and improve error reporting (2nd try)
Summary:
This change addresses two possible instances of user error / confusion when
merging sampled profile data.

Previously any input that didn't match the raw or processed instrumented format
would automatically be interpreted as instrumented profile text format data.
No error would be reported during the merge.

Example:
If foo-sampled.profdata and bar-sampled.profdata are binary sampled profiles:

Old behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -output foobar-sampled.profdata
$ llvm-profdata show -sample foobar-sampled.profdata
error: foobar-sampled.profdata:1: Expected 'mangled_name:NUM:NUM', found  lprofi

This change adds basic checks for valid input data when assuming text input.
It also makes error messages related to file format validity more specific about
the assumbed profile data type.

New behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -o foobar-sampled.profdata
error: foo.profdata: Unrecognized instrumentation profile encoding format
Perhaps you forgot to use the -sample option?

Reviewers: bogner, davidxl, dnovillo

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14558

llvm-svn: 253009
2015-11-13 03:47:58 +00:00
Colin LeMahieu fa5558307b [Hexagon] NFC. Adding a number of packet correctness tests.
llvm-svn: 253000
2015-11-13 01:46:06 +00:00
Dan Gohman f19ed56288 [WebAssembly] Inline asm support.
llvm-svn: 252997
2015-11-13 01:42:29 +00:00
Colin LeMahieu 8bb168b160 [Hexagon] Adding relaxation functionality to backend and test.
llvm-svn: 252989
2015-11-13 01:12:25 +00:00
Joseph Tremoulet 149c433bcc [WinEH] Find root frame correctly in CLR funclets
Summary:
The value that the CoreCLR personality passes to a funclet for the
establisher frame may be the root function's frame or may be the parent
funclet's (mostly empty) frame in the case of nested funclets.  Each
funclet stores a pointer to the root frame in its own (mostly empty)
frame, as does the root function itself.  All frames allocate this slot at
the same offset, measured from the post-prolog stack pointer, so that the
same sequence can accept any ancestor as an establisher frame parameter
value, and so that a single offset can be reported to the GC, which also
looks at this slot.

This change allocate the slot when processing function entry, and records
its frame index on the WinEHFuncInfo object, then inserts the code to
set/copy it during prolog emission.


Reviewers: majnemer, AndyAyers, pgavlin, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14614

llvm-svn: 252983
2015-11-13 00:39:23 +00:00
Dan Gohman 058fce5435 [WebAssembly] Introduce a new pseudo-operand for unused expression results.
llvm-svn: 252975
2015-11-13 00:21:05 +00:00
Vyacheslav Klochkov cbc56baae6 X86-FMA3: Implemented commute transformations FMA*_Int instructions.
It made it possible to apply the memory folding optimization for the 2nd
operand of FMA*_Int instructions.

Reviewer: Quentin Colombet
Differential Revision: http://reviews.llvm.org/D14550

llvm-svn: 252973
2015-11-13 00:07:35 +00:00
Colin LeMahieu 7a92c6ecbb [Hexagon] Adding checks for values out of operand range and correct new-value producer usage.
llvm-svn: 252969
2015-11-12 23:28:01 +00:00
Colin LeMahieu a1fa71ead9 [Hexagon] Adding test to make sure labels and register pairs are correctly parsed.
llvm-svn: 252968
2015-11-12 22:54:14 +00:00
Sanjay Patel fbaf5a9534 specify triple and tighten checks using update_llc_test_checks.py
llvm-svn: 252962
2015-11-12 22:27:38 +00:00
Tom Stellard 0967c91e0c Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

llvm-svn: 252956
2015-11-12 21:43:25 +00:00
Mike Aizatsky ba8a5b1f91 disabling sancov tests: too many failures on different platforms.
Differential Revision: http://reviews.llvm.org/D14624

llvm-svn: 252945
2015-11-12 20:47:12 +00:00
Mike Aizatsky 67e3d651f5 sancov tests - platform independent separators
llvm-svn: 252943
2015-11-12 20:17:49 +00:00
Tobias Grosser 8241795d20 Revert "Fix bug 25440: GVN assertion after coercing loads"
This reverts 252919 which broke LNT: MultiSource/Applications/SPASS

llvm-svn: 252936
2015-11-12 20:04:21 +00:00
Mike Aizatsky 14a06ac05e sancov test suite
Differential Revision: http://reviews.llvm.org/D14589

llvm-svn: 252933
2015-11-12 19:34:21 +00:00
Teresa Johnson ba5d68dfff [ThinLTO] Update test to be more tolerant of ordering changes
Update the ThinLTO function importing test to use DAG forms of checks so
that it is more tolerant of changes to relative ordering between
imported decls/defs. This reduces the number of changes required by the
comdat importing patch I am sending for review shortly.

llvm-svn: 252932
2015-11-12 19:31:46 +00:00
Nathan Slingerland 911ced6bf3 reverting r252916 to investigate test failure
llvm-svn: 252921
2015-11-12 18:39:26 +00:00
Weiming Zhao eed0145dd2 Fix bug 25440: GVN assertion after coercing loads
Summary:
when coercing loads, it inserts some instructions, which have no GV assigned.

https://llvm.org/bugs/show_bug.cgi?id=25440


Reviewers: hfinkel, dberlin

Subscribers: dberlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14479

llvm-svn: 252919
2015-11-12 18:19:59 +00:00
Quentin Colombet 94dc1e0d34 [ShrinkWrap] Make sure we do not mess up with EH funclet lowering.
ShrinkWrapping does not understand exception handling constraints for now, so
make sure we do not mess with them by aborting on functions that use EH
funclets.

llvm-svn: 252917
2015-11-12 18:13:42 +00:00
Nathan Slingerland f0e107e38a [llvm-profdata] Add check for text profile formats and improve error reporting
Summary:
This change addresses two possible instances of user error / confusion when
merging sampled profile data.

Previously any input that didn't match the raw or processed instrumented format
would automatically be interpreted as instrumented profile text format data.
No error would be reported during the merge.

Example:
If foo-sampled.profdata and bar-sampled.profdata are binary sampled profiles:

Old behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -output foobar-sampled.profdata
$ llvm-profdata show -sample foobar-sampled.profdata
error: foobar-sampled.profdata:1: Expected 'mangled_name:NUM:NUM', found  lprofi

This change adds basic checks for valid input data when assuming text input.
It also makes error messages related to file format validity more specific about
the assumbed profile data type.

New behavior:
$ llvm-profdata merge foo-sampled.profdata bar-sampled.profdata -o foobar-sampled.profdata
error: foo.profdata: Unrecognized instrumentation profile encoding format
Perhaps you forgot to use the -sample option?

Reviewers: bogner, davidxl, dnovillo

Subscribers: davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14558

llvm-svn: 252916
2015-11-12 18:06:18 +00:00
Dan Gohman cf4748f180 [WebAssembly] Reapply r252858, with svn add for the new file.
Switch to MC for instruction printing.

This encompasses several changes which are all interconnected:
 - Use the MC framework for printing almost all instructions.
 - AsmStrings are now live.
 - This introduces an indirection between LLVM vregs and WebAssembly registers,
   and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
   This addresses some basic issues with argument registers and unused registers.
 - The way ARGUMENT instructions are handled no longer generates redundant
   get_local+set_local for every argument.

This also changes the assembly syntax somewhat; most notably, MC's printing
does not use sigils on label names, so those are no longer present, and
push/pop now have a sigil to keep them unambiguous.

The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.

llvm-svn: 252910
2015-11-12 17:04:33 +00:00
Michael Zuckerman fd3fe9e45a [x86] translating "fp" (floating point) instructions from {fadd,fdiv,fmul,fsub,fsubr,fdivr} to {faddp,fdivp,fmulp,fsubp,fsubrp,fdivrp}
LLVM Missing the following instructions: fadd\fdiv\fmul\fsub\fsubr\fdivr.
GAS and MS supporting this instruction and lowering them in to a faddp\fdivp\fmulp\fsubp\fsubrp\fdivrp instructions.

Differential Revision: http://reviews.llvm.org/D14217

llvm-svn: 252908
2015-11-12 16:58:51 +00:00
Hans Wennborg 7384a2de02 Revert r252858: "[WebAssembly] Switch to MC for instruction printing."
It broke the CMake build:

"Cannot find source file: WebAssemblyRegNumbering.cpp"

llvm-svn: 252897
2015-11-12 14:37:56 +00:00
Vasileios Kalintiris 48e0256ed6 Re-apply "[mips] Use correct frame register for DWARF info when dynamically realigning the stack.""
r252219 reversed the direction of subprogram -> function edge. Fixed the
IR to account for this.

llvm-svn: 252895
2015-11-12 14:11:43 +00:00
James Molloy 8e99e97f2a [ARM] CMOV->BFI combining: handle both senses of CMPZ
I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was.

If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases.

llvm-svn: 252891
2015-11-12 13:49:17 +00:00
Renato Golin 93064025bd Revert "[ARM] Enable shrink-wrapping by default."
This reverts commit r252825, as it broke ASAN on ARM. Investigating...

llvm-svn: 252889
2015-11-12 13:34:50 +00:00
Daniel Sanders 9f6ad49740 Implement .reloc (constant offset only) with support for R_MIPS_NONE and R_MIPS_32.
Summary:
Support for R_MIPS_NONE allows us to parse MIPS16's usage of .reloc.
R_MIPS_32 was included to be able to better test the directive.

Targets can add their relocations by overriding MCAsmBackend::getFixupKind().

Subscribers: grosbach, rafael, majnemer, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D13659

llvm-svn: 252888
2015-11-12 13:33:00 +00:00
Zlatko Buljan 797c2aec6b [mips][microMIPS] Implement LWM16, SB16, SH16, SW16, SWSP and SWM16 instructions
Differential Revision: http://reviews.llvm.org/D11406

llvm-svn: 252885
2015-11-12 13:21:33 +00:00
Vasileios Kalintiris d38860610d Revert "[mips] Use correct frame register for DWARF info when dynamically realigning the stack."
This reverts commit r252882. LLParser complains for invalid field 'function'
in DISubprogram.

llvm-svn: 252884
2015-11-12 13:19:11 +00:00
Vasileios Kalintiris 352eb55baf [mips] Use correct frame register for DWARF info when dynamically realigning the stack.
Summary:
This patch overrides TargetFrameLowering::getFrameIndexReference() in order to
specify the correct register when the function needs dynamic stack realignment.
The values returned from this function are used in order to create DW_AT_locations
for DWARF info. These locations would use the wrong registers as it's been
reported in PR25028.

Reviewers: dsanders

Subscribers: dean, llvm-commits

Differential Revision: http://reviews.llvm.org/D13511

llvm-svn: 252882
2015-11-12 13:04:16 +00:00
James Molloy 2d09c00b91 [InstCombine] Add trivial folding (bitreverse (bitreverse x)) -> x
There are plenty more instcombines we could probably do with bitreverse, but this seems like a very obvious and trivial starting point and was brought up by Hal in his review.

llvm-svn: 252879
2015-11-12 12:39:41 +00:00
James Molloy 90111f79f9 [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic
Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target.

This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support.

The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently).

llvm-svn: 252878
2015-11-12 12:29:09 +00:00
James Molloy 7e9bdd5d01 Revert "Revert "[FunctionAttrs] Identify norecurse functions""
This reapplies this patch, with test fixes.

llvm-svn: 252871
2015-11-12 10:55:20 +00:00
Kuba Brecka de8332257b [Object, MachO] Mark symbols from DATA and BSS sections as ST_Data
In `MachOObjectFile::getSymbolType` we currently always return `SymbolRef::ST_Function` for symbols from any section. In order for llvm-symbolizer to correctly symbolize Mach-O globals, symbols from data and BSS sections should return `SymbolRef::ST_Data`.

Differential Revision: http://reviews.llvm.org/D14576

llvm-svn: 252867
2015-11-12 09:40:29 +00:00
Amjad Aboud e59cc3e540 dwarfdump: Added macro support to llvm-dwarfdump tool.
Added "macro" option to "-debug-dump" flag, which trigger parsing and dumping of the ".debug_macinfo" section.

Differential Revision: http://reviews.llvm.org/D14294

llvm-svn: 252866
2015-11-12 09:38:54 +00:00
James Molloy 9a32da74f7 Revert "[FunctionAttrs] Identify norecurse functions"
This reverts commit r252862. This introduced test failures and I'm reverting while I investigate how this happened.

llvm-svn: 252863
2015-11-12 09:05:43 +00:00
James Molloy b14994e752 [FunctionAttrs] Identify norecurse functions
A function can be marked as norecurse if:
  * The SCC to which it belongs has cardinality 1; and either
    a) It does not call any non-norecurse function. This includes self-recursion; or
    b) It only has one callsite and the function that callsite is within is marked norecurse.

a) is best propagated bottom-up and b) is best propagated top-down.

We build up the norecurse attributes bottom-up using the existing SCC pass, and mark functions with no obvious recursion (but not provably norecurse) to sweep later, top-down.

llvm-svn: 252862
2015-11-12 08:53:04 +00:00
David Blaikie 6400fc146e Mostly revert 252842 due to failures on some buildbots.
I imagine there's some UB in here somewhere, though Valgrind doesn't
seem to have picked it up (not sure if I have a working asan build right
now to test there).

GDB bot seems to be crashing:
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/26267/steps/check-all/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dwp.test

Hexagon ELF bot is, presumably, just getting different output:
http://lab.llvm.org:8011/builders/clang-hexagon-elf/builds/32927/steps/check-all/logs/FAIL%3A%20LLVM%3A%3Adwarfdump-dwp.test

llvm-svn: 252859
2015-11-12 06:33:14 +00:00
Dan Gohman 9dd55a8065 [WebAssembly] Switch to MC for instruction printing.
This encompasses several changes which are all interconnected:
 - Use the MC framework for printing almost all instructions.
 - AsmStrings are now live.
 - This introduces an indirection between LLVM vregs and WebAssembly registers,
   and a new pass, WebAssemblyRegNumbering, for computing a basic the mapping.
   This addresses some basic issues with argument registers and unused registers.
 - The way ARGUMENT instructions are handled no longer generates redundant
   get_local+set_local for every argument.

This also changes the assembly syntax somewhat; most notably, MC's printing
use sigils on label names, so those are no longer present, and push/pop now
have a sigil to keep them unambiguous.

The usage of set_local/get_local/$push/$pop will continue to evolve
significantly. This patch is just one step of a larger change.

llvm-svn: 252858
2015-11-12 06:10:03 +00:00
David Blaikie 5b9bf49c6f dwarfdump: Dump the contents of DWP indexes
llvm-svn: 252842
2015-11-12 01:41:52 +00:00
Matthias Braun b9610a6bc2 LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization
- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

Differential Revision: http://reviews.llvm.org/D11172

llvm-svn: 252839
2015-11-12 01:02:47 +00:00
Manman Ren 3f2b9c18e2 [TLS on Darwin] use a different mask for tls calls on x86-64.
Calls involved in thread-local variable lookup save more registers
than normal calls.

rdar://problem/23073171

llvm-svn: 252837
2015-11-12 00:54:04 +00:00
Quentin Colombet 10f9813528 [ARM] Enable shrink-wrapping by default.
Differential Revision: http://reviews.llvm.org/D14357

rdar://problem/21942589

llvm-svn: 252825
2015-11-11 23:31:46 +00:00
Reid Kleckner b9204a584c [WinEH] Don't forward branches across empty EH pad BBs
For really simple SEH catchpads, we tried to forward the invoke unwind
edge across the empty block.

llvm-svn: 252822
2015-11-11 23:09:31 +00:00
David Majnemer f0f224d12d [IR] Add support for empty tokens
When working with tokens, it is often the case that one has instructions
which consume a token and produce a new token.  Currently, we have no
mechanism to represent an initial token state.

Instead, we can create a notional "empty token" by inventing a new
constant which captures the semantics we would like.  This new constant
is called ConstantTokenNone and is written textually as "token none".

Differential Revision: http://reviews.llvm.org/D14581

llvm-svn: 252811
2015-11-11 21:57:16 +00:00
Sanjoy Das cdafd8490a Introduce deoptimization operand bundles
Summary:
This change introduces the notion of "deoptimization" operand bundles.
LLVM can recognize and optimize these in more precise ways than it can a
generic "unknown" operand bundles.

The current form of this special recognition / optimization is an enum
entry in LLVMContext, a LangRef blurb and a verifier rule.  Over time we
will teach LLVM to do more aggressive optimization around deoptimization
operand bundles, exploiting known facts about kinds of state
deoptimization operand bundles are allowed to track.

Reviewers: reames, majnemer, chandlerc, dexonsmith

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14551

llvm-svn: 252806
2015-11-11 21:38:02 +00:00
Hemant Kulkarni bdce12a01b [Symbolizer]: Add -pretty-print option
Differential Revision: http://reviews.llvm.org/D13671

llvm-svn: 252798
2015-11-11 20:41:43 +00:00
Yunzhong Gao ea7b3a2320 Add a libLTO diagnostic handler that supports lto_get_error_message API
This is a follow-up from the previous discussion on the thread:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151019/307763.html

The LibLTO lto_get_error_message() API reads error messages from a std::string
sLastErrorString. Instead of passing this string around as an argument, this
patch creates a diagnostic handler and then sends this handler to the
constructor of LTOCodeGenerator.

Differential Revision: http://reviews.llvm.org/D14313

llvm-svn: 252791
2015-11-11 19:59:08 +00:00
Geoff Berry 2ddfc5e60f [DAGCombiner] Improve zextload optimization.
Summary:
Don't fold
  (zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
  (and (load x) cst)
will match as a zextload already and has additional users.

For example, the following IR:

  %load = load i32, i32* %ptr, align 8
  %load16 = and i32 %load, 65535
  %load64 = zext i32 %load16 to i64
  store i32 %load16, i32* %dst1, align 4
  store i64 %load64, i64* %dst2, align 8

used to produce the following aarch64 code:

	ldr		w8, [x0]
	and	w9, w8, #0xffff
	and	x8, x8, #0xffff
	str		w9, [x1]
	str		x8, [x2]

but with this change produces the following aarch64 code:

	ldrh		w8, [x0]
	str		w8, [x1]
	str		x8, [x2]

Reviewers: resistor, mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14340

llvm-svn: 252789
2015-11-11 19:42:52 +00:00
David Blaikie 51c402838c dwarfdump: DWP type unit index dumping skeleton
llvm-svn: 252786
2015-11-11 19:40:49 +00:00
David Blaikie 65a8efe441 dwarfdump: First piece of support for DWP dumping
Just a tiny piece of index dumping - the header in this instance.

llvm-svn: 252781
2015-11-11 19:28:21 +00:00
Joseph Tremoulet 9f467353a5 [WinEH] Only generate UnwindHelp slot for MSVCXX
Summary: Other personalities don't use this special frame slot.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14580

llvm-svn: 252778
2015-11-11 19:21:09 +00:00
Colin LeMahieu da6cafffc0 Reverting r252760
llvm-svn: 252770
2015-11-11 18:11:06 +00:00
Dehao Chen 72fdf444b7 Emit discriminator for inlined callsites.
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.

Reviewers: dnovillo, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14511

llvm-svn: 252768
2015-11-11 18:08:18 +00:00
Diego Novillo 0354a9f67b SamplePGO - Fix PR 25482 - Do not rely on llvm.dbg.cu for discriminators
The discriminators pass relied on the presence of llvm.dbg.cu to decide
whether to add discriminators, but this fails in the case where debug
info is only enabled partially when -fprofile-sample-use is active.

The reason llvm.dbg.cu is not present in these cases is to prevent
codegen from emitting debug info (as it is only used for the sample
profile pass).

This changes the discriminators pass to also emit discriminators even
when debug info is not being emitted.

llvm-svn: 252763
2015-11-11 17:54:37 +00:00
Hemant Kulkarni c6638c7561 [Symbolizer]: Add -pretty-print option
Differential Revision: http://reviews.llvm.org/D13671

llvm-svn: 252760
2015-11-11 17:47:54 +00:00
Sanjay Patel f740129198 [MIPS] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
MIPS32 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any MIPS32
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:
  jr  $ra
  clz  $2, $4

cttz:
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  jr  $ra
  subu  $2, $2, $1

Instead of:

ctlz:
  beqz  $4, $BB0_2
  addiu  $2, $zero, 32
  clz  $2, $4
$BB0_2:
  jr  $ra
  nop

cttz:
  beqz  $4, $BB1_2
  addiu  $2, $zero, 32
  addiu  $1, $4, -1
  not  $2, $4
  and  $1, $2, $1
  clz  $1, $1
  addiu  $2, $zero, 32
  subu  $2, $2, $1
$BB1_2:
  jr  $ra
  nop

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14500

llvm-svn: 252755
2015-11-11 17:24:56 +00:00
Artyom Skrobov fe96beea57 test/DebugInfo/ARM/prologue_end.ll references thumbv1, which is invalid.
The committer didn't respond at http://reviews.llvm.org/D14338, so we've got to fix this for them.

This test doesn't pass with thumbv6, so I suppose what they meant is thumbv7.

llvm-svn: 252754
2015-11-11 17:22:18 +00:00
Daniel Sanders 70dd2d7ab9 [mips] Move MC tests for the DSP ASE into the standard format.
Summary:
Only DSPr2 is present because it appears we've never added DSPr1 tests.
We'll have to correct that in a later patch.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14448

llvm-svn: 252752
2015-11-11 16:50:13 +00:00
Douglas Katzman a14039764b Visibly fail if attempting to encode register AH,BH,CH,DH in a REX-prefixed instruction.
Differential Revision: http://reviews.llvm.org/D13316
Fixes PR25003

llvm-svn: 252743
2015-11-11 15:51:16 +00:00
James Molloy ce12c92f66 [ARM] Combine BFIs together
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.

llvm-svn: 252740
2015-11-11 15:40:40 +00:00
Michael Kuperstein 12982a816c [X86] Replace LEAs with INC/DEC when profitable
If possible and profitable, replace lea %reg, 1(%reg) and lea %reg, -1(%reg) with inc %reg and dec %reg respectively.

Patch by: anton.nadolsky@intel.com
Differential Revision: http://reviews.llvm.org/D14059

llvm-svn: 252722
2015-11-11 11:44:31 +00:00
Yury Gribov d7731988ef [ASan] Enable optional ASan recovery.
Differential Revision: http://reviews.llvm.org/D14242

llvm-svn: 252719
2015-11-11 10:36:49 +00:00
Akira Hatanaka 5dda592643 Sort the enums in Attributes.h in case insensitive alphabetical order.
Sort the enums in preparation for moving the attributes to a table-gen
file.

rdar://problem/19836465

llvm-svn: 252692
2015-11-11 02:11:46 +00:00
Dan Gohman 754cd11d90 [WebAssembly] Support non-legal argument and return types.
llvm-svn: 252687
2015-11-11 01:33:02 +00:00
Ahmed Bougacha 4a85643907 [MC] Use LShr for constant evaluation of ">>" on non-arm64 darwin.
Follow-up to r235963: this matches other assemblers and is less
unexpected (e.g. PR23227).

llvm-svn: 252681
2015-11-11 00:51:36 +00:00
Matt Arsenault 6690d7de39 AMDGPU: Set isAllocatable = 0 on VS_32/VS_64
llvm-svn: 252674
2015-11-11 00:01:32 +00:00
Sanjoy Das 925681053d [ValueTracking] Teach isImpliedCondition a new bitwise trick
Summary:
This change teaches isImpliedCondition to prove things like

  (A | 15) < L  ==>  (A | 14) < L

if the low 4 bits of A are known to be zero.

Depends on D14391

Reviewers: majnemer, reames, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14392

llvm-svn: 252673
2015-11-10 23:56:20 +00:00
Reid Kleckner 7f84a939ed [WinEH] Insert the MBB for EH_RESTORE after the catchret
Inserting it before the target block could be bad, we might already have
a fallthrough edge to it.

llvm-svn: 252670
2015-11-10 23:22:20 +00:00
Dan Gohman b84ae9bb38 [WebAssembly] Support for floating point min and max.
llvm-svn: 252653
2015-11-10 21:40:21 +00:00
Bill Schmidt 34af5e1c76 [PowerPC] Add an MI SSA peephole pass.
This patch adds a pass for doing PowerPC peephole optimizations at the
MI level while the code is still in SSA form.  This allows for easy
modifications to the instructions while depending on a subsequent pass
of DCE.  Both passes are very fast due to the characteristics of SSA.

At this time, the only peepholes added are for cleaning up various
redundancies involving the XXPERMDI instruction.  However, I would
expect this will be a useful place to add more peepholes for
inefficiencies generated during instruction selection.  The pass is
placed after VSX swap optimization, as it is best to let that pass
remove unnecessary swaps before performing any remaining clean-ups.

The utility of these clean-ups are demonstrated by changes to four
existing test cases, all of which now have tighter expected code
generation.  I've also added Eric Schweiz's bugpoint-reduced test from
PR25157, for which we now generate tight code.  One other test started
failing for me, and I've fixed it
(test/Transforms/PlaceSafepoints/finite-loops.ll) as well; this is not
related to my changes, and I'm not sure why it works before and not
after.  The problem is that the CHECK-NOT: of "statepoint" from test1
fails because of the "statepoint" in test2, and so forth.  Adding a
CHECK-LABEL in between keeps the different occurrences of that string
properly scoped.

llvm-svn: 252651
2015-11-10 21:38:26 +00:00
Adrian Prantl e39475d44d dsymutil: Prune module forward decl DIEs if a uniquable definition was
already emitted and fix a latent bug in DIECloner where the DW_CHILDREN_yes
flag is set based on the number of children in the input DIE rather than
the number of children that are actually being cloned.

rdar://problem/23439845

llvm-svn: 252649
2015-11-10 21:31:05 +00:00
Teresa Johnson 2d5fb8cac4 Ensure ModuleLinker materializes complete comdat groups
Summary:
The module linker lazy links some "discardable if unused" global
values (e.g. linkonce), materializing and linking them only
if they are referenced in the module. If a comdat group contains a
linkonce member that is not referenced, however, it would not be
materialized and linked, leading to an incomplete comdat group.

If there are other object files not part of the same LTO link that also
define and use that comdat group, the linker may select the incomplete
group leading to link time unsats.

To solve this, whenever a global value body is linked, make sure we
materialize any other members of the same comdat group that are not yet
materialized. This ensures they are in the lazy link list and get linked
as well.

Added new test and adjusted old test to remove parts that didn't
make sense with fix.

Reviewers: rafael

Subscribers: dexonsmith, davidxl, llvm-commits

Differential Revision: http://reviews.llvm.org/D14516

llvm-svn: 252647
2015-11-10 21:09:06 +00:00
Sanjay Patel af1b48bfdc [ARM] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.

The net result of allowing this speculation for the regression tests in this patch is
that we get this code:

ctlz:               
  clz  r0, r0
  bx  lr
cttz:              
  rbit  r0, r0
  clz  r0, r0
  bx  lr

Instead of:

ctlz:    
  cmp  r0, #0
  moveq  r0, #32
  clzne  r0, r0
  bx  lr
cttz:     
  cmp   r0, #0
  moveq  r0, #32
  rbitne  r0, r0
  clzne  r0, r0
  bx  lr

This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818

Differential Revision: http://reviews.llvm.org/D14469

llvm-svn: 252639
2015-11-10 19:24:31 +00:00
Yunzhong Gao ef436f068c llvm-lto: trivial spelling changes to distinguish custom diagnostic handler and
default diagnostic handler.

Differential Revision: http://reviews.llvm.org/D14520

llvm-svn: 252633
2015-11-10 18:52:48 +00:00
Philip Reames 2d858747df [ValueTracking] Recognize that and(x, add (x, -1)) clears the low bit
This is a cleaned up version of a patch by John Regehr with permission. Originally found via the souper tool.

If we add an odd number to x, then bitwise-and the result with x, we know that the low bit of the result must be zero. Either it was zero in x originally, or the add cleared it in the temporary value. As a result, one of the two values anded together must have the bit cleared.

Differential Revision: http://reviews.llvm.org/D14315

llvm-svn: 252629
2015-11-10 18:46:14 +00:00
Teresa Johnson 3cd8161c9b [ThinLTO] WeakAny fixes/cleanup
Ensure WeakAny variables are imported as ExternalWeak declarations. To
handle WeakAny more consistently and fix this issue:

1) Update helper doImportAsDefinition to properly flag WeakAny variables
   and aliases as not importing defintions.

   Update callers of doImportAsDefinition to remove now redundant checks for
   WeakAny aliases, or ignore aliases, as appropriate.

2) Add any !doImportAsDefinition GVs to DoNotLinkFromSource set during
   linking of the GV prototype, where we usually add GVs to the
   DoNotLinkFromSource set for other reasons.

   Remove now unnecessary adding of WeakAny aliases to
   DoNotLinkFromSource set from copyGlobalAliasProto.

   Remove now unnecessary guard against linking non-imported function
   bodies from ModuleLinker::run.

llvm-svn: 252626
2015-11-10 18:20:11 +00:00
Sanjay Patel 241c31fb64 [AArch64] add overrides for isCheapToSpeculateCttz() and isCheapToSpeculateCtlz()
AArch64 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any AArch64
implementation.

The net result of allowing this speculation for the regression tests in this
patch is that we get this code:

ctlz:
  clz  w0, w0
  ret

cttz:
  rbit  w8, w0
  clz  w0, w8
  ret

Instead of:

ctlz:
  cbz  w0, .LBB0_2
  clz  w0, w0
  ret
.LBB0_2:
  orr  w0, wzr, #0x20
  ret

cttz:
  cbz  w0, .LBB1_2
  rbit  w8, w0
  clz  w0, w8
  ret
.LBB1_2:
  orr  w0, wzr, #0x20
  ret

See D14469 for the larger motivation.

Differential Revision: http://reviews.llvm.org/D14505

llvm-svn: 252625
2015-11-10 18:11:37 +00:00
Renato Golin 0e77d72b0a Revert "Strip metadata when speculatively hoisting instructions"
This reverts commit r252604, as it broke all ARM and AArch64 buildbots, as
well as some x86, et al.

llvm-svn: 252623
2015-11-10 18:01:16 +00:00
Michael Kuperstein a01a5ee72f [X86] Do not try to custom-lower sitofp/fptosi in soft-float mode
Differential Revision: http://reviews.llvm.org/D14495

llvm-svn: 252621
2015-11-10 17:37:49 +00:00
Sanjay Patel 766589efdc add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html

The spilling problem is independent and not addressed by this patch.

The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. 
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.

The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)

Differential Revision: http://reviews.llvm.org/D13417

llvm-svn: 252616
2015-11-10 16:48:53 +00:00
James Molloy 9d55f19cfa Reapply "[ARM] Combine CMOV into BFI where possible"
Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh!

Original commit message:

If we have a CMOV, OR and AND combination such as:
  if (x & CN)
      y |= CM;

And:
  * CN is a single bit;
    * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).

llvm-svn: 252606
2015-11-10 14:22:05 +00:00
Igor Laevsky 01c3692a10 Strip metadata when speculatively hoisting instructions
This is fix for PR24059.

When we are hoisting instruction above some condition it may turn out
that metadata on this instruction was control dependant on the condition.
This metadata becomes invalid and we need to drop it.

This patch should cover most obvious places of speculative execution (which
I have found by greping isSafeToSpeculativelyExecute). I think there are more
cases but at least this change covers the severe ones.

Differential Revision: http://reviews.llvm.org/D14398

llvm-svn: 252604
2015-11-10 14:10:31 +00:00
Oliver Stannard 21aa762226 Update test to use explicit triple
This is needed for targets which do not support big-endian with the default
triple.

llvm-svn: 252603
2015-11-10 14:09:08 +00:00
Oliver Stannard d414c99b9c [AArch64] Fix halfword load merging for big-endian targets
For big-endian targets, when we merge two halfword loads into a word load, the
order of the halfwords in the loaded value is reversed compared to
little-endian, so the load-store optimiser needs to swap the destination
registers.

This does not affect merging of two word loads, as we use ldp, which treats the
memory as two separate 32-bit words.

llvm-svn: 252597
2015-11-10 11:04:18 +00:00
Hans Wennborg 21ce8ecb09 Inliner: Do zero-cost inlines even if above a negative threshold (PR24851)
Differential Revision: http://reviews.llvm.org/D14499

llvm-svn: 252595
2015-11-10 09:47:48 +00:00
Igor Breger b6b27af46a AVX512 : Implemented encoding and DAG lowering for VMOVHPS/PD and VMOVLPS/PD instructions.
Differential Revision: http://reviews.llvm.org/D14492

llvm-svn: 252592
2015-11-10 07:09:07 +00:00
Colin LeMahieu 3c7ecf9af1 [Hexagon] Adding instruction aliases and tests.
llvm-svn: 252579
2015-11-10 01:58:26 +00:00
Andy Ayers 809cbe9ea0 Support for emitting inline stack probes
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages
between the current stack limit and the desired new stack pointer location. This implements support for
the inline expansion on x64.

For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call
is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications
that arise when introducing new machine basic blocks during prolog and epilog creation.

Added a new test case, modified an existing one to exclude non-x64 coreclr (for now).

Add test case

Fix tests

llvm-svn: 252578
2015-11-10 01:50:49 +00:00
Colin LeMahieu 13cc3ab785 [Hexagon] Fixing compound register printing and reenabling more tests.
llvm-svn: 252574
2015-11-10 00:51:56 +00:00
Tim Northover 339c83e27f AArch64: add experimental support for address tagging.
AArch64 has the ability to use the top 8-bits of an "address" for extra
information, with the memory subsystem automatically masking them off for loads
and stores. When that's happening, we can sometimes skip masks on memory
operations in the compiler.

However, this requires the host OS and support stack to preserve those bits so
it can't be enabled everywhere. In principle iOS 8.0 and above do take the
required precautions and but we'll put it under a flag for now.

llvm-svn: 252573
2015-11-10 00:44:23 +00:00
Kevin Enderby dc0dbe1f69 Fix llvm-nm(1) printing of llvm-bitcode files for -format darwin to match darwin’s nm(1).
Also a small fix to match printing of Mach-O objects with -format posix.

llvm-svn: 252567
2015-11-10 00:31:08 +00:00
Derek Schuff ffa143ce81 [WebAssembly] Support 'unreachable' expression
Lower LLVM's 'unreachable' terminator to ISD::TRAP, and lower ISD::TRAP to
wasm's 'unreachable' expression.

WebAssembly type-checks expressions, but a noreturn function with a
return type that doesn't match the context will cause a check
failure. So we lower LLVM 'unreachable' to ISD::TRAP and then lower that
to WebAssembly's 'unreachable' expression, which typechecks in any
context and causes a trap if executed.

Differential Revision: http://reviews.llvm.org/D14515

llvm-svn: 252566
2015-11-10 00:30:57 +00:00
Colin LeMahieu b7a5f9fc29 [Hexagon] Fixing store instructions and reenabling a few more tests.
llvm-svn: 252561
2015-11-10 00:22:00 +00:00
Akira Hatanaka 3bfc3e2d2a [ARM] Handle t2ADDri in ARMAsmPrinter::EmitUnwindingInstruction.
This fixes a bug in ARMAsmPrinter::EmitUnwindingInstruction where
llvm_unreachable was reached because t2ADDri wasn't handled.

Test case provided by Tim Northover.

rdar://problem/23270609

http://reviews.llvm.org/D14518

llvm-svn: 252557
2015-11-10 00:10:41 +00:00
Colin LeMahieu 8ab7e8e1b5 [Hexagon] Fixing load instruction parsing and reenabling tests.
llvm-svn: 252555
2015-11-10 00:02:27 +00:00
Matthias Braun 716b43306b MachineVerifier: Add missing linebreak
MachineInstr::print() with SkipOppers==true does not produce a
linebreak, so we have to do that in MachineVerifier::report().

llvm-svn: 252551
2015-11-09 23:59:29 +00:00
David Majnemer 2652b75700 [WinEH] Don't emit CATCHRET from visitCatchPad
Instead, emit a CATCHPAD node which will get selected to a target
specific sequence.

llvm-svn: 252528
2015-11-09 23:07:48 +00:00
Sanjay Patel 203fd23d6e specify triple so Windows bots won't be sad
llvm-svn: 252519
2015-11-09 21:53:58 +00:00
Sanjay Patel 32538d6811 [x86] try harder to match bitwise 'or' into an LEA
The motivation for this patch starts with the epic fail example in PR18007:
https://llvm.org/bugs/show_bug.cgi?id=18007

...unfortunately, this patch makes no difference for that case, but it solves some
simpler cases. We'll get there some day. :)

The current 'or' matching code was using computeKnownBits() via 
isBaseWithConstantOffset() -> MaskedValueIsZero(), but that's an unnecessarily limited use. 
We can do more by copying the logic in ValueTracking's haveNoCommonBitsSet(), so we can 
treat the 'or' as if it was an 'add'.

There's a TODO comment here because we should lift the bit-checking logic into a helper
function, so it's not duplicated in DAGCombiner.

An example of the better LEA matching:

leal (%rdi,%rdi), %eax
andl $1, %esi
orl %esi, %eax

Becomes:

andl $1, %esi
leal (%rsi,%rdi,2), %eax

Differential Revision: http://reviews.llvm.org/D13956

llvm-svn: 252515
2015-11-09 21:16:49 +00:00
Reid Kleckner 64b003f05d [WinEH] Tweak funclet prologue/epilogue insertion to pass verifier
For some reason we'd never run MachineVerifier on WinEH code, and you
explicitly have to ask for it with llc. I added it to a few test cases
to get some coverage.

Fixes PR25461.

llvm-svn: 252512
2015-11-09 21:04:00 +00:00
Andrew Kaylor fdd48fa1e1 [WinEH] Re-committing r252249 (Clone funclets with multiple parents) with additional fixes for determinism problems
Differential Revision: http://reviews.llvm.org/D14454

llvm-svn: 252508
2015-11-09 19:59:02 +00:00
Dehao Chen 3656e3064b Add discriminators for call instructions that are from the same line and same basic block.
Summary: Call instructions that are from the same line and same basic block needs to have separate discriminators to distinguish between different callsites.

Reviewers: davidxl, dnovillo, dblaikie

Subscribers: dblaikie, probinson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14464

llvm-svn: 252492
2015-11-09 17:30:38 +00:00
Oliver Stannard c1103398f2 GlobalOpt should maintain externally_initialized when splitting aggregates
When GlobalOpt splits an internal, global variable with an aggregate type, it
should propagate the externally_initialized flag to the newly created globals.

This makes the pass safe for our downstream use of this flag, while still
allowing some useful optimisations (such as removing dead parts of the split
aggregate) to be performed.

Differential Revision: http://reviews.llvm.org/D13382

llvm-svn: 252490
2015-11-09 16:47:16 +00:00
James Molloy 45f67d52d0 [LoopVectorize] Address post-commit feedback on r250032
Implemented as many of Michael's suggestions as were possible:
  * clang-format the added code while it is still fresh.
  * tried to change Value* to Instruction* in many places in computeMinimumValueSizes - unfortunately there are several places where Constants need to be handled so this wasn't possible.
  * Reduce the pass list on loop-vectorization-factors.ll.
  * Fix a bug where we were querying MinBWs for I->getOperand(0) but using MinBWs[I].

llvm-svn: 252469
2015-11-09 14:32:05 +00:00
Silviu Baranga 2910a4f6b1 Allow LLE/LD and the loop versioning infrastructure to use SCEV predicates
Summary:
LAA currently generates a set of SCEV predicates that must be checked by users.
In the case of Loop Distribute/Loop Load Elimination, no such predicates could have
been emitted, since we don't allow stride versioning. However, in the future there
could be SCEV predicates that will need to be checked.

This change adds support for SCEV predicate versioning in the Loop Distribute, Loop
Load Eliminate and the loop versioning infrastructure.

Reviewers: anemet

Subscribers: mssimpso, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D14240

llvm-svn: 252467
2015-11-09 13:26:09 +00:00
Charlie Turner 90dafb1b6d [AArch64] Add UABDL patterns for log2 shuffle.
Summary:
This matches the sum-of-absdiff patterns emitted by the vectoriser using log2 shuffles.

Relies on D14207 to be able to match the `extract_subvector(..., 0)`

Reviewers: t.p.northover, jmolloy

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14208

llvm-svn: 252465
2015-11-09 13:10:52 +00:00
Renato Golin 6d435f12f0 [EABI] Add LLVM support for -meabi flag
"GCC requires the freestanding environment provide memcpy, memmove, memset
and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html

Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent
'__aeabi_memops'. This convertion violates GCC contract.

The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI
targets.

Without -meabi: use the triple default EABI.
With -meabi=default: use the triple default EABI.
With -meabi=gnu: use 'memops'.
With -meabi=4 or -meabi=5: use '__aeabi_memops'.
With -meabi set to an unknown value: same as -meabi=default.

Patch by Vinicius Tinti.

llvm-svn: 252462
2015-11-09 12:40:30 +00:00
Renato Golin 1d8a2c952f Revert "[ARM] Combine CMOV into BFI where possible"
This reverts commit r252057, as it broke ARM self-hosting buildbots, probably
due to a code-gen fault.

llvm-svn: 252460
2015-11-09 12:19:10 +00:00
Oliver Stannard 563585789c [CodeGen] Always promote f16 if not legal
We don't currently have any runtime library functions for operations on
f16 values (other than conversions to and from f32 and f64), so we
should always promote it to f32, even if that is not a legal type. In
that case, the f32 values would be softened to f32 library calls.

SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type,
as it may ne a no-op or require a different library call.

getCopyFromParts and getCopyToParts now need to cope with a
floating-point value stored in a larger integer part, as is the case for
any target that needs to store an f16 value in a 32-bit integer
register.

Differential Revision: http://reviews.llvm.org/D12856

llvm-svn: 252459
2015-11-09 11:03:18 +00:00
Colin LeMahieu 088b7877f2 [Hexagon] Removing XFAIL on Hexagon target.
llvm-svn: 252450
2015-11-09 06:15:55 +00:00
Colin LeMahieu 7cd0892729 [Hexagon] Enabling ASM parsing on Hexagon backend and adding instruction parsing tests. General updating of the code emission.
llvm-svn: 252443
2015-11-09 04:07:48 +00:00
Maksim Panchenko 87ef57148a [RuntimeDyld] Add support for R_X86_64_PC8 relocation.
llvm-svn: 252423
2015-11-08 19:34:17 +00:00
Hal Finkel f046f72efa [PowerPC] Fix LoopPreIncPrep not to depend on SCEV constant simplifications
Under most circumstances, if SCEV can simplify X-Y to a constant, then it can
also simplify Y-X to a constant. However, there is no guarantee that this is
always true, and concensus is not to consider that a correctness bug in SCEV
(although it is undesirable).

PPCLoopPreIncPrep gathers pointers used to access memory (via loads, stores and
prefetches) into buckets, where in each bucket the relative pointer offsets are
constant. We used to keep each bucket as a multimap, where SCEV's subtraction
operation was used to define the ordering predicate. Instead, use a fixed SCEV
base expression for each bucket, record the constant offsets from that base
expression, and adjust it later, if desirable, once all pointers have been
collected.

Doing it this way should be more compile-time efficient than the previous
scheme (in addition to making the implementation less sensitive to SCEV
simplification quirks).

Fixes PR25170.

llvm-svn: 252417
2015-11-08 08:04:40 +00:00
David Majnemer b222184223 [LoopStrengthReduce] Don't bother fixing up PHIs from EH Pad preds
We cannot really insert fixup code into a PHI's predecessor.

This fixes PR25445.

llvm-svn: 252416
2015-11-08 05:04:07 +00:00
David Majnemer e35244cf63 [WinEH] Update PHIs of CATCHRET successors
The TailDuplication machine pass ran across a malformed CFG: a PHI node
referred it's predecessor's predecessor instead of it's predecessor.
This occurred because we split the edge in X86ISelLowering when we
processed the CATCHRET but forgot to do something about the PHI nodes.

This fixes PR25444.

llvm-svn: 252413
2015-11-08 02:36:00 +00:00
Sanjoy Das 71fe81fd25 [FunctionAttrs] Add handling for operand bundles
Summary:
Teach the FunctionAttrs to do the right thing for IR with operand
bundles.

Reviewers: reames, chandlerc

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14408

llvm-svn: 252387
2015-11-07 01:56:00 +00:00
Sanjoy Das 436e2397f8 [FunctionAttrs] Fix an iterator wraparound bug
Summary:
This change fixes an iterator wraparound bug in
`determinePointerReadAttrs`.

Ideally, ++'ing off the `end()` of an iplist should result in a failed
assert, but currently iplist seems to silently wrap to the head of the
list on `end()++`.  This is why the bad behavior is difficult to
demonstrate.

Reviewers: chandlerc, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14350

llvm-svn: 252386
2015-11-07 01:55:53 +00:00
Joseph Tremoulet f748c8937e [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

llvm-svn: 252383
2015-11-07 01:11:31 +00:00
David Majnemer eafa28a0d9 [InstCombine] Teach FoldPHIArgZextsIntoPHI about EHPads
FoldPHIArgZextsIntoPHI cannot insert an instruction after the PHI if
there is an EHPad in the BB.  Doing so would result in an instruction
inserted after a terminator.

llvm-svn: 252377
2015-11-07 00:52:53 +00:00
David Majnemer 27f2447fb3 [InstCombine] Don't insert an instruction after a terminator
We tried to insert a cast of a phi in a block whose terminator is an
EHPad.  This is invalid.  Do not attempt the transform in these
circumstances.

llvm-svn: 252370
2015-11-06 23:59:23 +00:00
Akira Hatanaka 5cfcce12eb Add 'notail' marker for call instructions.
This marker prevents optimization passes from adding 'tail' or
'musttail' markers to a call. Is is used to prevent tail call
optimization from being performed on the call.

rdar://problem/22667622

Differential Revision: http://reviews.llvm.org/D12923

llvm-svn: 252368
2015-11-06 23:55:38 +00:00
Ahmed Bougacha cf49b523a0 [AArch64][FastISel] Don't even try to select vector icmps.
We used to try to constant-fold them to i32 immediates.
Given that fast-isel doesn't otherwise support vNi1, when selecting
the result users, we'd fallback to SDAG anyway.
However, if the users were in another block, we'd insert broken
cross-class copies (GPR32 to FPR64).

Give up, let SDAG agree with itself on a vNi1 legalization strategy.

llvm-svn: 252364
2015-11-06 23:16:53 +00:00
Ahmed Bougacha b49eb3ab4b [X86] Fold (trunc (i32 (zextload i16))) into vbroadcast.
When matching non-LSB-extracting truncating broadcasts, we now insert
the necessary SRL. If the scalar resulted from a load, the SRL will be
folded into it, creating a narrower, offset, load.

However, i16 loads aren't Desirable, so we get i16->i32 zextloads.
We already catch i16 aextloads; catch these as well.

llvm-svn: 252363
2015-11-06 23:16:48 +00:00
Ahmed Bougacha 05a0514b12 [X86] SRL non-LSB extracts when folding to truncating broadcasts.
Now that we recognize this, we can support it instead of bailing out.
That is, we can fold:
  (v8i16 (shufflevector
    (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
    <1,1,...,1>))
into:
  (v8i16 (vbroadcast (i16 (trunc (srl Y, 16)))))

llvm-svn: 252362
2015-11-06 23:16:43 +00:00
Ahmed Bougacha 68614a36d1 [X86] Don't fold non-LSB extracts into truncating broadcasts.
We used to incorrectly assume that the offset we're extracting from
was a multiple of the element size. So, we'd fold:
  (v8i16 (shufflevector
    (v8i16 (bitcast (v4i32 (build_vector X, Y, ...)))),
    <1,1,...,1>))
into:
  (v8i16 (vbroadcast (i16 (trunc Y))))
whereas we should have extracted the higher bits from X.

Instead, bail out if the assumption doesn't hold.

llvm-svn: 252361
2015-11-06 23:16:38 +00:00
Tom Stellard 05691a678e DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload
Reviewers: resistor, arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13805

llvm-svn: 252349
2015-11-06 21:58:37 +00:00
Dan Gohman 3cb66c85b9 [WebAssembly] Use more explicit types in testcases.
llvm-svn: 252345
2015-11-06 21:32:42 +00:00
Dan Gohman 8d456e4200 [WebAssembly] Add more explicit pushes to the tests.
llvm-svn: 252344
2015-11-06 21:26:50 +00:00
David Majnemer 7204cff0a1 [InstCombine] Don't RAUW tokens with undef
Let SimplifyCFG remove unreachable BBs which define token instructions.

llvm-svn: 252343
2015-11-06 21:26:32 +00:00
Quentin Colombet 9a8efc08d3 [ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.
Previously we were conservatively assuming that RegMask operands clobber
callee saved registers.

llvm-svn: 252341
2015-11-06 21:00:13 +00:00
Mehdi Amini b0e3192a48 Fix SLPVectorizer commutativity reordering
The SLPVectorizer had a very crude way of trying to benefit
from associativity: it tried to optimize for splat/broadcast
or in order to have the same operator on the same side.
This is benefitial to the cost model and allows more vectorization
to occur.
This patch improve the logic and make the detection optimal (locally,
we don't look at the full tree but only at the immediate children).

Should fix https://llvm.org/bugs/show_bug.cgi?id=25247

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D13996

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 252337
2015-11-06 20:17:51 +00:00
Andrew Kaylor 4731bea3e5 Improved the operands commute transformation for X86-FMA3 instructions.
All 3 operands of FMA3 instructions are commutable now.

Patch by Slava Klochkov

Reviewers: Quentin Colombet(qcolombet), Ahmed Bougacha(ab).

Differential Revision: http://reviews.llvm.org/D13269

llvm-svn: 252335
2015-11-06 19:47:25 +00:00
Dan Gohman 4b96d8d1ff [WebAssembly] Make expression-stack pushing explicit
Modelling of the expression stack is evolving. This patch takes another
step by making pushes explicit.

Differential Revision: http://reviews.llvm.org/D14338

llvm-svn: 252334
2015-11-06 19:45:01 +00:00
Sanjoy Das c01b4d2b28 [ValueTracking] De-pessimize isImpliedCondition around unsigned compares
Summary:
Currently `isImpliedCondition` will optimize "I +_nuw C < L ==> I < L"
only if C is positive.  This is an unnecessary restriction -- the
implication holds even if `C` is negative.

Reviewers: reames, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14369

llvm-svn: 252332
2015-11-06 19:01:03 +00:00
Sanjoy Das 9349dcc74a [ValueTracking] Add a framework for encoding implication rules
Summary:
This change adds a framework for adding more smarts to
`isImpliedCondition` around inequalities.  Informally,
`isImpliedCondition` will now try to prove "A < B ==> C < D" by proving
"C <= A && B <= D", since then it follows "C <= A < B <= D".

While this change is in principle NFC, I could not think of a way to not
handle cases like "i +_nsw 1 < L ==> i < L +_nsw 1" (that ValueTracking
did not handle before) while keeping the change understandable.  I've
added tests for these cases.

Reviewers: reames, majnemer, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14368

llvm-svn: 252331
2015-11-06 19:00:57 +00:00
Matt Arsenault 0c90e9501e AMDGPU: Create emergency stack slots during frame lowering
Test has a bogus verifier error which will be fixed by later commits.

llvm-svn: 252327
2015-11-06 18:17:45 +00:00
Matt Arsenault 3931948bb6 AMDGPU: Add pass to detect used kernel features
Mark kernels that use certain features that require user
SGPRs to support with kernel attributes. We need to know
before instruction selection begins because it impacts
the kernel calling convention lowering.

For now this only detects the workitem intrinsics.

llvm-svn: 252323
2015-11-06 18:01:57 +00:00
Matt Arsenault 623e6fd466 AMDGPU: Hack for VS_32 register pressure
For some reason VS_32 ends up factoring into the pressure heuristics
even though we should never see a virtual register with this class.

When SGPRs are reserved for register spilling, this for some reason
triggers reg-crit scheduling.

Setting isAllocatable = 0 may help with this since that seems to remove
it from the default implementation's generated table.

llvm-svn: 252321
2015-11-06 17:54:43 +00:00
Teresa Johnson 1063293a89 Restore "Move metadata linking after lazy global materialization/linking."
Summary:
This reverts commit r251965.

Restore "Move metadata linking after lazy global materialization/linking."

This restores commit r251926, with fixes for the LTO bootstrapping bot
failure.

The bot failure was caused by references from debug metadata to
otherwise unreferenced globals. Previously, this caused the lazy linking
to link in their defs, which is unnecessary. With this patch, because
lazy linking is complete when we encounter the metadata reference, the
materializer created a declaration. For definitions such as aliases and
comdats, it is illegal to have a declaration. Furthermore, metadata
linking should not change code generation. Therefore, when linking of
global value bodies is complete, the materializer will simply return
nullptr as the new reference for the linked metadata.

This change required fixing a different test to ensure there was a
real reference to a linkonce global that was only being reference from
metadata.

Note that the new changes to the only-needed-named-metadata.ll test
illustrate an issue with llvm-link -only-needed handling of comdat
groups, whereby it may result in an incomplete comdat group. I note this
in the test comments, but the issue is orthogonal to this patch (it can
be reproduced without any metadata at head).

Reviewers: dexonsmith, rafael, tra

Subscribers: tobiasvk, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D14447

llvm-svn: 252320
2015-11-06 17:50:53 +00:00
Teresa Johnson 189b252652 Restore "Move metadata linking after lazy global materialization/linking."
This reverts commit r251965.

llvm-svn: 252319
2015-11-06 17:50:48 +00:00
Reid Kleckner b8fd162fc5 [WinEH] Mark funclet entries and exits as clobbering all registers
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.

Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer

Subscribers: MatzeB, llvm-commits

Differential Revision: http://reviews.llvm.org/D14407

llvm-svn: 252318
2015-11-06 17:06:38 +00:00
Jun Bum Lim 22fe15ee86 [AArch64]Enable the narrow ld promotion only on profitable microarchitectures
The benefit from converting narrow loads into a wider load (r251438) could be
micro-architecturally dependent, as it assumes that a single load with two bitfield
extracts is cheaper than two narrow loads. Currently, this conversion is
enabled only in cortex-a57 on which performance benefits were verified.

llvm-svn: 252316
2015-11-06 16:27:47 +00:00
Rafael Espindola 889d7bb4cb Bring r252305 back with a test fix.
We now create the .eh_frame section early, just like every other special
section.

This means that the special flags are visible in code that explicitly
asks for ".eh_frame".

llvm-svn: 252313
2015-11-06 15:30:45 +00:00
Rafael Espindola b20b70687a Use SHT_X86_64_UNWIND on every OS.
That is the ABI required type. Linkers still check the section name, so
everything should still work.

llvm-svn: 252300
2015-11-06 13:35:35 +00:00
Daniel Sanders 5762a4f9d1 [mips][ias] Range check uimm4 operands and fixed a bug this revealed.
Summary:
The bug was that the sldi instructions have immediate widths dependant on
their element size. So sldi.d has a 1-bit immediate and sldi.b has a 4-bit
immediate. All of these were using 4-bit immediates previously.

Reviewers: vkalintiris

Subscribers: llvm-commits, atanasyan, dsanders

Differential Revision: http://reviews.llvm.org/D14018

llvm-svn: 252297
2015-11-06 12:41:43 +00:00
Daniel Sanders 38ce0f629c [mips][ias] Range check uimm3 operands.
Summary:

Reviewers: vkalintiris

Subscribers: atanasyan, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14016

llvm-svn: 252296
2015-11-06 12:31:27 +00:00
Daniel Sanders ea4f653d18 [mips][ias] Range check uimm2 operands and fix a bug this revealed.
Summary:
The bug was that the MIPS32R6/MIPS64R6/microMIPS32R6 versions of LSA and DLSA
(unlike the MSA version) failed to account for the off-by-one encoding of the
immediate. The range is actually 1..4 rather than 0..3.

Reviewers: vkalintiris

Subscribers: atanasyan, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14015

llvm-svn: 252295
2015-11-06 12:22:31 +00:00
Daniel Sanders 52da7af4d2 [mips][ias] Range check uimmz operands.
Reviewers: vkalintiris

Subscribers: dsanders, atanasyan, llvm-commits

Differential Revision: http://reviews.llvm.org/D14013

llvm-svn: 252294
2015-11-06 12:11:03 +00:00
Vasileios Kalintiris b04672cade [mips] Define patterns for the atomic_{load,store}_{8,16,32,64} nodes.
Summary:
Without these patterns we would generate a complete LL/SC sequence.
This would be problematic for memory regions marked as WRITE-only or
READ-only, as the instructions LL/SC would read/write to the protected
memory regions correspondingly.

Reviewers: dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D14397

llvm-svn: 252293
2015-11-06 12:07:20 +00:00
Tom Stellard 1e1b05db24 AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13804

llvm-svn: 252291
2015-11-06 11:45:14 +00:00
James Molloy e6f87ca812 Add a new attribute: norecurse
This attribute allows the compiler to assume that the function never recurses into itself, either directly or indirectly (transitively). This can be used among other things to demote global variables to locals.

llvm-svn: 252282
2015-11-06 10:32:53 +00:00
NAKAMURA Takumi 9947cacebf Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple parents"
It behaved flaky due to iterating pointer key values on std::set and std::map.

llvm-svn: 252279
2015-11-06 10:07:33 +00:00
Andrew Kaylor f05a87dff3 Temporarily disable flaky checks in wineh-multi-parent-cloning.
llvm-svn: 252258
2015-11-06 01:15:04 +00:00
Andrew Kaylor 29cd576554 [WinEH] Clone funclets with multiple parents
Windows EH funclets need to always return to a single parent funclet.  However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated.

These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet.

Differential Revision: http://reviews.llvm.org/D13274?id=39098

llvm-svn: 252249
2015-11-06 00:20:50 +00:00
Keno Fischer 34ca831d9f [bugpoint] Add a named metadata (+their operands) reducer
Summary:
We frequently run bugpoint on a linked module that consists of all
modules we create while jitting the julia standard library. This module
has a very large number of compile units (10000+) in `llvm.dbg.cu`,
which didn't get reduced at all, requiring manual post processing.
This is an attempt to have bugpoint go through and attempt to reduce
the number of global named metadata nodes as well as their operands,
to cut down the number of roots for such metadata.

Reviewers: dexonsmith, reames, pete

Subscribers: pete, dexonsmith, reames, llvm-commits

Differential Revision: http://reviews.llvm.org/D14043

llvm-svn: 252247
2015-11-06 00:12:50 +00:00
Sanjoy Das c1a2977fb2 Re-apply r251050 with a for PR25421
The bug: I missed adding break statements in the switch / case.

Original commit message:

[SCEV] Teach SCEV some axioms about non-wrapping arithmetic

Summary:
 - A s<  (A + C)<nsw> if C >  0
 - A s<= (A + C)<nsw> if C >= 0
 - (A + C)<nsw> s<  A if C <  0
 - (A + C)<nsw> s<= A if C <= 0

Right now `C` needs to be a constant, but we can later generalize it to
be a non-constant if needed.

Reviewers: atrick, hfinkel, reames, nlewycky

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D13686

llvm-svn: 252236
2015-11-05 23:45:38 +00:00
Richard Trieu f8978e1a74 Revert r251050 to fix miscompile when running Clang -O1
See bug for details: https://llvm.org/bugs/show_bug.cgi?id=25421
Some comparisons were incorrectly replaced with a constant value.

llvm-svn: 252231
2015-11-05 23:20:36 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Tim Northover 775aaeb765 Remove windows line endings introduced by r252177. NFC.
llvm-svn: 252217
2015-11-05 21:54:58 +00:00
Alexey Samsonov 55fda1be94 [ASan] Disable instrumentation for inalloca variables.
inalloca variables were not treated as static allocas, therefore didn't
participate in regular stack instrumentation. We don't want them to
participate in dynamic alloca instrumentation as well.

llvm-svn: 252213
2015-11-05 21:18:41 +00:00
Reid Kleckner 6ddae31045 [WinEH] Fix funclet prologues with stack realignment
We already had a test for this for 32-bit SEH catchpads, but those don't
actually create funclets. We had a bug that only appeared in funclet
prologues, where we would establish EBP and ESI as our FP and BP, and
then downstream prologue code would overwrite them.

While I was at it, I fixed Win64+funclets+stackrealign. This issue
doesn't come up as often there due to the ABI requring 16 byte stack
alignment, but now we can rest easy that AVX and WinEH will work well
together =P.

llvm-svn: 252210
2015-11-05 21:09:49 +00:00
Dan Gohman d7ffb919c1 [WebAssembly] Update wasm builtin functions to match spec changes.
The page_size operator has been removed from the spec, and the resize_memory
operator has been changed to grow_memory.

llvm-svn: 252202
2015-11-05 20:16:59 +00:00
Kevin Enderby 7a96942a6a Reapply r250906 with many suggested updates from Rafael Espindola.
The needed lld matching changes to be submitted immediately next,
but this revision will cause lld failures with this alone which is expected.

This removes the eating of the error in Archive::Child::getSize() when the characters
in the size field in the archive header for the member is not a number.  To do this we
have all of the needed methods return ErrorOr to push them up until we get out of lib.
Then the tools and can handle the error in whatever way is appropriate for that tool.

So the solution is to plumb all the ErrorOr stuff through everything that touches archives.
This include its iterators as one can create an Archive object but the first or any other
Child object may fail to be created due to a bad size field in its header.

Thanks to Lang Hames on the changes making child_iterator contain an
ErrorOr<Child> instead of a Child and the needed changes to ErrorOr.h to add
operator overloading for * and -> .

We don’t want to use llvm_unreachable() as it calls abort() and is produces a “crash”
and using report_fatal_error() to move the error checking will cause the program to
stop, neither of which are really correct in library code. There are still some uses of
these that should be cleaned up in this library code for other than the size field.

The test cases use archives with text files so one can see the non-digit character,
in this case a ‘%’, in the size field.

These changes will require corresponding changes to the lld project.  That will be
committed immediately after this change.  But this revision will cause lld failures
with this alone which is expected.

llvm-svn: 252192
2015-11-05 19:24:56 +00:00