Commit Graph

32219 Commits

Author SHA1 Message Date
Mehdi Amini d178f4fc89 Make the default triple optional by allowing an empty string
When building LLVM as a (potentially dynamic) library that can be linked against
by multiple compilers, the default triple is not really meaningful.
We allow to explicitely set it to an empty string when configuring LLVM.
In this case, said "target independent" tests in the test suite that are using
the default triple are disabled by matching the newly available feature
"default_triple".

Reviewers: probinson, echristo
Differential Revision: http://reviews.llvm.org/D12660

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247775
2015-09-16 05:34:32 +00:00
Mehdi Amini 8e468d6388 [NaryReassociate] Improve test CHECK
Add `CHECK` directives for the function calls.

Differential Revision: http://reviews.llvm.org/D12885

Patch by: Volkan Keles <vkeles@apple.com>

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247774
2015-09-16 05:27:46 +00:00
Michael Zolotukhin fc314be0ec [Unroll] Fix a bug in UnrolledInstAnalyzer::visitLoad.
We only checked that a global is initialized with constants, which is
incorrect. We should be checking that GlobalVariable *is* a constant,
not just initialized with it.

llvm-svn: 247769
2015-09-16 03:25:09 +00:00
Sanjoy Das 8a5526e8be [IndVars] Fix PR24783.
In `IndVarSimplify::ExpandSCEVIfNeeded`,
`SCEVExpander::findExistingExpansion` may return an `llvm::Value` that
differs in type from the SCEV it was asked to find an expansion for (but
computes the same value).  In such cases, we fall back on
`expandCodeFor`; and rely on LLVM to CSE the two equivalent
expressions (different only by a no-op cast) into a single computation.

I tried a few other approaches to fixing PR24783, all of which turned
out to be more complex than this current version:

 1. Move the `ExpandSCEVIfNeeded` logic into `expandCodeFor`.  This got
    problematic because currently we do not pass in the `Loop *` into
    `expandCodeFor`.  Changing the interface to do this is a more
    invasive change, and really does not make much semantic sense unless
    the SCEV being passed in is an add recurrence.

    There is also the problem of `expandCodeFor` being used in places
    other than `indvars` -- there may be performance / correctness
    issues elsewhere if `expandCodeFor` is moved from always generating
    IR from scratch to cache-like model.

 2. Have `findExistingExpansion` only return expression with the correct
    type.  This would make `isHighCostExpansionHelper` and thus
    `isHighCostExpansion` more conservative than necessary.

 3. Insert casts on the value returned by `findExistingExpansion` if
    needed using `InsertNoopCastOfTo`.  This is complicated because
    `InsertNoopCastOfTo` depends on internal state of its
    `SCEVExpander` (specifically `Builder.GetInserPoint()`), and this
    may not be set up when `ExpandSCEVIfNeeded` is called.

 4. Manually insert casts on the value returned by
    `findExistingExpansion` if needed using `InsertNoopCastOfTo` via
    `CastInst::Create`.  This is probably workable, but figuring out the
    location where the cast instruction needs to be inserted has enough
    edge cases (arguments, constants, invokes, LCSSA must be preserved)
    makes me feel what I have right now is simplest solution.

llvm-svn: 247749
2015-09-15 23:45:39 +00:00
Davide Italiano 386e2ab158 [llvm-cxxdump] Remove duplicate code check.
We already fail with 'No such file or directory' when we try to open
the file -- if that doesn't exist. Also add a test to verify this behavior.

llvm-svn: 247744
2015-09-15 23:35:32 +00:00
Duncan P. N. Exon Smith cff5feff6f Reapply "LTO: Disable extra verify runs in release builds"
This reverts commit r247730, effectively reapplying r247729.  This time
I have an lld commit ready to follow.

llvm-svn: 247735
2015-09-15 23:05:59 +00:00
Alexey Samsonov c1603b6493 [ASan] Don't instrument globals in .preinit_array/.init_array/.fini_array
These sections contain pointers to function that should be invoked
during startup/shutdown by __libc_csu_init and __libc_csu_fini.
Instrumenting these globals will append redzone to them, which will be
filled with zeroes. This will cause null pointer dereference at runtime.

Merge ASan regression tests for globals that should be ignored by
instrumentation pass.

llvm-svn: 247734
2015-09-15 23:05:48 +00:00
Duncan P. N. Exon Smith 7de73e56a4 Revert "LTO: Disable extra verify runs in release builds"
This temporarily reverts commit r247729, as it caused lld build
failures.  I'll recommit once I have an lld patch ready-to-go.

llvm-svn: 247730
2015-09-15 22:47:38 +00:00
Duncan P. N. Exon Smith 236787838c LTO: Disable extra verify runs in release builds
The verifier currently runs three times in LTO: (1) after parsing, (2)
at the beginning of the optimization pipeline, and (3) at the end of it.

The first run is important, since we're not sure where the bitcode comes
from and it's nice to validate it, but in release builds the extra runs
aren't appropriate.

This commit:
  - Allows these runs to be disabled in LTOCodeGenerator.
  - Adds command-line options to llvm-lto.
  - Adds command-line options to libLTO.dylib, and disables the verifier
    by default in release builds (based on NDEBUG).

This shaves about 3.5% off the runtime of ld64 when linking
verify-uselistorder with -flto -g.

rdar://22509081

llvm-svn: 247729
2015-09-15 22:26:11 +00:00
Justin Bogner 01fa3f96b3 test: Add "REQUIRES: native" so this test passes with no default triple configured
llvm-svn: 247719
2015-09-15 21:13:33 +00:00
Quentin Colombet 7b13976254 [ShrinkWrapping] Add a test case for r247710.
llvm-svn: 247713
2015-09-15 18:51:43 +00:00
Piotr Padlewski 6c15ec49ed Introducing llvm.invariant.group.barrier intrinsic
For more info for what reason it was invented, goto:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html

invariant.group.barrier:
http://reviews.llvm.org/D12310
docs:
http://reviews.llvm.org/D11399
CodeGenPrepare:
http://reviews.llvm.org/D12875

llvm-svn: 247711
2015-09-15 18:32:14 +00:00
Arch D. Robison 8ed0854f55 Broaden optimization of fcmp ([us]itofp x, constant) by instcombine.
The patch extends the optimization to cases where the constant's
magnitude is so small or large that the rounding of the conversion
is irrelevant.  The "so small" case includes negative zero.

Differential review: http://reviews.llvm.org/D11210

llvm-svn: 247708
2015-09-15 17:51:59 +00:00
Igor Laevsky bdc1eafe20 [CorrelatedValuePropagation] Infer nonnull attributes
LazuValueInfo can prove that value is nonnull based on the context information. 
Make use of this ability to infer nonnull attributes for the call arguments.

Differential Revision: http://reviews.llvm.org/D12836

llvm-svn: 247707
2015-09-15 17:51:50 +00:00
Marcello Maggioni 454faa84e2 [NaryReassociate] Add support for Mul instructions
This patch extends the current pass by handling
Mul instructions as well.

Patch by: Volkan Keles (vkeles@apple.com)

llvm-svn: 247705
2015-09-15 17:22:52 +00:00
Zoran Jovanovic dc4b8c2761 [mips][microMIPS] Fix an issue with disassembling lwm32 instruction
Fixed microMIPS disassembler crash on test case generated by llvm-mc-fuzzer.
Differential Revision: http://reviews.llvm.org/D12881

llvm-svn: 247698
2015-09-15 15:21:27 +00:00
Zoran Jovanovic 8eb8c9861d [mips] Add support for branch-likely pseudo-instructions
Differential Revision: http://reviews.llvm.org/D10537

llvm-svn: 247697
2015-09-15 15:06:26 +00:00
Ulrich Weigand e861e6442c [SystemZ] Fix assertion failure in tryBuildVectorShuffle
Under certain circumstances, tryBuildVectorShuffle would attempt to
create a BUILD_VECTOR node with an invalid combination of types.
This happened when one of the components of the original BUILD_VECTOR
was itself a TRUNCATE node.  That TRUNCATE was stripped off during
intermediate processing to simplify code, but when adding the node
back to the result vector, we still need it to get the type right.

llvm-svn: 247694
2015-09-15 14:27:46 +00:00
Zoran Jovanovic 7beb737b46 [mips][microMIPS] Implement CACHEE and PREFE instructions for microMIPS32r6
Differential Revision: http://reviews.llvm.org/D11632

llvm-svn: 247670
2015-09-15 10:05:10 +00:00
Daniel Sanders e4e83a7bc1 [mips] Added support for various EVA ASE instructions.
Summary:
Added support for the following instructions:

CACHEE, LBE, LBUE, LHE, LHUE, LWE, LLE, LWLE, LWRE, PREFE,
SBE, SHE, SWE, SCE, SWLE, SWRE, TLBINV, TLBINVF

This required adding some infrastructure for the EVA ASE.

Patch by Scott Egerton.

Reviewers: vkalintiris, dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11139

llvm-svn: 247669
2015-09-15 10:02:16 +00:00
Sanjoy Das f75e15e5ac [PlaceSafepoints] Make the width of a counted loop settable.
Summary:
This change lets a `PlaceSafepoints` client change how wide the trip
count of a loop has to be for the loop to be considerd "counted", via
`CountedLoopTripWidth`.  It also removes the boolean `SkipCounted` flag
and the `upperTripBound` constant -- we can get the old behavior of
`SkipCounted` == `false` by setting `CountedLoopTripWidth` to `13` (2 ^
13 == 8192).

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12789

llvm-svn: 247656
2015-09-15 01:42:48 +00:00
Dan Gohman 311b488d76 [WebAssembly] Implement int64-to-int32 conversion.
llvm-svn: 247649
2015-09-15 00:55:19 +00:00
Adrian Prantl deef90d7f5 DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId.
For module debugging clang emits prefabricated skeleton compile units
that can be recognized by a nonzero dwoId.

llvm-svn: 247626
2015-09-14 22:10:22 +00:00
Chen Li 0d043b52eb [InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG. It also adds assertions in isKnownNonNull() and isKnownNonNullFromDominatingCondition() to make sure the value checked is pointer type (as defined in LLVM document). These assertions might trip failures in things which are not  covered under llvm/test, but fixes should be pretty obvious. 

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12779

llvm-svn: 247587
2015-09-14 18:10:43 +00:00
Davide Italiano 2c007d050a [llvm-mc] Better error handling in ENOENT case + test.
This is a follow up to r247518.
As a general note, I think we could do a much better job testing for
error conditions in tools. I already anticipated in a previous mail,
but while implementing this I noticed that the code coverage we have 
for error checking is pretty low. I can arbitrarily remove checks from 
several tools and the suite still passes.

Differential Revision:	 http://reviews.llvm.org/D12846

llvm-svn: 247582
2015-09-14 17:10:01 +00:00
Jun Bum Lim 34b9bd0435 Improve ISel using across lane min/max reduction
In vectorized integer min/max reduction code, the final "reduce" step
is sub-optimal. In AArch64, this change wll combine :
  %svn0 = vector_shuffle %0, undef<2,3,u,u>
  %smax0 = smax %0, svn0
  %svn3 = vector_shuffle %smax0, undef<1,u,u,u>
  %sc = setcc %smax0, %svn3, gt
  %n0 = extract_vector_elt %sc, #0
  %n1 = extract_vector_elt %smax0, #0
  %n2 = extract_vector_elt $smax0, #1
  %result = select %n0, %n1, n2
becomes :
  %1 = smaxv %0
  %result = extract_vector_elt %1, 0

This change extends r246790.

llvm-svn: 247575
2015-09-14 16:19:52 +00:00
JF Bastien 26aca14b15 [MergeFuncs] Fix bug in merging GetElementPointers
GetElementPointers must have the first argument's type compared
for structural equivalence. Previously the code erroneously compared the
pointer's type, but this code was dead because all pointer types (of the
same address space) are the same. The pointee must be compared instead
(using the type stored in the GEP, not from the pointer type which will
be erased anyway).

Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers: nlewycky, llvm-commits
Differential revision: http://reviews.llvm.org/D12820

llvm-svn: 247570
2015-09-14 15:37:48 +00:00
John Brawn 056e67865a [ARM] Extract shifts out of multiply-by-constant
Turning (op x (mul y k)) into (op x (lsl (mul y k>>n) n)) is beneficial when
we can do the lsl as a shifted operand and the resulting multiply constant is
simpler to generate.

Do this by doing the transformation when trying to select a shifted operand,
as that ensures that it actually turns out better (the alternative would be to
do it in PreprocessISelDAG, but we don't know for sure there if extracting the
shift would allow a shifted operand to be used).

Differential Revision: http://reviews.llvm.org/D12196

llvm-svn: 247569
2015-09-14 15:19:41 +00:00
NAKAMURA Takumi c397e7881a Revert part of r247553, "[CMake] Reformat CLANG_TEST_DEPS." It was accidental commit.
llvm-svn: 247555
2015-09-14 12:51:01 +00:00
NAKAMURA Takumi 0f1cbee00e [CMake] Reformat CLANG_TEST_DEPS.
llvm-svn: 247553
2015-09-14 12:41:53 +00:00
Simon Pilgrim f8f86ab176 [X86][MMX] Added shuffle decodes for MMX/3DNow! shuffles.
Added shuffle decodes for MMX PUNPCK + PSHUFW shuffles.
Added shuffle decodes for 3DNow! PSWAPD shuffles.

llvm-svn: 247526
2015-09-13 11:28:45 +00:00
Elena Demikhovsky 8671fcbbd6 AVX-512: Fixed a bug in OR/XOR operations for 512-bit FP values on KNL.
KNL does not have VXORPS, VORPS for 512-bit values.
I use integer VPXOR, VPOR that actually do the same.

X86ISD::FXOR/FOR are generated as a result of FSUB combining.

Differential Revision: http://reviews.llvm.org/D12753

llvm-svn: 247523
2015-09-13 08:15:15 +00:00
Sanjay Patel 8b960d22ad [x86] enable machine combiner reassociations for 128-bit vector logical integer insts (2nd try)
The changes in:
test/CodeGen/X86/machine-cp.ll
are just due to scheduling differences after some logic instructions were reassociated.

llvm-svn: 247516
2015-09-12 19:47:50 +00:00
Simon Pilgrim 42c834bad5 [X86] Added i1 vector sextload tests
llvm-svn: 247509
2015-09-12 15:36:41 +00:00
Simon Pilgrim 779bcf3e3d [X86][FMA] Refreshed fma tests
llvm-svn: 247508
2015-09-12 15:33:05 +00:00
Sanjay Patel 99f7370a79 revert r247506; need to verify changes in existing tests
llvm-svn: 247507
2015-09-12 15:27:31 +00:00
Sanjay Patel 08755c7dbc [x86] enable machine combiner reassociations for 128-bit vector logical integer insts
llvm-svn: 247506
2015-09-12 14:58:04 +00:00
Simon Pilgrim 20c607b110 [InstCombine] CVTPH2PS Vector Demanded Elements + Constant Folding
Improved InstCombine support for CVTPH2PS (F16C half 2 float conversion):

<4 x float> @llvm.x86.vcvtph2ps.128(<8 x i16>) - only uses the bottom 4 i16 elements for the conversion.

Added constant folding support.

Differential Revision: http://reviews.llvm.org/D12731

llvm-svn: 247504
2015-09-12 13:39:53 +00:00
Simon Pilgrim 8ee50e9cff [X86][SSE] Use general sext IR for (v)pmovsx stack folding tests
llvm-svn: 247502
2015-09-12 11:45:24 +00:00
Chandler Carruth 29a18a4663 [PM] Port SROA to the new pass manager.
In some ways this is a very boring port to the new pass manager as there
are no interesting analyses or dependencies or other oddities.

However, this does introduce the first good example of a transformation
pass with non-trivial state porting to the new pass manager. I've tried
to carve out patterns here to replicate elsewhere, and would appreciate
comments on whether folks like these patterns:

- A common need in the new pass manager is to effectively lift the pass
  class and some of its state into a public header file. Prior to this,
  LLVM used anonymous namespaces to provide "module private" types and
  utilities, but that doesn't scale to cases where a public header file
  is needed and the new pass manager will exacerbate that. The pattern
  I've adopted here is to use the namespace-cased-name of the core pass
  (what would be a module if we had them) as a module-private namespace.
  Then utility and other code can be declared and defined in this
  namespace. At some point in the future, we could even have
  (conditionally compiled) code that used modules features when
  available to do the same basic thing.

- I've split the actual pass run method in two in order to expose
  a private method usable by the old pass manager to wrap the new class
  with a minimum of duplicated code. I actually looked at a bunch of
  ways to automate or generate these, but they are all quite terrible
  IMO. The fundamental need is to extract the set of analyses which need
  to cross this interface boundary, and that will end up being too
  unpredictable to effectively encapsulate IMO. This is also
  a relatively small amount of boiler plate that will live a relatively
  short time, so I'm not too worried about the fact that it is boiler
  plate.

The rest of the patch is totally boring but results in a massive diff
(sorry). It just moves code around and removes or adds qualifiers to
reflect the new name and nesting structure.

Differential Revision: http://reviews.llvm.org/D12773

llvm-svn: 247501
2015-09-12 09:09:14 +00:00
Davide Italiano 63cee81c3c [MC] Don't crash on division by zero.
Differential Revision:	http://reviews.llvm.org/D12776

llvm-svn: 247471
2015-09-11 20:47:35 +00:00
Yunzhong Gao 46261a74db Add a non-exiting diagnostic handler for LTO.
This is in order to give LTO clients a chance to do some clean-up before
terminating the process.

llvm-svn: 247461
2015-09-11 20:01:53 +00:00
Akira Hatanaka bc497c93f5 Use function attribute "stackrealign" to decide whether stack
realignment should be forced.

With this commit, we can now force stack realignment when doing LTO and
do so on a per-function basis. Also, add a new cl::opt option
"stackrealign" to CommandFlags.h which is used to force stack
realignment via llc's command line.

Out-of-tree projects currently using -force-align-stack to force stack
realignment should make changes to attach the attribute to the functions
in the IR.

Differential Revision: http://reviews.llvm.org/D11814

llvm-svn: 247450
2015-09-11 18:54:38 +00:00
David Majnemer 0e70598a5b [X86] Make sure startproc/endproc are paired
We used different conditions to determine if we should emit startproc vs
endproc.  Use the same condition to ensure that they will always be
paired.

This fixes PR24374.

llvm-svn: 247435
2015-09-11 17:34:34 +00:00
Reid Kleckner 5dbee7baef [IR] Print the label operands of a catchpad like an invoke
The rest of the EH pads are fine, since they have at most one label and
take fewer operands for the personality.

Old catchpad vs. new:
  %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)] to label %__except.ret.10 unwind label %catchendblock.9
-----
  %5 = catchpad [i8* bitcast (i32 ()* @"\01?filt$0@0@main@@" to i8*)]
          to label %__except.ret.10 unwind label %catchendblock.9

llvm-svn: 247433
2015-09-11 17:27:52 +00:00
Daniel Sanders 715f8f1332 [mips] Add missing disassembler tests for MIPS64-MIPS64R5.
llvm-svn: 247422
2015-09-11 16:24:11 +00:00
Daniel Sanders 9676db005e [mips] Add missing MIPS32 - MIPS32R5 disassembler tests.
llvm-svn: 247420
2015-09-11 15:28:19 +00:00
Daniel Sanders aba4daab10 [mips] Attempt to fix llvm-s390x-linux1
It doesn't seem to like the '|&' in the test command.

llvm-svn: 247418
2015-09-11 14:57:54 +00:00
Daniel Sanders 9bbda77006 [mips] Add missing MIPS-IV disassembler tests.
llvm-svn: 247417
2015-09-11 14:54:58 +00:00
Daniel Sanders ff338fa355 [mips] Add missing MIPS-III disassembler tests.
llvm-svn: 247416
2015-09-11 14:48:46 +00:00
Arnaud A. de Grandmaison ba4e977cff Tweak 2 x86 gold tests so they can run on non-x86 platforms
llvm-svn: 247415
2015-09-11 14:45:34 +00:00
Daniel Sanders 3b571d05f8 [mips] Add missing MIPS-II disassembler tests.
These tests were found by llvm-mc-fuzzer (see http://reviews.llvm.org/D12723)
and were verified by checking the disassembler output is accepted by GAS.

llvm-svn: 247414
2015-09-11 14:34:41 +00:00
Daniel Sanders 44c1d0c4f1 Re-commit r247405: [mips] Add missing MIPS-I disassembler tests.
These tests were found by llvm-mc-fuzzer (see http://reviews.llvm.org/D12723)
and verified by checking the disassembler output is accepted by GAS.

The problematic tests from the previous commit have been moved to
valid-xfail.txt for now.

Also, give invalid instructions some coverage. invalid-xfail.txt contains
instructions that should be invalid but successfully disassemble.

llvm-svn: 247407
2015-09-11 12:59:03 +00:00
Daniel Sanders c6a2034d0f Revert r247405: [mips] Add missing MIPS-I disassembler tests.
A small number of the added tests have operands that change on each round trip.

llvm-svn: 247406
2015-09-11 12:42:38 +00:00
Daniel Sanders f05bf32b38 [mips] Add missing MIPS-I disassembler tests.
These tests were found by llvm-mc-fuzzer (see http://reviews.llvm.org/D12723)
and verified by checking the disassembler output is accepted by GAS.

llvm-svn: 247405
2015-09-11 12:24:06 +00:00
NAKAMURA Takumi ecc73e7806 Fix llvm/test/tools/gold/X86/bad-alias.ll.
llvm-svn: 247391
2015-09-11 08:03:17 +00:00
Frederic Riss 29eedc76e1 [dsymutil] Discard useless location attributes.
When cloning the debug info for a function that hasn't been linked,
strip the DIEs from all location attributes that wouldn't contain any
meaningful information anyway.

This kind of situation can happen when a function got discarded by the
linker, but its debug information is still wanted in the final link
because it was marked as required as some other DIE dependency. The easiest
way to get into that situation is to have using directives. They get
linked unconditionally, but their targets might not always be present.

llvm-svn: 247386
2015-09-11 04:17:30 +00:00
David Blaikie 223a634f9d Fix the gold test cases after alias changes
llvm-svn: 247381
2015-09-11 03:28:37 +00:00
David Blaikie 2f40830dde [opaque pointer type] Add textual IR support for explicit type parameter for global aliases
update.py:
import fileinput
import sys
import re

alias_match_prefix = r"(.*(?:=|:|^)\s*(?:external |)(?:(?:private|internal|linkonce|linkonce_odr|weak|weak_odr|common|appending|extern_weak|available_externally) )?(?:default |hidden |protected )?(?:dllimport |dllexport )?(?:unnamed_addr |)(?:thread_local(?:\([a-z]*\))? )?alias"
plain = re.compile(alias_match_prefix + r" (.*?))(| addrspace\(\d+\) *)\*($| *(?:%|@|null|undef|blockaddress|addrspacecast|\[\[[a-zA-Z]|\{\{).*$)")
cast  = re.compile(alias_match_prefix + r") ((?:bitcast|inttoptr|addrspacecast)\s*\(.* to (.*?)(| addrspace\(\d+\) *)\*\)\s*(?:;.*)?$)")
gep   = re.compile(alias_match_prefix + r") ((?:getelementptr)\s*(?:inbounds)?\s*\((?P<type>.*), (?P=type)(?:\s*addrspace\(\d+\)\s*)?\* .*\)\s*(?:;.*)?$)")

def conv(line):
  m = re.match(cast, line)
  if m:
    return m.group(1) + " " + m.group(3) + ", " + m.group(2)
  m = re.match(gep, line)
  if m:
    return m.group(1) + " " + m.group(3) + ", " + m.group(2)
  m = re.match(plain, line)
  if m:
    return m.group(1) + ", " + m.group(2) + m.group(3) + "*" + m.group(4) + "\n"
  return line

for line in sys.stdin:
  sys.stdout.write(conv(line))

apply.sh:
for name in "$@"
do
  python3 `dirname "$0"`/update.py < "$name" > "$name.tmp" && mv "$name.tmp" "$name"
  rm -f "$name.tmp"
done

The actual commands:
From llvm/src:
find test/ -name *.ll | xargs ./apply.sh
From llvm/src/tools/clang:
find test/ -name *.mm -o -name *.m -o -name *.cpp -o -name *.c | xargs -I '{}' ../../apply.sh "{}"
From llvm/src/tools/polly:
find test/ -name *.ll | xargs ./apply.sh

llvm-svn: 247378
2015-09-11 03:22:04 +00:00
Mehdi Amini 2bd08527ff Revert "[InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite"
This reverts commit r247356.

Breaks test/Transforms/InstCombine/pr8547.ll with:

Wrong types for attribute: byval inalloca nest noalias nocapture nonnull readnone readonly sret dereferenceable(1) dereferenceable_or_null(1)
  %call = call i32 (i8*, ...) @printf(i8* getelementptr inbounds ([10 x i8], [10 x i8]* @.str, i64 0, i64 0), i32 nonnull %conv2) #0
LLVM ERROR: Broken function found, compilation aborted!

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 247371
2015-09-11 01:33:48 +00:00
Chen Li a29c612ddd [InstCombineCalls] Use isKnownNonNullAt() to check nullness of passing arguments at callsite
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of passing arguments at callsite. In this way it can handle cases where the argument does not have nonnull attribute but has a dominating null check from the CFG.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12779

llvm-svn: 247356
2015-09-10 23:04:49 +00:00
Chen Li 32a51416e5 [InstCombineCalls] Use isKnownNonNullAt() to check nullness of gc.relocate return value
Summary: This patch replaces isKnownNonNull() with isKnownNonNullAt() when checking nullness of gc.relocate return value. In this way it can handle cases where the relocated value does not have nonnull attribute but has a dominating null check from the CFG.

Reviewers: reames

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12772

llvm-svn: 247353
2015-09-10 22:35:41 +00:00
Reid Kleckner da6dcc5d92 [WinEH] Push and pop EBP for 32-bit funclets
The Win32 EH runtime caller does not preserve EBP, even though it does
preserve the CSRs (EBX, ESI, EDI) for us. The result was that each
finally funclet call would leave the frame pointer off by 12 bytes.

llvm-svn: 247348
2015-09-10 22:00:02 +00:00
James Y Knight 1f3e6af7d0 [SPARC] Switch to the Machine Scheduler.
The (mostly-deprecated) SelectionDAG-based ILPListDAGScheduler scheduler
was making poor scheduling decisions, causing high register pressure and
extraneous register spills.

Switching to the newer machine scheduler generates better code -- even
without there being a machine model defined for SPARC yet.

(Actually committing the test changes too, this time, unlike r247315)

llvm-svn: 247343
2015-09-10 21:49:06 +00:00
Reid Kleckner 7bb20bd69e Fix SEH state numbering algorithm to handle cleanupendpads
WinEHPrepare's new coloring algorithm really expects to see
cleanupendpads now, so Clang will start emitting them soon.

llvm-svn: 247341
2015-09-10 21:46:36 +00:00
Matthew Simpson 29dc0f7075 [LV] Relax Small Size Reduction Type Requirement
This patch enables small size reductions in which the source types are smaller
than the reduction type (e.g., computing an i16 sum from the values in an i8
array). The previous behavior was to only allow small size reductions if the
source types and reduction type were the same. The change accounts for the fact
that the existing sign- and zero-extend instructions in these cases should
still be included in the cost model.

Differential Revision: http://reviews.llvm.org/D12770

llvm-svn: 247337
2015-09-10 21:12:57 +00:00
Lang Hames 21a77ba1f7 [RuntimeDyld] Support non-zero addends for the MachO X86_64 SUBTRACTOR reloc.
This functionality was accidentally left out of r247119.

llvm-svn: 247336
2015-09-10 21:05:58 +00:00
Matthew Simpson ddb4d9741f [SCEV] Consistently Handle Expressions That Cannot Be Divided
This patch addresses the issue of SCEV division asserting on some
input expressions (e.g., non-affine expressions) and quietly giving
up on others.  When giving up, we set the quotient to be equal to
zero and the remainder to be equal to the numerator. With this
patch, we always quietly give up when we cannot perform the
division.

This patch also adds a test case for DependenceAnalysis that
previously caused an assertion.

Differential Revision: http://reviews.llvm.org/D11725

llvm-svn: 247314
2015-09-10 18:12:47 +00:00
JF Bastien fa946233b4 [MergeFuncs] Fix callsite attributes in thunk generation
This change correctly sets the attributes on the callsites
generated in thunks. This makes sure things such as sret, sext, etc.
are correctly set, so that the call can be a proper tailcall.

Also, the transfer of attributes in the replaceDirectCallers function
appears to be unnecessary, but until this is confirmed it will remain.

Author: jrkoenig
Reviewers: dschuff, jfb
Subscribers: llvm-commits, nlewycky
Differential revision: http://reviews.llvm.org/D12581

llvm-svn: 247313
2015-09-10 18:08:35 +00:00
David Blaikie a7970f3cf8 Tidy up some alias syntax to make explicit pointer type migration easier
llvm-svn: 247312
2015-09-10 18:03:45 +00:00
Philip Reames 053701399d [SimplifyCFG] Use known bits to eliminate dead switch defaults
This is a follow up to http://reviews.llvm.org/D11995 implementing the suggestion by Hans.

If we know some of the bits of the value being switched on, we know that the maximum number of unique cases covers the unknown bits. This allows to eliminate switch defaults for large integers (i32) when most bits in the value are known.

Note that I had to make the transform contingent on not having any dead cases. This is conservatively correct with the old code, but required for the new code since we might have a dead case which varies one of the known bits. Counting that towards our number of covering cases would be bad.  If we do have dead cases, we'll eliminate them first, then revisit the possibly dead default.

Differential Revision: http://reviews.llvm.org/D12497

llvm-svn: 247309
2015-09-10 17:44:47 +00:00
Adrian Prantl d209500fd5 Debug Info: Allow a DIModule to appear as the scope of other entities.
llvm-svn: 247304
2015-09-10 17:13:58 +00:00
Joseph Tremoulet f3aff31401 [WinEH] Fix single-block cleanup coloring
Summary:
The coloring code in WinEHPrepare queues cleanuprets' successors with the
correct color (the parent one) when it sees their cleanuppad, and so later
when iterating successors knows to skip processing cleanuprets since
they've already been queued.  This latter check was incorrectly under an
'else' condition and so inadvertently was not kicking in for single-block
cleanups.  This change sinks the check out of the 'else' to fix the bug.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12751

llvm-svn: 247299
2015-09-10 16:51:25 +00:00
Vedant Kumar 1abc48ee58 [Bitcode] Add xfail test for PR24755 (uselistorder)
This test stresses verify-uselistorder. PR24755 is caused by our
ignoring uses when they occur in the function personality slot, the
prologue data slot, or the prefix data slot.

llvm-svn: 247292
2015-09-10 16:02:24 +00:00
Alex Lorenz 0153e59935 Fix PR 24724 - The implicit register verifier shouldn't assume certain operand
order.

The implicit register verifier in the MIR parser should only check if the
instruction's default implicit operands are present in the instruction. It
should not check the order in which they occur.

llvm-svn: 247283
2015-09-10 14:04:34 +00:00
Igor Breger 7f69a99c54 AVX512: Implemented encoding and intrinsics for
vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11802

llvm-svn: 247276
2015-09-10 12:54:54 +00:00
Jakub Kuderski 58ea4eeb9e There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.

Differential Revision: http://reviews.llvm.org/D12520

llvm-svn: 247271
2015-09-10 11:31:20 +00:00
Silviu Baranga df9ce8408a [DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant folding vectors
Summary:
The BUILD_VECTOR node will truncate its operators to match the
type. We need to take this into account when constant folding -
we need to perform a truncation before constant folding the elements.
This is because the upper bits can change the result, depending on
the operation type (for example this is the case for min/max).

This change also adds a regression test.

Reviewers: jmolloy

Subscribers: jmolloy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12697

llvm-svn: 247265
2015-09-10 10:34:34 +00:00
James Molloy 8c995a93ce [ARM] Do not use vtrn for vectorshuffle if the order is reversed
The tests in isVTRNMask and isVTRN_v_undef_Mask should also check that the elements of the upper and lower half of the vectorshuffle occur in the correct order when both halves are used. Without this test the code assumes that it is correct to use vector transpose (vtrn) for the masks <1, 1, 0, 0> and <1, 3, 0, 2>, among others, but the transpose actually incorrectly generates shuffles for <0, 0, 1, 1> and <0, 2, 1, 3> in this case.

Patch by Jeroen Ketema!

llvm-svn: 247254
2015-09-10 08:42:28 +00:00
Chandler Carruth 93d5d3b5db Add a way to skip the Go bindings tests even when Go is configured in
CMake.

The Go bindings tests in an unoptimized build take over 30 seconds for
me, making it the slowest test in 'check-llvm' by a factor of two.

I've only rigged this up fully to the CMake build. If someone is
interested in rigging it up to the autoconf build, they're welcome to do
so.

llvm-svn: 247243
2015-09-10 05:47:43 +00:00
Sanjoy Das f3132d3b03 [ScalarEvolution] Fix PR24757.
Summary:
PR24757 was caused by some incorect math in
`ScalarEvolution::HowFarToZero` -- the smallest unsigned solution for X
in

  2^N * A = 2^N * X

is not necessarily A.

Reviewers: atrick, majnemer, meheff

Subscribers: llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D12721

llvm-svn: 247242
2015-09-10 05:27:38 +00:00
Kit Barton d3b904d440 Enable the shrink wrapping optimization for PPC64.
The changes in this patch are as follows:
  1. Modify the emitPrologue and emitEpilogue methods to work properly when the prologue and epilogue blocks are not the first/last blocks in the function
  2. Fix a bug in PPCEarlyReturn optimization caused by an empty entry block in the function
  3. Override the runShrinkWrap PredicateFtor (defined in TargetMachine) to check whether shrink wrapping should run:
      Shrink wrapping will run on PPC64 (Little Endian and Big Endian) unless -enable-shrink-wrap=false is specified on command line

A new test case, ppc-shrink-wrapping.ll was created based on the existing shrink wrapping tests for x86, arm, and arm64.

Phabricator review: http://reviews.llvm.org/D11817

llvm-svn: 247237
2015-09-10 01:55:44 +00:00
Ahmed Bougacha 05541459fa [AArch64] Match FI+offset in STNP addressing mode.
First, we need to teach isFrameOffsetLegal about STNP.
It already knew about the STP/LDP variants, but those were probably
never exercised, because it's only the load/store optimizer that
generates STP/LDP, and the only user of the method is frame lowering,
which runs earlier.
The STP/LDP cases were wrong: they didn't take into account the fact
that they return two results, not one, so the immediate offset will be
the 4th operand, not the 3rd.

Follow-up to r247234.

llvm-svn: 247236
2015-09-10 01:54:43 +00:00
Davide Italiano ddedd7255a [MC] Convert all the remaining tests from macho-dump to llvm-readobj.
This sort-of deprecates macho-dump. It may take still a little while
to garbage collect it, but at least there's no real usage of it in
the tree anymore. New tests should always rely on llvm-readobj or
llvm-objdump.

llvm-svn: 247235
2015-09-10 01:50:00 +00:00
Ahmed Bougacha c0ac38d584 [AArch64] Match base+offset in STNP addressing mode.
Followup to r247231.

llvm-svn: 247234
2015-09-10 01:48:29 +00:00
Ahmed Bougacha b8886b517d [AArch64] Support selecting STNP.
We could go through the load/store optimizer and match STNP where
we would have matched a nontemporal-annotated STP, but that's not
reliable enough, as an opportunistic optimization.
Insetad, we can guarantee emitting STNP, by matching them at ISel.
Since there are no single-input nontemporal stores, we have to
resort to some high-bits-extracting trickery to generate an STNP
from a plain store.

Also, we need to support another, LDP/STP-specific addressing mode,
base + signed scaled 7-bit immediate offset.
For now, only match the base. Let's make it smart separately.

Part of PR24086.

llvm-svn: 247231
2015-09-10 01:42:28 +00:00
Reid Kleckner 7878391208 [WinEH] Add codegen support for cleanuppad and cleanupret
All of the complexity is in cleanupret, and it mostly follows the same
codepaths as catchret, except it doesn't take a return value in RAX.

This small example now compiles and executes successfully on win32:
  extern "C" int printf(const char *, ...) noexcept;
  struct Dtor {
    ~Dtor() { printf("~Dtor\n"); }
  };
  void has_cleanup() {
    Dtor o;
    throw 42;
  }
  int main() {
    try {
      has_cleanup();
    } catch (int) {
      printf("caught it\n");
    }
  }

Don't try to put the cleanup in the same function as the catch, or Bad
Things will happen.

llvm-svn: 247219
2015-09-10 00:25:23 +00:00
Philip Reames 6628713f4f [RewriteStatepointsForGC] Extend base pointer inference to handle insertelement
This change is simply enhancing the existing inference algorithm to handle insertelement instructions by conservatively inserting a new instruction to propagate the vector of associated base pointers. In the process, I'm ripping out the peephole optimizations which mostly helped cover the fact this hadn't been done.

Note that most of the newly inserted nodes will be nearly immediately removed by the post insertion optimization pass introduced in 246718. Arguably, we should be trying harder to avoid the malloc traffic here, but I'd rather get the code correct, then worry about compile time.

Unlike previous extensions of the algorithm to handle more case, I discovered the existing code was causing miscompiles in some cases. In particular, we had an implicit assumption that the peephole covered *all* insert element instructions, so if we had a value directly based on a insert element the peephole didn't cover, we proceeded as if it were a base anyways. Not good. I believe we had the same issue with shufflevector which is why I adjusted the predicate for them as well.

Differential Revision: http://reviews.llvm.org/D12583

llvm-svn: 247210
2015-09-09 23:40:12 +00:00
Peter Collingbourne 1cbc91eccf LowerBitSets: Fix non-determinism bug.
Visit disjoint sets in a deterministic order based on the maximum BitSetNM
index, otherwise the order in which we visit them will depend on pointer
comparisons. This was being exposed by MSan.

llvm-svn: 247201
2015-09-09 22:30:32 +00:00
Reid Kleckner 94b704c469 [SEH] Emit 32-bit SEH tables for the new EH IR
The 32-bit tables don't actually contain PC range data, so emitting them
is incredibly simple.

The 64-bit tables, on the other hand, use the same table for state
numbering as well as label ranges. This makes things more difficult, so
it will be implemented later.

llvm-svn: 247192
2015-09-09 21:10:03 +00:00
Dan Gohman 5e0668426c [WebAssembly] Update target datalayout strings.
llvm-svn: 247187
2015-09-09 20:54:31 +00:00
Piotr Padlewski 0dde00d239 ScalarEvolution assume hanging bugfix
http://reviews.llvm.org/D12719

llvm-svn: 247184
2015-09-09 20:47:30 +00:00
David Majnemer d34dbf07bd Revert trunc(lshr (sext A), Cst) to ashr A, Cst
This reverts commit r246997, it introduced a regression (PR24763).

llvm-svn: 247180
2015-09-09 20:20:08 +00:00
Renato Golin db7ea86bf4 Revert "AVX512: Implemented encoding and intrinsics for vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4 Added tests for intrinsics and encoding."
This reverts commit r247149, as it was breaking numerous buildbots of varied architectures.

llvm-svn: 247177
2015-09-09 19:44:40 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Dan Gohman f71abef701 [WebAssembly] Implement calls with void return types.
llvm-svn: 247158
2015-09-09 16:13:47 +00:00
Tom Stellard 9a197676b1 AMDGPU/SI: Fold operands through REG_SEQUENCE instructions
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12256

llvm-svn: 247157
2015-09-09 15:43:26 +00:00
Silviu Baranga a3e27edb5d [CostModel][AArch64] Remove amortization factor for some of the vector select instructions
Summary:
We are not scalarizing the wide selects in codegen for i16 and i32 and
therefore we can remove the amortization factor. We still have issues
with i64 vectors in codegen though.

Reviewers: mcrosier

Subscribers: mcrosier, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12724

llvm-svn: 247156
2015-09-09 15:35:02 +00:00
Igor Breger ac29a82921 AVX512: Implemented encoding and intrinsics for
vextracti64x4 ,vextracti64x2, vextracti32x8, vextracti32x4, vextractf64x4, vextractf64x2, vextractf32x8, vextractf32x4
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11802

llvm-svn: 247149
2015-09-09 14:35:09 +00:00
Zoran Jovanovic 6b28f09d67 [mips][microMIPS] Implement ADDU16, AND16, ANDI16, NOT16, OR16, SLL16 and SRL16 instructions
Differential Revision: http://reviews.llvm.org/D11178

llvm-svn: 247146
2015-09-09 13:55:45 +00:00
Alex Lorenz b9a68dbcae Fix PR 24633 - Handle undef values when parsing standalone constants.
llvm-svn: 247145
2015-09-09 13:44:33 +00:00
James Molloy 89eccee4db Delay predication of stores until near the end of vector code generation
Predicating stores requires creating extra blocks. It's much cleaner if we do this in one pass instead of mutating the CFG while writing vector instructions.

Besides which we can make use of helper functions to update domtree for us, reducing the work we need to do.

llvm-svn: 247139
2015-09-09 12:51:06 +00:00
Daniel Sanders 2038747fce Fix vector splitting for extract_vector_elt and vector elements of <8-bits.
Summary:
One of the vector splitting paths for extract_vector_elt tries to lower:
    define i1 @via_stack_bug(i8 signext %idx) {
      %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx
      ret i1 %1
    }
to:
    define i1 @via_stack_bug(i8 signext %idx) {
      %base = alloca <2 x i1>
      store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base
      %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx
      %3 = load i1, i1* %2
      ret i1 %3
    }
However, the elements of <2 x i1> are not byte-addressible. The result of this
is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies
to '%base + %idx * 0', and then simply '%base' causing all values of %idx to
extract element zero.

This commit fixes this by promoting the vector elements of <8-bits to i8 before
splitting the vector.

This fixes a number of test failures in pocl.

Reviewers: pekka.jaaskelainen

Subscribers: pekka.jaaskelainen, llvm-commits

Differential Revision: http://reviews.llvm.org/D12591

llvm-svn: 247128
2015-09-09 09:53:20 +00:00
Zoran Jovanovic d9790793d6 [mips][microMIPS] Implement CACHEE and PREFE instructions
Differential Revision: http://reviews.llvm.org/D11628

llvm-svn: 247125
2015-09-09 09:10:46 +00:00
Lang Hames 856e4767ff [RuntimeDyld] Add support for MachO x86_64 SUBTRACTOR relocation.
llvm-svn: 247119
2015-09-09 03:14:29 +00:00
Dan Gohman e590b33bf8 [WebAssembly] Fix lowering of calls with more than one argument.
llvm-svn: 247118
2015-09-09 01:52:45 +00:00
Matt Arsenault acd68b58ae SelectionDAG: Support Expand of f16 extloads
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.

This moves a hack out of SI's argument lowering and
is covered by existing tests.

llvm-svn: 247113
2015-09-09 01:12:27 +00:00
Dan Gohman 4f52e00ecb [WebAssembly] Implement WebAssemblyInstrInfo::copyPhysReg
llvm-svn: 247110
2015-09-09 00:52:47 +00:00
Davide Italiano 9a429b766f [llvm-readobj] MachO -- dump LinkerOptions load command.
Example output:

Linker Options {
  Size: 32
  Count: 2
  Strings [
    Value: -framework
    Value: Cocoa
  ]
}

There were only two tests using this -- so I converted them as part of
this commit rather than separately.

Differential Revision:	 http://reviews.llvm.org/D12702

llvm-svn: 247106
2015-09-09 00:21:18 +00:00
Reid Kleckner 51189f0a1d [WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code
Typically these are catchpads, which hold data used to decide whether to
catch the exception or continue unwinding. We also shouldn't create MBBs
for catchendpads, cleanupendpads, or terminatepads, since no real code
can live in them.

This fixes a problem where MI passes (like the register allocator) would
try to put code into catchpad blocks, which are not executed by the
runtime. In the new world, blocks ending in invokes now have many
possible successors.

llvm-svn: 247102
2015-09-08 23:28:38 +00:00
Peter Collingbourne 8d24ae9441 Re-apply r247080 with order of evaluation fix.
llvm-svn: 247095
2015-09-08 22:49:35 +00:00
Reid Kleckner df1295173f [WinEH] Emit prologues and epilogues for funclets
Summary:
32-bit funclets have short prologues that allocate enough stack for the
largest call in the whole function. The runtime saves CSRs for the
funclet. It doesn't restore CSRs after we finally transfer control back
to the parent funciton via a CATCHRET, but that's a separate issue.
32-bit funclets also have to adjust the incoming EBP value, which is
what llvm.x86.seh.recoverframe does in the old model.

64-bit funclets need to spill CSRs as normal. For simplicity, this just
spills the same set of CSRs as the parent function, rather than trying
to compute different CSR sets for the parent function and each funclet.
64-bit funclets also allocate enough stack space for the largest
outgoing call frame, like 32-bit.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12546

llvm-svn: 247092
2015-09-08 22:44:41 +00:00
Peter Collingbourne 07f3af2e82 Revert r247080, "LowerBitSets: Extend pass to support functions as bitset
members." as it causes test failures on a number of bots.

llvm-svn: 247088
2015-09-08 22:33:23 +00:00
Vedant Kumar 9ebd49a4cf [Bitcode] Add compatibility tests for new instructions
Adds basic compatibility tests for the following instructions:

  catchpad, catchendpad, cleanuppad, cleanupendpad, terminatepad,
  cleanupret, catchret

llvm-svn: 247087
2015-09-08 22:33:23 +00:00
Eric Christopher 71f6e2f568 Fix the PPC CTR Loop pass to look for calls to the intrinsics that
read CTR and count them as reading the CTR.

llvm-svn: 247083
2015-09-08 22:14:58 +00:00
Peter Collingbourne c634ed0b1a LowerBitSets: Extend pass to support functions as bitset members.
This change extends the bitset lowering pass to support bitsets that may
contain either functions or global variables. A function bitset is lowered to
a jump table that is laid out before one of the functions in the bitset.

Also add support for non-string bitset identifier names. This allows for
distinct metadata nodes to stand in for names with internal linkage,
as done in D11857.

Differential Revision: http://reviews.llvm.org/D11856

llvm-svn: 247080
2015-09-08 21:57:45 +00:00
Matt Arsenault 86d336e91b AMDGPU/SI: Fix input vcc operand for VOP2b instructions
Adds vcc to output string input for e32. Allows option
of using e64 encoding with assembler.

Also fixes these instructions not implicitly reading exec.

llvm-svn: 247074
2015-09-08 21:15:00 +00:00
Derek Schuff 45c832c5d8 Fix comments and RUN line in x86-64 stdarg test leftover from last commit
From http://reviews.llvm.org/D12346

llvm-svn: 247070
2015-09-08 20:58:41 +00:00
Derek Schuff eef533f422 x32. Fixes a bug in how struct va_list is initialized in x32
Summary: This patch modifies X86TargetLowering::LowerVASTART so that
struct va_list is initialized with 32 bit pointers in x32. It also
includes tests that call @llvm.va_start() for x32.

Patch by João Porto

Subscribers: llvm-commits, hjl.tools
Differential Revision: http://reviews.llvm.org/D12346

llvm-svn: 247069
2015-09-08 20:51:31 +00:00
Derek Schuff ee4e947e23 x32. Fixes a bug in i8mem_NOREX declaration.
The old implementation assumed LP64 which is broken for x32.  Specifically, the
MOVE8rm_NOREX and MOVE8mr_NOREX, when selected, would cause a 'Cannot emit
physreg copy instruction' error message to be reported.

This patch also enable the h-register*ll tests for x32.

Differential Revision: http://reviews.llvm.org/D12336

Patch by João Porto

llvm-svn: 247058
2015-09-08 19:47:15 +00:00
Matt Arsenault 966a94f861 AMDGPU: Handle sub of constant for DS offset folding
sub C, x - > add (sub 0, x), C for DS offsets.

This is mostly to fix regressions that show up when
SeparateConstOffsetFromGEP is enabled.

llvm-svn: 247054
2015-09-08 19:34:22 +00:00
Diego Novillo f9aa39b0cf Fix PR 24723 - Handle 0-mass backedges in irreducible loops
This corner case happens when we have an irreducible SCC that is
deeply nested.  As we work down the tree, the backedge masses start
getting smaller and smaller until we reach one that is down to 0.

Since we distribute the incoming mass using the backedge masses as
weight, the distributor does not allow zero weights.  So, we simply
ignore them (which will just use the weights of the non-zero nodes).

llvm-svn: 247050
2015-09-08 19:22:17 +00:00
Davide Italiano cb2da7166a [MC/ELF] Accept zero for .align directive
.align directive refuses alignment 0 -- a comment in the code hints this is
done for GNU as compatibility, but it seems GNU as accepts .align 0
(and silently rounds up alignment to 1).

Differential Revision:	 http://reviews.llvm.org/D12682

llvm-svn: 247048
2015-09-08 18:59:47 +00:00
David Blaikie 12dd3c4ebb Fix CPP Backend for GEP API changes for opaque pointer types
Based on a patch by Jerome Witmann.

llvm-svn: 247047
2015-09-08 18:42:29 +00:00
Benjamin Kramer c3c183554b Merge or combine tests and convert to FileCheck.
- Move tests only exercising instsimplify to instsimplify's apint-or.ll
- Actually test the CHECK lines in instsimplify's apint-or.ll
- Merge the remaining tests in apint-or1.ll and apint-or2.ll, use FileCheck

llvm-svn: 247045
2015-09-08 18:36:56 +00:00
Sanjay Patel 21a145c341 add tests for De Morgan instcombines based on PR22723
llvm-svn: 247040
2015-09-08 18:13:03 +00:00
Sanjay Patel 35f673ef08 fix typos, remove noise; NFCI
llvm-svn: 247035
2015-09-08 17:58:22 +00:00
Vedant Kumar 52cd4eecac [Bitcode] Add compatibility test for llvm 3.7.0
This patch adds llvm-3.7 IR and generated bitcode for our compatibility
test (in accordance with the developer policy).

llvm-svn: 247031
2015-09-08 17:39:21 +00:00
JF Bastien 749ed88aa5 WebAssembly: NFC rename shr/sar
Renamed from: https://github.com/WebAssembly/design/pull/332

llvm-svn: 247028
2015-09-08 17:21:21 +00:00
Zoran Jovanovic 2da1437d62 [mips][microMIPS] Implement LLE, LUI, LW and LWE instructions
Differential Revision: http://reviews.llvm.org/D1179

llvm-svn: 247017
2015-09-08 15:02:50 +00:00
Dan Gohman d4a12d25ff [WebAssembly] Temporarily disable this test, as it depends on additional patches that aren't yet checked in.
llvm-svn: 247011
2015-09-08 13:21:12 +00:00
Igor Breger a54a1a84dd AVX512: kunpck encoding implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D12061

llvm-svn: 247010
2015-09-08 13:10:00 +00:00
Dan Gohman 25d2a0dda4 [WebAssembly] Enable SSA lowering and other pre-regalloc passes
llvm-svn: 247008
2015-09-08 12:39:25 +00:00
Zoran Jovanovic 9eaa30d2bf [mips][microMIPS] Implement SB, SBE, SCE, SH and SHE instructions
Differential Revision: http://reviews.llvm.org/D11801

llvm-svn: 246999
2015-09-08 10:18:38 +00:00
Jakub Kuderski 7cd4810021 There is a trunc(lshr (zext A), Cst) optimization in InstCombineCasts that
removes cast by performing the lshr on smaller types. However, currently there
is no trunc(lshr (sext A), Cst) variant.
This patch add such optimization by transforming trunc(lshr (sext A), Cst)
to ashr A, Cst.

Differential Revision: http://reviews.llvm.org/D12520

llvm-svn: 246997
2015-09-08 10:03:17 +00:00
Daniel Sanders 808dfb8ba7 [mips] Reserve address spaces 1-255 for software use.
Summary: And define them to have noop casts with address spaces 0-255.

Reviewers: pekka.jaaskelainen

Subscribers: pekka.jaaskelainen, llvm-commits

Differential Revision: http://reviews.llvm.org/D12678

llvm-svn: 246990
2015-09-08 09:07:03 +00:00
Zoran Jovanovic 68be5f21a9 [mips][microMIPS] Add microMIPS32r6 and microMIPS64r6 tests for existing 16-bit LBU16, LHU16, LW16, LWGP and LWSP instructions
Differential Revision: http://reviews.llvm.org/D10956

llvm-svn: 246987
2015-09-08 08:25:34 +00:00
Elena Demikhovsky e88038f235 AVX-512: Lowering for 512-bit vector shuffles.
Vector types: <8 x 64>, <16 x 32>, <32 x 16> float and integer.

Differential Revision: http://reviews.llvm.org/D10683

llvm-svn: 246981
2015-09-08 06:38:21 +00:00
Sanjay Patel de28573e49 add missing regression tests for De Morgan's Law transform in InstCombine
llvm-svn: 246973
2015-09-07 19:00:38 +00:00
Zoran Jovanovic 7b85682541 [mips][microMIPS] Implement ABS.fmt, CEIL.L.fmt, CEIL.W.fmt, FLOOR.L.fmt, FLOOR.W.fmt, TRUNC.L.fmt, TRUNC.W.fmt, RSQRT.fmt and SQRT.fmt instructions
Differential Revision: http://reviews.llvm.org/D11674

llvm-svn: 246968
2015-09-07 13:01:04 +00:00
Zoran Jovanovic ada7091812 [mips][microMIPS] Implement BC16, BEQZC16 and BNEZC16 instructions
Differential Revision: http://reviews.llvm.org/D11181

llvm-svn: 246963
2015-09-07 11:56:37 +00:00
Zoran Jovanovic 14f308e44f [mips][microMIPS] Implement CVT.D.fmt, CVT.L.fmt, CVT.S.fmt, CVT.W.fmt, MAX.fmt, MIN.fmt, MAXA.fmt, MINA.fmt and CMP.condn.fmt instructions
Differential Revision: http://reviews.llvm.org/D12141

llvm-svn: 246960
2015-09-07 10:31:31 +00:00
Simon Pilgrim 15fc134865 [X86][MMX] Added missing stack folding tests for MMX/SSSE3 instructions
llvm-svn: 246949
2015-09-06 17:50:15 +00:00
Simon Pilgrim b38c09d7ff [X86][AVX512] Added 512-bit vector shift tests.
Only works for avx512f (dq) targets so far - need to add avx512bw tests once char/short shifts are supported.

llvm-svn: 246943
2015-09-06 13:36:32 +00:00
David Majnemer 135ca40a7d [InstCombine] Don't divide by zero when evaluating a potential transform
Trivial multiplication by zero may survive the worklist.  We tried to
reassociate the multiplication with a division instruction, causing us
to divide by zero; bail out instead.

This fixes PR24726.

llvm-svn: 246939
2015-09-06 06:49:59 +00:00
Hal Finkel 10aac5fd0e [SelectionDAG] Swap commutative binops before constant-based folding
In searching for a fix for the underlying code-quality bug highlighted by
r246937 (that SDAG simplification can lead to us generating an ISD::OR node
with a constant zero LHS), I ran across this:

We generically canonicalize commutative binary-operation nodes in SDAG getNode
so that, if only one operand is a constant, it will be on the RHS.  However, we
were doing this only after a bunch of constant-based simplification checks that
all assume this canonical form (that any constant will be on the RHS). Moving
the operand-swapping canonicalization prior to these checks seems like the
right thing to do (and, as it turns out, causes SDAG to completely fold away the
computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine
would do).

llvm-svn: 246938
2015-09-06 05:42:13 +00:00
Hal Finkel ccf9259c00 [PowerPC] Don't commute trivial rlwimi instructions
To commute a trivial rlwimi instructions (meaning one with a full mask and zero
shift), we'd need to ability to form an all-zero mask (instead of an all-one
mask) using rlwimi. We can't represent this, however, and we'll miscompile code
if we try.

The code quality problem that this highlights (that SDAG simplification can
lead to us generating an ISD::OR node with a constant zero LHS) will be fixed
as a follow-up.

Fixes PR24719.

llvm-svn: 246937
2015-09-06 04:17:30 +00:00
David Majnemer daa24b9789 [InstCombine] Don't assume m_Mul gives back an Instruction
This fixes PR24713.

llvm-svn: 246933
2015-09-05 20:44:56 +00:00
Simon Pilgrim a1ceba8ab4 [X86] Updated vector lzcnt tests. Added missing vec512 tests.
llvm-svn: 246927
2015-09-05 11:56:30 +00:00
Simon Pilgrim 5cce4381ab [X86] Updated vector tzcnt tests. Added vec512 tests.
llvm-svn: 246922
2015-09-05 10:19:07 +00:00
Simon Pilgrim ba8ae5a814 [X86] Updated vector popcnt tests. Added vec512 tests.
llvm-svn: 246921
2015-09-05 09:59:59 +00:00
Zoran Jovanovic 89ca2b982e [mips][microMIPS] Implement ADD.fmt, SUB.fmt, MOV.fmt, MUL.fmt, DIV.fmt, MADDF.fmt, MSUBF.fmt and NEG.fmt instructions
Differential Revision: http://reviews.llvm.org/D11978

llvm-svn: 246919
2015-09-05 09:25:30 +00:00
Davide Italiano 591b5a5e72 [MC] Convert other MachO tests from macho-dump to llvm-readobj.
This commit accomplish two goals:
1) it's a step forward to deprecate macho-dump, now less than 40 tests
rely on it.

2) It tests all the MachO specific features introduced in llvm-readobj in
the following commits:  r246789, r246665, r246474.

While the conversion is mostly mechanical (I double-checked all the
tests output one by one, but still), a post-commit review is greatly
appreciated.

llvm-svn: 246904
2015-09-05 01:02:05 +00:00
Hal Finkel b1518d6c24 [PowerPC] Fix and(or(x, c1), c2) -> rlwimi generation
PPCISelDAGToDAG has a transformation that generates a rlwimi instruction from
an input pattern that looks like this:

  and(or(x, c1), c2)

but the associated logic does not work if there are bits that are 1 in c1 but 0
in c2 (these are normally canonicalized away, but that can't happen if the 'or'
has other users. Make sure we abort the transformation if such bits are
discovered.

Fixes PR24704.

llvm-svn: 246900
2015-09-05 00:02:59 +00:00
Andrew Kaylor a89baa21b8 Fixing bad test syntax.
llvm-svn: 246897
2015-09-04 23:47:34 +00:00
Andrew Kaylor 50e4e86c26 [WinEH] Teach SimplfyCFG to eliminate empty cleanup pads.
Differential Revision: http://reviews.llvm.org/D12434

llvm-svn: 246896
2015-09-04 23:39:40 +00:00
Simon Pilgrim 0fccef6ac2 [X86][AVX] Test tidyup + regeneration. NFCI.
llvm-svn: 246863
2015-09-04 19:47:56 +00:00
Peter Collingbourne 07dcb58f8a Add powerpc64 to parallel.ll unsupported architecture list.
llvm-svn: 246862
2015-09-04 19:45:36 +00:00
Silviu Baranga 44077da1b7 Simplify testcase added in r246759. NFC
llvm-svn: 246848
2015-09-04 11:37:20 +00:00
Steven Wu 332eeca1e4 Fix the testcase in r246790
Using generic neon syntax to avoid test failure on apple platforms.

llvm-svn: 246833
2015-09-04 01:39:24 +00:00
Hal Finkel 4a7be23976 [PowerPC] Enable interleaved-access vectorization
This adds a basic cost model for interleaved-access vectorization (and a better
default for shuffles), and enables interleaved-access vectorization by default.
The relevant difference from the default cost model for interleaved-access
vectorization, is that on PPC, the shuffles that end up being used are *much*
cheaper than modeling the process with insert/extract pairs (which are
quite expensive, especially on older cores).

llvm-svn: 246824
2015-09-04 00:10:41 +00:00
Hal Finkel 75afa2b6b6 [PowerPC] Always use aggressive interleaving on the A2
On the A2, with an eye toward QPX unaligned-load merging, we should always use
aggressive interleaving. It is generally superior to only using concatenation
unrolling.

llvm-svn: 246819
2015-09-03 23:23:00 +00:00
Hal Finkel e6702ca0e2 [PowerPC] Try harder to find a base+offset when looking for consecutive accesses
When forming permutation-based unaligned vector loads, we need to know whether
it is valid to read ahead of the requested address by a full vector length.
Doing so is more efficient (and allows for more CSE with later loads), but
could trigger a page fault if invalid. To determine validity, we look for other
loads in the same block that access the relevant address range.

The relevant point here is that we need to do this as part of the process of
forming permutation-based vector loads, and this happens quite early in the
SDAG pipeline - specifically before many of the address calculations are fully
canonicalized. As a result, we need to try harder to recognize base+offset
address computations, because they still might appear as chain of adds
(base+offset+offset, for example). To account for this, we'll look through
chains of adds, accumulating the constant offsets.

llvm-svn: 246813
2015-09-03 22:37:44 +00:00
Sanjoy Das 88d0fdeb00 [IR] Have AttrBuilder::clear clear `TargetDepAttrs`.
Test case attached -- currently the parser smears the "foo bar" to all
of the formal arguments.

llvm-svn: 246812
2015-09-03 22:27:42 +00:00
Hal Finkel f11bc761d8 [PowerPC] Include the permutation cost for unaligned vector loads
Pre-P8, when we generate code for unaligned vector loads (for Altivec and QPX
types), even when accounting for the combining that takes place for multiple
consecutive such loads, there is at least one load instructions and one
permutation for each load. Make sure the cost reported reflects the cost of the
permutes as well.

llvm-svn: 246807
2015-09-03 21:23:18 +00:00
Hal Finkel 99d95328d6 [PowerPC] Compute the MMO offset for an unaligned load with signed arithmetic
If you compute the MMO offset using unsigned arithmetic, you end up with a
large positive offset instead of a small negative one. In theory, this could
cause bad instruction-scheduling decisions later.

I noticed this by inspection from the debug output, and using that for the
regression test is the best I can do right now.

llvm-svn: 246805
2015-09-03 21:12:15 +00:00
Reid Kleckner df52337bfc [sancov] Disable sanitizer coverage on functions using SEH
Splitting basic blocks really messes up WinEHPrepare. We can remove this
change when SEH uses the new EH IR.

llvm-svn: 246799
2015-09-03 20:18:29 +00:00
Chad Rosier 6c36eff1d6 [AArch64] Improve ISel using across lane addition reduction.
In vectorized add reduction code, the final "reduce" step is sub-optimal.
This change wll combine :

ext  v1.16b, v0.16b, v0.16b, #8
add  v0.4s, v1.4s, v0.4s
dup  v1.4s, v0.s[1]
add  v0.4s, v1.4s, v0.4s

into

addv s0, v0.4s

PR21371
http://reviews.llvm.org/D12325
Patch by Jun Bum Lim <junbuml@codeaurora.org>!

llvm-svn: 246790
2015-09-03 18:13:57 +00:00
Karl Schimpf 7772978ccf Allow global address space forward decls using IDs in .ll files.
Summary:
This fixes bugzilla bug 24656. Fixes the case where there is a forward
reference to a global variable using an ID (i.e. @0). It does this by
passing the address space of the initializer pointer for which the
forward referenced global is used.

llvm-svn: 246788
2015-09-03 18:06:44 +00:00
Quentin Colombet 10ec8c4ca9 [ARM] Add a test case for revision 243956.
llvm-svn: 246785
2015-09-03 16:49:18 +00:00
Chad Rosier 08ef462d15 Revert "[AArch64] Improve load/store optimizer to handle LDUR + LDR."
This reverts commit r246769.

This appears to have broken Multisource/Benchmarks/tramp3d-v4.

llvm-svn: 246782
2015-09-03 16:41:28 +00:00
Sanjay Patel c9ae9d72f8 [x86] enable machine combiner reassociations for scalar 'xor' insts
llvm-svn: 246781
2015-09-03 16:36:16 +00:00
Karl Schimpf 44876c535f Fix assertion failure in LLParser::ConvertValIDToValue
Summary:
Fixes bug 24645. Problem appears to be that the type may be undefined
when ConvertValIDToValue is called.

Reviewers: kcc

Subscribers: llvm-commits
llvm-svn: 246779
2015-09-03 16:18:32 +00:00
Karl Schimpf e564fcb165 Remove binary characters from test file.
llvm-svn: 246775
2015-09-03 15:41:38 +00:00
Karl Schimpf f04a5d5978 Fix SEGV in InlineAsm::ConstraintInfo::Parse.
Summary:
Fixes bug 24646. Previous code was not checking if an index into a vector
was valid, resulting in a SEGV. Fixed by assuming the construct can't
be parsed when given this input.

Reformat and add test.

Differential Revision: http://reviews.llvm.org/D12539

llvm-svn: 246774
2015-09-03 15:41:37 +00:00
Sanjay Patel ce74db9d8d check for fastness before merging in DAGCombiner::MergeConsecutiveStores()
Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time
we have a merged access candidate. Without this patch, we were generating unaligned 
16-byte (SSE) memops for x86 targets where those accesses are slow.

This change was mentioned in:
http://reviews.llvm.org/D10662 and
http://reviews.llvm.org/D10905

and will help solve PR21711.

Differential Revision: http://reviews.llvm.org/D12573

llvm-svn: 246771
2015-09-03 15:03:19 +00:00
Chad Rosier 491a1bd998 [AArch64] Improve load/store optimizer to handle LDUR + LDR.
This patch allows the mixing of scaled and unscaled load/stores to form
load/store pairs.

PR24465
http://reviews.llvm.org/D12116
Many thanks to Ahmed and Michael for fixes and code review.

llvm-svn: 246769
2015-09-03 14:41:37 +00:00
Daniel Sanders 3ebcaf6685 [mips] Added support for the div, divu, ddiv and ddivu macros which use traps and breaks in the integrated assembler.
Summary:

Patch by Scott Egerton

Reviewers: vkalintiris, dsanders

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11675

llvm-svn: 246763
2015-09-03 12:31:22 +00:00
Silviu Baranga d0f83d15a3 Fix IRBuilder CreateBitOrPointerCast for vector types
Summary:
This function was not taking into account that the
input type could be a vector, and wasn't properly
working for vector types.

This caused an assert when building spec2k6 perlbmk for armv8.

Reviewers: rengolin, mzolotukhin

Subscribers: silviu.baranga, mzolotukhin, rengolin, eugenis, jmolloy, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D12559

llvm-svn: 246759
2015-09-03 11:36:39 +00:00
Joseph Tremoulet 61efbc32a6 [WinEH] Add llvm.eh.exceptionpointer intrinsic
Summary:
This intrinsic can be used to extract a pointer to the exception caught by
a given catchpad.  Its argument has token type and must be a `catchpad`.

Also clarify ExtendingLLVM documentation regarding overloaded intrinsics.


Reviewers: majnemer, andrew.w.kaylor, sanjoy, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12533

llvm-svn: 246752
2015-09-03 09:15:32 +00:00
Joseph Tremoulet 9ce71f76b9 [WinEH] Add cleanupendpad instruction
Summary:
Add a `cleanupendpad` instruction, used to mark exceptional exits out of
cleanups (for languages/targets that can abort a cleanup with another
exception).  The `cleanupendpad` instruction is similar to the `catchendpad`
instruction in that it is an EH pad which is the target of unwind edges in
the handler and which itself has an unwind edge to the next EH action.
The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad`
argument indicating which cleanup it exits.  The unwind successors of a
`cleanuppad`'s `cleanupendpad`s must agree with each other and with its
`cleanupret`s.

Update WinEHPrepare (and docs/tests) to accomodate `cleanupendpad`.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12433

llvm-svn: 246751
2015-09-03 09:09:43 +00:00
Igor Breger 0dcd8bcf24 AVX512: Implemented encoding and intrinsics for vplzcntq, vplzcntd, vpconflictq, vpconflictd
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11931

llvm-svn: 246750
2015-09-03 09:05:31 +00:00
NAKAMURA Takumi 3c0f9d8170 Tweak llvm/test/tools/gold/X86/parallel.ll to run with pthread-unaware ld.gold on Linux.
If ld.gold is configured without --enable-thread, ld.gold might not load libpthread.so.
Preloading LLVMgold.so loads also libpthread.so.

llvm-svn: 246739
2015-09-03 00:48:59 +00:00
Ahmed Bougacha b03ea02479 [X86] Require 32-byte alignment for 32-byte VMOVNTs.
We used to accept (and even test, and generate) 16-byte alignment
for 32-byte nontemporal stores, but they require 32-byte alignment,
per SDM. Found by inspection.

Instead of hardcoding 16 in the patfrag, check for natural alignment.
Also fix the autoupgrade and the various tests.

Also, use explicit -mattr instead of -mcpu: I stared at the output
several minutes wondering why I get 2x movntps for the unaligned
case (which is the ideal output, but needs some work: see FIXME),
until I remembered corei7-avx implies +slow-unaligned-mem-32.

llvm-svn: 246733
2015-09-02 23:25:39 +00:00
Ahmed Bougacha b09c543538 [X86] Cleanup nontemporal tests a little. NFC.
Also: add a missing test for movntiq.
llvm-svn: 246730
2015-09-02 22:47:09 +00:00
Philip Reames dab35f317d [RewriteStatepointsForGC] Improve debug output [NFC]
llvm-svn: 246713
2015-09-02 21:11:44 +00:00
Hal Finkel 79dbf5b562 [PowerPC] Cleanup cost model for unaligned vector loads/stores
I'm adding a regression test to better cover code generation for unaligned
vector loads and stores, but there's no functional change to the code
generation here. There is an improvement to the cost model for unaligned vector
loads and stores, mostly for QPX (for which we were not previously accounting
for the permutation-based loads), and the cost model implementation is cleaner.

llvm-svn: 246712
2015-09-02 21:03:28 +00:00
Piotr Padlewski 0c7d8fc1f6 assuem(X) handling in GVN bugfix
There was infinite loop because it was trying to change assume(true) into
assume(true)
Also added handling when assume(false) appear

http://reviews.llvm.org/D12516

llvm-svn: 246697
2015-09-02 20:00:03 +00:00
Piotr Padlewski 28ffcbe1cc Constant propagation after hitting assume(cmp) bugfix
Last time code run into assertion `BBE.isSingleEdge()` in
lib/IR/Dominators.cpp:200.

http://reviews.llvm.org/D12170

llvm-svn: 246696
2015-09-02 19:59:59 +00:00
Piotr Padlewski 14e815c22b Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
  is constant we can change all variables with constants in the same BasicBlock

http://reviews.llvm.org/D11918

llvm-svn: 246695
2015-09-02 19:59:53 +00:00
Sanjay Patel 42574203e5 use "unpredictable" metadata in fast-isel when splitting compares
This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch.

Differential Revision: http://reviews.llvm.org/D12342

llvm-svn: 246692
2015-09-02 19:23:23 +00:00
Sanjay Patel fff7c6dc73 use "unpredictable" metadata in SelectionDAG when splitting compares
This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch.

Differential Revision: http://reviews.llvm.org/D12343

llvm-svn: 246691
2015-09-02 19:17:25 +00:00
Justin Bogner 04430aea68 test: Only warn about missing substitutions for required tools
Every time lit is invoked, I get warnings like so:

  lit.py: lit.cfg:286: note: Did not find llvm-go in /Users/bogner/build/llvm/./bin
  lit.py: lit.cfg:286: note: Did not find Kaleidoscope-Ch3 in /Users/bogner/build/llvm/./bin

Since these tools are only built in certain configs, these warnings
are superfluous. Change it so that we only warn about tools that are
built in all configs.

llvm-svn: 246684
2015-09-02 18:03:01 +00:00
Hal Finkel 77c8b7ffd3 [PowerPC] Don't always consider P8Altivec-only masks in LowerVECTOR_SHUFFLE
LowerVECTOR_SHUFFLE needs to decide whether to pass a vector shuffle off to the
TableGen-generated matching code, and it does this by testing the same
predicates used by the TableGen files. Unfortunately, when we added new
P8Altivec-only predicates, we started universally testing them in
LowerVECTOR_SHUFFLE, and if then matched when targeting a system prior to a P8,
we'd end up with a selection failure.

llvm-svn: 246675
2015-09-02 16:52:37 +00:00
Frederic Riss 24faade4b3 Reapply r246012 [dsymutil] Emit real dSYM companion binaries.
With a fix for big endian machines. Thanks to Daniel Sanders for the debugging!

Original commit message:

The binaries containing the linked DWARF generated by dsymutil are not
standard relocatable object files like emitted did previsously. They should be
dSYM companion files, which means they have a different file type in the
header, but also a couple other peculiarities:
 - they contain the segments and sections from the original binary in their
load commands, but not the actual contents. This means they get an address
and a size, but their offset is always 0 (but these are not virtual sections)
 - they also conatin all the defined symbols from the original binary

This makes MC a really bad fit to emit these kind of binaries. The approach
that was used in this patch is to leverage MC's section layout for the
debug sections, but to use a replacement for MachObjectWriter that lives
in MachOUtils.cpp. Some of the low-level helpers from MachObjectWriter
were reused too.

llvm-svn: 246673
2015-09-02 16:49:13 +00:00
Sanjay Patel fbcd189f8a [x86] fix allowsMisalignedMemoryAccesses() for 8-byte and smaller accesses
This is a continuation of the fix from:
http://reviews.llvm.org/D10662

and discussion in:
http://reviews.llvm.org/D12154

Here, we distinguish slow unaligned SSE (128-bit) accesses from slow unaligned
scalar (64-bit and under) accesses. Other lowering (eg, getOptimalMemOpType) 
assumes that unaligned scalar accesses are always ok, so this changes 
allowsMisalignedMemoryAccesses() to match that behavior.

Differential Revision: http://reviews.llvm.org/D12543

llvm-svn: 246658
2015-09-02 15:42:49 +00:00
Asaf Badouh d2c3599c5f [X86][AVX512VLBW] add support in byte shift and SAD
add byte shift left/right
add SAD - compute sum of absolute differences

Differential Revision: http://reviews.llvm.org/D12479

llvm-svn: 246654
2015-09-02 14:21:54 +00:00
Chad Rosier b684e381c9 Add newline to test. NFC.
llvm-svn: 246653
2015-09-02 14:06:16 +00:00
Joseph Tremoulet 917c7382c1 [TableGen] Allow TokenTy in intrinsic signatures
Summary:
Add the necessary plumbing so that llvm_token_ty can be used as an
argument/return type in intrinsic definitions and correspondingly require
TokenTy in function types.  TokenTy is an opaque type that has no target
lowering, but can be used in machine-independent intrinsics.  It is
required for the upcoming llvm.eh.padparam intrinsic.

Reviewers: majnemer, rnk

Subscribers: stoklund, llvm-commits

Differential Revision: http://reviews.llvm.org/D12532

llvm-svn: 246651
2015-09-02 13:36:25 +00:00
Igor Breger 1e58e8adf6 AVX512: Implemented encoding and intrinsics for VGETMANTPD/S , VGETMANTSD/S instructions
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11593

llvm-svn: 246642
2015-09-02 11:18:55 +00:00
NAKAMURA Takumi 44e3c5f36c Suppress llvm/test/tools/gold/X86/parallel.ll while investigating.
For me,

  Program received signal SIGSEGV, Segmentation fault.
  0x00007ffff7deb0dc in _dl_fixup () from /lib64/ld-linux-x86-64.so.2

llvm-svn: 246641
2015-09-02 10:59:21 +00:00
Igor Breger a6297c701e AVX512: Implemented encoding and intrinsics for vshufps/d.
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D11709

llvm-svn: 246640
2015-09-02 10:50:58 +00:00
James Molloy 1e583704f5 [LV] Don't bail to MiddleBlock if a runtime check fails, bail to ScalarPH instead
We were bailing to two places if our runtime checks failed. If the initial overflow check failed, we'd go to ScalarPH. If any other check failed, we'd go to MiddleBlock. This caused us to have to have an extra PHI per induction and reduction as the vector loop's exit block was not dominated by its latch.

There's no need to have this behavior - if we just always go to ScalarPH we can get rid of a bunch of complexity.

llvm-svn: 246637
2015-09-02 10:15:39 +00:00
James Molloy cba9230507 [LV] Refactor all runtime check emissions into helper functions.
This reduces the complexity of createEmptyBlock() and will open the door to further refactoring.

The test change is simply because we're now constant folding a trivial test.

llvm-svn: 246634
2015-09-02 10:15:22 +00:00
James Molloy ff623dce39 [LV] Pull creation of trip counts into a helper function.
... and do a tad of tidyup while we're at it. Because StartIdx must now be zero, there's no difference between Count and EndIdx.

llvm-svn: 246633
2015-09-02 10:15:16 +00:00
James Molloy a860a2216a [LV] Never widen an induction variable.
There's no need to widen canonical induction variables. It's just as efficient to create a *new*, wide, induction variable.

Consider, if we widen an indvar, then we'll have to truncate it before its uses anyway (1 trunc). If we create a new indvar instead, we'll have to truncate that instead (1 trunc) [besides which IndVars should go and clean up our mess after us anyway on principle].

This lets us remove a ton of special-casing code.

llvm-svn: 246631
2015-09-02 10:15:05 +00:00
James Molloy c07701b017 [LV] Switch to using canonical induction variables.
Vectorized loops only ever have one induction variable. All induction PHIs from the scalar loop are rewritten to be in terms of this single indvar.

We were trying very hard to pick an indvar that already existed, even if that indvar wasn't canonical (didn't start at zero). But trying so hard is really fruitless - creating a new, canonical, indvar only results in one extra add in the worst case and that add is trivially easy to push through the PHI out of the loop by instcombine.

If we try and be less clever here and instead let instcombine clean up our mess (as we do in many other places in LV), we can remove unneeded complexity.

llvm-svn: 246630
2015-09-02 10:14:54 +00:00
Elena Demikhovsky 9f83c7346f AVX-512: store <4 x i1> and <2 x i1> values in memory
Enabled DAG pattern lowering for SKX with DQI predicate.

Differential Revision: http://reviews.llvm.org/D12550

llvm-svn: 246625
2015-09-02 09:20:58 +00:00
Elena Demikhovsky 1b9d6914d3 Optimization for Gather/Scatter with uniform base
Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence.
While looking for uniform base, we want to use the scalar base pointer of GEP, if exists.

Differential Revision: http://reviews.llvm.org/D11121

llvm-svn: 246622
2015-09-02 08:39:13 +00:00
Vedant Kumar b5c2fd7257 [CodeGen] Fix FREM on 32-bit MSVC on x86
Patch by Dylan McKay!

Differential Revision: http://reviews.llvm.org/D12099

llvm-svn: 246615
2015-09-02 01:31:58 +00:00
David Majnemer 088ba020dd [MC] Generate a timestamp for COFF object files
The MS incremental linker seems to inspect the timestamp written into
the object file to determine whether or not it's contents need to be
considered.  Failing to set the timestamp to a date newer than the
executable will result in the object file not participating in
subsequent links.  To ameliorate this, write the current time into the
object file's TimeDateStamp field.

llvm-svn: 246607
2015-09-01 23:46:11 +00:00
Ahmed Bougacha 699a9dd7c3 [ARM] Don't abort on variable-idx extractelt in ReconstructShuffle.
The code introduced in r244314 assumed that EXTRACT_VECTOR_ELT only
takes constant indices, but it does accept variables.
Bail out for those: we can't use them, as the shuffles we want to
reconstruct do require constant masks.

llvm-svn: 246594
2015-09-01 21:56:00 +00:00
David Majnemer 6ddc636862 [MC] Add support for generating COFF CRCs
COFF sections are accompanied with an auxiliary symbol which includes a
checksum.  This checksum used to be filled with just zero but this seems
to upset LINK.exe when it is processing a /INCREMENTAL link job.
Instead, fill the CheckSum field with the JamCRC of the section
contents.  This matches MSVC's behavior.

This fixes PR19666.

N.B.  A rather simple implementation of JamCRC is given.  It implements
a byte-wise calculation using the method given by Sarwate.  There are
implementations with higher throughput like slice-by-eight and making
use of PCLMULQDQ.  We can switch to one of those techniques if it turns
out to be a significant use of time.

llvm-svn: 246590
2015-09-01 21:23:58 +00:00
Peter Collingbourne 87202a4aac gold-plugin: Implement parallel LTO code generation using llvm::splitCodeGen.
Parallelism can be enabled using a new plugin option, jobs=N, where N is
the number of code generation threads.

Differential Revision: http://reviews.llvm.org/D12308

llvm-svn: 246584
2015-09-01 20:40:22 +00:00
Hans Wennborg dada1d20ba DeadArgElim: don't eliminate arguments from naked functions
Differential Revision: http://reviews.llvm.org/D12534

llvm-svn: 246564
2015-09-01 18:06:46 +00:00
Artem Belevich 020d4fb17f New bitcode linker flags:
-only-needed -- link in only symbols needed by destination module
-internalize -- internalize linked symbols

Differential Revision: http://reviews.llvm.org/D12459

llvm-svn: 246561
2015-09-01 17:55:55 +00:00
Davide Italiano 0ca250853c [llvm-readobj] MachO -- correctly dump section field 'Reserved3'
Before we incorrectly ignored it.

llvm-svn: 246556
2015-09-01 16:29:02 +00:00
Ahmed Bougacha b0ff6437cb [AArch64] Lower READCYCLECOUNTER using MRS PMCCTNR_EL0.
This matches the ARM behavior. In both cases, the register is part
of the optional Performance Monitors extension, so, add the feature,
and enable it for the A-class processors we support.

Differential Revision: http://reviews.llvm.org/D12425

llvm-svn: 246555
2015-09-01 16:23:45 +00:00
Igor Breger f6f1bb6ddc AVX512: Implemented intrinsics for valign.
Differential Revision: http://reviews.llvm.org/D12526

llvm-svn: 246551
2015-09-01 15:27:18 +00:00
Sanjay Patel c413558842 use CHECK-LABEL for more precision
llvm-svn: 246547
2015-09-01 14:35:05 +00:00
Silviu Baranga 755ec0e027 [AArch64] Turn on by default interleaved access vectorization
Summary:
This change turns on by default interleaved access vectorization
for AArch64.

We also clean up some tests which were spedifically enabling this
behaviour.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12149

llvm-svn: 246542
2015-09-01 11:26:46 +00:00
Silviu Baranga e748c9ef55 [ARM] Turn on by default interleaved access vectorization
Summary:
This change turns on by default interleaved access vectorization on ARM,
as it has shown to be beneficial on ARM.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12146

llvm-svn: 246541
2015-09-01 11:19:15 +00:00
Silviu Baranga 6d3f05c04b [ARM][AArch64] Turn on by default interleaved access lowering
Summary:
Interleaved access lowering removes a memory operation and a
sequence of vector shuffles and replaces it with a series of
memory operations. This should be always beneficial.

This pass in only enabled on ARM/AArch64.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12145

llvm-svn: 246540
2015-09-01 11:12:35 +00:00
Rui Ueyama b355fd0308 Object: Fix COFF import file's symbols.
If a symbol is marked as "data", the symbol should be exported
with __imp_ prefix. Previously, the symbol was exported as-is.

llvm-svn: 246532
2015-09-01 06:01:53 +00:00
Cong Hou 511298b919 Distribute the weight on the edge from switch to default statement to edges generated in lowering switch.
Currently, when edge weights are assigned to edges that are created when lowering switch statement, the weight on the edge to default statement (let's call it "default weight" here) is not considered. We need to distribute this weight properly. However, without value profiling, we have no idea how to distribute it. In this patch, I applied the heuristic that this weight is evenly distributed to successors.

For example, given a switch statement with cases 1,2,3,5,10,11,20, and every edge from switch to each successor has weight 10. If there is a binary search tree built to test if n < 10, then its two out-edges will have weight 4x10+10/2 = 45 and 3x10 + 10/2 = 35 respectively (currently they are 40 and 30 without considering the default weight). Each distribution (which is 5 here) will be stored in each SwitchWorkListItem for further distribution.

There are some exceptions:

For a jump table header which doesn't have any edge to default statement, we don't distribute the default weight to it.
For a bit test header which covers a contiguous range and hence has no edges to default statement, we don't distribute the default weight to it.
When the branch checks a single value or a contiguous range with no edge to default statement, we don't distribute the default weight to it.
In other cases, the default weight is evenly distributed to successors.

Differential Revision: http://reviews.llvm.org/D12418

llvm-svn: 246522
2015-09-01 01:42:16 +00:00
Sanjay Patel 989364c101 remove unnecessary/conflicting target info
llvm-svn: 246514
2015-09-01 00:27:36 +00:00
Sanjay Patel e554d59eba fixed test to specify triple rather than arch and CPU
llvm-svn: 246513
2015-09-01 00:25:23 +00:00
Hal Finkel 1baec5323b [DAGCombine] Fixup SETCC legality checking
SETCC is one of those special node types for which operation actions (legality,
etc.) is keyed off of an operand type, not the node's value type. This makes
sense because the value type of a legal SETCC node is determined by its
operands' value type (via the TLI function getSetCCResultType). When the
SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value
type, or directly with the value type provided by TLI.getSetCCResultType.

The first problem being fixed here is that DAGCombine had several places
querying TLI.isOperationLegal on SETCC, but providing the return of
getSetCCResultType, instead of the operand type directly. This does not mean
what the author thought, and "luckily", most in-tree targets have SETCC with
Custom lowering, instead of marking them Legal, so these checks return false
anyway.

The second problem being fixed here is that two of the DAGCombines could create
SETCC nodes with arbitrary (integer) value types; specifically, those that
would simplify:

  (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3
     (which is possible for some combinations of (op1, op2))

If the operands of the and|or node are actual setcc nodes, then this is not an
issue (because the and|or must share the same type), but, the relevant code in
DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls
DAGCombiner::isSetCCEquivalent on each operand, and that function will
recognise setcc-like select_cc nodes with other return types. And, thus, when
creating new SETCC nodes, we need to be careful to respect the value-type
constraint. This is even true before type legalization, because it is quite
possible for the SELECT_CC node to have a legal type that does not happen to
match the corresponding TLI.getSetCCResultType type.

To be explicit, there is nothing that later fixes the value types of SETCC
nodes (if the type is legal, but does not happen to match
TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to
work only because, either MVT::i1 is not legal, or it is what
TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change,
however. For the time being, restrict the relevant transformations to produce
only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1
prior to type legalization).

Fixes PR24636.

llvm-svn: 246507
2015-08-31 23:15:04 +00:00
Quentin Colombet 5989bc6f41 [BasicAA] Fix the handling of sext and zext in the analysis of GEPs.
Hopefully this will end the GEPs saga!

This commit reverts r245394, i.e., it reapplies r221876 while incorporating the
fixes from D11847.
r221876 was not reapplied alone because it was not safe and D11847 was not
applied alone because it needs r221876 to produce correct results.

This should fix PR24596.

Original commit message for r221876:
Let's try this again...

This reverts r219432, plus a bug fix.

Description of the bug in r219432 (by Nick):

The bug was using AllPositive to break out of the loop; if the loop break
condition i != e is changed to i != e && AllPositive then the
test_modulo_analysis_with_global test I've added will fail as the Modulo will
be calculated incorrectly (as the last loop iteration is skipped, so Modulo
isn't updated with its Scale).

Nick also adds this comment:

ComputeSignBit is safe to use in loops as it takes into account phi nodes, and
the  == EK_ZeroEx check is safe in loops as, no matter how the variable changes
between iterations, zero-extensions will always guarantee a zero sign bit. The
isValueEqualInPotentialCycles check is therefore definitely not needed as all
the variable analysis holds no matter how the variables change between loop
iterations.

And this patch also adds another enhancement to GetLinearExpression - basically
to convert ConstantInts to Offsets (see test_const_eval and
test_const_eval_scaled for the situations this improves).

Original commit message:

This reverts r218944, which reverted r218714, plus a bug fix.

Description of the bug in r218714 (by Nick):

The original patch forgot to check if the Scale in VariableGEPIndex flipped the
sign of the variable. The BasicAA pass iterates over the instructions in the
order they appear in the function, and so BasicAliasAnalysis::aliasGEP is
called with the variable it first comes across as parameter GEP1. Adding a
%reorder label puts the definition of %a after %b so aliasGEP is called with %b
as the first parameter and %a as the second. aliasGEP later calculates that %a
== %b + 1 - %idxprom where %idxprom >= 0 (if %a was passed as the first
parameter it would calculate %b == %a - 1 + %idxprom where %idxprom >= 0) -
ignoring that %idxprom is scaled by -1 here lead the patch to incorrectly
conclude that %a > %b.

Revised patch by Nick White, thanks! Thanks to Lang to isolating the bug.
Slightly modified by me to add an early exit from the loop and avoid
unnecessary, but expensive, function calls.

Original commit message:

Two related things:

1. Fixes a bug when calculating the offset in GetLinearExpression. The code
   previously used zext to extend the offset, so negative offsets were converted
   to large positive ones.

2. Enhance aliasGEP to deduce that, if the difference between two GEP
   allocations is positive and all the variables that govern the offset are also
   positive (i.e. the offset is strictly after the higher base pointer), then
   locations that fit in the gap between the two base pointers are NoAlias.

Patch by Nick White!

Message from D11847:
Un-revert of r241981 and fix for PR23626. The 'Or' case of GetLinearExpression
delegates to 'Add' if possible, and if not it returns an Opaque value.
Unfortunately the Scale and Offsets weren't being set (and so defaulted to 0) -
and a scale of zero effectively removes the variable from the GEP instruction.
This meant that BasicAA would return MustAliases when it should have been
returning PartialAliases (and PR23626 was an example of the GVN pass using an
incorrect MustAlias to merge loads from what should have been different
pointers).

Differential Revision: http://reviews.llvm.org/D11847
Patch by Nick White <n.j.white@gmail.com>!

llvm-svn: 246502
2015-08-31 22:32:47 +00:00
JF Bastien 73ff6afa87 WebAssembly: generate load/store
Summary: This handles all load/store operations that WebAssembly defines, and handles those necessary for C++ such as i1. I left a FIXME for outstanding features which aren't required for now.

Reviewers: sunfish

Subscribers: jfb, llvm-commits, dschuff
llvm-svn: 246500
2015-08-31 22:24:11 +00:00
Karl Schimpf 4da0e12968 Fix bug in method LLLexer::FP80HexToIntPair
llvm-svn: 246489
2015-08-31 21:36:14 +00:00
Hans Wennborg 4a61370b8f Fix CHECK directives that weren't checking.
llvm-svn: 246485
2015-08-31 21:10:35 +00:00
Sanjay Patel d9a5c225d1 [x86] enable machine combiner reassociations for scalar 'or' insts
llvm-svn: 246481
2015-08-31 20:27:03 +00:00
Reid Kleckner e00faf8ce1 [EH] Handle non-Function personalities like unknown personalities
Also delete and simplify a lot of MachineModuleInfo code that used to be
needed to handle personalities on landingpads.  Now that the personality
is on the LLVM Function, we no longer need to track it this way on MMI.
Certainly it should not live on LandingPadInfo.

llvm-svn: 246478
2015-08-31 20:02:16 +00:00
Philip Reames a88caeab6c [FunctionAttr] Infer nonnull attributes on returns
Teach FunctionAttr to infer the nonnull attribute on return values of functions which never return a potentially null value. This is done both via a conservative local analysis for the function itself and a optimistic per-SCC analysis. If no function in the SCC returns anything which could be null (other than values from other functions in the SCC), we can conclude no function returned a null pointer. Even if some function within the SCC returns a null pointer, we may be able to locally conclude that some don't.

Differential Revision: http://reviews.llvm.org/D9688

llvm-svn: 246476
2015-08-31 19:44:38 +00:00
Quentin Colombet a80b9c824e [AArch64][CollectLOH] Remove an invalid assertion and add a test case exposing it.
rdar://problem/22491525

llvm-svn: 246472
2015-08-31 19:02:00 +00:00
Philip Reames bb11d62a5a [LazyValueInfo] Look through Phi nodes when trying to prove a predicate
If asked to prove a predicate about a value produced by a PHI node, LazyValueInfo was unable to do so even if the predicate was known to be true for each input to the PHI. This prevented JumpThreading from eliminating a provably redundant branch.

The problematic test case looks something like this:
ListNode *p = ...;
while (p != null) {
  if (!p) return;
  x = g->x; // unrelated
  p = p->next
}

The null check at the top of the loop is redundant since the value of 'p' is null checked on entry to the loop and before executing the backedge. This resulted in us a) executing an extra null check per iteration and b) not being able to LICM unrelated loads after the check since we couldn't prove they would execute or that their dereferenceability wasn't effected by the null check on the first iteration.

Differential Revision: http://reviews.llvm.org/D12383

llvm-svn: 246465
2015-08-31 18:31:48 +00:00
Matthias Braun 0acbd08f3c AArch64: Fix loads to lower NEON vector lanes using GPR registers
The ISelLowering code turned insertion turned the element for the
lowest lane of a BUILD_VECTOR into an INSERT_SUBREG, this prohibited
the patterns for SCALAR_TO_VECTOR(Load) to match later. Restrict this
to cases without a load argument.

Reported in rdar://22223823

Differential Revision: http://reviews.llvm.org/D12467

llvm-svn: 246462
2015-08-31 18:25:15 +00:00
Filipe Cabecinhas 984fefdd81 [BitcodeReader] Ensure we can read constant vector selects with an i1 condition
Summary:
Constant vectors weren't allowed to have an i1 condition in the
BitcodeReader. Make sure we have the same restrictions that are
documented, not more.

Reviewers: nlewycky, rafael, kschimpf

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12440

llvm-svn: 246459
2015-08-31 18:00:30 +00:00
Vedant Kumar 86dbd92334 [MC/AsmParser] Avoid setting MCSymbol.IsUsed in some cases
Avoid marking some MCSymbols as used in MC/AsmParser.cpp when no uses
exist. This fixes a bug in parseAssignmentExpression() which
inadvertently sets IsUsed, thereby triggering:

    "invalid re-assignment of non-absolute variable"

on otherwise valid code. No other functionality change intended.

The original version of this patch touched many calls to MCSymbol
accessors. On rafael's advice, I have stripped this patch down a bit.

As a follow-up, I intend to find the call sites which intentionally set
IsUsed and force them to do so explicitly.

Differential Revision: http://reviews.llvm.org/D12347

llvm-svn: 246457
2015-08-31 17:44:53 +00:00
Igor Breger 5ea0a68115 AVX512: ktest implemantation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11979

llvm-svn: 246439
2015-08-31 13:30:19 +00:00
Igor Breger f3ded811b2 AVX512: Implemented encoding and intrinsics for vdbpsadbw
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12491

llvm-svn: 246436
2015-08-31 13:09:30 +00:00
Igor Breger 59ac339357 AVX512: kadd implementation
Added tests for encoding.

Differential Revision: http://reviews.llvm.org/D11973

llvm-svn: 246432
2015-08-31 11:50:23 +00:00
Igor Breger 98a045c978 AVX512: Add encoding tests for vscatter instructions
Differential Revision: http://reviews.llvm.org/D11941

llvm-svn: 246431
2015-08-31 11:33:50 +00:00
Igor Breger 2ae0fe3ac3 AVX512: Implemented encoding and intrinsics for vpalignr
Added tests for intrinsics and encoding.

Differential Revision: http://reviews.llvm.org/D12270

llvm-svn: 246428
2015-08-31 11:14:02 +00:00
Hal Finkel e0a28e54c7 [AggressiveAntiDepBreaker] Check for EarlyClobber on defining instruction
AggressiveAntiDepBreaker was doing some EarlyClobber checking, but was not
checking that the register being potentially renamed was defined by an
early-clobber def where there was also a use, in that instruction, of the
register being considered as the target of the rename. Fixes PR24014.

llvm-svn: 246423
2015-08-31 07:51:36 +00:00
Sylvestre Ledru 3675d1a6d0 Force the locale when executing ld gold
Summary:
If run with other locales (like French),
the decode operation might fail

Reviewers: rafael

Differential Revision: http://reviews.llvm.org/D12432

llvm-svn: 246421
2015-08-31 07:10:05 +00:00
Jingyue Wu e84f671830 [JumpThreading] make jump threading respect convergent annotation.
Summary:
JumpThreading shouldn't duplicate a convergent call, because that would move a convergent call into a control-inequivalent location. For example,
  if (cond) {
    ...
  } else {
    ...
  }
  convergent_call();
  if (cond) {
    ...
  } else {
    ...
  }
should not be optimized to
  if (cond) {
    ...
    convergent_call();
    ...
  } else {
    ...
    convergent_call();
    ...
  }

Test Plan: test/Transforms/JumpThreading/basic.ll

Patch by Xuetian Weng. 

Reviewers: resistor, arsenm, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12484

llvm-svn: 246415
2015-08-31 06:10:27 +00:00
Frederic Riss afeac301b1 [dsymutil] Do not mistakenly reuse the current object file when the next one isn't found.
llvm-svn: 246412
2015-08-31 05:16:35 +00:00
Frederic Riss 4e289f9d1e [dsymutil] Fix testcase.
This testcase required 2 copies of the same file, and the second
copy was missing. It was currently working because of a bug I'm
about to fix.

llvm-svn: 246411
2015-08-31 05:16:30 +00:00
Frederic Riss 94546204d1 [dsymutil] Do not crash on empty debug_range range.
The fix is trivial (The actual patch is 2 lines, but as it changes
indentation it looks like more).
clang does not produce this kind of (slightly bogus) debug info
anymore, thus I had to rely on a hand-crafted assembly test to trigger
that case.

llvm-svn: 246410
2015-08-31 05:09:32 +00:00
Frederic Riss 7b5563aa5c [dsymutil] Fix handling of inlined_subprogram low_pcs
The value of an inlined subprogram low_pc attribute should not
get relocated, but it can happen that it matches the enclosing
function's start address and thus gets the generic treatment.
Special case it to avoid applying the PC offset twice.

llvm-svn: 246406
2015-08-31 01:43:14 +00:00
Frederic Riss 5ba01d6d95 [dsymutil] Implement -symtab/-s option.
This option dumps the STAB entries that define the debug map(s)
stored in the input binaries, and then exits.

llvm-svn: 246403
2015-08-31 00:29:09 +00:00
Hal Finkel a2cdbce661 [PowerPC] Fixup SELECT_CC (and SETCC) patterns with i1 comparison operands
There were really two problems here. The first was that we had the truth tables
for signed i1 comparisons backward. I imagine these are not very common, but if
you have:
  setcc i1 x, y, LT
this has the '0 1' and the '1 0' results flipped compared to:
  setcc i1 x, y, ULT
because, in the signed case, '1 0' is really '-1 0', and the answer is not the
same as in the unsigned case.

The second problem was that we did not have patterns (at all) for the unsigned
comparisons select_cc nodes for i1 comparison operands. This was the specific
cause of PR24552. These had to be added (and a missing Altivec promotion added
as well) to make sure these function for all types. I've added a bunch more
test cases for these patterns, and there are a few FIXMEs in the test case
regarding code-quality.

Fixes PR24552.

llvm-svn: 246400
2015-08-30 22:12:50 +00:00
Hal Finkel 2d55698ed7 [PowerPC/MIR Serialization] Target flags serialization support
Add support for MIR serialization of PowerPC-specific operand target flags
(based on the generic infrastructure added in r244185 and r245383).

I won't even pretend that this is good test coverage, but this includes the
regression test associated with r246372. Adding an MIR test for that fix is far
superior to adding an IR-level test because particular instruction-scheduling
decisions are necessary in order to expose the bug, and using an MIR test we
can start the pipeline post-scheduling.

llvm-svn: 246373
2015-08-30 07:50:35 +00:00
James Molloy a184adffab [ARM] Fix up buildbots after r246360
I have no idea how I missed this in my internal testing. Just no idea. Sorry for the bot-armageddon.

llvm-svn: 246361
2015-08-29 11:50:08 +00:00
James Molloy 45ee9898ec [ARM] Hoist fabs/fneg above a conversion to float.
This is especially visible in softfp mode, for example in the implementation of libm fabs/fneg functions. If we have:

%1 = vmovdrr r0, r1
%2 = fabs %1

then move the fabs before the vmovdrr:

%1 = and r1, #0x7FFFFFFF
%2 = vmovdrr r0, r1

This is never a lose, and could be a serious win because the vmovdrr may be followed by a vmovrrd, which would enable us to remove the conversion into FPRs completely.

We already do this for f32, but not for f64. Tests are added for both.

llvm-svn: 246360
2015-08-29 10:49:11 +00:00
Matt Arsenault e4d0c142e8 AMDGPU: Add sdst operand to VOP2b instructions
The VOP3 encoding of these allows any SGPR pair for the i1
output, but this was forced before to always use vcc.
This doesn't yet try to use this, but does add the operand
to the definitions so the main change is adding vcc to the
output of the VOP2 encoding.

llvm-svn: 246358
2015-08-29 07:16:50 +00:00
Matt Arsenault 5c004a7c61 AMDGPU: Fix dropping mem operands when moving to VALU
Without a memory operand, mayLoad or mayStore instructions
are treated as hasUnorderedMemRef, which results in much worse
scheduling.

We really should have a verifier check that any
non-side effecting mayLoad or mayStore has a memory operand.
There are a few instructions (interp and images) which I'm
not sure what / where to add these.

llvm-svn: 246356
2015-08-29 06:48:46 +00:00
NAKAMURA Takumi 7951e37d24 Revert r246350, "The host and default target triples do not need to match for "native""
Wrong assumption. Consider --host=x86_64-linux --target=(i686|x86_64)-win32. See also r193459.

llvm-svn: 246352
2015-08-28 23:33:17 +00:00
Duncan P. N. Exon Smith 0bd73bb58b DI: Update tests before adding !dbg subprogram attachments
I'm working on adding !dbg attachments to functions (PR23367), which
we'll use to determine the canonical subprogram for a function (instead
of the `subprograms:` array in the compile units).  This updates a few
old tests in preparation.

Transforms/Mem2Reg/ConvertDebugInfo2.ll had an old-style grep+count
based test that would start to fail because I've added an extra line
with `!dbg`.  Instead, explicitly `CHECK` for what I think the test
actually cares about.

All three testcases have subprograms with a valid `function:` reference
-- which means my upgrade script will add a `!dbg` attachment -- but
that aren't referenced from any compile unit.  I suspect these testcases
were handreduced over-zealously (or have bitrotted?).  Add a reference
from the compile unit so that upcoming Verifier checks won't fail here.

llvm-svn: 246351
2015-08-28 23:32:00 +00:00
Paul Robinson 273ed4d9eb The host and default target triples do not need to match for "native"
backend to work.

Differential Revision: http://reviews.llvm.org/D12454

llvm-svn: 246350
2015-08-28 23:21:15 +00:00
Peter Collingbourne c4a6c1f7fd Use UNSUPPORTED instead of XFAIL to disable this test, as it passes on one AArch64 bot.
llvm-svn: 246344
2015-08-28 22:17:29 +00:00
Duncan P. N. Exon Smith b56b5af4c3 DI: Add Function::getSubprogram()
Add `Function::setSubprogram()` and `Function::getSubprogram()`,
convenience methods to forward to `setMetadata()` and `getMetadata()`,
respectively, and deal in `DISubprogram` instead of `MDNode`.

Also add a verifier check to enforce that `!dbg` attachments are always
subprograms.

Originally (when I had the llvm-dev discussion back in April) I thought
I'd store a pointer directly on `llvm::Function` for these attachments
-- we frequently have debug info, and that's much cheaper than using map
in the context if there are no other function-level attachments -- but
for now I'm just using the generic infrastructure.  Let's add the extra
complexity only if this shows up in a profile.

llvm-svn: 246339
2015-08-28 21:55:35 +00:00
Duncan P. N. Exon Smith 0660bcda53 AsmPrinter: Allow null subroutine type
Currently the DWARF backend requires that subprograms have a type, and
the type is ignored if it has an empty type array.  The long term
direction here -- see PR23079 -- is instead to skip the type entirely if
there's no valid type.

It turns out we have cases in tree of missing types on subprograms, but
since they're not referenced by compile units, the backend never crashes
on them.  One option would be to add a Verifier check that subprograms
have types, and fix the bitrot.  However, this is a fair bit of churn
(20-30 testcases) that would be reversed anyway by PR23079.

I found this inconsistency because of a WIP patch and upgrade script for
PR23367 that started crashing on test/DebugInfo/2010-10-01-crash.ll.
This commit updates the testcase to reference the subprogram from the
compile unit, and fixes the resulting crash (in line with the direction
of PR23079).  This also updates `DIBuilder` to stop assuming a non-null
pointer for the subroutine types.

llvm-svn: 246333
2015-08-28 21:38:24 +00:00
David Majnemer 0a92f86fe6 Revert r246232 and r246304.
This reverts isSafeToSpeculativelyExecute's use of ReadNone until we
split ReadNone into two pieces: one attribute which reasons about how
the function reasons about memory and another attribute which determines
how it may be speculated, CSE'd, trap, etc.

llvm-svn: 246331
2015-08-28 21:13:39 +00:00
Rafael Espindola cad82ccbb2 Split the gold tests into X86 and PowerPC directories.
Patch by Than McIntosh!

llvm-svn: 246328
2015-08-28 20:33:56 +00:00
Duncan P. N. Exon Smith 814b8e91c7 DI: Require subprogram definitions to be distinct
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'.  Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.

While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore.  Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node.  The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.

I updated almost all the IR with the following script:

    git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'

Likely some variant of would work for out-of-tree testcases.

llvm-svn: 246327
2015-08-28 20:26:49 +00:00
Sanjoy Das 6f5dca70ed [InstCombine] Fix PR24605.
PR24605 is caused due to an incorrect insert point in instcombine's IR
builder.  When simplifying

  %t = add X Y
  ...
  %m = icmp ... %t

the replacement for %t should be placed before %t, not before %m, as
there could be a use of %t between %t and %m.

llvm-svn: 246315
2015-08-28 19:09:31 +00:00
Chad Rosier dc65532fd9 Optimize memcmp(x,y,n)==0 for small n and suitably aligned x/y.
http://reviews.llvm.org/D6952
PR20673

llvm-svn: 246313
2015-08-28 18:30:18 +00:00
Vedant Kumar 3171cabb34 [test] (NFC) Simplify Transforms/ConstProp/calls.ll
Differential Revision: http://reviews.llvm.org/D12421

llvm-svn: 246312
2015-08-28 18:04:20 +00:00
Petar Jovanovic 207a191a98 [mips64][mcjit] Add N64R6 relocations tests and fix N64R2 tests
This patch adds a test for MIPS64R6 relocations, it corrects check
expressions for R_MIPS_26 and R_MIPS_PC16 relocations in MIPS64R2 test, and
it adds run for big endian in MIPS64R2 test.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D11217

llvm-svn: 246311
2015-08-28 18:02:53 +00:00
Petar Jovanovic 28e2b717fc [mips] Remove incorrect DebugLoc entries from prologue
This has been causing the prologue_end to be incorrectly positioned.

Patch by Vladimir Radosavljevic.

Differential Revision: http://reviews.llvm.org/D11293

llvm-svn: 246309
2015-08-28 17:53:26 +00:00
Matt Arsenault d9c830154f Make MergeConsecutiveStores look at other stores on same chain
When combiner AA is enabled, look at stores on the same chain.
Non-aliasing stores are moved to the same chain so the existing
code fails because it expects to find an adajcent store on a consecutive
chain.

Because of how DAGCombiner tries these store combines,
MergeConsecutiveStores doesn't see the correct set of stores on the chain
when it visits the other stores. Each store individually has its chain
fixed before trying to merge consecutive stores, and then tries to merge
stores from that point before the other stores have been processed to
have their chains fixed. To fix this, attempt to use FindBetterChain
on any possibly neighboring stores in visitSTORE.

Suppose you have 4 32-bit stores that should be merged into 1 vector
store. One store would be visited first, fixing the chain. What happens is
because not all of the store chains have yet been fixed, 2 of the stores
are merged. The other 2 stores later have their chains fixed,
but because the other stores were already merged, they have different
memory types and merging the two different sized stores is not
supported and would be more difficult to handle.

llvm-svn: 246307
2015-08-28 17:31:28 +00:00
David Majnemer 9c51053dd5 Test case for r246304.
llvm-svn: 246306
2015-08-28 17:19:54 +00:00
JF Bastien f5aa1ca655 Remove Merge Functions pointer comparisons
Summary:
This patch removes two remaining places where pointer value comparisons
are used to order functions: comparing range annotation metadata, and comparing
block address constants. (These are both rare cases, and so no actual
non-determinism was observed from either case).

The fix for range metadata is simple: the annotation always consists of a pair
of integers, so we just order by those integers.

The fix for block addresses is more subtle. Two constants are the same if they
are the same basic block in the same function, or if they refer to corresponding
basic blocks in each respective function. Note that in the first case, merging
is trivially correct. In the second, the correctness of merging relies on the
fact that the the values of block addresses cannot be compared. This change is
actually an enhancement, as these functions could not previously be merged (see
merge-block-address.ll).

There is still a problem with cross function block addresses, in that constants
pointing to a basic block in a merged function is not updated.

This also more robustly compares floating point constants by all fields of their
semantics, and fixes a dyn_cast/cast mixup.

Author: jrkoenig
Reviewers: dschuff, nlewycky, jfb
Subscribers llvm-commits
Differential revision: http://reviews.llvm.org/D12376

llvm-svn: 246305
2015-08-28 16:49:09 +00:00
Sanjay Patel 7c912898a5 [x86] enable machine combiner reassociations for scalar 'and' insts
llvm-svn: 246300
2015-08-28 14:09:48 +00:00
Davide Italiano f00e94546e [MC] Convert tests to use llvm-readobj --macho-version-min.
As an added bonus this also tests the newly introduced feature.

llvm-svn: 246296
2015-08-28 12:40:05 +00:00
Rui Ueyama 932108912d llvm-readobj: Dump more info for COFF import libraries.
This patch teaches llvm-readobj to print out COFF import file header fields.

llvm-svn: 246291
2015-08-28 10:27:50 +00:00
Chandler Carruth 4b682f6f24 [SROA] Fix PR24463, a crash I introduced in SROA by allowing it to
handle more allocas with loads past the end of the alloca.

I suspect there are some related crashers with slightly different
patterns, but I'll fix those and add test cases as I find them.

Thanks to David Majnemer for the excellent test case reduction here.
Made this super simple to debug and fix.

llvm-svn: 246289
2015-08-28 09:03:52 +00:00
Rui Ueyama 71ba9bdd23 Re-apply r246276 - Object: Teach llvm-ar to create symbol table for COFF short import files
This patch includes a fix for a llvm-readobj test. With this patch, 
the tool does no longer print out COFF headers for the short import
file, but that's probably desirable because the header for the short
import file is dummy.

llvm-svn: 246283
2015-08-28 07:40:30 +00:00
Steven Wu 61db34d12e Revert r246244 and r246243
These two commits cause clang/llvm bootstrap to hang.

llvm-svn: 246279
2015-08-28 06:52:00 +00:00
Rui Ueyama 8cff17469f Rollback r246276 - Object: Teach llvm-ar to create symbol table for COFF short import files
This change caused a test for llvm-readobj to fail.

llvm-svn: 246277
2015-08-28 06:03:01 +00:00
Rui Ueyama 22b1b7aad2 Object: Teach llvm-ar to create symbol table for COFF short import files.
COFF short import files are special kind of files that contains only
DLL-exported symbol names. That's different from object files because
it has no data except symbol names.

This change implements a SymbolicFile interface for the short import
files so that symbol names can be accessed through that interface.
llvm-ar is now able to read the file and create symbol table entries
for short import files.

llvm-svn: 246276
2015-08-28 05:47:46 +00:00
Peter Collingbourne 6cba53ec6c Tweak XFAIL line for mips.
llvm-svn: 246269
2015-08-28 04:07:52 +00:00
NAKAMURA Takumi 004653755c Disable llvm/test/Examples/ for now while investigating.
FIXME:
  - Introduce explicit mapping.
  - Investigate crash on win32. Could we introduce crash handler in examples?

llvm-svn: 246267
2015-08-28 03:32:43 +00:00
Peter Collingbourne 834f937c93 XFAIL parallel.ll test on MIPS and AArch64 until test failures can be investigated.
llvm-svn: 246261
2015-08-28 02:14:15 +00:00
Joseph Tremoulet ec18285b91 [WinEH] Update coloring to handle nested cases cleanly
Summary:
Change the coloring algorithm in WinEHPrepare to visit a funclet's exits
in its parents' contexts and so properly classify the continuations of
nested funclets.

Also change the placement of cloned blocks to be deterministic and to
maintain the relative order of each funclet's blocks.

Add a lit test showing various patterns that require cloning, the last
several of which don't have CHECKs yet because they require cloning
entire funclets which is NYI.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12353

llvm-svn: 246245
2015-08-28 01:12:35 +00:00
Piotr Padlewski 3f81ec1e38 Constant propagation after hitting assume(cmp) bugfix
Last time code run into assertion `BBE.isSingleEdge()` in
lib/IR/Dominators.cpp:200.

http://reviews.llvm.org/D12170

llvm-svn: 246244
2015-08-28 01:02:00 +00:00
Piotr Padlewski 63cc5d4627 Constant propagation after hiting llvm.assume
After hitting @llvm.assume(X) we can:
- propagate equality that X == true
- if X is icmp/fcmp (with eq operation), and one of operand
  is constant we can change all variables with constants in the same BasicBlock

http://reviews.llvm.org/D11918

llvm-svn: 246243
2015-08-28 01:01:57 +00:00
George Burgess IV 68b36e01da Fix: CFLAA -- Mark no-args returns as unknown
Prior to this patch, we hadn't been marking StratifiedSets with the
appropriate StratifiedAttrs when handling the result of no-args call
instructions. This caused us to report NoAlias when handed, for
example, an escaped alloca and a result from an opaque function. Now we
properly mark the return value of said functions.

Thanks again to Chandler, Richard, and Nick for pinging me about this.

Differential review: http://reviews.llvm.org/D12408

llvm-svn: 246240
2015-08-28 00:16:18 +00:00
Quentin Colombet fa4ecb4b9a [AArch64][CollectLOH] Fix a regression that prevented us to detect chains of
more than 2 instructions.

I introduced this regression a while back and did not noticed it because I
somehow forgot to push the initial test cases for the pass!

Fix that as well!

llvm-svn: 246239
2015-08-27 23:47:10 +00:00
Peter Collingbourne c269ed5115 CodeGen: Introduce splitCodeGen and teach LTOCodeGenerator to use it.
llvm::splitCodeGen is a function that implements the core of parallel LTO
code generation. It uses llvm::SplitModule to split the module into linkable
partitions and spawning one code generation thread per partition. The function
produces multiple object files which can be linked in the usual way.

This has been threaded through to LTOCodeGenerator (and llvm-lto for testing
purposes). Separate patches will add parallel LTO support to the gold plugin
and lld.

Differential Revision: http://reviews.llvm.org/D12260

llvm-svn: 246236
2015-08-27 23:37:36 +00:00
Reid Kleckner 0e2882345d [WinEH] Add some support for code generating catchpad
We can now run 32-bit programs with empty catch bodies.  The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-svn: 246235
2015-08-27 23:27:47 +00:00
David Majnemer 0293704be2 [ValueTracking] readnone CallInsts are fair game for speculation
Any call which is side effect free is trivially OK to speculate.  We
already had similar logic in EarlyCSE and GVN but we were missing it
from isSafeToSpeculativelyExecute.

This fixes PR24601.

llvm-svn: 246232
2015-08-27 23:03:01 +00:00
Ahmed Bougacha 87166905c8 [CodeGen] Check FoldConstantArithmetic result before using it.
Fixes PR24602: r245689 introduced an unguarded use of
SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails
because of opaque (hoisted) constants.

llvm-svn: 246217
2015-08-27 21:46:04 +00:00
Tyler Nowicki 8f88546575 Fix test introduced in r246187 that failed on some systems.
llvm-svn: 246207
2015-08-27 20:43:29 +00:00
Lang Hames e2edcdcd93 Oops - Re-add the Kaleidoscope regression tests themselves (accidentally left
out of r246201).

llvm-svn: 246203
2015-08-27 20:33:22 +00:00
Lang Hames d76e067150 Recommit r246175 - Add Kaleidoscope regression tests, with a fix to make sure
the kaleidoscope 'library' functions aren't dead-stripped in release builds.

llvm-svn: 246201
2015-08-27 20:31:44 +00:00
Erik Schnetter 5e93e28d8b Enable constant propagation for more math functions
Constant propagation for single precision math functions (such as
tanf) is already working, but was not enabled. This patch enables
these for many single-precision functions, and adds respective test
cases.

Newly handled functions: acosf asinf atanf atan2f ceilf coshf expf
exp2f fabsf floorf fmodf logf log10f powf sinhf tanf tanhf

llvm-svn: 246194
2015-08-27 19:56:57 +00:00