Commit Graph

1689 Commits

Author SHA1 Message Date
Simon Pilgrim 97a0c54179 Fix cppcheck operator precedence warning. NFCI.
llvm-svn: 360234
2019-05-08 10:07:34 +00:00
Simon Pilgrim 3044ac058b Avoid use-after-move warnings by using swap instead. NFCI.
Swap should be as quick in these cases, and leaves the original variables in a known (empty) state.

llvm-svn: 360164
2019-05-07 15:45:00 +00:00
Craig Topper ad56843dd7 [SelectionDAG][X86] Support inline assembly returning an mmx register into a type with fewer than 64 bits.
It's possible to use the 'y' mmx constraint with a type narrower than 64-bits.

This patch supports this by bitcasting the mmx type to 64-bits and then
truncating to the desired type.

There are probably other missing type combinations we need to support, but this
is the case we have a bug report for.

Fixes PR41748.

Differential Revision: https://reviews.llvm.org/D61582

llvm-svn: 360069
2019-05-06 19:50:14 +00:00
Craig Topper f723490e76 [SelectionDAG] Replace llvm_unreachable at the end of getCopyFromParts with a report_fatal_error.
Based on PR41748, not all cases are handled in this function.

llvm_unreachable is treated as an optimization hint than can prune code paths
in a release build. This causes weird behavior when PR41748 is encountered on a
release build. It appears to generate an fp_round instruction from the floating
point code.

Making this a report_fatal_error prevents incorrect optimization of the code
and will instead generate a message to file a bug report.

llvm-svn: 360008
2019-05-06 04:01:49 +00:00
Tim Northover ee2474df9f DAG: allow DAG pointer size different from memory representation.
In preparation for supporting ILP32 on AArch64, this modifies the SelectionDAG
builder code so that pointers are allowed to have a larger type when "live" in
the DAG compared to memory.

Pointers get zero-extended whenever they are loaded, and truncated prior to
stores.  In addition, a few not quite so obvious locations need updating:

  * A GEP that has not been marked inbounds needs to enforce the IR-documented
    2s-complement wrapping at the memory pointer size. Inbounds GEPs are
    undefined if they overflow the address space, so no additional operations
    are needed.
  * Signed comparisons would give incorrect results if performed on the
    zero-extended values.

This shouldn't affect CodeGen for now, but will become active when the AArch64
ILP32 support is committed.

llvm-svn: 359676
2019-05-01 12:37:30 +00:00
Bjorn Pettersson 71e8c6f20f Add "const" in GetUnderlyingObjects. NFC
Summary:
Both the input Value pointer and the returned Value
pointers in GetUnderlyingObjects are now declared as
const.

It turned out that all current (in-tree) uses of
GetUnderlyingObjects were trivial to update, being
satisfied with have those Value pointers declared
as const. Actually, in the past several of the users
had to use const_cast, just because of ValueTracking
not providing a version of GetUnderlyingObjects with
"const" Value pointers. With this patch we get rid
of those const casts.

Reviewers: hfinkel, materi, jkorous

Reviewed By: jkorous

Subscribers: dexonsmith, jkorous, jholewinski, sdardis, eraman, hiraditya, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61038

llvm-svn: 359072
2019-04-24 06:55:50 +00:00
Tim Northover 2be3f868f9 DAG: propagate ConsecutiveRegs flags to returns too.
Arguments already have a flag to inform backends when they have been split up.
The AArch64 arm64_32 ABI makes use of these on return types too, so that code
emitted for armv7k can be ABI-compliant.

There should be no CodeGen changes yet, just making more information available.

llvm-svn: 358399
2019-04-15 12:04:10 +00:00
Tim Northover 9db00f7e5b DAG: propagate whether an arg is a pointer for CallingConv decisions.
The arm64_32 ABI specifies that pointers (despite being 32-bits) should be
zero-extended to 64-bits when passed in registers for efficiency reasons. This
means that the SelectionDAG needs to be able to tell the backend that an
argument was originally a pointer, which is implmented here.

Additionally, some memory intrinsics need to be declared as taking an i8*
instead of an iPTR.

There should be no CodeGen change yet, but it will be triggered when AArch64
backend support for ILP32 is added.

llvm-svn: 358398
2019-04-15 12:03:54 +00:00
Evandro Menezes 85bd3978ae [IR] Refactor attribute methods in Function class (NFC)
Rename the functions that query the optimization kind attributes.

Differential revision: https://reviews.llvm.org/D60287

llvm-svn: 357731
2019-04-04 22:40:06 +00:00
Hans Wennborg 800b12f90a Switch lowering: exploit unreachable fall-through when lowering case range cluster
In the example below, we would previously emit two range checks, one for cases
1--3 and one for 4--6. This patch makes us exploit the fact that the
fall-through is unreachable and only one range check is necessary.

  switch i32 %i, label %default [
    i32 1,  label %bb1
    i32 2,  label %bb1
    i32 3,  label %bb1
    i32 4,  label %bb2
    i32 5,  label %bb2
    i32 6,  label %bb2
  ]
  default: unreachable

llvm-svn: 357252
2019-03-29 13:40:05 +00:00
Craig Topper ea626d8bdb [SelectionDAGBuilder] Fix 80 column violation. NFC
llvm-svn: 357213
2019-03-28 20:52:22 +00:00
Nikita Popov 6d855ea024 [ConstantRange] Rename isWrappedSet() to isUpperWrapped()
Split out from D59749. The current implementation of isWrappedSet()
doesn't do what it says on the tin, and treats ranges like
[X, Max] as wrapping, because they are represented as [X, 0) when
using half-inclusive ranges. This also makes it inconsistent with
the semantics of isSignWrappedSet().

This patch renames isWrappedSet() to isUpperWrapped(), in preparation
for the introduction of a new isWrappedSet() method with corrected
behavior.

llvm-svn: 357107
2019-03-27 18:19:33 +00:00
Hans Wennborg 5c0d7a24e8 Re-commit r355490 "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default"
Original commit by Ayonam Ray.

This commit adds a regression test for the issue discovered in the
previous commit: that the range check for the jump table can only be
omitted if the fall-through destination of the jump table is
unreachable, which isn't necessarily true just because the default of
the switch is unreachable.

This addresses the missing optimization in PR41242.

> During the lowering of a switch that would result in the generation of a
> jump table, a range check is performed before indexing into the jump
> table, for the switch value being outside the jump table range and a
> conditional branch is inserted to jump to the default block. In case the
> default block is unreachable, this conditional jump can be omitted. This
> patch implements omitting this conditional branch for unreachable
> defaults.
>
> Differential Revision: https://reviews.llvm.org/D52002
> Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev

llvm-svn: 357067
2019-03-27 14:10:11 +00:00
Yi Kong 74b874ac4c Fix nondeterminism introduced in r353954
DenseMap iteration order is not guaranteed, use MapVector instead.

Fix provided by srhines.

Differential Revision: https://reviews.llvm.org/D59807

llvm-svn: 356988
2019-03-26 12:18:08 +00:00
Philip Reames db65a5b776 Allow unordered loads to be considered invariant in CodeGen
The actual code change is fairly straight forward, but exercising it isn't. First, it turned out we weren't adding the appropriate flags in SelectionDAG. Second, it turned out that we've got some optimization gaps, so obvious test cases don't work.

My first attempt (in atomic-unordered.ll) points out a deficiency in our peephole-opt folding logic which I plan to fix separately. Instead, I'm exercising this through MachineLICM.

Differential Revision: https://reviews.llvm.org/D59375

llvm-svn: 356494
2019-03-19 18:27:18 +00:00
Simon Pilgrim a56f2822d0 [SelectionDAG] Handle unary SelectPatternFlavor for ABS case in SelectionDAGBuilder::visitSelect
These changes are related to PR37743 and include:

    SelectionDAGBuilder::visitSelect handles the unary SelectPatternFlavor::SPF_ABS case to build ABS node.

    Delete the redundant recognizer of the integer ABS pattern from the DAGCombiner.

    Add promoting the integer ABS node in the LegalizeIntegerType.

    Expand-based legalization of integer result for the ABS nodes.

    Expand-based legalization of ABS vector operations.

    Add some integer abs testcases for different typesizes for Thumb arch

    Add the custom ABS expanding and change the SAD pattern recognizer for X86 arch: The i64 result of the ABS is expanded to:
        tmp = (SRA, Hi, 31)
        Lo = (UADDO tmp, Lo)
        Hi = (XOR tmp, (ADDCARRY tmp, hi, Lo:1))
        Lo = (XOR tmp, Lo)

    The "detectZextAbsDiff" function is changed for the recognition of pattern with the ABS node. Given a ABS node, detect the following pattern:
        (ABS (SUB (ZERO_EXTEND a), (ZERO_EXTEND b))).

    Change integer abs testcases for codegen with the ABS node support for AArch64.
        Indicate that the ABS is legal for the i64 type when the NEON is supported.
        Change the integer abs testcases to show changing of codegen.

    Add combine and legalization of ABS nodes for Thumb arch.

    Extend 'matchSelectPattern' to recognize the ABS patterns with ICMP_SGE condition.

For discussion, see https://bugs.llvm.org/show_bug.cgi?id=37743

Patch by: @ikulagin (Ivan Kulagin)

Differential Revision: https://reviews.llvm.org/D49837

llvm-svn: 356468
2019-03-19 16:24:55 +00:00
David Stenberg 8a2e4af7e7 [DebugInfo] Ignore bitcasts when lowering stack arg dbg.values
Summary:
Look past bitcasts when looking for parameter debug values that are
described by frame-index loads in `EmitFuncArgumentDbgValue()`.

In the attached test case we would be left with an undef `DBG_VALUE`
for the parameter without this patch.

A similar fix was done for parameters passed in registers in D13005.

This fixes PR40777.

Reviewers: aprantl, vsk, jmorse

Reviewed By: aprantl

Subscribers: bjope, javed.absar, jdoerfert, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D58831

llvm-svn: 356363
2019-03-18 11:27:32 +00:00
Heejin Ahn 66ce419468 [WebAssembly] Make rethrow take an except_ref type argument
Summary:
In the new wasm EH proposal, `rethrow` takes an `except_ref` argument.
This change was missing in r352598.

This patch adds `llvm.wasm.rethrow.in.catch` intrinsic. This is an
intrinsic that's gonna eventually be lowered to wasm `rethrow`
instruction, but this intrinsic can appear only within a catchpad or a
cleanuppad scope. Also this intrinsic needs to be invokable - otherwise
EH pad successor for it will not be correctly generated in clang.

This also adds lowering logic for this intrinsic in
`SelectionDAGBuilder::visitInvoke`. This routine is basically a
specialized and simplified version of
`SelectionDAGBuilder::visitTargetIntrinsic`, but we can't use it
because if is only for `CallInst`s.

This deletes the previous `llvm.wasm.rethrow` intrinsic and related
tests, which was meant to be used within a `__cxa_rethrow` library
function. Turned out this needs some more logic, so the intrinsic for
this purpose will be added later.

LateEHPrepare takes a result value of `catch` and inserts it into
matching `rethrow` as an argument.

`RETHROW_IN_CATCH` is a pseudo instruction that serves as a link between
`llvm.wasm.rethrow.in.catch` and the real wasm `rethrow` instruction. To
generate a `rethrow` instruction, we need an `except_ref` argument,
which is generated from `catch` instruction. But `catch` instrutions are
added in LateEHPrepare pass, so we use `RETHROW_IN_CATCH`, which takes
no argument, until we are able to correctly lower it to `rethrow` in
LateEHPrepare.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59352

llvm-svn: 356316
2019-03-16 05:38:57 +00:00
Paul Robinson 05efe0fdc4 [PS4] Emit a trap after a stack-protector fail call.
llvm-svn: 355542
2019-03-06 19:57:43 +00:00
Alexander Kornienko 3d467a890e Revert "[CodeGen] Omit range checks from jump tables when lowering switches with unreachable default"
This reverts commit 2a0f2c5ef3 (r355490).

The commit causes an assertion failure when compiling LLVM code:
$ cat repro.cpp
class QQQ {
public:
  bool x() const;
  bool y() const;
  unsigned getSizeInBits() const {
    if (y() || x())
      return getScalarSizeInBits();
    return getScalarSizeInBits() * 2;
  }
  unsigned getScalarSizeInBits() const;
};
int f(const QQQ &Ty) {
  switch (Ty.getSizeInBits()) {
    case 1:
    case 8:
      return 0;
    case 16:
      return 1;
    case 32:
      return 2;
    case 64:
      return 3;
    default:
      __builtin_unreachable();
  }
}
$ clang -O2 -o repro.o repro.cpp
assert.h assertion failed at llvm/include/llvm/ADT/ilist_iterator.h:139 in llvm::ilist_iterator::reference llvm::ilist_iterator<llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, true, false>::operator*() const [OptionsT = llvm::ilist_detail::node_options<llvm::MachineInstr, true, true, void>, IsReverse = true, IsConst = false]: !NodePtr->isKnownSentinel()
*** Check failure stack trace: ***
    @     0x558aab4afc10  __assert_fail
    @     0x558aa885479b  llvm::ilist_iterator<>::operator*()
    @     0x558aa8854715  llvm::MachineInstrBundleIterator<>::operator*()
    @     0x558aa92c33c3  llvm::X86InstrInfo::optimizeCompareInstr()
    @     0x558aa9a9c251  (anonymous namespace)::PeepholeOptimizer::optimizeCmpInstr()
    @     0x558aa9a9b371  (anonymous namespace)::PeepholeOptimizer::runOnMachineFunction()
    @     0x558aa99a4fc8  llvm::MachineFunctionPass::runOnFunction()
    @     0x558aab019fc4  llvm::FPPassManager::runOnFunction()
    @     0x558aab01a3a5  llvm::FPPassManager::runOnModule()
    @     0x558aab01aa9b  (anonymous namespace)::MPPassManager::runOnModule()
    @     0x558aab01a635  llvm::legacy::PassManagerImpl::run()
    @     0x558aab01afe1  llvm::legacy::PassManager::run()
    @     0x558aa5914769  (anonymous namespace)::EmitAssemblyHelper::EmitAssembly()
    @     0x558aa5910f44  clang::EmitBackendOutput()
    @     0x558aa5906135  clang::BackendConsumer::HandleTranslationUnit()
    @     0x558aa6d165ad  clang::ParseAST()
    @     0x558aa6a94e22  clang::ASTFrontendAction::ExecuteAction()
    @     0x558aa590255d  clang::CodeGenAction::ExecuteAction()
    @     0x558aa6a94840  clang::FrontendAction::Execute()
    @     0x558aa6a38cca  clang::CompilerInstance::ExecuteAction()
    @     0x558aa4e2294b  clang::ExecuteCompilerInvocation()
    @     0x558aa4df6200  cc1_main()
    @     0x558aa4e1b37f  ExecuteCC1Tool()
    @     0x558aa4e1a725  main
    @     0x7ff20d56abbd  __libc_start_main
    @     0x558aa4df51c9  _start

llvm-svn: 355515
2019-03-06 15:23:50 +00:00
Ayonam Ray 2a0f2c5ef3 [CodeGen] Omit range checks from jump tables when lowering switches with unreachable default
During the lowering of a switch that would result in the generation of a
jump table, a range check is performed before indexing into the jump
table, for the switch value being outside the jump table range and a
conditional branch is inserted to jump to the default block. In case the
default block is unreachable, this conditional jump can be omitted. This
patch implements omitting this conditional branch for unreachable
defaults.

Differential Revision: https://reviews.llvm.org/D52002
Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev

llvm-svn: 355490
2019-03-06 10:01:02 +00:00
Ayonam Ray af92b7a3b8 Reversing the commit of revision 355483 since it is giving a regression on a newly added test.
llvm-svn: 355487
2019-03-06 07:51:28 +00:00
Ayonam Ray 6025fa8e30 [CodeGen] Omit range checks from jump tables when lowering switches with unreachable default
During the lowering of a switch that would result in the generation of a
jump table, a range check is performed before indexing into the jump
table, for the switch value being outside the jump table range and a
conditional branch is inserted to jump to the default block. In case the
default block is unreachable, this conditional jump can be omitted. This
patch implements omitting this conditional branch for unreachable
defaults.

Differential Revision: https://reviews.llvm.org/D52002
Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev

llvm-svn: 355483
2019-03-06 07:27:45 +00:00
Heejin Ahn 195a62e9ae [WebAssembly] Delete ThrowUnwindDest map from WasmEHFuncInfo
Summary:
Before when we implemented the first EH proposal, 'catch <tag>'
instruction may not catch an exception so there were multiple EH pads an
exception can unwind to. That means a BB could have multiple EH pad
successors.

Now after we switched to the new proposal, every 'catch' instruction
catches an exception, and there is only one catchpad per catchswitch, so
we at most have one EH pad successor, making `ThrowUnwindDest` map in
`WasmEHInfo` unnecessary.

Keeping `ThrowUnwindDest` map in `WasmEHInfo` has its own problems,
because other optimization passes can split a BB that contains possibly
throwing calls (previously invokes), and we have to update the map every
time that happens, which is not easy for common CodeGen passes.

This also correctly updates successor info in LateEHPrepare when we add
a rethrow instruction.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58486

llvm-svn: 355296
2019-03-03 22:35:56 +00:00
Philip Reames 288a95fc8c Seperate volatility and atomicity/ordering in SelectionDAG
At the moment, we mark every atomic memory access as being also volatile. This is unnecessarily conservative and prohibits many legal transforms (DCE, folding, etc..).

This patch removes MOVolatile from the MachineMemOperands of atomic, but not volatile, instructions. This should be strictly NFC after a series of previous patches which have gone in to ensure backend code is conservative about handling of isAtomic MMOs. Once it's in and baked for a bit, we'll start working through removing unnecessary bailouts one by one. We applied this same strategy to the middle end a few years ago, with good success.

To make sure this patch itself is NFC, it is build on top of a series of other patches which adjust code to (for the moment) be as conservative for an atomic access as for a volatile access and build up a test corpus (mostly in test/CodeGen/X86/atomics-unordered.ll)..

Previously landed

    D57593 Fix a bug in the definition of isUnordered on MachineMemOperand
    D57596 [CodeGen] Be conservative about atomic accesses as for volatile
    D57802 Be conservative about unordered accesses for the moment
    rL353959: [Tests] First batch of cornercase tests for unordered atomics.
    rL353966: [Tests] RMW folding tests w/unordered atomic operations.
    rL353972: [Tests] More unordered atomic lowering tests.
    rL353989: [SelectionDAG] Inline a single use helper function, and remove last non-MMO interface
    rL354740: [Hexagon, SystemZ] Be super conservative about atomics
    rL354800: [Lanai] Be super conservative about atomics
    rL354845: [ARM] Be super conservative about atomics

Attention Out of Tree Backend Owners: This patch may break you. If it does, you can use the TLI getMMOFlags hook to restore the MOVolatile to any instruction you need to. (See llvm-dev thread titled "PSA: Changes to how atomics are handled in backends" started Feb 27, 2019.)

Differential Revision: https://reviews.llvm.org/D57601

llvm-svn: 355025
2019-02-27 20:20:08 +00:00
Matt Arsenault 0280a5e143 DAG: Add helper for creating shifts with correct type
llvm-svn: 354649
2019-02-22 03:38:47 +00:00
Clement Courbet a0321c23e8 Re-land part of r354244 "[DAGCombiner] Eliminate dead stores to stack."
This part introduces the lifetime node.

llvm-svn: 354578
2019-02-21 12:59:36 +00:00
Clement Courbet 292291fb90 Revert r354244 "[DAGCombiner] Eliminate dead stores to stack."
Breaks some bots.

llvm-svn: 354245
2019-02-18 08:24:29 +00:00
Clement Courbet 57f34dbd3e [DAGCombiner] Eliminate dead stores to stack.
Summary:
A store to an object whose lifetime is about to end can be removed.

See PR40550 for motivation.

Reviewers: niravd

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D57541

llvm-svn: 354244
2019-02-18 07:59:01 +00:00
Nirav Dave 7875841121 [X86] Fix LowerAsmOutputForConstraint.
Summary:
Update Flag when generating cc output.

Fixes PR40737.

Reviewers: rnk, nickdesaulniers, craig.topper, spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58283

llvm-svn: 354163
2019-02-15 20:01:55 +00:00
Jeremy Morse 897a9f8d00 Fix an accidentally flipped pair of arguments, NFCI
While rebasing a refactor in r353950 I accidentally swapped two function
arguments; one is SelectionDAGBuilders "current" DebugLoc, the other is the one
from the "current" debug intrinsic. They're probably always identical, but I
haven't proved that yet.

llvm-svn: 354019
2019-02-14 11:09:24 +00:00
Philip Reames e4cfb7dae8 [SelectionDAG] Inline a single use helper function, and remove last non-MMO interface [NFC]
For D57601, we need to know whether the instruction is volatile.  We'd either have to pass yet another parameter, or just standardize on the MMO interface.  I chose the second.

llvm-svn: 353989
2019-02-13 23:01:11 +00:00
Philip Reames 41f400c948 [SelectionDAG] Kill last uses of getAtomic w/o a MMO operand [NFC]
The helper function was used by only two callers, and largely ended up providing distinct functionality based on optional arguments and opcode.  Inline and simply to make the functionality much more clear.

llvm-svn: 353977
2019-02-13 20:42:59 +00:00
Jeremy Morse 291713a596 [DebugInfo][DAG] Either salvage dangling debug info or emit Undef DBG_VALUEs
In this patch SelectionDAG tries to salvage any dbg.values that are going to be
dropped, in case they can be recovered from Values in the current BB. It also
strengthens SelectionDAGs handling of dangling debug data, so that dbg.values
are *always* emitted (as Undef or otherwise) instead of dangling forever.

The motivation behind this patch exists in the new test case: a memory address
(here a bitcast and GEP) exist in one basic block, and a dbg.value referring to
the address is left in the 'next' block. The base pointer is live across all
basic blocks. In current llvm trunk the dbg.value cannot be encoded, and it
isn't even emitted as an Undef DBG_VALUE.

The change is simply: if we're definitely going to drop a dbg.value, repeatedly
apply salvageDebugInfo to its operand until either we find something that can
be encoded, or we can't salvage any further in which case we produce an Undef
DBG_VALUE. To know when we're "definitely going to drop a dbg.value",
SelectionDAG signals SelectionDAGBuilder when all IR instructions have been
encoded to force salvaging. This ensures that any dbg.value that's dangling
after DAG creation will have a corresponding DBG_VALUE encoded. 

Differential Revision: https://reviews.llvm.org/D57694

llvm-svn: 353954
2019-02-13 16:33:05 +00:00
Jeremy Morse 6d3cd3b4ec [DebugInfo][DAG] Refactor dbg.value lowering into its own method
This is a pure copy-and-paste job, moving the logic for lowering dbg.value
intrinsics to SDDbgValues into its own function. This is ahead of adding some
more users of this logic.

Differential Revision: https://reviews.llvm.org/D57697

llvm-svn: 353950
2019-02-13 15:53:10 +00:00
Jeremy Morse a9a11aac0f [DebugInfo][DAG] Limit special-casing of dbg.values for Arguments
SelectionDAGBuilder has special handling for dbg.value intrinsics that are
understood to define the location of function parameters on entry to the
function. To enable this, we avoid recording a dbg.value as a virtual register
reference if it might be such a parameter, so that it later hits
EmitFuncArgumentDbgValue.

This patch reduces the set of circumstances where we avoid recording a
dbg.value as a virtual register reference, to allow more "normal" variables
to be recorded that way. We now only bypass for potential parameters if:
 * The dbg.value operand is an Argument,
 * The Variable is a parameter, and
 * The Variable is not inlined.
meaning it's very likely that the dbg.value is a function-entry parameter
location.

Differential Revision: https://reviews.llvm.org/D57584

llvm-svn: 353948
2019-02-13 13:37:33 +00:00
Bjorn Pettersson 4892f06e06 [SelectionDAGBuilder] Add restrictions to EmitFuncArgumentDbgValue
Summary:
This patch fixes PR40587.

When a dbg.value instrinsic is emitted to the DAG
by using EmitFuncArgumentDbgValue the resulting
DBG_VALUE is hoisted to the beginning of the entry
block. I think the idea is to be able to locate
a formal argument already from the start of the
function.
However, EmitFuncArgumentDbgValue only checked that
the value that was used to describe a variable was
originating from a function parameter, not that the
variable itself actually was an argument to the
function. So when for example assigning a local
variable "local" the value from an argument "a",
the assocated DBG_VALUE instruction would be hoisted
to the beginning of the function, even if the scope
for "local" started somewhere else (or if "local"
was mapped to other values earlier in the function).

This patch adds some logic to EmitFuncArgumentDbgValue
to check that the variable being described actually
is an argument to the function. And that the dbg.value
being lowered already is in the entry block. Otherwise
we bail out, and the dbg.value will be handled as an
ordinary dbg.value (not as a "FuncArgumentDbgValue").

A tricky situation is when both the variable and
the value is related to function arguments, but not
neccessarily the same argument. We make sure that we
do not describe the same argument more than once as
a "FuncArgumentDbgValue". This solution works as long
as opt has injected a "first" dbg.value that corresponds
to the formal argument at the function entry.

Reviewers: jmorse, aprantl

Subscribers: jyknight, hiraditya, fedor.sergeev, dstenb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57702

llvm-svn: 353735
2019-02-11 19:23:30 +00:00
Chandler Carruth 3160734af1 [CallSite removal] Migrate the statepoint GC infrastructure to use the
`CallBase` class rather than `CallSite` wrappers.

I pushed this change down through most of the statepoint infrastructure,
completely removing the use of CallSite where I could reasonably do so.
I ended up making a couple of cut-points: generic call handling
(instcombine, TLI, SDAG). As soon as it hit truly generic handling with
users outside the immediate code, I simply transitioned into or out of
a `CallSite` to make this a reasonable sized chunk.

Differential Revision: https://reviews.llvm.org/D56122

llvm-svn: 353660
2019-02-11 07:42:30 +00:00
Craig Topper 784929d045 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563
2019-02-08 20:48:56 +00:00
Nikita Popov 9d7e86a978 [CodeGen] Handle vector UADDO, SADDO, USUBO, SSUBO
This is part of https://bugs.llvm.org/show_bug.cgi?id=40442.

Vector legalization is implemented for the add/sub overflow opcodes.
UMULO/SMULO are also handled as far as legalization is concerned, but
they don't support vector expansion yet (so no tests for them).

The vector result widening implementation is suboptimal, because it
could result in a legalization loop.

Differential Revision: https://reviews.llvm.org/D57639

llvm-svn: 353464
2019-02-07 21:02:22 +00:00
Nirav Dave e5c37958f9 [InlineAsm][X86] Add backend support for X86 flag output parameters.
Allow custom handling of inline assembly output parameters and add X86
flag parameter support.

llvm-svn: 353307
2019-02-06 15:26:29 +00:00
Nirav Dave 54511076d4 [SelectionDAGBuilder] Refactor Inline Asm output check. NFCI.
llvm-svn: 353305
2019-02-06 15:12:46 +00:00
Leonard Chan 68d428e578 [Intrinsic] Unsigned Fixed Point Multiplication Intrinsic
Add an intrinsic that takes 2 unsigned integers with the scale of them
provided as the third argument and performs fixed point multiplication on
them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D55625

llvm-svn: 353059
2019-02-04 17:18:11 +00:00
Mandeep Singh Grang 70d484d94e [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects
Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Reviewers: rnk, efriedma, ssijaric, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D57183

llvm-svn: 352923
2019-02-01 21:41:33 +00:00
James Y Knight 7976eb5838 [opaque pointer types] Pass function types to CallInst creation.
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57170

llvm-svn: 352909
2019-02-01 20:43:25 +00:00
Heejin Ahn d6f487863d [WebAssembly] Exception handling: Switch to the new proposal
Summary:
This switches the EH implementation to the new proposal:
https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md
(The previous proposal was
 https://github.com/WebAssembly/exception-handling/blob/master/proposals/old/Exceptions.md)

- Instruction changes
  - Now we have one single `catch` instruction that returns a except_ref
    value
  - `throw` now can take variable number of operations
  - `rethrow` does not have 'depth' argument anymore
  - `br_on_exn` queries an except_ref to see if it matches the tag and
    branches to the given label if true.
  - `extract_exception` is a pseudo instruction that simulates popping
    values from wasm stack. This is to make `br_on_exn`, a very special
    instruction, work: `br_on_exn` puts values onto the stack only if it
    is taken, and the # of values can vay depending on the tag.

- Now there's only one `catch` per `try`, this patch removes all special
  handling for terminate pad with a call to `__clang_call_terminate`.
  Before it was the only case there are two catch clauses (a normal
  `catch` and `catch_all` per `try`).

- Make `rethrow` act as a terminator like `throw`. This splits BB after
  `rethrow` in WasmEHPrepare, and deletes an unnecessary `unreachable`
  after `rethrow` in LateEHPrepare.

- Now we stop at all catchpads (because we add wasm `catch` instruction
  that catches all exceptions), this creates new
  `findWasmUnwindDestinations` function in SelectionDAGBuilder.

- Now we use `br_on_exn` instrution to figure out if an except_ref
  matches the current tag or not, LateEHPrepare generates this sequence
  for catch pads:
```
  catch
  block i32
  br_on_exn $__cpp_exception
  end_block
  extract_exception
```

- Branch analysis for `br_on_exn` in WebAssemblyInstrInfo

- Other various misc. changes to switch to the new proposal.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D57134

llvm-svn: 352598
2019-01-30 03:21:57 +00:00
Nirav Dave 1527c0e727 [SelectionDAGBuilder] Remove redundant variable. NFCI.
llvm-svn: 352506
2019-01-29 15:14:07 +00:00
Ayonam Ray a1f6973ade Reversing the checkin for version 352484 as tests are failing.
llvm-svn: 352504
2019-01-29 15:00:50 +00:00
Ayonam Ray 4272af9b3e [CodeGen] Omit range checks from jump tables when lowering switches with unreachable default
During the lowering of a switch that would result in the generation of a 
jump table, a range check is performed before indexing into the jump 
table, for the switch value being outside the jump table range and a 
conditional branch is inserted to jump to the default block. In case the 
default block is unreachable, this conditional jump can be omitted. This 
patch implements omitting this conditional branch for unreachable 
defaults.

Review ID: D52002
Reviewers: Hans Wennborg, Eli Freidman, Roman Lebedev

llvm-svn: 352484
2019-01-29 12:01:32 +00:00
Jeremy Morse 66ac86b58d [DebugInfo][DAG] Process FrameIndex dbg.values unconditionally
A FrameIndex should be valid throughout a block regardless of what instructions
get selected in that block -- therefore we shouldn't harness dbg.values that
refer to FrameIndexes to an SDNode. There are numerous codegen reasons why
an SDNode never appears or doesn't become a location that a DBG_VALUE can
refer to. None of them actually affect the variable location.

Therefore, before any other tests to encode dbg_values in a SelectionDAG,
identify FrameIndex operands and encode them unattached to any SDNode.

Differential Revision: https://reviews.llvm.org/D57328

llvm-svn: 352467
2019-01-29 09:40:05 +00:00
James Y Knight 2c36240a82 Fix emission of _fltused for MSVC.
It should be emitted when any floating-point operations (including
calls) are present in the object, not just when calls to printf/scanf
with floating point args are made.

The difference caused by this is very subtle: in static (/MT) builds,
on x86-32, in a program that uses floating point but doesn't print it,
the default x87 rounding mode may not be set properly upon
initialization.

This commit also removes the walk of the types pointed to by pointer
arguments in calls. (To assist in opaque pointer types migration --
eventually the pointee type won't be available.)

That latter implies that it will no longer consider a call like
`scanf("%f", &floatvar)` as sufficient to emit _fltused on its
own. And without _fltused, `scanf("%f")` will abort with error R6002. This
new behavior is unlikely to bite anyone in practice (you'd have to
read a float, and do nothing with it!), and also, is consistent with
MSVC.

Differential Revision: https://reviews.llvm.org/D56548

llvm-svn: 352076
2019-01-24 18:34:00 +00:00
Nirav Dave 58e9833e98 [SelectionDAGBuilder] Simplify HasSideEffect calculation. NFC.
llvm-svn: 352067
2019-01-24 17:56:03 +00:00
Nirav Dave b41a198472 [InlineAsm] Don't calculate registers for inline asm memory operands. NFCI.
llvm-svn: 352066
2019-01-24 17:47:18 +00:00
Nirav Dave bd069f424f [SelectionDAGBuilder] Fuse inline asm input operand loops passes. NFCI.
llvm-svn: 352053
2019-01-24 15:15:32 +00:00
Nirav Dave d0418341fd [SelectionDAGBuilder] Defer C_Register Assignments to be in line with
those of C_RegisterClass. NFCI.

llvm-svn: 351854
2019-01-22 18:57:49 +00:00
Matt Arsenault a5840c3c39 Codegen support for atomicrmw fadd/fsub
llvm-svn: 351851
2019-01-22 18:36:06 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Nirav Dave 0a45bf0e55 [SelectionDAGBuilder] Cleanup InlineAsm Output generation. NFCI.
Defer inline asm's output fixup work until after we've generated the
inline asm node itself. Remove StoresToEmit, IndirectStoresToEmit, and
RetValRegs in favor of using ConstraintOperands.

llvm-svn: 351558
2019-01-18 15:57:13 +00:00
Florian Hahn d2c733b429 [SelectionDAG] Add getTokenFactor, which splits nodes with > 64k operands.
This functionality is required at multiple places which potentially
create large operand lists, like SelectionDAGBuilder or DAGCombiner.

Differential Revision: https://reviews.llvm.org/D56739

llvm-svn: 351552
2019-01-18 14:05:59 +00:00
Florian Hahn 1b81772328 [SelectionDAG] Add static getMaxNumOperands function to SDNode.
Summary:
Use this helper to make sure we use the same value at various places.
This will likely be needed at more places were we currently crash
because we use more operands than possible.

Also makes it easier to change in the future.

Reviewers: RKSimon, craig.topper, efriedma, aemerson

Reviewed By: RKSimon

Subscribers: hiraditya, arsenm, llvm-commits

Differential Revision: https://reviews.llvm.org/D56859

llvm-svn: 351537
2019-01-18 10:00:38 +00:00
Mandeep Singh Grang 33c49c0c82 [COFF, ARM64] Implement support for SEH extensions __try/__except/__finally
Summary:
This patch supports MS SEH extensions __try/__except/__finally. The intrinsics localescape and localrecover are responsible for communicating escaped static allocas from the try block to the handler.

We need to preserve frame pointers for SEH. So we create a new function/property HasLocalEscape.

Reviewers: rnk, compnerd, mstorsjo, TomTan, efriedma, ssijaric

Reviewed By: rnk, efriedma

Subscribers: smeenai, jrmuizel, alex, majnemer, ssijaric, ehsan, dmajor, kristina, javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53540

llvm-svn: 351370
2019-01-16 19:52:59 +00:00
Jeremy Morse 7dcea5ae3b [DebugInfo] Allow creation of DBG_VALUEs in blocks where the operand is not used
dbg.value intrinsics can appear in blocks where their operand is not used,
meaning the operand never receives an SDNode, and thus no DBG_VALUE will
be created. Get around this by looking to see whether the operand has already
been allocated a virtual register. This allows dbg.values of Phi node and
Values that are used across basic blocks to successfully be translated into
DBG_VALUEs.

Differential Revision: https://reviews.llvm.org/D56678

llvm-svn: 351358
2019-01-16 17:25:27 +00:00
Nirav Dave fcb4a492db [SelectionDAG] Check membership of register in class for single
register constraints. NFCI.

Now that X86's ST(7) constraints are fixed this check can be
reinstated.

llvm-svn: 351207
2019-01-15 17:09:23 +00:00
Nirav Dave 3badfe74a2 Reland "Refactor GetRegistersForValue. NFCI."
Remove over-strictification class membership check.

llvm-svn: 351074
2019-01-14 17:09:45 +00:00
Martin Storsjo 114ad37c1d Revert "[SelectionDAGBuilder] Refactor GetRegistersForValue. NFCI."
This reverts commit r350841, as it actually had functional changes
and broke compilation. See PR40290.

llvm-svn: 350921
2019-01-11 07:31:17 +00:00
Nirav Dave cd18977add [SelectionDAGBuilder] Refactor GetRegistersForValue. NFCI.
llvm-svn: 350841
2019-01-10 16:25:47 +00:00
Nirav Dave 4817c0e46c [SelectionDAGBuilder] Fix formatting. NFC.
llvm-svn: 350839
2019-01-10 16:22:19 +00:00
Nirav Dave 57f2c14860 [SelectionDAGBuilder] Refactor visitInlineAsm. NFC.
llvm-svn: 350837
2019-01-10 16:18:18 +00:00
James Y Knight 62df5eed16 [opaque pointer types] Remove some calls to generic Type subtype accessors.
That is, remove many of the calls to Type::getNumContainedTypes(),
Type::subtypes(), and Type::getContainedType(N).

I'm not intending to remove these accessors -- they are
useful/necessary in some cases. However, removing the pointee type
from pointers would potentially break some uses, and reducing the
number of calls makes it easier to audit.

llvm-svn: 350835
2019-01-10 16:07:20 +00:00
Ayonam Ray e00606a1b2 Reversing the commit in revision 350186. Revision causes regression in 4
tests.

llvm-svn: 350187
2019-01-01 07:28:55 +00:00
Ayonam Ray c471bb2e67 Omit range checks from jump tables when lowering switches with unreachable
default

During the lowering of a switch that would result in the generation of a jump
table, a range check is performed before indexing into the jump table, for the
switch value being outside the jump table range and a conditional branch is
inserted to jump to the default block. In case the default block is
unreachable, this conditional jump can be omitted. This patch implements
omitting this conditional branch for unreachable defaults.

Review Reference: D52002

llvm-svn: 350186
2019-01-01 06:37:50 +00:00
George Burgess IV 610c76534f [SelectionDAGBuilder] Use ::precise LocationSizes; NFC
More migration so we can disable the implicit int -> LocationSize
conversion.

All of these are either scatter/gather'ed vector instructions, or direct
loads. Hence, they're all precise.

Perhaps if we see way more getTypeStoreSize calls, we can make a
getTypeStoreLocationSize (or similar) as a wrapper that applies this
::precise. Doesn't appear that it's a good idea to make getTypeStoreSize
return a LocationSize itself, however.

llvm-svn: 350042
2018-12-24 05:34:21 +00:00
Simon Pilgrim b208255fe0 [SelectionDAGBuilder] Enable funnel shift building to custom rotates
This patch enables funnel shift -> rotate building for all ROTL/ROTR custom/legal operations.

AFAICT X86 was the last target that was missing modulo support (PR38243), but I've tried to CC stakeholders for every target that has ROTL/ROTR custom handling for their final OK.

Differential Revision: https://reviews.llvm.org/D55747

llvm-svn: 349765
2018-12-20 14:56:44 +00:00
Pete Cooper f86db5ce9e Rewrite objc intrinsics to runtime methods in PreISelIntrinsicLowering instead of SDAG.
SelectionDAG currently changes these intrinsics to function calls, but that won't work
for other ISel's.  Also we want to eventually support nonlazybind and weak linkage coming
from the front-end which we can't do in SelectionDAG.

llvm-svn: 349552
2018-12-18 22:20:03 +00:00
Leonard Chan 118e53fd63 [Intrinsic] Signed Fixed Point Multiplication Intrinsic
Add an intrinsic that takes 2 signed integers with the scale of them provided
as the third argument and performs fixed point multiplication on them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D54719

llvm-svn: 348912
2018-12-12 06:29:14 +00:00
Jeremy Morse a06b163d5c [DebugInfo] Don't drop dbg.value's of nullptr
Currently, dbg.value's of "nullptr" are dropped when entering a SelectionDAG --
apparently just because of an oversight when recognising Values that are
constant (see PR39787). This patch adds ConstantPointerNull to the list of
constants that can be turned into DBG_VALUEs.

The matter of what bit-value a null pointer constant in LLVM has was raised
in this mailing list thread:

  http://lists.llvm.org/pipermail/llvm-dev/2018-December/128234.html

Where it transpires LLVM relies on (IR) null pointers being zero valued,
thus I've baked this assumption into the patch.

Differential Revision: https://reviews.llvm.org/D55227

llvm-svn: 348753
2018-12-10 12:04:08 +00:00
Pete Cooper 782a490dfb Follow-up from r348441 to add the rest of the objc ARC intrinsics.
This adds the other intrinsics used by ARC and codegen's them to their respective runtime methods.

llvm-svn: 348646
2018-12-07 21:28:47 +00:00
Pete Cooper e13d0992dc Add objc.* ARC intrinsics and codegen them to their runtime methods.
Reviewers: erik.pilkington, ahatanak

Differential Revision: https://reviews.llvm.org/D55233

llvm-svn: 348441
2018-12-06 00:52:54 +00:00
Simon Pilgrim 180639afe5 [SelectionDAG] Initial support for FSHL/FSHR funnel shift opcodes (PR39467)
This is an initial patch to add a minimum level of support for funnel shifts to the SelectionDAG and to begin wiring it up to the X86 SHLD/SHRD instructions.

Some partial legalization code has been added to handle the case for 'SlowSHLD' where we want to expand instead and I've added a few DAG combines so we don't get regressions from the existing DAG builder expansion code.

Differential Revision: https://reviews.llvm.org/D54698

llvm-svn: 348353
2018-12-05 11:12:12 +00:00
Amara Emerson 814a6794ba [SelectionDAG] Split very large token factors for loads into 64k chunks.
There's a 64k limit on the number of SDNode operands, and some very large
functions with 64k or more loads can cause crashes due to this limit being hit
when a TokenFactor with this many operands is created. To fix this, create
sub-tokenfactors if we've exceeded the limit.

No test case as it requires a very large function.

rdar://45196621

Differential Revision: https://reviews.llvm.org/D55073

llvm-svn: 348324
2018-12-05 00:41:30 +00:00
Craig Topper 4954c66430 [SelectionDAG] Compute known bits and num sign bits for live out vector registers. Use it to add AssertZExt/AssertSExt in the live in basic blocks
Summary:
We already support this for scalars, but it was explicitly disabled for vectors. In the updated test cases this allows us to see the upper bits are zero to use less multiply instructions to emulate a 64 bit multiply.

This should help with this ispc issue that a coworker pointed me to https://github.com/ispc/ispc/issues/1362

Reviewers: spatel, efriedma, RKSimon, arsenm

Reviewed By: spatel

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D54725

llvm-svn: 347287
2018-11-20 04:30:26 +00:00
Cameron McInally cbde0d9c7b [IR] Add a dedicated FNeg IR Instruction
The IEEE-754 Standard makes it clear that fneg(x) and
fsub(-0.0, x) are two different operations. The former is a bitwise
operation, while the latter is an arithmetic operation. This patch
creates a dedicated FNeg IR Instruction to model that behavior.

Differential Revision: https://reviews.llvm.org/D53877

llvm-svn: 346774
2018-11-13 18:15:47 +00:00
James Y Knight 72f76bf230 Add support for llvm.is.constant intrinsic (PR4898)
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.

Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.

Updated from patch initially by Janusz Sobczak.

Differential Revision: https://reviews.llvm.org/D4276

llvm-svn: 346322
2018-11-07 15:24:12 +00:00
Cameron McInally 9757d5d6c1 [FPEnv] Add constrained CEIL/FLOOR/ROUND/TRUNC intrinsics
Differential Revision: https://reviews.llvm.org/D53411

llvm-svn: 346141
2018-11-05 15:59:49 +00:00
Mandeep Singh Grang 547a0d765a [COFF, ARM64] Implement Intrinsic.sponentry for AArch64
Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform.

Patch by: Yin Ma (yinma@codeaurora.org)

Reviewers: mgrang, ssijaric, eli.friedman, TomTan, mstorsjo, rnk, compnerd, efriedma

Reviewed By: efriedma

Subscribers: efriedma, javed.absar, kristof.beyls, chrib, llvm-commits

Differential Revision: https://reviews.llvm.org/D53996

llvm-svn: 345909
2018-11-01 23:22:25 +00:00
Mandeep Singh Grang b0cdf56dd7 Revert "[COFF, ARM64] Implement Intrinsic.sponentry for AArch64"
This reverts commit 585b6667b4712e3c7f32401e929855b3313b4ff2.

llvm-svn: 345863
2018-11-01 17:53:57 +00:00
Mandeep Singh Grang 88ad9ac720 [COFF, ARM64] Implement Intrinsic.sponentry for AArch64
Summary: This patch adds Intrinsic.sponentry. This intrinsic is required to correctly support setjmp for AArch64 Windows platform.

Reviewers: mgrang, TomTan, rnk, compnerd, mstorsjo, efriedma

Reviewed By: efriedma

Subscribers: majnemer, chrib, javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D53673

llvm-svn: 345791
2018-10-31 23:16:20 +00:00
Scott Linder 92bb783cfe [SelectionDAG] Handle constant range [0,1) in lowerRangeToAssertZExt
lowerRangeToAssertZExt currently relies on something like EarlyCSE having
eliminated the constant range [0,1). At -O0 this leads to an assert.

Differential Revision: https://reviews.llvm.org/D53888

llvm-svn: 345770
2018-10-31 19:57:36 +00:00
Cameron McInally 2ad870e785 [FPEnv] [FPEnv] Add constrained intrinsics for MAXNUM and MINNUM
Differential Revision: https://reviews.llvm.org/D53216

llvm-svn: 345650
2018-10-30 21:01:29 +00:00
Leonard Chan 905abe5b5d [Intrinsic] Signed and Unsigned Saturation Subtraction Intirnsics
Add an intrinsic that takes 2 integers and perform saturation subtraction on
them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D53783

llvm-svn: 345512
2018-10-29 16:54:37 +00:00
Heejin Ahn 24faf859e5 Reland "[WebAssembly] LSDA info generation"
Summary:
This adds support for LSDA (exception table) generation for wasm EH.
Wasm EH mostly follows the structure of Itanium-style exception tables,
with one exception: a call site table entry in wasm EH corresponds to
not a call site but a landing pad.

In wasm EH, the VM is responsible for stack unwinding. After an
exception occurs and the stack is unwound, the control flow is
transferred to wasm 'catch' instruction by the VM, after which the
personality function is called from the compiler-generated code. (Refer
to WasmEHPrepare pass for more information on this part.)

This patch:
- Changes wasm.landingpad.index intrinsic to take a token argument, to
make this 1:1 match with a catchpad instruction
- Stores landingpad index info and catch type info MachineFunction in
before instruction selection
- Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an
exception table
- Adds WasmException class with overridden methods for table generation
- Adds support for LSDA section in Wasm object writer

Reviewers: dschuff, sbc100, rnk

Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52748

llvm-svn: 345345
2018-10-25 23:55:10 +00:00
Thomas Lively 30f1d69115 [NFC] Rename minnan and maxnan to minimum and maximum
Summary:
Changes all uses of minnan/maxnan to minimum/maximum
globally. These names emphasize that the semantic difference between
these operations is more than just NaN-propagation.

Reviewers: arsenm, aheejin, dschuff, javed.absar

Subscribers: jholewinski, sdardis, wdng, sbc100, jgravelle-google, jrtc27, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D53112

llvm-svn: 345218
2018-10-24 22:49:55 +00:00
Sanjay Patel ad12df829c [SelectionDAG] use 'match' to simplify code; NFC
Vector types are not possible here because this code only starts
matching from the scalar bool value of a conditional branch, but
this is another step towards completely removing the fake binop
queries for not/neg/fneg.

llvm-svn: 345041
2018-10-23 15:46:10 +00:00
Leonard Chan 0acfc6be38 [Intrinsic] Unigned Saturation Addition Intrinsic
Add an intrinsic that takes 2 integers and perform unsigned saturation
addition on them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D53340

llvm-svn: 344971
2018-10-22 23:08:40 +00:00
Krasimir Georgiev 547d824da6 Revert "[WebAssembly] LSDA info generation"
This reverts commit r344575.
Newly introduced test eh-lsda.ll.test fails with use-after-free under
ASAN build.

llvm-svn: 344639
2018-10-16 18:50:09 +00:00
Leonard Chan 699b3b54da [Intrinsic] Signed Saturation Addition Intrinsic
Add an intrinsic that takes 2 integers and perform saturation addition on them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D53053

llvm-svn: 344629
2018-10-16 17:35:41 +00:00
Heejin Ahn 0981eaab47 [WebAssembly] LSDA info generation
Summary:
This adds support for LSDA (exception table) generation for wasm EH.
Wasm EH mostly follows the structure of Itanium-style exception tables,
with one exception: a call site table entry in wasm EH corresponds to
not a call site but a landing pad.

In wasm EH, the VM is responsible for stack unwinding. After an
exception occurs and the stack is unwound, the control flow is
transferred to wasm 'catch' instruction by the VM, after which the
personality function is called from the compiler-generated code. (Refer
to WasmEHPrepare pass for more information on this part.)

This patch:
- Changes wasm.landingpad.index intrinsic to take a token argument, to
make this 1:1 match with a catchpad instruction
- Stores landingpad index info and catch type info MachineFunction in
before instruction selection
- Lowers wasm.lsda intrinsic to an MCSymbol pointing to the start of an
exception table
- Adds WasmException class with overridden methods for table generation
- Adds support for LSDA section in Wasm object writer

Reviewers: dschuff, sbc100, rnk

Subscribers: mgorny, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D52748

llvm-svn: 344575
2018-10-16 00:09:12 +00:00
Chandler Carruth edb12a838a [TI removal] Make variables declared as `TerminatorInst` and initialized
by `getTerminator()` calls instead be declared as `Instruction`.

This is the biggest remaining chunk of the usage of `getTerminator()`
that insists on the narrow type and so is an easy batch of updates.
Several files saw more extensive updates where this would cascade to
requiring API updates within the file to use `Instruction` instead of
`TerminatorInst`. All of these were trivial in nature (pervasively using
`Instruction` instead just worked).

llvm-svn: 344502
2018-10-15 10:04:59 +00:00
Thomas Lively 16c349d892 [Intrinsic] Add llvm.minimum and llvm.maximum instrinsic functions
Summary:
These new intrinsics have the semantics of the `minimum` and `maximum`
operations specified by the latest draft of IEEE 754-2018. Unlike
llvm.minnum and llvm.maxnum, these new intrinsics propagate NaNs and
always treat -0.0 as less than 0.0. `minimum` and `maximum` lower
directly to the existing `fminnan` and `fmaxnan` ISel DAG nodes. It is
safe to reuse these DAG nodes because before this patch were only
emitted in situations where there were known to be no NaN arguments or
where NaN propagation was correct and there were known to be no zero
arguments. I know of only four backends that lower fminnan and
fmaxnan: WebAssembly, ARM, AArch64, and SystemZ, and each of these
lowers fminnan and fmaxnan to instructions that are compatible with
the IEEE 754-2018 semantics.

Reviewers: aheejin, dschuff, sunfish, javed.absar

Subscribers: kristof.beyls, dexonsmith, kristina, llvm-commits

Differential Revision: https://reviews.llvm.org/D52764

llvm-svn: 344437
2018-10-13 07:21:44 +00:00
Alex Bradbury f27c67af12 [SelectionDAGBuilder][NFC] Pass LHSTy to getShiftAmountTy rather than RHSTy
r126518 introduced a a type parameter to the getShiftAmountTy target hook. It 
produces the type of the shift (RHSTy), parameterised by the type of the value 
being shifted (LHSTy). SelectionDAGBuilder::visitShift passed RHSTy rather 
than LHSTy and this patch corrects this. The change is a no-op because in LLVM 
IR the LHS and RHS types for a shift must be equal anyway.

llvm-svn: 343955
2018-10-08 06:24:59 +00:00
Fangrui Song 0cac726a00 llvm::sort(C.begin(), C.end(), ...) -> llvm::sort(C, ...)
Summary: The convenience wrapper in STLExtras is available since rL342102.

Reviewers: dblaikie, javed.absar, JDevlieghere, andreadb

Subscribers: MatzeB, sanjoy, arsenm, dschuff, mehdi_amini, sdardis, nemanjai, jvesely, nhaehnle, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, javed.absar, gbedwell, jrtc27, mgrang, atanasyan, steven_wu, george.burgess.iv, dexonsmith, kristina, jsji, llvm-commits

Differential Revision: https://reviews.llvm.org/D52573

llvm-svn: 343163
2018-09-27 02:13:45 +00:00
Heejin Ahn e41be38efd Unify landing pad information adding routines (NFC)
Summary:
We have `llvm::addLandingPadInfo` and `MachineFunction::addLandingPad`,
both of which add landing pad information to populate `LandingPadInfo`
but are called from different locations, which was confusing. This patch
unifies them with one `MachineFunction::addLandingPad` function, which
now has functionlities of both functions.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D52428

llvm-svn: 343018
2018-09-25 19:56:44 +00:00
Matt Arsenault 57b5966dad DAG: Handle odd vector sizes in calling conv splitting
This already worked if only one register piece was used,
but didn't if a type was split into multiple, unequal
sized pieces.

Fixes not splitting 3i16/v3f16 into two registers for
AMDGPU.

This will also allow fixing the ABI for 16-bit vectors
in a future commit so that it's the same for all subtargets.

llvm-svn: 341801
2018-09-10 11:49:23 +00:00
Matt Arsenault a6f32f4015 DAG: Factor out helper function for odd vector sizes
llvm-svn: 341392
2018-09-04 18:47:43 +00:00
Matt Arsenault 167601e629 DAG: Don't use ABI copies in some contexts
If an ABI-like value is used in a different block,
the type split used is not necessarily the same as
the call's ABI. The value is used through an intermediate
copy virtual registers from the other block. This
resulted in copies with inconsistent sizes later.

Fixes regressions since r338197 when AMDGPU started
splitting vector types for calls.

llvm-svn: 341018
2018-08-30 05:49:28 +00:00
Chandler Carruth 9ae926b973 [IR] Replace `isa<TerminatorInst>` with `isTerminator()`.
This is a bit awkward in a handful of places where we didn't even have
an instruction and now we have to see if we can build one. But on the
whole, this seems like a win and at worst a reasonable cost for removing
`TerminatorInst`.

All of this is part of the removal of `TerminatorInst` from the
`Instruction` type hierarchy.

llvm-svn: 340701
2018-08-26 09:51:22 +00:00
Heejin Ahn 9cd7f88a35 [WebAssembly] Don't make wasm cleanuppads into funclet entries
Summary:
Catchpads and cleanuppads are not funclet entries; they are only EH
scope entries. We already dont't set `isEHFuncletEntry` for catchpads.
This patch does the same thing for cleanuppads.

Reviewers: dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D50654

llvm-svn: 340330
2018-08-21 20:04:42 +00:00
Chen Zheng e2d47dd1bb [MISC]Fix wrong usage of std::equal()
Differential Revision: https://reviews.llvm.org/D49958

llvm-svn: 340000
2018-08-17 07:51:01 +00:00
Chandler Carruth 66654b72c9 [SDAG] Remove the reliance on MI's allocation strategy for
`MachineMemOperand` pointers attached to `MachineSDNodes` and instead
have the `SelectionDAG` fully manage the memory for this array.

Prior to this change, the memory management was deeply confusing here --
The way the MI was built relied on the `SelectionDAG` allocating memory
for these arrays of pointers using the `MachineFunction`'s allocator so
that the raw pointer to the array could be blindly copied into an
eventual `MachineInstr`. This creates a hard coupling between how
`MachineInstr`s allocate their array of `MachineMemOperand` pointers and
how the `MachineSDNode` does.

This change is motivated in large part by a change I am making to how
`MachineFunction` allocates these pointers, but it seems like a layering
improvement as well.

This would run the risk of increasing allocations overall, but I've
implemented an optimization that should avoid that by storing a single
`MachineMemOperand` pointer directly instead of allocating anything.
This is expected to be a net win because the vast majority of uses of
these only need a single pointer.

As a side-effect, this makes the API for updating a `MachineSDNode` and
a `MachineInstr` reasonably different which seems nice to avoid
unexpected coupling of these two layers. We can map between them, but we
shouldn't be *surprised* at where that occurs. =]

Differential Revision: https://reviews.llvm.org/D50680

llvm-svn: 339740
2018-08-14 23:30:32 +00:00
Sanjay Patel 15d1501aae [SelectionDAG] try harder to convert funnel shift to rotate
Similar to rL337966 - if the DAGCombiner's rotate matching was 
working as expected, I don't think we'd see any test diffs here.

AArch only goes right, and PPC only goes left. 
x86 has both, so no diffs there.

Differential Revision: https://reviews.llvm.org/D50091

llvm-svn: 339359
2018-08-09 17:26:22 +00:00
Ties Stuij 81f1fbdf5a test commit access
Summary: changing a few typos

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D50445

llvm-svn: 339245
2018-08-08 13:51:13 +00:00
Thomas Preud'homme 4107b31df2 Support inline asm with multiple 64bit output in 32bit GPR
Summary: Extend fix for PR34170 to support inline assembly with multiple output operands that do not naturally go in the register class it is constrained to (eg. double in a 32-bit GPR as in the PR).

Reviewers: bogner, t.p.northover, lattner, javed.absar, efriedma

Reviewed By: efriedma

Subscribers: efriedma, tra, eraman, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45437

llvm-svn: 339225
2018-08-08 09:35:26 +00:00
Hsiangkai Wang ef72e481ea [DebugInfo] Refactor DbgInfoIntrinsic class hierarchy.
In the past, DbgInfoIntrinsic has a strong assumption that these
intrinsics all have variables and expressions attached to them.
However, it is too strong to derive the class for other debug entities.
Now, it has problems for debug labels.

In order to make DbgInfoIntrinsic as a base class for 'debug info', I
create a class for 'variable debug info', DbgVariableIntrinsic.

DbgDeclareInst, DbgAddrIntrinsic, and DbgValueInst will be derived from it.

Differential Revision: https://reviews.llvm.org/D50220

llvm-svn: 338984
2018-08-06 03:59:47 +00:00
Sanjay Patel 8aac22e06a [SelectionDAG] fix bug in translating funnel shift with non-power-of-2 type
The bug is visible in the constant-folded x86 tests. We can't use the
negated shift amount when the type is not power-of-2:
https://rise4fun.com/Alive/US1r

...so in that case, use the regular lowering that includes a select
to guard against a shift-by-bitwidth. This path is improved by only
calculating the modulo shift amount once now.

Also, improve the rotate (with power-of-2 size) lowering to use
a negate rather than subtract from bitwidth. This improves the
codegen whether we have a rotate instruction or not (although
we can still see that we're not matching to a legal rotate in
all cases).

llvm-svn: 338592
2018-08-01 17:17:08 +00:00
Matt Arsenault dcec0888e2 DAG: Correct pointer type used for stack slot
Correct the address space for the inserted argument
stack slot.

AMDGPU seems to not do anything with this information,
so I don't think this was breaking anything.

llvm-svn: 338428
2018-07-31 19:51:20 +00:00
Fangrui Song f78650a8de Remove trailing space
sed -Ei 's/[[:space:]]+$//' include/**/*.{def,h,td} lib/**/*.{cpp,h}

llvm-svn: 338293
2018-07-30 19:41:25 +00:00
Thomas Preud'homme 196149c943 Reapply "Fix crash on inline asm with 64bit matching input in 32bit GPR"
This reapplies commit r338206 reverted by r338214 since the bug that
r338206 uncovered has been fixed in r338268.

Add support for inline assembly with matching input operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR). Note that regular input is already handled by existing
code.

llvm-svn: 338269
2018-07-30 16:48:39 +00:00
Sanjay Patel 7312206f2f revert r338206 because the test does not pass
Example of bot failure:
http://lab.llvm.org:8011/builders/clang-cmake-armv8-quick/builds/5107/steps/ninja%20check%201/logs/FAIL%3A%20LLVM%3A%3Ainline-asm-operand-implicit-cast.ll

llvm-svn: 338214
2018-07-29 14:30:49 +00:00
Thomas Preud'homme 74ffd14e15 Fix crash on inline asm with 64bit matching input in 32bit GPR
Add support for inline assembly with matching input operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR). Note that regular input is already handled by existing
code.

llvm-svn: 338206
2018-07-28 21:33:39 +00:00
Matt Arsenault 81920b0a25 DAG: Add calling convention argument to calling convention funcs
This seems like a pretty glaring omission, and AMDGPU
wants to treat kernels differently from other calling
conventions.

llvm-svn: 338194
2018-07-28 13:25:19 +00:00
Craig Topper 1a40a06549 [SelectionDAGBuilder] Add masked loads to PendingLoads rather than calling DAG.setRoot.
Masked loads are calling DAG.getRoot rather than calling SelectionDAGBuilder::getRoot, which means the PendingLoads weren't emptied to update the root and create any needed TokenFactor. So it would be incorrect to call setRoot for the masked load.

This patch instead adds the masked load to PendingLoads so that the root doesn't get update until a store or scatter or something happens.. Alternatively, we could call SelectionDAGBuilder::getRoot before it, but that would create unnecessary serialization.

llvm-svn: 338085
2018-07-26 23:22:11 +00:00
Vedant Kumar b572f64212 [DebugInfo] LowerDbgDeclare: Add derefs when handling CallInst users
LowerDbgDeclare inserts a dbg.value before each use of an address
described by a dbg.declare. When inserting a dbg.value before a CallInst
use, however, it fails to append DW_OP_deref to the DIExpression.

The DW_OP_deref is needed to reflect the fact that a dbg.value describes
a source variable directly (as opposed to a dbg.declare, which relies on
pointer indirection).

This patch adds in the DW_OP_deref where needed. This results in the
correct values being shown during a debug session for a program compiled
with ASan and optimizations (see https://reviews.llvm.org/D49520). Note
that ConvertDebugDeclareToDebugValue is already correct -- no changes
there were needed.

One complication is that SelectionDAG is unable to distinguish between
direct and indirect frame-index (FRAMEIX) SDDbgValues. This patch also
fixes this long-standing issue in order to not regress integration tests
relying on the incorrect assumption that all frame-index SDDbgValues are
indirect. This is a necessary fix: the newly-added DW_OP_derefs cannot
be lowered properly otherwise. Basically the fix prevents a direct
SDDbgValue with DIExpression(DW_OP_deref) from being dereferenced twice
by a debugger. There were a handful of tests relying on this incorrect
"FRAMEIX => indirect" assumption which actually had incorrect
DW_AT_locations: these are all fixed up in this patch.

Testing:

- check-llvm, and an end-to-end test using lldb to debug an optimized
  program.
- Existing unit tests for DIExpression::appendToStack fully cover the
  new DIExpression::append utility.
- check-debuginfo (the debug info integration tests)

Differential Revision: https://reviews.llvm.org/D49454

llvm-svn: 338069
2018-07-26 20:56:53 +00:00
Sanjay Patel 215dcbf4db [SelectionDAG] try to convert funnel shift directly to rotate if legal
If the DAGCombiner's rotate matching was working as expected, 
I don't think we'd see any test diffs here. 

This sidesteps the issue of custom lowering for rotates raised in PR38243:
https://bugs.llvm.org/show_bug.cgi?id=38243
...by only dealing with legal operations.

llvm-svn: 337966
2018-07-25 21:38:30 +00:00
Thomas Preud'homme 768d6ce4a3 Fix PR34170: Crash on inline asm with 64bit output in 32bit GPR
Add support for inline assembly with output operand that do not
naturally go in the register class it is constrained to (eg. double in a
32-bit GPR as in the PR).

llvm-svn: 337903
2018-07-25 11:11:12 +00:00
Vedant Kumar 22bd6f99fa [SelectionDAG] Reduce DanglingDebugInfo memory traffic, NFC
This avoids approx. 2 x 10^5 DenseMap insertions in both non-debug and
debug -O2 builds of the sqlite3 amalgamation.

llvm-svn: 337751
2018-07-23 21:59:04 +00:00
Craig Topper b8b21d2fca [SelectionDAGBuilder] Use APInt::isZero instead of comparing APInt::getZExtValue to 0 in a place where we can't be sure contents of the APInt fit in a uint64_t.
This is used on an extract vector element index which is most cases is going to be an i32 or i64 and the element will be a valid element number. But it is possible to construct IR with a larger type and large out of range value.

llvm-svn: 337652
2018-07-22 05:16:50 +00:00
Craig Topper bf61113e4f [SelectionDAGBuilder] Restrict vector reduction check to types with a power of 2 number of elements.
The check for the shuffles usages probably isn't correct for non power of 2 vectors.

llvm-svn: 337651
2018-07-22 05:16:49 +00:00
Sanjay Patel c71adc8040 [Intrinsics] define funnel shift IR intrinsics + DAG builder support
As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2018-May/123292.html
http://lists.llvm.org/pipermail/llvm-dev/2018-July/124400.html

We want to add rotate intrinsics because the IR expansion of that pattern is 4+ instructions, 
and we can lose pieces of the pattern before it gets to the backend. Generalizing the operation 
by allowing 2 different input values (plus the 3rd shift/rotate amount) gives us a "funnel shift" 
operation which may also be a single hardware instruction.

Initially, I thought we needed to define new DAG nodes for these ops, and I spent time working 
on that (much larger patch), but then I concluded that we don't need it. At least as a first 
step, we have all of the backend support necessary to match these ops...because it was required. 
And shepherding these through the IR optimizer is the primary concern, so the IR intrinsics are 
likely all that we'll ever need.

There was also a question about converting the intrinsics to the existing ROTL/ROTR DAG nodes
(along with improving the oversized shift documentation). Again, I don't think that's strictly 
necessary (as the test results here prove). That can be an efficiency improvement as a small 
follow-up patch.

So all we're left with is documentation, definition of the IR intrinsics, and DAG builder support. 

Differential Revision: https://reviews.llvm.org/D49242

llvm-svn: 337221
2018-07-16 22:59:31 +00:00
Eli Friedman 0319c28459 [CodeGen] Emit more precise AssertZext/AssertSext nodes.
This is marginally helpful for removing redundant extensions, and the
code is easier to read, so it seems like an all-around win. In the new
test i8-phi-ext.ll, we used to emit an AssertSext i8; now we emit an
AssertZext i2, which allows the extension of the return value to be
eliminated.

Differential Revision: https://reviews.llvm.org/D49004

llvm-svn: 336868
2018-07-11 23:26:35 +00:00
Piotr Padlewski 5b3db45e8f Implement strip.invariant.group
Summary:
This patch introduce new intrinsic -
strip.invariant.group that was described in the
RFC: Devirtualization v2

Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar

Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47103

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>
llvm-svn: 336073
2018-07-02 04:49:30 +00:00
Matthias Braun da5e7e11d1 SelectionDAGBuilder, mach-o: Skip trap after noreturn call (for Mach-O)
Add NoTrapAfterNoreturn target option which skips emission of traps
behind noreturn calls even if TrapUnreachable is enabled.

Enable the feature on Mach-O to save code size; Comments suggest it is
not possible to enable it for the other users of TrapUnreachable.

rdar://41530228

DifferentialRevision: https://reviews.llvm.org/D48674
llvm-svn: 335877
2018-06-28 17:00:45 +00:00
Mikael Holmen 42f7bc96dd [DebugInfo] Make sure all DBG_VALUEs' reguse operands have IsDebug property
Summary:
In some cases, these operands lacked the IsDebug property, which is meant to signal that
they should not affect codegen. This patch adds a check for this property in the
MachineVerifier and adds it where it was missing.

This includes refactorings to use MachineInstrBuilder construction functions instead of
manually setting up the intrinsic everywhere.

Patch by: JesperAntonsson

Reviewers: aprantl, rnk, echristo, javed.absar

Reviewed By: aprantl

Subscribers: qcolombet, sdardis, nemanjai, JDevlieghere, atanasyan, llvm-commits

Differential Revision: https://reviews.llvm.org/D48319

llvm-svn: 335214
2018-06-21 10:03:34 +00:00
Craig Topper 59da74370b [SelectionDAG] Don't crash on inline assembly errors when the inline assembly return type is a struct.
Summary:
If we get an error building the SelectionDAG for inline assembly we try to continue and still build the DAG.

But if the return type for the inline assembly is a struct we end up crashing because we try to create an UNDEF node with a struct type which isn't valid.

Instead we need to create an UNDEF for each element of the struct and join them with merge_values.

This patch relies on single operand merge_values being handled gracefully by getMergeValues. If the return type is void there will be no VTs returned by ComputeValueVTs and now we just return instead of calling setValue. Hopefully that's ok, I assumed nothing would need to look up the mapped value for void node.

Fixes PR37359

Reviewers: rengolin, rovka, echristo, efriedma, bogner

Reviewed By: efriedma

Subscribers: craig.topper, llvm-commits

Differential Revision: https://reviews.llvm.org/D46560

llvm-svn: 335093
2018-06-20 04:32:05 +00:00
Heejin Ahn 5ef4d5f9c1 [WebAssembly] Support instruction selection for catching exceptions
Summary:
This lowers exception catching-related instructions:
1. Lowers `wasm.catch` intrinsic to `catch` instruction
2. Removes `catchpad` and `cleanuppad` instructions; they are not
necessary after isel phase. (`MachineBasicBlock::isEHFuncletEntry()` or
`MachineBasicBlock::isEHPad()` can be used instead.)
3. Lowers `catchret` and `cleanupret` instructions to pseudo `catchret`
and `cleanupret` instructions in isel, which will be replaced with other
instructions in `WebAssemblyExceptionPrepare` pass.
4. Adds 'WebAssemblyExceptionPrepare` pass, which is for running various
transformation for EH. Currently this pass only replaces `catchret` and
`cleanupret` instructions into appropriate wasm instructions to make
this patch successfully run until the end.

Currently this does not handle lowering of intrinsics related to LSDA
info generation (`wasm.landingpad.index` and `wasm.lsda`), because they
cannot be tested without implementing `EHStreamer`'s wasm-specific
handlers. They are marked as TODO, which is needed to make isel pass.
Also this does not generate `try` and `end_try` markers yet, which will
be handled in later patches.

This patch is based on the first wasm EH proposal.
(https://github.com/WebAssembly/exception-handling/blob/master/proposals/Exceptions.md)

Reviewers: dschuff, majnemer

Subscribers: jfb, sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D44090

llvm-svn: 333705
2018-05-31 22:25:54 +00:00
Matt Arsenault ab2b79cb97 DAG: Remove redundant version of getRegisterTypeForCallingConv
There seems to be no real reason to have these separate copies.
The existing implementations just copy each other for x86.
For Mips there is a subtle difference, which is just a bug
since it changes based on the context where which one was called.
Dropping this version, all tests pass. If I try to merge them
to match the removed version, a test fails.

llvm-svn: 333440
2018-05-29 17:42:26 +00:00
Heejin Ahn 1e4d35044f [WebAssembly] Add functions for EHScopes
Summary:
There are functions using the term 'funclet' to refer to both
1. an EH scopes, the structure of BBs that starts with
catchpad/cleanuppad and ends with catchret/cleanupret, and
2. a small function that gets outlined in AsmPrinter, which is the
original meaning of 'funclet'.

So far the two have been the same thing; EH scopes are always outlined
in AsmPrinter as funclets at the end of the compilation pipeline. But
now wasm also uses scope-based EH but does not outline those, so we now
need to correctly distinguish those two use cases in functions.

This patch splits `MachineBasicBlock::isFuncletEntry` into
`isFuncletEntry` and `isEHScopeEntry`, and
`MachineFunction::hasFunclets` into `hasFunclets` and `hasEHScopes`, in
order to distinguish the two different use cases. And this also changes
some uses of the term 'funclet' to 'scope' in `getFuncletMembership` and
change the function name to `getEHScopeMembership` because this function
is not about outlined funclets but about EH scope memberships.

This change is in the same vein as D45559.

Reviewers: majnemer, dschuff

Subscribers: sbc100, jgravelle-google, sunfish, llvm-commits

Differential Revision: https://reviews.llvm.org/D47005

llvm-svn: 333045
2018-05-23 00:32:46 +00:00
Sanjay Patel 8652c53d29 [DAG] propagate FMF for all FPMathOperators
This is a simple hack based on what's proposed in D37686, but we can extend it if needed in follow-ups. 
It gets us most of the FMF functionality that we want without adding any state bits to the flags. It 
also intentionally leaves out non-FMF flags (nsw, etc) to minimize the patch.

It should provide a superset of the functionality from D46563 - the extra tests show propagation and 
codegen diffs for fcmp, vecreduce, and FP libcalls.

The PPC log2() test shows the limits of this most basic approach - we only applied 'afn' to the last 
node created for the call. AFAIK, there aren't any libcall optimizations based on the flags currently, 
so that shouldn't make any difference.

Differential Revision: https://reviews.llvm.org/D46854

llvm-svn: 332358
2018-05-15 14:16:24 +00:00
Nicola Zaghen d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Sanjay Patel fe645d295f [DAG] add convenience function to propagate FMF; NFC
There's only one use of this currently, but that could
change with D46563. Either way, we shouldn't have to
update code outside of the flags struct when those
flag definitions change.

llvm-svn: 332155
2018-05-11 23:13:36 +00:00
Sanjay Patel c4e4c5b076 [DAG] clean up flag propagation for binops; NFCI
llvm-svn: 332150
2018-05-11 22:45:22 +00:00
Sanjay Patel 0ddf09a36c [DAG] reduce code duplication; NFCI
llvm-svn: 332133
2018-05-11 20:08:23 +00:00
Vedant Kumar e0b5f86b30 [STLExtras] Add distance() for ranges, pred_size(), and succ_size()
This commit adds a wrapper for std::distance() which works with ranges.
As it would be a common case to write `distance(predecessors(BB))`, this
also introduces `pred_size()` and `succ_size()` helpers to make that
easier to write.

Differential Revision: https://reviews.llvm.org/D46668

llvm-svn: 332057
2018-05-10 23:01:54 +00:00
Shiva Chen cd070cdc94 [DebugInfo] Convert intrinsic llvm.dbg.label to MachineInstr.
In order to convert LLVM IR to MachineInstr, we need a new TargetOpcode,
DBG_LABEL, to ‘lower’ intrinsic llvm.dbg.label. The patch
creates this new TargetOpcode and convert intrinsic llvm.dbg.label to
MachineInstr through SelectionDAG.

In SelectionDAG, debug information is stored in SDDbgInfo. We create a
new data member of SDDbgInfo for labels and use the new data member,
SDDbgLabel, to create DBG_LABEL MachineInstr.

The new DBG_LABEL MachineInstr uses label metadata from LLVM IR as its
parameter. So, the backend could get metadata information of labels from
DBG_LABEL MachineInstr.

Differential Revision: https://reviews.llvm.org/D45341

Patch by Hsiangkai Wang.

llvm-svn: 331842
2018-05-09 02:41:08 +00:00
Michael Berg 7acc81b744 Fast Math Flag mapping into SDNode
Summary: Adding support for Fast flags in the SDNode to leverage fast math sub flag usage.

Reviewers: spatel, arsenm, jbhateja, hfinkel, escha, qcolombet, echristo, wristow, javed.absar

Reviewed By: spatel

Subscribers: llvm-commits, rampitec, nhaehnle, tstellar, FarhanaAleen, nemanjai, javed.absar, jbhateja, hfinkel, wdng

Differential Revision: https://reviews.llvm.org/D45710

llvm-svn: 331547
2018-05-04 18:48:20 +00:00
Bjorn Pettersson 27a841fe83 [SelectionDAG] Refactor code by adding RegsForValue::getRegsAndSizes(). NFCI
Summary:
Added a helper method in RegsForValue to get a list with
all the <RegNumber, RegSize> pairs that we want to iterate
over in SelectionDAGBuilder::EmitFuncArgumentDbgValue and
in SelectionDAGBuilder::visitIntrinsicCall.

Reviewers: vsk

Reviewed By: vsk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46360

llvm-svn: 331510
2018-05-04 08:50:48 +00:00
Bjorn Pettersson 304877e5ec Reapply "[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)"
Summary:
This reverts SVN r331441 (reapplies r331337), together with a fix
in to handle an already existing fragment expression in the
dbg.value that must be fragmented due to a split PHI node.

This should solve the problem seen in PR37321, which was the
reason for the revert of r331337.

The situation in PR37321 is that we have a PHI node like this

   %u.sroa = phi i80 [ %u.sroa.x, %if.x ],
                     [ %u.sroa.y, %if.y ],
                     [ %u.sroa.z, %if.z ]

and a dbg.value like this

  call void @llvm.dbg.value(metadata i80 %u.sroa,
                            metadata !13,
                            metadata !DIExpression(DW_OP_LLVM_fragment, 0, 80))

The phi node is split into three 32-bit PHI nodes

  %30:gr32 = PHI %11:gr32, %bb.4, %14:gr32, %bb.5, %27:gr32, %bb.8
  %31:gr32 = PHI %12:gr32, %bb.4, %15:gr32, %bb.5, %28:gr32, %bb.8
  %32:gr32 = PHI %13:gr32, %bb.4, %16:gr32, %bb.5, %29:gr32, %bb.8

but since the original value only is 80 bits we need to adjust the size
of the last fragment expression, and with this patch we get

  DBG_VALUE debug-use %30:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 0, 32)
  DBG_VALUE debug-use %31:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 32, 32)
  DBG_VALUE debug-use %32:gr32, debug-use $noreg, !"u", !DIExpression(DW_OP_LLVM_fragment, 64, 16)

Reviewers: vsk, aprantl, mstorsjo

Reviewed By: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D46384

llvm-svn: 331464
2018-05-03 17:04:16 +00:00
Piotr Padlewski 5dde809404 Rename invariant.group.barrier to launder.invariant.group
Summary:
This is one of the initial commit of "RFC: Devirtualization v2" proposal:
https://docs.google.com/document/d/16GVtCpzK8sIHNc2qZz6RN8amICNBtvjWUod2SujZVEo/edit?usp=sharing

Reviewers: rsmith, amharc, kuhar, sanjoy

Subscribers: arsenm, nhaehnle, javed.absar, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D45111

llvm-svn: 331448
2018-05-03 11:03:01 +00:00
Martin Storsjo 67fdea490d Revert "[SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)"
This reverts SVN r331337, see PR37321 for details on the regression
it introduced.

llvm-svn: 331441
2018-05-03 07:09:33 +00:00
Bjorn Pettersson 1c5a05f32c [SelectionDAG] Selection of DBG_VALUE using a PHI node result (pt 2)
Summary:
This is a follow up to rL331182. A PHI node can be split up into
several MIR PHI nodes when being selected. When there is a
dbg.value intrinsic that uses the result of such a PHI node we
need to select several DBG_VALUE instructions, with fragment
expressions, in order to do a correct selection.

Reviewers: rnk, aprantl, vsk

Reviewed By: vsk

Subscribers: mattd, llvm-commits, JDevlieghere, aprantl, gbedwell, rnk

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D46329

llvm-svn: 331337
2018-05-02 06:56:38 +00:00
Adrian Prantl 5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Bjorn Pettersson abafca619b [SelectionDAG] Improve selection of DBG_VALUE using a PHI node result
Summary:
When building the selection DAG at ISel all PHI nodes are
selected and lowered to Machine Instruction PHI nodes before
we start to create any SDNodes. So there are no SDNodes for
values produced by the PHI nodes.

In the past when selecting a dbg.value intrinsic that uses
the value produced by a PHI node we have been handling such
dbg.value intrinsics as "dangling debug info". I.e. we have
not created a SDDbgValue node directly, because there is
no existing SDNode for the PHI result, instead we deferred
the creationg of a SDDbgValue until we found the first use
of the PHI result.

The old solution had a couple of flaws. The position of the
selected DBG_VALUE instruction would end up quite late in a
basic block, and for example not directly after the PHI node
as in the LLVM IR input. And in case there were no use at all
in the basic block the dbg.value could be dropped completely.

This patch introduces a new VREG kind of SDDbgValue nodes.
It is similar to a SDNODE kind of node, but it refers directly
to a virtual register and not a SDNode. When we do selection
for a dbg.value that is using the result of a PHI node we
can do a lookup of the virtual register directly (as it already
is determined for the PHI node) and create a SDDbgValue node
immediately instead of delaying the selection until we find a
use.

This should fix a problem with losing debug info at ISel
as seen in PR37234 (https://bugs.llvm.org/show_bug.cgi?id=37234).
It does not resolve PR37234 completely, because the debug info
is dropped later on in the BranchFolder (see D46184).

Reviewers: #debug-info, aprantl

Reviewed By: #debug-info, aprantl

Subscribers: rnk, gbedwell, aprantl, JDevlieghere, llvm-commits

Differential Revision: https://reviews.llvm.org/D46129

llvm-svn: 331182
2018-04-30 14:37:39 +00:00
Daniel Neilson 9863b48d4e [SelectionDAG] Refactor lowering of atomic memory intrinsics.
Summary:
This just refactors the lowering of the atomic memory intrinsics to more
closely match the code patterns used in the lowering of the non-atomic
memory intrinsics. Specifically, we encapsulate the lowering in
SelectionDAG::getAtomicMem*() functions rather than embedding
the code directly in the SelectionDAGBuilder code.

llvm-svn: 330603
2018-04-23 15:40:37 +00:00
Keith Wyss 3d86823f3d [XRay] Typed event logging intrinsic
Summary:
Add an LLVM intrinsic for type discriminated event logging with XRay.
Similar to the existing intrinsic for custom events, but also accepts
a type tag argument to allow plugins to be aware of different types
and semantically interpret logged events they know about without
choking on those they don't.

Relies on a symbol defined in compiler-rt patch D43668. I may wait
to submit before I can see demo everything working together including
a still to come clang patch.

Reviewers: dberris, pelikan, eizan, rSerge, timshen

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D45633

llvm-svn: 330219
2018-04-17 21:30:29 +00:00
Mandeep Singh Grang e92f0cfe34 [CodeGen] Change std::sort to llvm::sort in response to r327219
Summary:
r327219 added wrappers to std::sort which randomly shuffle the container before sorting.
This will help in uncovering non-determinism caused due to undefined sorting
order of objects having the same key.

To make use of that infrastructure we need to invoke llvm::sort instead of std::sort.

Note: This patch is one of a series of patches to replace *all* std::sort to llvm::sort.
Refer the comments section in D44363 for a list of all the required patches.

Reviewers: bogner, rnk, MatzeB, RKSimon

Reviewed By: rnk

Subscribers: JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D45133

llvm-svn: 329435
2018-04-06 18:08:42 +00:00
Fangrui Song 956ee79795 Fix a bunch of typoes. NFC
llvm-svn: 328907
2018-03-30 22:22:31 +00:00
Craig Topper 2fa1436206 [IR][CodeGen] Remove dependency on EVT from IR/Function.cpp. Move EVT to CodeGen layer.
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.

The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.

Differential Revision: https://reviews.llvm.org/D45017

llvm-svn: 328806
2018-03-29 17:21:10 +00:00
David Blaikie 36a0f226b1 Fix layering by moving ValueTypes.h from CodeGen to IR
ValueTypes.h is implemented in IR already.

llvm-svn: 328397
2018-03-23 23:58:31 +00:00
David Blaikie 13e77db2df Fix layering of MachineValueType.h by moving it from CodeGen to Support
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)

llvm-svn: 328395
2018-03-23 23:58:25 +00:00
Bjorn Pettersson 5c25f88536 [SelectionDAG] Support multiple dangling debug info for one value
Summary:
When building the selection DAG we sometimes need to postpone
the handling of a dbg.value until the value it should refer to
is created. This is done by using the DanglingDebugInfoMap.
In the past this map has been limited to hold one dangling
dbg.value per value. This patch removes that restriction.

Reviewers: aprantl, rnk, probinson, vsk

Reviewed By: aprantl

Subscribers: Ka-Ka, llvm-commits, JDevlieghere

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D44610

llvm-svn: 328084
2018-03-21 09:44:34 +00:00
Daniel Neilson 5182113f07 [SelectionDAGBuilder] Replace deprecated calls to MemoryIntrinsic::getAlignment() (NFCI)
Summary:
This change is part of step five in the series of changes to remove alignment argument from
memcpy/memmove/memset in favour of alignment attributes. In particular, this changes the
SelectionDAGBuilder to cease using the old getAlignment() API of MemoryIntrinsic in favour of getting
source & dest specific alignments through the new API.

Steps:
Step 1) Remove alignment parameter and create alignment parameter attributes for
memcpy/memmove/memset. ( rL322965, rC322964, rL322963 )
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
source and dest alignments. ( rL323597 )
Step 3) Update Clang to use the new IRBuilder API. ( rC323617 )
Step 4) Update Polly to use the new IRBuilder API. ( rL323618 )
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
and those that use use MemIntrinsicInst::[get|set]Alignment() to use [get|set]DestAlignment()
and [get|set]SourceAlignment() instead. ( rL323886, rL323891, rL324148, rL324273, rL324278,
rL324384, rL324395, rL324402, rL324626, rL324642, rL324653, rL324654, rL324773, rL324774,
rL324781, rL324784, rL324955, rL324960, rL325816, rL327398 )
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
MemIntrinsicInst::[get|set]Alignment() methods.

Reference
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

llvm-svn: 327421
2018-03-13 16:31:19 +00:00
Bjorn Pettersson a223f815dd [SelectionDAG] Improve handling of dangling debug info
Summary:
1) Make sure to discard dangling debug info if the variable (or
variable fragment) is mapped to something new before we had a
chance to resolve the dangling debug info.

2) When resolving debug info, make sure to bump the associated
SDNodeOrder to ensure that the DBG_VALUE is emitted after the
instruction that defines the value used in the DBG_VALUE.
This will avoid a debug-use before def scenario as seen in
https://bugs.llvm.org/show_bug.cgi?id=36417.

The new test case, test/DebugInfo/X86/sdag-dangling-dbgvalue.ll,
show some other limitations in how dangling debug info is
handled in the SelectionDAG. Since we currently only support
having one dangling dbg.value per Value, we will end up dropping
debug info when there are more than one variable that is described
by the same "dangling value".

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: aprantl, eraman, llvm-commits, JDevlieghere

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D44369

llvm-svn: 327303
2018-03-12 18:02:39 +00:00
Peter Collingbourne 2974856ad4 Use branch funnels for virtual calls when retpoline mitigation is enabled.
The retpoline mitigation for variant 2 of CVE-2017-5715 inhibits the
branch predictor, and as a result it can lead to a measurable loss of
performance. We can reduce the performance impact of retpolined virtual
calls by replacing them with a special construct known as a branch
funnel, which is an instruction sequence that implements virtual calls
to a set of known targets using a binary tree of direct branches. This
allows the processor to speculately execute valid implementations of the
virtual function without allowing for speculative execution of of calls
to arbitrary addresses.

This patch extends the whole-program devirtualization pass to replace
certain virtual calls with calls to branch funnels, which are
represented using a new llvm.icall.jumptable intrinsic. It also extends
the LowerTypeTests pass to recognize the new intrinsic, generate code
for the branch funnels (x86_64 only for now) and lay out virtual tables
as required for each branch funnel.

The implementation supports full LTO as well as ThinLTO, and extends the
ThinLTO summary format used for whole-program devirtualization to
support branch funnels.

For more details see RFC:
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120672.html

Differential Revision: https://reviews.llvm.org/D42453

llvm-svn: 327163
2018-03-09 19:11:44 +00:00
Than McIntosh b3d88a7466 [CodeGen] fix argument attribute in lowering statepoint/patchpoint
Summary:
Use the correct loop index varaible, ArgI, to retrieve attributes.

Reviewers: thanm, sanjoy, rnk

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D43832

llvm-svn: 326433
2018-03-01 13:31:57 +00:00
Elena Demikhovsky 945b7e5aa6 Adding a width of the GEP index to the Data Layout.
Making a width of GEP Index, which is used for address calculation, to be one of the pointer properties in the Data Layout.
p[address space]:size:memory_size:alignment:pref_alignment:index_size_in_bits.
The index size parameter is optional, if not specified, it is equal to the pointer size.

Till now, the InstCombiner normalized GEPs and extended the Index operand to the pointer width.
It works fine if you can convert pointer to integer for address calculation and all registered targets do this.
But some ISAs have very restricted instruction set for the pointer calculation. During discussions were desided to retrieve information for GEP index from the Data Layout.
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120416.html

I added an interface to the Data Layout and I changed the InstCombiner and some other passes to take the Index width into account.
This change does not affect any in-tree target. I added tests to cover data layouts with explicitly specified index size.

Differential Revision: https://reviews.llvm.org/D42123

llvm-svn: 325102
2018-02-14 06:58:08 +00:00
Dan Gohman 832092ca12 [SelectionDAG]: Ignore "returned" in the presence of an implicit sret.
When a function return value can't be directly lowered, such as
returning an i128 on WebAssembly, as indicated by the CanLowerReturn
target hook, SelectionDAGBuilder can translate it to return the
value through a hidden sret-like argument.

If such a function has an argument with the "returned" attribute,
the attribute can't be automatically lowered, because the function
no longer has a normal return value. For now, just discard the
"returned" attribute.

This fixes PR36128.

llvm-svn: 323715
2018-01-30 00:14:40 +00:00
Hiroshi Inoue c8e9245816 [NFC] fix trivial typos in comments and documents
"to to" -> "to"

llvm-svn: 323628
2018-01-29 05:17:03 +00:00
Daniel Neilson 1e68724d24 Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1)
Summary:
 This is a resurrection of work first proposed and discussed in Aug 2015:
   http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html
and initially landed (but then backed out) in Nov 2015:
   http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

 The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument
which is required to be a constant integer. It represents the alignment of the
dest (and source), and so must be the minimum of the actual alignment of the
two.

 This change is the first in a series that allows source and dest to each
have their own alignments by using the alignment attribute on their arguments.

 In this change we:
1) Remove the alignment argument.
2) Add alignment attributes to the source & dest arguments. We, temporarily,
   require that the alignments for source & dest be equal.

 For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false)
will now read
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false)

 Downstream users may have to update their lit tests that check for
@llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script
may help with updating the majority of your tests, but it does not catch all possible
patterns so some manual checking and updating will be required.

s~declare void @llvm\.mem(set|cpy|move)\.p([^(]*)\((.*), i32, i1\)~declare void @llvm.mem\1.p\2(\3, i1)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g
s~call void @llvm\.memset\.p([^(]*)i8\(i8([^*]*)\* (.*), i8 (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i8(i8\2* align \6 \3, i8 \4, i8 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i16\(i8([^*]*)\* (.*), i8 (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i16(i8\2* align \6 \3, i8 \4, i16 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i32\(i8([^*]*)\* (.*), i8 (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i32(i8\2* align \6 \3, i8 \4, i32 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i64\(i8([^*]*)\* (.*), i8 (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i64(i8\2* align \6 \3, i8 \4, i64 \5, i1 \7)~g
s~call void @llvm\.memset\.p([^(]*)i128\(i8([^*]*)\* (.*), i8 (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.memset.p\1i128(i8\2* align \6 \3, i8 \4, i128 \5, i1 \7)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* \4, i8\5* \6, i8 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* \4, i8\5* \6, i16 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* \4, i8\5* \6, i32 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* \4, i8\5* \6, i64 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 [01], i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* \4, i8\5* \6, i128 \7, i1 \8)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i8\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i8 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i16\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i16 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i32\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i32 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i64\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i64 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g
s~call void @llvm\.mem(cpy|move)\.p([^(]*)i128\(i8([^*]*)\* (.*), i8([^*]*)\* (.*), i128 (.*), i32 ([0-9]*), i1 ([^)]*)\)~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g

 The remaining changes in the series will:
Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing
   source and dest alignments.
Step 3) Update Clang to use the new IRBuilder API.
Step 4) Update Polly to use the new IRBuilder API.
Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API,
        and those that use use MemIntrinsicInst::[get|set]Alignment() to use
        getDestAlignment() and getSourceAlignment() instead.
Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the
        MemIntrinsicInst::[get|set]Alignment() methods.

Reviewers: pete, hfinkel, lhames, reames, bollu

Reviewed By: reames

Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits

Differential Revision: https://reviews.llvm.org/D41675

llvm-svn: 322965
2018-01-19 17:13:12 +00:00
Daniel Neilson 2409d24201 [NFC] Change MemIntrinsicInst::setAlignment() to take an unsigned instead of a Constant
Summary:
 In preparation for https://reviews.llvm.org/D41675 this NFC changes this
prototype of MemIntrinsicInst::setAlignment() to accept an unsigned instead
of a Constant.

llvm-svn: 322403
2018-01-12 21:33:37 +00:00
Craig Topper af4eb17223 [SelectionDAG][X86] Explicitly store the scale in the gather/scatter ISD nodes
Currently we infer the scale at isel time by analyzing whether the base is a constant 0 or not. If it is we assume scale is 1, else we take it from the element size of the pass thru or stored value. This seems a little weird and I think it makes more sense to make it explicit in the DAG rather than doing tricky things in the backend.

Most of this patch is just making sure we copy the scale around everywhere.

Differential Revision: https://reviews.llvm.org/D40055

llvm-svn: 322210
2018-01-10 19:16:05 +00:00
Jonas Paulsson 9222b91e24 [SelectionDAGBuilder] Chain prefetches less aggressively.
Prefetches used to always be chained between any previous and following
memory accesses. The problem with this was that later optimizations, such as
folding of a load into the user instruction, got disrupted.

This patch relaxes the chaining of prefetches in order to remedy this.

Reveiw: Hal Finkel
https://reviews.llvm.org/D38886

llvm-svn: 322163
2018-01-10 09:33:00 +00:00
Benjamin Kramer c7fc81e659 Use phi ranges to simplify code. No functionality change intended.
llvm-svn: 321585
2017-12-30 15:27:33 +00:00
Matthias Braun 042fed54fb Fix unused variable in non-assert builds
llvm-svn: 320885
2017-12-15 22:53:33 +00:00
Matthias Braun f1caa2833f MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

llvm-svn: 320884
2017-12-15 22:22:58 +00:00
Adrian Prantl c133d8a5be EmitFuncArgumentDbgValue: Prefer stack slots over registers for stack arguments
While investigating LLVM PR22316 (http://llvm.org/bugs/show_bug.cgi?id=22316)
I started wondering if it were not always preferable to emit the
initial DBG_VALUEs for stack arguments as FI locations instead of
describing the first register they get copied into. The advantage of
doing this is that the arguments will be available as soon as the
stack is setup. As illustrated by the testcase in the PR, the first
copy of the FI into a register may be sunk by MachineSink.cpp into a
later basic block. By describing the argument on the stack, we nicely
circumvent this problem.

<rdar://problem/19583723>

Differential Revision: https://reviews.llvm.org/D41135

llvm-svn: 320758
2017-12-14 22:55:06 +00:00
Matt Arsenault 7d7adf4f2e TLI: Allow using PSV for intrinsic mem operands
llvm-svn: 320756
2017-12-14 22:34:10 +00:00
Zachary Turner 260fe3eca6 Fix many -Wsign-compare and -Wtautological-constant-compare warnings.
Most of the -Wsign-compare warnings are due to the fact that
enums are signed by default in the MS ABI, while the
tautological comparison warnings trigger on x86 builds where
sizeof(size_t) is 4 bytes, so N > numeric_limits<unsigned>::max()
is always false.

Differential Revision: https://reviews.llvm.org/D41256

llvm-svn: 320750
2017-12-14 22:07:03 +00:00
Matt Arsenault 1117133687 DAG: Expose all MMO flags in getTgtMemIntrinsic
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.

On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.

llvm-svn: 320746
2017-12-14 21:39:51 +00:00
Dylan McKay 80463fe64d Relax unaligned access assertion when type is byte aligned
Summary:
This relaxes an assertion inside SelectionDAGBuilder which is overly
restrictive on targets which have no concept of alignment (such as AVR).

In these architectures, all types are aligned to 8-bits.

After this, LLVM will only assert that accesses are aligned on targets
which actually require alignment.

This patch follows from a discussion on llvm-dev a few months ago
http://llvm.1065342.n5.nabble.com/llvm-dev-Unaligned-atomic-load-store-td112815.html

Reviewers: bogner, nemanjai, joerg, efriedma

Reviewed By: efriedma

Subscribers: efriedma, cactus, llvm-commits

Differential Revision: https://reviews.llvm.org/D39946

llvm-svn: 320243
2017-12-09 06:45:36 +00:00
Hans Wennborg 5df9f0878b Re-commit r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
The patch originally broke Chromium (crbug.com/791714) due to its failing to
specify that the new pseudo instructions clobber EFLAGS. This commit fixes
that.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319824
2017-12-05 20:22:20 +00:00
Hans Wennborg 361d4392cf Revert r319490 "XOR the frame pointer with the stack cookie when protecting the stack"
This broke the Chromium build (crbug.com/791714). Reverting while investigating.

> Summary: This strengthens the guard and matches MSVC.
>
> Reviewers: hans, etienneb
>
> Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits
>
> Differential Revision: https://reviews.llvm.org/D40622
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319490 91177308-0d34-0410-b5e6-96231b3b80d8

llvm-svn: 319706
2017-12-04 22:21:15 +00:00
Yaxun Liu 30e4608cca CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space
SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is
not true for amdgcn---amdgiz target.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40255

llvm-svn: 319630
2017-12-03 03:31:45 +00:00
Zachary Turner 8065f0b975 Mark all library options as hidden.
These command line options are not intended for public use, and often
don't even make sense in the context of a particular tool anyway. About
90% of them are already hidden, but when people add new options they
forget to hide them, so if you were to make a brand new tool today, link
against one of LLVM's libraries, and run tool -help you would get a
bunch of junk that doesn't make sense for the tool you're writing.

This patch hides these options. The real solution is to not have
libraries defining command line options, but that's a much larger effort
and not something I'm prepared to take on.

Differential Revision: https://reviews.llvm.org/D40674

llvm-svn: 319505
2017-12-01 00:53:10 +00:00
Reid Kleckner ba4014e9dc XOR the frame pointer with the stack cookie when protecting the stack
Summary: This strengthens the guard and matches MSVC.

Reviewers: hans, etienneb

Subscribers: hiraditya, JDevlieghere, vlad.tsyrklevich, llvm-commits

Differential Revision: https://reviews.llvm.org/D40622

llvm-svn: 319490
2017-11-30 22:41:21 +00:00
Matt Arsenault b655fa9ce2 DAG: Add nuw when splitting loads and stores
The object can't straddle the address space
wrap around, so I think it's OK to assume any
offsets added to the base object pointer can't
overflow. Similar logic already appears to be
applied in SelectionDAGBuilder when lowering
aggregate returns.

llvm-svn: 319272
2017-11-29 01:25:12 +00:00
Mandeep Singh Grang 230b0a1477 [SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering
Summary:
Recommitting this with the correct sorting predicate. The Low field of Clusters is a ConstantInt and
cannot be directly compared. So we needed to invoke slt (signed less than) to compare correctly.

This fixes failures in the following tests uncovered by D39245:

LLVM :: CodeGen/ARM/ifcvt3.ll
LLVM :: CodeGen/ARM/switch-minsize.ll
LLVM :: CodeGen/X86/switch.ll
LLVM :: CodeGen/X86/switch-bt.ll
LLVM :: CodeGen/X86/switch-density.ll

Reviewers: hans, fhahn

Reviewed By: hans

Subscribers: aemerson, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D40541

llvm-svn: 319210
2017-11-28 19:55:54 +00:00
Jonas Paulsson f0ff20f1f0 Use getStoreSize() in various places instead of 'BitSize >> 3'.
This is needed for cases when the memory access is not as big as the width of
the data type. For instance, storing i1 (1 bit) would be done in a byte (8
bits).

Using 'BitSize >> 3' (or '/ 8') would e.g. give the memory access of an i1 a
size of 0, which for instance makes alias analysis return NoAlias even when
it shouldn't.

There are no tests as this was done as a follow-up to the bugfix for the case
where this was discovered (r318824). This handles more similar cases.

Review: Björn Petterson
https://reviews.llvm.org/D40339

llvm-svn: 319173
2017-11-28 14:44:32 +00:00
Mandeep Singh Grang f953e4b881 Revert "[SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering"
This broke the bots. Reverting this until I can fix the failures.

This reverts commit 5a3db2856d12a3c4b400f487d39f8f05989e79f0.

llvm-svn: 318686
2017-11-20 19:17:11 +00:00
Mandeep Singh Grang dc9de50902 [SelectionDAG] Make sorting predicate stronger to remove non-deterministic ordering
Summary:
This fixes failures in the following tests uncovered by D39245:
        LLVM :: CodeGen/ARM/ifcvt3.ll
        LLVM :: CodeGen/ARM/switch-minsize.ll
        LLVM :: CodeGen/X86/switch.ll

Reviewers: hans, efriedma

Reviewed By: hans

Subscribers: fhahn, aemerson, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D39995

llvm-svn: 318680
2017-11-20 18:46:11 +00:00
David Blaikie b3bde2ea50 Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

llvm-svn: 318490
2017-11-17 01:07:10 +00:00
Craig Topper 242374e219 [X86] Don't remove sign extend of gather/scatter indices during SelectionDAGBuilder.
The sign extend might be from an i16 or i8 type and was inserted by InstCombine to match the pointer width. X86 gather legalization isn't currently detecting this to reinsert a sign extend to make things legal.

It's a bit weird for the SelectionDAGBuilder to do this kind of optimization in the first place. With this removed we can at least lean on InstCombine somewhat to ensure the index is i32 or i64.

I'll work on trying to recover some of the test cases by removing sign extends in the backend when its safe to do so with an understanding of the current legalizer capabilities.

This should fix PR30690.

llvm-svn: 318466
2017-11-16 23:08:57 +00:00
Yaxun Liu 0844ff2aa7 Fix pointer EVT in SelectionDAGBuilder::visitAlloca
SelectionDAGBuilder::visitAlloca assumes alloca address space is 0, which is
incorrect for triple amdgcn---amdgiz and causes isel failure.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D40095

llvm-svn: 318392
2017-11-16 12:22:19 +00:00
Rong Xu e4572c6b73 [CodeGen] Fix the branch probability assertion in r318202
Due to integer precision, we might have numerator greater than denominator in
the branch probability scaling. Add a check to prevent this from happening.

llvm-svn: 318353
2017-11-16 00:14:05 +00:00
Rong Xu 3573d8da36 [CodeGen] Peel off the dominant case in switch statement in lowering
This patch peels off the top case in switch statement into a branch if the
probability exceeds a threshold. This will help the branch prediction and
avoids the extra compares when lowering into chain of branches.

Differential Revision: http://reviews.llvm.org/D39262

llvm-svn: 318202
2017-11-14 21:44:09 +00:00
Yaxun Liu 0b2f73fd84 CodeGen: Fix TargetLowering::LowerCallTo for sret value type
TargetLowering::LowerCallTo assumes that sret value type corresponds to a
pointer in default address space, which is incorrect, since sret value type
should correspond to a pointer in alloca address space, which may not
be the default address space. This causes assertion for amdgcn target
in amdgiz environment.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39996

llvm-svn: 318167
2017-11-14 18:46:52 +00:00
Craig Topper bdb8db4589 [SelectionDAG] Make getUniformBase in SelectionDAGBuilder fail if any of the middle GEP indices are non-constant.
This is a fix for a bug in r317947. We were supposed to check that all the indices are are constant 0, but instead we're only make sure that indices that are constant are 0. Non-constant indices are being ignored.

llvm-svn: 317950
2017-11-10 23:36:56 +00:00
Craig Topper ffd48e3c27 [SelectionDAG] Teach SelectionDAGBuilder's getUniformBase for gather/scatter handling to accept GEPs with more than 2 operands if the middle operands are all 0s
Currently we can only get a uniform base from a simple GEP with 2 operands. This causes us to miss address folding opportunities for simple global array accesses as the test case shows.

This patch adds support for larger GEPs if the other indices are 0 since those don't require any additional computations to be inserted.

We may also want to handle constant splats of zero here, but I'm leaving that for future work when I have a real world example.

Differential Revision: https://reviews.llvm.org/D39911

llvm-svn: 317947
2017-11-10 22:50:50 +00:00
Jatin Bhateja aaa5944ad4 [WebAssembly] Fix stack offsets of return values from call lowering.
Summary: Fixes PR35220

Reviewers: vadimcn, alexcrichton

Reviewed By: alexcrichton

Subscribers: pepyakin, alexcrichton, jfb, dschuff, sbc100, jgravelle-google, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D39866

llvm-svn: 317895
2017-11-10 16:26:04 +00:00
Dan Gohman 2c74fe977d Add an @llvm.sideeffect intrinsic
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].

Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.

As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.

[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html

Differential Revision: https://reviews.llvm.org/D38336

llvm-svn: 317729
2017-11-08 21:59:51 +00:00
Adrian Prantl 93faeecd8f Handle inlined variables in SelectionDAGBuilder::EmitFuncArgumentDbgValue().
In 2010 a commit with no testcase and no further explanation
explicitly disabled the handling of inlined variables in
EmitFuncArgumentDbgValue(). I don't think there is a good reason for
this any more and re-enabling this adds debug locations for variables
associated with an LLVM function argument in functions that are
inlined into the first basic block. The only downside of doing this is
that we may insert a DBG_VALUE before the inlined scope, but (1) this
could be filtered out later, and (2) LiveDebugValues will not
propagate it into subsequent basic blocks if they don't dominate the
variable's lexical scope, so this seems like a small price to pay.

rdar://problem/26228128

llvm-svn: 317702
2017-11-08 18:27:13 +00:00
David Blaikie 3f833edc7c Target/TargetInstrInfo.h -> CodeGen/TargetInstrInfo.h to match layering
This header includes CodeGen headers, and is not, itself, included by
any Target headers, so move it into CodeGen to match the layering of its
implementation.

llvm-svn: 317647
2017-11-08 01:01:31 +00:00
Craig Topper 41bf240726 [SelectionDAG] Fix typo in comment. NFC
llvm-svn: 317588
2017-11-07 16:32:31 +00:00
Adrian Prantl 25a09dd408 Make DIExpression::createFragmentExpression() return an Optional.
We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.

llvm-svn: 317534
2017-11-07 00:45:34 +00:00
Sanjay Patel 629c411538 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
David Blaikie 1be62f0327 Move TargetFrameLowering.h to CodeGen where it's implemented
This header already includes a CodeGen header and is implemented in
lib/CodeGen, so move the header there to match.

This fixes a link error with modular codegeneration builds - where a
header and its implementation are circularly dependent and so need to be
in the same library, not split between two like this.

llvm-svn: 317379
2017-11-03 22:32:11 +00:00
Daniel Neilson f9c7d29c77 Create instruction classes for identifying any atomicity of memory intrinsic. (NFC)
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html

This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:

IntrinsicInst
  -> MemIntrinsicBase (methods-only class)
    -> MemIntrinsic (non-atomic intrinsics)
      -> MemSetInst
      -> MemTransferInst
        -> MemCpyInst
        -> MemMoveInst
    -> AtomicMemIntrinsic (atomic intrinsics)
      -> AtomicMemSetInst
      -> AtomicMemTransferInst
        -> AtomicMemCpyInst
        -> AtomicMemMoveInst
    -> AnyMemIntrinsic (both atomicities)
      -> AnyMemSetInst
      -> AnyMemTransferInst
        -> AnyMemCpyInst
        -> AnyMemMoveInst

This involves some class renaming:
    ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
    ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
    ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.

An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).

---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"

REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"

FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )

SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"

for f in $FILES; do
    echo "Processing: $f"
    sed  -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done

Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev

Reviewed By: sanjoy

Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38419

llvm-svn: 316950
2017-10-30 19:51:48 +00:00
George Burgess IV 7887238c7c Fix buildbot breakage
SP is only used in an assert. Caused by r316374.

llvm-svn: 316377
2017-10-23 21:08:02 +00:00
George Burgess IV 8a0e4bc972 Don't crash when we see unallocatable registers in clobbers
This fixes a bug where we'd crash given code like the test-case from
https://bugs.llvm.org/show_bug.cgi?id=30792 . Instead, we let the
offending clobber silently slide through.

This doesn't fully fix said bug, since the assembler will still complain
the moment it sees a crypto/fp/vector op, and we still don't diagnose
calls that require vector regs.

Differential Revision: https://reviews.llvm.org/D39030

llvm-svn: 316374
2017-10-23 20:46:36 +00:00
Alex Bradbury 4d275f0dfe [TargetLowering] Correctly track NumFixedArgs field of CallLoweringInfo
The NumFixedArgs field of CallLoweringInfo is used by
TargetLowering::LowerCallTo to determine whether a given argument is passed
using the vararg calling convention or not (specifically, to set IsFixed for
each ISD::OutputArg).

Firstly, CallLoweringInfo::setLibCallee and CallLoweringInfo::setCallee both
incorrectly set NumFixedArgs based on the _previous_ args list. Secondly,
TargetLowering::LowerCallTo failed to increment NumFixedArgs when modifying
the argument list so a pointer is passed for the return value.

If your backend uses the IsFixed property or directly accesses NumFixedArgs, 
it is _possible_ this change could result in codegen changes (although the 
previous behaviour would have been incorrect). No such cases have been
identified during code review for any in-tree architecture.

Differential Revision: https://reviews.llvm.org/D37898

llvm-svn: 315457
2017-10-11 13:48:45 +00:00
Eugene Zelenko fa57bd0ced [CodeGen] Fix some Clang-tidy modernize-use-default-member-init and Include What You Use warnings; other minor fixes (NFC).
llvm-svn: 314363
2017-09-27 23:26:01 +00:00
Reid Kleckner 0fe506bc5e Re-land r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
The fix is to avoid invalidating our insertion point in
replaceDbgDeclare:
     Builder.insertDeclare(NewAddress, DIVar, DIExpr, Loc, InsertBefore);
+    if (DII == InsertBefore)
+      InsertBefore = &*std::next(InsertBefore->getIterator());
     DII->eraseFromParent();

I had to write a unit tests for this instead of a lit test because the
use list order matters in order to trigger the bug.

The reduced C test case for this was:
  void useit(int*);
  static inline void inlineme() {
    int x[2];
    useit(x);
  }
  void f() {
    inlineme();
    inlineme();
  }

llvm-svn: 313905
2017-09-21 19:52:03 +00:00
Bjorn Pettersson 0dde08c3cb [SelectionDAG] Pick correct frame index in LowerArguments
Summary:
SelectionDAGISel::LowerArguments is associating arguments
with frame indices (FuncInfo->setArgumentFrameIndex). That
information is later on used by EmitFuncArgumentDbgValue to
create DBG_VALUE instructions that denotes that a variable
can be found on the stack.

I discovered that for our (big endian) out-of-tree target
the association created by SelectionDAGISel::LowerArguments
sometimes is wrong. I've seen this happen when a 64-bit value
is passed on the stack. The argument will occupy two stack
slots (frame index X, and frame index X+1). The fault is
that a call to setArgumentFrameIndex is associating the
64-bit argument with frame index X+1. The effect is that the
debug information (DBG_VALUE) will point at the least significant
part of the arguement on the stack. When printing the
argument in a debugger I will get the wrong value.

I managed to create a test case for PowerPC that seems to
show the same kind of problem.

The bugfix will look at the datalayout, taking endianness into
account when examining a BUILD_PAIR node, assuming that the
least significant part is in the first operand of the BUILD_PAIR.
For big endian targets we should use the frame index from
the second operand, as the most significant part will be stored
at the lower address (using the highest frame index).

Reviewers: bogner, rnk, hfinkel, sdardis, aprantl

Reviewed By: aprantl

Subscribers: nemanjai, aprantl, llvm-commits, igorb

Differential Revision: https://reviews.llvm.org/D37740

llvm-svn: 313901
2017-09-21 18:52:08 +00:00
Daniel Jasper 7d2f38d600 Revert r313825: "[IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare"
.. as well as the two subsequent changes r313826 and r313875.

This leads to segfaults in combination with ASAN. Will forward repro
instructions to the original author (rnk).

llvm-svn: 313876
2017-09-21 12:07:33 +00:00
Reid Kleckner 3f547e87b2 [IR] Add llvm.dbg.addr, a control-dependent version of llvm.dbg.declare
Summary:
This implements the design discussed on llvm-dev for better tracking of
variables that live in memory through optimizations:
  http://lists.llvm.org/pipermail/llvm-dev/2017-September/117222.html

This is tracked as PR34136

llvm.dbg.addr is intended to be produced and used in almost precisely
the same way as llvm.dbg.declare is today, with the exception that it is
control-dependent. That means that dbg.addr should always have a
position in the instruction stream, and it will allow passes that
optimize memory operations on local variables to insert llvm.dbg.value
calls to reflect deleted stores. See SourceLevelDebugging.rst for more
details.

The main drawback to generating DBG_VALUE machine instrs is that they
usually cause LLVM to emit a location list for DW_AT_location. The next
step will be to teach DwarfDebug.cpp how to recognize more DBG_VALUE
ranges as not needing a location list, and possibly start setting
DW_AT_start_offset for variables whose lifetimes begin mid-scope.

Reviewers: aprantl, dblaikie, probinson

Subscribers: eraman, hiraditya, llvm-commits

Differential Revision: https://reviews.llvm.org/D37768

llvm-svn: 313825
2017-09-20 21:52:33 +00:00
Adrian Prantl 99ba97726a Fix a crash when emitting debug info for multi-reg function arguments
by reusing more of the existing machinery

This is a follow-up to r312169.
Thanks to Björn Pettersson for the testcase!

llvm-svn: 312773
2017-09-08 02:31:37 +00:00
Reid Kleckner e33c94f1b0 Add llvm.codeview.annotation to implement MSVC __annotation
Summary:
This intrinsic represents a label with a list of associated metadata
strings. It is modelled as reading and writing inaccessible memory so
that it won't be removed as dead code. I think the intention is that the
annotation strings should appear at most once in the debug info, so I
marked it noduplicate. We are allowed to inline code with annotations as
long as we strip the annotation, but that can be done later.

Reviewers: majnemer

Subscribers: eraman, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D36904

llvm-svn: 312569
2017-09-05 20:14:58 +00:00
Adrian Prantl 608df027fd SelectionDAG: Emit correct debug info for multi-register function arguments.
Previously we would just describe the first register and then call it
quits. This patch emits fragment expressions for each register.

<rdar://problem/34075307>

llvm-svn: 312169
2017-08-30 20:51:20 +00:00
Wei Ding a131d3fb29 Add ‘llvm.experimental.constrained.fma‘ Intrinsic.
Differential Revision: http://reviews.llvm.org/D36335

llvm-svn: 311629
2017-08-24 04:18:24 +00:00
Benjamin Kramer 49a49fe816 Move helper classes into anonymous namespaces.
No functionality change intended.

llvm-svn: 311288
2017-08-20 13:03:48 +00:00
Adrian Prantl 6a57daad81 Improve line debug info when translating a CaseBlock to SDNodes.
The SelectionDAGBuilder translates various conditional branches into
CaseBlocks which are then translated into SDNodes. If a conditional
branch results in multiple CaseBlocks only the first CaseBlock is
translated into SDNodes immediately, the rest of the CaseBlocks are
put in a queue and processed when all LLVM IR instructions in the
basic block have been processed.

When a CaseBlock is transformed into SDNodes the SelectionDAGBuilder
is queried for the current LLVM IR instruction and the resulting
SDNodes are annotated with the debug info of the current
instruction (if it exists and has debug metadata).

When the deferred CaseBlocks are processed, the SelectionDAGBuilder
does not have a current LLVM IR instruction, and the resulting SDNodes
will not have any debuginfo. As DwarfDebug::beginInstruction() outputs
a .loc directive for the first instruction in a labeled
block (typically the case for something coming from a CaseBlock) this
tends to produce a line-0 directive.

This patch changes the handling of CaseBlocks to store the current
instruction's debug info into the CaseBlock when it is created (and the
SelectionDAGBuilder knows the current instruction) and to always use
the stored debug info when translating a CaseBlock to SDNodes.

Patch by Frej Drejhammar!

Differential Revision: https://reviews.llvm.org/D36671

llvm-svn: 311097
2017-08-17 16:57:13 +00:00
Andrew Kaylor 53a5fbb45f Add strictfp attribute to prevent unwanted optimizations of libm calls
Differential Revision: https://reviews.llvm.org/D34163

llvm-svn: 310885
2017-08-14 21:15:13 +00:00
David Blaikie 76fb649b0d Reduce variable scope by moving declaration into if clause
llvm-svn: 310506
2017-08-09 18:34:18 +00:00
Reid Kleckner 29c675247d [DebugInfo] Don't turn dbg.declare into DBG_VALUE for static allocas
Summary:
We already have information about static alloca stack locations in our
side table. Emitting instructions for them is inefficient, and it only
happens when the address of the alloca has been materialized within the
current block, which isn't often.

Reviewers: aprantl, probinson, dblaikie

Subscribers: jfb, dschuff, sbc100, jgravelle-google, hiraditya, llvm-commits, aheejin

Differential Revision: https://reviews.llvm.org/D36117

llvm-svn: 309729
2017-08-01 19:45:09 +00:00
Simon Dardis 8930b4a049 [SelectionDAG][mips] Fix PR33883
PR33883 shows that calls to intrinsic functions should not have their vector
arguments or returns subject to ABI changes required by the target.

This resolves PR33883.

Thanks to Alex Crichton for reporting the issue!

Reviewers: zoran.jovanovic, atanasyan

Differential Revision: https://reviews.llvm.org/D35765

llvm-svn: 309561
2017-07-31 14:06:58 +00:00
Adrian Prantl 8b9bb534a1 Remove the unused offset from DBG_VALUE (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309450
2017-07-28 23:00:45 +00:00
Adrian Prantl a617576bb1 Remove the unused dbg.value offset from SelectionDAG (NFC)
Followup to r309426.
rdar://problem/33580047

llvm-svn: 309436
2017-07-28 21:27:35 +00:00
Adrian Prantl abe04759a6 Remove the obsolete offset parameter from @llvm.dbg.value
There is no situation where this rarely-used argument cannot be
substituted with a DIExpression and removing it allows us to simplify
the DWARF backend. Note that this patch does not yet remove any of
the newly dead code.

rdar://problem/33580047
Differential Revision: https://reviews.llvm.org/D35951

llvm-svn: 309426
2017-07-28 20:21:02 +00:00
Peter Collingbourne 081ffe2ff2 Change CallLoweringInfo::CS to be an ImmutableCallSite instead of a pointer. NFCI.
This was a use-after-free waiting to happen.

llvm-svn: 309159
2017-07-26 19:15:29 +00:00
Geoff Berry bea2e188e9 [TargetLowering] Add hook for adding target MMO flags when doing ISel.
Summary: Add TargetLowering hook getMMOFlags() to add target specific
MMO flags to load/store instructions created by ISel.

Reviewers: bogner, hfinkel, qcolombet, MatzeB

Subscribers: mcrosier, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D34962

llvm-svn: 307879
2017-07-13 03:49:42 +00:00
Daniel Neilson 965613ef1b Add element atomic memset intrinsic
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memset intrinsic. This intrinsic is essentially memset with the implementation requirement that all stores used for the assignment are done with unordered-atomic stores of a given element size.

Reviewers: eli.friedman, reames, mkazantsev, skatkov

Reviewed By: reames

Subscribers: jfb, dschuff, sbc100, jgravelle-google, aheejin, efriedma, llvm-commits

Differential Revision: https://reviews.llvm.org/D34885

llvm-svn: 307854
2017-07-12 21:57:23 +00:00
Daniel Neilson 57226ef33c Add element atomic memmove intrinsic
Summary: Continuing the work from https://reviews.llvm.org/D33240, this change introduces an element unordered-atomic memmove intrinsic. This intrinsic is essentially memmove with the implementation requirement that all loads/stores used for the copy are done with unordered-atomic loads/stores of a given element size.

Reviewers: eli.friedman, reames, mkazantsev, skatkov

Reviewed By: reames

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D34884

llvm-svn: 307796
2017-07-12 15:25:26 +00:00
Konstantin Zhuravlyov bb80d3e1d3 Enhance synchscope representation
OpenCL 2.0 introduces the notion of memory scopes in atomic operations to
  global and local memory. These scopes restrict how synchronization is
  achieved, which can result in improved performance.

  This change extends existing notion of synchronization scopes in LLVM to
  support arbitrary scopes expressed as target-specific strings, in addition to
  the already defined scopes (single thread, system).

  The LLVM IR and MIR syntax for expressing synchronization scopes has changed
  to use *syncscope("<scope>")*, where <scope> can be "singlethread" (this
  replaces *singlethread* keyword), or a target-specific name. As before, if
  the scope is not specified, it defaults to CrossThread/System scope.

  Implementation details:
    - Mapping from synchronization scope name/string to synchronization scope id
      is stored in LLVM context;
    - CrossThread/System and SingleThread scopes are pre-defined to efficiently
      check for known scopes without comparing strings;
    - Synchronization scope names are stored in SYNC_SCOPE_NAMES_BLOCK in
      the bitcode.

Differential Revision: https://reviews.llvm.org/D21723

llvm-svn: 307722
2017-07-11 22:23:00 +00:00
Simon Pilgrim 55a4b6700f Handle ConstantExpr correctly in SelectionDAGBuilder
This change fixes a bug in SelectionDAGBuilder::visitInsertValue and SelectionDAGBuilder::visitExtractValue where constant expressions (InsertValueConstantExpr and ExtractValueConstantExpr) would be treated as non-constant instructions (InsertValueInst and ExtractValueInst). This bug resulted in an incorrect memory access, which manifested as an assertion failure in SDValue::SDValue.

Fixes PR#33094.

Submitted on behalf of @Praetonus (Benoit Vey)

Differential Revision: https://reviews.llvm.org/D34538

llvm-svn: 307502
2017-07-09 16:01:04 +00:00
Vadim Chugunov e6f76558c7 Fix libcall expansion creating DAG nodes with invalid type post type legalization.
If we are lowering a libcall after legalization, we'll split the return type into a pair of legal values.

Patch by Jatin Bhateja and Eli Friedman.

Differential Revision: https://reviews.llvm.org/D34240

llvm-svn: 307207
2017-07-05 22:01:49 +00:00
Craig Topper 92a8fe34e5 [SelectionDAGBuilder] Use EVT::getVectorVT instead of MVT::getVectorVT to prevent a crash if the type isn't a simple VT.
llvm-svn: 306950
2017-07-01 06:46:09 +00:00
Daniel Neilson 3faabbbe85 [Atomics] Rename and change prototype for atomic memcpy intrinsic
Summary:

Background: http://lists.llvm.org/pipermail/llvm-dev/2017-May/112779.html

This change is to alter the prototype for the atomic memcpy intrinsic. The prototype itself is being changed to more closely resemble the semantics and parameters of the llvm.memcpy intrinsic -- to ease later combination of the llvm.memcpy and atomic memcpy intrinsics. Furthermore, the name of the atomic memcpy intrinsic is being changed to make it clear that it is not a generic atomic memcpy, but specifically a memcpy is unordered atomic.

Reviewers: reames, sanjoy, efriedma

Reviewed By: reames

Subscribers: mzolotukhin, anna, llvm-commits, skatkov

Differential Revision: https://reviews.llvm.org/D33240

llvm-svn: 305558
2017-06-16 14:43:59 +00:00
Benjamin Kramer 00a970a84b Fold variable into assert.
Silences an unused variable warning in Release builds.

llvm-svn: 305488
2017-06-15 17:58:24 +00:00
Arnold Schwaighofer ae9312c487 ISel: Fix FastISel of swifterror values
The code assumed that we process instructions in basic block order.  FastISel
processes instructions in reverse basic block order. We need to pre-assign
virtual registers before selecting otherwise we get def-use relationships wrong.

This only affects code with swifterror registers.

rdar://32659327

llvm-svn: 305484
2017-06-15 17:34:42 +00:00
Simon Dardis 212cccb2f4 Reland "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

The previous version of this patch had a "conditional move or jump depends on
uninitialized value".

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 305083
2017-06-09 14:37:08 +00:00
Sanjay Patel 2726ea2ac0 [DAG] remove duplicated code for isOnlyUsedInZeroEqualityComparison(); NFCI
llvm-svn: 304822
2017-06-06 19:40:09 +00:00
Andrew Kaylor f466001eef Add constrained intrinsics for some libm-equivalent operations
Differential revision: https://reviews.llvm.org/D32319

llvm-svn: 303922
2017-05-25 21:31:00 +00:00
Adrian Prantl f062192632 Fix SelectionDAGBuilder::getDbgValue to not expect DW_OP_deref on FI vars
This fixes an oversight in r300522, which changed alloca
dbg.values to no longer emit a DW_OP_deref.

The array.ll testcase was regenerated from source.

Fixes PR33166:
https://bugs.llvm.org/show_bug.cgi?id=33166

llvm-svn: 303897
2017-05-25 18:54:10 +00:00
Tim Northover 8c605c0eda Revert LLVM changes for "Sema: allow imaginary constants via GNU extension if UDL overloads not present."
The changes accidentally crept into a Clang commit I was making.

llvm-svn: 303697
2017-05-23 21:53:11 +00:00
Tim Northover 6b5eceac2e Sema: allow imaginary constants via GNU extension if UDL overloads not present.
C++14 added user-defined literal support for complex numbers so that you can
write something like "complex<double> val = 2i". However, there is an existing
GNU extension supporting this syntax and interpreting the result as a _Complex
type.

This changes parsing so that such literals are interpreted in terms of C++14's
operators if an overload is present but otherwise falls back to the original
GNU extension.

llvm-svn: 303694
2017-05-23 21:41:49 +00:00
Peter Collingbourne 6f0ecca3b5 IR: Give function GlobalValue::getRealLinkageName() a less misleading name: dropLLVMManglingEscape().
This function gives the wrong answer on some non-ELF platforms in some
cases. The function that does the right thing lives in Mangler.h. To try to
discourage people from using this function, give it a different name.

Differential Revision: https://reviews.llvm.org/D33162

llvm-svn: 303134
2017-05-16 00:39:01 +00:00
Craig Topper 8df66c602a [KnownBits] Add bit counting methods to KnownBits struct and use them where possible
This patch adds min/max population count, leading/trailing zero/one bit counting methods.

The min methods return answers based on bits that are known without considering unknown bits. The max methods give answers taking into account the largest count that unknown bits could give.

Differential Revision: https://reviews.llvm.org/D32931

llvm-svn: 302925
2017-05-12 17:20:30 +00:00
Ahmed Bougacha 604526fe87 [CodeGen] Don't require AA in SDAGISel at -O0.
Before r247167, the pass manager builder controlled which AA
implementations were used, exporting them all in the AliasAnalysis
analysis group.

Now, AAResultsWrapperPass always uses BasicAA, but still uses other AA
implementations if made available in the pass pipeline.

But regardless, SDAGISel is required at O0, and really doesn't need to
be doing fancy optimizations based on useful AA results.

Don't require AA at CodeGenOpt::None, and only use it otherwise.

This does have a functional impact (and one testcase is pessimized
because we can't reuse a load).  But I think that's desirable no matter
what.

Note that this alone doesn't result in less DT computations: TwoAddress
was previously able to reuse the DT we computed for SDAG.  That will be
fixed separately.

Differential Revision: https://reviews.llvm.org/D32766

llvm-svn: 302611
2017-05-10 00:39:30 +00:00
Reid Kleckner 3a363fff7e Re-land "Use the frame index side table for byval and inalloca arguments"
This re-lands r302483. It was not the cause of PR32977.

llvm-svn: 302544
2017-05-09 16:02:20 +00:00
Reid Kleckner 84075fddff Re-land "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This re-lands commit r302461. It was not the cause of PR32977.

llvm-svn: 302543
2017-05-09 16:01:47 +00:00
Serge Pavlov d526b13e61 Add extra operand to CALLSEQ_START to keep frame part set up previously
Using arguments with attribute inalloca creates problems for verification
of machine representation. This attribute instructs the backend that the
argument is prepared in stack prior to  CALLSEQ_START..CALLSEQ_END
sequence (see http://llvm.org/docs/InAlloca.htm for details). Frame size
stored in CALLSEQ_START in this case does not count the size of this
argument. However CALLSEQ_END still keeps total frame size, as caller can
be responsible for cleanup of entire frame. So CALLSEQ_START and
CALLSEQ_END keep different frame size and the difference is treated by
MachineVerifier as stack error. Currently there is no way to distinguish
this case from actual errors.

This patch adds additional argument to CALLSEQ_START and its
target-specific counterparts to keep size of stack that is set up prior to
the call frame sequence. This argument allows MachineVerifier to calculate
actual frame size associated with frame setup instruction and correctly
process the case of inalloca arguments.

The changes made by the patch are:
- Frame setup instructions get the second mandatory argument. It
  affects all targets that use frame pseudo instructions and touched many
  files although the changes are uniform.
- Access to frame properties are implemented using special instructions
  rather than calls getOperand(N).getImm(). For X86 and ARM such
  replacement was made previously.
- Changes that reflect appearance of additional argument of frame setup
  instruction. These involve proper instruction initialization and
  methods that access instruction arguments.
- MachineVerifier retrieves frame size using method, which reports sum of
  frame parts initialized inside frame instruction pair and outside it.

The patch implements approach proposed by Quentin Colombet in
https://bugs.llvm.org/show_bug.cgi?id=27481#c1.
It fixes 9 tests failed with machine verifier enabled and listed
in PR27481.

Differential Revision: https://reviews.llvm.org/D32394

llvm-svn: 302527
2017-05-09 13:35:13 +00:00
Amara Emerson cf9daa33a7 Introduce experimental generic intrinsics for horizontal vector reductions.
- This change allows targets to opt-in to using them instead of the log2
  shufflevector algorithm.
- The SLP and Loop vectorizers have the common code to do shuffle reductions
  factored out into LoopUtils, and now have a unified interface for generating
  reductions regardless of the preference of the target. LoopUtils now uses TTI
  to determine what kind of reductions the target wants to handle.
- For CodeGen, basic legalization support is added.

Differential Revision: https://reviews.llvm.org/D30086

llvm-svn: 302514
2017-05-09 10:43:25 +00:00
Reid Kleckner 41bb94233b Revert "Don't add DBG_VALUE instructions for static allocas in dbg.declare"
This reverts commit r302461.

It appears to be causing failures compiling gtest with debug info on the
Linux sanitizer bot. I was unable to reproduce the failure locally,
however.

llvm-svn: 302504
2017-05-09 01:57:44 +00:00
Reid Kleckner 9f29914d40 Revert "Use the frame index side table for byval and inalloca arguments"
This reverts r302483 and it's follow up fix.

llvm-svn: 302493
2017-05-09 01:14:39 +00:00
Reid Kleckner 45efcf0c96 Use the frame index side table for byval and inalloca arguments
Summary:
For inalloca functions, this is a very common code pattern:

  %argpack = type <{ i32, i32, i32 }>
  define void @f(%argpack* inalloca %args) {
  entry:
    %a = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 0
    %b = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 1
    %c = getelementptr inbounds %argpack, %argpack* %args, i32 0, i32 2
    tail call void @llvm.dbg.declare(metadata i32* %a, ... "a")
    tail call void @llvm.dbg.declare(metadata i32* %c, ... "b")
    tail call void @llvm.dbg.declare(metadata i32* %b, ... "c")

Even though these GEPs can be simplified to a constant offset from EBP
or RSP, we don't do that at -O0, and each GEP is computed into a
register. Registers used to compute argument addresses are typically
spilled and clobbered very quickly after the initial computation, so
live debug variable tracking loses information very quickly if we use
DBG_VALUE instructions.

This change moves processing of dbg.declare between argument lowering
and basic block isel, so that we can ask if an argument has a frame
index or not. If the argument lives in a register as is the case for
byval arguments on some targets, then we don't put it in the side table
and during ISel we emit DBG_VALUE instructions.

Reviewers: aprantl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32980

llvm-svn: 302483
2017-05-08 23:20:27 +00:00
Reid Kleckner bf828eedb4 Don't add DBG_VALUE instructions for static allocas in dbg.declare
Summary:
An llvm.dbg.declare of a static alloca is always added to the
MachineFunction dbg variable map, so these values are entirely
redundant. They survive all the way through codegen to be ignored by
DWARF emission.

Effectively revert r113967

Two bugpoint-reduced test cases from 2012 broke as a result of this
change. Despite my best efforts, I haven't been able to rewrite the test
case using dbg.value. I'm not too concerned about the lost coverage
because these were reduced from the test-suite, which we still run.

Reviewers: aprantl, dblaikie

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D32920

llvm-svn: 302461
2017-05-08 19:58:15 +00:00
Dean Michael Berris 9bcaed867a [XRay] Custom event logging intrinsic
This patch introduces an LLVM intrinsic and a target opcode for custom event
logging in XRay. Initially, its use case will be to allow users of XRay to log
some type of string ("poor man's printf"). The target opcode compiles to a noop
sled large enough to enable calling through to a runtime-determined relative
function call. At runtime, when X-Ray is enabled, the sled is replaced by
compiler-rt with a trampoline to the logic for creating the custom log entries.

Future patches will implement the compiler-rt parts and clang-side support for
emitting the IR corresponding to this intrinsic.

Reviewers: timshen, dberris

Subscribers: igorb, pelikan, rSerge, timshen, echristo, dberris, llvm-commits

Differential Revision: https://reviews.llvm.org/D27503

llvm-svn: 302405
2017-05-08 05:45:21 +00:00
Reid Kleckner ac1a97b32f Simplify dbg.value handling in SDISel with early returns
No functional change other than improving dbgs logging accuracy on
constant dbg values. Previously we would add things like "i32 42" as
debug values, and then log that we were dropping the debug info, which
is silly.

Delete some dead code that was checking for static allocas. This
remained after r207165, but served no purpose. Currently, static alloca
dbg.values are always sent through the DanglingDebugInfoMap, and are
usually made valid the first time the alloca is used.

llvm-svn: 302267
2017-05-05 18:30:34 +00:00
Simon Pilgrim 89ad89cc73 [SelectionDAG] Improve support for promotion of <1 x fX> floating point argument types (PR31088)
PR31088 demonstrated that we were assuming that only integers require promotion from <1 x iX> types, when in fact float types may require it as well - in this case half floats.

This patch adds support for extension/truncation for both integer and float types.

Differential Revision: https://reviews.llvm.org/D32391

llvm-svn: 301910
2017-05-02 10:33:08 +00:00
Amara Emerson d28f0cd448 Generalize the specialized flag-carrying SDNodes by moving flags into SDNode.
This removes BinaryWithFlagsSDNode, and flags are now all passed by value.

Differential Revision: https://reviews.llvm.org/D32527

llvm-svn: 301803
2017-05-01 15:17:51 +00:00
Reid Kleckner 6652a52e2b Use Argument::hasAttribute and AttributeList::ReturnIndex more
This eliminates many extra 'Idx' induction variables in loops over
arguments in CodeGen/ and Target/. It also reduces the number of places
where we assume that ReturnIndex is 0 and that we should add one to
argument numbers to get the corresponding attribute list index.

NFC

llvm-svn: 301666
2017-04-28 18:37:16 +00:00
Jun Bum Lim 919f9e8d65 [InlineCost] Improve the cost heuristic for Switch
Summary:
The motivation example is like below which has 13 cases but only 2 distinct targets

```
lor.lhs.false2:                                   ; preds = %if.then
  switch i32 %Status, label %if.then27 [
    i32 -7012, label %if.end35
    i32 -10008, label %if.end35
    i32 -10016, label %if.end35
    i32 15000, label %if.end35
    i32 14013, label %if.end35
    i32 10114, label %if.end35
    i32 10107, label %if.end35
    i32 10105, label %if.end35
    i32 10013, label %if.end35
    i32 10011, label %if.end35
    i32 7008, label %if.end35
    i32 7007, label %if.end35
    i32 5002, label %if.end35
  ]
```
which is compiled into a balanced binary tree like this on AArch64 (similar on X86)

```
.LBB853_9:                              // %lor.lhs.false2
        mov     w8, #10012
        cmp             w19, w8
        b.gt    .LBB853_14
// BB#10:                               // %lor.lhs.false2
        mov     w8, #5001
        cmp             w19, w8
        b.gt    .LBB853_18
// BB#11:                               // %lor.lhs.false2
        mov     w8, #-10016
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#12:                               // %lor.lhs.false2
        mov     w8, #-10008
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#13:                               // %lor.lhs.false2
        mov     w8, #-7012
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_14:                             // %lor.lhs.false2
        mov     w8, #14012
        cmp             w19, w8
        b.gt    .LBB853_21
// BB#15:                               // %lor.lhs.false2
        mov     w8, #-10105
        add             w8, w19, w8
        cmp             w8, #9          // =9
        b.hi    .LBB853_17
// BB#16:                               // %lor.lhs.false2
        orr     w9, wzr, #0x1
        lsl     w8, w9, w8
        mov     w9, #517
        and             w8, w8, w9
        cbnz    w8, .LBB853_23
.LBB853_17:                             // %lor.lhs.false2
        mov     w8, #10013
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_18:                             // %lor.lhs.false2
        mov     w8, #-7007
        add             w8, w19, w8
        cmp             w8, #2          // =2
        b.lo    .LBB853_23
// BB#19:                               // %lor.lhs.false2
        mov     w8, #5002
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#20:                               // %lor.lhs.false2
        mov     w8, #10011
        cmp             w19, w8
        b.eq    .LBB853_23
        b       .LBB853_3
.LBB853_21:                             // %lor.lhs.false2
        mov     w8, #14013
        cmp             w19, w8
        b.eq    .LBB853_23
// BB#22:                               // %lor.lhs.false2
        mov     w8, #15000
        cmp             w19, w8
        b.ne    .LBB853_3
```
However, the inline cost model estimates the cost to be linear with the number
of distinct targets and the cost of the above switch is just 2 InstrCosts.
The function containing this switch is then inlined about 900 times.

This change use the general way of switch lowering for the inline heuristic. It
etimate the number of case clusters with the suitability check for a jump table
or bit test. Considering the binary search tree built for the clusters, this
change modifies the model to be linear with the size of the balanced binary
tree. The model is off by default for now :
  -inline-generic-switch-cost=false

This change was originally proposed by Haicheng in D29870.

Reviewers: hans, bmakam, chandlerc, eraman, haicheng, mcrosier

Reviewed By: hans

Subscribers: joerg, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D31085

llvm-svn: 301649
2017-04-28 16:04:03 +00:00
Craig Topper d0af7e8ab8 [SelectionDAG] Use KnownBits struct in DAG's computeKnownBits and simplifyDemandedBits
This patch replaces the separate APInts for KnownZero/KnownOne with a single KnownBits struct. This is similar to what was done to ValueTracking's version recently.

This is largely a mechanical transformation from KnownZero to Known.Zero.

Differential Revision: https://reviews.llvm.org/D32569

llvm-svn: 301620
2017-04-28 05:31:46 +00:00
Simon Pilgrim 37ef04ad1f [SelectionDAG] Use getBuildVector helper where possible. NFCI
llvm-svn: 301314
2017-04-25 15:10:47 +00:00
Simon Pilgrim 986d73cc1d [SelectionDAG] Pull out repeated getValueType calls. NFCI.
Noticed in D32391.

llvm-svn: 301308
2017-04-25 13:39:07 +00:00
Krzysztof Parzyszek c8e8e2a046 Move value type list from TargetRegisterClass to TargetRegisterInfo
Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301234
2017-04-24 19:51:12 +00:00
Krzysztof Parzyszek 98ab4c64c4 Revert r301231: Accidentally committed stale files
I forgot to commit local changes before commit.

llvm-svn: 301232
2017-04-24 19:48:51 +00:00
Krzysztof Parzyszek c0197066d7 Move value type list from TargetRegisterClass to TargetRegisterInfo
Differential Revision: https://reviews.llvm.org/D31937

llvm-svn: 301231
2017-04-24 19:43:45 +00:00
Yaxun Liu fd23a0c095 CodeGen: Add a hook for getFenceOperandTy
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.

This patch has no effect on targets other than amdgcn.

Differential Revision: https://reviews.llvm.org/D32186

llvm-svn: 301215
2017-04-24 18:26:27 +00:00
Yaxun Liu 5d977f8ed4 CodeGen: Let frame index value type match alloca addr space
Recently alloca address space has been added to data layout. Due to this
change, pointer returned by alloca may have different size as pointer in
address space 0.

However, currently the value type of frame index is assumed to be of the
same size as pointer in address space 0.

This patch fixes that.

Most targets assume alloca returning pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.

AMDGCN target with amdgiz environment requires this change since it
assumes alloca returning pointer to addr space 5 and its size is 32,
which is different from the size of pointer in addr space 0 which is 64.

Differential Revision: https://reviews.llvm.org/D32021

llvm-svn: 300864
2017-04-20 18:15:34 +00:00
Adrian Prantl 6825fb64e9 PR32382: Fix emitting complex DWARF expressions.
The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
  - describe a variable in a register
  - consist of only a DW_OP_reg
2. Memory location descriptions
  - describe the address of a variable
3. Implicit location descriptions
  - describe the value of a variable
  - end with DW_OP_stack_value & friends

The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident.  This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.

This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.

There are two major changes in this patch:

I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.

When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
  DBG_VALUE, [RBP-8]                        --> DW_OP_fbreg -8
  DBG_VALUE, RAX                            --> DW_OP_reg0 +0
  DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0

All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.

https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>

Differential Revision: https://reviews.llvm.org/D31439

llvm-svn: 300522
2017-04-18 01:21:53 +00:00
Reid Kleckner fb502d2f5e [IR] Make paramHasAttr to use arg indices instead of attr indices
This avoids the confusing 'CS.paramHasAttr(ArgNo + 1, Foo)' pattern.

Previously we were testing return value attributes with index 0, so I
introduced hasReturnAttr() for that use case.

llvm-svn: 300367
2017-04-14 20:19:02 +00:00
Simon Dardis f7e4388e3b Revert "[SelectionDAG] Enable target specific vector scalarization of calls and returns"
This reverts commit r299766. This change appears to have broken the MIPS
buildbots. Reverting while I investigate.

Revert "[mips] Remove usage of debug only variable (NFC)"

This reverts commit r299769. Follow up commit.

llvm-svn: 299788
2017-04-07 17:25:05 +00:00
Simon Dardis 6470ff0b24 [SelectionDAG] Enable target specific vector scalarization of calls and returns
By target hookifying getRegisterType, getNumRegisters, getVectorBreakdown,
backends can request that LLVM to scalarize vector types for calls
and returns.

The MIPS vector ABI requires that vector arguments and returns are passed in
integer registers. With SelectionDAG's new hooks, the MIPS backend can now
handle LLVM-IR with vector types in calls and returns. E.g.
'call @foo(<4 x i32> %4)'.

Previously these cases would be scalarized for the MIPS O32/N32/N64 ABI for
calls and returns if vector types were not legal. If vector types were legal,
a single 128bit vector argument would be assigned to a single 32 bit / 64 bit
integer register.

By teaching the MIPS backend to inspect the original types, it can now
implement the MIPS vector ABI which requires a particular method of
scalarizing vectors.

Previously, the MIPS backend relied on clang to scalarize types such as "call
@foo(<4 x float> %a) into "call @foo(i32 inreg %1, i32 inreg %2, i32 inreg %3,
i32 inreg %4)".

This patch enables the MIPS backend to take either form for vector types.

Reviewers: zoran.jovanovic, jaydeep, vkalintiris, slthakur

Differential Revision: https://reviews.llvm.org/D27845

llvm-svn: 299766
2017-04-07 13:03:52 +00:00
Adam Nemet 92a5cf4366 [SDAG] Remove -enable-fmf-dag
This is no longer needed as spotted by Sanjay in
https://reviews.llvm.org/D31165.

llvm-svn: 298963
2017-03-28 23:46:14 +00:00
Adam Nemet 6820f391eb [SDAG] Add AllowContract to SNodeFlags
Properly propagate the FMF from the LLVM IR to this flag.

This is toward moving fp-contraction=fast from an LLVM TargetOption to a
FastMathFlag in order to fix PR25721.

Differential Revision: https://reviews.llvm.org/D31165

llvm-svn: 298961
2017-03-28 23:46:08 +00:00
Sanjay Patel f01a1dad7f [x86] use VPMOVMSK to replace memcmp libcalls for 32-byte equality
Follow-up to:
https://reviews.llvm.org/rL298775

llvm-svn: 298933
2017-03-28 17:23:49 +00:00
Sanjay Patel 9ebb68843e [x86] use PMOVMSK to replace memcmp libcalls for 16-byte equality
This is the payoff for D31156 - if a target has efficient comparison instructions for vector-sized equality, 
we can replace memcmp calls with inline code that is both smaller and faster.

Differential Revision: https://reviews.llvm.org/D31290

llvm-svn: 298775
2017-03-25 16:05:33 +00:00
Reid Kleckner b518054b87 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

llvm-svn: 298393
2017-03-21 16:57:19 +00:00
Nirav Dave ac6081cb67 Make library calls sensitive to regparm module flag (Fixes PR3997).
Reviewers: mkuper, rnk

Subscribers: mehdi_amini, jyknight, aemerson, llvm-commits, rengolin

Differential Revision: https://reviews.llvm.org/D27050

llvm-svn: 298179
2017-03-18 00:44:07 +00:00
Nirav Dave 6de2c77944 Capitalize ArgListEntry fields. NFC.
llvm-svn: 298178
2017-03-18 00:43:57 +00:00
Reid Kleckner 45707d4d5a Remove getArgumentList() in favor of arg_begin(), args(), etc
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.

In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).

Reviewed By: chandlerc

Differential Revision: https://reviews.llvm.org/D31052

llvm-svn: 298010
2017-03-16 22:59:15 +00:00
Sanjay Patel 5273afd4bb [DAG] fix typo in comment; NFC
llvm-svn: 297011
2017-03-06 15:07:43 +00:00
Sanjay Patel 7884dcb788 [DAG] early exit to improve readability and formatting of visitMemCmpCall(); NFCI
llvm-svn: 296824
2017-03-02 21:56:43 +00:00
Sanjay Patel 209b0f9aad [DAG] improve documentation comments; NFC
llvm-svn: 296808
2017-03-02 20:48:08 +00:00
Sanjay Patel f7aba7ba22 fix typo in comment; NFC
llvm-svn: 296760
2017-03-02 16:37:24 +00:00
Reid Kleckner f7c0980c10 Elide argument copies during instruction selection
Summary:
Avoids tons of prologue boilerplate when arguments are passed in memory
and left in memory. This can happen in a debug build or in a release
build when an argument alloca is escaped.  This will dramatically affect
the code size of x86 debug builds, because X86 fast isel doesn't handle
arguments passed in memory at all. It only handles the x86_64 case of up
to 6 basic register parameters.

This is implemented by analyzing the entry block before ISel to identify
copy elision candidates. A copy elision candidate is an argument that is
used to fully initialize an alloca before any other possibly escaping
uses of that alloca. If an argument is a copy elision candidate, we set
a flag on the InputArg. If the the target generates loads from a fixed
stack object that matches the size and alignment requirements of the
alloca, the SelectionDAG builder will delete the stack object created
for the alloca and replace it with the fixed stack object. The load is
left behind to satisfy any remaining uses of the argument value. The
store is now dead and is therefore elided. The fixed stack object is
also marked as mutable, as it may now be modified by the user, and it
would be invalid to rematerialize the initial load from it.

Supersedes D28388

Fixes PR26328

Reviewers: chandlerc, MatzeB, qcolombet, inglorion, hans

Subscribers: igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D29668

llvm-svn: 296683
2017-03-01 21:42:00 +00:00
Craig Topper 96ec7a23e3 [SelectionDAGBuilder] Simplify creation of shufflevector DAG nodes where inputs are larger than the mask
Summary:
The current code loops over all elements to calculate a used range. Then a second short loop looks at the ranges and determines if they can be used in a extract and creates a properly aligned start index for the extract.

This range finding is unnecessary, we can just calculate a properly aligned start index for an extract for each input during the first loop. If we don't find the same start index for each indice we can't use an extract.

Reviewers: zvi, RKSimon

Reviewed By: zvi

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D29926

llvm-svn: 295152
2017-02-15 05:57:16 +00:00
Arnold Schwaighofer 8f3df731dc swiftcc: Don't emit tail calls from callers with swifterror parameters
Backends don't support this yet. They would have to move to the swifterror
register before the tail call to make sure it is live-in to the call.

rdar://30495920

llvm-svn: 294982
2017-02-13 19:58:28 +00:00
Geoff Berry 7e320c2485 [SelectionDAG] Fix bugs in inverted condition splitting code.
Summary:
Fix two bugs in SelectionDAGBuilder::FindMergedConditions reported by
Mikael Holmen.  Handle non-canonicalized xor not operation
correctly (was assuming operand 0 was always the non-constant operand)
and check that the negated condition is also in the same block as the
original and/or instruction (as is done for and/or operands already)
before proceeding with optimization.

Reviewers: bogner, MatzeB, qcolombet

Subscribers: mcrosier, uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D29680

llvm-svn: 294605
2017-02-09 18:28:17 +00:00
Reid Kleckner 0887d44a61 [SDAGISel] Simplify some SDAGISel code, NFC
Hoist entry block code for arguments and swift error values out of the
basic block instruction selection loop. Lowering arguments once up front
seems much more readable than doing it conditionally inside the loop. It
also makes it clear that argument lowering can update StaticAllocaMap
because no instructions have been selected yet.

Also use range-based for loops where possible.

llvm-svn: 294329
2017-02-07 18:42:53 +00:00
Ahmed Bougacha 9677cc6fb7 [TLI] Robustize SDAG LibFunc proto checking by merging it into TLI.
This re-applies commit r292189, reverted in r292191.

SelectionDAGBuilder recognizes libfuncs using some homegrown
parameter type-checking.

Use TLI instead, removing another heap of redundant code.

This isn't strictly NFC, as the SDAG code was too lax.
Concretely, this means changes are required to a few tests:
- calling a non-variadic function via a variadic prototype isn't OK;
  it just happens to work on x86_64 (but not on, e.g., aarch64).
- mempcpy has a size_t parameter;  the SDAG code accepts any integer
  type, which meant using i32 on x86_64 worked.
- a handful of SystemZ tests check the SDAG support for lax prototype
  checking: Ulrich agrees on removing them.

I don't think it's worth supporting any of these (IMO) invalid
testcases.  Instead, fix them to be more meaningful.

llvm-svn: 294028
2017-02-03 19:11:19 +00:00
Andrew Kaylor a0a1164ce4 Add intrinsics for constrained floating point operations
This commit introduces a set of experimental intrinsics intended to prevent
optimizations that make assumptions about the rounding mode and floating point
exception behavior.  These intrinsics will later be extended to specify
flush-to-zero behavior.  More work is also required to model instruction
dependencies in machine code and to generate these instructions from clang
(when required by pragmas and/or command line options that are not currently
supported).

Differential Revision: https://reviews.llvm.org/D27028

llvm-svn: 293226
2017-01-26 23:27:59 +00:00
Geoff Berry 92a286ae5a [SelectionDAG] Handle inverted conditions when splitting into multiple branches.
Summary:
When conditional branches with complex conditions are split into
multiple branches in SelectionDAGBuilder::FindMergedConditions, also
handle inverted conditions.  These may sometimes appear without having
been optimized by InstCombine when CodeGenPrepare decides to sink and
duplicate cmp instructions, causing them to have only one use.  This
problem can be increased by e.g. GVNHoist hiding more cmps from
InstCombine by combining equivalent cmps from different blocks.

For example codegen X & !(Y | Z) as:
    jmp_if_X TmpBB
    jmp FBB
  TmpBB:
    jmp_if_notY Tmp2BB
    jmp FBB
  Tmp2BB:
    jmp_if_notZ TBB
    jmp FBB

Reviewers: bogner, MatzeB, qcolombet

Subscribers: llvm-commits, hiraditya, mcrosier, sebpop

Differential Revision: https://reviews.llvm.org/D28380

llvm-svn: 292944
2017-01-24 16:36:07 +00:00
David L. Jones d21529fa0d [Analysis] Add LibFunc_ prefix to enums in TargetLibraryInfo. (NFC)
Summary:
The LibFunc::Func enum holds enumerators named for libc functions.
Unfortunately, there are real situations, including libc implementations, where
function names are actually macros (musl uses "#define fopen64 fopen", for
example; any other transitively visible macro would have similar effects).

Strictly speaking, a conforming C++ Standard Library should provide any such
macros as functions instead (via <cstdio>). However, there are some "library"
functions which are not part of the standard, and thus not subject to this
rule (fopen64, for example). So, in order to be both portable and consistent,
the enum should not use the bare function names.

The old enum naming used a namespace LibFunc and an enum Func, with bare
enumerators. This patch changes LibFunc to be an enum with enumerators prefixed
with "LibFFunc_". (Unfortunately, a scoped enum is not sufficient to override
macros.)

There are additional changes required in clang.

Reviewers: rsmith

Subscribers: mehdi_amini, mzolotukhin, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D28476

llvm-svn: 292848
2017-01-23 23:16:46 +00:00
Mikael Holmen 2074e7497b [DAG] Don't increase SDNodeOrder for dbg.value/declare.
Summary:
The SDNodeOrder is saved in the IROrder field in the SDNode, and this
field may affects scheduling. Thus, letting dbg.value/declare increase
the order numbers may in turn affect scheduling.

Because of this change we also need to update the code deciding when
dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues.

Dbg values now have the same order as the SDNode they are connected to,
not the following orders.

Test cases provided by Florian Hahn.

Reviewers: bogner, aprantl, sunfish, atrick

Reviewed By: atrick

Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D25318

llvm-svn: 292485
2017-01-19 13:55:55 +00:00
Ahmed Bougacha 9e5a085cf1 Revert "[TLI] Robustize SDAG proto checking by merging it into TLI."
This reverts commit r292189, as it causes issues on SystemZ bots.

llvm-svn: 292191
2017-01-17 03:31:00 +00:00
Ahmed Bougacha c018efd680 [TLI] Robustize SDAG proto checking by merging it into TLI.
SelectionDAGBuilder recognizes libfuncs using some homegrown
parameter type-checking.

Use TLI instead, removing another heap of redundant code.

This isn't strictly NFC, as the SDAG code was too lax.
Concretely, this means changes are required to two tests:
- calling a non-variadic function via a variadic prototype isn't OK;
  it just happens to work on x86_64 (but not on, e.g., aarch64).
- mempcpy has a size_t parameter;  the SDAG code accepts any integer
  type, which meant using i32 on x86_64 worked.

I don't think it's worth supporting either of these (IMO) broken
testcases.  Instead, fix them to be more correct.

llvm-svn: 292189
2017-01-17 03:10:06 +00:00
Benjamin Kramer 061f4a5fe6 Apply clang-tidy's performance-unnecessary-value-param to LLVM.
With some minor manual fixes for using function_ref instead of
std::function. No functional change intended.

llvm-svn: 291904
2017-01-13 14:39:03 +00:00
Diana Picus 116bbab4e4 [CodeGen] Rename MachineInstrBuilder::addOperand. NFC
Rename from addOperand to just add, to match the other method that has been
added to MachineInstrBuilder for adding more than just 1 operand.

See https://reviews.llvm.org/D28057 for the whole discussion.

Differential Revision: https://reviews.llvm.org/D28556

llvm-svn: 291891
2017-01-13 09:58:52 +00:00
Matt Arsenault def496c04b Remove unused CONVERT_RNDSAT intrinsics
llvm-svn: 291607
2017-01-10 22:38:02 +00:00
David Majnemer 9e04befb09 [SelectionDAG] Rework lowerRangeToAssertZExt
Utilize ConstantRange to make it easier to interpret range metadata.

llvm-svn: 291211
2017-01-06 02:43:28 +00:00
David Majnemer eaba06cffa [SelectionDAG] Correctly transform range metadata to AssertZExt
We used the logBase2 of the high instead of the ceilLogBase2 resulting
in the wrong result for certain values.  For example, it resulted in an
i1 AssertZExt when the exclusive portion of the range was 3.

llvm-svn: 291196
2017-01-06 00:11:46 +00:00
Igor Laevsky 4f31e52f94 Introduce element-wise atomic memcpy intrinsic
This change adds a new intrinsic which is intended to provide memcpy functionality
with additional atomicity guarantees. Please refer to the review thread
or language reference for further details.

Differential Revision: https://reviews.llvm.org/D27133

llvm-svn: 290708
2016-12-29 14:31:07 +00:00
Oren Ben Simhon 3b95157090 [X86] Vectorcall Calling Convention - Adding CodeGen Complete Support
The vectorcall calling convention specifies that arguments to functions are to be passed in registers, when possible.
vectorcall uses more registers for arguments than fastcall or the default x64 calling convention use. 
The vectorcall calling convention is only supported in native code on x86 and x64 processors that include Streaming SIMD Extensions 2 (SSE2) and above.

The current implementation does not handle Homogeneous Vector Aggregates (HVAs) correctly and this review attempts to fix it.
This aubmit also includes additional lit tests to cover better HVAs corner cases.

Differential Revision: https://reviews.llvm.org/D27392

llvm-svn: 290240
2016-12-21 08:31:45 +00:00
Stephan Bergmann 17c7f70362 Replace APFloatBase static fltSemantics data members with getter functions
At least the plugin used by the LibreOffice build
(<https://wiki.documentfoundation.org/Development/Clang_plugins>) indirectly
uses those members (through inline functions in LLVM/Clang include files in turn
using them), but they are not exported by utils/extract_symbols.py on Windows,
and accessing data across DLL/EXE boundaries on Windows is generally
problematic.

Differential Revision: https://reviews.llvm.org/D26671

llvm-svn: 289647
2016-12-14 11:57:17 +00:00
Simon Pilgrim b9eb99f570 Use SelectionDAG.getSplatBuildVector helper. NFCI.
llvm-svn: 289220
2016-12-09 16:01:50 +00:00
Peter Collingbourne ab85225be4 IR: Change the gep_type_iterator API to avoid always exposing the "current" type.
Instead, expose whether the current type is an array or a struct, if an array
what the upper bound is, and if a struct the struct type itself. This is
in preparation for a later change which will make PointerType derive from
Type rather than SequentialType.

Differential Revision: https://reviews.llvm.org/D26594

llvm-svn: 288458
2016-12-02 02:24:42 +00:00
Matthias Braun d0ee66c2e9 Move most EH from MachineModuleInfo to MachineFunction
Recommitting r288293 with some extra fixes for GlobalISel code.

Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

llvm-svn: 288405
2016-12-01 19:32:15 +00:00
Eric Christopher e70b7c3dfb Temporarily Revert "Move most EH from MachineModuleInfo to MachineFunction"
This apprears to have broken the global isel bot:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-globalisel_build/5174/console

This reverts commit r288293.

llvm-svn: 288322
2016-12-01 07:50:12 +00:00
Matthias Braun ed14cb0604 Move most EH from MachineModuleInfo to MachineFunction
Most of the exception handling members in MachineModuleInfo is actually
per function data (talks about the "current function") so it is better
to keep it at the function instead of the module.

This is a necessary step to have machine module passes work properly.

Also:
- Rename TidyLandingPads() to tidyLandingPads()
- Use doxygen member groups instead of "//===- EH ---"... so it is clear
  where a group ends.
- I had to add an ugly const_cast at two places in the AsmPrinter
  because the available MachineFunction pointers are const, but the code
  wants to call tidyLandingPads() in between
  (markFunctionEnd()/endFunction()).

Differential Revision: https://reviews.llvm.org/D27227

llvm-svn: 288293
2016-11-30 23:49:01 +00:00
Matt Arsenault b30d2aca58 DAG: Ignore call site attributes when emitting target intrinsic
A target intrinsic may be defined as possibly reading memory,
but the call site may have additional knowledge that it doesn't read
memory. The intrinsic lowering will expect the pessimistic
assumption of the intrinsic definition, so the chain should
still be used.

llvm-svn: 287593
2016-11-21 22:56:42 +00:00
Ahmed Bougacha bd6ce9a247 [CodeGen] Pass references, not pointers, to MMI helpers. NFC.
While there, rename them to follow the coding style.

llvm-svn: 287169
2016-11-16 22:25:03 +00:00
Richard Smith 857efb0880 Add -O0 support for @llvm.invariant.group.barrier by discarding it if it gets to ISel.
Differential Revision: https://reviews.llvm.org/D26292

llvm-svn: 286119
2016-11-07 16:47:20 +00:00
Elena Demikhovsky caaceef4b3 Expandload and Compressstore intrinsics
2 new intrinsics covering AVX-512 compress/expand functionality.
This implementation includes syntax, DAG builder, operation lowering and tests.
Does not include: handling of illegal data types, codegen prepare pass and the cost model.

llvm-svn: 285876
2016-11-03 03:23:55 +00:00
Evandro Menezes 601f4cb9f7 Switch lowering: improve partitioning of jump tables
When there's a tie between partitionings of jump tables, consider also cases
that result in no jump tables, but in one or a few cases.  The motivation is
that many contemporary processors typically perform case switches fairly
quickly.

Differential revision: https://reviews.llvm.org/D25212

llvm-svn: 285099
2016-10-25 19:11:43 +00:00
Konstantin Zhuravlyov 8ea0246e93 [MachineMemOperand] Move synchronization scope and atomic orderings from SDNode to MachineMemOperand, and remove redundant getAtomic* member functions from SelectionDAG.
Differential Revision: https://reviews.llvm.org/D24577

llvm-svn: 284312
2016-10-15 22:01:18 +00:00
Albert Gutowski 795d7d6381 Create llvm.addressofreturnaddress intrinsic
Summary: We need a new LLVM intrinsic to implement MS _AddressOfReturnAddress builtin on 64-bit Windows.

Reviewers: majnemer, rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25293

llvm-svn: 284061
2016-10-12 22:13:19 +00:00
Elena Demikhovsky 5b10aa1f1e DAG: Setting Masked-Expand-Load as a variant of Masked-Load node
Masked-expand-load node represents load operation that loads a variable amount of elements from memory according to amount of "true" bits in the mask and expands the loaded elements according to their position in the mask vector.
Right now, the node is used in intrinsics for VEXPAND* instructions. 
The work is done towards implementation of masked.expandload and masked.compressstore intrinsics.

Differential Revision: https://reviews.llvm.org/D25322

llvm-svn: 283694
2016-10-09 10:48:52 +00:00
Arnold Schwaighofer 3f25658143 swifterror: Don't compute swifterror vregs during instruction selection
The code used llvm basic block predecessors to decided where to insert phi
nodes. Instruction selection can and will liberally insert new machine basic
block predecessors. There is not a guaranteed one-to-one mapping from pred.
llvm basic blocks and machine basic blocks.

Therefore the current approach does not work as it assumes we can mark
predecessor machine basic block as needing a copy, and needs to know the set of
all predecessor machine basic blocks to decide when to insert phis.

Instead of computing the swifterror vregs as we select instructions, propagate
them at the end of instruction selection when the MBB CFG is complete.

When an instruction needs a swifterror vreg and we don't know the value yet,
generate a new vreg and remember this "upward exposed" use, and reconcile this
at the end of instruction selection.

This will only happen if the target supports promoting swifterror parameters to
registers and the swifterror attribute is used.

rdar://28300923

llvm-svn: 283617
2016-10-07 22:06:55 +00:00
Evandro Menezes e45de8a5ec Add support to optionally limit the size of jump tables.
Many high-performance processors have a dedicated branch predictor for
indirect branches, commonly used with jump tables.  As sophisticated as such
branch predictors are, they tend to have well defined limits beyond which
their effectiveness is hampered or even nullified.  One such limit is the
number of possible destinations for a given indirect branches that such
branch predictors can handle.

This patch considers a limit that a target may set to the number of
destination addresses in a jump table.

Patch by: Evandro Menezes <e.menezes@samsung.com>, Aditya Kumar
<aditya.k7@samsung.com>, Sebastian Pop <s.pop@samsung.com>.

Differential revision: https://reviews.llvm.org/D21940

llvm-svn: 282412
2016-09-26 15:32:33 +00:00
Arnold Schwaighofer de2490d0dc Disable tail calls if there is an swifterror argument
ISel does not handle them correctly yet i.e we crash trying to emit tail call
code.

radar://28407842

llvm-svn: 282088
2016-09-21 16:53:36 +00:00
Sanjay Patel b1f0a0f4a8 getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI
llvm-svn: 281493
2016-09-14 16:05:51 +00:00
Justin Lebar adbf09e8cf [CodeGen] Split out the notions of MI invariance and MI dereferenceability.
Summary:
An IR load can be invariant, dereferenceable, neither, or both.  But
currently, MI's notion of invariance is IR-invariant &&
IR-dereferenceable.

This patch splits up the notions of invariance and dereferenceability at
the MI level.  It's NFC, so adds some probably-unnecessary
"is-dereferenceable" checks, which we can remove later if desired.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, nemanjai, llvm-commits

Differential Revision: https://reviews.llvm.org/D23371

llvm-svn: 281151
2016-09-11 01:38:58 +00:00
James Molloy c6a6144966 [SDAGBuilder] Don't create a binary tree for switches in minsize mode
This bloats codesize - all of the non-leaf nodes are extra code.

llvm-svn: 280932
2016-09-08 13:12:22 +00:00
Aditya Kumar 356f79d535 [SelectionDAGBuilder] Add const to relevant places
Reviewers: hans, evandro, sebpop

Differential Revision: https://reviews.llvm.org/D24112

llvm-svn: 280430
2016-09-01 23:35:26 +00:00
Michael Kuperstein 5f17d08f49 [SelectionDAG] Generate vector_shuffle nodes for undersized result vector sizes
Prior to this, we could generate a vector_shuffle from an IR shuffle when the
size of the result was exactly the sum of the sizes of the input vectors.
If the output vector was narrower - e.g. a <12 x i8> being formed by a shuffle
with two <8 x i8> inputs - we would lower the shuffle to a sequence of extracts
and inserts.

Instead, we can form a larger vector_shuffle, and then extract a subvector
of the right size - e.g. shuffle the two <8 x i8> inputs into a <16 x i8>
and then extract a <12 x i8>.

This also includes a target-specific X86 combine that in the presence of
AVX2 combines:
(vector_shuffle <mask> (concat_vectors t1, undef)
                       (concat_vectors t2, undef))
into:
(vector_shuffle <mask> (concat_vectors t1, t2), undef)
in cases where this allows us to form VPERMD/VPERMQ.

(This is not a separate commit, as that pattern does not appear without
the DAGBuilder change.)

llvm-svn: 280418
2016-09-01 21:32:09 +00:00
Hal Finkel 5081ac27c7 Add ISD::EH_DWARF_CFA, simplify @llvm.eh.dwarf.cfa on Mips, fix on PowerPC
LLVM has an @llvm.eh.dwarf.cfa intrinsic, used to lower the GCC-compatible
__builtin_dwarf_cfa() builtin. As pointed out in PR26761, this is currently
broken on PowerPC (and likely on ARM as well). Currently, @llvm.eh.dwarf.cfa is
lowered using:

  ADD(FRAMEADDR, FRAME_TO_ARGS_OFFSET)

where FRAME_TO_ARGS_OFFSET defaults to the constant zero. On x86,
FRAME_TO_ARGS_OFFSET is lowered to 2*SlotSize. This setup, however, does not
work for PowerPC. Because of the way that the stack layout works, the canonical
frame address is not exactly (FRAMEADDR + FRAME_TO_ARGS_OFFSET) on PowerPC
(there is a lower save-area offset as well), so it is not just a matter of
implementing FRAME_TO_ARGS_OFFSET for PowerPC (unless we redefine its
semantics -- We can do that, since it is currently used only for
@llvm.eh.dwarf.cfa lowering, but the better to directly lower the CFA construct
itself (since it can be easily represented as a fixed-offset FrameIndex)). Mips
currently does this, but by using a custom lowering for ADD that specifically
recognizes the (FRAMEADDR, FRAME_TO_ARGS_OFFSET) pattern.

This change introduces a ISD::EH_DWARF_CFA node, which by default expands using
the existing logic, but can be directly lowered by the target. Mips is updated
to use this method (which simplifies its implementation, and I suspect makes it
more robust), and updates PowerPC to do the same.

Fixes PR26761.

Differential Revision: https://reviews.llvm.org/D24038

llvm-svn: 280350
2016-09-01 10:28:47 +00:00
Michael Kuperstein 260daed147 Reuse an SDLoc throughout a function. NFC.
llvm-svn: 279767
2016-08-25 18:50:56 +00:00
Justin Bogner cd1d5aaf2e Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

llvm-svn: 278970
2016-08-17 20:30:52 +00:00
Ayman Musa 71b43c5c1d Fix bug in DAGBuilder for getelementptr with expanded vector.
Replacing the usage of MVT with EVT in case the vector type is expanded.
Differential Revision: https://reviews.llvm.org/D23306

llvm-svn: 278913
2016-08-17 07:52:15 +00:00
Ayman Musa c96f421ad4 First commit (test commit) - Adding empty line.
llvm-svn: 278910
2016-08-17 07:37:34 +00:00
Wolfgang Pieb dfad9b20c9 Local variables whose address is taken and passed on to a call are described
in debug info using their stack slots instead of as an indirection of param reg + 0
offset. This is done by detecting FrameIndexSDNodes in SelectionDAG and generating
FrameIndexDbgValues for them. This ultimately generates DBG_VALUEs with stack
location operands.

Differential Revision: http://reviews.llvm.org/D23283

llvm-svn: 278703
2016-08-15 18:18:26 +00:00
David Majnemer 0a16c22846 Use range algorithms instead of unpacking begin/end
No functionality change is intended.

llvm-svn: 278417
2016-08-11 21:15:00 +00:00
Diana Picus 4dd6c249ac [SelectionDAG] Refactor visitInlineAsm a bit. NFCI.
This shaves off ~100 lines from visitInlineAsm.

llvm-svn: 277987
2016-08-08 08:54:39 +00:00
Andrew Kaylor b99d1cc7ed Recommitting r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Third try, now following fixes to MSan to handle mempcy in such a way that this commit won't break the MSan buildbots. (Thanks, Evegenii!)

llvm-svn: 277189
2016-07-29 18:23:18 +00:00
Matthias Braun 941a705b7b MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

llvm-svn: 277017
2016-07-28 18:40:00 +00:00
Andrew Kaylor f990fa5f7b Reverting r276771 due to MSan failures.
llvm-svn: 276824
2016-07-27 01:19:24 +00:00
Andrew Kaylor 3104a6bad0 Re-committing r275284: add support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 276771
2016-07-26 17:23:13 +00:00
Justin Lebar 9c375817ac [SelectionDAG] Get rid of bool parameters in SelectionDAG::getLoad, getStore, and friends.
Summary:
Instead, we take a single flags arg (a bitset).

Also add a default 0 alignment, and change the order of arguments so the
alignment comes before the flags.

This greatly simplifies many callsites, and fixes a bug in
AMDGPUISelLowering, wherein the order of the args to getLoad was
inverted.  It also greatly simplifies the process of adding another flag
to getLoad.

Reviewers: chandlerc, tstellarAMD

Subscribers: jholewinski, arsenm, jyknight, dsanders, nemanjai, llvm-commits

Differential Revision: http://reviews.llvm.org/D22249

llvm-svn: 275592
2016-07-15 18:27:10 +00:00
Justin Lebar 0af80cd6f0 [CodeGen] Take a MachineMemOperand::Flags in MachineFunction::getMachineMemOperand.
Summary:
Previously we took an unsigned.

Hooray for type-safety.

Reviewers: chandlerc

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D22282

llvm-svn: 275591
2016-07-15 18:26:59 +00:00
Michael Kuperstein 4d36e77048 Fix copy/paste bug in r275340.
llvm-svn: 275343
2016-07-13 23:28:00 +00:00
Michael Kuperstein be837fa40f [DAG] Correctly chain masked loads
If a masked loads is not added to the chain, it should not reset the chain's
root.

This fixes the remaining part of PR28515.

llvm-svn: 275340
2016-07-13 23:23:40 +00:00
Andrew Kaylor 346dd7f1bd Reverting r275284 due to platform-specific test failures
llvm-svn: 275304
2016-07-13 19:09:16 +00:00
Andrew Kaylor 12cccdd731 Fix for Bug 26903, adds support to inline __builtin_mempcpy
Patch by Sunita Marathe

Differential Revision: http://reviews.llvm.org/D21920

llvm-svn: 275284
2016-07-13 17:25:11 +00:00
Benjamin Kramer 4d09892e9a Give helper classes/functions internal linkage. NFC.
llvm-svn: 275014
2016-07-10 11:28:51 +00:00
Duncan P. N. Exon Smith 1b824c9e43 SelectionDAG: Avoid implicit iterator conversions in SelectionDAGBuilder, NFC
llvm-svn: 274907
2016-07-08 19:23:12 +00:00
Craig Topper d83f818a3e [CodeGen] Make the code that detects a if a shuffle is really a concatenation of the inputs more general purpose.
We can now handle concatenation of each source multiple times. The previous code just checked for each source to appear once in either order.

This also now handles an entire source vector sized piece having undef indices correctly. We now concat with UNDEF instead of using one of the sources. This is responsible for the test case change.

llvm-svn: 274483
2016-07-04 06:19:35 +00:00
Craig Topper 2bd8b4b180 [CodeGen,Target] Remove the version of DAG.getVectorShuffle that takes a pointer to a mask array. Convert all callers to use the ArrayRef version. No functional change intended.
For the most part this simplifies all callers. There were two places in X86 that needed an explicit makeArrayRef to shorten a statically sized array.

llvm-svn: 274337
2016-07-01 06:54:47 +00:00
Rafael Espindola db6bd02185 Delete unused includes. NFC.
llvm-svn: 274225
2016-06-30 12:19:16 +00:00
Wei Ding 0526e7f8d9 AMDGPU: Add convergent flag to INLINEASM instruction.
Differential Revision: http://reviews.llvm.org/D21214

llvm-svn: 273455
2016-06-22 18:51:08 +00:00
Krzysztof Parzyszek e116d500a7 [SDAG] Remove FixedArgs parameter from CallLoweringInfo::setCallee
The setCallee function will set the number of fixed arguments based
on the size of the argument list. The FixedArgs parameter was often
explicitly set to 0, leading to a lack of consistent value for non-
vararg functions.

Differential Revision: http://reviews.llvm.org/D20376

llvm-svn: 273403
2016-06-22 12:54:25 +00:00
Marcin Koscielnicki fd4b6b9e51 [SelectionDAG] Don't treat library calls specially if marked with nobuiltin.
To be used by D19781.

Differential Revision: http://reviews.llvm.org/D19801

llvm-svn: 273039
2016-06-17 20:24:07 +00:00
Diana Picus bae1d89e45 [SelectionDAG] Remove exit-on-error flag from test (PR27765)
The exit-on-error flag in the ARM test is necessary in order to avoid an
unreachable in the DAGTypeLegalizer, when trying to expand a physical register.
We can also avoid this situation by introducing a bitcast early on, where the
invalid scalar-to-vector conversion is detected.

We also add a test for PowerPC, which goes through a similar code path in the
SelectionDAGBuilder.

Fixes PR27765.

Differential Revision: http://reviews.llvm.org/D21061

llvm-svn: 272644
2016-06-14 07:30:20 +00:00
Benjamin Kramer bdc4956bac Pass DebugLoc and SDLoc by const ref.
This used to be free, copying and moving DebugLocs became expensive
after the metadata rewrite. Passing by reference eliminates a ton of
track/untrack operations. No functionality change intended.

llvm-svn: 272512
2016-06-12 15:39:02 +00:00
Etienne Bergeron 22bfa83208 [stack-protection] Add support for MSVC buffer security check
Summary:
This patch is adding support for the MSVC buffer security check implementation

The buffer security check is turned on with the '/GS' compiler switch.
  * https://msdn.microsoft.com/en-us/library/8dbf701c.aspx
  * To be added to clang here: http://reviews.llvm.org/D20347

Some overview of buffer security check feature and implementation:
  * https://msdn.microsoft.com/en-us/library/aa290051(VS.71).aspx
  * http://www.ksyash.com/2011/01/buffer-overflow-protection-3/
  * http://blog.osom.info/2012/02/understanding-vs-c-compilers-buffer.html


For the following example:
```
int example(int offset, int index) {
  char buffer[10];
  memset(buffer, 0xCC, index);
  return buffer[index];
}
```

The MSVC compiler is adding these instructions to perform stack integrity check:
```
        push        ebp  
        mov         ebp,esp  
        sub         esp,50h  
  [1]   mov         eax,dword ptr [__security_cookie (01068024h)]  
  [2]   xor         eax,ebp  
  [3]   mov         dword ptr [ebp-4],eax  
        push        ebx  
        push        esi  
        push        edi  
        mov         eax,dword ptr [index]  
        push        eax  
        push        0CCh  
        lea         ecx,[buffer]  
        push        ecx  
        call        _memset (010610B9h)  
        add         esp,0Ch  
        mov         eax,dword ptr [index]  
        movsx       eax,byte ptr buffer[eax]  
        pop         edi  
        pop         esi  
        pop         ebx  
  [4]   mov         ecx,dword ptr [ebp-4]  
  [5]   xor         ecx,ebp  
  [6]   call        @__security_check_cookie@4 (01061276h)  
        mov         esp,ebp  
        pop         ebp  
        ret  
```

The instrumentation above is:
  * [1] is loading the global security canary,
  * [3] is storing the local computed ([2]) canary to the guard slot,
  * [4] is loading the guard slot and ([5]) re-compute the global canary,
  * [6] is validating the resulting canary with the '__security_check_cookie' and performs error handling.

Overview of the current stack-protection implementation:
  * lib/CodeGen/StackProtector.cpp
    * There is a default stack-protection implementation applied on intermediate representation.
    * The target can overload 'getIRStackGuard' method if it has a standard location for the stack protector cookie.
    * An intrinsic 'Intrinsic::stackprotector' is added to the prologue. It will be expanded by the instruction selection pass (DAG or Fast).
    * Basic Blocks are added to every instrumented function to receive the code for handling stack guard validation and errors handling.
    * Guard manipulation and comparison are added directly to the intermediate representation.

  * lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
  * lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
    * There is an implementation that adds instrumentation during instruction selection (for better handling of sibbling calls).
      * see long comment above 'class StackProtectorDescriptor' declaration.
    * The target needs to override 'getSDagStackGuard' to activate SDAG stack protection generation. (note: getIRStackGuard MUST be nullptr).
      * 'getSDagStackGuard' returns the appropriate stack guard (security cookie)
    * The code is generated by 'SelectionDAGBuilder.cpp' and 'SelectionDAGISel.cpp'.

  * include/llvm/Target/TargetLowering.h
    * Contains function to retrieve the default Guard 'Value'; should be overriden by each target to select which implementation is used and provide Guard 'Value'.

  * lib/Target/X86/X86ISelLowering.cpp
    * Contains the x86 specialisation; Guard 'Value' used by the SelectionDAG algorithm.

Function-based Instrumentation:
  * The MSVC doesn't inline the stack guard comparison in every function. Instead, a call to '__security_check_cookie' is added to the epilogue before every return instructions.
  * To support function-based instrumentation, this patch is
    * adding a function to get the function-based check (llvm 'Value', see include/llvm/Target/TargetLowering.h),
      * If provided, the stack protection instrumentation won't be inlined and a call to that function will be added to the prologue.
    * modifying (SelectionDAGISel.cpp) do avoid producing basic blocks used for inline instrumentation,
    * generating the function-based instrumentation during the ISEL pass (SelectionDAGBuilder.cpp),
    * if FastISEL (not SelectionDAG), using the fallback which rely on the same function-based implemented over intermediate representation (StackProtector.cpp).

Modifications
  * adding support for MSVC (lib/Target/X86/X86ISelLowering.cpp)
  * adding support function-based instrumentation (lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp, .h)

Results

  * IR generated instrumentation:
```
clang-cl /GS test.cc /Od /c -mllvm -print-isel-input
```

```
*** Final LLVM Code input to ISel ***

; Function Attrs: nounwind sspstrong
define i32 @"\01?example@@YAHHH@Z"(i32 %offset, i32 %index) #0 {
entry:
  %StackGuardSlot = alloca i8*                                                  <<<-- Allocated guard slot
  %0 = call i8* @llvm.stackguard()                                              <<<-- Loading Stack Guard value
  call void @llvm.stackprotector(i8* %0, i8** %StackGuardSlot)                  <<<-- Prologue intrinsic call (store to Guard slot)
  %index.addr = alloca i32, align 4
  %offset.addr = alloca i32, align 4
  %buffer = alloca [10 x i8], align 1
  store i32 %index, i32* %index.addr, align 4
  store i32 %offset, i32* %offset.addr, align 4
  %arraydecay = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 0
  %1 = load i32, i32* %index.addr, align 4
  call void @llvm.memset.p0i8.i32(i8* %arraydecay, i8 -52, i32 %1, i32 1, i1 false)
  %2 = load i32, i32* %index.addr, align 4
  %arrayidx = getelementptr inbounds [10 x i8], [10 x i8]* %buffer, i32 0, i32 %2
  %3 = load i8, i8* %arrayidx, align 1
  %conv = sext i8 %3 to i32
  %4 = load volatile i8*, i8** %StackGuardSlot                                  <<<-- Loading Guard slot
  call void @__security_check_cookie(i8* %4)                                    <<<-- Epilogue function-based check
  ret i32 %conv
}
```

  * SelectionDAG generated instrumentation:

```
clang-cl /GS test.cc /O1 /c /FA
```

```
"?example@@YAHHH@Z":                    # @"\01?example@@YAHHH@Z"
# BB#0:                                 # %entry
        pushl   %esi
        subl    $16, %esp
        movl    ___security_cookie, %eax                                        <<<-- Loading Stack Guard value
        movl    28(%esp), %esi
        movl    %eax, 12(%esp)                                                  <<<-- Store to Guard slot
        leal    2(%esp), %eax
        pushl   %esi
        pushl   $204
        pushl   %eax
        calll   _memset
        addl    $12, %esp
        movsbl  2(%esp,%esi), %esi
        movl    12(%esp), %ecx                                                  <<<-- Loading Guard slot
        calll   @__security_check_cookie@4                                      <<<-- Epilogue function-based check
        movl    %esi, %eax
        addl    $16, %esp
        popl    %esi
        retl
```

Reviewers: kcc, pcc, eugenis, rnk

Subscribers: majnemer, llvm-commits, hans, thakis, rnk

Differential Revision: http://reviews.llvm.org/D20346

llvm-svn: 272053
2016-06-07 20:15:35 +00:00
Justin Bogner c04a76c176 SDAG: Use an Optional<> instead of a sigil value. NFC
This just makes it a bit more clear that we don't intend to use a
deleted node for anything here.

llvm-svn: 270931
2016-05-26 22:29:34 +00:00
Diana Picus 86f1f4ca77 Fix some comment typos in SelectionDAGBuilder. NFC
llvm-svn: 270190
2016-05-20 08:06:31 +00:00
Renato Golin 38ed8021c7 Fix an assert in SelectionDAGBuilder when processing inline asm
When processing inline asm that contains errors, make sure we can recover
gracefully by creating an UNDEF SDValue for the inline asm statement before
returning from SelectionDAGBuilder::visitInlineAsm. This is necessary for
consumers that don't exit on the first error that is emitted (e.g. clang)
and that would assert later on.

Fixes PR24071.

Patch by Diana Picus.

llvm-svn: 269811
2016-05-17 19:52:01 +00:00
Matt Arsenault c31a9d0671 SelectionDAG: Select min/max when both are used
Allow two users of the condition if the other user
is also a min/max select. i.e.

%c = icmp slt i32 %x, %y
%min = select i1 %c, i32 %x, i32 %y
%max = select i1 %c, i32 %y, i32 %x

llvm-svn: 269699
2016-05-16 20:58:23 +00:00
Igor Breger 110af565c7 getelementptr instruction, support index vector of EVT.
Differential Revision: http://reviews.llvm.org/D19775

llvm-svn: 268195
2016-05-01 13:29:12 +00:00
Tim Shen a1d8bc5597 [PPC, SSP] Support PowerPC Linux stack protection.
llvm-svn: 266809
2016-04-19 20:14:52 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
David Majnemer 0f26b0aeb4 [CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN
The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only
one of the inputs is NaN: -NUM will return the non-NaN argument while
-NAN would return NaN.

It is desirable to lower to @llvm.{min,max}num to -NAN if they don't
have a native instruction for -NUM.  Notably, ARMv7 NEON's vmin has the
-NAN semantics.

N.B.  Of course, it is only safe to do this if the intrinsic call is
marked nnan.

llvm-svn: 266279
2016-04-14 07:13:24 +00:00
Matt Arsenault 9cd90712f0 AMDGPU: Implement canonicalize
Also add generic DAG node for it.

llvm-svn: 266272
2016-04-14 01:42:16 +00:00
Philip Reames 92d1f0cb6d Introduce an GCRelocateInst class [NFC]
Previously, we were using isGCRelocate predicates.  Using a subclass of IntrinsicInst is far more idiomatic.  The refactoring also enables a couple of minor simplifications and code sharing.

llvm-svn: 266098
2016-04-12 18:05:10 +00:00
Tim Shen 0012756489 [SSP] Remove llvm.stackprotectorcheck.
This is a cleanup patch for SSP support in LLVM. There is no functional change.
llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
actually lowering it in SelectBasicBlock; rather, it adds check code in
FinishBasicBlock, ignoring the position where the intrinsic is inserted
(See FindSplitPointForStackProtector()).

llvm-svn: 265851
2016-04-08 21:26:31 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Sanjoy Das 65a60670e8 Lower @llvm.experimental.deoptimize as a noreturn call
While preserving the return value for @llvm.experimental.deoptimize at
the IR level is useful during mid-level optimization, doing so at the
machine instruction level requires generating some extra code and a
return that is non-ideal.  This change has LLVM lower

```
  %val = call @llvm.experimental.deoptimize
  ret %val
```

to effectively

```
  call @__llvm_deoptimize()
  unreachable
```

instead.

llvm-svn: 265502
2016-04-06 01:33:49 +00:00
Manman Ren e221a870d3 Swift Calling Convention: swifterror target-independent change.
At IR level, the swifterror argument is an input argument with type
ErrorObject**. For targets that support swifterror, we want to optimize it
to behave as an inout value with type ErrorObject*; it will be passed in a
fixed physical register.

The main idea is to track the virtual registers for each swifterror value. We
define swifterror values as AllocaInsts with swifterror attribute or a function
argument with swifterror attribute.

In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before
handling the basic blocks.

When iterating over all basic blocks in RPO, before actually visiting the basic
block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when
there are multiple predecessors or to simply propagate them. There, we create a
virtual register for each swifterror value in the entry block. For predecessors
that are not yet visited, we create virtual registers to hold the swifterror
values at the end of the predecessor. The assignments are saved in
SwiftErrorWorklist and will be materialized at the end of visiting the basic
block.

When visiting a load from a swifterror value, we copy from the current virtual
register assignment. When visiting a store to a swifterror value, we create a
virtual register to hold the swifterror value and update SwiftErrorMap to
track the current virtual register assignment.

Differential Revision: http://reviews.llvm.org/D18108

llvm-svn: 265433
2016-04-05 18:13:16 +00:00
Manman Ren 9bfd0d03e9 Swift Calling Convention: add swifterror attribute.
A ``swifterror`` attribute can be applied to a function parameter or an
AllocaInst.

This commit does not include any target-specific change. The target-specific
optimization will come as a follow-up patch.

Differential Revision: http://reviews.llvm.org/D18092

llvm-svn: 265189
2016-04-01 21:41:15 +00:00
Sanjoy Das 9d41a8f269 Don't use an i64 return type with webkit_jscc
Re-enable an assertion enabled by Justin Lebar in rL265092.  rL265092
was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc
does not like non-i64 return types.  Change the test case to not do
that.

llvm-svn: 265099
2016-04-01 02:51:21 +00:00
Justin Lebar 98981e5573 Revert "Protect some assertions with NDEBUG rather than DEBUG()."
This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll.

llvm-svn: 265093
2016-04-01 01:23:23 +00:00
Justin Lebar c814e8e4ab Protect some assertions with NDEBUG rather than DEBUG().
DEBUG() only runs if you pass -debug, but these assertions are generally
useful.

llvm-svn: 265092
2016-04-01 01:09:12 +00:00
Nirav Dave 2aab7f4358 Add support for no-jump-tables
Add function soft attribute to the generation of Jump Tables in CodeGen
as initial step towards clang support of gcc's no-jump-table support

Reviewers: hans, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18321

llvm-svn: 264756
2016-03-29 17:46:23 +00:00
Manman Ren f46262e0b7 Swift Calling Convention: add swiftself attribute.
Differential Revision: http://reviews.llvm.org/D17866

llvm-svn: 264754
2016-03-29 17:37:21 +00:00
Kyle Butt 5e241b11ed [Codegen] Decrease minimum jump table density.
Minimum density for both optsize and non optsize are now options
-sparse-jump-table-density (default 10) for non optsize functions
-dense-jump-table-density (default 40) for optsize functions, which
matches the current default. This improves several benchmarks at google
at the cost of a small codesize increase. For code compiled with -Os,
the old behavior continues

llvm-svn: 264689
2016-03-29 00:23:41 +00:00
Sanjoy Das df9ae70f49 Add lowering support for llvm.experimental.deoptimize
Summary:
Only adds support for "naked" calls to llvm.experimental.deoptimize.
Support for round-tripping through RewriteStatepointsForGC will come
as a separate patch (should be simpler than this one).

Reviewers: reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18429

llvm-svn: 264329
2016-03-24 20:23:29 +00:00
Sanjoy Das 6b535630a1 Add a hasOperandBundlesOtherThan helper, and use it; NFC
llvm-svn: 264072
2016-03-22 17:51:25 +00:00
Sanjoy Das 38bfc22161 Add "first class" lowering for deopt operand bundles
Summary:
After this change, deopt operand bundles can be lowered directly by
SelectionDAG into STATEPOINT instructions (which are then lowered to a
call or sequence of nop, with an associated __llvm_stackmaps entry0.
This obviates the need to round-trip deoptimization state through
gc.statepoint via RewriteStatepointsForGC.

Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin

Subscribers: sanjoy, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D18257

llvm-svn: 264015
2016-03-22 00:59:13 +00:00
Sanjoy Das 3a02019fbc [SelectionDAG] Remove visitStatepoint; NFC
This way we have a single entry point into StatepointLowering.  The
method was a direct dispatch to LowerStatepoint anyway.

llvm-svn: 263682
2016-03-17 00:47:14 +00:00
Sanjoy Das 19c6159833 [SelectionDAG] Extract out populateCallLoweringInfo; NFC
SelectionDAGBuilder::populateCallLoweringInfo is now used instead of
SelectionDAGBuilder::lowerCallOperands.  The populateCallLoweringInfo
interface is more composable in face of design changes like
http://reviews.llvm.org/D18106

llvm-svn: 263663
2016-03-16 20:49:31 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Tom Stellard 9f2e00de7b SelectionDAG: Fix a crash on inline asm when output register supports multiple types
Summary:
The code in SelectionDAG did not handle the case where the
register type and output types were different, but had the same size.

Reviewers: arsenm, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17940

llvm-svn: 263022
2016-03-09 16:02:52 +00:00
Vasileios Kalintiris 36901dd1c3 Revert "[mips] Promote the result of SETCC nodes to GPR width."
This reverts commit r262316.

It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.

llvm-svn: 262387
2016-03-01 20:25:43 +00:00
Justin Lebar b5ca00a58d [NVPTX] Use different, convergent MIs for convergent calls.
Summary:
Calls sometimes need to be convergent.  This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.

Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs.  But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.

Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.

Reviewers: jholewinski

Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17423

llvm-svn: 262373
2016-03-01 19:24:03 +00:00
Vasileios Kalintiris 3a8f7f9e31 [mips] Promote the result of SETCC nodes to GPR width.
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.

The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10970

llvm-svn: 262316
2016-03-01 10:08:01 +00:00
Cong Hou e0eb8bfe37 Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64.
llvm-svn: 262091
2016-02-26 23:25:30 +00:00
Cong Hou 4ce0280a41 Detecte vector reduction operations just before instruction selection.
(This is the second attemp to commit this patch, after fixing pr26652 & pr26653).

This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261804
2016-02-24 23:40:36 +00:00
Artur Pilipenko 31bcca47d3 NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Nico Weber e6154ffbe0 Revert r261070, it caused PR26652 / PR26653.
llvm-svn: 261127
2016-02-17 18:47:29 +00:00
Cong Hou bbd4e3b400 Detecte vector reduction operations just before instruction selection.
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261070
2016-02-17 06:37:04 +00:00
Ahmed Bougacha f8dfb47c02 [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Hans Wennborg 850ec6ca18 [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)
This matches GCC and MSVC's behaviour, and saves on code size.

We were already not extending i1 return values on x86_64 after r127766. This
takes that patch further by applying it to x86 target as well, and also for i8
and i16.

The ABI docs have been unclear about the required behaviour here. The new i386
psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return
vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being
updated to say the same [2].

Differential Revision: http://reviews.llvm.org/D16907

 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf
 [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ

llvm-svn: 260133
2016-02-08 19:34:30 +00:00
Matt Arsenault 2bba779272 SelectionDAG: Lower some range metadata to AssertZext
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.

This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.

llvm-svn: 260109
2016-02-08 16:28:19 +00:00
Benjamin Kramer f9172fd4ac Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/
It's a SelectionDAG thing, not a Target thing.

llvm-svn: 258939
2016-01-27 16:32:26 +00:00
Junmo Park 75e9d64aa2 Remove extra whitespace. NFC.
llvm-svn: 258617
2016-01-23 06:34:36 +00:00
Eduard Burtescu 1423921a24 [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
Eduard Burtescu 23c4d83aa3 [NFC] Replace several manual GEP loops with gep_type_iterator.
Reviewers: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16335

llvm-svn: 258262
2016-01-20 00:26:52 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Manuel Jacob 190577ac81 [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: dsanders, llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16273

llvm-svn: 258023
2016-01-17 22:37:39 +00:00