Commit Graph

21646 Commits

Author SHA1 Message Date
Andrew Kaylor aa641a5171 Re-commit optimization bisect support (r267022) without new pass manager support.
The original commit was reverted because of a buildbot problem with LazyCallGraph::SCC handling (not related to the OptBisect handling).

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267231
2016-04-22 22:06:11 +00:00
Peter Collingbourne 7dd8dbf486 Introduce llvm.load.relative intrinsic.
This intrinsic takes two arguments, ``%ptr`` and ``%offset``. It loads
a 32-bit value from the address ``%ptr + %offset``, adds ``%ptr`` to that
value and returns it. The constant folder specifically recognizes the form of
this intrinsic and the constant initializers it may load from; if a loaded
constant initializer is known to have the form ``i32 trunc(x - %ptr)``,
the intrinsic call is folded to ``x``.

LLVM provides that the calculation of such a constant initializer will
not overflow at link time under the medium code model if ``x`` is an
``unnamed_addr`` function. However, it does not provide this guarantee for
a constant initializer folded into a function body. This intrinsic can be
used to avoid the possibility of overflows when loading from such a constant.

Differential Revision: http://reviews.llvm.org/D18367

llvm-svn: 267223
2016-04-22 21:18:02 +00:00
Matt Arsenault 940d19a09c TLI: Only iterate over integer vector types
Instead of iterating over all vectors and skipping integers.

llvm-svn: 267220
2016-04-22 21:16:17 +00:00
Matt Arsenault 3b748d76f6 DAGCombiner: Relax alignment restriction when changing store type
If the target allows the alignment, this should be OK.

llvm-svn: 267217
2016-04-22 21:01:41 +00:00
Peter Collingbourne 265ebd7d70 CodeGen: Use PLT relocations for relative references to unnamed_addr functions.
The relative vtable ABI (PR26723) needs PLT relocations to refer to virtual
functions defined in other DSOs. The unnamed_addr attribute means that the
function's address is not significant, so we're allowed to substitute it
with the address of a PLT entry.

Also includes a bonus feature: addends for COFF image-relative references.

Differential Revision: http://reviews.llvm.org/D17938

llvm-svn: 267211
2016-04-22 20:40:10 +00:00
Matt Arsenault 629d12de70 DAGCombiner: Relax alignment restriction when changing load type
If the target allows the alignment, this should still be OK.

llvm-svn: 267209
2016-04-22 20:21:36 +00:00
Matthias Braun 4f57377c68 MachineScheduler: Move code to initialize a Candidate out of tryCandidate(); NFC
llvm-svn: 267191
2016-04-22 19:10:15 +00:00
Matthias Braun 6493bc2b97 MachineScheduler: Limit the size of the ready list.
Avoid quadratic complexity in unusually large basic blocks by limiting
the size of the ready lists.

Differential Revision: http://reviews.llvm.org/D19349

llvm-svn: 267189
2016-04-22 19:09:17 +00:00
Tom Stellard 2339f6f5a3 PostRAHazardRecocgnizer: Fix unused-private-field warning
llvm-svn: 267160
2016-04-22 15:11:08 +00:00
Tom Stellard ee34680bb0 CodeGen: Add a stand-alone hazard recognizer pass
Summary:
This new pass allows targets to use the hazard recognizer without having
to also run one of the schedulers.  This is useful when compiling with
optimizations disabled for targets that still need noop hazards
to be handled correctly.

Reviewers: hfinkel, atrick

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18594

llvm-svn: 267156
2016-04-22 14:43:50 +00:00
Eric Liu 6be128e43d Fix -Wunused-variable in non-asserts build.
llvm-svn: 267128
2016-04-22 09:50:31 +00:00
Daniel Sanders 591c379563 Revert r267098 - [MachineCombiner] Support for floating-point FMA on ARM64
It introduced buildbot failures on clang-cmake-mips, clang-ppc64le-linux, among others.

llvm-svn: 267127
2016-04-22 09:37:26 +00:00
Vedant Kumar 6013f45f92 Revert "Initial implementation of optimization bisect support."
This reverts commit r267022, due to an ASan failure:

  http://lab.llvm.org:8080/green/job/clang-stage2-cmake-RgSan_check/1549

llvm-svn: 267115
2016-04-22 06:51:37 +00:00
Nicolai Haehnle b0c9748709 AMDGPU/SI: add llvm.amdgcn.ps.live intrinsic
Summary:
This intrinsic returns true if the current thread belongs to a live pixel
and false if it belongs to a pixel that we are executing only for derivative
computation. It will be used by Mesa to implement gl_HelperInvocation.

Note that for pixels that are killed during the shader, this implementation
also returns true, but it doesn't matter because those pixels are always
disabled in the EXEC mask.

This unearthed a corner case in the instruction verifier, which complained
about a v_cndmask 0, 1, exec, exec<imp-use> instruction. That's stupid but
correct code, so make the verifier accept it as such.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19191

llvm-svn: 267102
2016-04-22 04:04:08 +00:00
Gerolf Hoflehner b32f11fc62 [MachineCombiner] Support for floating-point FMA on ARM64
Evaluates fmul+fadd -> fmadd combines and similar code sequences in the
machine combiner. It adds support for float and double similar to the existing
integer implementation. The key features are:

- DAGCombiner checks whether it should combine greedily or let the machine
combiner do the evaluation. This is only supported on ARM64.
- It gives preference to throughput over latency: the heuristic used is
to combine always in loops. The targets decides whether the machine
combiner should optimize for throughput or latency.
- Supports for fmadd, f(n)msub, fmla, fmls patterns
- On by default at O3 ffast-math

llvm-svn: 267098
2016-04-22 02:15:19 +00:00
David Blaikie f0f6c29cec Fix more -Wunused-variable in non-asserts build.
llvm-svn: 267077
2016-04-21 23:24:09 +00:00
David Blaikie 3d42a86f9d Fix some -Wunused-variable warnings in non-asserts builds.
llvm-svn: 267073
2016-04-21 22:53:33 +00:00
Derek Schuff 025191d42f Improve error message reporting for MachineFunctionProperties
When printing the properties required by a pass, only print the
properties that are set, and not those that are clear (only properties
that are set are verified, clear properties are "don't-care").

llvm-svn: 267070
2016-04-21 22:19:24 +00:00
Quentin Colombet 23341a84ca [MachineBasicBlock] Make the pass argument truly mandatory when
splitting edges.

MachineBasicBlock::SplitCriticalEdges will crash if a nullptr would have
been passed for the Pass argument. Do not allow that by turning this
argument into a reference.
The alternative would have been to make the Pass a truly optional
argument, but although this is easy to do, I was afraid users using it
like this would not be aware the livness information, dominator tree and
such would silently be broken.

llvm-svn: 267052
2016-04-21 21:01:13 +00:00
Quentin Colombet 77e1878954 [MachineBasicBlock] Refactor SplitCriticalEdge to expose a query API.
Introduce canSplitCriticalEdge, so that clients can now query whether or
not a critical edge can be split without actually needing to split it.
This may be useful when gathering information for cost models for
instance.

llvm-svn: 267046
2016-04-21 20:46:27 +00:00
Quentin Colombet c320fb4eae [RegisterBankInfo] Change the API for the verify methods.
Return bool instead of void so that it is natural to put the calls into
asserts.

llvm-svn: 267033
2016-04-21 18:34:43 +00:00
Matt Arsenault 7846d885ed LegalizeDAG: Move unaligned load/store expansion to TLI
When custom lowered, this is not called if the store is custom
lowered. Move it to be a utility function so targets can
easily expand unaligned accesses when custom lowering.

llvm-svn: 267029
2016-04-21 18:19:11 +00:00
Quentin Colombet 0e5ff58567 [RegisterBankInfo] Change the representation of the partial mappings.
Instead of holding a mask, hold two value: the start index and the
length of the mapping. This is a more compact representation, although
less powerful. That being said, arbitrary masks would not have worked
for the generic so do not allow them in the first place.

llvm-svn: 267025
2016-04-21 18:09:34 +00:00
Matt Arsenault 8d1052f55c DAGCombiner: Reduce 64-bit BFE pattern to pattern on 32-bit component
If the extracted bits are restricted to the upper half or lower half,
this can be truncated.

llvm-svn: 267024
2016-04-21 18:03:06 +00:00
Andrew Kaylor f0f279291c Initial implementation of optimization bisect support.
This patch implements a optimization bisect feature, which will allow optimizations to be selectively disabled at compile time in order to track down test failures that are caused by incorrect optimizations.

The bisection is enabled using a new command line option (-opt-bisect-limit).  Individual passes that may be skipped call the OptBisect object (via an LLVMContext) to see if they should be skipped based on the bisect limit.  A finer level of control (disabling individual transformations) can be managed through an addition OptBisect method, but this is not yet used.

The skip checking in this implementation is based on (and replaces) the skipOptnoneFunction check.  Where that check was being called, a new call has been inserted in its place which checks the bisect limit and the optnone attribute.  A new function call has been added for module and SCC passes that behaves in a similar way.

Differential Revision: http://reviews.llvm.org/D19172

llvm-svn: 267022
2016-04-21 17:58:54 +00:00
Amjad Aboud a5ba99140c Fixed Dwarf debug info emission to skip DILexicalBlockFile entries.
Before this fix, DILexicalBlockFile entries were skipped only in some cases and were not in other cases.

Differential Revision: http://reviews.llvm.org/D18724

llvm-svn: 267004
2016-04-21 16:58:49 +00:00
Craig Topper 52cb5ec36f [SelectionDAG] Teach LegalizeVectorOps to directly Expand CTTZ_ZERO_UNDEF/CTLZ_ZERO_UNDEF to CTTZ/CTLZ directly if those ops are Legal/Custom instead of deferring it to LegalizeOps.
This is needed to support CTTZ/CTLZ Custom correctly since LegalizeOps would be too late to do the custom lowering.

llvm-svn: 266951
2016-04-21 04:43:57 +00:00
Matthias Braun b550b765bd MachineSched: Cleanup; NFC
llvm-svn: 266946
2016-04-21 01:54:13 +00:00
Mehdi Amini ea0b1e7c17 ScoreboardHazardRecognizer: unbreak TSAN by moving a static mutated variable to a member
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266837
2016-04-20 00:21:24 +00:00
Tim Shen a1d8bc5597 [PPC, SSP] Support PowerPC Linux stack protection.
llvm-svn: 266809
2016-04-19 20:14:52 +00:00
Tim Shen e885d5e4d3 [SSP, 2/2] Create llvm.stackguard() intrinsic and lower it to LOAD_STACK_GUARD
With this change, ideally IR pass can always generate llvm.stackguard
call to get the stack guard; but for now there are still IR form stack
guard customizations around (see getIRStackGuard()). Future SSP
customization should go through LOAD_STACK_GUARD.

There is a behavior change: stack guard values are not CSEed anymore,
since we should never reuse the value in case that it has been spilled (and
corrupted). See ssp-guard-spill.ll. This also cause the change of stack
size and codegen in X86 and AArch64 test cases.

Ideally we'd like to know if the guard created in llvm.stackprotector() gets
spilled or not. If the value is spilled, discard the value and reload
stack guard; otherwise reuse the value. This can be done by teaching
register allocator to know how to rematerialize LOAD_STACK_GUARD and
force a rematerialization (which seems hard), or check for spilling in
expandPostRAPseudo. It only makes sense when the stack guard is a global
variable, which requires more instructions to load. Anyway, this seems to go out
of the scope of the current patch.

llvm-svn: 266806
2016-04-19 19:40:37 +00:00
Sanjoy Das 4519ff73df Add a description for the PatchableFunction pass; NFC
llvm-svn: 266721
2016-04-19 06:25:02 +00:00
Sanjoy Das c0441c29df Introduce a "patchable-function" function attribute
Summary:
The `"patchable-function"` attribute can be used by an LLVM client to
influence LLVM's code generation in ways that makes the generated code
easily patchable at runtime (for instance, to redirect control).
Right now only one patchability scheme is supported,
`"prologue-short-redirect"`, but this can be expanded in the future.

Reviewers: joker.eph, rnk, echristo, dberris

Subscribers: joker.eph, echristo, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19046

llvm-svn: 266715
2016-04-19 05:24:47 +00:00
Paul Robinson 43d1e45347 [DWARF] Force a linkage_name on an inlined subprogram's abstract origin.
When we suppress linkage names, for a non-inlined subprogram the name
can still be found in the object-file symbol table, because we have
the code address of the subprogram.  This is not necessarily the case
for an inlined subprogram, so we still want to emit the linkage name
in the DWARF.  Put this on the abstract-origin DIE because it's common
to all inlined instances.

Differential Revision: http://reviews.llvm.org/D18706

llvm-svn: 266692
2016-04-18 22:41:41 +00:00
JF Bastien bbb0aee66e NFC: unify clang / LLVM atomic ordering
This makes the C11 / C++11 *ABI* atomic ordering accessible from LLVM,
as discussed in http://reviews.llvm.org/D18200#inline-151433

This re-applies r266573 which I had reverted in r266576.

Original review: http://reviews.llvm.org/D18875

llvm-svn: 266640
2016-04-18 18:01:43 +00:00
Mehdi Amini b550cb1750 [NFC] Header cleanup
Removed some unused headers, replaced some headers with forward class declarations.

Found using simple scripts like this one:
clear && ack --cpp -l '#include "llvm/ADT/IndexedMap.h"' | xargs grep -L 'IndexedMap[<]' | xargs grep -n --color=auto 'IndexedMap'

Patch by Eugene Kosov <claprix@yandex.ru>

Differential Revision: http://reviews.llvm.org/D19219

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266595
2016-04-18 09:17:29 +00:00
JF Bastien fb9871b495 Revert "NFC: unify clang / LLVM atomic ordering"
This reverts commit 537951f2f16d6a8542571c7722fcbae07d4e62c2.

Causes an assert in:
  test/Transforms/AtomicExpand/SPARC/libcalls.ll
  (Ordering2 != AtomicOrdering::NotAtomic && "expect atomic MO")

Bot:
  http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/21724/testReport/junit/LLVM/Transforms_AtomicExpand_SPARC/libcalls_ll/

I'm not getting this assert on my local debug build, but I'll revert
just to be sure.

llvm-svn: 266576
2016-04-17 21:29:01 +00:00
JF Bastien 6ef3aa2b7e NFC: unify clang / LLVM atomic ordering
Summary: This makes the C11 / C++11 *ABI* atomic ordering accessible from LLVM, as discussed in http://reviews.llvm.org/D18200#inline-151433

Reviewers: jyknight, reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18875

llvm-svn: 266573
2016-04-17 21:00:57 +00:00
Davide Italiano caa1169653 [ParallelCG] SmallVector<char> -> SmallString.
llvm-svn: 266568
2016-04-17 19:38:57 +00:00
Rafael Espindola 3c1c9875b9 Keep only the splitCodegen version that takes a factory.
This makes it much easier to see that all created TargetMachines are
equivalent.

llvm-svn: 266564
2016-04-17 18:42:27 +00:00
Mehdi Amini 47b292d3fd Remove some unneeded headers and replace some headers with forward class declarations (NFC)
Differential Revision: http://reviews.llvm.org/D19154

Patch by Eugene Kosov <claprix@yandex.ru>

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266524
2016-04-16 07:51:28 +00:00
Mehdi Amini 59ae854503 Do not modify a cl::opt programmatically, global mutable state is evil.
Found by TSAN on ThinLTO.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 266514
2016-04-16 04:58:30 +00:00
Richard Smith 2db6f2e508 Update and fix LLVM_ENABLE_MODULES:
1) We need to add this flag prior to adding any other, in case the user has
specified a -fmodule-cache-path= flag in their custom CXXFLAGS. Such a flag
causes -Werror builds to fail, and thus all config checks fail, until we add
the corresponding -fmodules flag. The modules selfhost bot does this, for
instance.

2) Delete module maps that were putting .cpp files into modules.

3) Enable -fmodules-local-submodule-visibility, to get proper module
visibility rules applied across submodules of the same module. Disable
-fmodules for C builds, since that flag is not available there.

llvm-svn: 266502
2016-04-16 00:48:58 +00:00
Wei Mi 963f2df4d2 Don't skip splitSeparateComponents in eliminateDeadDefs for HoistSpillHelper::hoistAllSpills.
Because HoistSpillHelper::hoistAllSpills is called in postOptimization, before the
patch we didn't want LiveRangeEdit::eliminateDeadDefs to call splitSeparateComponents
and generate unassigned new vregs. However, skipping splitSeparateComponents will make
verify-machineinstrs unhappy, so I remove the early return, and use
HoistSpillHelper::LRE_DidCloneVirtReg to assign physreg/stackslot for those new vregs.

In addition, some code reorganization to make class HoistSpillHelper privately inheriting
from LiveRangeEdit::Delegate possible. This is to be consistent with class RAGreedy and
class RegisterCoalescer.

Differential Revision: http://reviews.llvm.org/D19142

llvm-svn: 266489
2016-04-15 23:16:44 +00:00
Hans Wennborg 07f6d3a893 Switch lowering: don't add incoming PHI values from skipped bit test MBB's (PR27135)
After r245976, LLVM will skip the last bit test case if knows it will always be
true. However, we would still erroneously update PHI nodes with incoming values
from the MBB that would perform the final bit test, causing -verify-machineinstrs
to fail.

llvm-svn: 266479
2016-04-15 21:45:30 +00:00
Hans Wennborg c944c13dc1 SelectionDAGISel: rangeify a loop
llvm-svn: 266478
2016-04-15 21:45:09 +00:00
Davide Italiano 7950b12957 [ParallelCG] Add a new splitCodeGen() API which takes a TargetMachineFactory.
This is a recommit of r266390 with a fix that will allow tests to pass
(hopefully). Before we got a StringRef to M->getTargetTriple() and right
after we moved the Module so we were referencing a dangling object.

llvm-svn: 266456
2016-04-15 17:34:32 +00:00
Adrian Prantl 75819aedf6 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

llvm-svn: 266446
2016-04-15 15:57:41 +00:00
Jun Bum Lim 4c5bd58ebe [MachineScheduler]Add support for store clustering
Perform store clustering just like load clustering. This change add
StoreClusterMutation in machine-scheduler. To control StoreClusterMutation,
added enableClusterStores() in TargetInstrInfo.h. This is enabled only on
AArch64 for now.

This change also add support for unscaled stores which were not handled in
getMemOpBaseRegImmOfs().

llvm-svn: 266437
2016-04-15 14:58:38 +00:00
Davide Italiano 2abf2e7c8c Revert "[LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory."
This reverts commits r266390 and r266396 as they broke some bots.

llvm-svn: 266408
2016-04-15 02:07:03 +00:00
Justin Lebar 7dba2e0d0c [ifcnv] Don't duplicate blocks that contain convergent instructions.
It's unsafe to duplicate blocks that contain convergent instructions
during ifcnv.  See the patch for details.

Reviewers: hfinkel

Differential Revision: http://reviews.llvm.org/D17518

llvm-svn: 266404
2016-04-15 01:38:41 +00:00
Davide Italiano 3fdd27df03 [LTO] Add a new splitCodeGen() API which takes a TargetMachineFactory.
This will be used in lld to avoid creating TargetMachine in two
different places. See D18999 for a more detailed discussion.

Differential Revision:  http://reviews.llvm.org/D19139

llvm-svn: 266390
2016-04-15 00:07:28 +00:00
Geoff Berry 6381713b37 [ScheduleDAGInstrs] Re-factor for based on review feedback. NFC.
Summary:
Re-factor some code to improve clarity and style based on review
comments from http://reviews.llvm.org/D18093.

Reviewers: MatzeB, mcrosier

Subscribers: MatzeB, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D19128

llvm-svn: 266372
2016-04-14 21:31:07 +00:00
Reid Kleckner 28865809fe Sink DI metadata usage out of MachineInstr.h and MachineInstrBuilder.h
MachineInstr.h and MachineInstrBuilder.h are very popular headers,
widely included across all LLVM backends. It turns out that there only a
handful of TUs that actually care about DI operands on MachineInstrs.

After this change, touching DebugInfoMetadata.h and rebuilding llc only
needs 112 actions instead of 542.

llvm-svn: 266351
2016-04-14 18:29:59 +00:00
Tom Stellard b72a65ff53 [GlobalISel] Coding style and whitespace fixes
Reviewers: qcolombet

Subscribers: joker.eph, llvm-commits, vkalintiris

Differential Revision: http://reviews.llvm.org/D19119

llvm-svn: 266342
2016-04-14 17:23:33 +00:00
David Majnemer 0f26b0aeb4 [CodeGen] Teach LLVM how to lower @llvm.{min,max}num to {MIN,MAX}NAN
The behavior of {MIN,MAX}NAN differs from that of {MIN,MAX}NUM when only
one of the inputs is NaN: -NUM will return the non-NaN argument while
-NAN would return NaN.

It is desirable to lower to @llvm.{min,max}num to -NAN if they don't
have a native instruction for -NUM.  Notably, ARMv7 NEON's vmin has the
-NAN semantics.

N.B.  Of course, it is only safe to do this if the intrinsic call is
marked nnan.

llvm-svn: 266279
2016-04-14 07:13:24 +00:00
Matt Arsenault 9cd90712f0 AMDGPU: Implement canonicalize
Also add generic DAG node for it.

llvm-svn: 266272
2016-04-14 01:42:16 +00:00
Matthias Braun 46b0f03e12 TargetLowering: Factor out common code for tail call eligibility checking; NFC
llvm-svn: 266270
2016-04-14 01:10:42 +00:00
Nirav Dave 2477491a92 Cleanup Store Merging in UseAA case
This patch fixes a bug (PR26827) when using anti-aliasing in store
merging. This sets the chain users of the component stores to point to
the new store instead of the component stores chain parent.

Reviewers: jyknight

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18909

llvm-svn: 266217
2016-04-13 17:27:26 +00:00
Petar Jovanovic 644b8c1a5d Calculate __builtin_object_size when pointer depends on a condition
This patch fixes calculating of builtin_object_size if it depends on a
condition. Before this patch compiler did not know how to calculate the
object size when it finds a condition that cannot be eliminated.
This patch enables calculating of builtin_object_size even in case when
condition cannot be eliminated by choosing minimum or maximum value as a
result from condition. Choosing minimum or maximum value from condition
is based on the second argument of __builtin_object_size function.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D18438

llvm-svn: 266193
2016-04-13 12:25:25 +00:00
Wei Mi 9a16d655c7 Recommit r265547, and r265610,r265639,r265657 on top of it, plus
two fixes with one about error verify-regalloc reported, and
another about live range update of phi after rematerialization.

r265547:
Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.

analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.

To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.

Patches on top of r265547:
r265610 "Fix the compare-clang diff error introduced by r265547."
r265639 "Fix the sanitizer bootstrap error in r265547."
r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]"

Differential Revision: http://reviews.llvm.org/D15302
Differential Revision: http://reviews.llvm.org/D18934
Differential Revision: http://reviews.llvm.org/D18935
Differential Revision: http://reviews.llvm.org/D18936

llvm-svn: 266162
2016-04-13 03:08:27 +00:00
Justin Bogner 263f314ba7 CodeGen: Clear the MFI's save and restore point after PrologEpilogInserter
This state is no longer useful and not guaranteed to be valid in later
codegen passes. For example, see the added test, which would print a
savepoint of %bb.-1 without this change, and crashes with a
use-after-free error under ASan if you apply the recycling allocator
patch from llvm.org/PR26808.

llvm-svn: 266150
2016-04-12 23:21:53 +00:00
James Y Knight 7873fb9d73 Pre-fill LibcallRoutineNames with nullptr.
And rearrange InitLibcallNames slightly.

llvm-svn: 266142
2016-04-12 22:32:47 +00:00
James Y Knight 19f6cce4e3 Add __atomic_* lowering to AtomicExpandPass.
(Recommit of r266002, with r266011, r266016, and not accidentally
including an extra unused/uninitialized element in LibcallRoutineNames)

AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266115
2016-04-12 20:18:48 +00:00
Ahmed Bougacha 7ac86c47d2 [CodeGen] Remove constant-folding dead code. NFC.
This code was specific to vector operations with scalar operands:
all the opcodes in FoldValue (via FoldConstantArithmetic) can't
match those criteria.

Replace it with an assert if that ever changes: at that point,
we might need to add back a splat BUILD_VECTOR.

llvm-svn: 266100
2016-04-12 18:15:39 +00:00
Philip Reames 92d1f0cb6d Introduce an GCRelocateInst class [NFC]
Previously, we were using isGCRelocate predicates.  Using a subclass of IntrinsicInst is far more idiomatic.  The refactoring also enables a couple of minor simplifications and code sharing.

llvm-svn: 266098
2016-04-12 18:05:10 +00:00
Geoff Berry c0739d8305 [ScheduleDAGInstrs] Handle instructions with multiple MMOs
Summary:
In getUnderlyingObjectsForInstr(): Don't give up on instructions with
multiple MMOs, instead look through all the MMOs and if they all meet
the conservative criteria previously used for single MMO instructions,
then return all of the underlying objects derived from the MMOs.

The change to ScheduleDAGInstrs::buildSchedGraph() is needed to avoid
the case where multiple underlying objects are present and are related
in such a way that successive iterations of the loop end up adding a
dependency from an instruction to itself.

Reviewers: atrick, hfinkel

Subscribers: MatzeB, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18093

llvm-svn: 266084
2016-04-12 15:50:19 +00:00
Rafael Espindola d41b54be11 This reverts commit r266002, r266011 and r266016.
They broke the msan bot.

Original message:

Add __atomic_* lowering to AtomicExpandPass.

AtomicExpandPass can now lower atomic load, atomic store, atomicrmw,and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266062
2016-04-12 12:30:25 +00:00
Quentin Colombet 777a7717ef [RegBankSelect] Teach the repairing code how to handle physical
registers.

llvm-svn: 266029
2016-04-12 00:38:51 +00:00
Quentin Colombet 5aacb1da00 [RegisterBankInfo] Do not provide a default mapping for non-reg of phi
operations.

llvm-svn: 266027
2016-04-12 00:30:14 +00:00
Quentin Colombet 904a2c7422 [RegBankSelect] Teach how to repair definitions.
Although repairing definitions is not mandatory for correctness (only
phis would be impacted because of the RPO traversal), not repairing
might go against the cost model. Therefore, just repair when it is
possible.

llvm-svn: 266025
2016-04-12 00:12:59 +00:00
Derek Schuff f7b2bce1f1 Replace MachineRegisterInfo::TracksLiveness with a MachineFunctionProperty
Use the MachineFunctionProperty mechanism to indicate whether the
liveness info is accurate instead of a bool flag on MRI.
Keeps the MRI accessor function for convenience. NFC

Differential Revision: http://reviews.llvm.org/D18767

llvm-svn: 266020
2016-04-11 23:32:13 +00:00
JF Bastien b3ac75f748 AtomicExpandPass: mark assert variable as used
Avoid -Wunused-variable

llvm-svn: 266016
2016-04-11 23:03:54 +00:00
James Y Knight 00db547f97 Fix compile with GCC after r266002 (Add __atomic_* lowering to AtomicExpandPass)
It doesn't like implicitly calling the ArrayRef constructor with a
returned array -- it appears to decays the returned value to a pointer,
first, before trying to make an ArrayRef out of it.

llvm-svn: 266011
2016-04-11 22:52:42 +00:00
Justin Bogner 1faf01578e CodeGen: Fix a use-after-free in TailDuplication
The call to processPHI already erased MI from its parent, so MI isn't
even valid here, making the getParent() call a use-after-free in
addition to being redundant.

Found by ASan with the ArrayRecycler changes in llvm.org/pr26808.

llvm-svn: 266008
2016-04-11 22:37:13 +00:00
Evgeniy Stepanov f17120a85f [safestack] Add canary to unsafe stack frames
Add StackProtector to SafeStack. This adds limited protection against
data corruption in the caller frame. Current implementation treats
all stack protector levels as -fstack-protector-all.

llvm-svn: 266004
2016-04-11 22:27:48 +00:00
James Y Knight b91d38c5fe Add __atomic_* lowering to AtomicExpandPass.
AtomicExpandPass can now lower atomic load, atomic store, atomicrmw, and
cmpxchg instructions to __atomic_* library calls, when the target
doesn't support atomics of a given size.

This is the first step towards moving all atomic lowering from clang
into llvm. When all is done, the behavior of __sync_* builtins,
__atomic_* builtins, and C11 atomics will be unified.

Previously LLVM would pass everything through to the ISelLowering
code. There, unsupported atomic instructions would turn into __sync_*
library calls. Because of that behavior, Clang currently avoids emitting
llvm IR atomic instructions when this would happen, and emits __atomic_*
library functions itself, in the frontend.

This change makes LLVM able to emit __atomic_* libcalls, and thus will
eventually allow clang to depend on LLVM to do the right thing.

It is advantageous to do the new lowering to atomic libcalls in
AtomicExpandPass, before ISel time, because it's important that all
atomic operations for a given size either lower to __atomic_*
libcalls (which may use locks), or native instructions which won't. No
mixing and matching.

At the moment, this code is enabled only for SPARC, as a
demonstration. The next commit will expand support to all of the other
targets.

Differential Revision: http://reviews.llvm.org/D18200

llvm-svn: 266002
2016-04-11 22:22:33 +00:00
Simon Pilgrim 82e54871d0 [DAGCombiner] Fold xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) anytime before LegalizeVectorOprs
xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B)) was only being combined at the AfterLegalizeTypes stage, this patch permits the combine to occur anytime before then as well.

The main aim with this to improve the ability to recognise bitmasks that can be converted to shuffles.

I had to modify a number of AVX512 mask tests as the basic bitcast to/from scalar pattern was being stripped out, preventing testing of the mmask bitops. By replacing the bitcasts with loads we can get almost the same result.

Differential Revision: http://reviews.llvm.org/D18944

llvm-svn: 265998
2016-04-11 21:10:33 +00:00
Hans Wennborg e9134897f4 Fix a couple of redundant conditional expressions (PR27283, PR28282)
llvm-svn: 265987
2016-04-11 20:35:01 +00:00
Sanjay Patel 892f167aa5 use range-loops; NFCI
llvm-svn: 265985
2016-04-11 20:13:44 +00:00
Reid Kleckner b6800b3052 Combine redundant stack realignment booleans in MachineFrameInfo
MachineFrameInfo does not need to be able to distinguish between the
user asking us not to realign the stack and the target telling us it
doesn't support stack realignment. Either way, fixed stack objects have
their alignment clamped.

llvm-svn: 265971
2016-04-11 17:54:03 +00:00
Tom Stellard 52686e4182 TargetRegisterInfo: Add getRegAsmName()
Summary:
The motivation for this new function is to move an invalid assumption
about the relationship between the names of register definitions in
tablegen files and their assembly names into TargetRegisterInfo, so that
we can begin working on fixing this assumption.

The current problem is that if you have a register definition in
TableGen like:

def MYReg0 : Register<"r0", 0>;

The function TargetLowering::getRegForInlineAsmConstraint() derives the
assembly name from the tablegen name: "MyReg0" rather than the given
assembly name "r0".  This is working, because on most targets the
tablegen name and the assembly names are case insensitive matches for
each other (e.g. def EAX : X86Reg<"eax", ...>

getRegAsmName() will allow targets to override this default assumption and
return the correct assembly name.

Reviewers: echristo, hfinkel

Subscribers: SamWot, echristo, hfinkel, llvm-commits

Differential Revision: http://reviews.llvm.org/D15614

llvm-svn: 265955
2016-04-11 16:21:12 +00:00
Charles Davis 2f65f35c27 [CodeGen] Don't assume that fixed stack objects are aligned in a stack-realigned function.
Summary:
After we make the adjustment, we can assume that for local allocas, but
not for stack parameters, the return address, or any other fixed stack
object (which has a negative offset and therefore lies prior to the
adjusted SP).

Fixes PR26662.

Reviewers: hfinkel, qcolombet, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D18471

llvm-svn: 265886
2016-04-09 23:34:42 +00:00
Adrian Prantl 3891e9e859 Drop debug info for DISubprograms that are not referenced by anything
This patch drops the debug info for all DISubprograms that are
(a) not attached to an llvm::Function and
(b) not indirectly reachable via inline scopes from any surviving Function and
(c) not reachable from a type (i.e.: member functions).

Background: I'm currently working on a patch to reverse the pointers
between DICompileUnit and DISubprogram (for more info check Duncan's RFC
on lazy-loading of debug info metadata
http://lists.llvm.org/pipermail/llvm-dev/2016-March/097419.html).
The idea is to remove the list of subprograms from DICompileUnit and
instead point to the owning compile unit from each DISubprogram.
After doing this all DISubprograms fulfilling the above criteria will be
implicitly dropped unless we go through an extra effort to preserve them.

http://reviews.llvm.org/D18477
<rdar://problem/25256815>

llvm-svn: 265876
2016-04-09 18:10:22 +00:00
Sanjay Patel 4abae4e0fa [x86] use BMI 'andn' for logic + compare ops
With BMI, we can use 'andn' to save an instruction when the result is only used in a compare.
This is related to one of the potential sequences to check 'isfinite' in:
https://llvm.org/bugs/show_bug.cgi?id=27164

Differential Revision: http://reviews.llvm.org/D18910

llvm-svn: 265875
2016-04-09 16:02:52 +00:00
Adrian Prantl 5992a72b4d Support the Nodebug emission kind for DICompileUnits.
Sample-based profiling and optimization remarks currently remove
DICompileUnits from llvm.dbg.cu to suppress the emission of debug info
from them. This is somewhat of a hack and only borderline legal IR.

This patch uses the recently introduced NoDebug emission kind in
DICompileUnit to achieve the same result without breaking the Verifier.
A nice side-effect of this change is that it is now possible to combine
NoDebug and regular compile units under LTO.

http://reviews.llvm.org/D18808
<rdar://problem/25427165>

llvm-svn: 265861
2016-04-08 22:43:03 +00:00
Tim Shen 0012756489 [SSP] Remove llvm.stackprotectorcheck.
This is a cleanup patch for SSP support in LLVM. There is no functional change.
llvm.stackprotectorcheck is not needed, because SelectionDAG isn't
actually lowering it in SelectBasicBlock; rather, it adds check code in
FinishBasicBlock, ignoring the position where the intrinsic is inserted
(See FindSplitPointForStackProtector()).

llvm-svn: 265851
2016-04-08 21:26:31 +00:00
Kyle Butt 3232dbbf02 Codegen: Factor tail duplication into a utility class. NFC
This is in preparation for tail duplication during block placement. See D18226.
This needs to be a utility class for 2 reasons. No passes may run after block
placement, and also, tail-duplication affects subsequent layout decisions, so
it must be interleaved with placement, and can't be separated out into its own
pass. The original pass is still useful, and now runs by delegating to the
utility class.

llvm-svn: 265842
2016-04-08 20:35:01 +00:00
Nirav Dave 66f485f4e2 Fix Load Control Dependence in MemCpy Generation
In Memcpy lowering we had missed a dependence from the load of the
operation to successor operations. This causes us to potentially
construct an in initial DAG with a memory dependence not fully
represented in the chain sub-DAG but rather require looking at the
entire DAG breaking alias analysis by allowing incorrect repositioning
of memory operations.

To work around this, r200033 changed DAGCombiner::GatherAllAliases to be
conservative if any possible issues to happen. Unfortunately this check
forbade many non-problematic situations as well. For example, it's
common for incoming argument lowering to add a non-aliasing load hanging
off of EntryNode. Then, if GatherAllAliases visited EntryNode, it would
find that other (unvisited) use of the EntryNode chain, and just give up
entirely. Furthermore, the check was incomplete: it would not actually
detect all such potentially problematic DAG constructions, because
GatherAllAliases did not guarantee to visit all chain nodes going up to
the root EntryNode. This is in general fine -- giving up early will just
miss a potential optimization, not generate incorrect results. But, for
this non-chain dependency detection code, it's possible that you could
have a load attached to a higher-up chain node than any which were
visited. If that load aliases your store, but the only dependency is
through the value operand of a non-aliasing store, it would've been
missed by this code, and potentially reordered.

With the dependence added, this check can be removed and Alias Analysis
can be much more aggressive. This fixes code quality regression in the
Consecutive Store Merge cleanup (D14834).

Test Change:

ppc64-align-long-double.ll now may see multiple serializations
of its stores

Differential Revision: http://reviews.llvm.org/D18062

llvm-svn: 265836
2016-04-08 19:44:40 +00:00
Quentin Colombet ab8c21f72b [RegBankSelect] Use reverse post order traversal.
When assigning the register banks of an instruction, it is best to know
all the constraints of the input to have a good idea of how this will
impact the cost of the whole function.

llvm-svn: 265812
2016-04-08 17:19:10 +00:00
Quentin Colombet 88805c1917 [RegisterBankInfo] Change the implementation for the default mapping.
Do not give that much importance to the current register bank of an
operand. This is likely just a side effect of the current execution and
it is properly wise to prefer a register bank that can be extracted from
the information available statically (like encoding constraints and
type).

llvm-svn: 265810
2016-04-08 16:59:50 +00:00
Quentin Colombet 6d6d6af226 [RegBankSelect] Improve debug output.
Add verbose information when checking if the current and the desired
register banks match.
Detail what happens when we assign a register bank.

llvm-svn: 265804
2016-04-08 16:48:16 +00:00
Quentin Colombet 876ddf8107 [MIR] Teach the parser how to deal with register banks.
llvm-svn: 265802
2016-04-08 16:40:43 +00:00
Quentin Colombet c1c94bc2ca [MachineVerifier] Teach how to check some of the properties of generic
virtual registers.

Generic virtual registers:
- May not have a register class
- May not have a register bank
- If they do not have a register class they must have a size
- If they have a register bank, the size of the register bank must be
  greater or equal to the size of the virtual register (basically check
  that the virtual register will fit into that register class)

llvm-svn: 265798
2016-04-08 16:35:22 +00:00
Quentin Colombet fab1cfe673 [MIR] Teach the mir printer how to print the register bank.
For now, we put the register bank in the Class field since a register
may only have one of those at a given time. The downside of that
representation is that if a register class and a register bank have the
same name, we will not be able to distinguish them.

llvm-svn: 265796
2016-04-08 16:26:22 +00:00
Hans Wennborg 5a7723c7a2 Revert r265547 "Recommit r265309 after fixed an invalid memory reference bug happened"
It caused PR27275: "ARM: Bad machine code: Using an undefined physical register"

Also reverting the following commits that were landed on top:
r265610 "Fix the compare-clang diff error introduced by r265547."
r265639 "Fix the sanitizer bootstrap error in r265547."
r265657 "InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]"

llvm-svn: 265790
2016-04-08 15:17:43 +00:00
Chuang-Yu Cheng 98c1894755 CXX_FAST_TLS calling convention: performance improvement for PPC64
This is the same change on PPC64 as r255821 on AArch64. I have even borrowed
his commit message.

The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.

We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.

Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.

We add CSRsViaCopy, it will be explicitly handled during lowering.

1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
   supports it for the given machine function and the function has only return
   exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
   virtual registers at beginning of the entry block and copies from virtual
   registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.

Author: Tom Jablin (tjablin)
Reviewers: hfinkel kbarton cycheng

http://reviews.llvm.org/D17533

llvm-svn: 265781
2016-04-08 12:04:32 +00:00
Craig Topper 00230805f2 Use std::fill to simplify some code. NFC
llvm-svn: 265771
2016-04-08 07:10:46 +00:00
Quentin Colombet e57546de40 [TargetRegisterInfo] Re-apply r265734.
Original commit message:
[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator.

llvm-svn: 265764
2016-04-08 00:51:00 +00:00
Adrian Prantl 3e9c88753b DwarfDebug: Support floating point constants in location lists.
This patch closes a gap in the DWARF backend that caused LLVM to drop
debug info for floating point variables that were constant for part of
their scope. Floating point constants are emitted as one or more
DW_OP_constu joined via DW_OP_piece.

This fixes a regression caught by the LLDB testsuite that I introduced
in r262247 when we stopped blindly expanding the range of singular
DBG_VALUEs to span the entire scope and started to emit location lists
with accurate ranges instead.

Also deletes a now-impossible testcase (debug-loc-empty-entries).

<rdar://problem/25448338>

llvm-svn: 265760
2016-04-08 00:38:37 +00:00
Quentin Colombet e0a7ffa6cb Revert "[TargetRegisterInfo] Refactor the code to use BitMaskClassIterator."
This reverts commit r265734.
Looks like ASan is not happy about it.
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/11741

Looking.

llvm-svn: 265755
2016-04-08 00:03:51 +00:00
Quentin Colombet dcf5cf6a29 [RegisterBankInfo] Make the debug output more compact.
Print the mask of the partial mapping as an hexadecimal instead of a
binary value.

llvm-svn: 265754
2016-04-08 00:03:49 +00:00
Quentin Colombet e16f561d91 [RegBankSelect] Add a few debug statements.
llvm-svn: 265749
2016-04-07 23:53:55 +00:00
Quentin Colombet 9a2ae85e67 [RegisterBankInfo] Add print and dump method to the InstructionMapping
helper class.

llvm-svn: 265747
2016-04-07 23:31:58 +00:00
Quentin Colombet e087c9fc12 [RegisterBankInfo] Add print and dump method to the ValueMapping helper
class.

llvm-svn: 265746
2016-04-07 23:25:43 +00:00
Quentin Colombet 03c419628e [MachineInstr] Teach the print method about RegisterBank.
Properly print either the register class or the register bank or a
virtual register.
Get rid of a few ifdefs in the process.

llvm-svn: 265745
2016-04-07 23:18:11 +00:00
Quentin Colombet ac40034e06 [RegisterBankInfo] Strengthen getInstrMappingImpl.
Teach the target independent code how to take advantage of type
information to get the mapping of an instruction.

llvm-svn: 265739
2016-04-07 22:52:49 +00:00
Quentin Colombet e918006a87 [RegisterBankInfo] Add a way to record what register bank covers a
specific type.

This will be used to find the default mapping of the instruction.
Also, this information is recorded, instead of computed, because it is
expensive from a type to know which register bank maps it.
Indeed, we need to iterate through all the register classes of all the
register banks to find the one that maps the given type.

llvm-svn: 265736
2016-04-07 22:45:42 +00:00
Quentin Colombet c8d612f6fd [RegisterBankInfo] Introduce getRegBankFromConstraints as an helper
method.

NFC.

The refactoring intends to make the code more readable and expose
more features to potential derived classes.

llvm-svn: 265735
2016-04-07 22:35:03 +00:00
Quentin Colombet 2445dc1916 [TargetRegisterInfo] Refactor the code to use BitMaskClassIterator.
llvm-svn: 265734
2016-04-07 22:16:56 +00:00
Quentin Colombet cf477ffc58 [RegisterBankInfo] Refactor the code to use BitMaskClassIterator.
llvm-svn: 265733
2016-04-07 22:08:56 +00:00
Quentin Colombet aac71a4a0e [RegBankSelect] Reuse RegisterBankInfo logic to get to the register bank
from a register.
On top of duplicating the logic, it was buggy! It would assert on
physical registers, since MachineRegisterInfo does not have any
information regarding register classes/banks for them.

llvm-svn: 265727
2016-04-07 21:32:23 +00:00
Amaury Sechet c53ad4f3b2 Do not select EhPad BB in MachineBlockPlacement when there is regular BB to schedule
Summary:
EHPad BB are not entered the classic way and therefor do not need to be placed after their predecessors. This patch make sure EHPad BB are not chosen amongst successors to form chains, and are selected as last resort when selecting the best candidate.

EHPad are scheduled in reverse probability order in order to have them flow into each others naturally.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17625

llvm-svn: 265726
2016-04-07 21:29:39 +00:00
Quentin Colombet d4131814b3 [GlobalISel] Add RegBankSelect hooks into the pass pipeline.
Now, RegBankSelect will happen after the IRTranslation and the target
may optionally add additional passes in between.

llvm-svn: 265716
2016-04-07 20:27:33 +00:00
Quentin Colombet 40ad573d2c [RegBankSelect] Initial implementation for non-optimized output.
The pass walk through the machine function and assign the register banks
using the default mapping. In other words, there is no attempt to reduce
cross register copies.

llvm-svn: 265707
2016-04-07 18:19:27 +00:00
Quentin Colombet fe1ee4f9be [RegisterBankInfo] Provide a target independent helper function to guess
the mapping of an instruction on register bank.

For most instructions, it is possible to guess the mapping of the
instruciton by using the encoding constraints.
It remains instructions without encoding constraints.
For copy-like instructions, we try to propagate the information we get
from the other operands. Otherwise, the target has to give this
information.

llvm-svn: 265703
2016-04-07 18:01:19 +00:00
Quentin Colombet ee366eff44 [RegisterBankInfo] Change the signature of getSizeInBits to factor out
the access to MRI and TRI.

llvm-svn: 265701
2016-04-07 17:44:54 +00:00
Quentin Colombet 5b7ba5092c [RegisterBankInfo] Provide a default constructor for InstructionMapping
helper class.

The default constructor creates invalid (isValid() == false) instances
and may be used to communicate that a mapping was not found.

llvm-svn: 265699
2016-04-07 17:30:18 +00:00
Quentin Colombet c33085f2c6 [MachineRegisterInfo] Track register bank for virtual registers.
A virtual register may have either a register bank or a register class.
This is represented by a PointerUnion between the related classes.

Typically, a virtual register went through the following states
regarding register class and register bank:

1. Creation: None is set. Virtual registers are fully generic.
2. Register bank assignment: Register bank is set. Virtual registers
live into a register bank, but we do not know the constraints they need
to fulfil.
3. Instruction selection: Register class is set. Virtual registers are
bound by encoding constraints.

To map these states to GlobalISel, the IRTranslator implements #1,
RegBankSelect #2, and Select #3.

llvm-svn: 265696
2016-04-07 17:20:29 +00:00
Quentin Colombet d21115876c [RegisterBank] Rename RegisterBank::contains into RegisterBank::covers.
llvm-svn: 265695
2016-04-07 17:09:39 +00:00
Dmitry Polukhin a1feff7024 [GCC] Attribute ifunc support in llvm
This patch add support for GCC attribute((ifunc("resolver"))) for
targets that use ELF as object file format. In general ifunc is a
special kind of function alias with type @gnu_indirect_function. Patch
for Clang http://reviews.llvm.org/D15524

Differential Revision: http://reviews.llvm.org/D15525

llvm-svn: 265667
2016-04-07 12:32:19 +00:00
NAKAMURA Takumi e546211492 InlineSpiller.cpp: Escap \@ in r265547. [-Wdocumentation]
llvm-svn: 265657
2016-04-07 11:30:06 +00:00
Amaury Sechet 33c161c02f [BlockPlacement] Remove an unnecessary continue
NFC.

llvm-svn: 265643
2016-04-07 06:35:00 +00:00
Amaury Sechet 9ee4ddd710 [MBP] Remove an unused function parameter
NFC.

llvm-svn: 265642
2016-04-07 06:34:47 +00:00
Wei Mi 979e9756ec Fix the sanitizer bootstrap error in r265547.
The iterators of SmallPtrSet SpillsInSubTreeMap[Child].first may be
invalidated when SpillsInSubTreeMap grows. Rearrange the code to
ensure the grow of SpillsInSubTreeMap only happens before getting
the iterators of the SmallPtrSet.

llvm-svn: 265639
2016-04-07 05:27:17 +00:00
Amaury Sechet 41474a52e7 Revert "[BlockPlacement] Remove an unnecessary continue" and "[MBP] Remove an unused function parameter"
llvm-svn: 265638
2016-04-07 04:28:40 +00:00
Duncan P. N. Exon Smith da68cbc4ad IR: RF_IgnoreMissingValues => RF_IgnoreMissingLocals, NFC
Clarify what this RemapFlag actually means.

  - Change the flag name to match its intended behaviour.
  - Clearly document that it's not supposed to affect globals.
  - Add a host of FIXMEs to indicate how to fix the behaviour to match
    the intent of the flag.

RF_IgnoreMissingLocals should only affect the behaviour of
RemapInstruction for function-local operands; namely, for operands of
type Argument, Instruction, and BasicBlock.  Currently, it is *only*
passed into RemapInstruction calls (and the transitive MapValue calls
that it makes).

When I split Metadata from Value I didn't understand the flag, and I
used it in a bunch of places for "global" metadata.

This commit doesn't have any functionality change, but prepares to
cleanup MapMetadata and MapValue.

llvm-svn: 265628
2016-04-07 00:26:43 +00:00
Quentin Colombet 4359784c1b [RegisterBankInfo] Implement a target independent version of
getInstrMapping.

This implementation requires that the target implemented
getRegBankFromRegClass.
Indeed, the implementation uses the register classes for the encoding
constraints for the instructions to deduce the mapping of a value.

llvm-svn: 265624
2016-04-07 00:07:50 +00:00
Quentin Colombet 8c0d66bc54 [RegisterBankInfo] Add an helper function to get the size of a register.
The previous method to get the size was too simple and could fail for
physical registers.

llvm-svn: 265620
2016-04-06 23:59:53 +00:00
Wei Mi 284fa0bd71 Fix the compare-clang diff error introduced by r265547.
Use MapVector instead of DenseMap for MergeableSpillsMap so it will be
iterated in determined order.

llvm-svn: 265610
2016-04-06 22:31:17 +00:00
Quentin Colombet c916204a81 [RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank.
This will be used by the register bank select pass to assign register banks
for generic virtual registers.

This was originally committed as r265573 but broke at least one windows bot.
The problem with the windows bot was that it was using a copy constructor for
the InstructionMappings class and could not synthesize it. Actually, the fact
that this class is not copy constructable is expected and the compiler should
use the move assignment constructor. Marking the problematic assignment
explicitly as using the move constructor has its own problems.

Indeed, with recent clang we get a warning that we may prevent the elision of
the copy by the compiler. A proper fix for both compilers would be to change the
API of getPossibleInstrMapping to take a InstructionMappings as input/output
parameter. This does not feel natural and since GISel is not used on windows
yet, I chose to workaround the problem by not compiling the problematic code on
windows.

llvm-svn: 265604
2016-04-06 21:37:22 +00:00
JF Bastien 800f87a871 NFC: make AtomicOrdering an enum class
Summary:
In the context of http://wg21.link/lwg2445 C++ uses the concept of
'stronger' ordering but doesn't define it properly. This should be fixed
in C++17 barring a small question that's still open.

The code currently plays fast and loose with the AtomicOrdering
enum. Using an enum class is one step towards tightening things. I later
also want to tighten related enums, such as clang's
AtomicOrderingKind (which should be shared with LLVM as a 'C++ ABI'
enum).

This change touches a few lines of code which can be improved later, I'd
like to keep it as NFC for now as it's already quite complex. I have
related changes for clang.

As a follow-up I'll add:
  bool operator<(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>(AtomicOrdering, AtomicOrdering) = delete;
  bool operator<=(AtomicOrdering, AtomicOrdering) = delete;
  bool operator>=(AtomicOrdering, AtomicOrdering) = delete;
This is separate so that clang and LLVM changes don't need to be in sync.

Reviewers: jyknight, reames

Subscribers: jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D18775

llvm-svn: 265602
2016-04-06 21:19:33 +00:00
Haicheng Wu 1951cf24a7 [MBP] Remove an unused function parameter
NFC.

llvm-svn: 265596
2016-04-06 20:38:20 +00:00
Quentin Colombet fb000583aa Revert "[RegisterBankInfo] Add methods to get the possible mapping of an
instruction on a register bank. This will be used by the register bank select
pass to assign register banks for generic virtual registers." and the follow-on
commits while I find out a way to fix the win7 bot:
http://lab.llvm.org:8011/builders/sanitizer-windows/builds/19882

This reverts commit r265578, r265581, r265584, and r265585.

llvm-svn: 265587
2016-04-06 19:04:58 +00:00
Evgeniy Stepanov 268826a287 [gold] Save bitcode for module partitions (save-temps + split codegen).
llvm-svn: 265583
2016-04-06 18:32:13 +00:00
Quentin Colombet df4aee09f8 [RegisterBankInfo] Provide a default constructor for InstructionMapping
helper class.

The default constructor creates invalid (isValid() == false) instances
and may be used to communicate that a mapping was not found.

llvm-svn: 265581
2016-04-06 18:24:34 +00:00
Quentin Colombet bb756dbf39 [RegisterBankInfo] Add an helper function to get the size of a register.
The previous method to get the size was too simple and could fail for
physical registers.

llvm-svn: 265578
2016-04-06 18:04:35 +00:00
Quentin Colombet 9af77135e5 [RegisterBankInfo] Add methods to get the possible mapping of an instruction on a register bank.
This will be used by the register bank select pass to assign register banks
for generic virtual registers.

llvm-svn: 265573
2016-04-06 17:45:40 +00:00
Quentin Colombet 4812c91f56 [RegisterBankInfo] Implement the verify method of the InstructionMapping helper class.
This checks that all the register operands get a proper mapping.

llvm-svn: 265563
2016-04-06 17:01:43 +00:00
Quentin Colombet 3768f7005d [RegisterBankInfo] Implement the verify method for the ValueMapping helper class.
The method checks that the value is fully defined accross the different partial
mappings and that the partial mappings are compatible between each other.

llvm-svn: 265556
2016-04-06 16:40:23 +00:00
Quentin Colombet 2423fc419c [RegisterBankInfo] Add a verify method for the PartialMapping helper class.
This verifies that the PartialMapping can be accomadated into the related
register bank.

llvm-svn: 265555
2016-04-06 16:33:26 +00:00
Quentin Colombet 89c33caee3 [RegisterBankInfo] Add a couple of helper classes for the future cost model.
llvm-svn: 265553
2016-04-06 16:27:01 +00:00
Quentin Colombet 911181882e [RegisterBankInfo] Inline the destructor to avoid link-time error when GlobalISel is not built.
llvm-svn: 265548
2016-04-06 15:47:17 +00:00
Wei Mi 18293bef4e Recommit r265309 after fixed an invalid memory reference bug happened
when DenseMap growed and moved memory. I verified it fixed the bootstrap
problem on x86_64-linux-gnu but I cannot verify whether it fixes
the bootstrap error on clang-ppc64be-linux. I will watch the build-bot
result closely.

Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.

analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.

To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.

Differential Revision: http://reviews.llvm.org/D15302

llvm-svn: 265547
2016-04-06 15:41:07 +00:00
Matthias Braun 7dc03f060e RegisterScavenger: Take a reference as enterBasicBlock() argument.
Make it obvious that the argument cannot be nullptr.
Remove an unnecessary nullptr check in initRegState.

llvm-svn: 265511
2016-04-06 02:47:09 +00:00
Matthias Braun 3bb0fcc118 LivePhysRegs: Remove redundant check
llvm-svn: 265509
2016-04-06 02:46:04 +00:00
Sanjoy Das 65a60670e8 Lower @llvm.experimental.deoptimize as a noreturn call
While preserving the return value for @llvm.experimental.deoptimize at
the IR level is useful during mid-level optimization, doing so at the
machine instruction level requires generating some extra code and a
return that is non-ideal.  This change has LLVM lower

```
  %val = call @llvm.experimental.deoptimize
  ret %val
```

to effectively

```
  call @__llvm_deoptimize()
  unreachable
```

instead.

llvm-svn: 265502
2016-04-06 01:33:49 +00:00
Quentin Colombet 06bdd3c914 [RegisterBankInfo] Simplify the API for build a register bank.
As part of the TRI argument of addRegBankCoverage we already have access to
the TargetRegisterClass through the ID of that register class.
Therefore, there is no point in needing a TargetRegisterClass instance,
the ID is enough to get to it.

llvm-svn: 265487
2016-04-05 23:26:39 +00:00
Evgeniy Stepanov dde29e2799 Faster stack-protector for Android/AArch64.
Bionic has a defined thread-local location for the stack protector
cookie. Emit a direct load instead of going through __stack_chk_guard.

llvm-svn: 265481
2016-04-05 22:41:50 +00:00
Quentin Colombet 64bba01a63 [RegisterBank] Implement the verify method to check for the obvious mistakes.
llvm-svn: 265479
2016-04-05 22:34:01 +00:00
Quentin Colombet 0195826998 [RegisterBankInfo] Add debug print to check how the initialization is going.
llvm-svn: 265475
2016-04-05 21:47:56 +00:00
Quentin Colombet c94fbee9f6 [RegisterBank] Add printable capabilities for future debugging.
llvm-svn: 265473
2016-04-05 21:40:43 +00:00
Quentin Colombet 85689d934a [RegisterBankInfo] Make addRegBankCoverage more capable to ease
targeting jobs.
Now, addRegBankCoverage also adds the subreg-classes not just the
sub-classes of the given register class.

llvm-svn: 265469
2016-04-05 21:20:12 +00:00
Quentin Colombet d347d695c2 [RegisterBankInfo] Implement the methods to create register banks.
llvm-svn: 265464
2016-04-05 21:06:15 +00:00
Quentin Colombet c4db2ad5b8 [RegisterBank] Provide a way to check if a register bank is valid.
Change the default constructor to create invalid object.
The target will have to properly initialize the register banks before
using them.

llvm-svn: 265460
2016-04-05 20:48:32 +00:00
Quentin Colombet b235d32e74 [GlobalISel] Add the RegisterBankInfo class for the handling of register banks.
llvm-svn: 265449
2016-04-05 20:02:47 +00:00
Quentin Colombet bdc3b4d523 [GlobalISel] Add a class, RegisterBank, to represent register banks.
llvm-svn: 265445
2016-04-05 19:54:44 +00:00
Quentin Colombet 8e8e85c19f [GlobalISel] Add the skeleton of the RegBankSelect pass.
This pass is reponsible for assigning the generic virtual registers to register
banks.

llvm-svn: 265440
2016-04-05 19:06:01 +00:00
Manman Ren e221a870d3 Swift Calling Convention: swifterror target-independent change.
At IR level, the swifterror argument is an input argument with type
ErrorObject**. For targets that support swifterror, we want to optimize it
to behave as an inout value with type ErrorObject*; it will be passed in a
fixed physical register.

The main idea is to track the virtual registers for each swifterror value. We
define swifterror values as AllocaInsts with swifterror attribute or a function
argument with swifterror attribute.

In SelectionDAGISel.cpp, we set up swifterror values (SwiftErrorVals) before
handling the basic blocks.

When iterating over all basic blocks in RPO, before actually visiting the basic
block, we call mergeIncomingSwiftErrors to merge incoming swifterror values when
there are multiple predecessors or to simply propagate them. There, we create a
virtual register for each swifterror value in the entry block. For predecessors
that are not yet visited, we create virtual registers to hold the swifterror
values at the end of the predecessor. The assignments are saved in
SwiftErrorWorklist and will be materialized at the end of visiting the basic
block.

When visiting a load from a swifterror value, we copy from the current virtual
register assignment. When visiting a store to a swifterror value, we create a
virtual register to hold the swifterror value and update SwiftErrorMap to
track the current virtual register assignment.

Differential Revision: http://reviews.llvm.org/D18108

llvm-svn: 265433
2016-04-05 18:13:16 +00:00
Haicheng Wu 3618fa786f [BlockPlacement] Remove an unnecessary continue
NFC.

llvm-svn: 265407
2016-04-05 15:37:08 +00:00
Chuang-Yu Cheng d3fb38cae5 Don't delete empty preheaders in CodeGenPrepare if it would create a critical edge
Presently, CodeGenPrepare deletes all nearly empty (only phi and branch)
basic blocks. This pass can delete loop preheaders which frequently creates
critical edges. A preheader can be a convenient place to spill registers to
the stack. If the entrance to a loop body is a critical edge, then spills
may occur in the loop body rather than immediately before it. This patch
protects loop preheaders from deletion in CodeGenPrepare even if they are
nearly empty.

Since the patch alters the CFG, it affects a large number of test cases.
In most cases, the changes are merely cosmetic (basic blocks have different
names or instruction orders change slightly). I am somewhat concerned about
the test/CodeGen/Mips/brdelayslot.ll test case. If the loop preheader is not
deleted, then the MIPS backend does not take advantage of a branch delay
slot. Consequently, I would like some close review by a MIPS expert.

The patch also partially subsumes D16893 from George Burgess IV. George
correctly notes that CodeGenPrepare does not actually preserve the dominator
tree. I think the dominator tree was usually not valid when CodeGenPrepare
ran, but I am using LoopInfo to mark preheaders, so the dominator tree is
now always valid before CodeGenPrepare.

Author: Tom Jablin (tjablin)
Reviewers: hfinkel george.burgess.iv vkalintiris dsanders kbarton cycheng

http://reviews.llvm.org/D16984

llvm-svn: 265397
2016-04-05 14:06:20 +00:00
Dmitry Polukhin a3d5b0b218 [IFUNC] Use GlobalIndirectSymbol when aliases and ifuncs have something similar
Second part extracted from http://reviews.llvm.org/D15525

Use GlobalIndirectSymbol in all cases when aliases and ifuncs have
something in common.

Differential Revision: http://reviews.llvm.org/D18754

llvm-svn: 265382
2016-04-05 08:47:51 +00:00
Sanjay Patel 769b5fd546 fix typos; NFC
llvm-svn: 265356
2016-04-04 22:45:56 +00:00
Justin Bogner 35c6903f22 Revert "CodeGen: Remove dead code in TailDuplicate"
It seems this is reachable after all. It hit on 7zip-benchmark in lnt
on ppc64:

  http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/2317

This reverts r265347.

llvm-svn: 265352
2016-04-04 21:41:54 +00:00
Matthias Braun 7511abd5c1 MachineScheduler: Ignore COPYs with undef/dead op in CopyConstrain mutation.
There is no problem with the code today, but the fix will avoid a crash
in test/CodeGen/AMDGPU/subreg-coalescer-undef-use.ll once the
DetectDeadLanes pass is added.

llvm-svn: 265351
2016-04-04 21:23:46 +00:00
Justin Bogner 9ab8131a57 CodeGen: Remove dead code in TailDuplicate
I noticed that this isn't covered by our existing tests and spent some
time trying to come up with an example it actually hits. I tried hand
rolling something based on the explanation in the comment, but couldn't
get anything that didn't abort tail duplication earlier for one reason
or another.

Then, I tried cranking tail-dup-size cranked up so this would fire
more and ran a bootstrap of clang and the nightly test suite - those
don't hit this either.

This reverts r132816 and replaces it with an assert.

llvm-svn: 265347
2016-04-04 21:11:40 +00:00
Chandler Carruth 613eec8210 Revert r263460: [SpillPlacement] Fix a quadratic behavior in spill placement.
That commit looks wonderful and awesome. Sadly, it greatly exacerbates
PR17409 and effectively regresses build time for a lot of (very large)
code when compiled with ASan or MSan.

We thought this could be fixed forward by landing D15302 which at last
fixes that PR, but some issues were discovered and it looks like that
got reverted, so reverting this as well temporarily. As soon as the fix
for PR17409 lands and sticks, we should re-land this patch as it won't
trigger more significant test cases hitting that bug.

Many thanks to Quentin and Wei here as they're doing all the awesome
hard work!!!

llvm-svn: 265331
2016-04-04 18:57:50 +00:00
Matthias Braun 870c34f0cf ARM, AArch64, X86: Check preserved registers for tail calls.
We can only perform a tail call to a callee that preserves all the
registers that the caller needs to preserve.

This situation happens with calling conventions like preserver_mostcc or
cxx_fast_tls. It was explicitely handled for fast_tls and failing for
preserve_most. This patch generalizes the check to any calling
convention.

Related to rdar://24207743

Differential Revision: http://reviews.llvm.org/D18680

llvm-svn: 265329
2016-04-04 18:56:13 +00:00
Derek Schuff 73900c6876 Replace MachineRegisterInfo::isSSA() with a MachineFunctionProperty
Use the MachineFunctionProperty mechanism to indicate whether a MachineFunction
is in SSA form instead of a custom method on MachineRegisterInfo. NFC

Differential Revision: http://reviews.llvm.org/D18574

llvm-svn: 265318
2016-04-04 18:03:29 +00:00
Wei Mi fb5252cac1 Revert r265309 and r265312 because they caused some errors I need to investigate.
llvm-svn: 265317
2016-04-04 17:45:03 +00:00
Derek Schuff 1dbf7a571f Add MachineFunctionProperty checks for AllVRegsAllocated for target passes
Summary:
This adds the same checks that were added in r264593 to all
target-specific passes that run after register allocation.

Reviewers: qcolombet

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D18525

llvm-svn: 265313
2016-04-04 17:09:25 +00:00
Wei Mi cdaf1df657 Fix unused var warning caused by r265309.
llvm-svn: 265312
2016-04-04 17:03:58 +00:00
Wei Mi ffbc9c7f3b Replace analyzeSiblingValues with new algorithm to fix its compile
time issue. The patch is to solve PR17409 and its duplicates.

analyzeSiblingValues is a N x N complexity algorithm where N is
the number of siblings generated by reg splitting. Although it
causes siginificant compile time issue when N is large, it is also
important for performance since it removes redundent spills and
enables rematerialization.

To solve the compile time issue, the patch removes analyzeSiblingValues
and replaces it with lower cost alternatives containing two parts. The
first part creates a new spill hoisting method in postOptimization of
register allocation. It does spill hoisting at once after all the spills
are generated instead of inside every instance of selectOrSplit. The
second part queries the define expr of the original register for
rematerializaiton and keep it always available during register allocation
even if it is already dead. It deletes those dead instructions only in
postOptimization. With the two parts in the patch, it can remove
analyzeSiblingValues without sacrificing performance.

Differential Revision: http://reviews.llvm.org/D15302

llvm-svn: 265309
2016-04-04 16:42:40 +00:00
Peter Zotov 8efe38a1e2 [CodeGenPrepare] Fix r265264 (again).
Don't require TLI for SinkCmpExpression, like it wasn't before
r265264.

llvm-svn: 265271
2016-04-03 19:32:13 +00:00
Peter Zotov f87e550e89 [CodeGenPrepare] Fix r265264.
The case where there was no TargetLowering was not handled,
leading to null pointer dereferences.

llvm-svn: 265265
2016-04-03 17:11:53 +00:00
Peter Zotov 0b6d7bc682 [CodeGenPrepare] Avoid sinking soft-FP comparisons
Sinking comparisons in CGP can undo the job of hoisting them done
earlier by LICM, and soft-FP makes this an expensive mistake.

A common pattern that produces floating point comparisons uniform
over a loop is an explicit check for division by zero. If the divisor
is hoisted out of the loop, the comparison can also be, but hoisting
the function that unwinds is never legal, since it may cause side
effects in the loop body prior to the unwinding to not be executed.

Differential Revision: http://reviews.llvm.org/D18744

llvm-svn: 265264
2016-04-03 16:36:17 +00:00
Manman Ren 9bfd0d03e9 Swift Calling Convention: add swifterror attribute.
A ``swifterror`` attribute can be applied to a function parameter or an
AllocaInst.

This commit does not include any target-specific change. The target-specific
optimization will come as a follow-up patch.

Differential Revision: http://reviews.llvm.org/D18092

llvm-svn: 265189
2016-04-01 21:41:15 +00:00
Sanjoy Das 9d41a8f269 Don't use an i64 return type with webkit_jscc
Re-enable an assertion enabled by Justin Lebar in rL265092.  rL265092
was breaking test/CodeGen/X86/deopt-intrinsic.ll because webkit_jscc
does not like non-i64 return types.  Change the test case to not do
that.

llvm-svn: 265099
2016-04-01 02:51:21 +00:00
Chuang-Yu Cheng 35c6181982 Fix Sub-register Rewriting in Aggressive Anti-Dependence Breaker
Previously, HandleLastUse would delete RegRef information for sub-registers
if they were dead even if their corresponding super-register were still live.

If the super-register were later renamed, then the definitions of the
sub-register would not be updated appropriately. This patch alters the
behavior so that RegInfo information for sub-registers is only deleted when
the sub-register and super-register are both dead.

This resolves PR26775. This is the mirror image of Hal's r227311 commit.

Author: Tom Jablin (tjablin)
Reviewers: kbarton uweigand nemanjai hfinkel

http://reviews.llvm.org/D18448

llvm-svn: 265097
2016-04-01 02:05:29 +00:00
Justin Lebar 98981e5573 Revert "Protect some assertions with NDEBUG rather than DEBUG()."
This reverts r265092, because it breaks CodeGen/X86/deopt-intrinsic.ll.

llvm-svn: 265093
2016-04-01 01:23:23 +00:00
Justin Lebar c814e8e4ab Protect some assertions with NDEBUG rather than DEBUG().
DEBUG() only runs if you pass -debug, but these assertions are generally
useful.

llvm-svn: 265092
2016-04-01 01:09:12 +00:00
Justin Lebar acc47105f8 [ifcnv] Add brief comment explaining what ifcnv is.
llvm-svn: 265088
2016-04-01 01:09:03 +00:00
Adrian Prantl b939a25707 Move the DebugEmissionKind enum from DIBuilder into DICompileUnit.
This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder
into DICompileUnit. DIBuilder is not the right place for this enum to live
in — a metadata consumer should not have to include DIBuilder.h.
I also added a Verifier check that checks that the emission kind of a
DICompileUnit is actually legal.

http://reviews.llvm.org/D18612
<rdar://problem/25427165>

llvm-svn: 265077
2016-03-31 23:56:58 +00:00
Tim Shen 800ed436e5 [AsmPrinter] Print aliases in topological order
Print aliases in topological order, that is, for any alias a = b,
b must be printed before a. This is because on some targets (e.g. PowerPC)
linker expects aliases in such an order to generate correct TOC information.

GCC also prints aliases in topological order.

llvm-svn: 265064
2016-03-31 22:08:19 +00:00
Chandler Carruth b472856a73 Fix PR26940 where compiles times regressed massively.
Patch by Jonas Paulsson. Original description:
Bugfix in buildSchedGraph() to make -dag-maps-huge-region work properly

I found that the reduction of the maps did in fact never happen in this
test case. This was because *all* the stores / loads were made with
addresses from arguments and they thus became "unknown" stores / loads.
Fixed by removing continue statements and making sure that the test for
reduction always takes place.

Differential Revision: http://reviews.llvm.org/D18673

llvm-svn: 265063
2016-03-31 21:55:58 +00:00
Sanjay Patel 4d71160d5d fix typo; NFC
llvm-svn: 265054
2016-03-31 21:00:48 +00:00
Hans Wennborg e1a2e90ffa Change eliminateCallFramePseudoInstr() to return an iterator
This will become necessary in a subsequent change to make this method
merge adjacent stack adjustments, i.e. it might erase the previous
and/or next instruction.

It also greatly simplifies the calls to this function from Prolog-
EpilogInserter. Previously, that had a bunch of logic to resume iteration
after the call; now it just continues with the returned iterator.

Note that this changes the behaviour of PEI a little. Previously,
it attempted to re-visit the new instruction created by
eliminateCallFramePseudoInstr(). That code was added in r36625,
but I can't see any reason for it: the new instructions will obviously
not be pseudo instructions, they will not have FrameIndex operands,
and we have already accounted for the stack adjustment.

Differential Revision: http://reviews.llvm.org/D18627

llvm-svn: 265036
2016-03-31 18:33:38 +00:00
Stephan Bergmann 480de227f6 Don't use potentially invalidated iterator
If the lhs is evaluated before the rhs, FuncletI's operator-> can trigger the

  assert(isHandleInSync() && "invalid iterator access!");

at include/llvm/ADT/DenseMap.h:1061.  (Happens e.g. when compiled with GCC 6.)

Differential Revision: http://reviews.llvm.org/D18440

llvm-svn: 265024
2016-03-31 15:42:01 +00:00
Nirav Dave 83ce54aac2 Prevent X86ISelLowering from merging volatile loads
Change isConsecutiveLoads to check that loads are non-volatile as this
is a requirement for any load merges. Propagate change to two callers.

Reviewers: RKSimon

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18546

llvm-svn: 265013
2016-03-31 13:40:55 +00:00
Matthias Braun 8d41436004 CodeGen: Factor out code for tail call result compatibility check; NFC
llvm-svn: 264959
2016-03-30 22:46:04 +00:00
Matt Arsenault 46ba31650e LegalizeDAG: Don't replace vector store with integer if not legal
For the same reason as the corresponding load change.

Note that ExpandStore is completely broken for non-byte sized element
vector stores, but preserve the current broken behavior which has tests
for it. The behavior should be the same, but now introduces a new typed
store that is incorrectly split later rather than doing it directly.

llvm-svn: 264928
2016-03-30 21:15:18 +00:00
Matt Arsenault a4b1b6ea05 LegalizeDAG: Don't replace vector load with integer unless legal
On AMDGPU we want to be able to promote i64/f64 loads to v2i32.
If the access is unaligned, this would conclude that since i64 is legal,
it would convert it back to i64 and there is an endless legalization
loop.

Extract the logic for scalarizing the load into a new TargetLowering
function, where this can also replace the custom function AMDGPU
has for this.

llvm-svn: 264927
2016-03-30 21:15:10 +00:00
Fiona Glaser 44a2f7a298 MachineSink: make shouldSink a TII target hook
Some targets may disagree on what they want sunk or not sunk,
so make this a target hook instead of hardcoded.

llvm-svn: 264799
2016-03-29 22:44:57 +00:00
Derek Schuff 07636cd5e7 Add a print method to MachineFunctionProperties for better error messages
This makes check failures much easier to understand.
Make it empty (but leave it in the class) for NDEBUG builds.

Differential Revision: http://reviews.llvm.org/D18529

llvm-svn: 264780
2016-03-29 20:28:20 +00:00
Matthias Braun 72a58c3e28 MachineVerifier: On dead-def live segments, check that corresponding machine operand has a dead flag
llvm-svn: 264769
2016-03-29 19:07:43 +00:00
Matthias Braun 1c20c8280a LiveVariables: Fix typo and shorten comment
llvm-svn: 264768
2016-03-29 19:07:40 +00:00
Nirav Dave 2aab7f4358 Add support for no-jump-tables
Add function soft attribute to the generation of Jump Tables in CodeGen
as initial step towards clang support of gcc's no-jump-table support

Reviewers: hans, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18321

llvm-svn: 264756
2016-03-29 17:46:23 +00:00
Derek Schuff 42666eeea2 Add MachineVerifier check for AllVRegsAllocated MachineFunctionProperty
Summary:
Check that any function that has the property set is free of virtual
register operands.

Also, it is actually VirtRegMap (and not the register allocators) that
acutally remove the VReg operands (except for RegAllocFast).

Reviewers: qcolombet

Subscribers: MatzeB, llvm-commits, qcolombet

Differential Revision: http://reviews.llvm.org/D18535

llvm-svn: 264755
2016-03-29 17:40:22 +00:00
Manman Ren f46262e0b7 Swift Calling Convention: add swiftself attribute.
Differential Revision: http://reviews.llvm.org/D17866

llvm-svn: 264754
2016-03-29 17:37:21 +00:00
Matthias Braun f54530ef00 RegisterPressure: Simplify liveness tracking when lanemasks are not checked.
Split RegisterOperands code that collects defs/uses into a variant with
and without lanemask tracking. This is a bit of code duplication, but
there are enough subtle differences between the two variants that this
seems cleaner (and potentially faster).

This also fixes a problem where lanes where tracked even though
TrackLaneMasks was false. This is part of the fix for
http://llvm.org/PR27106. I will commit the testcase when it is
completely fixed.

llvm-svn: 264696
2016-03-29 03:54:22 +00:00
Matthias Braun 82cff88691 LiveVariables: Do not remove dead flags from vreg operands
Also add a FIXME comment on why Mips RDDSP causes bogus dead flags to be
added which LiveVariables cleans up by accident.

llvm-svn: 264695
2016-03-29 03:08:18 +00:00
Kyle Butt 5e241b11ed [Codegen] Decrease minimum jump table density.
Minimum density for both optsize and non optsize are now options
-sparse-jump-table-density (default 10) for non optsize functions
-dense-jump-table-density (default 40) for optsize functions, which
matches the current default. This improves several benchmarks at google
at the cost of a small codesize increase. For code compiled with -Os,
the old behavior continues

llvm-svn: 264689
2016-03-29 00:23:41 +00:00
Matthias Braun b74eb41d58 MIRParser: Add %subreg.xxx syntax for subregister index operands
Differential Revision: http://reviews.llvm.org/D18279

llvm-svn: 264608
2016-03-28 18:18:46 +00:00
Derek Schuff ad154c837e Introduce MachineFunctionProperties and the AllVRegsAllocated property
MachineFunctionProperties represents a set of properties that a MachineFunction
can have at particular points in time. Existing examples of this idea are
MachineRegisterInfo::isSSA() and MachineRegisterInfo::tracksLiveness() which
will eventually be switched to use this mechanism.
This change introduces the AllVRegsAllocated property; i.e. the property that
all virtual registers have been allocated and there are no VReg operands
left.

With this mechanism, passes can declare that they require a particular property
to be set, or that they set or clear properties by implementing e.g.
MachineFunctionPass::getRequiredProperties(). The MachineFunctionPass base class
verifies that the requirements are met, and handles the setting and clearing
based on the delcarations. Passes can also directly query and update the current
properties of the MF if they want to have conditional behavior.

This change annotates the target-independent post-regalloc passes; future
changes will also annotate target-specific ones.

Reviewers: qcolombet, hfinkel

Differential Revision: http://reviews.llvm.org/D18421

llvm-svn: 264593
2016-03-28 17:05:30 +00:00
James Y Knight 01f2ca5612 NFC: skip FenceInst up-front in AtomicExpandPass.
llvm-svn: 264583
2016-03-28 15:05:30 +00:00
JF Bastien a874d1a40d Revert "NFC: static_assert instead of comment"
This reverts commit fa36fcff16c7d4f78204d6296bf96c3558a4a672.

Causes the following Windows failure:

  C:\Buildbot\Slave\llvm-clang-lld-x86_64-scei-ps4-windows10pro-fast\llvm.src\lib\CodeGen\MachineInstr.cpp(762):
  error C2338: must be trivially copyable to memmove

llvm-svn: 264516
2016-03-26 18:20:02 +00:00
JF Bastien d4ff3360ae NFC: static_assert instead of comment
Summary: isPodLike is as close as we have for is_trivially_copyable.

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18483

llvm-svn: 264515
2016-03-26 18:14:27 +00:00
Junmo Park a26e93bcec Minor code cleanup. NFC.
llvm-svn: 264505
2016-03-26 06:04:55 +00:00
Jun Bum Lim 36c53fe147 [MachineCopyPropagation] Expose more dead copies across instructions with regmasks
When encountering instructions with regmasks, instead of cleaning up all the
elements in MaybeDeadCopies map, remove only the instructions erased. By keeping
more instruction in MaybeDeadCopies, this change will expose more dead copies
across instructions with regmasks.

llvm-svn: 264462
2016-03-25 21:15:35 +00:00
Nirav Dave fa250cad37 Prevent construction of cycle in DAG store merge
When merging stores in DAGCombiner, add check to ensure that no
dependenices exist that would cause the construction of a cycle in our
DAG.  This may happen if one store has a data dependence on another
instruction (e.g. a load) which itself has a (chain) dependence on
another store being merged. These stores cannot be merged safely and
doing so results in a cycle that is discovered in LegalizeDAG.

This test is only done in cases where Antialias analysis is used (UseAA)
as non-AA store merge candidates will be merged logically after all
loads which have been checked to not alias.

Reviewers: ahatanak, spatel, niravd, arsenm, hfinkel, tstellarAMD, jyknight

Subscribers: llvm-commits, tberghammer, danalbert, srhines

Differential Revision: http://reviews.llvm.org/D18336

llvm-svn: 264461
2016-03-25 21:06:30 +00:00
Justin Bogner ec0e7d2582 CodeGen: Don't iterate over operands after we've erased an MI
This fixes a use-after-free introduced 3 years ago, in r182872 ;)

The code more or less worked because the memory that CopyMI was
pointing to happened to still be valid, but lots of tests would crash
if you ran under ASAN with the recycling allocator changes from
llvm.org/PR26808

llvm-svn: 264455
2016-03-25 20:03:28 +00:00
Justin Bogner ec5ea36891 CodeGen: Fix a use-after-free in TII
Found by ASAN with the recycling allocator changes from PR26808.

llvm-svn: 264443
2016-03-25 18:38:48 +00:00
Reid Kleckner f6f04f8fc8 Consider regmasks when computing register-based DBG_VALUE live ranges
Now register parameters that aren't saved to the stack or CSRs are
considered dead after the first call. Previously the debugger would show
whatever was in the register.

Fixes PR26589

Reviewers: aprantl

Differential Revision: http://reviews.llvm.org/D17211

llvm-svn: 264429
2016-03-25 17:54:46 +00:00
Manman Ren 9dd8c14674 CXX TLS: collect return blocks after SelectAllBasicBlocks.
It is incorrect to get the corresponding MBB for a ReturnInst before
SelectAllBasicBlocks since SelectAllBasicBlocks can change the
correspondence between a ReturnInst and the MBB it is in.

PR27062

llvm-svn: 264358
2016-03-24 23:21:29 +00:00
Sanjoy Das fd3eaa8c5c Reduce code duplication by extracting out a helper function; NFC
llvm-svn: 264355
2016-03-24 22:51:49 +00:00
Sanjoy Das 731c67fed2 Lower varargs correctly in deopt bundle lowering
Earlier we were ignoring varargs in LowerCallSiteWithDeoptBundle because
populateCallLoweringInfo does not set CallLoweringInfo::IsVarArg.

llvm-svn: 264354
2016-03-24 22:37:52 +00:00
Matthias Braun ae81c29352 LiveInterval: Fix Distribute() failing on liveranges with unused VNInfos
This fixes http://llvm.org/PR26991

llvm-svn: 264345
2016-03-24 21:41:38 +00:00
Reid Kleckner 01bc66a8ce Revert "Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26942 (the fix is included in this commit)."
This reverts commit r264280.

This broke building Chromium for iOS. We'll upload a reproducer to the
PR soon.

llvm-svn: 264334
2016-03-24 20:38:49 +00:00
Sanjoy Das df9ae70f49 Add lowering support for llvm.experimental.deoptimize
Summary:
Only adds support for "naked" calls to llvm.experimental.deoptimize.
Support for round-tripping through RewriteStatepointsForGC will come
as a separate patch (should be simpler than this one).

Reviewers: reames

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18429

llvm-svn: 264329
2016-03-24 20:23:29 +00:00
Sanjoy Das c0c59fe14e [Statepoints] Fix yet another issue around gc pointer uniqueing
Given that StatepointLowering now uniques derived pointers before
putting them in the per-statepoint spill map, we may end up with missing
entries for derived pointers when we visit a gc.relocate on a pointer
that was de-duplicated away.

Fix this by keeping two maps, one mapping gc pointers to their
de-duplicated values, and one mapping a de-duplicated value to the slot
it is spilled in.

llvm-svn: 264320
2016-03-24 18:57:39 +00:00
Sanjoy Das 42f91a9959 Minor cosmestic changes (NFC)
- Reflow comments
 - Rename function

llvm-svn: 264319
2016-03-24 18:57:31 +00:00
David Blaikie 0b214e4a2a [debuginfo] Include dwo_name in the split unit to improve dwp diagnostics
When multiple DWP files are merged together and duplicate DWO IDs are
found it's currently difficult to give an actionable error message - the
DW_AT_name of the CU could be provided, but might be identical (if the
same source file is built into two different configurations), which
doesn't help the user identify the problem.

When no intermediate DWP files are generated, the path to the two DWO
files could be provided - but is lost once the DWOs are merged into a
DWP.

So, include the name of the DWO (dwo_name) in the split file so that
collissions involving a source CU from a DWP can be better diagnosed.

(improvements to llvm-dwp using this to come shortly)

llvm-svn: 264316
2016-03-24 18:37:08 +00:00
Tim Northover 4498eff9bb CodeGen: extend RHS when splitting ATOMIC_CMP_SWAP_WITH_SUCCESS.
If the operation's type has been promoted during type legalization, we
need to account for the fact that the high bits of the comparison
operand are likely unspecified.

The LHS is usually zero-extended, but MIPS sign extends it, so we have
to be slightly careful.

Patch by Simon Dardis.

llvm-svn: 264296
2016-03-24 15:38:38 +00:00
Pirama Arumuga Nainar dc45aef2d8 Remove unsafe AssertZext after promoting result of FP_TO_FP16
Summary:
Some target lowerings of FP_TO_FP16, for instance ARM's vcvtb.f16.f32
instruction, do not guarantee that the top 16 bits are zeroed out.
Remove the unsafe AssertZext and add tests to exercise this.

Reviewers: jmolloy, sbaranga, kristof.beyls, aadg

Subscribers: llvm-commits, srhines, aemerson

Differential Revision: http://reviews.llvm.org/D18426

llvm-svn: 264285
2016-03-24 14:06:03 +00:00
Amjad Aboud 6ff7e10052 Recommitted r263424 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26942 (the fix is included in this commit).

Differential Revision: http://reviews.llvm.org/D18350

llvm-svn: 264280
2016-03-24 13:30:16 +00:00
Cong Hou 94710840fb Allow X86::COND_NE_OR_P and X86::COND_NP_OR_E to be reversed.
Currently, AnalyzeBranch() fails non-equality comparison between floating points
on X86 (see https://llvm.org/bugs/show_bug.cgi?id=23875). This is because this
function can modify the branch by reversing the conditional jump and removing
unconditional jump if there is a proper fall-through. However, in the case of
non-equality comparison between floating points, this can turn the branch
"unanalyzable". Consider the following case:

jne.BB1
jp.BB1
jmp.BB2
.BB1:
...
.BB2:
...

AnalyzeBranch() will reverse "jp .BB1" to "jnp .BB2" and then "jmp .BB2" will be
removed:

jne.BB1
jnp.BB2
.BB1:
...
.BB2:
...

However, AnalyzeBranch() cannot analyze this branch anymore as there are two
conditional jumps with different targets. This may disable some optimizations
like block-placement: in this case the fall-through behavior is enforced even if
the fall-through block is very cold, which is suboptimal.

Actually this optimization is also done in block-placement pass, which means we
can remove this optimization from AnalyzeBranch(). However, currently
X86::COND_NE_OR_P and X86::COND_NP_OR_E are not reversible: there is no defined
negation conditions for them.

In order to reverse them, this patch defines two new CondCode X86::COND_E_AND_NP
and X86::COND_P_AND_NE. It also defines how to synthesize instructions for them.
Here only the second conditional jump is reversed. This is valid as we only need
them to do this "unconditional jump removal" optimization.


Differential Revision: http://reviews.llvm.org/D11393

llvm-svn: 264199
2016-03-23 21:45:37 +00:00
Justin Bogner c35c10593b SelectionDAG: Remove a tautological dyn_cast. NFC
Index is already a StoreSDNode, so this dyn_cast doesn't do anything.

llvm-svn: 264177
2016-03-23 18:15:33 +00:00
Sanjoy Das a5b2972977 Remove stale comment
llvm-svn: 264131
2016-03-23 02:28:35 +00:00
Sanjoy Das ac53dc7520 [StatepointLowering] Don't do two DenseMap lookups; nfci
llvm-svn: 264130
2016-03-23 02:24:15 +00:00
Sanjoy Das 7edbef316b [StatepointLowering] Minor NFC cleanups
- Use auto
 - Name variables in LLVM style
 - Use llvm::find instead of std::find
 - Blank lines between declarations

llvm-svn: 264129
2016-03-23 02:24:13 +00:00
Sanjoy Das 4cd746ebe0 [StatepointLowering] Minor nfc refactoring
Now that StatepointLoweringInfo represents base pointers, derived
pointers and gc relocates as SmallVectors and not ArrayRefs, we no
longer need to allocate "backing storage" on stack in LowerStatepoint.
So elide the backing storage, and inline the trivial body of
getIncomingStatepointGCValues.

llvm-svn: 264128
2016-03-23 02:24:10 +00:00
Sanjoy Das e58ca59cf4 [StatepointLowering] Schedule gc relocates before uniqueing them
Otherwise we can see an "unexpected" gc.relocate that we uniqued away.

llvm-svn: 264127
2016-03-23 02:24:07 +00:00
George Burgess IV d4febd1612 Keep CodeGenPrepare from preserving the domtree.
CGP modifies the domtree in some cases, so saying that it preserves the
domtree is a lie. We'll be able to selectively preserve it with the new
pass manager.

Differential Revision: http://reviews.llvm.org/D16893

llvm-svn: 264099
2016-03-22 21:25:08 +00:00
Simon Pilgrim c6f5fe3d69 [SelectionDAG] Ensure constant folded legalized vector element types are compatible with the BUILD_VECTOR type
Found during fuzz testing - 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected.

llvm-svn: 264085
2016-03-22 19:59:53 +00:00
Tim Northover b49a8a9dbb CodeGen: check return types match when emitting tail call to builtin.
We were just completely ignoring the types when determining whether we could
safely emit a libcall as a tail call. This is clearly wrong.

Theoretically, we could dig deeper looking for incidental matches (much like
the generic code in Analysis.cpp does), but it's probably not worth it for the
few libcalls that exist.

llvm-svn: 264084
2016-03-22 19:14:38 +00:00
Sanjoy Das eb5037cadc Allow lowering call sites with both funclets and deopt state
Lowering funclets is a no-op, so we can just go ahead and lower the
deopt state.

llvm-svn: 264078
2016-03-22 18:10:39 +00:00
Sanjoy Das 6b535630a1 Add a hasOperandBundlesOtherThan helper, and use it; NFC
llvm-svn: 264072
2016-03-22 17:51:25 +00:00
Simon Pilgrim 25fb4177fb [X86][SSE] Reapplied: Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Reapplied with a fix for PR26953 (missing vector widening legalization).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 264062
2016-03-22 16:22:08 +00:00
Sanjoy Das 38bfc22161 Add "first class" lowering for deopt operand bundles
Summary:
After this change, deopt operand bundles can be lowered directly by
SelectionDAG into STATEPOINT instructions (which are then lowered to a
call or sequence of nop, with an associated __llvm_stackmaps entry0.
This obviates the need to round-trip deoptimization state through
gc.statepoint via RewriteStatepointsForGC.

Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin

Subscribers: sanjoy, mcrosier, majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D18257

llvm-svn: 264015
2016-03-22 00:59:13 +00:00
Silviu Baranga 46030585b3 [DAGCombine] Catch the case where extract_vector_elt can cause an any_ext while processing AND SDNodes
Summary:
extract_vector_elt can cause an implicit any_ext if the types don't
match. When processing the following pattern:

  (and (extract_vector_elt (load ([non_ext|any_ext|zero_ext] V))), c)

DAGCombine was ignoring the possible extend, and sometimes removing
the AND even though it was required to maintain some of the bits
in the result to 0, resulting in a miscompile.

This change fixes the issue by limiting the transformation only to
cases where the extract_vector_elt doesn't perform the implicit
extend.

Reviewers: t.p.northover, jmolloy

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18247

llvm-svn: 263935
2016-03-21 11:43:46 +00:00
Craig Topper ea87eae4ca Suppress a -Wunused-variable warning in release builds.
llvm-svn: 263892
2016-03-20 01:17:54 +00:00
Saleem Abdulrasool 2854666263 CodeGen: use range based for loop
Convert a loop to use a range based style loop.  NFC.

llvm-svn: 263884
2016-03-19 16:35:32 +00:00
Manman Ren 2828c57b6f [CXX_FAST_TLS] fix issues with O0 on ARM, AArch64 and X86.
Since at O0, explicit copies via SplitCSR may not be removed even if
they are unnecessary, we choose not to use SplitCSR at O0.

llvm-svn: 263855
2016-03-18 23:38:49 +00:00
Matthias Braun 0d208fc9f6 MILexer: Add ErrorCallbackType typedef; NFC
llvm-svn: 263829
2016-03-18 20:41:11 +00:00
Reid Kleckner fbd7787d7e [codeview] Only emit function ids for inlined functions
We aren't referencing any other kind of function currently.
Should save a bit on our debug info size.

llvm-svn: 263817
2016-03-18 18:54:32 +00:00
Peter Collingbourne a1f8625662 DebugInfo: Add ability to not emit DW_AT_vtable_elem_location for virtual functions.
A virtual index of -1u indicates that the subprogram's virtual index is
unrepresentable (for example, when using the relative vtable ABI), so do
not emit a DW_AT_vtable_elem_location attribute for it.

Differential Revision: http://reviews.llvm.org/D18236

llvm-svn: 263765
2016-03-17 23:58:03 +00:00
Sanjoy Das 3a02019fbc [SelectionDAG] Remove visitStatepoint; NFC
This way we have a single entry point into StatepointLowering.  The
method was a direct dispatch to LowerStatepoint anyway.

llvm-svn: 263682
2016-03-17 00:47:14 +00:00
Sanjoy Das 43e33d61c6 Fix indentation; NFC
llvm-svn: 263672
2016-03-16 23:11:21 +00:00
Sanjoy Das 70697ff74d Extract out a SelectionDAGBuilder::LowerAsStatepoint; NFC
Summary:
This is a step towards implementing "direct" lowering of calls and
invokes with deopt operand bundles into STATEPOINT nodes (as opposed to
having them mandatorily pass through RewriteStatepointsForGC, which is
the case today).

This change extracts out a `SelectionDAGBuilder::LowerAsStatepoint`
helper function that is able to lower a "statepoint like thing", and
uses it to lower `gc.statepoint` calls.  This is an NFC now, but in a
later change we will use `LowerAsStatepoint` to directly lower calls and
invokes with operand bundles without going through an intermediate
`gc.statepoint` IR representation.

FYI: I expect `SelectionDAGBuilder::StatepointInfo` will evolve as I add
support for lowering non gc.statepoints, right now it is fairly tightly
coupled with an IR level `gc.statepoint`.

Reviewers: reames, pgavlin, JosephTremoulet

Subscribers: sanjoy, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D18106

llvm-svn: 263671
2016-03-16 23:08:00 +00:00
James Y Knight f44fc5219f Tweak some atomics functions in preparation for larger changes; NFC.
- Rename getATOMIC to getSYNC, as llvm will soon be able to emit both
  '__sync' libcalls and '__atomic' libcalls, and this function is for
  the '__sync' ones.

- getInsertFencesForAtomic() has been replaced with
  shouldInsertFencesForAtomic(Instruction), so that the decision can be
  made per-instruction. This functionality will be used soon.

- emitLeadingFence/emitTrailingFence are no longer called if
  shouldInsertFencesForAtomic returns false, and thus don't need to
  check the condition themselves.

llvm-svn: 263665
2016-03-16 22:12:04 +00:00
Sanjoy Das 19c6159833 [SelectionDAG] Extract out populateCallLoweringInfo; NFC
SelectionDAGBuilder::populateCallLoweringInfo is now used instead of
SelectionDAGBuilder::lowerCallOperands.  The populateCallLoweringInfo
interface is more composable in face of design changes like
http://reviews.llvm.org/D18106

llvm-svn: 263663
2016-03-16 20:49:31 +00:00
Simon Pilgrim b5a20f0fec Removed trailing whitespace
llvm-svn: 263650
2016-03-16 18:37:44 +00:00
Lang Hames 1b640e05ba [MachO] Add MachO alt-entry directive support.
This patch adds support for the MachO .alt_entry assembly directive, and uses
it for global aliases with non-zero GEP offsets. The alt_entry flag indicates
that a symbol should be layed out immediately after the preceding symbol.
Conceptually it introduces an alternate entry point for a function or data
structure. E.g.:

safe_foo:
  // check preconditions for foo
.alt_entry fast_foo
fast_foo:
  // body of foo, can assume preconditions.

The .alt_entry flag is also implicitly set on assembly aliases of the form:

a = b + C

where C is a non-zero constant, since these have the same effect as an
alt_entry symbol: they introduce a label that cannot be moved relative to the
preceding one. Setting the alt_entry flag on aliases of this form fixes
http://llvm.org/PR25381.

llvm-svn: 263521
2016-03-15 01:43:05 +00:00
Sanjoy Das c11460e051 [StatepointLowering] Move an assertion; NFCI
Instead of running an explicit loop over `gc.relocate` calls hanging off
of a `gc.statepoint`, assert the validity of the type of the value being
relocated in `visitRelocate`.

llvm-svn: 263516
2016-03-15 01:16:31 +00:00
Eric Christopher da8b3f1914 Temporarily Revert "[X86][SSE] Simplify vector LOAD + EXTEND on
pre-SSE41 hardware" as it seems to be causing crashes during code
generation in halide. PR forthcoming.

This reverts commit r263303.

llvm-svn: 263512
2016-03-14 23:59:57 +00:00
Amaury Sechet eae09c2c2a Factor out MachineBlockPlacement::fillWorkLists. NFC
Summary: There are places in MachineBlockPlacement where a worklist is filled in pretty much identical way. The code is duplicated. This refactor it so that the same code is used in both scenarii.

Reviewers: chandlerc, majnemer, rafael, MatzeB, escha, silvas

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D18077

llvm-svn: 263495
2016-03-14 21:24:11 +00:00
Quentin Colombet 40ce25b68b [SpillPlacement] Fix a quadratic behavior in spill placement.
The bad behavior happens when we have a function with a long linear chain of
basic blocks, and have a live range spanning most of this chain, but with very
few uses.
Let say we have only 2 uses.
The Hopfield network is only seeded with two active blocks where the uses are,
and each iteration of the outer loop in `RAGreedy::growRegion()` only adds two
new nodes to the network due to the completely linear shape of the CFG.
Meanwhile, `SpillPlacer->iterate()` visits the whole set of discovered nodes,
which adds up to a quadratic algorithm.

This is an historical accident effect from r129188.

When the Hopfield network is expanding, most of the action is happening on the
frontier where new nodes are being added. The internal nodes in the network are
not likely to be flip-flopping much, or they will at least settle down very
quickly. This means that while `SpillPlacer->iterate()` is recomputing all the
nodes in the network, it is probably only the two frontier nodes that are
changing their output.

Instead of recomputing the whole network on each iteration, we can maintain a
SparseSet of nodes that need to be updated:

- `SpillPlacement::activate()` adds the node to the todo list.
- When a node changes value (i.e., `update()` returns true), its neighbors are
  added to the todo list.
- `SpillPlacement::iterate()` only updates the nodes in the list.

The result of Hopfield iterations is not necessarily exact. It should converge
to a local minimum, but there is no guarantee that it will find a global
minimum. It is possible that updating nodes in a different order will cause us
to switch to a different local minimum. In other words, this is not NFC, but
although I saw a few runtime improvements and regressions when I benchmarked
this change, those were side effects and actually the performance change is in
the noise as expected.

Huge thanks to Jakob Stoklund Olesen <stoklund@2pi.dk> for his feedbacks,
guidance and time for the review.

llvm-svn: 263460
2016-03-14 18:21:25 +00:00
Sanjay Patel 7506852709 [DAG] use !isUndef() ; NFCI
llvm-svn: 263453
2016-03-14 18:09:43 +00:00
Sanjay Patel 5719584129 [DAG] use isUndef() ; NFCI
llvm-svn: 263448
2016-03-14 17:28:46 +00:00
Benjamin Kramer 1082fa66a5 Revert "Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info." After fixing PR26715 at r263379."
This reverts commit r263424. Breaks self-host.

llvm-svn: 263437
2016-03-14 14:58:36 +00:00
Amjad Aboud ab0378b16c Recommitted r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
After fixing PR26715 at r263379.

llvm-svn: 263424
2016-03-14 12:03:20 +00:00
David Majnemer b9456a5eb3 [CodeView] Consistently handle overly large symbol names
Overly large symbol names weren't correctly handled for leaf function
records.

llvm-svn: 263408
2016-03-14 05:15:09 +00:00
David Majnemer 1256125fb7 [CodeView] Truncate display names
Fundamentally, the length of a variable or function name is bound by the
maximum size of a record: 0xffff.  However, the name doesn't live in a
vacuum; other data is associated with the name, lowering the bound
further.

We would naively attempt to emit the name, causing us to assert because
the record would no-longer fit in 16-bits.  Instead, truncate the name
but preserve as much as we can.

While I have tested this locally, I've decided to not commit it due to
the test's size.

N.B.  While this behavior is undesirable, it is better than MSVC's
behavior.  They seem to truncate to ~4000 characters.

llvm-svn: 263378
2016-03-13 10:53:30 +00:00
Sanjoy Das ecf96c9516 Make gc relocates more strongly typed; NFC
Don't use a `Value *` where we can use a stronger `GCRelocateInst *`
type.

llvm-svn: 263327
2016-03-12 02:54:27 +00:00
Simon Pilgrim 33d57c7547 [X86][SSE] Simplify vector LOAD + EXTEND on pre-SSE41 hardware
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.

We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG but can further improve this by using the legalizer instead of prematurely splitting into legal vectors in the combine as this only properly helps for lowering to VSEXT/VZEXT.

Removes a lot of unnecessary any_extend + mask pattern - (Fix for PR25718).

Differential Revision: http://reviews.llvm.org/D17932

llvm-svn: 263303
2016-03-11 22:18:05 +00:00
Quentin Colombet dd4b137364 [IRTranslator] Translate unconditional branches.
llvm-svn: 263265
2016-03-11 17:28:03 +00:00
Quentin Colombet f9b4934d1d [MachineIRBuilder] Rework buildInstr API to maximize code reuse.
llvm-svn: 263264
2016-03-11 17:27:58 +00:00
Quentin Colombet e225e2541b [IRTranslator] Update getOrCreateVReg API to use references.
A value that we want to keep in a virtual register cannot be null.
Reflect that in the API.

llvm-svn: 263263
2016-03-11 17:27:54 +00:00
Quentin Colombet 000b580b13 [MachineIRBuilder] Rename the setter of MF for consistency with the getter.
llvm-svn: 263262
2016-03-11 17:27:51 +00:00
Quentin Colombet 91ebd71e26 [MachineIRBuilder] Rename the setter for MBB for consistency with the getter.
llvm-svn: 263261
2016-03-11 17:27:47 +00:00
Quentin Colombet 53237a9e64 [IRTranslator] Update getOrCreateBB API to use references.
A null basic block is invalid, so just pass a reference.

llvm-svn: 263260
2016-03-11 17:27:43 +00:00
Chad Rosier ac216fd9d5 [misched] Fix a truncation issue from r263021.
The truncation was causing the sorting algorithm to behave oddly when comparing
positive and negative offsets.  Fortunately, this doesn't currently happen in
practice and was exposed by a WIP.  Thus, I can't test this change now, but the
follow on patch will.

llvm-svn: 263255
2016-03-11 16:54:07 +00:00
Junmo Park 6098cbbd2c Minor code cleanups. NFC.
llvm-svn: 263200
2016-03-11 07:05:32 +00:00
Junmo Park 4ba6cf69e4 Minor code cleanup. NFC.
llvm-svn: 263196
2016-03-11 05:07:07 +00:00
Pete Cooper adebb9379a Remove llvm::getDISubprogram in favor of Function::getSubprogram
llvm::getDISubprogram walks the instructions in a function, looking for one in the scope of the current function, so that it can find the !dbg entry for the subprogram itself.

Now that !dbg is attached to functions, this should not be necessary. This patch changes all uses to just query the subprogram directly on the function.

Ideally this should be NFC, but in reality its possible that a function:

has no !dbg (in which case there's likely a bug somewhere in an opt pass), or
that none of the instructions had a scope referencing the function, so we used to not find the !dbg on the function but now we will

Reviewed by Duncan Exon Smith.

Differential Revision: http://reviews.llvm.org/D18074

llvm-svn: 263184
2016-03-11 02:14:16 +00:00
Marianne Mailhot-Sarrasin eddc5b130e Test commit access
llvm-svn: 263165
2016-03-10 21:54:25 +00:00
Simon Pilgrim 61eb49e437 [X86][SSE] Reapplied: Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Reapplied with a fix for PR26870 (avoid premature use of TargetConstant in ZERO_EXTEND_VECTOR_INREG expansion).

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 263159
2016-03-10 20:40:26 +00:00
Chandler Carruth 61440d225b [PM] Port memdep to the new pass manager.
This is a fairly straightforward port to the new pass manager with one
exception. It removes a very questionable use of releaseMemory() in
the old pass to invalidate its caches between runs on a function.
I don't think this is really guaranteed to be safe. I've just used the
more direct port to the new PM to address this by nuking the results
object each time the pass runs. While this could cause some minor malloc
traffic increase, I don't expect the compile time performance hit to be
noticable, and it makes the correctness and other aspects of the pass
much easier to reason about. In some cases, it may make things faster by
making the sets and maps smaller with better locality. Indeed, the
measurements collected by Bruno (thanks!!!) show mostly compile time
improvements.

There is sadly very limited testing at this point as there are only two
tests of memdep, and both rely on GVN. I'll be porting GVN next and that
will exercise this heavily though.

Differential Revision: http://reviews.llvm.org/D17962

llvm-svn: 263082
2016-03-10 00:55:30 +00:00
Philip Reames ac115ed72f [CGP] Duplicate addressing computation in cold paths if required to sink addressing mode
This patch teaches CGP to duplicate addressing mode computations into cold paths (detected via explicit cold attribute on calls) if required to let addressing mode be safely sunk into the basic block containing each load and store.

In general, duplicating code into cold blocks may result in code growth, but should not effect performance. In this case, it's better to duplicate some code than to put extra pressure on the register allocator by making it keep the address through the entirely of the fast path.

This patch only handles addressing computations, but in principal, we could implement a more general cold cold scheduling heuristic which tries to reduce register pressure in the fast path by duplicating code into the cold path. Getting the profitability of the general case right seemed likely to be challenging, so I stuck to the existing case (addressing computation) we already had.

Differential Revision: http://reviews.llvm.org/D17652

llvm-svn: 263074
2016-03-09 23:13:12 +00:00
Tom Stellard 9f2e00de7b SelectionDAG: Fix a crash on inline asm when output register supports multiple types
Summary:
The code in SelectionDAG did not handle the case where the
register type and output types were different, but had the same size.

Reviewers: arsenm, echristo

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17940

llvm-svn: 263022
2016-03-09 16:02:52 +00:00
Chad Rosier c27a18f39f [TII] Allow getMemOpBaseRegImmOfs() to accept negative offsets. NFC.
http://reviews.llvm.org/D17967

llvm-svn: 263021
2016-03-09 16:00:35 +00:00
Krzysztof Parzyszek cd99e364e3 Invoke DAG postprocessing in the post-RA scheduler
This was inadvertently omitted from r262774, which added the mutation
interface.

llvm-svn: 262939
2016-03-08 16:54:20 +00:00
Hans Wennborg e00b6e7249 Revert r262599 "[X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG"
This caused PR26870.

llvm-svn: 262935
2016-03-08 16:21:41 +00:00
Krzysztof Parzyszek 1a1d78b86f Add DAG mutation interface to the DFA packetizer
llvm-svn: 262930
2016-03-08 15:33:51 +00:00
Justin Bogner 671febc0f7 Re-apply "SelectionDAG: Store SDNode operands in an ArrayRecycler"
This re-applies r262886 with a fix for 32 bit platforms that have 8 byte
pointer alignment, effectively reverting r262892.

Original Message:

  Currently some SDNode operands are malloc'd, some are stored inline in
  subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
  This scheme is complex, inconsistent, and makes refactoring SDNodes
  fairly difficult.

  Instead, we can allocate all of the operands using an ArrayRecycler
  that wraps a BumpPtrAllocator. This keeps the cache locality when
  iterating operands, improves locality when iterating SDNodes without
  looking at operands, and vastly simplifies the ownership semantics.

  It also means we stop overallocating SDNodes by 2-3x and will make it
  simpler to fix the rampant undefined behaviour we have in how we
  mutate SDNodes from one kind to another (See llvm.org/pr26808).

  This is NFC other than the changes in memory behaviour, and I ran some
  LNT tests to make sure this didn't hurt compile time. Not many tests
  changed: there were a couple of 1-2% regressions reported, but there
  were more improvements (of up to 4%) than regressions.

llvm-svn: 262902
2016-03-08 03:14:29 +00:00
Quentin Colombet 5e63e78ca9 [MIR] Change the token name for '<' and '>' to be consitent with the LLVM IR parser.
Thanks to Ahmed Bougacha for noticing!

llvm-svn: 262899
2016-03-08 02:00:43 +00:00
Quentin Colombet 39293d3aaa [GlobalISel] Introduce initializer method to support start/stop-after features.
llvm-svn: 262896
2016-03-08 01:38:55 +00:00
Quentin Colombet 050b211820 [MIR] Teach the parser/printer that generic virtual registers do not need a register class.
llvm-svn: 262893
2016-03-08 01:17:03 +00:00
Justin Bogner 7e6f09c28f Revert "SelectionDAG: Store SDNode operands in an ArrayRecycler"
Looks like the largest SDNode is different between 32 and 64 bit now,
so this is breaking 32 bit bots. Reverting while I figure out a fix.

This reverts r262886.

llvm-svn: 262892
2016-03-08 01:07:03 +00:00
Quentin Colombet 287c6bb571 [MIR] Teach the parser how to parse complex types of generic machine instructions.
By complex types, I mean aggregate or vector types.

llvm-svn: 262890
2016-03-08 00:57:31 +00:00
Justin Bogner 6543a9385f SelectionDAG: Store SDNode operands in an ArrayRecycler
Currently some SDNode operands are malloc'd, some are stored inline in
subclasses of SDNode, and some are thrown into a BumpPtrAllocator.
This scheme is complex, inconsistent, and makes refactoring SDNodes
fairly difficult.

Instead, we can allocate all of the operands using an ArrayRecycler
that wraps a BumpPtrAllocator. This keeps the cache locality when
iterating operands, improves locality when iterating SDNodes without
looking at operands, and vastly simplifies the ownership semantics.

It also means we stop overallocating SDNodes by 2-3x and will make it
simpler to fix the rampant undefined behaviour we have in how we
mutate SDNodes from one kind to another (See llvm.org/pr26808).

This is NFC other than the changes in memory behaviour, and I ran some
LNT tests to make sure this didn't hurt compile time. Not many tests
changed: there were a couple of 1-2% regressions reported, but there
were more improvements (of up to 4%) than regressions.

llvm-svn: 262886
2016-03-08 00:39:51 +00:00
Quentin Colombet d655483944 [MIR] Teach the printer how to print complex types for generic machine instructions.
Before this change, we would get the type definition in the middle
of the instruction.
E.g., %0(48) = G_ADD %struct_alias = type { i32, i16 } %edi, %edi

Now, we have just the expected type name:
%0(48) = G_ADD %struct_alias %edi, %edi

llvm-svn: 262885
2016-03-08 00:38:01 +00:00
Quentin Colombet 12350a8e13 [MIR] Print the type of generic machine instructions.
llvm-svn: 262880
2016-03-08 00:29:15 +00:00
Quentin Colombet 851996778f [MIR] Teach the mir parser about types on generic machine instructions.
llvm-svn: 262879
2016-03-08 00:20:48 +00:00
Quentin Colombet 41bea872dd [MachineInstr] Get rid of some GlobalISel ifdefs.
Now the type API is always available, but when global-isel is not
built the implementation does nothing.

Note: The implementation free of ifdefs is WIP and tracked here in PR26576.
llvm-svn: 262873
2016-03-07 22:47:23 +00:00
Quentin Colombet 4e14a497a3 [MIR] Teach the MIPrinter about size for generic virtual registers.
llvm-svn: 262867
2016-03-07 21:57:52 +00:00
Quentin Colombet 2a831fb826 [MIR] Teach the parser how to handle the size of generic virtual registers.
llvm-svn: 262862
2016-03-07 21:48:43 +00:00
Quentin Colombet 1bd7504ef3 [MachineRegisterInfo] Add a method to set the size of a virtual register a posteriori.
This is required for mir testing.

llvm-svn: 262861
2016-03-07 21:41:39 +00:00
Quentin Colombet 70a9670d80 [MachineRegisterInfo] Get rid of the global-isel ifdefs.
One additional pointer is not a big deal size-wise and it makes
the code much nicer!

llvm-svn: 262856
2016-03-07 21:22:09 +00:00
Matt Arsenault ceb2c06cbd DAGCombiner: Check legality before creating extract_vector_elt
Problem not hit by any in tree target.

llvm-svn: 262852
2016-03-07 21:10:09 +00:00
Craig Topper 267bdb2094 [CodeGen] Add space-optimized EmitMergeInputChains1_2 to the DAG isel matching tables. Shaves about 5100 bytes from the X86 matcher table. NFC
llvm-svn: 262815
2016-03-07 07:29:12 +00:00
Krzysztof Parzyszek 5c61d11a6d Add DAG mutation interface to the post-RA scheduler
Differential Revision: http://reviews.llvm.org/D17868

llvm-svn: 262774
2016-03-05 15:45:23 +00:00
Matthias Braun 4797ec95e4 RegisterCoalescer: Remap subregister lanemasks before exchanging operands
Rematerializing and merging into a bigger register class at the same
time, requires the subregister range lanemasks getting remapped to the
new register class.

This fixes http://llvm.org/PR26805

llvm-svn: 262768
2016-03-05 04:36:13 +00:00
Matthias Braun 8de09aa0c5 RegisterCoalescer: Need to check DstReg+SrcReg for missing undef flags
copy coalescing with enabled subregister liveness can reveal undef uses,
previously this was only checked for the SrcReg in updateRegDefsUses()
but we need to check DstReg as well.

llvm-svn: 262767
2016-03-05 04:36:10 +00:00
Matthias Braun 2cbfd9fff5 RegisterPressure: Small cleanup
llvm-svn: 262766
2016-03-05 04:36:08 +00:00
Michael Kuperstein b89f0fa2a2 [DAGCombine] Fix divrem combine not to assume div/rem type is simple.
The divrem combine assumed the type of the div/rem is simple, which isn't
necessarily true. This probably worked fine until r250825, since it only
saw legal types, but now breaks when it runs as a pre-type-legalization 
combine.

This fixes PR26835.

Differential Revision: http://reviews.llvm.org/D17878

llvm-svn: 262746
2016-03-04 21:23:29 +00:00
Renato Golin 175c6d6d95 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

Second attempt, creating TLI.isOperationCustom like isOperationExpand, to make
sure we only emit valid types or the ones that were explicitly marked as custom.
Now, passing check-all and test-suite on x86, ARM and AArch64.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262738
2016-03-04 19:19:36 +00:00
Teresa Johnson d84c7decb6 Change split code gen to use ThreadPool
Part of D15390.

llvm-svn: 262719
2016-03-04 15:39:13 +00:00
Benjamin Kramer 4dbf3371bb Make headers self-contained again.
llvm-svn: 262702
2016-03-04 10:49:30 +00:00
Simon Pilgrim 91dd0a796c [X86][SSE] Improve vector ZERO_EXTEND by combining to ZERO_EXTEND_VECTOR_INREG
Generalise the existing SIGN_EXTEND to SIGN_EXTEND_VECTOR_INREG combine to support zero extension as well and get rid of a lot of unnecessary ANY_EXTEND + mask patterns.

Differential Revision: http://reviews.llvm.org/D17691

llvm-svn: 262599
2016-03-03 09:43:28 +00:00
Renato Golin 3d78271eac Revert "[ARM] Merging 64-bit divmod lib calls into one"
This reverts commit r262507, which broke some ARM buildbots.

llvm-svn: 262594
2016-03-03 08:57:44 +00:00
Junmo Park 6ba96fb431 [BranchFolding] Change function name related with merging MMOs. NFC
Summary:
Removing MMOs is not our prefer behavior any more.

Reviewers: mcrosier, reames
   
Differential Revision: http://reviews.llvm.org/D17668

llvm-svn: 262580
2016-03-03 03:57:20 +00:00
Philip Reames ae27b2380f [MBP] Renaming a confusing variable and add clarifying comments
Was discussed as part of http://reviews.llvm.org/D17830

llvm-svn: 262571
2016-03-03 00:58:43 +00:00
Philip Reames 23d933982a [MBP] Avoid placing random blocks between loop preheader and header
If we have a loop with a rarely taken path, we will prune that from the blocks which get added as part of the loop chain. The problem is that we weren't then recognizing the loop chain as schedulable when considering the preheader when forming the function chain. We'd then fall to various non-predecessors before finally scheduling the loop chain (as if the CFG was unnatural.) The net result was that there could be lots of garbage between a loop preheader and the loop, even though we could have directly fallen into the loop. It also meant we separated hot code with regions of colder code.

The particular reason for the rejection of the loop chain was that we were scanning predecessor of the header, seeing the backedge, believing that was a globally more important predecessor (true), but forgetting to account for the fact the backedge precessor was already part of the existing loop chain (oops!.

Differential Revision: http://reviews.llvm.org/D17830

llvm-svn: 262547
2016-03-03 00:01:42 +00:00
David Majnemer 1ef654024f [X86] Don't give catch objects a displacement of zero
Catch objects with a displacement of zero do not initialize a catch
object.  The displacement is relative to %rsp at the end of the
function's prologue for x86_64 targets.

If we place an object at the top-of-stack, we will end up wit a
displacement of zero resulting in our catch object remaining
uninitialized.

Address this by creating our catch objects as fixed objects.  We will
ensure that the UnwindHelp object is created after the catch objects so
that no catch object will have a displacement of zero.

Differential Revision: http://reviews.llvm.org/D17823

llvm-svn: 262546
2016-03-03 00:01:25 +00:00
Philip Reames 02e1132afb [MBP] Remove overly verbose debug output
llvm-svn: 262531
2016-03-02 22:40:51 +00:00
Philip Reames b9688f4382 [MBP] Adjust debug output to be more focused and approachable
llvm-svn: 262522
2016-03-02 21:45:13 +00:00
Renato Golin 93e42d9934 [ARM] Merging 64-bit divmod lib calls into one
When div+rem calls on the same arguments are found, the ARM back-end merges the
two calls into one __aeabi_divmod call for up to 32-bits values. However,
for 64-bit values, which also have a lib call (__aeabi_ldivmod), it wasn't
merging the calls, and thus calling ldivmod twice and spilling the temporary
results, which generated pretty bad code.

This patch legalises 64-bit lib calls for divmod, so that now all the spilling
and the second call are gone. It also relaxes the DivRem combiner a bit on the
legal type check, since it was already checking for isLegalOrCustom on every
value, so the extra check for isTypeLegal was redundant.

This patch fixes PR17193 (and a long time FIXME in the tests).

llvm-svn: 262507
2016-03-02 19:35:45 +00:00
Justin Bogner b2ecee9c31 SelectionDAG: Use correctly sized allocation functions for SDNodes
The placement new calls here were all calling the allocation function
in RecyclingAllocator/Recycler for SDNode, instead of the function for
the specific subclass we were constructing.

Since this particular allocator always overallocates it more or less
worked, but would hide what we're actually doing from any memory
tools. Also, if you tried to change this allocator so something like a
BumpPtrAllocator or MallocAllocator, the compiler would crash horribly
all the time.

Part of llvm.org/PR26808.

llvm-svn: 262500
2016-03-02 19:01:11 +00:00
Matt Arsenault 7d0a77b979 DAGCombiner: Make sure an integer is being truncated
llvm-svn: 262446
2016-03-02 01:36:51 +00:00
Matt Arsenault b36d462fac DAGCombiner: Turn truncate of a bitcasted vector to an extract
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.

This fixes some test failures in a future commit when i64 loads
are changed to promote.

llvm-svn: 262397
2016-03-01 21:31:53 +00:00
Vasileios Kalintiris 36901dd1c3 Revert "[mips] Promote the result of SETCC nodes to GPR width."
This reverts commit r262316.

It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.

llvm-svn: 262387
2016-03-01 20:25:43 +00:00
Justin Lebar b5ca00a58d [NVPTX] Use different, convergent MIs for convergent calls.
Summary:
Calls sometimes need to be convergent.  This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.

Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs.  But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.

Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.

Reviewers: jholewinski

Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17423

llvm-svn: 262373
2016-03-01 19:24:03 +00:00
Matt Arsenault 03dac8d8e4 DAGCombiner: Turn extract of bitcasted integer into truncate
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.

llvm-svn: 262358
2016-03-01 18:01:37 +00:00
Rafael Espindola 5cd721ae12 Refactor duplicated code for linking with pthread.
llvm-svn: 262344
2016-03-01 15:54:40 +00:00
Vasileios Kalintiris 3a8f7f9e31 [mips] Promote the result of SETCC nodes to GPR width.
Summary:
This patch modifies the existing comparison, branch, conditional-move
and select patterns, and adds new ones where needed. Also, the updated
SLT{u,i,iu} set of instructions generate a GPR width result.

The majority of the code changes in the Mips back-end fix the wrong
assumption that the result of SETCC nodes always produce an i32 value.
The changes in the common code path account for the fact that in 64-bit
MIPS targets, i1 is promoted to i32 instead of i64.

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D10970

llvm-svn: 262316
2016-03-01 10:08:01 +00:00
Matt Arsenault a67c4916cf LegalizeDAG: Use correct ptr type when expanding unaligned load/store
This fixes regressions exposed in existing AMDGPU tests in a
future commit when all loads are custom lowered.

llvm-svn: 262299
2016-03-01 05:13:35 +00:00
David Majnemer cb305dea1c [WinEH] Allocate the registration node before the catch objects
The CatchObjOffset is relative to the end of the EH registration node
for 32-bit x86 WinEH targets.  A special sentinel value, 0, is used to
indicate that no catch object should be initialized.

This means that a catch object allocated immediately before the
registration node would be assigned a CatchObjOffset of 0, leading the
runtime to believe that a catch object should not be initialized.

To handle this, allocate the registration node prior to any other frame
object.  This will ensure that catch objects will not be allocated
before the registration node.

This fixes PR26757.

Differential Revision: http://reviews.llvm.org/D17689

llvm-svn: 262294
2016-03-01 04:30:16 +00:00
Adrian Prantl dba58fbdd9 Improve the debug output of DwarfDebug::buildLocationList().
llvm-svn: 262265
2016-02-29 22:28:22 +00:00
Adrian Prantl fb2add2be1 Fix PR26585 by improving the promotion of DBG_VALUEs to DW_AT_locations.
When a variable is described by a single DBG_VALUE instruction we can
often use a more efficient inline DW_AT_location instead of using a
location list.

This commit makes the heuristic that decides when to apply this
optimization stricter by also verifying that the DBG_VALUE is live at the
entry of the function (instead of just checking that it is valid until
the end of the function).

<rdar://problem/24611008>

llvm-svn: 262247
2016-02-29 19:49:46 +00:00
Adrian Prantl 693e8de0fa fix typo in comment
llvm-svn: 262236
2016-02-29 17:06:46 +00:00
Duncan P. N. Exon Smith ebcce78f65 CodeGen: Remove an iterator => pointer conversion, NFC
Part of PR26753.

llvm-svn: 262154
2016-02-27 20:27:44 +00:00
Duncan P. N. Exon Smith d6ebd07b8d CodeGen: Use MachineInstr& in InlineSpiller::rematerializeFor()
InlineSpiller::rematerializeFor() never uses its parameter as an
iterator, so take it by reference instead.  This removes an implicit
conversion from MachineBasicBlock::iterator to MachineInstr*.

llvm-svn: 262152
2016-02-27 20:23:14 +00:00
Duncan P. N. Exon Smith be8f8c4478 CodeGen: Update LiveIntervalAnalysis API to use MachineInstr&, NFC
These parameters aren't expected to be null, so take them by reference.

llvm-svn: 262151
2016-02-27 20:14:29 +00:00
Duncan P. N. Exon Smith fd8cc23220 CodeGen: Change MachineInstr to use MachineInstr&, NFC
Change MachineInstr API to prefer MachineInstr& over MachineInstr*
whenever the parameter is expected to be non-null.  Slowly inching
toward being able to fix PR26753.

llvm-svn: 262149
2016-02-27 20:01:33 +00:00
Matt Arsenault 982224cfb8 DAGCombiner: Don't unnecessarily swap operands in ReassociateOps
In the case where op = add, y = base_ptr, and x = offset, this
transform:

(op y, (op x, c1)) -> (op (op x, y), c1)

breaks the canonical form of add by putting the base pointer in the
second operand and the offset in the first.

This fix is important for the R600 target, because for some address
spaces the base pointer and the offset are stored in separate register
classes. The old pattern caused the ISel code for matching addressing
modes to put the base pointer and offset in the wrong register classes,
which required no-trivial code transformations to fix.

llvm-svn: 262148
2016-02-27 19:57:45 +00:00
Duncan P. N. Exon Smith d3a7467221 CodeGen: Use MachineInstr& in HashMachineInstr, NFC
Also update HashEndOfMBB to take MachineBasicBlock&.

llvm-svn: 262146
2016-02-27 19:48:01 +00:00
Duncan P. N. Exon Smith 5e6e8c7a0a CodeGen: Use MachineInstr& in AntiDepBreaker API, NFC
Take parameters as MachineInstr& instead of MachineInstr* in
AntiDepBreaker API, since these are required to be non-null.  No
functionality change intended.  Looking toward PR26753.

llvm-svn: 262145
2016-02-27 19:33:37 +00:00
Duncan P. N. Exon Smith bd529fbb4a CodeGen: Assert valid MI in AntiDepBreaker::UpdateDbgValue
This already assumes a valid MI, since it dereferences the MI in an
assertion before checking for null.  At an explicit assert.

llvm-svn: 262144
2016-02-27 19:23:34 +00:00
Duncan P. N. Exon Smith 5702287809 CodeGen: Update DFAPacketizer API to take MachineInstr&, NFC
In all but one case, change the DFAPacketizer API to take MachineInstr&
instead of MachineInstr*.  In DFAPacketizer::endPacket(), take
MachineBasicBlock::iterator.  Besides cleaning up the API, this is in
search of PR26753.

llvm-svn: 262142
2016-02-27 19:09:00 +00:00
Duncan P. N. Exon Smith f9ab416d70 WIP: CodeGen: Use MachineInstr& in MachineInstrBundle.h, NFC
Update APIs in MachineInstrBundle.h to take and return MachineInstr&
instead of MachineInstr* when the instruction cannot be null.  Besides
being a nice cleanup, this is tacking toward a fix for PR26753.

llvm-svn: 262141
2016-02-27 17:05:33 +00:00
Matt Arsenault 360d244d5b DAGCombiner: Relax sqrt NaN folding check
This is OK for +0 since compares to +/-0 give the same result.

llvm-svn: 262125
2016-02-27 09:38:05 +00:00
Duncan P. N. Exon Smith 3ac9cc6156 CodeGen: Take MachineInstr& in SlotIndexes and LiveIntervals, NFC
Take MachineInstr by reference instead of by pointer in SlotIndexes and
the SlotIndex wrappers in LiveIntervals.  The MachineInstrs here are
never null, so this cleans up the API a bit.  It also incidentally
removes a few implicit conversions from MachineInstrBundleIterator to
MachineInstr* (see PR26753).

At a couple of call sites it was convenient to convert to a range-based
for loop over MachineBasicBlock::instr_begin/instr_end, so I added
MachineBasicBlock::instrs.

llvm-svn: 262115
2016-02-27 06:40:41 +00:00
Junmo Park 272a2bc365 Minor code cleanup. NFC.
llvm-svn: 262096
2016-02-27 01:10:43 +00:00
Cong Hou e0eb8bfe37 Fix a bug in isVectorReductionOp() in SelectionDAGBuilder.cpp that may cause assertion failure on AArch64.
llvm-svn: 262091
2016-02-26 23:25:30 +00:00
Amaury Sechet b2055c53ba Fix warning in DwarfCFIException. NFC
llvm-svn: 262061
2016-02-26 20:49:07 +00:00
Amaury Sechet 7067ad3c27 Extract the method to begin and end a fragment in AsmPrinterHandler in their own method. NFC
Summary: This is extracted from D17555

Reviewers: davidxl, reames, sanjoy, MatzeB, pete

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17580

llvm-svn: 262058
2016-02-26 20:30:37 +00:00
Quentin Colombet 87e23e5733 [GlobalISel] Fix a ranlib warning about empty TOC.
Fixes PR26733

llvm-svn: 262057
2016-02-26 20:05:02 +00:00
Reid Kleckner 70c9bc71d4 [WinEH] Fix funclet return block clobber mask placement
MBB slot index intervals are half open, not closed. getMBBEndIndex()
returns the slot index of the start of the next block in layout order.
Placing a register mask there is incorrect if the successor of the
funclet return is not laid out after the return. Clang generates IR for
catch bodies before generating the following normal code, so we never
noticed this issue until the D frontend authors filed a bug about it.

Instead, we can put the clobber mask on the last instruction of the
funclet return block. We still aren't using a register mask operand on
the CATCHRET instruction because it would cause PEI to spill all CSRs,
including XMM regs, in the prologue.

Fixes PR26679.

llvm-svn: 262035
2016-02-26 16:53:19 +00:00
Matthias Braun 9dcd65f478 MachineCopyPropagation: Catch copies of the form A<-B;A<-B
Differential Revision: http://reviews.llvm.org/D17475

llvm-svn: 261966
2016-02-26 03:18:55 +00:00
Matthias Braun e39ff70685 MachineCopyPropagation: Keep scanning through instructions with regmasks
This also simplifies the code by removing the overly conservative
NoInterveningSideEffect() function. This function checked:
- That the two copies belong to the same block: We only process one
  block at a time and clear our maps in between it is impossible to find a
  copy from a different block.
- There is no terminator between the two copy instructions: This is not
  allowed anyway (the MachineVerifier would complain)
- Does not have instructions with hasUnmodeledSideEffects() or isCall()
  set: Even for those instructuction we must have all clobbers/defs of
  registers explicit as an operand. If the register is explicitely
  clobbered we would never come to the point of checking for
  NoInterveningSideEffect() anyway.

(I also checked this with a temporary build of the test-suite with all
 potentially failing conditions in NoInterveningSideEffect() turned into
 asserts)

Differential Revision: http://reviews.llvm.org/D17474

llvm-svn: 261965
2016-02-26 03:18:50 +00:00
Junmo Park 820e392601 Minor code cleanups. NFC.
llvm-svn: 261955
2016-02-26 02:07:36 +00:00
David Majnemer 08dd52dc75 [WinEH] Don't remove unannotated inline-asm calls
Inline-asm calls aren't annotated with funclet bundle operands because
they don't throw and cannot be inlined through.  We shouldn't require
them to bear an funclet bundle operand.

llvm-svn: 261942
2016-02-26 00:04:25 +00:00
Hongbin Zheng 751337faa7 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261903
2016-02-25 17:54:15 +00:00
Hongbin Zheng 3f97840721 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261902
2016-02-25 17:54:07 +00:00
Hongbin Zheng 66b19fbc4e Revert "Introduce analysis pass to compute PostDominators in the new pass manager. NFC"
This reverts commit a3e5cc6a51ab5ad88d1760c63284294a4e34c018.

llvm-svn: 261891
2016-02-25 16:45:53 +00:00
Hongbin Zheng ad782ce3f7 Revert "Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC"
This reverts commit 109c38b2226a87b0be73fa7a0a8c1a81df20aeb2.

llvm-svn: 261890
2016-02-25 16:45:46 +00:00
Hongbin Zheng 237197ba63 Introduce DominanceFrontierAnalysis to the new PassManager to compute DominanceFrontier. NFC
Differential Revision: http://reviews.llvm.org/D17570

llvm-svn: 261883
2016-02-25 16:33:15 +00:00
Hongbin Zheng a0273a04f5 Introduce analysis pass to compute PostDominators in the new pass manager. NFC
Differential Revision: http://reviews.llvm.org/D17537

llvm-svn: 261882
2016-02-25 16:33:06 +00:00
Junmo Park 161dc1c605 [CodeGenPrepare] Remove load-based heuristic
Summary:
Both the hardware and LLVM have changed since 2012.
Now, load-based heuristic don't show big differences any more on OoO cores.

There is no notable regressons and improvements on spec2000/2006. (Cortex-A57, Core i5).

Reviewers: spatel, zansari
    
Differential Revision: http://reviews.llvm.org/D16836

llvm-svn: 261809
2016-02-25 00:23:27 +00:00
Cong Hou 4ce0280a41 Detecte vector reduction operations just before instruction selection.
(This is the second attemp to commit this patch, after fixing pr26652 & pr26653).

This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261804
2016-02-24 23:40:36 +00:00
Matthias Braun aca625a4fe MachineInstr: Respect register aliases in clearRegiserKills()
This fixes bugs in copy elimination code in llvm. It slightly changes the
semantics of clearRegisterKills(). This is appropriate because:
- Users in lib/CodeGen/MachineCopyPropagation.cpp and
  lib/Target/AArch64RedundantCopyElimination.cpp and
  lib/Target/SystemZ/SystemZElimCompare.cpp are incorrect without it
  (see included testcase).
- All other users in llvm are unaffected (they pass TRI==nullptr)
- (Kill flags are optional anyway so removing too many shouldn't hurt.)

Differential Revision: http://reviews.llvm.org/D17554

llvm-svn: 261763
2016-02-24 19:21:48 +00:00
Artur Pilipenko 31bcca47d3 NFC. Move isDereferenceable to Loads.h/cpp
This is a part of the refactoring to unify isSafeToLoadUnconditionally and isDereferenceablePointer functions. In subsequent change I'm going to eliminate isDerferenceableAndAlignedPointer from Loads API, leaving isSafeToLoadSpecualtively the only function to check is load instruction can be speculated.   

Reviewed By: hfinkel

Differential Revision: http://reviews.llvm.org/D16180

llvm-svn: 261736
2016-02-24 12:49:04 +00:00
Hans Wennborg d3661cd140 Revert r261633 "Supporting all entities declared in lexical scope in LLVM debug info."
This and the corresponding Clang change caused PR26715.

llvm-svn: 261671
2016-02-23 19:17:03 +00:00
Amjad Aboud fc8f296782 Supporting all entities declared in lexical scope in LLVM debug info.
Differential Revision: http://reviews.llvm.org/D15976

llvm-svn: 261633
2016-02-23 13:36:51 +00:00
David Majnemer 17525aba8a [WinEH] Visit 'unwind to caller' catchswitches nested in catchswitches
We had the right logic for the nested cleanuppad case but omitted it for
catchswitches.

llvm-svn: 261615
2016-02-23 07:18:15 +00:00
Dehao Chen f84b630044 Add prefix based function layout when profile is available.
Summary: If a function is hot, put it in text.hot section.

Reviewers: davidxl

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17532

llvm-svn: 261607
2016-02-23 03:39:24 +00:00
Duncan P. N. Exon Smith 6307eb5518 CodeGen: TII: Take MachineInstr& in predicate API, NFC
Change TargetInstrInfo API to take `MachineInstr&` instead of
`MachineInstr*` in the functions related to predicated instructions
(I'll try to come back later and get some of the rest).  All of these
functions require non-null parameters already, so references are more
clear.  As a bonus, this happens to factor away a host of implicit
iterator => pointer conversions.

No functionality change intended.

llvm-svn: 261605
2016-02-23 02:46:52 +00:00
Duncan P. N. Exon Smith b3613fce19 Revert "Add prefix based function layout when profile is available."
This reverts commit r261582, since this bot has been broken for four
hours:
http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/19399/

llvm-svn: 261604
2016-02-23 02:28:40 +00:00
Dehao Chen 25527b6c3f Include ProfileData as CodeGen's required library.
Summary: Fixing buildbot failure introduced by http://reviews.llvm.org/D17460

Reviewers: davidxl, hans

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17524

llvm-svn: 261588
2016-02-22 22:54:14 +00:00
David Majnemer 964b70d559 [X86] Create mergeable constant pool entries for AVX
We supported creating mergeable constant pool entries for smaller
constants but not for 32-byte AVX constants.

llvm-svn: 261584
2016-02-22 22:23:11 +00:00
Dehao Chen c5f76f7347 Add prefix based function layout when profile is available.
Summary: If a function is hot, put it in text.hot section.

Reviewers: davidxl

Subscribers: eraman, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D17460

llvm-svn: 261582
2016-02-22 22:14:14 +00:00
Matt Arsenault 0c6bd7b0d3 SelectionDAG: Use correct addrspace when lowering memcpy
This was causing assertions later from using the wrong pointer
size with LDS operations. getOptimalMemOpType should also have
address space arguments later.

This avoids assertions in existing tests exposed by
a future commit.

llvm-svn: 261580
2016-02-22 22:01:42 +00:00
Tim Northover d32f8e60bf ARM: sink atomic release barrier as far as possible into cmpxchg.
DMB instructions can be expensive, so it's best to avoid them if possible. In
atomicrmw operations there will always be an attempted store so a release
barrier is always needed, but in the cmpxchg case we can delay the DMB until we
know we'll definitely try to perform a store (and so need release semantics).

In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX
instructions to skip the barrier on subsequent iterations. The basic outline
becomes:

        ldrex rOld, [rAddr]
        cmp rOld, rDesired
        bne Ldone
        dmb
    Lloop:
        strex rRes, rNew, [rAddr]
        cbz rRes Ldone
        ldrex rOld, [rAddr]
        cmp rOld, rDesired
        beq Lloop
    Ldone:

So we'll skip this version for strong operations in "minsize" functions.

llvm-svn: 261568
2016-02-22 20:55:50 +00:00
Duncan P. N. Exon Smith c5b668deb8 Revert "CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC"
This reverts commit r261504, since it's not obvious the new name is
better:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160222/334298.html

I'll recommit if we get consensus that it's the right direction.

llvm-svn: 261567
2016-02-22 20:49:58 +00:00
Justin Lebar 46123a8891 Revert "[ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv."
This reverts r261543.  Accidental commit (not LGTM'ed).

llvm-svn: 261547
2016-02-22 18:17:27 +00:00
Justin Lebar f62b165a04 [ifcnv] Add comment explaining why it's OK to duplicate convergent MIs in ifcnv.
Summary:
Also add a comment briefly explaining what ifcnv is.

No functional changes.

Reviewers: resistor

Subscribers: echristo, tra, llvm-commits

Differential Revision: http://reviews.llvm.org/D17430

llvm-svn: 261543
2016-02-22 17:51:30 +00:00
Justin Lebar 3a7bc57e63 [ifcnv] Use unique_ptr in IfConversion. NFC
Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D17466

llvm-svn: 261541
2016-02-22 17:51:28 +00:00
Justin Lebar 5b82c9ba31 Don't tail-duplicate blocks that contain convergent instructions.
Summary:
Convergent instrs shouldn't be made control-dependent on other values,
but this is basically the whole point of tail duplication.  So just bail
if we see a convergent instruction.

Reviewers: iteratee

Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits

Differential Revision: http://reviews.llvm.org/D17320

llvm-svn: 261540
2016-02-22 17:50:52 +00:00
Duncan P. N. Exon Smith e59c8af705 Reapply "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261510, effectively reapplying r261509.  The
original commit missed a caller in AArch64ConditionalCompares.

Original commit message:

Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261511
2016-02-22 03:33:28 +00:00
Duncan P. N. Exon Smith 0cc90a9147 Revert "CodeGen: Use references in MachineTraceMetrics::Trace, NFC"
This reverts commit r261509.  I'm not sure how this compiled locally,
but something was out of whack.

llvm-svn: 261510
2016-02-22 03:12:42 +00:00
Duncan P. N. Exon Smith 83d3476fd2 CodeGen: Use references in MachineTraceMetrics::Trace, NFC
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.

llvm-svn: 261509
2016-02-22 03:07:49 +00:00
Duncan P. N. Exon Smith 395bd9cd63 CodeGen: Explicitly convert from iterator to pointer, NFC
llvm-svn: 261508
2016-02-22 02:53:42 +00:00
Duncan P. N. Exon Smith dc0848c029 CodeGen: MachineInstr::getIterator() => getInstrIterator(), NFC
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.

- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
  that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator().  This matches the
  naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator().  This is explicitly called
  "bundle" (not matching MachineBasicBlock) to disintinguish it clearly
  from ilist_node::getIterator().
- Update all calls.  Some of these I switched to `auto` to remove
  boiler-plate, since the new name is clear about the type.

There was one call I updated that looked fishy, but it wasn't clear what
the right answer was.  This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp.  I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.

llvm-svn: 261504
2016-02-21 22:58:35 +00:00
Duncan P. N. Exon Smith e9bc579c37 ADT: Remove == and != comparisons between ilist iterators and pointers
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.

Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators.  (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)

There should be NFC here.

llvm-svn: 261498
2016-02-21 20:39:50 +00:00
Duncan P. N. Exon Smith a848c47130 ADT: Stop using getNodePtrUnchecked on end() iterators
Stop using `getNodePtrUnchecked()` when building IR.  Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.

This should have no functionality change for now, but is a path towards
removing some more UB from ilist.

llvm-svn: 261495
2016-02-21 19:52:15 +00:00
Duncan P. N. Exon Smith 7b269642d2 CodeGen: Avoid getNodePtrUnchecked() where we need a Value, NFC
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway.  This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over.  But having this pointer available in
ilist_iterator depends on UB in the first place.

Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.

There should be no functionality change here.

llvm-svn: 261493
2016-02-21 19:37:45 +00:00
David Majnemer a3ea407d48 [X86] Use the correct alignment for COMDAT constant pool entries
COFF doesn't have sections with mergeable contents.  Instead, each
constant pool entry ends up in a COMDAT section.  The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections.  You just get whatever alignment was on the section.

If one constant needed a higher alignment in one object file from
another one, then we will get into trouble if the linker chooses the
lower alignment one.

Instead, lets promote the alignment of the constant pool entry to make
sure we don't use an under aligned constant with an instruction which
assumed otherwise.

This fixes PR26680.

llvm-svn: 261462
2016-02-21 01:30:30 +00:00
Dan Gohman d1c5a3aa21 Don't scan for SSA register operands to update when not in SSA form.
TailDuplicate can run on either on SSA code or non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.

llvm-svn: 261450
2016-02-20 21:28:18 +00:00
Simon Pilgrim c5199aae82 [DAGCombiner] Use getBitcast helper when possible. NFCI.
llvm-svn: 261437
2016-02-20 15:05:29 +00:00
Matthias Braun c65e904be8 MachineCopyPropagation: Introduce Reg2MIMap typedef; NFC
llvm-svn: 261408
2016-02-20 03:56:41 +00:00
Matthias Braun bd18d751de MachineCopyPropagation: Move variables from function to pass
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.

llvm-svn: 261407
2016-02-20 03:56:39 +00:00
Matthias Braun 273575dcbe MachineCopyPropagation: Use ranged for, cleanup; NFC
llvm-svn: 261406
2016-02-20 03:56:36 +00:00
Matthias Braun 57b5f11aa7 MachineCopyPropagation: Use assert() instead of if{report_error()} for 'impossible' condition
llvm-svn: 261405
2016-02-20 03:56:33 +00:00
Quentin Colombet e611698e84 [RegAllocFast] Properly track the physical register definitions on calls.
PR26485

llvm-svn: 261384
2016-02-20 00:32:29 +00:00
Sanjoy Das ffb7bd11f7 [StatepointLowering] Minor non-semantic cleanups
Use auto, bring file up to coding standards etc.

llvm-svn: 261358
2016-02-19 19:37:07 +00:00
Sanjoy Das f6fee29ceb [StatepointLowering] Update StatepointMaxSlotsRequired correctly
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct.  Instead
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().

llvm-svn: 261348
2016-02-19 18:15:56 +00:00
Sanjoy Das e8019df552 [StatepointLowering] Fix a mistake in rL261336
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots.  Weirdly, the tests
I added in rL261336 didn't catch this.

llvm-svn: 261347
2016-02-19 18:15:53 +00:00
Sanjoy Das 171313c69a [StatepointLowering] Change AllocatedStackSlots to use SmallBitVector
NFCI.  They key motivation here is that I'd like to use
SmallBitVector::all() in a later change.  Also, using a bit vector here
seemed better in general.

The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots.  As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.

Technically there was no need to change the operator[] type accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector like thing.

llvm-svn: 261337
2016-02-19 17:15:26 +00:00
Sanjoy Das d2db73ba59 [StatepointLowering] Fix bug in allocateStackSlot
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot.  This was originally okay (since
originally we'd only ever spill pointers), but it became not okay when
we changed our scheme to directly spill vectors of pointers.

While this change fixes the bug pointed out, it has two performance
caveats:

 - It matches spill slot and spillee size exactly, while in theory we
   can spill, e.g., an 8 byte pointer into a 16 byte slot.  This is
   slightly complicated to fix since in the stackmaps section, we report
   the size of the spill slot as the size of the "indirect value"; and
   if they're no longer equivalent, we'll have to keep track of the
   (indirect) value size separately from the stack slot size.

 - It will "spuriously run out" of reusable slots, since we now have an
   second check in the search loop in addition to the availablity
   check (e.g. you had two free scalar slots, and you first ask for a
   vector slot followed by a scalar slot).  I'll fix this in a later
   commit.

llvm-svn: 261336
2016-02-19 17:15:22 +00:00
Sanjoy Das 7b2e91fb59 [StatepointLowering] Clean up allocateStackSlot
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward.  I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.

llvm-svn: 261335
2016-02-19 17:15:17 +00:00
David Majnemer 693f13156e Shuffle header file as per the Coding Standards
llvm-svn: 261308
2016-02-19 04:46:48 +00:00
David Majnemer b61fd7fc6d [SjLjEHPrepare] Simplify/cleanup code
No functional change is intended.

llvm-svn: 261307
2016-02-19 04:46:06 +00:00
Matthias Braun 848e79c578 LegalizeDAG: Fix ExpandFCOPYSIGN assuming the same type on both inputs
llvm-svn: 261306
2016-02-19 04:44:19 +00:00
David Majnemer bd1b8c0889 [SjLjEHPrepare] Don't grab pointers to functions in doInitialization
Certain optimization passes (like globaldce) can prune function
declaration that SjLjEHPrepare assumed would exit when it'd
runOnFunction.

This fixes PR26669.

llvm-svn: 261303
2016-02-19 03:13:40 +00:00
Justin Lebar c75d566f56 When printing MIR, output to errs() rather than outs().
Summary:
Without this, this command

  $ llvm-run llc -stop-after machine-cp -o - <( echo '' )

outputs an error, because we close stdout twice -- once when closing the
file opened for "-o", and again when closing outs().

Also clarify in the outs() definition that you can't ever call it if you
want to open your own raw_fd_ostream on stdout.

Reviewers: jroelofs, tstellarAMD

Subscribers: jholewinski, qcolombet, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D17422

llvm-svn: 261286
2016-02-19 00:18:46 +00:00
Philip Reames 1960cfd323 [IR] Extend cmpxchg to allow pointer type operands
Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives:
1) It makes the IR harder to analyse (for instance, it make capture tracking overly conservative)
2) It pushes work onto the frontend authors for no real gain

This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes.

Differential Revision: http://reviews.llvm.org/D17413

llvm-svn: 261281
2016-02-19 00:06:41 +00:00
Richard Trieu 7a08381403 Remove uses of builtin comma operator.
Cleanup for upcoming Clang warning -Wcomma.  No functionality change intended.

llvm-svn: 261270
2016-02-18 22:09:30 +00:00
Philip Reames 367fdd990c Restrict scope of variables [NFC]
llvm-svn: 261250
2016-02-18 19:45:31 +00:00
Benjamin Kramer 3a16e2a26a Make header self-contained. NFC.
llvm-svn: 261234
2016-02-18 18:02:48 +00:00
Xinliang David Li 1153f194bd Stop creating covmap as note section on ELF
covmap needs to created as non allocatable, but not with
SHT_NOTE. The latter was needed to workaround a problem
of BFD linker with gc, which is no longer needed. (A more
proper longer term fix requires changing FE driver to force
referencing the section using linker script).

Differential Revision: http://reviews.llvm.org/D17309

llvm-svn: 261228
2016-02-18 17:20:22 +00:00
Matthias Braun ac697c5d8e Revert "LiveIntervalAnalysis: Remove LiveVariables requirement" and LiveIntervalTest
The commit breaks stage2 compilation on PowerPC. Reverting for now while
this is analyzed. I also have to revert the LiveIntervalTest for now as
that depends on this commit.

Revert "LiveIntervalAnalysis: Remove LiveVariables requirement"
This reverts commit r260806.
Revert "Remove an unnecessary std::move to fix -Wpessimizing-move warning."
This reverts commit r260931.
Revert "Fix typo in LiveIntervalTest"
This reverts commit r260907.
Revert "Add unittest for LiveIntervalAnalysis::handleMove()"
This reverts commit r260905.

llvm-svn: 261189
2016-02-18 05:21:43 +00:00
Adrian Prantl 3b89e6634c DwarfDebug: Don't drop the DIExpression just because a variable is
described by an immediate.

Found via http://reviews.llvm.org/D16867
Thanks to Paul Robinson for pointing this out.

<rdar://problem/24456528>

llvm-svn: 261168
2016-02-17 22:20:08 +00:00
Adrian Prantl 6f4746b11a DbgVariable: Add an accessor for the common case of a single expression
belonging to a single DBG_VALUE instruction.

NFC

llvm-svn: 261167
2016-02-17 22:19:59 +00:00
Nico Weber e6154ffbe0 Revert r261070, it caused PR26652 / PR26653.
llvm-svn: 261127
2016-02-17 18:47:29 +00:00
Cong Hou bbd4e3b400 Detecte vector reduction operations just before instruction selection.
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction of them
stay unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.

To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:

1. Reduction with another vector.
2. Reduction on all elements.

in which 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
includes several ShuffleVector and one ExtractElement instructions.


Differential revision: http://reviews.llvm.org/D15250

llvm-svn: 261070
2016-02-17 06:37:04 +00:00
Reid Kleckner 9a593ee7d2 [codeview] Bail on a DBG_VALUE register operand with no register
This apparently comes up when the register allocator decides that a
variable will become undef along a certain path.

Also improve the error message we emit when we can't map from LLVM
register number to CV register number.

llvm-svn: 261016
2016-02-16 21:49:26 +00:00
Reid Kleckner 6e0d5f573c [codeview] Fix assertion on non-memory, non-register DBG_VALUE instructions
Eventually we should find a way to describe constant variables, but it
is not obvious how to do this at the moment.

llvm-svn: 261010
2016-02-16 21:14:51 +00:00
Quentin Colombet ba2a01645b [GlobalISel] Re-apply r260922-260923 with MSVC-friendly code.
Original message:
Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.

llvm-svn: 260998
2016-02-16 19:26:02 +00:00
Aaron Ballman c6a2f2140b A signed bitfield's range is [-1,0], so assigning 1 is technically an overflow. However, the other bitfield requires a signed value (it supports negative offsets), so it is slightly better to retain a signed 1-bit bitfield and use -1 instead of 1. Silences an MSVC warning.
llvm-svn: 260973
2016-02-16 15:35:51 +00:00
Aaron Ballman fc64ef1a15 Reverting r260922-260923; they cause link failures with MSVC.
http://lab.llvm.org:8011/builders/lldb-x86-windows-msvc2015/builds/15436/steps/build/logs/stdio
http://bb.pgr.jp/builders/msbuild-llvmclang-x64-msc18-DA/builds/961/steps/build_llvm/logs/stdio

llvm-svn: 260972
2016-02-16 15:29:06 +00:00
Quentin Colombet 1ce38545fb [GlobalISel] Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.

llvm-svn: 260922
2016-02-16 00:57:44 +00:00
Zia Ansari 30a02384f7 Implemented stack symbol table ordering/packing optimization to improve data locality and code size from SP/FP offset encoding.
Differential Revision: http://reviews.llvm.org/D15393

llvm-svn: 260917
2016-02-15 23:44:13 +00:00
Matthias Braun 4a6c728cc0 LiveIntervalAnalysis: Support moving of subregister defs in handleMove
This is an updated version which fixes a bug that happened with
uses tied to an earlyclobber operand which end at an unusual slotindex.

If two definitions write to independent subregisters then they can be
put in any order. LiveIntervalAnalysis::handleMove() did not support
this previously because it looks like moving a definition of a vreg past
another one.

This is a modified version of a patch proposed (two years ago) by
Vincent Lejeune! This version does not touch the read-undef flags and is
extended for the case of moving a subregister def behind all uses - this
can happen for subregister defs that are completely unused.

Differential Revision: http://reviews.llvm.org/D9067

llvm-svn: 260906
2016-02-15 19:25:36 +00:00
Matthias Braun b3aefc3a69 MachineVerifier: Add parameter to choose if MachineFunction::verify() aborts
The abort on error behaviour is unpractical for debugger and unittest
usage.

llvm-svn: 260904
2016-02-15 19:25:31 +00:00
Ahmed Bougacha 93cff7fb82 [CodeGen] Document and use getConstant's splat-building feature. NFC.
Differential Revision: http://reviews.llvm.org/D17229

llvm-svn: 260901
2016-02-15 18:07:29 +00:00
Jonas Paulsson 98963fec41 [ScheduleDAGInstrs] isUnsafeMemoryObject() removed
This function was basically useless, since volatile memacesses or MIs with
unmodelled sideffects become global memory objects, and the other little
checks are also done elsewhere.

Reviewed by Andy Trick
http://reviews.llvm.org/D16881

llvm-svn: 260899
2016-02-15 16:43:15 +00:00
Benjamin Kramer 7f75e9403d [AggressiveAntiDepBreaker] Skip some unnecessary BitVector copies.
llvm-svn: 260825
2016-02-13 16:39:39 +00:00
Matthias Braun bbb528f189 LiveIntervalAnalysis: Remove LiveVariables requirement
This requirement was a huge hack to keep LiveVariables alive because it
was optionally used by TwoAddressInstructionPass and PHIElimination.
However we have AnalysisUsage::addUsedIfAvailable() which we can use in
those passes.

llvm-svn: 260806
2016-02-13 04:35:31 +00:00
Pirama Arumuga Nainar 7476bc89e9 Don't combine fp_round (fp_round x) if f80 to f16 is generated
Summary:
This patch skips DAG combine of fp_round (fp_round x) if it results in
an fp_round from f80 to f16.

fp_round from f80 to f16 always generates an expensive (and as yet,
unimplemented) libcall to __truncxfhf2.  This prevents selection of
native f16 conversion instructions from f32 or f64.  Moreover, the first
(value-preserving) fp_round from f80 to either f32 or f64 may become a
NOP in platforms like x86.

Reviewers: ab

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D17221

llvm-svn: 260769
2016-02-13 00:08:05 +00:00
Reid Kleckner 876330d53a [codeview] Describe local variables in registers
llvm-svn: 260746
2016-02-12 21:48:30 +00:00
Andrew Kaylor d1188ddd33 [WinEH] Prevent EH state numbering from skipping nested cleanup pads that never return
Differential Revision: http://reviews.llvm.org/D17208

llvm-svn: 260733
2016-02-12 21:10:16 +00:00
Quentin Colombet 232f447782 Get rid of some GLOBAL_ISEL ifdefs that should be harmless for code size.
More to come, but those were easy.

llvm-svn: 260723
2016-02-12 20:41:24 +00:00
Mehdi Amini 40b369cf5a GlobalISel is always built since r260566, reflect it in LLVMBuild.txt
Other component could not depends on an optional library in llvm-config

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 260701
2016-02-12 18:43:14 +00:00
Quentin Colombet ccd7725808 [IRTranslator] Use a single virtual register to represent any Value.
PR26161.

llvm-svn: 260602
2016-02-11 21:48:32 +00:00
Quentin Colombet 8fd6718700 [Target] Add a helper function to check if an opcode is invalid after isel.
llvm-svn: 260590
2016-02-11 21:16:56 +00:00
Matthias Braun c67f5a6ab1 Revert "LiveIntervalAnalysis: Support moving of subregister defs in handleMove"
This is broke a bot:

http://lab.llvm.org:8011/builders/clang-cmake-aarch64-quick/builds/4703/steps/test-suite/logs/test.log

Reverting while I investigate.

This reverts commit r260565.

llvm-svn: 260586
2016-02-11 21:07:44 +00:00
Sanjay Patel e5df1dfb14 [SelectionDAG] change getConstant() to use the input SDLoc when building splat vectors
The code change is simple enough: instead of attaching an anonymous SDLoc to splatted
vector constants, use the scalar constant's existing SDLoc since that is what is passed 
into getConstant() as a param. But this changes instruction scheduling, so I'll explain
why that happens.

The motivation for this patch starts near:
http://reviews.llvm.org/rL258833
...x86's getZeroVector() could be similarly cleaned up and I thought it would be 'NFC'.
But when I made that change locally, several x86 codegen tests wiggled.

It turns out that the lack of SDLoc consistency in getConstant() changes the way 
ScheduleDAGRRList behaves. This is because the SDLoc contains 'IROrder' and some DAG
scheduler algorithms use IROrder for tie-breaking.

Differential Revision: http://reviews.llvm.org/D16972

llvm-svn: 260582
2016-02-11 20:21:24 +00:00
Quentin Colombet fd9d0a07d8 [GlobalISel] Add the necessary plumbing to lower formal arguments.
llvm-svn: 260579
2016-02-11 19:59:41 +00:00
Peter Collingbourne 7c384ccea2 DwarfDebug: emit type units immediately.
Rather than storing type units in a vector and emitting them at the end
of code generation, emit them immediately and destroy them, reclaiming the
memory we were using for their DIEs.

In one benchmark carried out against Chromium's 50 largest (by bitcode
file size) translation units, total peak memory consumption with type units
decreased by median 17%, or by 7% when compared against disabling type units.

Tested using check-{llvm,clang}, the GDB 7.5 test suite (with
'-fdebug-types-section') and by eyeballing llvm-dwarfdump output on those
Chromium translation units with split DWARF both disabled and enabled, and
verifying that the only changes were to addresses and abbreviation ordering.

Differential Revision: http://reviews.llvm.org/D17118

llvm-svn: 260578
2016-02-11 19:57:46 +00:00
Reid Kleckner 829365aeef [codeview] Fix bug around multi-level wrapper inlining
If there were wrapper functions with no instructions of their own in the
inlining tree, we would fail to emit InlineSite records for them.

llvm-svn: 260571
2016-02-11 19:41:47 +00:00
Quentin Colombet 2e00253750 Play nice with Visual Studio and attributes
llvm-svn: 260568
2016-02-11 19:33:21 +00:00
Quentin Colombet bde158cbc7 [CMake] Produce an empty library for GlobalISel when not building it.
The rational for this change is that LLVMBuild cannot express conditional 
dependencies. Therefore, when we start optionally using GlobalISel library for 
say AArch64, without that change, all the tools that use the AArch64 library 
would need to explicitly link with GlobalISel when we ask for it.

This does not scale.

Instead, we will set the dependencies between the target and GlobalISel and if 
we did not ask to build GlobalISel, the library will just be empty.

Thanks to Chris Bieneman and Mehdi Animi for the idea.

llvm-svn: 260566
2016-02-11 19:18:27 +00:00
Matthias Braun 33c641bddf LiveIntervalAnalysis: Support moving of subregister defs in handleMove
If two definitions write to independent subregisters then they can be
put in any order. LiveIntervalAnalysis::handleMove() did not support
this previously because it looks like moving a definition of a vreg past
another one.

This is a modified version of a patch proposed (two years ago) by
Vincent Lejeune! This version does not touch the read-undef flags and is
extended for the case of moving a subregister def behind all uses - this
can happen for subregister defs that are completely unused.

Differential Revision: http://reviews.llvm.org/D9067

llvm-svn: 260565
2016-02-11 19:03:53 +00:00
Quentin Colombet 74d7d2f00b [GlobalISel] Teach the IRTranslator how to lower returns.
llvm-svn: 260562
2016-02-11 18:53:28 +00:00
Quentin Colombet 9855111b77 [GlobalISel] Add a type to MachineInstr.
We actually need that information only for generic instructions, therefore it
would be nice not to have to pay the extra memory consumption for all
instructions. Especially because a typed non-generic instruction does not make
sense.

The question is then, is it possible to have that information in a union or
something?
My initial thought was that we could have a derived class GenericMachineInstr
with additional information, but in practice it makes little to no sense since
generic MachineInstrs are likely turned into non-generic ones by just switching
the opcode. In other words, we don't want to go through the process of creating
a new, non-generic MachineInstr, object each time we do this switch. The memory
benefit probably is not worth the extra compile time.

Another option would be to keep the type of the MachineInstr in a side table.
This would induce an extra indirection though.

Anyway, I will file a PR to discuss about it and remember we need to come back
to it at some point.

llvm-svn: 260558
2016-02-11 18:22:37 +00:00
Quentin Colombet 37a09a8428 [GlobalISel] Add a hook in TargetConfigPass to run GlobalISel.
llvm-svn: 260553
2016-02-11 17:57:22 +00:00
Quentin Colombet a7fae162e6 [GlobalISel][IRTranslator] Change the ownership of the MIRBuilder field.
llvm-svn: 260551
2016-02-11 17:53:23 +00:00
Quentin Colombet 4f0ec8d2b0 [GlobalISel][IRTranslator] Fix a typo in assert.
llvm-svn: 260550
2016-02-11 17:52:28 +00:00
Quentin Colombet 17c494b91c [GlobalISel][IRTranslator] Teach the pass how to translate Add instructions.
llvm-svn: 260549
2016-02-11 17:51:31 +00:00
Quentin Colombet 2ad1f851a1 [GlobalISel] Add a MachineIRBuilder class.
Helper class to build machine instrs. This is a higher abstraction
than MachineInstrBuilder.

llvm-svn: 260547
2016-02-11 17:44:59 +00:00
Benjamin Kramer e3b963d5ee Drop the hidden visibility from DebugHandlerBase for now.
If a class has hidden visibility all derived classes and all classes
that have it as a member must have hidden visibility too. That may
be fixable here but requires changes to quite a lot of debug info
classes.

This is also one of the things that GCC enforces aggressively while
clang ignores it, making testing more annoying than necessary.

llvm-svn: 260529
2016-02-11 15:41:56 +00:00
Quentin Colombet e1494c353a [GlobalISel][MachineRegisterInfo] Add a method to create generic vregs.
For now, generic virtual registers will not have a register class. We may want
to change that. For instance, if we want to use all the methods from
TargetRegisterInfo with generic virtual registers, we need to either have some
sort of generic register classes that do what we want, or teach those methods
how to deal with nullptr register class.

Although the latter seems easy enough to do, we may still want to differenciate
generic register classes from nullptr to catch cases where nullptr gets
introduced by a bug of some sort.

Anyway, I will file a PR to keep track of that.

llvm-svn: 260474
2016-02-11 00:19:17 +00:00
Quentin Colombet 36ce1b0157 [GlobalISel] Remember the size of generic virtual registers
llvm-svn: 260468
2016-02-10 23:43:48 +00:00
Quentin Colombet 2ecff3bff2 [GlobalISel] More detailed skeleton for the IRTranslator.
llvm-svn: 260456
2016-02-10 22:59:27 +00:00
Reid Kleckner f9c275fe0a [codeview] Describe int local variables using .cv_def_range
Summary:
Refactor common value, scope, and label tracking logic out of DwarfDebug
into a common base class called DebugHandlerBase.

Update an old LLVM IR test case to avoid an assertion in LexicalScopes.

Reviewers: dblaikie, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16931

llvm-svn: 260432
2016-02-10 20:55:49 +00:00
Ahmed Bougacha f8dfb47c02 [CodeGen] Prefer "if (SDValue R = ...)" to "if (R.getNode())". NFCI.
llvm-svn: 260316
2016-02-09 22:54:12 +00:00
Sanjay Patel 73200f72de [SelectionDAG] make getMemBasePlusOffset() accessible; NFCI
I reinvented this functionality in http://reviews.llvm.org/D16828 because it was
hidden away as a static function. The changes in x86 are not based on a complete
audit. I suspect there are other possible uses there, and there are almost certainly
more potential users in other targets.

llvm-svn: 260295
2016-02-09 21:42:04 +00:00
Andrew Kaylor 1224488e0c [regalloc][WinEH] Do not mark intervals as not spillable if they contain a regmask
Differential Revision: http://reviews.llvm.org/D16831

llvm-svn: 260164
2016-02-08 22:52:51 +00:00
Hans Wennborg 850ec6ca18 [X86] Don't zero/sign-extend i1, i8, or i16 return values to 32 bits (PR22532)
This matches GCC and MSVC's behaviour, and saves on code size.

We were already not extending i1 return values on x86_64 after r127766. This
takes that patch further by applying it to x86 target as well, and also for i8
and i16.

The ABI docs have been unclear about the required behaviour here. The new i386
psABI [1] clearly states (Table 2.4, page 14) that i1, i8, and i16 return
vales do not need to be extended beyond 8 bits. The x86_64 ABI doc is being
updated to say the same [2].

Differential Revision: http://reviews.llvm.org/D16907

 [1]. https://01.org/sites/default/files/file_attach/intel386-psabi-1.0.pdf
 [2]. https://groups.google.com/d/msg/x86-64-abi/E8O33onbnGQ/_RFWw_ixDQAJ

llvm-svn: 260133
2016-02-08 19:34:30 +00:00
Matt Arsenault 2bba779272 SelectionDAG: Lower some range metadata to AssertZext
If a range has a lower bound of 0, add an AssertZext from the
nearest floor power of two.

This allows operations with some workitem intrinsics with known
maximum ranges to use fast 24-bit multiplies.

llvm-svn: 260109
2016-02-08 16:28:19 +00:00
Sanjoy Das 86d7d83f2a [StatepointLower] Use None instead of Optional<int>()
llvm-svn: 259956
2016-02-05 23:40:04 +00:00
Wei Mi a62f058989 Some stackslots are allocated to vregs which have no real reference.
LiveRangeEdit::eliminateDeadDef is used to remove dead define instructions
after rematerialization. To remove a VNI for a vreg from its LiveInterval,
LiveIntervals::removeVRegDefAt is used. However, after non-PHI VNIs are all
removed, PHI VNI are still left in the LiveInterval. Such unused vregs will
be kept in RegsToSpill[] at the end of InlineSpiller::reMaterializeAll and
spiller will allocate stackslot for them.

The fix is to get rid of unused reg by checking whether it has non-dbg
reference instead of whether it has non-empty interval.

llvm-svn: 259895
2016-02-05 18:14:24 +00:00
Matt Arsenault 5923973fe2 Fix printing of f16 machine operands
Only single and double FP immediates are correctly printed by
MachineInstr::print() during debug output. Half float type goes to
APFloat::convertToDouble() and hits assertion it is not a double
semantics. This diff prints half machine operands correctly.

This cannot currently be hit by any in-tree target.

Patch by Stanislav Mekhanoshin

llvm-svn: 259857
2016-02-05 00:50:18 +00:00
Nemanja Ivanovic e8cbae32e9 Enable the %s modifier in inline asm template string
This patch corresponds to review:
http://reviews.llvm.org/D16847

There are some files in glibc that use the output operand modifier even though
it was deprecated in GCC. This patch just adds support for it to prevent issues
with such files.

llvm-svn: 259798
2016-02-04 16:18:08 +00:00
Petar Jovanovic 23e44f5e39 [Power PC] softening long double type
This patch implements softening of long double type (ppcf128) on ppc32
architecture and enables operations for this type for soft float.

Patch by Strahinja Petrovic.

Differential Revision: http://reviews.llvm.org/D15811

llvm-svn: 259791
2016-02-04 14:43:50 +00:00
Jonas Paulsson 2293685731 [ScheduleDagInstrs] Improved comments
llvm-svn: 259783
2016-02-04 13:08:48 +00:00
Sanjay Patel e9fa3363b4 rangify; NFCI
llvm-svn: 259722
2016-02-03 22:44:14 +00:00
Reid Kleckner eb3bcdd28b [codeview] Remove EmitLabelDiff in favor emitAbsoluteSymbolDiff
llvm-svn: 259700
2016-02-03 21:24:42 +00:00
Reid Kleckner dac21b43d5 [codeview] Use the MCStreamer interface directly instead of AsmPrinter
This is mostly about having shorter lines and standardizing on one
interface, but it also avoids some needless indirection.

No functional change.

llvm-svn: 259697
2016-02-03 21:15:48 +00:00
Keno Fischer 6c1e47a66b [DWARFDebug] Fix another case of overlapping ranges
Summary:
In r257979, I added code to ensure that we wouldn't merge DebugLocEntries if
the pieces they describe overlap. Unfortunately, I failed to cover the case,
where there may have multiple active Expressions in the entry, in which case we
need to make sure that no two values overlap before we can perform the merge.

This fixed PR26148.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16742

llvm-svn: 259696
2016-02-03 21:13:33 +00:00
Tim Shen f99f0d5a7e [SelectionDAG] Fix CombineToPreIndexedLoadStore O(n^2) behavior
This patch consists of two parts: a performance fix in DAGCombiner.cpp
and a correctness fix in SelectionDAG.cpp.

The test case tests the bug that's uncovered by the performance fix, and
fixed by the correctness fix.

The performance fix keeps the containers required by the
hasPredecessorHelper (which is a lazy DFS) and reuse them. Since
hasPredecessorHelper is called in a loop, the overall efficiency reduced
from O(n^2) to O(n), where n is the number of SDNodes.

The correctness fix keeps iterating the neighbor list even if it's time
to early return. It will return after finishing adding all neighbors to
Worklist, so that no neighbors are discarded due to the original early
return.

llvm-svn: 259691
2016-02-03 20:58:55 +00:00
Jonas Paulsson ac29f01788 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
Recommited, after some fixing with test cases.

Updated test cases:
test/CodeGen/AArch64/arm64-misched-memdep-bug.ll
test/CodeGen/AArch64/tailcall_misched_graph.ll

Temporarily disabled test cases:
test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll
test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated)
test/CodeGen/PowerPC/vsx-fma-m.ll
test/CodeGen/PowerPC/vsx-fma-sp.ll

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259673
2016-02-03 17:52:29 +00:00
Jun Bum Lim 59df5e89c2 [MachineCopyPropagation] Fix comment. NFC
Reviewers: MatzeB, qcolombet, jmolloy, mcrosier

Subscribers: llvm-commits, mcrosier

Differential Revision: http://reviews.llvm.org/D16806

llvm-svn: 259656
2016-02-03 15:56:27 +00:00
Marcello Maggioni bfe87568aa RegCoalescer: Making sure re-materialization defines all subranges
The register coalescer can rematerialize constants that define
more of a register than the copy it is going to replace was going
to do.
This is valid in the case the register was undef before the
copy happened.
This patch makes sure that all the subranges defined by the new
rematerialization instructions have at least a dead def.

Review: http://reviews.llvm.org/D16693
llvm-svn: 259614
2016-02-03 00:22:32 +00:00
David Majnemer 30579ec851 [codeview] Improve readability of codeview assembly output
Strictly speaking, this is not an improvement in functionality per se
but a usability improvement to those debugging codeview.

llvm-svn: 259601
2016-02-02 23:18:23 +00:00
Matthias Braun 1377fd6781 MachineVerifier: Check that defs/uses are live in subregisters as well.
llvm-svn: 259552
2016-02-02 20:04:51 +00:00
David Majnemer c9911f28e5 [codeview] Correctly handle inlining functions post-dominated by unreachable
CodeView requires us to accurately describe the extent of the inlined
code.  We did this by grabbing the next debug location in source order
and using *that* to denote where we stopped inlining.  However, this is
not sufficient or correct in instances where there is no next debug
location or the next debug location belongs to the start of another
function.

To get this correct, use the end symbol of the function to denote the
last possible place the inlining could have stopped at.

llvm-svn: 259548
2016-02-02 19:22:34 +00:00
Eugene Zelenko ecefe5a81f Fix Clang-tidy readability-redundant-control-flow warnings; other minor fixes.
Differential revision: http://reviews.llvm.org/D16793

llvm-svn: 259539
2016-02-02 18:20:45 +00:00
Reid Kleckner 1fcd610c94 [codeview] Wire up the .cv_inline_linetable directive
This directive emits the binary annotations that describe line and code
deltas in inlined call sites. Single-stepping through inlined frames in
windbg now works.

llvm-svn: 259535
2016-02-02 17:41:18 +00:00
David Majnemer ccc809e2e6 [RegisterCoalescer] Better DebugLoc for reMaterializeTrivialDef
When rematerializing a computation by replacing the copy, use the copy's
location.  The location of the copy is more representative of the
original program.

This partially fixes PR10003.

llvm-svn: 259469
2016-02-02 06:41:55 +00:00
Matthias Braun 579c9cda13 MachineVerifier: Use report_context() instead of ad-hoc messages.
llvm-svn: 259457
2016-02-02 02:44:25 +00:00
Anna Zaks cad7994c3b [safestack] Make sure the unsafe stack pointer is popped in all cases
The unsafe stack pointer is only popped in moveStaticAllocasToUnsafeStack so it won't happen if there are no static allocas.

Fixes https://llvm.org/bugs/show_bug.cgi?id=26122

Differential Revision: http://reviews.llvm.org/D16339

llvm-svn: 259447
2016-02-02 01:03:11 +00:00
Balaram Makam 92431703d7 AArch64: Implement missed conditional compare sequences.
Summary:
This is an extension to the existing implementation of r242436 which
restricts to only select inputs. This version fixes missed opportunities
in pr26084 by attempting to lower conditional compare sequences of
and/or trees with setcc leafs. This will additionaly handle the case
when a tree with select input is not a conjunction-disjunction tree
but some of the sub trees are conjunction-disjunction trees.

Reviewers: jmolloy, t.p.northover, mcrosier, MatzeB

Subscribers: mcrosier, llvm-commits, junbuml, haicheng, mssimpso, gberry

Differential Revision: http://reviews.llvm.org/D16291

llvm-svn: 259387
2016-02-01 19:13:07 +00:00
Geoff Berry b13b2eed11 [PrologEpilogInserter] Add some debug output for callee-save frame object allocation
Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16733

llvm-svn: 259367
2016-02-01 16:47:51 +00:00
Amjad Aboud 8bbce8ad8e Improved macro emission in dwarf.
Changed emitting offset of macinfo entry into compiler unit DIE to use "addSectionLabel" method rather than explicitly calculating size/offset of macro entry.

Differential Revision: http://reviews.llvm.org/D16292

llvm-svn: 259358
2016-02-01 14:09:41 +00:00
David Majnemer 784d4a455b Revert r258580 and r258581.
Those commits created an artificial edge from a cleanup to a synthesized
catchswitch in order to get the MSVC personality routine to execute
cleanups which don't cleanupret and are not wrapped by a catchswitch.

This worked well enough but is not a complete solution in situations
where there the cleanup infinite loops.

However, the real deal breaker behind this approach comes about from a
degenerate case where the cleanup is post-dominated by unreachable *and*
throws an exception.  This ends poorly because the catchswitch will
inadvertently catch the exception.

Because of this we should go back to our previous behavior of not
executing certain cleanups (identical behavior with the Itanium ABI
implementation in clang, GCC and ICC).

N.B. I think this could be salvaged by making the catchpad rethrow the
exception and properly transforming throwing calls in the cleanup into
invokes.

llvm-svn: 259338
2016-02-01 03:29:38 +00:00
Tim Shen 3b428cb764 [SelectionDAG] Eliminate exponential behavior in WalkChainUsers
llvm-svn: 259315
2016-01-31 03:59:34 +00:00
Matthias Braun b30f2f5141 Avoid overly large SmallPtrSet/SmallSet
These sets perform linear searching in small mode so it is never a good
idea to use SmallSize/N bigger than 32.

llvm-svn: 259283
2016-01-30 01:24:31 +00:00
Manman Ren c77e0ff785 [Objective-C] Support a new special module flag.
"Objective-C Class Properties" will be put into the objc_imageinfo struct.

rdar://23891898

llvm-svn: 259270
2016-01-29 23:51:00 +00:00
Yaron Keren eb2a25467e Annotate dump() methods with LLVM_DUMP_METHOD, addressing Richard Smith r259192 post commit comment.
clang part in r259232, this is the LLVM part of the patch.

llvm-svn: 259240
2016-01-29 20:50:44 +00:00
Reid Kleckner f3b9ba4941 [codeview] Begin to add support for inlined call sites
Summary:
There are three parts to inlined call frames:
1. The inlinee line subsection
2. The inline site symbol record
3. The function ids referenced by both

This change starts by emitting function ids (3) for all subprograms and
emitting the base inline site symbol record (2). The actual line numbers
in (2) use an encoded format that will come next, along with the inlinee
line subsection.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16333

llvm-svn: 259217
2016-01-29 18:16:43 +00:00
Jonas Paulsson 8c738635b1 Temporarily revert "[ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten."
Some buildbot failures needs to be debugged.

llvm-svn: 259213
2016-01-29 17:22:43 +00:00
Jonas Paulsson 23f12e5c02 [ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten.
The buildSchedGraph() was in need of reworking as the AA features had been
added on top of earlier code. It was very difficult to understand, and buggy.
There had been found cases where scheduling dependencies had actually been
missed (see r228686).

AliasChain, RejectMemNodes, adjustChainDeps() and iterateChainSucc() have
been removed. There are instead now just the four maps from Value to SUs, which
have been renamed to Stores, Loads, NonAliasStores and NonAliasLoads.

An unknown store used to become the AliasChain, but now becomes a store mapped
to 'unknownValue' (in Stores). What used to be PendingLoads is instead the
list of SUs mapped to 'unknownValue' in Loads.

RejectMemNodes and adjustChainDeps() used to be a safety-net for everything.
The SU maps were sometimes cleared and SUs were put in RejectMemNodes, where
adjustChainDeps() would look. Instead of this, a more straight forward approach
is used in maintaining the SU maps without clearing them and simply letting
them grow over time. Instead of the cutt-off in adjustChainDeps() search, a
reduction of maps will be done if needed (see below).

Each SUnit either becomes the BarrierChain, or is put into one of the maps. For
each SUnit encountered, all the information about previous ones are still
available until a new BarrierChain is set, at which point the maps are cleared.

For huge regions, the algorithm becomes slow, therefore the maps will get
reduced at a threshold (current default is 1000 nodes), by a fraction (default 1/2).
These values can be tuned by use of CL options in case some test case shows that
they need to be changed (-dag-maps-huge-region and -dag-maps-reduction-size).

There has not been any considerable change observed in output quality or compile
time. There may now be more DAG edges inserted than before (i.e. if A->B->C,
then A->C is not needed). However, in a comparison run there were fewer total
calls to AA, and a somewhat improved compile time, which means this seems to
be not a problem.

http://reviews.llvm.org/D8705
Reviewers: Hal Finkel, Andy Trick.

llvm-svn: 259201
2016-01-29 16:11:18 +00:00
Junmo Park 67bb3f1d27 Minor code cleanup. NFC.
llvm-svn: 259139
2016-01-29 01:39:39 +00:00
Reid Kleckner 2214ed8937 Reland "[CodeView] Use assembler directives for line tables"
This reverts commit r259126 and relands r259117.

This time with updated library dependencies.

llvm-svn: 259130
2016-01-29 00:49:42 +00:00
Reid Kleckner 00d9639c24 Revert "[CodeView] Use assembler directives for line tables"
This reverts commit r259117.

The LineInfo constructor is defined in the codeview library and we have
to link against it now. Doing that isn't trivial, so reverting for now.

llvm-svn: 259126
2016-01-29 00:13:28 +00:00
Reid Kleckner c62e379d22 [CodeView] Use assembler directives for line tables
Adds a new family of .cv_* directives to LLVM's variant of GAS syntax:

- .cv_file: Similar to DWARF .file directives

- .cv_loc: Similar to the DWARF .loc directive, but starts with a
  function id. CodeView line tables are emitted by function instead of
  by compilation unit, so we needed an extra field to communicate this.
  Rather than overloading the .loc direction further, we decided it was
  better to have our own directive.

- .cv_stringtable: Emits the codeview string table at the current
  position. Currently this just contains the filenames as
  null-terminated strings.

- .cv_filechecksums: Emits the file checksum table for all files used
  with .cv_file so far. There is currently no support for emitting
  actual checksums, just filenames.

This moves the line table emission code down into the assembler.  This
is in preparation for implementing the inlined call site line table
format. The inline line table format encoding algorithm requires knowing
the absolute code offsets, so it must run after the assembler has laid
out the code.

David Majnemer collaborated on this patch.

llvm-svn: 259117
2016-01-28 23:31:52 +00:00
David Majnemer 4543ff09a2 [X86] Don't transform X << 1 to X + X during type legalization
While legalizing a 64-bit shift left by 1, the following occurs:

We split the shift operand in half: a high half and a low half.
We then create an ADDC with the low half and a ADDE with the high half +
the carry bit from the ADDC.

This is problematic if X is any_ext'd because the high half computation
is now undef + undef + carry bit and there is no way to ensure that the
two undef values had the same bitwise representation.  This results in
the lowest bit in the high half turning into garbage.

Instead, do not try to turn shifts into arithmetic during type
legalization.

This fixes PR26350.

llvm-svn: 259065
2016-01-28 18:20:05 +00:00
Oliver Stannard 02fa1c80c4 Revert r259035, it introduces a cyclic library dependency
llvm-svn: 259045
2016-01-28 13:19:47 +00:00
Oliver Stannard b4b092ea1b Add backend dignostic printer for unsupported features
Re-commit of r258951 after fixing layering violation.

The related LLVM patch adds a backend diagnostic type for reporting
unsupported features, this adds a printer for them to clang.

In the case where debug location information is not available, I've
changed the printer to report the location as the first line of the
function, rather than the closing brace, as the latter does not give the
user any information. This also affects optimisation remarks.

Differential Revision: http://reviews.llvm.org/D16590

llvm-svn: 259035
2016-01-28 10:07:27 +00:00
Junmo Park 7d6c5f19f1 Minor code cleanups. NFC.
llvm-svn: 259033
2016-01-28 09:42:39 +00:00
Junmo Park b3327b7007 [DAGCombiner] Don't add volatile or indexed stores to ChainedStores
Summary:
findBetterNeighborChains does not handle volatile or indexed stores.
However, it did not check when adding stores to ChainedStores.

Reviewers: arsenm

Differential Revision: http://reviews.llvm.org/D16463

llvm-svn: 259024
2016-01-28 06:23:33 +00:00
NAKAMURA Takumi 628a7a0aef Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features"
It broke layering violation in LLVMIR.

clang r258950 "Add backend dignostic printer for unsupported features"
llvm  r258951 "Refactor backend diagnostics for unsupported features"

llvm-svn: 259016
2016-01-28 04:41:32 +00:00
Benjamin Kramer 391be792f2 One more batch of self-containing headers.
llvm-svn: 258974
2016-01-27 19:29:56 +00:00
Oliver Stannard 1e67a9f196 Refactor backend diagnostics for unsupported features
The BPF and WebAssembly backends had identical code for emitting errors
for unsupported features, and AMDGPU had very similar code. This merges
them all into one DiagnosticInfo subclass, that can be used by any
backend.

There should be minimal functional changes here, but some AMDGPU tests
have been updated for the new format of errors (it used a slightly
different format to BPF and WebAssembly). The AMDGPU error messages will
now benefit from having precise source locations when debug info is
available.

The implementation of DiagnosticInfoUnsupported::print must be in
lib/Codegen rather than in the existing file in lib/IR/ to avoid
introducing a dependency from IR to CodeGen.

Differential Revision: http://reviews.llvm.org/D16590

llvm-svn: 258951
2016-01-27 17:30:33 +00:00
Benjamin Kramer 390c33cd18 Move SafeStack to CodeGen.
It depends on the target machinery, that's not available for
instrumentation passes.

llvm-svn: 258942
2016-01-27 16:53:42 +00:00
Benjamin Kramer f9172fd4ac Rename TargetSelectionDAGInfo into SelectionDAGTargetInfo and move it to CodeGen/
It's a SelectionDAG thing, not a Target thing.

llvm-svn: 258939
2016-01-27 16:32:26 +00:00
Benjamin Kramer c67cf31f5c Move passes that live in lib/CodeGen out of Scalar.h
llvm-svn: 258938
2016-01-27 16:05:42 +00:00
Benjamin Kramer 820f7548a1 Make some headers self-contained, remove unused includes that violate layering.
llvm-svn: 258937
2016-01-27 16:05:37 +00:00
Benjamin Kramer b3e8a6d2b8 Move MCTargetAsmParser.h to llvm/MC/MCParser where it belongs.
llvm-svn: 258917
2016-01-27 10:01:28 +00:00
Chris Bieneman e49730d4ba Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

"I felt a great disturbance in the [build system], as if millions of [makefiles] suddenly cried out in terror and were suddenly silenced. I fear something [amazing] has happened."
- Obi Wan Kenobi

Reviewers: chandlerc, grosbach, bob.wilson, tstellarAMD, echristo, whitequark

Subscribers: chfast, simoncook, emaste, jholewinski, tberghammer, jfb, danalbert, srhines, arsenm, dschuff, jyknight, dsanders, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D16471

llvm-svn: 258861
2016-01-26 21:29:08 +00:00
Chad Rosier b46d0f9a71 [ScheduleDAGInstrs] Simplify logic to improve readability. NFC.
The call to isInvariantLoad() already returns false for non-load instructions.

llvm-svn: 258841
2016-01-26 19:33:57 +00:00
Sanjay Patel 59066a0803 tidy up; NFC
llvm-svn: 258838
2016-01-26 19:30:14 +00:00
Sanjay Patel 9eed9a956f fix formatting; NFC
llvm-svn: 258825
2016-01-26 18:14:37 +00:00
Matthias Braun db320773e1 LiveIntervalAnalysis: Improve some comments
As recommended by Justin.

llvm-svn: 258771
2016-01-26 01:40:48 +00:00
Matthias Braun 242b8bb58d LiveIntervalAnalysis: Cleanup handleMove{Down|Up}() functions, NFC
These two functions are hard to reason about. This commit makes the code
more comprehensible:

- Use four distinct variables (OldIdxIn, OldIdxOut, NewIdxIn, NewIdxOut)
  with a fixed value instead of a changing iterator I that points to
  different things during the function.
- Remove the early explanation before the function in favor of more
  detailed comments inside the function. Should have more/clearer comments now
  stating which conditions are tested and which invariants hold at
  different points in the functions.

The behaviour of the code was not changed.

I hope that this will make it easier to review the changes in
http://reviews.llvm.org/D9067 which I will adapt next.

Differential Revision: http://reviews.llvm.org/D16379

llvm-svn: 258756
2016-01-26 00:43:50 +00:00
Dan Gohman 5016c0f99d [SelectionDAG] Use the correct return type for memcpy, memmove, and memset.
When generating calls to memcpy, memmove, and memset, use void* as the return
type rather than void, to match the standard signatures for these functions.

This has no practical effect for most targets, since the return values of
these calls aren't being used anyway, and most calling conventions tolerate
this kind of mismatch. However, this change will help support future
optimizations to utilize the return value to avoid holding the argument
value live across a call.

llvm-svn: 258691
2016-01-25 15:05:56 +00:00
Amjad Aboud c077841492 Fixed few comments.
llvm-svn: 258658
2016-01-24 08:18:55 +00:00
David Majnemer b7d49268c2 [WinEH] Don't miscompile cleanups which conditionally unwind to caller
A cleanup can have paths which unwind or end up in unreachable.
If there is an unreachable path *and* a path which unwinds to caller,
we would mistakenly inject an unwind path to a catchswitch on the
unreachable path.  This results in a verifier assertion firing because
the cleanup unwinds to two different places: to the caller and to the
catchswitch.

This occured because we used getCleanupRetUnwindDest to determine if the
cleanuppad had no cleanuprets.
This is incorrect, getCleanupRetUnwindDest returns null for cleanuprets
which unwind to caller.

llvm-svn: 258651
2016-01-23 23:54:33 +00:00
Simon Pilgrim 02c1b54a4a [SelectionDAG] Generalised the CONCAT_VECTORS creation to support BUILD_VECTOR and UNDEF folding.
llvm-svn: 258646
2016-01-23 22:27:54 +00:00
Simon Pilgrim b9b8fcd831 Tidied up TRUNC combine code. NFC.
Make use of DAG.getBitcast and use clang-format to reduce number of lines (and make it more readable).

llvm-svn: 258644
2016-01-23 21:50:40 +00:00
Benjamin Kramer 58e1998520 Don't check if a list is empty with ilist::size.
ilist::size() is O(n) while ilist::empty() is O(1)

llvm-svn: 258636
2016-01-23 20:58:09 +00:00
Junmo Park 75e9d64aa2 Remove extra whitespace. NFC.
llvm-svn: 258617
2016-01-23 06:34:36 +00:00
David Majnemer f1ff538456 [WinEH] Let cleanups post-dominated by unreachable get executed
Cleanups in C++ are a little weird.  They are only guaranteed to be
reliably executed if, and only if, there is a viable catch handler which
can handle the exception.

This means that reachability of a cleanup is lexically determined by it
being nested with a try-block which unwinds to a catch.  It is *cannot*
be reasoned about by examining the control flow edges leaving a cleanup.

Usually this is not a problem.  It becomes a problem when there are *no*
edges out of a cleanup because we believed that code post-dominated by
the cleanup is dead.  In LLVM's case, this code is what informs the
personality routine about the presence of a suitable catch handler.
However, the lack of edges to that catch handler makes the handler
become unreachable which causes us to remove it.  By removing the
handler, the cleanup becomes unreachable.

Instead, inject a catch-all handler with every cleanup that has no
unwind edges.  This will allow us to properly unwind the stack.

This fixes PR25997.

llvm-svn: 258580
2016-01-22 23:20:43 +00:00
Weiming Zhao 13e7cb294c Fix LivePhysRegs::addLiveOuts
Summary:
The testing for returnBB was flipped which may cause ARM ld/st opt pass uses callee saved regs in returnBB when shrink-wrap is used.


Reviewers: t.p.northover, apazos, MatzeB

Subscribers: mcrosier, zzheng, aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D16434

llvm-svn: 258569
2016-01-22 22:21:34 +00:00
Sanjay Patel 3388d1fc6d function names start with a lowercase letter; NFC
llvm-svn: 258552
2016-01-22 21:11:47 +00:00
David Majnemer 734d7c3272 [WinEH] Make collectFuncletMembers non-recursive
Use a worklist for the pre-order DFS instead of using recursion.
No functionality change is intended.

llvm-svn: 258521
2016-01-22 18:49:50 +00:00
Dan Gohman 0bf3ae84ca [SelectionDAG] Fold more offsets into GlobalAddresses
This reapplies r258296 and r258366, and also fixes an existing bug in
SelectionDAG.cpp's isMemSrcFromString, neglecting to account for the
offset in a GlobalAddressSDNode, which is uncovered by those patches.

llvm-svn: 258482
2016-01-22 03:57:34 +00:00
Eduard Burtescu 68e7f49f8e [opaque pointer types] [NFC] DataLayout::getIndexedOffset: take source element type instead of pointer type and rename to getIndexedOffsetInType.
Summary:

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16282

llvm-svn: 258478
2016-01-22 03:08:27 +00:00
Eduard Burtescu 1423921a24 [opaque pointer types] [NFC] Add an explicit type argument to ConstantFoldLoadFromConstPtr.
Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16418

llvm-svn: 258472
2016-01-22 01:17:26 +00:00
Reid Kleckner b7ecfa5b09 Revert "[SelectionDAG] Fold more offsets into GlobalAddresses"
This reverts r258296 and the follow up r258366. With this change, we
miscompiled the following program on Windows:
  #include <string>
  #include <iostream>
  static const char kData[] = "asdf jkl;";
  int main() {
    std::string s(kData + 3, sizeof(kData) - 3);
    std::cout << s << '\n';
  }

llvm-svn: 258465
2016-01-22 01:09:29 +00:00
Reid Kleckner 18ec96f0fc Avoid unnecessary stack realignment in musttail thunks with SSE2 enabled
The X86 musttail implementation finds register parameters to forward by
running the calling convention algorithm until a non-register location
is returned. However, assigning a vector memory location has the side
effect of increasing the function's stack alignment. We shouldn't
increase the stack alignment when we are only looking for register
parameters, so this change conditionalizes it.

llvm-svn: 258442
2016-01-21 22:23:22 +00:00
Chad Rosier 406808e344 Partially revert "Add command line options to force function/loop alignments."
This partially reverts r256571 in favor of the solution in r258409.

llvm-svn: 258421
2016-01-21 18:49:15 +00:00
Geoff Berry 10494aca05 [BlockPlacement] Add option to align all non-fall-through blocks.
Summary: This option is being added for testing purposes.

Reviewers: mcrosier

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D16410

llvm-svn: 258409
2016-01-21 17:25:52 +00:00
Manuel Jacob 925d029461 Introduce ConstantFoldCastOperand function and migrate some callers of ConstantFoldInstOperands to use it. NFC.
Summary:
Although this is a slight cleanup on its own, the main motivation is to
refactor the constant folding API to ease migration to opaque pointers.
This will be follow-up work.

Reviewers: eddyb

Subscribers: zzheng, dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D16380

llvm-svn: 258390
2016-01-21 06:31:08 +00:00
Andrew Wilkins a7a8ab71aa [GlobalISel] make library an optional component
Summary:
Mark the LLVMGlobalISel library as optional in
LLVMBuild.txt, since the library is only built
if LLVM_BUILD_GLOBAL_ISEL is set. Without doing
this, llvm-config includes the library in the
list of components regardless of whether it's
built, and then will error out when asked for
the library names/paths.

Reviewers: qcolombet

Subscribers: joker.eph, llvm-commits, vkalintiris

Differential Revision: http://reviews.llvm.org/D16386

llvm-svn: 258379
2016-01-21 01:41:03 +00:00
Dan Gohman 760bef5e50 [SelectionDAG] Fix constant offset folding to avoid commuting non-commutative operators.
This fixes a miscompile in MultiSource/Benchmarks/MiBench/consumer-lame
introduced in r258296.

llvm-svn: 258366
2016-01-20 23:16:59 +00:00
Chad Rosier 816a1ab9d9 MachineScheduler: Add a command line option to disable post scheduler.
llvm-svn: 258364
2016-01-20 23:08:32 +00:00
Chad Rosier 6338d7c390 MachineScheduler: Honor optnone functions in the pre-ra scheduler.
llvm-svn: 258363
2016-01-20 22:38:25 +00:00
Quentin Colombet 105cf2b179 [GlobalISel] Add the proper cmake plumbing.
This patch adds the necessary plumbing to cmake to build the sources related to
GlobalISel.

To build the sources related to GlobalISel, we need to add -DBUILD_GLOBAL_ISEL=ON.
By default, this is OFF, thus GlobalISel sources will not impact people that do
not explicitly opt-in.

Differential Revision: http://reviews.llvm.org/D15983

llvm-svn: 258344
2016-01-20 20:58:56 +00:00
Sanjay Patel 545a456235 fix formatting; NFC
llvm-svn: 258330
2016-01-20 18:59:16 +00:00
Krzysztof Parzyszek 2451c4835a Proper handling of diamond-like cases in if-conversion
If converter was somewhat careless about "diamond" cases, where there
was no join block, or in other words, where the true/false blocks did
not have analyzable branches. In such cases, it was possible for it to
remove (needed) branches, resulting in a loss of entire basic blocks.

Differential Revision: http://reviews.llvm.org/D16156

llvm-svn: 258310
2016-01-20 13:14:52 +00:00
Dan Gohman edf98c5682 [SelectionDAG] Fold more offsets into GlobalAddresses
SelectionDAG previously missed opportunities to fold constants into
GlobalAddresses in several areas. For example, given `(add (add GA, c1), y)`, it
would often reassociate to `(add (add GA, y), c1)`, missing the opportunity to
create `(add GA+c, y)`. This isn't often visible on targets such as X86 which
effectively reassociate adds in their complex address-mode folding logic,
however it is currently visible on WebAssembly since it currently has very
simple address mode folding code that doesn't reassociate anything.

This patch fixes this by making SelectionDAG fold offsets into GlobalAddresses
at the same times that it folds constants together, so that it doesn't miss any
opportunities to perform such folding.

Differential Revision: http://reviews.llvm.org/D16090

llvm-svn: 258296
2016-01-20 07:03:08 +00:00
Eduard Burtescu 23c4d83aa3 [NFC] Replace several manual GEP loops with gep_type_iterator.
Reviewers: dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16335

llvm-svn: 258262
2016-01-20 00:26:52 +00:00
Matthias Braun d4f6409dff MachineScheduler: Allow independent scheduling of sub register defs
Note that this is disabled by default and still requires a patch to
handleMove() which is not upstreamed yet.

If the TrackLaneMasks policy/strategy is enabled the MachineScheduler
will build a schedule graph where definitions of independent
subregisters are no longer serialised.

Implementation comments:
- Without lane mask tracking a sub register def also counts as a use
  (except for the first one with the read-undef flag set), with lane
  mask tracking enabled this is no longer the case.
- Pressure Diffs where previously maintained per definition of a
  vreg with the help of the SSA information contained in the
  LiveIntervals.  With lanemask tracking enabled we cannot do this
  anymore and instead change the pressure diffs for all uses of the vreg
  as it becomes live/dead.  For this changed style to work correctly we
  ignore uses of instructions that define the same register again: They
  won't affect register pressure.
- With lanemask tracking we remove all read-undef flags from
  sub register defs when building the graph and re-add them later when
  all vreg lanes have become dead.

Differential Revision: http://reviews.llvm.org/D14969

llvm-svn: 258259
2016-01-20 00:23:32 +00:00
Matthias Braun 5d458617aa RegisterPressure: Make liveness tracking subregister aware
Differential Revision: http://reviews.llvm.org/D14968

llvm-svn: 258258
2016-01-20 00:23:26 +00:00
Matthias Braun 3907fded1b LiveInterval: Add utility class to rename independent subregister usage
This renaming is necessary to avoid a subregister aware scheduler
accidentally creating liveness "holes" which are rejected by the
MachineVerifier.

Explanation as found in this patch:

Helper class that can divide MachineOperands of a virtual register into
equivalence classes of connected components.
MachineOperands belong to the same equivalence class when they are part of
the same SubRange segment or adjacent segments (adjacent in control
flow); Different subranges affected by the same MachineOperand belong to
the same equivalence class.

Example:
  vreg0:sub0 = ...
  vreg0:sub1 = ...
  vreg0:sub2 = ...
  ...
  xxx        = op vreg0:sub1
  vreg0:sub1 = ...
  store vreg0:sub0_sub1

The example contains 3 different equivalence classes:
  - One for the (dead) vreg0:sub2 definition
  - One containing the first vreg0:sub1 definition and its use,
    but not the second definition!
  - The remaining class contains all other operands involving vreg0.

We provide a utility function here to rename disjunct classes to different
virtual registers.

Differential Revision: http://reviews.llvm.org/D16126

llvm-svn: 258257
2016-01-20 00:23:21 +00:00
Sanjoy Das 16901a3e20 [MachineSink] Don't break ImplicitNulls
Summary:
This teaches MachineSink to not sink instructions that might break the
implicit null check optimization that runs later.  This should not
affect frontends that do not use implicit null checks.

Reviewers: aadg, reames, hfinkel, atrick

Subscribers: majnemer, llvm-commits

Differential Revision: http://reviews.llvm.org/D14632

llvm-svn: 258254
2016-01-20 00:06:14 +00:00
Quentin Colombet 2c49e2e664 [MachineFunction] Constify getter. NFC.
llvm-svn: 258207
2016-01-19 22:31:12 +00:00
Philip Reames 1a196f7daf Revert 258157
According the build bots, clang is using the Registry class somewhere as well. Will reapply with appropriate clang changes at a later point.

llvm-svn: 258159
2016-01-19 18:41:10 +00:00
Philip Reames 0f6650e8e8 [GC] Registry initialization and linkage interactions
The Registry class constructs a linked list of nodes whose storage is inside static variables and nodes are added via static initializers. The trick is that those static initializers are in both the LLVM code base, and some random plugin that might get loaded in at runtime. The existing code tries to use C++ templates and their ODR rules to get a single definition of the registry for each type, but, experimentally, this doesn't quite work as designed. (Well, the entire structure doesn't. It might not actually be an ODR problem.)

Previously, when I tried moving the GCStrategy class (along with it's registry) from CodeGen to IR, I ran into a problem where asking the GCStrategyRegistry a question would return inconsistent results depending on whether you asked from CodeGen (where the static initializers still were) or Transforms. My best guess is that this is a result of either a) an order of initialization error, or b) we ended up with two copies of the registry being created. I remember at the time having convinced myself it was probably (b), but I don't have any of my notes around from that investigation any more.

See http://reviews.llvm.org/rL226311 for the original patch in question.

This patch tries to remove the possibility of (b) above. (a) was already fixed in change 258109.

Differential Revision: http://reviews.llvm.org/D16170

llvm-svn: 258157
2016-01-19 18:34:27 +00:00
Eduard Burtescu 19eb03106d [opaque pointer types] [NFC] GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.
Summary:
GEPOperator: provide getResultElementType alongside getSourceElementType.
This is made possible by adding a result element type field to GetElementPtrConstantExpr, which GetElementPtrInst already has.

GEP: replace get(Pointer)ElementType uses with get{Source,Result}ElementType.

Reviewers: mjacob, dblaikie

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16275

llvm-svn: 258145
2016-01-19 17:28:00 +00:00
Philip Reames 3195500297 [GC] Consolidate all built in GCs into a single file [NFC]
Combine a bunch of small files into a single, still rather small, file.  The primary purpose of this is to get all of the static initializers into a single file so as to have a well defined order of initialization.  

llvm-svn: 258109
2016-01-19 03:57:18 +00:00
Simon Pilgrim c4d519d340 Fixed MSVC warning that not all control paths return a value.
llvm-svn: 258099
2016-01-18 22:54:46 +00:00
Sergei Larin d19d4d30d8 Add to the split module utility an SCC based method which allows not to globalize any local variables.
Summary:
    Currently llvm::SplitModule as the first step globalizes all local objects, which might not be desirable in some scenarios.
    This change adds a new flag to llvm::SplitModule that uses SCC approach to search for a balanced partition without the need to externalize symbols.
    Such partition might not be possible or fully balanced for a given number of partitions, and is a function of the module properties (global/local dependencies within the module).
    
    Joint development Tobias Edler von Koch (tobias@codeaurora.org) and Sergei Larin (slarin@codeaurora.org)
    
    Subscribers: llvm-commits, joker.eph
    
    Differential Revision: http://reviews.llvm.org/D16124

llvm-svn: 258083
2016-01-18 21:07:13 +00:00
Tom Stellard ccdc5391ea TargetLowering: Improve handling of (setcc ([sz]ext x) 0, cc) in SimplifySetCC
Summary:
When SimplifySetCC sees a setcc node that compares the result of a
value extension operation with a constant, it tries to simplify the
setcc node by eliminating the extension and shrinking the constant.

If shrinking the inputs to setcc is deemed not desirable by the target
(e.g. the target does not want a setcc comparing i1 values), then it
is still possible to optimize this sequence in some cases.

This patch adds the following combines to SimplifySetCC when shrinking setcc
inputs is not desirable:

(setcc ([sz]ext (setcc x, y, cc)), 0, setne) -> (setcc (x, y, cc))
(setcc ([sz]ext (setcc x, y, cc)), 0, seteq) -> (setcc (x, Y, !cc))

There are no tests for this yet, but once AMDGPU correctly implements
TargetLowering::isTypeDesirableForOp(), this new combine will be
exercised by the existing CodeGen/AMDGPU/setcc-opt.ll test.

Reviewers: resistor, arsenm

Subscribers: jroelofs, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15034

llvm-svn: 258067
2016-01-18 19:55:21 +00:00
Junmo Park 3347e7823a Remove extra whitespace. NFC.
llvm-svn: 258035
2016-01-18 06:42:51 +00:00
Eduard Burtescu 90c4449128 [opaque pointer types] Alloca: use getAllocatedType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16272

llvm-svn: 258028
2016-01-18 00:10:01 +00:00
Manuel Jacob 190577ac81 [opaque pointer types] [NFC] CallSite: use getFunctionType() instead of going through PointerType::getElementType.
Patch by Eduard Burtescu.

Reviewers: dblaikie, mjacob

Subscribers: dsanders, llvm-commits, dblaikie

Differential Revision: http://reviews.llvm.org/D16273

llvm-svn: 258023
2016-01-17 22:37:39 +00:00
Manuel Jacob 5f6eaac611 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260

llvm-svn: 257999
2016-01-16 20:30:46 +00:00
Keno Fischer bc0cb11eb2 [DwarfDebug] Don't merge DebugLocEntries if their pieces overlap
Summary:
Later in DWARF emission we check that DebugLocEntries have
non-overlapping pieces, so we should create any such entries
by merging here.

Fixes PR26163.

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D16249

llvm-svn: 257979
2016-01-16 01:15:32 +00:00
Keno Fischer f8eb6a1414 [DwarfDebug] Move MergeValues to .cpp, NFC
llvm-svn: 257977
2016-01-16 01:11:33 +00:00
Reid Kleckner 9533af4f8a [codeview] Remove custom line info struct in favor of DebugLoc
The only functional change would be that we might emit multiple filename
segments on code like this:

  void f() {
  #include "p1/../t.h"
  #include "p2/../t.h"
  }

I believe these get separate DIFile metadata nodes, but will have the
same canonicalized absolute path. Previously by computing the path up
front and comparing it we would merge the line info segments.

llvm-svn: 257966
2016-01-16 00:09:09 +00:00
Dan Gohman 4e9b2a60ab [SelectionDAG] CSE nodes with differing SDNodeFlags
In the optimizer (GVN etc.) when eliminating redundant nodes with different
flags, the flags are ignored for the purposes of testing for congruence, and
then intersected for the purposes of producing a result that supports the union
of all the uses. This commit makes SelectionDAG's CSE do the same thing,
allowing it to CSE nodes in more cases. This fixes PR26063.

Differential Revision: http://reviews.llvm.org/D15957

llvm-svn: 257940
2016-01-15 21:56:40 +00:00
Joseph Tremoulet 44b3f961e1 [WinEH] Rename CatchReturnInst::getParentPad, NFC
Summary:
Rename to getCatchSwitchParentPad, to make it more clear which ancestor
the "parent" in question is.  Add a comment pointing out the key feature
that the returned pad indicates which funclet contains the successor
block.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16222

llvm-svn: 257933
2016-01-15 21:16:19 +00:00
Rafael Espindola 79db917139 Don't try to check all uses if lazy loading.
This means that LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN will not be set
in a few cases.

This should have no impact in ld64 since it doesn't use lazy loading
when merging modules and that is when it checks
LTO_SYMBOL_SCOPE_DEFAULT_CAN_BE_HIDDEN.

llvm-svn: 257915
2016-01-15 18:23:46 +00:00
James Y Knight ac03dca412 Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

(This is a re-commit of r257719, without the bug reported in
PR26144. I've tweaked the code to not assert-fail in
enforceKnownAlignment when computeKnownBits doesn't recurse far enough
to find the underlying Alloca/GlobalObject value.)

Differential Revision: http://reviews.llvm.org/D16145

llvm-svn: 257902
2016-01-15 16:33:06 +00:00
James Molloy 3ef84c4cbb [CodeGenPrepare] Try and appease sanitizers
dupRetToEnableTailCallOpts(BB) can invalidate BB. It must run *after* we iterate across BB!

llvm-svn: 257886
2016-01-15 10:36:01 +00:00
James Molloy f01488e2bc [InstCombine] Rewrite bswap/bitreverse handling completely.
There are several requirements that ended up with this design;
  1. Matching bitreversals is too heavyweight for InstCombine and doesn't really need to be done so early.
  2. Bitreversals and byteswaps are very related in their matching logic.
  3. We want to implement support for matching more advanced bswap/bitreverse patterns like partial bswaps/bitreverses.
  4. Bswaps are best matched early in InstCombine.

The result of these is that a new utility function is created in Transforms/Utils/Local.h that can be configured to search for bswaps, bitreverses or both. InstCombine uses it to find only bswaps, CGP uses it to find only bitreversals.

We can then extend the matching logic in one place only.

llvm-svn: 257875
2016-01-15 09:20:19 +00:00
Keno Fischer 81e2e9ef86 Reapply r257105 "[Verifier] Check that debug values have proper size"
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.

r257550 commit message was:

```
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``

```

llvm-svn: 257850
2016-01-15 00:46:17 +00:00
Krzysztof Parzyszek c005e20d3b [Packetizer] Code cleanup, NFC
llvm-svn: 257805
2016-01-14 21:17:04 +00:00
Rui Ueyama da00f2fdf4 Update to use new name alignTo().
llvm-svn: 257804
2016-01-14 21:06:47 +00:00
Ahmed Bougacha 60b201b662 [CodeGen] Don't assume fp_to_fp16 produces i16 when legalizing it.
Since r230276, we support an improved legalization for f64->f16,
which goes through a temporary f32, improving codegen when
f32->f16 is legal but not f64->f16. This requires unsafe-fp-math.

However, that legalization assumed that the second step, producing
a pseudo-softened f16, had type i16. That's not true on targets
with illegal i16, such as ARM.

Use the initial f64->f16 result type instead.

llvm-svn: 257794
2016-01-14 19:45:36 +00:00
Reid Kleckner 70f5bc99b6 Rename WinCodeViewLineTables to CodeViewDebug, similar to DwarfDebug
Soon it will be responsible for more than line tables.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D16199

llvm-svn: 257792
2016-01-14 19:25:04 +00:00
Xinliang David Li 5f04f926e9 [PGO] [Coverage] put covmap into note section with no 'alloc flag' (Linux)
Coverage mapping data is not referenced by runtime, and they won't be dumped
into profile data. There is no need to allocate memory for covmap sections.
A good side effect of this change is that the coverage map data won't be mistakenly
 garbage collected by the linker (for Gold linker only, BFD linker has an issue where the a bug is filed).
Tested with clang build with instrumentation and -fcoverage-mapping and linker GC. The size of
 covmap section is ~17.6M so the text segment size will be reduced by this amount with this change.

llvm-svn: 257781
2016-01-14 18:09:45 +00:00
James Y Knight 582f556251 Revert "Stop increasing alignment of externally-visible globals on ELF platforms."
This reverts commit r257719, due to PR26144.

llvm-svn: 257775
2016-01-14 16:33:21 +00:00
David Majnemer 3463e696fb [X86] Don't alter HasOpaqueSPAdjustment after we've relied on it
We rely on HasOpaqueSPAdjustment not changing after we've calculated
things based on it.  Things like whether or not we can use 'rep;movs' to
copy bytes around, that sort of thing.  If it changes, invariants in the
backend will quietly break.  This situation arose when we had a call to
memcpy *and* a COPY of the FLAGS register where we would attempt to
reference local variables using %esi, a register that was clobbered by
the 'rep;movs'.

This fixes PR26124.

llvm-svn: 257730
2016-01-14 01:20:03 +00:00
Philip Reames 054123550f Fix Release build warning.
A value used only in an assert.  Again.

llvm-svn: 257728
2016-01-14 00:55:51 +00:00
Philip Reames 8f8e3f245c [GCRoot] Assert preconditions to clarify behavior
This code isn't reachable if the GFI (GCFunctionInfo*) is null.  Clarify this by adding an assert and removing an always taken if.  

llvm-svn: 257724
2016-01-14 00:21:56 +00:00
Reid Kleckner 3c0ff98708 [codeview] Regenerate C++ display name test case and update comments
Clang generates good display names for codeview since r255744, and the
change to make LLVM use them was accidentally included in r257658.

This change just updates the comments and test case to reflect reality
better.

llvm-svn: 257723
2016-01-14 00:12:54 +00:00
James Y Knight 9de6d7becc Stop increasing alignment of externally-visible globals on ELF
platforms.

With ELF, the alignment of a global variable in a shared library will
get copied into an executables linked against it, if the executable even
accesss the variable. So, it's not possible to implicitly increase
alignment based on access patterns, or you'll break existing binaries.

This happened to affect libc++'s std::cout symbol, for example. See
thread: http://thread.gmane.org/gmane.comp.compilers.clang.devel/45311

llvm-svn: 257719
2016-01-13 23:59:19 +00:00
Chih-Hung Hsieh 578864007b [TLS] New lower emutls pass, fix linkage bugs.
Previous implementation in http://reviews.llvm.org/D10522
created external references to __emutls_v.* variables.
Such references are inaccurate and cannot be handled by
all linkers, e.g. Android dynamic and gold linkers for aarch64.

Now a new LowerEmuTLS pass to go through all global variables,
and add emutls_v.* and emutls_t.* variables.
These __emutls* variables have the same linkage and
visibility as the associated user defined TLS variable.

Also removed old code that dump __emutls* variables in AsmPrinter.cpp,
and updated TLS unit tests.

Differential Revision: http://reviews.llvm.org/D15300

llvm-svn: 257718
2016-01-13 23:56:37 +00:00
Reid Kleckner 6b3faefff9 [codeview] Share more enums across the writer and the dumper
Moves some .def files into include/DebugInfo/CodeView.

Aslo remove a 'using namespace' directive from a header in readobj and
update the uses of the endian helper types to compensate.

llvm-svn: 257712
2016-01-13 23:44:57 +00:00
Reid Kleckner 72e2ba7abb [readobj] Expand CodeView dumping functionality
This rewrites and expands the existing codeview dumping functionality in
llvm-readobj using techniques similar to those in lib/Object. This defines a
number of new records and enums useful for reading memory mapped codeview
sections in COFF objects.

The dumper is intended as a testing tool for LLVM as it grows more codeview
output capabilities.

Reviewers: majnemer

Differential Revision: http://reviews.llvm.org/D16104

llvm-svn: 257658
2016-01-13 19:32:35 +00:00
Keno Fischer 78e5c9e6e2 Re-Revert r257105 (Verifier debug info changes)
While I investigate some new buildbot failures. This was originally reapplied
as r257550 and r257558.

llvm-svn: 257563
2016-01-13 02:31:14 +00:00
Matthias Braun 4cc3421a24 AsmPrinter: Fix wrong OS X versions being emitted for darwin triples
The version numbers of the darwin kernel are different from the version
numbers of OS X, so we need adjustments if we had "*-*-darwin" triples.
Use the existing utility functions in TargetTriple for this.

Fixes rdar://22056966

Differential Revision: http://reviews.llvm.org/D14601

llvm-svn: 257555
2016-01-13 01:18:13 +00:00
David Majnemer c3340db77d [CodeView] Mark our lines as statements, not expressions
The line tables for CodeView make a distinction between expressions and
statements.  As it turns out, MSVC always emits them as statements and
we always emit them as expressions.  Let's switch to statements to match
the CodeView that they emit.

llvm-svn: 257553
2016-01-13 01:05:23 +00:00
Keno Fischer 25916079ff Reapply r257105 "[Verifier] Check that debug values have proper size"
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```

llvm-svn: 257550
2016-01-13 00:31:44 +00:00
Matthias Braun b505c76c9a RegisterPressure: Expose RegisterOperands API
Previously the RegisterOperands have only been used internally in
RegisterPressure.cpp. However this datastructure can be useful for other
tasks as well and allows refactoring of PDiff initialisation out of
RPTracker::recede().

This patch:
- Exposes RegisterOperands as public API
- Splits RPTracker::recede() into a part that skips DebugValues and
  maintains the region borders, and the core that changes register
  pressure when given a set of RegisterOperands.
- This allows to move the PDiff initialisation out recede() into a
  method of the PressureDiffs class.
- The upcoming subregister scheduling code will also use
  RegisterOperands to avoid pushing more unrelated functionality into
  recede()/advance().

Differential Revision: http://reviews.llvm.org/D15473

llvm-svn: 257535
2016-01-12 22:57:35 +00:00
David Majnemer c81c8c66d5 [CodeView] Initialize column-end to zero
CodeView, unlike DWARF, can associate code with a range of columns.
However, LLVM can only represent a single column position internally.

We used to claim that the end column and start column were the same
which yielded less than satisfactory results: we would stop printing at
the _beginning_ of the source expression!  Instead, mark the column-end
as 'zero' to indicate that we don't have one (as per the documentation
for IDiaLineNumber::get_lineNumberEnd).

llvm-svn: 257528
2016-01-12 21:58:20 +00:00
Chad Rosier f35395eac1 [NFC] Fix whitespace.
llvm-svn: 257365
2016-01-11 19:17:36 +00:00
Matt Arsenault 5ca3c72c5a LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal
llvm-svn: 257345
2016-01-11 16:37:46 +00:00
Junmo Park 7ceec0b82f [BranchFolding] Set correct mem refs (2nd try)
This is a recommit of r257253 which was reverted in r257270.
Previous testcase can make failure on some targets due to using opt with O3 option.

Original Summary:
Merge MBBICommon and MBBI's MMOs.

Differential Revision: http://reviews.llvm.org/D15990

llvm-svn: 257317
2016-01-11 07:15:38 +00:00
Daniel Berlin 7256059ef0 Speed up LiveDebugValues
Summary:
Use proper dataflow ordering to speed convergence.
This will converge the testcase on bug 26055 in 2 iterations.

(data structures speedups to come to make even that faster)

Reviewers: kcc, samsonov, echristo, dblaikie, tvvikram

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D16039

llvm-svn: 257292
2016-01-10 18:08:32 +00:00
Daniel Berlin ca4d93a82f Don't use random class variables across functions
llvm-svn: 257271
2016-01-10 03:25:42 +00:00
Michael Zolotukhin 0fc89c67cc Revert "[BranchFolding] Set correct mem refs"
This reverts commit 1ff11017d2669b933b29fcbb6451cfcda34ad693.

llvm-svn: 257270
2016-01-09 23:53:16 +00:00
Junmo Park e1582cec34 [BranchFolding] Set correct mem refs
Merge MBBICommon and MBBI's MMOs.

Differential Revision: http://reviews.llvm.org/D15990

llvm-svn: 257253
2016-01-09 07:30:13 +00:00
Sanjay Patel 1dc7dfb9d9 [DAGCombiner] don't dereference an operand that doesn't exist (PR26070)
The bug was introduced with changes for x86-64 fp128:
http://reviews.llvm.org/rL254653

I don't know why an x86 change is here, so I'll follow up in:
http://reviews.llvm.org/D15134

Should fix:
https://llvm.org/bugs/show_bug.cgi?id=26070

llvm-svn: 257200
2016-01-08 19:53:24 +00:00
Tim Shen 9b68bd48ca Test commit access - add a blank line in comment.
llvm-svn: 257192
2016-01-08 19:20:23 +00:00
Pirama Arumuga Nainar bf5ccdccb2 Do not ASSERTZEXT for i16 result of bitcast from f16 operand
Summary:
During legalization if i16, do not ASSERTZEXT the result of FP_TO_FP16.
Directly return an FP_TO_FP16 node with return type as the
promote-to-type of i16.

This patch also removes extraneous length check.  This legalization
should be valid even if integer and float types are of different
lengths.

This patch breaks a hard-float test for fp16 args.  The test is changed
to allow a vmov to zero-out the top bits, and also ensure that the
return value is in an FP register.

Reviewers: ab, jmolloy

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D15438

llvm-svn: 257184
2016-01-08 17:46:05 +00:00
David Majnemer 2a6368f609 [WinEH] CatchHandler which don't have catch objects in StackColoring
StackColoring rewrites the frame indicies of operations involving
allocas if it can find that the life time of two objects do not overlap.
MSVC EH needs to be kept aware of this if happens in the event that a
catch object has moved around.  However, we represent the non-existance
of a catch object with a sentinel frame index (INT_MAX).  This sentinel
also happens to be the EmptyKey of the SlotRemap DenseMap.  Testing for
whether or not we need to translate the frame index fails in this case
because we call the count method on the DenseMap with the EmptyKey,
leading to assertions.  Instead, check if it is our sentinel value
before trying to look into the DenseMap.

This fixes PR26073.

llvm-svn: 257182
2016-01-08 17:24:47 +00:00
David Majnemer 086fec23ec [WinEH] Update WinEHFuncInfo if StackColoring merges allocas
Windows EH keeping track of which frame index corresponds to a catchpad
in order to inform the runtime where the catch parameter should be
initialized.  LLVM's optimizations are able to prove that the memory
used by the catch parameter can be reused with another memory
optimization, changing it's frame index.

We need to keep WinEHFuncInfo up to date with respect to this or we will
miscompile/assert.

This fixes PR26069.

llvm-svn: 257158
2016-01-08 08:03:55 +00:00
Junmo Park aa9243a25d Remove extra whitespace. NFC.
llvm-svn: 257144
2016-01-08 04:20:32 +00:00
Matthias Braun bf47f63b74 LiveInterval: A LiveRange is enough for ConnectedVNInfoEqClasses::Classify()
llvm-svn: 257129
2016-01-08 01:16:35 +00:00
Alexey Samsonov 117b104166 [LiveDebugValues] Replace several lines of code with operator[].
llvm-svn: 257114
2016-01-07 23:38:45 +00:00
Keno Fischer ea33a25816 Temporarily revert r257105 "[Verifier] Check that debug values have proper size"
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.

llvm-svn: 257107
2016-01-07 22:39:11 +00:00
Keno Fischer b3326be6ad [Verifier] Check that debug values have proper size
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276

llvm-svn: 257105
2016-01-07 22:18:37 +00:00
Dimitry Andric 2c36421337 Turn off lldb debug tuning by default for FreeBSD
Summary:
In rL242338, debugger tuning was introduced, and the tuning for FreeBSD
was set to lldb by default.  However, for the foreseeable future we
still need to default to gdb tuning, since lldb is not ready for all of
FreeBSD's architectures, and some system tools (like objcopy, etc) have
not yet been adapted to cope with the lldb tuned format, which has
.apple sections.

Therefore, let FreeBSD use gdb by default for now.

Reviewers: emaste, probinson

Subscribers: llvm-commits, emaste

Differential Revision: http://reviews.llvm.org/D15966

llvm-svn: 257103
2016-01-07 22:09:12 +00:00
Amjad Aboud d7cfb48485 Added support for macro emission in dwarf (supporting DWARF version 4).
Differential Revision: http://reviews.llvm.org/D15495

llvm-svn: 257060
2016-01-07 14:28:20 +00:00
Junmo Park 1238610aa1 Remove extra whitespace. NFC.
llvm-svn: 257047
2016-01-07 10:26:32 +00:00
David Majnemer 0e90f46e10 Undo spurious change made in r256965
llvm-svn: 257028
2016-01-07 04:31:35 +00:00
Philip Reames afdbcc6a84 [Statepoints] Add test cases around vectors and stablize test
Unlike my comment in 257022 said, it turns out we do handle constant vectors in the statepoint lowering, but only because SelectionDAG doesn't actually produce constants for them.  Add a couple of tests which show this working.

Also, add a triple to the same test file to hopefully fix a failing bot.

It turns out we do han

llvm-svn: 257025
2016-01-07 04:15:31 +00:00
Philip Reames 3e2cf5320c [Statepoints] Initial support for relocating vectors of pointers
Currently, we try to split vectors of pointers back into their component pointer elements during rewrite-statepoints-for-gc. This is less than ideal since presumably the vectorizer chose to vectorize for a reason. :) It's also been a source of bugs - in particular, the relocation logic as currently implemented was recently discovered to be wrong.

The alternate approach is to allow gc.relocates of vector-of-pointer type and update the backend to handle them. That's what this patch tries to do. This won't actually enable vector-of-pointers in practice - there are some RS4GC changes needed - but the lowering is standalone and testable so it makes sense to separate.

Note that there are some known cases around vector constants which this patch does not handle. Once this is in, I'll send another patch with individual fixes and test cases. 

Differential Revision: http://reviews.llvm.org/D15632

llvm-svn: 257022
2016-01-07 03:32:11 +00:00
Quentin Colombet 9ed52e9a9e [ShrinkWrapping] Give up on irreducible CFGs.
We need to know whether or not a given basic block is in a loop for the analysis
to be correct.
Loop information may be incomplete on irreducible CFGs, therefore we may
generate incorrect code if we use it in those situations.

This fixes PR25988.

llvm-svn: 257012
2016-01-07 01:23:49 +00:00
Sanjay Patel 882a8eed3e rangify; NFCI
llvm-svn: 256998
2016-01-06 23:45:05 +00:00
Weiming Zhao 0f1762caf9 Recommit r256952 "Filtering IR printing for print-after-all/print-before-all"
Fix lit test fail due to outputting an extra line.

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256987
2016-01-06 22:55:03 +00:00
Philip Reames 5eb90a7835 Consolidate MemRefs handling from BranchFolding and correct latent bug
Move the logic from BranchFolding to use the shared infrastructure for merging MMOs introduced in 256909. This has the effect of making BranchFolding more capable.

In the process, fix a latent bug. The existing handling for merging didn't handle the case where one of the instructions being merged had overflowed and dropped MemRefs. This was a latent bug in the places the code was commoned from, but potentially reachable in BranchFolding.

Once this is in, we're left with a single place to consider implementing MMO unique-ing as proposed in http://reviews.llvm.org/D15230.

Differential Revision: http://reviews.llvm.org/D15913

llvm-svn: 256966
2016-01-06 19:33:12 +00:00
David Majnemer eea7582bfa [WinEH] Remove calculateCatchReturnSuccessorColors
The functionality that calculateCatchReturnSuccessorColors provides was
once non-trivial: it was a computation layered on top of funclet
coloring.

These days, LLVM IR directly encodes what
calculateCatchReturnSuccessorColors computed, obsoleting the need for
it.

No functionality change is intended.

llvm-svn: 256965
2016-01-06 19:26:30 +00:00
Michael Kuperstein 037c9984db [ShrinkWrap] Fix FindIDom to only have one kind of failure.
FindIDom() can fail in two different ways - it can either return nullptr or the
block itself, depending on the circumstances. Some users of FindIDom() check
one error condition, while others check the other.

Change it to always return nullptr on failure.
This fixes PR26004.

Differential Revision: http://reviews.llvm.org/D15847

llvm-svn: 256955
2016-01-06 18:40:11 +00:00
Weiming Zhao b243c95c6a Revert r256952 due to lit test fails.
llvm-svn: 256954
2016-01-06 18:31:44 +00:00
Weiming Zhao eac0636805 Filtering IR printing for print-after-all/print-before-all
Summary:
This patch implements "-print-funcs" option to support function filtering for IR printing like -print-after-all, -print-before etc.
Examples:
  -print-after-all -print-funcs=foo,bar

Reviewers: mcrosier, joker.eph

Subscribers: tejohnson, joker.eph, llvm-commits

Differential Revision: http://reviews.llvm.org/D15776

llvm-svn: 256952
2016-01-06 18:20:25 +00:00
Geoff Berry 12fe2279f3 ScheduleDAGInstrs: Bug fix for missed memory dependency.
Summary:
In buildSchedGraph(), when adding memory dependencies for loads, move
the call to adjustChainDeps() after the call to
addChainDependency(AliasChain) to handle the case where
addChainDependency(AliasChain) ends up not adding a dependency and
instead putting the SU on the RejectMemNodes list.  The call to
adjustChainDeps() must be done after the call to addChainDependency() in
order to process the SU added to the RejectMemNodes list to create
memory dependencies for it.

Reviewers: hfinkel, atrick, jonpa, resistor

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D15927

llvm-svn: 256950
2016-01-06 18:14:26 +00:00
Philip Reames 2d2fc4adf1 Fix a warning [NFC]
llvm-svn: 256916
2016-01-06 05:53:09 +00:00
Philip Reames c86ed0055d Extract helper function to merge MemoryOperand lists [NFC]
In the discussion on http://reviews.llvm.org/D15730, Andy pointed out we had a utility function for merging MMO lists. Since it turned we actually had two copies and there's another review in progress (http://reviews.llvm.org/D15230) which needs the same, extract it into a utility function and clean up the interfaces to make it easier to use with a MachineInstBuilder.

I introduced a pair here to track size and allocation together. I think we should probably move in the direction of the MachineOperandsRef helper class, but I'm leaving that for further work. I want to get the poison state introduced before I make major changes to the interface.

Differential Revision: http://reviews.llvm.org/D15757

llvm-svn: 256909
2016-01-06 04:39:03 +00:00
Sanjay Patel 3d07ec973f rangify; NFCI
llvm-svn: 256891
2016-01-06 00:45:42 +00:00
Dan Gohman 797f639e79 [SelectionDAGBuilder] Set NoUnsignedWrap for inbounds gep and load/store offsets.
In an inbounds getelementptr, when an index produces a constant non-negative
offset to add to the base, the add can be assumed to not have unsigned overflow.

This relies on the assumption that addresses can't occupy more than half the
address space, which isn't possible in C because it wouldn't be possible to
represent the difference between the start of the object and one-past-the-end
in a ptrdiff_t.

Setting the NoUnsignedWrap flag is theoretically useful in general, and is
specifically useful to the WebAssembly backend, since it permits stronger
constant offset folding.

Differential Revision: http://reviews.llvm.org/D15544

llvm-svn: 256890
2016-01-06 00:43:06 +00:00
Sanjay Patel 8260d0a9fa use std::max ; NFCI
llvm-svn: 256889
2016-01-06 00:36:59 +00:00
MinSeong Kim 4a9a4e198f [MISched] Explanatory error message when machine model is not complete. NFC
When not all instructions have a scheduling class,
the error message now provides a possible solution.

Differential Revision: http://reviews.llvm.org/D15854

llvm-svn: 256839
2016-01-05 14:50:15 +00:00
Manuel Jacob 83eefa6d20 [Statepoints] Refactor GCRelocateOperands into an intrinsic wrapper. NFC.
Summary:
This commit renames GCRelocateOperands to GCRelocateInst and makes it an
intrinsic wrapper, similar to e.g. MemCpyInst.  Also, all users of
GCRelocateOperands were changed to use the new intrinsic wrapper instead.

Reviewers: sanjoy, reames

Subscribers: reames, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15762

llvm-svn: 256811
2016-01-05 04:03:00 +00:00
Matthias Braun 7e762e4f9c MachineInstrBundle: Fix reversed isSuperRegisterEq() call
Unfortunately this fix had the effect of exposing the
-verify-machineinstrs FIXME of X86InstrInfo.cpp in two testcases for
which I disabled it for now.
Two testcases also have additional pushq/popq where the corrected code
cannot prove that %rax is dead any longer. Looking at the examples, this
could potentially be fixed by improving computeRegisterLiveness() to check
the live-in lists of the successors blocks when reaching the end of a
block.

This fixes http://llvm.org/PR25951.

llvm-svn: 256799
2016-01-05 00:45:35 +00:00
Eric Christopher 49a7d6c473 Clarify that the bypassSlowDivision optimization operates on a single BB [v2]
Update some comments to be more explicit.

Change bypassSlowDivision and the functions it calls so that they take
BasicBlock*s and Instruction*s, rather than Function::iterator&s and
BasicBlock::iterator&s.

Change the APIs so that the caller is responsible for updating the
iterator, rather than the callee. This makes control flow much easier
to follow.

Patch by Justin Lebar!

llvm-svn: 256789
2016-01-04 23:18:58 +00:00
Joseph Tremoulet 52f729a613 [WinEH] Update CoreCLR EH state numbering
Summary:
Fix the CLR state numbering to generate correct tables, and update the lit
test to verify them.

The CLR numbering assigns one state number to each catchpad and
cleanuppad.

It also computes two tree-like relations over states:
 1) Each state has a "HandlerParentState", which is the state of the next
    outer handler enclosing this state's handler (same as nearest ancestor
    per the ParentPad linkage on EH pads, but skipping over catchswitches).
 2) Each state has a "TryParentState", which:
    a) for a catchpad that's not the last handler on its catchswitch, is
       the state of the next catchpad on that catchswitch.
    b) for all other pads, is the state of the pad whose try region is the
       next outer try region enclosing this state's try region.  The "try
       regions are not present as such in the IR, but will be inferred
       based on the placement of invokes and pads which reach each other
       by exceptional exits.

Catchswitches do not get their own states, but each gets mapped to the
state of its first catchpad.

Table generation requires each state's "unwind dest" state to have a lower
state number than the given state.

Since HandlerParentState can be computed as a function of a pad's
ParentPad, and TryParentState can be computed as a function of its unwind
dest and the TryParentStates of its children, the CLR state numbering
algorithm first computes HandlerParentState in a top-down pass, then
computes TryParentState in a bottom-up pass.

Also reword some comments/names in the CLR EH table generation to make the
distinction between the different kinds of "parent" clear.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D15325

llvm-svn: 256760
2016-01-04 16:16:01 +00:00
David Majnemer ca1c9f074f [X86] Make hasFP constant time
We need a frame pointer if there is a push/pop sequence after the
prologue in order to unwind the stack.  Scanning the instructions to
figure out if this happened made hasFP not constant-time which is a
violation of expectations.  Let's compute this up-front and reuse that
computation when we need it.

llvm-svn: 256730
2016-01-04 04:49:41 +00:00
Simon Pilgrim 5bf96e41c5 [SelectionDAG] Pulled out common code for CONCAT_VECTORS node creation
Pulled out the similar CONCAT_VECTORS creation code from the 2/3 operand getNode() calls (to handle all UNDEF and all BUILD_VECTOR cases). Added a similar handler to the general getNode() call as well.

llvm-svn: 256709
2016-01-03 18:24:19 +00:00
NAKAMURA Takumi ded575e4eb WinEHPrepare.cpp: Suppress a warning for -Asserts. [-Wunused-variable]
llvm-svn: 256694
2016-01-03 01:41:00 +00:00
Joseph Tremoulet 71e5676de4 [WinEH] Update catchrets with cloned successors
Summary:
Add a pass to update catchrets when their successors get cloned; the
existing pass doesn't catch these because it walks the funclet whose
blocks are being cloned but the catchret is in a child funclet.

Also update the test for removing incoming PHI values; when the
predecessor is a catchret, the relevant color is the catchret's parentPad,
not its block's color.


Reviewers: andrew.w.kaylor, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15840

llvm-svn: 256689
2016-01-02 15:22:36 +00:00
Yaron Keren c47c6ac0a5 Correct misleading formatting of several ifs followed by two statements without braces.
While the original code would work with or without braces, it makes sense to
set HaveSemi to true only if (!HaveSemi), otherwise it's already true, so I
put the assignment inside the if block. This addresses PR25998.

llvm-svn: 256688
2016-01-02 13:40:36 +00:00
David Majnemer dbdc9c274d [WinEH] Add additional verification
Recolor the IR to make sure our computed colors are not hiding any bugs.
Also, verifyFunction if we are running some post-preparation operations;
some of these operations can hide latent bugs.

llvm-svn: 256687
2016-01-02 09:26:36 +00:00
Sanjay Patel ac6e910c42 don't repeat function names in comments; NFC
llvm-svn: 256584
2015-12-29 22:11:50 +00:00
Sanjay Patel d1d9db5889 use auto with dyn_casted values; NFC
llvm-svn: 256581
2015-12-29 22:00:37 +00:00
Sanjay Patel 7a7abc9a3b use auto with dyn_casted values; NFC
llvm-svn: 256579
2015-12-29 21:49:08 +00:00
Sanjay Patel b120ae96d3 fix formatting; NFC
llvm-svn: 256574
2015-12-29 19:34:53 +00:00
Sanjay Patel faeee6f44d use range-based for-loop; NFCI
llvm-svn: 256572
2015-12-29 18:30:09 +00:00
Chad Rosier 6b4326367a Add command line options to force function/loop alignments.
These are being added for testing purposes.
http://reviews.llvm.org/D15648

llvm-svn: 256571
2015-12-29 18:18:07 +00:00
Sanjay Patel 59309cc090 don't repeat function names in comments; NFC
llvm-svn: 256569
2015-12-29 18:14:06 +00:00
Sanjay Patel 7dd45697ca use range-based for-loops; NFCI
llvm-svn: 256566
2015-12-29 17:15:22 +00:00
Chandler Carruth e0115344e6 [ptr-traits] Sink a constructor definition to the .cpp file and add
missing includes so that the pointee types for DenseMap pointer keys and
such are complete prior to us querying the pointer traits for them.

This is part of a series of patches to allow LLVM to check for complete
pointee types when computing its pointer traits. This is absolutely
necessary to get correct (or reproducible) results for things like how
many low bits are guaranteed to be zero.

llvm-svn: 256550
2015-12-29 09:24:39 +00:00
Michael Kuperstein 2ea81baf3a [X86] Better support for the MCU psABI (LLVM part)
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.

The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.

Differential Revision: http://reviews.llvm.org/D15054

llvm-svn: 256494
2015-12-28 14:39:21 +00:00
Craig Topper 4b1808d8e7 [SelectionDAG] Teach LegalizeVectorOps to not unroll CTLZ_ZERO_UNDEF and CTTZ_ZERO_UNDEF if the non-ZERO_UNDEF form is legal or custom. Will be used to simplify X86 code in a follow on commit.
llvm-svn: 256476
2015-12-27 21:33:47 +00:00
David Majnemer 081e8fe4c0 [WinEH] Add comments explaining the EH tables
This is aids in debugging WinEH, similar functionality is present for
DWARF EH.

llvm-svn: 256455
2015-12-27 06:07:12 +00:00
David Majnemer 11234ed7d3 [CodeGen] Use generic printAsOperand machinery instead of hand rolling it
We already know how to properly print out basic blocks in
printAsOperand, we should not roll it ourselves in
AsmPrinter::EmitBasicBlockStart.  No functionality change is intended.

llvm-svn: 256413
2015-12-25 09:37:26 +00:00
Craig Topper 73275a2951 Use range-based for loops. NFC
llvm-svn: 256363
2015-12-24 05:20:40 +00:00
Philip Reames cb0f947a2a [Statepoints] Use Indirect operands for spill slots
Teach the statepoint lowering code to emit Indirect stackmap entries for spill inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632 which extends the lowering to handle vector-of-pointers since only Indirect references can encode a variable sized slot.

To implement this, I introduced a new flag on the StackObject class used to maintian information about stack slots. I original considered (and prototyped in http://reviews.llvm.org/D15632), the idea of using the existing isSpillSlot flag, but end up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us - in the future - to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from. (deopt, gc, regalloc).

Differential Revision: http://reviews.llvm.org/D15759

llvm-svn: 256352
2015-12-23 23:44:28 +00:00
Philip Reames 4e66c84722 [MemOperands] Clarify code around dropping memory operands [NFC]
Clarify a comment about what it means to drop memory operands from an instruction.  While I'm adding change the name of the method slightly to make it a bit more clear what's going on when reading calling code.

llvm-svn: 256346
2015-12-23 19:16:04 +00:00
Philip Reames 42bd26f29d [MachineLICM] Fix handling of memoperands
As far as I can tell, the correct interpretation of an empty memoperands list is that we didn't have sufficient room to store information about the MachineInstr, NOT that the MachineInstr doesn't access any particular bit of memory. This appears to be fairly consistent in a number of places, but I'm not 100% sure of this interpretation. I'd really appreciate someone more knowledgeable confirming my reading of the code.

This patch fixes two latent bugs in MachineLICM - given the above assumption - and adds comments to document the meaning and required handling. I don't have test cases; these were noticed by inspection.

Differential Revision: http://reviews.llvm.org/D15730

llvm-svn: 256335
2015-12-23 17:05:57 +00:00
David Majnemer c640f863e0 [WinEH] Don't visit the same catchswitch twice
We visited the same catchswitch twice because it was both the child of
another funclet and the predecessor of a cleanuppad.

Instead, change the numbering algorithm to only recurse if the unwind
destination of the inner funclet agrees with the unwind destination of
the catchswitch.

This fixes PR25926.

llvm-svn: 256317
2015-12-23 03:59:04 +00:00
Philip Reames ee8f055327 [GC] Make GCStrategy::isGCManagedPointer a type predicate not a value predicate [NFC]
Reasons:
1) The existing form was a form of false generality.  None of the implemented GCStrategies use anything other than a type.  Its becoming more and more clear we're going to need some type of strong GC pointer in the type system and we shouldn't pretend otherwise at this point.
2) The API was awkward when applied to vectors-of-pointers.  The old one could have been made to work, but calling isGCManagedPointer(Ty->getScalarType()) is much cleaner than the Value alternatives.  
3) The rewriting implementation effectively assumes the type based predicate as well.  We should be consistent.

llvm-svn: 256312
2015-12-23 01:42:15 +00:00
Cong Hou e93b8e1539 [BPI] Replace weights by probabilities in BPI.
This patch removes all weight-related interfaces from BPI and replace
them by probability versions. With this patch, we won't use edge weight
anymore in either IR or MC passes. Edge probabilitiy is a better
representation in terms of CFG update and validation.


Differential revision: http://reviews.llvm.org/D15519 

llvm-svn: 256263
2015-12-22 18:56:14 +00:00
Manuel Jacob 4e4f60ded0 Remove deprecated llvm.experimental.gc.result.{int,float,ptr} intrinsics.
Summary:
These were deprecated 11 months ago when a generic
llvm.experimental.gc.result intrinsic, which works for all types, was added.

Reviewers: sanjoy, reames

Subscribers: sanjoy, chenli, llvm-commits

Differential Revision: http://reviews.llvm.org/D15719

llvm-svn: 256262
2015-12-22 18:44:45 +00:00
Chad Rosier a108010385 Typo. NFC.
llvm-svn: 256242
2015-12-22 15:06:47 +00:00
Keno Fischer 4eccf11373 [ASMPrinter] Fix missing handling of DW_OP_bit_piece
In r256077, I added printing for DIExpressions in DEBUG_VALUE comments,
but neglected to handle DW_OP_bit_piece operands. Thanks to
Mikael Holmen and Joerg Sonnenberger for spotting this.

llvm-svn: 256236
2015-12-22 07:14:50 +00:00
David Majnemer 03e2cc3007 [MC, COFF] Support link /incremental conditionally
Today, we always take into account the possibility that object files
produced by MC may be consumed by an incremental linker.  This results
in us initialing fields which vary with time (TimeDateStamp) which harms
hermetic builds (e.g. verifying a self-host went well) and produces
sub-optimal code because we cannot assume anything about the relative
position of functions within a section (call sites can get redirected
through incremental linker thunks).

Let's provide an MCTargetOption which controls this behavior so that we
can disable this functionality if we know a-priori that the build will
not rely on /incremental.

llvm-svn: 256203
2015-12-21 22:09:27 +00:00
Adrian Prantl ce8581389b Fix PR24563 (LiveDebugVariables unconditionally propagates all DBG_VALUEs)
LiveDebugVariables unconditionally propagates all DBG_VALUE down the
dominator tree, which happens to work fine if there already is another
DBG_VALUE or the DBG_VALUE happends to describe a single-assignment vreg
but is otherwise wrong if the DBG_VALUE is coming from only one of the
predecessors.

In r255759 we introduced a proper data flow analysis scheduled after
LiveDebugVariables that correctly propagates DBG_VALUEs across basic block
boundaries. With the new pass in place, the incorrect propagation in
LiveDebugVariables can be retired witout loosing any of the benefits
where LiveDebugVariables happened to do the right thing.

llvm-svn: 256188
2015-12-21 20:03:00 +00:00
Amjad Aboud 60b5e1b6c0 Implemented Support of IA interrupt and exception handlers:
http://lists.llvm.org/pipermail/cfe-dev/2015-September/045171.html

Differential Revision: http://reviews.llvm.org/D15567

llvm-svn: 256155
2015-12-21 14:07:14 +00:00
Manuel Jacob 5b90b147d4 Remove unnecessary casts. NFC.
llvm-svn: 256101
2015-12-19 18:38:42 +00:00
Matt Arsenault d206d6cc54 SelectionDAG: Cleanup integer bin op promotion functions.
SDIV and UDIV had special handling, but this is the same handling
that min/max need.

llvm-svn: 256098
2015-12-19 17:18:43 +00:00
Keno Fischer 00cbf9a69a Clean up the processing of dbg.value in various places
Summary:
First up is instcombine, where in the dbg.declare -> dbg.value conversion,
the llvm.dbg.value needs to be called on the actual loaded value, rather
than the address (since the whole point of this transformation is to be
able to get rid of the alloca). Further, now that that's cleaned up, we
can remove a hack in the backend, that would add an implicit OP_deref if
the argument to dbg.value was an alloca. This stems from before the
existence of DIExpression and is no longer necessary since the deref can
be expressed explicitly.

Now, in order to make sure that the tests pass with this change, we need to
correct the printing of DEBUG_VALUE comments to take into account the
expression, which wasn't taken into account before.

Unfortunately, for both these changes, there were a number of incorrect
test cases (mostly the wrong number of DW_OP_derefs, but also a couple
where the test itself was broken more badly). aprantl and I have gone
through and adjusted these test case in order to make them pass with
these fixes and in some cases to make sure they're actually testing
what they are meant to test.

Reviewers: aprantl

Subscribers: dsanders

Differential Revision: http://reviews.llvm.org/D14186

llvm-svn: 256077
2015-12-19 02:02:44 +00:00
Matt Arsenault 10a509292c Fix broken type legalization of min/max
This was using an anyext when promoting the type
when zext/sext is required.

llvm-svn: 256074
2015-12-19 01:39:48 +00:00
Cong Hou fd0d62b87e Use getEdgeProbability() instead of getEdgeWeight() in BFI and remove getEdgeWeight() interfaces from MBPI.
This patch removes all getEdgeWeight() interfaces from CodeGen directory. As
getEdgeProbability() is a little more expensive than getEdgeWeight(), I will
compose a patch soon in which BPI only stores probabilities instead of edge
weights so that getEdgeProbability() will have O(1) time.


Differential revision: http://reviews.llvm.org/D15489

llvm-svn: 256039
2015-12-18 21:53:24 +00:00
Cong Hou b9e8d483b5 Fix PR25838.
This is a quick fix to PR25838. The issue comes from the restriction that we
cannot normalize probabilities containing both known and unknown ones. A patch
that removes this restriction is under the review now:

http://reviews.llvm.org/D15548

llvm-svn: 255867
2015-12-17 01:29:08 +00:00
Eric Christopher bfba572425 Fix funciton->function typo.
llvm-svn: 255841
2015-12-16 23:10:53 +00:00
Manman Ren 3e3edc91f9 CXX_FAST_TLS calling convention: target independent portion.
Update supportSplitCSR's interface to take machine function instead of the
calling convention.

Review comments for http://reviews.llvm.org/D15341

llvm-svn: 255818
2015-12-16 20:45:48 +00:00
Paul Robinson 6c27a2c40e Set debugger tuning from TargetOptions (NFC)
Differential Revision: http://reviews.llvm.org/D15427

llvm-svn: 255810
2015-12-16 19:58:30 +00:00
Tom Stellard 5ce530608f MachineScheduler: Add a target hook for deciding which RegPressure sets to
increase

Summary:
This patch adds a function called getRegPressureSetScore() to
TargetRegisterInfo.  The MachineScheduler uses this when comparing
instruction that increase the register pressure of different sets
to determine which set is safer to increase.

This hook is useful for GPU targets where the number of registers in the
class is not the best metric for determing which presser set is safer to
increase.

Future work may include adding more parameters to this function, like
for example, the current pressure level of the set or the amount that
the pressure will be increased/decreased.

Reviewers: qcolombet, escha, arsenm, atrick, MatzeB

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14806

llvm-svn: 255795
2015-12-16 18:31:01 +00:00
Krzysztof Parzyszek 2005d7dc01 [Packetizer] Add a check whether an instruction should be packetized now
Add a function VLIWPacketizerList::shouldAddToPacket, which will allow
specific implementations to decide if it is profitable to add given
instruction to the current packet.

llvm-svn: 255780
2015-12-16 16:38:16 +00:00
Vikram TV 859ad29b52 Recommit LiveDebugValues pass after fixing a couple of minor issues.
llvm-svn: 255759
2015-12-16 11:09:48 +00:00
Cong Hou 08ec3d91bb Minor change to TailDuplication.cpp to turn on normalization when removing successor
llvm-svn: 255752
2015-12-16 06:03:30 +00:00
Chen Li 3e8330a1fe [SelectionDAGBuilder] Adds support for landingpads of token type
Summary: This patch adds a check in visitLandingPad to see if landingpad's result type is token type. If so, do not create DAG nodes for its exception pointer and selector value. This patch enables the back end to handle landingpads of token type.

Reviewers: JosephTremoulet, majnemer, rnk

Subscribers: sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15405

llvm-svn: 255749
2015-12-16 04:48:42 +00:00
Philip Reames 23319014a9 Speculative fix for windows build
llvm-svn: 255743
2015-12-16 01:24:05 +00:00
Philip Reames 61a24ab6cc [IR] Add support for floating pointer atomic loads and stores
This patch allows atomic loads and stores of floating point to be specified in the IR and adds an adapter to allow them to be lowered via existing backend support for bitcast-to-equivalent-integer idiom.

Previously, the only way to specify a atomic float operation was to bitcast the pointer to a i32, load the value as an i32, then bitcast to a float. At it's most basic, this patch simply moves this expansion step to the point we start lowering to the backend.

This patch does not add canonicalization rules to convert the bitcast idioms to the appropriate atomic loads. I plan to do that in the future, but for now, let's simply add the support. I'd like to get instruction selection working through at least one backend (x86-64) without the bitcast conversion before canonicalizing into this form.

Similarly, I haven't yet added the target hooks to opt out of the lowering step I added to AtomicExpand. I figured it would more sense to add those once at least one backend (x86) was ready to actually opt out.

As you can see from the included tests, the generated code quality is not great. I plan on submitting some patches to fix this, but help from others along that line would be very welcome. I'm not super familiar with the backend and my ramp up time may be material.

Differential Revision: http://reviews.llvm.org/D15471

llvm-svn: 255737
2015-12-16 00:49:36 +00:00
Wolfgang Pieb 60b7ca6713 Test commit: fixed spelling error in comment.
llvm-svn: 255721
2015-12-16 00:08:18 +00:00
Reid Kleckner 7850c9f5ca [WinEH] Make llvm.x86.seh.recoverfp work on x64
It adjusts from RSP-after-prologue to RBP, which is what SEH filters
need to do before they can use llvm.localrecover.

Fixes SEH filter captures, which were broken in r250088.

Issue reported by Alex Crichton.

llvm-svn: 255707
2015-12-15 23:40:58 +00:00
David Majnemer 3bb88c0210 [WinEH] Use operand bundles to describe call sites
SimplifyCFG allows tail merging with code which terminates in
unreachable which, in turn, makes it possible for an invoke to end up in
a funclet which it was not originally part of.

Using operand bundles on invokes allows us to determine whether or not
an invoke was part of a funclet in the source program.

Furthermore, it allows us to unambiguously answer questions about the
legality of inlining into call sites which the personality may have
trouble with.

Differential Revision: http://reviews.llvm.org/D15517

llvm-svn: 255674
2015-12-15 21:27:27 +00:00
Michael Kuperstein 801ee74167 Do not try to use i8 and i16 versions of FP_TO_U/SINT soft float library calls
It appears that neither compiler-rt nor the gnu soft-float libraries actually
implement these conversions. Instead of emitting calls to library functions
that don't exist, handle it similarly to the way we handle i8 -> float and
i16 -> float conversions: call the i32 library function, and adjust the type.

Differential Revision: http://reviews.llvm.org/D15151

llvm-svn: 255643
2015-12-15 12:55:50 +00:00
Cong Hou 3ba9cf6020 Improve the successor list update in TailDuplication.cpp.
This patch improves a temporary fix in r255530 so that we can normalize
successor list without trigger assertion failures in tail duplication pass.

llvm-svn: 255638
2015-12-15 10:10:40 +00:00
Elena Demikhovsky 6015f5c823 Type legalizer for masked gather and scatter intrinsics.
Full type legalizer that works with all vectors length - from 2 to 16, (i32, i64, float, double).

This intrinsic, for example
void @llvm.masked.scatter.v2f32(<2 x float>%data , <2 x float*>%ptrs , i32 align , <2 x i1>%mask )
requires type widening for data and type promotion for mask.

Differential Revision: http://reviews.llvm.org/D13633

llvm-svn: 255629
2015-12-15 08:40:41 +00:00
Quentin Colombet b82786e0ff [ShrinkWrapping] Do not choose restore point inside loops.
The post-dominance property is not sufficient to guarantee that a restore point
inside a loop is safe.
E.g.,
 while(1) {
   Save
   Restore
   if (...)
     break;
   use/def CSRs
 }
All the uses/defs of CSRs are dominated by Save and post-dominated
by Restore. However, the CSRs uses are still reachable after
Restore and before Save are executed.

This fixes PR25824

llvm-svn: 255613
2015-12-15 03:28:11 +00:00
Chih-Hung Hsieh 7993e18e80 [X86] Part 2 to fix x86-64 fp128 calling convention.
Part 1 was submitted in http://reviews.llvm.org/D15134.
Changes in this part:
* X86RegisterInfo.td, X86RecognizableInstr.cpp: Add FR128 register class.
* X86CallingConv.td: Pass f128 values in XMM registers or on stack.
* X86InstrCompiler.td, X86InstrInfo.td, X86InstrSSE.td:
  Add instruction selection patterns for f128.
* X86ISelLowering.cpp:
  When target has MMX registers, configure MVT::f128 in FR128RegClass,
  with TypeSoftenFloat action, and custom actions for some opcodes.
  Add missed cases of MVT::f128 in places that handle f32, f64, or vector types.
  Add TODO comment to support f128 type in inline assembly code.
* SelectionDAGBuilder.cpp:
  Fix infinite loop when f128 type can have
  VT == TLI.getTypeToTransformTo(Ctx, VT).
* Add unit tests for x86-64 fp128 type.

Differential Revision: http://reviews.llvm.org/D11438

llvm-svn: 255558
2015-12-14 22:08:36 +00:00
Krzysztof Parzyszek dac7102874 [Packetizer] Add AliasAnalysis as a parameter to the packetizer
This will make the depedence graph more accurate if an alias analysis
is provided. If nullptr is specified in its place, the behavior will
remain as it is currently.

llvm-svn: 255540
2015-12-14 20:35:13 +00:00
Cong Hou c5f510bc23 Remove the successor probabilities normalization in tail duplication pass.
The normalization may cause assertion failures on SystemZ and some out-of-tree
tests. The root cause is that unknown probabilities are materialized into known
ones by calling getSuccProbability(), which is then used to add another
successor to the same MBB which results in mixed known and unknown
probabilities. But currently those mixed probabilities cannot be normalized.

I will compose another patch to fix the root issue.

llvm-svn: 255530
2015-12-14 19:11:54 +00:00
David Majnemer bbfc7219ef [IR] Remove terminatepad
It turns out that terminatepad gives little benefit over a cleanuppad
which calls the termination function.  This is not sufficient to
implement fully generic filters but MSVC doesn't support them which
makes terminatepad a little over-designed.

Depends on D15478.

Differential Revision: http://reviews.llvm.org/D15479

llvm-svn: 255522
2015-12-14 18:34:23 +00:00
Paul Robinson accc3e0376 FastISel needs to remove dead code when it bails out.
When FastISel fails to translate an instruction it hands off code
generation to SelectionDAG. Before it does so, it may have generated
local value instructions to feed phi nodes in successor blocks. These
instructions will then be generated again by SelectionDAG, causing
duplication and less efficient code, including extra spill
instructions.

Patch by Wolfgang Pieb!

Differential Revision: http://reviews.llvm.org/D11768

llvm-svn: 255520
2015-12-14 18:33:18 +00:00
Matt Arsenault d079285e05 AMDGPU: Use generic bitreverse intrinsic
Also fix bug in vector legalization for bitreverse.

llvm-svn: 255512
2015-12-14 17:25:38 +00:00
Sanjay Patel af674fbfd9 getParent() ^ 3 == getModule() ; NFCI
llvm-svn: 255511
2015-12-14 17:24:23 +00:00
Cong Hou c00e65aa89 Fix a type issue in r255455. Should not use unsigned type as std::abs()'s template type.
llvm-svn: 255461
2015-12-13 17:00:25 +00:00
Cong Hou 663dd018c1 Replace <cstdint> by llvm/Support/DataTypes.h for the typedef of uint64_t. NFC.
llvm-svn: 255458
2015-12-13 09:52:14 +00:00
Cong Hou c0a33e0f62 Add the missing header file <cstdint> needed by uint64_t
llvm-svn: 255457
2015-12-13 09:32:21 +00:00
Cong Hou c106989fd5 Normalize MBB's successors' probabilities in several locations.
This patch adds some missing calls to MBB::normalizeSuccProbs() in several
locations where it should be called. Those places are found by checking if the
sum of successors' probabilities is approximate one in MachineBlockPlacement
pass with some instrumented code (not in this patch).


Differential revision: http://reviews.llvm.org/D15259

llvm-svn: 255455
2015-12-13 09:26:17 +00:00
Manuel Jacob 1578ec8860 Partially fix memcpy / memset / memmove lowering in SelectionDAG construction if address space != 0.
Summary:
Previously SelectionDAGBuilder asserted that the pointer operands of
memcpy / memset / memmove intrinsics are in address space < 256.  This assert
implicitly assumed the X86 backend, where all address spaces < 256 are
equivalent to address space 0 from the code generator's point of view.  On some
targets (R600 and NVPTX) several address spaces < 256 have a target-defined
meaning, so this assert made little sense for these targets.

This patch removes this wrong assertion and adds extra checks before lowering
these intrinsics to library calls.  If a pointer operand can't be casted to
address space 0 without changing semantics, a fatal error is reported to the
user.

The new behavior should be valid for all targets that give address spaces != 0
a target-specified meaning (NVPTX, R600, X86).  NVPTX lowers big or
variable-sized memory intrinsics before SelectionDAG construction.  All other
memory intrinsics are inlined (the threshold is set very high for this target).
R600 doesn't support memcpy / memset / memmove library calls (previously the
illegal emission of a call to such library function triggered an error
somewhere in the code generator).  X86 now emits inline loads and stores for
address spaces 256 and 257 up to the same threshold that is used for address
space 0 and reports a fatal error otherwise.

I call this a "partial fix" because there are still cases that can't be
lowered.  A fatal error is reported in these cases.

Reviewers: arsenm, theraven, compnerd, hfinkel

Subscribers: hfinkel, llvm-commits, alex

Differential Revision: http://reviews.llvm.org/D7241

llvm-svn: 255441
2015-12-12 21:33:31 +00:00
David Majnemer 8a1c45d6e8 [IR] Reformulate LLVM's EH funclet IR
While we have successfully implemented a funclet-oriented EH scheme on
top of LLVM IR, our scheme has some notable deficiencies:
- catchendpad and cleanupendpad are necessary in the current design
  but they are difficult to explain to others, even to seasoned LLVM
  experts.
- catchendpad and cleanupendpad are optimization barriers.  They cannot
  be split and force all potentially throwing call-sites to be invokes.
  This has a noticable effect on the quality of our code generation.
- catchpad, while similar in some aspects to invoke, is fairly awkward.
  It is unsplittable, starts a funclet, and has control flow to other
  funclets.
- The nesting relationship between funclets is currently a property of
  control flow edges.  Because of this, we are forced to carefully
  analyze the flow graph to see if there might potentially exist illegal
  nesting among funclets.  While we have logic to clone funclets when
  they are illegally nested, it would be nicer if we had a
  representation which forbade them upfront.

Let's clean this up a bit by doing the following:
- Instead, make catchpad more like cleanuppad and landingpad: no control
  flow, just a bunch of simple operands;  catchpad would be splittable.
- Introduce catchswitch, a control flow instruction designed to model
  the constraints of funclet oriented EH.
- Make funclet scoping explicit by having funclet instructions consume
  the token produced by the funclet which contains them.
- Remove catchendpad and cleanupendpad.  Their presence can be inferred
  implicitly using coloring information.

N.B.  The state numbering code for the CLR has been updated but the
veracity of it's output cannot be spoken for.  An expert should take a
look to make sure the results are reasonable.

Reviewers: rnk, JosephTremoulet, andrew.w.kaylor

Differential Revision: http://reviews.llvm.org/D15139

llvm-svn: 255422
2015-12-12 05:38:55 +00:00
Matt Arsenault fabab4b7dd SelectionDAG: Match min/max if the scalar operation is legal
llvm-svn: 255388
2015-12-11 23:16:47 +00:00
Hal Finkel cd8664c3c2 Revert r248483, r242546, r242545, and r242409 - absdiff intrinsics
After much discussion, ending here:

  http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151123/315620.html

it has been decided that, instead of having the vectorizer directly generate
special absdiff and horizontal-add intrinsics, we'll recognize the relevant
reduction patterns during CodeGen. Accordingly, these intrinsics are not needed
(the operations they represent can be pattern matched, as is already done in
some backends). Thus, we're backing these out in favor of the current
development work.

r248483 - Codegen: Fix llvm.*absdiff semantic.
r242546 - [ARM] Use [SU]ABSDIFF nodes instead of intrinsics for VABD/VABA
r242545 - [AArch64] Use [SU]ABSDIFF nodes instead of intrinsics for ABD/ABA
r242409 - [Codegen] Add intrinsics 'absdiff' and corresponding SDNodes for absolute difference operation

llvm-svn: 255387
2015-12-11 23:11:52 +00:00
Matthias Braun 60d69e2865 CodeGen: Redo analyzePhysRegs() and computeRegisterLiveness()
computeRegisterLiveness() was broken in that it reported dead for a
register even if a subregister was alive. I assume this was because the
results of analayzePhysRegs() are hard to understand with respect to
subregisters.

This commit: Changes the results of analyzePhysRegs (=struct
PhysRegInfo) to be clearly understandable, also renames the fields to
avoid silent breakage of third-party code (and improve the grammar).

Fix all (two) users of computeRegisterLiveness() in llvm: By reenabling
it and removing workarounds for the bug.

This fixes http://llvm.org/PR24535 and http://llvm.org/PR25033

Differential Revision: http://reviews.llvm.org/D15320

llvm-svn: 255362
2015-12-11 19:42:09 +00:00
Manman Ren abc7c1d1d2 CXX_FAST_TLS calling convention: target independent portion.
The access function has a short entry and a short exit, the initialization
block is only run the first time. To improve the performance, we want to
have a short frame at the entry and exit.

We explicitly handle most of the CSRs via copies. Only the CSRs that are not
handled via copies will be in CSR_SaveList.

Frame lowering and prologue/epilogue insertion will generate a short frame
in the entry and exit according to CSR_SaveList. The majority of the CSRs will
be handled by register allcoator. Register allocator will try to spill and
reload them in the initialization block.

We add CSRsViaCopy, it will be explicitly handled during lowering.

1> we first set FunctionLoweringInfo->SplitCSR if conditions are met (the target
   supports it for the given calling convention and the function has only return
   exits). We also call TLI->initializeSplitCSR to perform initialization.
2> we call TLI->insertCopiesSplitCSR to insert copies from CSRsViaCopy to
   virtual registers at beginning of the entry block and copies from virtual
   registers to CSRsViaCopy at beginning of the exit blocks.
3> we also need to make sure the explicit copies will not be eliminated.

rdar://problem/23557469

Differential Revision: http://reviews.llvm.org/D15340

llvm-svn: 255353
2015-12-11 18:24:30 +00:00
Eric Christopher 325e8d06dc Fix (bitcast (fabs x)), (bitcast (fneg x)) and (bitcast (fcopysign cst,
x)) combines for ppc_fp128, since signbit computation is more
complicated.

Discussion thread:
http://lists.llvm.org/pipermail/llvm-dev/2015-November/092863.html

Patch by Tim Shen!

llvm-svn: 255305
2015-12-10 22:09:06 +00:00
Cong Hou 5146b2d1da Delete a duplicate branch in IfConversion.cpp. NFC.
llvm-svn: 255291
2015-12-10 19:57:22 +00:00
Simon Pilgrim 06ea4be281 [DAGCombiner] Fix PR25763 - vector comparison constant folding + sign-extension
PR25763 demonstrated an issue with D14683 - vector comparison constant folding only works for i1 results, so we need to split off the sign-extension of the result to the required type. Luckily this can be done with the existing type legalization code.

llvm-svn: 255289
2015-12-10 19:47:06 +00:00
Sanjay Patel 87c6c0797e remove duplicated comments and don't repeat function names in comments; NFC
llvm-svn: 255257
2015-12-10 16:34:21 +00:00
Jonas Paulsson e451eeff5c [PostRA scheduling] Allow a target to do scheduling when it wants post RA.
SystemZ needs to do its scheduling after branch relaxation, which can
only happen after block placement, and therefore the standard
PostRAScheduler point in the pass sequence is too early.

TargetMachine::targetSchedulesPostRAScheduling() is a new method that
signals on returning true that target will insert the final scheduling
pass on its own.

Reviewed by Hal Finkel

llvm-svn: 255234
2015-12-10 09:10:07 +00:00
Matthias Braun 7d8e41e82c RegisterPressure: Factor out liveness dead-def detection logic; NFCI
Detecting additional dead-defs without a dead flag that are only visible
through liveness information should be part of the register operand
collection not intertwined with the register pressure update logic.

llvm-svn: 255192
2015-12-10 01:04:15 +00:00
Dan Gohman dab313e0ed PeepholeOptimizer: Ignore dead implicit defs
Target-specific instructions may have uninteresting physreg clobbers,
for target-specific reasons. The peephole pass doesn't need to concern
itself with such defs, as long as they're implicit and marked as dead.

llvm-svn: 255182
2015-12-10 00:37:51 +00:00
Sanjay Patel 87d2ae23ac use range-based for loops; NFCI
llvm-svn: 255171
2015-12-09 22:45:45 +00:00
Robert Lougher f0033b29d4 Fix cycle in selection DAG introduced by extractelement legalization
During selection DAG legalization, extractelement is replaced with a load
instruction.  To do this, a temporary store to the stack is used unless an
existing store is found that can be re-used.
    
If re-using a store, the chain going out of the store must be replaced by
the one going out of the new load (this ensures that any stores that must
take place after the store happens after the load, else the value might
be overwritten before it is loaded).
    
The problem is, if the extractelement index is dependent on the store
replacing the chain will introduce a cycle in the selection DAG (the load
uses the index, and by replacing the chain we will make the index dependent
on the load).
    
To fix this, if the index is dependent on the store, the store is skipped.
This is conservative as we may end up creating an unnecessary extra store
to the stack.  However, the situation is not expected to occur very often.

Differential Revision: http://reviews.llvm.org/D15330

llvm-svn: 255114
2015-12-09 14:34:10 +00:00
Mehdi Amini ceca971576 Revert "Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933"
This reverts commit r255096.

Break the bots: http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/16378/

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 255101
2015-12-09 08:17:42 +00:00
Vikram TV 0876d2d5cf Implement a new pass - LiveDebugValues - to compute the set of live DEBUG_VALUEs at each basic block and insert them. Reviewed and accepted at: http://reviews.llvm.org/D11933
llvm-svn: 255096
2015-12-09 05:49:14 +00:00
Reid Kleckner 8de1fe23ed [CGP] Reimplement r255055 a different way
llvm-svn: 255070
2015-12-08 23:00:03 +00:00
Reid Kleckner e18f92bfe9 Revert "[CGP] Check that we have an insert point before moving llvm.dbg.value around"
This reverts commit r255055.

Breakage has been reported.

llvm-svn: 255063
2015-12-08 22:33:23 +00:00
Reid Kleckner 7c005324d5 [CGP] Check that we have an insert point before moving llvm.dbg.value around
llvm-svn: 255055
2015-12-08 21:50:52 +00:00
Justin Bogner 3135ba9b38 AsmPrinter: Use emitGlobalConstantFP to emit elements of constant data
It's strange to duplicate the logic for emitting FP values into
emitGlobalConstantDataSequential, and it's even stranger that we end
up printing the verbose assembly comments differently between the two
paths. Just call into emitGlobalConstantFP rather than crudely
duplicating its logic.

llvm-svn: 254988
2015-12-08 02:37:48 +00:00
Sanjay Patel dc627ad63f fix return values to match bool return type; NFC
llvm-svn: 254968
2015-12-07 23:34:30 +00:00
Elena Demikhovsky 33e61eceb4 AVX-512: Fixed masked load / store instruction selection for KNL.
Patterns were missing for KNL target for <8 x i32>, <8 x float> masked load/store.

This intrinsic comes with all legal types:
<8 x float> @llvm.masked.load.v8f32(<8 x float>* %addr, i32 align, <8 x i1> %mask, <8 x float> %passThru),
but still requires lowering, because VMASKMOVPS, VMASKMOVDQU32 work with 512-bit vectors only.

All data operands should be widened to 512-bit vector.
The mask operand should be widened to v16i1 with zeroes.

Differential Revision: http://reviews.llvm.org/D15265

llvm-svn: 254909
2015-12-07 13:39:24 +00:00
Craig Topper e5e035a3a8 Replace uint16_t with the MCPhysReg typedef in many places. A lot of physical register arrays already use this typedef.
llvm-svn: 254843
2015-12-05 07:13:35 +00:00
Cong Hou 833fe143f5 Normalize successors' probabilities when building MBBs for jump table.
llvm-svn: 254837
2015-12-05 05:00:55 +00:00
Matthias Braun b17e8b1c1d ScheduleDAGInstrs: Move LiveIntervals field to ScheduleDAGMI
Now that ScheduleDAGInstrs doesn't need it anymore we can move the field
down the class hierarcy to ScheduleDAGMI.

llvm-svn: 254759
2015-12-04 19:54:24 +00:00
Rafael Espindola 71bd70cc30 Revert "[BranchFolding] Merge MMOs during tail merge"
This reverts commit r254694.

It broke bootstrap.

llvm-svn: 254700
2015-12-04 04:15:05 +00:00
Junmo Park c0731ca183 [BranchFolding] Merge MMOs during tail merge
Summary:
If we remove the MMOs from Load/Store instructions,
they are treated as volatile. This makes other optimization passes unhappy.
eg. Load/Store Optimization

So, it looks better to merge, not remove.

Reviewers: gberry, mcrosier

Subscribers: gberry, llvm-commits

Differential Revision: http://reviews.llvm.org/D14797

llvm-svn: 254694
2015-12-04 02:29:25 +00:00
Junmo Park 7cc13f2e58 (no commit message)
llvm-svn: 254686
2015-12-04 02:06:59 +00:00
Matthias Braun 97d0ffbe06 ScheduleDAGInstrs: Rework schedule graph builder.
Re-comitting with a change that avoids undefined uses getting put into
the VRegUses list.

The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
  (not used for now)

The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.

Differential Revision: http://reviews.llvm.org/D9068

llvm-svn: 254683
2015-12-04 01:51:19 +00:00
Matthias Braun c07cbc8d3c raw_ostream: << operator for callables with raw_ostream argument
This is a revised version of r254655 which uses a Printable wrapper
class to avoid ambiguous overload problems.

Differential Revision: http://reviews.llvm.org/D14348

llvm-svn: 254681
2015-12-04 01:31:59 +00:00
Evgeniy Stepanov 2bb9c5ca22 Emit function alias to data as a function symbol.
CFI emits jump slots for indirect functions as a byte array
constant, and declares function-typed aliases to these constants.

This change fixes AsmPrinter to emit these aliases as function
symbols and not data symbols.

llvm-svn: 254674
2015-12-04 00:45:43 +00:00
JF Bastien 1ac69947b6 CodeGen peephole: fold redundant phys reg copies
Code generation often exposes redundant physical register copies through
virtual registers such as:

  %vreg = COPY %PHYSREG
  ...
  %PHYSREG = COPY %vreg

There are cases where no intervening clobber of %PHYSREG occurs, and the
later copy could therefore be removed. In some cases this further allows
us to remove the initial copy.

This patch contains a motivating example which comes from the x86 build
of Chrome, specifically cc::ResourceProvider::UnlockForRead uses
libstdc++'s implementation of hash_map. That example has two tests live
at the same time, and after machine sinking LLVM has confused itself
enough and things spilling EFLAGS is a great idea even though it's
never restored and the comparison results are both live.

Before this patch we have:
  DEC32m %RIP, 1, %noreg, <ga:@L>, %noreg, %EFLAGS<imp-def>
  %vreg1<def> = COPY %EFLAGS; GR64:%vreg1
  %EFLAGS<def> = COPY %vreg1; GR64:%vreg1
  JNE_1 <BB#1>, %EFLAGS<imp-use>

Both copies are useless. This patch tries to eliminate the later copy in
a generic manner.

dec is especially confusing to LLVM when compared with sub.

I wrote this patch to treat all physical registers generically, but only
remove redundant copies of non-allocatable physical registers because
the allocatable ones caused issues (e.g. when calling conventions weren't
properly modeled) and should be handled later by the register allocator
anyways.

The following tests used to failed when the patch also replaced allocatable
registers:
  CodeGen/X86/StackColoring.ll
  CodeGen/X86/avx512-calling-conv.ll
  CodeGen/X86/copy-propagation.ll
  CodeGen/X86/inline-asm-fpstack.ll
  CodeGen/X86/musttail-varargs.ll
  CodeGen/X86/pop-stack-cleanup.ll
  CodeGen/X86/preserve_mostcc64.ll
  CodeGen/X86/tailcallstack64.ll
  CodeGen/X86/this-return-64.ll
This happens because COPY has other special meaning for e.g. dependency
breakage and x87 FP stack.

Note that all other backends' tests pass.

Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D15157

llvm-svn: 254665
2015-12-03 23:43:56 +00:00
Justin Bogner 8d6f4e36e8 AsmPrinter: Simplify emitting FP elements in sequential data. NFC
Use APFloat APIs here Rather than manually type-punning through
unions.

llvm-svn: 254664
2015-12-03 23:28:35 +00:00
Matthias Braun 149b859c55 Revert "raw_ostream: << operator for callables with raw_stream argument"
This commit provoked "error C2593: 'operator <<' is ambiguous" on MSVC.

This reverts commit r254655.

llvm-svn: 254661
2015-12-03 23:00:28 +00:00
Matthias Braun e957a9bb1b raw_ostream: << operator for callables with raw_stream argument
This allows easier construction of print helpers. Example:

Printable PrintLaneMask(unsigned LaneMask) {
  return Printable([LaneMask](raw_ostream &OS) {
    OS << format("%08X", LaneMask);
  });
}

// Usage:
OS << PrintLaneMask(Mask);

Differential Revision: http://reviews.llvm.org/D14348

llvm-svn: 254655
2015-12-03 22:17:26 +00:00
Chih-Hung Hsieh ed7d81e5d4 [X86] Part 1 to fix x86-64 fp128 calling convention.
Almost all these changes are conditioned and only apply to the new
x86-64 f128 type configuration, which will be enabled in a follow up
patch. They are required together to make new f128 work. If there is
any error, we should fix or revert them as a whole.
These changes should have no impact to current configurations.

* Relax type legalization checks to accept new f128 type configuration,
  whose TypeAction is TypeSoftenFloat, not TypeLegal, but also has
  TLI.isTypeLegal true.
* Relax GetSoftenedFloat to return in some cases f128 type SDValue,
  which is TLI.isTypeLegal but not "softened" to i128 node.
* Allow customized FABS, FNEG, FCOPYSIGN on new f128 type configuration,
  to generate optimized bitwise operators for libm functions.
* Enhance related Lower* functions to handle f128 type.
* Enhance DAGTypeLegalizer::run, SoftenFloatResult, and related functions
  to keep new f128 type in register, and convert f128 operators to library calls.
* Fix Combiner, Emitter, Legalizer routines that did not handle f128 type.
* Add ExpandConstant to handle i128 constants, ExpandNode
  to handle ISD::Constant node.
* Add one more parameter to getCommonSubClass and firstCommonClass,
  to guarantee that returned common sub class will contain the specified
  simple value type.
  This extra parameter is used by EmitCopyFromReg in InstrEmitter.cpp.
* Fix infinite loop in getTypeLegalizationCost when f128 is the value type.
* Fix printOperand to handle null operand.
* Enhance ISD::BITCAST node to handle f128 constant.
* Expand new f128 type for BR_CC, SELECT_CC, SELECT, SETCC nodes.
* Enhance X86AsmPrinter to emit f128 values in comments.

Differential Revision: http://reviews.llvm.org/D15134

llvm-svn: 254653
2015-12-03 22:02:40 +00:00
Andrew Kaylor 9efb2332e2 [WinEH] Avoid infinite loop in BranchFolding for multiple single block funclets
Differential Revision: http://reviews.llvm.org/D14996

llvm-svn: 254629
2015-12-03 18:55:28 +00:00
Matthias Braun 2fd672a221 Revert "ScheduleDAGInstrs: Rework schedule graph builder."
This works mostly fine but breaks some stage 1 builders when compiling
compiler-rt on i386. Revert for further investigation as I can't see an
obvious cause/fix.

This reverts commit r254577.

llvm-svn: 254586
2015-12-03 03:01:10 +00:00
Matthias Braun d35fe3d984 ScheduleDAGInstrs: Rework schedule graph builder.
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
  (not used for now)

The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.

Differential Revision: http://reviews.llvm.org/D9068

llvm-svn: 254577
2015-12-03 02:05:27 +00:00
Matthias Braun b0083608b4 RegisterPressure: Use range based for, fix else style; NFC
llvm-svn: 254575
2015-12-03 01:44:45 +00:00
David Majnemer 70497c696a Move EH-specific helper functions to a more appropriate place
No functionality change is intended.

llvm-svn: 254562
2015-12-02 23:06:39 +00:00
Reid Kleckner 1f11b4e3a7 Use std::string instead of strdup() and free() in WinCodeViewLineTables
llvm-svn: 254557
2015-12-02 22:34:30 +00:00
Kyle Butt cf6a8bfe51 [CodeGen]: Fix bad interaction with AntiDep breaking and inline asm.
AggressiveAntiDepBreaker was renaming registers specified by the user
for inline assembly. While this will work for compiler-specified
registers, it won't work for user-specified registers, and at the time
this runs, I don't currently see a way to distinguish them.

llvm-svn: 254532
2015-12-02 18:58:51 +00:00
Fiona Glaser 1075f6323f Fix accidental off by one change
Didn't break any tests, but did unnecessary extra work.

llvm-svn: 254529
2015-12-02 18:46:23 +00:00
Fiona Glaser e25b06fa23 Scheduler / Regalloc: use unique_ptr[] instead of std::vector
vector.resize() is significantly slower than memset in many STLs
and the cost of initializing these vectors is significant on targets
with many registers. Since we don't need the overhead of a vector,
use a simple unique_ptr instead.

llvm-svn: 254526
2015-12-02 18:32:59 +00:00
Tim Northover f520eff782 AArch64: use ldxp/stxp pair to implement 128-bit atomic loads.
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.

llvm-svn: 254524
2015-12-02 18:12:57 +00:00
Cong Hou cb07d7016a Fix a bug in IfConversion.cpp.
The bug is introduced in r254377 which failed some tests on ARM, where a new
probability is assigned to a successor but the provided BB may not be a
successor.

llvm-svn: 254463
2015-12-01 21:50:20 +00:00
Sanjay Patel 0b2a94916d use range-based for loops; NFCI
llvm-svn: 254453
2015-12-01 19:57:43 +00:00
Sanjay Patel b53791e5a7 don't repeat function/variable names in comments; NFC
llvm-svn: 254445
2015-12-01 19:32:35 +00:00
Sanjay Patel 96824deebc fix typo; NFC
llvm-svn: 254442
2015-12-01 19:19:18 +00:00
Elena Demikhovsky 0781d7b2b4 Fixed a failure in cost calculation for vector GEP
Cost calculation for vector GEP failed with due to invalid cast to GEP index operand.
The bug is fixed, added a test.

http://reviews.llvm.org/D14976

llvm-svn: 254408
2015-12-01 12:08:36 +00:00
Yury Gribov d7dbb66eb8 Introduce new @llvm.get.dynamic.area.offset.i{32, 64} intrinsics.
The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset
from native stack pointer to the address of the most recent dynamic alloca on
the caller's stack. These intrinsics are intendend for use in combination with
@llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic
alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning
routines.

Patch by Max Ostapenko.

Differential Revision: http://reviews.llvm.org/D14983

llvm-svn: 254404
2015-12-01 11:40:55 +00:00
Cong Hou 4aef7ef881 Allow known and unknown probabilities coexist in MBB's successor list.
Previously it is not allowed for each MBB to have successors with both known and
unknown probabilities. However, this may be too strict as at this stage we could
not always guarantee that. It is better to remove this restriction now, and I
will work on validating MBB's successors' probabilities first (for example,
check if the sum is approximate one).

llvm-svn: 254402
2015-12-01 11:05:39 +00:00
Cong Hou d97c100dc4 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
(This is the second attempt to submit this patch. The first caused two assertion
 failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)

The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254377
2015-12-01 05:29:22 +00:00
Matthias Braun 50f7f585ed RegisterPressure: If we do not collect dead defs the list must be empty
llvm-svn: 254372
2015-12-01 04:20:06 +00:00
Matthias Braun ba6b225bf9 RegisterPressure: Remove support for recede()/advance() at MBB boundaries
Nobody was checking the returnvalue of recede()/advance() so we can
simply replace this code with asserts.

llvm-svn: 254371
2015-12-01 04:20:04 +00:00
Matthias Braun f9f8b92d93 RegisterPressure: Split RegisterOperands analysis code from result object; NFC
This is in preparation to expose the RegisterOperands class as
RegisterPressure API.

llvm-svn: 254368
2015-12-01 04:19:56 +00:00
Hans Wennborg 1dbaf67537 Revert r254348: "Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces."
and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."

Asserts were firing in Chromium builds. See PR25687.

llvm-svn: 254366
2015-12-01 03:49:42 +00:00
Cong Hou 1ccca9e673 Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction.
The root cause is the rounding behavior in BranchProbability construction. We may consider to use truncation instead in the future.

llvm-svn: 254356
2015-12-01 00:55:42 +00:00
Evgeniy Stepanov fd07995363 Extend debug info for function parameters in SDAG.
SDAG currently can emit debug location for function parameters when
an llvm.dbg.declare points to either a function argument SSA temp,
or to an AllocaInst. This change extends this logic by adding a
fallback case when neither of the above is true.

This is required for SafeStack, which may copy the contents of a
byval function argument into something that is not an alloca, and
then describe the target as the new location of the said argument.

llvm-svn: 254352
2015-12-01 00:34:30 +00:00
Cong Hou fa1917c673 Replace all weight-based interfaces in MBB with probability-based interfaces, and update all uses of old interfaces.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.

All uses of weight-based interfaces are now updated to use probability-based
ones.


Differential revision: http://reviews.llvm.org/D14973

llvm-svn: 254348
2015-12-01 00:02:51 +00:00
Paul Robinson a2550a6da3 Have 'optnone' respect the -fast-isel=false option.
This is primarily useful for debugging optnone v. ISel issues.

Differential Revision: http://reviews.llvm.org/D14792

llvm-svn: 254335
2015-11-30 21:56:16 +00:00
Craig Topper 6066164454 Use a lambda instead of std::bind and std::mem_fn I introduced in r254242. NFC
llvm-svn: 254260
2015-11-29 18:05:22 +00:00
Craig Topper d0573179dc [SelectionDAG] Use std::any_of instead of a manually coded loop. NFC
llvm-svn: 254242
2015-11-29 04:37:11 +00:00
Jonas Paulsson f12b925bb1 [Stack realignment] Handling of aligned allocas.
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.

It would be good to group aligned allocas into a single big alloca as
an optimization, but this is yet todo.

SystemZ benefits from this, due to its stack frame layout.

New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.

Review and help from Ulrich Weigand and Hal Finkel.

llvm-svn: 254227
2015-11-28 11:02:32 +00:00
Artyom Skrobov 314ee04268 Expose isXxxConstant() functions from SelectionDAGNodes.h (NFC)
Summary:
Many target lowerings copy-paste the code to test SDValues for known constants.
This code can instead be shared in SelectionDAG.cpp, and reused in the targets.

Reviewers: MatzeB, andreadb, tstellarAMD

Subscribers: arsenm, jyknight, llvm-commits

Differential Revision: http://reviews.llvm.org/D14945

llvm-svn: 254085
2015-11-25 19:41:11 +00:00
Eric Christopher 4675c439aa Fix some places where we were assuming that memory type had been legalized
to a simple type when lowering a truncating store of a vector type. In this
case for an EVT we'll return Expand as we should in all of the cases anyhow.

The testcase triggered at the one in VectorLegalizer::LegalizeOp, inspection
found the rest.

llvm-svn: 254061
2015-11-25 09:11:53 +00:00
Matthias Braun 147110da84 LiveVariables should not clobber MachineOperand::IsDead, ::IsKill on reserved physical registers
Patch by Nick Johnson <Nicholas.Paul.Johnson@deshawresearch.com>

Differential Revision: http://reviews.llvm.org/D14875

llvm-svn: 254012
2015-11-24 20:06:56 +00:00
Cong Hou 1938f2eb98 Let SelectionDAG start to use probability-based interface to add successors.
The patch in http://reviews.llvm.org/D13745 is broken into four parts:

1. New interfaces without functional changes.
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights.
3. Use new interfaces in all other passes.
4. Remove old interfaces.

This the second patch above. In this patch SelectionDAG starts to use
probability-based interfaces in MBB to add successors but other MC passes are
still using weight-based interfaces. Therefore, we need to maintain correct
weight list in MBB even when probability-based interfaces are used. This is
done by updating weight list in probability-based interfaces by treating the
numerator of probabilities as weights. This change affects many test cases
that check successor weight values. I will update those test cases once this
patch looks good to you.


Differential revision: http://reviews.llvm.org/D14361

llvm-svn: 253965
2015-11-24 08:51:23 +00:00
Davide Italiano c304a0ddc1 [DIE] Make DIE.h NDEBUG conditional-free.
Switch dump()/print() method definitions to LLVM_DUMP_METHOD instead.

llvm-svn: 253945
2015-11-24 02:21:43 +00:00
Andrew Kaylor d0430e8580 [WinEH] Fix problem where CodeGenPrepare incorrectly sinks a bitcast into an EH pad.
Differential Revision: http://reviews.llvm.org/D14842

llvm-svn: 253902
2015-11-23 19:16:15 +00:00
Simon Pilgrim 1dfe53e180 Remove duplicate getValueType() calls. NFCI.
llvm-svn: 253823
2015-11-22 16:49:38 +00:00
Krzysztof Parzyszek 6753f33388 Avoid dependency between TableGen and CodeGen
Duplicate a few common definitions between DFAPacketizer.cpp and
DFAPacketizerEmitter.cpp to avoid including files from CodeGen
in TableGen.

llvm-svn: 253820
2015-11-22 15:20:19 +00:00
Krzysztof Parzyszek b46557292c Hexagon V60/HVX DFA scheduler support
Extended DFA tablegen to:
  - added "-debug-only dfa-emitter" support to llvm-tblgen

  - defined CVI_PIPE* resources for the V60 vector coprocessor

  - allow specification of multiple required resources
    - supports ANDs of ORs
    - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
           (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)

  - added support for combo resources
    - allows specifying ORs of ANDs
    - e.g. [CVI_XLSHF, CVI_MPY01] means:
           (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)

  - increased DFA input size from 32-bit to 64-bit
    - allows for a maximum of 4 AND'ed terms of 16 resources

  - supported expressions now include:

    expression     => term [AND term] [AND term] [AND term]
    term           => resource [OR resource]*
    resource       => one_resource | combo_resource
    combo_resource => (one_resource [AND one_resource]*)

Author: Dan Palermo <dpalermo@codeaurora.org>

kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.

Reapply the previous patch, this time without circular dependencies.

llvm-svn: 253793
2015-11-21 20:00:45 +00:00
Krzysztof Parzyszek 4ca21fc1aa Revert r253790: it breaks all builds for some reason.
llvm-svn: 253791
2015-11-21 17:38:33 +00:00
Krzysztof Parzyszek 220a9bc018 Hexagon V60/HVX DFA scheduler support
Extended DFA tablegen to:
  - added "-debug-only dfa-emitter" support to llvm-tblgen

  - defined CVI_PIPE* resources for the V60 vector coprocessor

  - allow specification of multiple required resources
    - supports ANDs of ORs
    - e.g. [SLOT2, SLOT3], [CVI_MPY0, CVI_MPY1] means:
           (SLOT2 OR SLOT3) AND (CVI_MPY0 OR CVI_MPY1)

  - added support for combo resources
    - allows specifying ORs of ANDs
    - e.g. [CVI_XLSHF, CVI_MPY01] means:
           (CVI_XLANE AND CVI_SHIFT) OR (CVI_MPY0 AND CVI_MPY1)

  - increased DFA input size from 32-bit to 64-bit
    - allows for a maximum of 4 AND'ed terms of 16 resources

  - supported expressions now include:

    expression     => term [AND term] [AND term] [AND term]
    term           => resource [OR resource]*
    resource       => one_resource | combo_resource
    combo_resource => (one_resource [AND one_resource]*)

Author: Dan Palermo <dpalermo@codeaurora.org>

kparzysz: Verified AMDGPU codegen to be unchanged on all llc
tests, except those dealing with instruction encodings.

llvm-svn: 253790
2015-11-21 17:23:52 +00:00
Jonas Paulsson 8f0d2b7f1f [DAGCombiner] Bugfix for lost chain depenedency.
When MergeConsecutiveStores() combines two loads and two stores into
wider loads and stores, the chain users of both of the original loads
must be transfered to the new load, because it may be that a chain
user only depends on one of the loads.

New test case: test/CodeGen/SystemZ/dag-combine-01.ll

Reviewed by James Y Knight.

Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25310#c6
llvm-svn: 253779
2015-11-21 13:25:07 +00:00
Geoff Berry 5256fcada0 [CodeGenPrepare] Create more extloads and fewer ands
Summary:
Add and instructions immediately after loads that only have their low
bits used, assuming that the (and (load x) c) will be matched as a
extload and the ands/truncs fed by the extload will be removed by isel.

Reviewers: mcrosier, qcolombet, ab

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14584

llvm-svn: 253722
2015-11-20 22:34:39 +00:00
Arnaud A. de Grandmaison 4e89e9f846 [ShrinkWrap] Teach ShrinkWrap to handle targets requiring a register scavenger.
The included test only checks for a compiler crash for now. Several people are
facing this issue, so we first resolve the crash, and will increase shrinkwrap's
coverage later in a follow-up patch.

llvm-svn: 253718
2015-11-20 21:54:27 +00:00
Daniel Sanders b700203c8b Partially revert r253662: some unrelated work was accidentally committed with it.
Sorry.

llvm-svn: 253663
2015-11-20 13:16:35 +00:00
Daniel Sanders be9db3c00a Revert the revert 253497 and 253539 - These commits aren't the cause of the clang-cmake-mips failures.
Sorry for the noise.

llvm-svn: 253662
2015-11-20 13:13:53 +00:00
Tobias Edler von Koch 4d45090659 [LTO] Add option to emit assembly from LTOCodeGenerator
This adds a new API, LTOCodeGenerator::setFileType, to choose the output file
format for LTO CodeGen. A corresponding change to use this new API from
llvm-lto and a test case is coming in a separate commit.

Differential Revision: http://reviews.llvm.org/D14554

llvm-svn: 253622
2015-11-19 23:59:24 +00:00
Reid Kleckner cc2f6c35a3 [WinEH] Disable most forms of demotion
Now that the register allocator knows about the barriers on funclet
entry and exit, testing has shown that this is unnecessary.

We still demote PHIs on unsplittable blocks due to the differences
between the IR CFG and the Machine CFG.

llvm-svn: 253619
2015-11-19 23:23:33 +00:00
Krzysztof Parzyszek df537b97b1 Expand subregisters in MachineFrameInfo::getPristineRegs
http://reviews.llvm.org/D14719

llvm-svn: 253600
2015-11-19 21:18:52 +00:00
Sanjay Patel 4699b8ab6a [CGP] despeculate expensive cttz/ctlz intrinsics
This is another step towards allowing SimplifyCFG to speculate harder, but then have 
CGP clean things up if the target doesn't like it.

Previous patches in this series:
http://reviews.llvm.org/D12882
http://reviews.llvm.org/D13297

D13297 should catch most expensive ops, but speculation of cttz/ctlz requires special
handling because of weirdness in the intrinsic definition for handling a zero input 
(that definition can probably be blamed on x86).

For example, if we have the usual speculated-by-select expensive op pattern like this:

  %tobool = icmp eq i64 %A, 0
  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)   ; is_zero_undef == true
  %cond = select i1 %tobool, i64 64, i64 %0
  ret i64 %cond

There's an instcombine that will turn it into:

  %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 false)   ; is_zero_undef == false

This CGP patch is looking for that case and despeculating it back into:

  entry:
    %tobool = icmp eq i64 %A, 0
    br i1 %tobool, label %cond.end, label %cond.true

  cond.true:
    %0 = tail call i64 @llvm.cttz.i64(i64 %A, i1 true)    ; is_zero_undef == true
    br label %cond.end

  cond.end:
    %cond = phi i64 [ %0, %cond.true ], [ 64, %entry ]
    ret i64 %cond

This unfortunately may lead to poorer codegen (see the changes in the existing x86 test), 
but if we increase speculation in SimplifyCFG (the next step in this patch series), then
we should avoid those kinds of cases in the first place.

The need for this patch was originally mentioned here:
http://reviews.llvm.org/D7506
with follow-up here:
http://reviews.llvm.org/D7554

Differential Revision: http://reviews.llvm.org/D14630

llvm-svn: 253573
2015-11-19 16:37:10 +00:00
Hans Wennborg dcc2500452 X86: More efficient legalization of wide integer compares
In particular, this makes the code for 64-bit compares on 32-bit targets
much more efficient.

Example:

  define i32 @test_slt(i64 %a, i64 %b) {
  entry:
    %cmp = icmp slt i64 %a, %b
    br i1 %cmp, label %bb1, label %bb2
  bb1:
    ret i32 1
  bb2:
    ret i32 2
  }

Before this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          setae   %al
          cmpl    16(%esp), %ecx
          setge   %cl
          je      .LBB2_2
          movb    %cl, %al
  .LBB2_2:
          testb   %al, %al
          jne     .LBB2_4
          movl    $1, %eax
          retl
  .LBB2_4:
          movl    $2, %eax
          retl

After this patch:

  test_slt:
          movl    4(%esp), %eax
          movl    8(%esp), %ecx
          cmpl    12(%esp), %eax
          sbbl    16(%esp), %ecx
          jge     .LBB1_2
          movl    $1, %eax
          retl
  .LBB1_2:
          movl    $2, %eax
          retl

Differential Revision: http://reviews.llvm.org/D14496

llvm-svn: 253572
2015-11-19 16:35:08 +00:00
Pete Cooper 67cf9a723b Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

llvm-svn: 253543
2015-11-19 05:56:52 +00:00
Pete Cooper 72bc23ef02 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

llvm-svn: 253511
2015-11-18 22:17:24 +00:00
Simon Pilgrim c1a46b729b [DAGCombiner] Vector constant folding for comparisons
This patch adds support for vector constant folding of integer/float comparisons.

This requires FoldConstantVectorArithmetic to support scalar constant operands (in this case ISD::CONDCASE). In future we should be able to support other scalar constant types as necessary (and possibly start calling FoldConstantVectorArithmetic for all node creations)

Differential Revision: http://reviews.llvm.org/D14683

llvm-svn: 253504
2015-11-18 21:17:19 +00:00
Betul Buyukkurt 6fac1741c9 [PGO] Value profiling support
This change introduces an instrumentation intrinsic instruction for
value profiling purposes, the lowering of the instrumentation intrinsic
and raw reader updates. The raw profile data files for llvm-profdata
testing are updated.

llvm-svn: 253484
2015-11-18 18:14:55 +00:00
Jonas Paulsson af722f8287 [SelectionDAGBuilder] Make sure DemoteReg ends up in right reg-class.
The virtual register containing the address for returned value on
stack should in the DAG be represented with a CopyFromReg node and not
a Register node. Otherwise, InstrEmitter will not make sure that it
ends up in the right register class for the target instruction.

SystemZ needs this, becuause the reg class for address registers is a
subset of the general 64 bit register class.

test/SystemZ/CodeGen/args-07.ll and args-04.ll updated to run with
-verify-machineinstrs.

Reviewed by Hal Finkel.

llvm-svn: 253461
2015-11-18 14:59:00 +00:00
Rafael Espindola 449711cb36 Stop producing .data.rel sections.
If a section is rw, it is irrelevant if the dynamic linker will write to
it or not.

It looks like llvm implemented this because gcc was doing it. It looks
like gcc implemented this in the hope that it would put all the
relocated items close together and speed up the dynamic linker.

There are two problem with this:
* It doesn't work. Both bfd and gold will map .data.rel to .data and
  concatenate the input sections in the order they are seen.
* If we want a feature like that, it can be implemented directly in the
  linker since it knowns where the dynamic relocations are.

llvm-svn: 253436
2015-11-18 06:02:15 +00:00
Cong Hou 136bc65ec8 Remove a redundant assertion in MachineBasicBlock.cpp. NFC.
llvm-svn: 253426
2015-11-18 01:55:56 +00:00
Cong Hou 11c1420173 Remove redundant code in MachineBasicBlock.cpp. NFC.
llvm-svn: 253425
2015-11-18 01:45:10 +00:00
Cong Hou 41cf1a5dfb Improving edge probabilities computation when choosing the best successor in machine block placement.
When looking for the best successor from the outer loop for a block
belonging to an inner loop, the edge probability computation can be
improved so that edges in the inner loop are ignored. For example,
suppose we are building chains for the non-loop part of the following
code, and looking for B1's best successor. Assume the true body is very
hot, then B3 should be the best candidate. However, because of the
existence of the back edge from B1 to B0, the probability from B1 to B3
can be very small, preventing B3 to be its successor. In this patch, when
computing the probability of the edge from B1 to B3, the weight on the
back edge B1->B0 is ignored, so that B1->B3 will have 100% probability.

if (...)
  do {
    B0;
    ... // some branches
    B1;
  } while(...);
else
  B2;
B3;


Differential revision: http://reviews.llvm.org/D10825

llvm-svn: 253414
2015-11-18 00:52:52 +00:00
David Blaikie 6196aa06c9 Generalize ownership/passing semantics to allow dsymutil to own abbreviations via unique_ptr
While still allowing CodeGen/AsmPrinter in llvm to own them using a bump
ptr allocator. (might be nice to replace the pointers there with
something that at least automatically calls their dtors, if that's
necessary/useful, rather than having it done explicitly (I think a typed
BumpPtrAllocator already does this, or maybe a unique_ptr with a custom
deleter, etc))

llvm-svn: 253409
2015-11-18 00:34:10 +00:00
Reid Kleckner c20276d0b2 [WinEH] Move WinEHFuncInfo from MachineModuleInfo to MachineFunction
Summary:
Now that there is a one-to-one mapping from MachineFunction to
WinEHFuncInfo, we don't need to use a DenseMap to select the right
WinEHFuncInfo for the current funclet.

The main challenge here is that X86WinEHStatePass is an IR pass that
doesn't have access to the MachineFunction. I gave it its own
WinEHFuncInfo object that it uses to calculate state numbers, which it
then throws away. As long as nobody creates or removes EH pads between
this pass and SDAG construction, we will get the same state numbers.

The other thing X86WinEHStatePass does is to mark the EH registration
node. Instead of communicating which alloca was the registration through
WinEHFuncInfo, I added the llvm.x86.seh.ehregnode intrinsic.  This
intrinsic generates no code and simply marks the alloca in use.

Reviewers: JCTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14668

llvm-svn: 253378
2015-11-17 21:10:25 +00:00
Pat Gavlin c8ea157811 Lower statepoints with multi-def targets.
Statepoint lowering currently expects that the target method of a
statepoint only defines a single value. This precludes using
statepoints with ABIs that return values in multiple registers
(e.g. the SysV AMD64 ABI). This change adds support for lowering
statepoints with mutli-def targets.

llvm-svn: 253339
2015-11-17 16:04:21 +00:00
Dan Gohman 7aa4abac24 Use TargetRegisterInfo for printing MachineOperand register comments
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.

This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.

Differential Revision: http://reviews.llvm.org/D14709

llvm-svn: 253338
2015-11-17 16:01:28 +00:00
Rafael Espindola 65e4902156 Drop prelink support.
The way prelink used to work was

* The compiler decides if a given section only has relocations that
are know to point to the same DSO. If so, it names it
.data.rel.ro.local<something>.
* The static linker puts all of these together.
* The prelinker program assigns addresses to each library and resolves
the local relocations.

There are many problems with this:
* It is incompatible with address space randomization.
* The information passed by the compiler is redundant. The linker
knows if a given relocation is in the same DSO or not. If could sort
by that if so desired.
* There are newer ways of speeding up DSO (gnu hash for example).
* Even if we want to implement this again in the compiler, the previous
  implementation is pretty broken. It talks about relocations that are
  "resolved by the static linker". If they are resolved, there are none
  left for the prelinker. What one needs to track is if an expression
  will require only dynamic relocations that point to the same DSO.

At this point it looks like the prelinker is an historical curiosity.
For example, fedora has retired it because it failed to build for two
releases
(http://pkgs.fedoraproject.org/cgit/prelink.git/commit/?id=eb43100a8331d91c801ee3dcdb0a0bb9babfdc1f)

This patch removes support for it. That is, it stops printing the
".local" sections.

llvm-svn: 253280
2015-11-17 00:51:23 +00:00
Matthias Braun fe9d6f211f Assume lane masks are always precise
Allowing imprecise lane masks in case of more than 32 sub register lanes
lead to some tricky corner cases, and I need another bugfix for another
one. Instead I rather declare lane masks as precise and let tablegen
abort if we do not have enough bits.

This does not affect any in-tree target, even AMDGPU only needs 16 lanes
at the moment. If the 32 lanes turn out to be a problem in the future,
then we can easily change the LaneBitmask typedef to uint64_t.

Differential Revision: http://reviews.llvm.org/D14557

llvm-svn: 253279
2015-11-17 00:50:55 +00:00
Keno Fischer b011c63d19 [DIBuilder] Make createReferenceType take size and align
Summary: Since we're passing references to dbg.value as pointers,
we need to have the frontend properly declare their sizes and
alignments (as it already does for regular pointers) in preparation
for my upcoming patch to have the verifer check that the sizes agree.

Also augment the backend logic that skips actually emitting this
information into DWARF such that it also handles reference types.

Reviewers: aprantl, dexonsmith, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14275

llvm-svn: 253186
2015-11-16 07:57:32 +00:00
Akira Hatanaka b11ef0897c Reduce the size of MCRelaxableFragment.
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.

This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
 
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.

rdar://problem/21736951

Differential Revision: http://reviews.llvm.org/D14346

llvm-svn: 253127
2015-11-14 06:35:56 +00:00
Quentin Colombet 2cdcfd23cd [ShrinkWrapping] Disable the optimization for functions with sanitize like
attribute.

Even if the target supports shrink-wrapping, the prologue and epilogue
must not move because a crash can happen anywhere and sanitizers need
to be able to unwind from the PC of the crash.

llvm-svn: 253116
2015-11-14 01:55:17 +00:00
Matthias Braun e6edd48d69 MachineScheduler: Print initial pressure in debug dump
llvm-svn: 253097
2015-11-13 22:30:31 +00:00
Matthias Braun 3b099db61d MachineScheduler: Improve debug output for "only one node in readyset"
When there is only 1 node left in the ready queue and it is picked call
the reason "ONLY1" instead of "NOCAND".

llvm-svn: 253096
2015-11-13 22:30:29 +00:00
James Molloy bb1dbf530a [SDAG] Fix expansion of BITREVERSE
Richard Trieu noted that UBSan detected an overflowing shift, and the obvious fix caused a crash.

What was happening was that the shiftee (1U) was indeed too small for the possible range of shifts it had to handle, but also we were using "VT.getSizeInBits()" to get the maximum type bitwidth, but we wanted "VT.getScalarSizeInBits()" to get the vector lane size instead of the entire vector size.

Use an APInt for the shift and VT.getScalarSizeInBits().

llvm-svn: 253023
2015-11-13 10:02:36 +00:00
Sanjoy Das ac9c5b1901 [ImplicitNulls] Add some clarifying comments; NFC
llvm-svn: 253020
2015-11-13 08:14:00 +00:00
Tom Stellard 0967c91e0c Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

llvm-svn: 252956
2015-11-12 21:43:25 +00:00
Sanjoy Das e8b81649cf [ImplicitNulls] Fix wrapping by breaking up a condition, NFC
llvm-svn: 252947
2015-11-12 20:51:49 +00:00
Sanjoy Das edc394f1ed [ImplicitNull] Extract out a HazardDetector class, NFC
This will make later functional changes easier to follow.

llvm-svn: 252946
2015-11-12 20:51:44 +00:00
Quentin Colombet aeb85934b6 [ShrinkWrap] Fix a typo in a comment.
llvm-svn: 252918
2015-11-12 18:16:27 +00:00
Quentin Colombet 94dc1e0d34 [ShrinkWrap] Make sure we do not mess up with EH funclet lowering.
ShrinkWrapping does not understand exception handling constraints for now, so
make sure we do not mess with them by aborting on functions that use EH
funclets.

llvm-svn: 252917
2015-11-12 18:13:42 +00:00
Andrew Kaylor fb16a3ac9a [WinEH] Fix problem with removing an element from a SetVector while iterating.
Patch provided by Yaron Keren. (Thanks!)

llvm-svn: 252913
2015-11-12 17:36:03 +00:00
James Molloy 90111f79f9 [SDAG] Introduce a new BITREVERSE node along with a corresponding LLVM intrinsic
Several backends have instructions to reverse the order of bits in an integer. Conceptually matching such patterns is similar to @llvm.bswap, and it was mentioned in http://reviews.llvm.org/D14234 that it would be best if these patterns were matched in InstCombine instead of reimplemented in every different target.

This patch introduces an intrinsic @llvm.bitreverse.i* that operates similarly to @llvm.bswap. For plumbing purposes there is also a new ISD node ISD::BITREVERSE, with simple expansion and promotion support.

The intention is that InstCombine's BSWAP detection logic will be extended to support BITREVERSE too, and @llvm.bitreverse intrinsics emitted (if the backend supports lowering it efficiently).

llvm-svn: 252878
2015-11-12 12:29:09 +00:00
Matthias Braun b9610a6bc2 LegalizeDAG: Fix and improve FCOPYSIGN/FABS legalization
- Factor out code to query and modify the sign bit of a floatingpoint
  value as an integer. This also works if none of the targets integer
  types is big enough to hold all bits of the floatingpoint value.

- Legalize FABS(x) as FCOPYSIGN(x, 0.0) if FCOPYSIGN is available,
  otherwise perform bit manipulation on the sign bit. The previous code
  used "x >u 0 ? x : -x" which is incorrect for x being -0.0! It also
  takes 34 instructions on ARM Cortex-M4. With this patch we only
  require 5:
    vldr d0, LCPI0_0
    vmov r2, r3, d0
    lsrs r2, r3, #31
    bfi r1, r2, #31, #1
    bx lr
  (This could be further improved if the compiler would recognize that
   r2, r3 is zero).

- Only lower FCOPYSIGN(x, y) = sign(x) ? -FABS(x) : FABS(x) if FABS is
  available otherwise perform bit manipulation on the sign bit.

- Perform the sign(x) test by masking out the sign bit and comparing
  with 0 rather than shifting the sign bit to the highest position and
  testing for "<s 0". For x86 copysignl (on 80bit values) this gets us:
    testl $32768, %eax
  rather than:
    shlq $48, %rax
    sets %al
    testb %al, %al

Differential Revision: http://reviews.llvm.org/D11172

llvm-svn: 252839
2015-11-12 01:02:47 +00:00
Reid Kleckner b9204a584c [WinEH] Don't forward branches across empty EH pad BBs
For really simple SEH catchpads, we tried to forward the invoke unwind
edge across the empty block.

llvm-svn: 252822
2015-11-11 23:09:31 +00:00
Geoff Berry 2ddfc5e60f [DAGCombiner] Improve zextload optimization.
Summary:
Don't fold
  (zext (and (load x), cst)) -> (and (zextload x), (zext cst))
if
  (and (load x) cst)
will match as a zextload already and has additional users.

For example, the following IR:

  %load = load i32, i32* %ptr, align 8
  %load16 = and i32 %load, 65535
  %load64 = zext i32 %load16 to i64
  store i32 %load16, i32* %dst1, align 4
  store i64 %load64, i64* %dst2, align 8

used to produce the following aarch64 code:

	ldr		w8, [x0]
	and	w9, w8, #0xffff
	and	x8, x8, #0xffff
	str		w9, [x1]
	str		x8, [x2]

but with this change produces the following aarch64 code:

	ldrh		w8, [x0]
	str		w8, [x1]
	str		x8, [x2]

Reviewers: resistor, mcrosier

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14340

llvm-svn: 252789
2015-11-11 19:42:52 +00:00
Matt Arsenault d8fed1b793 Add target preference for GatherAllAliases max depth
llvm-svn: 252775
2015-11-11 18:44:33 +00:00
Dehao Chen 54511353e3 clang-format lib/CodeGen/AsmPrinter/DwarfCompileUnit.cpp
llvm-svn: 252769
2015-11-11 18:09:47 +00:00
Dehao Chen 72fdf444b7 Emit discriminator for inlined callsites.
Summary: Inlined callsites need to be emitted in debug info so that sample profile can be annotated to the correct inlined instance.

Reviewers: dnovillo, dblaikie

Subscribers: dblaikie, llvm-commits

Differential Revision: http://reviews.llvm.org/D14511

llvm-svn: 252768
2015-11-11 18:08:18 +00:00
Matthias Braun 2c98d0f477 MachineInstr: addRegisterDefReadUndef() => setRegisterDefReadUndef()
This way we can not only add but also remove read undef flags.

llvm-svn: 252678
2015-11-11 00:41:58 +00:00
Matthias Braun 4353b30542 TableGen: Emit LaneMask for register classes without subregisters as ~0u
This makes it slightly easier to handle classes with and without
subregister uniformly.

llvm-svn: 252671
2015-11-10 23:23:05 +00:00
Sanjay Patel 33ec5dbe35 less indent; NFCI
llvm-svn: 252643
2015-11-10 20:09:02 +00:00
Matt Arsenault aa118e299c LegalizeDAG: Implement promote for scalar_to_vector
This allows avoiding the default Expand behavior which
introduces stack usage. Bitcast the scalar and replace
the missing elements with undef.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252632
2015-11-10 18:48:11 +00:00
Matt Arsenault a46aa641f2 LegalizeDAG: Implement promote for insert_vector_elt
This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252631
2015-11-10 18:48:08 +00:00
Matt Arsenault 0b7958a59b LegalizeDAG: Implement promote for extract_vector_elt
This is for AMDGPU to implement v2i64 extract as extract of
half of a v4i32.

This is covered by existing tests and used by a future
commit which makes 64-bit vectors legal types on AMDGPU.

llvm-svn: 252630
2015-11-10 18:48:04 +00:00
Sanjay Patel 766589efdc add 'MustReduceDepth' as an objective/cost-metric for the MachineCombiner
This is one of the problems noted in PR25016:
https://llvm.org/bugs/show_bug.cgi?id=25016
and:
http://lists.llvm.org/pipermail/llvm-dev/2015-October/090998.html

The spilling problem is independent and not addressed by this patch.

The MachineCombiner was doing reassociations that don't improve or even worsen the critical path. 
This is caused by inclusion of the "slack" factor when calculating the critical path of the original
code sequence. If we don't add that, then we have a more conservative cost comparison of the old code
sequence vs. a new sequence. The more liberal calculation must be preserved, however, for the AArch64
MULADD patterns because benchmark regressions were observed without that.

The two failing test cases now have identical asm that does what we want:
a + b + c + d ---> (a + b) + (c + d)

Differential Revision: http://reviews.llvm.org/D13417

llvm-svn: 252616
2015-11-10 16:48:53 +00:00
Andy Ayers 809cbe9ea0 Support for emitting inline stack probes
For CoreCLR on Windows, stack probes must be emitted as inline sequences that probe successive stack pages
between the current stack limit and the desired new stack pointer location. This implements support for
the inline expansion on x64.

For in-body alloca probes, expansion is done during instruction lowering. For prolog probes, a stub call
is initially emitted during prolog creation, and expanded after epilog generation, to avoid complications
that arise when introducing new machine basic blocks during prolog and epilog creation.

Added a new test case, modified an existing one to exclude non-x64 coreclr (for now).

Add test case

Fix tests

llvm-svn: 252578
2015-11-10 01:50:49 +00:00
Matt Arsenault 6d87f28afd Remove unnecessary call to getAllocatableRegClass
I'm not sure what the point of this was. I'm not sure why
you would ever define an instruction that produces an unallocatable
register class. No tests fail with this removed, and it seems like
it should be a verifier error to define such an instruction.

This was problematic for AMDGPU because it would make bad decisions
by arbitrarily changing the register class when unsetting isAllocatable
for VS_32/VS_64, which is currently set as a workaround to this problem.

AMDGPU uses the VS_32/VS_64 register classes to represent operands which
can use either VGPRs or SGPRs. When  isAllocatable is unset for these,
this would need to pick  either the SGPR or VGPR class and insert either
a copy we don't want, or an illegal copy we would need to deal with
later. A semi-arbitrary register class ordering decision is made in tablegen,
which resulted in always picking a VGPR class because it happens to have
more registers than the SGPR register class. We really just want to
use whatever register class the original register had.

llvm-svn: 252565
2015-11-10 00:30:14 +00:00
Matthias Braun 7e624d5f11 MachineVerifier: Streamline live interval related error reporting
Simply perform additional report_context() calls after a report()
instead of adding more and more overloaded variations of report().  Also
improve several instances where information was output in an ad-hoc way
probably because no matching report() overload was available.

llvm-svn: 252552
2015-11-09 23:59:33 +00:00
Matthias Braun 716b43306b MachineVerifier: Add missing linebreak
MachineInstr::print() with SkipOppers==true does not produce a
linebreak, so we have to do that in MachineVerifier::report().

llvm-svn: 252551
2015-11-09 23:59:29 +00:00
Matthias Braun 45718db0a1 MachineVerifier: MI::print has no TargetMachine overload
The code was passing a target machine pointer which degraded to a true
operand to SkipOppers.

llvm-svn: 252550
2015-11-09 23:59:25 +00:00
Matthias Braun 42b4b63056 MachineVerifier: print list of live intervals if available
llvm-svn: 252549
2015-11-09 23:59:23 +00:00
Sanjay Patel 533c10c651 add a SelectionDAG method to check if no common bits are set in two nodes; NFCI
This was suggested in:
http://reviews.llvm.org/D13956

and is a follow-on to:
http://reviews.llvm.org/rL252515
http://reviews.llvm.org/rL252519

This lets us remove logically equivalent/duplicated code from DAGCombiner and X86ISelDAGToDAG.

A corresponding function for IR instructions already exists in ValueTracking.

llvm-svn: 252539
2015-11-09 23:31:38 +00:00
David Majnemer 2652b75700 [WinEH] Don't emit CATCHRET from visitCatchPad
Instead, emit a CATCHPAD node which will get selected to a target
specific sequence.

llvm-svn: 252528
2015-11-09 23:07:48 +00:00
Reid Kleckner 64b003f05d [WinEH] Tweak funclet prologue/epilogue insertion to pass verifier
For some reason we'd never run MachineVerifier on WinEH code, and you
explicitly have to ask for it with llc. I added it to a few test cases
to get some coverage.

Fixes PR25461.

llvm-svn: 252512
2015-11-09 21:04:00 +00:00
Andrew Kaylor fdd48fa1e1 [WinEH] Re-committing r252249 (Clone funclets with multiple parents) with additional fixes for determinism problems
Differential Revision: http://reviews.llvm.org/D14454

llvm-svn: 252508
2015-11-09 19:59:02 +00:00
Oliver Stannard 563585789c [CodeGen] Always promote f16 if not legal
We don't currently have any runtime library functions for operations on
f16 values (other than conversions to and from f32 and f64), so we
should always promote it to f32, even if that is not a legal type. In
that case, the f32 values would be softened to f32 library calls.

SoftenFloatRes_FP_EXTEND now needs to check the promoted operand's type,
as it may ne a no-op or require a different library call.

getCopyFromParts and getCopyToParts now need to cope with a
floating-point value stored in a larger integer part, as is the case for
any target that needs to store an f16 value in a 32-bit integer
register.

Differential Revision: http://reviews.llvm.org/D12856

llvm-svn: 252459
2015-11-09 11:03:18 +00:00
Yaron Keren 9ffee46d45 Erase unused FunctionDIs variables after r252219.
llvm-svn: 252401
2015-11-07 10:21:25 +00:00
Joseph Tremoulet f748c8937e [WinEH] Update exception pointer registers
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.

Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.

Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.


Reviewers: pgavlin, majnemer, rnk

Subscribers: jyknight, dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D14344

llvm-svn: 252383
2015-11-07 01:11:31 +00:00
Tom Stellard 05691a678e DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload
Reviewers: resistor, arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13805

llvm-svn: 252349
2015-11-06 21:58:37 +00:00
Quentin Colombet 9a8efc08d3 [ShrinkWrapping] Teach shrink-wrapping how to analyze RegMask.
Previously we were conservatively assuming that RegMask operands clobber
callee saved registers.

llvm-svn: 252341
2015-11-06 21:00:13 +00:00
Matthias Braun 9198c671e8 MachineScheduler: Add regpressure information to debug dump
llvm-svn: 252340
2015-11-06 20:59:02 +00:00
Reid Kleckner b8fd162fc5 [WinEH] Mark funclet entries and exits as clobbering all registers
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.

Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer

Subscribers: MatzeB, llvm-commits

Differential Revision: http://reviews.llvm.org/D14407

llvm-svn: 252318
2015-11-06 17:06:38 +00:00
NAKAMURA Takumi 9947cacebf Revert r252249 (and r252255, r252258), "[WinEH] Clone funclets with multiple parents"
It behaved flaky due to iterating pointer key values on std::set and std::map.

llvm-svn: 252279
2015-11-06 10:07:33 +00:00
Reid Kleckner e535c1f856 Range-for some LiveIntervals code under review
llvm-svn: 252267
2015-11-06 02:01:02 +00:00
Andrew Kaylor f477585a2b Fix build warnings
llvm-svn: 252255
2015-11-06 01:08:35 +00:00
Andrew Kaylor 29cd576554 [WinEH] Clone funclets with multiple parents
Windows EH funclets need to always return to a single parent funclet.  However, it is possible for earlier optimizations to combine funclets (probably based on one funclet having an unreachable terminator) in such a way that this condition is violated.

These changes add code to the WinEHPrepare pass to detect situations where a funclet has multiple parents and clone such funclets, fixing up the unwind and catch return edges so that each copy of the funclet returns to the correct parent funclet.

Differential Revision: http://reviews.llvm.org/D13274?id=39098

llvm-svn: 252249
2015-11-06 00:20:50 +00:00
Peter Collingbourne d4bff30370 DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

llvm-svn: 252219
2015-11-05 22:03:56 +00:00
Reid Kleckner 6ddae31045 [WinEH] Fix funclet prologues with stack realignment
We already had a test for this for 32-bit SEH catchpads, but those don't
actually create funclets. We had a bug that only appeared in funclet
prologues, where we would establish EBP and ESI as our FP and BP, and
then downstream prologue code would overwrite them.

While I was at it, I fixed Win64+funclets+stackrealign. This issue
doesn't come up as often there due to the ABI requring 16 byte stack
alignment, but now we can rest easy that AVX and WinEH will work well
together =P.

llvm-svn: 252210
2015-11-05 21:09:49 +00:00
Sanjay Patel 387e66e79f replace MachineCombinerPattern namespace and enum with enum class; NFCI
Also, remove an enum hack where enum values were used as indexes into an array.

We may want to make this a real class to allow pattern-based queries/customization (D13417).

llvm-svn: 252196
2015-11-05 19:34:57 +00:00
Eugene Zelenko ffec81ca00 Fix some Clang-tidy modernize warnings, other minor fixes.
Fixed warnings are: modernize-use-override, modernize-use-nullptr and modernize-redundant-void-arg.

Differential revision: http://reviews.llvm.org/D14312

llvm-svn: 252087
2015-11-04 22:32:32 +00:00
Cong Hou 23a3bf0147 Add new interfaces to MBB for manipulating successors with probabilities instead of weights. NFC.
This is part-1 of the patch that replaces all edge weights in MBB by
probabilities, which only adds new interfaces. No functional changes.

Differential revision: http://reviews.llvm.org/D13908

llvm-svn: 252083
2015-11-04 21:37:58 +00:00
Igor Laevsky 35fe692025 [StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in having invoke safepoints handled differently than the
call safepoints. All relevant decisions could be made by looking at whether
or not gc.result and gc.relocate lay in a same basic block. This change will
 allow to lower call safepoints with relocates and results in a different 
basic blocks. See test case for example.

Differential Revision: http://reviews.llvm.org/D14158

llvm-svn: 252028
2015-11-04 01:16:10 +00:00
Peter Collingbourne 94d778697a CodeGen, Target: Move Mach-O-specific symbol name logic to Mach-O lowering.
A profile of an LTO link of Chrome revealed that we were spending some
~30-50% of execution time in the function Constant::getRelocationInfo(),
which is called from TargetLoweringObjectFile::getKindForGlobal() and in turn
from TargetMachine::getNameWithPrefix().

It turns out that we only need the result of getKindForGlobal() when
targeting Mach-O, so this change moves the relevant part of the logic to
TargetLoweringObjectFileMachO.

NFCI.

Differential Revision: http://reviews.llvm.org/D14168

llvm-svn: 252014
2015-11-03 23:40:03 +00:00
Simon Pilgrim 191ac7c679 [SelectionDAG] Use existing constant nodes instead of recreating them. NFC.
llvm-svn: 251990
2015-11-03 22:21:38 +00:00
Rafael Espindola 43e2e251ea Delete dead code.
llvm-svn: 251960
2015-11-03 18:55:58 +00:00
Igor Laevsky f637b4a52e [CodegenPrepare] Do not rematerialize gc.relocates across different basic blocks
Differential Revision: http://reviews.llvm.org/D14258

llvm-svn: 251957
2015-11-03 18:37:40 +00:00
Michael Kuperstein 73dc85293f [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

llvm-svn: 251904
2015-11-03 08:17:25 +00:00
Matthias Braun 6f4ed269b9 RegisterPressure: Improve assert message
llvm-svn: 251885
2015-11-03 01:53:36 +00:00
Matthias Braun 11859b5c8f RegisterPressure: Slightly nicer pressure diff dumping
llvm-svn: 251884
2015-11-03 01:53:33 +00:00
Matthias Braun 93563e7032 ScheduleDAGInstrs: Remove IsPostRA flag; NFC
ScheduleDAGInstrs doesn't behave differently before or after register
allocation. It was only used in a method of MachineSchedulerBase which
behaved differently in MachineScheduler/PostMachineScheduler. Change
this to let MachineScheduler/PostMachineScheduler just pass in a
parameter to that function.

The order of the LiveIntervals* and bool RemoveKillFlags paramters have
been switched to make out-of-tree code fail instead of unintentionally
passing a value intended for the IsPostRA flag to the (previously
following and default initialized) RemoveKillFlags.

Differential Revision: http://reviews.llvm.org/D14245

llvm-svn: 251883
2015-11-03 01:53:29 +00:00
Sanjay Patel 0ed9aeaa5f [CGP] widen switch condition and case constants to target's register width (2nd try)
This is a redo of r251849 except the tests have been split into arch-specific folders
to hopefully make the bots happy.

This is a follow-up from the discussion in D12965. The block-at-a-time limitation of
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1:
  cmpwi  3, 99
  bgt    0, .LBB0_9
 BB#2:
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#3:
  cmplwi         4, 10
  bne    0, .LBB0_12
 BB#4:
  li 3, 1
  blr
.LBB0_5:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 65436
  beq    0, .LBB0_13
 BB#6:
  cmplwi         3, 65526
  beq    0, .LBB0_15
 BB#7:
  cmplwi         3, 65535
  bne    0, .LBB0_12
 BB#8:
  li 3, 4
  blr
.LBB0_9:
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi         3, 100
  beq    0, .LBB0_14
...

After:
BB#0:
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi  4, 999
  ble 0, .LBB0_5
 BB#1:
  lis 3, 0
  ori 3, 3, 65525
  cmpw   4, 3
  bgt    0, .LBB0_9
 BB#2:
  cmplwi         4, 1000
  beq    0, .LBB0_14
 BB#3:
  cmplwi         4, 65436
  bne    0, .LBB0_13
 BB#4:
  li 3, 6
  blr
.LBB0_5:
  li 3, 0
  cmplwi         4, 1
  beqlr 0
 BB#6:
  cmplwi         4, 10
  beq    0, .LBB0_12
 BB#7:
  cmplwi         4, 100
  bne    0, .LBB0_13
 BB#8:
  li 3, 2
  blr
.LBB0_9:
  cmplwi         4, 65526
  beq    0, .LBB0_15
 BB#10:
  cmplwi         4, 65535
  bne    0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532

llvm-svn: 251857
2015-11-02 23:22:49 +00:00
Sanjay Patel dfc825eb36 revert r251849; need to move tests to arch-specific folders
llvm-svn: 251851
2015-11-02 23:05:20 +00:00
Sanjay Patel b90a078de9 [CGP] widen switch condition and case constants to target's register width
This is a follow-up from the discussion in D12965. The block-at-a-time limitation of 
SelectionDAG also came up in D13297.

Without the InstCombine change from D12965, I don't expect this patch to make any 
difference in the real world because InstCombine does not shrink cases like this in
visitSwitchInst(). But we need to have this CGP safety harness in place before
proceeding with any shrinkage in D12965, so we won't generate extra extends for compares.

I've opted for IR regression tests in the patch because that seems like a clearer way to
test the transform, but PowerPC CodeGen for an i16 widening test is shown below. x86
will need more work to solve: https://llvm.org/bugs/show_bug.cgi?id=22473

Before:
BB#0:
  mr 4, 3
  extsh. 3, 4
  ble 0, .LBB0_5
 BB#1: 
  cmpwi	 3, 99
  bgt	 0, .LBB0_9
 BB#2:            
  rlwinm 4, 4, 0, 16, 31      <--- 32-bit mask/extend
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#3:            
  cmplwi	 4, 10
  bne	 0, .LBB0_12
 BB#4:                      
  li 3, 1
  blr
.LBB0_5:                             
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 65436
  beq	 0, .LBB0_13
 BB#6:                            
  cmplwi	 3, 65526
  beq	 0, .LBB0_15
 BB#7:                       
  cmplwi	 3, 65535
  bne	 0, .LBB0_12
 BB#8:                       
  li 3, 4
  blr
.LBB0_9:                       
  rlwinm 3, 4, 0, 16, 31      <--- 32-bit mask/extend
  cmplwi	 3, 100
  beq	 0, .LBB0_14
...

After:
BB#0:        
  rlwinm 4, 3, 0, 16, 31      <--- mask/extend to 32-bit and then use that for comparisons
  cmpwi	 4, 999
  ble 0, .LBB0_5
 BB#1:          
  lis 3, 0
  ori 3, 3, 65525
  cmpw	 4, 3
  bgt	 0, .LBB0_9
 BB#2:         
  cmplwi	 4, 1000
  beq	 0, .LBB0_14
 BB#3:    
  cmplwi	 4, 65436
  bne	 0, .LBB0_13
 BB#4:       
  li 3, 6
  blr
.LBB0_5:   
  li 3, 0
  cmplwi	 4, 1
  beqlr 0
 BB#6: 
  cmplwi	 4, 10
  beq	 0, .LBB0_12
 BB#7:             
  cmplwi	 4, 100
  bne	 0, .LBB0_13
 BB#8:             
  li 3, 2
  blr
.LBB0_9:       
  cmplwi	 4, 65526
  beq	 0, .LBB0_15
 BB#10:      
  cmplwi	 4, 65535
  bne	 0, .LBB0_13
...


Differential Revision: http://reviews.llvm.org/D13532

llvm-svn: 251849
2015-11-02 22:46:24 +00:00
Cong Hou b90b9e0531 In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.
In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality.

Consider a simple example shown below:

A
|
B⇆C
|
D

When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D.

This patch filters those cold blocks off from the loop chain by comparing the ratio:

LoopBBFreq / LoopFreq

to 20%: if it is less than 20%, we don't include this BB to the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general we have more cold blocks when the loop has few iterations. And vice versa.


Differential revision: http://reviews.llvm.org/D11662

llvm-svn: 251833
2015-11-02 21:24:00 +00:00
James Y Knight 646c4032e7 Fix two issues in MergeConsecutiveStores:
1) PR25154. This is basically a repeat of PR18102, which was fixed in
r200201, and broken again by r234430. The latter changed which of the
store nodes was merged into from the first to the last. Thus, we now
also need to prefer merging a later store at a given address into the
target node, instead of an earlier one.

2) While investigating that, I also realized I'd introduced a bug in
r236850. There, I removed a check for alignment -- not realizing that
nothing except the alignment check was ensuring that none of the stores
were overlapping! This is a really bogus way to ensure there's no
aliased stores.

A better solution to both of these issues is likely to always use the
code added in the 'if (UseAA)' branches which rearrange the chain based
on a more principled analysis. I'll look into whether that can be used
always, but in the interest of getting things back to working, I think a
minimal change makes sense.

llvm-svn: 251816
2015-11-02 18:48:08 +00:00
Jonas Paulsson 72640f1c9f [MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves.
Since the verifier will give false reports if it incorrectly thinks MI is
loading or storing using an FI, it is necessary to scan memoperands and
find out how the FI is used in the instruction. This should be relatively
rare.

Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag.

Reviewed by Quentin Colombet.

llvm-svn: 251620
2015-10-29 08:28:35 +00:00
Matthias Braun f2f194455f Revert "ScheduleDAGInstrs: Remove IsPostRA flag"
It broke 3 arm testcases.

This reverts commit r251608.

llvm-svn: 251615
2015-10-29 05:06:41 +00:00
Matthias Braun dc7580aa88 MachineScheduler: Fix typo in debug message
Maybe I just missed the humor there ;-)

llvm-svn: 251609
2015-10-29 03:57:28 +00:00
Matthias Braun 7ffadd0087 ScheduleDAGInstrs: Remove IsPostRA flag
This was a layering violation in ScheduleDAGInstrs (and
MachineSchedulerBase) they both shouldn't know directly whether they are
used by the PostMachineScheduler or the MachineScheduler.

llvm-svn: 251608
2015-10-29 03:57:24 +00:00
Matthias Braun b0c437bc76 MachineScheduler: Use ranged for and slightly simplify the code
llvm-svn: 251607
2015-10-29 03:57:17 +00:00
Tim Northover 2d4d161519 ARM: support .watchos_version_min and .tvos_version_min.
These MachO file directives are used by linkers and other tools to provide
compatibility information, much like the existing .ios_version_min and
.macosx_version_min.

llvm-svn: 251569
2015-10-28 22:36:05 +00:00
Sanjoy Das 1d1929aace [ValueTracking] Use !range metadata more aggressively in KnownBits
Summary:
Teach `computeKnownBitsFromRangeMetadata` to use `!range` metadata more
aggressively.

Reviewers: majnemer, nlewycky, jingyue

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14100

llvm-svn: 251487
2015-10-28 03:20:15 +00:00
Sanjoy Das 4ff3cf6d92 [SelectionDAG] Don't inspect !range metadata for extended loads
Summary:
Don't call `computeKnownBitsFromRangeMetadata` for extended loads --
this can cause a mismatch between the width of the !range metadata and
the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the
future).  This isn't a problem now, but will be after a future change.

Note: this can be made more aggressive in the future.

Reviewers: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14107

llvm-svn: 251486
2015-10-28 03:20:10 +00:00
James Y Knight 14eedd189b Make the SelectionDAG graph printer use SDNode::PersistentId labels.
r248010 changed the -debug output to use short ids, but did not
similarly modify the graph printer. Change to be consistent, for ease of
cross-reference.

llvm-svn: 251465
2015-10-27 23:09:03 +00:00
Sanjay Patel bbd4c79c8f Use the 'arcp' fast-math-flag when combining repeated FP divisors
This is a usage of the IR-level fast-math-flags now that they are propagated to SDNodes. 
This was originally part of D8900.

Removing the global 'enable-unsafe-fp-math' checks will require auto-upgrade and 
possibly other changes.

Differential Revision: http://reviews.llvm.org/D9708

llvm-svn: 251450
2015-10-27 20:27:25 +00:00
Cong Hou 07eeb8001e Create a new interface addSuccessorWithoutWeight(MBB*) in MBB to add successors when optimization is disabled.
When optimization is disabled, edge weights that are stored in MBB won't be used so that we don't have to store them. Currently, this is done by adding successors with default weight 0, and if all successors have default weights, the weight list will be empty. But that the weight list is empty doesn't mean disabled optimization (as is stated several times in MachineBasicBlock.cpp): it may also mean all successors just have default weights.

We should discourage using default weights when adding successors, because it is very easy for users to forget update the correct edge weights instead of using default ones (one exception is that the MBB only has one successor). In order to detect such usages, it is better to differentiate using default weights from the case when optimizations is disabled.

In this patch, a new interface addSuccessorWithoutWeight(MBB*) is created for when optimization is disabled. In this case, MBB will try to maintain an empty weight list, but it cannot guarantee this as for many uses of addSuccessor() whether optimization is disabled or not is not checked. But it can guarantee that if optimization is enabled, then the weight list always has the same size of the successor list.

Differential revision: http://reviews.llvm.org/D13963

llvm-svn: 251429
2015-10-27 17:59:36 +00:00
Mehdi Amini 891c0973df Do not use "else" when both branches return (NFC)
From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 251398
2015-10-27 08:12:08 +00:00
Steve King fee370be72 Fix llc crash processing S/UREM for -Oz builds caused by rL250825.
When taking the remainder of a value divided by a constant, visitREM()
attempts to convert the REM to a longer but faster sequence of instructions.
This conversion calls combine() on a speculative DIV instruction. Commit
rL250825 may cause this combine() to return a DIVREM, corrupting nearby nodes.
Flow eventually hits unreachable().

This patch adds a test case and a check to prevent visitREM() from trying
to convert the REM instruction in cases where a DIVREM is possible.
See http://reviews.llvm.org/D14035

llvm-svn: 251373
2015-10-27 00:14:06 +00:00
Ivan Krasin 465fbe25c4 Fix indents. It's a follow up to r251353.
llvm-svn: 251364
2015-10-26 22:35:40 +00:00
Ivan Krasin 298639a5fd Move imported entities into DwarfCompilationUnit to speed up LTO linking.
Summary:
In particular, this CL speeds up the official Chrome linking with LTO by
1.8x.

See more details in https://crbug.com/542426

Reviewers: dblaikie

Subscribers: jevinskie

Differential Revision: http://reviews.llvm.org/D13918

llvm-svn: 251353
2015-10-26 21:36:35 +00:00
David Blaikie 7b54b525cd Remove assert(false) in favor of asserting the if conditional it is contained within.
Also adjust the code to avoid 3 redundant map lookups.

llvm-svn: 251327
2015-10-26 18:41:13 +00:00
Evgeniy Stepanov d1aad26589 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

The previous iteration of this change was reverted in r250461. This
version leaves the generic, compiler-rt based implementation in
SafeStack.cpp instead of moving it to TargetLoweringBase in order to
allow testing without a TargetMachine.

llvm-svn: 251324
2015-10-26 18:28:25 +00:00
Elena Demikhovsky 092858588a Scalarizer for masked.gather and masked.scatter intrinsics.
When the target does not support these intrinsics they should be converted to a chain of scalar load or store operations.
If the mask is not constant, the scalarizer will build a chain of conditional basic blocks.
I added isLegalMaskedGather() isLegalMaskedScatter() APIs.

Differential Revision: http://reviews.llvm.org/D13722

llvm-svn: 251237
2015-10-25 15:37:55 +00:00
Michael Kuperstein eaa16005af [X86] Use correct calling convention for MCU psABI libcalls
When using the MCU psABI, compiler-generated library calls should pass
some parameters in-register. However, since inreg marking for x86 is currently
done by the front end, it will not be applied to backend-generated calls.

This is a workaround for PR3997, which describes a similar issue for -mregparm.

Differential Revision: http://reviews.llvm.org/D13977

llvm-svn: 251223
2015-10-25 08:14:05 +00:00
Rafael Espindola 84921b9860 Refactor: Simplify boolean conditional return statements in lib/CodeGen.
Patch by Richard.

llvm-svn: 251213
2015-10-24 23:11:13 +00:00
Simon Pilgrim 3448cbcc51 [DAGCombiner] Tidy up ConstantFP commutation. NFCI
Move ConstantFP canonicalization of commutative instructions to start of 2-op node creation (matches integer) - simplifies constant folding code.

llvm-svn: 251203
2015-10-24 20:06:18 +00:00
Simon Pilgrim 7430804fe1 [DAGCombiner] Generalize masking of constant rotates.
We don't need a mask of a rotation result to be a constant splat - any constant scalar/vector can be usefully folded.

Followup to D13851.

llvm-svn: 251197
2015-10-24 18:44:52 +00:00
Simon Pilgrim d5ef318b5b [X86][XOP] Add support for lowering vector rotations
This patch adds support for lowering to the XOP VPROT / VPROTI vector bit rotation instructions.

This has required changes to the DAGCombiner rotation pattern matching to support vector types - so far I've only changed it to support splat vectors, but generalising this further is feasible in the future.

Differential Revision: http://reviews.llvm.org/D13851

llvm-svn: 251188
2015-10-24 13:17:26 +00:00
Joseph Tremoulet 3d0fbf1d74 [CodeGen] Mark setjmp/catchret MBBs address-taken
Summary:
This ensures that BranchFolding (and similar) won't remove these blocks.

Also allow AsmPrinter::EmitBasicBlockStart to process MBBs which are
address-taken but do not have BBs that are address-taken, since otherwise
its call to getAddrLabelSymbolTableToEmit would fail an assertion on such
blocks.  I audited the other callers of getAddrLabelSymbolTableToEmit
(and getAddrLabelSymbol); they all have BBs known to be address-taken
except for the call through getAddrLabelSymbol from
WinException::create32bitRef; that call is actually now unreachable, so
I've removed it and updated the signature of create32bitRef.

This fixes PR25168.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13774

llvm-svn: 251113
2015-10-23 15:06:05 +00:00
Davide Italiano fbb958c24b [CodeGen] Remove usage of NDEBUG in header.
Moreover, this seems unused.

llvm-svn: 251081
2015-10-23 00:17:40 +00:00
Matthias Braun 61f4d6439c MachineScheduler: Add a way to disable the 'ReduceLatency' heuristic
llvm-svn: 251037
2015-10-22 18:07:31 +00:00
Craig Topper 8fe40e0ed5 Change makeLibCall to take an ArrayRef<SDValue> instead of pointer and size. This removes the need to pass a hardcoded size in many places. NFC
llvm-svn: 251032
2015-10-22 17:05:00 +00:00
Zia Ansari 8f509a7044 [X86] - Catch extra combine opportunities for redundant imuls.
When we fold "mul ((add x, c1), c1)" -> "add ((mul x, c2), c1*c2)", we bail if (add x, c1) has multiple
users which would result in an extra add instruction.
In such cases, this patch adds a check to see if we can eliminate a multiply instruction in exchange for the extra add.

I also added the capability of doing the existing optimization with non-splatted vectors (splatted also works).

Differential Revision: http://reviews.llvm.org/D13740

llvm-svn: 251028
2015-10-22 16:14:45 +00:00
David Majnemer a8f17871e4 [WinEH] Remove extraneous call to emitEHRegistrationOffsetLabel
It's a relic from the earlier implementation, let's remove it.

llvm-svn: 250964
2015-10-21 23:20:39 +00:00
Matt Arsenault 29f9663f97 LegalizeDAG: Implement promote for build_vector
This will be used in future commits for AMDGPU to promote
operations on i64 vectors into operations on 32-bit vector
components.

This will be used / tested in future AMDGPU commits.

llvm-svn: 250945
2015-10-21 21:10:10 +00:00
Elena Demikhovsky 3ad76a1acd Masked Load/Store optimization for scalar code
When we have to convert the masked.load, masked.store to scalar code, we generate a chain of conditional basic blocks.
I added optimization for constant mask vector.

Differential Revision: http://reviews.llvm.org/D13855

llvm-svn: 250893
2015-10-21 11:50:54 +00:00
Jonas Paulsson 17ad04535f Let MachineVerifier be aware of mem-to-mem instructions.
A mem-to-mem instruction (that both loads and stores), which store to an
FI, cannot pass the verifier since it thinks it is loading from the FI.

For the mem-to-mem instruction, do a looser check in visitMachineOperand()
and only check liveness at the reg-slot while analyzing a frame index operand.

Needed to make CodeGen/SystemZ/xor-01.ll pass with -verify-machineinstrs,
which now runs with this flag.

Reviewed by Evan Cheng and Quentin Colombet.

llvm-svn: 250885
2015-10-21 07:39:47 +00:00
Krzysztof Parzyszek fdb7b693a7 Tail duplication can mix incompatible registers in phi nodes
Do not tail duplicate blocks where the successor has a phi node,
and the corresponding value in that phi node uses a subregister.

http://reviews.llvm.org/D13922

llvm-svn: 250877
2015-10-21 02:40:06 +00:00
Artyom Skrobov c736863a85 Two switch blocks in VectorLegalizer::LegalizeOp already have a
default: llvm_unreachable("This action is not supported yet!");

-- so I'm adding one to the third switch block, too.

This is a follow-up fix for http://reviews.llvm.org/D13862

llvm-svn: 250830
2015-10-20 15:06:37 +00:00
Artyom Skrobov 7fd67e25aa Adding support for TargetLoweringBase::LibCall
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:

- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
  computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall

This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.

The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.

The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.

Reviewers: t.p.northover, rengolin

Subscribers: t.p.northover, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D13862

llvm-svn: 250826
2015-10-20 13:14:52 +00:00
Artyom Skrobov b844fa7fc0 Combining DIV+REM->DIVREM doesn't belong in LegalizeDAG; move it over into DAGCombiner.
Summary:
In addition to moving the code over, this patch amends the DIV,REM -> DIVREM
combining to run on all affected nodes at once: if the nodes are converted
to DIVREM one at a time, then the resulting DIVREM may get legalized by the
backend into something target-specific that we won't be able to recognize
and correlate with the remaining nodes.

The motivation is to "prepare terrain" for D13862: when we set DIV and REM
to be legalized to libcalls, instead of the DIVREM, we otherwise lose the
ability to combine them together. To prevent this, we need to take the
DIV,REM -> DIVREM combining out of the lowering stage.

Reviewers: RKSimon, eli.friedman, rengolin

Subscribers: john.brawn, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D13733

llvm-svn: 250825
2015-10-20 13:06:02 +00:00
Duncan P. N. Exon Smith a25ad0685a AsmPrinter: Remove implicit ilist iterator conversion, NFC
llvm-svn: 250776
2015-10-20 00:36:08 +00:00
Cong Hou 7745dbc5c4 Enhance loop rotation with existence of profile data in MachineBlockPlacement pass.
Currently, in MachineBlockPlacement pass the loop is rotated to let the best exit to be the last BB in the loop chain, to maximize the fall-through from the loop to outside. With profile data, we can determine the cost in terms of missed fall through opportunities when rotating a loop chain and select the best rotation. Basically, there are three kinds of cost to consider for each rotation:

1. The possibly missed fall through edge (if it exists) from BB out of the loop to the loop header.
2. The possibly missed fall through edges (if they exist) from the loop exits to BB out of the loop.
3. The missed fall through edge (if it exists) from the last BB to the first BB in the loop chain.

Therefore, the cost for a given rotation is the sum of costs listed above. We select the best rotation with the smallest cost. This is only for PGO mode when we have more precise edge frequencies.

Differential revision: http://reviews.llvm.org/D10717

llvm-svn: 250754
2015-10-19 23:16:40 +00:00
Sanjay Patel 69a50a1e17 [CGP] transform select instructions into branches and sink expensive operands
This was originally checked in at r250527, but reverted at r250570 because of PR25222.
There were at least 2 problems: 
1. The cost check was checking for an instruction with an exact cost of TCC_Expensive;
that should have been >=.
2. The cause of the clang stage 1 failures was illegally sinking 'call' instructions;
we can't sink instructions that may have side effects / are not safe to execute speculatively.

Fixed those conditions in sinkSelectOperand() and added test cases.

Original commit message:
This is a follow-up to the discussion in D12882.

Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations.
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).

Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its
select-formation behavior that changed with r248439.

Differential Revision: http://reviews.llvm.org/D13297

llvm-svn: 250743
2015-10-19 21:59:12 +00:00
Owen Anderson faf5187ee0 Restore the original behavior of SelectionDAG::getTargetIndex().
It looks like an extra negation snuck in as apart of restoring it.

llvm-svn: 250726
2015-10-19 19:27:40 +00:00
Benjamin Kramer 2002aadaad Put back SelectionDAG::getTargetIndex.
While technically this is untested dead code, it has out-of-tree users.
This reverts a part of r250434.

llvm-svn: 250717
2015-10-19 18:26:16 +00:00
Matthias Braun e734195ce3 Revert "RegisterPressure: allocatable physreg uses are always kills"
This reverts commit r250596.

Reverted for now as the commit triggers assert in the AMDGPU target
pending investigation.

llvm-svn: 250713
2015-10-19 17:44:22 +00:00
Elena Demikhovsky 20662e39f1 Removed parameter "Consecutive" from isLegalMaskedLoad() / isLegalMaskedStore().
Originally I planned to use the same interface for masked gather/scatter and set isConsecutive to "false" in this case.

Now I'm implementing masked gather/scatter and see that the interface is inconvenient. I want to add interfaces isLegalMaskedGather() / isLegalMaskedScatter() instead of using the "Consecutive" parameter in the existing interfaces.

Differential Revision: http://reviews.llvm.org/D13850

llvm-svn: 250686
2015-10-19 07:43:38 +00:00
Simon Pilgrim 04d52d26f6 Use SDValue bool check. NFCI.
llvm-svn: 250653
2015-10-18 12:33:54 +00:00
Simon Pilgrim c2c154e078 Move one-use variable inside test. NFC.
llvm-svn: 250651
2015-10-18 11:47:23 +00:00
Simon Pilgrim 24057b9566 [DAG] Ensure vector constant folding uses correct scalar undef types
Minor fix to D13665 found during post-commit review.

llvm-svn: 250616
2015-10-17 16:49:43 +00:00
Matthias Braun 65e6d4a3f8 RegisterPressure: Unify the sparse sets in LiveRegsSet; NFC
Also do some cleanups comment improvements.

llvm-svn: 250598
2015-10-17 01:03:44 +00:00
Matthias Braun cdd2792aa6 RegisterPressure: allocatable physreg uses are always kills
This property was already used in the code path when no liveness
intervals are present. Unfortunately the code path that uses liveness
intervals tried to query a cached live interval for an allocatable
physreg, those are usually not computed so a conservative default was
used.

This doesn't affect any of the lit testcases. This is a foreclosure to
upcoming changes which should be NFC but without this patch this tidbit
wouldn't be NFC.

llvm-svn: 250596
2015-10-17 00:46:57 +00:00
Matthias Braun 5105e05e8f RegisterPressure: Remove 0 entries from PressureChange
This should not change behaviour because as far as I can see all code
reading the pressure changes has no effect if the PressureInc is 0.
Removing these entries however does avoid unnecessary computation, and
results in a more stable debug output. I want the stable debug output to
check that some upcoming changes are indeed NFC and identical even at
the debug output level.

llvm-svn: 250595
2015-10-17 00:35:59 +00:00
Matthias Braun 96e411b90c RegisterPressure: Hide non-const iterators of PressureDiff
It is too easy to accidentally violate the ordering requirements when
modifying the PressureDiff entries through iterators.

llvm-svn: 250590
2015-10-17 00:08:48 +00:00
Joseph Tremoulet 55b51e9dcc [WinEH] Fix eh.exceptionpointer intrinsic lowering
Summary:
Some shared code for handling eh.exceptionpointer and eh.exceptioncode
needs to not share the part that truncates to 32 bits, which is intended
just for exception codes.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13747

llvm-svn: 250588
2015-10-17 00:08:08 +00:00
Reid Kleckner 28e490342b [WinEH] Fix stack alignment in funclets and ParentFrameOffset calculation
Our previous value of "16 + 8 + MaxCallFrameSize" for ParentFrameOffset
is incorrect when CSRs are involved. We were supposed to have a test
case to catch this, but it wasn't very rigorous.

The main effect here is that calling _CxxThrowException inside a
catchpad doesn't immediately crash on MOVAPS when you have an odd number
of CSRs.

llvm-svn: 250583
2015-10-16 23:43:27 +00:00
Matthias Braun fdee8ec2bd RegisterPressure: Use range based for, cleanup
llvm-svn: 250579
2015-10-16 23:25:09 +00:00
Benjamin Kramer b43d33bf0f Revert "This is a follow-up to the discussion in D12882."
Breaks clang selfhost, see PR25222. This reverts commits r250527 and r250528.

llvm-svn: 250570
2015-10-16 23:00:29 +00:00
Joseph Tremoulet d11a998e81 [WinEH] Fix CatchRetSuccessorColorMap accounting
Summary:
We now use the block for the catchpad itself, rather than its normal
successor, as the funclet entry.
Putting the normal successor in the map leads downstream funclet
membership computations to erroneous results.

Reviewers: majnemer, rnk

Subscribers: rnk, llvm-commits

Differential Revision: http://reviews.llvm.org/D13798

llvm-svn: 250552
2015-10-16 21:22:54 +00:00
David Majnemer e696583dba [WinEH] Remove dead code/includes from WinEHPrepare
No functionality change is intended.

llvm-svn: 250545
2015-10-16 19:59:52 +00:00
Joseph Tremoulet 53e9cbd95a [WinEH] Fix endpad coloring/numbering
Summary:
When a cleanup's cleanupendpad or cleanupret targets a catchendpad, stop
trying to propagate the cleanup's parent's color to the catchendpad, since
what's needed is the cleanup's grandparent's color and the catchendpad
will get that color from the catchpad linkage already.  We already had
this exclusion for invokes, but were missing it for
cleanupendpad/cleanupret.

Also add a missing line that tags cleanupendpads' states in the
EHPadStateMap, without with lowering invokes that target cleanupendpads
which unwind to other handlers (and so don't have the -1 state) will fail.

This fixes the reduced IR repro in PR25163.


Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13797

llvm-svn: 250534
2015-10-16 18:08:16 +00:00
Sanjay Patel 374dd8d88e This is a follow-up to the discussion in D12882.
Ideally, we would like SimplifyCFG to be able to form select instructions even when the operands
are expensive (as defined by the TTI cost model) because that may expose further optimizations. 
However, we would then like a later pass like CodeGenPrepare to undo that transformation if the
target would likely benefit from not speculatively executing an expensive op (this patch).

Once we have this safety mechanism in place, we can adjust SimplifyCFG to restore its 
select-formation behavior that changed with r248439.

Differential Revision: http://reviews.llvm.org/D13297

llvm-svn: 250527
2015-10-16 16:54:30 +00:00
Evgeniy Stepanov 9addbc9fc1 Revert "[safestack] Fast access to the unsafe stack pointer on AArch64/Android."
Breaks the hexagon buildbot.

llvm-svn: 250461
2015-10-15 21:26:49 +00:00
Adrian Prantl 96b1551d53 Replace a forward declaration with an #include.
When building with modules the forward-declared inner class
DebugLocStream::ListBuilder causes clang to fall over.

llvm-svn: 250459
2015-10-15 20:58:55 +00:00
Evgeniy Stepanov 142947e9f0 [safestack] Fast access to the unsafe stack pointer on AArch64/Android.
Android libc provides a fixed TLS slot for the unsafe stack pointer,
and this change implements direct access to that slot on AArch64 via
__builtin_thread_pointer() + offset.

This change also moves more code into TargetLowering and its
target-specific subclasses to get rid of target-specific codegen
in SafeStackPass.

This change does not touch the ARM backend because ARM lowers
builting_thread_pointer as aeabi_read_tp, which is not available
on Android.

llvm-svn: 250456
2015-10-15 20:50:16 +00:00
Benjamin Kramer bacc7ba7aa [SelectionDAG] Remove dead code. NFC.
Carefully selected parts without deleting graph stuff and dumping methods.

llvm-svn: 250434
2015-10-15 17:54:06 +00:00
Benjamin Kramer 7fa42c8a8c [AsmPrinter] Prune dead code. NFC.
I left all (dead) print and dump methods in place.

llvm-svn: 250433
2015-10-15 17:16:32 +00:00
Artyom Skrobov 4bca0bb010 A doccomment for CombineTo, and some NFC refactorings
Summary:
Caching SDLoc(N), instead of recreating it in every single
function call, keeps the code denser, and allows to unwrap long lines.

Reviewers: sunfish, atrick, sdmitrouk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13726

llvm-svn: 250305
2015-10-14 17:18:35 +00:00
Artyom Skrobov a5b9ad22b3 Merge DAGCombiner::visitSREM and DAGCombiner::visitUREM (NFC)
Summary: The two implementations had more code in common than not.

Reviewers: sunfish, MatzeB, sdmitrouk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13724

llvm-svn: 250302
2015-10-14 16:54:14 +00:00
Joseph Tremoulet 28c89bbb36 [WinEH] Add CoreCLR EH table emission
Summary:
Emit the handler and clause locations immediately after the standard
xdata.
Clauses are emitted in the same order and format used to communiate them
to the CLR Execution Engine.
Add a lit test to verify correct table generation on a small but
interesting example function.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D13451

llvm-svn: 250219
2015-10-13 20:18:27 +00:00
Duncan P. N. Exon Smith e400a7d412 SelectionDAG: Remove implicit ilist iterator conversions, NFC
llvm-svn: 250214
2015-10-13 19:47:46 +00:00
Joseph Tremoulet 1e2f062ec5 [WinEH] Iterate state changes instead of invokes
Summary:
Add an iterator that can walk across blocks and which visits the state
transitions rather than state ranges, with explicit transitions to -1
indicating the presence of top-level calls that may throw and cause the
current function to unwind to caller.  This will simplify code that needs
to identify nested try regions.

Refactor SEH and C++EH table generation to use the new
InvokeStateChangeIterator, and remove the InvokeLabelIterator they were
using.


Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13623

llvm-svn: 250179
2015-10-13 16:44:30 +00:00
Matt Arsenault e5d9515fb7 DAGCombiner: Don't stop finding better chain on 2 aliases
The comment says this was stopped because it was unlikely to be
profitable. This is not true if you want to combine vector loads
with multiple components.

For a simple case that looks like

t0 = load t0 ...
t1 = load t0 ...
t2 = load t0 ...
t3 = load t0 ...

t4 = store t0:1, t0:1
  t5 = store t4, t1:0
    t6 = store t5, t2:0
	  t7 = store t6, t3:0

We want to get all of these stores onto a chain
that is a TokenFactor of these N loads. This mostly
solves the AMDGPU merge-stores.ll regressions
with -combiner-alias-analysis for merging vector
stores of vector loads.

llvm-svn: 250138
2015-10-13 00:49:00 +00:00
Matt Arsenault 61dc235f20 DAGCombiner: Combine extract_vector_elt from build_vector
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.

InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.

llvm-svn: 250129
2015-10-12 23:59:50 +00:00
Cong Hou bf22f5063a Assign correct edge weights to unwind destinations when lowering invoke statement.
When lowering invoke statement, all unwind destinations are directly added as successors of call site block, and the weight of those new edges are not assigned properly. Actually, default weight 16 are used for those edges. This patch calculates the proper edge weights for those edges when collecting all unwind destinations.

Differential revision: http://reviews.llvm.org/D13354

llvm-svn: 250119
2015-10-12 23:02:58 +00:00
Simon Pilgrim c8832fc233 [SelectionDAG] Add common vector constant folding helper function
We have a number of functions that implement constant folding of vectors (unary and binary ops) in near identical manners (and the differences don't appear to be critical).

This patch introduces a common implementation (SelectionDAG::FoldConstantVectorArithmetic) and calls this in both the unary and binary op cases.

After this initial patch I intend to begin enabling vector constant folding for a wider number of opcodes in SelectionDAG::getNode().

Differential Revision: http://reviews.llvm.org/D13665

llvm-svn: 250118
2015-10-12 23:00:11 +00:00
Matt Arsenault 07a72bad0b Enable verifier after PeepholeOptimizer
No tests fail with this enabled so I assume it was an accident
that it isn't enabled now.

llvm-svn: 250070
2015-10-12 17:43:56 +00:00
Reid Kleckner 9abb3c06a6 Don't call PrepareEHLandingPad on non EH pads
This was a minor bug in r249492. Calling PrepareEHLandingPad on a
non-landingpad was a no-op, but it attempted to get the generic pointer
register class, which apparently doesn't exist for some targets.

llvm-svn: 250068
2015-10-12 17:42:32 +00:00
David Majnemer 99c1d13e52 [WinEH] Remove CatchObjRecoverIdx
CatchObjRecoverIdx was used for the old scheme, it is no longer
relevant.

llvm-svn: 250065
2015-10-12 16:44:22 +00:00
Oliver Stannard cca893ffac [Debug] Look through bitcasts to find argument registers
On targets where f32 is not legal, we have to look through a BITCAST SDNode to
find the register that an argument is stored in when emitting debug info, or we
will not be able to emit a DW_AT_location for it.

Differential Revision: http://reviews.llvm.org/D13005

llvm-svn: 250056
2015-10-12 15:52:36 +00:00
Simon Pilgrim d45c88bbb5 [DAGCombiner] Improved FMA combine support for vectors
Enabled constant canonicalization for all constants.

Improved combining of constant vectors.

llvm-svn: 249993
2015-10-11 19:48:12 +00:00
Simon Pilgrim 5eac2607b9 [DAGCombiner] Tidyup FMINNUM/FMAXNUM constant folding
Enable constant folding for vector splats as well as scalars.

Enable constant canonicalization for all scalar and vector constants.

llvm-svn: 249978
2015-10-11 16:02:28 +00:00
David Majnemer bfa5b98201 [WinEH] Remove more dead code
wineh-parent is dead, so is ValueOrMBB.

llvm-svn: 249920
2015-10-10 00:04:29 +00:00
Reid Kleckner 14e773500e [WinEH] Delete the old landingpad implementation of Windows EH
The new implementation works at least as well as the old implementation
did.

Also delete the associated preparation tests. They don't exercise
interesting corner cases of the new implementation. All the codegen
tests of the EH tables have already been ported.

llvm-svn: 249918
2015-10-09 23:34:53 +00:00
Reid Kleckner eb7cd6c889 [SEH] Update SEH codegen tests to use the new IR
Also Fix a buglet where SEH tables had ranges that spanned funclets.

The remaining tests using the old landingpad IR are preparation tests,
and will be deleted along with the old preparation.

llvm-svn: 249917
2015-10-09 23:05:54 +00:00
Duncan P. N. Exon Smith f1ff53ecc2 CodeGen: Remove implicit ilist iterator conversions, NFC
Finish removing implicit ilist iterator conversions from LLVMCodeGen.
I'm sure there are lots more of these in lib/CodeGen/*/.

llvm-svn: 249915
2015-10-09 22:56:24 +00:00
Reid Kleckner e1c8a7f9c7 [SEH] Fix _except_handler4 table base states
We got them right for the old IR, but not with funclets.  Port the old
test to the new IR and fix the code.

llvm-svn: 249906
2015-10-09 21:27:28 +00:00
Duncan P. N. Exon Smith 6e98cd32dc CodeGen: Avoid more ilist iterator implicit conversions, NFC
llvm-svn: 249903
2015-10-09 21:08:19 +00:00
Duncan P. N. Exon Smith 1ff409802d CodeGen: Use range-based for in PostRAScheduler, NFC
llvm-svn: 249901
2015-10-09 21:05:00 +00:00
Reid Kleckner d880dc7509 [SEH] Remember to emit the last invoke range for SEH
This wasn't very observable in execution tests, because usually there is
an invoke in the catchpad that unwinds the the catchendpad but never
actually throws.

llvm-svn: 249898
2015-10-09 20:39:39 +00:00
Chad Rosier 47eba05b47 Revert "Simplify code. NFC."
This reverts commit r248610.

llvm-svn: 249887
2015-10-09 19:48:48 +00:00
Duncan P. N. Exon Smith 5ec1568c9c CodeGen: Continue removing ilist iterator implicit conversions
llvm-svn: 249884
2015-10-09 19:40:45 +00:00
Duncan P. N. Exon Smith 6ac07fd228 CodeGen: Remove implicit iterator conversions from MBB.cpp
Remove implicit ilist iterator conversions from MachineBasicBlock.cpp.

I've also added an overload of `splice()` that takes a pointer, since
it's a natural API.  This is similar to the overloads I added for
`remove()` and `erase()` in r249867.

llvm-svn: 249883
2015-10-09 19:36:12 +00:00
Duncan P. N. Exon Smith 0ac8eb9171 CodeGen: Avoid ilist iterator implicit conversions in a few more places, NFC
llvm-svn: 249880
2015-10-09 19:23:20 +00:00
Duncan P. N. Exon Smith 5ae5939fa1 CodeGen: Remove more ilist iterator implicit conversions, NFC
llvm-svn: 249879
2015-10-09 19:13:58 +00:00
Duncan P. N. Exon Smith 6c64aeb065 CodeGen: Use range-based for in IntrinsicLowering::AddPrototypes, NFC
This happens to avoid a host of implicit ilist iterator conversions.

llvm-svn: 249877
2015-10-09 19:07:41 +00:00
Duncan P. N. Exon Smith 530d040bd9 CodeGen: Use range-based for in GlobalMerge, NFC
llvm-svn: 249876
2015-10-09 18:57:47 +00:00
Duncan P. N. Exon Smith d83547a16e CodeGen: Remove a few more ilist iterator implicit conversions, NFC
llvm-svn: 249875
2015-10-09 18:44:40 +00:00
Duncan P. N. Exon Smith 980f8f2639 CodeGen: Remove implicit conversions from Analysis and BranchFolding
Remove a few more implicit ilist iterator conversions, this time from
Analysis.cpp and BranchFolding.cpp.

I added a few overloads for `remove()` and `erase()`, which quite
naturally take pointers as well as iterators as parameters.  This will
reduce the churn at least in the short term, but I don't really have a
problem with these existing for longer.

llvm-svn: 249867
2015-10-09 18:23:49 +00:00
Owen Anderson d95b08a0a7 Refine the definition of convergent to only disallow the addition of new control dependencies.
This covers the common case of operations that cannot be sunk.
Operations that cannot be hoisted should already be handled properly via
the safe-to-speculate rules and mechanisms.

llvm-svn: 249865
2015-10-09 18:06:13 +00:00
Sanjay Patel 9fbe22bac6 fix typos; NFC
llvm-svn: 249863
2015-10-09 18:01:03 +00:00
Duncan P. N. Exon Smith 8f11e1a713 CodeGen: Start removing implicit conversions to/from list iterators, NFC
Start removing implicit conversions to/from list iterators in CodeGen,
ala r249782 for IR.  A lot more to go after this.

llvm-svn: 249851
2015-10-09 16:54:49 +00:00
Reid Kleckner ae44e871cd Revert "Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"""
This reverts commit r249794.

Apparently my checkouts are full of unexpected surprises today.

llvm-svn: 249796
2015-10-09 01:13:17 +00:00
Reid Kleckner b510401785 Revert "Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64""
This reverts commit r249032.

TODO write commit msg

llvm-svn: 249794
2015-10-09 01:11:37 +00:00
Joseph Tremoulet 676e5cf07f [WinEH] Fix cleanup state numbering
Summary:
 - Recurse from cleanupendpads to their cleanuppads, to make sure the
   cleanuppad is visited if it has a cleanupendpad but no cleanupret.
 - Check for and avoid double-processing cleanuppads, to allow for them to
   have multiple cleanuprets (plus cleanupendpads).
 - Update Cxx state numbering to visit toplevel cleanupendpads and to
   recurse from cleanupendpads to their preds, to ensure we number any
   funclets in inlined cleanups.  SEH state numbering already did this.

Reviewers: rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13374

llvm-svn: 249792
2015-10-09 00:46:08 +00:00
Reid Kleckner ebef256269 [SEH] Fix llvm.eh.exceptioncode fast register allocation assertion
I called the wrong MachineBasicBlock::addLiveIn() overload.

llvm-svn: 249786
2015-10-09 00:15:13 +00:00
Michael Kuperstein 2b3c16ca17 Do not assert on first non-prologue instruction being a CFI directive.
llvm-svn: 249668
2015-10-08 07:48:49 +00:00
Craig Topper da5168b7ce Use range-based for loops. NFC.
llvm-svn: 249659
2015-10-08 06:06:42 +00:00
Justin Bogner 468c998031 CodeGen: print and verify after TargetPassConfig::insertPass by default
In r224059, we started verifying after addPass, but missed doing so on
insertPass. There isn't a good reason for the discrepancy, and
skipping the verifier in these cases causes bugs.

This also exposes a verifier error that was introduced in r249087, but
the verifier doesn't run until after the register coalescer, when the
issue happens to have been resolved. I've skipped the verifier after
SIFixSGPRLiveRangesID to avoid the failures for now and will follow up
with Matt for a proper fix.

llvm-svn: 249643
2015-10-08 00:36:22 +00:00
David Majnemer 6af5f82c20 [WinEH] Refer to filter funclets using their symbol-table symbol
The relocation for the filter funclet will be against a symbol table
entry for a function instead of the section, making it easier to
understand what is going on.

llvm-svn: 249621
2015-10-07 21:34:00 +00:00
Reid Kleckner 70bf6bb5e6 [WinEH] Undo the effect of r249578 for 32-bit
The __CxxFrameHandler3 tables for 32-bit are supposed to hold stack
offsets relative to EBP, not ESP. I blindly updated the win-catchpad.ll
test case, and immediately noticed that 32-bit catching stopped working.

While I'm at it, move the frame index to frame offset WinEH table logic
out of PEI.  PEI shouldn't have to know about WinEHFuncInfo. I realized
we can calculate frame index offsets just fine from the table printer.

llvm-svn: 249618
2015-10-07 21:13:15 +00:00
David Majnemer c289c9ff55 [WinEH] Remove unreachable blocks before preparation
We remove unreachable blocks because it is pointless to consider them
for coloring.  However, we still had stale pointers to these blocks in
some data structures after we removed them from the function.

Instead, remove the unreachable blocks before attempting to do anything
with the function.

This fixes PR25099.

llvm-svn: 249617
2015-10-07 21:08:25 +00:00
Joseph Tremoulet 39234fc67e [WinEH] Set NoModuleLevelChanges in clone flags
Summary:
This is necessary to keep the cloner from making bogus copies of debug
metadata attached to the IR it is cloning.
Also, avoid running RemapInstruction over all instructions in the common
case that no cloning was performed.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13514

llvm-svn: 249591
2015-10-07 19:29:56 +00:00
Reid Kleckner 33bd2d99d8 [WinEH] Fix two minor issues in __CxxFrameHandler3 tables
There was an off-by-one bug in ip2state tables which manifested when one
call immediately preceded the try-range of the next. The return address
of the previous call would appear to be within the try range of the next
scope, resulting in extra destructors or catches running.

We also computed the wrong offset for catch parameter stack objects. The
offset should be from RSP, not from RBP.

llvm-svn: 249578
2015-10-07 17:49:32 +00:00
Chad Rosier 169865ffda [ARM] Promote helper function to SelectionDAG.
I'll be using the function in a similar combine for AArch64.  The helper was
also improved to handle undef values.

Part of http://reviews.llvm.org/D13442

llvm-svn: 249572
2015-10-07 17:28:58 +00:00
Joseph Tremoulet bde46c5642 [WinEH] Update CoreCLR EH for catchpad MBBs
Summary:
Set the pad MBB as a funclet entry for CoreCLR as well as MSVCCXX, and
update state numbering to put the catchpad block rather than its normal
successor into the unwind map.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13492

llvm-svn: 249569
2015-10-07 17:16:25 +00:00
Michael Kuperstein 259f1508f0 [X86] Emit .cfi_escape GNU_ARGS_SIZE when adjusting the stack before calls
When outgoing function arguments are passed using push instructions, and EH
is enabled, we may need to indicate to the stack unwinder that the stack
pointer was adjusted before the call.

This should fix the exception handling issues in PR24792.

Differential Revision: http://reviews.llvm.org/D13132

llvm-svn: 249522
2015-10-07 07:01:31 +00:00
Reid Kleckner 72ba70418f [SEH] Add llvm.eh.exceptioncode intrinsic
This will support the Clang __exception_code intrinsic.

llvm-svn: 249492
2015-10-07 00:27:33 +00:00
David Blaikie c9ad9191a7 DebugInfo: Include the decl_line/decl_file in subprogram definitions if they differ from those in the declaration
This is handy for some AutoFDO stuff, and seems like a minor improvement
to correctness (otherwise a debug info consumer might think the decl
line/file of the def was the same as that of the declaration - though
what a consumer might use that for, I'm not sure - maybe "list <func>"
would've misbehaved with the old behavior?) and at a minor cost (in my
experiment, with fission, without type units, without compression, 0.01%
growth in debug info in the executable/objects, 0.02% growth in the .dwo
files).

llvm-svn: 249487
2015-10-07 00:04:16 +00:00
David Majnemer 7735a6d07a [WinEH] Create a separate MBB for funclet prologues
Our current emission strategy is to emit the funclet prologue in the
CatchPad's normal destination.  This is problematic because
intra-funclet control flow to the normal destination is not erroneous
and results in us reevaluating the prologue if said control flow is
taken.

Instead, use the CatchPad's location for the funclet prologue.  This
correctly models our desire to have unwind edges evaluate the prologue
but edges to the normal destination result in typical control flow.

Differential Revision: http://reviews.llvm.org/D13424

llvm-svn: 249483
2015-10-06 23:31:59 +00:00
Hans Wennborg 083ca9bb32 Fix Clang-tidy modernize-use-nullptr warnings in source directories and generated files; other minor cleanups.
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D13321

llvm-svn: 249482
2015-10-06 23:24:35 +00:00
Joseph Tremoulet 7f8c1165cd [WinEH] Implement state numbering for CoreCLR
Summary:
Assign one state number per handler/funclet, tracking parent state,
handler type, and catch type token.
State numbers are arranged such that ancestors have lower state numbers
than their descendants.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D13450

llvm-svn: 249457
2015-10-06 20:30:33 +00:00
Joseph Tremoulet 2afea5438f [WinEH] Recognize CoreCLR personality function
Summary:
 - Add CoreCLR to if/else ladders and switches as appropriate.
 - Rename isMSVCEHPersonality to isFuncletEHPersonality to better
   reflect what it captures.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: pgavlin, AndyAyers, llvm-commits

Differential Revision: http://reviews.llvm.org/D13449

llvm-svn: 249455
2015-10-06 20:28:16 +00:00
Craig Topper 2c4068f409 [TwoAddressInstructionPass] When looking for a 3 addr conversion after commuting, make sure regB has been updated to take into account the commute.
llvm-svn: 249378
2015-10-06 05:39:59 +00:00
Benjamin Kramer 808d2a070d Move helper classes into an anonymous namespace. NFC.
llvm-svn: 249356
2015-10-05 21:20:26 +00:00
David Majnemer e4f9b09b51 [WinEH] Update CATCHRET's operand to match its successor
The CATCHRET operand did not match the MachineFunction's CFG.  This
mismatch happened because FrameLowering created a new MachineBasicBlock
and updated the CFG but forgot to update the CATCHRET operand.

Let's make sure this doesn't happen again by strengthing the funclet
membership analysis: it can now reason about the membership of all basic
blocks, not just those inside of funclets.

llvm-svn: 249344
2015-10-05 20:09:16 +00:00
David Majnemer 429c8eda22 [SelectionDAGBuilder] Remove dead code
We already check for LandingPadInst two lines above.

llvm-svn: 249280
2015-10-04 18:44:47 +00:00
David Majnemer 161935520d [WinEH] Permit branch folding in the face of funclets
Track which basic blocks belong to which funclets.  Permit branch
folding to fire but only if it can prove that doing so will not cause
code in one funclet to be reused in another.

llvm-svn: 249257
2015-10-04 02:22:52 +00:00
Simon Pilgrim dde63374c5 [DAGCombiner] Generalize FADD constant combines to work with vectors
Updated the FADD combines to work with vectors as well as scalars.

Differential Revision: http://reviews.llvm.org/D13416

llvm-svn: 249251
2015-10-03 22:06:06 +00:00
Sanjay Patel acd4baefca include equal sign in debug equations; NFC
llvm-svn: 249248
2015-10-03 20:45:01 +00:00
Simon Pilgrim a38d76a087 [DAGCombiner] Merge SIGN_EXTEND_INREG vector constant folding methods. NCI.
visitSIGN_EXTEND_INREG calls SelectionDAG::getNode to constant fold scalar constants but handles vector constants itself, despite getNode being capable of dealing with them.

This required a minor change to the getNode implementation to actually deal with cases where the scalars of a BUILD_VECTOR were wider integers than the vector type - which was the only extra ability of the visitSIGN_EXTEND_INREG implementation.

No codegen intended and all existing tests remain the same.

llvm-svn: 249236
2015-10-03 16:26:52 +00:00
Richard Trieu e0129e474d Call the correct overload.
Call the correct overload so a string literal does not get converted to a bool.
Also fix the test case to match the names given.

llvm-svn: 249183
2015-10-02 20:52:14 +00:00
Reid Kleckner fc64fae6e3 [WinEH] Emit __C_specific_handler tables for the new IR
We emit denormalized tables, where every range of invokes in the same
state gets a complete list of EH action entries. This is significantly
simpler than trying to infer the correct nested scoping structure from
the MI. Fortunately, for SEH, the nesting structure is really just a
size optimization.

With this, some basic __try / __except examples work.

llvm-svn: 249078
2015-10-01 21:38:24 +00:00
David Majnemer 4600c06434 [WinEH] Stop BranchFolding from merging across funclets
BranchFolding would merge two funclets together, this is not OK.
Disable this and strengthen the assertion in FuncletLayout.

llvm-svn: 249069
2015-10-01 21:04:13 +00:00
David Majnemer f828a0ccc7 [WinEH] Make FuncletLayout more robust against catchret
Catchret transfers control from a catch funclet to an earlier funclet.
However, it is not completely clear which funclet the catchret target is
part of.  Make this clear by stapling the catchret target's funclet
membership onto the CATCHRET SDAG node.

llvm-svn: 249052
2015-10-01 18:44:59 +00:00
NAKAMURA Takumi 096492a07b Reformat.
llvm-svn: 249033
2015-10-01 17:01:03 +00:00
NAKAMURA Takumi 1ed20db720 Revert r248959, "[WinEH] Emit int3 after noreturn calls on Win64"
It broke; LLVM :: CodeGen__Generic__2009-11-16-BadKillsCrash.ll

llvm-svn: 249032
2015-10-01 17:00:56 +00:00
Reid Kleckner 6dec87a8a0 [WinEH] Emit int3 after noreturn calls on Win64
The Win64 unwinder disassembles forwards from each PC to try to
determine if this PC is in an epilogue. If so, it skips calling the EH
personality function for that frame. Typically, this means you cannot
catch an exception in the same frame that you threw it, because 'throw'
calls a noreturn runtime function.

Previously we avoided this problem with the TrapUnreachable
TargetOption, but that's a much bigger hammer than we need. All we need
is a 1 byte non-epilogue instruction right after the call.  Instead,
what we got was an unconditional branch to a shared block containing the
ud2, potentially 7 bytes instead of 1. So, this reverts r206684, which
added TrapUnreachable, and replaces it with something better.

The new code pattern matches for invoke/call followed by unreachable and
inserts an int3 into the DAG. To be 100% watertight, we would need to
insert SEH_Epilogue instructions into all basic blocks ending in a call
with no terminators or successors, but in practice this is unlikely to
come up.

llvm-svn: 248959
2015-09-30 23:09:23 +00:00
Evgeniy Stepanov f608111d1b Fix debug info with SafeStack.
llvm-svn: 248933
2015-09-30 19:55:43 +00:00
Maksim Panchenko cce239c45d HHVM calling conventions.
HHVM calling convention, hhvmcc, is used by HHVM JIT for
functions in translated cache. We currently support LLVM back end to
generate code for X86-64 and may support other architectures in the
future.

In HHVM calling convention any GP register could be used to pass and
return values, with the exception of R12 which is reserved for
thread-local area and is callee-saved. Other than R12, we always
pass RBX and RBP as args, which are our virtual machine's stack pointer
and frame pointer respectively.

When we enter translation cache via hhvmcc function, we expect
the stack to be aligned at 16 bytes, i.e. skewed by 8 bytes as opposed
to standard ABI alignment. This affects stack object alignment and stack
adjustments for function calls.

One extra calling convention, hhvm_ccc, is used to call C++ helpers from
HHVM's translation cache. It is almost identical to standard C calling
convention with an exception of first argument which is passed in RBP
(before we use RDI, RSI, etc.)

Differential Revision: http://reviews.llvm.org/D12681

llvm-svn: 248832
2015-09-29 22:09:16 +00:00
David Majnemer a80c151286 [WinEH] Teach AsmPrinter about funclets
Summary:
Funclets have been turned into functions by the time they hit the object
file.  Make sure that they have decent names for the symbol table and
CFI directives explaining how to reason about their prologues.

Differential Revision: http://reviews.llvm.org/D13261

llvm-svn: 248824
2015-09-29 20:12:33 +00:00
Cong Hou 166e08542e Rename some function arguments in MachineBasicBlock.cpp/h by turning the first letter into upper case. NFC.
llvm-svn: 248821
2015-09-29 19:46:09 +00:00
Jeroen Ketema 740f9d79ca Arguments spilled on the stack before a function call may have
alignment requirements, for example in the case of vectors.
These requirements are exploited by the code generator by using
move instructions that have similar alignment requirements, e.g.,
movaps on x86.

Although the code generator properly aligns the arguments with
respect to the displacement of the stack pointer it computes,
the displacement itself may cause misalignment. For example if
we have

%3 = load <16 x float>, <16 x float>* %1, align 64
call void @bar(<16 x float> %3, i32 0)

the x86 back-end emits:

movaps  32(%ecx), %xmm2
movaps  (%ecx), %xmm0
movaps  16(%ecx), %xmm1
movaps  48(%ecx), %xmm3
subl    $20, %esp       <-- if %esp was 16-byte aligned before this instruction, it no longer will be afterwards 
movaps  %xmm3, (%esp)   <-- movaps requires 16-byte alignment, while %esp is not aligned as such.
movl    $0, 16(%esp)
calll   __bar

To solve this, we need to make sure that the computed value with which
the stack pointer is changed is a multiple af the maximal alignment seen
during its computation. With this change we get proper alignment:

subl    $32, %esp
movaps  %xmm3, (%esp)

Differential Revision: http://reviews.llvm.org/D12337

llvm-svn: 248786
2015-09-29 10:12:57 +00:00
Matthias Braun 99ae16217e RegisterPressure: LiveRegSet tracks register units not physregs
There are always more physical registers and register units so the
previous behaviour was correct but we can do with less memory.

llvm-svn: 248767
2015-09-29 00:20:32 +00:00
Reid Kleckner c71d6275ca [WinEH] Fix ip2state table emission with funclets
Previously we were hijacking the old LandingPadInfo data structures to
communicate our state numbers. Now we don't need that anymore.

llvm-svn: 248763
2015-09-28 23:56:30 +00:00
Richard Trieu e778e87d2a Fix unused variable warning in non-debug builds.
llvm-svn: 248754
2015-09-28 22:54:43 +00:00
Sanjay Patel 4e6527682a tidy up comments; NFC
llvm-svn: 248750
2015-09-28 22:14:51 +00:00
Sanjay Patel 5e5f0e9756 move one-use check under the comment that describes it; NFCI
llvm-svn: 248745
2015-09-28 21:44:46 +00:00
Andrew Kaylor 16c4da03d5 Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)

Differential Revision: http://reviews.llvm.org/D11370

llvm-svn: 248735
2015-09-28 20:33:22 +00:00
Hal Finkel bd582581b8 [DAGCombine] Fix getStoreMergeAndAliasCandidates's AA-enabled chain walking
When AA is being used, non-aliasing stores are canonicalized to use the same
chain, and DAGCombiner::getStoreMergeAndAliasCandidates can take advantage of
this by looking only as users of a store's chain operand. However, user
iteration is not result-number specific, we need to check that the use is as a
chain operand, and not via some other operand. It is certainly possible to have
another potentially-aliasing store, which shares the first's base pointer, and
uses the first's chain's node via some other operand.

Failure to catch this situation caused, at least in the included test case, an
assert later because the relative sequence-number ordering caused later
replacement to create a cycle in the DAG.

llvm-svn: 248698
2015-09-28 08:02:14 +00:00
Craig Topper 862d5d8322 Remove 'const' from some ArrayRefs. ArrayRefs are already immutable. NFC
llvm-svn: 248693
2015-09-28 00:15:34 +00:00
Joseph Tremoulet 09af67aba5 [EH] Create removeUnwindEdge utility
Summary:
Factor the code that rewrites invokes to calls and rewrites WinEH
terminators to their "unwind to caller" equivalents into a helper in
Utils/Local, and use it in the three places I'm aware of that need to do
this.


Reviewers: andrew.w.kaylor, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13152

llvm-svn: 248677
2015-09-27 01:47:46 +00:00
Matthias Braun 93ab942c24 LivePhysRegs: Fix live-outs of return blocks
I realized that the live-out set computed for the return block is
missing the callee saved registers (the non-pristine ones to be exact).

This only affects the liveness computed for instructions inside the
function epilogue which currently none of the LivePhysRegs users in llvm
cares about, so this is just a drive-by fix without a testcase.

Differential Revision: http://reviews.llvm.org/D13180

llvm-svn: 248636
2015-09-25 23:50:53 +00:00
Matthias Braun a3b701f828 SelectionDAGDumper: Print simple operands inline.
Print simple operands inline instead of their pointer/value number.
Simple operands are SDNodes without predecessors like Constant(FP), Register,
UNDEF. This unifies the behaviour with dumpr() which was already doing this.

Previously:
  t0: ch = EntryToken
    t1: i64 = Register %vreg0
  t2: i64,ch = CopyFromReg t0, t1
    t3: i64 = Constant<1>
  t4: i64 = add t2, t3
    t5: i64 = Constant<2>
  t6: i64 = add t2, t5
  t10: i64 = undef
  t11: i8,ch = load t0, t2, t10<LD1[%tmp81]>
  t12: i8,ch = load t0, t4, t10<LD1[%tmp10]>
  t13: i8,ch = load t0, t6, t10<LD1[%tmp12]>

Now:
  t0: ch = EntryToken
  t2: i64,ch = CopyFromReg t0, Register:i64 %vreg0
  t4: i64 = add t2, Constant:i64<1>
  t6: i64 = add t2, Constant:i64<2>
  t11: i8,ch = load<LD1[%tmp81]> t0, t2, undef:i64
  t12: i8,ch = load<LD1[%tmp10]> t0, t4, undef:i64
  t13: i8,ch = load<LD1[%tmp12]> t0, t6, undef:i64

Differential Revision: http://reviews.llvm.org/D12567

llvm-svn: 248628
2015-09-25 22:27:02 +00:00
Matt Arsenault 3c07e963b8 DAGCombiner: Check if store is volatile first
This is the simpler check. NFC.

llvm-svn: 248625
2015-09-25 22:06:19 +00:00
Matthias Braun c804cdb912 TargetRegisterInfo: Introduce PrintLaneMask.
This makes it more convenient to print lane masks and lead to more
uniform printing.

llvm-svn: 248624
2015-09-25 21:51:24 +00:00
Matthias Braun e6a2485e1a TargetRegisterInfo: Add typedef unsigned LaneBitmask and use it where apropriate; NFC
llvm-svn: 248623
2015-09-25 21:51:14 +00:00
Sanjay Patel bbbf9a1a34 merge vector stores into wider vector stores and fix AArch64 misaligned access TLI hook (PR21711)
This is a redo of D7208 ( r227242 - http://llvm.org/viewvc/llvm-project?view=revision&revision=227242 ).

The patch was reverted because an AArch64 target could infinite loop after the change in DAGCombiner 
to merge vector stores. That happened because AArch64's allowsMisalignedMemoryAccesses() wasn't telling
the truth. It reported all unaligned memory accesses as fast, but then split some 128-bit unaligned
accesses up in performSTORECombine() because they are slow.

This patch attempts to fix the problem in AArch's allowsMisalignedMemoryAccesses() while preserving
existing (perhaps questionable) lowering behavior.

The x86 test shows that store merging is working as intended for a target with fast 32-byte unaligned
stores.

Differential Revision: http://reviews.llvm.org/D12635
 

llvm-svn: 248622
2015-09-25 21:49:48 +00:00
Matthias Braun e86bbd8979 PrologueEpilogInserter: Fix missing live-ins when savepoint equals restorepoint
The algorithm would not modify the live-in list of blocks below the save
block point which is correct unless it happens to be a restore point at
the same time.
Also fixes the benign issue of live-in registers being added twice in
some cases.

The testcase is based on a test submitted by Kit Barton.

Differential Revision: http://reviews.llvm.org/D13176

llvm-svn: 248620
2015-09-25 21:41:40 +00:00
Matthias Braun c2d4befb54 MachineBasicBlock: Factor out common code into isReturnBlock()
llvm-svn: 248617
2015-09-25 21:25:19 +00:00
Matt Arsenault 10aa807856 PeepholeOptimizer: Remove redundant copies
If a virtual register is copied and another copy was already
seen, replace with the previous copy. This only handles the
simplest cases for now.

This pattern shows up from various operand restrictions
AMDGPU has which require inserting copies depending
on the register class of the operands.

llvm-svn: 248611
2015-09-25 20:22:12 +00:00
Chad Rosier d9f102b464 Simplify code. NFC.
llvm-svn: 248610
2015-09-25 20:20:22 +00:00
Matt Arsenault 50f0a42b66 Fix typo
llvm-svn: 248549
2015-09-24 22:36:49 +00:00
Mohammad Shahid 13f1dfdf2e Codegen: Fix llvm.*absdiff semantic.
Fixes the overflow case of llvm.*absdiff intrinsic also updats the tests and LangRef.rst accordingly.

Differential Revision: http://reviews.llvm.org/D11678

llvm-svn: 248483
2015-09-24 10:35:03 +00:00
Matt Arsenault 68d938649e Introduce target hook for optimizing register copies
Allow a target to do something other than search for copies
that will avoid cross register bank copies.

Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.

I'm not entirely satisified with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid  kinds
of subregister copies on some targets.

I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.

The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.

llvm-svn: 248478
2015-09-24 08:36:14 +00:00
Matt Arsenault c7ec46c3aa Remove dead declaration
llvm-svn: 248471
2015-09-24 07:51:12 +00:00
Matt Arsenault c721df0478 Use new TokenFactor chain when merging stores
If the stores are storing values from loads which partially
alias the stores, we could end up placing the merged loads
and stores on the same chain which has the potential to break.
Each store may have a different chain dependency on only some
of the original loads. Create a new TokenFactor to capture all
of the required dependencies of the stores rather than assuming
all stores can use the same chain.

The testcase is a situation where this happens, although
it does not have an observable change from this. The DAG nodes
just happened to not be reordered before despite this missing
chain dependency.

This is based on an off-list report for an out of tree target
which regressed due to r246307 and I haven't managed to find a case
where the nodes do end up reordered with an in tree target.

llvm-svn: 248468
2015-09-24 07:22:38 +00:00
Evgeniy Stepanov a2002b08f7 Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

This is a re-commit of a change in r248357 that was reverted in
r248358.

llvm-svn: 248405
2015-09-23 18:07:56 +00:00
Evgeniy Stepanov 8d0e3011d8 Revert "Android support for SafeStack."
test/Transforms/SafeStack/abi.ll breaks when target is not supported;
needs refactoring.

llvm-svn: 248358
2015-09-23 01:23:22 +00:00
Evgeniy Stepanov ce2e16f00c Android support for SafeStack.
Add two new ways of accessing the unsafe stack pointer:

* At a fixed offset from the thread TLS base. This is very similar to
  StackProtector cookies, but we plan to extend it to other backends
  (ARM in particular) soon. Bionic-side implementation here:
  https://android-review.googlesource.com/170988.
* Via a function call, as a fallback for platforms that provide
  neither a fixed TLS slot, nor a reasonable TLS implementation (i.e.
  not emutls).

llvm-svn: 248357
2015-09-23 01:03:51 +00:00
Cong Hou 9def6efd7e Fixed an issue on updating profile data when lowering switch statement.
Fixed the issue that when there is an edge from the jump table to the default statement, we should check it directly instead of checking if the sibling of the jump table header is a successor of the jump table header, which may not be the default statment but a successor of it.

llvm-svn: 248354
2015-09-23 00:20:27 +00:00
Adrian Prantl 77fefeba37 Debug Info: Emit the dwo_name only in skeleton CUs, not in DWOs.
llvm-svn: 248340
2015-09-22 23:21:00 +00:00
Matthias Braun 73e4221e6c LiveIntervalAnalysis: Avoid multiple connected liveness components
We may have subregister defs which are unused but not discovered and
cleaned up prior to liveness analysis. This creates multiple connected
components in the resulting live range which are forbidden in the
MachineVerifier because they would unnecesarily constrain the register
allocator. Rewrite those dead definitions to define a newly created
virtual register.

Differential Revision: http://reviews.llvm.org/D13035

llvm-svn: 248335
2015-09-22 22:37:44 +00:00
Matthias Braun 5efe871971 LiveInterval: Distribute subregister liveranges to new intervals in ConnectedVNInfoEqClasses::Distribute()
This improves ConnectedVNInfoEqClasses::Distribute() to distribute the
segments and value numbers in the subranges instead of conservatively
clearing all subregister info.

No separate test here, just clearing the subrange instead of properly
distributing them would however break my upcoming fix regarding dead super
register definitions.

Differential Revision: http://reviews.llvm.org/D13075

llvm-svn: 248334
2015-09-22 22:37:42 +00:00
Ahmed Bougacha 07a844d758 [AArch64] Emit clrex in the expanded cmpxchg fail block.
In the comparison failure block of a cmpxchg expansion, the initial
ldrex/ldxr will not be followed by a matching strex/stxr.
On ARM/AArch64, this unnecessarily ties up the execution monitor,
which might have a negative performance impact on some uarchs.

Instead, release the monitor in the failure block.
The clrex instruction was designed for this: use it.

Also see ARMARM v8-A B2.10.2:
"Exclusive access instructions and Shareable memory locations".

Differential Revision: http://reviews.llvm.org/D13033

llvm-svn: 248291
2015-09-22 17:21:44 +00:00
Benjamin Kramer 3c96f0a54e Make helper function static. NFC.
llvm-svn: 248278
2015-09-22 14:34:57 +00:00
NAKAMURA Takumi 0a7d0ad95f Untabify.
llvm-svn: 248264
2015-09-22 11:15:07 +00:00
NAKAMURA Takumi a9cb538a74 Reformat blank lines.
llvm-svn: 248263
2015-09-22 11:14:39 +00:00
NAKAMURA Takumi 84965031a7 Reformat comment lines.
llvm-svn: 248262
2015-09-22 11:14:12 +00:00
NAKAMURA Takumi 70ad98aca4 Reformat.
llvm-svn: 248261
2015-09-22 11:13:55 +00:00
Matthias Braun d3dd1354a4 LiveIntervalAnalysis: Factor common code into splitSeparateComponents; NFC
llvm-svn: 248241
2015-09-22 03:44:41 +00:00
Sanjay Patel fc580a60e2 function names should start with a lower case letter; NFC
llvm-svn: 248224
2015-09-21 23:03:16 +00:00
Sanjay Patel 4ac6b115e8 don't repeat function/variable names in header comments; NFC
llvm-svn: 248222
2015-09-21 22:47:23 +00:00
Simon Pilgrim 4003ed2da3 [DAGCombiner] Improve FMA support for interpolation patterns
This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.

This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))

Differential Revision: http://reviews.llvm.org/D13003

llvm-svn: 248210
2015-09-21 20:32:48 +00:00
Simon Pilgrim e8e5a17a12 [DAGCombiner] Tidy up FMA combine helpers. NFCI.
Based on feedback for D13003.

llvm-svn: 248206
2015-09-21 20:15:03 +00:00
Stephen Canon b12db0e42c Remove roundingMode argument in APFloat::mod
Because mod is always exact, this function should have never taken a rounding mode argument.  The actual implementation still has issues, which I'll look at resolving in a subsequent patch.

llvm-svn: 248195
2015-09-21 19:29:25 +00:00
Matt Arsenault 8fb9b94f7f Fix accidentally committed debug printing
llvm-svn: 248190
2015-09-21 18:21:10 +00:00
Matthias Braun b9fe44ddb0 SelectionDAG: Use InsertNode for EntryNode
This fixes problems where two nodes have persistent debug id 0 assigned.

llvm-svn: 248182
2015-09-21 17:41:05 +00:00
Matt Arsenault b774834429 DAGCombiner: Replace store of FP constant after attemping store merges
If storing multiple FP constants, some subset of the stores
would be replaced with integers due to visit order, so
MergeConsecutiveStores would only partially merge
these.

llvm-svn: 248169
2015-09-21 15:59:46 +00:00
Matt Arsenault a30ddb6524 Factor replacement of stores of FP constants into new function
llvm-svn: 248168
2015-09-21 15:59:43 +00:00
Chad Rosier 03a47305ec [Machine Combiner] Refactor machine reassociation code to be target-independent.
No functional change intended.
Patch by Haicheng Wu <haicheng@codeaurora.org>!

http://reviews.llvm.org/D12887
PR24522

llvm-svn: 248164
2015-09-21 15:09:11 +00:00
Craig Topper 0013be16ff Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC
llvm-svn: 248140
2015-09-21 05:32:41 +00:00
Maksim Panchenko 0510cd5161 [PrologEpilogInserter] Minor refactoring.
Differential Revision: http://reviews.llvm.org/D12924

llvm-svn: 248084
2015-09-19 04:42:15 +00:00
Maksim Panchenko 07b754daf8 Test commit. Fix comment. NFC.
llvm-svn: 248082
2015-09-19 04:01:19 +00:00
Cong Hou d40105d321 Update edge weights properly when merging blocks in if-conversion.
In if-conversion, there is a utility function MergeBlocks() that is used to merge blocks. However, when new edges are built in this function the edge weight is either not provided or not updated properly, leading to a modified CFG with incorrect edge weights. This patch corrects this issue.

Differential Revision: http://reviews.llvm.org/D12513

llvm-svn: 248030
2015-09-18 20:22:41 +00:00
James Y Knight e72b0dbf97 Make MachineScheduler debug output less confusing.
At least...a little bit.

llvm-svn: 248020
2015-09-18 18:52:20 +00:00
Matthias Braun 77771cfd97 SelectionDAGDumper: Leave out the <multiple use> markers
They mostly clutter the output while it is still possible to see which
node has multiple users without them.

Differential Revision: http://reviews.llvm.org/D12569

llvm-svn: 248013
2015-09-18 17:57:33 +00:00
Matthias Braun bab3fb45e5 SelectionDAGDumper: Avoid unnecessary newlines
Before:
  t0 = EntryToken:ch

    t0: <multiple use>
        t0: <multiple use>
      t1 = CopyFromReg:v4f32,ch t0, Register:v4f32  %vreg0

      t25 = IMPLICIT_DEF:v4f32

    t26 = HADDPSrr:v4f32 t1, t25

  t23 = CopyToReg:ch,glue t0, Register:v4f32  %XMM0, t26

    t23: <multiple use>
    t23: <multiple use>
  t24 = RETQ:ch Register:v4f32  %XMM0, t23, t23:1

After:
    t0: <multiple use>
        t0: <multiple use>
      t1 = CopyFromReg:v4f32,ch t0, Register:v4f32  %vreg0
    t26 = X86ISD::FHADD:v4f32 t1, undef:v4f32
  t23 = CopyToReg:ch,glue t0, Register:v4f32  %XMM0, t26
    t23: <multiple use>
    t21 = TargetConstant:i16<0>
    t23: <multiple use>
  t24 = X86ISD::RET_FLAG:ch t23, t21, Register:v4f32  %XMM0, t23:1

Differential Revision: http://reviews.llvm.org/D12568

llvm-svn: 248012
2015-09-18 17:57:31 +00:00
Matthias Braun f89b7c7188 SelectionDAGDumper: Hide [ID=X], [ORD=X] and source locations by default.
You can show them with the new -dag-dump-verbose switch.

Differential Revision: http://reviews.llvm.org/D12566

llvm-svn: 248011
2015-09-18 17:57:28 +00:00
Matthias Braun 0b7d6c14c9 SelectionDAG: Introduce PersistentID to SDNode for assert builds.
This gives us more human readable numbers to identify nodes in debug
dumps.

Before:
  0x7fcbd9700160: ch = EntryToken

  0x7fcbd985c7c8: i64 = Register %RAX

   ...

      0x7fcbd9700160: <multiple use>
    0x7fcbd985c578: i64,ch = MOV64rm 0x7fcbd985c6a0, 0x7fcbd985cc68, 0x7fcbd985c200, 0x7fcbd985cd90, 0x7fcbd985ceb8, 0x7fcbd9700160<Mem:LD8[@foo]> [ORD=2]

  0x7fcbd985c8f0: ch,glue = CopyToReg 0x7fcbd9700160, 0x7fcbd985c7c8, 0x7fcbd985c578 [ORD=3]

    0x7fcbd985c7c8: <multiple use>
    0x7fcbd985c8f0: <multiple use>
    0x7fcbd985c8f0: <multiple use>
  0x7fcbd985ca18: ch = RETQ 0x7fcbd985c7c8, 0x7fcbd985c8f0, 0x7fcbd985c8f0:1 [ORD=3]

Now:
  t0: ch = EntryToken

  t5: i64 = Register %RAX

    ...

      t0: <multiple use>
    t3: i64,ch = MOV64rm t10, t12, t11, t13, t14, t0<Mem:LD8[@foo]> [ORD=2]

  t6: ch,glue = CopyToReg t0, t5, t3 [ORD=3]

    t5: <multiple use>
    t6: <multiple use>
    t6: <multiple use>
  t7: ch = RETQ t5, t6, t6:1 [ORD=3]

Differential Revision: http://reviews.llvm.org/D12564

llvm-svn: 248010
2015-09-18 17:41:00 +00:00
David Majnemer 9966fe8f85 [WinEH] Moved funclet pads should be in relative order
We shifted the MachineBasicBlocks to the end of the MachineFunction in
DFS order.  This will not ensure that MachineBasicBlocks which fell
through to one another will remain contiguous.  Instead, implement
a stable sort algorithm for iplist.

This partially reverts commit r214150.

llvm-svn: 247978
2015-09-18 08:18:07 +00:00
Bob Wilson dd0eadce7d Whitespace. Indent with spaces instead of a tab.
llvm-svn: 247969
2015-09-18 05:36:13 +00:00
Quentin Colombet b4c6886215 [ShrinkWrap] Refactor the handling of infinite loop in the analysis.
- Strenghten the logic to be sure we hoist the restore point out of the current
  loop. (The fixes a bug with infinite loop, added as part of the patch.)
- Walk over the exit blocks of the current loop to conver to the desired restore
  point in one iteration of the update loop.

llvm-svn: 247958
2015-09-17 23:21:34 +00:00
Matthias Braun 3e86de1acb Revert "(HEAD -> master, origin/master, origin/HEAD) RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker"
This reverts commit r247943.

Accidental commit, code review was not finished yet.

llvm-svn: 247945
2015-09-17 21:12:24 +00:00
Matthias Braun 70eff2571f RegisterPressure: Move LiveInRegs/LiveOutRegs from RegisterPressure to PressureTracker
Differential Revision: http://reviews.llvm.org/D12814

llvm-svn: 247943
2015-09-17 21:10:06 +00:00
Matthias Braun d78ee54a54 MachineScheduler: Provide an option for node hiding cutoff and disable it by default
llvm-svn: 247942
2015-09-17 21:09:59 +00:00
David Majnemer 978902309a [WinEH] Add a funclet layout pass
Windows EH funclets need to be contiguous.  The FuncletLayout pass will
ensure that the funclets are together and begin with a funclet entry MBB.

Differential Revision: http://reviews.llvm.org/D12943

llvm-svn: 247937
2015-09-17 20:45:18 +00:00
Piotr Padlewski ea09288ee7 Added MD_invariant_group to LLVMContext
http://reviews.llvm.org/D12926

llvm-svn: 247931
2015-09-17 20:25:07 +00:00
Reid Kleckner ed17079b52 [WinEH] Add and use hasEHPadSuccessor instead of getLandingPadSuccessor
getLandingPadSuccessor assumes that each invoke can have at most one EH
pad successor, but WinEH invokes can have more than one. Two out of
three callers of getLandingPadSuccessor don't use the returned
landingpad, so we can make them use this simple predicate instead.

Eventually we'll have to circle back and fix SplitKit.cpp so that
register allocation works. Baby steps.

llvm-svn: 247904
2015-09-17 17:19:40 +00:00
Zia Ansari 841cce1ae9 Test commit.
llvm-svn: 247901
2015-09-17 16:51:27 +00:00
Eric Christopher c7b155f670 Use the cached TargetInstrInfo instead of looking it up again.
llvm-svn: 247865
2015-09-16 23:38:16 +00:00
Eric Christopher a4e5d3cf8e constify the Function parameter to the TTI creation callback and
propagate to all callers/users/etc.

llvm-svn: 247864
2015-09-16 23:38:13 +00:00
Reid Kleckner 813f1b65bc [WinEH] Rip out the landingpad-based C++ EH state numbering code
It never really worked, and the new code is working better every day.

llvm-svn: 247860
2015-09-16 22:14:46 +00:00
David Majnemer 67bff0d88b [WinEHPrepare] Turn terminatepad into a cleanuppad + call + cleanupret
The MSVC doesn't really support exception specifications so let's just
turn these into cleanuppads.  Later, we might use terminatepad to more
efficiently encode the "noexcept"-ness of a function body.

llvm-svn: 247848
2015-09-16 20:42:16 +00:00
Reid Kleckner b005d281c3 [WinEH] Pull Adjectives and CatchObj out of the catchpad arg list
Clang now passes the adjectives as an argument to catchpad.

Getting the CatchObj working is simply a matter of threading another
static alloca through codegen, first as an alloca, then as a frame
index, and finally as a frame offset.

llvm-svn: 247844
2015-09-16 20:16:27 +00:00
David Majnemer 459a64aed7 [WinEHPrepare] Provide a cloning mode which doesn't demote
We are experimenting with a new approach to saving and restoring SSA
values used across funclets: let the register allocator do the dirty
work for us.

However, this means that we need to be able to clone commoned blocks
without relying on demotion.

llvm-svn: 247835
2015-09-16 18:40:37 +00:00
David Majnemer b3d9b960ea [WinEHPrepare] Refactor explicit EH preparation
Split the preparation machinery into several functions, we will want to
selectively enable/disable different parts of it for an alternative
mechanism for dealing with cross-funclet uses.

llvm-svn: 247834
2015-09-16 18:40:24 +00:00
Sanjay Patel a260701bbb propagate fast-math-flags on DAG nodes
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, 
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: 
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.

This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I 
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.

This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.

Differential Revision: http://reviews.llvm.org/D12095

llvm-svn: 247815
2015-09-16 16:31:21 +00:00
Michael Kuperstein 098cd9fba7 [X86] Fix emitEpilogue() to make less assumptions about pops
This is the mirror image of r242395.
When X86FrameLowering::emitEpilogue() looks for where to insert the %esp addition that
deallocates stack space used for local allocations, it assumes that any sequence of pop
instructions from function exit backwards consists purely of restoring callee-save registers.

This may be false, since from some point backward, the pops may be clean-up of stack space
allocated for arguments to a call.

Patch by: amjad.aboud@intel.com
Differential Revision: http://reviews.llvm.org/D12688

llvm-svn: 247784
2015-09-16 11:18:25 +00:00
Craig Topper 5db36df4d0 Use range-based for loops. NFC
llvm-svn: 247772
2015-09-16 03:52:35 +00:00
Craig Topper 77ec077067 Fix a spelling error in the description of a statistic. NFC
llvm-svn: 247771
2015-09-16 03:52:32 +00:00
Piotr Padlewski 6c15ec49ed Introducing llvm.invariant.group.barrier intrinsic
For more info for what reason it was invented, goto:
http://lists.llvm.org/pipermail/cfe-dev/2015-July/044227.html

invariant.group.barrier:
http://reviews.llvm.org/D12310
docs:
http://reviews.llvm.org/D11399
CodeGenPrepare:
http://reviews.llvm.org/D12875

llvm-svn: 247711
2015-09-15 18:32:14 +00:00
Quentin Colombet dc29c973e5 [ShrinkWrapping] Fix an infinite loop while looking for restore point.
This may happen when the input program itself contains an infinite loop with no
exit block. In that case, we would fail to find a block post-dominating the loop
such that this block is outside of the loop.

This fixes PR24823.
Working on reducing the test case.

llvm-svn: 247710
2015-09-15 18:19:39 +00:00
Daniel Sanders 50f17235dd Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Eric has replied and has demanded the patch be reverted.

llvm-svn: 247702
2015-09-15 16:17:27 +00:00
Daniel Sanders 153010c52d Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969

llvm-svn: 247692
2015-09-15 14:08:28 +00:00
Daniel Sanders c40de48041 Revert r247684 - Replace Triple with a new TargetTuple ...
LLDB needs to be updated in the same commit.

llvm-svn: 247686
2015-09-15 13:46:21 +00:00
Daniel Sanders 18d4b0dab7 Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969

llvm-svn: 247683
2015-09-15 13:17:40 +00:00
Adrian Prantl deef90d7f5 DwarfDebug: Emit dwo_id+dwo_name for DICompileUnits that provide a dwoId.
For module debugging clang emits prefabricated skeleton compile units
that can be recognized by a nonzero dwoId.

llvm-svn: 247626
2015-09-14 22:10:22 +00:00
David Blaikie 6614d8d230 [opaque pointer types] Switch a few cases of getElementType over, since I had them lying around anyway
llvm-svn: 247610
2015-09-14 20:29:26 +00:00
Matthias Braun 3f3934b010 RegisterPressure: Simplify close{Top|Bottom}()
- There are no duplicate registers in LiveRegs list we are copying from
  and so we do not need to sort the registers.
- Simply use SmallVector::apend instead of a loop between begin() and end()
  with push_back().

Differential Revision: http://reviews.llvm.org/D12813

llvm-svn: 247588
2015-09-14 18:24:15 +00:00
David Blaikie 16a2f3e302 Revert "[opaque pointer type] Pass GlobalAlias the actual pointer type rather than decomposing it into pointee type + address space"
This was a flawed change - it just caused the getElementType call to be
deferred until later, when we really need to remove it. Now that the IR
for GlobalAliases has been updated, the root cause is addressed that way
instead and this change is no longer needed (and in fact gets in the way
- because we want to pass the pointee type directly down further).

Follow up patches to push this through GlobalValue, bitcode format, etc,
will come along soon.

This reverts commit 236160.

llvm-svn: 247585
2015-09-14 18:01:59 +00:00
Ahmed Bougacha 49b531a08d [CodeGen] Fix AtomicExpand invalidation issue caused by r247429.
llvm-svn: 247514
2015-09-12 18:51:23 +00:00
Bruce Mitchener e9ffb45b60 Fix typos.
Summary: This fixes a variety of typos in docs, code and headers.

Subscribers: jholewinski, sanjoy, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12626

llvm-svn: 247495
2015-09-12 01:17:08 +00:00
Akira Hatanaka bc497c93f5 Use function attribute "stackrealign" to decide whether stack
realignment should be forced.

With this commit, we can now force stack realignment when doing LTO and
do so on a per-function basis. Also, add a new cl::opt option
"stackrealign" to CommandFlags.h which is used to force stack
realignment via llc's command line.

Out-of-tree projects currently using -force-align-stack to force stack
realignment should make changes to attach the attribute to the functions
in the IR.

Differential Revision: http://reviews.llvm.org/D11814

llvm-svn: 247450
2015-09-11 18:54:38 +00:00
David Majnemer 0e70598a5b [X86] Make sure startproc/endproc are paired
We used different conditions to determine if we should emit startproc vs
endproc.  Use the same condition to ensure that they will always be
paired.

This fixes PR24374.

llvm-svn: 247435
2015-09-11 17:34:34 +00:00
Ahmed Bougacha 5246867384 [CodeGen] Refactor TLI/AtomicExpand interface to make LLSC explicit.
We used to have this magic "hasLoadLinkedStoreConditional()" callback,
which really meant two things:
- expand cmpxchg (to ll/sc).
- expand atomic loads using ll/sc (rather than cmpxchg).

Remove it, and, instead, introduce explicit callbacks:
- bool shouldExpandAtomicCmpXchgInIR(inst)
- AtomicExpansionKind shouldExpandAtomicLoadInIR(inst)

Differential Revision: http://reviews.llvm.org/D12557

llvm-svn: 247429
2015-09-11 17:08:28 +00:00
Ahmed Bougacha 9d677131c4 [CodeGen] Rename AtomicRMWExpansionKind to AtomicExpansionKind.
This lets us generalize its usage to the other atomic instructions.

llvm-svn: 247428
2015-09-11 17:08:17 +00:00
Cong Hou c536bd9e73 Pass BranchProbability/BlockMass by value instead of const& as they are small. NFC.
llvm-svn: 247357
2015-09-10 23:10:42 +00:00
Reid Kleckner 7bb20bd69e Fix SEH state numbering algorithm to handle cleanupendpads
WinEHPrepare's new coloring algorithm really expects to see
cleanupendpads now, so Clang will start emitting them soon.

llvm-svn: 247341
2015-09-10 21:46:36 +00:00
Chandler Carruth 2e4ca848f4 Add an explicit 'inline' specifier to these static functions. GCC is
warning on them having always_inline attribute for reasons I don't fully
understand -- static functions are just as inlinable as inline
functions in terms of linkage.

llvm-svn: 247334
2015-09-10 20:34:57 +00:00
Adrian Prantl d209500fd5 Debug Info: Allow a DIModule to appear as the scope of other entities.
llvm-svn: 247304
2015-09-10 17:13:58 +00:00
Joseph Tremoulet f3aff31401 [WinEH] Fix single-block cleanup coloring
Summary:
The coloring code in WinEHPrepare queues cleanuprets' successors with the
correct color (the parent one) when it sees their cleanuppad, and so later
when iterating successors knows to skip processing cleanuprets since
they've already been queued.  This latter check was incorrectly under an
'else' condition and so inadvertently was not kicking in for single-block
cleanups.  This change sinks the check out of the 'else' to fix the bug.

Reviewers: majnemer, andrew.w.kaylor, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12751

llvm-svn: 247299
2015-09-10 16:51:25 +00:00
Hans Wennborg aa15bffa1f Re-commit r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
Except the changes that defined virtual destructors as =default, because that
ran into problems with GCC 4.7 and overriding methods that weren't noexcept.

llvm-svn: 247298
2015-09-10 16:49:58 +00:00
Alex Lorenz 0153e59935 Fix PR 24724 - The implicit register verifier shouldn't assume certain operand
order.

The implicit register verifier in the MIR parser should only check if the
instruction's default implicit operands are present in the instruction. It
should not check the order in which they occur.

llvm-svn: 247283
2015-09-10 14:04:34 +00:00
Silviu Baranga df9ce8408a [DAGCombine] Truncate BUILD_VECTOR operators if necessary when constant folding vectors
Summary:
The BUILD_VECTOR node will truncate its operators to match the
type. We need to take this into account when constant folding -
we need to perform a truncation before constant folding the elements.
This is because the upper bits can change the result, depending on
the operation type (for example this is the case for min/max).

This change also adds a regression test.

Reviewers: jmolloy

Subscribers: jmolloy, llvm-commits

Differential Revision: http://reviews.llvm.org/D12697

llvm-svn: 247265
2015-09-10 10:34:34 +00:00
Hans Wennborg d2799a963f Revert r247216: "Fix Clang-tidy misc-use-override warnings, other minor fixes"
This caused build breakges, e.g.
http://lab.llvm.org:8011/builders/clang-x86_64-ubuntu-gdb-75/builds/24926

llvm-svn: 247226
2015-09-10 00:57:26 +00:00
Reid Kleckner 7878391208 [WinEH] Add codegen support for cleanuppad and cleanupret
All of the complexity is in cleanupret, and it mostly follows the same
codepaths as catchret, except it doesn't take a return value in RAX.

This small example now compiles and executes successfully on win32:
  extern "C" int printf(const char *, ...) noexcept;
  struct Dtor {
    ~Dtor() { printf("~Dtor\n"); }
  };
  void has_cleanup() {
    Dtor o;
    throw 42;
  }
  int main() {
    try {
      has_cleanup();
    } catch (int) {
      printf("caught it\n");
    }
  }

Don't try to put the cleanup in the same function as the catch, or Bad
Things will happen.

llvm-svn: 247219
2015-09-10 00:25:23 +00:00
Hans Wennborg 6fa09455ed Fix Clang-tidy misc-use-override warnings, other minor fixes
Patch by Eugene Zelenko!

Differential Revision: http://reviews.llvm.org/D12740

llvm-svn: 247216
2015-09-10 00:12:56 +00:00
Reid Kleckner 94b704c469 [SEH] Emit 32-bit SEH tables for the new EH IR
The 32-bit tables don't actually contain PC range data, so emitting them
is incredibly simple.

The 64-bit tables, on the other hand, use the same table for state
numbering as well as label ranges. This makes things more difficult, so
it will be implemented later.

llvm-svn: 247192
2015-09-09 21:10:03 +00:00
Matthias Braun d9da162789 Save LaneMask with livein registers
With subregister liveness enabled we can detect the case where only
parts of a register are live in, this is expressed as a 32bit lanemask.
The current code only keeps registers in the live-in list and therefore
enumerated all subregisters affected by the lanemask. This turned out to
be too conservative as the subregister may also cover additional parts
of the lanemask which are not live. Expressing a given lanemask by
enumerating a minimum set of subregisters is computationally expensive
so the best solution is to simply change the live-in list to store the
lanemasks as well. This will reduce memory usage for targets using
subregister liveness and slightly increase it for other targets

Differential Revision: http://reviews.llvm.org/D12442

llvm-svn: 247171
2015-09-09 18:08:03 +00:00
Matthias Braun cc58005885 VirtRegMap: Improve addMBBLiveIns() using SlotIndex::MBBIndexIterator; NFC
Now that we have an explicit iterator over the idx2MBBMap in SlotIndices
we can use the fact that segments and the idx2MBBMap is sorted by
SlotIndex position so can advance both simultaneously instead of
starting from the beginning for each segment.

This complicates the code for the subregister case somewhat but should
be more efficient and has the advantage that we get the final lanemask
for each block immediately which will be important for a subsequent
change.

Removes the now unused SlotIndexes::findMBBLiveIns function.

Differential Revision: http://reviews.llvm.org/D12443

llvm-svn: 247170
2015-09-09 18:07:54 +00:00
Chandler Carruth 7b560d40bd [PM/AA] Rebuild LLVM's alias analysis infrastructure in a way compatible
with the new pass manager, and no longer relying on analysis groups.

This builds essentially a ground-up new AA infrastructure stack for
LLVM. The core ideas are the same that are used throughout the new pass
manager: type erased polymorphism and direct composition. The design is
as follows:

- FunctionAAResults is a type-erasing alias analysis results aggregation
  interface to walk a single query across a range of results from
  different alias analyses. Currently this is function-specific as we
  always assume that aliasing queries are *within* a function.

- AAResultBase is a CRTP utility providing stub implementations of
  various parts of the alias analysis result concept, notably in several
  cases in terms of other more general parts of the interface. This can
  be used to implement only a narrow part of the interface rather than
  the entire interface. This isn't really ideal, this logic should be
  hoisted into FunctionAAResults as currently it will cause
  a significant amount of redundant work, but it faithfully models the
  behavior of the prior infrastructure.

- All the alias analysis passes are ported to be wrapper passes for the
  legacy PM and new-style analysis passes for the new PM with a shared
  result object. In some cases (most notably CFL), this is an extremely
  naive approach that we should revisit when we can specialize for the
  new pass manager.

- BasicAA has been restructured to reflect that it is much more
  fundamentally a function analysis because it uses dominator trees and
  loop info that need to be constructed for each function.

All of the references to getting alias analysis results have been
updated to use the new aggregation interface. All the preservation and
other pass management code has been updated accordingly.

The way the FunctionAAResultsWrapperPass works is to detect the
available alias analyses when run, and add them to the results object.
This means that we should be able to continue to respect when various
passes are added to the pipeline, for example adding CFL or adding TBAA
passes should just cause their results to be available and to get folded
into this. The exception to this rule is BasicAA which really needs to
be a function pass due to using dominator trees and loop info. As
a consequence, the FunctionAAResultsWrapperPass directly depends on
BasicAA and always includes it in the aggregation.

This has significant implications for preserving analyses. Generally,
most passes shouldn't bother preserving FunctionAAResultsWrapperPass
because rebuilding the results just updates the set of known AA passes.
The exception to this rule are LoopPass instances which need to preserve
all the function analyses that the loop pass manager will end up
needing. This means preserving both BasicAAWrapperPass and the
aggregating FunctionAAResultsWrapperPass.

Now, when preserving an alias analysis, you do so by directly preserving
that analysis. This is only necessary for non-immutable-pass-provided
alias analyses though, and there are only three of interest: BasicAA,
GlobalsAA (formerly GlobalsModRef), and SCEVAA. Usually BasicAA is
preserved when needed because it (like DominatorTree and LoopInfo) is
marked as a CFG-only pass. I've expanded GlobalsAA into the preserved
set everywhere we previously were preserving all of AliasAnalysis, and
I've added SCEVAA in the intersection of that with where we preserve
SCEV itself.

One significant challenge to all of this is that the CGSCC passes were
actually using the alias analysis implementations by taking advantage of
a pretty amazing set of loop holes in the old pass manager's analysis
management code which allowed analysis groups to slide through in many
cases. Moving away from analysis groups makes this problem much more
obvious. To fix it, I've leveraged the flexibility the design of the new
PM components provides to just directly construct the relevant alias
analyses for the relevant functions in the IPO passes that need them.
This is a bit hacky, but should go away with the new pass manager, and
is already in many ways cleaner than the prior state.

Another significant challenge is that various facilities of the old
alias analysis infrastructure just don't fit any more. The most
significant of these is the alias analysis 'counter' pass. That pass
relied on the ability to snoop on AA queries at different points in the
analysis group chain. Instead, I'm planning to build printing
functionality directly into the aggregation layer. I've not included
that in this patch merely to keep it smaller.

Note that all of this needs a nearly complete rewrite of the AA
documentation. I'm planning to do that, but I'd like to make sure the
new design settles, and to flesh out a bit more of what it looks like in
the new pass manager first.

Differential Revision: http://reviews.llvm.org/D12080

llvm-svn: 247167
2015-09-09 17:55:00 +00:00
Matthias Braun 80595460d8 MachineVerifier: Check that SlotIndex MBBIndexList is sorted.
This introduces a check that the MBBIndexList is sorted as proposed in
http://reviews.llvm.org/D12443 but split up into a separate commit.

llvm-svn: 247166
2015-09-09 17:49:46 +00:00
Daniel Sanders 2038747fce Fix vector splitting for extract_vector_elt and vector elements of <8-bits.
Summary:
One of the vector splitting paths for extract_vector_elt tries to lower:
    define i1 @via_stack_bug(i8 signext %idx) {
      %1 = extractelement <2 x i1> <i1 false, i1 true>, i8 %idx
      ret i1 %1
    }
to:
    define i1 @via_stack_bug(i8 signext %idx) {
      %base = alloca <2 x i1>
      store <2 x i1> <i1 false, i1 true>, <2 x i1>* %base
      %2 = getelementptr <2 x i1>, <2 x i1>* %base, i32 %idx
      %3 = load i1, i1* %2
      ret i1 %3
    }
However, the elements of <2 x i1> are not byte-addressible. The result of this
is that the getelementptr expands to '%base + %idx * (1 / 8)' which simplifies
to '%base + %idx * 0', and then simply '%base' causing all values of %idx to
extract element zero.

This commit fixes this by promoting the vector elements of <8-bits to i8 before
splitting the vector.

This fixes a number of test failures in pocl.

Reviewers: pekka.jaaskelainen

Subscribers: pekka.jaaskelainen, llvm-commits

Differential Revision: http://reviews.llvm.org/D12591

llvm-svn: 247128
2015-09-09 09:53:20 +00:00
Matt Arsenault acd68b58ae SelectionDAG: Support Expand of f16 extloads
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.

This moves a hack out of SI's argument lowering and
is covered by existing tests.

llvm-svn: 247113
2015-09-09 01:12:27 +00:00
Matt Arsenault 3099156261 Fix typos / grammar
llvm-svn: 247109
2015-09-09 00:38:33 +00:00
Reid Kleckner 51189f0a1d [WinEH] Avoid creating MBBs for LLVM BBs that cannot contain code
Typically these are catchpads, which hold data used to decide whether to
catch the exception or continue unwinding. We also shouldn't create MBBs
for catchendpads, cleanupendpads, or terminatepads, since no real code
can live in them.

This fixes a problem where MI passes (like the register allocator) would
try to put code into catchpad blocks, which are not executed by the
runtime. In the new world, blocks ending in invokes now have many
possible successors.

llvm-svn: 247102
2015-09-08 23:28:38 +00:00
Reid Kleckner df1295173f [WinEH] Emit prologues and epilogues for funclets
Summary:
32-bit funclets have short prologues that allocate enough stack for the
largest call in the whole function. The runtime saves CSRs for the
funclet. It doesn't restore CSRs after we finally transfer control back
to the parent funciton via a CATCHRET, but that's a separate issue.
32-bit funclets also have to adjust the incoming EBP value, which is
what llvm.x86.seh.recoverframe does in the old model.

64-bit funclets need to spill CSRs as normal. For simplicity, this just
spills the same set of CSRs as the parent function, rather than trying
to compute different CSR sets for the parent function and each funclet.
64-bit funclets also allocate enough stack space for the largest
outgoing call frame, like 32-bit.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12546

llvm-svn: 247092
2015-09-08 22:44:41 +00:00
Dan Gohman e32c57443f [WebAssembly] Support running without a register allocator in the default CodeGen passes
This allows backends which don't use a traditional register allocator,
but do need PHI lowering and other passes, to use the default
TargetPassConfig::addFastRegAlloc and
TargetPassConfig::addOptimizedRegAlloc implementations.

Differential Revision: http://reviews.llvm.org/D12691

llvm-svn: 247065
2015-09-08 20:36:33 +00:00
Hal Finkel 10aac5fd0e [SelectionDAG] Swap commutative binops before constant-based folding
In searching for a fix for the underlying code-quality bug highlighted by
r246937 (that SDAG simplification can lead to us generating an ISD::OR node
with a constant zero LHS), I ran across this:

We generically canonicalize commutative binary-operation nodes in SDAG getNode
so that, if only one operand is a constant, it will be on the RHS.  However, we
were doing this only after a bunch of constant-based simplification checks that
all assume this canonical form (that any constant will be on the RHS). Moving
the operand-swapping canonicalization prior to these checks seems like the
right thing to do (and, as it turns out, causes SDAG to completely fold away the
computation in test/CodeGen/ARM/2012-11-14-subs_carry.ll, just like InstCombine
would do).

llvm-svn: 246938
2015-09-06 05:42:13 +00:00
Chad Rosier a67b2d0117 Typo. NFC.
llvm-svn: 246851
2015-09-04 12:34:55 +00:00
Reid Kleckner 1f13d4789f Sink COFF.h MC include into .cpp files
This prevents MC clients from getting COFF.h, which conflicts with
winnt.h macros. Also a minor IWYU cleanup. Now the only public headers
including COFF.h are in Object, and they actually need it.

llvm-svn: 246784
2015-09-03 16:41:50 +00:00
Sanjay Patel ce74db9d8d check for fastness before merging in DAGCombiner::MergeConsecutiveStores()
Use and check the 'IsFast' optional parameter to TLI.allowsMemoryAccess() any time
we have a merged access candidate. Without this patch, we were generating unaligned 
16-byte (SSE) memops for x86 targets where those accesses are slow.

This change was mentioned in:
http://reviews.llvm.org/D10662 and
http://reviews.llvm.org/D10905

and will help solve PR21711.

Differential Revision: http://reviews.llvm.org/D12573

llvm-svn: 246771
2015-09-03 15:03:19 +00:00
Joseph Tremoulet 9ce71f76b9 [WinEH] Add cleanupendpad instruction
Summary:
Add a `cleanupendpad` instruction, used to mark exceptional exits out of
cleanups (for languages/targets that can abort a cleanup with another
exception).  The `cleanupendpad` instruction is similar to the `catchendpad`
instruction in that it is an EH pad which is the target of unwind edges in
the handler and which itself has an unwind edge to the next EH action.
The `cleanupendpad` instruction, similar to `cleanupret` has a `cleanuppad`
argument indicating which cleanup it exits.  The unwind successors of a
`cleanuppad`'s `cleanupendpad`s must agree with each other and with its
`cleanupret`s.

Update WinEHPrepare (and docs/tests) to accomodate `cleanupendpad`.

Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12433

llvm-svn: 246751
2015-09-03 09:09:43 +00:00
Sanjay Patel 42574203e5 use "unpredictable" metadata in fast-isel when splitting compares
This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch.

Differential Revision: http://reviews.llvm.org/D12342

llvm-svn: 246692
2015-09-02 19:23:23 +00:00
Sanjay Patel fff7c6dc73 use "unpredictable" metadata in SelectionDAG when splitting compares
This patch uses the metadata defined in D12341 to avoid creating an unpredictable branch.

Differential Revision: http://reviews.llvm.org/D12343

llvm-svn: 246691
2015-09-02 19:17:25 +00:00
Elena Demikhovsky 1b9d6914d3 Optimization for Gather/Scatter with uniform base
Vector 'getelementptr' with scalar base is an opportunity for gather/scatter intrinsic to generate a better sequence.
While looking for uniform base, we want to use the scalar base pointer of GEP, if exists.

Differential Revision: http://reviews.llvm.org/D11121

llvm-svn: 246622
2015-09-02 08:39:13 +00:00
Silviu Baranga 6d3f05c04b [ARM][AArch64] Turn on by default interleaved access lowering
Summary:
Interleaved access lowering removes a memory operation and a
sequence of vector shuffles and replaces it with a series of
memory operations. This should be always beneficial.

This pass in only enabled on ARM/AArch64.

Reviewers: rengolin

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D12145

llvm-svn: 246540
2015-09-01 11:12:35 +00:00
Cong Hou 511298b919 Distribute the weight on the edge from switch to default statement to edges generated in lowering switch.
Currently, when edge weights are assigned to edges that are created when lowering switch statement, the weight on the edge to default statement (let's call it "default weight" here) is not considered. We need to distribute this weight properly. However, without value profiling, we have no idea how to distribute it. In this patch, I applied the heuristic that this weight is evenly distributed to successors.

For example, given a switch statement with cases 1,2,3,5,10,11,20, and every edge from switch to each successor has weight 10. If there is a binary search tree built to test if n < 10, then its two out-edges will have weight 4x10+10/2 = 45 and 3x10 + 10/2 = 35 respectively (currently they are 40 and 30 without considering the default weight). Each distribution (which is 5 here) will be stored in each SwitchWorkListItem for further distribution.

There are some exceptions:

For a jump table header which doesn't have any edge to default statement, we don't distribute the default weight to it.
For a bit test header which covers a contiguous range and hence has no edges to default statement, we don't distribute the default weight to it.
When the branch checks a single value or a contiguous range with no edge to default statement, we don't distribute the default weight to it.
In other cases, the default weight is evenly distributed to successors.

Differential Revision: http://reviews.llvm.org/D12418

llvm-svn: 246522
2015-09-01 01:42:16 +00:00
Hal Finkel 1baec5323b [DAGCombine] Fixup SETCC legality checking
SETCC is one of those special node types for which operation actions (legality,
etc.) is keyed off of an operand type, not the node's value type. This makes
sense because the value type of a legal SETCC node is determined by its
operands' value type (via the TLI function getSetCCResultType). When the
SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value
type, or directly with the value type provided by TLI.getSetCCResultType.

The first problem being fixed here is that DAGCombine had several places
querying TLI.isOperationLegal on SETCC, but providing the return of
getSetCCResultType, instead of the operand type directly. This does not mean
what the author thought, and "luckily", most in-tree targets have SETCC with
Custom lowering, instead of marking them Legal, so these checks return false
anyway.

The second problem being fixed here is that two of the DAGCombines could create
SETCC nodes with arbitrary (integer) value types; specifically, those that
would simplify:

  (setcc a, b, op1) and|or (setcc a, b, op2) -> setcc a, b, op3
     (which is possible for some combinations of (op1, op2))

If the operands of the and|or node are actual setcc nodes, then this is not an
issue (because the and|or must share the same type), but, the relevant code in
DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls
DAGCombiner::isSetCCEquivalent on each operand, and that function will
recognise setcc-like select_cc nodes with other return types. And, thus, when
creating new SETCC nodes, we need to be careful to respect the value-type
constraint. This is even true before type legalization, because it is quite
possible for the SELECT_CC node to have a legal type that does not happen to
match the corresponding TLI.getSetCCResultType type.

To be explicit, there is nothing that later fixes the value types of SETCC
nodes (if the type is legal, but does not happen to match
TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to
work only because, either MVT::i1 is not legal, or it is what
TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change,
however. For the time being, restrict the relevant transformations to produce
only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1
prior to type legalization).

Fixes PR24636.

llvm-svn: 246507
2015-08-31 23:15:04 +00:00
Sanjay Patel 719b3e6a3e don't set a legal vector type if we know we can't use that type (NFCI)
Added benefit: the 'if' logic now matches the text of the comment that describes it.

llvm-svn: 246506
2015-08-31 22:59:03 +00:00
Sanjay Patel 218cbd5a48 generalize helper function of MergeConsecutiveStores to handle vector types (NFCI)
This was part of D7208 (r227242), but that commit was reverted because it exposed
a bug in AArch64 lowering. I should have that fixed and the rest of the commit
reinstated soon.

llvm-svn: 246493
2015-08-31 21:50:16 +00:00
Hal Finkel 2483f2060a [DAGCombine] Use getSetCCResultType utility function
DAGCombine has a utility wrapper around TLI's getSetCCResultType; use it in the
one place in DAGCombine still directly calling the TLI function. NFC.

llvm-svn: 246482
2015-08-31 20:42:38 +00:00
Reid Kleckner e00faf8ce1 [EH] Handle non-Function personalities like unknown personalities
Also delete and simplify a lot of MachineModuleInfo code that used to be
needed to handle personalities on landingpads.  Now that the personality
is on the LLVM Function, we no longer need to track it this way on MMI.
Certainly it should not live on LandingPadInfo.

llvm-svn: 246478
2015-08-31 20:02:16 +00:00
Hal Finkel a894266d28 [DAGCombine] Remove some old dead code for forming SETCC nodes
This code was dead when it was committed in r23665 (Oct 7, 2005), and before it
reaches its 10th anniversary, it really should go. We can always bring it back
if we'd like, but it forms more SETCC nodes, and the way we do legality
checking on SETCC nodes is wrong in a number of places, and removing this means
fewer places to fix. NFC.

llvm-svn: 246466
2015-08-31 18:38:55 +00:00
Kit Barton d3cc1678e8 Rework of the new interface for shrink wrapping
Based on comments from Hal
(http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html),
I've changed the interface to add a callback mechanism to the
TargetFrameLowering class to query whether the specific target
supports shrink wrapping.  By default, shrink wrapping is disabled by
default. Each target can override the default behaviour using the
TargetFrameLowering::targetSupportsShrinkWrapping() method. Shrink
wrapping can still be explicitly enabled or disabled from the command
line, using the existing -enable-shrink-wrap=<true|false> option.

Phabricator: http://reviews.llvm.org/D12293
llvm-svn: 246463
2015-08-31 18:26:45 +00:00
Hal Finkel e0a28e54c7 [AggressiveAntiDepBreaker] Check for EarlyClobber on defining instruction
AggressiveAntiDepBreaker was doing some EarlyClobber checking, but was not
checking that the register being potentially renamed was defined by an
early-clobber def where there was also a use, in that instruction, of the
register being considered as the target of the rename. Fixes PR24014.

llvm-svn: 246423
2015-08-31 07:51:36 +00:00
Peter Collingbourne 592ee15e14 Support: Support LLVM_ENABLE_THREADS=0 in llvm/Support/thread.h.
Specifically, the header now provides llvm::thread, which is either a
typedef of std::thread or a replacement that calls the function synchronously
depending on the value of LLVM_ENABLE_THREADS.

llvm-svn: 246402
2015-08-31 00:09:01 +00:00
Renato Golin 3b1d3b0d84 Revert "Revert "New interface function is added to VectorUtils Value *getSplatValue(Value *Val);""
This reverts commit r246379. It seems that the commit was not the culprit,
and the bot will be investigated for instability.

llvm-svn: 246380
2015-08-30 10:49:04 +00:00
Renato Golin c7be31736c Revert "New interface function is added to VectorUtils Value *getSplatValue(Value *Val);"
This reverts commit r246371, as it cause a rather obscure bug in AArch64
test-suite paq8p (time outs, seg-faults). I'll investigate it before
reapplying.

llvm-svn: 246379
2015-08-30 10:05:30 +00:00
Elena Demikhovsky a59fcfa56b New interface function is added to VectorUtils
Value *getSplatValue(Value *Val);

It complements the CreateVectorSplat(), which creates 2 instructions - insertelement and shuffle with all-zero mask.

The new function recognizes the pattern - insertelement+shuffle and returns the splat value (or nullptr).
It also returns a splat value form ConstantDataVector, for completeness.

Differential Revision:	http://reviews.llvm.org/D11124

llvm-svn: 246371
2015-08-30 07:28:18 +00:00
Fiona Glaser 934765c1df SelectionDAG: add missing ComputeSignBits case for SELECT_CC
Identical to SELECT, just with different operand numbers.

llvm-svn: 246366
2015-08-29 23:04:38 +00:00
Peter Collingbourne 79bf113dca Fix shared library build.
llvm-svn: 246365
2015-08-29 22:34:34 +00:00
Duncan P. N. Exon Smith 0660bcda53 AsmPrinter: Allow null subroutine type
Currently the DWARF backend requires that subprograms have a type, and
the type is ignored if it has an empty type array.  The long term
direction here -- see PR23079 -- is instead to skip the type entirely if
there's no valid type.

It turns out we have cases in tree of missing types on subprograms, but
since they're not referenced by compile units, the backend never crashes
on them.  One option would be to add a Verifier check that subprograms
have types, and fix the bitrot.  However, this is a fair bit of churn
(20-30 testcases) that would be reversed anyway by PR23079.

I found this inconsistency because of a WIP patch and upgrade script for
PR23367 that started crashing on test/DebugInfo/2010-10-01-crash.ll.
This commit updates the testcase to reference the subprogram from the
compile unit, and fixes the resulting crash (in line with the direction
of PR23079).  This also updates `DIBuilder` to stop assuming a non-null
pointer for the subroutine types.

llvm-svn: 246333
2015-08-28 21:38:24 +00:00
David Majnemer 0a92f86fe6 Revert r246232 and r246304.
This reverts isSafeToSpeculativelyExecute's use of ReadNone until we
split ReadNone into two pieces: one attribute which reasons about how
the function reasons about memory and another attribute which determines
how it may be speculated, CSE'd, trap, etc.

llvm-svn: 246331
2015-08-28 21:13:39 +00:00
Matt Arsenault d9c830154f Make MergeConsecutiveStores look at other stores on same chain
When combiner AA is enabled, look at stores on the same chain.
Non-aliasing stores are moved to the same chain so the existing
code fails because it expects to find an adajcent store on a consecutive
chain.

Because of how DAGCombiner tries these store combines,
MergeConsecutiveStores doesn't see the correct set of stores on the chain
when it visits the other stores. Each store individually has its chain
fixed before trying to merge consecutive stores, and then tries to merge
stores from that point before the other stores have been processed to
have their chains fixed. To fix this, attempt to use FindBetterChain
on any possibly neighboring stores in visitSTORE.

Suppose you have 4 32-bit stores that should be merged into 1 vector
store. One store would be visited first, fixing the chain. What happens is
because not all of the store chains have yet been fixed, 2 of the stores
are merged. The other 2 stores later have their chains fixed,
but because the other stores were already merged, they have different
memory types and merging the two different sized stores is not
supported and would be more difficult to handle.

llvm-svn: 246307
2015-08-28 17:31:28 +00:00
David Majnemer a787de3227 [CodeGen] isInTailCallPosition didn't consider readnone tailcalls
A readnone tailcall may still have a chain of computation which follows
it that would invalidate a tailcall lowering.  Don't skip the analysis
in such cases.

This fixes PR24613.

llvm-svn: 246304
2015-08-28 16:44:09 +00:00
NAKAMURA Takumi bc3af7b031 LLVMCodeGen: Update libdeps corresponding to r246236.
llvm-svn: 246274
2015-08-28 05:38:49 +00:00
Ahmed Bougacha f9c19da03a [CodeGen] Support (and default to) expanding READCYCLECOUNTER to 0.
For targets that didn't support this, this will let us respect the
langref instead of failing to select.

Note that we don't need to change the 32-bit x86/PPC lowerings (to
account for the result type/# difference) because they're both
custom and bypass type legalization.

llvm-svn: 246258
2015-08-28 01:49:59 +00:00
Joseph Tremoulet ec18285b91 [WinEH] Update coloring to handle nested cases cleanly
Summary:
Change the coloring algorithm in WinEHPrepare to visit a funclet's exits
in its parents' contexts and so properly classify the continuations of
nested funclets.

Also change the placement of cloned blocks to be deterministic and to
maintain the relative order of each funclet's blocks.

Add a lit test showing various patterns that require cloning, the last
several of which don't have CHECKs yet because they require cloning
entire funclets which is NYI.


Reviewers: rnk, andrew.w.kaylor, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12353

llvm-svn: 246245
2015-08-28 01:12:35 +00:00
Peter Collingbourne c269ed5115 CodeGen: Introduce splitCodeGen and teach LTOCodeGenerator to use it.
llvm::splitCodeGen is a function that implements the core of parallel LTO
code generation. It uses llvm::SplitModule to split the module into linkable
partitions and spawning one code generation thread per partition. The function
produces multiple object files which can be linked in the usual way.

This has been threaded through to LTOCodeGenerator (and llvm-lto for testing
purposes). Separate patches will add parallel LTO support to the gold plugin
and lld.

Differential Revision: http://reviews.llvm.org/D12260

llvm-svn: 246236
2015-08-27 23:37:36 +00:00
Reid Kleckner 0e2882345d [WinEH] Add some support for code generating catchpad
We can now run 32-bit programs with empty catch bodies.  The next step
is to change PEI so that we get funclet prologues and epilogues.

llvm-svn: 246235
2015-08-27 23:27:47 +00:00
Ahmed Bougacha 87166905c8 [CodeGen] Check FoldConstantArithmetic result before using it.
Fixes PR24602: r245689 introduced an unguarded use of
SelectionDAG::FoldConstantArithmetic, which returns 0 when it fails
because of opaque (hoisted) constants.

llvm-svn: 246217
2015-08-27 21:46:04 +00:00
Cong Hou 08cb4fc688 Fixed a bug that edge weights are not assigned correctly when lowering switch statement.
This is a one-line-change patch that moves the update to UnhandledWeights to the correct position: it should be updated for all clusters instead of just range clusters.

Differential Revision: http://reviews.llvm.org/D12391

llvm-svn: 246129
2015-08-27 00:37:40 +00:00
Cong Hou 03127700d5 Assign weights to edges to jump table / bit test header when lowering switch statement.
Currently, when lowering switch statement and a new basic block is built for jump table / bit test header, the edge to this new block is not assigned with a correct weight. This patch collects the edge weight from all its successors and assign this sum of weights to the edge (and also the other fall-through edge). Test cases are adjusted accordingly.

Differential Revision: http://reviews.llvm.org/D12166#fae6eca7

llvm-svn: 246104
2015-08-26 23:15:32 +00:00
Matthias Braun 4e7ded834f SelectionDAGBuilder: Fix SPDescriptor not resetting GuardReg
This was causing problems when some functions use a GuardReg and some
don't as can happen when mixing SelectionDAG and FastISel generated
functions.

llvm-svn: 246075
2015-08-26 20:46:52 +00:00
Matthias Braun 4816b18d86 FastISel: Avoid adding a successor block twice for degenerate IR.
This fixes http://llvm.org/PR24581

Differential Revision: http://reviews.llvm.org/D12350

llvm-svn: 246074
2015-08-26 20:46:49 +00:00
Matthias Braun 17af607796 FastISel: Factor out common code; NFC intended
This should be no functional change but for the record: For three cases
in X86FastISel this will change the order in which the FalseMBB and
TrueMBB of a conditional branch is addedd to the successor/predecessor
lists.

llvm-svn: 245997
2015-08-26 01:38:00 +00:00
Charles Davis 119525914c Make variable argument intrinsics behave correctly in a Win64 CC function.
Summary:
This change makes the variable argument intrinsics, `llvm.va_start` and
`llvm.va_copy`, and the `va_arg` instruction behave as they do on Windows
inside a `CallingConv::X86_64_Win64` function. It's needed for a Clang patch
I have to add support for GCC's `__builtin_ms_va_list` constructs.

Reviewers: nadav, asl, eugenis

CC: llvm-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1622

llvm-svn: 245990
2015-08-25 23:27:41 +00:00
Matthias Braun 130bd90e17 MachineBasicBlock: Use MCPhysReg instead of unsigned in livein API
This is friendlier to the readers as it makes it clear that the API is
not meant for vregs but just for physregs.

llvm-svn: 245977
2015-08-25 22:05:55 +00:00
Cong Hou cd59591396 Remove the final bit test during lowering switch statement if all cases in bit test cover a contiguous range.
When lowering switch statement, if bit tests are used then LLVM will always generates a jump to the default statement in the last bit test. However, this is not necessary when all cases in bit tests cover a contiguous range. This is because when generating the bit tests header MBB, there is a range check that guarantees cases in bit tests won't go outside of [low, high], where low and high are minimum and maximum case values in the bit tests. This patch checks if this is the case and then doesn't emit jump to default statement and hence saves a bit test and a branch.

Differential Revision: http://reviews.llvm.org/D12249

llvm-svn: 245976
2015-08-25 21:34:38 +00:00
David Blaikie d486000387 Fix dropped conditional in cleanup in r245752
Code review feedback by Charlie Turner.

llvm-svn: 245954
2015-08-25 17:01:36 +00:00
Steve King 5cdbd20cc3 Pass function attributes instead of boolean in isIntDivCheap().
llvm-svn: 245921
2015-08-25 02:31:21 +00:00
Matthias Braun 1b50bb58a1 Try to fix buildbots
Apparently std::vector::erase(const_iterator) (as opposed to the
non-const iterator) is a part of C++11 but it seems this is not available
on all the buildbots.

llvm-svn: 245900
2015-08-24 23:30:39 +00:00
Matthias Braun 7a8b1150bf Let's try to fix GNU libstdc++ buildbots
llvm-svn: 245898
2015-08-24 23:19:39 +00:00
Matthias Braun b2b7ef1de8 MachineBasicBlock: Add liveins() method returning an iterator_range
llvm-svn: 245895
2015-08-24 22:59:52 +00:00
Dan Gohman 7b63484b99 [WebAssembly] Skeleton FastISel support
llvm-svn: 245860
2015-08-24 18:44:37 +00:00
Oliver Stannard 284f2bffc9 Add DAG optimisation for FP16_TO_FP
The FP16_TO_FP node only uses the bottom 16 bits of its input, so the
following pattern can be optimised by removing the AND:

  (FP16_TO_FP (AND op, 0xffff)) -> (FP16_TO_FP op)

This is a common pattern for ARM targets when functions have __fp16
arguments, as they are passed as floats (so that they get passed in the
correct registers), but then bitcast and truncated to ignore the top 16
bits.

llvm-svn: 245832
2015-08-24 09:47:45 +00:00
Simon Pilgrim 2a7049abe0 [DAGCombiner] Fold CONCAT_VECTORS of bitcasted EXTRACT_SUBVECTOR
Minor generalization of D12125 - peek through any bitcast to the original vector that we're extracting from.

llvm-svn: 245814
2015-08-23 15:22:14 +00:00
Mehdi Amini 5aa7bd7d62 Do not use dyn_cast<> after isa<>
Reported by coverity.

From: Mehdi Amini <mehdi.amini@apple.com>
llvm-svn: 245799
2015-08-23 00:27:57 +00:00
Joseph Tremoulet 8220bcc570 [WinEH] Require token linkage in EH pad/ret signatures
Summary:
WinEHPrepare is going to require that cleanuppad and catchpad produce values
of token type which are consumed by any cleanupret or catchret exiting the
pad.  This change updates the signatures of those operators to require/enforce
that the type produced by the pads is token type and that the rets have an
appropriate argument.

The catchpad argument of a `CatchReturnInst` must be a `CatchPadInst` (and
similarly for `CleanupReturnInst`/`CleanupPadInst`).  To accommodate that
restriction, this change adds a notion of an operator constraint to both
LLParser and BitcodeReader, allowing appropriate sentinels to be constructed
for forward references and appropriate error messages to be emitted for
illegal inputs.

Also add a verifier rule (noted in LangRef) that a catchpad with a catchpad
predecessor must have no other predecessors; this ensures that WinEHPrepare
will see the expected linear relationship between sibling catches on the
same try.

Lastly, remove some superfluous/vestigial casts from instruction operand
setters operating on BasicBlocks.

Reviewers: rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12108

llvm-svn: 245797
2015-08-23 00:26:33 +00:00
David Blaikie 47bf5c019d Range-for-ify some things in GlobalMerge
llvm-svn: 245752
2015-08-21 22:19:06 +00:00
David Blaikie 9ed57a9ef0 [opaque pointer types] Fix a few easy places in GlobalMerge that were accessing value types through pointee types
llvm-svn: 245746
2015-08-21 22:00:44 +00:00
Alex Lorenz c1136ef3b8 MIR Serialization: Serialize the pointer IR expression values in the machine
memory operands.

llvm-svn: 245745
2015-08-21 21:54:12 +00:00
Alex Lorenz 5d8b0bd9b0 MIRParser: Split the 'parseIRConstant' method into two methods. NFC.
One variant of this method can be reused when parsing the quoted IR pointer
expressions in the machine memory operands.

llvm-svn: 245743
2015-08-21 21:48:22 +00:00
Alex Lorenz f22ca8ad35 MIR Serialization: Print MCSymbol operands.
This commit allows the MIR printer to print the MCSymbol machine operands.
Unfortunately they can't be parsed at this time. I will create a bug that will
track the fact that the MCSymbol operands can't be parsed yet.

llvm-svn: 245737
2015-08-21 21:12:44 +00:00
Yaron Keren 528d8d6092 Disable Visual C++ 2013 Debug mode assert on null pointer in some STL algorithms,
such as std::equal on the third argument. This reverts previous workarounds.

Predefining _DEBUG_POINTER_IMPL disables Visual C++ 2013 headers from defining
it to a function performing the null pointer check. In practice, it's not that
bad since any function actually using the nullptr will seg fault. The other
iterator sanity checks remain enabled in the headers.

Reviewed by Aaron Ballmanþ and Duncan P. N. Exon Smith.

llvm-svn: 245711
2015-08-21 17:31:03 +00:00
John Brawn eab960c46f [DAGCombiner] Fold together mul and shl when both are by a constant
This is intended to improve code generation for GEPs, as the index value is
shifted by the element size and in GEPs of multi-dimensional arrays the index
of higher dimensions is multiplied by the lower dimension size.

Differential Revision: http://reviews.llvm.org/D12197

llvm-svn: 245689
2015-08-21 10:48:17 +00:00
Benjamin Kramer fcdb1c14ac Make helper functions static. NFC.
llvm-svn: 245549
2015-08-20 09:57:22 +00:00
Alex Lorenz 36efd3883d MIR Serialization: Use the global value syntax for global value memory operands.
This commit modifies the serialization syntax so that the global IR values in
machine memory operands use the global value '@<name>' syntax instead of the
current '%ir.<name>' syntax.

The unnamed global IR values are handled by this commit as well, as the
existing global value parsing method can parse the unnamed globals already.

llvm-svn: 245527
2015-08-20 00:20:03 +00:00
Alex Lorenz 0d009645a1 MIR Serialization: Change syntax for the call entry pseudo source values.
The global IR values in machine memory operands should use the global value
'@<name>' syntax instead of the current '%ir.<name>' syntax.

However, the global value call entry pseudo source values use the global value
syntax already. Therefore, the syntax for the call entry pseudo source values
has to be changed so that the global values and call entry global value PSVs
can be parsed without ambiguities.

llvm-svn: 245526
2015-08-20 00:12:57 +00:00
Alex Lorenz dbd22a9a6c Fix test failure introduced by r245521.
Machine memory operands can contain pointer values that are constants, and
the 'getLocalSlot' method requires non-constant values.

The constant pointer values will have to be serialized in a different patch.

llvm-svn: 245523
2015-08-19 23:56:37 +00:00
Alex Lorenz dd13be0bcc MIR Serialization: Serialize unnamed local IR values in memory operands.
llvm-svn: 245521
2015-08-19 23:31:05 +00:00
Alex Lorenz 36593ac51b MIR Parser: parseIRValue should take in a constant pointer. NFC.
llvm-svn: 245520
2015-08-19 23:27:07 +00:00
Alex Lorenz 55dc6f8165 MIR Printer: Extract the code that prints IR slots to a separate function. NFC.
This code can be reused when printing references to unnamed local IR values.

llvm-svn: 245519
2015-08-19 23:24:37 +00:00
Simon Pilgrim 35f528262f [DAGCombiner] Added SMAX/SMIN/UMAX/UMIN constant folding
We still need to add constant folding of vector comparisons to fold the tests for targets that don't support the respective min/max nodes

I needed to update 2011-12-06-AVXVectorExtractCombine to load a vector instead of using a constant vector to prevent it folding

Differential Revision: http://reviews.llvm.org/D12118

llvm-svn: 245503
2015-08-19 21:11:58 +00:00
Simon Pilgrim 989cbbd2f5 [DAGCombiner] Fold CONCAT_VECTORS of EXTRACT_SUBVECTOR (or undef) to VECTOR_SHUFFLE.
Check to see if this is a CONCAT_VECTORS of a bunch of EXTRACT_SUBVECTOR operations. If so, and if the EXTRACT_SUBVECTOR vector inputs come from at most two distinct vectors the same size as the result, attempt to turn this into a legal shuffle.

Differential Revision: http://reviews.llvm.org/D12125

llvm-svn: 245490
2015-08-19 20:09:50 +00:00
Alex Lorenz feb6b4395b MIR Parser: Rename 'MachineOperandWithLocation' to 'ParsedMachineOperand'. NFC.
Besides storing the operand's source range, this structure now stores other
attributes as well, so the name should reflect this fact.

llvm-svn: 245483
2015-08-19 19:19:16 +00:00
Alex Lorenz 5ef93b0c4c MIR Serialization: Serialize instruction's register ties.
This commit serializes the machine instruction's register operand ties.
The ties are printed out only when the instructon has register ties that are
different from the ties that are specified in the instruction's description.

llvm-svn: 245482
2015-08-19 19:05:34 +00:00
Alex Lorenz e66a7ccf77 MIR Serialization: Serialize defined registers that require 'def' register flag.
The defined registers are already serialized - they are represented by placing
them before the '=' in a machine instruction. However, certain instructions like
INLINEASM can have defined register operands after the '=', so this commit
introduces the 'def' register flag for such operands.

llvm-svn: 245480
2015-08-19 18:55:47 +00:00
Bruno Cardoso Lopes 27fd06922b [PeepholeOptimizer] Look through PHIs to find additional register sources
Reintroduce r245442. Remove an overly conservative assertion introduced
in r245442. We could replace the assertion to use `shareSameRegisterFile`
instead, but in that point in `insertPHI` we already lost the original
Def subreg to check against. So drop the assertion completely.

Original commit message:

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 245479
2015-08-19 18:53:36 +00:00
Bruno Cardoso Lopes 61009142b8 Revert "[PeepholeOptimizer] Look through PHIs to find additional register sources"
Revert r245442 while investigating a fix. An assertion hit in
http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/11380

llvm-svn: 245446
2015-08-19 15:10:32 +00:00
Bruno Cardoso Lopes 0a1c126684 [PeepholeOptimizer] Look through PHIs to find additional register sources
Reapply r243486.

- Teaches the ValueTracker in the PeepholeOptimizer to look through PHI
instructions.
- Add findNextSourceAndRewritePHI method to lookup into multiple sources
returnted by the ValueTracker and rewrite PHIs with new sources.

With these changes we can find more register sources and rewrite more
copies to allow coaslescing of bitcast instructions. Hence, we eliminate
unnecessary VR64 <-> GR64 copies in x86, but it could be extended to
other archs by marking "isBitcast" on target specific instructions. The
x86 example follows:

A:
  psllq %mm1, %mm0
  movd  %mm0, %r9
  jmp C

B:
  por %mm1, %mm0
  movd  %mm0, %r9
  jmp C

C:
  movd  %r9, %mm0
  pshufw  $238, %mm0, %mm0

Becomes:

A:
  psllq %mm1, %mm0
  jmp C

B:
  por %mm1, %mm0
  jmp C

C:
  pshufw  $238, %mm0, %mm0

Differential Revision: http://reviews.llvm.org/D11197
rdar://problem/20404526

llvm-svn: 245442
2015-08-19 14:34:41 +00:00
Daniel Sanders 1e97a0b324 Emit <regmask R1 R2 R3 ...> instead of just <regmask> in IR dumps.
Reviewers: qcolombet

Subscribers: kparzysz, qcolombet, llvm-commits

Differential Revision: http://reviews.llvm.org/D11644

llvm-svn: 245433
2015-08-19 12:03:04 +00:00
Michael Kuperstein dcdab4cd3a [TLI] Refactor "is integer division cheap" queries.
This removes the isPow2SDivCheap() query, as it is not currently used in
any meaningful way. isIntDivCheap() no longer relies on a state variable
(as all in-tree target set it to false), but the interface allows querying
based on the type optimization level.

NFC.

Differential Revision: http://reviews.llvm.org/D12082

llvm-svn: 245430
2015-08-19 11:17:59 +00:00
Alex Lorenz df9e3c6fb0 MIR Serialization: Serialize MMI's variable debug information.
llvm-svn: 245396
2015-08-19 00:13:25 +00:00
Steve King d4c8f70ce1 Fix backward operands in call to isTruncateFree() and improve comments.
llvm-svn: 245385
2015-08-18 23:02:41 +00:00
Alex Lorenz 607efb6c7e MIR Parser: Return true on error when parsing standalone registers.
llvm-svn: 245384
2015-08-18 22:57:36 +00:00
Alex Lorenz f3630113cd MIR Serialization: Serialize the operand's bit mask target flags.
This commit adds support for bit mask target flag serialization to the MIR
printer and the MIR parser. It also adds support for the machine operand's
target flag serialization to the AArch64 target.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 245383
2015-08-18 22:52:15 +00:00
Nick Lewycky 06b0ea2e8f Fix three typos in comments; "easilly" -> "easily".
llvm-svn: 245379
2015-08-18 22:41:58 +00:00
Alex Lorenz a314d81328 MIR Serialization: Serialize the frame information's stack protector index.
llvm-svn: 245372
2015-08-18 22:26:26 +00:00
Alex Lorenz dc9dadf683 MIR Parser: Extract the code that parses stack object references into a new
method.

This commit extracts the code that parses the stack object references into a
new method named 'parseStackFrameIndex', so that it can be reused when
parsing standalone stack object references.

llvm-svn: 245370
2015-08-18 22:18:52 +00:00
Matthias Braun fa3b248a66 DAGCombiner: Improve DAGCombiner select normalization
The current code normalizes select(C0, x, select(C1, x, y)) towards
select(C0|C1, x, y) if the targets prefers that form. This patch adds an
additional rule that if the select(C1, x, y) part already exists in the
function then we want to normalize into the other direction because the
effects of reusing the existing value are bigger than transforming into
the target preferred form.

This addresses regressions following r238793, see also:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20150727/290272.html

Differential Revision: http://reviews.llvm.org/D11616

llvm-svn: 245350
2015-08-18 20:48:36 +00:00
Matthias Braun 2e920bd04f DAGCombiner: Optimize SELECTs first before turning them into SELECT_CC
This is part of http://reviews.llvm.org/D11616 - I just decided to split
this up into a separate commit.

llvm-svn: 245349
2015-08-18 20:48:29 +00:00
David Majnemer 0ad363eebc [WinEH] Calculate state numbers for the new EH representation
State numbers are calculated by performing a walk from the innermost
funclet to the outermost funclet.   Rudimentary support for the new EH
constructs has been added to the assembly printer, just enough to test
the new machinery.

Differential Revision: http://reviews.llvm.org/D12098

llvm-svn: 245331
2015-08-18 19:07:12 +00:00
Matthias Braun d55bcf2646 MachineRegisterInfo: Introduce isPhysRegUsed()
This method checks whether a physical regiser or any of its aliases are
used in the function.

Using this function in SIRegisterInfo::findUnusedReg() should also fix
this reported failure:

http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150803/292143.html
http://reviews.llvm.org/rL242173#inline-533

The report doesn't come with a testcase and I don't know enough about
AMDGPU to create one myself.

llvm-svn: 245329
2015-08-18 18:54:27 +00:00
Alex Lorenz eb7c9be43c MIR Parser: Implicit register verifier should accept unexpected implicit
subregister operands.

llvm-svn: 245315
2015-08-18 17:17:13 +00:00
Sanjay Patel 1cd6d88e4d use minSize wrapper; NFCI
These were missed when other uses were switched over:
http://llvm.org/viewvc/llvm-project?view=revision&revision=243994

llvm-svn: 245311
2015-08-18 16:44:23 +00:00
Guozhi Wei f66d384443 Align SP adjustment in function getSPAdjust
This commit adds a new function TargetFrameLowering::alignSPAdjust
and calls it from TargetInstrInfo::getSPAdjust. It fixes PR24142.

llvm-svn: 245253
2015-08-17 22:36:27 +00:00
Alex Lorenz a56ba6a6dd MIR Serialization: Serialize the local offsets for the stack objects.
llvm-svn: 245249
2015-08-17 22:17:42 +00:00
Alex Lorenz eb62568625 MIR Serialization: Serialize the memory operand's range metadata node.
llvm-svn: 245247
2015-08-17 22:09:52 +00:00
Alex Lorenz 03e940d1f8 MIR Serialization: Serialize the memory operand's noalias metadata node.
llvm-svn: 245246
2015-08-17 22:08:02 +00:00
Alex Lorenz a16f624dc3 MIR Serialization: Serialize the memory operand's alias scope metadata node.
llvm-svn: 245245
2015-08-17 22:06:40 +00:00
Alex Lorenz a617c9162d MIR Serialization: Serialize the memory operand's TBAA metadata node.
llvm-svn: 245244
2015-08-17 22:05:15 +00:00
David Majnemer 83f4bb23c4 [WinEHPrepare] Replace unreasonable funclet terminators with unreachable
It is possible to be in a situation where more than one funclet token is
a valid SSA value.  If we see a terminator which exits a funclet which
doesn't use the funclet's token, replace it with unreachable.

Differential Revision: http://reviews.llvm.org/D12074

llvm-svn: 245238
2015-08-17 20:56:39 +00:00
Joseph Tremoulet 7031c9fc2e [WinEHPrepare] Fix catchret successor phi demotion
Summary:
When demoting an SSA value that has a use on a phi and one of the phi's
predecessors terminates with catchret, the edge needs to be split and the
load inserted in the new block, else we'll still have a cross-funclet SSA
value.

Add a test for this, and for the similar case where a def to be spilled is
on and invoke and a critical edge, which was already implemented but
missing a test.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D12065

llvm-svn: 245218
2015-08-17 13:51:37 +00:00
Tobias Grosser 58fdd88751 Revert "Disable targetdatalayoutcheck"
I committed by accident a local hack that should not have made it upstream.
Sorry for the noise.

llvm-svn: 245212
2015-08-17 10:58:03 +00:00
Tobias Grosser 607b8b26e9 Disable targetdatalayoutcheck
llvm-svn: 245210
2015-08-17 10:56:35 +00:00
James Molloy ef183397b1 Generate FMINNAN/FMINNUM/FMAXNAN/FMAXNUM from SDAGBuilder.
These only get generated if the target supports them. If one of the variants is not legal and the other is, and it is safe to do so, the other variant will be emitted.

For example on AArch32 (V8), we have scalar fminnm but not fmin.

Fix up a couple of tests while we're here - one now produces better code, and the other was just plain wrong to start with.

llvm-svn: 245196
2015-08-17 07:13:10 +00:00
Chandler Carruth 2f1fd1658f [PM] Port ScalarEvolution to the new pass manager.
This change makes ScalarEvolution a stand-alone object and just produces
one from a pass as needed. Making this work well requires making the
object movable, using references instead of overwritten pointers in
a number of places, and other refactorings.

I've also wired it up to the new pass manager and added a RUN line to
a test to exercise it under the new pass manager. This includes basic
printing support much like with other analyses.

But there is a big and somewhat scary change here. Prior to this patch
ScalarEvolution was never *actually* invalidated!!! Re-running the pass
just re-wired up the various other analyses and didn't remove any of the
existing entries in the SCEV caches or clear out anything at all. This
might seem OK as everything in SCEV that can uses ValueHandles to track
updates to the values that serve as SCEV keys. However, this still means
that as we ran SCEV over each function in the module, we kept
accumulating more and more SCEVs into the cache. At the end, we would
have a SCEV cache with every value that we ever needed a SCEV for in the
entire module!!! Yowzers. The releaseMemory routine would dump all of
this, but that isn't realy called during normal runs of the pipeline as
far as I can see.

To make matters worse, there *is* actually a key that we don't update
with value handles -- there is a map keyed off of Loop*s. Because
LoopInfo *does* release its memory from run to run, it is entirely
possible to run SCEV over one function, then over another function, and
then lookup a Loop* from the second function but find an entry inserted
for the first function! Ouch.

To make matters still worse, there are plenty of updates that *don't*
trip a value handle. It seems incredibly unlikely that today GVN or
another pass that invalidates SCEV can update values in *just* such
a way that a subsequent run of SCEV will incorrectly find lookups in
a cache, but it is theoretically possible and would be a nightmare to
debug.

With this refactoring, I've fixed all this by actually destroying and
recreating the ScalarEvolution object from run to run. Technically, this
could increase the amount of malloc traffic we see, but then again it is
also technically correct. ;] I don't actually think we're suffering from
tons of malloc traffic from SCEV because if we were, the fact that we
never clear the memory would seem more likely to have come up as an
actual problem before now. So, I've made the simple fix here. If in fact
there are serious issues with too much allocation and deallocation,
I can work on a clever fix that preserves the allocations (while
clearing the data) between each run, but I'd prefer to do that kind of
optimization with a test case / benchmark that shows why we need such
cleverness (and that can test that we actually make it faster). It's
possible that this will make some things faster by making the SCEV
caches have higher locality (due to being significantly smaller) so
until there is a clear benchmark, I think the simple change is best.

Differential Revision: http://reviews.llvm.org/D12063

llvm-svn: 245193
2015-08-17 02:08:17 +00:00
Sanjay Patel 3ab4a73bac use SDValue bool operator; NFCI
llvm-svn: 245181
2015-08-16 17:54:28 +00:00
Simon Pilgrim 0750c84623 [DAGCombiner] Attempt to mask vectors before zero extension instead of after.
For cases where we TRUNCATE and then ZERO_EXTEND to a larger size (often from vector legalization), see if we can mask the source data and then ZERO_EXTEND (instead of after a ANY_EXTEND). This can help avoid having to generate a larger mask, and possibly applying it to several sub-vectors.

(zext (truncate x)) -> (zext (and(x, m))

Includes a minor patch to SystemZ to better recognise 8/16-bit zero extension patterns from RISBG bit-extraction code.

This is the first of a number of minor patches to help improve the conversion of byte masks to clear mask shuffles.

Differential Revision: http://reviews.llvm.org/D11764

llvm-svn: 245160
2015-08-15 13:27:30 +00:00
James Y Knight 5567bafe93 Remove redundant TargetFrameLowering::getFrameIndexOffset virtual
function.

This was the same as getFrameIndexReference, but without the FrameReg
output.

Differential Revision: http://reviews.llvm.org/D12042

llvm-svn: 245148
2015-08-15 02:32:35 +00:00
Alex Lorenz 577d271a75 MIR Serialization: Serialize the '.cfi_same_value' CFI directive.
llvm-svn: 245103
2015-08-14 21:55:58 +00:00
Alex Lorenz c3ba7508f6 MIR Serialization: Serialize the external symbol call entry pseudo source
values.

llvm-svn: 245098
2015-08-14 21:14:50 +00:00
Alex Lorenz 50b826fb75 MIR Serialization: Serialize the global value call entry pseudo source values.
llvm-svn: 245097
2015-08-14 21:08:30 +00:00
Alex Lorenz 1039fd1ae5 MIR Serialization: Serialize the 'internal' register operand flag.
llvm-svn: 245085
2015-08-14 19:07:07 +00:00
Alex Lorenz f9a2b12361 MIR Serialization: Serialize the bundled machine instructions.
llvm-svn: 245082
2015-08-14 18:57:24 +00:00
Kit Barton ae78d53aeb Reverting patch r244235.
This patch will be redone in a different way. See
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150810/292978.html
for more details.

llvm-svn: 245071
2015-08-14 16:54:32 +00:00
Chandler Carruth 1db22822b4 [PM/AA] Hoist the interface to TBAA into a dedicated header along with
its creation function. Update the relevant includes accordingly.

llvm-svn: 245019
2015-08-14 03:33:48 +00:00
Chandler Carruth 42ff448fe4 [PM/AA] Hoist ScopedNoAliasAA's interface into a header and move the
creation function there.

Same basic refactoring as the other alias analyses. Nothing special
required this time around.

llvm-svn: 245012
2015-08-14 02:55:50 +00:00
Chandler Carruth 8b046a42f4 [PM/AA] Extract a minimal interface for CFLAA to its own header file.
I've used forward declarations and reorderd the source code some to make
this reasonably clean and keep as much of the code as possible in the
source file, including all the stratified set details. Just the basic AA
interface and the create function are in the header file, and the header
file is now included into the relevant locations.

llvm-svn: 245009
2015-08-14 02:42:20 +00:00
Alex Lorenz 5022f6bb81 MIR Serialization: Change MIR syntax - use custom syntax for MBBs.
This commit modifies the way the machine basic blocks are serialized - now the
machine basic blocks are serialized using a custom syntax instead of relying on
YAML primitives. Instead of using YAML mappings to represent the individual
machine basic blocks in a machine function's body, the new syntax uses a single
YAML block scalar which contains all of the machine basic blocks and
instructions for that function.

This is an example of a function's body that uses the old syntax:

    body:
      - id: 0
        name: entry
        instructions:
          - '%eax = MOV32r0 implicit-def %eflags'
          - 'RETQ %eax'
    ...

The same body is now written like this:

    body: |
      bb.0.entry:
        %eax = MOV32r0 implicit-def %eflags
        RETQ %eax
    ...

This syntax change is motivated by the fact that the bundled machine
instructions didn't map that well to the old syntax which was using a single
YAML sequence to store all of the machine instructions in a block. The bundled
machine instructions internally use flags like BundledPred and BundledSucc to
determine the bundles, and serializing them as MI flags using the old syntax
would have had a negative impact on the readability and the ease of editing
for MIR files. The new syntax allows me to serialize the bundled machine
instructions using a block construct without relying on the internal flags,
for example:

   BUNDLE implicit-def dead %itstate, implicit-def %s1 ... {
      t2IT 1, 24, implicit-def %itstate
      %s1 = VMOVS killed %s0, 1, killed %cpsr, implicit killed %itstate
   }

This commit also converts the MIR testcases to the new syntax. I developed
a script that can convert from the old syntax to the new one. I will post the
script on the llvm-commits mailing list in the thread for this commit.

llvm-svn: 244982
2015-08-13 23:10:16 +00:00
Alex Lorenz 6866104073 MIR Parser: Don't allow negative alignments for memory operands.
llvm-svn: 244953
2015-08-13 20:55:01 +00:00
Alex Lorenz 620f89145b MIR Parser: Extract the code that parses the alignment into a new method. NFC.
This commit extracts the code that parses the memory operand's alignment into
a new method named 'parseAlignment' so that it can be reused when parsing the
basic block's alignment attribute.

llvm-svn: 244945
2015-08-13 20:33:33 +00:00
Alex Lorenz 9b62cf6143 MIR Parser: Rename the method 'diagFromLLVMAssemblyDiag'. NFC.
This commit renames the method 'diagFromLLVMAssemblyDiag' to
'diagFromBlockStringDiag'. This method will be used when converting diagnostics
from other YAML block strings, and not just the LLVM module block string, so
the new name should reflect that.

llvm-svn: 244943
2015-08-13 20:30:11 +00:00
Joseph Tremoulet c9ff914ced [WinEHPrepare] Update demotion logic
Summary:
Update the demotion logic in WinEHPrepare to avoid creating new cleanups by
walking predecessors as necessary to insert stores for EH-pad PHIs.

Also avoid creating stores for EH-pad PHIs that have no uses.

The store/load placement is still pretty naive.  Likely future improvements
(at least for optimized compiles) include:
 - Share loads for related uses as possible
 - Coalesce non-interfering use/def-related PHIs
 - Store at definition point rather than each PHI pred for non-interfering
   lifetimes.


Reviewers: rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11955

llvm-svn: 244894
2015-08-13 14:30:10 +00:00
Ahmed Bougacha a196661bb0 [CodeGen] Mark the promoted FCOPYSIGN result FP_ROUND as TRUNCating.
Now that we can properly promote mismatched FCOPYSIGNs (r244858), we
can mark the FP_ROUND on the result as truncating, to expose folding.

FCOPYSIGN doesn't change anything but the sign bit, so
  (fp_round (fcopysign (fpext a), b))
is equivalent to (modulo the sign bit):
  (fp_round (fpext a))
which is a no-op.

llvm-svn: 244862
2015-08-13 01:32:30 +00:00
Ahmed Bougacha b5b0cfdff7 [CodeGen] Assert on getNode(FP_EXTEND) with a smaller dst type.
This would have caught the problem in r244858.

llvm-svn: 244859
2015-08-13 01:10:29 +00:00
Ahmed Bougacha 40ded502ff [CodeGen] When Promoting, don't extend the 2nd FCOPYSIGN operand.
We don't care about its type, and there's even a combine that'll fold
away the FP_EXTEND if we let it run. However, until it does, we'll have
something broken like:
  (f32 (fp_extend (f64 v)))

Scalar f16 follow-up to r243924.

llvm-svn: 244858
2015-08-13 01:09:43 +00:00
Ahmed Bougacha 31e0d9a2b1 [CodeGen] Simplify getNode(*EXT/TRUNC) type size assert. NFC.
We already check that vectors have the same number of elements, we
don't need to use the scalar types explicitly: comparing the size of
the whole vector is enough.

llvm-svn: 244857
2015-08-13 01:08:48 +00:00
Alex Lorenz 2791dcca60 MIR Parser: Allow the MI IR references to reference global values.
This commit fixes a bug where MI parser couldn't resolve the named IR
references that referenced named global values.

llvm-svn: 244817
2015-08-12 21:27:16 +00:00
Alex Lorenz 0cc671bf79 MIR Serialization: Serialize the fixed stack pseudo source values.
llvm-svn: 244816
2015-08-12 21:23:17 +00:00
Cong Hou 2a02c1cb1a NFC. Convert comments in MachineBasicBlock.cpp into new style.
llvm-svn: 244815
2015-08-12 21:18:54 +00:00
Alex Lorenz ea88212b41 MIR Parser: Move the parsing of fixed stack object indices into new method. NFC
This commit moves the code that parses the frame indices for the fixed stack
objects from the method 'parseFixedStackObjectOperand' to a new method named
'parseFixedStackFrameIndex', so that it can be reused when parsing fixed stack
pseudo source values.

llvm-svn: 244814
2015-08-12 21:17:02 +00:00
Alex Lorenz 4be56e9370 MIR Serialization: Serialize the jump table pseudo source values.
llvm-svn: 244813
2015-08-12 21:11:08 +00:00
Alex Lorenz d858f874fa MIR Serialization: Serialize the GOT pseudo source values.
llvm-svn: 244809
2015-08-12 21:00:22 +00:00
Alex Lorenz 46e9558ac6 MIR Serialization: Serialize the stack pseudo source values.
llvm-svn: 244806
2015-08-12 20:44:16 +00:00
Alex Lorenz 91097a3ffa MIR Serialization: Serialize the constant pool pseudo source values.
llvm-svn: 244803
2015-08-12 20:33:26 +00:00
John Brawn 75fc09ddba Redo "Make global aliases have symbol size equal to their type"
r242520 was reverted in r244313 as the expected behaviour of the alias
attribute in C is that the alias has the same size as the aliasee. However
we can re-introduce adding the size on the alias when the aliasee does not,
from a source code or object perspective, exist as a discrete entity. This
happens when the aliasee is not a symbol, or when that symbol is private.

Differential Revision: http://reviews.llvm.org/D11943

llvm-svn: 244752
2015-08-12 15:05:39 +00:00
John Brawn 0bef27d836 [GlobalMerge] Only emit aliases for internal linkage variables for non-Mach-O
On Mach-O emitting aliases for the variables that make up a MergedGlobals
variable can cause problems when linking with dead stripping enabled so don't
do that, except for external variables where we must emit an alias.

llvm-svn: 244748
2015-08-12 13:36:48 +00:00
Michael Kuperstein bc7f99a3ab [X86] Allow x86 call frame optimization to fold more loads into pushes
This abstracts away the test for "when can we fold across a MachineInstruction"
into the the MI interface, and changes call-frame optimization use the same test
the peephole optimizer users.

Differential Revision: http://reviews.llvm.org/D11945

llvm-svn: 244729
2015-08-12 10:14:58 +00:00
Alex Lorenz 5659a2f961 PseudoSourceValue: Transform the mips subclass to target independent subclasses
This commit transforms the mips-specific 'MipsCallEntry' subclass of the
'PseudoSourceValue' class into two, target-independent subclasses named
'GlobalValuePseudoSourceValue' and 'ExternalSymbolPseudoSourceValue'.

This change makes it easier to serialize the pseudo source values by removing
target-specific pseudo source values.

Reviewers: Akira Hatanaka
llvm-svn: 244698
2015-08-11 23:23:17 +00:00
Alex Lorenz e40c8a2b26 PseudoSourceValue: Replace global manager with a manager in a machine function.
This commit removes the global manager variable which is responsible for
storing and allocating pseudo source values and instead it introduces a new
manager class named 'PseudoSourceValueManager'. Machine functions now own an
instance of the pseudo source value manager class.

This commit also modifies the 'get...' methods in the 'MachinePointerInfo'
class to construct pseudo source values using the instance of the pseudo
source value manager object from the machine function.

This commit updates calls to the 'get...' methods from the 'MachinePointerInfo'
class in a lot of different files because those calls now need to pass in a
reference to a machine function to those methods.

This change will make it easier to serialize pseudo source values as it will
enable me to transform the mips specific MipsCallEntry PseudoSourceValue
subclass into two target independent subclasses.

Reviewers: Akira Hatanaka
llvm-svn: 244693
2015-08-11 23:09:45 +00:00
Alex Lorenz c49e4fe9cc PseudoSourceValue: Introduce a 'PSVKind' enumerator.
This commit introduces a new enumerator named 'PSVKind' in the
'PseudoSourceValue' class. This enumerator is now used to distinguish between
the various kinds of pseudo source values.

This change is done in preparation for the changes to the pseudo source value
object management and to the PseudoSourceValue's class hierarchy - the next two
PseudoSourceValue commits will get rid of the global variable that manages the
pseudo source values and the mips specific MipsCallEntry subclass.

Reviewers: Akira Hatanaka
llvm-svn: 244687
2015-08-11 22:32:00 +00:00
Alex Lorenz bceefe85c6 PseudoSourceValue: Update comments and fix lowercase variable names. NFC.
This commit updates the documentation comments in PseudoSourceValue.cpp and
PseudoSourceValue.h based on the LLVM's documentation style. It also fixes
several instances of variable names that started with a lowercase letter.

This change is done in preparation for the changes to the pseudo source value
object management and to the PseudoSourceValue's class hierarchy.

llvm-svn: 244686
2015-08-11 22:23:19 +00:00
Alex Lorenz 4ae214d5d7 Reformat PseudoSourceValue.cpp and PseudoSourceValue.h. NFC.
This commit reformats the files lib/CodeGen/PseudoSourceValue.cpp and
include/llvm/CodeGen/PseudoSourceValue.h using clang-format. This change is
done in preparation for the changes to the pseudo source value object
management and to the PseudoSourceValue's class hierarchy.

llvm-svn: 244685
2015-08-11 22:17:22 +00:00
Paul Robinson 78046b49a9 Make DW_AT_[MIPS_]linkage_name optional, and off by default for SCE.
Mangled "linkage" names can be huge, and if the debugger (or other
tools) have no use for them, the size savings can be very impressive
(on the order of 40%).

Add one test for controlling behavior, and modify a number of tests to
either stop using linkage names, or make llc emit them (so these tests
will still run when the default triple is for PS4).

Differential Revision: http://reviews.llvm.org/D11374

llvm-svn: 244678
2015-08-11 21:36:45 +00:00
JF Bastien 0cf74528d0 NFC SelectionDAGDumper: fix typo
Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11959

llvm-svn: 244667
2015-08-11 21:10:07 +00:00
Sanjay Patel 82d91ddb4f fix minsize detection: minsize attribute implies optimizing for size
Also, add a test for optsize because this was not part of any existing regression test.

llvm-svn: 244651
2015-08-11 19:39:36 +00:00
Jingyue Wu 99eb4685ef SelectionDAG: Prefer to combine multiplication with less uses for fma
Summary:
For example:

  s6 = s0*s5;
  s2 = s6*s6 + s6;
  ...
  s4 = s6*s3;

We notice that it is possible for s2 is folded to fma (s0, s5, fmul (s6 s6)).
This only happens when Aggressive is true, otherwise hasOneUse() check
already prevents from folding the multiplication with more uses.

Test Plan: test/CodeGen/NVPTX/fma-assoc.ll

Patch by Xuetian Weng

Reviewers: hfinkel, apazos, jingyue, ohsallen, arsenm

Subscribers: arsenm, jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D11855

llvm-svn: 244649
2015-08-11 19:21:46 +00:00
Sanjay Patel 070df89928 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244631
2015-08-11 17:04:31 +00:00
John Brawn 863bfdbfb4 [GlobalMerge] Use private linkage for MergedGlobals variables
Other objects can never reference the MergedGlobals symbol so external linkage
is never needed. Using private instead of internal linkage means the object is
more similar to what it looks like when global merging is not enabled, with
the only difference being that the merged variables are addressed indirectly
relative to the start of the section they are in.

Also add aliases for merged variables with internal linkage, as this also makes
the object be more like what it is when they are not merged.

Differential Revision: http://reviews.llvm.org/D11942

llvm-svn: 244615
2015-08-11 15:48:04 +00:00
Sanjay Patel 74ca312666 fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244604
2015-08-11 14:31:14 +00:00
James Molloy 01cdeccdc7 Add new ISD nodes: ISD::FMINNAN and ISD::FMAXNAN
The intention of these is to be a corollary to ISD::FMINNUM/FMAXNUM,
differing only on how NaNs are treated. FMINNUM returns the non-NaN
input (when given one NaN and one non-NaN), FMINNAN returns the NaN
input instead.

This patch includes support for scalarizing, widening and splitting
vectors, but not expansion or softening. The reason is that these
should never be needed - FMINNAN nodes are only going to be created
in one place (SDAGBuilder::visitSelect) and there we'll check if the
node is legal or custom. I could preemptively add expand and soften
code, but I'm fairly opposed to adding code I can't test. It's bad
enough I can't create tests with this patch, but at least this code
will be exercised by the ARM and AArch64 backends fairly shortly.

llvm-svn: 244581
2015-08-11 09:13:05 +00:00
James Molloy 134bec2722 Add support for floating-point minnum and maxnum
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.

matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.

C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.

ARM and AArch64 at least have different instructions for these different cases.

llvm-svn: 244580
2015-08-11 09:12:57 +00:00
Michael Kuperstein 82814f63c0 Allow PeepholeOptimizer to fold a few more cases
The condition for clearing the folding candidate list was clamped together
with the "uninteresting instruction" condition. This is too conservative,
e.g. we don't need to clear the list when encountering an IMPLICIT_DEF.

Differential Revision: http://reviews.llvm.org/D11591

llvm-svn: 244577
2015-08-11 08:19:43 +00:00
David Majnemer fd9f47756a [WinEHPrepare] Add rudimentary support for the new EH instructions
This adds somewhat basic preparation functionality including:
- Formation of funclets via coloring basic blocks.
- Cloning of polychromatic blocks to ensure that funclets have unique
  program counters.
- Demotion of values used between different funclets.
- Some amount of cleanup once we have removed predecessors from basic
  blocks.
- Verification that we are left with a CFG that makes some amount of
  sense.

N.B. Arguments and numbering still need to be done.

Differential Revision: http://reviews.llvm.org/D11750

llvm-svn: 244558
2015-08-11 01:15:26 +00:00
Alex Lorenz c483808785 MIR Serialization: Serialize UsedPhysRegMask from the machine register info.
This commit serializes the UsedPhysRegMask register mask from the machine
register information class. The mask is serialized as an inverted
'calleeSavedRegisters' mask to keep the output minimal.

This commit also allows the MIR parser to infer this mask from the register
mask operands if the machine function doesn't specify it.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244548
2015-08-11 00:32:49 +00:00
Sanjay Patel f609c11b3d use range-based for loops; NFCI
llvm-svn: 244545
2015-08-11 00:26:05 +00:00
Alex Lorenz c5d35ba009 MIR Parser: Report an error when a stack object is redefined.
llvm-svn: 244536
2015-08-10 23:50:41 +00:00
Alex Lorenz 1d9a303142 MIR Parser: Report an error when a fixed stack object is redefined.
llvm-svn: 244534
2015-08-10 23:45:02 +00:00
Sanjay Patel 9f11c14c1c use range-based for loop; NFCI
llvm-svn: 244531
2015-08-10 23:29:41 +00:00
Alex Lorenz b97c9ef4d0 MIR Serialization: Serialize the liveout register mask machine operands.
llvm-svn: 244529
2015-08-10 23:24:42 +00:00
Sanjay Patel d967a878fa fix minsize detection: minsize attribute implies optimizing for size
llvm-svn: 244528
2015-08-10 23:07:26 +00:00
Cong Hou 2793e7218c NFC. Fix some format issues in lib/CodeGen/MachineBasicBlock.cpp.
llvm-svn: 244518
2015-08-10 22:27:10 +00:00
Alex Lorenz e5101e2016 MachineVerifier: Handle the optional def operand in a PATCHPOINT instruction.
The PATCHPOINT instructions have a single optional defined register operand,
but the machine verifier can't verify the optional defined register operands.
This commit makes sure that the machine verifier won't report an error when a
PATCHPOINT instruction doesn't have its optional defined register operand.
This change will allow us to enable the machine verifier for the code
generation tests for the patchpoint intrinsics.

Reviewers: Juergen Ributzka
llvm-svn: 244513
2015-08-10 21:47:36 +00:00
Sanjay Patel cc6554361c remove function names from comments; NFC
llvm-svn: 244509
2015-08-10 21:28:16 +00:00
Alex Lorenz 2f43dd5a12 StackMap: FastISel: Add an appropriate number of immediate operands to the
frame setup instruction.

This commit ensures that the stack map lowering code in FastISel adds an
appropriate number of immediate operands to the frame setup instruction.

The previous code added just one immediate operand, which was fine for a target
like AArch64, but on X86 the ADJCALLSTACKDOWN64 instruction needs two explicit
operands. This caused the machine verifier to report an error when the old code
added just one.

Reviewers: Juergen Ributzka

Differential Revision: http://reviews.llvm.org/D11853

llvm-svn: 244508
2015-08-10 21:27:03 +00:00
JF Bastien fa9746dc8d x86: Emit LAHF/SAHF instead of PUSHF/POPF
NaCl's sandbox doesn't allow PUSHF/POPF out of security concerns (priviledged emulators have forgotten to mask system bits in the past, and EFLAGS's DF bit is a constant source of hilarity). Commit r220529 fixed PR20376 by saving cmpxchg's flags result using EFLAGS, this commit now generated LAHF/SAHF instead, for all of x86 (not just NaCl) because it leads to an overall performance gain over PUSHF/POPF.

As with the previous patch this code generation is pretty bad because it occurs very later, after register allocation, and in many cases it rematerializes flags which were already available (e.g. already in a register through SETE). Fortunately it's somewhat rare that this code needs to fire.

I did [[ https://github.com/jfbastien/benchmark-x86-flags | a bit of benchmarking ]], the results on an Intel Haswell E5-2690 CPU at 2.9GHz are:

| Time per call (ms)  | Runtime (ms) | Benchmark                      |
| 0.000012514         |      6257    | sete.i386                      |
| 0.000012810         |      6405    | sete.i386-fast                 |
| 0.000010456         |      5228    | sete.x86-64                    |
| 0.000010496         |      5248    | sete.x86-64-fast               |
| 0.000012906         |      6453    | lahf-sahf.i386                 |
| 0.000013236         |      6618    | lahf-sahf.i386-fast            |
| 0.000010580         |      5290    | lahf-sahf.x86-64               |
| 0.000010304         |      5152    | lahf-sahf.x86-64-fast          |
| 0.000028056         |     14028    | pushf-popf.i386                |
| 0.000027160         |     13580    | pushf-popf.i386-fast           |
| 0.000023810         |     11905    | pushf-popf.x86-64              |
| 0.000026468         |     13234    | pushf-popf.x86-64-fast         |

Clearly `PUSHF`/`POPF` are suboptimal. It doesn't really seems to be worth teaching LLVM about individual flags, at least not for this purpose.

Reviewers: rnk, jvoung, t.p.northover

Subscribers: llvm-commits

Differential revision: http://reviews.llvm.org/D6629

llvm-svn: 244503
2015-08-10 20:59:36 +00:00
Robert Lougher 11a44b78a3 Trace copies when checking for rematerializability in spill weight calculation
PR24139 contains an analysis of poor register allocation. One of the findings
was that when calculating the spill weight, a rematerializable interval once
split is no longer rematerializable. This is because the isRematerializable
check in CalcSpillWeights.cpp does not follow the copies introduced by live
range splitting (after splitting, the live interval register definition is a
copy which is not rematerializable).

Reviewers: qcolombet

Differential Revision: http://reviews.llvm.org/D11686

llvm-svn: 244439
2015-08-10 11:59:44 +00:00
Benjamin Kramer df005cbe19 Fix some comment typos.
llvm-svn: 244402
2015-08-08 18:27:36 +00:00
Alex Lorenz 61420f790d MIR Serialization: Serialize the base alignment for the machine memory operands.
llvm-svn: 244357
2015-08-07 20:48:30 +00:00
Alex Lorenz 83127739ff MIR Serialization: Serialize the offsets for the machine memory operands.
llvm-svn: 244356
2015-08-07 20:26:52 +00:00
Alex Lorenz dc24c1713e MIR Parser: Extract the parsing of the operand's offset into a new method. NFC.
This commit extract the code that parses the 64-bit offset from the method
'parseOperandsOffset' to a new method 'parseOffset' so that we can reuse it
when parsing the offset for the machine memory operands.

llvm-svn: 244355
2015-08-07 20:21:00 +00:00
Frederic Riss a5ab8443c1 [MC/Dwarf] Allow to specify custom parameters for linetable emission.
NFC patch for current users, but llvm-dsymutil will use the new
functionality to adapt to the input linetable.

Based on a patch by Adrian Prantl.

llvm-svn: 244318
2015-08-07 15:14:08 +00:00
John Brawn 64e5a66794 Revert "Make global aliases have symbol size equal to their type"
This reverts r242520, as it caused pr24379. Also removes part of the test added
by r243874 that checks the size of alias symbols.

llvm-svn: 244313
2015-08-07 10:56:21 +00:00
NAKAMURA Takumi 8dbe161502 ShrinkWrap.cpp: Tweak r244235 for a non-functional member, PredicateFtor. [-Wdocumentation]
llvm-svn: 244309
2015-08-07 07:40:23 +00:00
Alex Lorenz cba8c5fe31 MIR Serialization: Fix serialization of unnamed IR block references.
The block address machine operands can reference IR blocks in other functions.
This commit fixes a bug where the references to unnamed IR blocks in other
functions weren't serialized correctly.

llvm-svn: 244299
2015-08-06 23:57:04 +00:00
Alex Lorenz 3fb77686c1 MIR Parser: Simplify the token's string value handling.
This commit removes the 'StringOffset' and 'HasStringValue' fields from the
MIToken struct and simplifies the 'stringValue' method which now returns
the new 'StringValue' field.

This commit also adopts a different way of initializing the lexed tokens -
instead of constructing a new MIToken instance, the lexer resets the old token
using the new 'reset' method and sets its attributes using the new
'setStringValue', 'setOwnedStringValue', and 'setIntegerValue' methods.

Reviewers: Sean Silva

Differential Revision: http://reviews.llvm.org/D11792

llvm-svn: 244295
2015-08-06 23:17:42 +00:00
David Majnemer 09e1fdb3f4 Revert accidentally committed WinEHPrepare changes
This reverts commit r244272, r244273, r244274, and r244275.

llvm-svn: 244278
2015-08-06 21:13:51 +00:00
David Majnemer a102e6a0e3 PHIs don't need to be postprocessed
llvm-svn: 244275
2015-08-06 21:08:34 +00:00
David Majnemer ac6b298850 Handle PHI nodes prefacing EH pads too
llvm-svn: 244274
2015-08-06 21:08:32 +00:00
David Majnemer fb7a737a72 handle phi nodes
llvm-svn: 244273
2015-08-06 21:08:30 +00:00
David Majnemer e4abcef986 [WinEHPrepare] Add rudimentary support for the new EH instructions
Summary:
This adds somewhat basic preparation functionality including:
- Formation of funclets via coloring basic blocks.
- Cloning of polychromatic blocks to ensure that funclets have unique
  program counters.
- Demotion of values used between different funclets.
- Some amount of cleanup once we have removed predecessors from basic
  blocks.
- Verification that we are left with a CFG that makes some amount of
  sense.

N.B. Arguments and numbering still need to be done.

Reviewers: rnk, JosephTremoulet

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11750

llvm-svn: 244272
2015-08-06 21:07:55 +00:00
Nico Rieck 78199518c4 Rename inst_range() to instructions() for consistency. NFC
llvm-svn: 244248
2015-08-06 19:10:45 +00:00
Kit Barton a7bf96ab5c Fix possible infinite loop in shrink wrapping when searching for save/restore
points.

There is an infinite loop that can occur in Shrink Wrapping while searching 
for the Save/Restore points. 

Part of this search checks whether the save/restore points are located in
different loop nests and if so, uses the (post) dominator trees to find the
immediate (post) dominator blocks. However, if the current block does not have
any immediate (post) dominators then this search will result in an infinite
loop. This can occur in code containing an infinite loop.

The modification checks whether the immediate (post) dominator is different from
the current save/restore block. If it is not, then the search terminates and the
current location is not considered as a valid save/restore point for shrink wrapping.

Phabricator: http://reviews.llvm.org/D11607
llvm-svn: 244247
2015-08-06 19:01:57 +00:00
Alex Lorenz e86d51533d MIR Parser: Report an error when parsing duplicate memory operand flags.
llvm-svn: 244240
2015-08-06 18:26:36 +00:00
Cong Hou ec10587205 Revert r244154 which causes some build failure. See https://llvm.org/bugs/show_bug.cgi?id=24377.
llvm-svn: 244239
2015-08-06 18:17:29 +00:00
Kit Barton 45c20b474e This patch changes the interface to enable the shrink wrapping optimization.
It adds a new constructor, which takes a std::function predicate function that
is run at the beginning of shrink wrapping to determine whether the optimization
should run on the given machine function. The std::function can be overridden by
each target, allowing target-specific decisions to be made on each machine
function.

This is necessary for PowerPC, as the decision to run shrink wrapping is
partially based on the ABI. Futhermore, this operates nicely with the GCC iFunc
capability, which allows option overrides on a per-function basis.

Phabricator: http://reviews.llvm.org/D11421
llvm-svn: 244235
2015-08-06 18:02:53 +00:00
Alex Lorenz dc8de2a6b7 MIR Serialization: Serialize the 'invariant' machine memory operand flag.
llvm-svn: 244230
2015-08-06 16:55:53 +00:00
Richard Diamond bd753c9315 Fix an alignment error in `llvm::expandAtomicRMWToCmpXchg` without breaking the build where X86 isn't enabled.
Summary: Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test.

Reviewers: rengolin, jfb

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11804

llvm-svn: 244229
2015-08-06 16:55:03 +00:00
Alex Lorenz 10fd03857f MIR Serialization: Serialize the 'non-temporal' machine memory operand flag.
llvm-svn: 244228
2015-08-06 16:49:30 +00:00
Renato Golin a02ac60469 Revert "Divide the primitive size in bits by eight so the initial load's alignment is in bytes as expected. Tested with the included unit test."
This reverts commit r244155, as it was breaking the buildbots for too long.
Should be reapplied with proper fix.

llvm-svn: 244205
2015-08-06 10:37:59 +00:00
Chandler Carruth 17e0bc37fd [PM/AA] Hoist the interface for BasicAA into a header file.
This is the first mechanical step in preparation for making this and all
the other alias analysis passes available to the new pass manager. I'm
factoring out all the totally boring changes I can so I'm moving code
around here with no other changes. I've even minimized the formatting
churn.

I'll reformat and freshen comments on the interface now that its located
in the right place so that the substantive changes don't triger this.

llvm-svn: 244197
2015-08-06 07:33:15 +00:00
Chandler Carruth 50fee93926 [PM/AA] Simplify the AliasAnalysis interface by removing a wrapper
around a DataLayout interface in favor of directly querying DataLayout.

This wrapper specifically helped handle the case where this no
DataLayout, but LLVM now requires it simplifynig all of this. I've
updated callers to directly query DataLayout. This in turn exposed
a bunch of places where we should have DataLayout readily available but
don't which I've fixed. This then in turn exposed that we were passing
DataLayout around in a bunch of arguments rather than making it readily
available so I've also fixed that.

No functionality changed.

llvm-svn: 244189
2015-08-06 02:05:46 +00:00
Alex Lorenz 49873a8382 MIR Serialization: Initial serialization of the machine operand target flags.
This commit implements the initial serialization of the machine operand target
flags. It extends the 'TargetInstrInfo' class to add two new methods that help
to provide text based serialization for the target flags.

This commit can serialize only the X86 target flags, and the target flags for
the other targets will be serialized in the follow-up commits.

Reviewers: Duncan P. N. Exon Smith
llvm-svn: 244185
2015-08-06 00:44:07 +00:00
Reid Kleckner 12d2c12023 If the "CodeView" module flag is set, emit codeview instead of DWARF
Summary:
Emit both DWARF and CodeView if "CodeView" and "Dwarf Version" module
flags are set.

Reviewers: majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11756

llvm-svn: 244158
2015-08-05 22:26:20 +00:00
Alex Lorenz 5672a893e5 MIR Serialization: Serialize the machine operand's offset.
This commit serializes the offset for the following operands: target index,
global address, external symbol, constant pool index, and block address.

llvm-svn: 244157
2015-08-05 22:26:15 +00:00
Richard Diamond 559c1d72a9 Divide the primitive size in bits by eight so the initial load's alignment is in
bytes as expected. Tested with the included unit test.

llvm-svn: 244155
2015-08-05 22:10:57 +00:00
Cong Hou 36e7e52aa4 Record whether the weights on out-edges from a MBB are normalized.
1. Create a utility function normalizeEdgeWeights() in MachineBranchProbabilityInfo that normalizes a list of edge weights so that the sum of then can fit in uint32_t.
2. Provide an interface in MachineBasicBlock to normalize its successors' weights.
3. Add a flag in MachineBasicBlock that tracks whether its successors' weights are normalized.
4. Provide an overload of getSumForBlock that accepts a non-const pointer to a MBB so that it can force normalizing this MBB's successors' weights.
5. Update several uses of getSumForBlock() by eliminating the once needed weight scale.

Differential Revision: http://reviews.llvm.org/D11442

llvm-svn: 244154
2015-08-05 22:01:20 +00:00
JF Bastien 7c4218f49c Revert "Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk."
I mistakenly committed the patch for D6629, and was trying to commit another. Reverting until it gets proper signoff.

llvm-svn: 244121
2015-08-05 20:53:56 +00:00
JF Bastien ce5256f5c5 Fix MO's analyzePhysReg, it was confusing sub- and super-registers. Problem pointed out by Michael Hordijk.
llvm-svn: 244120
2015-08-05 20:49:46 +00:00
Richard Diamond 7ef94569e1 Write access test.
llvm-svn: 244103
2015-08-05 19:40:39 +00:00
Alex Lorenz 3f2058da16 MIR Parser: Report an error when parsing large immediate operands.
llvm-svn: 244100
2015-08-05 19:03:42 +00:00
Alex Lorenz 05e3882e81 MIR Serialization: Serialize the typed immediate integer machine operands.
llvm-svn: 244098
2015-08-05 18:52:21 +00:00
Alex Lorenz 7eaff4c7d6 MIR Parser: Extract the IR constant parsing code into a new method. NFC.
This commit extracts the code that parses the IR constant values into a new
method named 'parseIRConstant' in the 'MIParser' class. The new method will
be reused by the code that parses the typed integer immediate machine operands.

llvm-svn: 244093
2015-08-05 18:44:00 +00:00
Alex Lorenz 2b3cf19332 MIR Parser: Report an error when parsing duplicate register flags.
llvm-svn: 244081
2015-08-05 18:09:03 +00:00
Chandler Carruth 93205eb966 [TTI] Make the cost APIs in TargetTransformInfo consistently use 'int'
rather than 'unsigned' for their costs.

For something like costs in particular there is a natural "negative"
value, that of savings or saved cost. As a consequence, there is a lot
of code that subtracts or creates negative values based on cost, all of
which is prone to awkwardness or bugs when dealing with an unsigned
type. Similarly, we *never* want these values to wrap, as that would
cause Very Bad code generation (likely percieved as an infinite loop as
we try to emit over 2^32 instructions or some such insanity).

All around 'int' seems a much better fit for these basic metrics. I've
added asserts to ensure that at least the TTI interface never returns
negative numbers here. If we ever have a use case for negative numbers,
we can remove this, but this way a bug where someone used '-1' to
produce a 'very large' cost will be caught by the assert.

This passes all tests, and is also UBSan clean.

No functional change intended.

Differential Revision: http://reviews.llvm.org/D11741

llvm-svn: 244080
2015-08-05 18:08:10 +00:00
Alex Lorenz 01c1a5ee58 MIR Serialization: Serialize the 'early-clobber' register operand flag.
llvm-svn: 244075
2015-08-05 17:49:03 +00:00
Alex Lorenz 9075258b6a MIR Serialization: Serialize the 'debug-use' register operand flag.
llvm-svn: 244071
2015-08-05 17:41:17 +00:00
Alex Lorenz 970c12eade MIR Parser: Simplify the handling of quoted tokens. NFC.
The machine instructions lexer should not expose the difference between quoted
and unquoted tokens to the parser.

llvm-svn: 244068
2015-08-05 17:35:55 +00:00
Sanjay Patel b6a79f9916 revert r243687: enable fast-math-flag propagation to DAG nodes
We can't propagate FMF partially without breaking DAG-level CSE. We either need to
relax CSE to account for mismatched FMF as a temporary work-around or fully propagate
FMF throughout the DAG.

Surprisingly, there are no existing regression tests for this, but here's an example:

  define float @fmf(float %a, float %b) {
    %mul1 = fmul fast float %a, %b
    %nega = fsub fast float 0.0, %a
    %mul2 = fmul fast float %nega, %b
    %abx2 = fsub fast float %mul1, %mul2
    ret float %abx2
  }


$ llc -o - badflags.ll -march=x86-64 -mattr=fma -enable-unsafe-fp-math -enable-fmf-dag=0
...
    vmulss    %xmm1, %xmm0, %xmm0
    vaddss    %xmm0, %xmm0, %xmm0
    retq

$ llc -o - badflags.ll -march=x86-64 -mattr=fma -enable-unsafe-fp-math -enable-fmf-dag=1
...
    vmulss    %xmm1, %xmm0, %xmm2
    vfmadd213ss    %xmm2, %xmm1, %xmm0  <--- failed to recognize that (a * b) was already calculated
    retq

llvm-svn: 244053
2015-08-05 15:12:03 +00:00
Hal Finkel 17caf326e5 [MachineCombiner] Don't use the opcode-only form of computeInstrLatency
In r242277, I updated the MachineCombiner to work with itineraries, but I
missed a call that is scheduling-model-only (the opcode-only form of
computeInstrLatency). Using the form that takes an MI* allows this to work with
itineraries (and should be NFC for subtargets with scheduling models).

llvm-svn: 244020
2015-08-05 07:45:28 +00:00
Sanjay Patel 924879ad2c wrap OptSize and MinSize attributes for easier and consistent access (NFCI)
Create wrapper methods in the Function class for the OptimizeForSize and MinSize
attributes. We want to hide the logic of "or'ing" them together when optimizing
just for size (-Os).

Currently, we are not consistent about this and rely on a front-end to always set
OptimizeForSize (-Os) if MinSize (-Oz) is on. Thus, there are 18 FIXME changes here
that should be added as follow-on patches with regression tests.

This patch is NFC-intended: it just replaces existing direct accesses of the attributes
by the equivalent wrapper call.

Differential Revision: http://reviews.llvm.org/D11734

llvm-svn: 243994
2015-08-04 15:49:57 +00:00
Hal Finkel caf1149b8b [SDAG] Fix a result chain in ExpandUnalignedLoad
On the code path in ExpandUnalignedLoad which expands an unaligned vector/fp
value in terms of a legal integer load of the same size, the ChainResult needs
to be the chain result of the integer load.

No in-tree test case is currently available.

Patch by Jan Hranac!

llvm-svn: 243956
2015-08-04 06:29:12 +00:00
Chen Li 0003878466 Introduce enum value for previously defined metadata -- make.implicit
Summary: This patch adds enum value for an existing metadata type -- make.implicit. Using preassigned enum will be helpful to get compile time type checking and avoid string construction and comparison. The patch also changes uses of make.implicit from string metadata to enum metadata. There is no functionality change.

Reviewers: reames

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D11698

llvm-svn: 243954
2015-08-04 04:41:34 +00:00
Ahmed Bougacha f65371a235 [CodeGen] Fix FCOPYSIGN legalization to account for mismatched types.
We used to legalize it like it's any other binary operations.  It's not,
because it accepts mismatched operand types.  Because of that, we used
to hit various asserts and miscompiles.

Specialize vector legalizations to, in the worst case, unroll, or, when
possible, to just legalize the operand that needs legalization.

Scalarization isn't covered, because I can't think of a target where
some but not all of the 1-element vector types are to be scalarized.

llvm-svn: 243924
2015-08-04 00:32:55 +00:00
Alex Lorenz a518b79601 MIR Serialization: Serialize the 'volatile' machine memory operand flag.
llvm-svn: 243923
2015-08-04 00:24:45 +00:00
Alex Lorenz 4af7e610c3 MIR Serialization: Initial serialization of the machine memory operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243915
2015-08-03 23:08:19 +00:00
David Blaikie 774b584f42 -Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11
Various value handles needed to be copy constructible and copy
assignable (mostly for their use in DenseMap). But to avoid an API that
might allow accidental slicing, make these members protected in the base
class and make derived classes final (the special members become
implicitly public there - but disallowing further derived classes that
might be sliced to the intermediate type).

Might be worth having a warning a bit like -Wnon-virtual-dtor that
catches public move/copy assign/ctors in classes with virtual functions.
(suppressable in the same way - by making them protected in the base,
and making the derived classes final) Could be fancier and only diagnose
them when they're actually called, potentially.

Also allow a few default implementations where custom implementations
(especially with non-standard return types) were implemented.

llvm-svn: 243909
2015-08-03 22:30:24 +00:00
David Blaikie e44a8a7066 -Wdeprecated-clean: Fix cases of violating the rule of 5 in ways that are deprecated in C++11
Some functions return concrete ByteStreamers by value - explicitly
support that in the base class. (dtor can be virtual, no one seems to be
polymorphically owning/destroying them)

llvm-svn: 243897
2015-08-03 20:12:58 +00:00
JF Bastien e8aad29984 Refactor AtomicExpand::expandAtomicRMWToCmpXchg into a standalone function.
Summary:
This is useful for PNaCl's `RewriteAtomics` pass. NaCl intrinsics don't exist for some of the more exotic RMW instructions, so by refactoring this function into its own, `RewriteAtomics` can share code rewriting those atomics with `AtomicExpand` while additionally saving a few cycles by generating the `cmpxchg` NaCl-specific intrinsic with the callback. Without this patch, `RewriteAtomics` would require two extra passes over functions, by first requiring use of the full `AtomicExpand` pass to just expand the leftover exotic RMWs and then running itself again to expand resulting `cmpxchg`s.

NFC

Reviewers: jfb

Subscribers: jfb, llvm-commits

Differential Revision: http://reviews.llvm.org/D11422

llvm-svn: 243880
2015-08-03 15:29:47 +00:00
John Brawn 8b954241f8 [GlobalMerge] Allow targets to enable merging of extern variables, NFC.
Adjust the GlobalMergeOnExternal option so that the default behaviour is to
do whatever the Target thinks is best. Explicitly enabled or disabling the
option will override this default.

Differential Revision: http://reviews.llvm.org/D10965

llvm-svn: 243873
2015-08-03 12:08:41 +00:00
Duncan P. N. Exon Smith c582114d4c AsmPrinter: Split out non-DIE printing from DIE::print(), NFC
Split out a helper `printValues()` for printing `DIEBlock` and `DIELoc`,
instead of relying on `DIE::print()`.  The shared code was actually
fairly small there.  No functionality change intended.

llvm-svn: 243856
2015-08-02 20:46:49 +00:00
Duncan P. N. Exon Smith 55a868a0f6 AsmPrinter: Take DIEValueList in some DwarfUnit API, NFC
Take `DIEValueList` instead of `DIE` so that `DIEBlock` and `DIELoc` can
stop inheriting from `DIE` in a future commit.

llvm-svn: 243855
2015-08-02 20:44:46 +00:00
Duncan P. N. Exon Smith 1ad5ebc3ed AsmPrinter: Change DIEValueList to a subclass of DIE, NFC
Rewrite `DIEValueList` as a subclass of `DIE`, renaming its API to match
`DIE`'s.  This is preparation for changing `DIEBlock` and `DIELoc` to
stop inheriting from `DIE` and inherit directly from `DIEValueList`.

I thought about leaving this as a has-a relationship (and changing
`DIELoc` and `DIEBlock` to also have-a `DIEValueList`), but that seemed
to require a fair bit more boilerplate and I think it needed more
changes to the `DwarfUnit` API than this will.

No functionality change intended here.

llvm-svn: 243854
2015-08-02 20:42:45 +00:00
Simon Pilgrim f328fd4441 Remove trailing whitespace. NFCI.
llvm-svn: 243838
2015-08-01 17:06:47 +00:00
Simon Pilgrim b447dc5aaa Use SDValue bool check. NFCI.
llvm-svn: 243837
2015-08-01 17:05:50 +00:00
Simon Pilgrim 503a2594c3 [DAGCombiner] Convert constant AND masks to shuffle clear masks down to the byte level
The XformToShuffleWithZero method currently checks AND masks at the per-lane level for all-one and all-zero constants and attempts to convert them to legal shuffle clear masks.

This patch generalises XformToShuffleWithZero, splitting and checking the sub-lanes of the constants down to the byte level to see if any legal shuffle clear masks are possible. This allows a lot of masks (often from legalization or truncation) to be folded into existing shuffle patterns and removes a lot of constant mask loading.

There are a few examples of poor shuffle lowering that are exposed by this patch that will be cleaned up in future patches (e.g. merging shuffles that are separated by bitcasts, x86 legalized v8i8 zero extension uses PMOVZX+AND+AND instead of AND+PMOVZX, etc.)

Differential Revision: http://reviews.llvm.org/D11518

llvm-svn: 243831
2015-08-01 10:01:46 +00:00
Alex Lorenz 59ed5919cd MIR Parser: Report an error when a jump table entry is redefined.
llvm-svn: 243798
2015-07-31 23:13:23 +00:00
Alex Lorenz b32a301a92 MIR Parser: Remove unused variable.
This variable is unused as of r243572.

llvm-svn: 243796
2015-07-31 22:59:20 +00:00
Alex Lorenz ad156fb6af MIR Serialization: Serialize the floating point immediate machine operands.
Reviewers: Duncan P. N. Exon Smith
llvm-svn: 243780
2015-07-31 20:49:21 +00:00
Duncan P. N. Exon Smith ed013cd221 DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.

Most of the testcase updates were generated by the following sed script:

    find test/ -name "*.ll" -o -name "*.mir" |
    xargs grep -l 'DILocalVariable' |
    xargs sed -i '' \
      -e 's/tag: DW_TAG_arg_variable, //' \
      -e 's/tag: DW_TAG_auto_variable, //'

There were only a handful of tests in `test/Assembly` that I needed to
update by hand.

(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`.  I've added a FIXME to that effect.)

llvm-svn: 243774
2015-07-31 18:58:39 +00:00
David Majnemer 654e130b6e New EH representation for MSVC compatibility
This introduces new instructions neccessary to implement MSVC-compatible
exception handling support.  Most of the middle-end and none of the
back-end haven't been audited or updated to take them into account.

Differential Revision: http://reviews.llvm.org/D11097

llvm-svn: 243766
2015-07-31 17:58:14 +00:00
Benjamin Kramer 4cd5faaa87 [CodeGenPrepare] Compress a pair. No functional change.
llvm-svn: 243759
2015-07-31 17:00:39 +00:00