Commit Graph

27402 Commits

Author SHA1 Message Date
Daniel Sanders 329e748c8c [gicombiner] Add the run-time rule disable option
Summary:
Each generated helper can be configured to generate an option that disables
rules in that helper. This can be used to bisect rulesets.

The disable bits are stored in a SparseVector as this is very cheap for the
common case where nothing is disabled. It gets more expensive the more rules
are disabled but you're generally doing that for debug purposes where
performance is less of a concern.

Depends on D68426

Reviewers: volkan, bogner

Reviewed By: volkan

Subscribers: hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68438

llvm-svn: 375067
2019-10-17 00:37:04 +00:00
Quentin Colombet c319afc903 [GISel][CombinerHelper] Add concat_vectors(build_vector, build_vector) => build_vector
Teach the combiner helper how to flatten concat_vectors of build_vectors
into a build_vector.

Add this combine as part of AArch64 pre-legalizer combiner.

Differential Revision: https://reviews.llvm.org/D69071

llvm-svn: 375066
2019-10-17 00:34:32 +00:00
Daniel Sanders ec5208fd65 [gicombiner] Hoist pure C++ combine into the tablegen definition
Summary:
This is just moving the existing C++ code around and will be NFC w.r.t
AArch64. Renamed 'CombineBr' to something more descriptive
('ElideByByInvertingCond') at the same time.

The remaining combines in AArch64PreLegalizeCombiner require features that
aren't implemented at this point and will be hoisted as they are added.

Depends on D68424

Reviewers: bogner, volkan

Subscribers: kristof.beyls, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68426

llvm-svn: 375057
2019-10-16 23:53:35 +00:00
Matt Arsenault 34ed76e180 GlobalISel: Implement lower for G_SADDO/G_SSUBO
Port directly from SelectionDAG, minus the path using
ISD::SADDSAT/ISD::SSUBSAT.

llvm-svn: 375042
2019-10-16 20:46:32 +00:00
Simon Pilgrim e2163f96ab CombinerHelper - silence dead assignment warnings. NFCI.
Copy the NewAlignment value to Alignment first and then use that to update the stack frame object alignments.

llvm-svn: 375019
2019-10-16 17:21:50 +00:00
Sjoerd Meijer 5a13188966 Revert "[HardwareLoops] Optimisation remarks"
while I investigate the PPC build bot failures.

This reverts commit ad76375156.

llvm-svn: 374992
2019-10-16 10:55:06 +00:00
Sjoerd Meijer ad76375156 [HardwareLoops] Optimisation remarks
This adds the initial plumbing to support optimisation remarks in
the IR hardware-loop pass.

I have left a todo in a comment where we can improve the reporting,
and will iterate on that now that we have this initial support in.

Differential Revision: https://reviews.llvm.org/D68579

llvm-svn: 374980
2019-10-16 09:09:55 +00:00
Orlando Cazalet-Hyams 8af5ada093 [NFC] Replace a linked list in LiveDebugVariables pass with a DenseMap
In LiveDebugVariables.cpp:
Prior to this patch, UserValues were grouped into linked list chains. Each
chain was the union of two sets: { A: Matching Source variable } or
{ B: Matching virtual register }. A ptr to the heads (or 'leaders')
of each of these chains were kept in a map with the { Source variable } used
as the key (set A predicate) and another with { Virtual register } as key
(set B predicate).

There was a search through the chains in the function getUserValue looking for
UserValues with matching { Source variable, Complex expression, Inlined-at
location }. Essentially searching for a subset of A through two interleaved
linked lists of set A and B. Importantly, by design, the subset will only
contain one or zero elements here. That is to say a UserValue can be uniquely
identified by the tuple { Source variable, Complex expression, Inlined-at
 location } if it exists.

This patch removes the linked list and instead uses a DenseMap to map
the tuple { Source variable, Complex expression, Inlined-at location }
to UserValue ptrs so that the getUserValue search predicate is this map key.
The virtual register map now maps a vreg to a SmallVector<UserVal *> so that
set B is still available for quick searches.

Reviewers: aprantl, probinson, vsk, dblaikie

Reviewed By: aprantl

Subscribers: russell.gallop, gbedwell, bjope, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D68816

llvm-svn: 374979
2019-10-16 08:36:00 +00:00
Craig Topper 8995daafa0 [LegalizeTypes] Don't use PromoteTargetBoolean in WidenVecOp_SETCC.
Similar to r374970, but I don't have a test for this.

PromoteTargetBoolean is intended to be use for legalizing an
operand that needs to be promoted. It picks its type based on
the return from getSetccResultType and is intended to be used
when we have freedom to pick the new type. But the return type
we need for WidenVecOp_SETCC is completely determined by the
type of the input node.

llvm-svn: 374972
2019-10-16 03:29:24 +00:00
Craig Topper 7b49e8ac35 [LegalizeTypes] Don't call PromoteTargetBoolean from SplitVecOp_VSETCC.
PromoteTargetBoolean calls getSetccResultType to get the return
type. But we were passing it the setcc result type rather than the
setcc input type. This causes an issue on X86 with avx512vl where
the setcc result type for vXf16 vectors is vXi16 while the
result type for vXi16 vectors is vXi1.

There's really no guarantee that getSetccResultType is the type
we need here. So now we just grab the extend type from
getExtendForContent and extend to the original result VT of the
node we're splitting.

llvm-svn: 374970
2019-10-16 02:50:04 +00:00
Dmitry Mikulin f14642f2f1 Added support for "#pragma clang section relro=<name>"
Differential Revision: https://reviews.llvm.org/D68806

llvm-svn: 374934
2019-10-15 18:31:10 +00:00
David Zarzycki 59390efef2 [X86] Make memcmp() use PTEST if possible and also enable AVX1
llvm-svn: 374922
2019-10-15 17:40:12 +00:00
Sanjay Patel d545c9056e [DAGCombiner] fold select-of-constants based on sign-bit test
Examples:
  i32 X > -1 ? C1 : -1 --> (X >>s 31) | C1
  i8 X < 0 ? C1 : 0 --> (X >>s 7) & C1

This is a small generalization of a fold requested in PR43650:
https://bugs.llvm.org/show_bug.cgi?id=43650

The sign-bit of the condition operand can be used as a mask for the true operand:
https://rise4fun.com/Alive/paT

Note that we already handle some of the patterns (isNegative + scalar) because
there's an over-specialized, yet over-reaching fold for that in foldSelectCCToShiftAnd().
It doesn't use any TLI hooks, so I can't easily rip out that code even though we're
duplicating part of it here. This fold is guarded by TLI.convertSelectOfConstantsToMath(),
so it should not cause problems for targets that prefer select over shift.

Also worth noting: I thought we could generalize this further to include the case where
the true operand of the select is not constant, but Alive says that may allow poison to
pass through where it does not in the original select form of the code.

Differential Revision: https://reviews.llvm.org/D68949

llvm-svn: 374902
2019-10-15 15:23:57 +00:00
Benjamin Kramer ce00cd6ae8 [AsmPrinter] Fix unused variable warning in Release builds. NFC.
llvm-svn: 374894
2019-10-15 14:23:11 +00:00
David Stenberg 1ae2d9a2bd [DebugInfo] Add a DW_OP_LLVM_entry_value operation
Summary:
Internally in LLVM's metadata we use DW_OP_entry_value operations with
the same semantics as DWARF; that is, its operand specifies the number
of bytes that the entry value covers.

At the time of emitting entry values we don't know the emitted size of
the DWARF expression that the entry value will cover. Currently the size
is hardcoded to 1 in DIExpression, and other values causes the verifier
to fail. As the size is 1, that effectively means that we can only have
valid entry values for registers that can be encoded in one byte, which
are the registers with DWARF numbers 0 to 31 (as they can be encoded as
single-byte DW_OP_reg0..DW_OP_reg31 rather than a multi-byte
DW_OP_regx). It is a bit confusing, but it seems like llvm-dwarfdump
will print an operation "correctly", even if the byte size is less than
that, which may make it seem that we emit correct DWARF for registers
with DWARF numbers > 31. If you instead use readelf for such cases, it
will interpret the number of specified bytes as a DWARF expression. This
seems like a limitation in llvm-dwarfdump.

As suggested in D66746, a way forward would be to add an internal
variant of DW_OP_entry_value, DW_OP_LLVM_entry_value, whose operand
instead specifies the number of operations that the entry value covers,
and we then translate that into the byte size at the time of emission.

In this patch that internal operation is added. This patch keeps the
limitation that a entry value can only be applied to simple register
locations, but it will fix the issue with the size operand being
incorrect for DWARF numbers > 31.

Reviewers: aprantl, vsk, djtodoro, NikolaPrica

Reviewed By: aprantl

Subscribers: jyknight, fedor.sergeev, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67492

llvm-svn: 374881
2019-10-15 11:31:21 +00:00
Guillaume Chatelet 0e62011df8 [Alignment][NFC] Remove dependency on GlobalObject::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, mehdi_amini, jvesely, nhaehnle, hiraditya, steven_wu, dexonsmith, dang, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68944

llvm-svn: 374880
2019-10-15 11:24:36 +00:00
David Stenberg 284827f32b [DebugInfo] Add interface for pre-calculating the size of emitted DWARF
Summary:
DWARF's DW_OP_entry_value operation has two operands; the first is a
ULEB128 operand that specifies the size of the second operand, which is
a DWARF block. This means that we need to be able to pre-calculate and
emit the size of DWARF expressions before emitting them. There is
currently no interface for doing this in DwarfExpression, so this patch
introduces that.

When implementing this I initially thought about running through
DwarfExpression's emission two times; first with a temporary buffer to
emit the expression, in order to being able to calculate the size of
that emitted data. However, DwarfExpression is a quite complex state
machine, so I decided against that, as it seemed like the two runs could
get out of sync, resulting in incorrect size operands. Therefore I have
implemented this in a way that we only have to run DwarfExpression once.
The idea is to emit DWARF to a temporary buffer, for which it is
possible to query the size. The data in the temporary buffer can then be
emitted to DwarfExpression's main output.

In the case of DIEDwarfExpression, a temporary DIE is used. The values
are all allocated using the same BumpPtrAllocator as for all other DIEs,
and the values are then transferred to the real value list. In the case
of DebugLocDwarfExpression, the temporary buffer is implemented using a
BufferByteStreamer which emits to a buffer in the DwarfExpression
object.

Reviewers: aprantl, vsk, NikolaPrica, djtodoro

Reviewed By: aprantl

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67768

llvm-svn: 374879
2019-10-15 11:14:35 +00:00
Jeremy Morse ed29dbaafa [DebugInfo] Remove some users of DBG_VALUEs IsIndirect field
This patch kills off a significant user of the "IsIndirect" field of
DBG_VALUE machine insts. Brought up in in PR41675, IsIndirect is
techncally redundant as it can be expressed by the DIExpression of a
DBG_VALUE inst, and it isn't helpful to have two ways of expressing
things.

Rather than setting IsIndirect, have DBG_VALUE creators add an extra deref
to the insts DIExpression. There should now be no appearences of
IsIndirect=True from isel down to LiveDebugVariables / VirtRegRewriter,
which is ensured by an assertion in LDVImpl::handleDebugValue. This means
we also get to delete the IsIndirect handling in LiveDebugVariables. Tests
can be upgraded by for example swapping the following IsIndirect=True
DBG_VALUE:

  DBG_VALUE $somereg, 0, !123, !DIExpression(DW_OP_foo)

With one where the indirection is in the DIExpression, by _appending_
a deref:

  DBG_VALUE $somereg, $noreg, !123, !DIExpression(DW_OP_foo, DW_OP_deref)

Which both mean the same thing. 

Most of the test changes in this patch are updates of that form; also some
changes in how the textual assembly printer handles these insts.

Differential Revision: https://reviews.llvm.org/D68945

llvm-svn: 374877
2019-10-15 10:46:24 +00:00
David Stenberg d46ac44ecd Change Comments SmallVector to std::vector in DebugLocStream [NFC]
This changes the 32-element SmallVector to a std::vector. When building
a RelWithDebInfo clang-8 binary, the average size of the vector was
~10000, so it does not seem very beneficial or practical to use a small
vector for that.

The DWARFBytes SmallVector grows in the same way as Comments, so perhaps
that also should be changed to a purely dynamically allocated structure,
but that requires some more code changes, so I let that remain as a
SmallVector for now.

llvm-svn: 374871
2019-10-15 09:21:09 +00:00
Joerg Sonnenberger 9681ea9560 Reapply r374743 with a fix for the ocaml binding
Add a pass to lower is.constant and objectsize intrinsics

This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374784
2019-10-14 16:15:14 +00:00
David Stenberg 8535bed795 [DebugInfo] Fix truncation of call site immediates
Summary:
This addresses a bug in collectCallSiteParameters() where call site
immediates would be truncated from int64_t to unsigned.

This fixes PR43525.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: aprantl

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D68869

llvm-svn: 374770
2019-10-14 12:49:58 +00:00
Dmitri Gribenko 1a21f98ac3 Revert "Add a pass to lower is.constant and objectsize intrinsics"
This reverts commit r374743. It broke the build with Ocaml enabled:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/19218

llvm-svn: 374768
2019-10-14 12:22:48 +00:00
Sam Parker 527a35e155 [NFC][TTI] Add Alignment for isLegalMasked[Load/Store]
Add an extra parameter so the backend can take the alignment into
consideration.

Differential Revision: https://reviews.llvm.org/D68400

llvm-svn: 374763
2019-10-14 10:00:21 +00:00
Joerg Sonnenberger e4300c392d Add a pass to lower is.constant and objectsize intrinsics
This pass lowers is.constant and objectsize intrinsics not simplified by
earlier constant folding, i.e. if the object given is not constant or if
not using the optimized pass chain. The result is recursively simplified
and constant conditionals are pruned, so that dead blocks are removed
even for -O0. This allows inline asm blocks with operand constraints to
work all the time.

The new pass replaces the existing lowering in the codegen-prepare pass
and fallbacks in SDAG/GlobalISEL and FastISel. The latter now assert
on the intrinsics.

Differential Revision: https://reviews.llvm.org/D65280

llvm-svn: 374743
2019-10-13 23:00:15 +00:00
Simon Pilgrim 944a051ebb IRTranslator - silence static analyzer null dereference warnings. NFCI.
The CmpInst::getType() calls can be replaced by just using User::getType() that it was dyn_cast from, and we then need to assert that any default predicate cases came from the CmpInst.

llvm-svn: 374716
2019-10-13 11:29:35 +00:00
David Blaikie de9aa37bf0 DebugInfo: Reduce the scope of some variables related to debug_ranges emission
Minor tidy up/NFC

llvm-svn: 374613
2019-10-11 23:51:24 +00:00
David Blaikie 289c45cc62 DebugInfo: Use base address selection entries for debug_loc
Unify the range and loc emission (for both DWARFv4 and DWARFv5 style lists) and take advantage of that unification to use strategic base addresses for loclists.

Differential Revision: https://reviews.llvm.org/D68620

llvm-svn: 374600
2019-10-11 21:52:41 +00:00
David Green 7c30af8e65 Revert 374373: [Codegen] Alter the default promotion for saturating adds and subs
This commit is not extending the promoted integers as it should. Reverting
whilst I look into the details.

llvm-svn: 374592
2019-10-11 20:33:03 +00:00
Quentin Colombet 9c36ec5941 [GISel][CallLowering] Enable vector support in argument lowering
The exciting code is actually already enough to handle the splitting
of vector arguments but we were lacking a test case.

This commit adds a test case for vector argument lowering involving
splitting and enable the related support in call lowering.

llvm-svn: 374589
2019-10-11 20:22:57 +00:00
Quentin Colombet 7720f11498 [MachineIRBuilder] Fix an assertion failure with buildMerge
Teach buildMerge how to deal with scalar to vector kind of requests.

Prior to this patch, buildMerge would issue either a G_MERGE_VALUES
when all the vregs are scalars or a G_CONCAT_VECTORS when the destination
vreg is a vector.
G_CONCAT_VECTORS was actually not the proper instruction when the source
vregs were scalars and the compiler would assert that the sources must
be vectors. Instead we want is to issue a G_BUILD_VECTOR when we are
in this situation.

This patch fixes that.

llvm-svn: 374588
2019-10-11 20:22:47 +00:00
Sanjay Patel 3b581ac80f [DAGCombiner] fold vselect-of-constants to shift
The diffs suggest that we are missing some more basic
analysis/transforms, but this keeps the vector path in
sync with the scalar (rL374397). This is again a
preliminary step for introducing the reverse transform
in IR as proposed in D63382.

llvm-svn: 374555
2019-10-11 14:17:56 +00:00
Marcello Maggioni a064edf55e [GISel] Simplifying return from else in function. NFC
Forgot to integrate this little change in previous commit

llvm-svn: 374463
2019-10-10 21:51:30 +00:00
Marcello Maggioni 0112123eea [GISel] Allow getConstantVRegVal() to return G_FCONSTANT values.
In GISel we have both G_CONSTANT and G_FCONSTANT, but because
in GISel we don't really have a concept of Float vs Int value
the only difference between the two is where the data originates
from.

What both G_CONSTANT and G_FCONSTANT return is just a bag of bits
with the constant representation in it.

By making getConstantVRegVal() return G_FCONSTANTs bit representation
as well we allow ConstantFold and other things to operate with
G_FCONSTANT.

Adding tests that show ConstantFolding to work on mixed G_CONSTANT
and G_FCONSTANT sources.

Differential Revision: https://reviews.llvm.org/D68739

llvm-svn: 374458
2019-10-10 21:46:26 +00:00
Sanjay Patel 7b904ce724 [DAGCombiner] fold select-of-constants to shift
This reverses the scalar canonicalization proposed in D63382.

Pre: isPowerOf2(C1)
%r = select i1 %cond, i32 C1, i32 0
=>
%z = zext i1 %cond to i32
%r = shl i32 %z, log2(C1)

https://rise4fun.com/Alive/Z50

x86 already tries to fold this pattern, but it isn't done
uniformly, so we still see a diff. AArch64 probably should
enable the TLI hook to benefit too, but that's a follow-on.

llvm-svn: 374397
2019-10-10 17:52:02 +00:00
David Green 94d379095a [Codegen] Alter the default promotion for saturating adds and subs
The default promotion for the add_sat/sub_sat nodes currently does:
   1. ANY_EXTEND iN to iM
   2. SHL by M-N
   3. [US][ADD|SUB]SAT
   4. L/ASHR by M-N
If the promoted add_sat or sub_sat node is not legal, this can produce code
that effectively does a lot of shifting (and requiring large constants to be
materialised) just to use the overflow flag. It is simpler to just do the
saturation manually, using the higher bitwidth addition and a min/max against
the saturating bounds. That is what this patch attempts to do.

Differential Revision: https://reviews.llvm.org/D68643

llvm-svn: 374373
2019-10-10 16:04:49 +00:00
Sanjay Patel 7f0e7c0b1c [DAGCombiner] reduce code duplication; NFC
llvm-svn: 374370
2019-10-10 15:38:29 +00:00
Guillaume Chatelet ff054b9e32 [Alignment][NFC] Use llv::Align in GISelKnownBits
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68786

llvm-svn: 374369
2019-10-10 15:38:22 +00:00
Amaury Sechet aaf0507896 [DAGCombine] Match more patterns for half word bswap
Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68250

llvm-svn: 374340
2019-10-10 13:20:10 +00:00
Oliver Stannard 4f454b2275 [IfCvt][ARM] Optimise diamond if-conversion for code size
Currently, the heuristics the if-conversion pass uses for diamond if-conversion
are based on execution time, with no consideration for code size. This adds a
new set of heuristics to be used when optimising for code size.

This is mostly target-independent, because the if-conversion pass can
see the code size of the instructions which it is removing. For thumb,
there are a few passes (insertion of IT instructions, selection of
narrow branches, and selection of CBZ instructions) which are run after
if conversion and affect these heuristics, so I've added target hooks to
better predict the code-size effect of a proposed if-conversion.

Differential revision: https://reviews.llvm.org/D67350

llvm-svn: 374301
2019-10-10 09:58:28 +00:00
Reid Kleckner 9d8f0b3519 [codeview] Try to avoid emitting .cv_loc with line zero
Summary:
Visual Studio doesn't like it while stepping. It kicks you out of the
source view of the file being stepped through and tries to fall back to
the disassembly view.

Fixes PR43530

The fix is incomplete, because it's possible to have a basic block with
no source locations at all. In this case, we don't emit a .cv_loc, but
that will result in wrong stepping behavior in the debugger if the
layout predecessor of the location-less BB has an unrelated source
location. We could try harder to find a valid location that dominates or
post-dominates the current BB, but in general it's a dataflow problem,
and one still might not exist. I left a FIXME about this.

As an alternative, we might want to consider having the middle-end check
if its emitting codeview and get it to stop using line zero.

Reviewers: akhuang

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68747

llvm-svn: 374267
2019-10-10 01:06:01 +00:00
Philip Reames 931120846e Conservatively add volatility and atomic checks in a few places
As background, starting in D66309, I'm working on support unordered atomics analogous to volatile flags on normal LoadSDNode/StoreSDNodes for X86.

As part of that, I spent some time going through usages of LoadSDNode and StoreSDNode looking for cases where we might have missed a volatility check or need an atomic check. I couldn't find any cases that clearly miscompile - i.e. no test cases - but a couple of pieces in code loop suspicious though I can't figure out how to exercise them.

This patch adds defensive checks and asserts in the places my manual audit found. If anyone has any ideas on how to either a) disprove any of the checks, or b) hit the bug they might be fixing, I welcome suggestions.

Differential Revision: https://reviews.llvm.org/D68419

llvm-svn: 374261
2019-10-09 23:43:33 +00:00
Matt Arsenault 3cd3959fe2 GlobalISel: Implement fewerElementsVector for G_BUILD_VECTOR
Turn it into a G_CONCAT_VECTORS of G_BUILD_VECTOR.

llvm-svn: 374252
2019-10-09 22:44:43 +00:00
Evandro Menezes e60415a0db [Support] Add mathematical constants
Add own version of the mathematical constants from the upcoming C++20 `std::numbers`.

Differential revision: https://reviews.llvm.org/D68257

llvm-svn: 374207
2019-10-09 19:58:01 +00:00
Simon Pilgrim e746380f6a CodeGenPrepare - silence static analyzer dyn_cast<> null dereference warnings. NFCI.
The static analyzer is warning about potential null dereferences, but in these cases we should be able to use cast<> directly and if not assert will fire for us.

llvm-svn: 374085
2019-10-08 17:00:01 +00:00
Nikola Prica 98603a8153 [DebugInfo][If-Converter] Update call site info during the optimization
During the If-Converter optimization pay attention when copying or
deleting call instructions in order to keep call site information in
valid state.

Reviewers: aprantl, vsk, efriedma

Reviewed By: vsk, efriedma

Differential Revision: https://reviews.llvm.org/D66955

llvm-svn: 374068
2019-10-08 15:43:12 +00:00
Graham Hunter b302561b76 [SVE][IR] Scalable Vector size queries and IR instruction support
* Adds a TypeSize struct to represent the known minimum size of a type
  along with a flag to indicate that the runtime size is a integer multiple
  of that size
* Converts existing size query functions from Type.h and DataLayout.h to
  return a TypeSize result
* Adds convenience methods (including a transparent conversion operator
  to uint64_t) so that most existing code 'just works' as if the return
  values were still scalars.
* Uses the new size queries along with ElementCount to ensure that all
  supported instructions used with scalable vectors can be constructed
  in IR.

Reviewers: hfinkel, lattner, rkruppe, greened, rovka, rengolin, sdesmalen

Reviewed By: rovka, sdesmalen

Differential Revision: https://reviews.llvm.org/D53137

llvm-svn: 374042
2019-10-08 12:53:54 +00:00
Nicolai Haehnle 7febdb7f27 MachineSSAUpdater: insert IMPLICIT_DEF at top of basic block
Summary:
When getValueInMiddleOfBlock happens to be called for a basic block
that has no incoming value at all, an IMPLICIT_DEF is inserted in that
block via GetValueAtEndOfBlockInternal. This IMPLICIT_DEF must be at
the top of its basic block or it will likely not reach the use that
the caller intends to insert.

Issue: https://github.com/GPUOpen-Drivers/llpc/issues/204

Reviewers: arsenm, rampitec

Subscribers: jvesely, wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68183

llvm-svn: 374040
2019-10-08 12:46:20 +00:00
Reid Kleckner f9b67b810e [X86] Add new calling convention that guarantees tail call optimization
When the target option GuaranteedTailCallOpt is specified, calls with
the fastcc calling convention will be transformed into tail calls if
they are in tail position. This diff adds a new calling convention,
tailcc, currently supported only on X86, which behaves the same way as
fastcc, except that the GuaranteedTailCallOpt flag does not need to
enabled in order to enable tail call optimization.

Patch by Dwight Guth <dwight.guth@runtimeverification.com>!

Reviewed By: lebedev.ri, paquette, rnk

Differential Revision: https://reviews.llvm.org/D67855

llvm-svn: 373976
2019-10-07 22:28:58 +00:00
Matt Arsenault 4bcdcad91b GlobalISel: Partially implement lower for G_INSERT
llvm-svn: 373946
2019-10-07 19:13:27 +00:00
Matt Arsenault 27269054d2 GlobalISel: Add target pre-isel instructions
Allows targets to introduce regbankselectable
pseudo-instructions. Currently the closet feature to this is an
intrinsic. However this requires creating a public intrinsic
declaration. This litters the public intrinsic namespace with
operations we don't necessarily want to expose to IR producers, and
would rather leave as private to the backend.

Use a new instruction bit. A previous attempt tried to keep using enum
value ranges, but it turned into a mess.

llvm-svn: 373937
2019-10-07 18:43:29 +00:00
Jordan Rose fdaa742174 Second attempt to add iterator_range::empty()
Doing this makes MSVC complain that `empty(someRange)` could refer to
either C++17's std::empty or LLVM's llvm::empty, which previously we
avoided via SFINAE because std::empty is defined in terms of an empty
member rather than begin and end. So, switch callers over to the new
method as it is added.

https://reviews.llvm.org/D68439

llvm-svn: 373935
2019-10-07 18:14:24 +00:00
Kevin P. Neal 1c3d19c82d [FPEnv] Add constrained intrinsics for lrint and lround
Earlier in the year intrinsics for lrint, llrint, lround and llround were
added to llvm. The constrained versions are now implemented here.

Reviewed by:	andrew.w.kaylor, craig.topper, cameron.mcinally
Approved by:	craig.topper
Differential Revision:	https://reviews.llvm.org/D64746

llvm-svn: 373900
2019-10-07 13:20:00 +00:00
Simon Pilgrim b4ba3cbda0 [X86][AVX] Access a scalar float/double as a free extract from a broadcast load (PR43217)
If a fp scalar is loaded and then used as both a scalar and a vector broadcast, perform the load as a broadcast and then extract the scalar for 'free' from the 0th element.

This involved switching the order of the X86ISD::BROADCAST combines so we only convert to X86ISD::BROADCAST_LOAD once all other canonicalizations have been attempted.

Adds a DAGCombinerInfo::recursivelyDeleteUnusedNodes wrapper.

Fixes PR43217

Differential Revision: https://reviews.llvm.org/D68544

llvm-svn: 373871
2019-10-06 21:11:45 +00:00
Craig Topper 842dde6be4 [LegalizeTypes][X86] When splitting a vselect for type legalization, don't split a setcc condition if the setcc input is legal and vXi1 conditions are supported
Summary: The VSELECT splitting code tries to split a setcc input as well. But on avx512 where mask registers are well supported it should be better to just split the mask and use a single compare.

Reviewers: RKSimon, spatel, efriedma

Reviewed By: spatel

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68359

llvm-svn: 373863
2019-10-06 18:43:03 +00:00
Sanjay Patel f643fabb52 Revert [DAGCombine] Match more patterns for half word bswap
This reverts r373850 (git commit 25ba49824d)

This patch appears to cause multiple codegen regression test failures - http://lab.llvm.org:8011/builders/clang-cmake-armv7-quick/builds/10680

llvm-svn: 373853
2019-10-06 15:27:34 +00:00
Amaury Sechet 25ba49824d [DAGCombine] Match more patterns for half word bswap
Summary: It ensures that the bswap is generated even when a part of the subtree already matches a bswap transform.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68250

llvm-svn: 373850
2019-10-06 14:14:55 +00:00
Matt Arsenault a5b9c75674 GlobalISel: Partially implement lower for G_EXTRACT
Turn into shift and truncate. Doesn't yet handle pointers.

llvm-svn: 373838
2019-10-06 01:37:35 +00:00
Craig Topper 2decdf42b9 [FastISel] Copy the inline assembly dialect to the INLINEASM instruction.
Fixes PR43575.

llvm-svn: 373836
2019-10-05 23:21:17 +00:00
Simon Pilgrim f609c0a303 BranchFolding - IsBetterFallthrough - assert non-null pointers. NFCI.
Silences static analyzer null dereference warnings.

llvm-svn: 373823
2019-10-05 13:20:30 +00:00
Philip Reames d5a4dad206 Fix a *nasty* miscompile in experimental unordered atomic lowering
This is an omission in rL371441.  Loads which happened to be unordered weren't being added to the PendingLoad set, and thus weren't be ordered w/respect to side effects which followed before the end of the block.

Included test case is how I spotted this.  We had an atomic load being folded into a using instruction after a fence that load was supposed to be ordered with.  I'm sure it showed up a bunch of other ways as well.

Spotted via manual inspecting of assembly differences in a corpus w/and w/o the new experimental mode.  Finding this with testing would have been "unpleasant".  

llvm-svn: 373814
2019-10-05 00:32:10 +00:00
Reid Kleckner 67cfa79c01 Revert [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
This reverts r371177 (git commit f879c68755)

It caused PR43566 by removing empty, address-taken MachineBasicBlocks.
Such blocks may have references from blockaddress or other operands, and
need more consideration to be removed.

See the PR for a test case to use when relanding.

llvm-svn: 373805
2019-10-04 22:24:21 +00:00
Jessica Paquette 784892c964 [MachineOutliner] Disable outlining from noreturn functions
Outlining from noreturn functions doesn't do the correct thing right now. The
outliner should respect that the caller is marked noreturn. In the event that
we have a noreturn function, and the outlined code is in tail position, the
outliner will not see that the outlined function should be tail called. As a
result, you end up with a regular call containing a return.

Fixing this requires that we check that all candidates live inside noreturn
functions. So, for the sake of correctness, don't outline from noreturn
functions right now.

Add machine-outliner-noreturn.mir to test this.

llvm-svn: 373791
2019-10-04 21:24:12 +00:00
Eli Friedman 23ae13d51f [ScheduleDAG] When a node is cloned, add an edge between the nodes.
InstrEmitter's virtual register handling assumes that clones are emitted
after the cloned node.  Make sure this assumption actually holds.

Fixes a "Node emitted out of order - early" assertion on the testcase.

This is probably a very rare case to actually hit in practice; even
without the explicit edge, the scheduler will usually end up scheduling
the nodes in the expected order due to other constraints.

Differential Revision: https://reviews.llvm.org/D68068

llvm-svn: 373782
2019-10-04 19:51:40 +00:00
James Molloy 9baac83a2e [ModuloSchedule] Do not remap terminators
This is a trivial point fix. Terminator instructions aren't scheduled, so
we shouldn't expect to be able to remap them.

This doesn't affect Hexagon and PPC because their terminators are always
hardware loop backbranches that have no register operands.

llvm-svn: 373762
2019-10-04 17:15:25 +00:00
Simon Pilgrim 84f5cd75b3 Fix MSVC "not all control paths return a value" warning. NFCI.
llvm-svn: 373741
2019-10-04 12:45:27 +00:00
Jeremy Morse 61800a75b7 [DebugInfo] LiveDebugValues: move DBG_VALUE creation into VarLoc class
Rather than having a mixture of location-state shared between DBG_VALUEs
and VarLoc objects in LiveDebugValues, this patch makes VarLoc the
master record of variable locations. The refactoring means that the
transfer of locations from one place to another is always a performed by
an operation on an existing VarLoc, that produces another transferred
VarLoc. DBG_VALUEs are only created at the end of LiveDebugValues, once
all locations are known. As a plus, there is now only one method where
DBG_VALUEs can be created.

The test case added covers a circumstance that is now impossible to
express in LiveDebugValues: if an already-indirect DBG_VALUE is spilt,
previously it would have been restored-from-spill as a direct DBG_VALUE.
We now don't lose this information along the way, as VarLocs always
refer back to the "original" non-transfer DBG_VALUE, and we can always
work out whether a location was "originally" indirect.

Differential Revision: https://reviews.llvm.org/D67398

llvm-svn: 373727
2019-10-04 10:53:47 +00:00
Jeremy Morse 0ca48de26c [DebugInfo] LiveDebugValues: defer DBG_VALUE creation during analysis
When transfering variable locations from one place to another,
LiveDebugValues immediately creates a DBG_VALUE representing that
transfer. This causes trouble if the variable location should
subsequently be invalidated by a loop back-edge, such as in the added
test case: the transfer DBG_VALUE from a now-invalid location is used
as proof that the variable location is correct. This is effectively a
self-fulfilling prophesy.

To avoid this, defer the insertion of transfer DBG_VALUEs until after
analysis has completed. Some of those transfers are still sketchy, but
we don't propagate them into other blocks now.

Differential Revision: https://reviews.llvm.org/D67393

llvm-svn: 373720
2019-10-04 09:38:05 +00:00
Sanjay Patel 288079aafd [DAGCombiner] add operation legality checks before creating shift ops (PR43542)
As discussed on llvm-dev and:
https://bugs.llvm.org/show_bug.cgi?id=43542
...we have transforms that assume shift operations are legal and transforms to
use them are profitable, but that may not hold for simple targets.

In this case, the MSP430 target custom lowers shifts by repeating (many)
simpler/fixed ops. That can be avoided by keeping this code as setcc/select.

Differential Revision: https://reviews.llvm.org/D68397

llvm-svn: 373666
2019-10-03 21:34:04 +00:00
David Blaikie 2ac586c58f DebugInfo: Generalize rnglist emission as a precursor to reusing it for loclist emission
llvm-svn: 373663
2019-10-03 20:56:23 +00:00
James Molloy 9972c992eb [ModuloSchedule] removeBranch() *before* creating the trip count condition
The Hexagon code assumes there's no existing terminator when inserting its
trip count condition check.

This causes swp-stages5.ll to break. The generated code looks good to me,
it is likely a permutation. I have disabled the new codegen path to keep
everything green and will investigate along with the other 3-4 tests
that have different codegen.

Fixes expensive-checks build.

llvm-svn: 373629
2019-10-03 17:10:32 +00:00
Guillaume Chatelet d400d45150 [Alignment][NFC] Remove StoreInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, bollu, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68268

llvm-svn: 373595
2019-10-03 13:17:21 +00:00
David Blaikie 11e0bcf8a2 DebugInfo: Rename DebugLocStream::Entry::Begin/EndSym to just Begin/End
Brings this struct in line with the RangeSpan class so they might
eventually be used by common template code for generating range/loc
lists with less duplicate code.

llvm-svn: 373540
2019-10-02 22:58:02 +00:00
Craig Topper 2772b970e3 [LegalizeTypes] Check for already split condition before calilng SplitVecRes_SETCC in SplitRes_SELECT.
No point in manually splitting the SETCC if it was already done.

llvm-svn: 373535
2019-10-02 22:34:49 +00:00
David Blaikie b677cb8dc7 DebugInfo: Simplify RangeSpan to be a plain struct
This is an effort to make RangeSpan and DebugLocStream::Entry more
similar to share code for their emission (to reuse the more complicated
code for using (& choosing when to use) base address selection entries,
etc).

It didn't seem like this struct was worth the complexity of
encapsulation - when the members could be initialized by the ctor to any
value (no validation) and the type is assignable (so there's no
mutability or other constraint being implemented by its interface).

llvm-svn: 373533
2019-10-02 22:27:24 +00:00
Simon Pilgrim 49c2390877 [CodeGen] Remove unused MachineMemOperand::print wrappers (PR41772)
As noted on PR41772, the static analyzer reports that the MachineMemOperand::print partial wrappers set a number of args to null pointers that were then dereferenced in the actual implementation.

It turns out that these wrappers are not being used at all (hence why we're not seeing any crashes), so I'd like to propose we just get rid of them.

Differential Revision: https://reviews.llvm.org/D68208

llvm-svn: 373484
2019-10-02 16:20:28 +00:00
Hans Wennborg 9330005a54 Reapply r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)"
This was reverted in r373454 due to breaking the expensive-checks bot.
This version addresses that by omitting the addSuccessorWithProb() call
when omitting the range check.

> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131

llvm-svn: 373477
2019-10-02 14:35:06 +00:00
Simon Pilgrim 369d16a1c6 AsmPrinter - emitGlobalConstantFP - silence static analyzer null dereference warning. NFCI.
All the calls to emitGlobalConstantFP should provide a nonnull Type for the float.

llvm-svn: 373464
2019-10-02 13:08:46 +00:00
James Molloy 9026518e73 [ModuloSchedule] Peel out prologs and epilogs, generate actual code
Summary:
This extends the PeelingModuloScheduleExpander to generate prolog and epilog code,
and correctly stitch uses through the prolog, kernel, epilog DAG.

The key concept in this patch is to ensure that all transforms are *local*; only a
function of a block and its immediate predecessor and successor. By defining the problem in this way
we can inductively rewrite the entire DAG using only local knowledge that is easy to
reason about.

For example, we assume that all prologs and epilogs are near-perfect clones of the
steady-state kernel. This means that if a block has an instruction that is predicated out,
we can redirect all users of that instruction to that equivalent instruction in our
immediate predecessor. As all blocks are clones, every instruction must have an equivalent in
every other block.

Similarly we can make the assumption by construction that if a value defined in a block is used
outside that block, the only possible user is its immediate successors. We maintain this
even for values that are used outside the loop by creating a limited form of LCSSA.

This code isn't small, but it isn't complex.

Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet;
I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns
that we don't produce. Those still need a bit more investigation. In the meantime we
(Google) are happy with the code produced by this on our downstream SMS implementation,
and believe it generates correct code.

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68205

llvm-svn: 373462
2019-10-02 12:46:44 +00:00
Hans Wennborg 372aece777 Revert r373431 "Switch lowering: omit range check for bit tests when default is unreachable (PR43129)"
This broke http://lab.llvm.org:8011/builders/llvm-clang-x86_64-expensive-checks-win/builds/19967

> Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
>
> This is modeled after the same functionality for jump tables, which was
> added in r357067.
>
> Differential revision: https://reviews.llvm.org/D68131

llvm-svn: 373454
2019-10-02 12:08:44 +00:00
Simon Pilgrim c9129cea27 WinException::emitExceptHandlerTable - silence static analyzer dyn_cast<Function> null dereference warning. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<Function> directly and if not assert will fire for us.

llvm-svn: 373449
2019-10-02 11:48:32 +00:00
Hans Wennborg cbefc36fcc Switch lowering: omit range check for bit tests when default is unreachable (PR43129)
This is modeled after the same functionality for jump tables, which was
added in r357067.

Differential revision: https://reviews.llvm.org/D68131

llvm-svn: 373431
2019-10-02 08:32:15 +00:00
David Blaikie bfc68885d9 DebugInfo: Update support for detecting C++ language variants in debug info emission
llvm-svn: 373420
2019-10-02 01:39:48 +00:00
Jakub Kuderski 856c1cd852 [Dominators][CodeGen] Don't mark MachineDominatorTree as preserved in MachineLICM
llvm-svn: 373378
2019-10-01 18:27:44 +00:00
Jakub Kuderski 5be08ee902 [Dominators][CodeGen] Fix MachineDominatorTree preservation in PHIElimination
Summary:
PHIElimination modifies CFG and marks MachineDominatorTree as preserved. Therefore, it the CFG changes it should also update the MDT, when available. This patch teaches PHIElimination to recalculate MDT when necessary.

This fixes the `tailmerging_in_mbp.ll` test failure discovered after switching to generic DomTree verification algorithm in MachineDominators in D67976.

Reviewers: arsenm, hliao, alex-t, rampitec, vpykhtin, grosser

Reviewed By: rampitec

Subscribers: MatzeB, wdng, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68154

llvm-svn: 373377
2019-10-01 18:27:17 +00:00
Jakub Kuderski 925c285f43 Reapply [Dominators][CodeGen] Clean up MachineDominators
This reverts r373117 (git commit 159ef37735)

Phabricator review: https://reviews.llvm.org/D67976.

llvm-svn: 373376
2019-10-01 18:27:14 +00:00
Jay Foad e536800022 [AMDGPU] Add VerifyScheduling support.
Summary:
This is cut and pasted from the corresponding GenericScheduler
functions.

Reviewers: arsenm, atrick, tstellar, vpykhtin

Subscribers: MatzeB, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68264

llvm-svn: 373346
2019-10-01 15:45:47 +00:00
Simon Pilgrim 3c912c4abe [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863)
This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks.

The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants).

Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.).

Partial reversion of rL372756 - I've identified the infinite loop issue inside the X86 override but haven't fixed it yet so I've only (re)committed the common TargetLowering refactoring part of the patch.

Differential Revision: https://reviews.llvm.org/D67557

llvm-svn: 373343
2019-10-01 15:32:04 +00:00
Jakub Kuderski 56b52a207f [Dominators][CodeGen] Add MachinePostDominatorTree verification
Summary:
This patch implements Machine PostDominator Tree verification and ensures that the verification doesn't fail the in-tree tests.

MPDT verification can be enabled using `verify-machine-dom-info` -- the same flag used by Machine Dominator Tree verification.

Flipping the flag revealed that MachineSink falsely claimed to preserve CFG and MDT/MPDT. This patch fixes that.

Reviewers: arsenm, hliao, rampitec, vpykhtin, grosser

Reviewed By: hliao

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68235

llvm-svn: 373341
2019-10-01 15:23:27 +00:00
Dmitri Gribenko 827a7fab78 Revert "GlobalISel: Handle llvm.read_register"
This reverts commit r373294. It broke Clang's
CodeGen/arm64-microsoft-status-reg.cpp:
http://lab.llvm.org:8011/builders/clang-x86_64-debian-fast/builds/18483

llvm-svn: 373310
2019-10-01 08:24:01 +00:00
Matt Arsenault bdcc6d3d26 GlobalISel: Handle llvm.read_register
SelectionDAG has a bunch of machinery to defer this to selection time
for some reason. Just directly emit a copy during IRTranslator. The
x86 usage does somewhat questionably check hasFP, which could depend
on the whole function being at minimum translated.

This does lose the convergent bit if the callsite had it, which may be
a problem. We also lose that in general for intrinsics, which may also
be a problem.

llvm-svn: 373294
2019-10-01 02:07:16 +00:00
Matt Arsenault f24ac13aaa TLI: Remove DAG argument from getRegisterByName
Replace with the MachineFunction. X86 is the only user, and only uses
it for the function. This removes one obstacle from using this in
GlobalISel. The other is the more tolerable EVT argument.

The X86 use of the function seems questionable to me. It checks hasFP,
before frame lowering.

llvm-svn: 373292
2019-10-01 01:44:39 +00:00
Matt Arsenault ed85b0cee6 GlobalISel: Implement widenScalar for G_SITOFP/G_UITOFP sources
Legalize 16-bit G_SITOFP/G_UITOFP for AMDGPU.

llvm-svn: 373287
2019-10-01 01:06:48 +00:00
David Blaikie 38456776b3 DebugInfo: Simplify section label caching/usage
llvm-svn: 373273
2019-09-30 23:19:10 +00:00
Amaury Sechet e6f98c0073 [DAGCombiner] Clang format MatchRotate. NFC
llvm-svn: 373269
2019-09-30 21:41:52 +00:00
Daniel Sanders cbe13a1461 [globalisel][knownbits] Allow targets to call GISelKnownBits::computeKnownBitsImpl()
Summary:
It seems we missed that the target hook can't query the known-bits for the
inputs to a target instruction. Fix that oversight

Reviewers: aditya_nandakumar

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67380

llvm-svn: 373264
2019-09-30 20:55:53 +00:00
Amaury Sechet 496c0564f1 [DAGCombiner] Update MatchRotate so that it returns an SDValue. NFC
llvm-svn: 373260
2019-09-30 20:47:23 +00:00
Yuanfang Chen cc382cf727 [NewPM] Port MachineModuleInfo to the new pass manager.
Existing clients are converted to use MachineModuleInfoWrapperPass. The
new interface is for defining a new pass manager API in CodeGen.

Reviewers: fedor.sergeev, philip.pfaffe, chandlerc, arsenm

Reviewed By: arsenm, fedor.sergeev

Differential Revision: https://reviews.llvm.org/D64183

llvm-svn: 373240
2019-09-30 17:54:50 +00:00
Jessica Paquette b1c1095fdc [AArch64][GlobalISel] Support lowering variadic musttail calls
This adds support for lowering variadic musttail calls. To do this, we have
to...

- Detect a musttail call in a variadic function before attempting to lower the
  call's formal arguments. This is done in the IRTranslator.
- Compute forwarded registers in `lowerFormalArguments`, and add copies for
  those registers.
- Restore the forwarded registers in `lowerTailCall`.

Because there doesn't seem to be any nice way to wrap these up into the outgoing
argument handler, the restore code in `lowerTailCall` is done separately.

Also, irritatingly, you have to make sure that the registers don't overlap with
any passed parameters. Otherwise, the scheduler doesn't know what to do with the
extra copies and asserts.

Add call-translator-variadic-musttail.ll to test this. This is pretty much the
same as the X86 musttail-varargs.ll test. We didn't have as nice of a test to
base this off of, but the idea is the same.

Differential Revision: https://reviews.llvm.org/D68043

llvm-svn: 373226
2019-09-30 16:49:13 +00:00
Paul Robinson ed1f3f36ae [SSP] [3/3] cmpxchg and addrspacecast instructions can now
trigger stack protectors.  Fixes PR42238.

Add test coverage for llvm.memset, as proxy for all llvm.mem*
intrinsics. There are two issues here: (1) they could be lowered to a
libc call, which could be intercepted, and do Bad Stuff; (2) with a
non-constant size, they could overwrite the current stack frame.

The test was mostly written by Matt Arsenault in r363169, which was
later reverted; I tweaked what he had and added the llvm.memset part.

Differential Revision: https://reviews.llvm.org/D67845

llvm-svn: 373220
2019-09-30 15:11:23 +00:00
Paul Robinson 527815f5b0 [SSP] [2/3] Refactor an if/dyn_cast chain to switch on opcode. NFC
Differential Revision: https://reviews.llvm.org/D67844

llvm-svn: 373219
2019-09-30 15:08:38 +00:00
Paul Robinson 14945186c2 [SSP] [1/3] Revert "StackProtector: Use PointerMayBeCaptured"
"Captured" and "relevant to Stack Protector" are not the same thing.

This reverts commit f29366b1f5.
aka r363169.

Differential Revision: https://reviews.llvm.org/D67842

llvm-svn: 373216
2019-09-30 15:01:35 +00:00
Tamas Berghammer 421a186fb4 Support MemoryLocation::UnknownSize in TargetLowering::IntrinsicInfo
Summary:
Previously IntrinsicInfo::size was an unsigned what can't represent the
64 bit value used by MemoryLocation::UnknownSize.

Reviewers: jmolloy

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68219

llvm-svn: 373214
2019-09-30 14:44:24 +00:00
Guillaume Chatelet ab11b9188d [Alignment][NFC] Remove AllocaInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, arsenm, jvesely, nhaehnle, eraman, hiraditya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68141

llvm-svn: 373207
2019-09-30 13:34:44 +00:00
Guillaume Chatelet 17380227e8 [Alignment][NFC] Remove LoadInst::setAlignment(unsigned)
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, jdoerfert

Subscribers: hiraditya, asbirlea, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D68142

llvm-svn: 373195
2019-09-30 09:37:05 +00:00
Hans Wennborg dc7dbb1a88 NFC changes to SelectionDAGBuilder::visitBitTestHeader(), preparing for PR43129
llvm-svn: 373191
2019-09-30 08:47:53 +00:00
Roger Ferrer Ibanez 5a2a14db0b [TargetLowering] Simplify expansion of S{ADD,SUB}O
ISD::SADDO uses the suggested sequence described in the section §2.4 of
the RISCV Spec v2.2. ISD::SSUBO uses the dual approach but checking for
(non-zero) positive.

Differential Revision: https://reviews.llvm.org/D47927

llvm-svn: 373187
2019-09-30 07:58:50 +00:00
Amara Emerson 509a4947c9 Add an operand to memory intrinsics to denote the "tail" marker.
We need to propagate this information from the IR in order to be able to safely
do tail call optimizations on the intrinsics during legalization. Assuming
it's safe to do tail call opt without checking for the marker isn't safe because
the mem libcall may use allocas from the caller.

This adds an extra immediate operand to the end of the intrinsics and fixes the
legalizer to handle it.

Differential Revision: https://reviews.llvm.org/D68151

llvm-svn: 373140
2019-09-28 05:33:21 +00:00
Jakub Kuderski 159ef37735 Revert [Dominators][CodeGen] Clean up MachineDominators
This reverts r373101 (git commit 72c57ec3e6)

llvm-svn: 373117
2019-09-27 19:33:39 +00:00
Jakub Kuderski 72c57ec3e6 [Dominators][CodeGen] Clean up MachineDominators
Summary: This is a cleanup patch for MachineDominatorTree. It would be an NFC, except for replacing custom DomTree verification with the generic one.

Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao

Reviewed By: arsenm

Subscribers: wdng, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67976

llvm-svn: 373101
2019-09-27 17:25:39 +00:00
Djordje Todorovic eb4c98ca3d [DebugInfo] Exclude memory location values as parameter entry values
Abandon describing of loaded values due to safety concerns. Loaded
values are described as derefed memory location at caller point.
At callee we can unintentionally change that memory location which
would lead to different entry being printed value before and after
the memory location clobbering. This problem is described in
llvm.org/PR43343.

Patch by Nikola Prica

Differential Revision: https://reviews.llvm.org/D67717

llvm-svn: 373089
2019-09-27 13:52:43 +00:00
Jesper Antonsson 39b81f1cbc [CodeGenPrepare] Mend "avoid crashing from replacing a phi twice" fix.
Summary:
An erroneously negated if-statement by an earlier (March 2019) bugfix left phi replacement/simplification under optimizeMemoryInst()  in CodeGenPrepare largely inactivated. The error was found when csmith found that the same assert as in the original bug report could still be triggered in a different way. This patch fixes the bugfix. The original bug was:
 https://bugs.llvm.org/show_bug.cgi?id=41052
... and the previous fix was D59358.

Reviewers: aprantl, skatkov

Reviewed By: skatkov

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67838

llvm-svn: 373084
2019-09-27 13:01:37 +00:00
Guillaume Chatelet 18f805a7ea [Alignment][NFC] Remove unneeded llvm:: scoping on Align types
llvm-svn: 373081
2019-09-27 12:54:21 +00:00
Hans Wennborg 3740ae3b8a Revert r372893 "[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets"
This caused severe compile-time regressions, see PR43455.

> Modern processors predict the targets of an indirect branch regardless of
> the size of any jump table used to glean its target address.  Moreover,
> branch predictors typically use resources limited by the number of actual
> targets that occur at run time.
>
> This patch changes the semantics of the option `-max-jump-table-size` to limit
> the number of different targets instead of the number of entries in a jump
> table.  Thus, it is now renamed to `-max-jump-table-targets`.
>
> Before, when `-max-jump-table-size` was specified, it could happen that
> cluster jump tables could have targets used repeatedly, but each one was
> counted and typically resulted in tables with the same number of entries.
> With this patch, when specifying `-max-jump-table-targets`, tables may have
> different lengths, since the number of unique targets is counted towards the
> limit, but the number of unique targets in tables is the same, but for the
> last one containing the balance of targets.
>
> Differential revision: https://reviews.llvm.org/D60295

llvm-svn: 373060
2019-09-27 09:54:26 +00:00
Changpeng Fang f5524f0451 Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint
Reviewers:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D58360

llvm-svn: 373024
2019-09-26 22:53:44 +00:00
Xiangling Liao 3b808fb330 [AIX]Emit function descriptor csect in assembly
This patch emits the function descriptor csect for functions with definitions
under both 32-bit/64-bit mode on AIX.

Differential Revision: https://reviews.llvm.org/D66724

llvm-svn: 373009
2019-09-26 19:38:32 +00:00
Mikael Holmen 957e090ac9 [IfConversion] Disallow TBB == FBB for valid triangles
Summary:
Previously the case

     EBB
     | \_
     |  |
     | TBB
     |  /
     FBB

was treated as a valid triangle also when TBB and FBB was the same basic
block. This could then lead to an invalid CFG when we removed the edge
from EBB to TBB, since that meant we would also remove the edge from EBB
to FBB.

Since TBB == FBB is quite a degenerated case of a triangle, we now
don't treat it as a valid triangle anymore, and thus we will avoid the
trouble with updating the CFG.

Reviewers: efriedma, dmgreen, kparzysz

Reviewed By: efriedma

Subscribers: bjope, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67832

llvm-svn: 372943
2019-09-26 06:35:55 +00:00
Thomas Raoux 3c8c667235 [TargetLowering] Make allowsMemoryAccess methode virtual.
Rename old function to explicitly show that it cares only about alignment.
The new allowsMemoryAccess call the function related to alignment by default
and can be overridden by target to inform whether the memory access is legal or
not.

Differential Revision: https://reviews.llvm.org/D67121

llvm-svn: 372935
2019-09-26 00:16:01 +00:00
Jessica Paquette 8535a8672e [AArch64][GlobalISel] Choose CCAssignFns per-argument for tail call lowering
When checking for tail call eligibility, we should use the correct CCAssignFn
for each argument, rather than just checking if the caller/callee is varargs or
not.

This is important for tail call lowering with varargs. If we don't check it,
then basically any varargs callee with parameters cannot be tail called on
Darwin, for one thing. If the parameters are all guaranteed to be in registers,
this should be entirely safe.

On top of that, not checking for this could potentially make it so that we have
the wrong stack offsets when checking for tail call eligibility.

Also refactor some of the stuff for CCAssignFnForCall and pull it out into a
helper function.

Update call-translator-tail-call.ll to show that we can now correctly tail call
on Darwin. Also add two extra tail call checks. The first verifies that we still
respect the caller's stack size, and the second verifies that we still don't
tail call when a varargs function has a memory argument.

Differential Revision: https://reviews.llvm.org/D67939

llvm-svn: 372897
2019-09-25 16:45:35 +00:00
Evandro Menezes 3bd8ba156b [CodeGen] Replace -max-jump-table-size with -max-jump-table-targets
Modern processors predict the targets of an indirect branch regardless of
the size of any jump table used to glean its target address.  Moreover,
branch predictors typically use resources limited by the number of actual
targets that occur at run time.

This patch changes the semantics of the option `-max-jump-table-size` to limit
the number of different targets instead of the number of entries in a jump
table.  Thus, it is now renamed to `-max-jump-table-targets`.

Before, when `-max-jump-table-size` was specified, it could happen that
cluster jump tables could have targets used repeatedly, but each one was
counted and typically resulted in tables with the same number of entries.
With this patch, when specifying `-max-jump-table-targets`, tables may have
different lengths, since the number of unique targets is counted towards the
limit, but the number of unique targets in tables is the same, but for the
last one containing the balance of targets.

Differential revision: https://reviews.llvm.org/D60295

llvm-svn: 372893
2019-09-25 16:10:20 +00:00
Sanjay Patel 831a7e7068 [DAGCombiner] add one-use restriction to vector transform with cheap extract
We might be able to do better on the example in the test,
but in general, we should not scalarize a splatted vector
binop if there are other uses of the binop. Otherwise, we
can end up with code as we had - a scalar op that is
redundant with a vector op.

llvm-svn: 372886
2019-09-25 15:08:33 +00:00
Simon Pilgrim 5f2d8b2618 [TargetInstrInfo] Let findCommutedOpIndices take const MachineInstr&
Neither the base implementation of findCommutedOpIndices nor any in-tree target modifies the instruction passed in and there is no reason why they would in the future.

Committed on behalf of @hvdijk (Harald van Dijk)

Differential Revision: https://reviews.llvm.org/D66138

llvm-svn: 372882
2019-09-25 14:55:57 +00:00
Jakub Kuderski 269bd15c68 [Dominators][AMDGPU] Don't use virtual exit node in findNearestCommonDominator. Cleanup MachinePostDominators.
Summary:
This patch fixes a bug that originated from passing a virtual exit block (nullptr) to `MachinePostDominatorTee::findNearestCommonDominator` and resulted in assertion failures inside its callee. It also applies a small cleanup to the class.

The patch introduces a new function in PDT that given a list of `MachineBasicBlock`s finds their NCD. The new overload of `findNearestCommonDominator` handles virtual root correctly.

Note that similar handling of virtual root nodes is not necessary in (forward) `DominatorTree`s, as right now they don't use virtual roots.

Reviewers: tstellar, tpr, nhaehnle, arsenm, NutshellySima, grosser, hliao

Reviewed By: hliao

Subscribers: hliao, kzhuravl, jvesely, wdng, yaxunl, dstuttard, t-tye, hiraditya, llvm-commits

Tags: #amdgpu, #llvm

Differential Revision: https://reviews.llvm.org/D67974

llvm-svn: 372874
2019-09-25 14:04:36 +00:00
Sanjay Patel 2cec4b58f5 Revert [IR] allow fast-math-flags on phi of FP values
This reverts r372866 (git commit dec03223a9)

llvm-svn: 372868
2019-09-25 13:29:09 +00:00
Sanjay Patel dec03223a9 [IR] allow fast-math-flags on phi of FP values
The changes here are based on the corresponding diffs for allowing FMF on 'select':
D61917

As discussed there, we want to have fast-math-flags be a property of an FP value
because the alternative (having them on things like fcmp) leads to logical
inconsistency such as:
https://bugs.llvm.org/show_bug.cgi?id=38086

The earlier patch for select made almost no practical difference because most
unoptimized conditional code begins life as a phi (based on what I see in clang).
Similarly, I don't expect this patch to do much on its own either because
SimplifyCFG promptly drops the flags when converting to select on a minimal
example like:
https://bugs.llvm.org/show_bug.cgi?id=39535

But once we have this plumbing in place, we should be able to wire up the FMF
propagation and start solving cases like that.

The change to RecurrenceDescriptor::AddReductionVar() is required to prevent a
regression in a LoopVectorize test. We are intersecting the FMF of any
FPMathOperator there, so if a phi is not properly annotated, new math
instructions may not be either. Once we fix the propagation in SimplifyCFG, it
may be safe to remove that hack.

Differential Revision: https://reviews.llvm.org/D67564

llvm-svn: 372866
2019-09-25 13:14:12 +00:00
Simon Pilgrim 20f4afc5a7 [DAG] Pull out minimum shift value calc into a helper function. NFCI.
llvm-svn: 372856
2019-09-25 12:28:56 +00:00
Simon Pilgrim be9beef5da AggressiveAntiDepBreaker - silence static analyzer null dereference warning. NFCI.
Assert that we've found the critical path.

llvm-svn: 372759
2019-09-24 13:57:51 +00:00
Ilya Biryukov 60e5e0b667 Revert r372333: [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863)
Reason: this caused severe compile time regressions in JAX.
See email thread  of original revision on llvm-commits for details:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20190923/697042.html

llvm-svn: 372756
2019-09-24 13:48:02 +00:00
Simon Pilgrim 9942c07745 [ModuloSchedule] KernelRewriter::rewrite - silence static analyzer dyn_cast<> null dereference warning. NFCI.
Assert that we've found the start of the MI schedule list.

llvm-svn: 372723
2019-09-24 10:58:42 +00:00
Simon Pilgrim c81f8e4ce1 lowerObjCCall - silence static analyzer dyn_cast<CallInst> null dereference warnings. NFCI.
The static analyzer is warning about a potential null dereference, but we should be able to use cast<CallInst> directly and if not assert will fire for us.

llvm-svn: 372720
2019-09-24 10:46:30 +00:00
Pavel Labath aaff1a631a MCRegisterInfo: Merge getLLVMRegNum and getLLVMRegNumFromEH
Summary:
The functions different in two ways:
- getLLVMRegNum could return both "eh" and "other" dwarf register
  numbers, while getLLVMRegNumFromEH only returned the "eh" number.
- getLLVMRegNum asserted if the register was not found, while the second
  function returned -1.

The second distinction was pretty important, but it was very hard to
infer that from the function name. Aditionally, for the use case of
dumping dwarf expressions, we needed a function which can work with both
kinds of number, but does not assert.

This patch solves both of these issues by merging the two functions into
one, returning an Optional<unsigned> value. While the same thing could
be achieved by adding an "IsEH" argument to the (renamed)
getLLVMRegNumFromEH function, it seemed better to avoid the confusion of
two functions and put the choice of asserting into the hands of the
caller -- if he checks the Optional value, he can safely process
"untrusted" input, and if he blindly dereferences the Optional, he gets
the assertion.

I've updated all call sites to the new API, choosing between the two
options according to the function they were calling originally, except
that I've updated the usage in DWARFExpression.cpp to use the "safe"
method instead, and added a test case which would have previously
triggered an assertion failure when processing (incorrect?) dwarf
expressions.

Reviewers: dsanders, arsenm, JDevlieghere

Subscribers: wdng, aprantl, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67154

llvm-svn: 372710
2019-09-24 09:31:02 +00:00
Amara Emerson adec1209e6 [GlobalISel][IRTranslator] Fix switch table lowering to use signed LE not unsigned.
We were miscompiling switch value comparisons with the wrong signedness, which
shows up when we have things like switch case values with i1 types, which end up
being legalized incorrectly.

Fixes PR43383

llvm-svn: 372675
2019-09-24 00:09:23 +00:00
Sanjay Patel 7414151929 [BreakFalseDeps] ignore function with minsize attribute
This came up in the x86-specific:
https://bugs.llvm.org/show_bug.cgi?id=43239
...but it is a general problem for the BreakFalseDeps pass.
Dependencies may be broken by adding some other instruction,
so that should be avoided if the overall goal is to minimize size.

Differential Revision: https://reviews.llvm.org/D67363

llvm-svn: 372628
2019-09-23 17:01:01 +00:00
Guillaume Chatelet 1ae7905fc8 [Alignment][NFC] DataLayout migration to llvm::Align
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: jholewinski, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67914

llvm-svn: 372596
2019-09-23 12:41:36 +00:00
Simon Pilgrim f6f6c6ca3b Localizer - fix "variable used but never read" analyzer warning. NFCI.
Simplify the code by separating the modification of the Changed variable from returning it.

llvm-svn: 372583
2019-09-23 11:38:10 +00:00
Simon Pilgrim 0d6684d7e5 TargetInstrInfo::getStackSlotRange - fix "variable used but never read" analyzer warning. NFCI.
We don't need to divide the BitSize local variable at all.

llvm-svn: 372582
2019-09-23 11:36:24 +00:00
Simon Pilgrim 0b184b8526 CriticalAntiDepBreaker - Assert that we've found the bottom of the critical path. NFCI.
Silences static analyzer null dereference warnings.

llvm-svn: 372577
2019-09-23 10:42:47 +00:00
Craig Topper a533e87792 [X86][SelectionDAGBuilder] Move the hack for handling MMX shift by i32 intrinsics into the X86 backend.
This intrinsics should be shift by immediate, but gcc allows any
i32 scalar and clang needs to match that. So we try to detect the
non-constant case and move the data from an integer register to an
MMX register.

Previously this was done by creating a v2i32 build_vector and
bitcast in SelectionDAGBuilder. This had to be done early since
v2i32 isn't a legal type. The bitcast+build_vector would be DAG
combined to X86ISD::MMX_MOVW2D which isel will turn into a
GPR->MMX MOVD.

This commit just moves the whole thing to lowering and emits
the X86ISD::MMX_MOVW2D directly to avoid the illegal type. The
test changes just seem to be due to nodes being linearized in a
different order.

llvm-svn: 372535
2019-09-23 01:05:33 +00:00
Simon Pilgrim c8a9ae4ce2 [SelectionDAG] computeKnownBits/ComputeNumSignBits - cleanup demanded/unknown paths. NFCI.
Merge the calls, just adjust the demandedelts if we have a valid extract_subvector constant index, else demand all elts.

llvm-svn: 372521
2019-09-22 18:47:12 +00:00
James Molloy 8a74eca398 [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
Recommit: fix asan errors.

The way MachinePipeliner uses these target hooks is stateful - we reduce trip
count by one per call to reduceLoopCount. It's a little overfit for hardware
loops, where we don't have to worry about stitching a loop induction variable
across prologs and epilogs (the induction variable is implicit).

This patch introduces a new API:

  /// Analyze loop L, which must be a single-basic-block loop, and if the
  /// conditions can be understood enough produce a PipelinerLoopInfo object.
  virtual std::unique_ptr<PipelinerLoopInfo>
  analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;

The return value is expected to be an implementation of the abstract class:

  /// Object returned by analyzeLoopForPipelining. Allows software pipelining
  /// implementations to query attributes of the loop being pipelined.
  class PipelinerLoopInfo {
  public:
    virtual ~PipelinerLoopInfo();
    /// Return true if the given instruction should not be pipelined and should
    /// be ignored. An example could be a loop comparison, or induction variable
    /// update with no users being pipelined.
    virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;

    /// Create a condition to determine if the trip count of the loop is greater
    /// than TC.
    ///
    /// If the trip count is statically known to be greater than TC, return
    /// true. If the trip count is statically known to be not greater than TC,
    /// return false. Otherwise return nullopt and fill out Cond with the test
    /// condition.
    virtual Optional<bool>
    createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
                                 SmallVectorImpl<MachineOperand> &Cond) = 0;

    /// Modify the loop such that the trip count is
    /// OriginalTC + TripCountAdjust.
    virtual void adjustTripCount(int TripCountAdjust) = 0;

    /// Called when the loop's preheader has been modified to NewPreheader.
    virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;

    /// Called when the loop is being removed.
    virtual void disposed() = 0;
  };

The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while
allowing the target to hold its own state across all calls. This API, in
particular the disjunction of creating a trip count check condition and
adjusting the loop, improves the code quality in ModuloSchedule.cpp.

llvm-svn: 372463
2019-09-21 08:19:41 +00:00
Amara Emerson 7ac1039957 [GlobalISel] Defer setting HasCalls on MachineFrameInfo to selection time.
We currently always set the HasCalls on MFI during translation and legalization if
we're handling a call or legalizing to a libcall. However, if that call is later
optimized to a tail call then we don't need the flag. The flag being set to true
causes frame lowering to always save and restore FP/LR, which adds unnecessary code.

This change does the same thing as SelectionDAG and ports over some code that scans
instructions after selection, using TargetInstrInfo to determine if target opcodes
are known calls.

Code size geomean improvements on CTMark:
 -O0 : 0.1%
 -Os : 0.3%

Differential Revision: https://reviews.llvm.org/D67868

llvm-svn: 372443
2019-09-20 23:52:07 +00:00
Mitch Phillips 72a3d8597d Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount"
This commit broke the ASan buildbot. See comments in rL372376 for more
information.

This reverts commit 15e27b0b6d.

llvm-svn: 372425
2019-09-20 20:25:16 +00:00
Craig Topper 1b7b4b467f [SelectionDAG][Mips][Sparc] Don't allow SimplifyDemandedBits to constant fold TargetConstant nodes to a Constant.
Summary:
After the switch in SimplifyDemandedBits, it tries to create a
constant when possible. If the original node is a TargetConstant
the default in the switch will call computeKnownBits on the
TargetConstant which will succeed. This results in the
TargetConstant becoming a Constant. But TargetConstant exists to
avoid being changed.

I've fixed the two cases that relied on this in tree by explicitly
making the nodes constant instead of target constant. The Sparc
case is an old bug. The Mips case was recently introduced now that
ImmArg on intrinsics gets turned into a TargetConstant when the
SelectionDAG is created. I've removed the ImmArg since it lowers
to generic code.

Reviewers: arsenm, RKSimon, spatel

Subscribers: jyknight, sdardis, wdng, arichardson, hiraditya, fedor.sergeev, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67802

llvm-svn: 372409
2019-09-20 16:49:51 +00:00
Krzysztof Parzyszek 2b5d7e93dd [MVT] Add v256i1 to MachineValueType
This type can show up when lowering some HVX vector code on Hexagon.

llvm-svn: 372403
2019-09-20 15:19:20 +00:00
David Stenberg b71d8d465a Add a missing space in a MIR parser error message
llvm-svn: 372398
2019-09-20 14:41:41 +00:00
David Tellenbach 2a47c77e72 [FastISel] Fix insertion of unconditional branches during FastISel
The insertion of an unconditional branch during FastISel can differ depending on
building with or without debug information. This happens because FastISel::fastEmitBranch
emits an unconditional branch depending on the size of the current basic block
without distinguishing between debug and non-debug instructions.

This patch fixes this issue by ignoring debug instructions when getting the size
of the basic block.

Reviewers: aprantl

Reviewed By: aprantl

Subscribers: ormris, aprantl, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67703

llvm-svn: 372389
2019-09-20 13:22:59 +00:00
James Molloy 15e27b0b6d [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
The way MachinePipeliner uses these target hooks is stateful - we reduce trip
count by one per call to reduceLoopCount. It's a little overfit for hardware
loops, where we don't have to worry about stitching a loop induction variable
across prologs and epilogs (the induction variable is implicit).

This patch introduces a new API:

  /// Analyze loop L, which must be a single-basic-block loop, and if the
  /// conditions can be understood enough produce a PipelinerLoopInfo object.
  virtual std::unique_ptr<PipelinerLoopInfo>
  analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;

The return value is expected to be an implementation of the abstract class:

  /// Object returned by analyzeLoopForPipelining. Allows software pipelining
  /// implementations to query attributes of the loop being pipelined.
  class PipelinerLoopInfo {
  public:
    virtual ~PipelinerLoopInfo();
    /// Return true if the given instruction should not be pipelined and should
    /// be ignored. An example could be a loop comparison, or induction variable
    /// update with no users being pipelined.
    virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;

    /// Create a condition to determine if the trip count of the loop is greater
    /// than TC.
    ///
    /// If the trip count is statically known to be greater than TC, return
    /// true. If the trip count is statically known to be not greater than TC,
    /// return false. Otherwise return nullopt and fill out Cond with the test
    /// condition.
    virtual Optional<bool>
    createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
                                 SmallVectorImpl<MachineOperand> &Cond) = 0;

    /// Modify the loop such that the trip count is
    /// OriginalTC + TripCountAdjust.
    virtual void adjustTripCount(int TripCountAdjust) = 0;

    /// Called when the loop's preheader has been modified to NewPreheader.
    virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;

    /// Called when the loop is being removed.
    virtual void disposed() = 0;
  };

The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while
allowing the target to hold its own state across all calls. This API, in
particular the disjunction of creating a trip count check condition and
adjusting the loop, improves the code quality in ModuloSchedule.cpp.

llvm-svn: 372376
2019-09-20 08:57:46 +00:00
Matt Arsenault dd74f4839b MachineScheduler: Fix missing dependency with multiple subreg defs
If an instruction had multiple subregister defs, and one of them was
undef, this would improperly conclude all other lanes are
killed. There could still be other defs of those read-undef lanes in
other operands. This would improperly remove register uses from
CurrentVRegUses, so the visitation of later operands would not find
the necessary register dependency. This would also mean this would
fail or not depending on how different subregister def operands were
ordered.

On an undef subregister def, scan the instruction for other
subregister defs and avoid killing those.

This possibly should be deferring removing anything from
CurrentVRegUses until the entire instruction has been processed
instead.

llvm-svn: 372362
2019-09-20 00:09:15 +00:00
Matt Arsenault 3ecab8e455 Reapply r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
This reverts r372314, reapplying r372285 and the commits which depend
on it (r372286-r372293, and r372296-r372297)

This was missing one switch to getTargetConstant in an untested case.

llvm-svn: 372338
2019-09-19 16:26:14 +00:00
Simon Pilgrim af6043557d [DAG][X86] Convert isNegatibleForFree/GetNegatedExpression to a target hook (PR42863)
This patch converts the DAGCombine isNegatibleForFree/GetNegatedExpression into overridable TLI hooks and includes a demonstration X86 implementation.

The intention is to let us extend existing FNEG combines to work more generally with negatible float ops, allowing it work with target specific combines and opcodes (e.g. X86's FMA variants).

Unlike the SimplifyDemandedBits, we can't just handle target nodes through a Target callback, we need to do this as an override to allow targets to handle generic opcodes as well. This does mean that the target implementations has to duplicate some checks (recursion depth etc.).

I've only begun to replace X86's FNEG handling here, handling FMADDSUB/FMSUBADD negation and some low impact codegen changes (some FMA negatation propagation). We can build on this in future patches.

Differential Revision: https://reviews.llvm.org/D67557

llvm-svn: 372333
2019-09-19 15:02:47 +00:00
Amaury Sechet 9e94ef42ba [DAGCombiner] Add node to the worklist in topological order in scalarizeExtractedVectorLoad
Summary: As per title.

Reviewers: craig.topper, efriedma, RKSimon, lebedev.ri

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66661

llvm-svn: 372327
2019-09-19 14:22:11 +00:00
Simon Pilgrim c65dd89804 [DAG] Add SelectionDAG::MaxRecursionDepth constant
As commented on D67557 we have a lot of uses of depth checks all using magic numbers.

This patch adds the SelectionDAG::MaxRecursionDepth constant and moves over some general cases to use this explicitly.

Differential Revision: https://reviews.llvm.org/D67711

llvm-svn: 372315
2019-09-19 12:58:43 +00:00
Hans Wennborg 13bdae8541 Revert r372285 "GlobalISel: Don't materialize immarg arguments to intrinsics"
This broke the Chromium build, causing it to fail with e.g.

  fatal error: error in backend: Cannot select: t362: v4i32 = X86ISD::VSHLI t392, Constant:i8<15>

See llvm-commits thread of r372285 for details.

This also reverts r372286, r372287, r372288, r372289, r372290, r372291,
r372292, r372293, r372296, and r372297, which seemed to depend on the
main commit.

> Encode them directly as an imm argument to G_INTRINSIC*.
>
> Since now intrinsics can now define what parameters are required to be
> immediates, avoid using registers for them. Intrinsics could
> potentially want a constant that isn't a legal register type. Also,
> since G_CONSTANT is subject to CSE and legalization, transforms could
> potentially obscure the value (and create extra work for the
> selector). The register bank of a G_CONSTANT is also meaningful, so
> this could throw off future folding and legalization logic for AMDGPU.
>
> This will be much more convenient to work with than needing to call
> getConstantVRegVal and checking if it may have failed for every
> constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
> immarg operands, many of which need inspection during lowering. Having
> to find the value in a register is going to add a lot of boilerplate
> and waste compile time.
>
> SelectionDAG has always provided TargetConstant for constants which
> should not be legalized or materialized in a register. The distinction
> between Constant and TargetConstant was somewhat fuzzy, and there was
> no automatic way to force usage of TargetConstant for certain
> intrinsic parameters. They were both ultimately ConstantSDNode, and it
> was inconsistently used. It was quite easy to mis-select an
> instruction requiring an immediate. For SelectionDAG, start emitting
> TargetConstant for these arguments, and using timm to match them.
>
> Most of the work here is to cleanup target handling of constants. Some
> targets process intrinsics through intermediate custom nodes, which
> need to preserve TargetConstant usage to match the intrinsic
> expectation. Pattern inputs now need to distinguish whether a constant
> is merely compatible with an operand or whether it is mandatory.
>
> The GlobalISelEmitter needs to treat timm as a special case of a leaf
> node, simlar to MachineBasicBlock operands. This should also enable
> handling of patterns for some G_* instructions with immediates, like
> G_FENCE or G_EXTRACT.
>
> This does include a workaround for a crash in GlobalISelEmitter when
> ARM tries to uses "imm" in an output with a "timm" pattern source.

llvm-svn: 372314
2019-09-19 12:33:07 +00:00
Matt Arsenault c189f023ac MachineScheduler: Fix assert from not checking subregs
The assert would fail if there was a dead def of a subregister if
there was a previous use of a different subregister.

llvm-svn: 372287
2019-09-19 02:14:12 +00:00
Matt Arsenault d8399d12cd GlobalISel: Don't materialize immarg arguments to intrinsics
Encode them directly as an imm argument to G_INTRINSIC*.

Since now intrinsics can now define what parameters are required to be
immediates, avoid using registers for them. Intrinsics could
potentially want a constant that isn't a legal register type. Also,
since G_CONSTANT is subject to CSE and legalization, transforms could
potentially obscure the value (and create extra work for the
selector). The register bank of a G_CONSTANT is also meaningful, so
this could throw off future folding and legalization logic for AMDGPU.

This will be much more convenient to work with than needing to call
getConstantVRegVal and checking if it may have failed for every
constant intrinsic parameter. AMDGPU has quite a lot of intrinsics wth
immarg operands, many of which need inspection during lowering. Having
to find the value in a register is going to add a lot of boilerplate
and waste compile time.

SelectionDAG has always provided TargetConstant for constants which
should not be legalized or materialized in a register. The distinction
between Constant and TargetConstant was somewhat fuzzy, and there was
no automatic way to force usage of TargetConstant for certain
intrinsic parameters. They were both ultimately ConstantSDNode, and it
was inconsistently used. It was quite easy to mis-select an
instruction requiring an immediate. For SelectionDAG, start emitting
TargetConstant for these arguments, and using timm to match them.

Most of the work here is to cleanup target handling of constants. Some
targets process intrinsics through intermediate custom nodes, which
need to preserve TargetConstant usage to match the intrinsic
expectation. Pattern inputs now need to distinguish whether a constant
is merely compatible with an operand or whether it is mandatory.

The GlobalISelEmitter needs to treat timm as a special case of a leaf
node, simlar to MachineBasicBlock operands. This should also enable
handling of patterns for some G_* instructions with immediates, like
G_FENCE or G_EXTRACT.

This does include a workaround for a crash in GlobalISelEmitter when
ARM tries to uses "imm" in an output with a "timm" pattern source.

llvm-svn: 372285
2019-09-19 01:33:14 +00:00
Adrian Prantl 0779dffbd4 Remove the obsolete BlockByRefStruct flag from LLVM IR
DIFlagBlockByRefStruct is an unused DIFlag that originally was used by
clang to express (Objective-)C block captures in debug info. For the
last year Clang has been emitting complex DIExpressions to describe
block captures instead, which makes all the code supporting this flag
redundant.

This patch removes the flag and all supporting "dead" code, so we can
reuse the bit for something else in the future.

Since this only affects debug info generated by Clang with the block
extension this mostly affects Apple platforms and I don't have any
bitcode compatibility concerns for removing this. The Verifier will
reject debug info that uses the bit and thus degrade gracefully when
LTO'ing older bitcode with a newer compiler.

rdar://problem/44304813

Differential Revision: https://reviews.llvm.org/D67453

llvm-svn: 372272
2019-09-18 22:38:56 +00:00
Roman Lebedev c00f318224 [DAGCombine][ARM][X86] (sub Carry, X) -> (addcarry (sub 0, X), 0, Carry) fold
Summary:
`DAGCombiner::visitADDLikeCommutative()` already has a sibling fold:
`(add X, Carry) -> (addcarry X, 0, Carry)`

This fold, as suggested by @efriedma, helps recover from //some//
of the regressions of D62266

Reviewers: efriedma, deadalnix

Subscribers: javed.absar, kristof.beyls, llvm-commits, efriedma

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62392

llvm-svn: 372259
2019-09-18 20:48:27 +00:00
Guillaume Chatelet 97a18dc704 [Alignment][NFC] Align(1) to Align::None() conversions
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67715

llvm-svn: 372234
2019-09-18 16:19:40 +00:00
Guillaume Chatelet d4c4671aa7 [Alignment][NFC] Remove LogAlignment functions
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67620

llvm-svn: 372231
2019-09-18 15:49:49 +00:00
Guillaume Chatelet 35b4b403b4 [Alignment][NFC] Use Align::None instead of 1
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67704

llvm-svn: 372230
2019-09-18 15:40:20 +00:00
Krasimir Georgiev 2f1bba7fd0 Revert "[AArch64][DebugInfo] Do not recompute CalleeSavedStackSize"
Summary:
This reverts commit r372204.

This change causes build bot failures under msan:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/35236/steps/check-llvm%20msan/logs/stdio:

```
FAIL: LLVM :: DebugInfo/AArch64/asan-stack-vars.mir (19531 of 33579)
******************** TEST 'LLVM :: DebugInfo/AArch64/asan-stack-vars.mir' FAILED ********************
Script:
--
: 'RUN: at line 1';   /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc -O0 -start-before=livedebugvalues -filetype=obj -o - /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llvm-dwarfdump -v - | /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir
--
Exit Code: 2

Command Output (stderr):
--
==62894==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0xdfcafb in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3
    #1 0xdfae8a in resolveFrameIndexReference /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1580:10
    #2 0xdfae8a in llvm::AArch64FrameLowering::getFrameIndexReference(llvm::MachineFunction const&, int, unsigned int&) const /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1536
    #3 0x46642c1 in (anonymous namespace)::LiveDebugValues::extractSpillBaseRegAndOffset(llvm::MachineInstr const&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:582:21
    #4 0x4647cb3 in transferSpillOrRestoreInst /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:883:11
    #5 0x4647cb3 in process /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1079
    #6 0x4647cb3 in (anonymous namespace)::LiveDebugValues::ExtendRanges(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1361
    #7 0x463ac0e in (anonymous namespace)::LiveDebugValues::runOnMachineFunction(llvm::MachineFunction&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/LiveDebugValues.cpp:1415:18
    #8 0x4854ef0 in llvm::MachineFunctionPass::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/CodeGen/MachineFunctionPass.cpp:73:13
    #9 0x53b0b01 in llvm::FPPassManager::runOnFunction(llvm::Function&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1648:27
    #10 0x53b15f6 in llvm::FPPassManager::runOnModule(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1685:16
    #11 0x53b298d in runOnModule /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1750:27
    #12 0x53b298d in llvm::legacy::PassManagerImpl::run(llvm::Module&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1863
    #13 0x905f21 in compileModule(char**, llvm::LLVMContext&) /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:601:8
    #14 0x8fdc4e in main /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/tools/llc/llc.cpp:355:22
    #15 0x7f67673632e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0)
    #16 0x882369 in _start (/b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/llc+0x882369)

MemorySanitizer: use-of-uninitialized-value /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/lib/Target/AArch64/AArch64FrameLowering.cpp:1658:3 in llvm::AArch64FrameLowering::resolveFrameOffsetReference(llvm::MachineFunction const&, int, bool, unsigned int&, bool, bool) const
Exiting
error: -: The file was not recognized as a valid object file
FileCheck error: '-' is empty.
FileCheck command line:  /b/sanitizer-x86_64-linux-fast/build/llvm_build_msan/bin/FileCheck /b/sanitizer-x86_64-linux-fast/build/llvm-project/llvm/test/DebugInfo/AArch64/asan-stack-vars.mir
```

Reviewers: bkramer

Reviewed By: bkramer

Subscribers: sdardis, aprantl, kristof.beyls, jrtc27, atanasyan, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67710

llvm-svn: 372228
2019-09-18 14:42:09 +00:00
Sander de Smalen dc2a7f5b39 [AArch64][DebugInfo] Do not recompute CalleeSavedStackSize
This patch fixes a bug exposed by D65653 where a subsequent invocation
of `determineCalleeSaves` ends up with a different size for the callee
save area, leading to different frame-offsets in debug information.

In the invocation by PEI, `determineCalleeSaves` tries to determine
whether it needs to spill an extra callee-saved register to get an
emergency spill slot. To do this, it calls 'estimateStackSize' and
manually adds the size of the callee-saves to this. PEI then allocates
the spill objects for the callee saves and the remaining frame layout
is calculated accordingly.

A second invocation in LiveDebugValues causes estimateStackSize to return
the size of the stack frame including the callee-saves. Given that the
size of the callee-saves is added to this, these callee-saves are counted
twice, which leads `determineCalleeSaves` to believe the stack has
become big enough to require spilling an extra callee-save as emergency
spillslot. It then updates CalleeSavedStackSize with a larger value.

Since CalleeSavedStackSize is used in the calculation of the frame
offset in getFrameIndexReference, this leads to incorrect offsets for
variables/locals when this information is recalculated after PEI.

Reviewers: omjavaid, eli.friedman, thegameg, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D66935

llvm-svn: 372204
2019-09-18 09:02:44 +00:00
Francis Visoiu Mistrih ba2e752c52 [Remarks] Allow the RemarkStreamer to be used directly with a stream
The filename in the RemarkStreamer should be optional to allow clients
to stream remarks to memory or to existing streams.

This introduces a new overload of `setupOptimizationRemarks`, and avoids
enforcing the presence of a filename at different places.

llvm-svn: 372195
2019-09-18 01:04:45 +00:00
Craig Topper b5ffbd0b14 [SimplifyDemandedBits] Use APInt::intersects to instead of ANDing and comparing to 0 separately. NFC
llvm-svn: 372158
2019-09-17 18:19:02 +00:00
Benjamin Kramer df4b9a3f4f Hide implementation details in namespaces.
llvm-svn: 372113
2019-09-17 12:56:29 +00:00
Graham Hunter 1a9195d817 [SVE][MVT] Fixed-length vector MVT ranges
* Reordered MVT simple types to group scalable vector types
    together.
  * New range functions in MachineValueType.h to only iterate over
    the fixed-length int/fp vector types.
  * Stopped backends which don't support scalable vector types from
    iterating over scalable types.

Reviewers: sdesmalen, greened

Reviewed By: greened

Differential Revision: https://reviews.llvm.org/D66339

llvm-svn: 372099
2019-09-17 10:19:23 +00:00
Alexander Timofeev 6524a7a2b9 [AMDGPU]: PHI Elimination hooks added for custom COPY insertion. Fixed
Defferential Revision: https://reviews.llvm.org/D67101

Reviewers: rampitec, vpykhtin
llvm-svn: 372086
2019-09-17 09:08:58 +00:00
Amara Emerson 9d64721ca5 [GlobalISel] Partially revert r371901.
r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs, for now just undo the problematic parts of the change.

llvm-svn: 372050
2019-09-16 23:46:03 +00:00
Simon Pilgrim a8a4953fdf [GlobalISel] findGISelOptimalMemOpLowering - remove dead initalization. NFCI.
Fixes static analyzer warning that "Value stored to 'NewTySize' during its initialization is never read".

llvm-svn: 371937
2019-09-15 16:56:06 +00:00
Simon Pilgrim 2b4ace3f29 InterleavedLoadCombine - merge isa<> and dyn_cast<> duplicates. NFCI.
Silence static analyzer null dereference warning of *dyn_cast<BinaryOperator> by merging with the isa<BinaryOperator> above.

llvm-svn: 371935
2019-09-15 16:20:12 +00:00
Simon Pilgrim b743e94cdc [TargetLowering] SimplifyDemandedBits - add EXTRACT_SUBVECTOR support.
Call SimplifyDemandedBits on the source vector.

llvm-svn: 371923
2019-09-14 16:38:26 +00:00
Mingjie Xing 4b191770f4 [ScheduleDAGMILive] Fix typo in comment.
Differential Revision: https://reviews.llvm.org/D67478

llvm-svn: 371916
2019-09-14 03:27:38 +00:00
Amara Emerson 02bcc86b08 [GlobalISel] Fix insertion point of new instructions to be after PHIs.
For some reason we sometimes insert new instructions one instruction before
the first non-PHI when legalizing. This can result in having non-PHI
instructions before PHIs, which mean that PHI elimination doesn't catch them.

Differential Revision: https://reviews.llvm.org/D67570

llvm-svn: 371901
2019-09-13 21:49:24 +00:00
Jessica Paquette 727328ab63 [AArch64][GlobalISel] Tail call memory intrinsics
Because memory intrinsics are handled differently than other calls, we need to
check them for tail call eligiblity in the legalizer. This allows us to still
inline them when it's beneficial to do so, but also tail call when possible.

This adds simple tail calling support for when the intrinsic is followed by a
return.

It ports the attribute checks from `TargetLowering::isInTailCallPosition` into
a similarly-named function in LegalizerHelper.cpp. The target-specific
`isUsedByReturnOnly` hook is not ported here.

Update tailcall-mem-intrinsics.ll to show that GlobalISel can now tail call
memory intrinsics.

Update legalize-memcpy-et-al.mir to have a case where we don't tail call.

Differential Revision: https://reviews.llvm.org/D67566

llvm-svn: 371893
2019-09-13 20:25:58 +00:00
Alexander Timofeev 9ff70132bf Revert for: [AMDGPU]: PHI Elimination hooks added for custom COPY insertion.
llvm-svn: 371873
2019-09-13 17:37:30 +00:00
Craig Topper 4d1df2aa23 [TargetRegisterInfo] Remove SVT argument from getCommonSubClass.
This was added to support fp128 on x86-64, but appears to be
unneeded now. This may be because the FR128 register class
added back then was merged with the VR128 register class later.

llvm-svn: 371815
2019-09-13 05:24:37 +00:00
Tim Shen a31c521f5e Temporarily revert r371640 "LiveIntervals: Split live intervals on multiple dead defs".
It reveals a miscompile on Hexagon. See PR43302 for details.

llvm-svn: 371802
2019-09-13 01:34:25 +00:00
Matt Arsenault 4d33918034 AMDGPU/GlobalISel: Legalize G_FMAD
Unlike SelectionDAG, treat this as a normally legalizable operation.
In SelectionDAG this is supposed to only ever formed if it's legal,
but I've found that to be restricting. For AMDGPU this is contextually
legal depending on whether denormal flushing is allowed in the use
function.

Technically we currently treat the denormal mode as a subtarget
feature, so custom lowering could be avoided. However I consider this
to be a defect, and this should be contextually dependent on the
controllable rounding mode of the parent function.

llvm-svn: 371800
2019-09-13 00:44:35 +00:00
Matt Arsenault b85c8c4bbd LiveIntervals: Remove assertion
This testcase is invalid, and caught by the verifier. For the verifier
to catch it, the live interval computation needs to complete. Remove
the assert so the verifier catches this, which is less confusing.

In this testcase there is an undefined use of a subregister, and lanes
which aren't used or defined. An equivalent testcase with the
super-register shrunk to have no untouched lanes already hit this
verifier error.

llvm-svn: 371792
2019-09-12 23:46:51 +00:00
Philip Reames 079e210463 [SDAG] Update generic code to conservatively check for isAtomic in addition to isVolatile
This is the first sweep of generic code to add isAtomic bailouts where appropriate. The intention here is to have the switch from AtomicSDNode to LoadSDNode/StoreSDNode be close to NFC; that is, I'm not looking to allow additional optimizations at this time. That will come later.  See D66309 for context.

Differential Revision: https://reviews.llvm.org/D66318

llvm-svn: 371786
2019-09-12 22:49:17 +00:00
Jessica Paquette a42070a6aa [AArch64][GlobalISel] Support sibling calls with outgoing arguments
This adds support for lowering sibling calls with outgoing arguments.

e.g

```
define void @foo(i32 %a)
```

Support is ported from AArch64ISelLowering's `isEligibleForTailCallOptimization`.
The only thing that is missing is a full port of
`TargetLowering::parametersInCSRMatch`. So, if we're using swiftself,
we'll never tail call.

- Rename `analyzeCallResult` to `analyzeArgInfo`, since the function is now used
  for both outgoing and incoming arguments
- Teach `OutgoingArgHandler` about tail calls. Tail calls use frame indices for
  stack arguments.
- Teach `lowerFormalArguments` to set the bytes in the caller's stack argument
  area. This is used later to check if the tail call's parameters will fit on
  the caller's stack.
- Add `areCalleeOutgoingArgsTailCallable` to perform the eligibility check on
  the callee's outgoing arguments.

For testing:

- Update call-translator-tail-call to verify that we can now tail call with
  outgoing arguments, use G_FRAME_INDEX for stack arguments, and respect the
  size of the caller's stack
- Remove GISel-specific check lines from speculation-hardening.ll, since GISel
  now tail calls like the other selectors
- Add a GISel test line to tailcall-string-rvo.ll since we can tail call in that
  test now
- Add a GISel test line to tailcall_misched_graph.ll since we tail call there
  now. Add specific check lines for GISel, since the debug output from the
  machine-scheduler differs with GlobalISel. The dependency still holds, but
  the output comes out in a different order.

Differential Revision: https://reviews.llvm.org/D67471

llvm-svn: 371780
2019-09-12 22:10:36 +00:00
Craig Topper efe6724b9f [DAGCombiner][X86] Pass the CmpOpVT to reduceSelectOfFPConstantLoads so X86 can exclude fp128 compares.
The X86 decision assumes the compare will produce a result in an XMM
register, but that can't happen for an fp128 compare since those
go to a libcall the returns an i32. Pass the VT so X86 can check
the type.

llvm-svn: 371775
2019-09-12 21:30:18 +00:00
Craig Topper 344c398e2a [SelectionDAGBuilder] Simplify loop in visitSelect back to how it was before r255558.
This code was changed to accomodate fp128 being softened to itself
during type legalization on x86-64. This was done in order to create
libcalls while having fp128 as a legal type. We're now doing the
libcall creation during LegalizeDAG and the type legalization changes
to enable the old behavior have been removed. So this change to
SelectionDAGBuilder is no longer needed.

llvm-svn: 371771
2019-09-12 21:00:32 +00:00
David Green a6e944b173 [CGP] Ensure sinking multiple instructions does not invalidate dominance checks
In MVE, as of rL371218, we are attempting to sink chains of instructions such as:
  %l1 = insertelement <8 x i8> undef, i8 %l0, i32 0
  %broadcast.splat26 = shufflevector <8 x i8> %l1, <8 x i8> undef, <8 x i32> zeroinitializer
In certain situations though, we can end up breaking the dominance relations of
instructions. This happens when we sink the instruction into a loop, but cannot
remove the originals. The Use is updated, which might in fact be a Use from the
second instruction to the first.

This attempts to fix that by reversing the order of instruction that are sunk,
and ensuring that we update the uses on new instructions if they have already
been sunk, not the old ones.

Differential Revision: https://reviews.llvm.org/D67366

llvm-svn: 371743
2019-09-12 16:00:07 +00:00
Guillaume Chatelet af11cc7eb5 [Alignment] Move OffsetToAlignment to Alignment.h
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet, JDevlieghere, alexshap, rupprecht, jhenderson

Subscribers: sdardis, nemanjai, hiraditya, kbarton, jakehehrlich, jrtc27, MaskRay, atanasyan, jsji, seiya, cfe-commits, llvm-commits

Tags: #clang, #llvm

Differential Revision: https://reviews.llvm.org/D67499

llvm-svn: 371742
2019-09-12 15:20:36 +00:00
Simon Pilgrim da59a6bf7d [DAGCombine] visitFDIV - Use isCheaperToUseNegatedFPOps helper for (fdiv (fneg X), (fneg Y)) -> (fdiv X, Y). NFCI.
Minor cleanup to use equivalent helper code.

llvm-svn: 371724
2019-09-12 11:03:09 +00:00
Tim Northover f1c2892912 AArch64: support arm64_32, an ILP32 slice for watchOS.
This is the main CodeGen patch to support the arm64_32 watchOS ABI in LLVM.
FastISel is mostly disabled for now since it would generate incorrect code for
ILP32.

llvm-svn: 371722
2019-09-12 10:22:23 +00:00
Tim Northover 98534843fb CodeGenPrep: add separate hook say when GEPs should be used for sinking. NFCI.
Up to now, we've decided whether to sink address calculations using GEPs or
normal arithmetic based on the useAA hook, but there are other reasons GEPs
might be preferred. So this patch splits the two questions, with a default
implementation falling back to useAA.

llvm-svn: 371721
2019-09-12 10:21:00 +00:00
Qiu Chaofan b7fb5d0f6f [DAGCombiner] Improve division estimation of floating points.
Current implementation of estimating divisions loses precision since it
estimates reciprocal first and does multiplication.  This patch is to re-order
arithmetic operations in the last iteration in DAGCombiner to improve the
accuracy.

Reviewed By: Sanjay Patel, Jinsong Ji

Differential Revision: https://reviews.llvm.org/D66050

llvm-svn: 371713
2019-09-12 07:51:24 +00:00
Craig Topper b8dd075275 [LegalizeTypes] Remove code for softening a float type to itself.
This was previously used to turn fp128 operations into libcalls
on X86. This is now done through op legalization after r371672.

This restores much of this code to before r254653.

llvm-svn: 371709
2019-09-12 05:55:14 +00:00
Amara Emerson 55d86f04c7 [AArch64][GlobalISel] Fall back on attempts to allocate split types on the stack.
First we were asserting that the ValNo of a VA was the wrong value. It doesn't actually
make a difference for us in CallLowering but fix that anyway to silence the assert.

The bigger issue was that after fixing the assert we were generating invalid MIR
because the merging/unmerging of values split across multiple registers wasn't
also implemented for memory locs. This happens when we run out of registers and
have to pass the split types like i128 -> i64 x 2 on the stack. This is do-able, but
for now just fall back.

llvm-svn: 371693
2019-09-11 23:53:23 +00:00
Vedant Kumar 0b91333d59 [DWARF] Emit call site parameter info when tuning for lldb
Emit debug entry values using standard DWARF5 opcodes when the debugger
tuning is set to lldb.

Differential Revision: https://reviews.llvm.org/D67410

llvm-svn: 371666
2019-09-11 21:23:39 +00:00
Matt Arsenault 81196a595c LiveIntervals: Split live intervals on multiple dead defs
If there are multiple dead defs of the same virtual register, these
are required to be split into multiple virtual registers with separate
live intervals to avoid a verifier error.

llvm-svn: 371640
2019-09-11 17:59:21 +00:00
Guillaume Chatelet 97264366fb [Alignment][NFC] use llvm::Align for AsmPrinter::EmitAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: dschuff, sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67443

llvm-svn: 371616
2019-09-11 13:37:35 +00:00
Guillaume Chatelet 48904e9452 [Alignment] Use llvm::Align in MachineFunction and TargetLowering - fixes mir parsing
Summary:
This catches malformed mir files which specify alignment as log2 instead of pow2.
See https://reviews.llvm.org/D65945 for reference,

This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: MatzeB, qcolombet, dschuff, arsenm, sdardis, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, s.egerton, pzheng, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67433

llvm-svn: 371608
2019-09-11 11:16:48 +00:00
Jessica Paquette 469d42fcf6 [GlobalISel] When a tail call is emitted in a block, stop translating it
This fixes a crash in tail call translation caused by assume and lifetime_end
intrinsics.

It's possible to have instructions other than a return after a tail call which
will still have `Analysis::isInTailCallPosition` return true. (Namely,
lifetime_end and assume intrinsics.)

If we emit a tail call, we should stop translating instructions in the block.
Otherwise, we can end up emitting an extra return, or dead instructions in
general. This makes the verifier unhappy, and is generally unfortunate for
codegen.

This also removes the code from AArch64CallLowering that checks if we have a
tail call when lowering a return. This is covered by the new code now.

Also update call-translator-tail-call.ll to show that we now properly tail call
in the presence of lifetime_end and assume.

Differential Revision: https://reviews.llvm.org/D67415

llvm-svn: 371572
2019-09-10 23:34:45 +00:00
Jessica Paquette 2af5b193d5 [AArch64][GlobalISel] Support sibling calls with mismatched calling conventions
Add support for sibcalling calls whose calling convention differs from the
caller's.

- Port over `CCState::resultsCombatible` from CallingConvLower.cpp into
  CallLowering. This is used to verify that the way the caller and callee CC
  handle incoming arguments matches up.

- Add `CallLowering::analyzeCallResult`. This is basically a port of
  `CCState::AnalyzeCallResult`, but using `ArgInfo` rather than `ISD::InputArg`.

- Add `AArch64CallLowering::doCallerAndCalleePassArgsTheSameWay`. This checks
  that the calling conventions are compatible, and that the caller and callee
  preserve the same registers.

For testing:

- Update call-translator-tail-call.ll to show that we can now handle this.

- Add a GISel line to tailcall-ccmismatch.ll to show that we will not tail call
  when the regmasks don't line up.

Differential Revision: https://reviews.llvm.org/D67361

llvm-svn: 371570
2019-09-10 23:25:12 +00:00
Sanjay Patel df6a958dcb [BreakFalseDeps] fix typos/grammar in documentation comment; NFC
llvm-svn: 371516
2019-09-10 13:00:31 +00:00
Guillaume Chatelet 3729b17cff [Alignment][NFC] Use llvm::Align for TargetLowering::getPrefLoopAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Reviewed By: courbet

Subscribers: wuzish, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67386

llvm-svn: 371511
2019-09-10 12:00:43 +00:00
Alexander Timofeev c2d292f839 [AMDGPU]: PHI Elimination hooks added for custom COPY insertion.
Reviewers: rampitec, vpykhtin

  Differential Revision: https://reviews.llvm.org/D67101

llvm-svn: 371508
2019-09-10 10:58:57 +00:00
Dmitri Gribenko 2bf8d77453 Revert "Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline.""
This reverts commit r371502, it broke tests
(clang/test/CodeGenCXX/auto-var-init.cpp).

llvm-svn: 371507
2019-09-10 10:39:09 +00:00
Clement Courbet 612c260ec3 Reland "r364412 [ExpandMemCmp][MergeICmps] Move passes out of CodeGen into opt pipeline."
With a fix for sanitizer breakage (see explanation in D60318).

llvm-svn: 371502
2019-09-10 09:18:00 +00:00
Guillaume Chatelet b6722af068 [Alignment] Use Align for TargetLowering::MinStackArgumentAlignment
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: sdardis, nemanjai, hiraditya, kbarton, jrtc27, MaskRay, atanasyan, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67288

llvm-svn: 371498
2019-09-10 09:01:18 +00:00
Craig Topper e8b432fa0e [LegalizeTypes] Teach SoftenFloatOp_SELECT_CC to handle operand 2 or 3 being softened.
This can only happen on X86 when fp128 is a legal type, but we
go through softening to generate libcalls. This causes fp128 to
be softened to fp128 instead of an integer type. This can be
removed if D67128 lands.

llvm-svn: 371493
2019-09-10 07:56:02 +00:00
Aditya Nandakumar 5112b71126 [GlobalISel]: Fix a bug where we could dereference None
getConstantVRegVal returns None when dealing with constants > 64 bits.
Don't assume we always have a value in GISelKnownBits.

llvm-svn: 371465
2019-09-09 22:51:41 +00:00
Philip Reames 20aafa3156 Introduce infrastructure for an incremental port of SelectionDAG atomic load/store handling
This is the first patch in a large sequence. The eventual goal is to have unordered atomic loads and stores - and possibly ordered atomics as well - handled through the normal ISEL codepaths for loads and stores. Today, there handled w/instances of AtomicSDNodes. The result of which is that all transforms need to be duplicated to work for unordered atomics. The benefit of the current design is that it's harder to introduce a silent miscompile by adding an transform which forgets about atomicity.  See the thread on llvm-dev titled "FYI: proposed changes to atomic load/store in SelectionDAG" for further context.

Note that this patch is NFC unless the experimental flag is set.

The basic strategy I plan on taking is:

    introduce infrastructure and a flag for testing (this patch)
    Audit uses of isVolatile, and apply isAtomic conservatively*
    piecemeal conservative* update generic code and x86 backedge code in individual reviews w/tests for cases which didn't check volatile, but can be found with inspection
    flip the flag at the end (with minimal diffs)
    Work through todo list identified in (2) and (3) exposing performance ops

(*) The "conservative" bit here is aimed at minimizing the number of diffs involved in (4). Ideally, there'd be none. In practice, getting it down to something reviewable by a human is the actual goal. Note that there are (currently) no paths which produce LoadSDNode or StoreSDNode with atomic MMOs, so we don't need to worry about preserving any behaviour there.

We've taken a very similar strategy twice before with success - once at IR level, and once at the MI level (post ISEL). 

Differential Revision: https://reviews.llvm.org/D66309

llvm-svn: 371441
2019-09-09 19:23:22 +00:00
Eli Friedman 79f0d3a6e5 [IfConversion] Correctly handle cases where analyzeBranch fails.
If analyzeBranch fails, on some targets, the out parameters point to
some blocks in the function. But we can't use that information, so make
sure to clear it out.  (In some places in IfConversion, we assume that
any block with a TrueBB is analyzable.)

The change to the testcase makes it trigger a bug on builds without this
fix: IfConvertDiamond tries to perform a followup "merge" operation,
which isn't legal, and we somehow end up with a branch to a deleted MBB.
I'm not sure how this doesn't crash the compiler.

Differential Revision: https://reviews.llvm.org/D67306

llvm-svn: 371434
2019-09-09 18:29:27 +00:00
Craig Topper 5ebd0a6e88 [SelectionDAG] Remove ISD::FP_ROUND_INREG
I don't think anything in tree creates this node. So all of this
code appears to be dead.

Code coverage agrees
http://lab.llvm.org:8080/coverage/coverage-reports/llvm/coverage/Users/buildslave/jenkins/workspace/clang-stage2-coverage-R/llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp.html

Differential Revision: https://reviews.llvm.org/D67312

llvm-svn: 371431
2019-09-09 17:54:44 +00:00
Dmitri Gribenko d9c4060bd5 Revert "[MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp"
This reverts commit 371359. I'm suspecting a miscompile, I posted a
reproducer to https://reviews.llvm.org/D65267.

llvm-svn: 371421
2019-09-09 16:46:45 +00:00
James Molloy b6c7fce67a [DFAPacketizer] Reapply: Track resources for packetized instructions
Reapply with fix to reduce resources required by the compiler - use
unsigned[2] instead of std::pair. This causes clang and gcc to compile
the generated file multiple times faster, and hopefully will reduce
the resource requirements on Visual Studio also. This fix is a little
ugly but it's clearly the same issue the previous author of
DFAPacketizer faced (the previous tables use unsigned[2] rather uglily
too).

This patch allows the DFAPacketizer to be queried after a packet is formed to work out which
resources were allocated to the packetized instructions.

This is particularly important for targets that do their own bundle packing - it's not
sufficient to know simply that instructions can share a packet; which slots are used is
also required for encoding.

This extends the emitter to emit a side-table containing resource usage diffs for each
state transition. The packetizer maintains a set of all possible resource states in its
current state. After packetization is complete, all remaining resource states are
possible packetization strategies.

The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default
(most uses of the packetizer like MachinePipeliner don't care and don't need the extra
maintained state).

Differential Revision: https://reviews.llvm.org/D66936

llvm-svn: 371399
2019-09-09 13:17:55 +00:00
Simon Pilgrim 462e3d8050 Revert rL371198 from llvm/trunk: [DFAPacketizer] Track resources for packetized instructions
This patch allows the DFAPacketizer to be queried after a packet is formed to work out which
resources were allocated to the packetized instructions.

This is particularly important for targets that do their own bundle packing - it's not
sufficient to know simply that instructions can share a packet; which slots are used is
also required for encoding.

This extends the emitter to emit a side-table containing resource usage diffs for each
state transition. The packetizer maintains a set of all possible resource states in its
current state. After packetization is complete, all remaining resource states are
possible packetization strategies.

The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default
(most uses of the packetizer like MachinePipeliner don't care and don't need the extra
maintained state).

Differential Revision: https://reviews.llvm.org/D66936
........
Reverted as this is causing "compiler out of heap space" errors on MSVC 2017/19 NDEBUG builds

llvm-svn: 371393
2019-09-09 12:33:22 +00:00
Tim Northover 06d93e0a25 GlobalISel: fix unused warnings in release builds.
llvm-svn: 371385
2019-09-09 10:36:58 +00:00
Tim Northover 36147adc0b GlobalISel: add combiner to form indexed loads.
Loosely based on DAGCombiner version, but this part is slightly simpler in
GlobalIsel because all address calculation is performed by G_GEP. That makes
the inc/dec distinction moot so there's just pre/post to think about.

No targets can handle it yet so testing is via a special flag that overrides
target hooks.

llvm-svn: 371384
2019-09-09 10:04:23 +00:00
Kai Luo 9115c477bb [MachineCopyPropagation] Remove redundant copies after TailDup via machine-cp
Summary:
After tailduplication, we have redundant copies. We can remove these
copies in machine-cp if it's safe to, i.e.
```
$reg0 = OP ...
... <<< No read or clobber of $reg0 and $reg1
$reg1 = COPY $reg0 <<< $reg0 is killed
...
<RET>
```
will be transformed to
```
$reg1 = OP ...
...
<RET>
```

Differential Revision: https://reviews.llvm.org/D65267

llvm-svn: 371359
2019-09-09 02:32:42 +00:00
Craig Topper dac34f52d3 [DAGCombiner][X86][ARM] Teach visitMULO to fold multiplies with 0 to 0 and no carry.
I modified the ARM test to use two inputs instead of 0 so the
test hopefully still tests what was intended.

llvm-svn: 371344
2019-09-08 19:24:39 +00:00
David Stenberg 5a583665f4 [DebugInfo][X86] Describe call site values for zero-valued imms
Summary:
Add zero-materializing XORs to X86's describeLoadedValue() hook in order
to produce call site values.

I have had to change the defs logic in collectCallSiteParameters() a bit
to be able to describe the XORs. The XORs implicitly define $eflags,
which would cause them to never be considered, due to a guard condition
that I->getNumDefs() is one. I have changed that condition so that we
now only consider instructions where a forwarded register overlaps with
the instruction's single explicit define. We still need to collect the implicit
defines of other forwarded registers to remove them from the work list.
I'm not sure how to move towards supporting instructions with multiple
explicit defines, cases where forwarded register are implicitly defined,
and/or cases where an instruction produces values for multiple forwarded
registers. Perhaps the describeLoadedValue() hook should take a register
argument, and we then leave it up to the hook to describe the loaded
value in that register? I have not yet encountered a situation where
that would be necessary though.

Reviewers: aprantl, vsk, djtodoro, NikolaPrica

Reviewed By: vsk

Subscribers: ychen, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67225

llvm-svn: 371333
2019-09-08 14:22:06 +00:00
David Stenberg 8b70139e95 [NFC] Make the describeLoadedValue() hook return machine operand objects
Summary:
This changes the ParamLoadedValue pair which the describeLoadedValue()
hook returns so that MachineOperand objects are returned instead of
pointers.

When describing call site values we may need to describe operands which
are not part of the instruction. One such example is zero-materializing
XORs on x86, which I have implemented support for in a child revision.
Instead of having to return a pointer to an operand stored somewhere
outside the instruction, start returning objects directly instead, as
that simplifies the code.

The MachineOperand class only holds POD members, and on x86-64 it is 32
bytes large. That combined with copy elision means that the overhead of
returning a machine operand object from the hook does not become very
large.

I benchmarked this on a 8-thread i7-8650U machine with 32 GB RAM. The
benchmark consisted of building a clang 8.0 binary configured with:

  -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DLLVM_TARGETS_TO_BUILD=X86 \
  -DLLVM_USE_SANITIZER=Address \
  -DCMAKE_CXX_FLAGS="-Xclang -femit-debug-entry-values -stdlib=libc++"

The average wall clock time increased by 4 seconds, from 62:05 to
62:09, which is an 0.1% increase.

Reviewers: aprantl, vsk, djtodoro, NikolaPrica

Reviewed By: vsk

Subscribers: hiraditya, ychen, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67261

llvm-svn: 371332
2019-09-08 14:05:10 +00:00
Bjorn Pettersson d065c81164 [CodeGen] Handle SMULFIXSAT with scale zero in TargetLowering::expandFixedPointMul
Summary:
Normally TargetLowering::expandFixedPointMul would handle
SMULFIXSAT with scale zero by using an SMULO to compute the
product and determine if saturation is needed (if overflow
happened). But if SMULO isn't custom/legal it falls through
and uses the same technique, using MULHS/SMUL_LOHI, as used
for non-zero scales.

Problem was that when checking for overflow (handling saturation)
when not using MULO we did not expect to find a zero scale. So
we ended up in an assertion when doing
  APInt::getLowBitsSet(VTSize, Scale - 1)

This patch fixes the problem by adding a new special case for
how saturation is computed when scale is zero.

Reviewers: RKSimon, bevinh, leonardchan, spatel

Reviewed By: RKSimon

Subscribers: wuzish, nemanjai, hiraditya, MaskRay, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67071

llvm-svn: 371309
2019-09-07 12:16:23 +00:00
Bjorn Pettersson 5e331e4ce8 [Intrinsic] Add the llvm.umul.fix.sat intrinsic
Summary:
Add an intrinsic that takes 2 unsigned integers with
the scale of them provided as the third argument and
performs fixed point multiplication on them. The
result is saturated and clamped between the largest and
smallest representable values of the first 2 operands.

This is a part of implementing fixed point arithmetic
in clang where some of the more complex operations
will be implemented as intrinsics.

Patch by: leonardchan, bjope

Reviewers: RKSimon, craig.topper, bevinh, leonardchan, lebedev.ri, spatel

Reviewed By: leonardchan

Subscribers: ychen, wuzish, nemanjai, MaskRay, jsji, jdoerfert, Ka-Ka, hiraditya, rjmccall, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57836

llvm-svn: 371308
2019-09-07 12:16:14 +00:00
Bjorn Pettersson 2b698a13a1 [DwarfExpression] Disallow some rewrites to avoid undefined behavior
Summary:
The value operand in DW_OP_plus_uconst/DW_OP_constu value can be
large (it uses uint64_t as representation internally in LLVM).
This means that in the uint64_t to int conversions, previously done
by DwarfExpression::addMachineRegExpression, could lose information.
Also, the negation done in "-Offset" was undefined behavior in case
Offset was exactly INT_MIN.

To avoid the above problems, we now avoid transformation like
 [Reg, DW_OP_plus_uconst, Offset] --> [DW_OP_breg, Offset]
and
 [Reg, DW_OP_constu, Offset, DW_OP_plus]  --> [DW_OP_breg, Offset]
when Offset > INT_MAX.

And we avoid to transform
 [Reg, DW_OP_constu, Offset, DW_OP_minus] --> [DW_OP_breg,-Offset]
when Offset > INT_MAX+1.

The patch also adjusts DwarfCompileUnit::constructVariableDIEImpl
to make sure that "DW_OP_constu, Offset, DW_OP_minus" is used
instead of "DW_OP_plus_uconst, Offset" when creating DIExpressions
with negative frame index offsets.

Notice that this might just be the tip of the iceberg. There
are lots of fishy handling related to these constants. I think both
DIExpression::appendOffset and DIExpression::extractIfOffset may
trigger undefined behavior for certain values.

Reviewers: sdesmalen, rnk, JDevlieghere

Reviewed By: JDevlieghere

Subscribers: jholewinski, aprantl, hiraditya, ychen, uabelho, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D67263

llvm-svn: 371304
2019-09-07 11:40:10 +00:00
Teresa Johnson 9c27b59cec Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Matt Arsenault cf10372119 GlobalISel: Add G_FMAD instruction
llvm-svn: 371254
2019-09-06 20:49:10 +00:00
James Molloy db2fa06722 [DFAPacketizer] Track resources for packetized instructions
This patch allows the DFAPacketizer to be queried after a packet is formed to work out which
resources were allocated to the packetized instructions.

This is particularly important for targets that do their own bundle packing - it's not
sufficient to know simply that instructions can share a packet; which slots are used is
also required for encoding.

This extends the emitter to emit a side-table containing resource usage diffs for each
state transition. The packetizer maintains a set of all possible resource states in its
current state. After packetization is complete, all remaining resource states are
possible packetization strategies.

The sidetable is only ~500K for Hexagon, but the extra tracking is disabled by default
(most uses of the packetizer like MachinePipeliner don't care and don't need the extra
maintained state).

Differential Revision: https://reviews.llvm.org/D66936

llvm-svn: 371198
2019-09-06 12:20:08 +00:00
Jeremy Morse 5d9cd3b4ca [DebugInfo] LiveDebugValues: explicitly terminate overwritten stack locations
If a stack spill location is overwritten by another spill instruction,
any variable locations pointing at that slot should be terminated. We
cannot rely on spills always being restored to registers or variable
locations being moved by a DBG_VALUE: the register allocator is entitled
to spill a value and then forget about it when it goes out of liveness.

To address this, scan for memory writes to spill locations, even those we
don't consider to be normal "spills". isSpillInstruction and
isLocationSpill distinguish the two now. After identifying spill
overwrites, terminate the open range, and insert a $noreg DBG_VALUE for
that variable.

Differential Revision: https://reviews.llvm.org/D66941

llvm-svn: 371193
2019-09-06 10:08:22 +00:00
Kang Zhang f879c68755 [CodeGen] Do the Simple Early Return in block-placement pass to optimize the blocks
Summary:

Fix a bug of not update the jump table and recommit it again.

In `block-placement` pass, it will create some patterns for unconditional we can do the simple early retrun.
But the `early-ret` pass is before `block-placement`, we don't want to run it again.
This patch is to do the simple early return to optimize the blocks at the last of `block-placement`.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D63972

llvm-svn: 371177
2019-09-06 08:16:18 +00:00
Puyan Lotfi dc97ca9f25 [MIR] MIRNamer pass for improving MIR test authoring experience.
This patch reuses the MIR vreg renamer from the MIRCanonicalizerPass to cleanup
names of vregs in a MIR file for MIR test authors. I found it useful when
writing a regression test for a globalisel failure I encountered recently and
thought it might be useful for other folks as well.

Differential Revision: https://reviews.llvm.org/D67209

llvm-svn: 371121
2019-09-05 20:44:33 +00:00
Daniel Sanders f803237926 [globalisel][knownbits] Account for missing type constraints
Now that we look through copies, it's possible to visit registers that
have a register class constraint but not a type constraint. Avoid looking
through copies when this occurs as the SrcReg won't be able to determine
it's bit width or any known bits.

Along the same lines, if the initial query is on a register that doesn't
have a type constraint then the result is a default-constructed KnownBits,
that is, a 1-bit fully-unknown value.

llvm-svn: 371116
2019-09-05 20:26:02 +00:00
Jessica Paquette 20e8667098 Recommit "[AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls"
Recommit basic sibling call lowering (https://reviews.llvm.org/D67189)

The issue was that if you have a return type other than void, call lowering
will emit COPYs to get the return value after the call.

Disallow sibling calls other than ones that return void for now. Also
proactively disable swifterror tail calls for now, since there's a similar issue
with COPYs there.

Update call-translator-tail-call.ll to include test cases for each of these
things.

llvm-svn: 371114
2019-09-05 20:18:34 +00:00
Eli Friedman cae1e47f6e [IfConversion] Fix diamond conversion with unanalyzable branches.
The code was incorrectly counting the number of identical instructions,
and therefore tried to predicate an instruction which should not have
been predicated.  This could have various effects: a compiler crash,
an assembler failure, a miscompile, or just generating an extra,
unnecessary instruction.

Instead of depending on TargetInstrInfo::removeBranch, which only
works on analyzable branches, just remove all branch instructions.

Fixes https://bugs.llvm.org/show_bug.cgi?id=43121 and
https://bugs.llvm.org/show_bug.cgi?id=41121 .

Differential Revision: https://reviews.llvm.org/D67203

llvm-svn: 371111
2019-09-05 20:02:38 +00:00
Guillaume Chatelet f9f31ce6a9 [Alignment][NFC] Change internal representation of TargetLowering.h
Summary:
This is patch is part of a series to introduce an Alignment type.
See this thread for context: http://lists.llvm.org/pipermail/llvm-dev/2019-July/133851.html
See this patch for the introduction of the type: https://reviews.llvm.org/D64790

Reviewers: courbet

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67226

llvm-svn: 371082
2019-09-05 15:44:33 +00:00
Simon Pilgrim 071287c5a9 Revert rL370996 from llvm/trunk: [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189
........
This fails on EXPENSIVE_CHECKS builds due to a -verify-machineinstrs test failure in CodeGen/AArch64/dllimport.ll

llvm-svn: 371051
2019-09-05 10:38:39 +00:00
Guillaume Chatelet aff45e4b23 [LLVM][Alignment] Make functions using log of alignment explicit
Summary:
This patch renames functions that takes or returns alignment as log2, this patch will help with the transition to llvm::Align.
The renaming makes it explicit that we deal with log(alignment) instead of a power of two alignment.
A few renames uncovered dubious assignments:

 - `MirParser`/`MirPrinter` was expecting powers of two but `MachineFunction` and `MachineBasicBlock` were using deal with log2(align). This patch fixes it and updates the documentation.
 - `MachineBlockPlacement` exposes two flags (`align-all-blocks` and `align-all-nofallthru-blocks`) supposedly interpreted as power of two alignments, internally these values are interpreted as log2(align). This patch updates the documentation,
 - `MachineFunctionexposes` exposes `align-all-functions` also interpreted as power of two alignment, internally this value is interpreted as log2(align). This patch updates the documentation,

Reviewers: lattner, thegameg, courbet

Subscribers: dschuff, arsenm, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, hiraditya, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, dexonsmith, PkmX, jocewei, jsji, Jim, s.egerton, llvm-commits, courbet

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65945

llvm-svn: 371045
2019-09-05 10:00:22 +00:00
Jessica Paquette b78324fc40 [AArch64][GlobalISel] Teach AArch64CallLowering to handle basic sibling calls
This adds support for basic sibling call lowering in AArch64. The intent here is
to only handle tail calls which do not change the ABI (hence, sibling calls.)

At this point, it is very restricted. It does not handle

- Vararg calls.
- Calls with outgoing arguments.
- Calls whose calling conventions differ from the caller's calling convention.
- Tail/sibling calls with BTI enabled.

This patch adds

- `AArch64CallLowering::isEligibleForTailCallOptimization`, which is equivalent
   to the same function in AArch64ISelLowering.cpp (albeit with the restrictions
   above.)
- `mayTailCallThisCC` and `canGuaranteeTCO`, which are identical to those in
   AArch64ISelLowering.cpp.
- `getCallOpcode`, which is exactly what it sounds like.

Tail/sibling calls are lowered by checking if they pass target-independent tail
call positioning checks, and checking if they satisfy
`isEligibleForTailCallOptimization`. If they do, then a tail call instruction is
emitted instead of a normal call. If we have a sibling call (which is always the
case in this patch), then we do not emit any stack adjustment operations. When
we go to lower a return, we check if we've already emitted a tail call. If so,
then we skip the return lowering.

For testing, this patch

- Adds call-translator-tail-call.ll to test which tail calls we currently lower,
  which ones we don't, and which ones we shouldn't.
- Updates branch-target-enforcement-indirect-calls.ll to show that we fall back
  as expected.

Differential Revision: https://reviews.llvm.org/D67189

llvm-svn: 370996
2019-09-04 22:54:52 +00:00
Puyan Lotfi 028061d4eb [mir-canon][NFC] Move MIR vreg renaming code to separate file for better reuse.
Moving MIRCanonicalizerPass vreg renaming code to MIRVRegNamerUtils so that it
can be reused in another pass (ie planing to write a standalone mir-namer pass).

I'm going to write a mir-namer pass so that next time someone has to author a
test in MIR, they can use it to cleanup the naming and make it more readable by
having the numbered vregs swapped out with named vregs.

Differential Revision: https://reviews.llvm.org/D67114

llvm-svn: 370985
2019-09-04 21:29:10 +00:00
Matt Arsenault 5ff310e298 GlobalISel: Add basic legalization for G_BITREVERSE
llvm-svn: 370979
2019-09-04 20:46:15 +00:00
Daniel Sanders b276a9a51e [globalisel] Support trivial COPY in GISelKnownBits
Summary: Allow GISelKnownBits to look through the trivial case of TargetOpcode::COPY

Reviewers: aditya_nandakumar

Subscribers: rovka, hiraditya, volkan, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67131

llvm-svn: 370955
2019-09-04 18:59:43 +00:00
Philip Reames 3a49ca331f Update CodeGen to use hasMetadata as appropriate [NFC]
My intial grepping for rL370933 missed a directory worth of cases.

llvm-svn: 370942
2019-09-04 17:46:55 +00:00
Matt Arsenault 70becc20fa GlobalISel: Add G_BITREVERSE
This is the first failing pattern for AMDGPU and is trivial to handle.

llvm-svn: 370927
2019-09-04 17:06:53 +00:00
James Molloy 11f0f7f583 [ModuloSchedule] Fix no-asserts build
Apologies, due to a git SNAFU this fix (dump doesn't exist and silence unused variables) stayed in my index rather than applying to rL370893.

llvm-svn: 370894
2019-09-04 12:57:23 +00:00
James Molloy fef9f59055 [ModuloSchedule] Introduce PeelingModuloScheduleExpander
This is the beginnings of a reimplementation of ModuloScheduleExpander. It works
by generating a single-block correct pipelined kernel and then peeling out the
prolog and epilogs.

This patch implements kernel generation as well as a validator that will
confirm the number of phis added is the same as the ModuloScheduleExpander.

Prolog and epilog peeling will come in a different patch.

Differential Revision: https://reviews.llvm.org/D67081

llvm-svn: 370893
2019-09-04 12:54:24 +00:00
Jeremy Morse 337a7cb55e [DebugInfo] LiveDebugValues: locations with different exprs should not be merged
When comparing variable locations, LiveDebugValues currently considers only
the machine location, ignoring any DIExpression applied to it. This is a
problem because that DIExpression can do pretty much anything to the machine
location, for example dereferencing it.

This patch adds DIExpressions to that comparison; now variables based on the
same register/memory-location but with different expressions will compare
differently, and be dropped if we attempt to merge them between blocks. This
reduces variable coverage-range a little, but only because we were producing
broken locations.

Differential Revision: https://reviews.llvm.org/D66942

llvm-svn: 370877
2019-09-04 11:09:05 +00:00
Jeremy Morse c8c5f2a84e [LiveDebugValues][NFC] Silence an unused variable warning
On release builds, 'MI' isn't used by anything (it's already inserted into a
block by BuildMI), while on non-release builds it's used by a LLVM_DEBUG
statement. Mark as explicitly used to avoid the warning.

llvm-svn: 370870
2019-09-04 10:18:03 +00:00
Amara Emerson 5d5150f0b4 [GlobalISel] Fix G_SEXT narrowScalar to bail out of unsupported type combination.
Similar to the issue with G_ZEXT that was fixed earlier, this is a quick
to fall back if the source type is not exactly half of the dest type.

Fixes the clang-cmake-aarch64-lld bot build.

llvm-svn: 370847
2019-09-04 07:58:45 +00:00
Amara Emerson 2a2c25ba48 [AArch64][GlobalISel] Legalize 128 bit divisions to libcalls.
Now that we have the infrastructure to support s128 types as parameters
we can expand these to libcalls.

Differential Revision: https://reviews.llvm.org/D66185

llvm-svn: 370823
2019-09-03 21:42:32 +00:00
Amara Emerson fbaf425b79 [GlobalISel][CallLowering] Add support for splitting types according to calling conventions.
On AArch64, s128 types have to be split into s64 GPRs when passed as arguments.
This change adds the generic support in call lowering for dealing with multiple
registers, for incoming and outgoing args.

Support for splitting for return types not yet implemented.

Differential Revision: https://reviews.llvm.org/D66180

llvm-svn: 370822
2019-09-03 21:42:28 +00:00
Bjorn Pettersson b0eb394417 [CodeGen] Use FSHR in DAGTypeLegalizer::ExpandIntRes_MULFIX
Summary:
Simplify the right shift of the intermediate result (given
in four parts) by using funnel shift.

There are some impact on lit tests, but that seems to be
related to register allocation differences due to how FSHR
is expanded on X86 (giving a slightly different operand order
for the OR operations compared to the old code).

Reviewers: leonardchan, RKSimon, spatel, lebedev.ri

Reviewed By: RKSimon

Subscribers: hiraditya, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, s.egerton, pzheng, bevinh, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67036

llvm-svn: 370813
2019-09-03 19:35:07 +00:00
James Molloy 935499579c [MachinePipeliner] Add a way to unit-test the schedule emitter
Emitting a schedule is really hard. There are lots of corner cases to take care of; in fact, of the 60+ SWP-specific testcases in the Hexagon backend most of those are testing codegen rather than the schedule creation itself.

One issue is that to test an emission corner case we must craft an input such that the generated schedule uses that corner case; sometimes this is very hard and convolutes testcases. Other times it is impossible but we want to test it anyway.

This patch adds a simple test pass that will consume a module containing a loop and generate pipelined code from it. We use post-instr-symbols as a way to annotate instructions with the stage and cycle that we want to schedule them at.

We also provide a flag that causes the MachinePipeliner to generate these annotations instead of actually emitting code; this allows us to generate an input testcase with:

  llc < %s -stop-after=pipeliner -pipeliner-annotate-for-testing -o test.mir

And run the emission in isolation with:

  llc < test.mir -run-pass=modulo-schedule-test

llvm-svn: 370705
2019-09-03 08:20:31 +00:00
Craig Topper 9c74c77404 [LegalizeDAG] Pass DAG to two calls to SDNode::dump in debug prints so that they will print target specific nodes correctly.
The dump methods can only print target node names correctly if
they can get access to the TLI object.

llvm-svn: 370694
2019-09-03 02:51:14 +00:00
Robert Lougher 13190c4225 [TargetLowering][PS4] Add sincos(f) lib functions when target is PS4
PS4 supports sincosf and sincos. Adding the library functions enables
the sin(f)+cos(f) -> sincos(f) optimization.

Differential Revision: https://reviews.llvm.org/D67009

llvm-svn: 370675
2019-09-02 16:53:32 +00:00
Sanjay Patel 4e54cf3e0e [DAGCombiner] try to form test+set out of shift+mask patterns
The motivating bugs are:
https://bugs.llvm.org/show_bug.cgi?id=41340
https://bugs.llvm.org/show_bug.cgi?id=42697

As discussed there, we could view this as a failure of IR canonicalization,
but then we would need to implement a backend fixup with target overrides
to get this right in all cases. Instead, we can just view this as a codegen
opportunity. It's not even clear for x86 exactly when we should favor
test+set; some CPUs have better theoretical throughput for the ALU ops than
bt/test.

This patch is made more complicated than I expected because there's an early
DAGCombine for 'and' that can change types of the intermediate ops via
trunc+anyext.

Differential Revision: https://reviews.llvm.org/D66687

llvm-svn: 370668
2019-09-02 14:52:09 +00:00
Jeremy Morse 22493f66f1 [DebugInfo] LiveDebugValues: correctly discriminate kinds of variable locations
The missing line added by this patch ensures that only spilt variable
locations are candidates for being restored from the stack. Otherwise,
register or constant-value information can be interpreted as a spill
location, through a union.

The added regression test replicates a scenario where this occurs: the
stack load from [rsp] causes the register-location DBG_VALUE to be
"restored" to rsi, when it should be left alone. See PR43058 for details.

Un x-fail a test that was suffering from this from a previous patch.

Differential Revision: https://reviews.llvm.org/D66895

llvm-svn: 370648
2019-09-02 12:28:36 +00:00