Summary:
In SjLj exception mode, the old landingpad BB will create a new landingpad BB and use indirect branch jump to the old landingpad BB in lowering.
So we should add 2 endbr for this exception model.
Reviewers: hjl.tools, craig.topper, annita.zhang, LuoYuanke, pengfei, efriedma
Reviewed By: LuoYuanke
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77124
getFauxShuffle attempts to combine INSERT_VECTOR_ELT(TRUNCATE/EXTEND(EXTRACT_VECTOR_ELT(x))) patterns into a target shuffle chain.
PR45604 identified an issue where the scalar was truncated to a size smaller than the destination vector element and then zero extended back, which requires the upper bits to be zero'd which we don't currently do.
To avoid the bug I've added an early out in these truncation cases, a future commit should allow us to handle this by inserting the necessary SM_SentinelZero padding.
This is an enhancement to D77895 to avoid another
round-trip from XMM->GPR->XMM. This time we handle
the case of starting/ending with an f64 and casting
to signed i32 as the intermediate value.
It's a bit more involved than I initially assumed
because we need to use target-specific opcodes to
represent the non-standard cast ops.
Differential Revision: https://reviews.llvm.org/D78362
bitcast (shuf V, MaskC) --> shuf (bitcast V), MaskC'
This is the widen shuffle elements enhancement to D76727.
It builds on the analysis and simplifications in
D77881 and rG6a7e958a423e.
The phase ordering tests show that we can simplify inverse
shuffles across a binop in both directions (widen/narrow or
narrow/widen) now.
There's another potential transform visible in some of the
remaining TODOs - move a bitcasted operand of a shuffle
after the shuffle.
Differential Revision: https://reviews.llvm.org/D78371
See PR45510:
https://bugs.llvm.org/show_bug.cgi?id=45510
We had partial coverage for some of these patterns, so removing duplicate tests
with the complete set in the new test file.
First-order recurrences require special treatment when they are live-out;
such treatment is provided by fixFirstOrderRecurrence(), so they should be
included in AllowedExit set.
(Should probably have been included originally in D16197.)
Fixes PR45526: AllowedExit set is used by prepareToFoldTailByMasking() to
check whether the treatment for live-outs also holds when folding the tail,
which is not (yet) the case for first-order recurrences.
Differential Revision: https://reviews.llvm.org/D78210
Currently C++ symbols are demangled in the symbol table as well as in
the disassembly and relocations. This patch adds demangling of C++
symbols in targets of calls and branches making it easier to decipher
control flow in disassembly. This also matches up with GNUobjdump's
behavior
Reviewed By: MaskRay
Differential Revision: https://reviews.llvm.org/D77957
Cost-modeling decisions are tied to the compute interleave groups
(widening decisions, scalar and uniform values). When invalidating the
interleave groups, those decisions also need to be invalidated.
Otherwise there is a mis-match during VPlan construction.
VPWidenMemoryRecipes created initially are left around w/o converting them
into VPInterleave recipes. Such a conversion indeed should not take place,
and these gather/scatter recipes may in fact be right. The crux is leaving around
obsolete CM_Interleave (and dependent) markings of instructions along with
their costs, instead of recalculating decisions, costs, and recipes.
Alternatively to forcing a complete recompute later on, we could try
to selectively invalidate the decisions connected to the interleave
groups. But we would likely need to run the uniform/scalar value
detection parts again anyways and the extra complexity is probably not
worth it.
Fixes PR45572.
Reviewers: gilr, rengolin, Ayal, hsaito
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78298
Summary:
The instruction in non-text section can not be executed, so they will not affect performance.
In addition, their encoding values are treated as data, so we should not touch them.
Reviewers: MaskRay, reames, LuoYuanke, jyknight
Reviewed By: MaskRay
Subscribers: annita.zhang, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77971
[MachineOutliner] fix test for excluding CFI and add test to include CFI in outlining
New test to check that we only outline CFI instruction if all CFI
Instructions in the function would be captured by the outlining
adding x86 tests analagous to AARCH64 cfi tests
Revision: https://reviews.llvm.org/D77852
The immediate used for the regdef is the encoding for the register
class in the enum generated by tablegen. This encoding will change
any time a new register class is added. Since the number is part
of the input, this means it can become stale.
This change modifies some test to avoid this kind of immediate
all together. And updates one test to use the current encoding of
GR64.
Starting with hasRedZone adding MachineFunctionInfo to be put in the YAML for MIR files.
Split out of: D78062
Based on implementation for MachineFunctionInfo for WebAssembly
Differential Revision: https://reviews.llvm.org/D78173
Patch by Andrew Litteken! (AndrewLitteken)
InstCombine should ensure these don't exist.
I'm looking at making some changes to how we detect these
patterns and not having to worry about these phis will help.
This reverts commit 17b1869b72.
It is an attempt to fix the failure reported at
The patch differs from the original one reviwed at
https://reviews.llvm.org/D77435 only for the use of the std::make_tuple
in building the return value of `findAddrModeSVELoadStore`:
- return {IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset};
+ return std::make_tuple(IsRegReg ? Opc_rr : Opc_ri, NewBase,
the original patch submitted at
fc4e954ed5
was failing the following build:
http://lab.llvm.org:8011/builders/clang-armv7-linux-build-cache/builds/29420/
with error:
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp
/home/buildslave/buildslave/clang-armv7-linux-build-cache/llvm/llvm/lib/Target/AArch64/AArch64ISelDAGToDAG.cpp:1439:10:
error: chosen constructor is explicit in copy-initialization
return {IsRegReg ? Opc_rr : Opc_ri, NewBase, NewOffset};
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/usr/bin/../lib/gcc/arm-linux-gnueabihf/5.4.0/../../../../include/c++/5.4.0/tuple:479:19:
note: explicit constructor declared here
constexpr tuple(_UElements&&... __elements)
^
1 error generated.
Summary:
This change is fixing an issue where the dagcombine incorrectly used an addressing mode with scaled offsets (indices), instead of unscaled offsets.
Those addressing modes do not exist for `prfh` , `prfw` and `prfd`, hence we can reuse `prfb` because that has unscaled offsets, and because the pseudo-code in the XML spec suggests that the element size is not used for the amount of data that is prefetched by the instruction.
FWIW, GCC also emits a `prfb` for these cases.
Reviewers: sdesmalen, andwar, rengolin
Reviewed By: sdesmalen
Subscribers: tschuett, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D78069
Summary:
Fixed wrong conditions for generating (S[LR]I X, Y, C2) from
(or (and X, BvecC1), (lsl Y, C2)) and added ISel nodes to lower to S[LR]I. The
optimisation is also enabled by default now.
Subscribers: kristof.beyls, hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77387
Add initial support for PC Relative addressing for global values that
require GOT indirect addressing. This patch adds PCRelative support for
global addresses that may not be known at link time and may require
access through the GOT.
Differential Revision: https://reviews.llvm.org/D76064
Summary:
AIX symbol have qualname and unqualified name. The stock getSymbol
could only return unqualified name, which leads us to patch many
caller side(lowerConstant, getMCSymbolForTOCPseudoMO).
So we should try to address this problem in the callee
side(getSymbol) and clean up the caller side instead.
Note: this is a "mostly" NFC patch, with a fix for the original
lowerConstant behavior.
Differential Revision: https://reviews.llvm.org/D78045
There is no need to use `--check-prefix` multiple times.
It helps to improve readability/test maintainability.
This patch does it for all tools at once.
Differential revision: https://reviews.llvm.org/D78217
Add patterns that use a normal, non-wrapping, add and sub nodes along
with an arm vshr imm node.
Differential Revision: https://reviews.llvm.org/D77065
These two tests utilize pre-generated opt-viewer output to diff against
a run of opt-viewer over a known yaml file.
In commit 4b428e8f (D76126), the escape function used for rendering was changed
from cgi.escape to html.escape. This modification causes a behavioral
difference with regards to quote characters.
cgi will not escape quotes by default, but html will.
Therefore, these tests were failing because they expected the old behavior
of "string", but was instead seeing "string".
This solution modifies the known test outputs to use the escaped quotes
rather than not escaping quotes during rendering for no particular reason.
It is notable that when testing the optimization records generated by
LLVM, there was never quotes in the remarks I could find, specifically in
the Callee field where they exist in the pre-generated yaml for testing.
Differential Revision: https://reviews.llvm.org/D78241
If we are and the constant like 0xFFFFFFC00000, for now, we are using several
instructions to generate this 48bit constant and final an "and". However, we
could exploit it with two rotate instructions.
MB ME MB+63-ME
+----------------------+ +----------------------+
|0000001111111111111000| -> |0000000001111111111111|
+----------------------+ +----------------------+
0 63 0 63
Rotate left ME + 1 bit first, and then, mask it with (MB + 63 - ME, 63),
finally, rotate back. Notice that, we need to round it with 64 bit for the
wrapping case.
Reviewed by: ChenZheng, Nemanjai
Differential Revision: https://reviews.llvm.org/D71831
I've always found the "findValue" a little odd and
inconsistent with other things in SDB.
This simplfifies the code in SDB to just handle a splat constant
address or a 2 operand GEP in the same BB. This removes the
need for "findValue" since the operands to the GEP are
guaranteed to be available. The splat constant handling is
new, but was needed to avoid regressions due to constant
folding combining GEPs created in CGP.
CGP is now responsible for canonicalizing gather/scatters into
this form. The pattern I'm using for scalarizing, a scalar GEP
followed by a GEP with an all zeroes index, seems to be subject
to constant folding that the insertelement+shufflevector was not.
Differential Revision: https://reviews.llvm.org/D76947
Summary:
This matches the behavior of GNU addr2line. We previously treated
hexadecimal addresses as binary if they started with 0b, otherwise as
octal if they started with 0, otherwise as decimal.
This only affects llvm-addr2line; the behavior of llvm-symbolize is
unaffected.
Reviewers: ikudrin, rupprecht, jhenderson
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73306
Summary:
Changes the type of the @__typeid_.*_unique_member imports we generate
for unique return value optimization from i8 to [0 x i8]. This
prevents assuming that these imports do not alias, such as when
two unique return values occur in the same vtable.
Fixes PR45393.
Reviewers: tejohnson, pcc
Reviewed By: pcc
Subscribers: aganea, hiraditya, rnk, george.burgess.iv, dblaikie, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77421