This also means we have to check if the latch is the exiting block now,
as `transform` expects the latches to be the exiting blocks too.
https://bugs.llvm.org/show_bug.cgi?id=36586
Reviewers: efriedma, davide, karthikthecool
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45279
llvm-svn: 330806
Summary:
When Reassociate is rewriting an expression tree, it may
reuse old binary expression nodes for new expressions.
Whenever an expression node is reused, but with a non-trivial
change in the result, we need to invalidate any debug info
that is associated with the node.
If for example rewriting
x = mul a, b
y = mul c, x
into
x = mul c, b
y = mul a, x
we still get the same result for 'y', but 'x' is a new expression.
All debug info referring to 'x' must be invalidated (marked as
optimized out) since we no longer calculate the expected value.
As a side effect, this patch avoids (at least some) problems where
reassociate could end up creating IR with a debug use before the def.
Earlier the dbg.value nodes were left untouched in the IR, while
the reused binary nodes were sunk to just before the root node
of the rewritten expression tree. See PR27273 for more info about
such problems.
Reviewers: dblaikie, aprantl, dexonsmith
Reviewed By: aprantl
Subscribers: JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D45975
llvm-svn: 330804
Summary:
Use a MapVector instead of a DenseMap for RemMap since it is iterated
over, and the order of iteration can affect the order in which new
instructions are created. This can in turn affect the use-list order of
div/rem input values if multiple new instructions are created that share
any input values.
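A minimal, self-contained sketch (not the pass code itself) of the property that matters here: DenseMap iteration order is unspecified and hash-dependent, while MapVector iterates in insertion order, so code that creates new instructions while walking the map becomes deterministic.
```
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"
#include <cstdio>

int main() {
  llvm::DenseMap<int, int> DM;  // iteration order depends on hashing
  llvm::MapVector<int, int> MV; // iteration order is insertion order
  for (int K : {42, 7, 100}) {
    DM[K] = K;
    MV[K] = K;
  }
  for (const auto &KV : MV)
    std::printf("%d ", KV.first); // always prints: 42 7 100
  return 0;
}
```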
Reviewers: spatel
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D45858
llvm-svn: 330792
update API for dominators rather than doing manual, hacky updates.
This is just the first step, but in some ways the most important as it
moves the non-trivial unswitching to update the domtree rather than
fully recalculating it each time.
Subsequent patches should remove the custom update logic used by the
trivial unswitch and replace it with uses of the update API.
This also fixes a number of bugs I was seeing when testing non-trivial
unswitch due to it querying the quasi-correct dominator tree. Now the
tree is 100% correct and safe to query. That said, there are still more
bugs I can see with non-trivial unswitch just running over the test
suite, so more bugfix patches are needed as well.
Thanks to both Sanjoy and Fedor for reviews and testing!
Differential Revision: https://reviews.llvm.org/D45943
llvm-svn: 330787
Before, the outliner would grab ADRPs that used LR/W30. This patch fixes
that by checking for explicit uses of those registers before the special-casing
for ADRPs.
This also adds a test that ensures that those sorts of ADRPs won't be outlined.
llvm-svn: 330783
We were previously preferring ZEXTLOAD over EXTLOAD if it is legal. This triggers during X86's promotion of i16->i32. Not sure about other targets.
Using ZEXTLOAD can prevent folding it to SEXTLOAD later if we were to promote a sign extended operand like we would need for SRA. However, X86 doesn't currently promote i16 SRA. I was looking into doing that which is how I found this issue.
This is also blocking our ability to fold 4 byte aligned EXTLOADs with "loadi32". This is what caused most of the test changes here.
Differential Revision: https://reviews.llvm.org/D45585#inline-402825
llvm-svn: 330781
Previously, _any_ store or load instruction was considered to be
operating on a spill if it had a frameindex as an operand, and thus
was fair game for optimisations such as "StackSlotColoring". This
usually works, except on architectures where spills can be partially
restored, for example on X86 where a spilled vector can have a single
component loaded (zeroing the rest of the target register). This can be
misinterpreted and the zero extension unsoundly eliminated; see
PR30821.
To avoid this, this commit optionally provides callers of
isLoadFromStackSlot and isStoreToStackSlot with the number of bytes
spilled/loaded by the given instruction. Optimisations can then determine
that a full spill followed by a partial load (or vice versa), for
example, cannot necessarily be commuted.
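A self-contained sketch of the idea with invented names (SpillAccess, isFullReload; not the real TargetInstrInfo interface): once the access size in bytes is reported alongside the frame index, a caller can tell a full reload of a slot apart from a partial one.
```
#include <cstdio>

// Hypothetical result of the extended isLoadFromStackSlot-style query: the
// frame index plus how many bytes the instruction actually reads or writes.
struct SpillAccess {
  int FrameIndex;
  unsigned MemBytes;
};

// A full reload must cover the whole slot; matching on the frame index alone
// would wrongly treat a partial (e.g. single-component) load as a full one.
static bool isFullReload(const SpillAccess &A, int SlotFI, unsigned SlotSizeInBytes) {
  return A.FrameIndex == SlotFI && A.MemBytes == SlotSizeInBytes;
}

int main() {
  SpillAccess PartialLoad{/*FrameIndex=*/3, /*MemBytes=*/4};
  std::printf("full reload of 16-byte slot 3? %d\n", isFullReload(PartialLoad, 3, 16)); // prints 0
  return 0;
}
```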
Patch by Jeremy Morse!
Differential Revision: https://reviews.llvm.org/D44782
llvm-svn: 330778
Summary: This is no longer used by mesa since its 18.0.0 release.
Reviewers: nhaehnle
Reviewed By: nhaehnle
Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye
Differential Revision: https://reviews.llvm.org/D45988
llvm-svn: 330775
Summary:
The PointerMayBeCapturedBefore function's DomTree arg should be
const instead of non-const. There are no non-const uses of it
in the function.
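A compile-only sketch of the change being described, with simplified, illustrative signatures (not the exact declaration in CaptureTracking.h):
```
#include "llvm/IR/Dominators.h"

// Simplified, illustrative signatures showing the const-correctness change:
// the dominator tree is only queried, never mutated, so the pointer can be
// const-qualified.
bool pointerMayBeCapturedBeforeOld(const llvm::Value *V, const llvm::Instruction *I,
                                   llvm::DominatorTree *DT);       // before
bool pointerMayBeCapturedBeforeNew(const llvm::Value *V, const llvm::Instruction *I,
                                   const llvm::DominatorTree *DT); // after
```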
llvm-svn: 330769
If a packed inline constant is sign extended it must be truncated
after the shift. I.e., a constant (0xH0000, 0xHBC00) will be represented
as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended
to 64 bits. After the value is shifted right by 16 to use it in a low part
with op_sel_hi, it becomes 0xFFFFFFFFBC00 and no longer qualifies as an
inline constant.
Fixed the error and added verification code. Without the fix, but with
the verification in place, the bug causes pk_max_f16_literal.ll to fail.
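As a standalone arithmetic sketch of the situation described above (not backend code):
```
#include <cstdint>
#include <cstdio>

int main() {
  // The packed constant (0xH0000, 0xHBC00) arrives sign-extended to 64 bits.
  uint64_t Imm = 0xFFFFFFFFBC000000ULL;
  uint64_t Lo = Imm >> 16;                  // 0xFFFFFFFFBC00: no longer a 16-bit inline constant
  uint64_t Fixed = (Imm >> 16) & 0xFFFFULL; // 0xBC00: truncation restores the inline-constant form
  std::printf("0x%llX 0x%llX\n", (unsigned long long)Lo, (unsigned long long)Fixed);
  return 0;
}
```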
Differential Revision: https://reviews.llvm.org/D45987
llvm-svn: 330752
Removes one subprocess and one temp file from the build for each tablegen
invocation.
No intended behavior change.
https://reviews.llvm.org/D45899
llvm-svn: 330742
This is part of fixing the instruction predicates for MIPS.
Reviewers: atanasyan, abeserminji
Differential Revision: https://reviews.llvm.org/D44212
This patch relands r327409, hopefully without the problematic part of the
tests that caused FileCheck to assert on the Windows expensive checks bot.
llvm-svn: 330741
Patch #2 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).
This patch introduces the basic infrastructure to detect, legality check
and process outer loops annotated with hints for explicit vectorization.
All these changes are protected under the feature flag
-enable-vplan-native-path. This should make this patch NFC for the existing
inner loop vectorizer.
Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso.
Differential Revision: https://reviews.llvm.org/D42447
llvm-svn: 330739
After D43236, we started interchanging loops with empty dependence
matrices. In isProfitableForVectorization, we try to determine if
interchanging makes the loop dependences more friendly to the
vectorizer. If there are no dependences, we should not interchange,
based on that heuristic.
Reviewers: efriedma, mcrosier, karthikthecool, blitz.opensource
Reviewed By: mcrosier
Differential Revision: https://reviews.llvm.org/D45208
llvm-svn: 330738
Current code does not check that a register number is in the 0-31 range.
Sometimes the parser checks that later for some kinds of instructions,
but that leads to unclear / incorrect error messages like this:
% cat test.s
.text
lb $4, 8($32)
% llvm-mc test.s -triple=mips64-unknown-linux
test.s:2:10: error: expected memory with 16-bit signed offset
lb $4, 8($32)
^
Sometimes the parser just crashes:
% cat test.s
.text
lw $4, 8($32)
% llvm-mc test.s -triple=mips64-unknown-linux
This patch resolves the problem by checking that the register number after
the '$' sign is in the 0-31 range. If the number is out of range, the
parser reports an `invalid register number` error, but treats the invalid
register number as a normal one so it can continue parsing and catch other
possible errors.
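A self-contained sketch of such a check, with an invented helper name (checkMipsRegisterNumber) rather than the actual MipsAsmParser code:
```
#include <cstdio>

// Hypothetical helper mirroring the described check: report numbers outside
// 0-31, but hand back a usable register index so parsing can continue and
// surface further errors.
static unsigned checkMipsRegisterNumber(unsigned RegNo, bool &HadError) {
  if (RegNo > 31) {
    std::fprintf(stderr, "error: invalid register number\n");
    HadError = true;
    return 0; // treat the invalid number as a normal register to keep parsing
  }
  return RegNo;
}

int main() {
  bool HadError = false;
  unsigned Reg = checkMipsRegisterNumber(32, HadError); // the "$32" from the example above
  std::printf("reg=%u error=%d\n", Reg, HadError);
  return 0;
}
```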
Differential Revision: https://reviews.llvm.org/D45919
llvm-svn: 330732
The memory location an invariant load is using can never be clobbered by
any store, so it's safe to move the load ahead of the store.
Differential Revision: https://reviews.llvm.org/D46011
llvm-svn: 330725
While not necessary for correctness, it is preferable for
performance reasons on all architectures we currently support
to align functions to 16-byte boundaries by default.
llvm-svn: 330718
Split off pinsr/pextr and extractps instructions.
(Mostly) fixes PR36887.
Note: It might be worth adding a WriteFInsertLd class as well in the future.
Differential Revision: https://reviews.llvm.org/D45929
llvm-svn: 330714
Summary:
If attribute "use-soft-float"="true" is set then X86ISelLowering.cpp sets
'Promote' action for ISD::SINT_TO_FP operation on type i32.
But the 'Promote' action is not appropriate in this case, since the library
function __floatsidf is available for converting a signed int to a float.
Thus the Expand action is more suitable here.
The Expand action should be set for ISD::UINT_TO_FP for soft float as well.
If the function attribute "use-soft-float"="true" is set, then infinite looping
can happen in DAG combining: visitSINT_TO_FP() replaces the SINT_TO_FP
node with a UINT_TO_FP node, and combineUIntToFP() replaces it back, in a cycle.
The fix prevents this.
Patch by vrybalov
Differential Revision: https://reviews.llvm.org/D45572
llvm-svn: 330711
loop unswitch.
This code incorrectly added the header to the loop block set early. As
a consequence we would incorrectly conclude that a nested loop body had
already been visited when the header of the outer loop was the preheader
of the nested loop. In retrospect, adding the header eagerly doesn't
really make sense. It seems nicer to let the cycle be formed naturally.
This will catch crazy bugs in the CFG reconstruction where we can't
correctly form the cycle earlier rather than later, and makes the rest
of the logic just fall out.
I've also added various asserts that make these issues *much* easier to
debug.
llvm-svn: 330707
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.
It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.
The second part is platform independent and ensures that:
* CFI instructions do not affect code generation (they are not counted as
instructions when tail duplicating or tail merging)
* Unwind information remains correct when a function is modified by
different passes. This is done in a late pass by analyzing information
about cfa offset and cfa register in BBs and inserting additional CFI
directives where necessary.
Added CFIInstrInserter pass:
* analyzes each basic block to determine the cfa offset and register that are
valid at its entry and exit
* verifies that outgoing cfa offset and register of predecessor blocks match
incoming values of their successors
* inserts additional CFI directives at the beginning of basic blocks to correct the
rule for calculating CFA
Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.
CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D42848
llvm-svn: 330706
Guard the MIPS64 variant correctly for i64, mark the MIPS32 version as not
in microMIPS and provide the microMIPS version.
Additionally, remove a related stale XFAIL'd test as bswap has its own test
case providing coverage.
Reviewers: smaksimovic, abeserminji, atanasyan
Differential Revision: https://reviews.llvm.org/D45816
llvm-svn: 330705
Summary:
The pass is supposed to scalarize such intrinsics if the target does not support
them natively, so if the scalarization does not happen, instruction selection
crashes due to its inability to lower these intrinsics.
Reviewers: andrew.w.kaylor, craig.topper
Reviewed By: andrew.w.kaylor
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45947
llvm-svn: 330700
This encoding is recognized by the CPU, but the behavior is undefined. This makes the disassembler handle it correctly so we don't print bswapl with a 16-bit register.
llvm-svn: 330682
This code path can very clearly be called in a context where we have
baselined all the cloned blocks to a particular loop and are trying to
handle nested subloops. There is no harm in this, so just relax the
assert. I've added a test case that will make sure we actually exercise
this code path.
llvm-svn: 330680
(notionally Scalar.h is part of libLLVMScalarOpts, so it shouldn't be
included by InstCombine which doesn't/shouldn't need to depend on
ScalarOpts)
llvm-svn: 330669
I was reminded today that this patch got reverted in r301885. I can no
longer reproduce the failure that caused the revert locally (...almost
one year later), and the patch applied pretty cleanly, so I guess we'll
see if the bots still get angry about it.
The original breakage was InstSimplify complaining (in "assertion
failed" form) about getting passed some crazy IR when running `ninja
check-sanitizer`. I'm unable to find traces of what, exactly, said crazy
IR was. I suppose we'll find out pretty soon if that's still the case.
:)
Original commit:
Author: gbiv
Date: Mon May 1 18:12:08 2017
New Revision: 301880
URL: http://llvm.org/viewvc/llvm-project?rev=301880&view=rev
Log:
[InstSimplify] Handle selects of GEPs with 0 offset
In particular (since it wouldn't fit nicely in the summary):
(select (icmp eq V 0) P (getelementptr P V)) -> (getelementptr P V)
Differential Revision: https://reviews.llvm.org/D31435
llvm-svn: 330667
If a loop with child loops becomes our new inner loop after
interchanging, we only need to update LoopInfo for the blocks defined in
the old outer loop. BBs in child loops will stay there.
Reviewers: efriedma, karthikthecool, mcrosier
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45970
llvm-svn: 330653
Summary:
This is [[ https://bugs.llvm.org/show_bug.cgi?id=37104 | PR37104 ]].
[[ https://bugs.llvm.org/show_bug.cgi?id=6773 | PR6773 ]] will introduce an IR canonicalization that is likely bad for the end assembly.
Previously, `andl`+`andn`/`andps`+`andnps` / `bic`/`bsl` would be generated. (see `@out`)
Now, they would no longer be generated (see `@in`).
So we need to make sure that they are still generated.
If the mask is a constant, we do nothing; InstCombine should have unfolded it.
Otherwise, I use the `hasAndNot()` TLI hook.
For now, only handle scalars.
https://rise4fun.com/Alive/bO6
----
I *really* don't like the code I wrote in `DAGCombiner::unfoldMaskedMerge()`.
It is super fragile. Is there something like IR Pattern Matchers for this?
Reviewers: spatel, craig.topper, RKSimon, javed.absar
Reviewed By: spatel
Subscribers: andreadb, courbet, kristof.beyls, javed.absar, rengolin, nemanjai, llvm-commits
Differential Revision: https://reviews.llvm.org/D45733
llvm-svn: 330646
Summary: We do not need the nonnull attribute if we know an argument is going to be a constant.
Reviewers: junbuml, davide, fhahn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45608
llvm-svn: 330641
Summary:
Skip basic blocks not reachable from the entry node
in MemCpyOptPass::iterateOnFunction.
Code that is unreachable may have properties that do not exist
for reachable code (an instruction in a basic block can, for
example, be dominated by a later instruction in the same basic
block if there is a single-block loop).
MemCpyOptPass::processStore is only safe to use for reachable
basic blocks, since it may iterate past the basic block
beginning when used on unreachable blocks. By simply skipping
unreachable basic blocks we can avoid asserts such
as "Assertion `!NodePtr->isKnownSentinel()' failed."
in MemCpyOptPass::processStore.
The problem was detected by fuzz tests.
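A minimal sketch of this kind of guard, assuming a dominator tree is available (illustrative; the exact form inside MemCpyOptPass::iterateOnFunction may differ):
```
#include "llvm/IR/Dominators.h"
#include "llvm/IR/Function.h"

// Only visit blocks reachable from the entry; unreachable blocks may violate
// the usual dominance invariants (e.g. a use "dominated" by a later def in
// the same block), which processStore is not prepared to handle.
static void forEachReachableBlock(llvm::Function &F, const llvm::DominatorTree &DT) {
  for (llvm::BasicBlock &BB : F) {
    if (!DT.isReachableFromEntry(&BB))
      continue; // skip unreachable code entirely
    // ... process the block, e.g. call processStore on its store instructions ...
  }
}
```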
Reviewers: eli.friedman, dneilson, efriedma
Reviewed By: efriedma
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D45889
llvm-svn: 330635
Remove the use of default argument in favor of a separate
startCustomSection method.
Differential Revision: https://reviews.llvm.org/D45794
llvm-svn: 330632
This reland includes a check to prevent the DAG combiner from folding an
offset that is smaller than the existing one. This can cause oscillations
between two possible DAGs, which was the cause of the hang and later assertion
failure observed on the lnt-ctmark-aarch64-O3-flto bot.
http://green.lab.llvm.org/green/job/lnt-ctmark-aarch64-O3-flto/2024/
Original commit message:
> This is a code size win in code that takes offseted addresses
> frequently, such as C++ constructors that typically need to compute
> an offseted address of a vtable. This reduces the size of Chromium
> for Android's .text section by 108KB.
Differential Revision: https://reviews.llvm.org/D45199
llvm-svn: 330630
Summary:
This change teaches DSE that the atomic memory intrinsics are stores
that can be eliminated, and can allow other stores to be eliminated.
This change specifically does not teach DSE that these intrinsics
can be partially eliminated (i.e. length reduced, and dest/src changed);
that will be handled in another change.
Reviewers: mkazantsev, skatkov, apilipenko, efriedma, rsmith
Reviewed By: efriedma
Subscribers: dmgreen, llvm-commits
Differential Revision: https://reviews.llvm.org/D45535
llvm-svn: 330629
This helps debug issues where selection-dag assigns the wrong location
to an instruction.
Differential Revision: https://reviews.llvm.org/D45913
llvm-svn: 330618
It's possible to validly spill the frame offset register
in a call sequence to a VGPR. There are definitely issues
with SGPR spilling to memory, so move the assert later.
llvm-svn: 330612
We use llvm-symbolizer in some production systems, and we run it
against all possibly related files, including some that are not
ELF. We noticed that for some of those invalid files, llvm-symbolizer
would crash with SEGFAULT. Here is an example of such a file.
The crash is due to the fact that in computeSymbolSizes a loop uses the condition
for (unsigned I = 0, N = Addresses.size() - 1; I < N; ++I) {
where, if Addresses.size() is 0, N overflows, causing the loop
to access invalid memory.
Instead of patching the loop condition, the commit makes the
function return early if Addresses is empty.
Validated by checking that llvm-symbolizer no longer crashes.
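The failure mode is the classic unsigned-underflow pattern; a standalone sketch of the bug and the early-return fix:
```
#include <cstdio>
#include <vector>

int main() {
  std::vector<int> Addresses; // empty, as for a non-ELF input
  // Buggy form: with size() == 0, N underflows to UINT_MAX and the loop
  // walks far past the end of the vector.
  // for (unsigned I = 0, N = Addresses.size() - 1; I < N; ++I) { ... }
  if (Addresses.empty()) { // the fix: return early before forming N
    std::puts("no symbols, nothing to size");
    return 0;
  }
  for (unsigned I = 0, N = Addresses.size() - 1; I < N; ++I)
    std::printf("%d\n", Addresses[I]);
  return 0;
}
```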
Patch by Teng Qin!
Differential Revision: https://reviews.llvm.org/D44285
llvm-svn: 330610
Also assert that it is correct for SGPRs. There is currently a bug
where stack slot coloring replaces SGPR spill FIs with one with
the default ID, which results in a more confusing assert later
about a dead object.
llvm-svn: 330607
Summary:
This just refactors the lowering of the atomic memory intrinsics to more
closely match the code patterns used in the lowering of the non-atomic
memory intrinsics. Specifically, we encapsulate the lowering in
SelectionDAG::getAtomicMem*() functions rather than embedding
the code directly in the SelectionDAGBuilder code.
llvm-svn: 330603
Summary:
Found by inspection. We care about the operand that *doesn't*
contain the immediate.
I believe this is currently not hit because we fold 0xff / 0xffff
immediates only later.
Change-Id: Ic3cf8538bc7da5eff3200d96eccf9d339e6345a7
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45886
llvm-svn: 330586
Summary:
See the new test case; this is really unlikely to happen with real code,
but I ran into this while attempting to bugpoint-reduce a different issue.
Change-Id: I9ade1dc1aa8fd9c4d9fc83661d7b80e310b5c4a6
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45885
llvm-svn: 330585
LoopRotate only invalidates innermost loops, while the changes that it makes may
also affect any of their parents. With patch rL329047, SCEV becomes much smarter
about calculating exit counts for outer loops, so we cannot assume that they are
not affected.
Differential Revision: https://reviews.llvm.org/D45945
llvm-svn: 330582
Current runtime unrolling invalidates the parent loop, noting that it might have changed
after the inner loop has changed, but it doesn't bother to do the same for that loop's parents.
With patch rL329047, SCEV becomes much smarter about calculating exit counts for
outer loops. We might need to invalidate not only the immediate parent, but
any of its parents as well.
There is no clear evidence that a miscompile is happening because of this
(at least I don't have such a test), but common sense says that the current code
is wrong.
Differential Revision: https://reviews.llvm.org/D45940
Reviewed By: chandlerc
llvm-svn: 330577
In the function `simplifyOneLoop` we optimistically assume that changes in the
inner loop only affect this very loop and have no impact on its parents. In fact,
after rL329047 has been merged, we can now calculate exit counts for outer
loops which may depend on inner loops. Thus, we need to invalidate all parents
when we do something to a loop.
There is evidence of incorrect behavior of `simplifyOneLoop`: when we insert an
`SE->verify()` check at the end of this function, it fails on a bunch of existing
tests, in particular:
LLVM :: Transforms/LoopUnroll/peel-loop-not-forced.ll
LLVM :: Transforms/LoopUnroll/peel-loop-pgo.ll
LLVM :: Transforms/LoopUnroll/peel-loop.ll
LLVM :: Transforms/LoopUnroll/peel-loop2.ll
Note that previously we have fixed issues of this variety, see rL328483.
This patch makes this function invalidate the outermost loop properly.
Differential Revision: https://reviews.llvm.org/D45937
Reviewed By: chandlerc
llvm-svn: 330576
The condition this was asserting doesn't actually hold. I've added
comments to explain why, removed the assert, and added a fun test case
reduced out of 403.gcc.
llvm-svn: 330564
Summary: Wrap LLVMDIBuilderCreateAutoVariable, LLVMDIBuilderCreateParameterVariable, LLVMDIBuilderCreateExpression, and move and correct LLVMDIBuilderInsertDeclareBefore and LLVMDIBuilderInsertDeclareAtEnd from the Go bindings to the C bindings.
Reviewers: harlanhaskins, whitequark, deadalnix
Reviewed By: harlanhaskins, whitequark
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45928
llvm-svn: 330555
This is the last step in getting constant pattern matchers to allow
undef elements in constant vectors.
I'm adding a dedicated m_ZeroInt() function and building m_Zero() from
that. In most cases, calling code can be updated to use m_ZeroInt()
directly when there's no need to match pointers, but I'm leaving that
efficiency optimization as a follow-up step because it's not always
clear when that's ok.
There are just enough icmp folds in InstSimplify that can be used for
integer or pointer types, that we probably still want a generic m_Zero()
for those cases. Otherwise, we could eliminate it (and possibly add a
m_NullPtr() as an alias for isa<ConstantPointerNull>()).
We're conservatively returning a full zero vector (zeroinitializer) in
InstSimplify/InstCombine on some of these folds (see diffs in InstSimplify),
but I'm not sure if that's actually necessary in all cases. We may be
able to propagate an undef lane instead. One test where this happens is
marked with 'TODO'.
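A small usage sketch, assuming the matcher split described above (illustrative helpers, not code from the patch):
```
#include "llvm/IR/PatternMatch.h"

using namespace llvm;
using namespace llvm::PatternMatch;

// m_ZeroInt(): integer zero only, allowing undef lanes in vector constants.
static bool isIntegerZero(Value *V) { return match(V, m_ZeroInt()); }

// m_Zero(): integer zero or a null pointer, for folds that handle both.
static bool isZeroOrNull(Value *V) { return match(V, m_Zero()); }
```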
llvm-svn: 330550
Several tools prefix the error/warning/note output with the name of the
tool. One such tool is LLD, for example. This commit adds an optional
'Prefix' argument to the convenience helpers.
llvm-svn: 330526
This required the default Skylake schedules to be updated - these were being completely overridden by the InstRW entries and the existing values were not used at all.
llvm-svn: 330510
Vectorized loops with abs() return incorrect results on POWER9. This patch fixes it.
For example, the following code returns a negative result if the input values are negative, even though it sums up the absolute values of the inputs.
int vpx_satd_c(const int16_t *coeff, int length) {
int satd = 0;
for (int i = 0; i < length; ++i) satd += abs(coeff[i]);
return satd;
}
This problem causes test failures for libvpx.
For vector absolute and vector absolute difference on POWER9, LLVM generates VABSDUW (Vector Absolute Difference Unsigned Word) instruction or variants.
Since these instructions are for unsigned integers, we need adjustment for signed integers.
For abs(sub(a, b)), we generate VABSDUW(a+0x80000000, b+0x80000000). Otherwise, abs(sub(-1, 0)) returns 0xFFFFFFFF(=-1) instead of 1. For abs(a), we generate VABSDUW(a+0x80000000, 0x80000000).
Differential Revision: https://reviews.llvm.org/D45522
llvm-svn: 330497
In certain cases, the compiler might try to merge __stack_chk_guard with
another global variable. (Or someone could theoretically define
__stack_chk_guard as an alias.) In that case, make sure we don't crash.
Differential Revision: https://reviews.llvm.org/D45746
llvm-svn: 330495
When creating a call to storeStrong in ObjCARCContract, ensure the call
gets the correct funclet token, otherwise WinEHPrepare will turn the
call (and all subsequent instructions) into unreachable.
We already have logic to do this for the ARC autorelease elision marker;
factor that out into a common function that's used for both. These are
the only two places in this transform that create call instructions.
Differential Revision: https://reviews.llvm.org/D45857
llvm-svn: 330487
Split the fp and integer vector logical instruction scheduler classes - older CPUs especially often handled these on different pipes.
This unearthed a couple of things that are also handled in this patch:
(1) We were tagging avx512 fp logic ops as WriteFAdd, probably because of the lack of WriteFLogic
(2) SandyBridge had integer logic ops only using Port5, when afaict they can use Ports015.
(3) Cleaned up x86 FCHS/FABS scheduling as they are typically treated as fp logic ops.
Differential Revision: https://reviews.llvm.org/D45629
llvm-svn: 330480
Summary:
Support the dynamic shadow memory offset (the default case for user
space now) and static non-zero shadow memory offset
(-hwasan-mapping-offset option). Keeping the latter case around
for functionality and performance comparison tests (and mostly for
-hwasan-mapping-offset=0 case).
The implementation is a stripped-down ASan one, picking only the relevant
parts in the following assumptions: shadow scale is fixed, the shadow
memory is dynamic, it is accessed via ifunc global, shadow memory address
rematerialization is suppressed.
Keep zero-based shadow memory for the kernel (-hwasan-kernel option) and
the calls-instrumented case (-hwasan-instrument-with-calls option), which
essentially means that the generated code is not changed in these cases.
Reviewers: eugenis
Subscribers: srhines, llvm-commits
Differential Revision: https://reviews.llvm.org/D45840
llvm-svn: 330475
The callback used to create an ORE for the legacy PI pass caches the allocated
object in a unique_ptr in the runOnModule function, and returns a reference to
that object. Under certain circumstances we can end up holding onto that
reference after the ORE's destruction. Rather than allowing the new and legacy
passes to create ORE objects in different ways, create the ORE at the point of
use.
Differential Revision: https://reviews.llvm.org/D43219
llvm-svn: 330473
There was some unfortunate interaction between VSPLAT and BITCAST
related to the selection of constant vectors (coming from selecting
shuffles). Introduce VSPLATW that always splats a 32-bit word, and
can have arbitrary result type (to avoid BITCASTs of VSPLAT).
Clean up the previous selection of BITCAST/VSPLAT.
llvm-svn: 330471
Three new instructions:
umonitor - Sets up a linear address range to be
monitored by hardware and activates the monitor.
The address range should be a writeback memory
caching type.
umwait - A hint that allows the processor to
stop instruction execution and enter an
implementation-dependent optimized state
until occurrence of a class of events.
tpause - Directs the processor to enter an
implementation-dependent optimized state
until the TSC reaches the value in EDX:EAX.
Also modifying the description of the mfence
instruction, as the rep prefix (0xF3) was allowed
before, which would conflict with umonitor during
disassembly.
Before:
$ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble
.text
mfence
After:
$ echo 0xf3,0x0f,0xae,0xf0 | llvm-mc -disassemble
.text
umonitor %rax
Reviewers: craig.topper, zvi
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D45253
llvm-svn: 330462
First off, this is more correct than having the B. Second off, this was making
a bot upset. This fixes that.
Update the test to include -verify-machineinstrs as well to prevent stuff like
this slipping by non debug/assert builds in the future.
llvm-svn: 330459
Part of the DBI stream is a list of variable length structures
describing each module that contributes to the final executable.
One member of this structure is a section contribution entry that
describes the first section contribution in the output file for
the given module.
We have been leaving this structure unpopulated until now, so with
this patch it is now filled out correctly.
Differential Revision: https://reviews.llvm.org/D45832
llvm-svn: 330457
It also adds a check making sure PHIs for operands are all in the same
block.
Patch by Daniel Berlin <dberlin@dberlin.org>
Reviewers: dberlin, davide
Differential Revision: https://reviews.llvm.org/D43865
llvm-svn: 330444
Updated two more debug line related warnings to use WithColor. This was
necessary to ensure consistent output order of the warnings on Windows
for debug line tests.
Differential Revision: https://reviews.llvm.org/D45871
llvm-svn: 330440
This was originally committed at rL328921 and reverted at rL329920 to
investigate failures in Chrome. This time I've added a note to the ReleaseNotes
to warn users of the potential for exposing UB; let me repeat it
here for more exposure:
Optimization of floating-point casts is improved. This may cause surprising
results for code that is relying on undefined behavior. Code sanitizers can
be used to detect affected patterns such as this:
int main() {
float x = 4294967296.0f;
x = (float)((int)x);
printf("junk in the ftrunc: %f\n", x);
return 0;
}
$ clang -O1 ftrunc.c -fsanitize=undefined ; ./a.out
ftrunc.c:5:15: runtime error: 4.29497e+09 is outside the range of
representable values of type 'int'
junk in the ftrunc: 0.000000
Original commit message:
fptosi / fptoui round towards zero, and that's the same behavior as ISD::FTRUNC,
so replace a pair of casts with the equivalent node. We don't have to account for
special cases (NaN, INF) because out-of-range casts are undefined.
Differential Revision: https://reviews.llvm.org/D44909
llvm-svn: 330437
Reapply the patches with a fix. Thanks Ilya and Hans for the reproducer!
This reverts commit r330416.
The issue was that removing predecessors invalidated uses that we stored
for rewrite. The fix is to finish manipulating the CFG before we select
uses for rewrite.
llvm-svn: 330431
This patch adds the ability for the ObjectYAML DWARFEmitter to calculate
the lengths of DIEs. This is accomplished by creating a DIEFixupVisitor
class which traverses the DWARF DIEs to calculate and fix up the lengths
in the Compile Unit header.
The DIEFixupVisitor can be extended in the future to enable more complex
fix ups which will enable simplified YAML string representations.
This is also very useful when using the YAML format in unit tests
because you no longer need to know the length of the compile unit when
writing the YAML string.
Differential commandeered from Chris Bieneman (beanz)
Differential revision: https://reviews.llvm.org/D30666
llvm-svn: 330421
Revert r330413: "[SSAUpdaterBulk] Use SmallVector instead of DenseMap for storing rewrites."
Revert r330403 "Reapply "[PR16756] Use SSAUpdaterBulk in JumpThreading." one more time."
r330403 commit seems to crash clang during our integrate while doing PGO build with the following stacktrace:
#2 llvm::SSAUpdaterBulk::RewriteAllUses(llvm::DominatorTree*, llvm::SmallVectorImpl<llvm::PHINode*>*)
#3 llvm::JumpThreadingPass::ThreadEdge(llvm::BasicBlock*, llvm::SmallVectorImpl<llvm::BasicBlock*> const&, llvm::BasicBlock*)
#4 llvm::JumpThreadingPass::ProcessThreadableEdges(llvm::Value*, llvm::BasicBlock*, llvm::jumpthreading::ConstantPreference, llvm::Instruction*)
#5 llvm::JumpThreadingPass::ProcessBlock(llvm::BasicBlock*)
The crash happens while compiling 'lib/Analysis/CallGraph.cpp'.
r330413 is reverted due to conflicting changes.
llvm-svn: 330416
This patch adds a StatsFile option to LTO/Config.h and updates both
LLVMGold and llvm-lto2 to set it.
Reviewers: MatzeB, tejohnson, espindola
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D45531
llvm-svn: 330411
Hopefully, changing set to vector removes nondeterminism detected by
some bots, or the new assert will catch something.
This reverts commit r330180.
llvm-svn: 330403
Summary:
Reading Atmel's AT697E errata document this does not seem like a valid
workaround. While the text only mentions SDIV, it says that the ICC flags
can be wrong, and those are only generated by SDIVcc. Verification on
hardware shows that simply replacing SDIV with SDIVcc does not avoid
the bug with negative operands.
This reverts r283727.
Reviewers: lero_chris, jyknight
Reviewed By: jyknight
Subscribers: fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D45813
llvm-svn: 330397
Summary:
In some cases the shift/extend needs to be explicitly parsed together
with the register, rather than as a separate operand. This is needed
for addressing modes where the instruction as a whole dictates the
scaling/extend, rather than specific bits in the instruction.
By parsing them as a single operand, we avoid the need to pass an
extra operand in all CodeGen patterns (because all operands need to
have an associated value), and we avoid the need to update TableGen to
accept operands that have no associated bits in the instruction.
An added benefit of parsing them together is that the assembler
can give a sensible diagnostic if the scaling is not correct.
This is patch [2/4] in a series to add assembler/disassembler support for
SVE's contiguous LD1 (scalar+scalar) instructions:
- Patch [1/4]: https://reviews.llvm.org/D45687
- Patch [2/4]: https://reviews.llvm.org/D45688
- Patch [3/4]: https://reviews.llvm.org/D45689
- Patch [4/4]: https://reviews.llvm.org/D45690
Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro
Reviewed By: fhahn, SjoerdMeijer
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D45688
llvm-svn: 330394
Summary:
This fixes a case where the argument to a sendmsg intrinsic
ends up in a VGPR, for whatever reason.
The underlying performance issue is that a multiplication that
can be an s_mul_i32 is instead needlessly generated as
v_mul_u32_u24, but this is not addressed by this patch.
Change-Id: I61fd4034314d5acdf6074632c30b65364dfa7328
Reviewers: arsenm, rampitec
Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45826
llvm-svn: 330393
Summary:
If a 64-bit register is used as an operand in inline assembly together
with a memory reference, the memory addressing will be wrong. The
addressing will be a single reg, instead of reg+reg or reg+imm. This
will generate a bad offset value or an exception in printMemOperand().
For example:
```
long long int val = 5;
long long int mem;
__asm__ volatile ("std %1, %0":"=m"(mem):"r"(val));
```
becomes:
```
std %i0, [%i2+589833]
```
The problem is that SelectInlineAsmMemoryOperand() is never called for
the memory references if one of the operands is a 64-bit register.
By calling SelectInlineAsmMemoryOperands() in tryInlineAsm() the Sparc
version of SelectInlineAsmMemoryOperand() gets called for each memory
reference.
Reviewers: jyknight, venkatra
Reviewed By: jyknight
Subscribers: eraman, fedor.sergeev, jrtc27, llvm-commits
Differential Revision: https://reviews.llvm.org/D45761
llvm-svn: 330392
Summary:
This change fixes https://crbug.com/834474, a build failure caused by
LowerTypeTests not preserving .symver symbol versioning directives for
exported functions. Emit symver information to ThinLTO summary data and
then propagate symver directives for exported functions to the merged
module.
Emitting symver information to the summaries increases the size of
intermediate build artifacts for a Chromium build by less than 0.2%.
Reviewers: pcc
Reviewed By: pcc
Subscribers: tejohnson, mehdi_amini, eraman, llvm-commits, eugenis, kcc
Differential Revision: https://reviews.llvm.org/D45798
llvm-svn: 330387
This moves the EnableLinkOnceODROutlining flag from TargetPassConfig.cpp into
MachineOutliner.cpp. It also removes OutlineFromLinkOnceODRs from the
MachineOutliner constructor. This is now handled by the moved command-line
flag.
llvm-svn: 330373
This is a temporary solution until a proper WASM implementation of
MCAsmParserExtension is in place, but at least for now will unblock this
path.
Added test to make sure this path works with the WASM Assembler.
Patch By Wouter van Oortmerssen!
Differential Revision: https://reviews.llvm.org/D45386
llvm-svn: 330370
Summary:
The following changes addresses the following two issues.
1) The existing loop rotation pass contains both loop latch simplification and loop rotation. So one flag RotationOnly is added to be passed to the loop rotation pass.
2) The threshold value is initialized with MAX_UINT since the loop rotation utility should not have a threshold limit.
Reviewers: dmgreen, efriedma
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D45582
llvm-svn: 330362
Silvermont and Goldmont have the same issue on popcnt as Sandy Bridge, Haswell, Broadwell, and Skylake. Believe it is fixed in Goldmont Plus.
llvm-svn: 330358
Summary:
This fixes the bug pointed out in review with non-trivial unswitching.
This also provides a basis that should make it pretty easy to finish
fleshing out a routine to scan an entire function body for irreducible
control flow, but this patch remains minimal for disabling loop
unswitch.
Reviewers: sanjoy, fedor.sergeev
Subscribers: mcrosier, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45754
llvm-svn: 330357
This forces these operations to be carried out via a
MaterializationResponsibility instance, ensuring responsibility is explicitly
tracked.
llvm-svn: 330356
The XCHG16rr/XCHG32rr/XCHG64rr instructions should be 3 uops just like XCHG8rr. I believe they're just implemented as 3 move uops with a temporary register.
XADD is probably 2 moves and an add also using a temporary register.
Change the latency for both from 2 cycles to 3 cycles. Only 2 of the uops are serialized in their execution, the move into the temporary and the move out of the temporary. The move from one GPR to the other should be able to go in parallel with this if there are ALU resources available.
llvm-svn: 330349
When disassembling with -D, skip virtual sections by printing "..." for
each symbol.
This patch also implements `MachOObjectFile::isSectionVirtual`.
Test case comes from:
```
.zerofill __DATA,__common,_data64unsigned,472,3
```
Differential Revision: https://reviews.llvm.org/D45824
llvm-svn: 330342
We should also check that the "bottom" basic block of a loop is a successor of the "header" basic block, otherwise we don't propagate the information correctly when the CFG is complex. This fixes an important rendering problem in Wolfenstein 2, where one vector-memory wait was missing.
Differential Revision: https://reviews.llvm.org/D43831
llvm-svn: 330337
If those operands change, we might find a leader for ValueOp, which
could enable new phi-of-op creation.
This fixes a case where we missed creating a phi-of-ops node. With D43865
and this patch, bootstrapping clang/llvm works with -enable-newgvn, whereas
without it, the "value changed after iteration" assertion is triggered.
Reviewers: dberlin, davide
Reviewed By: dberlin
Differential Revision: https://reviews.llvm.org/D42180
llvm-svn: 330334
These instructions lacked the correct predicates, were not marked
as loads and stores and lacked the proper instruction mapping information.
In the case of microMIPS sw(l|r)e (EVA) these instructions were using the load
EVA description.
Reviewers: abeserminji, smaksimovic, atanasyan
Differential Revision: https://reviews.llvm.org/D45626
llvm-svn: 330326
This is the patch that lowers x86 intrinsics to native IR
in order to enable optimizations. The patch also includes folding
of previously missing saturation patterns so that IR emits the same
machine instructions as the intrinsics.
Patch by tkrupa
Differential Revision: https://reviews.llvm.org/D44785
llvm-svn: 330322
Summary:
- Renamed tryParseRegister to tryParseScalarRegister, which
now returns an OperandMatchResultTy.
- Moved matching of certain aliases into matchRegisterNameAlias.
- Changed type of most 'Reg' variables to 'unsigned'.
This is patch [1/4] in a series to add assembler/disassembler support for
SVE's contiguous LD1 (scalar+scalar) instructions:
- Patch [1/4]: https://reviews.llvm.org/D45687
- Patch [2/4]: https://reviews.llvm.org/D45688
- Patch [3/4]: https://reviews.llvm.org/D45689
- Patch [4/4]: https://reviews.llvm.org/D45690
Reviewers: fhahn, rengolin, javed.absar, huntergr, SjoerdMeijer, t.p.northover, echristo, evandro, samparker
Reviewed By: samparker
Subscribers: samparker, llvm-commits, kristof.beyls
Differential Revision: https://reviews.llvm.org/D45687
llvm-svn: 330311
This removes a bunch of unnecessary InstRW overrides. It also cleans up the missing information from the Sandy Bridge model. Other fixes to other models.
llvm-svn: 330308
Summary:
ASSERT_SORTED checks if a table is sorted, and uses a boolean to
prevent the check from being run again if it was earlier determined
that the table is in fact sorted. Unsynchronized reads and writes of
that boolean triggered ThreadSanitizer's data race detection. This
change rewrites the code to use std::atomic<bool> instead.
Fixes PR36922.
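A self-contained sketch of the pattern, assuming the shape described above (a cached "already checked" flag made safe for concurrent callers):
```
#include <algorithm>
#include <atomic>
#include <cassert>
#include <vector>

// One-time sortedness check whose result is cached in an atomic flag, so
// concurrent callers neither race on the flag nor repeat the O(n) scan.
static void assertSorted(const std::vector<int> &Table) {
  static std::atomic<bool> Checked(false);
  if (!Checked.load(std::memory_order_acquire)) {
    assert(std::is_sorted(Table.begin(), Table.end()) && "table is not sorted");
    Checked.store(true, std::memory_order_release);
  }
}

int main() {
  assertSorted({10, 20, 30});
  assertSorted({10, 20, 30}); // second call just reads the cached flag
  return 0;
}
```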
Reviewers: rnk
Reviewed By: rnk
Subscribers: llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D45742
llvm-svn: 330301
The compiler only emits the locked version of these which use different instruction definitions. The versions fixed here are only used by the assembler/disassembler.
llvm-svn: 330287
Reverts rL330224, while issues with the C extension and missed common
subexpression elimination opportunities are addressed. Neither of these issues
are visible in current RISC-V backend unit tests, which clearly need
expanding.
llvm-svn: 330281
Legalize and emit code for converting unsigned HWord/Char to QP:
xscvsdqp
xscvudqp
Only covering patterns for the unsigned forms, because we don't have part-word
sign-extending integer loads into VSX registers.
Differential Revision: https://reviews.llvm.org/D45494
llvm-svn: 330278
Legalize and emit code for converting (Un)Signed Word to quad-precision via:
xscvsdqp
xscvudqp
Differential Revision: https://reviews.llvm.org/D45389
llvm-svn: 330273
Summary:
Patch adds initial emission of the debug info for NVPTX target.
Currently, only .file and .loc directives are emitted, everything else is
commented out to not break the compilation of Cuda.
Reviewers: echristo, jlebar, tra, jholewinski
Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D41827
llvm-svn: 330271
a zero register.
Previously I tried this and saw LLVM unable to transform this to fold
with memory operands such as spill slot rematerialization. However, it
clearly works as shown in this patch. We turn these into `cmpb $0,
<mem>` when useful for folding a memory operand without issue. This form
has no disadvantage compared to `testb $-1, <mem>`. So overall, this is
likely no worse and may be slightly smaller in some cases due to the
`testb %reg, %reg` form.
Differential Revision: https://reviews.llvm.org/D45475
llvm-svn: 330269
Path.inc/widenPath tries to decode the path using both UTF-8 and the default Windows code page.
This is no longer necessary with the new InitLLVM method which ensures that the command line
arguments are already UTF-8 on Windows.
llvm-svn: 330266
across basic blocks in the limited cases where it is very
straightforward to do so.
This will also be useful for other places where we do some limited
EFLAGS propagation across CFG edges and need to handle copy rewrites
afterward. I think this is rapidly approaching the maximum we can and
should be doing here. Everything else begins to require either heroic
analysis to prove how to do PHI insertion manually, or somehow managing
arbitrary PHI-ing of EFLAGS with general PHI insertion. Neither of these
seem at all promising so if those cases come up, we'll almost certainly
need to rewrite the parts of LLVM that produce those patterns.
We do now require dominator trees in order to reliably diagnose patterns
that would require PHI nodes. This is a bit unfortunate but it seems
better than the completely mysterious crash we would get otherwise.
Differential Revision: https://reviews.llvm.org/D45673
llvm-svn: 330264
Summary:
A change to use divergence analysis in the AMDGPU backend was getting formal
arguments incorrect (not tagged as divergent) unless they were VGPR0, VGPR1 or
VGPR2.
For graphics shaders it is possible to have more than these passed in as VGPRs.
Modified the checking code to check for any VGPR registers passed in as formal
arguments.
Also, some intrinsics that are sources of divergence may have been lowered
during instruction selection and are missed on subsequent calls to
isSDNodeSourceOfDivergence - added the relevant AMDGPUISD checks as well.
Finally, the FunctionLoweringInfo tracks virtual registers that are live across
basic block boundaries. This is used to check for divergence of CopyFromRegister
registers using the DivergenceAnalysis analysis. For multiple blocks the lazily
evaluated inverted map VirtReg2Value was not cleared when the ValueMap map was.
Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, tpr, t-tye, llvm-commits
Differential Revision: https://reviews.llvm.org/D45372
Change-Id: I112f3bd6dfe0f62e63ce9b43b893982778e4bee3
llvm-svn: 330257
After investigation discussed in D45439, it would seem that the nsw
flag restriction is unnecessary in most cases. So the IsInductionVar
lambda has been removed, the functionality extracted, and nsw is now only
required when using eq/ne predicates.
Differential Revision: https://reviews.llvm.org/D45617
llvm-svn: 330256
Summary:
Due to some android peculiarities, in some build configurations
(statically linked executables targeting older releases) we could detect
the presence of these functions (because they are present in libc.a,
where check_library_exists searches), but then fail to build because the
headers did not include the definition.
This attempts to remedy that by upgrading the check_library_exists to
check_symbol_exists, which will check that the function is declared too.
I am hoping that a more thorough check will make the messy #ifdefs we
have accumulated in the code obsolete, so I optimistically try to remove
them.
Reviewers: zturner, kparzysz, danalbert
Subscribers: srhines, mgorny, krytarowski, llvm-commits
Differential Revision: https://reviews.llvm.org/D45359
llvm-svn: 330251
If a predicate does not become known after peeling, peeling is unlikely
to be beneficial.
Reviewers: mcrosier, efriedma, mkazantsev, junbuml
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D44983
llvm-svn: 330250
Summary:
Previously we crashed for the combination of the two features because we
tried to reference the dwo CU from the main object file. The fix
consists of two items:
- reference the skeleton CU from the name index (the consumer is
expected to use the skeleton CU to find the real data).
- use the main object file string pool for the strings in the index
Reviewers: JDevlieghere, aprantl, dblaikie
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45566
llvm-svn: 330249
Summary:
When sinking an instruction in InstCombine we now also sink
the DbgInfoIntrinsics that are using the sunken value.
Example)
When sinking the load in this input
bb.X:
%0 = load i64, i64* %start, align 4, !dbg !31
tail call void @llvm.dbg.value(metadata i64 %0, ...)
br i1 %cond, label %for.end, label %for.body.lr.ph
for.body.lr.ph:
br label %for.body
we now also move the dbg.value, like this
bb.X:
br i1 %cond, label %for.end, label %for.body.lr.ph
for.body.lr.ph:
%0 = load i64, i64* %start, align 4, !dbg !31
tail call void @llvm.dbg.value(metadata i64 %0, ...)
br label %for.body
In the past we haven't moved the dbg.value so we got
bb.X:
tail call void @llvm.dbg.value(metadata i64 %0, ...)
br i1 %cond, label %for.end, label %for.body.lr.ph
for.body.lr.ph:
%0 = load i64, i64* %start, align 4, !dbg !31
br label %for.body
So in the past we got a debug-use before the def of %0.
And that dbg.value was also on the path jumping to %for.end, for
which %0 never was defined.
CodeGenPrepare normally comes to the rescue later (when not moving
the dbg.value), since it moves dbg.value intrinsics quite
brutally, without really analysing if it is correct to move
the intrinsic (see PR31878).
So at the moment this patch isn't expected to have much impact,
besides that it is moving the dbg.value already in opt, making
the IR look more sane directly.
This can be seen as a preparation to (hopefully) make it possible
to turn off CodeGenPrepare::placeDbgValues later as a solution
to PR31878.
I also adjusted test/DebugInfo/X86/sdagsplit-1.ll to make the
IR in the test case up-to-date with this behavior in InstCombine.
Reviewers: rnk, vsk, aprantl
Reviewed By: vsk, aprantl
Subscribers: mattd, JDevlieghere, llvm-commits
Tags: #debug-info
Differential Revision: https://reviews.llvm.org/D45425
llvm-svn: 330243
Summary: Previously, if a modifier was placed on a non-GPR register class we would hit an assert or crash.
Reviewers: echristo
Reviewed By: echristo
Subscribers: eraman, llvm-commits
Differential Revision: https://reviews.llvm.org/D45751
llvm-svn: 330238
The bitcast may be interfering with other combines or vectorization
as shown in PR16739:
https://bugs.llvm.org/show_bug.cgi?id=16739
Most pointer-related optimizations are probably able to look through
this bitcast, but removing the bitcast shrinks the IR, so it's at
least a size savings.
Differential Revision: https://reviews.llvm.org/D44833
llvm-svn: 330237
Summary:
Statistic and ManagedStatic both use mutexes. There was a lock order
inversion where, during initialization, Statistic's mutex would be
held while taking ManagedStatic's, and in llvm_shutdown,
ManagedStatic's mutex would be held while taking Statistic's
mutex. This change causes Statistic's initialization code to avoid
holding its mutex while calling ManagedStatic's methods, avoiding the
inversion.
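A minimal, generic illustration of the hazard being fixed (stand-in mutexes, not the actual Statistic/ManagedStatic code): one path takes lock A then lock B while another takes B then A, which can deadlock; the fix makes the first path drop A before touching B.
```
#include <mutex>

std::mutex StatMutex;    // stands in for Statistic's registration lock
std::mutex ManagedMutex; // stands in for ManagedStatic's lock

// Inversion-prone shape: holds StatMutex while acquiring ManagedMutex,
// while llvm_shutdown-like code takes the two locks in the opposite order.
void registerStatisticBad() {
  std::lock_guard<std::mutex> G1(StatMutex);
  std::lock_guard<std::mutex> G2(ManagedMutex);
}

// Fixed shape: do the ManagedStatic-side work without holding StatMutex.
void registerStatisticGood() {
  {
    std::lock_guard<std::mutex> G2(ManagedMutex); // touch the managed object first
  }
  std::lock_guard<std::mutex> G1(StatMutex); // then do the Statistic bookkeeping
}

int main() { registerStatisticGood(); return 0; }
```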
Reviewers: dsanders, rtereshin
Reviewed By: dsanders
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D45398
llvm-svn: 330236
Track the debug locations of the incoming values to newly-created phis,
and apply merged debug locations to the phis.
A merged location will be on line 0, but will have the correct scope
set. This improves crash reporting when an inlined instruction with a
merged location triggers a machine exception. A debugger will be able to
narrow down the crash to the correct inlined scope, instead of simply
pointing to the outer scope of the caller.
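A sketch of the merging step, assuming DILocation::getMergedLocation behaves as described (line 0, common scope retained); the helper is illustrative rather than the exact pass code:
```
#include "llvm/IR/DebugInfoMetadata.h"
#include "llvm/IR/Instructions.h"

// Give a newly created PHI a location merged from two incoming locations.
// The merged location has line 0 but keeps the nearest common scope, so a
// crash in inlined code can still be attributed to the right inlined scope.
static void setMergedPHILoc(llvm::PHINode *Phi, llvm::DILocation *LocA,
                            llvm::DILocation *LocB) {
  auto *Merged = llvm::DILocation::getMergedLocation(LocA, LocB);
  Phi->setDebugLoc(Merged);
}
```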
Taken together with a change that allows generating merged line-0 locations
for instructions which aren't calls, this results in a 0.5% increase in
the uncompressed size of the .debug_line section of a stage2+Release
build of clang (-O3 -g).
rdar://33858697
Differential Revision: https://reviews.llvm.org/D45397
llvm-svn: 330227
The implementation follows the MIPS backend and expands the
pseudo instruction directly during asm parsing. As the result, only
real MC instructions are emitted to the MCStreamer. Additionally,
PseudoLI instructions are emitted during codegen. The actual
expansion to real instructions is performed during MI to MC lowering
and is similar to the expansion performed by the GNU Assembler.
Differential Revision: https://reviews.llvm.org/D41949
Patch by Mario Werner.
llvm-svn: 330224
When we skip bitcasts while looking for a GEP in the LoadStoreVectorizer
we should also verify that the type is sized, otherwise we assert.
Differential Revision: https://reviews.llvm.org/D45709
llvm-svn: 330221
Summary:
Add an LLVM intrinsic for type discriminated event logging with XRay.
Similar to the existing intrinsic for custom events, but also accepts
a type tag argument to allow plugins to be aware of different types
and semantically interpret logged events they know about without
choking on those they don't.
Relies on a symbol defined in compiler-rt patch D43668. I may wait
to submit until I can demo everything working together, including
a still-to-come clang patch.
Reviewers: dberris, pelikan, eizan, rSerge, timshen
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D45633
llvm-svn: 330219
Summary:
It was not easy to provide a test case for D45648 (rL330079) because the bug
didn't manifest itself in the set of currently valid IRs. Added an assertion to
check this faster, thanks to @dblaikie's suggestion.
Reviewers: dblaikie
Subscribers: jfb, dschuff, sbc100, jgravelle-google, llvm-commits, dblaikie
Differential Revision: https://reviews.llvm.org/D45711
llvm-svn: 330217
GetArgumentVector (or GetCommandLineArguments) is very Windows-specific.
I think it doesn't make much sense to provide that function from sys::Process.
I also made a change so that the function takes a BumpPtrAllocator
instead of a SpecificBumpPtrAllocator. The latter is the class to call
dtors, but since char * is trivially destructible, we should use the
former class.
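A small sketch of the distinction, under the stated assumption that only char data is being allocated (illustrative, not the Windows Process.inc code): SpecificBumpPtrAllocator<T> exists to run ~T for each allocated object, which buys nothing for a trivially destructible type like char, so the plain BumpPtrAllocator is enough.
```
#include "llvm/Support/Allocator.h"
#include <cstring>

// Copy a string into bump-allocated storage; char needs no destructor calls,
// so BumpPtrAllocator (rather than SpecificBumpPtrAllocator<char>) suffices.
static char *copyString(llvm::BumpPtrAllocator &Alloc, const char *S) {
  size_t Len = std::strlen(S) + 1;
  char *Buf = Alloc.Allocate<char>(Len);
  std::memcpy(Buf, S, Len);
  return Buf;
}
```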
Differential Revision: https://reviews.llvm.org/D45641
llvm-svn: 330216
The DBI stream contains a list of module descriptors. At the
beginning of each descriptor is a structure representing the first
section contribution in the output file for that module. LLD
currently doesn't fill out this structure at all, but link.exe
does. So as a precursor to emitting this data in LLD, we first
need a way to dump it so that it can be checked.
This patch adds support for the dumping, and verifies via a test
that LLD emits bogus information.
llvm-svn: 330208
Stack addressing needs addressing modes that provide an offset field
immediately following the frame index. An initializer from a non-stack
addressing could force the stack address to use a form that does not
provide an offset field.
llvm-svn: 330191
LLVMDump* functions are available in Release builds too.
Patch by Brenton Bostick.
Differential Revision: https://reviews.llvm.org/D44600
llvm-svn: 330189
Split VCMP/VMAX/VMIN instructions off to WriteFCmp and VCOMIS instructions off to WriteFCom instead of assuming they match WriteFAdd
Differential Revision: https://reviews.llvm.org/D45656
llvm-svn: 330179
One more, hopefully the last, bug is fixed: when forming UsesToRewrite
we should ignore phi operands coming from edges that we want to delete.
This reverts r329910.
llvm-svn: 330175
Apparently, DebugInfoFinder::processCompileUnit doesn't process all
of the possible kinds of DIImportedEntities, e.g. DIGlobalVariables.
The previously introduced `llvm_unreachable` is therefore incorrect.
Removing it here.
llvm-svn: 330167
Using Config->is64() will treat ARM64 as Amd64, which is incorrect.
Furthermore, there are more esoteric architectures that could
theoretically be encountered. Just set it directly to the machine
type, which we already know anyway.
llvm-svn: 330157
Summary:
Specifying an assert message with the || operator makes the compiler interpret the
message string as a boolean, so the assertion always passes. Changed it to &&.
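The pitfall in a standalone sketch: with ||, the non-null string literal makes the whole condition always true, so the assert can never fire; with &&, the condition is still checked and the message is preserved.
```
#include <cassert>

int main(int argc, char **) {
  // Always passes, even when argc <= 1: a string literal is a non-null
  // pointer, so the || expression is unconditionally true.
  assert(argc > 1 || "expected at least one argument");
  // Correct form: the message does not change the truth of the condition.
  assert(argc > 1 && "expected at least one argument");
  return 0;
}
```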
Reviewers: asb, apazos
Reviewed By: asb
Subscribers: rbar, johnrusso, simoncook, jordy.potman.lists, sabuasal, niosHD, kito-cheng, shiva0217, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D45660
llvm-svn: 330148
We use getExtractWithExtendCost to calculate the cost of extractelement and
s|zext together when computing the extract cost after vectorization, but we
calculate the cost of extractelement and s|zext separately when computing the
scalar cost which is larger than it should be.
Differential Revision: https://reviews.llvm.org/D45469
llvm-svn: 330143
materializing function definitions.
MaterializationUnit instances are responsible for resolving and finalizing
symbol definitions when their materialize method is called. By contract, the
MaterializationUnit must materialize all definitions it is responsible for and
no others. If it can not materialize all definitions (because of some error)
then it must notify the associated VSO about each definition that could not be
materialized. The MaterializationResponsibility class tracks this
responsibility, asserting that all required symbols are resolved and finalized,
and that no extraneous symbols are resolved or finalized. In the event of an
error it provides a convenience method for notifying the VSO about each
definition that could not be materialized.
llvm-svn: 330142
notifyMaterializationFailed.
The notifyMaterializationFailed method can determine which error to raise by
looking at which queue the pending queries are in (resolution or finalization).
llvm-svn: 330141