formLCSSAForInstructions is used by SCEVExpander, which tracks all
inserted instructions including LCSSA phis using asserting value
handles. This means cleanup needs to happen in the caller.
Extend formLCSSAForInstructions to take an optional pointer to a
vector. If this argument is non-nullptr, instead of directly deleting
the phis, add them to the vector, so the caller can process them.
This should address various PPC buildbot failures, including
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/40567
Use IRBuilder instead PHINode::Create. This should not impact the
generated code, but IRBuilder provides a way to register callbacks for
inserted instructions, which is convenient for some users.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D85037
querying getSCEV() for incomplete phis leads to wrong cache value in `ExprToIVMap`,
because incomplete phis may be simplified to same value before get SCEV expression.
Reviewed By: lebedev.ri, mkazantsev
Differential Revision: https://reviews.llvm.org/D77560
Summary: This patch separates the Loop Peeling Utilities from Loop Unrolling.
The reason for this change is that Loop Peeling is no longer only being used by
loop unrolling; Patch D82927 introduces loop peeling with fusion, such that
loops can be modified to have to same trip count, making them legal to be
peeled.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D83056
This reverts the revert commit dc28675768.
It includes a fix for Polly, which uses SCEVExpander on IR that is not
in LCSSA form. Set PreserveLCSSA = false in that case, to ensure we do
not introduce LCSSA phis where there were none before.
This reverts commit 99166fd4fb, because it
breaks the polly builders.
polly/test/Isl/CodeGen/invariant_load_escaping_second_scop.ll fails
because a apparently unnecessary LCSSA phi node is introduced.
Make the bots green again, while I take a closer look.
This patch teaches SCEVExpander to directly preserve LCSSA.
As it is currently, SCEV does not look through PHI nodes in loops,
as it might break LCSSA form. Once SCEVExpander can preserve
LCSSA form, it should be safe for SCEV to look through PHIs.
To preserve LCSSA form, this patch uses formLCSSAForInstructions
on operands of newly created instructions, if the definition is inside
a different loop than the new instruction.
The final value we return from expandCodeFor may also need LCSSA
phis, depending on the insert point. As no user for it exists there yet,
create a temporary instruction at the insert point, which can be passed
to formLCSSAForInstructions. This temporary instruction is removed
after LCSSA construction.
Reviewed By: mkazantsev
Differential Revision: https://reviews.llvm.org/D71538
Currently, getCastInstrCost has limited information about the cast it's
rating, often just the opcode and types. Sometimes there is a context
instruction as well, but it isn't trustworthy: for instance, when the
vectorizer is rating a plan, it calls getCastInstrCost with the old
instructions when, in fact, it's trying to evaluate the cost of the
instruction post-vectorization. Thus, the current system can get the
cost of certain casts incorrect as the correct cost can vary greatly
based on the context in which it's used.
For example, if the vectorizer queries getCastInstrCost to evaluate the
cost of a sext(load) with tail predication enabled, getCastInstrCost
will think it's free most of the time, but it's not always free. On ARM
MVE, a VLD2 group cannot be extended like a normal VLDR can. Similar
situations can come up with how masked loads can be extended when being
split.
To fix that, this path adds a new parameter to getCastInstrCost to give
it a hint about the context of the cast. It adds a CastContextHint enum
which contains the type of the load/store being created by the
vectorizer - one for each of the types it can produce.
Original patch by Pierre van Houtryve
Differential Revision: https://reviews.llvm.org/D79162
We can happily turn function definitions into declarations,
thus obscuring their argument from being elided by this pass.
I don't believe there is a good reason to just ignore declarations.
likely even proper llvm intrinsics ones,
at worst the input becomes uninteresting.
The other question here is that all these transforms are all-or-nothing.
In some cases, should we be treating each use separately?
The main blocker here seemed to be that llvm::CloneFunctionInto()
does `&OldFunc->front()`, which inserts a nullptr into a densemap,
which is not happy about it and asserts.
This is the first of two patches to address PR46753. We basically allow
mem2reg to promote allocas that are used in doppable instructions, for
now that means `llvm.assume`. The uses of the alloca (or a bitcast or
zero offset GEP from there) are replaced by `undef` in the droppable
instructions.
Reviewed By: Tyker
Differential Revision: https://reviews.llvm.org/D83976
SROA knows that it can look through addrspacecast but
PromoteMemoryToRegister did not handle them. This caused an assertion
error for the test case, exposed while running
`Transforms/PhaseOrdering/inlining-alignment-assumptions.ll` with D83978
applied.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D84085
PassManager.h is one of the top headers in the ClangBuildAnalyzer frontend worst offenders list.
This exposes a large number of implicit dependencies on various forward declarations/includes in other headers that need addressing.
As long as RenamedOp is not guaranteed to be accurate, we cannot
assert here and should just return false. This was already done
for the other conditions in this function.
Fixes https://bugs.llvm.org/show_bug.cgi?id=46814.
Currently there are plenty of instructions that SCEVExpander creates but
does not track as created. IRBuilder allows specifying a callback
whenever an instruction is inserted. Use this to call
rememberInstruction automatically for each created instruction.
There are still a few rememberInstruction calls remaining, because in
some cases Inst::Create functions are used to construct instructions.
Suggested by @lebedev.ri in D75980.
Reviewers: mkazantsev, reames, sanjoy.google, lebedev.ri
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D84326
The revert was a misfire.
Remove the temporary flag PGSOIRPassOrTestOnly and the guard code which was used
for the staged rollout. This is a cleanup (NFC) as it's now false by default.
Differential Revision: https://reviews.llvm.org/D84057
This reverts commit e64afefdf8. It caused
a PGO bootstrapped clang to crash on many source files.
`__llvm_profile_instrument_range` seems to trigger a null pointer dereference.
Call stack:
__llvm_profile_instrument_range
llvm::APInt::udiv(llvm::APInt const&) const
getRangeForAffineARHelper
We do not thread blocks with convergent calls, but this check was missing
when we decide to insert PR Phis into it (which we only do for threading).
Differential Revision: https://reviews.llvm.org/D83936
Reviewed By: nikic
Remove the temporary flag PGSOIRPassOrTestOnly and the guard code which was used
for the staged rollout. This is a cleanup (NFC) as it's now false by default.
Differential Revision: https://reviews.llvm.org/D84057
This patch adds a TileInfo abstraction and utilities to
create a 3-level loop nest for tiling.
Reviewers: anemet
Reviewed By: anemet
Differential Revision: https://reviews.llvm.org/D77550
This allows tracking the in-memory type of a pointer argument to a
function for ABI purposes. This is essentially a stripped down version
of byval to remove some of the stack-copy implications in its
definition.
This includes the base IR changes, and some tests for places where it
should be treated similarly to byval. Codegen support will be in a
future patch.
My original attempt at solving some of these problems was to repurpose
byval with a different address space from the stack. However, it is
technically permitted for the callee to introduce a write to the
argument, although nothing does this in reality. There is also talk of
removing and replacing the byval attribute, so a new attribute would
need to take its place anyway.
This is intended avoid some optimization issues with the current
handling of aggregate arguments, as well as fixes inflexibilty in how
frontends can specify the kernel ABI. The most honest representation
of the amdgpu_kernel convention is to expose all kernel arguments as
loads from constant memory. Today, these are raw, SSA Argument values
and codegen is responsible for turning these into loads.
Background:
There currently isn't a satisfactory way to represent how arguments
for the amdgpu_kernel calling convention are passed. In reality,
arguments are passed in a single, flat, constant memory buffer
implicitly passed to the function. It is also illegal to call this
function in the IR, and this is only ever invoked by a driver of some
kind.
It does not make sense to have a stack passed parameter in this
context as is implied by byval. It is never valid to write to the
kernel arguments, as this would corrupt the inputs seen by other
dispatches of the kernel. These argumets are also not in the same
address space as the stack, so a copy is needed to an alloca. From a
source C-like language, the kernel parameters are invisible.
Semantically, a copy is always required from the constant argument
memory to a mutable variable.
The current clang calling convention lowering emits raw values,
including aggregates into the function argument list, since using
byval would not make sense. This has some unfortunate consequences for
the optimizer. In the aggregate case, we end up with an aggregate
store to alloca, which both SROA and instcombine turn into a store of
each aggregate field. The optimizer never pieces this back together to
see that this is really just a copy from constant memory, so we end up
stuck with expensive stack usage.
This also means the backend dictates the alignment of arguments, and
arbitrarily picks the LLVM IR ABI type alignment. By allowing an
explicit alignment, frontends can make better decisions. For example,
there's real no advantage to an aligment higher than 4, so a frontend
could choose to compact the argument layout. Similarly, there is a
high penalty to using an alignment lower than 4, so a frontend could
opt into more padding for small arguments.
Another design consideration is when it is appropriate to expose the
fact that these arguments are all really passed in adjacent
memory. Currently we have a late IR optimization pass in codegen to
rewrite the kernel argument values into explicit loads to enable
vectorization. In most programs, unrelated argument loads can be
merged together. However, exposing this property directly from the
frontend has some disadvantages. We still need a way to track the
original argument sizes and alignments to report to the driver. I find
using some side-channel, metadata mechanism to track this
unappealing. If the kernel arguments were exposed as a single buffer
to begin with, alias analysis would be unaware that the padding bits
betewen arguments are meaningless. Another family of problems is there
are still some gaps in replacing all of the available parameter
attributes with metadata equivalents once lowered to loads.
The immediate plan is to start using this new attribute to handle all
aggregate argumets for kernels. Long term, it makes sense to migrate
all kernel arguments, including scalars, to be passed indirectly in
the same manner.
Additional context is in D79744.
Common code sinking is already guarded with a (with default-off!) flag,
so add a flag for hoisting, too.
D84108 will hopefully make hoisting off-by-default too.
Both users of predicteinfo (NewGVN and SCCP) are interested in
getting a cmp constraint on the predicated value. They currently
implement separate logic for this. This patch adds a common method
for this in PredicateBase.
This enables a missing bit of PredicateInfo handling in SCCP: Now
the predicate on the condition itself is also used. For switches
it means we know that the switched-on value is the same as the case
value. For assumes/branches we know that the condition is true or
false.
Differential Revision: https://reviews.llvm.org/D83640
Summary:
This patch resolves an issue where the metadata of a loop is not added to the
new loop latch, and not removed from the old loop latch. This issue occurs in
the SplitBlockPredecessors function, which adds a new block in a loop, and
in the case that the block passed into this function is the header of the loop,
the loop can be modified such that the latch of the loop is replaced.
This patch applies to the Loop Simplify pass since it ensures that each loop
has exit blocks which only have predecessors that are inside of the loop. In
the case that this is not true, the pass will create a new exit block for the
loop. This guarantees that the loop preheader/header will dominate the exit blocks.
Author: sidbav (Sidharth Baveja)
Reviewers: asbirlea (Alina Sbirlea), chandlerc (Chandler Carruth), Whitney (Whitney Tsang), bmahjour (Bardia Mahjour)
Reviewed By: asbirlea (Alina Sbirlea)
Subscribers: hiraditya (Aditya Kumar), llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D83869
SimplifyCFG was incorrectly reporting to the pass manager that it had not made
changes after folding away a PHI. This is detected in the EXPENSIVE_CHECKS
build when the function's hash changes.
Differential Revision: https://reviews.llvm.org/D83985
When the byref attribute is added, there will need to be two similar
functions for the existing cases which have an associate value copy,
and byref which does not. Most, but not all of the existing uses will
use the existing version.
The associated size function added by D82679 also needs to
contextually differ, and will help eliminate a few places still
relying on pointee element types.
Unrolling a loop with compile-time unknown trip count results in a remainder loop. The remainder loop executes the remaining iterations of the original loop when the original trip count is not a multiple of the unroll factor. For better profile counts maintenance throughout the optimization pipeline, I'm assigning an artificial weight to the latch branch of the remainder loop.
A remainder loop runs up to as many times as the unroll factor subtracted by 1. Therefore I'm assigning the maximum possible trip count as the back edge weight. This should be more accurate than the default non-profile weight, which assumes the back edge runs much more frequently than the exit edge.
Differential Revision: https://reviews.llvm.org/D83187
CodeGenPrepare keeps fairly close track of various instructions it's
seen, particularly GEPs, in maps and vectors. However, sometimes those
instructions become dead and get removed while it's still executing.
This triggers AssertingVH references to them in an asserts build and
could lead to miscompiles in a release build (I've only seen a later
segfault though).
So this patch adds a callback to
RecursivelyDeleteTriviallyDeadInstructions which can make sure the
instruction about to be deleted is removed from CodeGenPrepare's data
structures.
The actual rotation happens in processLoop, so the second removed
call to verifyMemorySSA was unnecessary.
In fact, processLoop/rotateLoop already verify MemorySSA before
and after transforming each loop. Hence, both calls can be removed.
Pointed out by @lebedev.ri post-commit D51718.
Summary:
Add debug counter and stats counter to assume queries and assume builder
here is the collected stats on a build of check-llvm + check-clang.
"assume-builder.NumAssumeBuilt": 2720879,
"assume-builder.NumAssumesMerged": 761396,
"assume-builder.NumAssumesRemoved": 1576212,
"assume-builder.NumBundlesInAssumes": 6518809,
"assume-queries.NumAssumeQueries": 85566380,
"assume-queries.NumUsefullAssumeQueries": 2727360,
the NumUsefullAssumeQueries stat is actually pessimistic because in a few places queries
ask to keep providing information to try to get better information. and this isn't counted
as a usefull query evem tho it can be usefull
Reviewers: jdoerfert
Reviewed By: jdoerfert
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83506
This fixes warnings raised by Clang's new -Wsuggest-override, in preparation for enabling that warning in the LLVM build. This patch also removes the virtual keyword where redundant, but only in places where doing so improves consistency within a given file. It also removes a couple unnecessary virtual destructor declarations in derived classes where the destructor inherited from the base class is already virtual.
Differential Revision: https://reviews.llvm.org/D83709
Here we teach the ConstantFolding analysis pass that it is not legal to
replace a load of a bitcast constant (having a non-integral addrspace)
with a bitcast of the value of that constant (with a different
non-integral addrspace).
But also teach it that certain bit patterns are always known and
convertable (a fact it already uses elsewhere). This required us to also
fix a globalopt test, since, after this change, LLVM is able to realize
that the test actually is a valid transform (NULL is always a known
bit-pattern) and so it doesn't need to emit the failure remarks for it.
Also simplify some of the negative tests for transforms by avoiding a
type change in their bitcast, and add positive versions of the same
tests, to show that they otherwise should work.
Differential Revision: https://reviews.llvm.org/D59730
This could previously make it more complicated for ConstantFolding
later, leading to a higher likelyhood it would have to reject the
expression, even though zero seems like probably the common case here.
Differential Revision: https://reviews.llvm.org/D59730
Currently, a transformation like pow(2.0, x) -> exp2(x) copies the pow
attribute list verbatim and applies it to exp2. This works out fine
when the attribute list is empty, but when it isn't clang may error due
due to the mismatch.
The source function and destination don't necessarily have anything
to do with one another, attribute-wise. So it makes sense to remove
the attribute lists (this is similar to what IPO does in this
situation).
This was discovered after implementing the `noundef` param attribute.
Differential Revision: https://reviews.llvm.org/D82820
Place the ssa.copy instructions for assumes after the assume,
instead of before it. Both options are valid, but placing them
afterwards prevents assumes from being replaced with assume(true).
This fixes https://bugs.llvm.org/show_bug.cgi?id=37541 in NewGVN
and will avoid a similar issue in SCCP when we handle more
predicate infos.
Differential Revision: https://reviews.llvm.org/D83631
Summary:
- Skip unreachable predecessors during header detection in SCC. Those
unreachable blocks would be generated in the switch lowering pass in
the corner cases or other frontends. Even though they could be removed
through the CFG simplification, we should skip them during header
detection.
Reviewers: sameerds
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83562
Summary:
This patch separates the peeling specific parameters from the UnrollingPreferences,
and creates a new struct called PeelingPreferences. Functions which used the
UnrollingPreferences struct for peeling have been updated to use the PeelingPreferences struct.
Author: sidbav (Sidharth Baveja)
Reviewers: Whitney (Whitney Tsang), Meinersbur (Michael Kruse), skatkov (Serguei Katkov), ashlykov (Arkady Shlykov), bogner (Justin Bogner), hfinkel (Hal Finkel), anhtuyen (Anh Tuyen Tran), nikic (Nikita Popov)
Reviewed By: Meinersbur (Michael Kruse)
Subscribers: fhahn (Florian Hahn), hiraditya (Aditya Kumar), llvm-commits, LLVM
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80580
Summary: This patch moves OrderedInstructions to CodeMoverUtils as It was
the only place where OrderedInstructions is required.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto, fhahn, nikic
Reviewed By: Whitney, nikic
Subscribers: mgorny, hiraditya, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D80643
The block front may be a PHI node, inserting a cast instructions like
BitCast, PtrToInt, IntToPtr among PHIs is not right.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D80975
OriginalOp of a predicate always refers to the original IR
value that was renamed. So for nested predicates of the same value, it
will always refer to the original IR value.
For the use in SCCP however, we need to find the renamed value that is
currently used in the condition associated with the predicate. This
patch adds a new RenamedOp field to do exactly that.
NewGVN currently relies on the existing behavior to merge instruction
metadata. A test case to check for exactly that has been added in
195fa4bfae.
Reviewers: efriedma, davide, nikic
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D78133
The `noundef` attribute indicates an argument or return value which
may never have an undef value representation.
This patch allows LLVM to parse the attribute.
Differential Revision: https://reviews.llvm.org/D83412
Summary: This patch makes code motion checks optional which are dependent on
specific analysis example, dominator tree, post dominator tree and dependence
info. The aim is to make the adoption of CodeMoverUtils easier for clients that
don't use analysis which were strictly required by CodeMoverUtils. This will
also help in diversifying code motion checks using other analysis example MSSA.
Authored By: RithikSharma
Reviewer: Whitney, bmahjour, etiotto
Reviewed By: Whitney
Subscribers: Prazek, hiraditya, george.burgess.iv, asbirlea, llvm-commits
Tag: LLVM
Differential Revision: https://reviews.llvm.org/D82566
Summary:
It is reasonably common to want to clone some call with different bundles.
Let's actually provide an interface to do that.
Reviewers: chandlerc, jdoerfert, dblaikie, nickdesaulniers
Reviewed By: nickdesaulniers
Subscribers: llvm-commits, hiraditya
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83248
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.
New methods are introduced to cover common access patterns.
Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83083
clang w/ old-pm currently would simply crash
when -mllvm -enable-knowledge-retention=true is specified.
Clearly, these two passes had no Old-PM test coverage,
which would have shown the problem - not requiring AssumptionCacheTracker,
but then trying to always get it.
Also, why try to get domtree only if it's cached,
but at the same time marking it as required?
Summary:
This patch changes call graph analysis to recognize callback call sites
and add an artificial 'reference' call record from the broker function
caller to the callback function in the call graph. A presence of such
reference enforces bottom-up traversal order for callback functions in
CG SCC pass manager because callback function logically becomes a callee
of the broker function caller.
Reviewers: jdoerfert, hfinkel, sstefan1, baziotis
Reviewed By: jdoerfert
Subscribers: hiraditya, kuter, sstefan1, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82572
Sometimes SimplifyCFG may decide to perform jump threading. In order
to do it, it follows the following algorithm:
1. Checks if the block is small enough for threading;
2. If yes, inserts a PR Phi relying that the next iteration will remove it
by performing jump threading;
3. The next iteration checks the block again and performs the threading.
This logic has a corner case: inserting the PR Phi increases block's size
by 1. If the block size at first check was max possible, one more Phi will
exceed this size, and we will neither perform threading nor remove the
created Phi node. As result, we will end up with worse IR than before.
This patch fixes this situation by excluding Phis from block size computation.
Excluding Phis from size computation for threading also makes sense by
itself because in case of threadign all those Phis will be removed.
Differential Revision: https://reviews.llvm.org/D81835
Reviewed By: asbirlea, nikic
It's possible for the first loop trip(s) to set the `Changed` Status, and to a
later one to early exit, in which case `Changed` must be return.
Differential Revision: https://reviews.llvm.org/D82753
In https://reviews.llvm.org/D81198, we outlined a number of scenarios
where dropping debug locations is appropriate. Stop issuing an error
when this happens.
Summary:
According to HowToUpdateDebugInfo.rst:
```
Preserving the debug locations of speculated instructions can make
it seem like a condition is true when it's not (or vice versa), which
leads to a confusing single-stepping experience
```
This patch follows the recommendation to drop debug locations on
speculated instructions.
Reviewers: aprantl, davide
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82420
InjectTLIMappings fails to preserve the analysis result of GlobalsAA. Not preserving the analysis might affect benchmark performance. This change fixes this issue.
Patch by: Ryan Santhiraraja <rsanthir@quicinc.com>
Reviewers: fpetrogalli, joerg, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D82343
Summary:
As [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]] reports,
with new cost-model we can sometimes end up being able to expand `udiv`/`urem` instructions.
And that exposes at least one instance of when we do that
regardless of whether or not it is safe to do.
In this particular case, it's `SimplifyIndvar::replaceIVUserWithLoopInvariant()`.
It seems to me, we simply need to check with `isSafeToExpandAt()` first.
The test isn't great. I'm not sure how to make it only run `-indvars`.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=45360 | PR45360 ]].
Reviewers: mkazantsev, reames, helloqirun
Reviewed By: mkazantsev
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82108
This reverts commit 29b2c1ca72.
The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!
Not sure the patch itself it wrong but revert to investigate the failure.
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.
Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.
Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
When an invoke instruction is converted to a call its
profile metadata is dropped because it has incompatible
format (see commit 16ad6eeb94).
This patch adds an attempt to convert profile data to
format of the call instruction. This used to work well
before the commit dcfa78a4cc.
Reviewers: reames
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D82071
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650
I don't know anything about debug info, but this seems like more work
should be necessary. This constructs a new IRBuilder and reconstructs
the original divides rather than moving the original.
One problem this has is if a div/rem pair are handled, both end up
with the same debugloc. I'm not sure how to fix this, since this uses
a cache when it sees the same input operands again, which will have
the first instance's location attached.
Summary:
llvm::SplitEdge was failing an assertion that the BasicBlock only had
one successor (for BasicBlocks terminated by CallBrInst, we typically
have multiple successors). It was surprising that the earlier call to
SplitCriticalEdge did not handle the critical edge (there was an early
return). Removing that triggered another assertion relating to creating
a BlockAddress for a BasicBlock that did not (yet) have a parent, which
is a simple order of operations issue in llvm::SplitCriticalEdge (a
freshly constructed BasicBlock must be inserted into a Function's basic
block list to have a parent).
Thanks to @nathanchance for the report.
Fixes: https://github.com/ClangBuiltLinux/linux/issues/1018
Reviewers: craig.topper, jyknight, void, fhahn, efriedma
Reviewed By: efriedma
Subscribers: eli.friedman, rnk, efriedma, fhahn, hiraditya, llvm-commits, nathanchance, srhines
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D81607
The invoke instruction can have profile metadata with branch_weights,
which does not make sense for a call instruction and will be
rejected by the verifier.
Differential revision: https://reviews.llvm.org/D81996
Summary:
this reduces significantly the number of assumes generated without aftecting too much
the information that is preserved. this improves the compile-time cost
of enable-knowledge-retention significantly.
Reviewers: jdoerfert, sstefan1
Reviewed By: jdoerfert
Subscribers: hiraditya, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79650