This special cases an unconditional latch and a conditional branch latch exit to improve codegen and test readability. I am hoping to reuse this function in the runtime unroll code, but without this change, the test diffs are far too complex to assess.
Currently/previously, while SCEV guaranteed that it produces the same value,
the way it was produced may be illegal IR, so we have an ugly check that
the replacement is valid.
But now that the SCEV strictness wrt the pointer/integer types has been improved,
i believe this invariant is already upheld by the SCEV itself, natively.
I think we should add an assertion, wait for a week, and then, if all is good,
rip out all this checking.
Or we could just do the latter directly i guess.
This reverts commit rL127839.
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D108043
This patch removes RtCheck from RuntimeCheckingPtrGroup to make it
possible to construct RuntimeCheckingPtrGroup objects without a
RuntimePointerChecking object. This should make it easier to
re-use the code to generate runtime checks, e.g. in D102834.
RtCheck was only used to access the pointer info for a given index.
Instead, the start and end expressions can be passed directly.
For code-gen, we also need to know the address space to use. This can
also be explicitly passed at construction.
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D105481
This fixes the lower and upper bound calculation of a
RuntimeCheckingPtrGroup when it has more than one loop
invariant pointers. Resolves PR50686.
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D104148
We were passing the RecurrenceDescriptor by value to most of the reduction analysis methods, despite it being rather bulky with TrackingVH members (that can be costly to copy). In all these cases we're only using the RecurrenceDescriptor for rather basic purposes (access to types/kinds etc.).
Differential Revision: https://reviews.llvm.org/D104029
Essentially, the cover function simply combines the loop level check and the function level scope into one call. This simplifies several callers and is (subjectively) less error prone.
This patch causes the loop vectorizer to not interleave loops that have
nounroll loop hints (llvm.loop.unroll.disable and llvm.loop.unroll_count(1)).
Note that if a particular interleave count is being requested
(through llvm.loop.interleave_count), it will still be honoured, regardless
of the presence of nounroll hints.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D101374
Previously we could only vectorize FP reductions if fast math was enabled, as this allows us to
reorder FP operations. However, it may still be beneficial to vectorize the loop by moving
the reduction inside the vectorized loop and making sure that the scalar reduction value
be an input to the horizontal reduction, e.g:
%phi = phi float [ 0.0, %entry ], [ %reduction, %vector_body ]
%load = load <8 x float>
%reduction = call float @llvm.vector.reduce.fadd.v8f32(float %phi, <8 x float> %load)
This patch adds a new flag (IsOrdered) to RecurrenceDescriptor and makes use of the changes added
by D75069 as much as possible, which already teaches the vectorizer about in-loop reductions.
For now in-order reduction support is off by default and controlled with the `-enable-strict-reductions` flag.
Reviewed By: david-arm
Differential Revision: https://reviews.llvm.org/D98435
The vector reduction intrinsics started life as experimental ops, so backend support
was lacking. As part of promoting them to 1st-class intrinsics, however, codegen
support was added/improved:
D58015
D90247
So I think it is safe to now remove this complication from IR.
Note that we still have an IR-level codegen expansion pass for these as discussed
in D95690. Removing that is another step in simplifying the logic. Also note that
x86 was already unconditionally forming reductions in IR, so there should be no
difference for x86.
I spot checked a couple of the tests here by running them through opt+llc and did
not see any asm diffs.
If we do find functional differences for other targets, it should be possible
to (at least temporarily) restore the shuffle IR with the ExpandReductions IR
pass.
Differential Revision: https://reviews.llvm.org/D96552
This is another step (see D95452) towards correcting fast-math-flags
bugs in vector reductions.
There are multiple bugs visible in the test diffs, and this is still
not working as it should. We still use function attributes (rather
than FMF) to drive part of the logic, but we are not checking for
the correct FP function attributes.
Note that FMF may not be propagated optimally on selects (example
in https://llvm.org/PR35607 ). That's why I'm proposing to union the
FMF of a fcmp+select pair and avoid regressions on existing vectorizer
tests.
Differential Revision: https://reviews.llvm.org/D95690
This gives the user control over which expander to use, which in turn
allows the user to decide what to do with the expanded instructions.
Used in D75980.
Reviewed By: lebedev.ri
Differential Revision: https://reviews.llvm.org/D94295
The switch must set the predicate correctly; anything else
should lead to unreachable/assert.
I'm trying to fix FMF propagation here and the callers,
so this is a preliminary cleanup.
This builds on the restricted after initial revert form of D93906, and adds back support for breaking backedges of inner loops. It turns out the original invalidation logic wasn't quite right, specifically around the handling of LCSSA.
When breaking the backedge of an inner loop, we can cause blocks which were in the outer loop only because they were also included in a sub-loop to be removed from both loops. This results in the exit block set for our original parent loop changing, and thus a need for new LCSSA phi nodes.
This case happens when the inner loop has an exit block which is also an exit block of the parent, and there's a block in the child which reaches an exit to said block without also reaching an exit to the parent loop.
(I'm describing this in terms of the immediate parent, but the problem is general for any transitive parent in the nest.)
The approach implemented here involves a potentially expensive LCSSA rebuild. Perf testing during review didn't show anything concerning, but we may end up needing to revert this if anyone encounters a practical compile time issue.
Differential Revision: https://reviews.llvm.org/D94378
This patch adds the default value of 1 to drop_begin.
In the llvm codebase, 70% of calls to drop_begin have 1 as the second
argument. The interface similar to with std::next should improve
readability.
This patch converts a couple of calls to drop_begin as examples.
Differential Revision: https://reviews.llvm.org/D94858
This is a resubmit of dd6bb367 (which was reverted due to stage2 build failures in 7c63aac), with the additional restriction added to the transform to only consider outer most loops.
As shown in the added test case, ensuring LCSSA is up to date when deleting an inner loop is tricky as we may actually need to remove blocks from any outer loops, thus changing the exit block set. For the moment, just avoid transforming this case. I plan to return to this case in a follow up patch and see if we can do better.
Original commit message follows...
The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.
Differential Revision: https://reviews.llvm.org/D93906
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.
This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.
Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86844
While here, rename the inaccurate getRecurrenceBinOp()
because that was also used to get CmpInst opcodes.
The recurrence/reduction kind should always refer to the
expected opcode for a reduction. SLP appears to be the
only direct caller of createSimpleTargetReduction(), and
that calling code ideally should not be carrying around
both an opcode and a reduction kind.
This should allow us to generalize reduction matching to
use intrinsics instead of only binops.
This reverts commit dd6bb367d1.
Multi-stage builders are showing an assertion failure w/LCSSA not being preserved on entry to IndVars. Reason isn't clear, reverting while investigating.
The basic idea is that if SCEV can prove the backedge isn't taken, we can go ahead and get rid of the backedge (and thus the loop) while leaving the rest of the control in place. This nicely handles cases with dispatch between multiple exits and internal side effects.
Differential Revision: https://reviews.llvm.org/D93906
This is almost all mechanical search-and-replace and
no-functional-change-intended (NFC). Having a single
enum makes it easier to match/reason about the
reduction cases.
The goal is to remove `Opcode` from reduction matching
code in the vectorizers because that makes it harder to
adapt the code to handle intrinsics.
The code in RecurrenceDescriptor::AddReductionVar() is
the only place that required closer inspection. It uses
a RecurrenceDescriptor and a second InstDesc to sometimes
overwrite part of the struct. It seem like we should be
able to simplify that logic, but it's not clear exactly
which cmp+sel patterns that we are trying to handle/avoid.
Test clang/test/Misc/loop-opt-setup.c fails when executed in Release.
This reverts commit 6f1503d598.
Reviewed By: SureYeaah
Differential Revision: https://reviews.llvm.org/D93956
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.
This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.
Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86844
I don't know if there's some way this changes what the vectorizers
may produce for reductions, but I have added test coverage with
3567908 and 5ced712 to show that both passes already have bugs in
this area. Hopefully this does not make things worse before we can
really fix it.
This is no-functional-change-intended (AFAIK, we can't
isolate this difference in a regression test).
That's because the callers should be setting the IRBuilder's
FMF field when creating the reduction and/or setting those
flags after creating. It doesn't make sense to override this
one flag alone.
This is part of a multi-step process to clean up the FMF
setting/propagation. See PR35538 for an example.
As mentioned in D93793, there are quite a few places where unary `IRBuilder::CreateShuffleVector(X, Mask)` can be used
instead of `IRBuilder::CreateShuffleVector(X, Undef, Mask)`.
Let's update them.
Actually, it would have been more natural if the patches were made in this order:
(1) let them use unary CreateShuffleVector first
(2) update IRBuilder::CreateShuffleVector to use poison as a placeholder value (D93793)
The order is swapped, but in terms of correctness it is still fine.
Reviewed By: spatel
Differential Revision: https://reviews.llvm.org/D93923
The switch duplicated the translation in getRecurrenceBinOp().
This code is still weird because it translates to the TTI
ReductionFlags for min/max, but then createSimpleTargetReduction()
converts that back to RecurrenceDescriptor::MinMaxRecurrenceKind.
In this patch I have added support for a new loop hint called
vectorize.scalable.enable that says whether we should enable scalable
vectorization or not. If a user wants to instruct the compiler to
vectorize a loop with scalable vectors they can now do this as
follows:
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !2
...
!2 = !{!2, !3, !4}
!3 = !{!"llvm.loop.vectorize.width", i32 8}
!4 = !{!"llvm.loop.vectorize.scalable.enable", i1 true}
Setting the hint to false simply reverts the behaviour back to the
default, using fixed width vectors.
Differential Revision: https://reviews.llvm.org/D88962
We already do not unroll loops with vector instructions under MVE, but
that does not include the remainder loops that the vectorizer produces.
These remainder loops will be rarely executed and are not worth
unrolling, as the trip count is likely to be low if they get executed at
all. Luckily they get llvm.loop.isvectorized to make recognizing them
simpler.
We have wanted to do this for a while but hit issues with low overhead
loops being reverted due to difficult registry allocation. With recent
changes that seems to be less of an issue now.
Differential Revision: https://reviews.llvm.org/D90055
From C11 and C++11 onwards, a forward-progress requirement has been
introduced for both languages. In the case of C, loops with non-constant
conditionals that do not have any observable side-effects (as defined by
6.8.5p6) can be assumed by the implementation to terminate, and in the
case of C++, this assumption extends to all functions. The clang
frontend will emit the `mustprogress` function attribute for C++
functions (D86233, D85393, D86841) and emit the loop metadata
`llvm.loop.mustprogress` for every loop in C11 or later that has a
non-constant conditional.
This patch modifies LoopDeletion so that only loops with
the `llvm.loop.mustprogress` metadata or loops contained in functions
that are required to make progress (`mustprogress` or `willreturn`) are
checked for observable side-effects. If these loops do not have an
observable side-effect, then we delete them.
Loops without observable side-effects that do not satisfy the above
conditions will not be deleted.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D86844
Currently, LoopDeletion refuses to remove dead loops with no exit blocks
because it cannot statically determine the control flow after it removes
the block. This leads to miscompiles if the loop is an infinite loop and
should've been removed.
Differential Revision: https://reviews.llvm.org/D90115
Use -0.0 instead of 0.0 as the start value. The previous use of 0.0
was fine for all existing uses of this function though, as it is
always generated with fast flags right now, and thus nsz.
60b852092c introduced SCEV verification to
deleteDeadLoop, but it appears this check is currently a bit over-eager
and some users of deleteDeadLoop appear to only patch up SE after
calling it (e.g. PR47753).
Remove the extra check for now. We can consider adding it back after we
tracked down the source of the inconsistency for PR47753.
After D71539, we need to forget the loop before setting the incoming
values of phi nodes in exit blocks, because we are looking through those
phi nodes now and the SCEV expression could depend on the loop phi. If
we update the phi nodes before forgetting the loop, we miss those users
during invalidation.
Reviewed By: reames
Differential Revision: https://reviews.llvm.org/D88167
Summary:
Avoid exposing details about how children are stored. This will enable
subsequent type-erasure changes.
New methods are introduced to cover common access patterns.
Change-Id: Idb5f4b1b9c84e4cc71ddb39bb52a388682f5674f
Reviewers: arsenm, RKSimon, mehdi_amini, courbet
Subscribers: qcolombet, sdardis, wdng, hiraditya, jrtc27, zzheng, atanasyan, asbirlea, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D83083
This reverts commit 29b2c1ca72.
The patch causes the DT verifier failure like:
DominatorTree is different than a freshly computed one!
Not sure the patch itself it wrong but revert to investigate the failure.
Currently we allow peeling of the loops if there is a exiting latch block
and all other exits are blocks ending with deopt.
Actually we want that exit would end up with deopt unconditionally but
it is not required that exit itself ends with deopt.
Reviewers: reames, ashlykov, fhahn, apilipenko, fedor.sergeev
Reviewed By: apilipenko
Subscribers: hiraditya, zzheng, dantrushin, llvm-commits
Differential Revision: https://reviews.llvm.org/D81140
Summary:
Currently, `rewriteLoopExitValues()`'s logic is roughly as following:
> Loop over each incoming value in each PHI node.
> Query whether the SCEV for that incoming value is high-cost.
> Expand the SCEV.
> Perform sanity check (`isValidRewrite()`, D51582)
> Record the info
> Afterwards, see if we can drop the loop given replacements.
> Maybe perform replacements.
The problem is that we interleave SCEV cost checking and expansion.
This is A Problem, because `isHighCostExpansion()` takes special care
to not bill for the expansions that were already expanded, and we can reuse.
While it makes sense in general - if we know that we will expand some SCEV,
all the other SCEV's costs should account for that, which might cause
some of them to become non-high-cost too, and cause chain reaction.
But that isn't what we are doing here. We expand *all* SCEV's, unconditionally.
So every next SCEV's cost will be affected by the already-performed expansions
for previous SCEV's. Even if we are not planning on keeping
some of the expansions we performed.
Worse yet, this current "bonus" depends on the exact PHI node
incoming value processing order. This is completely wrong.
As an example of an issue, see @dmajor's `pr45835.ll` - if we happen to have
a PHI node with two(!) identical high-cost incoming values for the same basic blocks,
we would decide first time around that it is high-cost, expand it,
and immediately decide that it is not high-cost because we have an expansion
that we could reuse (because we expanded it right before, temporarily),
and replace the second incoming value but not the first one;
thus resulting in a broken PHI.
What we instead should do for now, is not perform any expansions
until after we've queried all the costs.
Later, in particular after `isValidRewrite()` is an assertion (D51582)
we could improve upon that, but in a more coherent fashion.
See [[ https://bugs.llvm.org/show_bug.cgi?id=45835 | PR45835 ]]
Reviewers: dmajor, reames, mkazantsev, fhahn, efriedma
Reviewed By: dmajor, mkazantsev
Subscribers: smeenai, nikic, hiraditya, javed.absar, llvm-commits, dmajor
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D79787
SCEVExpander modifies the underlying function so it is more suitable in
Transforms/Utils, rather than Analysis. This allows using other
transform utils in SCEVExpander.
This patch was originally committed as b8a3c34eee, but broke the
modules build, as LoopAccessAnalysis was using the Expander.
The code-gen part of LAA was moved to lib/Transforms recently, so this
patch can be landed again.
Reviewers: sanjoy.google, efriedma, reames
Reviewed By: sanjoy.google
Differential Revision: https://reviews.llvm.org/D71537
Currently LAA's uses of ScalarEvolutionExpander blocks moving the
expander from Analysis to Transforms. Conceptually the expander does not
fit into Analysis (it is only used for code generation) and
runtime-check generation also seems to be better suited as a
transformation utility.
Reviewers: Ayal, anemet
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78460
This allows forward declarations of PointerCheck, which in turn reduce
the number of times LoopAccessAnalysis needs to be included.
Ultimately this helps with moving runtime check generation to
Transforms/Utils/LoopUtils.h, without having to include it there.
Reviewers: anemet, Ayal
Reviewed By: Ayal
Differential Revision: https://reviews.llvm.org/D78458
Most of the includes in LoopUtils.h are not required in the header and
they can be replaced by forward declarations.
Unfortunately includes of TargetTransformInfo.h and IVDescriptors.h pull
in a bunch of additional things, but there is no easy way to get rid of
them at the moment I think.
Summary:
Remove usages of asserting vector getters in Type in preparation for the
VectorType refactor. The existence of these functions complicates the
refactor while adding little value.
Reviewers: rriddle, sdesmalen, efriedma
Reviewed By: sdesmalen
Subscribers: hiraditya, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D77260
As discussed in post-commit review in https://reviews.llvm.org/D73501
if the goal of this is to help vectorizer, then we should actually
be teaching vectorizer to do this, because right now this rewrite
is still budget-limited, which isn't what we'd want.
Additionally, while the rest of the patch series was universally profitable,
this particular patch is reportedly (https://reviews.llvm.org/D73501#1905171)
exposing cost-modeling issues on ARM.
So let's just back this particular patch out. Once there's an undo transform,
this could be considered for reintegration.
This reverts commit 44edc6fd2c.
Also adds a force-reduction-intrinsics option for testing, for forcing
the generation of reduction intrinsics even when the backend is not
requesting them.
Summary:
Current peeling implementation bails out in case of loop nests.
The patch introduces a field in TargetTransformInfo structure that
certain targets can use to relax the constraints if it's
profitable (disabled by default).
Also additional option is added to enable peeling manually for
experimenting and testing purposes.
Reviewers: fhahn, lebedev.ri, xbolva00
Reviewed By: xbolva00
Subscribers: RKSimon, xbolva00, hiraditya, zzheng, llvm-commits
Differential Revision: https://reviews.llvm.org/D70304
Summary:
Replacing uses of IV outside of the loop is likely generally useful,
but `rewriteLoopExitValues()` is cautious, and if it isn't told to always
perform the replacement, and there are hard uses of IV in loop,
it doesn't replace.
In [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]],
that prevents `-indvars` from replacing uses of induction variable
after the loop, which might be one of the optimization failures
preventing that code from being vectorized.
Instead, now that the cost model is fixed, i believe we should be
a little bit more optimistic, and also perform replacement
if we believe it is within our budget.
Fixes [[ https://bugs.llvm.org/show_bug.cgi?id=44668 | PR44668 ]].
Reviewers: reames, mkazantsev, asbirlea, fhahn, skatkov
Reviewed By: mkazantsev
Subscribers: nikic, hiraditya, zzheng, javed.absar, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73501
Summary:
In future patches`SCEVExpander::isHighCostExpansionHelper()` will respect the budget allocated by performing TTI cost modelling.
This is a fully NFC patch to make things reviewable.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, zzheng, javed.absar, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73705
Summary:
Future patches will make use of TTI to perform cost-model-driven `SCEVExpander::isHighCostExpansionHelper()`
This is a fully NFC patch to make things reviewable.
Reviewers: reames, mkazantsev, wmi, sanjoy
Reviewed By: mkazantsev
Subscribers: hiraditya, zzheng, javed.absar, dmgreen, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73704
Summary:
Method appendLoopsToWorklist is duplicate in LoopUnroll and in the
LoopPassManager as an internal method. Make it an utility.
Reviewers: dmgreen, chandlerc, fedor.sergeev, yamauchi
Subscribers: mehdi_amini, hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D73569
We may calculate reassociable math ops in arbitrary order when creating a shuffle reduction,
so there's no guarantee that things like 'nsw' hold on those intermediate values. Drop all
poison-generating flags for safety.
This change is limited to shuffle reductions because I don't think we have a problem in the
general case (where we intersect flags of each scalar op that goes into a vector op), but if
there's evidence of other cases being wrong, we can extend this fix to cover those cases.
https://bugs.llvm.org/show_bug.cgi?id=44536
Differential Revision: https://reviews.llvm.org/D73727
Summary: Vectorized loop processes VFxUF number of elements in one iteration thus total number of iterations decreases proportionally. In addition epilog loop may not have more than VFxUF - 1 iterations. This patch updates profile information accordingly.
Reviewers: hsaito, Ayal, fhahn, reames, silvas, dcaballe, SjoerdMeijer, mkuper, DaniilSuchkov
Reviewed By: Ayal, DaniilSuchkov
Subscribers: fedor.sergeev, hiraditya, rkruppe, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D67905
Summary: Current implementation of getLoopEstimatedTripCount returns 1 iteration less than it should. The reason is that in bottom tested loop first iteration is executed before first back branch is taken. For example for loop with !{!"branch_weights", i32 1 // taken, i32 1 // exit} metadata getLoopEstimatedTripCount gives 1 while actual number of iterations is 2.
Reviewers: Ayal, fhahn
Reviewed By: Ayal
Subscribers: mgorny, hiraditya, zzheng, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D71990
This moves `rewriteLoopExitValues()` from IndVarSimplify to LoopUtils thus
making it a generic loop utility function. This allows to rewrite loop exit
values by just calling this function without running the whole IndVarSimplify
pass.
We use this in D72714 to rematerialise the iteration count in exit blocks, so
that we can clean-up loop update expressions inside the hardware-loops later.
Differential Revision: https://reviews.llvm.org/D72602
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
caused lots of recompilation.
I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
recompiles touches affected_files header
342380 95 3604 llvm/include/llvm/ADT/STLExtras.h
314730 234 1345 llvm/include/llvm/InitializePasses.h
307036 118 2602 llvm/include/llvm/ADT/APInt.h
213049 59 3611 llvm/include/llvm/Support/MathExtras.h
170422 47 3626 llvm/include/llvm/Support/Compiler.h
162225 45 3605 llvm/include/llvm/ADT/Optional.h
158319 63 2513 llvm/include/llvm/ADT/Triple.h
140322 39 3598 llvm/include/llvm/ADT/StringRef.h
137647 59 2333 llvm/include/llvm/Support/Error.h
131619 73 1803 llvm/include/llvm/Support/FileSystem.h
Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.
Reviewers: bkramer, asbirlea, bollu, jdoerfert
Differential Revision: https://reviews.llvm.org/D70211
Summary:
I'm not planning to check this in at the moment, but feedback is very welcome, in particular how this affects performance.
The feedback obtains here will guide the next steps towards enabling this.
This patch enables the use of MemorySSA in the loop pass manager.
Passes that currently use MemorySSA:
- EarlyCSE
Passes that use MemorySSA after this patch:
- EarlyCSE
- LICM
- SimpleLoopUnswitch
Loop passes that update MemorySSA (and do not use it yet, but could use it after this patch):
- LoopInstSimplify
- LoopSimplifyCFG
- LoopUnswitch
- LoopRotate
- LoopSimplify
- LCSSA
Loop passes that do *not* update MemorySSA:
- IndVarSimplify
- LoopDelete
- LoopIdiom
- LoopSink
- LoopUnroll
- LoopInterchange
- LoopUnrollAndJam
- LoopVectorize
- LoopReroll
- IRCE
Reviewers: chandlerc, george.burgess.iv, davide, sanjoy, gberry
Subscribers: jlebar, Prazek, dmgreen, jdoerfert, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D58311
llvm-svn: 370384
For some targets the LICM pass can result in sub-optimal code in some
cases where it would be better not to run the pass, but it isn't
always possible to suppress the transformations heuristically.
Where the front-end has insight into such cases it is beneficial
to attach loop metadata to disable the pass - this change adds the
llvm.licm.disable metadata to enable that.
Differential Revision: https://reviews.llvm.org/D64557
llvm-svn: 368296
To avoid duplicates in loop metadata, if the string to add is
already there, just update the value.
Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D65265
llvm-svn: 367087
Just move the utility function to LoopUtils.cpp to re-use it in loop peeling.
Reviewers: reames, Ashutosh
Reviewed By: reames
Subscribers: hiraditya, asbirlea, llvm-commits
Differential Revision: https://reviews.llvm.org/D65264
llvm-svn: 367085
With this patch the getLoopEstimatedTripCount function will
accept also the loops where there are more than one exit but
all exits except latch block should ends up with a call to deopt.
This side exits should not impact the estimated trip count.
Reviewers: reames, mkuper, danielcdh
Reviewed By: reames
Subscribers: fhahn, lebedev.ri, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D64553
llvm-svn: 366042
This patch changes how LLVM handles the accumulator/start value
in the reduction, by never ignoring it regardless of the presence of
fast-math flags on callsites. This change introduces the following
new intrinsics to replace the existing ones:
llvm.experimental.vector.reduce.fadd -> llvm.experimental.vector.reduce.v2.fadd
llvm.experimental.vector.reduce.fmul -> llvm.experimental.vector.reduce.v2.fmul
and adds functionality to auto-upgrade existing LLVM IR and bitcode.
Reviewers: RKSimon, greened, dmgreen, nikic, simoll, aemerson
Reviewed By: nikic
Differential Revision: https://reviews.llvm.org/D60261
llvm-svn: 363035
Instead of passing around fast-math-flags as a parameter, we can set those
using an IRBuilder guard object. This is no-functional-change-intended.
The motivation is to eventually fix the vectorizers to use and set the
correct fast-math-flags for reductions. Examples of that not behaving as
expected are:
https://bugs.llvm.org/show_bug.cgi?id=23116 (should be able to reduce with less than 'fast')
https://bugs.llvm.org/show_bug.cgi?id=35538 (possible miscompile for -0.0)
D61802 (should be able to reduce with IR-level FMF)
Differential Revision: https://reviews.llvm.org/D62272
llvm-svn: 362612
Summary:
If `deleteDeadLoop()` is called on such a loop, that has "bad" exit block,
one that e.g. has no terminator instruction, the `DIBuilder::insertDbgValueIntrinsic()`
will be told to insert the Dbg Value Intrinsic after `nullptr`
(since there is no first non-PHI instruction), which will cause it to not insert
those instructions into any basic block. The instructions will be parent-less,
and IR verifier will complain. It is rather obvious to track down the root cause
when that happens, so let's just assert it never happens.
Reviewers: sanjoy, davide, vsk
Reviewed By: vsk
Subscribers: llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D61008
llvm-svn: 359993
Change from original commit: move test (that uses an X86 triple) into the X86
subdirectory.
Original description:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.
Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal
Reviewed By: sdesmalen
Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits
Tags: #llvm
Differential Revision: https://reviews.llvm.org/D57728
llvm-svn: 355889
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html
This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.
This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.
There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.
Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii
Differential Revision: https://reviews.llvm.org/D53765
llvm-svn: 353563