Commit Graph

89 Commits

Author SHA1 Message Date
bipmis e9393789a9 [AggressiveInstCombine] Handle the insert point of the merged load correctly.
This patch updates the insert point of the merged load in AggressiveInstCombine().
This addresses the reported test breaks by handling alias analysis correctly.

Differential Revision: https://reviews.llvm.org/D137201
2022-11-29 10:53:51 +00:00
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can report whether a misaligned access is 'fast', as defined
by the target, or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing its speed.

A target can still define it however it wants, and the direct translation
of the current code uses 0 and 1 for the previous false and true. This
makes the change an NFC.

A subsequent patch will start using an actual speed value in
the load/store vectorizer to check whether a vectorized access is
going to be not just fast, but no slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Arthur Eubanks 70dc3b811e [AggressiveInstCombine] Remove legacy PM pass
As part of legacy PM optimization pipeline removal.

This shouldn't be used in codegen pipelines, so it should be OK to remove.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D137116
2022-11-15 14:35:15 -08:00
bipmis 150fc73dda [AggressiveInstCombine] Avoid load merge/widen if stores are present b/w loads
This patch addresses the test cases in which the merged load has to be inserted at the right point. This happens when there is a store between the loads.

This patch disables the load merge in all cases where stores are present between the loads; it will eventually be replaced with a proper fix and test cases.
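
For illustration only (not one of the actual regression tests), a minimal C sketch of the situation this guards against; the function and parameter names are made up:

#include <stdint.h>

/* The store through q may alias p[0] or p[1]; merging the two byte loads
   into a single 16-bit load, whether placed before or after the store,
   could then read the wrong value. Until alias analysis is consulted,
   the safe choice is to skip the merge entirely. */
uint16_t load_around_store(uint8_t *p, uint8_t *q) {
  uint16_t lo = p[0];
  *q = 0xff;                     /* intervening store between the two loads */
  uint16_t hi = p[1];
  return (uint16_t)(lo | (hi << 8));
}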

Differential Revision: https://reviews.llvm.org/D137333
2022-11-03 14:32:07 +00:00
bipmis 38f3e44997 [AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.
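
A rough C sketch of the kind of source that yields a reverse-ordered sequence of consecutive loads (illustrative only; the exact IR shape depends on the frontend, and the names are made up):

#include <stdint.h>

/* The byte at the highest address is combined first, so the IR contains the
   consecutive loads in reverse address order; on a little-endian target the
   whole expression is still just one 32-bit load. */
uint32_t load_le32_reverse(const uint8_t *p) {
  return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
         ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}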

Differential Revision: https://reviews.llvm.org/D135137
2022-10-19 11:22:58 +01:00
David Stuttard d1d7d2235c [AggressiveInstCombine] Fix cases where non-opaque pointers are used
In the case of non-opaque pointers, when combining consecutive loads,
need to bitcast the pointer source to the combined type size, otherwise
asserts are triggered.

Differential Revision: https://reviews.llvm.org/D135249
2022-10-05 13:42:46 +01:00
bipmis 3b49a9fcf6 [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load, and the only use of the pattern is as that wider value. In this case, for non-atomic/non-volatile loads, reduce the pattern to a combined load, which improves the cost of inlining, unrolling, vectorization, etc.
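
As an illustration, a small C function that typically lowers to the ZExt/shift/or pattern above (the name is made up; on a little-endian target the merged form is a plain i32 load):

#include <stdint.h>

/* Each byte becomes a zero-extended i8 load shifted into position and OR'ed
   together; when the loads are consecutive, non-atomic and non-volatile, the
   whole expression can be replaced by a single i32 load. */
uint32_t load_le32(const uint8_t *p) {
  return  (uint32_t)p[0]         | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16)  | ((uint32_t)p[3] << 24);
}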

Fix the error reported on reverse load merge.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-28 17:32:47 +01:00
Dmitri Gribenko 954d3cd2c6 Revert "[AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load."
This reverts commit 3c70c8c1df.

After this commit, the second-stage Clang crashes during the 3-stage
bootstrap.
2022-09-23 19:21:09 +02:00
bipmis 3c70c8c1df [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns as below

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load, and the only use of the pattern is as that wider value. In this case, for non-atomic/non-volatile loads, reduce the pattern to a combined load, which improves the cost of inlining, unrolling, vectorization, etc.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-23 10:19:50 +01:00
Djordje Todorovic f0f8b46863 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"
The bug reported in [0] has been fixed.
The issue was that we had not checked whether the global variables that
represent the cttz tables were constant.
A new negative test covering this has been added in
negative-lower-table-based-cttz.ll.

[0] https://reviews.llvm.org/rGdf868edee561eb973edd85ec9df41c67aa0bff6b
2022-09-20 13:12:47 +02:00
Djordje Todorovic b080d0bae8 Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit df868edee5, as it
introduces a bug found by Alive2 (more details at rGdf868edee561).
2022-09-12 08:23:07 +02:00
Djordje Todorovic df868edee5 "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit 053841c562.

We faced a use-after-free after landing D113291, since
foldSqrt() has a call to eraseFromParent(). That call should
be made at the end of the main loop that folds the patterns.
This patch fixes that.
2022-09-09 10:29:39 +02:00
Djordje Todorovic 7aec9ddcfd Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit f879939157.
2022-09-08 17:01:16 +02:00
Djordje Todorovic f879939157 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ" 2022-09-08 16:36:46 +02:00
Richard Smith 053841c562 Revert "[AggressiveInstCombine] Lower Table Based CTTZ"
This reverts commit fec01ee3f5.

According to ASan, this patch introduces a heap use-after-free.
2022-09-02 16:19:09 -07:00
Djordje Todorovic fec01ee3f5 [AggressiveInstCombine] Lower Table Based CTTZ
This patch introduces recognition of table-based cttz implementations
during AggressiveInstCombine.

This fixes [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=46434
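
One widespread form of the table-based cttz idiom being recognized is the de Bruijn multiply-and-lookup trick (as popularized by the "Bit Twiddling Hacks" page); a C sketch for illustration, not the exact code from the tests:

#include <stdint.h>

/* Classic de Bruijn table for counting trailing zeros of a non-zero 32-bit
   value: x & -x isolates the lowest set bit, the multiply maps it to a
   unique top-5-bit index, and the table decodes the bit position.
   (For x == 0 this returns 0, unlike a true cttz.) */
static const int debruijn_tab[32] = {
   0,  1, 28,  2, 29, 14, 24,  3, 30, 22, 20, 15, 25, 17,  4,  8,
  31, 27, 13, 23, 21, 19, 16,  7, 26, 12, 18,  6, 11,  5, 10,  9};

int cttz32(uint32_t x) {
  return debruijn_tab[((x & (0u - x)) * 0x077CB531u) >> 27];
}

The pass can rewrite such lookups into the llvm.cttz intrinsic, subject to the checks described here and in the later recommits.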

Differential Revision: https://reviews.llvm.org/D113291
2022-09-02 17:26:55 +02:00
Kazu Hirata 21de2888a4 Use llvm::is_contained (NFC) 2022-08-27 09:53:11 -07:00
Sanjay Patel e079bf6558 [AggressiveInstCombine] check sqrt operand to allow more libcall->intrinsic transforms
This should fix issue #56383 (at least when compiled with -O3, because this
pass is currently only run at -O3).
2022-07-27 11:36:13 -04:00
Sanjay Patel e3205b8765 [AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics
This is an alternate to D129155 that uses TTI.haveFastSqrt() to avoid a
potential miscompile for programs with reads of errno. Moving the transform
to AggressiveInstCombine provides access to TTI.

If a sqrt call has "nnan", that implies that the input argument is never
negative, because sqrt of {negative number} --> NaN.
If the argument is never negative and the call can be lowered without a
libcall, then we can assume that errno accesses are unchanged after lowering,
so the call can be translated to the LLVM intrinsic (which is expected to
become inline code).
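
A hedged C illustration of when the transform can fire; the flag name below is one way to get an "nnan" call and is my reading of the description, not part of the patch:

#include <math.h>

/* Built with e.g. -ffinite-math-only so the sqrt call carries "nnan", the
   argument can be assumed non-negative. If the target can lower sqrt without
   a libcall, errno is provably untouched and the call can become the
   llvm.sqrt intrinsic, which typically lowers to a single instruction. */
double root(double x) { return sqrt(x); }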

This affects codegen for targets like x86 that have sqrt instructions, but
still have to conservatively assume that a libcall may be needed to set
errno as shown in issue #52620 and issue #56383.

This patch won't solve those examples - we will need to extend this to use
CannotBeOrderedLessThanZero or similar, enhance that analysis for new
operators, and/or deal with llvm.assume too.

Differential Revision: https://reviews.llvm.org/D129167
2022-07-26 15:50:14 -04:00
David Green 4a5cb957a1 [AggressiveInstcombine] Conditionally fold saturated fptosi to llvm.fptosi.sat
This adds a fold for AggressiveInstCombine that converts
smin(smax(fptosi(x))) into an llvm.fptosi.sat, provided that the
saturation constants are correct and the cost of the llvm.fptosi.sat is
lower.
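
A C sketch of the smin/smax/fptosi shape being matched (illustrative; the fold works on the IR, the function name is made up, and the constants must match the target integer range exactly):

#include <limits.h>

/* Clamp the conversion result to the 32-bit signed range: the fptosi is done
   in a wider type, then smax(INT_MIN) and smin(INT_MAX) saturate it. When
   the backend reports llvm.fptosi.sat.i32.f32 as cheap, the whole sequence
   can be folded into that intrinsic. */
int sat_cvt(float x) {
  long long v = (long long)x;   /* fptosi to i64 (poison/UB if out of range) */
  if (v < INT_MIN) v = INT_MIN; /* smax with INT_MIN */
  if (v > INT_MAX) v = INT_MAX; /* smin with INT_MAX */
  return (int)v;
}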

Unfortunately, an llvm.fptosi.sat cannot always be converted back to a
smin/smax/fptosi. The llvm.fptosi.sat intrinsic is more defined than the
original, which produces poison if the original fptosi was out of range.
The llvm.fptosi.sat will saturate any value, so it needs to be expanded to
a fptosi(fpmin(fpmax(x))), which can be worse for code generation
depending on the target.

So this change is conditional on the backend reporting that the
llvm.fptosi.sat is cheaper than the original smin+smax+fptosi. This is
a change to the way that AggressiveInstCombine has worked in the past:
instead of just being a canonicalization pass, that canonicalization can
be dependent on the target in certain specific cases.

Differential Revision: https://reviews.llvm.org/D125755
2022-06-10 09:36:09 +01:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output lines:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Anton Afanasyev 904a00d17a [AggressiveInstCombine] Fix `TruncInstCombine` (fix f84d732f)
Erase phi nodes from `InstInfoMap` before erasing the nodes themselves
2022-02-25 08:04:11 +03:00
Anton Afanasyev 0dd8401371 [AggressiveInstCombine] Add `phi` nodes support to `TruncInstCombine`
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to the expression graph.
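
For illustration, a loop where the final `trunc` post-dominates a phi-carried expression; this is a sketch under the assumption that the accumulator is only consumed at the narrow width, and the names are made up:

#include <stdint.h>

/* The accumulator is a loop phi whose value is only ever consumed at 8-bit
   width through the final trunc; with phi support, TruncInstCombine can, in
   principle, shrink the phi and the add to i8 instead of keeping them at i32. */
uint8_t sum_bytes(const uint8_t *p, int n) {
  uint32_t acc = 0;
  for (int i = 0; i < n; ++i)
    acc += p[i];
  return (uint8_t)acc;
}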

Reviewed by: RKSimon, lebedev.ri

(recommit of fixed f84d732f, reverted by 8ad6d5e after sanitizer breakage)

Differential Revision: https://reviews.llvm.org/D109817
2022-02-25 07:57:35 +03:00
Anton Afanasyev 8ad6d5e465 Revert "[AggressiveInstCombine] Add `phi` nodes support to `TruncInstCombine`"
This reverts commit f84d732f8c.
Breakage of "sanitizer-x86_64-linux-fast"
2022-02-23 15:56:11 +03:00
Anton Afanasyev f84d732f8c [AggressiveInstCombine] Add `phi` nodes support to `TruncInstCombine`
Expand `TruncInstCombine` to handle loops by adding `phi` nodes
to the expression graph.

Reviewed by: RKSimon, lebedev.ri

Differential Revision: https://reviews.llvm.org/D109817
2022-02-23 14:01:55 +03:00
Kazu Hirata 31d72f0e45 [Transforms] Use default member initialization in TruncInstCombine (NFC) 2022-02-05 21:39:23 -08:00
Kazu Hirata 9ed6800ef9 [Transforms] Use default member initialization in MaskOps (NFC) 2022-02-05 21:39:21 -08:00
Kazu Hirata 7787a8f1b7 [llvm] Use llvm::reverse (NFC) 2021-12-13 21:54:51 -08:00
Kazu Hirata 843d1eda18 [llvm] Use llvm::reverse (NFC) 2021-11-06 19:31:18 -07:00
Anton Afanasyev 6a5f49a1ac [AggressiveInstCombine] Add `{insert/extract}element` to `TruncInstCombine` DAG
Alive2 for `{insert/extract}element`: https://alive2.llvm.org/ce/z/hwy_E-

Actually, not a single file of the test suite is touched by this change,
which means this is a rare pattern not generated by the frontend. But
it's worth having in place.

Differential Revision: https://reviews.llvm.org/D109236
2021-09-16 11:24:31 +03:00
Anton Afanasyev 54d8ebbbfd [AggressiveInstCombine] Add `udiv` and `urem` instrs to TruncInstCombine DAG
Add `udiv` and `urem` instructions to the DAG post-dominated by `trunc`,
allowing TruncInstCombine to reduce the bitwidth of expressions containing
these instructions. It is sufficient to require that all truncated bits of both
operands are zeros: https://alive2.llvm.org/ce/z/yiithn
(the `urem` case is identical).
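
A C sketch of the zero-top-bits condition (illustrative; the check is on known bits in the IR, not on C types, and the names are made up):

#include <stdint.h>

/* Both operands are zero-extended from 16 bits, so every bit the final trunc
   discards is known zero; the 32-bit udiv can therefore be narrowed to a
   16-bit udiv without changing the result. */
uint16_t narrow_udiv(uint16_t a, uint16_t b) {
  uint32_t q = (uint32_t)a / (uint32_t)(b | 1); /* "| 1" only avoids div-by-zero in the example */
  return (uint16_t)q;
}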

Differential Revision: https://reviews.llvm.org/D109515
2021-09-10 20:29:08 +03:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing in some of the other things.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Anton Afanasyev d1f9b21677 [AggressiveInstCombine] Add `AssumptionCache` to aggressive instcombine
Add support for @llvm.assume() to TruncInstCombine, allowing
optimizations based on these intrinsics while computing known bits.
2021-09-07 16:45:00 +03:00
Anton Afanasyev 8c0a1940c1 [AggresiveInstCombine] Add wrapper calls for `KnownBits` computing
Precommit before `AssumptionCache` adding: reviews.llvm.org/D109141

Differential Revision: https://reviews.llvm.org/D109288
2021-09-07 16:45:00 +03:00
Anton Afanasyev bed587631f [AggressiveInstCombine] Add arithmetic shift right instr to `TruncInstCombine` DAG
Add `ashr` instruction to the DAG post-dominated by `trunc`, allowing
`TruncInstCombine` to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are sign bits (all zeros or ones) and
one sign bit is left untruncated: https://alive2.llvm.org/ce/z/Ajo2__

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108355
2021-08-24 10:41:16 +03:00
Sanjay Patel dd19f342fa [AggressiveInstCombine] guard against applying instruction flags with constant folding
This is a minimized version of a crash reported in:
D108201
2021-08-20 12:22:18 -04:00
Anton Afanasyev 3890ce708d [NFC][AggressiveInstCombine] Simplify code for shift truncation 2021-08-20 06:37:02 +03:00
Anton Afanasyev cfb6dfcbd1 [AggressiveInstCombine] Add logical shift right instr to `TruncInstCombine` DAG
Add `lshr` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing
these instructions.

We should be shifting by less than the target bitwidth.
Also it is sufficient to require that all truncated bits
of the value-to-be-shifted are zeros: https://alive2.llvm.org/ce/z/_LytbB

Alive2 variable-length proof:
https://godbolt.org/z/1srE1aqzf => s/32/8/ => https://alive2.llvm.org/ce/z/StwPia

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108201
2021-08-18 22:20:58 +03:00
Anton Afanasyev 803270c0c6 [AggressiveInstCombine] Fix unsigned overflow
Fix issue reported here: https://reviews.llvm.org/D108091#2950930
2021-08-18 08:42:46 +03:00
Anton Afanasyev 1f3e35b6d1 [AggressiveInstCombine] Add shift left instruction to `TruncInstCombine` DAG
Add `shl` instruction to the DAG post-dominated by `trunc`, allowing
TruncInstCombine to reduce bitwidth of expressions containing left shifts.

The only thing we need to check is that the target bitwidth must be wider
than the maximal shift amount: https://alive2.llvm.org/ce/z/AwArqu
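
A small C illustration of the shift-amount condition (a sketch; actual legality is decided from known bits in the IR, and the names are made up):

#include <stdint.h>

/* The shift amount is at most 7, which is less than the 8-bit target width
   of the trunc, so the 32-bit shl feeding the trunc can be performed
   directly in 8 bits. */
uint8_t shl_narrow(uint8_t x, uint8_t n) {
  uint32_t wide = (uint32_t)x << (n & 7);
  return (uint8_t)wide;
}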

Part of https://reviews.llvm.org/D107766

Differential Revision: https://reviews.llvm.org/D108091
2021-08-17 12:44:37 +03:00
Arthur Eubanks 6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved; they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this; it was treated like a
typical analysis result. Make it like the others and don't invalidate it
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Kazu Hirata e53472de68 [Transforms] Use llvm::append_range (NFC) 2021-01-20 21:35:54 -08:00
Simon Pilgrim 88c5b50060 [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts (REAPPLIED)
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.
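
For reference, a C sketch of the guarded-rotate shape that this matcher (now generalized to all funnel shifts) can fold to a funnel-shift intrinsic; the function name is made up:

#include <stdint.h>

/* The guard avoids the undefined full-width shift in C; the branch/select
   plus the two shifts is the guarded-rotate shape that can be folded into a
   single rotate (funnel shift with equal operands). */
uint32_t rotl32(uint32_t x, uint32_t n) {
  n &= 31;
  return n ? (x << n) | (x >> (32u - n)) : x;
}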

This should allow us to begin resolving PR46896 et al.

Ensure we block poison in a funnel shift value - similar to rG0fe91ad463fea9d08cbcd640a62aa9ca2d8d05e0

Reapplied with a fix for PR48068 - we weren't checking that the shift values could be hoisted from their basic blocks.

Differential Revision: https://reviews.llvm.org/D90625
2020-12-21 15:22:27 +00:00
Jun Ma 137674f882 [TruncInstCombine] Remove scalable vector restriction
Differential Revision: https://reviews.llvm.org/D92819
2020-12-10 18:00:19 +08:00
serge-sans-paille 9218ff50f9 llvmbuildectomy - replace llvm-build by plain cmake
No longer rely on an external tool to build the llvm component layout.

Instead, leverage the existing `add_llvm_componentlibrary` cmake function and
introduce `add_llvm_component_group` to accurately describe component behavior.

These functions store extra properties in the created targets. These properties
are processed once all components are defined to resolve library dependencies
and produce the header expected by llvm-config.

Differential Revision: https://reviews.llvm.org/D90848
2020-11-13 10:35:24 +01:00
Martin Storsjö 36cf1e7d0e Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Simon Pilgrim 59b22e495c [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00
Simon Pilgrim 55f15f99cb [AggressiveInstCombine] foldGuardedRotateToFunnelShift - generalize rotation to funnel shift matcher.
Replace matchRotate with a more general matchFunnelShift - at the moment this is still just used for rotation patterns.
2020-11-02 17:09:17 +00:00
Simon Pilgrim fadd152317 [AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector support
Replace m_ConstantInt with m_APInt to support uniform vectors (with no undef elements).

Adding non-undef support would involve some refactoring of the MaskOps struct but this might still be worth it.
2020-10-15 11:02:35 +01:00
Simon Pilgrim 0347f3ea72 TruncInstCombine.cpp - fix header include ordering to fix llvm-include-order clang-tidy warning. NFCI. 2020-10-02 17:25:12 +01:00