Commit Graph

53 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin bcaf31ec3f [AMDGPU] Allow finer grain control of an unaligned access speed
A target can report whether a misaligned access is 'fast', as defined
by the target, or not. In reality there can be different levels
of 'fast' and 'slow'. This patch changes the boolean 'Fast'
argument of the allowsMisalignedMemoryAccesses family of functions
to an unsigned representing the speed of the access.

A target can still define it as it wants, and the direct translation
of the current code uses 0 and 1 for the current false and true, which
makes the change an NFC.

A subsequent patch will start using an actual speed value in
the load/store vectorizer to check that a vectorized access is going
to be not just fast, but no slower than before.

Differential Revision: https://reviews.llvm.org/D124217
2022-11-17 09:23:53 -08:00
Arthur Eubanks 70dc3b811e [AggressiveInstCombine] Remove legacy PM pass
As part of legacy PM optimization pipeline removal.

This shouldn't be used in codegen pipelines so it should be ok to remove.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D137116
2022-11-15 14:35:15 -08:00
bipmis 150fc73dda [AggressiveInstCombine] Avoid load merge/widen if stores are present b/w loads
This patch addresses the test cases in which the merged load has to be
inserted at the right point. This matters when there is a store between
the loads.

This patch backs out the load merge in all cases where stores are
present between the loads; it will eventually be replaced with a proper
fix and test cases.
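An illustrative C sketch (hypothetical, not taken from the test cases)
of the hazard being avoided:

  // Merging the two byte loads into one 16-bit load inserted at the
  // first load would read p[1] before the intervening store clobbers it.
  unsigned short combine(unsigned char *p) {
    unsigned lo = p[0];
    p[1] = 0;                     // store between the two loads
    unsigned hi = p[1];
    return (unsigned short)(lo | (hi << 8));
  }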

Differential Revision: https://reviews.llvm.org/D137333
2022-11-03 14:32:07 +00:00
bipmis 38f3e44997 [AggressiveInstCombine] Load merge the reverse load pattern of consecutive loads.
This patch extends the load merge/widen in AggressiveInstCombine() to handle reverse load patterns.
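One plausible shape of such a pattern (an illustrative sketch; the
actual test cases live in the review):

  // The byte at the higher address lands in the lower shift position,
  // i.e. the offsets run in reverse relative to the forward pattern.
  unsigned short rev16(const unsigned char *p) {
    return (unsigned short)(((unsigned)p[0] << 8) | (unsigned)p[1]);
  }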

Differential Revision: https://reviews.llvm.org/D135137
2022-10-19 11:22:58 +01:00
David Stuttard d1d7d2235c [AggressiveInstCombine] Fix cases where non-opaque pointers are used
In the case of non-opaque pointers, when combining consecutive loads,
we need to bitcast the pointer source to the combined type size;
otherwise asserts are triggered.
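A hedged sketch of the kind of fix (variable names are illustrative,
not the patch's actual code):

  // With typed (non-opaque) pointers, loading a wide integer through a
  // narrower pointee type trips type assertions, so cast the pointer
  // operand to the combined type first.
  Value *WidePtr = Builder.CreateBitCast(
      Ptr, WideIntTy->getPointerTo(
               Ptr->getType()->getPointerAddressSpace()));
  LoadInst *WideLoad = Builder.CreateLoad(WideIntTy, WidePtr);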

Differential Revision: https://reviews.llvm.org/D135249
2022-10-05 13:42:46 +01:00
bipmis 3b49a9fcf6 [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns, as shown below:

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load and that the only use of this pattern is as a wider load. In that case, for non-atomic/non-volatile loads, reduce the pattern to a combined load, which improves the cost of inlining, unrolling, vectorization, etc.
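For instance (an illustrative sketch), a little-endian 32-bit read
composed from four byte loads matches the patterns above and collapses
into a single i32 load:

  // Four ZExt'd byte loads OR'd together at increasing shift amounts
  // (all non-atomic and non-volatile) become one 32-bit load.
  unsigned read32(const unsigned char *p) {
    return (unsigned)p[0]         | ((unsigned)p[1] << 8) |
           ((unsigned)p[2] << 16) | ((unsigned)p[3] << 24);
  }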

Fix the error reported on reverse load merge.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-28 17:32:47 +01:00
Dmitri Gribenko 954d3cd2c6 Revert "[AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load."
This reverts commit 3c70c8c1df.

After this commit, during the 3-stage bootstrap the second-stage Clang
crashes.
2022-09-23 19:21:09 +02:00
bipmis 3c70c8c1df [AggressiveInstCombine] Combine consecutive loads which are being merged to form a wider load.
The patch simplifies some of the patterns, as shown below:

1. (ZExt(L1) << shift1) | (ZExt(L2) << shift2) -> ZExt(L3) << shift1
2. (ZExt(L1) << shift1) | ZExt(L2) -> ZExt(L3)

The pattern indicates that the loads are being merged into a wider load and that the only use of this pattern is as a wider load. In that case, for non-atomic/non-volatile loads, reduce the pattern to a combined load, which improves the cost of inlining, unrolling, vectorization, etc.

Differential Revision: https://reviews.llvm.org/D127392
2022-09-23 10:19:50 +01:00
Djordje Todorovic f0f8b46863 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"
The bug reported in [0] has been fixed.
The issue was that we had not checked whether the global variables that
represent cttz tables were constant.
A new negative test covering this has been added in
negative-lower-table-based-cttz.ll.

[0] https://reviews.llvm.org/rGdf868edee561eb973edd85ec9df41c67aa0bff6b
2022-09-20 13:12:47 +02:00
Djordje Todorovic b080d0bae8 Revert ""Recommit "[AggressiveInstCombine] Lower Table Based CTTZ"""
This reverts commit df868edee5, as it
introduces a bug found by Alive2 (more on the rGdf868edee561).
2022-09-12 08:23:07 +02:00
Djordje Todorovic df868edee5 "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit 053841c562.

We faced a use-after-free after pushing D113291, since foldSqrt()
has a call to eraseFromParent(). That call should be made at the end
of the main loop that folds the patterns. This patch fixes that.
2022-09-09 10:29:39 +02:00
Djordje Todorovic 7aec9ddcfd Revert "Recommit "[AggressiveInstCombine] Lower Table Based CTTZ""
This reverts commit f879939157.
2022-09-08 17:01:16 +02:00
Djordje Todorovic f879939157 Recommit "[AggressiveInstCombine] Lower Table Based CTTZ" 2022-09-08 16:36:46 +02:00
Richard Smith 053841c562 Revert "[AggressiveInstCombine] Lower Table Based CTTZ"
This reverts commit fec01ee3f5.

According to asan, this patch introduces a heap use after free.
2022-09-02 16:19:09 -07:00
Djordje Todorovic fec01ee3f5 [AggressiveInstCombine] Lower Table Based CTTZ
This patch introduces recognition of table-based cttz implementations
in AggressiveInstCombine.
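The recognized shape is the classic de Bruijn multiply-and-lookup idiom,
e.g. (a representative sketch of the well-known bit-twiddling trick, not
necessarily the exact test case):

  // Table-based count-trailing-zeros; the whole idiom can be replaced
  // with a single @llvm.cttz.i32 call.
  static const int Table[32] = {
      0, 1, 28, 2, 29, 14, 24, 3, 30, 22, 20, 15, 25, 17, 4, 8,
      31, 27, 13, 23, 21, 19, 16, 7, 26, 12, 18, 6, 11, 5, 10, 9};
  int cttz32(unsigned v) {
    return Table[((v & -v) * 0x077CB531u) >> 27];
  }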

This fixes [0].

[0] https://bugs.llvm.org/show_bug.cgi?id=46434

Differential Revision: https://reviews.llvm.org/D113291
2022-09-02 17:26:55 +02:00
Sanjay Patel e079bf6558 [AggressiveInstCombine] check sqrt operand to allow more libcall->intrinsic transforms
This should fix issue #56383 (at least when compiled with -O3 because this pass is only
run at -O3 currently).
2022-07-27 11:36:13 -04:00
Sanjay Patel e3205b8765 [AggressiveInstCombine] convert sqrt libcalls with "nnan" to sqrt intrinsics
This is an alternate to D129155 that uses TTI.haveFastSqrt() to avoid a
potential miscompile for programs with reads of errno. Moving the transform
to AggressiveInstCombine provides access to TTI.

If a sqrt call has "nnan", that implies that the input argument is never
negative because sqrt of {negative number} --> NAN.
If the argument is never negative and the call can be lowered without a
libcall, then we can assume that errno accesses are unchanged after lowering,
so the call can be translated to the LLVM intrinsic (which is expected to
become inline code).
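As a concrete illustration (hypothetical compile flags, matching the
description above):

  // Built with e.g. -O3 -ffinite-math-only, the sqrt call carries
  // "nnan", so it can be lowered to @llvm.sqrt.f64 and inline code,
  // with no errno-setting libcall required.
  #include <math.h>
  double root(double x) { return sqrt(x); }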

This affects codegen for targets like x86 that have sqrt instructions, but
still have to conservatively assume that a libcall may be needed to set
errno as shown in issue #52620 and issue #56383.

This patch won't solve those examples - we will need to extend this to use
CannotBeOrderedLessThanZero or similar, enhance that analysis for new
operators, and/or deal with llvm.assume too.

Differential Revision: https://reviews.llvm.org/D129167
2022-07-26 15:50:14 -04:00
David Green 4a5cb957a1 [AggressiveInstcombine] Conditionally fold saturated fptosi to llvm.fptosi.sat
This adds a fold for aggressive instcombine that converts
smin(smax(fptosi(x))) into an llvm.fptosi.sat, provided that the
saturation constants are correct and the cost of the llvm.fptosi.sat is
lower.
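The matched source shape looks roughly like this (illustrative sketch):

  // fptosi clamped by smax/smin with the int16 saturation constants;
  // this can become @llvm.fptosi.sat.i16.f32 when the target reports
  // the intrinsic as cheaper.
  short sat16(float f) {
    int i = (int)f;                // fptosi (poison/UB when out of range)
    i = i < -32768 ? -32768 : i;   // smax with INT16_MIN
    i = i > 32767 ? 32767 : i;     // smin with INT16_MAX
    return (short)i;
  }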

Unfortunately, an llvm.fptosi.sat cannot always be converted back to a
smin/smax/fptosi. The llvm.fptosi.sat intrinsic is more defined than the
original, which produces poison if the original fptosi was out of range.
The llvm.fptosi.sat will saturate any value, so it needs to be expanded
to a fptosi(fpmin(fpmax(x))), which can be worse for code generation
depending on the target.

So this change is conditional on the backend reporting that the
llvm.fptosi.sat is cheaper than the original smin+smax+fptosi. This is
a change to the way that AggressiveInstCombine has worked in the past.
Instead of just being a canonicalization pass, the canonicalization can
now be dependent on the target in certain specific cases.

Differential Revision: https://reviews.llvm.org/D125755
2022-06-10 09:36:09 +01:00
serge-sans-paille 59630917d6 Cleanup includes: Transform/Scalar
Estimated impact on preprocessor output lines:
before: 1062981579
after:  1062494547

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120817
2022-03-03 07:56:34 +01:00
Kazu Hirata 9ed6800ef9 [Transforms] Use default member initialization in MaskOps (NFC) 2022-02-05 21:39:21 -08:00
Kazu Hirata 843d1eda18 [llvm] Use llvm::reverse (NFC) 2021-11-06 19:31:18 -07:00
Chris Lattner 735f46715d [APInt] Normalize naming on keep constructors / predicate methods.
This renames the primary methods for creating a zero value to `getZero`
instead of `getNullValue` and renames predicates like `isAllOnesValue`
to simply `isAllOnes`.  This achieves two things:

1) This starts standardizing predicates across the LLVM codebase,
   following (in this case) ConstantInt.  The word "Value" doesn't
   convey anything of merit, and is missing from some of the other methods anyway.

2) Calling an integer "null" doesn't make any sense.  The original sin
   here is mine and I've regretted it for years.  This moves us to calling
   it "zero" instead, which is correct!

APInt is widely used and I don't think anyone is keen to take massive source
breakage on anything so core, at least not all in one go.  As such, this
doesn't actually delete any entrypoints, it "soft deprecates" them with a
comment.
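In code, the renames look like this (sketch):

  #include "llvm/ADT/APInt.h"
  using llvm::APInt;
  APInt Zero = APInt::getZero(32);   // previously APInt::getNullValue(32)
  bool Ones = (~Zero).isAllOnes();   // previously isAllOnesValue()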

Included in this patch are changes to a bunch of the codebase, but there are
more.  We should normalize SelectionDAG and other APIs as well, which would
make the API change more mechanical.

Differential Revision: https://reviews.llvm.org/D109483
2021-09-09 09:50:24 -07:00
Anton Afanasyev d1f9b21677 [AggressiveInstCombine] Add `AssumptionCache` to aggressive instcombine
Add support for @llvm.assume() to TruncInstCombine allowing
optimizations based on these intrinsics while computing known bits.
2021-09-07 16:45:00 +03:00
Arthur Eubanks 6b9524a05b [NewPM] Don't mark AA analyses as preserved
Currently all AA analyses marked as preserved are stateless, not taking
into account their dependent analyses. So there's no need to mark them
as preserved; they won't be invalidated unless their analyses are.

SCEVAAResults was the one exception to this, it was treated like a
typical analysis result. Make it like the others and don't invalidate
unless SCEV is invalidated.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D102032
2021-05-18 13:49:03 -07:00
Simon Pilgrim 88c5b50060 [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts (REAPPLIED)
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.
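Beyond rotates, the generalized matcher covers two-input guarded funnel
shifts such as (illustrative):

  // For s != 0 this is fshl(hi, lo, s); the guard exists in source
  // because the shift by (32 - s) would be UB at s == 0.
  unsigned funnel(unsigned hi, unsigned lo, unsigned s) {
    return s ? (hi << s) | (lo >> (32 - s)) : hi;
  }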

This should allow us to begin resolving PR46896 et al.

Ensure we block poison in a funnel shift value - similar to rG0fe91ad463fea9d08cbcd640a62aa9ca2d8d05e0

Reapplied with fix for PR48068 - we weren't checking that the shift values could be hoisted from their basic blocks.

Differential Revision: https://reviews.llvm.org/D90625
2020-12-21 15:22:27 +00:00
Martin Storsjö 36cf1e7d0e Revert "[AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts"
This reverts commit 59b22e495c.

That commit broke building for ARM and AArch64, reproducible like this:

$ cat apedec-reduced.c
a;
b(e) {
  int c;
  unsigned d = f();
  c = d >> 32 - e;
  return c;
}
g() {
  int h = i();
  if (a)
    h = h << a | b(a);
  return h;
}
$ clang -target aarch64-linux-gnu -w -c -O3 apedec-reduced.c
clang: ../lib/Transforms/InstCombine/InstructionCombining.cpp:3656: bool llvm::InstCombinerImpl::run(): Assertion `DT.dominates(BB, UserParent) && "Dominance relation broken?"' failed.

Same thing for e.g. an armv7-linux-gnueabihf target.
2020-11-04 08:39:32 +02:00
Simon Pilgrim 59b22e495c [AggressiveInstCombine] Generalize foldGuardedRotateToFunnelShift to generic funnel shifts
The fold currently only handles rotation patterns, but with the maturation of backend funnel shift handling we can now realistically handle all funnel shift patterns.

This should allow us to begin resolving PR46896 et al.

Differential Revision: https://reviews.llvm.org/D90625
2020-11-03 10:49:49 +00:00
Simon Pilgrim 55f15f99cb [AggressiveInstCombine] foldGuardedRotateToFunnelShift - generalize rotation to funnel shift matcher.
Replace matchRotate with a more general matchFunnelShift - at the moment this is still just used for rotation patterns.
2020-11-02 17:09:17 +00:00
Simon Pilgrim fadd152317 [AggressiveInstCombine] foldAnyOrAllBitsSet - add uniform vector support
Replace m_ConstantInt with m_APInt to support uniform vectors (with no undef elements)

Adding non-undef support would involve some refactoring of the MaskOps struct but this might still be worth it.
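Sketch of the difference (illustrative):

  // m_ConstantInt binds only a scalar ConstantInt; m_APInt also matches
  // a uniform (undef-free) vector splat and binds the splat value.
  using namespace llvm::PatternMatch;
  const llvm::APInt *C;
  if (match(V, m_APInt(C))) {
    // C is the scalar constant or the splat value
  }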
2020-10-15 11:02:35 +01:00
Simon Pilgrim 86fd5be6fd AggressiveInstCombine.h - remove unused includes. NFC. 2020-07-23 16:20:13 +01:00
Roman Lebedev e987ee6318 [NFCI][AggressiveInstCombiner] Add `STATISTIC()`s for transforms 2020-06-13 23:53:16 +03:00
Reid Kleckner 05da2fe521 Sink all InitializePasses.h includes
This file lists every pass in LLVM, and is included by Pass.h, which is
very popular. Every time we add, remove, or rename a pass in LLVM, it
causes lots of recompilation.

I found this fact by looking at this table, which is sorted by the
number of times a file was changed over the last 100,000 git commits
multiplied by the number of object files that depend on it in the
current checkout:
  recompiles    touches affected_files  header
  342380        95      3604    llvm/include/llvm/ADT/STLExtras.h
  314730        234     1345    llvm/include/llvm/InitializePasses.h
  307036        118     2602    llvm/include/llvm/ADT/APInt.h
  213049        59      3611    llvm/include/llvm/Support/MathExtras.h
  170422        47      3626    llvm/include/llvm/Support/Compiler.h
  162225        45      3605    llvm/include/llvm/ADT/Optional.h
  158319        63      2513    llvm/include/llvm/ADT/Triple.h
  140322        39      3598    llvm/include/llvm/ADT/StringRef.h
  137647        59      2333    llvm/include/llvm/Support/Error.h
  131619        73      1803    llvm/include/llvm/Support/FileSystem.h

Before this change, touching InitializePasses.h would cause 1345 files
to recompile. After this change, touching it only causes 550 compiles in
an incremental rebuild.

Reviewers: bkramer, asbirlea, bollu, jdoerfert

Differential Revision: https://reviews.llvm.org/D70211
2019-11-13 16:34:37 -08:00
Chen Zheng c17c5864ff [InstCombine] recognize popcount.
This patch recognizes the popcount idiom and replaces it with the popcount intrinsic, following the algorithm from this website:
  http://graphics.stanford.edu/~seander/bithacks.html#CountBitsSetParallel
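The recognized pattern is the parallel bit-count from that page (sketch):

  // Classic SWAR population count; the whole sequence is replaced with
  // a single @llvm.ctpop.i32 call.
  unsigned popcount32(unsigned v) {
    v = v - ((v >> 1) & 0x55555555);
    v = (v & 0x33333333) + ((v >> 2) & 0x33333333);
    return (((v + (v >> 4)) & 0x0F0F0F0F) * 0x01010101) >> 24;
  }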

Differential Revision: https://reviews.llvm.org/D68189

llvm-svn: 374512
2019-10-11 05:13:56 +00:00
Florian Hahn 5c3bc3c930 [PatternMatch] Make m_Br more flexible, add matchers for BB values.
Currently m_Br only takes references to BasicBlock*, which limits its
flexibility. For example, you have to declare a variable even if you
ignore the result, or you have to add extra checks to make sure the
matched BB matches an expected one.

This patch adds m_BasicBlock and m_SpecificBB matchers, which can be
used like the existing matchers for constants or values.

I also had a look at the existing uses and updated a few. IMO it makes
the code a bit more explicit.
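A usage sketch (illustrative; BI is a branch instruction and Expected a
known block, both assumed to be in scope):

  using namespace llvm::PatternMatch;
  llvm::Value *Cond;
  llvm::BasicBlock *TrueBB, *FalseBB;
  // Bind both successors directly, no dummy variables or post-checks:
  if (match(BI, m_Br(m_Value(Cond), m_BasicBlock(TrueBB),
                     m_BasicBlock(FalseBB)))) {
    // ...
  }
  // Or require one successor to be a specific, already-known block:
  if (match(BI, m_Br(m_Value(Cond), m_SpecificBB(Expected),
                     m_BasicBlock(FalseBB)))) {
    // ...
  }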

Reviewers: spatel, craig.topper, RKSimon, majnemer, lebedev.ri

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D68013

llvm-svn: 372885
2019-09-25 15:05:08 +00:00
Teresa Johnson 9c27b59cec Change TargetLibraryInfo analysis passes to always require Function
Summary:
This is the first change to enable the TLI to be built per-function so
that -fno-builtin* handling can be migrated to use function attributes.
See discussion on D61634 for background. This is an enabler for fixing
handling of these options for LTO, for example.

This change should not affect behavior, as the provided function is not
yet used to build a specifically per-function TLI, but rather enables
that migration.

Most of the changes were very mechanical, e.g. passing a Function to the
legacy analysis pass's getTLI interface, or in Module level cases,
adding a callback. This is similar to the way the per-function TTI
analysis works.
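For example, a legacy pass now asks for the TLI of the function it is
processing (sketch):

  // Inside a legacy pass's runOnFunction(Function &F): the wrapper pass
  // hands out a TLI keyed on the specific function.
  const TargetLibraryInfo &TLI =
      getAnalysis<TargetLibraryInfoWrapperPass>().getTLI(F);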

There was one place where we were looking for builtins but not in the
context of a specific function. See FindCXAAtExit in
lib/Transforms/IPO/GlobalOpt.cpp. I'm somewhat concerned my workaround
could provide the wrong behavior in some corner cases. Suggestions
welcome.

Reviewers: chandlerc, hfinkel

Subscribers: arsenm, dschuff, jvesely, nhaehnle, mehdi_amini, javed.absar, sbc100, jgravelle-google, eraman, aheejin, steven_wu, george.burgess.iv, dexonsmith, jfb, asbirlea, gchatelet, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D66428

llvm-svn: 371284
2019-09-07 03:09:36 +00:00
Chandler Carruth 2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Pawel Bylica 119aa8fa5f Format AggresiveInstCombine.cpp. NFC
llvm-svn: 350255
2019-01-02 19:51:46 +00:00
Sanjay Patel 200885e654 [AggressiveInstCombine] convert rotate with guard branch into funnel shift (PR34924)
Now that we have funnel shift intrinsics, it should be safe to convert this form of rotate to them.
In the worst case (a target that doesn't have rotate instructions), we will expand this into a 
branch-less sequence of ALU ops (neg/and/and/lshr/shl/or) in the backend, so it's still very 
likely to be a perf improvement over the original code.

The motivating source code pattern for this is shown in:
https://bugs.llvm.org/show_bug.cgi?id=34924
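The shape in question is a rotate guarded against a zero shift amount
(illustrative):

  // Rotate-left with a guard branch; this becomes @llvm.fshl(x, x, n)
  // and the branch disappears.
  unsigned rotl32(unsigned x, unsigned n) {
    if (n == 0) return x;        // guard: the (32 - 0) shift would be UB
    return (x << n) | (x >> (32 - n));
  }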

Background:
I looked at several different options before deciding where to try this - instcombine, simplifycfg, 
CGP - because it doesn't fit cleanly anywhere AFAIK.

The backend (CGP, SDAG, GlobalIsel?) is too late for what we're trying to accomplish. We want to 
have the IR converted before we reach things like vectorization because the reduced code can make a 
loop much simpler to transform.

Technically, this could be included in instcombine, but it's a large pattern match that includes 
control-flow, so it just felt wrong to stuff into there (although I have a draft of that patch). 
Similarly, this could be part of simplifycfg, but all of this pattern matching is a stretch.

So we're left with our relatively new dumping ground for homeless transforms: aggressive-instcombine. 
This only runs at -O3, but that seems like a reasonable limitation given that source code has many 
options to avoid this pattern (including the recently added clang intrinsics for rotates).

I'm including a PhaseOrdering test because we require the teamwork of 3 passes (aggressive-instcombine, 
instcombine, simplifycfg) to get this into the minimal IR form that we want. That test shows a bug
with the new pass manager that's independent of this change (but it will be masked if we canonicalize
harder to funnel shift intrinsics in instcombine).

Differential Revision: https://reviews.llvm.org/D55604

llvm-svn: 349396
2018-12-17 21:14:51 +00:00
Benjamin Kramer 837f93af7d [Aggressive InstCombine] Move C bindings to their own header file.
llvm-svn: 341461
2018-09-05 11:41:12 +00:00
whitequark a41b24f32d [LLVM-C] Fix name mangling on AggressiveInstCombine
Similarly to rL336736, at least one more C API function does not
properly get declared as extern "C" due to a missing header, causing
name mangling and linking errors.

This patch fixes calls to LLVMAddAggressiveInstCombinerPass().

Differential Revision: https://reviews.llvm.org/D49416

Reviewed By: whitequark

llvm-svn: 337264
2018-07-17 11:13:58 +00:00
George Burgess IV 3482871645 Sort includes + include a missing `extern "C"` header
If we don't include Initialization.h,
`LLVMInitializeAggressiveInstCombiner` won't see its `extern "C"` decl.
This causes sadness, name mangling, and linker errors.

Reported on the mailing lists by Vladimir Vissoultchev. Thanks!

llvm-svn: 336736
2018-07-10 22:48:13 +00:00
David Blaikie 31b98d2e99 Move Analysis/Utils/Local.h back to Transforms
Review feedback from r328165. Split out just the one function from the
file that's used by Analysis. (As chandlerc pointed out, the original
change only moved the header and not the implementation anyway - which
was fine for the one function that was used (since it's a
template/inlined in the header) but not in general)

llvm-svn: 333954
2018-06-04 21:23:21 +00:00
Sanjay Patel bf55e6dee1 [AggressiveInstCombine] avoid crashing on unsimplified code (PR37446)
This bug:
https://bugs.llvm.org/show_bug.cgi?id=37446
...raises another question: why do we run aggressive-instcombine before 
regular instcombine?

llvm-svn: 332243
2018-05-14 13:43:32 +00:00
Sanjay Patel ac3951a735 [AggressiveInstCombine] convert a chain of 'and-shift' bits into masked compare
This is a follow-up to D45986. As suggested there, we should match the "all-bits-set" 
pattern in addition to "any-bits-set".

This was a little more complicated than I thought it would be initially because the 
"and 1" instruction can be anywhere in the chain. Hopefully, the code comments make 
that logic understandable, but if you see a way to simplify or improve that, it's 
most appreciated.

This transforms patterns that emerge from bitfield tests as seen in PR37098:
https://bugs.llvm.org/show_bug.cgi?id=37098
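A small instance of the "all-bits-set" form (illustrative):

  // Tests that bits 1 and 4 are both set; folds to (x & 0x12) == 0x12.
  bool all_set(unsigned x) {
    return ((x >> 1) & 1) & ((x >> 4) & 1);
  }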

I think it would also help reduce the large test from:
D46336
D46595 
but we need something to reassociate that case to the forms we're expecting here first.

Differential Revision: https://reviews.llvm.org/D46649

llvm-svn: 331937
2018-05-09 23:08:15 +00:00
Simon Pilgrim f53ee8e640 Fix '32-bit shift implicitly converted to 64 bits' warning by using APInt::setBit instead.
llvm-svn: 331359
2018-05-02 14:22:30 +00:00
Sanjay Patel d2025a2e31 [AggressiveInstCombine] convert a chain of 'or-shift' bits into masked compare
and (or (lshr X, C), ...), 1 --> (X & C') != 0

I initially thought about implementing the minimal pattern in instcombine as mentioned here:
https://bugs.llvm.org/show_bug.cgi?id=37098#c6

...but we need to do better to catch the more general sequence from the motivating test 
(more than 2 bits in the compare). A test-suite run with statistics showed that this 
pattern currently occurs only 2 times. It would potentially happen more often if 
reassociation worked better (D45842), but it's probably still not too frequent?
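A minimal instance of the summary-line pattern (illustrative):

  // and(or(lshr(x,1), lshr(x,4)), 1): true when bit 1 or bit 4 is set;
  // folds to (x & 0x12) != 0.
  bool any_set(unsigned x) {
    return ((x >> 1) | (x >> 4)) & 1;
  }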

This is small enough that I didn't see a need to create a whole new class/file within 
AggressiveInstCombine. There are likely other relatively small matchers like what was 
discussed in D44266 that would slide under foldUnusualPatterns() (name suggestions welcome). 
We could potentially also consolidate matchers for ctpop, bswap, etc under here.

Differential Revision: https://reviews.llvm.org/D45986

llvm-svn: 331311
2018-05-01 21:02:09 +00:00
David Blaikie ba47dd16c5 Fix some layering in AggressiveInstCombine (avoiding inclusion of Scalar.h)
llvm-svn: 330726
2018-04-24 15:40:07 +00:00
Craig Topper 1bcb258ba3 [AggressiveInstCombine] Add aggressive inst combiner to the LLVM C API.
I just tried to copy what was done for regular InstCombine. Hopefully I didn't miss anything.

llvm-svn: 330668
2018-04-24 00:39:29 +00:00
Craig Topper d4eb2073b7 [AggressiveInstCombine] Add library initializer routine for AggressiveInstCombine library. Use it in bugpoint and llvm-opt-fuzzer to match regular InstCombine.
This should make aggressive instcombine usable with these tools.

llvm-svn: 330663
2018-04-24 00:05:21 +00:00
Amjad Aboud d895bff5f2 [AggressiveInstCombine] Make TruncCombine class ignore unreachable basic blocks.
Dead code may contain non-standard IR that causes infinite looping or crashes in the underlying analysis.
See PR36134 for more details.

Differential Revision: https://reviews.llvm.org/D42683

llvm-svn: 323862
2018-01-31 10:41:31 +00:00