Commit Graph

26463 Commits

Author SHA1 Message Date
Kazu Hirata 8ed1636184 [llvm] Use isa instead of dyn_cast (NFC) 2021-01-29 23:23:37 -08:00
Roman Lebedev c2534a7097
[ShadowStackGCLowering] Preserve Dominator Tree, if avaliable
This doesn't help avoid any Dominator Tree recalculations just yet,
there's one more pass to go..
2021-01-30 01:14:51 +03:00
Roman Lebedev a78d8feb48
[LowerConstantIntrinsics] Preserve Dominator Tree, if avaliable 2021-01-30 01:14:50 +03:00
Sriraman Tallam 9a81a4ef79 Emit metadata when instr. profiles hash mismatch occurs.
This patch emits "instr_prof_hash_mismatch" function annotation metadata if
there is a hash mismatch while applying instrumented profiles.

During the PGO optimized build using instrumented profiles, if the CFG of
the function has changed since generating the profile, a hash mismatch is
encountered. This patch emits this information as annotation metadata. We
plan to use this with Propeller which is done at the machine IR level.
Propeller is usually applied on top of PGO and a hash mismatch during
PGO could be used to detect source drift.

Differential Revision: https://reviews.llvm.org/D95495
2021-01-29 12:56:01 -08:00
Florian Hahn f3a710cade [LTO] Update splitCodeGen to take a reference to the module. (NFC)
splitCodeGen does not need to take ownership of the module, as it
currently clones the original module for each split operation.

There is an ~4 year old fixme to change that, but until this is
addressed, the function can just take a reference to the module.

This makes the transition of LTOCodeGenerator to use LTOBackend a bit
easier, because under some circumstances, LTOCodeGenerator needs to
write the original module back after codegen.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D95222
2021-01-29 11:53:11 +00:00
Yang Fan 59bd2068e9
[NFC][ScalarizeMaskedMemIntrin] Fix unused variable warning
GCC warning:
```
/llvm-project/llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp: In function ‘void scalarizeMaskedStore(llvm::CallInst*, llvm::DomTreeUpdater*, bool&)’:
/llvm-project/llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp:295:15: warning: variable ‘IfBlock’ set but not used [-Wunused-but-set-variable]
  295 |   BasicBlock *IfBlock = CI->getParent();
      |               ^~~~~~~
/llvm-project/llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp: In function ‘void scalarizeMaskedScatter(llvm::CallInst*, llvm::DomTreeUpdater*, bool&)’:
/llvm-project/llvm/lib/Transforms/Scalar/ScalarizeMaskedMemIntrin.cpp:555:15: warning: variable ‘IfBlock’ set but not used [-Wunused-but-set-variable]
  555 |   BasicBlock *IfBlock = CI->getParent();
      |               ^~~~~~~
```
2021-01-29 15:15:58 +08:00
Roman Lebedev 056385921d
[ScalarizeMaskedMemIntrin] Preserve Dominator Tree, if avaliable
This de-pessimizes the arguably more usual case of no masked mem intrinsics,
and gets rid of one more Dominator Tree recalculation.

As per llvm/test/CodeGen/X86/opt-pipeline.ll,
there's one more Dominator Tree recalculation left, we could get rid of.
2021-01-29 01:11:36 +03:00
Roman Lebedev 577fdcaa93
[PartiallyInlineLibCalls] Preserve Dominator Tree, if avaliable
This doesn't get rid of any Dominator Tree recalculations just yet,
there is one more pass to update..
2021-01-29 01:11:36 +03:00
Roman Lebedev 573f74117b
[NFC][ScalarizeMaskedMemIntrin] scalarizeMaskedCompressStore(): port to SplitBlockAndInsertIfThen()
Makes Dominator Tree preservation in a followup patch somewhat easier.
2021-01-29 01:11:35 +03:00
Roman Lebedev 2e4bb3f119
[NFC][ScalarizeMaskedMemIntrin] scalarizeMaskedExpandLoad(): port to SplitBlockAndInsertIfThen()
Makes Dominator Tree preservation in a followup patch somewhat easier.
2021-01-29 01:11:35 +03:00
Roman Lebedev e8efc03a1e
[NFC][ScalarizeMaskedMemIntrin] scalarizeMaskedScatter(): port to SplitBlockAndInsertIfThen()
Makes Dominator Tree preservation in a followup patch somewhat easier.
2021-01-29 01:11:35 +03:00
Roman Lebedev 1356399a11
[NFC][ScalarizeMaskedMemIntrin] scalarizeMaskedGather(): port to SplitBlockAndInsertIfThen()
Makes Dominator Tree preservation in a followup patch somewhat easier.
2021-01-29 01:11:34 +03:00
Roman Lebedev 22b8421156
[NFC][ScalarizeMaskedMemIntrin] scalarizeMaskedStore(): port to SplitBlockAndInsertIfThen()
Makes Dominator Tree preservation in a followup patch somewhat easier.
2021-01-29 01:11:34 +03:00
Roman Lebedev 0ea45a412a
[NFC][ScalarizeMaskedMemIntrin] scalarizeMaskedLoad(): port to SplitBlockAndInsertIfThen()
Makes Dominator Tree preservation in a followup patch somewhat easier.
2021-01-29 01:11:34 +03:00
Roman Lebedev 394685481c
[NFC][PartiallyInlineLibCalls] Port to SplitBlockAndInsertIfThen()
This makes follow-up patch for Dominator Tree preservation
somewhat more straight-forward.
2021-01-29 01:11:33 +03:00
Roman Lebedev 2de2d84ed0
[NFC][EntryExitInstrumenter] Mark Dominator Tree as preserved in legacy-PM too
This is correctly handled in new-PM wrappers, but not in old-PM.
2021-01-29 01:11:33 +03:00
Adrian Prantl 62140d943c Better document the limitations of coro::salvageDebugInfo()
and fix a few edge cases that show up in the Swift compiler but
weren't caught by the existing tests. Most notably the old code wasn't
salvaging load operations correctly. The patch also gets rid of the
LoadFromFramePtr argument and replaces it with a more generalized
mechanism.
2021-01-28 09:53:19 -08:00
Roman Lebedev 8cfa963463
[SimplifyCFG] If provided, preserve Dominator Tree
SimplifyCFG is an utility pass, and the fact that it does not
preserve DomTree's, forces it's users to somehow workaround that,
likely by not preserving DomTrees's themselves.

Indeed, simplifycfg pass didn't know how to preserve dominator tree,
it took me just under a month (starting with e113317958)
do rectify that, now it fully knows how to,
there's likely some problems with that still,
but i've dealt with everything i can spot so far.

I think we now can flip the switch.

Note that this is functionally an NFC change,
since this doesn't change the users to pass in the DomTree,
that is a separate question.

Reviewed By: kuhar, nikic

Differential Revision: https://reviews.llvm.org/D94827
2021-01-28 14:11:34 +03:00
Yang Fan 8644eb024b
[NFC][Transforms][Coroutines] Remove unused variable 2021-01-28 16:42:30 +08:00
Kazu Hirata 0da15ea581 [llvm] Use append_range (NFC) 2021-01-27 23:25:41 -08:00
Hongtao Yu 7e99bddfea [CSSPGO] Support of CS profiles in extended binary format.
This change brings up support of context-sensitive profiles in the format of extended binary. Existing sample profile reader/writer/merger code is being tweaked to reflect the fact of bracketed input contexts, like (`[...]`). The paired brackets are also needed in extbinary profiles because we don't yet have an otherwise good way to tell calling contexts apart from regular function names since the context delimiter `@` can somehow serve as a part of the C++ mangled names.

Reviewed By: wmi, wenlei

Differential Revision: https://reviews.llvm.org/D95547
2021-01-27 21:29:46 -08:00
Teresa Johnson 1487747e99 [LTO] Prevent devirtualization for symbols dynamically exported
Identify dynamically exported symbols (--export-dynamic[-symbol=],
--dynamic-list=, or definitions needed to preempt shared objects) and
prevent their LTO visibility from being upgraded.
This helps avoid use of whole program devirtualization when there may
be overrides in dynamic libraries.

Differential Revision: https://reviews.llvm.org/D91583
2021-01-27 15:54:13 -08:00
Sanjay Patel ab93c18c12 [LoopVectorize] use IR fast-math-flags exclusively (not FP function attributes)
I am trying to untangle the fast-math-flags propagation logic
in the vectorizers (see a6f022127 for SLP).

The loop vectorizer has a mix of checking FP function attributes,
IR-level FMF, and just wrong assumptions.

I am trying to avoid regressions while fixing this, and I think
the IR-level logic is good enough for that, but it's hard to say
for sure. This would be the 1st step in the clean-up.

The existing test that I changed to include 'fast' actually shows
a miscompile: the function only had the equivalent of nnan, but we
created new instructions that had fast (all FMF set). This is
similar to the example in https://llvm.org/PR35538

Differential Revision: https://reviews.llvm.org/D95452
2021-01-27 14:17:11 -05:00
Fangrui Song 54fb3ca96e [ThinLTO] Add Visibility bits to GlobalValueSummary::GVFlags
Imported functions and variable get the visibility from the module supplying the
definition.  However, non-imported definitions do not get the visibility from
(ELF) the most constraining visibility among all modules (Mach-O) the visibility
of the prevailing definition.

This patch

* adds visibility bits to GlobalValueSummary::GVFlags
* computes the result visibility and propagates it to all definitions

Protected/hidden can imply dso_local which can enable some optimizations (this
is stronger than GVFlags::DSOLocal because the implied dso_local can be
leveraged for ELF -shared while default visibility dso_local has to be cleared
for ELF -shared).

Note: we don't have summaries for declarations, so for ELF if a declaration has
the most constraining visibility, the result visibility may not be that one.

Differential Revision: https://reviews.llvm.org/D92900
2021-01-27 10:43:51 -08:00
Florian Hahn 28410d17f5
[LoopUtils] Pass SCEVExpander instead SE to addRuntimeChecks.
This gives the user control over which expander to use, which in turn
allows the user to decide what to do with the expanded instructions.

Used in D75980.

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D94295
2021-01-27 17:36:19 +00:00
Petr Hosek bb9eb19829 Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 17:13:34 -08:00
Adrian Prantl 0554541b44 Salvage debug info for function arguments in coro-split funclets.
This patch improves the availability for variables stored in the
coroutine frame by emitting an alloca to hold the pointer to the frame
object and rewriting dbg.declare intrinsics to point inside the frame
object using salvaged DIExpressions. Finally, a new alloca is created
in the funclet to hold the FramePtr pointer to ensure that it is
available throughout the entire function at -O0.

This path also effectively reverts D90772. The testcase updates
highlight nicely how every removed CHECK for a dbg.value is preceded
by a new CHECK for a dbg.declare.

Thanks to JunMa, Yifeng, and Bruno for their thoughtful reviews!

Differential Revision: https://reviews.llvm.org/D93497

rdar://71866936
2021-01-26 15:01:26 -08:00
Bjorn Pettersson a9bd3d37bd [NewPM] Add ExtraVectorizerPasses support
As it looks like NewPM generally is using SimpleLoopUnswitch
instead of LoopUnswitch, this patch also use SimpleLoopUnswitch
in the ExtraVectorizerPasses sequence (compared with LegacyPM
which use the LoopUnswitch pass).

Reviewed By: aeubanks

Differential Revision: https://reviews.llvm.org/D95457
2021-01-26 22:59:10 +01:00
Valery N Dmitriev 716b9dd0d8 [InstCombine] Preserve FMF for powi simplifications.
Differential Revision: https://reviews.llvm.org/D95455
2021-01-26 13:26:06 -08:00
Petr Hosek 1e634f3952 Revert "Support for instrumenting only selected files or functions"
This reverts commit 4edf35f11a because
the test fails on Windows bots.
2021-01-26 12:25:28 -08:00
Petr Hosek 4edf35f11a Support for instrumenting only selected files or functions
This change implements support for applying profile instrumentation
only to selected files or functions. The implementation uses the
sanitizer special case list format to select which files and functions
to instrument, and relies on the new noprofile IR attribute to exclude
functions from instrumentation.

Differential Revision: https://reviews.llvm.org/D94820
2021-01-26 11:11:39 -08:00
Sanjay Patel 09b1c56366 [LoopUtils] do not initialize Cmp predicate unnecessarily; NFC
The switch must set the predicate correctly; anything else
should lead to unreachable/assert.

I'm trying to fix FMF propagation here and the callers,
so this is a preliminary cleanup.
2021-01-26 11:22:51 -05:00
Florian Hahn 1272f16d14
[LoopUnswitch] Avoid partially unswitching too aggressively.
This patch adds additional checks to avoid partial unswitching
in cases where it won't be profitable, e.g. because the path directly
exits the loop anyways.
2021-01-26 15:18:41 +00:00
Florian Hahn 35b3989a30
[Passes] Run peeling as part of simple/full loop unrolling.
Loop peeling removes conditions from loop bodies that become invariant
after a small number of iterations. When triggered, this leads to fewer
compares and possibly PHIs in loop bodies, enabling further
optimizations. The current cost-model of loop peeling should be quite
conservative/safe, i.e. only peel if a condition in the loop becomes
known after peeling.

For example, see PR47671, where loop peeling enables vectorization by
removing a PHI the vectorizer does not understand. Granted, the
loop-vectorizer could also be taught about constant PHIs, but loop
peeling is likely to enable other optimizations as well.

This has an impact on quite a few benchmarks from
MultiSource/SPEC2000/SPEC2006 on X86 with -O3 -flto, for example

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsVectorized

    Program                                        base   patch  diff
     test-suite...ve-susan/automotive-susan.test     8.00   9.00 12.5%
     test-suite...nal/skidmarks10/skidmarks.test    35.00  31.00 -11.4%
     test-suite...lications/sqlite3/sqlite3.test    41.00  43.00  4.9%
     test-suite...s/ASC_Sequoia/AMGmk/AMGmk.test    25.00  26.00  4.0%
     test-suite...006/450.soplex/450.soplex.test    88.00  89.00  1.1%
     test-suite...TimberWolfMC/timberwolfmc.test   120.00 119.00 -0.8%
     test-suite.../CINT2006/403.gcc/403.gcc.test   215.00 216.00  0.5%
     test-suite...006/447.dealII/447.dealII.test   957.00 958.00  0.1%
     test-suite...ternal/HMMER/hmmcalibrate.test    75.00  75.00  0.0%

    Same hash: 186 (filtered out)
    Remaining: 51
    Metric: loop-vectorize.LoopsAnalyzed

    Program                                        base    patch   diff
     test-suite...ks/Prolangs-C/agrep/agrep.test   440.00  434.00  -1.4%
     test-suite...nal/skidmarks10/skidmarks.test   312.00  308.00  -1.3%
     test-suite...marks/7zip/7zip-benchmark.test   6399.00 6323.00 -1.2%
     test-suite...lications/minisat/minisat.test   134.00  135.00   0.7%
     test-suite...rks/FreeBench/pifft/pifft.test   295.00  297.00   0.7%
     test-suite...TimberWolfMC/timberwolfmc.test   1879.00 1869.00 -0.5%
     test-suite...pplications/treecc/treecc.test   689.00  691.00   0.3%
     test-suite...T2000/300.twolf/300.twolf.test   1593.00 1597.00  0.3%
     test-suite.../Benchmarks/Bullet/bullet.test   1394.00 1392.00 -0.1%
     test-suite...ications/JM/ldecod/ldecod.test   1431.00 1429.00 -0.1%
     test-suite...6/464.h264ref/464.h264ref.test   2229.00 2230.00  0.0%
     test-suite...lications/sqlite3/sqlite3.test   2590.00 2589.00 -0.0%
     test-suite...ications/JM/lencod/lencod.test   2732.00 2733.00  0.0%
     test-suite...006/453.povray/453.povray.test   3395.00 3394.00 -0.0%

Note the -11% regression in number of loops vectorized for skidmarks. I
suspect this corresponds to the fact that those loops are gone now (see
the reduction in number of loops analyzed by LV).

Reviewed By: lebedev.ri

Differential Revision: https://reviews.llvm.org/D88471
2021-01-26 13:52:30 +00:00
Sergey Dmitriev 13cedcaf45 [llvm-link] Fix crash when materializing appending global
This patch fixes llvm-link crash when materializing global variable
with appending linkage and initializer that depends on another
global with appending linkage.

Reviewed By: tra

Differential Revision: https://reviews.llvm.org/D95329
2021-01-25 18:08:07 -08:00
modimo ce7f9cdb50 [InlineAdvisor] Allow replay of inline decisions for the CGSCC inliner from optimization remarks
This change leverages the work done in D83743 to replay in the SampleProfile inliner to also be used in the CGSCC inliner. NOTE: currently restricted to non-ML advisors only.

The added switch `-cgscc-inline-replay=<remarks file>` will replay the inlining decisions in that file where the remarks file is generated via `-Rpass=inline`. The aim here is to make it easier to analyze changes that would modify inlining heuristics to be separated from this behavior. Doing so allows easier examination of assembly and runtime behavior compared to the baseline rather than trying to dig through the large churn caused by inlining.

In LTO compilation, since inlining is done twice you can separately specify replay by passing the flag to the FE (`-cgscc-inline-replay=`) and to the linker (`-Wl,cgscc-inline-replay=`) with the remarks generated from their respective places.

Testing on mysqld by comparing the inline decisions between base (generates remarks.txt) and diff (replay using identical input/tools with remarks.txt) and examining the inlining sites with `diff` shows 14,000 mismatches out of 247,341 for a ~94% replay accuracy. I believe this gap can be narrowed further though for the general case we may never achieve full accuracy. For my personal use, this is close enough to be representative: I set the baseline as the one generated by the replay on identical input/toolset and compare that to my modified input/toolset using the same replay.

Testing:
ninja check-llvm
newly added test correctly replays CGSCC inlining decisions

Reviewed By: mtrofin, wenlei

Differential Revision: https://reviews.llvm.org/D94334
2021-01-25 15:38:57 -08:00
Nikita Popov 835104a114 [LSR] Drop potentially invalid nowrap flags when switching to post-inc IV (PR46943)
When LSR converts a branch on the pre-inc IV into a branch on the
post-inc IV, the nowrap flags on the addition may no longer be valid.
Previously, a poison result of the addition might have been ignored,
in which case the program was well defined. After branching on the
post-inc IV, we might be branching on poison, which is undefined behavior.

Fix this by discarding nowrap flags which are not present on the SCEV
expression. Nowrap flags on the SCEV expression are proven by SCEV
to always hold, independently of how the expression will be used.
This is essentially the same fix we applied to IndVars LFTR, which
also performs this kind of pre-inc to post-inc conversion.

I believe a similar problem can also exist for getelementptr inbounds,
but I was not able to come up with a problematic test case. The
inbounds case would have to be addressed in a differently anyway
(as SCEV does not track this property).

Fixes https://bugs.llvm.org/show_bug.cgi?id=46943.

Differential Revision: https://reviews.llvm.org/D95286
2021-01-25 23:13:48 +01:00
Richard Smith 925ae8c790 Revert "[ObjC][ARC] Annotate calls with attributes instead of emitting retainRV"
This reverts commit 53176c1680, which
introduceed a layering violation. LLVM's IR library can't include
headers from Analysis.
2021-01-25 13:53:38 -08:00
Akira Hatanaka 53176c1680 [ObjC][ARC] Annotate calls with attributes instead of emitting retainRV
or claimRV calls in the IR

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end annotates calls with attribute "clang.arc.rv"="retain"
  or "clang.arc.rv"="claim", which indicates the call is implicitly
  followed by a marker instruction and a retainRV/claimRV call that
  consumes the call result. This is currently done only when the target
  is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the
  annotated calls in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the annotated
  calls. It doesn't remove the attribute on the call since the backend
  needs it to emit the marker instruction. The retainRV/claimRV calls
  are emitted late in the pipeline to prevent optimization passes from
  transforming the IR in a way that makes it harder for the ARC
  middle-end passes to figure out the def-use relationship between the
  call and the retainRV/claimRV calls (which is the cause of PR31925).

- The function inliner removes the autoreleaseRV call in the callee that
  returns the result if nothing in the callee prevents it from being
  paired up with the calls annotated with "clang.arc.rv"="retain/claim"
  in the caller. If the call is annotated with "claim", a release call
  is inserted since autoreleaseRV+claimRV is equivalent to a release. If
  it cannot find an autoreleaseRV call, it tries to transfer the
  attributes to a function call in the callee. This is important since
  ARC optimizer can remove the autoreleaseRV call returning the callee
  result, which makes it impossible to pair it up with the retainRV or
  claimRV call in the caller. If that fails, it simply emits a retain
  call in the IR if the call is annotated with "retain" and does nothing
  if it's annotated with "claim".

- This patch teaches dead argument elimination pass not to change the
  return type of a function if any of the calls to the function are
  annotated with attribute "clang.arc.rv". This is necessary since the
  pass can incorrectly determine nothing in the IR uses the function
  return, which can happen since the front-end no longer explicitly
  emits retainRV/claimRV calls in the IR, and change its return type to
  'void'.

Future work:

- Use the attribute on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the attributes.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-01-25 11:57:08 -08:00
Florian Hahn 76afbf60ed
[VPlan] Replace uses with new value in VPInstructionsToVPRecipe (NFC).
Now that VPRecipeBase inherits from VPDef, we can always use the new
VPValue for replacement, if the recipe defines one. Given the recipes
that are supported at the moment, all new recipes must have either 0 or
1 defined values.
2021-01-25 19:38:08 +00:00
Nick Desaulniers d36812892c [GVN] do not repeat PRE on failure to split critical edge
Fixes an infinite loop encountered in GVN.

GVN will delay PRE if it encounters critical edges, attempt to split
them later via calls to SplitCriticalEdge(), then restart.

The caller of GVN::splitCriticalEdges() assumed a return value of true
meant that critical edges were split, that the IR had changed, and that
PRE should be re-attempted, upon which we loop infinitely.

This was exposed after D88438, by compiling the Linux kernel for s390,
but the test case is reproducible on x86.

Fixes: https://github.com/ClangBuiltLinux/linux/issues/1261

Reviewed By: void

Differential Revision: https://reviews.llvm.org/D94996
2021-01-25 11:23:44 -08:00
Wei Mi c9cd9a0066 [SampleFDO] Report error when reading a bad/incompatible profile instead of
turning off SampleFDO silently.

Currently sample loader pass turns off SampleFDO optimization silently when
it sees error in reading the profile. This behavior will defeat the tests
which could have caught those bad/incompatible profile problems. This patch
change the behavior to report error.

Differential Revision: https://reviews.llvm.org/D95269
2021-01-25 10:28:23 -08:00
Xun Li 17c3538aef Revert "Fix unused variable in CoroFrame.cpp when building Release with GCC 10"
This reverts commit ff5e896425.
2021-01-25 08:37:45 -08:00
Florian Hahn 3201274dea
[VPlan] Handle scalarized values in VPTransformState.
This patch adds plumbing to handle scalarized values directly in
VPTransformState.

Reviewed By: gilr

Differential Revision: https://reviews.llvm.org/D92282
2021-01-25 14:21:56 +00:00
Sanjay Patel 09a136bcc6 [InstCombine] narrow min/max intrinsics with extended inputs
We can sink extends after min/max if they match and would
not change the sign-interpreted compare. The only combo
that doesn't work is zext+smin/smax because the zexts
could change a negative number into positive:
https://alive2.llvm.org/ce/z/D6sz6J

Sext+umax/umin works:

  define i32 @src(i8 %x, i8 %y) {
  %0:
    %sx = sext i8 %x to i32
    %sy = sext i8 %y to i32
    %m = umax i32 %sx, %sy
    ret i32 %m
  }
  =>
  define i32 @tgt(i8 %x, i8 %y) {
  %0:
    %m = umax i8 %x, %y
    %r = sext i8 %m to i32
    ret i32 %r
  }
  Transformation seems to be correct!
2021-01-25 07:52:50 -05:00
Sander de Smalen 171d12489f [SLPVectorizer] NFC: Migrate getVectorCallCosts to use InstructionCost.
This change also changes getReductionCost to return InstructionCost,
and it simplifies two expressions by removing a redundant 'isValid' check.
2021-01-25 12:27:01 +00:00
Nikita Popov 8b9df70bf7 [Utils] Use NoAliasScopeDeclInst in a few more places (NFC)
In the cloning infrastructure, only track an MDNode mapping,
without explicitly storing the Metadata mapping, same as is done
during inlining. This makes things slightly simpler.
2021-01-24 16:24:11 +01:00
Sanjay Patel 77adbe6a8c [SLP] fix fast-math requirements for fmin/fmax reductions
a6f0221276 enabled intersection of FMF on reduction instructions,
so it is safe to ease the check here.

There is still some room to improve here - it looks like we
have nearly duplicate flags propagation logic inside of the
LoopUtils helper but it is limited targets that do not form
reduction intrinsics (they form the shuffle expansion).
2021-01-24 08:55:56 -05:00
Jeroen Dobbelaere dcc7706fcf [InstCombine] Remove unused llvm.experimental.noalias.scope.decl
A @llvm.experimental.noalias.scope.decl is only useful if there is !alias.scope and !noalias metadata that uses the declared scope.
When that is not the case for at least one of the two, the intrinsic call can as well be removed.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D95141
2021-01-24 13:55:50 +01:00
Jeroen Dobbelaere 659c7bcde6 [LoopRotate] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed
Similar to D92887, LoopRotation also needs duplicate the noalias scopes when rotating a `@llvm.experimental.noalias.scope.decl` across a block boundary.
This is based on the version from the Full Restrict paches (D68511).

The problem it fixes also showed up in Transforms/Coroutines/ex5.ll after D93040 (when enabling strict checking with -verify-noalias-scope-decl-dom).

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D94306
2021-01-24 13:53:13 +01:00
Jeroen Dobbelaere 774629641b [LoopUnroll] Use llvm.experimental.noalias.scope.decl for duplicating noalias metadata as needed
This is a fix for https://bugs.llvm.org/show_bug.cgi?id=39282. Compared to D90104, this version is based on part of the full restrict patched (D68484) and uses the `@llvm.experimental.noalias.scope.decl` intrinsic to track the location where !noalias and !alias.scope scopes have been introduced. This allows us to only duplicate the scopes that are really needed.

Notes:
- it also includes changes and tests from D90104

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D92887
2021-01-24 13:48:20 +01:00
Roman Lebedev 6f2753273e
[NFC][SimplifyCFG] Extract CloneInstructionsIntoPredecessorBlockAndUpdateSSAUses() out of PerformBranchToCommonDestFolding()
To be used in PerformValueComparisonIntoPredecessorFolding()
2021-01-24 00:54:55 +03:00
Roman Lebedev 67f9c87a65
[NFC][SimplifyCFG] Perform early-continue in FoldValueComparisonIntoPredecessors() per-pred loop 2021-01-24 00:54:54 +03:00
Roman Lebedev a4e6c2e647
[NFC][SimplifyCFG] Extract PerformValueComparisonIntoPredecessorFolding() out of FoldValueComparisonIntoPredecessors()
Less nested code is much easier to follow and modify.
2021-01-24 00:54:54 +03:00
Nikita Popov c83cff45c7 [IR] Add NoAliasScopeDeclInst (NFC)
Add an intrinsic type class to represent the
llvm.experimental.noalias.scope.decl intrinsic, to make code
working with it a bit nicer by hiding the metadata extraction
from view.
2021-01-23 22:40:32 +01:00
Kazu Hirata 1238378f18 [llvm] Use pop_back_val (NFC) 2021-01-23 10:56:33 -08:00
Florian Hahn d60b74c28a
[InstCombine] Set MadeIRChange in replaceInstUsesWith.
Some utilities used by InstCombine, like SimplifyLibCalls, may add new
instructions and replace the uses of a call, but return nullptr because
the inserted call produces multiple results.

Previously, the replaced library calls would get removed by
InstCombine's deleter, but after
292077072e this may not happen, if the
willreturn attribute is missing.

As a work-around, update replaceInstUsesWith to set MadeIRChange, if it
replaces any uses. This catches the cases where it is used as replacer
by utilities used by InstCombine and seems useful in general; updating
uses will modify the IR.

This fixes an expensive-check failure when replacing
@__sinpif/@__cospifi with @__sincospif_sret.
2021-01-23 17:52:59 +00:00
Sanjay Patel a6f0221276 [SLP] fix fast-math-flag propagation on FP reductions
As shown in the test diffs, we could miscompile by
propagating flags that did not exist in the original
code.

The flags required for fmin/fmax reductions will be
fixed in a follow-up patch.
2021-01-23 11:17:20 -05:00
Florian Hahn 292077072e
[Local] Treat calls that may not return as being alive.
With the addition of the `willreturn` attribute, functions that may
not return (e.g. due to an infinite loop) are well defined, if they are
not marked as `willreturn`.

This patch updates `wouldInstructionBeTriviallyDead` to not consider
calls that may not return as dead.

This patch still provides an escape hatch for intrinsics, which are
still assumed as willreturn unconditionally. It will be removed once
all intrinsics definitions have been reviewed and updated.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D94106
2021-01-23 16:05:14 +00:00
Roman Lebedev 022da61f6b
[SimplifyCFG] Change 'LoopHeaders' to be ArrayRef<WeakVH>, not a naked set, thus avoiding dangling pointers
If i change it to AssertingVH instead, a number of existing tests fail,
which means we don't consistently remove from the set when deleting blocks,
which means newly-created blocks may happen to appear in that set
if they happen to occupy the same memory chunk as did some block
that was in the set originally.

There are many places where we delete blocks,
and while we could probably consistently delete from LoopHeaders
when deleting a block in transforms located in SimplifyCFG.cpp itself,
transforms located elsewhere (Local.cpp/BasicBlockUtils.cpp) also may
delete blocks, and it doesn't seem good to teach them to deal with it.

Since we at most only ever delete from LoopHeaders,
let's just delegate to WeakVH to do that automatically.

But to be honest, personally, i'm not sure that the idea
behind LoopHeaders is sound.
2021-01-23 16:48:35 +03:00
Jeroen Dobbelaere 2b9a834c43 [InlineFunction] Use llvm.experimental.noalias.scope.decl for noalias arguments.
Insert a llvm.experimental.noalias.scope.decl intrinsic that identifies where a noalias argument was inlined.

This patch includes some refactorings from D90104.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D93040
2021-01-23 12:10:57 +01:00
Zequan Wu 867bdfeff1 [InstCombine] remove incompatible attribute when simplifying some lib calls
Like D95088, remove incompatible attribute in more lib calls.

Differential Revision: https://reviews.llvm.org/D95278
2021-01-22 17:27:36 -08:00
Philip Reames ef51eed37b [LoopDeletion] Handle inner loops w/untaken backedges
This builds on the restricted after initial revert form of D93906, and adds back support for breaking backedges of inner loops. It turns out the original invalidation logic wasn't quite right, specifically around the handling of LCSSA.

When breaking the backedge of an inner loop, we can cause blocks which were in the outer loop only because they were also included in a sub-loop to be removed from both loops. This results in the exit block set for our original parent loop changing, and thus a need for new LCSSA phi nodes.

This case happens when the inner loop has an exit block which is also an exit block of the parent, and there's a block in the child which reaches an exit to said block without also reaching an exit to the parent loop.

(I'm describing this in terms of the immediate parent, but the problem is general for any transitive parent in the nest.)

The approach implemented here involves a potentially expensive LCSSA rebuild.  Perf testing during review didn't show anything concerning, but we may end up needing to revert this if anyone encounters a practical compile time issue.

Differential Revision: https://reviews.llvm.org/D94378
2021-01-22 16:31:29 -08:00
Francis Visoiu Mistrih 0cc38acfc4 [Matrix] Propagate shape information through fneg
Similar to binary operators like fadd/fmul/fsub, propagate shape info
through unary operators (fneg is the only one?).

Differential Revision: https://reviews.llvm.org/D95252
2021-01-22 14:34:28 -08:00
Roman Lebedev 1742203844
[SimplifyCFG] FoldBranchToCommonDest(): re-lift restrictions on liveout uses of bonus instructions
I have previously tried doing that in
b33fbbaa34 / d38205144f,
but eventually it was pointed out that the approach taken there
was just broken wrt how the uses of bonus instructions are updated
to account for the fact that they should now use either bonus instruction
or the cloned bonus instruction. In particluar, all that manual handling
of PHI nodes in successors was just wrong.

But, the fix is actually much much simpler than my initial approach:
just tell SSAUpdate about both instances of bonus instruction,
and let it deal with all the PHI handling.

Alive2 confirms that the reproducers from the original bugs (@pr48450*)
are now handled correctly.

This effectively reverts commit 59560e8589,
effectively relanding b33fbbaa34.
2021-01-23 01:29:05 +03:00
Roman Lebedev eae1cc0de5
[NFC][SimplifyCFG] PerformBranchToCommonDestFolding(): move instruction cloning to after CFG update
This simplifies follow-up patch, and is NFC otherwise.
2021-01-23 01:29:04 +03:00
Roman Lebedev 9bd8bcf993
[NFC][SimplifyCFG] PerformBranchToCommonDestFolding(): fix instruction name preservation
NewBonusInst just took name from BonusInst, so BonusInst has no name,
so BonusInst.getName() makes no sense.
So we need to ask NewBonusInst for the name.
2021-01-23 01:29:03 +03:00
Shimin Cui 99a0aa07e9 [Analysis] Support AIX vec_malloc routines
This is to support the memory routines vec_malloc, vec_calloc, vec_realloc, and vec_free. These routines manage memory that is 16-byte aligned. And they are only available on AIX.

Differential Revision: https://reviews.llvm.org/D94710
2021-01-22 16:03:01 -05:00
Nikita Popov 45b259f995 [SimplifyLibCalls] Skip unused calls in sincos transform
If the call result is unused, we should let it get DCEd rather
than replacing it. Also, don't try to replace an existing sincos
with another one (unless it's as part of combining sin and cos).

This avoids an infinite combine loop if the calls are not DCEd
as expected, which can happen with D94106 and lack of willreturn
annotation in hand-crafted IR.
2021-01-22 20:57:13 +01:00
Sanjay Patel 411c144e4c [InstCombine] narrow abs with sign-extended input
In the motivating cases from https://llvm.org/PR48816 ,
we have a trailing trunc. But that is not required to
reduce the abs width:
https://alive2.llvm.org/ce/z/ECaz-p
...as long as we clear the int-min-is-poison bit (nsw).

We have some existing tests that are affected, and I'm
not sure what the overall implications are, but in general
we favor narrowing operations over preserving nsw/nuw.

If that causes problems, we could restrict this transform
based on type (shouldChangeType() and/or vector vs. scalar).

Differential Revision: https://reviews.llvm.org/D95235
2021-01-22 13:36:04 -05:00
Florian Hahn 86991d3231
[LoopUnswitch] Fix logic to avoid unswitching with atomic loads.
The existing code did not deal with atomic loads correctly. Such loads
are represented as MemoryDefs. Bail out on any MemoryAccess that is not
a MemoryUse.
2021-01-22 15:10:12 +00:00
Arnold Schwaighofer 87b628dadd [coro.async] Make sure we process async coroutines
Because we were not looking for the llvm.coro.id.async intrinsic in the
early coro pass which triggers follow-up passes we relied on the
llvm.coro.end intrinsic being present. This might not be the case in
functions that end in unreachable code.

Differential Revision: https://reviews.llvm.org/D95144
2021-01-22 07:04:01 -08:00
Roman Lebedev 85e7578c6d
Revert "[NFCI-ish][SimplifyCFG] FoldBranchToCommonDest(): really don't deal with uncond branches"
Does not build in XCode:
http://green.lab.llvm.org/green/job/clang-stage1-RA/17963/consoleFull#-1704658317a1ca8a51-895e-46c6-af87-ce24fa4cd561

This reverts commit aabed3718a.
2021-01-22 17:37:11 +03:00
Roman Lebedev d1a6f92fd5
[InstCombine] Fold `(~x) | y` --> `~(x & (~y))` iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.

Note that we could position this transformation as just hoisting
of the `not` (still, iff y is freely negatible), but the test changes
show a number of regressions, so let's not do that.
2021-01-22 17:23:54 +03:00
Roman Lebedev 79b0d21ce9
[InstCombine] Fold `(~x) & y` --> `~(x | (~y))` iff it is free to do so
Iff we know we can get rid of the inversions in the new pattern,
we can thus get rid of the inversion in the old pattern,
this decreasing instruction count.
2021-01-22 17:23:54 +03:00
Roman Lebedev 4ed0d8f2f0
[NFC][InstCombine] Extract freelyInvertAllUsersOf() out of canonicalizeICmpPredicate()
I'd like to use it in an upcoming fold.
2021-01-22 17:23:53 +03:00
Roman Lebedev efeb8caf8b
[NFC][SimplifyCFG] FoldBranchToCommonDest(): extract the actual transform into helper function
I'm intentionally structuring it this way, so that the actual fold only
does the fold, and no legality/correctness checks, all of which must be
done by the caller. This allows for the fold code to be more compact
and more easily grokable.
2021-01-22 17:23:53 +03:00
Roman Lebedev b482560a59
[NFC][SimplifyCFG] FoldBranchToCommonDest(): extract check for destination sharing into a helper function
As a follow-up, i'll extract the actual transform into a function,
and this helper will be called from both places,
so this avoids code duplication.
2021-01-22 17:23:53 +03:00
Roman Lebedev 7b89efb55e
[NFC][SimplifyCFG] FoldBranchToCommonDest(): somewhat better structure weight updating code
Hoist the successor updating out of the code that deals with branch
weight updating, and hoist the 'has weights' check from the latter,
making code more consistent and easier to follow.
2021-01-22 17:23:41 +03:00
Roman Lebedev 256a035752
[NFC][SimplifyCFG] FoldBranchToCommonDest(): unclutter Cond/CondInPred handling
We don't need those variables, we can just get the final value directly.
2021-01-22 17:23:11 +03:00
Roman Lebedev aabed3718a
[NFCI-ish][SimplifyCFG] FoldBranchToCommonDest(): really don't deal with uncond branches
While we already ignore uncond branches, we could still potentially
end up with a conditional branches with identical destinations
due to the visitation order, or because we were called as an utility.
But if we have such a disguised uncond branch,
we still probably shouldn't deal with it here.
2021-01-22 17:23:10 +03:00
Roman Lebedev 0895b836d7
[SimplifyCFG] FoldBranchToCommonDest(): don't deal with unconditional branches
The case where BB ends with an unconditional branch,
and has a single predecessor w/ conditional branch
to BB and a single successor of BB is exactly the pattern
SpeculativelyExecuteBB() transform deals with.
(and in this case they both allow speculating only a single instruction)

Well, or FoldTwoEntryPHINode(), if the final block
has only those two predecessors.

Here, in FoldBranchToCommonDest(), only a weird subset of that
transform is supported, and it's glued on the side in a weird way.
  In particular, it took me a bit to understand that the Cond
isn't actually a branch condition in that case, but just the value
we allow to speculate (otherwise it reads as a miscompile to me).
  Additionally, this only supports for the speculated instruction
to be an ICmp.

So let's just unclutter FoldBranchToCommonDest(), and leave
this transform up to SpeculativelyExecuteBB(). As far as i can tell,
this shouldn't really impact optimization potential, but if it does,
improving SpeculativelyExecuteBB() will be more beneficial anyways.

Notably, this only affects a single test,
but EarlyCSE should have run beforehand in the pipeline,
and then FoldTwoEntryPHINode() would have caught it.

This reverts commit rL158392 / commit d33f4efbfd.
2021-01-22 17:22:49 +03:00
Anton Rapetov a4914dc1f2 [SLP] do not traverse constant uses
Walking the use list of a Constant (particularly, ConstantData)
is not scalable, since a given constant may be used by many
instructinos in many functions in many modules.

Differential Revision: https://reviews.llvm.org/D94713
2021-01-22 08:14:09 -05:00
David Sherwood 2e080eb00a [SVE] Add support for scalable vectorization of loops with selects and cmps
I have removed an unnecessary assert in LoopVectorizationCostModel::getInstructionCost
that prevented a cost being calculated for select instructions when using
scalable vectors. In addition, I have changed AArch64TTIImpl::getCmpSelInstrCost
to only do special cost calculations for fixed width vectors and fall
back to the base version for scalable vectors.

I have added a simple cost model test for cmps and selects:

  test/Analysis/CostModel/sve-cmpsel.ll

and some simple tests that show we vectorize loops with cmp and select:

  test/Transforms/LoopVectorize/AArch64/sve-basic-vec.ll

Differential Revision: https://reviews.llvm.org/D95039
2021-01-22 09:48:13 +00:00
Xun Li bd3ca6666d [Inlining] Delete redundant optnone/alwaysinline check
The same check is done in InlineCost: 8b0bd54d0e/llvm/lib/Analysis/InlineCost.cpp (L2537-L2552)
Also, doing a check on the callee here is confusing, because anything that deals with callee should be done in the inner loop where we proecss all calls from the same caller.

Differential Revision: https://reviews.llvm.org/D95186
2021-01-21 18:38:10 -08:00
David Green 39db5753f9 [LV][ARM] Inloop reduction cost modelling
This adds cost modelling for the inloop vectorization added in
745bf6cf44. Up until now they have been modelled as the original
underlying instruction, usually an add. This happens to works OK for MVE
with instructions that are reducing into the same type as they are
working on. But MVE's instructions can perform the equivalent of an
extended MLA as a single instruction:

  %sa = sext <16 x i8> A to <16 x i32>
  %sb = sext <16 x i8> B to <16 x i32>
  %m = mul <16 x i32> %sa, %sb
  %r = vecreduce.add(%m)
  ->
  R = VMLADAV A, B

There are other instructions for performing add reductions of
v4i32/v8i16/v16i8 into i32 (VADDV), for doing the same with v4i32->i64
(VADDLV) and for performing a v4i32/v8i16 MLA into an i64 (VMLALDAV).
The i64 are particularly interesting as there are no native i64 add/mul
instructions, leading to the i64 add and mul naturally getting very
high costs.

Also worth mentioning, under NEON there is the concept of a sdot/udot
instruction which performs a partial reduction from a v16i8 to a v4i32.
They extend and mul/sum the first four elements from the inputs into the
first element of the output, repeating for each of the four output
lanes. They could possibly be represented in the same way as above in
llvm, so long as a vecreduce.add could perform a partial reduction. The
vectorizer would then produce a combination of in and outer loop
reductions to efficiently use the sdot and udot instructions. Although
this patch does not do that yet, it does suggest that separating the
input reduction type from the produced result type is a useful concept
to model. It also shows that a MLA reduction as a single instruction is
fairly common.

This patch attempt to improve the costmodelling of in-loop reductions
by:
 - Adding some pattern matching in the loop vectorizer cost model to
   match extended reduction patterns that are optionally extended and/or
   MLA patterns. This marks the cost of the reduction instruction correctly
   and the sext/zext/mul leading up to it as free, which is otherwise
   difficult to tell and may get a very high cost. (In the long run this
   can hopefully be replaced by vplan producing a single node and costing
   it correctly, but that is not yet something that vplan can do).
 - getExtendedAddReductionCost is added to query the cost of these
   extended reduction patterns.
 - Expanded the ARM costs to account for these expanded sizes, which is a
   fairly simple change in itself.
 - Some minor alterations to allow inloop reduction larger than the highest
   vector width and i64 MVE reductions.
 - An extra InLoopReductionImmediateChains map was added to the vectorizer
   for it to efficiently detect which instructions are reductions in the
   cost model.
 - The tests have some updates to show what I believe is optimal
   vectorization and where we are now.

Put together this can greatly improve performance for reduction loop
under MVE.

Differential Revision: https://reviews.llvm.org/D93476
2021-01-21 21:03:41 +00:00
Sanjay Patel 2f03528f5e [SLP] rename reduction variable to avoid shadowing; NFC
The code structure can likely be improved now that
'OperationData' is gone.
2021-01-21 16:02:38 -05:00
Anton Rapetov bfec9148a0 Scalar: Don't visit constants in findInnerReductionPhi in LoopInterchange
In LoopInterchange, `findInnerReductionPhi()` looks for reduction
variables, which cannot be constants. Update it to return early in that
case.

This also addresses a blocker for removing use-lists from ConstantData,
whose users could be spread across arbitrary modules in the same
LLVMContext.

Differential Revision: https://reviews.llvm.org/D94712
2021-01-21 12:33:06 -08:00
Sanjay Patel d77753381f [SLP] simplify reduction matching
This is NFC-intended and removes the "OperationData"
class which had become nothing more than a recurrence
(reduction) type.

I adjusted the matching logic to distinguish
instructions from non-instructions - that's all that
the "IsLeafValue" member was keeping track of.
2021-01-21 14:58:57 -05:00
Nikita Popov 65fd034b95 [FunctionAttrs] Infer willreturn for functions without loops
If a function doesn't contain loops and does not call non-willreturn
functions, then it is willreturn. Loops are detected by checking
for backedges in the function. We don't attempt to handle finite
loops at this point.

Differential Revision: https://reviews.llvm.org/D94633
2021-01-21 20:29:33 +01:00
Sanjay Patel 070af1b788 [InstCombine] avoid crashing on attribute propagation
In https://llvm.org/PR48810 , we are crashing while trying to
propagate attributes from mempcpy (returns void*) to memcpy
(returns nothing - void).

We can avoid the crash by removing known incompatible
attributes for the void return type.

I'm not sure if this goes far enough (should we just drop all
attributes since this isn't the same function?). We also need
to audit other transforms in LibCallSimplifier to make sure
there are no other cases that have the same problem.

Differential Revision: https://reviews.llvm.org/D95088
2021-01-21 08:13:26 -05:00
Florian Hahn bee486851c
[LoopUnswitch] Implement first version of partial unswitching.
This patch applies the idea from D93734 to LoopUnswitch.

It adds support for unswitching on conditions that are only
invariant along certain paths through a loop.

In particular, it targets conditions in the loop header that
depend on values loaded from memory. If either path from
the true or false successor through the loop does not modify
memory, perform partial loop unswitching.

That is, duplicate the instructions feeding the condition in the pre-header.
Then unswitch on the duplicated condition. The condition is now known
in the unswitched version for the 'invariant' path through the original loop.

On caveat of this approach is that one of the loops created can be partially
unswitched again. To avoid this behavior, `llvm.loop.unswitch.partial.disable`
metadata is added to the unswitched loops, to avoid subsequent partial
unswitching.

If that's the approach to go, I can move the code handling the metadata kind
into separate functions.

This increases the cases we unswitch quite a bit in SPEC2006/SPEC2000 &
MultiSource. It also allows us to eliminate a dead loop in SPEC2017's omnetpp

```
Tests: 236
Same hash: 170 (filtered out)
Remaining: 66
Metric: loop-unswitch.NumBranches

Program                                        base   patch  diff
 test-suite...000/255.vortex/255.vortex.test     2.00  23.00 1050.0%
 test-suite...T2006/401.bzip2/401.bzip2.test     7.00  55.00 685.7%
 test-suite :: External/Nurbs/nurbs.test         5.00  26.00 420.0%
 test-suite...s-C/unix-smail/unix-smail.test     1.00   3.00 200.0%
 test-suite.../Prolangs-C++/ocean/ocean.test     1.00   3.00 200.0%
 test-suite...tions/lambda-0.1.3/lambda.test     1.00   3.00 200.0%
 test-suite...yApps-C++/PENNANT/PENNANT.test     2.00   5.00 150.0%
 test-suite...marks/Ptrdist/yacr2/yacr2.test     1.00   2.00 100.0%
 test-suite...lications/viterbi/viterbi.test     1.00   2.00 100.0%
 test-suite...plications/d/make_dparser.test    12.00  24.00 100.0%
 test-suite...CFP2006/433.milc/433.milc.test    14.00  27.00 92.9%
 test-suite.../Applications/lemon/lemon.test     7.00  12.00 71.4%
 test-suite...ce/Applications/Burg/burg.test     6.00  10.00 66.7%
 test-suite...T2006/473.astar/473.astar.test    16.00  26.00 62.5%
 test-suite...marks/7zip/7zip-benchmark.test    78.00 121.00 55.1%
```

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D93764
2021-01-21 09:46:41 +00:00
Kazu Hirata e53472de68 [Transforms] Use llvm::append_range (NFC) 2021-01-20 21:35:54 -08:00
Kazu Hirata 8f5da41c4d [llvm] Construct SmallVector with iterator ranges (NFC) 2021-01-20 21:35:52 -08:00
Dávid Bolvanský bb3f169b59 [BuildLibcalls, Attrs] Support more variants of C++'s new, add attributes for C++'s delete
Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D95095
2021-01-21 00:12:37 +01:00
Mircea Trofin ccec2cf1d9 Reland "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor"
This reverts commit d97f776be5.

The original problem was due to build failures in shared lib builds. D95079
moved ImportedFunctionsInliningStatistics under Analysis, unblocking
this.
2021-01-20 13:33:43 -08:00
Mircea Trofin 95ce32c787 [NFC] Move ImportedFunctionsInliningStatistics to Analysis
This is related to D94982. We want to call these APIs from the Analysis
component, so we can't leave them under Transforms.

Differential Revision: https://reviews.llvm.org/D95079
2021-01-20 13:18:03 -08:00
Nikita Popov 1c6d1e57c1 [PredicateInfo] Handle logical and/or
Teach PredicateInfo to handle logical and/or the same way as
bitwise and/or. This allows handling logical and/or inside IPSCCP
and NewGVN.
2021-01-20 21:03:07 +01:00
Nikita Popov ca4ed1e7ae [PredicateInfo] Generalize processing of conditions
Branch/assume conditions in PredicateInfo are currently handled in
a rather ad-hoc manner, with some arbitrary limitations. For example,
an `and` of two `icmp`s will be handled, but an `and` of an `icmp`
and some other condition will not. That also includes the case where
more than two conditions and and'ed together.

This patch makes the handling more general by looking through and/ors
up to a limit and considering all kinds of conditions (though operands
will only be taken for cmps of course).

Differential Revision: https://reviews.llvm.org/D94447
2021-01-20 20:40:41 +01:00
Mircea Trofin d97f776be5 Revert "[NPM][Inliner] Factor ImportedFunctionStats in the InlineAdvisor"
This reverts commit e8aec763a5.
2021-01-20 11:19:34 -08:00