5179 Commits

Author SHA1 Message Date
minglotus-6 e2074de6a8 [ProfSampleLoader] When disable-sample-loader-inlining is true, merge profiles of inlined instances to outlining versions.
When --disable-sample-loader-inlining is true, skip inline transformation, but merge profiles of inlined instances to outlining versions.

Differential Revision: https://reviews.llvm.org/D121862
2022-03-23 13:02:48 -07:00
Alexandros Lamprineas a687f96b0f [FuncSpec][NFC] Clang-format the source code and fix debug typo. 2022-03-23 14:39:58 +00:00
serge-sans-paille a53b689f0c Fix missing include under -DEXPENSIVE_CHECK
Regression introduced by f1985a3f85
2022-03-22 10:37:56 +01:00
serge-sans-paille f1985a3f85 Cleanup includes: Transforms/IPO
Preprocessor output diff: -238205 lines
Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122183
2022-03-22 10:06:28 +01:00
Hirochika Matsumoto 86f970e595 [IROutliner][NFC] Fix typo in doc of findOrCreatePHIInBlock
Typo Fix in Documentation

Author: hkmatsumoto

Reviewers: AndrewLitteken

Differential Revision: https://reviews.llvm.org/D121627
2022-03-21 12:34:20 -05:00
Andrew Litteken 4e500df89e [IROutliner] Fix phi nodes when self referential within block but doesn't contain branch
When outlining a phi node, if the incoming branch is a block contained in the region and the branch from that block is not outlined, we create broken code. The fix is to recognize when that branch from the included incoming block is not contained, and ignore the region.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D121311
2022-03-21 11:05:15 -05:00
Andrew Litteken 38e8880e93 [IROutliner] Do not outline from functions with optnone
Since the IROutliner is performing an optimization, it should not outline from functions explicitly marked with optnone. This adds an extra check and test to make sure this does not occur.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D121567
2022-03-20 23:39:23 -05:00
Kazu Hirata bce1bf0ee2 [Transform] Apply clang-tidy fixes for readability-redundant-smartptr-get (NFC) 2022-03-20 10:41:22 -07:00
Johannes Doerfert 4166738c38 [OpenMP][FIX] Do not crash when kernels are debug wrapper functions
With debug information enabled (-g) Clang will wrap the actual target
region into a new function which is called from the "kernel". The problem
is that the "kernel" is now basically a wrapper without all the things
we expect. More importantly, if we end up asking for an AAKernelInfo
for the "target region function" we might try to turn it into SPMD mode.
That used to cause an assertion as that function doesn't have an
appropriately named `_exec_mode` global. While the global is going away
soon we still need to make sure to properly handle this case, e.g.,
perform optimizations reliably.

Differential Revision: https://reviews.llvm.org/D122043
2022-03-19 14:15:55 -05:00
Fangrui Song c6692f819e [GlobalOpt] Don't replace alias with aliasee if either alias/aliasee may be preemptible
Generalize D99629 for ELF. A default visibility non-local symbol is preemptible
in a -shared link. `isInterposable` is an insufficient condition.

Moreover, a non-preemptible alias may be referenced in a sub constant expression
which intends to lower to a PC-relative relocation. Replacing the alias with a
preemptible aliasee may introduce a linker error.

Respect dso_preemptable and suppress optimization to fix the above issues. With
the change, `alias = 345` will not be rewritten to use aliasee in a `-fpic`
compile.
```
int aliasee;
extern int alias __attribute__((alias("aliasee"), visibility("hidden")));
void foo() { alias = 345; } // intended to access the local copy
```

While here, refine the condition for the alias as well.

For some binary formats like COFF, `isInterposable` is a sufficient condition.
But I think canonicalization for the changed case has little advantage, so I
don't bother to add the `Triple(M.getTargetTriple()).isOSBinFormatELF()` or
`getPICLevel/getPIELevel` complexity.

For instrumentations, it's recommended not to create aliases that refer to
globals that have weak linkage or are preemptible. However, the following is
supported and the IR needs to handle such cases.
```
int aliasee __attribute__((weak));
extern int alias __attribute__((alias("aliasee")));
```

There are other places where GlobalAlias isInterposable usage may need to be
fixed.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D107249
2022-03-18 14:17:05 -07:00
Paul Kirth 964398ccb1 Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"""
This reverts commit 6cf560d69a.
2022-03-18 00:21:33 +00:00
Paul Kirth 6cf560d69a Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""
I mistakenly reverted my commit, so I'm relanding it.

This reverts commit 10866a1df4.
2022-03-18 00:04:22 +00:00
Paul Kirth 10866a1df4 Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit e7749d4713.
2022-03-17 23:54:26 +00:00
Paul Kirth e7749d4713 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) For IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Differential Revision: https://reviews.llvm.org/D115907
2022-03-17 23:46:23 +00:00
Johannes Doerfert 4308fdf83b [Attributor] Remove more non-deterministic behavior and debug output 2022-03-17 17:42:32 -05:00
Johannes Doerfert 59a6b668ab [OpenMP][FIX] Initialize member to avoid undefined value in debug output 2022-03-17 17:42:32 -05:00
Johannes Doerfert 88ea86c369 [Attributor][FIX] Remove reference into map that might dangle
The reference was taken and the map was modified after. This can (and
did) lead to dangling pointers and all sorts of problems afterwards.
2022-03-17 17:42:32 -05:00
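The hazard described in 88ea86c369 (holding a reference into a map while the map is modified) can be illustrated with a minimal sketch. `FlatMap`, `getOrInsert`, and `copyEntry` are hypothetical names invented for this illustration and are not Attributor code; the sketch only assumes a container whose storage may move when it grows.

```cpp
#include <utility>
#include <vector>

// Hypothetical flat map whose storage may reallocate on insertion, so
// references to existing values can dangle once a new key is added.
struct FlatMap {
  std::vector<std::pair<int, int>> Items;

  // Returns a reference to the value for Key, inserting 0 if it is absent.
  int &getOrInsert(int Key) {
    for (auto &KV : Items)
      if (KV.first == Key)
        return KV.second;
    Items.emplace_back(Key, 0);
    return Items.back().second;
  }
};

// Copies the value stored under SrcKey to DstKey without keeping a reference
// alive across the insertion that may reallocate the storage.
inline void copyEntry(FlatMap &M, int SrcKey, int DstKey) {
  int SrcValue = M.getOrInsert(SrcKey); // Copy the value, not a reference.
  M.getOrInsert(DstKey) = SrcValue;     // May grow the map; the copy is safe.
}
```

Had `SrcValue` been declared as `int &`, the second `getOrInsert` call could have invalidated it before the assignment.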
Ellis Hoag f6b5142ac2 [AlwaysInliner] Emit inline remark only when successful
Failures in `InlineFunction()` are caught after D121722, but `emitInlinedIntoBasedOnCost()` should only be called when inlining is successful. This also removes an unnecessary call to `shouldInline()` which always returned `InlineCost::getAlways()`.

Reviewed By: kyulee, nikic

Differential Revision: https://reviews.llvm.org/D121946
2022-03-17 15:40:24 -07:00
Andrew Litteken f7d90ad57b [IROutliner] Make sure that loop debug info is stripped.
As pointed out in https://github.com/llvm/llvm-project/issues/54155#issuecomment-1057465479, there was a crash when loop info was being outlined. It was not being properly stripped and adjusted, so would point to the wrong location. This uses similar logic found in the CodeExtractor to adjust the loop debug info.

Reviewer: fhahn, paquette

Differential Revision: https://reviews.llvm.org/D120869
2022-03-17 14:41:53 -06:00
Ellis Hoag 84c6689b15 [AlwaysInliner] Check inliner errors even without asserts
When we build clang without asserts we should still check the result of
`InlineFunction()` to be sure there wasn't an error. Otherwise we could
incorrectly merge attributes in the next line.

This also removes a redundant call to `getCaller()`.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121722
2022-03-17 10:16:23 -07:00
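The point made in 84c6689b15 is that a check expressed only through `assert` disappears in builds without asserts. A minimal sketch of the alternative, an explicit branch on the result, follows. `Result`, `tryInline`, and `inlineAndMerge` are hypothetical stand-ins and do not reproduce LLVM's `InlineFunction()` interface.

```cpp
#include <string>

// Hypothetical stand-in for the outcome of an inlining attempt.
struct Result {
  bool Success;
  std::string Reason;
};

// Hypothetical operation that may fail.
inline Result tryInline(bool CanInline) {
  return CanInline ? Result{true, ""} : Result{false, "unsupported callee"};
}

// Returns true only if the follow-up step (modeled by MergedCount) ran.
inline bool inlineAndMerge(bool CanInline, int &MergedCount) {
  Result R = tryInline(CanInline);
  // An assert(R.Success) here would compile to nothing when NDEBUG is
  // defined, so the failure path is handled with an ordinary branch.
  if (!R.Success)
    return false;
  ++MergedCount; // Only reached after a successful inline.
  return true;
}
```

With this shape, the step that follows inlining cannot run on a failed attempt, regardless of how the compiler was built.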
Florian Hahn e5822ded56 [FunctionAttrs] Infer argmemonly.
This patch adds initial argmemonly inference, by checking the underlying
objects of locations returned by MemoryLocation.

I think this should cover most cases, except function calls to other
argmemonly functions.

I'm not sure if there's a reason why we don't infer those yet.

Additional argmemonly can improve codegen in some cases. It also makes
it easier to come up with a C reproducer for 7662d1687b (already fixed,
but I'm trying to see if C/C++ fuzzing could help to uncover similar
issues.)

Compile-time impact:
NewPM-O3: +0.01%
NewPM-ReleaseThinLTO: +0.03%
NewPM-ReleaseLTO+g: +0.05%

https://llvm-compile-time-tracker.com/compare.php?from=067c035012fc061ad6378458774ac2df117283c6&to=fe209d4aab5b593bd62d18c0876732ddcca1614d&stat=instructions

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121415
2022-03-16 10:24:33 +00:00
Florian Hahn 014f5bcf7a [FunctionAttrs] Replace MemoryAccessKind with FMRB.
Update FunctionAttrs to use FunctionModRefBehavior instead of
MemoryAccessKind.

This allows for adding support for inferring argmemonly and others,
see D121415.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D121460
2022-03-15 19:35:54 +00:00
Nikita Popov 875782bd9e [OpenMPOpt] Avoid pointer element type access during region merging
Hardcode the function type as ParallelTask, which is the guaranteed
pointee type of this runtime function argument (if pointee types
exist). The elimination of the callee bitcast is left for InstCombine.

Differential Revision: https://reviews.llvm.org/D120885
2022-03-15 09:52:46 +01:00
Andrew Litteken 228cc2c38b [IROutliner] Ensure merged PHINodes respect order and incoming blocks, not just incoming values
When matching PHINodes when merging functions, the IROutliner only checks that an incoming value exists in a phi node in the overall function. It doesn't check the length, the order, or that the incoming block also matches. In the given example, we see that both phi nodes have the same incoming values, but from different blocks.

The fix is to enforce a stricter match of the incoming value, and of the incoming block as well, when matching the created phi nodes.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D121310
2022-03-14 16:48:21 -05:00
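The difference between the loose and the strict phi matching described in 228cc2c38b can be sketched with a toy model. Here a phi node is just an ordered list of (value, block) name pairs; `Phi`, `looseMatch`, and `strictMatch` are hypothetical names for this illustration and are not the IROutliner's data structures.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy model: one incoming entry is a (value name, block name) pair.
using Incoming = std::pair<std::string, std::string>;
using Phi = std::vector<Incoming>;

// Loose check: every incoming value of A appears somewhere in B. Length,
// order, and incoming blocks are ignored.
inline bool looseMatch(const Phi &A, const Phi &B) {
  for (const Incoming &In : A) {
    bool Found = false;
    for (const Incoming &Other : B)
      if (Other.first == In.first)
        Found = true;
    if (!Found)
      return false;
  }
  return true;
}

// Strict check: same length, same order, same value and same block per entry.
inline bool strictMatch(const Phi &A, const Phi &B) { return A == B; }
```

Two phis that carry `%a` and `%b` from swapped blocks pass `looseMatch` but fail `strictMatch`, which is the situation the commit message describes.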
Nick Desaulniers 236695e70c [IRLinker] make IRLinker::AddLazyFor optional (llvm::unique_function). NFC
2 of the 3 callsites of IRMover::move() pass empty lambda functions. Just
make this parameter llvm::unique_function.

Came about via discussion in D120781. Probably worth making this change
regardless of the resolution of D120781.

Reviewed By: dexonsmith

Differential Revision: https://reviews.llvm.org/D121630
2022-03-14 14:37:34 -07:00
Andrew Litteken c79ab1065e [IROutliner] Separate split PHI nodes from multiple exits by different outlinable regions.
The IR Outliner is supposed to extract the outputs contained in an external phi node and place them into a phi node contained within the outlined function. However, when the output values of two outlined functions with two different output sets are contained within the same phi node, they are counted as the same exit path when first analyzed. In reality, these create two different phi nodes, creating an inconsistency, resulting in a mismatch in the expected number of output paths and a crash.  This fixes that counting when analyzing the outputs by also analyzing the incoming blocks rather than just the incoming values.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D121313
2022-03-14 14:56:59 -05:00
Teresa Johnson fee0bde4c6 [WPD] Extend checking mode to support fallback to indirect call
Extend -wholeprogramdevirt-check to support both the existing
trapping mode on an incorrect devirtualization, as well as a new
mode to fall back to an indirect call on a mismatch.

The new mode is useful in cases where we want to enable
devirtualization but cannot fully guarantee whole program visibility
(e.g., in the case where LTO has been disabled for a small set of objects
that could potentially override virtual methods without having a symbol
reference to anything in the base class including the vtable).

Remove !prof and !callees metadata (which are used by indirect call
promotion) from both the new direct call and the fallback indirect call
(so that we don't perform another round of promotion on the latter).
Also remove it from the direct call in the non-fallback cases, which
was an oversight, although it didn't seem to cause any issues. Add tests
for the metadata removal covering the various cases.

Differential Revision: https://reviews.llvm.org/D121419
2022-03-14 10:16:28 -07:00
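The two checking modes described in fee0bde4c6 can be pictured as a guarded call. The sketch below is a source-level analogy only: the pass works on IR, and `guardedCall`, `CheckMode`, and the two target functions are hypothetical names introduced for this illustration.

```cpp
#include <cstdlib>

using Fn = int (*)(int);

// Hypothetical targets; the first plays the role of the devirtualized callee.
inline int expectedTarget(int X) { return X + 1; }
inline int overridingTarget(int X) { return X * 2; }

enum class CheckMode { Trap, Fallback };

// Calls the expected target directly when the runtime pointer matches it.
// On a mismatch, either aborts (Trap) or makes the indirect call (Fallback).
inline int guardedCall(Fn Actual, int Arg, CheckMode Mode) {
  if (Actual == &expectedTarget)
    return expectedTarget(Arg); // Direct call on a correct guess.
  if (Mode == CheckMode::Trap)
    std::abort();               // Existing mode: trap on a wrong guess.
  return Actual(Arg);           // New mode: fall back to the indirect call.
}
```

In `Fallback` mode a call through `overridingTarget` still produces the overriding result, so an incorrect devirtualization costs a comparison rather than correctness.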
Andrew Litteken 3c90812f3b [IROutliner] Avoid reusing PHINodes that have already been matched when merging outlined functions' phi node blocks
When there are two external phi nodes for two different outlined regions, when compressing the created phi nodes between the two regions, the matching for the second phi node in the second region matches the first phi node created for the first region rather than the second phi node created for the first region. This adds an extra output path where there should not be one.

The fix is to ignore phi nodes that have already been matched for each region.

Reviewer: paquette

Differential Revision: https://reviews.llvm.org/D121312
2022-03-14 12:00:01 -05:00
Nikita Popov 3ec44c22b1 [DeadArgElim] Guard against function type mismatch
If the call function type and function type don't match, we should
consider the function live (there is effectively a bitcast
sitting in between).
2022-03-14 13:03:04 +01:00
Johannes Doerfert 85daf6973d [Attributor] Remove capture tracker usage and follow uses explicitly
Before we used the capture tracker to follow pointer uses, now we do it
explicitly ourselves through the Attributor API. There are multiple
benefits: For one, the boilerplate is cut down by a lot. The class,
potential copies vector, etc. is all not needed anymore. We also do
avoid explicitly looking through memory here, something that was
duplicated and should only live in the `checkForAllUses` helper. More
importantly, as we do simplifications we need to make sure all parties
are in sync when they reason about uses. The old way did not allow us to
do this but the new one does as every use visiting AA goes through
`checkForAllUses` now.
2022-03-11 22:56:16 -06:00
Johannes Doerfert f44f60a297 [Attributor] Avoid replacing return operands twice
As replacements will become more complex it is better to have a single
AA responsible for replacing a use. Before this patch AAValueSimplify*
and AAValueSimplifyReturned could both try to replace the returned
value. The latter was marginally better for the old pass manager
when a function was already carrying a `returned` attribute and when
the context of the return instruction was important. The second
shortcoming was resolved by looking for return attributes in the
AAValueSimplifyCallSiteReturned initialization. The old PM impact is
not concerning.

This is yet another step towards the removal of AAReturnedValues, the
very first AA we should now try to eliminate due to the overlapping
logic with value simplification.
2022-03-11 21:55:19 -06:00
Johannes Doerfert 55a970fbd4 [Attributor][FIX] Make sure to not ignore non-load users of stores
When we look through memory for a store we used to allow any other use
of the memory that is reachable. This is generally OK but we need to
make sure to actually let the user look at these properly. For now,
we simply require loads (via exact reloads).
2022-03-11 18:41:13 -06:00
Johannes Doerfert f3ad8cf00e [Attributor] Cleanup manifest and liveness for CGSCC passes
There was some ad-hoc handling of liveness and manifest to avoid
breaking CGSCC guarantees. Things always slipped through though.
This cleanup will:

1) Prevent us from manifesting any "information" outside the CGSCC.
   This might be too conservative but we need to opt-in to annotation
   not try to avoid some problematic ones.
2) Avoid running any liveness analysis outside the CGSCC. We did have
   some AAIsDeadFunction handling to this end but we need this for all
   AAIsDead classes. The reason is that AAIsDead information is only
   correct if we actually manifest it, since we don't (see point 1) we
   cannot actually derive/use it at all. We are currently trying to
   avoid running any AA updates outside the CGSCC but that seems to
   impact things quite a bit.
3) Assert, don't check, that our modifications (during cleanup) modify
   only CGSCC functions.
2022-03-11 16:46:02 -06:00
Johannes Doerfert 9ddb1a49ac [Attributor][FIX] Avoid double free (and useless state copy)
In an attempt to remove the memory leak we introduced a double free.
The problem was that we allowed a plain copy of the state and it was
actually used. The use was useless, so it is gone now. The copy
constructor is gone as well. The move constructor ensures the Accesses
pointers are owned by a single state, I hope.

Reported by: https://lab.llvm.org/buildbot/#/builders/16/builds/25820
2022-03-11 10:10:36 -06:00
Johannes Doerfert 3570b0c5c7 [Attributor][FIX] Remove memory leak
The leak was introduced when we made things deterministic. It was
reported by the sanitizer buildbot:
 https://lab.llvm.org/buildbot/#/builders/168
2022-03-11 09:52:44 -06:00
Florian Hahn e07b899192 [FunctionAttrs] Rename addReadAttrs -> addMemoryAttrs.
The addReadAttrs name is out of date, as the function also adds
the writeonly attribute. addMemoryAttrs is more accurate.
2022-03-11 11:49:22 +00:00
Johannes Doerfert e8fadafe77 [Attributor][NFCI] Make AAPointerInfo deterministic
The order in which we kept accesses was non-deterministic and a debug
output was a pointer value. Fixed both.
2022-03-10 23:27:47 -06:00
Johannes Doerfert 7211dbd01d [Attributor][NFCI] Remove non-deterministic behavior and debug output 2022-03-10 23:27:47 -06:00
Alok Kumar Sharma 94823500a7 [DebugInfo][SROA] Correct debug info for global variables in case of SROA
The existing handling produced a crash for the test case (attached with the patch).
Now the function transferSRADebugInfo is modified to
  - Ignore the current variable if it starts after the current Fragment.
  - Ignore the current variable if it ends before the current Fragment.
  - Generate (!DIExpression()) if current variable completely fits the
    current Fragment.
  - Otherwise (as earlier), generate the DW_OP_LLVM_fragment in IR if current
    Fragment partially defines current variable.

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D121107
2022-03-10 00:41:30 +05:30
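The four cases listed in 94823500a7 amount to an interval classification. The sketch below is one literal reading of that list, not the body of `transferSRADebugInfo`. It assumes bit offsets and half-open ranges, and `FragmentAction` and `classifyVariable` are hypothetical names.

```cpp
#include <cstdint>

enum class FragmentAction { Skip, EmptyExpression, FragmentExpression };

// Offsets and sizes are in bits. Both ranges are treated as half-open,
// [Start, Start + Size), which is an assumption of this sketch.
inline FragmentAction classifyVariable(uint64_t VarStart, uint64_t VarSize,
                                       uint64_t FragStart, uint64_t FragSize) {
  const uint64_t VarEnd = VarStart + VarSize;
  const uint64_t FragEnd = FragStart + FragSize;
  if (VarStart >= FragEnd)
    return FragmentAction::Skip; // Variable starts after the fragment.
  if (VarEnd <= FragStart)
    return FragmentAction::Skip; // Variable ends before the fragment.
  if (VarStart >= FragStart && VarEnd <= FragEnd)
    return FragmentAction::EmptyExpression; // Variable fits the fragment.
  return FragmentAction::FragmentExpression; // Partial overlap.
}
```

The first two branches correspond to the two "ignore" cases, the third to the plain `!DIExpression()`, and the last to the `DW_OP_LLVM_fragment` case.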
Andrew Litteken 0b3a6c8d20 [IROutliner] Handling outlined code with no exit paths
As a result of adding multiblock outlining, it became possible to outline the entirety of a basic block, and branches that only pointed to the basic blocks contained in the outlined section. This means that there are no exit paths, and no return statement. There was a previous assertion from the older version of the outliner that explicitly made sure there was a return statement. This removes that assertion.

Reviewers: paquette

Differential Revision: https://reviews.llvm.org/D120868
2022-03-09 10:43:48 -08:00
Nikita Popov f682a8386b [Attributor] Use byval type instead of pointer element type
For compatibility with opaque pointers, use the byval type rather
than the pointer element type.

Differential Revision: https://reviews.llvm.org/D120983
2022-03-09 09:30:42 +01:00
Arthur Eubanks 53e5e58670 [NewPM][Inliner] Make inlined calls to functions in same SCC as callee exponentially expensive
Introduce a new attribute "function-inline-cost-multiplier" which
multiplies the inline cost of a call site (or all calls to a callee) by
the multiplier.

When processing the list of calls created by inlining, check each call
to see if the new call's callee is in the same SCC as the original
callee. If so, set the "function-inline-cost-multiplier" attribute of
the new call site to double the original call site's attribute value.
This does not happen when the original call site is intra-SCC.

This is an alternative to D120584, which marks the call sites as
noinline.

Hopefully fixes PR45253.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D121084
2022-03-07 23:51:09 -08:00
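The doubling rule in 53e5e58670 can be sketched as a small function over SCC identifiers. `CallInfo`, `multiplierForNewCall`, and `effectiveCost` are hypothetical names, SCCs are modeled as plain integers, and the default of 1 for calls that do not qualify is an assumption of this sketch rather than something the commit message states.

```cpp
#include <cstdint>

// Hypothetical model of a call site; SCCs are identified by integer ids.
struct CallInfo {
  int CallerSCC;
  int CalleeSCC;
  int64_t Multiplier; // Value of the multiplier attribute (1 if absent).
};

// Multiplier for a call created by inlining Orig, given the new callee's SCC.
inline int64_t multiplierForNewCall(const CallInfo &Orig, int NewCalleeSCC) {
  const bool OrigIsIntraSCC = Orig.CallerSCC == Orig.CalleeSCC;
  if (!OrigIsIntraSCC && NewCalleeSCC == Orig.CalleeSCC)
    return Orig.Multiplier * 2; // Same SCC as the original callee: double.
  return 1; // Assumption: other new calls keep the default multiplier.
}

// The multiplier scales the inline cost of the call site.
inline int64_t effectiveCost(int64_t BaseCost, int64_t Multiplier) {
  return BaseCost * Multiplier;
}
```

Each round of inlining that stays within the original callee's SCC doubles the multiplier, so after k rounds the cost is scaled by 2^k, which is the "exponentially expensive" behavior named in the title.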
Johannes Doerfert 5b4acb20ff [OpenMP][FIX] Ensure flag to disable de-globalization works properly
If the user disables de-globalization we did not seed the AAHeapToShared
and AAHeapToStack but we still could end up with them through in-flight
lookups. With this patch we disable AAHeapToShared completely if the
user disabled de-globalization. Heap-2-stack is still run though.

Differential Revision: https://reviews.llvm.org/D121059
2022-03-07 23:43:05 -06:00
Nikita Popov 0636c93d3e [Attributor] Remove restriction on simplifying function pointers
Dropping this restriction seems to work fine (there are no assertion
failures), so it appears that either the updater got smarter or the
problematic cases are restricted elsewhere.

If doing this still causes issues, then the place to address it
would probably be 8f5bdaf481/llvm/lib/Transforms/IPO/Attributor.cpp (L1856-L1859),
which already prevents replacement outside the SCC, so I'm not
quite sure what this check is intended to avoid.

Differential Revision: https://reviews.llvm.org/D120987
2022-03-07 11:54:37 +01:00
Nikita Popov a9b03d9e2e [Attributor] Remove function pointer restriction for AAAlign
This check is not compatible with opaque pointers. We can avoid
it by adjusting the getPointerAlignment() implementation to avoid
creating unnecessary ptrtoint expressions for bitcasted pointers.
The code already uses OnlyIfReduced to not create an expression
if it does not simplify, and this makes sure that folding a
bitcast and ptrtoint into a ptrtoint doesn't count as a
simplification.

Differential Revision: https://reviews.llvm.org/D120904
2022-03-07 10:02:45 +01:00
Johannes Doerfert 5af11ec34b [Attributor] Determine potentially loaded values through memory
We already look through memory to determine where a value that is stored
might pop up again (potential copies). This patch introduces the other
direction with similar logic. If a value is loaded, we can follow all
the accesses to the pointer (or better object) and try to determine what
value might have been stored.
2022-03-06 23:26:37 -06:00
Johannes Doerfert eb73af4af4 [Attributor] Handle undef and null in AAAlignFloating
Both `undef` and `nullptr` are maximally aligned. This is especially
important as we often see `undef` until a proper value has been
identified during simplification.
2022-03-06 23:26:22 -06:00
Johannes Doerfert ad26e199ff [Attributor] Use CFG reasoning also for read accesses
With D106397 we used CFG reasoning to filter out writes that will not
interfere with a given load instruction. With this patch we use the
same logic (modulo the reversal in reachability check order) for store
instructions. As an example, we can now prove stores to shared memory
are dead if all the loads of the shared memory are not reachable from
them.
2022-03-06 23:26:22 -06:00
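The reachability argument in ad26e199ff (a store is dead if no load of the same memory is reachable from it) can be sketched at basic-block granularity. The CFG encoding and the names `isReachable` and `storeIsDead` are hypothetical. The Attributor reasons about instructions and uses its own reachability queries, which this sketch does not model.

```cpp
#include <queue>
#include <vector>

// Hypothetical CFG encoding: successor lists indexed by block id.
using CFG = std::vector<std::vector<int>>;

// Breadth-first search from From; a block is treated as reaching itself.
inline bool isReachable(const CFG &G, int From, int To) {
  std::vector<bool> Seen(G.size(), false);
  std::queue<int> Work;
  Work.push(From);
  Seen[From] = true;
  while (!Work.empty()) {
    int B = Work.front();
    Work.pop();
    if (B == To)
      return true;
    for (int S : G[B]) {
      if (!Seen[S]) {
        Seen[S] = true;
        Work.push(S);
      }
    }
  }
  return false;
}

// A store is considered dead if none of the blocks containing loads of the
// same memory can be reached from the block containing the store.
inline bool storeIsDead(const CFG &G, int StoreBlock,
                        const std::vector<int> &LoadBlocks) {
  for (int L : LoadBlocks)
    if (isReachable(G, StoreBlock, L))
      return false;
  return true;
}
```

Treating a load in the store's own block as reachable is a conservative simplification; it can only keep a store alive, never remove one incorrectly.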
Johannes Doerfert acb3773491 [Attributor] Improve isValidAtPosition (mostly for old PM)
To minimize the test difference between old and new PM we perform some
local dominance check if no dominator tree is available.
2022-03-06 23:26:21 -06:00
Johannes Doerfert ff758372bd [Attributor][NFCI] Introduce fine-grained anonymous namespaces 2022-03-06 21:28:38 -06:00