Commit Graph

6160 Commits

Author SHA1 Message Date
Nikita Popov 36fdfaba19 [RelLookupTableConverter] Ensure that GV, GEP and load types match
This code could be generalized to be type-independent, but for now
just ensure that the same type constraints are enforced with opaque
pointers as with typed pointers.
2022-02-17 12:05:05 +01:00
Roman Lebedev 371fcb720e
[SimplifyCFG][PhaseOrdering] Defer lowering switch into an integer range comparison and branch until after at least the IPSCCP
That transformation is lossy, as discussed in
https://github.com/llvm/llvm-project/issues/53853
and https://github.com/rust-lang/rust/issues/85133#issuecomment-904185574

This is an alternative to D119839,
which would add a limited IPSCCP into SimplifyCFG.

Unlike lowering switch to lookup, we still want this transformation
to happen relatively early, but after giving a chance for the things
like CVP to do their thing. It seems like deferring it just until
the IPSCCP is enough for the tests at hand, but perhaps we need to
be more aggressive and disable it until CVP.

Fixes https://github.com/llvm/llvm-project/issues/53853
Refs. https://github.com/rust-lang/rust/issues/85133

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D119854
2022-02-17 12:13:55 +03:00
Florian Mayer c195addb60 [NFC] [MTE] [HWASan] Remove unnecessary member of AllocaInfo
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119981
2022-02-16 15:19:30 -08:00
Nikita Popov c9032f1a69 [LowerMemIntrinsics] Explicitly use i8 type in memmove lowering
By convention, memcpy/memmove intrinsics are always used with i8
pointers (though this is not enforced), so in practice this code
was always using an i8 type. Make that explicit.

Of course, i8 is not a very profitable choice, and this code could
be more performant by picking an appropriate larger type. But that
would require additional test coverage and correctness review, and
certainly shouldn't be a decision based on the pointer element type.
2022-02-16 16:31:55 +01:00
Max Kazantsev bfc1217119 [NFC] Introduce option to switch off compatible invokes merge
Does not affect default behavior (transform is on).
2022-02-15 21:51:03 +07:00
Florian Mayer 8de457eafc [HWASAN] use common alignAndPadAlloca
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119614
2022-02-14 15:28:32 -08:00
Florian Mayer 205308de6b [NFC] [MTE] Move alignAndPadAlloca to MemoryTaggingSupport.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119610
2022-02-14 14:54:04 -08:00
Florian Mayer 6759cdd829 [NFC] [MTE] Use helpers for stack tagging.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119503
2022-02-11 16:01:46 -08:00
Florian Mayer bf2f72fa10 [hwasan] keep debug intrinsicts in AllocaInfo.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119498
2022-02-11 16:01:02 -08:00
Florian Mayer 26dbc47468 Revert "[hwasan] keep debug intrinsicts in AllocaInfo."
This reverts commit 19fdf85f58.
2022-02-11 14:41:24 -08:00
Florian Mayer b1bd64aeee Revert "[NFC] [MTE] Use helpers for stack tagging."
This reverts commit 8f0e5b4e26.
2022-02-11 14:41:24 -08:00
Florian Mayer 8f0e5b4e26 [NFC] [MTE] Use helpers for stack tagging.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119503
2022-02-11 10:59:09 -08:00
Florian Mayer 19fdf85f58 [hwasan] keep debug intrinsicts in AllocaInfo.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119498
2022-02-11 10:56:53 -08:00
Florian Mayer e7356fb3e2 [nfc] [hwasan] factor out logic to collect info about stack
this is the first step in unifying some of the logic between hwasan and
mte stack tagging. this only moves around code, changes to converge
different implementations of the same logic follow later.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118947
2022-02-11 10:54:12 -08:00
Sameer Sahasrabuddhe d8f99bb6e0 [AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216
2022-02-11 22:51:56 +05:30
Philip Reames 5ba115031d [PSE] Remove assumption that top level predicate is union from public interface [NFC*]
Note that this doesn't actually cause the top level predicate to become a non-union just yet.

The * above comes from a case in the LoopVectorizer where a predicate which is later proven no longer blocks vectorization due to a change from checking if predicates exists to whether the predicate is possibly false.
2022-02-10 16:14:52 -08:00
Philip Reames d39f4ac494 [SCEV] Unwind SCEVUnionPredicate from getPredicatedBackedgeTakenCount [NFC]
For those curious, the whole reason for tracking the predicate set seperately as opposed to just immediately registering the dependencies appears to be allowing the printing code to print a result without changing the PSE state.  It's slightly questionable if this justifies the complexity, but since we can preserve it with local ugliness, I did so.
2022-02-09 12:55:40 -08:00
Roman Lebedev c8ba2b67a0
[SimplifyCFG] 'merge compatible invokes': fully support indirect invokes
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
2022-02-08 21:29:38 +03:00
Roman Lebedev 414b47645d
[SimplifyCFG] 'merge compatible invokes': don't create trivial PHI's with all-identical incoming values 2022-02-08 21:29:38 +03:00
Philip Reames c302f1e677 [SCEV] Generalize SCEVEqualsPredicate to any compare [NFC]
PredicatedScalarEvolution has a predicate type for representing A == B.  This change generalizes it into something which can represent a A <pred> B.

This generality is currently unused, but is motivated by a couple of recent cases which have come up.  In particular, I'm currently playing around with using this to simplify the runtime checking code in LoopVectorizer. Regardless of the outcome of that prototyping, generalizing the compare node seemed useful.
2022-02-08 08:18:09 -08:00
Nikita Popov 074561a4a2 [Mem2Reg] Check that load type matches alloca type
Alloca promotion can only deal with cases where the load/store
types match the alloca type (it explicitly does not support
bitcasted load/stores).

With opaque pointers this is no longer enforced through the pointer
type, so add an explicit check.
2022-02-08 17:16:15 +01:00
Roman Lebedev 42ca7cc889
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ uses
If the original invokes had uses, the uses must have been in PHI's,
but that immediately results in the incoming values being incompatible.
But we'll replace uses of the original invokes with the use of the
merged invoke, so as long as the incoming values become compatible
after that, we can merge.
2022-02-08 17:49:38 +03:00
Roman Lebedev 9986d60224
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ PHIs but no uses
As long as the incoming values for all the invokes in the set
are identical, we can merge the invokes.
2022-02-08 17:49:38 +03:00
Roman Lebedev 8411560fd0
[SimplifyCFG] 'merge compatible invokes': support normal destination w/ no uses, no PHI's
Even if the invokes have normal destination, iff it's the same block,
we can merge them. For now, require that there are no PHI nodes,
and the returned values of invokes aren't used.
2022-02-08 17:49:38 +03:00
Kazu Hirata 3a3cb929ab [llvm] Use = default (NFC) 2022-02-06 22:18:35 -08:00
Kazu Hirata a1a8d10a17 [Transforms] Use default member initialization in LibCallSimplifier (NFC) 2022-02-06 16:36:27 -08:00
Kazu Hirata 3fce5bb7b0 [Transforms] Use default member initialization in LoopVersioning (NFC) 2022-02-06 16:36:25 -08:00
Kazu Hirata 2d650ee03e [Transforms] Use default member initialization in SCEVFindUnsafe (NFC) 2022-02-05 21:39:27 -08:00
Kazu Hirata e24384b506 [Transforms] Use default member initialization in SimplifyIndvar (NFC) 2022-02-05 16:29:22 -08:00
Bill Wendling c6f0940d99 [NFC] Remove unnecessary #includes
An attempt to reduce the number of files that are recompiled due to a change.

Differential Revision: https://reviews.llvm.org/D119055
2022-02-04 21:22:41 -08:00
Hongtao Yu dee058c670 [CSSPGO] Turn on ext-tsp by default for CSSPGO.
I'm seeing ext-tsp helps CSSPGO for our intern large benchmarks so I'm turning on it for CSSPGO. For non-CS AutoFDO, ext-tsp doesn't seem to help, probably because of lower profile counts quality.

Reviewed By: wenlei

Differential Revision: https://reviews.llvm.org/D119048
2022-02-04 19:46:44 -08:00
Roman Lebedev 18ff1ec3c3
Reland [SimplifyCFG] `markAliveBlocks()`: recognize that normal dest of `invoke`d `noreturn` function is `unreachable`
As per LangRef's definition of `noreturn` attribute:
```
noreturn
  This function attribute indicates that the function never returns
  normally, hence through a return instruction.
  This produces undefined behavior at runtime if the function
  ever does dynamically return. nnotated functions may still
  raise an exception, i.a., nounwind is not implied.
```

So if we `invoke` a `noreturn` function, and the normal destination
of an invoke is not an `unreachable`, point it at the new `unreachable`
block.

The change/fix from the original commit is that we now actually create
the new block, and don't just repurpose the original block,
because said normal destination block could have other users.

This reverts commit db1176ce66,
relanding commit 598833c987.
2022-02-05 02:58:19 +03:00
Roman Lebedev db1176ce66
Revert "[SimplifyCFG] `markAliveBlocks()`: recognize that normal dest of `invoke`d `noreturn` function is `unreachable`"
The normal destination may have other uses.

This reverts commit 598833c987.
2022-02-05 02:30:20 +03:00
Roman Lebedev 598833c987
[SimplifyCFG] `markAliveBlocks()`: recognize that normal dest of `invoke`d `noreturn` function is `unreachable`
As per LangRef's definition of `noreturn` attribute:
```
noreturn
  This function attribute indicates that the function never returns
  normally, hence through a return instruction.
  This produces undefined behavior at runtime if the function
  ever does dynamically return. nnotated functions may still
  raise an exception, i.a., nounwind is not implied.
```
2022-02-05 02:15:07 +03:00
Roman Lebedev 55cd727c9a
[SimplifyCFG] 'merge compatible invokes': allow PHI nodes in landing pads
... iff the incoming values for the invokes-to-be-merged
are compatible (identical).
2022-02-04 20:26:44 +03:00
Roman Lebedev 0d384e9228
[NFC][SimplifyCFG] Extract `IncomingValuesAreCompatible()` out of `SafeToMergeTerminators()` 2022-02-04 20:26:44 +03:00
Roman Lebedev 36df803dfd
[SimplifyCFG] Merge compatible `invoke`s of a `landingpad`
While nowadays SimplifyCFG knows how to hoist code from then-else blocks,
sink code from unconditional predecessors, and even promote the latter
by tail-merging `ret`/`resume` function terminators, that isn't everything.

While i (& others) have been trying to deal with merging/sinking `unreachable`,
apparently perhaps the more impactful remaining problem is merging the `throw`
calls.

If we start at the `landingpad`, all the predecessors are unwind edges of `invoke`s,
and in some cases some of the `invoke`s are mergeable.
```
/// This is a weird mix of hoisting and sinking. Visually, it goes from:
///          [...]        [...]
///            |            |
///        [invoke0]    [invoke1]
///           / \          / \
///     [cont0] [landingpad] [cont1]
/// to:
///      [...] [...]
///          \ /
///       [invoke]
///          / \
///     [cont] [landingpad]
```

This simplifies the IR/CFG, at the cost of debug info and extra PHI nodes.

Note that we don't require for *all* the `invokes` of the `landingpad`
to be mergeable, they can form more than a single set, we gracefully handle that.

For now, i completely disallowed normal destination, PHI nodes and indirect invokes
but that can be supported.

Out of all the CTMark projects, only 7zip is C++, so there isn't much impact:
https://llvm-compile-time-tracker.com/compare.php?from=ba8eb31bd9542828f6424e15a3014f80f14522c8&to=722fc871c84f14157d45c2159bc9c8c7e2825785&stat=size-total
... but there it currently causes size-total decrease.

Differential Revision: https://reviews.llvm.org/D117805
2022-02-04 17:04:21 +03:00
Simon Pilgrim 6b4ebdd46f ModuleUtils - VFABI::setVectorVariantNames - use ArrayRef<> instead of const SmallVector to pass argument 2022-02-03 12:11:48 +00:00
Roman Lebedev ee4ba9f3a1
Revert "[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`."
Unfortunately, it seems we really do need to take the long route;
start from the "merge" block, find (all the) "dispatch" blocks,
and deal with each "dispatch" block separately, instead of simply
starting from each "dispatch" block like it would logically make sense,
otherwise we run into a number of other missing folds around
`switch` formation, missing sinking/hoisting and phase ordering.

This reverts commit 85628ce75b.
This reverts commit c5fff90953.
This reverts commit 34a98e1046.
This reverts commit 1e353f0922.
2022-02-03 12:32:50 +03:00
Florian Mayer fa75a62cb5 [NFC] pull retvec logic to MemoryTaggingSupport.
we will also need this for aarch64 stack tagging.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118852
2022-02-02 16:05:52 -08:00
Fangrui Song 85628ce75b [SimplifyCFG] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds 2022-02-02 15:11:22 -08:00
Florian Mayer f7a6c341cb [mte] support more complicated lifetimes (e.g. for exceptions).
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118848
2022-02-02 14:39:22 -08:00
Florian Mayer 712b31e2d4 [NFC] factor isStandardLifetime out of HWASan
this is so we can use it for aarch64 stack tagging.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118836
2022-02-02 13:23:55 -08:00
Roman Lebedev c5fff90953
[NFC][SimplifyCFG] Merge `FoldTwoEntryPHINode()` into it's only callee 2022-02-02 17:53:56 +03:00
Roman Lebedev 34a98e1046
[NFC][SimplifyCFG] `FoldTwoEntryPHINode()`: s/BB/MergeBB/ 2022-02-02 17:53:56 +03:00
Roman Lebedev 1e353f0922
[SimplifyCFG] Start redesigning `FoldTwoEntryPHINode()`.
The current `FoldTwoEntryPHINode()` is not quite designed correctly.
It starts from the merge point, and then tries to detect
the 'divergence' point.

Because of that, it is limited to the simple two-predecessor case,
where the PHI completely goes away. but that is rather pessimistic,
and it doesn't make much sense from the costmodel side of things.

For example if there is some other unrelated predecessor of
the merge point,  we could split the merge point so that
the then/else blocks first branch to an empty block
and then to the merge point, and then we'd be able to speculate
the then/else code.

But if we'd instead simply start at the divergence point,
and look for the merge point, then we'll just natively support this case.

There's also the fact that `SpeculativelyExecuteBB()` already does
just that, but only if there is a single block to speculate,
and with a much more restrictive cost model.
But that also means we have code duplication.

Now, sadly, while this is as much NFCI as possible,
there is just no way to cleanly migrate to
the proper implementation. The results *are* going to be different
somewhat because of various phase ordering effects and SimplifyCFG
block iteration strategy.
2022-02-02 17:53:56 +03:00
serge-sans-paille e188aae406 Cleanup header dependencies in LLVMCore
Based on the output of include-what-you-use.

This is a big chunk of changes. It is very likely to break downstream code
unless they took a lot of care in avoiding hidden ehader dependencies, something
the LLVM codebase doesn't do that well :-/

I've tried to summarize the biggest change below:

- llvm/include/llvm-c/Core.h: no longer includes llvm-c/ErrorHandling.h
- llvm/IR/DIBuilder.h no longer includes llvm/IR/DebugInfo.h
- llvm/IR/IRBuilder.h no longer includes llvm/IR/IntrinsicInst.h
- llvm/IR/LLVMRemarkStreamer.h no longer includes llvm/Support/ToolOutputFile.h
- llvm/IR/LegacyPassManager.h no longer include llvm/Pass.h
- llvm/IR/Type.h no longer includes llvm/ADT/SmallPtrSet.h
- llvm/IR/PassManager.h no longer includes llvm/Pass.h nor llvm/Support/Debug.h

And the usual count of preprocessed lines:
$ clang++ -E  -Iinclude -I../llvm/include ../llvm/lib/IR/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l
before: 6400831
after:  6189948

200k lines less to process is no that bad ;-)

Discourse thread on the topic: https://llvm.discourse.group/t/include-what-you-use-include-cleanup

Differential Revision: https://reviews.llvm.org/D118652
2022-02-02 06:54:20 +01:00
Anna Thomas bc48a26655 [LoopPeel] Use reference instead of pointer for DT argument
Cleanup code in peelLoop API. We already have usage of DT without guarding
against a null DT, so this change constant folds the remaining null DT
checks.
Also make the argument a reference so that it is clear the argument is
a nonnull DT.
Extracted from D118472.
2022-02-01 17:00:08 -05:00
Nikita Popov 236fbf571d [GlobalStatus] Skip non-pointer dead constant users
Constant expressions with a non-pointer result type used an early
exit that bypassed the later dead constant user check, and resulted
in different optimization outcomes depending on whether dead users
were present or not.

This fixes the issue reported in https://reviews.llvm.org/D117223#3287039.
2022-02-01 15:51:32 +01:00
Fangrui Song 85dfe19b36 [ModuleUtils] Move EmbedBufferInModule to LLVMTransformsUtils
D116542 adds EmbedBufferInModule which introduces a layer violation
(https://llvm.org/docs/CodingStandards.html#library-layering).
See 2d5f857a1e for detail.

EmbedBufferInModule does not use BitcodeWriter functionality and should be moved
LLVMTransformsUtils. While here, change the function case to the prevailing
convention.

It seems that EmbedBufferInModule just follows the steps of
EmbedBitcodeInModule. EmbedBitcodeInModule calls WriteBitcodeToFile but has IR
update operations which ideally should be refactored to another library.

Reviewed By: jhuber6

Differential Revision: https://reviews.llvm.org/D118666
2022-01-31 16:33:57 -08:00