6465 Commits

Author SHA1 Message Date
Max Kazantsev 62f4572e45 [IndVars][NFC] Make IVOperand parameter an instruction 2022-07-13 19:07:16 +07:00
Max Kazantsev 30e33b4b81 [SCEV][NFC] Make getStrengthenedNoWrapFlagsFromBinOp return optional 2022-07-13 18:54:25 +07:00
Yuanfang Chen fcb7d76d65 [coroutine] add nomerge function attribute to `llvm.coro.save`
It is illegal to merge two `llvm.coro.save` calls unless their
`llvm.coro.suspend` users are also merged. Marks it "nomerge" for
the moment.

This reverts D129025.

Alternative to D129025, which affects other token type users like WinEH.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D129530
2022-07-12 10:39:38 -07:00
Nick Desaulniers 2240d72f15 [X86] initial -mfunction-return=thunk-extern support
Adds support for:
* `-mfunction-return=<value>` command line flag, and
* `__attribute__((function_return("<value>")))` function attribute

Where the supported <value>s are:
* keep (disable)
* thunk-extern (enable)

thunk-extern enables clang to change ret instructions into jmps to an
external symbol named __x86_return_thunk, implemented as a new
MachineFunctionPass named "x86-return-thunks", keyed off the new IR
attribute fn_ret_thunk_extern.

The symbol __x86_return_thunk is expected to be provided by the runtime
the compiled code is linked against and is not defined by the compiler.
Enabling this option alone doesn't provide mitigations without
corresponding definitions of __x86_return_thunk!

This new MachineFunctionPass is very similar to "x86-lvi-ret".

The <value>s "thunk" and "thunk-inline" are currently unsupported. It's
not clear yet that they are necessary: whether the thunk pattern they
would emit is beneficial or used anywhere.

Should the <value>s "thunk" and "thunk-inline" become necessary,
x86-return-thunks could probably be merged into x86-retpoline-thunks
which has pre-existing machinery for emitting thunks (which could be
used to implement the <value> "thunk").

Has been found to build+boot with corresponding Linux
kernel patches. This helps the Linux kernel mitigate RETBLEED.
* CVE-2022-23816
* CVE-2022-28693
* CVE-2022-29901

See also:
* "RETBLEED: Arbitrary Speculative Code Execution with Return
Instructions."
* AMD SECURITY NOTICE AMD-SN-1037: AMD CPU Branch Type Confusion
* TECHNICAL GUIDANCE FOR MITIGATING BRANCH TYPE CONFUSION REVISION 1.0
  2022-07-12
* Return Stack Buffer Underflow / Return Stack Buffer Underflow /
  CVE-2022-29901, CVE-2022-28693 / INTEL-SA-00702

SystemZ may eventually want to support "thunk-extern" and "thunk"; both
options are used by the Linux kernel's CONFIG_EXPOLINE.

This functionality has been available in GCC since the 8.1 release, and
was backported to the 7.3 release.

Many thanks to the folks who provided discreet review off list due to the
embargoed nature of this hardware vulnerability. Many Bothans died to
bring us this information.

Link: https://www.youtube.com/watch?v=IF6HbCKQHK8
Link: https://github.com/llvm/llvm-project/issues/54404
Link: https://gcc.gnu.org/legacy-ml/gcc-patches/2018-01/msg01197.html
Link: https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/advisory-guidance/return-stack-buffer-underflow.html
Link: https://arstechnica.com/information-technology/2022/07/intel-and-amd-cpus-vulnerable-to-a-new-speculative-execution-attack/?comments=1
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ce114c866860aa9eae3f50974efc68241186ba60
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00702.html
Link: https://www.intel.com/content/www/us/en/security-center/advisory/intel-sa-00707.html

Reviewed By: aaron.ballman, craig.topper

Differential Revision: https://reviews.llvm.org/D129572
2022-07-12 09:17:54 -07:00
Nikita Popov 3d475dfeb9 [Mem2Reg] Consistently preserve nonnull assume for uninit load
When performing a !nonnull load from uninitialized memory, we
should preserve the nonnull assume just like in all other cases.
We already do this correctly in the generic mem2reg code, but
don't handle this case when using the optimized single-block
implementation.

Make sure that the optimized implementation exhibits the same
behavior as the generic implementation.
2022-07-12 12:53:08 +02:00
Paul Osmialowski b17754bcaa [SimplifyLibCalls] refactor pow(x, n) expansion where n is a constant integer value
Since the backend's codegen is capable of expanding powi into fmul's, it
is no longer necessary to do so in the ::optimizePow() function of
SimplifyLibCalls.cpp. It is sufficient to always turn pow(x, n)
into powi(x, n) for the cases where n is a constant integer value.

Dropping the current expansion code allowed relaxation of the folding
conditions and now this can also happen at optimization levels below
Ofast.

The added CodeGen/AArch64/powi.ll test case ensures that powi is
actually expanded into fmul's, confirming that this refactor did not
cause any performance degradation.

Following an idea proposed by David Sherwood <david.sherwood@arm.com>.

Differential Revision: https://reviews.llvm.org/D128591
2022-07-09 12:00:22 -04:00
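
For the pow(x, n) change above, the following is a minimal sketch of how an integer power can be expanded into a short chain of multiplications by repeated squaring. The helper name `powiSketch` is hypothetical, and this is not the backend's actual expansion code, only an instance of the same idea.

```cpp
#include <cassert>

// Hypothetical helper: x^n for an integer n using repeated squaring, so only
// multiplications (plus one division for negative n) are needed.
double powiSketch(double x, int n) {
  // Use a wide magnitude so that negating the most negative int cannot overflow.
  unsigned long long e =
      n < 0 ? -static_cast<long long>(n) : static_cast<long long>(n);
  double result = 1.0;
  double base = x;
  while (e != 0) {
    if (e & 1)
      result *= base; // multiply in the current power of two
    base *= base;     // square for the next exponent bit
    e >>= 1;
  }
  return n < 0 ? 1.0 / result : result;
}

// Small self-check with values that are exact in binary floating point.
void checkPowiSketch() {
  assert(powiSketch(2.0, 10) == 1024.0);
  assert(powiSketch(1.5, 3) == 3.375);
  assert(powiSketch(2.0, -2) == 0.25);
}
```

With a constant exponent the loop unrolls into a fixed sequence of multiplies, which is the shape the commit message expects from the powi expansion.
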
zhongyunde 716e1b856a [IndVars] Eliminate redundant type cast between integer and float
Recompute the range: match for fptosi of sitofp, and then query the range of the input to the sitofp
according to the comment on D129140.

Fixes https://github.com/llvm/llvm-project/issues/55505.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129191
2022-07-08 17:07:20 +08:00
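
For the fptosi-of-sitofp change above, the sketch below illustrates why the range of the integer input matters: the round trip through a floating-point type is the identity only when every value in that range is exactly representable. It assumes an IEEE-754 binary32 `float` and round-to-nearest; the helper names are hypothetical and the pass's real range query is not reproduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical check: every integer in [lo, hi] survives int -> float -> int
// if its magnitude fits in the float significand (24 bits for binary32).
bool roundTripIsExact(std::int32_t lo, std::int32_t hi) {
  const std::int64_t limit = std::int64_t{1}
                             << std::numeric_limits<float>::digits;
  return lo >= -limit && hi <= limit;
}

// The cast pair from the commit message: sitofp followed by fptosi.
std::int32_t viaFloat(std::int32_t x) {
  return static_cast<std::int32_t>(static_cast<float>(x));
}

void checkRoundTrip() {
  assert(roundTripIsExact(-1000, 1000));
  assert(viaFloat(-1000) == -1000 && viaFloat(1000) == 1000);
  const std::int32_t tooBig = (std::int32_t{1} << 24) + 1; // needs 25 bits
  assert(!roundTripIsExact(0, tooBig));
  assert(viaFloat(tooBig) != tooBig); // rounded to 16777216 on the way
}
```

When the range check succeeds, the two casts cancel and the original integer can be used directly.
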
Nikita Popov 34a5c2bcf2 [BasicBlockUtils] Allow critical edge splitting with callbr terminators
After D129205, we support SplitBlockPredecessors() for predecessors
with callbr terminators. This means that it is now also safe to
invoke critical edge splitting for an edge coming from a callbr
terminator. Remove checks in various passes that were protecting
against that.

Differential Revision: https://reviews.llvm.org/D129256
2022-07-08 09:20:44 +02:00
Martin Sebor 516915beb5 [InstCombine] Fold memchr and strchr equality with first argument
Enhance memchr and strchr handling to simplify calls to the functions
used in equality expressions with the first argument to at most two
integer comparisons:

- memchr(A, C, N) == A to N && *A == C for either a dereferenceable
  A or a nonzero N,
- strchr(S, C) == S to *S == C for any S and C, and
- strchr(S, '\0') == 0 to false for any S

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128939
2022-07-07 15:14:23 -06:00
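
The first two equivalences in the memchr/strchr commit above can be checked with a small program. This is a sketch using the C library functions directly; the helper names are hypothetical and nothing here is InstCombine code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <initializer_list>

// memchr(A, C, N) == A, written as the two comparisons from the commit message.
// The short-circuit keeps *a from being read when n is zero.
bool memchrHitsFirst(const char *a, int c, std::size_t n) {
  return n != 0 &&
         static_cast<unsigned char>(*a) == static_cast<unsigned char>(c);
}

// strchr(S, C) == S, written as a single character comparison.
bool strchrHitsFirst(const char *s, int c) {
  return *s == static_cast<char>(c);
}

void checkFirstArgumentFolds() {
  const char buf[] = "hello";
  for (int c : {'h', 'e', 'z', '\0'}) {
    for (std::size_t n = 0; n <= sizeof buf; ++n)
      assert((std::memchr(buf, c, n) == buf) == memchrHitsFirst(buf, c, n));
    assert((std::strchr(buf, c) == buf) == strchrHitsFirst(buf, c));
  }
}
```

Both helpers avoid the library call entirely, which is the point of the fold: at most two integer comparisons remain.
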
Zaara Syeda 58b9666dc1 [LSR] Fix bug - check if loop has preheader before calling isInductionPHI
Fix bug exposed by https://reviews.llvm.org/D125990
rewriteLoopExitValues calls InductionDescriptor::isInductionPHI which requires
the PHI node to have an incoming edge from the loop preheader. This adds checks
before calling InductionDescriptor::isInductionPHI to see that the loop has a
preheader. Also did some refactoring.

Differential Revision: https://reviews.llvm.org/D129297
2022-07-07 15:11:33 -04:00
Joseph Huber 41fba3c107 [Metadata] Add 'exclude' metadata to add the exclude flags on globals
This patch adds a new metadata kind `exclude` which implies that the
global variable should be given the necessary flags during code
generation to not be included in the final executable. This is done
using the ``SHF_EXCLUDE`` flag on ELF for example. This should make it
easier to specify this flag on a variable without needing to explicitly
check the section name in the target backend.

Depends on D129053 D129052

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D129151
2022-07-07 12:20:40 -04:00
Joseph Huber ed801ad5e5 [Clang] Use metadata to make identifying embedded objects easier
Currently we use the `embedBufferInModule` function to store binary
strings containing device offloading data inside the host object to
create a fatbinary. In the case of LTO, we need to extract this object
from the LLVM-IR. This patch adds a metadata node for the embedded
objects containing the embedded pointers and the sections they were
stored at. This should create a cleaner interface for identifying these
values.

In the future it may be worthwhile to also encode an `ID` in the
metadata corresponding to the object's special section type if relevant.
This would allow us to extract the data from an object file and LLVM-IR
using the same ID.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D129033
2022-07-07 12:20:25 -04:00
Nikita Popov 40a4078e14 [BasicBlockUtils] Allow splitting predecessors with callbr terminators
SplitBlockPredecessors currently asserts if one of the predecessor
terminators is a callbr. This limitation was originally necessary,
because just like with indirectbr, it was not possible to replace
successors of a callbr. However, this is no longer the case since
D67252. As the requirement nowadays is that callbr must reference
all blockaddrs directly in the call arguments, and these get
automatically updated when setSuccessor() is called, we no longer
need this limitation.

The only thing we need to do here is use replaceSuccessorWith()
instead of replaceUsesOfWith(), because only the former does the
necessary blockaddr updating magic.

I believe there's other similar limitations that can be removed,
e.g. related to critical edge splitting.

Differential Revision: https://reviews.llvm.org/D129205
2022-07-07 09:13:25 +02:00
Nikola Tesic b5b6d3a41b [Debugify] Port verify-debuginfo-preserve to NewPM
Debugify in OriginalDebugInfo mode, introduced with D82545,
runs only with legacy PassManager.

This patch enables this utility for the NewPM.

Differential Revision: https://reviews.llvm.org/D115351
2022-07-06 17:07:20 +02:00
Shilei Tian 1023ddaf77 [LLVM] Add the support for fmax and fmin in atomicrmw instruction
This patch adds support for `fmax` and `fmin` operations in the `atomicrmw`
instruction. For now (at least in this patch), the instruction will be expanded
to a CAS loop. There are already a couple of targets supporting the feature. I'll
create follow-up patch(es) to enable them accordingly.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D127041
2022-07-06 10:57:53 -04:00
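
The CAS-loop expansion mentioned in the atomicrmw commit above can be pictured at the source level as follows. This is only a sketch built on `std::atomic<float>`; the helper name is hypothetical and the code LLVM actually emits is not shown in the commit message.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>

// Hypothetical helper: atomically store max(current, v) and return the old
// value, mirroring the way atomicrmw yields the original memory contents.
float atomicFMaxSketch(std::atomic<float> &target, float v) {
  float old = target.load();
  // On failure compare_exchange_weak reloads `old`, so the maximum is
  // recomputed from the freshly observed value on every retry.
  while (!target.compare_exchange_weak(old, std::fmax(old, v))) {
  }
  return old;
}

void checkAtomicFMax() {
  std::atomic<float> cell{1.0f};
  assert(atomicFMaxSketch(cell, 3.0f) == 1.0f); // old value is returned
  assert(cell.load() == 3.0f);
  assert(atomicFMaxSketch(cell, 2.0f) == 3.0f); // smaller operand changes nothing
  assert(cell.load() == 3.0f);
}
```

An `fmin` variant would differ only in calling `std::fmin` inside the loop.
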
Nikita Popov 20962c1240 [SimplifyCFG] Don't split predecessors of callbr terminator
This addresses the assertion failure reported in
https://reviews.llvm.org/D124159#3631240.

I believe that this limitation in SplitBlockPredecessors is not
actually necessary (because unlike with indirectbr, callbr is
restricted in a way that does allow updating successors), but for
now fix the assertion failure the same way we do everywhere else,
by also skipping callbr.
2022-07-06 15:38:53 +02:00
Nikita Popov f96cb66d19 [ValueTracking] Accept Instruction in isSafeToSpeculativelyExecute() (NFC)
As constant expressions can no longer trap, it only makes sense to
call isSafeToSpeculativelyExecute on Instructions, so limit the
API to accept only them, rather than general Operators or Values.
2022-07-06 11:12:49 +02:00
Nikita Popov 8ee913d83b [IR] Remove Constant::canTrap() (NFC)
As integer div/rem constant expressions are no longer supported,
constants can no longer trap and are always safe to speculate.
Remove the Constant::canTrap() method and its usages.
2022-07-06 10:36:47 +02:00
Yuanfang Chen b170d856a3 [SimplifyCFG] Skip hoisting common instructions that return token type
By LangRef, hoisting token-returning instructions obscures the origin
so it should be skipped. Found this issue while investigating a
CoroSplit pass crash.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D129025
2022-07-05 11:21:57 -07:00
Zaara Syeda dbf6ab5ef9 [LSR] Fix bug for optimizing unused IVs to final values
This is a fix for a crash reported for https://reviews.llvm.org/D118808
The fix is to only consider PHINodes which are induction phis.
Fixes #55529

Differential Revision: https://reviews.llvm.org/D125990
2022-07-05 12:30:58 -04:00
Nikita Popov a4772cbaf0 Revert "[SimplifyCFG] Thread branches on same condition in more cases (PR54980)"
This reverts commit 4e545bdb35.

The newly added test is the third infinite combine loop caused by
this change. In this case, it's a combination of the branch to
common dest and jump threading folds that keeps peeling off loop
iterations.

The core problem here is that we ideally would not thread over
loop backedges, both because it is potentially non-profitable
(it may break canonical loop structure) and because it may result
in these kinds of loops. Unfortunately, due to the lack of a
dominator tree in SimplifyCFG, there is no good way to prevent
this. While we have LoopHeaders, this is an optional structure and
we don't do a good job of keeping it up to date. It would be fine
for a profitability check, but is not suitable for a correctness
check.

So for now I'm just giving up here, as I don't see a good way to
robustly prevent infinite combine loops.

Fixes https://github.com/llvm/llvm-project/issues/56203.
2022-07-05 16:57:46 +02:00
Nikita Popov dc969061c6 [SimplifyCFG] Thread all predecessors with same value at once
If there are multiple predecessors that have the same condition
value (and thus same "real destination"), these were previously
handled by copying the threaded block for each predecessor.
Instead, we can reuse one block for all of them. This makes the
behavior of SimplifyCFG's jump threading match that of the
actual JumpThreading pass.

This also avoids the infinite combine loop reported in:
https://reviews.llvm.org/D124159#3624387
2022-07-05 14:33:53 +02:00
Nikita Popov 32a76fc292 [SCEVExpander] Avoid ConstantExpr::get() (NFCI)
Use ConstantFoldBinaryOpOperands() instead. This will be important
when not all binops have constant expression variants.
2022-07-04 14:59:00 +02:00
Nikita Popov 9604601c93 [SimplifyCFG] Remove redundant checks for hoisting (NFCI)
These conditions are later checked in the HoistTerminator code
path. Checking them here is somewhat confusing, because this code
only checks the first instruction in the block, which is not
necessarily the terminator.
2022-07-04 10:53:54 +02:00
Martin Sebor 0d68ff87d2 [InstCombine] Transform strrchr to memrchr for constant strings
Add an emitter for the memrchr common extension and simplify the strrchr
call handler to use it. This enables transforming calls with the empty
string to the test C ? 0 : S.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128954
2022-07-01 11:10:00 -06:00
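
For the strrchr change above, the sketch below expresses strrchr on a string of known length through a memrchr-style search. Because memrchr is a nonstandard extension, a portable stand-in (`memrchrSketch`, a hypothetical name) is used. For an empty string only the terminating nul can match, so the result is S when C is the nul character and a null pointer otherwise.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <initializer_list>

// Portable stand-in for the memrchr extension: last occurrence of c in the
// first n bytes of s, or nullptr if there is none.
const char *memrchrSketch(const char *s, int c, std::size_t n) {
  while (n != 0) {
    --n;
    if (static_cast<unsigned char>(s[n]) == static_cast<unsigned char>(c))
      return s + n;
  }
  return nullptr;
}

// strrchr for a string of known length. The size passed on includes the
// terminating nul so that searching for '\0' works.
const char *strrchrViaMemrchr(const char *s, std::size_t len, int c) {
  return memrchrSketch(s, c, len + 1);
}

void checkStrrchrViaMemrchr() {
  const char str[] = "abcabc";
  for (int c : {'a', 'c', 'x', '\0'})
    assert(strrchrViaMemrchr(str, 6, c) == std::strrchr(str, c));
  const char empty[] = "";
  assert(strrchrViaMemrchr(empty, 0, '\0') == empty); // nul matches
  assert(strrchrViaMemrchr(empty, 0, 'a') == nullptr); // nothing else does
}
```
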
Nikita Popov 65d59b4265 [LoopDeletion] Fix deletion with unusual predecessor terminator (PR56266)
LoopSimplify only requires that the loop predecessor has a single
successor and is safe to hoist into -- it doesn't necessarily have
to be an unconditional BranchInst.

Adjust LoopDeletion to assert conditions closer to what it actually
needs for correctness, namely a single successor and a
side-effect-free terminator (as the terminator is getting dropped).

Fixes https://github.com/llvm/llvm-project/issues/56266.
2022-07-01 16:13:35 +02:00
Nikita Popov fabe915705 [SimplifyLibCalls] Use inbounds GEP
When converting strchr(p, '\0') to p + strlen(p) we know that
strlen() must return an offset that is inbounds of the allocated
object (otherwise it would be UB), so we can use an inbounds GEP.
An equivalent argument can be made for the other cases.
2022-07-01 14:31:44 +02:00
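
The rewrite discussed in the inbounds GEP commit above can be illustrated with plain pointer arithmetic. This is a sketch with a hypothetical helper name; it only shows that the computed address stays within the string's storage.

```cpp
#include <cassert>
#include <cstring>

// strchr(p, '\0') rewritten as pointer arithmetic on the string length.
const char *endOfString(const char *p) { return p + std::strlen(p); }

void checkEndOfString() {
  const char s[] = "inbounds";
  assert(std::strchr(s, '\0') == endOfString(s));
  // The result points at the terminator, which is still inside the array.
  assert(endOfString(s) == s + sizeof s - 1);
}
```
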
Nikita Popov 9b994593cc [SCCP] Only handle unknown lattice values in resolvedUndefsIn()
This is a minor refinement of resolvedUndefsIn(), mostly for clarity.
If the value of an instruction is undef, then that's already a legal
final result -- we can safely rauw such an instruction with undef.
We only need to mark unknown values as overdefined, as that's the
result we get for an instruction that has not been processed because
it has an undef operand.

Differential Revision: https://reviews.llvm.org/D128251
2022-07-01 09:14:37 +02:00
Chen Zheng 39fe49aa57 [Inline] don't add noalias metadata for unknown objects.
The unidentified objects recognized in `getUnderlyingObjects` may
still alias to the noalias parameter because `getUnderlyingObjects`
may not check deep enough to get the underlying object because of
`MaxLookup`. The real underlying object for the unidentified object
may still be the noalias parameter.

Originally Patched By: tingwang

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D127202
2022-07-01 02:16:55 -04:00
Martin Sebor 3a743a5892 [InstCombine] Fix memrchr logic error that prevents folding
Correct a logic bug in the memrchr enhancement added in D123629 that
makes it ineffective in a subset of cases.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128856
2022-06-30 11:35:26 -06:00
Nikita Popov f34dcf2763 [IRBuilder] Migrate all binops to folding API
Migrate all binops to use FoldXYZ rather than CreateXYZ APIs,
which are compatible with InstSimplifyFolder and fallible constant
folding.

Rather than continuing to add one method for every single operator,
add a generic FoldBinOp (plus variants for nowrap, exact and fmf
operators), which we would need anyway for CreateBinaryOp.

This change is not NFC because IRBuilder with InstSimplifyFolder
may perform more folding. However, this patch changes SCEVExpander
to not use the folder in InsertBinOp to minimize practical impact
and keep this change as close to NFC as possible.
2022-06-30 16:41:17 +02:00
Nikita Popov 588e229bf9 [VNCoercion] Separate constant/non-constant mem intrinsic implementations (NFCI)
This means we no longer need to have the same API between IRBuilder
and IRBuilderFolder.

The constant case is substantially simpler, so implementing it
separately isn't an undue burden.
2022-06-30 15:26:06 +02:00
Nikita Popov 014c4bdb9d [VNCoercion] Use ConstantFoldLoadFromConst API (NFCI)
Nowadays we have a generic constant folding API to load a type from
an offset. It should be able to do anything that VNCoercion can do.

This avoids the weird templating between IRBuilder and ConstantFolder
in one function, which will stop working as the IRBuilderFolder
moves from CreateXYZ to FoldXYZ APIs.

Unfortunately, this doesn't eliminate this pattern from VNCoercion
entirely yet.
2022-06-30 14:52:27 +02:00
Nikita Popov 1579fc62fe [Evaluator] Add missing LLVM_DEBUG()
Missed these in 41f0b6a781, resulting
in unconditional debug output.
2022-06-30 11:54:47 +02:00
Chen Zheng b05801de35 [InlineFunction] Only check pointer arguments for a call
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128529
2022-06-30 05:39:47 -04:00
Nikita Popov 41f0b6a781 [Evaluator] Use ConstantFoldInstOperands()
For instructions that don't need any special handling, use
ConstantFoldInstOperands(), rather than re-implementing individual
cases.

This is probably not NFC because it can handle cases the previous
code missed (e.g. vector operations).
2022-06-30 11:10:17 +02:00
Nikita Popov a6d4b4138f [ConstantFold] Supports compares in ConstantFoldInstOperands()
Support compares in ConstantFoldInstOperands(), instead of
forcing the use of ConstantFoldCompareInstOperands(). Also handle
insertvalue (extractvalue was already handled).

This removes a footgun, where many uses of ConstantFoldInstOperands()
need a separate check for compares beforehand. It's particularly
insidious if called on a constant expression, because it doesn't
fail in that case, but will just not do DL-dependent folding.
2022-06-30 11:05:24 +02:00
Florian Hahn 6d5f814357 [LoopUnrollRuntime] Invalidate SCEV for exit phi in ConnectProlog.
ConnectProlog adds new incoming values to exit phi nodes which can
change the SCEV for the phi after 20d798bd47.

Fix is analogous to cfc741bc0e.

Fixes #56286.
2022-06-29 20:28:43 +01:00
Florian Hahn 9a35f19e3e [UnrollRuntime] Invalidate SCEVs for modified phis in ConnectEpilog.
ConnectEpilog adds new incoming values to exit phi nodes which can
change the SCEV for the phi after 20d798bd47.

Fix is analogous to cfc741bc0e.

Fixes #56282.
2022-06-29 18:26:00 +01:00
Martin Sebor 8827679826 [InstCombine] Fold strncmp of constant arrays and variable size
Extend the solution accepted in D127766 to strncmp and simplify
strncmp(A, B, N) calls with constant A and B and variable N to
the equivalent of

  N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0)

where Pos is the offset of either the first mismatch between A
and B or the terminating null character if both A and B are equal
strings.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D128089
2022-06-28 15:59:14 -06:00
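
The strncmp fold above splits the work into a part that depends only on the constant strings (Pos and the comparison result at Pos) and a part that depends on the runtime N. A minimal sketch under that reading, with hypothetical names throughout:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Values that can be computed once for two constant strings.
struct FoldedCompare {
  std::size_t pos; // first mismatch, or the terminating nul if A equals B
  int sign;        // -1, 0 or +1: how A compares to B at pos
};

FoldedCompare precomputeStrncmp(const char *a, const char *b) {
  std::size_t i = 0;
  while (a[i] == b[i] && a[i] != '\0')
    ++i;
  const unsigned char ca = static_cast<unsigned char>(a[i]);
  const unsigned char cb = static_cast<unsigned char>(b[i]);
  return {i, ca < cb ? -1 : (cb < ca ? 1 : 0)};
}

// The only part that still depends on the runtime value N.
int foldedStrncmp(const FoldedCompare &f, std::size_t n) {
  return n <= f.pos ? 0 : f.sign;
}

void checkFoldedStrncmp() {
  const char a[] = "abcde", b[] = "abcxy";
  const FoldedCompare f = precomputeStrncmp(a, b);
  for (std::size_t n = 0; n <= 8; ++n) {
    const int expected = std::strncmp(a, b, n);
    assert(foldedStrncmp(f, n) == (expected > 0) - (expected < 0));
  }
}
```

Only the sign of the library result is compared, since strncmp guarantees no particular magnitude.
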
Martin Sebor e263a7670e [InstCombine] Look through more casts when folding memchr and memcmp
Enhance getConstantDataArrayInfo to let the memchr and memcmp library
call folders look through arbitrarily long sequences of bitcast and
GEP instructions.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128364
2022-06-28 15:58:42 -06:00
Nikita Popov 5548e807b5 [IR] Remove support for extractvalue constant expression
This removes the extractvalue constant expression, as part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
extractvalue is already not supported in bitcode, so we do not need
to worry about bitcode auto-upgrade.

Uses of ConstantExpr::getExtractValue() should be replaced with
IRBuilder::CreateExtractValue() (if the fact that the result is
constant is not important) or ConstantFoldExtractValueInstruction()
(if it is). Though for this particular case, it is also possible
and usually preferable to use getAggregateElement() instead.

The C API function LLVMConstExtractValue() is removed, as the
underlying constant expression no longer exists. Instead,
LLVMBuildExtractValue() should be used (which will constant fold
or create an instruction). Depending on the use-case,
LLVMGetAggregateElement() may also be used instead.

Differential Revision: https://reviews.llvm.org/D125795
2022-06-28 10:40:17 +02:00
Nikita Popov f65c88c42f [GlobalOpt] Fix memset handling in global ctor evaluation (PR55859)
The global ctor evaluator currently handles memset by checking whether the
memset memory is already zero, and skips it in that case. However,
it only actually checks the first byte of the memory being set.

This patch extends the code to check all bytes being set. This is
done byte-by-byte to avoid converting undef values to zeros in
larger reads. However, the handling is still not completely correct,
because there might still be padding bytes (though this probably
doesn't matter much, as I'd expect global variable padding to be
zero-initialized in practice).

Mostly fixes https://github.com/llvm/llvm-project/issues/55859.

Differential Revision: https://reviews.llvm.org/D128532
2022-06-27 16:50:49 +02:00
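
The difference between the old and new memset checks above can be sketched on a raw byte buffer. The real evaluator works on LLVM constants rather than bytes in memory, so the functions below are hypothetical stand-ins that only show why inspecting the first byte is not enough.

```cpp
#include <cassert>
#include <cstddef>

// A memset-to-zero is a no-op only if every byte it covers is already zero.
bool memsetToZeroIsNoOp(const unsigned char *mem, std::size_t len) {
  for (std::size_t i = 0; i < len; ++i)
    if (mem[i] != 0)
      return false;
  return true;
}

// The flawed shortcut described in the commit message: look at one byte only.
bool firstByteOnlyCheck(const unsigned char *mem, std::size_t len) {
  return len == 0 || mem[0] == 0;
}

void checkMemsetChecks() {
  const unsigned char zeros[4] = {0, 0, 0, 0};
  const unsigned char mixed[4] = {0, 0, 7, 0};
  assert(memsetToZeroIsNoOp(zeros, 4));
  assert(firstByteOnlyCheck(mixed, 4));  // wrongly reports "already zero"
  assert(!memsetToZeroIsNoOp(mixed, 4)); // byte-by-byte check catches it
}
```
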
Kazu Hirata a7938c74f1 [llvm] Don't use Optional::hasValue (NFC)
This patch replaces Optional::hasValue with the implicit cast to bool
in conditionals only.
2022-06-25 21:42:52 -07:00
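
The replacement pattern in the Optional::hasValue commit above looks roughly like this. The sketch uses `std::optional` (C++17) as a stand-in for `llvm::Optional`, and the function name and default value are made up for illustration.

```cpp
#include <cassert>
#include <optional>

// std::optional stands in for llvm::Optional here.
int widthOrDefault(const std::optional<int> &width) {
  // Before: if (width.has_value())
  if (width) // contextual conversion to bool inside the conditional
    return *width;
  return 80;
}

void checkWidthOrDefault() {
  assert(widthOrDefault(std::nullopt) == 80);
  assert(widthOrDefault(42) == 42);
}
```
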
Kazu Hirata 3b7c3a654c Revert "Don't use Optional::hasValue (NFC)"
This reverts commit aa8feeefd3.
2022-06-25 11:56:50 -07:00
Kazu Hirata aa8feeefd3 Don't use Optional::hasValue (NFC) 2022-06-25 11:55:57 -07:00
Arthur Eubanks e422c0d3b2 [GlobalOpt] Perform store->dominated load forwarding for stored once globals
The initial land incorrectly optimized forwarding non-Constants in non-nosync/norecurse functions. Bail on non-Constants since norecurse should cause global -> alloca promotion anyway.

The initial land also incorrectly assumed that StoredOnceStore was the only store to the global, but it actually means that only one value other than the global initializer is stored. Add a check that there's only one store.

Compile time tracker:
https://llvm-compile-time-tracker.com/compare.php?from=c80b88ee29f34078d2149de94e27600093e6c7c0&to=ef2c2b7772424b6861a75e794f3c31b45167304a&stat=instructions

Reviewed By: nikic, asbirlea, jdoerfert

Differential Revision: https://reviews.llvm.org/D128128
2022-06-24 09:09:26 -07:00
Nikita Popov e523baa664 [InlineFunction] Slightly clarify noalias scope calculation (NFC)
Rename CanDeriveViaCapture -> RequiresNoCaptureBefore, drop
unnecessary const cast, reformat some code to avoid an ugly
super-indented comment.
2022-06-24 12:31:46 +02:00
Florian Mayer 9320a32bb9 [MTE] [HWASan] Use LoopInfo for reachability queries.
The reachability queries default to "reachable" after exploring too many
basic blocks. LoopInfo helps it skip over the whole loop.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D127917
2022-06-22 15:28:49 -07:00
Brendon Cahoon e13248ab0e [UnifyLoopExits] Reduce number of guard blocks
UnifyLoopExits creates a single exit, a control flow hub, for
loops with multiple exits. There is an input to the block for
each loop exiting block and an output from the block for each
loop exit block. Multiple checks, or guard blocks, are needed
to branch to the correct exit block.

For large loops with lots of exit blocks, all the extra guard
blocks cause problems for StructurizeCFG and subsequent passes.
This patch reduces the number of guard blocks needed when the
exit blocks branch to a common block (e.g., an unreachable
block). The guard blocks are reduced by changing the inputs
and outputs of the control flow hub. The inputs are the exit
blocks and the outputs are the common block.

Reducing the guard blocks enables StructurizeCFG to reorder the
basic blocks in the CFG to reduce the values that exit a loop
with multiple exits. This reduces the compile-time of
StructurizeCFG and also reduces register pressure.

Differential Revision: https://reviews.llvm.org/D123230
2022-06-22 15:44:23 -05:00
Florian Mayer 476ced4b89 [MTE] [HWASan] Support diamond lifetimes.
We were overly conservative and required a ret statement to be dominated
completely by a single lifetime.end marker. This is quite restrictive
and leads to two problems:

* limits coverage of use-after-scope, as we degenerate to
  use-after-return;
* increases stack usage in programs, as we have to remove all lifetime
  markers if we degenerate to use-after-return, which prevents
  reuse of stack slots by the stack coloring algorithm.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D127905
2022-06-22 11:16:34 -07:00
Nikita Popov 1f88d80408 [SCCP] Don't mark edges feasible when resolving undefs
As branch on undef is immediate undefined behavior, there is no need
to mark one of the edges as feasible. We can leave all the edges
non-feasible. In IPSCCP, we can replace the branch with an unreachable
terminator.

Differential Revision: https://reviews.llvm.org/D126962
2022-06-22 10:28:27 +02:00
Martin Sebor b19194c032 [InstCombine] handle subobjects of constant aggregates
Remove the known limitation of the library function call folders to only
work with top-level arrays of characters (as per the TODO comment in
the code) and allow them to also fold calls involving subobjects of
constant aggregates such as member arrays.
2022-06-21 11:55:14 -06:00
Kazu Hirata 7a47ee51a1 [llvm] Don't use Optional::getValue (NFC) 2022-06-20 22:45:45 -07:00
Kazu Hirata e0e687a615 [llvm] Don't use Optional::hasValue (NFC) 2022-06-20 10:38:12 -07:00
Florian Hahn cfc741bc0e [LoopPeel] Forget SCEV for updated exit phi values.
LoopPeel adds new incoming values to exit phi nodes which can change the
SCEV for the phi after 20d798bd47.

Forget SCEVs for such phis.

Fixes #56044.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D128164
2022-06-20 13:19:27 +02:00
Guillaume Chatelet f1255186c7 [NFC][Alignment] Remove max functions between Align and MaybeAlign
`llvm::max(Align, MaybeAlign)` and `llvm::max(MaybeAlign, Align)` are
not used often enough to be required. They also make the code more opaque.

Differential Revision: https://reviews.llvm.org/D128121
2022-06-20 08:37:48 +00:00
Nikita Popov 2b089e9ae0 [SimplifyCFG] Try to merge edge block when threading (PR55765)
When threading, we always create a new block for the threaded edge
(even if the edge is not critical), which will later get folded back
into the predecessor if possible. Depending on precise processing
order, this separate block may break the detection of trivial
cycles in the threading code, which normally avoids infinite
threading of loops. Explicitly merge the created edge block into
the predecessor to avoid this.

Fixes https://github.com/llvm/llvm-project/issues/55765.

Differential Revision: https://reviews.llvm.org/D127216
2022-06-20 10:29:33 +02:00
Kazu Hirata 129b531c9c [llvm] Use value_or instead of getValueOr (NFC) 2022-06-18 23:07:11 -07:00
Florian Hahn e9cced2739 Recommit "[LAA] Initial support for runtime checks with pointer selects."
This reverts commit 7aa8a67882.

This version includes fixes to address issues uncovered after
the commit landed and discussed at D11448.

Those include:

* Limit select-traversal to selects inside the loop.
* Freeze pointers resulting from looking through selects to avoid
  branch-on-poison.
2022-06-17 21:06:26 +02:00
Martin Sebor 5fb67e32f8 [InstCombine] Fold memcmp of constant arrays and variable size
The memcmp simplifier is limited to folding calls with constant arrays and
constant sizes to constants.  This change adds the ability to simplify
memcmp(A, B, N) calls with constant A and B and variable N to the pseudocode
equivalent of

N <= Pos ? 0 : (A < B ? -1 : B < A ? +1 : 0)

where Pos is the offset of the first mismatch between A and B.

Differential Revision: https://reviews.llvm.org/D127766
2022-06-17 10:35:35 -06:00
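
The memcmp formula above can be sketched as follows: Pos and the comparison result at Pos are computed once from the constant arrays, and only the test against N remains at run time. All names are hypothetical, and N is assumed not to exceed the array size, as memcmp itself requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Offset of the first differing byte, or `size` if the arrays are identical.
std::size_t firstMismatch(const unsigned char *a, const unsigned char *b,
                          std::size_t size) {
  std::size_t pos = 0;
  while (pos < size && a[pos] == b[pos])
    ++pos;
  return pos;
}

// -1, 0 or +1 for the bytes at the mismatch position (0 if there is none).
int mismatchSign(const unsigned char *a, const unsigned char *b,
                 std::size_t pos, std::size_t size) {
  if (pos == size)
    return 0;
  return a[pos] < b[pos] ? -1 : 1;
}

// The part that depends on the runtime N: N <= Pos ? 0 : sign.
int foldedMemcmp(std::size_t pos, int sign, std::size_t n) {
  return n <= pos ? 0 : sign;
}

void checkFoldedMemcmp() {
  const unsigned char a[] = {1, 2, 3, 4};
  const unsigned char b[] = {1, 2, 9, 0};
  const std::size_t pos = firstMismatch(a, b, sizeof a);
  const int sign = mismatchSign(a, b, pos, sizeof a);
  for (std::size_t n = 0; n <= sizeof a; ++n) {
    const int expected = std::memcmp(a, b, n);
    assert(foldedMemcmp(pos, sign, n) == (expected > 0) - (expected < 0));
  }
}
```
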
Samuel Eubanks bf02ed240d Prevent crash when TurnSwitchRangeIntoICmp receives default unreachable destination
TurnSwitchRangeIntoICmp crashes when given a switch with a default
destination of unreachable
Addresses issue #53208
https://github.com/llvm/llvm-project/issues/53208

Differential revision: https://reviews.llvm.org/D127712
2022-06-16 16:11:24 +02:00
Nikita Popov 2dac2c4f76 [SimplifyLibCalls] Drop duplicate check (NFC)
The same condition already exists inside optimizeMemCmpConstantSize().
2022-06-15 09:37:09 +02:00
Serguei Katkov d713f0eab8 Revert "[MachineSSAUpdater] compile time improvement in GetValueInMiddleOfBlock"
It looks like it causes buildbot failures.
As an example: https://lab.llvm.org/buildbot/#/builders/121/builds/20364

Revert to investigate...

This reverts commit 6bf2791814.
2022-06-14 20:27:21 +07:00
Serguei Katkov 6bf2791814 [MachineSSAUpdater] compile time improvement in GetValueInMiddleOfBlock
GetValueInMiddleOfBlock uses result of GetValueAtEndOfBlockInternal if there is no value
defined for current basic block.

If there is already a value, it tries (in this order) to:

* find a single register coming from all predecessors,
* find an existing phi node which matches our incoming registers,
* build a new phi.

The compile time improvement is to use the currently available value if
it is defined outside of the current BB or it is a PHI register,
because in that case it can be used in the middle of the basic block.

Reviewed By: sameerds
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D126523
2022-06-14 18:00:34 +07:00
Chuanqi Xu d029db9e8a [NFC] Fix Wswitch warning triggered by 735e6c 2022-06-14 14:45:15 +08:00
Guillaume Chatelet 2887dd754e [NFC][Alignment] Use getAlign in VNCoercion 2022-06-13 15:13:05 +00:00
Nikita Popov 571c713144 [SimplifyCFG] Handle trapping aggregates (PR49839)
Handle the fact that not only constant expressions, but also
constant aggregates containing expressions can trap.

This still doesn't fix the original C reproducer, probably due to
more issues remaining in other passes.
2022-06-13 14:56:49 +02:00
Hans Wennborg 3800b157d7 [SimplifyCFG] Share code to compute switch density between ShouldBuildLookupTable() and ReduceSwitchRange()
They're computing the same thing. No functionality change.

Differential revision: https://reviews.llvm.org/D127482
2022-06-10 15:29:36 +02:00
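
A switch density computation of the kind shared by the commit above might look like the sketch below: the number of cases is compared against the size of the value range they span. The function name, the percentage form and the 40% default are assumptions made for illustration, not values taken from the commit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical density test: a switch counts as dense when its cases fill at
// least minDensityPercent of the value range they span.
bool isDenseSketch(std::uint64_t numCases, std::uint64_t caseRange,
                   std::uint64_t minDensityPercent = 40) {
  assert(caseRange != 0 && "range must cover at least one value");
  assert(minDensityPercent <= 100 && "density is a percentage");
  // Bail out rather than overflow in the multiplications below.
  if (caseRange >= UINT64_MAX / 100 || numCases >= UINT64_MAX / 100)
    return false;
  return numCases * 100 >= caseRange * minDensityPercent;
}

void checkIsDenseSketch() {
  assert(isDenseSketch(4, 10));  // exactly 40% of the range is covered
  assert(!isDenseSketch(3, 10)); // 30% falls below the threshold
  assert(isDenseSketch(10, 10)); // fully covered range
}
```

Keeping the test in one function means both callers agree on what "dense" means, which is the motivation the commit gives for sharing the code.
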
Nikita Popov d77f944832 [LoopInfo] Add getOutermostLoop() (NFC)
This is a recurring pattern, add an API function for it.
2022-06-10 11:48:21 +02:00
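
The recurring pattern that the getOutermostLoop() commit above refers to is a walk up the parent chain of a loop nest. The sketch below uses a hypothetical `LoopNode` struct rather than LLVM's Loop class.

```cpp
#include <cassert>

// Hypothetical stand-in for a loop nest node; not LLVM's Loop class.
struct LoopNode {
  const LoopNode *parent; // nullptr for a top-level loop
};

// Walk parent links until the top-level loop is reached.
const LoopNode *outermostLoop(const LoopNode *loop) {
  while (loop->parent)
    loop = loop->parent;
  return loop;
}

void checkOutermostLoop() {
  LoopNode outer{nullptr};
  LoopNode middle{&outer};
  LoopNode inner{&middle};
  assert(outermostLoop(&inner) == &outer);
  assert(outermostLoop(&outer) == &outer); // a top-level loop is its own answer
}
```
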
Philip Reames f85c5079b8 Pipe potentially invalid InstructionCost through CodeMetrics
Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred.

On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost.

I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change.
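The bailout pattern described above can be sketched in isolation. This is a minimal illustration, not LLVM's `InstructionCost` API: the `Cost` alias and `isCheapEnough` are hypothetical names, and an invalid cost is modeled as an empty `std::optional`.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for a cost that may be invalid (empty == invalid).
using Cost = std::optional<unsigned>;

// Decide whether a transform is worthwhile. An invalid cost means
// "cannot be costed", so the client bails out instead of asserting.
bool isCheapEnough(Cost C, unsigned Threshold) {
  if (!C)
    return false; // bail out on an impossible-to-cost input
  return *C <= Threshold;
}
```

The point of the design is that the caller changes behavior on an invalid cost rather than crashing.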

Differential Revision: https://reviews.llvm.org/D127131
2022-06-09 15:17:24 -07:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Nikita Popov 56c9976d46 [IndVarSimplify] Don't assert that terminator is not SCEVable (PR55925)
The IV widening code currently asserts that terminators aren't SCEVable
-- however, this is not the case for invokes with a returned attribute.

As far as I can tell, this assertion is not necessary -- even if we
have a critical edge (the second test case), the trunc gets inserted
in a legal position.

Fixes https://github.com/llvm/llvm-project/issues/55925.

Differential Revision: https://reviews.llvm.org/D127288
2022-06-09 10:12:13 +02:00
Chuanqi Xu 0e10f12844 [NFC] Remove commented cerr debugging loggings
There are some unused, commented-out cerr debug logging statements in the code. It is odd
to keep such commented-out debug helpers in the product.
2022-06-08 15:58:06 +08:00
Martin Sebor dd2a6d78ee [InstCombine] Fold memchr of sequences of same characters
Enhance memchr libcall folder to handle constant arrays consisting
of one or two sequences of consecutive equal characters.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126515
2022-06-07 13:45:10 -06:00
Martin Sebor fb6627fa0c [InstCombine] Add substr helper function (NFC).
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126515
2022-06-07 13:27:36 -06:00
Nikita Popov 7fa97b473c [SCCP] Don't mark ranges from branch conditions as potentially undef
Now that transforms introducing branch on poison have been removed,
we can stop marking ranges that have been derived from branch
conditions as containing undef. The existing comment explains why
this is legal. I've checked that alive2 is happy with SCCP tests
after this change.

Differential Revision: https://reviews.llvm.org/D126647
2022-06-07 10:20:24 +02:00
Fangrui Song d86a206f06 Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 00:31:44 -07:00
Kazu Hirata 2c4d52467a [Transforms/Utils] Use predecessors (NFC) 2022-06-05 00:16:14 -07:00
Fangrui Song 36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Fangrui Song 557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Augie Fackler 73f664601c BuildLibCalls: infer allockind attributes on relevant functions
Differential Revision: https://reviews.llvm.org/D123089
2022-05-31 10:01:17 -04:00
Augie Fackler 42861faa8e attributes: introduce allockind attr for describing allocator fn behavior
I chose to encode the allockind information in a string constant because
otherwise we would get a bit of an explosion of keywords to deal with
the possible permutations of allocation function types.

I'm not sure that CodeGen.h is the correct place for this enum, but it
seemed to kind of match the UWTableKind enum so I put it in the same
place. Constructive suggestions on a better location most certainly
encouraged.

Differential Revision: https://reviews.llvm.org/D123088
2022-05-31 10:01:17 -04:00
Nikita Popov 2e101cca69 [Local] Don't remove invoke of non-willreturn function
The code was only checking for memory side-effects, but not for
divergence side-effects. Replace this with a generic check.
2022-05-30 15:37:46 +02:00
serge-sans-paille fb67d683db [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since 7030654296 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D126417
2022-05-26 08:12:34 +02:00
Alexey Bataev 10f41a2147 [SLP]Fix PR55688: Miscompile due to incorrect nuw/nsw handling.
Need to use all ReductionOps when propagating flags for the reduction
ops, otherwise transformation is not correct. Plus, need to drop nuw/nsw
flags.

Differential Revision: https://reviews.llvm.org/D126371
2022-05-25 13:59:06 -07:00
Martin Sebor 46c0ec9df4 [InstCombine] Fold memrchr calls with sequences of identical bytes.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123631
2022-05-24 17:00:11 -06:00
Nikita Popov 81c648a3d9 [LoopUnroll] Freeze tripcount rather than condition
This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue case). The previous patch only froze the
condition on the first branch.

Rather than independently freezing the second condition, this patch
instead freezes TripCount and bases BECount on it. These are the
two quantities involved in the conditions, and this ensures that
both work on a consistent, non-poisonous trip count.

Differential Revision: https://reviews.llvm.org/D125896
2022-05-24 09:42:39 +02:00
Hendrik Greving 4f93d5cc1d [BasicBlockUtils] Do not move loop metadata if outer loop header.
Fixes a bug preventing moving the loop's metadata to an outer loop's header,
which happens if the loop's exit is also the header of an outer loop.

Adjusts test for above.

Fixes #55416.

Differential Revision: https://reviews.llvm.org/D125574
2022-05-23 16:39:54 -07:00
NAKAMURA Takumi 6ca7eb2c6d [SCEV] Part 1, Serialize function calls in function arguments.
Evaluation ordering in function call arguments is implementation-dependent.
In fact, gcc evaluates bottom-top and clang does top-bottom.
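A small sketch of what serializing the calls means, outside of SCEV. The helpers `record`, `combine`, and `serialized` are hypothetical and only illustrate the ordering issue: writing `combine(record(Log, 'a', 10), record(Log, 'b', 3))` leaves the order of the two `record` calls up to the compiler, while named temporaries fix it.

```cpp
#include <cassert>
#include <string>

// Hypothetical side-effecting helper: appends Tag to Log and returns V.
int record(std::string &Log, char Tag, int V) {
  Log += Tag;
  return V;
}

int combine(int A, int B) { return A - B; }

// Each call is its own full-expression, so 'a' is always logged before 'b',
// regardless of how the compiler orders argument evaluation.
int serialized(std::string &Log) {
  int A = record(Log, 'a', 10);
  int B = record(Log, 'b', 3);
  return combine(A, B);
}
```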

Fixes #55283 partially.

Part of https://reviews.llvm.org/D125627
2022-05-18 23:20:08 +09:00
Sun Ziping 242961f23b [llvm][fix-irreducible] ensure that loop subtree under child is correctly reconnected to new loop
The modified function was incorrectly (not unnecessarily) ignoring grandchild
loops, and this change fixes the bug. In particular, this fixes the handling of
the loop { inner, body }. The TODO in the same function is talking about the b1
self loop, which may be "unnecessarily" lost, but that is a different issue.
2022-05-18 10:45:52 +01:00
Nikita Popov e9a1c82d69 [SCEVExpander] Expand umin_seq using freeze
%x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y).
This patch changes the expansion to umin(%x, freeze %y) instead
(https://alive2.llvm.org/ce/z/wujUhp).
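The two expansions can be compared with a toy model, assuming poison is represented as an empty `std::optional` and `freeze` as picking an arbitrary fixed value. `Val`, `uminSeq`, `uminFreeze`, and `FrozenChoice` are hypothetical names for this sketch; it is not SCEVExpander code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>

using Val = std::optional<uint32_t>; // empty models poison (assumption)

// %x umin_seq %y, i.e. %x == 0 ? 0 : umin(%x, %y):
// %y is only observed when %x is non-zero.
Val uminSeq(Val X, Val Y) {
  if (!X)
    return std::nullopt;
  if (*X == 0)
    return Val(0u);
  if (!Y)
    return std::nullopt;
  return Val(std::min(*X, *Y));
}

// umin(%x, freeze %y): a poison %y is replaced by some fixed value,
// represented here by FrozenChoice.
Val uminFreeze(Val X, Val Y, uint32_t FrozenChoice) {
  if (!X)
    return std::nullopt;
  uint32_t FY = Y ? *Y : FrozenChoice;
  return Val(std::min(*X, FY));
}
```

In this model, when `%x` is zero both forms give zero no matter what `%y` is. When `%x` is non-zero and `%y` is poison, the first form is poison while the freeze form yields some concrete value, which is a permitted refinement.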

The motivation for this change are the test cases affected by
D124910, where the freeze expansion ultimately produces better
optimization results. This is largely because
`(%x umin_seq %y) == %x` is a common expansion pattern, which
reliably optimizes in freeze representation, but only sometimes
with the zero comparison (in particular, if %x == 0 can fold to
something else, we generally won't be able to recover reasonable
code from this.)

Differential Revision: https://reviews.llvm.org/D125372
2022-05-18 09:53:07 +02:00
Nikita Popov 323514de58 [LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits
When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first iteration,
such that we never branch on the latch exit condition. As such, we
need to freeze the condition of the new branch that is introduced
before the loop, as it now executes unconditionally.

Differential Revision: https://reviews.llvm.org/D125754
2022-05-18 09:51:22 +02:00
Sanjay Patel be7f09f7b2 [IR] create and use helper functions that test the signbit; NFCI 2022-05-16 11:26:23 -04:00
Florian Hahn b7315ffc3c [LAA,LV] Add initial support for pointer-diff memory checks.
This patch adds initial support for a pointer diff based runtime check
scheme for vectorization. This scheme requires fewer computations and
checks than the existing full overlap checking, if it is applicable.

The main idea is to only check if source and sink of a dependency are
far enough apart so the accesses won't overlap in the vector loop. To do
so, it is sufficient to compute the difference and compare it to
`VF * UF * AccessSize`. It is sufficient to check
`(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards
dependence in the vector loop with the given VF and UF. If Src >=u Sink,
there is no dependence preventing vectorization, hence the overflow
should not matter and using the ULT should be sufficient.
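The comparison can be sketched on plain integers. This is only an illustration of the arithmetic, assuming 64-bit addresses; `mayConflict` is a hypothetical helper, not the code LoopAccessAnalysis generates.

```cpp
#include <cassert>
#include <cstdint>

// Returns true when source and sink are too close together, i.e. the
// runtime check fails and the vector loop must not be taken.
bool mayConflict(uint64_t Src, uint64_t Sink, uint64_t VF, uint64_t UF,
                 uint64_t AccessSize) {
  // Unsigned subtraction wraps when Src > Sink, producing a very large
  // value that fails the unsigned less-than, so no conflict is reported.
  return (Sink - Src) < VF * UF * AccessSize;
}
```

With `VF = 4`, `UF = 1` and 4-byte accesses, one vector iteration covers 16 bytes, so a sink 8 bytes after the source conflicts while a sink 16 bytes after it does not.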

Note that the initial version is restricted in multiple ways:

1. Pointers must only either be read or written, by a single
   instruction (this allows re-constructing source/sink for
   dependences with the available information)
2. Source and sink pointers must be add-recs, with matching steps
3. The step must be a constant.
4. abs(step) == AccessSize.

Most of those restrictions can be relaxed in the future.

See https://github.com/llvm/llvm-project/issues/53590.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D119078
2022-05-16 15:27:22 +01:00
Alexander Shaposhnikov badd088c57 [GlobalOpt] Enable optimization of constructors with different priorities
Adjust `optimizeGlobalCtorsList` to handle the case of different priorities.
This addresses the issue https://github.com/llvm/llvm-project/issues/55083.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D125278
2022-05-13 22:19:29 +00:00
Nikita Popov c1bb4a881e [SCEVExpander] Deduplicate min/max expansion code (NFC) 2022-05-11 12:11:11 +02:00
Alexander Shaposhnikov da823382d2 [Transform][Utils][NFC] Clean up CtorUtils.cpp 2022-05-11 01:07:54 +00:00
Nick Desaulniers c167c0a4dc [BuildLibCalls] infer inreg param attrs from NumRegisterParameters
We're having a hard time booting the ARCH=i386 Linux kernel with clang
after removing -ffreestanding because instcombine was dropping inreg
from callers during libcall simplification, but not the callees defined
in different translation units. This led the callers and callees to have
wildly different calling conventions, which (predictably) blew up at
runtime.

Infer the inreg param attrs on function declarations from the module
metadata "NumRegisterParameters." This allows us to boot the ARCH=i386
Linux kernel (w/ -ffreestanding removed).

Fixes: https://github.com/llvm/llvm-project/issues/53645

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D125285
2022-05-10 16:21:17 -07:00
Nikita Popov 0eafef1171 [SCEVExpander] Remove handling for mixed int/pointer min/max (NFCI)
Mixed int/pointer min/max are no longer possible.
2022-05-10 15:11:39 +02:00
Hongtao Yu 9641b9be9d [Inliner] Preserve !prof metadata when converting call to invoke.
When a callee function is inlined via an invoke instruction, every function call inside the callee, if not an invoke, will be converted to an invoke after being cloned into the caller body. I found that during the conversion the !prof metadata was dropped. This in turn caused a cloned indirect call to not be properly promoted in subsequent passes.

The particular scenario I was investigating was with AutoFDO and thinLTO. In prelink, no ICP was triggered (neither by the sample loader nor PGO ICP), no indirect call was promoted. This is because 1) the particular indirect call did not have inlined samples;  and 2) PGO ICP was intentionally disabled.  After inlining, the prof metadata was dropped. Then in postlink, PGO ICP jumped in but didn't do anything. Thus the opportunity was missed.

I'm making a simple fix to preserve !prof metadata when converting call to invoke.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D125249
2022-05-09 15:08:09 -07:00
Augie Fackler 1deea714b3 BuildLibCalls: simplify switch statement slightly
Per feedback on D123086 after submit.

Also added a test for vec_malloc et al attribute inference to show it's
doing the right thing.

The new tests exposed a defect, corrected by adding vec_free to the list of
free functions in MemoryBuiltins.cpp, which had been overlooked all the
way back in D94710, over a year ago.

Differential Revision: https://reviews.llvm.org/D124859
2022-05-03 13:17:33 -04:00
Jonas Paulsson 304378fd09 Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building
libcalls." (was 0f8c626). This reverts commit 14d9390.

The patch previously failed to recognize cases where user had defined a
function alias with an identical name as that of the library
function. Module::getFunction() would then return nullptr which is what the
sanitizer discovered.

In this updated version a new function isLibFuncEmittable() has also been
introduced, which is now used instead of TLI->has() anytime a library function
is to be emitted. It additionally makes sure there is e.g. no function
alias with the same name in the module.

Reviewed By: Eli Friedman

Differential Revision: https://reviews.llvm.org/D123198
2022-05-02 19:37:00 +02:00
Augie Fackler c7ae423e39 BuildLibCalls: add alloc-family attribute to many allocator functions
Differential Revision: https://reviews.llvm.org/D123086
2022-05-02 11:12:55 -04:00
Augie Fackler e940456531 BuildLibCalls: infer allocptr attribute for free and realloc() family functions
Differential Revision: https://reviews.llvm.org/D123084
2022-05-02 09:43:21 -04:00
Nikita Popov aae5f8115a [Local] Consider atomic loads from constant global as dead
Per the guidance in
https://llvm.org/docs/Atomics.html#atomics-and-ir-optimization,
an atomic load from a constant global can be dropped, as there can
be no stores to synchronize with. Any write to the constant global
would be UB.

IPSCCP will already drop such loads, but the main helper in Local
doesn't recognize this currently. This is motivated by D118387.

Differential Revision: https://reviews.llvm.org/D124241
2022-05-02 10:52:58 +02:00
Florian Hahn a80081763c [SimplifyCFG] Avoid shifting by a too large exponent.
TI->getBitWidth can be > 64 and in those cases the shift will be UB due
to the exponent being too large.

To fix this, cap the shift at 63. I think this should work out fine,
because TableSize is itself a 64 bit type and the maximum table size
must fit in the type. Also, if we would underestimate the size here, at
most we get an extra ZExt.
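A minimal sketch of the clamping idea, assuming a 64-bit result type. `cappedPowerOfTwo` is a hypothetical helper for illustration and not the actual SimplifyCFG code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Computes 2^BitWidth in a 64-bit integer, saturating the exponent at 63.
// Shifting a 64-bit value by 64 or more is undefined behavior in C++,
// so the exponent is clamped before the shift.
uint64_t cappedPowerOfTwo(unsigned BitWidth) {
  return uint64_t(1) << std::min(BitWidth, 63u);
}
```

For widths above 63 the result underestimates the true value, which matches the message's remark that an underestimate is acceptable here.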

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124608
2022-04-29 15:19:06 +01:00
Nikita Popov 884e9a877b [SimplifyCFG] Replace condition value when threading
Replace the condition value with the known constant value on the
threaded edge. This happens implicitly with phi threading because
we replace with the incoming value, but not for non-phi threading.
2022-04-29 09:50:27 +02:00
Nikita Popov 4e545bdb35 [SimplifyCFG] Thread branches on same condition in more cases (PR54980)
SimplifyCFG implements basic jump threading, if a branch is
performed on a phi node with constant operands. However,
InstCombine canonicalizes such phis to the condition value of a
previous branch, if possible. SimplifyCFG does support this as
well, but only in the very limited case where the same condition
is used in a direct predecessor -- notably, this does not include
the common diamond pattern (i.e. two consecutive if/elses on the
same condition).

This patch extends the code to look back a limited number of
blocks to find a branch on the same value, rather than only
looking at the direct predecessor.

Fixes https://github.com/llvm/llvm-project/issues/54980.

Differential Revision: https://reviews.llvm.org/D124159
2022-04-29 09:44:05 +02:00
Arthur Eubanks 4e65291837 [OpaquePtr][GlobalOpt] Don't attempt to evaluate global constructors with arguments
Previously all entries in global_ctors had to have the void()* type and
we'd skip evaluating bitcasted functions. With opaque pointers we may
see the function directly.

Fixes #55147.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D124553
2022-04-27 19:00:44 -07:00
Martin Sebor efa0f12c0b [InstCombine] Fold strnlen calls in equality to zero.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123818
2022-04-27 12:03:24 -06:00
Alexandros Lamprineas a910337b5d [FuncSpec] Conditional jump or move depends on uninitialised value(s).
I found this bug when performing a two-stage build of clang with
Function Specialization enabled and tuned aggressively. The crash
appears only on release builds.

Fixes https://github.com/llvm/llvm-project/issues/55000.

Before accessing the contents of the ArgInfo iterator inside
SCCPInstVisitor::markArgInFuncSpecialization, we should be
checking that the iterator is valid.
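The general shape of such a check, sketched with a standard container. `lookupArg` and the `ArgInfo` map are hypothetical; the real data structures inside SCCPInstVisitor differ.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>

// Look up an entry and only read through the iterator once it is known
// to be valid. Dereferencing end() is undefined behavior and may appear
// to work in some builds while reading uninitialised memory in others.
std::optional<int> lookupArg(const std::map<std::string, int> &ArgInfo,
                             const std::string &Name) {
  auto It = ArgInfo.find(Name);
  if (It == ArgInfo.end())
    return std::nullopt;
  return It->second;
}
```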

Differential Revision: https://reviews.llvm.org/D124114
2022-04-27 07:28:25 +01:00
Martin Sebor ffed0cfcdb [SimplifyLibCalls] avoid slicing 64-bit integers in an ILP32 build (PR #54739)
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123472
2022-04-26 17:20:56 -06:00
Martin Sebor 449adafabe [InstCombine] Fold strnlen of constant strings.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123817
2022-04-26 16:15:28 -06:00
Martin Sebor ce8f42d4af [InstCombine] Fold memrchr calls with a constant character.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123629
2022-04-26 14:02:50 -06:00
Martin Sebor 10c99ce67d [InstCombine] Fold memrchr calls with constant size, bail on excessive.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123626
Differential Revision: https://reviews.llvm.org/D123628
2022-04-26 14:02:50 -06:00
Martin Sebor 25febbd155 [InstCombine] Fold strnlen with a bound of zero and one.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123816
2022-04-26 14:02:50 -06:00
Martin Sebor 2807c420cd [InstCombine] add a strnlen handler stub.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123815
2022-04-26 14:02:49 -06:00
Augie Fackler a907d36cfe Attributes: add a new `allocptr` attribute
This continues the push away from hard-coded knowledge about functions
towards attributes. We'll use this to annotate free(), realloc() and
cousins and obviate the hard-coded list of free functions.

Differential Revision: https://reviews.llvm.org/D123083
2022-04-26 13:57:11 -04:00
Igor Kudrin 39ce68886b [LoopPeel][NFCI] Simplify the code to calculate peel count for PGO
This reorganizes the code as a preparation for D123865:

 * Use more descriptive names for variables
 * Simplify a condition by using an already calculated value
   for `MaxPeelCount`
 * Remove a duplicate log entry
 * Report basic values for loop costs

Differential Revision: https://reviews.llvm.org/D124388
2022-04-26 18:44:24 +04:00
Igor Kudrin c71890e158 [LoopPeel][NFC] Exit early if there is no room for peeling
Differential Revision: https://reviews.llvm.org/D123864
2022-04-26 18:43:56 +04:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Paul Kirth 4683a2effa [llvm][misexpect] Avoid division by 0 when using sample profiling
MisExpect diagnostics should not prevent compilation from succeeding, and the
assertion is insufficient to prevent division by zero in release builds.

This patch addresses that by replacing the assert with an early return.
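Why an assert alone is not enough can be sketched as follows. `expectedPercent` is a hypothetical function, not the MisExpect implementation: an `assert(TotalWeight != 0)` disappears when `NDEBUG` is defined, so a release build would still divide by zero, whereas an early return protects every build mode.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Share of the total profile weight carried by the expected branch,
// in percent. Returns no value when there is nothing to divide by.
std::optional<double> expectedPercent(uint64_t ExpectedWeight,
                                      uint64_t TotalWeight) {
  if (TotalWeight == 0)
    return std::nullopt; // early return instead of assert
  return 100.0 * static_cast<double>(ExpectedWeight) /
         static_cast<double>(TotalWeight);
}
```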

Additionally, it disables MisExpect diagnostics when using sample profiling,
since this is the only known case where this error has manifested.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D124302
2022-04-22 22:48:00 +00:00
Nikita Popov 993b166deb Reapply [SimplifyCFG] Handle branch on same condition in pred more directly
Reapplying without changes, after a fix to a dependent patch.

-----

Rather than creating a PHI node and then using the PHI threading
code, directly handle this case in
FoldCondBranchOnValueKnownInPredecessor().

This change is supposed to be NFC-ish, but may cause changes due
to different transform order.
2022-04-22 10:27:38 +02:00
Nikita Popov df18e37541 Reapply [SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension (NFCI)
Reapply with SmallMapVector instead of SmallDenseMap, which should
address the non-determinism issue.

-----

This general threading transform can be performed whenever we know
a constant value for the condition in a predecessor, which would
currently just be the case of a phi node with constant arguments.
2022-04-22 09:42:11 +02:00
Fangrui Song 35e350d5ba Revert "[SimplifyCFG] Handle branch on same condition in pred more directly" and "[SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension"
This reverts commit 3df86e799e.
This reverts commit 8988254667.

`[SimplifyCFG] Handle branch on same condition in pred more directly`
caused non-determinism when compiling opt with a bootstrapped clang.
I have to revert the dependent commit as well.
2022-04-21 12:58:58 -07:00
Nikola Tesic c5600aef88 [Debugify] Limit number of processed functions for original mode
Debugify in OriginalDebugInfo mode does (DebugInfo) collect-before-pass & check-after-pass
for each instruction, which is pretty expensive. When used to analyze DebugInfo losses
in large projects (like LLVM), this raises the build time unacceptably.
This patch introduces a limit on the number of processed functions per compile unit.
By default, the limit is set to UINT_MAX (practically unlimited); the newly introduced
option -debugify-func-limit can set it to any positive integer.

Differential revision: https://reviews.llvm.org/D115714
2022-04-21 13:58:17 +02:00
Nikita Popov 3df86e799e [SimplifyCFG] Handle branch on same condition in pred more directly
Rather than creating a PHI node and then using the PHI threading
code, directly handle this case in
FoldCondBranchOnValueKnownInPredecessor().

This change is supposed to be NFC-ish, but may cause changes due
to different transform order.
2022-04-21 11:22:02 +02:00
Nikita Popov 8988254667 [SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension
This general threading transform can be performed whenever we know
a constant value for the condition in a predecessor, which would
currently just be the case of a phi node with constant arguments.
2022-04-21 10:49:49 +02:00
Nikita Popov d727505e40 [SimplifyCFG] Remove one-use limitation in FoldCondBranchOnPHI()
BlockIsSimpleEnoughToThreadThrough() already checks that the phi
(and all other instructions) are not used outside the block, so
this one-use check is not necessary for legality. I also don't
see any reason why it would be necessary for profitability (in
fact, those extra uses will be replaced with constants, which
should be generally profitable).
2022-04-20 15:56:20 +02:00
Fangrui Song 14d9390721 Revert D123198 "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls."
test/Transforms/InstCombine/pr39177.ll failed in a -DLLVM_USE_SANITIZER=Undefined build.
```
lib/Transforms/Utils/BuildLibCalls.cpp:1217:17: runtime error: reference binding to null pointer of type 'llvm::Function'
```
`Function &F = *M->getFunction(Name);`

This reverts commit 0f8c626723.
2022-04-19 22:26:10 -07:00
Paul Kirth bac6cd5bf8 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) For IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-04-19 21:23:48 +00:00
Jonas Paulsson 0f8c626723 [BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls.
A new set of overloaded functions named getOrInsertLibFunc() are now supposed
to be used instead of getOrInsertFunction() when building a libcall from
within an LLVM optimizer(). The idea is that this new function also makes
sure that any mandatory argument attributes are added to the function
prototype (after calling getOrInsertFunction()).

inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it
only adds attributes that are not necessary for correctness but merely
helping with later optimizations.

Generally, the front end is responsible for building a correct function
prototype with the needed argument attributes. If the middle end however is
the one creating the call, e.g. when replacing one libcall with another, it
then must take this responsibility.

This continues the work of properly handling argument extension if required
by the target ABI when building a lib call. getOrInsertLibFunc() now does
this for all libcalls currently built by any LLVM optimizer. It is expected
that when in the future a new optimization builds a new libcall with an
integer argument it is to be added to getOrInsertLibFunc() with the proper
handling. Note that not all targets have it in their ABI to sign/zero extend
integer arguments to the full register width, but this will be done
selectively as determined by getExtAttrForI32Param().

Review: Eli Friedman, Nikita Popov, Dávid Bolvanský

Differential Revision: https://reviews.llvm.org/D123198
2022-04-19 21:22:07 +02:00
Joseph Huber 984a0dc386 [OpenMP] Use new offloading binary when embedding offloading images
The previous patch introduced the offloading binary format so we can
store some metadata along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.

Differential Revision: https://reviews.llvm.org/D122683
2022-04-15 20:35:26 -04:00
chenglin.bi 00871e2f4f [SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select
When a switch with 2^n cases goes to one result, check whether the 2^n cases can be covered by n mask bits.
If so, we can use "and condition, ~mask" to simplify the switch.

case 0 2 4 6 -> and condition, -7
https://alive2.llvm.org/ce/z/jjH_0N

case 0 2 8 10 -> and condition, -11
https://alive2.llvm.org/ce/z/K7E-2V

case 2 4 10 12 -> and (sub condition, 2), -11
https://alive2.llvm.org/ce/z/CrxbYg
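The mask test can be checked on plain integers. The functions below are hypothetical illustrations, assuming the masked value is compared against zero to select the shared result; they are not the code SimplifyCFG emits.

```cpp
#include <cassert>
#include <cstdint>

// Reference: the original switch, where all listed cases share one result.
bool viaSwitch(uint32_t X) {
  switch (X) {
  case 0:
  case 2:
  case 4:
  case 6:
    return true;
  default:
    return false;
  }
}

// Cases {0, 2, 4, 6} differ only in bits 1 and 2 (mask 6), and -7 == ~6,
// so clearing those bits leaves zero exactly for the listed cases.
bool viaMask(uint32_t X) { return (X & ~6u) == 0; }

// With an offset, the smallest case is subtracted first. Here -11 == ~10,
// so (X - 2) must be one of {0, 2, 8, 10}, i.e. X in {2, 4, 10, 12}.
bool viaOffsetMask(uint32_t X) { return ((X - 2u) & ~10u) == 0; }
```

Four case comparisons become one `and` plus one comparison, which is what makes the select form cheaper than the switch.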

Fix one case of https://github.com/llvm/llvm-project/issues/39957

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D122485
2022-04-15 00:10:00 +08:00
Ruiling Song 1e01f95057 LowerSwitch: Avoid inserting NewDefault block
The NewDefault was used to simplify the updating of PHI nodes, but it
causes some inefficiency for targets that run the structurizer later. For
example, for a simple two-case switch, the extra NewDefault causes an
unstructured CFG like:

        O
       / \
      O   O
     / \ / \
    C1  ND C2
     \  |  /
      \ | /
        D

The change is to avoid the ND (NewDefault) block, so that we get a
structured CFG for the above example like:

        O
       / \
      /   \
     O     O
    / \   / \
   C1  \ /  C2
    \-> D <-/

The IR change introduced by this patch should be trivial to other targets,
so I am doing this unconditionally.

Fall-through among the cases will also cause an unstructured CFG, but it needs
more work and will be addressed in a separate change.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D123607
2022-04-14 13:30:56 +08:00
Sanjay Patel 0ef46dc0f9 [SimplifyCFG] improve readability in switch-to-select; NFC 2022-04-13 17:14:45 -04:00
serge-sans-paille 262eba01b3 Revert "[ValueTracking] Make getStringLenth aware of strdup"
This reverts commit e810d55809.

The commit did not take into account the fact that a strduped string could be
modified. Checking whether such a modification happens would make the function very
costly; without a test case in mind, it's not worth the effort.
2022-04-13 19:17:28 +02:00
Nikita Popov 8c74169990 [SimplifyLibCalls] Don't mark memchr() memory as fully dereferenceable
C11 specifies memchr() as follows:

> The memchr function locates the first occurrence of c (converted
> to an unsigned char) in the initial n characters (each interpreted
> as unsigned char) of the object pointed to by s. The implementation
> shall behave as if it reads the characters sequentially and stops
> as soon as a matching character is found.

In particular, it is well-defined to specify a memchr size larger
than the underlying object, as long as the character is found before
the end of the object.
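The "as if it reads the characters sequentially" wording can be modeled with a reference loop. `seqMemchr` is a hypothetical model written for this note, not a libc implementation: because it stops at the first match, it never touches bytes past that match even when `N` exceeds the object size.

```cpp
#include <cassert>
#include <cstddef>

// Sequential model of memchr: inspect one byte at a time and stop at the
// first match, so bytes after the match are never read.
const unsigned char *seqMemchr(const void *S, int C, std::size_t N) {
  const unsigned char *P = static_cast<const unsigned char *>(S);
  const unsigned char Target = static_cast<unsigned char>(C);
  for (std::size_t I = 0; I < N; ++I)
    if (P[I] == Target)
      return P + I;
  return nullptr;
}
```

This is why the size argument alone does not imply that all `N` bytes are dereferenceable.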

Differential Revision: https://reviews.llvm.org/D123665
2022-04-13 16:46:18 +02:00
Sanjay Patel cd0d0d633b [SimplifyCFG] make a debug option for case max when converting switch to select
This should be "NFC" as written, but it will make D122485 smaller
and give us more flexibility to experiment with optimization level
vs. compile-time.

Differential Revision: https://reviews.llvm.org/D123625
2022-04-13 06:55:13 -04:00
Sanjay Patel d9211be13d [SimplifyCFG] cleanup code for converting switch to select (NFC)
This renames functions for more general usage (and current capitalization style)
before a proposed logic change in D122485.

Differential Revision: https://reviews.llvm.org/D123614
2022-04-12 12:17:54 -04:00
serge-sans-paille e810d55809 [ValueTracking] Make getStringLenth aware of strdup
During strlen compile-time evaluation, make it possible to track size of
strduped strings.

Differential Revision: https://reviews.llvm.org/D123497
2022-04-12 14:47:29 +02:00
Nikita Popov 9af8cc8d17 [SimplifyLibCalls] Remove unnecessary inbounds check
Even if the GEP is not inbounds, the GEP will have provenance of
the global, and accessing past the extent of the global would be
undefined behavior.
2022-04-11 16:51:09 +02:00
Matt Arsenault 9fdd25848a Transforms: Fix code duplication between LowerAtomic and AtomicExpand 2022-04-08 19:06:36 -04:00
Evgeniy Brevnov da41214d65 Add support for atomic memory copy lowering
Currently, the utility supports lowering of non-atomic memory transfer routines only. This patch adds support for the atomic version of memcopy. This may be useful for targets not supporting atomic memcopy.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118443
2022-04-08 10:41:31 +07:00
Augie Fackler b916414096 BuildLibCalls: also set allocsize() attributes
This is part of being able to get rid of two more columns in
MemoryBuiltins.cpp's large table. We'll have two more changes before
we can finish the job.

Differential Revision: https://reviews.llvm.org/D119582
2022-04-07 12:38:44 -04:00
Benjamin Kramer ff485d727f Transforms: Remove unused include
Utils can't depend on Scalar transforms.
2022-04-07 10:40:28 +02:00
Matt Arsenault 39f1568633 Transforms: Split LowerAtomics into separate Utils and pass
This will allow code sharing from AtomicExpandPass. Not entirely sure
why these exist as separate passes though.
2022-04-06 20:54:45 -04:00
Nikita Popov 1dc1d5a0d2 [SimplifyLibCalls] Use KnownBits helper APIs (NFC)
Use helper APIs for isNonNegative() and getMaxValue() instead of
flipping the zero value and having a long comment explaining why
that is necessary.
2022-04-06 16:01:24 +02:00
Martin Storsjö 46776f7556 Fix warnings about variables that are set but only used in debug mode
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.
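A minimal sketch of the idiom, using a hypothetical function `countPositive`. When `NDEBUG` is defined, `assert` expands to nothing, so a variable read only inside it would trigger a set-but-unused warning without the void cast.

```cpp
#include <cassert>

// Counts positive entries. Visited exists only to support the assertion.
int countPositive(const int *Data, int N) {
  int Count = 0;
  int Visited = 0; // only read inside assert()
  for (int I = 0; I < N; ++I) {
    ++Visited;
    if (Data[I] > 0)
      ++Count;
  }
  assert(Visited == N && "every element must be visited");
  (void)Visited; // marks the variable as used when the assert is compiled out
  return Count;
}
```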

Differential Revision: https://reviews.llvm.org/D123117
2022-04-06 10:01:46 +03:00