6400 Commits

Author SHA1 Message Date
Chuanqi Xu d029db9e8a [NFC] Fix Wswitch warning triggered by 735e6c 2022-06-14 14:45:15 +08:00
Guillaume Chatelet 2887dd754e [NFC][Alignment] Use getAlign in VNCoercion 2022-06-13 15:13:05 +00:00
Nikita Popov 571c713144 [SimplifyCFG] Handle trapping aggregates (PR49839)
Handle the fact that not only constant expressions, but also
constant aggregates containing expressions can trap.

This still doesn't fix the original C reproducer, probably due to
more issues remaining in other passes.
2022-06-13 14:56:49 +02:00
Hans Wennborg 3800b157d7 [SimplifyCFG] Share code to compute switch density between ShouldBuildLookupTable() and ReduceSwitchRange()
They're computing the same thing. No functionality change.

Differential revision: https://reviews.llvm.org/D127482
2022-06-10 15:29:36 +02:00
Nikita Popov d77f944832 [LoopInfo] Add getOutermostLoop() (NFC)
This is a recurring pattern, add an API function for it.
2022-06-10 11:48:21 +02:00
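The recurring pattern mentioned in the `getOutermostLoop()` commit above can be sketched roughly as follows. `ToyLoop` and `outermostLoop` are hypothetical stand-ins for illustration, not LLVM's `Loop` class or its actual API:

```cpp
#include <cassert>

// Toy stand-in for a node in a loop nest; this is not LLVM's Loop class.
struct ToyLoop {
  ToyLoop *Parent = nullptr; // Enclosing loop, or nullptr for a top-level loop.
};

// Walk the parent links until reaching a loop that has no enclosing loop.
const ToyLoop *outermostLoop(const ToyLoop *L) {
  assert(L && "expected a loop");
  while (L->Parent)
    L = L->Parent;
  return L;
}
```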
Philip Reames f85c5079b8 Pipe potentially invalid InstructionCost through CodeMetrics
Per the documentation in Support/InstructionCost.h, the purpose of an invalid cost is so that clients can change behavior on impossible to cost inputs. CodeMetrics was instead asserting that invalid costs never occurred.

On a target with an incomplete cost model - e.g. RISCV - this means that transformations would crash on (falsely) invalid constructs - e.g. scalable vectors. While we certainly should improve the cost model - and I plan to do so in the near future - we also shouldn't be crashing. This violates the explicitly stated purpose of an invalid InstructionCost.

I updated all of the "easy" consumers where bailouts were locally obvious. I plan to follow up with loop unroll in a following change.

Differential Revision: https://reviews.llvm.org/D127131
2022-06-09 15:17:24 -07:00
Simon Moll b8c2781ff6 [NFC] format InstructionSimplify & lowerCaseFunctionNames
Clang-format InstructionSimplify and convert all "FunctionName"s to
"functionName".  This patch does touch a lot of files but gets done with
the cleanup of InstructionSimplify in one commit.

This is the alternative to the less invasive clang-format only patch: D126783

Reviewed By: spatel, rengolin

Differential Revision: https://reviews.llvm.org/D126889
2022-06-09 16:10:08 +02:00
Nikita Popov 56c9976d46 [IndVarSimplify] Don't assert that terminator is not SCEVable (PR55925)
The IV widening code currently asserts that terminators aren't SCEVable
-- however, this is not the case for invokes with a returned attribute.

As far as I can tell, this assertion is not necessary -- even if we
have a critical edge (the second test case), the trunc gets inserted
in a legal position.

Fixes https://github.com/llvm/llvm-project/issues/55925.

Differential Revision: https://reviews.llvm.org/D127288
2022-06-09 10:12:13 +02:00
Chuanqi Xu 0e10f12844 [NFC] Remove commented cerr debugging loggings
There are some commented-out cerr debug logging statements in the code. It is
odd to leave such commented-out debug helpers in the product.
2022-06-08 15:58:06 +08:00
Martin Sebor dd2a6d78ee [InstCombine] Fold memchr of sequences of same characters
Enhance memchr libcall folder to handle constant arrays consisting
of one or two sequences of consecutive equal characters.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126515
2022-06-07 13:45:10 -06:00
Martin Sebor fb6627fa0c [InstCombine] Add substr helper function (NFC).
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D126515
2022-06-07 13:27:36 -06:00
Nikita Popov 7fa97b473c [SCCP] Don't mark ranges from branch conditions as potentially undef
Now that transforms introducing branch on poison have been removed,
we can stop marking ranges that have been derived from branch
conditions as containing undef. The existing comment explains why
this is legal. I've checked that alive2 is happy with SCCP tests
after this change.

Differential Revision: https://reviews.llvm.org/D126647
2022-06-07 10:20:24 +02:00
Fangrui Song d86a206f06 Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 00:31:44 -07:00
Kazu Hirata 2c4d52467a [Transforms/Utils] Use predecessors (NFC) 2022-06-05 00:16:14 -07:00
Fangrui Song 36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Fangrui Song 557efc9a8b [llvm] Remove unneeded cl::ZeroOrMore for cl::opt options. NFC
Some cl::ZeroOrMore were added to avoid the `may only occur zero or one times!`
error. More were added due to cargo cult. Since the error has been removed,
cl::ZeroOrMore is unneeded.

Also remove cl::init(false) while touching the lines.
2022-06-03 21:59:05 -07:00
Augie Fackler 73f664601c BuildLibCalls: infer allockind attributes on relevant functions
Differential Revision: https://reviews.llvm.org/D123089
2022-05-31 10:01:17 -04:00
Augie Fackler 42861faa8e attributes: introduce allockind attr for describing allocator fn behavior
I chose to encode the allockind information in a string constant because
otherwise we would get a bit of an explosion of keywords to deal with
the possible permutations of allocation function types.

I'm not sure that CodeGen.h is the correct place for this enum, but it
seemed to kind of match the UWTableKind enum so I put it in the same
place. Constructive suggestions on a better location most certainly
encouraged.

Differential Revision: https://reviews.llvm.org/D123088
2022-05-31 10:01:17 -04:00
Nikita Popov 2e101cca69 [Local] Don't remove invoke of non-willreturn function
The code was only checking for memory side-effects, but not for
divergence side-effects. Replace this with a generic check.
2022-05-30 15:37:46 +02:00
serge-sans-paille fb67d683db [iwyu] Handle regressions in libLLVM header include
Running iwyu-diff on LLVM codebase since 7030654296 detected a few
regressions, fixing them.

Differential Revision: https://reviews.llvm.org/D126417
2022-05-26 08:12:34 +02:00
Alexey Bataev 10f41a2147 [SLP]Fix PR55688: Miscompile due to incorrect nuw/nsw handling.
Need to use all ReductionOps when propagating flags for the reduction
ops, otherwise transformation is not correct. Plus, need to drop nuw/nsw
flags.

Differential Revision: https://reviews.llvm.org/D126371
2022-05-25 13:59:06 -07:00
Martin Sebor 46c0ec9df4 [InstCombine] Fold memrchr calls with sequences of identical bytes.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123631
2022-05-24 17:00:11 -06:00
Nikita Popov 81c648a3d9 [LoopUnroll] Freeze tripcount rather than condition
This is a followup to D125754. We introduce two branches, one
before the unrolled loop and one before the epilogue (and similar
for the prologue case). The previous patch only froze the
condition on the first branch.

Rather than independently freezing the second condition, this patch
instead freezes TripCount and bases BECount on it. These are the
two quantities involved in the conditions, and this ensures that
both work on a consistent, non-poisonous trip count.

Differential Revision: https://reviews.llvm.org/D125896
2022-05-24 09:42:39 +02:00
Hendrik Greving 4f93d5cc1d [BasicBlockUtils] Do not move loop metadata if outer loop header.
Fixes a bug preventing moving the loop's metadata to an outer loop's header,
which happens if the loop's exit is also the header of an outer loop.

Adjusts test for above.

Fixes #55416.

Differential Revision: https://reviews.llvm.org/D125574
2022-05-23 16:39:54 -07:00
NAKAMURA Takumi 6ca7eb2c6d [SCEV] Part 1, Serialize function calls in function arguments.
Evaluation ordering in function call arguments is implementation-dependent.
In fact, gcc evaluates bottom-top and clang does top-bottom.
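
As an illustration of the problem, here is a minimal sketch; `nextId`, `makePair`, `unserialized`, and `serialized` are hypothetical names and not taken from the patch. The two calls in one argument list may run in either order, while binding each call to a local first fixes the order:

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper with a visible side effect: returns 0, 1, 2, ...
static int Counter = 0;
static int nextId() { return Counter++; }

static std::pair<int, int> makePair(int First, int Second) {
  return {First, Second};
}

// Order-dependent: the two nextId() calls may be evaluated in either order,
// so the result can be {0, 1} or {1, 0} depending on the compiler.
std::pair<int, int> unserialized() {
  Counter = 0;
  return makePair(nextId(), nextId());
}

// Serialized: each call is its own statement, so the order is fixed.
std::pair<int, int> serialized() {
  Counter = 0;
  int A = nextId();
  int B = nextId();
  return makePair(A, B);
}
```

Only `serialized()` has a compiler-independent result; `unserialized()` is guaranteed to contain 0 and 1, but not in which position.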

Fixes #55283 partially.

Part of https://reviews.llvm.org/D125627
2022-05-18 23:20:08 +09:00
Sun Ziping 242961f23b [llvm][fix-irreducible] ensure that loop subtree under child is correctly reconnected to new loop
The modified function was incorrectly (not unnecessarily) ignoring grandchild
loops, and this change fixes the bug. In particular, this fixes the handling of
the loop { inner, body }. The TODO in the same function is talking about the b1
self loop, which may be "unnecessarily" lost, but that is a different issue.
2022-05-18 10:45:52 +01:00
Nikita Popov e9a1c82d69 [SCEVExpander] Expand umin_seq using freeze
%x umin_seq %y is currently expanded to %x == 0 ? 0 : umin(%x, %y).
This patch changes the expansion to umin(%x, freeze %y) instead
(https://alive2.llvm.org/ce/z/wujUhp).

The motivation for this change are the test cases affected by
D124910, where the freeze expansion ultimately produces better
optimization results. This is largely because
`(%x umin_seq %y) == %x` is a common expansion pattern, which
reliably optimizes in freeze representation, but only sometimes
with the zero comparison (in particular, if %x == 0 can fold to
something else, we generally won't be able to cover reasonable
code from this.)

Differential Revision: https://reviews.llvm.org/D125372
2022-05-18 09:53:07 +02:00
Nikita Popov 323514de58 [LoopUnroll] Avoid branch on poison for runtime unroll with multiple exits
When performing runtime unrolling with multiple exits, one of the
earlier (non-latch) exits may exit the loop on the first iteration,
such that we never branch on the latch exit condition. As such, we
need to freeze the condition of the new branch that is introduced
before the loop, as it now executes unconditionally.

Differential Revision: https://reviews.llvm.org/D125754
2022-05-18 09:51:22 +02:00
Sanjay Patel be7f09f7b2 [IR] create and use helper functions that test the signbit; NFCI 2022-05-16 11:26:23 -04:00
Florian Hahn b7315ffc3c [LAA,LV] Add initial support for pointer-diff memory checks.
This patch adds initial support for a pointer diff based runtime check
scheme for vectorization. This scheme requires fewer computations and
checks than the existing full overlap checking, if it is applicable.

The main idea is to only check if source and sink of a dependency are
far enough apart so the accesses won't overlap in the vector loop. To do
so, it is sufficient to compute the difference and compare it to the
`VF * UF * AccessSize`. It is sufficient to check
`(Sink - Src) <u VF * UF * AccessSize` to rule out a backwards
dependence in the vector loop with the given VF and UF. If Src >=u Sink,
there is no dependence preventing vectorization, hence the overflow
should not matter and using the ULT should be sufficient.

Note that the initial version is restricted in multiple ways:

1. Pointers must only either be read or written, by a single
   instruction (this allows re-constructing source/sink for
   dependences with the available information)
2. Source and sink pointers must be add-recs, with matching steps
3. The step must be a constant.
4. abs(step) == AccessSize.

Most of those restrictions can be relaxed in the future.
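
The comparison described above can be sketched with plain integers standing in for pointer values. `pointerDiffMayConflict` is a hypothetical name, and this is only the arithmetic of the check, not the code the patch generates:

```cpp
#include <cassert>
#include <cstdint>

// Returns true when (Sink - Src) <u VF * UF * AccessSize, i.e. when the
// accesses are too close together and the vector loop should not be taken.
// Addresses are modeled as 64-bit unsigned integers (an assumption).
bool pointerDiffMayConflict(uint64_t Src, uint64_t Sink, uint64_t VF,
                            uint64_t UF, uint64_t AccessSize) {
  // Unsigned subtraction wraps around when Src > Sink, which yields a very
  // large difference that compares as "far enough apart".
  return (Sink - Src) < VF * UF * AccessSize;
}
```

With VF = 4, UF = 1 and AccessSize = 4, the threshold is 16 bytes: a sink 12 bytes after the source fails the check, while a sink 16 bytes after it passes.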

See https://github.com/llvm/llvm-project/issues/53590.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D119078
2022-05-16 15:27:22 +01:00
Alexander Shaposhnikov badd088c57 [GlobalOpt] Enable optimization of constructors with different priorities
Adjust `optimizeGlobalCtorsList` to handle the case of different priorities.
This addresses the issue https://github.com/llvm/llvm-project/issues/55083.

Test plan: ninja check-all

Differential revision: https://reviews.llvm.org/D125278
2022-05-13 22:19:29 +00:00
Nikita Popov c1bb4a881e [SCEVExpander] Deduplicate min/max expansion code (NFC) 2022-05-11 12:11:11 +02:00
Alexander Shaposhnikov da823382d2 [Transform][Utils][NFC] Clean up CtorUtils.cpp 2022-05-11 01:07:54 +00:00
Nick Desaulniers c167c0a4dc [BuildLibCalls] infer inreg param attrs from NumRegisterParameters
We're having a hard time booting the ARCH=i386 Linux kernel with clang
after removing -ffreestanding because instcombine was dropping inreg
from callers during libcall simplification, but not the callees defined
in different translation units. This led the callers and callees to have
wildly different calling conventions, which (predictably) blew up at
runtime.

Infer the inreg param attrs on function declarations from the module
metadata "NumRegisterParameters." This allows us to boot the ARCH=i386
Linux kernel (w/ -ffreestanding removed).

Fixes: https://github.com/llvm/llvm-project/issues/53645

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D125285
2022-05-10 16:21:17 -07:00
Nikita Popov 0eafef1171 [SCEVExpander] Remove handling for mixed int/pointer min/max (NFCI)
Mixed int/pointer min/max are no longer possible.
2022-05-10 15:11:39 +02:00
Hongtao Yu 9641b9be9d [Inliner] Preserve !prof metadata when converting call to invoke.
When a callee function is inlined via an invoke instruction, every function call inside the callee, if not an invoke, will be converted to an invoke after being cloned into the caller body. I found that during the conversion the !prof metadata was dropped. This in turn caused a cloned indirect call to not be properly promoted in subsequent passes.

The particular scenario I was investigating was with AutoFDO and thinLTO. In prelink, no ICP was triggered (neither by the sample loader nor PGO ICP), no indirect call was promoted. This is because 1) the particular indirect call did not have inlined samples;  and 2) PGO ICP was intentionally disabled.  After inlining, the prof metadata was dropped. Then in postlink, PGO ICP jumped in but didn't do anything. Thus the opportunity was missed.

I'm making a simple fix to preserve !prof metadata when converting call to invoke.

Reviewed By: davidxl

Differential Revision: https://reviews.llvm.org/D125249
2022-05-09 15:08:09 -07:00
Augie Fackler 1deea714b3 BuildLibCalls: simplify switch statement slightly
Per feedback on D123086 after submit.

Also added a test for vec_malloc et al attribute inference to show it's
doing the right thing.

The new tests exposed a defect, corrected by adding vec_free to the list of
free functions in MemoryBuiltins.cpp, which had been overlooked all the
way back in D94710, over a year ago.

Differential Revision: https://reviews.llvm.org/D124859
2022-05-03 13:17:33 -04:00
Jonas Paulsson 304378fd09 Reapply "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building
libcalls." (was 0f8c626). This reverts commit 14d9390.

The patch previously failed to recognize cases where user had defined a
function alias with an identical name as that of the library
function. Module::getFunction() would then return nullptr which is what the
sanitizer discovered.

In this updated version, a new function isLibFuncEmittable() has also been
introduced, which is now used instead of TLI->has() anytime a library function
is to be emitted. It additionally makes sure there is e.g. no function
alias with the same name in the module.

Reviewed By: Eli Friedman

Differential Revision: https://reviews.llvm.org/D123198
2022-05-02 19:37:00 +02:00
Augie Fackler c7ae423e39 BuildLibCalls: add alloc-family attribute to many allocator functions
Differential Revision: https://reviews.llvm.org/D123086
2022-05-02 11:12:55 -04:00
Augie Fackler e940456531 BuildLibCalls: infer allocptr attribute for free and realloc() family functions
Differential Revision: https://reviews.llvm.org/D123084
2022-05-02 09:43:21 -04:00
Nikita Popov aae5f8115a [Local] Consider atomic loads from constant global as dead
Per the guidance in
https://llvm.org/docs/Atomics.html#atomics-and-ir-optimization,
an atomic load from a constant global can be dropped, as there can
be no stores to synchronize with. Any write to the constant global
would be UB.

IPSCCP will already drop such loads, but the main helper in Local
doesn't recognize this currently. This is motivated by D118387.

Differential Revision: https://reviews.llvm.org/D124241
2022-05-02 10:52:58 +02:00
Florian Hahn a80081763c [SimplifyCFG] Avoid shifting by a too large exponent.
TI->getBitWidth can be > 64 and in those cases the shift will be UB due
to the exponent being too large.

To fix this, cap the shift at 63. I think this should work out fine,
because TableSize is itself a 64 bit type and the maximum table size
must fit in the type. Also, if we would underestimate the size here, at
most we get an extra ZExt.
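
A minimal sketch of the capping idea, assuming a 64-bit result type; `cappedPowerOfTwo` is a hypothetical helper and not the code in SimplifyCFG:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Computes 2^BitWidth as a 64-bit value, capping the exponent at 63.
// Shifting a 64-bit value by 64 or more is undefined behavior in C++.
uint64_t cappedPowerOfTwo(unsigned BitWidth) {
  unsigned Shift = std::min(BitWidth, 63u);
  return uint64_t(1) << Shift;
}
```

For bit widths of 63 or more the result is an underestimate of the true value, which matches the trade-off the message describes.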

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D124608
2022-04-29 15:19:06 +01:00
Nikita Popov 884e9a877b [SimplifyCFG] Replace condition value when threading
Replace the condition value with the known constant value on the
threaded edge. This happens implicitly with phi threading because
we replace with the incoming value, but not for non-phi threading.
2022-04-29 09:50:27 +02:00
Nikita Popov 4e545bdb35 [SimplifyCFG] Thread branches on same condition in more cases (PR54980)
SimplifyCFG implements basic jump threading, if a branch is
performed on a phi node with constant operands. However,
InstCombine canonicalizes such phis to the condition value of a
previous branch, if possible. SimplifyCFG does support this as
well, but only in the very limited case where the same condition
is used in a direct predecessor -- notably, this does not include
the common diamond pattern (i.e. two consecutive if/elses on the
same condition).

This patch extends the code to look back a limited number of
blocks to find a branch on the same value, rather than only
looking at the direct predecessor.
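
At the source level, the diamond pattern looks roughly like the first function below, and threading the second branch gives something like the second. This is only an analogy in C++ with hypothetical function names; the actual transform operates on LLVM IR:

```cpp
#include <cassert>

// Two consecutive if/else blocks on the same condition (the diamond pattern).
int diamond(bool Cond, int X) {
  int A;
  if (Cond)
    A = X + 1;
  else
    A = X - 1;
  // The second branch tests a condition whose value is already known on
  // each incoming path.
  int B;
  if (Cond)
    B = A * 2;
  else
    B = A * 3;
  return B;
}

// After threading: each path jumps directly to the matching second arm.
int threaded(bool Cond, int X) {
  if (Cond)
    return (X + 1) * 2;
  return (X - 1) * 3;
}
```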

Fixes https://github.com/llvm/llvm-project/issues/54980.

Differential Revision: https://reviews.llvm.org/D124159
2022-04-29 09:44:05 +02:00
Arthur Eubanks 4e65291837 [OpaquePtr][GlobalOpt] Don't attempt to evaluate global constructors with arguments
Previously all entries in global_ctors had to have the void()* type and
we'd skip evaluating bitcasted functions. With opaque pointers we may
see the function directly.

Fixes #55147.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D124553
2022-04-27 19:00:44 -07:00
Martin Sebor efa0f12c0b [InstCombine] Fold strnlen calls in equality to zero.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123818
2022-04-27 12:03:24 -06:00
Alexandros Lamprineas a910337b5d [FuncSpec] Conditional jump or move depends on uninitialised value(s).
I found this bug when performing a two-stage build of clang with
Function Specialization enabled and tuned aggressively. The crash
appears only on release builds.

Fixes https://github.com/llvm/llvm-project/issues/55000.

Before accessing the contents of the ArgInfo iterator inside
SCCPInstVisitor::markArgInFuncSpecialization, we should be
checking that the iterator is valid.

Differential Revision: https://reviews.llvm.org/D124114
2022-04-27 07:28:25 +01:00
Martin Sebor ffed0cfcdb [SimplifyLibCalls] avoid slicing 64-bit integers in an ILP32 build (PR #54739)
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123472
2022-04-26 17:20:56 -06:00
Martin Sebor 449adafabe [InstCombine] Fold strnlen of constant strings.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123817
2022-04-26 16:15:28 -06:00
Martin Sebor ce8f42d4af [InstCombine] Fold memrchr calls with a constant character.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123629
2022-04-26 14:02:50 -06:00
Martin Sebor 10c99ce67d [InstCombine] Fold memrchr calls with constant size, bail on excessive.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123626
Differential Revision: https://reviews.llvm.org/D123628
2022-04-26 14:02:50 -06:00
Martin Sebor 25febbd155 [InstCombine] Fold strnlen with a bound of zero and one.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123816
2022-04-26 14:02:50 -06:00
Martin Sebor 2807c420cd [InstCombine] add a strnlen handler stub.
Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D123815
2022-04-26 14:02:49 -06:00
Augie Fackler a907d36cfe Attributes: add a new `allocptr` attribute
This continues the push away from hard-coded knowledge about functions
towards attributes. We'll use this to annotate free(), realloc() and
cousins and obviate the hard-coded list of free functions.

Differential Revision: https://reviews.llvm.org/D123083
2022-04-26 13:57:11 -04:00
Igor Kudrin 39ce68886b [LoopPeel][NFCI] Simplify the code to calculate peel count for PGO
This reorganizes the code as a preparation for D123865:

 * Use more descriptive names for variables
 * Simplify a condition by using an already calculated value
   for `MaxPeelCount`
 * Remove a duplicate log entry
 * Report basic values for loop costs

Differential Revision: https://reviews.llvm.org/D124388
2022-04-26 18:44:24 +04:00
Igor Kudrin c71890e158 [LoopPeel][NFC] Exit early if there is no room for peeling
Differential Revision: https://reviews.llvm.org/D123864
2022-04-26 18:43:56 +04:00
David Green 9727c77d58 [NFC] Rename Instrinsic to Intrinsic 2022-04-25 18:13:23 +01:00
Paul Kirth 4683a2effa [llvm][misexpect] Avoid division by 0 when using sample profiling
MisExpect diagnostics should not prevent compilation from succeeding, and the
assertion is insufficient to prevent division by zero in release builds.

This patch addresses that by replacing the assert with an early return.

Additionally, it disables MisExpect diagnostics when using sample profiling,
since this is the only known case where this error has manifested.
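
The difference between the assert and the early return can be sketched as follows. `percentOfTotal` is a hypothetical helper, not the MisExpect code:

```cpp
#include <cassert>
#include <cstdint>

// Computes Weight as a percentage of TotalWeight. Returns false without
// touching Percent when the total is zero.
bool percentOfTotal(uint64_t Weight, uint64_t TotalWeight, uint64_t &Percent) {
  // An assert(TotalWeight != 0) here would compile away under NDEBUG, so a
  // release build would still reach the division below with a zero divisor.
  if (TotalWeight == 0)
    return false;
  Percent = Weight * 100 / TotalWeight;
  return true;
}
```

A caller that gets `false` simply skips the diagnostic, so compilation continues.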

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D124302
2022-04-22 22:48:00 +00:00
Nikita Popov 993b166deb Reapply [SimplifyCFG] Handle branch on same condition in pred more directly
Reapplying without changes, after a fix to a dependent patch.

-----

Rather than creating a PHI node and then using the PHI threading
code, directly handle this case in
FoldCondBranchOnValueKnownInPredecessor().

This change is supposed to be NFC-ish, but may cause changes due
to different transform order.
2022-04-22 10:27:38 +02:00
Nikita Popov df18e37541 Reapply [SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension (NFCI)
Reapply with SmallMapVector instead of SmallDenseMap, which should
address the non-determinism issue.

-----

This general threading transform can be performed whenever we know
a constant value for the condition in a predecessor, which would
currently just be the case of a phi node with constant arguments.
2022-04-22 09:42:11 +02:00
Fangrui Song 35e350d5ba Revert "[SimplifyCFG] Handle branch on same condition in pred more directly" and "[SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension"
This reverts commit 3df86e799e.
This reverts commit 8988254667.

`[SimplifyCFG] Handle branch on same condition in pred more directly`
caused non-determinism when compiling opt with a bootstrapped clang.
I have to revert the dependent commit as well.
2022-04-21 12:58:58 -07:00
Nikola Tesic c5600aef88 [Debugify] Limit number of processed functions for original mode
Debugify in OriginalDebugInfo mode does (DebugInfo) collect-before-pass & check-after-pass
for each instruction, which is pretty expensive. When used to analyze DebugInfo losses
in large projects (like LLVM), this raises the build time unacceptably.
This patch introduces a limit for the number of processed functions per compile unit.
By default, the limit is set to UINT_MAX (practically unlimited), and by using the introduced
option -debugify-func-limit the limit can be set to any positive integer.

Differential revision: https://reviews.llvm.org/D115714
2022-04-21 13:58:17 +02:00
Nikita Popov 3df86e799e [SimplifyCFG] Handle branch on same condition in pred more directly
Rather than creating a PHI node and then using the PHI threading
code, directly handle this case in
FoldCondBranchOnValueKnownInPredecessor().

This change is supposed to be NFC-ish, but may cause changes due
to different transform order.
2022-04-21 11:22:02 +02:00
Nikita Popov 8988254667 [SimplifyCFG] Make FoldCondBranchOnPHI more amenable to extension
This general threading transform can be performed whenever we know
a constant value for the condition in a predecessor, which would
currently just be the case of a phi node with constant arguments.
2022-04-21 10:49:49 +02:00
Nikita Popov d727505e40 [SimplifyCFG] Remove one-use limitation in FoldCondBranchOnPHI()
BlockIsSimpleEnoughToThreadThrough() already checks that the phi
(and all other instructions) are not used outside the block, so
this one-use check is not necessary for legality. I also don't
see any reason why it would be necessary for profitability (in
fact, those extra uses will be replaced with constants, which
should be generally profitable).
2022-04-20 15:56:20 +02:00
Fangrui Song 14d9390721 Revert D123198 "[BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls."
test/Transforms/InstCombine/pr39177.ll failed in a -DLLVM_USE_SANITIZER=Undefined build.
```
lib/Transforms/Utils/BuildLibCalls.cpp:1217:17: runtime error: reference binding to null pointer of type 'llvm::Function'
```
`Function &F = *M->getFunction(Name);`

This reverts commit 0f8c626723.
2022-04-19 22:26:10 -07:00
Paul Kirth bac6cd5bf8 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-04-19 21:23:48 +00:00
Jonas Paulsson 0f8c626723 [BuildLibCalls] Introduce getOrInsertLibFunc() for use when building libcalls.
A new set of overloaded functions named getOrInsertLibFunc() are now supposed
to be used instead of getOrInsertFunction() when building a libcall from
within an LLVM optimizer(). The idea is that this new function also makes
sure that any mandatory argument attributes are added to the function
prototype (after calling getOrInsertFunction()).

inferLibFuncAttributes() is renamed to inferNonMandatoryLibFuncAttrs() as it
only adds attributes that are not necessary for correctness but merely
help with later optimizations.

Generally, the front end is responsible for building a correct function
prototype with the needed argument attributes. If the middle end however is
the one creating the call, e.g. when replacing one libcall with another, it
then must take this responsibility.

This continues the work of properly handling argument extension if required
by the target ABI when building a lib call. getOrInsertLibFunc() now does
this for all libcalls currently built by any LLVM optimizer. It is expected
that when in the future a new optimization builds a new libcall with an
integer argument it is to be added to getOrInsertLibFunc() with the proper
handling. Note that not all targets have it in their ABI to sign/zero extend
integer arguments to the full register width, but this will be done
selectively as determined by getExtAttrForI32Param().

Review: Eli Friedman, Nikita Popov, Dávid Bolvanský

Differential Revision: https://reviews.llvm.org/D123198
2022-04-19 21:22:07 +02:00
Joseph Huber 984a0dc386 [OpenMP] Use new offloading binary when embedding offloading images
The previous patch introduced the offloading binary format so we can
store some metadata along with the binary image. This patch introduces
using this inside the linker wrapper and Clang instead of the previous
method that embedded the metadata in the section name.

Differential Revision: https://reviews.llvm.org/D122683
2022-04-15 20:35:26 -04:00
chenglin.bi 00871e2f4f [SimplifyCFG] Try to fold switch with single result value and power-of-2 cases to mask+select
When a switch with 2^n cases goes to one result, check whether the 2^n cases can be covered by n bit masks.
If so, we can use "and condition, ~mask" to simplify the switch.

case 0 2 4 6 -> and condition, -7
https://alive2.llvm.org/ce/z/jjH_0N

case 0 2 8 10 -> and condition, -11
https://alive2.llvm.org/ce/z/K7E-2V

case 2 4 10 12 -> and (sub condition, 2), -11
https://alive2.llvm.org/ce/z/CrxbYg
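
The first two examples above can be checked with a small sketch, assuming the masked value is then compared against zero. `matchesMaskedCases` is a hypothetical name; the mask holds the bits that vary across the cases (6 for cases 0 2 4 6, 10 for cases 0 2 8 10):

```cpp
#include <cassert>
#include <cstdint>

// True iff Cond is one of the 2^n values that can be formed from the bits
// set in Mask, i.e. iff no bit outside Mask is set.
bool matchesMaskedCases(uint32_t Cond, uint32_t Mask) {
  // ~6 is -7 and ~10 is -11 in two's complement, matching the constants
  // in the examples above.
  return (Cond & ~Mask) == 0;
}
```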

Fix one case of https://github.com/llvm/llvm-project/issues/39957

Reviewed By: spatel

Differential Revision: https://reviews.llvm.org/D122485
2022-04-15 00:10:00 +08:00
Ruiling Song 1e01f95057 LowerSwitch: Avoid inserting NewDefault block
The NewDefault was used to simplify the updating of PHI nodes, but it
causes some inefficiency for targets that will run the structurizer later. For
example, for a simple two-case switch, the extra NewDefault is causing
unstructured CFG like:

        O
       / \
      O   O
     / \ / \
    C1  ND C2
     \  |  /
      \ | /
        D

The change avoids the ND (NewDefault) block; that is, we will get a
structured CFG for the above example like:

        O
       / \
      /   \
     O     O
    / \   / \
   C1  \ /  C2
    \-> D <-/

The IR change introduced by this patch should be trivial to other targets,
so I am doing this unconditionally.

Fall-through among the cases will also cause unstructured CFG, but it needs
more work and will be addressed in a separate change.

Reviewed by: arsenm

Differential Revision: https://reviews.llvm.org/D123607
2022-04-14 13:30:56 +08:00
Sanjay Patel 0ef46dc0f9 [SimplifyCFG] improve readability in switch-to-select; NFC 2022-04-13 17:14:45 -04:00
serge-sans-paille 262eba01b3 Revert "[ValueTracking] Make getStringLenth aware of strdup"
This reverts commit e810d55809.

The commit did not take into account the fact that a strduped string could be
modified. Checking whether such a modification happens would make the function
very costly; without a test case in mind it's not worth the effort.
2022-04-13 19:17:28 +02:00
Nikita Popov 8c74169990 [SimplifyLibCalls] Don't mark memchr() memory as fully dereferenceable
C11 specifies memchr() as follows:

> The memchr function locates the first occurrence of c (converted
> to an unsigned char) in the initial n characters (each interpreted
> as unsigned char) of the object pointed to by s. The implementation
> shall behave as if it reads the characters sequentially and stops
> as soon as a matching character is found.

In particular, it is well-defined to specify a memchr size larger
than the underlying object, as long as the character is found before
the end of the object.

Differential Revision: https://reviews.llvm.org/D123665
2022-04-13 16:46:18 +02:00
Sanjay Patel cd0d0d633b [SimplifyCFG] make a debug option for case max when converting switch to select
This should be "NFC" as written, but it will make D122485 smaller
and give us more flexibility to experiment with optimization level
vs. compile-time.

Differential Revision: https://reviews.llvm.org/D123625
2022-04-13 06:55:13 -04:00
Sanjay Patel d9211be13d [SimplifyCFG] cleanup code for converting switch to select (NFC)
This renames functions for more general usage (and current capitalization style)
before a proposed logic change in D122485.

Differential Revision: https://reviews.llvm.org/D123614
2022-04-12 12:17:54 -04:00
serge-sans-paille e810d55809 [ValueTracking] Make getStringLenth aware of strdup
During strlen compile-time evaluation, make it possible to track size of
strduped strings.

Differential Revision: https://reviews.llvm.org/D123497
2022-04-12 14:47:29 +02:00
Nikita Popov 9af8cc8d17 [SimplifyLibCalls] Remove unnecessary inbounds check
Even if the GEP is not inbounds, the GEP will have provenance of
the global, and accessing past the extent of the global would be
undefined behavior.
2022-04-11 16:51:09 +02:00
Matt Arsenault 9fdd25848a Transforms: Fix code duplication between LowerAtomic and AtomicExpand 2022-04-08 19:06:36 -04:00
Evgeniy Brevnov da41214d65 Add support for atomic memory copy lowering
Currently, the utility supports lowering of non-atomic memory transfer routines only. This patch adds support for an atomic version of memcopy. This may be useful for targets not supporting atomic memcopy.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118443
2022-04-08 10:41:31 +07:00
Augie Fackler b916414096 BuildLibCalls: also set allocsize() attributes
This is part of being able to get rid of two more columns in
MemoryBuiltins.cpp's large table. We'll have two more changes before
we can finish the job.

Differential Revision: https://reviews.llvm.org/D119582
2022-04-07 12:38:44 -04:00
Benjamin Kramer ff485d727f Transforms: Remove unused include
Utils can't depend on Scalar transforms.
2022-04-07 10:40:28 +02:00
Matt Arsenault 39f1568633 Transforms: Split LowerAtomics into separate Utils and pass
This will allow code sharing from AtomicExpandPass. Not entirely sure
why these exist as separate passes though.
2022-04-06 20:54:45 -04:00
Nikita Popov 1dc1d5a0d2 [SimplifyLibCalls] Use KnownBits helper APIs (NFC)
Use helper APIs for isNonNegative() and getMaxValue() instead of
flipping the zero value and having a long comment explaining why
that is necessary.
2022-04-06 16:01:24 +02:00
Martin Storsjö 46776f7556 Fix warnings about variables that are set but only used in debug mode
Add void casts to mark the variables used, next to the places where
they are used in assert or `LLVM_DEBUG()` expressions.
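A small sketch of the pattern, assuming a hypothetical function `sumAll`. The counter is only read inside `assert`, so a release build (with `NDEBUG`) would otherwise warn that it is set but never used:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical example function, not taken from the patch.
int sumAll(const std::vector<int> &Values) {
  int Sum = 0;
  std::size_t Visited = 0;
  for (int V : Values) {
    Sum += V;
    ++Visited;
  }
  assert(Visited == Values.size() && "every element should be visited");
  // The void cast counts as a use even when NDEBUG compiles the assert away.
  (void)Visited;
  return Sum;
}
```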

Differential Revision: https://reviews.llvm.org/D123117
2022-04-06 10:01:46 +03:00
Evgeniy Brevnov acfc785c0e Preserve aliasing info during memory intrinsics lowering
By specification, the source and destination of llvm.memcpy.* must either be equal or non-overlapping. These semantics are hard or impossible to recover once lowered. This patch explicitly marks loads from the source and stores to the destination as not aliasing if the source and destination are known not to be equal.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D118441
2022-04-06 11:33:54 +07:00
Jonas Paulsson dbb6a75fbb [LibCalls] Respect TLI.getExtAttrForI32Param() in inferLibFuncAttributes().
getExtAttrForI32Param() is the method to be used for determining the type of
extension attribute (if any) that is to be added for a signed/unsigned
argument.

Previously, the SExt attribute was always added to the i32 ldexp* argument as
it was expected to be ignored by targets not needing it. This patch now
changes this so that it is only added for the targets that need it in the
first place.

Putchar() argument is now also extended as required by the target (SystemZ in
the test), to fix the issue below. Many more libcalls will be handled
similarly in a following patch.

Fixes https://github.com/llvm/llvm-project/issues/54532.

Differential Revision: https://reviews.llvm.org/D123030

Review: Eli Friedman
2022-04-05 10:29:42 +02:00
Martin Sebor 5ccfd5f6d4 [SimplifyLibCalls] Optimize memchr() with known char+str and unknown length
If both the character and string are known, but the length
potentially isn't, we can optimize the memchr() call to a select
of either the known position of the character or null.
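The fold can be sketched at the source level as follows, assuming the character's first position in the string is already known. `foldedMemchr` and `foldAgreesWithMemchr` are hypothetical names used only for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// With the string and character known, the first match position is a
// constant. Only the length N is unknown, so the call reduces to a select
// on N.
const char *foldedMemchr(const char *Str, std::size_t KnownPos, std::size_t N) {
  return N > KnownPos ? Str + KnownPos : nullptr;
}

// Compare the folded form against std::memchr for every in-bounds length.
bool foldAgreesWithMemchr() {
  static const char Str[] = "hello"; // 6 bytes including the terminator
  const std::size_t KnownPos = 2;    // index of the first 'l'
  assert(Str[KnownPos] == 'l');
  for (std::size_t N = 0; N <= sizeof(Str); ++N)
    if (std::memchr(Str, 'l', N) != foldedMemchr(Str, KnownPos, N))
      return false;
  return true;
}
```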

Split off from https://reviews.llvm.org/D122836.
2022-04-04 11:01:33 +02:00
Martin Sebor 5197d2791f [SimplifyLibCalls] Move handling of constant char earlier (NFC)
Handle the simple constant char case before the bitmask optimization.
This will allow extending the code to handle a non-constant size
argument in a followup change.

Split out from https://reviews.llvm.org/D122836.
2022-04-04 11:01:33 +02:00
Martin Sebor d18991debf [SimplifyLibCalls] Fold memchr() with size 1
If the memchr() size is 1, then we can convert the call into a
single-byte comparison. This works even if both the string and the
character are unknown.
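A source-level sketch of the size-1 case, assuming a hypothetical helper `memchrSizeOne`. Note that the character is converted to `unsigned char` before the comparison, as the C standard specifies for `memchr`:

```cpp
#include <cassert>
#include <cstring>

// memchr(S, C, 1) only ever inspects the first byte, so it is equivalent
// to a single comparison followed by a select.
const void *memchrSizeOne(const void *S, int C) {
  assert(S != nullptr && "this sketch expects a valid pointer");
  const unsigned char First = *static_cast<const unsigned char *>(S);
  return First == static_cast<unsigned char>(C) ? S : nullptr;
}

// Check the sketch against the library call for one input.
bool matchesLibrary(const char *S, int C) {
  return memchrSizeOne(S, C) == std::memchr(S, C, 1);
}
```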

Split off from https://reviews.llvm.org/D122836.
2022-04-04 10:41:20 +02:00
Serge Pavlov c625b6051c Remove duplicate code from wouldInstructionBeTriviallyDead
There is a similar check a few lines above in this function.
2022-04-02 16:04:39 +07:00
Jorge Gorbe Moya fc7573f29c Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 46774df307.
2022-03-31 14:54:41 -07:00
Paul Kirth 46774df307 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-31 17:38:21 +00:00
Serge Pavlov 47b3b76825 Implement inlining of strictfp functions
According to the current design, if a floating point operation is
represented by a constrained intrinsic somewhere in a function, all
floating point operations in the function must be represented by
constrained intrinsics. This imposes additional requirements on the
inlining mechanism. If a non-strictfp function is inlined into a strictfp
function, all ordinary FP operations must be replaced with their
constrained counterparts.

Inlining a strictfp function into a non-strictfp function is not
implemented, as it would require replacing all FP operations in the host
function, which is currently undesirable due to the expected performance loss.

Differential Revision: https://reviews.llvm.org/D69798
2022-03-31 19:15:52 +07:00
serge-sans-paille 01be9be2f2 Cleanup includes: final pass
Cleanup a few extra files, this closes the work on libLLVM dependencies on my
side.

Impact on libLLVM preprocessed output: -35876 lines

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D122576
2022-03-29 09:00:21 +02:00
Paul Kirth 90cb325abd Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit 2add3fbd97.
2022-03-29 06:20:30 +00:00
Paul Kirth 2add3fbd97 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D115907
2022-03-28 23:30:04 +00:00
Alexandros Lamprineas 8045bf9d0d [FuncSpec] Support function specialization across multiple arguments.
The current implementation of Function Specialization does not allow
specializing more than one argument per function call, which is a
limitation I am lifting with this patch.

My main challenge was to choose the most suitable ADT for storing the
specializations. We need an associative container for binding all the
actual arguments of a specialization to the function call. We also
need a consistent iteration order across executions. Lastly we want
to be able to sort the entries by Gain and reject the least profitable
ones.

MapVector fits the bill but not quite; erasing elements is expensive
and using stable_sort messes up the indices to the underlying vector.
I am therefore using the underlying vector directly after calculating
the Gain.
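A rough sketch of sorting a plain vector by gain and rejecting the least profitable entries. `Candidate`, `keepMostProfitable`, and `MaxKept` are hypothetical names and do not reflect the patch's actual data structures:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one specialization candidate.
struct Candidate {
  int CallSiteId;
  int Gain;
};

// Sort by descending Gain and keep at most MaxKept entries. stable_sort
// preserves insertion order among equal gains, so the result does not vary
// between runs.
std::vector<Candidate> keepMostProfitable(std::vector<Candidate> Cands,
                                          std::size_t MaxKept) {
  assert(MaxKept > 0 && "keeping zero candidates is not useful here");
  std::stable_sort(Cands.begin(), Cands.end(),
                   [](const Candidate &A, const Candidate &B) {
                     return A.Gain > B.Gain;
                   });
  if (Cands.size() > MaxKept)
    Cands.erase(Cands.begin() + static_cast<std::ptrdiff_t>(MaxKept),
                Cands.end());
  return Cands;
}
```

Working on a plain vector avoids the index bookkeeping a map-plus-vector container has to maintain when its elements are reordered or erased.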

Differential Revision: https://reviews.llvm.org/D119880
2022-03-28 12:01:53 +01:00
Roman Lebedev f6b60b3b79 [SimplifyCFG] `FoldBranchToCommonDest()`: allow branch-on-select
This whole check is bogus, it's some kind of a profitability check.
For now, simply extend it to not only allow branch-on-binary-ops,
but also on poison-safe logic ops.

Refs. https://github.com/llvm/llvm-project/issues/53861
Refs. https://github.com/llvm/llvm-project/issues/54553
2022-03-25 16:12:17 +03:00
Simon Pilgrim 1a943923b8 [Utils] stripDebugifyMetadata - use cast<> instead of dyn_cast_or_null<> to avoid dereference of nullptr
The pointer is dereferenced immediately, so assert the cast is correct instead of returning nullptr
2022-03-25 10:25:04 +00:00
Julian Lettner 64902d335c Reland "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-23 18:36:55 -07:00
Zequan Wu 581dc3c729 Revert "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
This reverts commit 22570bac69.
2022-03-23 16:11:54 -07:00
Djordje Todorovic 91ea247039 [Debugify] Use DebugifyLevel in Debugify original mode
Before this patch the DebugifyLevel option was used for
the synthetic mode, so after this, it will be used in
the original mode as well.

Differential Revision: https://reviews.llvm.org/D115623
2022-03-22 14:04:56 +01:00
Djordje Todorovic 73777b4c35 [Debugify] Optimize debugify original mode
Before we start addressing the issue with having
a lot of false positives when using debugify in
the original mode, we have made a few patches that
should speed up the execution of the testing
utility Passes.

For example, when testing a large project
(let's say LLVM project itself), we can face
a lot of potential DI issues. Usually, we use
-verify-each-debuginfo-preserve (that is very
similar to -debugify-each) -- it collects
DI metadata before each Pass, and after the Pass
it checks if the Pass preserved the DI metadata.
However, we can speed up this process, since we
don't need to collect DI metadata before each
Pass -- we could use the DI metadata that are
collected after the previous Pass from
the pipeline as an input for the next Pass.

This patch speeds up the utility by ~2x.

Differential Revision: https://reviews.llvm.org/D115622
2022-03-22 12:14:00 +01:00
Paul Kirth 964398ccb1 Revert "Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics"""
This reverts commit 6cf560d69a.
2022-03-18 00:21:33 +00:00
Paul Kirth 6cf560d69a Revert "Revert "[misexpect] Re-implement MisExpect Diagnostics""
I mistakenly reverted my commit, so I'm relanding it.

This reverts commit 10866a1df4.
2022-03-18 00:04:22 +00:00
Paul Kirth 10866a1df4 Revert "[misexpect] Re-implement MisExpect Diagnostics"
This reverts commit e7749d4713.
2022-03-17 23:54:26 +00:00
Paul Kirth e7749d4713 [misexpect] Re-implement MisExpect Diagnostics
Reimplements MisExpect diagnostics from D66324 to reconstruct its
original checking methodology only using MD_prof branch_weights
metadata.

New checks rely on 2 invariants:

1) For frontend instrumentation, MD_prof branch_weights will always be
   populated before llvm.expect intrinsics are lowered.

2) for IR and sample profiling, llvm.expect intrinsics will always be
   lowered before branch_weights are populated from the IR profiles.

These invariants allow the checking to assume how the existing branch
weights are populated depending on the profiling method used, and emit
the correct diagnostics. If these invariants are ever invalidated, the
MisExpect related checks would need to be updated, potentially by
re-introducing MD_misexpect metadata, and ensuring it always will be
transformed the same way as branch_weights in other optimization passes.

Frontend based profiling is now enabled without using LLVM Args, by
introducing a new CodeGen option, and checking if the -Wmisexpect flag
has been passed on the command line.

Differential Revision: https://reviews.llvm.org/D115907
2022-03-17 23:46:23 +00:00
Julian Lettner 22570bac69 Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-17 10:47:13 -07:00
Nikita Popov 20531b3a6b [RelLookupTableConverter] Avoid querying TTI for declarations
This code queries TTI on a single function, which is considered to
be representative. This is a bit odd, but probably fine in practice.

However, I think we should at least avoid querying declarations,
which e.g. will generally lack target attributes, and for which
we don't seem to ever query TTI in other places.
2022-03-16 10:39:28 +01:00
Simon Pilgrim 7262eacd41 Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO"
Many of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
2022-03-15 13:01:35 +00:00
Julian Lettner 9c542a5a4e Lower `@llvm.global_dtors` using `__cxa_atexit` on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm.global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Teresa Johnson fee0bde4c6 [WPD] Extend checking mode to support fallback to indirect call
Extend -wholeprogramdevirt-check to support both the existing
trapping mode on an incorrect devirtualization, as well as a new
mode to fall back to an indirect call on a mismatch.

The new mode is useful in cases where we want to enable
devirtualization but cannot fully guarantee whole program visibility
(e.g. in the case where LTO has been disabled for a small set of objects
that could potentially override virtual methods without having a symbol
reference to anything in the base class including the vtable).

Remove !prof and !callees metadata (which are used by indirect call
promotion) from both the new direct call and the fallback indirect call
(so that we don't perform another round of promotion on the latter).
Also remove it from the direct call in the non-fallback cases, which
was an oversight, although it didn't seem to cause any issues. Add tests
for the metadata removal covering the various cases.

Differential Revision: https://reviews.llvm.org/D121419
2022-03-14 10:16:28 -07:00
Nikita Popov 067c035012 [GlobalOpt] Handle undef global_ctors gracefully
If there are no ctors, then this can have an arbitrary zero-sized
value. The current code checks for null, but it could also be
undef or poison.

Replacing the specific null check with a check for
non-ConstantArray.
2022-03-10 16:02:12 +01:00
Benoit Jacob 851332a1f2 Fix linking error, undefined class static constants.
Reviewed By: spupyrev

Differential Revision: https://reviews.llvm.org/D121293
2022-03-09 10:01:38 -08:00
Vitaly Buka ce29a0429b Revert "Attempt to fix linking issue on the bot"
The issue was fixed with 48c74bb2e2

This reverts commit ac423a8c8a.
2022-03-08 16:16:01 -08:00
Florian Mayer e86bd32b71 [NFC] [HWASan] [MTE] Use function_ref over template. 2022-03-08 15:49:55 -08:00
Vitaly Buka ac423a8c8a Attempt to fix linking issue on the bot 2022-03-08 15:33:10 -08:00
Fangrui Song 48c74bb2e2 [SampleProfileInference] Work around odr-use of const non-inline static data member to fix -O0 builds after D120508
MinBaseDistance may be odr-used by std::max, leading to an undefined symbol linker error:

```
ld.lld: error: undefined symbol: (anonymous namespace)::MinCostMaxFlow::MinBaseDistance
>>> referenced by SampleProfileInference.cpp:744 (/home/ray/llvm-project/llvm/lib/Transforms/Utils/SampleProfileInference.cpp:744)
>>>               lib/Transforms/Utils/CMakeFiles/LLVMTransformUtils.dir/SampleProfileInference.cpp.o:((anonymous namespace)::FlowAdjuster::jumpDistance(llvm::FlowJump*) const)
```

Since llvm-project is still using C++14, work around it with a cast.
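A minimal sketch of the workaround, assuming a hypothetical struct `FlowLimits` and an arbitrary constant value. `std::max` takes its arguments by const reference, so passing the member directly odr-uses it and requires an out-of-class definition; the cast produces a temporary instead:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in: the member has an in-class initializer but no
// out-of-class definition anywhere in the program.
struct FlowLimits {
  static const std::int64_t MinBaseDistance = 18;
};

std::int64_t clampDistance(std::int64_t Distance) {
  assert(Distance >= 0 && "distances are non-negative in this sketch");
  // The cast reads the constant's value into a temporary, so no reference
  // binds to the member and no symbol definition is needed at link time.
  return std::max(Distance,
                  static_cast<std::int64_t>(FlowLimits::MinBaseDistance));
}
```

Optimized builds often hide the problem because the reference is folded away, which is consistent with the error showing up only at -O0.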
2022-03-08 14:34:53 -08:00
spupyrev 81aedab7dd introducing some profi flags
Differential Revision: https://reviews.llvm.org/D120508
2022-03-08 12:35:15 -08:00
William S. Moses 87ec6f41bb [OpenMPIRBuilder] Allocate temporary at the correct block in a nested parallel
The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested openmp parallel regions (writing with MLIR for ease)

```
omp.parallel {
  %a = ...
  omp.parallel {
    use(%a)
  }
}
```

As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this
allocation to the inner parallel. For example, we would want to get something like the following:

```
omp.parallel {
  %a = ...
  %tmp = alloc
  store %tmp[] = %a
  kmpc_fork(outlined, %tmp)
}
```

However, in practice, this is not what currently occurs in the context of nested parallel regions. Specifically to the OpenMPIRBuilder,
the entirety of the function (at the LLVM level) is currently inlined with blocks marking the corresponding start and end of each
region.

```
entry:
  ...

parallel1:
  %a = ...
  ...

parallel2:
  use(%a)
  ...

endparallel2:
  ...

endparallel1:
  ...
```

When the allocation is inserted, it is presently inserted into the parent of the entire function (e.g. entry) rather than the parent
allocation scope to the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1.

This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example.

This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D121061
2022-03-06 18:34:25 -05:00
Augie Fackler b32735d599 BuildLibCalls: add allocalign attributes for memalign and aligned_alloc
This gets us close to being able to remove a column from the table in
MemoryBuiltins.cpp.

Differential Revision: https://reviews.llvm.org/D117923
2022-03-04 15:57:53 -05:00
Augie Fackler d664c4b73c Attributes: add a new allocalign attribute
This will let us start moving away from hard-coded attributes in
MemoryBuiltins.cpp and put the knowledge about various attribute
functions in the compilers that emit those calls where it probably
belongs.

Differential Revision: https://reviews.llvm.org/D117921
2022-03-04 15:57:53 -05:00
Alexandros Lamprineas 910eb988eb [FuncSpec][NFC] Refactor internal structures.
`ArgInfo` is reduced to only contain a pair of {formal,actual} values.
The specialized function `Fn` and the `Partial` flag are redundant in
this structure. The `Gain` is moved to a new struct `SpecializationInfo`.

The value mappings created by cloneCandidateFunction() are being used
by rewriteCallSites() for matching the formal arguments of recursive
functions.

The list of specializations is passed by reference to calculateGains()
instead of being returned by value.

The `IsPartial` flag is removed from isArgumentInteresting() and
getPossibleConstants() as it's no longer used anywhere in the code.

Differential Revision: https://reviews.llvm.org/D120753
2022-03-03 13:08:13 +00:00
spupyrev f2ade65fb2 [CSSPGO] Even flow distribution
Differential Revision: https://reviews.llvm.org/D118640
2022-03-02 13:12:05 -08:00
Stephen Long 2f6c14816a [LoopPeel] Add EXPENSIVE_CHECKS ifdef guard around domtree verify call
The verify call was taking 50% of the compile time in our internal LLVM
fork when trying to unroll many loops.

Differential Revision: https://reviews.llvm.org/D113028
2022-03-02 09:56:20 -08:00
spupyrev bcdc047731 speeding up ext-tsp for huge instances
Differential Revision: https://reviews.llvm.org/D120780
2022-03-02 07:17:48 -08:00
serge-sans-paille a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Tong Zhang 17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
to make Clang able to preserve the information and let BoundsChecking
pass know bounds checking is disabled for certain function.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
Alexandros Lamprineas b803aee67b [FuncSpec][NFC] Improve debug messages.
Adds diagnostic messages when debugging the pass.

Differential Revision: https://reviews.llvm.org/D119875
2022-03-01 11:55:08 +00:00
Alexandros Lamprineas 7b74123a3d [FuncSpec][NFC] Variable renaming.
Just preparing the ground for follow up patches to make the reviews easier.

Differential Revision: https://reviews.llvm.org/D119874
2022-03-01 11:38:57 +00:00
Nikita Popov 16a2d5f885 [SCEVExpander] Use early returns in FindValueInExprValueMap() (NFC) 2022-02-25 10:09:16 +01:00
Nikita Popov 2d0fc3e46f [SCEV] Return ArrayRef from getSCEVValues() (NFC)
Return a read-only view on this set. For the one internal use,
directly access ExprValueMap.
2022-02-25 09:32:22 +01:00
Nikita Popov d9715a7266 [SCEV] Don't try to reuse expressions with offset
SCEVs ExprValueMap currently tracks not only which IR Values
correspond to a given SCEV expression, but additionally stores that
it may be expanded in the form X+Offset. In theory, this allows
reusing existing IR Values in more cases.

In practice, this doesn't seem to be particularly useful (the test
changes are rather underwhelming) and adds a good bit of complexity.
Per https://github.com/llvm/llvm-project/issues/53905, we have an
invalidation issue with these offset expressions.

Differential Revision: https://reviews.llvm.org/D120311
2022-02-25 09:16:48 +01:00
Joseph Huber 7aef8b3754 [OpenMP] Make section variable external to prevent collisions
Summary:
We use a section to embed offloading code into the host for later
linking. This is normally unique to the translation unit as it is thrown
away during linking. However, if the user performs a relocatable link
the sections will be merged and we won't be able to access the files
stored inside. This patch changes the section variables to have external
linkage and a name defined by the section name, so if two sections are
combined during linking we get an error.
2022-02-24 10:57:09 -05:00
Matthias Braun 6a383369f9 PGOInstrumentation, GCOVProfiling: Split indirectbr critical edges regardless of PHIs
The `SplitIndirectBrCriticalEdges` function was originally designed for
`CodeGenPrepare` and skipped splitting of edges when the destination
block didn't contain any `PHI` instructions. This only makes sense when
reducing COPYs like `CodeGenPrepare`. In the case of
`PGOInstrumentation` or `GCOVProfiling` it would result in missed
counters and wrong results in functions with computed goto.

Differential Revision: https://reviews.llvm.org/D120096
2022-02-23 16:27:37 -08:00
Bill Wendling a5bbc6ef99 [NFC] Remove unnecessary "#include"s from header files 2022-02-23 01:20:48 -08:00
Nikita Popov f8d7210032 [GlobalStatus] Keep Visited set in isSafeToDestroyConstant()
Constants cannot be cyclic, but they can be tree-like. Keep a
visited set to ensure we do not degenerate to exponential run-time.
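A generic sketch of why a visited set matters on an acyclic but heavily shared graph. `Node`, `countVisits`, and `visitsForSharedChain` are hypothetical and are not LLVM's implementation. In the chain below every node references the next one twice, so a walk without the set would process on the order of 2^Depth nodes:

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical node type: acyclic, but operands may be shared.
struct Node {
  std::vector<const Node *> Operands;
};

// Worklist traversal that processes each distinct node exactly once.
std::size_t countVisits(const Node *Root) {
  assert(Root != nullptr);
  std::unordered_set<const Node *> Visited;
  std::vector<const Node *> Worklist{Root};
  std::size_t Visits = 0;
  while (!Worklist.empty()) {
    const Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue; // already processed through another path
    ++Visits;
    for (const Node *Op : N->Operands)
      Worklist.push_back(Op);
  }
  return Visits;
}

// Build a chain in which node I uses node I+1 twice, then count visits.
std::size_t visitsForSharedChain(std::size_t Depth) {
  assert(Depth > 0);
  std::vector<Node> Nodes(Depth);
  for (std::size_t I = 0; I + 1 < Depth; ++I)
    Nodes[I].Operands = {&Nodes[I + 1], &Nodes[I + 1]};
  return countVisits(&Nodes[0]);
}
```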

This fixes the problem reported in https://reviews.llvm.org/D117223#3335482,
though I haven't been able to construct a concise test case for
the issue. This requires a combination of dead constants and the
kind of constant expression tree that textual IR cannot represent
(because the textual representation, unlike the in-memory
representation, is also exponential in size).
2022-02-22 10:02:37 +01:00
Arthur Eubanks 053c2a0020 [SimplifyCFG][OpaquePtr] Check store type when merging conditional store 2022-02-20 11:29:54 -08:00
Arthur Eubanks 129af4daa7 [SCEVExpander][OpaquePtr] Check GEP source type when finding identical GEP
Fixes an opaque pointers miscompile.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120004
2022-02-17 08:48:11 -08:00
Nikita Popov 36fdfaba19 [RelLookupTableConverter] Ensure that GV, GEP and load types match
This code could be generalized to be type-independent, but for now
just ensure that the same type constraints are enforced with opaque
pointers as with typed pointers.
2022-02-17 12:05:05 +01:00
Roman Lebedev 371fcb720e [SimplifyCFG][PhaseOrdering] Defer lowering switch into an integer range comparison and branch until after at least the IPSCCP
That transformation is lossy, as discussed in
https://github.com/llvm/llvm-project/issues/53853
and https://github.com/rust-lang/rust/issues/85133#issuecomment-904185574

This is an alternative to D119839,
which would add a limited IPSCCP into SimplifyCFG.

Unlike lowering switch to lookup, we still want this transformation
to happen relatively early, but after giving a chance for the things
like CVP to do their thing. It seems like deferring it just until
the IPSCCP is enough for the tests at hand, but perhaps we need to
be more aggressive and disable it until CVP.

Fixes https://github.com/llvm/llvm-project/issues/53853
Refs. https://github.com/rust-lang/rust/issues/85133

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D119854
2022-02-17 12:13:55 +03:00
Florian Mayer c195addb60 [NFC] [MTE] [HWASan] Remove unnecessary member of AllocaInfo
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119981
2022-02-16 15:19:30 -08:00
Nikita Popov c9032f1a69 [LowerMemIntrinsics] Explicitly use i8 type in memmove lowering
By convention, memcpy/memmove intrinsics are always used with i8
pointers (though this is not enforced), so in practice this code
was always using an i8 type. Make that explicit.

Of course, i8 is not a very profitable choice, and this code could
be more performant by picking an appropriate larger type. But that
would require additional test coverage and correctness review, and
certainly shouldn't be a decision based on the pointer element type.
2022-02-16 16:31:55 +01:00
Max Kazantsev bfc1217119 [NFC] Introduce option to switch off compatible invokes merge
Does not affect default behavior (transform is on).
2022-02-15 21:51:03 +07:00
Florian Mayer 8de457eafc [HWASAN] use common alignAndPadAlloca
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119614
2022-02-14 15:28:32 -08:00
Florian Mayer 205308de6b [NFC] [MTE] Move alignAndPadAlloca to MemoryTaggingSupport.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119610
2022-02-14 14:54:04 -08:00
Florian Mayer 6759cdd829 [NFC] [MTE] Use helpers for stack tagging.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119503
2022-02-11 16:01:46 -08:00
Florian Mayer bf2f72fa10 [hwasan] keep debug intrinsicts in AllocaInfo.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119498
2022-02-11 16:01:02 -08:00
Florian Mayer 26dbc47468 Revert "[hwasan] keep debug intrinsicts in AllocaInfo."
This reverts commit 19fdf85f58.
2022-02-11 14:41:24 -08:00